- abort(String) - Method in class org.apache.spark.scheduler.TaskSetManager
-
- abortStage(Stage, String) - Method in class org.apache.spark.scheduler.DAGScheduler
-
Aborts all jobs depending on a particular Stage.
- abs(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the absolute value.
- AbsoluteError - Class in org.apache.spark.mllib.tree.loss
-
:: DeveloperApi ::
Class for absolute error loss calculation (for regression).
- AbsoluteError() - Constructor for class org.apache.spark.mllib.tree.loss.AbsoluteError
-
- accept(File, String) - Method in class org.apache.spark.rdd.PipedRDD.NotEqualsFileNameFilter
-
- AcceptanceResult - Class in org.apache.spark.util.random
-
Object used by seqOp to keep track of the number of items accepted and items waitlisted per
stratum, as well as the bounds for accepting and waitlisting items.
- AcceptanceResult(long, long) - Constructor for class org.apache.spark.util.random.AcceptanceResult
-
- acceptBound() - Method in class org.apache.spark.util.random.AcceptanceResult
-
- Accumulable<R,T> - Class in org.apache.spark
-
A data type that can be accumulated, i.e., one that has a commutative and associative "add" operation,
but where the result type, R, may be different from the element type being added, T.
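The contract described above can be modelled without Spark. The following is a minimal sketch, assuming a hypothetical local interface `Param<R, T>` whose method names mirror those listed for AccumulableParam in this index (`addAccumulator`, `addInPlace`) plus a `zero` method; the class `AccumulableSketch`, the `accumulate` helper, and the `DISTINCT` example are illustrative names, not part of the Spark API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Standalone model of the Accumulable idea; it does not call Spark.
class AccumulableSketch {
    // Hypothetical stand-in for AccumulableParam<R, T>:
    // R is the accumulated result type, T is the type of each added element.
    interface Param<R, T> {
        R addAccumulator(R acc, T element); // fold one element into a partial result
        R addInPlace(R left, R right);      // merge two partial results
        R zero(R initialValue);             // fresh identity value for a new partial result
    }

    // Example where R (a set of strings) differs from T (a single string).
    static final Param<TreeSet<String>, String> DISTINCT = new Param<TreeSet<String>, String>() {
        public TreeSet<String> addAccumulator(TreeSet<String> acc, String element) {
            acc.add(element);
            return acc;
        }
        public TreeSet<String> addInPlace(TreeSet<String> left, TreeSet<String> right) {
            left.addAll(right);
            return left;
        }
        public TreeSet<String> zero(TreeSet<String> initialValue) {
            return new TreeSet<String>();
        }
    };

    // Each inner list plays the role of one task's input: elements are folded
    // into a per-task partial result, then partial results are merged.
    static <R, T> R accumulate(R initial, List<List<T>> partitions, Param<R, T> param) {
        R result = initial;
        for (List<T> partition : partitions) {
            R partial = param.zero(initial);
            for (T element : partition) {
                partial = param.addAccumulator(partial, element);
            }
            result = param.addInPlace(result, partial);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
            Arrays.asList("a", "b"), Arrays.asList("b", "c"));
        System.out.println(accumulate(new TreeSet<String>(), partitions, DISTINCT));
    }
}
```

Because set union is commutative and associative, the merged result does not depend on the order in which partial results arrive, which is the property the description requires of the "add" operation.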
- Accumulable(R, AccumulableParam<R, T>, Option<String>) - Constructor for class org.apache.spark.Accumulable
-
- Accumulable(R, AccumulableParam<R, T>) - Constructor for class org.apache.spark.Accumulable
-
- accumulable(T, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulable shared variable of the given type, to which tasks
can "add" values with add.
- accumulable(T, String, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulable shared variable of the given type, to which tasks
can "add" values with add.
- accumulable(R, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext
-
Create an Accumulable shared variable, to which tasks can add values
with +=.
- accumulable(R, String, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext
-
Create an Accumulable shared variable, with a name for display in the
Spark UI.
- accumulableCollection(R, Function1<R, Growable<T>>, ClassTag<R>) - Method in class org.apache.spark.SparkContext
-
Create an accumulator from a "mutable collection" type.
- AccumulableInfo - Class in org.apache.spark.scheduler
-
:: DeveloperApi ::
Information about an Accumulable modified during a task or stage.
- AccumulableInfo(long, String, Option<String>, String) - Constructor for class org.apache.spark.scheduler.AccumulableInfo
-
- accumulableInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- accumulableInfoToJson(AccumulableInfo) - Static method in class org.apache.spark.util.JsonProtocol
-
- AccumulableParam<R,T> - Interface in org.apache.spark
-
Helper object defining how to accumulate values of a particular type.
- accumulables() - Method in class org.apache.spark.scheduler.StageInfo
-
Terminal values of accumulables updated during this stage.
- accumulables() - Method in class org.apache.spark.scheduler.TaskInfo
-
Intermediate updates to accumulables during this task.
- accumulables() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
-
- Accumulator<T> - Class in org.apache.spark
-
A simpler value of Accumulable where the result type being accumulated is the same
as the types of elements being merged.
- Accumulator(T, AccumulatorParam<T>, Option<String>) - Constructor for class org.apache.spark.Accumulator
-
- Accumulator(T, AccumulatorParam<T>) - Constructor for class org.apache.spark.Accumulator
-
- accumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator integer variable, which tasks can "add" values
to using the add method.
- accumulator(int, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator integer variable, which tasks can "add" values
to using the add method.
- accumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator double variable, which tasks can "add" values
to using the add method.
- accumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator double variable, which tasks can "add" values
to using the add method.
- accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator variable of a given type, which tasks can "add"
values to using the add method.
- accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator variable of a given type, which tasks can "add"
values to using the add method.
- accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
-
Create an Accumulator variable of a given type, which tasks can "add"
values to using the += method.
- accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
-
Create an Accumulator variable of a given type, with a name for display
in the Spark UI.
- accumulator() - Method in class org.apache.spark.sql.UserDefinedPythonFunction
-
- AccumulatorParam<T> - Interface in org.apache.spark
-
A simpler version of AccumulableParam where the only data type you can add
in is the same type as the accumulated value.
- AccumulatorParam.DoubleAccumulatorParam$ - Class in org.apache.spark
-
- AccumulatorParam.DoubleAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$
-
- AccumulatorParam.FloatAccumulatorParam$ - Class in org.apache.spark
-
- AccumulatorParam.FloatAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$
-
- AccumulatorParam.IntAccumulatorParam$ - Class in org.apache.spark
-
- AccumulatorParam.IntAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.IntAccumulatorParam$
-
- AccumulatorParam.LongAccumulatorParam$ - Class in org.apache.spark
-
- AccumulatorParam.LongAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.LongAccumulatorParam$
-
- Accumulators - Class in org.apache.spark
-
- Accumulators() - Constructor for class org.apache.spark.Accumulators
-
- accumUpdates() - Method in class org.apache.spark.scheduler.CompletionEvent
-
- accumUpdates() - Method in class org.apache.spark.scheduler.DirectTaskResult
-
- accuracy() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns accuracy
- aclsEnabled() - Method in class org.apache.spark.SecurityManager
-
Check whether ACLs for the UI are enabled.
- active() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- activeExecutorIds() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- ActiveJob - Class in org.apache.spark.scheduler
-
Tracks information about an active job in the DAGScheduler.
- ActiveJob(int, Stage, Function2<TaskContext, Iterator<Object>, ?>, int[], CallSite, JobListener, Properties) - Constructor for class org.apache.spark.scheduler.ActiveJob
-
- activeJobs() - Method in class org.apache.spark.scheduler.DAGScheduler
-
- activeJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- activeStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- activeTasks() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
-
- activeTaskSets() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- actor() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- ACTOR_NAME() - Static method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- ACTOR_NAME() - Static method in class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
-
- ActorHelper - Interface in org.apache.spark.streaming.receiver
-
:: DeveloperApi ::
A receiver trait to be mixed in with your Actor to gain access to
the API for pushing received data into Spark Streaming for being processed.
- ActorLogReceive - Interface in org.apache.spark.util
-
A trait to enable logging all Akka actor messages.
- ActorReceiver<T> - Class in org.apache.spark.streaming.receiver
-
Provides Actors as receivers for receiving a stream.
- ActorReceiver(Props, String, StorageLevel, SupervisorStrategy, ClassTag<T>) - Constructor for class org.apache.spark.streaming.receiver.ActorReceiver
-
- ActorReceiver.Supervisor - Class in org.apache.spark.streaming.receiver
-
- ActorReceiver.Supervisor() - Constructor for class org.apache.spark.streaming.receiver.ActorReceiver.Supervisor
-
- ActorReceiverData - Interface in org.apache.spark.streaming.receiver
-
Case class to receive data sent by child actors
- actorStream(Props, String, StorageLevel, SupervisorStrategy) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream with any arbitrary user implemented actor receiver.
- actorStream(Props, String, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream with any arbitrary user implemented actor receiver.
- actorStream(Props, String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream with any arbitrary user implemented actor receiver.
- actorStream(Props, String, StorageLevel, SupervisorStrategy, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream with any arbitrary user implemented actor receiver.
- ActorSupervisorStrategy - Class in org.apache.spark.streaming.receiver
-
:: DeveloperApi ::
A helper with a set of defaults for the supervisor strategy.
- ActorSupervisorStrategy() - Constructor for class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
-
- actorSystem() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- actorSystem() - Method in class org.apache.spark.SparkEnv
-
- actualSize(Row, int) - Method in class org.apache.spark.sql.columnar.ByteArrayColumnType
-
- actualSize(Row, int) - Method in class org.apache.spark.sql.columnar.ColumnType
-
Returns the size of the value row(ordinal).
- actualSize(Row, int) - Static method in class org.apache.spark.sql.columnar.STRING
-
- add(T) - Method in class org.apache.spark.Accumulable
-
Add more data to this accumulator / accumulable
- add(Map<Object, Object>) - Static method in class org.apache.spark.Accumulators
-
- add(long, long, ED) - Method in class org.apache.spark.graphx.impl.EdgePartitionBuilder
-
Add a new edge to the partition.
- add(long, long, int, int, ED) - Method in class org.apache.spark.graphx.impl.ExistingEdgePartitionBuilder
-
Add a new edge to the partition.
- add(float[], float) - Method in class org.apache.spark.ml.recommendation.ALS.NormalEquation
-
Adds an observation.
- add(ALS.Rating<ID>) - Method in class org.apache.spark.ml.recommendation.ALS.RatingBlockBuilder
-
Adds a rating.
- add(int, Object, int[], float[]) - Method in class org.apache.spark.ml.recommendation.ALS.UncompressedInBlockBuilder
-
Adds a dst block of (srcId, dstLocalIndex, rating) tuples.
- add(double[], MultivariateGaussian[], ExpectationSum, Vector<Object>) - Static method in class org.apache.spark.mllib.clustering.ExpectationSum
-
- add(Vector) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
-
Adds a new document.
- add(Iterable<T>, long) - Method in class org.apache.spark.mllib.fpm.FPTree
-
Adds a transaction with count.
- add(BlockMatrix) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Adds two block matrices together.
- add(Vector) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
Add a new sample to this summarizer, and update the statistical summary.
- add(ImpurityCalculator) - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
-
Add the stats from another calculator into this one, modifying and returning this calculator.
- add(long, long) - Static method in class org.apache.spark.streaming.util.RawTextHelper
-
- add(Vector) - Method in class org.apache.spark.util.Vector
-
- addAccumulator(R, T) - Method in interface org.apache.spark.AccumulableParam
-
Add additional data to the accumulator value.
- addAccumulator(T, T) - Method in interface org.apache.spark.AccumulatorParam
-
- addAccumulator(R, T) - Method in class org.apache.spark.GrowableAccumulableParam
-
- addBinary(Binary) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
-
- addBinary(Binary) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveStringConverter
-
- addBlock(BlockId, BlockStatus) - Method in class org.apache.spark.storage.StorageStatus
-
Add the given block to this storage status.
- AddBlock - Class in org.apache.spark.streaming.scheduler
-
- AddBlock(ReceivedBlockInfo) - Constructor for class org.apache.spark.streaming.scheduler.AddBlock
-
- addBlock(ReceivedBlockInfo) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
-
Add received block.
- addBoolean(boolean) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
-
- addData(Object) - Method in class org.apache.spark.streaming.receiver.BlockGenerator
-
Push a single data item into the buffer.
- addDataWithCallback(Object, Object) - Method in class org.apache.spark.streaming.receiver.BlockGenerator
-
Push a single data item into the buffer.
- addDouble(double) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
-
- addedFiles() - Method in class org.apache.spark.SparkContext
-
- addedJars() - Method in class org.apache.spark.SparkContext
-
- addFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Add a file to be downloaded with this Spark job on every node.
- addFile(File) - Method in class org.apache.spark.HttpFileServer
-
- addFile(String) - Method in class org.apache.spark.SparkContext
-
Add a file to be downloaded with this Spark job on every node.
- addFile(String, boolean) - Method in class org.apache.spark.SparkContext
-
Add a file to be downloaded with this Spark job on every node.
- AddFile - Class in org.apache.spark.sql.hive.execution
-
- AddFile(String) - Constructor for class org.apache.spark.sql.hive.execution.AddFile
-
- addFileToDir(File, File) - Method in class org.apache.spark.HttpFileServer
-
- addFilters(Seq<ServletContextHandler>, SparkConf) - Static method in class org.apache.spark.ui.JettyUtils
-
Add filters, if any, to the given list of ServletContextHandlers
- addFloat(float) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
-
- addGrid(Param<T>, Iterable<T>) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a param with multiple values (overwrites if the input param exists).
- addGrid(DoubleParam, double[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a double param with multiple values.
- addGrid(IntParam, int[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds an int param with multiple values.
- addGrid(FloatParam, float[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a float param with multiple values.
- addGrid(LongParam, long[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a long param with multiple values.
- addGrid(BooleanParam) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a boolean param with true and false.
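The addGrid entries above describe accumulating candidate values per parameter. A grid of this kind is normally expanded into the Cartesian product of all value lists. The following is a minimal standalone sketch of that idea, not the Spark implementation: `ParamGridSketch` is a hypothetical class, parameters are keyed by plain strings rather than `Param<T>` objects, and the parameter names in `main` are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a parameter grid builder; it does not use Spark.
class ParamGridSketch {
    private final Map<String, List<Object>> grid = new LinkedHashMap<>();

    // Registers candidate values for a parameter; adding the same name again
    // overwrites the earlier values.
    ParamGridSketch addGrid(String name, List<?> values) {
        grid.put(name, new ArrayList<Object>(values));
        return this;
    }

    // Boolean convenience form: tries both true and false.
    ParamGridSketch addGrid(String booleanParamName) {
        return addGrid(booleanParamName, Arrays.asList(true, false));
    }

    // Expands the registered values into every combination (Cartesian product).
    List<Map<String, Object>> build() {
        List<Map<String, Object>> combos = new ArrayList<>();
        combos.add(new LinkedHashMap<String, Object>());
        for (Map.Entry<String, List<Object>> entry : grid.entrySet()) {
            List<Map<String, Object>> next = new ArrayList<>();
            for (Map<String, Object> partial : combos) {
                for (Object value : entry.getValue()) {
                    Map<String, Object> extended = new LinkedHashMap<>(partial);
                    extended.put(entry.getKey(), value);
                    next.add(extended);
                }
            }
            combos = next;
        }
        return combos;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> combos = new ParamGridSketch()
            .addGrid("regParam", Arrays.asList(0.1, 0.01)) // hypothetical parameter names
            .addGrid("maxIter", Arrays.asList(10, 20, 30))
            .addGrid("fitIntercept")
            .build();
        for (Map<String, Object> combo : combos) {
            System.out.println(combo);
        }
    }
}
```

The number of combinations is the product of the list sizes, so each added parameter multiplies the number of models a tuning run has to fit.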
- addImplicit(float[], float, double) - Method in class org.apache.spark.ml.recommendation.ALS.NormalEquation
-
Adds an observation with implicit feedback.
- addInPlace(R, R) - Method in interface org.apache.spark.AccumulableParam
-
Merge two accumulated values together.
- addInPlace(double, double) - Method in class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$
-
- addInPlace(float, float) - Method in class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$
-
- addInPlace(int, int) - Method in class org.apache.spark.AccumulatorParam.IntAccumulatorParam$
-
- addInPlace(long, long) - Method in class org.apache.spark.AccumulatorParam.LongAccumulatorParam$
-
- addInPlace(R, R) - Method in class org.apache.spark.GrowableAccumulableParam
-
- addInPlace(double, double) - Method in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
-
- addInPlace(float, float) - Method in class org.apache.spark.SparkContext.FloatAccumulatorParam$
-
- addInPlace(int, int) - Method in class org.apache.spark.SparkContext.IntAccumulatorParam$
-
- addInPlace(long, long) - Method in class org.apache.spark.SparkContext.LongAccumulatorParam$
-
- addInPlace(Vector) - Method in class org.apache.spark.util.Vector
-
- addInPlace(Vector, Vector) - Method in class org.apache.spark.util.Vector.VectorAccumParam$
-
- addInputStream(InputDStream<?>) - Method in class org.apache.spark.streaming.DStreamGraph
-
- addInt(int) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
-
- addJar(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
- addJar(File) - Method in class org.apache.spark.HttpFileServer
-
- addJar(String) - Method in class org.apache.spark.SparkContext
-
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
- AddJar - Class in org.apache.spark.sql.hive.execution
-
- AddJar(String) - Constructor for class org.apache.spark.sql.hive.execution.AddJar
-
- addListener(L) - Method in interface org.apache.spark.util.ListenerBus
-
Add a listener to listen for events.
- addLocalConfiguration(String, int, int, int, JobConf) - Static method in class org.apache.spark.rdd.HadoopRDD
-
Add Hadoop configuration specific to a single partition and attempt.
- addLong(long) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
-
- addOnCompleteCallback(Function0<BoxedUnit>) - Method in class org.apache.spark.TaskContext
-
Adds a callback function to be executed on task completion.
- addOnCompleteCallback(Function0<BoxedUnit>) - Method in class org.apache.spark.TaskContextImpl
-
- addOutputColumn(StructType, String, DataType) - Method in interface org.apache.spark.ml.param.Params
-
- addOutputLoc(int, MapStatus) - Method in class org.apache.spark.scheduler.Stage
-
- addOutputStream(DStream<?>) - Method in class org.apache.spark.streaming.DStreamGraph
-
- addPartitioningAttributes(Seq<Attribute>) - Method in class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion.LogicalPlanHacks
-
- addPartToPGroup(Partition, PartitionGroup) - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- address() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchRequest
-
- address(String, String, String, Object, String) - Static method in class org.apache.spark.util.AkkaUtils
-
- addresses() - Method in class org.apache.spark.streaming.flume.FlumePollingInputDStream
-
- addRunningTask(long) - Method in class org.apache.spark.scheduler.TaskSetManager
-
If the given task ID is not in the set of running tasks, adds it.
- addSchedulable(Schedulable) - Method in class org.apache.spark.scheduler.Pool
-
- addSchedulable(Schedulable) - Method in interface org.apache.spark.scheduler.Schedulable
-
- addSchedulable(Schedulable) - Method in class org.apache.spark.scheduler.TaskSetManager
-
- addSparkListener(SparkListener) - Method in class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Register a listener to receive up-calls from events that happen during execution.
- addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
- addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.StreamingContext
-
- addTaskCompletionListener(TaskCompletionListener) - Method in class org.apache.spark.TaskContext
-
Adds a (Java friendly) listener to be executed on task completion.
- addTaskCompletionListener(Function1<TaskContext, BoxedUnit>) - Method in class org.apache.spark.TaskContext
-
Adds a listener in the form of a Scala closure to be executed on task completion.
- addTaskCompletionListener(TaskCompletionListener) - Method in class org.apache.spark.TaskContextImpl
-
- addTaskCompletionListener(Function1<TaskContext, BoxedUnit>) - Method in class org.apache.spark.TaskContextImpl
-
- addTaskSetManager(Schedulable, Properties) - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- addTaskSetManager(Schedulable, Properties) - Method in class org.apache.spark.scheduler.FIFOSchedulableBuilder
-
- addTaskSetManager(Schedulable, Properties) - Method in interface org.apache.spark.scheduler.SchedulableBuilder
-
- addURL(URL) - Method in class org.apache.spark.util.ChildFirstURLClassLoader
-
- addURL(URL) - Method in class org.apache.spark.util.MutableURLClassLoader
-
- addValueFromDictionary(int) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveStringConverter
-
- adminAcls() - Method in class org.apache.spark.scheduler.ApplicationEventListener
-
- advance(long) - Method in class org.apache.spark.util.ManualClock
-
- advanceCheckpoint() - Method in class org.apache.spark.streaming.kinesis.KinesisCheckpointState
-
Advance the checkpoint clock by the checkpoint interval.
- agg(Column, Column...) - Method in class org.apache.spark.sql.DataFrame
-
Aggregates on the entire DataFrame without groups.
- agg(Tuple2<String, String>, Seq<Tuple2<String, String>>) - Method in class org.apache.spark.sql.DataFrame
-
(Scala-specific) Compute aggregates by specifying a map from column name to
aggregate methods.
- agg(Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
-
(Scala-specific) Aggregates on the entire DataFrame without groups.
- agg(Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
-
(Java-specific) Aggregates on the entire DataFrame without groups.
- agg(Column, Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
-
Aggregates on the entire DataFrame without groups.
- agg(Column, Column...) - Method in class org.apache.spark.sql.GroupedData
-
Compute aggregates by specifying a series of aggregate columns.
- agg(Tuple2<String, String>, Seq<Tuple2<String, String>>) - Method in class org.apache.spark.sql.GroupedData
-
(Scala-specific) Compute aggregates by specifying a map from column name to
aggregate methods.
- agg(Map<String, String>) - Method in class org.apache.spark.sql.GroupedData
-
(Scala-specific) Compute aggregates by specifying a map from column name to
aggregate methods.
- agg(Map<String, String>) - Method in class org.apache.spark.sql.GroupedData
-
(Java-specific) Compute aggregates by specifying a map from column name to
aggregate methods.
- agg(Column, Seq<Column>) - Method in class org.apache.spark.sql.GroupedData
-
Compute aggregates by specifying a series of aggregate columns.
- aggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Aggregate the elements of each partition, and then the results for all the partitions, using
given combine functions and a neutral "zero value".
- aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Aggregate the elements of each partition, and then the results for all the partitions, using
given combine functions and a neutral "zero value".
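The two-level aggregation described for aggregate can be illustrated without a cluster. The following is a minimal sketch, assuming each partition is modelled as an in-memory list; `AggregateSketch` and its parameter names (`seqOp`, `combOp`) are illustrative and the code does not call Spark.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;

// Standalone model of aggregate(zeroValue, seqOp, combOp); it does not use Spark.
class AggregateSketch {
    static <T, U> U aggregate(List<List<T>> partitions, U zeroValue,
                              BiFunction<U, T, U> seqOp,
                              BiFunction<U, U, U> combOp) {
        U result = zeroValue;
        for (List<T> partition : partitions) {
            // Step 1: fold the elements of one partition, starting from the zero value.
            // This sketch assumes U is immutable; a mutable zero value would need a
            // fresh copy per partition.
            U partial = zeroValue;
            for (T element : partition) {
                partial = seqOp.apply(partial, element);
            }
            // Step 2: combine the per-partition results.
            result = combOp.apply(result, partial);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
            Arrays.asList("a", "bb"), Arrays.asList("ccc"));
        // Total length of all strings: the result type (Integer) differs from
        // the element type (String).
        int total = AggregateSketch.<String, Integer>aggregate(
            partitions, 0, (acc, s) -> acc + s.length(), Integer::sum);
        System.out.println(total);
    }
}
```

The zero value has to be neutral for the combine function (0 for addition here), since it is used once per partition and once more when merging, so a non-neutral value would be counted several times.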
- aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
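The per-key variant follows the same pattern, with one running result per key. The following is a minimal sketch under the same assumptions as a plain in-memory model: `AggregateByKeySketch`, its `pair` helper, and the sample data are hypothetical, and nothing here calls Spark.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Standalone model of aggregateByKey(zeroValue, seqOp, combOp); it does not use Spark.
class AggregateByKeySketch {
    // Small helper to build a key-value pair.
    static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new SimpleEntry<K, V>(key, value);
    }

    static <K, V, U> Map<K, U> aggregateByKey(
            List<List<Map.Entry<K, V>>> partitions, U zeroValue,
            BiFunction<U, V, U> seqOp, BiFunction<U, U, U> combOp) {
        Map<K, U> result = new LinkedHashMap<>();
        for (List<Map.Entry<K, V>> partition : partitions) {
            // Within a partition, fold each value into its key's partial result,
            // starting from the zero value the first time a key is seen.
            Map<K, U> partials = new LinkedHashMap<>();
            for (Map.Entry<K, V> kv : partition) {
                U current = partials.getOrDefault(kv.getKey(), zeroValue);
                partials.put(kv.getKey(), seqOp.apply(current, kv.getValue()));
            }
            // Across partitions, merge partial results that share a key.
            for (Map.Entry<K, U> partial : partials.entrySet()) {
                result.merge(partial.getKey(), partial.getValue(), combOp);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> p0 = Arrays.asList(
            AggregateByKeySketch.<String, Integer>pair("a", 1),
            AggregateByKeySketch.<String, Integer>pair("b", 2));
        List<Map.Entry<String, Integer>> p1 = Arrays.asList(
            AggregateByKeySketch.<String, Integer>pair("a", 3));
        Map<String, Integer> sums = AggregateByKeySketch.<String, Integer, Integer>aggregateByKey(
            Arrays.asList(p0, p1), 0, (acc, v) -> acc + v, Integer::sum);
        System.out.println(sums);
    }
}
```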
- aggregateMessages(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph
-
Aggregates values from the neighboring edges and vertices of each vertex.
- aggregateMessagesEdgeScan(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, EdgeActiveness, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.EdgePartition
-
Send messages along edges and aggregate them at the receiving vertices.
- aggregateMessagesIndexScan(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, EdgeActiveness, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.EdgePartition
-
Send messages along edges and aggregate them at the receiving vertices.
- aggregateMessagesWithActiveSet(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph
-
Aggregates values from the neighboring edges and vertices of each vertex.
- aggregateMessagesWithActiveSet(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- aggregateSizeForNode(DecisionTreeMetadata, Option<int[]>) - Static method in class org.apache.spark.mllib.tree.RandomForest
-
Get the number of values to be stored for this node in the bin aggregates.
- aggregateUsingIndex(Iterator<Product2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
-
- aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
-
Aggregates vertices in messages that have the same ids using reduceFunc, returning a
VertexRDD co-indexed with this.
- AggregatingEdgeContext<VD,ED,A> - Class in org.apache.spark.graphx.impl
-
- AggregatingEdgeContext(Function2<A, A, A>, Object, BitSet) - Constructor for class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- Aggregator<K,V,C> - Class in org.apache.spark
-
:: DeveloperApi ::
A set of functions used to aggregate data.
- Aggregator(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Constructor for class org.apache.spark.Aggregator
-
- aggregator() - Method in class org.apache.spark.ShuffleDependency
-
- akkaSSLOptions() - Method in class org.apache.spark.SecurityManager
-
- AkkaUtils - Class in org.apache.spark.util
-
Various utility classes for working with Akka.
- AkkaUtils() - Constructor for class org.apache.spark.util.AkkaUtils
-
- Algo - Class in org.apache.spark.mllib.tree.configuration
-
:: Experimental ::
Enum to select the algorithm for the decision tree
- Algo() - Constructor for class org.apache.spark.mllib.tree.configuration.Algo
-
- algo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- algo() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
- algo() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
- algo() - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
-
- algo() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$.Metadata
-
- algorithm() - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
- alias() - Method in class org.apache.spark.sql.hive.MetastoreRelation
-
- aliasNames() - Method in class org.apache.spark.sql.hive.HiveGenericUdtf
-
- All - Static variable in class org.apache.spark.graphx.TripletFields
-
Expose all the fields (source, edge, and destination).
- AllCompressionSchemes - Interface in org.apache.spark.sql.columnar.compression
-
- AllJobsCancelled - Class in org.apache.spark.scheduler
-
- AllJobsCancelled() - Constructor for class org.apache.spark.scheduler.AllJobsCancelled
-
- AllJobsPage - Class in org.apache.spark.ui.jobs
-
Page showing list of all ongoing and recently finished jobs
- AllJobsPage(JobsTab) - Constructor for class org.apache.spark.ui.jobs.AllJobsPage
-
- allJoinTokens() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- allocateBlocksToBatch(Time) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
-
Allocate all unallocated blocks to the given batch.
- allocateBlocksToBatch(Time) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
-
Allocate all unallocated blocks to the given batch.
- AllocatedBlocks - Class in org.apache.spark.streaming.scheduler
-
Class representing the blocks of all the streams allocated to a batch
- AllocatedBlocks(Map<Object, Seq<ReceivedBlockInfo>>) - Constructor for class org.apache.spark.streaming.scheduler.AllocatedBlocks
-
- allocatedBlocks() - Method in class org.apache.spark.streaming.scheduler.BatchAllocationEvent
-
- allowExisting() - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSource
-
- allowExisting() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
-
- allowExisting() - Method in class org.apache.spark.sql.sources.CreateTableUsing
-
- allowLocal() - Method in class org.apache.spark.scheduler.JobSubmitted
-
- allPendingTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- AllStagesPage - Class in org.apache.spark.ui.jobs
-
Page showing list of all ongoing and recently finished stages and pools
- AllStagesPage(StagesTab) - Constructor for class org.apache.spark.ui.jobs.AllStagesPage
-
- alpha() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
Param for the alpha parameter in the implicit preference formulation.
- AlphaComponent - Annotation Type in org.apache.spark.annotation
-
A new component of Spark which may have unstable APIs.
- ALS - Class in org.apache.spark.ml.recommendation
-
Alternating Least Squares (ALS) matrix factorization.
- ALS() - Constructor for class org.apache.spark.ml.recommendation.ALS
-
- ALS - Class in org.apache.spark.mllib.recommendation
-
Alternating Least Squares matrix factorization.
- ALS() - Constructor for class org.apache.spark.mllib.recommendation.ALS
-
Constructs an ALS instance with default parameters: {numBlocks: -1, rank: 10, iterations: 10,
lambda: 0.01, implicitPrefs: false, alpha: 1.0}.
- ALS.CholeskySolver - Class in org.apache.spark.ml.recommendation
-
Cholesky solver for least squares problems.
- ALS.CholeskySolver() - Constructor for class org.apache.spark.ml.recommendation.ALS.CholeskySolver
-
- ALS.InBlock<ID> - Class in org.apache.spark.ml.recommendation
-
In-link block for computing src (user/item) factors.
- ALS.InBlock(Object, int[], int[], float[], ClassTag<ID>) - Constructor for class org.apache.spark.ml.recommendation.ALS.InBlock
-
- ALS.InBlock$ - Class in org.apache.spark.ml.recommendation
-
- ALS.InBlock$() - Constructor for class org.apache.spark.ml.recommendation.ALS.InBlock$
-
- ALS.LeastSquaresNESolver - Interface in org.apache.spark.ml.recommendation
-
Trait for least squares solvers applied to the normal equation.
- ALS.LocalIndexEncoder - Class in org.apache.spark.ml.recommendation
-
Encoder for storing (blockId, localIndex) into a single integer.
- ALS.LocalIndexEncoder(int) - Constructor for class org.apache.spark.ml.recommendation.ALS.LocalIndexEncoder
-
- ALS.NNLSSolver - Class in org.apache.spark.ml.recommendation
-
NNLS solver.
- ALS.NNLSSolver() - Constructor for class org.apache.spark.ml.recommendation.ALS.NNLSSolver
-
- ALS.NormalEquation - Class in org.apache.spark.ml.recommendation
-
Representing a normal equation (ALS' subproblem).
- ALS.NormalEquation(int) - Constructor for class org.apache.spark.ml.recommendation.ALS.NormalEquation
-
- ALS.Rating<ID> - Class in org.apache.spark.ml.recommendation
-
Rating class for better code readability.
- ALS.Rating(ID, ID, float) - Constructor for class org.apache.spark.ml.recommendation.ALS.Rating
-
- ALS.Rating$ - Class in org.apache.spark.ml.recommendation
-
- ALS.Rating$() - Constructor for class org.apache.spark.ml.recommendation.ALS.Rating$
-
- ALS.RatingBlock<ID> - Class in org.apache.spark.ml.recommendation
-
A rating block that contains src IDs, dst IDs, and ratings, stored in primitive arrays.
- ALS.RatingBlock(Object, Object, float[], ClassTag<ID>) - Constructor for class org.apache.spark.ml.recommendation.ALS.RatingBlock
-
- ALS.RatingBlock$ - Class in org.apache.spark.ml.recommendation
-
- ALS.RatingBlock$() - Constructor for class org.apache.spark.ml.recommendation.ALS.RatingBlock$
-
- ALS.RatingBlockBuilder<ID> - Class in org.apache.spark.ml.recommendation
-
- ALS.RatingBlockBuilder(ClassTag<ID>) - Constructor for class org.apache.spark.ml.recommendation.ALS.RatingBlockBuilder
-
- ALS.UncompressedInBlock<ID> - Class in org.apache.spark.ml.recommendation
-
A block of (srcId, dstEncodedIndex, rating) tuples stored in primitive arrays.
- ALS.UncompressedInBlock(Object, int[], float[], ClassTag<ID>, Ordering<ID>) - Constructor for class org.apache.spark.ml.recommendation.ALS.UncompressedInBlock
-
- ALS.UncompressedInBlockBuilder<ID> - Class in org.apache.spark.ml.recommendation
-
Builder for uncompressed in-blocks of (srcId, dstEncodedIndex, rating) tuples.
- ALS.UncompressedInBlockBuilder(ALS.LocalIndexEncoder, ClassTag<ID>, Ordering<ID>) - Constructor for class org.apache.spark.ml.recommendation.ALS.UncompressedInBlockBuilder
-
- ALSModel - Class in org.apache.spark.ml.recommendation
-
Model fitted by ALS.
- ALSModel(ALS, ParamMap, int, RDD<Tuple2<Object, float[]>>, RDD<Tuple2<Object, float[]>>) - Constructor for class org.apache.spark.ml.recommendation.ALSModel
-
- ALSParams - Interface in org.apache.spark.ml.recommendation
-
Common params for ALS.
- analyze(String) - Method in class org.apache.spark.sql.hive.HiveContext
-
Analyzes the given table in the current database to generate statistics, which will be
used in query optimizations.
- AnalyzeTable - Class in org.apache.spark.sql.hive.execution
-
Analyzes the given table in the current database to generate statistics, which will be
used in query optimizations.
- AnalyzeTable(String) - Constructor for class org.apache.spark.sql.hive.execution.AnalyzeTable
-
- and(Column) - Method in class org.apache.spark.sql.Column
-
Boolean AND.
- AND() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- And - Class in org.apache.spark.sql.sources
-
- And(Filter, Filter) - Constructor for class org.apache.spark.sql.sources.And
-
- ANY() - Static method in class org.apache.spark.scheduler.TaskLocality
-
- append(boolean, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
-
- append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
-
- append(byte, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BYTE
-
- append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.BYTE
-
- append(byte[], ByteBuffer) - Method in class org.apache.spark.sql.columnar.ByteArrayColumnType
-
- append(JvmType, ByteBuffer) - Method in class org.apache.spark.sql.columnar.ColumnType
-
Appends the given value v of type T into the given ByteBuffer.
- append(Row, int, ByteBuffer) - Method in class org.apache.spark.sql.columnar.ColumnType
-
Appends row(ordinal) of type T into the given ByteBuffer.
- append(int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.DATE
-
- append(double, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.DOUBLE
-
- append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.DOUBLE
-
- append(float, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.FLOAT
-
- append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.FLOAT
-
- append(int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.INT
-
- append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.INT
-
- append(long, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.LONG
-
- append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.LONG
-
- append(short, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.SHORT
-
- append(Row, int, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.SHORT
-
- append(String, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.STRING
-
- append(Timestamp, ByteBuffer) - Static method in class org.apache.spark.sql.columnar.TIMESTAMP
-
- append(AvroFlumeEvent) - Method in class org.apache.spark.streaming.flume.FlumeEventServer
-
- appendBatch(List<AvroFlumeEvent>) - Method in class org.apache.spark.streaming.flume.FlumeEventServer
-
- appendBias(Vector) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Returns a new vector with 1.0 (bias) appended to the input vector.
- appendFrom(Row, int) - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
-
- appendFrom(Row, int) - Method in interface org.apache.spark.sql.columnar.ColumnBuilder
-
Appends row(ordinal) to the column builder.
- appendFrom(Row, int) - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
-
- appendFrom(Row, int) - Method in interface org.apache.spark.sql.columnar.NullableColumnBuilder
-
- AppendingParquetOutputFormat - Class in org.apache.spark.sql.parquet
-
TODO: this will be able to append to directories it created itself, not necessarily
to imported ones.
- AppendingParquetOutputFormat(int) - Constructor for class org.apache.spark.sql.parquet.AppendingParquetOutputFormat
-
- appendReadColumns(Configuration, Seq<Integer>, Seq<String>) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- appId() - Method in class org.apache.spark.scheduler.ApplicationEventListener
-
- appId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- appId() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- appId() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- appId() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
- appId() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- appId() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- applicationEndFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- applicationEndToJson(SparkListenerApplicationEnd) - Static method in class org.apache.spark.util.JsonProtocol
-
- ApplicationEventListener - Class in org.apache.spark.scheduler
-
A simple listener for application events.
- ApplicationEventListener() - Constructor for class org.apache.spark.scheduler.ApplicationEventListener
-
- applicationId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- applicationId() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- applicationId() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- applicationId() - Method in class org.apache.spark.scheduler.local.LocalBackend
-
- applicationId() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
Get an application ID associated with the job.
- applicationId() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
Get an application ID associated with the job.
- applicationId() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- applicationId() - Method in class org.apache.spark.SparkContext
-
- applicationStartFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- applicationStartToJson(SparkListenerApplicationStart) - Static method in class org.apache.spark.util.JsonProtocol
-
- apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph
-
Construct a graph from a collection of vertices and
edges with attributes.
- apply(RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
-
Create a graph from edges, setting referenced vertices to `defaultVertexAttr`.
- apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
-
Create a graph from vertices and edges, setting missing vertices to `defaultVertexAttr`.
- apply(VertexRDD<VD>, EdgeRDD<ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
-
Create a graph from a VertexRDD and an EdgeRDD with arbitrary replicated vertices.
- apply(Iterator<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.ShippableVertexPartition
-
Construct a `ShippableVertexPartition` from the given vertices without any routing table.
- apply(Iterator<Tuple2<Object, VD>>, RoutingTablePartition, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.ShippableVertexPartition
-
Construct a ShippableVertexPartition from the given vertices with the specified routing
table, filling in missing vertices mentioned in the routing table using defaultVal.
- apply(Iterator<Tuple2<Object, VD>>, RoutingTablePartition, VD, Function2<VD, VD, VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.ShippableVertexPartition
-
Construct a ShippableVertexPartition from the given vertices with the specified routing
table, filling in missing vertices mentioned in the routing table using defaultVal,
and merging duplicate vertex attributes with mergeFunc.
- apply(Iterator<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.VertexPartition
-
Construct a `VertexPartition` from the given vertices.
- apply(long) - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
-
Return the vertex attribute for the given vertex ID.
- apply(Graph<VD, ED>, A, int, EdgeDirection, Function3<Object, VD, A, VD>, Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, ClassTag<VD>, ClassTag<ED>, ClassTag<A>) - Static method in class org.apache.spark.graphx.Pregel
-
Execute a Pregel-like iterative vertex-parallel abstraction.
- apply(RDD<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
-
Constructs a standalone VertexRDD (one that is not set up for efficient joins with an
EdgeRDD) from an RDD of vertex-attribute pairs.
- apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
-
Constructs a VertexRDD from an RDD of vertex-attribute pairs.
- apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, Function2<VD, VD, VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
-
Constructs a VertexRDD from an RDD of vertex-attribute pairs.
- apply(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap
-
Gets the value of the input param or its default value if it does not exist.
- apply(BinaryConfusionMatrix) - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryClassificationMetricComputer
-
- apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.FalsePositiveRate
-
- apply(BinaryConfusionMatrix) - Method in class org.apache.spark.mllib.evaluation.binary.FMeasure
-
- apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.Precision
-
- apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.Recall
-
- apply(int) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- apply(int, int) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- apply(int) - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- apply(int, int, int, int) - Static method in class org.apache.spark.mllib.linalg.distributed.GridPartitioner
-
- apply(int, int, int) - Static method in class org.apache.spark.mllib.linalg.distributed.GridPartitioner
-
Creates a new GridPartitioner instance with the input suggested number of partitions.
- apply(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Gets the (i, j)-th element.
- apply(int, int) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- apply(int) - Method in interface org.apache.spark.mllib.linalg.Vector
-
Gets the value of the ith element.
- apply(int, Predict, double, boolean) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Construct a node with nodeIndex, predict, impurity and isLeaf parameters.
- apply(String) - Static method in class org.apache.spark.rdd.PartitionGroup
-
- apply(long, String, Option<String>, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
-
- apply(long, String, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
-
- apply(String, long, Enumeration.Value, ByteBuffer) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate$
-
Alternate factory method that takes a ByteBuffer directly for the data field
- apply(BlockManagerId, long[]) - Static method in class org.apache.spark.scheduler.HighlyCompressedMapStatus
-
- apply(long, TaskMetrics) - Static method in class org.apache.spark.scheduler.RuntimePercentage
-
- apply(String) - Static method in class org.apache.spark.sql.Column
-
- apply(Expression) - Static method in class org.apache.spark.sql.Column
-
- apply(DataType) - Static method in class org.apache.spark.sql.columnar.ColumnType
-
- apply(boolean, int, StorageLevel, SparkPlan, Option<String>) - Static method in class org.apache.spark.sql.columnar.InMemoryRelation
-
- apply(String) - Method in class org.apache.spark.sql.DataFrame
-
Selects a column based on the column name and returns it as a Column.
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.CreateTables
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.ParquetConversions
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.PreInsertionCasts
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.DataSinks
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveCommandStrategy
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveDDLStrategy
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveTableScans
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.Scripts
-
- apply(LogicalPlan) - Static method in class org.apache.spark.sql.hive.ResolveUdtfsAlias
-
- apply(int, long) - Static method in class org.apache.spark.sql.parquet.timestamp.NanoTime
-
- apply(LogicalPlan) - Static method in class org.apache.spark.sql.sources.DataSourceStrategy
-
- apply(String, boolean) - Method in class org.apache.spark.sql.sources.DDLParser
-
- apply(LogicalPlan) - Static method in class org.apache.spark.sql.sources.PreInsertCastAndRename
-
- apply(LogicalPlan) - Method in class org.apache.spark.sql.sources.PreWriteCheck
-
- apply(SQLContext, Option<StructType>, String, Map<String, String>) - Static method in class org.apache.spark.sql.sources.ResolvedDataSource
-
- apply(SQLContext, String, SaveMode, Map<String, String>, DataFrame) - Static method in class org.apache.spark.sql.sources.ResolvedDataSource
-
- apply(Seq<Column>) - Method in class org.apache.spark.sql.UserDefinedFunction
-
- apply(Seq<Column>) - Method in class org.apache.spark.sql.UserDefinedPythonFunction
-
Returns a Column that will evaluate to calling this UDF with the given input.
- apply(String) - Static method in class org.apache.spark.storage.BlockId
-
Converts a BlockId "name" String back into a BlockId.
- apply(String, String, int) - Static method in class org.apache.spark.storage.BlockManagerId
-
- apply(ObjectInput) - Static method in class org.apache.spark.storage.BlockManagerId
-
- apply(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Create a new StorageLevel object without setting useOffHeap.
- apply(boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Create a new StorageLevel object.
- apply(int, int) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Create a new StorageLevel object from its integer representation.
- apply(ObjectInput) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Read StorageLevel object from ObjectInput stream.
- apply(String, int) - Static method in class org.apache.spark.streaming.kafka.Broker
-
- apply(Map<String, String>) - Method in class org.apache.spark.streaming.kafka.KafkaCluster.SimpleConsumerConfig$
-
Make a consumer config without requiring group.id or zookeeper.connect,
since communicating with brokers also needs common settings such as timeout
- apply(SparkContext, Map<String, String>, Map<TopicAndPartition, Object>, Map<TopicAndPartition, KafkaCluster.LeaderOffset>, Function1<MessageAndMetadata<K, V>, R>, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>, ClassTag<R>) - Static method in class org.apache.spark.streaming.kafka.KafkaRDD
-
- apply(String, int, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
-
- apply(TopicAndPartition, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
-
- apply(Tuple4<String, Object, Object, Object>) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
-
- apply(long) - Static method in class org.apache.spark.streaming.Milliseconds
-
- apply(long) - Static method in class org.apache.spark.streaming.Minutes
-
- apply(long) - Static method in class org.apache.spark.streaming.Seconds
-
- apply(I, Function0<BoxedUnit>) - Static method in class org.apache.spark.util.CompletionIterator
-
- apply(Traversable<Object>) - Static method in class org.apache.spark.util.Distribution
-
- apply(InputStream, File, SparkConf) - Static method in class org.apache.spark.util.logging.FileAppender
-
Create the right appender based on Spark configuration
- apply(TraversableOnce<Object>) - Static method in class org.apache.spark.util.StatCounter
-
Build a StatCounter from a list of values.
- apply(Seq<Object>) - Static method in class org.apache.spark.util.StatCounter
-
Build a StatCounter from a list of values passed as variable-length arguments.
- apply(A) - Method in class org.apache.spark.util.TimeStampedHashMap
-
- apply(A) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- apply(int) - Method in class org.apache.spark.util.Vector
-
- applySchema(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
-
:: DeveloperApi ::
Creates a DataFrame from an RDD containing Rows by applying a schema to this RDD.
- applySchema(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
-
- applySchema(RDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
-
Applies a schema to an RDD of Java Beans.
- applySchema(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
-
Applies a schema to an RDD of Java Beans.
- appName() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- appName() - Method in class org.apache.spark.scheduler.ApplicationEventListener
-
- appName() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- appName() - Method in class org.apache.spark.SparkContext
-
- appName() - Method in class org.apache.spark.ui.SparkUI
-
- appName() - Method in class org.apache.spark.ui.SparkUITab
-
- approxCountDistinct(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the approximate number of distinct items in a group.
- approxCountDistinct(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the approximate number of distinct items in a group.
- approxCountDistinct(Column, double) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the approximate number of distinct items in a group.
- approxCountDistinct(String, double) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the approximate number of distinct items in a group.
- ApproxHist() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
-
- ApproximateActionListener<T,U,R> - Class in org.apache.spark.partial
-
A JobListener for an approximate single-result action, such as count() or non-parallel reduce().
- ApproximateActionListener(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ApproximateEvaluator<U, R>, long) - Constructor for class org.apache.spark.partial.ApproximateActionListener
-
- ApproximateEvaluator<U,R> - Interface in org.apache.spark.partial
-
An object that computes a function incrementally by merging in results of type U from multiple
tasks.
- appUIAddress() - Method in class org.apache.spark.ui.SparkUI
-
- appUIHostPort() - Method in class org.apache.spark.ui.SparkUI
-
Return the application UI host:port.
- AreaUnderCurve - Class in org.apache.spark.mllib.evaluation
-
Computes the area under the curve (AUC) using the trapezoidal rule.
- AreaUnderCurve() - Constructor for class org.apache.spark.mllib.evaluation.AreaUnderCurve
-
- areaUnderPR() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Computes the area under the precision-recall curve.
- areaUnderROC() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Computes the area under the receiver operating characteristic (ROC) curve.
- areBoundsEmpty() - Method in class org.apache.spark.util.random.AcceptanceResult
-
- argString() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
-
- arr() - Method in class org.apache.spark.rdd.PartitionGroup
-
- array(DataType) - Method in class org.apache.spark.sql.ColumnName
-
Creates a new AttributeReference of type array
- ARRAY() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- ARRAY_CONTAINS_NULL_BAG_SCHEMA_NAME() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
-
- ARRAY_ELEMENTS_SCHEMA_NAME() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
-
- arrayBuffer() - Method in class org.apache.spark.streaming.receiver.ArrayBufferBlock
-
- ArrayBufferBlock - Class in org.apache.spark.streaming.receiver
-
Class representing a block received as an ArrayBuffer.
- ArrayBufferBlock(ArrayBuffer<?>) - Constructor for class org.apache.spark.streaming.receiver.ArrayBufferBlock
-
- ArrayValues - Class in org.apache.spark.storage
-
- ArrayValues(Object[]) - Constructor for class org.apache.spark.storage.ArrayValues
-
- as(String) - Method in class org.apache.spark.sql.Column
-
Gives the column an alias.
- as(Symbol) - Method in class org.apache.spark.sql.Column
-
Gives the column an alias.
- as(String) - Method in class org.apache.spark.sql.DataFrame
-
- as(Symbol) - Method in class org.apache.spark.sql.DataFrame
-
(Scala-specific) Returns a new DataFrame with an alias set.
- asc() - Method in class org.apache.spark.sql.Column
-
Returns an ordering used in sorting.
- asc(String) - Static method in class org.apache.spark.sql.functions
-
Returns a sort expression based on ascending order of the column.
- asIterator() - Method in class org.apache.spark.serializer.DeserializationStream
-
Read the elements of this stream through an iterator.
- AskPermissionToCommitOutput - Class in org.apache.spark.scheduler
-
- AskPermissionToCommitOutput(int, long, long) - Constructor for class org.apache.spark.scheduler.AskPermissionToCommitOutput
-
- askSlaves() - Method in class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus
-
- askSlaves() - Method in class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds
-
- askTimeout(SparkConf) - Static method in class org.apache.spark.util.AkkaUtils
-
Returns the default Spark timeout to use for Akka ask operations.
- askWithReply(Object, ActorRef, FiniteDuration) - Static method in class org.apache.spark.util.AkkaUtils
-
Send a message to the given actor and get its result within a default timeout, or
throw a SparkException if this fails.
- askWithReply(Object, ActorRef, int, int, FiniteDuration) - Static method in class org.apache.spark.util.AkkaUtils
-
Send a message to the given actor and get its result within a default timeout, or
throw a SparkException if this fails even after the specified number of retries.
- asNullable() - Method in class org.apache.spark.mllib.linalg.VectorUDT
-
- asNullable() - Method in class org.apache.spark.sql.test.ExamplePointUDT
-
- asRDDId() - Method in class org.apache.spark.storage.BlockId
-
- assertValid() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
Check validity of parameters.
- assertValid() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
Check validity of parameters.
- assertValid() - Method in class org.apache.spark.rdd.BlockRDD
-
Check if this BlockRDD is valid.
- assignments() - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
-
- AsynchronousListenerBus<L,E> - Class in org.apache.spark.util
-
Asynchronously passes events to registered listeners.
- AsynchronousListenerBus(String) - Constructor for class org.apache.spark.util.AsynchronousListenerBus
-
- AsyncRDDActions<T> - Class in org.apache.spark.rdd
-
A set of asynchronous RDD actions available through an implicit conversion.
- AsyncRDDActions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.AsyncRDDActions
-
- ata() - Method in class org.apache.spark.ml.recommendation.ALS.NormalEquation
-
A^T * A
- atb() - Method in class org.apache.spark.ml.recommendation.ALS.NormalEquation
-
A^T * b
- attachExecutor(ReceiverSupervisor) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Attach Network Receiver executor to this receiver.
- attachHandler(ServletContextHandler) - Method in class org.apache.spark.ui.WebUI
-
Attach a handler to this UI.
- attachListener(CleanerListener) - Method in class org.apache.spark.ContextCleaner
-
Attach a listener object to get information about when objects are cleaned.
- attachPage(WebUIPage) - Method in class org.apache.spark.ui.WebUI
-
Attach a page to this UI.
- attachPage(WebUIPage) - Method in class org.apache.spark.ui.WebUITab
-
Attach a page to this tab.
- attachTab(WebUITab) - Method in class org.apache.spark.ui.WebUI
-
Attach a tab to this UI, along with all of its attached pages.
- attempt() - Method in class org.apache.spark.scheduler.TaskInfo
-
- attempt() - Method in class org.apache.spark.scheduler.TaskSet
-
- attemptId() - Method in class org.apache.spark.scheduler.Stage
-
- attemptId() - Method in class org.apache.spark.scheduler.StageInfo
-
- attemptID() - Method in class org.apache.spark.TaskCommitDenied
-
- attemptId() - Method in class org.apache.spark.TaskContext
-
- attemptId() - Method in class org.apache.spark.TaskContextImpl
-
- attemptNumber() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosTaskLaunchData
-
- attemptNumber() - Method in class org.apache.spark.scheduler.TaskDescription
-
- attemptNumber() - Method in class org.apache.spark.TaskContext
-
How many times this task has been attempted.
- attemptNumber() - Method in class org.apache.spark.TaskContextImpl
-
- attr() - Method in class org.apache.spark.graphx.Edge
-
- attr() - Method in class org.apache.spark.graphx.EdgeContext
-
The attribute associated with the edge.
- attr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- attr() - Method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
-
- attribute() - Method in class org.apache.spark.sql.sources.EqualTo
-
- attribute() - Method in class org.apache.spark.sql.sources.GreaterThan
-
- attribute() - Method in class org.apache.spark.sql.sources.GreaterThanOrEqual
-
- attribute() - Method in class org.apache.spark.sql.sources.In
-
- attribute() - Method in class org.apache.spark.sql.sources.IsNotNull
-
- attribute() - Method in class org.apache.spark.sql.sources.IsNull
-
- attribute() - Method in class org.apache.spark.sql.sources.LessThan
-
- attribute() - Method in class org.apache.spark.sql.sources.LessThanOrEqual
-
- attributeMap() - Method in class org.apache.spark.sql.hive.MetastoreRelation
-
An attribute map that can be used to look up original attributes based on expression id.
- attributeMap() - Method in class org.apache.spark.sql.parquet.ParquetRelation
-
- attributeMap() - Method in class org.apache.spark.sql.sources.LogicalRelation
-
Used to look up original attribute capitalization
- attributes() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
-
- attributes() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
-
- attributes() - Method in class org.apache.spark.sql.hive.MetastoreRelation
-
Non-partitionKey attributes
- attributes() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
-
- attributes() - Method in class org.apache.spark.sql.parquet.RowWriteSupport
-
- attrs() - Method in class org.apache.spark.graphx.impl.VertexAttributeBlock
-
- AUTO_BROADCASTJOIN_THRESHOLD() - Static method in class org.apache.spark.sql.SQLConf
-
- autoBroadcastJoinThreshold() - Method in class org.apache.spark.sql.SQLConf
-
Upper bound on the sizes (in bytes) of the tables qualified for the auto conversion to
a broadcast value during the physical executions of join operations.
- Average() - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
-
- avg(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the average of the values in a group.
- avg(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the average of the values in a group.
- avg(String...) - Method in class org.apache.spark.sql.GroupedData
-
Compute the mean value for each numeric column for each group.
- avg(Seq<String>) - Method in class org.apache.spark.sql.GroupedData
-
Compute the mean value for each numeric column for each group.
- AVG() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- awaitResult() - Method in class org.apache.spark.partial.ApproximateActionListener
-
Waits for up to timeout milliseconds since the listener was created and then returns a
PartialResult with the result so far.
- awaitResult() - Method in class org.apache.spark.scheduler.JobWaiter
-
- awaitTermination() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Wait for the execution to stop.
- awaitTermination(long) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Wait for the execution to stop.
- awaitTermination() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
-
Block the calling thread until the supervisor is stopped
- awaitTermination() - Method in class org.apache.spark.streaming.StreamingContext
-
Wait for the execution to stop.
- awaitTermination(long) - Method in class org.apache.spark.streaming.StreamingContext
-
Wait for the execution to stop.
- awaitTermination() - Method in class org.apache.spark.util.logging.FileAppender
-
Wait for the appender to stop appending, either because the input stream is closed
or because of an error while appending
- awaitTerminationOrTimeout(long) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Wait for the execution to stop.
- awaitTerminationOrTimeout(long) - Method in class org.apache.spark.streaming.StreamingContext
-
Wait for the execution to stop.
- axpy(double, Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
-
y += a * x
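The update `y += a * x` can be illustrated on plain arrays. This is a minimal sketch assuming dense `double[]` inputs of equal length; the class name `AxpySketch` is hypothetical and the code does not use the Spark `Vector` type or the actual `BLAS` implementation.

```java
import java.util.Arrays;

// Hypothetical stand-alone illustration; not the Spark BLAS class.
final class AxpySketch {
    /** Updates y in place so that each y[i] becomes y[i] + a * x[i]. */
    static void axpy(double a, double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        for (int i = 0; i < x.length; i++) {
            y[i] += a * x[i];
        }
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0};
        double[] y = {10.0, 20.0, 30.0};
        axpy(2.0, x, y);
        System.out.println(Arrays.toString(y)); // prints [12.0, 24.0, 36.0]
    }
}
```

Note that `x` is left unchanged and only `y` is mutated.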
- cache() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- cache() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- cache() - Method in class org.apache.spark.api.java.JavaRDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- cache() - Method in class org.apache.spark.graphx.Graph
-
Caches the vertices and edges associated with this graph at the previously-specified target
storage levels, which default to `MEMORY_ONLY`.
- cache() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
Persists the edge partitions using `targetStorageLevel`, which defaults to MEMORY_ONLY.
- cache() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- cache() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
Persists the vertex partitions at `targetStorageLevel`, which defaults to MEMORY_ONLY.
- cache() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Caches the underlying RDD.
- cache() - Method in class org.apache.spark.partial.StudentTCacher
-
- cache() - Method in class org.apache.spark.rdd.RDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- cache() - Method in class org.apache.spark.sql.DataFrame
-
- cache() - Method in interface org.apache.spark.sql.RDDApi
-
- cache() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- cache() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- cache() - Method in class org.apache.spark.streaming.dstream.DStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- CachedBatch - Class in org.apache.spark.sql.columnar
-
- CachedBatch(byte[][], Row) - Constructor for class org.apache.spark.sql.columnar.CachedBatch
-
- cachedColumnBuffers() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
-
- CachedData - Class in org.apache.spark.sql
-
Holds a cached logical plan and its data
- CachedData(LogicalPlan, InMemoryRelation) - Constructor for class org.apache.spark.sql.CachedData
-
- cachedRepresentation() - Method in class org.apache.spark.sql.CachedData
-
- CacheManager - Class in org.apache.spark
-
Spark class responsible for passing RDD partition contents to the BlockManager and making
sure a node doesn't load two copies of an RDD at once.
- CacheManager(BlockManager) - Constructor for class org.apache.spark.CacheManager
-
- cacheManager() - Method in class org.apache.spark.SparkEnv
-
- CacheManager - Class in org.apache.spark.sql
-
Provides support in a SQLContext for caching query results and automatically using these cached
results when subsequent queries are executed.
- CacheManager(SQLContext) - Constructor for class org.apache.spark.sql.CacheManager
-
- cacheQuery(DataFrame, Option<String>, StorageLevel) - Method in class org.apache.spark.sql.CacheManager
-
Caches the data produced by the logical representation of the given schema rdd.
- cacheTable(String) - Method in class org.apache.spark.sql.CacheManager
-
Caches the specified table in-memory.
- cacheTable(String) - Method in class org.apache.spark.sql.SQLContext
-
Caches the specified table in-memory.
- calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
-
:: DeveloperApi ::
information calculation for multiclass classification
- calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
-
:: DeveloperApi ::
variance calculation
- calculate() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
-
Calculate the impurity from the stored sufficient statistics.
- calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
-
:: DeveloperApi ::
information calculation for multiclass classification
- calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
-
:: DeveloperApi ::
variance calculation
- calculate() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
-
Calculate the impurity from the stored sufficient statistics.
- calculate(double[], double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
-
:: DeveloperApi ::
information calculation for multiclass classification
- calculate(double, double, double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
-
:: DeveloperApi ::
information calculation for regression
- calculate() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
-
Calculate the impurity from the stored sufficient statistics.
- calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
-
:: DeveloperApi ::
information calculation for multiclass classification
- calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
-
:: DeveloperApi ::
variance calculation
- calculate() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
-
Calculate the impurity from the stored sufficient statistics.
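The `calculate` overloads listed above take either per-class counts with a total, or a count, sum, and sum of squares. The sketch below shows the standard textbook definitions of Gini impurity, entropy (base 2), and variance computed from those inputs. It is an assumption that these formulas correspond to the listed methods; the class and method names here are hypothetical.

```java
// Hypothetical illustration of textbook impurity measures; not the Spark classes.
final class ImpuritySketch {
    /** Gini impurity: 1 minus the sum of squared class frequencies. */
    static double gini(double[] counts, double totalCount) {
        if (totalCount == 0) {
            return 0.0;
        }
        double impurity = 1.0;
        for (double c : counts) {
            double p = c / totalCount;
            impurity -= p * p;
        }
        return impurity;
    }

    /** Entropy in bits: minus the sum of p * log2(p) over non-empty classes. */
    static double entropy(double[] counts, double totalCount) {
        if (totalCount == 0) {
            return 0.0;
        }
        double impurity = 0.0;
        for (double c : counts) {
            if (c > 0) {
                double p = c / totalCount;
                impurity -= p * (Math.log(p) / Math.log(2.0));
            }
        }
        return impurity;
    }

    /** Population variance from sufficient statistics: E[x^2] - (E[x])^2. */
    static double variance(double count, double sum, double sumSquares) {
        if (count == 0) {
            return 0.0;
        }
        double mean = sum / count;
        return sumSquares / count - mean * mean;
    }

    public static void main(String[] args) {
        double[] counts = {5.0, 5.0};
        System.out.println(gini(counts, 10.0));
        System.out.println(entropy(counts, 10.0));
        // Values 1, 2, 3, 4: count 4, sum 10, sum of squares 30.
        System.out.println(variance(4.0, 10.0, 30.0));
    }
}
```

Working from sufficient statistics means the impurity can be updated by adding counts and sums, without revisiting individual records.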
- calculatedTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- calculateNumBatchesToRemember(Duration) - Static method in class org.apache.spark.streaming.dstream.FileInputDStream
-
Calculate the number of most recent batches to remember, such that all the files selected
during at least the last MIN_REMEMBER_DURATION can be remembered.
- calculateTotalMemory(SparkContext) - Static method in class org.apache.spark.scheduler.cluster.mesos.MemoryUtils
-
- call(T) - Method in interface org.apache.spark.api.java.function.DoubleFlatMapFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.DoubleFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.FlatMapFunction
-
- call(T1, T2) - Method in interface org.apache.spark.api.java.function.FlatMapFunction2
-
- call(T1) - Method in interface org.apache.spark.api.java.function.Function
-
- call(T1, T2) - Method in interface org.apache.spark.api.java.function.Function2
-
- call(T1, T2, T3) - Method in interface org.apache.spark.api.java.function.Function3
-
- call(T) - Method in interface org.apache.spark.api.java.function.PairFlatMapFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.PairFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.VoidFunction
-
- call(T1) - Method in interface org.apache.spark.sql.api.java.UDF1
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10) - Method in interface org.apache.spark.sql.api.java.UDF10
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11) - Method in interface org.apache.spark.sql.api.java.UDF11
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12) - Method in interface org.apache.spark.sql.api.java.UDF12
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13) - Method in interface org.apache.spark.sql.api.java.UDF13
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14) - Method in interface org.apache.spark.sql.api.java.UDF14
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15) - Method in interface org.apache.spark.sql.api.java.UDF15
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16) - Method in interface org.apache.spark.sql.api.java.UDF16
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17) - Method in interface org.apache.spark.sql.api.java.UDF17
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18) - Method in interface org.apache.spark.sql.api.java.UDF18
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19) - Method in interface org.apache.spark.sql.api.java.UDF19
-
- call(T1, T2) - Method in interface org.apache.spark.sql.api.java.UDF2
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20) - Method in interface org.apache.spark.sql.api.java.UDF20
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21) - Method in interface org.apache.spark.sql.api.java.UDF21
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21, T22) - Method in interface org.apache.spark.sql.api.java.UDF22
-
- call(T1, T2, T3) - Method in interface org.apache.spark.sql.api.java.UDF3
-
- call(T1, T2, T3, T4) - Method in interface org.apache.spark.sql.api.java.UDF4
-
- call(T1, T2, T3, T4, T5) - Method in interface org.apache.spark.sql.api.java.UDF5
-
- call(T1, T2, T3, T4, T5, T6) - Method in interface org.apache.spark.sql.api.java.UDF6
-
- call(T1, T2, T3, T4, T5, T6, T7) - Method in interface org.apache.spark.sql.api.java.UDF7
-
- call(T1, T2, T3, T4, T5, T6, T7, T8) - Method in interface org.apache.spark.sql.api.java.UDF8
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9) - Method in interface org.apache.spark.sql.api.java.UDF9
-
- callSite() - Method in class org.apache.spark.scheduler.ActiveJob
-
- callSite() - Method in class org.apache.spark.scheduler.JobSubmitted
-
- callSite() - Method in class org.apache.spark.scheduler.Stage
-
- CallSite - Class in org.apache.spark.util
-
CallSite represents a place in user code.
- CallSite(String, String) - Constructor for class org.apache.spark.util.CallSite
-
- callUDF(Function0<?>, DataType) - Static method in class org.apache.spark.sql.functions
-
Call a Scala function of 0 arguments as user-defined function (UDF).
- callUDF(Function1<?, ?>, DataType, Column) - Static method in class org.apache.spark.sql.functions
-
Call a Scala function of 1 argument as user-defined function (UDF).
- callUDF(Function2<?, ?, ?>, DataType, Column, Column) - Static method in class org.apache.spark.sql.functions
-
Call a Scala function of 2 arguments as user-defined function (UDF).
- callUDF(Function3<?, ?, ?, ?>, DataType, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
-
Call a Scala function of 3 arguments as user-defined function (UDF).
- callUDF(Function4<?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
-
Call a Scala function of 4 arguments as user-defined function (UDF).
- callUDF(Function5<?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
-
Call a Scala function of 5 arguments as user-defined function (UDF).
- callUDF(Function6<?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
-
Call a Scala function of 6 arguments as user-defined function (UDF).
- callUDF(Function7<?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
-
Call a Scala function of 7 arguments as user-defined function (UDF).
- callUDF(Function8<?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
-
Call a Scala function of 8 arguments as user-defined function (UDF).
- callUDF(Function9<?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
-
Call a Scala function of 9 arguments as user-defined function (UDF).
- callUDF(Function10<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
-
Call a Scala function of 10 arguments as user-defined function (UDF).
- cancel() - Method in class org.apache.spark.ComplexFutureAction
-
- cancel() - Method in interface org.apache.spark.FutureAction
-
Cancels the execution of this action.
- cancel(boolean) - Method in class org.apache.spark.JavaFutureActionWrapper
-
- cancel() - Method in class org.apache.spark.scheduler.JobWaiter
-
Sends a signal to the DAGScheduler to cancel the job.
- cancel() - Method in class org.apache.spark.SimpleFutureAction
-
- cancel() - Method in class org.apache.spark.util.MetadataCleaner
-
- cancelAllJobs() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Cancel all jobs that have been scheduled or are running.
- cancelAllJobs() - Method in class org.apache.spark.scheduler.DAGScheduler
-
Cancel all jobs that are running or waiting in the queue.
- cancelAllJobs() - Method in class org.apache.spark.SparkContext
-
Cancel all jobs that have been scheduled or are running.
- cancelJob(int) - Method in class org.apache.spark.scheduler.DAGScheduler
-
Cancel a job that is running or waiting in the queue.
- cancelJob(int) - Method in class org.apache.spark.SparkContext
-
Cancel a given job if it's scheduled or running
- cancelJobGroup(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Cancel active jobs for the specified group.
- cancelJobGroup(String) - Method in class org.apache.spark.scheduler.DAGScheduler
-
- cancelJobGroup(String) - Method in class org.apache.spark.SparkContext
-
Cancel active jobs for the specified group.
- cancelStage(int) - Method in class org.apache.spark.scheduler.DAGScheduler
-
Cancel all jobs associated with a running or scheduled stage.
- cancelStage(int) - Method in class org.apache.spark.SparkContext
-
Cancel a given stage and all jobs associated with it
- cancelTasks(int, boolean) - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- cancelTasks(int, boolean) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- canCommit(int, long, long) - Method in class org.apache.spark.scheduler.OutputCommitCoordinator
-
Called by tasks to ask whether they can commit their output to HDFS.
- canEqual(Object) - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
-
- canEqual(Object) - Method in class org.apache.spark.util.MutablePair
-
- canFetchMoreResults(long) - Method in class org.apache.spark.scheduler.TaskSetManager
-
Check whether there is enough quota to fetch a result of `size` bytes.
- capacity() - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
-
- cartesian(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of
elements (a, b) where a is in `this` and b is in `other`.
- cartesian(RDD<U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of
elements (a, b) where a is in `this` and b is in `other`.
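The Cartesian product described for `cartesian` can be sketched on in-memory lists. This is only an illustration of the pairing, assuming small local collections; `CartesianSketch` is a hypothetical name, and the code does not involve RDDs or partitioning.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical local illustration; the RDD version is distributed.
final class CartesianSketch {
    /** Returns every pair (a, b) with a taken from left and b taken from right. */
    static <T, U> List<Map.Entry<T, U>> cartesian(List<T> left, List<U> right) {
        List<Map.Entry<T, U>> pairs = new ArrayList<>(left.size() * right.size());
        for (T a : left) {
            for (U b : right) {
                pairs.add(new SimpleEntry<>(a, b));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<Map.Entry<Integer, String>> pairs =
            cartesian(Arrays.asList(1, 2), Arrays.asList("x", "y"));
        System.out.println(pairs); // prints [1=x, 1=y, 2=x, 2=y]
    }
}
```

The result has `left.size() * right.size()` elements, which is why this operation grows quickly on large inputs.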
- CartesianPartition - Class in org.apache.spark.rdd
-
- CartesianPartition(int, RDD<?>, RDD<?>, int, int) - Constructor for class org.apache.spark.rdd.CartesianPartition
-
- CartesianRDD<T,U> - Class in org.apache.spark.rdd
-
- CartesianRDD(SparkContext, RDD<T>, RDD<U>, ClassTag<T>, ClassTag<U>) - Constructor for class org.apache.spark.rdd.CartesianRDD
-
- CASE() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- CaseInsensitiveMap - Class in org.apache.spark.sql.sources
-
Builds a map in which keys are case insensitive
- CaseInsensitiveMap(Map<String, String>) - Constructor for class org.apache.spark.sql.sources.CaseInsensitiveMap
-
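A map with case-insensitive keys can be approximated by normalizing keys on both insertion and lookup. The sketch below is one possible approach, assuming lower-casing with `Locale.ROOT`; the class name is hypothetical and this is not claimed to be how the listed `CaseInsensitiveMap` is implemented.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration; not the Spark CaseInsensitiveMap class.
final class CaseInsensitiveMapSketch {
    private final Map<String, String> lowered = new HashMap<>();

    /** Copies the source map, lower-casing every key. */
    CaseInsensitiveMapSketch(Map<String, String> source) {
        for (Map.Entry<String, String> e : source.entrySet()) {
            lowered.put(e.getKey().toLowerCase(Locale.ROOT), e.getValue());
        }
    }

    /** Looks up a value regardless of the case used in the key; null if absent. */
    String get(String key) {
        return lowered.get(key.toLowerCase(Locale.ROOT));
    }

    boolean contains(String key) {
        return lowered.containsKey(key.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Map<String, String> options = new HashMap<>();
        options.put("Path", "/tmp/data");
        CaseInsensitiveMapSketch map = new CaseInsensitiveMapSketch(options);
        System.out.println(map.get("PATH")); // prints /tmp/data
    }
}
```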
- caseSensitive() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
-
- cast(DataType) - Method in class org.apache.spark.sql.Column
-
Casts the column to a different data type.
- cast(String) - Method in class org.apache.spark.sql.Column
-
Casts the column to a different data type, using the canonical string representation
of the type.
- castAndRenameChildOutput(InsertIntoTable, Seq<Attribute>, LogicalPlan) - Static method in class org.apache.spark.sql.sources.PreInsertCastAndRename
-
If necessary, cast data types and rename fields to the expected types and names.
- castChildOutput(InsertIntoTable, MetastoreRelation, LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.PreInsertionCasts
-
- catalog() - Method in class org.apache.spark.sql.sources.PreWriteCheck
-
- CatalystArrayContainsNullConverter - Class in org.apache.spark.sql.parquet
-
A `parquet.io.api.GroupConverter` that converts single-element groups that
match the characteristics of an array containing nulls (see `ParquetTypesConverter`) into an `ArrayType`.
- CatalystArrayContainsNullConverter(DataType, int, CatalystConverter, Buffer<Object>) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
-
- CatalystArrayContainsNullConverter(DataType, int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
-
- CatalystArrayConverter - Class in org.apache.spark.sql.parquet
-
A `parquet.io.api.GroupConverter` that converts single-element groups that
match the characteristics of an array (see `ParquetTypesConverter`) into an `ArrayType`.
- CatalystArrayConverter(DataType, int, CatalystConverter, Buffer<Object>) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayConverter
-
- CatalystArrayConverter(DataType, int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystArrayConverter
-
- CatalystConverter - Class in org.apache.spark.sql.parquet
-
- CatalystConverter() - Constructor for class org.apache.spark.sql.parquet.CatalystConverter
-
- CatalystGroupConverter - Class in org.apache.spark.sql.parquet
-
A `parquet.io.api.GroupConverter` that is able to convert a Parquet record
to a `org.apache.spark.sql.catalyst.expressions.Row` object.
- CatalystGroupConverter(StructField[], int, CatalystConverter, ArrayBuffer<Object>, ArrayBuffer<Row>) - Constructor for class org.apache.spark.sql.parquet.CatalystGroupConverter
-
- CatalystGroupConverter(StructField[], int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystGroupConverter
-
- CatalystGroupConverter(Attribute[]) - Constructor for class org.apache.spark.sql.parquet.CatalystGroupConverter
-
This constructor is used for the root converter only!
- CatalystMapConverter - Class in org.apache.spark.sql.parquet
-
A `parquet.io.api.GroupConverter` that converts two-element groups that
match the characteristics of a map (see `ParquetTypesConverter`) into a `MapType`.
- CatalystMapConverter(StructField[], int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystMapConverter
-
- CatalystNativeArrayConverter - Class in org.apache.spark.sql.parquet
-
A `parquet.io.api.GroupConverter` that converts single-element groups that
match the characteristics of an array (see `ParquetTypesConverter`) into an `ArrayType`.
- CatalystNativeArrayConverter(NativeType, int, CatalystConverter, int) - Constructor for class org.apache.spark.sql.parquet.CatalystNativeArrayConverter
-
- CatalystPrimitiveConverter - Class in org.apache.spark.sql.parquet
-
A `parquet.io.api.PrimitiveConverter` that converts Parquet types to Catalyst types.
- CatalystPrimitiveConverter(CatalystConverter, int) - Constructor for class org.apache.spark.sql.parquet.CatalystPrimitiveConverter
-
- CatalystPrimitiveRowConverter - Class in org.apache.spark.sql.parquet
-
A `parquet.io.api.GroupConverter` that is able to convert a Parquet record
to a `org.apache.spark.sql.catalyst.expressions.Row` object.
- CatalystPrimitiveRowConverter(StructField[], MutableRow) - Constructor for class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
-
- CatalystPrimitiveRowConverter(Attribute[]) - Constructor for class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
-
- CatalystPrimitiveStringConverter - Class in org.apache.spark.sql.parquet
-
A `parquet.io.api.PrimitiveConverter` that converts Parquet Binary to Catalyst String.
- CatalystPrimitiveStringConverter(CatalystConverter, int) - Constructor for class org.apache.spark.sql.parquet.CatalystPrimitiveStringConverter
-
- CatalystScan - Interface in org.apache.spark.sql.sources
-
:: Experimental ::
An interface for experimenting with a more direct connection to the query planner.
- CatalystStructConverter - Class in org.apache.spark.sql.parquet
-
This converter is for multi-element groups of primitive or complex types
that have repetition level optional or required (so struct fields).
- CatalystStructConverter(StructField[], int, CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.CatalystStructConverter
-
- CatalystTimestampConverter - Class in org.apache.spark.sql.parquet
-
- CatalystTimestampConverter() - Constructor for class org.apache.spark.sql.parquet.CatalystTimestampConverter
-
- Categorical() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
-
- categoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- categories() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
-
- categories() - Method in class org.apache.spark.mllib.tree.model.Split
-
- category() - Method in class org.apache.spark.mllib.tree.model.Bin
-
- channelFactory() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
-
- channelFactoryExecutor() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
-
- checkEquals(ASTNode) - Method in class org.apache.spark.sql.hive.HiveQl.TransformableNode
-
Throws an error if this is not equal to other.
- checkHost(String, String) - Static method in class org.apache.spark.util.Utils
-
- checkHostPort(String, String) - Static method in class org.apache.spark.util.Utils
-
- checkInputColumn(StructType, String, DataType) - Method in interface org.apache.spark.ml.param.Params
-
Check whether the given schema contains an input column.
- checkMinimalPollingPeriod(TimeUnit, int) - Static method in class org.apache.spark.metrics.MetricsSystem
-
- checkModifyPermissions(String) - Method in class org.apache.spark.SecurityManager
-
Checks the given user against the modify acl list to see if they have
authorization to modify the application.
- checkOutputSpecs(JobContext) - Method in class org.apache.spark.sql.parquet.AppendingParquetOutputFormat
-
- checkpoint() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Mark this RDD for checkpointing.
- checkpoint() - Method in class org.apache.spark.graphx.Graph
-
Mark this Graph for checkpointing.
- checkpoint() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- checkpoint() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- checkpoint() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- checkpoint() - Method in class org.apache.spark.rdd.CheckpointRDD
-
- checkpoint() - Method in class org.apache.spark.rdd.HadoopRDD
-
- checkpoint() - Method in class org.apache.spark.rdd.RDD
-
Mark this RDD for checkpointing.
- checkpoint(Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Enable periodic checkpointing of RDDs of this DStream.
- checkpoint(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Sets the context to periodically checkpoint the DStream operations for master
fault-tolerance.
- Checkpoint - Class in org.apache.spark.streaming
-
- Checkpoint(StreamingContext, Time) - Constructor for class org.apache.spark.streaming.Checkpoint
-
- checkpoint(Duration) - Method in class org.apache.spark.streaming.dstream.DStream
-
Enable periodic checkpointing of RDDs of this DStream
- checkpoint(Duration) - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
-
- checkpoint(String) - Method in class org.apache.spark.streaming.StreamingContext
-
Set the context to periodically checkpoint the DStream operations for driver
fault-tolerance.
- checkpointBackupFile(String, Time) - Static method in class org.apache.spark.streaming.Checkpoint
-
Get the checkpoint backup file for the given checkpoint time
- checkpointClock() - Method in class org.apache.spark.streaming.kinesis.KinesisCheckpointState
-
- checkpointData() - Method in class org.apache.spark.rdd.RDD
-
- checkpointData() - Method in class org.apache.spark.streaming.dstream.DStream
-
- checkpointDir() - Method in class org.apache.spark.SparkContext
-
- checkpointDir() - Method in class org.apache.spark.streaming.Checkpoint
-
- checkpointDir() - Method in class org.apache.spark.streaming.StreamingContext
-
- checkpointDirToLogDir(String, int) - Static method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler
-
- checkpointDirToLogDir(String) - Static method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
-
- checkpointDuration() - Method in class org.apache.spark.streaming.Checkpoint
-
- checkpointDuration() - Method in class org.apache.spark.streaming.dstream.DStream
-
- checkpointDuration() - Method in class org.apache.spark.streaming.StreamingContext
-
- Checkpointed() - Static method in class org.apache.spark.rdd.CheckpointState
-
- checkpointFile(String, Time) - Static method in class org.apache.spark.streaming.Checkpoint
-
Get the checkpoint file for the given checkpoint time
- CheckpointingInProgress() - Static method in class org.apache.spark.rdd.CheckpointState
-
- checkpointInProgress() - Method in class org.apache.spark.streaming.DStreamGraph
-
- checkpointInterval() - Method in class org.apache.spark.mllib.impl.PeriodicGraphCheckpointer
-
- checkpointInterval() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- checkpointInterval() - Method in class org.apache.spark.mllib.tree.impl.NodeIdCache
-
- checkpointPath() - Method in class org.apache.spark.rdd.CheckpointRDD
-
- CheckpointRDD<T> - Class in org.apache.spark.rdd
-
This RDD represents an RDD checkpoint file (similar to HadoopRDD).
- CheckpointRDD(SparkContext, String, ClassTag<T>) - Constructor for class org.apache.spark.rdd.CheckpointRDD
-
- checkpointRDD() - Method in class org.apache.spark.rdd.RDDCheckpointData
-
- CheckpointRDDPartition - Class in org.apache.spark.rdd
-
- CheckpointRDDPartition(int) - Constructor for class org.apache.spark.rdd.CheckpointRDDPartition
-
- CheckpointReader - Class in org.apache.spark.streaming
-
- CheckpointReader() - Constructor for class org.apache.spark.streaming.CheckpointReader
-
- CheckpointState - Class in org.apache.spark.rdd
-
Enumeration to manage state transitions of an RDD through checkpointing
[ Initialized --> marked for checkpointing --> checkpointing in progress --> checkpointed ]
- CheckpointState() - Constructor for class org.apache.spark.rdd.CheckpointState
-
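The state sequence listed for `CheckpointState` can be modelled as a simple forward-only enumeration. The sketch below is illustrative only: the constant names are derived from the bracketed description, and the `next()` helper is a hypothetical addition, not part of the listed class.

```java
// Hypothetical illustration of the forward-only state sequence.
enum CheckpointStateSketch {
    INITIALIZED,
    MARKED_FOR_CHECKPOINT,
    CHECKPOINTING_IN_PROGRESS,
    CHECKPOINTED;

    /** Returns the following state; the final state has no successor. */
    CheckpointStateSketch next() {
        if (this == CHECKPOINTED) {
            throw new IllegalStateException("already checkpointed");
        }
        return values()[ordinal() + 1];
    }

    public static void main(String[] args) {
        CheckpointStateSketch state = INITIALIZED;
        while (state != CHECKPOINTED) {
            state = state.next();
            System.out.println(state);
        }
    }
}
```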
- checkpointTime() - Method in class org.apache.spark.streaming.Checkpoint
-
- CheckpointWriter - Class in org.apache.spark.streaming
-
Convenience class to handle the writing of graph checkpoints to file
- CheckpointWriter(JobGenerator, SparkConf, String, Configuration) - Constructor for class org.apache.spark.streaming.CheckpointWriter
-
- CheckpointWriter.CheckpointWriteHandler - Class in org.apache.spark.streaming
-
- CheckpointWriter.CheckpointWriteHandler(Time, byte[]) - Constructor for class org.apache.spark.streaming.CheckpointWriter.CheckpointWriteHandler
-
- checkSpeculatableTasks() - Method in class org.apache.spark.scheduler.Pool
-
- checkSpeculatableTasks() - Method in interface org.apache.spark.scheduler.Schedulable
-
- checkSpeculatableTasks() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- checkSpeculatableTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
-
Check for tasks to be speculated and return true if there are any.
- checkState(boolean, Function0<String>) - Static method in class org.apache.spark.streaming.util.HdfsUtils
-
- checkTimeoutInterval() - Method in class org.apache.spark.storage.BlockManagerMasterActor
-
- checkUIViewPermissions(String) - Method in class org.apache.spark.SecurityManager
-
Checks the given user against the view acl list to see if they have
authorization to view the UI.
- child() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
-
- child() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
-
- child() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
-
- child() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
-
- child() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
-
- child() - Method in class org.apache.spark.sql.sources.CreateTableUsingAsSelect
-
- child() - Method in class org.apache.spark.sql.sources.Not
-
- ChildFirstURLClassLoader - Class in org.apache.spark.util
-
A mutable class loader that gives preference to its own URLs over the parent class loader
when loading classes and resources.
- ChildFirstURLClassLoader(URL[], ClassLoader) - Constructor for class org.apache.spark.util.ChildFirstURLClassLoader
-
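A child-first lookup order can be sketched with the JDK alone: search the loader's own URLs before delegating to the supplied parent. This is a simplified illustration under stated assumptions: `ChildFirstLoaderSketch` and `canLoad` are hypothetical names, only class loading is covered (resource lookup is omitted), and it is not claimed to mirror the listed `ChildFirstURLClassLoader`.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Objects;

// Hypothetical illustration; covers classes only, not resources.
final class ChildFirstLoaderSketch extends URLClassLoader {
    private final ClassLoader fallback;

    ChildFirstLoaderSketch(URL[] urls, ClassLoader parent) {
        // A null parent means the superclass searches only the bootstrap
        // loader and this loader's own URLs.
        super(urls, null);
        this.fallback = Objects.requireNonNull(parent, "parent");
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        try {
            // Own URLs (and JDK core classes) take precedence.
            return super.loadClass(name, resolve);
        } catch (ClassNotFoundException e) {
            // Only then consult the real parent.
            return fallback.loadClass(name);
        }
    }

    /** Convenience check that hides the checked exception. */
    static boolean canLoad(ClassLoader loader, String className) {
        try {
            loader.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader =
            new ChildFirstLoaderSketch(new URL[0], ClassLoader.getSystemClassLoader());
        System.out.println(canLoad(loader, "java.util.ArrayList"));
        System.out.println(canLoad(loader, "no.such.Clazz"));
    }
}
```

The standard JDK delegation model is parent-first; reversing it lets application jars override classes that also exist on the parent's classpath.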
- children() - Method in class org.apache.spark.mllib.fpm.FPTree.Node
-
- children() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
-
- children() - Method in class org.apache.spark.sql.hive.HiveGenericUdaf
-
- children() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
-
- children() - Method in class org.apache.spark.sql.hive.HiveGenericUdtf
-
- children() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
-
- children() - Method in class org.apache.spark.sql.hive.HiveUdaf
-
- children() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
-
- chiSqFunc() - Method in class org.apache.spark.mllib.stat.test.ChiSqTest.Method
-
- ChiSqSelector - Class in org.apache.spark.mllib.feature
-
:: Experimental ::
Creates a ChiSquared feature selector.
- ChiSqSelector(int) - Constructor for class org.apache.spark.mllib.feature.ChiSqSelector
-
- ChiSqSelectorModel - Class in org.apache.spark.mllib.feature
-
:: Experimental ::
Chi Squared selector model.
- ChiSqSelectorModel(int[]) - Constructor for class org.apache.spark.mllib.feature.ChiSqSelectorModel
-
- chiSqTest(Vector, Vector) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Conduct Pearson's chi-squared goodness of fit test of the observed data against the
expected distribution.
- chiSqTest(Vector) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Conduct Pearson's chi-squared goodness of fit test of the observed data against the uniform
distribution, with each category having an expected frequency of `1 / observed.size`.
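The goodness of fit statistic against a uniform distribution can be worked through by hand. The sketch below computes Pearson's statistic, the sum of `(observed - expected)^2 / expected`, with every category expected to hold `total / n` observations. It assumes plain `double[]` counts and a hypothetical class name, and it stops at the statistic and degrees of freedom without computing a p-value.

```java
// Hypothetical illustration of Pearson's statistic; no p-value is computed.
final class ChiSqGoodnessOfFitSketch {
    /** Pearson's statistic for observed counts against a uniform expectation. */
    static double statistic(double[] observed) {
        if (observed.length == 0) {
            throw new IllegalArgumentException("observed must not be empty");
        }
        double total = 0.0;
        for (double o : observed) {
            if (o < 0) {
                throw new IllegalArgumentException("counts must be non-negative");
            }
            total += o;
        }
        if (total == 0) {
            throw new IllegalArgumentException("counts must not all be zero");
        }
        // Uniform expectation: each category gets an equal share of the total.
        double expected = total / observed.length;
        double stat = 0.0;
        for (double o : observed) {
            double diff = o - expected;
            stat += diff * diff / expected;
        }
        return stat;
    }

    /** Degrees of freedom: number of categories minus one. */
    static int degreesOfFreedom(double[] observed) {
        return observed.length - 1;
    }

    public static void main(String[] args) {
        double[] observed = {10.0, 20.0, 30.0};
        // Expected count is 20 per category, so the statistic is (100 + 0 + 100) / 20.
        System.out.println(statistic(observed)); // prints 10.0
        System.out.println(degreesOfFreedom(observed)); // prints 2
    }
}
```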
- chiSqTest(Matrix) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Conduct Pearson's independence test on the input contingency matrix, which cannot contain
negative entries or columns or rows that sum up to 0.
- chiSqTest(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Conduct Pearson's independence test for every feature against the label across the input RDD.
- ChiSqTest - Class in org.apache.spark.mllib.stat.test
-
Conduct the chi-squared test for the input RDDs using the specified method.
- ChiSqTest() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest
-
- ChiSqTest.Method - Class in org.apache.spark.mllib.stat.test
-
- ChiSqTest.Method(String, Function2<Object, Object, Object>) - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.Method
-
- ChiSqTest.Method$ - Class in org.apache.spark.mllib.stat.test
-
- ChiSqTest.Method$() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.Method$
-
- ChiSqTest.NullHypothesis$ - Class in org.apache.spark.mllib.stat.test
-
- ChiSqTest.NullHypothesis$() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.NullHypothesis$
-
- ChiSqTestResult - Class in org.apache.spark.mllib.stat.test
-
:: Experimental ::
Object containing the test results for the chi-squared hypothesis test.
- ChiSqTestResult(double, int, double, String, String) - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- chiSquared(Vector, Vector, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
-
- chiSquaredFeatures(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
-
Conduct Pearson's independence test for each feature against the label across the input RDD.
- chiSquaredMatrix(Matrix, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
-
- chmod700(File) - Static method in class org.apache.spark.util.Utils
-
JDK equivalent of chmod 700 file.
- classForName(String) - Static method in class org.apache.spark.util.Utils
-
Preferred alternative to Class.forName(className)
- Classification() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
-
- ClassificationModel<FeaturesType,M extends ClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification
-
:: AlphaComponent ::
Model produced by a Classifier.
- ClassificationModel() - Constructor for class org.apache.spark.ml.classification.ClassificationModel
-
- ClassificationModel - Interface in org.apache.spark.mllib.classification
-
:: Experimental ::
Represents a classification model that predicts to which of a set of categories an example
belongs.
- Classifier<FeaturesType,E extends Classifier<FeaturesType,E,M>,M extends ClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification
-
:: AlphaComponent ::
Single-label binary or multiclass classification.
- Classifier() - Constructor for class org.apache.spark.ml.classification.Classifier
-
- ClassifierParams - Interface in org.apache.spark.ml.classification
-
:: DeveloperApi ::
Params for classification.
- classIsLoadable(String) - Static method in class org.apache.spark.util.Utils
-
Determines whether the provided class is loadable in the current thread.
- classLoader() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- className() - Method in class org.apache.spark.ExceptionFailure
-
- classpathEntries() - Method in class org.apache.spark.ui.env.EnvironmentListener
-
- classTag() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
- classTag() - Method in class org.apache.spark.api.java.JavaPairRDD
-
- classTag() - Method in class org.apache.spark.api.java.JavaRDD
-
- classTag() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
- classTag() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
-
- clean(F, boolean) - Method in class org.apache.spark.SparkContext
-
Clean a closure to make it ready to be serialized and sent to tasks
(removes unreferenced variables in $outer's, updates REPL variables).
If checkSerializable is set, clean will also proactively
check to see if f is serializable and throw a SparkException
if not.
- clean(Object, boolean) - Static method in class org.apache.spark.util.ClosureCleaner
-
- CleanBroadcast - Class in org.apache.spark
-
- CleanBroadcast(long) - Constructor for class org.apache.spark.CleanBroadcast
-
- cleaner() - Method in class org.apache.spark.SparkContext
-
- CleanerListener - Interface in org.apache.spark
-
Listener class used for testing when any item has been cleaned by the Cleaner class.
- CleanRDD - Class in org.apache.spark
-
- CleanRDD(int) - Constructor for class org.apache.spark.CleanRDD
-
- CleanShuffle - Class in org.apache.spark
-
- CleanShuffle(int) - Constructor for class org.apache.spark.CleanShuffle
-
- cleanup(long) - Method in class org.apache.spark.SparkContext
-
Called by MetadataCleaner to clean up the persistentRdds map periodically
- cleanup(Time) - Method in class org.apache.spark.streaming.dstream.DStreamCheckpointData
-
Cleanup old checkpoint data.
- cleanup(Time) - Method in class org.apache.spark.streaming.dstream.FileInputDStream.FileInputDStreamCheckpointData
-
- cleanup(Time) - Method in class org.apache.spark.streaming.kafka.DirectKafkaInputDStream.DirectKafkaInputDStreamCheckpointData
-
- cleanUpAfterSchedulerStop() - Method in class org.apache.spark.scheduler.DAGScheduler
-
- cleanupOldBatches(Time, boolean) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
-
Clean up block information of old batches.
- cleanupOldBlocks(long) - Method in class org.apache.spark.streaming.receiver.BlockManagerBasedBlockHandler
-
- CleanupOldBlocks - Class in org.apache.spark.streaming.receiver
-
- CleanupOldBlocks(Time) - Constructor for class org.apache.spark.streaming.receiver.CleanupOldBlocks
-
- cleanupOldBlocks(long) - Method in interface org.apache.spark.streaming.receiver.ReceivedBlockHandler
-
Clean up blocks older than the given threshold time.
- cleanupOldBlocks(long) - Method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler
-
- cleanupOldBlocksAndBatches(Time) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
-
Clean up the data and metadata of blocks and batches that are strictly
older than the threshold time.
- cleanupOldLogs(long, boolean) - Method in class org.apache.spark.streaming.util.WriteAheadLogManager
-
Delete the log files that are older than the threshold time.
- CleanupTask - Interface in org.apache.spark
-
Classes that represent cleaning tasks.
- CleanupTaskWeakReference - Class in org.apache.spark
-
A WeakReference associated with a CleanupTask.
- CleanupTaskWeakReference(CleanupTask, Object, ReferenceQueue<Object>) - Constructor for class org.apache.spark.CleanupTaskWeakReference
-
- clear() - Static method in class org.apache.spark.Accumulators
-
- clear() - Method in class org.apache.spark.sql.SQLConf
-
- clear() - Method in class org.apache.spark.storage.BlockManagerInfo
-
- clear() - Method in class org.apache.spark.storage.BlockStore
-
- clear() - Method in class org.apache.spark.storage.MemoryStore
-
- clear() - Method in class org.apache.spark.util.BoundedPriorityQueue
-
- CLEAR_NULL_VALUES_INTERVAL() - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- clearActiveContext() - Static method in class org.apache.spark.SparkContext
-
Clears the active SparkContext metadata.
- clearCache() - Method in class org.apache.spark.sql.CacheManager
-
Clears all cached tables.
- clearCache() - Method in class org.apache.spark.sql.SQLContext
-
Removes all cached tables from the in-memory cache.
- clearCallSite() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Pass-through to SparkContext.clearCallSite.
- clearCallSite() - Method in class org.apache.spark.SparkContext
-
Clear the thread-local property for overriding the call sites
of actions and RDDs.
- clearCheckpointData(Time) - Method in class org.apache.spark.streaming.dstream.DStream
-
- clearCheckpointData(Time) - Method in class org.apache.spark.streaming.DStreamGraph
-
- ClearCheckpointData - Class in org.apache.spark.streaming.scheduler
-
- ClearCheckpointData(Time) - Constructor for class org.apache.spark.streaming.scheduler.ClearCheckpointData
-
- clearDependencies() - Method in class org.apache.spark.rdd.CartesianRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.CoalescedRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.SubtractedRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.UnionRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsBaseRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD2
-
- clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
-
- clearDependencies() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
-
- clearFiles() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Clear the job's list of files added by addFile so that they do not get downloaded to
any new nodes.
- clearFiles() - Method in class org.apache.spark.SparkContext
-
Clear the job's list of files added by addFile so that they do not get downloaded to
any new nodes.
- clearJars() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Clear the job's list of JARs added by addJar so that they do not get downloaded to
any new nodes.
- clearJars() - Method in class org.apache.spark.SparkContext
-
Clear the job's list of JARs added by addJar so that they do not get downloaded to
any new nodes.
- clearJobGroup() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Clear the current thread's job group ID and its description.
- clearJobGroup() - Method in class org.apache.spark.SparkContext
-
Clear the current thread's job group ID and its description.
- clearMetadata(Time) - Method in class org.apache.spark.streaming.dstream.DStream
-
Clear metadata that are older than rememberDuration of this DStream.
- clearMetadata(Time) - Method in class org.apache.spark.streaming.DStreamGraph
-
- ClearMetadata - Class in org.apache.spark.streaming.scheduler
-
- ClearMetadata(Time) - Constructor for class org.apache.spark.streaming.scheduler.ClearMetadata
-
- clearNullValues() - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
Remove entries with values that are no longer strongly reachable.
- clearOldValues(long, Function2<A, B, BoxedUnit>) - Method in class org.apache.spark.util.TimeStampedHashMap
-
- clearOldValues(long) - Method in class org.apache.spark.util.TimeStampedHashMap
-
Removes old key-value pairs that have a timestamp earlier than `threshTime`.
- clearOldValues(long) - Method in class org.apache.spark.util.TimeStampedHashSet
-
Removes old values that have a timestamp earlier than threshTime.
- clearOldValues(long) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
Remove old key-value pairs with timestamps earlier than `threshTime`.
- clearThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
:: Experimental ::
Clears the threshold so that predict will output raw prediction scores.
- clearThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel
-
:: Experimental ::
Clears the threshold so that predict will output raw prediction scores.
- client() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- client() - Method in class org.apache.spark.storage.TachyonBlockManager
-
- client() - Method in class org.apache.spark.streaming.flume.FlumeConnection
-
- clock() - Method in class org.apache.spark.streaming.scheduler.JobGenerator
-
- clock() - Method in class org.apache.spark.streaming.scheduler.JobScheduler
-
- Clock - Interface in org.apache.spark.util
-
An interface to represent clocks, so that they can be mocked out in unit tests.
- clone() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryLabelCounter
-
- clone() - Method in class org.apache.spark.SparkConf
-
Copy this object
- clone(JvmType) - Method in class org.apache.spark.sql.columnar.ColumnType
-
Creates a duplicated copy of the value.
- clone() - Method in class org.apache.spark.storage.StorageLevel
-
- clone() - Method in class org.apache.spark.util.random.BernoulliCellSampler
-
- clone() - Method in class org.apache.spark.util.random.BernoulliSampler
-
- clone() - Method in class org.apache.spark.util.random.PoissonSampler
-
- clone() - Method in interface org.apache.spark.util.random.RandomSampler
-
Return a copy of the RandomSampler object.
- clone(T, SerializerInstance, ClassTag<T>) - Static method in class org.apache.spark.util.Utils
-
Clone an object using a Spark serializer.
- cloneComplement() - Method in class org.apache.spark.util.random.BernoulliCellSampler
-
Return a sampler that is the complement of the range specified by the current sampler.
- close() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- close() - Method in class org.apache.spark.input.FixedLengthBinaryRecordReader
-
- close() - Method in class org.apache.spark.input.PortableDataStream
-
Close the file (if it is currently open)
- close() - Method in class org.apache.spark.input.StreamBasedRecordReader
-
- close() - Method in class org.apache.spark.input.WholeTextFileRecordReader
-
- close() - Method in class org.apache.spark.serializer.DeserializationStream
-
- close() - Method in class org.apache.spark.serializer.JavaDeserializationStream
-
- close() - Method in class org.apache.spark.serializer.JavaSerializationStream
-
- close() - Method in class org.apache.spark.serializer.KryoDeserializationStream
-
- close() - Method in class org.apache.spark.serializer.KryoSerializationStream
-
- close() - Method in class org.apache.spark.serializer.SerializationStream
-
- close() - Method in class org.apache.spark.SparkHadoopWriter
-
- close() - Method in class org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer
-
- close() - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
-
- close() - Method in class org.apache.spark.storage.BlockObjectWriter
-
- close() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
-
- close() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
- close() - Method in class org.apache.spark.streaming.util.RateLimitedOutputStream
-
- close() - Method in class org.apache.spark.streaming.util.WriteAheadLogRandomReader
-
- close() - Method in class org.apache.spark.streaming.util.WriteAheadLogReader
-
- close() - Method in class org.apache.spark.streaming.util.WriteAheadLogWriter
-
- closeIfNeeded() - Method in class org.apache.spark.util.NextIterator
-
Calls the subclass-defined close method, but only once.
- ClosureCleaner - Class in org.apache.spark.util
-
- ClosureCleaner() - Constructor for class org.apache.spark.util.ClosureCleaner
-
- closureSerializer() - Method in class org.apache.spark.SparkEnv
-
- cluster() - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
-
- clusterCenters() - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
- clusterCenters() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
-
- clusterWeights() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
-
- cn() - Method in class org.apache.spark.mllib.feature.VocabWord
-
- coalesce(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD that is reduced into numPartitions partitions.
- coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD that is reduced into numPartitions partitions.
- coalesce(int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD that is reduced into numPartitions partitions.
- coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD that is reduced into numPartitions partitions.
- coalesce(int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD that is reduced into numPartitions partitions.
- coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD that is reduced into numPartitions partitions.
- coalesce(int, boolean, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD that is reduced into numPartitions partitions.
- coalesce(Column...) - Static method in class org.apache.spark.sql.functions
-
Returns the first column that is not null.
- coalesce(Seq<Column>) - Static method in class org.apache.spark.sql.functions
-
Returns the first column that is not null.
- COALESCE() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- CoalescedRDD<T> - Class in org.apache.spark.rdd
-
Represents a coalesced RDD that has fewer partitions than its parent RDD.
This class uses the PartitionCoalescer class to find a good partitioning of the parent RDD
so that each new partition has roughly the same number of parent partitions and that
the preferred location of each new partition overlaps with as many preferred locations of its
parent partitions as possible.
- CoalescedRDD(RDD<T>, int, double, ClassTag<T>) - Constructor for class org.apache.spark.rdd.CoalescedRDD
-
- CoalescedRDDPartition - Class in org.apache.spark.rdd
-
Class that captures a coalesced RDD by essentially keeping track of parent partitions
- CoalescedRDDPartition(int, RDD<?>, int[], Option<String>) - Constructor for class org.apache.spark.rdd.CoalescedRDDPartition
-
- CoarseGrainedClusterMessage - Interface in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages
-
- CoarseGrainedClusterMessages.AddWebUIFilter - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.AddWebUIFilter(String, Map<String, String>, String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
-
- CoarseGrainedClusterMessages.AddWebUIFilter$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.AddWebUIFilter$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter$
-
- CoarseGrainedClusterMessages.KillExecutors - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.KillExecutors(Seq<String>) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutors
-
- CoarseGrainedClusterMessages.KillExecutors$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.KillExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutors$
-
- CoarseGrainedClusterMessages.KillTask - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.KillTask(long, String, boolean) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask
-
- CoarseGrainedClusterMessages.KillTask$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.KillTask$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask$
-
- CoarseGrainedClusterMessages.LaunchTask - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.LaunchTask(SerializableBuffer) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask
-
- CoarseGrainedClusterMessages.LaunchTask$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.LaunchTask$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask$
-
- CoarseGrainedClusterMessages.RegisterClusterManager$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisterClusterManager$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterClusterManager$
-
- CoarseGrainedClusterMessages.RegisteredExecutor$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisteredExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisteredExecutor$
-
- CoarseGrainedClusterMessages.RegisterExecutor - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisterExecutor(String, String, int, Map<String, String>) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
-
- CoarseGrainedClusterMessages.RegisterExecutor$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisterExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor$
-
- CoarseGrainedClusterMessages.RegisterExecutorFailed - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisterExecutorFailed(String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed
-
- CoarseGrainedClusterMessages.RegisterExecutorFailed$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RegisterExecutorFailed$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed$
-
- CoarseGrainedClusterMessages.RemoveExecutor - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RemoveExecutor(String, String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor
-
- CoarseGrainedClusterMessages.RemoveExecutor$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RemoveExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor$
-
- CoarseGrainedClusterMessages.RequestExecutors - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RequestExecutors(int) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors
-
- CoarseGrainedClusterMessages.RequestExecutors$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RequestExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors$
-
- CoarseGrainedClusterMessages.RetrieveSparkProps$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.RetrieveSparkProps$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RetrieveSparkProps$
-
- CoarseGrainedClusterMessages.ReviveOffers$ - Class in org.apache.spark.scheduler.cluster
-
Alternate factory method that takes a ByteBuffer directly for the data field
- CoarseGrainedClusterMessages.ReviveOffers$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.ReviveOffers$
-
- CoarseGrainedClusterMessages.StatusUpdate - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.StatusUpdate(String, long, Enumeration.Value, SerializableBuffer) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
-
- CoarseGrainedClusterMessages.StatusUpdate$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.StatusUpdate$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate$
-
- CoarseGrainedClusterMessages.StopDriver$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.StopDriver$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopDriver$
-
- CoarseGrainedClusterMessages.StopExecutor$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.StopExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutor$
-
- CoarseGrainedClusterMessages.StopExecutors$ - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedClusterMessages.StopExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutors$
-
- CoarseGrainedSchedulerBackend - Class in org.apache.spark.scheduler.cluster
-
A scheduler backend that waits for coarse-grained executors to connect to it through Akka.
- CoarseGrainedSchedulerBackend(TaskSchedulerImpl, ActorSystem) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- CoarseGrainedSchedulerBackend.DriverActor - Class in org.apache.spark.scheduler.cluster
-
- CoarseGrainedSchedulerBackend.DriverActor(Seq<Tuple2<String, String>>) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
-
- CoarseMesosSchedulerBackend - Class in org.apache.spark.scheduler.cluster.mesos
-
A SchedulerBackend that runs tasks on Mesos, but uses "coarse-grained" tasks, where it holds
onto each Mesos node for the duration of the Spark job instead of relinquishing cores whenever
a task is done.
- CoarseMesosSchedulerBackend(TaskSchedulerImpl, SparkContext, String) - Constructor for class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- code() - Method in class org.apache.spark.mllib.feature.VocabWord
-
- CODEGEN_ENABLED() - Static method in class org.apache.spark.sql.SQLConf
-
- codegenEnabled() - Method in class org.apache.spark.sql.SQLConf
-
When set to true, Spark SQL will use the Scala compiler at runtime to generate custom bytecode
that evaluates expressions found in queries.
- codeLen() - Method in class org.apache.spark.mllib.feature.VocabWord
-
- cogroup(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other, return a resulting RDD that contains a tuple with the
list of values for that key in this as well as other.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other1 or other2, return a resulting RDD that contains a
tuple with the list of values for that key in this, other1 and other2.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other1 or other2 or other3,
return a resulting RDD that contains a tuple with the list of values
for that key in this, other1, other2 and other3.
- cogroup(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other, return a resulting RDD that contains a tuple with the
list of values for that key in this as well as other.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other1 or other2, return a resulting RDD that contains a
tuple with the list of values for that key in this, other1 and other2.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other1 or other2 or other3,
return a resulting RDD that contains a tuple with the list of values
for that key in this, other1, other2 and other3.
- cogroup(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other, return a resulting RDD that contains a tuple with the
list of values for that key in this as well as other.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other1 or other2, return a resulting RDD that contains a
tuple with the list of values for that key in this, other1 and other2.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other1 or other2 or other3,
return a resulting RDD that contains a tuple with the list of values
for that key in this, other1, other2 and other3.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this or other1 or other2 or other3,
return a resulting RDD that contains a tuple with the list of values
for that key in this, other1, other2 and other3.
- cogroup(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
- cogroup(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this or other, return a resulting RDD that contains a tuple with the
list of values for that key in this as well as other.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this or other1 or other2, return a resulting RDD that contains a
tuple with the list of values for that key in this, other1 and other2.
- cogroup(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this or other, return a resulting RDD that contains a tuple with the
list of values for that key in this as well as other.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this or other1 or other2, return a resulting RDD that contains a
tuple with the list of values for that key in this, other1 and other2.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this or other1 or other2 or other3,
return a resulting RDD that contains a tuple with the list of values
for that key in this, other1, other2 and other3.
- cogroup(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- CoGroupedRDD<K> - Class in org.apache.spark.rdd
-
:: DeveloperApi ::
An RDD that cogroups its parents.
- CoGroupedRDD(Seq<RDD<? extends Product2<K, ?>>>, Partitioner) - Constructor for class org.apache.spark.rdd.CoGroupedRDD
-
- CoGroupPartition - Class in org.apache.spark.rdd
-
- CoGroupPartition(int, CoGroupSplitDep[]) - Constructor for class org.apache.spark.rdd.CoGroupPartition
-
- cogroupResult2ToJava(RDD<Tuple2<K, Tuple3<Iterable<V>, Iterable<W1>, Iterable<W2>>>>, ClassTag<K>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
- cogroupResult3ToJava(RDD<Tuple2<K, Tuple4<Iterable<V>, Iterable<W1>, Iterable<W2>, Iterable<W3>>>>, ClassTag<K>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
- cogroupResultToJava(RDD<Tuple2<K, Tuple2<Iterable<V>, Iterable<W>>>>, ClassTag<K>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
- CoGroupSplitDep - Interface in org.apache.spark.rdd
-
- col(String) - Method in class org.apache.spark.sql.DataFrame
-
Selects a column based on the column name and returns it as a Column.
- col(String) - Static method in class org.apache.spark.sql.functions
-
Returns a Column based on the given column name.
- collect() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an array that contains all of the elements in this RDD.
- collect() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- collect() - Method in class org.apache.spark.rdd.RDD
-
Return an array that contains all of the elements in this RDD.
- collect(PartialFunction<T, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD that contains all matching values by applying f.
- collect() - Method in class org.apache.spark.sql.DataFrame
-
Returns an array that contains all of the Rows in this DataFrame.
- collect() - Method in interface org.apache.spark.sql.RDDApi
-
- collectAsList() - Method in class org.apache.spark.sql.DataFrame
-
Returns a Java list that contains all of the Rows in this DataFrame.
- collectAsList() - Method in interface org.apache.spark.sql.RDDApi
-
- collectAsMap() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return the key-value pairs in this RDD to the master as a Map.
- collectAsMap() - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return the key-value pairs in this RDD to the master as a Map.
- collectAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
The asynchronous version of collect, which returns a future for retrieving an array containing all of the elements in this RDD.
- collectAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Returns a future for retrieving all elements of this RDD.
- collectEdges(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
-
Returns an RDD that contains for each vertex v its local edges,
i.e., the edges that are incident on v, in the user-specified direction.
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.BinaryColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.BooleanColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.ByteColumnStats
-
- collectedStatistics() - Method in interface org.apache.spark.sql.columnar.ColumnStats
-
Column statistics represented as a single row, currently including closed lower bound, closed
upper bound and null count.
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.DoubleColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.FloatColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.GenericColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.IntColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.LongColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.NoopColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.ShortColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.StringColumnStats
-
- collectedStatistics() - Method in class org.apache.spark.sql.columnar.TimestampColumnStats
-
- CollectionsUtils - Class in org.apache.spark.util
-
- CollectionsUtils() - Constructor for class org.apache.spark.util.CollectionsUtils
-
- collectNeighborIds(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
-
Collect the neighbor vertex ids for each vertex.
- collectNeighbors(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
-
Collect the neighbor vertex attributes for each vertex.
- collectPartitions(int[]) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an array that contains all of the elements in a specific partition of this RDD.
- collectPartitions() - Method in class org.apache.spark.rdd.RDD
-
A private method for tests, to look at the contents of each partition
- colPtrs() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- cols() - Method in class org.apache.spark.mllib.linalg.distributed.GridPartitioner
-
- colsPerBlock() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
- colsPerPart() - Method in class org.apache.spark.mllib.linalg.distributed.GridPartitioner
-
- colStats(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Computes column-wise summary statistics for the input RDD[Vector].
- Column - Class in org.apache.spark.sql
-
- Column(Expression) - Constructor for class org.apache.spark.sql.Column
-
- Column(String) - Constructor for class org.apache.spark.sql.Column
-
- column(String) - Static method in class org.apache.spark.sql.functions
-
Returns a Column based on the given column name.
- column() - Method in class org.apache.spark.sql.jdbc.JDBCPartitioningInfo
-
- COLUMN_BATCH_SIZE() - Static method in class org.apache.spark.sql.SQLConf
-
- COLUMN_NAME_OF_CORRUPT_RECORD() - Static method in class org.apache.spark.sql.SQLConf
-
- ColumnAccessor - Interface in org.apache.spark.sql.columnar
-
An Iterator-like trait used to extract values from a columnar byte buffer.
- columnBatchSize() - Method in class org.apache.spark.sql.SQLConf
-
The number of rows that will be
- ColumnBuilder - Interface in org.apache.spark.sql.columnar
-
- ColumnName - Class in org.apache.spark.sql
-
:: Experimental ::
A convenient class used for constructing schema.
- ColumnName(String) - Constructor for class org.apache.spark.sql.ColumnName
-
- columnNameOfCorruptRecord() - Method in class org.apache.spark.sql.SQLConf
-
- columnNames() - Method in class org.apache.spark.sql.parquet.ParquetRelation2.PartitionValues
-
- columnOrdinals() - Method in class org.apache.spark.sql.hive.MetastoreRelation
-
An attribute map for determining the ordinal for non-partition columns.
- columnPartition(JDBCPartitioningInfo) - Static method in class org.apache.spark.sql.jdbc.JDBCRelation
-
Given a partitioning schematic (a column of integral type, a number of
partitions, and upper and lower bounds on the column's value), generate
WHERE clauses for each partition so that each row in the table appears
exactly once.
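One way to generate such non-overlapping WHERE clauses is sketched below. It is a simplified illustration and not the JDBCRelation implementation: the class RangePartitionSketch and the method whereClauses are hypothetical names, the stride arithmetic is an assumption, and routing NULL values to the first partition is a design choice made here so that every row matches exactly one clause.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only: not the Spark JDBCRelation code.
final class RangePartitionSketch {

    /**
     * Builds one WHERE predicate per partition over an integral column.
     * The first clause is open below (and takes NULLs), the last is open above,
     * so values outside [lower, upper] are still covered exactly once.
     */
    static List<String> whereClauses(String column, long lower, long upper, int numPartitions) {
        if (numPartitions < 1) {
            throw new IllegalArgumentException("numPartitions must be >= 1");
        }
        if (lower > upper) {
            throw new IllegalArgumentException("lower must be <= upper");
        }
        if (numPartitions == 1) {
            return Collections.singletonList("1 = 1"); // a single partition reads everything
        }
        long stride = (upper - lower) / numPartitions; // assumes upper - lower does not overflow
        List<String> clauses = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            long start = lower + i * stride;
            long end = start + stride;
            if (i == 0) {
                clauses.add(column + " < " + end + " OR " + column + " IS NULL");
            } else if (i == numPartitions - 1) {
                clauses.add(column + " >= " + start);
            } else {
                clauses.add(column + " >= " + start + " AND " + column + " < " + end);
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        whereClauses("id", 0, 100, 4).forEach(System.out::println);
    }
}
```

Because adjacent clauses share a boundary with `<` on one side and `>=` on the other, no value can satisfy two clauses.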
- columnPruningPred() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
-
- columns() - Method in class org.apache.spark.sql.DataFrame
-
Returns all column names as an array.
- columnSimilarities() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Compute all cosine similarities between columns of this matrix using the brute-force
approach of computing normalized dot products.
- columnSimilarities(double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Compute similarities between columns of this matrix using a sampling approach.
- columnSimilaritiesDIMSUM(double[], double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Find all similar columns using the DIMSUM sampling algorithm, described in two papers
- ColumnStatisticsSchema - Class in org.apache.spark.sql.columnar
-
- ColumnStatisticsSchema(Attribute) - Constructor for class org.apache.spark.sql.columnar.ColumnStatisticsSchema
-
- columnStats() - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
-
- columnStats() - Method in interface org.apache.spark.sql.columnar.ColumnBuilder
-
Column statistics information
- ColumnStats - Interface in org.apache.spark.sql.columnar
-
Used to collect statistical information when building in-memory columns.
- columnStats() - Method in class org.apache.spark.sql.columnar.NativeColumnBuilder
-
- columnType() - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
-
- ColumnType<T extends org.apache.spark.sql.types.DataType,JvmType> - Class in org.apache.spark.sql.columnar
-
An abstract class that represents the type of a column.
- ColumnType(int, int) - Constructor for class org.apache.spark.sql.columnar.ColumnType
-
- columnType() - Method in class org.apache.spark.sql.columnar.NativeColumnBuilder
-
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Generic function to combine the elements for each key using a custom set of aggregation
functions.
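The three function arguments in the signature above play distinct roles, which a small in-memory sketch may make clearer. The role names createCombiner, mergeValue and mergeCombiners are labels assumed for this example, and the class CombineByKeySketch with its helpers is hypothetical; the nested lists stand in for partitions and nothing here is Spark code.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical illustration only: lists stand in for RDD partitions.
final class CombineByKeySketch {

    static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new AbstractMap.SimpleImmutableEntry<>(key, value);
    }

    /** Combines values inside one partition: first value creates C, later ones are merged in. */
    static <K, V, C> Map<K, C> combinePartition(
            List<Map.Entry<K, V>> partition,
            Function<V, C> createCombiner,
            BiFunction<C, V, C> mergeValue) {
        Map<K, C> acc = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : partition) {
            if (acc.containsKey(e.getKey())) {
                acc.put(e.getKey(), mergeValue.apply(acc.get(e.getKey()), e.getValue()));
            } else {
                acc.put(e.getKey(), createCombiner.apply(e.getValue()));
            }
        }
        return acc;
    }

    /** Merges the per-partition combiners of each key into one result per key. */
    static <K, V, C> Map<K, C> combineByKey(
            List<List<Map.Entry<K, V>>> partitions,
            Function<V, C> createCombiner,
            BiFunction<C, V, C> mergeValue,
            BiFunction<C, C, C> mergeCombiners) {
        Map<K, C> result = new LinkedHashMap<>();
        for (List<Map.Entry<K, V>> partition : partitions) {
            Map<K, C> partial = combinePartition(partition, createCombiner, mergeValue);
            for (Map.Entry<K, C> e : partial.entrySet()) {
                result.merge(e.getKey(), e.getValue(), mergeCombiners);
            }
        }
        return result;
    }

    /** Per-key sums over two example partitions. */
    static Map<String, Integer> demo() {
        List<Map.Entry<String, Integer>> p0 = Arrays.asList(pair("a", 1), pair("b", 2));
        List<Map.Entry<String, Integer>> p1 = Arrays.asList(pair("a", 3));
        List<List<Map.Entry<String, Integer>>> partitions = Arrays.asList(p0, p1);
        return combineByKey(partitions, v -> v, (c, v) -> c + v, Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The combiner type C may differ from the value type V, for example a (sum, count) pair when computing averages; the sum example keeps both as Integer for brevity.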
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Simplified version of combineByKey that hash-partitions the output RDD.
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Simplified version of combineByKey that hash-partitions the resulting RDD using the existing
partitioner/parallelism level.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Generic function to combine the elements for each key using a custom set of aggregation
functions.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Simplified version of combineByKey that hash-partitions the output RDD.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Combine elements of each key in DStream's RDDs using custom functions.
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Combine elements of each key in DStream's RDDs using custom functions.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, ClassTag<C>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Combine elements of each key in DStream's RDDs using custom functions.
- combineCombinersByKey(Iterator<Product2<K, C>>) - Method in class org.apache.spark.Aggregator
-
- combineCombinersByKey(Iterator<Product2<K, C>>, TaskContext) - Method in class org.apache.spark.Aggregator
-
- combineValuesByKey(Iterator<Product2<K, V>>) - Method in class org.apache.spark.Aggregator
-
- combineValuesByKey(Iterator<Product2<K, V>>, TaskContext) - Method in class org.apache.spark.Aggregator
-
- combiningStrategy() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$.Metadata
-
- command() - Method in class org.apache.spark.sql.UserDefinedPythonFunction
-
- commit() - Method in class org.apache.spark.SparkHadoopWriter
-
- commitAndClose() - Method in class org.apache.spark.storage.BlockObjectWriter
-
Flush the partial writes and commit them as a single atomic block.
- commitAndClose() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
-
- commitJob() - Method in class org.apache.spark.SparkHadoopWriter
-
- commitJob() - Method in class org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer
-
- commitJob() - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
-
- commonHeaderNodes() - Static method in class org.apache.spark.ui.UIUtils
-
- comparator(Schedulable, Schedulable) - Method in class org.apache.spark.scheduler.FairSchedulingAlgorithm
-
- comparator(Schedulable, Schedulable) - Method in class org.apache.spark.scheduler.FIFOSchedulingAlgorithm
-
- comparator(Schedulable, Schedulable) - Method in interface org.apache.spark.scheduler.SchedulingAlgorithm
-
- compare(PartitionGroup, PartitionGroup) - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- compare(Option<PartitionGroup>, Option<PartitionGroup>) - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- compare(RDDInfo) - Method in class org.apache.spark.storage.RDDInfo
-
- compatibilityBlackList() - Static method in class org.apache.spark.sql.hive.HiveShim
-
- compatibleType(DataType, DataType) - Static method in class org.apache.spark.sql.json.JsonRDD
-
Returns the most general data type for two given data types.
- completedIndices() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
-
- completedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- completedStageIndices() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
-
- completedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- completedTasks() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
-
- completion() - Method in class org.apache.spark.util.CompletionIterator
-
- CompletionEvent - Class in org.apache.spark.scheduler
-
- CompletionEvent(Task<?>, TaskEndReason, Object, Map<Object, Object>, TaskInfo, TaskMetrics) - Constructor for class org.apache.spark.scheduler.CompletionEvent
-
- CompletionIterator<A,I extends scala.collection.Iterator<A>> - Class in org.apache.spark.util
-
Wrapper around an iterator which calls a completion method after it successfully iterates
through all the elements.
- CompletionIterator(I) - Constructor for class org.apache.spark.util.CompletionIterator
-
- completionTime() - Method in class org.apache.spark.scheduler.StageInfo
-
Time when all tasks in the stage completed or when the stage was cancelled.
- completionTime() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
-
- ComplexColumnBuilder<T extends org.apache.spark.sql.types.DataType,JvmType> - Class in org.apache.spark.sql.columnar
-
- ComplexColumnBuilder(ColumnStats, ColumnType<T, JvmType>) - Constructor for class org.apache.spark.sql.columnar.ComplexColumnBuilder
-
- ComplexFutureAction<T> - Class in org.apache.spark
-
A FutureAction for actions that could trigger multiple Spark jobs.
- ComplexFutureAction() - Constructor for class org.apache.spark.ComplexFutureAction
-
- compress() - Method in class org.apache.spark.ml.recommendation.ALS.UncompressedInBlock
-
- compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.BooleanBitSet.Encoder
-
- compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
-
- compress(ByteBuffer, ByteBuffer) - Method in interface org.apache.spark.sql.columnar.compression.Encoder
-
- compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.IntDelta.Encoder
-
- compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.LongDelta.Encoder
-
- compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.PassThrough.Encoder
-
- compress(ByteBuffer, ByteBuffer) - Method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Encoder
-
- COMPRESS_CACHED() - Static method in class org.apache.spark.sql.SQLConf
-
- compressCodec() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- compressed() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- compressedInputStream(InputStream) - Method in interface org.apache.spark.io.CompressionCodec
-
- compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
-
- compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
-
- compressedInputStream(InputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
-
- CompressedMapStatus - Class in org.apache.spark.scheduler
-
A MapStatus implementation that tracks the size of each block.
- CompressedMapStatus(BlockManagerId, byte[]) - Constructor for class org.apache.spark.scheduler.CompressedMapStatus
-
- CompressedMapStatus(BlockManagerId, long[]) - Constructor for class org.apache.spark.scheduler.CompressedMapStatus
-
- compressedOutputStream(OutputStream) - Method in interface org.apache.spark.io.CompressionCodec
-
- compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
-
- compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
-
- compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
-
- compressedSize() - Method in class org.apache.spark.sql.columnar.compression.BooleanBitSet.Encoder
-
- compressedSize() - Method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
-
- compressedSize() - Method in interface org.apache.spark.sql.columnar.compression.Encoder
-
- compressedSize() - Method in class org.apache.spark.sql.columnar.compression.IntDelta.Encoder
-
- compressedSize() - Method in class org.apache.spark.sql.columnar.compression.LongDelta.Encoder
-
- compressedSize() - Method in class org.apache.spark.sql.columnar.compression.PassThrough.Encoder
-
- compressedSize() - Method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Encoder
-
- CompressibleColumnAccessor<T extends org.apache.spark.sql.types.NativeType> - Interface in org.apache.spark.sql.columnar.compression
-
- CompressibleColumnBuilder<T extends org.apache.spark.sql.types.NativeType> - Interface in org.apache.spark.sql.columnar.compression
-
A stackable trait that builds an optionally compressed byte buffer for a column.
- COMPRESSION_CODEC_KEY() - Static method in class org.apache.spark.scheduler.EventLoggingListener
-
- CompressionCodec - Interface in org.apache.spark.io
-
:: DeveloperApi ::
CompressionCodec allows the customization of choosing different compression implementations
to be used in block storage.
- compressionCodec() - Method in class org.apache.spark.streaming.CheckpointWriter
-
- compressionEncoders() - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
-
- compressionRatio() - Method in interface org.apache.spark.sql.columnar.compression.Encoder
-
- CompressionScheme - Interface in org.apache.spark.sql.columnar.compression
-
- compressType() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.EdgeRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.VertexRDD
-
Provides the RDD[(VertexId, VD)] equivalent output.
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
-
Compute the gradient and loss given the features of a single data point.
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
-
Compute the gradient and loss given the features of a single data point,
add the gradient to a provided vector to avoid creating new objects, and return loss.
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
-
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.L1Updater
-
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
-
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
-
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
-
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SimpleUpdater
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SquaredL2Updater
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.Updater
-
Compute an updated value for weights given the gradient, stepSize, iteration number and
regularization parameter.
- compute(Partition, TaskContext) - Method in class org.apache.spark.mllib.rdd.RandomRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.mllib.rdd.RandomVectorRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.mllib.rdd.SlidingRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.BlockRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CartesianRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CheckpointRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CoalescedRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.EmptyRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.HadoopRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.HadoopRDD.HadoopMapPartitionsWithSplitRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.JdbcRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.MapPartitionsRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.NewHadoopRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ParallelCollectionRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionPruningRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionwiseSampledRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PipedRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
-
:: DeveloperApi ::
Implemented by subclasses to compute a given partition.
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.SampledRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ShuffledRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.SubtractedRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.UnionRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedPartitionsRDD2
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ZippedWithIndexRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.sql.jdbc.JDBCRDD
-
Runs the SQL query against the JDBC driver.
- compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Generate an RDD for the given duration
- compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Method that generates an RDD for the given Duration
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.DStream
-
Method that generates an RDD for the given time
- compute(Time) - Method in class org.apache.spark.streaming.dstream.FileInputDStream
-
Finds the files that were modified since the last time this method was called and makes
a union RDD out of them.
- compute(Time) - Method in class org.apache.spark.streaming.dstream.FilteredDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.FlatMappedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.FlatMapValuedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ForEachDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.GlommedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.MapPartitionedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.MappedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.MapValuedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.QueueInputDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
Generates RDDs with blocks received by the receiver of this stream.
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ShuffledDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.StateDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.TransformedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.UnionDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.WindowedDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.kafka.DirectKafkaInputDStream
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.streaming.kafka.KafkaRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDD
-
Gets the partition data by getting the corresponding block from the block manager.
- computeColumnSummaryStatistics() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes column-wise summary statistics.
- computeCorrelation(RDD<Object>, RDD<Object>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation
-
Compute correlation for two datasets.
- computeCorrelation(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
-
Compute the Pearson correlation for two datasets.
- computeCorrelation(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
-
Compute Spearman's correlation for two datasets.
- computeCorrelationMatrix(RDD<Vector>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation
-
Compute the correlation matrix S, for the input matrix, where S(i, j) is the correlation
between column i and j.
- computeCorrelationMatrix(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
-
Compute the Pearson correlation matrix S, for the input matrix, where S(i, j) is the
correlation between column i and j.
- computeCorrelationMatrix(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
-
Compute Spearman's correlation matrix S, for the input matrix, where S(i, j) is the
correlation between column i and j.
- computeCorrelationMatrixFromCovariance(Matrix) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
-
Compute the Pearson correlation matrix from the covariance matrix.
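The conversion from covariance to Pearson correlation follows the standard identity corr(i, j) = cov(i, j) / (sd(i) * sd(j)), where sd(i) is the square root of cov(i, i). A minimal sketch on a dense array, assuming a square symmetric input; the class name CorrelationFromCovarianceSketch is hypothetical, and returning NaN for zero-variance columns is a choice made for this example rather than documented Spark behaviour.

```java
// Hypothetical illustration only: dense arrays instead of an MLlib Matrix.
final class CorrelationFromCovarianceSketch {

    /** Converts a square covariance matrix into a Pearson correlation matrix. */
    static double[][] toCorrelation(double[][] cov) {
        int n = cov.length;
        double[] sd = new double[n];
        for (int i = 0; i < n; i++) {
            sd[i] = Math.sqrt(cov[i][i]); // standard deviation of column i
        }
        double[][] corr = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                if (i == j) {
                    corr[i][j] = 1.0;
                } else {
                    double denom = sd[i] * sd[j];
                    // A zero-variance column has no defined correlation.
                    corr[i][j] = denom == 0.0 ? Double.NaN : cov[i][j] / denom;
                }
            }
        }
        return corr;
    }

    public static void main(String[] args) {
        double[][] corr = toCorrelation(new double[][] {{4.0, 2.0}, {2.0, 9.0}});
        System.out.println(corr[0][1]); // 2 / (2 * 3)
    }
}
```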
- computeCorrelationWithMatrixImpl(RDD<Object>, RDD<Object>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation
-
Combine the two input RDD[Double]s into an RDD[Vector] and compute the correlation using the
correlation implementation for RDD[Vector].
- computeCost(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
Return the K-means cost (sum of squared distances of points to their nearest center) for this
model on the given data.
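The cost defined above (sum of squared distances of points to their nearest center) can be computed directly on small arrays. The sketch below is a local, non-distributed illustration; KMeansCostSketch and its methods are hypothetical names, and points and centers are assumed to have the same dimension.

```java
// Hypothetical illustration only: plain arrays instead of RDD<Vector>.
final class KMeansCostSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Sum over all points of the squared distance to the nearest center. */
    static double cost(double[][] points, double[][] centers) {
        double total = 0.0;
        for (double[] p : points) {
            double best = Double.POSITIVE_INFINITY;
            for (double[] c : centers) {
                best = Math.min(best, squaredDistance(p, c));
            }
            total += best;
        }
        return total;
    }

    public static void main(String[] args) {
        double[][] centers = {{0, 0}, {10, 10}};
        double[][] points = {{1, 0}, {0, 2}, {10, 11}};
        System.out.println(cost(points, centers)); // 1 + 4 + 1
    }
}
```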
- computeCovariance() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes the covariance matrix, treating each row as an observation.
- computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.tree.loss.AbsoluteError
-
Method to calculate loss of the base learner for the gradient boosting calculation.
- computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.tree.loss.LogLoss
-
Method to calculate loss of the base learner for the gradient boosting calculation.
- computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Method in interface org.apache.spark.mllib.tree.loss.Loss
-
Method to calculate error of the base learner for the gradient boosting calculation.
- computeError(TreeEnsembleModel, RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.tree.loss.SquaredError
-
Method to calculate loss of the base learner for the gradient boosting calculation.
- computeFractionForSampleSize(int, long, boolean) - Static method in class org.apache.spark.util.random.SamplingUtils
-
Returns a sampling rate that guarantees a sample of size >= sampleSizeLowerBound 99.99% of
the time.
- computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Computes the Gramian matrix A^T A.
- computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes the Gramian matrix A^T A.
- computeOrReadCheckpoint(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
-
Compute an RDD partition or read it from a checkpoint if the RDD is checkpointing.
- computePreferredLocations(Seq<InputFormatInfo>) - Static method in class org.apache.spark.scheduler.InputFormatInfo
-
Computes the preferred locations based on the input(s) and returns a location-to-block map.
- computePrincipalComponents(int) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes the top k principal components.
- computeSplitSize(long, long, long) - Method in class org.apache.spark.input.FixedLengthBinaryInputFormat
-
This input format overrides computeSplitSize() to make sure that each split
only contains full records.
- computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Computes the singular value decomposition of this IndexedRowMatrix.
- computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes singular value decomposition of this matrix.
- computeSVD(int, boolean, double, int, double, String) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
The actual SVD implementation, visible for testing.
- computeThresholdByKey(Map<K, AcceptanceResult>, Map<K, Object>) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
-
Given the result returned by getCounts, determine the threshold for accepting items to
generate exact sample size.
- conf() - Method in interface org.apache.spark.input.Configurable
-
- conf() - Method in class org.apache.spark.rdd.RDD
-
- conf() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- conf() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- conf() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- conf() - Method in class org.apache.spark.SparkContext
-
- conf() - Method in class org.apache.spark.SparkEnv
-
- conf() - Method in class org.apache.spark.sql.parquet.ParquetRelation
-
- conf() - Method in class org.apache.spark.storage.BlockManager
-
- conf() - Method in class org.apache.spark.streaming.StreamingContext
-
- conf() - Method in class org.apache.spark.ui.SparkUI
-
- confidence() - Method in class org.apache.spark.partial.BoundedDouble
-
- config() - Method in class org.apache.spark.streaming.kafka.KafkaCluster
-
- configFile() - Method in class org.apache.spark.metrics.MetricsConfig
-
- configTestLog4j(String) - Static method in class org.apache.spark.util.Utils
-
Configures the log4j properties used for the test suite.
- Configurable - Interface in org.apache.spark.input
-
A trait to implement the Configurable interface.
- ConfigurableCombineFileRecordReader<K,V> - Class in org.apache.spark.input
-
A CombineFileRecordReader that can pass Hadoop Configuration to Configurable RecordReaders.
- ConfigurableCombineFileRecordReader(InputSplit, TaskAttemptContext, Class<? extends RecordReader<K, V>>) - Constructor for class org.apache.spark.input.ConfigurableCombineFileRecordReader
-
- configuration() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- configuration() - Method in interface org.apache.spark.sql.parquet.ParquetTest
-
- CONFIGURATION_INSTANTIATION_LOCK() - Static method in class org.apache.spark.rdd.HadoopRDD
-
Configuration's constructor is not threadsafe (see SPARK-1097 and HADOOP-10456).
- confusionMatrix() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns confusion matrix:
predicted classes are in columns,
they are ordered by class label ascending,
as in "labels"
- connect(String, int) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
-
- connected(String) - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- connectedComponents() - Method in class org.apache.spark.graphx.GraphOps
-
Compute the connected component membership of each vertex and return a graph with the vertex
value containing the lowest vertex id in the connected component containing that vertex.
- ConnectedComponents - Class in org.apache.spark.graphx.lib
-
Connected components algorithm.
- ConnectedComponents() - Constructor for class org.apache.spark.graphx.lib.ConnectedComponents
-
- connectLeader(String, int) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
-
- CONSOLE_DEFAULT_PERIOD() - Method in class org.apache.spark.metrics.sink.ConsoleSink
-
- CONSOLE_DEFAULT_UNIT() - Method in class org.apache.spark.metrics.sink.ConsoleSink
-
- CONSOLE_KEY_PERIOD() - Method in class org.apache.spark.metrics.sink.ConsoleSink
-
- CONSOLE_KEY_UNIT() - Method in class org.apache.spark.metrics.sink.ConsoleSink
-
- ConsoleProgressBar - Class in org.apache.spark.ui
-
ConsoleProgressBar shows the progress of stages in the next line of the console.
- ConsoleProgressBar(SparkContext) - Constructor for class org.apache.spark.ui.ConsoleProgressBar
-
- ConsoleSink - Class in org.apache.spark.metrics.sink
-
- ConsoleSink(Properties, MetricRegistry, SecurityManager) - Constructor for class org.apache.spark.metrics.sink.ConsoleSink
-
- ConstantInputDStream<T> - Class in org.apache.spark.streaming.dstream
-
An input stream that always returns the same RDD on each timestep.
- ConstantInputDStream(StreamingContext, RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ConstantInputDStream
-
- constructTree(org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0.NodeData[]) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
-
Given a list of nodes from a tree, construct the tree.
- constructTrees(RDD<org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0.NodeData>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
-
- constructURIForAuthentication(URI, SecurityManager) - Static method in class org.apache.spark.util.Utils
-
Construct a URI containing information used for authentication.
- consumerConnector() - Method in class org.apache.spark.streaming.kafka.KafkaReceiver
-
- contains(Param<?>) - Method in class org.apache.spark.ml.param.ParamMap
-
Checks whether a parameter is explicitly specified.
- contains(String) - Method in class org.apache.spark.SparkConf
-
Does the configuration contain a given parameter?
- contains(Object) - Method in class org.apache.spark.sql.Column
-
Contains the other element.
- contains(BlockId) - Method in class org.apache.spark.storage.BlockManagerMaster
-
Check if block manager master has a block.
- contains(BlockId) - Method in class org.apache.spark.storage.BlockStore
-
- contains(BlockId) - Method in class org.apache.spark.storage.DiskStore
-
- contains(BlockId) - Method in class org.apache.spark.storage.MemoryStore
-
- contains(BlockId) - Method in class org.apache.spark.storage.TachyonStore
-
- contains(A) - Method in class org.apache.spark.util.TimeStampedHashSet
-
- containsBlock(BlockId) - Method in class org.apache.spark.storage.DiskBlockManager
-
Check if disk block manager has a block.
- containsBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus
-
Return whether the given block is stored in this block manager in O(1) time.
- containsCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD
-
- containsShuffle(int) - Method in class org.apache.spark.MapOutputTrackerMaster
-
Check if the given shuffle is being tracked
- contentType() - Method in class org.apache.spark.ui.JettyUtils.ServletParams
-
- context() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- context() - Method in class org.apache.spark.InterruptibleIterator
-
- context() - Method in class org.apache.spark.rdd.RDD
-
- context() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
-
- context() - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveCommandStrategy
-
- context() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
- context() - Method in class org.apache.spark.streaming.dstream.DStream
-
Return the StreamingContext associated with this DStream
- ContextCleaner - Class in org.apache.spark
-
An asynchronous cleaner for RDD, shuffle, and broadcast state.
- ContextCleaner(SparkContext) - Constructor for class org.apache.spark.ContextCleaner
-
- ContextWaiter - Class in org.apache.spark.streaming
-
- ContextWaiter() - Constructor for class org.apache.spark.streaming.ContextWaiter
-
- Continuous() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
-
- convert() - Method in class org.apache.spark.WritableConverter
-
- convert() - Method in class org.apache.spark.WritableFactory
-
- convertFromAttributes(Seq<Attribute>, boolean) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
-
- convertFromString(String) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
-
- convertFromTimestamp(Timestamp) - Static method in class org.apache.spark.sql.parquet.CatalystTimestampConverter
-
- convertSplitLocationInfo(Object[]) - Static method in class org.apache.spark.rdd.HadoopRDD
-
- convertToAttributes(Type, boolean, boolean) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
-
- convertToBaggedRDD(RDD<Datum>, double, int, boolean, int) - Static method in class org.apache.spark.mllib.tree.impl.BaggedPoint
-
Convert an input dataset into its BaggedPoint representation,
choosing subsamplingRate counts for each instance.
- convertToCanonicalEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.GraphOps
-
Convert bi-directional edges into uni-directional ones.
- convertToString(Seq<Attribute>) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
-
- convertToTimestamp(Binary) - Static method in class org.apache.spark.sql.parquet.CatalystTimestampConverter
-
- convertToTreeRDD(RDD<LabeledPoint>, Bin[][], DecisionTreeMetadata) - Static method in class org.apache.spark.mllib.tree.impl.TreePoint
-
Convert an input dataset into its TreePoint representation,
binning feature values in preparation for DecisionTree training.
- CoordinateMatrix - Class in org.apache.spark.mllib.linalg.distributed
-
:: Experimental ::
Represents a matrix in coordinate format.
- CoordinateMatrix(RDD<MatrixEntry>, long, long) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
- CoordinateMatrix(RDD<MatrixEntry>) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
Alternative constructor leaving matrix dimensions to be determined automatically.
- coordinatorActor() - Method in class org.apache.spark.scheduler.OutputCommitCoordinator
-
- copiesRunning() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- copy() - Method in class org.apache.spark.ml.param.ParamMap
-
Make a copy of this param map.
- copy(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
-
y = x
- copy() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- copy() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- copy() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Get a deep copy of the matrix.
- copy() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- copy() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- copy() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Makes a deep copy of this vector.
- copy() - Method in class org.apache.spark.mllib.random.ExponentialGenerator
-
- copy() - Method in class org.apache.spark.mllib.random.GammaGenerator
-
- copy() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
-
- copy() - Method in class org.apache.spark.mllib.random.PoissonGenerator
-
- copy() - Method in interface org.apache.spark.mllib.random.RandomDataGenerator
-
Returns a copy of the RandomDataGenerator with a new instance of the rng object used in the
class when applicable for non-locking concurrent usage.
- copy() - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
-
- copy() - Method in class org.apache.spark.mllib.random.UniformGenerator
-
- copy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
Returns a shallow copy of this instance.
- copy() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
-
- copy() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
-
- copy() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
-
- copy() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
-
- copy() - Method in class org.apache.spark.util.StatCounter
-
Clone this StatCounter
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
-
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.BYTE
-
- copyField(Row, int, MutableRow, int) - Method in class org.apache.spark.sql.columnar.ColumnType
-
Copies from(fromOrdinal) to to(toOrdinal).
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.DOUBLE
-
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.FLOAT
-
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.INT
-
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.LONG
-
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.SHORT
-
- copyField(Row, int, MutableRow, int) - Static method in class org.apache.spark.sql.columnar.STRING
-
- copyStream(InputStream, OutputStream, boolean, boolean) - Static method in class org.apache.spark.util.Utils
-
Copy all data from an InputStream to an OutputStream.
- cores() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
-
- cores() - Method in class org.apache.spark.scheduler.WorkerOffer
-
- coresByTaskId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
-
- corr(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Compute the Pearson correlation matrix for the input RDD of Vectors.
- corr(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Compute the correlation matrix for the input RDD of Vectors using the specified method.
- corr(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Compute the Pearson correlation for the input RDDs.
- corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Compute the correlation for the input RDDs using the specified method.
- Correlation - Interface in org.apache.spark.mllib.stat.correlation
-
Trait for correlation algorithms.
- CorrelationNames - Class in org.apache.spark.mllib.stat.correlation
-
Maintains supported and default correlation names.
- CorrelationNames() - Constructor for class org.apache.spark.mllib.stat.correlation.CorrelationNames
-
- Correlations - Class in org.apache.spark.mllib.stat.correlation
-
Delegates computation to the specific correlation object based on the input method name.
- Correlations() - Constructor for class org.apache.spark.mllib.stat.correlation.Correlations
-
- corrMatrix(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
-
- count() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the number of elements in the RDD.
- count() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
The number of edges in the RDD.
- count() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
The number of vertices in the RDD.
- count() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
-
- count() - Method in class org.apache.spark.mllib.fpm.FPTree.Node
-
- count() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
- count() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
-
Sample size.
- count() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
-
Number of data points accounted for in the sufficient statistics.
- count() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
-
Number of data points accounted for in the sufficient statistics.
- count() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
-
Number of data points accounted for in the sufficient statistics.
- count() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
-
Number of data points accounted for in the sufficient statistics.
- count() - Method in class org.apache.spark.rdd.RDD
-
Return the number of elements in the RDD.
- count() - Method in class org.apache.spark.sql.columnar.ColumnStatisticsSchema
-
- count() - Method in interface org.apache.spark.sql.columnar.ColumnStats
-
- count() - Method in class org.apache.spark.sql.DataFrame
-
- count(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the number of items in a group.
- count(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the number of items in a group.
- count() - Method in class org.apache.spark.sql.GroupedData
-
Count the number of rows for each group.
- COUNT() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- count() - Method in interface org.apache.spark.sql.RDDApi
-
- count() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by counting each RDD
of this DStream.
- count() - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by counting each RDD
of this DStream.
- count() - Method in class org.apache.spark.util.StatCounter
-
- countApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
:: Experimental ::
Approximate version of count() that returns a potentially incomplete result
within a timeout, even if not all tasks have finished.
- countApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
:: Experimental ::
Approximate version of count() that returns a potentially incomplete result
within a timeout, even if not all tasks have finished.
- countApprox(long, double) - Method in class org.apache.spark.rdd.RDD
-
:: Experimental ::
Approximate version of count() that returns a potentially incomplete result
within a timeout, even if not all tasks have finished.
- countApproxDistinct(double) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return approximate number of distinct elements in the RDD.
- countApproxDistinct(int, int) - Method in class org.apache.spark.rdd.RDD
-
:: Experimental ::
Return approximate number of distinct elements in the RDD.
- countApproxDistinct(double) - Method in class org.apache.spark.rdd.RDD
-
Return approximate number of distinct elements in the RDD.
- countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(int, int, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
:: Experimental ::
- countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return approximate number of distinct values for each key in this RDD.
- countAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
The asynchronous version of count, which returns a
future for counting the number of elements in this RDD.
- countAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Returns a future for counting the number of elements in the RDD.
- countByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Count the number of elements for each key, and return the result to the master as a Map.
- countByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Count the number of elements for each key, collecting the results to a local Map.
- countByKeyApprox(long) - Method in class org.apache.spark.api.java.JavaPairRDD
-
:: Experimental ::
Approximate version of countByKey that can return a partial result if it does
not finish within a timeout.
- countByKeyApprox(long, double) - Method in class org.apache.spark.api.java.JavaPairRDD
-
:: Experimental ::
Approximate version of countByKey that can return a partial result if it does
not finish within a timeout.
- countByKeyApprox(long, double) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
:: Experimental ::
Approximate version of countByKey that can return a partial result if it does
not finish within a timeout.
- countByValue() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the count of each unique value in this RDD as a map of (value, count) pairs.
- countByValue(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return the count of each unique value in this RDD as a local map of (value, count) pairs.
- countByValue() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the counts of each distinct value in
each RDD of this DStream.
- countByValue(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the counts of each distinct value in
each RDD of this DStream.
- countByValue(int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD contains the counts of each distinct value in
each RDD of this DStream.
- countByValueAndWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the count of distinct elements in
RDDs in a sliding window over this DStream.
- countByValueAndWindow(Duration, Duration, int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the count of distinct elements in
RDDs in a sliding window over this DStream.
- countByValueAndWindow(Duration, Duration, int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD contains the count of distinct elements in
RDDs in a sliding window over this DStream.
- countByValueApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
(Experimental) Approximate version of countByValue().
- countByValueApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
(Experimental) Approximate version of countByValue().
- countByValueApprox(long, double, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
:: Experimental ::
Approximate version of countByValue().
- countByWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by counting the number
of elements in a window over this DStream.
- countByWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by counting the number
of elements in a sliding window over this DStream.
- countDistinct(Column, Column...) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the number of distinct items in a group.
- countDistinct(String, String...) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the number of distinct items in a group.
- countDistinct(Column, Seq<Column>) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the number of distinct items in a group.
- countDistinct(String, Seq<String>) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the number of distinct items in a group.
- counter() - Method in class org.apache.spark.partial.MeanEvaluator
-
- counter() - Method in class org.apache.spark.partial.SumEvaluator
-
- CountEvaluator - Class in org.apache.spark.partial
-
An ApproximateEvaluator for counts.
- CountEvaluator(int, double) - Constructor for class org.apache.spark.partial.CountEvaluator
-
- cpFile() - Method in class org.apache.spark.rdd.RDDCheckpointData
-
- cpRDD() - Method in class org.apache.spark.rdd.RDDCheckpointData
-
- cpState() - Method in class org.apache.spark.rdd.RDDCheckpointData
-
- CPUS_PER_TASK() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- CR() - Method in class org.apache.spark.ui.ConsoleProgressBar
-
- CreatableRelationProvider - Interface in org.apache.spark.sql.sources
-
- create(boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels
-
Deprecated.
- create(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels
-
Create a new StorageLevel object.
- create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int, Function<ResultSet, T>) - Static method in class org.apache.spark.rdd.JdbcRDD
-
Create an RDD that executes an SQL query on a JDBC connection and reads results.
- create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int) - Static method in class org.apache.spark.rdd.JdbcRDD
-
Create an RDD that executes an SQL query on a JDBC connection and reads results.
- create(RDD<T>, Function1<Object, Object>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
-
Create a PartitionPruningRDD.
- create(String, LogicalPlan, Configuration, SQLContext) - Static method in class org.apache.spark.sql.parquet.ParquetRelation
-
Creates a new ParquetRelation and underlying Parquet file for the given LogicalPlan.
- create() - Method in interface org.apache.spark.streaming.api.java.JavaStreamingContextFactory
-
- create(String, int) - Static method in class org.apache.spark.streaming.kafka.Broker
-
- create(String, int, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
-
- create(TopicAndPartition, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
-
- createActorSystem(String, String, int, SparkConf, SecurityManager) - Static method in class org.apache.spark.util.AkkaUtils
-
Creates an ActorSystem ready for remoting, with various Spark features.
- createAkkaConfig() - Method in class org.apache.spark.SSLOptions
-
Creates an Akka configuration object which contains all the SSL settings represented by this
object.
- createCombiner() - Method in class org.apache.spark.Aggregator
-
- createCommand(Protos.Offer, int) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- createCompiledClass(String, File, String, String, Seq<URL>) - Static method in class org.apache.spark.TestUtils
-
Creates a compiled class with the given name.
- createDataFrame(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
-
:: Experimental ::
Creates a DataFrame from an RDD of case classes.
- createDataFrame(Seq<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
-
:: Experimental ::
Creates a DataFrame from a local Seq of Product.
- createDataFrame(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
-
:: DeveloperApi ::
Creates a DataFrame from an RDD containing Rows using the given schema.
- createDataFrame(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
-
:: DeveloperApi ::
Creates a DataFrame from a JavaRDD containing Rows using the given schema.
- createDataFrame(JavaRDD<Row>, List<String>) - Method in class org.apache.spark.sql.SQLContext
-
Creates a DataFrame from a JavaRDD containing Rows by applying
a seq of names of columns to this RDD; the data type for each column will
be inferred from the first row.
- createDataFrame(RDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
-
Applies a schema to an RDD of Java Beans.
- createDataFrame(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
-
Applies a schema to an RDD of Java Beans.
- createDataSourceTable(String, Option<StructType>, String, Map<String, String>, boolean) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
-
Creates a data source table (a table created with USING clause) in Hive's metastore.
- createDecimal(BigDecimal) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- createDefaultDBIfNeeded(HiveContext) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- createDirectory(String, String) - Static method in class org.apache.spark.util.Utils
-
Create a directory inside the given parent directory.
- createDirectStream(StreamingContext, Map<String, String>, Map<TopicAndPartition, Object>, Function1<MessageAndMetadata<K, V>, R>, ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>, ClassTag<R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
:: Experimental ::
Create an input stream that directly pulls messages from Kafka Brokers
without using any receiver.
- createDirectStream(StreamingContext, Map<String, String>, Set<String>, ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
:: Experimental ::
Create an input stream that directly pulls messages from Kafka Brokers
without using any receiver.
- createDirectStream(JavaStreamingContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Class<R>, Map<String, String>, Map<TopicAndPartition, Long>, Function<MessageAndMetadata<K, V>, R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
:: Experimental ::
Create an input stream that directly pulls messages from Kafka Brokers
without using any receiver.
- createDirectStream(JavaStreamingContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Map<String, String>, Set<String>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
:: Experimental ::
Create an input stream that directly pulls messages from Kafka Brokers
without using any receiver.
- createDriverEnv(SparkConf, boolean, LiveListenerBus, Option<OutputCommitCoordinator>) - Static method in class org.apache.spark.SparkEnv
-
Create a SparkEnv for the driver.
- createDriverResultsArray() - Static method in class org.apache.spark.sql.hive.HiveShim
-
- createEmpty(String, Seq<Attribute>, boolean, Configuration, SQLContext) - Static method in class org.apache.spark.sql.parquet.ParquetRelation
-
Creates an empty ParquetRelation and underlying Parquet file that
consists only of the metadata for the given schema.
- createExecutorEnv(SparkConf, String, String, int, int, boolean) - Static method in class org.apache.spark.SparkEnv
-
Create a SparkEnv for an executor.
- createExecutorInfo(String) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- createExternalTable(String, String) - Method in class org.apache.spark.sql.SQLContext
-
:: Experimental ::
Creates an external table from the given path and returns the corresponding DataFrame.
- createExternalTable(String, String, String) - Method in class org.apache.spark.sql.SQLContext
-
:: Experimental ::
Creates an external table from the given path based on a data source
and returns the corresponding DataFrame.
- createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
:: Experimental ::
Creates an external table from the given path based on a data source and a set of options.
- createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
:: Experimental ::
(Scala-specific)
Creates an external table from the given path based on a data source and a set of options.
- createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
:: Experimental ::
Create an external table from the given path based on a data source, a schema and
a set of options.
- createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
:: Experimental ::
(Scala-specific)
Create an external table from the given path based on a data source, a schema and
a set of options.
- createFilter(Expression) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
-
- createFunction() - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
-
- createHistoryUI(SparkConf, SparkListenerBus, SecurityManager, String, String) - Static method in class org.apache.spark.ui.SparkUI
-
- createJar(Seq<File>, File) - Static method in class org.apache.spark.TestUtils
-
Create a jar file that contains this set of files.
- createJarWithClasses(Seq<String>, String, Seq<Tuple2<String, String>>, Seq<URL>) - Static method in class org.apache.spark.TestUtils
-
Create a jar that defines classes with the given names.
- createJarWithFiles(Map<String, String>, File) - Static method in class org.apache.spark.TestUtils
-
Create a jar file containing multiple files.
- createJDBCTable(String, String, boolean) - Method in class org.apache.spark.sql.DataFrame
-
Save this RDD to a JDBC database at url under the table name table.
- createJettySslContextFactory() - Method in class org.apache.spark.SSLOptions
-
Creates a Jetty SSL context factory according to the SSL settings represented by this object.
- createJobID(Date, int) - Static method in class org.apache.spark.SparkHadoopWriter
-
- createLiveUI(SparkContext, SparkConf, SparkListenerBus, JobProgressListener, SecurityManager, String) - Static method in class org.apache.spark.ui.SparkUI
-
- createMesosTask(TaskDescription, String) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
Turn a Spark TaskDescription into a Mesos task
- CreateMetastoreDataSource - Class in org.apache.spark.sql.hive.execution
-
- CreateMetastoreDataSource(String, Option<StructType>, String, Map<String, String>, boolean, boolean) - Constructor for class org.apache.spark.sql.hive.execution.CreateMetastoreDataSource
-
- CreateMetastoreDataSourceAsSelect - Class in org.apache.spark.sql.hive.execution
-
- CreateMetastoreDataSourceAsSelect(String, String, SaveMode, Map<String, String>, LogicalPlan) - Constructor for class org.apache.spark.sql.hive.execution.CreateMetastoreDataSourceAsSelect
-
- createMetricsSystem(String, SparkConf, SecurityManager) - Static method in class org.apache.spark.metrics.MetricsSystem
-
- createNewSparkContext(SparkConf) - Static method in class org.apache.spark.streaming.StreamingContext
-
- createNewSparkContext(String, String, String, Seq<String>, Map<String, String>) - Static method in class org.apache.spark.streaming.StreamingContext
-
- createPartitioner() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
- createPathFromString(String, JobConf) - Static method in class org.apache.spark.SparkHadoopWriter
-
- createPathFromString(String, JobConf) - Static method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
-
- createPlan(String) - Static method in class org.apache.spark.sql.hive.HiveQl
-
Creates a LogicalPlan for a given HiveQL string.
- createPlanForView(Table, Option<String>) - Static method in class org.apache.spark.sql.hive.HiveQl
-
- createPollingStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPythonWorker(String, Map<String, String>) - Method in class org.apache.spark.SparkEnv
-
- createRDD(SparkContext, Map<String, String>, OffsetRange[], ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an RDD from Kafka using offset ranges for each topic and partition.
- createRDD(SparkContext, Map<String, String>, OffsetRange[], Map<TopicAndPartition, Broker>, Function1<MessageAndMetadata<K, V>, R>, ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>, ClassTag<R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
:: Experimental ::
Create an RDD from Kafka using offset ranges for each topic and partition.
- createRDD(JavaSparkContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Map<String, String>, OffsetRange[]) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an RDD from Kafka using offset ranges for each topic and partition.
- createRDD(JavaSparkContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Class<R>, Map<String, String>, OffsetRange[], Map<TopicAndPartition, Broker>, Function<MessageAndMetadata<K, V>, R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
:: Experimental ::
Create an RDD from Kafka using offset ranges for each topic and partition.
- createRecordFilter(Seq<Expression>) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
-
- createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.FixedLengthBinaryInputFormat
-
Create a FixedLengthBinaryRecordReader
- createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.StreamFileInputFormat
-
- createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.StreamInputFormat
-
- createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.WholeTextFileInputFormat
-
- createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.sql.parquet.FilteringParquetRowInputFormat
-
- createRedirectHandler(String, String, Function1<HttpServletRequest, BoxedUnit>, String) - Static method in class org.apache.spark.ui.JettyUtils
-
Create a handler that always redirects the user to the given path
- createRelation(SQLContext, Map<String, String>) - Method in class org.apache.spark.sql.jdbc.DefaultSource
-
Returns a new base relation with the given parameters.
- createRelation(SQLContext, Map<String, String>) - Method in class org.apache.spark.sql.json.DefaultSource
-
Returns a new base relation with the given parameters.
- createRelation(SQLContext, Map<String, String>, StructType) - Method in class org.apache.spark.sql.json.DefaultSource
-
Returns a new base relation with the given schema and parameters.
- createRelation(SQLContext, SaveMode, Map<String, String>, DataFrame) - Method in class org.apache.spark.sql.json.DefaultSource
-
- createRelation(SQLContext, Map<String, String>) - Method in class org.apache.spark.sql.parquet.DefaultSource
-
Returns a new base relation with the given parameters.
- createRelation(SQLContext, Map<String, String>, StructType) - Method in class org.apache.spark.sql.parquet.DefaultSource
-
Returns a new base relation with the given parameters and schema.
- createRelation(SQLContext, SaveMode, Map<String, String>, DataFrame) - Method in class org.apache.spark.sql.parquet.DefaultSource
-
Returns a new base relation with the given parameters and saves the given data into it.
- createRelation(SQLContext, SaveMode, Map<String, String>, DataFrame) - Method in interface org.apache.spark.sql.sources.CreatableRelationProvider
-
Creates a relation with the given parameters based on the contents of the given
DataFrame.
- createRelation(SQLContext, Map<String, String>) - Method in interface org.apache.spark.sql.sources.RelationProvider
-
Returns a new base relation with the given parameters.
- createRelation(SQLContext, Map<String, String>, StructType) - Method in interface org.apache.spark.sql.sources.SchemaRelationProvider
-
Returns a new base relation with the given parameters and user-defined schema.
- createRoutingTables(EdgeRDD<?>, Partitioner) - Static method in class org.apache.spark.graphx.VertexRDD
-
- createServlet(JettyUtils.ServletParams<T>, SecurityManager, Function1<T, Object>) - Static method in class org.apache.spark.ui.JettyUtils
-
- createServletHandler(String, JettyUtils.ServletParams<T>, SecurityManager, String, Function1<T, Object>) - Static method in class org.apache.spark.ui.JettyUtils
-
Create a context handler that responds to a request with the given path prefix
- createServletHandler(String, HttpServlet, String) - Static method in class org.apache.spark.ui.JettyUtils
-
Create a context handler that responds to a request with the given path prefix
- createSparkEnv(SparkConf, boolean, LiveListenerBus) - Method in class org.apache.spark.SparkContext
-
- createStaticHandler(String, String) - Static method in class org.apache.spark.ui.JettyUtils
-
Create a handler for serving files from a static directory
- createStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Create an input stream from a Flume source.
- createStream(StreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Create an input stream from a Flume source.
- createStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream from a Flume source.
- createStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream from a Flume source.
- createStream(JavaStreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream from a Flume source.
- createStream(StreamingContext, String, String, Map<String, Object>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages from Kafka Brokers.
- createStream(StreamingContext, Map<String, String>, Map<String, Object>, StorageLevel, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages from Kafka Brokers.
- createStream(JavaStreamingContext, String, String, Map<String, Integer>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages from Kafka Brokers.
- createStream(JavaStreamingContext, String, String, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages from Kafka Brokers.
- createStream(JavaStreamingContext, Class<K>, Class<V>, Class<U>, Class<T>, Map<String, String>, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages from Kafka Brokers.
- createStream(JavaStreamingContext, Map<String, String>, Map<String, Integer>, StorageLevel) - Method in class org.apache.spark.streaming.kafka.KafkaUtilsPythonHelper
-
- createStream(StreamingContext, String, String, Duration, InitialPositionInStream, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
Create an InputDStream that pulls messages from a Kinesis stream.
- createStream(JavaStreamingContext, String, String, Duration, InitialPositionInStream, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
Create a Java-friendly InputDStream that pulls messages from a Kinesis stream.
- createStream(StreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
-
Create an input stream that receives messages pushed by an MQTT publisher.
- createStream(JavaStreamingContext, String, String) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
-
Create an input stream that receives messages pushed by an MQTT publisher.
- createStream(JavaStreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
-
Create an input stream that receives messages pushed by an MQTT publisher.
- createStream(StreamingContext, Option<Authorization>, Seq<String>, StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter.
- createStream(JavaStreamingContext) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter using Twitter4J's default
OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
twitter4j.oauth.accessTokenSecret.
- createStream(JavaStreamingContext, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter using Twitter4J's default
OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
twitter4j.oauth.accessTokenSecret.
- createStream(JavaStreamingContext, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter using Twitter4J's default
OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
twitter4j.oauth.accessTokenSecret.
- createStream(JavaStreamingContext, Authorization) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter.
- createStream(JavaStreamingContext, Authorization, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter.
- createStream(JavaStreamingContext, Authorization, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter.
- createStream(StreamingContext, String, Subscribe, Function1<Seq<ByteString>, Iterator<T>>, StorageLevel, SupervisorStrategy, ClassTag<T>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
-
Create an input stream that receives messages pushed by a ZeroMQ publisher.
- createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel, SupervisorStrategy) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
-
Create an input stream that receives messages pushed by a ZeroMQ publisher.
- createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
-
Create an input stream that receives messages pushed by a ZeroMQ publisher.
- createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
-
Create an input stream that receives messages pushed by a ZeroMQ publisher.
- createTable(String, String, Seq<Attribute>, boolean, Option<CreateTableDesc>) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
-
Create table with specified database, table name, table description and schema
- CreateTableAsSelect - Class in org.apache.spark.sql.hive.execution
-
Create table and insert the query result into it.
- CreateTableAsSelect(String, String, LogicalPlan, boolean, Option<CreateTableDesc>) - Constructor for class org.apache.spark.sql.hive.execution.CreateTableAsSelect
-
- CreateTables() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
-
- CreateTableUsing - Class in org.apache.spark.sql.sources
-
Represents the operation of creating a table using a data source.
- CreateTableUsing(String, Option<StructType>, String, boolean, Map<String, String>, boolean, boolean) - Constructor for class org.apache.spark.sql.sources.CreateTableUsing
-
- CreateTableUsingAsSelect - Class in org.apache.spark.sql.sources
-
A node used to support CTAS statements and saveAsTable for the data source API.
- CreateTableUsingAsSelect(String, String, boolean, SaveMode, Map<String, String>, LogicalPlan) - Constructor for class org.apache.spark.sql.sources.CreateTableUsingAsSelect
-
- createTaskSetManager(TaskSet, int) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- createTempDir(String, String) - Static method in class org.apache.spark.util.Utils
-
Create a temporary directory inside the given parent directory.
- createTempLocalBlock() - Method in class org.apache.spark.storage.DiskBlockManager
-
Produces a unique block id and File suitable for storing local intermediate results.
- createTempShuffleBlock() - Method in class org.apache.spark.storage.DiskBlockManager
-
Produces a unique block id and File suitable for storing shuffled intermediate results.
- CreateTempTableUsing - Class in org.apache.spark.sql.sources
-
- CreateTempTableUsing(String, Option<StructType>, String, Map<String, String>) - Constructor for class org.apache.spark.sql.sources.CreateTempTableUsing
-
- CreateTempTableUsingAsSelect - Class in org.apache.spark.sql.sources
-
- CreateTempTableUsingAsSelect(String, String, SaveMode, Map<String, String>, LogicalPlan) - Constructor for class org.apache.spark.sql.sources.CreateTempTableUsingAsSelect
-
- createTime() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- createUsingIndex(Iterator<Product2<Object, VD2>>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
-
Similar in effect to aggregateUsingIndex((a, b) => a)
- createWorkspace(int) - Static method in class org.apache.spark.mllib.optimization.NNLS
-
- creationSite() - Method in class org.apache.spark.rdd.RDD
-
User code that created this RDD (e.g.
- creationSite() - Method in class org.apache.spark.streaming.dstream.DStream
-
- credentialsProvider() - Method in class org.apache.spark.streaming.kinesis.KinesisReceiver
-
- CrossValidator - Class in org.apache.spark.ml.tuning
-
:: AlphaComponent ::
K-fold cross validation.
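As a rough illustration of what k-fold cross validation means, the following Java sketch splits row indices into k folds and pairs each validation fold with the remaining training indices. It is not Spark's implementation: the class KFoldSketch, the method split, and the round-robin assignment of indices to folds are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark: illustrates the k-fold idea only.
final class KFoldSketch {

    /** Returns k pairs of {training indices, validation indices} over 0..n-1. */
    static List<List<List<Integer>>> split(int n, int k) {
        if (k < 2 || k > n) {
            throw new IllegalArgumentException("k must be in [2, n]");
        }
        List<List<List<Integer>>> folds = new ArrayList<>();
        for (int fold = 0; fold < k; fold++) {
            List<Integer> train = new ArrayList<>();
            List<Integer> validation = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                // Assumption: round-robin assignment; index i is held out in fold i % k.
                if (i % k == fold) {
                    validation.add(i);
                } else {
                    train.add(i);
                }
            }
            List<List<Integer>> pair = new ArrayList<>();
            pair.add(train);
            pair.add(validation);
            folds.add(pair);
        }
        return folds;
    }

    public static void main(String[] args) {
        for (List<List<Integer>> pair : split(6, 3)) {
            System.out.println("train=" + pair.get(0) + " validation=" + pair.get(1));
        }
    }
}
```

Each index is held out exactly once, so every row contributes to validation in one fold and to training in the other k - 1 folds.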
- CrossValidator() - Constructor for class org.apache.spark.ml.tuning.CrossValidator
-
- CrossValidatorModel - Class in org.apache.spark.ml.tuning
-
:: AlphaComponent ::
Model from k-fold cross validation.
- CrossValidatorModel(CrossValidator, ParamMap, Model<?>) - Constructor for class org.apache.spark.ml.tuning.CrossValidatorModel
-
- CrossValidatorParams - Interface in org.apache.spark.ml.tuning
-
- CSV_DEFAULT_DIR() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- CSV_DEFAULT_PERIOD() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- CSV_DEFAULT_UNIT() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- CSV_KEY_DIR() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- CSV_KEY_PERIOD() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- CSV_KEY_UNIT() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- CsvSink - Class in org.apache.spark.metrics.sink
-
- CsvSink(Properties, MetricRegistry, SecurityManager) - Constructor for class org.apache.spark.metrics.sink.CsvSink
-
- currentAttemptId() - Method in interface org.apache.spark.SparkStageInfo
-
- currentAttemptId() - Method in class org.apache.spark.SparkStageInfoImpl
-
- currentGraph() - Method in class org.apache.spark.mllib.impl.PeriodicGraphCheckpointer
-
- currentInterval(Duration) - Static method in class org.apache.spark.streaming.Interval
-
- currentLocalityIndex() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- currentResult() - Method in interface org.apache.spark.partial.ApproximateEvaluator
-
- currentResult() - Method in class org.apache.spark.partial.CountEvaluator
-
- currentResult() - Method in class org.apache.spark.partial.GroupedCountEvaluator
-
- currentResult() - Method in class org.apache.spark.partial.GroupedMeanEvaluator
-
- currentResult() - Method in class org.apache.spark.partial.GroupedSumEvaluator
-
- currentResult() - Method in class org.apache.spark.partial.MeanEvaluator
-
- currentResult() - Method in class org.apache.spark.partial.SumEvaluator
-
- currentUnrollMemory() - Method in class org.apache.spark.storage.MemoryStore
-
Return the amount of memory currently occupied for unrolling blocks across all threads.
- currentUnrollMemoryForThisThread() - Method in class org.apache.spark.storage.MemoryStore
-
Return the amount of memory currently occupied for unrolling blocks by this thread.
- currPrefLocs(Partition) - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- DAGScheduler - Class in org.apache.spark.scheduler
-
The high-level scheduling layer that implements stage-oriented scheduling.
- DAGScheduler(SparkContext, TaskScheduler, LiveListenerBus, MapOutputTrackerMaster, BlockManagerMaster, SparkEnv, Clock) - Constructor for class org.apache.spark.scheduler.DAGScheduler
-
- DAGScheduler(SparkContext, TaskScheduler) - Constructor for class org.apache.spark.scheduler.DAGScheduler
-
- DAGScheduler(SparkContext) - Constructor for class org.apache.spark.scheduler.DAGScheduler
-
- dagScheduler() - Method in class org.apache.spark.scheduler.DAGSchedulerSource
-
- dagScheduler() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- dagScheduler() - Method in class org.apache.spark.SparkContext
-
- DAGSchedulerEvent - Interface in org.apache.spark.scheduler
-
Types of events that can be handled by the DAGScheduler.
- DAGSchedulerEventProcessLoop - Class in org.apache.spark.scheduler
-
- DAGSchedulerEventProcessLoop(DAGScheduler) - Constructor for class org.apache.spark.scheduler.DAGSchedulerEventProcessLoop
-
- DAGSchedulerSource - Class in org.apache.spark.scheduler
-
- DAGSchedulerSource(DAGScheduler) - Constructor for class org.apache.spark.scheduler.DAGSchedulerSource
-
- data() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask
-
- data() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
-
- data() - Method in class org.apache.spark.storage.BlockResult
-
- data() - Method in class org.apache.spark.storage.PutResult
-
- data() - Method in class org.apache.spark.util.Distribution
-
- data() - Method in class org.apache.spark.util.random.GapSamplingIterator
-
- data() - Method in class org.apache.spark.util.random.GapSamplingReplacementIterator
-
- database() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
-
- database() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.QualifiedTableName
-
- databaseName() - Method in class org.apache.spark.sql.hive.MetastoreRelation
-
- databaseName() - Method in class org.apache.spark.sql.sources.RefreshTable
-
- dataDeserialize(BlockId, ByteBuffer, Serializer) - Method in class org.apache.spark.storage.BlockManager
-
Deserializes a ByteBuffer into an iterator of values and disposes of it when the end of
the iterator is reached.
- DataFrame - Class in org.apache.spark.sql
-
:: Experimental ::
A distributed collection of data organized into named columns.
- DataFrame(SQLContext, SQLContext.QueryExecution) - Constructor for class org.apache.spark.sql.DataFrame
-
- DataFrame(SQLContext, LogicalPlan) - Constructor for class org.apache.spark.sql.DataFrame
-
A constructor that automatically analyzes the logical plan.
- DATAFRAME_EAGER_ANALYSIS() - Static method in class org.apache.spark.sql.SQLConf
-
- dataFrameEagerAnalysis() - Method in class org.apache.spark.sql.SQLConf
-
- DataFrameHolder - Class in org.apache.spark.sql
-
A container for a DataFrame, used for implicit conversions.
- DataFrameHolder(DataFrame) - Constructor for class org.apache.spark.sql.DataFrameHolder
-
- dataSerialize(BlockId, Iterator<Object>, Serializer) - Method in class org.apache.spark.storage.BlockManager
-
Serializes into a byte buffer.
- dataSerializeStream(BlockId, OutputStream, Iterator<Object>, Serializer) - Method in class org.apache.spark.storage.BlockManager
-
Serializes into a stream.
- DataSinks() - Method in interface org.apache.spark.sql.hive.HiveStrategies
-
- DataSourceStrategy - Class in org.apache.spark.sql.sources
-
A Strategy for planning scans over data sources defined using the sources API.
- DataSourceStrategy() - Constructor for class org.apache.spark.sql.sources.DataSourceStrategy
-
- dataType() - Method in class org.apache.spark.sql.columnar.NativeColumnType
-
- dataType() - Method in class org.apache.spark.sql.hive.HiveGenericUdaf
-
- dataType() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
-
- dataType() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
-
- dataType() - Method in class org.apache.spark.sql.hive.HiveUdaf
-
- dataType() - Method in class org.apache.spark.sql.sources.DDLParser
-
- dataType() - Method in class org.apache.spark.sql.UserDefinedFunction
-
- dataType() - Method in class org.apache.spark.sql.UserDefinedPythonFunction
-
- DataValidators - Class in org.apache.spark.mllib.util
-
:: DeveloperApi ::
A collection of methods used to validate data before applying ML algorithms.
- DataValidators() - Constructor for class org.apache.spark.mllib.util.DataValidators
-
- DATE - Class in org.apache.spark.sql.columnar
-
- DATE() - Constructor for class org.apache.spark.sql.columnar.DATE
-
- date() - Method in class org.apache.spark.sql.ColumnName
-
Creates a new AttributeReference of type date
- DateColumnAccessor - Class in org.apache.spark.sql.columnar
-
- DateColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.DateColumnAccessor
-
- DateColumnBuilder - Class in org.apache.spark.sql.columnar
-
- DateColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.DateColumnBuilder
-
- DateColumnStats - Class in org.apache.spark.sql.columnar
-
- DateColumnStats() - Constructor for class org.apache.spark.sql.columnar.DateColumnStats
-
- DateConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
-
Accessor for nested Scala object
- datum() - Method in class org.apache.spark.mllib.tree.impl.BaggedPoint
-
- DDLException - Exception in org.apache.spark.sql.sources
-
The exception thrown from the DDL parser.
- DDLException(String) - Constructor for exception org.apache.spark.sql.sources.DDLException
-
- DDLParser - Class in org.apache.spark.sql.sources
-
A parser for foreign DDL commands.
- DDLParser(Function1<String, LogicalPlan>) - Constructor for class org.apache.spark.sql.sources.DDLParser
-
- dead(String) - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- decayFactor() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
- decimal() - Method in class org.apache.spark.sql.ColumnName
-
Creates a new AttributeReference of type decimal
- decimal(int, int) - Method in class org.apache.spark.sql.ColumnName
-
Creates a new AttributeReference of type decimal
- DecimalConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
-
Accessor for nested Scala object
- decimalMetadata() - Method in class org.apache.spark.sql.parquet.ParquetTypeInfo
-
- decimalMetastoreString(DecimalType) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- decimalTypeInfo(DecimalType) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- decimalTypeInfoToCatalyst(PrimitiveObjectInspector) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- DecisionTree - Class in org.apache.spark.mllib.tree
-
:: Experimental ::
A class which implements a decision tree learning algorithm for classification and regression.
- DecisionTree(Strategy) - Constructor for class org.apache.spark.mllib.tree.DecisionTree
-
- DecisionTreeMetadata - Class in org.apache.spark.mllib.tree.impl
-
Learning and dataset metadata for DecisionTree.
- DecisionTreeMetadata(int, long, int, int, Map<Object, Object>, Set<Object>, int[], Impurity, Enumeration.Value, int, int, double, int, int) - Constructor for class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
-
- DecisionTreeModel - Class in org.apache.spark.mllib.tree.model
-
:: Experimental ::
Decision tree model for classification or regression.
- DecisionTreeModel(Node, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
- DecisionTreeModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.tree.model
-
- DecisionTreeModel.SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
-
- DecisionTreeModel.SaveLoadV1_0$.NodeData - Class in org.apache.spark.mllib.tree.model
-
Model data for model import/export
- DecisionTreeModel.SaveLoadV1_0$.NodeData(int, int, org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0.PredictData, double, boolean, Option<org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0.SplitData>, Option<Object>, Option<Object>, Option<Object>) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
-
- DecisionTreeModel.SaveLoadV1_0$.PredictData - Class in org.apache.spark.mllib.tree.model
-
- DecisionTreeModel.SaveLoadV1_0$.PredictData(double, double) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData
-
- DecisionTreeModel.SaveLoadV1_0$.SplitData - Class in org.apache.spark.mllib.tree.model
-
- DecisionTreeModel.SaveLoadV1_0$.SplitData(int, double, int, Seq<Object>) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
-
- decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.BooleanBitSet
-
- decoder() - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnAccessor
-
- decoder(ByteBuffer, NativeColumnType<T>) - Method in interface org.apache.spark.sql.columnar.compression.CompressionScheme
-
- Decoder<T extends org.apache.spark.sql.types.NativeType> - Interface in org.apache.spark.sql.columnar.compression
-
- decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding
-
- decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.IntDelta
-
- decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.LongDelta
-
- decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.PassThrough
-
- decoder(ByteBuffer, NativeColumnType<T>) - Static method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding
-
- decreaseRunningTasks(int) - Method in class org.apache.spark.scheduler.Pool
-
- deepCopy() - Method in class org.apache.spark.mllib.tree.model.Node
-
Returns a deep copy of the subtree rooted at this node.
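To make the phrase "deep copy of the subtree" concrete, here is a minimal Java sketch of a recursive copy over a binary tree. The class TreeNodeSketch and its fields are hypothetical and do not mirror the fields of org.apache.spark.mllib.tree.model.Node; only the copying pattern is illustrated.

```java
// Hypothetical node type, not Spark's Node class.
final class TreeNodeSketch {
    final int id;
    TreeNodeSketch left;
    TreeNodeSketch right;

    TreeNodeSketch(int id, TreeNodeSketch left, TreeNodeSketch right) {
        this.id = id;
        this.left = left;
        this.right = right;
    }

    /** Recursively copies this node and every descendant into new objects. */
    TreeNodeSketch deepCopy() {
        TreeNodeSketch leftCopy = (left == null) ? null : left.deepCopy();
        TreeNodeSketch rightCopy = (right == null) ? null : right.deepCopy();
        return new TreeNodeSketch(id, leftCopy, rightCopy);
    }

    public static void main(String[] args) {
        TreeNodeSketch root =
            new TreeNodeSketch(1, new TreeNodeSketch(2, null, null), new TreeNodeSketch(3, null, null));
        TreeNodeSketch copy = root.deepCopy();
        root.left = null; // Mutating the original does not affect the copy.
        System.out.println(copy.left.id);
    }
}
```

Because no node object is shared between the two trees, later changes to one subtree are invisible to the other.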
- DEFAULT_BUFFER_SIZE() - Static method in class org.apache.spark.util.logging.RollingFileAppender
-
- DEFAULT_CLEANER_TTL() - Static method in class org.apache.spark.streaming.StreamingContext
-
- DEFAULT_DATA_SOURCE_NAME() - Static method in class org.apache.spark.sql.SQLConf
-
- DEFAULT_LOG_DIR() - Static method in class org.apache.spark.scheduler.EventLoggingListener
-
- DEFAULT_MINIMUM_SHARE() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- DEFAULT_PARTITION_NAME() - Static method in class org.apache.spark.sql.parquet.ParquetRelation2
-
- DEFAULT_POOL_NAME() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- DEFAULT_POOL_NAME() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
-
- DEFAULT_PORT() - Static method in class org.apache.spark.ui.SparkUI
-
- DEFAULT_RETAINED_JOBS() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
-
- DEFAULT_RETAINED_STAGES() - Static method in class org.apache.spark.ui.jobs.JobProgressListener
-
- DEFAULT_SCHEDULER_FILE() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- DEFAULT_SCHEDULING_MODE() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- DEFAULT_SIZE_IN_BYTES() - Static method in class org.apache.spark.sql.SQLConf
-
- DEFAULT_WEIGHT() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- defaultCorrName() - Static method in class org.apache.spark.mllib.stat.correlation.CorrelationNames
-
- defaultDataSourceName() - Method in class org.apache.spark.sql.SQLConf
-
- defaultFilter(Path) - Static method in class org.apache.spark.streaming.dstream.FileInputDStream
-
- defaultFormat() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- defaultMinPartitions() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Default min number of partitions for Hadoop RDDs when not given by user
- defaultMinPartitions() - Method in class org.apache.spark.SparkContext
-
Default min number of partitions for Hadoop RDDs when not given by user
Notice that we use math.min so the "defaultMinPartitions" cannot be higher than 2.
- defaultMinSplits() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- defaultMinSplits() - Method in class org.apache.spark.SparkContext
-
Default min number of partitions for Hadoop RDDs when not given by user
- defaultParallelism() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Default level of parallelism to use when not given by user (e.g.
- defaultParallelism() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- defaultParallelism() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- defaultParallelism() - Method in class org.apache.spark.scheduler.local.LocalBackend
-
- defaultParallelism() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
- defaultParallelism() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- defaultParallelism() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- defaultParallelism() - Method in class org.apache.spark.SparkContext
-
Default level of parallelism to use when not given by user (e.g.
- defaultParams(String) - Static method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
Returns default configuration for the boosting algorithm
- defaultParams(Enumeration.Value) - Static method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
Returns default configuration for the boosting algorithm
- defaultPartitioner(RDD<?>, Seq<RDD<?>>) - Static method in class org.apache.spark.Partitioner
-
Choose a partitioner to use for a cogroup-like operation between a number of RDDs.
- defaultPartitioner(int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
- defaultProbabilities() - Method in class org.apache.spark.util.Distribution
-
- defaultSize() - Method in class org.apache.spark.sql.columnar.ColumnType
-
- defaultSizeInBytes() - Method in class org.apache.spark.sql.SQLConf
-
The default size in bytes to assign to a logical operator's estimation statistics.
- DefaultSource - Class in org.apache.spark.sql.jdbc
-
Given a partitioning schematic (a column of integral type, a number of
partitions, and upper and lower bounds on the column's value), generate
WHERE clauses for each partition so that each row in the table appears
exactly once.
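The partitioning idea described above can be sketched in plain Java. This is an illustration under stated assumptions, not Spark's implementation: `PartitionClauseSketch` and `whereClauses` are hypothetical names, the stride is a simple integer division, and the column is assumed to contain no NULL values (a NULL would match none of the generated clauses).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the Spark API.
final class PartitionClauseSketch {
    // Builds one WHERE clause per partition. The first clause has no lower bound
    // and the last has no upper bound, so values outside [lower, upper) are still
    // covered and every non-null value matches exactly one clause.
    public static List<String> whereClauses(String column, long lower, long upper, int numPartitions) {
        if (numPartitions < 1) {
            throw new IllegalArgumentException("numPartitions must be at least 1");
        }
        List<String> clauses = new ArrayList<>();
        long stride = (upper - lower) / numPartitions; // integer division (assumption)
        long boundary = lower;
        for (int i = 0; i < numPartitions; i++) {
            String lo = (i == 0) ? null : column + " >= " + boundary;
            boundary += stride;
            String hi = (i == numPartitions - 1) ? null : column + " < " + boundary;
            if (lo == null && hi == null) {
                clauses.add("1 = 1"); // single partition: no restriction
            } else if (lo == null) {
                clauses.add(hi);
            } else if (hi == null) {
                clauses.add(lo);
            } else {
                clauses.add(lo + " AND " + hi);
            }
        }
        return clauses;
    }
}
```

For example, `whereClauses("id", 0, 100, 4)` yields `id < 25`, `id >= 25 AND id < 50`, `id >= 50 AND id < 75`, and `id >= 75`.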
- DefaultSource() - Constructor for class org.apache.spark.sql.jdbc.DefaultSource
-
- DefaultSource - Class in org.apache.spark.sql.json
-
- DefaultSource() - Constructor for class org.apache.spark.sql.json.DefaultSource
-
- DefaultSource - Class in org.apache.spark.sql.parquet
-
Allows creation of Parquet based tables using the syntax:
- DefaultSource() - Constructor for class org.apache.spark.sql.parquet.DefaultSource
-
- defaultStategy(Enumeration.Value) - Static method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- defaultStrategy(String) - Static method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- defaultStrategy() - Static method in class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
-
- defaultValue() - Method in class org.apache.spark.ml.param.Param
-
- DeferredObjectAdapter - Class in org.apache.spark.sql.hive
-
- DeferredObjectAdapter(ObjectInspector) - Constructor for class org.apache.spark.sql.hive.DeferredObjectAdapter
-
- degrees() - Method in class org.apache.spark.graphx.GraphOps
-
The degree of each vertex in the graph.
- degreesOfFreedom() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- degreesOfFreedom() - Method in interface org.apache.spark.mllib.stat.test.TestResult
-
Returns the degree(s) of freedom of the hypothesis test.
- delaySeconds() - Method in class org.apache.spark.streaming.Checkpoint
-
- delegate() - Method in class org.apache.spark.InterruptibleIterator
-
- deleteAllCheckpoints() - Method in class org.apache.spark.mllib.impl.PeriodicGraphCheckpointer
-
Call this at the end to delete any remaining checkpoint files.
- deleteAllCheckpoints() - Method in class org.apache.spark.mllib.tree.impl.NodeIdCache
-
Call this after training is finished to delete any remaining checkpoints.
- deleteOldFiles() - Method in class org.apache.spark.util.logging.RollingFileAppender
-
Retain only last few files
- deleteRecursively(File) - Static method in class org.apache.spark.util.Utils
-
Delete a file or directory and its contents recursively.
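A recursive delete of this kind might look like the following plain-Java sketch. It is not the Spark implementation: `DeleteRecursivelySketch` is a hypothetical class, and the decision not to descend into symbolic links is an assumption made here for safety.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical helper, not part of the Spark API.
final class DeleteRecursivelySketch {
    // Deletes the children of a directory first, then the directory (or file) itself.
    public static void deleteRecursively(File file) throws IOException {
        // Do not follow symlinked directories; only the link itself is removed.
        if (file.isDirectory() && !Files.isSymbolicLink(file.toPath())) {
            File[] children = file.listFiles();
            if (children == null) {
                throw new IOException("Failed to list files in " + file);
            }
            for (File child : children) {
                deleteRecursively(child);
            }
        }
        // delete() returns false on failure; tolerate a file that is already gone.
        if (!file.delete() && file.exists()) {
            throw new IOException("Failed to delete " + file);
        }
    }
}
```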
- deleteRecursively(TachyonFile, TachyonFS) - Static method in class org.apache.spark.util.Utils
-
Delete a file or directory and its contents recursively.
- dense(int, int, double[]) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Creates a column-major dense matrix.
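Column-major means the values array lists the first column top to bottom, then the second column, and so on. The plain-Java sketch below shows the resulting index arithmetic without depending on Spark; `ColumnMajorSketch` and `get` are hypothetical names used only for illustration.

```java
// Hypothetical helper, not part of the Spark API.
final class ColumnMajorSketch {
    // Returns the entry at row i, column j of a matrix with numRows rows
    // whose entries are stored column by column in values.
    public static double get(int numRows, double[] values, int i, int j) {
        return values[i + j * numRows];
    }

    // The 3 x 2 matrix
    //   1.0 2.0
    //   3.0 4.0
    //   5.0 6.0
    // laid out in column-major order.
    public static double[] example() {
        return new double[] {1.0, 3.0, 5.0, 2.0, 4.0, 6.0};
    }
}
```

With this layout, `get(3, example(), 0, 1)` reads the entry in row 0, column 1, which is 2.0.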
- dense(double, double...) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a dense vector from its values.
- dense(double, Seq<Object>) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a dense vector from its values.
- dense(double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a dense vector from a double array.
- DenseMatrix - Class in org.apache.spark.mllib.linalg
-
Column-major dense matrix.
- DenseMatrix(int, int, double[], boolean) - Constructor for class org.apache.spark.mllib.linalg.DenseMatrix
-
- DenseMatrix(int, int, double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseMatrix
-
Column-major dense matrix.
- DenseVector - Class in org.apache.spark.mllib.linalg
-
A dense vector represented by a value array.
- DenseVector(double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseVector
-
- dependencies() - Method in class org.apache.spark.rdd.RDD
-
Get the list of dependencies of this RDD, taking into account whether the
RDD is checkpointed or not.
- dependencies() - Method in class org.apache.spark.streaming.dstream.DStream
-
List of parent DStreams on which this DStream depends
- dependencies() - Method in class org.apache.spark.streaming.dstream.FilteredDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.FlatMappedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.FlatMapValuedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.ForEachDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.GlommedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.InputDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.MapPartitionedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.MappedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.MapValuedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.ShuffledDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.StateDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.TransformedDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.UnionDStream
-
- dependencies() - Method in class org.apache.spark.streaming.dstream.WindowedDStream
-
- Dependency<T> - Class in org.apache.spark
-
:: DeveloperApi ::
Base class for dependencies.
- Dependency() - Constructor for class org.apache.spark.Dependency
-
- deps() - Method in class org.apache.spark.rdd.CoGroupPartition
-
- depth() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Get depth of tree.
- DeregisterReceiver - Class in org.apache.spark.streaming.scheduler
-
- DeregisterReceiver(int, String, String) - Constructor for class org.apache.spark.streaming.scheduler.DeregisterReceiver
-
- desc() - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
-
- desc() - Method in class org.apache.spark.sql.Column
-
Returns an ordering used in sorting.
- desc(String) - Static method in class org.apache.spark.sql.functions
-
Returns a sort expression based on the descending order of the column.
- desc() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
-
- DescribeCommand - Class in org.apache.spark.sql.sources
-
Returned for the "DESCRIBE [EXTENDED] [dbName.]tableName" command.
- DescribeCommand(LogicalPlan, boolean) - Constructor for class org.apache.spark.sql.sources.DescribeCommand
-
- DescribeHiveTableCommand - Class in org.apache.spark.sql.hive.execution
-
Implementation for "describe [extended] table".
- DescribeHiveTableCommand(MetastoreRelation, Seq<Attribute>, boolean) - Constructor for class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
-
- describeTopics(int) - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
- describeTopics(int) - Method in class org.apache.spark.mllib.clustering.LDAModel
-
Return the topics described by weighted terms.
- describeTopics() - Method in class org.apache.spark.mllib.clustering.LDAModel
-
Return the topics described by weighted terms.
- describeTopics(int) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-
- description() - Method in class org.apache.spark.ExceptionFailure
-
- description() - Method in class org.apache.spark.storage.StorageLevel
-
- description() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
-
- DeserializationStream - Class in org.apache.spark.serializer
-
:: DeveloperApi ::
A stream for reading serialized objects.
- DeserializationStream() - Constructor for class org.apache.spark.serializer.DeserializationStream
-
- deserialize(Object) - Method in class org.apache.spark.mllib.linalg.VectorUDT
-
- deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.JavaSerializerInstance
-
- deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.JavaSerializerInstance
-
- deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.KryoSerializerInstance
-
- deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.KryoSerializerInstance
-
- deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
-
- deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
-
- deserialize(Object) - Method in class org.apache.spark.sql.test.ExamplePointUDT
-
- deserialize(byte[]) - Static method in class org.apache.spark.util.Utils
-
Deserialize an object using Java serialization
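Java serialization here refers to the standard `java.io.ObjectOutputStream` / `java.io.ObjectInputStream` mechanism. A minimal round trip in plain Java could look like this; `JavaSerializationSketch` is a hypothetical class and is not the Spark utility itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical helper, not part of the Spark API.
final class JavaSerializationSketch {
    // Writes a Serializable object to a byte array.
    public static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    // Reads an object back from bytes produced by serialize.
    @SuppressWarnings("unchecked")
    public static <T> T deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (T) in.readObject(); // unchecked cast: the caller chooses T
        }
    }
}
```

Deserializing with a specific ClassLoader, as in the overload listed next, would typically require an `ObjectInputStream` subclass that overrides `resolveClass`; that variant is not shown here.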
- deserialize(byte[], ClassLoader) - Static method in class org.apache.spark.util.Utils
-
Deserialize an object using Java serialization and the given ClassLoader
- deserialized() - Method in class org.apache.spark.storage.MemoryEntry
-
- deserialized() - Method in class org.apache.spark.storage.StorageLevel
-
- deserializeFilterExpressions(Configuration) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
-
Note: Inside the Hadoop API we only have access to Configuration, not to SparkContext,
so we cannot use broadcasts to convey the actual filter predicate.
- deserializeLongValue(byte[]) - Static method in class org.apache.spark.util.Utils
-
Deserialize a Long value (used for PythonPartitioner)
- deserializeMapStatuses(byte[]) - Static method in class org.apache.spark.MapOutputTracker
-
- deserializePlan(InputStream, Class<?>) - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
-
- deserializeStream(InputStream) - Method in class org.apache.spark.serializer.JavaSerializerInstance
-
- deserializeStream(InputStream, ClassLoader) - Method in class org.apache.spark.serializer.JavaSerializerInstance
-
- deserializeStream(InputStream) - Method in class org.apache.spark.serializer.KryoSerializerInstance
-
- deserializeStream(InputStream) - Method in class org.apache.spark.serializer.SerializerInstance
-
- deserializeViaNestedStream(InputStream, SerializerInstance, Function1<DeserializationStream, BoxedUnit>) - Static method in class org.apache.spark.util.Utils
-
Deserialize via nested stream using specific serializer
- deserializeWithDependencies(ByteBuffer) - Static method in class org.apache.spark.scheduler.Task
-
Deserialize the list of dependencies in a task serialized with serializeWithDependencies,
and return the task itself as a serialized ByteBuffer.
- destinationToken() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- destroy() - Method in class org.apache.spark.broadcast.Broadcast
-
Destroy all data and metadata related to this broadcast variable.
- destroy(boolean) - Method in class org.apache.spark.broadcast.Broadcast
-
Destroy all data and metadata related to this broadcast variable.
- destroyPythonWorker(String, Map<String, String>, Socket) - Method in class org.apache.spark.SparkEnv
-
- destTableId() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- details() - Method in class org.apache.spark.scheduler.Stage
-
- details() - Method in class org.apache.spark.scheduler.StageInfo
-
- determineBounds(ArrayBuffer<Tuple2<K, Object>>, int, Ordering<K>, ClassTag<K>) - Static method in class org.apache.spark.RangePartitioner
-
Determines the bounds for range partitioning from candidates with weights indicating how many
items each represents.
- DeveloperApi - Annotation Type in org.apache.spark.annotation
-
A lower-level, unstable API intended for developers.
- df() - Method in class org.apache.spark.sql.DataFrameHolder
-
- diag(Vector) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix
-
Generate a diagonal matrix in DenseMatrix format from the supplied values.
- diag(Vector) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Generate a diagonal matrix in Matrix format from the supplied values.
- DIALECT() - Static method in class org.apache.spark.sql.SQLConf
-
- dialect() - Method in class org.apache.spark.sql.SQLConf
-
The SQL dialect that is used when parsing queries.
- DictionaryEncoding - Class in org.apache.spark.sql.columnar.compression
-
- DictionaryEncoding() - Constructor for class org.apache.spark.sql.columnar.compression.DictionaryEncoding
-
- DictionaryEncoding.Decoder<T extends org.apache.spark.sql.types.NativeType> - Class in org.apache.spark.sql.columnar.compression
-
- DictionaryEncoding.Decoder(ByteBuffer, NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Decoder
-
- DictionaryEncoding.Encoder<T extends org.apache.spark.sql.types.NativeType> - Class in org.apache.spark.sql.columnar.compression
-
- DictionaryEncoding.Encoder(NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
-
- diff(Self) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
-
Hides vertices that are the same between this and other.
- diff(VertexRDD<VD>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- diff(VertexRDD<VD>) - Method in class org.apache.spark.graphx.VertexRDD
-
Hides vertices that are the same between this and other; for vertices that are different,
keeps the values from other.
- dir() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
-
- dir() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- DirectKafkaInputDStream<K,V,U extends kafka.serializer.Decoder<K>,T extends kafka.serializer.Decoder<V>,R> - Class in org.apache.spark.streaming.kafka
-
A stream of KafkaRDD where each given Kafka topic/partition corresponds to an RDD partition.
- DirectKafkaInputDStream(StreamingContext, Map<String, String>, Map<TopicAndPartition, Object>, Function1<MessageAndMetadata<K, V>, R>, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>, ClassTag<R>) - Constructor for class org.apache.spark.streaming.kafka.DirectKafkaInputDStream
-
- DirectKafkaInputDStream.DirectKafkaInputDStreamCheckpointData - Class in org.apache.spark.streaming.kafka
-
- DirectKafkaInputDStream.DirectKafkaInputDStreamCheckpointData() - Constructor for class org.apache.spark.streaming.kafka.DirectKafkaInputDStream.DirectKafkaInputDStreamCheckpointData
-
- DirectTaskResult<T> - Class in org.apache.spark.scheduler
-
A TaskResult that contains the task's return value and accumulator updates.
- DirectTaskResult(ByteBuffer, Map<Object, Object>, TaskMetrics) - Constructor for class org.apache.spark.scheduler.DirectTaskResult
-
- DirectTaskResult() - Constructor for class org.apache.spark.scheduler.DirectTaskResult
-
- disableOutputSpecValidation() - Static method in class org.apache.spark.rdd.PairRDDFunctions
-
Allows for the spark.hadoop.validateOutputSpecs checks to be disabled on a case-by-case
basis; see SPARK-4835 for more details.
- disconnected(SchedulerDriver) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- disconnected(SchedulerDriver) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- disconnected() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- DISK_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
-
- DISK_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
-
- DISK_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-
- DISK_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
-
- diskBlockManager() - Method in class org.apache.spark.storage.BlockManager
-
- DiskBlockManager - Class in org.apache.spark.storage
-
Creates and maintains the logical mapping between logical blocks and physical on-disk
locations.
- DiskBlockManager(BlockManager, SparkConf) - Constructor for class org.apache.spark.storage.DiskBlockManager
-
- DiskBlockObjectWriter - Class in org.apache.spark.storage
-
BlockObjectWriter which writes directly to a file on disk.
- DiskBlockObjectWriter(BlockId, File, Serializer, int, Function1<OutputStream, OutputStream>, boolean, ShuffleWriteMetrics) - Constructor for class org.apache.spark.storage.DiskBlockObjectWriter
-
- diskBytesSpilled() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
-
- diskBytesSpilled() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
-
- diskSize() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
-
- diskSize() - Method in class org.apache.spark.storage.BlockStatus
-
- diskSize() - Method in class org.apache.spark.storage.RDDInfo
-
- diskStore() - Method in class org.apache.spark.storage.BlockManager
-
- DiskStore - Class in org.apache.spark.storage
-
Stores BlockManager blocks on disk.
- DiskStore(BlockManager, DiskBlockManager) - Constructor for class org.apache.spark.storage.DiskStore
-
- diskUsed() - Method in class org.apache.spark.storage.StorageStatus
-
Return the disk space used by this block manager.
- diskUsed() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
-
- diskUsedByRdd(int) - Method in class org.apache.spark.storage.StorageStatus
-
Return the disk space used by the given RDD in this block manager in O(1) time.
- dispose(ByteBuffer) - Static method in class org.apache.spark.storage.BlockManager
-
Attempt to clean up a ByteBuffer if it is memory-mapped.
- dist(Vector) - Method in class org.apache.spark.util.Vector
-
- distinct() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct(int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct() - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct(int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct() - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct() - Method in class org.apache.spark.sql.DataFrame
-
- distinct() - Method in interface org.apache.spark.sql.RDDApi
-
- DistributedLDAModel - Class in org.apache.spark.mllib.clustering
-
:: Experimental ::
- DistributedLDAModel(LDA.EMOptimizer, double[]) - Constructor for class org.apache.spark.mllib.clustering.DistributedLDAModel
-
- DistributedMatrix - Interface in org.apache.spark.mllib.linalg.distributed
-
Represents a distributively stored matrix backed by one or more RDDs.
- Distribution - Class in org.apache.spark.util
-
Util for getting some stats from a small sample of numeric values, with some handy
summary functions.
- Distribution(double[], int, int) - Constructor for class org.apache.spark.util.Distribution
-
- Distribution(Traversable<Object>) - Constructor for class org.apache.spark.util.Distribution
-
- DIV() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- div(Duration) - Method in class org.apache.spark.streaming.Duration
-
- divide(Object) - Method in class org.apache.spark.sql.Column
-
Divides this expression by another expression.
- divide(double) - Method in class org.apache.spark.util.Vector
-
- doc() - Method in class org.apache.spark.ml.param.Param
-
- doCancelAllJobs() - Method in class org.apache.spark.scheduler.DAGScheduler
-
- docConcentration() - Method in class org.apache.spark.mllib.clustering.LDA.EMOptimizer
-
- doCheckpoint() - Method in class org.apache.spark.rdd.RDD
-
Performs the checkpointing of this RDD by saving this.
- doCheckpoint() - Method in class org.apache.spark.rdd.RDDCheckpointData
-
- DoCheckpoint - Class in org.apache.spark.streaming.scheduler
-
- DoCheckpoint(Time) - Constructor for class org.apache.spark.streaming.scheduler.DoCheckpoint
-
- doCleanupBroadcast(long, boolean) - Method in class org.apache.spark.ContextCleaner
-
Perform broadcast cleanup.
- doCleanupRDD(int, boolean) - Method in class org.apache.spark.ContextCleaner
-
Perform RDD cleanup.
- doCleanupShuffle(int, boolean) - Method in class org.apache.spark.ContextCleaner
-
Perform shuffle cleanup, asynchronously.
- doesDirectoryContainAnyNewFiles(File, long) - Static method in class org.apache.spark.util.Utils
-
Determines if a directory contains any files newer than cutoff seconds.
- doKillExecutors(Seq<String>) - Method in class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
-
Request that the ApplicationMaster kill the specified executors.
- doRequestTotalExecutors(int) - Method in class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
-
Request executors from the ApplicationMaster by specifying the total number desired.
- dot(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
-
dot(x, y)
- dot(Vector) - Method in class org.apache.spark.util.Vector
-
- DOUBLE - Class in org.apache.spark.sql.columnar
-
- DOUBLE() - Constructor for class org.apache.spark.sql.columnar.DOUBLE
-
- doubleAccumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator double variable, which tasks can "add" values to using the add method.
- doubleAccumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator double variable, which tasks can "add" values to using the add method.
- DoubleColumnAccessor - Class in org.apache.spark.sql.columnar
-
- DoubleColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.DoubleColumnAccessor
-
- DoubleColumnBuilder - Class in org.apache.spark.sql.columnar
-
- DoubleColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.DoubleColumnBuilder
-
- DoubleColumnStats - Class in org.apache.spark.sql.columnar
-
- DoubleColumnStats() - Constructor for class org.apache.spark.sql.columnar.DoubleColumnStats
-
- DoubleConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
-
Accessor for nested Scala object
- DoubleFlatMapFunction<T> - Interface in org.apache.spark.api.java.function
-
A function that returns zero or more records of type Double from each input record.
- DoubleFunction<T> - Interface in org.apache.spark.api.java.function
-
A function that returns Doubles, and can be used to construct DoubleRDDs.
- DoubleParam - Class in org.apache.spark.ml.param
-
Specialized version of Param[Double] for Java.
- DoubleParam(Params, String, String, Option<Object>) - Constructor for class org.apache.spark.ml.param.DoubleParam
-
- DoubleParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.DoubleParam
-
- DoubleRDDFunctions - Class in org.apache.spark.rdd
-
Extra functions available on RDDs of Doubles through an implicit conversion.
- DoubleRDDFunctions(RDD<Object>) - Constructor for class org.apache.spark.rdd.DoubleRDDFunctions
-
- doubleRDDToDoubleRDDFunctions(RDD<Object>) - Static method in class org.apache.spark.rdd.RDD
-
- doubleRDDToDoubleRDDFunctions(RDD<Object>) - Static method in class org.apache.spark.SparkContext
-
- doubleToDoubleWritable(double) - Static method in class org.apache.spark.SparkContext
-
- doubleToMultiplier(double) - Static method in class org.apache.spark.util.Vector
-
- doubleWritableConverter() - Static method in class org.apache.spark.SparkContext
-
- doubleWritableConverter() - Static method in class org.apache.spark.WritableConverter
-
- doubleWritableFactory() - Static method in class org.apache.spark.WritableFactory
-
- driver() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- driver() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- DRIVER_AKKA_ACTOR_NAME() - Method in class org.apache.spark.storage.BlockManagerMaster
-
- DRIVER_IDENTIFIER() - Static method in class org.apache.spark.SparkContext
-
- driverActor() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- driverActor() - Method in class org.apache.spark.storage.BlockManagerMaster
-
- driverActorSystemName() - Static method in class org.apache.spark.SparkEnv
-
- DriverQuirks - Class in org.apache.spark.sql.jdbc
-
Encapsulates workarounds for the extensions, quirks, and bugs in various
databases.
- DriverQuirks() - Constructor for class org.apache.spark.sql.jdbc.DriverQuirks
-
- driverSideSetup() - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
-
- dropFromMemory(BlockId, Either<Object[], ByteBuffer>) - Method in class org.apache.spark.storage.BlockManager
-
Drop a block from memory, possibly putting it on disk if applicable.
- droppedBlocks() - Method in class org.apache.spark.storage.PutResult
-
- droppedBlocks() - Method in class org.apache.spark.storage.ResultWithDroppedBlocks
-
- DropTable - Class in org.apache.spark.sql.hive.execution
-
Drops a table from the metastore and removes it if it is cached.
- DropTable(String, boolean) - Constructor for class org.apache.spark.sql.hive.execution.DropTable
-
- dropTempTable(String) - Method in class org.apache.spark.sql.SQLContext
-
Drops the temporary table with the given table name in the catalog.
- Dst - Static variable in class org.apache.spark.graphx.TripletFields
-
Expose the destination and edge fields but not the source field.
- dstAttr() - Method in class org.apache.spark.graphx.EdgeContext
-
The vertex attribute of the edge's destination vertex.
- dstAttr() - Method in class org.apache.spark.graphx.EdgeTriplet
-
The destination vertex attribute
- dstAttr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- dstEncodedIndices() - Method in class org.apache.spark.ml.recommendation.ALS.InBlock
-
- dstEncodedIndices() - Method in class org.apache.spark.ml.recommendation.ALS.UncompressedInBlock
-
- dstId() - Method in class org.apache.spark.graphx.Edge
-
- dstId() - Method in class org.apache.spark.graphx.EdgeContext
-
The vertex id of the edge's destination vertex.
- dstId() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- dstId() - Method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
-
- dstIds() - Method in class org.apache.spark.ml.recommendation.ALS.RatingBlock
-
- dstPtrs() - Method in class org.apache.spark.ml.recommendation.ALS.InBlock
-
- dstream() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
- dstream() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
- dstream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- DStream<T> - Class in org.apache.spark.streaming.dstream
-
A Discretized Stream (DStream), the basic abstraction in Spark Streaming, is a continuous
sequence of RDDs (of the same type) representing a continuous stream of data (see
org.apache.spark.rdd.RDD in the Spark core documentation for more details on RDDs).
- DStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.DStream
-
- DStreamCheckpointData<T> - Class in org.apache.spark.streaming.dstream
-
- DStreamCheckpointData(DStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.DStreamCheckpointData
-
- DStreamGraph - Class in org.apache.spark.streaming
-
- DStreamGraph() - Constructor for class org.apache.spark.streaming.DStreamGraph
-
- DTStatsAggregator - Class in org.apache.spark.mllib.tree.impl
-
DecisionTree statistics aggregator for a node.
- DTStatsAggregator(DecisionTreeMetadata, Option<int[]>) - Constructor for class org.apache.spark.mllib.tree.impl.DTStatsAggregator
-
- dtypes() - Method in class org.apache.spark.sql.DataFrame
-
Returns all column names and their data types as an array.
- DummyCategoricalSplit - Class in org.apache.spark.mllib.tree.model
-
Split with no acceptable feature values for categorical features.
- DummyCategoricalSplit(int, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DummyCategoricalSplit
-
- DummyHighSplit - Class in org.apache.spark.mllib.tree.model
-
Split with maximum threshold for continuous features.
- DummyHighSplit(int, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DummyHighSplit
-
- DummyLowSplit - Class in org.apache.spark.mllib.tree.model
-
Split with minimum threshold for continuous features.
- DummyLowSplit(int, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DummyLowSplit
-
- dumpTree(Node, StringBuilder, int) - Static method in class org.apache.spark.sql.hive.HiveQl
-
- duration() - Method in class org.apache.spark.scheduler.TaskInfo
-
- Duration - Class in org.apache.spark.streaming
-
- Duration(long) - Constructor for class org.apache.spark.streaming.Duration
-
- duration() - Method in class org.apache.spark.streaming.Interval
-
- Durations - Class in org.apache.spark.streaming
-
- Durations() - Constructor for class org.apache.spark.streaming.Durations
-
- f() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD2
-
- f() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
-
- f() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
-
- f() - Method in class org.apache.spark.sql.UserDefinedFunction
-
- f1Measure() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns document-based f1-measure averaged by the number of documents
- f1Measure(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns f1-measure for a given label (category)
- failAnalysis(String) - Method in class org.apache.spark.sql.sources.PreWriteCheck
-
- failed() - Method in class org.apache.spark.scheduler.TaskInfo
-
- FAILED() - Static method in class org.apache.spark.TaskState
-
- failedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- failedStages() - Method in class org.apache.spark.scheduler.DAGScheduler
-
- failedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- FailedStageTable - Class in org.apache.spark.ui.jobs
-
- FailedStageTable(Seq<StageInfo>, String, JobProgressListener, boolean) - Constructor for class org.apache.spark.ui.jobs.FailedStageTable
-
- failedTasks() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
-
- failedTasks() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
-
- failure() - Method in class org.apache.spark.partial.ApproximateActionListener
-
- failureReason() - Method in class org.apache.spark.scheduler.StageInfo
-
If the stage failed, the reason why.
- failuresBySlaveId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- FAIR() - Static method in class org.apache.spark.scheduler.SchedulingMode
-
- FAIR_SCHEDULER_PROPERTIES() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- FairSchedulableBuilder - Class in org.apache.spark.scheduler
-
- FairSchedulableBuilder(Pool, SparkConf) - Constructor for class org.apache.spark.scheduler.FairSchedulableBuilder
-
- FairSchedulingAlgorithm - Class in org.apache.spark.scheduler
-
- FairSchedulingAlgorithm() - Constructor for class org.apache.spark.scheduler.FairSchedulingAlgorithm
-
- fakeClassTag() - Static method in class org.apache.spark.api.java.JavaSparkContext
-
Produces a ClassTag[T], which is actually just a casted ClassTag[AnyRef].
- fakeOutput(Seq<Attribute>) - Method in class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion.PhysicalPlanHacks
-
- FALSE() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- FalsePositiveRate - Class in org.apache.spark.mllib.evaluation.binary
-
False positive rate.
- FalsePositiveRate() - Constructor for class org.apache.spark.mllib.evaluation.binary.FalsePositiveRate
-
- falsePositiveRate(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns false positive rate for a given label (category)
- fastSquaredDistance(VectorWithNorm, VectorWithNorm) - Static method in class org.apache.spark.mllib.clustering.KMeans
-
- fastSquaredDistance(Vector, double, Vector, double, double) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Returns the squared Euclidean distance between two vectors.
- feature() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
-
- feature() - Method in class org.apache.spark.mllib.tree.model.Split
-
- featureArity() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
-
- features() - Method in class org.apache.spark.mllib.regression.LabeledPoint
-
- featuresCol() - Method in interface org.apache.spark.ml.param.HasFeaturesCol
-
param for features column name
- featureSubset() - Method in class org.apache.spark.mllib.tree.RandomForest.NodeIndexInfo
-
- FeatureType - Class in org.apache.spark.mllib.tree.configuration
-
:: Experimental ::
Enum to describe whether a feature is "continuous" or "categorical"
- FeatureType() - Constructor for class org.apache.spark.mllib.tree.configuration.FeatureType
-
- featureType() - Method in class org.apache.spark.mllib.tree.model.Bin
-
- featureType() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
-
- featureType() - Method in class org.apache.spark.mllib.tree.model.Split
-
- featureUpdate(int, int, double, double) - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
-
Faster version of update.
- FetchFailed - Class in org.apache.spark
-
:: DeveloperApi ::
Task failed to fetch shuffle data from a remote node.
- FetchFailed(BlockManagerId, int, int, int, String) - Constructor for class org.apache.spark.FetchFailed
-
- fetchFile(String, File, SparkConf, SecurityManager, Configuration, long, boolean) - Static method in class org.apache.spark.util.Utils
-
Download a file or directory to target directory.
- fetchHcfsFile(Path, File, FileSystem, SparkConf, Configuration, boolean, Option<String>) - Static method in class org.apache.spark.util.Utils
-
Fetch a file or directory from a Hadoop-compatible filesystem.
- fetchPct() - Method in class org.apache.spark.scheduler.RuntimePercentage
-
- field() - Method in class org.apache.spark.storage.BroadcastBlockId
-
- FieldAccessFinder - Class in org.apache.spark.util
-
- FieldAccessFinder(Map<Class<?>, Set<String>>) - Constructor for class org.apache.spark.util.FieldAccessFinder
-
- FIFO() - Static method in class org.apache.spark.scheduler.SchedulingMode
-
- FIFOSchedulableBuilder - Class in org.apache.spark.scheduler
-
- FIFOSchedulableBuilder(Pool) - Constructor for class org.apache.spark.scheduler.FIFOSchedulableBuilder
-
- FIFOSchedulingAlgorithm - Class in org.apache.spark.scheduler
-
- FIFOSchedulingAlgorithm() - Constructor for class org.apache.spark.scheduler.FIFOSchedulingAlgorithm
-
- file() - Method in class org.apache.spark.storage.FileSegment
-
- file() - Method in class org.apache.spark.storage.TachyonFileSegment
-
- FileAppender - Class in org.apache.spark.util.logging
-
Continuously appends the data from an input stream into the given file.
- FileAppender(InputStream, File, int) - Constructor for class org.apache.spark.util.logging.FileAppender
-
- fileDir() - Method in class org.apache.spark.HttpFileServer
-
- fileExists(TachyonFile) - Method in class org.apache.spark.storage.TachyonBlockManager
-
- FileInputDStream<K,V,F extends org.apache.hadoop.mapreduce.InputFormat<K,V>> - Class in org.apache.spark.streaming.dstream
-
This class represents an input stream that monitors a Hadoop-compatible filesystem for new
files and creates a stream out of them.
- FileInputDStream(StreamingContext, String, Function1<Path, Object>, boolean, Option<Configuration>, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Constructor for class org.apache.spark.streaming.dstream.FileInputDStream
-
- FileInputDStream.FileInputDStreamCheckpointData - Class in org.apache.spark.streaming.dstream
-
A custom version of the DStreamCheckpointData that stores names of
Hadoop files as checkpoint data.
- FileInputDStream.FileInputDStreamCheckpointData() - Constructor for class org.apache.spark.streaming.dstream.FileInputDStream.FileInputDStreamCheckpointData
-
- filePath() - Method in class org.apache.spark.scheduler.cluster.SimrSchedulerBackend
-
- files() - Method in class org.apache.spark.SparkContext
-
- fileSegment() - Method in class org.apache.spark.storage.BlockObjectWriter
-
Returns the file segment of committed data that this Writer has written.
- fileSegment() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
-
- FileSegment - Class in org.apache.spark.storage
-
References a particular segment of a file (potentially the entire file),
based off an offset and a length.
- FileSegment(File, long, long) - Constructor for class org.apache.spark.storage.FileSegment
-
- fileServerSSLOptions() - Method in class org.apache.spark.SecurityManager
-
- fileStream(String, Class<K>, Class<V>, Class<F>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them using the given key-value types and input format.
- fileStream(String, Class<K>, Class<V>, Class<F>, Function<Path, Boolean>, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them using the given key-value types and input format.
- fileStream(String, Class<K>, Class<V>, Class<F>, Function<Path, Boolean>, boolean, Configuration) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them using the given key-value types and input format.
- fileStream(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them using the given key-value types and input format.
- fileStream(String, Function1<Path, Object>, boolean, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them using the given key-value types and input format.
- fileStream(String, Function1<Path, Object>, boolean, Configuration, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them using the given key-value types and input format.
- FileSystemHelper - Class in org.apache.spark.sql.parquet
-
- FileSystemHelper() - Constructor for class org.apache.spark.sql.parquet.FileSystemHelper
-
- fillObject(Iterator<Writable>, Deserializer, Seq<Tuple2<Attribute, Object>>, MutableRow) - Static method in class org.apache.spark.sql.hive.HadoopTableReader
-
Transform all given raw Writables into Rows.
- filter(Function<Double, Boolean>) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD containing only the elements that satisfy a predicate.
- filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD containing only the elements that satisfy a predicate.
- filter(Function<T, Boolean>) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD containing only the elements that satisfy a predicate.
- filter(Function1<Graph<VD, ED>, Graph<VD2, ED2>>, Function1<EdgeTriplet<VD2, ED2>, Object>, Function2<Object, VD2, Object>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.GraphOps
-
Filter the graph by computing some values to filter on, and applying the predicates.
- filter(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.impl.EdgePartition
-
Construct a new edge partition containing only the edges matching epred and where both
vertices match vpred.
- filter(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- filter(Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
-
Restrict the vertex set to the set of vertices satisfying the given predicate.
- filter(Function1<Tuple2<Object, VD>, Object>) - Method in class org.apache.spark.graphx.VertexRDD
-
Restricts the vertex set to the set of vertices satisfying the given predicate.
- filter(Params) - Method in class org.apache.spark.ml.param.ParamMap
-
Filters this param map for the given parent.
- filter(Function1<T, Object>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD containing only the elements that satisfy a predicate.
- filter(Column) - Method in class org.apache.spark.sql.DataFrame
-
Filters rows using the given condition.
- filter(String) - Method in class org.apache.spark.sql.DataFrame
-
Filters rows using the given SQL expression.
- Filter - Class in org.apache.spark.sql.sources
-
- Filter() - Constructor for class org.apache.spark.sql.sources.Filter
-
- filter() - Method in class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds
-
- filter(Function<T, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Return a new DStream containing only the elements that satisfy a predicate.
- filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream containing only the elements that satisfy a predicate.
- filter(Function1<T, Object>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream containing only the elements that satisfy a predicate.
- filter(Function1<Tuple2<A, B>, Object>) - Method in class org.apache.spark.util.TimeStampedHashMap
-
- filter(Function1<Tuple2<A, B>, Object>) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- FilteredDStream<T> - Class in org.apache.spark.streaming.dstream
-
- FilteredDStream(DStream<T>, Function1<T, Object>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.FilteredDStream
-
- FilteringParquetRowInputFormat - Class in org.apache.spark.sql.parquet
-
We extend ParquetInputFormat in order to have more control over which
RecordFilter we want to use.
- FilteringParquetRowInputFormat() - Constructor for class org.apache.spark.sql.parquet.FilteringParquetRowInputFormat
-
- filterName() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
-
- filterParams() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
-
- filterWith(Function1<Object, A>, Function2<T, A, Object>) - Method in class org.apache.spark.rdd.RDD
-
Filters this RDD with p, where p takes an additional parameter of type A.
- finalRDD() - Method in class org.apache.spark.scheduler.JobSubmitted
-
- finalStage() - Method in class org.apache.spark.scheduler.ActiveJob
-
- find(Object) - Static method in class org.apache.spark.serializer.SerializationDebugger
-
Find the path leading to a non-serializable object.
- findBestSplits(RDD<BaggedPoint<TreePoint>>, DecisionTreeMetadata, Node[], Map<Object, Node[]>, Map<Object, Map<Object, RandomForest.NodeIndexInfo>>, Split[][], Bin[][], Queue<Tuple2<Object, Node>>, TimeTracker, Option<NodeIdCache>) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Given a group of nodes, this finds the best split for each node.
- findClass(String) - Method in class org.apache.spark.util.ParentClassLoader
-
- findClosest(TraversableOnce<VectorWithNorm>, VectorWithNorm) - Static method in class org.apache.spark.mllib.clustering.KMeans
-
Returns the index of the closest center to the given point, as well as the squared distance.
- findLeader(String, int) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
-
- findLeaders(Set<TopicAndPartition>) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
-
- findMaxTaskId(String, Configuration) - Static method in class org.apache.spark.sql.parquet.FileSystemHelper
-
Finds the maximum taskid in the output file names at the given path.
- findSplitsForContinuousFeature(double[], DecisionTreeMetadata, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Find splits for a continuous feature.
NOTE: Returned number of splits is set based on featureSamples and
could be different from the specified numSplits.
- findSynonyms(String, int) - Method in class org.apache.spark.mllib.feature.Word2VecModel
-
Find synonyms of a word
- findSynonyms(Vector, int) - Method in class org.apache.spark.mllib.feature.Word2VecModel
-
Find synonyms of the vector representation of a word
- finishAll() - Method in class org.apache.spark.ui.ConsoleProgressBar
-
Mark all the stages as finished and clear the progress bar if it is shown, so that the
progress output does not interleave with the output of jobs.
- finished() - Method in class org.apache.spark.scheduler.ActiveJob
-
- finished() - Method in class org.apache.spark.scheduler.TaskInfo
-
- FINISHED() - Static method in class org.apache.spark.TaskState
-
- FINISHED_STATES() - Static method in class org.apache.spark.TaskState
-
- finishedTasks() - Method in class org.apache.spark.partial.ApproximateActionListener
-
- finishTime() - Method in class org.apache.spark.scheduler.TaskInfo
-
The time when the task has completed successfully (including the time to remotely fetch
results, if necessary).
- first() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
- first() - Method in class org.apache.spark.api.java.JavaPairRDD
-
- first() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the first element in this RDD.
- first() - Method in class org.apache.spark.rdd.RDD
-
Return the first element in this RDD.
- first() - Method in class org.apache.spark.sql.DataFrame
-
Returns the first row.
- first(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the first value in a group.
- first(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the first value of a column in a group.
- first() - Method in interface org.apache.spark.sql.RDDApi
-
- FIRST_DELAY() - Method in class org.apache.spark.ui.ConsoleProgressBar
-
- firstAvailableClass(String, String) - Method in interface org.apache.spark.mapred.SparkHadoopMapRedUtil
-
- firstAvailableClass(String, String) - Method in interface org.apache.spark.mapreduce.SparkHadoopMapReduceUtil
-
- fit(DataFrame, ParamPair<?>...) - Method in class org.apache.spark.ml.Estimator
-
Fits a single model to the input data with optional parameters.
- fit(DataFrame, Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.Estimator
-
Fits a single model to the input data with optional parameters.
- fit(DataFrame, ParamMap) - Method in class org.apache.spark.ml.Estimator
-
Fits a single model to the input data with provided parameter map.
- fit(DataFrame, ParamMap[]) - Method in class org.apache.spark.ml.Estimator
-
Fits multiple models to the input data with multiple sets of parameters.
- fit(DataFrame, ParamMap) - Method in class org.apache.spark.ml.feature.StandardScaler
-
- fit(DataFrame, ParamMap) - Method in class org.apache.spark.ml.impl.estimator.Predictor
-
- fit(DataFrame, ParamMap) - Method in class org.apache.spark.ml.Pipeline
-
Fits the pipeline to the input dataset with additional parameters.
- fit(DataFrame, ParamMap) - Method in class org.apache.spark.ml.recommendation.ALS
-
- fit(DataFrame, ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidator
-
- fit(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.feature.ChiSqSelector
-
Returns a ChiSquared feature selector.
- fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDF
-
Computes the inverse document frequency.
- fit(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDF
-
Computes the inverse document frequency.
- fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.StandardScaler
-
Computes the mean and variance and stores as a model to be used for later scaling.
- fit(RDD<S>) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Computes the vector representation of each word in vocabulary.
- fit(JavaRDD<S>) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Computes the vector representation of each word in vocabulary (Java version).
- fittingParamMap() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-
- fittingParamMap() - Method in class org.apache.spark.ml.feature.StandardScalerModel
-
- fittingParamMap() - Method in class org.apache.spark.ml.Model
-
Fitting parameters, such that parent.fit(..., fittingParamMap) could reproduce the model.
- fittingParamMap() - Method in class org.apache.spark.ml.PipelineModel
-
- fittingParamMap() - Method in class org.apache.spark.ml.recommendation.ALSModel
-
- fittingParamMap() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
-
- fittingParamMap() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
-
- FixedLengthBinaryInputFormat - Class in org.apache.spark.input
-
- FixedLengthBinaryInputFormat() - Constructor for class org.apache.spark.input.FixedLengthBinaryInputFormat
-
- FixedLengthBinaryRecordReader - Class in org.apache.spark.input
-
FixedLengthBinaryRecordReader is returned by FixedLengthBinaryInputFormat.
- FixedLengthBinaryRecordReader() - Constructor for class org.apache.spark.input.FixedLengthBinaryRecordReader
-
- flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by first applying a function to all elements of this
RDD, and then flattening the results.
- flatMap(Function1<T, TraversableOnce<U>>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD by first applying a function to all elements of this
RDD, and then flattening the results.
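As an illustration of the "apply a function, then flatten" behaviour described for flatMap, here is a minimal sketch using plain java.util.stream rather than the Spark API. The class name FlatMapSketch and the method splitWords are hypothetical names chosen for this example; the Spark method operates on RDDs, not on in-memory lists.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Plain-Java analogue of flatMap; this is not Spark code. */
public class FlatMapSketch {

    // Maps each line to zero or more words, then flattens all the
    // per-line results into a single list.
    public static List<String> splitWords(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split(" ")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Two input elements produce three output elements.
        System.out.println(splitWords(Arrays.asList("a b", "c")));
    }
}
```

The point of the sketch is that the output size is not tied to the input size: each input element may contribute any number of output elements.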
- flatMap(Function1<Row, TraversableOnce<R>>, ClassTag<R>) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new RDD by first applying a function to all rows of this DataFrame,
and then flattening the results.
- flatMap(Function1<T, TraversableOnce<R>>, ClassTag<R>) - Method in interface org.apache.spark.sql.RDDApi
-
- flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream by applying a function to all elements of this DStream,
and then flattening the results
- flatMap(Function1<T, Traversable<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream by applying a function to all elements of this DStream,
and then flattening the results
- FlatMapFunction<T,R> - Interface in org.apache.spark.api.java.function
-
A function that returns zero or more output records from each input record.
- FlatMapFunction2<T1,T2,R> - Interface in org.apache.spark.api.java.function
-
A function that takes two inputs and returns zero or more output records.
- FlatMappedDStream<T,U> - Class in org.apache.spark.streaming.dstream
-
- FlatMappedDStream(DStream<T>, Function1<T, Traversable<U>>, ClassTag<T>, ClassTag<U>) - Constructor for class org.apache.spark.streaming.dstream.FlatMappedDStream
-
- flatMapToDouble(DoubleFlatMapFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by first applying a function to all elements of this
RDD, and then flattening the results.
- flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by first applying a function to all elements of this
RDD, and then flattening the results.
- flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream by applying a function to all elements of this DStream,
and then flattening the results
- FlatMapValuedDStream<K,V,U> - Class in org.apache.spark.streaming.dstream
-
- FlatMapValuedDStream(DStream<Tuple2<K, V>>, Function1<V, TraversableOnce<U>>, ClassTag<K>, ClassTag<V>, ClassTag<U>) - Constructor for class org.apache.spark.streaming.dstream.FlatMapValuedDStream
-
- flatMapValues(Function<V, Iterable<U>>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Pass each value in the key-value pair RDD through a flatMap function without changing the
keys; this also retains the original RDD's partitioning.
- flatMapValues(Function1<V, TraversableOnce<U>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Pass each value in the key-value pair RDD through a flatMap function without changing the
keys; this also retains the original RDD's partitioning.
- flatMapValues(Function<V, Iterable<U>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying a flatMap function to the value of each key-value pair in
this DStream without changing the key.
- flatMapValues(Function1<V, TraversableOnce<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying a flatMap function to the value of each key-value pair in
this DStream without changing the key.
- flatMapWith(Function1<Object, A>, boolean, Function2<T, A, Seq<U>>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
FlatMaps f over this RDD, where f takes an additional parameter of type A.
- FLOAT - Class in org.apache.spark.sql.columnar
-
- FLOAT() - Constructor for class org.apache.spark.sql.columnar.FLOAT
-
- FloatColumnAccessor - Class in org.apache.spark.sql.columnar
-
- FloatColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.FloatColumnAccessor
-
- FloatColumnBuilder - Class in org.apache.spark.sql.columnar
-
- FloatColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.FloatColumnBuilder
-
- FloatColumnStats - Class in org.apache.spark.sql.columnar
-
- FloatColumnStats() - Constructor for class org.apache.spark.sql.columnar.FloatColumnStats
-
- FloatConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
-
Accessor for nested Scala object
- FloatParam - Class in org.apache.spark.ml.param
-
Specialized version of Param[Float] for Java.
- FloatParam(Params, String, String, Option<Object>) - Constructor for class org.apache.spark.ml.param.FloatParam
-
- FloatParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.FloatParam
-
- floatToFloatWritable(float) - Static method in class org.apache.spark.SparkContext
-
- floatWritableConverter() - Static method in class org.apache.spark.SparkContext
-
- floatWritableConverter() - Static method in class org.apache.spark.WritableConverter
-
- floatWritableFactory() - Static method in class org.apache.spark.WritableFactory
-
- floor(Duration) - Method in class org.apache.spark.streaming.Time
-
- FlumeBatchFetcher - Class in org.apache.spark.streaming.flume
-
- FlumeBatchFetcher(FlumePollingReceiver) - Constructor for class org.apache.spark.streaming.flume.FlumeBatchFetcher
-
- FlumeConnection - Class in org.apache.spark.streaming.flume
-
A wrapper around the transceiver and the Avro IPC API.
- FlumeConnection(NettyTransceiver, SparkFlumeProtocol.Callback) - Constructor for class org.apache.spark.streaming.flume.FlumeConnection
-
- FlumeEventServer - Class in org.apache.spark.streaming.flume
-
A simple server that implements Flume's Avro protocol.
- FlumeEventServer(FlumeReceiver) - Constructor for class org.apache.spark.streaming.flume.FlumeEventServer
-
- FlumeInputDStream<T> - Class in org.apache.spark.streaming.flume
-
- FlumeInputDStream(StreamingContext, String, int, StorageLevel, boolean, ClassTag<T>) - Constructor for class org.apache.spark.streaming.flume.FlumeInputDStream
-
- FlumePollingInputDStream<T> - Class in org.apache.spark.streaming.flume
-
A ReceiverInputDStream that can be used to read data from several Flume agents running
SparkSinks.
- FlumePollingInputDStream(StreamingContext, Seq<InetSocketAddress>, int, int, StorageLevel, ClassTag<T>) - Constructor for class org.apache.spark.streaming.flume.FlumePollingInputDStream
-
- FlumePollingReceiver - Class in org.apache.spark.streaming.flume
-
- FlumePollingReceiver(Seq<InetSocketAddress>, int, int, StorageLevel) - Constructor for class org.apache.spark.streaming.flume.FlumePollingReceiver
-
- FlumeReceiver - Class in org.apache.spark.streaming.flume
-
A NetworkReceiver which listens for events using the
Flume Avro interface.
- FlumeReceiver(String, int, StorageLevel, boolean) - Constructor for class org.apache.spark.streaming.flume.FlumeReceiver
-
- FlumeReceiver.CompressionChannelPipelineFactory - Class in org.apache.spark.streaming.flume
-
A Netty Pipeline factory that will decompress incoming data from
the Netty client and compress data going back to the client.
- FlumeReceiver.CompressionChannelPipelineFactory() - Constructor for class org.apache.spark.streaming.flume.FlumeReceiver.CompressionChannelPipelineFactory
-
- FlumeUtils - Class in org.apache.spark.streaming.flume
-
- FlumeUtils() - Constructor for class org.apache.spark.streaming.flume.FlumeUtils
-
- flush() - Method in class org.apache.spark.serializer.JavaSerializationStream
-
- flush() - Method in class org.apache.spark.serializer.KryoSerializationStream
-
- flush() - Method in class org.apache.spark.serializer.SerializationStream
-
- flush() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
-
- flush() - Method in class org.apache.spark.streaming.util.RateLimitedOutputStream
-
- FMeasure - Class in org.apache.spark.mllib.evaluation.binary
-
F-Measure.
- FMeasure(double) - Constructor for class org.apache.spark.mllib.evaluation.binary.FMeasure
-
- fMeasure(double, double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns f-measure for a given label (category)
- fMeasure(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns f1-measure for a given label (category)
- fMeasure() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns f-measure
(equal to precision and recall, because precision equals recall)
- fMeasureByThreshold(double) - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the (threshold, F-Measure) curve.
- fMeasureByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the (threshold, F-Measure) curve with beta = 1.0.
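For readers unfamiliar with the beta parameter mentioned above, the following sketch computes the standard F-beta score from a precision and a recall value. It is a standalone illustration, not the MLlib implementation: the class name FMeasureSketch is hypothetical, and returning 0.0 when the denominator is zero is an assumption made here to avoid dividing by zero.

```java
/** Standalone F-beta computation; this is not the MLlib code. */
public class FMeasureSketch {

    // F_beta = (1 + beta^2) * P * R / (beta^2 * P + R).
    // Assumption: return 0.0 when the denominator is zero.
    public static double fMeasure(double precision, double recall, double beta) {
        double beta2 = beta * beta;
        double denominator = beta2 * precision + recall;
        if (denominator == 0.0) {
            return 0.0;
        }
        return (1.0 + beta2) * precision * recall / denominator;
    }

    public static void main(String[] args) {
        // With beta = 1.0 this is the harmonic mean of precision and recall.
        System.out.println(fMeasure(0.5, 0.5, 1.0));
    }
}
```

With beta = 1.0 precision and recall are weighted equally; a beta above 1 weights recall more heavily, and a beta below 1 weights precision more heavily.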
- fold(T, Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Aggregate the elements of each partition, and then the results for all the partitions, using a
given associative function and a neutral "zero value".
- fold(T, Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD
-
Aggregate the elements of each partition, and then the results for all the partitions, using a
given associative function and a neutral "zero value".
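The reason the zero value must be neutral can be seen in a small simulation. The sketch below models partitions as nested lists in plain Java and folds them in two stages, as the description above outlines; FoldSketch and its fold method are hypothetical names, and no Spark classes are involved.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

/** Plain-Java simulation of a partition-wise fold; this is not Spark code. */
public class FoldSketch {

    // Folds each partition starting from zeroValue, then merges the
    // per-partition results, again starting from zeroValue.
    public static <T> T fold(List<List<T>> partitions, T zeroValue, BinaryOperator<T> op) {
        T result = zeroValue;
        for (List<T> partition : partitions) {
            T partial = zeroValue;
            for (T element : partition) {
                partial = op.apply(partial, element);
            }
            result = op.apply(result, partial);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(1, 2), Arrays.asList(3, 4));
        // Neutral zero for addition: prints 10, the plain sum.
        System.out.println(fold(partitions, 0, Integer::sum));
        // Non-neutral zero: prints 13, because the zero is added once
        // per partition and once more in the merge step.
        System.out.println(fold(partitions, 1, Integer::sum));
    }
}
```

Because the zero value enters the computation once per partition plus once in the merge, a non-neutral zero makes the result depend on how the data happens to be partitioned.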
- foldable() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
-
- foldable() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
-
- foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative function and a neutral "zero value" which
may be added to the result an arbitrary number of times, and must not change the result
(e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
- foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative function and a neutral "zero value" which
may be added to the result an arbitrary number of times, and must not change the result
(e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
- foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative function and a neutral "zero value"
which may be added to the result an arbitrary number of times, and must not change the result
(e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
- foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative function and a neutral "zero value" which
may be added to the result an arbitrary number of times, and must not change the result
(e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
- foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative function and a neutral "zero value" which
may be added to the result an arbitrary number of times, and must not change the result
(e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
- foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative function and a neutral "zero value" which
may be added to the result an arbitrary number of times, and must not change the result
(e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
- forAttribute() - Method in class org.apache.spark.sql.columnar.PartitionStatistics
-
- foreach(VoidFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Applies a function f to all elements of this RDD.
- foreach(Function1<Edge<ED>, BoxedUnit>) - Method in class org.apache.spark.graphx.impl.EdgePartition
-
Apply the function f to all edges in this partition.
- foreach(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
-
Applies a function f to all elements of this RDD.
- foreach(Function1<Row, BoxedUnit>) - Method in class org.apache.spark.sql.DataFrame
-
Applies a function f to all rows.
- foreach(Function1<T, BoxedUnit>) - Method in interface org.apache.spark.sql.RDDApi
-
- foreach(Function<R, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Deprecated.
As of release 0.9.0, replaced by foreachRDD
- foreach(Function2<R, Time, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Deprecated.
As of release 0.9.0, replaced by foreachRDD
- foreach(Function1<RDD<T>, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Apply a function to each RDD in this DStream.
- foreach(Function2<RDD<T>, Time, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Apply a function to each RDD in this DStream.
- foreach(Function1<Tuple2<A, B>, U>) - Method in class org.apache.spark.util.TimeStampedHashMap
-
- foreach(Function1<A, U>) - Method in class org.apache.spark.util.TimeStampedHashSet
-
- foreach(Function1<Tuple2<A, B>, U>) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- foreachActive(Function3<Object, Object, Object, BoxedUnit>) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- foreachActive(Function2<Object, Object, BoxedUnit>) - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- foreachActive(Function3<Object, Object, Object, BoxedUnit>) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Applies a function f to all the active elements of dense and sparse matrix.
- foreachActive(Function3<Object, Object, Object, BoxedUnit>) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- foreachActive(Function2<Object, Object, BoxedUnit>) - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- foreachActive(Function2<Object, Object, BoxedUnit>) - Method in interface org.apache.spark.mllib.linalg.Vector
-
Applies a function f to all the active elements of dense and sparse vector.
- foreachAsync(VoidFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
The asynchronous version of the foreach action, which
applies a function f to all the elements of this RDD.
- foreachAsync(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Applies a function f to all elements of this RDD.
- ForEachDStream<T> - Class in org.apache.spark.streaming.dstream
-
- ForEachDStream(DStream<T>, Function2<RDD<T>, Time, BoxedUnit>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ForEachDStream
-
- foreachPartition(VoidFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Applies a function f to each partition of this RDD.
- foreachPartition(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
-
Applies a function f to each partition of this RDD.
- foreachPartition(Function1<Iterator<Row>, BoxedUnit>) - Method in class org.apache.spark.sql.DataFrame
-
Applies a function f to each partition of this DataFrame.
- foreachPartition(Function1<Iterator<T>, BoxedUnit>) - Method in interface org.apache.spark.sql.RDDApi
-
- foreachPartitionAsync(VoidFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
The asynchronous version of the foreachPartition action, which
applies a function f to each partition of this RDD.
- foreachPartitionAsync(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Applies a function f to each partition of this RDD.
- foreachRDD(Function<R, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Apply a function to each RDD in this DStream.
- foreachRDD(Function2<R, Time, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Apply a function to each RDD in this DStream.
- foreachRDD(Function1<RDD<T>, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Apply a function to each RDD in this DStream.
- foreachRDD(Function2<RDD<T>, Time, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Apply a function to each RDD in this DStream.
- foreachWith(Function1<Object, A>, Function2<T, A, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
-
Applies f to each element of this RDD, where f takes an additional parameter of type A.
- foreachWithinEdgePartition(int, boolean, boolean, Function1<Object, BoxedUnit>) - Method in class org.apache.spark.graphx.impl.RoutingTablePartition
-
Runs f on each vertex id to be sent to the specified edge partition.
- formatDate(Date) - Static method in class org.apache.spark.ui.UIUtils
-
- formatDate(long) - Static method in class org.apache.spark.ui.UIUtils
-
- formatDuration(long) - Static method in class org.apache.spark.ui.UIUtils
-
- formatDurationVerbose(long) - Static method in class org.apache.spark.ui.UIUtils
-
Generate a verbose human-readable string representing a duration such as "5 second 35 ms"
- formatNumber(double) - Static method in class org.apache.spark.ui.UIUtils
-
Generate a human-readable string representing a number (e.g.
- formatter() - Method in class org.apache.spark.util.logging.SizeBasedRollingPolicy
-
- formatVersion() - Method in interface org.apache.spark.mllib.util.Saveable
-
Current version of model save/load format.
- formatWindowsPath(String) - Static method in class org.apache.spark.util.Utils
-
Format a Windows path such that it can be safely passed to a URI.
- FPGrowth - Class in org.apache.spark.mllib.fpm
-
:: Experimental ::
- FPGrowth() - Constructor for class org.apache.spark.mllib.fpm.FPGrowth
-
Constructs a default instance with default parameters {minSupport: 0.3, numPartitions: same
as the input data}.
- FPGrowth.FreqItemset<Item> - Class in org.apache.spark.mllib.fpm
-
Frequent itemset.
- FPGrowth.FreqItemset(Object, long) - Constructor for class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset
-
- FPGrowthModel<Item> - Class in org.apache.spark.mllib.fpm
-
:: Experimental ::
- FPGrowthModel(RDD<FPGrowth.FreqItemset<Item>>, ClassTag<Item>) - Constructor for class org.apache.spark.mllib.fpm.FPGrowthModel
-
- FPTree<T> - Class in org.apache.spark.mllib.fpm
-
FP-Tree data structure used in FP-Growth.
- FPTree() - Constructor for class org.apache.spark.mllib.fpm.FPTree
-
- FPTree.Node<T> - Class in org.apache.spark.mllib.fpm
-
Represents a node in an FP-Tree.
- FPTree.Node(FPTree.Node<T>) - Constructor for class org.apache.spark.mllib.fpm.FPTree.Node
-
- framework() - Method in class org.apache.spark.streaming.Checkpoint
-
- frameworkMessage(SchedulerDriver, Protos.ExecutorID, Protos.SlaveID, byte[]) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- frameworkMessage(SchedulerDriver, Protos.ExecutorID, Protos.SlaveID, byte[]) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- freeCores() - Method in class org.apache.spark.scheduler.cluster.ExecutorData
-
- freeMemory() - Method in class org.apache.spark.storage.MemoryStore
-
Free memory not occupied by existing blocks.
- freq() - Method in class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset
-
- freqItemsets() - Method in class org.apache.spark.mllib.fpm.FPGrowthModel
-
- fromAvroFlumeEvent(AvroFlumeEvent) - Static method in class org.apache.spark.streaming.flume.SparkFlumeEvent
-
- fromBinary(Binary) - Static method in class org.apache.spark.sql.parquet.timestamp.NanoTime
-
- fromBreeze(Matrix<Object>) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Creates a Matrix instance from a breeze matrix.
- fromBreeze(Vector<Object>) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a vector instance from a breeze vector.
- fromByteString(ByteString) - Static method in class org.apache.spark.scheduler.cluster.mesos.MesosTaskLaunchData
-
- fromCOO(int, int, Iterable<Tuple3<Object, Object, Object>>) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
-
Generate a SparseMatrix from Coordinate List (COO) format.
- fromDataType(DataType, String, boolean, boolean, boolean) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
-
Converts a given Catalyst DataType into the corresponding Parquet Type.
- fromDStream(DStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaDStream
-
- fromEdgePartitions(RDD<Tuple2<Object, EdgePartition<ED, VD>>>, ClassTag<ED>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.EdgeRDD
-
Creates an EdgeRDD from already-constructed edge partitions.
- fromEdgePartitions(RDD<Tuple2<Object, EdgePartition<ED, VD>>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
-
Create a graph from EdgePartitions, setting referenced vertices to `defaultVertexAttr`.
- fromEdges(RDD<Edge<ED>>, ClassTag<ED>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.EdgeRDD
-
Creates an EdgeRDD from a set of edges.
- fromEdges(RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph
-
Construct a graph from a collection of edges.
- fromEdges(EdgeRDD<?>, int, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
-
Constructs a VertexRDD containing all vertices referred to in edges.
- fromEdgeTuples(RDD<Tuple2<Object, Object>>, VD, Option<PartitionStrategy>, StorageLevel, StorageLevel, ClassTag<VD>) - Static method in class org.apache.spark.graphx.Graph
-
Construct a graph from a collection of edges encoded as vertex id pairs.
- fromExistingRDDs(VertexRDD<VD>, EdgeRDD<ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
-
Create a graph from a VertexRDD and an EdgeRDD with the same replicated vertex type as the
vertices.
- fromInputDStream(InputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaInputDStream
-
- fromInputDStream(InputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
-
- fromJavaDStream(JavaDStream<Tuple2<K, V>>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- fromJavaRDD(JavaRDD<Tuple2<K, V>>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
Convert a JavaRDD of key-value pairs to JavaPairRDD.
- fromMesos(Protos.TaskState) - Static method in class org.apache.spark.TaskState
-
- fromMsgs(int, Iterator<Tuple2<Object, Object>>) - Static method in class org.apache.spark.graphx.impl.RoutingTablePartition
-
Build a `RoutingTablePartition` from `RoutingTableMessage`s.
- fromOffset() - Method in class org.apache.spark.streaming.kafka.KafkaRDDPartition
-
- fromOffset() - Method in class org.apache.spark.streaming.kafka.OffsetRange
-
inclusive starting offset
- fromOffsets() - Method in class org.apache.spark.streaming.kafka.DirectKafkaInputDStream
-
- fromPairDStream(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- fromPrimitiveDataType(DataType) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
-
For a given Catalyst DataType return the name of the corresponding Parquet primitive type or None
if the given type is not primitive.
- fromRDD(RDD<Object>) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
-
- fromRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
- fromRDD(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.api.java.JavaRDD
-
- fromRDD(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.mllib.rdd.RDDFunctions
-
Implicit conversion from an RDD to RDDFunctions.
- fromRdd(RDD<?>) - Static method in class org.apache.spark.storage.RDDInfo
-
- fromReceiverInputDStream(ReceiverInputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
-
- fromReceiverInputDStream(ReceiverInputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
-
- fromSparkContext(SparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
-
- fromStage(Stage, Option<Object>) - Static method in class org.apache.spark.scheduler.StageInfo
-
Construct a StageInfo from a Stage.
- fromString(String) - Static method in class org.apache.spark.mllib.tree.configuration.Algo
-
- fromString(String) - Static method in class org.apache.spark.mllib.tree.impurity.Impurities
-
- fromString(String) - Static method in class org.apache.spark.mllib.tree.loss.Losses
-
- fromString(String) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Return the StorageLevel object with the specified name.
- fromWeakReference(WeakReference<V>) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- fromWeakReferenceIterator(Iterator<Tuple2<K, WeakReference<V>>>) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- fromWeakReferenceMap(Map<K, WeakReference<V>>) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- fromWeakReferenceOption(Option<WeakReference<V>>) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- fromWeakReferenceTuple(Tuple2<K, WeakReference<V>>) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- fs() - Method in class org.apache.spark.rdd.CheckpointRDD
-
- fullOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a full outer join of this and other.
- fullOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a full outer join of this and other.
- fullOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a full outer join of this and other.
- fullOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a full outer join of this and other.
- fullOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a full outer join of this and other.
- fullOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a full outer join of this and other.
- fullOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
- fullOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
- fullOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
- fullOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
- fullOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
- fullOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
- fullStackTrace() - Method in class org.apache.spark.ExceptionFailure
-
- func() - Method in class org.apache.spark.scheduler.ActiveJob
-
- func() - Method in class org.apache.spark.scheduler.JobSubmitted
-
- Function<T1,R> - Interface in org.apache.spark.api.java.function
-
Base interface for functions whose return types do not create special RDDs.
- function() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
-
- function() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
-
- Function2<T1,T2,R> - Interface in org.apache.spark.api.java.function
-
A two-argument function that takes arguments of type T1 and T2 and returns an R.
- Function3<T1,T2,T3,R> - Interface in org.apache.spark.api.java.function
-
A three-argument function that takes arguments of type T1, T2 and T3 and returns an R.
- functionClassName() - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
-
- functions - Class in org.apache.spark.sql
-
- functions() - Constructor for class org.apache.spark.sql.functions
-
- funcWrapper() - Method in class org.apache.spark.sql.hive.HiveGenericUdaf
-
- funcWrapper() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
-
- funcWrapper() - Method in class org.apache.spark.sql.hive.HiveGenericUdtf
-
- funcWrapper() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
-
- funcWrapper() - Method in class org.apache.spark.sql.hive.HiveUdaf
-
- funcWrapper() - Method in class org.apache.spark.sql.hive.HiveUdafFunction
-
- FutureAction<T> - Interface in org.apache.spark
-
A future for the result of an action to support cancellation.
- gain() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- gamma1() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- gamma2() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- gamma6() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- gamma7() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- GammaGenerator - Class in org.apache.spark.mllib.random
-
:: DeveloperApi ::
Generates i.i.d.
- GammaGenerator(double, double) - Constructor for class org.apache.spark.mllib.random.GammaGenerator
-
- gammaJavaRDD(JavaSparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- gammaJavaRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- gammaJavaRDD(JavaSparkContext, double, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- gammaJavaVectorRDD(JavaSparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- gammaJavaVectorRDD(JavaSparkContext, double, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- gammaJavaVectorRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- gammaRDD(SparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD comprised of i.i.d. samples from the gamma distribution with the input
shape and scale.
- gammaVectorRDD(SparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the
gamma distribution with the input shape and scale.
- GapSamplingIterator<T> - Class in org.apache.spark.util.random
-
- GapSamplingIterator(Iterator<T>, double, Random, double, ClassTag<T>) - Constructor for class org.apache.spark.util.random.GapSamplingIterator
-
- GapSamplingReplacementIterator<T> - Class in org.apache.spark.util.random
-
Advances to the first sample as part of object construction.
- GapSamplingReplacementIterator(Iterator<T>, double, Random, double, ClassTag<T>) - Constructor for class org.apache.spark.util.random.GapSamplingReplacementIterator
-
- gatherCompressibilityStats(Row, int) - Method in class org.apache.spark.sql.columnar.compression.BooleanBitSet.Encoder
-
- gatherCompressibilityStats(Row, int) - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
-
- gatherCompressibilityStats(Row, int) - Method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
-
- gatherCompressibilityStats(Row, int) - Method in interface org.apache.spark.sql.columnar.compression.Encoder
-
- gatherCompressibilityStats(Row, int) - Method in class org.apache.spark.sql.columnar.compression.IntDelta.Encoder
-
- gatherCompressibilityStats(Row, int) - Method in class org.apache.spark.sql.columnar.compression.LongDelta.Encoder
-
- gatherCompressibilityStats(Row, int) - Method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Encoder
-
- gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.BinaryColumnStats
-
- gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.BooleanColumnStats
-
- gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.ByteColumnStats
-
- gatherStats(Row, int) - Method in interface org.apache.spark.sql.columnar.ColumnStats
-
Gathers statistics information from row(ordinal).
- gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.DoubleColumnStats
-
- gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.FloatColumnStats
-
- gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.GenericColumnStats
-
- gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.IntColumnStats
-
- gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.LongColumnStats
-
- gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.NoopColumnStats
-
- gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.ShortColumnStats
-
- gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.StringColumnStats
-
- gatherStats(Row, int) - Method in class org.apache.spark.sql.columnar.TimestampColumnStats
-
- GaussianMixture - Class in org.apache.spark.mllib.clustering
-
:: Experimental ::
- GaussianMixture() - Constructor for class org.apache.spark.mllib.clustering.GaussianMixture
-
Constructs a default instance.
- GaussianMixtureModel - Class in org.apache.spark.mllib.clustering
-
:: Experimental ::
- GaussianMixtureModel(double[], MultivariateGaussian[]) - Constructor for class org.apache.spark.mllib.clustering.GaussianMixtureModel
-
- gaussians() - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
-
- GC_TIME() - Static method in class org.apache.spark.ui.ToolTips
-
- gemm(double, Matrix, DenseMatrix, double, DenseMatrix) - Static method in class org.apache.spark.mllib.linalg.BLAS
-
C := alpha * A * B + beta * C
- gemv(double, Matrix, DenseVector, double, DenseVector) - Static method in class org.apache.spark.mllib.linalg.BLAS
-
y := alpha * A * x + beta * y
- GeneralizedLinearAlgorithm<M extends GeneralizedLinearModel> - Class in org.apache.spark.mllib.regression
-
:: DeveloperApi ::
GeneralizedLinearAlgorithm implements methods to train a Generalized Linear Model (GLM).
- GeneralizedLinearAlgorithm() - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
- GeneralizedLinearModel - Class in org.apache.spark.mllib.regression
-
:: DeveloperApi ::
GeneralizedLinearModel (GLM) represents a model trained using
GeneralizedLinearAlgorithm.
- GeneralizedLinearModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearModel
-
- generate(String, String, int, int) - Static method in class org.apache.spark.examples.streaming.KinesisWordCountProducerASL
-
- generatedRDDs() - Method in class org.apache.spark.streaming.dstream.DStream
-
- generateJob(Time) - Method in class org.apache.spark.streaming.dstream.DStream
-
Generate a SparkStreaming job for the given time.
- generateJob(Time) - Method in class org.apache.spark.streaming.dstream.ForEachDStream
-
- generateJobs(Time) - Method in class org.apache.spark.streaming.DStreamGraph
-
- GenerateJobs - Class in org.apache.spark.streaming.scheduler
-
- GenerateJobs(Time) - Constructor for class org.apache.spark.streaming.scheduler.GenerateJobs
-
- generateKMeansRDD(SparkContext, int, int, int, double, int) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator
-
Generate an RDD containing test data for KMeans.
- generateLinearInput(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
-
- generateLinearInputAsList(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
-
Return a Java List of synthetic data randomly generated according to a
multi-collinear model.
- generateLinearRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
-
Generate an RDD containing sample data for Linear Regression models - including Ridge, Lasso,
and unregularized variants.
- generateLogisticRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
-
Generate an RDD containing test data for LogisticRegression.
- generateRandomEdges(int, int, int, long) - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
- generateRolledOverFileSuffix() - Method in interface org.apache.spark.util.logging.RollingPolicy
-
Get the desired name of the rollover file
- generateRolledOverFileSuffix() - Method in class org.apache.spark.util.logging.SizeBasedRollingPolicy
-
Get the desired name of the rollover file
- generateRolledOverFileSuffix() - Method in class org.apache.spark.util.logging.TimeBasedRollingPolicy
-
- generator() - Method in class org.apache.spark.mllib.rdd.RandomRDDPartition
-
- GENERIC - Class in org.apache.spark.sql.columnar
-
- GENERIC() - Constructor for class org.apache.spark.sql.columnar.GENERIC
-
- GenericColumnAccessor - Class in org.apache.spark.sql.columnar
-
- GenericColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.GenericColumnAccessor
-
- GenericColumnBuilder - Class in org.apache.spark.sql.columnar
-
- GenericColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.GenericColumnBuilder
-
- GenericColumnStats - Class in org.apache.spark.sql.columnar
-
- GenericColumnStats() - Constructor for class org.apache.spark.sql.columnar.GenericColumnStats
-
- geq(Object) - Method in class org.apache.spark.sql.Column
-
Greater than or equal to an expression.
- get(Object) - Method in class org.apache.spark.api.java.JavaUtils.SerializableMapWrapper
-
- get() - Method in interface org.apache.spark.FutureAction
-
Blocks and returns the result of this job.
- get() - Method in class org.apache.spark.JavaFutureActionWrapper
-
- get(long, TimeUnit) - Method in class org.apache.spark.JavaFutureActionWrapper
-
- get(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap
-
Optionally returns the value associated with a param or its default.
- get(Param<T>) - Method in interface org.apache.spark.ml.param.Params
-
Gets the value of a parameter in the embedded param map.
- get(long) - Method in class org.apache.spark.partial.StudentTCacher
-
- get(String) - Method in class org.apache.spark.SparkConf
-
Get a parameter; throws a NoSuchElementException if it's not set
- get(String, String) - Method in class org.apache.spark.SparkConf
-
Get a parameter, falling back to a default if not set
- get() - Static method in class org.apache.spark.SparkEnv
-
Returns the SparkEnv.
- get(String) - Static method in class org.apache.spark.SparkFiles
-
Get the absolute path of a file added through SparkContext.addFile().
- get() - Method in class org.apache.spark.sql.hive.DeferredObjectAdapter
-
- get(String) - Static method in class org.apache.spark.sql.jdbc.DriverQuirks
-
Fetch the DriverQuirks class corresponding to a given database url.
- get(String) - Method in class org.apache.spark.sql.sources.CaseInsensitiveMap
-
- get(BlockId) - Method in class org.apache.spark.storage.BlockManager
-
Get a block from the block manager (either local or remote).
- get() - Static method in class org.apache.spark.TaskContext
-
Return the currently active TaskContext.
- get(A) - Method in class org.apache.spark.util.TimeStampedHashMap
-
- get(A) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- getAcceptanceResults(RDD<Tuple2<K, V>>, boolean, Map<K, Object>, Option<Map<K, Object>>, long) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
-
Count the number of items instantly accepted and generate the waitlist for each stratum.
- getActiveJobIds() - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
-
Returns an array containing the ids of all active jobs.
- getActiveJobIds() - Method in class org.apache.spark.SparkStatusTracker
-
Returns an array containing the ids of all active jobs.
- getActiveStageIds() - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
-
Returns an array containing the ids of all active stages.
- getActiveStageIds() - Method in class org.apache.spark.SparkStatusTracker
-
Returns an array containing the ids of all active stages.
- getActorSystemHostPortForExecutor(String) - Method in class org.apache.spark.storage.BlockManagerMaster
-
- getAddressHostName(String) - Static method in class org.apache.spark.util.Utils
-
- getAkkaConf() - Method in class org.apache.spark.SparkConf
-
Get all akka conf variables set on this SparkConf
- getAlgo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getAll() - Method in class org.apache.spark.SparkConf
-
Get all parameters as a list of pairs
- getAllBlocks() - Method in class org.apache.spark.storage.DiskBlockManager
-
List all the blocks currently stored on disk by the disk manager.
- getAllConfs() - Method in class org.apache.spark.sql.SQLConf
-
Return all the configuration properties that have been set (i.e.
- getAllConfs() - Method in class org.apache.spark.sql.SQLContext
-
Return all the configuration properties that have been set (i.e.
- getAllFiles() - Method in class org.apache.spark.storage.DiskBlockManager
-
List all the files currently stored on disk by the disk manager.
- getAllPartitionsOf(Hive, Table) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getAllPools() - Method in class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Return pools for fair scheduler
- getAlpha() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
- getAlpha() - Method in class org.apache.spark.mllib.clustering.LDA
-
Alias for getDocConcentration
- getAppId() - Method in class org.apache.spark.SparkConf
-
Returns the Spark application id, valid in the Driver after TaskScheduler registration and
from the start in the Executor.
- getAppName() - Method in class org.apache.spark.ui.SparkUI
-
- getAst(String) - Static method in class org.apache.spark.sql.hive.HiveQl
-
Returns the AST for the given SQL string.
- getBasePath() - Method in class org.apache.spark.ui.WebUI
-
- getBernoulliSamplingFunction(RDD<Tuple2<K, V>>, Map<K, Object>, boolean, long) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
-
Return the per partition sampling function used for sampling without replacement.
- getBeta() - Method in class org.apache.spark.mllib.clustering.LDA
-
Alias for getTopicConcentration
- getBinaryWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getBinaryWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus
-
Return the given block stored in this block manager in O(1) time.
- getBlockData(BlockId) - Method in class org.apache.spark.storage.BlockManager
-
Interface to get local block data.
- getBlocksOfBatch(Time) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
-
Get the blocks allocated to the given batch.
- getBlocksOfBatch(Time) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
-
Get the blocks for the given batch and all input streams.
- getBlocksOfBatchAndStream(Time, int) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
-
Get the blocks allocated to the given batch and stream.
- getBlocksOfBatchAndStream(Time, int) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
-
Get the blocks allocated to the given batch and stream.
- getBlocksOfStream(int) - Method in class org.apache.spark.streaming.scheduler.AllocatedBlocks
-
- getBlockStatus(BlockId, boolean) - Method in class org.apache.spark.storage.BlockManagerMaster
-
Return the block's status on all block managers, if any.
- getBoolean(String, boolean) - Method in class org.apache.spark.SparkConf
-
Get a parameter as a boolean, falling back to a default if not set
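The typed getters on SparkConf (getBoolean, getDouble, getInt, getLong) all follow the same "parse the stored value, or fall back to a default if not set" contract. The following is a minimal sketch of that contract only; `ConfSketch` and the `example.*` keys are hypothetical stand-ins and are not Spark's implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a string-keyed configuration object; NOT Spark's SparkConf.
final class ConfSketch {
    private final Map<String, String> settings = new HashMap<>();

    // Store a raw string value and return this object so calls can be chained.
    ConfSketch set(String key, String value) {
        settings.put(key, value);
        return this;
    }

    // Parse the stored string as a boolean, or return the default if the key is unset.
    boolean getBoolean(String key, boolean defaultValue) {
        String raw = settings.get(key);
        return raw == null ? defaultValue : Boolean.parseBoolean(raw);
    }

    // Parse the stored string as an int, or return the default if the key is unset.
    int getInt(String key, int defaultValue) {
        String raw = settings.get(key);
        return raw == null ? defaultValue : Integer.parseInt(raw);
    }

    public static void main(String[] args) {
        ConfSketch conf = new ConfSketch().set("example.enabled", "true");
        System.out.println(conf.getBoolean("example.enabled", false));
        System.out.println(conf.getInt("example.retries", 3));
    }
}
```

In this sketch a value that is set but malformed (for example a non-numeric string passed to `getInt`) raises `NumberFormatException` rather than silently using the default; whether that matches a given library's behavior is an assumption to verify.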
- getBooleanWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getBooleanWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getBytes(BlockId) - Method in class org.apache.spark.storage.BlockStore
-
- getBytes(BlockId) - Method in class org.apache.spark.storage.DiskStore
-
- getBytes(FileSegment) - Method in class org.apache.spark.storage.DiskStore
-
- getBytes(BlockId) - Method in class org.apache.spark.storage.MemoryStore
-
- getBytes(BlockId) - Method in class org.apache.spark.storage.TachyonStore
-
- getByteWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getByteWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getCachedBlockManagerId(BlockManagerId) - Static method in class org.apache.spark.storage.BlockManagerId
-
- getCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD
-
The three methods below are helpers for accessing the local map, a property of the SparkEnv of
the local process.
- getCachedStorageLevel(StorageLevel) - Static method in class org.apache.spark.storage.StorageLevel
-
- getCalculator(double[], int) - Method in class org.apache.spark.mllib.tree.impurity.EntropyAggregator
-
- getCalculator(double[], int) - Method in class org.apache.spark.mllib.tree.impurity.GiniAggregator
-
- getCalculator(double[], int) - Method in class org.apache.spark.mllib.tree.impurity.ImpurityAggregator
-
- getCalculator(double[], int) - Method in class org.apache.spark.mllib.tree.impurity.VarianceAggregator
-
- getCalendar() - Static method in class org.apache.spark.sql.parquet.CatalystTimestampConverter
-
- getCallSite() - Method in class org.apache.spark.SparkContext
-
Capture the current user callsite and return a formatted version for printing.
- getCallSite(Function1<String, Object>) - Static method in class org.apache.spark.util.Utils
-
When called inside a class in the spark package, returns the name of the user code class
(outside the spark package) that called into Spark, as well as which Spark method they called.
- getCatalystType(int, String, int, MetadataBuilder) - Method in class org.apache.spark.sql.jdbc.DriverQuirks
-
- getCatalystType(int, String, int, MetadataBuilder) - Method in class org.apache.spark.sql.jdbc.MySQLQuirks
-
- getCatalystType(int, String, int, MetadataBuilder) - Method in class org.apache.spark.sql.jdbc.NoQuirks
-
- getCatalystType(int, String, int, MetadataBuilder) - Method in class org.apache.spark.sql.jdbc.PostgresQuirks
-
- getCategoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getCheckpointDir() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- getCheckpointDir() - Method in class org.apache.spark.SparkContext
-
- getCheckpointFile() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Gets the name of the file to which this RDD was checkpointed
- getCheckpointFile() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- getCheckpointFile() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- getCheckpointFile() - Method in class org.apache.spark.rdd.RDD
-
Gets the name of the file to which this RDD was checkpointed
- getCheckpointFile() - Method in class org.apache.spark.rdd.RDDCheckpointData
-
- getCheckpointFiles() - Method in class org.apache.spark.graphx.Graph
-
Gets the names of the files to which this Graph was checkpointed.
- getCheckpointFiles() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- getCheckpointFiles(String, FileSystem) - Static method in class org.apache.spark.streaming.Checkpoint
-
Get checkpoint files present in the given directory, ordered by oldest-first
- getCheckpointInterval() - Method in class org.apache.spark.mllib.clustering.LDA
-
Period (in iterations) between checkpoints.
- getCheckpointInterval() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getClause(String, Seq<Node>) - Static method in class org.apache.spark.sql.hive.HiveQl
-
- getClauseOption(String, Seq<Node>) - Static method in class org.apache.spark.sql.hive.HiveQl
-
- getClientSideSplits(Configuration, List<Footer>, Long, Long, ReadSupport.ReadContext) - Method in class org.apache.spark.sql.parquet.FilteringParquetRowInputFormat
-
- getCombOp() - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
-
Returns the function used to combine results returned by seqOp from different partitions.
- getCommandProcessor(String[], HiveConf) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getConf() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Return a copy of this JavaSparkContext's configuration.
- getConf() - Method in interface org.apache.spark.input.Configurable
-
- getConf() - Method in class org.apache.spark.rdd.HadoopRDD
-
- getConf() - Method in class org.apache.spark.rdd.NewHadoopRDD
-
- getConf() - Method in class org.apache.spark.SparkContext
-
Return a copy of this SparkContext's configuration.
- getConf(String) - Method in class org.apache.spark.sql.SQLConf
-
Return the value of Spark SQL configuration property for the given key.
- getConf(String, String) - Method in class org.apache.spark.sql.SQLConf
-
Return the value of Spark SQL configuration property for the given key.
- getConf(String) - Method in class org.apache.spark.sql.SQLContext
-
Return the value of Spark SQL configuration property for the given key.
- getConf(String, String) - Method in class org.apache.spark.sql.SQLContext
-
Return the value of Spark SQL configuration property for the given key.
- getConnection() - Method in interface org.apache.spark.rdd.JdbcRDD.ConnectionFactory
-
- getConnections() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
-
- getConnector(String, String) - Static method in class org.apache.spark.sql.jdbc.JDBCRDD
-
Given a driver string and a URL, return a function that loads the
specified driver string then returns a connection to the JDBC URL.
- getConsumerOffsetMetadata(String, Set<TopicAndPartition>) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
-
Requires Kafka >= 0.8.1.1
- getConsumerOffsets(String, Set<TopicAndPartition>) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
-
Requires Kafka >= 0.8.1.1
- getContextOrSparkClassLoader() - Static method in class org.apache.spark.util.Utils
-
Get the Context ClassLoader on this thread or, if not present, the ClassLoader that
loaded Spark.
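The fallback described for getContextOrSparkClassLoader can be sketched in plain Java as follows. `ClassLoaderSketch` is a hypothetical name, and the sketch falls back to the loader of its own class because it has no Spark class to refer to; it illustrates the pattern, not Spark's code.

```java
// Hypothetical illustration of "context class loader, else the loader of a known class".
final class ClassLoaderSketch {
    // Prefer the current thread's context class loader; if none is set,
    // fall back to the loader that loaded this class.
    static ClassLoader contextOrOwnClassLoader() {
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        return context != null ? context : ClassLoaderSketch.class.getClassLoader();
    }

    public static void main(String[] args) {
        System.out.println(contextOrOwnClassLoader());
    }
}
```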
- getConvergenceTol() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Return the largest change in log-likelihood at which convergence is
considered to have occurred.
- getConversions(StructType) - Method in class org.apache.spark.sql.jdbc.JDBCRDD
-
Maps a StructType to a type tag list.
- getConverter(int) - Method in class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
-
- getConverter(int) - Method in class org.apache.spark.sql.parquet.CatalystArrayConverter
-
- getConverter(int) - Method in class org.apache.spark.sql.parquet.CatalystGroupConverter
-
- getConverter(int) - Method in class org.apache.spark.sql.parquet.CatalystMapConverter
-
- getConverter(int) - Method in class org.apache.spark.sql.parquet.CatalystNativeArrayConverter
-
- getConverter(int) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
-
- getCorrelationFromName(String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
-
- getCreationSite() - Method in class org.apache.spark.rdd.RDD
-
- getCreationSite() - Static method in class org.apache.spark.streaming.dstream.DStream
-
Get the creation site of a DStream from the stack trace of when the DStream is created.
- getCurrentKey() - Method in class org.apache.spark.input.FixedLengthBinaryRecordReader
-
- getCurrentKey() - Method in class org.apache.spark.input.StreamBasedRecordReader
-
- getCurrentKey() - Method in class org.apache.spark.input.WholeTextFileRecordReader
-
- getCurrentRecord() - Method in class org.apache.spark.sql.parquet.CatalystConverter
-
Should only be called in the root (group) converter!
- getCurrentRecord() - Method in class org.apache.spark.sql.parquet.CatalystGroupConverter
-
- getCurrentRecord() - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
-
- getCurrentRecord() - Method in class org.apache.spark.sql.parquet.RowRecordMaterializer
-
- getCurrentUserName() - Static method in class org.apache.spark.util.Utils
-
Returns the current user name.
- getCurrentValue() - Method in class org.apache.spark.input.FixedLengthBinaryRecordReader
-
- getCurrentValue() - Method in class org.apache.spark.input.StreamBasedRecordReader
-
- getCurrentValue() - Method in class org.apache.spark.input.WholeTextFileRecordReader
-
- getDataLocationPath(Partition) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getDateWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getDateWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getDecimalWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getDecimalWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getDefaultPropertiesFile(Map<String, String>) - Static method in class org.apache.spark.util.Utils
-
Return the path of the default Spark properties file.
- getDefaultWorkFile(TaskAttemptContext, String) - Method in class org.apache.spark.sql.parquet.AppendingParquetOutputFormat
-
- getDelaySeconds(SparkConf) - Static method in class org.apache.spark.util.MetadataCleaner
-
- getDelaySeconds(SparkConf, Enumeration.Value) - Static method in class org.apache.spark.util.MetadataCleaner
-
- getDependencies() - Method in class org.apache.spark.rdd.CartesianRDD
-
- getDependencies() - Method in class org.apache.spark.rdd.CoalescedRDD
-
- getDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- getDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
-
- getDependencies() - Method in class org.apache.spark.rdd.SubtractedRDD
-
- getDependencies() - Method in class org.apache.spark.rdd.UnionRDD
-
- getDirName() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- getDiskWriter(BlockId, File, Serializer, int, ShuffleWriteMetrics) - Method in class org.apache.spark.storage.BlockManager
-
A short-circuited method to get a block writer that can write data directly to disk.
- getDocConcentration() - Method in class org.apache.spark.mllib.clustering.LDA
-
Concentration parameter (commonly named "alpha") for the prior placed on documents'
distributions over topics ("theta").
- getDouble(String, double) - Method in class org.apache.spark.SparkConf
-
Get a parameter as a double, falling back to a default if not set
- getDoubleWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getDoubleWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getEarliestLeaderOffsets(Set<TopicAndPartition>) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
-
- getEntrySet() - Method in class org.apache.spark.util.TimeStampedHashMap
-
- getenv(String) - Method in class org.apache.spark.SparkConf
-
By using this instead of System.getenv(), environment variables can be mocked
in unit tests.
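The reason for routing environment lookups through an instance method, as the getenv entry describes, is that a test can substitute its own lookup. Below is a minimal sketch of that idea; `EnvReaderSketch`, its constructors, and the `EXAMPLE_HOME` variable are hypothetical and are not part of Spark.

```java
import java.util.function.Function;

// Hypothetical wrapper showing why an overridable getenv helps in unit tests.
final class EnvReaderSketch {
    private final Function<String, String> lookup;

    // Production use: delegate to the real process environment.
    EnvReaderSketch() {
        this(System::getenv);
    }

    // Test use: supply any function from variable name to value (or null if unset).
    EnvReaderSketch(Function<String, String> lookup) {
        this.lookup = lookup;
    }

    String getenv(String name) {
        return lookup.apply(name);
    }

    public static void main(String[] args) {
        // A fake environment that defines exactly one variable.
        EnvReaderSketch fake =
            new EnvReaderSketch(name -> "EXAMPLE_HOME".equals(name) ? "/opt/example" : null);
        System.out.println(fake.getenv("EXAMPLE_HOME"));
        System.out.println(new EnvReaderSketch().getenv("PATH") != null);
    }
}
```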
- getEpoch() - Method in class org.apache.spark.MapOutputTracker
-
Called to get current epoch number.
- getEstimator() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams
-
- getEstimatorParamMaps() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams
-
- getEvaluator() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams
-
- getExecutorEnv() - Method in class org.apache.spark.SparkConf
-
Get all executor environment variables set on this SparkConf
- getExecutorMemoryStatus() - Method in class org.apache.spark.SparkContext
-
Return a map from the slave to the max memory available for caching and the remaining
memory available for caching.
- getExecutorsAliveOnHost(String) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- getExecutorStorageStatus() - Method in class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Return information about blocks stored in all of the slaves
- getExecutorThreadDump(String) - Method in class org.apache.spark.SparkContext
-
Called by the web UI to obtain executor thread dumps.
- getExternalTmpPath(Context, Path) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getFeatureOffset(int) - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
-
Pre-compute feature offset for use with featureUpdate.
- getFeaturesCol() - Method in interface org.apache.spark.ml.param.HasFeaturesCol
-
- getField(String) - Method in class org.apache.spark.sql.Column
-
An expression that gets a field by name in a StructField.
- getField(Row, int) - Static method in class org.apache.spark.sql.columnar.BINARY
-
- getField(Row, int) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
-
- getField(Row, int) - Static method in class org.apache.spark.sql.columnar.BYTE
-
- getField(Row, int) - Method in class org.apache.spark.sql.columnar.ColumnType
-
Returns row(ordinal).
- getField(Row, int) - Static method in class org.apache.spark.sql.columnar.DATE
-
- getField(Row, int) - Static method in class org.apache.spark.sql.columnar.DOUBLE
-
- getField(Row, int) - Static method in class org.apache.spark.sql.columnar.FLOAT
-
- getField(Row, int) - Static method in class org.apache.spark.sql.columnar.GENERIC
-
- getField(Row, int) - Static method in class org.apache.spark.sql.columnar.INT
-
- getField(Row, int) - Static method in class org.apache.spark.sql.columnar.LONG
-
- getField(Row, int) - Static method in class org.apache.spark.sql.columnar.SHORT
-
- getField(Row, int) - Static method in class org.apache.spark.sql.columnar.STRING
-
- getField(Row, int) - Static method in class org.apache.spark.sql.columnar.TIMESTAMP
-
- getFile(long) - Static method in class org.apache.spark.broadcast.HttpBroadcast
-
- getFile(String) - Method in class org.apache.spark.storage.DiskBlockManager
-
Looks up a file by hashing it into one of our local subdirectories.
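One way to "look up a file by hashing it into a subdirectory" is sketched below. The layout (root chosen by hash modulo the number of roots, subdirectory named by a two-digit hex index) and every name here are assumptions made for illustration; the sketch only computes a path and does not create directories or claim to match DiskBlockManager's on-disk layout.

```java
import java.io.File;

// Hypothetical sketch: map a file name deterministically to rootDir/subDir/fileName.
final class SubDirHashSketch {
    static File fileFor(File[] rootDirs, int subDirsPerRoot, String filename) {
        // Fold the hash code into a non-negative value so the indices below are valid.
        int hash = Math.floorMod(filename.hashCode(), Integer.MAX_VALUE);
        int rootIndex = hash % rootDirs.length;
        int subIndex = (hash / rootDirs.length) % subDirsPerRoot;
        File subDir = new File(rootDirs[rootIndex], String.format("%02x", subIndex));
        return new File(subDir, filename);
    }

    public static void main(String[] args) {
        File[] roots = { new File("root-a"), new File("root-b") };
        System.out.println(fileFor(roots, 64, "rdd_0_1"));
    }
}
```

Because the path depends only on the file name and the directory configuration, repeated lookups for the same name resolve to the same location without any index being kept.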
- getFile(BlockId) - Method in class org.apache.spark.storage.DiskBlockManager
-
- getFile(String) - Method in class org.apache.spark.storage.TachyonBlockManager
-
- getFile(BlockId) - Method in class org.apache.spark.storage.TachyonBlockManager
-
- getFilePath(File, String) - Static method in class org.apache.spark.util.Utils
-
Return the absolute path of a file in the given directory.
- getFileSegmentLocations(String, long, long, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils
-
Get the locations of the HDFS blocks containing the given file segment.
- getFileSystemForPath(Path, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils
-
- getFinalValue() - Method in class org.apache.spark.partial.PartialResult
-
Blocking method to wait for and return the final value.
- getFloatWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getFloatWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getFormattedClassName(Object) - Static method in class org.apache.spark.util.Utils
-
Return the class name of the given object, removing all dollar signs
- getFunctionInfo(String) - Method in class org.apache.spark.sql.hive.HiveFunctionRegistry
-
- getHadoopFileSystem(URI, Configuration) - Static method in class org.apache.spark.util.Utils
-
Return a Hadoop FileSystem with the scheme encoded in the given path.
- getHadoopFileSystem(String, Configuration) - Static method in class org.apache.spark.util.Utils
-
Return a Hadoop FileSystem with the scheme encoded in the given path.
- getHandlers() - Method in class org.apache.spark.metrics.sink.MetricsServlet
-
- getHandlers() - Method in class org.apache.spark.ui.WebUI
-
- getHttpUser() - Method in class org.apache.spark.SecurityManager
-
Gets the user used for authenticating HTTP connections.
- getImplicitPrefs() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
- getImpurity() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getImpurityCalculator(int, int) - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
-
Get an ImpurityCalculator for a given (node, feature, bin).
- getInitialModel() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Return the user-supplied initial GMM, if supplied
- getInputCol() - Method in interface org.apache.spark.ml.param.HasInputCol
-
- getInputStream(String, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils
-
- getInputStreams() - Method in class org.apache.spark.streaming.DStreamGraph
-
- getInstance(String) - Method in class org.apache.spark.metrics.MetricsConfig
-
- getInt(String, int) - Method in class org.apache.spark.SparkConf
-
Get a parameter as an integer, falling back to a default if not set
- getIntWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getIntWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getItem(int) - Method in class org.apache.spark.sql.Column
-
An expression that gets an item at position ordinal out of an array.
- getItemCol() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
- getIteratorSize(Iterator<T>) - Static method in class org.apache.spark.util.Utils
-
Counts the number of elements of an iterator using a while loop rather than calling
TraversableOnce.size(), because the latter uses a for loop, which is slightly slower
in the current version of Scala.
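The counting approach described for getIteratorSize, draining the iterator with an explicit while loop, looks like this in Java. `IteratorSizeSketch` is a hypothetical name and the sketch is an illustration of the loop, not a copy of the Scala utility.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical sketch of counting an iterator's elements with a plain while loop.
final class IteratorSizeSketch {
    // Consumes the iterator; it cannot be reused afterwards.
    static <T> long iteratorSize(Iterator<T> iterator) {
        long count = 0L;
        while (iterator.hasNext()) {
            count += 1L;
            iterator.next();
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(iteratorSize(Arrays.asList("a", "b", "c").iterator()));
    }
}
```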
- getJDBCType(DataType) - Method in class org.apache.spark.sql.jdbc.DriverQuirks
-
- getJDBCType(DataType) - Method in class org.apache.spark.sql.jdbc.MySQLQuirks
-
- getJDBCType(DataType) - Method in class org.apache.spark.sql.jdbc.NoQuirks
-
- getJDBCType(DataType) - Method in class org.apache.spark.sql.jdbc.PostgresQuirks
-
- getJobIdsForGroup(String) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
-
Return a list of all known jobs in a particular job group.
- getJobIdsForGroup(String) - Method in class org.apache.spark.SparkStatusTracker
-
Return a list of all known jobs in a particular job group.
- getJobInfo(int) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
-
Returns job information, or null if the job info could not be found or was garbage collected.
- getJobInfo(int) - Method in class org.apache.spark.SparkStatusTracker
-
Returns job information, or None if the job info could not be found or was garbage collected.
- getJulianDay() - Method in class org.apache.spark.sql.parquet.timestamp.NanoTime
-
- getK() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Return the number of Gaussians in the mixture model
- getK() - Method in class org.apache.spark.mllib.clustering.LDA
-
Number of topics to infer.
- getLabelCol() - Method in interface org.apache.spark.ml.param.HasLabelCol
-
- getLatestLeaderOffsets(Set<TopicAndPartition>) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
-
- getLeaderOffsets(Set<TopicAndPartition>, long) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
-
- getLeaderOffsets(Set<TopicAndPartition>, long, int) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
-
- getLearningRate() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- getLeastGroupHash(String) - Method in class org.apache.spark.rdd.PartitionCoalescer
-
Sorts and gets the least element of the list associated with key in groupHash.
The returned PartitionGroup is the least loaded of all groups that represent the machine "key".
- getLeftRightFeatureOffsets(int) - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
-
Pre-compute feature offset for use with featureUpdate.
- getLocal(BlockId) - Method in class org.apache.spark.storage.BlockManager
-
Get block from local block manager.
- getLocalBytes(BlockId) - Method in class org.apache.spark.storage.BlockManager
-
Get block from the local block manager as serialized bytes.
- getLocalDir(SparkConf) - Static method in class org.apache.spark.util.Utils
-
Get the path of a temporary directory.
- getLocalFileWriter(Row) - Method in class org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer
-
- getLocalFileWriter(Row) - Method in class org.apache.spark.sql.hive.SparkHiveWriterContainer
-
- getLocalityIndex(Enumeration.Value) - Method in class org.apache.spark.scheduler.TaskSetManager
-
Find the index in myLocalityLevels for a given locality.
- getLocalProperties() - Method in class org.apache.spark.SparkContext
-
- getLocalProperty(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Get a local property set in this thread, or null if it is missing.
- getLocalProperty(String) - Method in class org.apache.spark.SparkContext
-
Get a local property set in this thread, or null if it is missing.
- getLocation() - Method in class org.apache.spark.rdd.HadoopRDD.SplitInfoReflections
-
- getLocationInfo() - Method in class org.apache.spark.rdd.HadoopRDD.SplitInfoReflections
-
- getLocations(BlockId) - Method in class org.apache.spark.storage.BlockManagerMaster
-
Get locations of the blockId from the driver
- getLocations(BlockId[]) - Method in class org.apache.spark.storage.BlockManagerMaster
-
Get locations of multiple blockIds from the driver
- getLogPath(String, String, Option<String>) - Static method in class org.apache.spark.scheduler.EventLoggingListener
-
Return a file-system-safe path to the log file for the given application.
- getLong(String, long) - Method in class org.apache.spark.SparkConf
-
Get a parameter as a long, falling back to a default if not set
- getLongWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getLongWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getLoss() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- getLowerBound(double, long, double) - Static method in class org.apache.spark.util.random.BinomialBounds
-
Returns a threshold p such that if we conduct n Bernoulli trials with success rate = p,
it is very unlikely to have more than fraction * n successes.
- getLowerBound(double) - Static method in class org.apache.spark.util.random.PoissonBounds
-
Returns a lambda such that Pr[X > s] is very small, where X ~ Pois(lambda).
- GetMapOutputStatuses - Class in org.apache.spark
-
- GetMapOutputStatuses(int) - Constructor for class org.apache.spark.GetMapOutputStatuses
-
- getMatchingBlockIds(Function1<BlockId, Object>) - Method in class org.apache.spark.storage.BlockManager
-
Get the ids of existing blocks that match the given filter.
- getMatchingBlockIds(Function1<BlockId, Object>, boolean) - Method in class org.apache.spark.storage.BlockManagerMaster
-
Return a list of ids of existing blocks such that the ids match the given filter.
- getMaxBatchSize() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
-
- getMaxBins() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getMaxDepth() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getMaxInputStreamRememberDuration() - Method in class org.apache.spark.streaming.DStreamGraph
-
Get the maximum remember duration across all the input streams.
- getMaxIter() - Method in interface org.apache.spark.ml.param.HasMaxIter
-
- getMaxIterations() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Return the maximum number of iterations to run
- getMaxIterations() - Method in class org.apache.spark.mllib.clustering.LDA
-
Maximum number of iterations for learning.
- getMaxMemoryInMB() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getMaxResultSize(SparkConf) - Static method in class org.apache.spark.util.Utils
-
- getMemoryStatus() - Method in class org.apache.spark.storage.BlockManagerMaster
-
Return the memory status for each block manager, in the form of a map from
the block manager's id to two long values.
- getMessage() - Method in exception org.apache.spark.util.TaskCompletionListenerException
-
- getMetricName() - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-
- getMetricsSnapshot(HttpServletRequest) - Method in class org.apache.spark.metrics.sink.MetricsServlet
-
- getMinInfoGain() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getMinInstancesPerNode() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getModel(Estimator<M>) - Method in class org.apache.spark.ml.PipelineModel
-
Gets the model produced by the input estimator.
- getModifyAcls() - Method in class org.apache.spark.SecurityManager
-
- getNarrowAncestors() - Method in class org.apache.spark.rdd.RDD
-
Return the ancestors of the given RDD that are related to it only through a sequence of
narrow dependencies.
- getNewReceiverStreamId() - Method in class org.apache.spark.streaming.StreamingContext
-
- getNode(int, Node) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Traces down from a root node to get the node with the given node index.
- getNonnegative() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
- getNumClasses() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getNumFeatures() - Method in class org.apache.spark.ml.feature.HashingTF
-
- getNumFolds() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams
-
- getNumItemBlocks() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
- getNumIterations() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- getNumObjFields() - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
-
- getNumUserBlocks() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
- getObjFieldValues(Object, Object[]) - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
-
- getOption(String) - Method in class org.apache.spark.SparkConf
-
Get a parameter as an Option
- getOrCompute(RDD<T>, Partition, TaskContext, StorageLevel) - Method in class org.apache.spark.CacheManager
-
Gets or computes an RDD partition.
- getOrCompute(Time) - Method in class org.apache.spark.streaming.dstream.DStream
-
Get the RDD corresponding to the given time; either retrieve it from cache
or compute-and-cache it.
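The "retrieve it from cache or compute-and-cache it" behavior described for getOrCompute is a memoization pattern. A minimal single-threaded sketch follows; `GetOrComputeSketch` and the `batch-` values are hypothetical, and real streaming code would also have to deal with concurrency and eviction, which this sketch ignores.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical memoizing lookup: compute a value once per key, then serve it from the cache.
final class GetOrComputeSketch<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private final Function<K, V> compute;

    GetOrComputeSketch(Function<K, V> compute) {
        this.compute = compute;
    }

    // Returns the cached value for the key, computing and storing it on first access.
    V getOrCompute(K key) {
        return cache.computeIfAbsent(key, compute);
    }

    public static void main(String[] args) {
        GetOrComputeSketch<Long, String> batches =
            new GetOrComputeSketch<>(time -> "batch-" + time);
        System.out.println(batches.getOrCompute(1000L));
        System.out.println(batches.getOrCompute(1000L)); // served from the cache
    }
}
```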
- getOrCreate(String, JavaStreamingContextFactory) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
- getOrCreate(String, Configuration, JavaStreamingContextFactory) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
- getOrCreate(String, Configuration, JavaStreamingContextFactory, boolean) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
- getOrCreate(String, Function0<StreamingContext>, Configuration, boolean) - Static method in class org.apache.spark.streaming.StreamingContext
-
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
- getOrCreateLocalRootDirs(SparkConf) - Static method in class org.apache.spark.util.Utils
-
Gets or creates the directories listed in spark.local.dir or SPARK_LOCAL_DIRS,
and returns only the directories that exist / could be created.
- getOutputCol() - Method in interface org.apache.spark.ml.param.HasOutputCol
-
- getOutputStream(String, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils
-
- getOutputStreams() - Method in class org.apache.spark.streaming.DStreamGraph
-
- getParam(String) - Method in interface org.apache.spark.ml.param.Params
-
Gets a param by its name.
- getParents(int) - Method in class org.apache.spark.NarrowDependency
-
Get the parent partitions for a child partition.
- getParents(int) - Method in class org.apache.spark.OneToOneDependency
-
- getParents(int) - Method in class org.apache.spark.RangeDependency
-
- getParents(int) - Method in class org.apache.spark.rdd.PruneDependency
-
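To make "get the parent partitions for a child partition" concrete, the sketch below models two narrow-dependency shapes: a one-to-one mapping, and a range mapping in which a contiguous block of child partitions maps onto a contiguous block of parent partitions. All class names and the `inStart`/`outStart`/`length` parameters are hypothetical illustrations, not the Spark classes listed above.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical model of narrow dependencies: each child partition has at most a few parents.
final class NarrowDependencySketch {
    interface ParentLookup {
        List<Integer> getParents(int partitionId);
    }

    // Child partition i depends only on parent partition i.
    static final class OneToOne implements ParentLookup {
        public List<Integer> getParents(int partitionId) {
            return Collections.singletonList(partitionId);
        }
    }

    // Child partitions [outStart, outStart + length) map to parents starting at inStart.
    static final class Range implements ParentLookup {
        private final int inStart;
        private final int outStart;
        private final int length;

        Range(int inStart, int outStart, int length) {
            this.inStart = inStart;
            this.outStart = outStart;
            this.length = length;
        }

        public List<Integer> getParents(int partitionId) {
            if (partitionId >= outStart && partitionId < outStart + length) {
                return Collections.singletonList(partitionId - outStart + inStart);
            }
            return Collections.emptyList(); // outside the range: no parent in this dependency
        }
    }

    public static void main(String[] args) {
        System.out.println(new OneToOne().getParents(4));
        System.out.println(new Range(0, 10, 5).getParents(12));
        System.out.println(new Range(0, 10, 5).getParents(3));
    }
}
```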
- getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
-
- getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$
-
- getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$
-
- getPartition(long, long, int) - Method in interface org.apache.spark.graphx.PartitionStrategy
-
Returns the partition number for a given edge.
- getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$
-
- getPartition(Object) - Method in class org.apache.spark.HashPartitioner
-
- getPartition(Object) - Method in class org.apache.spark.mllib.linalg.distributed.GridPartitioner
-
Returns the index of the partition the input coordinate belongs to.
- getPartition(Object) - Method in class org.apache.spark.Partitioner
-
- getPartition(Object) - Method in class org.apache.spark.RangePartitioner
-
- getPartitionMetadata(Set<String>) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
-
- getPartitions() - Method in class org.apache.spark.mllib.rdd.RandomRDD
-
- getPartitions() - Method in class org.apache.spark.mllib.rdd.SlidingRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.BinaryFileRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.BlockRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.CartesianRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.CheckpointRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.CoalescedRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.EmptyRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.HadoopRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.HadoopRDD.HadoopMapPartitionsWithSplitRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.JdbcRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.MapPartitionsRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.NewHadoopRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.ParallelCollectionRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- getPartitions() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.PartitionwiseSampledRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.PipedRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.RDDCheckpointData
-
- getPartitions() - Method in class org.apache.spark.rdd.SampledRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.ShuffledRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.SubtractedRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.UnionRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.WholeTextFileRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.ZippedPartitionsBaseRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.ZippedWithIndexRDD
-
- getPartitions() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
-
Retrieve the list of partitions corresponding to this RDD.
- getPartitions(Set<String>) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
-
- getPartitions() - Method in class org.apache.spark.streaming.kafka.KafkaRDD
-
- getPartitions() - Method in class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDD
-
- getPath() - Method in class org.apache.spark.input.PortableDataStream
-
- getPeers(BlockManagerId) - Method in class org.apache.spark.storage.BlockManagerMaster
-
Get ids of other nodes in the cluster from the driver
- getPendingTimes() - Method in class org.apache.spark.streaming.scheduler.JobScheduler
-
- getPersistentRDDs() - Method in class org.apache.spark.SparkContext
-
Returns an immutable map of RDDs that have marked themselves as persistent via a cache() call.
- getPipeEnvVars() - Method in class org.apache.spark.rdd.HadoopPartition
-
Get any environment variables that should be added to the user's environment when running pipes
- getPipeline() - Method in class org.apache.spark.streaming.flume.FlumeReceiver.CompressionChannelPipelineFactory
-
- getPointIterator(RandomRDDPartition<T>, ClassTag<T>) - Static method in class org.apache.spark.mllib.rdd.RandomRDD
-
- getPoissonSamplingFunction(RDD<Tuple2<K, V>>, Map<K, Object>, boolean, long, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
-
Return the per partition sampling function used for sampling with replacement.
- getPoolForName(String) - Method in class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Return the pool associated with the given name, if one exists
- getPredictionCol() - Method in interface org.apache.spark.ml.param.HasPredictionCol
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.mllib.rdd.SlidingRDD
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.BlockRDD
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.CartesianRDD
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.CheckpointRDD
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.CoalescedRDD
-
Returns the preferred machine for the partition.
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.HadoopRDD
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.NewHadoopRDD
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.ParallelCollectionRDD
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.PartitionwiseSampledRDD
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.RDDCheckpointData
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.SampledRDD
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.UnionRDD
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.ZippedPartitionsBaseRDD
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.ZippedWithIndexRDD
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.streaming.kafka.KafkaRDD
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDD
-
Get the preferred location of the partition.
- getPreferredLocs(RDD<?>, int) - Method in class org.apache.spark.scheduler.DAGScheduler
-
Gets the locality information associated with a partition of a particular RDD.
- getPreferredLocs(RDD<?>, int) - Method in class org.apache.spark.SparkContext
-
Gets the locality information associated with the partition in a particular RDD.
- getPrimitiveNullWritable() - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getPrimitiveNullWritableConstantObjectInspector() - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getProbabilityCol() - Method in interface org.apache.spark.ml.param.HasProbabilityCol
-
- getProgress() - Method in class org.apache.spark.input.FixedLengthBinaryRecordReader
-
- getProgress() - Method in class org.apache.spark.input.StreamBasedRecordReader
-
- getProgress() - Method in class org.apache.spark.input.WholeTextFileRecordReader
-
- getPropertiesFromFile(String) - Static method in class org.apache.spark.util.Utils
-
Load properties present in the given file.
- getQuantileCalculationStrategy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getQuantiles(Traversable<Object>) - Method in class org.apache.spark.util.Distribution
-
Get the value of the distribution at the given probabilities.
- getRackForHost(String) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- getRank() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
- getRatingCol() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
- getRawPredictionCol() - Method in interface org.apache.spark.ml.param.HasRawPredictionCol
-
- getRddBlockLocations(int, Seq<StorageStatus>) - Static method in class org.apache.spark.storage.StorageUtils
-
Return a mapping from block ID to its locations for each block that belongs to the given RDD.
- getRDDStorageInfo() - Method in class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Return information about which RDDs are cached, whether they are in memory or on disk,
how much space they take, etc.
- getReceiver() - Method in class org.apache.spark.streaming.dstream.PluggableInputDStream
-
- getReceiver() - Method in class org.apache.spark.streaming.dstream.RawInputDStream
-
- getReceiver() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
Gets the receiver object that will be sent to the worker nodes
to receive data.
- getReceiver() - Method in class org.apache.spark.streaming.dstream.SocketInputDStream
-
- getReceiver() - Method in class org.apache.spark.streaming.flume.FlumeInputDStream
-
- getReceiver() - Method in class org.apache.spark.streaming.flume.FlumePollingInputDStream
-
- getReceiver() - Method in class org.apache.spark.streaming.kafka.KafkaInputDStream
-
- getReceiver() - Method in class org.apache.spark.streaming.mqtt.MQTTInputDStream
-
- getReceiver() - Method in class org.apache.spark.streaming.twitter.TwitterInputDStream
-
- getReceiverInputStreams() - Method in class org.apache.spark.streaming.DStreamGraph
-
- getRecordLength(JobContext) - Static method in class org.apache.spark.input.FixedLengthBinaryInputFormat
-
Retrieves the record length property from a Hadoop configuration
- getReference(A) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- getRegParam() - Method in interface org.apache.spark.ml.param.HasRegParam
-
- getRemote(BlockId) - Method in class org.apache.spark.storage.BlockManager
-
Get block from remote block managers.
- getRemoteBytes(BlockId) - Method in class org.apache.spark.storage.BlockManager
-
Get block from remote block managers as serialized bytes.
- getResource(List<Protos.Resource>, String) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
Helper function to pull out a resource from a Mesos Resources protobuf
- getResource(String) - Method in class org.apache.spark.util.ChildFirstURLClassLoader
-
- getResources(String) - Method in class org.apache.spark.util.ChildFirstURLClassLoader
-
- getRestartTime(long) - Method in class org.apache.spark.streaming.util.RecurringTimer
-
Get the time when the timer will fire if it is restarted right now.
- getRootConverter() - Method in class org.apache.spark.sql.parquet.RowRecordMaterializer
-
- getRootDirectory() - Static method in class org.apache.spark.SparkFiles
-
Get the root directory that contains files added through SparkContext.addFile().
- getSaslUser() - Method in class org.apache.spark.SecurityManager
-
Gets the user used for authenticating SASL connections.
- getSaslUser(String) - Method in class org.apache.spark.SecurityManager
-
- getSchedulableByName(String) - Method in class org.apache.spark.scheduler.Pool
-
- getSchedulableByName(String) - Method in interface org.apache.spark.scheduler.Schedulable
-
- getSchedulableByName(String) - Method in class org.apache.spark.scheduler.TaskSetManager
-
- getSchedulingMode() - Method in class org.apache.spark.SparkContext
-
Return current scheduling mode
- getSchema(Configuration) - Static method in class org.apache.spark.sql.parquet.RowWriteSupport
-
- getSecretKey() - Method in class org.apache.spark.SecurityManager
-
Gets the secret key.
- getSecretKey(String) - Method in class org.apache.spark.SecurityManager
-
- getSecurityManager() - Method in class org.apache.spark.ui.WebUI
-
- getSeed() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Return the random seed
- getSeed() - Method in class org.apache.spark.mllib.clustering.LDA
-
Random seed
- getSeqOp(boolean, Map<K, Object>, StratifiedSamplingUtils.RandomDataGenerator, Option<Map<K, Object>>) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
-
Returns the function used by aggregate to collect sampling statistics for each partition.
- getSerializedMapOutputStatuses(int) - Method in class org.apache.spark.MapOutputTrackerMaster
-
- getSerializer(Serializer) - Static method in class org.apache.spark.serializer.Serializer
-
- getSerializer(Option<Serializer>) - Static method in class org.apache.spark.serializer.Serializer
-
- getServerStatuses(int, int) - Method in class org.apache.spark.MapOutputTracker
-
Called from executors to get the server URIs and output sizes of the map outputs of
a given shuffle.
- getServletHandlers() - Method in class org.apache.spark.metrics.MetricsSystem
-
Get any UI handlers used by this metrics system; can only be called after start().
- getShortWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getShortWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getSingle(BlockId) - Method in class org.apache.spark.storage.BlockManager
-
Read a block consisting of a single object.
- getSize(BlockId) - Method in class org.apache.spark.storage.BlockStore
-
Return the size of a block in bytes.
- getSize(BlockId) - Method in class org.apache.spark.storage.DiskStore
-
- getSize(BlockId) - Method in class org.apache.spark.storage.MemoryStore
-
- getSize(BlockId) - Method in class org.apache.spark.storage.TachyonStore
-
- getSizeForBlock(int) - Method in class org.apache.spark.scheduler.CompressedMapStatus
-
- getSizeForBlock(int) - Method in class org.apache.spark.scheduler.HighlyCompressedMapStatus
-
- getSizeForBlock(int) - Method in interface org.apache.spark.scheduler.MapStatus
-
Estimated size for the reduce block, in bytes.
- getSizesOfActiveStateTrackingCollections() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- getSizesOfHardSizeLimitedCollections() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- getSizesOfSoftSizeLimitedCollections() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- getSlotDescs() - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
-
- getSortedRolledOverFiles(String, String) - Static method in class org.apache.spark.util.logging.RollingFileAppender
-
Get the sorted list of rolled over files.
- getSortedTaskSetQueue() - Method in class org.apache.spark.scheduler.Pool
-
- getSortedTaskSetQueue() - Method in interface org.apache.spark.scheduler.Schedulable
-
- getSortedTaskSetQueue() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- getSparkClassLoader() - Static method in class org.apache.spark.util.Utils
-
Get the ClassLoader which loaded Spark.
- getSparkHome() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Get Spark's home location from either a value set through the constructor,
or the spark.home Java property, or the SPARK_HOME environment variable
(in that order of preference).
- getSparkHome() - Method in class org.apache.spark.SparkContext
-
Get Spark's home location from either a value set through the constructor,
or the spark.home Java property, or the SPARK_HOME environment variable
(in that order of preference).
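The order of preference described above can be sketched in plain Java. This is only an illustration of the lookup order, not Spark's implementation: the class SparkHomeSketch and its resolve method are hypothetical names, and the three candidate values are passed in explicitly so that the precedence is easy to see.

```java
import java.util.Optional;

// Hypothetical helper (not part of Spark) that mirrors the documented
// order of preference: constructor value, then the spark.home Java
// property, then the SPARK_HOME environment variable.
final class SparkHomeSketch {

    static Optional<String> resolve(String constructorValue,
                                    String javaProperty,
                                    String envValue) {
        if (constructorValue != null) {
            return Optional.of(constructorValue);
        }
        if (javaProperty != null) {
            return Optional.of(javaProperty);
        }
        // Empty when none of the three sources is set.
        return Optional.ofNullable(envValue);
    }

    public static void main(String[] args) {
        Optional<String> home = resolve(
                null,                              // nothing set through a constructor
                System.getProperty("spark.home"),  // Java property
                System.getenv("SPARK_HOME"));      // environment variable
        System.out.println(home.orElse("<not set>"));
    }
}
```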
- getSparkOrYarnConfig(SparkConf, String, String) - Static method in class org.apache.spark.util.Utils
-
Return the value of a config either through the SparkConf or the Hadoop configuration
if this is Yarn mode.
- getSparkUI(StreamingContext) - Static method in class org.apache.spark.streaming.ui.StreamingTab
-
- getSplits(JobContext) - Method in class org.apache.spark.sql.parquet.FilteringParquetRowInputFormat
-
- getSplits(Configuration, List<Footer>) - Method in class org.apache.spark.sql.parquet.FilteringParquetRowInputFormat
-
- getStageInfo(int) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
-
Returns stage information, or null if the stage info could not be found or was
garbage collected.
- getStageInfo(int) - Method in class org.apache.spark.SparkStatusTracker
-
Returns stage information, or None if the stage info could not be found or was
garbage collected.
- getStages() - Method in class org.apache.spark.ml.Pipeline
-
- getStartTime() - Method in class org.apache.spark.streaming.util.RecurringTimer
-
Get the time when this timer will fire if it is started right now.
- getStatsSetupConstRawDataSize() - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getStatsSetupConstTotalSize() - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getStatus(BlockId) - Method in class org.apache.spark.storage.BlockManager
-
Get the BlockStatus for the block identified by the given ID, if it exists.
- getStatus(BlockId) - Method in class org.apache.spark.storage.BlockManagerInfo
-
- getStderr(Process, long) - Static method in class org.apache.spark.util.Utils
-
Return the stderr of a process after waiting for the process to terminate.
- getStorageLevel() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Get the RDD's current storage level, or StorageLevel.NONE if none is set.
- getStorageLevel() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- getStorageLevel() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- getStorageLevel() - Method in class org.apache.spark.rdd.RDD
-
Get the RDD's current storage level, or StorageLevel.NONE if none is set.
- getStorageStatus() - Method in class org.apache.spark.storage.BlockManagerMaster
-
- getStringWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getStringWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getSubsamplingRate() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getSystemProperties() - Static method in class org.apache.spark.util.Utils
-
Returns the system properties map that is thread-safe to iterate over.
- getTableDesc(Class<? extends Deserializer>, Class<? extends InputFormat<?, ?>>, Class<?>, Properties) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getTables(Option<String>) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
-
- getTabs() - Method in class org.apache.spark.ui.WebUI
-
- getTaskSideSplits(Configuration, List<Footer>, Long, Long, ReadSupport.ReadContext) - Method in class org.apache.spark.sql.parquet.FilteringParquetRowInputFormat
-
- getThreadDump() - Static method in class org.apache.spark.util.Utils
-
Return a thread dump of all threads' stacktraces.
- getThreadLocal() - Static method in class org.apache.spark.SparkEnv
-
Returns the ThreadLocal SparkEnv.
- getThreshold() - Method in interface org.apache.spark.ml.param.HasThreshold
-
- getThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
:: Experimental ::
Returns the threshold (if any) used for converting raw prediction scores into 0/1 predictions.
- getThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel
-
:: Experimental ::
Returns the threshold (if any) used for converting raw prediction scores into 0/1 predictions.
- getTimeMillis() - Method in interface org.apache.spark.util.Clock
-
- getTimeMillis() - Method in class org.apache.spark.util.ManualClock
-
- getTimeMillis() - Method in class org.apache.spark.util.SystemClock
-
- getTimeOfDayNanos() - Method in class org.apache.spark.sql.parquet.timestamp.NanoTime
-
- getTimestamp(A) - Method in class org.apache.spark.util.TimeStampedHashMap
-
- getTimestamp(A) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- getTimeStampedValue(A) - Method in class org.apache.spark.util.TimeStampedHashMap
-
- getTimestampWritable(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- getTimestampWritableConstantObjectInspector(Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- GETTING_RESULT_TIME() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
-
- GETTING_RESULT_TIME() - Static method in class org.apache.spark.ui.ToolTips
-
- gettingResult() - Method in class org.apache.spark.scheduler.TaskInfo
-
- GettingResultEvent - Class in org.apache.spark.scheduler
-
- GettingResultEvent(TaskInfo) - Constructor for class org.apache.spark.scheduler.GettingResultEvent
-
- gettingResultTime() - Method in class org.apache.spark.scheduler.TaskInfo
-
The time when the task started remotely getting the result.
- getTopicConcentration() - Method in class org.apache.spark.mllib.clustering.LDA
-
Concentration parameter (commonly named "beta" or "eta") for the prior placed on topics'
distributions over terms.
- getTreeStrategy() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- getUIPort(SparkConf) - Static method in class org.apache.spark.ui.SparkUI
-
- getUnallocatedBlocks(int) - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
-
Get blocks that have been added but not yet allocated to any batch.
- getUpperBound(double, long, double) - Static method in class org.apache.spark.util.random.BinomialBounds
-
Returns a threshold p such that if we conduct n Bernoulli trials with success rate = p,
it is very unlikely to have less than fraction * n successes.
- getUpperBound(double) - Static method in class org.apache.spark.util.random.PoissonBounds
-
Returns a lambda such that Pr[X < s] is very small, where X ~ Pois(lambda).
- getURLs() - Method in class org.apache.spark.util.MutableURLClassLoader
-
- getUsedTimeMs(long) - Static method in class org.apache.spark.util.Utils
-
Return a string stating how much time has passed, in milliseconds.
- getUseNodeIdCache() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getUserCol() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
- getValues(BlockId) - Method in class org.apache.spark.storage.BlockStore
-
- getValues(BlockId) - Method in class org.apache.spark.storage.DiskStore
-
- getValues(BlockId, Serializer) - Method in class org.apache.spark.storage.DiskStore
-
A version of getValues that allows a custom serializer.
- getValues(BlockId) - Method in class org.apache.spark.storage.MemoryStore
-
- getValues(BlockId) - Method in class org.apache.spark.storage.TachyonStore
-
- getVectorIterator(RandomRDDPartition<Object>, int) - Static method in class org.apache.spark.mllib.rdd.RandomRDD
-
- getVectors() - Method in class org.apache.spark.mllib.feature.Word2VecModel
-
Returns a map of words to their vector representations.
- getViewAcls() - Method in class org.apache.spark.SecurityManager
-
- Gini - Class in org.apache.spark.mllib.tree.impurity
-
:: Experimental ::
Class for calculating the Gini impurity during binary classification.
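For reference, Gini impurity is commonly defined as 1 minus the sum of squared class proportions. A minimal plain-Java sketch of that formula follows; the class GiniSketch and its impurity method are hypothetical names used for illustration, and returning 0 for an empty node is a convention chosen here rather than something stated in this index.

```java
// Hypothetical illustration of the Gini impurity formula:
// impurity = 1 - sum_i p_i^2, where p_i is the fraction of samples with label i.
final class GiniSketch {

    // counts[i] is the (possibly weighted) number of samples with label i.
    static double impurity(double[] counts) {
        double total = 0.0;
        for (double c : counts) {
            total += c;
        }
        if (total == 0.0) {
            return 0.0; // assumption: an empty node is treated as pure
        }
        double sumOfSquares = 0.0;
        for (double c : counts) {
            double p = c / total;
            sumOfSquares += p * p;
        }
        return 1.0 - sumOfSquares;
    }

    public static void main(String[] args) {
        System.out.println(impurity(new double[] {5, 5}));  // evenly split node
        System.out.println(impurity(new double[] {10, 0})); // pure node
    }
}
```

An evenly split binary node gives the maximum two-class impurity of 0.5, and a pure node gives 0.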
- Gini() - Constructor for class org.apache.spark.mllib.tree.impurity.Gini
-
- GiniAggregator - Class in org.apache.spark.mllib.tree.impurity
-
Class for updating views of a vector of sufficient statistics,
in order to compute impurity from a sample.
- GiniAggregator(int) - Constructor for class org.apache.spark.mllib.tree.impurity.GiniAggregator
-
- GiniCalculator - Class in org.apache.spark.mllib.tree.impurity
-
Stores statistics for one (node, feature, bin) for calculating impurity.
- GiniCalculator(double[]) - Constructor for class org.apache.spark.mllib.tree.impurity.GiniCalculator
-
- GLMClassificationModel - Class in org.apache.spark.mllib.classification.impl
-
Helper class for import/export of GLM classification models.
- GLMClassificationModel() - Constructor for class org.apache.spark.mllib.classification.impl.GLMClassificationModel
-
- GLMClassificationModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.classification.impl
-
- GLMClassificationModel.SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$
-
- GLMClassificationModel.SaveLoadV1_0$.Data - Class in org.apache.spark.mllib.classification.impl
-
Model data for import/export
- GLMClassificationModel.SaveLoadV1_0$.Data(Vector, double, Option<Object>) - Constructor for class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$.Data
-
- GLMRegressionModel - Class in org.apache.spark.mllib.regression.impl
-
Helper methods for import/export of GLM regression models.
- GLMRegressionModel() - Constructor for class org.apache.spark.mllib.regression.impl.GLMRegressionModel
-
- GLMRegressionModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.regression.impl
-
- GLMRegressionModel.SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$
-
- GLMRegressionModel.SaveLoadV1_0$.Data - Class in org.apache.spark.mllib.regression.impl
-
Model data for model import/export
- GLMRegressionModel.SaveLoadV1_0$.Data(Vector, double) - Constructor for class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$.Data
-
- globalTopicTotals() - Method in class org.apache.spark.mllib.clustering.LDA.EMOptimizer
-
Aggregate distributions over topics from all term vertices.
- glom() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD created by coalescing all elements within each partition into an array.
- glom() - Method in class org.apache.spark.rdd.RDD
-
Return an RDD created by coalescing all elements within each partition into an array.
- glom() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying glom() to each RDD of
this DStream.
- glom() - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying glom() to each RDD of
this DStream.
- GlommedDStream<T> - Class in org.apache.spark.streaming.dstream
-
- GlommedDStream(DStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.GlommedDStream
-
- goodnessOfFit() - Method in class org.apache.spark.mllib.stat.test.ChiSqTest.NullHypothesis$
-
- grad() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
-
- Gradient - Class in org.apache.spark.mllib.optimization
-
:: DeveloperApi ::
Class used to compute the gradient for a loss function, given a single data point.
- Gradient() - Constructor for class org.apache.spark.mllib.optimization.Gradient
-
- gradient(TreeEnsembleModel, LabeledPoint) - Static method in class org.apache.spark.mllib.tree.loss.AbsoluteError
-
Method to calculate the gradients for the gradient boosting calculation for least
absolute error calculation.
- gradient(TreeEnsembleModel, LabeledPoint) - Static method in class org.apache.spark.mllib.tree.loss.LogLoss
-
Method to calculate the loss gradients for the gradient boosting calculation for binary
classification.
The gradient with respect to F(x) is: -4 y / (1 + exp(2 y F(x)))
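The gradient formula above can be checked numerically with a short plain-Java sketch. It assumes the loss has the form 2 log(1 + exp(-2 y F(x))) with labels y in {-1, +1}, which is the form whose derivative matches the stated gradient; the class LogLossGradientSketch and its methods are hypothetical names, not Spark API.

```java
// Hypothetical illustration of the log-loss gradient used in gradient boosting.
// Assumed loss: L(y, F) = 2 * log(1 + exp(-2 * y * F)), with y in {-1, +1}.
final class LogLossGradientSketch {

    static double loss(double y, double f) {
        return 2.0 * Math.log1p(Math.exp(-2.0 * y * f));
    }

    // dL/dF = -4 y / (1 + exp(2 y F)), as stated in the entry above.
    static double gradient(double y, double f) {
        return -4.0 * y / (1.0 + Math.exp(2.0 * y * f));
    }

    public static void main(String[] args) {
        double y = 1.0;
        double f = 0.3;
        double h = 1e-6;
        // Central finite difference as an independent check of the formula.
        double numeric = (loss(y, f + h) - loss(y, f - h)) / (2.0 * h);
        System.out.println("analytic = " + gradient(y, f));
        System.out.println("numeric  = " + numeric);
    }
}
```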
- gradient(TreeEnsembleModel, LabeledPoint) - Method in interface org.apache.spark.mllib.tree.loss.Loss
-
Method to calculate the gradients for the gradient boosting calculation.
- gradient(TreeEnsembleModel, LabeledPoint) - Static method in class org.apache.spark.mllib.tree.loss.SquaredError
-
Method to calculate the gradients for the gradient boosting calculation for least
squares error calculation.
- GradientBoostedTrees - Class in org.apache.spark.mllib.tree
-
:: Experimental ::
A class that implements Stochastic Gradient Boosting for regression and binary classification.
- GradientBoostedTrees(BoostingStrategy) - Constructor for class org.apache.spark.mllib.tree.GradientBoostedTrees
-
- GradientBoostedTreesModel - Class in org.apache.spark.mllib.tree.model
-
:: Experimental ::
Represents a gradient boosted trees model.
- GradientBoostedTreesModel(Enumeration.Value, DecisionTreeModel[], double[]) - Constructor for class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
- GradientDescent - Class in org.apache.spark.mllib.optimization
-
Class used to solve an optimization problem using Gradient Descent.
- GradientDescent(Gradient, Updater) - Constructor for class org.apache.spark.mllib.optimization.GradientDescent
-
- Graph<VD,ED> - Class in org.apache.spark.graphx
-
The Graph abstractly represents a graph with arbitrary objects
associated with vertices and edges.
- graph() - Method in class org.apache.spark.mllib.clustering.LDA.EMOptimizer
-
- graph() - Method in class org.apache.spark.streaming.Checkpoint
-
- graph() - Method in class org.apache.spark.streaming.dstream.DStream
-
- graph() - Method in class org.apache.spark.streaming.StreamingContext
-
- graphCheckpointer() - Method in class org.apache.spark.mllib.clustering.LDA.EMOptimizer
-
- GraphGenerators - Class in org.apache.spark.graphx.util
-
A collection of graph generating functions.
- GraphGenerators() - Constructor for class org.apache.spark.graphx.util.GraphGenerators
-
- GraphImpl<VD,ED> - Class in org.apache.spark.graphx.impl
-
An implementation of Graph to support computation on graphs.
- graphite() - Method in class org.apache.spark.metrics.sink.GraphiteSink
-
- GRAPHITE_DEFAULT_PERIOD() - Method in class org.apache.spark.metrics.sink.GraphiteSink
-
- GRAPHITE_DEFAULT_PREFIX() - Method in class org.apache.spark.metrics.sink.GraphiteSink
-
- GRAPHITE_DEFAULT_UNIT() - Method in class org.apache.spark.metrics.sink.GraphiteSink
-
- GRAPHITE_KEY_HOST() - Method in class org.apache.spark.metrics.sink.GraphiteSink
-
- GRAPHITE_KEY_PERIOD() - Method in class org.apache.spark.metrics.sink.GraphiteSink
-
- GRAPHITE_KEY_PORT() - Method in class org.apache.spark.metrics.sink.GraphiteSink
-
- GRAPHITE_KEY_PREFIX() - Method in class org.apache.spark.metrics.sink.GraphiteSink
-
- GRAPHITE_KEY_PROTOCOL() - Method in class org.apache.spark.metrics.sink.GraphiteSink
-
- GRAPHITE_KEY_UNIT() - Method in class org.apache.spark.metrics.sink.GraphiteSink
-
- GraphiteSink - Class in org.apache.spark.metrics.sink
-
- GraphiteSink(Properties, MetricRegistry, SecurityManager) - Constructor for class org.apache.spark.metrics.sink.GraphiteSink
-
- GraphKryoRegistrator - Class in org.apache.spark.graphx
-
Registers GraphX classes with Kryo for improved performance.
- GraphKryoRegistrator() - Constructor for class org.apache.spark.graphx.GraphKryoRegistrator
-
- GraphLoader - Class in org.apache.spark.graphx
-
Provides utilities for loading Graphs from files.
- GraphLoader() - Constructor for class org.apache.spark.graphx.GraphLoader
-
- GraphOps<VD,ED> - Class in org.apache.spark.graphx
-
Contains additional functionality for Graph.
- GraphOps(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Constructor for class org.apache.spark.graphx.GraphOps
-
- graphToGraphOps(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph
-
Implicitly extracts the GraphOps member from a graph.
- GraphXUtils - Class in org.apache.spark.graphx
-
- GraphXUtils() - Constructor for class org.apache.spark.graphx.GraphXUtils
-
- greater(Duration) - Method in class org.apache.spark.streaming.Duration
-
- greater(Time) - Method in class org.apache.spark.streaming.Time
-
- greaterEq(Duration) - Method in class org.apache.spark.streaming.Duration
-
- greaterEq(Time) - Method in class org.apache.spark.streaming.Time
-
- GreaterThan - Class in org.apache.spark.sql.sources
-
- GreaterThan(String, Object) - Constructor for class org.apache.spark.sql.sources.GreaterThan
-
- GreaterThanOrEqual - Class in org.apache.spark.sql.sources
-
- GreaterThanOrEqual(String, Object) - Constructor for class org.apache.spark.sql.sources.GreaterThanOrEqual
-
- gridGraph(SparkContext, int, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
Create a rows by cols grid graph with each vertex connected to its
row+1 and col+1 neighbors.
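The grid structure described above can be sketched without Spark by enumerating the edges directly. The class GridGraphSketch is a hypothetical name, and the row-major vertex numbering (id = row * cols + col) is an assumption made for this illustration rather than something this index states.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a rows-by-cols grid in which every vertex
// has an edge to its row+1 neighbor and its col+1 neighbor, when they exist.
final class GridGraphSketch {

    // Each edge is returned as {sourceId, destinationId}.
    // Assumption: vertices are numbered row-major, id = row * cols + col.
    static List<long[]> gridEdges(int rows, int cols) {
        List<long[]> edges = new ArrayList<>();
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                long id = (long) r * cols + c;
                if (r + 1 < rows) {
                    edges.add(new long[] {id, (long) (r + 1) * cols + c});
                }
                if (c + 1 < cols) {
                    edges.add(new long[] {id, (long) r * cols + (c + 1)});
                }
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        for (long[] e : gridEdges(2, 3)) {
            System.out.println(e[0] + " -> " + e[1]);
        }
    }
}
```

Under this construction a grid has rows * (cols - 1) + cols * (rows - 1) edges.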
- GridPartitioner - Class in org.apache.spark.mllib.linalg.distributed
-
A grid partitioner, which uses a regular grid to partition coordinates.
- GridPartitioner(int, int, int, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.GridPartitioner
-
- groupArr() - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- groupBy(Function<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD of grouped elements.
- groupBy(Function<T, U>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD of grouped elements.
- groupBy(Function1<T, K>, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD of grouped items.
- groupBy(Function1<T, K>, int, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD of grouped elements.
- groupBy(Function1<T, K>, Partitioner, ClassTag<K>, Ordering<K>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD of grouped items.
- groupBy(Column...) - Method in class org.apache.spark.sql.DataFrame
-
Groups the DataFrame using the specified columns, so we can run aggregation on them.
- groupBy(String, String...) - Method in class org.apache.spark.sql.DataFrame
-
Groups the DataFrame using the specified columns, so we can run aggregation on them.
- groupBy(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
-
Groups the DataFrame using the specified columns, so we can run aggregation on them.
- groupBy(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame
-
Groups the DataFrame using the specified columns, so we can run aggregation on them.
- groupByKey(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Group the values for each key in the RDD into a single sequence.
- groupByKey(int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Group the values for each key in the RDD into a single sequence.
- groupByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Group the values for each key in the RDD into a single sequence.
- groupByKey(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Group the values for each key in the RDD into a single sequence.
- groupByKey(int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Group the values for each key in the RDD into a single sequence.
- groupByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Group the values for each key in the RDD into a single sequence.
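The grouping semantics of groupByKey can be illustrated on an in-memory list of key-value pairs. This is a minimal plain-Java sketch with hypothetical names; it ignores partitioning and distribution, and it is not how Spark implements the operation.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** In-memory illustration of grouping the values for each key into a single sequence. */
final class GroupByKeySketch {

    static <K, V> Map<K, List<V>> groupByKey(List<Map.Entry<K, V>> pairs) {
        Map<K, List<V>> grouped = new LinkedHashMap<>();
        for (Map.Entry<K, V> pair : pairs) {
            // Create the value list on first sight of a key, then append.
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        pairs.add(new AbstractMap.SimpleEntry<>("a", 1));
        pairs.add(new AbstractMap.SimpleEntry<>("b", 2));
        pairs.add(new AbstractMap.SimpleEntry<>("a", 3));
        Map<String, List<Integer>> grouped = groupByKey(pairs);
        System.out.println(grouped);
    }
}
```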
- groupByKey() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey to each RDD.
- groupByKey(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey to each RDD.
- groupByKey(Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey on each RDD of this DStream.
- groupByKey() - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey to each RDD.
- groupByKey(int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey to each RDD.
- groupByKey(Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey on each RDD.
- groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey over a sliding window.
- groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey over a sliding window.
- groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey over a sliding window on this DStream.
- groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey over a sliding window on this DStream.
- groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey over a sliding window.
- groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey over a sliding window.
- groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey over a sliding window on this DStream.
- groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Create a new DStream by applying groupByKey over a sliding window on this DStream.
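Grouping by key over a sliding window can be sketched on a list of in-memory batches. The model below is a simplification made for illustration: window length and slide interval are counted in batches rather than as Duration values, a window is emitted after every slide batches, and early windows cover fewer batches. All names are hypothetical and this is not the Spark Streaming implementation.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Simplified in-memory model of groupByKey applied over a sliding window of batches. */
final class WindowedGroupSketch {

    static <K, V> List<Map<K, List<V>>> groupByKeyAndWindow(
            List<List<Map.Entry<K, V>>> batches, int windowLen, int slide) {
        List<Map<K, List<V>>> windows = new ArrayList<>();
        for (int end = slide; end <= batches.size(); end += slide) {
            // The window covers the most recent windowLen batches ending at 'end' (exclusive).
            int start = Math.max(0, end - windowLen);
            Map<K, List<V>> grouped = new LinkedHashMap<>();
            for (List<Map.Entry<K, V>> batch : batches.subList(start, end)) {
                for (Map.Entry<K, V> pair : batch) {
                    grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                           .add(pair.getValue());
                }
            }
            windows.add(grouped);
        }
        return windows;
    }

    public static void main(String[] args) {
        List<List<Map.Entry<String, Integer>>> batches = new ArrayList<>();
        String[] keys = {"a", "a", "b"};
        for (int i = 0; i < keys.length; i++) {
            List<Map.Entry<String, Integer>> batch = new ArrayList<>();
            batch.add(new AbstractMap.SimpleEntry<>(keys[i], i + 1));
            batches.add(batch);
        }
        // Window of 2 batches, sliding by 1 batch.
        List<Map<String, List<Integer>>> windows = groupByKeyAndWindow(batches, 2, 1);
        System.out.println(windows);
    }
}
```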
- groupByResultToJava(RDD<Tuple2<K, Iterable<T>>>, ClassTag<K>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
- GroupedCountEvaluator<T> - Class in org.apache.spark.partial
-
An ApproximateEvaluator for counts by key.
- GroupedCountEvaluator(int, double, ClassTag<T>) - Constructor for class org.apache.spark.partial.GroupedCountEvaluator
-
- GroupedData - Class in org.apache.spark.sql
-
:: Experimental ::
A set of methods for aggregations on a DataFrame, created by DataFrame.groupBy.
- groupEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.Graph
-
Merges multiple edges between two vertices into a single edge.
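Merging parallel edges with a user-supplied function can be sketched with a map keyed by (source, destination). The Edge type, the string key, and the integer attribute are assumptions made for illustration; the sketch treats 1->2 and 2->1 as distinct edges and does not reflect any partitioning requirement of the real method.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

/** Plain-Java sketch of merging edges that share the same source and destination. */
final class GroupEdgesSketch {

    /** Hypothetical edge type: source id, destination id, integer attribute. */
    static final class Edge {
        final long src;
        final long dst;
        final int attr;

        Edge(long src, long dst, int attr) {
            this.src = src;
            this.dst = dst;
            this.attr = attr;
        }
    }

    /** Returns one merged attribute per "src->dst" pair, combining duplicates with merge. */
    static Map<String, Integer> groupEdges(List<Edge> edges, BinaryOperator<Integer> merge) {
        Map<String, Integer> merged = new LinkedHashMap<>();
        for (Edge e : edges) {
            merged.merge(e.src + "->" + e.dst, e.attr, merge);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge(1, 2, 5));
        edges.add(new Edge(1, 2, 7));
        edges.add(new Edge(2, 1, 1));
        Map<String, Integer> merged = groupEdges(edges, Integer::sum);
        System.out.println(merged);
    }
}
```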
- groupEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.impl.EdgePartition
-
Merge all the edges with the same src and dest id into a single edge using the merge function.
- groupEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- GroupedMeanEvaluator<T> - Class in org.apache.spark.partial
-
An ApproximateEvaluator for means by key.
- GroupedMeanEvaluator(int, double) - Constructor for class org.apache.spark.partial.GroupedMeanEvaluator
-
- GroupedSumEvaluator<T> - Class in org.apache.spark.partial
-
An ApproximateEvaluator for sums by key.
- GroupedSumEvaluator(int, double) - Constructor for class org.apache.spark.partial.GroupedSumEvaluator
-
- groupHash() - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- groupId() - Method in class org.apache.spark.scheduler.JobGroupCancelled
-
- groupWith(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Alias for cogroup.
- groupWith(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Alias for cogroup.
- groupWith(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Alias for cogroup.
- groupWith(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Alias for cogroup.
- groupWith(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Alias for cogroup.
- groupWith(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Alias for cogroup.
- groupWriter() - Method in class org.apache.spark.sql.parquet.TestGroupWriteSupport
-
- GrowableAccumulableParam<R,T> - Class in org.apache.spark
-
- GrowableAccumulableParam(Function1<R, Growable<T>>, ClassTag<R>) - Constructor for class org.apache.spark.GrowableAccumulableParam
-
- gt(Object) - Method in class org.apache.spark.sql.Column
-
Greater than.
- i() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
-
- id() - Method in class org.apache.spark.Accumulable
-
- id() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
A unique ID for this RDD (within its SparkContext).
- id() - Method in class org.apache.spark.broadcast.Broadcast
-
- id() - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
-
- id() - Method in class org.apache.spark.mllib.tree.model.Node
-
- id() - Method in class org.apache.spark.rdd.RDD
-
A unique ID for this RDD (within its SparkContext).
- id() - Method in class org.apache.spark.scheduler.AccumulableInfo
-
- id() - Method in class org.apache.spark.scheduler.Stage
-
- id() - Method in class org.apache.spark.scheduler.TaskInfo
-
- id() - Method in class org.apache.spark.scheduler.TaskSet
-
- id() - Method in class org.apache.spark.storage.RDDInfo
-
- id() - Method in class org.apache.spark.storage.TempLocalBlockId
-
- id() - Method in class org.apache.spark.storage.TempShuffleBlockId
-
- id() - Method in class org.apache.spark.storage.TestBlockId
-
- id() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
This is a unique identifier for the receiver input stream.
- id() - Method in class org.apache.spark.streaming.scheduler.Job
-
- id() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
-
- Identifiable - Interface in org.apache.spark.ml
-
Object with a unique id.
- IDF - Class in org.apache.spark.mllib.feature
-
:: Experimental ::
Inverse document frequency (IDF).
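As a worked illustration, the sketch below computes a smoothed inverse document frequency, idf = log((m + 1) / (d(t) + 1)), where m is the number of documents and d(t) is the number of documents containing term t. The smoothing formula and the treatment of the int constructor argument as a minimum document frequency (terms below it get an IDF of 0) are stated here as assumptions about this class; the names are hypothetical and the code does not use Spark.

```java
/** Plain-Java sketch of a smoothed IDF computation; not the MLlib implementation. */
final class IdfSketch {

    /**
     * @param numDocs    total number of documents (m)
     * @param docFreq    docFreq[t] = number of documents containing term t
     * @param minDocFreq terms appearing in fewer documents than this get an IDF of 0 (assumed behavior)
     */
    static double[] idf(long numDocs, long[] docFreq, int minDocFreq) {
        double[] out = new double[docFreq.length];
        for (int t = 0; t < docFreq.length; t++) {
            if (docFreq[t] >= minDocFreq) {
                out[t] = Math.log((numDocs + 1.0) / (docFreq[t] + 1.0));
            }
            // Otherwise out[t] keeps its default value of 0.0.
        }
        return out;
    }

    public static void main(String[] args) {
        double[] weights = idf(3, new long[] {3, 1, 0}, 0);
        for (double w : weights) {
            System.out.println(w);
        }
    }
}
```

A term that occurs in every document gets weight log(1) = 0 under this formula, so it contributes nothing after scaling.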
- IDF(int) - Constructor for class org.apache.spark.mllib.feature.IDF
-
- IDF() - Constructor for class org.apache.spark.mllib.feature.IDF
-
- idf() - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
-
Returns the current IDF vector.
- idf() - Method in class org.apache.spark.mllib.feature.IDFModel
-
- IDF.DocumentFrequencyAggregator - Class in org.apache.spark.mllib.feature
-
Document frequency aggregator.
- IDF.DocumentFrequencyAggregator(int) - Constructor for class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
-
- IDF.DocumentFrequencyAggregator() - Constructor for class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
-
- IDFModel - Class in org.apache.spark.mllib.feature
-
:: Experimental ::
Represents an IDF model that can transform term frequency vectors.
- IDFModel(Vector) - Constructor for class org.apache.spark.mllib.feature.IDFModel
-
- IdGenerator - Class in org.apache.spark.util
-
A util used to get a unique generation ID.
- IdGenerator() - Constructor for class org.apache.spark.util.IdGenerator
-
- idx() - Method in class org.apache.spark.mllib.rdd.SlidingRDDPartition
-
- idx() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDDPartition
-
- idx() - Method in class org.apache.spark.rdd.ShuffledRDDPartition
-
- idx() - Method in class org.apache.spark.sql.jdbc.JDBCPartition
-
- ifExists() - Method in class org.apache.spark.sql.hive.execution.DropTable
-
- implicitPrefs() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
Param to decide whether to use implicit preference.
- implicits() - Method in class org.apache.spark.sql.SQLContext
-
- improveException(Object, NotSerializableException) - Static method in class org.apache.spark.serializer.SerializationDebugger
-
Improve the given NotSerializableException with the serialization path leading from the given
object to the problematic object.
- Impurities - Class in org.apache.spark.mllib.tree.impurity
-
Factory for Impurity instances.
- Impurities() - Constructor for class org.apache.spark.mllib.tree.impurity.Impurities
-
- impurity() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- impurity() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
-
- Impurity - Interface in org.apache.spark.mllib.tree.impurity
-
:: Experimental ::
Trait for calculating information gain.
- impurity() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
-
- impurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- impurity() - Method in class org.apache.spark.mllib.tree.model.Node
-
- impurityAggregator() - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
-
ImpurityAggregator instance specifying the impurity type.
- ImpurityAggregator - Class in org.apache.spark.mllib.tree.impurity
-
Interface for updating views of a vector of sufficient statistics,
in order to compute impurity from a sample.
- ImpurityAggregator(int) - Constructor for class org.apache.spark.mllib.tree.impurity.ImpurityAggregator
-
- ImpurityCalculator - Class in org.apache.spark.mllib.tree.impurity
-
Stores statistics for one (node, feature, bin) for calculating impurity.
- ImpurityCalculator(double[]) - Constructor for class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
-
- In() - Static method in class org.apache.spark.graphx.EdgeDirection
-
Edges arriving at a vertex.
- in(Column...) - Method in class org.apache.spark.sql.Column
-
A boolean expression that is evaluated to true if the value of this expression is contained
by the evaluated values of the arguments.
- in(Seq<Column>) - Method in class org.apache.spark.sql.Column
-
A boolean expression that is evaluated to true if the value of this expression is contained
by the evaluated values of the arguments.
- IN() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- In - Class in org.apache.spark.sql.sources
-
- In(String, Object[]) - Constructor for class org.apache.spark.sql.sources.In
-
- IN_MEMORY_PARTITION_PRUNING() - Static method in class org.apache.spark.sql.SQLConf
-
- IN_PROGRESS() - Static method in class org.apache.spark.scheduler.EventLoggingListener
-
- increaseRunningTasks(int) - Method in class org.apache.spark.scheduler.Pool
-
- incrementEpoch() - Method in class org.apache.spark.MapOutputTrackerMaster
-
- inDegrees() - Method in class org.apache.spark.graphx.GraphOps
-
The in-degree of each vertex in the graph.
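In-degree can be illustrated by counting how often each vertex appears as an edge destination. This is a plain-Java sketch with hypothetical names. In the sketch, a vertex with no incoming edges simply has no entry in the result; whether the real method reports such vertices is not stated in this index.

```java
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java sketch: in-degree = number of edges arriving at a vertex. */
final class InDegreeSketch {

    /** Each edge is {srcId, dstId}; the result maps dstId to its in-degree. */
    static Map<Long, Integer> inDegrees(long[][] edges) {
        Map<Long, Integer> degrees = new TreeMap<>();
        for (long[] e : edges) {
            degrees.merge(e[1], 1, Integer::sum);
        }
        return degrees;
    }

    public static void main(String[] args) {
        long[][] edges = {{1, 2}, {3, 2}, {2, 3}};
        System.out.println(inDegrees(edges));
    }
}
```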
- independence() - Method in class org.apache.spark.mllib.stat.test.ChiSqTest.NullHypothesis$
-
- index() - Method in class org.apache.spark.graphx.impl.ShippableVertexPartition
-
- index() - Method in class org.apache.spark.graphx.impl.VertexPartition
-
- index() - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
-
- index(int, int) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- index() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRow
-
- index(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Return the index for the (i, j)-th element in the backing array.
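For a dense matrix stored in column-major order, the position of element (i, j) in the backing array is i + numRows * j. The sketch below assumes that layout; the class name is hypothetical, and sparse matrices need a different lookup that is not shown.

```java
/** Plain-Java sketch of column-major indexing into a flat backing array. */
final class ColumnMajorIndexSketch {

    /** Index of element (i, j) in a column-major array with numRows rows. */
    static int index(int i, int j, int numRows) {
        return i + numRows * j;
    }

    public static void main(String[] args) {
        // A 2 x 3 matrix stored column by column:
        // | 1.0 3.0 5.0 |
        // | 2.0 4.0 6.0 |
        double[] values = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
        int numRows = 2;
        double element = values[index(1, 2, numRows)]; // row 1, column 2
        System.out.println(element);
    }
}
```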
- index(int, int) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- index() - Method in class org.apache.spark.mllib.rdd.RandomRDDPartition
-
- index() - Method in class org.apache.spark.mllib.rdd.SlidingRDDPartition
-
- index() - Method in interface org.apache.spark.Partition
-
Get the partition's index within its parent RDD.
- index() - Method in class org.apache.spark.rdd.BlockRDDPartition
-
- index() - Method in class org.apache.spark.rdd.CartesianPartition
-
- index() - Method in class org.apache.spark.rdd.CheckpointRDDPartition
-
- index() - Method in class org.apache.spark.rdd.CoalescedRDDPartition
-
- index() - Method in class org.apache.spark.rdd.CoGroupPartition
-
- index() - Method in class org.apache.spark.rdd.HadoopPartition
-
- index() - Method in class org.apache.spark.rdd.JdbcPartition
-
- index() - Method in class org.apache.spark.rdd.NewHadoopPartition
-
- index() - Method in class org.apache.spark.rdd.ParallelCollectionPartition
-
- index() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDDPartition
-
- index() - Method in class org.apache.spark.rdd.PartitionPruningRDDPartition
-
- index() - Method in class org.apache.spark.rdd.PartitionwiseSampledRDDPartition
-
- index() - Method in class org.apache.spark.rdd.SampledRDDPartition
-
- index() - Method in class org.apache.spark.rdd.ShuffledRDDPartition
-
- index() - Method in class org.apache.spark.rdd.UnionPartition
-
- index() - Method in class org.apache.spark.rdd.ZippedPartitionsPartition
-
- index() - Method in class org.apache.spark.rdd.ZippedWithIndexRDDPartition
-
- index() - Method in class org.apache.spark.scheduler.TaskDescription
-
- index() - Method in class org.apache.spark.scheduler.TaskInfo
-
- index() - Method in class org.apache.spark.sql.jdbc.JDBCPartition
-
- index() - Method in class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
-
- index() - Method in class org.apache.spark.sql.parquet.CatalystArrayConverter
-
- index() - Method in class org.apache.spark.sql.parquet.CatalystNativeArrayConverter
-
- index() - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
-
- index() - Method in class org.apache.spark.streaming.kafka.KafkaRDDPartition
-
- index() - Method in class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDDPartition
-
- index2term(long) - Static method in class org.apache.spark.mllib.clustering.LDA
-
- IndexedRow - Class in org.apache.spark.mllib.linalg.distributed
-
- IndexedRow(long, Vector) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRow
-
- IndexedRowMatrix - Class in org.apache.spark.mllib.linalg.distributed
-
- IndexedRowMatrix(RDD<IndexedRow>, long, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
- IndexedRowMatrix(RDD<IndexedRow>) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Alternative constructor leaving matrix dimensions to be determined automatically.
- indexOf(Object) - Method in class org.apache.spark.mllib.feature.HashingTF
-
Returns the index of the input term.
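The hashing trick behind a term index can be sketched as a non-negative modulus of the term's hash code. Using Object.hashCode and the helper name nonNegativeMod are assumptions made for illustration; the hash function actually used by this class may differ.

```java
/** Plain-Java sketch of mapping a term to a feature index by hashing. */
final class HashingIndexSketch {

    /** Like x % mod, but always in the range [0, mod). */
    static int nonNegativeMod(int x, int mod) {
        int raw = x % mod;
        return raw < 0 ? raw + mod : raw;
    }

    /** Maps a term to an index in [0, numFeatures) using its hash code (assumed hash). */
    static int indexOf(Object term, int numFeatures) {
        return nonNegativeMod(term.hashCode(), numFeatures);
    }

    public static void main(String[] args) {
        System.out.println(indexOf("spark", 1 << 20));
    }
}
```

Distinct terms can collide on the same index; a larger numFeatures lowers the collision rate at the cost of a wider vector.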
- indexSize() - Method in class org.apache.spark.graphx.impl.EdgePartition
-
The number of unique source vertices in the partition.
- indexToLevel(int) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Return the level of a tree which the given node is in.
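Assuming nodes are numbered like a binary heap starting at 1 (root = 1, children of node n are 2n and 2n + 1), the level of a node is the position of the highest set bit of its index. The numbering scheme is an assumption made for this sketch, and the class name is hypothetical.

```java
/** Plain-Java sketch of mapping a heap-style node index to its tree level. */
final class NodeIndexSketch {

    /** Level of the node with the given 1-based index; the root (index 1) is at level 0. */
    static int indexToLevel(int nodeIndex) {
        if (nodeIndex <= 0) {
            throw new IllegalArgumentException(nodeIndex + " is not a valid node index.");
        }
        return Integer.numberOfTrailingZeros(Integer.highestOneBit(nodeIndex));
    }

    public static void main(String[] args) {
        for (int i = 1; i <= 7; i++) {
            System.out.println("node " + i + " is at level " + indexToLevel(i));
        }
    }
}
```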
- indices() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- IndirectTaskResult<T> - Class in org.apache.spark.scheduler
-
A reference to a DirectTaskResult that has been stored in the worker's BlockManager.
- IndirectTaskResult(BlockId, int) - Constructor for class org.apache.spark.scheduler.IndirectTaskResult
-
- inferPartitionColumnValue(String, String) - Static method in class org.apache.spark.sql.parquet.ParquetRelation2
-
Converts a string to a Literal with automatic type inference.
- inferSchema(RDD<String>, double, String) - Static method in class org.apache.spark.sql.json.JsonRDD
-
- infoGain() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
-
- InformationGainStats - Class in org.apache.spark.mllib.tree.model
-
:: DeveloperApi ::
Information gain statistics for each split.
- InformationGainStats(double, double, double, double, Predict, Predict) - Constructor for class org.apache.spark.mllib.tree.model.InformationGainStats
-
- init(RDD<BaggedPoint<TreePoint>>, int, int, int) - Static method in class org.apache.spark.mllib.tree.impl.NodeIdCache
-
Initialize the node Id cache with initial node Id values.
- init(Configuration, Map<String, String>, MessageType) - Method in class org.apache.spark.sql.parquet.RowReadSupport
-
- init(Configuration) - Method in class org.apache.spark.sql.parquet.RowWriteSupport
-
- init(Configuration) - Method in class org.apache.spark.sql.parquet.TestGroupWriteSupport
-
- initDegreeVector(Graph<Object, Object>) - Static method in class org.apache.spark.mllib.clustering.PowerIterationClustering
-
Generates the degree vector as the vertex properties (v0) to start power iteration.
- initEventLog(OutputStream) - Static method in class org.apache.spark.scheduler.EventLoggingListener
-
Write metadata about an event log to the given stream.
- initFrom(Iterator<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.VertexPartitionBase
-
Construct the constituents of a VertexPartitionBase from the given vertices, merging duplicate
entries arbitrarily.
- initFrom(Iterator<Tuple2<Object, VD>>, Function2<VD, VD, VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.VertexPartitionBase
-
Construct the constituents of a VertexPartitionBase from the given vertices, merging duplicate
entries using mergeFunc.
- INITIAL_ARRAY_SIZE() - Static method in class org.apache.spark.sql.parquet.CatalystArrayConverter
-
- initialCheckpoint() - Method in class org.apache.spark.streaming.StreamingContext
-
- initialHash() - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- initialize(boolean, SparkConf, SecurityManager) - Method in interface org.apache.spark.broadcast.BroadcastFactory
-
- initialize(boolean, SparkConf, SecurityManager) - Static method in class org.apache.spark.broadcast.HttpBroadcast
-
- initialize(boolean, SparkConf, SecurityManager) - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
-
- initialize(boolean, SparkConf, SecurityManager) - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
-
- initialize() - Method in class org.apache.spark.HttpFileServer
-
- initialize(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.FixedLengthBinaryRecordReader
-
- initialize(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.StreamBasedRecordReader
-
- initialize(InputSplit, TaskAttemptContext) - Method in class org.apache.spark.input.WholeTextFileRecordReader
-
- initialize() - Method in class org.apache.spark.metrics.MetricsConfig
-
- initialize(SchedulerBackend) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- initialize(int, String, boolean) - Method in class org.apache.spark.sql.columnar.BasicColumnBuilder
-
- initialize() - Method in interface org.apache.spark.sql.columnar.ColumnAccessor
-
- initialize(int, String, boolean) - Method in interface org.apache.spark.sql.columnar.ColumnBuilder
-
Initializes with an approximate lower bound on the expected number of elements in this column.
- initialize() - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnAccessor
-
- initialize(int, String, boolean) - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
-
- initialize() - Method in interface org.apache.spark.sql.columnar.NullableColumnAccessor
-
- initialize(int, String, boolean) - Method in interface org.apache.spark.sql.columnar.NullableColumnBuilder
-
- initialize(String) - Method in class org.apache.spark.storage.BlockManager
-
Initializes the BlockManager with the given appId.
- initialize(Time) - Method in class org.apache.spark.streaming.dstream.DStream
-
Initialize the DStream by setting the "zero" time, based on which
the validity of future times is calculated.
- initialize(String) - Method in class org.apache.spark.streaming.kinesis.KinesisRecordProcessor
-
The Kinesis Client Library calls this method during IRecordProcessor initialization.
- initialize() - Method in class org.apache.spark.ui.SparkUI
-
Initialize all components of the server.
- initialize() - Method in class org.apache.spark.ui.WebUI
-
Initialize all components of the server.
- Initialized() - Static method in class org.apache.spark.rdd.CheckpointState
-
- Initialized() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor.ReceiverState
-
- Initialized() - Method in class org.apache.spark.streaming.StreamingContext.StreamingContextState$
-
- initializeIfNecessary() - Method in interface org.apache.spark.Logging
-
- initializeLocalJobConfFunc(String, TableDesc, JobConf) - Static method in class org.apache.spark.sql.hive.HadoopTableReader
-
Curried.
- initializeLogging() - Method in interface org.apache.spark.Logging
-
- initialValue() - Method in class org.apache.spark.partial.PartialResult
-
- initInputSerDe(Seq<Expression>) - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- initInputSoi(AbstractSerDe, Seq<String>, Seq<DataType>) - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- initLocalProperties() - Method in class org.apache.spark.SparkContext
-
- initNextRecordReader() - Method in class org.apache.spark.input.ConfigurableCombineFileRecordReader
-
- initOutputputSoi(AbstractSerDe) - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- initOutputSerDe(Seq<Attribute>) - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- initSerDe(String, Seq<String>, Seq<DataType>, Seq<Tuple2<String, String>>) - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- InMemoryColumnarTableScan - Class in org.apache.spark.sql.columnar
-
- InMemoryColumnarTableScan(Seq<Attribute>, Seq<Expression>, InMemoryRelation) - Constructor for class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
-
- inMemoryPartitionPruning() - Method in class org.apache.spark.sql.SQLConf
-
When set to true, partition pruning for in-memory columnar tables is enabled.
- InMemoryRelation - Class in org.apache.spark.sql.columnar
-
- InMemoryRelation(Seq<Attribute>, boolean, int, StorageLevel, SparkPlan, Option<String>, RDD<CachedBatch>, Statistics) - Constructor for class org.apache.spark.sql.columnar.InMemoryRelation
-
- InnerClosureFinder - Class in org.apache.spark.util
-
- InnerClosureFinder(Set<Class<?>>) - Constructor for class org.apache.spark.util.InnerClosureFinder
-
- innerJoin(EdgeRDD<ED2>, Function4<Object, Object, ED, ED2, ED3>, ClassTag<ED2>, ClassTag<ED3>) - Method in class org.apache.spark.graphx.EdgeRDD
-
Inner joins this EdgeRDD with another EdgeRDD, assuming both are partitioned using the same
PartitionStrategy.
- innerJoin(EdgePartition<ED2, ?>, Function4<Object, Object, ED, ED2, ED3>, ClassTag<ED2>, ClassTag<ED3>) - Method in class org.apache.spark.graphx.impl.EdgePartition
-
Apply f to all edges present in both this and other and return a new EdgePartition containing the resulting edges.
- innerJoin(EdgeRDD<ED2>, Function4<Object, Object, ED, ED2, ED3>, ClassTag<ED2>, ClassTag<ED3>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- innerJoin(Self, Function3<Object, VD, U, VD2>, ClassTag<U>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
-
Inner join another VertexPartition.
- innerJoin(Iterator<Product2<Object, U>>, Function3<Object, VD, U, VD2>, ClassTag<U>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
-
Inner join an iterator of messages.
- innerJoin(RDD<Tuple2<Object, U>>, Function3<Object, VD, U, VD2>, ClassTag<U>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- innerJoin(RDD<Tuple2<Object, U>>, Function3<Object, VD, U, VD2>, ClassTag<U>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
-
Inner joins this VertexRDD with an RDD containing vertex attribute pairs.
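The inner-join semantics can be sketched with two in-memory maps keyed by vertex id: only ids present on both sides survive, and a function of (id, left attribute, right attribute) produces the new attribute. The Function3 interface and class name below are hypothetical stand-ins, and the sketch assumes at most one pair per vertex id on the right-hand side.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Plain-Java sketch of inner-joining vertex attributes by vertex id. */
final class VertexInnerJoinSketch {

    /** Hypothetical three-argument function: (vertexId, leftAttr, rightAttr) -> result. */
    interface Function3<A, B, C, R> {
        R apply(A a, B b, C c);
    }

    static <VD, U, VD2> Map<Long, VD2> innerJoin(
            Map<Long, VD> vertices, Map<Long, U> other, Function3<Long, VD, U, VD2> f) {
        Map<Long, VD2> joined = new LinkedHashMap<>();
        for (Map.Entry<Long, VD> v : vertices.entrySet()) {
            Long id = v.getKey();
            // Keep only vertex ids that appear on both sides.
            if (other.containsKey(id)) {
                joined.put(id, f.apply(id, v.getValue(), other.get(id)));
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        Map<Long, String> names = new LinkedHashMap<>();
        names.put(1L, "a");
        names.put(2L, "b");
        Map<Long, Integer> scores = new LinkedHashMap<>();
        scores.put(2L, 10);
        scores.put(3L, 20);
        Map<Long, String> joined = innerJoin(names, scores, (id, name, score) -> name + ":" + score);
        System.out.println(joined);
    }
}
```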
- innerJoinKeepLeft(Iterator<Product2<Object, VD>>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
-
Similar to innerJoin, but vertices from the left side that don't appear in iter will remain in
the partition, hidden by the bitmask.
- innerZipJoin(VertexRDD<U>, Function3<Object, VD, U, VD2>, ClassTag<U>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- innerZipJoin(VertexRDD<U>, Function3<Object, VD, U, VD2>, ClassTag<U>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
-
Efficiently inner joins this VertexRDD with another VertexRDD sharing the same index.
- input() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
-
- INPUT() - Static method in class org.apache.spark.ui.ToolTips
-
- inputBytes() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
-
- inputBytes() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
-
- inputCol() - Method in interface org.apache.spark.ml.param.HasInputCol
-
Param for input column name.
- inputDStream() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
-
- inputDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
-
- InputDStream<T> - Class in org.apache.spark.streaming.dstream
-
This is the abstract base class for all input streams.
- InputDStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.InputDStream
-
- inputFormatClazz() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- inputFormatClazz() - Method in class org.apache.spark.scheduler.SplitInfo
-
- InputFormatInfo - Class in org.apache.spark.scheduler
-
:: DeveloperApi ::
Parses and holds information about inputFormat (and files) specified as a parameter.
- InputFormatInfo(Configuration, Class<?>, String) - Constructor for class org.apache.spark.scheduler.InputFormatInfo
-
- inputMetrics() - Method in class org.apache.spark.storage.BlockResult
-
- inputMetricsFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- inputMetricsToJson(InputMetrics) - Static method in class org.apache.spark.util.JsonProtocol
-
- inputProjection() - Method in class org.apache.spark.sql.hive.HiveUdafFunction
-
- inputRecords() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
-
- inputRecords() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
-
- inputRowFormat() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- inputRowFormatMap() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- inputSerdeClass() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- inputSerdeProps() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- inputSplit() - Method in class org.apache.spark.rdd.HadoopPartition
-
- inputSplitWithLocationInfo() - Method in class org.apache.spark.rdd.HadoopRDD.SplitInfoReflections
-
- insert(DataFrame, boolean) - Method in class org.apache.spark.sql.json.JSONRelation
-
- insert(DataFrame, boolean) - Method in class org.apache.spark.sql.parquet.ParquetRelation2
-
- insert(DataFrame, boolean) - Method in interface org.apache.spark.sql.sources.InsertableRelation
-
- InsertableRelation - Interface in org.apache.spark.sql.sources
-
:: DeveloperApi ::
A BaseRelation that can be used to insert data into it through the insert method.
- insertInto(String, boolean) - Method in class org.apache.spark.sql.DataFrame
-
:: Experimental ::
Adds the rows from this RDD to the specified table, optionally overwriting the existing data.
- insertInto(String) - Method in class org.apache.spark.sql.DataFrame
-
:: Experimental ::
Adds the rows from this RDD to the specified table.
- InsertIntoDataSource - Class in org.apache.spark.sql.sources
-
- InsertIntoDataSource(LogicalRelation, LogicalPlan, boolean) - Constructor for class org.apache.spark.sql.sources.InsertIntoDataSource
-
- InsertIntoHiveTable - Class in org.apache.spark.sql.hive.execution
-
- InsertIntoHiveTable(MetastoreRelation, Map<String, Option<String>>, SparkPlan, boolean) - Constructor for class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
-
- InsertIntoHiveTable - Class in org.apache.spark.sql.hive
-
A logical plan representing insertion into Hive table.
- InsertIntoHiveTable(LogicalPlan, Map<String, Option<String>>, LogicalPlan, boolean) - Constructor for class org.apache.spark.sql.hive.InsertIntoHiveTable
-
- insertIntoJDBC(String, String, boolean) - Method in class org.apache.spark.sql.DataFrame
-
Save this RDD to a JDBC database at url under the table name table.
- InsertIntoParquetTable - Class in org.apache.spark.sql.parquet
-
:: DeveloperApi ::
Operator that acts as a sink for queries on RDDs and can be used to
store the output inside a directory of Parquet files.
- InsertIntoParquetTable(ParquetRelation, SparkPlan, boolean) - Constructor for class org.apache.spark.sql.parquet.InsertIntoParquetTable
-
- inShutdown() - Static method in class org.apache.spark.util.Utils
-
Detect whether this thread might be executing a shutdown hook.
- inspectorToDataType(ObjectInspector) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- instance() - Method in class org.apache.spark.metrics.MetricsSystem
-
- instance() - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
-
Get this impurity instance.
- instance() - Static method in class org.apache.spark.mllib.tree.impurity.Gini
-
Get this impurity instance.
- instance() - Static method in class org.apache.spark.mllib.tree.impurity.Variance
-
Get this impurity instance.
- INT - Class in org.apache.spark.sql.columnar
-
- INT() - Constructor for class org.apache.spark.sql.columnar.INT
-
- intAccumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator integer variable, which tasks can "add" values to using the add method.
- intAccumulator(int, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an Accumulator integer variable, which tasks can "add" values to using the add method.
- IntColumnAccessor - Class in org.apache.spark.sql.columnar
-
- IntColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.IntColumnAccessor
-
- IntColumnBuilder - Class in org.apache.spark.sql.columnar
-
- IntColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.IntColumnBuilder
-
- IntColumnStats - Class in org.apache.spark.sql.columnar
-
- IntColumnStats() - Constructor for class org.apache.spark.sql.columnar.IntColumnStats
-
- IntDelta - Class in org.apache.spark.sql.columnar.compression
-
- IntDelta() - Constructor for class org.apache.spark.sql.columnar.compression.IntDelta
-
- IntDelta.Decoder - Class in org.apache.spark.sql.columnar.compression
-
- IntDelta.Decoder(ByteBuffer, NativeColumnType<IntegerType$>) - Constructor for class org.apache.spark.sql.columnar.compression.IntDelta.Decoder
-
- IntDelta.Encoder - Class in org.apache.spark.sql.columnar.compression
-
- IntDelta.Encoder() - Constructor for class org.apache.spark.sql.columnar.compression.IntDelta.Encoder
-
- IntegerConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
-
Accessor for nested Scala object
- INTER_JOB_WAIT_MS() - Static method in class org.apache.spark.ui.UIWorkloadGenerator
-
- intercept() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-
- intercept() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
-
- intercept() - Method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$.Data
-
- intercept() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
- intercept() - Method in class org.apache.spark.mllib.classification.SVMModel
-
- intercept() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
-
- intercept() - Method in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$.Data
-
- intercept() - Method in class org.apache.spark.mllib.regression.LassoModel
-
- intercept() - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
-
- intercept() - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
-
- internalMap() - Method in class org.apache.spark.util.TimeStampedHashSet
-
- InterruptibleIterator<T> - Class in org.apache.spark
-
:: DeveloperApi ::
An iterator that wraps around an existing iterator to provide task killing functionality.
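The wrapping idea described above can be sketched in plain Java, without any Spark dependency. The class name KillableIterator, the AtomicBoolean flag, and the IllegalStateException are all assumptions made for illustration; they are not Spark's actual types or behavior.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in: an iterator that refuses to continue once a
// "killed" flag is set. The flag plays the role of a task's kill signal.
final class KillableIterator<T> implements Iterator<T> {
    private final AtomicBoolean killed;
    private final Iterator<T> delegate;

    KillableIterator(AtomicBoolean killed, Iterator<T> delegate) {
        this.killed = killed;
        this.delegate = delegate;
    }

    @Override
    public boolean hasNext() {
        // Check the flag before every element so a kill takes effect promptly.
        if (killed.get()) {
            throw new IllegalStateException("task was killed");
        }
        return delegate.hasNext();
    }

    @Override
    public T next() {
        return delegate.next();
    }

    // Consumes a four-element iterator, requesting a kill after two elements,
    // and returns how many elements were consumed before iteration stopped.
    static int demo() {
        AtomicBoolean killed = new AtomicBoolean(false);
        Iterator<Integer> it =
            new KillableIterator<>(killed, Arrays.asList(1, 2, 3, 4).iterator());
        int consumed = 0;
        try {
            while (it.hasNext()) {
                it.next();
                consumed++;
                if (consumed == 2) {
                    killed.set(true); // simulate a kill request
                }
            }
        } catch (IllegalStateException e) {
            // Iteration was cut short by the kill flag.
        }
        return consumed;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Checking the flag in hasNext() rather than next() means a consumer loop stops at the next element boundary after the kill request.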
- InterruptibleIterator(TaskContext, Iterator<T>) - Constructor for class org.apache.spark.InterruptibleIterator
-
- interruptThread() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask
-
- interruptThread() - Method in class org.apache.spark.scheduler.local.KillTask
-
- intersect(DataFrame) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame containing only the rows that appear in both this frame and another frame.
- intersection(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return the intersection of this RDD and another one.
- intersection(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return the intersection of this RDD and another one.
- intersection(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
-
Return the intersection of this RDD and another one.
- intersection(RDD<T>) - Method in class org.apache.spark.rdd.RDD
-
Return the intersection of this RDD and another one.
- intersection(RDD<T>, Partitioner, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return the intersection of this RDD and another one.
- intersection(RDD<T>, int) - Method in class org.apache.spark.rdd.RDD
-
Return the intersection of this RDD and another one.
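As a rough illustration of the intersection semantics, here is a plain-Java sketch over in-memory collections. It does not use Spark; the class and method names are hypothetical. It assumes, as Spark's fuller method documentation states, that the result contains no duplicates even when the inputs do.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical local model of an intersection that de-duplicates its output.
final class IntersectionSketch {
    static <T> Set<T> intersection(Collection<T> left, Collection<T> right) {
        // Copying into a set drops duplicates from the left input.
        Set<T> result = new LinkedHashSet<>(left);
        // Keep only the elements that also occur in the right input.
        result.retainAll(new LinkedHashSet<>(right));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(intersection(List.of(1, 2, 2, 3), List.of(2, 3, 3, 4)));
    }
}
```

On a cluster the same operation generally requires a shuffle, which is why the overloads above accept a Partitioner or a partition count.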
- Interval - Class in org.apache.spark.streaming
-
- Interval(Time, Time) - Constructor for class org.apache.spark.streaming.Interval
-
- Interval(long, long) - Constructor for class org.apache.spark.streaming.Interval
-
- INTERVAL_DEFAULT() - Static method in class org.apache.spark.util.logging.RollingFileAppender
-
- INTERVAL_PROPERTY() - Static method in class org.apache.spark.util.logging.RollingFileAppender
-
- IntParam - Class in org.apache.spark.ml.param
-
Specialized version of Param[Int] for Java.
- IntParam(Params, String, String, Option<Object>) - Constructor for class org.apache.spark.ml.param.IntParam
-
- IntParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.IntParam
-
- IntParam - Class in org.apache.spark.util
-
An extractor object for parsing strings into integers.
- IntParam() - Constructor for class org.apache.spark.util.IntParam
-
- intRddToDataFrameHolder(RDD<Object>) - Method in class org.apache.spark.sql.SQLContext.implicits
-
Creates a single column DataFrame from an RDD[Int].
- intToIntWritable(int) - Static method in class org.apache.spark.SparkContext
-
- intWritableConverter() - Static method in class org.apache.spark.SparkContext
-
- intWritableConverter() - Static method in class org.apache.spark.WritableConverter
-
- intWritableFactory() - Static method in class org.apache.spark.WritableFactory
-
- invalidateCache(LogicalPlan) - Method in class org.apache.spark.sql.CacheManager
-
Invalidates the cache of any data that contains plan.
- invalidateTable(String, String) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
-
- invalidInformationGainStats() - Static method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
An InformationGainStats object to denote that the current split doesn't satisfy the minimum info gain or the minimum number of instances per node.
- invoke(Class<?>, Object, String, Seq<Tuple2<Class<?>, Object>>) - Static method in class org.apache.spark.util.Utils
-
- invokedMethod(Object, Class<?>, String) - Static method in class org.apache.spark.graphx.util.BytecodeUtils
-
Test whether the given closure invokes the specified method in the specified class.
- invokeWriteReplace(Object) - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
-
- ioschema() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
-
- isActive(long) - Method in class org.apache.spark.graphx.impl.EdgePartition
-
Look up vid in activeSet, throwing an exception if it is None.
- isActive() - Method in class org.apache.spark.util.EventLoop
-
Return whether the event thread has already been started but not yet stopped.
- isAkkaConf(String) - Static method in class org.apache.spark.SparkConf
-
Return whether the given config is an akka config (e.g.
- isAllowed(Enumeration.Value, Enumeration.Value) - Static method in class org.apache.spark.scheduler.TaskLocality
-
- isAuthenticationEnabled() - Method in class org.apache.spark.SecurityManager
-
Check to see if authentication for the Spark communication protocols is enabled
- isAvailable() - Method in class org.apache.spark.scheduler.Stage
-
- isBindCollision(Throwable) - Static method in class org.apache.spark.util.Utils
-
Return whether the exception is caused by an address-port collision when binding.
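One plausible way to detect such a collision is to walk the exception's cause chain looking for a java.net.BindException. The sketch below is an assumption about the general technique, not a copy of the Spark implementation, and the class and method names are hypothetical.

```java
import java.net.BindException;

// Hypothetical helper: reports whether a throwable, or any of its causes,
// is a BindException (typically raised when an address/port is in use).
final class BindCollisionSketch {
    static boolean causedByBind(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur instanceof BindException) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Throwable wrapped = new RuntimeException(new BindException("Address already in use"));
        System.out.println(causedByBind(wrapped));
    }
}
```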
- isBroadcast() - Method in class org.apache.spark.storage.BlockId
-
- isCached(String) - Method in class org.apache.spark.sql.CacheManager
-
Returns true if the table is currently cached in-memory.
- isCached(String) - Method in class org.apache.spark.sql.SQLContext
-
Returns true if the table is currently cached in-memory.
- isCached() - Method in class org.apache.spark.storage.BlockStatus
-
- isCached() - Method in class org.apache.spark.storage.RDDInfo
-
- isCancelled() - Method in class org.apache.spark.ComplexFutureAction
-
- isCancelled() - Method in interface org.apache.spark.FutureAction
-
Returns whether the action has been cancelled.
- isCancelled() - Method in class org.apache.spark.JavaFutureActionWrapper
-
- isCancelled() - Method in class org.apache.spark.SimpleFutureAction
-
- isCategorical(int) - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
-
- isCheckpointed() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return whether this RDD has been checkpointed or not
- isCheckpointed() - Method in class org.apache.spark.graphx.Graph
-
Return whether this Graph has been checkpointed or not.
- isCheckpointed() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- isCheckpointed() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- isCheckpointed() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- isCheckpointed() - Method in class org.apache.spark.rdd.RDD
-
Return whether this RDD has been checkpointed or not
- isCheckpointed() - Method in class org.apache.spark.rdd.RDDCheckpointData
-
- isCheckpointPresent() - Method in class org.apache.spark.streaming.StreamingContext
-
- isClassification() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
-
- isCompleted() - Method in class org.apache.spark.ComplexFutureAction
-
- isCompleted() - Method in interface org.apache.spark.FutureAction
-
Returns whether the action has already been completed with a value or an exception.
- isCompleted() - Method in class org.apache.spark.SimpleFutureAction
-
- isCompleted() - Method in class org.apache.spark.TaskContext
-
Returns true if the task has completed.
- isCompleted() - Method in class org.apache.spark.TaskContextImpl
-
- isContinuous(int) - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
-
- isDefined(long) - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
-
- isDocumentVertex(Tuple2<Object, ?>) - Static method in class org.apache.spark.mllib.clustering.LDA
-
- isDone() - Method in class org.apache.spark.JavaFutureActionWrapper
-
- isDriver() - Method in class org.apache.spark.broadcast.BroadcastManager
-
- isDriver() - Method in class org.apache.spark.storage.BlockManagerId
-
- isEmpty() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- isEmpty() - Method in class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
-
- isEmpty() - Method in class org.apache.spark.rdd.RDD
-
- isEmpty() - Method in class org.apache.spark.sql.CacheManager
-
Checks if the cache is empty.
- isEventLogEnabled() - Method in class org.apache.spark.SparkContext
-
- isExecutorAlive(String) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- isExecutorStartupConf(String) - Static method in class org.apache.spark.SparkConf
-
Return whether the given config should be passed to an executor on start-up.
- isExtended() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
-
- isExtended() - Method in class org.apache.spark.sql.sources.DescribeCommand
-
- isFairScheduler() - Method in class org.apache.spark.ui.jobs.JobsTab
-
- isFairScheduler() - Method in class org.apache.spark.ui.jobs.StagesTab
-
- isFatalError(Throwable) - Static method in class org.apache.spark.util.Utils
-
Returns true if the given exception was fatal.
- isFinished(Protos.TaskState) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
Check whether a Mesos task state represents a finished task
- isFinished(Enumeration.Value) - Static method in class org.apache.spark.TaskState
-
- isInitialized() - Method in class org.apache.spark.streaming.dstream.DStream
-
- isInitialValueFinal() - Method in class org.apache.spark.partial.PartialResult
-
- isInMemory() - Method in class org.apache.spark.rdd.HadoopRDD.SplitInfoReflections
-
- isInterrupted() - Method in class org.apache.spark.TaskContext
-
Returns true if the task has been killed.
- isInterrupted() - Method in class org.apache.spark.TaskContextImpl
-
- isLeaf() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
-
- isLeaf() - Method in class org.apache.spark.mllib.tree.model.Node
-
- isLeftChild(int) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Returns true if this is a left child.
- isLocal() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- isLocal() - Method in class org.apache.spark.SparkContext
-
- isLocal() - Method in class org.apache.spark.sql.DataFrame
-
Returns true if the collect and take methods can be run locally (without any Spark executors).
- isLocal() - Method in class org.apache.spark.storage.BlockManagerMasterActor
-
- isLogManagerEnabled() - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
-
Check if the log manager is enabled.
- isMac() - Static method in class org.apache.spark.util.Utils
-
Whether the underlying operating system is Mac OS X.
- isMulticlass() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
-
- isMulticlassClassification() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- isMulticlassWithCategoricalFeatures() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- isMulticlassWithCategoricalFeatures() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
-
- isMultipleOf(Duration) - Method in class org.apache.spark.streaming.Duration
-
- isMultipleOf(Duration) - Method in class org.apache.spark.streaming.Time
-
- isNotNull() - Method in class org.apache.spark.sql.Column
-
True if the current expression is NOT null.
- IsNotNull - Class in org.apache.spark.sql.sources
-
- IsNotNull(String) - Constructor for class org.apache.spark.sql.sources.IsNotNull
-
- isNull() - Method in class org.apache.spark.sql.Column
-
True if the current expression is null.
- IsNull - Class in org.apache.spark.sql.sources
-
- IsNull(String) - Constructor for class org.apache.spark.sql.sources.IsNull
-
- isOpen() - Method in class org.apache.spark.storage.BlockObjectWriter
-
- isOpen() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
-
- isotonic() - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
-
- IsotonicRegression - Class in org.apache.spark.mllib.regression
-
:: Experimental ::
- IsotonicRegression() - Constructor for class org.apache.spark.mllib.regression.IsotonicRegression
-
Constructs IsotonicRegression instance with default parameter isotonic = true.
- IsotonicRegressionModel - Class in org.apache.spark.mllib.regression
-
:: Experimental ::
- IsotonicRegressionModel(double[], double[], boolean) - Constructor for class org.apache.spark.mllib.regression.IsotonicRegressionModel
-
- isParquetBinaryAsString() - Method in class org.apache.spark.sql.SQLConf
-
When set to true, we always treat byte arrays in Parquet files as strings.
- isParquetINT96AsTimestamp() - Method in class org.apache.spark.sql.SQLConf
-
When set to true, we always treat INT96 values in Parquet files as timestamps.
- isPartitioned() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
-
- isPrimitiveType(DataType) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
-
- isRDD() - Method in class org.apache.spark.storage.BlockId
-
- isReady() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- isReady() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
- isReceiverStarted() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
-
Check if the receiver has been marked as started.
- isReceiverStopped() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
-
Check if receiver has been marked for stopping
- isRegistered() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- isRegistered() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- isRoot() - Method in class org.apache.spark.mllib.fpm.FPTree.Node
-
- isRunningInYarnContainer(SparkConf) - Static method in class org.apache.spark.util.Utils
-
- isRunningLocally() - Method in class org.apache.spark.TaskContext
-
Returns true if the task is running locally in the driver program.
- isRunningLocally() - Method in class org.apache.spark.TaskContextImpl
-
- isSet(Param<?>) - Method in interface org.apache.spark.ml.param.Params
-
Checks whether a param is explicitly set.
- isShuffle() - Method in class org.apache.spark.storage.BlockId
-
- isShuffleMap() - Method in class org.apache.spark.scheduler.Stage
-
- isSparkPortConf(String) - Static method in class org.apache.spark.SparkConf
-
Return true if the given config matches either spark.*.port or spark.port.*.
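The two patterns named above, spark.*.port and spark.port.*, can be checked with simple prefix and suffix tests. This is a minimal sketch under that reading of the description; the class and method names are hypothetical and the code does not depend on Spark.

```java
// Hypothetical matcher for the two key patterns in the description.
final class PortConfSketch {
    static boolean isPortConf(String name) {
        // spark.*.port: starts with "spark." and ends with ".port"
        boolean wildcardPort = name.startsWith("spark.") && name.endsWith(".port");
        // spark.port.*: any key under the "spark.port." prefix
        boolean portPrefix = name.startsWith("spark.port.");
        return wildcardPort || portPrefix;
    }

    public static void main(String[] args) {
        System.out.println(isPortConf("spark.driver.port"));
        System.out.println(isPortConf("spark.executor.memory"));
    }
}
```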
- isSplitable(JobContext, Path) - Method in class org.apache.spark.input.FixedLengthBinaryInputFormat
-
Override of isSplitable to ensure initial computation of the record length
- isStarted() - Method in class org.apache.spark.streaming.receiver.Receiver
-
Check if the receiver has started or not.
- isStopped() - Method in class org.apache.spark.SparkEnv
-
- isStopped() - Method in class org.apache.spark.streaming.receiver.Receiver
-
Check if receiver has been marked for stopping.
- isSymlink(File) - Static method in class org.apache.spark.util.Utils
-
Check to see if file is a symbolic link.
- isTermVertex(Tuple2<Object, ?>) - Static method in class org.apache.spark.mllib.clustering.LDA
-
- isTesting() - Static method in class org.apache.spark.util.Utils
-
Indicates whether Spark is currently running unit tests.
- isTimeValid(Time) - Method in class org.apache.spark.streaming.dstream.DStream
-
Checks whether the 'time' is valid wrt slideDuration for generating RDD
- isTimeValid(Time) - Method in class org.apache.spark.streaming.dstream.InputDStream
-
Checks whether the 'time' is valid wrt slideDuration for generating RDD.
- isTraceEnabled() - Method in interface org.apache.spark.Logging
-
- isTransposed() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- isTransposed() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Flag that keeps track whether the matrix is transposed or not.
- isTransposed() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- isUDAFBridgeRequired() - Method in class org.apache.spark.sql.hive.HiveUdafFunction
-
- isUnordered(int) - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
-
- isValid() - Method in class org.apache.spark.broadcast.Broadcast
-
Whether this Broadcast is actually usable.
- isValid() - Method in class org.apache.spark.rdd.BlockRDD
-
Whether this BlockRDD is actually usable.
- isValid() - Method in class org.apache.spark.storage.StorageLevel
-
- isWindows() - Static method in class org.apache.spark.util.Utils
-
Whether the underlying operating system is Windows.
- isWorthCompressing(Encoder<T>) - Method in interface org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder
-
- isZero() - Method in class org.apache.spark.streaming.Duration
-
- isZombie() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- it() - Method in class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
-
- item() - Method in class org.apache.spark.ml.recommendation.ALS.Rating
-
- item() - Method in class org.apache.spark.mllib.fpm.FPTree.Node
-
- item() - Method in class org.apache.spark.streaming.receiver.SingleItemData
-
- itemCol() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
Param for the column name for item ids.
- items() - Method in class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset
-
- iterationTimes() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
- iterator(Partition, TaskContext) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Internal method to this RDD; will read from cache if applicable, or otherwise compute it.
- iterator() - Method in class org.apache.spark.graphx.impl.EdgePartition
-
Get an iterator over the edges in this partition.
- iterator() - Method in class org.apache.spark.graphx.impl.RoutingTablePartition
-
Returns an iterator over all vertex ids stored in this `RoutingTablePartition`.
- iterator() - Method in class org.apache.spark.graphx.impl.VertexAttributeBlock
-
- iterator() - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
-
- iterator() - Method in class org.apache.spark.rdd.ParallelCollectionPartition
-
- iterator(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
-
Internal method to this RDD; will read from cache if applicable, or otherwise compute it.
- iterator() - Method in class org.apache.spark.sql.sources.CaseInsensitiveMap
-
- iterator() - Method in class org.apache.spark.storage.IteratorValues
-
- iterator() - Method in class org.apache.spark.streaming.receiver.IteratorBlock
-
- iterator() - Method in class org.apache.spark.streaming.receiver.IteratorData
-
- iterator() - Method in class org.apache.spark.util.BoundedPriorityQueue
-
- iterator() - Method in class org.apache.spark.util.TimeStampedHashMap
-
- iterator() - Method in class org.apache.spark.util.TimeStampedHashSet
-
- iterator() - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- IteratorBlock - Class in org.apache.spark.streaming.receiver
-
Class representing a block received as an Iterator.
- IteratorBlock(Iterator<Object>) - Constructor for class org.apache.spark.streaming.receiver.IteratorBlock
-
- IteratorData<T> - Class in org.apache.spark.streaming.receiver
-
- IteratorData(Iterator<T>) - Constructor for class org.apache.spark.streaming.receiver.IteratorData
-
- IteratorValues - Class in org.apache.spark.storage
-
- IteratorValues(Iterator<Object>) - Constructor for class org.apache.spark.storage.IteratorValues
-
- L1Updater - Class in org.apache.spark.mllib.optimization
-
:: DeveloperApi ::
Updater for L1 regularized problems.
- L1Updater() - Constructor for class org.apache.spark.mllib.optimization.L1Updater
-
- label() - Method in class org.apache.spark.mllib.regression.LabeledPoint
-
- label() - Method in class org.apache.spark.mllib.tree.impl.TreePoint
-
- labelCol() - Method in interface org.apache.spark.ml.param.HasLabelCol
-
Param for the label column name.
- LabeledPoint - Class in org.apache.spark.mllib.regression
-
Class that represents the features and labels of a data point.
- LabeledPoint(double, Vector) - Constructor for class org.apache.spark.mllib.regression.LabeledPoint
-
- LabelPropagation - Class in org.apache.spark.graphx.lib
-
Label Propagation algorithm.
- LabelPropagation() - Constructor for class org.apache.spark.graphx.lib.LabelPropagation
-
- labels() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- labels() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns the sequence of labels in ascending order
- labels() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns the sequence of labels in ascending order
- LassoModel - Class in org.apache.spark.mllib.regression
-
Regression model trained using Lasso.
- LassoModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.LassoModel
-
- LassoWithSGD - Class in org.apache.spark.mllib.regression
-
Train a regression model with L1-regularization using Stochastic Gradient Descent.
- LassoWithSGD() - Constructor for class org.apache.spark.mllib.regression.LassoWithSGD
-
Construct a Lasso object with default parameters: {stepSize: 1.0, numIterations: 100,
regParam: 0.01, miniBatchFraction: 1.0}.
- last(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the last value in a group.
- last(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the last value of the column in a group.
- lastCompletedBatch() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
-
- lastDir() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
-
- lastError() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- lastErrorMessage() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- lastFinishTime() - Method in class org.apache.spark.ui.ConsoleProgressBar
-
- lastId() - Static method in class org.apache.spark.Accumulators
-
- lastLaunchTime() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- lastProgressBar() - Method in class org.apache.spark.ui.ConsoleProgressBar
-
- lastReceivedBatch() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
-
- lastReceivedBatchRecords() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
-
- lastSeenMs() - Method in class org.apache.spark.storage.BlockManagerInfo
-
- lastUpdateTime() - Method in class org.apache.spark.ui.ConsoleProgressBar
-
- lastValidTime() - Method in class org.apache.spark.streaming.dstream.InputDStream
-
- laterViewToken() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- latestInfo() - Method in class org.apache.spark.scheduler.Stage
-
Pointer to the latest [StageInfo] object, set by DAGScheduler.
- latestModel() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Return the latest model.
- latestModel() - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Return the latest model.
- LAUNCHING() - Static method in class org.apache.spark.TaskState
-
- launchTasks(Seq<Seq<TaskDescription>>) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
-
- launchTime() - Method in class org.apache.spark.scheduler.TaskInfo
-
- LBFGS - Class in org.apache.spark.mllib.optimization
-
:: DeveloperApi ::
Class used to solve an optimization problem using Limited-memory BFGS.
- LBFGS(Gradient, Updater) - Constructor for class org.apache.spark.mllib.optimization.LBFGS
-
- LDA - Class in org.apache.spark.mllib.clustering
-
:: Experimental ::
- LDA() - Constructor for class org.apache.spark.mllib.clustering.LDA
-
- LDA.EMOptimizer - Class in org.apache.spark.mllib.clustering
-
Optimizer for EM algorithm which stores data + parameter graph, plus algorithm parameters.
- LDA.EMOptimizer(Graph<DenseVector<Object>, Object>, int, int, double, double, int) - Constructor for class org.apache.spark.mllib.clustering.LDA.EMOptimizer
-
- LDAModel - Class in org.apache.spark.mllib.clustering
-
:: Experimental ::
- LDAModel() - Constructor for class org.apache.spark.mllib.clustering.LDAModel
-
- learningRate() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- LeastSquaresGradient - Class in org.apache.spark.mllib.optimization
-
:: DeveloperApi ::
Compute gradient and loss for a least-squares loss function, as used in linear regression.
- LeastSquaresGradient() - Constructor for class org.apache.spark.mllib.optimization.LeastSquaresGradient
-
- left() - Method in class org.apache.spark.sql.sources.And
-
- left() - Method in class org.apache.spark.sql.sources.Or
-
- leftChildIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Return the index of the left child of this node.
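The index arithmetic behind leftChildIndex and isLeftChild can be illustrated with a heap-style numbering. The sketch assumes the root has index 1 and that node i has children 2i and 2i + 1; that numbering is an assumption here, since this index does not state it. All names are hypothetical.

```java
// Hypothetical heap-style node numbering: root = 1, children of i are 2i and 2i + 1.
final class NodeIndexSketch {
    static int leftChildIndex(int nodeIndex) {
        return nodeIndex << 1;
    }

    static int rightChildIndex(int nodeIndex) {
        return (nodeIndex << 1) + 1;
    }

    static boolean isLeftChild(int nodeIndex) {
        // The root (index 1) is nobody's child; left children have even indices.
        return nodeIndex > 1 && nodeIndex % 2 == 0;
    }

    public static void main(String[] args) {
        System.out.println(leftChildIndex(3));
        System.out.println(isLeftChild(6));
    }
}
```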
- leftImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- leftJoin(Self, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
-
Left outer join another VertexPartition.
- leftJoin(Iterator<Tuple2<Object, VD2>>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
-
Left outer join another iterator of messages.
- leftJoin(RDD<Tuple2<Object, VD2>>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- leftJoin(RDD<Tuple2<Object, VD2>>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.VertexRDD
-
Left joins this VertexRDD with an RDD containing vertex attribute pairs.
- leftNode() - Method in class org.apache.spark.mllib.tree.model.Node
-
- leftNodeId() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
-
- leftOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a left outer join of this and other.
- leftOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a left outer join of this and other.
- leftOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a left outer join of this and other.
- leftOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a left outer join of this and other.
- leftOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a left outer join of this and other.
- leftOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a left outer join of this and other.
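To make the left outer join semantics concrete, the following plain-Java sketch joins two in-memory lists of key/value pairs. It does not use Spark. Every left pair is kept: it is paired with each matching right value, or with an empty Optional when no right pair shares its key. The class and method names are hypothetical, and Map.entry requires Java 9 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical local model of a left outer join over key/value pairs.
final class LeftOuterJoinSketch {
    static <K, V, W> List<Map.Entry<K, Map.Entry<V, Optional<W>>>> leftOuterJoin(
            List<Map.Entry<K, V>> left, List<Map.Entry<K, W>> right) {
        List<Map.Entry<K, Map.Entry<V, Optional<W>>>> out = new ArrayList<>();
        for (Map.Entry<K, V> l : left) {
            boolean matched = false;
            for (Map.Entry<K, W> r : right) {
                if (l.getKey().equals(r.getKey())) {
                    // One output row per matching right value.
                    out.add(Map.entry(l.getKey(),
                            Map.entry(l.getValue(), Optional.of(r.getValue()))));
                    matched = true;
                }
            }
            if (!matched) {
                // Left rows without a match survive with an empty right side.
                out.add(Map.entry(l.getKey(),
                        Map.entry(l.getValue(), Optional.<W>empty())));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> left =
            List.of(Map.entry("a", 1), Map.entry("b", 2));
        List<Map.Entry<String, String>> right =
            List.of(Map.entry("a", "x"), Map.entry("a", "y"));
        System.out.println(leftOuterJoin(left, right));
    }
}
```

The nested loop is quadratic and only meant to show the output shape; a distributed implementation would group both sides by key first.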
- leftOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
- leftOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
- leftOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
- leftOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
- leftOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
- leftOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
- leftPredict() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- leftZipJoin(VertexRDD<VD2>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- leftZipJoin(VertexRDD<VD2>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.VertexRDD
-
Left joins this RDD with another VertexRDD with the same index.
- length() - Method in class org.apache.spark.ml.recommendation.ALS.UncompressedInBlock
-
Size of the block.
- length() - Method in class org.apache.spark.scheduler.SplitInfo
-
- length() - Method in class org.apache.spark.sql.parquet.ParquetTypeInfo
-
- length() - Method in class org.apache.spark.storage.FileSegment
-
- length() - Method in class org.apache.spark.storage.TachyonFileSegment
-
- length() - Method in class org.apache.spark.streaming.util.WriteAheadLogFileSegment
-
- length() - Method in class org.apache.spark.util.Distribution
-
- length() - Method in class org.apache.spark.util.Vector
-
- leq(Object) - Method in class org.apache.spark.sql.Column
-
Less than or equal to.
- less(Duration) - Method in class org.apache.spark.streaming.Duration
-
- less(Time) - Method in class org.apache.spark.streaming.Time
-
- lessEq(Duration) - Method in class org.apache.spark.streaming.Duration
-
- lessEq(Time) - Method in class org.apache.spark.streaming.Time
-
- LessThan - Class in org.apache.spark.sql.sources
-
- LessThan(String, Object) - Constructor for class org.apache.spark.sql.sources.LessThan
-
- LessThanOrEqual - Class in org.apache.spark.sql.sources
-
- LessThanOrEqual(String, Object) - Constructor for class org.apache.spark.sql.sources.LessThanOrEqual
-
- level() - Method in class org.apache.spark.storage.BlockInfo
-
- lexicographicOrdering() - Static method in class org.apache.spark.graphx.Edge
-
- lexicographicOrdering() - Static method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
-
- libraryPathEnvName() - Static method in class org.apache.spark.util.Utils
-
Return the current system LD_LIBRARY_PATH name
- libraryPathEnvPrefix(Seq<String>) - Static method in class org.apache.spark.util.Utils
-
Return the prefix of a command that appends the given library paths to the
system-specific library path environment variable.
- like(String) - Method in class org.apache.spark.sql.Column
-
SQL like expression.
- LIKE() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- limit(int) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame by taking the first n rows.
- LinearDataGenerator - Class in org.apache.spark.mllib.util
-
:: DeveloperApi ::
Generate sample data used for Linear Data.
- LinearDataGenerator() - Constructor for class org.apache.spark.mllib.util.LinearDataGenerator
-
- LinearRegression - Class in org.apache.spark.ml.regression
-
:: AlphaComponent ::
- LinearRegression() - Constructor for class org.apache.spark.ml.regression.LinearRegression
-
- LinearRegressionModel - Class in org.apache.spark.ml.regression
-
:: AlphaComponent ::
- LinearRegressionModel(LinearRegression, ParamMap, Vector, double) - Constructor for class org.apache.spark.ml.regression.LinearRegressionModel
-
- LinearRegressionModel - Class in org.apache.spark.mllib.regression
-
Regression model trained using LinearRegression.
- LinearRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.LinearRegressionModel
-
- LinearRegressionParams - Interface in org.apache.spark.ml.regression
-
Params for linear regression.
- LinearRegressionWithSGD - Class in org.apache.spark.mllib.regression
-
Train a linear regression model with no regularization using Stochastic Gradient Descent.
- LinearRegressionWithSGD(double, int, double) - Constructor for class org.apache.spark.mllib.regression.LinearRegressionWithSGD
-
- LinearRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.LinearRegressionWithSGD
-
Construct a LinearRegression object with default parameters: {stepSize: 1.0,
numIterations: 100, miniBatchFraction: 1.0}.
- listener() - Method in class org.apache.spark.scheduler.ActiveJob
-
- listener() - Method in class org.apache.spark.scheduler.JobSubmitted
-
- listener() - Method in class org.apache.spark.streaming.ui.StreamingTab
-
- listener() - Method in class org.apache.spark.ui.env.EnvironmentTab
-
- listener() - Method in class org.apache.spark.ui.exec.ExecutorsTab
-
- listener() - Method in class org.apache.spark.ui.jobs.JobsTab
-
- listener() - Method in class org.apache.spark.ui.jobs.StagesTab
-
- listener() - Method in class org.apache.spark.ui.storage.StorageTab
-
- listenerBus() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- listenerBus() - Method in class org.apache.spark.SparkContext
-
- listenerBus() - Method in class org.apache.spark.streaming.scheduler.JobScheduler
-
- ListenerBus<L,E> - Interface in org.apache.spark.util
-
An event bus which posts events to its listeners.
- listeners() - Method in interface org.apache.spark.util.ListenerBus
-
- listenerThreadIsAlive() - Method in class org.apache.spark.util.AsynchronousListenerBus
-
For testing only.
- listFiles(String, Configuration) - Static method in class org.apache.spark.sql.parquet.FileSystemHelper
-
- listingTable(Seq<String>, Function1<T, Seq<Node>>, Iterable<T>, boolean, Option<String>, Seq<String>, boolean) - Static method in class org.apache.spark.ui.UIUtils
-
Returns an HTML table constructed by generating a row for each object in a sequence.
- lit(Object) - Static method in class org.apache.spark.sql.functions
-
Creates a Column of literal value.
- literals() - Method in class org.apache.spark.sql.parquet.ParquetRelation2.PartitionValues
-
- LiveListenerBus - Class in org.apache.spark.scheduler
-
Asynchronously passes SparkListenerEvents to registered SparkListeners.
- LiveListenerBus() - Constructor for class org.apache.spark.scheduler.LiveListenerBus
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.classification.SVMModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
- load(SparkContext, String) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel.SaveLoadV1_0$
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.LassoModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.LinearRegressionModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
- load(SparkContext, String, String, int) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.tree.model.RandomForestModel
-
- load(SparkContext, String) - Method in interface org.apache.spark.mllib.util.Loader
-
Load a model from the given path.
- load(String) - Method in class org.apache.spark.sql.SQLContext
-
:: Experimental ::
Returns the dataset stored at path as a DataFrame,
using the default data source configured by spark.sql.sources.default.
- load(String, String) - Method in class org.apache.spark.sql.SQLContext
-
:: Experimental ::
Returns the dataset stored at path as a DataFrame, using the given data source.
- load(String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
:: Experimental ::
(Java-specific) Returns the dataset specified by the given data source and
a set of options as a DataFrame.
- load(String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
:: Experimental ::
(Scala-specific) Returns the dataset specified by the given data source and
a set of options as a DataFrame.
- load(String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
:: Experimental ::
(Java-specific) Returns the dataset specified by the given data source and
a set of options as a DataFrame, using the given schema as the schema of the DataFrame.
- load(String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
:: Experimental ::
(Scala-specific) Returns the dataset specified by the given data source and
a set of options as a DataFrame, using the given schema as the schema of the DataFrame.
- loadClass(String, boolean) - Method in class org.apache.spark.util.ChildFirstURLClassLoader
-
- loadClass(String) - Method in class org.apache.spark.util.ParentClassLoader
-
- loadClass(String, boolean) - Method in class org.apache.spark.util.ParentClassLoader
-
- loadData(SparkContext, String, String) - Method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$
-
Helper method for loading GLM classification model data.
- loadData(SparkContext, String, String, int) - Method in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$
-
Helper method for loading GLM regression model data.
- loadDefaultSparkProperties(SparkConf, String) - Static method in class org.apache.spark.util.Utils
-
Load default Spark properties from the given file.
- Loader<M extends Saveable> - Interface in org.apache.spark.mllib.util
-
:: DeveloperApi ::
- loadLabeledData(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
-
- loadLabeledPoints(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Loads labeled points saved using RDD[LabeledPoint].saveAsTextFile.
- loadLabeledPoints(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Loads labeled points saved using RDD[LabeledPoint].saveAsTextFile with the default number of partitions.
- loadLibSVMFile(SparkContext, String, int, int) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Loads labeled data in the LIBSVM format into an RDD[LabeledPoint].
- loadLibSVMFile(SparkContext, String, boolean, int, int) - Static method in class org.apache.spark.mllib.util.MLUtils
-
- loadLibSVMFile(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Loads labeled data in the LIBSVM format into an RDD[LabeledPoint], with the default number of
partitions.
- loadLibSVMFile(SparkContext, String, boolean, int) - Static method in class org.apache.spark.mllib.util.MLUtils
-
- loadLibSVMFile(SparkContext, String, boolean) - Static method in class org.apache.spark.mllib.util.MLUtils
-
- loadLibSVMFile(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Loads binary labeled data in the LIBSVM format into an RDD[LabeledPoint], with number of
features determined automatically and the default number of partitions.
- loadTrees(SparkContext, String, String) - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$
-
Load trees for an ensemble, and return them in order.
- loadVectors(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Loads vectors saved using RDD[Vector].saveAsTextFile.
- loadVectors(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Loads vectors saved using RDD[Vector].saveAsTextFile with the default number of partitions.
- localAccums() - Static method in class org.apache.spark.Accumulators
-
- LocalActor - Class in org.apache.spark.scheduler.local
-
Calls to LocalBackend are all serialized through LocalActor.
- LocalActor(TaskSchedulerImpl, LocalBackend, int) - Constructor for class org.apache.spark.scheduler.local.LocalActor
-
- localActor() - Method in class org.apache.spark.scheduler.local.LocalBackend
-
- LocalBackend - Class in org.apache.spark.scheduler.local
-
LocalBackend is used when running a local version of Spark where the executor, backend, and
master all run in the same JVM.
- LocalBackend(TaskSchedulerImpl, int) - Constructor for class org.apache.spark.scheduler.local.LocalBackend
-
- localDirs() - Method in class org.apache.spark.storage.DiskBlockManager
-
- localDstId() - Method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
-
- localFraction() - Method in class org.apache.spark.rdd.CoalescedRDDPartition
-
Computes the fraction of the parents' partitions containing preferredLocation within
their getPreferredLocs.
- localHostName() - Static method in class org.apache.spark.util.Utils
-
Get the local machine's hostname.
- localIndex(int) - Method in class org.apache.spark.ml.recommendation.ALS.LocalIndexEncoder
-
Gets the local index from an encoded index.
- localIpAddress() - Static method in class org.apache.spark.util.Utils
-
Get the local host's IP address in dotted-quad format.
- localIpAddressHostname() - Static method in class org.apache.spark.util.Utils
-
- localityWaits() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- LocalKMeans - Class in org.apache.spark.mllib.clustering
-
A utility object to run K-means locally.
- LocalKMeans() - Constructor for class org.apache.spark.mllib.clustering.LocalKMeans
-
- LocalLDAModel - Class in org.apache.spark.mllib.clustering
-
:: Experimental ::
- LocalLDAModel(Matrix) - Constructor for class org.apache.spark.mllib.clustering.LocalLDAModel
-
- localSeqToDataFrameHolder(Seq<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext.implicits
-
Creates a DataFrame from a local Seq of Product.
- localSrcId() - Method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
-
- localValue() - Method in class org.apache.spark.Accumulable
-
Get the current value of this accumulator from within a task.
- location() - Method in class org.apache.spark.scheduler.CompressedMapStatus
-
- location() - Method in class org.apache.spark.scheduler.HighlyCompressedMapStatus
-
- location() - Method in interface org.apache.spark.scheduler.MapStatus
-
Location where this task was run.
- location() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- locations_() - Method in class org.apache.spark.rdd.BlockRDD
-
- log() - Method in interface org.apache.spark.Logging
-
- log() - Method in interface org.apache.spark.util.ActorLogReceive
-
- log1pExp(double) - Static method in class org.apache.spark.mllib.util.MLUtils
-
When x is positive and large, computing math.log(1 + math.exp(x)) will lead to arithmetic overflow.
- log2(double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
-
- log_() - Method in interface org.apache.spark.Logging
-
- logDebug(Function0<String>) - Method in interface org.apache.spark.Logging
-
- logDebug(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
-
- logDirName() - Method in class org.apache.spark.scheduler.JobLogger
-
- logError(Function0<String>) - Method in interface org.apache.spark.Logging
-
- logError(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
-
- logFileRegex() - Static method in class org.apache.spark.streaming.util.WriteAheadLogManager
-
- logFilesTologInfo(Seq<Path>) - Static method in class org.apache.spark.streaming.util.WriteAheadLogManager
-
Convert a sequence of files to a sequence of sorted LogInfo objects
- loggedEvents() - Method in class org.apache.spark.scheduler.EventLoggingListener
-
- Logging - Interface in org.apache.spark
-
:: DeveloperApi ::
Utility trait for classes that want to log data.
- logicalRelation() - Method in class org.apache.spark.sql.sources.InsertIntoDataSource
-
- LogicalRelation - Class in org.apache.spark.sql.sources
-
- LogicalRelation(BaseRelation) - Constructor for class org.apache.spark.sql.sources.LogicalRelation
-
- logInfo(Function0<String>) - Method in interface org.apache.spark.Logging
-
- logInfo(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
-
- LogisticGradient - Class in org.apache.spark.mllib.optimization
-
:: DeveloperApi ::
Compute gradient and loss for a multinomial logistic loss function, as used
in multi-class classification (it is also used in binary logistic regression).
- LogisticGradient(int) - Constructor for class org.apache.spark.mllib.optimization.LogisticGradient
-
- LogisticGradient() - Constructor for class org.apache.spark.mllib.optimization.LogisticGradient
-
- LogisticRegression - Class in org.apache.spark.ml.classification
-
:: AlphaComponent ::
- LogisticRegression() - Constructor for class org.apache.spark.ml.classification.LogisticRegression
-
- LogisticRegressionDataGenerator - Class in org.apache.spark.mllib.util
-
:: DeveloperApi ::
Generate test data for LogisticRegression.
- LogisticRegressionDataGenerator() - Constructor for class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
-
- LogisticRegressionModel - Class in org.apache.spark.ml.classification
-
:: AlphaComponent ::
- LogisticRegressionModel(LogisticRegression, ParamMap, Vector, double) - Constructor for class org.apache.spark.ml.classification.LogisticRegressionModel
-
- LogisticRegressionModel - Class in org.apache.spark.mllib.classification
-
Classification model trained using Multinomial/Binary Logistic Regression.
- LogisticRegressionModel(Vector, double, int, int) - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionModel
-
- LogisticRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionModel
-
- LogisticRegressionParams - Interface in org.apache.spark.ml.classification
-
Params for logistic regression.
- LogisticRegressionWithLBFGS - Class in org.apache.spark.mllib.classification
-
Train a classification model for Multinomial/Binary Logistic Regression using
Limited-memory BFGS.
- LogisticRegressionWithLBFGS() - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
-
- LogisticRegressionWithSGD - Class in org.apache.spark.mllib.classification
-
Train a classification model for Binary Logistic Regression
using Stochastic Gradient Descent.
- LogisticRegressionWithSGD(double, int, double, double) - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
-
- LogisticRegressionWithSGD() - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
-
Construct a LogisticRegression object with default parameters: {stepSize: 1.0,
numIterations: 100, regParam: 0.01, miniBatchFraction: 1.0}.
- logLikelihood() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
Log likelihood of the observed tokens in the training set,
given the current parameter estimates:
log P(docs | topics, topic distributions for docs, alpha, eta)
- logLikelihood() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
-
- LogLoss - Class in org.apache.spark.mllib.tree.loss
-
:: DeveloperApi ::
Class for log loss calculation (for classification).
- LogLoss() - Constructor for class org.apache.spark.mllib.tree.loss.LogLoss
-
- logMemoryUsage() - Method in class org.apache.spark.storage.MemoryStore
-
Log information about current memory usage.
- logName() - Method in interface org.apache.spark.Logging
-
- LogNormalGenerator - Class in org.apache.spark.mllib.random
-
:: DeveloperApi ::
Generates i.i.d.
- LogNormalGenerator(double, double) - Constructor for class org.apache.spark.mllib.random.LogNormalGenerator
-
- logNormalGraph(SparkContext, int, int, double, double, long) - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
Generate a graph whose vertex out degree distribution is log normal.
- logNormalJavaRDD(JavaSparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- logNormalJavaRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- logNormalJavaRDD(JavaSparkContext, double, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- logNormalRDD(SparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD comprised of i.i.d. samples from the log normal distribution with the input
mean and standard deviation.
- logNormalVectorRDD(SparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from a
log normal distribution.
- logPath() - Method in class org.apache.spark.scheduler.EventLoggingListener
-
- logpdf(Vector) - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
-
Returns the log-density of this multivariate Gaussian at given point, x
- logpdf(Vector<Object>) - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
-
Returns the log-density of this multivariate Gaussian at given point, x
- logPrior() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
Log probability of the current parameter estimate:
log P(topics, topic distributions for docs | alpha, eta)
- logStartFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- logStartToJson(SparkListenerLogStart) - Static method in class org.apache.spark.util.JsonProtocol
-
- logTrace(Function0<String>) - Method in interface org.apache.spark.Logging
-
- logTrace(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
-
- logUncaughtExceptions(Function0<T>) - Static method in class org.apache.spark.util.Utils
-
Execute the given block, logging and re-throwing any uncaught exception.
- logUnrollFailureMessage(BlockId, long) - Method in class org.apache.spark.storage.MemoryStore
-
Log a warning for failing to unroll a block.
- logUrlMap() - Method in class org.apache.spark.scheduler.cluster.ExecutorData
-
- logUrlMap() - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
-
- logUrls() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
-
- logWarning(Function0<String>) - Method in interface org.apache.spark.Logging
-
- logWarning(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
-
- LONG - Class in org.apache.spark.sql.columnar
-
- LONG() - Constructor for class org.apache.spark.sql.columnar.LONG
-
- LONG_FORM() - Static method in class org.apache.spark.util.CallSite
-
- LongColumnAccessor - Class in org.apache.spark.sql.columnar
-
- LongColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.LongColumnAccessor
-
- LongColumnBuilder - Class in org.apache.spark.sql.columnar
-
- LongColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.LongColumnBuilder
-
- LongColumnStats - Class in org.apache.spark.sql.columnar
-
- LongColumnStats() - Constructor for class org.apache.spark.sql.columnar.LongColumnStats
-
- LongConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
-
Accessor for nested Scala object
- LongDelta - Class in org.apache.spark.sql.columnar.compression
-
- LongDelta() - Constructor for class org.apache.spark.sql.columnar.compression.LongDelta
-
- LongDelta.Decoder - Class in org.apache.spark.sql.columnar.compression
-
- LongDelta.Decoder(ByteBuffer, NativeColumnType<LongType$>) - Constructor for class org.apache.spark.sql.columnar.compression.LongDelta.Decoder
-
- LongDelta.Encoder - Class in org.apache.spark.sql.columnar.compression
-
- LongDelta.Encoder() - Constructor for class org.apache.spark.sql.columnar.compression.LongDelta.Encoder
-
- longForm() - Method in class org.apache.spark.util.CallSite
-
- LongParam - Class in org.apache.spark.ml.param
-
Specialized version of Param[Long] for Java.
- LongParam(Params, String, String, Option<Object>) - Constructor for class org.apache.spark.ml.param.LongParam
-
- LongParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.LongParam
-
- longRddToDataFrameHolder(RDD<Object>) - Method in class org.apache.spark.sql.SQLContext.implicits
-
Creates a single column DataFrame from an RDD[Long].
- longToLongWritable(long) - Static method in class org.apache.spark.SparkContext
-
- longWritableConverter() - Static method in class org.apache.spark.SparkContext
-
- longWritableConverter() - Static method in class org.apache.spark.WritableConverter
-
- longWritableFactory() - Static method in class org.apache.spark.WritableFactory
-
- lookup(K) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return the list of values in the RDD for key key.
- lookup(K) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return the list of values in the RDD for key key.
- lookupCachedData(DataFrame) - Method in class org.apache.spark.sql.CacheManager
-
Optionally returns cached data for the given DataFrame.
- lookupCachedData(LogicalPlan) - Method in class org.apache.spark.sql.CacheManager
-
Optionally returns cached data for the given LogicalPlan.
- lookupDataSource(String) - Static method in class org.apache.spark.sql.sources.ResolvedDataSource
-
Given a provider name, look up the data source class definition.
- lookupFunction(String, Seq<Expression>) - Method in class org.apache.spark.sql.hive.HiveFunctionRegistry
-
- lookupRelation(Seq<String>, Option<String>) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
-
- lookupTimeout(SparkConf) - Static method in class org.apache.spark.util.AkkaUtils
-
Returns the default Spark timeout to use for Akka remote actor lookup.
- loss() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- Loss - Interface in org.apache.spark.mllib.tree.loss
-
:: DeveloperApi ::
Trait for adding "pluggable" loss functions for the gradient boosting algorithm.
- Losses - Class in org.apache.spark.mllib.tree.loss
-
- Losses() - Constructor for class org.apache.spark.mllib.tree.loss.Losses
-
- LOST() - Static method in class org.apache.spark.TaskState
-
- low() - Method in class org.apache.spark.partial.BoundedDouble
-
- lower() - Method in class org.apache.spark.rdd.JdbcPartition
-
- lower(Column) - Static method in class org.apache.spark.sql.functions
-
Converts a string expression to lower case.
- LOWER() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- lowerBound() - Method in class org.apache.spark.sql.columnar.ColumnStatisticsSchema
-
- lowerBound() - Method in class org.apache.spark.sql.jdbc.JDBCPartitioningInfo
-
- lowerCase() - Method in class org.apache.spark.sql.hive.HiveStrategies.ParquetConversion.LogicalPlanHacks
-
- lowSplit() - Method in class org.apache.spark.mllib.tree.model.Bin
-
- lt(Object) - Method in class org.apache.spark.sql.Column
-
Less than.
- LZ4CompressionCodec - Class in org.apache.spark.io
-
- LZ4CompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.LZ4CompressionCodec
-
- LZFCompressionCodec - Class in org.apache.spark.io
-
- LZFCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.LZFCompressionCodec
-
- main(String[]) - Static method in class org.apache.spark.examples.streaming.JavaKinesisWordCountASL
-
- main(String[]) - Static method in class org.apache.spark.examples.streaming.KinesisWordCountASL
-
- main(String[]) - Static method in class org.apache.spark.examples.streaming.KinesisWordCountProducerASL
-
- main(String[]) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator
-
- main(String[]) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
-
- main(String[]) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
-
- main(String[]) - Static method in class org.apache.spark.mllib.util.MFDataGenerator
-
- main(String[]) - Static method in class org.apache.spark.mllib.util.SVMDataGenerator
-
- main(String[]) - Static method in class org.apache.spark.rdd.CheckpointRDD
-
- main(String[]) - Static method in class org.apache.spark.streaming.util.RawTextSender
-
- main(String[]) - Static method in class org.apache.spark.streaming.util.RecurringTimer
-
- main(String[]) - Static method in class org.apache.spark.ui.UIWorkloadGenerator
-
- main(String[]) - Static method in class org.apache.spark.util.random.XORShiftRandom
-
Main method for running benchmark
- makeBinarySearch(Ordering<K>, ClassTag<K>) - Static method in class org.apache.spark.util.CollectionsUtils
-
- makeDriverRef(String, SparkConf, ActorSystem) - Static method in class org.apache.spark.util.AkkaUtils
-
- makeExecutorRef(String, SparkConf, String, int, ActorSystem) - Static method in class org.apache.spark.util.AkkaUtils
-
- makeOffers() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
-
- makeOffers(String) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
-
- makeParquetFile(Seq<T>, File, ClassTag<T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.parquet.ParquetTest
-
- makeParquetFile(DataFrame, File, ClassTag<T>, TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.parquet.ParquetTest
-
- makePartitionDir(File, String, Seq<Tuple2<String, Object>>) - Method in interface org.apache.spark.sql.parquet.ParquetTest
-
- makeProgressBar(int, int, int, int, int) - Static method in class org.apache.spark.ui.UIUtils
-
- makeRDD(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext
-
Distribute a local Scala collection to form an RDD.
- makeRDD(Seq<Tuple2<T, Seq<String>>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext
-
Distribute a local Scala collection to form an RDD, with one or more
location preferences (hostnames of Spark nodes) for each object.
- makeRDDForPartitionedTable(Seq<Partition>) - Method in class org.apache.spark.sql.hive.HadoopTableReader
-
- makeRDDForPartitionedTable(Map<Partition, Class<? extends Deserializer>>, Option<PathFilter>) - Method in class org.apache.spark.sql.hive.HadoopTableReader
-
Create a HadoopRDD for every partition key specified in the query.
- makeRDDForPartitionedTable(Seq<Partition>) - Method in interface org.apache.spark.sql.hive.TableReader
-
- makeRDDForTable(Table) - Method in class org.apache.spark.sql.hive.HadoopTableReader
-
- makeRDDForTable(Table, Class<? extends Deserializer>, Option<PathFilter>) - Method in class org.apache.spark.sql.hive.HadoopTableReader
-
Creates a Hadoop RDD to read data from the target table's data directory.
- makeRDDForTable(Table) - Method in interface org.apache.spark.sql.hive.TableReader
-
- managedIfNoPath() - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSource
-
- managedIfNoPath() - Method in class org.apache.spark.sql.sources.CreateTableUsing
-
- ManualClock - Class in org.apache.spark.util
-
A Clock whose time can be manually set and modified.
- ManualClock(long) - Constructor for class org.apache.spark.util.ManualClock
-
- ManualClock() - Constructor for class org.apache.spark.util.ManualClock
-
- map(Function<T, R>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to all elements of this RDD.
- map(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.EdgePartition
-
Construct a new edge partition by applying the function f to all
edges in this partition.
- map(Iterator<ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.EdgePartition
-
Construct a new edge partition by using the edge attributes
contained in the iterator.
- map(Function2<Object, VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
-
Pass each vertex attribute along with the vertex id through a map
function and retain the original RDD's partitioning and index.
- map(Function1<Object, Object>) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- map(Function1<Object, Object>) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Map the values of this matrix using a function.
- map(Function1<Object, Object>) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- map(Function1<R, T>) - Method in class org.apache.spark.partial.PartialResult
-
Transform this PartialResult into a PartialResult of type T.
- map(Function1<T, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD by applying a function to all elements of this RDD.
- map(DataType, DataType) - Method in class org.apache.spark.sql.ColumnName
-
Creates a new AttributeReference of type map
- map(MapType) - Method in class org.apache.spark.sql.ColumnName
-
- map(Function1<Row, R>, ClassTag<R>) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new RDD by applying a function to all rows of this DataFrame.
- map(Function1<T, R>, ClassTag<R>) - Method in interface org.apache.spark.sql.RDDApi
-
- map(Function<T, R>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream by applying a function to all elements of this DStream.
- map(Function1<T, U>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream by applying a function to all elements of this DStream.
- MAP_KEY_SCHEMA_NAME() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
-
- MAP_OUTPUT_TRACKER() - Static method in class org.apache.spark.util.MetadataCleanerType
-
- MAP_SCHEMA_NAME() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
-
- MAP_VALUE_SCHEMA_NAME() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
-
- mapAsSerializableJavaMap(Map<A, B>) - Static method in class org.apache.spark.api.java.JavaUtils
-
- mapEdgePartitions(Function2<Object, EdgePartition<ED, VD>, EdgePartition<ED2, VD2>>, ClassTag<ED2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- mapEdges(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
-
Transforms each edge attribute in the graph using the map function.
- mapEdges(Function2<Object, Iterator<Edge<ED>>, Iterator<ED2>>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
-
Transforms each edge attribute using the map function, passing it a whole partition at a
time.
- mapEdges(Function2<Object, Iterator<Edge<ED>>, Iterator<ED2>>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- mapFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
Util JSON deserialization methods.
- mapId() - Method in class org.apache.spark.FetchFailed
-
- mapId() - Method in class org.apache.spark.storage.ShuffleBlockId
-
- mapId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
-
- mapId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
-
- MapOutputTracker - Class in org.apache.spark
-
Class that keeps track of the location of the map output of
a stage.
- MapOutputTracker(SparkConf) - Constructor for class org.apache.spark.MapOutputTracker
-
- mapOutputTracker() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- mapOutputTracker() - Method in class org.apache.spark.SparkEnv
-
- MapOutputTrackerMaster - Class in org.apache.spark
-
MapOutputTracker for the driver.
- MapOutputTrackerMaster(SparkConf) - Constructor for class org.apache.spark.MapOutputTrackerMaster
-
- MapOutputTrackerMasterActor - Class in org.apache.spark
-
Actor class for MapOutputTrackerMaster
- MapOutputTrackerMasterActor(MapOutputTrackerMaster, SparkConf) - Constructor for class org.apache.spark.MapOutputTrackerMasterActor
-
- MapOutputTrackerMessage - Interface in org.apache.spark
-
- MapOutputTrackerWorker - Class in org.apache.spark
-
MapOutputTracker for the executors, which fetches map output information from the driver's
MapOutputTrackerMaster.
- MapOutputTrackerWorker(SparkConf) - Constructor for class org.apache.spark.MapOutputTrackerWorker
-
- MapPartitionedDStream<T,U> - Class in org.apache.spark.streaming.dstream
-
- MapPartitionedDStream(DStream<T>, Function1<Iterator<T>, Iterator<U>>, boolean, ClassTag<T>, ClassTag<U>) - Constructor for class org.apache.spark.streaming.dstream.MapPartitionedDStream
-
- mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitions(FlatMapFunction<Iterator<T>, U>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitions(Function1<Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD by applying a function to each partition of this RDD.
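The per-partition contract described by the mapPartitions entries can be pictured with a small plain-Java model. This is only a sketch and does not use Spark: the class `MapPartitionsSketch`, its list-of-lists stand-in for an RDD, and the helper `sumPartition` are hypothetical names introduced for illustration. The point it shows is that the function is called once per partition and receives an iterator over that partition's elements.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Plain-Java model of per-partition mapping; this is NOT the Spark API.
final class MapPartitionsSketch {
    // Calls f once per partition, handing it an iterator over that partition.
    static <T, U> List<List<U>> mapPartitions(List<List<T>> partitions,
                                              Function<Iterator<T>, Iterator<U>> f) {
        List<List<U>> out = new ArrayList<>();
        for (List<T> partition : partitions) {
            List<U> mapped = new ArrayList<>();
            Iterator<U> it = f.apply(partition.iterator());
            while (it.hasNext()) {
                mapped.add(it.next());
            }
            out.add(mapped);
        }
        return out;
    }

    // Example partition function: collapses a partition of integers to its sum.
    static Iterator<Integer> sumPartition(Iterator<Integer> it) {
        int sum = 0;
        while (it.hasNext()) {
            sum += it.next();
        }
        return Collections.singletonList(sum).iterator();
    }
}
```

Because the function sees a whole partition at once, per-partition setup work (for example opening a connection) can be done once per call rather than once per element.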
- mapPartitions(Function1<Iterator<Row>, Iterator<R>>, ClassTag<R>) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new RDD by applying a function to each partition of this DataFrame.
- mapPartitions(Function1<Iterator<T>, Iterator<R>>, ClassTag<R>) - Method in interface org.apache.spark.sql.RDDApi
-
- mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD
of this DStream.
- mapPartitions(Function1<Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD
of this DStream.
- MapPartitionsRDD<U,T> - Class in org.apache.spark.rdd
-
- MapPartitionsRDD(RDD<T>, Function3<TaskContext, Object, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.MapPartitionsRDD
-
- mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD
of this DStream.
- mapPartitionsWithContext(Function2<TaskContext, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
:: DeveloperApi ::
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitionsWithIndex(Function2<Integer, Iterator<T>, Iterator<R>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD, while tracking the index
of the original partition.
- mapPartitionsWithIndex(Function2<Object, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD by applying a function to each partition of this RDD, while tracking the index
of the original partition.
- mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<R>>, boolean) - Method in class org.apache.spark.api.java.JavaHadoopRDD
-
Maps over a partition, providing the InputSplit that was used as the base of the partition.
- mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<R>>, boolean) - Method in class org.apache.spark.api.java.JavaNewHadoopRDD
-
Maps over a partition, providing the InputSplit that was used as the base of the partition.
- mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.HadoopRDD
-
Maps over a partition, providing the InputSplit that was used as the base of the partition.
- mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.NewHadoopRDD
-
Maps over a partition, providing the InputSplit that was used as the base of the partition.
- mapPartitionsWithSplit(Function2<Object, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD by applying a function to each partition of this RDD, while tracking the index
of the original partition.
- MappedDStream<T,U> - Class in org.apache.spark.streaming.dstream
-
- MappedDStream(DStream<T>, Function1<T, U>, ClassTag<T>, ClassTag<U>) - Constructor for class org.apache.spark.streaming.dstream.MappedDStream
-
- mapper() - Method in class org.apache.spark.metrics.sink.MetricsServlet
-
- MAPRED_REDUCE_TASKS() - Method in class org.apache.spark.sql.SQLConf.Deprecated$
-
- mapredInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- mapreduceInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- mapReduceTriplets(Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph
-
Aggregates values from the neighboring edges and vertices of each vertex.
- mapReduceTriplets(Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- mapSideCombine() - Method in class org.apache.spark.ShuffleDependency
-
- MapStatus - Interface in org.apache.spark.scheduler
-
Result returned by a ShuffleMapTask to a scheduler.
- mapToDouble(DoubleFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to all elements of this RDD.
- mapToJson(Map<String, String>) - Static method in class org.apache.spark.util.JsonProtocol
-
Util JSON serialization methods.
- mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to all elements of this RDD.
- mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream by applying a function to all elements of this DStream.
- mapTriplets(Function1<EdgeTriplet<VD, ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
-
Transforms each edge attribute using the map function, passing it the adjacent vertex
attributes as well.
- mapTriplets(Function1<EdgeTriplet<VD, ED>, ED2>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
-
Transforms each edge attribute using the map function, passing it the adjacent vertex
attributes as well.
- mapTriplets(Function2<Object, Iterator<EdgeTriplet<VD, ED>>, Iterator<ED2>>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
-
Transforms each edge attribute a partition at a time using the map function, passing it the
adjacent vertex attributes as well.
- mapTriplets(Function2<Object, Iterator<EdgeTriplet<VD, ED>>, Iterator<ED2>>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- MapValuedDStream<K,V,U> - Class in org.apache.spark.streaming.dstream
-
- MapValuedDStream(DStream<Tuple2<K, V>>, Function1<V, U>, ClassTag<K>, ClassTag<V>, ClassTag<U>) - Constructor for class org.apache.spark.streaming.dstream.MapValuedDStream
-
- mapValues(Function<V, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Pass each value in the key-value pair RDD through a map function without changing the keys;
this also retains the original RDD's partitioning.
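As a rough illustration of why mapValues can retain the partitioning, the following plain-Java sketch transforms only the value of each pair and copies the key through untouched, so any key-based placement of the pairs stays valid. It does not use Spark; `MapValuesSketch` and its list-of-entries representation are assumptions made for the example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Plain-Java model of a value-only map over key-value pairs; NOT the Spark API.
final class MapValuesSketch {
    // Applies f to each value; keys and the order of the pairs are left unchanged.
    static <K, V, U> List<Map.Entry<K, U>> mapValues(List<Map.Entry<K, V>> pairs,
                                                     Function<V, U> f) {
        List<Map.Entry<K, U>> out = new ArrayList<>();
        for (Map.Entry<K, V> pair : pairs) {
            out.add(new SimpleEntry<>(pair.getKey(), f.apply(pair.getValue())));
        }
        return out;
    }
}
```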
- mapValues(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.EdgeRDD
-
Map the values in an edge partitioning preserving the structure but changing the values.
- mapValues(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- mapValues(Function1<VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- mapValues(Function2<Object, VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- mapValues(Function1<VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
-
Maps each vertex attribute, preserving the index.
- mapValues(Function2<Object, VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
-
Maps each vertex attribute, additionally supplying the vertex ID.
- mapValues(Function1<V, U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Pass each value in the key-value pair RDD through a map function without changing the keys;
this also retains the original RDD's partitioning.
- mapValues(Function<V, U>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying a map function to the value of each key-value pair in
this DStream without changing the key.
- mapValues(Function1<V, U>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying a map function to the value of each key-value pair in
this DStream without changing the key.
- mapVertexPartitions(Function1<ShippableVertexPartition<VD>, ShippableVertexPartition<VD2>>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- mapVertexPartitions(Function1<ShippableVertexPartition<VD>, ShippableVertexPartition<VD2>>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
-
Applies a function to each VertexPartition of this RDD and returns a new VertexRDD.
- mapVertices(Function2<Object, VD, VD2>, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.Graph
-
Transforms each vertex attribute in the graph using the map function.
- mapVertices(Function2<Object, VD, VD2>, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- mapWith(Function1<Object, A>, boolean, Function2<T, A, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Maps f over this RDD, where f takes an additional parameter of type A.
- markCheckpointed(RDD<?>) - Method in class org.apache.spark.rdd.RDD
-
Changes the dependencies of this RDD from its original parents to a new RDD (newRDD)
created from the checkpoint file, and forgets its old dependencies and partitions.
- MarkedForCheckpoint() - Static method in class org.apache.spark.rdd.CheckpointState
-
- markFailed(long) - Method in class org.apache.spark.scheduler.TaskInfo
-
- markFailure() - Method in class org.apache.spark.storage.BlockInfo
-
Mark this BlockInfo as ready but failed
- markForCheckpoint() - Method in class org.apache.spark.rdd.RDDCheckpointData
-
- markGettingResult(long) - Method in class org.apache.spark.scheduler.TaskInfo
-
- markInterrupted() - Method in class org.apache.spark.TaskContextImpl
-
Marks the task for interruption.
- markPartiallyConstructed(SparkContext, boolean) - Static method in class org.apache.spark.SparkContext
-
Called at the beginning of the SparkContext constructor to ensure that no SparkContext is
running.
- markReady(long) - Method in class org.apache.spark.storage.BlockInfo
-
Mark this BlockInfo as ready.
- markSuccessful(long) - Method in class org.apache.spark.scheduler.TaskInfo
-
- markTaskCompleted() - Method in class org.apache.spark.TaskContextImpl
-
Marks the task as completed and triggers the listeners.
- mask(Graph<VD2, ED2>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
-
Restricts the graph to only the vertices and edges that are also in other, but keeps the
attributes from this graph.
- mask(Graph<VD2, ED2>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- mask() - Method in class org.apache.spark.graphx.impl.ShippableVertexPartition
-
- mask() - Method in class org.apache.spark.graphx.impl.VertexPartition
-
- mask() - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
-
- master() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- master() - Method in class org.apache.spark.SparkContext
-
- master() - Method in class org.apache.spark.storage.BlockManager
-
- master() - Method in class org.apache.spark.storage.TachyonBlockManager
-
- master() - Method in class org.apache.spark.streaming.Checkpoint
-
- Matrices - Class in org.apache.spark.mllib.linalg
-
- Matrices() - Constructor for class org.apache.spark.mllib.linalg.Matrices
-
- Matrix - Interface in org.apache.spark.mllib.linalg
-
Trait for a local matrix.
- MatrixEntry - Class in org.apache.spark.mllib.linalg.distributed
-
:: Experimental ::
Represents an entry in a distributed matrix.
- MatrixEntry(long, long, double) - Constructor for class org.apache.spark.mllib.linalg.distributed.MatrixEntry
-
- MatrixFactorizationModel - Class in org.apache.spark.mllib.recommendation
-
Model representing the result of matrix factorization.
- MatrixFactorizationModel(int, RDD<Tuple2<Object, double[]>>, RDD<Tuple2<Object, double[]>>) - Constructor for class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
- MatrixFactorizationModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.recommendation
-
- MatrixFactorizationModel.SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.recommendation.MatrixFactorizationModel.SaveLoadV1_0$
-
- max(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the maximum element from this RDD as defined by the specified
Comparator[T].
- max() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
- max() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
-
Maximum value of each column.
- max(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Returns the max of this RDD as defined by the implicit Ordering[T].
- max(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the maximum value of the expression in a group.
- max(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the maximum value of the column in a group.
- max(String...) - Method in class org.apache.spark.sql.GroupedData
-
Compute the max value for each numeric column for each group.
- max(Seq<String>) - Method in class org.apache.spark.sql.GroupedData
-
Compute the max value for each numeric column for each group.
- MAX() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- max(Duration) - Method in class org.apache.spark.streaming.Duration
-
- max(Time) - Method in class org.apache.spark.streaming.Time
-
- max(long, long) - Static method in class org.apache.spark.streaming.util.RawTextHelper
-
- max() - Method in class org.apache.spark.util.StatCounter
-
- MAX_ATTEMPTS() - Method in class org.apache.spark.streaming.CheckpointWriter
-
- MAX_DICT_SIZE() - Static method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding
-
- MAX_SLAVE_FAILURES() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- maxAkkaFrameSize() - Method in class org.apache.spark.MapOutputTrackerMasterActor
-
- maxBatchSize() - Method in class org.apache.spark.streaming.flume.FlumePollingInputDStream
-
- maxBins() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- maxBins() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
-
- maxCores() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- maxCores() - Method in class org.apache.spark.scheduler.cluster.SimrSchedulerBackend
-
- maxCores() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- maxDepth() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- maxDepth() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
-
- maxFrameSizeBytes(SparkConf) - Static method in class org.apache.spark.util.AkkaUtils
-
Returns the configured max frame size for Akka messages in bytes.
- maxIter() - Method in interface org.apache.spark.ml.param.HasMaxIter
-
param for max number of iterations
- maxIters() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- maxMem() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
-
- maxMem() - Method in class org.apache.spark.storage.BlockManagerInfo
-
- maxMem() - Method in class org.apache.spark.storage.StorageStatus
-
- maxMemory() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
-
- maxMemoryInMB() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- maxMemSize() - Method in class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager
-
- maxNodesInLevel(int) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Return the maximum number of nodes which can be in the given level of the tree.
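The level bound can be sketched in one line, assuming a binary tree whose root sits at level 0, so that level L holds at most 2^L nodes. The index entry does not state the level numbering, so that convention and the class name `TreeLevelSketch` are assumptions of this example.

```java
// Sketch only: assumes a binary tree with the root at level 0.
final class TreeLevelSketch {
    // Level 0 holds 1 node, level 1 holds 2, level 2 holds 4, and so on.
    static int maxNodesInLevel(int level) {
        if (level < 0 || level > 30) {
            throw new IllegalArgumentException("level must be in [0, 30]: " + level);
        }
        return 1 << level;
    }
}
```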
- maxRegisteredWaitingTime() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- maxResultSize() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- maxRetries() - Method in class org.apache.spark.streaming.kafka.DirectKafkaInputDStream
-
- maxTaskFailures() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- maxTaskFailures() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- maxVal() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- maybePartitionSpec() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
-
- maybeSchema() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
-
- mean() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Compute the mean of this RDD's elements.
- mean() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
-
- mean() - Method in class org.apache.spark.mllib.random.ExponentialGenerator
-
- mean() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
-
- mean() - Method in class org.apache.spark.mllib.random.PoissonGenerator
-
- mean() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
- mean() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
-
Sample mean vector.
- mean() - Method in class org.apache.spark.partial.BoundedDouble
-
- mean() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Compute the mean of this RDD's elements.
- mean(String...) - Method in class org.apache.spark.sql.GroupedData
-
Compute the average value for each numeric column for each group.
- mean(Seq<String>) - Method in class org.apache.spark.sql.GroupedData
-
Compute the average value for each numeric column for each group.
- mean() - Method in class org.apache.spark.util.StatCounter
-
- meanAbsoluteError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
-
Returns the mean absolute error, which is a risk function corresponding to the
expected value of the absolute error loss or l1-norm loss.
- meanApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return the approximate mean of the elements in this RDD.
- meanApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
:: Experimental ::
Approximate operation to return the mean within a timeout.
- meanApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
:: Experimental ::
Approximate operation to return the mean within a timeout.
- meanAveragePrecision() - Method in class org.apache.spark.mllib.evaluation.RankingMetrics
-
Returns the mean average precision (MAP) of all the queries.
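For readers unfamiliar with the metric, here is a minimal plain-Java sketch of one common definition of mean average precision: per query, precision is accumulated at every rank where a relevant item appears and divided by the number of relevant items, and the per-query values are then averaged. The class `MapSketch`, its method names, and the choice to score a query with no relevant items as 0 are assumptions of this example, not a statement of how RankingMetrics behaves.

```java
import java.util.List;
import java.util.Set;

// Sketch of a common MAP definition; NOT the Spark implementation.
final class MapSketch {
    // Average precision for one query: ranked predictions versus the relevant set.
    static <T> double averagePrecision(List<T> predicted, Set<T> relevant) {
        if (relevant.isEmpty()) {
            return 0.0; // assumption: a query with no relevant items scores 0
        }
        int hits = 0;
        double precisionSum = 0.0;
        for (int i = 0; i < predicted.size(); i++) {
            if (relevant.contains(predicted.get(i))) {
                hits++;
                precisionSum += (double) hits / (i + 1); // precision at rank i + 1
            }
        }
        return precisionSum / relevant.size();
    }

    // Mean of the per-query average precision values.
    static <T> double meanAveragePrecision(List<List<T>> predictions, List<Set<T>> groundTruth) {
        if (predictions.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (int q = 0; q < predictions.size(); q++) {
            sum += averagePrecision(predictions.get(q), groundTruth.get(q));
        }
        return sum / predictions.size();
    }
}
```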
- MeanEvaluator - Class in org.apache.spark.partial
-
An ApproximateEvaluator for means.
- MeanEvaluator(int, double) - Constructor for class org.apache.spark.partial.MeanEvaluator
-
- means() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
-
- meanSquaredError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
-
Returns the mean squared error, which is a risk function corresponding to the
expected value of the squared error loss or quadratic loss.
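The two regression error entries (mean absolute error and mean squared error) correspond to the usual sample averages of |prediction - observation| and (prediction - observation)^2. The sketch below computes both over plain arrays; `RegressionErrorSketch` and its array-based signature are hypothetical and stand in for the RDD-based RegressionMetrics class.

```java
// Sketch of the standard definitions over plain arrays; NOT the Spark API.
final class RegressionErrorSketch {
    // Mean of |prediction - observation| (L1 loss).
    static double meanAbsoluteError(double[] predictions, double[] observations) {
        checkLengths(predictions, observations);
        double sum = 0.0;
        for (int i = 0; i < predictions.length; i++) {
            sum += Math.abs(predictions[i] - observations[i]);
        }
        return sum / predictions.length;
    }

    // Mean of (prediction - observation)^2 (quadratic loss).
    static double meanSquaredError(double[] predictions, double[] observations) {
        checkLengths(predictions, observations);
        double sum = 0.0;
        for (int i = 0; i < predictions.length; i++) {
            double diff = predictions[i] - observations[i];
            sum += diff * diff;
        }
        return sum / predictions.length;
    }

    private static void checkLengths(double[] a, double[] b) {
        if (a.length != b.length || a.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
    }
}
```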
- megabytesToString(long) - Static method in class org.apache.spark.util.Utils
-
Convert a quantity in megabytes to a human-readable string such as "4.0 MB".
- MEMORY_AND_DISK - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_AND_DISK() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_AND_DISK_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_AND_DISK_2() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_AND_DISK_SER - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_AND_DISK_SER() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_AND_DISK_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_AND_DISK_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_ONLY_SER - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_ONLY_SER() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_ONLY_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_ONLY_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
-
- memoryBytesSpilled() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
-
- memoryBytesSpilled() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
-
- MemoryEntry - Class in org.apache.spark.storage
-
- MemoryEntry(Object, long, boolean) - Constructor for class org.apache.spark.storage.MemoryEntry
-
- MemoryParam - Class in org.apache.spark.util
-
An extractor object for parsing JVM memory strings, such as "10g", into an Int representing
the number of megabytes.
- MemoryParam() - Constructor for class org.apache.spark.util.MemoryParam
-
- memoryStore() - Method in class org.apache.spark.storage.BlockManager
-
- MemoryStore - Class in org.apache.spark.storage
-
Stores blocks in memory, either as Arrays of deserialized Java objects or as
serialized ByteBuffers.
- MemoryStore(BlockManager, long) - Constructor for class org.apache.spark.storage.MemoryStore
-
- memoryStringToMb(String) - Static method in class org.apache.spark.util.Utils
-
Convert a Java memory parameter passed to -Xmx (such as 300m or 1g) to a number of megabytes.
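A conversion of this kind might look like the following plain-Java sketch. The accepted suffixes (k, m, g, t), the treatment of a bare number as a byte count, and the truncating division are assumptions of the example; only the "300m" and "1g" inputs come from the entry above, and `MemoryStringSketch` is a hypothetical class.

```java
import java.util.Locale;

// Sketch of -Xmx style parsing under the assumptions stated above; NOT Spark's Utils.
final class MemoryStringSketch {
    static int memoryStringToMb(String s) {
        String lower = s.trim().toLowerCase(Locale.ROOT);
        if (lower.isEmpty()) {
            throw new IllegalArgumentException("empty memory string");
        }
        char unit = lower.charAt(lower.length() - 1);
        if (Character.isDigit(unit)) {
            // Assumption: no suffix means the value is a number of bytes.
            return (int) (Long.parseLong(lower) / (1024L * 1024L));
        }
        long n = Long.parseLong(lower.substring(0, lower.length() - 1));
        switch (unit) {
            case 'k': return (int) (n / 1024L);          // kilobytes, rounded down
            case 'm': return (int) n;                    // already megabytes
            case 'g': return (int) (n * 1024L);          // gigabytes
            case 't': return (int) (n * 1024L * 1024L);  // terabytes
            default:
                throw new IllegalArgumentException("unknown unit: " + unit);
        }
    }
}
```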
- memoryUsed() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
-
- MemoryUtils - Class in org.apache.spark.scheduler.cluster.mesos
-
- MemoryUtils() - Constructor for class org.apache.spark.scheduler.cluster.mesos.MemoryUtils
-
- memRemaining() - Method in class org.apache.spark.storage.StorageStatus
-
Return the memory remaining in this block manager.
- memSize() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
-
- memSize() - Method in class org.apache.spark.storage.BlockStatus
-
- memSize() - Method in class org.apache.spark.storage.RDDInfo
-
- memUsed() - Method in class org.apache.spark.storage.StorageStatus
-
Return the memory used by this block manager.
- memUsedByRdd(int) - Method in class org.apache.spark.storage.StorageStatus
-
Return the memory used by the given RDD in this block manager in O(1) time.
- merge(R) - Method in class org.apache.spark.Accumulable
-
Merge two accumulable objects together
- merge(ALS.NormalEquation) - Method in class org.apache.spark.ml.recommendation.ALS.NormalEquation
-
Merges another normal equation object.
- merge(ALS.RatingBlock<ID>) - Method in class org.apache.spark.ml.recommendation.ALS.RatingBlockBuilder
-
- merge(IDF.DocumentFrequencyAggregator) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
-
Merges another.
- merge(FPTree<T>) - Method in class org.apache.spark.mllib.fpm.FPTree
-
Merges another FP-Tree.
- merge(MultivariateOnlineSummarizer) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
Merge another MultivariateOnlineSummarizer, and update the statistical summary.
- merge(DTStatsAggregator) - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
-
Merges this aggregator with another, and returns this aggregator.
- merge(double[], int, int) - Method in class org.apache.spark.mllib.tree.impurity.ImpurityAggregator
-
Merge the stats from one bin into another.
- merge(int, U) - Method in interface org.apache.spark.partial.ApproximateEvaluator
-
- merge(int, long) - Method in class org.apache.spark.partial.CountEvaluator
-
- merge(int, OpenHashMap<T, Object>) - Method in class org.apache.spark.partial.GroupedCountEvaluator
-
- merge(int, HashMap<T, StatCounter>) - Method in class org.apache.spark.partial.GroupedMeanEvaluator
-
- merge(int, HashMap<T, StatCounter>) - Method in class org.apache.spark.partial.GroupedSumEvaluator
-
- merge(int, StatCounter) - Method in class org.apache.spark.partial.MeanEvaluator
-
- merge(int, StatCounter) - Method in class org.apache.spark.partial.SumEvaluator
-
- merge(Option<AcceptanceResult>) - Method in class org.apache.spark.util.random.AcceptanceResult
-
- merge(double) - Method in class org.apache.spark.util.StatCounter
-
Add a value into this StatCounter, updating the internal statistics.
- merge(TraversableOnce<Object>) - Method in class org.apache.spark.util.StatCounter
-
Add multiple values into this StatCounter, updating the internal statistics.
- merge(StatCounter) - Method in class org.apache.spark.util.StatCounter
-
Merge another StatCounter into this one, adding up the internal statistics.
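The three StatCounter merge overloads can be understood through a small running-statistics sketch. The version below keeps a count, a mean, and a sum of squared deviations, updates them one value at a time, and combines two counters with the standard pairwise formula for merging means and variances. It is an illustration only: `StatCounterSketch` and its field names are hypothetical, and the index does not say which formulas the real class uses.

```java
// Running count/mean/variance with a pairwise merge; NOT Spark's StatCounter.
final class StatCounterSketch {
    private long n = 0;      // number of values seen
    private double mu = 0.0; // running mean
    private double m2 = 0.0; // sum of squared deviations from the mean

    // Adds a single value, updating the statistics incrementally.
    StatCounterSketch merge(double value) {
        double delta = value - mu;
        n += 1;
        mu += delta / n;
        m2 += delta * (value - mu);
        return this;
    }

    // Folds another counter into this one without revisiting the raw values.
    StatCounterSketch merge(StatCounterSketch other) {
        if (other.n == 0) {
            return this;
        }
        if (n == 0) {
            n = other.n;
            mu = other.mu;
            m2 = other.m2;
            return this;
        }
        double delta = other.mu - mu;
        long total = n + other.n;
        m2 += other.m2 + delta * delta * n * other.n / total;
        mu += delta * other.n / total;
        n = total;
        return this;
    }

    long count() { return n; }
    double mean() { return mu; }
    // Population variance; NaN when no values have been added.
    double variance() { return n == 0 ? Double.NaN : m2 / n; }
}
```

Merging counters this way is what lets each partition summarize its own data independently before the partial results are combined.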
- MERGE_SCHEMA() - Static method in class org.apache.spark.sql.parquet.ParquetRelation2
-
- mergeCombiners() - Method in class org.apache.spark.Aggregator
-
- mergeForFeature(int, int, int) - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
-
For a given feature, merge the stats for two bins.
- mergeMetastoreParquetSchema(StructType, StructType) - Static method in class org.apache.spark.sql.parquet.ParquetRelation2
-
Reconciles Hive Metastore case insensitivity issue and data type conflicts between Metastore
schema and Parquet schema.
- mergeValue() - Method in class org.apache.spark.Aggregator
-
- MesosSchedulerBackend - Class in org.apache.spark.scheduler.cluster.mesos
-
A SchedulerBackend for running fine-grained tasks on Mesos.
- MesosSchedulerBackend(TaskSchedulerImpl, SparkContext, String) - Constructor for class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- MesosTaskLaunchData - Class in org.apache.spark.scheduler.cluster.mesos
-
Wrapper for serializing the data sent when launching Mesos tasks.
- MesosTaskLaunchData(ByteBuffer, int) - Constructor for class org.apache.spark.scheduler.cluster.mesos.MesosTaskLaunchData
-
- message() - Method in class org.apache.spark.FetchFailed
-
- message() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed
-
- message() - Method in class org.apache.spark.scheduler.ExecutorLossReason
-
- message() - Method in exception org.apache.spark.storage.BlockException
-
- message() - Method in class org.apache.spark.streaming.scheduler.ReportError
-
- metadata() - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
-
- metadataCleaner() - Method in class org.apache.spark.SparkContext
-
- MetadataCleaner - Class in org.apache.spark.util
-
Runs a timer task to periodically clean up metadata.
- MetadataCleaner(Enumeration.Value, Function1<Object, BoxedUnit>, SparkConf) - Constructor for class org.apache.spark.util.MetadataCleaner
-
- MetadataCleanerType - Class in org.apache.spark.util
-
- MetadataCleanerType() - Constructor for class org.apache.spark.util.MetadataCleanerType
-
- METASTORE_SCHEMA() - Static method in class org.apache.spark.sql.parquet.ParquetRelation2
-
- MetastoreRelation - Class in org.apache.spark.sql.hive
-
- MetastoreRelation(String, String, Option<String>, Table, Seq<Partition>, SQLContext) - Constructor for class org.apache.spark.sql.hive.MetastoreRelation
-
- MetastoreRelation.SchemaAttribute - Class in org.apache.spark.sql.hive
-
- MetastoreRelation.SchemaAttribute(FieldSchema) - Constructor for class org.apache.spark.sql.hive.MetastoreRelation.SchemaAttribute
-
- method() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- metricName() - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-
param for metric name in evaluation
- metricRegistry() - Method in class org.apache.spark.metrics.source.JvmSource
-
- metricRegistry() - Method in interface org.apache.spark.metrics.source.Source
-
- metricRegistry() - Method in class org.apache.spark.scheduler.DAGSchedulerSource
-
- metricRegistry() - Method in class org.apache.spark.storage.BlockManagerSource
-
- metricRegistry() - Method in class org.apache.spark.streaming.StreamingSource
-
- metrics() - Method in class org.apache.spark.ExceptionFailure
-
- metrics() - Method in class org.apache.spark.scheduler.DirectTaskResult
-
- metrics() - Method in class org.apache.spark.scheduler.Task
-
- MetricsConfig - Class in org.apache.spark.metrics
-
- MetricsConfig(Option<String>) - Constructor for class org.apache.spark.metrics.MetricsConfig
-
- MetricsServlet - Class in org.apache.spark.metrics.sink
-
- MetricsServlet(Properties, MetricRegistry, SecurityManager) - Constructor for class org.apache.spark.metrics.sink.MetricsServlet
-
- MetricsSystem - Class in org.apache.spark.metrics
-
Spark Metrics System, created for a specific "instance" and composed of sources and
sinks; it periodically polls source metrics data and delivers it to sink destinations.
- metricsSystem() - Method in class org.apache.spark.SparkContext
-
- metricsSystem() - Method in class org.apache.spark.SparkEnv
-
- MFDataGenerator - Class in org.apache.spark.mllib.util
-
:: DeveloperApi ::
Generate RDD(s) containing data for Matrix Factorization.
- MFDataGenerator() - Constructor for class org.apache.spark.mllib.util.MFDataGenerator
-
- microF1Measure() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns micro-averaged label-based f1-measure
(equal to the micro-averaged document-based f1-measure)
- microPrecision() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns micro-averaged label-based precision
(equal to the micro-averaged document-based precision)
- microRecall() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns micro-averaged label-based recall
(equal to the micro-averaged document-based recall)
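The three micro-averaged entries above can be illustrated with a small plain-Java sketch. It is not the MultilabelMetrics implementation; the class name `MicroAveragedSketch` and the use of `Set<Integer>` for label sets are assumptions made for illustration. Counts are pooled over all documents before dividing, which is why the label-based and document-based micro-averages coincide.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Spark: pools counts over all documents.
final class MicroAveragedSketch {

    /** Returns {precision, recall, f1}, micro-averaged over all documents. */
    static double[] microMetrics(List<Set<Integer>> predictions, List<Set<Integer>> labels) {
        long truePositives = 0;   // predicted labels that are also true labels
        long predictedTotal = 0;  // all predicted labels (TP + FP)
        long labelTotal = 0;      // all true labels (TP + FN)
        for (int i = 0; i < predictions.size(); i++) {
            Set<Integer> hits = new HashSet<>(predictions.get(i));
            hits.retainAll(labels.get(i));
            truePositives += hits.size();
            predictedTotal += predictions.get(i).size();
            labelTotal += labels.get(i).size();
        }
        // Results are NaN when a denominator is zero (no predictions or no labels).
        double precision = (double) truePositives / predictedTotal;
        double recall = (double) truePositives / labelTotal;
        double f1 = 2.0 * truePositives / (predictedTotal + labelTotal);
        return new double[] {precision, recall, f1};
    }

    public static void main(String[] args) {
        List<Set<Integer>> predictions = List.of(Set.of(0, 1), Set.of(0, 2), Set.of(1));
        List<Set<Integer>> labels = List.of(Set.of(0, 2), Set.of(0, 1, 2), Set.of(1));
        double[] m = microMetrics(predictions, labels);
        System.out.println("precision=" + m[0] + " recall=" + m[1] + " f1=" + m[2]);
    }
}
```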
- milliseconds() - Method in class org.apache.spark.streaming.Duration
-
- milliseconds(long) - Static method in class org.apache.spark.streaming.Durations
-
- Milliseconds - Class in org.apache.spark.streaming
-
Helper object that creates an instance of Duration representing a given number of milliseconds.
- Milliseconds() - Constructor for class org.apache.spark.streaming.Milliseconds
-
- milliseconds() - Method in class org.apache.spark.streaming.Time
-
- millisToString(long) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
Reformat a time interval in milliseconds to a prettier format for output
- min(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the minimum element from this RDD as defined by the specified
Comparator[T].
- min() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
- min() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
-
Minimum value of each column.
- min(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Returns the min of this RDD as defined by the implicit Ordering[T].
- min(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the minimum value of the expression in a group.
- min(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the minimum value of the column in a group.
- min(String...) - Method in class org.apache.spark.sql.GroupedData
-
Compute the min value for each numeric column for each group.
- min(Seq<String>) - Method in class org.apache.spark.sql.GroupedData
-
Compute the min value for each numeric column for each group.
- MIN() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- min(Duration) - Method in class org.apache.spark.streaming.Duration
-
- min(Time) - Method in class org.apache.spark.streaming.Time
-
- min() - Method in class org.apache.spark.util.StatCounter
-
- minDocFreq() - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
-
- minDocFreq() - Method in class org.apache.spark.mllib.feature.IDF
-
- MINIMUM_INTERVAL_SECONDS() - Static method in class org.apache.spark.util.logging.TimeBasedRollingPolicy
-
- MINIMUM_SHARES_PROPERTY() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- MINIMUM_SIZE_BYTES() - Static method in class org.apache.spark.util.logging.SizeBasedRollingPolicy
-
- minInfoGain() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- minInfoGain() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
-
- minInstancesPerNode() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- minInstancesPerNode() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
-
- MinMax() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
-
- minMemoryMapBytes() - Method in class org.apache.spark.storage.DiskStore
-
- minPollTime() - Method in class org.apache.spark.util.SystemClock
-
- minRegisteredRatio() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- minSamplingRate() - Static method in class org.apache.spark.util.random.BinomialBounds
-
- minShare() - Method in class org.apache.spark.scheduler.Pool
-
- minShare() - Method in interface org.apache.spark.scheduler.Schedulable
-
- minShare() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- minus(Object) - Method in class org.apache.spark.sql.Column
-
Subtraction.
- minus(Duration) - Method in class org.apache.spark.streaming.Duration
-
- minus(Time) - Method in class org.apache.spark.streaming.Time
-
- minus(Duration) - Method in class org.apache.spark.streaming.Time
-
- minutes() - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- minutes(long) - Static method in class org.apache.spark.streaming.Durations
-
- Minutes - Class in org.apache.spark.streaming
-
Helper object that creates an instance of Duration representing a given number of minutes.
- Minutes() - Constructor for class org.apache.spark.streaming.Minutes
-
- MINUTES_PER_HOUR() - Static method in class org.apache.spark.sql.parquet.CatalystTimestampConverter
-
- minVal() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- MLUtils - Class in org.apache.spark.mllib.util
-
Helper methods to load, save and pre-process data used in MLlib.
- MLUtils() - Constructor for class org.apache.spark.mllib.util.MLUtils
-
- mod(Object) - Method in class org.apache.spark.sql.Column
-
Modulo (a.k.a.
- mode() - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSourceAsSelect
-
- mode() - Method in class org.apache.spark.sql.sources.CreateTableUsingAsSelect
-
- mode() - Method in class org.apache.spark.sql.sources.CreateTempTableUsingAsSelect
-
- Model<M extends Model<M>> - Class in org.apache.spark.ml
-
- Model() - Constructor for class org.apache.spark.ml.Model
-
- MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.IntAccumulatorParam$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.LongAccumulatorParam$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.graphx.impl.ShippableVertexPartition.ShippableVertexPartitionOpsConstructor$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.graphx.impl.VertexPartition.VertexPartitionOpsConstructor$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.recommendation.ALS.InBlock$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.recommendation.ALS.Rating$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.recommendation.ALS.RatingBlock$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel.SaveLoadV1_0$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.stat.test.ChiSqTest.Method$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.stat.test.ChiSqTest.NullHypothesis$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.rdd.HadoopRDD.HadoopMapPartitionsWithSplitRDD$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.rdd.NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutors$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterClusterManager$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisteredExecutor$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RetrieveSparkProps$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.ReviveOffers$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopDriver$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutor$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutors$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.SparkContext.FloatAccumulatorParam$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.SparkContext.IntAccumulatorParam$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.SparkContext.LongAccumulatorParam$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.sql.hive.HiveQl.Token$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.sql.parquet.ParquetRelation2.PartitionValues$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.sql.SQLConf.Deprecated$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.BlockManagerHeartbeat$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.ExpireDeadHosts$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetActorSystemHostPortForExecutor$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetLocations$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetLocationsMultipleBlockIds$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetMemoryStatus$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetPeers$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetStorageStatus$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveBlock$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveExecutor$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveRdd$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveShuffle$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.StopBlockManagerMaster$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FailureFetchResult$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchRequest$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.storage.ShuffleBlockFetcherIterator.SuccessFetchResult$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.streaming.kafka.KafkaCluster.LeaderOffset$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.streaming.kafka.KafkaCluster.SimpleConsumerConfig$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.streaming.util.WriteAheadLogManager.LogInfo$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ui.JettyUtils.ServletParams$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ui.jobs.UIData.JobUIData$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ui.jobs.UIData.TaskUIData$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.util.Vector.VectorAccumParam$
-
Static reference to the singleton instance of this Scala object.
- MQTTInputDStream - Class in org.apache.spark.streaming.mqtt
-
Input stream that subscribes to messages from an MQTT broker.
- MQTTInputDStream(StreamingContext, String, String, StorageLevel) - Constructor for class org.apache.spark.streaming.mqtt.MQTTInputDStream
-
- MQTTReceiver - Class in org.apache.spark.streaming.mqtt
-
- MQTTReceiver(String, String, StorageLevel) - Constructor for class org.apache.spark.streaming.mqtt.MQTTReceiver
-
- MQTTUtils - Class in org.apache.spark.streaming.mqtt
-
- MQTTUtils() - Constructor for class org.apache.spark.streaming.mqtt.MQTTUtils
-
- msDurationToString(long) - Static method in class org.apache.spark.util.Utils
-
Returns a human-readable string representing a duration such as "35ms"
- msg() - Method in class org.apache.spark.streaming.scheduler.DeregisterReceiver
-
- msg() - Method in class org.apache.spark.streaming.scheduler.ErrorReported
-
- mu() - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
-
- MulticlassMetrics - Class in org.apache.spark.mllib.evaluation
-
:: Experimental ::
Evaluator for multiclass classification.
- MulticlassMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
- MultilabelMetrics - Class in org.apache.spark.mllib.evaluation
-
Evaluator for multilabel classification.
- MultilabelMetrics(RDD<Tuple2<double[], double[]>>) - Constructor for class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
- multiLabelValidator(int) - Static method in class org.apache.spark.mllib.util.DataValidators
-
Function to check if labels used for k class multi-label classification are
in the range of {0, 1, ..., k - 1}.
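As a rough illustration of the range check described for multiLabelValidator, the plain-Java sketch below tests whether every label lies in {0, 1, ..., k - 1}. The class and method names are hypothetical, and rejecting non-integer values is an assumption drawn from the set notation, not from the Spark source.

```java
// Hypothetical helper, not part of Spark's DataValidators.
final class LabelRangeSketch {

    /** Returns true only if every label is an integer value in {0, 1, ..., k - 1}. */
    static boolean labelsInRange(double[] labels, int k) {
        for (double label : labels) {
            boolean isWhole = label == Math.floor(label); // false for NaN and fractions
            if (!isWhole || label < 0 || label > k - 1) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(labelsInRange(new double[] {0.0, 2.0, 1.0}, 3));
        System.out.println(labelsInRange(new double[] {0.0, 3.0}, 3));
    }
}
```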
- multiply(BlockMatrix) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
- multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Multiply this matrix by a local matrix on the right.
- multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Multiply this matrix by a local matrix on the right.
- multiply(DenseMatrix) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Convenience method for `Matrix`-`DenseMatrix` multiplication.
- multiply(DenseVector) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Convenience method for `Matrix`-`DenseVector` multiplication.
- multiply(Object) - Method in class org.apache.spark.sql.Column
-
Multiplication of this expression and another expression.
- multiply(double) - Method in class org.apache.spark.util.Vector
-
- multiplyGramianMatrixBy(DenseVector<Object>) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Multiplies the Gramian matrix A^T A by a dense vector on the right without computing A^T A.
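One way to multiply by A^T A without forming it is to accumulate row * (row . v) over the rows of A, since A^T A v is the sum of those terms. The plain-Java sketch below shows that identity on a local array; it is an illustration under that assumption, not the distributed RowMatrix code, and the class name is hypothetical.

```java
// Hypothetical local sketch; RowMatrix itself operates on a distributed RDD of rows.
final class GramianMultiplySketch {

    /** Computes (A^T A) v as the sum over rows r of r * (r . v), never building A^T A. */
    static double[] multiplyGramianBy(double[][] rows, double[] v) {
        double[] result = new double[v.length];
        for (double[] row : rows) {
            double dot = 0.0;
            for (int j = 0; j < v.length; j++) {
                dot += row[j] * v[j];
            }
            for (int j = 0; j < v.length; j++) {
                result[j] += dot * row[j];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] a = {{1.0, 2.0}, {3.0, 4.0}};
        double[] out = multiplyGramianBy(a, new double[] {1.0, 1.0});
        System.out.println(out[0] + ", " + out[1]);
    }
}
```

Each row contributes independently, so the per-row terms can be computed separately and summed, using memory proportional to the number of columns.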
- MultivariateGaussian - Class in org.apache.spark.mllib.stat.distribution
-
:: DeveloperApi ::
This class provides basic functionality for a Multivariate Gaussian (Normal) Distribution.
- MultivariateGaussian(Vector, Matrix) - Constructor for class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
-
- MultivariateGaussian(DenseVector<Object>, DenseMatrix<Object>) - Constructor for class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
-
private[mllib] constructor
- MultivariateOnlineSummarizer - Class in org.apache.spark.mllib.stat
-
:: DeveloperApi ::
MultivariateOnlineSummarizer implements MultivariateStatisticalSummary to compute the mean,
variance, minimum, maximum, counts, and nonzero counts for samples in sparse or dense vector
format in an online fashion.
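To make "online" concrete, the sketch below updates per-column statistics one sample at a time using Welford's update for the mean and variance. Welford's method is swapped in here for illustration and is not claimed to be the algorithm MultivariateOnlineSummarizer uses; the class name and dense `double[]` samples are likewise assumptions.

```java
import java.util.Arrays;

// Hypothetical single-machine sketch of an online (one-pass) column summarizer.
final class OnlineSummarySketch {
    private final double[] mean;
    private final double[] m2;      // running sum of squared deviations from the mean
    private final double[] min;
    private final double[] max;
    private final long[] nonzeros;
    private long count;

    OnlineSummarySketch(int dimension) {
        mean = new double[dimension];
        m2 = new double[dimension];
        min = new double[dimension];
        max = new double[dimension];
        nonzeros = new long[dimension];
        Arrays.fill(min, Double.POSITIVE_INFINITY);
        Arrays.fill(max, Double.NEGATIVE_INFINITY);
    }

    /** Folds one dense sample into the running statistics. */
    void add(double[] sample) {
        count++;
        for (int j = 0; j < mean.length; j++) {
            double x = sample[j];
            double delta = x - mean[j];
            mean[j] += delta / count;
            m2[j] += delta * (x - mean[j]);
            min[j] = Math.min(min[j], x);
            max[j] = Math.max(max[j], x);
            if (x != 0.0) {
                nonzeros[j]++;
            }
        }
    }

    long count() { return count; }
    double[] mean() { return mean.clone(); }
    double[] min() { return min.clone(); }
    double[] max() { return max.clone(); }
    long[] numNonzeros() { return nonzeros.clone(); }

    /** Unbiased sample variance per column; zeros when fewer than two samples were added. */
    double[] variance() {
        double[] v = new double[m2.length];
        if (count < 2) {
            return v;
        }
        for (int j = 0; j < v.length; j++) {
            v[j] = m2[j] / (count - 1);
        }
        return v;
    }
}
```

A caller would create one instance per dataset, call `add` for each sample, and read the statistics at any point without revisiting earlier samples.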
- MultivariateOnlineSummarizer() - Constructor for class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
- MultivariateStatisticalSummary - Interface in org.apache.spark.mllib.stat
-
Trait for multivariate statistical summary of a data matrix.
- mustCheckpoint() - Method in class org.apache.spark.streaming.dstream.DStream
-
- mustCheckpoint() - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
-
- mustCheckpoint() - Method in class org.apache.spark.streaming.dstream.StateDStream
-
- MutablePair<T1,T2> - Class in org.apache.spark.util
-
:: DeveloperApi ::
A tuple of 2 elements.
- MutablePair(T1, T2) - Constructor for class org.apache.spark.util.MutablePair
-
- MutablePair() - Constructor for class org.apache.spark.util.MutablePair
-
No-arg constructor for serialization
- MutableRowWriteSupport - Class in org.apache.spark.sql.parquet
-
- MutableRowWriteSupport() - Constructor for class org.apache.spark.sql.parquet.MutableRowWriteSupport
-
- MutableURLClassLoader - Class in org.apache.spark.util
-
URL class loader that exposes the addURL and getURLs methods in URLClassLoader.
- MutableURLClassLoader(URL[], ClassLoader) - Constructor for class org.apache.spark.util.MutableURLClassLoader
-
- myLocalityLevels() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- myName() - Method in class org.apache.spark.util.InnerClosureFinder
-
- MySQLQuirks - Class in org.apache.spark.sql.jdbc
-
- MySQLQuirks() - Constructor for class org.apache.spark.sql.jdbc.MySQLQuirks
-
- pageRank(double, double) - Method in class org.apache.spark.graphx.GraphOps
-
Run a dynamic version of PageRank returning a graph with vertex attributes containing the
PageRank and edge attributes containing the normalized edge weight.
- PageRank - Class in org.apache.spark.graphx.lib
-
PageRank algorithm implementation.
- PageRank() - Constructor for class org.apache.spark.graphx.lib.PageRank
-
- pages() - Method in class org.apache.spark.ui.WebUITab
-
- PairDStreamFunctions<K,V> - Class in org.apache.spark.streaming.dstream
-
Extra functions available on DStream of (key, value) pairs through an implicit conversion.
- PairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
- PairFlatMapFunction<T,K,V> - Interface in org.apache.spark.api.java.function
-
A function that returns zero or more key-value pair records from each input record.
- PairFunction<T,K,V> - Interface in org.apache.spark.api.java.function
-
A function that returns key-value pairs (Tuple2<K, V>), and can be used to
construct PairRDDs.
- pairFunToScalaFun(PairFunction<A, B, C>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
- PairRDDFunctions<K,V> - Class in org.apache.spark.rdd
-
Extra functions available on RDDs of (key, value) pairs through an implicit conversion.
- PairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.rdd.PairRDDFunctions
-
- ParallelCollectionPartition<T> - Class in org.apache.spark.rdd
-
- ParallelCollectionPartition(long, int, Seq<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.ParallelCollectionPartition
-
- ParallelCollectionRDD<T> - Class in org.apache.spark.rdd
-
- ParallelCollectionRDD(SparkContext, Seq<T>, int, Map<Object, Seq<String>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.ParallelCollectionRDD
-
- parallelism() - Method in class org.apache.spark.streaming.flume.FlumePollingInputDStream
-
- parallelize(List<T>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelize(List<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelize(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelizeDoubles(List<Double>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelizeDoubles(List<Double>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelizePairs(List<Tuple2<K, V>>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelizePairs(List<Tuple2<K, V>>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- Param<T> - Class in org.apache.spark.ml.param
-
:: AlphaComponent ::
A param with self-contained documentation and optionally default value.
- Param(Params, String, String, Option<T>) - Constructor for class org.apache.spark.ml.param.Param
-
- param() - Method in class org.apache.spark.ml.param.ParamPair
-
- parameters() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
-
- ParamGridBuilder - Class in org.apache.spark.ml.tuning
-
:: AlphaComponent ::
Builder for a param grid used in grid search-based model selection.
- ParamGridBuilder() - Constructor for class org.apache.spark.ml.tuning.ParamGridBuilder
-
- ParamMap - Class in org.apache.spark.ml.param
-
:: AlphaComponent ::
A param to value map.
- ParamMap(Map<Param<Object>, Object>) - Constructor for class org.apache.spark.ml.param.ParamMap
-
- ParamMap() - Constructor for class org.apache.spark.ml.param.ParamMap
-
Creates an empty param map.
- paramMap() - Method in interface org.apache.spark.ml.param.Params
-
Internal param map.
- ParamPair<T> - Class in org.apache.spark.ml.param
-
A param and its value.
- ParamPair(Param<T>, T) - Constructor for class org.apache.spark.ml.param.ParamPair
-
- Params - Interface in org.apache.spark.ml.param
-
:: AlphaComponent ::
Trait for components that take parameters.
- params() - Method in interface org.apache.spark.ml.param.Params
-
Returns all params.
- parent() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-
- parent() - Method in class org.apache.spark.ml.feature.StandardScalerModel
-
- parent() - Method in class org.apache.spark.ml.Model
-
The parent estimator that produced this model.
- parent() - Method in class org.apache.spark.ml.param.Param
-
- parent() - Method in class org.apache.spark.ml.PipelineModel
-
- parent() - Method in class org.apache.spark.ml.recommendation.ALSModel
-
- parent() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
-
- parent() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
-
- parent() - Method in class org.apache.spark.mllib.fpm.FPTree.Node
-
- parent() - Method in class org.apache.spark.mllib.rdd.SlidingRDD
-
- parent() - Method in class org.apache.spark.scheduler.Pool
-
- parent() - Method in interface org.apache.spark.scheduler.Schedulable
-
- parent() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- parent() - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
-
- parent() - Method in class org.apache.spark.streaming.ui.StreamingTab
-
- ParentClassLoader - Class in org.apache.spark.util
-
A class loader which makes some protected methods in ClassLoader accessible.
- ParentClassLoader(ClassLoader) - Constructor for class org.apache.spark.util.ParentClassLoader
-
- parentIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Get the parent index of the given node, or 0 if it is the root.
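The "0 if it is the root" behaviour falls out naturally if nodes are numbered heap-style with the root at index 1 and the children of node i at 2i and 2i + 1. That numbering is an assumption of this sketch, as are the class and helper names; the sketch only shows how such a scheme yields a parent index by halving.

```java
// Hypothetical sketch assuming 1-based, heap-style node numbering (root = 1).
final class NodeIndexSketch {

    /** Parent of nodeIndex under heap-style numbering; yields 0 for the root (index 1). */
    static int parentIndex(int nodeIndex) {
        return nodeIndex >> 1;
    }

    /** Left child index under the same numbering. */
    static int leftChildIndex(int nodeIndex) {
        return nodeIndex << 1;
    }

    /** Right child index under the same numbering. */
    static int rightChildIndex(int nodeIndex) {
        return (nodeIndex << 1) + 1;
    }

    public static void main(String[] args) {
        System.out.println(parentIndex(1));
        System.out.println(parentIndex(rightChildIndex(3)));
    }
}
```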
- parentPartition() - Method in class org.apache.spark.rdd.UnionPartition
-
- parentRddIndex() - Method in class org.apache.spark.rdd.UnionPartition
-
- parentRememberDuration() - Method in class org.apache.spark.streaming.dstream.DStream
-
- parentRememberDuration() - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
-
- parentRememberDuration() - Method in class org.apache.spark.streaming.dstream.WindowedDStream
-
- parents() - Method in class org.apache.spark.rdd.CoalescedRDDPartition
-
- parents() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDDPartition
-
- parents() - Method in class org.apache.spark.scheduler.Stage
-
- parentsIndices() - Method in class org.apache.spark.rdd.CoalescedRDDPartition
-
- parentSplit() - Method in class org.apache.spark.rdd.PartitionPruningRDDPartition
-
- PARQUET_BINARY_AS_STRING() - Static method in class org.apache.spark.sql.SQLConf
-
- PARQUET_CACHE_METADATA() - Static method in class org.apache.spark.sql.SQLConf
-
- PARQUET_COMPRESSION() - Static method in class org.apache.spark.sql.SQLConf
-
- PARQUET_FILTER_DATA() - Static method in class org.apache.spark.sql.parquet.ParquetFilters
-
- PARQUET_FILTER_PUSHDOWN_ENABLED() - Static method in class org.apache.spark.sql.SQLConf
-
- PARQUET_INT96_AS_TIMESTAMP() - Static method in class org.apache.spark.sql.SQLConf
-
- PARQUET_USE_DATA_SOURCE_API() - Static method in class org.apache.spark.sql.SQLConf
-
- parquetCompressionCodec() - Method in class org.apache.spark.sql.SQLConf
-
The compression codec for writing to a Parquet file.
- ParquetConversion() - Method in interface org.apache.spark.sql.hive.HiveStrategies
-
- ParquetConversions() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
-
- parquetFile(String...) - Method in class org.apache.spark.sql.SQLContext
-
Loads a Parquet file, returning the result as a DataFrame.
- parquetFile(Seq<String>) - Method in class org.apache.spark.sql.SQLContext
-
Loads a Parquet file, returning the result as a DataFrame.
- parquetFilterPushDown() - Method in class org.apache.spark.sql.SQLConf
-
When true, predicates will be passed to the Parquet record reader when possible.
- ParquetFilters - Class in org.apache.spark.sql.parquet
-
- ParquetFilters() - Constructor for class org.apache.spark.sql.parquet.ParquetFilters
-
- ParquetRelation - Class in org.apache.spark.sql.parquet
-
Relation that consists of data stored in a Parquet columnar format.
- ParquetRelation(String, Option<Configuration>, SQLContext, Seq<Attribute>) - Constructor for class org.apache.spark.sql.parquet.ParquetRelation
-
- ParquetRelation2 - Class in org.apache.spark.sql.parquet
-
An alternative to ParquetRelation that plugs in using the data sources API.
- ParquetRelation2(Seq<String>, Map<String, String>, Option<StructType>, Option<PartitionSpec>, SQLContext) - Constructor for class org.apache.spark.sql.parquet.ParquetRelation2
-
- ParquetRelation2.PartitionValues - Class in org.apache.spark.sql.parquet
-
- ParquetRelation2.PartitionValues(Seq<String>, Seq<Literal>) - Constructor for class org.apache.spark.sql.parquet.ParquetRelation2.PartitionValues
-
- ParquetRelation2.PartitionValues$ - Class in org.apache.spark.sql.parquet
-
- ParquetRelation2.PartitionValues$() - Constructor for class org.apache.spark.sql.parquet.ParquetRelation2.PartitionValues$
-
- parquetSchema() - Method in class org.apache.spark.sql.parquet.ParquetRelation
-
Schema derived from ParquetFile
- ParquetTableScan - Class in org.apache.spark.sql.parquet
-
:: DeveloperApi ::
Parquet table scan operator.
- ParquetTableScan(Seq<Attribute>, ParquetRelation, Seq<Expression>) - Constructor for class org.apache.spark.sql.parquet.ParquetTableScan
-
- ParquetTest - Interface in org.apache.spark.sql.parquet
-
A helper trait that provides convenient facilities for Parquet testing.
- ParquetTestData - Class in org.apache.spark.sql.parquet
-
- ParquetTestData() - Constructor for class org.apache.spark.sql.parquet.ParquetTestData
-
- parquetTsCalendar() - Static method in class org.apache.spark.sql.parquet.CatalystTimestampConverter
-
- ParquetTypeInfo - Class in org.apache.spark.sql.parquet
-
A class representing Parquet info fields we care about, for passing back to Parquet
- ParquetTypeInfo(PrimitiveType.PrimitiveTypeName, Option<OriginalType>, Option<DecimalMetadata>, Option<Object>) - Constructor for class org.apache.spark.sql.parquet.ParquetTypeInfo
-
- ParquetTypesConverter - Class in org.apache.spark.sql.parquet
-
- ParquetTypesConverter() - Constructor for class org.apache.spark.sql.parquet.ParquetTypesConverter
-
- parquetUseDataSourceApi() - Method in class org.apache.spark.sql.SQLConf
-
When true, uses the Parquet implementation based on the data sources API.
- parse(String) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Parses a string resulting from Vector.toString into a Vector.
- parse(String) - Static method in class org.apache.spark.mllib.regression.LabeledPoint
-
Parses a string resulting from LabeledPoint#toString into a LabeledPoint.
- parse(String) - Static method in class org.apache.spark.mllib.util.NumericParser
-
Parses a string into a Double, an Array[Double], or a Seq[Any].
- parse(SparkConf, String, Option<SSLOptions>) - Static method in class org.apache.spark.SSLOptions
-
Resolves SSLOptions settings from a given Spark configuration object at a given namespace.
- parseAttrs(Seq<Expression>) - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- parseDdl(String) - Static method in class org.apache.spark.sql.hive.HiveQl
-
- parseHostPort(String) - Static method in class org.apache.spark.util.Utils
-
- parseNumeric(Object) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
- parsePartition(Path, String) - Static method in class org.apache.spark.sql.parquet.ParquetRelation2
-
Parses a single partition, returns column names and values of each partition column.
- parsePartitions(Seq<Path>, String) - Static method in class org.apache.spark.sql.parquet.ParquetRelation2
-
Given a group of qualified paths, tries to parse them and returns a partition specification.
- parseSql(String) - Static method in class org.apache.spark.sql.hive.HiveQl
-
Returns a LogicalPlan for a given HiveQL string.
- parseStream(PortableDataStream) - Method in class org.apache.spark.input.StreamBasedRecordReader
-
Parse the stream (and close it afterwards) and return the value as type T.
- parseStream(PortableDataStream) - Method in class org.apache.spark.input.StreamRecordReader
-
- parseType(String) - Method in class org.apache.spark.sql.sources.DDLParser
-
- PartialResult<R> - Class in org.apache.spark.partial
-
- PartialResult(R, boolean) - Constructor for class org.apache.spark.partial.PartialResult
-
- Partition - Interface in org.apache.spark
-
An identifier for a partition in an RDD.
- partition() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
-
- partition() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
-
- Partition - Class in org.apache.spark.sql.parquet
-
- Partition(Row, String) - Constructor for class org.apache.spark.sql.parquet.Partition
-
- partition() - Method in class org.apache.spark.streaming.kafka.KafkaRDDPartition
-
- partition() - Method in class org.apache.spark.streaming.kafka.OffsetRange
-
Kafka partition id
- partitionBy(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a copy of the RDD partitioned using the specified partitioner.
- partitionBy(PartitionStrategy) - Method in class org.apache.spark.graphx.Graph
-
Repartitions the edges in the graph according to partitionStrategy.
- partitionBy(PartitionStrategy, int) - Method in class org.apache.spark.graphx.Graph
-
Repartitions the edges in the graph according to partitionStrategy.
- partitionBy(PartitionStrategy) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- partitionBy(PartitionStrategy, int) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- partitionBy(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return a copy of the RDD partitioned using the specified partitioner.
- PartitionCoalescer - Class in org.apache.spark.rdd
-
Coalesce the partitions of a parent RDD (prev) into fewer partitions, so that each partition of
this RDD computes one or more of the parent ones.
- PartitionCoalescer(int, RDD<?>, double) - Constructor for class org.apache.spark.rdd.PartitionCoalescer
-
- PartitionCoalescer.LocationIterator - Class in org.apache.spark.rdd
-
- PartitionCoalescer.LocationIterator(RDD<?>) - Constructor for class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
-
- partitionColumns() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
-
- partitionColumns() - Method in class org.apache.spark.sql.parquet.PartitionSpec
-
- partitioner() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
If partitionsRDD already has a partitioner, use it.
- partitioner() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- Partitioner - Class in org.apache.spark
-
An object that defines how the elements in a key-value pair RDD are partitioned by key.
- Partitioner() - Constructor for class org.apache.spark.Partitioner
-
- partitioner() - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- partitioner() - Method in class org.apache.spark.rdd.HadoopRDD.HadoopMapPartitionsWithSplitRDD
-
- partitioner() - Method in class org.apache.spark.rdd.MapPartitionsRDD
-
- partitioner() - Method in class org.apache.spark.rdd.NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD
-
- partitioner() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
-
- partitioner() - Method in class org.apache.spark.rdd.PartitionwiseSampledRDD
-
- partitioner() - Method in class org.apache.spark.rdd.RDD
-
Optionally overridden by subclasses to specify how they are partitioned.
- partitioner() - Method in class org.apache.spark.rdd.ShuffledRDD
-
- partitioner() - Method in class org.apache.spark.rdd.SubtractedRDD
-
- partitioner() - Method in class org.apache.spark.rdd.ZippedPartitionsBaseRDD
-
- partitioner() - Method in class org.apache.spark.ShuffleDependency
-
- PartitionerAwareUnionRDD<T> - Class in org.apache.spark.rdd
-
Class representing an RDD that can take multiple RDDs partitioned by the same partitioner and
unify them into a single RDD while preserving the partitioner.
- PartitionerAwareUnionRDD(SparkContext, Seq<RDD<T>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.PartitionerAwareUnionRDD
-
- PartitionerAwareUnionRDDPartition - Class in org.apache.spark.rdd
-
Class representing partitions of PartitionerAwareUnionRDD, which maintains the list of
corresponding partitions of parent RDDs.
- PartitionerAwareUnionRDDPartition(Seq<RDD<?>>, int) - Constructor for class org.apache.spark.rdd.PartitionerAwareUnionRDDPartition
-
- partitionFilters() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
-
- PartitionGroup - Class in org.apache.spark.rdd
-
- PartitionGroup(Option<String>) - Constructor for class org.apache.spark.rdd.PartitionGroup
-
- partitionId() - Method in class org.apache.spark.scheduler.Task
-
- partitionID() - Method in class org.apache.spark.TaskCommitDenied
-
- partitionId() - Method in class org.apache.spark.TaskContext
-
The ID of the RDD partition that is computed by this task.
- partitionId() - Method in class org.apache.spark.TaskContextImpl
-
- partitioningAttributes() - Method in class org.apache.spark.sql.parquet.ParquetRelation
-
- partitionKeys() - Method in class org.apache.spark.sql.hive.MetastoreRelation
-
- partitionPruningPred() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
-
- PartitionPruningRDD<T> - Class in org.apache.spark.rdd
-
:: DeveloperApi ::
An RDD used to prune RDD partitions so we can avoid launching tasks on
all partitions.
- PartitionPruningRDD(RDD<T>, Function1<Object, Object>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.PartitionPruningRDD
-
- PartitionPruningRDDPartition - Class in org.apache.spark.rdd
-
- PartitionPruningRDDPartition(int, Partition) - Constructor for class org.apache.spark.rdd.PartitionPruningRDDPartition
-
- partitions() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Set of partitions in this RDD.
- partitions() - Method in class org.apache.spark.rdd.PruneDependency
-
- partitions() - Method in class org.apache.spark.rdd.RDD
-
Get the array of partitions of this RDD, taking into account whether the
RDD is checkpointed or not.
- partitions() - Method in class org.apache.spark.rdd.ZippedPartitionsPartition
-
- partitions() - Method in class org.apache.spark.scheduler.ActiveJob
-
- partitions() - Method in class org.apache.spark.scheduler.JobSubmitted
-
- partitions() - Method in class org.apache.spark.sql.hive.MetastoreRelation
-
- partitions() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
-
- partitions() - Method in class org.apache.spark.sql.parquet.PartitionSpec
-
- partitionSize(int) - Method in class org.apache.spark.graphx.impl.RoutingTablePartition
-
Returns the number of vertices that will be sent to the specified edge partition.
- partitionSpec() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
-
- PartitionSpec - Class in org.apache.spark.sql.parquet
-
- PartitionSpec(StructType, Seq<Partition>) - Constructor for class org.apache.spark.sql.parquet.PartitionSpec
-
- partitionsRDD() - Method in class org.apache.spark.graphx.EdgeRDD
-
- partitionsRDD() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- partitionsRDD() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- partitionsRDD() - Method in class org.apache.spark.graphx.VertexRDD
-
- partitionStatistics() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
-
- PartitionStatistics - Class in org.apache.spark.sql.columnar
-
- PartitionStatistics(Seq<Attribute>) - Constructor for class org.apache.spark.sql.columnar.PartitionStatistics
-
- PartitionStrategy - Interface in org.apache.spark.graphx
-
Represents the way edges are assigned to edge partitions based on their source and destination
vertex IDs.
- PartitionStrategy.CanonicalRandomVertexCut$ - Class in org.apache.spark.graphx
-
Assigns edges to partitions by hashing the source and destination vertex IDs in a canonical
direction, resulting in a random vertex cut that colocates all edges between two vertices,
regardless of direction.
- PartitionStrategy.CanonicalRandomVertexCut$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
-
- PartitionStrategy.EdgePartition1D$ - Class in org.apache.spark.graphx
-
Assigns edges to partitions using only the source vertex ID, colocating edges with the same
source.
- PartitionStrategy.EdgePartition1D$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$
-
- PartitionStrategy.EdgePartition2D$ - Class in org.apache.spark.graphx
-
Assigns edges to partitions using a 2D partitioning of the sparse edge adjacency matrix,
guaranteeing a 2 * sqrt(numParts) - 1 bound on vertex replication.
- PartitionStrategy.EdgePartition2D$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$
-
- PartitionStrategy.RandomVertexCut$ - Class in org.apache.spark.graphx
-
Assigns edges to partitions by hashing the source and destination vertex IDs, resulting in a
random vertex cut that colocates all same-direction edges between two vertices.
- PartitionStrategy.RandomVertexCut$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$
-
- partitionToOps(VertexPartition<VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.VertexPartition
-
Implicit conversion to allow invoking VertexPartitionBase operations directly on a VertexPartition.
- partitionValues() - Method in class org.apache.spark.rdd.ZippedPartitionsPartition
-
- PartitionwiseSampledRDD<T,U> - Class in org.apache.spark.rdd
-
An RDD sampled from its parent RDD partition-wise.
- PartitionwiseSampledRDD(RDD<T>, RandomSampler<T, U>, boolean, long, ClassTag<T>, ClassTag<U>) - Constructor for class org.apache.spark.rdd.PartitionwiseSampledRDD
-
- PartitionwiseSampledRDDPartition - Class in org.apache.spark.rdd
-
- PartitionwiseSampledRDDPartition(Partition, long) - Constructor for class org.apache.spark.rdd.PartitionwiseSampledRDDPartition
-
- parts() - Method in class org.apache.spark.sql.jdbc.JDBCRelation
-
- PassThrough - Class in org.apache.spark.sql.columnar.compression
-
- PassThrough() - Constructor for class org.apache.spark.sql.columnar.compression.PassThrough
-
- PassThrough.Decoder<T extends org.apache.spark.sql.types.NativeType> - Class in org.apache.spark.sql.columnar.compression
-
- PassThrough.Decoder(ByteBuffer, NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.PassThrough.Decoder
-
- PassThrough.Encoder<T extends org.apache.spark.sql.types.NativeType> - Class in org.apache.spark.sql.columnar.compression
-
- PassThrough.Encoder(NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.PassThrough.Encoder
-
- path() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- path() - Method in class org.apache.spark.scheduler.SplitInfo
-
- path() - Method in class org.apache.spark.sql.hive.execution.AddFile
-
- path() - Method in class org.apache.spark.sql.hive.execution.AddJar
-
- path() - Method in class org.apache.spark.sql.json.JSONRelation
-
- path() - Method in class org.apache.spark.sql.parquet.ParquetRelation
-
- path() - Method in class org.apache.spark.sql.parquet.Partition
-
- path() - Method in class org.apache.spark.streaming.util.WriteAheadLogFileSegment
-
- path() - Method in class org.apache.spark.streaming.util.WriteAheadLogManager.LogInfo
-
- paths() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
-
- pdf(Vector) - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
-
Returns the density of this multivariate Gaussian at the given point, x.
- pdf(Vector<Object>) - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
-
Returns the density of this multivariate Gaussian at the given point, x.
- PEARSON() - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
-
- PearsonCorrelation - Class in org.apache.spark.mllib.stat.correlation
-
Compute Pearson correlation for two RDDs of the type RDD[Double] or the correlation matrix
for an RDD of the type RDD[Vector].
- PearsonCorrelation() - Constructor for class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
-
- pendingStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- pendingTasks() - Method in class org.apache.spark.scheduler.Stage
-
- pendingTasksWithNoPrefs() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- pendingTimes() - Method in class org.apache.spark.streaming.Checkpoint
-
- percentiles() - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- percentilesHeader() - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- PeriodicGraphCheckpointer<VD,ED> - Class in org.apache.spark.mllib.impl
-
This class helps with persisting and checkpointing Graphs.
- PeriodicGraphCheckpointer(Graph<VD, ED>, int) - Constructor for class org.apache.spark.mllib.impl.PeriodicGraphCheckpointer
-
- persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Set this RDD's storage level to persist its values across operations after the first time
it is computed.
- persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Set this RDD's storage level to persist its values across operations after the first time
it is computed.
- persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaRDD
-
Set this RDD's storage level to persist its values across operations after the first time
it is computed.
- persist(StorageLevel) - Method in class org.apache.spark.graphx.Graph
-
Caches the vertices and edges associated with this graph at the specified storage level,
ignoring any target storage levels previously set.
- persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
Persists the edge partitions at the specified storage level, ignoring any existing target
storage level.
- persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
Persists the vertex partitions at the specified storage level, ignoring any existing target
storage level.
- persist(StorageLevel) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Persists the underlying RDD with the specified storage level.
- persist(StorageLevel) - Method in class org.apache.spark.rdd.HadoopRDD
-
- persist(StorageLevel) - Method in class org.apache.spark.rdd.NewHadoopRDD
-
- persist(StorageLevel) - Method in class org.apache.spark.rdd.RDD
-
Set this RDD's storage level to persist its values across operations after the first time
it is computed.
- persist() - Method in class org.apache.spark.rdd.RDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- persist() - Method in class org.apache.spark.sql.DataFrame
-
- persist(StorageLevel) - Method in class org.apache.spark.sql.DataFrame
-
- persist() - Method in interface org.apache.spark.sql.RDDApi
-
- persist(StorageLevel) - Method in interface org.apache.spark.sql.RDDApi
-
- persist() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Persist the RDDs of this DStream with the given storage level
- persist() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Persist the RDDs of this DStream with the given storage level
- persist(StorageLevel) - Method in class org.apache.spark.streaming.dstream.DStream
-
Persist the RDDs of this DStream with the given storage level
- persist() - Method in class org.apache.spark.streaming.dstream.DStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- persist(StorageLevel) - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
-
- persist(StorageLevel) - Method in class org.apache.spark.streaming.dstream.WindowedDStream
-
- persistentRdds() - Method in class org.apache.spark.SparkContext
-
- persistRDD(RDD<?>) - Method in class org.apache.spark.SparkContext
-
Register an RDD to be persisted in memory and/or disk storage
- pi() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- pickBin(Partition) - Method in class org.apache.spark.rdd.PartitionCoalescer
-
Takes a parent RDD partition and decides which of the partition groups to put it in.
Takes locality into account, but also uses power of 2 choices to load balance.
It strikes a balance between the two using the balanceSlack variable.
- pickRandomVertex() - Method in class org.apache.spark.graphx.GraphOps
-
Picks a random vertex from the graph and returns its ID.
- pipe(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD created by piping elements to a forked external process.
- pipe(List<String>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD created by piping elements to a forked external process.
- pipe(List<String>, Map<String, String>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD created by piping elements to a forked external process.
- pipe(String) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD created by piping elements to a forked external process.
- pipe(String, Map<String, String>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD created by piping elements to a forked external process.
- pipe(Seq<String>, Map<String, String>, Function1<Function1<String, BoxedUnit>, BoxedUnit>, Function2<T, Function1<String, BoxedUnit>, BoxedUnit>, boolean) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD created by piping elements to a forked external process.
- PipedRDD<T> - Class in org.apache.spark.rdd
-
An RDD that pipes the contents of each parent partition through an external command
(printing them one per line) and returns the output as a collection of strings.
- PipedRDD(RDD<T>, Seq<String>, Map<String, String>, Function1<Function1<String, BoxedUnit>, BoxedUnit>, Function2<T, Function1<String, BoxedUnit>, BoxedUnit>, boolean, ClassTag<T>) - Constructor for class org.apache.spark.rdd.PipedRDD
-
- PipedRDD(RDD<T>, String, Map<String, String>, Function1<Function1<String, BoxedUnit>, BoxedUnit>, Function2<T, Function1<String, BoxedUnit>, BoxedUnit>, boolean, ClassTag<T>) - Constructor for class org.apache.spark.rdd.PipedRDD
-
- PipedRDD.NotEqualsFileNameFilter - Class in org.apache.spark.rdd
-
A FilenameFilter that accepts anything that isn't equal to the name passed in.
- PipedRDD.NotEqualsFileNameFilter(String) - Constructor for class org.apache.spark.rdd.PipedRDD.NotEqualsFileNameFilter
-
- Pipeline - Class in org.apache.spark.ml
-
:: AlphaComponent ::
A simple pipeline, which acts as an estimator.
- Pipeline() - Constructor for class org.apache.spark.ml.Pipeline
-
- PipelineModel - Class in org.apache.spark.ml
-
:: AlphaComponent ::
Represents a compiled pipeline.
- PipelineModel(Pipeline, ParamMap, Transformer[]) - Constructor for class org.apache.spark.ml.PipelineModel
-
- PipelineStage - Class in org.apache.spark.ml
-
- PipelineStage() - Constructor for class org.apache.spark.ml.PipelineStage
-
- plan() - Method in class org.apache.spark.sql.CachedData
-
- PluggableInputDStream<T> - Class in org.apache.spark.streaming.dstream
-
- PluggableInputDStream(StreamingContext, Receiver<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.PluggableInputDStream
-
- plus(Object) - Method in class org.apache.spark.sql.Column
-
Sum of this expression and another expression.
- plus(Duration) - Method in class org.apache.spark.streaming.Duration
-
- plus(Duration) - Method in class org.apache.spark.streaming.Time
-
- plusDot(Vector, Vector) - Method in class org.apache.spark.util.Vector
-
Returns (this + plus) dot other, but without creating any intermediate storage.
- point() - Method in class org.apache.spark.mllib.feature.VocabWord
-
- pointCost(TraversableOnce<VectorWithNorm>, VectorWithNorm) - Static method in class org.apache.spark.mllib.clustering.KMeans
-
Returns the K-means cost of a given point against the given cluster centers.
- POINTS() - Static method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
- PoissonBounds - Class in org.apache.spark.util.random
-
Utility functions that help us determine bounds on adjusted sampling rate to guarantee exact
sample sizes with high confidence when sampling with replacement.
- PoissonBounds() - Constructor for class org.apache.spark.util.random.PoissonBounds
-
- PoissonGenerator - Class in org.apache.spark.mllib.random
-
:: DeveloperApi ::
Generates i.i.d. samples from the Poisson distribution with the given mean.
- PoissonGenerator(double) - Constructor for class org.apache.spark.mllib.random.PoissonGenerator
-
- poissonJavaRDD(JavaSparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- poissonJavaRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- poissonJavaRDD(JavaSparkContext, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- poissonJavaVectorRDD(JavaSparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- poissonJavaVectorRDD(JavaSparkContext, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- poissonJavaVectorRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- poissonRDD(SparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD comprised of i.i.d. samples from the Poisson distribution with the input mean.
- PoissonSampler<T> - Class in org.apache.spark.util.random
-
:: DeveloperApi ::
A sampler for sampling with replacement, based on values drawn from Poisson distribution.
- PoissonSampler(double, ClassTag<T>) - Constructor for class org.apache.spark.util.random.PoissonSampler
-
- poissonVectorRDD(SparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the
Poisson distribution with the input mean.
- pollDir() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- pollPeriod() - Method in class org.apache.spark.metrics.sink.ConsoleSink
-
- pollPeriod() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- pollPeriod() - Method in class org.apache.spark.metrics.sink.GraphiteSink
-
- pollUnit() - Method in class org.apache.spark.metrics.sink.ConsoleSink
-
- pollUnit() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- pollUnit() - Method in class org.apache.spark.metrics.sink.GraphiteSink
-
- Pool - Class in org.apache.spark.scheduler
-
A Schedulable entity that represents a collection of Pools or TaskSetManagers.
- Pool(String, Enumeration.Value, int, int) - Constructor for class org.apache.spark.scheduler.Pool
-
- POOL_NAME_PROPERTY() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- poolName() - Method in class org.apache.spark.scheduler.Pool
-
- PoolPage - Class in org.apache.spark.ui.jobs
-
Page showing specific pool details
- PoolPage(StagesTab) - Constructor for class org.apache.spark.ui.jobs.PoolPage
-
- POOLS_PROPERTY() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- PoolTable - Class in org.apache.spark.ui.jobs
-
Table showing list of pools
- PoolTable(Seq<Schedulable>, StagesTab) - Constructor for class org.apache.spark.ui.jobs.PoolTable
-
- poolToActiveStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- port() - Method in class org.apache.spark.metrics.sink.GraphiteSink
-
- port() - Method in class org.apache.spark.storage.BlockManagerId
-
- port() - Method in class org.apache.spark.streaming.kafka.Broker
-
Broker's port
- port() - Method in class org.apache.spark.streaming.kafka.KafkaCluster.LeaderOffset
-
- port() - Method in class org.apache.spark.streaming.kafka.KafkaRDDPartition
-
- PortableDataStream - Class in org.apache.spark.input
-
A class that allows DataStreams to be serialized and moved around by not creating them
until they need to be read
- PortableDataStream(CombineFileSplit, TaskAttemptContext, Integer) - Constructor for class org.apache.spark.input.PortableDataStream
-
- portMaxRetries(SparkConf) - Static method in class org.apache.spark.util.Utils
-
Maximum number of retries when binding to a port before giving up.
- pos() - Method in interface org.apache.spark.sql.columnar.NullableColumnAccessor
-
- pos() - Method in interface org.apache.spark.sql.columnar.NullableColumnBuilder
-
- post(E) - Method in class org.apache.spark.util.AsynchronousListenerBus
-
- post(E) - Method in class org.apache.spark.util.EventLoop
-
Put the event into the event queue.
- PostgresQuirks - Class in org.apache.spark.sql.jdbc
-
- PostgresQuirks() - Constructor for class org.apache.spark.sql.jdbc.PostgresQuirks
-
- postStartHook() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- postStartHook() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- postToAll(E) - Method in interface org.apache.spark.util.ListenerBus
-
Post the event to all registered listeners.
- powerIter(Graph<Object, Object>, int) - Static method in class org.apache.spark.mllib.clustering.PowerIterationClustering
-
Runs power iteration.
- PowerIterationClustering - Class in org.apache.spark.mllib.clustering
-
:: Experimental ::
- PowerIterationClustering(int, int, String) - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering
-
- PowerIterationClustering() - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering
-
Constructs a PIC instance with default parameters: {k: 2, maxIterations: 100,
initMode: "random"}.
- PowerIterationClustering.Assignment - Class in org.apache.spark.mllib.clustering
-
:: Experimental ::
Cluster assignment.
- PowerIterationClustering.Assignment(long, int) - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
-
- PowerIterationClusteringModel - Class in org.apache.spark.mllib.clustering
-
:: Experimental ::
- PowerIterationClusteringModel(int, RDD<PowerIterationClustering.Assignment>) - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
-
- pr() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the precision-recall curve, which is an RDD of (recall, precision),
NOT (precision, recall), with (0.0, 1.0) prepended to it.
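The (recall, precision) ordering described for pr() can be illustrated with a small stand-alone sketch. This is not Spark's implementation: the class name PrCurveSketch, the method prCurve, and the array-based inputs are hypothetical, and the sketch assumes one curve point per distinct score threshold.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of Spark: builds a precision-recall curve
// from raw scores and binary labels held in plain arrays.
final class PrCurveSketch {

    /** Returns points as {recall, precision}, starting with (0.0, 1.0). */
    static List<double[]> prCurve(double[] scores, boolean[] labels) {
        Integer[] order = new Integer[scores.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort indices by descending score so every prefix is "predicted positive".
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> -scores[i]));

        int totalPositives = 0;
        for (boolean label : labels) {
            if (label) {
                totalPositives++;
            }
        }

        List<double[]> curve = new ArrayList<>();
        curve.add(new double[] {0.0, 1.0}); // the prepended point
        int truePositives = 0;
        for (int rank = 0; rank < order.length; rank++) {
            if (labels[order[rank]]) {
                truePositives++;
            }
            // Emit a point only once per distinct score (end of a tie group).
            boolean endOfTieGroup = rank == order.length - 1
                    || scores[order[rank]] != scores[order[rank + 1]];
            if (!endOfTieGroup) {
                continue;
            }
            double recall = totalPositives == 0 ? 0.0 : (double) truePositives / totalPositives;
            double precision = (double) truePositives / (rank + 1);
            curve.add(new double[] {recall, precision}); // recall first, then precision
        }
        return curve;
    }

    public static void main(String[] args) {
        List<double[]> curve = prCurve(new double[] {0.9, 0.8, 0.3},
                                       new boolean[] {true, false, true});
        for (double[] point : curve) {
            System.out.println("recall=" + point[0] + " precision=" + point[1]);
        }
    }
}
```

Each point keeps recall in position 0 and precision in position 1, matching the ordering the entry warns about.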
- Precision - Class in org.apache.spark.mllib.evaluation.binary
-
Precision.
- Precision() - Constructor for class org.apache.spark.mllib.evaluation.binary.Precision
-
- precision(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns precision for a given label (category).
- precision() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns precision.
- precision() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns document-based precision averaged by the number of documents.
- precision(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns precision for a given label (category).
- precisionAt(int) - Method in class org.apache.spark.mllib.evaluation.RankingMetrics
-
Compute the average precision of all the queries, truncated at ranking position k.
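As a rough illustration of "truncated at ranking position k", the sketch below averages, over all queries, the fraction of the first k predicted items that are relevant. It is a stand-alone approximation, not Spark's code: PrecisionAtKSketch and its int[]-based inputs are hypothetical, and it assumes a query with fewer than k predictions is still divided by k.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Spark: precision at k over several queries.
final class PrecisionAtKSketch {

    /**
     * predicted.get(q) is the ranked list of item ids for query q;
     * relevant.get(q) is the set of ground-truth item ids for the same query.
     */
    static double precisionAt(List<int[]> predicted, List<int[]> relevant, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("ranking position k must be positive");
        }
        if (predicted.size() != relevant.size()) {
            throw new IllegalArgumentException("one relevant set is required per query");
        }
        if (predicted.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (int q = 0; q < predicted.size(); q++) {
            Set<Integer> relevantIds = new HashSet<>();
            for (int id : relevant.get(q)) {
                relevantIds.add(id);
            }
            int[] ranked = predicted.get(q);
            int limit = Math.min(k, ranked.length); // truncate the ranking at k
            int hits = 0;
            for (int i = 0; i < limit; i++) {
                if (relevantIds.contains(ranked[i])) {
                    hits++;
                }
            }
            sum += (double) hits / k; // assumption: always divide by k
        }
        return sum / predicted.size(); // average over queries
    }

    public static void main(String[] args) {
        List<int[]> predicted = Arrays.asList(new int[] {1, 2, 3}, new int[] {4, 5, 6});
        List<int[]> relevant = Arrays.asList(new int[] {1, 3}, new int[] {7});
        System.out.println(precisionAt(predicted, relevant, 2));
    }
}
```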
- precisionByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the (threshold, precision) curve.
- predicates() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
-
- predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
-
Predict values for the given data set using the model trained.
- predict(Vector) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
-
Predict values for a single data point using the model trained.
- predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
-
Predict values for examples stored in a JavaRDD.
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- predict(Vector) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
-
Maps given points to their cluster indices.
- predict(Vector) - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
Returns the cluster index that a given point belongs to.
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
Maps given points to their cluster indices.
- predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
Maps given points to their cluster indices.
- predict(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Predict the rating of one user for one product.
- predict(RDD<Tuple2<Object, Object>>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Predict the rating of many users for many products.
- predict(JavaPairRDD<Integer, Integer>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Java-friendly version of MatrixFactorizationModel.predict.
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
-
Predict values for the given data set using the model trained.
- predict(Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
-
Predict values for a single data point using the model trained.
- predict(RDD<Object>) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
-
Predict labels for provided features.
- predict(JavaDoubleRDD) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
-
Predict labels for provided features.
- predict(double) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
-
Predict a single label.
- predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel
-
Predict values for the given data set using the model trained.
- predict(Vector) - Method in interface org.apache.spark.mllib.regression.RegressionModel
-
Predict values for a single data point using the model trained.
- predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel
-
Predict values for examples stored in a JavaRDD.
- predict() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
-
Prediction which should be made based on the sufficient statistics.
- predict() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
-
Prediction which should be made based on the sufficient statistics.
- predict() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
-
Prediction which should be made based on the sufficient statistics.
- predict() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
-
Prediction which should be made based on the sufficient statistics.
- predict(Vector) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Predict values for a single data point using the model trained.
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Predict values for the given data set using the model trained.
- predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Predict values for the given data set using the model trained.
- predict() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
-
- predict() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData
-
- predict() - Method in class org.apache.spark.mllib.tree.model.Node
-
- predict(Vector) - Method in class org.apache.spark.mllib.tree.model.Node
-
Predict value if node is not leaf.
- Predict - Class in org.apache.spark.mllib.tree.model
-
Predicted value for a node
- Predict(double, double) - Constructor for class org.apache.spark.mllib.tree.model.Predict
-
- predict() - Method in class org.apache.spark.mllib.tree.model.Predict
-
- predict(Vector) - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel
-
Predict values for a single data point using the model trained.
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel
-
Predict values for the given data set.
- predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel
-
- predictionCol() - Method in interface org.apache.spark.ml.param.HasPredictionCol
-
Param for prediction column name.
- PredictionModel<FeaturesType,M extends PredictionModel<FeaturesType,M>> - Class in org.apache.spark.ml.impl.estimator
-
:: AlphaComponent ::
- PredictionModel() - Constructor for class org.apache.spark.ml.impl.estimator.PredictionModel
-
- predictions() - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
-
- predictOn(DStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Use the clustering model to make predictions on batches of data from a DStream.
- predictOn(DStream<Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Use the model to make predictions on batches of data from a DStream.
- predictOn(JavaDStream<Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Java-friendly version of `predictOn`.
- predictOnValues(DStream<Tuple2<K, Vector>>, ClassTag<K>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Use the model to make predictions on the values of a DStream and carry over its keys.
- predictOnValues(DStream<Tuple2<K, Vector>>, ClassTag<K>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Use the model to make predictions on the values of a DStream and carry over its keys.
- predictOnValues(JavaPairDStream<K, Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Java-friendly version of `predictOnValues`.
- Predictor<FeaturesType,Learner extends Predictor<FeaturesType,Learner,M>,M extends PredictionModel<FeaturesType,M>> - Class in org.apache.spark.ml.impl.estimator
-
:: AlphaComponent ::
- Predictor() - Constructor for class org.apache.spark.ml.impl.estimator.Predictor
-
- PredictorParams - Interface in org.apache.spark.ml.impl.estimator
-
:: DeveloperApi ::
- predictSoft(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
-
Given the input vectors, return the membership value of each vector
to all mixture components.
- preferredLocation() - Method in class org.apache.spark.rdd.CoalescedRDDPartition
-
- preferredLocation() - Method in class org.apache.spark.streaming.flume.FlumeReceiver
-
- preferredLocation() - Method in class org.apache.spark.streaming.receiver.Receiver
-
Override this to specify a preferred location (hostname).
- preferredLocations(Partition) - Method in class org.apache.spark.rdd.RDD
-
Get the preferred locations of a partition, taking into account whether the
RDD is checkpointed.
- preferredLocations() - Method in class org.apache.spark.rdd.UnionPartition
-
- preferredLocations() - Method in class org.apache.spark.rdd.ZippedPartitionsPartition
-
- preferredLocations() - Method in class org.apache.spark.scheduler.ResultTask
-
- preferredLocations() - Method in class org.apache.spark.scheduler.ShuffleMapTask
-
- preferredLocations() - Method in class org.apache.spark.scheduler.Task
-
- preferredNodeLocationData() - Method in class org.apache.spark.SparkContext
-
- prefix() - Method in class org.apache.spark.metrics.sink.GraphiteSink
-
- PREFIX() - Static method in class org.apache.spark.streaming.Checkpoint
-
- prefix() - Method in class org.apache.spark.ui.WebUIPage
-
- prefix() - Method in class org.apache.spark.ui.WebUITab
-
- prefLoc() - Method in class org.apache.spark.rdd.PartitionGroup
-
- pregel(A, int, EdgeDirection, Function3<Object, VD, A, VD>, Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, ClassTag<A>) - Method in class org.apache.spark.graphx.GraphOps
-
Execute a Pregel-like iterative vertex-parallel abstraction.
- Pregel - Class in org.apache.spark.graphx
-
Implements a Pregel-like bulk-synchronous message-passing API.
- Pregel() - Constructor for class org.apache.spark.graphx.Pregel
-
- PreInsertCastAndRename - Class in org.apache.spark.sql.sources
-
A rule to do pre-insert data type casting and field renaming.
- PreInsertCastAndRename() - Constructor for class org.apache.spark.sql.sources.PreInsertCastAndRename
-
- PreInsertionCasts() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
-
- prepare(int) - Method in class org.apache.spark.sql.hive.DeferredObjectAdapter
-
- prepareForRead(Configuration, Map<String, String>, MessageType, ReadSupport.ReadContext) - Method in class org.apache.spark.sql.parquet.RowReadSupport
-
- prepareForWrite(RecordConsumer) - Method in class org.apache.spark.sql.parquet.RowWriteSupport
-
- prepareForWrite(RecordConsumer) - Method in class org.apache.spark.sql.parquet.TestGroupWriteSupport
-
- prepareWritable(Writable) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- prependBaseUri(String, String) - Static method in class org.apache.spark.ui.UIUtils
-
- preSetup() - Method in class org.apache.spark.SparkHadoopWriter
-
- preStart() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
-
- preStart() - Method in class org.apache.spark.storage.BlockManagerMasterActor
-
- preStart() - Method in class org.apache.spark.streaming.zeromq.ZeroMQReceiver
-
- prettyPrint() - Method in class org.apache.spark.streaming.Duration
-
- prev() - Method in class org.apache.spark.mllib.rdd.SlidingRDDPartition
-
- prev() - Method in class org.apache.spark.rdd.CoalescedRDD
-
- prev() - Method in class org.apache.spark.rdd.PartitionwiseSampledRDDPartition
-
- prev() - Method in class org.apache.spark.rdd.SampledRDDPartition
-
- prev() - Method in class org.apache.spark.rdd.ShuffledRDD
-
- prev() - Method in class org.apache.spark.rdd.ZippedWithIndexRDDPartition
-
- prevHandler() - Method in class org.apache.spark.util.SignalLoggerHandler
-
- PreWriteCheck - Class in org.apache.spark.sql.sources
-
A rule to do various checks before inserting into or writing to a data source table.
- PreWriteCheck(Catalog) - Constructor for class org.apache.spark.sql.sources.PreWriteCheck
-
- primitiveType() - Method in class org.apache.spark.sql.parquet.ParquetTypeInfo
-
- print() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Print the first ten elements of each RDD generated in this DStream.
- print(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Print the first num elements of each RDD generated in this DStream.
- print() - Method in class org.apache.spark.streaming.dstream.DStream
-
Print the first ten elements of each RDD generated in this DStream.
- print(int) - Method in class org.apache.spark.streaming.dstream.DStream
-
Print the first num elements of each RDD generated in this DStream.
- printSchema() - Method in class org.apache.spark.sql.DataFrame
-
Prints the schema to the console in a nice tree format.
- printStats() - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
-
- prioritizeContainers(HashMap<K, ArrayBuffer<T>>) - Static method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
Used to balance containers across hosts.
- priority() - Method in class org.apache.spark.scheduler.Pool
-
- priority() - Method in interface org.apache.spark.scheduler.Schedulable
-
- priority() - Method in class org.apache.spark.scheduler.TaskSet
-
- priority() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- prob(double) - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
-
Probability of the label given by predict.
- prob(double) - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
-
Probability of the label given by predict.
- prob(double) - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
-
Probability of the label given by predict, or -1 if no probability is available.
- prob() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData
-
- prob() - Method in class org.apache.spark.mllib.tree.model.Predict
-
- ProbabilisticClassificationModel<FeaturesType,M extends ProbabilisticClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification
-
:: AlphaComponent ::
- ProbabilisticClassificationModel() - Constructor for class org.apache.spark.ml.classification.ProbabilisticClassificationModel
-
- ProbabilisticClassifier<FeaturesType,E extends ProbabilisticClassifier<FeaturesType,E,M>,M extends ProbabilisticClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification
-
:: AlphaComponent ::
- ProbabilisticClassifier() - Constructor for class org.apache.spark.ml.classification.ProbabilisticClassifier
-
- ProbabilisticClassifierParams - Interface in org.apache.spark.ml.classification
-
Params for probabilistic classification.
- probabilities() - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- probabilityCol() - Method in interface org.apache.spark.ml.param.HasProbabilityCol
-
Param for predicted class conditional probabilities column name.
- PROCESS_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
-
- processingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
Time taken for all jobs of this batch to finish processing from the time they started
processing.
- processingDelay() - Method in class org.apache.spark.streaming.scheduler.JobSet
-
- processingDelayDistribution() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
-
- processingEndTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
- processingStartTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
- processRecords(List<Record>, IRecordProcessorCheckpointer) - Method in class org.apache.spark.streaming.kinesis.KinesisRecordProcessor
-
This method is called by the KCL when a batch of records is pulled from the Kinesis stream.
- processResults(ArrayList<Object>) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- processStreamByLine(String, InputStream, Function1<String, BoxedUnit>) - Static method in class org.apache.spark.util.Utils
-
Return and start a daemon thread that processes the content of the input stream line by line.
- product() - Method in class org.apache.spark.mllib.recommendation.Rating
-
- productFeatures() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
- progressBar() - Method in class org.apache.spark.SparkContext
-
- progressListener() - Method in class org.apache.spark.streaming.StreamingContext
-
- properties() - Method in class org.apache.spark.metrics.MetricsConfig
-
- properties() - Method in class org.apache.spark.scheduler.ActiveJob
-
- properties() - Method in class org.apache.spark.scheduler.JobSubmitted
-
- properties() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
-
- properties() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
-
- properties() - Method in class org.apache.spark.scheduler.TaskSet
-
- propertiesFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- propertiesToJson(Properties) - Static method in class org.apache.spark.util.JsonProtocol
-
- property() - Method in class org.apache.spark.metrics.sink.ConsoleSink
-
- property() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- property() - Method in class org.apache.spark.metrics.sink.GraphiteSink
-
- property() - Method in class org.apache.spark.metrics.sink.JmxSink
-
- property() - Method in class org.apache.spark.metrics.sink.MetricsServlet
-
- propertyCategories() - Method in class org.apache.spark.metrics.MetricsConfig
-
- propertyToOption(String) - Method in class org.apache.spark.metrics.sink.GraphiteSink
-
- protocol() - Method in class org.apache.spark.SSLOptions
-
- protocol(ActorSystem) - Static method in class org.apache.spark.util.AkkaUtils
-
- protocol(boolean) - Static method in class org.apache.spark.util.AkkaUtils
-
- provider() - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSource
-
- provider() - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSourceAsSelect
-
- provider() - Method in class org.apache.spark.sql.sources.CreateTableUsing
-
- provider() - Method in class org.apache.spark.sql.sources.CreateTableUsingAsSelect
-
- provider() - Method in class org.apache.spark.sql.sources.CreateTempTableUsing
-
- provider() - Method in class org.apache.spark.sql.sources.CreateTempTableUsingAsSelect
-
- provider() - Method in class org.apache.spark.sql.sources.ResolvedDataSource
-
- proxyBase() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
-
- pruneColumns(Seq<Attribute>) - Method in class org.apache.spark.sql.parquet.ParquetTableScan
-
Applies a (candidate) projection.
- PruneDependency<T> - Class in org.apache.spark.rdd
-
Represents a dependency between the PartitionPruningRDD and its parent.
- PruneDependency(RDD<T>, Function1<Object, Object>) - Constructor for class org.apache.spark.rdd.PruneDependency
-
- PrunedFilteredScan - Interface in org.apache.spark.sql.sources
-
:: DeveloperApi ::
A BaseRelation that can eliminate unneeded columns and filter using selected
predicates before producing an RDD containing all matching tuples as Row objects.
- PrunedScan - Interface in org.apache.spark.sql.sources
-
:: DeveloperApi ::
A BaseRelation that can eliminate unneeded columns before producing an RDD
containing all of its tuples as Row objects.
- prunePartitions(Seq<Partition>) - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
-
Prunes partitions not involved in the query plan.
- Pseudorandom - Interface in org.apache.spark.util.random
-
:: DeveloperApi ::
A class with pseudorandom behavior.
- pushAndReportBlock(ReceivedBlock, Option<Object>, Option<StreamBlockId>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisorImpl
-
Store block and report it to driver.
- pushArrayBuffer(ArrayBuffer<?>, Option<Object>, Option<StreamBlockId>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
-
Store an ArrayBuffer of received data as a data block into Spark's memory.
- pushArrayBuffer(ArrayBuffer<?>, Option<Object>, Option<StreamBlockId>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisorImpl
-
Store an ArrayBuffer of received data as a data block into Spark's memory.
- pushBytes(ByteBuffer, Option<Object>, Option<StreamBlockId>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
-
Store the bytes of received data as a data block into Spark's memory.
- pushBytes(ByteBuffer, Option<Object>, Option<StreamBlockId>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisorImpl
-
Store the bytes of received data as a data block into Spark's memory.
- pushIterator(Iterator<Object>, Option<Object>, Option<StreamBlockId>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
-
Store an iterator of received data as a data block into Spark's memory.
- pushIterator(Iterator<Object>, Option<Object>, Option<StreamBlockId>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisorImpl
-
Store an iterator of received data as a data block into Spark's memory.
- pushSingle(Object) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
-
Push a single data item to backend data store.
- pushSingle(Object) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisorImpl
-
Push a single record of received data into block generator.
- put(ParamPair<?>...) - Method in class org.apache.spark.ml.param.ParamMap
-
Puts a list of param pairs (overwrites if the input params exist).
- put(Param<T>, T) - Method in class org.apache.spark.ml.param.ParamMap
-
Puts a (param, value) pair (overwrites if the input param exists).
- put(Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.param.ParamMap
-
Puts a list of param pairs (overwrites if the input params exist).
- putAll(Map<A, B>) - Method in class org.apache.spark.util.TimeStampedHashMap
-
- putArray(BlockId, Object[], StorageLevel, boolean, Option<StorageLevel>) - Method in class org.apache.spark.storage.BlockManager
-
Put a new block of values to the block manager.
- putArray(BlockId, Object[], StorageLevel, boolean) - Method in class org.apache.spark.storage.BlockStore
-
- putArray(BlockId, Object[], StorageLevel, boolean) - Method in class org.apache.spark.storage.DiskStore
-
- putArray(BlockId, Object[], StorageLevel, boolean) - Method in class org.apache.spark.storage.MemoryStore
-
- putArray(BlockId, Object[], StorageLevel, boolean) - Method in class org.apache.spark.storage.TachyonStore
-
- putBlockData(BlockId, ManagedBuffer, StorageLevel) - Method in class org.apache.spark.storage.BlockManager
-
Put the block locally, using the given storage level.
- putBytes(BlockId, ByteBuffer, StorageLevel, boolean, Option<StorageLevel>) - Method in class org.apache.spark.storage.BlockManager
-
Put a new block of serialized bytes to the block manager.
- putBytes(BlockId, ByteBuffer, StorageLevel) - Method in class org.apache.spark.storage.BlockStore
-
- putBytes(BlockId, ByteBuffer, StorageLevel) - Method in class org.apache.spark.storage.DiskStore
-
- putBytes(BlockId, ByteBuffer, StorageLevel) - Method in class org.apache.spark.storage.MemoryStore
-
- putBytes(BlockId, ByteBuffer, StorageLevel) - Method in class org.apache.spark.storage.TachyonStore
-
- putCachedMetadata(String, Object) - Static method in class org.apache.spark.rdd.HadoopRDD
-
- putIfAbsent(A, B) - Method in class org.apache.spark.util.TimeStampedHashMap
-
- putIfAbsent(A, B) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- putIterator(BlockId, Iterator<Object>, StorageLevel, boolean, Option<StorageLevel>) - Method in class org.apache.spark.storage.BlockManager
-
- putIterator(BlockId, Iterator<Object>, StorageLevel, boolean) - Method in class org.apache.spark.storage.BlockStore
-
Put in a block and, possibly, also return its content as either bytes or another Iterator.
- putIterator(BlockId, Iterator<Object>, StorageLevel, boolean) - Method in class org.apache.spark.storage.DiskStore
-
- putIterator(BlockId, Iterator<Object>, StorageLevel, boolean) - Method in class org.apache.spark.storage.MemoryStore
-
- putIterator(BlockId, Iterator<Object>, StorageLevel, boolean, boolean) - Method in class org.apache.spark.storage.MemoryStore
-
Attempt to put the given block in memory store.
- putIterator(BlockId, Iterator<Object>, StorageLevel, boolean) - Method in class org.apache.spark.storage.TachyonStore
-
- PutResult - Class in org.apache.spark.storage
-
Result of adding a block into a BlockStore.
- PutResult(long, Either<Iterator<Object>, ByteBuffer>, Seq<Tuple2<BlockId, BlockStatus>>) - Constructor for class org.apache.spark.storage.PutResult
-
- putSingle(BlockId, Object, StorageLevel, boolean) - Method in class org.apache.spark.storage.BlockManager
-
Write a block consisting of a single object.
- pValue() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- pValue() - Method in interface org.apache.spark.mllib.stat.test.TestResult
-
The probability of obtaining a test statistic result at least as extreme as the one that was
actually observed, assuming that the null hypothesis is true.
- pythonExec() - Method in class org.apache.spark.sql.UserDefinedPythonFunction
-
- pythonIncludes() - Method in class org.apache.spark.sql.UserDefinedPythonFunction
-
- pyUDT() - Method in class org.apache.spark.mllib.linalg.VectorUDT
-
- pyUDT() - Method in class org.apache.spark.sql.test.ExamplePointUDT
-
- r2() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
-
Returns R², the coefficient of determination.
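For reference, the coefficient of determination is commonly defined as 1 - SS_res / SS_tot. The sketch below computes that textbook form on plain arrays; R2Sketch and its method are hypothetical names, and whether Spark's r2() uses exactly this formula is an assumption rather than something this index states.

```java
// Hypothetical helper, not part of Spark: textbook coefficient of determination.
final class R2Sketch {

    /** Returns 1 - SS_res / SS_tot for paired observed and predicted values. */
    static double r2(double[] observed, double[] predicted) {
        if (observed.length == 0 || observed.length != predicted.length) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double mean = 0.0;
        for (double y : observed) {
            mean += y;
        }
        mean /= observed.length;

        double ssRes = 0.0; // residual sum of squares
        double ssTot = 0.0; // total sum of squares around the observed mean
        for (int i = 0; i < observed.length; i++) {
            double residual = observed[i] - predicted[i];
            double deviation = observed[i] - mean;
            ssRes += residual * residual;
            ssTot += deviation * deviation;
        }
        // If every observation is identical, ssTot is 0 and the result is not finite.
        return 1.0 - ssRes / ssTot;
    }

    public static void main(String[] args) {
        System.out.println(r2(new double[] {1, 2, 3}, new double[] {1.1, 1.9, 3.2}));
    }
}
```

A perfect fit gives 1.0, and always predicting the observed mean gives 0.0.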
- RACK_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
-
- rand(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix
-
Generate a DenseMatrix consisting of i.i.d. uniform random numbers.
- rand(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Generate a DenseMatrix consisting of i.i.d. uniform random numbers.
- RAND() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- randn(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix
-
Generate a DenseMatrix consisting of i.i.d. gaussian random numbers.
- randn(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Generate a DenseMatrix consisting of i.i.d. gaussian random numbers.
- RANDOM() - Static method in class org.apache.spark.mllib.clustering.KMeans
-
- random() - Static method in class org.apache.spark.util.Utils
-
- random(int, Random) - Static method in class org.apache.spark.util.Vector
-
Creates a Vector of given length containing random numbers between 0.0 and 1.0.
- RandomDataGenerator<T> - Interface in org.apache.spark.mllib.random
-
:: DeveloperApi ::
Trait for random data generators that generate i.i.d. data.
- RandomForest - Class in org.apache.spark.mllib.tree
-
:: Experimental ::
A class that implements a Random Forest learning algorithm for classification and regression.
- RandomForest(Strategy, int, String, int) - Constructor for class org.apache.spark.mllib.tree.RandomForest
-
- RandomForest.NodeIndexInfo - Class in org.apache.spark.mllib.tree
-
- RandomForest.NodeIndexInfo(int, Option<int[]>) - Constructor for class org.apache.spark.mllib.tree.RandomForest.NodeIndexInfo
-
- RandomForestModel - Class in org.apache.spark.mllib.tree.model
-
:: Experimental ::
Represents a random forest model.
- RandomForestModel(Enumeration.Value, DecisionTreeModel[]) - Constructor for class org.apache.spark.mllib.tree.model.RandomForestModel
-
- randomInit(Graph<Object, Object>) - Static method in class org.apache.spark.mllib.clustering.PowerIterationClustering
-
Generates random vertex properties (v0) to start power iteration.
- randomize(TraversableOnce<T>, ClassTag<T>) - Static method in class org.apache.spark.util.Utils
-
Shuffle the elements of a collection into a random order, returning the
result in a new collection.
- randomizeInPlace(Object, Random) - Static method in class org.apache.spark.util.Utils
-
Shuffle the elements of an array into a random order, modifying the
original array.
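An in-place shuffle of this kind is usually done with the Fisher-Yates algorithm. The sketch below shows that technique on an int[]; ShuffleSketch is a hypothetical name, and that Utils.randomizeInPlace uses this exact loop is an assumption.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper, not part of Spark: Fisher-Yates shuffle on an int array.
final class ShuffleSketch {

    /** Shuffles arr in place and returns the same array instance. */
    static int[] randomizeInPlace(int[] arr, Random rand) {
        // Walk from the last slot down; swap each slot with a random slot at or before it.
        for (int i = arr.length - 1; i > 0; i--) {
            int j = rand.nextInt(i + 1); // uniform in [0, i]
            int tmp = arr[j];
            arr[j] = arr[i];
            arr[i] = tmp;
        }
        return arr;
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3, 4, 5};
        randomizeInPlace(values, new Random());
        System.out.println(Arrays.toString(values));
    }
}
```

Passing the Random in, as the indexed signature does, lets callers fix a seed when they need a reproducible order.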
- randomRDD(SparkContext, RandomDataGenerator<T>, long, int, long, ClassTag<T>) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
:: DeveloperApi ::
Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator.
- RandomRDD<T> - Class in org.apache.spark.mllib.rdd
-
- RandomRDD(SparkContext, long, int, RandomDataGenerator<T>, long, ClassTag<T>) - Constructor for class org.apache.spark.mllib.rdd.RandomRDD
-
- RandomRDDPartition<T> - Class in org.apache.spark.mllib.rdd
-
- RandomRDDPartition(int, int, RandomDataGenerator<T>, long) - Constructor for class org.apache.spark.mllib.rdd.RandomRDDPartition
-
- RandomRDDs - Class in org.apache.spark.mllib.random
-
:: Experimental ::
Generator methods for creating RDDs comprised of i.i.d. samples from some distribution.
- RandomRDDs() - Constructor for class org.apache.spark.mllib.random.RandomRDDs
-
- RandomSampler<T,U> - Interface in org.apache.spark.util.random
-
:: DeveloperApi ::
A pseudorandom sampler.
- randomSplit(double[]) - Method in class org.apache.spark.api.java.JavaRDD
-
Randomly splits this RDD with the provided weights.
- randomSplit(double[], long) - Method in class org.apache.spark.api.java.JavaRDD
-
Randomly splits this RDD with the provided weights.
- randomSplit(double[], long) - Method in class org.apache.spark.rdd.RDD
-
Randomly splits this RDD with the provided weights.
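One way to picture a weighted random split: normalize the weights into cumulative boundaries on [0, 1], draw one uniform number per element, and place the element in the bucket whose range contains the draw. The sketch below does this on a java.util.List; RandomSplitSketch is a hypothetical name, and it is only an approximation of the idea, not how RDD.randomSplit is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Spark: weighted random split of a local list.
final class RandomSplitSketch {

    static List<List<Integer>> randomSplit(List<Integer> data, double[] weights, long seed) {
        double sum = 0.0;
        for (double w : weights) {
            if (w < 0.0) {
                throw new IllegalArgumentException("weights must be non-negative");
            }
            sum += w;
        }
        if (sum <= 0.0) {
            throw new IllegalArgumentException("weights must sum to a positive value");
        }

        // upper[i] is the exclusive upper boundary of bucket i after normalization.
        double[] upper = new double[weights.length];
        double acc = 0.0;
        for (int i = 0; i < weights.length; i++) {
            acc += weights[i] / sum;
            upper[i] = acc;
        }
        upper[upper.length - 1] = 1.0; // guard against floating-point rounding

        List<List<Integer>> splits = new ArrayList<>();
        for (int i = 0; i < weights.length; i++) {
            splits.add(new ArrayList<>());
        }

        Random rand = new Random(seed);
        for (Integer item : data) {
            double u = rand.nextDouble(); // uniform in [0, 1)
            int bucket = 0;
            while (u >= upper[bucket]) {
                bucket++;
            }
            splits.get(bucket).add(item);
        }
        return splits;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            data.add(i);
        }
        System.out.println(randomSplit(data, new double[] {0.7, 0.3}, 11L));
    }
}
```

The weights do not need to sum to 1 here because they are normalized first; split sizes are only approximately proportional to the weights.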
- randomVectorRDD(SparkContext, RandomDataGenerator<Object>, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
:: DeveloperApi ::
Generates an RDD[Vector] with vectors containing i.i.d. samples produced by the
input RandomDataGenerator.
- RandomVectorRDD - Class in org.apache.spark.mllib.rdd
-
- RandomVectorRDD(SparkContext, long, int, int, RandomDataGenerator<Object>, long) - Constructor for class org.apache.spark.mllib.rdd.RandomVectorRDD
-
- RangeDependency<T> - Class in org.apache.spark
-
:: DeveloperApi ::
Represents a one-to-one dependency between ranges of partitions in the parent and child RDDs.
- RangeDependency(RDD<T>, int, int, int) - Constructor for class org.apache.spark.RangeDependency
-
- RangePartitioner<K,V> - Class in org.apache.spark
-
A Partitioner that partitions sortable records by range into roughly equal ranges.
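Range partitioning can be sketched in two steps: pick sorted boundary keys from a sample so that each range holds about the same number of keys, then locate a key's partition by binary search over those boundaries. The code below is a simplified stand-alone illustration with hypothetical names (RangePartitionSketch, chooseBounds, getPartition); it assumes distinct int keys and is not Spark's RangePartitioner.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Spark: range partitioning over int keys.
final class RangePartitionSketch {

    /** Picks numPartitions - 1 upper bounds at evenly spaced positions of the sorted sample. */
    static int[] chooseBounds(int[] sample, int numPartitions) {
        if (numPartitions < 1 || sample.length < numPartitions) {
            throw new IllegalArgumentException("need at least one sampled key per partition");
        }
        int[] sorted = sample.clone();
        Arrays.sort(sorted);
        int[] bounds = new int[numPartitions - 1];
        for (int i = 0; i < bounds.length; i++) {
            // Last key of the i-th roughly equal slice of the sorted sample.
            bounds[i] = sorted[(i + 1) * sorted.length / numPartitions - 1];
        }
        return bounds;
    }

    /** Keys less than or equal to bounds[i] (and above bounds[i - 1]) go to partition i. */
    static int getPartition(int key, int[] bounds) {
        int idx = Arrays.binarySearch(bounds, key);
        // A miss returns -(insertionPoint) - 1; the insertion point is the partition.
        return idx >= 0 ? idx : -idx - 1;
    }

    public static void main(String[] args) {
        int[] bounds = chooseBounds(new int[] {9, 1, 8, 2, 7, 3, 6, 4, 5}, 3);
        System.out.println(Arrays.toString(bounds));
        System.out.println(getPartition(4, bounds));
    }
}
```

Because partition indices follow key order, concatenating the partitions in index order yields keys grouped by ascending range.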
- RangePartitioner(int, RDD<? extends Product2<K, V>>, boolean, Ordering<K>, ClassTag<K>) - Constructor for class org.apache.spark.RangePartitioner
-
- rank() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- rank() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
Param for rank of the matrix factorization.
- rank() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
- RankingMetrics<T> - Class in org.apache.spark.mllib.evaluation
-
:: Experimental ::
Evaluator for ranking algorithms.
- RankingMetrics(RDD<Tuple2<Object, Object>>, ClassTag<T>) - Constructor for class org.apache.spark.mllib.evaluation.RankingMetrics
-
- RateLimitedOutputStream - Class in org.apache.spark.streaming.util
-
- RateLimitedOutputStream(OutputStream, int) - Constructor for class org.apache.spark.streaming.util.RateLimitedOutputStream
-
- RateLimiter - Class in org.apache.spark.streaming.receiver
-
Provides waitToPush() method to limit the rate at which receivers consume data.
- RateLimiter(SparkConf) - Constructor for class org.apache.spark.streaming.receiver.RateLimiter
-
- rating() - Method in class org.apache.spark.ml.recommendation.ALS.Rating
-
- Rating - Class in org.apache.spark.mllib.recommendation
-
A more compact class to represent a rating than Tuple3[Int, Int, Double].
- Rating(int, int, double) - Constructor for class org.apache.spark.mllib.recommendation.Rating
-
- rating() - Method in class org.apache.spark.mllib.recommendation.Rating
-
- ratingCol() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
Param for the column name for ratings.
- ratings() - Method in class org.apache.spark.ml.recommendation.ALS.InBlock
-
- ratings() - Method in class org.apache.spark.ml.recommendation.ALS.RatingBlock
-
- ratings() - Method in class org.apache.spark.ml.recommendation.ALS.UncompressedInBlock
-
- RawInputDStream<T> - Class in org.apache.spark.streaming.dstream
-
An input stream that reads blocks of serialized objects from a given network address.
- RawInputDStream(StreamingContext, String, int, StorageLevel, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.RawInputDStream
-
- RawNetworkReceiver - Class in org.apache.spark.streaming.dstream
-
- RawNetworkReceiver(String, int, StorageLevel) - Constructor for class org.apache.spark.streaming.dstream.RawNetworkReceiver
-
- rawPredictionCol() - Method in interface org.apache.spark.ml.param.HasRawPredictionCol
-
Param for the raw prediction column name.
- rawSocketStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream from network source hostname:port, where data is received
as serialized blocks (serialized using Spark's serializer) that can be directly
pushed into the block manager without deserializing them.
- rawSocketStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream from network source hostname:port, where data is received
as serialized blocks (serialized using Spark's serializer) that can be directly
pushed into the block manager without deserializing them.
- rawSocketStream(String, int, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream from network source hostname:port, where data is received
as serialized blocks (serialized using Spark's serializer) that can be directly
pushed into the block manager without deserializing them.
- RawTextHelper - Class in org.apache.spark.streaming.util
-
- RawTextHelper() - Constructor for class org.apache.spark.streaming.util.RawTextHelper
-
- RawTextSender - Class in org.apache.spark.streaming.util
-
A helper program that sends blocks of Kryo-serialized text strings out on a socket at a
specified rate.
- RawTextSender() - Constructor for class org.apache.spark.streaming.util.RawTextSender
-
- rdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
- rdd() - Method in class org.apache.spark.api.java.JavaPairRDD
-
- rdd() - Method in class org.apache.spark.api.java.JavaRDD
-
- rdd() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- rdd() - Method in class org.apache.spark.Dependency
-
- rdd() - Method in class org.apache.spark.NarrowDependency
-
- rdd() - Method in class org.apache.spark.rdd.CoalescedRDDPartition
-
- rdd() - Method in class org.apache.spark.rdd.NarrowCoGroupSplitDep
-
- RDD<T> - Class in org.apache.spark.rdd
-
A Resilient Distributed Dataset (RDD), the basic abstraction in Spark.
- RDD(SparkContext, Seq<Dependency<?>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD
-
- RDD(RDD<?>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD
-
Construct an RDD with just a one-to-one dependency on one parent
- rdd() - Method in class org.apache.spark.scheduler.Stage
-
- rdd() - Method in class org.apache.spark.ShuffleDependency
-
- rdd() - Method in class org.apache.spark.sql.DataFrame
-
Returns the content of the DataFrame as an RDD of Rows.
- RDD() - Static method in class org.apache.spark.storage.BlockId
-
- rdd1() - Method in class org.apache.spark.rdd.CartesianRDD
-
- rdd1() - Method in class org.apache.spark.rdd.SubtractedRDD
-
- rdd1() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD2
-
- rdd1() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
-
- rdd1() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
-
- rdd2() - Method in class org.apache.spark.rdd.CartesianRDD
-
- rdd2() - Method in class org.apache.spark.rdd.SubtractedRDD
-
- rdd2() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD2
-
- rdd2() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
-
- rdd2() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
-
- rdd3() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD3
-
- rdd3() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
-
- rdd4() - Method in class org.apache.spark.rdd.ZippedPartitionsRDD4
-
- RDDApi<T> - Interface in org.apache.spark.sql
-
An internal interface defining the RDD-like methods for DataFrame.
- RDDBlockId - Class in org.apache.spark.storage
-
- RDDBlockId(int, int) - Constructor for class org.apache.spark.storage.RDDBlockId
-
- rddBlocks() - Method in class org.apache.spark.storage.StorageStatus
-
Return the RDD blocks stored in this block manager.
- rddBlocks() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
-
- rddBlocksById(int) - Method in class org.apache.spark.storage.StorageStatus
-
Return the blocks that belong to the given RDD stored in this block manager.
- RDDCheckpointData<T> - Class in org.apache.spark.rdd
-
This class contains all the information related to RDD checkpointing.
- RDDCheckpointData(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDDCheckpointData
-
- rddCleaned(int) - Method in interface org.apache.spark.CleanerListener
-
- RDDFunctions<T> - Class in org.apache.spark.mllib.rdd
-
Machine learning specific RDD functions.
- RDDFunctions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.mllib.rdd.RDDFunctions
-
- rddId() - Method in class org.apache.spark.CleanRDD
-
- rddId() - Method in class org.apache.spark.rdd.ParallelCollectionPartition
-
- rddId() - Method in class org.apache.spark.scheduler.SparkListenerUnpersistRDD
-
- rddId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveRdd
-
- rddId() - Method in class org.apache.spark.storage.RDDBlockId
-
- RDDInfo - Class in org.apache.spark.storage
-
- RDDInfo(int, String, int, StorageLevel) - Constructor for class org.apache.spark.storage.RDDInfo
-
- rddInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- rddInfoList() - Method in class org.apache.spark.ui.storage.StorageListener
-
Filter RDD info to include only those with cached partitions
- rddInfos() - Method in class org.apache.spark.scheduler.StageInfo
-
- rddInfoToJson(RDDInfo) - Static method in class org.apache.spark.util.JsonProtocol
-
- RDDPage - Class in org.apache.spark.ui.storage
-
Page showing storage details for a given RDD
- RDDPage(StorageTab) - Constructor for class org.apache.spark.ui.storage.RDDPage
-
- rdds() - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- rdds() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDD
-
- rdds() - Method in class org.apache.spark.rdd.PartitionerAwareUnionRDDPartition
-
- rdds() - Method in class org.apache.spark.rdd.UnionRDD
-
- rdds() - Method in class org.apache.spark.rdd.ZippedPartitionsBaseRDD
-
- rddStorageLevel(int) - Method in class org.apache.spark.storage.StorageStatus
-
Return the storage level, if any, used by the given RDD in this block manager.
- rddToAsyncRDDActions(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.rdd.RDD
-
- rddToAsyncRDDActions(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.SparkContext
-
- rddToDataFrameHolder(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext.implicits
-
Creates a DataFrame from an RDD of case classes or tuples.
- rddToFileName(String, String, Time) - Static method in class org.apache.spark.streaming.StreamingContext
-
- rddToOrderedRDDFunctions(RDD<Tuple2<K, V>>, Ordering<K>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.rdd.RDD
-
- rddToOrderedRDDFunctions(RDD<Tuple2<K, V>>, Ordering<K>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.SparkContext
-
- rddToPairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.rdd.RDD
-
- rddToPairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.SparkContext
-
- rddToSequenceFileRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, WritableFactory<K>, WritableFactory<V>) - Static method in class org.apache.spark.rdd.RDD
-
- rddToSequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Static method in class org.apache.spark.SparkContext
-
- read(Kryo, Input, Class<Iterable<?>>) - Method in class org.apache.spark.serializer.JavaIterableWrapperSerializer
-
- read(String, SparkConf, Configuration) - Static method in class org.apache.spark.streaming.CheckpointReader
-
- read(WriteAheadLogFileSegment) - Method in class org.apache.spark.streaming.util.WriteAheadLogRandomReader
-
- read() - Method in class org.apache.spark.util.ByteBufferInputStream
-
- read(byte[]) - Method in class org.apache.spark.util.ByteBufferInputStream
-
- read(byte[], int, int) - Method in class org.apache.spark.util.ByteBufferInputStream
-
- readBatches() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
-
- readExternal(ObjectInput) - Method in class org.apache.spark.scheduler.CompressedMapStatus
-
- readExternal(ObjectInput) - Method in class org.apache.spark.scheduler.DirectTaskResult
-
- readExternal(ObjectInput) - Method in class org.apache.spark.scheduler.HighlyCompressedMapStatus
-
- readExternal(ObjectInput) - Method in class org.apache.spark.serializer.JavaSerializer
-
- readExternal(ObjectInput) - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
-
- readExternal(ObjectInput) - Method in class org.apache.spark.storage.BlockManagerId
-
- readExternal(ObjectInput) - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
-
- readExternal(ObjectInput) - Method in class org.apache.spark.storage.StorageLevel
-
- readExternal(ObjectInput) - Static method in class org.apache.spark.streaming.flume.EventTransformer
-
- readExternal(ObjectInput) - Method in class org.apache.spark.streaming.flume.SparkFlumeEvent
-
- readFromFile(Path, Broadcast<SerializableWritable<Configuration>>, TaskContext) - Static method in class org.apache.spark.rdd.CheckpointRDD
-
- readFromLog() - Method in class org.apache.spark.streaming.util.WriteAheadLogManager
-
Read all the existing logs from the log directory.
- readMetadata(JsonAST.JValue) - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$
-
Read metadata from the loaded JSON metadata.
- readMetaData(Path, Option<Configuration>) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
-
Try to read Parquet metadata at the given Path.
- readObject(ClassTag<T>) - Method in class org.apache.spark.serializer.DeserializationStream
-
- readObject(ClassTag<T>) - Method in class org.apache.spark.serializer.JavaDeserializationStream
-
- readObject(ClassTag<T>) - Method in class org.apache.spark.serializer.KryoDeserializationStream
-
- readPartitions() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
-
- readSchema(Seq<Footer>, SQLContext) - Static method in class org.apache.spark.sql.parquet.ParquetRelation2
-
- readSchemaFromFile(Path, Option<Configuration>, boolean, boolean) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
-
Reads in Parquet Metadata from the given path and tries to extract the schema
(Catalyst attributes) from the application-specific key-value map.
- ready(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
-
- ready(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction
-
Blocks until this action completes.
- ready(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
-
- reason() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor
-
- reason() - Method in class org.apache.spark.scheduler.CompletionEvent
-
- reason() - Method in class org.apache.spark.scheduler.SparkListenerExecutorRemoved
-
- reason() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- reason() - Method in class org.apache.spark.scheduler.TaskSetFailed
-
- recache() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
-
- Recall - Class in org.apache.spark.mllib.evaluation.binary
-
Recall.
- Recall() - Constructor for class org.apache.spark.mllib.evaluation.binary.Recall
-
- recall(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns recall for a given label (category)
- recall() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns recall
(equal to precision for a multiclass classifier,
because the sum of all false positives is equal to the sum
of all false negatives).
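The equality of overall recall and precision can be checked on a small confusion matrix. The sketch below is plain Java with hypothetical names (MicroAveragedMetrics and its methods are not Spark classes), and the matrix values are made up for illustration. Every misclassified example is a false positive for the predicted label and a false negative for the true label, so both totals equal the sum of the off-diagonal cells.

```java
class MicroAveragedMetrics {
    // confusion[i][j] = number of examples with true label i that were predicted as label j.
    static long truePositives(long[][] confusion) {
        long tp = 0;
        for (int i = 0; i < confusion.length; i++) {
            tp += confusion[i][i];
        }
        return tp;
    }

    // False positives summed over labels: column totals minus the diagonal.
    static long falsePositives(long[][] confusion) {
        long fp = 0;
        for (int label = 0; label < confusion.length; label++) {
            for (int actual = 0; actual < confusion.length; actual++) {
                if (actual != label) {
                    fp += confusion[actual][label];
                }
            }
        }
        return fp;
    }

    // False negatives summed over labels: row totals minus the diagonal.
    static long falseNegatives(long[][] confusion) {
        long fn = 0;
        for (int label = 0; label < confusion.length; label++) {
            for (int predicted = 0; predicted < confusion.length; predicted++) {
                if (predicted != label) {
                    fn += confusion[label][predicted];
                }
            }
        }
        return fn;
    }

    static double precision(long[][] confusion) {
        long tp = truePositives(confusion);
        return (double) tp / (tp + falsePositives(confusion));
    }

    static double recall(long[][] confusion) {
        long tp = truePositives(confusion);
        return (double) tp / (tp + falseNegatives(confusion));
    }

    public static void main(String[] args) {
        // Illustrative data only: 12 correct predictions, 6 misclassifications.
        long[][] confusion = {
            {5, 1, 0},
            {2, 3, 1},
            {0, 2, 4}
        };
        System.out.println("precision = " + precision(confusion));
        System.out.println("recall    = " + recall(confusion));
    }
}
```

For this matrix both values come out as 12 / 18, since the false positive and false negative totals are both 6.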
- recall() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns document-based recall averaged by the number of documents
- recall(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns recall for a given label (category)
- recallByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the (threshold, recall) curve.
- receive() - Method in class org.apache.spark.streaming.dstream.SocketReceiver
-
Create a socket connection and receive data until receiver is stopped
- receive() - Method in class org.apache.spark.streaming.receiver.ActorReceiver.Supervisor
-
- receive() - Method in class org.apache.spark.streaming.zeromq.ZeroMQReceiver
-
- receive() - Method in interface org.apache.spark.util.ActorLogReceive
-
- ReceivedBlock - Interface in org.apache.spark.streaming.receiver
-
Trait representing a received block
- ReceivedBlockHandler - Interface in org.apache.spark.streaming.receiver
-
Trait that represents a class that handles the storage of blocks received by receiver
- receivedBlockInfo() - Method in class org.apache.spark.streaming.scheduler.AddBlock
-
- receivedBlockInfo() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
- receivedBlockInfo() - Method in class org.apache.spark.streaming.scheduler.BlockAdditionEvent
-
- receivedBlockInfo() - Method in class org.apache.spark.streaming.scheduler.JobSet
-
- ReceivedBlockInfo - Class in org.apache.spark.streaming.scheduler
-
Information about blocks received by the receiver
- ReceivedBlockInfo(int, long, ReceivedBlockStoreResult) - Constructor for class org.apache.spark.streaming.scheduler.ReceivedBlockInfo
-
- ReceivedBlockStoreResult - Interface in org.apache.spark.streaming.receiver
-
Trait that represents the metadata related to storage of blocks
- ReceivedBlockTracker - Class in org.apache.spark.streaming.scheduler
-
Class that keeps track of all the received blocks and allocates them to batches
when required.
- ReceivedBlockTracker(SparkConf, Configuration, Seq<Object>, Clock, Option<String>) - Constructor for class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
-
- ReceivedBlockTrackerLogEvent - Interface in org.apache.spark.streaming.scheduler
-
Trait representing any event in the ReceivedBlockTracker that updates its state.
- receivedRecordsDistributions() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
-
- Receiver<T> - Class in org.apache.spark.streaming.receiver
-
:: DeveloperApi ::
Abstract class of a receiver that can be run on worker nodes to receive external data.
- Receiver(StorageLevel) - Constructor for class org.apache.spark.streaming.receiver.Receiver
-
- receiverActor() - Method in class org.apache.spark.streaming.scheduler.RegisterReceiver
-
- receiverExecutor() - Method in class org.apache.spark.streaming.flume.FlumePollingReceiver
-
- ReceiverInfo - Class in org.apache.spark.streaming.scheduler
-
:: DeveloperApi ::
Class having information about a receiver
- ReceiverInfo(int, String, ActorRef, boolean, String, String, String) - Constructor for class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
-
- receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
-
- receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
-
- receiverInfo(int) - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
-
- receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
-
- receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
-
- ReceiverInputDStream<T> - Class in org.apache.spark.streaming.dstream
-
Abstract class for defining any InputDStream
that has to start a receiver on worker nodes to receive external data.
- ReceiverInputDStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
- ReceiverMessage - Interface in org.apache.spark.streaming.receiver
-
Messages sent to the Receiver.
- ReceiverState() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
-
- receiverState() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
-
State of the receiver
- receiverStream(Receiver<T>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream with any arbitrary user implemented receiver.
- receiverStream(Receiver<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream with any arbitrary user implemented receiver.
- ReceiverSupervisor - Class in org.apache.spark.streaming.receiver
-
Abstract class that is responsible for supervising a Receiver in the worker.
- ReceiverSupervisor(Receiver<?>, SparkConf) - Constructor for class org.apache.spark.streaming.receiver.ReceiverSupervisor
-
- ReceiverSupervisor.ReceiverState - Class in org.apache.spark.streaming.receiver
-
- ReceiverSupervisor.ReceiverState() - Constructor for class org.apache.spark.streaming.receiver.ReceiverSupervisor.ReceiverState
-
Enumeration to identify the current state of the receiver.
- ReceiverSupervisorImpl - Class in org.apache.spark.streaming.receiver
-
Concrete implementation of ReceiverSupervisor,
which provides all the necessary functionality for handling the data received by
the receiver.
- ReceiverSupervisorImpl(Receiver<?>, SparkEnv, Configuration, Option<String>) - Constructor for class org.apache.spark.streaming.receiver.ReceiverSupervisorImpl
-
- receiverTracker() - Method in class org.apache.spark.streaming.scheduler.JobScheduler
-
- ReceiverTracker - Class in org.apache.spark.streaming.scheduler
-
This class manages the execution of the receivers of ReceiverInputDStreams.
- ReceiverTracker(StreamingContext, boolean) - Constructor for class org.apache.spark.streaming.scheduler.ReceiverTracker
-
- ReceiverTracker.ReceiverLauncher - Class in org.apache.spark.streaming.scheduler
-
This thread class runs all the receivers on the cluster.
- ReceiverTracker.ReceiverLauncher() - Constructor for class org.apache.spark.streaming.scheduler.ReceiverTracker.ReceiverLauncher
-
- ReceiverTrackerMessage - Interface in org.apache.spark.streaming.scheduler
-
Messages used by the NetworkReceiver and the ReceiverTracker to communicate
with each other.
- receiveWithLogging() - Method in class org.apache.spark.HeartbeatReceiver
-
- receiveWithLogging() - Method in class org.apache.spark.MapOutputTrackerMasterActor
-
- receiveWithLogging() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
-
- receiveWithLogging() - Method in class org.apache.spark.scheduler.local.LocalActor
-
- receiveWithLogging() - Method in class org.apache.spark.scheduler.OutputCommitCoordinator.OutputCommitCoordinatorActor
-
- receiveWithLogging() - Method in class org.apache.spark.storage.BlockManagerMasterActor
-
- receiveWithLogging() - Method in class org.apache.spark.storage.BlockManagerSlaveActor
-
- receiveWithLogging() - Method in interface org.apache.spark.util.ActorLogReceive
-
- recentExceptions() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- recommendProducts(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Recommends products to a user.
- recommendUsers(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Recommends users to a product.
- recomputeLocality() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- RECORD_LENGTH_PROPERTY() - Static method in class org.apache.spark.input.FixedLengthBinaryInputFormat
-
Property name to set in Hadoop JobConfs for record length
- recordProcessorFactory() - Method in class org.apache.spark.streaming.kinesis.KinesisReceiver
-
- RECORDS_BETWEEN_BYTES_READ_METRIC_UPDATES() - Static method in class org.apache.spark.rdd.HadoopRDD
-
Update the input bytes read metric each time this number of records has been read
- RECORDS_BETWEEN_BYTES_WRITTEN_METRIC_UPDATES() - Static method in class org.apache.spark.rdd.PairRDDFunctions
-
- RecurringTimer - Class in org.apache.spark.streaming.util
-
- RecurringTimer(Clock, long, Function1<Object, BoxedUnit>, String) - Constructor for class org.apache.spark.streaming.util.RecurringTimer
-
- RedirectThread - Class in org.apache.spark.util
-
A utility class to redirect the child process's stdout or stderr.
- RedirectThread(InputStream, OutputStream, String, boolean) - Constructor for class org.apache.spark.util.RedirectThread
-
- reduce(Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Reduces the elements of this RDD using the specified commutative and associative binary
operator.
- reduce(Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD
-
Reduces the elements of this RDD using the specified commutative and
associative binary operator.
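The requirement that the operator be associative can be illustrated without Spark. In the plain-Java sketch below (PartitionedReduceSketch is a hypothetical name, not a Spark class), each partition is reduced on its own and the partial results are then merged, which is one way a distributed reduce may group elements. Addition gives the same answer for any split into partitions, while subtraction does not. The sketch keeps partitions in order, so it demonstrates only the grouping issue; commutativity matters as well when partial results can arrive in any order.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

class PartitionedReduceSketch {
    // Reduces each partition separately, then merges the partial results in partition order.
    static int reduce(List<List<Integer>> partitions, BinaryOperator<Integer> op) {
        Integer total = null;
        for (List<Integer> partition : partitions) {
            Integer partial = null;
            for (Integer x : partition) {
                partial = (partial == null) ? x : op.apply(partial, x);
            }
            if (partial != null) {
                total = (total == null) ? partial : op.apply(total, partial);
            }
        }
        if (total == null) {
            throw new IllegalArgumentException("empty collection");
        }
        return total;
    }

    public static void main(String[] args) {
        // The same four elements, split into partitions in two different ways.
        List<List<Integer>> layoutA = Arrays.asList(Arrays.asList(1, 2), Arrays.asList(3, 4));
        List<List<Integer>> layoutB = Arrays.asList(Arrays.asList(1), Arrays.asList(2, 3, 4));

        // Addition is associative: both layouts agree.
        System.out.println(reduce(layoutA, Integer::sum)); // prints 10
        System.out.println(reduce(layoutB, Integer::sum)); // prints 10

        // Subtraction is not associative: the result depends on the layout.
        System.out.println(reduce(layoutA, (a, b) -> a - b)); // prints 0
        System.out.println(reduce(layoutB, (a, b) -> a - b)); // prints 6
    }
}
```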
- reduce(Function2<T, T, T>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by reducing each RDD
of this DStream.
- reduce(Function2<T, T, T>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by reducing each RDD
of this DStream.
- reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative reduce function.
- reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative reduce function.
- reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative reduce function.
- reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative reduce function.
- reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative reduce function.
- reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative reduce function.
- reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey to each RDD.
- reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey to each RDD.
- reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey to each RDD.
- reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey to each RDD.
- reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey to each RDD.
- reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey to each RDD.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Create a new DStream by applying reduceByKey over a sliding window on this DStream.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by reducing over a sliding window using incremental computation.
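The two Function2 arguments in this overload suggest a reduce function paired with an inverse reduce function. The idea of incremental window computation can be sketched in plain Java as follows. IncrementalWindowSketch and windowed are hypothetical names, not Spark APIs, and the sketch assumes one value per batch, a slide of exactly one batch, and no keys. Each slide folds in the batch entering the window and applies the inverse function to the batch leaving it, instead of re-reducing every batch in the window.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

class IncrementalWindowSketch {
    // Returns one reduced value per full window position over the per-batch values.
    // reduce combines the running value with the entering batch;
    // invReduce removes the contribution of the batch that left the window.
    static List<Integer> windowed(List<Integer> batches, int windowLength,
                                  BinaryOperator<Integer> reduce,
                                  BinaryOperator<Integer> invReduce) {
        List<Integer> out = new ArrayList<>();
        Integer current = null;
        for (int i = 0; i < batches.size(); i++) {
            Integer entering = batches.get(i);
            current = (current == null) ? entering : reduce.apply(current, entering);
            if (i >= windowLength) {
                // The batch at i - windowLength has just slid out of the window.
                current = invReduce.apply(current, batches.get(i - windowLength));
            }
            if (i >= windowLength - 1) {
                out.add(current);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> batches = Arrays.asList(1, 2, 3, 4, 5);
        // Sum over a window of 3 batches, with subtraction as the inverse.
        System.out.println(windowed(batches, 3, Integer::sum, (a, b) -> a - b)); // prints [6, 9, 12]
    }
}
```

This approach only works when the reduce function has a matching inverse, as addition has subtraction; a function such as max has none, and the window would then have to be re-reduced in full.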
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying incremental reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying incremental reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey over a sliding window on this DStream.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying incremental reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying incremental reduceByKey over a sliding window.
- reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative reduce function, but return the results
immediately to the master as a Map.
- reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative reduce function, but return the results
immediately to the master as a Map.
- reduceByKeyToDriver(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Alias for reduceByKeyLocally
- reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Deprecated.
As this API is not Java compatible.
- reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by reducing all
elements in a sliding window over this DStream.
- reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by reducing all
elements in a sliding window over this DStream.
- reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by reducing all
elements in a sliding window over this DStream.
- reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by reducing all
elements in a sliding window over this DStream.
- reducedStream() - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
-
- ReducedWindowedDStream<K,V> - Class in org.apache.spark.streaming.dstream
-
- ReducedWindowedDStream(DStream<Tuple2<K, V>>, Function2<V, V, V>, Function2<V, V, V>, Option<Function1<Tuple2<K, V>, Object>>, Duration, Duration, Partitioner, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.dstream.ReducedWindowedDStream
-
- reduceId() - Method in class org.apache.spark.FetchFailed
-
- reduceId() - Method in class org.apache.spark.storage.ShuffleBlockId
-
- reduceId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
-
- reduceId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
-
- refreshTable(String) - Method in class org.apache.spark.sql.hive.HiveContext
-
Invalidate and refresh all the cached metadata of the given table.
- refreshTable(String, String) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
-
- RefreshTable - Class in org.apache.spark.sql.sources
-
- RefreshTable(String, String) - Constructor for class org.apache.spark.sql.sources.RefreshTable
-
- REGEX() - Static method in class org.apache.spark.streaming.Checkpoint
-
- REGEXP() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- register(Accumulable<?, ?>, boolean) - Static method in class org.apache.spark.Accumulators
-
- register(String, Function0<RT>, TypeTags.TypeTag<RT>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 0 arguments as user-defined function (UDF).
- register(String, Function1<A1, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 1 argument as user-defined function (UDF).
- register(String, Function2<A1, A2, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 2 arguments as user-defined function (UDF).
- register(String, Function3<A1, A2, A3, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 3 arguments as user-defined function (UDF).
- register(String, Function4<A1, A2, A3, A4, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 4 arguments as user-defined function (UDF).
- register(String, Function5<A1, A2, A3, A4, A5, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 5 arguments as user-defined function (UDF).
- register(String, Function6<A1, A2, A3, A4, A5, A6, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 6 arguments as user-defined function (UDF).
- register(String, Function7<A1, A2, A3, A4, A5, A6, A7, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 7 arguments as user-defined function (UDF).
- register(String, Function8<A1, A2, A3, A4, A5, A6, A7, A8, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 8 arguments as user-defined function (UDF).
- register(String, Function9<A1, A2, A3, A4, A5, A6, A7, A8, A9, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 9 arguments as user-defined function (UDF).
- register(String, Function10<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 10 arguments as user-defined function (UDF).
- register(String, Function11<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 11 arguments as user-defined function (UDF).
- register(String, Function12<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 12 arguments as user-defined function (UDF).
- register(String, Function13<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 13 arguments as user-defined function (UDF).
- register(String, Function14<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 14 arguments as user-defined function (UDF).
- register(String, Function15<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 15 arguments as user-defined function (UDF).
- register(String, Function16<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 16 arguments as user-defined function (UDF).
- register(String, Function17<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 17 arguments as user-defined function (UDF).
- register(String, Function18<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 18 arguments as user-defined function (UDF).
- register(String, Function19<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 19 arguments as user-defined function (UDF).
- register(String, Function20<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>, TypeTags.TypeTag<A20>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 20 arguments as user-defined function (UDF).
- register(String, Function21<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, A21, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>, TypeTags.TypeTag<A20>, TypeTags.TypeTag<A21>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 21 arguments as user-defined function (UDF).
- register(String, Function22<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, A21, A22, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>, TypeTags.TypeTag<A20>, TypeTags.TypeTag<A21>, TypeTags.TypeTag<A22>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 22 arguments as user-defined function (UDF).
- register(String, UDF1<?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 1 argument.
- register(String, UDF2<?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 2 arguments.
- register(String, UDF3<?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 3 arguments.
- register(String, UDF4<?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 4 arguments.
- register(String, UDF5<?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 5 arguments.
- register(String, UDF6<?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 6 arguments.
- register(String, UDF7<?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 7 arguments.
- register(String, UDF8<?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 8 arguments.
- register(String, UDF9<?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 9 arguments.
- register(String, UDF10<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 10 arguments.
- register(String, UDF11<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 11 arguments.
- register(String, UDF12<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 12 arguments.
- register(String, UDF13<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 13 arguments.
- register(String, UDF14<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 14 arguments.
- register(String, UDF15<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 15 arguments.
- register(String, UDF16<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 16 arguments.
- register(String, UDF17<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 17 arguments.
- register(String, UDF18<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 18 arguments.
- register(String, UDF19<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 19 arguments.
- register(String, UDF20<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 20 arguments.
- register(String, UDF21<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 21 arguments.
- register(String, UDF22<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 22 arguments.
- register() - Method in class org.apache.spark.streaming.dstream.DStream
-
Register this stream as an output stream.
- register(Logger) - Static method in class org.apache.spark.util.SignalLogger
-
Register a signal handler to log signals on UNIX-like systems.
- registerBlockManager(BlockManagerId, long, ActorRef) - Method in class org.apache.spark.storage.BlockManagerMaster
-
Register the BlockManager's id with the driver.
- registerBroadcastForCleanup(Broadcast<T>) - Method in class org.apache.spark.ContextCleaner
-
Register a Broadcast for cleanup when it is garbage collected.
- registerClasses(Kryo) - Method in class org.apache.spark.graphx.GraphKryoRegistrator
-
- registerClasses(Kryo) - Method in interface org.apache.spark.serializer.KryoRegistrator
-
- registerDataFrameAsTable(DataFrame, String) - Method in class org.apache.spark.sql.SQLContext
-
Registers the given DataFrame as a temporary table in the catalog.
- registered(SchedulerDriver, Protos.FrameworkID, Protos.MasterInfo) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- registered(SchedulerDriver, Protos.FrameworkID, Protos.MasterInfo) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- registeredLock() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- registeredLock() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- registerKryoClasses(SparkConf) - Static method in class org.apache.spark.graphx.GraphXUtils
-
Registers classes that GraphX uses with Kryo.
- registerKryoClasses(Class<?>[]) - Method in class org.apache.spark.SparkConf
-
Use Kryo serialization and register the given set of classes with Kryo.
- registerMapOutput(int, int, MapStatus) - Method in class org.apache.spark.MapOutputTrackerMaster
-
- registerMapOutputs(int, MapStatus[], boolean) - Method in class org.apache.spark.MapOutputTrackerMaster
-
Register information about multiple map outputs for the given shuffle.
- registerRDDForCleanup(RDD<?>) - Method in class org.apache.spark.ContextCleaner
-
Register an RDD for cleanup when it is garbage collected.
- RegisterReceiver - Class in org.apache.spark.streaming.scheduler
-
- RegisterReceiver(int, String, String, ActorRef) - Constructor for class org.apache.spark.streaming.scheduler.RegisterReceiver
-
- registerShuffle(int, int) - Method in class org.apache.spark.MapOutputTrackerMaster
-
- registerShuffleForCleanup(ShuffleDependency<?, ?, ?>) - Method in class org.apache.spark.ContextCleaner
-
Register a ShuffleDependency for cleanup when it is garbage collected.
- registerShutdownDeleteDir(File) - Static method in class org.apache.spark.util.Utils
-
- registerShutdownDeleteDir(TachyonFile) - Static method in class org.apache.spark.util.Utils
-
- registerSource(Source) - Method in class org.apache.spark.metrics.MetricsSystem
-
- registerTable(Seq<String>, LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
-
UNIMPLEMENTED: It needs to be decided how we will persist in-memory tables to the metastore.
- registerTempTable(String) - Method in class org.apache.spark.sql.DataFrame
-
Registers this RDD as a temporary table using the given name.
- registrationDone() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- registrationLock() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- registry() - Method in class org.apache.spark.metrics.sink.ConsoleSink
-
- registry() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- registry() - Method in class org.apache.spark.metrics.sink.GraphiteSink
-
- registry() - Method in class org.apache.spark.metrics.sink.JmxSink
-
- registry() - Method in class org.apache.spark.metrics.sink.MetricsServlet
-
- regParam() - Method in interface org.apache.spark.ml.param.HasRegParam
-
param for regularization parameter
- Regression() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
-
- RegressionMetrics - Class in org.apache.spark.mllib.evaluation
-
:: Experimental ::
Evaluator for regression.
- RegressionMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.RegressionMetrics
-
- RegressionModel<FeaturesType,M extends RegressionModel<FeaturesType,M>> - Class in org.apache.spark.ml.regression
-
:: AlphaComponent ::
- RegressionModel() - Constructor for class org.apache.spark.ml.regression.RegressionModel
-
- RegressionModel - Interface in org.apache.spark.mllib.regression
-
- Regressor<FeaturesType,Learner extends Regressor<FeaturesType,Learner,M>,M extends RegressionModel<FeaturesType,M>> - Class in org.apache.spark.ml.regression
-
:: AlphaComponent ::
- Regressor() - Constructor for class org.apache.spark.ml.regression.Regressor
-
- RegressorParams - Interface in org.apache.spark.ml.regression
-
:: DeveloperApi ::
Params for regression.
- reindex() - Method in class org.apache.spark.graphx.impl.VertexPartitionBaseOps
-
Construct a new VertexPartition whose index contains only the vertices in the mask.
- reindex() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- reindex() - Method in class org.apache.spark.graphx.VertexRDD
-
Construct a new VertexRDD that is indexed by only the visible vertices.
- relation() - Method in class org.apache.spark.sql.columnar.InMemoryColumnarTableScan
-
- relation() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
-
- relation() - Method in class org.apache.spark.sql.parquet.InsertIntoParquetTable
-
- relation() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
-
- relation() - Method in class org.apache.spark.sql.sources.LogicalRelation
-
- relation() - Method in class org.apache.spark.sql.sources.ResolvedDataSource
-
- RelationProvider - Interface in org.apache.spark.sql.sources
-
:: DeveloperApi ::
Implemented by objects that produce relations for a specific kind of data source.
- relativeDirection(long) - Method in class org.apache.spark.graphx.Edge
-
Return the relative direction of the edge to the corresponding
vertex.
- releasePythonWorker(String, Map<String, String>, Socket) - Method in class org.apache.spark.SparkEnv
-
- releaseUnrollMemoryForThisThread(long) - Method in class org.apache.spark.storage.MemoryStore
-
Release memory used by this thread for unrolling blocks.
- ReliableKafkaReceiver<K,V,U extends kafka.serializer.Decoder<?>,T extends kafka.serializer.Decoder<?>> - Class in org.apache.spark.streaming.kafka
-
ReliableKafkaReceiver offers the ability to reliably store data into BlockManager without loss.
- ReliableKafkaReceiver(Map<String, String>, Map<String, Object>, StorageLevel, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.kafka.ReliableKafkaReceiver
-
- remainingMem() - Method in class org.apache.spark.storage.BlockManagerInfo
-
- remember(Duration) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Sets each DStream in this context to remember the RDDs it generated in the last given duration.
- remember(Duration) - Method in class org.apache.spark.streaming.dstream.DStream
-
- remember(Duration) - Method in class org.apache.spark.streaming.DStreamGraph
-
- remember(Duration) - Method in class org.apache.spark.streaming.StreamingContext
-
Set each DStream in this context to remember the RDDs it generated in the last given duration.
- rememberDuration() - Method in class org.apache.spark.streaming.dstream.DStream
-
- rememberDuration() - Method in class org.apache.spark.streaming.DStreamGraph
-
- remove(String) - Method in class org.apache.spark.SparkConf
-
Remove a parameter from the configuration
- remove(BlockId) - Method in class org.apache.spark.storage.BlockStore
-
Remove a block, if it exists.
- remove(BlockId) - Method in class org.apache.spark.storage.DiskStore
-
- remove(BlockId) - Method in class org.apache.spark.storage.MemoryStore
-
- remove(BlockId) - Method in class org.apache.spark.storage.TachyonStore
-
- removeBlock(BlockId, boolean) - Method in class org.apache.spark.storage.BlockManager
-
Remove a block from both memory and disk.
- removeBlock(BlockId) - Method in class org.apache.spark.storage.BlockManagerInfo
-
- removeBlock(BlockId) - Method in class org.apache.spark.storage.BlockManagerMaster
-
Remove a block from the slaves that have it.
- removeBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus
-
Remove the given block from this storage status.
- removeBlocks() - Method in class org.apache.spark.rdd.BlockRDD
-
Remove the data blocks that this BlockRDD is made from.
- removeBroadcast(long, boolean) - Method in class org.apache.spark.storage.BlockManager
-
Remove all blocks belonging to the given broadcast.
- removeBroadcast(long, boolean, boolean) - Method in class org.apache.spark.storage.BlockManagerMaster
-
Remove all blocks belonging to the given broadcast.
- removeExecutor(String, String) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverActor
-
- removeExecutor(String, String) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- removeExecutor(String) - Method in class org.apache.spark.storage.BlockManagerMaster
-
Remove a dead executor from the driver actor.
- removeFile(TachyonFile) - Method in class org.apache.spark.storage.TachyonBlockManager
-
- removeFromDriver() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast
-
- removeOutputLoc(int, BlockManagerId) - Method in class org.apache.spark.scheduler.Stage
-
- removeOutputsOnExecutor(String) - Method in class org.apache.spark.scheduler.Stage
-
Removes all shuffle outputs associated with this executor.
- removeRdd(int) - Method in class org.apache.spark.storage.BlockManager
-
Remove all blocks belonging to the given RDD.
- removeRdd(int, boolean) - Method in class org.apache.spark.storage.BlockManagerMaster
-
Remove all blocks belonging to the given RDD.
- removeRunningTask(long) - Method in class org.apache.spark.scheduler.TaskSetManager
-
If the given task ID is in the set of running tasks, removes it.
- removeSchedulable(Schedulable) - Method in class org.apache.spark.scheduler.Pool
-
- removeSchedulable(Schedulable) - Method in interface org.apache.spark.scheduler.Schedulable
-
- removeSchedulable(Schedulable) - Method in class org.apache.spark.scheduler.TaskSetManager
-
- removeShuffle(int, boolean) - Method in class org.apache.spark.storage.BlockManagerMaster
-
Remove all blocks belonging to the given shuffle.
- removeSource(Source) - Method in class org.apache.spark.metrics.MetricsSystem
-
- render(HttpServletRequest) - Method in class org.apache.spark.streaming.ui.StreamingPage
-
Render the page
- render(HttpServletRequest) - Method in class org.apache.spark.ui.env.EnvironmentPage
-
- render(HttpServletRequest) - Method in class org.apache.spark.ui.exec.ExecutorsPage
-
- render(HttpServletRequest) - Method in class org.apache.spark.ui.exec.ExecutorThreadDumpPage
-
- render(HttpServletRequest) - Method in class org.apache.spark.ui.jobs.AllJobsPage
-
- render(HttpServletRequest) - Method in class org.apache.spark.ui.jobs.AllStagesPage
-
- render(HttpServletRequest) - Method in class org.apache.spark.ui.jobs.JobPage
-
- render(HttpServletRequest) - Method in class org.apache.spark.ui.jobs.PoolPage
-
- render(HttpServletRequest) - Method in class org.apache.spark.ui.jobs.StagePage
-
- render(HttpServletRequest) - Method in class org.apache.spark.ui.storage.RDDPage
-
- render(HttpServletRequest) - Method in class org.apache.spark.ui.storage.StoragePage
-
- render(HttpServletRequest) - Method in class org.apache.spark.ui.WebUIPage
-
- renderJson(HttpServletRequest) - Method in class org.apache.spark.ui.WebUIPage
-
- repartition(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD that has exactly numPartitions partitions.
- repartition(int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD that has exactly numPartitions partitions.
- repartition(int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD that has exactly numPartitions partitions.
- repartition(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD that has exactly numPartitions partitions.
- repartition(int) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame that has exactly numPartitions partitions.
- repartition(int) - Method in interface org.apache.spark.sql.RDDApi
-
- repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Return a new DStream with an increased or decreased level of parallelism.
- repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream with an increased or decreased level of parallelism.
- repartition(int) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream with an increased or decreased level of parallelism.
- repartitionAndSortWithinPartitions(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Repartition the RDD according to the given partitioner and, within each resulting partition,
sort records by their keys.
- repartitionAndSortWithinPartitions(Partitioner, Comparator<K>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Repartition the RDD according to the given partitioner and, within each resulting partition,
sort records by their keys.
- repartitionAndSortWithinPartitions(Partitioner) - Method in class org.apache.spark.rdd.OrderedRDDFunctions
-
Repartition the RDD according to the given partitioner and, within each resulting partition,
sort records by their keys.
- replay(InputStream, String) - Method in class org.apache.spark.scheduler.ReplayListenerBus
-
Replay each event in the order maintained in the given stream.
- ReplayListenerBus - Class in org.apache.spark.scheduler
-
A SparkListenerBus that can be used to replay events from serialized event data.
- ReplayListenerBus() - Constructor for class org.apache.spark.scheduler.ReplayListenerBus
-
- replicatedVertexView() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- ReplicatedVertexView<VD,ED> - Class in org.apache.spark.graphx.impl
-
Manages shipping vertex attributes to the edge partitions of an EdgeRDD.
- ReplicatedVertexView(EdgeRDDImpl<ED, VD>, boolean, boolean, ClassTag<VD>, ClassTag<ED>) - Constructor for class org.apache.spark.graphx.impl.ReplicatedVertexView
-
- replication() - Method in class org.apache.spark.storage.StorageLevel
-
- report() - Method in class org.apache.spark.metrics.MetricsSystem
-
- report() - Method in class org.apache.spark.metrics.sink.ConsoleSink
-
- report() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- report() - Method in class org.apache.spark.metrics.sink.GraphiteSink
-
- report() - Method in class org.apache.spark.metrics.sink.JmxSink
-
- report() - Method in class org.apache.spark.metrics.sink.MetricsServlet
-
- report() - Method in interface org.apache.spark.metrics.sink.Sink
-
- reporter() - Method in class org.apache.spark.metrics.sink.ConsoleSink
-
- reporter() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- reporter() - Method in class org.apache.spark.metrics.sink.GraphiteSink
-
- reporter() - Method in class org.apache.spark.metrics.sink.JmxSink
-
- reportError(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Report exceptions in receiving data.
- reportError(String, Throwable) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
-
Report errors.
- reportError(String, Throwable) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisorImpl
-
Report error to the receiver tracker
- reportError(String, Throwable) - Method in class org.apache.spark.streaming.scheduler.JobScheduler
-
- ReportError - Class in org.apache.spark.streaming.scheduler
-
- ReportError(int, String, String) - Constructor for class org.apache.spark.streaming.scheduler.ReportError
-
- requestedAttributes() - Method in class org.apache.spark.sql.hive.execution.HiveTableScan
-
- requestedPartitionOrdinals() - Method in class org.apache.spark.sql.parquet.ParquetTableScan
-
- requestedTotal() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors
-
- requestExecutors(int) - Method in interface org.apache.spark.ExecutorAllocationClient
-
Request an additional number of executors from the cluster manager.
- requestExecutors(int) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
Request an additional number of executors from the cluster manager.
- requestExecutors(int) - Method in class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Request an additional number of executors from the cluster manager.
- requestTotalExecutors(int) - Method in interface org.apache.spark.ExecutorAllocationClient
-
Express a preference to the cluster manager for a given total number of executors.
- requestTotalExecutors(int) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
Express a preference to the cluster manager for a given total number of executors.
- requestTotalExecutors(int) - Method in class org.apache.spark.SparkContext
-
Express a preference to the cluster manager for a given total number of executors.
- reregister() - Method in class org.apache.spark.storage.BlockManager
-
Re-register with the master and report all blocks to it.
- reregisterBlockManager() - Method in class org.apache.spark.HeartbeatResponse
-
- reregistered(SchedulerDriver, Protos.MasterInfo) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- reregistered(SchedulerDriver, Protos.MasterInfo) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- res() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
-
- reservedSizeBytes() - Static method in class org.apache.spark.util.AkkaUtils
-
Space reserved for extra data in an Akka message besides serialized task or task result.
- reserveUnrollMemoryForThisThread(long) - Method in class org.apache.spark.storage.MemoryStore
-
Reserve additional memory for unrolling blocks used by this thread.
- reservoirSampleAndCount(Iterator<T>, int, long, ClassTag<T>) - Static method in class org.apache.spark.util.random.SamplingUtils
-
Reservoir sampling implementation that also returns the input size.
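For orientation, the idea behind reservoir sampling with a running count can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in SamplingUtils: the class name `ReservoirSketch`, the `Result` holder, and the `sampleAndCount` signature are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

/** Illustrative sketch only; names here are hypothetical, not Spark's API. */
final class ReservoirSketch {

    /** Hypothetical holder for the sample and the number of items consumed. */
    static final class Result<T> {
        final List<T> sample;
        final long count;

        Result(List<T> sample, long count) {
            this.sample = sample;
            this.count = count;
        }
    }

    /** Draws up to k items uniformly at random in one pass and counts the input. */
    static <T> Result<T> sampleAndCount(Iterator<T> input, int k, long seed) {
        List<T> reservoir = new ArrayList<>(k);
        Random rng = new Random(seed);
        long seen = 0;
        while (input.hasNext()) {
            T item = input.next();
            seen++;
            if (reservoir.size() < k) {
                // Fill the reservoir with the first k items.
                reservoir.add(item);
            } else {
                // Keep the new item with probability k / seen by picking a
                // uniform slot in [0, seen) and replacing only if it is < k.
                long slot = (long) (rng.nextDouble() * seen);
                if (slot < k) {
                    reservoir.set((int) slot, item);
                }
            }
        }
        return new Result<>(reservoir, seen);
    }
}
```

When the input has fewer than k items, the sketch returns all of them, and the count tells the caller how large the input actually was.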
- reset() - Method in class org.apache.spark.ml.recommendation.ALS.NormalEquation
-
Resets everything to zero, which should be called after each solve.
- resetIterator() - Method in class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
-
- resolveClass(ObjectStreamClass) - Method in class org.apache.spark.streaming.ObjectInputStreamWithLoader
-
- resolved() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
-
- ResolvedDataSource - Class in org.apache.spark.sql.sources
-
- ResolvedDataSource(Class<?>, BaseRelation) - Constructor for class org.apache.spark.sql.sources.ResolvedDataSource
-
- resolvePartitions(Seq<ParquetRelation2.PartitionValues>) - Static method in class org.apache.spark.sql.parquet.ParquetRelation2
-
Resolves possible type conflicts between partitions by up-casting "lower" types.
- resolveTable(String, String) - Static method in class org.apache.spark.sql.jdbc.JDBCRDD
-
Takes a (schema, table) specification and returns the table's Catalyst
schema.
- ResolveUdtfsAlias - Class in org.apache.spark.sql.hive
-
Resolve Udtfs Alias.
- ResolveUdtfsAlias() - Constructor for class org.apache.spark.sql.hive.ResolveUdtfsAlias
-
- resolveURI(String, boolean) - Static method in class org.apache.spark.util.Utils
-
Return a well-formed URI for the file described by a user input string.
- resolveURIs(String, boolean) - Static method in class org.apache.spark.util.Utils
-
Resolve a comma-separated list of paths.
- resourceOffer(String, String, Enumeration.Value) - Method in class org.apache.spark.scheduler.TaskSetManager
-
Respond to an offer of a single executor from the scheduler by finding a task
- resourceOffers(SchedulerDriver, List<Protos.Offer>) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
Method called by Mesos to offer resources on slaves.
- resourceOffers(SchedulerDriver, List<Protos.Offer>) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
Method called by Mesos to offer resources on slaves.
- resourceOffers(Seq<WorkerOffer>) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
Called by cluster manager to offer resources on slaves.
- responder() - Method in class org.apache.spark.streaming.flume.FlumeReceiver
-
- responder() - Method in class org.apache.spark.ui.JettyUtils.ServletParams
-
- restart(Time) - Method in class org.apache.spark.streaming.DStreamGraph
-
- restart(String) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Restart the receiver.
- restart(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Restart the receiver.
- restart(String, Throwable, int) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Restart the receiver.
- restartReceiver(String, Option<Throwable>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
-
Restart receiver with delay
- restartReceiver(String, Option<Throwable>, int) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
-
Restart receiver with delay
- restore() - Method in class org.apache.spark.streaming.dstream.DStreamCheckpointData
-
Restore the checkpoint data.
- restore() - Method in class org.apache.spark.streaming.dstream.FileInputDStream.FileInputDStreamCheckpointData
-
- restore() - Method in class org.apache.spark.streaming.kafka.DirectKafkaInputDStream.DirectKafkaInputDStreamCheckpointData
-
- restoreCheckpointData() - Method in class org.apache.spark.streaming.dstream.DStream
-
Restore the RDDs in generatedRDDs from the checkpointData.
- restoreCheckpointData() - Method in class org.apache.spark.streaming.DStreamGraph
-
- RESUBMIT_TIMEOUT() - Static method in class org.apache.spark.scheduler.DAGScheduler
-
- resubmitFailedStages() - Method in class org.apache.spark.scheduler.DAGScheduler
-
Resubmit any failed stages.
- ResubmitFailedStages - Class in org.apache.spark.scheduler
-
- ResubmitFailedStages() - Constructor for class org.apache.spark.scheduler.ResubmitFailedStages
-
- Resubmitted - Class in org.apache.spark
-
:: DeveloperApi ::
A ShuffleMapTask that completed successfully earlier, but we lost the executor before the stage completed.
- Resubmitted() - Constructor for class org.apache.spark.Resubmitted
-
- result(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
-
- result(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction
-
Awaits and returns the result (of type T) of this action.
- result() - Method in class org.apache.spark.scheduler.CompletionEvent
-
- result(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
-
- result() - Method in class org.apache.spark.streaming.scheduler.Job
-
- RESULT_SERIALIZATION_TIME() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
-
- RESULT_SERIALIZATION_TIME() - Static method in class org.apache.spark.ui.ToolTips
-
- resultObject() - Method in class org.apache.spark.partial.ApproximateActionListener
-
- resultOfJob() - Method in class org.apache.spark.scheduler.Stage
-
For stages that are final (consisting of only ResultTasks), link to the ActiveJob.
- resultSetToObjectArray(ResultSet) - Static method in class org.apache.spark.rdd.JdbcRDD
-
- ResultTask<T,U> - Class in org.apache.spark.scheduler
-
A task that sends back the output to the driver application.
- ResultTask(int, Broadcast<byte[]>, Partition, Seq<TaskLocation>, int) - Constructor for class org.apache.spark.scheduler.ResultTask
-
- ResultWithDroppedBlocks - Class in org.apache.spark.storage
-
- ResultWithDroppedBlocks(boolean, Seq<Tuple2<BlockId, BlockStatus>>) - Constructor for class org.apache.spark.storage.ResultWithDroppedBlocks
-
- retag(Class<T>) - Method in class org.apache.spark.rdd.RDD
-
Private API for changing an RDD's ClassTag.
- retag(ClassTag<T>) - Method in class org.apache.spark.rdd.RDD
-
Private API for changing an RDD's ClassTag.
- RETAINED_FILES_PROPERTY() - Static method in class org.apache.spark.util.logging.RollingFileAppender
-
- retainedCompletedBatches() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
-
- retainedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- retainedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- retryRandom(Function0<T>, int, int) - Static method in class org.apache.spark.streaming.kinesis.KinesisRecordProcessor
-
Retry the given number of times with a random backoff time (millis) less than the given maxBackOffMillis
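The retry pattern described in this entry can be sketched roughly as below. This is an illustration, not the KinesisRecordProcessor code: the class name `RetrySketch`, the choice to retry on any `RuntimeException`, and the reading of the count as a total number of attempts are all assumptions.

```java
import java.util.Random;
import java.util.function.Supplier;

/** Illustrative sketch only; not the Spark implementation. */
final class RetrySketch {

    /**
     * Runs the action up to maxAttempts times (assumption: the count includes
     * the first try), sleeping a random time below maxBackOffMillis between tries.
     */
    static <T> T retryRandom(Supplier<T> action, int maxAttempts, int maxBackOffMillis) {
        if (maxAttempts < 1 || maxBackOffMillis < 1) {
            throw new IllegalArgumentException("maxAttempts and maxBackOffMillis must be >= 1");
        }
        Random rng = new Random();
        for (int attempt = 1; ; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                if (attempt >= maxAttempts) {
                    throw e; // Out of attempts: surface the last failure.
                }
                try {
                    // nextInt(bound) is uniform in [0, bound), so the sleep is
                    // always strictly less than maxBackOffMillis.
                    Thread.sleep(rng.nextInt(maxBackOffMillis));
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }
}
```

Randomizing the backoff spreads out retries from many callers so they do not all hit the remote service again at the same instant.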
- retryWaitMs(SparkConf) - Static method in class org.apache.spark.util.AkkaUtils
-
Returns the configured number of milliseconds to wait on each retry
- returnInspector() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
-
- ReturnStatementFinder - Class in org.apache.spark.util
-
- ReturnStatementFinder() - Constructor for class org.apache.spark.util.ReturnStatementFinder
-
- reverse() - Method in class org.apache.spark.graphx.EdgeDirection
-
Reverse the direction of an edge.
- reverse() - Method in class org.apache.spark.graphx.EdgeRDD
-
Reverse all the edges in this RDD.
- reverse() - Method in class org.apache.spark.graphx.Graph
-
Reverses all edges in the graph.
- reverse() - Method in class org.apache.spark.graphx.impl.EdgePartition
-
Reverse all the edges in this partition.
- reverse() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- reverse() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- reverse() - Method in class org.apache.spark.graphx.impl.ReplicatedVertexView
-
Return a new ReplicatedVertexView where edges are reversed and shipping levels are swapped to match.
- reverse() - Method in class org.apache.spark.graphx.impl.RoutingTablePartition
-
Returns a new RoutingTablePartition reflecting a reversal of all edge directions.
- reverseRoutingTables() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- reverseRoutingTables() - Method in class org.apache.spark.graphx.VertexRDD
-
Returns a new VertexRDD reflecting a reversal of all edge directions in the corresponding EdgeRDD.
- revertPartialWritesAndClose() - Method in class org.apache.spark.storage.BlockObjectWriter
-
Reverts writes that haven't been flushed yet.
- revertPartialWritesAndClose() - Method in class org.apache.spark.storage.DiskBlockObjectWriter
-
- reviveOffers() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- reviveOffers() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- reviveOffers() - Method in class org.apache.spark.scheduler.local.LocalActor
-
- reviveOffers() - Method in class org.apache.spark.scheduler.local.LocalBackend
-
- ReviveOffers - Class in org.apache.spark.scheduler.local
-
- ReviveOffers() - Constructor for class org.apache.spark.scheduler.local.ReviveOffers
-
- reviveOffers() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
- RidgeRegressionModel - Class in org.apache.spark.mllib.regression
-
Regression model trained using RidgeRegression.
- RidgeRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.RidgeRegressionModel
-
- RidgeRegressionWithSGD - Class in org.apache.spark.mllib.regression
-
Train a regression model with L2-regularization using Stochastic Gradient Descent.
- RidgeRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
-
Construct a RidgeRegression object with default parameters: {stepSize: 1.0, numIterations: 100,
regParam: 0.01, miniBatchFraction: 1.0}.
- right() - Method in class org.apache.spark.sql.sources.And
-
- right() - Method in class org.apache.spark.sql.sources.Or
-
- rightChildIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Return the index of the right child of this node.
- rightImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- rightNode() - Method in class org.apache.spark.mllib.tree.model.Node
-
- rightNodeId() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
-
- rightOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a right outer join of this and other.
- rightOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a right outer join of this and other.
- rightOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a right outer join of this and other.
- rightOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a right outer join of this and other.
- rightOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a right outer join of this and other.
- rightOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a right outer join of this and other.
- rightOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
- rightOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
- rightOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
- rightOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
- rightOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
- rightOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
- rightPredict() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- rlike(String) - Method in class org.apache.spark.sql.Column
-
SQL RLIKE expression (LIKE with Regex).
- RLIKE() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- RMATa() - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
- RMATb() - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
- RMATc() - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
- RMATd() - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
- rmatGraph(SparkContext, int, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
A random graph generator using the R-MAT model, proposed in
"R-MAT: A Recursive Model for Graph Mining" by Chakrabarti et al.
- rnd() - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- roc() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the receiver operating characteristic (ROC) curve,
which is an RDD of (false positive rate, true positive rate)
with (0.0, 0.0) prepended and (1.0, 1.0) appended to it.
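To make the shape of such a curve concrete, here is a small sketch that builds ROC points from scored, labeled examples. It is an illustration under assumptions, not the BinaryClassificationMetrics implementation: the class name `RocSketch`, the array-based inputs, and the one-point-per-distinct-score thresholding are choices made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Illustrative sketch only; not the Spark implementation. */
final class RocSketch {

    /** Returns points as {falsePositiveRate, truePositiveRate} pairs. */
    static List<double[]> roc(double[] scores, boolean[] labels) {
        int n = scores.length;
        long positives = 0;
        for (boolean label : labels) {
            if (label) {
                positives++;
            }
        }
        long negatives = n - positives;
        if (labels.length != n || positives == 0 || negatives == 0) {
            throw new IllegalArgumentException("need matching arrays with both classes present");
        }

        // Visit examples from the highest score to the lowest.
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> -scores[i]));

        List<double[]> curve = new ArrayList<>();
        curve.add(new double[] {0.0, 0.0}); // prepended origin
        long tp = 0;
        long fp = 0;
        for (int k = 0; k < n; k++) {
            int i = order[k];
            if (labels[i]) {
                tp++;
            } else {
                fp++;
            }
            // Emit one point per distinct score (threshold), after all ties.
            boolean lastOfTie = (k == n - 1) || scores[order[k + 1]] != scores[i];
            if (lastOfTie) {
                curve.add(new double[] {(double) fp / negatives, (double) tp / positives});
            }
        }
        curve.add(new double[] {1.0, 1.0}); // appended end point
        return curve;
    }
}
```

Because the lowest threshold already classifies everything as positive, the appended (1.0, 1.0) duplicates the last computed point in this sketch.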
- rolledOver() - Method in interface org.apache.spark.util.logging.RollingPolicy
-
Notify that rollover has occurred
- rolledOver() - Method in class org.apache.spark.util.logging.SizeBasedRollingPolicy
-
Rollover has occurred, so reset the counter
- rolledOver() - Method in class org.apache.spark.util.logging.TimeBasedRollingPolicy
-
Rollover has occurred, so find the next time to rollover
- RollingFileAppender - Class in org.apache.spark.util.logging
-
Continuously appends data from input stream into the given file, and rolls
over the file after the given interval.
- RollingFileAppender(InputStream, File, RollingPolicy, SparkConf, int) - Constructor for class org.apache.spark.util.logging.RollingFileAppender
-
- rollingPolicy() - Method in class org.apache.spark.util.logging.RollingFileAppender
-
- RollingPolicy - Interface in org.apache.spark.util.logging
-
- rolloverIntervalMillis() - Method in class org.apache.spark.util.logging.TimeBasedRollingPolicy
-
- rolloverSizeBytes() - Method in class org.apache.spark.util.logging.SizeBasedRollingPolicy
-
- root() - Method in class org.apache.spark.mllib.fpm.FPTree
-
- rootHandler() - Method in class org.apache.spark.ui.ServerInfo
-
- rootMeanSquaredError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
-
Returns the root mean squared error, which is defined as the square root of
the mean squared error.
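The definition above translates directly into a few lines of arithmetic. The following is a minimal sketch over plain arrays, with a hypothetical class name `RmseSketch`; it does not use or reproduce the RegressionMetrics class.

```java
/** Illustrative sketch only; not the Spark implementation. */
final class RmseSketch {

    /** Square root of the mean of squared differences between the two arrays. */
    static double rootMeanSquaredError(double[] predictions, double[] observations) {
        if (predictions.length == 0 || predictions.length != observations.length) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double sumOfSquares = 0.0;
        for (int i = 0; i < predictions.length; i++) {
            double error = predictions[i] - observations[i];
            sumOfSquares += error * error;
        }
        double meanSquaredError = sumOfSquares / predictions.length;
        return Math.sqrt(meanSquaredError);
    }
}
```

For predictions {1, 2, 3} against observations {1, 2, 6}, the squared errors are 0, 0, and 9, so the mean squared error is 3 and the result is the square root of 3.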
- rootPool() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- rootPool() - Method in class org.apache.spark.scheduler.FIFOSchedulableBuilder
-
- rootPool() - Method in interface org.apache.spark.scheduler.SchedulableBuilder
-
- rootPool() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- rootPool() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- routingTable() - Method in class org.apache.spark.graphx.impl.ShippableVertexPartition
-
- RoutingTablePartition - Class in org.apache.spark.graphx.impl
-
Stores the locations of edge-partition join sites for each vertex attribute in a particular
vertex partition.
- RoutingTablePartition(Tuple3<long[], BitSet, BitSet>[]) - Constructor for class org.apache.spark.graphx.impl.RoutingTablePartition
-
- rowIndices() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- RowMatrix - Class in org.apache.spark.mllib.linalg.distributed
-
:: Experimental ::
Represents a row-oriented distributed Matrix with no meaningful row indices.
- RowMatrix(RDD<Vector>, long, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
- RowMatrix(RDD<Vector>) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Alternative constructor leaving matrix dimensions to be determined automatically.
- RowReadSupport - Class in org.apache.spark.sql.parquet
-
A parquet.hadoop.api.ReadSupport for Row objects.
- RowReadSupport() - Constructor for class org.apache.spark.sql.parquet.RowReadSupport
-
- RowRecordMaterializer - Class in org.apache.spark.sql.parquet
-
A parquet.io.api.RecordMaterializer for Rows.
- RowRecordMaterializer(CatalystConverter) - Constructor for class org.apache.spark.sql.parquet.RowRecordMaterializer
-
- RowRecordMaterializer(MessageType, Seq<Attribute>) - Constructor for class org.apache.spark.sql.parquet.RowRecordMaterializer
-
- rows() - Method in class org.apache.spark.mllib.linalg.distributed.GridPartitioner
-
- rows() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
- rows() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
- rowsPerBlock() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
- rowsPerPart() - Method in class org.apache.spark.mllib.linalg.distributed.GridPartitioner
-
- rowToJSON(StructType, JsonGenerator, Row) - Static method in class org.apache.spark.sql.json.JsonRDD
-
Transforms a single Row to JSON using Jackson
- RowWriteSupport - Class in org.apache.spark.sql.parquet
-
A parquet.hadoop.api.WriteSupport for Row objects.
- RowWriteSupport() - Constructor for class org.apache.spark.sql.parquet.RowWriteSupport
-
- run(Function0<T>, ExecutionContext) - Method in class org.apache.spark.ComplexFutureAction
-
Executes some action enclosed in the closure.
- run(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.ConnectedComponents
-
Compute the connected component membership of each vertex and return a graph with the vertex
value containing the lowest vertex id in the connected component containing that vertex.
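The labeling described here (every vertex ends up tagged with the smallest vertex id in its component) can be sketched on a single machine with repeated min-label propagation. This is only an illustration of the idea: the class name `ConnectedComponentsSketch` and the array-based graph representation are assumptions, and the GraphX version operates on a distributed Graph instead.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; not the GraphX implementation. */
final class ConnectedComponentsSketch {

    /** Maps each vertex id to the lowest vertex id in its connected component. */
    static Map<Long, Long> run(long[] vertices, long[][] edges) {
        Map<Long, Long> label = new HashMap<>();
        // Every vertex starts in its own component, labeled with its own id.
        for (long v : vertices) {
            label.put(v, v);
        }
        for (long[] e : edges) {
            label.putIfAbsent(e[0], e[0]);
            label.putIfAbsent(e[1], e[1]);
        }
        // Push the smaller label across each edge until nothing changes.
        boolean changed = true;
        while (changed) {
            changed = false;
            for (long[] e : edges) {
                long a = label.get(e[0]);
                long b = label.get(e[1]);
                long min = Math.min(a, b);
                if (a != min) {
                    label.put(e[0], min);
                    changed = true;
                }
                if (b != min) {
                    label.put(e[1], min);
                    changed = true;
                }
            }
        }
        return label;
    }
}
```

Edges are treated as undirected, so the direction in which an edge is listed does not affect the final labels.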
- run(Graph<VD, ED>, int, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.LabelPropagation
-
Run static Label Propagation for detecting communities in networks.
- run(Graph<VD, ED>, int, double, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank
-
Run PageRank for a fixed number of iterations returning a graph
with vertex attributes containing the PageRank and edge
attributes the normalized edge weight.
- run(Graph<VD, ED>, Seq<Object>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.ShortestPaths
-
Computes shortest paths to the given set of landmark vertices.
- run(Graph<VD, ED>, int, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.StronglyConnectedComponents
-
Compute the strongly connected component (SCC) of each vertex and return a graph with the
vertex value containing the lowest vertex id in the SCC containing that vertex.
- run(RDD<Edge<Object>>, SVDPlusPlus.Conf) - Static method in class org.apache.spark.graphx.lib.SVDPlusPlus
-
This method is deprecated in favor of runSVDPlusPlus(), which replaces DoubleMatrix with Array[Double] in its return value.
- run(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.TriangleCount
-
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.classification.NaiveBayes
-
Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries.
- run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Perform expectation maximization
- run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Train a K-means model on the given set of points; data should be cached for high performance, because this is an iterative algorithm.
- run(RDD<Tuple2<Object, Vector>>) - Method in class org.apache.spark.mllib.clustering.LDA
-
Learn an LDA model using the given dataset.
- run(JavaPairRDD<Long, Vector>) - Method in class org.apache.spark.mllib.clustering.LDA
-
Java-friendly version of run()
- run(RDD<Tuple3<Object, Object, Object>>) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
-
Run the PIC algorithm.
- run(JavaRDD<Tuple3<Long, Long, Double>>) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
-
A Java-friendly version of PowerIterationClustering.run.
- run(RDD<Object>, ClassTag<Item>) - Method in class org.apache.spark.mllib.fpm.FPGrowth
-
Computes an FP-Growth model that contains frequent itemsets.
- run(JavaRDD<Basket>) - Method in class org.apache.spark.mllib.fpm.FPGrowth
-
- run(RDD<Rating>) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Run ALS with the configured parameters on an input RDD of (user, product, rating) triples.
- run(JavaRDD<Rating>) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Java-friendly version of ALS.run.
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
Run the algorithm with the configured parameters on an input
RDD of LabeledPoint entries.
- run(RDD<LabeledPoint>, Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
Run the algorithm with the configured parameters on an input RDD
of LabeledPoint entries starting from the initial weights provided.
- run(RDD<Tuple3<Object, Object, Object>>) - Method in class org.apache.spark.mllib.regression.IsotonicRegression
-
Run IsotonicRegression algorithm to obtain isotonic regression model.
- run(JavaRDD<Tuple3<Double, Double, Double>>) - Method in class org.apache.spark.mllib.regression.IsotonicRegression
-
Run pool adjacent violators algorithm to obtain isotonic regression model.
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model over an RDD
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees
-
Method to train a gradient boosting model
- run(JavaRDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees
-
Java-friendly API for org.apache.spark.mllib.tree.GradientBoostedTrees#run.
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.RandomForest
-
Method to train a decision tree model over an RDD
- run() - Method in class org.apache.spark.rdd.PartitionCoalescer
-
Runs the packing algorithm and returns an array of PartitionGroups that if possible are
load balanced and grouped by locality
- run(long, int) - Method in class org.apache.spark.scheduler.Task
-
Called by Executor to run this task.
- run(SQLContext) - Method in class org.apache.spark.sql.hive.execution.AddFile
-
- run(SQLContext) - Method in class org.apache.spark.sql.hive.execution.AddJar
-
- run(SQLContext) - Method in class org.apache.spark.sql.hive.execution.AnalyzeTable
-
- run(SQLContext) - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSource
-
- run(SQLContext) - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSourceAsSelect
-
- run(SQLContext) - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
-
- run(SQLContext) - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
-
- run(SQLContext) - Method in class org.apache.spark.sql.hive.execution.DropTable
-
- run(SQLContext) - Method in class org.apache.spark.sql.hive.execution.HiveNativeCommand
-
- run(SQLContext) - Method in class org.apache.spark.sql.sources.CreateTempTableUsing
-
- run(SQLContext) - Method in class org.apache.spark.sql.sources.CreateTempTableUsingAsSelect
-
- run(SQLContext) - Method in class org.apache.spark.sql.sources.InsertIntoDataSource
-
- run(SQLContext) - Method in class org.apache.spark.sql.sources.RefreshTable
-
- run() - Method in class org.apache.spark.streaming.CheckpointWriter.CheckpointWriteHandler
-
- run() - Method in class org.apache.spark.streaming.flume.FlumeBatchFetcher
-
- run() - Method in class org.apache.spark.streaming.scheduler.Job
-
- run() - Method in class org.apache.spark.util.RedirectThread
-
- runApproximateJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ApproximateEvaluator<U, R>, CallSite, long, Properties) - Method in class org.apache.spark.scheduler.DAGScheduler
-
- runApproximateJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ApproximateEvaluator<U, R>, long) - Method in class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Run a job that can return approximate results.
- runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in class org.apache.spark.ComplexFutureAction
-
Runs a Spark job.
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, CallSite, boolean, Function2<Object, U, BoxedUnit>, Properties, ClassTag<U>) - Method in class org.apache.spark.scheduler.DAGScheduler
-
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, boolean, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a function on a given set of partitions in an RDD and pass the results to the given
handler function.
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, boolean, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a function on a given set of partitions in an RDD and return the results as an array.
- runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, boolean, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a job on a given set of partitions of an RDD, but take a function of type Iterator[T] => U instead of (TaskContext, Iterator[T]) => U.
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a job on all partitions in an RDD and return the results in an array.
- runJob(RDD<T>, Function1<Iterator<T>, U>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a job on all partitions in an RDD and return the results in an array.
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a job on all partitions in an RDD and pass the results to a handler function.
- runJob(RDD<T>, Function1<Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a job on all partitions in an RDD and pass the results to a handler function.
- runLBFGS(RDD<Tuple2<Object, Vector>>, Gradient, Updater, int, double, int, double, Vector) - Static method in class org.apache.spark.mllib.optimization.LBFGS
-
Run Limited-memory BFGS (L-BFGS) in parallel.
- RunLengthEncoding - Class in org.apache.spark.sql.columnar.compression
-
- RunLengthEncoding() - Constructor for class org.apache.spark.sql.columnar.compression.RunLengthEncoding
-
- RunLengthEncoding.Decoder<T extends org.apache.spark.sql.types.NativeType> - Class in org.apache.spark.sql.columnar.compression
-
- RunLengthEncoding.Decoder(ByteBuffer, NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Decoder
-
- RunLengthEncoding.Encoder<T extends org.apache.spark.sql.types.NativeType> - Class in org.apache.spark.sql.columnar.compression
-
- RunLengthEncoding.Encoder(NativeColumnType<T>) - Constructor for class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Encoder
-
- runMiniBatchSGD(RDD<Tuple2<Object, Vector>>, Gradient, Updater, double, int, double, double, Vector) - Static method in class org.apache.spark.mllib.optimization.GradientDescent
-
Run stochastic gradient descent (SGD) in parallel using mini batches.
- running() - Method in class org.apache.spark.scheduler.TaskInfo
-
- RUNNING() - Static method in class org.apache.spark.TaskState
-
- runningBatches() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
-
- runningLocally() - Method in class org.apache.spark.TaskContext
-
- runningLocally() - Method in class org.apache.spark.TaskContextImpl
-
- runningStages() - Method in class org.apache.spark.scheduler.DAGScheduler
-
- runningTasks() - Method in class org.apache.spark.scheduler.Pool
-
- runningTasks() - Method in interface org.apache.spark.scheduler.Schedulable
-
- runningTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- runningTasksSet() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- runSVDPlusPlus(RDD<Edge<Object>>, SVDPlusPlus.Conf) - Static method in class org.apache.spark.graphx.lib.SVDPlusPlus
-
:: Experimental ::
- runTask(TaskContext) - Method in class org.apache.spark.scheduler.ResultTask
-
- runTask(TaskContext) - Method in class org.apache.spark.scheduler.ShuffleMapTask
-
- runTask(TaskContext) - Method in class org.apache.spark.scheduler.Task
-
- RuntimePercentage - Class in org.apache.spark.scheduler
-
- RuntimePercentage(double, Option<Object>, double) - Constructor for class org.apache.spark.scheduler.RuntimePercentage
-
- runUntilConvergence(Graph<VD, ED>, double, double, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank
-
Run a dynamic version of PageRank returning a graph with vertex attributes containing the
PageRank and edge attributes containing the normalized edge weight.
- s() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
-
- s1() - Method in class org.apache.spark.rdd.CartesianPartition
-
- s2() - Method in class org.apache.spark.rdd.CartesianPartition
-
- sameResult(LogicalPlan) - Method in class org.apache.spark.sql.hive.MetastoreRelation
-
Only compare database and tablename, not alias.
- sameResult(LogicalPlan) - Method in class org.apache.spark.sql.sources.LogicalRelation
-
- sample(boolean, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a sampled subset of this RDD.
- sample(boolean, Double, long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a sampled subset of this RDD.
- sample(boolean, double) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a sampled subset of this RDD.
- sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a sampled subset of this RDD.
- sample(boolean, double) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a sampled subset of this RDD.
- sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a sampled subset of this RDD.
- sample(boolean, double, long) - Method in class org.apache.spark.rdd.RDD
-
Return a sampled subset of this RDD.
- sample(boolean, double, long) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame by sampling a fraction of rows.
- sample(boolean, double) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame by sampling a fraction of rows, using a random seed.
- sample(Iterator<T>) - Method in class org.apache.spark.util.random.BernoulliCellSampler
-
- sample(Iterator<T>) - Method in class org.apache.spark.util.random.BernoulliSampler
-
- sample(Iterator<T>) - Method in class org.apache.spark.util.random.PoissonSampler
-
- sample(Iterator<T>) - Method in interface org.apache.spark.util.random.RandomSampler
-
Take a random sample.
- sampleByKey(boolean, Map<K, Object>, long) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a subset of this RDD sampled by key (via stratified sampling).
- sampleByKey(boolean, Map<K, Object>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a subset of this RDD sampled by key (via stratified sampling).
- sampleByKey(boolean, Map<K, Object>, long) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return a subset of this RDD sampled by key (via stratified sampling).
- sampleByKeyExact(boolean, Map<K, Object>, long) - Method in class org.apache.spark.api.java.JavaPairRDD
-
::Experimental::
Return a subset of this RDD sampled by key (via stratified sampling) containing exactly
math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
- sampleByKeyExact(boolean, Map<K, Object>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
::Experimental::
Return a subset of this RDD sampled by key (via stratified sampling) containing exactly
math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
- sampleByKeyExact(boolean, Map<K, Object>, long) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
::Experimental::
Return a subset of this RDD sampled by key (via stratified sampling) containing exactly
math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
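The sampleByKeyExact entries above promise exactly math.ceil(numItems * samplingRate) items per stratum. The plain-Java sketch below only works out those target sizes from per-key counts and per-key sampling rates. The class ExactStratumSizes and its method are hypothetical names for this example, and rejecting a key with no configured rate is a choice of the sketch, not documented Spark behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Spark: target sample size for each stratum.
final class ExactStratumSizes {

    /** Returns ceil(numItems * samplingRate) for every key in {@code countsByKey}. */
    static <K> Map<K, Long> targetSizes(Map<K, Long> countsByKey, Map<K, Double> fractions) {
        Map<K, Long> sizes = new LinkedHashMap<>();
        for (Map.Entry<K, Long> entry : countsByKey.entrySet()) {
            Double rate = fractions.get(entry.getKey());
            if (rate == null) {
                // Assumption of this sketch: every key must have a sampling rate.
                throw new IllegalArgumentException("no sampling rate for key " + entry.getKey());
            }
            sizes.put(entry.getKey(), (long) Math.ceil(entry.getValue() * rate));
        }
        return sizes;
    }
}
```

For example, a stratum of 10 items sampled at a rate of 0.25 has a target of ceil(2.5) = 3 items.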
- SampledRDD<T> - Class in org.apache.spark.rdd
-
- SampledRDD(RDD<T>, boolean, double, int, ClassTag<T>) - Constructor for class org.apache.spark.rdd.SampledRDD
-
- SampledRDDPartition - Class in org.apache.spark.rdd
-
- SampledRDDPartition(Partition, int) - Constructor for class org.apache.spark.rdd.SampledRDDPartition
-
- sampleLogNormal(double, double, int, long) - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
Randomly samples from a log normal distribution whose corresponding normal distribution has
the given mean and standard deviation.
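A log-normal variate can be drawn by exponentiating a normal variate with the given mean and standard deviation. The plain-Java sketch below shows that relationship only. The class name LogNormalSketch and the parameter names mu and sigma are assumptions for this example, and it does not model the int and long arguments of the sampleLogNormal signature listed above.

```java
import java.util.Random;

// Hypothetical helper, not part of Spark: one draw from a log-normal distribution.
final class LogNormalSketch {

    /**
     * Draws X = exp(mu + sigma * Z) with Z ~ N(0, 1), so that log(X) is normal
     * with mean {@code mu} and standard deviation {@code sigma}.
     */
    static double sample(double mu, double sigma, Random rng) {
        return Math.exp(mu + sigma * rng.nextGaussian());
    }
}
```

Passing a Random constructed with a fixed seed makes the draw reproducible.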
- sampleStdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Compute the sample standard deviation of this RDD's elements (which corrects for bias in
estimating the standard deviation by dividing by N-1 instead of N).
- sampleStdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Compute the sample standard deviation of this RDD's elements (which corrects for bias in
estimating the standard deviation by dividing by N-1 instead of N).
- sampleStdev() - Method in class org.apache.spark.util.StatCounter
-
Return the sample standard deviation of the values, which corrects for bias in estimating the
variance by dividing by N-1 instead of N.
- sampleVariance() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Compute the sample variance of this RDD's elements (which corrects for bias in
estimating the variance by dividing by N-1 instead of N).
- sampleVariance() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Compute the sample variance of this RDD's elements (which corrects for bias in
estimating the variance by dividing by N-1 instead of N).
- sampleVariance() - Method in class org.apache.spark.util.StatCounter
-
Return the sample variance, which corrects for bias in estimating the variance by dividing
by N-1 instead of N.
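The sampleStdev and sampleVariance entries above all describe the same correction: divide the sum of squared deviations by N-1 rather than N. The plain-Java sketch below computes both values for an array. SampleStatsSketch is a hypothetical class for this example, and returning NaN for fewer than two values is a choice of the sketch.

```java
// Hypothetical helper, not part of Spark: sample variance and standard deviation.
final class SampleStatsSketch {

    /** Sum of squared deviations from the mean, divided by N-1. */
    static double sampleVariance(double[] values) {
        int n = values.length;
        if (n < 2) {
            return Double.NaN;            // N-1 would be zero or negative
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / n;
        double squaredDeviations = 0.0;
        for (double v : values) {
            squaredDeviations += (v - mean) * (v - mean);
        }
        return squaredDeviations / (n - 1);
    }

    /** Square root of the sample variance. */
    static double sampleStdev(double[] values) {
        return Math.sqrt(sampleVariance(values));
    }
}
```

For the values 1, 2, 3, 4 the mean is 2.5 and the squared deviations sum to 5, so the sample variance is 5/3, whereas dividing by N would give 1.25.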
- samplingRatio() - Method in class org.apache.spark.sql.json.JSONRelation
-
- SamplingUtils - Class in org.apache.spark.util.random
-
- SamplingUtils() - Constructor for class org.apache.spark.util.random.SamplingUtils
-
- save(SparkContext, String, String, int, int, Vector, double, Option<Object>) - Method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$
-
Helper method for saving GLM classification model metadata and data.
- save(SparkContext, String) - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.classification.SVMModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
- save(MatrixFactorizationModel, String) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel.SaveLoadV1_0$
-
Saves a MatrixFactorizationModel, where user features are saved under data/users and
product features are saved under data/products.
- save(SparkContext, String, String, Vector, double) - Method in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$
-
Helper method for saving GLM regression model metadata and data.
- save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.LassoModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
- save(SparkContext, String, DecisionTreeModel) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
-
- save(SparkContext, String, TreeEnsembleModel, String) - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$
-
- save(SparkContext, String) - Method in interface org.apache.spark.mllib.util.Saveable
-
Save this model to the given path.
- save(String) - Method in class org.apache.spark.sql.DataFrame
-
:: Experimental ::
Saves the contents of this DataFrame to the given path,
using the default data source configured by spark.sql.sources.default and
SaveMode.ErrorIfExists as the save mode.
- save(String, SaveMode) - Method in class org.apache.spark.sql.DataFrame
-
:: Experimental ::
Saves the contents of this DataFrame to the given path and SaveMode specified by mode,
using the default data source configured by spark.sql.sources.default.
- save(String, String) - Method in class org.apache.spark.sql.DataFrame
-
:: Experimental ::
Saves the contents of this DataFrame to the given path based on the given data source,
using SaveMode.ErrorIfExists as the save mode.
- save(String, String, SaveMode) - Method in class org.apache.spark.sql.DataFrame
-
:: Experimental ::
Saves the contents of this DataFrame to the given path based on the given data source and
SaveMode specified by mode.
- save(String, SaveMode, Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
-
:: Experimental ::
Saves the contents of this DataFrame based on the given data source,
SaveMode specified by mode, and a set of options.
- save(String, SaveMode, Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
-
:: Experimental ::
(Scala-specific)
Saves the contents of this DataFrame based on the given data source,
SaveMode specified by mode, and a set of options.
- Saveable - Interface in org.apache.spark.mllib.util
-
:: DeveloperApi ::
- saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for
that storage system.
- saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for
that storage system.
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported file system.
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported file system.
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported file system, compressing with the supplied codec.
- saveAsHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class
supporting the key and value types K and V in this RDD.
- saveAsHadoopFile(String, Class<? extends CompressionCodec>, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class
supporting the key and value types K and V in this RDD.
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class
supporting the key and value types K and V in this RDD.
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class
supporting the key and value types K and V in this RDD.
- saveAsHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this
DStream as a Hadoop file.
- saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this
DStream as a Hadoop file.
- saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<F>, JobConf) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this
DStream as a Hadoop file.
- saveAsHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Save each RDD in this
DStream as a Hadoop file.
- saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Save each RDD in this
DStream as a Hadoop file.
- saveAsHiveFile(RDD<Row>, Class<?>, ShimFileSinkDesc, SerializableWritable<JobConf>, SparkHiveWriterContainer) - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
-
- saveAsLibSVMFile(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Save labeled data in LIBSVM format.
- saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported storage system, using
a Configuration object for that storage system.
- saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported storage system with new Hadoop API, using a Hadoop
Configuration object for that storage system.
- saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>, Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported file system.
- saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported file system.
- saveAsNewAPIHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat
(mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
- saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat
(mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
- saveAsNewAPIHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this
DStream as a Hadoop file.
- saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this
DStream as a Hadoop file.
- saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<F>, Configuration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this
DStream as a Hadoop file.
- saveAsNewAPIHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Save each RDD in this
DStream as a Hadoop file.
- saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Save each RDD in this
DStream as a Hadoop file.
- saveAsObjectFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Save this RDD as a SequenceFile of serialized objects.
- saveAsObjectFile(String) - Method in class org.apache.spark.rdd.RDD
-
Save this RDD as a SequenceFile of serialized objects.
- saveAsObjectFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream
-
Save each RDD in this DStream as a SequenceFile of serialized objects.
- saveAsParquetFile(String) - Method in class org.apache.spark.sql.DataFrame
-
Saves the contents of this DataFrame as a Parquet file, preserving the schema.
- saveAsSequenceFile(String, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.SequenceFileRDDFunctions
-
Output the RDD as a Hadoop SequenceFile using the Writable types we infer from the RDD's key
and value types.
- saveAsTable(String) - Method in class org.apache.spark.sql.DataFrame
-
:: Experimental ::
Creates a table from the contents of this DataFrame.
- saveAsTable(String, SaveMode) - Method in class org.apache.spark.sql.DataFrame
-
:: Experimental ::
Creates a table from the contents of this DataFrame, using the default data source
configured by spark.sql.sources.default and the SaveMode specified by mode.
- saveAsTable(String, String) - Method in class org.apache.spark.sql.DataFrame
-
:: Experimental ::
Creates a table at the given path from the contents of this DataFrame
based on a given data source and a set of options,
using SaveMode.ErrorIfExists as the save mode.
- saveAsTable(String, String, SaveMode) - Method in class org.apache.spark.sql.DataFrame
-
:: Experimental ::
Creates a table at the given path from the contents of this DataFrame
based on a given data source,
SaveMode specified by mode, and a set of options.
- saveAsTable(String, String, SaveMode, Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
-
:: Experimental ::
Creates a table at the given path from the contents of this DataFrame
based on a given data source,
SaveMode specified by mode, and a set of options.
- saveAsTable(String, String, SaveMode, Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
-
:: Experimental ::
(Scala-specific)
Creates a table from the contents of this DataFrame based on a given data source,
SaveMode specified by mode, and a set of options.
- saveAsTextFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Save this RDD as a text file, using string representations of elements.
- saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Save this RDD as a compressed text file, using string representations of elements.
- saveAsTextFile(String) - Method in class org.apache.spark.rdd.RDD
-
Save this RDD as a text file, using string representations of elements.
- saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.RDD
-
Save this RDD as a compressed text file, using string representations of elements.
- saveAsTextFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream
-
Save each RDD in this DStream as a text file, using string representation
of elements.
- saveLabeledData(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.util.MLUtils
-
- SaveMode - Enum in org.apache.spark.sql
-
SaveMode is used to specify the expected behavior of saving a DataFrame to a data source.
- sc() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- sc() - Method in class org.apache.spark.scheduler.DAGScheduler
-
- sc() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- sc() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
-
- sc() - Method in class org.apache.spark.sql.SQLContext.implicits.StringToColumn
-
- sc() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
- sc() - Method in class org.apache.spark.streaming.StreamingContext
-
- sc() - Method in class org.apache.spark.ui.exec.ExecutorsTab
-
- sc() - Method in class org.apache.spark.ui.jobs.JobsTab
-
- sc() - Method in class org.apache.spark.ui.jobs.StagesTab
-
- sc() - Method in class org.apache.spark.ui.SparkUI
-
- scal(double, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
-
x = a * x
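The scal entry above is summarized as x = a * x, that is, every element of the vector is multiplied by the scalar and the result overwrites the vector. The plain-Java sketch below shows the operation on a dense double array. ScalSketch is a hypothetical class for this example; it does not use Spark's Vector type or its BLAS class.

```java
// Hypothetical helper, not part of Spark: in-place scaling of a dense vector.
final class ScalSketch {

    /** Overwrites {@code x} with a * x, element by element. */
    static void scal(double a, double[] x) {
        for (int i = 0; i < x.length; i++) {
            x[i] *= a;
        }
    }
}
```

Because the update happens in place, callers that still need the original values have to copy the array first.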
- scalaIntToJavaLong(DStream<Object>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
- scalaTag() - Method in class org.apache.spark.sql.columnar.NativeColumnType
-
Scala TypeTag.
- scalaToJavaLong(JavaPairDStream<K, Object>, ClassTag<K>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- scale() - Method in class org.apache.spark.mllib.random.GammaGenerator
-
- scanTable(SparkContext, StructType, String, String, String, String[], Filter[], Partition[]) - Static method in class org.apache.spark.sql.jdbc.JDBCRDD
-
Build and return JDBCRDD from the given information.
- Schedulable - Interface in org.apache.spark.scheduler
-
An interface for schedulable entities.
- SchedulableBuilder - Interface in org.apache.spark.scheduler
-
An interface to build the Schedulable tree.
buildPools: builds the tree nodes (pools).
addTaskSetManager: builds the leaf nodes (TaskSetManagers).
- schedulableBuilder() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- schedulableNameToSchedulable() - Method in class org.apache.spark.scheduler.Pool
-
- schedulableQueue() - Method in class org.apache.spark.scheduler.Pool
-
- schedulableQueue() - Method in interface org.apache.spark.scheduler.Schedulable
-
- schedulableQueue() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- scheduler() - Method in class org.apache.spark.streaming.StreamingContext
-
- SCHEDULER_DELAY() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
-
- SCHEDULER_DELAY() - Static method in class org.apache.spark.ui.ToolTips
-
- schedulerAllocFile() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- SchedulerBackend - Interface in org.apache.spark.scheduler
-
A backend interface for scheduling systems that allows plugging in different ones under
TaskSchedulerImpl.
- schedulerBackend() - Method in class org.apache.spark.SparkContext
-
- SCHEDULING_MODE_PROPERTY() - Method in class org.apache.spark.scheduler.FairSchedulableBuilder
-
- SchedulingAlgorithm - Interface in org.apache.spark.scheduler
-
An interface for the sort algorithm.
FIFO: FIFO algorithm between TaskSetManagers.
FS: FS algorithm between Pools, and FIFO or FS within Pools.
- schedulingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
Time taken for the first job of this batch to start processing from the time this batch
was submitted to the streaming scheduler.
- schedulingDelayDistribution() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
-
- schedulingMode() - Method in class org.apache.spark.scheduler.Pool
-
- schedulingMode() - Method in interface org.apache.spark.scheduler.Schedulable
-
- SchedulingMode - Class in org.apache.spark.scheduler
-
"FAIR" and "FIFO" determine which policy is used
to order tasks amongst a Schedulable's sub-queues.
"NONE" is used when a Schedulable has no sub-queues.
- SchedulingMode() - Constructor for class org.apache.spark.scheduler.SchedulingMode
-
- schedulingMode() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- schedulingMode() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- schedulingMode() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- schedulingMode() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- schedulingPool() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
-
- schema() - Method in class org.apache.spark.sql.columnar.ColumnStatisticsSchema
-
- schema() - Method in class org.apache.spark.sql.columnar.PartitionStatistics
-
- schema() - Method in class org.apache.spark.sql.DataFrame
-
- schema() - Method in class org.apache.spark.sql.jdbc.JDBCRelation
-
- schema() - Method in class org.apache.spark.sql.json.JSONRelation
-
- schema() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
-
- schema() - Method in class org.apache.spark.sql.sources.BaseRelation
-
- SCHEMA_STRING_LENGTH_THRESHOLD() - Static method in class org.apache.spark.sql.SQLConf
-
- schemaLess() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
-
- SchemaRelationProvider - Interface in org.apache.spark.sql.sources
-
::DeveloperApi::
Implemented by objects that produce relations for a specific kind of data source
with a given schema.
- schemaStringLengthThreshold() - Method in class org.apache.spark.sql.SQLConf
-
- schemes() - Method in interface org.apache.spark.sql.columnar.compression.AllCompressionSchemes
-
- schemes() - Method in interface org.apache.spark.sql.columnar.compression.WithCompressionSchemes
-
- scoreAndLabels() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
- scratch() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
-
- script() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformation
-
- Scripts() - Method in interface org.apache.spark.sql.hive.HiveStrategies
-
- ScriptTransformation - Class in org.apache.spark.sql.hive.execution
-
Transforms the input by forking and running the specified script.
- ScriptTransformation(Seq<Expression>, String, Seq<Attribute>, SparkPlan, HiveScriptIOSchema, HiveContext) - Constructor for class org.apache.spark.sql.hive.execution.ScriptTransformation
-
- seconds() - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- seconds(long) - Static method in class org.apache.spark.streaming.Durations
-
- Seconds - Class in org.apache.spark.streaming
-
Helper object that creates an instance of Duration representing
a given number of seconds.
- Seconds() - Constructor for class org.apache.spark.streaming.Seconds
-
- SECONDS_PER_MINUTE() - Static method in class org.apache.spark.sql.parquet.CatalystTimestampConverter
-
- SecurityManager - Class in org.apache.spark
-
Spark class responsible for security.
- SecurityManager(SparkConf) - Constructor for class org.apache.spark.SecurityManager
-
- securityManager() - Method in class org.apache.spark.SparkEnv
-
- securityManager() - Method in class org.apache.spark.ui.SparkUI
-
- seed() - Method in class org.apache.spark.mllib.rdd.RandomRDDPartition
-
- seed() - Method in class org.apache.spark.rdd.PartitionwiseSampledRDDPartition
-
- seed() - Method in class org.apache.spark.rdd.SampledRDDPartition
-
- seedBrokers() - Method in class org.apache.spark.streaming.kafka.KafkaCluster.SimpleConsumerConfig
-
- seenNulls() - Method in interface org.apache.spark.sql.columnar.NullableColumnAccessor
-
- segment() - Method in class org.apache.spark.streaming.rdd.WriteAheadLogBackedBlockRDDPartition
-
- segment() - Method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedStoreResult
-
- select(Column...) - Method in class org.apache.spark.sql.DataFrame
-
Selects a set of expressions.
- select(String, String...) - Method in class org.apache.spark.sql.DataFrame
-
Selects a set of columns.
- select(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
-
Selects a set of expressions.
- select(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame
-
Selects a set of columns.
- selectedFeatures() - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
-
- selectExpr(String...) - Method in class org.apache.spark.sql.DataFrame
-
Selects a set of SQL expressions.
- selectExpr(Seq<String>) - Method in class org.apache.spark.sql.DataFrame
-
Selects a set of SQL expressions.
- selectNodesToSplit(Queue<Tuple2<Object, Node>>, long, DecisionTreeMetadata, Random) - Static method in class org.apache.spark.mllib.tree.RandomForest
-
Pull nodes off of the queue, and collect a group of nodes to be split on this iteration.
- sender() - Method in class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager
-
- sendToDst(A) - Method in class org.apache.spark.graphx.EdgeContext
-
Sends a message to the destination vertex.
- sendToDst(A) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- sendToSrc(A) - Method in class org.apache.spark.graphx.EdgeContext
-
Sends a message to the source vertex.
- sendToSrc(A) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Get an RDD for a Hadoop SequenceFile with given key and value types.
- sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Get an RDD for a Hadoop SequenceFile.
- sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext
-
Get an RDD for a Hadoop SequenceFile with given key and value types.
- sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.SparkContext
-
Get an RDD for a Hadoop SequenceFile with given key and value types.
- sequenceFile(String, int, ClassTag<K>, ClassTag<V>, Function0<WritableConverter<K>>, Function0<WritableConverter<V>>) - Method in class org.apache.spark.SparkContext
-
Version of sequenceFile() for types implicitly convertible to Writables through a
WritableConverter.
- SequenceFileRDDFunctions<K,V> - Class in org.apache.spark.rdd
-
Extra functions available on RDDs of (key, value) pairs to create a Hadoop SequenceFile,
through an implicit conversion.
- SequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Class<? extends Writable>, Class<? extends Writable>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.SequenceFileRDDFunctions
-
- SequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.SequenceFileRDDFunctions
-
- ser() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- SerializableBuffer - Class in org.apache.spark.util
-
A wrapper around a java.nio.ByteBuffer that is serializable through Java serialization, to make
it easier to pass ByteBuffers in case class messages.
- SerializableBuffer(ByteBuffer) - Constructor for class org.apache.spark.util.SerializableBuffer
-
- serializableHadoopSplit() - Method in class org.apache.spark.rdd.NewHadoopPartition
-
- SerializableWritable<T extends org.apache.hadoop.io.Writable> - Class in org.apache.spark
-
- SerializableWritable(T) - Constructor for class org.apache.spark.SerializableWritable
-
- SerializationDebugger - Class in org.apache.spark.serializer
-
- SerializationDebugger() - Constructor for class org.apache.spark.serializer.SerializationDebugger
-
- SerializationDebugger.ObjectStreamClassMethods - Class in org.apache.spark.serializer
-
An implicit class that allows us to call private methods of ObjectStreamClass.
- SerializationDebugger.ObjectStreamClassMethods(ObjectStreamClass) - Constructor for class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
-
- SerializationDebugger.ObjectStreamClassMethods$ - Class in org.apache.spark.serializer
-
- SerializationDebugger.ObjectStreamClassMethods$() - Constructor for class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods$
-
- SerializationStream - Class in org.apache.spark.serializer
-
:: DeveloperApi ::
A stream for writing serialized objects.
- SerializationStream() - Constructor for class org.apache.spark.serializer.SerializationStream
-
- serialize(Object) - Method in class org.apache.spark.mllib.linalg.VectorUDT
-
- serialize(T, ClassTag<T>) - Method in class org.apache.spark.serializer.JavaSerializerInstance
-
- serialize(T, ClassTag<T>) - Method in class org.apache.spark.serializer.KryoSerializerInstance
-
- serialize(T, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
-
- serialize(Object) - Method in class org.apache.spark.sql.test.ExamplePointUDT
-
- serialize(T) - Static method in class org.apache.spark.util.Utils
-
Serialize an object using Java serialization.
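For readers unfamiliar with Java serialization, the plain-Java sketch below shows a round trip through ObjectOutputStream and ObjectInputStream from the JDK. JavaSerializationSketch and its two methods are hypothetical names for this example, and wrapping checked exceptions in unchecked ones is a choice of the sketch. It is not the body of the Utils.serialize method listed above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Spark: Java serialization to and from a byte array.
final class JavaSerializationSketch {

    /** Writes the object graph rooted at {@code value} into a byte array. */
    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed (and therefore flushed) before the bytes are read back.
        return bytes.toByteArray();
    }

    /** Reads back one object previously written by {@link #serialize}. */
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("class of serialized object not found", e);
        }
    }
}
```

Only objects whose classes implement java.io.Serializable can be written this way; anything else fails with a NotSerializableException, which surfaces here as an UncheckedIOException.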
- serializedData() - Method in class org.apache.spark.scheduler.local.StatusUpdate
-
- serializedTask() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosTaskLaunchData
-
- serializedTask() - Method in class org.apache.spark.scheduler.TaskDescription
-
- serializeFilterExpressions(Seq<Expression>, Configuration) - Static method in class org.apache.spark.sql.parquet.ParquetFilters
-
Note: Inside the Hadoop API we only have access to Configuration, not to SparkContext,
so we cannot use broadcasts to convey the actual filter predicate.
- serializeMapStatuses(MapStatus[]) - Static method in class org.apache.spark.MapOutputTracker
-
- serializePlan(Object, OutputStream) - Method in class org.apache.spark.sql.hive.HiveFunctionWrapper
-
- Serializer - Class in org.apache.spark.serializer
-
:: DeveloperApi ::
A serializer.
- Serializer() - Constructor for class org.apache.spark.serializer.Serializer
-
- serializer() - Method in class org.apache.spark.ShuffleDependency
-
- serializer() - Method in class org.apache.spark.SparkEnv
-
- SerializerInstance - Class in org.apache.spark.serializer
-
:: DeveloperApi ::
An instance of a serializer, for use by one thread at a time.
- SerializerInstance() - Constructor for class org.apache.spark.serializer.SerializerInstance
-
- serializeStream(OutputStream) - Method in class org.apache.spark.serializer.JavaSerializerInstance
-
- serializeStream(OutputStream) - Method in class org.apache.spark.serializer.KryoSerializerInstance
-
- serializeStream(OutputStream) - Method in class org.apache.spark.serializer.SerializerInstance
-
- serializeViaNestedStream(OutputStream, SerializerInstance, Function1<SerializationStream, BoxedUnit>) - Static method in class org.apache.spark.util.Utils
-
Serialize via a nested stream using a specific serializer.
- serializeWithDependencies(Task<?>, HashMap<String, Object>, HashMap<String, Object>, SerializerInstance) - Static method in class org.apache.spark.scheduler.Task
-
Serialize a task and the current app dependencies (files and JARs added to the SparkContext).
- server() - Method in class org.apache.spark.streaming.flume.FlumeReceiver
-
- server() - Method in class org.apache.spark.ui.ServerInfo
-
- ServerInfo - Class in org.apache.spark.ui
-
- ServerInfo(Server, int, ContextHandlerCollection) - Constructor for class org.apache.spark.ui.ServerInfo
-
- ServerStateException - Exception in org.apache.spark
-
Exception type thrown by HttpServer when it is in the wrong state for an operation.
- ServerStateException(String) - Constructor for exception org.apache.spark.ServerStateException
-
- serverUri() - Method in class org.apache.spark.HttpFileServer
-
- SERVLET_DEFAULT_SAMPLE() - Method in class org.apache.spark.metrics.sink.MetricsServlet
-
- SERVLET_KEY_PATH() - Method in class org.apache.spark.metrics.sink.MetricsServlet
-
- SERVLET_KEY_SAMPLE() - Method in class org.apache.spark.metrics.sink.MetricsServlet
-
- servletPath() - Method in class org.apache.spark.metrics.sink.MetricsServlet
-
- servletShowSample() - Method in class org.apache.spark.metrics.sink.MetricsServlet
-
- set(long, long, int, int, VD, VD, ED) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- set(Param<T>, T) - Method in interface org.apache.spark.ml.param.Params
-
Sets a parameter in the embedded param map.
- set(String, Object) - Method in interface org.apache.spark.ml.param.Params
-
Sets a parameter (by name) in the embedded param map.
- set(String, String) - Method in class org.apache.spark.SparkConf
-
Set a configuration variable.
- set(SparkEnv) - Static method in class org.apache.spark.SparkEnv
-
- set(Function0<Object>) - Method in class org.apache.spark.sql.hive.DeferredObjectAdapter
-
- set(int, long) - Method in class org.apache.spark.sql.parquet.timestamp.NanoTime
-
- setAcls(boolean) - Method in class org.apache.spark.SecurityManager
-
- setActiveContext(SparkContext, boolean) - Static method in class org.apache.spark.SparkContext
-
Called at the end of the SparkContext constructor to ensure that no other SparkContext has
raced with this constructor and started.
- setAdminAcls(String) - Method in class org.apache.spark.SecurityManager
-
Admin acls should be set before the view or modify acls.
- setAggregator(Aggregator<K, V, C>) - Method in class org.apache.spark.rdd.ShuffledRDD
-
Set aggregator for RDD's shuffle.
- setAlgo(String) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
Sets Algorithm using a String.
- setAll(Traversable<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf
-
Set multiple parameters together
- setAlpha(double) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setAlpha(double) - Method in class org.apache.spark.mllib.clustering.LDA
-
Alias for setDocConcentration()
- setAlpha(double) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Sets the constant used in computing confidence in implicit ALS.
- setAppName(String) - Method in class org.apache.spark.SparkConf
-
Set a name for your application.
- setAppName(String) - Method in class org.apache.spark.ui.SparkUI
-
Set the app name for this UI.
- setBatchDuration(Duration) - Method in class org.apache.spark.streaming.DStreamGraph
-
- setBeta(double) - Method in class org.apache.spark.mllib.clustering.LDA
-
Alias for setTopicConcentration()
- setBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Set the number of blocks for both user blocks and product blocks to parallelize the computation
into; pass -1 for an auto-configured number of blocks.
- setCallSite(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Pass-through to SparkContext.setCallSite.
- setCallSite(String) - Method in class org.apache.spark.SparkContext
-
Set the thread-local property for overriding the call sites
of actions and RDDs.
- setCallSite(CallSite) - Method in class org.apache.spark.SparkContext
-
Set the thread-local property for overriding the call sites
of actions and RDDs.
- setCategoricalFeaturesInfo(Map<Integer, Integer>) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
Sets categoricalFeaturesInfo using a Java Map.
- setCheckpointDir(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Set the directory under which RDDs are going to be checkpointed.
- setCheckpointDir(String) - Method in class org.apache.spark.SparkContext
-
Set the directory under which RDDs are going to be checkpointed.
- setCheckpointInterval(int) - Method in class org.apache.spark.mllib.clustering.LDA
-
Period (in iterations) between checkpoints (default = 10).
- setCheckpointInterval(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setClock(Clock) - Method in class org.apache.spark.ExecutorAllocationManager
-
Use a different clock for this allocation manager.
- setCompressCodec(String) - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- setCompressed(boolean) - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- setCompressType(String) - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- setConf(Configuration) - Method in interface org.apache.spark.input.Configurable
-
- setConf(String, String) - Method in class org.apache.spark.sql.hive.HiveContext
-
- setConf(Properties) - Method in class org.apache.spark.sql.SQLConf
-
Set Spark SQL configuration properties.
- setConf(String, String) - Method in class org.apache.spark.sql.SQLConf
-
Set the given Spark SQL configuration property.
- setConf(Properties) - Method in class org.apache.spark.sql.SQLContext
-
Set Spark SQL configuration properties.
- setConf(String, String) - Method in class org.apache.spark.sql.SQLContext
-
Set the given Spark SQL configuration property.
- setConsumerOffsetMetadata(String, Map<TopicAndPartition, OffsetMetadataAndError>) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
-
Requires Kafka >= 0.8.1.1
- setConsumerOffsets(String, Map<TopicAndPartition, Object>) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
-
Requires Kafka >= 0.8.1.1
- setContext(StreamingContext) - Method in class org.apache.spark.streaming.dstream.DStream
-
- setContext(StreamingContext) - Method in class org.apache.spark.streaming.DStreamGraph
-
- setConvergenceTol(double) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Set the largest change in log-likelihood at which convergence is
considered to have occurred.
- setConvergenceTol(double) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the convergence tolerance of iterations for L-BFGS.
- setCustomHostname(String) - Static method in class org.apache.spark.util.Utils
-
Allow setting a custom host name because when we run on Mesos we need to use the same
hostname it reports to the master.
- setDAGScheduler(DAGScheduler) - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- setDAGScheduler(DAGScheduler) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- setDecayFactor(double) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Set the decay factor directly (for forgetful algorithms).
- setDefaultClassLoader(ClassLoader) - Method in class org.apache.spark.serializer.Serializer
-
Sets a class loader for the serializer to use in deserialization.
- setDelaySeconds(SparkConf, Enumeration.Value, int) - Static method in class org.apache.spark.util.MetadataCleaner
-
- setDelaySeconds(SparkConf, int, boolean) - Static method in class org.apache.spark.util.MetadataCleaner
-
Set the default delay time (in seconds).
- setDestTableId(int) - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- setDictionary(Dictionary) - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveStringConverter
-
- setDocConcentration(double) - Method in class org.apache.spark.mllib.clustering.LDA
-
Concentration parameter (commonly named "alpha") for the prior placed on documents'
distributions over topics ("theta").
- setEpsilon(double) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set the distance threshold within which we consider centers to have converged.
- setEstimator(Estimator<?>) - Method in class org.apache.spark.ml.tuning.CrossValidator
-
- setEstimatorParamMaps(ParamMap[]) - Method in class org.apache.spark.ml.tuning.CrossValidator
-
- setEvaluator(Evaluator) - Method in class org.apache.spark.ml.tuning.CrossValidator
-
- setExecutorEnv(String, String) - Method in class org.apache.spark.SparkConf
-
Set an environment variable to be used when launching executors for this application.
- setExecutorEnv(Seq<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf
-
Set multiple environment variables to be used when launching executors.
- setExecutorEnv(Tuple2<String, String>[]) - Method in class org.apache.spark.SparkConf
-
Set multiple environment variables to be used when launching executors.
- setFailure(Exception) - Method in class org.apache.spark.partial.PartialResult
-
- setFeatureScaling(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
Set if the algorithm should use feature scaling to improve the convergence during optimization.
- setFeaturesCol(String) - Method in class org.apache.spark.ml.impl.estimator.PredictionModel
-
- setFeaturesCol(String) - Method in class org.apache.spark.ml.impl.estimator.Predictor
-
- setField(MutableRow, int, byte[]) - Static method in class org.apache.spark.sql.columnar.BINARY
-
- setField(MutableRow, int, boolean) - Static method in class org.apache.spark.sql.columnar.BOOLEAN
-
- setField(MutableRow, int, byte) - Static method in class org.apache.spark.sql.columnar.BYTE
-
- setField(MutableRow, int, JvmType) - Method in class org.apache.spark.sql.columnar.ColumnType
-
Sets row(ordinal) to field.
- setField(MutableRow, int, int) - Static method in class org.apache.spark.sql.columnar.DATE
-
- setField(MutableRow, int, double) - Static method in class org.apache.spark.sql.columnar.DOUBLE
-
- setField(MutableRow, int, float) - Static method in class org.apache.spark.sql.columnar.FLOAT
-
- setField(MutableRow, int, byte[]) - Static method in class org.apache.spark.sql.columnar.GENERIC
-
- setField(MutableRow, int, int) - Static method in class org.apache.spark.sql.columnar.INT
-
- setField(MutableRow, int, long) - Static method in class org.apache.spark.sql.columnar.LONG
-
- setField(MutableRow, int, short) - Static method in class org.apache.spark.sql.columnar.SHORT
-
- setField(MutableRow, int, String) - Static method in class org.apache.spark.sql.columnar.STRING
-
- setField(MutableRow, int, Timestamp) - Static method in class org.apache.spark.sql.columnar.TIMESTAMP
-
- setFinalRDDStorageLevel(StorageLevel) - Method in class org.apache.spark.mllib.recommendation.ALS
-
:: DeveloperApi ::
Sets storage level for final RDDs (user/product used in MatrixFactorizationModel).
- setFinalValue(R) - Method in class org.apache.spark.partial.PartialResult
-
- setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
Set the gradient function (of the loss function of one single data example)
to be used for SGD.
- setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the gradient function (of the loss function of one single data example)
to be used for L-BFGS.
- setGraph(DStreamGraph) - Method in class org.apache.spark.streaming.dstream.DStream
-
- setHalfLife(double, String) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Set the half life and time unit ("batches" or "points") for forgetful algorithms.
- setId(int) - Method in class org.apache.spark.streaming.scheduler.Job
-
- setIfMissing(String, String) - Method in class org.apache.spark.SparkConf
-
Set a parameter if it isn't already configured
- setImplicitPrefs(boolean) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setImplicitPrefs(boolean) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Sets whether to use implicit preference.
- setImpurity(Impurity) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setInitialCenters(Vector[], double[]) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Specify initial centers directly.
- setInitializationMode(String) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set the initialization algorithm.
- setInitializationMode(String) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
-
Set the initialization mode.
- setInitializationSteps(int) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set the number of steps for the k-means|| initialization mode.
- setInitialModel(GaussianMixtureModel) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Set the initial GMM starting point, bypassing the random initialization.
- setInitialWeights(Vector) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
-
Set the initial weights.
- setInitialWeights(Vector) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
Set the initial weights.
- setInputCol(String) - Method in class org.apache.spark.ml.feature.StandardScaler
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.StandardScalerModel
-
- setInputCol(String) - Method in class org.apache.spark.ml.UnaryTransformer
-
- setIntercept(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
Set if the algorithm should add an intercept.
- setIntermediateRDDStorageLevel(StorageLevel) - Method in class org.apache.spark.mllib.recommendation.ALS
-
:: DeveloperApi ::
Sets storage level for intermediate RDDs (user/product in/out links).
- setIsotonic(boolean) - Method in class org.apache.spark.mllib.regression.IsotonicRegression
-
Sets the isotonic parameter.
- setItemCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setIterations(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Set the number of iterations to run.
- setJars(Seq<String>) - Method in class org.apache.spark.SparkConf
-
Set JAR files to distribute to the cluster.
- setJars(String[]) - Method in class org.apache.spark.SparkConf
-
Set JAR files to distribute to the cluster.
- setJobDescription(String) - Method in class org.apache.spark.SparkContext
-
Set a human readable description of the current job.
- setJobGroup(String, String, boolean) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Assigns a group ID to all the jobs started by this thread until the group ID is set to a
different value or cleared.
- setJobGroup(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Assigns a group ID to all the jobs started by this thread until the group ID is set to a
different value or cleared.
- setJobGroup(String, String, boolean) - Method in class org.apache.spark.SparkContext
-
Assigns a group ID to all the jobs started by this thread until the group ID is set to a
different value or cleared.
- setK(int) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Set the number of Gaussians in the mixture model.
- setK(int) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set the number of clusters to create (k).
- setK(int) - Method in class org.apache.spark.mllib.clustering.LDA
-
Number of topics to infer.
- setK(int) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
-
Set the number of clusters.
- setK(int) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Set the number of clusters.
- setKeyOrdering(Ordering<K>) - Method in class org.apache.spark.rdd.ShuffledRDD
-
Set key ordering for RDD's shuffle.
- setLabelCol(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-
- setLabelCol(String) - Method in class org.apache.spark.ml.impl.estimator.Predictor
-
- setLambda(double) - Method in class org.apache.spark.mllib.classification.NaiveBayes
-
Set the smoothing parameter.
- setLambda(double) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Set the regularization parameter, lambda.
- setLearningRate(double) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Sets initial learning rate (default: 0.025).
- setLearningRate(double) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- setLocalProperties(Properties) - Method in class org.apache.spark.SparkContext
-
- setLocalProperty(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Set a local property that affects jobs submitted from this thread, such as the
Spark fair scheduler pool.
- setLocalProperty(String, String) - Method in class org.apache.spark.SparkContext
-
Set a local property that affects jobs submitted from this thread, such as the
Spark fair scheduler pool.
- setLocation(Table, CreateTableDesc) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- setLoss(Loss) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- setMapSideCombine(boolean) - Method in class org.apache.spark.rdd.ShuffledRDD
-
Set mapSideCombine flag for RDD's shuffle.
- setMaster(String) - Method in class org.apache.spark.SparkConf
-
The master URL to connect to, such as "local" to run locally with one thread, "local[4]" to
run locally with 4 cores, or "spark://master:7077" to run on a Spark standalone cluster.
- setMaxBins(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setMaxDepth(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setMaxIter(int) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
- setMaxIter(int) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setMaxIter(int) - Method in class org.apache.spark.ml.regression.LinearRegression
-
- setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Set the maximum number of iterations to run.
- setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set maximum number of iterations to run.
- setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.LDA
-
Maximum number of iterations for learning.
- setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
-
Set maximum number of iterations of the power iteration loop
- setMaxMemoryInMB(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setMaxNumIterations(int) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
- setMetricName(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-
- setMinCount(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Sets minCount, the minimum number of times a token must appear to be included in the word2vec
model's vocabulary (default: 5).
- setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
-
Set the fraction of each batch to use for updates.
- setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
:: Experimental ::
Set fraction of data to be used for each SGD iteration.
- setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
Set the fraction of each batch to use for updates.
- setMinInfoGain(double) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setMinInstancesPerNode(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setMinPartitions(JobContext, int) - Method in class org.apache.spark.input.StreamFileInputFormat
-
Allows minPartitions to be set by the end user, in order to keep compatibility with the old
Hadoop API; the value is applied through setMaxSplitSize.
- setMinPartitions(JobContext, int) - Method in class org.apache.spark.input.WholeTextFileInputFormat
-
Allows minPartitions to be set by the end user, in order to keep compatibility with the old
Hadoop API; the value is applied through setMaxSplitSize.
- setMinSupport(double) - Method in class org.apache.spark.mllib.fpm.FPGrowth
-
Sets the minimal support level (default: 0.3).
- setModifyAcls(Set<String>, String) - Method in class org.apache.spark.SecurityManager
-
Admin acls should be set before the view or modify acls.
- setName(String) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Assign a name to this RDD
- setName(String) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Assign a name to this RDD
- setName(String) - Method in class org.apache.spark.api.java.JavaRDD
-
Assign a name to this RDD
- setName(String) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- setName(String) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- setName(String) - Method in class org.apache.spark.rdd.RDD
-
Assign a name to this RDD
- setNonnegative(boolean) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setNonnegative(boolean) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Set whether the least-squares problems solved at each iteration should have
nonnegativity constraints.
- setNumBlocks(int) - Method in class org.apache.spark.ml.recommendation.ALS
-
Sets both numUserBlocks and numItemBlocks to the specific value.
- setNumClasses(int) - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
-
:: Experimental ::
Set the number of possible outcomes for a k-class classification problem in
Multinomial Logistic Regression.
- setNumClasses(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setNumCorrections(int) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the number of corrections used in the LBFGS update.
- setNumFeatures(int) - Method in class org.apache.spark.ml.feature.HashingTF
-
- setNumFolds(int) - Method in class org.apache.spark.ml.tuning.CrossValidator
-
- setNumItemBlocks(int) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setNumIterations(int) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
-
Set the number of iterations of gradient descent to run per update.
- setNumIterations(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Sets the number of iterations (default: 1), which should be smaller than or equal to the number of
partitions.
- setNumIterations(int) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
Set the number of iterations for SGD.
- setNumIterations(int) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the maximal number of iterations for L-BFGS.
- setNumIterations(int) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
Set the number of iterations of gradient descent to run per update.
- setNumIterations(int) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- setNumPartitions(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Sets number of partitions (default: 1).
- setNumPartitions(int) - Method in class org.apache.spark.mllib.fpm.FPGrowth
-
Sets the number of partitions used by parallel FP-growth (default: same as input data).
- setNumSplits(int, int) - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
-
Set number of splits for a continuous feature.
- setNumUserBlocks(int) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.StandardScaler
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.StandardScalerModel
-
- setOutputCol(String) - Method in class org.apache.spark.ml.UnaryTransformer
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.impl.estimator.PredictionModel
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.impl.estimator.Predictor
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.recommendation.ALSModel
-
- setProbabilityCol(String) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel
-
- setProbabilityCol(String) - Method in class org.apache.spark.ml.classification.ProbabilisticClassifier
-
- setProductBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Set the number of product blocks to parallelize the computation.
- setQuantileCalculationStrategy(Enumeration.Value) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setRandomCenters(int, double, long) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Initialize random centers, requiring only the number of dimensions.
- setRank(int) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setRank(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Set the rank of the feature matrices computed (number of features).
- setRatingCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setRawPredictionCol(String) - Method in class org.apache.spark.ml.classification.ClassificationModel
-
- setRawPredictionCol(String) - Method in class org.apache.spark.ml.classification.Classifier
-
- setReceiverId(int) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Set the ID of the DStream that this receiver is associated with.
- setRegParam(double) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
- setRegParam(double) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setRegParam(double) - Method in class org.apache.spark.ml.regression.LinearRegression
-
- setRegParam(double) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
-
Set the regularization parameter.
- setRegParam(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
Set the regularization parameter.
- setRegParam(double) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the regularization parameter.
- setRest(long, int, VD, ED) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- setRuns(int) - Method in class org.apache.spark.mllib.clustering.KMeans
-
:: Experimental ::
Set the number of runs of the algorithm to execute in parallel.
- setSchema(Seq<Attribute>, Configuration) - Static method in class org.apache.spark.sql.parquet.RowWriteSupport
-
- setScoreCol(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-
- setSeed(long) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Set the random seed
- setSeed(long) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set the random seed for cluster initialization.
- setSeed(long) - Method in class org.apache.spark.mllib.clustering.LDA
-
Random seed
- setSeed(long) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Sets random seed (default: a random long integer).
- setSeed(long) - Method in class org.apache.spark.mllib.random.ExponentialGenerator
-
- setSeed(long) - Method in class org.apache.spark.mllib.random.GammaGenerator
-
- setSeed(long) - Method in class org.apache.spark.mllib.random.LogNormalGenerator
-
- setSeed(long) - Method in class org.apache.spark.mllib.random.PoissonGenerator
-
- setSeed(long) - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
-
- setSeed(long) - Method in class org.apache.spark.mllib.random.UniformGenerator
-
- setSeed(long) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Sets a random seed to have deterministic results.
- setSeed(long) - Method in class org.apache.spark.util.random.BernoulliCellSampler
-
- setSeed(long) - Method in class org.apache.spark.util.random.BernoulliSampler
-
- setSeed(long) - Method in class org.apache.spark.util.random.PoissonSampler
-
- setSeed(long) - Method in interface org.apache.spark.util.random.Pseudorandom
-
Set random seed.
- setSeed(long) - Method in class org.apache.spark.util.random.XORShiftRandom
-
- setSerializer(Serializer) - Method in class org.apache.spark.rdd.CoGroupedRDD
-
Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer)
- setSerializer(Serializer) - Method in class org.apache.spark.rdd.ShuffledRDD
-
Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer)
- setSerializer(Serializer) - Method in class org.apache.spark.rdd.SubtractedRDD
-
Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer)
- setSparkHome(String) - Method in class org.apache.spark.SparkConf
-
Set the location where Spark is installed on worker nodes.
- setSrcOnly(long, int, VD) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- setStages(PipelineStage[]) - Method in class org.apache.spark.ml.Pipeline
-
- setStepSize(double) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
-
Set the step size for gradient descent.
- setStepSize(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
Set the initial step size of SGD for the first step.
- setStepSize(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
Set the step size for gradient descent.
- setStreamingLogLevels() - Static method in class org.apache.spark.examples.streaming.StreamingExamples
-
Set reasonable logging levels for streaming if the user has not configured log4j.
- setSubsamplingRate(double) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setTableInfo(TableDesc) - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- setTaskContext(TaskContext) - Static method in class org.apache.spark.TaskContextHelper
-
- setTblNullFormat(CreateTableDesc, Table) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- setThreshold(double) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
- setThreshold(double) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-
- setThreshold(double) - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
:: Experimental ::
Sets the threshold that separates positive predictions from negative predictions
in Binary Logistic Regression.
- setThreshold(double) - Method in class org.apache.spark.mllib.classification.SVMModel
-
:: Experimental ::
Sets the threshold that separates positive predictions from negative predictions.
- setTime(long) - Method in class org.apache.spark.util.ManualClock
-
- setTopicConcentration(double) - Method in class org.apache.spark.mllib.clustering.LDA
-
Concentration parameter (commonly named "beta" or "eta") for the prior placed on topics'
distributions over terms.
- setTreeStrategy(Strategy) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- setup(int, int, int) - Method in class org.apache.spark.SparkHadoopWriter
-
- setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
Set the updater function to actually perform a gradient step in a given direction.
- setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the updater function to actually perform a gradient step in a given direction.
- setupGroups(int) - Method in class org.apache.spark.rdd.PartitionCoalescer
-
Initializes targetLen partition groups and assigns a preferredLocation.
This uses the coupon collector problem to estimate how many preferredLocations it must rotate
through until it has seen most of the preferred locations (2 * n log(n)).
- setupSecureURLConnection(URLConnection, SecurityManager) - Static method in class org.apache.spark.util.Utils
-
If the given URL connection is HttpsURLConnection, it sets the SSL socket factory and
the host verifier from the given security manager.
- setUseNodeIdCache(boolean) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setUserBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-
Set the number of user blocks to parallelize the computation.
- setUserCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setValidateData(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
Set if the algorithm should validate data before training.
- setValue(R) - Method in class org.apache.spark.Accumulable
-
Set the accumulator's value; only allowed on master
- setVectorSize(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Sets vector size (default: 100).
- setViewAcls(Set<String>, String) - Method in class org.apache.spark.SecurityManager
-
Admin acls should be set before the view or modify acls.
- setViewAcls(String, String) - Method in class org.apache.spark.SecurityManager
-
- setWithMean(boolean) - Method in class org.apache.spark.mllib.feature.StandardScalerModel
-
- setWithStd(boolean) - Method in class org.apache.spark.mllib.feature.StandardScalerModel
-
- shape() - Method in class org.apache.spark.mllib.random.GammaGenerator
-
- shardId() - Method in class org.apache.spark.streaming.kinesis.KinesisRecordProcessor
-
- ShimFileSinkDesc - Class in org.apache.spark.sql.hive
-
- ShimFileSinkDesc(String, TableDesc, boolean) - Constructor for class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- shippablePartitionToOps(ShippableVertexPartition<VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.impl.ShippableVertexPartition
-
Implicit conversion to allow invoking VertexPartitionBase operations directly on a ShippableVertexPartition.
- ShippableVertexPartition<VD> - Class in org.apache.spark.graphx.impl
-
A map from vertex id to vertex attribute that additionally stores edge partition join sites for
each vertex attribute, enabling joining with an EdgeRDD.
- ShippableVertexPartition(OpenHashSet<Object>, Object, BitSet, RoutingTablePartition, ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.ShippableVertexPartition
-
- ShippableVertexPartition.ShippableVertexPartitionOpsConstructor$ - Class in org.apache.spark.graphx.impl
-
Implicit evidence that ShippableVertexPartition is a member of the VertexPartitionBaseOpsConstructor typeclass.
- ShippableVertexPartition.ShippableVertexPartitionOpsConstructor$() - Constructor for class org.apache.spark.graphx.impl.ShippableVertexPartition.ShippableVertexPartitionOpsConstructor$
-
- ShippableVertexPartitionOps<VD> - Class in org.apache.spark.graphx.impl
-
- ShippableVertexPartitionOps(ShippableVertexPartition<VD>, ClassTag<VD>) - Constructor for class org.apache.spark.graphx.impl.ShippableVertexPartitionOps
-
- shipVertexAttributes(boolean, boolean) - Method in class org.apache.spark.graphx.impl.ShippableVertexPartition
-
Generate a VertexAttributeBlock for each edge partition keyed on the edge partition ID.
- shipVertexAttributes(boolean, boolean) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- shipVertexAttributes(boolean, boolean) - Method in class org.apache.spark.graphx.VertexRDD
-
Generates an RDD of vertex attributes suitable for shipping to the edge partitions.
- shipVertexIds() - Method in class org.apache.spark.graphx.impl.ShippableVertexPartition
-
Generate a VertexId array for each edge partition keyed on the edge partition ID.
- shipVertexIds() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- shipVertexIds() - Method in class org.apache.spark.graphx.VertexRDD
-
Generates an RDD of vertex IDs suitable for shipping to the edge partitions.
- SHORT - Class in org.apache.spark.sql.columnar
-
- SHORT() - Constructor for class org.apache.spark.sql.columnar.SHORT
-
- SHORT_FORM() - Static method in class org.apache.spark.util.CallSite
-
- ShortColumnAccessor - Class in org.apache.spark.sql.columnar
-
- ShortColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.ShortColumnAccessor
-
- ShortColumnBuilder - Class in org.apache.spark.sql.columnar
-
- ShortColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.ShortColumnBuilder
-
- ShortColumnStats - Class in org.apache.spark.sql.columnar
-
- ShortColumnStats() - Constructor for class org.apache.spark.sql.columnar.ShortColumnStats
-
- ShortestPaths - Class in org.apache.spark.graphx.lib
-
Computes shortest paths to the given set of landmark vertices, returning a graph where each
vertex attribute is a map containing the shortest-path distance to each reachable landmark.
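As a rough single-machine illustration of the result shape described above (each vertex mapped to its distance to every reachable landmark), the sketch below runs a breadth-first search backwards from each landmark. It assumes unweighted, directed edges and hop-count distances; the class name, method name, and edge encoding are hypothetical and are not the GraphX API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the GraphX ShortestPaths implementation.
final class LandmarkPathsSketch {
    // edges[i] = {src, dst}. Returns vertex -> (landmark -> hop count along edge direction).
    static Map<Integer, Map<Integer, Integer>> shortestPaths(int[][] edges, int[] landmarks) {
        // Reverse adjacency: for each vertex, the vertices that have an edge into it.
        Map<Integer, List<Integer>> incoming = new HashMap<>();
        Map<Integer, Map<Integer, Integer>> result = new HashMap<>();
        for (int[] e : edges) {
            incoming.computeIfAbsent(e[1], k -> new ArrayList<>()).add(e[0]);
            result.putIfAbsent(e[0], new HashMap<>());
            result.putIfAbsent(e[1], new HashMap<>());
        }
        for (int landmark : landmarks) {
            result.putIfAbsent(landmark, new HashMap<>());
            result.get(landmark).put(landmark, 0);
            Deque<Integer> queue = new ArrayDeque<>();
            queue.add(landmark);
            // Walk edges backwards so every visited vertex can reach the landmark.
            while (!queue.isEmpty()) {
                int v = queue.remove();
                int next = result.get(v).get(landmark) + 1;
                for (int u : incoming.getOrDefault(v, Collections.emptyList())) {
                    if (!result.get(u).containsKey(landmark)) {
                        result.get(u).put(landmark, next);
                        queue.add(u);
                    }
                }
            }
        }
        return result;
    }
}
```

Vertices that cannot reach a landmark simply have no entry for it, mirroring the "each reachable landmark" wording of the description.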
- ShortestPaths() - Constructor for class org.apache.spark.graphx.lib.ShortestPaths
-
- shortForm() - Method in class org.apache.spark.util.CallSite
-
- shortParquetCompressionCodecNames() - Static method in class org.apache.spark.sql.parquet.ParquetRelation
-
- shouldCheckpoint() - Method in class org.apache.spark.streaming.kinesis.KinesisCheckpointState
-
Check if it's time to checkpoint based on the current time and the derived time
for the next checkpoint
- shouldRollover(long) - Method in interface org.apache.spark.util.logging.RollingPolicy
-
Whether rollover should be initiated at this moment
- shouldRollover(long) - Method in class org.apache.spark.util.logging.SizeBasedRollingPolicy
-
Should rollover if the next set of bytes is going to exceed the size limit
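The size-based check described above can be sketched in a few lines. This is a minimal standalone illustration, not the Spark class: the class name, field names, and the bytesWritten/rolledOver bookkeeping methods are assumptions made for the example.

```java
// Hypothetical illustration only; not org.apache.spark.util.logging.SizeBasedRollingPolicy.
final class SizeRolloverSketch {
    private final long limitBytes;          // assumed size limit per file
    private long bytesSinceRollover = 0L;   // bytes written since the last rollover

    SizeRolloverSketch(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    // True if writing the next chunk would push the file past the size limit.
    boolean shouldRollover(long bytesToBeWritten) {
        return bytesSinceRollover + bytesToBeWritten > limitBytes;
    }

    // Record that a chunk was written to the current file.
    void bytesWritten(long bytes) {
        bytesSinceRollover += bytes;
    }

    // Reset the counter after the file has been rolled over.
    void rolledOver() {
        bytesSinceRollover = 0L;
    }
}
```

A caller would typically ask shouldRollover before each write, roll the file if it returns true, and then report the written size through bytesWritten.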
- shouldRollover(long) - Method in class org.apache.spark.util.logging.TimeBasedRollingPolicy
-
Should rollover if current time has exceeded next rollover time
- show(int) - Method in class org.apache.spark.sql.DataFrame
-
- show() - Method in class org.apache.spark.sql.DataFrame
-
Displays the top 20 rows of DataFrame in a tabular form.
- showBytesDistribution(String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showBytesDistribution(String, Option<Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showBytesDistribution(String, Distribution) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showDistribution(String, Distribution, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showDistribution(String, Option<Distribution>, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showDistribution(String, Option<Distribution>, String) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showDistribution(String, String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showMillisDistribution(String, Option<Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showMillisDistribution(String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showMillisDistribution(String, Function1<BatchInfo, Option<Object>>) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
-
- showQuantiles(PrintStream) - Method in class org.apache.spark.util.Distribution
-
- showString(int) - Method in class org.apache.spark.sql.DataFrame
-
Internal API for Python
- SHUFFLE() - Static method in class org.apache.spark.storage.BlockId
-
- SHUFFLE_BLOCK_MANAGER() - Static method in class org.apache.spark.util.MetadataCleanerType
-
- SHUFFLE_DATA() - Static method in class org.apache.spark.storage.BlockId
-
- SHUFFLE_INDEX() - Static method in class org.apache.spark.storage.BlockId
-
- SHUFFLE_PARTITIONS() - Static method in class org.apache.spark.sql.SQLConf
-
- SHUFFLE_READ() - Static method in class org.apache.spark.ui.ToolTips
-
- SHUFFLE_READ_BLOCKED_TIME() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
-
- SHUFFLE_READ_BLOCKED_TIME() - Static method in class org.apache.spark.ui.ToolTips
-
- SHUFFLE_READ_REMOTE_SIZE() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
-
- SHUFFLE_READ_REMOTE_SIZE() - Static method in class org.apache.spark.ui.ToolTips
-
- SHUFFLE_WRITE() - Static method in class org.apache.spark.ui.ToolTips
-
- ShuffleBlockFetcherIterator - Class in org.apache.spark.storage
-
An iterator that fetches multiple blocks.
- ShuffleBlockFetcherIterator(TaskContext, ShuffleClient, BlockManager, Seq<Tuple2<BlockManagerId, Seq<Tuple2<BlockId, Object>>>>, Serializer, long) - Constructor for class org.apache.spark.storage.ShuffleBlockFetcherIterator
-
- ShuffleBlockFetcherIterator.FailureFetchResult - Class in org.apache.spark.storage
-
Result of an unsuccessful fetch from a remote block.
- ShuffleBlockFetcherIterator.FailureFetchResult(BlockId, Throwable) - Constructor for class org.apache.spark.storage.ShuffleBlockFetcherIterator.FailureFetchResult
-
- ShuffleBlockFetcherIterator.FailureFetchResult$ - Class in org.apache.spark.storage
-
- ShuffleBlockFetcherIterator.FailureFetchResult$() - Constructor for class org.apache.spark.storage.ShuffleBlockFetcherIterator.FailureFetchResult$
-
- ShuffleBlockFetcherIterator.FetchRequest - Class in org.apache.spark.storage
-
A request to fetch blocks from a remote BlockManager.
- ShuffleBlockFetcherIterator.FetchRequest(BlockManagerId, Seq<Tuple2<BlockId, Object>>) - Constructor for class org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchRequest
-
- ShuffleBlockFetcherIterator.FetchRequest$ - Class in org.apache.spark.storage
-
- ShuffleBlockFetcherIterator.FetchRequest$() - Constructor for class org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchRequest$
-
- ShuffleBlockFetcherIterator.FetchResult - Interface in org.apache.spark.storage
-
Result of a fetch from a remote block.
- ShuffleBlockFetcherIterator.SuccessFetchResult - Class in org.apache.spark.storage
-
Result of a successful fetch from a remote block.
- ShuffleBlockFetcherIterator.SuccessFetchResult(BlockId, long, ManagedBuffer) - Constructor for class org.apache.spark.storage.ShuffleBlockFetcherIterator.SuccessFetchResult
-
- ShuffleBlockFetcherIterator.SuccessFetchResult$ - Class in org.apache.spark.storage
-
- ShuffleBlockFetcherIterator.SuccessFetchResult$() - Constructor for class org.apache.spark.storage.ShuffleBlockFetcherIterator.SuccessFetchResult$
-
- ShuffleBlockId - Class in org.apache.spark.storage
-
- ShuffleBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleBlockId
-
- shuffleCleaned(int) - Method in interface org.apache.spark.CleanerListener
-
- shuffleClient() - Method in class org.apache.spark.storage.BlockManager
-
- ShuffleCoGroupSplitDep - Class in org.apache.spark.rdd
-
- ShuffleCoGroupSplitDep(ShuffleHandle) - Constructor for class org.apache.spark.rdd.ShuffleCoGroupSplitDep
-
- ShuffleDataBlockId - Class in org.apache.spark.storage
-
- ShuffleDataBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleDataBlockId
-
- ShuffledDStream<K,V,C> - Class in org.apache.spark.streaming.dstream
-
- ShuffledDStream(DStream<Tuple2<K, V>>, Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, ClassTag<K>, ClassTag<V>, ClassTag<C>) - Constructor for class org.apache.spark.streaming.dstream.ShuffledDStream
-
- shuffleDep() - Method in class org.apache.spark.scheduler.Stage
-
- ShuffleDependency<K,V,C> - Class in org.apache.spark
-
:: DeveloperApi ::
Represents a dependency on the output of a shuffle stage.
- ShuffleDependency(RDD<? extends Product2<K, V>>, Partitioner, Option<Serializer>, Option<Ordering<K>>, Option<Aggregator<K, V, C>>, boolean) - Constructor for class org.apache.spark.ShuffleDependency
-
- ShuffledRDD<K,V,C> - Class in org.apache.spark.rdd
-
:: DeveloperApi ::
The resulting RDD from a shuffle (e.g.
- ShuffledRDD(RDD<? extends Product2<K, V>>, Partitioner) - Constructor for class org.apache.spark.rdd.ShuffledRDD
-
- ShuffledRDDPartition - Class in org.apache.spark.rdd
-
- ShuffledRDDPartition(int) - Constructor for class org.apache.spark.rdd.ShuffledRDDPartition
-
- shuffleHandle() - Method in class org.apache.spark.ShuffleDependency
-
- shuffleId() - Method in class org.apache.spark.CleanShuffle
-
- shuffleId() - Method in class org.apache.spark.FetchFailed
-
- shuffleId() - Method in class org.apache.spark.GetMapOutputStatuses
-
- shuffleId() - Method in class org.apache.spark.ShuffleDependency
-
- shuffleId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveShuffle
-
- shuffleId() - Method in class org.apache.spark.storage.ShuffleBlockId
-
- shuffleId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
-
- shuffleId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
-
- ShuffleIndexBlockId - Class in org.apache.spark.storage
-
- ShuffleIndexBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleIndexBlockId
-
- shuffleManager() - Method in class org.apache.spark.SparkEnv
-
- ShuffleMapTask - Class in org.apache.spark.scheduler
-
A ShuffleMapTask divides the elements of an RDD into multiple buckets (based on a partitioner
specified in the ShuffleDependency).
- ShuffleMapTask(int, Broadcast<byte[]>, Partition, Seq<TaskLocation>) - Constructor for class org.apache.spark.scheduler.ShuffleMapTask
-
- ShuffleMapTask(int) - Constructor for class org.apache.spark.scheduler.ShuffleMapTask
-
A constructor used only in test suites.
- shuffleMemoryManager() - Method in class org.apache.spark.SparkEnv
-
- shuffleRead() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
-
- shuffleReadMetricsFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- shuffleReadMetricsToJson(ShuffleReadMetrics) - Static method in class org.apache.spark.util.JsonProtocol
-
- shuffleReadRecords() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
-
- shuffleReadRecords() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
-
- shuffleReadTotalBytes() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
-
- shuffleServerId() - Method in class org.apache.spark.storage.BlockManager
-
- shuffleToMapStage() - Method in class org.apache.spark.scheduler.DAGScheduler
-
- shuffleWrite() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
-
- shuffleWriteBytes() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
-
- shuffleWriteMetricsFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- shuffleWriteMetricsToJson(ShuffleWriteMetrics) - Static method in class org.apache.spark.util.JsonProtocol
-
- shuffleWriteRecords() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
-
- shuffleWriteRecords() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
-
- shutdown(IRecordProcessorCheckpointer, ShutdownReason) - Method in class org.apache.spark.streaming.kinesis.KinesisRecordProcessor
-
Kinesis Client Library is shutting down this Worker for 1 of 2 reasons:
1) the stream is resharding by splitting or merging adjacent shards
(ShutdownReason.TERMINATE)
2) the failed or latent Worker has stopped sending heartbeats for whatever reason
(ShutdownReason.ZOMBIE)
- shutdownCallback() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- sigma() - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
-
- sigmas() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
-
- SignalLogger - Class in org.apache.spark.util
-
Used to log signals received.
- SignalLogger() - Constructor for class org.apache.spark.util.SignalLogger
-
- SignalLoggerHandler - Class in org.apache.spark.util
-
- SignalLoggerHandler(String, Logger) - Constructor for class org.apache.spark.util.SignalLoggerHandler
-
- SimpleFutureAction<T> - Class in org.apache.spark
-
A FutureAction holding the result of an action that triggers a single job.
- SimpleFutureAction(JobWaiter<?>, Function0<T>) - Constructor for class org.apache.spark.SimpleFutureAction
-
- simpleString() - Method in class org.apache.spark.sql.sources.LogicalRelation
-
- SimpleUpdater - Class in org.apache.spark.mllib.optimization
-
:: DeveloperApi ::
A simple updater for gradient descent *without* any regularization.
- SimpleUpdater() - Constructor for class org.apache.spark.mllib.optimization.SimpleUpdater
-
- simpleWritableConverter(Function1<W, T>, ClassTag<W>) - Static method in class org.apache.spark.WritableConverter
-
- simpleWritableFactory(Function1<T, W>, ClassTag<T>, ClassTag<W>) - Static method in class org.apache.spark.WritableFactory
-
- SimrSchedulerBackend - Class in org.apache.spark.scheduler.cluster
-
- SimrSchedulerBackend(TaskSchedulerImpl, SparkContext, String) - Constructor for class org.apache.spark.scheduler.cluster.SimrSchedulerBackend
-
- SingleItemData<T> - Class in org.apache.spark.streaming.receiver
-
- SingleItemData(T) - Constructor for class org.apache.spark.streaming.receiver.SingleItemData
-
- SingularValueDecomposition<UType,VType> - Class in org.apache.spark.mllib.linalg
-
:: Experimental ::
Represents singular value decomposition (SVD) factors.
- SingularValueDecomposition(UType, Vector, VType) - Constructor for class org.apache.spark.mllib.linalg.SingularValueDecomposition
-
- Sink - Interface in org.apache.spark.metrics.sink
-
- SINK_REGEX() - Static method in class org.apache.spark.metrics.MetricsSystem
-
- size() - Method in class org.apache.spark.api.java.JavaUtils.SerializableMapWrapper
-
- size() - Method in class org.apache.spark.graphx.impl.EdgePartition
-
The number of edges in this partition
- size() - Method in class org.apache.spark.graphx.impl.VertexPartitionBase
-
- size() - Method in class org.apache.spark.ml.param.ParamMap
-
Number of param pairs in this set.
- size() - Method in class org.apache.spark.ml.recommendation.ALS.InBlock
-
Size of the block.
- size() - Method in class org.apache.spark.ml.recommendation.ALS.RatingBlock
-
Size of the block.
- size() - Method in class org.apache.spark.ml.recommendation.ALS.RatingBlockBuilder
-
- size() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- size() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- size() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Size of the vector.
- size() - Method in class org.apache.spark.mllib.rdd.RandomRDDPartition
-
- size() - Method in class org.apache.spark.rdd.PartitionGroup
-
- size() - Method in class org.apache.spark.scheduler.IndirectTaskResult
-
- size() - Method in class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
-
- size() - Method in class org.apache.spark.sql.parquet.CatalystArrayConverter
-
- size() - Method in class org.apache.spark.sql.parquet.CatalystGroupConverter
-
- size() - Method in class org.apache.spark.sql.parquet.CatalystNativeArrayConverter
-
- size() - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
-
- size() - Method in class org.apache.spark.storage.BlockInfo
-
- size() - Method in class org.apache.spark.storage.MemoryEntry
-
- size() - Method in class org.apache.spark.storage.PutResult
-
- size() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.FetchRequest
-
- size() - Method in class org.apache.spark.storage.ShuffleBlockFetcherIterator.SuccessFetchResult
-
- size() - Method in class org.apache.spark.util.BoundedPriorityQueue
-
- size() - Method in class org.apache.spark.util.TimeStampedHashMap
-
- size() - Method in class org.apache.spark.util.TimeStampedHashSet
-
- size() - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- SIZE_DEFAULT() - Static method in class org.apache.spark.util.logging.RollingFileAppender
-
- SIZE_PROPERTY() - Static method in class org.apache.spark.util.logging.RollingFileAppender
-
- SizeBasedRollingPolicy - Class in org.apache.spark.util.logging
-
Defines a RollingPolicy by which files will be rolled over after reaching a particular size.
- SizeBasedRollingPolicy(long, boolean) - Constructor for class org.apache.spark.util.logging.SizeBasedRollingPolicy
-
- SizeEstimator - Class in org.apache.spark.util
-
Estimates the sizes of Java objects (number of bytes of memory they occupy), for use in
memory-aware caches.
- SizeEstimator() - Constructor for class org.apache.spark.util.SizeEstimator
-
- sizeInBytes() - Method in class org.apache.spark.sql.columnar.ColumnStatisticsSchema
-
- sizeInBytes() - Method in interface org.apache.spark.sql.columnar.ColumnStats
-
- sizeInBytes() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
-
- sizeInBytes() - Method in class org.apache.spark.sql.sources.BaseRelation
-
Returns an estimated size of this relation in bytes.
- sketch(RDD<K>, int, ClassTag<K>) - Static method in class org.apache.spark.RangePartitioner
-
Sketches the input RDD via reservoir sampling on each partition.
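For readers unfamiliar with reservoir sampling, here is a minimal single-partition sketch of the classic technique (often called Algorithm R). It is an illustration under stated assumptions, not the RangePartitioner code: the class and method names are hypothetical, and the input is a plain Java iterator rather than an RDD partition.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not the Spark implementation.
final class ReservoirSketch {
    // Returns a uniform random sample of up to k items from a stream of unknown length.
    static <T> List<T> sample(Iterator<T> input, int k, Random rng) {
        List<T> reservoir = new ArrayList<>(k);
        long seen = 0;
        while (input.hasNext()) {
            T item = input.next();
            seen++;
            if (reservoir.size() < k) {
                // Fill the reservoir with the first k items.
                reservoir.add(item);
            } else {
                // Keep the new item with probability k / seen, evicting a random slot.
                long j = (long) (rng.nextDouble() * seen);
                if (j < k) {
                    reservoir.set((int) j, item);
                }
            }
        }
        return reservoir;
    }
}
```

Because the reservoir never holds more than k items, the input can be sampled in one pass with bounded memory, which is what makes the technique attractive for per-partition sketches.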
- skip(long) - Method in class org.apache.spark.util.ByteBufferInputStream
-
- skippedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- slack() - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- slaveActor() - Method in class org.apache.spark.storage.BlockManagerInfo
-
- slaveIdsWithExecutors() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- slaveIdsWithExecutors() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- slaveLost(SchedulerDriver, Protos.SlaveID) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- slaveLost(SchedulerDriver, Protos.SlaveID) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- SlaveLost - Class in org.apache.spark.scheduler
-
- SlaveLost(String) - Constructor for class org.apache.spark.scheduler.SlaveLost
-
- slaveTimeout() - Method in class org.apache.spark.storage.BlockManagerMasterActor
-
- slice() - Method in class org.apache.spark.rdd.ParallelCollectionPartition
-
- slice(Seq<T>, int, ClassTag<T>) - Static method in class org.apache.spark.rdd.ParallelCollectionRDD
-
Slice a collection into numSlices sub-collections.
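One common way to cut a collection into numSlices nearly equal contiguous pieces is to place the boundary of slice i at i * length / numSlices. The sketch below shows that approach on a plain Java list; the class name is hypothetical and this is not presented as the ParallelCollectionRDD code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the Spark implementation.
final class SliceSketch {
    // Splits items into numSlices contiguous sub-lists whose sizes differ by at most one.
    static <T> List<List<T>> slice(List<T> items, int numSlices) {
        if (numSlices < 1) {
            throw new IllegalArgumentException("numSlices must be >= 1");
        }
        List<List<T>> out = new ArrayList<>(numSlices);
        int n = items.size();
        for (int i = 0; i < numSlices; i++) {
            // Use long arithmetic so i * n cannot overflow for large inputs.
            int start = (int) ((long) i * n / numSlices);
            int end = (int) ((long) (i + 1) * n / numSlices);
            out.add(new ArrayList<>(items.subList(start, end)));
        }
        return out;
    }
}
```

With ten items and three slices, integer division puts the boundaries at 0, 3, 6, and 10, so the last slice absorbs the extra element.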
- slice(Time, Time) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return all the RDDs between 'fromDuration' and 'toDuration' (both included)
- slice(Interval) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return all the RDDs defined by the Interval object (both end times included)
- slice(Time, Time) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return all the RDDs between 'fromTime' and 'toTime' (both included)
- slideDuration() - Method in class org.apache.spark.streaming.dstream.DStream
-
Time interval after which the DStream generates an RDD
- slideDuration() - Method in class org.apache.spark.streaming.dstream.FilteredDStream
-
- slideDuration() - Method in class org.apache.spark.streaming.dstream.FlatMappedDStream
-
- slideDuration() - Method in class org.apache.spark.streaming.dstream.FlatMapValuedDStream
-
- slideDuration() - Method in class org.apache.spark.streaming.dstream.ForEachDStream
-
- slideDuration() - Method in class org.apache.spark.streaming.dstream.GlommedDStream
-
- slideDuration() - Method in class org.apache.spark.streaming.dstream.InputDStream
-
- slideDuration() - Method in class org.apache.spark.streaming.dstream.MapPartitionedDStream
-
- slideDuration() - Method in class org.apache.spark.streaming.dstream.MappedDStream
-
- slideDuration() - Method in class org.apache.spark.streaming.dstream.MapValuedDStream
-
- slideDuration() - Method in class org.apache.spark.streaming.dstream.ReducedWindowedDStream
-
- slideDuration() - Method in class org.apache.spark.streaming.dstream.ShuffledDStream
-
- slideDuration() - Method in class org.apache.spark.streaming.dstream.StateDStream
-
- slideDuration() - Method in class org.apache.spark.streaming.dstream.TransformedDStream
-
- slideDuration() - Method in class org.apache.spark.streaming.dstream.UnionDStream
-
- slideDuration() - Method in class org.apache.spark.streaming.dstream.WindowedDStream
-
- sliding(int) - Method in class org.apache.spark.mllib.rdd.RDDFunctions
-
Returns an RDD from grouping items of its parent RDD in fixed size blocks by passing a sliding
window over them.
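The sliding-window grouping described above can be pictured on an ordinary list. The sketch below is a local illustration only: the class name is hypothetical, and it assumes that windows shorter than the requested size are dropped rather than emitted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the RDDFunctions implementation.
final class SlidingSketch {
    // Returns every run of windowSize consecutive items, advancing one item at a time.
    static <T> List<List<T>> sliding(List<T> items, int windowSize) {
        if (windowSize < 1) {
            throw new IllegalArgumentException("windowSize must be >= 1");
        }
        List<List<T>> windows = new ArrayList<>();
        // Stop once a full window no longer fits (assumption: partial windows are dropped).
        for (int start = 0; start + windowSize <= items.size(); start++) {
            windows.add(new ArrayList<>(items.subList(start, start + windowSize)));
        }
        return windows;
    }
}
```

A list of n items yields n - windowSize + 1 windows when n is at least windowSize, and none otherwise.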
- SlidingRDD<T> - Class in org.apache.spark.mllib.rdd
-
Represents an RDD from grouping items of its parent RDD in fixed size blocks by passing a sliding
window over them.
- SlidingRDD(RDD<T>, int, ClassTag<T>) - Constructor for class org.apache.spark.mllib.rdd.SlidingRDD
-
- SlidingRDDPartition<T> - Class in org.apache.spark.mllib.rdd
-
- SlidingRDDPartition(int, Partition, Seq<T>) - Constructor for class org.apache.spark.mllib.rdd.SlidingRDDPartition
-
- SnappyCompressionCodec - Class in org.apache.spark.io
-
- SnappyCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.SnappyCompressionCodec
-
- SocketInputDStream<T> - Class in org.apache.spark.streaming.dstream
-
- SocketInputDStream(StreamingContext, String, int, Function1<InputStream, Iterator<T>>, StorageLevel, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.SocketInputDStream
-
- SocketReceiver<T> - Class in org.apache.spark.streaming.dstream
-
- SocketReceiver(String, int, Function1<InputStream, Iterator<T>>, StorageLevel, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.SocketReceiver
-
- socketStream(String, int, Function<InputStream, Iterable<T>>, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream from network source hostname:port.
- socketStream(String, int, Function1<InputStream, Iterator<T>>, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream from TCP source hostname:port.
- socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream from network source hostname:port.
- socketTextStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream from network source hostname:port.
- socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream from TCP source hostname:port.
- solve(ALS.NormalEquation, double) - Method in class org.apache.spark.ml.recommendation.ALS.CholeskySolver
-
Solves a least squares problem with L2 regularization:
- solve(ALS.NormalEquation, double) - Method in interface org.apache.spark.ml.recommendation.ALS.LeastSquaresNESolver
-
Solves a least squares problem (possibly with other constraints).
- solve(ALS.NormalEquation, double) - Method in class org.apache.spark.ml.recommendation.ALS.NNLSSolver
-
Solves a nonnegative least squares problem with L2 regularization:
- solve(DoubleMatrix, DoubleMatrix, NNLS.Workspace) - Static method in class org.apache.spark.mllib.optimization.NNLS
-
Solve a least squares problem, possibly with nonnegativity constraints, by a modified
projected gradient method.
- Sort() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
-
- sort(String, String...) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame sorted by the specified column, all in ascending order.
- sort(Column...) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame sorted by the given expressions.
- sort(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame sorted by the specified column, all in ascending order.
- sort(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame sorted by the given expressions.
- sortBy(Function<T, S>, boolean, int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return this RDD sorted by the given key function.
- sortBy(Function1<T, K>, boolean, int, Ordering<K>, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
-
Return this RDD sorted by the given key function.
- sortByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements in
ascending order.
- sortByKey(boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- sortByKey(boolean, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- sortByKey(Comparator<K>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- sortByKey(Comparator<K>, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- sortByKey(Comparator<K>, boolean, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- sortByKey(boolean, int) - Method in class org.apache.spark.rdd.OrderedRDDFunctions
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- Source - Interface in org.apache.spark.metrics.source
-
- SOURCE_REGEX() - Static method in class org.apache.spark.metrics.MetricsSystem
-
- sourceName() - Method in class org.apache.spark.metrics.source.JvmSource
-
- sourceName() - Method in interface org.apache.spark.metrics.source.Source
-
- sourceName() - Method in class org.apache.spark.scheduler.DAGSchedulerSource
-
- sourceName() - Method in class org.apache.spark.storage.BlockManagerSource
-
- sourceName() - Method in class org.apache.spark.streaming.StreamingSource
-
- SPARK_CONTEXT() - Static method in class org.apache.spark.util.MetadataCleanerType
-
- SPARK_JOB_DESCRIPTION() - Static method in class org.apache.spark.SparkContext
-
- SPARK_JOB_GROUP_ID() - Static method in class org.apache.spark.SparkContext
-
- SPARK_JOB_INTERRUPT_ON_CANCEL() - Static method in class org.apache.spark.SparkContext
-
- SPARK_METADATA_KEY() - Static method in class org.apache.spark.sql.parquet.RowReadSupport
-
- SPARK_ROW_REQUESTED_SCHEMA() - Static method in class org.apache.spark.sql.parquet.RowReadSupport
-
- SPARK_ROW_SCHEMA() - Static method in class org.apache.spark.sql.parquet.RowWriteSupport
-
- SPARK_VERSION_KEY() - Static method in class org.apache.spark.scheduler.EventLoggingListener
-
- SparkConf - Class in org.apache.spark
-
Configuration for a Spark application.
- SparkConf(boolean) - Constructor for class org.apache.spark.SparkConf
-
- SparkConf() - Constructor for class org.apache.spark.SparkConf
-
Create a SparkConf that loads defaults from system properties and the classpath
- sparkConf() - Method in class org.apache.spark.streaming.Checkpoint
-
- sparkConfPairs() - Method in class org.apache.spark.streaming.Checkpoint
-
- sparkContext() - Method in class org.apache.spark.rdd.RDD
-
The SparkContext that created this RDD.
- SparkContext - Class in org.apache.spark
-
Main entry point for Spark functionality.
- SparkContext(SparkConf) - Constructor for class org.apache.spark.SparkContext
-
- SparkContext() - Constructor for class org.apache.spark.SparkContext
-
Create a SparkContext that loads settings from system properties (for instance, when
launching with ./bin/spark-submit).
- SparkContext(SparkConf, Map<String, Set<SplitInfo>>) - Constructor for class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Alternative constructor for setting preferred locations where Spark will create executors.
- SparkContext(String, String, SparkConf) - Constructor for class org.apache.spark.SparkContext
-
Alternative constructor that allows setting common Spark properties directly
- SparkContext(String, String, String, Seq<String>, Map<String, String>, Map<String, Set<SplitInfo>>) - Constructor for class org.apache.spark.SparkContext
-
Alternative constructor that allows setting common Spark properties directly
- SparkContext(String, String) - Constructor for class org.apache.spark.SparkContext
-
Alternative constructor that allows setting common Spark properties directly
- SparkContext(String, String, String) - Constructor for class org.apache.spark.SparkContext
-
Alternative constructor that allows setting common Spark properties directly
- SparkContext(String, String, String, Seq<String>) - Constructor for class org.apache.spark.SparkContext
-
Alternative constructor that allows setting common Spark properties directly
- sparkContext() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
-
- sparkContext() - Method in class org.apache.spark.sql.SQLContext
-
- sparkContext() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
The underlying SparkContext
- sparkContext() - Method in class org.apache.spark.streaming.StreamingContext
-
Return the associated Spark context
- SparkContext.DoubleAccumulatorParam$ - Class in org.apache.spark
-
- SparkContext.DoubleAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.DoubleAccumulatorParam$
-
- SparkContext.FloatAccumulatorParam$ - Class in org.apache.spark
-
- SparkContext.FloatAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.FloatAccumulatorParam$
-
- SparkContext.IntAccumulatorParam$ - Class in org.apache.spark
-
- SparkContext.IntAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.IntAccumulatorParam$
-
- SparkContext.LongAccumulatorParam$ - Class in org.apache.spark
-
- SparkContext.LongAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.LongAccumulatorParam$
-
- SparkDeploySchedulerBackend - Class in org.apache.spark.scheduler.cluster
-
- SparkDeploySchedulerBackend(TaskSchedulerImpl, SparkContext, String[]) - Constructor for class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- SparkDriverExecutionException - Exception in org.apache.spark
-
Exception thrown when execution of some user code in the driver process fails.
- SparkDriverExecutionException(Throwable) - Constructor for exception org.apache.spark.SparkDriverExecutionException
-
- SparkEnv - Class in org.apache.spark
-
:: DeveloperApi ::
Holds all the runtime environment objects for a running Spark instance (either master or worker),
including the serializer, Akka actor system, block manager, map output tracker, etc.
- SparkEnv(String, ActorSystem, Serializer, Serializer, CacheManager, MapOutputTracker, ShuffleManager, BroadcastManager, BlockTransferService, BlockManager, SecurityManager, HttpFileServer, String, MetricsSystem, ShuffleMemoryManager, OutputCommitCoordinator, SparkConf) - Constructor for class org.apache.spark.SparkEnv
-
- sparkEventFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
JSON deserialization methods for SparkListenerEvents.
- sparkEventToJson(SparkListenerEvent) - Static method in class org.apache.spark.util.JsonProtocol
-
JSON serialization methods for SparkListenerEvents.
- SparkException - Exception in org.apache.spark
-
- SparkException(String, Throwable) - Constructor for exception org.apache.spark.SparkException
-
- SparkException(String) - Constructor for exception org.apache.spark.SparkException
-
- SparkExitCode - Class in org.apache.spark.util
-
- SparkExitCode() - Constructor for class org.apache.spark.util.SparkExitCode
-
- SparkFiles - Class in org.apache.spark
-
Resolves paths to files added through SparkContext.addFile().
- SparkFiles() - Constructor for class org.apache.spark.SparkFiles
-
- sparkFilesDir() - Method in class org.apache.spark.SparkEnv
-
- SparkFirehoseListener - Class in org.apache.spark
-
Class that allows users to receive all SparkListener events.
- SparkFirehoseListener() - Constructor for class org.apache.spark.SparkFirehoseListener
-
- SparkFlumeEvent - Class in org.apache.spark.streaming.flume
-
A wrapper class for AvroFlumeEvents with a custom serialization format.
- SparkFlumeEvent() - Constructor for class org.apache.spark.streaming.flume.SparkFlumeEvent
-
- SparkHadoopMapReduceUtil - Interface in org.apache.spark.mapreduce
-
- SparkHadoopMapRedUtil - Interface in org.apache.spark.mapred
-
- SparkHadoopWriter - Class in org.apache.spark
-
Internal helper class that saves an RDD using a Hadoop OutputFormat.
- SparkHadoopWriter(JobConf) - Constructor for class org.apache.spark.SparkHadoopWriter
-
- SparkHiveDynamicPartitionWriterContainer - Class in org.apache.spark.sql.hive
-
- SparkHiveDynamicPartitionWriterContainer(JobConf, ShimFileSinkDesc, String[]) - Constructor for class org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer
-
- SparkHiveWriterContainer - Class in org.apache.spark.sql.hive
-
Internal helper class that saves an RDD using a Hive OutputFormat.
- SparkHiveWriterContainer(JobConf, ShimFileSinkDesc) - Constructor for class org.apache.spark.sql.hive.SparkHiveWriterContainer
-
- sparkJavaOpts(SparkConf, Function1<String, Object>) - Static method in class org.apache.spark.util.Utils
-
Convert all spark properties set in the given SparkConf to a sequence of java options.
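The conversion described here can be pictured as mapping each property to a `-Dkey=value` JVM option. The following is a minimal sketch under that assumption, not Spark's own code: a plain `Map` stands in for SparkConf, and the class name `JavaOptsSketch`, the method `toJavaOpts`, and the `keepKey` predicate (playing the role of the `Function1<String, Object>` argument) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Illustrative only: turns key/value properties into -Dkey=value JVM options. */
final class JavaOptsSketch {
    /**
     * props   stands in for the properties held by a SparkConf (assumption).
     * keepKey decides which property names are converted (assumption).
     */
    static List<String> toJavaOpts(Map<String, String> props, Predicate<String> keepKey) {
        List<String> opts = new ArrayList<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            if (keepKey.test(e.getKey())) {
                opts.add("-D" + e.getKey() + "=" + e.getValue());
            }
        }
        return opts;
    }
}
```

Passing a predicate such as `k -> k.startsWith("spark.ui.")` keeps only the matching properties.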
- SparkJobInfo - Interface in org.apache.spark
-
Exposes information about Spark Jobs.
- SparkJobInfoImpl - Class in org.apache.spark
-
- SparkJobInfoImpl(int, int[], JobExecutionStatus) - Constructor for class org.apache.spark.SparkJobInfoImpl
-
- SparkListener - Interface in org.apache.spark.scheduler
-
:: DeveloperApi ::
Interface for listening to events from the Spark scheduler.
- SparkListenerApplicationEnd - Class in org.apache.spark.scheduler
-
- SparkListenerApplicationEnd(long) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationEnd
-
- SparkListenerApplicationStart - Class in org.apache.spark.scheduler
-
- SparkListenerApplicationStart(String, Option<String>, long, String) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- SparkListenerBlockManagerAdded - Class in org.apache.spark.scheduler
-
- SparkListenerBlockManagerAdded(long, BlockManagerId, long) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
-
- SparkListenerBlockManagerRemoved - Class in org.apache.spark.scheduler
-
- SparkListenerBlockManagerRemoved(long, BlockManagerId) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
-
- SparkListenerBus - Interface in org.apache.spark.scheduler
-
- SparkListenerEnvironmentUpdate - Class in org.apache.spark.scheduler
-
- SparkListenerEnvironmentUpdate(Map<String, Seq<Tuple2<String, String>>>) - Constructor for class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
-
- SparkListenerEvent - Interface in org.apache.spark.scheduler
-
- SparkListenerExecutorAdded - Class in org.apache.spark.scheduler
-
- SparkListenerExecutorAdded(long, String, ExecutorInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorAdded
-
- SparkListenerExecutorMetricsUpdate - Class in org.apache.spark.scheduler
-
Periodic updates from executors.
- SparkListenerExecutorMetricsUpdate(String, Seq<Tuple4<Object, Object, Object, TaskMetrics>>) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
-
- SparkListenerExecutorRemoved - Class in org.apache.spark.scheduler
-
- SparkListenerExecutorRemoved(long, String, String) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorRemoved
-
- SparkListenerJobEnd - Class in org.apache.spark.scheduler
-
- SparkListenerJobEnd(int, long, JobResult) - Constructor for class org.apache.spark.scheduler.SparkListenerJobEnd
-
- SparkListenerJobStart - Class in org.apache.spark.scheduler
-
- SparkListenerJobStart(int, long, Seq<StageInfo>, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerJobStart
-
- SparkListenerLogStart - Class in org.apache.spark.scheduler
-
An internal class that describes the metadata of an event log.
- SparkListenerLogStart(String) - Constructor for class org.apache.spark.scheduler.SparkListenerLogStart
-
- SparkListenerStageCompleted - Class in org.apache.spark.scheduler
-
- SparkListenerStageCompleted(StageInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerStageCompleted
-
- SparkListenerStageSubmitted - Class in org.apache.spark.scheduler
-
- SparkListenerStageSubmitted(StageInfo, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerStageSubmitted
-
- SparkListenerTaskEnd - Class in org.apache.spark.scheduler
-
- SparkListenerTaskEnd(int, int, String, TaskEndReason, TaskInfo, TaskMetrics) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- SparkListenerTaskGettingResult - Class in org.apache.spark.scheduler
-
- SparkListenerTaskGettingResult(TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskGettingResult
-
- SparkListenerTaskStart - Class in org.apache.spark.scheduler
-
- SparkListenerTaskStart(int, int, TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskStart
-
- SparkListenerUnpersistRDD - Class in org.apache.spark.scheduler
-
- SparkListenerUnpersistRDD(int) - Constructor for class org.apache.spark.scheduler.SparkListenerUnpersistRDD
-
- sparkProperties() - Method in class org.apache.spark.ui.env.EnvironmentListener
-
- SparkSQLParser - Class in org.apache.spark.sql
-
The top level Spark SQL parser.
- SparkSQLParser(Function1<String, LogicalPlan>) - Constructor for class org.apache.spark.sql.SparkSQLParser
-
- SparkStageInfo - Interface in org.apache.spark
-
Exposes information about Spark Stages.
- SparkStageInfoImpl - Class in org.apache.spark
-
- SparkStageInfoImpl(int, int, long, String, int, int, int, int) - Constructor for class org.apache.spark.SparkStageInfoImpl
-
- SparkStatusTracker - Class in org.apache.spark
-
Low-level status reporting APIs for monitoring job and stage progress.
- SparkStatusTracker(SparkContext) - Constructor for class org.apache.spark.SparkStatusTracker
-
- SparkUI - Class in org.apache.spark.ui
-
Top level user interface for a Spark application.
- SparkUITab - Class in org.apache.spark.ui
-
- SparkUITab(SparkUI, String) - Constructor for class org.apache.spark.ui.SparkUITab
-
- SparkUncaughtExceptionHandler - Class in org.apache.spark.util
-
The default uncaught exception handler for Executors terminates the whole process, to avoid
getting into a bad state indefinitely.
- SparkUncaughtExceptionHandler() - Constructor for class org.apache.spark.util.SparkUncaughtExceptionHandler
-
- sparkUser() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- sparkUser() - Method in class org.apache.spark.scheduler.ApplicationEventListener
-
- sparkUser() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- sparkUser() - Method in class org.apache.spark.SparkContext
-
- sparkVersion() - Method in class org.apache.spark.scheduler.SparkListenerLogStart
-
- sparse(int, int, int[], int[], double[]) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Creates a column-major sparse matrix in Compressed Sparse Column (CSC) format.
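To make the CSC layout concrete, here is a small standalone sketch that expands the five arguments into a dense array. It does not use Spark; the class `CscSketch` and the parameter names `colPtrs`, `rowIndices`, and `values` are illustrative assumptions about how the `(int, int, int[], int[], double[])` arguments are interpreted.

```java
/** Illustrative only: expands Compressed Sparse Column (CSC) arrays into a dense matrix. */
final class CscSketch {
    /**
     * colPtrs has numCols + 1 entries. Column j owns positions
     * colPtrs[j] .. colPtrs[j + 1] - 1 of rowIndices and values.
     */
    static double[][] toDense(int numRows, int numCols,
                              int[] colPtrs, int[] rowIndices, double[] values) {
        if (colPtrs.length != numCols + 1) {
            throw new IllegalArgumentException("colPtrs must have numCols + 1 entries");
        }
        double[][] dense = new double[numRows][numCols];
        for (int j = 0; j < numCols; j++) {
            for (int k = colPtrs[j]; k < colPtrs[j + 1]; k++) {
                dense[rowIndices[k]][j] = values[k];
            }
        }
        return dense;
    }
}
```

For example, `toDense(3, 2, {0, 2, 3}, {0, 2, 1}, {1.0, 2.0, 3.0})` places 1.0 and 2.0 in rows 0 and 2 of column 0, and 3.0 in row 1 of column 1.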
- sparse(int, int[], double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a sparse vector providing its index array and value array.
- sparse(int, Seq<Tuple2<Object, Object>>) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a sparse vector using unordered (index, value) pairs.
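Accepting unordered pairs implies that the pairs are sorted by index before being stored as parallel index and value arrays. The sketch below shows that step in isolation. It is not Spark's implementation; `SparseVectorSketch` and `fromUnorderedPairs` are hypothetical names, and rejecting duplicate or out-of-range indices is an assumption made here for safety.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: a sparse vector stored as sorted indices plus matching values. */
final class SparseVectorSketch {
    final int size;
    final int[] indices;
    final double[] values;

    private SparseVectorSketch(int size, int[] indices, double[] values) {
        this.size = size;
        this.indices = indices;
        this.values = values;
    }

    /** Builds the vector from (idx[i], vals[i]) pairs given in any order. */
    static SparseVectorSketch fromUnorderedPairs(int size, int[] idx, double[] vals) {
        if (idx.length != vals.length) {
            throw new IllegalArgumentException("idx and vals must have the same length");
        }
        TreeMap<Integer, Double> sorted = new TreeMap<>(); // keeps keys in ascending order
        for (int i = 0; i < idx.length; i++) {
            if (idx[i] < 0 || idx[i] >= size) {
                throw new IllegalArgumentException("index out of range: " + idx[i]);
            }
            if (sorted.put(idx[i], vals[i]) != null) {
                throw new IllegalArgumentException("duplicate index: " + idx[i]);
            }
        }
        int[] indices = new int[sorted.size()];
        double[] values = new double[sorted.size()];
        int k = 0;
        for (Map.Entry<Integer, Double> e : sorted.entrySet()) {
            indices[k] = e.getKey();
            values[k] = e.getValue();
            k++;
        }
        return new SparseVectorSketch(size, indices, values);
    }
}
```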
- sparse(int, Iterable<Tuple2<Integer, Double>>) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a sparse vector using unordered (index, value) pairs in a Java friendly way.
- SparseMatrix - Class in org.apache.spark.mllib.linalg
-
Column-major sparse matrix.
- SparseMatrix(int, int, int[], int[], double[], boolean) - Constructor for class org.apache.spark.mllib.linalg.SparseMatrix
-
- SparseMatrix(int, int, int[], int[], double[]) - Constructor for class org.apache.spark.mllib.linalg.SparseMatrix
-
Column-major sparse matrix.
- SparseVector - Class in org.apache.spark.mllib.linalg
-
A sparse vector represented by an index array and a value array.
- SparseVector(int, int[], double[]) - Constructor for class org.apache.spark.mllib.linalg.SparseVector
-
- spdiag(Vector) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
-
Generate a diagonal matrix in SparseMatrix format from the supplied values.
- SpearmanCorrelation - Class in org.apache.spark.mllib.stat.correlation
-
Compute Spearman's correlation for two RDDs of the type RDD[Double] or the correlation matrix
for an RDD of the type RDD[Vector].
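Spearman's correlation is the Pearson correlation of the ranks of the two samples. The sketch below works on plain arrays rather than RDDs and is only meant to illustrate the definition. The class `SpearmanSketch` is hypothetical, and assigning tied values their average rank is the textbook convention assumed here.

```java
import java.util.Arrays;

/** Illustrative only: Spearman's rho as the Pearson correlation of ranks. */
final class SpearmanSketch {
    /** 1-based ranks; tied values share the average of their positions. */
    static double[] ranks(double[] x) {
        Integer[] order = new Integer[x.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Double.compare(x[a], x[b]));
        double[] r = new double[x.length];
        int i = 0;
        while (i < order.length) {
            int j = i;
            while (j + 1 < order.length && x[order[j + 1]] == x[order[i]]) {
                j++;
            }
            double avg = (i + j) / 2.0 + 1.0; // average rank of the tie group
            for (int k = i; k <= j; k++) {
                r[order[k]] = avg;
            }
            i = j + 1;
        }
        return r;
    }

    static double pearson(double[] a, double[] b) {
        int n = a.length;
        double meanA = 0.0, meanB = 0.0;
        for (int i = 0; i < n; i++) {
            meanA += a[i];
            meanB += b[i];
        }
        meanA /= n;
        meanB /= n;
        double cov = 0.0, varA = 0.0, varB = 0.0;
        for (int i = 0; i < n; i++) {
            double da = a[i] - meanA;
            double db = b[i] - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }
        return cov / Math.sqrt(varA * varB);
    }

    static double spearman(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("samples must have the same length");
        }
        return pearson(ranks(x), ranks(y));
    }
}
```

Because only ranks matter, any strictly increasing relationship between the two samples yields a coefficient of 1, even when it is far from linear.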
- SpearmanCorrelation() - Constructor for class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
-
- speculatableTasks() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- SPECULATION_INTERVAL() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- SPECULATION_MULTIPLIER() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- SPECULATION_QUANTILE() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- speculative() - Method in class org.apache.spark.scheduler.TaskInfo
-
- speye(int) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Generate a sparse Identity Matrix in Matrix format.
- speye(int) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
-
Generate an Identity Matrix in SparseMatrix format.
- split() - Method in class org.apache.spark.mllib.tree.impl.NodeIndexUpdater
-
- split() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
-
- split() - Method in class org.apache.spark.mllib.tree.model.Node
-
- Split - Class in org.apache.spark.mllib.tree.model
-
:: DeveloperApi ::
Split applied to a feature
- Split(int, double, Enumeration.Value, List<Object>) - Constructor for class org.apache.spark.mllib.tree.model.Split
-
- split() - Method in class org.apache.spark.rdd.NarrowCoGroupSplitDep
-
- SPLIT_INFO_REFLECTIONS() - Static method in class org.apache.spark.rdd.HadoopRDD
-
- splitAndCountPartitions(Iterator<String>) - Static method in class org.apache.spark.streaming.util.RawTextHelper
-
Splits lines and counts the words.
- splitCommandString(String) - Static method in class org.apache.spark.util.Utils
-
Split a string of potentially quoted arguments from the command line the way that a shell
would do it to determine arguments to a command.
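Shell-style splitting means that whitespace separates arguments except inside quotes or after a backslash. The sketch below is a simplified illustration of that idea and is not Spark's implementation. The class `CommandSplitSketch` is hypothetical, and treating a backslash inside double quotes as escaping any following character is a simplification of real shell rules.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: splits a command line on whitespace, honoring quotes and backslashes. */
final class CommandSplitSketch {
    static List<String> split(String s) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inWord = false;   // true once the current argument has started
        boolean inSingle = false; // inside '...': everything is literal
        boolean inDouble = false; // inside "...": backslash escapes the next character
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (inSingle) {
                if (c == '\'') inSingle = false; else cur.append(c);
            } else if (inDouble) {
                if (c == '"') {
                    inDouble = false;
                } else if (c == '\\' && i + 1 < s.length()) {
                    cur.append(s.charAt(++i));
                } else {
                    cur.append(c);
                }
            } else if (c == '\'') {
                inSingle = true;
                inWord = true;
            } else if (c == '"') {
                inDouble = true;
                inWord = true;
            } else if (c == '\\' && i + 1 < s.length()) {
                cur.append(s.charAt(++i));
                inWord = true;
            } else if (Character.isWhitespace(c)) {
                if (inWord) {
                    out.add(cur.toString());
                    cur.setLength(0);
                    inWord = false;
                }
            } else {
                cur.append(c);
                inWord = true;
            }
        }
        if (inWord) {
            out.add(cur.toString());
        }
        return out;
    }
}
```

With this sketch, a quoted segment such as `"hello world"` stays a single argument, and an empty pair of quotes produces an empty argument.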
- splitIdToFile(int) - Static method in class org.apache.spark.rdd.CheckpointRDD
-
- splitIndex() - Method in class org.apache.spark.rdd.NarrowCoGroupSplitDep
-
- splitIndex() - Method in class org.apache.spark.storage.RDDBlockId
-
- SplitInfo - Class in org.apache.spark.scheduler
-
- SplitInfo(Class<?>, String, String, long, Object) - Constructor for class org.apache.spark.scheduler.SplitInfo
-
- splitLocationInfo() - Method in class org.apache.spark.rdd.HadoopRDD.SplitInfoReflections
-
- splits() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- sprand(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Generate a SparseMatrix consisting of i.i.d. uniform random numbers.
- sprand(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
-
Generate a SparseMatrix consisting of i.i.d.
- sprandn(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Generate a SparseMatrix consisting of i.i.d. gaussian random numbers.
- sprandn(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
-
Generate a SparseMatrix consisting of i.i.d.
- sqdist(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Returns the squared distance between two Vectors.
- sqdist(SparseVector, DenseVector) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Returns the squared distance between DenseVector and SparseVector.
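The squared distance is the sum of squared coordinate differences. When one vector is sparse, its missing coordinates count as zero. The sketch below shows this for plain arrays. `SqDistSketch` is a hypothetical name, and the sparse vector is assumed to be given as ascending `indices` with matching `values`.

```java
/** Illustrative only: squared Euclidean distance between a sparse and a dense vector. */
final class SqDistSketch {
    /**
     * indices must be sorted ascending and lie within the dense length;
     * coordinates not listed in indices are treated as 0.0.
     */
    static double sqdist(int[] indices, double[] values, double[] dense) {
        double sum = 0.0;
        int k = 0; // current position in the sparse arrays
        for (int i = 0; i < dense.length; i++) {
            double s = 0.0;
            if (k < indices.length && indices[k] == i) {
                s = values[k];
                k++;
            }
            double d = s - dense[i];
            sum += d * d;
        }
        return sum;
    }
}
```

For the sparse vector with a single entry 2.0 at index 1 and the dense vector (1, 0, 3), the terms are 1, 4, and 9, giving 14.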
- sql() - Method in class org.apache.spark.sql.hive.execution.HiveNativeCommand
-
- sql(String) - Method in class org.apache.spark.sql.hive.HiveContext
-
- sql(String) - Method in class org.apache.spark.sql.SQLContext
-
Executes a SQL query using Spark, returning the result as a DataFrame.
- SQLConf - Class in org.apache.spark.sql
-
A class that enables the setting and getting of mutable config parameters/hints.
- SQLConf() - Constructor for class org.apache.spark.sql.SQLConf
-
- SQLConf.Deprecated$ - Class in org.apache.spark.sql
-
- SQLConf.Deprecated$() - Constructor for class org.apache.spark.sql.SQLConf.Deprecated$
-
- sqlContext() - Method in class org.apache.spark.sql.DataFrame
-
- sqlContext() - Method in class org.apache.spark.sql.jdbc.JDBCRelation
-
- sqlContext() - Method in class org.apache.spark.sql.json.JSONRelation
-
- sqlContext() - Method in class org.apache.spark.sql.parquet.ParquetRelation
-
- sqlContext() - Method in class org.apache.spark.sql.parquet.ParquetRelation2
-
- sqlContext() - Method in interface org.apache.spark.sql.parquet.ParquetTest
-
- sqlContext() - Method in class org.apache.spark.sql.sources.BaseRelation
-
- SQLContext - Class in org.apache.spark.sql
-
The entry point for working with structured data (rows and columns) in Spark.
- SQLContext(SparkContext) - Constructor for class org.apache.spark.sql.SQLContext
-
- SQLContext(JavaSparkContext) - Constructor for class org.apache.spark.sql.SQLContext
-
- SQLContext.implicits - Class in org.apache.spark.sql
-
- SQLContext.implicits() - Constructor for class org.apache.spark.sql.SQLContext.implicits
-
:: Experimental ::
(Scala-specific) Implicit methods available in Scala for converting common Scala objects into DataFrames.
- SQLContext.implicits.StringToColumn - Class in org.apache.spark.sql
-
Converts $"col name" into a Column.
- SQLContext.implicits.StringToColumn(StringContext) - Constructor for class org.apache.spark.sql.SQLContext.implicits.StringToColumn
-
- sqlType() - Method in class org.apache.spark.mllib.linalg.VectorUDT
-
- sqlType() - Method in class org.apache.spark.sql.test.ExamplePointUDT
-
- sqrt(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the square root of the specified float value.
- SQRT() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- squaredDist(Vector) - Method in class org.apache.spark.util.Vector
-
- SquaredError - Class in org.apache.spark.mllib.tree.loss
-
:: DeveloperApi ::
Class for squared error loss calculation.
- SquaredError() - Constructor for class org.apache.spark.mllib.tree.loss.SquaredError
-
- SquaredL2Updater - Class in org.apache.spark.mllib.optimization
-
:: DeveloperApi ::
Updater for L2 regularized problems.
- SquaredL2Updater() - Constructor for class org.apache.spark.mllib.optimization.SquaredL2Updater
-
- Src - Static variable in class org.apache.spark.graphx.TripletFields
-
Expose the source and edge fields but not the destination field.
- srcAttr() - Method in class org.apache.spark.graphx.EdgeContext
-
The vertex attribute of the edge's source vertex.
- srcAttr() - Method in class org.apache.spark.graphx.EdgeTriplet
-
The source vertex attribute
- srcAttr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- srcId() - Method in class org.apache.spark.graphx.Edge
-
- srcId() - Method in class org.apache.spark.graphx.EdgeContext
-
The vertex id of the edge's source vertex.
- srcId() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- srcId() - Method in class org.apache.spark.graphx.impl.EdgeWithLocalIds
-
- srcIds() - Method in class org.apache.spark.ml.recommendation.ALS.InBlock
-
- srcIds() - Method in class org.apache.spark.ml.recommendation.ALS.RatingBlock
-
- srcIds() - Method in class org.apache.spark.ml.recommendation.ALS.UncompressedInBlock
-
- srdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
- ssc() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
- ssc() - Method in class org.apache.spark.streaming.dstream.DStream
-
- ssc() - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
- ssc() - Method in class org.apache.spark.streaming.scheduler.JobScheduler
-
- SSLOptions - Class in org.apache.spark
-
SSLOptions class is a common container for SSL configuration options.
- SSLOptions(boolean, Option<File>, Option<String>, Option<String>, Option<File>, Option<String>, Option<String>, Set<String>) - Constructor for class org.apache.spark.SSLOptions
-
- sslSocketFactory() - Method in class org.apache.spark.SecurityManager
-
- stackTrace() - Method in class org.apache.spark.ExceptionFailure
-
- stackTrace() - Method in class org.apache.spark.util.ThreadStackTrace
-
- stackTraceFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- stackTraceToJson(StackTraceElement[]) - Static method in class org.apache.spark.util.JsonProtocol
-
- stage() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
-
- Stage - Class in org.apache.spark.scheduler
-
A stage is a set of independent tasks all computing the same function that need to run as part
of a Spark job, where all the tasks have the same shuffle dependencies.
- Stage(int, RDD<?>, int, Option<ShuffleDependency<?, ?, ?>>, List<Stage>, int, CallSite) - Constructor for class org.apache.spark.scheduler.Stage
-
- stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
-
- StageCancelled - Class in org.apache.spark.scheduler
-
- StageCancelled(int) - Constructor for class org.apache.spark.scheduler.StageCancelled
-
- stageCompletedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- stageCompletedToJson(SparkListenerStageCompleted) - Static method in class org.apache.spark.util.JsonProtocol
-
- stageEnd(int) - Method in class org.apache.spark.scheduler.OutputCommitCoordinator
-
- stageFailed(String) - Method in class org.apache.spark.scheduler.StageInfo
-
- stageId() - Method in class org.apache.spark.scheduler.Pool
-
- stageId() - Method in interface org.apache.spark.scheduler.Schedulable
-
- stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
-
- stageId() - Method in class org.apache.spark.scheduler.StageCancelled
-
- stageId() - Method in class org.apache.spark.scheduler.StageInfo
-
- stageId() - Method in class org.apache.spark.scheduler.Task
-
- stageId() - Method in class org.apache.spark.scheduler.TaskSet
-
- stageId() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- stageId() - Method in interface org.apache.spark.SparkStageInfo
-
- stageId() - Method in class org.apache.spark.SparkStageInfoImpl
-
- stageId() - Method in class org.apache.spark.TaskContext
-
The ID of the stage that this task belongs to.
- stageId() - Method in class org.apache.spark.TaskContextImpl
-
- stageIds() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
-
- stageIds() - Method in interface org.apache.spark.SparkJobInfo
-
- stageIds() - Method in class org.apache.spark.SparkJobInfoImpl
-
- stageIds() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
-
- stageIdToActiveJobIds() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- stageIdToData() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- stageIdToInfo() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- stageIdToStage() - Method in class org.apache.spark.scheduler.DAGScheduler
-
- stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageCompleted
-
- stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
-
- StageInfo - Class in org.apache.spark.scheduler
-
:: DeveloperApi ::
Stores information about a stage to pass from the scheduler to SparkListeners.
- StageInfo(int, int, String, int, Seq<RDDInfo>, String) - Constructor for class org.apache.spark.scheduler.StageInfo
-
- stageInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
JSON deserialization methods for classes SparkListenerEvents depend on.
- stageInfos() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
-
- stageInfoToJson(StageInfo) - Static method in class org.apache.spark.util.JsonProtocol
-
JSON serialization methods for classes SparkListenerEvents depend on.
- StagePage - Class in org.apache.spark.ui.jobs
-
Page showing statistics and task list for a given stage
- StagePage(StagesTab) - Constructor for class org.apache.spark.ui.jobs.StagePage
-
- stages() - Method in class org.apache.spark.ml.Pipeline
-
param for pipeline stages
- stages() - Method in class org.apache.spark.ml.PipelineModel
-
- StagesTab - Class in org.apache.spark.ui.jobs
-
Web UI showing progress status of all stages in the given SparkContext.
- StagesTab(SparkUI) - Constructor for class org.apache.spark.ui.jobs.StagesTab
-
- stageStart(int) - Method in class org.apache.spark.scheduler.OutputCommitCoordinator
-
- stageSubmittedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- stageSubmittedToJson(SparkListenerStageSubmitted) - Static method in class org.apache.spark.util.JsonProtocol
-
- StageTableBase - Class in org.apache.spark.ui.jobs
-
Page showing list of all ongoing and recently finished stages
- StageTableBase(Seq<StageInfo>, String, JobProgressListener, boolean, boolean) - Constructor for class org.apache.spark.ui.jobs.StageTableBase
-
- StandardNormalGenerator - Class in org.apache.spark.mllib.random
-
:: DeveloperApi ::
Generates i.i.d.
- StandardNormalGenerator() - Constructor for class org.apache.spark.mllib.random.StandardNormalGenerator
-
- StandardScaler - Class in org.apache.spark.ml.feature
-
:: AlphaComponent ::
Standardizes features by removing the mean and scaling to unit variance using column summary
statistics on the samples in the training set.
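Standardization subtracts each column's mean and divides by that column's standard deviation. The sketch below applies it to a small in-memory matrix and is only an illustration, not the Spark transformer. `StandardizeSketch` is a hypothetical name, the sample (n - 1) standard deviation is an assumption, and mapping zero-variance columns to 0.0 is a choice made here to avoid division by zero.

```java
/** Illustrative only: column-wise standardization to zero mean and unit variance. */
final class StandardizeSketch {
    static double[][] fitTransform(double[][] rows) {
        int n = rows.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two rows");
        }
        int d = rows[0].length;
        double[] mean = new double[d];
        double[] std = new double[d];
        for (double[] r : rows) {
            for (int j = 0; j < d; j++) {
                mean[j] += r[j] / n;
            }
        }
        for (double[] r : rows) {
            for (int j = 0; j < d; j++) {
                double diff = r[j] - mean[j];
                std[j] += diff * diff;
            }
        }
        for (int j = 0; j < d; j++) {
            std[j] = Math.sqrt(std[j] / (n - 1)); // sample standard deviation (assumption)
        }
        double[][] out = new double[n][d];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < d; j++) {
                // Constant columns carry no information; map them to 0.0.
                out[i][j] = std[j] == 0.0 ? 0.0 : (rows[i][j] - mean[j]) / std[j];
            }
        }
        return out;
    }
}
```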
- StandardScaler() - Constructor for class org.apache.spark.ml.feature.StandardScaler
-
- StandardScaler - Class in org.apache.spark.mllib.feature
-
:: Experimental ::
Standardizes features by removing the mean and scaling to unit std using column summary
statistics on the samples in the training set.
- StandardScaler(boolean, boolean) - Constructor for class org.apache.spark.mllib.feature.StandardScaler
-
- StandardScaler() - Constructor for class org.apache.spark.mllib.feature.StandardScaler
-
- StandardScalerModel - Class in org.apache.spark.ml.feature
-
- StandardScalerModel(StandardScaler, ParamMap, StandardScalerModel) - Constructor for class org.apache.spark.ml.feature.StandardScalerModel
-
- StandardScalerModel - Class in org.apache.spark.mllib.feature
-
:: Experimental ::
Represents a StandardScaler model that can transform vectors.
- StandardScalerModel(Vector, Vector, boolean, boolean) - Constructor for class org.apache.spark.mllib.feature.StandardScalerModel
-
- StandardScalerModel(Vector, Vector) - Constructor for class org.apache.spark.mllib.feature.StandardScalerModel
-
- StandardScalerModel(Vector) - Constructor for class org.apache.spark.mllib.feature.StandardScalerModel
-
- StandardScalerParams - Interface in org.apache.spark.ml.feature
-
- starGraph(SparkContext, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
Create a star graph with vertex 0 being the center.
- start() - Method in class org.apache.spark.ContextCleaner
-
Start the cleaner.
- start() - Method in class org.apache.spark.ExecutorAllocationManager
-
Register for scheduler callbacks to decide when to add and remove executors.
- start() - Method in class org.apache.spark.HttpServer
-
- start() - Method in class org.apache.spark.metrics.MetricsSystem
-
- start() - Method in class org.apache.spark.metrics.sink.ConsoleSink
-
- start() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- start() - Method in class org.apache.spark.metrics.sink.GraphiteSink
-
- start() - Method in class org.apache.spark.metrics.sink.JmxSink
-
- start() - Method in class org.apache.spark.metrics.sink.MetricsServlet
-
- start() - Method in interface org.apache.spark.metrics.sink.Sink
-
- start(String) - Method in class org.apache.spark.mllib.tree.impl.TimeTracker
-
Starts a new timer, or re-starts a stopped timer.
- start() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- start() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- start() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- start() - Method in class org.apache.spark.scheduler.cluster.SimrSchedulerBackend
-
- start() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- start() - Method in class org.apache.spark.scheduler.EventLoggingListener
-
Creates the log file in the configured log directory.
- start() - Method in class org.apache.spark.scheduler.local.LocalBackend
-
- start() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
- start() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- start() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- start() - Method in class org.apache.spark.sql.parquet.CatalystArrayContainsNullConverter
-
- start() - Method in class org.apache.spark.sql.parquet.CatalystArrayConverter
-
- start() - Method in class org.apache.spark.sql.parquet.CatalystGroupConverter
-
- start() - Method in class org.apache.spark.sql.parquet.CatalystMapConverter
-
- start() - Method in class org.apache.spark.sql.parquet.CatalystNativeArrayConverter
-
- start() - Method in class org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter
-
- start() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Start the execution of the streams.
- start() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
-
- start() - Method in class org.apache.spark.streaming.dstream.FileInputDStream
-
- start() - Method in class org.apache.spark.streaming.dstream.InputDStream
-
Method called to start receiving data.
- start() - Method in class org.apache.spark.streaming.dstream.QueueInputDStream
-
- start() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
- start(Time) - Method in class org.apache.spark.streaming.DStreamGraph
-
- start() - Method in class org.apache.spark.streaming.kafka.DirectKafkaInputDStream
-
- start() - Method in class org.apache.spark.streaming.receiver.BlockGenerator
-
Start block generating and pushing threads.
- start() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
-
Start the supervisor
- start() - Method in class org.apache.spark.streaming.scheduler.JobGenerator
-
Start generation of jobs
- start() - Method in class org.apache.spark.streaming.scheduler.JobScheduler
-
- start() - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker.ReceiverLauncher
-
- start() - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
-
Start the actor and receiver execution thread.
- start() - Method in class org.apache.spark.streaming.StreamingContext
-
Start the execution of the streams.
- start(long) - Method in class org.apache.spark.streaming.util.RecurringTimer
-
Start at the given start time.
- start() - Method in class org.apache.spark.streaming.util.RecurringTimer
-
Start at the earliest time it can start based on the period.
- start() - Method in class org.apache.spark.util.AsynchronousListenerBus
-
Start sending events to attached listeners.
- start() - Method in class org.apache.spark.util.EventLoop
-
- Started() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor.ReceiverState
-
- Started() - Method in class org.apache.spark.streaming.StreamingContext.StreamingContextState$
-
- startIdx() - Method in class org.apache.spark.util.Distribution
-
- startIndex() - Method in class org.apache.spark.rdd.ZippedWithIndexRDDPartition
-
- startIndexInLevel(int) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Return the index of the first node in the given level.
- startJettyServer(String, int, Seq<ServletContextHandler>, SparkConf, String) - Static method in class org.apache.spark.ui.JettyUtils
-
Attempt to start a Jetty server bound to the supplied hostName:port using the given
context handlers.
- startReceiver() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
-
Start receiver
- startServiceOnPort(int, Function1<Object, Tuple2<T, Object>>, SparkConf, String) - Static method in class org.apache.spark.util.Utils
-
Attempt to start a service on the given port, or fail after a number of attempts.
- startsWith(Column) - Method in class org.apache.spark.sql.Column
-
String starts with.
- startsWith(String) - Method in class org.apache.spark.sql.Column
-
String starts with another string literal.
- startTime() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- startTime() - Method in class org.apache.spark.partial.ApproximateActionListener
-
- startTime() - Method in class org.apache.spark.scheduler.ApplicationEventListener
-
- startTime() - Method in class org.apache.spark.SparkContext
-
- startTime() - Method in class org.apache.spark.streaming.DStreamGraph
-
- startTime() - Method in class org.apache.spark.streaming.util.WriteAheadLogManager.LogInfo
-
- STARVATION_TIMEOUT() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- statCounter() - Method in class org.apache.spark.util.Distribution
-
- StatCounter - Class in org.apache.spark.util
-
A class for tracking the statistics of a set of numbers (count, mean and variance) in a
numerically robust way.
- StatCounter(TraversableOnce<Object>) - Constructor for class org.apache.spark.util.StatCounter
-
- StatCounter() - Constructor for class org.apache.spark.util.StatCounter
-
Initialize the StatCounter with no values.
- state() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
-
- state() - Method in class org.apache.spark.scheduler.local.StatusUpdate
-
- state() - Method in class org.apache.spark.streaming.StreamingContext
-
- StateDStream<K,V,S> - Class in org.apache.spark.streaming.dstream
-
- StateDStream(DStream<Tuple2<K, V>>, Function1<Iterator<Tuple3<K, Seq<V>, Option<S>>>, Iterator<Tuple2<K, S>>>, Partitioner, boolean, Option<RDD<Tuple2<K, S>>>, ClassTag<K>, ClassTag<V>, ClassTag<S>) - Constructor for class org.apache.spark.streaming.dstream.StateDStream
-
- STATIC_RESOURCE_DIR() - Static method in class org.apache.spark.ui.SparkUI
-
- staticPageRank(int, double) - Method in class org.apache.spark.graphx.GraphOps
-
Run PageRank for a fixed number of iterations, returning a graph whose vertex attributes
contain the PageRank and whose edge attributes contain the normalized edge weight.
- statistic() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- statistic() - Method in interface org.apache.spark.mllib.stat.test.TestResult
-
Test statistic.
- Statistics - Class in org.apache.spark.mllib.stat
-
:: Experimental ::
API for statistical functions in MLlib.
- Statistics() - Constructor for class org.apache.spark.mllib.stat.Statistics
-
- statistics() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
-
- statistics() - Method in class org.apache.spark.sql.hive.MetastoreRelation
-
- statistics() - Method in class org.apache.spark.sql.parquet.ParquetRelation
-
- statistics() - Method in class org.apache.spark.sql.sources.LogicalRelation
-
- Statistics - Class in org.apache.spark.streaming.receiver
-
:: DeveloperApi ::
Statistics for querying the supervisor about state of workers.
- Statistics(int, int, int, String) - Constructor for class org.apache.spark.streaming.receiver.Statistics
-
- stats() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a StatCounter object that captures the mean, variance and
count of the RDD's elements in one operation.
- stats() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
-
- stats() - Method in class org.apache.spark.mllib.tree.model.Node
-
- stats() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Return a StatCounter object that captures the mean, variance and
count of the RDD's elements in one operation.
- stats() - Method in class org.apache.spark.sql.columnar.CachedBatch
-
- StatsReportListener - Class in org.apache.spark.scheduler
-
:: DeveloperApi ::
Simple SparkListener that logs a few summary statistics when each stage completes
- StatsReportListener() - Constructor for class org.apache.spark.scheduler.StatsReportListener
-
- StatsReportListener - Class in org.apache.spark.streaming.scheduler
-
:: DeveloperApi ::
A simple StreamingListener that logs summary statistics across Spark Streaming batches
- StatsReportListener(int) - Constructor for class org.apache.spark.streaming.scheduler.StatsReportListener
-
- statsSize() - Method in class org.apache.spark.mllib.tree.impurity.ImpurityAggregator
-
- status() - Method in class org.apache.spark.scheduler.TaskInfo
-
- status() - Method in interface org.apache.spark.SparkJobInfo
-
- status() - Method in class org.apache.spark.SparkJobInfoImpl
-
- status() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
-
- statusTracker() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- statusTracker() - Method in class org.apache.spark.SparkContext
-
- statusUpdate(SchedulerDriver, Protos.TaskStatus) - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- statusUpdate(SchedulerDriver, Protos.TaskStatus) - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- statusUpdate(long, Enumeration.Value, ByteBuffer) - Method in class org.apache.spark.scheduler.local.LocalBackend
-
- StatusUpdate - Class in org.apache.spark.scheduler.local
-
- StatusUpdate(long, Enumeration.Value, ByteBuffer) - Constructor for class org.apache.spark.scheduler.local.StatusUpdate
-
- statusUpdate(long, Enumeration.Value, ByteBuffer) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- std() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
-
- std() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
-
- stdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Compute the standard deviation of this RDD's elements.
- stdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Compute the standard deviation of this RDD's elements.
- stdev() - Method in class org.apache.spark.util.StatCounter
-
Return the standard deviation of the values.
- stop() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Shut down the SparkContext.
- stop() - Method in interface org.apache.spark.broadcast.BroadcastFactory
-
- stop() - Method in class org.apache.spark.broadcast.BroadcastManager
-
- stop() - Static method in class org.apache.spark.broadcast.HttpBroadcast
-
- stop() - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
-
- stop() - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
-
- stop() - Method in class org.apache.spark.ContextCleaner
-
Stop the cleaner.
- stop() - Method in class org.apache.spark.HttpFileServer
-
- stop() - Method in class org.apache.spark.HttpServer
-
- stop() - Method in class org.apache.spark.MapOutputTracker
-
Stop the tracker.
- stop() - Method in class org.apache.spark.MapOutputTrackerMaster
-
- stop() - Method in class org.apache.spark.metrics.MetricsSystem
-
- stop() - Method in class org.apache.spark.metrics.sink.ConsoleSink
-
- stop() - Method in class org.apache.spark.metrics.sink.CsvSink
-
- stop() - Method in class org.apache.spark.metrics.sink.GraphiteSink
-
- stop() - Method in class org.apache.spark.metrics.sink.JmxSink
-
- stop() - Method in class org.apache.spark.metrics.sink.MetricsServlet
-
- stop() - Method in interface org.apache.spark.metrics.sink.Sink
-
- stop(String) - Method in class org.apache.spark.mllib.tree.impl.TimeTracker
-
Stops a timer and returns the elapsed time in seconds.
- stop() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- stop() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- stop() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- stop() - Method in class org.apache.spark.scheduler.cluster.SimrSchedulerBackend
-
- stop() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- stop() - Method in class org.apache.spark.scheduler.DAGScheduler
-
- stop() - Method in class org.apache.spark.scheduler.EventLoggingListener
-
Stop logging events.
- stop() - Method in class org.apache.spark.scheduler.local.LocalBackend
-
- stop() - Method in class org.apache.spark.scheduler.OutputCommitCoordinator
-
- stop() - Method in interface org.apache.spark.scheduler.SchedulerBackend
-
- stop() - Method in class org.apache.spark.scheduler.TaskResultGetter
-
- stop() - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- stop() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- stop() - Method in class org.apache.spark.SparkContext
-
Shut down the SparkContext.
- stop() - Method in class org.apache.spark.SparkEnv
-
- stop() - Method in class org.apache.spark.storage.BlockManager
-
- stop() - Method in class org.apache.spark.storage.BlockManagerMaster
-
Stop the driver actor, called only on the Spark driver node
- stop() - Method in class org.apache.spark.storage.DiskBlockManager
-
Cleanup local dirs and stop shuffle sender.
- stop() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Stop the execution of the streams.
- stop(boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Stop the execution of the streams.
- stop(boolean, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Stop the execution of the streams.
- stop() - Method in class org.apache.spark.streaming.CheckpointWriter
-
- stop() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
-
- stop() - Method in class org.apache.spark.streaming.dstream.FileInputDStream
-
- stop() - Method in class org.apache.spark.streaming.dstream.InputDStream
-
Method called to stop receiving data.
- stop() - Method in class org.apache.spark.streaming.dstream.QueueInputDStream
-
- stop() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
- stop() - Method in class org.apache.spark.streaming.DStreamGraph
-
- stop() - Method in class org.apache.spark.streaming.kafka.DirectKafkaInputDStream
-
- stop() - Method in class org.apache.spark.streaming.receiver.BlockGenerator
-
Stop all threads.
- stop(String) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Stop the receiver completely.
- stop(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Stop the receiver completely due to an exception
- stop(String, Option<Throwable>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
-
Mark the supervisor and the receiver for stopping
- stop() - Method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler
-
- stop(boolean) - Method in class org.apache.spark.streaming.scheduler.JobGenerator
-
Stop generation of jobs.
- stop(boolean) - Method in class org.apache.spark.streaming.scheduler.JobScheduler
-
- stop() - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockTracker
-
Stop the block tracker.
- stop(boolean) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker.ReceiverLauncher
-
- stop(boolean) - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker
-
Stop the receiver execution thread.
- stop(boolean) - Method in class org.apache.spark.streaming.StreamingContext
-
Stop the execution of the streams immediately (does not wait for all received data
to be processed).
- stop(boolean, boolean) - Method in class org.apache.spark.streaming.StreamingContext
-
Stop the execution of the streams, with option of ensuring all received data
has been processed.
- stop(boolean) - Method in class org.apache.spark.streaming.util.RecurringTimer
-
Stop the timer, and return the last time the callback was made.
- stop() - Method in class org.apache.spark.streaming.util.WriteAheadLogManager
-
Stop the manager, close any open log writer
- stop() - Method in class org.apache.spark.ui.ConsoleProgressBar
-
Tear down the timer thread.
- stop() - Method in class org.apache.spark.ui.SparkUI
-
Stop the server behind this web interface.
- stop() - Method in class org.apache.spark.ui.WebUI
-
Stop the server behind this web interface.
- stop() - Method in class org.apache.spark.util.AsynchronousListenerBus
-
Stop the listener bus.
- stop() - Method in class org.apache.spark.util.EventLoop
-
- stop() - Method in class org.apache.spark.util.logging.FileAppender
-
Stop the appender
- stop() - Method in class org.apache.spark.util.logging.RollingFileAppender
-
Stop the appender
- StopCoordinator - Class in org.apache.spark.scheduler
-
- StopCoordinator() - Constructor for class org.apache.spark.scheduler.StopCoordinator
-
- StopExecutor - Class in org.apache.spark.scheduler.local
-
- StopExecutor() - Constructor for class org.apache.spark.scheduler.local.StopExecutor
-
- stopExecutors() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- StopMapOutputTracker - Class in org.apache.spark
-
- StopMapOutputTracker() - Constructor for class org.apache.spark.StopMapOutputTracker
-
- Stopped() - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor.ReceiverState
-
- Stopped() - Method in class org.apache.spark.streaming.StreamingContext.StreamingContextState$
-
- stopping() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- stopReceiver(String, Option<Throwable>) - Method in class org.apache.spark.streaming.receiver.ReceiverSupervisor
-
Stop receiver
- StopReceiver - Class in org.apache.spark.streaming.receiver
-
- StopReceiver() - Constructor for class org.apache.spark.streaming.receiver.StopReceiver
-
- storageLevel() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
-
- storageLevel() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
-
- storageLevel() - Method in class org.apache.spark.storage.BlockStatus
-
- storageLevel() - Method in class org.apache.spark.storage.RDDInfo
-
- StorageLevel - Class in org.apache.spark.storage
-
:: DeveloperApi ::
Flags for controlling the storage of an RDD.
- StorageLevel() - Constructor for class org.apache.spark.storage.StorageLevel
-
- storageLevel() - Method in class org.apache.spark.streaming.dstream.DStream
-
- storageLevel() - Method in class org.apache.spark.streaming.receiver.Receiver
-
- storageLevelCache() - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Read StorageLevel object from ObjectInput stream.
- storageLevelFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- StorageLevels - Class in org.apache.spark.api.java
-
Expose some commonly useful storage level constants.
- StorageLevels() - Constructor for class org.apache.spark.api.java.StorageLevels
-
- storageLevelToJson(StorageLevel) - Static method in class org.apache.spark.util.JsonProtocol
-
- storageListener() - Method in class org.apache.spark.ui.SparkUI
-
- StorageListener - Class in org.apache.spark.ui.storage
-
:: DeveloperApi ::
A SparkListener that prepares information to be displayed on the BlockManagerUI.
- StorageListener(StorageStatusListener) - Constructor for class org.apache.spark.ui.storage.StorageListener
-
- StoragePage - Class in org.apache.spark.ui.storage
-
Page showing the list of RDDs currently stored in the cluster
- StoragePage(StorageTab) - Constructor for class org.apache.spark.ui.storage.StoragePage
-
- StorageStatus - Class in org.apache.spark.storage
-
:: DeveloperApi ::
Storage information for each BlockManager.
- StorageStatus(BlockManagerId, long) - Constructor for class org.apache.spark.storage.StorageStatus
-
- StorageStatus(BlockManagerId, long, Map<BlockId, BlockStatus>) - Constructor for class org.apache.spark.storage.StorageStatus
-
Create a storage status with an initial set of blocks, leaving the source unmodified.
- storageStatusList() - Method in class org.apache.spark.storage.StorageStatusListener
-
- storageStatusList() - Method in class org.apache.spark.ui.exec.ExecutorsListener
-
- storageStatusList() - Method in class org.apache.spark.ui.storage.StorageListener
-
- StorageStatusListener - Class in org.apache.spark.storage
-
:: DeveloperApi ::
A SparkListener that maintains executor storage status.
- StorageStatusListener() - Constructor for class org.apache.spark.storage.StorageStatusListener
-
- storageStatusListener() - Method in class org.apache.spark.ui.SparkUI
-
- StorageTab - Class in org.apache.spark.ui.storage
-
Web UI showing storage status of all RDDs in the given SparkContext.
- StorageTab(SparkUI) - Constructor for class org.apache.spark.ui.storage.StorageTab
-
- StorageUtils - Class in org.apache.spark.storage
-
Helper methods for storage-related objects.
- StorageUtils() - Constructor for class org.apache.spark.storage.StorageUtils
-
- store(Iterator<T>) - Method in interface org.apache.spark.streaming.receiver.ActorHelper
-
Store an iterator of received data as a data block into Spark's memory.
- store(ByteBuffer) - Method in interface org.apache.spark.streaming.receiver.ActorHelper
-
Store the bytes of received data as a data block into Spark's memory.
- store(T) - Method in interface org.apache.spark.streaming.receiver.ActorHelper
-
Store a single item of received data to Spark's memory.
- store(T) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store a single item of received data to Spark's memory.
- store(ArrayBuffer<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an ArrayBuffer of received data as a data block into Spark's memory.
- store(ArrayBuffer<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an ArrayBuffer of received data as a data block into Spark's memory.
- store(Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an iterator of received data as a data block into Spark's memory.
- store(Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an iterator of received data as a data block into Spark's memory.
- store(Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an iterator of received data as a data block into Spark's memory.
- store(Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an iterator of received data as a data block into Spark's memory.
- store(ByteBuffer) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store the bytes of received data as a data block into Spark's memory.
- store(ByteBuffer, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store the bytes of received data as a data block into Spark's memory.
- storeBlock(StreamBlockId, ReceivedBlock) - Method in class org.apache.spark.streaming.receiver.BlockManagerBasedBlockHandler
-
- storeBlock(StreamBlockId, ReceivedBlock) - Method in interface org.apache.spark.streaming.receiver.ReceivedBlockHandler
-
Store a received block with the given block id and return related metadata
- storeBlock(StreamBlockId, ReceivedBlock) - Method in class org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler
-
This implementation stores the block into the block manager as well as a write ahead log.
- Strategy - Class in org.apache.spark.mllib.tree.configuration
-
:: Experimental ::
Stores all the configuration options for tree construction
- Strategy(Enumeration.Value, Impurity, int, int, int, Enumeration.Value, Map<Object, Object>, int, double, int, double, boolean, int) - Constructor for class org.apache.spark.mllib.tree.configuration.Strategy
-
- Strategy(Enumeration.Value, Impurity, int, int, int, Map<Integer, Integer>) - Constructor for class org.apache.spark.mllib.tree.configuration.Strategy
-
- STRATEGY_DEFAULT() - Static method in class org.apache.spark.util.logging.RollingFileAppender
-
- STRATEGY_PROPERTY() - Static method in class org.apache.spark.util.logging.RollingFileAppender
-
- StratifiedSamplingUtils - Class in org.apache.spark.util.random
-
Auxiliary functions and data structures for the sampleByKey method in PairRDDFunctions.
- StratifiedSamplingUtils() - Constructor for class org.apache.spark.util.random.StratifiedSamplingUtils
-
- STREAM() - Static method in class org.apache.spark.storage.BlockId
-
- StreamBasedRecordReader<T> - Class in org.apache.spark.input
-
An abstract class of RecordReader for reading files out as streams
- StreamBasedRecordReader(CombineFileSplit, TaskAttemptContext, Integer) - Constructor for class org.apache.spark.input.StreamBasedRecordReader
-
- StreamBlockId - Class in org.apache.spark.storage
-
- StreamBlockId(int, long) - Constructor for class org.apache.spark.storage.StreamBlockId
-
- StreamFileInputFormat<T> - Class in org.apache.spark.input
-
A general format for reading whole files in as streams, byte arrays,
or other functions to be added
- StreamFileInputFormat() - Constructor for class org.apache.spark.input.StreamFileInputFormat
-
- streamId() - Method in class org.apache.spark.storage.StreamBlockId
-
- streamId() - Method in class org.apache.spark.streaming.receiver.Receiver
-
Get the unique identifier of the receiver input stream that this
receiver is associated with.
- streamId() - Method in class org.apache.spark.streaming.scheduler.DeregisterReceiver
-
- streamId() - Method in class org.apache.spark.streaming.scheduler.ReceivedBlockInfo
-
- streamId() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- streamId() - Method in class org.apache.spark.streaming.scheduler.RegisterReceiver
-
- streamId() - Method in class org.apache.spark.streaming.scheduler.ReportError
-
- streamIdToAllocatedBlocks() - Method in class org.apache.spark.streaming.scheduler.AllocatedBlocks
-
- StreamingContext - Class in org.apache.spark.streaming
-
Main entry point for Spark Streaming functionality.
- StreamingContext(SparkContext, Checkpoint, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext
-
- StreamingContext(SparkContext, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext
-
Create a StreamingContext using an existing SparkContext.
- StreamingContext(SparkConf, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext
-
Create a StreamingContext by providing the configuration necessary for a new SparkContext.
- StreamingContext(String, String, Duration, String, Seq<String>, Map<String, String>) - Constructor for class org.apache.spark.streaming.StreamingContext
-
Create a StreamingContext by providing the details necessary for creating a new SparkContext.
- StreamingContext(String, Configuration) - Constructor for class org.apache.spark.streaming.StreamingContext
-
Recreate a StreamingContext from a checkpoint file.
- StreamingContext(String) - Constructor for class org.apache.spark.streaming.StreamingContext
-
Recreate a StreamingContext from a checkpoint file.
- StreamingContext.StreamingContextState$ - Class in org.apache.spark.streaming
-
Enumeration to identify current state of the StreamingContext
- StreamingContext.StreamingContextState$() - Constructor for class org.apache.spark.streaming.StreamingContext.StreamingContextState$
-
- StreamingContextState() - Method in class org.apache.spark.streaming.StreamingContext
-
Accessor for nested Scala object
- StreamingExamples - Class in org.apache.spark.examples.streaming
-
Utility functions for Spark Streaming examples.
- StreamingExamples() - Constructor for class org.apache.spark.examples.streaming.StreamingExamples
-
- StreamingJobProgressListener - Class in org.apache.spark.streaming.ui
-
- StreamingJobProgressListener(StreamingContext) - Constructor for class org.apache.spark.streaming.ui.StreamingJobProgressListener
-
- StreamingKMeans - Class in org.apache.spark.mllib.clustering
-
:: Experimental ::
- StreamingKMeans(int, double, String) - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeans
-
- StreamingKMeans() - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeans
-
- StreamingKMeansModel - Class in org.apache.spark.mllib.clustering
-
:: Experimental ::
- StreamingKMeansModel(Vector[], double[]) - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeansModel
-
- StreamingLinearAlgorithm<M extends GeneralizedLinearModel,A extends GeneralizedLinearAlgorithm<M>> - Class in org.apache.spark.mllib.regression
-
:: DeveloperApi ::
StreamingLinearAlgorithm implements methods for continuously
training a generalized linear model on streaming data,
and using it for prediction on (possibly different) streaming data.
- StreamingLinearAlgorithm() - Constructor for class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
- StreamingLinearRegressionWithSGD - Class in org.apache.spark.mllib.regression
-
:: Experimental ::
Train or predict a linear regression model on streaming data.
- StreamingLinearRegressionWithSGD(double, int, double) - Constructor for class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
- StreamingLinearRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
Construct a StreamingLinearRegression object with default parameters:
{stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0}.
- StreamingListener - Interface in org.apache.spark.streaming.scheduler
-
:: DeveloperApi ::
A listener interface for receiving information about an ongoing streaming
computation.
- StreamingListenerBatchCompleted - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerBatchCompleted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
-
- StreamingListenerBatchStarted - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerBatchStarted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
-
- StreamingListenerBatchSubmitted - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerBatchSubmitted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
-
- StreamingListenerBus - Class in org.apache.spark.streaming.scheduler
-
Asynchronously passes StreamingListenerEvents to registered StreamingListeners.
- StreamingListenerBus() - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBus
-
- StreamingListenerEvent - Interface in org.apache.spark.streaming.scheduler
-
:: DeveloperApi ::
Base trait for events related to StreamingListener
- StreamingListenerReceiverError - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerReceiverError(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
-
- StreamingListenerReceiverStarted - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerReceiverStarted(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
-
- StreamingListenerReceiverStopped - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerReceiverStopped(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
-
- StreamingLogisticRegressionWithSGD - Class in org.apache.spark.mllib.classification
-
:: Experimental ::
Train or predict a logistic regression model on streaming data.
- StreamingLogisticRegressionWithSGD(double, int, double, double) - Constructor for class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
-
- StreamingLogisticRegressionWithSGD() - Constructor for class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
-
Construct a StreamingLogisticRegression object with default parameters:
{stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0, regParam: 0.0}.
- StreamingPage - Class in org.apache.spark.streaming.ui
-
Page for Spark Web UI that shows statistics of a streaming job
- StreamingPage(StreamingTab) - Constructor for class org.apache.spark.streaming.ui.StreamingPage
-
- StreamingSource - Class in org.apache.spark.streaming
-
- StreamingSource(StreamingContext) - Constructor for class org.apache.spark.streaming.StreamingSource
-
- StreamingTab - Class in org.apache.spark.streaming.ui
-
Spark Web UI tab that shows statistics of a streaming job.
- StreamingTab(StreamingContext) - Constructor for class org.apache.spark.streaming.ui.StreamingTab
-
- StreamInputFormat - Class in org.apache.spark.input
-
The format for the PortableDataStream files
- StreamInputFormat() - Constructor for class org.apache.spark.input.StreamInputFormat
-
- StreamRecordReader - Class in org.apache.spark.input
-
Reads the record in directly as a stream for other objects to manipulate and handle
- StreamRecordReader(CombineFileSplit, TaskAttemptContext, Integer) - Constructor for class org.apache.spark.input.StreamRecordReader
-
- STRING - Class in org.apache.spark.sql.columnar
-
- STRING() - Constructor for class org.apache.spark.sql.columnar.STRING
-
- string() - Method in class org.apache.spark.sql.ColumnName
-
Creates a new AttributeReference of type string
- StringColumnAccessor - Class in org.apache.spark.sql.columnar
-
- StringColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.StringColumnAccessor
-
- StringColumnBuilder - Class in org.apache.spark.sql.columnar
-
- StringColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.StringColumnBuilder
-
- StringColumnStats - Class in org.apache.spark.sql.columnar
-
- StringColumnStats() - Constructor for class org.apache.spark.sql.columnar.StringColumnStats
-
- StringConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
-
Accessor for nested Scala object
- stringifyPartialValue(Object) - Static method in class org.apache.spark.Accumulators
-
- stringifyValue(Object) - Static method in class org.apache.spark.Accumulators
-
- stringRddToDataFrameHolder(RDD<String>) - Method in class org.apache.spark.sql.SQLContext.implicits
-
Creates a single column DataFrame from an RDD[String].
- stringToText(String) - Static method in class org.apache.spark.SparkContext
-
- stringWritableConverter() - Static method in class org.apache.spark.SparkContext
-
- stringWritableConverter() - Static method in class org.apache.spark.WritableConverter
-
- stringWritableFactory() - Static method in class org.apache.spark.WritableFactory
-
- stripDirectory(String) - Static method in class org.apache.spark.util.Utils
-
Strip the directory from a path name
- stronglyConnectedComponents(int) - Method in class org.apache.spark.graphx.GraphOps
-
Compute the strongly connected component (SCC) of each vertex and return a graph with the
vertex value containing the lowest vertex id in the SCC containing that vertex.
- StronglyConnectedComponents - Class in org.apache.spark.graphx.lib
-
Strongly connected components algorithm implementation.
- StronglyConnectedComponents() - Constructor for class org.apache.spark.graphx.lib.StronglyConnectedComponents
-
- struct(Seq<StructField>) - Method in class org.apache.spark.sql.ColumnName
-
Creates a new AttributeReference of type struct
- struct(StructType) - Method in class org.apache.spark.sql.ColumnName
-
- StudentTCacher - Class in org.apache.spark.partial
-
A utility class for caching Student's T distribution values for a given confidence level
and various sample sizes.
- StudentTCacher(double) - Constructor for class org.apache.spark.partial.StudentTCacher
-
- subDirsPerLocalDir() - Method in class org.apache.spark.storage.DiskBlockManager
-
- subgraph(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.Graph
-
Restricts the graph to only the vertices and edges satisfying the predicates.
- subgraph(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- submissionTime() - Method in class org.apache.spark.scheduler.StageInfo
-
When this stage was submitted from the DAGScheduler to a TaskScheduler.
- submissionTime() - Method in interface org.apache.spark.SparkStageInfo
-
- submissionTime() - Method in class org.apache.spark.SparkStageInfoImpl
-
- submissionTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
- submissionTime() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
-
- submitJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, CallSite, boolean, Function2<Object, U, BoxedUnit>, Properties) - Method in class org.apache.spark.scheduler.DAGScheduler
-
Submit a job to the job scheduler and get a JobWaiter object back.
- submitJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in class org.apache.spark.SparkContext
-
:: Experimental ::
Submit a job for execution and return a FutureJob holding the result.
- submitJobSet(JobSet) - Method in class org.apache.spark.streaming.scheduler.JobScheduler
-
- submitTasks(TaskSet) - Method in interface org.apache.spark.scheduler.TaskScheduler
-
- submitTasks(TaskSet) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- subProperties(Properties, Regex) - Method in class org.apache.spark.metrics.MetricsConfig
-
- subsampleWeights() - Method in class org.apache.spark.mllib.tree.impl.BaggedPoint
-
- subsamplingFeatures() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
-
Indicates if feature subsampling is being used.
- subsamplingRate() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- subsetAccuracy() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns subset accuracy (for equal sets of labels).
- substr(Column, Column) - Method in class org.apache.spark.sql.Column
-
An expression that returns a substring.
- substr(int, int) - Method in class org.apache.spark.sql.Column
-
An expression that returns a substring.
- SUBSTR() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- subTestSchema() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
-
- subTestSchemaFieldNames() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
-
- subtract(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(JavaDoubleRDD, int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(JavaDoubleRDD, Partitioner) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(JavaPairRDD<K, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(JavaPairRDD<K, V>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(JavaRDD<T>, int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(JavaRDD<T>, Partitioner) - Method in class org.apache.spark.api.java.JavaRDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(ImpurityCalculator) - Method in class org.apache.spark.mllib.tree.impurity.ImpurityCalculator
-
Subtract the stats from another calculator from this one, modifying and returning this
calculator.
- subtract(RDD<T>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(RDD<T>, int) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(RDD<T>, Partitioner, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(long, long) - Static method in class org.apache.spark.streaming.util.RawTextHelper
-
- subtract(Vector) - Method in class org.apache.spark.util.Vector
-
- subtractByKey(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the pairs from `this` whose keys are not in `other`.
- subtractByKey(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the pairs from `this` whose keys are not in `other`.
- subtractByKey(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the pairs from `this` whose keys are not in `other`.
- subtractByKey(RDD<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return an RDD with the pairs from `this` whose keys are not in `other`.
- subtractByKey(RDD<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return an RDD with the pairs from `this` whose keys are not in `other`.
- subtractByKey(RDD<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return an RDD with the pairs from `this` whose keys are not in `other`.
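The `subtractByKey` entries above all describe the same set difference on keys. A minimal sketch of that semantics follows, with plain Java lists standing in for RDDs; the class `SubtractByKeySketch` and its helper are hypothetical illustrations and not part of the Spark API.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: in-memory lists stand in for pair RDDs.
public class SubtractByKeySketch {

    // Keeps every (key, value) pair of `self` whose key never occurs in `other`.
    static <K, V, W> List<Map.Entry<K, V>> subtractByKey(
            List<Map.Entry<K, V>> self, List<Map.Entry<K, W>> other) {
        // Collect the keys present on the right-hand side.
        Set<K> otherKeys = new HashSet<>();
        for (Map.Entry<K, W> e : other) {
            otherKeys.add(e.getKey());
        }
        // Retain only the left-hand pairs whose key is absent from that set.
        List<Map.Entry<K, V>> result = new ArrayList<>();
        for (Map.Entry<K, V> e : self) {
            if (!otherKeys.contains(e.getKey())) {
                result.add(e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> left = new ArrayList<>();
        left.add(new SimpleEntry<>("a", 1));
        left.add(new SimpleEntry<>("b", 2));
        left.add(new SimpleEntry<>("a", 3));

        List<Map.Entry<String, String>> right = new ArrayList<>();
        right.add(new SimpleEntry<>("a", "x"));

        // Only the pair keyed by "b" survives, because "a" occurs in `right`.
        System.out.println(subtractByKey(left, right));
    }
}
```

The value types of the two sides may differ (`V` and `W`), which mirrors the signatures listed above: only the keys of `other` matter.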
- SubtractedRDD<K,V,W> - Class in org.apache.spark.rdd
-
An optimized version of cogroup for set difference/subtraction.
- SubtractedRDD(RDD<? extends Product2<K, V>>, RDD<? extends Product2<K, W>>, Partitioner, ClassTag<K>, ClassTag<V>, ClassTag<W>) - Constructor for class org.apache.spark.rdd.SubtractedRDD
-
- subtreeDepth() - Method in class org.apache.spark.mllib.tree.model.Node
-
Get depth of tree from this node.
- subtreeIterator() - Method in class org.apache.spark.mllib.tree.model.Node
-
Returns an iterator that traverses (DFS, left to right) the subtree of this node.
- subtreeToString(int) - Method in class org.apache.spark.mllib.tree.model.Node
-
Recursive print function.
- succeededTasks() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
-
- success() - Method in class org.apache.spark.storage.ResultWithDroppedBlocks
-
- Success - Class in org.apache.spark
-
:: DeveloperApi ::
Task succeeded.
- Success() - Constructor for class org.apache.spark.Success
-
- successful() - Method in class org.apache.spark.scheduler.TaskInfo
-
- successful() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- SUCCESSFUL_JOB_OUTPUT_DIR_MARKER() - Static method in class org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer
-
- sufficientResourcesRegistered() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- sufficientResourcesRegistered() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- sufficientResourcesRegistered() - Method in class org.apache.spark.scheduler.cluster.YarnSchedulerBackend
-
- sum() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Add up the elements in this RDD.
- Sum() - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
-
- sum() - Method in class org.apache.spark.partial.CountEvaluator
-
- sum() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Add up the elements in this RDD.
- sum(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the sum of all values in the expression.
- sum(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the sum of all values in the given column.
- sum(String...) - Method in class org.apache.spark.sql.GroupedData
-
Compute the sum of each numeric column for each group.
- sum(Seq<String>) - Method in class org.apache.spark.sql.GroupedData
-
Compute the sum of each numeric column for each group.
- SUM() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- sum() - Method in class org.apache.spark.util.StatCounter
-
- sum() - Method in class org.apache.spark.util.Vector
-
- sumApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
:: Experimental ::
Approximate operation to return the sum within a timeout.
- sumApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
:: Experimental ::
Approximate operation to return the sum within a timeout.
- sumApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
:: Experimental ::
Approximate operation to return the sum within a timeout.
- sumDistinct(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the sum of distinct values in the expression.
- sumDistinct(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the sum of distinct values in the expression.
- SumEvaluator - Class in org.apache.spark.partial
-
An ApproximateEvaluator for sums.
- SumEvaluator(int, double) - Constructor for class org.apache.spark.partial.SumEvaluator
-
- summary(PrintStream) - Method in class org.apache.spark.util.Distribution
-
Print a summary of this distribution to the given PrintStream.
- sums() - Method in class org.apache.spark.partial.GroupedCountEvaluator
-
- sums() - Method in class org.apache.spark.partial.GroupedMeanEvaluator
-
- sums() - Method in class org.apache.spark.partial.GroupedSumEvaluator
-
- supervisorStrategy() - Method in class org.apache.spark.streaming.receiver.ActorReceiver.Supervisor
-
- supportedFeatureSubsetStrategies() - Static method in class org.apache.spark.mllib.tree.RandomForest
-
List of supported feature subset sampling strategies.
- supports(ColumnType<?, ?>) - Static method in class org.apache.spark.sql.columnar.compression.BooleanBitSet
-
- supports(ColumnType<?, ?>) - Method in interface org.apache.spark.sql.columnar.compression.CompressionScheme
-
- supports(ColumnType<?, ?>) - Static method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding
-
- supports(ColumnType<?, ?>) - Static method in class org.apache.spark.sql.columnar.compression.IntDelta
-
- supports(ColumnType<?, ?>) - Static method in class org.apache.spark.sql.columnar.compression.LongDelta
-
- supports(ColumnType<?, ?>) - Static method in class org.apache.spark.sql.columnar.compression.PassThrough
-
- supports(ColumnType<?, ?>) - Static method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding
-
- SVDPlusPlus - Class in org.apache.spark.graphx.lib
-
Implementation of SVD++ algorithm.
- SVDPlusPlus() - Constructor for class org.apache.spark.graphx.lib.SVDPlusPlus
-
- SVDPlusPlus.Conf - Class in org.apache.spark.graphx.lib
-
Configuration parameters for SVDPlusPlus.
- SVDPlusPlus.Conf(int, int, double, double, double, double, double, double) - Constructor for class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- SVMDataGenerator - Class in org.apache.spark.mllib.util
-
:: DeveloperApi ::
Generate sample data used for SVM.
- SVMDataGenerator() - Constructor for class org.apache.spark.mllib.util.SVMDataGenerator
-
- SVMModel - Class in org.apache.spark.mllib.classification
-
Model for Support Vector Machines (SVMs).
- SVMModel(Vector, double) - Constructor for class org.apache.spark.mllib.classification.SVMModel
-
- SVMWithSGD - Class in org.apache.spark.mllib.classification
-
Train a Support Vector Machine (SVM) using Stochastic Gradient Descent.
- SVMWithSGD() - Constructor for class org.apache.spark.mllib.classification.SVMWithSGD
-
Construct an SVM object with default parameters: {stepSize: 1.0, numIterations: 100,
regParam: 0.01, miniBatchFraction: 1.0}.
- symbolToColumn(Symbol) - Method in class org.apache.spark.sql.SQLContext.implicits
-
An implicit conversion that turns a Scala `Symbol` into a `Column`.
- symlink(File, File) - Static method in class org.apache.spark.util.Utils
-
Creates a symlink.
- symmetricEigs(Function1<DenseVector<Object>, DenseVector<Object>>, int, int, double, int) - Static method in class org.apache.spark.mllib.linalg.EigenValueDecomposition
-
Compute the leading k eigenvalues and eigenvectors on a symmetric square matrix using ARPACK.
- syr(double, Vector, DenseMatrix) - Static method in class org.apache.spark.mllib.linalg.BLAS
-
A := alpha * x * x^T + A
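The update formula for `syr` can be spelled out element by element: entry (i, j) of A grows by alpha * x[i] * x[j]. The sketch below is a hypothetical stand-alone illustration on plain `double` arrays, not the `BLAS.syr` implementation, which takes a `Vector` and a `DenseMatrix`.

```java
// Hypothetical illustration only: plain arrays stand in for Vector and DenseMatrix.
public class SyrSketch {

    // In-place symmetric rank-1 update: a[i][j] += alpha * x[i] * x[j].
    static void syr(double alpha, double[] x, double[][] a) {
        int n = x.length;
        if (a.length != n) {
            throw new IllegalArgumentException("A must have " + n + " rows");
        }
        for (int i = 0; i < n; i++) {
            if (a[i].length != n) {
                throw new IllegalArgumentException("A must be square");
            }
            for (int j = 0; j < n; j++) {
                a[i][j] += alpha * x[i] * x[j];
            }
        }
    }

    public static void main(String[] args) {
        double[][] a = {{1.0, 0.0}, {0.0, 1.0}};
        double[] x = {1.0, 2.0};
        syr(2.0, x, a);
        // a is now {{3.0, 4.0}, {4.0, 9.0}}: the identity plus 2 * x * x^T.
        System.out.println(java.util.Arrays.deepToString(a));
    }
}
```

Because x * x^T is symmetric, a symmetric A stays symmetric after the update.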
- SystemClock - Class in org.apache.spark.util
-
A clock backed by the actual time from the OS as reported by the `System` API.
- SystemClock() - Constructor for class org.apache.spark.util.SystemClock
-
- systemProperties() - Method in class org.apache.spark.ui.env.EnvironmentListener
-
- systemProperty(Enumeration.Value) - Static method in class org.apache.spark.util.MetadataCleanerType
-
- t() - Method in class org.apache.spark.SerializableWritable
-
- table() - Method in class org.apache.spark.sql.hive.execution.DescribeHiveTableCommand
-
- table() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
-
- table() - Method in class org.apache.spark.sql.hive.InsertIntoHiveTable
-
- table() - Method in class org.apache.spark.sql.hive.MetastoreRelation
-
- table() - Method in class org.apache.spark.sql.jdbc.JDBCRelation
-
- table() - Method in class org.apache.spark.sql.sources.DescribeCommand
-
- table(String) - Method in class org.apache.spark.sql.SQLContext
-
- TABLE_CLASS_NOT_STRIPED() - Static method in class org.apache.spark.ui.UIUtils
-
- TABLE_CLASS_STRIPED() - Static method in class org.apache.spark.ui.UIUtils
-
- tableDesc() - Method in class org.apache.spark.sql.hive.MetastoreRelation
-
- tableExists(Seq<String>) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
-
- tableInfo() - Method in class org.apache.spark.sql.hive.ShimFileSinkDesc
-
- tableName() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
-
- tableName() - Method in class org.apache.spark.sql.hive.execution.AnalyzeTable
-
- tableName() - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSource
-
- tableName() - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSourceAsSelect
-
- tableName() - Method in class org.apache.spark.sql.hive.execution.CreateTableAsSelect
-
- tableName() - Method in class org.apache.spark.sql.hive.execution.DropTable
-
- tableName() - Method in class org.apache.spark.sql.hive.MetastoreRelation
-
- tableName() - Method in class org.apache.spark.sql.sources.CreateTableUsing
-
- tableName() - Method in class org.apache.spark.sql.sources.CreateTableUsingAsSelect
-
- tableName() - Method in class org.apache.spark.sql.sources.CreateTempTableUsing
-
- tableName() - Method in class org.apache.spark.sql.sources.CreateTempTableUsingAsSelect
-
- tableName() - Method in class org.apache.spark.sql.sources.RefreshTable
-
- tableNames() - Method in class org.apache.spark.sql.SQLContext
-
Returns the names of tables in the current database as an array.
- tableNames(String) - Method in class org.apache.spark.sql.SQLContext
-
Returns the names of tables in the given database as an array.
- TableReader - Interface in org.apache.spark.sql.hive
-
A trait for subclasses that handle table scans.
- tables() - Method in class org.apache.spark.sql.SQLContext
-
Returns a `DataFrame` containing names of existing tables in the current database.
- tables(String) - Method in class org.apache.spark.sql.SQLContext
-
Returns a `DataFrame` containing names of existing tables in the given database.
- TableScan - Interface in org.apache.spark.sql.sources
-
:: DeveloperApi ::
A BaseRelation that can produce all of its tuples as an RDD of Row objects.
- TachyonBlockManager - Class in org.apache.spark.storage
-
Creates and maintains the logical mapping between logical blocks and Tachyon file system locations.
- TachyonBlockManager(BlockManager, String, String) - Constructor for class org.apache.spark.storage.TachyonBlockManager
-
- TachyonFileSegment - Class in org.apache.spark.storage
-
References a particular segment of a file (potentially the entire file), based off an offset and
a length.
- TachyonFileSegment(TachyonFile, long, long) - Constructor for class org.apache.spark.storage.TachyonFileSegment
-
- tachyonFolderName() - Method in class org.apache.spark.SparkContext
-
- tachyonSize() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
-
- tachyonSize() - Method in class org.apache.spark.storage.BlockStatus
-
- tachyonSize() - Method in class org.apache.spark.storage.RDDInfo
-
- tachyonStore() - Method in class org.apache.spark.storage.BlockManager
-
- TachyonStore - Class in org.apache.spark.storage
-
Stores BlockManager blocks on Tachyon.
- TachyonStore(BlockManager, TachyonBlockManager) - Constructor for class org.apache.spark.storage.TachyonStore
-
- tail() - Method in class org.apache.spark.mllib.rdd.SlidingRDDPartition
-
- take(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Take the first num elements of the RDD.
- take(int) - Method in class org.apache.spark.rdd.RDD
-
Take the first num elements of the RDD.
- take(int) - Method in class org.apache.spark.sql.DataFrame
-
- take(int) - Method in interface org.apache.spark.sql.RDDApi
-
- takeAsync(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
The asynchronous version of the `take` action, which returns a
future for retrieving the first `num` elements of this RDD.
- takeAsync(int) - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Returns a future for retrieving the first num elements of the RDD.
- takeOrdered(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the first k (smallest) elements from this RDD as defined by
the specified Comparator[T] and maintains the order.
- takeOrdered(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the first k (smallest) elements from this RDD using the
natural ordering for T while maintaining the order.
- takeOrdered(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Returns the first k (smallest) elements from this RDD as defined by the specified
implicit Ordering[T] and maintains the ordering.
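The `takeOrdered` entries above describe returning the k smallest elements, in sorted order, under a given ordering. A minimal sketch of that behavior follows, with a plain Java list standing in for an RDD; the class `TakeOrderedSketch` is a hypothetical illustration and says nothing about how Spark computes the result across partitions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only: an in-memory list stands in for an RDD.
public class TakeOrderedSketch {

    // Returns the k smallest elements under `cmp`, in ascending order.
    static <T> List<T> takeOrdered(List<T> items, int k, Comparator<? super T> cmp) {
        List<T> sorted = new ArrayList<>(items);   // copy so the input is untouched
        sorted.sort(cmp);
        int end = Math.max(0, Math.min(k, sorted.size()));
        return new ArrayList<>(sorted.subList(0, end));
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(5, 1, 4, 2, 3);
        // Natural ordering: the two smallest values, 1 and 2.
        System.out.println(takeOrdered(data, 2, Comparator.<Integer>naturalOrder()));
        // Reversed ordering: "smallest" now means largest, so 5 and 4.
        System.out.println(takeOrdered(data, 2, Comparator.<Integer>reverseOrder()));
    }
}
```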
- takeSample(boolean, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- takeSample(boolean, int, long) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- takeSample(boolean, int, long) - Method in class org.apache.spark.rdd.RDD
-
Return a fixed-size sampled subset of this RDD in an array
- targetStorageLevel() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- targetStorageLevel() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- task() - Method in class org.apache.spark.CleanupTaskWeakReference
-
- task() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
-
- task() - Method in class org.apache.spark.scheduler.BeginEvent
-
- task() - Method in class org.apache.spark.scheduler.CompletionEvent
-
- Task<T> - Class in org.apache.spark.scheduler
-
A unit of execution.
- Task(int, int) - Constructor for class org.apache.spark.scheduler.Task
-
- TASK_DESERIALIZATION_TIME() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
-
- TASK_DESERIALIZATION_TIME() - Static method in class org.apache.spark.ui.ToolTips
-
- TASK_SIZE_TO_WARN_KB() - Static method in class org.apache.spark.scheduler.TaskSetManager
-
- taskAttempt() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
-
- taskAttemptId() - Method in class org.apache.spark.TaskContext
-
An ID that is unique to this task attempt (within the same SparkContext, no two task attempts
will share the same attempt ID).
- taskAttemptId() - Method in class org.apache.spark.TaskContextImpl
-
- taskAttempts() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- TaskCommitDenied - Class in org.apache.spark
-
:: DeveloperApi ::
Task requested the driver to commit, but was denied.
- TaskCommitDenied(int, int, int) - Constructor for class org.apache.spark.TaskCommitDenied
-
- taskCompleted(int, long, long, TaskEndReason) - Method in class org.apache.spark.scheduler.OutputCommitCoordinator
-
- TaskCompletionListener - Interface in org.apache.spark.util
-
:: DeveloperApi ::
- TaskCompletionListenerException - Exception in org.apache.spark.util
-
Exception thrown when there is an exception in
executing the callback in TaskCompletionListener.
- TaskCompletionListenerException(Seq<String>) - Constructor for exception org.apache.spark.util.TaskCompletionListenerException
-
- TaskContext - Class in org.apache.spark
-
Contextual information about a task which can be read or mutated during
execution.
- TaskContext() - Constructor for class org.apache.spark.TaskContext
-
- TaskContextHelper - Class in org.apache.spark
-
This class exists to restrict the visibility of TaskContext setters.
- TaskContextHelper() - Constructor for class org.apache.spark.TaskContextHelper
-
- TaskContextImpl - Class in org.apache.spark
-
- TaskContextImpl(int, int, long, int, boolean, TaskMetrics) - Constructor for class org.apache.spark.TaskContextImpl
-
- taskData() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
-
- TaskDescription - Class in org.apache.spark.scheduler
-
Description of a task that gets passed onto executors to be executed, usually created by
`TaskSetManager.resourceOffer`.
- TaskDescription(long, int, String, String, int, ByteBuffer) - Constructor for class org.apache.spark.scheduler.TaskDescription
-
- TaskDetailsClassNames - Class in org.apache.spark.ui.jobs
-
Names of the CSS classes corresponding to each type of task detail.
- TaskDetailsClassNames() - Constructor for class org.apache.spark.ui.jobs.TaskDetailsClassNames
-
- taskEnded(Task<?>, TaskEndReason, Object, Map<Object, Object>, TaskInfo, TaskMetrics) - Method in class org.apache.spark.scheduler.DAGScheduler
-
- taskEndFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- TaskEndReason - Interface in org.apache.spark
-
:: DeveloperApi ::
Various possible reasons why a task ended.
- taskEndReasonFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- taskEndReasonToJson(TaskEndReason) - Static method in class org.apache.spark.util.JsonProtocol
-
- taskEndToJson(SparkListenerTaskEnd) - Static method in class org.apache.spark.util.JsonProtocol
-
- TaskFailedReason - Interface in org.apache.spark
-
:: DeveloperApi ::
Various possible reasons why a task failed.
- taskGettingResult(TaskInfo) - Method in class org.apache.spark.scheduler.DAGScheduler
-
- taskGettingResultFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- taskGettingResultToJson(SparkListenerTaskGettingResult) - Static method in class org.apache.spark.util.JsonProtocol
-
- taskId() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask
-
- taskId() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
-
- taskId() - Method in class org.apache.spark.scheduler.local.KillTask
-
- taskId() - Method in class org.apache.spark.scheduler.local.StatusUpdate
-
- taskId() - Method in class org.apache.spark.scheduler.TaskDescription
-
- taskId() - Method in class org.apache.spark.scheduler.TaskInfo
-
- taskId() - Method in class org.apache.spark.storage.TaskResultBlockId
-
- taskIdsOnSlave() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- taskIdToExecutorId() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- taskIdToSlaveId() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- taskIdToSlaveId() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend
-
- taskIdToTaskSetId() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- taskInfo() - Method in class org.apache.spark.scheduler.BeginEvent
-
- taskInfo() - Method in class org.apache.spark.scheduler.CompletionEvent
-
- taskInfo() - Method in class org.apache.spark.scheduler.GettingResultEvent
-
- taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskGettingResult
-
- taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
-
- TaskInfo - Class in org.apache.spark.scheduler
-
:: DeveloperApi ::
Information about a running task attempt inside a TaskSet.
- TaskInfo(long, int, int, long, String, String, Enumeration.Value, boolean) - Constructor for class org.apache.spark.scheduler.TaskInfo
-
- taskInfo() - Method in class org.apache.spark.ui.jobs.UIData.TaskUIData
-
- taskInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- taskInfos() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- taskInfoToJson(TaskInfo) - Static method in class org.apache.spark.util.JsonProtocol
-
- TaskKilled - Class in org.apache.spark
-
:: DeveloperApi ::
Task was killed intentionally and needs to be rescheduled.
- TaskKilled() - Constructor for class org.apache.spark.TaskKilled
-
- TaskKilledException - Exception in org.apache.spark
-
:: DeveloperApi ::
Exception thrown when a task is explicitly killed (i.e., task failure is expected).
- TaskKilledException() - Constructor for exception org.apache.spark.TaskKilledException
-
- taskLocality() - Method in class org.apache.spark.scheduler.TaskInfo
-
- TaskLocality - Class in org.apache.spark.scheduler
-
- TaskLocality() - Constructor for class org.apache.spark.scheduler.TaskLocality
-
- TaskLocation - Interface in org.apache.spark.scheduler
-
A location where a task should run.
- taskMetrics() - Method in class org.apache.spark.Heartbeat
-
- taskMetrics() - Method in class org.apache.spark.scheduler.CompletionEvent
-
- taskMetrics() - Method in class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
-
- taskMetrics() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- taskMetrics() - Method in class org.apache.spark.TaskContext
-
:: DeveloperApi ::
- taskMetrics() - Method in class org.apache.spark.TaskContextImpl
-
- taskMetrics() - Method in class org.apache.spark.ui.jobs.UIData.TaskUIData
-
- taskMetricsFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- taskMetricsToJson(TaskMetrics) - Static method in class org.apache.spark.util.JsonProtocol
-
- TaskNotSerializableException - Exception in org.apache.spark
-
Exception thrown when a task cannot be serialized.
- TaskNotSerializableException(Throwable) - Constructor for exception org.apache.spark.TaskNotSerializableException
-
- TaskResult<T> - Interface in org.apache.spark.scheduler
-
- TASKRESULT() - Static method in class org.apache.spark.storage.BlockId
-
- TaskResultBlockId - Class in org.apache.spark.storage
-
- TaskResultBlockId(long) - Constructor for class org.apache.spark.storage.TaskResultBlockId
-
- TaskResultGetter - Class in org.apache.spark.scheduler
-
Runs a thread pool that deserializes and remotely fetches (if necessary) task results.
- TaskResultGetter(SparkEnv, TaskSchedulerImpl) - Constructor for class org.apache.spark.scheduler.TaskResultGetter
-
- taskResultGetter() - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
- TaskResultLost - Class in org.apache.spark
-
:: DeveloperApi ::
The task finished successfully, but the result was lost from the executor's block manager before
it was fetched.
- TaskResultLost() - Constructor for class org.apache.spark.TaskResultLost
-
- taskRow(boolean, boolean, boolean, boolean, boolean, boolean, UIData.TaskUIData) - Method in class org.apache.spark.ui.jobs.StagePage
-
- tasks() - Method in class org.apache.spark.scheduler.TaskSet
-
- tasks() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- taskScheduler() - Method in class org.apache.spark.scheduler.DAGScheduler
-
- TaskScheduler - Interface in org.apache.spark.scheduler
-
Low-level task scheduler interface, currently implemented exclusively by TaskSchedulerImpl.
- taskScheduler() - Method in class org.apache.spark.SparkContext
-
- TaskSchedulerImpl - Class in org.apache.spark.scheduler
-
Schedules tasks for multiple types of clusters by acting through a SchedulerBackend.
- TaskSchedulerImpl(SparkContext, int, boolean) - Constructor for class org.apache.spark.scheduler.TaskSchedulerImpl
-
- TaskSchedulerImpl(SparkContext) - Constructor for class org.apache.spark.scheduler.TaskSchedulerImpl
-
- TaskSet - Class in org.apache.spark.scheduler
-
A set of tasks submitted together to the low-level TaskScheduler, usually representing
missing partitions of a particular stage.
- TaskSet(Task<?>[], int, int, int, Properties) - Constructor for class org.apache.spark.scheduler.TaskSet
-
- taskSet() - Method in class org.apache.spark.scheduler.TaskSetFailed
-
- taskSet() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- taskSetFailed(TaskSet, String) - Method in class org.apache.spark.scheduler.DAGScheduler
-
- TaskSetFailed - Class in org.apache.spark.scheduler
-
- TaskSetFailed(TaskSet, String) - Constructor for class org.apache.spark.scheduler.TaskSetFailed
-
- taskSetFinished(TaskSetManager) - Method in class org.apache.spark.scheduler.TaskSchedulerImpl
-
Called to indicate that all task attempts (including speculated tasks) associated with the
given TaskSetManager have completed, so state associated with the TaskSetManager should be
cleaned up.
- TaskSetManager - Class in org.apache.spark.scheduler
-
Schedules the tasks within a single TaskSet in the TaskSchedulerImpl.
- TaskSetManager(TaskSchedulerImpl, TaskSet, int, Clock) - Constructor for class org.apache.spark.scheduler.TaskSetManager
-
- taskSetSchedulingAlgorithm() - Method in class org.apache.spark.scheduler.Pool
-
- tasksSuccessful() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- taskStarted(Task<?>, TaskInfo) - Method in class org.apache.spark.scheduler.DAGScheduler
-
- taskStartFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- taskStartToJson(SparkListenerTaskStart) - Static method in class org.apache.spark.util.JsonProtocol
-
- TaskState - Class in org.apache.spark
-
- TaskState() - Constructor for class org.apache.spark.TaskState
-
- taskSucceeded(int, Object) - Method in class org.apache.spark.partial.ApproximateActionListener
-
- taskSucceeded(int, Object) - Method in interface org.apache.spark.scheduler.JobListener
-
- taskSucceeded(int, Object) - Method in class org.apache.spark.scheduler.JobWaiter
-
- taskTime() - Method in class org.apache.spark.ui.jobs.UIData.ExecutorSummary
-
- taskType() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- tellMaster() - Method in class org.apache.spark.storage.BlockInfo
-
- TempLocalBlockId - Class in org.apache.spark.storage
-
Id associated with temporary local data managed as blocks.
- TempLocalBlockId(UUID) - Constructor for class org.apache.spark.storage.TempLocalBlockId
-
- temporary() - Method in class org.apache.spark.sql.sources.CreateTableUsing
-
- temporary() - Method in class org.apache.spark.sql.sources.CreateTableUsingAsSelect
-
- TempShuffleBlockId - Class in org.apache.spark.storage
-
Id associated with temporary shuffle data managed as blocks.
- TempShuffleBlockId(UUID) - Constructor for class org.apache.spark.storage.TempShuffleBlockId
-
- term2index(int) - Static method in class org.apache.spark.mllib.clustering.LDA
-
Term vertex IDs are {-1, -2, ..., -vocabSize}
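The ID scheme above can be sketched in plain Java. This is an illustration only, not Spark's code: it assumes zero-based term numbers, so term 0 maps to -1 and term vocabSize - 1 maps to -vocabSize. The class name TermIndexSketch and the inverse helper index2term are hypothetical names used for the example.

```java
// Illustrative sketch only: assumes zero-based term numbers.
final class TermIndexSketch {
    // Maps term t (0 .. vocabSize - 1) to a negative vertex ID (-1 .. -vocabSize).
    static long term2index(int term) {
        return -(1L + term);
    }

    // Hypothetical inverse helper: recovers the term number from a negative vertex ID.
    static int index2term(long termIndex) {
        return (int) -(1L + termIndex);
    }

    public static void main(String[] args) {
        int vocabSize = 4;
        for (int term = 0; term < vocabSize; term++) {
            long id = term2index(term);
            System.out.println(term + " -> " + id + " -> " + index2term(id));
        }
    }
}
```

Using negative IDs for terms keeps them disjoint from any non-negative IDs used for other vertices in the same graph.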
- TerminalWidth() - Method in class org.apache.spark.ui.ConsoleProgressBar
-
- TEST() - Static method in class org.apache.spark.storage.BlockId
-
- TestBlockId - Class in org.apache.spark.storage
-
- TestBlockId(String) - Constructor for class org.apache.spark.storage.TestBlockId
-
- testData() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
-
- testDir() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
-
- testFilterDir() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
-
- testFilterSchema() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
-
- testGlobDir() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
-
- testGlobSubDir1() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
-
- testGlobSubDir2() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
-
- testGlobSubDir3() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
-
- TestGroupWriteSupport - Class in org.apache.spark.sql.parquet
-
- TestGroupWriteSupport(MessageType) - Constructor for class org.apache.spark.sql.parquet.TestGroupWriteSupport
-
- testNestedData1() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
-
- testNestedData2() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
-
- testNestedDir1() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
-
- testNestedDir2() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
-
- testNestedDir3() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
-
- testNestedDir4() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
-
- testNestedSchema1() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
-
- testNestedSchema2() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
-
- testNestedSchema3() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
-
- testNestedSchema4() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
-
- TestResult<DF> - Interface in org.apache.spark.mllib.stat.test
-
:: Experimental ::
Trait for hypothesis test results.
- testSchema() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
-
- testSchemaFieldNames() - Static method in class org.apache.spark.sql.parquet.ParquetTestData
-
- TestSQLContext - Class in org.apache.spark.sql.test
-
A SQLContext that can be used for local testing.
- TestSQLContext() - Constructor for class org.apache.spark.sql.test.TestSQLContext
-
- TestUtils - Class in org.apache.spark
-
Utilities for tests.
- TestUtils() - Constructor for class org.apache.spark.TestUtils
-
- textFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Read a text file from HDFS, a local file system (available on all nodes), or any
Hadoop-supported file system URI, and return it as an RDD of Strings.
- textFile(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Read a text file from HDFS, a local file system (available on all nodes), or any
Hadoop-supported file system URI, and return it as an RDD of Strings.
- textFile(String, int) - Method in class org.apache.spark.SparkContext
-
Read a text file from HDFS, a local file system (available on all nodes), or any
Hadoop-supported file system URI, and return it as an RDD of Strings.
- textFileStream(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them as text files (using key as LongWritable, value
as Text and input format as TextInputFormat).
- textFileStream(String) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them as text files (using key as LongWritable, value
as Text and input format as TextInputFormat).
- textResponderToServlet(Function1<HttpServletRequest, String>) - Static method in class org.apache.spark.ui.JettyUtils
-
- theta() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- thisClassName() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel.SaveLoadV1_0$
-
- thisClassName() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
-
- thisFormatVersion() - Method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$
-
- thisFormatVersion() - Method in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$
-
- thisFormatVersion() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
-
- thisFormatVersion() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$
-
- thread() - Method in class org.apache.spark.streaming.scheduler.ReceiverTracker.ReceiverLauncher
-
- threadDumpEnabled() - Method in class org.apache.spark.ui.exec.ExecutorsTab
-
- threadId() - Method in class org.apache.spark.util.ThreadStackTrace
-
- threadName() - Method in class org.apache.spark.util.ThreadStackTrace
-
- ThreadStackTrace - Class in org.apache.spark.util
-
Used for shipping per-thread stack traces from the executors to the driver.
- ThreadStackTrace(long, String, Thread.State, String) - Constructor for class org.apache.spark.util.ThreadStackTrace
-
- threadState() - Method in class org.apache.spark.util.ThreadStackTrace
-
- threshold() - Method in interface org.apache.spark.ml.param.HasThreshold
-
param for threshold in (binary) prediction
- threshold() - Method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$.Data
-
- threshold() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
-
- threshold() - Method in class org.apache.spark.mllib.tree.model.Split
-
- thresholds() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns thresholds in descending order.
- threshTime() - Method in class org.apache.spark.streaming.receiver.CleanupOldBlocks
-
- THRIFT_ARRAY_ELEMENTS_SCHEMA_NAME_SUFFIX() - Static method in class org.apache.spark.sql.parquet.CatalystConverter
-
- THRIFTSERVER_POOL() - Static method in class org.apache.spark.sql.SQLConf
-
- throwBalls() - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationEnd
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerExecutorAdded
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerExecutorRemoved
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerJobEnd
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
-
- time() - Method in class org.apache.spark.streaming.scheduler.BatchAllocationEvent
-
- time() - Method in class org.apache.spark.streaming.scheduler.ClearCheckpointData
-
- time() - Method in class org.apache.spark.streaming.scheduler.ClearMetadata
-
- time() - Method in class org.apache.spark.streaming.scheduler.DoCheckpoint
-
- time() - Method in class org.apache.spark.streaming.scheduler.GenerateJobs
-
- time() - Method in class org.apache.spark.streaming.scheduler.Job
-
- time() - Method in class org.apache.spark.streaming.scheduler.JobSet
-
- Time - Class in org.apache.spark.streaming
-
This is a simple class that represents an absolute instant of time.
- Time(long) - Constructor for class org.apache.spark.streaming.Time
-
- TimeBasedRollingPolicy - Class in org.apache.spark.util.logging
-
Defines a RollingPolicy by which files will be rolled over at a fixed interval.
- TimeBasedRollingPolicy(long, String, boolean) - Constructor for class org.apache.spark.util.logging.TimeBasedRollingPolicy
-
- timeIt(int, Function0<BoxedUnit>, Option<Function0<BoxedUnit>>) - Static method in class org.apache.spark.util.Utils
-
Timing method based on iterations that permit JVM JIT optimization.
- timeout() - Method in class org.apache.spark.storage.BlockManagerMaster
-
- timeoutCheckingTask() - Method in class org.apache.spark.storage.BlockManagerMasterActor
-
- timeRunning(long) - Method in class org.apache.spark.scheduler.TaskInfo
-
- times(int) - Method in class org.apache.spark.streaming.Duration
-
- times() - Method in class org.apache.spark.streaming.scheduler.BatchCleanupEvent
-
- times(int, Function0<BoxedUnit>) - Static method in class org.apache.spark.util.Utils
-
Method executed for repeating a task for side effects.
- TIMESTAMP - Class in org.apache.spark.sql.columnar
-
- TIMESTAMP() - Constructor for class org.apache.spark.sql.columnar.TIMESTAMP
-
- timestamp() - Method in class org.apache.spark.sql.ColumnName
-
Creates a new AttributeReference of type timestamp
- timestamp() - Method in class org.apache.spark.util.TimeStampedValue
-
- TimestampColumnAccessor - Class in org.apache.spark.sql.columnar
-
- TimestampColumnAccessor(ByteBuffer) - Constructor for class org.apache.spark.sql.columnar.TimestampColumnAccessor
-
- TimestampColumnBuilder - Class in org.apache.spark.sql.columnar
-
- TimestampColumnBuilder() - Constructor for class org.apache.spark.sql.columnar.TimestampColumnBuilder
-
- TimestampColumnStats - Class in org.apache.spark.sql.columnar
-
- TimestampColumnStats() - Constructor for class org.apache.spark.sql.columnar.TimestampColumnStats
-
- TimestampConversion() - Method in class org.apache.spark.sql.jdbc.JDBCRDD
-
Accessor for nested Scala object
- TimeStampedHashMap<A,B> - Class in org.apache.spark.util
-
This is a custom implementation of scala.collection.mutable.Map which stores the insertion
timestamp along with each key-value pair.
- TimeStampedHashMap(boolean) - Constructor for class org.apache.spark.util.TimeStampedHashMap
-
- TimeStampedHashSet<A> - Class in org.apache.spark.util
-
- TimeStampedHashSet() - Constructor for class org.apache.spark.util.TimeStampedHashSet
-
- TimeStampedValue<V> - Class in org.apache.spark.util
-
- TimeStampedValue(V, long) - Constructor for class org.apache.spark.util.TimeStampedValue
-
- TimeStampedWeakValueHashMap<A,B> - Class in org.apache.spark.util
-
A wrapper of TimeStampedHashMap that ensures the values are weakly referenced and timestamped.
- TimeStampedWeakValueHashMap(boolean) - Constructor for class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- timeToLogFile(long, long) - Static method in class org.apache.spark.streaming.util.WriteAheadLogManager
-
- TimeTracker - Class in org.apache.spark.mllib.tree.impl
-
Time tracker implementation which holds labeled timers.
- TimeTracker() - Constructor for class org.apache.spark.mllib.tree.impl.TimeTracker
-
- timeUnit() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
- tmpPath() - Method in class org.apache.spark.scheduler.cluster.SimrSchedulerBackend
-
- to(Time, Duration) - Method in class org.apache.spark.streaming.Time
-
- toArray() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- toArray() - Method in class org.apache.spark.input.PortableDataStream
-
Read the file as a byte array
- toArray() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- toArray() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Converts to a dense array in column major.
- toArray() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- toArray() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Converts the instance to a double array.
- toArray() - Method in class org.apache.spark.rdd.RDD
-
Return an array that contains all of the elements in this RDD.
- toArrays() - Method in class org.apache.spark.util.io.ByteArrayChunkOutputStream
-
- toAttribute() - Method in class org.apache.spark.sql.hive.MetastoreRelation.SchemaAttribute
-
- toBatchInfo() - Method in class org.apache.spark.streaming.scheduler.JobSet
-
- toBinary() - Method in class org.apache.spark.sql.parquet.timestamp.NanoTime
-
- toBlockMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
Converts to BlockMatrix.
- toBlockMatrix(int, int) - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
Converts to BlockMatrix.
- toBlockMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Converts to BlockMatrix.
- toBlockMatrix(int, int) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Converts to BlockMatrix.
- toBreeze() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- toBreeze() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- toBreeze() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Collects data and assembles a local dense breeze matrix (for test only).
- toBreeze() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
Collects data and assembles a local matrix.
- toBreeze() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix
-
Collects data and assembles a local dense breeze matrix (for test only).
- toBreeze() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
- toBreeze() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
- toBreeze() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Converts to a breeze matrix.
- toBreeze() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- toBreeze() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- toBreeze() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Converts the instance to a breeze vector.
- toByteString() - Method in class org.apache.spark.scheduler.cluster.mesos.MesosTaskLaunchData
-
- toCatalystDecimal(HiveDecimalObjectInspector, Object) - Static method in class org.apache.spark.sql.hive.HiveShim
-
- toCoordinateMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Converts to CoordinateMatrix.
- toCoordinateMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
- toDataType(String) - Static method in class org.apache.spark.sql.hive.HiveMetastoreTypes
-
- toDataType(Type, boolean, boolean) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
-
Converts a given Parquet Type into the corresponding DataType.
- toDebugString() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
A description of this RDD and its recursive dependencies for debugging.
- toDebugString() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Print the full model to a string.
- toDebugString() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel
-
Print the full model to a string.
- toDebugString() - Method in class org.apache.spark.rdd.RDD
-
A description of this RDD and its recursive dependencies for debugging.
- toDebugString() - Method in class org.apache.spark.SparkConf
-
Return a string listing all keys and values, one per line.
- toDense() - Method in class org.apache.spark.mllib.clustering.VectorWithNorm
-
Converts the vector to a dense vector.
- toDense() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
Generate a DenseMatrix from the given SparseMatrix.
- toDF(String...) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame with columns renamed.
- toDF() - Method in class org.apache.spark.sql.DataFrame
-
Returns the object itself.
- toDF(Seq<String>) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame with columns renamed.
- toDF() - Method in class org.apache.spark.sql.DataFrameHolder
-
- toDF(Seq<String>) - Method in class org.apache.spark.sql.DataFrameHolder
-
- toEdgePartition() - Method in class org.apache.spark.graphx.impl.EdgePartitionBuilder
-
- toEdgePartition() - Method in class org.apache.spark.graphx.impl.ExistingEdgePartitionBuilder
-
- toEdgeTriplet() - Method in class org.apache.spark.graphx.EdgeContext
-
Converts the edge and vertex properties into an EdgeTriplet for convenience.
- toErrorString() - Method in class org.apache.spark.ExceptionFailure
-
- toErrorString() - Method in class org.apache.spark.ExecutorLostFailure
-
- toErrorString() - Method in class org.apache.spark.FetchFailed
-
- toErrorString() - Static method in class org.apache.spark.Resubmitted
-
- toErrorString() - Method in class org.apache.spark.TaskCommitDenied
-
- toErrorString() - Method in interface org.apache.spark.TaskFailedReason
-
Error message displayed in the web UI.
- toErrorString() - Static method in class org.apache.spark.TaskKilled
-
- toErrorString() - Static method in class org.apache.spark.TaskResultLost
-
- toErrorString() - Static method in class org.apache.spark.UnknownReason
-
- toFormattedString() - Method in class org.apache.spark.streaming.Duration
-
- toIndexedRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Converts to IndexedRowMatrix.
- toIndexedRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
Converts to IndexedRowMatrix.
- toInspector(DataType) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
- toInspector(Expression) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
Maps the Catalyst expression to an ObjectInspector. If the expression is a Literal
or foldable, a constant writable object inspector is returned; otherwise, the
object inspector is chosen according to the expression's Catalyst data type.
- toInt() - Method in class org.apache.spark.storage.StorageLevel
-
- toJavaDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Convert to a JavaDStream
- toJavaRDD() - Method in class org.apache.spark.rdd.RDD
-
- toJavaRDD() - Method in class org.apache.spark.sql.DataFrame
-
Returns the content of the DataFrame as a JavaRDD of Rows.
- toJSON() - Method in class org.apache.spark.sql.DataFrame
-
Returns the content of the DataFrame as an RDD of JSON strings.
- tokenize(String) - Static method in class org.apache.spark.rdd.PipedRDD
-
- Tokenizer - Class in org.apache.spark.ml.feature
-
:: AlphaComponent ::
A tokenizer that converts the input string to lowercase and then splits it by white spaces.
- Tokenizer() - Constructor for class org.apache.spark.ml.feature.Tokenizer
-
- toLocal() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
Convert model to a local model.
- toLocalIterator() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an iterator that contains all of the elements in this RDD.
- toLocalIterator() - Method in class org.apache.spark.rdd.RDD
-
Return an iterator that contains all of the elements in this RDD.
- toLocalMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Collect the distributed matrix on the driver as a `DenseMatrix`.
- toLowerCase() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog.QualifiedTableName
-
- toMap() - Method in class org.apache.spark.util.TimeStampedHashMap
-
- toMap() - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- toMesos(Enumeration.Value) - Static method in class org.apache.spark.TaskState
-
- toMetastoreType(DataType) - Static method in class org.apache.spark.sql.hive.HiveMetastoreTypes
-
- toNodeSeq() - Method in class org.apache.spark.ui.jobs.ExecutorTable
-
- toNodeSeq() - Method in class org.apache.spark.ui.jobs.PoolTable
-
- toNodeSeq() - Method in class org.apache.spark.ui.jobs.StageTableBase
-
- ToolTips - Class in org.apache.spark.ui
-
- ToolTips() - Constructor for class org.apache.spark.ui.ToolTips
-
- toOps(ShippableVertexPartition<VD>, ClassTag<VD>) - Method in class org.apache.spark.graphx.impl.ShippableVertexPartition.ShippableVertexPartitionOpsConstructor$
-
- toOps(VertexPartition<VD>, ClassTag<VD>) - Method in class org.apache.spark.graphx.impl.VertexPartition.VertexPartitionOpsConstructor$
-
- toOps(T, ClassTag<VD>) - Method in interface org.apache.spark.graphx.impl.VertexPartitionBaseOpsConstructor
-
- top(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the top k (largest) elements from this RDD as defined by
the specified Comparator[T].
- top(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the top k (largest) elements from this RDD using the
natural ordering for T.
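As a rough single-machine illustration of what "top k under a Comparator" means, the sketch below keeps a bounded min-heap of the best k elements seen so far. It is plain Java and does not use Spark; the class name TopKSketch and its method are hypothetical, and this is not presented as how Spark computes the result across partitions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative sketch only: a local top-k, not Spark's distributed implementation.
final class TopKSketch {
    // Returns the k largest elements under cmp, ordered largest first.
    static <T> List<T> top(Iterable<T> items, int k, Comparator<T> cmp) {
        if (k <= 0) {
            return new ArrayList<>();
        }
        // Min-heap holding the best k elements so far; its head is the smallest of them.
        PriorityQueue<T> heap = new PriorityQueue<>(k, cmp);
        for (T item : items) {
            if (heap.size() < k) {
                heap.add(item);
            } else if (cmp.compare(item, heap.peek()) > 0) {
                // The new item beats the current smallest survivor, so swap it in.
                heap.poll();
                heap.add(item);
            }
        }
        List<T> result = new ArrayList<>(heap);
        result.sort(cmp.reversed());
        return result;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(5, 1, 9, 3, 7);
        // Natural ordering corresponds to the single-argument top(int) variant.
        System.out.println(top(data, 2, Comparator.<Integer>naturalOrder()));
    }
}
```

Passing a different Comparator changes what "largest" means; for example, a reversed comparator would return the k smallest elements instead.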
- top(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
- toPairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.streaming.dstream.DStream
-
- toPairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.streaming.StreamingContext
-
- topic() - Method in class org.apache.spark.streaming.kafka.KafkaRDDPartition
-
- topic() - Method in class org.apache.spark.streaming.kafka.OffsetRange
-
Kafka topic name
- topicConcentration() - Method in class org.apache.spark.mllib.clustering.LDA.EMOptimizer
-
- topicDistributions() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
For each document in the training set, return the distribution over topics for that document
("theta_doc").
- topicsMatrix() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
Inferred topics, where each topic is represented by a distribution over terms.
- topicsMatrix() - Method in class org.apache.spark.mllib.clustering.LDAModel
-
Inferred topics, where each topic is represented by a distribution over terms.
- topicsMatrix() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-
- topK(Iterator<Tuple2<String, Object>>, int) - Static method in class org.apache.spark.streaming.util.RawTextHelper
-
Gets the top k words in terms of word counts.
- topNode() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
- toPredict() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData
-
- toPrimitiveDataType(PrimitiveType, boolean, boolean) - Static method in class org.apache.spark.sql.parquet.ParquetTypesConverter
-
- toRDD(JavaDoubleRDD) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
-
- toRDD(JavaPairRDD<K, V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
- toRDD(JavaRDD<T>) - Static method in class org.apache.spark.api.java.JavaRDD
-
- toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
Converts to RowMatrix, dropping row indices after grouping by row index.
- toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
Drops row indices and converts this matrix to a RowMatrix.
- TorrentBroadcast<T> - Class in org.apache.spark.broadcast
-
A BitTorrent-like implementation of Broadcast.
- TorrentBroadcast(T, long, ClassTag<T>) - Constructor for class org.apache.spark.broadcast.TorrentBroadcast
-
- TorrentBroadcastFactory - Class in org.apache.spark.broadcast
-
A Broadcast implementation that uses a BitTorrent-like
protocol to do a distributed transfer of the broadcasted data to the executors.
- TorrentBroadcastFactory() - Constructor for class org.apache.spark.broadcast.TorrentBroadcastFactory
-
- toScalaFunction(Function<T, R>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
- toScalaFunction2(Function2<T1, T2, R>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
- toSchemaRDD() - Method in class org.apache.spark.sql.DataFrame
-
Left here for backward compatibility.
- toSeq() - Method in class org.apache.spark.ml.param.ParamMap
-
Converts this param map to a sequence of param pairs.
- toSparkContext(JavaSparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
-
- toSparse() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
Generate a SparseMatrix from the given DenseMatrix.
- toSplit() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
-
- toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
-
- toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
-
- toString() - Method in class org.apache.spark.Accumulable
-
- toString() - Method in class org.apache.spark.api.java.JavaRDD
-
- toString() - Method in class org.apache.spark.broadcast.Broadcast
-
- toString() - Method in class org.apache.spark.graphx.EdgeDirection
-
- toString() - Method in class org.apache.spark.graphx.EdgeTriplet
-
- toString() - Method in class org.apache.spark.ml.param.Param
-
- toString() - Method in class org.apache.spark.ml.param.ParamMap
-
- toString() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryLabelCounter
-
- toString() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- toString() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
A human readable representation of the matrix
- toString() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- toString() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
-
- toString() - Method in class org.apache.spark.mllib.regression.LabeledPoint
-
- toString() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- toString() - Method in interface org.apache.spark.mllib.stat.test.TestResult
-
String explaining the hypothesis test result.
- toString() - Method in class org.apache.spark.mllib.tree.impl.TimeTracker
-
Print all timing results in seconds.
- toString() - Method in class org.apache.spark.mllib.tree.impurity.EntropyCalculator
-
- toString() - Method in class org.apache.spark.mllib.tree.impurity.GiniCalculator
-
- toString() - Method in class org.apache.spark.mllib.tree.impurity.VarianceCalculator
-
- toString() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Print a summary of the model.
- toString() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- toString() - Method in class org.apache.spark.mllib.tree.model.Node
-
- toString() - Method in class org.apache.spark.mllib.tree.model.Predict
-
- toString() - Method in class org.apache.spark.mllib.tree.model.Split
-
- toString() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel
-
Print a summary of the model.
- toString() - Method in class org.apache.spark.partial.BoundedDouble
-
- toString() - Method in class org.apache.spark.partial.PartialResult
-
- toString() - Method in class org.apache.spark.rdd.RDD
-
- toString() - Method in class org.apache.spark.scheduler.ExecutorLossReason
-
- toString() - Method in class org.apache.spark.scheduler.HDFSCacheTaskLocation
-
- toString() - Method in class org.apache.spark.scheduler.HostTaskLocation
-
- toString() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- toString() - Method in class org.apache.spark.scheduler.ResultTask
-
- toString() - Method in class org.apache.spark.scheduler.ShuffleMapTask
-
- toString() - Method in class org.apache.spark.scheduler.SplitInfo
-
- toString() - Method in class org.apache.spark.scheduler.Stage
-
- toString() - Method in class org.apache.spark.scheduler.TaskDescription
-
- toString() - Method in class org.apache.spark.scheduler.TaskSet
-
- toString() - Method in class org.apache.spark.SerializableWritable
-
- toString() - Method in class org.apache.spark.sql.Column
-
- toString() - Method in class org.apache.spark.sql.columnar.ColumnType
-
- toString() - Method in class org.apache.spark.sql.DataFrame
-
- toString() - Method in class org.apache.spark.sql.hive.HiveGenericUdaf
-
- toString() - Method in class org.apache.spark.sql.hive.HiveGenericUdf
-
- toString() - Method in class org.apache.spark.sql.hive.HiveGenericUdtf
-
- toString() - Method in class org.apache.spark.sql.hive.HiveSimpleUdf
-
- toString() - Method in class org.apache.spark.sql.hive.HiveUdaf
-
- toString() - Method in class org.apache.spark.sql.parquet.timestamp.NanoTime
-
- toString() - Method in class org.apache.spark.SSLOptions
-
Returns a string representation of this SSLOptions with all the passwords masked.
- toString() - Method in class org.apache.spark.storage.BlockId
-
- toString() - Method in class org.apache.spark.storage.BlockManagerId
-
- toString() - Method in class org.apache.spark.storage.BlockManagerInfo
-
- toString() - Method in class org.apache.spark.storage.FileSegment
-
- toString() - Method in class org.apache.spark.storage.RDDInfo
-
- toString() - Method in class org.apache.spark.storage.StorageLevel
-
- toString() - Method in class org.apache.spark.storage.TachyonFileSegment
-
- toString() - Method in class org.apache.spark.streaming.dstream.DStreamCheckpointData
-
- toString() - Method in class org.apache.spark.streaming.dstream.FileInputDStream.FileInputDStreamCheckpointData
-
- toString() - Method in class org.apache.spark.streaming.Duration
-
- toString() - Method in class org.apache.spark.streaming.Interval
-
- toString() - Method in class org.apache.spark.streaming.kafka.Broker
-
- toString() - Method in class org.apache.spark.streaming.kafka.OffsetRange
-
- toString() - Method in class org.apache.spark.streaming.scheduler.Job
-
- toString() - Method in class org.apache.spark.streaming.Time
-
- toString() - Method in class org.apache.spark.util.MutablePair
-
- toString() - Method in class org.apache.spark.util.StatCounter
-
- toString() - Method in class org.apache.spark.util.Vector
-
- totalCoreCount() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- totalCores() - Method in class org.apache.spark.scheduler.cluster.ExecutorData
-
- totalCores() - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
-
- totalCores() - Method in class org.apache.spark.scheduler.local.LocalBackend
-
- totalCoresAcquired() - Method in class org.apache.spark.scheduler.cluster.mesos.CoarseMesosSchedulerBackend
-
- totalCount() - Method in class org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrixImpl
-
- totalDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
Time taken for all the jobs of this batch to finish processing from the time they
were submitted.
- totalDelay() - Method in class org.apache.spark.streaming.scheduler.JobSet
-
- totalDelayDistribution() - Method in class org.apache.spark.streaming.ui.StreamingJobProgressListener
-
- totalDuration() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
-
- totalExpectedCores() - Method in class org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend
-
- totalInputBytes() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
-
- totalNumNodes() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel
-
Get total number of nodes, summed over all trees in the forest.
- totalRegisteredExecutors() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
-
- totalResultSize() - Method in class org.apache.spark.scheduler.TaskSetManager
-
- totalShuffleRead() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
-
- totalShuffleWrite() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
-
- totalTasks() - Method in class org.apache.spark.partial.ApproximateActionListener
-
- totalTasks() - Method in class org.apache.spark.ui.exec.ExecutorSummaryInfo
-
- toTuple() - Method in class org.apache.spark.graphx.EdgeTriplet
-
- toTuple() - Method in class org.apache.spark.streaming.kafka.OffsetRange
-
This is to avoid ClassNotFoundException during checkpoint restore.
- toTypeInfo() - Method in class org.apache.spark.sql.hive.HiveInspectors.typeInfoConversions
-
- toWeakReference(V) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- toWeakReferenceFunction(Function1<Tuple2<K, V>, R>) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- toWeakReferenceTuple(Tuple2<K, V>) - Static method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- trackerActor() - Method in class org.apache.spark.MapOutputTracker
-
Set to the MapOutputTrackerActor living on the driver.
- train(RDD<ALS.Rating<ID>>, int, int, int, int, double, boolean, double, boolean, StorageLevel, StorageLevel, long, ClassTag<ID>, Ordering<ID>) - Static method in class org.apache.spark.ml.recommendation.ALS
-
Implementation of the ALS algorithm.
- train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
-
Train a logistic regression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
-
Train a logistic regression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
-
Train a logistic regression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
-
Train a logistic regression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.classification.NaiveBayes
-
Trains a Naive Bayes model given an RDD of (label, features)
pairs.
- train(RDD<LabeledPoint>, double) - Static method in class org.apache.spark.mllib.classification.NaiveBayes
-
Trains a Naive Bayes model given an RDD of (label, features)
pairs.
- train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
-
Train a SVM model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
-
Train a SVM model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
-
Train a SVM model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
-
Train a SVM model given an RDD of (label, features) pairs.
- train(RDD<Vector>, int, int, int, String, long) - Static method in class org.apache.spark.mllib.clustering.KMeans
-
Trains a k-means model using the given set of parameters.
- train(RDD<Vector>, int, int, int, String) - Static method in class org.apache.spark.mllib.clustering.KMeans
-
Trains a k-means model using the given set of parameters.
- train(RDD<Vector>, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans
-
Trains a k-means model using the specified parameters and default values for the unspecified ones.
- train(RDD<Vector>, int, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans
-
Trains a k-means model using the specified parameters and default values for the unspecified ones.
- train(RDD<Rating>, int, int, double, int, long) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of ratings given by users to some products,
in the form of (userID, productID, rating) pairs.
- train(RDD<Rating>, int, int, double, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of ratings given by users to some products,
in the form of (userID, productID, rating) pairs.
- train(RDD<Rating>, int, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of ratings given by users to some products,
in the form of (userID, productID, rating) pairs.
- train(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of ratings given by users to some products,
in the form of (userID, productID, rating) pairs.
- train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
-
Train a Lasso model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
-
Train a Lasso model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
-
Train a Lasso model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
-
Train a Lasso model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
-
Train a LinearRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
-
Train a LinearRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
-
Train a LinearRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
-
Train a LinearRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
-
Train a RidgeRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
-
Train a RidgeRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
-
Train a RidgeRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
-
Train a RidgeRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, Strategy) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model.
- train(RDD<LabeledPoint>, Enumeration.Value, Impurity, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model.
- train(RDD<LabeledPoint>, Enumeration.Value, Impurity, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model.
- train(RDD<LabeledPoint>, Enumeration.Value, Impurity, int, int, int, Enumeration.Value, Map<Object, Object>) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model.
- train(RDD<LabeledPoint>, BoostingStrategy) - Static method in class org.apache.spark.mllib.tree.GradientBoostedTrees
-
Method to train a gradient boosting model.
- train(JavaRDD<LabeledPoint>, BoostingStrategy) - Static method in class org.apache.spark.mllib.tree.GradientBoostedTrees
-
Java-friendly API for GradientBoostedTrees$.train(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, org.apache.spark.mllib.tree.configuration.BoostingStrategy)
- trainClassifier(RDD<LabeledPoint>, int, Map<Object, Object>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model for binary or multiclass classification.
- trainClassifier(JavaRDD<LabeledPoint>, int, Map<Integer, Integer>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Java-friendly API for DecisionTree$.trainClassifier(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, int, scala.collection.immutable.Map<java.lang.Object, java.lang.Object>, java.lang.String, int, int)
- trainClassifier(RDD<LabeledPoint>, Strategy, int, String, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
-
Method to train a random forest model for binary or multiclass classification.
- trainClassifier(RDD<LabeledPoint>, int, Map<Object, Object>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
-
Method to train a random forest model for binary or multiclass classification.
- trainClassifier(JavaRDD<LabeledPoint>, int, Map<Integer, Integer>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
-
Java-friendly API for RandomForest$.trainClassifier(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, org.apache.spark.mllib.tree.configuration.Strategy, int, java.lang.String, int)
- trainImplicit(RDD<Rating>, int, int, double, int, double, long) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of 'implicit preferences' given by users
to some products, in the form of (userID, productID, preference) pairs.
- trainImplicit(RDD<Rating>, int, int, double, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of 'implicit preferences' given by users
to some products, in the form of (userID, productID, preference) pairs.
- trainImplicit(RDD<Rating>, int, int, double, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of 'implicit preferences' given by users to
some products, in the form of (userID, productID, preference) pairs.
- trainImplicit(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of 'implicit preferences' given by users
to some products, in the form of (userID, productID, preference) pairs.
- trainOn(DStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Update the clustering model by training on batches of data from a DStream.
- trainOn(DStream<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Update the model by training on batches of data from a DStream.
- trainOn(JavaDStream<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Java-friendly version of `trainOn`.
- trainRegressor(RDD<LabeledPoint>, Map<Object, Object>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model for regression.
- trainRegressor(JavaRDD<LabeledPoint>, Map<Integer, Integer>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Java-friendly API for DecisionTree$.trainRegressor(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, scala.collection.immutable.Map<java.lang.Object, java.lang.Object>, java.lang.String, int, int)
- trainRegressor(RDD<LabeledPoint>, Strategy, int, String, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
-
Method to train a random forest model for regression.
- trainRegressor(RDD<LabeledPoint>, Map<Object, Object>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
-
Method to train a random forest model for regression.
- trainRegressor(JavaRDD<LabeledPoint>, Map<Integer, Integer>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
-
Java-friendly API for RandomForest$.trainRegressor(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, org.apache.spark.mllib.tree.configuration.Strategy, int, java.lang.String, int)
- transactions() - Method in class org.apache.spark.mllib.fpm.FPTree
-
Returns all transactions in an iterator.
- transceiver() - Method in class org.apache.spark.streaming.flume.FlumeConnection
-
- transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.classification.ClassificationModel
-
Transforms dataset by reading from featuresCol, and appending new columns as specified by
parameters:
- predicted labels as predictionCol of type Double
- raw predictions (confidences) as rawPredictionCol of type Vector.
- transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-
- transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel
-
Transforms dataset by reading from featuresCol, and appending new columns as specified by
parameters:
- predicted labels as predictionCol of type Double
- raw predictions (confidences) as rawPredictionCol of type Vector
- probability of each class as probabilityCol of type Vector.
- transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.feature.StandardScalerModel
-
- transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.impl.estimator.PredictionModel
-
Transforms dataset by reading from featuresCol, calling predict(), and storing
the predictions as a new column predictionCol.
- transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.PipelineModel
-
- transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.recommendation.ALSModel
-
- transform(DataFrame, ParamPair<?>...) - Method in class org.apache.spark.ml.Transformer
-
Transforms the dataset with optional parameters
- transform(DataFrame, Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.Transformer
-
Transforms the dataset with optional parameters
- transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.Transformer
-
Transforms the dataset with provided parameter map as additional parameters.
- transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
-
- transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.UnaryTransformer
-
- transform(Vector) - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
-
Applies transformation on a vector.
- transform(Iterable<Object>) - Method in class org.apache.spark.mllib.feature.HashingTF
-
Transforms the input document into a sparse term frequency vector.
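The hashing trick behind a sparse term frequency vector can be sketched in plain Java as follows. This is a minimal illustration, not Spark's implementation: the class `HashingTfSketch`, its methods, and the use of `hashCode()` with a non-negative modulus are assumptions made for the example, and the result is a map from feature index to count rather than an MLlib `Vector`.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustrative sketch only; not part of Spark. */
final class HashingTfSketch {
    /** Maps a term to a feature index in [0, numFeatures). */
    static int indexOf(Object term, int numFeatures) {
        // floorMod keeps the index non-negative even for negative hash codes.
        return Math.floorMod(term.hashCode(), numFeatures);
    }

    /** Counts how often each hashed index occurs in the document. */
    static Map<Integer, Double> transform(Iterable<?> document, int numFeatures) {
        Map<Integer, Double> termFrequencies = new TreeMap<>();
        for (Object term : document) {
            termFrequencies.merge(indexOf(term, numFeatures), 1.0, Double::sum);
        }
        return termFrequencies;
    }
}
```

Distinct terms may collide on the same index; a larger `numFeatures` makes collisions less likely at the cost of a wider vector.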
- transform(Iterable<?>) - Method in class org.apache.spark.mllib.feature.HashingTF
-
Transforms the input document into a sparse term frequency vector (Java version).
- transform(RDD<D>) - Method in class org.apache.spark.mllib.feature.HashingTF
-
Transforms the input document to term frequency vectors.
- transform(JavaRDD<D>) - Method in class org.apache.spark.mllib.feature.HashingTF
-
Transforms the input document to term frequency vectors (Java version).
- transform(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDFModel
-
Transforms term frequency (TF) vectors to TF-IDF vectors.
- transform(Vector) - Method in class org.apache.spark.mllib.feature.IDFModel
-
Transforms a term frequency (TF) vector to a TF-IDF vector
- transform(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDFModel
-
Transforms term frequency (TF) vectors to TF-IDF vectors (Java version).
- transform(Vector) - Method in class org.apache.spark.mllib.feature.Normalizer
-
Applies unit length normalization on a vector.
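As a rough illustration of unit length normalization, the sketch below scales a dense vector by its Euclidean (L2) norm. The class name `NormalizerSketch`, the choice of the L2 norm, and the decision to return a zero vector unchanged are assumptions for this example rather than a statement of how `Normalizer` is implemented.

```java
/** Illustrative sketch only; not part of Spark. */
final class NormalizerSketch {
    /** Returns a copy of v scaled to unit L2 norm. */
    static double[] normalize(double[] v) {
        double sumOfSquares = 0.0;
        for (double x : v) {
            sumOfSquares += x * x;
        }
        double norm = Math.sqrt(sumOfSquares);
        double[] out = v.clone();
        if (norm == 0.0) {
            return out; // assumption: a zero vector is returned as-is
        }
        for (int i = 0; i < out.length; i++) {
            out[i] = v[i] / norm;
        }
        return out;
    }
}
```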
- transform(Vector) - Method in class org.apache.spark.mllib.feature.StandardScalerModel
-
Applies standardization transformation on a vector.
- transform(Vector) - Method in interface org.apache.spark.mllib.feature.VectorTransformer
-
Applies transformation on a vector.
- transform(RDD<Vector>) - Method in interface org.apache.spark.mllib.feature.VectorTransformer
-
Applies transformation on an RDD[Vector].
- transform(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.feature.VectorTransformer
-
Applies transformation on a JavaRDD[Vector].
- transform(String) - Method in class org.apache.spark.mllib.feature.Word2VecModel
-
Transforms a word to its vector representation
- transform(PartialFunction<ASTNode, ASTNode>) - Method in class org.apache.spark.sql.hive.HiveQl.TransformableNode
-
Returns a copy of this node where rule has been recursively applied to it and all of its
children.
- transform(Function<R, JavaRDD<U>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transform(Function2<R, Time, JavaRDD<U>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transform(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaRDD<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create a new DStream in which each RDD is generated by applying a function on RDDs of
the DStreams.
- transform(Function1<RDD<T>, RDD<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transform(Function2<RDD<T>, Time, RDD<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transform(Seq<DStream<?>>, Function2<Seq<RDD<?>>, Time, RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create a new DStream in which each RDD is generated by applying a function on RDDs of
the DStreams.
- transformColumnsImpl(DataFrame, ClassificationModel<FeaturesType, ?>, ParamMap) - Static method in class org.apache.spark.ml.classification.ClassificationModel
-
Added prediction column(s).
- TransformedDStream<U> - Class in org.apache.spark.streaming.dstream
-
- TransformedDStream(Seq<DStream<?>>, Function2<Seq<RDD<?>>, Time, RDD<U>>, ClassTag<U>) - Constructor for class org.apache.spark.streaming.dstream.TransformedDStream
-
- Transformer - Class in org.apache.spark.ml
-
:: AlphaComponent ::
Abstract class for transformers that transform one dataset into another.
- Transformer() - Constructor for class org.apache.spark.ml.Transformer
-
- transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.feature.StandardScaler
-
- transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.feature.StandardScalerModel
-
- transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.impl.estimator.PredictionModel
-
- transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.impl.estimator.Predictor
-
- transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.Pipeline
-
- transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.PipelineModel
-
- transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.PipelineStage
-
:: DeveloperApi ::
- transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.recommendation.ALS
-
- transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.recommendation.ALSModel
-
- transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidator
-
- transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
-
- transformSchema(StructType, ParamMap) - Method in class org.apache.spark.ml.UnaryTransformer
-
- transformToPair(Function<R, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transformToPair(Function2<R, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transformToPair(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaPairRDD<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create a new DStream in which each RDD is generated by applying a function on RDDs of
the DStreams.
- transformWith(JavaDStream<U>, Function3<R, JavaRDD<U>, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- transformWith(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- transformWith(DStream<U>, Function2<RDD<T>, RDD<U>, RDD<V>>, ClassTag<U>, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- transformWith(DStream<U>, Function3<RDD<T>, RDD<U>, Time, RDD<V>>, ClassTag<U>, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- transformWithToPair(JavaDStream<U>, Function3<R, JavaRDD<U>, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- transformWithToPair(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaPairRDD<K3, V3>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- translateConfKey(String, boolean) - Static method in class org.apache.spark.SparkConf
-
Translate the configuration key if it is deprecated and has a replacement, otherwise just
returns the provided key.
- transpose() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- transpose() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Transpose this BlockMatrix.
- transpose() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
Transposes this CoordinateMatrix.
- transpose() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Transpose the Matrix.
- transpose() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Aggregates the elements of this RDD in a multi-level tree pattern.
- treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>, int, ClassTag<U>) - Method in class org.apache.spark.mllib.rdd.RDDFunctions
-
- treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>, int, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Aggregates the elements of this RDD in a multi-level tree pattern.
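The multi-level tree pattern can be pictured with a small single-machine sketch: each partition is folded on its own, and the partial results are then merged in groups, level by level, instead of all at once. Everything here is a simplification for illustration. The class `TreeAggregateSketch`, the nested lists standing in for partitions, and the `fanIn` parameter are hypothetical; the real method takes a depth argument and runs on a cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

/** Illustrative sketch only; not part of Spark. */
final class TreeAggregateSketch {
    /**
     * Folds each partition with seqOp, then merges the partial results with
     * combOp in groups of fanIn until one value remains.
     * Assumes zero is immutable, since it is reused for every partition.
     */
    static <T, U> U treeAggregate(List<List<T>> partitions, U zero,
                                  BiFunction<U, T, U> seqOp,
                                  BinaryOperator<U> combOp, int fanIn) {
        if (fanIn < 2) {
            throw new IllegalArgumentException("fanIn must be at least 2");
        }
        // Level 0: one partial result per partition.
        List<U> level = new ArrayList<>();
        for (List<T> partition : partitions) {
            U acc = zero;
            for (T element : partition) {
                acc = seqOp.apply(acc, element);
            }
            level.add(acc);
        }
        // Higher levels: merge neighbouring groups of partial results.
        while (level.size() > 1) {
            List<U> next = new ArrayList<>();
            for (int i = 0; i < level.size(); i += fanIn) {
                U acc = level.get(i);
                int end = Math.min(i + fanIn, level.size());
                for (int j = i + 1; j < end; j++) {
                    acc = combOp.apply(acc, level.get(j));
                }
                next.add(acc);
            }
            level = next;
        }
        return level.isEmpty() ? zero : level.get(0);
    }
}
```

With four partitions and a `fanIn` of 2, the four partial results are merged into two and then into one, so no single merge step handles more than two inputs.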
- treeAlgo() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$.Metadata
-
- TreeEnsembleModel - Class in org.apache.spark.mllib.tree.model
-
Represents a tree ensemble model.
- TreeEnsembleModel(Enumeration.Value, DecisionTreeModel[], double[], Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.TreeEnsembleModel
-
- TreeEnsembleModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.tree.model
-
- TreeEnsembleModel.SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$
-
- TreeEnsembleModel.SaveLoadV1_0$.EnsembleNodeData - Class in org.apache.spark.mllib.tree.model
-
Model data for model import/export.
- TreeEnsembleModel.SaveLoadV1_0$.EnsembleNodeData(int, org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0.NodeData) - Constructor for class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$.EnsembleNodeData
-
- TreeEnsembleModel.SaveLoadV1_0$.Metadata - Class in org.apache.spark.mllib.tree.model
-
- TreeEnsembleModel.SaveLoadV1_0$.Metadata(String, String, String, double[]) - Constructor for class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$.Metadata
-
- treeId() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
-
- treeId() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$.EnsembleNodeData
-
- TreePoint - Class in org.apache.spark.mllib.tree.impl
-
Internal representation of LabeledPoint for DecisionTree.
- TreePoint(double, int[]) - Constructor for class org.apache.spark.mllib.tree.impl.TreePoint
-
- treeReduce(Function2<T, T, T>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Reduces the elements of this RDD in a multi-level tree pattern.
- treeReduce(Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- treeReduce(Function2<T, T, T>, int) - Method in class org.apache.spark.mllib.rdd.RDDFunctions
-
- treeReduce(Function2<T, T, T>, int) - Method in class org.apache.spark.rdd.RDD
-
Reduces the elements of this RDD in a multi-level tree pattern.
- trees() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
- trees() - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
-
- treeStrategy() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- treeWeights() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
- treeWeights() - Method in class org.apache.spark.mllib.tree.model.TreeEnsembleModel.SaveLoadV1_0$.Metadata
-
- triangleCount() - Method in class org.apache.spark.graphx.GraphOps
-
Compute the number of triangles passing through each vertex.
- TriangleCount - Class in org.apache.spark.graphx.lib
-
Compute the number of triangles passing through each vertex.
- TriangleCount() - Constructor for class org.apache.spark.graphx.lib.TriangleCount
-
- triK() - Method in class org.apache.spark.ml.recommendation.ALS.NormalEquation
-
Number of entries in the upper triangular part of a k-by-k matrix.
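For a k-by-k matrix, the upper triangular part with the diagonal included holds k(k + 1)/2 entries. A minimal sketch of that count, with an illustrative class name that is not taken from Spark:

```java
/** Illustrative sketch only; not part of Spark. */
final class UpperTriangleSketch {
    /** Number of entries in the upper triangle of a k-by-k matrix, diagonal included. */
    static int triK(int k) {
        if (k < 0) {
            throw new IllegalArgumentException("k must be non-negative");
        }
        return k * (k + 1) / 2;
    }
}
```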
- TripletFields - Class in org.apache.spark.graphx
-
Represents a subset of the fields of an EdgeTriplet or EdgeContext.
- TripletFields() - Constructor for class org.apache.spark.graphx.TripletFields
-
Constructs a default TripletFields in which all fields are included.
- TripletFields(boolean, boolean, boolean) - Constructor for class org.apache.spark.graphx.TripletFields
-
- tripletIterator(boolean, boolean) - Method in class org.apache.spark.graphx.impl.EdgePartition
-
Get an iterator over the edge triplets in this partition.
- triplets() - Method in class org.apache.spark.graphx.Graph
-
An RDD containing the edge triplets, which are edges along with the vertex data associated with
the adjacent vertices.
- triplets() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
Return an RDD that brings edges together with their source and destination vertices.
- TRUE() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- truePositiveRate(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns true positive rate for a given label (category)
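The true positive rate for a label is the share of examples that actually carry that label and were also predicted with it, that is TP / (TP + FN). The sketch below computes it from two parallel arrays. The class name, the array-based input, and the choice to return 0.0 when the label never occurs are assumptions for illustration, not the behaviour of `MulticlassMetrics`.

```java
/** Illustrative sketch only; not part of Spark. */
final class TruePositiveRateSketch {
    /** TP / (TP + FN) for the given label. */
    static double truePositiveRate(double[] predictions, double[] labels, double label) {
        if (predictions.length != labels.length) {
            throw new IllegalArgumentException("predictions and labels must have equal length");
        }
        long truePositives = 0;
        long actualPositives = 0;
        for (int i = 0; i < labels.length; i++) {
            if (labels[i] == label) {
                actualPositives++;
                if (predictions[i] == label) {
                    truePositives++;
                }
            }
        }
        // Assumption: report 0.0 when the label never occurs.
        return actualPositives == 0 ? 0.0 : (double) truePositives / actualPositives;
    }
}
```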
- trustStore() - Method in class org.apache.spark.SSLOptions
-
- trustStorePassword() - Method in class org.apache.spark.SSLOptions
-
- tryLog(Function0<T>) - Static method in class org.apache.spark.util.Utils
-
Executes the given block in a Try, logging any uncaught exceptions.
- tryOrExit(Function0<BoxedUnit>) - Static method in class org.apache.spark.util.Utils
-
Execute a block of code that evaluates to Unit, forwarding any uncaught exceptions to the
default UncaughtExceptionHandler
- tryOrIOException(Function0<BoxedUnit>) - Static method in class org.apache.spark.util.Utils
-
Execute a block of code that evaluates to Unit, re-throwing any non-fatal uncaught
exceptions as IOException.
- tryOrIOException(Function0<T>) - Static method in class org.apache.spark.util.Utils
-
Execute a block of code that returns a value, re-throwing any non-fatal uncaught
exceptions as IOException.
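The re-throwing pattern described for `tryOrIOException` can be approximated in Java as shown below. This is a sketch under stated assumptions, not the Scala source: the class name is hypothetical, `Callable` stands in for `Function0`, and "non-fatal" is approximated as any `Exception`, which is broader than Scala's notion of a non-fatal throwable.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Illustrative sketch only; not part of Spark. */
final class TryOrIOExceptionSketch {
    /** Runs the block; exceptions other than IOException are wrapped in one. */
    static <T> T tryOrIOException(Callable<T> block) throws IOException {
        try {
            return block.call();
        } catch (IOException e) {
            throw e; // already the right type, pass it through
        } catch (Exception e) {
            throw new IOException(e); // keep the original as the cause
        }
        // Errors such as OutOfMemoryError are not caught and propagate unchanged.
    }
}
```

Callers then only need to handle `IOException`, and the original failure stays available through `getCause()`.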
- tryUncacheQuery(DataFrame, boolean) - Method in class org.apache.spark.sql.CacheManager
-
Tries to remove the data for the given DataFrame from the cache if it's cached.
- TwitterInputDStream - Class in org.apache.spark.streaming.twitter
-
- TwitterInputDStream(StreamingContext, Option<Authorization>, Seq<String>, StorageLevel) - Constructor for class org.apache.spark.streaming.twitter.TwitterInputDStream
-
- TwitterReceiver - Class in org.apache.spark.streaming.twitter
-
- TwitterReceiver(Authorization, Seq<String>, StorageLevel) - Constructor for class org.apache.spark.streaming.twitter.TwitterReceiver
-
- TwitterUtils - Class in org.apache.spark.streaming.twitter
-
- TwitterUtils() - Constructor for class org.apache.spark.streaming.twitter.TwitterUtils
-
- typ() - Method in class org.apache.spark.streaming.scheduler.RegisterReceiver
-
- typeId() - Method in class org.apache.spark.sql.columnar.ColumnType
-
- typeId() - Static method in class org.apache.spark.sql.columnar.compression.BooleanBitSet
-
- typeId() - Method in interface org.apache.spark.sql.columnar.compression.CompressionScheme
-
- typeId() - Static method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding
-
- typeId() - Static method in class org.apache.spark.sql.columnar.compression.IntDelta
-
- typeId() - Static method in class org.apache.spark.sql.columnar.compression.LongDelta
-
- typeId() - Static method in class org.apache.spark.sql.columnar.compression.PassThrough
-
- typeId() - Static method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding
-
- U() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
-
- udf(Function0<RT>, TypeTags.TypeTag<RT>) - Static method in class org.apache.spark.sql.functions
-
Defines a function of 0 arguments as a user-defined function (UDF).
- udf(Function1<A1, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>) - Static method in class org.apache.spark.sql.functions
-
Defines a function of 1 argument as a user-defined function (UDF).
- udf(Function2<A1, A2, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>) - Static method in class org.apache.spark.sql.functions
-
Defines a function of 2 arguments as a user-defined function (UDF).
- udf(Function3<A1, A2, A3, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>) - Static method in class org.apache.spark.sql.functions
-
Defines a function of 3 arguments as a user-defined function (UDF).
- udf(Function4<A1, A2, A3, A4, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>) - Static method in class org.apache.spark.sql.functions
-
Defines a function of 4 arguments as a user-defined function (UDF).
- udf(Function5<A1, A2, A3, A4, A5, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>) - Static method in class org.apache.spark.sql.functions
-
Defines a function of 5 arguments as a user-defined function (UDF).
- udf(Function6<A1, A2, A3, A4, A5, A6, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>) - Static method in class org.apache.spark.sql.functions
-
Defines a function of 6 arguments as a user-defined function (UDF).
- udf(Function7<A1, A2, A3, A4, A5, A6, A7, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>) - Static method in class org.apache.spark.sql.functions
-
Defines a function of 7 arguments as a user-defined function (UDF).
- udf(Function8<A1, A2, A3, A4, A5, A6, A7, A8, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>) - Static method in class org.apache.spark.sql.functions
-
Defines a function of 8 arguments as a user-defined function (UDF).
- udf(Function9<A1, A2, A3, A4, A5, A6, A7, A8, A9, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>) - Static method in class org.apache.spark.sql.functions
-
Defines a function of 9 arguments as a user-defined function (UDF).
- udf(Function10<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>) - Static method in class org.apache.spark.sql.functions
-
Defines a function of 10 arguments as a user-defined function (UDF).
- udf() - Method in class org.apache.spark.sql.SQLContext
-
A collection of methods for registering user-defined functions (UDF).
- UDF1<T1,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 1 argument.
- UDF10<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 10 arguments.
- UDF11<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 11 arguments.
- UDF12<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 12 arguments.
- UDF13<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 13 arguments.
- UDF14<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 14 arguments.
- UDF15<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 15 arguments.
- UDF16<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 16 arguments.
- UDF17<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 17 arguments.
- UDF18<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 18 arguments.
- UDF19<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 19 arguments.
- UDF2<T1,T2,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 2 arguments.
- UDF20<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 20 arguments.
- UDF21<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,T21,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 21 arguments.
- UDF22<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,T21,T22,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 22 arguments.
- UDF3<T1,T2,T3,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 3 arguments.
- UDF4<T1,T2,T3,T4,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 4 arguments.
- UDF5<T1,T2,T3,T4,T5,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 5 arguments.
- UDF6<T1,T2,T3,T4,T5,T6,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 6 arguments.
- UDF7<T1,T2,T3,T4,T5,T6,T7,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 7 arguments.
- UDF8<T1,T2,T3,T4,T5,T6,T7,T8,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 8 arguments.
- UDF9<T1,T2,T3,T4,T5,T6,T7,T8,T9,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 9 arguments.
- UDFRegistration - Class in org.apache.spark.sql
-
Functions for registering user-defined functions.
- UDFRegistration(SQLContext) - Constructor for class org.apache.spark.sql.UDFRegistration
-
- ui() - Method in class org.apache.spark.SparkContext
-
- uid() - Method in interface org.apache.spark.ml.Identifiable
-
A unique id for the object.
- UIData - Class in org.apache.spark.ui.jobs
-
- UIData() - Constructor for class org.apache.spark.ui.jobs.UIData
-
- UIData.ExecutorSummary - Class in org.apache.spark.ui.jobs
-
- UIData.ExecutorSummary() - Constructor for class org.apache.spark.ui.jobs.UIData.ExecutorSummary
-
- UIData.JobUIData - Class in org.apache.spark.ui.jobs
-
- UIData.JobUIData(int, Option<Object>, Option<Object>, Seq<Object>, Option<String>, JobExecutionStatus, int, int, int, int, int, int, OpenHashSet<Object>, int, int) - Constructor for class org.apache.spark.ui.jobs.UIData.JobUIData
-
- UIData.JobUIData$ - Class in org.apache.spark.ui.jobs
-
- UIData.JobUIData$() - Constructor for class org.apache.spark.ui.jobs.UIData.JobUIData$
-
- UIData.StageUIData - Class in org.apache.spark.ui.jobs
-
- UIData.StageUIData() - Constructor for class org.apache.spark.ui.jobs.UIData.StageUIData
-
- UIData.TaskUIData - Class in org.apache.spark.ui.jobs
-
These are kept mutable and reused throughout a task's lifetime to avoid excessive reallocation.
- UIData.TaskUIData(TaskInfo, Option<TaskMetrics>, Option<String>) - Constructor for class org.apache.spark.ui.jobs.UIData.TaskUIData
-
- UIData.TaskUIData$ - Class in org.apache.spark.ui.jobs
-
- UIData.TaskUIData$() - Constructor for class org.apache.spark.ui.jobs.UIData.TaskUIData$
-
- uiRoot() - Static method in class org.apache.spark.ui.UIUtils
-
- uiTab() - Method in class org.apache.spark.streaming.StreamingContext
-
- UIUtils - Class in org.apache.spark.ui
-
Utility functions for generating XML pages with Spark content.
- UIUtils() - Constructor for class org.apache.spark.ui.UIUtils
-
- UIWorkloadGenerator - Class in org.apache.spark.ui
-
Continuously generates jobs that expose various features of the WebUI (internal testing tool).
- UIWorkloadGenerator() - Constructor for class org.apache.spark.ui.UIWorkloadGenerator
-
- unapply(DenseVector) - Static method in class org.apache.spark.mllib.linalg.DenseVector
-
Extracts the value array from a dense vector.
- unapply(SparseVector) - Static method in class org.apache.spark.mllib.linalg.SparseVector
-
- unapply(Column) - Static method in class org.apache.spark.sql.Column
-
- unapply(Object) - Method in class org.apache.spark.sql.hive.HiveQl.Token$
-
- unapply(Broker) - Static method in class org.apache.spark.streaming.kafka.Broker
-
- unapply(String) - Static method in class org.apache.spark.util.IntParam
-
- unapply(String) - Static method in class org.apache.spark.util.MemoryParam
-
- UnaryTransformer<IN,OUT,T extends UnaryTransformer<IN,OUT,T>> - Class in org.apache.spark.ml
-
Abstract class for transformers that take one input column, apply a transformation, and output the
result as a new column.
- UnaryTransformer() - Constructor for class org.apache.spark.ml.UnaryTransformer
-
- unBlockifyObject(ByteBuffer[], Serializer, Option<CompressionCodec>, ClassTag<T>) - Static method in class org.apache.spark.broadcast.TorrentBroadcast
-
- unbroadcast(long, boolean, boolean) - Method in interface org.apache.spark.broadcast.BroadcastFactory
-
- unbroadcast(long, boolean, boolean) - Method in class org.apache.spark.broadcast.BroadcastManager
-
- unbroadcast(long, boolean, boolean) - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
-
Remove all persisted state associated with the HTTP broadcast with the given ID.
- unbroadcast(long, boolean, boolean) - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
-
Remove all persisted state associated with the torrent broadcast with the given ID.
- uncacheQuery(DataFrame, boolean) - Method in class org.apache.spark.sql.CacheManager
-
Removes the data for the given DataFrame from the cache.
- uncacheTable(String) - Method in class org.apache.spark.sql.CacheManager
-
Removes the specified table from the in-memory cache.
- uncacheTable(String) - Method in class org.apache.spark.sql.SQLContext
-
Removes the specified table from the in-memory cache.
- UNCAUGHT_EXCEPTION() - Static method in class org.apache.spark.util.SparkExitCode
-
The default uncaught exception handler was reached.
- UNCAUGHT_EXCEPTION_TWICE() - Static method in class org.apache.spark.util.SparkExitCode
-
The default uncaught exception handler was called and an exception was encountered while
logging the exception.
- uncaughtException(Thread, Throwable) - Static method in class org.apache.spark.util.SparkUncaughtExceptionHandler
-
- uncaughtException(Throwable) - Static method in class org.apache.spark.util.SparkUncaughtExceptionHandler
-
- uncompressedSize() - Method in class org.apache.spark.sql.columnar.compression.BooleanBitSet.Encoder
-
- uncompressedSize() - Method in class org.apache.spark.sql.columnar.compression.DictionaryEncoding.Encoder
-
- uncompressedSize() - Method in interface org.apache.spark.sql.columnar.compression.Encoder
-
- uncompressedSize() - Method in class org.apache.spark.sql.columnar.compression.IntDelta.Encoder
-
- uncompressedSize() - Method in class org.apache.spark.sql.columnar.compression.LongDelta.Encoder
-
- uncompressedSize() - Method in class org.apache.spark.sql.columnar.compression.PassThrough.Encoder
-
- uncompressedSize() - Method in class org.apache.spark.sql.columnar.compression.RunLengthEncoding.Encoder
-
- underlyingBuffer() - Method in interface org.apache.spark.sql.columnar.ColumnAccessor
-
- underlyingSplit() - Method in class org.apache.spark.scheduler.SplitInfo
-
- UniformGenerator - Class in org.apache.spark.mllib.random
-
:: DeveloperApi ::
Generates i.i.d. samples from the uniform distribution U(0.0, 1.0).
- UniformGenerator() - Constructor for class org.apache.spark.mllib.random.UniformGenerator
-
- uniformJavaRDD(JavaSparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- uniformJavaRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- uniformJavaRDD(JavaSparkContext, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- uniformJavaVectorRDD(JavaSparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- uniformJavaVectorRDD(JavaSparkContext, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- uniformJavaVectorRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- uniformRDD(SparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD comprised of i.i.d. samples from the uniform distribution U(0.0, 1.0).
- uniformVectorRDD(SparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the uniform distribution U(0.0, 1.0).
- union(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return the union of this RDD and another one.
- union(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return the union of this RDD and another one.
- union(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
-
Return the union of this RDD and another one.
- union(JavaRDD<T>, List<JavaRDD<T>>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Build the union of two or more RDDs.
- union(JavaPairRDD<K, V>, List<JavaPairRDD<K, V>>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Build the union of two or more RDDs.
- union(JavaDoubleRDD, List<JavaDoubleRDD>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Build the union of two or more RDDs.
- union(RDD<T>) - Method in class org.apache.spark.rdd.RDD
-
Return the union of this RDD and another one.
- union(Seq<RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext
-
Build the union of a list of RDDs.
- union(RDD<T>, Seq<RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext
-
Build the union of a list of RDDs passed as variable-length arguments.
- union(JavaDStream<T>) - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Return a new DStream by unifying data of another DStream with this DStream.
- union(JavaPairDStream<K, V>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by unifying data of another DStream with this DStream.
- union(JavaDStream<T>, List<JavaDStream<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create a unified DStream from multiple DStreams of the same type and same slide duration.
- union(JavaPairDStream<K, V>, List<JavaPairDStream<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create a unified DStream from multiple DStreams of the same type and same slide duration.
- union(DStream<T>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream by unifying data of another DStream with this DStream.
- union(Seq<DStream<T>>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create a unified DStream from multiple DStreams of the same type and same slide duration.
- unionAll(DataFrame) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame containing the union of rows in this frame and another frame.
- UnionDStream<T> - Class in org.apache.spark.streaming.dstream
-
- UnionDStream(DStream<T>[], ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.UnionDStream
-
- UnionPartition<T> - Class in org.apache.spark.rdd
-
Partition for UnionRDD.
- UnionPartition(int, RDD<T>, int, int, ClassTag<T>) - Constructor for class org.apache.spark.rdd.UnionPartition
-
- UnionRDD<T> - Class in org.apache.spark.rdd
-
- UnionRDD(SparkContext, Seq<RDD<T>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.UnionRDD
-
- uniqueId() - Method in class org.apache.spark.storage.StreamBlockId
-
- UnknownReason - Class in org.apache.spark
-
:: DeveloperApi ::
We don't know why the task ended -- for example, because of a ClassNotFound exception when
deserializing the task result.
- UnknownReason() - Constructor for class org.apache.spark.UnknownReason
-
- unorderedFeatures() - Method in class org.apache.spark.mllib.tree.impl.DecisionTreeMetadata
-
- unpersist() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
- unpersist(boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
- unpersist() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
- unpersist(boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
- unpersist() - Method in class org.apache.spark.api.java.JavaRDD
-
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
- unpersist(boolean) - Method in class org.apache.spark.api.java.JavaRDD
-
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
- unpersist() - Method in class org.apache.spark.broadcast.Broadcast
-
Asynchronously delete cached copies of this broadcast on the executors.
- unpersist(boolean) - Method in class org.apache.spark.broadcast.Broadcast
-
Delete cached copies of this broadcast on the executors.
- unpersist(long, boolean, boolean) - Static method in class org.apache.spark.broadcast.HttpBroadcast
-
Remove all persisted blocks associated with this HTTP broadcast on the executors.
- unpersist(long, boolean, boolean) - Static method in class org.apache.spark.broadcast.TorrentBroadcast
-
Remove all persisted blocks associated with this torrent broadcast on the executors.
- unpersist(boolean) - Method in class org.apache.spark.graphx.Graph
-
Uncaches both vertices and edges of this graph.
- unpersist(boolean) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- unpersist(boolean) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- unpersist(boolean) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- unpersist() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Unpersist intermediate RDDs used in the computation.
- unpersist(boolean) - Method in class org.apache.spark.rdd.RDD
-
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
- unpersist(boolean) - Method in class org.apache.spark.sql.DataFrame
-
- unpersist() - Method in class org.apache.spark.sql.DataFrame
-
- unpersist() - Method in interface org.apache.spark.sql.RDDApi
-
- unpersist(boolean) - Method in interface org.apache.spark.sql.RDDApi
-
- unpersistRDD(int, boolean) - Method in class org.apache.spark.SparkContext
-
Unpersist an RDD from memory and/or disk storage.
- unpersistRDDFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- unpersistRDDToJson(SparkListenerUnpersistRDD) - Static method in class org.apache.spark.util.JsonProtocol
-
- unpersistVertices(boolean) - Method in class org.apache.spark.graphx.Graph
-
Uncaches only the vertices of this graph, leaving the edges alone.
- unpersistVertices(boolean) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- unregisterAllTables() - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
-
- unregisterMapOutput(int, int, BlockManagerId) - Method in class org.apache.spark.MapOutputTrackerMaster
-
Unregister map output information for the given shuffle, mapper, and block manager.
- unregisterShuffle(int) - Method in class org.apache.spark.MapOutputTracker
-
Unregister shuffle data.
- unregisterShuffle(int) - Method in class org.apache.spark.MapOutputTrackerMaster
-
Unregister shuffle data.
- unregisterTable(Seq<String>) - Method in class org.apache.spark.sql.hive.HiveMetastoreCatalog
-
UNIMPLEMENTED: It needs to be decided how we will persist in-memory tables to the metastore.
- unrollSafely(BlockId, Iterator<Object>, ArrayBuffer<Tuple2<BlockId, BlockStatus>>) - Method in class org.apache.spark.storage.MemoryStore
-
Unroll the given block in memory safely.
- unset() - Static method in class org.apache.spark.TaskContextHelper
-
- unsetConf(String) - Method in class org.apache.spark.sql.SQLConf
-
- until(Time, Duration) - Method in class org.apache.spark.streaming.Time
-
- untilOffset() - Method in class org.apache.spark.streaming.kafka.KafkaRDDPartition
-
- untilOffset() - Method in class org.apache.spark.streaming.kafka.OffsetRange
-
Exclusive ending offset.
- unwrap(Object, ObjectInspector) - Method in interface org.apache.spark.sql.hive.HiveInspectors
-
Converts hive types to native catalyst types.
- update(RDD<Vector>, double, String) - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
-
Perform a k-means update on a batch of data.
- update(int, int, double) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- update(Function1<Object, Object>) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- update(int, int, double) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Update the element at (i, j).
- update(Function1<Object, Object>) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Update all the values of this matrix using the function f.
- update(int, int, double) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- update(Function1<Object, Object>) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- update(int, int, double, double) - Method in class org.apache.spark.mllib.tree.impl.DTStatsAggregator
-
Update the stats for a given (feature, bin) for ordered features, using the given label.
- update(double[], int, double, double) - Method in class org.apache.spark.mllib.tree.impurity.EntropyAggregator
-
Update stats for one (node, feature, bin) with the given label.
- update(double[], int, double, double) - Method in class org.apache.spark.mllib.tree.impurity.GiniAggregator
-
Update stats for one (node, feature, bin) with the given label.
- update(double[], int, double, double) - Method in class org.apache.spark.mllib.tree.impurity.ImpurityAggregator
-
Update stats for one (node, feature, bin) with the given label.
- update(double[], int, double, double) - Method in class org.apache.spark.mllib.tree.impurity.VarianceAggregator
-
Update stats for one (node, feature, bin) with the given label.
- update() - Method in class org.apache.spark.scheduler.AccumulableInfo
-
- update(Row) - Method in class org.apache.spark.sql.hive.HiveUdafFunction
-
- update(Time) - Method in class org.apache.spark.streaming.dstream.DStreamCheckpointData
-
Updates the checkpoint data of the DStream.
- update(Time) - Method in class org.apache.spark.streaming.dstream.FileInputDStream.FileInputDStreamCheckpointData
-
- update(Time) - Method in class org.apache.spark.streaming.kafka.DirectKafkaInputDStream.DirectKafkaInputDStreamCheckpointData
-
- update(T1, T2) - Method in class org.apache.spark.util.MutablePair
-
Updates this pair with new values and returns itself.
- update(A, B) - Method in class org.apache.spark.util.TimeStampedHashMap
-
- update(A, B) - Method in class org.apache.spark.util.TimeStampedWeakValueHashMap
-
- UPDATE_PERIOD() - Method in class org.apache.spark.ui.ConsoleProgressBar
-
- updateAggregateMetrics(UIData.StageUIData, String, TaskMetrics, Option<TaskMetrics>) - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
Upon receiving new metrics for a task, updates the per-stage and per-executor-per-stage
aggregate metrics by calculating deltas between the currently recorded metrics and the new
metrics.
- updateBlock(BlockId, BlockStatus) - Method in class org.apache.spark.storage.StorageStatus
-
Update the given block in this storage status.
- updateBlockInfo(BlockId, StorageLevel, long, long, long) - Method in class org.apache.spark.storage.BlockManagerInfo
-
- updateBlockInfo(BlockManagerId, BlockId, StorageLevel, long, long, long) - Method in class org.apache.spark.storage.BlockManagerMaster
-
- updateCheckpointData(Time) - Method in class org.apache.spark.streaming.dstream.DStream
-
Refresh the list of checkpointed RDDs that will be saved along with the checkpoint of
this stream.
- updateCheckpointData(Time) - Method in class org.apache.spark.streaming.DStreamGraph
-
- updatedConf(SparkConf, String, String, String, Seq<String>, Map<String, String>) - Static method in class org.apache.spark.SparkContext
-
Creates a modified version of a SparkConf with the parameters that can be passed separately
to SparkContext, to make it easier to write SparkContext's constructors.
- updateEpoch(long) - Method in class org.apache.spark.MapOutputTracker
-
Called from executors to update the epoch number, potentially clearing old outputs
because of a fetch failure.
- updateGraph(Graph<VD, ED>) - Method in class org.apache.spark.mllib.impl.PeriodicGraphCheckpointer
-
Update currentGraph with a new graph.
- updateLastSeenMs() - Method in class org.apache.spark.storage.BlockManagerInfo
-
- updateNodeIndex(int[], Bin[][]) - Method in class org.apache.spark.mllib.tree.impl.NodeIndexUpdater
-
Determine a child node index based on the feature value and the split.
- updateNodeIndices(RDD<BaggedPoint<TreePoint>>, Map<Object, NodeIndexUpdater>[], Bin[][]) - Method in class org.apache.spark.mllib.tree.impl.NodeIdCache
-
Update the node index values in the cache.
- Updater - Class in org.apache.spark.mllib.optimization
-
:: DeveloperApi ::
Class used to perform steps (weight update) using Gradient Descent methods.
- Updater() - Constructor for class org.apache.spark.mllib.optimization.Updater
-
- updateRddInfo(Seq<RDDInfo>, Seq<StorageStatus>) - Static method in class org.apache.spark.storage.StorageUtils
-
Update the given list of RDDInfo with the given list of storage statuses.
- updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of each key.
- updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of each key.
- updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of the key.
- updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, Partitioner, JavaPairRDD<K, S>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of the key.
- updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of each key.
- updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, int, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of each key.
- updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, Partitioner, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of the key.
- updateStateByKey(Function1<Iterator<Tuple3<K, Seq<V>, Option<S>>>, Iterator<Tuple2<K, S>>>, Partitioner, boolean, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of each key.
- updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, Partitioner, RDD<Tuple2<K, S>>, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of the key.
- updateStateByKey(Function1<Iterator<Tuple3<K, Seq<V>, Option<S>>>, Iterator<Tuple2<K, S>>>, Partitioner, boolean, RDD<Tuple2<K, S>>, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of each key.
- updateVertices(Iterator<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.impl.EdgePartition
-
Return a new `EdgePartition` with updates to vertex attributes specified in `iter`.
- updateVertices(VertexRDD<VD>) - Method in class org.apache.spark.graphx.impl.ReplicatedVertexView
-
Return a new ReplicatedVertexView where vertex attributes in edge partition are updated using updates.
- upgrade(VertexRDD<VD>, boolean, boolean) - Method in class org.apache.spark.graphx.impl.ReplicatedVertexView
-
Upgrade the shipping level in-place to the specified levels by shipping vertex attributes from vertices.
- upper() - Method in class org.apache.spark.rdd.JdbcPartition
-
- upper(Column) - Static method in class org.apache.spark.sql.functions
-
Converts a string expression to upper case.
- UPPER() - Static method in class org.apache.spark.sql.hive.HiveQl
-
- upperBound() - Method in class org.apache.spark.sql.columnar.ColumnStatisticsSchema
-
- upperBound() - Method in class org.apache.spark.sql.jdbc.JDBCPartitioningInfo
-
- uri() - Method in class org.apache.spark.HttpServer
-
Get the URI of this HTTP server (http://host:port or https://host:port)
- url() - Method in class org.apache.spark.sql.jdbc.JDBCRelation
-
- useCachedData(LogicalPlan) - Method in class org.apache.spark.sql.CacheManager
-
Replaces segments of the given logical plan with cached versions where possible.
- useCompression() - Method in class org.apache.spark.sql.columnar.InMemoryRelation
-
- useCompression() - Method in class org.apache.spark.sql.SQLConf
-
When true, tables cached using in-memory columnar caching will be compressed.
- useDisk() - Method in class org.apache.spark.storage.StorageLevel
-
- useDst - Variable in class org.apache.spark.graphx.TripletFields
-
Indicates whether the destination vertex attribute is included.
- useEdge - Variable in class org.apache.spark.graphx.TripletFields
-
Indicates whether the edge attribute is included.
- useMemory() - Method in class org.apache.spark.storage.StorageLevel
-
- useNodeIdCache() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- useOffHeap() - Method in class org.apache.spark.storage.StorageLevel
-
- user() - Method in class org.apache.spark.ml.recommendation.ALS.Rating
-
- user() - Method in class org.apache.spark.mllib.recommendation.Rating
-
- user() - Method in class org.apache.spark.scheduler.JobLogger
-
- userClass() - Method in class org.apache.spark.mllib.linalg.VectorUDT
-
- userClass() - Method in class org.apache.spark.sql.test.ExamplePointUDT
-
- userCol() - Method in interface org.apache.spark.ml.recommendation.ALSParams
-
Param for the column name for user ids.
- UserDefinedFunction - Class in org.apache.spark.sql
-
A user-defined function.
- UserDefinedPythonFunction - Class in org.apache.spark.sql
-
A user-defined Python function.
- UserDefinedPythonFunction(String, byte[], Map<String, String>, List<String>, String, List<Broadcast<PythonBroadcast>>, Accumulator<List<byte[]>>, DataType) - Constructor for class org.apache.spark.sql.UserDefinedPythonFunction
-
- userFeatures() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
- userSpecifiedSchema() - Method in class org.apache.spark.sql.hive.execution.CreateMetastoreDataSource
-
- userSpecifiedSchema() - Method in class org.apache.spark.sql.json.JSONRelation
-
- userSpecifiedSchema() - Method in class org.apache.spark.sql.sources.CreateTableUsing
-
- userSpecifiedSchema() - Method in class org.apache.spark.sql.sources.CreateTempTableUsing
-
- useSrc - Variable in class org.apache.spark.graphx.TripletFields
-
Indicates whether the source vertex attribute is included.
- Utils - Class in org.apache.spark.util
-
Various utility methods used by Spark.
- Utils() - Constructor for class org.apache.spark.util.Utils
-
- UUIDFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
-
- UUIDToJson(UUID) - Static method in class org.apache.spark.util.JsonProtocol
-