pyspark.streaming.DStream.combineByKey
DStream.combineByKey(createCombiner: Callable[[V], U], mergeValue: Callable[[U, V], U], mergeCombiners: Callable[[U, U], U], numPartitions: Optional[int] = None) → pyspark.streaming.dstream.DStream[Tuple[K, U]]
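Return a new DStream by applying combineByKey to each RDD of this DStream. For each key, createCombiner turns the first value seen into a combiner of type U, mergeValue folds a further value into an existing combiner, and mergeCombiners merges two combiners built for the same key. numPartitions optionally sets the number of partitions of each resulting RDD.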