pyspark.SparkContext.accumulator

SparkContext.accumulator(value, accum_param=None)

Create an Accumulator with the given initial value, using a given AccumulatorParam helper object to define how values of its data type are added, if provided. Default AccumulatorParams are used for integers and floating-point numbers if you do not provide one. For other types, a custom AccumulatorParam can be used.

New in version 0.7.0.

Parameters
value : T

initial value

accum_param : pyspark.AccumulatorParam, optional

helper object to define how to add values

Returns
Accumulator

Accumulator object, a shared variable that can be accumulated

Examples

>>> acc = sc.accumulator(9)
>>> acc.value
9
>>> acc += 1
>>> acc.value
10

The Accumulator can also be updated inside RDD operations:

>>> rdd = sc.range(5)
>>> def f(x):
...     global acc
...     acc += 1
...
>>> rdd.foreach(f)
>>> acc.value
15