SparkContext.broadcast(value) → Broadcast
Broadcast a read-only variable to the cluster, returning a Broadcast object for reading it in distributed functions. The variable is sent to each node only once.
New in version 0.7.0.
Parameters
value: value to broadcast to the Spark nodes
Returns
Broadcast: Broadcast object, a read-only variable cached on each machine
Examples
>>> mapping = {1: 10001, 2: 10002}
>>> bc = sc.broadcast(mapping)
>>> rdd = sc.range(5)
>>> rdd2 = rdd.map(lambda i: bc.value[i] if i in bc.value else -1)
>>> rdd2.collect()
[-1, 10001, 10002, -1, -1]
>>> bc.destroy()
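The same lookup pattern can be wrapped in a small helper so that the broadcast is always released, even if the job fails. This is a minimal sketch, assuming an active SparkContext is passed in as `sc`; the helper name `translate_with_broadcast` and its `default` parameter are hypothetical and not part of PySpark.

```python
def translate_with_broadcast(sc, mapping, keys, default=-1):
    """Map each key through a broadcast lookup table (hypothetical helper).

    `sc` is assumed to be an active SparkContext. Keys missing from
    `mapping` are replaced by `default`.
    """
    # Ship the lookup table to the workers once instead of with every task.
    bc = sc.broadcast(mapping)
    try:
        rdd = sc.parallelize(keys)
        # Tasks read the cached copy through bc.value.
        return rdd.map(lambda k: bc.value.get(k, default)).collect()
    finally:
        # Release the broadcast data; bc must not be used afterwards.
        bc.destroy()
```

Called as `translate_with_broadcast(sc, {1: 10001, 2: 10002}, range(5))`, this should return the same list as the example above. Using `dict.get` avoids looking up each key twice, and the `try`/`finally` mirrors the explicit `bc.destroy()` call in the doctest.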