pyspark.sql.functions.map_concat

pyspark.sql.functions.map_concat(*cols)

Map function: Returns the union of all given maps.

New in version 2.4.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
cols : Column or str

    Column names or Column objects of the maps to merge.

Returns
Column

A map containing the merged entries of the given maps.

Notes

Duplicate keys across input maps are handled according to spark.sql.mapKeyDedupPolicy. With the default (EXCEPTION), an exception is thrown; if set to LAST_WIN, the value from the last map containing the key wins.
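The dedup policy above can be illustrated without a Spark session. The sketch below is a plain-Python emulation of the two policies, not part of the PySpark API; the helper name merge_maps is illustrative only.

```python
def merge_maps(*maps, policy="EXCEPTION"):
    """Emulate map_concat's key-dedup behavior on plain dicts.

    policy="EXCEPTION" mirrors the default: duplicate keys raise.
    policy="LAST_WIN" mirrors LAST_WIN: later maps overwrite earlier ones.
    """
    merged = {}
    for m in maps:
        for key, value in m.items():
            if key in merged and policy == "EXCEPTION":
                raise ValueError(f"Duplicate map key {key!r}")
            merged[key] = value  # under LAST_WIN, later value replaces earlier
    return merged

merge_maps({1: "a", 2: "b"}, {2: "c", 3: "d"}, policy="LAST_WIN")
# → {1: 'a', 2: 'c', 3: 'd'}
```

Note that with the default policy the same call would raise on the duplicate key 2, matching the exception map_concat throws unless LAST_WIN is configured.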

Examples

Example 1: Basic usage of map_concat

>>> from pyspark.sql import functions as sf
>>> df = spark.sql("SELECT map(1, 'a', 2, 'b') as map1, map(3, 'c') as map2")
>>> df.select(sf.map_concat("map1", "map2")).show(truncate=False)
+------------------------+
|map_concat(map1, map2)  |
+------------------------+
|{1 -> a, 2 -> b, 3 -> c}|
+------------------------+

Example 2: map_concat with overlapping keys

>>> from pyspark.sql import functions as sf
>>> original_policy = spark.conf.get("spark.sql.mapKeyDedupPolicy")
>>> spark.conf.set("spark.sql.mapKeyDedupPolicy", "LAST_WIN")
>>> df = spark.sql("SELECT map(1, 'a', 2, 'b') as map1, map(2, 'c', 3, 'd') as map2")
>>> df.select(sf.map_concat("map1", "map2")).show(truncate=False)
+------------------------+
|map_concat(map1, map2)  |
+------------------------+
|{1 -> a, 2 -> c, 3 -> d}|
+------------------------+
>>> spark.conf.set("spark.sql.mapKeyDedupPolicy", original_policy)

Example 3: map_concat with three maps

>>> from pyspark.sql import functions as sf
>>> df = spark.sql("SELECT map(1, 'a') as map1, map(2, 'b') as map2, map(3, 'c') as map3")
>>> df.select(sf.map_concat("map1", "map2", "map3")).show(truncate=False)
+----------------------------+
|map_concat(map1, map2, map3)|
+----------------------------+
|{1 -> a, 2 -> b, 3 -> c}    |
+----------------------------+

Example 4: map_concat with empty map

>>> from pyspark.sql import functions as sf
>>> df = spark.sql("SELECT map(1, 'a', 2, 'b') as map1, map() as map2")
>>> df.select(sf.map_concat("map1", "map2")).show(truncate=False)
+----------------------+
|map_concat(map1, map2)|
+----------------------+
|{1 -> a, 2 -> b}      |
+----------------------+

Example 5: map_concat with null values

>>> from pyspark.sql import functions as sf
>>> df = spark.sql("SELECT map(1, 'a', 2, 'b') as map1, map(3, null) as map2")
>>> df.select(sf.map_concat("map1", "map2")).show(truncate=False)
+---------------------------+
|map_concat(map1, map2)     |
+---------------------------+
|{1 -> a, 2 -> b, 3 -> NULL}|
+---------------------------+