pyspark.sql.functions.map_concat
pyspark.sql.functions.map_concat(*cols)
Map function: Returns the union of all given maps.
New in version 2.4.0.
Changed in version 3.4.0: Supports Spark Connect.
- Parameters
cols : Column or str
    Column names or Columns of map type to be merged.
- Returns
Column
A map of merged entries from other maps.
Notes
For duplicate keys in input maps, the handling is governed by spark.sql.mapKeyDedupPolicy. By default, it throws an exception. If set to LAST_WIN, it uses the last map’s value.
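For intuition, the LAST_WIN policy matches the semantics of a plain-Python dict merge, where the right-hand mapping's value wins for any duplicate key. This is an analogy only, not Spark code:

```python
# Plain-Python analogy for LAST_WIN: merging dicts keeps the
# right-hand value when a key appears in both mappings.
map1 = {1: "a", 2: "b"}
map2 = {2: "c", 3: "d"}
merged = {**map1, **map2}
print(merged)  # {1: 'a', 2: 'c', 3: 'd'}
```

Under the default EXCEPTION policy, the same duplicate key (2 here) would instead cause map_concat to fail at runtime.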
Examples
Example 1: Basic usage of map_concat
>>> from pyspark.sql import functions as sf
>>> df = spark.sql("SELECT map(1, 'a', 2, 'b') as map1, map(3, 'c') as map2")
>>> df.select(sf.map_concat("map1", "map2")).show(truncate=False)
+------------------------+
|map_concat(map1, map2)  |
+------------------------+
|{1 -> a, 2 -> b, 3 -> c}|
+------------------------+
Example 2: map_concat with overlapping keys
>>> from pyspark.sql import functions as sf
>>> originalmapKeyDedupPolicy = spark.conf.get("spark.sql.mapKeyDedupPolicy")
>>> spark.conf.set("spark.sql.mapKeyDedupPolicy", "LAST_WIN")
>>> df = spark.sql("SELECT map(1, 'a', 2, 'b') as map1, map(2, 'c', 3, 'd') as map2")
>>> df.select(sf.map_concat("map1", "map2")).show(truncate=False)
+------------------------+
|map_concat(map1, map2)  |
+------------------------+
|{1 -> a, 2 -> c, 3 -> d}|
+------------------------+
>>> spark.conf.set("spark.sql.mapKeyDedupPolicy", originalmapKeyDedupPolicy)
Example 3: map_concat with three maps
>>> from pyspark.sql import functions as sf
>>> df = spark.sql("SELECT map(1, 'a') as map1, map(2, 'b') as map2, map(3, 'c') as map3")
>>> df.select(sf.map_concat("map1", "map2", "map3")).show(truncate=False)
+----------------------------+
|map_concat(map1, map2, map3)|
+----------------------------+
|{1 -> a, 2 -> b, 3 -> c}    |
+----------------------------+
Example 4: map_concat with empty map
>>> from pyspark.sql import functions as sf
>>> df = spark.sql("SELECT map(1, 'a', 2, 'b') as map1, map() as map2")
>>> df.select(sf.map_concat("map1", "map2")).show(truncate=False)
+----------------------+
|map_concat(map1, map2)|
+----------------------+
|{1 -> a, 2 -> b}      |
+----------------------+
Example 5: map_concat with null values
>>> from pyspark.sql import functions as sf
>>> df = spark.sql("SELECT map(1, 'a', 2, 'b') as map1, map(3, null) as map2")
>>> df.select(sf.map_concat("map1", "map2")).show(truncate=False)
+---------------------------+
|map_concat(map1, map2)     |
+---------------------------+
|{1 -> a, 2 -> b, 3 -> NULL}|
+---------------------------+