pyspark.sql.functions.concat
Concatenates multiple input columns into a single column. The function works with string, numeric, binary, and compatible array columns.
New in version 1.5.0.
Changed in version 3.4.0: Supports Spark Connect.
Parameters
cols : Column or str
    target column or columns to work on.
Returns
Column
    concatenated values. The type of the Column depends on the input columns' types.
See also
pyspark.sql.functions.array_join()
to concatenate the elements of an array column using a delimiter
Examples
>>> from pyspark.sql.functions import concat
>>> df = spark.createDataFrame([('abcd','123')], ['s', 'd'])
>>> df = df.select(concat(df.s, df.d).alias('s'))
>>> df.collect()
[Row(s='abcd123')]
>>> df
DataFrame[s: string]
>>> df = spark.createDataFrame([([1, 2], [3, 4], [5]), ([1, 2], None, [3])], ['a', 'b', 'c'])
>>> df = df.select(concat(df.a, df.b, df.c).alias("arr"))
>>> df.collect()
[Row(arr=[1, 2, 3, 4, 5]), Row(arr=None)]
>>> df
DataFrame[arr: array<bigint>]
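The second row of the array example shows that a single null input makes the whole result null. The following is a minimal pure-Python sketch of that row-level behaviour, covering only the string and array cases shown above. The helper name `concat_model` is hypothetical and not part of PySpark; it models the observed results and is not how Spark implements the function.

```python
from typing import List, Optional, Union

Value = Union[str, List]


def concat_model(first: Optional[Value], *rest: Optional[Value]) -> Optional[Value]:
    """Hypothetical model of concat() applied to the values of one row.

    Assumption: only all-string or all-list inputs are modelled here.
    """
    values = (first, *rest)

    # Any null (None) input makes the whole result null, as in Row(arr=None).
    if any(v is None for v in values):
        return None

    # All strings: join them end to end.
    if all(isinstance(v, str) for v in values):
        return "".join(values)

    # All lists: append the elements in argument order.
    if all(isinstance(v, list) for v in values):
        merged: List = []
        for v in values:
            merged.extend(v)
        return merged

    raise TypeError("this sketch only models all-string or all-list inputs")


print(concat_model("abcd", "123"))
print(concat_model([1, 2], [3, 4], [5]))
print(concat_model([1, 2], None, [3]))
```

The three calls mirror the rows used in the examples above, so their results can be compared directly with the `collect()` output.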