pyspark.sql.functions.concat

pyspark.sql.functions.concat(*cols: ColumnOrName) → pyspark.sql.column.Column

Concatenates multiple input columns together into a single column. The function works with string, numeric, binary, and compatible array columns.

New in version 1.5.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
cols : Column or str

Target column or columns to concatenate.

Returns
Column

Concatenated values. The type of the resulting Column depends on the input columns' types; if any input value is NULL, the result is NULL.

See also

pyspark.sql.functions.array_join()

to concatenate the elements of an array column into a string with a delimiter
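The following is an illustrative sketch, not part of the original examples, contrasting concat with the related functions array_join and concat_ws. It assumes an active SparkSession bound to spark and a hypothetical two-column DataFrame.

>>> from pyspark.sql.functions import array_join, concat, concat_ws, lit
>>> df2 = spark.createDataFrame([('ab', ['x', 'y', 'z'])], ['s', 'arr'])
>>> df2.select(concat(df2.s, lit('_'), df2.s).alias('v')).collect()  # no separator
[Row(v='ab_ab')]
>>> df2.select(array_join(df2.arr, '-').alias('v')).collect()  # joins array elements with a delimiter
[Row(v='x-y-z')]
>>> df2.select(concat_ws('/', df2.s, df2.s).alias('v')).collect()  # joins string columns with a delimiter
[Row(v='ab/ab')]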

Examples

>>> from pyspark.sql.functions import concat
>>> df = spark.createDataFrame([('abcd','123')], ['s', 'd'])
>>> df = df.select(concat(df.s, df.d).alias('s'))
>>> df.collect()
[Row(s='abcd123')]
>>> df
DataFrame[s: string]
>>> df = spark.createDataFrame([([1, 2], [3, 4], [5]), ([1, 2], None, [3])], ['a', 'b', 'c'])
>>> df = df.select(concat(df.a, df.b, df.c).alias("arr"))
>>> df.collect()
[Row(arr=[1, 2, 3, 4, 5]), Row(arr=None)]
>>> df
DataFrame[arr: array<bigint>]
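String inputs follow the same NULL rule shown for arrays above: if any input is NULL, the result is NULL. The sketch below is an added illustration, assuming the same spark session; concat_ws is included only as the null-skipping, delimited alternative.

>>> from pyspark.sql.functions import concat_ws
>>> df3 = spark.createDataFrame([('abcd', '123'), ('efgh', None)], ['s', 'd'])
>>> df3.select(concat(df3.s, df3.d).alias('v')).collect()  # NULL input yields NULL
[Row(v='abcd123'), Row(v=None)]
>>> df3.select(concat_ws('-', df3.s, df3.d).alias('v')).collect()  # concat_ws skips NULLs
[Row(v='abcd-123'), Row(v='efgh')]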