DataFrame.union(other)
Return a new DataFrame containing the union of rows in this and another DataFrame.
New in version 2.0.0.
Changed in version 3.4.0: Supports Spark Connect.
Parameters
other : DataFrame
    Another DataFrame that needs to be unioned.
See also
DataFrame.unionAll
Notes
This is equivalent to UNION ALL in SQL. To do a SQL-style set union (that does deduplication of elements), use this function followed by distinct().
As is standard in SQL, this function also resolves columns by position, not by name.
Examples
>>> df1 = spark.createDataFrame([[1, 2, 3]], ["col0", "col1", "col2"])
>>> df2 = spark.createDataFrame([[4, 5, 6]], ["col1", "col2", "col0"])
>>> df1.union(df2).show()
+----+----+----+
|col0|col1|col2|
+----+----+----+
|   1|   2|   3|
|   4|   5|   6|
+----+----+----+
>>> df1.union(df1).show()
+----+----+----+
|col0|col1|col2|
+----+----+----+
|   1|   2|   3|
|   1|   2|   3|
+----+----+----+