DataFrame.intersect
Return a new DataFrame containing only the rows that appear in both this DataFrame and another DataFrame. Duplicate rows are removed from the result; to preserve duplicates, use intersectAll().
New in version 1.3.0.
Changed in version 3.4.0: Supports Spark Connect.
Parameters
other : DataFrame
    Another DataFrame to intersect with this one.
Returns
DataFrame
    DataFrame containing the distinct rows common to both inputs.
Notes
This is equivalent to INTERSECT in SQL.
Examples
>>> df1 = spark.createDataFrame([("a", 1), ("a", 1), ("b", 3), ("c", 4)], ["C1", "C2"])
>>> df2 = spark.createDataFrame([("a", 1), ("a", 1), ("b", 3)], ["C1", "C2"])
>>> df1.intersect(df2).sort(df1.C1.desc()).show()
+---+---+
| C1| C2|
+---+---+
|  b|  3|
|  a|  1|
+---+---+