DataFrame.
subtract
Return a new DataFrame containing rows in this DataFrame but not in another DataFrame.
DataFrame
New in version 1.3.0.
Changed in version 3.4.0: Supports Spark Connect.
Another DataFrame that needs to be subtracted.
Subtracted DataFrame.
Notes
This is equivalent to EXCEPT DISTINCT in SQL.
Examples
>>> df1 = spark.createDataFrame([("a", 1), ("a", 1), ("b", 3), ("c", 4)], ["C1", "C2"]) >>> df2 = spark.createDataFrame([("a", 1), ("a", 1), ("b", 3)], ["C1", "C2"]) >>> df1.subtract(df2).show() +---+---+ | C1| C2| +---+---+ | c| 4| +---+---+