pyspark.sql.functions.covar_samp(col1, col2) → Column
Returns a new Column for the sample covariance of col1 and col2.
New in version 2.0.0.
Changed in version 3.4.0: Supports Spark Connect.
Parameters
col1
    first column to calculate covariance.
col2
    second column to calculate covariance.
Returns
Column
    sample covariance of these two column values.
Examples
>>> from pyspark.sql.functions import covar_samp
>>> a = [1] * 10
>>> b = [1] * 10
>>> df = spark.createDataFrame(zip(a, b), ["a", "b"])
>>> df.agg(covar_samp("a", "b").alias('c')).collect()
[Row(c=0.0)]
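The example above returns 0.0 because both columns are constant, so every deviation from the mean is zero. To make the quantity itself concrete, here is a minimal plain-Python sketch of the textbook sample covariance, which divides the sum of deviation products by n - 1. The helper name `sample_covariance` and the non-constant input data are hypothetical, introduced only for illustration. This is not Spark's implementation, and raising an error for fewer than two pairs is a choice made for this sketch, not a description of how `covar_samp` handles that case.

```python
from typing import Sequence


def sample_covariance(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Textbook sample covariance with the n - 1 denominator (illustrative helper)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        # Assumption for this sketch only: reject inputs with fewer than two pairs.
        raise ValueError("need at least two pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum the products of paired deviations, then divide by n - 1.
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (n - 1)


# Same data as the documented example: constant columns give zero covariance.
constant_result = sample_covariance([1] * 10, [1] * 10)

# Hypothetical non-constant data where the second list is twice the first.
# The deviation products sum to 10, so the result is 10 / 3 (about 3.33).
varying_result = sample_covariance([1, 2, 3, 4], [2, 4, 6, 8])

print(constant_result, varying_result)
```

On real data, a small cross-check like this against `df.agg(covar_samp(...))` should agree up to floating-point rounding, assuming no null values in either column.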