pyspark.sql.functions.covar_samp

pyspark.sql.functions.covar_samp(col1: ColumnOrName, col2: ColumnOrName) → pyspark.sql.column.Column

Returns a new Column for the sample covariance of col1 and col2.

New in version 2.0.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
col1 : Column or str

first column of the pair for which the covariance is calculated.

col2 : Column or str

second column of the pair for which the covariance is calculated.

Returns
Column

sample covariance of the values in the two columns.

Examples

>>> from pyspark.sql.functions import covar_samp
>>> a = [1] * 10
>>> b = [1] * 10
>>> df = spark.createDataFrame(zip(a, b), ["a", "b"])
>>> df.agg(covar_samp("a", "b").alias('c')).collect()
[Row(c=0.0)]
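
Both columns in the example above are constant, so the result is 0.0 and says little about how the value is computed. The following is a minimal plain-Python sketch of the usual sample covariance definition (sum of products of deviations from the means, divided by n - 1), which can serve as a rough cross-check on small data. The helper name sample_covariance is hypothetical and not part of PySpark, and returning None for fewer than two pairs is a choice made for this sketch only, not a statement about Spark's behavior.

```python
from statistics import mean


def sample_covariance(xs, ys):
    """Hypothetical reference helper: sample covariance with an n - 1 denominator."""
    xs, ys = list(xs), list(ys)
    if len(xs) != len(ys):
        raise ValueError("both inputs must have the same length")
    n = len(xs)
    if n < 2:
        # Undefined for fewer than two pairs; None is this sketch's convention.
        return None
    mx, my = mean(xs), mean(ys)
    # Sum of products of deviations from the means, divided by n - 1.
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


# Same data as the documented example: constant columns give 0.0.
constant_result = sample_covariance([1] * 10, [1] * 10)

# A non-degenerate case: b is exactly 2 * a.
linear_result = sample_covariance([1, 2, 3], [2, 4, 6])

print(constant_result)  # 0.0
print(linear_result)    # 2.0
```

For the second input the deviations multiply to 2, 0 and 2, and dividing their sum by n - 1 = 2 gives 2.0.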