ChiSquareTest

object ChiSquareTest

Chi-square hypothesis testing for categorical data.

See Wikipedia for more information on the Chi-squared test.

Annotations: @Since( "2.2.0" )
Source: ChiSquareTest.scala

Linear Supertypes

AnyRef, Any


Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clone(): AnyRef

Attributes
protected[lang]
Definition Classes
AnyRef
Annotations
@throws( ... ) @native()
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def finalize(): Unit

Attributes
protected[lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
Annotations
@native()
def hashCode(): Int

Definition Classes
AnyRef → Any
Annotations
@native()
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
Annotations
@native()
final def notifyAll(): Unit

Definition Classes
AnyRef
Annotations
@native()
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def test(dataset: DataFrame, featuresCol: String, labelCol: String): DataFrame
Conduct Pearson's independence test for every feature against the label. For each feature, the (feature, label) pairs are converted into a contingency matrix for which the Chi-squared statistic is computed. All label and feature values must be categorical.
The null hypothesis is that the occurrence of the outcomes is statistically independent.
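As an illustration of the contingency-matrix step described above, the following is a minimal sketch of the textbook Pearson chi-squared computation for a single feature, in plain Scala without Spark. It is not Spark's implementation; `ContingencySketch` and `chiSquared` are hypothetical names introduced only for this example.

```scala
object ContingencySketch {
  // Hypothetical helper: builds a contingency table from (feature, label)
  // pairs and returns (Pearson chi-squared statistic, degrees of freedom).
  def chiSquared(pairs: Seq[(Double, Double)]): (Double, Int) = {
    val n = pairs.size.toDouble
    val featureValues = pairs.map(_._1).distinct
    val labelValues = pairs.map(_._2).distinct

    // Observed count for every (feature value, label value) cell.
    val observed: Map[(Double, Double), Double] =
      pairs.groupBy(identity).map { case (k, v) => k -> v.size.toDouble }
    // Marginal totals per feature value (rows) and per label value (columns).
    val rowTotals = pairs.groupBy(_._1).map { case (k, v) => k -> v.size.toDouble }
    val colTotals = pairs.groupBy(_._2).map { case (k, v) => k -> v.size.toDouble }

    // Sum of (observed - expected)^2 / expected over all cells, where
    // expected = rowTotal * colTotal / n under the independence hypothesis.
    val statistic = (for {
      f <- featureValues
      l <- labelValues
    } yield {
      val expected = rowTotals(f) * colTotals(l) / n
      val obs = observed.getOrElse((f, l), 0.0)
      (obs - expected) * (obs - expected) / expected
    }).sum

    val dof = (featureValues.size - 1) * (labelValues.size - 1)
    (statistic, dof)
  }
}
```

For example, `ContingencySketch.chiSquared(Seq((0.0, 0.0), (0.0, 0.0), (1.0, 1.0), (1.0, 1.0)))` yields a statistic of 4.0 with 1 degree of freedom, since every expected cell count is 1 while the observed counts are 2, 0, 0, and 2.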
dataset
DataFrame of categorical labels and categorical features. Real-valued features are treated as categorical, with each distinct value forming its own category.
featuresCol
Name of features column in dataset, of type Vector (VectorUDT)
labelCol
Name of label column in dataset, of any numerical type
returns
DataFrame containing the test result for every feature against the label. This DataFrame will contain a single Row with the following fields:
- pValues: Vector
- degreesOfFreedom: Array[Int]
- statistics: Vector
Each of these fields has one value per feature.
Annotations
@Since( "2.2.0" )
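A minimal usage sketch of `test` follows. It assumes a local `SparkSession` and uses made-up sample data; the column names `"label"` and `"features"` and the object name `ChiSquareTestExample` are illustrative choices, not requirements of the API.

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.ml.stat.ChiSquareTest
import org.apache.spark.sql.SparkSession

object ChiSquareTestExample {
  def main(args: Array[String]): Unit = {
    // Assumption: run locally; adjust master and app name for a real cluster.
    val spark = SparkSession.builder()
      .appName("ChiSquareTestExample")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Illustrative data: a numeric label and a two-element feature vector.
    val data = Seq(
      (0.0, Vectors.dense(0.5, 10.0)),
      (0.0, Vectors.dense(1.5, 20.0)),
      (1.0, Vectors.dense(1.5, 30.0)),
      (0.0, Vectors.dense(3.5, 30.0)),
      (0.0, Vectors.dense(3.5, 40.0)),
      (1.0, Vectors.dense(3.5, 40.0))
    )
    val df = data.toDF("label", "features")

    // The result is a single Row; each field holds one value per feature.
    val result = ChiSquareTest.test(df, "features", "label").head()
    println(s"pValues = ${result.getAs[Vector]("pValues")}")
    println(s"degreesOfFreedom = ${result.getAs[Seq[Int]]("degreesOfFreedom").mkString("[", ",", "]")}")
    println(s"statistics = ${result.getAs[Vector]("statistics")}")

    spark.stop()
  }
}
```

Reading the fields by name rather than by position keeps the example independent of the column order in the returned Row.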
def toString(): String

Definition Classes
AnyRef → Any
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... ) @native()
