public class ChiSqTest extends Object implements Logging
The goodness-of-fit test is conducted on two Vectors, whereas the test of independence is conducted on an input of type Matrix, in which independence between columns is assessed.
We also provide a method for computing the chi-squared statistic between each feature and the label for an input RDD[LabeledPoint]; it returns an Array[ChiSqTestResult] whose size equals the number of features in the input RDD.
Supported methods for goodness of fit: pearson (default)
Supported methods for independence: pearson (default)
More information on Chi-squared test: http://en.wikipedia.org/wiki/Chi-squared_test
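To make the two test types described above concrete, the following is a minimal plain-Java sketch of Pearson's chi-squared statistic for goodness of fit (two vectors) and for independence (a matrix of counts). It does not use Spark and is not this class's implementation. The class name `PearsonSketch`, its method names, and the choice to rescale `expected` to the observed total are assumptions made for illustration. It computes only the statistic and the degrees of freedom, not the p-value.

```java
import java.util.Arrays;

/** Illustrative sketch only; the names here are hypothetical and not part of Spark. */
final class PearsonSketch {

    /** Goodness of fit: sum over categories of (observed - expected)^2 / expected. */
    static double goodnessOfFit(double[] observed, double[] expected) {
        if (observed.length != expected.length) {
            throw new IllegalArgumentException("observed and expected must have the same length");
        }
        // Assumption of this sketch: rescale expected so both vectors share the same total.
        double scale = Arrays.stream(observed).sum() / Arrays.stream(expected).sum();
        double statistic = 0.0;
        for (int i = 0; i < observed.length; i++) {
            double e = expected[i] * scale;
            if (e <= 0.0) {
                throw new IllegalArgumentException("expected counts must be positive");
            }
            double diff = observed[i] - e;
            statistic += diff * diff / e;
        }
        return statistic;
    }

    /** Independence: the expected count of cell (i, j) is rowSum[i] * colSum[j] / total. */
    static double independence(double[][] counts) {
        int rows = counts.length;
        int cols = counts[0].length;
        double[] rowSums = new double[rows];
        double[] colSums = new double[cols];
        double total = 0.0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                rowSums[i] += counts[i][j];
                colSums[j] += counts[i][j];
                total += counts[i][j];
            }
        }
        double statistic = 0.0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                double e = rowSums[i] * colSums[j] / total;
                if (e <= 0.0) {
                    throw new IllegalArgumentException("every row and column needs a positive sum");
                }
                double diff = counts[i][j] - e;
                statistic += diff * diff / e;
            }
        }
        return statistic;
    }

    /** Degrees of freedom for an independence test on a rows x cols table. */
    static int independenceDegreesOfFreedom(int rows, int cols) {
        return (rows - 1) * (cols - 1);
    }

    public static void main(String[] args) {
        System.out.println(goodnessOfFit(new double[] {10, 20, 30}, new double[] {20, 20, 20}));
        System.out.println(independence(new double[][] {{10, 20}, {20, 10}}));
    }
}
```

A larger statistic, relative to the degrees of freedom, indicates stronger evidence against the null hypothesis. Turning it into a p-value requires the chi-squared distribution, which this sketch leaves out.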
Modifier and Type | Class and Description |
---|---|
static class | ChiSqTest.Method |
static class | ChiSqTest.Method$ |
static class | ChiSqTest.NullHypothesis$ |
Constructor and Description |
---|
ChiSqTest() |
Modifier and Type | Method and Description |
---|---|
static ChiSqTestResult | chiSquared(Vector observed, Vector expected, String methodName) |
static ChiSqTestResult[] | chiSquaredFeatures(RDD<LabeledPoint> data, String methodName) Conduct Pearson's independence test for each feature against the label across the input RDD. |
static ChiSqTestResult | chiSquaredMatrix(Matrix counts, String methodName) |
static ChiSqTest.Method | PEARSON() |
Methods inherited from class Object: equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Logging: initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
public static ChiSqTest.Method PEARSON()
public static ChiSqTestResult[] chiSquaredFeatures(RDD<LabeledPoint> data, String methodName)
public static ChiSqTestResult chiSquared(Vector observed, Vector expected, String methodName)
public static ChiSqTestResult chiSquaredMatrix(Matrix counts, String methodName)
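As a rough illustration of the per-feature test that chiSquaredFeatures describes, the sketch below builds, for each feature, a contingency table of distinct feature values against distinct labels and computes Pearson's independence statistic on it. This is plain Java operating on an in-memory list, not an RDD, and it is not Spark's implementation. `FeatureChiSqSketch`, `Point`, and every method name are hypothetical, and features are assumed to be categorical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; the names here are hypothetical and not part of Spark. */
final class FeatureChiSqSketch {

    /** Hypothetical stand-in for a labeled point with categorical feature values. */
    static final class Point {
        final double label;
        final double[] features;

        Point(double label, double... features) {
            this.label = label;
            this.features = features;
        }
    }

    /** Counts co-occurrences of each distinct feature value (rows) and label (columns). */
    static double[][] contingencyTable(List<Point> data, int featureIndex) {
        Map<Double, Integer> rowIndex = new LinkedHashMap<>();
        Map<Double, Integer> colIndex = new LinkedHashMap<>();
        for (Point p : data) {
            rowIndex.putIfAbsent(p.features[featureIndex], rowIndex.size());
            colIndex.putIfAbsent(p.label, colIndex.size());
        }
        double[][] counts = new double[rowIndex.size()][colIndex.size()];
        for (Point p : data) {
            counts[rowIndex.get(p.features[featureIndex])][colIndex.get(p.label)] += 1.0;
        }
        return counts;
    }

    /** Pearson's independence statistic: sum of (observed - expected)^2 / expected. */
    static double pearsonStatistic(double[][] counts) {
        int rows = counts.length;
        int cols = counts[0].length;
        double[] rowSums = new double[rows];
        double[] colSums = new double[cols];
        double total = 0.0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                rowSums[i] += counts[i][j];
                colSums[j] += counts[i][j];
                total += counts[i][j];
            }
        }
        double statistic = 0.0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                // Every row and column holds at least one observation, so expected > 0.
                double expected = rowSums[i] * colSums[j] / total;
                double diff = counts[i][j] - expected;
                statistic += diff * diff / expected;
            }
        }
        return statistic;
    }

    /** Returns one statistic per feature, so the result size equals the number of features. */
    static double[] perFeatureStatistics(List<Point> data) {
        if (data.isEmpty()) {
            throw new IllegalArgumentException("data must not be empty");
        }
        int numFeatures = data.get(0).features.length;
        double[] statistics = new double[numFeatures];
        for (int f = 0; f < numFeatures; f++) {
            statistics[f] = pearsonStatistic(contingencyTable(data, f));
        }
        return statistics;
    }

    public static void main(String[] args) {
        List<Point> data = Arrays.asList(
            new Point(0.0, 0.0, 0.0),
            new Point(0.0, 0.0, 1.0),
            new Point(1.0, 1.0, 0.0),
            new Point(1.0, 1.0, 1.0));
        // Feature 0 tracks the label exactly; feature 1 is unrelated to it.
        System.out.println(Arrays.toString(perFeatureStatistics(data)));
    }
}
```

A feature whose values move with the label yields a large statistic, while a feature distributed identically across labels yields a statistic near zero.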