FPGrowthModel
class pyspark.mllib.fpm.FPGrowthModel(java_model: py4j.java_gateway.JavaObject)
An FP-Growth model for mining frequent itemsets using the Parallel FP-Growth algorithm.
New in version 1.4.0.
Examples
>>> from pyspark.mllib.fpm import FPGrowth, FPGrowthModel
>>> data = [["a", "b", "c"], ["a", "b", "d", "e"], ["a", "c", "e"], ["a", "c", "f"]]
>>> rdd = sc.parallelize(data, 2)
>>> model = FPGrowth.train(rdd, 0.6, 2)
>>> sorted(model.freqItemsets().collect())
[FreqItemset(items=['a'], freq=4), FreqItemset(items=['c'], freq=3), ...
>>> model_path = temp_path + "/fpm"
>>> model.save(sc, model_path)
>>> sameModel = FPGrowthModel.load(sc, model_path)
>>> sorted(model.freqItemsets().collect()) == sorted(sameModel.freqItemsets().collect())
True
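To see why `'a'` and `'c'` come out with frequencies 4 and 3 in the example above, the following sketch recounts the same four transactions in plain Python. It is not the Parallel FP-Growth algorithm and does not use Spark: it is a brute-force enumeration, and the helper name `brute_force_frequent_itemsets` is hypothetical. It assumes that an itemset is frequent when it occurs in at least `ceil(minSupport * number_of_transactions)` transactions.

```python
from itertools import combinations
from math import ceil


def brute_force_frequent_itemsets(transactions, min_support):
    """Return {itemset as sorted tuple: count} for itemsets meeting min_support.

    Hypothetical helper for illustration only; exponential in the number of items.
    """
    # Assumption: the count threshold is ceil(min_support * number of transactions).
    min_count = ceil(min_support * len(transactions))
    baskets = [set(t) for t in transactions]
    items = sorted(set().union(*baskets))
    frequent = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            count = sum(1 for basket in baskets if basket.issuperset(candidate))
            if count >= min_count:
                frequent[candidate] = count
                found_any = True
        if not found_any:
            # No frequent itemset of this size, so no larger one can be frequent.
            break
    return frequent


data = [["a", "b", "c"], ["a", "b", "d", "e"], ["a", "c", "e"], ["a", "c", "f"]]
result = brute_force_frequent_itemsets(data, 0.6)
print(sorted(result.items()))
# [(('a',), 4), (('a', 'c'), 3), (('c',), 3)]
```

With a minimum support of 0.6 over four transactions, the threshold under this assumption is three occurrences, so `'b'` and `'e'` (two occurrences each) are excluded.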
Methods
call(name, *a): Call method of java_model.
freqItemsets(): Returns the frequent itemsets of this model.
load(sc, path): Load a model from the given path.
save(sc, path): Save this model to the given path.
Methods Documentation
call(name: str, *a: Any) → Any
Call method of java_model.
freqItemsets() → pyspark.rdd.RDD[pyspark.mllib.fpm.FPGrowth.FreqItemset]
Returns the frequent itemsets of this model.
New in version 1.4.0.
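Once the itemsets have been collected to the driver, they can be handled as ordinary Python objects with `items` and `freq` fields, as the example output suggests. The sketch below runs without Spark by using a local `namedtuple` stand-in for `FreqItemset`. The helper `top_itemsets` and the sample values are hypothetical, hand-written for illustration, and are not real model output.

```python
from collections import namedtuple

# Local stand-in mirroring the (items, freq) fields shown in the example output.
# With a real model, use the objects from model.freqItemsets().collect() instead.
FreqItemset = namedtuple("FreqItemset", ["items", "freq"])


def top_itemsets(itemsets, min_size=1):
    """Keep itemsets with at least min_size items, most frequent first.

    Hypothetical helper; ties are broken by the sorted item names.
    """
    kept = [fi for fi in itemsets if len(fi.items) >= min_size]
    return sorted(kept, key=lambda fi: (-fi.freq, sorted(fi.items)))


# Hand-written sample values (an assumption, not output copied from Spark).
collected = [
    FreqItemset(items=["c"], freq=3),
    FreqItemset(items=["a"], freq=4),
    FreqItemset(items=["c", "a"], freq=3),
]

for fi in top_itemsets(collected):
    print(fi.items, fi.freq)

pairs_or_larger = top_itemsets(collected, min_size=2)
```

Sorting on the driver like this is only reasonable when the collected result is small; for large results it may be preferable to filter the RDD before collecting.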
classmethod load(sc: pyspark.context.SparkContext, path: str) → pyspark.mllib.fpm.FPGrowthModel
Load a model from the given path.
New in version 2.0.0.
save(sc: pyspark.context.SparkContext, path: str) → None
Save this model to the given path.
New in version 1.3.0.