pyspark.pandas.groupby.SeriesGroupBy.nlargest

SeriesGroupBy.nlargest(n: int = 5) → pyspark.pandas.series.Series

Return the first n rows of each group, ordered by value in descending order.

Within each group, return the n rows with the largest values, sorted from largest to smallest. The result is a Series indexed by the group key and the original row index.

Parameters
n : int, default 5

Number of items to retrieve from each group.

Examples

>>> df = ps.DataFrame({'a': [1, 1, 1, 2, 2, 2, 3, 3, 3],
...                    'b': [1, 2, 2, 2, 3, 3, 3, 4, 4]}, columns=['a', 'b'])
>>> df.groupby(['a'])['b'].nlargest(1).sort_index()
a
1  1    2
2  4    3
3  7    4
Name: b, dtype: int64
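
The per-group selection can also be sketched in plain Python, without Spark, to make the semantics concrete. This is only an illustration and not how pandas-on-Spark implements the method: the helper name `group_nlargest` is hypothetical, and the tie-breaking rule (the earlier row wins among equal values) is an assumption inferred from the output shown above.

```python
from collections import defaultdict
from heapq import nlargest


def group_nlargest(keys, values, n=5):
    """Return {key: [(row_position, value), ...]} holding the n largest values per key.

    Hypothetical helper for illustration only; not part of pyspark.pandas.
    """
    groups = defaultdict(list)
    for pos, (key, value) in enumerate(zip(keys, values)):
        groups[key].append((pos, value))
    # Rank by (value, -position): larger values come first and, among equal
    # values, the row that appears earlier is preferred (assumed tie-breaking).
    return {
        key: nlargest(n, rows, key=lambda row: (row[1], -row[0]))
        for key, rows in groups.items()
    }


a = [1, 1, 1, 2, 2, 2, 3, 3, 3]
b = [1, 2, 2, 2, 3, 3, 3, 4, 4]

top1 = group_nlargest(a, b, n=1)
top2 = group_nlargest(a, b, n=2)
print(top1)
print(top2)
```

With `n=1` the sketch selects row positions 1, 4 and 7, the same rows as the example above. With `n=2` each group contributes two rows, listed from largest to smallest.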