pyspark.pandas.MultiIndex.sort_values

MultiIndex.sort_values(return_indexer=False, ascending=True)

Return a sorted copy of the index, and optionally return the indices that sorted the index itself.

Note

pandas does not support this method when the index contains NaN values and raises an unexpected TypeError. pandas-on-Spark supports it by treating NaN as the smallest value. This method also returns the indexer as a pandas-on-Spark index, whereas pandas returns it as a NumPy array, because the indexer in pandas-on-Spark may not fit in memory.

Parameters

return_indexer (bool, default False): Whether to return the indices that would sort the index.
ascending (bool, default True): Whether to sort the index values in ascending order.

Returns

sorted_index (ps.Index or ps.MultiIndex): Sorted copy of the index.
indexer (ps.Index): The indices that the index itself was sorted by. Returned only when return_indexer is True.

See also

Series.sort_values: Sort values of a Series.
DataFrame.sort_values: Sort values in a DataFrame.

Examples

>>> idx = ps.Index([10, 100, 1, 1000])
>>> idx
Index([10, 100, 1, 1000], dtype='int64')

Sort values in ascending order (default behavior).

>>> idx.sort_values()
Index([1, 10, 100, 1000], dtype='int64')

Sort values in descending order.

>>> idx.sort_values(ascending=False)
Index([1000, 100, 10, 1], dtype='int64')

Sort values in descending order, and also get the indices idx was sorted by.

>>> idx.sort_values(ascending=False, return_indexer=True)
(Index([1000, 100, 10, 1], dtype='int64'), Index([3, 1, 0, 2], dtype='int64'))
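
To make the meaning of the indexer concrete, here is a plain-Python sketch (no Spark involved) that reproduces the positions from the example above. The helper `argsort_positions` is hypothetical and is not part of pandas-on-Spark. It only illustrates that the i-th entry of the indexer is the position, in the original index, of the i-th element of the sorted result.

```python
def argsort_positions(values, ascending=True):
    """Hypothetical helper: original positions of `values`, listed in sorted order."""
    # Sort the positions 0..n-1 by the value stored at each position.
    return sorted(range(len(values)), key=lambda i: values[i], reverse=not ascending)


values = [10, 100, 1, 1000]
indexer = argsort_positions(values, ascending=False)

# Taking the original values at the indexer positions yields the sorted values.
sorted_values = [values[i] for i in indexer]

print(indexer)        # [3, 1, 0, 2]
print(sorted_values)  # [1000, 100, 10, 1]
```

Unlike this in-memory list, the indexer returned by pandas-on-Spark is itself a pandas-on-Spark index, so it does not need to fit in driver memory.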

MultiIndex is also supported.

>>> psidx = ps.MultiIndex.from_tuples([('a', 'x', 1), ('c', 'y', 2), ('b', 'z', 3)])
>>> psidx
MultiIndex([('a', 'x', 1),
            ('c', 'y', 2),
            ('b', 'z', 3)],
           )

>>> psidx.sort_values()
MultiIndex([('a', 'x', 1),
            ('b', 'z', 3),
            ('c', 'y', 2)],
           )

>>> psidx.sort_values(ascending=False)
MultiIndex([('c', 'y', 2),
            ('b', 'z', 3),
            ('a', 'x', 1)],
           )

>>> psidx.sort_values(ascending=False, return_indexer=True)
(MultiIndex([('c', 'y', 2),
            ('b', 'z', 3),
            ('a', 'x', 1)],
           ), Index([1, 2, 0], dtype='int64'))
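
The MultiIndex results above are consistent with comparing entries level by level, the same way Python compares tuples. The following plain-Python sketch assumes that lexicographic ordering and reproduces both the sorted entries and the indexer from the last example. It does not use Spark and is not how pandas-on-Spark computes the result internally.

```python
# The same entries as psidx, held as ordinary Python tuples.
tuples = [('a', 'x', 1), ('c', 'y', 2), ('b', 'z', 3)]

# Python compares tuples element by element, so the first level decides the
# order here; later levels would only matter if earlier levels were equal.
positions = sorted(range(len(tuples)), key=lambda i: tuples[i], reverse=True)
sorted_tuples = [tuples[i] for i in positions]

print(sorted_tuples)  # [('c', 'y', 2), ('b', 'z', 3), ('a', 'x', 1)]
print(positions)      # [1, 2, 0]
```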