pandas.api.extensions.ExtensionArray.factorize
- ExtensionArray.factorize(use_na_sentinel=True)
Encode the extension array as an enumerated type.
- Parameters:
- use_na_sentinel : bool, default True
If True, the sentinel -1 is used for missing (NaN) values. If False, missing values are encoded as non-negative integers, and the missing value is kept in uniques instead of being dropped.
Added in version 1.5.0.
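The effect of use_na_sentinel can be sketched with a nullable integer array. This is a minimal illustration, assuming the "Int64" extension dtype; the variable names are chosen here for the example and do not come from the pandas documentation.

```python
import pandas as pd

# A nullable integer ExtensionArray with one missing value.
arr = pd.array([1, None, 1, 2], dtype="Int64")

# Default behaviour: the missing value is coded as the sentinel -1
# and does not appear in uniques.
codes, uniques = arr.factorize()
print(codes)    # [ 0 -1  0  1]
print(uniques)  # IntegerArray holding 1 and 2

# With use_na_sentinel=False every code is non-negative and the
# missing value gets its own entry in uniques.
codes_na, uniques_na = arr.factorize(use_na_sentinel=False)
print(codes_na)
print(uniques_na)
```

In the second call, uniques_na has one more element than uniques, and that extra element is the missing value.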
- Returns:
- codes : ndarray
An integer NumPy array that indexes into uniques: each code is the position in uniques of the corresponding element of the original ExtensionArray.
- uniques : ExtensionArray
An ExtensionArray containing the unique values of self.
Note
With the default use_na_sentinel=True, uniques will not contain an entry for the NA value of the ExtensionArray, even if missing values are present in self.
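The relationship between codes and uniques can be checked by rebuilding the original values from the two results. The sketch below assumes a nullable "Int64" array and uses ExtensionArray.take with allow_fill=True so that the -1 sentinel is turned back into a missing value; the variable names are illustrative.

```python
import pandas as pd

arr = pd.array([10, None, 20, 10], dtype="Int64")
codes, uniques = arr.factorize()

# Codes follow order of first appearance; -1 marks the missing value.
print(codes)  # [ 0 -1  1  0]

# allow_fill=True makes take() treat -1 as "fill with NA" rather than
# as "last element", so the original array is recovered.
rebuilt = uniques.take(codes, allow_fill=True)
print(rebuilt.equals(arr))  # True
```

Without allow_fill=True, take follows NumPy semantics and -1 would select the last element of uniques, so the round trip would not preserve missing values.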
See also
factorize
Top-level factorize method that dispatches here.
Notes
pandas.factorize() offers a sort keyword as well.
Examples
>>> idx1 = pd.PeriodIndex(["2014-01", "2014-01", "2014-02", "2014-02",
...                       "2014-03", "2014-03"], freq="M")
>>> arr, idx = idx1.factorize()
>>> arr
array([0, 0, 1, 1, 2, 2])
>>> idx
PeriodIndex(['2014-01', '2014-02', '2014-03'], dtype='period[M]')
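As a further illustration of the sort keyword mentioned in the Notes, the following sketch compares the method on the array with the top-level pandas.factorize(). It assumes a nullable "Int64" array as input; the variable names are chosen for this example only.

```python
import pandas as pd

values = pd.array([3, 1, 3, 2], dtype="Int64")

# The method keeps uniques in order of first appearance.
codes, uniques = values.factorize()
print(list(uniques))  # [3, 1, 2]
print(list(codes))    # [0, 1, 0, 2]

# The top-level function can additionally sort the uniques and
# remap the codes to match.
codes_sorted, uniques_sorted = pd.factorize(values, sort=True)
print(list(uniques_sorted))  # [1, 2, 3]
print(list(codes_sorted))    # [2, 0, 2, 1]
```

In both cases, taking uniques at the positions given by codes yields the original values; only the ordering of uniques differs.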
© 2008–2011, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team
© 2011–2025, Open source contributors
Licensed under the 3-clause BSD License.
https://pandas.pydata.org/pandas-docs/version/2.3.0/reference/api/pandas.api.extensions.ExtensionArray.factorize.html