piRSquared - 2 months ago
Python Question

pandas str extract as integer

Consider the pd.Series s:

s = pd.Series(['A1', 'B2', '3C'])

I want to extract the numeric portion of each element.

I know I can use str.extract in the following way:

s.str.extract(r'(\d)', expand=False)

0    1
1    2
2    3
dtype: object

Notice that the dtype is object.

If I get the type of each element:

s.str.extract(r'(\d)', expand=False).apply(type)

0    <class 'str'>
1    <class 'str'>
2    <class 'str'>
dtype: object


How do I extract directly to integer?

0    1
1    2
2    3
dtype: int64


I think it is impossible to extract directly to integer.

See the docs for str.extract:


DataFrame with one row for each subject string, and one column for each group. Any capture group names in regular expression pat will be used for column names; otherwise capture group numbers will be used. The dtype of each result column is always object, even when no match is found. If expand=False and pat has only one capture group, then return a Series (if subject is a Series) or Index (if subject is an Index).

So you need to convert after extracting: use astype(int), or pd.to_numeric if the output may contain NaN (an element with no match):

pd.to_numeric(s.str.extract(r'(\d)', expand=False))
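
A minimal sketch of both conversions side by side. The second series, s_missing, and the variable names are made up for illustration; they are not from the question. It assumes a reasonably recent pandas.

```python
import pandas as pd

s = pd.Series(['A1', 'B2', '3C'])

# Every element contains a digit, so a plain cast to int is enough.
digits = s.str.extract(r'(\d)', expand=False)
as_int = digits.astype(int)
print(as_int.tolist())  # [1, 2, 3]

# Hypothetical series where one element has no digit:
# extract yields a missing value for 'XY', and astype(int) would raise on it.
s_missing = pd.Series(['A1', 'XY', '3C'])
extracted = s_missing.str.extract(r'(\d)', expand=False)

# to_numeric keeps the missing value, so the result is float rather than int.
as_float = pd.to_numeric(extracted)
print(as_float.isna().tolist())  # [False, True, False]
```

The trade-off is that astype(int) gives a true integer dtype but fails on missing values, while pd.to_numeric tolerates them at the cost of falling back to a float dtype.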