piRSquared - 1 month ago 16
Python Question

pandas str extract as integer

Consider the pd.Series s:


s = pd.Series(['A1', 'B2', '3C'])


I want to extract the numeric portion of each element.

I know I can use extract in the following way:

s.str.extract(r'(\d)', expand=False)

0 1
1 2
2 3
dtype: object


Notice the dtype: object.


If I get the type of each element:

s.str.extract(r'(\d)', expand=False).apply(type)

0 <class 'str'>
1 <class 'str'>
2 <class 'str'>
dtype: object


Question

How do I extract directly to integer?

0 1
1 2
2 3
dtype: int64

Answer

I think it is impossible.

See the docs for str.extract:

Returns:

DataFrame with one row for each subject string, and one column for each group. Any capture group names in regular expression pat will be used for column names; otherwise capture group numbers will be used. The dtype of each result column is always object, even when no match is found. If expand=True and pat has only one capture group, then return a Series (if subject is a Series) or Index (if subject is an Index).

So you need to convert afterwards with astype(int) or, if the output may contain NaN (an element with no match), with pd.to_numeric:

pd.to_numeric(s.str.extract(r'(\d)', expand=False))
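A minimal sketch of both conversions, for illustration. The second Series `t` and its non-matching element are made up for this example, and the pattern `(\d+)` is an assumption so that multi-digit numbers would be captured as well; the question itself uses `(\d)`. The final step to the nullable `Int64` dtype is optional and assumes a reasonably recent pandas.

```python
import pandas as pd

s = pd.Series(['A1', 'B2', '3C'])

# str.extract returns strings, so convert in a second step.
digits = s.str.extract(r'(\d+)', expand=False)
as_int = digits.astype(int)

# Hypothetical Series with an element that has no digits:
# extract yields a missing value there, and astype(int) would raise.
t = pd.Series(['A1', 'B2', 'no digits'])
extracted = t.str.extract(r'(\d+)', expand=False)

# to_numeric tolerates the missing value (typically giving a float dtype).
numeric = pd.to_numeric(extracted)

# Optional: nullable integer dtype keeps integers alongside the missing value.
as_nullable = numeric.astype('Int64')

print(as_int)
print(numeric)
print(as_nullable)
```

astype(int) is the simpler choice when every element is known to match; pd.to_numeric is the safer one when some elements may not.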