xpt - 1 year ago
Python Question

pandas.DataFrame.replace with wildcards

Does the pandas.DataFrame.replace regex replace support wildcards and "capture groups"?

E.g., to replace

What kind of regular expressions are supported? Is Perl's regex syntax supported? E.g., OK to replace
: Change the next character to lowercase.)


As Steve has pointed out, according to the Python documentation, it should work, but the following is not giving me what I expected:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': np.random.choice(['foo', 'bar'], 100),
                   'B': np.random.choice(['one', 'two', 'three'], 100),
                   'C': np.random.choice(['I1', 'I2', 'I3', 'I4'], 100),
                   'D': np.random.randint(-10, 11, 100),
                   'E': np.random.randn(100)})
df.replace("f(.)(.)", "b\1\2", regex=True, inplace=True)

What's wrong?


Answer Source

According to the pandas documentation:

Regex substitution is performed under the hood with re.sub. The rules for substitution for re.sub are the same.

So, yes, any substitutions which can be performed with Python's re.sub (such as \1) can also be performed with pandas.DataFrame.replace. See the Python documentation for more information.
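As for what goes wrong in the snippet from the question: the most likely culprit is the plain (non-raw) string literal. Python turns "\1" and "\2" into control characters before pandas ever sees them, so no backreference reaches re.sub. Below is a minimal sketch of the fix, using a small hand-made frame (an assumption, standing in for the random one above). It also shows one possible workaround for Perl's \l, which Python's re does not provide: a callable replacement passed to Series.str.replace.

```python
import pandas as pd

# Small deterministic frame standing in for the random one in the question.
df = pd.DataFrame({"A": ["foo", "bar", "foo"], "D": [1, -2, 3]})

# Without the r prefix, "\1" is the single control character chr(1),
# so the regex engine never receives a backreference.
assert "b\1\2" == "b\x01\x02"

# With raw strings the backreferences arrive intact: "foo" becomes "boo".
fixed = df.replace(r"f(.)(.)", r"b\1\2", regex=True)
print(fixed["A"].tolist())

# Python's re has no Perl-style \l escape; a callable replacement can
# lowercase a captured group instead.
s = pd.Series(["FOO", "BAR"])
lowered = s.str.replace(r"F(.)", lambda m: "F" + m.group(1).lower(), regex=True)
print(lowered.tolist())
```

Non-string columns such as D pass through unchanged, so the same call can be applied to a mixed-type frame.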
