Feyzi Bagirov - 1 year ago 83
Python Question

# Iterating over a column and replacing a value with an extracted string [Pandas]

I have a dataset that looks like this:

``````
  A   B
1 aa  1234
2 ab  3456
3 bc  [1357, 2468]
4 cc  8901
...
``````

I need to iterate over column B and replace every value in square brackets ([]) with the first (leftmost) four digits inside those brackets, so the dataset would look like this:

``````
  A   B
1 aa  1234
2 ab  3456
3 bc  1357
4 cc  8901
...
``````

I have this code:

``````
for item in df['B']:
    if len(item) > 4:
        item_v = str(item[1:5])
        df['B'][item] = item_v
        print(df['B'][item])
``````

This prints the truncated values. However, when I check the head of the DataFrame, it still has the old values:

``````
> df['B'].head()

>  A   B
1 aa  1234
2 ab  3456
3 bc  [1357, 2468]
4 cc  8901
...
``````

What am I doing wrong?

The easiest and fastest way is to use the pandas `str.get()` method and create another column for the desired results.
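
To make the behavior concrete, here is a small sketch of what `.str.get(0)` returns. The two sample Series are made up for illustration and are not the question's real data. On list values it picks the first element, but on string values it picks the first character, which is why the two solutions below treat integer and string data differently.

```python
import pandas as pd

# Hypothetical sample data, not taken from the question's real dataset
lists = pd.Series([[1357, 2468], [1111, 2222]])
strings = pd.Series(["1234", "[1357, 2468]"])

# On lists, .str.get(0) returns the first element of each list
first_items = lists.str.get(0).tolist()

# On strings, .str.get(0) returns the first character of each string
first_chars = strings.str.get(0).tolist()

print(first_items)  # [1357, 1111]
print(first_chars)  # ['1', '[']
```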

Solution #1: This works if the values in `B` are integers, with some rows holding a list of integers, e.g. `[1234, 3456, [1357, 2468], 8901]`.

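``````
# .str.get(0) takes the first list element; plain integers become NaN
df['C'] = df['B'].str.get(0).astype(float)
# Fill the NaN rows from B, then cast to int (astype has no inplace argument)
df['C'] = df['C'].fillna(df['B']).astype(int)
``````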

Output:

``````
    A             B     C
0  aa          1234  1234
1  ab          3456  3456
2  bc  [1357, 2468]  1357
3  cc          8901  8901
``````

Then, you can delete column B if you don't need it.
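
If you would rather not depend on `.str.get()` returning `NaN` for the non-list rows, an explicit per-row check is one alternative. The sketch below is not part of the original answer: it rebuilds the sample data by hand, uses a hypothetical helper named `first_or_self`, and then drops the original column as suggested above.

```python
import pandas as pd

# Sample data rebuilt by hand from the question; B mixes integers and one list
df = pd.DataFrame({
    "A": ["aa", "ab", "bc", "cc"],
    "B": [1234, 3456, [1357, 2468], 8901],
})


def first_or_self(value):
    """Hypothetical helper: first element of a list, or the value itself."""
    return value[0] if isinstance(value, list) else value


# Build the cleaned column, then replace B with it
df["C"] = df["B"].apply(first_or_self)
df = df.drop(columns="B").rename(columns={"C": "B"})

print(df["B"].tolist())  # [1234, 3456, 1357, 8901]
```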

Solution #2: This works if the values in `B` are strings, e.g. `['1234', '3456', ['1357', '2468'], '8901']`.

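``````
import re

# Collect every run of digits in each value (str(x) also handles list entries)
df['digits'] = df['B'].apply(lambda x: re.findall(r'\d+', str(x)))
# Keep only the first match from each list of matches
df['digits'] = df['digits'].str.get(0)
print(df)
``````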

Output:

``````
   A             B    digits
0  aa          1234   1234
1  ab          3456   3456
2  bc  [1357, 2468]   1357
3  cc          8901   8901
``````

Again, you can delete column B if you don't need it.
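
As for why the original loop leaves the DataFrame unchanged: `for item in df['B']` yields the cell values, so `df['B'][item] = ...` uses a value such as `'[1357, 2468]'` as an index label instead of addressing the row it came from. A minimal sketch of a loop that does write back could look like this, assuming `B` holds strings (as the `len(item) > 4` test suggests) and using hand-built sample data:

```python
import pandas as pd

# Sample data rebuilt by hand; B is assumed to hold strings
df = pd.DataFrame(
    {
        "A": ["aa", "ab", "bc", "cc"],
        "B": ["1234", "3456", "[1357, 2468]", "8901"],
    },
    index=[1, 2, 3, 4],
)

# .items() yields (index label, value) pairs, so each row can be addressed
for idx, item in df["B"].items():
    if len(item) > 4:
        # Write through .loc on the DataFrame itself, not on df['B']
        df.loc[idx, "B"] = item[1:5]

print(df["B"].tolist())  # ['1234', '3456', '1357', '8901']
```

On larger frames, the column-wise solutions above avoid this Python-level loop.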
