Deepak M - 11 days ago
Python Question

Separating and Dropping stringed data from a column in Python

Hello, I have a column in a data frame like this:

df["OriginCityName"]:
OriginCityName:
0 Dallas/Fort Worth, TX
1 Detroit, MI


I wish to extract only the words "Dallas" and "Detroit" and drop the remaining data. I want the column to look like this afterwards:

df["OriginCityName"]:
OriginCityName:
0 Dallas
1 Detroit


Is there any way to do this? Thanks.

Answer

You can use the extract method with regex (^[A-Za-z]+). This extracts all alpha characters from the beginning of the string:

df.OriginCityName.str.extract('(^[A-Za-z]+)', expand=False)

#0     Dallas
#1    Detroit
#Name: OriginCityName, dtype: object
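
To get the column looking the way the question asks, the extracted values can be assigned back to it. This is a minimal sketch, assuming the frame holds just the two rows shown in the question; passing expand=False is my addition so that the result is a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Sample frame rebuilt from the two rows shown in the question.
df = pd.DataFrame({"OriginCityName": ["Dallas/Fort Worth, TX", "Detroit, MI"]})

# With a single capture group, expand=False returns a Series,
# which can be assigned straight back to the original column.
df["OriginCityName"] = df["OriginCityName"].str.extract(r"(^[A-Za-z]+)", expand=False)

print(df)
```

Note that this overwrites the original strings, so keep a copy of the column first if the state codes might be needed later.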

Or, if you are sure that what you want to extract always comes before a / or a comma, you can try this one: df.OriginCityName.str.extract('(^.*?)(?=[/,])'). This extracts everything before the first / or , because .*? is a lazy match.
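
The two patterns only differ once a city name contains something other than letters before the delimiter. Here is a rough comparison; the "New York, NY" row is a hypothetical value I added for illustration and is not from the question, and expand=False is again assumed so that each call returns a Series.

```python
import pandas as pd

# The first two values come from the question; "New York, NY" is a
# hypothetical extra row added only to show where the patterns differ.
cities = pd.Series(
    ["Dallas/Fort Worth, TX", "Detroit, MI", "New York, NY"],
    name="OriginCityName",
)

# Leading run of ASCII letters: stops at the first space, slash, or comma.
letters_only = cities.str.extract(r"(^[A-Za-z]+)", expand=False)

# Lazy match up to (not including) the first "/" or ",": keeps inner spaces.
before_delim = cities.str.extract(r"(^.*?)(?=[/,])", expand=False)

print(pd.DataFrame({"letters_only": letters_only, "before_delim": before_delim}))
```

For the hypothetical row, the letters-only pattern stops at "New" while the lookahead version keeps "New York". A value containing neither / nor , would not match the second pattern at all and would come back as NaN.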
