I have a csv file that I read into a pandas dataframe. There are two specific columns, 'Notes' and 'ActivityType', that I want to use as criteria. If the 'Notes' column contains a string value of 'Morning exercise' or ' Morning workout' and/or the 'ActivityType' column contains any string value (majority of cells are Null and I don't want Null values counted) then make a new column 'MorningExercise' and insert a 1 if either conditions are met or a 0 if neither are.
I have been using the code below to create a new column and insert a 1 or 0 if the text conditions are met in the 'Notes' column, but I have not figured out how to include a 1 if the 'ActivityType' column contains any string value.
JoinedTables['MorningExercise'] = JoinedTables['Notes'].str.contains(('Morning workout' or 'Morning exercise'), case=False, na=False).astype(int)
You'll need to fashion a regex pattern to use with
regex = r'Morning\s*(?:workout|exercise)' JoinedTables['MorningExercise'] = \ JoinedTables['Notes'].str.contains(regex, case=False, na=False).astype(int)
Morning # match "Morning" \s* # 0 or more whitespace chars (?: # open non-capturing group workout # match "workout" | # OR operator exercise # match "exercise" )
The pattern will look for
Morning followed by either