user95902 - 1 month ago
R Question

R: partial string match that returns a value from the matched row (like "match" in Excel)

Is there a function in R similar to "match" in Excel?

For example if I have a dataset with people's educational degrees:

> edu
chr [1:4] "Bachelor" "NA" "Master" "Superieur"


And an international mapping system by ISCED:

> ISCED
Main education program                  English translation               Code
Brevet d'enseignement supérieur (BES)   certificate of higher education   5
bachelier de transition                 Bachelor                          6
Bachelor                                Bachelor                          6
Master                                  Master                            7


I wonder if there is a function that can partially match the strings in the vector edu against the first column of the data frame ISCED and, where there is a match, return the corresponding code (5, 6 or 7).

I know there are functions like "%like%" or "grepl", but I am looking for something that checks every value of the vector edu, not just one string that has to be specified each time.

Does anybody have any insights? Or would you suggest using a loop with "grepl"?

Thank you!

Answer

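One way is to use grep.

Collapse edu into a single regular expression with paste0(..., collapse = "|"), use grep to get the indices of the rows whose first column (Main_education_program) matches that pattern, and use those indices to fetch the corresponding Code from the data frame. Note that the match is case-sensitive and the result follows the row order of ISCED, not the order of edu.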
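
To make the intermediate step concrete, here is a small sketch of the pattern that paste0 builds, assuming edu holds exactly the four strings shown in the question (the "NA" is taken to be a literal string, as the quotes in the output suggest):

```r
# edu as printed in the question; "NA" is assumed to be a literal string
edu <- c("Bachelor", "NA", "Master", "Superieur")

# Collapse the vector into one regular expression of alternatives
pattern <- paste0(edu, collapse = "|")
pattern
# [1] "Bachelor|NA|Master|Superieur"
```

grep then reports every row of the column in which at least one of these alternatives occurs.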

ISCED$Code[grep(paste0(edu, collapse = "|"), ISCED$Main_education_program)]

#[1] 6 7
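
If the codes need to line up with edu (one result per element, with NA where nothing matches), a per-element lookup may be closer to what the question asks for. The following is only a sketch: the data frame is rebuilt from the question, the column name English_translation and the helper lookup_code are made up for illustration, and taking the first matching row is an assumption about how ties should be resolved.

```r
# Data rebuilt from the question; English_translation is an assumed column name
edu <- c("Bachelor", "NA", "Master", "Superieur")
ISCED <- data.frame(
  Main_education_program = c("Brevet d'enseignement supérieur (BES)",
                             "bachelier de transition",
                             "Bachelor",
                             "Master"),
  English_translation = c("certificate of higher education",
                          "Bachelor", "Bachelor", "Master"),
  Code = c(5, 6, 6, 7),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the Code of the first row whose program name
# contains x (case-sensitive, literal substring match), or NA if none does.
lookup_code <- function(x, table = ISCED) {
  hits <- grep(x, table$Main_education_program, fixed = TRUE)
  if (length(hits) == 0) NA_real_ else table$Code[hits[1]]
}

# One code per element of edu, in the same order as edu
codes <- vapply(edu, lookup_code, numeric(1), USE.NAMES = FALSE)
# codes is c(6, NA, 7, NA)
```

"Superieur" yields NA here because the match is case-sensitive and the table spells the word with an accent; normalising case and accents before matching would be a separate step.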