sweetmusicality sweetmusicality -3 years ago 112
R Question

Only keeping string text present in another dataframe in R

I am relatively new to R.

I have two data frames, final and cv, each with a single variable.

final looks like:

V1
humans, aged, female, stroke
infant, male, echocardiography
aneurysm, adolescent, female, diabetes
pregnant, diabetes, female
cardiovascular diseases, complications


and cv looks like:

V2
stroke
pregnant
echocardiography
aneurysm
diabetes
cardiovascular diseases


I want to modify final so that each row keeps only the terms present in cv. This is what I want the resulting final data frame to look like:

V1
stroke
echocardiography
aneurysm, diabetes
pregnant, diabetes
cardiovascular diseases


Please advise. Thanks!

Answer

We can use functions from dplyr and stringr. In addition, the or1 function from rebus is useful for building a regular expression that matches any one of several phrases. str_extract_all extracts every match from each string. When a row contains more than one match, the result for that row, once converted to text, looks like c("aneurysm", "diabetes"). I used several str_replace calls with fixed to strip c(, ), and " from the text. This part could be done more efficiently with regex, but I am not familiar with regex. df_final is the final output.

# Load packages
library(dplyr)
library(stringr)
library(rebus)

# Create example data frames (tibble() replaces the deprecated data_frame())
df1 <- tibble(V1 = c("humans, aged, female, stroke", "infant, male, echocardiography",
                     "aneurysm, adolescent, female, diabetes", "pregnant, diabetes, female",
                     "cardiovascular diseases, complications"))
df2 <- tibble(V2 = c("stroke", "pregnant", "echocardiography", "aneurysm",
                     "diabetes", "cardiovascular diseases"))

# Process the data
df_final <- df1 %>%
  mutate(V1 = str_extract_all(V1, or1(df2$V2))) %>%
  mutate(V1 = str_replace(V1, fixed("c("), "")) %>%
  mutate(V1 = str_replace(V1, fixed(")"), "")) %>%
  mutate(V1 = str_replace_all(V1, fixed('"'), ""))
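
As an alternative to cleaning up the c(...) text afterwards, here is a minimal base R sketch that avoids regular expressions altogether. It is a different technique from the answer above: it assumes the terms in each row are separated by commas, splits on them, and keeps only the terms that exactly match an entry in cv$V2. The helper name keep_known is hypothetical and introduced only for illustration.

```r
# Example data, copied from the question
final <- data.frame(V1 = c("humans, aged, female, stroke",
                           "infant, male, echocardiography",
                           "aneurysm, adolescent, female, diabetes",
                           "pregnant, diabetes, female",
                           "cardiovascular diseases, complications"),
                    stringsAsFactors = FALSE)
cv <- data.frame(V2 = c("stroke", "pregnant", "echocardiography", "aneurysm",
                        "diabetes", "cardiovascular diseases"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: split one row on commas, trim whitespace,
# keep only the terms found in vocab, and join them back together
keep_known <- function(x, vocab) {
  terms <- trimws(strsplit(x, ",", fixed = TRUE)[[1]])
  paste(terms[terms %in% vocab], collapse = ", ")
}

# Apply the helper to every row; vapply guarantees one string per row
final$V1 <- vapply(final$V1, keep_known, character(1),
                   vocab = cv$V2, USE.NAMES = FALSE)

print(final)
```

Because this compares whole terms rather than searching for patterns, a term in cv cannot accidentally match inside a longer word. A row with no matching terms ends up as an empty string.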