LLL LLL - 2 years ago 141
R Question

Inserting an underscore into a particular part of variable names in R

I'd like to insert an underscore after the first three characters of all variable names in a data frame. Any help would be much appreciated.

Current data frame:

df1 <- data.frame("genCrc_b1"=c(1,1,1),"genprd"=c(1,1,1) ,"genopr_b1_b2"=c(1,1,1))


Desired data frame:

df2 <- data.frame("gen_Crc_b1"=c(1,1,1),"gen_prd"=c(1,1,1) ,"gen_opr_b1_b2"=c(1,1,1))


My attempts:

gsub('^(.{3})(.*)$', "_", names(df1))
gsub('^(.{3})(.*)$', '\\_\\2', names(df1))

Answer

We can use sub to capture the first three characters as a group, (.{3}), and then in the replacement specify the backreference to that group (\\1) followed by an underscore:

names(df1) <- sub("^(.{3})", "\\1_", names(df1))
names(df1)
#[1] "gen_Crc_b1"    "gen_prd"       "gen_opr_b1_b2"
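
For comparison, here is a minimal sketch of the same renaming done without a regular expression, using base R's substr, substring and paste0. This is a swapped-in alternative, not the method from the answer above, and the helper variable names (old_names, new_names) are only illustrative.

```r
# Same sample data as in the question
df1 <- data.frame(genCrc_b1 = c(1, 1, 1),
                  genprd = c(1, 1, 1),
                  genopr_b1_b2 = c(1, 1, 1))

old_names <- names(df1)

# First three characters, then an underscore, then everything from
# the fourth character to the end of each name
new_names <- paste0(substr(old_names, 1, 3), "_", substring(old_names, 4))

names(df1) <- new_names
print(names(df1))
```

One difference to keep in mind: for a name shorter than three characters the sub version leaves the name unchanged (the pattern does not match), while this version would still append an underscore.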

In the OP's second attempt, the pattern captures two groups, but the replacement refers only to the second one (\\2); the backreference to the first group (\\1) is missing. The corrected version is

gsub('^(.{3})(.*)$', '\\1_\\2', names(df1))

BTW, gsub is not needed here: the pattern is anchored at the start of the string, so there is only one replacement per name and sub is enough.

In the first attempt, none of the backreferences to the captured groups were used in the replacement, so the whole matched name is replaced by a single underscore.
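
As a quick check of the points above, the sketch below runs the first attempt and the two working versions on a plain character vector (nms stands in for names(df1); the other variable names are illustrative).

```r
nms <- c("genCrc_b1", "genprd", "genopr_b1_b2")

# First attempt: the pattern matches the entire name and the replacement
# contains no backreference, so each name collapses to "_"
attempt1 <- gsub("^(.{3})(.*)$", "_", nms)

# Corrected two-group version and the shorter one-group version
fixed_two <- gsub("^(.{3})(.*)$", "\\1_\\2", nms)
fixed_one <- sub("^(.{3})", "\\1_", nms)

print(attempt1)
print(fixed_one)

# Both working versions should give the same result
print(identical(fixed_two, fixed_one))
```

Because the pattern is anchored with ^, sub and gsub behave the same here; sub simply states the intent (one replacement per name) more directly.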
