Lisann Lisann - 15 days ago 7
R Question

R remove part of string after "."

It is a simple question, but I don't see what I am doing wrong.
I am working in R with accession numbers like those in the variable a:

>a <- c("NM_020506.1","NM_020519.1","NM_001030297.2","NM_010281.2","NM_011419.3", "NM_053155.2")


To get information from the biomart package I need to remove the .1 etc. after the accession numbers. I normally do this with this code:

> b <- sub("..*","",a)

[1] "" "" "" "" "" ""


But as you can see, this isn't the correct way for this variable. Can anyone help me with this?

Answer

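You just need to escape the period. In a regular expression an unescaped "." matches any single character, so "..*" matches the entire string and everything gets replaced with "". Writing "\\." (a backslash, doubled inside an R string) matches a literal period instead: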

a <- c("NM_020506.1","NM_020519.1","NM_001030297.2","NM_010281.2","NM_011419.3", "NM_053155.2")

gsub("\\..*","",a)
[1] "NM_020506"    "NM_020519"    "NM_001030297" "NM_010281"    "NM_011419"    "NM_053155"
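
For comparison, here is a small sketch of the unescaped pattern next to a few equivalent ways of matching a literal period. It uses a shortened copy of the vector from the question. The variable names (unescaped, escaped, bracketed, version_only) are only illustrative, and the last pattern assumes the suffix is always a period followed by digits at the end of the string.

```r
a <- c("NM_020506.1", "NM_020519.1", "NM_001030297.2")

# Unescaped: "." matches any character, so "..*" swallows the whole string
unescaped <- sub("..*", "", a)

# Escaped period: the backslash is doubled inside an R string literal
escaped <- sub("\\..*", "", a)

# Same idea with a character class, which avoids the double backslash
bracketed <- sub("[.].*", "", a)

# Stricter variant (assumption: the suffix is ".<digits>" at the end)
version_only <- sub("\\.[0-9]+$", "", a)

print(escaped)
```

sub() is enough here because the pattern consumes everything from the first period to the end of the string, so there is at most one match per element; gsub() gives the same result.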