Meli Meli - 1 month ago 13
R Question

string split operation in R

In my data I have a column of strings. Each string is five characters long. I would like to split each string so that I keep the first two characters and the last two, and discard the middle (third) character.

I looked at other Stack Overflow questions and found the answer shown below helpful. It seemed to work at first, but then I saw that in some cases it did not return what I expected.

This is what I have:

library(stringr)

statecensusFIPS <- c("01001", "03001", "13144")
newFIPS <- lapply(2:3, function(i) {
  if (i == 2) {
    str_sub(statecensusFIPS, end = i)  # characters 1-2
  } else {
    str_sub(statecensusFIPS, i)        # characters 3-5
  }
})

StateFIPS <- newFIPS[[1]]
CountyFIPS <- newFIPS[[2]]

# Results
> StateFIPS
[1] "01" "03" "13"
> CountyFIPS
[1] "001" "001" "144"


How do I adjust the code so that I have these results instead?

StateFIPS
[1] "01" "03" "13"
CountyFIPS
[1] "01" "01" "44"

Answer

How about this? It assumes that every string is exactly five characters long, that the first two characters are the state FIPS, and that the last two characters are the county FIPS.

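library(stringr)

statecensusFIPS <- c("01001", "03001", "13144")
# i == 2 keeps characters 1-2; i == 3 starts at position i + 1 = 4, keeping characters 4-5
newFIPS <- lapply(2:3, function(i) if (i == 2) str_sub(statecensusFIPS, end = i) else str_sub(statecensusFIPS, i + 1))

StateFIPS <- newFIPS[[1]]
CountyFIPS <- newFIPS[[2]]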

A simpler way could be:

library(stringr)
statecensusFIPS <- c("01001", "03001", "13144")
StateFIPS <- str_sub(statecensusFIPS, end = 2)  # characters 1-2
CountyFIPS <- str_sub(statecensusFIPS, 4)       # characters 4-5
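
Since the question mentions that the strings live in a data column, here is a rough base R sketch of the same split applied to a data frame, with a guard for the five-character assumption. It swaps in base R's substr/substring for str_sub, and the data frame name fips_df and its column names are made up for illustration; adjust them to your own data.

```r
# Hypothetical data frame standing in for the asker's data.
fips_df <- data.frame(
  statecensusFIPS = c("01001", "03001", "13144"),
  stringsAsFactors = FALSE
)

codes <- fips_df$statecensusFIPS

# Guard the stated assumption: every code is exactly five characters long.
stopifnot(all(nchar(codes) == 5))

# First two characters of each code.
fips_df$StateFIPS <- substr(codes, 1, 2)

# Last two characters, counted from the end of each string.
n <- nchar(codes)
fips_df$CountyFIPS <- substring(codes, n - 1, n)

print(fips_df)
```

Counting from the end rather than from position 4 should keep returning the last two characters if the length check is ever relaxed. If you prefer to stay with stringr, str_sub(statecensusFIPS, -2) should do the same, since negative positions count back from the end of the string.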