bottledatthesource - 3 months ago
R Question

R tuple as factor (specifically longitude latitude as factor)

I am having problems accessing factors in R. I have a data frame column stored as a factor of tuples:

test1
#[1] (34.0467, -118.2470) (34.0637, -118.2440) (34.0438, -118.2547)
#[4] (34.0523, -118.2676) (34.0584, -118.2810) (34.0583, -118.2616)
#39497 Levels: (0, 0) (0.0000, 0.0000) ... (34.6837, -118.1853)


How do I access just the first number of each tuple (the latitude)?

thanks!


dput(test1)
...
"(34.3256, -118.4307)", "(34.3256, -118.4798)", "(34.3256, -118.5033)",
"(34.3257, -118.4244)", "(34.3258, -118.4343)", "(34.3262, -118.4104)",
"(34.3262, -118.4112)", "(34.3266, -118.4234)", "(34.3266, -118.4269)",
"(34.3266, -118.4323)", "(34.3269, -118.4278)", "(34.3272, -118.4365)",
"(34.3273, -118.4342)", "(34.3274, -118.4321)", "(34.3274, -118.4331)",
"(34.3275, -118.4247)", "(34.3275, -118.4298)", "(34.3276, -118.4115)",
"(34.3277, -118.4071)", "(34.3285, -118.4266)", "(34.3286, -118.4277)",
"(34.3287, -118.4286)", "(34.3292, -118.5048)", "(34.3293, -118.4246)",
"(34.3298, -118.4300)", "(34.3327, -118.5062)", "(34.3374, -118.5042)",
"(34.3760, -118.5254)", "(34.3767, -118.5263)", "(34.3775, -118.5270)",
"(34.3805, -118.5293)", "(34.4638, -118.1995)", "(34.5095, -117.9273)",
"(34.5304, -118.1418)", "(34.5453, -118.0405)", "(34.5650, -118.0856)",
"(34.5693, -118.0228)", "(34.5957, -118.1784)", "(34.6818, -118.0954)",
"(34.6837, -118.1853)"), class = "factor")


I can't capture the beginning of that output, though.

Answer

Try this

as.numeric(unlist(strsplit(gsub("[\\(\\)]", "",as.character(test1)),","))[c(T,F)])

Explanation

gsub works on character vectors, so as.character(test1) explicitly converts test1 from a factor to character (gsub would coerce the factor itself, but the explicit call makes the intent clear). I then remove "(" and ")" like this:

gsub("[\\(\\)]", "",as.character(test1))
#[1] "34.5693, -118.0228" "34.5957, -118.1784" "34.6818, -118.0954" "34.6837, -118.1853"
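
To illustrate why the conversion to character matters, here is a minimal sketch using a hypothetical two-element factor f (not the asker's data). Converting a factor straight to a number returns its internal level codes, while as.character returns the labels that actually hold the coordinates.

```r
# Hypothetical factor with two coordinate strings
f <- factor(c("(34.0467, -118.2470)", "(34.0438, -118.2547)"))

codes  <- as.integer(f)     # internal level codes, not coordinates
labels <- as.character(f)   # the original strings, in the original order

print(codes)
print(labels)
```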

Next, I split each string into two parts on the separator "," using strsplit:

strsplit(gsub("[\\(\\)]", "",as.character(test1)),",")
#[[1]]
#[1] "34.5693"    " -118.0228"

#[[2]]
#[1] "34.5957"    " -118.1784"

#[[3]]
#[1] "34.6818"    " -118.0954"

#[[4]]
#[1] "34.6837"    " -118.1853"

The previous output is a list. unlist flattens it into a single character vector:

unlist(strsplit(gsub("[\\(\\)]", "",as.character(test1)),","))
#[1] "34.5693"    " -118.0228" "34.5957"    " -118.1784" "34.6818"    " -118.0954"
#[7] "34.6837"    " -118.1853"

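[c(T,F)] is a logical index (T and F are shorthand for TRUE and FALSE). R recycles it along the vector as TRUE, FALSE, TRUE, FALSE, ..., so it keeps the elements at odd positions, which are the first elements of each pair.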
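
The recycling behaviour can be sketched on a short hypothetical vector flat that mimics the unlist output above. The sketch spells out TRUE and FALSE, since T and F are ordinary variables that can be reassigned; the names lat_chr and lon_chr are illustrative only.

```r
# Hypothetical flattened vector: latitude and longitude strings alternate
flat <- c("34.5693", " -118.0228", "34.5957", " -118.1784")

# The two-element logical index is recycled along the whole vector
lat_chr <- flat[c(TRUE, FALSE)]   # keeps positions 1, 3, ...
lon_chr <- flat[c(FALSE, TRUE)]   # keeps positions 2, 4, ...

print(lat_chr)
print(lon_chr)
```

Swapping the index to c(FALSE, TRUE) selects the second element of each pair instead.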

Finally, I convert the output to numeric using as.numeric.

Output

#[1] 34.5693 34.5957 34.6818 34.6837
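
If both coordinates are needed, the same steps can be arranged to return a data frame. This is only a sketch under the assumption that every value has exactly the form "(lat, lon)"; the small test1 below is a stand-in built from three values shown in the question, and the names parts, m and coords are hypothetical.

```r
# Small stand-in for test1, using three values from the question
test1 <- factor(c("(34.0467, -118.2470)",
                  "(34.0637, -118.2440)",
                  "(34.0438, -118.2547)"))

# Strip the parentheses, then split each string on the comma
parts <- strsplit(gsub("[()]", "", as.character(test1)), ",")

# Every list element has two pieces, so bind them into a two-column matrix
m <- do.call(rbind, parts)

# as.numeric tolerates the leading space left in front of the longitude
coords <- data.frame(lat = as.numeric(m[, 1]),
                     lon = as.numeric(m[, 2]))

print(coords)
```

Binding the pieces into a matrix keeps each latitude next to its longitude, which avoids relying on the alternating index.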