Johnny Jensen Johnny Jensen - 1 month ago 12
JSON Question

Create new column with matching JSON part

I try to extract the matching JSON part from column 'data' into new column 'geo'. This code only extracts the first row correctly, and then repeats itself. I expect the 'geo' column to read:

GEO STOCKHOLM, GEO NEW YORK, GEO MADRID, GEO LONDON

instead of currently

GEO STOCKHOLM, GEO STOCKHOLM, GEO STOCKHOLM, GEO STOCKHOLM

Code:

library(rjson)
data <- c('["GEO STOCKHOLM","TYPE LOW"]','["GEO NEW YORK","TYPE MEDIUM"]','["GEO MADRID","TYPE HIGH"]','["GEO LONDON","TYPE MAX"]')
df <- data.frame(data, stringsAsFactors=FALSE)
df$geo <- grep("GEO", fromJSON(df$data), value = TRUE)


As you can see, I just want to keep the 'geo' part within semicolons, and lose the 'type' part. Looking at the df dataframe, this is my (incorrect) output:

Incorrect output df dataframe

Answer

What about this?

df$geo <- sapply(sapply(df$data, FUN = fromJSON, simplify = FALSE), FUN = "[", 1)

                            data           geo
1   ["GEO STOCKHOLM","TYPE LOW"] GEO STOCKHOLM
2 ["GEO NEW YORK","TYPE MEDIUM"]  GEO NEW YORK
3     ["GEO MADRID","TYPE HIGH"]    GEO MADRID
4      ["GEO LONDON","TYPE MAX"]    GEO LONDON