Dan Luba - 3 months ago
R Question

Get keys only of top-level elements with regex in R

Say I have the following JSON string:

[
{
"oranges": 5,
"apples": "eleven",
"bananas": [
{
"green": "six",
"yellow": 12,
"brown": 8,
},
{
"green": 9,
"yellow": "three",
"brown": "seven",
},
{
"green": 7,
"yellow": 2,
"brown": "eight",
}
],
"grapes": [],
"pears": "seventeen"
}
]


How can I get only the keys of the top-level elements using regex? I'm doing this in R using stringr.

So far I've come up with:

str_extract_all(json_string, regex('(?<=")[^"]+?(?=":[:space:])'))


... which pulls out all of the keys, but I can't work out how to get at only the fruits.

Answer

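One way, since this is actually a JavaScript value rather than valid JSON (JSON does not allow the trailing commas after the "brown" entries), is to evaluate it with the V8 package: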

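library(V8)

# Create a JavaScript context
ctx <- v8()

# JS() marks the text as literal JavaScript code rather than an R string
ctx$assign("dat", JS('[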
   {
      "oranges": 5,
      "apples": "eleven",
      "bananas": [
         {
            "green": "six",
            "yellow": 12,
            "brown": 8,
         },
         {
            "green": 9,
            "yellow": "three",
            "brown": "seven",
         },
         {
            "green": 7,
            "yellow": 2,
            "brown": "eight",
         }
      ],
      "grapes": [],
      "pears": "seventeen"
   }
]'))

# The array of one object comes back as a data frame; its column names are the top-level keys
colnames(ctx$get("dat"))

## [1] "oranges" "apples"  "bananas" "grapes"  "pears"  
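
If you would rather stay with regular expressions, as the question asks, here is a rough sketch in base R. The helper name top_level_keys is hypothetical, not something from the original post or from any package. It assumes the input is a top-level array of objects and that no string value contains brackets, braces, or double quotes. The idea is to collapse every nested array or object to an empty placeholder first, so that the only keys left for the pattern to find are the top-level ones.

```r
# Hypothetical helper (not from the original post): return the keys of the
# top-level objects in a JSON-like string using base R regular expressions.
# Assumes no string value contains brackets, braces, or double quotes.
top_level_keys <- function(json_string) {
  # Drop the outermost pair of square brackets so only nested arrays remain.
  inner <- sub("^\\s*\\[", "", json_string, perl = TRUE)
  inner <- sub("\\]\\s*$", "", inner, perl = TRUE)

  # Collapse every remaining (possibly nested) [...] block to "[]".
  # (?R) is PCRE recursion, so balanced brackets are matched as one unit.
  flat <- gsub("\\[(?:[^\\[\\]]++|(?R))*+\\]", "[]", inner, perl = TRUE)

  # Collapse objects that appear as values (after a colon) to "{}".
  # (?1) recurses into capture group 1 to match balanced braces.
  flat <- gsub(":\\s*(\\{(?:[^{}]++|(?1))*+\\})", ": {}", flat, perl = TRUE)

  # Extract each quoted name that is directly followed by a colon.
  regmatches(flat, gregexpr('(?<=")[^"]+(?="\\s*:)', flat, perl = TRUE))[[1]]
}

# Shortened version of the string from the question, trailing commas included.
json_string <- '[
{
"oranges": 5,
"apples": "eleven",
"bananas": [
{ "green": "six", "yellow": 12, "brown": 8, },
{ "green": 9, "yellow": "three", "brown": "seven", }
],
"grapes": [],
"pears": "seventeen"
}
]'

keys <- top_level_keys(json_string)
print(keys)  # expected keys: oranges, apples, bananas, grapes, pears
```

This tolerates the trailing commas because nothing is parsed as JSON, but it is only as reliable as the assumptions above. For anything less regular than the sample, a real parser such as the V8 approach is the safer choice.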