Costin Costin - 3 months ago 17
JSON Question

transform polygon json coordinates into a data.frame

I want to transform one data frame into another, ideally in as few commands as possible. Using dplyr or tidyr would be great.

To parse the coordinates list I used library(rjson). That part works, but I cannot manipulate the resulting list further to get my result.

Avoiding any for statement would be great, but any solution is good as long as it solves the problem :)

Input:

df <- data.frame(code = c("12000", "89000"),
polygon = c("[[[11,12], [13,14], [15,16]], [[21, 22], [23,24], [25,26]]]",
"[[[81,82], [83,84], [85,86]]]"))
df

> df
code polygon
1 12000 [[[11,12], [13,14], [15,16]], [[21, 22], [23,24], [25,26]]]
2 89000 [[[81,82], [83,84], [85,86]]]


Input data description:


  • column code contains the postal code

  • column polygon contains one or more polygons, each defined by latitude-longitude pairs of points



Output wanted:

> wanted
a lon lat id
1 12000 11 12 1
2 12000 13 14 1
3 12000 15 16 1
4 12000 21 22 2
5 12000 23 24 2
6 12000 25 26 2
7 89000 81 82 1
8 89000 83 84 1
9 89000 85 86 1


I want to plot the wanted data.frame using ggplot.

Thank you

Answer

purrr, dplyr and jsonlite solution:

df <- data.frame(code = c("12000", "89000"),
                 polygon = c("[[[11,12], [13,14], [15,16]], [[21, 22], [23,24], [25,26]]]",
                             "[[[81,82], [83,84], [85,86]]]"),
                 stringsAsFactors=FALSE)

library(purrr)
library(dplyr)
library(jsonlite)

make_coords <- function(x) {
  fromJSON(x$polygon, simplifyMatrix=FALSE) %>% 
  map_df(~map_df(., ~setNames(as.data.frame(as.list(.)), c("lat", "lon"))), .id="id")
} 

group_by(df, a=code) %>% 
  do(make_coords(.)) %>%
  ungroup() %>% 
  select(a, lat, lon, id)
## # A tibble: 9 x 4
##       a   lat   lon    id
##   <chr> <int> <int> <chr>
## 1 12000    11    12     1
## 2 12000    13    14     1
## 3 12000    15    16     1
## 4 12000    21    22     2
## 5 12000    23    24     2
## 6 12000    25    26     2
## 7 89000    81    82     1
## 8 89000    83    84     1
## 9 89000    85    86     1

This has the added benefit of validating the polygon data, since your example had invalid JSON (I had to edit out the final ] in the initial example).

NOTES:

  1. The group_by could be replaced by dplyr::rowwise or (with some other code changes) by purrr::by_row.
  2. The idiom is to iterate through each code, convert the JSON into a list of coordinates, iterate through that list, make a data frame out of each polygon, and assign the positional ID to it.
  3. The column names you want are assigned in three places: the initial group_by (to turn code into a), the innermost map_df (for lat & lon) and finally id which is auto-created by the outermost map_df.
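To make the idiom in note 2 easier to follow, here is a rough base-R unrolling of it for a single polygon string. It is only an illustrative sketch, not the code used in the answer: the names json, polys and coords are made up for this example, and it keeps the answer's assumption that the first value of each pair is lat and the second is lon (swap the two names if your data is longitude-first, as the wanted output in the question suggests).

```r
library(jsonlite)

# One value of the polygon column, taken from the example data
json <- "[[[11,12], [13,14], [15,16]], [[21, 22], [23,24], [25,26]]]"

# simplifyMatrix=FALSE keeps the nesting as a list of polygons,
# each polygon being a list of length-2 coordinate vectors
polys <- fromJSON(json, simplifyMatrix = FALSE)

# Walk over the polygons by position; the position becomes the id
coords <- do.call(rbind, lapply(seq_along(polys), function(i) {
  pts <- do.call(rbind, polys[[i]])   # one row per point, two columns
  data.frame(lat = pts[, 1], lon = pts[, 2], id = i)
}))

coords
```

The purrr version does the same two levels of iteration, with the outer map_df supplying the id column through .id.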

rowwise version:

make_coords2 <- function(x) {
  fromJSON(x$polygon, simplifyMatrix=FALSE) %>% 
    map_df(~map_df(., ~setNames(as.data.frame(as.list(.)), c("lat", "lon"))), .id="id") %>% 
    mutate(a=x$a)
}

select(df, a=code, polygon) %>% 
  rowwise() %>% 
  do(make_coords2(.)) %>%
  ungroup() %>% 
  select(a, lat, lon, id)

by_row version:

make_coords3 <- function(x) {
  fromJSON(x$polygon, simplifyMatrix=FALSE) %>% 
    map_df(~map_df(., ~setNames(as.data.frame(as.list(.)), c("lat", "lon"))), .id="id")
}

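# Note: by_row() was later moved out of purrr into the purrrlyr package;
# with a recent purrr, load it via library(purrrlyr) first.
select(df, a=code, polygon) %>% 
  by_row(make_coords3, .collate="rows") %>% 
  select(a, lat, lon, id)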
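
Since the end goal is a ggplot, here is one possible way to draw the result. This is a sketch under assumptions: coords_df is a hypothetical name holding the nine-row result shown earlier (rebuilt by hand here so the snippet runs on its own), and geom_polygon with a group of code plus id is just one reasonable choice.

```r
library(ggplot2)

# Hand-built copy of the result above; coords_df is a made-up name
coords_df <- data.frame(
  a   = rep(c("12000", "89000"), times = c(6, 3)),
  lat = c(11, 13, 15, 21, 23, 25, 81, 83, 85),
  lon = c(12, 14, 16, 22, 24, 26, 82, 84, 86),
  id  = c("1", "1", "1", "2", "2", "2", "1", "1", "1"),
  stringsAsFactors = FALSE
)

# id restarts at 1 for every code, so group on code and id together
p <- ggplot(coords_df,
            aes(x = lon, y = lat, group = interaction(a, id), fill = a)) +
  geom_polygon(alpha = 0.5, colour = "black")

p
```

Note that the sample points within each polygon are collinear, so with this toy data each polygon collapses to a line segment; real polygon data should give filled shapes.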