rankthefirst - 17 days ago
JSON Question

Adding quotes to JSON in R

I want to scrape the website: link

I use GET from httr and get a JSON-like object, but without quotes around the keys, like below:

"hxbase_json1({sum:3003,list:[{Number:'1'...


So jsonlite::fromJSON cannot read this JSON.

My code is

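library(httr)

url <- 'http://stockdata.stock.hexun.com/zrbg/data/zrbList.aspx?'
date <- '2015-12-31'
page <- 1

# Request one page of results; the parameters are sent as a query string
res <- GET(url, query = list(date = date,
                             count = 20,
                             pname = 20,
                             titType = 'null',
                             page = page))

resC <- content(res)
# Fails: the response body is not valid JSON
resC1 <- jsonlite::fromJSON(resC)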


Is there a package that adds the missing quotes automatically? Or is there any other way to read such JSON?

Answer

In the future, please post your R code and the correct URL(s). Technically this is not JSON data: it is a JavaScript construct, a function call wrapping an object literal with unquoted keys and single-quoted strings, which a strict JSON parser rejects. You can do a bit of surgery and enlist the help of the V8 package:

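library(httr)
library(V8)
library(stringi)
library(magrittr)  # provides the %>% pipe used below

res <- GET("http://stockdata.stock.hexun.com/zrbg/data/zrbList.aspx?date=2015-12-31&count=20&pname=20&titType=null&page=1&callback=hxbase_json11479871629254")

# Create a JavaScript context to evaluate the response in
ctx <- v8()

# Turn the call `hxbase_json1(...)` into the assignment `var dat=...`,
# then evaluate that statement as JavaScript
content(res, as = "text") %>% 
  stri_replace_first_fixed("hxbase_json1(", "var dat=") %>% 
  stri_replace_last_fixed(")", "") %>% 
  ctx$eval()

# Pull the JavaScript object back into R
ctx$get("dat") %>% 
  dplyr::glimpse()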
## List of 2
##  $ sum : int 3003
##  $ list:'data.frame': 20 obs. of  13 variables:
##   ..$ Number       : chr [1:20] "1" "2" "3" "4" ...
##   ..$ StockNameLink: chr [1:20] "stock_bg.aspx?code=000002&date=2015-12-31" "stock_bg.aspx?code=601601&date=2015-12-31" "stock_bg.aspx?code=000550&date=2015-12-31" "stock_bg.aspx?code=000001&date=2015-12-31" ...
##   ..$ industry     : chr [1:20] "万科A(000002)" "中国太保(601601)" "江铃汽车(000550)" "平安银行(000001)" ...
##   ..$ stockNumber  : chr [1:20] "24.36" "24.07" "23.01" "18.69" ...
##   ..$ industryrate : chr [1:20] "90.27" "86.41" "84.29" "84.14" ...
##   ..$ Pricelimit   : chr [1:20] "A" "A" "A" "A" ...
##   ..$ lootingchips : chr [1:20] "15.00" "15.00" "9.03" "15.00" ...
##   ..$ Scramble     : chr [1:20] "15.00" "12.00" "20.00" "15.00" ...
##   ..$ rscramble    : chr [1:20] "8.00" "6.00" "18.00" "8.00" ...
##   ..$ Strongstock  : chr [1:20] "27.91" "29.34" "14.25" "27.45" ...
##   ..$ Hstock       : chr [1:20] " <a href =\"http://www.cninfo.com.cn/finalpage/2016-03-14/1202040307.PDF\" target=\"_blank\"><img alt=\"\" src=\"img/table_btn1"| __truncated__ " <a href =\"http://www.cninfo.com.cn/finalpage/2016-03-28/1202085787.PDF\" target=\"_blank\"><img alt=\"\" src=\"img/table_btn1"| __truncated__ " <a href =\"http://www.cninfo.com.cn/finalpage/2016-03-19/1202057166.PDF\" target=\"_blank\"><img alt=\"\" src=\"img/table_btn1"| __truncated__ " <a href =\"http://www.cninfo.com.cn/finalpage/2016-03-10/1202033377.PDF\" target=\"_blank\"><img alt=\"\" src=\"img/table_btn1"| __truncated__ ...
##   ..$ Wstock       : chr [1:20] "<a href =\"http://stockdata.stock.hexun.com/000002.shtml\" target=\"_blank\"><img alt=\"\" src=\"img/icon_02.gif\"></img ></a>" "<a href =\"http://stockdata.stock.hexun.com/601601.shtml\" target=\"_blank\"><img alt=\"\" src=\"img/icon_02.gif\"></img ></a>" "<a href =\"http://stockdata.stock.hexun.com/000550.shtml\" target=\"_blank\"><img alt=\"\" src=\"img/icon_02.gif\"></img ></a>" "<a href =\"http://stockdata.stock.hexun.com/000001.shtml\" target=\"_blank\"><img alt=\"\" src=\"img/icon_02.gif\"></img ></a>" ...
##   ..$ Tstock       : chr [1:20] "<img alt=\"\" onclick=\"addIStock('000002','1');\"  code=\"\" codetype=\"\" \" src=\"img/icon_03.gif\"></img >" "<img alt=\"\" onclick=\"addIStock('601601','1');\"  code=\"\" codetype=\"\" \" src=\"img/icon_03.gif\"></img >" "<img alt=\"\" onclick=\"addIStock('000550','1');\"  code=\"\" codetype=\"\" \" src=\"img/icon_03.gif\"></img >" "<img alt=\"\" onclick=\"addIStock('000001','1');\"  code=\"\" codetype=\"\" \" src=\"img/icon_03.gif\"></img >" ...
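
The same idea can be tried offline without hitting the site. The following is only a minimal sketch under stated assumptions: the `jsonp` string is a made-up sample modelled on the snippet quoted in the question (the `industry` values are invented), and `strip_jsonp` is a hypothetical helper, not part of V8 or httr. It removes the wrapper with a regular expression so that the exact callback name does not matter, confirms that the remaining payload is not valid JSON, and then lets V8 evaluate it as JavaScript.

```r
library(V8)
library(jsonlite)

# Made-up sample response, modelled on the snippet quoted in the question.
# The real payload has more fields per row.
jsonp <- "hxbase_json1({sum:3003,list:[{Number:'1',industry:'A'},{Number:'2',industry:'B'}]})"

# Hypothetical helper: drop "<callback>(" from the front and ")" (plus an
# optional semicolon) from the back, whatever the callback is called.
strip_jsonp <- function(x) {
  x <- sub("^\\s*[A-Za-z_$][A-Za-z0-9_$]*\\s*\\(", "", x, perl = TRUE)
  sub("\\)\\s*;?\\s*$", "", x, perl = TRUE)
}

payload <- strip_jsonp(jsonp)

# Keys are unquoted and strings use single quotes, so this is a JavaScript
# object literal rather than JSON; jsonlite::validate() reports it as invalid.
is_json <- isTRUE(jsonlite::validate(payload))

# Evaluate the literal as JavaScript and read the resulting object back into R
ctx <- v8()
ctx$eval(paste0("var dat = ", payload, ";"))
dat <- ctx$get("dat")

str(dat)
```

Because the wrapper is matched by pattern rather than by a fixed string, this variant should keep working if the server echoes a different callback name, though it still assumes the response is a single function call around one object literal.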