Tarak Tarak - 3 months ago 21
R Question

Using rvest to extract a value from HTML

Using R, I have getBrandlist, an HTML fragment that looks like this:

<div>\n <span class="txt edittext">BrandName1 </span>\n <span
class="cnt" data-val="116">(42)</span>\n</div>
<div>\n <span class="txt edittext">BrandName2 </span>\n <span
class="cnt" data-val="116">(62)</span>\n</div>
......


I have the number 62 and want to extract the brand name that corresponds to it (BrandName2).
I tried
html_node(getBrandlist, css = '.cnt') %>% html_attr()

How do I go about this? Any help will be greatly appreciated.

Answer

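The count is the text of the .cnt span, not one of its attributes (data-val is 116 for both brands), so html_attr() will not find it. You can instead use an XPath expression that selects the span whose text is "(62)" and then steps to its preceding sibling span, which holds the brand name: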

library(rvest)
doc <- read_html('<div>\n  <span class="txt edittext">BrandName1 </span>\n  <span 
 class="cnt" data-val="116">(42)</span>\n</div>
 <div>\n  <span class="txt edittext">BrandName2 </span>\n  <span 
 class="cnt" data-val="116">(62)</span>\n</div> ')
# Select the span whose text is "(62)", then take the sibling span before it
html_node(doc, xpath = "//span[text()='(62)']/preceding-sibling::span") %>% html_text()
# [1] "BrandName2 "
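
If the count is held in a variable rather than hard-coded, the same XPath can be built at run time. The following is only a sketch under a few assumptions: brand_for_count is a hypothetical helper name (it is not part of rvest), the count is a whole number, and the markup follows the pattern shown in the question. It also trims the trailing space from the result with trimws().

```r
library(rvest)

# Sample markup modelled on the question; `doc` stands in for getBrandlist.
doc <- read_html('<div><span class="txt edittext">BrandName1 </span><span class="cnt" data-val="116">(42)</span></div>
<div><span class="txt edittext">BrandName2 </span><span class="cnt" data-val="116">(62)</span></div>')

# Hypothetical helper (not an rvest function): return the brand name(s)
# whose count span reads "(<count>)". Assumes `count` is a whole number.
brand_for_count <- function(doc, count) {
  xpath <- sprintf(
    "//span[contains(@class, 'cnt') and normalize-space(text()) = '(%d)']/preceding-sibling::span[contains(@class, 'txt')]",
    as.integer(count)
  )
  # html_nodes() returns every match, so duplicate counts yield several names
  trimws(html_text(html_nodes(doc, xpath)))
}

brand_for_count(doc, 62)
# [1] "BrandName2"
```

When no span carries the requested count, the helper returns an empty character vector, so checking the length of the result is enough to detect a miss.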