Selrac Selrac - 2 months ago 9
Java Question

Selenium scraping from repeated hierarchy

I have an HTML hierarchy and I want to extract some of the information from it. The HTML hierarchy is as follows:

<div class="span">
<div class="da">
<div class="db">
<div class="dc">
<ul class="ua" id="ua_id" >
<li class="la_id">
<div class="de details">
<div class="df">
<div class="dg">
<span class="th">
<p class="p4">p: 4</p></span>
<p class="where">Description</p>
<div class='dh'>
<div class='di'>
<div>08 Oct 2016</div>
<div class="dj">loc</div>
</div>
</div>
</div>
</div>
</div>
</li>
<li class="la_id">
<div class="de details">
<div class="df">
<div class="dg">
<span class="th">
<p class="p3 plus">p: 3 plus</p></span>
<p class="where">Description</p>
<div class='dh'>
<div class='di'>
<div>02 Oct 2016</div>
<div class="dj">loc</div>
</div>
</div>
</div>
</div>
</div>
</li>
</ul>
</div>
</div>
</div>
</div>


The <li> tag gets repeated, and I want to extract the details inside each one.

    I tried the following but it doesn't work:

    List<WebElement> details = driver.findElements(By.className("la_id"));
    for (WebElement e : details) {
        System.out.println(e.getAttribute("where"));
    }


    I also cannot figure out how to get values such as the one in:

    <div class="dj">loc</div>


    Can someone please give me a hint on this?

    Thanks

Answer

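    In your case, "where" is not an attribute of the element you found. It is the class name of a <p> element nested inside each <li>, so getAttribute("where") returns null.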

    If you need to find the elements with class "where", then, following your example, you can locate them the same way you located the <li> elements and read their text with getText():

    List<WebElement> details = driver.findElements(By.className("where"));
    for (WebElement e : details) {
        System.out.println(e.getText());
    }
    

    If you want to locate it relative to a parent element in your tree, you can use a CSS selector such as:

    List<WebElement> details = driver.findElements(By.cssSelector("li.la_id p.where"));
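
To read several values from each repeated <li> (including the <div class="dj">loc</div> value asked about), one option is to find the <li> elements first and then search inside each one. The following is a minimal sketch, not a definitive implementation. The class name DetailScraper and the method name extractDetails are hypothetical. The selectors assume the markup is exactly as posted in the question, in particular that the date is the first <div> child of div.di, and that the driver has already loaded the page.

```java
import java.util.ArrayList;
import java.util.List;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;

// Hypothetical helper class, named for illustration only.
class DetailScraper {

    // Returns one "description | date | location" string per <li class="la_id">.
    // Assumes the driver has already navigated to the page from the question.
    static List<String> extractDetails(WebDriver driver) {
        List<String> rows = new ArrayList<>();

        // Each <li> is one repeated record.
        List<WebElement> items = driver.findElements(By.cssSelector("ul#ua_id > li.la_id"));

        for (WebElement item : items) {
            // Calling findElement on the <li> limits the search to that record.
            String where = item.findElement(By.cssSelector("p.where")).getText();
            // Assumption: the date is the first <div> directly inside div.di.
            String date = item.findElement(By.cssSelector("div.di > div:first-child")).getText();
            String loc = item.findElement(By.cssSelector("div.dj")).getText();

            rows.add(where + " | " + date + " | " + loc);
        }
        return rows;
    }
}
```

With a live driver this could be used as DetailScraper.extractDetails(driver).forEach(System.out::println). Note that findElement throws NoSuchElementException when a record lacks one of the nested elements. If that can happen on the real page, calling findElements and checking for an empty list is a safer choice.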
    