Emad Jaber - 2 months ago
Android Question

extract strings using Jsoup

I'm trying to get some names from a website's HTML page using the Jsoup library. The problem is that I'm getting the elements by class with getElementsByClass("name") and storing the result in a String variable.
The result comes out like this: "mike andro rob banks maria gerardo louis....etc".
What I want is to separate the names and store them in an array.
The following is the code used:

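public String processText(String htmlPage) {
    // Parse the raw HTML string into a Jsoup Document
    Document html = Jsoup.parse(htmlPage);
    // text() on the matched elements returns their combined text as one string
    String names = html.body().getElementsByClass("name").text();
    return names;
}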


More information:
The source is an HTML page. I'm saving the full HTML code in a string and then processing the string to extract only the elements with class="name".

The string htmlPage that I'm passing to the processText method is similar to the following:



<div class="name">
Rob Kardashian
</div>
</div>
</a>
</div>
<div class="channelListEntry">
<a href="/zayn_malik">
<div class="image">
<img src="http://cdn.posh24.com/images/:profile/014cf47ca44daf8f44a3e0720929ee327" alt="Zayn Malik"/>
</div>


<div class="info">
<div class="status-container">
<div class="position">4</div>

<div class="img pos"></div>
<div class="value">+12</div>

</div>
<div class="name">
Zayn Malik
</div>
</div>
</a>
</div>
<div class="channelListEntry">
<a href="/kanye_west">
<div class="image">
<img src="http://cdn.posh24.com/images/:profile/03f352f71ffab135cd81821eb190d4832" alt="Kanye West"/>
</div>


<div class="info">
<div class="status-container">
<div class="position">5</div>

<div class="img pos"></div>
<div class="value">+16</div>

</div>
<div class="name">
Kanye West
</div>
</div>
</a>
</div>
<div class="channelListEntry">
<a href="/kendall_jenner">
<div class="image">
<img src="http://cdn.posh24.com/images/:profile/066d5c02547c4357f1bc5f633c68f4085" alt="Kendall Jenner"/>
</div>




Answer

You can simply use the split function to get an array from the string:

String[] arr = names.trim().split("\\s");

Also, if the names are separated by a mix of spaces and tabs, use:

String[] arr = names.trim().split("\\s+");
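
One thing to watch with the split approach: it cuts at every whitespace run, so a first and last name end up in separate array entries. A minimal sketch to illustrate this, assuming the combined text looks like the made-up string below (the class and method names are only for this example):

```java
import java.util.Arrays;

public class SplitDemo {
    // Splits the combined text on runs of whitespace (spaces, tabs, newlines).
    static String[] splitOnWhitespace(String names) {
        return names.trim().split("\\s+");
    }

    public static void main(String[] args) {
        // Hypothetical combined text, similar to what Elements.text() would return
        String names = "Rob Kardashian Zayn Malik Kanye West";
        String[] parts = splitOnWhitespace(names);
        // Six entries, not three: each first and last name is a separate element
        System.out.println(parts.length);
        System.out.println(Arrays.toString(parts));
    }
}
```

So this only gives one entry per person when every name is a single word.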

Update:

ArrayList<String> name = new ArrayList<String>();
// Iterate over the matched elements so each full name is added as its own entry
for (Element output : html.body().getElementsByClass("name")) {
    name.add(output.text());
}

The list can then be converted to an array.
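
For the list-to-array step, a small sketch using the standard List.toArray method. The sample names are taken from the HTML in the question; the class and method names are just placeholders for this example:

```java
import java.util.ArrayList;
import java.util.List;

public class ListToArrayDemo {
    // Copies the list contents into a new String array of the same size.
    static String[] toArray(List<String> names) {
        return names.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Stand-in for the list filled from getElementsByClass("name")
        List<String> name = new ArrayList<String>();
        name.add("Rob Kardashian");
        name.add("Zayn Malik");
        name.add("Kanye West");

        String[] arr = toArray(name);
        for (String n : arr) {
            System.out.println(n);
        }
    }
}
```

Passing an empty array to toArray is enough; the method allocates an array of the right length and type.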
