clint clint - 9 months ago 52
Java Question

Detect all URLs in img tags in a large string (Android)

I have a string in Android like this:

xyz xyx xyz<br /><img style="max-width: 100%;" src="https://blablab.png" alt="Loading..." /></p>abc</p><img style="max-width: 100%;" src="https://blablab2.png" alt="Loading..." /><div>abc</div><img style="max-width: 100%;" src="https://blablab3.png" alt="Loading..." />

Here, I need to retrieve the value of the src attribute (the URL of the image file) of every img tag and replace each img tag with its Base64 value. How can I do that, i.e. first detect the values of all the src attributes and then replace the img tags with their Base64 values?

Answer Source

Unless you are Chuck Norris or Jon Skeet, you shouldn't use RegEx to match HTML. I would suggest using Jsoup. Here is an example using the string from your question:

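import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
String html = "xyz xyx xyz<br /><img style=\"max-width: 100%;\" src=\"https://blablab.png\" alt=\"Loading...\" /></p>abc</p><img style=\"max-width: 100%;\" src=\"https://blablab2.png\" alt=\"Loading...\" /><div>abc</div><img style=\"max-width: 100%;\" src=\"https://blablab3.png\" alt=\"Loading...\" />";
Document document = Jsoup.parse(html);
Elements imgs = document.select("img[src]"); // every <img> that has a src attribute
for (Element img : imgs) {
  String url = img.attr("src"); // the image URL found in this tag
  img.attr("src", "");          // placeholder: set the Base64 value for this image here
}
String newHtml = document.html(); // the modified HTML as a string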
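
The snippet above leaves the Base64 step open. A minimal sketch of that step follows, assuming you have already downloaded the image bytes yourself (not shown) and want to embed them as a data URI. The class name DataUriSketch and the method toDataUri are hypothetical names chosen for illustration, and java.util.Base64 is only available on Android from API level 26, so on older versions android.util.Base64 would be the likely substitute.

```java
import java.util.Base64;

// Hypothetical helper, not part of Jsoup or the original answer.
class DataUriSketch {

    // Builds a data URI such as "data:image/png;base64,...." from raw image bytes.
    // mimeType is assumed to be known by the caller (e.g. "image/png").
    static String toDataUri(byte[] imageBytes, String mimeType) {
        String encoded = Base64.getEncoder().encodeToString(imageBytes);
        return "data:" + mimeType + ";base64," + encoded;
    }

    public static void main(String[] args) {
        // Three dummy bytes stand in for real image data.
        byte[] fakeImage = new byte[] {1, 2, 3};
        System.out.println(toDataUri(fakeImage, "image/png"));
        // prints "data:image/png;base64,AQID"
    }
}
```

Inside the loop, the result could then be written back with img.attr("src", DataUriSketch.toDataUri(bytes, "image/png")), where bytes is whatever you fetched for that image's URL.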