jscripter - 2 months ago
HTML Question

get text between two images

Is there a simple way to get the text between two images that don't share a parent element?
I'm writing a userscript for a web page.
For example:

<div id="content">
<div style="text-align:center"><img src="" alt=""></div>
<a>some text</a>
<img src="" alt="">
<div style="text-align:left">more text</div>
</div>


How can I get the text between the first and second image inside the content div? I don't know the exact structure in advance, because the text and the images may be nested inside div or a elements. I'd rather not use libraries.

Answer Source

You basically want to handle the <img> tags as quotes around text you want to extract.

The easiest way to do that is to replace each <img> tag with something unlikely to appear in the text, and use that as a delimiter. I'll show you how using jQuery. If you need it done in pure JS, then you'll have to convert this.

First, make a copy of the HTML.

var html = $('<div>').append($("#content").html());

Next, replace all <img> tags with a special character (or other token you know is unique).

html.find("img").replaceWith("<div>~</div>");

Once you've done that you can just match text between those delimiters like this.

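var str = html.text();
// The lookahead (?=~) leaves the closing delimiter unconsumed,
// so it can also open the next match when there are more than two images.
var rx = /~([^~]+)(?=~)/g;
var match = rx.exec(str);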

To find all matches just repeat.

while(match != null)
{
    alert(match[1]);
    match = rx.exec(str);    
}
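
If you'd rather have the matches in an array than alert them one by one, the loop can be wrapped in a small helper. This is only a sketch: the name segmentsBetweenTildes is made up for illustration, and it assumes ~ never appears in the real page text.

```javascript
// Hypothetical helper: returns every run of text that sits between two "~"
// delimiters in `str`, trimmed, skipping runs that are only whitespace.
function segmentsBetweenTildes(str) {
  // (?=~) checks for the closing delimiter without consuming it, so the same
  // "~" can act as the opening delimiter of the following segment.
  var rx = /~([^~]+)(?=~)/g;
  var segments = [];
  var match;
  while ((match = rx.exec(str)) !== null) {
    var text = match[1].trim();
    if (text !== "") {
      segments.push(text);
    }
  }
  return segments;
}
```

The first entry of the returned array is the text between the 1st and 2nd image; anything before the first ~ or after the last one is ignored.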

It's possible to do the same with a unique phrase like @img@ instead of a single character, but a single character is easier because it can go straight into a negated character class such as [^~].
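
For the phrase variant, one option that avoids a more complicated regex is to split on the token instead. A rough sketch, where segmentsBetweenToken is a hypothetical name and the token is assumed not to occur in the page text:

```javascript
// Hypothetical helper: split on a multi-character token and keep only the
// pieces that have a token on both sides.
function segmentsBetweenToken(str, token) {
  var parts = str.split(token);
  // parts[0] comes before the first token and the last part comes after the
  // final token, so neither of them is enclosed by two tokens.
  return parts.slice(1, -1)
    .map(function (s) { return s.trim(); })
    .filter(function (s) { return s !== ""; });
}
```

Calling segmentsBetweenToken(str, "@img@") gives the same kind of array as the regex loop, with whitespace-only pieces dropped.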

Here's a working fiddle of the jQuery version.

http://jsfiddle.net/thinkingmedia/etx1z6ov/2/
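
Since the question asks for no libraries, here is a rough plain-JS take on the same idea. It is not the code from the fiddle. Instead of inserting a delimiter and running a regex, it walks the tree in document order and treats each <img> as the boundary directly. The function name textBetweenImages is made up for this sketch, and it only relies on the standard childNodes, nodeType, nodeName and nodeValue properties.

```javascript
// Hypothetical helper: collects the text found between consecutive <img>
// elements anywhere under `root`, in document order.
function textBetweenImages(root) {
  var ELEMENT_NODE = 1;
  var TEXT_NODE = 3;
  var segments = [];
  var current = null; // stays null until the first <img> has been seen

  function walk(node) {
    for (var i = 0; i < node.childNodes.length; i++) {
      var child = node.childNodes[i];
      if (child.nodeType === ELEMENT_NODE && child.nodeName.toUpperCase() === "IMG") {
        // An image closes the segment in progress (if any) and opens a new one.
        if (current !== null && current.trim() !== "") {
          segments.push(current.trim());
        }
        current = "";
      } else if (child.nodeType === TEXT_NODE) {
        if (current !== null) {
          current += child.nodeValue;
        }
      } else if (child.nodeType === ELEMENT_NODE) {
        // Descend into div, a, and any other wrapper elements.
        walk(child);
      }
    }
  }

  walk(root);
  // Text after the last <img> is never pushed, because nothing closes it.
  return segments;
}
```

In the userscript this would be called as textBetweenImages(document.getElementById("content"))[0] to get the text between the 1st and 2nd image. Because it never modifies the page, no copy of the HTML is needed.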