Webill - 7 months ago
JavaScript Question

Get the full path of all files of certain types from an HTML file and put them into an array using JS/jQuery

I need to get the paths from something like this:

<object>
<p>https://bla-bla-bla/thing.flv</p>
</object>

<p>level/thing.mp3</p>
<ul>
<li>https://thing/otherthing/thing.srt</li></ul>


The file paths can appear anywhere inside the HTML file.

I tried some possibilities, but without success.

Any clue?

Thanks a lot!

I need to get some file names with the proper address and put them into an array:

myArray[0]='https://bla-bla-bla/othername.flv'
myArray[1]='/level/name.mp3'
myArray[2]='https://text/othertext/name.srt'


..and so on

I'm very close to solving it with a regexp. This is what I tried:

var str = document.getElementById("content").innerHTML;

var res = str.match(/=http.*?.flv/gi);


This does match the path, but it also returns all the text around it. For example:

I need this:

'https://this/otherthing/thing.srt'


But I'm getting this:

'more https stuff from other url ...https://this/otherthing/thing.srt even more text...'


I need individual URLs, not one giant string that runs from the first http to the first .srt. Each array entry should be a valid path.

Answer

Since .* grabs as many matching characters as it can, you need to be more specific about what can and can't be in the middle.

Try:

var res = str.match(/https?:\/\/\S+\.flv/gi);

where \S+ grabs as many non-whitespace characters as it can, so a match can never run across the whitespace between two URLs.
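
To see the difference, here is a small sketch run against a made-up string. The URLs and variable names are assumptions for illustration and do not come from the question's page:

```javascript
// Hypothetical input: two URLs on one line, separated by ordinary text.
const sampleText = 'first https://one.example/a.flv then https://two.example/b.flv end';

// Greedy .* runs on to the LAST ".flv", so both URLs land in a single match.
const greedy = sampleText.match(/https?:\/\/.*\.flv/gi);

// \S+ cannot cross whitespace, so each URL is matched on its own.
const perUrl = sampleText.match(/https?:\/\/\S+\.flv/gi);

console.log(greedy); // [ 'https://one.example/a.flv then https://two.example/b.flv' ]
console.log(perUrl); // [ 'https://one.example/a.flv', 'https://two.example/b.flv' ]
```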

To exclude other characters as well (here, semicolons in addition to whitespace), use a negated character class, [^...]:

var res = str.match(/https?:\/\/[^\s;]+\.flv/gi);
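
The same idea may be enough to fill the array from the question. The sketch below is one possible approach, not a definitive one: the helper name collectFilePaths, the list of excluded characters, and the assumption that extensions are plain letters and digits are all mine. It runs on a string copy of the question's markup; in the page you would pass the innerHTML string instead.

```javascript
// Hypothetical helper: collect every path that ends in one of the given extensions.
// Assumption: whitespace, <, >, quotes and = never occur inside a path in this markup.
// Assumption: extensions contain only letters and digits, so they need no escaping.
function collectFilePaths(html, extensions) {
  const ext = extensions.join('|');
  // One or more allowed characters, a literal dot, then one of the extensions.
  // \b stops ".flv" from matching the start of a longer word such as ".flvx".
  const pattern = new RegExp('[^\\s<>"\'=]+\\.(?:' + ext + ')\\b', 'gi');
  // match() returns null when nothing is found, so fall back to an empty array.
  return html.match(pattern) || [];
}

// The markup from the question, written without whitespace between the tags.
const html =
  '<object><p>https://bla-bla-bla/thing.flv</p></object>' +
  '<p>level/thing.mp3</p>' +
  '<ul><li>https://thing/otherthing/thing.srt</li></ul>';

const myArray = collectFilePaths(html, ['flv', 'mp3', 'srt']);
console.log(myArray);
// [ 'https://bla-bla-bla/thing.flv', 'level/thing.mp3', 'https://thing/otherthing/thing.srt' ]
```

Excluding < and > matters here because innerHTML often has no whitespace between a path and the next tag, so \S+ alone could run through the tags into the following path.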

Alternatively, keep your .* lazy rather than greedy with a well-placed ?, and escape the dot before flv so that it matches a literal period:

var res = str.match(/http.*?\.flv/gi);
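
One caveat is worth checking against your own input: a lazy quantifier only shortens the match from the right. The match still begins at the first http the engine finds, so an unrelated link earlier in the text can be pulled in. A minimal sketch, using a made-up string and hypothetical variable names:

```javascript
// Hypothetical input: an unrelated link precedes the .flv URL.
const pageText = 'see http://a.example/page.html and https://b.example/clip.flv here';

// Lazy .*? stops at the first ".flv", but the match still begins at the first "http".
const lazy = pageText.match(/http.*?\.flv/gi);

// Forbidding whitespace inside the URL keeps the match to the .flv link alone.
const strict = pageText.match(/https?:\/\/\S+\.flv/gi);

console.log(lazy);   // [ 'http://a.example/page.html and https://b.example/clip.flv' ]
console.log(strict); // [ 'https://b.example/clip.flv' ]
```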