Vlad Nitu - 5 months ago
PHP Question

PHP get text between a string and another

So I have a lot of text in a text file that acts like a "database" and I need to extract a specific part that starts from a string and ends with another one.


To be more specific, some of the "database" looks like this:

i:24;s:5:"sName";s:12:"adsfasdffdfd";s:7:"iStatus";i:1;s:9:"iPosition";i:0;s:17:"sDescriptionShort";s:29:"<p>test short description</p>";s:16:"sDescriptionFull";s:28:"<p>test full description</p>";


And I need to extract the part between <p> and </p>, given the leading i:24 as input, where the number (24) is the parameter.

I tried using a regexp, but with no success so far.

Now I know it's not good practice asking for code itself but this time I'm really stuck! Any ideas?

P.S. The file contains strings like this, one after another. So I need the regexp to find an i:$a, with $a being my number, and return the content of the first paragraph it encounters.


So what I expect to be returned is:
<p>test short description</p>

This should be the first paragraph encountered AFTER i:24.

Answer

So you're looking for text that comes after the literal sequence i:24? Since none of these characters are special in a regex, let's begin our pattern construction with that literal sequence...

i:24

Next there may or may not be more characters to consume between the i:24 and the opening <p> tag. Let's assume that these characters can be anything, so we'll use the wildcard metacharacter . with the zero-or-more quantifier * (equivalent to {0,}), giving us the pattern below. Note that . does not match a newline unless the s modifier is set.

i:24.*

We want to tame the regex engine's appetite, so let's make the quantifier non-greedy (lazy) by appending a ?.

i:24.*?
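To make the difference concrete, here is a small sketch comparing the greedy and lazy forms on a shortened version of the sample line. The variable names are only illustrative, and the capture group around <p> is included here just so the match position can be inspected.

```php
<?php
// Shortened version of the sample record from the question.
$data = 'i:24;s:5:"sName";s:17:"sDescriptionShort";s:29:"<p>test short description</p>";'
      . 's:16:"sDescriptionFull";s:28:"<p>test full description</p>";';

// Greedy: .* first runs to the end of the line, then backtracks to the LAST <p>.
preg_match('/i:24.*(<p>)/', $data, $greedy, PREG_OFFSET_CAPTURE);

// Lazy: .*? grows one character at a time and stops at the FIRST <p>.
preg_match('/i:24.*?(<p>)/', $data, $lazy, PREG_OFFSET_CAPTURE);

// With PREG_OFFSET_CAPTURE each group is a [text, byte offset] pair.
$greedyOffset = $greedy[1][1];
$lazyOffset   = $lazy[1][1];

var_dump($lazyOffset === strpos($data, '<p>'));    // lazy lands on the first <p>
var_dump($greedyOffset === strrpos($data, '<p>')); // greedy lands on the last <p>
```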

Next we want to match AND CAPTURE an opening <p>...

i:24.*?(<p>)

...and the content inside of the <p> tag, which we'll assume can be anything (read: wildcard) and maybe nothing, so we use the zero-or-more quantifier * again.

i:24.*?(<p>.*)

Remember to tame our * quantifier's appetite so that, once the closing tag is added, it stops at the first </p> instead of running on to a later one.

i:24.*?(<p>.*?)

And finally we'll close it off by consuming and capturing the closing </p> tag. The forward slash is not a regex metacharacter, but it must be escaped when / is also used as the pattern delimiter, as is common with PHP's preg_ functions.

i:24.*?(<p>.*?<\/p>)
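As a rough sketch of how this pattern might be used from PHP, assuming the whole file has already been read into a string: the helper name firstParagraphAfterId is hypothetical, and three small changes are my own additions rather than part of the pattern above. A leading \b and a trailing ; keep i:24 from matching inside i:240, # is used as the delimiter so the slash needs no escaping, and the s modifier lets . cross line breaks.

```php
<?php
// Hypothetical helper: returns the first <p>...</p> found after "i:<id>;",
// or null when the id (or a paragraph after it) is not present.
function firstParagraphAfterId(string $data, int $id): ?string
{
    // \b and the trailing ";" stop "i:24" from matching inside "i:240".
    // "#" is the delimiter, so the "/" in </p> does not need escaping.
    // The "s" modifier lets "." match newlines (assumes a record may span lines).
    $pattern = '#\bi:' . $id . ';.*?(<p>.*?</p>)#s';

    return preg_match($pattern, $data, $m) === 1 ? $m[1] : null;
}

// Sample record from the question.
$data = 'i:24;s:5:"sName";s:12:"adsfasdffdfd";s:7:"iStatus";i:1;s:9:"iPosition";i:0;'
      . 's:17:"sDescriptionShort";s:29:"<p>test short description</p>";'
      . 's:16:"sDescriptionFull";s:28:"<p>test full description</p>";';

echo firstParagraphAfterId($data, 24), PHP_EOL; // <p>test short description</p>
```

One limitation to keep in mind: the pattern cannot tell a record key such as i:24; apart from an integer value such as the i:1; stored under iStatus, so an id that also appears as a value elsewhere in the file could match in the wrong place.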

Hope this works for what you're trying to accomplish.
