Javacadabra - 8 days ago
PHP Question

Regular Expression preg_match() for hreflang

I'm working on a PHP script to output a number of hreflang tags based on the page the user is on. Currently I am using preg_match() to match the requested URL against a specific pattern. If it matches, the relevant tags are output.

So I've got something like this:

<?php if (preg_match("/^\/en-gb\/company/i", $url)): ?>
<link rel="alternate" href="https://www.mydomain1.com/en-gb/company/" hreflang="en-gb" />
<link rel="alternate" href="https://www.mydomain1.com/company/" hreflang="en" />
<link rel="alternate" href="https://www.mydomain1.com/company/" hreflang="x-default" />
<?php endif; ?>


When I visit the page www.mydomain1.com/en-gb/company it outputs the above tags. I also have another URL, www.mydomain1.com/en-gb/company/stats, so I wrote another block for this URL:

<?php if (preg_match("/^\/en-gb\/company\/stats/i", $url)): ?>
<link rel="alternate" href="https://www.mydomain1.com/en-gb/company/stats" hreflang="en-gb" />
<link rel="alternate" href="https://www.mydomain1.com/company/stats" hreflang="en" />
<link rel="alternate" href="https://www.mydomain1.com/company/stats" hreflang="x-default" />
<?php endif; ?>


However, this page outputs all of the hreflang tags from the first if statement and never enters the second. I know my regular expression is wrong but I can't pinpoint exactly where. I'd appreciate any help resolving this.

Answer

You may restrict the first regex so that it does not match /en-gb/company when it is followed by /stats, using a negative lookahead:

preg_match("/^\/en-gb\/company(?!\/stats(?:\/|$))/i", $url)
                              ^^^^^^^^^^^^^^^^^^^ 

See the regex demo

The (?!\/stats(?:\/|$)) negative lookahead fails any match of /en-gb/company that is immediately followed by /stats/ or by /stats at the end of the input.
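
To see how the lookahead behaves, here is a minimal sketch. The matchesCompany() helper and the sample paths are hypothetical names used only for illustration, and it assumes $url holds just the request path (no host, no query string).

```php
<?php
// Hypothetical helper: true when the path starts with /en-gb/company
// but is not the /stats page (or anything beneath /stats/).
function matchesCompany(string $url): bool
{
    return preg_match('/^\/en-gb\/company(?!\/stats(?:\/|$))/i', $url) === 1;
}

// Sample paths, assumed for illustration only.
$paths = [
    '/en-gb/company',
    '/en-gb/company/',
    '/en-gb/company/stats',
    '/en-gb/company/stats/',
    '/en-gb/company/statsy',
];

foreach ($paths as $path) {
    // Prints each path followed by true or false.
    echo $path, ' => ', var_export(matchesCompany($path), true), PHP_EOL;
}
```

The two /stats paths are rejected, while /en-gb/company/statsy still matches because the lookahead only blocks /stats when it is followed by / or the end of the input.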

Note that if you just want to check the entire path with the first regex, you may use Paladin76's approach of adding the $ end-of-string anchor, but I'd also allow an optional / at the end:

preg_match("/^\/en-gb\/company\/?$/i", $url)
                              ^^^^

See this regex demo

Now this pattern matches only inputs that are exactly /en-gb/company or /en-gb/company/.
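
A short sketch of the anchored variant, for comparison. The isCompanyPage() helper is a hypothetical name, and the example again assumes $url contains only the path.

```php
<?php
// Hypothetical helper: true only for the company landing page itself,
// with or without a trailing slash.
function isCompanyPage(string $url): bool
{
    return preg_match('/^\/en-gb\/company\/?$/i', $url) === 1;
}

var_dump(isCompanyPage('/en-gb/company'));        // bool(true)
var_dump(isCompanyPage('/en-gb/company/'));       // bool(true)
var_dump(isCompanyPage('/en-gb/company/stats'));  // bool(false)
var_dump(isCompanyPage('/EN-GB/Company'));        // bool(true), because of the i flag
```

If $url could carry a query string, the $ anchor would reject it, so that case would need handling separately.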