Joe Jankowiak - 1 month ago
PHP Question

Regex to get instagram picture (PHP)

I'm trying to use a regular expression to check whether a URL is an Instagram picture and return just the beginning of the URL, up to and including /p/PICTUREID.

So far this is what I've been able to come up with:

^(.*instagram.com\/p\/.*)\/


This, however, requires a trailing slash, and I do not want to require one.

Examples (that should match):

https://www.instagram.com/p/BKbwlrfjGHY/?post ->
https://www.instagram.com/p/BKbwlrfjGHY

http://www.instagram.com/p/BKbwlrfjGHY/ ->
http://www.instagram.com/p/BKbwlrfjGHY

instagram.com/p/BKbwlrfjGHY ->
instagram.com/p/BKbwlrfjGHY


How do I stop at the trailing slash, if it exists, and ignore anything after it?

Here is my regex101 for testing:

https://regex101.com/r/JJS2kz/1

Answer

Solution 1

You can use this regex to match all the examples you provided:

/(https?:\/\/www\.)?instagram\.com(\/p\/\w+\/?)/

Explanation

The first part of the regex looks for http or https followed by www. and makes the whole combination optional.

(https?:\/\/www\.)?

The second part matches the literal string instagram.com (the dot is escaped so that it matches only a period):

instagram\.com

The third part matches /p/ followed by one or more word characters (letters, digits, or underscores) and an optional trailing slash. Note that this part of the regex is in parentheses so you can retrieve it later when you use preg_match_all.

(\/p\/\w+\/?)
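
To make the group numbering concrete, here is a minimal sketch that runs the Solution 1 pattern against a single URL with preg_match. The constant INSTAGRAM_PATTERN and the helper matchInstagramUrl are hypothetical names used only for illustration.

```php
<?php
// Solution 1 pattern, copied from the answer above.
const INSTAGRAM_PATTERN = '/(https?:\/\/www\.)?instagram\.com(\/p\/\w+\/?)/';

// Hypothetical helper: returns the match array for one URL, or null if there is no match.
function matchInstagramUrl(string $url): ?array
{
    return preg_match(INSTAGRAM_PATTERN, $url, $m) === 1 ? $m : null;
}

$m = matchInstagramUrl('https://www.instagram.com/p/BKbwlrfjGHY/?post');

echo $m[0], "\n"; // whole match: https://www.instagram.com/p/BKbwlrfjGHY/
echo $m[1], "\n"; // group 1:     https://www.
echo $m[2], "\n"; // group 2:     /p/BKbwlrfjGHY/
```

Index 0 always holds the whole match, and the parenthesized groups follow in the order their opening parentheses appear in the pattern.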

Solution 2

If you also want to support the following patterns (with http/https but without www):

http://instagram.com/p/BkbwlrfjGHY
http://instagram.com/p/BkbwlrfjGHY/
https://instagram.com/p/BkbwlrfjGHY
https://instagram.com/p/BkbwlrfjGHY/

You could use this regex:

/(https?:\/\/(www\.)?)?instagram\.com(\/p\/\w+\/?)/
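
The difference between the two solutions shows up on a URL that has a scheme but no www. The sketch below compares the full match each pattern produces. The constant names and the fullMatch helper are hypothetical and not part of the original answer.

```php
<?php
// Both patterns are copied from the answer above.
const SOLUTION_1 = '/(https?:\/\/www\.)?instagram\.com(\/p\/\w+\/?)/';
const SOLUTION_2 = '/(https?:\/\/(www\.)?)?instagram\.com(\/p\/\w+\/?)/';

// Hypothetical helper: returns the full match, or null when nothing matches.
function fullMatch(string $pattern, string $url): ?string
{
    return preg_match($pattern, $url, $m) === 1 ? $m[0] : null;
}

$url = 'http://instagram.com/p/BkbwlrfjGHY/';

echo fullMatch(SOLUTION_1, $url), "\n"; // instagram.com/p/BkbwlrfjGHY/
echo fullMatch(SOLUTION_2, $url), "\n"; // http://instagram.com/p/BkbwlrfjGHY/
```

Solution 1 still matches, but because neither pattern is anchored it starts the match at instagram.com and drops the http:// prefix. Solution 2 keeps the scheme in the match.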

Example

$string = 'https://www.instagram.com/p/abcd/?post->
           https://www.instagram.com/p/efgh

           http://www.instagram.com/p/iJkL/ ->
           http://www.instagram.com/p/MnNadfoadf

           instagram.com/p/ACDOFfaf ->
           instagram.com/p/AFMDAOF';

preg_match_all('/(https?:\/\/(www\.)?)?instagram\.com(\/p\/\w+\/?)/', $string, $matches);

A var_dump of $matches then gives:

array(4) {
  [0]=>
  array(6) {
    [0]=>
    string(33) "https://www.instagram.com/p/abcd/"
    [1]=>
    string(32) "https://www.instagram.com/p/efgh"
    [2]=>
    string(32) "http://www.instagram.com/p/iJkL/"
    [3]=>
    string(37) "http://www.instagram.com/p/MnNadfoadf"
    [4]=>
    string(24) "instagram.com/p/ACDOFfaf"
    [5]=>
    string(23) "instagram.com/p/AFMDAOF"
  }
  [1]=>
  array(6) {
    [0]=>
    string(12) "https://www."
    [1]=>
    string(12) "https://www."
    [2]=>
    string(11) "http://www."
    [3]=>
    string(11) "http://www."
    [4]=>
    string(0) ""
    [5]=>
    string(0) ""
  }
  [2]=>
  array(6) {
    [0]=>
    string(4) "www."
    [1]=>
    string(4) "www."
    [2]=>
    string(4) "www."
    [3]=>
    string(4) "www."
    [4]=>
    string(0) ""
    [5]=>
    string(0) ""
  }
  [3]=>
  array(6) {
    [0]=>
    string(8) "/p/abcd/"
    [1]=>
    string(7) "/p/efgh"
    [2]=>
    string(8) "/p/iJkL/"
    [3]=>
    string(13) "/p/MnNadfoadf"
    [4]=>
    string(11) "/p/ACDOFfaf"
    [5]=>
    string(10) "/p/AFMDAOF"
  }
}

To retrieve each ID path (including the /p/ prefix and any trailing slash), loop over $matches[3] with foreach:

foreach($matches[3] as $instagramId){
    echo $instagramId . "<br>";
}

And the result will be:

/p/abcd/
/p/efgh
/p/iJkL/
/p/MnNadfoadf
/p/ACDOFfaf
/p/AFMDAOF
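
The results above keep the trailing slash, while the question asks for the URL without it. One possible variant, sketched here under a few assumptions, moves the slash outside the capture so it is never part of the result. The helper parseInstagramPost and the group names url and id are hypothetical. Anchoring with ^ assumes each input is a single URL, and the character class [\w-]+ is an assumption that picture IDs may contain hyphens, which \w alone does not match.

```php
<?php
// Hypothetical helper (not from the answer): splits an Instagram post URL
// into the URL without a trailing slash and the bare picture ID.
function parseInstagramPost(string $url): ?array
{
    // "~" as the delimiter avoids escaping every "/".
    // [\w-]+ is an assumption: it also accepts hyphens, which \w alone rejects.
    // The capture ends right after the ID, so a trailing slash or query string is left out.
    $pattern = '~^(?<url>(?:https?://(?:www\.)?)?instagram\.com/p/(?<id>[\w-]+))~';

    if (preg_match($pattern, $url, $m) !== 1) {
        return null; // not an Instagram picture URL
    }

    return ['url' => $m['url'], 'id' => $m['id']];
}

foreach ([
    'https://www.instagram.com/p/BKbwlrfjGHY/?post',
    'http://www.instagram.com/p/BKbwlrfjGHY/',
    'instagram.com/p/BKbwlrfjGHY',
] as $input) {
    $post = parseInstagramPost($input);
    echo $post['url'], ' (ID: ', $post['id'], ")\n";
}
```

Returning null for non-matching input covers the "check if a URL is an Instagram picture" half of the question, and the url key covers the "return just the beginning part" half.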