degeaba degeaba - 6 months ago 14
PHP Question

Regex: Scrub a YouTube URL within a string, leaving only the YouTube video code

I have a text that contains a YouTube URL. I need to remove all portions of the link, except for the YouTube video code. The URL may be surrounded by blank space or nothing; no non-blank characters will adjoin the URL.

SAMPLE:

$txt = "This text contain this link: https://www.youtube.com/watch?v=b8ri14rw32c&rel=0 and so on...";


EXTRACTING ID:

$pattern = '#(?<=v=|v\/|vi=|vi\/|youtu\.be\/)[a-zA-Z0-9_-]{11}#';
preg_match_all($pattern, $txt, $matches);
print_r($matches);


EXPECTED:

Array
(
[0] = "This text contain this link b8ri14rw32c and so on..."
)

Answer

You can try this pattern to match:

https:\/\/(?:www\.)?youtu(?:be\.com|\.be)\/(?:watch\?vi?[=\/])?([\w-]{11})(?:&\w+=[^&\s]*)*

There is exactly one capture in this expression, and it's for the YouTube video code. This capture can be used with a regex replace to replace the entire link text with just the captured video code.
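To see what the capture holds, here is a minimal sketch using preg_match. The `#` delimiters and the variable names are choices made for this illustration, and the video code is written as `[\w-]{11}` on the assumption that codes may contain hyphens.

```php
<?php
// Illustrative sketch: inspect the whole match and the single capture group.
$txt = 'This text contain this link: https://www.youtube.com/watch?v=b8ri14rw32c&rel=0 and so on...';

// '#' is used as the delimiter so the slashes in the URL need no escaping.
$pattern = '#https://(?:www\.)?youtu(?:be\.com|\.be)/(?:watch\?vi?[=/])?([\w-]{11})(?:&\w+=[^&\s]*)*#';

if (preg_match($pattern, $txt, $m)) {
    echo $m[0], PHP_EOL; // the entire link, including trailing parameters
    echo $m[1], PHP_EOL; // only the 11-character video code
}
```

Index 0 of the match array is the full link, and index 1 is the capture that a replacement can refer to.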

This regex works with the following YouTube URL formats:

https://www.youtube.com/watch?v=b8ri14rw32c&rel=0
https://youtu.be/Rk_sAHh9s08

Other YouTube URL formats have not been tested, but could easily be supported if needed.
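As one possible way to widen the pattern, the sketch below also accepts `http://`, an `m.` host prefix, `/embed/` paths, and a `?`-style parameter after the code. These extra forms are assumptions chosen for illustration, not formats tested in this answer, and `$extended` and `$samples` are hypothetical names.

```php
<?php
// Sketch only: the extra URL forms handled here are assumptions for illustration.
$extended = '#https?://(?:www\.|m\.)?youtu(?:be\.com|\.be)/'
          . '(?:watch\?vi?[=/]|embed/)?'  // optional "watch?v=" or "embed/" prefix
          . '([\w-]{11})'                  // the video code (still the only capture)
          . '(?:[&?]\w+=[^&\s]*)*#';       // trailing parameters, if any

$samples = [
    'see https://www.youtube.com/embed/b8ri14rw32c here',
    'see http://m.youtube.com/watch?v=b8ri14rw32c&rel=0 here',
    'see https://youtu.be/Rk_sAHh9s08?t=10 here',
];

foreach ($samples as $s) {
    // Each link is replaced by capture group 1, leaving the rest of the text alone.
    echo preg_replace($extended, '$1', $s), PHP_EOL;
}
```

Keeping a single capture group means the replacement string does not change as more URL forms are added.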

The following PHP code tests the replacement using preg_replace:

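$txt = "This text contain this link: https://www.youtube.com/watch?v=b8ri14rw32c&rel=0 and so on...";
// Single quotes keep the backslashes literal; '#' delimits the pattern.
$pattern = '#https://(?:www\.)?youtu(?:be\.com|\.be)/(?:watch\?vi?[=/])?([\w-]{11})(?:&\w+=[^&\s]*)*#';
// Replace each matched link with capture group 1 (the video code).
$text = preg_replace($pattern, '$1', $txt);
echo $text; // This text contain this link: b8ri14rw32c and so on...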
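To check the replacement against both URL formats listed above, the same idea can be wrapped in a small function. This is a sketch: `scrub_youtube_urls` is a hypothetical helper name, and `[\w-]{11}` is used for the video code on the assumption that codes may contain hyphens.

```php
<?php
// Hypothetical helper: replace each YouTube link in $text with its video code.
function scrub_youtube_urls(string $text): string
{
    $pattern = '#https://(?:www\.)?youtu(?:be\.com|\.be)/(?:watch\?vi?[=/])?([\w-]{11})(?:&\w+=[^&\s]*)*#';
    // '$1' refers to the single capture group; fall back to the input if PCRE reports an error.
    return preg_replace($pattern, '$1', $text) ?? $text;
}

echo scrub_youtube_urls('This text contain this link: https://www.youtube.com/watch?v=b8ri14rw32c&rel=0 and so on...'), PHP_EOL;
echo scrub_youtube_urls('Short form: https://youtu.be/Rk_sAHh9s08 works too.'), PHP_EOL;
```

Text without a matching link passes through unchanged, so the function can be applied to arbitrary input.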