damola damola - 7 months ago 35
Perl Question

Matching the TLD and file extension from the URL

I am working on a program and need to extract the TLD and the web page extension from a URL.


should give me TLD
and Extension

While this:
should give me TLD
and Extension

Is there any way I can do this with a regex in Perl? I am using the URI module, but it cannot seem to do this type of extraction.


If you're using the URI module, you can easily extract the host and path. Then it's a simple matter of taking everything after the last dot, or conversely removing everything up to and including the last dot. The extension needs slightly more care: strip the path down to its last segment first, and fall back to an empty string when that segment contains no dot, so that paths with no extension are handled properly.

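# Assumes $uri is a URI object, e.g. my $uri = URI->new($url);
# TLD: remove everything up to and including the last dot of the host.
(my $tld = $uri->host) =~ s/.*\.//;

# Extension: keep only the last path segment, then remove through its last dot.
(my $extension = $uri->path) =~ s/.*\///;
$extension = '' unless $extension =~ s/.*\.//;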
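
To make the "no extension" case concrete, here is a minimal sketch of the same idea wrapped in two small functions that work on plain strings. The names tld_of and extension_of are hypothetical, not part of the URI module, and the host and paths in the example are made up. Unlike the one-liner above, tld_of returns an empty string when the host has no dot at all (the substitution above would leave the whole host in place).

```perl
use strict;
use warnings;

# Hypothetical helper: text after the last dot of a host name,
# or '' when the host contains no dot (e.g. "localhost").
sub tld_of {
    my ($host) = @_;
    return $host =~ /\.([^.]+)\z/ ? $1 : '';
}

# Hypothetical helper: extension of the last path segment,
# or '' when that segment contains no dot.
sub extension_of {
    my ($path) = @_;
    (my $file = $path) =~ s{.*/}{};    # keep only the last path segment
    return $file =~ /\.([^.]+)\z/ ? $1 : '';
}

# Made-up inputs for illustration only.
print tld_of('www.example.com'), "\n";          # prints "com"
print extension_of('/docs/index.html'), "\n";   # prints "html"
print extension_of('/docs/v1.2/readme'), "\n";  # prints an empty line
```

With the URI module you would pass in $uri->host and $uri->path. The third call shows why the last segment is isolated first: the dot in the directory name "v1.2" is not mistaken for the start of an extension. Note that this only takes the last label of the host, so it will not recognise multi-part suffixes such as "co.uk".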