Baptiste - 4 months ago
Perl Question

Use Perl WWW::Mechanize on a local file

I'm currently working on a Perl script and I use the CPAN module WWW::Mechanize to get HTML pages from websites.
However, I would like to be able to work on offline HTML files as well (which I would most likely save myself beforehand) so I don't need an internet connection each time I try a new script.
So basically my question is how can I transform this:

$mech->get( 'http://www.websiteadress.html' );


into this:

$mech->get( 'C:\User\myfile.html' );


I've seen that file:// could be useful but I obviously don't know how to use it as I get errors every time.

Answer

The get() method from WWW::Mechanize takes a URL as its argument. So you just need to work out what the correct URL is for your local file. You're on the right lines with the "file://" scheme.

I think you will need:

$mech->get( 'file:///C:/User/myfile.html' );

Note two important things that people often get wrong.

  1. URLs only understand forward slashes (/), so you need to convert Windows' warped backslash (\) monstrosities.
  2. The scheme is file:// (with two slashes), the host name after it is left empty, and your local path needs to start with another slash (/C:/), so there are three slashes after file:. That looks wrong, so people often omit one of them.
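
The two rules above can be sketched as a small helper. This is only an illustration, not part of WWW::Mechanize: the function name windows_path_to_file_url is made up, it assumes an absolute Windows path with a drive letter, and it does not percent-encode spaces or other special characters.

```perl
use strict;
use warnings;

# Hypothetical helper (not a WWW::Mechanize function): turn an absolute
# Windows path such as C:\User\myfile.html into a file:// URL.
sub windows_path_to_file_url {
    my ($path) = @_;

    # Rule 1: URLs use forward slashes, so replace every backslash.
    $path =~ tr{\\}{/};

    # Rule 2: the path part must begin with a slash (/C:/...).
    $path = "/$path" unless $path =~ m{^/};

    # "file://" plus an empty host plus the path gives three slashes.
    return "file://$path";
}

my $url = windows_path_to_file_url('C:\User\myfile.html');
print "$url\n";    # prints file:///C:/User/myfile.html
```

The resulting string can then be passed to $mech->get($url) in the same way as an http:// URL. For paths containing spaces or other unusual characters, the URI::file module on CPAN is probably a more robust choice than this hand-rolled version.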

Wikipedia (as always) has a lot more information: see its article on the file URI scheme.