adrTuIPKJ44 - 1 month ago
Perl Question

WWW::Mechanize: How to get the value of a redirected URL's query parameter

With WWW::Mechanize, one can create a user agent that simulates a web browser:

$agent = WWW::Mechanize->new();


To fetch a web page with the user agent, I do the following:

$agent->get("http://some_url.com");


If I type this same URL into my browser, it redirects to something like this:

http://some_url.com?param1=value1&param2=value2


How can I retrieve the value of those query parameters?

Answer

The get method of WWW::Mechanize returns an HTTP::Response object. Calling the redirects method on that object gives you the chain of redirect responses that led to it. For example, I ran the code below against google.com:

#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
use Data::Dumper;

my $agent = WWW::Mechanize->new();

my $object = $agent->get('http://www.google.com/');

print Dumper $object->redirects;

Output:

$VAR1 = bless( {
                 '_msg' => 'Found',
                 '_request' => bless( {
                                        '_headers' => bless( {
                                                               'accept-encoding' => 'gzip',
                                                               'user-agent' => 'WWW-Mechanize/1.82'
                                                             }, 'HTTP::Headers' ),
                                        '_uri_canonical' => bless( do{\(my $o = 'http://www.google.com/')}, 'URI::http' ),
                                        '_uri' => $VAR1->{'_request'}{'_uri_canonical'},
                                        '_content' => '',
                                        '_method' => 'GET'
                                      }, 'HTTP::Request' ),
                 '_protocol' => 'HTTP/1.1',
                 '_rc' => '302',
                 '_headers' => bless( {
                                        'title' => '302 Moved',
                                        'content-length' => '261',
                                        'location' => 'http://www.google.co.in/?gfe_rd=cr&ei=3KoEWP78GYPj8weZlLXoDA',
                                        'date' => 'Mon, 17 Oct 2016 10:41:32 GMT',
                                        'accept-ranges' => 'none',
                                        'cache-control' => 'private',
                                        'client-date' => 'Mon, 17 Oct 2016 10:41:32 GMT',
                                        'connection' => 'close',
                                        'client-response-num' => 1,
                                        'content-type' => 'text/html; charset=UTF-8',
                                        '::std_case' => {
                                                          'title' => 'Title',
                                                          'set-cookie2' => 'Set-Cookie2',
                                                          'client-peer' => 'Client-Peer',
                                                          'client-date' => 'Client-Date',
                                                          'set-cookie' => 'Set-Cookie',
                                                          'base' => 'Base',

                                                          'content-base' => 'Content-Base',
                                                          'client-response-num' => 'Client-Response-Num'
                                                        }
                                      }, 'HTTP::Headers' ),
                 '_content' => '<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="http://www.google.co.in/?gfe_rd=cr&amp;ei=3KoEWP78GYPj8weZlLXoDA">here</A>.
</BODY></HTML>   '

               }, 'HTTP::Response' );

As you can see, the URL being redirected to, query string included, appears in the location header of the redirect response.
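To get from that location value to the individual parameters, the URI module's query_form method can split the query string for you. The following is only a minimal sketch: query_params is a hypothetical helper name, and the sample URL is simply the location value copied from the dump above, so it runs without a network connection.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use URI;

# Hypothetical helper (not part of WWW::Mechanize): return the query
# parameters of a URL as a hash reference. If a key is repeated in the
# query string, only its last value is kept in this simple version.
sub query_params {
    my ($url) = @_;
    my %params = URI->new($url)->query_form;
    return \%params;
}

# Sample input: the location header value from the dump above.
my $location = 'http://www.google.co.in/?gfe_rd=cr&ei=3KoEWP78GYPj8weZlLXoDA';
my $params   = query_params($location);

# Print each parameter as "name = value", sorted by name.
print "$_ = $params->{$_}\n" for sort keys %$params;
```

With a live agent, the same helper could be given $agent->uri (the URL Mechanize ended up at after following redirects) or $_->header('Location') for each response returned by $object->redirects.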