Dhruva Shrivastava - 1 year ago
HTML Question

Data extraction after login to the webpage (modem login page)

I am trying to extract data from a webpage to use in a further project, but the webpage requires a login before the next page can be accessed.

I have tried several different scripts, since many related questions and solutions to this problem are available. Here is the source code of the login page.

Source code of the login page:

<html xmlns="http://www.w3.org/1999/xhtml">
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
<script type="text/javascript" src="system/js/jquery.js"></script>
<script type="text/javascript" src="system/js/md5.js"></script>
<style type="text/css" media="screen">@import url(style/css/login.css);</style>
<link rel="icon" type="image/png" href="favicon.ico" />
<div id="i_header">
<div id="i_hdtext">
High Power Modem 400 MHz
<div id="i_body">
<div id="frame_login">
<td>Username</td><td><input type="textarea" id="id_user"/></td>
<td colspan="2"> <br/> </td>
<td>Password</td><td><input type="password"id="id_pass"/></td>
<div style="text-align: center" id="submitbutton"><input type="submit" value=Send /></div>
<div id="i_foot">
$(document).ready(function() {
$("#submitbutton").click(function() {
var val_user = $("#id_user").val();
var val_pass = $("#id_pass").val();
if(val_user == "" || val_pass == "") {
alert("Please fill all required fields");
} else {
var val_pass_md5 = $.md5(val_pass);
var param = "type=loginreq&user="+val_user+"&pass="+val_pass_md5;
type : 'POST',
url : 'index.php',
data : param,
success : function(data) {
var tab = data.split(':');
if ( tab[0] == "OK" ) {
window.location.href = 'index.php?page='+tab[1];
if(tab[2].length > 0) {
} else {
error : function() {

return false;
function loginFailed(p_data) {

This is the code I am using to log in to the above page and print the next page to the console.


//The username or email address of the account.
define('USERNAME', 'admin');

//The password of the account.
define('PASSWORD', 'admin');

//Set a user agent. This basically tells the server that we are using Firefox ;)
define('USER_AGENT', 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:49.0) Gecko/20100101 Firefox/49.0');

//Where our cookie information will be stored (needed for authentication).
define('COOKIE_FILE', 'cookie.txt');

//URL of the login form.
define('LOGIN_FORM_URL', '');

//Login action URL. Sometimes, this is the same URL as the login form.
define('LOGIN_ACTION_URL', '');

//An associative array that represents the required form fields.
//You will need to change the keys / index names to match the names of the form fields.
$postValues = array(
    'Username' => USERNAME,
    'Password' => PASSWORD
);

//Initiate cURL.
$curl = curl_init();

//Set the URL that we want to send our POST request to. In this
//case, it's the action URL of the login form.
curl_setopt($curl, CURLOPT_URL, LOGIN_ACTION_URL);

//Tell cURL that we want to carry out a POST request.
curl_setopt($curl, CURLOPT_POST, true);

//Set our post fields / data (from the array above).
curl_setopt($curl, CURLOPT_POSTFIELDS, http_build_query($postValues));

//We don't want any HTTPS errors.
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);

//Where our cookie details are saved. This is typically required
//for authentication, as the session ID is usually saved in the cookie file.
curl_setopt($curl, CURLOPT_COOKIEJAR, COOKIE_FILE);

//Sets the user agent. Some websites will attempt to block bot user agents.
//Hence the reason I gave it a browser (Firefox) user agent.
curl_setopt($curl, CURLOPT_USERAGENT, USER_AGENT);

//Tells cURL to return the output once the request has been executed.
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);

//Allows us to set the referer header. In this particular case, we are
//fooling the server into thinking that we were referred by the login form.
curl_setopt($curl, CURLOPT_REFERER, LOGIN_FORM_URL);

//Do we want to follow any redirects?
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, false);

//Execute the login request.
curl_exec($curl);

//Check for errors!
if(curl_errno($curl)){
    throw new Exception(curl_error($curl));
}

//We should be logged in by now. Let's attempt to access a password protected page
curl_setopt($curl, CURLOPT_URL, '');

//Use the same cookie file.
curl_setopt($curl, CURLOPT_COOKIEJAR, COOKIE_FILE);

//Use the same user agent, just in case it is used by the server for session validation.
curl_setopt($curl, CURLOPT_USERAGENT, USER_AGENT);

//We don't want any HTTPS / SSL errors.
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);

//Execute the GET request and print out the result.
echo curl_exec($curl);

But when I run the code, I do not get the data from the next page; instead, the console shows the HTML of the login page. Can anyone suggest what I am doing wrong? Thanks.

Answer Source

You're not sending the credentials the same way as the site.

Change the following (to match their code):

$postValues = array(
    'user' => USERNAME,
    'pass' => md5(PASSWORD), // the login page sends the MD5 hash, not the plain password
    'type' => 'loginreq' // You also forgot this one
);
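
As a rough check, the request body can be compared with the string the login page's script builds ("type=loginreq&user="+val_user+"&pass="+val_pass_md5). The sketch below is only illustrative: buildLoginFields is a hypothetical helper name, and it assumes the page's $.md5 returns the usual lowercase hex digest, which is what PHP's md5() produces.

```php
<?php
// Hypothetical helper: build the login fields the way the page's script does,
// i.e. "type=loginreq&user=<name>&pass=<md5 of password>".
function buildLoginFields(string $user, string $password): array
{
    return array(
        'type' => 'loginreq',
        'user' => $user,
        'pass' => md5($password), // the page hashes the password before sending
    );
}

// Encode the fields as a URL-encoded body, the same form passed to CURLOPT_POSTFIELDS.
echo http_build_query(buildLoginFields('admin', 'admin')), "\n";
// prints: type=loginreq&user=admin&pass=21232f297a57a5a743894a0e4a801fc3
```

Note that http_build_query URL-encodes the values, whereas the page's script concatenates them as-is; for a plain username such as admin the two give the same result.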

You might also want to set CURLOPT_FOLLOWLOCATION to true, since the server will most likely redirect after a successful login:

curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
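
One caveat, judging from the login page's script in the question: on success it reads a colon-separated reply and navigates in JavaScript (window.location.href = 'index.php?page='+tab[1]). CURLOPT_FOLLOWLOCATION only follows HTTP Location headers, so if the modem really answers with plain text like "OK:<page>", the next URL may have to be built by hand. The following is a minimal sketch under that assumption; nextPageFromLoginReply and fetchPageAfterLogin are hypothetical names, and $baseUrl stands in for the modem address, which the question does not give.

```php
<?php
// Turn the login reply into the URL the page's script would navigate to.
// Returns null when the reply does not start with "OK" or names no page.
function nextPageFromLoginReply(string $reply): ?string
{
    $tab = explode(':', trim($reply));
    if ($tab[0] !== 'OK' || !isset($tab[1]) || $tab[1] === '') {
        return null;
    }
    return 'index.php?page=' . $tab[1];
}

// Log in, then fetch the page named in the reply, reusing one cURL handle.
// $baseUrl is a placeholder for the device address and must end with "/".
// $loginFields is the array of POST fields (type, user, pass).
function fetchPageAfterLogin(string $baseUrl, array $loginFields, string $cookieFile): string
{
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($curl, CURLOPT_COOKIEJAR, $cookieFile);  // write cookies here on close
    curl_setopt($curl, CURLOPT_COOKIEFILE, $cookieFile); // enable the cookie engine

    // Step 1: POST the credentials to index.php, as the login page does.
    curl_setopt($curl, CURLOPT_URL, $baseUrl . 'index.php');
    curl_setopt($curl, CURLOPT_POST, true);
    curl_setopt($curl, CURLOPT_POSTFIELDS, http_build_query($loginFields));
    $reply = curl_exec($curl);
    if ($reply === false) {
        throw new Exception(curl_error($curl));
    }

    $next = nextPageFromLoginReply($reply);
    if ($next === null) {
        throw new Exception('Login rejected: ' . $reply);
    }

    // Step 2: switch the handle back to GET and request the protected page.
    curl_setopt($curl, CURLOPT_HTTPGET, true);
    curl_setopt($curl, CURLOPT_URL, $baseUrl . $next);
    $page = curl_exec($curl);
    if ($page === false) {
        throw new Exception(curl_error($curl));
    }
    curl_close($curl);
    return $page;
}
```

Printing the raw reply from the login request first is a quick way to confirm whether the device answers with an "OK:..." string or with an HTTP redirect.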