SniperCoder - 5 months ago
PHP Question

Log in to Amazon using cURL

I'm trying to log in to Amazon using cURL. However, when I send the POST data I get nothing back. I want to use only cURL, not an API. This is the code I tried:

<?php
$curl_crack = curl_init();
CURL_SETOPT($curl_crack,CURLOPT_URL,"https://www.amazon.com/ap/signin?_encoding=UTF8&openid.assoc_handle=usflex&openid.claimed_id=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&openid.identity=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&openid.mode=checkid_setup&openid.ns=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0&openid.ns.pape=http%3A%2F%2Fspecs.openid.net%2Fextensions%2Fpape%2F1.0&openid.pape.max_auth_age=0&openid.return_to=https%3A%2F%2Fwww.amazon.com%2F%3Fref_%3Dnav_custrec_signin");
CURL_SETOPT($curl_crack,CURLOPT_USERAGENT,$_SERVER['HTTP_USER_AGENT']);
//CURL_SETOPT($curl_crack,CURLOPT_PROXY,trim($socks[$sockscount]));
//CURL_SETOPT($curl_crack,CURLOPT_PROXYTYPE,CURLPROXY_SOCKS5);
CURL_SETOPT($curl_crack,CURLOPT_POST,True);
CURL_SETOPT($curl_crack,CURLOPT_POSTFIELDS,"appAction=SIGNIN&email=test@hotmail.com&create=0&password=test123");
CURL_SETOPT($curl_crack,CURLOPT_RETURNTRANSFER,True);
CURL_SETOPT($curl_crack,CURLOPT_COOKIEFILE,"cookie.txt");
curl_setopt($curl_crack, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($curl_crack, CURLOPT_FOLLOWLOCATION, 1);
CURL_SETOPT($curl_crack,CURLOPT_TIMEOUT,30);
echo $check = curl_exec($curl_crack);

?>

Answer

Here you go. Tested & working.

EDIT: This code stopped working sometime before June 2016. Amazon has added client-side JavaScript browser fingerprinting that breaks automated logins like the one below. It's actually not that hard to bypass, but I haven't spent time engineering PHP code to do so, since any such code would be easily broken by minor changes.

Instead, I've posted an example below the old PHP code that uses CasperJS to log in. PhantomJS or Selenium could also be used.

To supply a little background, an extra form field called metaData1 is populated by JavaScript and contains a base64-encoded string of obfuscated browser information. Some of it might be compared with data collected server-side.

Here's an example string (before encoding):

9E0AC647#{"version":"2.3.6-AUI","start":1466184997409,"elapsed":5,"userAgent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.84 Safari/537.36","plugins":"Chrome PDF Viewer Shockwave Flash 2100Widevine Content Decryption Module 148885Native Client ||1600-1200-1150-24---","dupedPlugins":"Chrome PDF Viewer Shockwave Flash 2100Widevine Content Decryption Module 148885Native Client Chrome PDF Viewer ||1600-1200-1150-24---","flashVersion":"21.0.0","timeZone":-8,"lsUbid":"X69-8317848-6241674:1466184997","mercury":{"version":"2.1.0","start":1467231996334,"ubid":"X69-8317848-6241674:1466184997","trueIp":"1020304","echoLatency":831},"timeToSubmit":57868,"interaction":{"keys":47,"copies":0,"cuts":0,"pastes":0,"clicks":6}}

As you can see, the string contains some creepy information: which browser plugins are loaded, your key press and mouse click counts on the page, your time zone, your screen and viewport resolution, and how long you were on the login page. The trueIp field is your computer's IP address as a 32-bit integer. There's quite a bit more info that it can collect, but this is a sample from my browser.
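
For readers unfamiliar with the 32-bit form of an IP address mentioned above, here is a minimal sketch of the usual conversion for IPv4, runnable under Node.js. The function names `ipToLong` and `longToIp` are made up for this illustration and do not come from Amazon's script. How Amazon actually derives trueIp is not shown in the sample, so this only demonstrates the general encoding.

```javascript
// Convert a dotted-quad IPv4 address to its unsigned 32-bit integer form.
// (Hypothetical helper for illustration; not taken from Amazon's code.)
function ipToLong(ip) {
    const parts = ip.split('.');
    const valid = parts.length === 4 &&
        parts.every(p => /^\d{1,3}$/.test(p) && Number(p) <= 255);
    if (!valid) {
        throw new Error('Not a valid IPv4 address: ' + ip);
    }
    const [a, b, c, d] = parts.map(Number);
    // Bitwise operators in JS work on signed 32-bit values;
    // ">>> 0" reinterprets the result as unsigned.
    return ((a << 24) | (b << 16) | (c << 8) | d) >>> 0;
}

// Convert an unsigned 32-bit integer back to dotted-quad notation.
function longToIp(n) {
    return [n >>> 24, (n >>> 16) & 255, (n >>> 8) & 255, n & 255].join('.');
}

console.log(ipToLong('192.168.1.1')); // 3232235777
console.log(longToIp(3232235777));    // 192.168.1.1
```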

The value 9E0AC647 is a CRC32 checksum of the string after the #. It won't match here because I changed trueIp and other data. This data then goes through some transformation (encoding) using some values from the JavaScript, is base64-encoded, and is then added to the login form.
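
To make the checksum step concrete, here is a rough sketch for Node.js of how a `CHECKSUM#payload` string could be checked. It assumes the standard CRC-32 (the variant used by zip and by PHP's `crc32()`), computed over the UTF-8 bytes of the payload and written as 8 uppercase hex digits. Those details are guesses based on the sample above, not confirmed from Amazon's script, and `crc32Hex` and `verifyFingerprint` are names invented for this example. The later transformation and base64 steps are not covered.

```javascript
// Lookup table for the standard CRC-32 (reflected polynomial 0xEDB88320).
const CRC_TABLE = new Uint32Array(256);
for (let n = 0; n < 256; n++) {
    let c = n;
    for (let k = 0; k < 8; k++) {
        c = (c & 1) ? (0xEDB88320 ^ (c >>> 1)) : (c >>> 1);
    }
    CRC_TABLE[n] = c >>> 0;
}

// CRC-32 of a string's UTF-8 bytes, as 8 uppercase hex digits.
// (Assumption: the byte encoding and hex casing match the real script.)
function crc32Hex(str) {
    const bytes = new TextEncoder().encode(str);
    let crc = 0xFFFFFFFF;
    for (const b of bytes) {
        crc = CRC_TABLE[(crc ^ b) & 0xFF] ^ (crc >>> 8);
    }
    return ((crc ^ 0xFFFFFFFF) >>> 0).toString(16).toUpperCase().padStart(8, '0');
}

// Split "CHECKSUM#payload" at the first '#' and compare the checksums.
function verifyFingerprint(raw) {
    const idx = raw.indexOf('#');
    if (idx === -1) {
        return false;
    }
    const claimed = raw.slice(0, idx).toUpperCase();
    const payload = raw.slice(idx + 1);
    return crc32Hex(payload) === claimed;
}

// A shortened, made-up payload in the same shape as the sample above.
const payload = '{"version":"2.3.6-AUI","timeZone":-8}';
const raw = crc32Hex(payload) + '#' + payload;

console.log(verifyFingerprint(raw));                     // true
console.log(verifyFingerprint(raw.replace('-8', '-7'))); // false
```

Editing any field after the checksum was computed, as was done to the sample string above, makes the comparison fail.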

Here's a permanent paste of the JS code responsible for all of this.


The steps:

  • Fetch the home page to establish cookies
  • Parse HTML to extract login URL
  • Fetch login page
  • Parse HTML and find signin form
  • Extract form inputs for login (there are quite a few required hidden fields)
  • Build post array for login
  • Submit login form
  • Check for success or failure

PHP Code (no longer working - see example below):

<?php

// amazon username & password
$username = 'you@example.com';
$password = 'yourpassword';

// http headers for requests
$headers = array(
    'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language: en-US,en;q=0.5',
    'Connection: keep-alive',
    'DNT: 1', // :)
);

// initialize curl
$ch = curl_init('https://www.amazon.com/');
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:42.0) Gecko/20100101 Firefox/42.0');
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_COOKIEFILE, '');
curl_setopt($ch, CURLOPT_ENCODING, 'gzip, deflate');
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers); // send the headers defined above

// fetch homepage to establish cookies
$result = curl_exec($ch);

// parse HTML looking for login URL
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($result);

// find link to login page
$xpath    = new DOMXPath($dom);
$elements = $xpath->query('//*[@id="nav-link-yourAccount"]');

if ($elements->length == 0) {
    die('Did not find "sign-in" link!');
}

// get login url
$url = $elements->item(0)->attributes->getNamedItem('href')->nodeValue;

if (strpos($url, 'http') !== 0) {
    $url = 'https://www.amazon.com' . $url;
}

// fetch login page
curl_setopt($ch, CURLOPT_URL, $url);
$result = curl_exec($ch);

// parse html to get form inputs
$dom->loadHTML($result);
$xpath = new DOMXPath($dom);

// find sign in form inputs
$inputs = $xpath->query('//form[@name="signIn"]//input');

if ($inputs->length == 0) {
    die('Failed to find login form fields!');
}

// get login post url
$url = $xpath->query('//form[@name="signIn"]');
$url = $url->item(0)->attributes->getNamedItem('action')->nodeValue; // form action (login URL)

// array of form fields to submit
$fields = array();

// build list of form inputs and values
for ($i = 0; $i < $inputs->length; ++$i) {
    $attribs = $inputs->item($i)->attributes;

    if ($attribs->getNamedItem('name') !== null) {
        $val = (null !== $attribs->getNamedItem('value')) ? $attribs->getNamedItem('value')->nodeValue : '';
        $fields[$attribs->getNamedItem('name')->nodeValue] = $val;
    }
}

// populate login form fields
$fields['email']    = $username;
$fields['password'] = $password;

// prepare for login
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($fields));

// execute login post
$result = curl_exec($ch);
$info   = curl_getinfo($ch);

// if login failed, url should be the same as the login url
if ($info['url'] == $url) {
    echo "There was a problem logging in.<br>\n";
    var_dump($result);
} else {
    // if successful, we are redirected to homepage so URL is different than login url
    echo "Should be logged in!<br>\n";
    var_dump($result);
}

Working CasperJS code:

var casper = require('casper').create();

casper.userAgent('Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:46.0) Gecko/20100101 Firefox/46.0');
phantom.cookiesEnabled = true;

var AMAZON_USER = 'you@yoursite.com';
var AMAZON_PASS = 'some crazy password';

casper.start('https://www.amazon.com/').thenClick('a#nav-link-yourAccount', function() {
    this.echo('Title: ' + this.getTitle());

    var emailInput = 'input#ap_email';
    var passInput  = 'input#ap_password';

    this.mouseEvent('click', emailInput, '15%', '48%');
    this.sendKeys(emailInput, AMAZON_USER);

    this.wait(3000, function() {
        this.mouseEvent('click', passInput, '12%', '67%');
        this.sendKeys(passInput, AMAZON_PASS);

        this.mouseEvent('click', 'input#signInSubmit', '50%', '50%');
    });
});

casper.then(function(e) {
    this.wait(5000, function() {
        this.echo('Capping');
        this.capture('amazon.png');
    });
});


casper.run(function() {
    console.log('Done');

    // when run() is given a callback, the script must exit explicitly
    this.exit();
});

You should really extend this code to act more like a human!
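
As one possible starting point, the fixed pauses could be replaced with randomized ones. The sketch below is only an illustration: `humanDelay` and `keystrokePlan` are hypothetical helpers, the timing ranges are arbitrary, and nothing here is known to be sufficient against Amazon's fingerprinting. It is written in plain ES5 so it should run in both Node.js and PhantomJS.

```javascript
// Return a random whole number of milliseconds in [minMs, maxMs].
// (Hypothetical helper; the ranges used below are arbitrary guesses.)
function humanDelay(minMs, maxMs) {
    return minMs + Math.floor(Math.random() * (maxMs - minMs + 1));
}

// Break text into single keystrokes, each paired with its own random pause,
// so typing does not happen in one instantaneous burst.
function keystrokePlan(text, minMs, maxMs) {
    return text.split('').map(function (ch) {
        return { key: ch, pauseMs: humanDelay(minMs, maxMs) };
    });
}

console.log(humanDelay(2000, 4000));
console.log(keystrokePlan('abc', 80, 250));
```

In the CasperJS script above, a call such as `this.wait(3000, ...)` could then become `this.wait(humanDelay(2000, 4000), ...)`, and each entry of a keystroke plan could be sent with `sendKeys` followed by a wait of `pauseMs`.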