Conti - 10 months ago
Python Question

Log in to a website with Python

I'm trying to log in to Wikipedia using a Python script, but despite following the instructions here, I can't get it to work.

import urllib
import urllib2
import cookielib

username = 'myname'
password = 'mypassword'

cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
opener.addheaders = [("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.6 (KHTML, like Gecko) Chrome/20.0.1092.0 Safari/536.6")]
login_data = urllib.urlencode({'wpName': username, 'wpPassword': password})
opener.open('', login_data)
resp = opener.open('')

All I get is the "You're not logged in" page. I tried the same script against another site and got the same result. I suspect it either has something to do with cookies, or I'm missing something incredibly simple, but I cannot find it.


If you inspect the raw request your browser sends to the login URL (with a tool such as Charles Proxy), you will see that it actually sends four parameters: wpName, wpPassword, wpLoginAttempt and wpLoginToken. The first three are static, so you can fill them in at any time. The fourth is not fixed: it has to be parsed from the HTML of the login page. To log in, fetch the login page, extract the token, and post it together with the other three parameters to the login URL, all within the same session.

Here is the working code using Requests and BeautifulSoup:

import requests
from bs4 import BeautifulSoup as bs

def get_login_token(raw_resp):
    soup = bs(raw_resp.text, 'lxml')
    token = [n.get('value', '') for n in soup.find_all('input')
             if n.get('name', '') == 'wpLoginToken']
    return token[0]

payload = {
    'wpName': 'my_username',
    'wpPassword': 'my_password',
    'wpLoginAttempt': 'Log in',
    #'wpLoginToken': '',  # filled in below, once parsed from the login page
}

with requests.session() as s:
    resp = s.get('')
    payload['wpLoginToken'] = get_login_token(resp)

    response_post = s.post('', data=payload)
    response = s.get('')
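
The token-parsing step can also be checked offline, without bs4 or lxml. Below is a minimal sketch for Python 3 using only the standard library's html.parser. The class name HiddenInputParser, the helper extract_login_token, and the sample form (including its made-up token value abc123) are illustrative assumptions, not part of the original answer or of Wikipedia's actual markup.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect the value of every named <input> element, keyed by its name."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        if tag != 'input':
            return
        attr_map = dict(attrs)
        name = attr_map.get('name')
        if name:
            # Inputs without a value attribute are stored as an empty string
            self.fields[name] = attr_map.get('value') or ''


def extract_login_token(html, field='wpLoginToken'):
    """Return the value of the named input, or None if it is absent."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.fields.get(field)


# Hypothetical login-form fragment; the token value is made up.
SAMPLE_FORM = '''
<form method="post">
  <input type="text" name="wpName">
  <input type="password" name="wpPassword">
  <input type="hidden" name="wpLoginToken" value="abc123">
  <input type="submit" name="wpLoginAttempt" value="Log in">
</form>
'''

token = extract_login_token(SAMPLE_FORM)
```

In the script above, extract_login_token(resp.text) could stand in for get_login_token(resp). Returning None when the field is missing, rather than raising IndexError on token[0], makes a changed or absent login form easier to detect before posting.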