kirsty kirsty - 10 months ago 51
Python Question

Obtaining the label of the control of a form in Mechanize

I am scraping sites to analyse every form they contain, looking for a general pattern that will let me automate submitting search queries to them. So far, the names of many of the form controls are either missing or unclear, so I need to scrape the associated label to work out what each control (field) means.

The Mechanize support site states that it is possible to extract a control from a form in a web page when searching for a specific label:

control = form.find_control(label="select a cheese")

I am looking for the reverse: a way to obtain the label from a control. The Mechanize documentation is sparse. An answer to a related question links to more detailed documentation, but I have been unable to find what I need there.

Has anyone managed to do this, or found a workaround?

Answer

I once had to do something similar when automating data submission to a form. I obtained a list of the names and labels of the controls as:

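names = []
labels = []
for c in br.form.controls:
    names.append(c.name)
    # Loop body reconstructed (assumption): get_labels() returns Label objects, possibly none
    labels.append([label.text for label in c.get_labels()])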

With these lists you can then select a control by name:

control = br.form.find_control(name=names[0])
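
If the Mechanize version at hand does not expose labels on its controls, one possible workaround is to read the page HTML directly and pair each label element with the control whose id matches its "for" attribute. The following is only a minimal sketch using the standard library, not part of Mechanize: LabelCollector and labels_by_control_name are hypothetical names, the sample HTML is made up, and it assumes labels are linked through for/id attributes rather than wrapping the control.

```python
from html.parser import HTMLParser


class LabelCollector(HTMLParser):
    """Collect <label for="..."> text and map control ids to control names."""

    CONTROL_TAGS = {"input", "select", "textarea"}

    def __init__(self):
        super().__init__()
        self.label_by_id = {}      # control id -> label text
        self.name_by_id = {}       # control id -> control name
        self._current_for = None   # id targeted by the <label> being read
        self._buffer = []          # text fragments of the current label

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "label" and attrs.get("for"):
            self._current_for = attrs["for"]
            self._buffer = []
        elif tag in self.CONTROL_TAGS and attrs.get("id"):
            self.name_by_id[attrs["id"]] = attrs.get("name")

    def handle_data(self, data):
        # Only keep text that appears inside a <label for="..."> element.
        if self._current_for is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "label" and self._current_for is not None:
            # Collapse runs of whitespace in the label text.
            text = " ".join("".join(self._buffer).split())
            self.label_by_id[self._current_for] = text
            self._current_for = None


def labels_by_control_name(html):
    """Return {control name: label text or None} for named controls with an id."""
    parser = LabelCollector()
    parser.feed(html)
    parser.close()
    return {
        name: parser.label_by_id.get(control_id)
        for control_id, name in parser.name_by_id.items()
        if name
    }


# Made-up sample page, for illustration only.
sample = """
<form>
  <label for="c1">Select a cheese</label>
  <select id="c1" name="cheese"><option>brie</option></select>
  <input id="q" name="query" type="text">
</form>
"""
print(labels_by_control_name(sample))
```

The resulting dictionary is keyed by control name, so a name taken from the names list can be used to look up its label. Controls without an id, or labels that wrap their control instead of using "for", are not handled by this sketch.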