khalid khalid - 1 month ago 18
Python Question

Crawling in Scrapy: not getting the expected result

What I have tried so far:
Step 1: open the page in the Scrapy shell

scrapy shell http://www.bseindia.com/corporates/Forth_Results.aspx?expandable=0


I found this XPath using Firebug in Mozilla Firefox:

sel.xpath('/html/body/form/div[3]/div/div[3]/div[2]/div/div[3]/div[1]/div/div/div/table/tbody/tr[1]/td/table/tbody/tr[2]/td/table/tbody/tr/td/div/table/tbody/tr[3]/td[3]/text()').extract()[0].strip()


does not work

sel.xpath('/html/body/form/div[3]/div/div[3]/div[2]/div/div[3]/div[1]/div/div/div/table/tbody/tr[1]/td/table/tbody/tr[2]/td/table/tbody/tr/td/div/table/tbody/tr[3]/td[3]/text()').extract()[0]


does not work

sel.xpath('/html/body/form/div[3]/div/div[3]/div[2]/div/div[3]/div[1]/div/div/div/table/tbody/tr[1]/td/table/tbody/tr[2]/td/table/tbody/tr/td/div/table/tbody/tr[3]/td[3]/text()').extract()


does not work

I then found this XPath using Chrome:

sel.xpath('//div[@id="wrap"]/div/div[3]/div[2]/div/div[3]/div[1]/div/div/div/table/tbody/tr[1]/td/table/tbody/tr[2]/td/table/tbody/tr/td/div/table/tbody/tr[3]/td[2]/text()').extract()


It works fine in the Chrome console, but when I run it in the Scrapy shell the output is an empty list, []. The result is the same for the Firefox XPath.

Please help.

Answer

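Chrome and Firefox tend to add elements to the DOM tree that are not present in the HTML the server sends. Here the tbody tag is added by the browser, so an XPath copied from the browser that contains tbody matches nothing in Scrapy, which works on the raw HTML. Also, assuming you are looking for the rows with class name TTRow in the HTML, you can use a shorter selector path such as: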

In [32]: response.xpath('//*[@id="wrap"]//table//tr[@class="TTRow"][3]/td[2]/text()').extract()
Out[32]: [u'DWITIYA']
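
To see the tbody effect in isolation, here is a minimal sketch that runs without Scrapy. It uses the standard library's xml.etree.ElementTree on a small made-up table (the markup, the TTHeader class, and the names ALPHA, BETA, GAMMA are assumptions for illustration, not the real page). The raw markup has no tbody, so a path that includes it finds nothing, while a path that skips it finds the rows.

```python
import xml.etree.ElementTree as ET

# Made-up markup standing in for the raw HTML the server sends.
# Note that the table has no <tbody>; a browser would add one when rendering.
RAW_HTML = """
<html><body><div id="wrap">
  <table>
    <tr class="TTHeader"><td>Code</td><td>Name</td></tr>
    <tr class="TTRow"><td>1</td><td>ALPHA</td></tr>
    <tr class="TTRow"><td>2</td><td>BETA</td></tr>
    <tr class="TTRow"><td>3</td><td>GAMMA</td></tr>
  </table>
</div></body></html>
""".strip()

root = ET.fromstring(RAW_HTML)

# Path copied "browser style", with tbody: matches nothing in the raw markup.
with_tbody = root.findall(".//div[@id='wrap']/table/tbody/tr/td")

# Path without tbody, anchored on the row class instead of a long absolute path.
rows = root.findall(".//div[@id='wrap']//tr[@class='TTRow']")

# Third TTRow row, second cell.
third_name = rows[2].findall("td")[1].text

print(with_tbody)   # []
print(third_name)   # GAMMA
```

The same idea carries over to the Scrapy shell: drop tbody from any XPath copied from the browser, and prefer a short path anchored on an id or class over a long absolute one.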