Catalin Besleaga - 9 months ago
HTML Question

Python2 Scrape html with xpath

Consider an HTML page with 3 tables in it.

I want to loop through each table and, at the same time, print something if the content corresponds to what I'm looking for.

I need to keep track of the table I'm at.

As you can see in the code below, I have the page variable, which is an HTML string.

I can return the content of all the tables at once (in an array).

I'd like to loop through them.

from __future__ import print_function
from lxml import html
import requests
from bs4 import BeautifulSoup

page = """
<!DOCTYPE html>
<html lang="en">
<meta charset="UTF-8">

<table>
<tr>
<td>table1 td1</td>
<td>table1 td2</td>
</tr>
</table>

<table>
<tr>
<td>table2 td1</td>
<td>table2 td2</td>
</tr>
</table>

<table>
<tr>
<td>table3 td1</td>
<td>table3 td2</td>
</tr>
</table>
</html>
"""

soup = str(BeautifulSoup(page, 'html.parser'))

tree = html.fromstring(soup)

tds = tree.xpath('//table/tr/td/text()')

for td in tds:
    print(td + '\n')

print('Ready !!')


You mean you need to process each table on its own?
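
If so, here is a minimal, self-contained sketch of per-table iteration that also keeps a running table number. To stay runnable without extra packages it uses the standard library's xml.etree.ElementTree on a well-formed fragment rather than lxml; the sample markup and the cells_by_table helper are assumptions made for illustration, not part of the question.

```python
from __future__ import print_function

import xml.etree.ElementTree as ET

# Hypothetical sample markup (an assumption): three tables, two cells each.
SAMPLE = """
<html>
  <body>
    <table><tr><td>table1 td1</td><td>table1 td2</td></tr></table>
    <table><tr><td>table2 td1</td><td>table2 td2</td></tr></table>
    <table><tr><td>table3 td1</td><td>table3 td2</td></tr></table>
  </body>
</html>
"""


def cells_by_table(markup):
    """Return a list of (table_number, [cell texts]) pairs, numbered from 1."""
    root = ET.fromstring(markup.strip())
    result = []
    # enumerate() supplies the running table number while looping.
    for number, table in enumerate(root.findall(".//table"), start=1):
        # The leading "." keeps the search inside the current table only.
        texts = [td.text for td in table.findall(".//td")]
        result.append((number, texts))
    return result


for number, texts in cells_by_table(SAMPLE):
    print("--- table %d ---" % number)
    for text in texts:
        print(text)
```

With an lxml tree, the same relative expressions should work through tree.xpath(".//table") and table.xpath(".//td"), and each td element likewise exposes its text as td.text.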

for table in tree.xpath(".//table"):
    print("---  new table: ---")
    for td in table.xpath(".//td"):