whatislife - 1 month ago
HTML Question

Find an HTML tag that contains certain text

I am trying to find a particular string in a website's HTML source.

For example, if I have the following HTML tag:

<div class="rev" data="123456789adfdfdfdfadf"></div>

I want to find the line that contains
div class = "rev"
and print out the data inside it.

But before I do that, I am just trying to make sure it's finding the right tag, and I keep getting
an empty result as output.

This is my code

import urllib2
from BeautifulSoup import BeautifulSoup
import re
request = urllib2.Request("http://www.adidas.co.uk/nmd_r1-shoes/BB1970.html")
request.add_header("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; es-ES; rv: Gecko/20091102 Firefox/3.5.5")

f = urllib2.urlopen(request)
soup = BeautifulSoup(f)

d = soup.findAll('div', text = re.compile('123456789adfdfdfdfadf'), attrs = {'class' : 'data'})
print d

Jan

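You are mixing up the data attribute with the tag's text. The string you are looking for is the value of the data attribute, not text inside the div, so a text search cannot match it. Also, rev is the class here, while your attrs filter asks for a class named data.
With the div given, you should find it with the following (find_all and the dict-style item.attrs are BeautifulSoup 4, i.e. from bs4 import BeautifulSoup):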

# Match on the class attribute, then read the data attribute of each hit.
print([item["data"]
       for item in soup.find_all('div', {'class': 'rev'})
       if "data" in item.attrs])

Or, a bit more precisely, require the data attribute in the filter itself:

print([item["data"]
       for item in soup.find_all('div', attrs={'class': 'rev', 'data': True})])
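
If you want to check the attribute-versus-text distinction without fetching the page or installing anything, here is a minimal sketch that uses only Python 3's standard-library html.parser instead of BeautifulSoup. The class name RevDataFinder and the variable names are made up for illustration, and the markup is just the sample div from the question, not the real page.

```python
import re
from html.parser import HTMLParser

# Sample markup from the question (the real page is not fetched here).
HTML = '<div class="rev" data="123456789adfdfdfdfadf"></div>'


class RevDataFinder(HTMLParser):
    """Hypothetical helper: collects data attributes of <div class="rev"> plus text nodes."""

    def __init__(self):
        super().__init__()
        self.data_values = []  # values of the data attribute
        self.text_nodes = []   # non-blank text found between tags

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # attrs arrives as a list of (name, value) pairs
        classes = (attrs.get("class") or "").split()
        if tag == "div" and "rev" in classes and "data" in attrs:
            self.data_values.append(attrs["data"])

    def handle_data(self, data):
        if data.strip():
            self.text_nodes.append(data)


finder = RevDataFinder()
finder.feed(HTML)
finder.close()

# Keep only the attribute values that contain the string being searched for.
pattern = re.compile("123456789")
matches = [value for value in finder.data_values if value and pattern.search(value)]

print(matches)            # attribute values that matched the pattern
print(finder.text_nodes)  # text nodes; the sample div has none
```

The string only ever shows up among the tag's attributes and never as a text node, which is why a text search comes back empty. With BeautifulSoup 4 the same filter fits in one call, because attribute filters accept compiled regular expressions: soup.find_all('div', attrs={'class': 'rev', 'data': re.compile('123456789')}).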