knowill knowill - 3 months ago 14
Python Question

How can I manage an HTML tree?

<td style="width: 128px; border-top: none;">
<div class="pull-left" style="padding-right: 0.5em;">
<a href="/system/30001162/" rel="tooltip" title="V-3YG7">
<img class="eveimage img-rounded" src="https:// Type/3802_64.png" width="64" height="64"
alt="V-3YG7" />
<td style="border-top: none;">
<div itemscope="itemscope" class="pull-left">
<table class="table table-condensed">
<td itemprop="systemname"><a href="/system/30001162/">V-3YG7</a></td>
<td itemprop="region"><a href="/region/10000014/">Catch</a></td>
<td itemprop="systemname"><a href="/system/30001162/">V-3YG7</a></td>
<td itemprop="region"><a href="/region/10000014/">Catch</a></td>
<td class="green" style="text-align: right" class="green-text">116,674</
<td class="green hidden-xs" style="text-align: right">27</td>
<td class="red" style="text-align: right">0</td>
<td class="red hidden-xs" style="text-align: right">6,751</td>
<td class="green hidden-xs" style="text-align: right">100.0</td>
<td class="green" style="text-align: right">396,580</td>
<td class="green hidden-xs" style="text-align: right">107</td>
<td class="red" style="text-align: right">0</td>
<small><a href="/kill/57241825/#comments"
data-disqus-identifier="57241825" rel="tooltip" title="Comments"></a></
<br /><a href="/kill/57241825/">10.00k</a></td>
<td class="icon hidden-xs" style="text-align: center; vertical-align:
<a href="/kill/57241825/" rel="tooltip" title="Detail for 57241825"
<img src=""
height="40" width="40" class="eveimage img-rounded" alt="Capsule" />
<a href="/system/30001162/">V-3YG7</a> <span style="color:
#F30202">-0.1</span><br />
<a href="/region/10000014/">Catch</a></td>
<td class="hidden-xs" style="text-align: center; vertical-align:
middle; width: 64px;">
<a href="/alliance/99005866/" rel="tooltip" title="Just let it happen">
<img src=""
height="40" width="40" class="eveimage img-rounded" alt="Just let it
happen" /></a></td>
<td class="victim" style="text-align: left; vertical-align: top;">
<a href="/character/91628726/">stroon themighty</a> (Capsule)<br />
<a href="/corporation/98423345/">Plaus Collective</a>
/ <a href="/alliance/99005866/">Just let it happen</a>
<td class="hidden-xs" style="text-align: center; vertical-align:
middle; width: 64px;">
<a href="/alliance/240835459/" rel="tooltip" title="The Volition Cult">
<img src=""
height="40" width="40" class="eveimage img-rounded" alt="The Volition
Cult" /></a></td>

python code:

for i in doc('td'):

I use Beautiful Soup. I get a tree and want to cut it down.
I tried this, but it didn't work:

if i.div.a.img['class'] == 'vvvv':

I need to check whether a tag exists and then either remove it or get its data. Why doesn't this work? Is there another way?

I need to get the text of the tag with class 'victim'.


Note that not every node exists in each td. If a td has no div, i.div is None, and continuing the chain with i.div.a.img raises AttributeError.
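To make the None problem concrete, here is a minimal sketch. It assumes Beautiful Soup 4 (the bs4 package) with the built-in "html.parser" backend. The two-cell sample and the helper name nested_img are made up for illustration. The helper walks the same div, a, img chain but stops as soon as one step is missing.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: only the first td contains div > a > img.
html = """
<table><tr>
  <td><div><a href="/system/30001162/"><img class="eveimage img-rounded" alt="V-3YG7" /></a></div></td>
  <td class="victim"><a href="/character/91628726/">stroon themighty</a></td>
</tr></table>
"""

doc = BeautifulSoup(html, "html.parser")


def nested_img(td):
    """Return the img reached via td.div.a.img, or None if any step is missing."""
    node = td
    for name in ("div", "a", "img"):
        node = node.find(name)  # same lookup that td.div / .a / .img performs
        if node is None:
            return None
    return node


# One entry per td: a Tag where the chain exists, None where it does not.
found = [nested_img(td) for td in doc("td")]
```

The second td has no div, so its entry is None instead of an AttributeError.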

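Note also that the class attribute can contain more than one class, so test membership with "in" instead of comparing with "==". Beautiful Soup 4 already returns class as a list of names, so no split() is needed there. The following snippet also skips any td without an img:

for i in doc('td'):
    img = i.find('img')  # None when this td has no img
    if img is not None and 'vvvv' in img.get('class', []):
        print(img)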
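As a quick check of why "==" fails, this small sketch (again assuming Beautiful Soup 4 and a one-tag sample taken from the question's markup) shows what the class attribute actually looks like:

```python
from bs4 import BeautifulSoup

doc = BeautifulSoup('<img class="eveimage img-rounded" alt="Capsule" />', "html.parser")
img = doc.find("img")

# In Beautiful Soup 4, class is a multi-valued attribute and comes back as a list.
classes = img.get("class", [])  # [] if the tag has no class at all

has_rounded = "img-rounded" in classes      # membership test against the list
equals_rounded = classes == "img-rounded"   # list compared to a string: never equal
```

Using get with a default also avoids a KeyError on tags that have no class attribute.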

Also note that an img tag has no text content; its data is in attributes such as alt and src.

That said, I would rather use the find_all method (spelled findAll in older versions):

doc.findAll('img', {'class': 'vvvv'})

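This returns every img with class 'vvvv'. In Beautiful Soup 4 the match succeeds even when the tag carries other classes as well; older versions compared the whole attribute string, so a tag with additional classes would not match.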
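For the original goal of reading the text of the 'victim' cell, a sketch along the same lines might look like this. It assumes Beautiful Soup 4 and uses a trimmed copy of the question's markup. The closing tags and the surrounding table row were added so the sample parses cleanly.

```python
from bs4 import BeautifulSoup

# Trimmed from the question's markup; closing </td> tags added for the sample.
html = """
<table><tr>
<td class="victim" style="text-align: left; vertical-align: top;">
<a href="/character/91628726/">stroon themighty</a> (Capsule)<br />
<a href="/corporation/98423345/">Plaus Collective</a>
/ <a href="/alliance/99005866/">Just let it happen</a>
</td>
<td class="hidden-xs"><img class="eveimage img-rounded" alt="The Volition Cult" /></td>
</tr></table>
"""

doc = BeautifulSoup(html, "html.parser")

# Text of every td whose class list includes 'victim', whitespace normalised.
victims = [td.get_text(" ", strip=True) for td in doc.find_all("td", class_="victim")]

# Searching by one class matches even though the img carries two classes.
rounded = doc.find_all("img", class_="img-rounded")
```

If the aim is to drop a matched tag from the tree rather than read it, calling decompose() on the tag removes it.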

Happy coding.