git-e - 1 month ago
HTML Question

List of all items from queryset with BeautifulSoup

I have a Django project with a model field whose contents (taken from a QuerySet) look like this:

<p><b>Name and LastName</b><br />
Work Title<br /><span class="text-spacer"></span>
</p>
<p><b>Name and LastName 1</b><br />
Work Title1 <br /><span class="text-spacer"></span>
</p>
<p><b>Name and LastName 2</b><br />
Work Title 2<br /><span class="text-spacer"></span>
</p>


But I want the text in this format, with the name and title joined by a hyphen (-):

Name and LastName - Work Title
Name and LastName 1 - Work Title1
Name and LastName 2 - Work Title 2


Here is my code. It returns only the first item, but I want a list with all of the items:

text_list = self.texts.filter(code='ON')
for i in text_list:
    soup = BeautifulSoup(i.text_en, "html.parser")
    aa = soup.p.get_text(separator=" - ", strip=True)
    return [aa]

Answer

soup.p only returns the first p tag in the document, so you need to iterate over all of the p tags with find_all. Using the example you provided, you can try something like this:

from bs4 import BeautifulSoup

source = """<p><b>Name and LastName</b><br />
Work Title<br /><span class="text-spacer"></span>
</p>
<p><b>Name and LastName 1</b><br />
Work Title1 <br /><span class="text-spacer"></span>
</p>
<p><b>Name and LastName 2</b><br />
Work Title 2<br /><span class="text-spacer"></span>
</p>
"""
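# 'lxml' requires the lxml package; the built-in 'html.parser' also works here
soup = BeautifulSoup(source, 'lxml')
# One string per <p>: its text pieces are stripped and joined with ' - '
ary = [p.get_text(separator=' - ', strip=True) for p in soup.find_all('p')]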

ary will then be (the u prefix shows this output came from Python 2):

[u'Name and LastName - Work Title',
 u'Name and LastName 1 - Work Title1',
 u'Name and LastName 2 - Work Title 2']
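
To apply the same idea to the queryset from the question, a minimal sketch could look like the one below. The helper name paragraphs_to_lines and the fragments list are made up for illustration; in the Django code the fragments would presumably come from something like (i.text_en for i in text_list). It uses "html.parser" as in the question, so lxml is not needed.

```python
from bs4 import BeautifulSoup


def paragraphs_to_lines(html_fragments, separator=" - "):
    """Return one joined string per <p> tag found across all fragments.

    Hypothetical helper, not part of BeautifulSoup or Django.
    """
    lines = []
    for html in html_fragments:
        soup = BeautifulSoup(html, "html.parser")
        # find_all('p') yields every <p>, unlike soup.p (first <p> only)
        for p in soup.find_all("p"):
            text = p.get_text(separator=separator, strip=True)
            if text:  # skip paragraphs that contain no text
                lines.append(text)
    return lines


# Stand-in for the text_en values of the queryset items (assumed data)
fragments = [
    '<p><b>Name and LastName</b><br />\nWork Title<br />'
    '<span class="text-spacer"></span>\n</p>'
    '<p><b>Name and LastName 1</b><br />\nWork Title1 <br />'
    '<span class="text-spacer"></span>\n</p>',
    '<p><b>Name and LastName 2</b><br />\nWork Title 2<br />'
    '<span class="text-spacer"></span>\n</p>',
]

result = paragraphs_to_lines(fragments)
print(result)
```

Building the list first and returning it once, after both loops have finished, avoids the early return that cut the original loop short.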