Karan Karan - 8 days ago 4
Python Question

Python 3.6: Beautiful Soup - How to extract all the text in a div container?

[<div class="nav-wrapper">
<p class="navigation-links">
<span class="page-numbers current">1</span>
<a class="page-numbers" href="http://www.example.com/2/">2</a>
<a class="page-numbers" href="http://www.example.com/3/">3</a>
<span class="page-numbers dots">…</span>
<a class="page-numbers" href="http://www.example.com/6/">6</a>
<a class="next page-numbers" href="http://www.example.com/2/">Next →</a> </p>
</div>]


Also, is there a simple way to extract the maximum page number from the page nav bar, assuming that the entry after the 'span class' with the dots is the upper limit?

Thanks in advance!

Answer
html = '''<div class="nav-wrapper">
          <p class="navigation-links">
          <span class="page-numbers current">1</span>
          <a class="page-numbers" href="http://www.example.com/2/">2</a>
          <a class="page-numbers" href="http://www.example.com/3/">3</a>
          <span class="page-numbers dots">…</span>
          <a class="page-numbers" href="http://www.example.com/6/">6</a>
          <a class="next page-numbers" href="http://www.example.com/2/">Next →</a> </p>
          </div>'''
from bs4 import BeautifulSoup

bs = BeautifulSoup(html, "html.parser")
# The entry right after the "dots" span is taken as the last page number.
max_page = bs.find('span', {'class': 'page-numbers dots'}).find_next().text
print(max_page)
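
The snippet above only covers the maximum page number. For the first part of the question (all the text in the div), here is a minimal sketch using `stripped_strings` and `get_text`. The names `soup`, `div`, `texts` and `page_numbers` are just illustrative. The last step assumes the page labels are plain integers, so it does not depend on the dots span being present.

```python
from bs4 import BeautifulSoup

html = '''<div class="nav-wrapper">
<p class="navigation-links">
<span class="page-numbers current">1</span>
<a class="page-numbers" href="http://www.example.com/2/">2</a>
<a class="page-numbers" href="http://www.example.com/3/">3</a>
<span class="page-numbers dots">…</span>
<a class="page-numbers" href="http://www.example.com/6/">6</a>
<a class="next page-numbers" href="http://www.example.com/2/">Next →</a> </p>
</div>'''

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", class_="nav-wrapper")

# Every piece of text inside the div, with surrounding whitespace stripped.
texts = list(div.stripped_strings)

# The same text as one string, pieces joined by a single space.
all_text = div.get_text(" ", strip=True)

# Assumption: page labels are plain integers, so the largest one is the last page.
page_numbers = [int(t) for t in texts if t.isdigit()]
max_page = max(page_numbers)

print(texts)
print(all_text)
print(max_page)
```

Taking the maximum of the numeric labels also works when the nav bar is short enough that no dots span is rendered, in which case `bs.find(...)` would return `None`.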