user3257755 - 2 months ago
HTML Question

Python BeautifulSoup - Find all elements whose class names begin with some string

Assume that we want to find all <li> elements whose class names begin with a known string and end with an arbitrary id number.

That means that this approach doesn't work:

soup.find_all("li", {"class": KNOWN_STRING})

I have also tried this approach without any luck:

soup.select("li[class^="+KNOWN_STRING)

How can this be solved?


I would use a regular expression for this.

import re

soup.find_all('li', {'class': re.compile(r'regex_pattern')})

Because you have a known string followed by an arbitrary (I'm assuming unknown) number, you can use a regular expression to describe the pattern you expect the class name to match. For example, a pattern of the form known_string[0-9]+ would find all class names made up of the known string with one or more digits at the end. See the documentation of Python's re module for more about regular expressions in Python.
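To make the idea concrete, here is a minimal sketch run against some made-up markup. The HTML, the "item-" prefix and the ids are all hypothetical, and it assumes the bs4 package is installed; the built-in html.parser is used so no extra parser is needed.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup: the "item-" prefix and the ids are made up for illustration.
html = """
<ul>
  <li class="item-1">first</li>
  <li class="item-27">second</li>
  <li class="other-3">third</li>
  <li class="item-x">fourth</li>
</ul>
"""

KNOWN_STRING = "item-"  # assumed prefix, stands in for your real one

soup = BeautifulSoup(html, "html.parser")

# re.escape keeps any regex metacharacters in the known string literal.
# ^ and $ anchor the pattern so the whole class name has to match.
pattern = re.compile(r"^" + re.escape(KNOWN_STRING) + r"[0-9]+$")

matches = soup.find_all("li", {"class": pattern})
texts = [li.get_text() for li in matches]
print(texts)
```

Only the first two items should be picked up here, since "other-3" has the wrong prefix and "item-x" does not end in digits.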

Edit, to answer the question:

Would this be correct given two digits in the id? soup.find_all('li', {'class': re.compile(r'^TheMatch v-1 c-[0-9][0-9]+$')}). I assume that it wouldn't.

For two digits at the end you would do:

soup.find_all('li', {'class': re.compile(r'^TheMatch v-1 c-[0-9]{2}$')})

The + just means one or more of the preceding element, so [0-9][0-9]+ would match two or more digits rather than exactly two.

What I did was specify in braces, {2}, right after the character class, the exact number of repetitions I was expecting: 2.
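If you want to check the difference between the quantifiers yourself, a small sketch using only the re module is below. The class strings are hypothetical, modeled on the one in your comment, and search is used because, as far as I know, that is how BeautifulSoup applies a compiled pattern.

```python
import re

# Hypothetical class strings, modeled on the pattern from the comment.
candidates = [
    "TheMatch v-1 c-7",
    "TheMatch v-1 c-42",
    "TheMatch v-1 c-123",
]

one_or_more = re.compile(r"^TheMatch v-1 c-[0-9]+$")       # 1+ digits
exactly_two = re.compile(r"^TheMatch v-1 c-[0-9]{2}$")     # exactly 2 digits
two_or_more = re.compile(r"^TheMatch v-1 c-[0-9][0-9]+$")  # 2+ digits


def matching(pattern):
    """Return the candidate strings that the compiled pattern matches."""
    return [s for s in candidates if pattern.search(s)]


print(matching(one_or_more))
print(matching(exactly_two))
print(matching(two_or_more))
```

So [0-9][0-9]+ is not wrong as such, but it also accepts three or more digits, while {2} accepts exactly two.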