shalini shalini - 3 months ago 50
Python Question

Trying to scrape TripAdvisor members using BeautifulSoup

I'm trying to scrape this user's profile to get their ratings for hotels and restaurants separately:
https://www.tripadvisor.in/members-reviews/rahuls896

The problem is that the page shows all reviews by default when I read it with BeautifulSoup, so class="active" is assigned to "REVIEWS_ALL":

<li data-filter="REVIEWS_ALL" class="active">All</li>
<li data-filter="REVIEWS_HOTELS">Hotels (1)</li>
<li data-filter="REVIEWS_RESTAURANTS">Restaurants (1)</li>


But I'd like class="active" to be assigned to "REVIEWS_HOTELS" instead:

<li data-filter="REVIEWS_ALL">All</li>
<li data-filter="REVIEWS_HOTELS" class="active">Hotels (1)</li>
<li data-filter="REVIEWS_RESTAURANTS">Restaurants (1)</li>


How can I automate this?

Answer

Try scraping all of the user's reviews first, then separate them into hotels and restaurants yourself. Selenium can drive a real browser and page through the reviews:

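from selenium import webdriver
from selenium.webdriver.common.by import By

# Open the profile in a real browser so the page's scripts run
driver = webdriver.Firefox()
driver.get('https://www.tripadvisor.in/members-reviews/rahuls896')
# Click the pagination "next" button (find_element_by_id was removed in Selenium 4)
next_button = driver.find_element(By.ID, "cs-paginate-next")
next_button.click()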
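
For the "segregate" step, something along these lines might work. This is only a rough sketch: it assumes you have already reduced each scraped review to a dict, and the field names (`title`, `category`, `rating`) and the sample values are made up for illustration, not taken from the actual page.

```python
from collections import defaultdict

# Hypothetical sample data: these field names and values are assumptions,
# standing in for whatever you extract from each scraped review.
reviews = [
    {"title": "Great stay", "category": "hotel", "rating": 4},
    {"title": "Tasty food", "category": "restaurant", "rating": 5},
]


def group_by_category(reviews):
    """Return a dict mapping each category to the list of its reviews."""
    grouped = defaultdict(list)
    for review in reviews:
        grouped[review["category"]].append(review)
    return dict(grouped)


grouped = group_by_category(reviews)

# Pull the ratings out separately for each kind of review.
hotel_ratings = [r["rating"] for r in grouped.get("hotel", [])]
restaurant_ratings = [r["rating"] for r in grouped.get("restaurant", [])]
print(hotel_ratings, restaurant_ratings)
```

How you work out the category of each review depends on the page markup, so that part is left to you.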