clg4 clg4 - 8 months ago 110
Python Question

Python BeautifulSoup scrape Yahoo Finance value

I am attempting to scrape the 'Full Time Employees' value of 110,000 from the Yahoo Finance website.

The URL is:

I have tried using Beautiful soup, but I can't find the value on the page. When I look in the DOM explorer in IE, I can see it. It has a tag with a parent tag which has a parent which has a parent . The actual value is in a custom class of

Code I have tried:

import requests
from bs4 import BeautifulSoup as bs

# html holds the page URL given above
r = requests.get(html).content
soup = bs(r, "html.parser")

Not sure where to go.


The problem is in the "requests" part: the page you download with requests is not the same as the one you see in the browser. The browser executes all of the JavaScript and makes the multiple asynchronous requests needed to load the page, while requests only fetches the initial HTML. This particular page is quite dynamic, with a lot happening on the client side, so the value you see in the DOM explorer is not present in what requests returns.
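A quick way to confirm this is to check whether the label appears as ordinary text in the HTML that requests gave you. The following is only a minimal sketch using the standard library; the two sample strings are made-up stand-ins (not the real Yahoo markup), and `label_in_static_html` is a hypothetical helper name. In practice you would pass in the decoded response body instead of the samples.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def label_in_static_html(markup, label):
    """Return True if `label` appears in the markup's non-script text."""
    parser = VisibleTextParser()
    parser.feed(markup)
    parser.close()
    return any(label in chunk for chunk in parser.chunks)


# Hypothetical server response: the data only exists inside a script
# that a browser would run, so a static parse never sees it as text.
SAMPLE_STATIC = """
<html><body>
<div id="app"></div>
<script>render({"label": "Full Time Employees", "value": "110,000"});</script>
</body></html>
"""

# Hypothetical DOM after the browser has executed the script.
SAMPLE_RENDERED = """
<html><body>
<div id="app"><span>Full Time Employees</span><strong>110,000</strong></div>
</body></html>
"""

print(label_in_static_html(SAMPLE_STATIC, "Full Time Employees"))
print(label_in_static_html(SAMPLE_RENDERED, "Full Time Employees"))
```

If the check fails on the downloaded HTML but the value is visible in the browser, the content is being added client-side and no static parser will find it.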

What you can do is load this page in a real browser automated by Selenium. Working example:

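from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
driver.get(html)  # html holds the page URL, as in the question's code

# wait for the Full Time Employees to be visible
wait = WebDriverWait(driver, 10)
employees = wait.until(EC.visibility_of_element_located((By.XPATH, "//span[. = 'Full Time Employees']/following-sibling::strong")))
print(employees.text)

driver.quit()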


Prints 110,000.
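In case the XPath is opaque: it finds the span whose text is exactly 'Full Time Employees' and then takes the strong element that follows it under the same parent. Here is a rough standard-library sketch of that same lookup on a static fragment. The markup in `SAMPLE` is an assumption inferred from the XPath, not copied from the real page, and `value_after_label` is a hypothetical helper. ElementTree only accepts well-formed XML, so this is for illustration rather than for parsing the live page.

```python
import xml.etree.ElementTree as ET


def value_after_label(markup, label):
    """Return the text of the first <strong> that follows a <span> whose
    full text equals `label` under the same parent, or None if not found."""
    root = ET.fromstring(markup)
    for parent in root.iter():
        children = list(parent)
        for i, child in enumerate(children):
            # "".join(itertext()) mirrors the XPath string value of the span
            if child.tag == "span" and "".join(child.itertext()) == label:
                for sibling in children[i + 1:]:
                    if sibling.tag == "strong":
                        return sibling.text
    return None


# Hypothetical, well-formed fragment shaped the way the XPath implies.
SAMPLE = (
    "<div>"
    "<p><span>Sector</span><strong>Technology</strong></p>"
    "<p><span>Full Time Employees</span><strong>110,000</strong></p>"
    "</div>"
)

print(value_after_label(SAMPLE, "Full Time Employees"))
```

Anchoring on the visible label rather than on a class name is a design choice: generated class names on pages like this tend to change, while the label text is more stable.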