Sam Sam - 8 months ago 39
Ajax Question

Scraping AJAX-loaded content with Python?

I have a function that is called when I click a button. It goes as below:

var min_news_id = "68feb985-1d08-4f5d-8855-cb35ae6c3e93-1";
function loadMoreNews(){
    // ...
    data = JSON.parse(data);
    min_news_id = data.min_news_id||min_news_id;
    // ...
    .fail(function(){alert("Error : unable to load more news");})
}

I don't have much experience with JavaScript, but I assume it's returning some JSON data from some sort of API at "en/ajax/more_news".

Is there a way I could call this API directly and get the JSON data from my Python script? If yes, how?

If not, how do I scrape the content that is being generated?


You need to POST the news id that you see inside the script to the AJAX endpoint. Here is an example using requests:

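from bs4 import BeautifulSoup
import requests
import re

# pattern to extract min_news_id from the inline script
patt = re.compile(r'var min_news_id\s+=\s+"(.*?)"')

with requests.Session() as s:
    # the page URL is left blank here, as in the original post
    soup = BeautifulSoup(s.get("").content, "html.parser")
    # find the <script> tag that declares min_news_id
    new_id_scr = soup.find("script", text=re.compile(r"var\s+min_news_id"))
    # pull the quoted id out of the script text
    news_id = patt.search(new_id_scr.text).group(1)
    # the endpoint URL is left blank here, as in the original post
    js = s.post("", data={"news_offset": news_id}).json()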

js gives you all the HTML; you just have to access js["html"].
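
To load further pages, the same request can presumably be repeated with the `min_news_id` that each response returns, mirroring the `data.min_news_id||min_news_id` line in the page's script. Below is a minimal sketch under stated assumptions: `extract_news_id`, `fetch_pages` and `fake_post` are hypothetical names, the response is assumed to be a JSON object with `html` and `min_news_id` keys, and `fake_post` stands in for the real `s.post(...)` call so the loop can run without network access.

```python
import json
import re

# Same pattern as in the answer: captures the id assigned to min_news_id.
PATT = re.compile(r'var min_news_id\s+=\s+"(.*?)"')


def extract_news_id(script_text):
    """Return the min_news_id found in the inline script, or None."""
    match = PATT.search(script_text)
    return match.group(1) if match else None


def fetch_pages(post, first_id, max_pages=3):
    """Collect HTML fragments by repeatedly posting the current offset.

    `post` is any callable that takes the offset and returns the response
    body as a JSON string (hypothetical; in practice it would wrap the
    session's post call and return the response text).
    """
    fragments = []
    news_id = first_id
    for _ in range(max_pages):
        data = json.loads(post(news_id))
        fragments.append(data["html"])
        # Mirror the page's JS: keep the old id if the response has none.
        next_id = data.get("min_news_id") or news_id
        if next_id == news_id:
            break  # no new offset, so nothing more to load
        news_id = next_id
    return fragments


def fake_post(offset):
    """Stand-in for the real endpoint, used only to exercise the loop."""
    following = {"id-1": "id-2", "id-2": "id-3"}
    return json.dumps({
        "html": "<div>news after %s</div>" % offset,
        "min_news_id": following.get(offset),
    })


script = 'var min_news_id = "id-1";'
start = extract_news_id(script)
pages = fetch_pages(fake_post, start)
print(start, len(pages))
```

Passing the request function in as a parameter keeps the paging logic separate from the HTTP call, so it can be checked offline before pointing it at the real endpoint.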