Bolito Bolito - 1 day ago 3
Python Question

BeautifulSoup - Scraping data from a paginated table using Python

I am scraping data from a betting site
(https://www.pointdevente.parionssport.fdj.fr/parisouverts/rugby).

I can scrape a limited number of events on the current page, but I am unable to scrape the rest of the data in the table.
How do I go to the next page or link?

Following is my code:

import urllib2
from urllib2 import urlopen
import requests
import dryscrape
from bs4 import BeautifulSoup

dryscrape.start_xvfb()
SessionFDJ = dryscrape.Session()
SessionFDJ.visit('https://pointdevente.parionssport.fdj.fr/parisouverts/rugby/')
ResponseFDJ = SessionFDJ.body()
print(ResponseFDJ)

Answer

This page uses JavaScript to load all of its data and update the table. Use DevTools in Chrome/Firefox to see which files/URLs the browser requests, and you will find

https://www.pointdevente.parionssport.fdj.fr/api/1n2/offre?sport=964500

which returns all the data as JSON.

It seems this page uses an API, so look for the API documentation; you will not need BeautifulSoup.


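import requests

url = 'https://www.pointdevente.parionssport.fdj.fr/api/1n2/offre?sport=964500'

# Request the JSON endpoint directly instead of rendering the page
r = requests.get(url)

# Decode the JSON body into a list of events
data = r.json()

# Print the label (the two teams) of each event
for x in data:
    print(x['label'])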

Result:

Biarritz-Perpignan
Kenya-France
Australie-Japon
Etats-Unis-Ecosse
Argentine-Pays de Galles
Angleterre-Samoa
Montauban-Colomiers
Bourgoin-Angoulême
Aurillac-Mt-de-Marsan
Dax-Albi
Vannes-Béziers
Ospreys-Edimbourg
Glasgow-Munster
Sale-Exeter
Bath-Saracens
Pau-Clermont
Zebre-Llanelli
Angleterre-Australie
Connacht-Trévise
Gloucester-Bristol
Leicester-Northampton
Cardiff-Ulster
Grenoble-Montpellier
Lyon-Castres
St.Français-Bayonne
Leinster-Newport
La Rochelle-Racing 92
Toulouse-Brive
Narbonne-Oyonnax
Worcester-Wasps
Newcastle-Harlequins
Toulon-Bordeaux
Fidji-Canada
NlleZélande-Russie
Agen-Carcassonne
AfriqueduSud-Ouganda
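
If you want to reuse this for other sports or guard against unexpected entries, the same idea can be wrapped in small functions. The following is only a sketch under a few assumptions: it targets Python 3, it swaps requests for the standard library so it runs without extra packages, the helper names (build_url, extract_labels, fetch_events) are hypothetical, and it assumes the endpoint keeps returning a UTF-8 JSON list of events with a 'label' key, as in the output above. The only sport id known from this page is 964500.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the answer above; everything else here is illustrative.
API_URL = 'https://www.pointdevente.parionssport.fdj.fr/api/1n2/offre'


def build_url(sport_id):
    """Return the API URL for one sport id (964500 is the id seen above)."""
    return API_URL + '?' + urlencode({'sport': sport_id})


def extract_labels(events):
    """Return the 'label' of every event dict, skipping entries without one."""
    return [event['label'] for event in events
            if isinstance(event, dict) and 'label' in event]


def fetch_events(sport_id, timeout=10):
    """Download and decode the JSON body (assumed to be a UTF-8 JSON list)."""
    with urlopen(build_url(sport_id), timeout=timeout) as response:
        return json.loads(response.read().decode('utf-8'))
```

To use it, call extract_labels(fetch_events(964500)) and print each item. Keeping the download separate from the parsing means extract_labels can be checked on sample data without touching the network.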