Andrew Andrew - 1 year ago 169
Ruby Question

Scraping Reddit using Nokogiri (429 too many requests)

I'm trying to scrape Reddit with Nokogiri, but a single run of this keeps telling me that I'm putting in too many requests.

require 'nokogiri'
require 'open-uri'
url = ""
redditscrape = Nokogiri::HTML(open(url))

OpenURI::HTTPError: 429 Too Many Requests

Isn't this only one request? If it's not, how do I create sleep intervals for Nokogiri?

Answer Source

Reddit has an API

You could probably query the API for the particular subreddit(s) you want to scrape. Attempting to scrape all of Reddit seems like a nightmare waiting to happen, considering the high volume of posts and the nested comments.


It looks like Reddit is throttling scrapers to steer people toward its public API, which would explain why even a single request can come back with a 429.
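
To make the API suggestion concrete, here is a minimal sketch that fetches one subreddit's listing as JSON instead of scraping HTML. Several things here are assumptions and not from the original post: the `/r/<subreddit>.json` endpoint shape, the helper names (`listing_url`, `backoff_seconds`, `titles_from`, `fetch_listing`), and the placeholder User-Agent string. Reddit's API rules ask clients to send a descriptive User-Agent, and requests that use a library's default one tend to be rate limited heavily, which is a likely reason a single open-uri request got a 429. The retry loop also covers the sleep-interval question: the waiting goes around the HTTP request, not in Nokogiri, which only parses what has already been downloaded.

```ruby
require 'json'
require 'open-uri'

# Placeholder value (assumption): replace with something that identifies your script.
USER_AGENT = 'ruby:example-scraper:v0.1 (by /u/your_username)'

# Builds the JSON listing URL for one subreddit (assumed endpoint shape).
def listing_url(subreddit, limit: 25)
  "https://www.reddit.com/r/#{subreddit}.json?limit=#{limit}"
end

# Exponential backoff: 2, 4, 8, ... seconds for attempts 1, 2, 3, ...
def backoff_seconds(attempt)
  2**attempt
end

# Extracts the post titles from a parsed listing response.
def titles_from(listing)
  listing.fetch('data').fetch('children').map do |child|
    child.fetch('data').fetch('title')
  end
end

# Fetches and parses a listing, sleeping and retrying on 429 responses.
# Any other HTTP error, or a 429 on the last attempt, is re-raised.
def fetch_listing(subreddit, max_attempts: 3)
  attempt = 0
  begin
    attempt += 1
    body = URI.open(listing_url(subreddit), 'User-Agent' => USER_AGENT).read
    JSON.parse(body)
  rescue OpenURI::HTTPError => e
    raise unless e.io.status.first == '429' && attempt < max_attempts

    sleep(backoff_seconds(attempt))
    retry
  end
end

# Example call (performs a real network request):
#   puts titles_from(fetch_listing('ruby'))
```

Assuming the endpoint is reachable without authentication, `titles_from(fetch_listing('ruby'))` should return an array of post titles. For anything heavier than occasional requests, registering an app and using Reddit's OAuth-based API is probably the safer route.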
