McLeodx - 1 year ago
Python Question

How to use time module on a for loop to prevent delays in list processing

How can I use the time module to skip to the next iteration of a loop if the current one is taking longer than 5 seconds? Specifically, what is the correct way to apply the solution from "How would I stop a while loop after n amount of time?" to the for loop I'm using below?

import requests
from bs4 import BeautifulSoup

# random_list, all_raw_urls and image_count are defined elsewhere
for n in random_list:
    try:
        url = all_raw_urls[n]
        req = requests.get(url)
        data = req.text
        soup = BeautifulSoup(data, 'html.parser')

        # collect the unique <img> tags on the page
        tags = soup.find_all('img')
        tags = list(set(tags))

        if len(tags) < 15 or len(tags) > 50:
            print(str(image_count) + ': leave' + ' : images: ' + str(len(tags)))
        else:
            print(str(image_count) + ': keep' + ' : images: ' + str(len(tags)))
    except requests.exceptions.RequestException:
        print('request error')

    image_count += 1

Answer

You can use the timeout support built into requests:

req = requests.get(url, timeout=5)

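The timeout parameter takes either a number of seconds to wait for the server, or a tuple (connect_timeout, read_timeout). Note that it bounds how long requests waits for the connection and for each read from the socket, not the total download time: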

req = requests.get(url, timeout=(0.5, 5))
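When the timeout expires, requests raises an exception rather than returning, so the loop has to catch it in order to move on. A minimal sketch of that, assuming a helper named fetch_html (a made-up name, not part of requests) whose get parameter exists only so the function can be tried without network access:

```python
import requests


def fetch_html(url, timeout=(0.5, 5), get=requests.get):
    """Return the page text, or None if the request fails or times out.

    fetch_html and its `get` parameter are illustrative names, not part
    of the requests API.
    """
    try:
        response = get(url, timeout=timeout)
    except requests.exceptions.Timeout:
        # Base class of both ConnectTimeout and ReadTimeout.
        print('timed out, skipping: ' + url)
        return None
    except requests.exceptions.RequestException:
        # Any other requests failure (DNS error, refused connection, ...).
        print('request error: ' + url)
        return None
    return response.text
```

In the loop from the question this would replace the requests.get call: assign data = fetch_html(all_raw_urls[n]) and continue to the next iteration when data is None.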

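If you really want to use the time module, you can move everything inside the try/except into a separate thread and check in a while loop whether it has finished within 5 seconds. Python threads cannot be killed from outside, though, so if the thread is still running, the most you can do is stop waiting for it and start the next task in another thread (a separate process, by contrast, can be terminated).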
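A rough sketch of the thread-based approach, using only the standard library. The name run_with_deadline and its parameters are hypothetical, and a thread that overruns is abandoned rather than killed, because Python has no call to stop a thread from outside:

```python
import threading
import time


def run_with_deadline(task, seconds, poll_interval=0.05):
    """Run task() in a daemon thread and wait at most `seconds` for it.

    Returns (finished, value). If the deadline passes, the thread keeps
    running in the background and (False, None) is returned.
    """
    outcome = {}

    def worker():
        try:
            outcome['value'] = task()
        except Exception as exc:
            # Keep the error so it can be re-raised in the calling thread.
            outcome['error'] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()

    # Poll the thread's state until it finishes or the deadline passes.
    deadline = time.monotonic() + seconds
    while thread.is_alive() and time.monotonic() < deadline:
        time.sleep(poll_interval)

    if thread.is_alive():
        return False, None
    if 'error' in outcome:
        raise outcome['error']
    return True, outcome['value']
```

Because abandoned threads keep holding their sockets until the request completes on its own, the timeout argument of requests.get is usually the simpler choice for this particular loop.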