Vladimir Vargas - 2 months ago
Python Question

Multiprocessing and Selenium Python

I have 3 drivers (Firefox browsers) and I want them to do something in a list of websites.

I have a worker defined as:

def worker(browser, queue):
    while True:
        id_ = queue.get(True)
        obj = ReviewID(id_)
        obj.search(browser)
        if obj.exists(browser):
            print(obj.get_url(browser))
        else:
            print("Nothing")


So the worker just accesses a queue that contains the ids and uses the browser to do something.

I want to have a pool of workers so that as soon as a worker has finished using the browser to do something on the website defined by id_, it immediately starts working with the same browser on the next id_ found in the queue. So far I have this:

pool = Pool(processes=3) # I want to have 3 drivers
manager = Manager()
queue = manager.Queue()
# Define here my workers in the pool
for id_ in ids:
    queue.put(id_)
for i in range(3):
    queue.put(None)


Here is my problem: I don't know how to define my workers so that they are in the pool. Each driver needs a worker assigned to it, and all the workers share the same queue of ids. Is this possible? How can I do it?

Another idea that I have is to create a queue of browsers so that if a driver is doing nothing, it is taken by a worker, along with an id_ from the queue in order to perform a new process. But I'm completely new to multiprocessing and actually don't know how to write this.

I appreciate your help.

Answer

You could try instantiating the browser in the worker:

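from selenium import webdriver

def worker(queue):
    # Each worker creates and owns its own browser instead of receiving one.
    browser = webdriver.Chrome()
    try:
        while True:
            id_ = queue.get(True)
            if id_ is None:  # sentinel from the queue: no more ids, stop
                break
            obj = ReviewID(id_)
            obj.search(browser)
            if obj.exists(browser):
                print(obj.get_url(browser))
            else:
                print("Nothing")
    finally:
        # Always close the browser, even if an exception was raised.
        browser.quit()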
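
To get the workers into the pool, one option is to submit one long-running task per pool process and hand each of them the same manager queue. The sketch below is only an illustration under some assumptions: FakeBrowser and run_workers are hypothetical names that are not part of your code, FakeBrowser stands in for the Selenium driver so the example runs without a browser, and the ReviewID calls are replaced by a placeholder.

```python
from multiprocessing import Manager, Pool


class FakeBrowser:
    """Hypothetical stand-in for a Selenium driver so the sketch runs anywhere."""

    def quit(self):
        pass


def worker(queue):
    # In real use this would be webdriver.Firefox() or webdriver.Chrome(),
    # created once per process and reused for every id.
    browser = FakeBrowser()
    handled = []
    try:
        while True:
            id_ = queue.get()
            if id_ is None:  # sentinel: this worker is done
                break
            # Placeholder for ReviewID(id_).search(browser) and the checks after it.
            handled.append(id_)
    finally:
        browser.quit()
    return handled


def run_workers(ids, n_workers=3):
    with Manager() as manager, Pool(processes=n_workers) as pool:
        queue = manager.Queue()
        # One long-running task per pool process; all of them share the queue.
        tasks = [pool.apply_async(worker, (queue,)) for _ in range(n_workers)]
        for id_ in ids:
            queue.put(id_)
        for _ in range(n_workers):
            queue.put(None)  # one sentinel per worker
        # .get() waits for each worker and re-raises any exception it hit.
        return [task.get() for task in tasks]


if __name__ == "__main__":
    print(run_workers(range(10)))
```

Which worker handles which id depends on timing, so the per-worker lists will differ between runs. Calling .get() on each result matters here: without it, an exception raised inside a worker would pass silently.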