charles M - 3 months ago
Python Question

Difference between Queue and Sets in Python

Suppose there are multiple threads, with one function that adds a value to a container and another function that takes that value out. What would the difference be between these two approaches:

# Version 1: queue
import queue

scraped = queue.Queue()

def scrape():
    scraped.put('example')

def send():
    example = scraped.get()  # blocks until an item is available
    print(example)

# Version 2: set
scraped = set()

def scrape():
    scraped.add('example')

def send():
    example = scraped.pop()  # raises KeyError if the set is empty
    print(example)


Why do people use the queue module for this situation, when it is 170-180 lines long with if conditions that slow the process down, if they could use sets, which also give them the advantage of duplicate filtering?

Answer

Queues maintain the order of their elements (queue.Queue is first-in, first-out) and allow the same value to appear more than once. Sets, on the other hand, do not maintain ordering and cannot contain duplicates.

In your case you may need to keep a record of each thing scraped and/or the relative order in which it was scraped. In that case, use queues. If you just want a list of the unique things you scraped, and you don't care about the relative order in which you scraped them, use sets.
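To make the ordering and duplicate behaviour concrete, here is a minimal single-threaded sketch. The list `items` is a made-up stand-in for scraped values, not something from the question:

```python
import queue

# Hypothetical scraped values, with 'b' appearing twice.
items = ['b', 'a', 'b', 'c']

q = queue.Queue()
for item in items:
    q.put(item)

# Drain the queue: items come out in insertion order, duplicates included.
from_queue = []
while not q.empty():
    from_queue.append(q.get())

# A set collapses duplicates and makes no promise about iteration order.
unique = set(items)

print(from_queue)      # ['b', 'a', 'b', 'c']
print(sorted(unique))  # ['a', 'b', 'c']
```

The set has to be sorted before printing to get a predictable result, which is itself a sign that the original scraping order is gone.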

As @mata points out, a queue should be used if multiple threads are producing and consuming from it. Queues implement the blocking functionality needed to work with producer/consumer threads: get() waits until an item is available, whereas pop() on an empty set raises KeyError immediately. Queues are designed to be thread-safe, sets are not.
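A rough sketch of the blocking difference, assuming a hypothetical `slow_producer` function that stands in for a scraper which needs some time before it has a result:

```python
import queue
import threading
import time

q = queue.Queue()

def slow_producer():
    # Hypothetical producer: pretend scraping takes a moment.
    time.sleep(0.2)
    q.put('example')

threading.Thread(target=slow_producer).start()

# The consumer asks before anything has been produced.
# get() simply waits until the producer has put an item.
item = q.get()
print(item)

# A set has no way to wait: popping from an empty set fails at once.
try:
    set().pop()
    set_outcome = 'got an item'
except KeyError:
    set_outcome = 'KeyError'
print(set_outcome)
```

With a set, the consumer would have to poll and handle KeyError itself, which is roughly the logic that the queue module already provides.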

In this example from the docs:

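from queue import Queue
from threading import Thread

# do_work, source and num_worker_threads are placeholders that you supply.

def worker():
    while True:
        item = q.get()      # blocks until an item is available
        do_work(item)
        q.task_done()       # tells the queue this item has been processed

q = Queue()
for i in range(num_worker_threads):
    t = Thread(target=worker)
    t.daemon = True
    t.start()

for item in source():
    q.put(item)

q.join()   # block until all tasks are done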

get() in the consumer thread (i.e. worker) blocks until there is something in the queue to get. task_done() in the consumer thread tells the queue that the item it got has been processed. join() in the producer thread blocks until task_done() has been called for every item that was put into the queue.
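For something that can be run as-is, here is a small self-contained variant of the same pattern. The doubling step, the three worker threads, and the `results` list with its lock are all assumptions made for illustration. They take the place of do_work, num_worker_threads and source above:

```python
import queue
import threading

q = queue.Queue()
results = []
results_lock = threading.Lock()  # protects the shared results list

def worker():
    while True:
        item = q.get()                # blocks until an item is available
        with results_lock:
            results.append(item * 2)  # stand-in for do_work(item)
        q.task_done()                 # mark this item as processed

# Start three daemon workers so they do not keep the program alive at exit.
for _ in range(3):
    threading.Thread(target=worker, daemon=True).start()

# Produce ten items.
for item in range(10):
    q.put(item)

q.join()  # returns only after task_done() has been called ten times
print(sorted(results))
```

Because join() only returns once every item has been marked done, all ten results are guaranteed to be in the list by the time it is printed, even though the order in which the workers finished is not fixed.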