Suppose there are multiple threads, with one function that adds a value to a container and another function that takes that value out. What would the difference be between:
scrape = queue.Queue()
example = scrape.get()
scrape = set()
example = scrape.pop()
Queues maintain ordering of possibly non-unique elements.
Sets, on the other hand, do not maintain ordering and may not contain duplicates.
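To make the ordering and duplicate behaviour concrete, here is a small single-threaded sketch. The `items` list is a made-up stand-in for scraped values and is not from the question.

```python
import queue

# Hypothetical scraped values; "a" appears twice on purpose.
items = ["a", "b", "a", "c"]

# A Queue hands items back in insertion (FIFO) order, duplicates included.
q = queue.Queue()
for item in items:
    q.put(item)
from_queue = [q.get() for _ in range(q.qsize())]

# A set silently drops the duplicate, and pop() returns an arbitrary element.
s = set(items)
from_set = [s.pop() for _ in range(len(s))]

print(from_queue)        # same order as items, with both "a" entries
print(sorted(from_set))  # three unique values; sorted only for display
```

The set result is sorted only so the output is stable; the order in which `pop()` returns elements should not be relied on.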
In your case, you may need to keep a record of each thing scraped and/or the relative order in which it was scraped. In that case, use queues. If you just want the unique things you scraped, and you don't care about the relative order in which you scraped them, use sets.
As @mata points out, a queue should be used if multiple threads are producing and consuming from it.
Queues implement the blocking functionality needed to work with producer/consumer patterns.
Queues are thread-safe; sets are not, and they offer no way to wait for an item to arrive.
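A rough illustration of the blocking difference, as a sketch only: the `consumer` function, the `"page-1"` item and the 0.1 second sleep are all invented for the demo.

```python
import queue
import threading
import time

q = queue.Queue()
received = []

def consumer():
    # Blocks here until a producer puts something; no polling loop needed.
    received.append(q.get())

t = threading.Thread(target=consumer)
t.start()
time.sleep(0.1)     # give the consumer time to start waiting in q.get()
q.put("page-1")     # hypothetical item; this wakes the consumer up
t.join()

# A set has no way to wait: pop() on an empty set fails immediately.
try:
    set().pop()
    set_outcome = "got an item"
except KeyError:
    set_outcome = "KeyError"

print(received, set_outcome)
```

With a set you would have to add your own lock and some retry or signalling logic to get the same behaviour, which is essentially what `queue.Queue` already does for you.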
In this example from the docs:

    def worker():
        while True:
            item = q.get()
            do_work(item)
            q.task_done()

    q = Queue()
    for i in range(num_worker_threads):
        t = Thread(target=worker)
        t.daemon = True
        t.start()

    for item in source():
        q.put(item)

    q.join()  # block until all tasks are done

get in the consumer thread (i.e. worker) blocks until there is something in the queue to get, join in the producer thread blocks until each item that it put into the queue is consumed, and task_done in the consumer thread tells the queue that the item it got has been consumed.
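The docs snippet leaves `do_work`, `source` and `num_worker_threads` undefined, so here is one way it might be filled in to run as-is. The doubling in `do_work`, the `range(10)` input and the worker count of 3 are placeholders I picked for the demo, not part of the docs example.

```python
from queue import Queue
from threading import Lock, Thread

NUM_WORKER_THREADS = 3   # assumed value, stands in for num_worker_threads
results = []
results_lock = Lock()    # a plain list shared across threads gets its own lock

def do_work(item):
    # Hypothetical work: double the item and record the result.
    with results_lock:
        results.append(item * 2)

def worker(q):
    while True:
        item = q.get()       # blocks until an item is available
        try:
            do_work(item)
        finally:
            q.task_done()    # tell the queue this item is finished

q = Queue()
for _ in range(NUM_WORKER_THREADS):
    # Daemon threads so the endless worker loops don't keep the program alive.
    Thread(target=worker, args=(q,), daemon=True).start()

for item in range(10):       # stands in for source()
    q.put(item)

q.join()                     # returns once task_done() was called for every item
print(sorted(results))
```

Calling `task_done()` in a `finally` block is a small addition to the docs version: if `do_work` raises, the count still goes down, so `q.join()` does not hang forever.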