Given the simple code below:
cur.execute("SELECT * FROM myTable")
for row in cur.fetchall():
    process(row)
Because the Python GIL limits parallelism in threads, you are likely better off using subprocesses instead of threads. You can use a pool that limits the number of concurrent processes so that you don't over-commit your server; the multiprocessing module should do the trick. The cursor is already an iterator, and if you feed it to the pool's imap with a small chunksize, rows are only pulled from the database as the workers need them (plain map would materialize the entire result set up front before dispatching any work). I don't know the size of your rows or the processing time involved, so I picked a modest chunksize and the default pool size, which is the number of cores on your machine.

import multiprocessing as mp

pool = mp.Pool()
cur.execute("SELECT * FROM myTable")
for result in pool.imap(process, cur, chunksize=5):
    print(result)
pool.close()
pool.join()
You can experiment with a thread pool instead by substituting

from multiprocessing.pool import ThreadPool

pool = ThreadPool()

ThreadPool has the same interface, and since the workers are threads in the same process there is no pickling overhead, but the GIL will then limit how much CPU-bound work actually runs in parallel.
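For a runnable end-to-end sketch of the pattern, here is the same loop against an in-memory sqlite3 database (the table, sample data, and process function are invented for the demo; I use the ThreadPool variant so it runs without pickling concerns, and mp.Pool is a drop-in swap). One subtlety: the pool consumes the cursor from an internal handler thread, so sqlite3 needs check_same_thread=False.

```python
import sqlite3
from multiprocessing.pool import ThreadPool

def process(row):
    # Placeholder for the real per-row work: square the value column.
    return row[1] ** 2

def run():
    # check_same_thread=False because the pool's task-handler thread,
    # not the main thread, iterates the cursor.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    cur = conn.cursor()
    cur.execute("CREATE TABLE myTable (id INTEGER, value INTEGER)")
    cur.executemany("INSERT INTO myTable VALUES (?, ?)",
                    [(i, i) for i in range(10)])
    cur.execute("SELECT * FROM myTable")
    with ThreadPool() as pool:
        # imap pulls rows from the cursor lazily, chunksize at a time.
        return list(pool.imap(process, cur, chunksize=5))

if __name__ == "__main__":
    print(run())  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```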