I've got 500K rows I want to insert into PostgreSQL using SQLAlchemy.
For speed, I'm inserting them using bulk_insert_mappings.
In my experience, you'll see substantial performance improvements if you use a multi-row `INSERT INTO tbl VALUES (...), (...), ...;` as opposed to `bulk_insert_mappings`, which uses `executemany` under the hood. In this case you'll want to batch the rows into statement-sized chunks, if only for sanity.
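As a minimal sketch of that approach (the connection URL, table definition, and batch size below are placeholders, not details from your question):

```python
from sqlalchemy import create_engine, MetaData, Table, Column, Integer, String, insert

# Placeholder engine URL and table; adjust for your schema.
engine = create_engine("postgresql+psycopg2://user:pass@localhost/mydb")
metadata = MetaData()
tbl = Table(
    "tbl", metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String),
)

# Stand-in for your 500K rows of data.
rows = [{"id": i, "name": f"row {i}"} for i in range(500_000)]
BATCH_SIZE = 10_000  # rows per statement; tune for your row width

with engine.begin() as conn:
    for start in range(0, len(rows), BATCH_SIZE):
        batch = rows[start:start + BATCH_SIZE]
        # .values(list_of_dicts) renders one multi-row
        # INSERT INTO tbl VALUES (...), (...), ... statement
        conn.execute(insert(tbl).values(batch))
```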
Committing between batches probably won't have much effect on performance; the reason to do it is to avoid holding a transaction open for too long, which could impact other transactions running on the server.
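If that matters in your setup, you can invert the loop from the sketch above so each batch gets its own transaction:

```python
# One transaction per batch: slightly more commit overhead,
# but no single long-running transaction.
for start in range(0, len(rows), BATCH_SIZE):
    with engine.begin() as conn:  # begins and commits per batch
        conn.execute(insert(tbl).values(rows[start:start + BATCH_SIZE]))
```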
You can also experiment with using `COPY` to load the data into a temporary table, then `INSERT`ing from that table into the target.
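A rough sketch of that route, assuming the psycopg2 driver (`copy_expert` is psycopg2-specific) and the same placeholder table and rows as above:

```python
import csv
import io

# Serialize the rows to CSV in memory; for 500K rows you may
# prefer streaming from a file on disk instead.
buf = io.StringIO()
writer = csv.writer(buf)
for row in rows:
    writer.writerow([row["id"], row["name"]])
buf.seek(0)

raw = engine.raw_connection()  # drop down to the DBAPI connection for COPY
try:
    with raw.cursor() as cur:
        # Temp table with the same shape as the target
        cur.execute("CREATE TEMP TABLE tmp_tbl (LIKE tbl INCLUDING DEFAULTS)")
        cur.copy_expert(
            "COPY tmp_tbl (id, name) FROM STDIN WITH (FORMAT csv)", buf
        )
        # Move everything into the real table in one statement
        cur.execute("INSERT INTO tbl SELECT * FROM tmp_tbl")
    raw.commit()
finally:
    raw.close()
```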