For example, I have 88 million rows in my SQL table and I want to import them into pandas. I query one million rows at a time; how can I avoid hardcoding the process?
Here's my previous code
%%time
import mysql.connector as sql
import pandas as pd

db_connection = sql.connect(host='an aws link', database='p_v1', user='sci', password='tRaF')
db_cursor = db_connection.cursor()
db_cursor.execute('SELECT * FROM transactions t limit 10')
table_rows = db_cursor.fetchall()
df1 = pd.read_sql('SELECT order_id, title, qty, amount FROM transactions t where order_id <= 1000000', con=db_connection)
%%time
df2 = pd.read_sql('SELECT order_id, title, qty, amount FROM transactions t where order_id > 1000000 and order_id <= 2000000', con=db_connection)
frames = [df1, df2, ..., df88]  # shortened here to make it easier to read
result = pd.concat(frames)
Because I can't comment on the question, here are the steps as an answer.
I'm quite sure any popular library has a ready-to-go tool for this case.
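For instance, pandas itself can page through a query with the chunksize argument of read_sql. A minimal sketch, reusing the connection and column names from your question (the chunk size of 1,000,000 is just an assumption):

import mysql.connector as sql
import pandas as pd

db_connection = sql.connect(host='an aws link', database='p_v1', user='sci', password='tRaF')

# chunksize makes read_sql return an iterator of DataFrames
# instead of loading all 88 million rows in one go
chunks = pd.read_sql(
    'SELECT order_id, title, qty, amount FROM transactions',
    con=db_connection,
    chunksize=1_000_000,
)
result = pd.concat(chunks, ignore_index=True)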
You need to execute an algorithm; in this case I can think of something like the loop sketched below, with %%time in the first part of the cell. Also note that the problem may be in your MySQL connection, not a code problem.
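A minimal sketch of that loop, assuming order_id runs from 1 to roughly 88,000,000 without large gaps (otherwise derive the upper bound with SELECT MAX(order_id) first); the chunk size and total below are placeholders:

%%time
import mysql.connector as sql
import pandas as pd

db_connection = sql.connect(host='an aws link', database='p_v1', user='sci', password='tRaF')

chunk_size = 1_000_000      # rows fetched per query (assumption)
total_rows = 88_000_000     # assumed highest order_id; check with SELECT MAX(order_id) if unsure

frames = []
for start in range(0, total_rows, chunk_size):
    # build the WHERE clause from the loop counter instead of hardcoding each range
    query = ('SELECT order_id, title, qty, amount FROM transactions '
             f'WHERE order_id > {start} AND order_id <= {start + chunk_size}')
    frames.append(pd.read_sql(query, con=db_connection))

result = pd.concat(frames, ignore_index=True)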