Mwesigye John Bosco - 1 year ago
SQL Question

MariaDB fails to execute a big query with too much data

I have two tables, customer and customer_risk_score. The customer table grows as new customers are recorded, and I insert customer data into customer_risk_score.

My scenario: the customer_risk_score table had about 245,890 entries and the customer table had grown to 699,780 records.

When I run this query:

INSERT INTO customer_risk_score (rim, client_code, calculated_on)
SELECT rim, client_type, Now()
FROM customer
WHERE rim NOT IN (SELECT rim FROM customer_risk_score)

via phpMyAdmin, at the point when customer_risk_score has about 245,890 records and the customer table has 699,780, the query runs forever with no termination or error message. I tested it, waited 3 hours, then terminated it manually. I even got to the point of killing all processes running on the DBMS via the terminal; executing SHOW PROCESSLIST confirmed it was the only process running.

When customer_risk_score is truncated and I rerun the query, it takes about 3 seconds to insert the relevant entries from the customer table into customer_risk_score.

This query is scheduled to run every two hours. My question is: what causes that delay? Is it a bug in my MariaDB version, or is my query designed the wrong way? I am using MariaDB version @@version.

Thanks in advance!

Answer Source

I would change the NOT IN subquery into a LEFT JOIN ... IS NULL anti-join, and I would make sure that the rim columns are indexed in both tables:

INSERT INTO customer_risk_score (rim, client_code, calculated_on)
  SELECT c.rim, c.client_type, Now()
  FROM customer c
  LEFT JOIN customer_risk_score crs ON c.rim = crs.rim
  WHERE crs.rim IS NULL

The NOT IN subquery in your original query becomes more and more expensive as records accumulate in the customer_risk_score table, because MariaDB may end up evaluating it once per row of customer. The indexed anti-join in my proposed solution avoids that per-row scan: note that the IS NULL test must go in the WHERE clause, not the ON clause, so that only customers with no matching risk-score row are inserted.
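For the indexes, assuming rim has an indexable type (e.g. INT or a reasonably short VARCHAR) and no index exists yet, something along these lines should work (the index names here are just illustrative):

  -- speed up the join lookup on both sides
  CREATE INDEX idx_customer_rim ON customer (rim);
  CREATE INDEX idx_customer_risk_score_rim ON customer_risk_score (rim);

You can verify the indexes are actually used by prefixing the SELECT part of the query with EXPLAIN.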

Another solution would be to use a not exists operator with subquery instead of the not in.
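A sketch of that NOT EXISTS variant, using the same columns as above; with rim indexed in customer_risk_score, MariaDB can typically turn this into an efficient per-row index lookup:

  INSERT INTO customer_risk_score (rim, client_code, calculated_on)
  SELECT c.rim, c.client_type, Now()
  FROM customer c
  WHERE NOT EXISTS (
    SELECT 1 FROM customer_risk_score crs WHERE crs.rim = c.rim
  )

NOT EXISTS also sidesteps a known pitfall of NOT IN: if the subquery ever returns a NULL rim, NOT IN matches no rows at all, whereas NOT EXISTS behaves as expected.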