kaledev kaledev - 8 months ago 36
SQL Question

INSERT that runs infinitely in MySQL

I have an odd problem that I can't figure out. I'm not much of a MySQL guy (more of a SQL Server person), and I have an INSERT statement that is running (seemingly) forever.

INSERT INTO voter_registration_v2.temp_address_map (person_id, address_id)
select person_id, address_id from
voter_registration.voters v
inner join voter_registration_v2.address a
on v.house_num = a.house_num
and v.half_code = a.half_code
and v.street_dir = a.street_dir
and v.street_name = a.street_name
and v.street_type_cd = a.street_type_cd
and v.street_sufx_cd = a.street_sufx_cd
and v.unit_designator = a.unit_designator
and v.unit_num = a.unit_num
and v.res_city_desc = a.res_city_desc
and v.state_cd = a.state_cd
and v.zip_code = a.zip_code;

The SELECT itself runs in 20s, 16s to fetch. When I run with the INSERT I've timed out at 6000s. All tables are using the MyISAM Engine. I attempted InnoDB originally but it didn't make a difference. It definitely is a large insert - about 600k records. Below is the CREATE for the temp table.

CREATE TABLE temp_address_map (
person_id int PRIMARY KEY,
address_id INT

However, even with 600k - I can't imagine an INSERT taking 100+ minutes if the SELECT only takes ~30s. Appreciate any suggestions.

I have noticed odd problems with my local installation of MySQL anyway. Some SELECT statements that take .5 seconds or less randomly began running forever as well. The ONLY way I could fix the problem was to uninstall and reinstall the server. I must have run through 100 suggestions on forums before I gave up. It's almost like MySQL gets progressively slower until it's unusable. (my RAM is around 48% used). Kind of odd, not sure that's what is going on here though...


You are correct. Under most circumstances, a select query that returns in 20s should not take hours for the insert. However, I would caution that you may be timing the select based on the "first row" that is returned. The insert doesn't return until all rows have been returned.

You have a very detailed on clause. I would suggest a composite index on all the columns used in the clause (starting from the most general to the least general):

create index idx_address_allkeys
    on address(state_cd, res_city_desc, zip, street_name, . . . );

In other words, I am guessing that your code is using a nested loop join, returning one row at time.