I have 2 tables one is called raw_data which has 150m rows, the other one is called income which also has about 150m rows. I am trying to run the following query and it seems to get stuck after couple of hours. In raw_data, MemberID and SiteID have indexes, in income, PersonID and SiteID have indexes, both table's storage engine is MyISAM since we are using %99 select queries.
UPDATE `income` `t1`
INNER JOIN `raw_data` `t2`
ON `t1`.`PersonID` = `t2`.`MemberID` AND `t1`.`SiteID` = `t2`.`SiteID`
SET `t1`.`Age` = `t2`.`Age`,
`t1`.`Gender` = `t2`.`Gender`,
`t1`.`Sport` = `t2`.`Sport`,
`t1`.`PersonType` = `t2`.`PersonType`,
`t1`.`BookingType` = `t2`.`BookingType`,
`t1`.`TemplateName` = `t2`.`TemplateName`
MyISAM locks table until finished.
The query would run a lot faster with 'Composite' indexes (I'm not sure which one is preferred, may as well add both.):
income: INDEX(PersonID, SiteID) -- in either order raw_data: INDEX(MemberID, SiteID) -- in either order
Note: Separate indexes on the two columns (which it sounds like you have) will not be as good.
You should also Switch to InnoDB . 99% Selects is not a sufficient excuse for using the antiquated, and soon to be removed, MyISAM.
Why copy data from one table to another? Why not simply
JOIN the two tables whenever you need it. Or, phrased another way, "redundant data is a no-no".
Here is advice on 'chunking' an
UPDATE to make it more civilized.