silverkid silverkid - 7 months ago 15
Java Question

why does mysql inner join query take so much time

in mysql i have two tables


col1 col2 SIM1 ..........col24
a x 1 5
b y 1 3
c z 0 2
d g 2 1


colA colB SIM2
x g 1
y f 0
x s 0
y e 2

Actually the number of records in the two tables in 0.4 million

i have a java program from which i am executing sql query using jdbc.

here is the query

INNER JOIN TableB ON TableA.SIM1 = TableB.SIM2
INTO OUTFILE 'c:/test12226.csv' "+

This query is taking a really long time. for my application to be feasible this should not take more than 30 seconds. i understand the records are 0.4 million but such an operation in ms access takes less than 10 seconds. is java-mysql combination more time consuming than ms-access

i have allocated 1GB ram in debug configuration. please suggest.


My guess is that one or both of TableA.SIM1 and TableB.SIM2 aren't indexed. Either that or they're different data types (eg VARCHAR and NUMERIC). Try:

CREATE INDEX index_name1 ON TableA (SIM1);
CREATE INDEX index_name2 ON TableB (SIM2);

Without indexes that query will be really slow. One table will be accessed record by record, which is fine since you're outputting the whole table. To find the corresponding record in the other table it needs to look up based on the SIM1 = SIM2 relationship.

To find records in the other table without an index it has to look through every record. This is a linear or O(n) lookup. Put half a million records in each table and that's an awful lot of comparisons required to find all the matches (billions in facts).

With the indexes the record matching is near-instant.

Think of it this way: indexing the columns is like putting a telephone book in alphabetical order. That makes it easy to find surnames. If the telephone book wasn't sorted at all how long would it take you to find someone's phone number?

Now multiply that by half a million.