Bry Guy Bry Guy - 9 days ago 6
Java Question

Java sqlite performing thousands of queries

I have a java program where I perform thousands of queries on a SQLite database inside a loop. If the query shows up empty, then I insert the row. If the query has a result, I ignore. I generally perform these individual queries in batches of 1000, but in the end there will be some hundreds of thousands of total queries to finish this task.

Because I have thousands of lines that I'm checking with individual queries, this portion of the program runs very slowly.

Is there a more efficient way to perform this many queries?

Here is the loop that constantly pulls the original data from excel docs until all the information is read:

for(int i =0;i < batchSize;i++){
try {
String[] rowReader=(dataRows.get(i));
archiveID=rowReader[16];
DIVA = rowReader[41];

//Check if already in DB. If it is not, then adds to a batch
System.out.println("checking db");
if(!isInDB(conn, archiveID, DIVA)){
stmt.setString(1,archiveID);
stmt.setString(2,DIVA);
stmt.setString(3,docName);
stmt.addBatch();
}

}catch (IndexOutOfBoundsException ex){
endOfDoc = true;
}

//dump to database every batchSize
if(++count % batchSize == 0) {
//System.out.println("executing batch");
stmt.executeBatch();
conn.commit();
count=0;
}
}


Here is the actual query method:

//returns false if combo is not in All Records, returns true if there
public static boolean isInDB(Connection conn, String archiveID, String DIVA) throws SQLException {
Connection c = conn;
Statement stmt = null;
try {
Class.forName("org.sqlite.JDBC");

stmt = c.createStatement();
ResultSet rs = stmt.executeQuery( "SELECT * FROM AllRecords WHERE ArchiveID=\"" + archiveID +"\" AND DivaCat=\""+DIVA +"\"" );
if ( rs.next() ) {
return true;
}else{
System.out.println(archiveID+DIVA+" is not in DB");
rs.close();
stmt.close();
return false;

}
} catch ( Exception e ) {
System.err.println( e.getClass().getName() + ": " + e.getMessage() );
System.exit(0);
}
return false;

}


Thanks!

CL. CL.
Answer

Without any index, finding the desired row(s) requires that the database goes through the entire table, for each query execution.

You can optimize the lookup in this particular query by indexing both lookup columns:

CREATE INDEX whatever ON AllRecords(ArchiveID, DivaCat);