user3793935 user3793935 - 3 months ago 26
MySQL Question

Java - Having a huge ArrayList (1 million +), how to create a String of it in a acceptable amount of time?

as I mentioned, I have huge ArrayLists in this format:

List<List<String>> alist;


I get the Lists out of some .CSV wich I get over a extern databasesystem. (I wrote a Visual Objects script to export the data we need for some calculations)

After I have the .CSV, I load the content into my List like this:

String line;
alist = new ArrayList<List<String>>();
int i=0;
// Datei laden, und anschlie├čend die Zeilen der CSV in eine ArrayList speichern

try {

br = new BufferedReader(new FileReader(path));
while((line = br.readLine()) != null)
{
line = line.replace(",", ".");
if(line.endsWith(delimitter))
line = line + " ";
alist.add(Arrays.asList(line.split(delimitter)));
i++;
if(i==10000000)
break;
}

}


It takes me like ~15900 ms to store the Data in my ArrayList (~1,1 million rows and 11 columns). Pretty okay I guess.
Now, that I have the List in the format I need, I would like to create a Insert out of it, so we can import the data in our database.
I'm creating the insert like this:

public String getInsertString()
{
// Tabelle ergibt sich aus dem Dateinamen, ohne das .csv
String insert="REPLACE INTO "+tablename + " (";

// Spaltennamen aus Array auslesen, immer die erste Zeile des CSV
for(int i=0; i< alist.get(0).size();i++)
{
if(i==0)
insert = insert + alist.get(0).get(i).trim();
else
insert = insert + " ,"+ alist.get(0).get(i).trim();
}
insert= insert + ") \rVALUES";

// Values der SPalten in den Insert schreiben + korreckte Syntax des Bfehels sicherstellen usw.
for(int i=1;i < alist.size();i++) // Size nach "unten" in der 2D Liste
{

insert= insert +"(";
for(int j=0; j < alist.get(0).size();j++) // Size nach "rechts" in der 2D Liste
{
// bei dem ersten ohne "," starten, damit die Syntax stimmt
// Sollte der aktuelle Wert eine Zahl oder "null" sein, keine "'" setzen. Ansonsten "'" setzen fuer den Insert in die DB
if(j==0)
{
if((StringUtils.isStrictlyNumeric(alist.get(i).get(j).trim())) || alist.get(i).get(j).trim().contains("null"))
insert = insert + alist.get(i).get(j).trim();
else
insert = insert + "'" + alist.get(i).get(j).trim() + "'";
}
else
{
if(((StringUtils.isStrictlyNumeric(alist.get(i).get(j).trim()))) || (alist.get(i).get(j).trim().contains("null")))
insert = insert +","+ alist.get(i).get(j).trim();
else
insert = insert + ",'" + alist.get(i).get(j).trim() + "'";
}

}
if(i < alist.size()-1)
insert= insert +"),";
else
insert= insert +")";
insert = insert +"\r";

}

//System.out.println(insert);
return insert;

}


Here I go over the whole List and add the values to a string so that I can use the string for a insert.
I use the filenames as table names and the first row of a file for the columns in the insert.
All other rows are the values.

After this step is done, I get a string like this
"REPLACE INTO tablename (column1,column2 ... columnx)
VALUES(value1, value2 ... valuex),
(value1, value2 ... valuex),
...."

Now I would excecute my InsertInDb class with that string and yeah, thats it.

But the second step takes way too much time. (I wait for like a hour now)
Is there a smarter way to do what I want to do?
(Insert all CSV automaticly in our Database)

Would a BigList maybe improve the speed?
https://dzone.com/articles/biglist-scalable-high
(can't test it right now)

Answer
Now, that I have the List in the format I need, I would like to create a Insert out of it, so we can import the data in our database.

Don't do that! Do a JDBC batch insert instead. (See this on how to do a PreparedStatement which will save you a lot of trouble on formatting the insert statements)

Comments