Kayser Kayser - 22 days ago 5
Java Question

How do I sort very large files

I have some files that need to be sorted by the id at the beginning of each line. The files are about 2-3 GB each. I tried reading all the data into an ArrayList and sorting it, but there is not enough memory to hold everything, so that does not work.


Lines look like:

0052304 0000004000000000000000000000000000000041 John Teddy 000023
0022024 0000004000000000000000000000000000000041 George Clan 00013


How can I sort the files?

Answer

That isn't really a Java problem. You need an efficient algorithm for sorting data that is never read into memory all at once. A few adaptations to merge sort can achieve this.

Take a look at this: http://en.wikipedia.org/wiki/Merge_sort

and: http://en.wikipedia.org/wiki/External_sorting

Basically, the idea is to break the file into smaller pieces that each fit in memory, sort each piece (with merge sort or any other method) and write it to a temporary file, and then use the merge step of merge sort to combine the sorted pieces into the new, sorted file.
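As a rough illustration of that idea, here is a minimal sketch in Java. It is not a tuned implementation, and it rests on a few assumptions that are not in the question: the id is everything before the first space and is zero-padded to a fixed width (so plain string comparison matches numeric order), the file is UTF-8 text, and the number of chunk files is small enough to keep them all open at once. The names ExternalSort, idOf and maxLinesPerChunk are made up for this example.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper class: a sketch of external merge sort, not a library API.
final class ExternalSort {

    // Assumption: the id is the text before the first space and is zero-padded,
    // so comparing ids as strings gives the same order as comparing them as numbers.
    static String idOf(String line) {
        int space = line.indexOf(' ');
        return space < 0 ? line : line.substring(0, space);
    }

    static final Comparator<String> BY_ID = Comparator.comparing(ExternalSort::idOf);

    // One open chunk file plus the line currently at its front.
    private static final class Run {
        final BufferedReader reader;
        String head;

        Run(BufferedReader reader) throws IOException {
            this.reader = reader;
            this.head = reader.readLine();
        }
    }

    // Sorts input into output while holding at most maxLinesPerChunk lines in memory.
    static void sort(Path input, Path output, int maxLinesPerChunk) throws IOException {
        List<Path> chunks = new ArrayList<>();
        try {
            splitIntoSortedChunks(input, maxLinesPerChunk, chunks);
            mergeChunks(chunks, output);
        } finally {
            for (Path chunk : chunks) {
                Files.deleteIfExists(chunk); // clean up the temporary chunk files
            }
        }
    }

    // Phase 1: read the input piece by piece, sort each piece, write it to a temp file.
    static void splitIntoSortedChunks(Path input, int maxLines, List<Path> chunks) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            List<String> buffer = new ArrayList<>();
            String line;
            while ((line = in.readLine()) != null) {
                buffer.add(line);
                if (buffer.size() >= maxLines) {
                    writeChunk(buffer, chunks);
                    buffer.clear();
                }
            }
            if (!buffer.isEmpty()) {
                writeChunk(buffer, chunks);
            }
        }
    }

    static void writeChunk(List<String> buffer, List<Path> chunks) throws IOException {
        buffer.sort(BY_ID);
        Path chunk = Files.createTempFile("sort-chunk-", ".txt");
        chunks.add(chunk);
        Files.write(chunk, buffer, StandardCharsets.UTF_8);
    }

    // Phase 2: k-way merge. The queue always holds the front line of every chunk,
    // so the smallest remaining line overall is the one at the head of the queue.
    static void mergeChunks(List<Path> chunks, Path output) throws IOException {
        PriorityQueue<Run> queue = new PriorityQueue<>(Comparator.comparing((Run r) -> r.head, BY_ID));
        List<BufferedReader> readers = new ArrayList<>();
        try (BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            for (Path chunk : chunks) {
                BufferedReader reader = Files.newBufferedReader(chunk, StandardCharsets.UTF_8);
                readers.add(reader);
                Run run = new Run(reader);
                if (run.head != null) {
                    queue.add(run);
                }
            }
            while (!queue.isEmpty()) {
                Run run = queue.poll();
                out.write(run.head);
                out.newLine();
                run.head = run.reader.readLine(); // advance this chunk by one line
                if (run.head != null) {
                    queue.add(run);
                }
            }
        } finally {
            for (BufferedReader reader : readers) {
                reader.close();
            }
        }
    }
}
```

A call such as ExternalSort.sort(Paths.get("input.txt"), Paths.get("sorted.txt"), 1_000_000) would be one way to use it. The file names and the chunk size here are placeholders, and the chunk size would need to be chosen to fit the available heap. Lines with equal ids may come out in a different relative order than in the input.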
