Kayser Kayser - 3 months ago 21
Java Question

How do I sort very large files?

I have some files that need to be sorted by the id at the beginning of each line.
The files are about 2-3 GB. I tried to read all the data into an in-memory collection
and sort it, but there is not enough memory to hold everything, so that does not work.

The lines look like this:
0052304 0000004000000000000000000000000000000041 John Teddy 000023

0022024 0000004000000000000000000000000000000041 George Clan 00013

How can I sort the files?


That isn't really a Java-specific problem. You need an algorithm that can sort data without reading all of it into memory at once. A few adaptations to merge sort can achieve this.

Take a look at this: http://en.wikipedia.org/wiki/Merge_sort

and: http://en.wikipedia.org/wiki/External_sorting

The basic idea is to break the file into pieces small enough to fit in memory, sort each piece (with merge sort or any other method) and write it back to disk, and then use the merge step from merge sort to combine the sorted pieces into the new, sorted file.
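
To make the split, sort, and merge idea concrete, here is a minimal sketch in Java. It is one possible way to do it, not a tuned implementation. The class name `ExternalSort`, the method names, and the `maxLinesPerChunk` parameter are made up for illustration. It assumes the ids are zero-padded to a fixed width (as in the sample lines), so that comparing whole lines as strings gives the same order as comparing ids, and it assumes the files are UTF-8 text.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

public class ExternalSort {

    /** One open chunk file plus the line currently at its head. */
    private static final class ChunkCursor implements Comparable<ChunkCursor> {
        final BufferedReader reader;
        String line;

        ChunkCursor(BufferedReader reader) throws IOException {
            this.reader = reader;
            this.line = reader.readLine();
        }

        /** Moves to the next line; returns false when the chunk is exhausted. */
        boolean advance() throws IOException {
            line = reader.readLine();
            return line != null;
        }

        @Override
        public int compareTo(ChunkCursor other) {
            // Assumption: fixed-width, zero-padded ids, so string order == id order.
            return line.compareTo(other.line);
        }
    }

    /** Sorts the lines of input into output, holding at most maxLinesPerChunk lines in memory. */
    public static void sort(Path input, Path output, int maxLinesPerChunk) throws IOException {
        List<Path> chunks = new ArrayList<>();
        try {
            // Phase 1: read the input in pieces, sort each piece, write it to a temp file.
            try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
                List<String> buffer = new ArrayList<>();
                String line;
                while ((line = in.readLine()) != null) {
                    buffer.add(line);
                    if (buffer.size() >= maxLinesPerChunk) {
                        chunks.add(writeSortedChunk(buffer));
                        buffer.clear();
                    }
                }
                if (!buffer.isEmpty()) {
                    chunks.add(writeSortedChunk(buffer));
                }
            }
            // Phase 2: merge all sorted chunks into the output file.
            mergeChunks(chunks, output);
        } finally {
            for (Path chunk : chunks) {
                Files.deleteIfExists(chunk);
            }
        }
    }

    private static Path writeSortedChunk(List<String> lines) throws IOException {
        Collections.sort(lines);
        Path chunk = Files.createTempFile("sort-chunk-", ".txt");
        Files.write(chunk, lines, StandardCharsets.UTF_8);
        return chunk;
    }

    private static void mergeChunks(List<Path> chunks, Path output) throws IOException {
        // The heap always exposes the chunk whose current line is smallest.
        PriorityQueue<ChunkCursor> heap = new PriorityQueue<>();
        List<BufferedReader> readers = new ArrayList<>();
        try (BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            for (Path chunk : chunks) {
                BufferedReader reader = Files.newBufferedReader(chunk, StandardCharsets.UTF_8);
                readers.add(reader);
                ChunkCursor cursor = new ChunkCursor(reader);
                if (cursor.line != null) {
                    heap.add(cursor);
                }
            }
            while (!heap.isEmpty()) {
                ChunkCursor smallest = heap.poll();
                out.write(smallest.line);
                out.newLine();
                if (smallest.advance()) {
                    heap.add(smallest); // re-insert with its new head line
                }
            }
        } finally {
            for (BufferedReader reader : readers) {
                reader.close();
            }
        }
    }
}
```

You would call it as something like `ExternalSort.sort(Paths.get("big.txt"), Paths.get("big-sorted.txt"), 1_000_000)`, picking a chunk size that fits your heap. All chunk files are open at once during the merge, so with a very small chunk size you could run into the operating system's open-file limit. In that case, merge the chunks in several passes.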