Tiller Tiller - 1 year ago 187
Java Question

Best way to merge binary files in Java

I'm developing a basic download manager that can download a file over http using multiple connections. At the end of the download, I have several temp file containing each a part of the file.

I now want to merge them into a single file.

It's not hard to do so, simply create an output stream and input streams and pipe the inputs into the output in the good order.

But I was wondering: is there a way to do it more efficiently? I mean, from my understanding what will happen here is that the JVM will read byte per byte the inputs, and write byte per byte the output.

So basically I have :
- read byte from disk
- store byte in memory
- some CPU instructions will probably run and the byte will probably be copied into the CPU's cache
- write byte to the disk

I was wondering if there was a way to keep the process on the disk? I don't know if I'm understandable, but basically to tell the disk "hey disk, take these files of yours and make one with them"

In a short sentence, I want to reduce the CPU & memory usage to the lowest possible.

Answer Source

In theory it may be possible to do this operation on a file system level: you could append the block list from one inode to another without moving the data. This is not very practical though, most likely you would have to bypass your operating system and access the disk directly.

The next best thing may be using FileChannel.transferTo or transferFrom methods:

This method is potentially much more efficient than a simple loop that reads from this channel and writes to the target channel. Many operating systems can transfer bytes directly from the filesystem cache to the target channel without actually copying them.

You should also test reading and writing large blocks of bytes using streams or RandomAccessFile - it may still be faster than using channels. Here's a good article about testing sequential IO performance in Java.