I wrote a program that compacts two small files into a single, bigger file. It first reads data from the input files, merges the data, and writes the output to a temp file. Once this completes, I rename the temp file to the desired file name (located on the same partition on disk). Here is pseudo code:
FILE* fp_1 = fopen("file_1.dat", "rb");
FILE* fp_2 = fopen("file_2.dat", "rb");
FILE* fp_out = fopen("file_tmp.dat", "wb");
// 1. Read the data for the key from the two input files
const char* data_1 = ...;
const char* data_2 = ...;
// 2. Merge the data into an allocated buffer
// 3. Write the merged buffer to the temp file
fwrite(temp_buff, estimated_size, 1, fp_out);
fflush(fp_out);
fclose(fp_out);
// Now rename the temp file to the desired file name,
// then remove the source files
if (std::rename("file_tmp.dat", "file_out.dat") == 0) {
    std::remove("file_1.dat");
    std::remove("file_2.dat");
}

Is the merged data guaranteed to be on disk once the rename returns successfully?
Not in the general case. The disk can lie to the OS, claiming the write finished when it's really just queued in the hard drive's onboard RAM cache, which will be lost on abrupt power loss.
The best you can do is explicitly ask the OS to tell the disk to "really, really sync everything" after you've performed the fflush: either with limited scope via fsync, or more broadly with sync or syncfs (the former syncs all file systems, the latter limits the scope to the file system containing a given file descriptor). You'd want to do a targeted fsync after the final fflush but before the rename, and/or a broader sync/syncfs after the rename but before the remove calls, so the data and the file system tables are definitely updated before you delete the source files.
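The ordering described above can be sketched as follows. This is a minimal sketch, not the asker's actual program: the merge step is elided and temp_buff is placeholder data; syncfs is Linux-specific (sync is the portable, whole-system fallback), and the "." directory descriptor used as the syncfs handle is an assumption about where the files live.

```cpp
#include <cstdio>
#include <cstddef>
#include <fcntl.h>   // open
#include <unistd.h>  // fsync, syncfs, close

// Write buffer to tmp_name, then: fflush -> fsync -> rename -> syncfs.
// Follows the write/flush/sync/rename ordering recommended in the answer.
bool write_and_replace(const char* tmp_name, const char* out_name,
                       const void* temp_buff, std::size_t size) {
    FILE* fp_out = std::fopen(tmp_name, "wb");
    if (!fp_out) return false;
    if (std::fwrite(temp_buff, size, 1, fp_out) != 1) { std::fclose(fp_out); return false; }
    // Push stdio's userspace buffers down to the kernel...
    if (std::fflush(fp_out) != 0) { std::fclose(fp_out); return false; }
    // ...then ask the kernel to push this one file's data to the disk.
    if (fsync(fileno(fp_out)) != 0) { std::fclose(fp_out); return false; }
    std::fclose(fp_out);
    // Only rename once the data is (as far as the OS knows) on disk.
    if (std::rename(tmp_name, out_name) != 0) return false;
    // Broader sync before the caller deletes the source files, so the
    // rename's file system metadata lands too. Any fd on the target
    // file system works as the syncfs handle; "." is assumed here.
    int fd = open(".", O_RDONLY);
    if (fd >= 0) { syncfs(fd); close(fd); }
    return true;
}
```

After this returns true, the remove calls for the source files can run; the fsync/syncfs pair is still only a request to the OS, subject to the disk-controller caveat above.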
Of course, like I said, this is all best effort; if the disk controller is lying to the OS, there is nothing you can do short of writing new firmware and drivers for the disk, which is probably going too far.