When looking into DB storage engines, it seems most use mmap to persist. However, is there a situation where writing to a cache layer and writing binary to disk using read and write makes sense?
What I'm trying to understand is the difference between mmap/munmap and read/write, and when to use one or the other.
If you can feasibly use mmap(), it's usually the better way. When you use read()/write(), you have to perform a system call for every operation (although libraries like stdio minimize this with user-mode buffering), and these context switches are expensive. Even if the file block is in the buffer cache, you have to first switch into the kernel to check for it. Additionally, the kernel needs to copy the data from the kernel buffer to the caller's memory.
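To make the per-call cost concrete, here is a minimal sketch of the read() pattern, assuming a POSIX system. The helper name sum_bytes_read and the 4096-byte buffer size are illustrative choices, not anything from the question; error handling is deliberately thin (for example, EINTR is not retried).

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: sums every byte of the file at `path` using read().
 * Returns 0 and stores the sum in *out on success, -1 on error. */
int sum_bytes_read(const char *path, uint64_t *out)
{
    unsigned char buf[4096];   /* user-space buffer the kernel copies into */
    uint64_t sum = 0;
    ssize_t n;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Every iteration is one system call. Even when the data is already
     * cached by the kernel, it is copied into buf before we can use it. */
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; i++)
            sum += buf[i];
    }

    close(fd);
    if (n < 0)                 /* read() failed part-way through */
        return -1;

    *out = sum;
    return 0;
}
```

A larger buffer means fewer system calls for the same file, which is essentially what stdio's buffering does for you.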
On the other hand, when you use mmap(), you only have to perform a system call when you first open and map the file. From then on, the virtual memory subsystem keeps the application memory synchronized with the file contents. Context switches are only necessary when you try to access a file block that hasn't yet been paged in from disk, not for each part of the file you try to read or write. When you modify the mapped memory, it gets written back to the file lazily.
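As a rough illustration of the mapped approach, the sketch below overwrites a file's contents through a shared mapping, again assuming a POSIX system. The helper name fill_file_mmap is hypothetical. The msync() call is included only to force the write-back at a known point; without it the kernel still flushes the dirty pages eventually, just at a time of its choosing.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: sets every byte of an existing file to `value`
 * by writing to a shared mapping. Returns 0 on success, -1 on error. */
int fill_file_mmap(const char *path, unsigned char value)
{
    struct stat st;

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {     /* mmap() rejects a zero-length mapping */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                 /* the mapping stays valid after close() */
    if (addr == MAP_FAILED)
        return -1;

    /* Plain memory writes: no system call per access, only page faults
     * for pages that are not yet resident. */
    memset(addr, value, len);

    /* Optional: block until the dirty pages have been written back. */
    int rc = msync(addr, len, MS_SYNC);

    if (munmap(addr, len) < 0)
        rc = -1;
    return rc;
}
```

Note that a mapping cannot grow a file: to append, you would first extend the file (for example with ftruncate()) and then map the new range, which is one reason read()/write() can fit some access patterns more naturally.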
For most practical applications, you should use whichever method fits the logic of the application best. The performance difference between the two methods will only be significant in highly time-critical applications. When implementing a library, you can't tell the needs of client applications, so of course you try to wring every bit of performance out of it. But for many other applications, premature optimization is the root of all evil.