MCHagen4 - 1 month ago
C# Question

File structure to avoid rewrite of entire file

I am working on devising a custom binary file format that allows me to rewrite parts of the file without overwriting the entire file. The data is composed of "elements" that are of variable length. These elements can be deleted, inserted or modified within the file. Such modification may change the length of the element.

Here is what I am considering:


  • The file has two major parts: a header and a body.

  • The body is broken up into blocks of a predetermined size, each holding perhaps 100 elements on average. A new element is written to the first block with enough free space to contain it.

  • When an element is modified and its block still has enough space to hold it, the element is rewritten in place. Otherwise the element is deleted from its block (and that block is rewritten) and then inserted into another block that has the required space.

  • The header contains a pointer to the address of each block and its current size. Since this data is fixed width, I can change a single header entry without having to rewrite the whole header.
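
To make the fixed-width idea concrete, here is a minimal sketch of updating one header entry in place. The 12-byte entry layout (an 8-byte block offset followed by a 4-byte used-byte count) and the names `BlockEntry`, `HeaderTable`, and `tableStart` are assumptions for illustration only; nothing in the question fixes them.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical fixed-width header entry: where a block starts and how many bytes it uses.
public struct BlockEntry
{
    public long Offset;
    public int UsedBytes;

    public BlockEntry(long offset, int usedBytes)
    {
        Offset = offset;
        UsedBytes = usedBytes;
    }
}

public static class HeaderTable
{
    // Assumed layout: 8-byte offset + 4-byte size = 12 bytes per entry.
    public const int EntrySize = sizeof(long) + sizeof(int);

    // Overwrites only the 12 bytes belonging to entry `index`.
    public static void WriteEntry(Stream file, long tableStart, int index, BlockEntry entry)
    {
        file.Seek(tableStart + (long)index * EntrySize, SeekOrigin.Begin);
        using var writer = new BinaryWriter(file, Encoding.UTF8, leaveOpen: true);
        writer.Write(entry.Offset);
        writer.Write(entry.UsedBytes);
    }

    // Reads entry `index` back without touching its neighbours.
    public static BlockEntry ReadEntry(Stream file, long tableStart, int index)
    {
        file.Seek(tableStart + (long)index * EntrySize, SeekOrigin.Begin);
        using var reader = new BinaryReader(file, Encoding.UTF8, leaveOpen: true);
        long offset = reader.ReadInt64();
        int usedBytes = reader.ReadInt32();
        return new BlockEntry(offset, usedBytes);
    }
}
```

Because every entry has the same width, the position of entry `index` is plain arithmetic, so a change to one block's size costs a single 12-byte write no matter how large the header is.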



My question:
If I need to increase the size of the header, I would need to rewrite the whole body to create additional header space. If the header were a separate file from the body, I would not have this issue, but I don't like the idea of having two files. Is there any way to keep the header and body in one physical file, but have each capable of expanding independently of the other?

Answer

Your file format should support a list of headers that are linked with each other.

In the header, have a field "next" that points to the position of the next header in the file. If you need to add a header, add it at the end of the file, then write its position within the file into the "next" field of the previously last header.
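
A rough sketch of that linking, under some assumptions of my own: each header segment starts with an 8-byte "next" field followed by a fixed number of 12-byte entries, the first segment sits at offset 0, and a "next" value of 0 means "no further segment". The names `HeaderChain`, `AppendSegment`, and `SegmentOffsets` are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class HeaderChain
{
    public const int EntriesPerSegment = 64;  // assumed capacity of one header segment
    public const int EntrySize = 12;          // assumed fixed-width entry size
    public const int SegmentSize = sizeof(long) + EntriesPerSegment * EntrySize;
    public const long NoNext = 0;             // valid sentinel because the first segment is at offset 0

    // Appends an empty segment at the end of the file and links it from the last one.
    public static long AppendSegment(Stream file, long lastSegmentOffset)
    {
        long newOffset = file.Seek(0, SeekOrigin.End);
        file.Write(new byte[SegmentSize], 0, SegmentSize); // zeroed, so its "next" is NoNext

        // Patch the "next" field of the previously last segment.
        file.Seek(lastSegmentOffset, SeekOrigin.Begin);
        using var writer = new BinaryWriter(file, Encoding.UTF8, leaveOpen: true);
        writer.Write(newOffset);
        return newOffset;
    }

    // Follows the "next" fields from the first segment and returns every segment offset.
    public static List<long> SegmentOffsets(Stream file, long firstSegmentOffset)
    {
        var offsets = new List<long>();
        using var reader = new BinaryReader(file, Encoding.UTF8, leaveOpen: true);
        long current = firstSegmentOffset;
        while (true)
        {
            offsets.Add(current);
            file.Seek(current, SeekOrigin.Begin);
            long next = reader.ReadInt64();
            if (next == NoNext) break;
            current = next;
        }
        return offsets;
    }
}
```

With this layout, header segments and body blocks can be interleaved freely: growing the header only appends one segment and rewrites one 8-byte field, and the existing blocks never move. The trade-off is that finding entry number n means walking the chain first, so a reader would probably load the segment offsets once when the file is opened.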

My opinion: Why invent a new format when there are already solutions like SQLite out there that can be easily used?
