ysj ysj - 1 month ago 8
C Question

Which data structure in C allow me to store lines and append lines easily?

I got a list of string data. 10,20,30 are the line numbers

10. string 1
20. string 2
30. string 3


and if user types in "23 string data". 23 is the line number user wants to insert into. The data should become like that

10. string 1
20. string 2
23. string data
30. string 3


and if user types in "40 string data". The data should become like that

10. string 1
20. string 2
23. string data
30. string 3
40. string data


I'm relatively new in C's data structure. Which data structure should I use to store this kind of data efficiently? My current direction is to implement dynamic array or linked list. However, below are the list of problems I experienced.

Problem with dynamic array:



  1. Using the line number as the index and create sufficient space to
    ensure array length is always more or equal to the highest line
    number. Waste a lot of memory for un-used index.

  2. Printing out the data would be an issue.
    Eg. index 0-9 doesn't have memory allocated. Accessing it will cause an error. Need to find ways to know which index are used?



Problem with linked list:



  1. I cannot jump index(not sure). How do identify which line comes after
    another and insert line in between easily?


Answer

I'll give the two solutions I could come up with, but this question is possibly open-ended.

  1. Use a hash table. Keys are line numbers. Values are (string, pointer to next line's value). This makes both random and linear access fast.

  2. Easier solution that assumes you don't have wildly spaced out line numbers: Use a singly linked list L to store strings. Also create a separate array P containing a pointer to every k-th node in the list. To access line i, check P[floor(i/k)], jump to the node it points to in L, and jump forward i mod k times to reach your string. Access time is therefore O(k) (assuming O(1) array access). Space usage for n strings is O(n + max{i}/k).

The one thing that makes this relevant to C... is that there's no built-in hash table, of course! So #2 may be easier to implement.