Edgar Rokyan - 29 days ago
C Question

Array resizing and realloc function

I am trying to improve my knowledge of pointers by reading "Understanding and Using C Pointers" by Richard Reese.

Here's one code example from this book concerning the realloc() function:

char* getLine(void) {
    const size_t sizeIncrement = 10;
    char* buffer = malloc(sizeIncrement);
    char* currentPosition = buffer;
    size_t maximumLength = sizeIncrement;
    size_t length = 0;
    int character;

    if(currentPosition == NULL) { return NULL; }

    while(1) {
        character = fgetc(stdin);

        if(character == '\n') { break; }

        if(++length >= maximumLength) {
            char *newBuffer = realloc(buffer, maximumLength += sizeIncrement);

            if(newBuffer == NULL) {
                free(buffer);
                return NULL;
            }

            currentPosition = newBuffer + (currentPosition - buffer);
            buffer = newBuffer;
        }

        *currentPosition++ = character;
    }

    *currentPosition = '\0';
    return buffer;
}


The main idea is to read all symbols into the buffer until we meet \n.

We don't know the total number of symbols to read, so it's reasonable to use the realloc() function to expand the buffer periodically.

So, to expand the buffer we use:

char *newBuffer = realloc(buffer, maximumLength += sizeIncrement);


In this case realloc() returns newBuffer, a pointer to the expanded buffer.

After that, if realloc() was invoked successfully, currentPosition is recalculated as:

currentPosition = newBuffer + (currentPosition - buffer);


QUESTION:

Is it valid to recalculate currentPosition in such a way?

As far as I know, after the realloc() invocation the buffer pointer is invalidated. (See, for example, this). Any access to the buffer pointer leads to undefined behaviour. So... where am I wrong?

M.M
Answer

This code causes undefined behaviour:

currentPosition = newBuffer + (currentPosition - buffer);

After a pointer is passed to realloc and the call succeeds, that pointer variable (and every other pointer based on it) becomes indeterminate, which is the same status that an uninitialized variable has.

Reference: C11 6.2.4/2:

[...] The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.

Then, doing pointer arithmetic on an invalid pointer causes undefined behaviour, C11 6.5.6/8:

When an expression that has integer type is added to or subtracted from a pointer, [...] If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined

The pointer operand doesn't point to an object at that time. The object it used to point to has already been freed.

In fact, evaluating the pointer at all may cause undefined behaviour, since an indeterminate value may be a trap representation. (Imagine a system where loading a value into an address register also performs a hardware check that the address belongs to this process). Refs: C11 3.19.2, 6.2.6.1/5:

If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined


The correct way to write the code would have been:

if(++length >= maximumLength)
{
    size_t currentOffset = currentPosition - buffer;

    char *newBuffer = realloc(......
    // ...

    currentPosition = newBuffer + currentOffset;
    buffer = newBuffer;
}
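
The same fix can be packaged as a small helper. This is only a sketch: the name grow_buffer and its signature are hypothetical and come from neither the book nor the code above. It records the cursor's offset while the old block is still alive, and after realloc it uses only the pointer that realloc returned.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not from the book): resize *buffer to new_size bytes
 * and keep *cursor at the same offset inside the resized block.
 * Assumes *cursor points into *buffer and new_size is larger than that offset.
 * Returns 0 on success, -1 on failure (the old block is left untouched). */
static int grow_buffer(char **buffer, char **cursor, size_t new_size)
{
    /* Compute the offset BEFORE realloc, while both pointers are valid. */
    size_t offset = (size_t)(*cursor - *buffer);

    char *new_buffer = realloc(*buffer, new_size);
    if (new_buffer == NULL) {
        return -1;  /* realloc failed: *buffer and *cursor are still usable */
    }

    /* From here on, only the pointer returned by realloc is used. */
    *buffer = new_buffer;
    *cursor = new_buffer + offset;
    return 0;
}
```

A caller would pass the addresses of its own pointers, for example grow_buffer(&buffer, &currentPosition, maximumLength + sizeIncrement), and update its size bookkeeping only when the call returns 0.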

(Personally I would use the offset the whole way, instead of currentPosition, to avoid this problem entirely.)
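
An offset-only version might look roughly like the sketch below. The function name read_line and its FILE* parameter are assumptions made for illustration; the book's function is getLine(void) and reads from stdin. The sketch also stops at EOF, which the book's loop does not check for. Because the write position is kept as an index, nothing has to be recomputed after realloc.

```c
#include <stdio.h>
#include <stdlib.h>

/* Sketch of an index-based variant; the name read_line and the FILE*
 * parameter are assumptions, not the book's interface.
 * Returns a malloc'd, NUL-terminated line without the '\n', or NULL on
 * allocation failure or when EOF is hit before any character is read. */
char *read_line(FILE *stream)
{
    const size_t size_increment = 10;
    size_t capacity = size_increment;
    size_t length = 0;               /* number of characters stored so far */
    char *buffer = malloc(capacity);
    int character;

    if (buffer == NULL) {
        return NULL;
    }

    while ((character = fgetc(stream)) != EOF && character != '\n') {
        /* Keep one byte free for the terminating '\0'. */
        if (length + 1 >= capacity) {
            char *new_buffer = realloc(buffer, capacity + size_increment);
            if (new_buffer == NULL) {
                free(buffer);        /* old block is still valid on failure */
                return NULL;
            }
            buffer = new_buffer;     /* no pointer fix-up needed */
            capacity += size_increment;
        }
        buffer[length++] = (char)character;
    }

    if (character == EOF && length == 0) {
        free(buffer);
        return NULL;                 /* nothing left to read */
    }

    buffer[length] = '\0';
    return buffer;
}
```

The caller owns the returned buffer and releases it with free(). Calling read_line(stdin) gives roughly the behaviour of the original getLine().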