CodeFusion - 1 month ago
C Question

Why does dereferencing a pointer to a string (char array) return the whole string instead of the first character?

Since the pointer to array points to the first element of the array (having the same address), I don't understand why this happens:

#include <stdio.h>

int main(void) {
    char (*t)[] = {"test text"};
    printf("%s\n", *t + 1); // prints "est text"
}

Additionally, why does the following code print 2?

#include <stdio.h>

int main(void) {
    char (*t)[] = {1, 2, 3, 4, 5};
    printf("%d\n", *t + 1); // prints "2"
}


All other answers at the time of writing were incorrect. Moreover, your question smells like an XY problem: the construct you were trying most probably isn't what you actually wanted. What you really want is simply:

char *t = "test text";
printf("%s\n", t);  // prints "test text"

or

printf("%c\n", t[1]); // prints "e", the 2nd character in the string.

But since you wanted to understand why those things happen, and all the other explanations were wrong, here goes:

Your declaration declares t as a pointer to an array of char:

cdecl> explain char (*t)[];
declare t as pointer to array of char

not an array of pointers as others have suggested. Furthermore, the type of *t is incomplete, so you cannot take its size:

sizeof *t;

will result in

error: invalid application of ‘sizeof’ to incomplete type ‘char[]’
     sizeof *t;

at compile time.
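
For contrast, here is a minimal sketch of the same expression applied to a pointer to an array of known size, where sizeof is fine. The helper name pointee_size is made up for illustration and is not part of your code.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the question's code. */
size_t pointee_size(void)
{
    char (*p)[10] = NULL;  /* pointer to array of 10 char: a complete type */

    /* The operand of sizeof is not evaluated here, so the null pointer
       is never dereferenced; only the type char[10] matters. */
    return sizeof *p;
}
```

Calling pointee_size() yields 10. Changing the declaration to char (*p)[] brings back the compile error shown above.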

Now, when you try to initialize this with

 char (*t)[] = {"test text"};

the compiler will warn, because although "test text" is an array of (constant) char, here it decays to a pointer to char. Additionally, the braces there are useless; the excerpt above is equivalent to writing:

char (*t)[] = "test text";

Likewise,

int a = 42;

and

int a = {42};

are synonymous in C.

To get a pointer to the array, you must apply the address-of operator (&) to the array (the string literal!) so that it does not decay to a pointer:

char (*t)[] = &"test text";

Now t is properly initialized as a pointer to an (immutable) array of char. However, in your case using a pointer of the wrong type didn't matter, because the two pointers, despite having incompatible types, held the same address: one pointed to the array of char, the other to the first character in that array. Thus the observed behaviour was identical.
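
The "same address, different type" point can be checked with a small sketch. It uses a named array instead of a string literal so that both pointers certainly refer to the same object; the names text, same_address and array_stride are assumptions made for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

static char text[] = "test text";  /* 9 characters plus the terminating 0 */

/* Hypothetical helper: do both pointers hold the same address? */
bool same_address(void)
{
    char (*whole)[] = &text;  /* pointer to the whole array */
    char *first = text;       /* array decays: pointer to its first char */
    return (void *)whole == (void *)first;
}

/* Hypothetical helper: how far does + 1 move a pointer to char[10]? */
ptrdiff_t array_stride(void)
{
    char (*whole)[10] = &text;
    return (char *)(whole + 1) - (char *)whole;  /* one whole array */
}
```

same_address() returns true, while array_stride() returns 10 rather than 1: the addresses agree, but the pointed-to types (and therefore the pointer arithmetic) do not.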

When you dereference t, which is a pointer-to-array-of-char, you get a locator value (lvalue) of type array-of-char. Under normal circumstances an array lvalue then decays to a pointer to its first element, so *t + 1 points to the second character in that array; passing that value to printf with %s prints the contents of the 0-terminated string starting from that pointer.
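
To make the difference between the expressions concrete, here is a hedged sketch with three tiny helpers. The function names are invented for this answer; only the expressions inside them matter.

```c
/* Hypothetical helpers, named for illustration only. */

/* *t + 1: the array lvalue decays to char *, then 1 is added. */
const char *from_second(char (*t)[])
{
    return *t + 1;
}

/* (*t)[1]: index into the array itself to get a single character. */
char second_char(char (*t)[])
{
    return (*t)[1];
}

/* **t: decay, then dereference again, giving the first character. */
char first_char(char (*t)[])
{
    return **t;
}
```

With t = &"test text", from_second(t) is the pointer that %s received in your code (the string "est text"), second_char(t) is 'e', and first_char(t) is the single character 't' that the question title expected.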

The behaviour of %s is specified in C11 (n1570) as


If no l length modifier is present, the argument shall be a pointer to the initial element of an array of character type. Characters from the array are written up to (but not including) the terminating null character. [...] If the precision is not specified or is greater than the size of the array, the array shall contain a null character. [...]

(emphasis mine.)
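
The precision clause in that paragraph matters in practice: with an explicit precision no larger than the array, no terminating null is needed. A small sketch, assuming a hypothetical helper format_prefix and using snprintf so that the result can be inspected:

```c
#include <stdio.h>

/* Hypothetical helper: formats a 4-byte array that has no terminating null. */
int format_prefix(char *out, size_t out_size)
{
    const char raw[4] = { 't', 'e', 's', 't' };  /* deliberately unterminated */

    /* Precision 4 is not greater than the array size, so %s stops
       after 4 characters and never looks for a null character. */
    return snprintf(out, out_size, "%.4s", raw);
}
```

Without the precision, passing raw to %s would violate the requirement quoted above, because the array contains no null character.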

As for your second initialization:

char (*t2)[] = {1, 2, 3, 4, 5};

If you compile this with a recent version of GCC, you will get lots of warnings by default. First:

test.c:10:19: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
   char (*t2)[] = {1, 2, 3, 4, 5};

Thus 1 is converted from int to a pointer-to-array-of-char without any cast.

Then the compiler complains about the remaining values:

y.c:10:19: note: (near initialization for ‘t2’)
y.c:10:21: warning: excess elements in scalar initializer
   char (*t2)[] = {1, 2, 3, 4, 5};

That is, in your case the 2, 3, 4 and 5 were simply ignored.

The value of that pointer is thus now 1, e.g. on an x86 flat memory model it would point to memory location 1 (though this is naturally implementation defined):

printf("%p\n", (void*)t2);

prints (doubly implementation-defined; typically, with GCC and glibc on x86)

0x1

When you dereference this value (which is a pointer-to-array-of-char), you will get an lvalue for array-of-char that starts at memory address 1. When you add 1, this array-of-char lvalue will decay to a pointer-to-char, and as a result you will get ((char*)1) + 1 which is a pointer-to-char whose value is 2. The type of that value can be verified from the warning generated by default by GCC (5.4.0):

y.c:5:10: warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘char *’ [-Wformat=]
   printf("%d\n",*t2+1); //prints "2"

The argument is of type char *.

Now you pass (char*)2 as an argument to printf, to be converted using %d, which expects an int. This has undefined behaviour; in your case the representation of (char*)2 just happened to be interpreted as the int value 2, and thus 2 was printed.
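
If the goal is to look at the numeric value of a pointer, a defined way is to convert it to uintptr_t and print that (or to cast to void * and use %p). A sketch, assuming the optional type uintptr_t is available on your platform; format_pointer_value is a made-up name:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes the integer value of a pointer as decimal text. */
int format_pointer_value(char *out, size_t out_size, const void *p)
{
    /* The pointer-to-integer mapping is implementation-defined,
       but the conversion itself is well defined. */
    uintptr_t bits = (uintptr_t)p;

    /* PRIuPTR expands to the conversion specifier matching uintptr_t,
       so the argument type agrees with the format string. */
    return snprintf(out, out_size, "%" PRIuPTR, bits);
}
```

Here the argument type matches the conversion specifier, so the -Wformat warning and the undefined behaviour both go away.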

And now one realizes that the value printed has nothing to do with 2 in the original initializer:

#include <stdio.h>

int main(void) {
    char (*t2)[] = {1, 42};
    printf("%d\n", *t2 + 1);
}

will still print 2, not 42. QED.

Alternatively, for both initializations you could have used C99 compound literals:

// Warning: this code is super *evil*
char (*t)[] = &(char []) { "test text" };
char (*t2)[] = &(char []) { 1, 2, 3, 4, 5 };

Though this is probably even further from what you wanted, and the resulting code has no chance of compiling with a C89 or C++ compiler.
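
With a compound literal the pointer refers to a real array, so the initializer values are actually reachable through it. A minimal sketch (second_of_pair is a hypothetical name used only here):

```c
/* Hypothetical helper: builds a two-element array via a compound literal
   and reads its second element through a pointer-to-array. */
int second_of_pair(char a, char b)
{
    char (*t2)[] = &(char []) { a, b };  /* really points to an array now */

    /* The compound literal has automatic storage here: it lives only
       until this function returns, so the pointer must not escape. */
    return *(*t2 + 1);
}
```

second_of_pair(1, 42) yields 42, unlike the brace-initialized pointer earlier, where the second initializer was discarded.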