Integralist - 18 days ago
C Question

How do strings and bits work in C?

I have a two part question:


  1. Understand output from sizeof

  2. Understand how strings are stored in variables (e.g. bits and ram)



Question 1



I'm trying to understand the output from the following piece of C code.

printf("a: %zu\n", sizeof("a")); // 2
printf("abc: %zu\n", sizeof("abc")); // 4


It always seems to be one larger than the actual number of characters specified.

The docs suggest that the returned value represents the size of the object (in this case a string) in bytes. So if the size of "a" gives us back 2 bytes, then I'm curious how "a" represents 16 bits of information.

If I look at the binary representation of the ASCII character a I can see it is 01100001. But that's only showing 3 bits out of 1 byte being used.

Question 2



Also, how do large strings get stored into a variable in C? Am I right in thinking that they have to be stored within an array, like so:

char my_string[5] = "hello";


Interestingly when I have some code like:

char my_string = "hello";
printf("my_string: %s\n", my_string);


I get two compiler errors:

- incompatible pointer to integer conversion initializing 'char' with an expression of type 'char [6]'
- format specifies type 'char *' but the argument has type 'char'


...which I don't understand. Firstly, it states the type is presumed to be a size of [6] when there are only 5 characters. Secondly, the mention of a pointer here seems odd to me. Why does printf expect a pointer, and why does not specifying the length of the variable/array result in a pointer to integer error?

By the way, I can seemingly set the length of the variable/array to 5 rather than 6 and it'll work as I'd expect it to: char my_string[5] = "hello";

I'm probably just missing something very basic/fundamental about how bits and strings work in C.

Any help understanding this would be appreciated.

Answer

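The first part of the question comes down to the way strings are stored in C. A string in C is nothing more than a series of characters (char) with a \0 added at the end, which is the reason you're seeing a +1 when you do sizeof. So "a" occupies two bytes: the character a (01100001, where the whole 8-bit pattern is the byte, not just the three bits that happen to be 1) followed by a second byte that is all zeros. That is also why the compiler reports the type of "hello" as char [6]. Note that char my_string[5] = "hello"; is accepted because C allows the terminating \0 to be dropped when the array has room for exactly the characters, but what you get is then not a null-terminated string, so printing it with %s is undefined behavior even if it appears to work. If you were to say char my_string[4] = "hello"; the compiler would complain that the initializer is too long for the array.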

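Now onto the second part: strings themselves are a series of characters. A single char holds exactly one character, so char my_string = "hello"; tries to squeeze a whole array into one char, which is where the "pointer to integer conversion" message comes from. The characters live in an array, and in most expressions an array is converted to a pointer to its first element. That pointer is what lets you access the series of characters from some part of memory, and it is why the %s specifier of printf expects a char *. Additional information regarding pointers and strings in C can be found here: Pointer to a String in C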