Hedron Hedron - 4 months ago 12
C Question

Copying string from source printing strange string of characters

I'm making a lexer and I'm having trouble copying a string from my file buffer to the string property of the token. Here is the code I'm using to copy a string from the buffer.

static token_t* lexer_str(lexer_t* lexer) {
    size_t str_len = 0;

    while (true) {
        if (lexer->len < 1) {
            error_new(lexer->errors, lexer->len, lexer->pos, "Unterminated string.");
            return NULL;
        } else if (lexer_look(lexer, 0) == '\"') {
            lexer_adv(lexer, 1);
            break;
        } else {
            lexer_adv(lexer, 1);
            str_len++;
        }
    }

    char* string = malloc(str_len);
    for (size_t idx = 0; idx < str_len; idx++)
        string[idx] = lexer->src[lexer->ptr - str_len + idx];

    token_t* token = token_new(lexer, _str);
    token->string = string;
    return token;
}


And here is the buffer.

"la la la" "me me me"


And here is the output. The string is coming out as "²²²²\":

Type:0 {
Line: 1
Pos: 0
Number: 10715872
Real: 10715872
String: ²²²²\
}


Why is this happening? Am I just reading memory from the wrong place? Any help with how I could correctly copy the string into the token would be appreciated.

Answer

First, char* string = malloc(str_len); allocates one byte too few, and your string is not null-terminated after the copy. You copy str_len characters from the buffer at a given offset, and that range of the buffer does not contain a terminating null character.

Change it to:

char* string = malloc(str_len + 1);  // 1 byte more
for (size_t idx = 0; idx < str_len; idx++)
    string[idx] = lexer->src[lexer->ptr - str_len + idx];
string[str_len] = '\0';  // don't forget to null-terminate

Without the terminator, even an empty source string (str_len of 0) leaves string unterminated, so printing it reads whatever bytes happen to follow the allocation.
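
The same fix can be factored into a small helper. This is only a sketch: copy_span is a hypothetical name that does not appear in the question, and it assumes the caller already knows the start offset and length of the characters between the quotes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of the asker's lexer): copies `len` bytes
 * starting at `src + start` into a freshly allocated, null-terminated
 * string. Returns NULL if allocation fails; the caller frees the result. */
char *copy_span(const char *src, size_t start, size_t len) {
    assert(src != NULL);

    char *out = malloc(len + 1);    /* +1 for the terminating '\0' */
    if (out == NULL)
        return NULL;

    memcpy(out, src + start, len);  /* copy exactly len bytes */
    out[len] = '\0';                /* terminate so printf("%s") stops here */
    return out;
}
```

In the lexer this would presumably be called as copy_span(lexer->src, start, str_len). The right value for start depends on where lexer->ptr points once the closing quote has been consumed, which the question does not show, so it is worth checking that offset as well.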
