Miroslav Cetojevic - 3 months ago
C Question

scanf - strange behavior: two consecutive calls result in one altered string and one correct string

I encountered this behavior while trying to solve this HackerRank problem. This site uses scanf to pass well-formatted data to the user's code. So far, so good.

There are p pairs of strings, each string on a separate line. For each pair, I only need to print YES or NO once, depending on whether or not those two strings have a common substring. Straightforward enough, for sure. But I'm failing test cases for no apparent reason.

So, after debugging with printf, it turns out that, when calling scanf twice, for some reason the first string becomes a shorter version with the second string appended to it - an overlap. The second string appears normally on the next line.

The code in question (in debug mode, if you will):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
    int p;
    scanf("%d", &p);

    char *s1 = malloc(sizeof(s1));
    char *s2 = malloc(sizeof(s2));
    int len1, len2;
    char *answers[] = { "NO", "YES"};
    int answers_i;
    for(int i = 0; i < p; ++i) {
        scanf("%s", s1);
        scanf("%s", s2);
        printf("%s\n%s\n", s1, s2);
        len1 = strlen(s1);
        len2 = strlen(s2);
        answers_i = 0;
        for(int j = 0; j < len1 && answers_i == 0; ++j) {
            for(int k = 0; k < len2; ++k) {
                if(s1[j] == s2[k]) {
                    // printf("s1[%d]=%c s2[%d]=%c\n", j, s1[j], k, s2[k]);
                    answers_i = 1;
                    break;
                }
            }
        }
        // printf("%s\n", answers[answers_i]);
    }

    return 0;
}


Input of one of the failed test cases:

10
dapkqnowwvdrknfvcmanjuroumppajrzklucroxvpfmcsclqa
ivtnjtgiogmwhqybjaxlktqbwsdhqrwovoavetymkpcco
hrtybirxncuiailznohfawjwipdtupnxnisbwcplozwrzt
ngdmqotxkpnuhmpfmajthzdtnztrqyugendiublcwp
rmpwlddwttapjzhdldjmuhmgruufltzszprzdcziigc
bbvvkeqkqekqqennyxqxkxnyxnyqnnybnbvnyqqe
annbjookwtqkoivcgbqckqtvgvktobctktgkkjiac
zsspfhmzpurrrlurdsdlrfldzyldfhudfedrszdpmsudh
yuuuydwovzawzamvydaaadkakukpynwfmpnmuaazokxkmjxawo
rqiqbhgscsetgihrrrgsqrlqgcbcbrettlehbeistbiqbisie
ibvmfltfdvlmentbfdemebbnvllfneeefnaamtblt
gukzzrqruyxsrqhyuggkrjujkwjhqhqsrqgkrkqxpszrzk
nakqzfroqouhgunxqvqbxwtibfodsvoilqrpvhtgzoholxd
bqluorjgkkrvmiptnxegxwlhrstiiafbfoxodzyguhdwi
oyvgelovlyevhhedoeolyhdevcvhgceydcdehgvoc
wsqswjnjpiarszzzxpmptrquwbnbzqiqqtzqnbajnpsjfaxr
hvkmgwawagozzabgmdmdvbbaxadawmbazvxohxzv
sfiltrslqepytjpfffqlrpejiueftrnisnnppnlpuficrjys
nvsovybaljmzenkfgayfoxzcjantbdidxflbkhbixgzk
qdphnbrjmznztnphhutkdbwjzmjwugtxggxchzcidngplj


Output

dapkqnowwvdrknfvcmanjuroumppajrzivtnjtgiogmwhqybjaxlktqbwsdhqrwovoavetymkpcco
ivtnjtgiogmwhqybjaxlktqbwsdhqrwovoavetymkpcco
hrtybirxncuiailznohfawjwipdtupnxngdmqotxkpnuhmpfmajthzdtnztrqyugendiublcwp
ngdmqotxkpnuhmpfmajthzdtnztrqyugendiublcwp
rmpwlddwttapjzhdldjmuhmgruufltzsbbvvkeqkqekqqennyxqxkxnyxnyqnnybnbvnyqqe
bbvvkeqkqekqqennyxqxkxnyxnyqnnybnbvnyqqe
annbjookwtqkoivcgbqckqtvgvktobctzsspfhmzpurrrlurdsdlrfldzyldfhudfedrszdpmsudh
zsspfhmzpurrrlurdsdlrfldzyldfhudfedrszdpmsudh
yuuuydwovzawzamvydaaadkakukpynwfrqiqbhgscsetgihrrrgsqrlqgcbcbrettlehbeistbiqbisie
rqiqbhgscsetgihrrrgsqrlqgcbcbrettlehbeistbiqbisie
ibvmfltfdvlmentbfdemebbnvllfneeegukzzrqruyxsrqhyuggkrjujkwjhqhqsrqgkrkqxpszrzk
gukzzrqruyxsrqhyuggkrjujkwjhqhqsrqgkrkqxpszrzk
nakqzfroqouhgunxqvqbxwtibfodsvoibqluorjgkkrvmiptnxegxwlhrstiiafbfoxodzyguhdwi
bqluorjgkkrvmiptnxegxwlhrstiiafbfoxodzyguhdwi
oyvgelovlyevhhedoeolyhdevcvhgceywsqswjnjpiarszzzxpmptrquwbnbzqiqqtzqnbajnpsjfaxr
wsqswjnjpiarszzzxpmptrquwbnbzqiqqtzqnbajnpsjfaxr
hvkmgwawagozzabgmdmdvbbaxadawmbasfiltrslqepytjpfffqlrpejiueftrnisnnppnlpuficrjys
sfiltrslqepytjpfffqlrpejiueftrnisnnppnlpuficrjys
nvsovybaljmzenkfgayfoxzcjantbdidqdphnbrjmznztnphhutkdbwjzmjwugtxggxchzcidngplj
qdphnbrjmznztnphhutkdbwjzmjwugtxggxchzcidngplj


The output should equal the input, but this is obviously not the case. The first string is being capped at 32 characters and the entire second string is appended to it. But the second string itself is unchanged. What exactly is happening between these two scanf calls?

I used gets (oops, deprecated) and getchar, but the problem persists. fgets is useless, since I don't know the size of the strings beforehand. I don't know any other standard alternatives.

NOTE: If anyone wants to try out this code on HackerRank, make sure to check the box for Test against custom input, copy and paste the input above, then click the Run button.

Answer

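Your code has undefined behavior. The problem is the allocation of memory for s1 and s2: you allocate sizeof(s1) bytes, which is the size of a pointer (typically 8 bytes on a 64-bit system), not the size of the string you want to store. Once scanf reads more characters than fit, it writes past the end of the allocated buffer, causing undefined behavior. The output you see is consistent with the two small blocks sitting 32 bytes apart on the heap: reading s2 overwrites s1 from offset 32 onward, including its terminator, so printing s1 runs on into s2. That layout is an implementation detail and cannot be relied on.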
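To make the size mismatch concrete, here is a minimal sketch of what sizeof(s1) evaluates to. The function name original_request_size is made up for this illustration and is not part of your code.

```c
#include <stddef.h>

/* Hypothetical helper: returns the number of bytes that
   malloc(sizeof(s1)) asks for in the question's code.
   sizeof does not evaluate its operand, so the pointer is never dereferenced. */
size_t original_request_size(void) {
    char *s1 = NULL;
    return sizeof(s1);  /* size of the pointer type char *, not of any string */
}
```

On common 64-bit platforms this is 8, which leaves room for at most 7 characters plus the terminator.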

Problem constraints call for |a|, |b| < 10^5, so the allocation should be as follows:

char *s1 = malloc(100001);
char *s2 = malloc(100001);

Note the extra byte allocated for the null terminator.

You should also call free(s1) and free(s2) at the end of main.
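Putting these points together, here is a rough sketch of a bounded version of the read loop, not a definitive solution. The names solve and MAX_LEN are invented for this example, MAX_LEN assumes the 10^5 bound quoted above, and strpbrk stands in for your nested loops (it looks for the first character of s1 that also occurs in s2). The field width in the format string must be kept in step with MAX_LEN by hand.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LEN 100000  /* assumed maximum string length (hypothetical name) */

/* Reads p pairs of strings from in and writes YES or NO for each pair to out.
   Returns 0 on success, -1 on malformed input or allocation failure. */
int solve(FILE *in, FILE *out) {
    int p;
    if (fscanf(in, "%d", &p) != 1)
        return -1;

    /* One extra byte per buffer for the null terminator. */
    char *s1 = malloc(MAX_LEN + 1);
    char *s2 = malloc(MAX_LEN + 1);
    if (s1 == NULL || s2 == NULL) {
        free(s1);
        free(s2);
        return -1;
    }

    int status = 0;
    for (int i = 0; i < p; ++i) {
        /* The width 100000 stops scanf from writing past the buffer. */
        if (fscanf(in, "%100000s", s1) != 1 ||
            fscanf(in, "%100000s", s2) != 1) {
            status = -1;
            break;
        }
        /* Two strings share a substring exactly when they share a character. */
        fputs(strpbrk(s1, s2) != NULL ? "YES\n" : "NO\n", out);
    }

    free(s1);
    free(s2);
    return status;
}
```

From main this would be called as solve(stdin, stdout). Taking the streams as parameters is only a convenience so the function can be exercised with files as well as standard input.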