ojblass ojblass - 9 months ago 42
C Question

Any helpful anecdotes about the use of [^ in scanf?

I have run into some code and was wondering what the original developer was up to. Per the norm I have simplified it down to the basic case before asking your assistance. The man page for scanf has relevant information. I am having some trouble reading it.

#include <stdio.h>

int main() {

    char title[80] = "mytitle";
    char title2[80] = "mayataiatale";
    char mystring[80];

    /* huh? */
    sscanf(title,"%[^a]",mystring);
    printf("%s\n",mystring); /* Output is "mytitle" */

    /* huh? */
    sscanf(title2,"%[^a]",mystring);
    printf("%s\n",mystring); /* Output is "m" */

    return 0;
}

I am hoping for anecdotal usage and reasons why code like this might be used. The code is part of a larger code-generated application. I appreciate any feedback.

Answer

The main reason for character classes (scan sets) is that the %s notation stops at the first white space character, even if you specify a field width, and you quite often don't want it to. In that case, the character class notation can be extremely helpful: %[^a], for example, reads everything up to, but not including, the first 'a'.

Consider this code to read a line of up to 10 characters, discarding any excess, but keeping spaces:

#include <ctype.h>
#include <stdio.h>

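int main(void)
{
    char buffer[10+1] = "";
    int rc;
    /* Keep up to 10 non-newline characters, then skip the rest of the line */
    while ((rc = scanf("%10[^\n]%*[^\n]", buffer)) >= 0)
    {
        int c = getchar();  /* consume the newline that scanf left unread */
        printf("rc = %d\n", rc);
        if (rc >= 0)
            printf("buffer = <<%s>>\n", buffer);
        buffer[0] = '\0';
    }
    printf("rc = %d\n", rc);
    return 0;
}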

This was actually example code for a discussion on comp.lang.c.moderated (circa June 2004) related to getline() variants.

The first conversion specification, %10[^\n], reads up to 10 non-newline characters and stores them in buffer, followed by a terminating null. The second, %*[^\n], contains the assignment suppression character (*) and skips any remaining non-newline characters on the line without storing them. When scanf() returns, the next unread character is the newline (unless the input has reached end of file). The body of the loop reads that character with getchar(), so that when the loop restarts, the input is positioned at the start of the next line. The process then repeats.

If the line is 10 characters or shorter, those characters are copied to buffer and the second specification finds nothing to skip. Strictly, a scan set must match at least one character, so that counts as a matching failure, but because the specification comes last and assigns nothing, scanf() still returns 1. An empty line makes the first specification fail instead, so scanf() returns 0 and buffer stays empty.