RaKo RaKo - 4 years ago 92
C Question

flex bison, tokenizing stops after encountering a token while reading from a file

I am new to flex and bison. I am trying to write a simple grammar that accepts a word in lowercase followed by a word in uppercase. Below are my files:

file.l

%{
#include<stdio.h>
#include<string.h>
#include "y.tab.h"

int yywrap(void)
{
printf("parsing is done*\n");
//yylex();
//return 0;
}
%}

%%
[a-z]* { printf("found lower\n");
yylval=yytext;
return LOWER;
}
[A-Z]* { printf("found upper\n");
yylval=yytext;
return UPPER;
}

[ \n] ;
. ;
%%
void main()
{


yyin = fopen("file.txt", "r");
yylex();//this function will start the rules section.... it starts the parsing.....
fclose(yyin);

}//main ends


file.y

%{
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
#define YYSTYPE char *
int yylex(void);
void yyerror(const char *str)
{
fprintf(stderr,"error: %s\n",str);
}
%}

%token LOWER UPPER

%%
start :
|
start LOWER UPPER
{
printf("%s--%s\n",$2,$3);

}
%%


The contents of file.txt are:

token TOKEN

This is how I compile and run:

flex file.l

yacc -d file.y

gcc lex.yy.c y.tab.c -o file

./file

The compiler gives this warning:
warning: assignment makes integer from pointer without a cast [-Wint-conversion]
yylval=yytext;

When I run the program (ignoring the warning), the output is "found lower", i.e. the program stops reading tokens after return LOWER. Can anyone tell me why it behaves like this? Also, why is the warning generated even though I specified #define YYSTYPE char * in file.y?

Answer Source

1. Why is the warning generated even though I specified #define YYSTYPE char * in file.y?

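Because that define is not visible in file.l. Both files must have consistent definitions of YYSTYPE. The header y.tab.h does not carry your #define over from file.y; if YYSTYPE is not already defined when the header is included, it falls back to int, so in file.l yylval is an int and assigning a char * to it triggers the warning. Defining YYSTYPE in file.l before the #include "y.tab.h" line makes the two files agree.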
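The fallback can be sketched in plain C. This is a simplified stand-in for the relevant part of a generated y.tab.h, not a copy of it: the real header uses a slightly longer preprocessor test, and demo_yylval and demo_yylval_is_string are hypothetical names used only for this illustration.

```c
/* This line stands in for what file.l is missing. It must appear before
   the header is included; without it YYSTYPE falls back to int. */
#define YYSTYPE char *

/* --- simplified stand-in for the relevant part of y.tab.h --- */
#ifndef YYSTYPE
typedef int YYSTYPE;            /* default semantic value type */
#endif
extern YYSTYPE demo_yylval;     /* hypothetical name, plays the role of yylval */
/* --- end of stand-in --- */

YYSTYPE demo_yylval;

/* Returns 1 if the semantic value type is char *, 0 otherwise.
   Uses C11 _Generic to inspect the type at compile time. */
int demo_yylval_is_string(void)
{
    return _Generic(demo_yylval, char *: 1, default: 0);
}
```

With the #define removed, the typedef takes effect, demo_yylval becomes an int, and the function returns 0. That is the situation file.l is in when it includes y.tab.h without defining YYSTYPE first.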

Also, you should be aware that it is never correct to simply set yylval = yytext because the buffer into which yytext points is part of a private data structure of the lexical scanner. If you need to pass the token's string value to the parser, you must make a copy.
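One way to make that copy is a small helper like the one below. It is only a sketch: copy_token is a hypothetical name, not part of flex or bison, and it assumes the parser takes ownership of the returned string and frees it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated, NUL-terminated copy of
   the first len bytes of text, or NULL if allocation fails.
   The caller owns the result and must free() it. */
char *copy_token(const char *text, size_t len)
{
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, text, len);
    copy[len] = '\0';
    return copy;
}
```

In file.l the actions would then assign yylval = copy_token(yytext, yyleng); instead of yylval = yytext;, and the parser action would call free on $2 and $3 once it has printed them. The copy stays valid even after the scanner reuses its buffer for the next token.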

2. Why does main not read the whole file?

Because you are never actually calling the parser, whose name is yyparse. If you are using a standard bison parser, you should never call yylex directly; yylex is called by the parser when it needs a token. [Note 1]

Since yylex just returns a single token, calling it once will produce one token. You can call it in a loop, as suggested in a comment, but that will still not parse the file.
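The division of labour can be illustrated with hand-written stand-ins. Nothing below is generated by flex or bison: toy_yylex, toy_yyparse, toy_set_input and the TOY_ token codes are hypothetical names, and the "parser" is just a loop that recognizes the grammar from the question (zero or more LOWER UPPER pairs).

```c
#include <ctype.h>

/* Illustrative token codes; 0 means end of input, as with yylex. */
enum { TOY_EOF = 0, TOY_LOWER = 258, TOY_UPPER = 259 };

static const char *toy_input = "";   /* remaining unread input */

void toy_set_input(const char *s) { toy_input = s; }

/* Like yylex: skips non-letters, then returns exactly ONE token per call. */
int toy_yylex(void)
{
    while (*toy_input && !isalpha((unsigned char)*toy_input))
        toy_input++;
    if (*toy_input == '\0')
        return TOY_EOF;
    if (islower((unsigned char)*toy_input)) {
        while (islower((unsigned char)*toy_input))
            toy_input++;
        return TOY_LOWER;
    }
    while (isupper((unsigned char)*toy_input))
        toy_input++;
    return TOY_UPPER;
}

/* Like yyparse for:  start : | start LOWER UPPER ;
   Calls the lexer repeatedly until end of input.
   Returns 0 on success and 1 on a syntax error; *pairs counts matches. */
int toy_yyparse(int *pairs)
{
    *pairs = 0;
    for (;;) {
        int tok = toy_yylex();
        if (tok == TOY_EOF)
            return 0;
        if (tok != TOY_LOWER)
            return 1;
        if (toy_yylex() != TOY_UPPER)
            return 1;
        ++*pairs;
    }
}
```

A single call to toy_yylex on "token TOKEN" returns TOY_LOWER and stops, which mirrors the behaviour in the question. Only toy_yyparse drives the lexer to the end of the input. In the real program, main would open yyin and then call yyparse() once instead of yylex().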


Notes

  1. Bison can generate "push-parsers" which are called by the lexer when it has an available token. In that case, the lexer actions would not return until the entire input has been parsed, and you would call yylex rather than yyparse. That can simplify the parsing of certain languages, but it is certainly not the case here.