I'm working on a regular expression to recognize variable declarations in C and I have got this.
A pattern to recognize variable declarations in C. Looking at a conventional declaration, we see:
If that's the case, one should test for the type keyword before anything else, to avoid matching something else, like a string or a constant defined with the preprocessor.
The variable name resides in \1 (the first capture group).
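As a rough illustration of "test for the type keyword first", here is a minimal sketch in Python. It is not the regex from this answer: the keyword list is deliberately short, and the names TYPE_KEYWORDS, DECL and declared_name are assumptions made up for the example.

```python
import re

# Hypothetical, deliberately incomplete list of C type keywords.
TYPE_KEYWORDS = r"(?:int|char|short|long|float|double|unsigned|signed)"

# Type keyword first, then optional pointer stars, then the variable name
# in capture group 1, then a character that may follow a declared name.
DECL = re.compile(r"\b" + TYPE_KEYWORDS + r"\s+\**\s*([A-Za-z_]\w*)\s*[;=,]")

def declared_name(line):
    """Return the first declared variable name found in line, or None."""
    match = DECL.search(line)
    return match.group(1) if match else None

print(declared_name("int count = 0;"))   # count
print(declared_name("goto jump;"))       # None
print(declared_name("#define MAX 10"))   # None
```

Because the pattern insists on a type keyword, lines such as goto jump; or a preprocessor constant are skipped, but it still knows nothing about quotes or comments.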
The feature you need is look-behind/look-ahead.
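To show what look-behind and look-ahead buy you, here is a small sketch in Python, under some loud assumptions: it only handles the keyword int followed by exactly one whitespace character, because Python's re module requires a fixed-width look-behind, and the name NAME_AFTER_INT is made up for the example. Other regex engines have different restrictions.

```python
import re

# (?<=\bint\s)     look-behind: the name must come right after "int" and
#                  one whitespace character (fixed width in Python).
# (?=\s*[;=,)])    look-ahead: the name must be followed by ; = , or ),
#                  which rules out function names followed by "(".
NAME_AFTER_INT = re.compile(r"(?<=\bint\s)[A-Za-z_]\w*(?=\s*[;=,)])")

print(NAME_AFTER_INT.findall("int total;"))                     # ['total']
print(NAME_AFTER_INT.findall("int main(int argc, int *argv)"))  # ['argc']
print(NAME_AFTER_INT.findall("int function();"))                # []
```

Since the look-arounds consume nothing, the match itself is only the variable name, so no capture group is needed. Note that argv is missed here because of the * between the keyword and the name.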
UPDATE July 11 2015
The previous regex fails to match some variables with _ anywhere in the middle. To fix that, one just has to add _ to the second part of the first capture group. It also assumes variable names of two or more characters. This is how it looks after the fix:
However, this regular expression has many false positives, goto jump; being one of them. Frankly, it's not suitable for the job. Because of that, I decided to create another regex to cover a wider range of cases. Though it's far from perfect, here it is:
unsignedchar *var;                   /* OK, doesn't match */
goto **label;                        /* OK, doesn't match */
int function();                      /* OK, doesn't match */
char **a_pointer_to_a_pointer;       /* OK, matches +a_pointer_to_a_pointer+ */
register unsigned char *variable;    /* OK, matches +variable+ */
long long factorial(int n)           /* OK, matches +n+ */
int main(int argc, int *argv)        /* OK, matches +argc+ and +argv+ (needs two passes) */
const * char var;                    /* OK, matches +var+, however, it doesn't consider +const *+ as part of the declaration */
int i=0, j=0;                        /* 50%, matches +i+ but it will not match j after the first pass */
int (*functionPtr)(int,int);         /* FAIL, doesn't match (too complex) */
The following case is hard to cover with a portable regular expression. Text editors use contexts to avoid highlighting text inside quotes.
printf("int i=%d", i); /* FAIL, match i inside quotes */
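One way to approximate what editors do with contexts is to blank out string literals, character literals and comments before applying the declaration regex. The sketch below is in Python and is only an approximation: it ignores trigraphs, line continuations and preprocessor corner cases, and the names LITERAL_OR_COMMENT, blank_literals and DECL (a stand-in declaration regex, not the one from this answer) are assumptions for the example.

```python
import re

# String literals, character literals, block comments and line comments.
LITERAL_OR_COMMENT = re.compile(
    r'"(?:\\.|[^"\\\n])*"'     # "string", with backslash escapes
    r"|'(?:\\.|[^'\\\n])*'"    # 'c', with backslash escapes
    r"|/\*.*?\*/"              # /* block comment */
    r"|//[^\n]*",              # // line comment
    re.DOTALL,
)

def blank_literals(source):
    """Overwrite literals and comments with spaces, keeping offsets and newlines."""
    return LITERAL_OR_COMMENT.sub(
        lambda m: re.sub(r"[^\n]", " ", m.group(0)), source
    )

# Hypothetical stand-in for a declaration regex; the name is in group 1.
DECL = re.compile(r"\b(?:int|char)\s+\**\s*([A-Za-z_]\w*)\s*[;=,]")

print(DECL.findall('printf("int i=%d", i);'))                  # ['i']
print(DECL.findall(blank_literals('printf("int i=%d", i);')))  # []
```

Replacing with spaces instead of deleting keeps every remaining character at its original offset, so match positions still point into the original source.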
Another kind of false positive comes from code that is not valid C. This can be avoided if one tests the syntax of the source file before applying the regular expression. With GCC and Clang, one can just pass the -fsyntax-only flag to check the syntax of a source file without compiling it:
int char variable; /* matches +variable+ */
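If the regex is driven from a script, the syntax check can be run first. This is a minimal sketch in Python, assuming gcc (or clang) is on the PATH; the function names syntax_check_command and has_valid_syntax are made up for the example, and only the -fsyntax-only flag comes from the text above.

```python
import shutil
import subprocess

def syntax_check_command(path, compiler="gcc"):
    """Build the command line that checks syntax without compiling."""
    return [compiler, "-fsyntax-only", path]

def has_valid_syntax(path, compiler="gcc"):
    """Return True or False, or None when the compiler is not installed."""
    if shutil.which(compiler) is None:
        return None
    result = subprocess.run(
        syntax_check_command(path, compiler),
        capture_output=True,
        text=True,
    )
    # The compiler exits with a non-zero status when it reports errors.
    return result.returncode == 0
```

A caller would then apply the declaration regex only to files for which has_valid_syntax returns True, so something like int char variable; never reaches it.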