Vikash Babu - 4 months ago
Perl Question

How to get the position of a comment (//, /* or */) in a line via regex

I have a line of C++ code in a scalar variable:

my $code = $_;


Now I need to find out whether this line contains a comment. I can do that by checking:

if ($code =~ m{/\*})


or

if ($code =~ m{//})


But there can be a situation like:

cout<<"welcome";/*this is just to print*/


Here the line is not entirely a comment: the comment follows real code.

I am stuck on this kind of case. Please help me.

Answer

From perlfaq6:

How do I use a regular expression to strip C-style comments from a file?

$/ = undef;
$_ = <>;
s#/\*[^*]*\*+([^/*][^*]*\*+)*/|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $2 ? $2 : ""#gse;
print;

This could, of course, be more legibly written with the /x modifier, adding whitespace and comments. Here it is expanded, courtesy of Fred Curtis.

s{
   /\*         ##  Start of /* ... */ comment
   [^*]*\*+    ##  Non-* followed by 1-or-more *'s
   (
     [^/*][^*]*\*+
   )*          ##  0-or-more things which don't start with /
               ##    but do end with '*'
   /           ##  End of /* ... */ comment


 |         ##     OR  various things which aren't comments:


   (
     "           ##  Start of " ... " string
     (
       \\.           ##  Escaped char
     |               ##    OR
       [^"\\]        ##  Non "\
     )*
     "           ##  End of " ... " string


   |         ##     OR


     '           ##  Start of ' ... ' string
     (
       \\.           ##  Escaped char
     |               ##    OR
       [^'\\]        ##  Non '\
     )*
     '           ##  End of ' ... ' string


   |         ##     OR


     .           ##  Anything other char
     [^/"'\\]*   ##  Chars which doesn't start a comment, string or escape
   )
 }{defined $2 ? $2 : ""}gxse;
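To tie this back to the question: stripping the comments and looking at what is left tells you whether a line is all comment, part comment, or no comment. The following is only a sketch. The helper names `strip_c_comments` and `classify_line` are made up for illustration and are not part of perlfaq6, and the sketch assumes each `/* ... */` comment opens and closes on the same line.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the perlfaq6 substitution shown above.
sub strip_c_comments {
    my ($src) = @_;
    $src =~ s#/\*[^*]*\*+([^/*][^*]*\*+)*/|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $2 ? $2 : ""#gse;
    return $src;
}

# Hypothetical classifier: compares the line with its comment-free version.
sub classify_line {
    my ($line) = @_;
    my $rest = strip_c_comments($line);
    return 'no comment'        if $rest eq $line;   # nothing was removed
    return 'code plus comment' if $rest =~ /\S/;    # something other than whitespace survives
    return 'comment only';
}

print classify_line('cout<<"welcome";/*this is just to print*/'), "\n";  # code plus comment
print classify_line('/* only a comment */'), "\n";                       # comment only
print classify_line('cout<<"/* inside a string */";'), "\n";             # no comment
```

Because string literals are matched as their own alternative, a `/*` inside a quoted string is not mistaken for a comment.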

A slight modification also removes C++ comments, possibly spanning multiple lines using a continuation character:

s#/\*[^*]*\*+([^/*][^*]*\*+)*/|//([^\\]|[^\n][\n]?)*?\n|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $3 ? $3 : ""#gse;
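If you need the position of the comment rather than the stripped text, the same alternation idea (match a comment opener, or else consume a string literal or a single character) can be turned into a small scanner. This is a minimal sketch, not something taken from perlfaq6: the helper name `comment_start` is hypothetical, it looks at one line at a time, and it assumes string literals are closed on the same line.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the 0-based offset of the first comment
# opener (// or /*) that is not inside a string or character literal,
# or -1 if the line contains none.
sub comment_start {
    my ($line) = @_;
    while ($line =~ m{
          ( // | /\* )               # group 1: a comment opener
        | " (?: \\. | [^"\\] )* "    # a complete double-quoted string, skipped
        | ' (?: \\. | [^'\\] )* '    # a complete single-quoted literal, skipped
        | .                          # any other single character
    }gxs) {
        # $-[1] holds the offset at which group 1 started matching.
        return $-[1] if defined $1;
    }
    return -1;
}

print comment_start('cout<<"welcome";/*this is just to print*/'), "\n";  # 16
print comment_start('cout<<"// not a comment";'), "\n";                  # -1
print comment_start('// whole line'), "\n";                              # 0
```

An offset of 0 (or one preceded only by whitespace) means the whole line is a comment; anything larger means code comes first.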

You can also try the CPAN module Regexp::Common::comment. From its documentation:

Please consult the manual of Regexp::Common for a general description of the works of this interface.

Do not use this module directly, but load it via Regexp::Common.

This module gives you regular expressions for comments in various languages.

The C++ language has two forms of comments. Comments that start with // and last till the end of the line, and comments that start with /*, and end with */. If {-keep} is used, only $1 will be set, and set to the entire comment.
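As a rough sketch of how that might look in practice, assuming Regexp::Common is installed from CPAN (the helper name `find_comment` is hypothetical):

```perl
use strict;
use warnings;
use Regexp::Common qw(comment);

# With {-keep} the pattern captures; per the documentation quoted above,
# the entire comment ends up in the first capture group.
my $cpp_comment = $RE{comment}{'C++'}{-keep};

# Hypothetical helper: returns (offset, text) of the first comment found,
# or an empty list if the pattern does not match.
sub find_comment {
    my ($line) = @_;
    if ($line =~ /$cpp_comment/) {
        return ($-[1], $1);   # $-[1] is the offset where group 1 starts
    }
    return;
}

my ($pos, $text) = find_comment('cout<<"welcome";/*this is just to print*/');
print "comment at offset $pos: $text\n" if defined $pos;
```

Note that this pattern has no notion of string literals, so comment markers inside a quoted string can be matched too. If that matters, combine it with the string-skipping alternation from the perlfaq6 regex.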