P0lT10n - 25 days ago
C# Question

Get text inside a tag with Regex

I am trying to get some text that is inside a tag, which I call a keyword. This is an example: [@sometext].

I am not able to get only the text. I am reading an HTML file in which I will define keywords like sometext, as explained before, so I need to get sometext and not [@sometext] using regex. How can I do this?

The current regex I am using is this one: \[@\w+\]

That regex returns [@sometext] and not sometext. I have tried nearly everything.

Thank you very much !




EDIT

The solution was to use (?<=\[@)\w+(?=\]) because I am using the Matches method, not Match.
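
For anyone landing here later, this is roughly how that pattern can be used with Regex.Matches. It is only a minimal sketch: the class name KeywordExtractor, the method Extract and the sample HTML string are made up for illustration and are not from my real code.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class KeywordExtractor
{
    // The lookbehind (?<=\[@) and lookahead (?=\]) are zero-width assertions,
    // so only the \w+ part ends up in each match's Value.
    private static readonly Regex KeywordPattern = new Regex(@"(?<=\[@)\w+(?=\])");

    // Returns every keyword found in the input, without the surrounding [@ and ].
    public static List<string> Extract(string html)
    {
        var keywords = new List<string>();
        foreach (Match m in KeywordPattern.Matches(html))
        {
            keywords.Add(m.Value);
        }
        return keywords;
    }

    public static void Main()
    {
        // Hypothetical sample input.
        string html = "<p>Hello [@firstName], your order [@orderId] shipped.</p>";
        Console.WriteLine(string.Join(", ", Extract(html))); // prints: firstName, orderId
    }
}
```

Because the brackets are never part of the match, each Match.Value can be used directly as the keyword.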

Answer

Combine your match for the content with a positive lookbehind for the opening [@ and a positive lookahead for the closing ], like (?<=\[@)\w+(?=\]). The explanation (courtesy of RegexBuddy):

  • Assert that the regex below can be matched, with the match ending at this position (positive lookbehind)

    (?<=\[@)

    • Match the character "[" literally

      \[

    • Match the character "@" literally

      @

  • Match a single character that is a "word character" (letters, digits, underscore, etc.)

    \w+

    • Between one and unlimited times, as many times as possible, giving back as needed (greedy)

  • Assert that the regex below can be matched, starting at this position (positive lookahead)

    (?=\])

    • Match the character "]" literally

      \]
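
To see what "assert" means in practice, the sketch below compares the lookaround pattern with a plain capturing group, which is a different technique that gives the same text through Groups[1]. The class and method names (LookaroundDemo, WithLookaround, WithGroup) and the sample input are assumptions for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Lookarounds check the surrounding characters without consuming them,
    // so the whole match is just the keyword.
    public static string WithLookaround(string input)
    {
        Match m = Regex.Match(input, @"(?<=\[@)\w+(?=\])");
        return m.Success ? m.Value : string.Empty;
    }

    // Alternative: match the brackets too, and read the keyword from group 1.
    public static string WithGroup(string input)
    {
        Match m = Regex.Match(input, @"\[@(\w+)\]");
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        string input = "x [@sometext] y"; // hypothetical sample

        Match look = Regex.Match(input, @"(?<=\[@)\w+(?=\])");
        // prints: lookaround: sometext at 4, length 8
        Console.WriteLine($"lookaround: {look.Value} at {look.Index}, length {look.Length}");

        Match grp = Regex.Match(input, @"\[@(\w+)\]");
        // prints: group 0: [@sometext] at 2, length 11
        Console.WriteLine($"group 0: {grp.Value} at {grp.Index}, length {grp.Length}");
        // prints: group 1: sometext at 4
        Console.WriteLine($"group 1: {grp.Groups[1].Value} at {grp.Groups[1].Index}");
    }
}
```

With the lookaround version the match itself starts at the "s" and excludes the brackets, which is why Match.Value works directly. With the capturing group the full match still includes [@ and ], so the keyword has to be read from Groups[1].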