Dai Dai - 4 months ago 34
C# Question

I need to modify a Word MERGEFIELD regular expression

I'm using this library to implement Word document mail-merging in my application: http://www.codeproject.com/Articles/38575/Fill-Mergefields-in-docx-Documents-without-Microso

It works great but I've since heavily refactored the code and performed other tasks in order to integrate it with my own application.

The library uses this regex to capture Word mail-merge fields:

private static readonly Regex _instructionRegEx = new Regex(
@"^[\s]*MERGEFIELD[\s]+(?<name>[#\w]*){1} # This retrieves the field's name (Named Capture Group -> name)
[\s]*(\\\*[\s]+(?<Format>[\w]*){1})? # Retrieves field's format flag (Named Capture Group -> Format)
[\s]*(\\b[\s]+[""]?(?<PreText>[^\\]*){1})? # Retrieves text to display before field data (Named Capture Group -> PreText)
[\s]*(\\f[\s]+[""]?(?<PostText>[^\\]*){1})? # Retrieves text to display after field data (Named Capture Group -> PostText)",
RegexOptions.Compiled | RegexOptions.CultureInvariant | RegexOptions.ExplicitCapture | RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace | RegexOptions.Singleline);

This captures examples like
MERGEFIELD FieldNameGoesHere
but I've come across examples where the field name is surrounded by double quotes, like
MERGEFIELD "FieldNameGoesHere"
and the regex does not capture these.

As you can see, the regex is fairly complex, and modifying it to accept double-quoted field names while still accepting unquoted MERGEFIELDs is beyond my current regex-fu.

Obviously the first line needs to be modified, but I'm unsure of how to modify it exactly.


Update: Moved the double quotes to the outside of the named group.

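In your first line, replace (?<name>[#\w]*) with "?(?<name>[#\w]*)"? so that the regex accepts an optional double quote on either side of the field name. Because the pattern is a C# verbatim string, each quote has to be written as "", as the PreText and PostText lines already do: ""?(?<name>[#\w]*)""?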
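As a rough sketch of how that change might look in place, here is the pattern from the question with the optional quotes added around the name group and the redundant {1} quantifiers dropped. The class name MergeFieldDemo and the helper GetFieldName are hypothetical, introduced only for this illustration; they are not part of the CodeProject library.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MergeFieldDemo
{
    // Pattern from the question, with an optional quote on each side of the name group.
    // Inside a verbatim string, "" stands for a single double-quote character.
    private static readonly Regex InstructionRegEx = new Regex(
        @"^[\s]*MERGEFIELD[\s]+""?(?<name>[#\w]*)""?   # field name, optionally wrapped in double quotes
          [\s]*(\\\*[\s]+(?<Format>[\w]*))?            # format flag
          [\s]*(\\b[\s]+[""]?(?<PreText>[^\\]*))?      # text to display before the field data
          [\s]*(\\f[\s]+[""]?(?<PostText>[^\\]*))?     # text to display after the field data",
        RegexOptions.Compiled | RegexOptions.CultureInvariant | RegexOptions.ExplicitCapture |
        RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace | RegexOptions.Singleline);

    // Hypothetical helper: returns the captured field name,
    // or null when the instruction does not match at all.
    public static string GetFieldName(string instruction)
    {
        Match match = InstructionRegEx.Match(instruction);
        return match.Success ? match.Groups["name"].Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(GetFieldName("MERGEFIELD FieldNameGoesHere"));
        Console.WriteLine(GetFieldName("MERGEFIELD \"FieldNameGoesHere\""));
    }
}
```

Both calls print FieldNameGoesHere. Two limitations of this approach are worth keeping in mind: the opening and closing quotes are optional independently, so a name with only one quote is also accepted, and a quoted name containing spaces is still not captured in full because [#\w] excludes whitespace.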