dgills dgills - 1 year ago 53
C# Question

Regex match text at Nth occurence between a set of characters/a given pattern

I'm looking to be able to take an MDX string like this:

[Event Product].[Event Category Filter].[Category Group].&[E].&[F].&[G].&[H]

And to use regex to individually parse out the first, second and third strings between the square brackets.

So, ideally three custom regex expressions that would return:

Event Product

Event Category Filter

Category Group

The fourth grouping and onward can be optional (although the first three will always be present). It would be great to have a fourth expression capable of returning every subsequent match after in a delimited one-liner, so something like:


I've captured the first group:


However I'm struggling with the nth exclusions piece.

Would anyone be able to lend a hand here? :)


Should've mentioned this upfront, we're using regex assembly script and doing this in vanilla T-SQL.


A C# solution is what I'd ultimately want to use, but a pure regex solution for the time being would work best. I work mostly with a non-dot net stack so we don't have a full tool kit at our disposal as far as our ETL goes, unfortunately.

Answer Source

If you create a C# class like this:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Collections;

namespace ExtractMdxParts
    public partial class UserDefinedFunctions
        public class EventData
            public SqlString Product;
            public SqlString CategoryFilter;
            public SqlString Group;
            public SqlString ExtraData;

        FillRowMethodName = "FillRow",
        TableDefinition = "Product nvarchar(128), CategoryFilter nvarchar(128), Group nvarchar(128), Extradata nvarchar(MAX)",
        IsDeterministic = true)]

        public static IEnumerable ExtractParts([SqlFacet(MaxSize = -1)] String MdxString)
            string[] parts = MdxString.Split(".".ToCharArray(), 4, StringSplitOptions.None);
            if (parts.Length < 3)
                return null;

            List<EventData> x = new List<EventData> { };
            char[] trimChars = "[]".ToCharArray();
            EventData y = new EventData { Product = parts[0].Trim(trimChars), CategoryFilter = parts[1].Trim(trimChars), Group = parts[2].Trim(trimChars) };

            if (parts.Length == 4)
                y.ExtraData = string.Join(",", parts[3].Split(".".ToCharArray()).Select(p => p.Substring(1).Trim(trimChars)));


            return x;


        public static void FillRow(object eventData, out SqlString product, out SqlString categoryFilter, out SqlString group, out SqlString extraData)
            //I'm using here the FileProperties class defined above
            EventData ed = (EventData)eventData;
            product = new SqlString(ed.Product.ToString());
            categoryFilter = new SqlString(ed.CategoryFilter.ToString());
            group = new SqlString(ed.Group.ToString());
            extraData = new SqlString(ed.ExtraData.ToString());



And add it into SQL Server with

use [testing]

   FROM 'C:\Your\Path\Here\ExtractMdxParts.dll'


CREATE FUNCTION ExtractParts (@MdxString nvarchar(MAX))
   RETURNS TABLE (Product nvarchar(128),
                    CategoryFilter nvarchar(128),
                    [Group] nvarchar(128),
                    ExtraData nvarchar(MAX)
   AS EXTERNAL NAME [ExtractMdxParts].[ExtractMdxParts.UserDefinedFunctions].ExtractParts

where 'testing' is the name of your database and you do all the fiddly bits to allow CLR procedures...

then you can do things like

SELECT * FROM dbo.ExtractParts(N'[a].[b].[c].&[e].&[f]')

and get a table like

Product CategoryFilter  Group   ExtraData
a       b               c       e,f

Important parts I had to use to make it work:

  • using System.Collections; so that you can public static IEnumerable ExtractParts...
  • new SqlString(... may not be vital, but I was getting desperate with "because T-SQL and CLR types for ... do not match"
  • I didn't go the way of a signed assembly because VS wouldn't let me sign the assembly and didn't give a meaningful error message to me so I did the brute-force ALTER AUTHORIZATION ON DATABASE::testing TO [computername\accountname]. Use signing: I'm only writing this as an answer to the question; you will have more time to shout at VS without the neighbours complaining.
  • another possibly-not-required thing is [SqlFacet(MaxSize = -1)] which comes out of something written by Adam Machanic. But if he writes it, it is very probably a Good Idea.

Oh... and I used SQL Server 2014 in conjuction with .NET 4.5.2 in VS2015 Community Edition.