TeamWild - 1 year ago
C# Question

How to filter a list of strings matching a pattern

I have a list of strings (file names, actually) and I'd like to keep only the ones that match a filter expression such as *_Test.txt

How would I best achieve this?

Here is what I came up with

List<string> files = new List<string>();

// Define a filter
string filter = "*_Test.txt";

// Make the filter regex safe
foreach (char x in @"\+?|{[()^$.#")
    filter = filter.Replace(x.ToString(), @"\" + x.ToString());

filter = string.Format("^{0}$",filter.Replace("*", ".*"));

// Old School
List<string> resultList1 = files.FindAll(delegate(string s) { return Regex.IsMatch(s, filter, RegexOptions.IgnoreCase); });

// Version using LINQ
List<string> resultList2 = files.Where(x => Regex.IsMatch(x, filter, RegexOptions.IgnoreCase)).ToList();

Answer Source

You probably want to use a regular expression for this if your patterns are going to be complex.

You could either use a proper regular expression as your filter (for your specific example it would be new Regex(@"^.*_Test\.txt$")), or you could apply a conversion algorithm.

Either way, you could then just use LINQ to apply the regex.

For example:

var myRegex=new Regex(@"^.*_Test\.txt$");
List<string> resultList=files.Where(myRegex.IsMatch).ToList();

Some people may think the above answer is incorrect, but you can use a method group instead of a lambda. If you want the full lambda, you would use:

var myRegex=new Regex(@"^.*_Test\.txt$");
List<string> resultList=files.Where(f => myRegex.IsMatch(f)).ToList();

Or, without LINQ:

List<string> resultList=files.FindAll(delegate(string s) { return myRegex.IsMatch(s);});
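
The question's code passes RegexOptions.IgnoreCase, while the snippets above match case-sensitively. Here is a minimal sketch of the same LINQ filter with that option added and the Regex built once up front. The class and method names (FileFilterSketch, KeepTestFiles) are made up for illustration, and the pattern is just the hard-coded one from above.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class FileFilterSketch
{
    // The hard-coded pattern from the answer, made case-insensitive
    // to mirror the RegexOptions.IgnoreCase used in the question.
    private static readonly Regex TestFilePattern =
        new Regex(@"^.*_Test\.txt$", RegexOptions.IgnoreCase);

    // Returns only the names that match the pattern.
    public static List<string> KeepTestFiles(IEnumerable<string> files)
    {
        return files.Where(f => TestFilePattern.IsMatch(f)).ToList();
    }
}
```

Given the names "a_Test.txt", "b_test.TXT", "c_Test.txt.bak" and "notes.txt", this should keep only the first two: the third fails because of the trailing $ anchor, and the last has no _Test part.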

If you were converting the filter, a simple conversion would be:

 var myFilter="*_Test.txt";
 var myRegex=new Regex("^" + myFilter.Replace("*",".*") +"$");

You could then also have filters like "*Test*.txt" with this method.

However, if you went down this conversion route, you would need to make sure you escaped all the special regular expression characters, e.g. "." becomes @"\.", "(" becomes @"\(", etc.

Edit: the example Replace is too simple because it doesn't escape the ".", so it would also match "fish_Testxtxt". At a minimum, escape the "." as well:


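string myFilter="*_Test.txt";
// Escape the regex special characters (backslash first), leaving "*" alone
foreach(char x in @"\+?|{[()^$.#") {
  myFilter = myFilter.Replace(x.ToString(),@"\"+x.ToString());
}
// Turn each "*" wildcard into ".*" and anchor the pattern
Regex myRegex=new Regex(string.Format("^{0}$",myFilter.Replace("*",".*")));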
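
As an alternative to escaping character by character, the built-in Regex.Escape can do the escaping in one call. Note that this swaps in a different technique from the loop above: Regex.Escape also escapes "*" (as "\*"), so the sketch below turns the escaped form back into ".*" afterwards. The names WildcardFilterSketch, ToRegex and Filter are hypothetical, and only the "*" wildcard is handled (a "?" is treated as a literal character, as in the loop above).

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class WildcardFilterSketch
{
    // Converts a filter such as "*_Test.txt" into an anchored regex.
    public static Regex ToRegex(string filter)
    {
        // Regex.Escape turns "*" into "\*" and "." into "\.";
        // the Replace then turns each escaped "\*" into ".*".
        string pattern = "^" + Regex.Escape(filter).Replace(@"\*", ".*") + "$";
        return new Regex(pattern, RegexOptions.IgnoreCase);
    }

    // Keeps only the names that match the wildcard filter.
    public static List<string> Filter(IEnumerable<string> files, string filter)
    {
        Regex regex = ToRegex(filter);
        return files.Where(f => regex.IsMatch(f)).ToList();
    }
}
```

With this, ToRegex("*_Test.txt") should accept "fish_Test.txt" and reject "fish_Testxtxt", and a filter with several wildcards such as "*Test*.txt" goes through the same conversion.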