Sasquatch - 18 days ago
C# Question

Performance with LINQ on a string

I wrote two tests to compare the performance of two different ways of finding a number in a string.

This is my code:

[TestMethod]
public void TestMethod1()
{
    string text = "I want to find the number (30)";
    var startNumber = text.IndexOf('(');
    var trimmed = text.Trim(')');
    var number = trimmed.Substring(startNumber).Trim('(');

    Assert.AreEqual("30", number);
}

[TestMethod]
public void TestMethod2()
{
    string text = "I want to find the number (30)";
    var lambdaNumber = text.Where(x => Char.IsNumber(x)).ToArray();
    var joined = string.Join("", lambdaNumber);

    Assert.AreEqual("30", joined);
}


According to the Test Explorer, TestMethod2 (with the lambda expression) is faster than TestMethod1.

TestMethod1 = 2ms
TestMethod2 = <1ms

If I add a Stopwatch to each test, TestMethod1 is by far the faster of the two.

How can I properly test the performance of this behaviour?

EDIT:

I appreciate that the two methods do not perform the same operation. Therefore I created the following instead:

[TestMethod]
public void TestMethod1()
{
    var sw = new Stopwatch();
    sw.Start();

    var number = string.Empty;
    var counter = 0;
    while (counter < 100000)
    {
        number = string.Empty;
        string text = "I want to find the number (30)";
        foreach (var c in text.ToCharArray())
        {
            int outNumber;
            if (int.TryParse(c.ToString(), out outNumber))
                number += c.ToString();
        }
        counter++;
    }

    sw.Stop();

    Assert.AreEqual("30", number);
}

[TestMethod]
public void TestMethod2()
{
    var sw = new Stopwatch();
    sw.Start();

    var joined = String.Empty;
    var counter = 0;
    while (counter < 100000)
    {
        string text = "I want to find the number (30)";
        var lambdaNumber = text.Where(x => Char.IsNumber(x)).ToArray();
        joined = string.Join("", lambdaNumber);
        counter++;
    }

    sw.Stop();

    Assert.AreEqual("30", joined);
}


According to the Stopwatch, the results are as follows:
TestMethod1 = 19ms
TestMethod2 = 7ms

Thank you for all the replies

Answer

I agree with most of the comments, so I thought it might help to post a test that does not rely on unit tests. If you work with LINQ, consider using LINQPad (the standard edition is free) to run tests like this or other small code snippets. Here are the tests, expanded to include Regex as well, each run for 100000 iterations.

void Main()
{
    string text = "I want to find the number (30)";

    Stopwatch sw = Stopwatch.StartNew();

    for (int i = 0; i < 100000; i++)
    {
        TestMethod1();
    }

    sw.Elapsed.TotalMilliseconds.Dump("Substring no parameter");    
    sw = Stopwatch.StartNew();

    for (int i = 0; i < 100000; i++)
    {
        TestMethod1(text);
    }

    sw.Elapsed.TotalMilliseconds.Dump("Substring parameter");
    sw = Stopwatch.StartNew();

    for (int i = 0; i < 100000; i++)
    {
        TestMethod2();
    }

    sw.Elapsed.TotalMilliseconds.Dump("LINQ no parameter");
    sw = Stopwatch.StartNew();

    for (int i = 0; i < 100000; i++)
    {
        TestMethod2(text);
    }

    sw.Elapsed.TotalMilliseconds.Dump("LINQ parameter");
    sw = Stopwatch.StartNew();

    for (int i = 0; i < 100000; i++)
    {
        TestMethod3(text);
    }

    sw.Elapsed.TotalMilliseconds.Dump("Regex In");
    sw = Stopwatch.StartNew();

    for (int i = 0; i < 100000; i++)
    {
        TestMethod4(text);
    }

    sw.Elapsed.TotalMilliseconds.Dump("Regex Out");
}

// Define other methods and classes here
public void TestMethod1()
{   
    string text = "I want to find the number (30)";
    var startNumber = text.IndexOf('(');
    var trimmed = text.Trim(')');
    var number = trimmed.Substring(startNumber).Trim('(');
}

public void TestMethod1(string text)
{
    var startNumber = text.IndexOf('(');
    var trimmed = text.Trim(')');
    var number = trimmed.Substring(startNumber).Trim('(');
}

public void TestMethod2()
{
    string text = "I want to find the number (30)";
    var lambdaNumber = text.Where(x => Char.IsNumber(x)).ToArray();
    var joined = string.Join("", lambdaNumber);
}

public void TestMethod2(string text)
{   
    var lambdaNumber = text.Where(x => Char.IsNumber(x)).ToArray();
    var joined = string.Join("", lambdaNumber);
}

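// "Regex In": the Regex object is constructed inside the method, on every call.
public void TestMethod3(string text)
{
    var regex = new Regex(@"(\d+)");
    var match = regex.Match(text);
    var joined = match.Captures[0].Value;
}

// "Regex Out": the Regex object is constructed once, outside the method, and reused.
public Regex regex = new Regex(@"(\d+)");

public void TestMethod4(string text)
{
    var match = regex.Match(text);
    var joined = match.Captures[0].Value;
}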

And the results (total milliseconds for 100000 calls):

Substring no parameter: 11.3526
Substring parameter: 10.2901
LINQ no parameter: 60.2359
LINQ parameter: 56.5218
Regex In: 301.1179
Regex Out: 89.8345
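
Keep in mind that these figures come from a single timed pass per method, with no warm-up, and that the methods throw their results away. Below is a minimal sketch of a slightly more careful Stopwatch harness. The names TimingSketch, Measure, BySubstring and ByLinq, as well as the warm-up and run counts, are my own assumptions for illustration and not part of the original tests. It is still a rough measurement, not a replacement for a proper benchmarking tool.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class TimingSketch
{
    // Hypothetical helper: returns the best (lowest) time in milliseconds
    // over several timed batches of `iterations` calls each.
    public static double Measure(Func<string, string> method, string text,
                                 int iterations = 100000, int runs = 5)
    {
        long sink = 0;

        // Untimed warm-up so JIT compilation is not part of the measurement.
        for (int i = 0; i < 1000; i++)
            sink += method(text).Length;

        double bestMs = double.MaxValue;
        for (int r = 0; r < runs; r++)
        {
            var sw = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++)
                sink += method(text).Length; // use the result of every call
            sw.Stop();
            bestMs = Math.Min(bestMs, sw.Elapsed.TotalMilliseconds);
        }

        GC.KeepAlive(sink); // keep the accumulated results observably in use
        return bestMs;
    }

    // Same logic as TestMethod1(string), but returning the value.
    public static string BySubstring(string text)
    {
        var startNumber = text.IndexOf('(');
        var trimmed = text.Trim(')');
        return trimmed.Substring(startNumber).Trim('(');
    }

    // Same logic as TestMethod2(string), but returning the value.
    public static string ByLinq(string text)
    {
        var lambdaNumber = text.Where(x => Char.IsNumber(x)).ToArray();
        return string.Join("", lambdaNumber);
    }

    public static void Main()
    {
        const string text = "I want to find the number (30)";
        Console.WriteLine("Substring: {0:F4} ms", Measure(BySubstring, text));
        Console.WriteLine("LINQ:      {0:F4} ms", Measure(ByLinq, text));
    }
}
```

Taking the best of several batches reduces the influence of one-off noise such as a garbage collection or a context switch, but the absolute numbers will still vary between machines and runtimes.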

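Conclusion? We are still comparing apples to oranges to diamonds, because the three approaches do not do the same work. Regex also does not seem to be as fast as some have suggested, especially when the Regex object is constructed on every call. For reliable numbers, a dedicated profiler or benchmarking tool is the way to go.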
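
To make the apples-to-oranges point concrete, here is a small sketch that runs the three approaches on a second input containing more than one number. The class name ExtractionDemo, the method names and the second input string are made up for illustration; the method bodies follow the tests above, changed only to return their result.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ExtractionDemo
{
    // Substring approach: the text after the first '(' once a trailing ')' is trimmed.
    public static string BySubstring(string text)
    {
        var startNumber = text.IndexOf('(');
        var trimmed = text.Trim(')');
        return trimmed.Substring(startNumber).Trim('(');
    }

    // LINQ approach: every numeric character anywhere in the string.
    public static string ByLinq(string text)
    {
        var lambdaNumber = text.Where(x => Char.IsNumber(x)).ToArray();
        return string.Join("", lambdaNumber);
    }

    // Regex approach: the first run of digits in the string.
    public static string ByRegex(string text)
    {
        return new Regex(@"(\d+)").Match(text).Captures[0].Value;
    }

    public static void Main()
    {
        // On the original input all three agree and print "30".
        const string original = "I want to find the number (30)";
        Console.WriteLine(BySubstring(original));
        Console.WriteLine(ByLinq(original));
        Console.WriteLine(ByRegex(original));

        // Hypothetical input with a second number: the three now disagree.
        const string other = "Room 12: I want to find the number (30)";
        Console.WriteLine(BySubstring(other)); // prints "30"
        Console.WriteLine(ByLinq(other));      // prints "1230"
        Console.WriteLine(ByRegex(other));     // prints "12"
    }
}
```

Since the methods answer different questions, a timing difference between them says little until they are made to return the same value for the inputs you care about.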
