John John - 4 months ago 10
C# Question

Removing taken numbers from a List

I have a List<double> with several numbers. What I'm trying to do is look at what doubles are similar to each other; add up all similar numbers and get the average. So for example a list:
{ 2.1, 2.2, 4, 4.1, 8, 8.2}
would become:
{2.15, 4.05, 8.1}


What I'm having some issues with is when I look for similar numbers using this LINQ statement:
tempList = points.Where(p => Abs(p - currentPoint) < 0.25).ToList();
How do I also simultaneously remove all the points from the points list that I picked into tempList, so that I'm not looking at the same numbers over and over again? (E.g., referring back to my example: if I just looked at 2.1, I don't want to look at 2.2 on the next iteration, because I just found that the average is 2.15 for all numbers similar to 2.1, etc.)

Here were my attempts:

points = points.Except(tempList).ToList();
//and
foreach (var t in tempList)
{
points.Remove(t);
}


I successfully remove the points from the points list, BUT in my main loop, where I'm going through each of the points, it still iterates over DELETED points, which I find very strange.

Answer

It seems like what you want is to get the averages of the similar numbers and use that list instead of the original list. Here's a Linq example using one statement that assumes "similar" means having the same integer part. I'm sure this could be modified to use the Abs calculation, but it would be a bit more complex. See if this is on the right track.

var points =
    from p in new[] { 2.1, 2.2, 4, 4.1, 8, 8.2 }
    group p by (int) p into avgs
    select avgs.Average();

Console.WriteLine(String.Join(",", points.Select(p => p.ToString())));

The result is:

2.15,4.05,8.1
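
As a rough sketch of how the grouping might be adapted to the Abs test from the question: sort the values, then start a new cluster whenever a value is not within the tolerance of the first value of the current cluster. The class and method names (SimilarPoints, AverageClusters) are hypothetical, and anchoring each cluster on its first value is an assumption about what "similar" should mean, not something the question specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class SimilarPoints
{
    // Hypothetical helper (not from the question or the answer).
    // Assumption: a value belongs to the current cluster when it lies within
    // `tolerance` of that cluster's first (smallest) value.
    public static List<double> AverageClusters(IEnumerable<double> values, double tolerance)
    {
        var clusters = values
            .OrderBy(v => v)
            .Aggregate(new List<List<double>>(), (acc, v) =>
            {
                // Look at the most recently started cluster, if there is one.
                var last = acc.Count > 0 ? acc[acc.Count - 1] : null;
                if (last != null && Math.Abs(v - last[0]) < tolerance)
                    last.Add(v);                      // close enough: join it
                else
                    acc.Add(new List<double> { v });  // otherwise start a new cluster
                return acc;
            });

        return clusters.Select(c => c.Average()).ToList();
    }

    public static void Main()
    {
        var points = new List<double> { 2.1, 2.2, 4, 4.1, 8, 8.2 };
        var averages = AverageClusters(points, 0.25);

        // Formatted to two decimals to hide floating-point noise.
        Console.WriteLine(string.Join(", ",
            averages.Select(a => a.ToString("0.##", CultureInfo.InvariantCulture))));
        // prints 2.15, 4.05, 8.1
    }
}
```

Because the input is sorted first, the original list is never modified and the result does not depend on the order of the input.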

UPDATE:

This is in response to the OP's comment on averaging within .5 of the "current" number. Setting aside the definition of "current" for the moment, here's an example that averages only numbers that are within .5 of their integer part; any other number goes into a group keyed by its own value.

var points =
    from p in new[] { 2.1, 2.2, 2.6, 4, 4.2, 4.7, 4.8 }
    let intp = (double)((int)p)
    let grp = (p - intp < .5) ? intp : p
    group p by grp into avgs
    select avgs.Average();

var averages = String.Join(",", points.Select(p => p.ToString()));

Console.WriteLine(averages);

Result:

2.15,2.6,4.1,4.7,4.8

Your concept of "current" number makes this a bit muddy. When using Linq as it is designed to work, you are only ever looking at one item in a sequence. You, theoretically, have no knowledge of position within a sequence or of the other items in the sequence. The grouping mechanism allows you to use aggregation methods that accumulate values in an "as you go" fashion.

Using a "current" number as you pose in the question requires a look-ahead approach and an ascending ordered sequence. You actually do supply one, so maybe that is exactly what you want. In that case, Linq may be the wrong tool. You say you have it working in a loop. For comparison, the above Linq would translate into a loop something like this:

var points = new[] { 2.1, 2.2, 2.6, 4, 4.2, 4.7, 4.8 };
var groups = new Dictionary<double, List<double>>();

foreach (var p in points)
{
    var intp = (double)((int)p);
    if (p - intp < .5)
    {
        if (!groups.ContainsKey(intp))
        {
            groups[intp] = new List<double>();
        }
        groups[intp].Add(p);
    }
    else
    {
        groups[p] = new List<double> { p };
    }
}

points = groups.Select(dict => dict.Value.Average()).ToArray();

Or you can replace the "foreach" with a "for" loop, which allows for a bit more manipulation of the original list of values (points and groups are declared as in the previous example):

for (int i = 0; i < points.Length; i++)
{
    var p = points[i];
    var intp = (double)((int)points[i]);
    if (p - intp < .5)
    {
        if (!groups.ContainsKey(intp))
        {
            groups[intp] = new List<double>();
        }
        groups[intp].Add(p);
    }
    else
    {
        groups[p] = new List<double> { p };
    }
}

As a rule of thumb, if you can imagine iterating using a foreach, you can almost certainly use Linq. If you need access to other items in the sequence while iterating, it may not be worth the trouble.
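
To illustrate what "access to other items" can look like in Linq, here is a minimal sketch that pairs each value with its successor using Zip and Skip(1). The NeighborGaps class and Gaps method are hypothetical names. The sketch only computes the gap between neighbours; it does not merge anything, and deciding to skip the next item after a merge is the part that does not map cleanly onto this pattern.

```csharp
using System;
using System.Linq;

public static class NeighborGaps
{
    // Hypothetical helper: pairs each element with the one after it and
    // returns the differences (next - current). Assumes the input is sorted.
    public static double[] Gaps(double[] sorted) =>
        sorted.Zip(sorted.Skip(1), (current, next) => next - current).ToArray();

    public static void Main()
    {
        var points = new[] { 2.1, 2.2, 2.6, 4, 4.2, 4.7, 4.8 };
        var gaps = Gaps(points);

        // One gap per adjacent pair, so there is one fewer gap than points.
        Console.WriteLine(gaps.Length);               // prints 6
        // Only the jump from 2.6 to 4 is larger than 1.
        Console.WriteLine(gaps.Count(g => g > 1));    // prints 1
    }
}
```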

UPDATE:

I have duplicated the logic using a "for" loop that would give the result stated in the comments. This is likely to be roughly equivalent to the OP's use of a "while" loop.

var tmpPoints = new List<double>();
for (var i = 0; i < points.Length;)
{
    var value = points[i];
    var next = i + 1;
    if (next < points.Length && points[next] - points[i] < .5)
    {
        value = (points[i] + points[next]) / 2;
        i = next + 1;
    }
    else
    {
        i++;
    }
    tmpPoints.Add(value);
}

points = tmpPoints.ToArray();

// Results using the two example sequences

points = new[] { 2.1, 2.2, 2.6, 4, 4.2, 4.7, 4.8 };

Result: 2.15, 2.6, 4.1, 4.75

points = new[] { 8.8, 9.0 };

Result: 8.9

This is specialized logic that resembles a moving average but uses only the existing values to calculate each average. It requires a look-ahead approach, because you cannot know whether the "current" value will appear in the final sequence without examining the next value. Logic of this type could be written using Linq extension methods like Skip and Take, but I don't see a reasonable way to express it in Linq query syntax. It makes for a good academic exercise, but in the real world this use case calls for a straightforward loop. Even if you could get it to work, a Linq version would be far less readable and maintainable, and would likely be slower than the loop in this example.
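
Returning to the original question of removing the picked points while working through the list: one possible explanation for still seeing "deleted" points, assuming the main loop is a foreach over points, is that reassigning points (as in the Except attempt) leaves the loop enumerating the old list object, while calling Remove on a List<T> that is being enumerated by foreach throws an InvalidOperationException. A minimal sketch that avoids both problems is to consume the list in a while loop and use List<T>.RemoveAll. The ConsumeSimilar class and AverageAndRemove method are hypothetical names, and taking the first remaining point as the "current" one is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class ConsumeSimilar
{
    // Hypothetical helper: repeatedly takes the first remaining point, averages it
    // with every other point within `tolerance` of it, and removes all of them.
    // Note: this empties the list that is passed in.
    public static List<double> AverageAndRemove(List<double> points, double tolerance)
    {
        var averages = new List<double>();
        while (points.Count > 0)
        {
            var current = points[0];
            points.RemoveAt(0); // always removes one point, so the loop terminates

            Predicate<double> isSimilar = p => Math.Abs(p - current) < tolerance;

            var group = new List<double> { current };
            group.AddRange(points.FindAll(isSimilar)); // pick the similar points...
            points.RemoveAll(isSimilar);               // ...and drop them from the list

            averages.Add(group.Average());
        }
        return averages;
    }

    public static void Main()
    {
        var points = new List<double> { 2.1, 2.2, 4, 4.1, 8, 8.2 };
        var averages = AverageAndRemove(points, 0.25);

        Console.WriteLine(string.Join(", ",
            averages.Select(a => a.ToString("0.##", CultureInfo.InvariantCulture))));
        // prints 2.15, 4.05, 8.1
        Console.WriteLine(points.Count); // prints 0
    }
}
```

The while loop re-reads points.Count on every pass, so it never visits a point that has already been removed.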
