nikorio - 22 days ago
C# Question

How to remove duplicate strings on odd-indexed lines, together with the following even line, from a text file while leaving even lines alone

I'm trying to remove duplicate strings from a text document of about 30,000 rows, but only when the duplicate sits on an odd-indexed line; in that case the next (even) line must be removed along with it. Duplicates on even lines must be left alone: an even line is removed only if it directly follows an odd duplicate. For example, given this content with index numbers:

0. some text 1
1. some text 2
2. some text 3
3. some text 2
4. some text 5
5. some text 6
6. some text 2
7. some text 7
8. some text 2
9. some text 9


it must be processed this way:

some text 1
some text 2 // keep: first occurrence
some text 3
some text 2 // remove: duplicate on an odd line
some text 5 // remove: even line right after an odd duplicate
some text 6
some text 2 // keep: this duplicate is on an even line
some text 7
some text 2 // keep: this duplicate is on an even line
some text 9


to get this:

some text 1
some text 2
some text 3
some text 6
some text 2
some text 7
some text 2
some text 9


But I'm not sure how to get this result. It seems I have to read all the lines and check each index:

if (index % 2 == 0)
{

}


but I can't work out how to compare the lines from there.

Answer


But your expected output isn't right... the correct result should be:

+EVEN: some text 1
+ODD: some text 2
+EVEN: some text 3
-ODD: some text 2
-EVEN: some text 5
+ODD: some text 6
+EVEN: some text 2
+ODD: some text 7
+EVEN: some text 2 // you removed this, but it is on an even line, not an odd one
+ODD: some text 9 // you removed this, but this line is not a duplicate

Code:

using System.Collections.Generic;

string[] lines = System.IO.File.ReadAllLines("/path/to/file.txt");
List<string> newLines = new List<string>();
for (int x = 0; x < lines.Length; x++)
{
    // Odd index whose text has already been kept earlier: drop it...
    if (x % 2 == 1 && newLines.Contains(lines[x]))
        x++; // ...and skip the next (even) line as well
    else
        newLines.Add(lines[x]);
}
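
A possible variation, not part of the original answer: the loop above calls List<string>.Contains for every odd line, which scans the whole list each time. For a file of about 30,000 rows that is probably still acceptable, but a HashSet<string> kept alongside the list makes each lookup roughly constant-time. The sketch below assumes the same rule as the answer (an odd line counts as a duplicate if its text matches any line kept so far, odd or even) and that lines are compared exactly, case-sensitively. The names OddDuplicateFilter and Filter are hypothetical, chosen only for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class OddDuplicateFilter
{
    // Hypothetical helper: returns the lines that survive the rule
    // "drop an odd-indexed line whose text was already kept, plus the even line after it".
    public static List<string> Filter(IReadOnlyList<string> lines)
    {
        var kept = new List<string>();
        // Mirrors the content of 'kept' so membership checks avoid a list scan.
        var seen = new HashSet<string>(StringComparer.Ordinal);

        for (int i = 0; i < lines.Count; i++)
        {
            bool isOdd = i % 2 == 1;
            if (isOdd && seen.Contains(lines[i]))
            {
                i++;      // also skip the even line that follows the duplicate
                continue; // the loop's own i++ then moves on to the next odd line
            }

            kept.Add(lines[i]);
            seen.Add(lines[i]);
        }

        return kept;
    }

    public static void Main()
    {
        // The ten sample lines from the question, indices 0 to 9.
        var input = new[]
        {
            "some text 1", "some text 2", "some text 3", "some text 2", "some text 5",
            "some text 6", "some text 2", "some text 7", "some text 2", "some text 9"
        };

        foreach (string line in Filter(input))
            Console.WriteLine(line);
    }
}
```

To apply it to a file, something like System.IO.File.WriteAllLines(outputPath, OddDuplicateFilter.Filter(System.IO.File.ReadAllLines("/path/to/file.txt"))) should work, where outputPath is a placeholder for wherever the cleaned file should go.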