Zachary Loughridge - 19 days ago
Java Question

How to remove similarly named strings in a list?

Given a list/array of strings:

document
document (1)
document (2)
document (3)
mypdf (1)
mypdf
myspreadsheet (1)
myspreadsheet
myspreadsheet (2)


How do I remove all the duplicates but retain only the highest copy number?

The end result should be:

document (3)
mypdf (1)
myspreadsheet (2)

Answer

Your question is broad, so here is a generic but nonetheless complete answer:

  1. Iterate over all your strings to identify those that end in a number in parentheses.
  2. In other words: identify all the strings that look like "X (n)".
  3. Then, for each distinct X that you found, iterate over the list again to find all occurrences of "X", "X (1)", and so on.
  4. Doing so lets you determine the maximum n for each X.
  5. Push that maximum "X (n)" into your results list.

In other words: a simple recipe like this is enough to solve the problem; all that remains is to spend the time turning these pseudo-code instructions into real code.
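For illustration, here is one possible way to turn the recipe into Java. It is a sketch, not the only way to do it: instead of iterating over the list once per X, it collapses the steps into a single pass that remembers the highest n seen for each X. The class and method names (HighestCopy, keepHighest) are made up for this example, and it assumes that a copy is always written as the name, one space, and a non-negative number in parentheses that fits in an int.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HighestCopy {
    // Assumed naming scheme "X (n)": group 1 is the base name X, group 2 is n.
    private static final Pattern COPY = Pattern.compile("(.*) \\((\\d+)\\)");

    static List<String> keepHighest(List<String> names) {
        // Highest copy number seen so far for each base name.
        Map<String, Integer> bestNumber = new HashMap<>();
        // The original string carrying that number; LinkedHashMap keeps
        // the base names in the order they first appear in the input.
        Map<String, String> bestName = new LinkedHashMap<>();

        for (String name : names) {
            Matcher m = COPY.matcher(name);
            boolean numbered = m.matches();
            String base = numbered ? m.group(1) : name;
            // A bare "X" without "(n)" is treated as copy number 0.
            int n = numbered ? Integer.parseInt(m.group(2)) : 0;

            if (n > bestNumber.getOrDefault(base, -1)) {
                bestNumber.put(base, n);
                bestName.put(base, name);
            }
        }
        return new ArrayList<>(bestName.values());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "document", "document (1)", "document (2)", "document (3)",
                "mypdf (1)", "mypdf",
                "myspreadsheet (1)", "myspreadsheet", "myspreadsheet (2)");
        // Prints: [document (3), mypdf (1), myspreadsheet (2)]
        System.out.println(keepHighest(input));
    }
}
```

Because the comparison is on the parsed number, this version does not care in which order the copies appear in the list.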

For the record: if the layout of your list is really as shown above, then things become a bit easier - as it seems that your numbers are always increasing. What I mean is:

X (1)
X (2)
X (3)

is easier to treat than

X (1)
X (3)
X (2)

In your case, it seems safe to assume that the last "X (n)" contains the largest n, which makes using a HashMap (as suggested by cainiaofei) a nice solution.
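A rough sketch of that idea follows. It is not cainiaofei's actual code, only an illustration of "the last one wins"; the names LastCopyWins and keepLast are invented here. It uses a LinkedHashMap (a HashMap that remembers insertion order) so the result comes out in the order of the input, and it assumes a copy is always written as "X (n)" with a single space before the parenthesis.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LastCopyWins {
    static List<String> keepLast(List<String> names) {
        // Base name X -> the string currently kept for X.
        Map<String, String> kept = new LinkedHashMap<>();

        for (String name : names) {
            if (name.matches(".* \\(\\d+\\)")) {
                // Numbered copy: strip the trailing " (n)" to get X and
                // overwrite whatever was kept before. No numbers are compared.
                String base = name.substring(0, name.lastIndexOf(" ("));
                kept.put(base, name);
            } else {
                // Bare "X": keep it only if no entry for X exists yet, so it
                // never replaces a numbered copy that came earlier.
                kept.putIfAbsent(name, name);
            }
        }
        return new ArrayList<>(kept.values());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "document", "document (1)", "document (2)", "document (3)",
                "mypdf (1)", "mypdf",
                "myspreadsheet (1)", "myspreadsheet", "myspreadsheet (2)");
        // Prints: [document (3), mypdf (1), myspreadsheet (2)]
        System.out.println(keepLast(input));
    }
}
```

Keep in mind that this shortcut relies entirely on the ordering assumption: for the input X (1), X (3), X (2) it would keep X (2), not X (3).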