D.Spetz - 1 month ago
Java Question

Hash items in a 2d array, but only on one index

So, I have a 2d array (really, a List of Lists) that I need to squish down and remove any duplicates, but only for a specific field.

The basic layout is a list of Matches, with each Match having an ID number and a date. I need to remove all duplicates such that each ID only appears once. If an ID appears multiple times in the List of Matches, then I want to take the Match with the most recent date.

My current solution has me taking the List of Matches, adding it to a HashSet, and then converting that back to an ArrayList. However all that does is remove any exact Match duplicates, which still leaves me with the same ID appearing multiple times if they have different dates.

Set<Match> deDupedMatches = new HashSet<Match>(matches); // matches: the original List<Match>
List<Match> finalList = new ArrayList<Match>(deDupedMatches);

If my original data coming in is

{(1, 1-1-1999),(1, 2-2-1999),(1, 1-1-1999),(2, 3-3-2000)}

then what I get back is

{(1, 1-1-1999),(1, 2-2-1999),(2, 3-3-2000)}

But what I am really looking for is a solution that would give me

{(1, 2-2-1999),(2, 3-3-2000)}

I had some vague idea of hashing the original list in the same basic way, but only using the IDs. Basically I would end up with "buckets" based on the ID that I could iterate over, and any bucket that had more than one Match in it I could choose the correct one for. The thing that is hanging me up is the actual hashing. I am just not sure how or if I can get the Matches broken up in the way that I am thinking of.


If I understand your question correctly, you want one Match per distinct ID, keeping the Match with the latest date for that ID.

A Set decides whether two Match objects are duplicates by calling equals() and hashCode() on the whole object, so it cannot deduplicate on the ID field alone.

What I would do to get around this problem is use a HashMap, which links keys to values. Keys cannot be repeated; values can.
I would do something like this while looping through the matches:

if (map.putIfAbsent(match.getID(), match) != null &&
    map.get(match.getID()).getDate() < match.getDate()) { // < stands in for your date comparison
    map.replace(match.getID(), match);
}
  • This runs inside a loop over your matches.
  • It puts the current Match in under its ID if that ID isn't in the map yet.
  • .putIfAbsent returns the previous value, which is null if the ID was not in the map.
  • So checking the result of putIfAbsent tells you whether there was already an item at that ID (two birds with one stone).
  • After that it is safe to compare the two dates (the one in the map and the one from the iteration; the < is a placeholder for your comparison method).
  • If the new one is later, replace the current Match using .replace.
  • Finally, to get your list, use .values(), e.g. new ArrayList<Match>(map.values()).

This will remove duplicate IDs and leave only the latest ones.
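
Putting the steps above together, a minimal sketch might look like the following. The question doesn't show the real Match class, so the one here is a hypothetical stand-in: it assumes an int ID and a java.time.LocalDate date, and it uses isBefore in place of the < placeholder. The class name DedupeMatches and the method name latestPerId are also made up for illustration.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DedupeMatches {

    // Hypothetical stand-in for the asker's Match class (not shown in the question).
    static class Match {
        private final int id;
        private final LocalDate date;

        Match(int id, LocalDate date) {
            this.id = id;
            this.date = date;
        }

        int getID() { return id; }
        LocalDate getDate() { return date; }

        @Override
        public String toString() { return "(" + id + ", " + date + ")"; }
    }

    // Keeps one Match per ID, preferring the Match with the most recent date.
    static List<Match> latestPerId(List<Match> matches) {
        Map<Integer, Match> map = new HashMap<>();
        for (Match match : matches) {
            // putIfAbsent stores the match if the ID is new and returns null;
            // otherwise it leaves the map alone and returns the stored match.
            Match existing = map.putIfAbsent(match.getID(), match);
            if (existing != null && existing.getDate().isBefore(match.getDate())) {
                // The stored match is older, so swap in the newer one.
                map.replace(match.getID(), match);
            }
        }
        return new ArrayList<>(map.values());
    }

    public static void main(String[] args) {
        // Same data as in the question.
        List<Match> matches = Arrays.asList(
                new Match(1, LocalDate.of(1999, 1, 1)),
                new Match(1, LocalDate.of(1999, 2, 2)),
                new Match(1, LocalDate.of(1999, 1, 1)),
                new Match(2, LocalDate.of(2000, 3, 3)));
        System.out.println(latestPerId(matches));
    }
}
```

With the data from the question, the result should hold two matches: the 2-2-1999 one for ID 1 and the 3-3-2000 one for ID 2. A HashMap makes no promise about iteration order, so if the order of the final list matters, a LinkedHashMap (which keeps first-insertion order) may be a better fit.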

Apologies for typos or code errors, this was done on a phone. Please notify me of any errors in the comments.

Java 7's Map does not have .putIfAbsent or .replace (they were added in Java 8), but .containsKey and .put can be substituted for them.
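
For Java 7, the same loop could be sketched with .containsKey and .put as below. Again the Match class is a hypothetical stand-in; since java.time is not available before Java 8, this version assumes the date is a java.util.Date and compares with Date.before. The timestamps in main are arbitrary millisecond values chosen only to show the ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Date;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DedupeMatchesJava7 {

    // Hypothetical stand-in for the asker's Match class, using java.util.Date.
    static class Match {
        private final int id;
        private final Date date;

        Match(int id, Date date) {
            this.id = id;
            this.date = date;
        }

        int getID() { return id; }
        Date getDate() { return date; }
    }

    // Keeps one Match per ID, preferring the Match with the most recent date.
    static List<Match> latestPerId(List<Match> matches) {
        Map<Integer, Match> map = new HashMap<Integer, Match>();
        for (Match match : matches) {
            Integer id = match.getID();
            // Store the match if the ID is new, or if the stored match is older.
            if (!map.containsKey(id) || map.get(id).getDate().before(match.getDate())) {
                map.put(id, match);
            }
        }
        return new ArrayList<Match>(map.values());
    }

    public static void main(String[] args) {
        List<Match> matches = Arrays.asList(
                new Match(1, new Date(1000L)),   // earlier entry for ID 1
                new Match(1, new Date(2000L)),   // later entry for ID 1, should win
                new Match(2, new Date(3000L)));
        for (Match m : latestPerId(matches)) {
            System.out.println(m.getID() + " -> " + m.getDate().getTime());
        }
    }
}
```

Here .put does double duty: it inserts a new ID and overwrites an older Match, so no separate .replace call is needed.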