MasterBolle MasterBolle - 4 months ago 6x
Java Question

object not detected in a Map

I have this HashMap:

HashMap<NGram, Integer> countMap = new HashMap<>();

I'm trying to fill it (snippet is in a loop):

NGram ng = new NGram(param);
if (countMap.containsKey(ng)){ // or if(countMap.get(ng) != null){
int counter = countMap.get(ng);
countMap.put(ng, ++counter);
countMap.put(ng, 1);

never returns true (when it should), so the else-statement will always be executed! I overwrote
and they're working just fine. With primitive datatypes the snippet will work.

So, my question is: How do I convince Java, that there's already the object in the HashMap?

Here's the code of

public String toString(){
String ret = "";
for (String s : previousToken){
ret += s + " ";
ret += token;
return ret;

public boolean equals(NGram other){
return true;
return false;

public int hashCode(){
String ts = this.toString();
int result = 3;
for(int i = 0; i < ts.length(); i++){
result = 2 * result + ((int)ts.charAt(i));
return result;


Here is an example of what your 3 methods should be.
This is what is called an MCVE (Minimal, Complete, and Verifiable example).

The primary concept is that constructing a new combined string by calling toString() is not the right way to implement equals() and hashCode().

This is mostly for performance reasons. With 18 million objects added to the map, performance of equals() and hashCode() matters.

Another reason it's a bad idea, is that toString() may not show all values of the object, or that two different objects may still have the same toString() result, e.g. if tokens could contain spaces, then concatenating them would not be able to differentiate A B, C from A, B C.

For the example, a String[] is used for the previousToken field, and a varargs constructor is used to simplify testing. Null tokens not allowed. Yours is likely slightly different, but this shows the concepts.

public class NGram {
    private String   token;
    private String[] previousToken; // could also be List<String>
    public NGram(String ... allTokens) {
        if (allTokens.length == 0)
            throw new IllegalArgumentException("At least one token is required");
        for (String s : allTokens)
            if (s == null)
                throw new NullPointerException();
        this.token = allTokens[allTokens.length - 1];
        this.previousToken = Arrays.copyOfRange(allTokens, 0, allTokens.length - 1);
    public String toString() {
        StringBuilder buf = new StringBuilder();
        for (String s : this.previousToken)
            buf.append(s).append(' ');
        return buf.append(this.token).toString();
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null || getClass() != obj.getClass())
            return false;
        NGram that = (NGram) obj;
        return this.token.equals(that.token) &&
               Arrays.equals(this.previousToken, that.previousToken);
    public int hashCode() {
        return this.token.hashCode() + 31 * Arrays.hashCode(this.previousToken);


This shows a faster, more compact version of the code to add a new object to the map. It is faster because it only performs one lookup per add, and made more compact by using a ternary operator instead of an if statement, combining the two put() calls.

It's isolated in a helper method for easier testing.

public static void main(String[] args) {
    Map<NGram, Integer> countMap = new HashMap<>();
    add(countMap, new NGram("Foo"));
    add(countMap, new NGram("Bar"));
    add(countMap, new NGram("Foo", "Bar"));
    add(countMap, new NGram("This", "is", "a", "test"));
    add(countMap, new NGram("Bar"));
private static void add(Map<NGram, Integer> countMap, NGram ng) {
    Integer counter = countMap.get(ng);
    countMap.put(ng, (counter != null ? counter + 1 : 1));


{Bar=2, Foo=1, This is a test=1, Foo Bar=1}

As you can see, the duplicate Bar value is counted twice.

If you end up with lots of duplicate objects, i.e. with high counter values, then using an immutable Integer object as the counter is not great for performance. You can use an int[1], but a mutable helper class is better, since it can provide a good toString() implementation.

private static void add(Map<NGram, Counter> countMap, NGram ng) {
    Counter counter = countMap.get(ng);
    if (counter == null)
        countMap.put(ng, new Counter(1));
private static final class Counter {
    private int count;
    public Counter(int count) {
        this.count = count;
    public void increment() {
    public int get() {
        return this.count;
    public String toString() {
        return Integer.toString(this.count);