FourOfAKind - 2 years ago 89

Java Question

I've been trying to implement Rabin-Karp algorithm in Java. I have hard time computing the rolling hash value in constant time. I've found one implementation at http://algs4.cs.princeton.edu/53substring/RabinKarp.java.html. Still I could not get how these two lines work.

`txtHash = (txtHash + Q - RM*txt.charAt(i-M) % Q) % Q;`

txtHash = (txtHash*R + txt.charAt(i)) % Q;

I looked at couple of articles on modular arithmetic but no article could able to penetrate my thick skull. Please give some pointers to understand this.

Recommended for you: Get network issues from **WhatsUp Gold**. **Not end users.**

Answer Source

This is the "rolling" aspect of the hash. It's eliminating the contribution of the oldest character (`txt.charAt(i-M)`

), and incorporating the contribution of the newest character(`txt.charAt(i)`

).

The hash function is defined as:

```
M-1
hash[i] = ( SUM { input[i-j] * R^j } ) % Q
j=0
```

(where I'm using `^`

to denote "to the power of".)

But this can be written as an efficient recursive implementation as:

```
hash[i] = (txtHash*R - input[i-M]*(R^M) + input[i]) % Q
```

Your reference code is doing this, but it's using various techniques to ensure that the result is always computed correctly (and efficiently).

So, for instance, the `+ Q`

in the first expression has no mathematical effect, but it ensures that the result of the sum is always positive (if it goes negative, `% Q`

doesn't have the desired effect). It's also breaking the calculation into stages, presumably to prevent numerical overflow.

Recommended from our users: **Dynamic Network Monitoring from WhatsUp Gold from IPSwitch**. ** Free Download**