Biljana Gajic Biljana Gajic - 1 month ago 8
Java Question

Java: Finding jumbled word

In my country there is Game Show called Slagalica where one of the tasks is to find longest word in array of 12 letters. Size of the longest word is always 10, 11 or 12.

I have file with words from my language I use as database. Words that have 10, 11 or 12 letters in them I've saved in List (listWordSize10_11_12).
When I enter jumbled word [12 letters] in I want from my program to find what word is that originally. I know how to make it work when jumbled word is 12 letters word but I can't work it out when it's less.

Example: 10 letter word is jumbled + 2 random letters.

Goal is for that 10 letter word to be recognized and printed in original state.

Where is what I've done:

// un-jumbling word
System.out.println("Unesite rijec koja treba da se desifruje: ");
String jumbledWord = tast.nextLine();
char[] letter = jumbledWord.toCharArray();
Arrays.sort(letter);
String sorted_Mistery_Word = new String(letter);


for (int i = 0; i < listWordSize10_11_12.size(); i++) {
int exception = 0;
char[] letter_2 = listWordSize10_11_12.get(i).toCharArray();
Arrays.sort(letter_2);
String longWords = new String(letter_2);
int j = i;
while(longWords.length()>i){
if(sorted_Mistery_Word.charAt(j)!=longWords.charAt(i)){
exception++;
j++;
}
}
if(exception < 3){
System.out.println("Your word is: "+listWordSize10_11_12.get(i));
break;
}
}


Thanks!!!
P.S. This is not a homework or some job, just project I've been doing for fun!

Thanks everyone for the help, I've learned a lot!

Answer

Your basic approach for 12 characters, which I would characterize as fingerprinting, will also work for 10 or 11 letter words with some modification.

That is, rather that just sorting the letters in each candidate word as you examine it to create the fingerprint, pre-process your array to create a small(ish) fingerprint of each word as a byte[]. Using the English alphabet and ignoring case, for example, you could create a 26-byte array for each word, where each byte position contained the count of each letter in the word.

That is fingerprint[0] contains the number of 'a' characters in the word, and fingerprint[25] the number of 'z' characters.

Then just replace your sorted_Mistery_Word.charAt(j)!=longWords.charAt(i) check with a loop that increments a temporary array for each letter in the mystery word. Finally, check that the temporary array has at least the same value for every position. Something like:

byte [] makeFingerprint(String s) {
  byte[] fingerprint = new byte[26];
  for (char c : letter_2) {
    fingerprint[c - 'a']++;
  }

  return fingerprint;
}

/** determine if sub is a subset of super */
boolean isSubset(byte[] sub, byte[] super) {
  for (int i=0; i < sub.length; i++) {
    if (sub[i] > super[i]) 
      return false;
  }
  return true;
}

void findMatch(String jumbledWord) {

  byte[] fingerprint = makeFingerprint(jumbledWord);

  for (byte[] candidate : fingerprintList) {   
        if (isSubset(fingerprint, candidate)) {
            System.out.println("Your word is: " + ...);
            break;                
        }
    }
}

Here I omitted the creation of the fingerprintList - but it just involves fingerprint each word.

There are plenty of optimizations possible, but this should already be quite a bit faster than your version (and is "garbage free" in the main loop). It can handle candidates of any length (not just 10-12 chars). The biggest optimization, if you will check many words, is to try to use the "fingerpint" as a key for direct lookup. For the 12 character case this is trivial (direct lookup), but for the 10 and 11 or this you would likely have to use type of technique for higher-dimensional searching - locality sensitive hashing seems like a natural fit.