kchoi kchoi - 2 months ago 12
Java Question

Generating the lexicographically greatest string

The question is to generate the lexicographically greatest string given some string s.
So the aim is to find lexicographically greatest, unique(no repetitions) substring s1 from s.
We say that some subsequence s1 is greater than another subsequence s2 if s1 has more characters than s2 or s1 is lexicographically greater than s2 if equal length.

I/O are as follows:

Input is: babab

output is: ba

Second input is: nlhthgrfdnnlprjtecpdrthigjoqdejsfkasoctjijaoebqlrgaiakfsbljmpibkidjsrtkgrdnqsknbarpabgokbsrfhmeklrle

Second output is:
tsocrpkijgdqnbafhmle

This is what I wrote for my java code but my code fails on the second test case. Also I'm having a hard time understanding why second output isn't tsrqponmlkjihgfedcba.
Can somebody provide suggestions for a fix or even java code?

I think the algorithm has to be more efficient than generating all possible unique strings, sort them and find lexicographically largest one.

To make the question much clearer, if the input is babab, then all the possible unique combinations would be b, a, ba, ab. And the output will be ba because it's the longest and lexicographically greater than ab.

Note: this is not a homework assignment.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;


public class mostBeautiful {
final static int MAX = 1000000;
static String[] permute;
static void permutation(String prefix, String str, int counter) {
int n = str.length();
//System.out.println("n is: "+ n);
if (n == 0) {
permute[counter] = prefix;
} else {
for (int i = 0; i < n; i++) {
//System.out.println("str is: "+ str);
permutation(prefix + str.charAt(i), str.substring(0, i) + str.substring(i+1, n), counter++);
}
}
}
public static void main(String[] args) throws IOException {
BufferedReader bf = new BufferedReader(new InputStreamReader(System.in));
String s = bf.readLine();
char[] unique = new char[26];
int counter = 0;
String answer = "";
//System.out.println("s is: " + s);
int ascii = 0;
final int asciiAVal = 97;
final int asciiZVal = 122;
for (int i = 0; i < s.length(); i++) {
ascii = (int)s.charAt(i);
if (ascii < asciiAVal || ascii > asciiZVal) {
continue;
}
char ch = s.charAt(i);
unique[ch - 'a'] = ch;
}
String result = "";
for (int j = 25; j >= 0; j--) {
result += unique[j];
}
result = result.trim();
System.out.println(result);
int size = result.length() * (result.length() - 1);
permute = new String[size];
permutation("", result, counter);
for (int i = 1; i < size; i++) {
if (permute[i].compareTo(permute[i - 1]) > 0){
answer = permute[i];
} else {
answer = permute[i - 1];
}

}
System.out.println("answer is: " + answer);
}


}

Answer Source

After thinking about this problem in many ways, I have determined a divide-and-conquer algorithm that gets the results right:

Algorithm - Pseudocode

Assuming some input string, S defined as a concatenation of two substrings A + B, we compute the lexicographically greatest string recursively as:

LexMax(S) = Merge(LexMax(A),LexMax(B))

Where

LexMax(S)
{
    if Length(S) = 1
        return S
    else
    {
        LMA = LexMax(S[0:Length/2])
        LMB = LexMax(S[Length/2:end])
        return Merge(LMA,LMB)
    }
}

Merge(A,B)
{
    Sa = A
    Sb = B

    for n = 0:Length(A)
    {
        if Sb contains A[n]
        {
            if A[n+1:end] contains character > A[n]
                Remove A[n] from Sa
            else
                Remove A[n] from Sb
        }
    }

    return Sa + Sb
}

Java Code

Coming soon!

Example

Given an input string

cefcfdabbcfed

Divide it into

cefcfda
bbcfed

Assuming the function works we have:

LexMax("cefcfda") = "efcda"
LexMax("bbcfed") = "bcfed"

Merging works as follows:

e: efcda bcfed

In both substrings, greater value found to right of e in left substring, remove from left

f: fcda bcfed

In both substrings, no greater value in left substring, remove from right

c: fcda bced

In both substrings, greater value found to right of c in left substring, remove from left

d: fda bced

In both substrings, no greater value in left substring, remove from right

a: fda bce

Not in both substrings, do nothing

Final result:

LexMax(cefcfdabbcfed) = fdabce