Andreas Bonini Andreas Bonini - 1 month ago 6
C++ Question

How many palindromes can be formed by selections of characters from a string?

I'm posting this on behalf of a friend since I believe this is pretty interesting:


Take the string "abb". By leaving out
any number of letters less than the
length of the string we end up with 7
strings.

a b b ab ab bb abb

Out of these 4 are palindromes.

Similarly for the string

"hihellolookhavealookatthispalindromexxqwertyuiopasdfghjklzxcvbnmmnbvcxzlkjhgfdsapoiuytrewqxxsoundsfamiliardoesit"

(a length 112 string) 2^112 - 1
strings can be formed.

Out of these how many are
palindromes??


Below there is his implementation (in C++, C is fine too though). It's pretty slow with very long words; he wants to know what's the fastest algorithm possible for this (and I'm curious too :D).

#include <iostream>
#include <cstring>

using namespace std;



void find_palindrome(const char* str, const char* max, long& count)
{
for(const char* begin = str; begin < max; begin++) {
count++;
const char* end = strchr(begin + 1, *begin);
while(end != NULL) {
count++;
find_palindrome(begin + 1, end, count);
end = strchr(end + 1, *begin);
}
}
}


int main(int argc, char *argv[])
{
const char* s = "hihellolookhavealookatthis";
long count = 0;

find_palindrome(s, strlen(s) + s, count);

cout << count << endl;
}

Answer

First of all, your friend's solution seems to have a bug since strchr can search past max. Even if you fix this, the solution is exponential in time.

For a faster solution, you can use dynamic programming to solve this in O(n^3) time. This will require O(n^2) additional memory. Note that for long strings, even 64-bit ints as I have used here will not be enough to hold the solution.

#define MAX_SIZE 1000
long long numFound[MAX_SIZE][MAX_SIZE]; //intermediate results, indexed by [startPosition][endPosition]

long long countPalindromes(const char *str) {
    int len = strlen(str);
    for (int startPos=0; startPos<=len; startPos++)
        for (int endPos=0; endPos<=len; endPos++)
            numFound[startPos][endPos] = 0;

    for (int spanSize=1; spanSize<=len; spanSize++) {
        for (int startPos=0; startPos<=len-spanSize; startPos++) {
            int endPos = startPos + spanSize;
            long long count = numFound[startPos+1][endPos];   //if str[startPos] is not in the palindrome, this will be the count
            char ch = str[startPos];

            //if str[startPos] is in the palindrome, choose a matching character for the palindrome end
            for (int searchPos=startPos; searchPos<endPos; searchPos++) {
                if (str[searchPos] == ch)
                    count += 1 + numFound[startPos+1][searchPos];
            }

            numFound[startPos][endPos] = count;
        }
    }
    return numFound[0][len];
}

Explanation:

The array numFound[startPos][endPos] will hold the number of palindromes contained in the substring with indexes startPos to endPos.

We go over all pairs of indexes (startPos, endPos), starting from short spans and moving to longer ones. For each such pair, there are two options:

  1. The character at str[startPos] is not in the palindrome. In that case, there are numFound[startPos+1][endPos] possible palindromes - a number that we have calculated already.

  2. character at str[startPos] is in the palindrome (at its beginning). We scan through the string to find a matching character to put at the end of the palindrome. For each such character, we use the already-calculated results in numFound to find number of possibilities for the inner palindrome.

EDIT:

  • Clarification: when I say "number of palindromes contained in a string", this includes non-contiguous substrings. For example, the palindrome "aba" is contained in "abca".

  • It's possible to reduce memory usage to O(n) by taking advantage of the fact that calculation of numFound[startPos][x] only requires knowledge of numFound[startPos+1][y] for all y. I won't do this here since it complicates the code a bit.

  • Pregenerating lists of indices containing each letter can make the inner loop faster, but it will still be O(n^3) overall.