Ms. Corlib - 1 year ago 83
C# Question

Is this fact true about anagrammatical substrings?

I'm trying to write an algorithm that finds the number of anagrammatical substrings of a string. For instance, the string

`"abba"`
has 4:

(1)
`"a"`
,
`"a"`

(2)
`"b"`
,
`"b"`

(3)
`"ab"`
,
`"ba"`

(4)
`"abb"`
,
`"bba"`

A fact I'm trying to use to optimize is

If a string has no anagrammatical pairs of substrings of length k,
then it has no anagrammatical pairs of substrings of length k+1

Can you confirm whether that's true or not?

Because my algorithm

``````static int NumAnagrammaticalPairs(string str)
{
int count = 0; // count of anagrammatical pairs found
int n = str.Length / 2; // OPTIMIZATION: only need to look through the substrings of half the size or less
for(int k = 1; k <= n; ++k)
{
// get all substrings of length k
var subsk = GetSubstrings(str,k).ToList();

// count the number of anagrammatical pairs
var indices = Enumerable.Range(0, subsk.Count);
int anapairs = (from i in indices
from j in indices
where i < j && IsAnagrammaticalPair(subsk[i], subsk[j])
select 1).Count();

// OPTIMIZATION: if didn't find any anagrammatical pairs in the substrings of length k,
// there are no anagrammatical pairs in the substrings of length k+1, so we can exit
// the loop early
if(anapairs == 0)
break;
else
count += anapairs;
}
return count;
}
``````

is getting results sliggggtttthhhhly off (usually off by 1) the actual results in the test cases.

That's not the case - `abcd` and `cdab` are anagrams of length 4 but you can't find length 3 anagram substrings. Concretely, `abcdab` will not work, since it contains both `abcd` and `cdab`, but no 3 anagrams (from `abc`, `bcd`, `cda`, `dab`).