createproblem createproblem - 1 month ago 10
PHP Question

Explanation about constant-time algorithm and string comparision

I've a problem to understand two different ways of string comparison. Given is the following function which compares two strings.
This function is used in the Symfony-Framework security component to compare passwords in the user-login process.

/**
* Compares two strings.
*
* This method implements a constant-time algorithm to compare strings.
*
* @param string $knownString The string of known length to compare against
* @param string $userInput The string that the user can control
*
* @return Boolean true if the two strings are the same, false otherwise
*/
function equals($knownString, $userInput)
{
// Prevent issues if string length is 0
$knownString .= chr(0);
$userInput .= chr(0);

$knownLen = strlen($knownString);
$userLen = strlen($userInput);

$result = $knownLen - $userLen;

// Note that we ALWAYS iterate over the user-supplied length
// This is to prevent leaking length information
for ($i = 0; $i < $userLen; $i++) {
// Using % here is a trick to prevent notices
// It's safe, since if the lengths are different
// $result is already non-0
$result |= (ord($knownString[$i % $knownLen]) ^ ord($userInput[$i]));
}

// They are only identical strings if $result is exactly 0...
return 0 === $result;
}


origin: origin snippet

I've problem to understand the difference between the
equals()
function and a simple comparison
===
. I wrote a simple working example to explain my problem.

Given strings:

$password1 = 'Uif4yQZUqmCWRbWFQtdizZ9/qwPDyVHSLiR19gc6oO7QjAK6PlT/rrylpJDkZaEUOSI5c85xNEVA6JnuBrhWJw==';
$password2 = 'Uif4yQZUqmCWRbWFQtdizZ9/qwPDyVHSLiR19gc6oO7QjAK6PlT/rrylpJDkZaEUOSI5c85xNEVA6JnuBrhWJw==';
$password3 = 'iV3pT5/JpPhIXKmzTe3EOxSfZSukpYK0UC55aKUQgVaCgPXYN2SQ5FMUK/hxuj6qZoyhihz2p+M2M65Oblg1jg==';


Example 1 (act as expected)

echo $password1 === $password2 ? 'True' : 'False'; // Output: True
echo equals($password1, $password2) ? 'True' : 'False'; // Output: True


Example 2 (act as expected)

echo $password1 === $password3 ? 'True' : 'False'; // Output: False
echo equals($password1, $password3) ? 'True' : 'False'; // Output: False


I read about the Karp Rabin Algorithm but I'm not sure if the
equals()
function represent
the Karp Rabin Algorithm, and in general I didn't understand the Wikipedia article.

On the other hand I read that the
equals()
function will prevent brute force attacks is that right? Can someone explain what the advantage for
equals()
is?
Or can someone give me an example where
===
will fail and
equals()
does the correct work, so I can understand the advantage?

And what does constant-time Algorithm mean? I think constant-time has nothing to do with the real time, or if I'm wrong?

Answer

This function is just a normal string comparison function. It is not Rabin Karp. It is NOT constant time, it's linear time, regardless of what the comment says. It also does not prevent brute force attacks.

How it works:

  1. if the correct and user-provided passwords are of different length, make $result != 0
  2. iterate over the user-provided password, xor each of its characters with the corresponding character of the correct password (if the correct password is shorter, keep going through it in a circle), and bitwise or each result with $result.

Since only bitwise or is used, if any of the characters are different, $result will be != 0. Step 1 is needed because otherwise, user input "abca" would be accepted if the real password was "abc".

Why such string comparison functions are sometimes used

Let's assume we compare strings the usual way, and the correct password is "bac". Let's also assume I can precisely measure how long it takes for the password check to complete.

I (the user) try a, b, c... They don't work.

Then, I try aa. The algorithm compares the first 2 letters - b vs a, sees it's wrong, and returns false.

I now try with bb. The algorithm compares b vs b, they match, so it goes on to letter #2, compares a vs b, sees it's wrong, returns false. Now, since I am able to time the execution of the algorithm precisely, I know the password starts with "b", because the second pass took more time than the first one - I know the first letter matched.

So I try ba, bb, bc... They fail.

Now I check baa, bbb, see baa runs slower so second letter is a. This way, letter by letter, I can determine the password in an O(cN) number of attempts instead of O(c^N) that brute force would take.

It usually isn't as much of a concern as this explanation might make it sound, because it is unlikely an attacker will time a string comparison to such a degree of accuracy. But sometimes it can be.

Comments