Jason Taylor - 4 months ago
PHP Question

What POST serialization issues does PayPal's PHP IPN example refer to?

PayPal's sample code for a PHP IPN listener has this comment/code at the top:

// reading posted data from directly from $_POST causes serialization
// issues with array data in POST
// reading raw POST data from input stream instead.
$raw_post_data = file_get_contents('php://input');
$raw_post_array = explode('&', $raw_post_data);
$myPost = array();
foreach ($raw_post_array as $keyval) {
    $keyval = explode('=', $keyval);
    if (count($keyval) == 2)
        $myPost[$keyval[0]] = urldecode($keyval[1]);
}

Can someone explain what serialization issues this comment refers to? While I'm ok doing it this way, I would feel more comfortable knowing why it should be done this way.


I can't tell you PayPal's motivations, but I can guess: PHP likes to change the keys of incoming variables from an HTTP request.

For example, the name a.b [ would show up as $_POST['a_b__']. PHP replaces spaces, dots, and unmatched open brackets in names with underscores. Source: http://php.net/manual/en/language.variables.external.php
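The renaming described above can be reproduced without a web request. This is a minimal sketch that assumes parse_str() applies the same name mangling PHP uses when it fills $_POST; the query string and the $parsed variable are made up for illustration.

```php
<?php
// Illustrative input only: three field names containing a dot, a space,
// and an unmatched open bracket (%20 is a space, %5B is "[").
$query = 'a.b=1&my+var=2&a.b%20%5B=3';

// parse_str() decodes the string and stores the fields in $parsed,
// rewriting the names along the way.
parse_str($query, $parsed);

// Print the keys PHP ended up with, to compare against the names sent.
print_r(array_keys($parsed));
```

None of the three keys comes back exactly as it was sent, which is a problem if you need to echo the data back unchanged.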

Also, PHP will parse well-formed matching brackets in variable names into nested arrays, e.g. arr[a][b] would show up as $_POST['arr']['a']['b']. http://php.net/manual/en/faq.html.php#faq.html.arrays
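A quick way to see the nesting, again as a sketch that uses parse_str() as a stand-in for how $_POST is built (the field name and value are illustrative):

```php
<?php
// A single field whose name contains well-formed brackets.
parse_str('arr[a][b]=x', $parsed);

// The flat name "arr[a][b]" is gone; the value sits in a nested array.
var_export($parsed);
```

The original flat key no longer exists anywhere in the result, so code that needs the exact field name as sent has to read the raw body instead.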

Also, PHP behaves in all kinds of crazy and buggy ways when brackets aren't well formed: https://bugs.php.net/bug.php?id=48597

Also, magic_quotes_gpc used to have its talons into every PHP installation (it was removed in PHP 5.4), changing the names of variables in certain cases too. http://php.net/manual/en/security.magicquotes.php

Also, PHP has the arg_separator.input setting, and some people like to set this to &amp; instead of just &. PayPal cannot know which you prefer, and would obviously always use &. http://php.net/manual/en/ini.core.php#ini.arg-separator.input

Also, despite being bad practice, it's not too uncommon in PHP for code or libraries to automatically modify request inputs such as $_POST, e.g. to XSS "sanitize" them or to handle other such cross-cutting concerns.

By parsing the input manually, you avoid all those potential issues. This decision seems like good engineering on their part.
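To make the contrast concrete, here is a rough sketch of manual parsing next to PHP's built-in parsing. The parse_raw_post() helper and the sample field names (other than txn_id) are hypothetical, not part of PayPal's sample. Unlike PayPal's snippet, it also URL-decodes the key and keeps any extra "=" inside a value. In a real listener the input would come from file_get_contents('php://input').

```php
<?php
// Hypothetical helper: split a raw urlencoded body on "&" and "=" only,
// without any of PHP's name rewriting.
function parse_raw_post(string $raw): array
{
    $fields = [];
    foreach (explode('&', $raw) as $pair) {
        if ($pair === '') {
            continue; // skip empty segments such as a trailing "&"
        }
        // Limit to 2 parts so an unencoded "=" inside the value is kept.
        $parts = explode('=', $pair, 2);
        $key = urldecode($parts[0]);
        $value = isset($parts[1]) ? urldecode($parts[1]) : '';
        $fields[$key] = $value;
    }
    return $fields;
}

// Made-up body standing in for the contents of php://input.
$raw = 'txn_id=ABC123&item.name=Blue+Widget&option[0]=red';

$fields = parse_raw_post($raw); // keys kept exactly as sent
parse_str($raw, $mangled);      // keys rewritten the way $_POST would be

var_export($fields);
var_export($mangled);
```

The manual version keeps item.name and option[0] as flat keys, while the built-in parser renames the first and turns the second into an array.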