ksno - 7 months ago
PHP Question

Why are there binary safe AND binary unsafe functions in php?

Is there any reason for this behavior/implementation?
Example:

$array = array("index_of_an_array" => "value");
class Foo {
    private $index_of_an_array;
    function __construct() {}
}
$foo = new Foo();
$array = (array)$foo;
$key = str_replace("Foo", "", array_keys($array)[0]);
echo $array[$key];


This produces a notice whose message is incomplete (the index name is not shown):


NOTICE Undefined index: on line number 9
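
The key produced by the array cast can be inspected to see why the lookup fails. The following is a minimal sketch, assuming the same class Foo as above but with a default value added so there is something to print. The helper visible() is hypothetical and only makes NUL bytes printable. When an object is cast to an array, PHP prefixes the key of a private property with a NUL byte, the class name, and another NUL byte, so removing "Foo" leaves a key that does not exist in the array.

```php
<?php
class Foo {
    // Default value added for illustration; the original property is unset.
    private $index_of_an_array = "value";
}

// Hypothetical helper: render each NUL byte as a visible "\0" escape.
function visible(string $s): string {
    return str_replace("\0", '\0', $s);
}

$array = (array) new Foo();
$key = array_keys($array)[0];

echo visible($key), "\n";   // \0Foo\0index_of_an_array
echo strlen($key), "\n";    // 22

// Using the mangled key unchanged works, because array keys are binary safe.
echo $array[$key], "\n";    // value
```

Passing the key through unchanged avoids the notice. Stripping the class name while leaving the two NUL bytes in place is what breaks the lookup in the question.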


Example #2:


echo date("Y\0/m/d");


Outputs:


2016


BUT! echo and var_dump(), for example, and some other functions output the string as it is; the \0 bytes are merely hidden by browsers.

$string = "index-of\0-an-array";
$string2 = "Y\0/m/d";
echo $string;
echo $string2;
var_dump($string);
var_dump($string2);


Outputs:


index-of-an-array

Y/m/d

string(18) "index-of-an-array"

string(6) "Y/m/d"


Notice that the length of $string is 18, but only 17 characters are shown.
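
One way to confirm that the NUL byte is really present is to print the string in a form that a browser cannot hide. This is a small sketch using the standard functions strlen(), strpos(), substr_count() and bin2hex(). The variable name follows the example above.

```php
<?php
$string = "index-of\0-an-array";

echo strlen($string), "\n";               // 18
echo strpos($string, "\0"), "\n";         // 8 (zero-based offset of the NUL byte)
echo substr_count($string, "\0"), "\n";   // 1

// Hex dump of every byte; the NUL byte shows up as "00".
echo bin2hex($string), "\n";
```

The hex dump contains two digits per byte, so the hidden byte appears as "00" at the position strpos() reports.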

EDIT


From a possible duplicate and the PHP manual:



The key can either be an integer or a string. The value can be of any type.
Strings containing valid integers will be cast to the integer type. E.g. the key "8" will actually be stored under 8. On the other hand "08" will not be cast, as it isn't a valid decimal integer. So in short, any string can be a key. And a string can contain any binary data (up to 2GB). Therefore, a key can be any binary data (since a string can be any binary data).


From the PHP manual's string details:



There are no limitations on the values the string can be composed of;
in particular, bytes with value 0 (“NUL bytes”) are allowed anywhere
in the string (however, a few functions, said in this manual not to be
“binary safe”, may hand off the strings to libraries that ignore data
after a NUL byte.)


But I still do not understand why the language is designed this way. Are there reasons for this behavior/implementation? Why doesn't PHP handle input as binary safe everywhere, rather than only in some functions?

From comment:



The reason is simply that many PHP functions like printf use the C library's implementation behind the scenes, because the PHP developers were lazy.


Aren't functions such as echo, var_dump and print_r in that group too? In other words, functions that output something. They are in fact binary safe, as the echo and var_dump() output above shows. It makes no sense to me to implement both binary-safe and binary-unsafe functions for output, or to reuse some as they are in the C standard library while writing others completely from scratch.

Answer

The short answer to "why" is simply history.

PHP was originally written as a way to script C functions so they could be called easily while generating HTML. Therefore PHP strings were just C strings, which are sequences of bytes terminated by a NUL byte. So in modern PHP terms we would say nothing was binary-safe, simply because it wasn't planned to be anything else.

Early PHP was not intended to be a new programming language, and grew organically, with Lerdorf noting in retrospect: "I don’t know how to stop it, there was never any intent to write a programming language […] I have absolutely no idea how to write a programming language, I just kept adding the next logical step on the way."

Over time the language grew to support more elaborate string-processing functions, many taking the string's specific bytes into account and becoming "binary-safe". According to the recently written formal PHP specification:

As to how the bytes in a string translate into characters is unspecified. Although a user of a string might choose to ascribe special semantics to bytes having the value \0, from PHP's perspective, such null bytes have no special meaning. PHP does not assume strings contain any specific data or assign special values to any bytes or sequences.

As a language that has grown organically, there hasn't been a move to universally treat strings in a manner different from C. Therefore functions and libraries are binary-safe on a case-by-case basis.
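
Because binary safety is decided function by function, code that passes untrusted strings to a function of unknown behavior can check for NUL bytes first. The following is a minimal sketch under that assumption. The helpers containsNul() and safeDate() are hypothetical and not part of PHP, and rejecting the input is only one possible policy.

```php
<?php
// Hypothetical helper: true if the string contains at least one NUL byte.
// strpos() is binary safe, so it can search for "\0" reliably.
function containsNul(string $s): bool {
    return strpos($s, "\0") !== false;
}

// Hypothetical wrapper: refuse format strings that contain NUL bytes
// instead of relying on how date() treats them.
function safeDate(string $format): string {
    if (containsNul($format)) {
        throw new InvalidArgumentException("Format string contains a NUL byte");
    }
    return date($format);
}

var_dump(containsNul("Y/m/d"));    // bool(false)
var_dump(containsNul("Y\0/m/d"));  // bool(true)
echo safeDate("Y/m/d"), "\n";      // current date, e.g. in year/month/day form
```

Failing loudly on a NUL byte makes the truncation problem from the date() example visible at the call site rather than in the output.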