vrugtehagel - 1 year ago
PHP Question

Are PHP's zlib_encode() and zlib_decode() not opposites?

I wrote a function (in PHP) that decodes some data (the data is first zlib-encoded and then base64-encoded).

Now I want to make a function that reverses that action - but something strange happens while encoding the data. I had this string (this is after the base64 decode so this is only zlib-encoded):

xÚc22Ñ5±RÈM-ÉÈOÉLNÌQpdFZ6<€‘/`„¸Ž]œ(•šÁáÁôÀm%Âp,¦C͆êF5ÉGÄŽa:Ìl˜ndÓ‘C‹áœÈB6±¥Ä€ ¤¦¦¥¦$¥&%'¦¥df¤e¤g¤"#¡b¤c¤¡ÆÀ˜%&¯TÍ’ Ĥ'£Ç¤¤ Äà`•¢Ç §H %=†=–%ô(`f“s0¨’c`)¥]

and when calling
on the original string, it outputs
. I've looked at php's
and found that I needed to give the type of compression. Well, no luck with that either:



string(194) "c22Ñ5±RÈM-ÉÈOÉLNÌQpdFZ6<€Ÿ,Ðe×q¢‹s¡Ò@3!<˜¸­DŽÅt¨ÙPݨ¦#ùˆÃ1L‡™
Ól:rh2n(" ±‰0ÀÀ(%!55-5%)5©(©8©0µ(µ £0#-#=#©©µ##
5Æ,1y= j–%†$==†$%%«=9%p@²H(±¨è1Äè±d(¡G3›œƒA•"
string(212) "‹c22Ñ5±RÈM-ÉÈOÉLNÌQpdFZ6<€Ÿ,Ðe×q¢‹s¡Ò@3!<˜¸­DŽÅt¨ÙPݨ¦#ùˆÃ1L‡™
Ól:rh2n(" ±‰0ÀÀ(%!55-5%)5©(©8©0µ(µ £0#-#=#©©µ##
5Æ,1y= j–%†$==†$%%«=9%p@²H(±¨è1Äè±d(¡G3›œƒA•ˆÉl–U"
string(200) "xœc22Ñ5±RÈM-ÉÈOÉLNÌQpdFZ6<€Ÿ,Ðe×q¢‹s¡Ò@3!<˜¸­DŽÅt¨ÙPݨ¦#ùˆÃ1L‡™
Ól:rh2n(" ±‰0ÀÀ(%!55-5%)5©(©8©0µ(µ £0#-#=#©©µ##
5Æ,1y= j–%†$==†$%%«=9%p@²H(±¨è1Äè±d(¡G3›œƒA•)¥]"

So none of the options actually return the original. How do I reverse php's


This is the data I start with:


I decode that with php's
, after which I get the first string in the post.

Answer Source

The guarantee of a lossless compressor is that zlib_decode(zlib_encode(original_data)) will always return original_data.
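
As a minimal sketch of that guarantee, a round trip through both functions can be checked as follows. The sample string and variable names are made up for illustration.

```php
<?php
// Hypothetical sample data; any string works here.
$original = str_repeat("some repetitive payload ", 20);

// zlib_encode() requires an explicit encoding constant.
$compressed = zlib_encode($original, ZLIB_ENCODING_DEFLATE);

// zlib_decode() works out the container format on its own.
$restored = zlib_decode($compressed);

var_dump($restored === $original); // bool(true)
```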

However there is no guarantee that zlib_encode(zlib_decode(compressed_data)) will return compressed_data. The compressed_data you are providing could have been compressed at a different compression level, with a different compression strategy, with a different algorithm, or even simply with a different version of the same compression code. There are an infinite number of compressed representations of the same source data and there is no requirement on the compressor to produce a particular one out of that infinite set.
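
A rough illustration of this, using the same made-up sample data: compressing one input at two different levels gives two different byte strings that both decode to the same original. The first two bytes of a zlib stream are a header that records a hint about the compression level. With a stock zlib build these are typically `78 9c` for the default level and `78 da` for level 9, which would match the `xœ` and `xÚ` prefixes in the question. That suggests the original data may have been compressed at a higher level, but even with a matching level identical output is not guaranteed.

```php
<?php
// Hypothetical sample data.
$original = str_repeat("some repetitive payload ", 20);

// Same input, same format, two different compression levels.
$default = zlib_encode($original, ZLIB_ENCODING_DEFLATE);    // library default level
$best    = zlib_encode($original, ZLIB_ENCODING_DEFLATE, 9); // maximum compression

// Print the two-byte zlib header of each stream as hex.
echo bin2hex(substr($default, 0, 2)), "\n";
echo bin2hex(substr($best, 0, 2)), "\n";

// The compressed bytes differ, yet both decode to the same data.
var_dump($default === $best);
var_dump(zlib_decode($default) === zlib_decode($best));
```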

There is not even a requirement that a compressor be deterministic, so long as the lossless guarantee is met. However most compression codes that you will meet are deterministic. So while not guaranteed, it is quite likely that zlib_encode(original_data) will always return the same result when run with the same input, the same version and build of zlib_encode(), and the same options.
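
In practice this is easy to check on a given installation. The sketch below, again with made-up sample data, compresses the same input twice with the same options in one process and compares the results. It says nothing about other zlib versions or builds.

```php
<?php
// Hypothetical sample data.
$original = str_repeat("some repetitive payload ", 20);

// Same input, same encoding, same level, same zlib build.
$first  = zlib_encode($original, ZLIB_ENCODING_DEFLATE, 6);
$second = zlib_encode($original, ZLIB_ENCODING_DEFLATE, 6);

// Compare the two compressed strings byte for byte.
var_dump($first === $second);
```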

Now for a slight diversion into history. Early versions of zlib's deflate were not deterministic, since the result could depend on the contents of uninitialized memory. Even though that deflate code would always produce a correct result and would always satisfy the lossless guarantee, the non-determinism bothered folk. So that was later fixed to make deflate deterministic.

By the way, why do you care? Why would there be a need to replicate the same compressed data (which you know you could just copy, right?), so long as you are assured that you will get the original data when you decompress?
