Compression resulting in non-compliant gzip files
PaulKroll
03-17-2000, 08:40 PM
OK, this gets weird.
Writing a gzipped file from PHP (using gzopen and gzwrite or gzputs) gives two odd results:
1) A warning the first time that the file is not accessible. The directory has been made world-writable to avoid such safe_mode difficulties, and in fact the file IS written, but the warning still comes up. The warning doesn't appear on further runs, and it's obvious from the datestamp that the writing is occurring. It also doesn't warn if I first create a zero-length file with a conventional file open, close it, and THEN gzopen and write to it.
2) In any case, the file written is not a valid gzip file. It's compressed data all right: if you zcat it, the complete file shows, but zcat reports a failed CRC. IE5 (which handles gzipped HTML files) completely ignores it.
I've no idea what's going on here: I pulled down PHP 3.0.15 and the 4.0 beta 4, and compiled both as CGI for quick testing: both generate gzip files fine. FQ seems to have the same zlib here (1.1.3 has been around for a while), so I'm at a loss. Any ideas? There are alternative ways of making a gzipped file via PHP, but this would be the lowest-overhead way.
PaulKroll
03-20-2000, 06:57 PM
(Tap on monitor) Hello? Anyone? Hell-OOooo! Terra? Justin? You waiting for some example code or just looking at the problem and going "Heck, what >IS< going on there?" :)
Terra
03-21-2000, 02:15 AM
We have our resident PHP expert looking into this...
Hopefully Justin will be able to reproduce the problem...
I personally have not used PHP's gzip capability, and to date no one has ever reported a problem with it... shrug
My apologies for the delay, I'll nudge Justin once again to visit this bug report...
--
Terra
sysAdmin
FutureQuest
Justin
03-21-2000, 02:20 PM
I've been messing with this for the last hour, and I can't figure it out either :(
From the command line or as a CGI, it works just fine, and the CRC checks out. But the module version does not... I don't have any answers, but at least I was able to reproduce the problem consistently. I've narrowed it down this far:
Both PHP copies are using the same Zlib version (1.1.3)
Not a CRLF issue as I had first suspected (3.0.15 has some issues there)
Tried different compression levels with same results
The output file *is* okay, and Midnight Commander, zcat, and IE all are able to display the file (just the CRC issue)
The last three bytes are different, where the rest of the file is the same (e.g., same compression level/algorithm used).
Other than the above, I don't know what might cause this - but it is consistent, and I know everything was done correctly and it *should* work...
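For anyone who wants to see exactly what differs at the end of the two files: per the gzip format (RFC 1952), a gzip file ends with an 8-byte trailer holding the CRC-32 of the uncompressed data followed by its length, both little-endian. Below is a minimal sketch of a checker, assuming a current PHP with the zlib and hash extensions (not the 3.0.x builds discussed here). check_gzip_crc() is a made-up helper name, and the size comparison assumes a 64-bit build.

```php
<?php
// Sketch: compare the CRC-32 stored in a gzip trailer with one computed
// from the decompressed data. check_gzip_crc() is a hypothetical helper.
function check_gzip_crc(string $bytes): array
{
    // Fixed 10-byte header: magic (1f 8b), method, flags, mtime, xfl, os.
    if (strlen($bytes) < 18 || substr($bytes, 0, 2) !== "\x1f\x8b") {
        throw new InvalidArgumentException('not a gzip stream');
    }
    $flags = ord($bytes[3]);
    $pos = 10;
    if ($flags & 0x04) {                 // FEXTRA: 2-byte LE length, then data
        $pos += 2 + unpack('v', substr($bytes, $pos, 2))[1];
    }
    foreach ([0x08, 0x10] as $bit) {     // FNAME, FCOMMENT: zero-terminated
        if ($flags & $bit) {
            $end = strpos($bytes, "\0", $pos);
            if ($end === false) {
                throw new InvalidArgumentException('unterminated header field');
            }
            $pos = $end + 1;
        }
    }
    if ($flags & 0x02) {                 // FHCRC: 2-byte header checksum
        $pos += 2;
    }

    // Between the header and the 8-byte trailer sits the raw deflate data.
    $data = gzinflate(substr($bytes, $pos, -8));
    if ($data === false) {
        throw new RuntimeException('deflate body did not decompress');
    }

    // Trailer: CRC-32, then ISIZE (length mod 2^32), both little-endian.
    // Comparing hex strings avoids signed-integer surprises.
    $storedCrc = bin2hex(strrev(substr($bytes, -8, 4)));
    $actualCrc = hash('crc32b', $data);
    $storedSize = unpack('V', substr($bytes, -4))[1];

    return [
        'stored_crc' => $storedCrc,
        'actual_crc' => $actualCrc,
        'crc_ok'     => $storedCrc === $actualCrc,
        'size_ok'    => $storedSize === (strlen($data) & 0xFFFFFFFF),
    ];
}
```

Running it as check_gzip_crc(file_get_contents('myfile2.html.gz')) on both the module-written and the CGI-written file would show whether the stored CRC, the stored length, or both are off.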
I will be digging into this problem some more, as I have been wanting to play with the idea of sending gzip'd HTML to the browser, and possibly using compressed files to store data (which it looks like is a lot easier than I had first thought, after playing with it a bit)...
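As a rough idea of what sending gzip'd HTML could look like, here is a sketch that decides whether to compress a page based on the browser's Accept-Encoding request header. It assumes a current PHP with gzencode() available (not the 3.0.x builds in this thread), build_response() is a hypothetical helper name, and the header check is deliberately naive.

```php
<?php
// Sketch only: build_response() is a hypothetical helper, not an existing API.
// It returns the headers and body to send rather than sending them itself.
function build_response(string $html, string $acceptEncoding): array
{
    // Naive check: any mention of "gzip" counts as acceptance; q-values are ignored.
    if (stripos($acceptEncoding, 'gzip') === false) {
        return ['headers' => [], 'body' => $html];
    }

    // gzencode() produces a complete gzip stream (header, deflate data, trailer).
    $body = gzencode($html, 6);

    return [
        'headers' => [
            'Content-Encoding: gzip',
            'Vary: Accept-Encoding',
            'Content-Length: ' . strlen($body),
        ],
        'body' => $body,
    ];
}
```

In a page script, the second argument would come from $_SERVER['HTTP_ACCEPT_ENCODING'] (empty string if unset), each returned header would go through header(), and the body would be echoed.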
------------------
Justin Nelson
FutureQuest (http://www.FutureQuest.net/index.php) Support
PaulKroll
03-21-2000, 03:12 PM
Don't you just love these sorts of things?
I've been trying to nail this down too. I first figured there'd be some clash between one of the new additions (libmcrypt or mhash) but can't spot a problem. (I.e., is php compiling with the zlib code/CRC code, or is it pulling the code in from one of the other files? That shouldn't happen, "normally," but I think we already know this is not "normal." :) )
There are two zlib.h files on the server, but the other one of those is in a redhat build directory for ssh and I can't picture anyone's Makefile being so odd that it went and pulled THAT one in.
IE5.01 won't display the bad-CRC files for me, but does the good-CRC files just as you'd hope.
I'm wondering if there's a safe_mode strangeness going on, something that forces the file to get opened with a different buffering scheme or to get closed before zlib is able to finish up its work, or something to do with the phase of the moon... Since the rest of the critter is fine, and it's just the CRC, it seems like there's:
A) A problem with the CRC calculation
B) A problem writing out the CRC (safe_mode/being a module affecting this?)
C) Rampant, crazed squirrels swapping the bits in the server just to spite us
I'm leaning towards C right now. :(
PaulKroll
03-22-2000, 01:11 AM
This was interesting... whilst looking around for other items, I noticed the 3.0.15 upgrade would include Terra building the CGI aspect of PHP. Not only does this make me think that Terra has played a LOT of Starcraft (:)), but I decided to try the problem child gzip script w/the CGI aspect. It works fine from the CGI aspect. The CGI version of PHP reports that it's still 3.0.12, and 3.0.15 is the module version that's not creating proper gzips.
Since eventually you're going to ask, below is the code:
--------
<?php
$fullpage="<html><body>here's some compressed some compressed some compressed some compressed some compressed some compressed page again!</body></html>";
$zp = gzopen("myfile2.html.gz", "w");
gzputs($zp,$fullpage);
gzclose($zp);
?>
----
(There's also the appropriate html and body tags, but that's not even remotely relevant)
The above script will generate a perfectly fine gzip file when run as CGI here, but not as a module. As a module, because of safe mode, the script will completely fail unless the directory is world-writable (which in my test cases, it is). The file IS generated when this is called from the module PHP, and DOES have the appropriate text in it when checked with zcat, but zcat finishes with a bad CRC error. It does NOT report a bad CRC w/the result of the CGI version. Version 3.0.15 compiled as CGI on a RH 6.1 Linux machine at home also creates the gzip file fine. zcat, gzip, and the zlib library all appear to be the same version on FQ as on my home and work machines.
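A variant of the same test with return values checked and a read-back step might look like the sketch below. It assumes a current PHP with the zlib extension rather than 3.0.x, and write_gz() and read_gz() are hypothetical helper names. Note that a read-back like this only confirms the contents; zcat also showed the full text for the bad files, so it is not a CRC check by itself.

```php
<?php
// Sketch: write a gzip file with every step checked, then read it back.
// write_gz() and read_gz() are hypothetical helpers.
function write_gz(string $path, string $contents): void
{
    $zp = gzopen($path, 'w');
    if ($zp === false) {
        throw new RuntimeException("cannot open $path for writing");
    }
    $written = gzwrite($zp, $contents);
    // gzclose() flushes the remaining compressed data and writes the
    // trailer (CRC-32 and length), so its result matters too.
    $closed = gzclose($zp);
    if ($written !== strlen($contents) || !$closed) {
        throw new RuntimeException("incomplete write to $path");
    }
}

function read_gz(string $path): string
{
    $zp = gzopen($path, 'r');
    if ($zp === false) {
        throw new RuntimeException("cannot open $path for reading");
    }
    $out = '';
    while (!gzeof($zp)) {
        $chunk = gzread($zp, 8192);
        if ($chunk === false || $chunk === '') {
            break;   // stop on error or when nothing more is returned
        }
        $out .= $chunk;
    }
    gzclose($zp);
    return $out;
}
```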
This isn't critical to me at this point, so don't kill yourselves, but it'll be important "soon" (the definition of soon being a matter of management, and thus open to interpretation. I'm betting "soon" is "two months" but no promises...)