-
Notifications
You must be signed in to change notification settings - Fork 7.9k
Fix #69279: Compressed ZIP Phar extractTo() creates garbage files #6599
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
When extracting compressed files from an uncompressed Phar, we must not use the direct file pointer, but rather get an uncompressed file pointer. This also fixes #79912, which appears to be a duplicate of #69279.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read both bug reports. I only have one small comment change suggestion, as it seemed like a forgotten copy-paste artifact. Otherwise, LGTM.
Co-authored-by: Anna Filina <afilina@gmail.com>
@bishopb, what do you think about this? |
To check my understanding of how these APIs work: phar_get_efp() can return either a compressed or uncompressed FP. Calling phar_open_entry_fp() will convert it into an uncompressed FP. Is that right? |
Yes. |
Heh, that seems like ... not so great API design. Can we run into the reverse issue afterwards, where some code expects phar_get_efp() to return the compressed FP but will now get the uncompressed one? |
To verify this, I created a simple ZIP with a deflated entry bar.txt followed by a stored entry foo.txt. Extracting produced a correctly uncompressed bar.txt, but a garbage foo.txt. Took me a while to find out the foo.txt was extracted from a wrong file offset. The offset would also be wrong without this patch. I think I need to look into this issue first. |
I found that is due to different extra field lengths in the local file header and the central directory header for (some?) ZIPs created by 7z or Info-ZIP's zip (which may be valid or not according to the specification). The offset of the file data in the archive is determined by the extra field length of the central directory header, so may not actually be correct. I don't know why the lengths differ, but unless someone files a bug report about that, I won't investigate further.
Not quite; I was wrong, previouly. Actually, at least for ZIPs Phar starts with an PHAR_FP (which handles uncompressed data). Calling I've added a test which extracts a deflated, a stored and a deflated entry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the thorough investigation, this makes more sense to me now.
When extracting compressed files from an uncompressed Phar, we must not
use the direct file pointer, but rather get an uncompressed file
pointer.
This also fixes #79912, which appears to be a duplicate of #69279.