Description
I've encountered an issue with the remove_path option of ZipArchive::addGlob: its behavior does not match the documentation. The documentation states that remove_path is used to "remove prefix from matching file paths before adding to the archive." In practice, remove_path is matched anywhere in the file path, and the leading part of the path is then cut off even when remove_path is not a prefix.
Steps to Reproduce
Create a directory structure as shown below:
/tmp
├── Dire
│   └── This_Is_A_File
└── test.php
Run the following PHP script:
<?php
$zip = new ZipArchive();
$filename = "./HereIsZip.zip";
if ($zip->open($filename, ZipArchive::CREATE) !== TRUE) {
    exit("Wrong!!!\n");
}
$dir = '/tmp/Dire';       // a real prefix of the file path
$FirstDir = '/Dire/';     // occurs in the middle of the path
$SecondDir = 'This_Is';   // occurs in the file name
$ThirdDir = '/D';         // occurs in the middle of the path
// Add the same file four times, once per remove_path value.
$zip->addGlob('/tmp/Dire/This_Is_A_File', 0, ['add_path' => '/', 'remove_path' => $FirstDir]);
$zip->addGlob('/tmp/Dire/This_Is_A_File', 0, ['add_path' => '/', 'remove_path' => $SecondDir]);
$zip->addGlob('/tmp/Dire/This_Is_A_File', 0, ['add_path' => '/', 'remove_path' => $ThirdDir]);
$zip->addGlob('/tmp/Dire/This_Is_A_File', 0, ['add_path' => '/', 'remove_path' => $dir]);
$zip->close();
// Extract the archive to inspect the entry names that were stored.
if ($zip->open($filename) === true) {
    $extractPath = '/tmp';
    $zip->extractTo($extractPath);
    $zip->close();
} else {
    echo "Wrong\n";
}
Observe the output directory structure:
/tmp
├── Dire
│   └── This_Is_A_File
├── HereIsZip.zip
├── ire
│   └── This_Is_A_File
├── mp
│   └── Dire
│       └── This_Is_A_File
├── re
│   └── This_Is_A_File
├── test.php
└── This_Is_A_File
Expected Result:
Files are added to the archive with the specified remove_path prefix removed.
Actual Result:
The remove_path option removes parts of the file path that are not prefixes, leading to unexpected directory structures in the output.
$dir = '/tmp/Dire';       // => This_Is_A_File
$FirstDir = '/Dire/';     // => ire/This_Is_A_File
$SecondDir = 'This_Is';   // => re/This_Is_A_File
$ThirdDir = '/D';         // => mp/Dire/This_Is_A_File
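The four results above can be reproduced outside PHP with a small C sketch of the stripping arithmetic from the excerpt quoted under Additional Information. This is only an illustration: strip_like_addglob is a hypothetical helper, not a php-src function, and it assumes '/' is the only path separator (the original uses the IS_SLASH macro).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of php-src) that mirrors the quoted logic:
 * remove_path is searched anywhere in path, but the cut offset is always
 * counted from the start of path. Returns a pointer into path. */
const char *strip_like_addglob(const char *path, const char *remove_path)
{
    size_t remove_len = strlen(remove_path);

    /* Substring match anywhere in the path, as in the quoted condition. */
    if (strstr(path, remove_path) == NULL) {
        return path;
    }
    /* Skip remove_len leading bytes, plus one more if a slash follows. */
    if (path[remove_len] == '/') {
        return path + remove_len + 1;
    }
    return path + remove_len;
}
```

For example, calling strip_like_addglob("/tmp/Dire/This_Is_A_File", "/D") finds "/D" at offset 4 but cuts at offset 2, which yields mp/Dire/This_Is_A_File.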
Additional Information:
This behavior is inconsistent with the documented purpose of remove_path and affects the usability of the ZipArchive::addGlob method when handling file paths.
The core of the issue seems to be the way remove_path is checked within the addGlob function. Specifically, the condition opts.remove_path && strstr(Z_STRVAL_P(zval_file), opts.remove_path) != NULL detects the presence of remove_path anywhere in the file path, not just at the beginning. The code that follows then skips remove_path_len bytes from the start of the path, regardless of where the match was found:
if ((zval_file = zend_hash_index_find(Z_ARRVAL_P(return_value), i)) != NULL) {
    if (opts.remove_all_path) {
        basename = php_basename(Z_STRVAL_P(zval_file), Z_STRLEN_P(zval_file), NULL, 0);
        file_stripped = ZSTR_VAL(basename);
        file_stripped_len = ZSTR_LEN(basename);
    } else if (opts.remove_path && strstr(Z_STRVAL_P(zval_file), opts.remove_path) != NULL) {
        if (IS_SLASH(Z_STRVAL_P(zval_file)[opts.remove_path_len])) {
            file_stripped = Z_STRVAL_P(zval_file) + opts.remove_path_len + 1;
            file_stripped_len = Z_STRLEN_P(zval_file) - opts.remove_path_len - 1;
        } else {
            file_stripped = Z_STRVAL_P(zval_file) + opts.remove_path_len;
            file_stripped_len = Z_STRLEN_P(zval_file) - opts.remove_path_len;
        }
    } else {
        file_stripped = Z_STRVAL_P(zval_file);
        file_stripped_len = Z_STRLEN_P(zval_file);
    }
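One possible direction for a fix, shown here only as a standalone sketch and not as the actual php-src patch, is to compare remove_path against the start of the path with strncmp instead of searching for it with strstr. The helper name strip_prefix_only is hypothetical, and '/' is assumed to be the only separator.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of php-src): strip remove_path only when
 * it is a true prefix of path; otherwise return path unchanged. */
const char *strip_prefix_only(const char *path, const char *remove_path)
{
    size_t remove_len = strlen(remove_path);

    /* strncmp stops at the first mismatch or at the end of either string,
     * so a remove_path longer than path simply fails to match. */
    if (remove_len == 0 || strncmp(path, remove_path, remove_len) != 0) {
        return path;
    }
    /* Also drop the separator that follows the prefix, if there is one. */
    if (path[remove_len] == '/') {
        return path + remove_len + 1;
    }
    return path + remove_len;
}
```

With this check, '/Dire/', 'This_Is' and '/D' would leave /tmp/Dire/This_Is_A_File untouched, while '/tmp/Dire' would still produce This_Is_A_File. The sketch does not require the prefix to end on a directory boundary; whether a partial component such as '/tmp/Di' should count as a prefix is a separate design decision.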
PHP Version
PHP 8.2.12
Operating System
Arch Linux