
Inconsistency in ZipArchive::addGlob 'remove_path' Option Behavior #12661

Closed
@MoonlitSyntax

Description

The remove_path option of ZipArchive::addGlob does not behave as documented. The documentation states that remove_path is used to "remove prefix from matching file paths before adding to the archive." In practice, the option takes effect whenever its value occurs anywhere in the file path, not only at the start, and what gets removed is the leading part of the path rather than the matched text.

Steps to Reproduce

Create a directory structure as shown below:

/tmp
├── Dire
│   └── This_Is_A_File
└── test.php

Run the following PHP script:

<?php
// Build the archive, adding the same file four times with different remove_path values.
$zip = new ZipArchive();
$filename = "./HereIsZip.zip";

if ($zip->open($filename, ZipArchive::CREATE) !== true) {
    exit("Wrong!!!\n");
}

$dir       = '/tmp/Dire';  // a real prefix of the file path
$FirstDir  = '/Dire/';     // occurs mid-path, not a prefix
$SecondDir = 'This_Is';    // occurs in the file name, not a prefix
$ThirdDir  = '/D';         // occurs mid-path, not a prefix

$zip->addGlob('/tmp/Dire/This_Is_A_File', 0, ['add_path' => '/', 'remove_path' => $FirstDir]);
$zip->addGlob('/tmp/Dire/This_Is_A_File', 0, ['add_path' => '/', 'remove_path' => $SecondDir]);
$zip->addGlob('/tmp/Dire/This_Is_A_File', 0, ['add_path' => '/', 'remove_path' => $ThirdDir]);
$zip->addGlob('/tmp/Dire/This_Is_A_File', 0, ['add_path' => '/', 'remove_path' => $dir]);
$zip->close();

// Extract into /tmp to inspect the entry names that were stored.
if ($zip->open($filename) === true) {
    $extractPath = '/tmp';
    $zip->extractTo($extractPath);
    $zip->close();
} else {
    echo "Wrong\n";
}

Observe the output directory structure:

/tmp
├── Dire
│   └── This_Is_A_File
├── HereIsZip.zip
├── ire
│   └── This_Is_A_File
├── mp
│   └── Dire
│       └── This_Is_A_File
├── re
│   └── This_Is_A_File
├── test.php
└── This_Is_A_File

Expected Result:

remove_path is stripped only when the file path actually begins with it. A remove_path that is not a prefix of the file path leaves the entry name unchanged.

Actual Result:

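When remove_path occurs anywhere in the file path, the first strlen(remove_path) characters of the path are removed (plus one more if the next character is a slash), whether or not the path starts with remove_path. This produces the unexpected directories shown above. For the path /tmp/Dire/This_Is_A_File:

$dir       = '/tmp/Dire';  // => This_Is_A_File
$FirstDir  = '/Dire/';     // => ire/This_Is_A_File
$SecondDir = 'This_Is';    // => re/This_Is_A_File
$ThirdDir  = '/D';         // => mp/Dire/This_Is_A_File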
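
The four results can be reproduced outside PHP with a small C model of the branch quoted under Additional Information. This is only a sketch: the function name strip_as_quoted is hypothetical, plain C strings stand in for the zval and option fields, and only '/' is treated as a separator.

```c
#include <stddef.h>
#include <string.h>

/* Simplified model of the quoted remove_path branch (not php-src code). */
const char *strip_as_quoted(const char *path, const char *remove_path)
{
    size_t n = strlen(remove_path);

    /* strstr() succeeds if remove_path occurs anywhere in path. */
    if (n == 0 || strstr(path, remove_path) == NULL) {
        return path;
    }
    /* The offset is counted from the start of path, not from the match. */
    if (path[n] == '/') {
        return path + n + 1;
    }
    return path + n;
}
```

With path "/tmp/Dire/This_Is_A_File", passing "/Dire/" skips the first 6 characters and yields "ire/This_Is_A_File", while "/tmp/Dire" skips 9 characters plus the following slash and yields "This_Is_A_File".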

Additional Information:

This behavior is inconsistent with the documented purpose of remove_path and affects the usability of the ZipArchive::addGlob method when handling file paths.

The cause appears to be the way remove_path is checked in the addGlob implementation. The condition opts.remove_path && strstr(Z_STRVAL_P(zval_file), opts.remove_path) != NULL is true when remove_path occurs anywhere in the file path, not just at the beginning. The branch then skips the first opts.remove_path_len bytes of the path, regardless of where the match was found:

if ((zval_file = zend_hash_index_find(Z_ARRVAL_P(return_value), i)) != NULL) {
	if (opts.remove_all_path) {
		basename = php_basename(Z_STRVAL_P(zval_file), Z_STRLEN_P(zval_file), NULL, 0);
		file_stripped = ZSTR_VAL(basename);
		file_stripped_len = ZSTR_LEN(basename);
	} else if (opts.remove_path && strstr(Z_STRVAL_P(zval_file), opts.remove_path) != NULL) {
		if (IS_SLASH(Z_STRVAL_P(zval_file)[opts.remove_path_len])) {
			file_stripped = Z_STRVAL_P(zval_file) + opts.remove_path_len + 1;
			file_stripped_len = Z_STRLEN_P(zval_file) - opts.remove_path_len - 1;
		} else {
			file_stripped = Z_STRVAL_P(zval_file) + opts.remove_path_len;
			file_stripped_len = Z_STRLEN_P(zval_file) - opts.remove_path_len;
		}
	} else {
		file_stripped = Z_STRVAL_P(zval_file);
		file_stripped_len = Z_STRLEN_P(zval_file);
	}
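
One way to get the documented prefix-only behavior would be to compare just the start of the path instead of searching all of it. The following is a minimal sketch under the same simplifications as the excerpt's logic (plain C strings, '/' as the only separator); strip_prefix_only is a hypothetical name, and this is not necessarily the change php-src should or did adopt.

```c
#include <stddef.h>
#include <string.h>

/* Strip remove_path only when path begins with it (illustrative sketch). */
const char *strip_prefix_only(const char *path, const char *remove_path)
{
    size_t n = strlen(remove_path);

    /* strncmp() compares only the first n characters, so a match
     * elsewhere in the path no longer triggers the stripping. */
    if (n == 0 || strncmp(path, remove_path, n) != 0) {
        return path;
    }
    /* Also drop the separator that follows the prefix, if any. */
    if (path[n] == '/') {
        return path + n + 1;
    }
    return path + n;
}
```

With this check, "/tmp/Dire" still yields "This_Is_A_File" for /tmp/Dire/This_Is_A_File, whereas "/Dire/" and "This_Is" leave the path unchanged because neither is a prefix.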

PHP Version

PHP 8.2.12

Operating System

Arch Linux
