Skip to content

Binary string detection is broken (introduced with 1.9.0) #132

Closed
@driehle

Description

@driehle

PHP version: all versions

Description
With #126 and #128 a change was introduced that is supposed to prevent binary messages from being dumped. In FullHttpMessageFormatter line 89 the following code is used to detect binary strings:

        if (preg_match('/([\x00-\x1F\x7F])/', $data)) {
            return $message.'[binary stream omitted]';
        }

Unfortunatly, this regex includes common whitespaces like \t, \r and \n as well. In consequence, all output is hidden and replaced with [binary stream omitted], even if the document is a simple HTML or XML document. I doubt that this is the intentioned behaviour, since linebreaks a by far an indicator for a binary string and moreover very common in HTML or XML documents.

Possible Solution
The code could be changed as follows, see https://stackoverflow.com/questions/25343508/detect-if-string-is-binary:

        if (preg_match('~[^\x20-\x7E\t\r\n]~', $data)) {
            return $message.'[binary stream omitted]';
        }

However, this would still be problematic regarding non-English languages, i.e. special characters in French, Spanish, Italian or German would still trigger falsely the binary detection. Therefore, I would suggest to check for all non-prinable ASCII characters as well as <del> (\x7F) and explicitly exclude <cr> (\x0D), <lf> (\x0A) and <tab> (\x0B).

        // all non-printable ASCII characters and <DEL> except for \t, \r, \n
        if (preg_match('/([\x00-\x09\x0C\x0E-\x1F\x7F])/', $data)) {
            return $message.'[binary stream omitted]';
        }

The same change should probably be applied to CurlCommandFormatter as well.

What do you think?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions