Fix macOS/BSD incompatibility in general:check-filenames task
The "Check Files" (Task) template includes an asset task named `general:check-filenames` that checks for the presence of
non-portable filenames in the project.
Ironically, the task itself was non-portable: it passed the `--perl-regexp` flag to the `grep` command. This flag is
not supported by the BSD version of grep used on macOS and BSD machines, so the task failed spuriously with
`grep: unrecognized option '--perl-regexp'` errors when run on a macOS or BSD machine.
The incompatibility is resolved by changing the `--perl-regexp` flag to `--extended-regexp`. This flag, which is
supported by both the BSD and GNU versions of grep, allows the use of the modern and reasonably capable POSIX ERE
syntax on all platforms.
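As a minimal sketch of what an ERE-based check can look like (not the task's actual command), the function below flags a
name containing one of the printable characters that Windows reserves. The function name `has_reserved_char` and the
character list are assumptions made for illustration only.

```shell
# Hypothetical helper, for illustration only: exits 0 if the given name
# contains one of the printable characters < > : " | ? * that Windows reserves.
has_reserved_char() {
  # --extended-regexp is accepted by both GNU and BSD grep; -q suppresses output.
  printf '%s\n' "$1" | grep --extended-regexp -q '[<>:"|?*]'
}
```

For example, `has_reserved_char 'what?.txt'` succeeds, while `has_reserved_char 'plain.txt'` fails.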
Unfortunately, the regular expression in the task's original command relied on one of the additional features only
present in the PCRE syntax. This feature was used to check for the presence of a range of characters prohibited by the
Windows filename specification:
https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file#naming-conventions
> Use any character [...] except for the following:
> - Integer value zero, sometimes referred to as the ASCII NUL character.
> - Characters whose integer representations are in the range from 1 through 31
Because these characters are non-printable, they must be represented in the regular expression by their character
codes. This was done using the `\x{hhh..}` syntax supported by PCRE. Neither that syntax nor any of the equivalent
escape patterns are supported by POSIX ERE. A solution is offered in the GNU grep documentation:
https://www.gnu.org/software/grep/manual/grep.html#Matching-Non_002dASCII-and-Non_002dprintable-Characters
> the command `grep "$(printf '\316\233\t\317\211\n')"` is a portable albeit hard-to-read alternative
As also mentioned there:
> none of these techniques will let you put a null character directly into a command-line pattern
So the range of characters in the pattern cannot include NUL. However, it turns out that even the original command did
not detect this character, even though it was covered by the pattern. So this limitation doesn't result in any
regression in practice.
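
The `printf` technique quoted above can be sketched as follows. This is an illustrative example under stated
assumptions, not the exact command used by the task: the function name `has_control_char` is hypothetical, and
`LC_ALL=C` is set on the assumption that the bracket range should be interpreted by byte value.

```shell
# Hypothetical helper, for illustration only: exits 0 if the given name
# contains a control character with an integer value from 1 through 31.
has_control_char() {
  # printf expands the octal escapes into the raw bytes 1 and 31, giving a
  # bracket expression that POSIX ERE can match without any \x{...} escape.
  # NUL (0) is left out because it cannot be passed in a command-line argument.
  pattern="$(printf '[\001-\037]')"
  # LC_ALL=C makes the range compare by byte value rather than collation order.
  printf '%s\n' "$1" | LC_ALL=C grep --extended-regexp -q "$pattern"
}
```

For example, `has_control_char "$(printf 'bad\tname.txt')"` succeeds because of the embedded tab, while
`has_control_char 'good-name.txt'` fails.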