Commit 5e66b8f

committed
notes to UPGRADING.INTERNALS
1 parent 8b82323 commit 5e66b8f

1 file changed: +47 -0 lines changed

UPGRADING.INTERNALS

Lines changed: 47 additions & 0 deletions
@@ -2,6 +2,8 @@ PHP 7.1 INTERNALS UPGRADE NOTES
 
 0. Wiki Examples
 1. Internal API changes
+e. Codepage handling on Windows
+f. Path handling on Windows
 
 2. Build system changes
 a. Unix build system changes
@@ -21,6 +23,51 @@ changes. See: https://wiki.php.net/phpng-upgrading
 1. Internal API changes
 ========================
 
+e. Codepage handling on Windows
+
+A set of new APIs was introduced for handling codepage conversions. The
+corresponding prototypes and macros are declared in win32/codepage.h.
+
+Functions named php_win32_cp_* handle various codepage aspects. They are
+used primarily in the I/O utils code and, where necessary, directly in
+the core, providing conversions to and from UTF-16. Arbitrary conversions
+between codepages are possible as well; in that case UTF-16 is always the
+intermediate state.
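
The "UTF-16 as intermediate state" idea can be illustrated with a minimal, portable sketch. It does not use the php_win32_cp_* functions; every sketch_* name below is hypothetical, and ISO-8859-1 and UTF-8 are chosen only because both conversions fit in a few lines. The source codepage is first decoded to UTF-16 code units, which are then encoded into the target codepage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Step 1: source codepage -> UTF-16. ISO-8859-1 bytes map one-to-one onto
 * the first 256 Unicode code points, so each byte becomes one code unit. */
static uint16_t *sketch_latin1_to_utf16(const char *in, size_t in_len)
{
    uint16_t *w = malloc((in_len + 1) * sizeof(*w));
    size_t i;

    if (w == NULL) return NULL;
    for (i = 0; i < in_len; i++) w[i] = (unsigned char)in[i];
    w[in_len] = 0;
    return w;
}

/* Step 2: UTF-16 -> target codepage (UTF-8 here). Surrogate pairs are out
 * of scope for this sketch and are rejected. */
static char *sketch_utf16_to_utf8(const uint16_t *w, size_t w_len)
{
    char *out = malloc(w_len * 3 + 1); /* a BMP unit needs at most 3 bytes */
    size_t i, o = 0;

    if (out == NULL) return NULL;
    for (i = 0; i < w_len; i++) {
        uint16_t c = w[i];
        if (c >= 0xD800 && c <= 0xDFFF) {
            free(out);
            return NULL;
        }
        if (c < 0x80) {
            out[o++] = (char)c;
        } else if (c < 0x800) {
            out[o++] = (char)(0xC0 | (c >> 6));
            out[o++] = (char)(0x80 | (c & 0x3F));
        } else {
            out[o++] = (char)(0xE0 | (c >> 12));
            out[o++] = (char)(0x80 | ((c >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (c & 0x3F));
        }
    }
    out[o] = '\0';
    return out;
}

/* Codepage-to-codepage conversion with UTF-16 as the pivot.
 * The caller frees the result. */
char *sketch_latin1_to_utf8(const char *in)
{
    uint16_t *w;
    char *out;
    size_t len;

    assert(in != NULL);
    len = strlen(in);
    w = sketch_latin1_to_utf16(in, len);
    if (w == NULL) return NULL;
    out = sketch_utf16_to_utf8(w, len);
    free(w);
    return out;
}
```

Routing every conversion through one pivot encoding means each codepage needs only a decoder to and an encoder from UTF-16, rather than one converter per codepage pair.
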
+
+For input length arguments, the macro PHP_WIN32_CP_IGNORE_LEN can be
+passed, in which case the API calculates the length itself. For output
+length arguments, the macro PHP_WIN32_CP_IGNORE_LEN_P can be passed, in
+which case the API does not set the output length.
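
The length conventions above can be sketched in portable C. This is not the code from win32/codepage.h: the function sketch_ascii_to_w and the sentinels SKETCH_IGNORE_LEN and SKETCH_IGNORE_LEN_P are hypothetical stand-ins, and the values chosen for them are assumptions that may differ from the real macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sentinels; the real definitions live in win32/codepage.h
 * and may use different values. */
#define SKETCH_IGNORE_LEN   ((size_t)-1)
#define SKETCH_IGNORE_LEN_P ((size_t *)NULL)

/* Widen a 7-bit ASCII string to UTF-16 code units.
 * Returns NULL on non-ASCII input or allocation failure; caller frees. */
uint16_t *sketch_ascii_to_w(const char *in, size_t in_len, size_t *out_len)
{
    uint16_t *out;
    size_t i;

    assert(in != NULL);
    if (in_len == SKETCH_IGNORE_LEN) {
        in_len = strlen(in); /* the caller asked us to measure the input */
    }
    out = malloc((in_len + 1) * sizeof(*out));
    if (out == NULL) return NULL;
    for (i = 0; i < in_len; i++) {
        if ((unsigned char)in[i] > 0x7F) {
            free(out);
            return NULL;
        }
        out[i] = (uint16_t)in[i];
    }
    out[in_len] = 0;
    if (out_len != SKETCH_IGNORE_LEN_P) {
        *out_len = in_len; /* report the length only when it was requested */
    }
    return out;
}
```

A caller that holds a NUL-terminated string and has no use for the resulting length would pass both sentinels, for example sketch_ascii_to_w("php", SKETCH_IGNORE_LEN, SKETCH_IGNORE_LEN_P).
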
+
+The mapping between encodings and codepages is provided by a predefined
+data array in win32/cp_enc_map.c. To change the data, edit and rerun the
+generator win32/cp_enc_map_gen.c.
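
A static lookup table of this kind might look roughly like the sketch below. The struct layout, the sketch_* names and the handful of entries are assumptions for illustration only; they are not the contents of win32/cp_enc_map.c. The numeric values are the standard Windows codepage identifiers for the listed encodings.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical entry layout: one Windows codepage id per encoding name. */
struct sketch_cp_entry {
    unsigned int id;
    const char *name;
};

static const struct sketch_cp_entry sketch_cp_map[] = {
    { 932,   "Shift_JIS" },
    { 1251,  "windows-1251" },
    { 1252,  "windows-1252" },
    { 20127, "US-ASCII" },
    { 28591, "ISO-8859-1" },
    { 65001, "UTF-8" },
};

#define SKETCH_CP_MAP_LEN (sizeof(sketch_cp_map) / sizeof(sketch_cp_map[0]))

/* Case-insensitive comparison of two ASCII encoding names. */
static int sketch_name_eq(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) return 0;
        a++;
        b++;
    }
    return *a == *b; /* equal only if both strings ended together */
}

/* Find an entry by encoding name; returns NULL if the name is unknown. */
const struct sketch_cp_entry *sketch_cp_by_name(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < SKETCH_CP_MAP_LEN; i++) {
        if (sketch_name_eq(sketch_cp_map[i].name, name)) return &sketch_cp_map[i];
    }
    return NULL;
}

/* Find an entry by codepage id; returns NULL if the id is unknown. */
const struct sketch_cp_entry *sketch_cp_by_id(unsigned int id)
{
    size_t i;

    for (i = 0; i < SKETCH_CP_MAP_LEN; i++) {
        if (sketch_cp_map[i].id == id) return &sketch_cp_map[i];
    }
    return NULL;
}
```

Keeping the data in a generated file keeps the table consistent with its source list, which is presumably why the notes ask for the generator to be edited instead of the array itself.
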
+
+f. Path handling on Windows
+
+A set of new APIs was introduced for handling UTF-8 paths. The
+corresponding prototypes and macros are declared in win32/ioutil.h.
+
+Functions named php_win32_ioutil_* provide POSIX I/O analogues. They are
+integrated in various places across the code base to support Unicode
+filenames. They accept char * arguments and convert them to wchar_t *
+internally. Almost no ANSI APIs are used internally; their wide
+equivalents are called directly instead. The string conversion rules
+correspond to those already present in the core and depend on the current
+encoding settings. This makes it possible to move away from the ANSI
+Windows API with its dependency on the system OEM/ANSI codepage.
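
The wrapper pattern described above (char * in, wide API underneath) can be sketched as follows. This is not the php_win32_ioutil_* implementation: sketch_fopen_utf8 and sketch_utf8_to_w are hypothetical names, and the sketch assumes the input is always UTF-8, whereas the real conversion rules depend on the current encoding settings. On non-Windows systems the sketch simply falls back to fopen().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>

/* Convert a NUL-terminated UTF-8 string to a newly allocated UTF-16 string.
 * Returns NULL on invalid input or allocation failure; caller frees. */
static wchar_t *sketch_utf8_to_w(const char *in)
{
    wchar_t *out;
    int n = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, in, -1, NULL, 0);

    if (n <= 0) return NULL;
    out = malloc((size_t)n * sizeof(wchar_t));
    if (out == NULL) return NULL;
    if (MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, in, -1, out, n) != n) {
        free(out);
        return NULL;
    }
    return out;
}
#endif

/* fopen() analogue taking a UTF-8 path. On Windows the path and mode are
 * widened and passed to the wide CRT call, so the ANSI codepage is never
 * involved. */
FILE *sketch_fopen_utf8(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);
#ifdef _WIN32
    {
        wchar_t *wpath = sketch_utf8_to_w(path);
        wchar_t *wmode = sketch_utf8_to_w(mode);
        FILE *fp = NULL;

        if (wpath != NULL && wmode != NULL) {
            fp = _wfopen(wpath, wmode);
        }
        free(wpath);
        free(wmode);
        return fp;
    }
#else
    return fopen(path, mode);
#endif
}
```

Callers keep working with char * strings throughout, so only the wrapper has to know about wchar_t.
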
+
+Thanks to the wide API usage, long paths are now supported as well. The
+PHP_WIN32_IOUTIL_MAXPATHLEN macro is defined as 2048 bytes and overrides
+MAXPATHLEN in files where the header is included.
+
+Scripts work best when they use UTF-8 for all I/O related functions.
+UTF-8 filenames are supported on any system regardless of the system
+OEM/ANSI codepage.
+
+
 ========================
 2. Build system changes
 ========================
