Commit e4a006c

Fix #65732: grapheme_*() is not Unicode compliant on CR LF sequence
According to the Unicode specification (at least as of 5.1), a CRLF sequence is considered a single grapheme. We cater to that special case by letting grapheme_ascii_check() fail whenever the input contains CRLF. While it would be trivial to fix grapheme_ascii_check() with respect to grapheme_strlen(), grapheme_substr() and grapheme_strrpos() would be much harder to handle, so we accept the slight performance penalty if CRLF is involved.
1 parent 9164dc1 commit e4a006c

3 files changed: 24 additions, 1 deletion

NEWS

Lines changed: 4 additions & 0 deletions

@@ -12,6 +12,10 @@ PHP NEWS
 - IMAP:
   . Fixed bug #72852 (imap_mail null dereference). (Anatol)
 
+- Intl:
+  . Fixed bug #65732 (grapheme_*() is not Unicode compliant on CR LF
+    sequence). (cmb)
+
 - JSON:
   . Fixed bug #72787 (json_decode reads out of bounds). (Jakub Zelenka)

ext/intl/grapheme/grapheme_util.c

Lines changed: 1 addition & 1 deletion

@@ -221,7 +221,7 @@ int grapheme_ascii_check(const unsigned char *day, int32_t len)
 {
 	int ret_len = len;
 	while ( len-- ) {
-		if ( *day++ > 0x7f )
+		if ( *day++ > 0x7f || (*day == '\n' && *(day - 1) == '\r') )
 			return -1;
 	}

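A note on the patched condition: after `*day++` the pointer already refers to the next byte, so `*day == '\n'` looks one byte ahead. On the final iteration that byte lies just past `len`, which is presumably fine only because the buffer is NUL-terminated. The following is a minimal sketch of the same check with an explicit bounds test. The name `ascii_check_no_crlf` is hypothetical and this is not the php-src implementation; it assumes the real function returns the length on success, as `ret_len` suggests.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the patched check (not the php-src code).
 * Returns len when buf holds only 7-bit ASCII and no CR LF pair,
 * and -1 otherwise, signalling "take the slower Unicode path".
 */
int32_t ascii_check_no_crlf(const unsigned char *buf, int32_t len)
{
    for (int32_t i = 0; i < len; i++) {
        if (buf[i] > 0x7f) {
            return -1;  /* non-ASCII byte */
        }
        /* Look ahead only while i + 1 is still inside the buffer. */
        if (buf[i] == '\r' && i + 1 < len && buf[i + 1] == '\n') {
            return -1;  /* CR LF forms a single grapheme */
        }
    }
    return len;
}

/* Usage examples, written as assertions. */
void ascii_check_examples(void)
{
    assert(ascii_check_no_crlf((const unsigned char *)"abc", 3) == 3);
    assert(ascii_check_no_crlf((const unsigned char *)"a\r\nb", 4) == -1);
    /* A lone CR or LF does not trigger the fallback. */
    assert(ascii_check_no_crlf((const unsigned char *)"a\rb\n", 4) == 4);
}
```

Unlike the patched loop, this variant never reads `buf[len]`, so it does not depend on a terminating NUL byte.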

ext/intl/tests/bug65732.phpt

Lines changed: 19 additions & 0 deletions

@@ -0,0 +1,19 @@
+--TEST--
+Bug #65732 (grapheme_*() is not Unicode compliant on CR LF sequence)
+--SKIPIF--
+<?php
+if (!extension_loaded('intl')) die('skip intl extension not available');
+?>
+--FILE--
+<?php
+var_dump(grapheme_strlen("\r\n"));
+var_dump(grapheme_substr(implode("\r\n", ['abc', 'def', 'ghi']), 5));
+var_dump(grapheme_strrpos("a\r\nb", 'b'));
+?>
+==DONE==
+--EXPECT--
+int(1)
+string(7) "ef
+ghi"
+int(2)
+==DONE==
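The expected values in this test follow from counting each CR LF pair as one grapheme. As a rough illustration of the arithmetic only (the commit itself does not take this route and falls back to the Unicode path instead), here is a sketch of two hypothetical helpers for ASCII-only input. Both names are made up for this example.

```c
#include <stddef.h>

/*
 * Hypothetical helper: counts grapheme clusters in a buffer assumed to
 * hold only 7-bit ASCII, treating each CR LF pair as one cluster.
 */
size_t ascii_grapheme_count(const char *s, size_t len)
{
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (s[i] == '\r' && i + 1 < len && s[i + 1] == '\n') {
            i++;  /* skip the LF: the pair counts once */
        }
        count++;
    }
    return count;
}

/*
 * Hypothetical helper: byte offset at which grapheme number `index`
 * starts, or len if the buffer has fewer clusters than that.
 */
size_t ascii_grapheme_offset(const char *s, size_t len, size_t index)
{
    size_t i = 0;
    while (i < len && index > 0) {
        i += (s[i] == '\r' && i + 1 < len && s[i + 1] == '\n') ? 2 : 1;
        index--;
    }
    return i;
}
```

With these helpers, `"\r\n"` counts as 1 cluster. In the 13-byte string `"abc\r\ndef\r\nghi"`, grapheme 5 starts at byte 6, leaving the 7 bytes `"ef\r\nghi"`. In `"a\r\nb"`, the `b` at byte 3 is preceded by 2 clusters, hence position 2.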
