Commit aa93ea1

Avoid expensive mb_valid check.

Encoding name                 Returns != 0 when
----------------------------- --------------------------------------
check_mb_big5                 0xA1 <= c <= 0xF9
check_mb_cp932                0x81 <= c <= 0x9F || 0xE0 <= c <= 0xFC
check_mb_eucjpms              c >= 0x80
check_mb_euckr                c >= 0x80
check_mb_gb2312               0xA1 <= c <= 0xF7
check_mb_gbk                  0x81 <= c <= 0xFE
check_mb_sjis                 0x81 <= c <= 0x9F || 0xE0 <= c <= 0xFC
check_mb_ucs2                 always returns length 2
check_mb_ujis                 c >= 0x80
check_mb_utf16                complicated
check_mb_utf32                always returns length 4
check_mb_utf8_valid           c >= 0x80
check_mb_utf8mb3_valid        c >= 0x80
my_ismbchar_gb18030           0x81 <= c <= 0xFE

The ASCII-compatible encodings, i.e. cases where the c >= 0x80 check is sufficient, have the minimum char length == 1.
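
The point of the table is that no listed lead-byte range reaches into ASCII. A minimal sketch of that property for four of the rows follows. The `lead_*` helpers and `any_ascii_lead` are hypothetical names written for illustration from the ranges above; they are not the mysqlnd `check_mb_*` functions, which also inspect trailing bytes.

```c
#include <stdbool.h>

/* Hypothetical helpers mirroring the lead-byte ranges listed in the table;
 * illustration only, not the mysqlnd validators. */
static bool lead_big5(unsigned char c)   { return c >= 0xA1 && c <= 0xF9; }
static bool lead_sjis(unsigned char c)   { return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC); }
static bool lead_gb2312(unsigned char c) { return c >= 0xA1 && c <= 0xF7; }
static bool lead_gbk(unsigned char c)    { return c >= 0x81 && c <= 0xFE; }

/* Returns true if any predicate above accepts an ASCII byte (0x00..0x7F). */
static bool any_ascii_lead(void)
{
    for (unsigned c = 0; c < 0x80; c++) {
        unsigned char b = (unsigned char) c;
        if (lead_big5(b) || lead_sjis(b) || lead_gb2312(b) || lead_gbk(b)) {
            return true;
        }
    }
    return false;
}
```

Because `any_ascii_lead()` is false for these ranges, a byte below 0x80 can be treated as a single-byte character without consulting the validator.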
1 parent cfd12e0 commit aa93ea1

1 file changed: +4 -2 lines changed
ext/mysqlnd/mysqlnd_charset.c

Lines changed: 4 additions & 2 deletions
@@ -840,8 +840,10 @@ PHPAPI zend_ulong mysqlnd_cset_escape_slashes(const MYSQLND_CHARSET * const cset
     char esc = '\0';
     unsigned int len = 0;
 
-    /* check unicode characters */
-    if (cset->char_maxlen > 1 && (len = cset->mb_valid(escapestr, end))) {
+    /* check unicode characters
+     * Encodings that have a minimum length of 1 are compatible with ASCII.
+     * So we can skip (for performance reasons) the check to mb_valid for them. */
+    if (cset->char_maxlen > 1 && (*((zend_uchar *) escapestr) > 0x80 || cset->char_minlen > 1) && (len = cset->mb_valid(escapestr, end))) {
         /* check possible overflow */
         if ((newstr + len) > newstr_e) {
             escape_overflow = TRUE;
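
The effect of the added condition can be sketched outside of PHP. The example below is a rough illustration under stated assumptions: `charset_sketch` is a hypothetical stand-in for `MYSQLND_CHARSET` that carries only the three fields the condition reads, `toy_utf8_valid` is a toy validator that recognises only two-byte UTF-8 sequences, and `mb_valid_calls` exists purely to count validator invocations. The guard expression itself is copied from the patched line.

```c
#include <stddef.h>

/* Hypothetical stand-in for MYSQLND_CHARSET: only the fields the guard reads. */
typedef struct {
    unsigned int char_minlen;
    unsigned int char_maxlen;
    unsigned int (*mb_valid)(const char *start, const char *end);
} charset_sketch;

static unsigned long mb_valid_calls = 0; /* counts validator invocations */

/* Toy validator (assumption): accepts only 2-byte UTF-8 sequences, i.e. a
 * lead byte 0xC2..0xDF followed by one continuation byte 0x80..0xBF. */
static unsigned int toy_utf8_valid(const char *start, const char *end)
{
    const unsigned char *s = (const unsigned char *) start;
    mb_valid_calls++;
    if (end - start >= 2 && s[0] >= 0xC2 && s[0] <= 0xDF && (s[1] & 0xC0) == 0x80) {
        return 2;
    }
    return 0;
}

/* Same condition as the patched line: call mb_valid only when the lead byte
 * is above 0x80 or the charset has no single-byte characters. */
static unsigned int mb_len_guarded(const charset_sketch *cset, const char *p, const char *end)
{
    unsigned int len = 0;
    if (cset->char_maxlen > 1
        && (*((const unsigned char *) p) > 0x80 || cset->char_minlen > 1)
        && (len = cset->mb_valid(p, end))) {
        return len;
    }
    return 0;
}

/* Walks a buffer and returns how many multibyte characters the guard found. */
static size_t count_multibyte(const charset_sketch *cset, const char *buf, size_t n)
{
    const char *p = buf;
    const char *end = buf + n;
    size_t found = 0;
    while (p < end) {
        unsigned int len = mb_len_guarded(cset, p, end);
        if (len) {
            found++;
            p += len;
        } else {
            p++;
        }
    }
    return found;
}
```

With a charset whose `char_minlen` is 1, the validator in this sketch runs only at bytes above 0x80, so mostly-ASCII input triggers very few calls. With `char_minlen` greater than 1 (as for the fixed-width rows in the table), the guard falls through and the validator still runs at every position, which matches the behaviour before the patch.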
