Commit beef597

Fix mbstring support for SJIS-Mobile (DoCoMo, KDDI, and Softbank variants of Shift-JIS)
Lots of problems here.

- Don't pass 'control' characters through silently in the middle of a multi-byte character.
- Treat it as an error if a multi-byte character is truncated.
- For ESC sequences used to encode emoji on earlier Softbank phones, if an invalid ESC sequence is found, don't pass it through. Rather, handle it as an error and respect `mb_substitute_character`.
- In ranges used by mobile vendors for emoji, if a certain byte sequence doesn't map to any emoji, don't emit a mangled value (actually a raw (ku*94)+ten value, which may not even be a valid Unicode codepoint at all).
- When converting Unicode to SJIS-Mobile, don't mangle codepoints which fall in the 2nd range of Microsoft vendor extensions.

Some vendor-specific emoji have now been mapped to standard Unicode codepoints rather than 'private use area' codepoints. When the legacy code was written, these codepoints may not yet have existed in the then-current Unicode standard.

Also do a major code cleanup: remove dead code, rearrange what is left, use some new macros and helper functions to make the code clearer...
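As a rough illustration of the point about unmapped emoji cells, the sketch below shows one way a table lookup can report an error instead of emitting the raw (ku*94)+ten index. It is not the code from this commit: every `sketch_`/`SKETCH_` name, the three-entry table, the use of 0 as a "no mapping" marker, and the zero-based ku/ten convention are assumptions made for the example. Only the 0x28c2 base index is borrowed from the table comment in the diff.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error marker. How the real filter signals an error
 * (and applies mb_substitute_character) is not shown in this excerpt. */
#define SKETCH_BAD_INPUT 0xFFFFFFFFu

/* Hypothetical lookup table covering linear indices starting at SKETCH_MIN.
 * An entry of 0 is assumed to mean "no emoji mapped to this cell". */
#define SKETCH_MIN 0x28c2u
static const uint16_t sketch_tbl[] = { 0x2600, 0x0000, 0x2601 };
#define SKETCH_LEN (sizeof(sketch_tbl) / sizeof(sketch_tbl[0]))

/* Linear index of a (ku, ten) cell; both are assumed zero-based,
 * with 94 cells per row. */
static uint32_t sketch_linear_index(uint32_t ku, uint32_t ten)
{
    return ku * 94 + ten;
}

/* Return the mapped codepoint, or SKETCH_BAD_INPUT when the cell is out of
 * range or unmapped. The raw linear index is never returned as a codepoint. */
static uint32_t sketch_lookup_emoji(uint32_t ku, uint32_t ten)
{
    if (ten >= 94) {
        return SKETCH_BAD_INPUT;
    }
    uint32_t s = sketch_linear_index(ku, ten);
    if (s < SKETCH_MIN || s - SKETCH_MIN >= SKETCH_LEN) {
        return SKETCH_BAD_INPUT;
    }
    uint32_t cp = sketch_tbl[s - SKETCH_MIN];
    return cp != 0 ? cp : SKETCH_BAD_INPUT;
}
```

With these assumptions, `sketch_lookup_emoji(111, 0)` hits the first table entry (111 * 94 = 0x28c2), while `sketch_lookup_emoji(111, 1)` lands on an empty cell and yields the error marker, which a caller would then turn into the configured substitute character.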
1 parent bbbadae commit beef597

6 files changed: +1534 -529 lines changed

ext/mbstring/libmbfl/filters/emoji2uni.h

Lines changed: 3 additions & 3 deletions
@@ -34,9 +34,9 @@ static const unsigned short mb_tbl_code2uni_docomo1[] = { // 0x28c2 - 0x29db
 	0xf4ba, 0xf303, 0xEE1E, 0xEE1F,
 	0xEE20, 0xf51c, 0xf51b, 0xf51a,
 	0x23f0, 0xEE21, 0xEE22, 0xEE23,
-	0xEE24, 0xEE25, 0xEE26, 0xEE27,
-	0xEE28, 0xEE29, 0xEE2A, 0xEE2B,
-	0xEE2C, 0xEE2D, 0xEE2E, 0xEE2F,
+	0xEE24, 0xEE25, 0x25EA, 0x25A0,
+	0x25BF, 0xEE29, 0xEE2A, 0xEE2B,
+	0x2020, 0xEE2D, 0xEE2E, 0xEE2F,
 	0xEE30, 0xEE31, 0xEE32, 0xEE33,
 	0xf4f2, 0xf4e9, 0xf4e0, 0xEE10,
 	0xEE11, 0x2709, 0xEE12, 0xEE13,
