Skip to content

Commit 8dddd3c

Browse files
committed
Fix buffer overflow bugs in UTF-7 text conversion
After Nikita Popov found a buffer overrun bug in one of my pull requests, I was prompted to add more assertions in a38c7e5 to help me catch such bugs myself more easily in testing. Wouldn't you just know it... as soon as I added those assertions, the mbstring test suite caught another buffer overrun bug in my UTF-7 conversion code, which I wrote the better part of a year ago. Then, when I started fuzzing the code with libfuzzer, I found and fixed another buffer overflow: If we enter the main loop, which normally outputs 3 decoded Base64 characters, where the first half of a surrogate pair had appeared at the end of the previous run, but the second half does not appear on this run, we need to output one error marker. Then, at the end of the main loop, if the Base64 input ends at an unexpected position AND the last character was not a legal Base64-encoded character, we need to output two error markers for that. The three error markers plus two valid, decoded bytes can push us over the available space in our wchar buffer.
1 parent e4b9aa1 commit 8dddd3c

File tree

1 file changed

+8
-5
lines changed

1 file changed

+8
-5
lines changed

ext/mbstring/libmbfl/filters/mbfilter_utf7.c

Lines changed: 8 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -478,7 +478,7 @@ static uint32_t* handle_base64_end(unsigned char n, unsigned char **p, uint32_t
478478

479479
static size_t mb_utf7_to_wchar(unsigned char **in, size_t *in_len, uint32_t *buf, size_t bufsize, unsigned int *state)
480480
{
481-
ZEND_ASSERT(bufsize >= 4); /* This function will infinite-loop if called with a tiny output buffer */
481+
ZEND_ASSERT(bufsize >= 5); /* This function will infinite-loop if called with a tiny output buffer */
482482

483483
unsigned char *p = *in, *e = p + *in_len;
484484
uint32_t *out = buf, *limit = buf + bufsize;
@@ -489,7 +489,7 @@ static size_t mb_utf7_to_wchar(unsigned char **in, size_t *in_len, uint32_t *buf
489489
while (p < e && out < limit) {
490490
if (base64) {
491491
/* Base64 section */
492-
if ((limit - out) < 4) {
492+
if ((limit - out) < 5) {
493493
break;
494494
}
495495

@@ -631,16 +631,19 @@ static void mb_wchar_to_utf7(uint32_t *in, size_t len, mb_convert_buf *buf, bool
631631
MB_CONVERT_BUF_ENSURE(buf, out, limit, len);
632632
RESTORE_CONVERSION_STATE();
633633
} else {
634-
/* Encode codepoint, preceded by any cached bits, as Base64 */
634+
/* Encode codepoint, preceded by any cached bits, as Base64
635+
* Make enough space in the output buffer to hold both any bytes that
636+
* we emit right here, plus any finishing byte which might need to
637+
* be emitted if the input string ends abruptly */
635638
uint64_t bits;
636639
if (w >= MBFL_WCSPLANE_SUPMIN) {
637640
/* Must use surrogate pair */
638-
MB_CONVERT_BUF_ENSURE(buf, out, limit, 6);
641+
MB_CONVERT_BUF_ENSURE(buf, out, limit, 7);
639642
w -= 0x10000;
640643
bits = ((uint64_t)cache << 32) | 0xD800DC00L | ((w & 0xFFC00) << 6) | (w & 0x3FF);
641644
nbits += 32;
642645
} else {
643-
MB_CONVERT_BUF_ENSURE(buf, out, limit, 3);
646+
MB_CONVERT_BUF_ENSURE(buf, out, limit, 4);
644647
bits = (cache << 16) | w;
645648
nbits += 16;
646649
}

0 commit comments

Comments
 (0)