Skip to content

Commit cf15d7e

Browse files
committed
ext/iconv/iconv.c: remove incorrect //IGNORE handling
Some iconv implementations allow you to append "//IGNORE" to the target charset as an indication that it should skip certain conversions. This is true for the two GNU iconv implementations, and is apparently based on a Solaris extension. It has also recently been added to POSIX 2024. Specifically, //IGNORE will cause iconv() to skip input sequences that are valid but cannot be translated to the target charset. It does not skip sequences that are invalid inputs; these still cause an EILSEQ. There is a minimal check for "//IGNORE" in ext/iconv/iconv.c, based on a config.m4 check for ICONV_BROKEN_IGNORE. If the iconv implementation does not support "//IGNORE", PHP will check for it in the target charset string, and will attempt its own workaround. This workaround is wrong for two reasons: 1. The string "//IGNORE" is left in the target charset, so if the current iconv implementation doesn't understand it, it's just going to fail. This is what happens on musl. 2. The PHP workaround looks for EILSEQ, and then attempts to skip the untranslatable sequence. But with non-GNU iconvs (i.e. all affected implementations), this will be backwards. In non-GNU implementations, EILSEQ is raised only for invalid input sequences, and not for untranslatable ones. Even in POSIX 2024, EILSEQ is reserved for input sequences that are invalid rather than untranslatable. In other words, sequences that cause an EILSEQ should never be ignored. This commit removes the workaround. It's not quite right, and removing it doesn't cause any new test failures on musl. It would be nice if PHP could abstract the //IGNORE magic away from the user, but since it has been added to POSIX 2024, I think the simplest thing to do is wait it out. Eventually the other implementations will add support for it, and then there will be no need to work around its absence.
1 parent 2f97c3e commit cf15d7e

File tree

1 file changed

+0
-29
lines changed

1 file changed

+0
-29
lines changed

ext/iconv/iconv.c

Lines changed: 0 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -415,24 +415,6 @@ static php_iconv_err_t _php_iconv_appendc(smart_str *d, const char c, iconv_t cd
415415
}
416416
/* }}} */
417417

418-
/* {{{ */
419-
#ifdef ICONV_BROKEN_IGNORE
420-
static int _php_check_ignore(const char *charset)
421-
{
422-
size_t clen = strlen(charset);
423-
if (clen >= 9 && strcmp("//IGNORE", charset+clen-8) == 0) {
424-
return 1;
425-
}
426-
if (clen >= 19 && strcmp("//IGNORE//TRANSLIT", charset+clen-18) == 0) {
427-
return 1;
428-
}
429-
return 0;
430-
}
431-
#else
432-
#define _php_check_ignore(x) (0)
433-
#endif
434-
/* }}} */
435-
436418
/* {{{ php_iconv_string() */
437419
PHP_ICONV_API php_iconv_err_t php_iconv_string(const char *in_p, size_t in_len, zend_string **out, const char *out_charset, const char *in_charset)
438420
{
@@ -442,7 +424,6 @@ PHP_ICONV_API php_iconv_err_t php_iconv_string(const char *in_p, size_t in_len,
442424
size_t bsz, result = 0;
443425
php_iconv_err_t retval = PHP_ICONV_ERR_SUCCESS;
444426
zend_string *out_buf;
445-
int ignore_ilseq = _php_check_ignore(out_charset);
446427

447428
*out = NULL;
448429

@@ -466,16 +447,6 @@ PHP_ICONV_API php_iconv_err_t php_iconv_string(const char *in_p, size_t in_len,
466447
result = iconv(cd, (ICONV_CONST char **) &in_p, &in_left, (char **) &out_p, &out_left);
467448
out_size = bsz - out_left;
468449
if (result == (size_t)(-1)) {
469-
if (ignore_ilseq && errno == EILSEQ) {
470-
if (in_left <= 1) {
471-
result = 0;
472-
} else {
473-
errno = 0;
474-
in_p++;
475-
in_left--;
476-
continue;
477-
}
478-
}
479450

480451
if (errno == E2BIG && in_left > 0) {
481452
/* converted string is longer than out buffer */

0 commit comments

Comments
 (0)