Skip to content

PHPC-2262: Replace ICU dep with vendored utf8proc lib #1447

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 19, 2023

Conversation

jmikola
Copy link
Member

@jmikola jmikola commented Jul 18, 2023

https://jira.mongodb.org/browse/PHPC-2262

PHPC-2262: Replace ICU dep with vendored utf8proc lib

This introduces a --with-mongodb-utf8proc configure flag, which accepts "bundled" (default) and "system". Windows will always use bundled sources.

CheckICU.m4 and the --with-mongodb-icu configure flag have been removed.

Bumps libmongoc to 1.25-dev. This also adds support for a new MONGOC_ENABLE_SRV config constant, which is currently set based on detection in CheckResolv.m4. We can consider exposing that as a configure option down the line in PHPC-2256.

@jmikola jmikola requested a review from alcaeus July 18, 2023 18:29
@jmikola jmikola force-pushed the phpc-2262 branch 2 times, most recently from 9626f14 to b91b1f5 Compare July 18, 2023 19:51
This introduces a --with-mongodb-utf8proc configure flag, which accepts "bundled" (default) and "system". Windows will always use bundled sources.

CheckICU.m4 and the --with-mongodb-icu configure flag have been removed.

Bumps libmongoc to 1.25-dev. This also adds support for a new MONGOC_ENABLE_SRV config constant, which is currently set based on detection in CheckResolv.m4. We can consider exposing that as a configure option down the line in PHPC-2256.
@@ -318,7 +318,11 @@ if test "$PHP_MONGODB" != "no"; then

if test "$PHP_MONGODB_SYSTEM_LIBS" = "no"; then
PHP_MONGODB_BUNDLED_CFLAGS="$STD_CFLAGS -DBSON_COMPILATION -DMONGOC_COMPILATION"
PHP_MONGODB_LIBMONGOCRYPT_CFLAGS="-DKMS_MSG_STATIC -DMLIB_USER"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I relocated this within if test "$PHP_MONGODB_CLIENT_SIDE_ENCRYPTION" = "yes"; since it's only referenced within that scope (i.e. building with bundled sources and libmongocrypt is enabled).

Comment on lines +424 to +425
dnl Generated with: find src/libmongoc/src/utf8proc-2.8.0 -maxdepth 1 -name '*.c' ! -name 'utf8proc_data.c' -print0 | cut -sz -d / -f 5- | sort -dz | tr '\000' ' '
dnl Note: utf8proc_data.c is included from utf8proc.c and should not be compiled directly
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added ! -name 'utf8proc_data.c' to the find command in case someone decides to run this manually but update-submodule-sources.php has separate code to exclude utf8proc_data.c.

@@ -464,6 +479,8 @@ if test "$PHP_MONGODB" != "no"; then
fi

if test "$PHP_MONGODB_CLIENT_SIDE_ENCRYPTION" = "yes"; then
PHP_MONGODB_LIBMONGOCRYPT_CFLAGS="-DKMS_MSG_STATIC -DMLIB_USER"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relocated from above.

@@ -86,7 +86,7 @@ if (PHP_MONGODB != "no") {
ADD_EXTENSION_DEP("mongodb", "openssl", false);

var PHP_MONGODB_CFLAGS="\
/D BSON_COMPILATION /D MONGOC_COMPILATION \
/D BSON_COMPILATION /D MONGOC_COMPILATION /D UTF8PROC_STATIC \
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't bother adding system library detection for utf8proc on Windows, so we always build with this constant.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good - I'm not sure whether we should rely on that library being available locally for people that download our DLLs, so compiling with bundled utf8proc seems like the logical choice.

}

ADD_FLAG("CFLAGS_MONGODB", "/EHsc /D U_USING_ICU_NAMESPACE=1");
mongoc_opts.MONGOC_ENABLE_SRV = 1;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is conditionally enabled if we find dnsapi.lib. I assume the behavior is the same if we don't find it, and users will just get a runtime error when attempting to connect with a mongodb+srv:// URI.

@@ -97,6 +100,10 @@ AC_LINK_IFELSE([AC_LANG_PROGRAM([[
])
])

AS_IF([test "$found_resolv" = "yes"],[
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this is set independently of whether we need to link a separate resolv library or can rely on libc.

PHP_ARG_WITH([mongodb-utf8proc],
[whether to use system or bundled utf8proc for SCRAM-SHA-256 SASLprep],
[AS_HELP_STRING([--with-mongodb-utf8proc=@<:@bundled/system@:>@],
[MongoDB: Enable utf8proc for SCRAM-SHA-256 SASLprep [default=bundled]])],
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, ICU was an optional dependency. IIRC, MONGOC_ENABLE_ICU could be undefined and libmongoc would conditionally raise a runtime exception if SASLprep was needed but not possible.

Since utf8proc is vendored, libmongoc provides no mechanism to disable it. We default to bundled sources here for consistency with libmongoc (their boolean build flags default to "on").

Comment on lines +12 to +29
PKG_CHECK_MODULES([PHP_MONGODB_UTF8PROC],[libutf8proc],[
PHP_MONGODB_BUNDLED_CFLAGS="$PHP_MONGODB_BUNDLED_CFLAGS $PHP_MONGODB_UTF8PROC_CFLAGS"
PHP_EVAL_LIBLINE([$PHP_MONGODB_UTF8PROC_LIBS],[MONGODB_SHARED_LIBADD])
found_utf8proc="yes"
],[
PHP_CHECK_LIBRARY([utf8proc],
[utf8proc_NFKC],
[have_utf8proc_lib="yes"],
[have_utf8proc_lib="no"])

AC_CHECK_HEADER([utf8proc.h],
[have_utf8proc_headers=yes],
[have_utf8proc_headers=no])

if test "$have_utf8proc_lib" = "yes" -a "$have_utf8proc_headers" = "yes"; then
PHP_ADD_LIBRARY([utf8proc],,[MONGODB_SHARED_LIBADD])
found_utf8proc="yes"
fi
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I tested both of these code paths locally.

$ pkg-config  libutf8proc --libs
-lutf8proc
pkg-config  libutf8proc --cflags
-DUTF8PROC_EXPORTS

Configure only shows -lutf8proc, but I do see -DUTF8PROC_EXPORTS added to $PHP_MONGODB_BUNDLED_CFLAGS when using pkg-config. I'm not sure if it's required for the second branch. /cc @joshbsiegel

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think UTF8PROC_EXPORTS is relevant, as it's only referenced in the following conditional for Windows:

https://github.com/JuliaStrings/utf8proc/blob/1cb28a66ca79a0845e99433fd1056257456cef8b/utf8proc.h#L123-L137

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I didn't have to do anything with UTF8PROC_EXPORTS when making changes in the C driver.

@jmikola jmikola merged commit d2a600b into mongodb:master Jul 19, 2023
@jmikola jmikola deleted the phpc-2262 branch July 19, 2023 14:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants