Skip to content

Use "LINKER:-undefined,dynamic_lookup" to avoid wrong deduplication #657

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 28, 2025

Conversation

traversaro
Copy link
Contributor

The -undefined dynamic_lookup option is passed to the Python extension to avoid segfault when loading the Python extension on macOS on a Python interpreter that statically links to libpython (such as the conda one), see conda-forge/python-feedstock#595 .

However, on conda-forge builds this option results in linker-related errors:

 [100%] Linking CXX shared module libtorchcodec_pybind_ops7.so
 │ │   cd $SRC_DIR/build/temp.macosx-10.13-x86_64-cpython-311/src/torchcodec/_core && $BUILD_PREFIX/bin/cmake -E cmake_link_script CMakeFiles/libtorchcodec_pybind_ops7.dir/link.txt --verbose=1
 │ │   ld: warning: directory not found for option '-L$PREFIX/lib/intel64'
 │ │   ld: warning: directory not found for option '-L$PREFIX/lib/intel64_win'
 │ │   ld: warning: directory not found for option '-L$PREFIX/lib/win-x64'
 │ │   ld: invalid option to -undefined [ warning | error | suppress | dynamic_lookup ]
 │ │   x86_64-apple-darwin13.4: error: linker command failed with exit code 1 (use -v to see invocation)
 │ │   $BUILD_PREFIX/bin/x86_64-apple-darwin13.4.0-clang++ -march=core2 -mtune=haswell -mssse3 -ftree-vectorize -fPIC -fstack-protector-strong -O2 -pipe -stdlib=libc++ -fvisibility-inlines-hidden -fmessage-length=0 -isystem $PREFIX/include -fdebug-prefix-map=$SRC_DIR=/usr/local/src/conda/torchcodec-0.3.0 -fdebug-prefix-map=$PREFIX=/usr/local/src/conda-prefix -Wall -Wextra -pedantic  -O3 -DNDEBUG -isysroot /Applications/Xcode_15.2.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.13.sdk -mmacosx-version-min=10.13 -bundle -Wl,-headerpad_max_install_names "-undefined dynamic_lookup" -Xlinker -undefined -Xlinker dynamic_lookup -Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs -Wl,-rpath,$PREFIX/lib -L$PREFIX/lib -o libtorchcodec_pybind_ops7.so CMakeFiles/libtorchcodec_pybind_ops7.dir/AVIOFileLikeContext.cpp.o CMakeFiles/libtorchcodec_pybind_ops7.dir/pybind_ops.cpp.o   -L$PREFIX/lib/intel64  -L$PREFIX/lib/intel64_win  -L$PREFIX/lib/win-x64  -Wl,-rpath,$PREFIX/lib/intel64 -Wl,-rpath,$PREFIX/lib/intel64_win -Wl,-rpath,$PREFIX/lib/win-x64 -Wl,-rpath,$SRC_DIR/build/temp.macosx-10.13-x86_64-cpython-311/src/torchcodec/_core libtorchcodec_decoder7.dylib $PREFIX/lib/libavdevice.dylib $PREFIX/lib/libavfilter.dylib $PREFIX/lib/libavformat.dylib $PREFIX/lib/libavcodec.dylib $PREFIX/lib/libavutil.dylib $PREFIX/lib/libswresample.dylib $PREFIX/lib/libswscale.dylib $PREFIX/lib/libtorch.dylib $PREFIX/lib/libtorch_cpu.dylib $PREFIX/lib/libprotobuf.29.3.0.dylib $PREFIX/lib/libabsl_log_internal_check_op.2501.0.0.dylib $PREFIX/lib/libabsl_die_if_null.2501.0.0.dylib $PREFIX/lib/libabsl_log_internal_conditions.2501.0.0.dylib $PREFIX/lib/libabsl_log_internal_message.2501.0.0.dylib $PREFIX/lib/libabsl_log_internal_nullguard.2501.0.0.dylib $PREFIX/lib/libabsl_examine_stack.2501.0.0.dylib $PREFIX/lib/libabsl_log_internal_format.2501.0.0.dylib $PREFIX/lib/libabsl_log_internal_structured_proto.2501.0.0.dylib $PREFIX/lib/libabsl_log_internal_proto.2501.0.0.dylib $PREFIX/lib/libabsl_log_internal_log_sink_set.2501.0.0.dylib $PREFIX/lib/libabsl_log_sink.2501.0.0.dylib $PREFIX/lib/libabsl_log_entry.2501.0.0.dylib $PREFIX/lib/libabsl_flags_internal.2501.0.0.dylib $PREFIX/lib/libabsl_flags_marshalling.2501.0.0.dylib $PREFIX/lib/libabsl_flags_reflection.2501.0.0.dylib $PREFIX/lib/libabsl_flags_config.2501.0.0.dylib $PREFIX/lib/libabsl_flags_program_name.2501.0.0.dylib $PREFIX/lib/libabsl_flags_private_handle_accessor.2501.0.0.dylib $PREFIX/lib/libabsl_flags_commandlineflag.2501.0.0.dylib $PREFIX/lib/libabsl_flags_commandlineflag_internal.2501.0.0.dylib $PREFIX/lib/libabsl_log_initialize.2501.0.0.dylib $PREFIX/lib/libabsl_log_internal_globals.2501.0.0.dylib $PREFIX/lib/libabsl_log_globals.2501.0.0.dylib $PREFIX/lib/libabsl_vlog_config_internal.2501.0.0.dylib $PREFIX/lib/libabsl_log_internal_fnmatch.2501.0.0.dylib $PREFIX/lib/libabsl_raw_hash_set.2501.0.0.dylib $PREFIX/lib/libabsl_hash.2501.0.0.dylib $PREFIX/lib/libabsl_city.2501.0.0.dylib $PREFIX/lib/libabsl_low_level_hash.2501.0.0.dylib $PREFIX/lib/libabsl_hashtablez_sampler.2501.0.0.dylib $PREFIX/lib/libabsl_random_distributions.2501.0.0.dylib $PREFIX/lib/libabsl_random_seed_sequences.2501.0.0.dylib $PREFIX/lib/libabsl_random_internal_pool_urbg.2501.0.0.dylib $PREFIX/lib/libabsl_random_internal_randen.2501.0.0.dylib $PREFIX/lib/libabsl_random_internal_randen_hwaes.2501.0.0.dylib $PREFIX/lib/libabsl_random_internal_randen_hwaes_impl.2501.0.0.dylib $PREFIX/lib/libabsl_random_internal_randen_slow.2501.0.0.dylib $PREFIX/lib/libabsl_random_internal_platform.2501.0.0.dylib $PREFIX/lib/libabsl_random_internal_seed_material.2501.0.0.dylib $PREFIX/lib/libabsl_random_seed_gen_exception.2501.0.0.dylib $PREFIX/lib/libabsl_statusor.2501.0.0.dylib $PREFIX/lib/libabsl_status.2501.0.0.dylib $PREFIX/lib/libabsl_cord.2501.0.0.dylib $PREFIX/lib/libabsl_cordz_info.2501.0.0.dylib $PREFIX/lib/libabsl_cord_internal.2501.0.0.dylib $PREFIX/lib/libabsl_cordz_functions.2501.0.0.dylib $PREFIX/lib/libabsl_exponential_biased.2501.0.0.dylib $PREFIX/lib/libabsl_cordz_handle.2501.0.0.dylib $PREFIX/lib/libabsl_crc_cord_state.2501.0.0.dylib $PREFIX/lib/libabsl_crc32c.2501.0.0.dylib $PREFIX/lib/libabsl_crc_internal.2501.0.0.dylib $PREFIX/lib/libabsl_crc_cpu_detect.2501.0.0.dylib $PREFIX/lib/libabsl_bad_optional_access.2501.0.0.dylib $PREFIX/lib/libabsl_leak_check.2501.0.0.dylib $PREFIX/lib/libabsl_strerror.2501.0.0.dylib $PREFIX/lib/libabsl_str_format_internal.2501.0.0.dylib $PREFIX/lib/libabsl_synchronization.2501.0.0.dylib $PREFIX/lib/libabsl_stacktrace.2501.0.0.dylib $PREFIX/lib/libabsl_symbolize.2501.0.0.dylib $PREFIX/lib/libabsl_debugging_internal.2501.0.0.dylib $PREFIX/lib/libabsl_demangle_internal.2501.0.0.dylib $PREFIX/lib/libabsl_demangle_rust.2501.0.0.dylib $PREFIX/lib/libabsl_decode_rust_punycode.2501.0.0.dylib $PREFIX/lib/libabsl_utf8_for_code_point.2501.0.0.dylib $PREFIX/lib/libabsl_graphcycles_internal.2501.0.0.dylib $PREFIX/lib/libabsl_kernel_timeout_internal.2501.0.0.dylib $PREFIX/lib/libabsl_malloc_internal.2501.0.0.dylib $PREFIX/lib/libabsl_tracing_internal.2501.0.0.dylib $PREFIX/lib/libabsl_time.2501.0.0.dylib $PREFIX/lib/libabsl_civil_time.2501.0.0.dylib $PREFIX/lib/libabsl_time_zone.2501.0.0.dylib -Wl,-framework,CoreFoundation $PREFIX/lib/libabsl_bad_variant_access.2501.0.0.dylib $PREFIX/lib/libabsl_strings.2501.0.0.dylib $PREFIX/lib/libabsl_int128.2501.0.0.dylib $PREFIX/lib/libabsl_strings_internal.2501.0.0.dylib $PREFIX/lib/libabsl_string_view.2501.0.0.dylib $PREFIX/lib/libabsl_base.2501.0.0.dylib $PREFIX/lib/libabsl_spinlock_wait.2501.0.0.dylib $PREFIX/lib/libabsl_throw_delegate.2501.0.0.dylib $PREFIX/lib/libabsl_raw_logging_internal.2501.0.0.dylib $PREFIX/lib/libabsl_log_severity.2501.0.0.dylib $PREFIX/lib/libc10.dylib -lmkl_intel_ilp64 -lmkl_intel_thread -lmkl_core $PREFIX/lib/libc10.dylib
 │ │   make[2]: *** [src/torchcodec/_core/CMakeFiles/libtorchcodec_pybind_ops7.dir/build.make:213: src/torchcodec/_core/libtorchcodec_pybind_ops7.so] Error 1
 │ │   make[2]: Leaving directory '$SRC_DIR/build/temp.macosx-10.13-x86_64-cpython-311'
 │ │   make[1]: *** [CMakeFiles/Makefile2:179: src/torchcodec/_core/CMakeFiles/libtorchcodec_pybind_ops7.dir/all] Error 2
 │ │   make[1]: Leaving directory '$SRC_DIR/build/temp.macosx-10.13-x86_64-cpython-311'
 │ │   make: *** [Makefile:139: all] Error 2
 │ │   Traceback (most recent call last):

I am not sure why this does not happen in the upstream CI, probably the version of the compiler and the linker are slightly different.

However, in a nutshell the error is there as the option deduplication (see https://cmake.org/cmake/help/latest/command/target_link_options.html#option-de-duplication) is done in the wrong way.

As the minimum CMake version is 3.18, we can avoid this wrong deduplication by using the LINKER: prefix, as documented in https://cmake.org/cmake/help/latest/command/target_link_options.html#handling-compiler-driver-differences and done indeed by pybind11 in https://github.com/pybind/pybind11/blob/v2.13.6/tools/pybind11Common.cmake#L122 .

Note that from what I understand, this should not be necessary to do here as this is already done by pybind11 in https://github.com/pybind/pybind11/blob/v2.13.6/tools/pybind11Common.cmake#L122, but I do not fully understand the whole build-related subtleties, so the proposed change should fix the problem without changing anything else.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Apr 28, 2025
Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @traversaro , LGTM, I'll let @scotts take a look and merge since this relates to the MacOS pybind extension build

@traversaro traversaro changed the title Use "LINKER:-undefined,dynamic_lookup" to avoid wrong decuplication Use "LINKER:-undefined,dynamic_lookup" to avoid wrong deduplication Apr 28, 2025
@traversaro
Copy link
Contributor Author

I see that there is a CI failure on 4.4.2, not sure if it is related:

Run ${CONDA_RUN} ffmpeg -buildconf
ffmpeg: error while loading shared libraries: libXau.so.6: cannot open shared object file: No such file or directory
ERROR conda.cli.main_run:execute(125): `conda run ffmpeg -buildconf` failed. (See above for error)
Error: Process completed with exit code 127.

Even if also here I see a mixture of (in general) incompatible channels defaults and conda-forge (see #518 (comment)), the issue here could be related to the fact that xorg libraries used to be CDT (i.e. provided by the distro, not by conda-forge itself) back when ffmpeg 4 was built. Now the same libraries provided in the xorg-libxau package (https://conda-metadata-app.streamlit.app/Search_by_file_path?path=lib%2FlibXau.so.6), so perhaps just installing the xorg-linxau package should fix the problem. However, I am not sure if this is related to this PR.

@NicolasHug
Copy link
Member

The 4.4.2 CI failures are pre-existing, I see them on #635 as well which is unrelated, so we can ignore it for this PR

@scotts
Copy link
Contributor

scotts commented Apr 28, 2025

Thanks @traversaro! To solve the mystery of why you need to manually add LINKER:-undefined,dynamic_lookup when it's already a part of the pybind11 CMake functions: we're not using those! I struggled to get our libraries to build doing everything we needed related to PyTorch and FFmpeg and pybind11 using the pybind11 CMake functions, so I ended up picking out functionality from their functions and putting them in ours. This is almost certainly a better way of specifying what we need.

@scotts scotts merged commit b4619d7 into pytorch:main Apr 28, 2025
44 of 46 checks passed
@traversaro
Copy link
Contributor Author

To solve the mystery of why you need to manually add LINKER:-undefined,dynamic_lookup when it's already a part of the pybind11 CMake functions: we're not using those! I struggled to get our libraries to build doing everything we needed related to PyTorch and FFmpeg and pybind11 using the pybind11 CMake functions, so I ended up picking out functionality from their functions and putting them in ours. This is almost certainly a better way of specifying what we need.

Interesting. However I am still surprised as apparently you link pybind11:module in

pybind11::module # This library dependency makes sure we have the right
and
${library_dependencies}
, and as far as I know the pybind11::module was linking the pybind11::python_link_helper. But looking in the pybind11 CMake code, it seems that pybind11::module does link pybind11::python_link_helper see https://github.com/pybind/pybind11/blob/bc4a66dff0464d0c87291b00a3688999fea5bedc/tools/pybind11NewTools.cmake#L251-L254C49 (while if Python3::Module is defined, it links to it, but in CMake there is a similar logic: https://github.com/Kitware/CMake/blob/9debf90d87c2ad94046b092773deb4314a3f0aa0/Modules/FindPython/Support.cmake#L4302-L4312). So at a first glance is not really a matter of using pybind11's CMake helper, but the LINKER:-undefined,dynamic_lookup should already come by linking to pybind11::module.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Meta Open Source bot.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants