Description
Installation check
- I have read the installation guide.
Platform
I'm building with wolfi: https://github.com/wolfi-dev/os
Installation Method
Built from source
pandas Version
2.1.0, 2.1.1, 2.2.0.dev0
Python Version
3.12
Installation Logs
/home/build/pandas/_libs/khash.pxd:129:0: 'khash_for_primitive_helper.pxi' not found
Which led me to find this:
#51875
But then I started looking at your CI pipelines to see how you build pandas, tried to replicate that, and noticed the build order is different. In the CI pipelines, the first things being built are:
[1/151] Generating pandas/_libs/khash_primitive_helper_pxi with a custom command
[2/151] Generating pandas/_libs/algos_common_helper_pxi with a custom command
[3/151] Generating pandas/_libs/algos_take_helper_pxi with a custom command
[4/151] Generating pandas/_libs/hashtable_class_helper_pxi with a custom command
[5/151] Generating pandas/_libs/index_class_helper_pxi with a custom command
[6/151] Generating pandas/_libs/hashtable_func_helper_pxi with a custom command
[7/151] Generating pandas/__init__.py with a custom command
[8/151] Generating pandas/_libs/intervaltree_helper_pxi with a custom command
[9/151] Generating pandas/_libs/sparse_op_helper_pxi with a custom command
But when I tried to build it myself, the order looked like this:
⚠️ aarch64 | [1/152] Compiling Cython source /home/build/pandas/_libs/window/indexers.pyx
⚠️ aarch64 | [2/152] Compiling Cython source /home/build/pandas/_libs/window/aggregations.pyx
⚠️ aarch64 | [3/152] Compiling Cython source /home/build/pandas/_libs/writers.pyx
⚠️ aarch64 | [4/152] Compiling Cython source /home/build/pandas/_libs/testing.pyx
⚠️ aarch64 | [5/152] Compiling Cython source /home/build/pandas/_libs/tslib.pyx
⚠️ aarch64 | [6/152] Generating pandas/_libs/sparse_op_helper_pxi with a custom command
⚠️ aarch64 | [7/152] Compiling Cython source /home/build/pandas/_libs/byteswap.pyx
⚠️ aarch64 | [8/152] Compiling Cython source /home/build/pandas/_libs/sas.pyx
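For anyone who wants to compare the two logs mechanically, here is a rough sketch that lists the Cython compile steps appearing before the last `_pxi` generation step in a ninja-style log. The function name and the sample constant are hypothetical, and the check only looks at the order of lines in the log; a compile step showing up early is only a problem if that source actually includes one of the generated files.

```python
import re

# Matches ninja-style progress lines such as "[6/152] Generating ...".
# Any prefix before the bracket (e.g. "aarch64 |") is ignored.
STEP = re.compile(r"\[(\d+)/(\d+)\]\s+(.*)")


def compiles_before_last_generate(log_text):
    """Return Cython compile steps logged before the last *_pxi generation step."""
    steps = []
    for line in log_text.splitlines():
        match = STEP.search(line)
        if match:
            steps.append((int(match.group(1)), match.group(3).strip()))

    # Step numbers of the "Generating ..._pxi" lines.
    generate_steps = [
        number
        for number, description in steps
        if description.startswith("Generating") and "_pxi" in description
    ]
    if not generate_steps:
        return []

    last_generate = max(generate_steps)
    return [
        description
        for number, description in steps
        if number < last_generate
        and description.startswith("Compiling Cython source")
    ]


# A few lines copied from the samurai build log above.
SAMURAI_LOG = """\
⚠️ aarch64 | [4/152] Compiling Cython source /home/build/pandas/_libs/testing.pyx
⚠️ aarch64 | [5/152] Compiling Cython source /home/build/pandas/_libs/tslib.pyx
⚠️ aarch64 | [6/152] Generating pandas/_libs/sparse_op_helper_pxi with a custom command
⚠️ aarch64 | [7/152] Compiling Cython source /home/build/pandas/_libs/byteswap.pyx
"""

if __name__ == "__main__":
    for step in compiles_before_last_generate(SAMURAI_LOG):
        print(step)
```

Run against the CI log excerpt, where the generation steps come first, the same function returns an empty list.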
I dug into the differences for a while longer, and the one difference I found was that I was using samurai instead of ninja for the builds. Just by swapping to ninja I was able to make the build work, so that's great. However, I wanted to bring this to your attention in case the ninja ordering is masking a real issue (not saying there is one 😆).
From the samurai README (https://github.com/michaelforney/samurai#differences-from-ninja):
samurai schedules jobs using a stack, so the last scheduled job is the first to execute, while ninja schedules jobs based on the pointer value of the edge structure (they are stored in a std::set<Edge*>), so the first to execute depends on the address returned by malloc. This may result in build failures due to insufficiently specified dependencies in the project's build system.
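To illustrate the kind of failure that paragraph describes, here is a toy simulation, not a model of pandas' actual build graph. The job names and the `helper.pxi` file are hypothetical, and "earliest scheduled first" is only a stand-in for an ordering that happens to work; ninja's real order depends on pointer values, as quoted above. The sketch assumes a compile job needs a generated file but does not declare the job that generates it as a dependency.

```python
def run_build(jobs, pick_last):
    """Run jobs one at a time and return the names of jobs that failed.

    jobs: list of (name, declared_deps, files_needed, files_produced).
    pick_last: True takes the most recently scheduled ready job (a stack);
               False takes the earliest scheduled ready job.
    """
    done, produced, failed = set(), set(), []
    pending = list(jobs)
    ready = []
    while pending or ready:
        # Schedule every pending job whose *declared* dependencies are done.
        for job in [j for j in pending if set(j[1]) <= done]:
            pending.remove(job)
            ready.append(job)
        if not ready:
            break  # remaining jobs are blocked by a failed dependency
        name, _deps, needs, makes = ready.pop() if pick_last else ready.pop(0)
        if set(needs) <= produced:
            produced |= set(makes)
            done.add(name)
        else:
            # The file was needed but nothing forced its generator to run first.
            failed.append(name)
    return failed


# Hypothetical graph: the compile step reads helper.pxi but declares no dependency.
UNDERSPECIFIED = [
    ("generate_helper_pxi", [], [], ["helper.pxi"]),
    ("compile_khash", [], ["helper.pxi"], ["khash.c"]),
]

# Same graph with the dependency declared explicitly.
FIXED = [
    ("generate_helper_pxi", [], [], ["helper.pxi"]),
    ("compile_khash", ["generate_helper_pxi"], ["helper.pxi"], ["khash.c"]),
]

if __name__ == "__main__":
    print(run_build(UNDERSPECIFIED, pick_last=False))
    print(run_build(UNDERSPECIFIED, pick_last=True))
    print(run_build(FIXED, pick_last=True))
```

In this toy graph the underspecified build only succeeds when the generator happens to be picked first, while the version with the declared dependency succeeds under either picking rule.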
I don't yet know enough about how the build dependencies are configured to say whether this is a red herring, but like I said, I just wanted to share it in case it tingles somebody's spidey sense.