Description
cc @mrocklin
Hi,
Context:
Right now, pandas treats numexpr as an optional dependency. I'm a packager for Arch Linux, and I recently received feedback on the dask package saying that numexpr was missing from its dependencies: https://aur.archlinux.org/packages/python-dask/
However, dask does not explicitly depend on numexpr. The reason is detailed below.
Analysis:
In pandas.computations, eval() takes an optional argument engine='numexpr'.
If numexpr is not installed, any call that uses the default arguments raises an ImportError from the function _check_engine in pandas/computations/eval.py.
eval() is called (at least) from query, which is why dask's tests fail when they are run without numexpr.
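To make the failure mode concrete, here is a minimal sketch of the per-call workaround: passing engine="python" explicitly so that the expression is evaluated without numexpr. The DataFrame and its column names are made up for illustration, and this assumes a pandas version in which query and eval accept the engine keyword.

```python
import pandas as pd

# Hypothetical sample data, only for illustration.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Explicitly selecting the pure-Python engine avoids the numexpr import,
# so these calls do not depend on numexpr being installed.
filtered = df.query("a > 1", engine="python")
summed = df.eval("a + b", engine="python")

print(filtered)
print(summed)
```

This only helps callers that control the call site; a library such as dask that relies on the default arguments would have to pass the engine through everywhere.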
RFC:
Here is the question: is numexpr really an optional dependency if it is the default value of the engine argument?
I would say no, but comments are welcome :)
From the dask devs' point of view, they do not have to mark numexpr as a dependency because they do not use it explicitly. To me, they can reasonably expect pandas' default arguments to work out of the box.
From the pandas packager's point of view (not me), pandas documents numexpr as optional, so it is packaged as optional. No problem there either: he followed the guidelines.
I see two options:
- pandas changes the default backend
- or pandas adds numexpr as a true dependency.
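As a rough illustration of the first option, the default engine could be resolved at call time depending on whether numexpr is importable. This is only a sketch under assumptions: pick_default_engine and resolve_engine are hypothetical names, not existing pandas functions, and the real change would have to live inside pandas' own engine check.

```python
import importlib.util


def pick_default_engine(preferred="numexpr", fallback="python"):
    """Return `preferred` if that module is importable, else `fallback`.

    Hypothetical helper, not part of pandas.
    """
    if importlib.util.find_spec(preferred) is not None:
        return preferred
    return fallback


def resolve_engine(engine=None):
    """Hypothetical stand-in for the engine handling inside eval().

    An explicit engine is respected; None means "choose a default that
    is actually available".
    """
    if engine is None:
        return pick_default_engine()
    return engine


# Prints "numexpr" when numexpr is installed, "python" otherwise.
print(resolve_engine())
```

With a default resolved this way, a call that uses the default arguments would not raise when numexpr is absent, while users who explicitly ask for engine='numexpr' would still get an error if it is missing.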