Description
This bug was initially reported in the Elastic Community Slack. But first, some context.
Context
Since the early days of the Elasticsearch Python client, back in July 2013, the body parameter has been the way to specify the request body for requests that accept one. API calls using body look like this:
es.search(index="bonsais", body={"query": {"match_all": {}}, "size": 50})
However, this parameter is an untyped Python dictionary which is not validated by the client. That said, thanks to the Elasticsearch specification which provides the full types of each Elasticsearch API, we can provide a better experience. elasticsearch-py 8.0 did just that, introducing this new way of calling APIs, where the first level of body keys can be specified using Python parameters:
es.search(index="bonsais", query={"match_all": {}}, size=50)
This has various advantages, including better autocompletion and type checks. For example, mypy will raise an error if size is not an integer. And since we realized we could unpack body to typed parameters like this:
es.search(index="bonsais", **{"query": {"match_all": {}}, "size": 50})
We decided to deprecate the body API altogether. However, deprecating body has the following downsides:
- A lot of code written in the past decade was now triggering a deprecation warning.
- Unknown parameters such as sub_searches or unintentional omissions from the Elasticsearch specification were rejected, causing queries to outright fail, unnecessarily forcing the use of raw requests.
- Optimizations such as passing an already encoded body to avoid paying the cost of serializing JSON were no longer possible.
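To illustrate the second point, here is a minimal sketch of why unpacking a body into a fixed signature rejects unknown keys. The typed_search function is a hypothetical stand-in written for this example, not the client's actual method, and the exact error raised by the real client may differ.

```python
from typing import Any, Dict, Optional


def typed_search(
    *,
    index: str,
    query: Optional[Dict[str, Any]] = None,
    size: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical stand-in for a client method with a fixed, typed signature."""
    body: Dict[str, Any] = {}
    if query is not None:
        body["query"] = query
    if size is not None:
        body["size"] = size
    return body


# Known keys unpack cleanly into keyword arguments.
old_style_body = {"query": {"match_all": {}}, "size": 50}
unpacked = typed_search(index="bonsais", **old_style_body)

# A key missing from the signature makes the call fail before any request is built.
body_with_unknown_key = {"query": {"match_all": {}}, "sub_searches": []}
try:
    typed_search(index="bonsais", **body_with_unknown_key)
    rejected = False
    error_message = ""
except TypeError as exc:
    rejected = True
    error_message = str(exc)
```

In this sketch the failure is Python's own TypeError for an unexpected keyword argument, which is why an untyped body that is passed through untouched does not have this limitation.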
The original author of the client, Honza Král, pointed out those issues, and we decided to allow body to work as before, without any warnings, alongside the new API. This is available in elasticsearch-py 8.12.0.
The case of Python keywords, like from
One subtlety with the above is that some identifiers are reserved by Python and can't be used as parameter names. This is the case for from, for example. As such, es.search(index="bonsais", query={"match_all": {}}, from=100, size=50) is invalid Python code. For this reason, parameter aliases were introduced, and the correct way to write that query was to use from_, e.g. es.search(index="bonsais", query={"match_all": {}}, from_=100, size=50). Then, under the hood, from is what is actually sent to Elasticsearch:
elasticsearch-py/elasticsearch/_sync/client/__init__.py
Lines 1280 to 1281 in 5014ce5
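As a rough illustration of this aliasing, the sketch below maps a Python-safe parameter name to the field name Elasticsearch expects. build_search_body is a hypothetical helper written for this example; it is not the code at the lines referenced above.

```python
import keyword
from typing import Any, Dict, Optional

# "from" is a reserved word, so it cannot be used as a parameter name.
from_is_reserved = keyword.iskeyword("from")


def build_search_body(
    query: Optional[Dict[str, Any]] = None,
    from_: Optional[int] = None,
    size: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build a request body from typed parameters."""
    body: Dict[str, Any] = {}
    if query is not None:
        body["query"] = query
    if from_ is not None:
        # The trailing underscore exists only on the Python side;
        # the body uses the real field name.
        body["from"] = from_
    if size is not None:
        body["size"] = size
    return body


request_body = build_search_body(query={"match_all": {}}, from_=100, size=50)
```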
However, when the body parameter was deprecated in elasticsearch-py 8.0, the deprecation worked by converting all body subfields to Python parameters internally, and then mapping parameter aliases like from_ back to from. This means it was possible to write:
es.search(index="bonsais", body={"query": {"match_all": {}}, "from_": 100, "size": 50})
which was then converted as if we had called:
es.search(index="bonsais", query={"match_all": {}}, from_=100, size=50)
to finally send {"query": {"match_all": {}}, "from": 100, "size": 50} as the body to Elasticsearch. This no longer works with elasticsearch-py 8.12.0. The body is used as is, without any inspection, and the correct way to use from with the body parameter is the one that always worked:
es.search(
index="*",
body={
"query": {"match_all": {}},
"from": 10,
"size": 10,
},
)
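The difference between the two behaviors can be sketched in plain Python. Both functions and the PARAM_ALIASES table below are hypothetical simulations written for this example, based only on the description above; they are not the client's implementation.

```python
from typing import Any, Dict

# Hypothetical alias table: Python-safe parameter name -> Elasticsearch field name.
PARAM_ALIASES = {"from_": "from"}


def body_via_parameters(body: Dict[str, Any]) -> Dict[str, Any]:
    """Simulate the earlier behavior: body keys are treated as parameters,
    so aliases such as from_ are rewritten before sending."""
    return {PARAM_ALIASES.get(key, key): value for key, value in body.items()}


def body_as_is(body: Dict[str, Any]) -> Dict[str, Any]:
    """Simulate the 8.12.0 behavior: the body is sent without inspection."""
    return body


body = {"query": {"match_all": {}}, "from_": 100, "size": 50}

sent_before = body_via_parameters(body)  # contains "from", not "from_"
sent_now = body_as_is(body)              # still contains "from_"
```

Code that relied on "from_" inside body therefore sends a different payload after upgrading, while code that used "from" sends the same payload under both behaviors.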
I'm still not sure what the solution is here.