Skip to content

elasticsearch-py 8.12 breaks the use of from_ in body parameters #2425

Closed
@pquentin

Description

@pquentin

This bug was initially reported in the Elastic Community Slack. But first, come context.

Context

Since the early days of the Elasticsearch Python client, back in July 2013, the body parameter is the way to specify the request body for requests that accept it. API calls using body look like this:

es.search(index="bonsais", body={"query": {"match_all": {}}, "size": 50})

However, this parameter is an untyped Python dictionary which is not validated by the client. That said, thanks to the Elasticsearch specification which provides the full types of each Elasticsearch API, we can provide a better experience. elasticsearch-py 8.0 did just that, introducing this new way of calling APIs, where the first level of body keys can be specified using Python parameters:

es.search(index="bonsais", query={"match_all": {}}, size=50)

This has various advantages, including better autocompletion and type checks. For example, mypy will raise an error if size is not an integer. And since we realized we could unpack body to typed parameters like this:

es.search(index="bonsais", **{"query": {"match_all": {}}, "size": 50})

We decided to deprecate the body API altogether. However, deprecating body has the following downsides:

  • A lot of code written in the past decade was now triggering a deprecation warning
  • Unknown parameters such as sub_searches or unintentional omissions from the Elasticsearch specification were rejected, causing queries to outright fail, unnecessarily forcing the use of raw requests.
  • Optimizations such as passing an already encoded body to avoid paying the cost of serializing JSON were no longer possible.

The original author of the client, Honza Král, pointed out those issues, and we decided to allow body to work as before, without any warnings, alongside the new API. This is available elasticsearch-py 8.12.0.

The case of Python keywords, like from

One subtlety with the above is that some identifiers are reserved by Python and can't be used as parameters. This is the case of from, for example. As such, es.search(index="bonsais", query={"match_all": {}}, from=100, size=50), is invalid Python code. For this reason, parameter aliases were introduced, and the correct way to write that query was to use from_, eg. es.search(index="bonsais", query={"match_all": {}}, from_=100, size=50). And then, under the hood, from is actually sent to Elasticsearch:

if from_ is not None:
__query["from"] = from_

However, when the body parameter was deprecated in elasticsearch-py 8.12, it was deprecated by converting all body subfields to Python parameters internally, and then updated parameter aliases like from_ to from. This means it was possible to write:

es.search(index="bonsais", body={"query": {"match_all": {}}, "from_": 100, "size": 50})

which was then converted as if we had called:

es.search(index="bonsais", query={"match_all": {}, from_=100, size=50)

to finally send {"query": {"match_all": {}}, "from": 100, "size": 50} as the body to Elasticsearch. This no longer works with elasticsearch-py 8.12.0. The body is used as is, without any inspection, and the correct way to use from with the body parameter is the one that always worked:

es.search(
    index="*",
    body={
        "query": {"match_all": {}},
        "from": 10,
        "size": 10,
    },
)

I'm still not sure what the solution is here.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions