Commit dc3f025

Document and enable OpenTelemetry instrumentation
1 parent d5cdfc2 commit dc3f025

8 files changed, +89 -23 lines changed


docs/guide/integrations.asciidoc

Lines changed: 10 additions & 0 deletions
@@ -4,6 +4,13 @@
44
You can find integration options and information on this page.
55

66

7+
[discrete]
8+
[[opentelemetry]]
9+
=== OpenTelemetry instrumentation
10+
11+
The Python Elasticsearch client supports native OpenTelemetry instrumentation following the https://opentelemetry.io/docs/specs/semconv/database/elasticsearch/[OpenTelemetry Semantic Conventions for Elasticsearch].
12+
Refer to the <<opentelemetry>> page for details.
13+
714
[discrete]
815
[[transport]]
916
=== Transport
@@ -53,3 +60,6 @@ es.options(
5360
------------------------------------
5461

5562
Type hints also allow tools like your IDE to check types and provide better auto-complete functionality.
63+
64+
65+
include::open-telemetry.asciidoc[]

docs/guide/open-telemetry.asciidoc

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
1+
[[opentelemetry]]
2+
=== Using OpenTelemetry
3+
4+
You can use https://opentelemetry.io/[OpenTelemetry] to monitor the performance and behavior of your {es} requests through the Elasticsearch Python Client.
5+
The Python Client comes with built-in OpenTelemetry instrumentation that emits https://www.elastic.co/guide/en/apm/guide/current/apm-distributed-tracing.html[distributed tracing spans] by default.
6+
With that, applications https://opentelemetry.io/docs/languages/python/instrumentation/[manually instrumented with OpenTelemetry] or using the https://opentelemetry.io/docs/languages/python/automatic/[OpenTelemetry Python agent] are enriched with additional spans that contain insightful information about the execution of the {es} requests.
7+
8+
The native instrumentation in the Python Client follows the https://opentelemetry.io/docs/specs/semconv/database/elasticsearch/[OpenTelemetry Semantic Conventions for {es}]. In particular, the instrumentation covers the logical layer of {es} requests: the client creates a single span for each request that the service processes through the Python Client. The following image shows a trace that records the handling of two different {es} requests: an `info` request and a `search` request.
9+
10+
[role="screenshot"]
11+
image::images/otel-waterfall-without-http.png[alt="Distributed trace with Elasticsearch spans",align="center"]
12+
13+
Usually, OpenTelemetry auto-instrumentation modules come with instrumentation support for HTTP-level communication. In this case, in addition to the logical {es} client requests, spans will be captured for the physical HTTP requests emitted by the client. The following image shows a trace with both {es} spans (in blue) and the corresponding HTTP-level spans (in red):
14+
15+
[role="screenshot"]
16+
image::images/otel-waterfall-with-http.png[alt="Distributed trace with Elasticsearch spans",align="center"]
17+
18+
Advanced Python Client behavior, such as round-robin node selection and request retries, is revealed through the combination of the logical {es} spans and the physical HTTP spans. The following example shows a `search` request in a scenario with two nodes:
19+
20+
[role="screenshot"]
21+
image::images/otel-waterfall-retry.png[alt="Distributed trace with Elasticsearch spans",align="center"]
22+
23+
The first node is unavailable and results in an HTTP error, while the retry to the second node succeeds. Both HTTP requests are subsumed by the logical {es} request span (in blue).
24+
25+
[discrete]
26+
==== Set up the OpenTelemetry instrumentation
27+
28+
When using the https://opentelemetry.io/docs/languages/python/instrumentation/[manual Python OpenTelemetry instrumentation] or using the https://opentelemetry.io/docs/languages/python/automatic/[OpenTelemetry Python agent], the Python Client's OpenTelemetry instrumentation is enabled by default and uses the global OpenTelemetry SDK with the global tracer provider.
29+
30+
[discrete]
31+
==== Configuring the OpenTelemetry instrumentation
32+
33+
You can configure the OpenTelemetry instrumentation through environment variables.
34+
The following configuration options are available.
35+
36+
[discrete]
37+
[[opentelemetry-config-enable]]
38+
===== Enable / Disable the OpenTelemetry instrumentation
39+
40+
With this configuration option you can enable (default) or disable the built-in OpenTelemetry instrumentation.
41+
42+
**Default:** `true`
43+
44+
|============
45+
| Environment Variable | `OTEL_PYTHON_INSTRUMENTATION_ELASTICSEARCH_ENABLED`
46+
|============
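The following minimal Python sketch illustrates how the enable flag is interpreted, based on the check this commit introduces in `elasticsearch/_otel.py`. The helper name `instrumentation_enabled` is hypothetical, and the value of `ENABLED_ENV_VAR` is assumed to equal the variable name documented above. Note that the comparison is against the exact string `true`, so other spellings such as `True` turn the instrumentation off.

```python
import os

# Assumption: this matches the ENABLED_ENV_VAR constant in elasticsearch/_otel.py.
ENABLED_ENV_VAR = "OTEL_PYTHON_INSTRUMENTATION_ELASTICSEARCH_ENABLED"


def instrumentation_enabled(environ=None):
    """Hypothetical helper mirroring the check added in this commit.

    An unset variable means "enabled"; any value other than the exact
    string "true" disables the instrumentation.
    """
    environ = os.environ if environ is None else environ
    return environ.get(ENABLED_ENV_VAR, "true") == "true"


print(instrumentation_enabled({}))                          # True
print(instrumentation_enabled({ENABLED_ENV_VAR: "false"}))  # False
print(instrumentation_enabled({ENABLED_ENV_VAR: "True"}))   # False
```

Because the variable is read when the instrumentation object is constructed, it needs to be set in the process environment before that point.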
47+
48+
[discrete]
49+
===== Capture search request bodies
50+
51+
By default, the built-in OpenTelemetry instrumentation does not capture request bodies due to data privacy considerations. Use this option to capture the search queries from the request bodies of {es} search requests if you want this information anyway. The query is either captured raw (`raw`) or not captured at all (`omit`).
52+
53+
**Default:** `omit`
54+
55+
**Valid Options:** `omit`, `raw`
56+
57+
|============
58+
| Environment Variable | `OTEL_PYTHON_INSTRUMENTATION_ELASTICSEARCH_CAPTURE_SEARCH_QUERY`
59+
|============
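As a rough illustration of the option above, the sketch below resolves the capture strategy from the environment, defaulting to `omit`. The helper `resolve_body_strategy` and the constant names are hypothetical. Raising an error on an unknown value is an assumption of this sketch; the documentation does not state how the client treats invalid values.

```python
import os

# Variable name as documented above; the constant names are hypothetical.
CAPTURE_ENV_VAR = "OTEL_PYTHON_INSTRUMENTATION_ELASTICSEARCH_CAPTURE_SEARCH_QUERY"
VALID_STRATEGIES = ("omit", "raw")


def resolve_body_strategy(environ=None):
    """Return "omit" (the default) or "raw" based on the environment."""
    environ = os.environ if environ is None else environ
    value = environ.get(CAPTURE_ENV_VAR, "omit")
    if value not in VALID_STRATEGIES:
        # Assumption: this sketch rejects unknown values explicitly.
        raise ValueError(
            f"{CAPTURE_ENV_VAR} must be one of {VALID_STRATEGIES}, got {value!r}"
        )
    return value


print(resolve_body_strategy({}))                        # omit
print(resolve_body_strategy({CAPTURE_ENV_VAR: "raw"}))  # raw
```

Keeping `omit` as the default means search queries are recorded only after an explicit opt-in.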
60+
61+
[discrete]
62+
==== Overhead
63+
64+
The OpenTelemetry instrumentation (like any other monitoring approach) may add a slight overhead to CPU, memory, or latency. This overhead can occur only when the instrumentation is enabled (default) and an OpenTelemetry SDK is active in the target application. When the instrumentation is disabled or no OpenTelemetry SDK is active in the target application, no monitoring overhead is expected when using the client.
65+
66+
Even when the instrumentation is enabled and actively used (by an OpenTelemetry SDK), the overhead is negligible in the vast majority of cases. In edge cases where the overhead is noticeable, the <<opentelemetry-config-enable,instrumentation can be explicitly disabled>> to eliminate any potential impact on performance.

elasticsearch/_otel.py

Lines changed: 1 addition & 1 deletion
@@ -52,7 +52,7 @@ def __init__(
5252
body_strategy: 'Literal["omit", "raw"]' | None = None,
5353
):
5454
if enabled is None:
55-
enabled = os.environ.get(ENABLED_ENV_VAR, "false") != "false"
55+
enabled = os.environ.get(ENABLED_ENV_VAR, "true") == "true"
5656
self.tracer = tracer or _tracer
5757
self.enabled = enabled and self.tracer is not None
5858

test_elasticsearch/test_client/test_options.py

Lines changed: 11 additions & 20 deletions
@@ -18,6 +18,7 @@
1818

1919
import pytest
2020
from elastic_transport.client_utils import DEFAULT
21+
from elastic_transport import OpenTelemetrySpan
2122

2223
from elasticsearch import AsyncElasticsearch, Elasticsearch
2324
from elasticsearch._sync.client.utils import USER_AGENT
@@ -137,13 +138,12 @@ def test_options_passed_to_perform_request(self):
137138
assert call.pop("retry_on_timeout") is DEFAULT
138139
assert call.pop("retry_on_status") is DEFAULT
139140
assert call.pop("client_meta") is DEFAULT
141+
assert isinstance(call.pop("otel_span"), OpenTelemetrySpan)
140142
assert call == {
141143
"headers": {
142144
"accept": "application/vnd.elasticsearch+json; compatible-with=8",
143145
},
144146
"body": None,
145-
"endpoint_id": "indices.get",
146-
"path_parts": {"index": "test"},
147147
}
148148

149149
# Can be overwritten with .options()
@@ -157,13 +157,12 @@ def test_options_passed_to_perform_request(self):
157157
calls = client.transport.calls
158158
call = calls[("GET", "/test")][1]
159159
assert call.pop("client_meta") is DEFAULT
160+
assert isinstance(call.pop("otel_span"), OpenTelemetrySpan)
160161
assert call == {
161162
"headers": {
162163
"accept": "application/vnd.elasticsearch+json; compatible-with=8",
163164
},
164165
"body": None,
165-
"endpoint_id": "indices.get",
166-
"path_parts": {"index": "test"},
167166
"request_timeout": 1,
168167
"max_retries": 2,
169168
"retry_on_status": (404,),
@@ -184,13 +183,12 @@ def test_options_passed_to_perform_request(self):
184183
calls = client.transport.calls
185184
call = calls[("GET", "/test")][0]
186185
assert call.pop("client_meta") is DEFAULT
186+
assert isinstance(call.pop("otel_span"), OpenTelemetrySpan)
187187
assert call == {
188188
"headers": {
189189
"accept": "application/vnd.elasticsearch+json; compatible-with=8",
190190
},
191191
"body": None,
192-
"endpoint_id": "indices.get",
193-
"path_parts": {"index": "test"},
194192
"request_timeout": 1,
195193
"max_retries": 2,
196194
"retry_on_status": (404,),
@@ -213,13 +211,12 @@ async def test_options_passed_to_async_perform_request(self):
213211
assert call.pop("retry_on_timeout") is DEFAULT
214212
assert call.pop("retry_on_status") is DEFAULT
215213
assert call.pop("client_meta") is DEFAULT
214+
assert isinstance(call.pop("otel_span"), OpenTelemetrySpan)
216215
assert call == {
217216
"headers": {
218217
"accept": "application/vnd.elasticsearch+json; compatible-with=8",
219218
},
220219
"body": None,
221-
"endpoint_id": "indices.get",
222-
"path_parts": {"index": "test"},
223220
}
224221

225222
# Can be overwritten with .options()
@@ -233,13 +230,12 @@ async def test_options_passed_to_async_perform_request(self):
233230
calls = client.transport.calls
234231
call = calls[("GET", "/test")][1]
235232
assert call.pop("client_meta") is DEFAULT
233+
assert isinstance(call.pop("otel_span"), OpenTelemetrySpan)
236234
assert call == {
237235
"headers": {
238236
"accept": "application/vnd.elasticsearch+json; compatible-with=8",
239237
},
240238
"body": None,
241-
"endpoint_id": "indices.get",
242-
"path_parts": {"index": "test"},
243239
"request_timeout": 1,
244240
"max_retries": 2,
245241
"retry_on_status": (404,),
@@ -260,13 +256,12 @@ async def test_options_passed_to_async_perform_request(self):
260256
calls = client.transport.calls
261257
call = calls[("GET", "/test")][0]
262258
assert call.pop("client_meta") is DEFAULT
259+
assert isinstance(call.pop("otel_span"), OpenTelemetrySpan)
263260
assert call == {
264261
"headers": {
265262
"accept": "application/vnd.elasticsearch+json; compatible-with=8",
266263
},
267264
"body": None,
268-
"endpoint_id": "indices.get",
269-
"path_parts": {"index": "test"},
270265
"request_timeout": 1,
271266
"max_retries": 2,
272267
"retry_on_status": (404,),
@@ -397,13 +392,12 @@ def test_options_timeout_parameters(self):
397392
calls = client.transport.calls
398393
call = calls[("GET", "/test")][0]
399394
assert call.pop("client_meta") is DEFAULT
395+
assert isinstance(call.pop("otel_span"), OpenTelemetrySpan)
400396
assert call == {
401397
"headers": {
402398
"accept": "application/vnd.elasticsearch+json; compatible-with=8",
403399
},
404400
"body": None,
405-
"endpoint_id": "indices.get",
406-
"path_parts": {"index": "test"},
407401
"request_timeout": 1,
408402
"max_retries": 2,
409403
"retry_on_status": (404,),
@@ -428,13 +422,12 @@ def test_options_timeout_parameters(self):
428422
calls = client.transport.calls
429423
call = calls[("GET", "/test")][0]
430424
assert call.pop("client_meta") is DEFAULT
425+
assert isinstance(call.pop("otel_span"), OpenTelemetrySpan)
431426
assert call == {
432427
"headers": {
433428
"accept": "application/vnd.elasticsearch+json; compatible-with=8",
434429
},
435430
"body": None,
436-
"endpoint_id": "indices.get",
437-
"path_parts": {"index": "test"},
438431
"request_timeout": 2,
439432
"max_retries": 3,
440433
"retry_on_status": (400,),
@@ -454,13 +447,12 @@ def test_options_timeout_parameters(self):
454447
assert call.pop("retry_on_timeout") is DEFAULT
455448
assert call.pop("retry_on_status") is DEFAULT
456449
assert call.pop("client_meta") is DEFAULT
450+
assert isinstance(call.pop("otel_span"), OpenTelemetrySpan)
457451
assert call == {
458452
"headers": {
459453
"accept": "application/vnd.elasticsearch+json; compatible-with=8",
460454
},
461455
"body": None,
462-
"endpoint_id": "indices.get",
463-
"path_parts": {"index": "test"},
464456
}
465457

466458
client = Elasticsearch(
@@ -477,13 +469,12 @@ def test_options_timeout_parameters(self):
477469
calls = client.transport.calls
478470
call = calls[("GET", "/test")][0]
479471
assert call.pop("client_meta") is DEFAULT
472+
assert isinstance(call.pop("otel_span"), OpenTelemetrySpan)
480473
assert call == {
481474
"headers": {
482475
"accept": "application/vnd.elasticsearch+json; compatible-with=8",
483476
},
484477
"body": None,
485-
"endpoint_id": "indices.get",
486-
"path_parts": {"index": "test"},
487478
"request_timeout": 1,
488479
"max_retries": 2,
489480
"retry_on_status": (404,),
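
The test changes above follow one pattern: pop the `otel_span` entry, check only its type, then compare the remaining options exactly. A minimal, self-contained sketch of that pattern follows. `FakeSpan` is a hypothetical stand-in for `elastic_transport.OpenTelemetrySpan`, and `assert_call_options` is a hypothetical helper, not part of the test suite.

```python
class FakeSpan:
    """Hypothetical stand-in for elastic_transport.OpenTelemetrySpan."""


def assert_call_options(call, expected):
    """Check a recorded call whose span object differs on every request."""
    remaining = dict(call)  # copy so the recorded call is not mutated
    # A fresh span is created per request, so only its type is checked.
    assert isinstance(remaining.pop("otel_span"), FakeSpan)
    # All other options are deterministic and can be compared exactly.
    assert remaining == expected


recorded = {
    "otel_span": FakeSpan(),
    "headers": {"accept": "application/vnd.elasticsearch+json; compatible-with=8"},
    "body": None,
}
assert_call_options(
    recorded,
    {
        "headers": {"accept": "application/vnd.elasticsearch+json; compatible-with=8"},
        "body": None,
    },
)
```

Checking the type separately keeps the dictionary comparison strict for every other option.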

test_elasticsearch/test_otel.py

Lines changed: 1 addition & 2 deletions
@@ -28,7 +28,6 @@
2828
pass
2929

3030

31-
from elasticsearch import JsonSerializer
3231
from elasticsearch._otel import ENABLED_ENV_VAR, OpenTelemetry
3332

3433
pytestmark = [
@@ -51,7 +50,7 @@ def setup_tracing():
5150

5251
def test_enabled():
5352
otel = OpenTelemetry()
54-
assert otel.enabled == (os.environ.get(ENABLED_ENV_VAR, "false") != "false")
53+
assert otel.enabled == (os.environ.get(ENABLED_ENV_VAR, "true") == "true")
5554

5655

5756
def test_minimal_span():
