
Correctly handled response headers in speak rest client for aura 2 #518

67 changes: 47 additions & 20 deletions deepgram/clients/speak/v1/rest/client.py
@@ -160,15 +160,28 @@ def stream_memory(
self._logger.info("addons: %s", addons)
self._logger.info("headers: %s", headers)

return_vals = [
"content-type",
"request-id",
"model-uuid",
"model-name",
"char-count",
"transfer-encoding",
"date",
]
is_aura_2_model = options["model"].split("-")[1] == "2"

Comment on lines +163 to +164

⚠️ Potential issue

Add error handling for model identification

The code assumes options["model"] exists and follows a specific format (name-number-...). If "model" is missing from options, the lookup raises a KeyError; if the value contains no "-" separator, indexing [1] on the split result raises an IndexError.

Add proper error handling to prevent potential runtime errors:

-        is_aura_2_model = options["model"].split("-")[1] == "2"
+        is_aura_2_model = False
+        if options and "model" in options:
+            model_parts = options["model"].split("-")
+            if len(model_parts) > 1:
+                is_aura_2_model = model_parts[1] == "2"
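
The guarded check above can also be pulled into a small helper so it can be tested on its own. This is only a sketch: the helper name and the assumption that options behaves like a mapping are illustrative, not part of the PR, and the model names in the usage note are example inputs.

```python
from typing import Any, Mapping, Optional


def is_aura_2_model(options: Optional[Mapping[str, Any]]) -> bool:
    """Return True when the second dash-separated segment of the model is "2".

    Hypothetical helper: assumes `options` is a mapping (or None) and that
    model names look like "<family>-<version>-...".
    """
    if not options:
        return False
    model = options.get("model")
    if not isinstance(model, str):
        # Missing or non-string model: treat as "not Aura 2" instead of raising.
        return False
    parts = model.split("-")
    return len(parts) > 1 and parts[1] == "2"
```

With this shape, is_aura_2_model({"model": "aura-2-thalia-en"}) is true, while an empty dict, None, or a model name without a dash all fall back to False rather than raising.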

if is_aura_2_model:
return_vals = [
"content-type",
"request-id",
"model-name",
"characters",
"transfer-encoding",
"date",
]
else:
return_vals = [
"content-type",
"request-id",
"model-uuid",
"model-name",
"char-count",
"transfer-encoding",
"date",
]

result = self.post_memory(
url,
options=options,
@@ -181,17 +194,31 @@ def stream_memory(
)

self._logger.info("result: %s", result)
resp = SpeakRESTResponse(
content_type=str(result["content-type"]),
request_id=str(result["request-id"]),
model_uuid=str(result["model-uuid"]),
model_name=str(result["model-name"]),
characters=int(str(result["char-count"])),
transfer_encoding=str(result["transfer-encoding"]),
date=str(result["date"]),
stream=cast(io.BytesIO, result["stream"]),
stream_memory=cast(io.BytesIO, result["stream"]),
)

if is_aura_2_model:
resp = SpeakRESTResponse(
content_type=str(result["content-type"]),
request_id=str(result["request-id"]),
model_name=str(result["model-name"]),
characters=int(str(result["characters"])),
transfer_encoding=str(result["transfer-encoding"]),
date=str(result["date"]),
stream=cast(io.BytesIO, result["stream"]),
stream_memory=cast(io.BytesIO, result["stream"]),
)
else:
resp = SpeakRESTResponse(
content_type=str(result["content-type"]),
request_id=str(result["request-id"]),
model_uuid=str(result["model-uuid"]),
model_name=str(result["model-name"]),
characters=int(str(result["char-count"])),
transfer_encoding=str(result["transfer-encoding"]),
date=str(result["date"]),
stream=cast(io.BytesIO, result["stream"]),
stream_memory=cast(io.BytesIO, result["stream"]),
)

Comment on lines +198 to +221

🛠️ Refactor suggestion

Avoid duplicating SpeakRESTResponse construction logic

The two branches that create the SpeakRESTResponse differ only in model_uuid and in which header supplies characters ("characters" versus "char-count"), so most of the constructor call is duplicated.

Refactor to reduce duplication and improve maintainability:

-        if is_aura_2_model:
-            resp = SpeakRESTResponse(
-                content_type=str(result["content-type"]),
-                request_id=str(result["request-id"]),
-                model_name=str(result["model-name"]),
-                characters=int(str(result["characters"])),
-                transfer_encoding=str(result["transfer-encoding"]),
-                date=str(result["date"]),
-                stream=cast(io.BytesIO, result["stream"]),
-                stream_memory=cast(io.BytesIO, result["stream"]),
-            )
-        else:
-            resp = SpeakRESTResponse(
-                content_type=str(result["content-type"]),
-                request_id=str(result["request-id"]),
-                model_uuid=str(result["model-uuid"]),
-                model_name=str(result["model-name"]),
-                characters=int(str(result["char-count"])),
-                transfer_encoding=str(result["transfer-encoding"]),
-                date=str(result["date"]),
-                stream=cast(io.BytesIO, result["stream"]),
-                stream_memory=cast(io.BytesIO, result["stream"]),
-            )
+        # Common parameters for all models
+        response_params = {
+            "content_type": str(result["content-type"]),
+            "request_id": str(result["request-id"]),
+            "model_name": str(result["model-name"]),
+            "transfer_encoding": str(result["transfer-encoding"]),
+            "date": str(result["date"]),
+            "stream": cast(io.BytesIO, result["stream"]),
+            "stream_memory": cast(io.BytesIO, result["stream"]),
+        }
+        
+        # Model-specific parameters
+        if is_aura_2_model:
+            response_params["characters"] = int(str(result["characters"]))
+        else:
+            response_params["model_uuid"] = str(result["model-uuid"])
+            response_params["characters"] = int(str(result["char-count"]))
+        
+        resp = SpeakRESTResponse(**response_params)
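
To try the shared-parameters approach outside the SDK, here is a minimal runnable sketch. FakeSpeakResponse is a hypothetical stand-in for SpeakRESTResponse, and build_response is an illustrative name; the real class and its defaults may differ.

```python
import io
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class FakeSpeakResponse:
    """Hypothetical stand-in for SpeakRESTResponse; all fields default."""
    content_type: str = ""
    request_id: str = ""
    model_uuid: str = ""
    model_name: str = ""
    characters: int = 0
    transfer_encoding: str = ""
    date: str = ""
    stream: Optional[io.BytesIO] = None
    stream_memory: Optional[io.BytesIO] = None


def build_response(result: Dict[str, Any], is_aura_2: bool) -> FakeSpeakResponse:
    # Parameters shared by both model families.
    params: Dict[str, Any] = {
        "content_type": str(result["content-type"]),
        "request_id": str(result["request-id"]),
        "model_name": str(result["model-name"]),
        "transfer_encoding": str(result["transfer-encoding"]),
        "date": str(result["date"]),
        "stream": result["stream"],
        "stream_memory": result["stream"],
    }
    # Model-specific parameters: only the character-count header name and
    # the presence of model-uuid differ between the two branches.
    if is_aura_2:
        params["characters"] = int(str(result["characters"]))
    else:
        params["model_uuid"] = str(result["model-uuid"])
        params["characters"] = int(str(result["char-count"]))
    return FakeSpeakResponse(**params)
```

Building the keyword arguments once keeps the model-specific differences in one short branch, so a later header change touches a single line instead of two constructor calls.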

💡 Verification agent

🧩 Analysis chain

Add error handling for missing headers in the response

The code assumes that all expected headers will be present in the response. If a header is missing, accessing it will raise a KeyError.

Add error handling to gracefully handle missing headers:

        # Common parameters for all models
        response_params = {
-            "content_type": str(result["content-type"]),
-            "request_id": str(result["request-id"]),
-            "model_name": str(result["model-name"]),
-            "transfer_encoding": str(result["transfer-encoding"]),
-            "date": str(result["date"]),
+            "content_type": str(result.get("content-type", "")),
+            "request_id": str(result.get("request-id", "")),
+            "model_name": str(result.get("model-name", "")),
+            "transfer_encoding": str(result.get("transfer-encoding", "")),
+            "date": str(result.get("date", "")),
            "stream": cast(io.BytesIO, result["stream"]),
            "stream_memory": cast(io.BytesIO, result["stream"]),
        }
        
        # Model-specific parameters
        if is_aura_2_model:
-            response_params["characters"] = int(str(result["characters"]))
+            response_params["characters"] = int(str(result.get("characters", 0)))
        else:
-            response_params["model_uuid"] = str(result["model-uuid"])
-            response_params["characters"] = int(str(result["char-count"]))
+            response_params["model_uuid"] = str(result.get("model-uuid", ""))
+            response_params["characters"] = int(str(result.get("char-count", 0)))

Let's verify if these headers can ever be missing in the API response:


🏁 Script executed:

#!/bin/bash
# Check the SpeakRESTResponse class to understand which parameters are required
grep -A 20 "class SpeakRESTResponse" --include="*.py" -r .

Length of output: 1721


Action Required: Update Error Handling for Missing Response Headers

The current implementation in deepgram/clients/speak/v1/rest/client.py (lines 198–221) assumes that all headers are present in the API response. However, if any header is missing, a KeyError will be raised. Although the SpeakRESTResponse class provides default values for its attributes, the client code still directly indexes the response dictionary.

Please update the code as follows to gracefully handle missing headers by using result.get() with appropriate default values:

  • Refactor the common response parameter extraction:
    Use result.get("header-key", default) for fields like "content-type", "request-id", "model-name", "transfer-encoding", and "date".

  • Handle model-specific parameters:
    For the Aura 2 model, use result.get("characters", 0) and for the other model type, use result.get("model-uuid", "") and result.get("char-count", 0).

By incorporating these changes, the code will safely handle cases where headers may be missing in the API response, preventing runtime errors.
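
One caveat with result.get(key, default) is that the default only applies when the key is absent; a header that is present but None or non-numeric would still break int(str(...)). The sketch below shows one way to cover those cases. The helper names header_str and header_int are hypothetical, and it assumes result behaves like a mapping of header names to values.

```python
from typing import Any, Mapping


def header_str(result: Mapping[str, Any], key: str, default: str = "") -> str:
    """Return the header as a string, or `default` if missing or None."""
    value = result.get(key)
    return default if value is None else str(value)


def header_int(result: Mapping[str, Any], key: str, default: int = 0) -> int:
    """Return the header parsed as an int, or `default` if missing or invalid."""
    value = result.get(key)
    if value is None:
        return default
    try:
        return int(str(value))
    except ValueError:
        # Header present but not a number: fall back instead of raising.
        return default
```

Falling back silently can hide a malformed response, so a caller that needs to notice missing headers might log the fallback or keep the strict indexing for headers it treats as mandatory.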


self._logger.verbose("resp Object: %s", resp)
self._logger.notice("speak succeeded")
self._logger.debug("SpeakClient.stream LEAVE")