-
Notifications
You must be signed in to change notification settings - Fork 84
Correctly handled response headers in speak rest client for aura 2 #518
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughThe Changes
Sequence Diagram(s)sequenceDiagram
participant Caller
participant SpeakRESTClient
participant DeepgramAPI
Caller->>SpeakRESTClient: stream_memory(options)
SpeakRESTClient->>DeepgramAPI: Send request with options
DeepgramAPI-->>SpeakRESTClient: Return response with headers
alt Model is aura 2
SpeakRESTClient->>SpeakRESTClient: Parse headers (use "characters", omit "model_uuid")
SpeakRESTClient->>Caller: Return SpeakRESTResponse (no model_uuid, char count from "characters")
else Other models
SpeakRESTClient->>SpeakRESTClient: Parse headers (use "char-count", include "model_uuid")
SpeakRESTClient->>Caller: Return SpeakRESTResponse (with model_uuid, char count from "char-count")
end
Tip ⚡💬 Agentic Chat (Pro Plan, General Availability)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (3)
deepgram/clients/speak/v1/rest/client.py (3)
165-184
: Consider using a more robust model identification approachThe current approach to identify Aura 2 models relies on a specific model naming convention that could change in the future.
Consider using a more explicit check (like a list of known Aura 2 models) or a configuration setting that would make this code more maintainable:
- is_aura_2_model = options["model"].split("-")[1] == "2" + # List of known Aura 2 models for more explicit identification + AURA_2_MODELS = ["aura-2", "aura-2-en", "aura-2-multilingual"] + is_aura_2_model = options and "model" in options and options["model"] in AURA_2_MODELS
192-194
: Consider dynamically setting the return_vals in the post_memory callThe
file_result
parameter is now using the conditionally setreturn_vals
variable, but the code would be clearer if the conditional logic was closer to where it's used.For better code clarity, consider moving the
return_vals
setting closer to where it's used:result = self.post_memory( url, options=options, addons=addons, headers=headers, json=body, timeout=timeout, - file_result=return_vals, + file_result=[ + "content-type", + "request-id", + "model-name", + "characters" if is_aura_2_model else "char-count", + "transfer-encoding", + "date", + ] + ([] if is_aura_2_model else ["model-uuid"]), **kwargs, )
112-225
: Update the docstring to document the different behavior for Aura 2 modelsThe method's docstring should document the special handling for Aura 2 models to make it clear to users and maintainers.
Add a note to the docstring about the different handling for Aura 2 models:
""" Speak from a text source and store in memory. Args: source (TextSource): The text source to speak. options (SpeakRESTOptions): Additional options for the ingest (default is None). addons (Dict): Additional options for the request (default is None). headers (Dict): Additional headers for the request (default is None). timeout (httpx.Timeout): The timeout for the request (default is None). endpoint (str): The endpoint to use for the request (default is "v1/speak"). Returns: SpeakRESTResponse: The response from the speak request. Raises: DeepgramTypeError: Raised for known API errors. + + Note: + This method handles the response differently for Aura 2 models, as they + return different headers compared to other models. Specifically, Aura 2 + models use "characters" instead of "char-count" and don't include "model-uuid". """
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
deepgram/clients/speak/v1/rest/client.py
(2 hunks)
🔇 Additional comments (1)
deepgram/clients/speak/v1/rest/client.py (1)
112-225
:❓ Verification inconclusive
Add unit tests for both Aura 2 and non-Aura 2 models
According to the PR objectives, tests haven't been added yet. This change introduces conditional logic that should be tested for both cases.
Please add unit tests to verify the correct behavior for both Aura 2 and non-Aura 2 models. Here's an outline of what to test:
For non-Aura 2 models:
- Verify it requests the correct headers including "model-uuid" and "char-count"
- Verify it creates the SpeakRESTResponse with model_uuid and using "char-count" for characters
For Aura 2 models:
- Verify it requests the correct headers including "characters" and excluding "model-uuid"
- Verify it creates the SpeakRESTResponse without model_uuid and using "characters" for characters
Let's check the existing test files to understand how to structure these tests:
🏁 Script executed:
#!/bin/bash # Find test files related to the SpeakRESTClient find . -name "test_*.py" | xargs grep -l "SpeakRESTClient" 2>/dev/nullLength of output: 58
Action required: Add unit tests for conditional logic in stream_memory method
Please add comprehensive unit tests covering both the Aura 2 and non-Aura 2 model scenarios within the stream_memory method in
deepgram/clients/speak/v1/rest/client.py
.
For non-Aura 2 models:
- Ensure that the branch using
"model-uuid"
and"char-count"
is exercised.- Verify that the headers and the resulting
SpeakRESTResponse
include"model-uuid"
and use"char-count"
for the characters field.For Aura 2 models:
- Ensure that the branch based on
options["model"].split("-")[1] == "2"
is exercised.- Verify that
"characters"
is used in the response (and that"model-uuid"
is omitted).Based on our search, there currently appear to be no existing test files referencing the
SpeakRESTClient
. Please manually verify or set up new test files to cover these scenarios.
if is_aura_2_model: | ||
resp = SpeakRESTResponse( | ||
content_type=str(result["content-type"]), | ||
request_id=str(result["request-id"]), | ||
model_name=str(result["model-name"]), | ||
characters=int(str(result["characters"])), | ||
transfer_encoding=str(result["transfer-encoding"]), | ||
date=str(result["date"]), | ||
stream=cast(io.BytesIO, result["stream"]), | ||
stream_memory=cast(io.BytesIO, result["stream"]), | ||
) | ||
else: | ||
resp = SpeakRESTResponse( | ||
content_type=str(result["content-type"]), | ||
request_id=str(result["request-id"]), | ||
model_uuid=str(result["model-uuid"]), | ||
model_name=str(result["model-name"]), | ||
characters=int(str(result["char-count"])), | ||
transfer_encoding=str(result["transfer-encoding"]), | ||
date=str(result["date"]), | ||
stream=cast(io.BytesIO, result["stream"]), | ||
stream_memory=cast(io.BytesIO, result["stream"]), | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Avoid duplicating SpeakRESTResponse construction logic
The two branches for creating the SpeakRESTResponse differ only in a few parameters, leading to code duplication.
Refactor to reduce duplication and improve maintainability:
- if is_aura_2_model:
- resp = SpeakRESTResponse(
- content_type=str(result["content-type"]),
- request_id=str(result["request-id"]),
- model_name=str(result["model-name"]),
- characters=int(str(result["characters"])),
- transfer_encoding=str(result["transfer-encoding"]),
- date=str(result["date"]),
- stream=cast(io.BytesIO, result["stream"]),
- stream_memory=cast(io.BytesIO, result["stream"]),
- )
- else:
- resp = SpeakRESTResponse(
- content_type=str(result["content-type"]),
- request_id=str(result["request-id"]),
- model_uuid=str(result["model-uuid"]),
- model_name=str(result["model-name"]),
- characters=int(str(result["char-count"])),
- transfer_encoding=str(result["transfer-encoding"]),
- date=str(result["date"]),
- stream=cast(io.BytesIO, result["stream"]),
- stream_memory=cast(io.BytesIO, result["stream"]),
- )
+ # Common parameters for all models
+ response_params = {
+ "content_type": str(result["content-type"]),
+ "request_id": str(result["request-id"]),
+ "model_name": str(result["model-name"]),
+ "transfer_encoding": str(result["transfer-encoding"]),
+ "date": str(result["date"]),
+ "stream": cast(io.BytesIO, result["stream"]),
+ "stream_memory": cast(io.BytesIO, result["stream"]),
+ }
+
+ # Model-specific parameters
+ if is_aura_2_model:
+ response_params["characters"] = int(str(result["characters"]))
+ else:
+ response_params["model_uuid"] = str(result["model-uuid"])
+ response_params["characters"] = int(str(result["char-count"]))
+
+ resp = SpeakRESTResponse(**response_params)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if is_aura_2_model: | |
resp = SpeakRESTResponse( | |
content_type=str(result["content-type"]), | |
request_id=str(result["request-id"]), | |
model_name=str(result["model-name"]), | |
characters=int(str(result["characters"])), | |
transfer_encoding=str(result["transfer-encoding"]), | |
date=str(result["date"]), | |
stream=cast(io.BytesIO, result["stream"]), | |
stream_memory=cast(io.BytesIO, result["stream"]), | |
) | |
else: | |
resp = SpeakRESTResponse( | |
content_type=str(result["content-type"]), | |
request_id=str(result["request-id"]), | |
model_uuid=str(result["model-uuid"]), | |
model_name=str(result["model-name"]), | |
characters=int(str(result["char-count"])), | |
transfer_encoding=str(result["transfer-encoding"]), | |
date=str(result["date"]), | |
stream=cast(io.BytesIO, result["stream"]), | |
stream_memory=cast(io.BytesIO, result["stream"]), | |
) | |
# Common parameters for all models | |
response_params = { | |
"content_type": str(result["content-type"]), | |
"request_id": str(result["request-id"]), | |
"model_name": str(result["model-name"]), | |
"transfer_encoding": str(result["transfer-encoding"]), | |
"date": str(result["date"]), | |
"stream": cast(io.BytesIO, result["stream"]), | |
"stream_memory": cast(io.BytesIO, result["stream"]), | |
} | |
# Model-specific parameters | |
if is_aura_2_model: | |
response_params["characters"] = int(str(result["characters"])) | |
else: | |
response_params["model_uuid"] = str(result["model-uuid"]) | |
response_params["characters"] = int(str(result["char-count"])) | |
resp = SpeakRESTResponse(**response_params) |
💡 Verification agent
🧩 Analysis chain
Add error handling for missing headers in the response
The code assumes that all expected headers will be present in the response. If a header is missing, accessing it will raise a KeyError.
Add error handling to gracefully handle missing headers:
# Common parameters for all models
response_params = {
- "content_type": str(result["content-type"]),
- "request_id": str(result["request-id"]),
- "model_name": str(result["model-name"]),
- "transfer_encoding": str(result["transfer-encoding"]),
- "date": str(result["date"]),
+ "content_type": str(result.get("content-type", "")),
+ "request_id": str(result.get("request-id", "")),
+ "model_name": str(result.get("model-name", "")),
+ "transfer_encoding": str(result.get("transfer-encoding", "")),
+ "date": str(result.get("date", "")),
"stream": cast(io.BytesIO, result["stream"]),
"stream_memory": cast(io.BytesIO, result["stream"]),
}
# Model-specific parameters
if is_aura_2_model:
- response_params["characters"] = int(str(result["characters"]))
+ response_params["characters"] = int(str(result.get("characters", 0)))
else:
- response_params["model_uuid"] = str(result["model-uuid"])
- response_params["characters"] = int(str(result["char-count"]))
+ response_params["model_uuid"] = str(result.get("model-uuid", ""))
+ response_params["characters"] = int(str(result.get("char-count", 0)))
Let's verify if these headers can ever be missing in the API response:
🏁 Script executed:
#!/bin/bash
# Check the SpeakRESTResponse class to understand which parameters are required
grep -A 20 "class SpeakRESTResponse" --include="*.py" -r .
Length of output: 1721
Action Required: Update Error Handling for Missing Response Headers
The current implementation in deepgram/clients/speak/v1/rest/client.py
(lines 198–221) assumes that all headers are present in the API response. However, if any header is missing, a KeyError will be raised. Although the SpeakRESTResponse
class provides default values for its attributes, the client code still directly indexes the response dictionary.
Please update the code as follows to gracefully handle missing headers by using result.get()
with appropriate default values:
-
Refactor the common response parameter extraction:
Useresult.get("header-key", default)
for fields like"content-type"
,"request-id"
,"model-name"
,"transfer-encoding"
, and"date"
. -
Handle model-specific parameters:
For the Aura 2 model, useresult.get("characters", 0)
and for the other model type, useresult.get("model-uuid", "")
andresult.get("char-count", 0)
.
Below is the suggested diff snippet:
# Common parameters for all models
response_params = {
- "content_type": str(result["content-type"]),
- "request_id": str(result["request-id"]),
- "model_name": str(result["model-name"]),
- "transfer_encoding": str(result["transfer-encoding"]),
- "date": str(result["date"]),
+ "content_type": str(result.get("content-type", "")),
+ "request_id": str(result.get("request-id", "")),
+ "model_name": str(result.get("model-name", "")),
+ "transfer_encoding": str(result.get("transfer-encoding", "")),
+ "date": str(result.get("date", "")),
"stream": cast(io.BytesIO, result["stream"]),
"stream_memory": cast(io.BytesIO, result["stream"]),
}
# Model-specific parameters
if is_aura_2_model:
- response_params["characters"] = int(str(result["characters"]))
+ response_params["characters"] = int(str(result.get("characters", 0)))
else:
- response_params["model_uuid"] = str(result["model-uuid"])
- response_params["characters"] = int(str(result["char-count"]))
+ response_params["model_uuid"] = str(result.get("model-uuid", ""))
+ response_params["characters"] = int(str(result.get("char-count", 0)))
By incorporating these changes, the code will safely handle cases where headers may be missing in the API response, preventing runtime errors.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if is_aura_2_model: | |
resp = SpeakRESTResponse( | |
content_type=str(result["content-type"]), | |
request_id=str(result["request-id"]), | |
model_name=str(result["model-name"]), | |
characters=int(str(result["characters"])), | |
transfer_encoding=str(result["transfer-encoding"]), | |
date=str(result["date"]), | |
stream=cast(io.BytesIO, result["stream"]), | |
stream_memory=cast(io.BytesIO, result["stream"]), | |
) | |
else: | |
resp = SpeakRESTResponse( | |
content_type=str(result["content-type"]), | |
request_id=str(result["request-id"]), | |
model_uuid=str(result["model-uuid"]), | |
model_name=str(result["model-name"]), | |
characters=int(str(result["char-count"])), | |
transfer_encoding=str(result["transfer-encoding"]), | |
date=str(result["date"]), | |
stream=cast(io.BytesIO, result["stream"]), | |
stream_memory=cast(io.BytesIO, result["stream"]), | |
) | |
# Common parameters for all models | |
response_params = { | |
"content_type": str(result.get("content-type", "")), | |
"request_id": str(result.get("request-id", "")), | |
"model_name": str(result.get("model-name", "")), | |
"transfer_encoding": str(result.get("transfer-encoding", "")), | |
"date": str(result.get("date", "")), | |
"stream": cast(io.BytesIO, result["stream"]), | |
"stream_memory": cast(io.BytesIO, result["stream"]), | |
} | |
# Model-specific parameters | |
if is_aura_2_model: | |
response_params["characters"] = int(str(result.get("characters", 0))) | |
else: | |
response_params["model_uuid"] = str(result.get("model-uuid", "")) | |
response_params["characters"] = int(str(result.get("char-count", 0))) | |
resp = SpeakRESTResponse(**response_params) |
is_aura_2_model = options["model"].split("-")[1] == "2" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add error handling for model identification
The code assumes options["model"]
exists and follows a specific format (name-number-...
). If "model"
is missing from options or doesn't contain at least two segments when split by -
, this will raise a KeyError or IndexError.
Add proper error handling to prevent potential runtime errors:
- is_aura_2_model = options["model"].split("-")[1] == "2"
+ is_aura_2_model = False
+ if options and "model" in options:
+ model_parts = options["model"].split("-")
+ if len(model_parts) > 1:
+ is_aura_2_model = model_parts[1] == "2"
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
is_aura_2_model = options["model"].split("-")[1] == "2" | |
is_aura_2_model = False | |
if options and "model" in options: | |
model_parts = options["model"].split("-") | |
if len(model_parts) > 1: | |
is_aura_2_model = model_parts[1] == "2" |
Proposed changes
There was a bug raised in this discord thread:
https://discord.com/channels/1108042150941294664/1361735544278745359
Essentially the POST response from aura 2 for TTS does not return the same headers as aura 2 and the python sdk previously did not handle the aura 2 headers correctly.
Types of changes
What types of changes does your code introduce to the community Python SDK?
Put an
x
in the boxes that applyChecklist
Put an
x
in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.Further comments
Summary by CodeRabbit