False 500 testing #57

Merged
merged 9 commits into from
Oct 18, 2022

jonathanfeng-scale
Contributor

Here's the Python test, run with pytest:
[Screenshot: Screen Shot 2022-10-17 at 7 16 43 PM]
[Screenshot: Screen Shot 2022-10-17 at 7 15 30 PM]

Here are the console.log outputs from my local machine, where I'm "reproducing" a similar error (the request returns a 500, but the task is successfully created):
[Screenshot: Screen Shot 2022-10-17 at 7 16 19 PM]
[Screenshot: Screen Shot 2022-10-17 at 7 16 07 PM]

The test for this was a bit unusual: I made our task creation handler force an error during the first task creation. The error had to be forced after the task was created, but before the 200 was sent back.

The same error can't happen on the second task creation; instead, that request should fail with a 409.
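
For readers without access to the screenshots, the scenario can be sketched without a real server. This is a minimal simulation under stated assumptions, not the actual test from this PR: `FakeTaskApi` and `create_task_with_retry` are hypothetical names, and the fake handler forces its error after storing the task, as described above.

```python
class FakeTaskApi:
    """Hypothetical stand-in for the task creation endpoint (not the real API)."""

    def __init__(self, force_false_500=True):
        self.tasks = {}  # unique_id -> stored task
        self.force_false_500 = force_false_500

    def create_task(self, unique_id, payload):
        if unique_id in self.tasks:
            return 409, {"error": "unique_id already exists"}
        # The task is stored first...
        self.tasks[unique_id] = {"unique_id": unique_id, **payload}
        # ...and only then is the error forced, before a 200 could be sent.
        if self.force_false_500:
            self.force_false_500 = False
            return 500, {"error": "forced failure after task creation"}
        return 200, self.tasks[unique_id]

    def get_tasks(self, unique_id):
        task = self.tasks.get(unique_id)
        return 200, {"docs": [task] if task else []}


def create_task_with_retry(api, unique_id, payload):
    """Retry once on a 5xx; treat a 409 that follows a 5xx as a created task."""
    statuses = []
    status, body = api.create_task(unique_id, payload)
    statuses.append(status)
    if status >= 500:
        status, body = api.create_task(unique_id, payload)
        statuses.append(status)
    if status == 409 and statuses[0] >= 500:
        # The first attempt actually succeeded, so fetch the existing task.
        _, listing = api.get_tasks(unique_id)
        return listing["docs"][0], statuses
    if status != 200:
        raise RuntimeError(f"task creation failed with status {status}")
    return body, statuses
```

In this sketch a 409 that is not preceded by a 5xx still raises, so a genuine duplicate `unique_id` is not silently accepted.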

kevin-xu-scale and others added 5 commits October 7, 2022 11:59
- we don't need to force a retry on a 409, it will only result in another 409, so let's just directly handle the issue
- try_history back to 500
Contributor

shaun-scale left a comment

Appreciate the testing here!

Contributor

fatihkurtoglu left a comment

lgtm, left some comments

scaleapi/api.py Outdated
@@ -109,9 +109,28 @@ def _api_request(
        json = None
        if res.status_code == 200:
            json = res.json()
        elif res.status_code == 409:
Contributor

I think our API also returns 409 (Conflict) during batch and project creation.
A 500 during batch or project creation is probably much rarer, but should we add a further guard here to ensure that this behavior applies only to task creation?

Checking that the endpoint of the POST request includes task/ and that unique_id is present in the body would be one way to approach that.

Contributor Author

good idea, will do that
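
The guard suggested in this thread could look roughly like the following. It is only a sketch: `is_task_creation_request` is a hypothetical helper, and the assumption that task creation endpoints look like `task/<type>` is taken from the comment above rather than from the merged code.

```python
def is_task_creation_request(method, endpoint, body):
    """Return True only for a POST to a task/ endpoint whose body has a unique_id.

    Assumption: task creation endpoints start with "task/", while batch and
    project endpoints do not.
    """
    if method.upper() != "POST":
        return False
    if not endpoint.strip("/").startswith("task/"):
        return False
    return isinstance(body, dict) and bool(body.get("unique_id"))
```

The 409 recovery path would then run only when this check passes, so a 409 from batch or project creation keeps its existing behavior.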

scaleapi/api.py Outdated
                # See if the first retry was a 500 error
                if retry_history[0][3] == 500:
                    uuid = body["unique_id"]
                    newUrl = f"https://api.scale.com/v1/tasks?unique_id={uuid}"
Contributor

Suggested change
- newUrl = f"https://api.scale.com/v1/tasks?unique_id={uuid}"
+ newUrl = f"{self.base_api_url}/tasks?unique_id={uuid}"

Contributor

It would probably be ideal later to have an API method that directly returns a single task for a given unique_id (similar to getting a task by task id).

Contributor Author

yeah that would be ideal
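
As an illustration of the suggested change, the lookup URL can be built from a configurable base URL instead of a hard-coded host. This is a sketch, not the merged code: `build_unique_id_lookup_url` is a hypothetical helper, and URL-encoding the `unique_id` is an extra precaution that the diff does not include.

```python
from urllib.parse import urlencode


def build_unique_id_lookup_url(base_api_url, unique_id):
    """Build the tasks listing URL filtered by unique_id.

    urlencode escapes characters such as spaces and "&" so that an unusual
    unique_id cannot break the query string.
    """
    query = urlencode({"unique_id": unique_id})
    return f"{base_api_url.rstrip('/')}/tasks?{query}"
```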

                    newRes = self._http_request(
                        "GET", newUrl, headers=headers, auth=auth
                    )
                    json = newRes.json()["docs"][0]
Contributor

To confirm: will this json be the same as what we return after task creation?

Contributor Author

yep
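
One detail of the `["docs"][0]` lookup is that it raises an `IndexError` if the listing comes back empty. A defensive variant might look like the sketch below; `first_task_from_listing` is a hypothetical helper, and returning `None` for an empty listing is an assumption about the desired behavior, not something this PR specifies.

```python
def first_task_from_listing(payload):
    """Return the first task in a tasks listing payload, or None if it is empty.

    Assumes the listing has the {"docs": [...]} shape used in the diff above.
    """
    docs = payload.get("docs") or []
    if not docs:
        return None
    return docs[0]
```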

            #   redirect_location=None)
            if retry_history != ():
                # See if the first retry was a 500 error
                if retry_history[0][3] == 500:
Contributor

Just to confirm (since I'm not sure): is 500 the only error our API returns for those race conditions (and not other 5xx errors)?

Contributor Author

Yeah, the exact error is task creation taking longer than our timeout, so Scale sends a 500 back in the response.
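
The `retry_history[0][3]` check reads the `status` field of the first retry entry by position. A sketch of the same check using the field name is shown below. To keep it self-contained, it defines a local namedtuple with the same fields as urllib3's `RequestHistory` (method, url, error, status, redirect_location); `first_attempt_was_server_error` and its `only_500` flag are hypothetical and only illustrate the 500-versus-5xx question raised above.

```python
from collections import namedtuple

# Local mirror of the RequestHistory tuple that urllib3 records for each retry.
# Defined here only so the sketch runs without urllib3 installed.
RequestHistory = namedtuple(
    "RequestHistory", ["method", "url", "error", "status", "redirect_location"]
)


def first_attempt_was_server_error(retry_history, only_500=True):
    """Check whether the first recorded attempt ended in a server error.

    With only_500=True this matches the check in the diff (status == 500);
    with only_500=False it accepts any 5xx status.
    """
    if not retry_history:
        return False
    status = retry_history[0].status
    if status is None:  # the attempt failed without an HTTP response
        return False
    if only_500:
        return status == 500
    return 500 <= status <= 599
```

Reading `.status` instead of index `3` keeps the check readable and avoids depending on the field order.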

Contributor

fatihkurtoglu commented Oct 18, 2022

@jonathanfeng-scale Also, let's bump the version in version.py too :)
https://github.com/scaleapi/scaleapi-python-client/blob/master/docs/pypi_update_guide.md has details on how to deploy a new version.
