
Retry when safe #41


Merged: 26 commits merged into master from retry-when-safe, Nov 19, 2015
Conversation

laurentsenta
Contributor

Add a retry policy for idempotent requests (GET, DELETE, and POST requests that set data):

  • Default max_retries to 3 so that users automatically benefit from this change.
  • Catch transient server errors: 429 (too many requests), 503 (service unavailable), 504 (gateway timeout), and BadStatusLine errors (remote disconnected).

Add a dependency on the responses library for testing (it mocks the requests library).

Add __init__.py to the tests/ folder so that nosetests can run individual
tests.

Signed-off-by: lsenta <laurent.senta@gmail.com>
This patch adds the retrier system for all calls to the API (even
unsafe ones).

Signed-off-by: lsenta <laurent.senta@gmail.com>
Right now the API does not raise 404 errors when we delete a resource
that does not exist, so we can retry delete requests until we get a
200.

A correct implementation would treat 404 errors as "delete
success", but that is not needed now and would complicate the retry
implementation (a special retry case only for delete requests).

A test, `test_delete_on_hubstorage_api_does_not_404`, will fail when
this assumption no longer holds, signaling that the retry policy should
be fixed.

Signed-off-by: lsenta <laurent.senta@gmail.com>
    if (isinstance(err, ConnectionError) and err.args[0] == 'Connection aborted.' and
            isinstance(err.args[1], BadStatusLine) and err.args[1][0] == repr('')):
        logger.warning("Protocol failed with BadStatusLine, retrying (maybe)")
        return True
Contributor

Could this conditional break with an updated urllib3 (or httplib), e.g. the behavior that changed in Python 3.5? Do we need to handle the RemoteDisconnected error too?

Contributor Author

RemoteDisconnected inherits from BadStatusLine, so there should be no difference;
but after testing the code in Python 3, it turns out the same error arrives with a different structure. Catching it would look like:

                (isinstance(err, ConnectionError)
                 and isinstance(err.args[0], ProtocolError)
                 and err.args[0].args[0] == 'Connection aborted.'
                 and isinstance(err.args[0].args[1], BadStatusLine))

This is terrible; I'm looking for an idiomatic way to check the list of chained exceptions, something like:

    if BadStatusLine in all_chained_errors(err):

Thanks for the note!
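One way to sketch such a helper (the name `all_chained_errors` comes from the pseudocode above; this implementation is a hypothetical sketch, not part of the merged patch): walk each exception's `args` recursively, plus the Python 3 `__cause__`/`__context__` chain, and collect the types encountered.

```python
from http.client import BadStatusLine  # httplib.BadStatusLine on Python 2

def all_chained_errors(err):
    """Collect the type of every exception reachable from err.

    Looks inside args (e.g. the ProtocolError that requests'
    ConnectionError wraps) and follows __cause__/__context__ chains.
    Hypothetical helper, sketched for this discussion.
    """
    seen = []
    stack = [err]
    while stack:
        e = stack.pop()
        if not isinstance(e, BaseException) or e in seen:
            continue
        seen.append(e)
        # Nested exceptions are often tucked inside args.
        stack.extend(a for a in e.args if isinstance(a, BaseException))
        # Explicit ("raise X from Y") and implicit chaining.
        stack.extend(c for c in (e.__cause__, e.__context__) if c)
    return [type(e) for e in seen]
```

With this, the membership test reads `if BadStatusLine in all_chained_errors(err):` regardless of how deeply the error is wrapped.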

Allows calls to apidelete to override the default is_idempotent
behavior.

Signed-off-by: lsenta <laurent.senta@gmail.com>
@@ -14,6 +17,24 @@
__version__ = pkgutil.get_data('hubstorage', 'VERSION').strip()


logger = logging.getLogger('HubstorageClient')

_ERROR_CODES_TO_RETRY = (429, 503, 504)
Contributor

For a global retry-code list, I think we should add 408 too.
More details at http://blog.haproxy.com/2014/05/26/haproxy-and-http-errors-408-in-chrome/

We have experienced 408s before (#1 and #2)

Contributor Author

👍
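A minimal sketch of how a status-code list including 408 might gate retries, given that the PR only retries idempotent requests (the `should_retry` helper name is hypothetical, chosen for illustration):

```python
# Transient server errors worth retrying, per the discussion above:
# 408 (request timeout), 429 (too many requests),
# 503 (service unavailable), 504 (gateway timeout).
_ERROR_CODES_TO_RETRY = (408, 429, 503, 504)

def should_retry(status_code, is_idempotent):
    """Hypothetical helper: retry only idempotent requests that
    failed with a transient server error."""
    return is_idempotent and status_code in _ERROR_CODES_TO_RETRY
```

Non-idempotent requests are never retried, and permanent errors (e.g. 500) fail immediately instead of burning retry attempts.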

Signed-off-by: lsenta <laurent.senta@gmail.com>
Fixes #14 examples (504 and timeouts)

Signed-off-by: lsenta <laurent.senta@gmail.com>
Change where the dispatch occurs so that _iter_lines simply forwards
the request configuration to the client. The client decides whether
to apply the retry policy depending on the request args.

Move the default GET behavior to apiget so that the iter_json method is
not altered by this change (it already implements a retry that may be
incompatible).

Signed-off-by: lsenta <laurent.senta@gmail.com>
A client may define both max_retries and max_retry_time.
Setting only max_retries=N means that the client will retry N times, no
matter how long it takes.
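A hedged sketch of combining the two limits (`run_with_retries` and `TransientError` are hypothetical names, not the patch's actual API): stop retrying at whichever limit is hit first, and treat an unset time budget as unlimited.

```python
import time

class TransientError(Exception):
    """Stand-in for a retryable failure (e.g. a 503 response)."""

def run_with_retries(fn, max_retries=3, max_retry_time=None):
    # Retry fn() until it succeeds, the retry count is exhausted,
    # or the total elapsed time exceeds max_retry_time (if set).
    start = time.time()
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except TransientError:
            timed_out = (max_retry_time is not None
                         and time.time() - start >= max_retry_time)
            if attempt == max_retries or timed_out:
                raise
```

With max_retry_time=None, behavior degrades to a pure attempt count, matching the "retry N times, no matter how long" semantics described above.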

Signed-off-by: lsenta <laurent.senta@gmail.com>
dangra added a commit that referenced this pull request Nov 19, 2015
@dangra dangra merged commit 456b115 into master Nov 19, 2015
@dangra dangra deleted the retry-when-safe branch November 19, 2015 14:54