
Retry when safe #41


Merged: 26 commits merged into master from retry-when-safe, Nov 19, 2015
Conversation

laurentsenta
Contributor

Add a retry policy for idempotent requests (GET, DELETE, and POST requests that set data):

  • Default max_retries to 3 so that users automatically benefit from this change.
  • Catch transient server errors: 429 (too many requests), 503 (service unavailable), 504 (gateway timeout), and BadStatusLine errors (remote disconnected).

Add a dependency on the responses library for testing (it mocks the requests library).

Add __init__.py to the tests/ folder so that nosetests can run individual
tests.

Signed-off-by: lsenta <laurent.senta@gmail.com>
This patch adds the retrier system for all calls to the API (even
unsafe ones).

Signed-off-by: lsenta <laurent.senta@gmail.com>
Right now the API does not raise 404 errors when we delete a resource
that does not exist, so we can retry delete requests until we get a
200.

A correct implementation would treat 404 errors as "delete
success", but that is not needed now and would complicate the retry
implementation (a special retry case only for delete requests).

A test, `test_delete_on_hubstorage_api_does_not_404`, will fail when
this assumption no longer holds, signaling that the retry policy should
be fixed.

Signed-off-by: lsenta <laurent.senta@gmail.com>
    if (isinstance(err, ConnectionError) and err.args[0] == 'Connection aborted.' and
            isinstance(err.args[1], BadStatusLine) and err.args[1][0] == repr('')):
        logger.warning("Protocol failed with BadStatusLine, retrying (maybe)")
        return True
Contributor

Could this conditional break with an updated urllib3 (or httplib), e.g. the behavior that changed in Python 3.5? Do we need to handle the RemoteDisconnected error too?

Contributor Author

RemoteDisconnected inherits from BadStatusLine, so there should be no difference;
but after testing the code in Python 3, it turns out the same error arrives with a different structure. Catching it would look like:

                (isinstance(err, ConnectionError)
                 and isinstance(err.args[0], ProtocolError)
                 and err.args[0].args[0] == 'Connection aborted.'
                 and isinstance(err.args[0].args[1], BadStatusLine))

This is terrible; I'm looking for an idiomatic way to check the list of chained exceptions, something like:

    if BadStatusLine in all_chained_errors(err):

Thanks for the note!
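One way to sketch such a helper (the name `all_chained_errors` comes from the pseudocode above; this implementation is a hypothetical sketch, not part of the merged patch): walk each exception's `args` recursively, plus the Python 3 `__cause__`/`__context__` chain, and collect the types encountered.

```python
from http.client import BadStatusLine  # httplib.BadStatusLine on Python 2

def all_chained_errors(err):
    """Collect the type of every exception reachable from err.

    Looks inside args (e.g. the ProtocolError that requests'
    ConnectionError wraps) and follows __cause__/__context__ chains.
    Hypothetical helper, sketched for this discussion.
    """
    seen = []
    stack = [err]
    while stack:
        e = stack.pop()
        if not isinstance(e, BaseException) or e in seen:
            continue
        seen.append(e)
        # Nested exceptions are often tucked inside args.
        stack.extend(a for a in e.args if isinstance(a, BaseException))
        # Explicit ("raise X from Y") and implicit chaining.
        stack.extend(c for c in (e.__cause__, e.__context__) if c)
    return [type(e) for e in seen]
```

With this, the membership test reads `if BadStatusLine in all_chained_errors(err):` regardless of how deeply the error is wrapped.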

Allows calls to apidelete to override the default is_idempotent
behavior.

Signed-off-by: lsenta <laurent.senta@gmail.com>
@@ -14,6 +17,24 @@
__version__ = pkgutil.get_data('hubstorage', 'VERSION').strip()


logger = logging.getLogger('HubstorageClient')

_ERROR_CODES_TO_RETRY = (429, 503, 504)
Contributor

For a global retry-code list, I think we should add 408 too.
More details at http://blog.haproxy.com/2014/05/26/haproxy-and-http-errors-408-in-chrome/

We have experienced 408s before (#1 and #2)

Contributor Author

👍
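A minimal sketch of how a status-code list including 408 might gate retries, given that the PR only retries idempotent requests (the `should_retry` helper name is hypothetical, chosen for illustration):

```python
# Transient server errors worth retrying, per the discussion above:
# 408 (request timeout), 429 (too many requests),
# 503 (service unavailable), 504 (gateway timeout).
_ERROR_CODES_TO_RETRY = (408, 429, 503, 504)

def should_retry(status_code, is_idempotent):
    """Hypothetical helper: retry only idempotent requests that
    failed with a transient server error."""
    return is_idempotent and status_code in _ERROR_CODES_TO_RETRY
```

Non-idempotent requests are never retried, and permanent errors (e.g. 500) fail immediately instead of burning retry attempts.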

Signed-off-by: lsenta <laurent.senta@gmail.com>
Fixes #14 examples (504 and timeouts)

Signed-off-by: lsenta <laurent.senta@gmail.com>
Change where the dispatch occurs so that _iter_lines simply forwards
the request configuration to the client. The client decides whether
to apply the retry policy depending on the request args.

Move the default GET behavior to apiget so that the iter_json method is
not altered by this change (it already implements a retry that may be
incompatible).

Signed-off-by: lsenta <laurent.senta@gmail.com>
A client may define both max_retries and max_retry_time.
Setting only max_retries=N means that the client will retry N times, no
matter how long it takes.
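A hedged sketch of combining the two limits (`run_with_retries` and `TransientError` are hypothetical names, not the patch's actual API): stop retrying at whichever limit is hit first, and treat an unset time budget as unlimited.

```python
import time

class TransientError(Exception):
    """Stand-in for a retryable failure (e.g. a 503 response)."""

def run_with_retries(fn, max_retries=3, max_retry_time=None):
    # Retry fn() until it succeeds, the retry count is exhausted,
    # or the total elapsed time exceeds max_retry_time (if set).
    start = time.time()
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except TransientError:
            timed_out = (max_retry_time is not None
                         and time.time() - start >= max_retry_time)
            if attempt == max_retries or timed_out:
                raise
```

With max_retry_time=None, behavior degrades to a pure attempt count, matching the "retry N times, no matter how long" semantics described above.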

Signed-off-by: lsenta <laurent.senta@gmail.com>
dangra added a commit that referenced this pull request Nov 19, 2015
@dangra dangra merged commit 456b115 into master Nov 19, 2015
@dangra dangra deleted the retry-when-safe branch November 19, 2015 14:54