-
Notifications
You must be signed in to change notification settings - Fork 1.1k
PYTHON-2433 Fix Python 3 ServerDescription/Exception memory leak #520
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
When the SDAM monitor check fails, a ServerDescription is created from the exception. This exception is kept alive via the ServerDescription.error field. Unfortunately, the exception's traceback contains a reference to the previous ServerDescription. Altogether this means that each consecutively failing check leaks memory by building an ever growing chain of ServerDescription -> Exception -> Traceback -> Frame -> ServerDescription -> ... objects. This change breaks the chain and prevents the memory leak by clearing the Exception's __traceback__, __context__, and __cause__ fields.
8d503d6
to
1b40b94
Compare
Rebased to fix test failures caused by PYTHON-2436. |
# If a bug like PYTHON-2433 is reintroduced then too many | ||
# ServerDescriptions will be kept alive and this test will fail: | ||
# AssertionError: 4 != 22 within 5 delta (18 difference) | ||
self.assertAlmostEqual(initial_count, final_count, delta=5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice test. What is behind the choice of using delta=5
? Is that number 5
liable to change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
5 is arbitrary but it allows for some tolerance for race conditions, like having several clients open in the background (creating and destroying ServerDescriptions).
pymongo/server_description.py
Outdated
@@ -69,6 +70,11 @@ def __init__( | |||
self._me = ismaster.me | |||
self._last_update_time = _time() | |||
self._error = error | |||
# PYTHON-2433 Clear error traceback info. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any particular reason you are purging the traceback, etc. in this location as opposed to after emitting the heartbeat failed event in Monitor._check_server
? Why preclude any exception on a ServerDescription from having a traceback?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it is a bit odd to mutate the exception in the ServerDescription constructor. I moved this logic to the Monitor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome job identifying this so quickly. It would have been a real doozy to figure out independently!
Alternative fixes would be:
|
When the SDAM monitor check fails, a ServerDescription is created from the exception. This exception is kept alive via the ServerDescription.error field. Unfortunately, the exception's traceback contains a reference to the previous ServerDescription. Altogether this means that each consecutively failing check leaks memory by building an ever growing chain of ServerDescription -> Exception -> Traceback -> Frame -> ServerDescription -> ... objects. This change breaks the chain and prevents the memory leak by clearing the Exception's __traceback__, __context__, and __cause__ fields. (cherry picked from commit 6c92e6c)
When the SDAM monitor check fails, a ServerDescription is created from the exception. This exception is kept alive via the ServerDescription.error field. Unfortunately, the exception's traceback contains a reference to the previous ServerDescription. Altogether this means that each consecutively failing check leaks memory by building an ever growing chain of ServerDescription -> Exception -> Traceback -> Frame -> ServerDescription -> ... objects. This change breaks the chain and prevents the memory leak by clearing the Exception's __traceback__, __context__, and __cause__ fields. (cherry picked from commit 6c92e6c)
When the SDAM monitor check fails, a ServerDescription is created from
the exception. This exception is kept alive via the
ServerDescription.error field. Unfortunately, the exception's traceback
contains a reference to the previous ServerDescription. Altogether this
means that each consecutively failing check leaks memory by building an
ever growing chain of ServerDescription -> Exception -> Traceback ->
Frame -> ServerDescription -> ... objects.
This change breaks the chain and prevents the memory leak by clearing
the Exception's traceback, context, and cause fields.