-
-
Notifications
You must be signed in to change notification settings - Fork 32.1k
gh-87135: Hang non-main threads that attempt to acquire the GIL during finalization #28525
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 10 commits
cbc2e33
0cb5f8f
703d799
40222ec
ea5cc54
e71b959
2b51995
8022ebe
334777e
6f85036
bdebcf8
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -409,7 +409,11 @@ Initializing and finalizing the interpreter | |
freed. Some memory allocated by extension modules may not be freed. Some | ||
extensions may not work properly if their initialization routine is called more | ||
than once; this can happen if an application calls :c:func:`Py_Initialize` and | ||
:c:func:`Py_FinalizeEx` more than once. | ||
:c:func:`Py_FinalizeEx` more than once. :c:func:`Py_FinalizeEx` must not be | ||
called recursively from within itself. Therefore, it must not be called by any | ||
code that may be run as part of the interpreter shutdown process, such as | ||
:py:mod:`atexit` handlers, object finalizers, or any code that may be run while | ||
flushing the stdout and stderr files. | ||
|
||
.. audit-event:: cpython._PySys_ClearAuditHooks "" c.Py_FinalizeEx | ||
|
||
|
@@ -1000,6 +1004,78 @@ thread, where the CPython global runtime was originally initialized. | |
The only exception is if :c:func:`exec` will be called immediately | ||
after. | ||
|
||
.. _cautions-regarding-runtime-finalization: | ||
|
||
Cautions regarding runtime finalization | ||
--------------------------------------- | ||
|
||
In the late stage of :term:`interpreter shutdown`, after attempting to wait for | ||
non-daemon threads to exit (though this can be interrupted by | ||
:class:`KeyboardInterrupt`) and running the :mod:`atexit` functions, the runtime | ||
is marked as *finalizing*: :c:func:`_Py_IsFinalizing` and | ||
:func:`sys.is_finalizing` return true. At this point, only the *finalization | ||
thread* that initiated finalization (typically the main thread) is allowed to | ||
acquire the :term:`GIL`. | ||
|
||
If any thread, other than the finalization thread, attempts to acquire the GIL | ||
during finalization, either explicitly via :c:func:`PyGILState_Ensure`, | ||
:c:macro:`Py_END_ALLOW_THREADS`, :c:func:`PyEval_AcquireThread`, or | ||
:c:func:`PyEval_AcquireLock`, or implicitly when the interpreter attempts to | ||
reacquire it after having yielded it, the thread enters a permanently blocked | ||
state where it remains until the program exits. In most cases this is harmless, | ||
but this can result in deadlock if a later stage of finalization attempts to | ||
acquire a lock owned by the blocked thread, or otherwise waits on the blocked | ||
thread. | ||
|
||
To avoid non-Python threads becoming blocked, or Python-created threads becoming | ||
blocked while executing C extension code, you can use | ||
:c:func:`PyThread_TryAcquireFinalizeBlock` and | ||
:c:func:`PyThread_ReleaseFinalizeBlock`. | ||
|
||
For example, to deliver an asynchronous notification to Python from a C | ||
extension, you might be inclined to write the following code that is *not* safe | ||
to execute during finalization: | ||
|
||
.. code-block:: c | ||
|
||
// some non-Python created thread that wants to send Python an async notification | ||
PyGILState_STATE state = PyGILState_Ensure(); // may hang thread | ||
// call `call_soon_threadsafe` on some event loop object | ||
PyGILState_Release(state); | ||
|
||
To avoid the possibility of the thread hanging during finalization, and also | ||
support older Python versions: | ||
|
||
.. code-block:: c | ||
|
||
// some non-Python created thread that wants to send Python an async notification | ||
PyGILState_STATE state; | ||
#if PY_VERSION_HEX >= 0x030c0000 // API added in Python 3.12 | ||
int acquired = PyThread_TryAcquireFinalizeBlock(); | ||
if (!acquired) { | ||
// skip sending notification since python is exiting | ||
return; | ||
} | ||
#endif // PY_VERSION_HEX | ||
state = PyGILState_Ensure(); // safe now | ||
// call `call_soon_threadsafe` on some event loop object | ||
PyGILState_Release(state); | ||
#if PY_VERSION_HEX >= 0x030c0000 // API added in Python 3.12 | ||
PyThread_ReleaseFinalizeBlock(); | ||
#endif // PY_VERSION_HEX | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This code looks quite complicated just to acquire the GIL. Why not adding a variant of PyGILState_Ensure() which tries to acquire it, or return a special value if Python is exiting? It would be simpler to use than having to add 2 new function calls:
Code compatible with old Python version:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It sounds like your suggestion is that the same That is feasible, if we modify the definition of Existing definition: typedef
enum {PyGILState_LOCKED, PyGILState_UNLOCKED}
PyGILState_STATE; New definition: typedef
enum {
PyGILState_LOCKED,
PyGILState_UNLOCKED,
PyGILState_EXITING,
PyGILState_LOCKED_WITH_FINALIZE_BLOCK,
PyGILState_UNLOCKED_WITH_FINALIZE_BLOCK}
PyGILState_STATE; The reason we need 3 new states rather than just 1 is because But it seems like it may be confusing to essentially overload the meaning of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @Yhg1s - any thoughts on this There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure I understand the point about PyGILState_Release() being confusing. I think the main question is whether there's a reasonable use-case for the finalize block outside of acquiring the GIL. For the use-case of making sure it's safe to acquire the GIL (and preventing finalization from starting while the GILState is held), I think the PyGILState API makes more sense. At that point it isn't exposed as a separate lock but as a new error condition (and a new safety guarantee when calling TryEnsure). Framing it as " |
||
|
||
Or with the convenience interface (requires Python >=3.12): | ||
|
||
.. code-block:: c | ||
|
||
// some non-Python created thread that wants to send Python an async notification | ||
PyGILState_TRY_STATE state = PyGILState_TryAcquireFinalizeBlockAndGIL(); | ||
if (!state) { | ||
// skip sending notification since python is exiting | ||
return; | ||
} | ||
// call `call_soon_threadsafe` on some event loop object | ||
PyGILState_ReleaseGILAndFinalizeBlock(state); | ||
|
||
High-level API | ||
-------------- | ||
|
@@ -1082,11 +1158,14 @@ code, or when embedding the Python interpreter: | |
ensues. | ||
|
||
.. note:: | ||
Calling this function from a thread when the runtime is finalizing | ||
will terminate the thread, even if the thread was not created by Python. | ||
You can use :c:func:`_Py_IsFinalizing` or :func:`sys.is_finalizing` to | ||
check if the interpreter is in process of being finalized before calling | ||
this function to avoid unwanted termination. | ||
Calling this function from a thread when the runtime is finalizing will | ||
hang the thread until the program exits, even if the thread was not | ||
created by Python. Refer to | ||
:ref:`cautions-regarding-runtime-finalization` for more details. | ||
|
||
.. versionchanged:: 3.12 | ||
Hangs the current thread, rather than terminating it, if called while the | ||
interpreter is finalizing. | ||
|
||
.. c:function:: PyThreadState* PyThreadState_Get() | ||
|
||
|
@@ -1128,11 +1207,14 @@ with sub-interpreters: | |
to call arbitrary Python code. Failure is a fatal error. | ||
|
||
.. note:: | ||
Calling this function from a thread when the runtime is finalizing | ||
will terminate the thread, even if the thread was not created by Python. | ||
You can use :c:func:`_Py_IsFinalizing` or :func:`sys.is_finalizing` to | ||
check if the interpreter is in process of being finalized before calling | ||
this function to avoid unwanted termination. | ||
Calling this function from a thread when the runtime is finalizing will | ||
hang the thread until the program exits, even if the thread was not | ||
created by Python. Refer to | ||
:ref:`cautions-regarding-runtime-finalization` for more details. | ||
|
||
.. versionchanged:: 3.12 | ||
Hangs the current thread, rather than terminating it, if called while the | ||
interpreter is finalizing. | ||
|
||
.. c:function:: void PyGILState_Release(PyGILState_STATE) | ||
|
||
|
@@ -1144,6 +1226,36 @@ with sub-interpreters: | |
Every call to :c:func:`PyGILState_Ensure` must be matched by a call to | ||
:c:func:`PyGILState_Release` on the same thread. | ||
|
||
.. c:function:: PyGILState_TRY_STATE PyGILState_AcquireFinalizeBlockAndGIL() | ||
|
||
Attempts to acquire a :ref:`finalize | ||
block<cautions-regarding-runtime-finalization>`, and if successful, acquires | ||
the :term:`GIL`. | ||
|
||
This is a simple convenience interface that saves having to call | ||
:c:func:`PyThread_TryAcquireFinalizeBlock` and :c:func:`PyGILState_Ensure` | ||
separately. | ||
|
||
Returns ``PyGILState_TRY_LOCK_FAILED`` (equal to 0) if the interpreter is | ||
already waiting to finalize. In this case, the :term:`GIL` is not acquired | ||
and Python C APIs that require the :term:`GIL` must not be called. | ||
|
||
Otherwise, acquires a finalize block and then acquires the :term:`GIL`. | ||
|
||
Each call that is successful (i.e. returns a non-zero | ||
``PyGILState_TRY_STATE`` value) must be paired with a subsequent call to | ||
:c:func:`PyGILState_ReleaseGILAndFinalizeBlock` with the same value returned | ||
by this function. Calling :c:func:`PyGILState_ReleaseGILAndFinalizeBlock` with the | ||
error value ``PyGILState_TRY_LOCK_FAILED`` is safe and does nothing. | ||
|
||
.. versionadded:: 3.12 | ||
|
||
.. c:function:: void PyGILState_ReleaseGILAndFinalizeBlock(PyGILState_TRY_STATE) | ||
|
||
Releases any locks acquired by the corresponding call to | ||
:c:func:`PyGILState_AcquireFinalizeBlockAndGIL`. | ||
|
||
.. versionadded:: 3.12 | ||
|
||
.. c:function:: PyThreadState* PyGILState_GetThisThreadState() | ||
|
||
|
@@ -1410,17 +1522,20 @@ All of the following functions must be called after :c:func:`Py_Initialize`. | |
If this thread already has the lock, deadlock ensues. | ||
|
||
.. note:: | ||
Calling this function from a thread when the runtime is finalizing | ||
will terminate the thread, even if the thread was not created by Python. | ||
You can use :c:func:`_Py_IsFinalizing` or :func:`sys.is_finalizing` to | ||
check if the interpreter is in process of being finalized before calling | ||
this function to avoid unwanted termination. | ||
Calling this function from a thread when the runtime is finalizing will | ||
hang the thread until the program exits, even if the thread was not | ||
created by Python. Refer to | ||
:ref:`cautions-regarding-runtime-finalization` for more details. | ||
|
||
.. versionchanged:: 3.8 | ||
Updated to be consistent with :c:func:`PyEval_RestoreThread`, | ||
:c:func:`Py_END_ALLOW_THREADS`, and :c:func:`PyGILState_Ensure`, | ||
and terminate the current thread if called while the interpreter is finalizing. | ||
|
||
.. versionchanged:: 3.12 | ||
Hangs the current thread, rather than terminating it, if called while the | ||
interpreter is finalizing. | ||
|
||
:c:func:`PyEval_RestoreThread` is a higher-level function which is always | ||
available (even when threads have not been initialized). | ||
|
||
|
@@ -1448,17 +1563,19 @@ All of the following functions must be called after :c:func:`Py_Initialize`. | |
instead. | ||
|
||
.. note:: | ||
Calling this function from a thread when the runtime is finalizing | ||
will terminate the thread, even if the thread was not created by Python. | ||
You can use :c:func:`_Py_IsFinalizing` or :func:`sys.is_finalizing` to | ||
check if the interpreter is in process of being finalized before calling | ||
this function to avoid unwanted termination. | ||
Calling this function from a thread when the runtime is finalizing will | ||
hang the thread until the program exits, even if the thread was not | ||
created by Python. Refer to | ||
:ref:`cautions-regarding-runtime-finalization` for more details. | ||
|
||
.. versionchanged:: 3.8 | ||
Updated to be consistent with :c:func:`PyEval_RestoreThread`, | ||
:c:func:`Py_END_ALLOW_THREADS`, and :c:func:`PyGILState_Ensure`, | ||
and terminate the current thread if called while the interpreter is finalizing. | ||
|
||
.. versionchanged:: 3.12 | ||
Hangs the current thread, rather than terminating it, if called while the | ||
interpreter is finalizing. | ||
|
||
.. c:function:: void PyEval_ReleaseLock() | ||
|
||
|
@@ -1469,6 +1586,32 @@ All of the following functions must be called after :c:func:`Py_Initialize`. | |
:c:func:`PyEval_SaveThread` or :c:func:`PyEval_ReleaseThread` | ||
instead. | ||
|
||
.. c:function:: int PyThread_AcquireFinalizeBlock() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. AFAICT, this API is interpreter-specific, rather than tied to a thread or the global runtime. So That said, the proposed function relies on knowing the interpreter associated with the current thread (e.g. The caller would explicitly provide the interpreter that should be blocked: PyInterpreterState *interp = PyInterpreterState_Get();
if (!PyInterpreterState_AcquireFinalizeBlock(interp)) {
end_early_because_we_are_finalizing_unexpectedly();
return;
}
do_the_normal_work();
PyInterpreterState_ReleaseFinalizeBlock(interp);
return; There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Given what I've said about "finalize block", consider alternate names:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. While it is fairly straightforward to make these finalize blocks interpreter-specific, it is not clear to me, with my limited understanding of sub-interpreters, whether that is actually useful.
As an example, suppose we have a C extension that provides a Python API that allows Python callbacks to be passed in, and then later calls those Python functions on its own non-Python-created thread pool. If this extension is to support sub-interpreters, then either during multi-phase module initialization, or when it receives the Python callback, it must record the PyInterpreterState associated with the callback. Then, in order to invoke the callback on a thread from its thread pool, it must obtain a PyThreadState for the (thread, interpreter) combination, creating one if one does not already exist. To ensure the Given the need for external synchronization of threads when calling There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Before I dig in to responding: one key point to consider is that Python users (especially extension authors, via the C-API) only interact directly with the global runtime via a few API functions. In nearly every case they are instead interacting with the Python thread (state) or interpreter associated with the current OS thread. Another key point is that, as of 3.12, each interpreter has its own GIL. Finally, it is certainly possible that I've misunderstood either the problem you're trying to solve or the way you're trying to solve it or both. I'm completely willing to learn and adjust. Then again, I might be completely right too! (Sorry for the length of this post. I genuinely want to understand and to be sure we're taking the right approach. I appreciate the work you've done and your willingness to converse.) Now, on to responses:
Please clarify. The only thing I see is: "All thread states associated with this interpreter are destroyed." The behavior should be the same for all interpreters, whether via
Caveats:
I do see that it calls There are a few things we don't do in either that we probably should, probably before or right after step (5), e.g. disable the import system, disallow new threads (thread states). Also, step (3) only applies to threads created by the threading module. We might want to extend that to all other threads states (i.e. created via
Looking at
So I'm not sure what you mean specifically. FWIW,
Only daemon threads (and, for now, threads (states) created via the C-API) would still be running at that point, and only until next step (5) above. So are we talking about both the following?
Just to be clear, here are the ways thread states get created:
At the moment, it's mostly only with that last one that we are careful during runtime/interp finalization. It occurs to me that this PR is mostly about addressing that: dealing with other thread states in the same way we currently do threads created via the threading module. Does that sound right?
Yeah, we should probably be more deliberate about disallowing that sort of thing during finalization.
That's what the proposed change in the PR is, AFAICS. The API you're adding must be specific to each interpreter, not to the global runtime. The resources that the proposed change protects are per-interpreter resources, not global ones. So I would not expect there to be any additional API or synchronization mechanism other than what you've already proposed (except applied to each interpreter instead of the main interpreter. Otherwise users of multiple interpreters will still be subject to the problem you're trying to solve.
That's literally what
Why wouldn't we just exclusively use the mechanism you're proposed here? Why would each interpreter have to have an additional duplicate? Again, the resources we're trying to protect here are specific to each interpreter, not to the global runtime, no?
Hmm, I didn't catch what external synchonization of threads you are talking about. Sorry if I missed it or misunderstood. Please restate what you mean specifically. Thanks! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't have any experience with implementing extensions that work with multiple interpreters, but I'm trying to think how that would be done safely. Let's say this extension lets the user schedule a Python callback to invoked at a specific time, on a global thread pool not created by Python. With a single interpreter, the extension may either keep a cached PyThreadState per thread in the pool for the main interpreter, or create it on demand. I haven't checked exactly what happens when trying to create a new PyThreadState while With multiple interpreters, we need a separate PyThreadState per thread in the pool per interpreter, and for each callback that has been scheduled, we also need to store the associated There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think I get it now. The "finalize block" API, as proposed, relies on the global runtime state, which is guaranteed to exist in the process, whereas a given interpreter state pointer may have already been freed. That said, do we continually have the guarantees we might need relative to the global runtime state, since at a certain point we will have freed some of the state the proposed API would need, no? I suppose if we can rely on some final flag for already-finalized then we'd be okay. As to interpreters, even if the target one has been finalized already, we can still know that. Interpreters may be looked up by ID, rather than referenced by pointer. It's an O(n) operation, of course, but I'm not sure that would be a huge obstacle. Likewise the pointer can be checked against the list of alive interpreters to check for validity. Would we need more than that? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it is safe to assume that there may be numerous race conditions still remaining in the finalization logic. In particular I'd assume that calling I am unclear on how sub-interpreters should be handled. Checking if the For the single-interpreter case we don't care about leaking memory because the program is about to exit anyway. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since there are some API questions to resolve here, one option may be to split out the change to make threads hang rather than terminate, which can go in right away, and I expect will be sufficient for almost all single-interpreter use cases. The finalize block API, or other API changes to safely support multi-threading without leaks in the presence of interpreters stopping, could then be added later without blocking the fix for the common case. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agreed @jbms, can you make a PR splitting that part out? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done, see #105805 |
||
|
||
Attempts to acquire a block on Python finalization. | ||
|
||
While the *finalize block* is held, the Python interpreter will block before | ||
it begins finalization. Holding a finalize block ensures that the | ||
:term:`GIL` can be safely acquired without the risk of hanging the thread. | ||
Refer to :ref:`cautions-regarding-runtime-finalization` for more details. | ||
|
||
If successful, returns 1. If the interpreter is already finalizing, or about | ||
to begin finalization and waiting for all previously-acquired finalize blocks | ||
to be released, returns 0 without acquiring a finalize block. | ||
|
||
Every successful call must be paired with a call to | ||
:c:func:`PyThread_ReleaseFinalizeBlock`. | ||
|
||
This function may be safely called with or without holding the :term:`GIL`. | ||
jbms marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
.. versionadded:: 3.12 | ||
|
||
.. c:function:: void PyThread_ReleaseFinalizeBlock() | ||
|
||
Releases a finalize block acquired by a prior successful call to | ||
:c:func:`PyThread_AcquireFinalizeBlock` (return value of 1). | ||
|
||
.. versionadded:: 3.12 | ||
|
||
.. _sub-interpreter-support: | ||
|
||
|
@@ -2007,4 +2150,3 @@ be used in new code. | |
.. c:function:: void* PyThread_get_key_value(int key) | ||
.. c:function:: void PyThread_delete_key_value(int key) | ||
.. c:function:: void PyThread_ReInitTLS() | ||
|
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,7 +17,37 @@ typedef enum PyLockStatus { | |
|
||
PyAPI_FUNC(void) PyThread_init_thread(void); | ||
PyAPI_FUNC(unsigned long) PyThread_start_new_thread(void (*)(void *), void *); | ||
PyAPI_FUNC(void) _Py_NO_RETURN PyThread_exit_thread(void); | ||
/* Terminates the current thread. | ||
* | ||
* WARNING: This function is only safe to call if all functions in the full call | ||
* stack are written to safely allow it. Additionally, the behavior is | ||
* platform-dependent. This function should be avoided, and is no longer called | ||
* by Python itself. It is retained only for compatibility with existing C | ||
* extension code. | ||
* | ||
* With pthreads, calls `pthread_exit` which attempts to unwind the stack and | ||
* call C++ destructors. If a `noexcept` function is reached, the program is | ||
* terminated. | ||
* | ||
* On Windows, calls `_endthreadex` which kills the thread without calling C++ | ||
* destructors. | ||
* | ||
* In either case there is a risk of invalid references remaining to data on the | ||
* thread stack. | ||
*/ | ||
Py_DEPRECATED(3.12) PyAPI_FUNC(void) _Py_NO_RETURN PyThread_exit_thread(void); | ||
gpshead marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
#ifndef Py_LIMITED_API | ||
/* Hangs the thread indefinitely without exiting it. | ||
* | ||
* bpo-42969: There is no safe way to exit a thread other than returning | ||
* normally from its start function. This is used during finalization in lieu | ||
* of actually exiting the thread. Since the program is expected to terminate | ||
* soon anyway, it does not matter if the thread stack stays around until then. | ||
*/ | ||
PyAPI_FUNC(void) _Py_NO_RETURN _PyThread_hang_thread(void); | ||
#endif /* !Py_LIMITED_API */ | ||
Comment on lines
+40
to
+49
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. FWIW, this should probably go in Include/cpython/pythread.h. Also, why the leading underscore? If it's not meant to public use then put it in Include/internal/pycore_pythread.h. Otherwise either drop the leading underscore or add the "PyUnstable_" prefix. (See https://devguide.python.org/developer-workflow/c-api/#c-api, AKA PEP 689.) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done in #105805 |
||
|
||
PyAPI_FUNC(unsigned long) PyThread_get_thread_ident(void); | ||
|
||
#if (defined(__APPLE__) || defined(__linux__) || defined(_WIN32) \ | ||
|
Uh oh!
There was an error while loading. Please reload this page.