Description
Bug report
Bug description:
there is a race where it's possible for BaseSubprocessTransport.__del__
to try to close the transport after the event loop has been closed. this results in an unraisable exception in __del__
, and it can also result in an orphan process being leaked.
the following is a reproducer that triggers the race between run()
exiting and [the process dying and the event loop learning about the process's death]. on my machine, with this reproducer the bug occurs (due to run()
winning the race) maybe 90% of the time:
from __future__ import annotations
import asyncio
from subprocess import PIPE
async def main() -> None:
try:
async with asyncio.timeout(1):
process = await asyncio.create_subprocess_exec(
"/usr/bin/env",
"sh",
"-c",
"while true; do sleep 1; done",
stdin=PIPE,
stdout=PIPE,
stderr=PIPE,
)
try:
await process.wait()
except BaseException:
process.kill()
# N.B.: even though they send it SIGKILL, the user is (very briefly)
# leaking an orphan process to asyncio, because they are not waiting for
# the event loop to learn that the process died. if we added
# while True:
# try:
# await process.wait()
# except CancelledError:
# pass
# else:
# break
# (i.e. if we used structured concurrency) then the race would not
# occur.
raise
except TimeoutError:
pass
if __name__ == "__main__":
asyncio.run(main())
most of the time, running this emits
Exception ignored in: <function BaseSubprocessTransport.__del__ at 0x7fd25db428e0>
Traceback (most recent call last):
File "/usr/lib/python3.11/asyncio/base_subprocess.py", line 126, in __del__
self.close()
File "/usr/lib/python3.11/asyncio/base_subprocess.py", line 104, in close
proto.pipe.close()
File "/usr/lib/python3.11/asyncio/unix_events.py", line 566, in close
self._close(None)
File "/usr/lib/python3.11/asyncio/unix_events.py", line 590, in _close
self._loop.call_soon(self._call_connection_lost, exc)
File "/usr/lib/python3.11/asyncio/base_events.py", line 761, in call_soon
self._check_closed()
File "/usr/lib/python3.11/asyncio/base_events.py", line 519, in _check_closed
raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
this case looks similar to GH-109538. i think the following patch (analogous to GH-111983) fixes it:
diff --git a/Lib/asyncio/base_subprocess.py b/Lib/asyncio/base_subprocess.py
index 6dbde2b696..ba219ba39d 100644
--- a/Lib/asyncio/base_subprocess.py
+++ b/Lib/asyncio/base_subprocess.py
@@ -123,8 +123,11 @@ def close(self):
def __del__(self, _warn=warnings.warn):
if not self._closed:
- _warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
- self.close()
+ if self._loop.is_closed():
+ _warn("loop is closed", ResourceWarning, source=self)
+ else:
+ _warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
+ self.close()
def get_pid(self):
return self._pid
however, there is another case for which the above patch is not sufficient. in the above example the user orphaned the process after sending SIGKILL
/TerminateProcess
(which is not immediate, but only schedules the kill), but what if they fully orphan it?
from __future__ import annotations
import asyncio
from subprocess import PIPE
async def main_leak_subprocess() -> None:
await asyncio.create_subprocess_exec(
"/usr/bin/env",
"sh",
"-c",
"while true; do sleep 1; done",
stdin=PIPE,
stdout=PIPE,
stderr=PIPE,
)
if __name__ == "__main__":
asyncio.run(main_leak_subprocess())
currently (on main
), when the race condition occurs (for this example the condition is run()
winning the race against BaseSubprocessTransport
GC) then asyncio emits a loud complaint Exception ignored in: <function BaseSubprocessTransport.__del__ at 0x7f5b3b291e40>
and leaks the orphan process (check htop
after the interpreter exits!). asyncio probably also leaks the pipes.
but with the patch above, asyncio will quietly leak the orphan process (and probably pipes), but it will not yell about the leak unless the user enables ResourceWarning
s. which is not good.
so a more correct patch (fixes both cases) may be something along the lines of
diff --git a/Lib/asyncio/base_subprocess.py b/Lib/asyncio/base_subprocess.py
index 6dbde2b696..9c86c8444c 100644
--- a/Lib/asyncio/base_subprocess.py
+++ b/Lib/asyncio/base_subprocess.py
@@ -123,8 +123,31 @@ def close(self):
def __del__(self, _warn=warnings.warn):
if not self._closed:
- _warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
- self.close()
+ if not self._loop.is_closed():
+ _warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
+ self.close()
+ else:
+ _warn("loop is closed", ResourceWarning, source=self)
+
+ # self.close() requires the event loop to be open, so we need to reach
+ # into close() and its dependencies and manually do the bare-minimum
+ # cleanup that they'd do if the loop was open. I.e. we do the syscalls
+ # only; we can't interact with an event loop.
+
+ # TODO: Probably need some more stuff here too so that we don't leak fd's...
+
+ if (self._proc is not None and
+ # has the child process finished?
+ self._returncode is None and
+ # the child process has finished, but the
+ # transport hasn't been notified yet?
+ self._proc.poll() is None):
+
+ try:
+ self._proc.kill()
+ except (ProcessLookupError, PermissionError):
+ # the process may have already exited or may be running setuid
+ pass
def get_pid(self):
return self._pid
with this patch applied, neither example leaks an orphan process out of run()
, and both examples emit ResourceWarning
. however this patch is rather messy. it is also perhaps still leaking pipe fd's out of run()
. (the fd's probably get closed by the OS when the interpreter shuts down, but i suspect one end of each pipe will be an orphan from the time when run()
exits to the time when the interpreter shuts down, which can be arbitrarily long).
CPython versions tested on:
3.11, CPython main branch
Operating systems tested on:
Linux
Linked PRs
- gh-114177: avoid calling connection lost callbacks when loop is already closed in asyncio subprocess #134508
- [3.14] gh-114177: avoid calling connection lost callbacks when loop is already closed in asyncio subprocess (GH-134508) #134561
- [3.13] gh-114177: avoid calling connection lost callbacks when loop is already closed in asyncio subprocess (GH-134508) #134562
Metadata
Metadata
Assignees
Projects
Status