Inconsistent subprocess.Popen.communicate() behavior between Windows and Posix #134453

Open
@tikuma-lsuhsc

Bug report

Bug description:

This is not a bug per se, since the usage departs from what the documentation describes, but the Posix version of subprocess.Popen.communicate() misbehaves when a large "memoryview"-able object with non-byte items is passed as the input argument. For example, suppose I have a simple pass-through process that takes a long input array x of length greater than 512 and returns it as is:

import subprocess as sp
import numpy as np

x = np.random.randn(44100)  # float64: 8 bytes per item (randn() takes no dtype argument)
ret = sp.run('pass_thru', input=x, stdout=sp.PIPE)  # capture stdout so ret.stdout is not None
y = np.frombuffer(ret.stdout, dtype=float)
assert np.array_equal(x, y)

This example fails the assertion only on Posix, because the returned array y is shorter than x.

This appears to stem from the Posix code path of sp.Popen.communicate():

if self._input:
    input_view = memoryview(self._input)

#[snip]

if key.fileobj is self.stdin:
    chunk = input_view[self._input_offset:self._input_offset + _PIPE_BUF]

    try:
        self._input_offset += os.write(key.fd, chunk)
    except BrokenPipeError:
        selector.unregister(key.fileobj)
        key.fileobj.close()
    else:
        if self._input_offset >= len(self._input):
            #[snip]

Slicing input_view indexes items, not bytes, so when the items of self._input are wider than one byte, each chunk carries more than _PIPE_BUF bytes. os.write() returns a byte count, whereas len(self._input) counts items. Because of this unit mismatch, self._input_offset exceeds len(self._input) after the first few writes, before all of the data has been sent.
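The unit mismatch can be seen without a subprocess. The sketch below uses a standard-library array of doubles as a stand-in for the NumPy array in the report; PIPE_BUF here is a hypothetical constant chosen for illustration, not the value CPython uses on any particular platform.

```python
from array import array

# 1000 double-precision floats, standing in for the NumPy array in the report
data = array('d', range(1000))
view = memoryview(data)

PIPE_BUF = 512  # hypothetical chunk size, for illustration only

# Slicing a memoryview indexes items, not bytes
chunk = view[0:PIPE_BUF]

print("items in view:  ", len(view))      # item count
print("bytes in view:  ", view.nbytes)    # item count * itemsize
print("items in chunk: ", len(chunk))     # PIPE_BUF items
print("bytes in chunk: ", chunk.nbytes)   # PIPE_BUF * itemsize bytes
```

A write of this chunk would report chunk.nbytes bytes, while the termination check compares the running total against len(view) items.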

I think the if statement should check the bytes written against input_view.nbytes instead of len(self._input).
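To illustrate, here is a minimal simulation of the write loop, not the actual CPython code. fake_write is a hypothetical stand-in for os.write() that always accepts the whole chunk, and PIPE_BUF is an assumed constant. The second function is one possible approach (casting the view to bytes so that the slice, the offset, and the limit all share one unit); it is an assumption on my part, not a confirmed fix.

```python
from array import array

PIPE_BUF = 512  # assumed chunk size for this sketch


def fake_write(sink, chunk):
    """Hypothetical stand-in for os.write(): takes the whole chunk, returns bytes written."""
    sink.extend(chunk.tobytes())
    return chunk.nbytes


def send_item_indexed(data):
    """Mimics the quoted loop: item-indexed slices, byte-counted offset, item-counted limit."""
    view = memoryview(data)
    sink = bytearray()
    offset = 0
    while offset < len(data):
        chunk = view[offset:offset + PIPE_BUF]
        offset += fake_write(sink, chunk)
    return bytes(sink)


def send_byte_indexed(data):
    """Sketch of a byte-consistent loop: cast to unsigned bytes, compare against nbytes."""
    view = memoryview(data).cast('B')
    sink = bytearray()
    offset = 0
    while offset < view.nbytes:
        chunk = view[offset:offset + PIPE_BUF]
        offset += fake_write(sink, chunk)
    return bytes(sink)


data = array('d', range(5000))
expected = data.tobytes()

print(len(send_item_indexed(data)), "of", len(expected), "bytes sent (item-indexed)")
print(len(send_byte_indexed(data)), "of", len(expected), "bytes sent (byte-indexed)")
```

In this simulation the item-indexed loop stops early and also skips items between chunks, so the output is not even a prefix of the input.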

This use case is not officially supported, since the input is expected to be bytes-like rather than an arbitrary memoryview-able object. It does work on Windows, though, and the Posix behavior is rather nasty because it works as (wrongly) expected for short inputs, which are common in tests.

I believe an arbitrary memoryview-able object otherwise works as a subprocess input (based on my extensive use of passing audio and video data to and from FFmpeg), so perhaps I should label this issue as a feature request.

P.S. A workaround for the example is to use ret = sp.run('pass_thru', input=x.view('b'), stdout=sp.PIPE).
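For anyone without NumPy at hand, the same workaround can be sketched with the standard library only. This assumes memoryview.cast('B') is an acceptable substitute for x.view('b'), and it replaces the pass_thru program with a hypothetical one-line Python child that copies stdin to stdout.

```python
import subprocess as sp
import sys
from array import array

# Stand-in for the 8-byte float array in the report
data = array('d', (i * 0.5 for i in range(44100)))

# Hypothetical pass-through child: copies stdin to stdout in binary mode
pass_thru = [sys.executable, '-c',
             'import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())']

# Hand communicate() a byte-shaped view so slices, offsets and len() all count bytes
byte_view = memoryview(data).cast('B')
ret = sp.run(pass_thru, input=byte_view, stdout=sp.PIPE, timeout=60)

# Rebuild the array from the returned bytes and compare
y = array('d')
y.frombytes(ret.stdout)
print("round trip intact:", y == data)
```

The cast does not copy the data; it only changes how the existing buffer is indexed.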

CPython versions tested on:

3.13

Operating systems tested on:

No response


Labels: stdlib (Python modules in the Lib dir), topic-subprocess (Subprocess issues), type-bug (An unexpected behavior, bug, or error)
