-
Notifications
You must be signed in to change notification settings - Fork 39
Enable public decoder creation with file like object #616
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
source (str, ``Pathlib.path``, ``torch.Tensor``, or bytes): The source of the audio: | ||
source (str, ``Pathlib.path``, | ||
``io.RawIOBase``, ``io.BufferedReader``, | ||
bytes, or ``torch.Tensor``): The source of the audio: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this still render okay? I wanted to keep line length under control.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure, we'll have to check the built docs before merging. But I don't think we should always stick to actual types in the documentation, because they're not always the most informative. Anecdotally, I personally have no idea what io.RawIOBase
or io.BufferedReader
are, so reading these types in the doc doesn't tell me much about what I can and cannot do.
I think it would be much more informative if we were to document instead something like:
source (str, ``Pathlib.path``, bytes, ``torch.Tensor``, or file-like object):
and then describe what we mean by "file-like" later in the docstring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Answer: no: Docs build failed. :(
elif isinstance(source, bytes): | ||
return core.create_from_bytes(source, seek_mode) | ||
elif isinstance(source, Tensor): | ||
return core.create_from_tensor(source, seek_mode) | ||
elif isinstance(source, io.TextIOBase): | ||
raise TypeError( | ||
"source is of type io.TextIOBase; did you forget to specify binary reading?" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: it's not obvious what an io.TextIOBase
is (I have no idea). IMO we should just guess something like
"source is of type io.TextIOBase; did you forget to specify binary reading?" | |
"Looks like you specified the source as open(..., 'r')? Try with 'rb' for binary reading?" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, if users provide that, they'll see the kinds of exceptions that we test for in test_ops.py. Are we okay with that? Is that enough?
I think it's enough for now. If we get reports from confused users, we can try to get stricter.
Approving to unblock, but my general comment is that we shouldn't expose the expected types in the docs or in the error message, because these aren't informative for users. In type annotations, it's OK (it might be too strict as well, but we can figure it out later).
@NicolasHug, what's your suggestion for the docs? Just call it a file-like object? In tutorials, we can link to the Python glossary term (https://docs.python.org/3/glossary.html#term-file-like-object), but I don't think we want to do that in the inline docs. |
|
||
raise TypeError( | ||
f"Unknown source type: {type(source)}. " | ||
"Supported types are str, Path, bytes and Tensor." | ||
"Supported types are str, Path, io.RawIOBase, io.BufferedReader, bytes and Tensor." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should say "supported types are ... or a file-like object with read(...) and seek(...) methods"
(fill-in the ... parts).
Because of the way our input check is set up above, I don't think users will hit e.g. the "you're missing a read
method" error - which is OK as long as this error is sufficiently informative
WDYT of this? diff --git a/src/torchcodec/decoders/_audio_decoder.py b/src/torchcodec/decoders/_audio_decoder.py
index f973142..044c855 100644
--- a/src/torchcodec/decoders/_audio_decoder.py
+++ b/src/torchcodec/decoders/_audio_decoder.py
@@ -26,14 +26,15 @@ class AudioDecoder:
Returned samples are float samples normalized in [-1, 1]
Args:
- source (str, ``Pathlib.path``,
- ``io.RawIOBase``, ``io.BufferedReader``,
- bytes, or ``torch.Tensor``): The source of the audio:
+ source (str, ``Pathlib.path``, bytes, ``torch.Tensor``, or file-like object): The source of the audio:
- If ``str``: a local path or a URL to a video or audio file.
- If ``Pathlib.path``: a path to a local video or audio file.
- If ``io.RawIOBase`` or ``io.BufferedReader``: a file-like object that refers to a audio file.
- If ``bytes`` object or ``torch.Tensor``: the raw encoded audio data.
+ - If file-like: a Python object that exposes a ``read(self, size: int) -> bytes`` method and
+ a ``seek(self, offset: int, whence: int) -> bytes`` method.
+ Read more in our TODO_LINK_TO_FILE_LIKE_TUTO_HERE
stream_index (int, optional): Specifies which stream in the file to decode samples from.
Note that this index is absolute across all media types. If left unspecified, then
the :term:`best stream` is used.
|
Exposes the ability for users to pass a file-like object to our public
VideoDecoder
andAudioEncoder
objects. More testing to make sure we're getting the correct results to follow.I wanted to put up this PR to discuss how to approach objects that have a
read
andseek
attribute, but do not actually have the correct signature. Currently, if users provide that, they'll see the kinds of exceptions that we test for intest_ops.py
. Are we okay with that? Is that enough? If no, then we could do what I allude to in the comment, and use the inspect module to verify that the methods will actually work.I also want to call out something that pleased me immensely: this feature and audio decoding have been in development in parallel, and they work together seamlessly! To the point that I was actually testing for audio without realizing it, as I just modified the test for decoder creation and only after it all worked did I realize I was testing for video and audio.