Skip to content

Enable public decoder creation with file like object #616

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 3, 2025

Conversation

scotts
Copy link
Contributor

@scotts scotts commented Apr 3, 2025

Exposes the ability for users to pass a file-like object to our public VideoDecoder and AudioEncoder objects. More testing to make sure we're getting the correct results to follow.

I wanted to put up this PR to discuss how to approach objects that have a read and seek attribute, but do not actually have the correct signature. Currently, if users provide that, they'll see the kinds of exceptions that we test for in test_ops.py. Are we okay with that? Is that enough? If no, then we could do what I allude to in the comment, and use the inspect module to verify that the methods will actually work.

I also want to call out something that pleased me immensely: this feature and audio decoding have been in development in parallel, and they work together seamlessly! To the point that I was actually testing for audio without realizing it, as I just modified the test for decoder creation and only after it all worked did I realize I was testing for video and audio.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Apr 3, 2025
source (str, ``Pathlib.path``, ``torch.Tensor``, or bytes): The source of the audio:
source (str, ``Pathlib.path``,
``io.RawIOBase``, ``io.BufferedReader``,
bytes, or ``torch.Tensor``): The source of the audio:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this still render okay? I wanted to keep line length under control.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, we'll have to check the built docs before merging. But I don't think we should always stick to actual types in the documentation, because they're not always the most informative. Anecdotally, I personally have no idea what io.RawIOBase or io.BufferedReader are, so reading these types in the doc doesn't tell me much about what I can and cannot do.

I think it would be much more informative if we were to document instead something like:

source (str, ``Pathlib.path``, bytes, ``torch.Tensor``, or file-like object): 

and then describe what we mean by "file-like" later in the docstring.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Answer: no: Docs build failed. :(

@scotts scotts requested a review from NicolasHug April 3, 2025 13:42
@scotts scotts marked this pull request as ready for review April 3, 2025 13:42
elif isinstance(source, bytes):
return core.create_from_bytes(source, seek_mode)
elif isinstance(source, Tensor):
return core.create_from_tensor(source, seek_mode)
elif isinstance(source, io.TextIOBase):
raise TypeError(
"source is of type io.TextIOBase; did you forget to specify binary reading?"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: it's not obvious what an io.TextIOBase is (I have no idea). IMO we should just guess something like

Suggested change
"source is of type io.TextIOBase; did you forget to specify binary reading?"
"Looks like you specified the source as open(..., 'r')? Try with 'rb' for binary reading?"

Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, if users provide that, they'll see the kinds of exceptions that we test for in test_ops.py. Are we okay with that? Is that enough?

I think it's enough for now. If we get reports from confused users, we can try to get stricter.

Approving to unblock, but my general comment is that we shouldn't expose the expected types in the docs or in the error message, because these aren't informative for users. In type annotations, it's OK (it might be too strict as well, but we can figure it out later).

@scotts
Copy link
Contributor Author

scotts commented Apr 3, 2025

@NicolasHug, what's your suggestion for the docs? Just call it a file-like object? In tutorials, we can link to the Python glossary term (https://docs.python.org/3/glossary.html#term-file-like-object), but I don't think we want to do that in the inline docs.


raise TypeError(
f"Unknown source type: {type(source)}. "
"Supported types are str, Path, bytes and Tensor."
"Supported types are str, Path, io.RawIOBase, io.BufferedReader, bytes and Tensor."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should say "supported types are ... or a file-like object with read(...) and seek(...) methods"
(fill-in the ... parts).

Because of the way our input check is set up above, I don't think users will hit e.g. the "you're missing a read method" error - which is OK as long as this error is sufficiently informative

@NicolasHug
Copy link
Member

what's your suggestion for the docs?

WDYT of this?

diff --git a/src/torchcodec/decoders/_audio_decoder.py b/src/torchcodec/decoders/_audio_decoder.py
index f973142..044c855 100644
--- a/src/torchcodec/decoders/_audio_decoder.py
+++ b/src/torchcodec/decoders/_audio_decoder.py
@@ -26,14 +26,15 @@ class AudioDecoder:
     Returned samples are float samples normalized in [-1, 1]
 
     Args:
-        source (str, ``Pathlib.path``,
-                ``io.RawIOBase``, ``io.BufferedReader``,
-                bytes, or ``torch.Tensor``): The source of the audio:
+        source (str, ``Pathlib.path``,  bytes, ``torch.Tensor``, or file-like object): The source of the audio:
 
             - If ``str``: a local path or a URL to a video or audio file.
             - If ``Pathlib.path``: a path to a local video or audio file.
             - If ``io.RawIOBase`` or ``io.BufferedReader``: a file-like object that refers to a audio file.
             - If ``bytes`` object or ``torch.Tensor``: the raw encoded audio data.
+            - If file-like: a Python object that exposes a ``read(self, size: int) -> bytes`` method and
+              a ``seek(self, offset: int, whence: int) -> bytes`` method.
+              Read more in our TODO_LINK_TO_FILE_LIKE_TUTO_HERE
         stream_index (int, optional): Specifies which stream in the file to decode samples from.
             Note that this index is absolute across all media types. If left unspecified, then
             the :term:`best stream` is used.

@scotts scotts merged commit e611e29 into pytorch:main Apr 3, 2025
46 checks passed
@scotts scotts deleted the file_like_public branch April 3, 2025 17:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Meta Open Source bot.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants