-
Notifications
You must be signed in to change notification settings - Fork 39
Rough implementation of getting key frame indices #484
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some non-blocking suggestions, approving to unblock
@@ -48,6 +48,7 @@ TORCH_LIBRARY(torchcodec_ns, m) { | |||
"get_frames_by_pts_in_range(Tensor(a!) decoder, *, int stream_index, float start_seconds, float stop_seconds) -> (Tensor, Tensor, Tensor)"); | |||
m.def( | |||
"get_frames_by_pts(Tensor(a!) decoder, *, int stream_index, float[] timestamps) -> (Tensor, Tensor, Tensor)"); | |||
m.def("get_key_frame_indices(Tensor(a!) decoder, int stream_index) -> int[]"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can be conservative and expose it privately in the Python core API, for now? I.e. as _get_key_frame_indices
instead of get_key_frame_indices
test/decoders/test_video_decoder.py
Outdated
key_frame_indices = decoder._get_key_frame_indices() | ||
size = key_frame_indices.size() | ||
assert size[0] > 0 | ||
assert len(size) == 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NicolasHug, is there a more PyTorch-y way to assert this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't find of an obvious one. In this specific case, maybe we can just hard-code assert key_frame_indices.shape == (42,)
, but of course that's only working for this video.
On another note, I wonder if we should somewhat check the correctness of the returned values (still non-blocking)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we should check the actual return value. I can probably hardcode what we expect in the test based off of what ffprobe
returns.
@register_fake("torchcodec_ns::_get_key_frame_indices") | ||
def get_key_frame_indices_abstract( | ||
decoder: torch.Tensor, *, stream_index: int | ||
) -> List[int]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: the type annotation says list but I think you changed it to tensor?
key_frame_indices = decoder._get_key_frame_indices() | ||
|
||
# The key frame indices were generated from the following command: | ||
# $ ffprobe -v error -hide_banner -select_streams v:1 -show_frames -of csv test/resources/nasa_13013.mp4 | grep -n ",I," | cut -d ':' -f 1 > key_frames.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left the line-noise of a shell command on a single line so that it's easier to copy-paste. It does make it harder to read, but making it easy to read means breaking it up over several lines, and then when you go to copy-paste, there's the #
comment markers in there. Happy to change it to a better way.
# 4. Using cut to extract just the count for the frame. | ||
# Finally, because the above produces a count, which is index + 1, we subtract | ||
# one from all values manually to arrive at the values below. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose you meant the same thing, but my understanding is that -n
will output the line number (not the count), and the cut
part will select that line number (which is 1-based)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I meant the same thing. I can try to clarify.
Only works in exact mode. Note that we had to start explicitly tracking
frameIndex
in ourFrameInfo
struct. That allows us to easily know the overall index for a key frame.