Skip to content

Rough implementation of getting key frame indices #484

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Jan 29, 2025

Conversation

scotts
Copy link
Contributor

@scotts scotts commented Jan 27, 2025

Only works in exact mode. Note that we had to start explicitly tracking frameIndex in our FrameInfo struct. That allows us to easily know the overall index for a key frame.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Jan 27, 2025
@scotts scotts marked this pull request as ready for review January 28, 2025 02:06
@scotts scotts requested a review from NicolasHug January 28, 2025 02:06
Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some non-blocking suggestions, approving to unblock

@@ -48,6 +48,7 @@ TORCH_LIBRARY(torchcodec_ns, m) {
"get_frames_by_pts_in_range(Tensor(a!) decoder, *, int stream_index, float start_seconds, float stop_seconds) -> (Tensor, Tensor, Tensor)");
m.def(
"get_frames_by_pts(Tensor(a!) decoder, *, int stream_index, float[] timestamps) -> (Tensor, Tensor, Tensor)");
m.def("get_key_frame_indices(Tensor(a!) decoder, int stream_index) -> int[]");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can be conservative and expose it privately in the Python core API, for now? I.e. as _get_key_frame_indices instead of get_key_frame_indices

key_frame_indices = decoder._get_key_frame_indices()
size = key_frame_indices.size()
assert size[0] > 0
assert len(size) == 1
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NicolasHug, is there a more PyTorch-y way to assert this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't find of an obvious one. In this specific case, maybe we can just hard-code assert key_frame_indices.shape == (42,), but of course that's only working for this video.

On another note, I wonder if we should somewhat check the correctness of the returned values (still non-blocking)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should check the actual return value. I can probably hardcode what we expect in the test based off of what ffprobe returns.

@register_fake("torchcodec_ns::_get_key_frame_indices")
def get_key_frame_indices_abstract(
decoder: torch.Tensor, *, stream_index: int
) -> List[int]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: the type annotation says list but I think you changed it to tensor?

key_frame_indices = decoder._get_key_frame_indices()

# The key frame indices were generated from the following command:
# $ ffprobe -v error -hide_banner -select_streams v:1 -show_frames -of csv test/resources/nasa_13013.mp4 | grep -n ",I," | cut -d ':' -f 1 > key_frames.txt
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left the line-noise of a shell command on a single line so that it's easier to copy-paste. It does make it harder to read, but making it easy to read means breaking it up over several lines, and then when you go to copy-paste, there's the # comment markers in there. Happy to change it to a better way.

Comment on lines +845 to +847
# 4. Using cut to extract just the count for the frame.
# Finally, because the above produces a count, which is index + 1, we subtract
# one from all values manually to arrive at the values below.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose you meant the same thing, but my understanding is that -n will output the line number (not the count), and the cut part will select that line number (which is 1-based)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I meant the same thing. I can try to clarify.

@scotts scotts merged commit 298c9c1 into pytorch:main Jan 29, 2025
44 checks passed
@scotts scotts deleted the get_key_indices branch January 29, 2025 19:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Meta Open Source bot.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants