add opencv benchmark #711
Conversation
Thanks for the PR @Dan-Flores! It looks good, I shared a few comments below. As we just discussed offline, it might be worth checking the output frames for validity, to make sure that all the decoders we're benchmarking are returning similar frames.
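For the validity check, a rough sketch of what comparing outputs across backends could look like (the function name and tolerance are illustrative, and small differences are expected from colorspace conversion, so this is an assumption rather than the benchmark's actual check):

```python
import cv2
import numpy as np
from torchcodec.decoders import VideoDecoder


def frames_roughly_match(path, frame_index=0, tol=2.0):
    # Decode the same frame with both backends and compare pixel values.
    cap = cv2.VideoCapture(path)
    cap.set(cv2.CAP_PROP_POS_FRAMES, frame_index)
    ok, bgr = cap.read()
    cap.release()
    if not ok:
        return False
    opencv_rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)

    tc_frame = VideoDecoder(path)[frame_index]  # uint8 tensor, C x H x W
    tc_rgb = tc_frame.permute(1, 2, 0).numpy()

    if opencv_rgb.shape != tc_rgb.shape:
        return False  # e.g. the backends decoded different streams
    diff = np.abs(opencv_rgb.astype(np.float32) - tc_rgb.astype(np.float32))
    return diff.mean() <= tol
```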
        self._print_each_iteration_time = False

    def decode_frames(self, video_file, pts_list):
        import cv2
It's best to only import in __init__ and then store the module as a self.cv2 attribute. Otherwise, we'd be benchmarking the import statement when calling the decoding method, and it may add noise to the results.
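A minimal sketch of that pattern, assuming a wrapper class roughly like the benchmark's decoder interface (the class name and the seek logic are illustrative, not a quote of the PR's code):

```python
import importlib


class OpenCVDecoder:
    """Hypothetical benchmark wrapper; names are illustrative only."""

    def __init__(self):
        # Import once at construction time and keep the module handle,
        # so the import cost is not measured inside decode_frames().
        self.cv2 = importlib.import_module("cv2")
        self._print_each_iteration_time = False

    def decode_frames(self, video_file, pts_list):
        cv2 = self.cv2  # attribute lookup only, no import at call time
        cap = cv2.VideoCapture(video_file)
        frames = []
        for pts in pts_list:
            # Seek by timestamp (milliseconds) and grab the frame.
            cap.set(cv2.CAP_PROP_POS_MSEC, pts * 1000)
            ok, frame = cap.read()
            if ok:
                frames.append(frame)
        cap.release()
        return frames
```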
        import cv2

        frames = [
            cv2.resize(frame, (width, height))
We should note that OpenCV doesn't apply antialiasing, while the rest of the decode_and_resize() methods apply antialiasing by default. That makes these methods less comparable. I would suggest not exposing decode_and_resize for OpenCV if we can, so as to prevent any confusion, but if we need to expose it for technical reasons then let's at least add a comment pointing out this discrepancy about antialiasing.

CC @scotts as it's relevant to the whole "resize inconsistency chaos"
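To make the discrepancy concrete, a rough sketch (the torchvision call is an assumption about how the tensor-based decode_and_resize() paths resize; it is not a quote of the benchmark code):

```python
import cv2
import torch
from torchvision.transforms.v2 import functional as F


def resize_opencv(frame_bgr, width, height):
    # cv2.resize uses plain bilinear interpolation by default; there is
    # no antialiasing filter applied when downscaling.
    return cv2.resize(frame_bgr, (width, height), interpolation=cv2.INTER_LINEAR)


def resize_torchvision(frame_chw: torch.Tensor, width, height):
    # torchvision applies an antialias filter by default when downscaling
    # tensors with bilinear/bicubic interpolation.
    return F.resize(frame_chw, [height, width], antialias=True)
```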
decode_and_resize() is needed to generate data for the README, so I've left a comment here.
cc @NicolasHug - It seems OpenCV only supports certain frame resolutions, which affects the benchmark on nasa_13013.mp4: [benchmark results screenshot]. I ran the benchmark using a generated mandelbrot video at 1920x1080, which OpenCV is able to support.
Thanks for checking @Dan-Flores! I think what's happening is that the mp4 container has 2 video streams: one with 320x180 resolution and one with 480x270 resolution. OpenCV decodes the first one, while we decode the second one because it's what FFmpeg considers to be the "best" stream. And BTW, we probably want to double check that we're decoding the same streams with our other benchmark backends (torchaudio etc.), but that can be done separately.
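As a quick sanity check, comparing frame shapes reveals which stream each backend is decoding; a rough sketch (torchcodec's VideoDecoder exposes a stream_index argument, but the specific index below is a placeholder and must point at one of the container's video streams):

```python
import cv2
from torchcodec.decoders import VideoDecoder

path = "test/resources/nasa_13013.mp4"

# OpenCV opens what it considers the first video stream.
cap = cv2.VideoCapture(path)
ok, frame = cap.read()
print("OpenCV frame shape (H, W, C):", frame.shape if ok else "read failed")
cap.release()

# TorchCodec defaults to FFmpeg's "best" stream...
decoder = VideoDecoder(path)
print("TorchCodec default stream shape (C, H, W):", tuple(decoder[0].shape))

# ...but a specific stream can be forced via stream_index.
decoder_low_res = VideoDecoder(path, stream_index=0)
print("TorchCodec stream 0 shape (C, H, W):", tuple(decoder_low_res[0].shape))
```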
@@ -828,7 +829,7 @@ def run_benchmarks(
     # are using different random pts values across videos.
     random_pts_list = (torch.rand(num_samples) * duration).tolist()

-    for decoder_name, decoder in decoder_dict.items():
+    for decoder_name, decoder in sorted(decoder_dict.items(), key=lambda x: x[0]):
This change was added to make it easier to compare benchmarks; I'm open to alternative approaches to sorting, or to removing it entirely.
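For illustration, the effect of the sort is just a stable, name-ordered iteration regardless of insertion order (decoder names below are placeholders, not the benchmark's actual keys):

```python
decoder_dict = {
    "torchcodec_public": object(),  # placeholder decoder instances
    "opencv": object(),
    "torchaudio": object(),
}

# A plain dict iterates in insertion order; sorting by key makes the
# benchmark output order deterministic across runs and configurations.
for decoder_name, decoder in sorted(decoder_dict.items(), key=lambda x: x[0]):
    print(decoder_name)
# -> opencv, torchaudio, torchcodec_public
```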
This PR updates the changes in #674, which added a benchmark for decoding with the OpenCV library.

Changes in this PR:
- Support for stream_index in TorchCodecPublic, TorchAudioDecoder

Benchmark 1: nasa_13013.mp4 (320x180)
ffprobe test/resources/nasa_13013.mp4 shows that this file contains two video streams (320x180 and 480x270). To benchmark the same lower-resolution stream that OpenCV decodes, a stream_index is passed to TorchCodecPublic and TorchAudioDecoder.

Benchmark 2: mandelbrot_1920x1080_120s.mp4
A generated 1920x1080 mandelbrot video, which OpenCV is able to support (one way to generate such a video is sketched below).

Interpreting the Results:
Benchmarking nasa_13013.mp4 on the lower resolution stream shows similar performance by TorchCodec and OpenCV.
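The exact command used to create mandelbrot_1920x1080_120s.mp4 isn't shown in this PR; as an assumption, FFmpeg's built-in lavfi mandelbrot source can generate a comparable asset:

```python
import subprocess

# Generate a 120-second 1920x1080 synthetic video from FFmpeg's lavfi
# mandelbrot source; the encoder settings here are illustrative.
subprocess.run(
    [
        "ffmpeg",
        "-f", "lavfi",
        "-i", "mandelbrot=size=1920x1080:rate=30",
        "-t", "120",
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",
        "mandelbrot_1920x1080_120s.mp4",
    ],
    check=True,
)
```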