-
Notifications
You must be signed in to change notification settings - Fork 66
Avoid seeking checks when decoding frames sequentially #1028
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
|
||
| auto result = getNextFrameInternal(preAllocatedOutputTensor); | ||
| lastDecodedFrameIndex_ = frameIndex; | ||
| return result; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO for myself: explain why that's crucially important.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised that changing this logic avoids calling canWeAvoidSeeking() because this logic doesn't actually call that function. Does not updating the cursor change the conditions for which we call canWeAvoidSeeking()? Is it maybe clearer to directly update those conditions? These calls don't seem that expensive (although of course they'll cost something). I'm a little worried that we have some assumption elsewhere that we're always updating the cursor, and now we're not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is very indirect and yes, based on assumptions.
canWeAvoidSeeking() is only called if the cursor was set:
torchcodec/src/torchcodec/_core/SingleStreamDecoder.cpp
Lines 1174 to 1177 in 7d27c15
| if (cursorWasJustSet_) { | |
| maybeSeekToBeforeDesiredPts(); | |
| cursorWasJustSet_ = false; | |
| } |
and the cursor is only set if we call setCursorPtsInSeconds.
I'm happy to revisit the core logic of how we handle this. TBH, I'm a little bit into a speed-running mode at the moment.
Half of the reason I'm submitting the PR as-is is to check the CI and make this issue visible as early as possible.
|
Digging into
Can't we now replace that call with your newly added |
We have a "skip seeking" logic where we try to minimize the number of seeks we have to do. This logic lives in
torchcodec/src/torchcodec/_core/SingleStreamDecoder.cpp
Line 1091 in 7d27c15
The problem is:
canWeAvoidSeeking()itself can be expensive to call!! This is especially true inapproximatemode which, as I just found out, can be slower than exact mode for short videos (10s long).In this PR, we skip the call to
canWeAvoidSeeking()when we can - yes, we "skip the skip-checking logic"! We can do that when we are decoding frames contiguously: 0, 1, 2, 3....This will provide significant speedups when:
get_frames_at()andget_frames_played_at().Why is
canWeAvoidSeeking()slow?Because it calls
getKeyFrameIndexForPts()which, in approximate mode, callsav_index_search_timestamp(). Calling this for all frames can dominate the runtime!Benchmarks:
Decoding all 300 frames of a short 10s long h264 720p video (testsrc2), approximate mode goes from 1249.56ms to 825.74ms (1.5X faster).
messy (but correct) benchmarking code: