-
Notifications
You must be signed in to change notification settings - Fork 67
Add pixel_format to VideoEncoder API #1027
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| outPixelFormat_ = (formats && formats[0] != AV_PIX_FMT_NONE) | ||
| ? formats[0] | ||
| : AV_PIX_FMT_YUV420P; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getSupportedPixelFormats is not guaranteed to return any formats. If the user does not specify a format and we find none, I think we should try to use the broadly supported yuv420p, rather than error out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, this makes sense and that's similar to what we do for the audio encoder when we can't validate:
torchcodec/src/torchcodec/_core/Encoder.cpp
Lines 85 to 87 in 8e615e3
| // Can't really validate anything in this case, best we can do is hope that | |
| // FLTP is supported by the encoder. If not, FFmpeg will raise. | |
| return AV_SAMPLE_FMT_FLTP; |
| "avi", | ||
| "mkv", | ||
| "flv", | ||
| "gif", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gif only supports rgb pixel formats, this test is now focused on the more common yuv formats.
NicolasHug
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nicely done @Dan-Flores , thank you!
| self, | ||
| dest: Union[str, Path], | ||
| *, | ||
| pixel_format: Optional[str] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job on making this a keyword only params 👍
| outPixelFormat_ = (formats && formats[0] != AV_PIX_FMT_NONE) | ||
| ? formats[0] | ||
| : AV_PIX_FMT_YUV420P; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, this makes sense and that's similar to what we do for the audio encoder when we can't validate:
torchcodec/src/torchcodec/_core/Encoder.cpp
Lines 85 to 87 in 8e615e3
| // Can't really validate anything in this case, best we can do is hope that | |
| // FLTP is supported by the encoder. If not, FFmpeg will raise. | |
| return AV_SAMPLE_FMT_FLTP; |
src/torchcodec/_core/Encoder.cpp
Outdated
| validatePixelFormat(*avCodec, videoStreamOptions.pixelFormat.value()); | ||
| } else { | ||
| const AVPixelFormat* formats = getSupportedPixelFormats(*avCodec); | ||
| // Use first listed pixel format as default. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before we were using avcodec_find_best_pix_fmt_of_list, now our heuristic is to return the first format in the list. Do I understand correctly that the reason for this change is that you have empirically observe that the first in the list is often yuv420p, which is a good default?
I think it makes sense, just confirming my understanding. IT might be worth making that very explicit through a comment explaining that yuv420p is often the first entry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, yuv420p is often the first in the list, and FFmpeg's avcodec_default_get_format uses the same heuristic.
I'll add a comment here to mention this as well
| RuntimeError, | ||
| match=r"Specified pixel format rgb24 is not supported[\s\S]*Supported pixel formats.*yuv420p", | ||
| ): | ||
| getattr(encoder, method)(**valid_params, pixel_format="rgb24") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it's personal taste, but I think this test might be simpler and more clear if we did the more basic thing:
with pytest.raises(
RuntimeError,
match=r"Unknown pixel format: invalid_pix_fmt[\s\S]*Supported pixel formats.*yuv420p",
):
encoder.to_file(str(tmp_path / "output.mp4", pixel_format="invalid_pix_fmt")
with pytest.raises(
RuntimeError,
match=r"Unknown pixel format: invalid_pix_fmt[\s\S]*Supported pixel formats.*yuv420p",
):
encoder.to_tensor(format="mp4", pixel_format="invalid_pix_fmt")
with pytest.raises(
RuntimeError,
match=r"Unknown pixel format: invalid_pix_fmt[\s\S]*Supported pixel formats.*yuv420p",
):
encoder.to_file_like(file_like=io.BytesIO(), format="mp4", pixel_format="invalid_pix_fmt")
...There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this is a bit confusing, but I reused it as the AudioEncoder tests also use this pattern to test across encoding methods.
torchcodec/test/test_encoders.py
Lines 193 to 194 in dc86a8c
| with pytest.raises(RuntimeError, match="bit_rate=-1 must be >= 0"): | |
| getattr(decoder, method)(**valid_params, bit_rate=-1) |
In terms of taste, I would prefer to use this pattern and reduce code duplication. Alternatively, I don't think this test needs to be parametrized across encoding methods, since the error will always be hit in VideoEncoder::initializeEncoder.
src/torchcodec/_core/ops.py
Outdated
| frame_rate: int, | ||
| filename: str, | ||
| crf: Optional[int], | ||
| pixel_format: Optional[str], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing = None?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps - I'm not sure if they are needed for the @register_fake annotated functions. I'll add them in case.
src/torchcodec/_core/ops.py
Outdated
| frame_rate: int, | ||
| format: str, | ||
| crf: Optional[int], | ||
| pixel_format: Optional[str], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing = None?
This PR updates VideoEncoder API to accept an optional
pixel_format.avcodec_find_best_pix_fmt_of_list, instead validate and use provided pixel format.pixel_formatprovided, use default pixel format. Often this isyuv420p, but not always (ex.gifcodec).