
Conversation

@scotts (Contributor) commented Oct 27, 2025

Public API for decoder-native resize. The implementation in this PR accepts both torchvision.transforms.v2.Resize and a newly defined torchcodec.transforms.Resize.

In #526, I had initially proposed not using TorchVision transforms, and instead coming up with TorchCodec specific versions. @NicolasHug proposed that we accept TorchVision transforms, and that's what I followed up with in my design in #885.

After discussing the previous iteration of this PR, we agreed we wanted to see what it would look like to accept both. Having implemented this, I agree it's the right thing to do:

  1. We now don't need to require TorchVision, even when using the decoder-native feature.
  2. We have a natural place to document the behavior of each decoder-native transform that we accept, and what its limitations are compared to the TorchVision version of that transform.
  3. We have a more principled mechanism of enforcing how TorchVision transforms map to decoder-native semantics. We still have to dig into the TorchVision object to get the info we need, but the torchcodec.transforms class is a clear representation in code of what is supported. In the old PR, that mapping was buried in the logic that turned the TorchVision transform directly into the specification string the core API needs.
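To make point 3 concrete, here is a minimal, hypothetical sketch of that mapping in plain Python. The `Resize` dataclass, the `to_core_spec` helper, and the exact format of the specification string are illustrative assumptions, not the actual TorchCodec implementation:

```python
from dataclasses import dataclass


# Hypothetical stand-in for torchcodec.transforms.Resize: a plain
# declaration of exactly what decoder-native resize supports
# (a fixed output size, nothing more).
@dataclass(frozen=True)
class Resize:
    size: tuple  # (height, width)


def to_core_spec(transform) -> str:
    # Normalize the input to the TorchCodec representation first, then
    # emit a specification string for the core API. The string format
    # here is made up for illustration.
    if not isinstance(transform, Resize):
        # A TorchVision v2 Resize carries its target size in a `size`
        # attribute; dig it out and re-express it as our dataclass.
        height, width = transform.size
        transform = Resize(size=(height, width))
    return f"resize, {transform.size[0]}, {transform.size[1]}"
```

Normalizing to the TorchCodec class first keeps the TorchVision-specific introspection in one place, instead of burying it in the string-building logic.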

Four points worth discussing:

  1. I made the base class for all TorchCodec-defined decoder-native transforms DecoderTransform. I think it would be confusing if it were just Transform, and DecoderNativeTransform seems both too long and too obscure.
  2. I made the module path torchcodec.transforms instead of torchcodec.decoder_transforms. That's almost counter to point 1, but I think that there's less chance of confusion with the module path.
  3. Should it be DecoderResize instead of just Resize?
  4. The type annotation that users will see only mentions accepting torchcodec.transforms.DecoderTransform. It does not mention the TorchVision transforms or nn.Module. The text of the docstring will say it, and I think that's enough?

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Oct 27, 2025
dimension_order: Literal["NCHW", "NHWC"] = "NCHW",
num_ffmpeg_threads: int = 1,
device: Optional[Union[str, torch_device]] = "cpu",
transforms: List[Any] = [], # TRANSFORMS TODO: what is the user-facing type?
@scotts (Contributor, Author) commented:
Discussion point 1: If we accept TorchVision transforms, and we want to lazily load TorchVision, what type do we advertise here? We can easily explain that we accept a TorchVision transform in the docs, but what should we put in the type annotation?

@NicolasHug (Contributor) commented Oct 28, 2025:

It should probably be either Any or nn.Module, which is the base class of all TorchVision v2 transforms and something users are already familiar with, since it is the core building block of any PyTorch model.

@scotts (Contributor, Author) replied:
Oh, that solves the problem nicely: it can definitely be nn.Module.

show_error_codes = True
pretty = True
allow_redefinition = True
follow_untyped_imports = True
@scotts scotts marked this pull request as ready for review November 10, 2025 02:11