
@weiyuanyue
Contributor

@weiyuanyue weiyuanyue commented Nov 5, 2025

  1. Fix OpenVINO GPU Reshape Error

    A prior PR introduced explicit free dimension overrides (batch, channels, height, width) to help NVTensorRTRTX (TensorRT EP) initialize models with dynamic dimensions. In our current pipeline the VAE latent input size is effectively static (e.g. 64×64 for a 512×512 output), so height and width do not need overriding. Leaving the overrides in place is harmless for most EPs, but OpenVINO GPU applies additional graph optimizations that, with these overrides present, incorrectly transform a transpose + reshape sequence and produce a reshape whose input and output element counts do not match.

    • Root Cause Details

      • Overrides applied: AddFreeDimensionOverrideByName("height", H/8) and AddFreeDimensionOverrideByName("width", W/8).
        The model's latent spatial dims are already constant, so the overrides merely re‑assert them; with the overrides present, the optimizer no longer treats those values as purely static.
      • OpenVINO GPU optimization path (layout normalization + constant folding) duplicates or misinterprets flattened spatial tokens, producing an inferred input shape [1,4096,4096,512] for a node that then attempts to reshape to [4096,512]. Element counts differ (8,589,934,592 vs 2,097,152) ⇒ failure.
      • Removing only the height/width overrides makes the issue disappear across repeated runs; batch/channels overrides are safe.
      • The failure does not reproduce with the CPU EP or other providers, because their optimization passes are less aggressive for these particular shape transformations.
    • Current Mitigation
      We have commented out the height/width overrides; initialization succeeds on OpenVINO GPU. No other EP shows negative impact from their removal.
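A minimal Python sketch of the two points above, for illustration only. It checks the element-count mismatch using the shapes quoted in the root cause, then builds an override set that omits height/width. The helper names (`element_count`, `can_reshape`, `build_free_dim_overrides`, `apply_overrides`) are hypothetical and are not taken from this PR, and the dimension names "batch" and "channels" are assumed to match the model's symbolic dimension names. The only ONNX Runtime call referenced is `SessionOptions.add_free_dimension_override_by_name`, the Python counterpart of `AddFreeDimensionOverrideByName`.

```python
from math import prod

# Shapes quoted in the root cause description above.
INFERRED_INPUT_SHAPE = (1, 4096, 4096, 512)
RESHAPE_TARGET = (4096, 512)


def element_count(shape):
    """Total number of elements in a tensor of the given shape."""
    return prod(shape)


def can_reshape(src_shape, dst_shape):
    """A reshape is only valid when both shapes hold the same element count."""
    return element_count(src_shape) == element_count(dst_shape)


def build_free_dim_overrides(batch, channels):
    """Overrides kept after the fix (hypothetical helper).

    "height" and "width" are intentionally left out: the latent spatial
    dims are already constant in the model. The names "batch" and
    "channels" are assumptions about the model's symbolic dim names.
    """
    return {"batch": batch, "channels": channels}


def apply_overrides(session_options, overrides):
    """Apply overrides to an onnxruntime.SessionOptions-like object."""
    for name, value in overrides.items():
        session_options.add_free_dimension_override_by_name(name, value)
```

With ONNX Runtime installed, `apply_overrides(onnxruntime.SessionOptions(), build_free_dim_overrides(1, 4))` would apply only the batch/channels overrides; the values 1 and 4 are placeholders. `can_reshape(INFERRED_INPUT_SHAPE, RESHAPE_TARGET)` evaluates to `False`, which matches the mismatch reported above.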

  2. Disable model compilation for CPU EP

The CPU Execution Provider in ONNX Runtime does not implement EPContext model compilation, so invoking compilation on the CPU EP fails. We now remove the [Compile model] checkbox whenever the user selects the CPU EP.
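The gating logic can be sketched roughly as follows. This is an illustrative Python sketch, not the code changed in this PR: `compile_option_available` and `effective_compile_flag` are hypothetical helpers, and the only assumption about ONNX Runtime is the provider name string "CPUExecutionProvider".

```python
# Provider name string used by ONNX Runtime for the CPU EP.
CPU_EP = "CPUExecutionProvider"


def compile_option_available(selected_ep: str) -> bool:
    """Whether the [Compile model] checkbox should be offered (hypothetical helper)."""
    return selected_ep != CPU_EP


def effective_compile_flag(selected_ep: str, user_requested: bool) -> bool:
    """Never request compilation on the CPU EP, even if a stale setting asks for it."""
    return user_requested and compile_option_available(selected_ep)
```

Guarding the flag as well as the checkbox means a compile setting saved before switching to the CPU EP cannot trigger a compile call that would fail.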

@weiyuanyue weiyuanyue marked this pull request as ready for review November 5, 2025 06:48