Skip to content

Conversation

@januszjah
Copy link
Contributor

@januszjah januszjah commented Nov 17, 2025

This PR improves constant expression folding for pitch and width parameters in Intel GPU block I/O operations. The changes introduce a more robust constant evaluation mechanism that handles multiple levels of type casts and operation folding, addressing issue #5338.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR improves constant expression folding for pitch and width parameters in Intel GPU block I/O operations. The changes introduce a more robust constant evaluation mechanism that handles multiple levels of type casts and operation folding, addressing issue #5338.

  • Adds a new tryConstEval function that recursively evaluates constants through casts and foldable operations
  • Replaces the previous skipTrunc-based pitch validation with the new evaluation approach
  • Adds width validation using the same constant evaluation mechanism
  • Updates test kernel to mark stride parameters as constexpr to enable better compile-time evaluation

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
third_party/intel/lib/TritonIntelGPUToLLVM/LoadStoreOpToLLVM.cpp Implements new constant evaluation helpers and applies them to pitch/width validation in block I/O conversion
python/test/unit/language/test_block_pointer.py Marks stride parameters as constexpr in test kernel to support improved constant folding

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +54 to +57
if (succeeded(def->fold(results)) && results.size() == 1) {
if (auto val = dyn_cast_or_null<Value>(results[0]))
return val;
}
Copy link

Copilot AI Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fold result could be an Attribute containing a constant value, not just a Value. If results[0] is an Attribute, it should be converted to a constant Value rather than being ignored. This would cause tryConstEval to miss foldable constant attributes.

Copilot uses AI. Check for mistakes.
@etiotto etiotto linked an issue Nov 17, 2025 that may be closed by this pull request
M, N, K, #
stride_am, stride_ak, #
stride_bk, stride_bn, #
stride_am: tl.constexpr, stride_ak: tl.constexpr, #
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this changes the test. What happens if these arguments are not constexpr, does the test still fails ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, because it resolves finally to %arg, so it's still unknown at compile time and later in runtime it's 32, so it's not valid

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not change this existing test case, and add one test under Intel folder with TRITON_INTEL_2DBLOCK_ASSERT set in the file.

return v;
}

static std::optional<int64_t> tryConstEval(Value v, int depth = 16) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we port these changes to Utility.cpp and improve mlir::triton::intel::getFoldedConstantValue??

Copy link
Contributor Author

@januszjah januszjah Nov 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will increase work done while compilation because getFoldedConstantValue is used in isConstant in many places and my version does multiple skips and folds instead of single fold, however IDK if it's a problem and how obsessive we are about cycles at this step.

M, N, K, #
stride_am, stride_ak, #
stride_bk, stride_bn, #
stride_am: tl.constexpr, stride_ak: tl.constexpr, #
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not change this existing test case, and add one test under Intel folder with TRITON_INTEL_2DBLOCK_ASSERT set in the file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[2Dblock] fix minicore test's runtime checks

4 participants