Merged
@@ -802,7 +802,7 @@ torch::Tensor hadacore_transform(torch::Tensor& x, bool inplace) {
});

if (numel % 256 != 0) {
- out = out.index({torch::indexing::Slice(0, numel / had_size)});
+ out = out.narrow(0, 0, numel / had_size);
Contributor

critical

Switching to narrow is correct for ABI stability, but it operates on the out tensor, which holds uninitialized data when inplace=false. The run_fht kernel at line 801 performs an in-place operation on x, so when inplace=false, out is a separate tensor that is never populated with the result, and the function returns garbage data.

To fix this critical bug, the kernel should be called to write to out:

// at line 801
hadacore::run_fht<SCALAR_TYPE>(x.data_ptr(), out.data_ptr(), x.numel(), had_size, stream);

Additionally, the existing tests only seem to cover the inplace=true path. Please add a test case with inplace=false to verify the fix and prevent future regressions.
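
As a rough illustration of what such a regression check could look like, here is a minimal standalone sketch. It does not use hadacore or libtorch: fwht, transform, and out_of_place_matches_in_place are hypothetical names, and fwht is a plain unnormalized fast Walsh-Hadamard transform standing in for a kernel that takes separate source and destination pointers. The check asserts two things for the inplace=false path: the input is left untouched, and the returned buffer matches what the inplace=true path produces.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a kernel with separate source and destination
// pointers. This is NOT the hadacore kernel, only an unnormalized fast
// Walsh-Hadamard transform over n floats. src == dst is allowed (in-place).
void fwht(const float* src, float* dst, std::size_t n) {
    assert(n != 0 && (n & (n - 1)) == 0);  // n must be a power of two
    if (dst != src) {
        // Out-of-place: seed the destination so the result lands in dst.
        for (std::size_t i = 0; i < n; ++i) dst[i] = src[i];
    }
    for (std::size_t h = 1; h < n; h *= 2) {
        for (std::size_t i = 0; i < n; i += 2 * h) {
            for (std::size_t j = i; j < i + h; ++j) {
                const float a = dst[j];
                const float b = dst[j + h];
                dst[j] = a + b;
                dst[j + h] = a - b;
            }
        }
    }
}

// Hypothetical wrapper mirroring the inplace flag discussed above.
std::vector<float> transform(std::vector<float>& x, bool inplace) {
    if (inplace) {
        fwht(x.data(), x.data(), x.size());
        return x;  // returns a copy of the transformed input
    }
    std::vector<float> out(x.size());
    fwht(x.data(), out.data(), x.size());  // the result is written to out
    return out;
}

// Regression check for the out-of-place path: the input must be unchanged
// and the output must equal the in-place result.
bool out_of_place_matches_in_place() {
    const std::vector<float> original = {1.0f, 2.0f, 3.0f, 4.0f};
    std::vector<float> a = original;
    std::vector<float> b = original;
    const std::vector<float> out = transform(a, /*inplace=*/false);
    transform(b, /*inplace=*/true);
    return a == original && out == b;
}
```

If the out-of-place branch wrote to x instead of out, the first condition in the check would fail because the input would be modified, and the second would fail because out would never be filled.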

Collaborator

Um, I think gemini is trolling. If there is a problem, then it was pre-existing.

}

if (inplace && out.data_ptr() != x.data_ptr()) {