This repository was archived by the owner on Oct 25, 2024. It is now read-only.

Commit 97c114d

add QuantizeLinearForQbits activation contiguous check (#1072)
Signed-off-by: changwangss <chang1.wang@intel.com>
1 parent: 226e088

File tree

1 file changed: +2 −0 lines

  • intel_extension_for_transformers/llm/quantization/nn/modules.py

intel_extension_for_transformers/llm/quantization/nn/modules.py

Lines changed: 2 additions & 0 deletions

@@ -120,6 +120,8 @@ def forward(self, x: torch.Tensor):
         m = reduce(mul, shape[0:-1])
         out = torch.zeros(m, self.out_features, dtype=x.dtype)
         bias = None if self.bias is None else self.bias.data
+        if not x.is_contiguous():
+            x = x.contiguous()
         out = matmul_kbit(
             x.view(m, shape[-1]), self.weight, bias, out,
             self.compute_dtype, self.weight_dtype, self.scale_dtype, do_dequant=self.training
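
Context for the change: the line immediately after the insertion calls x.view(m, shape[-1]), and torch.Tensor.view raises a RuntimeError when the input's strides cannot be reinterpreted without a copy, which is exactly what happens for non-contiguous activations (e.g. the output of a transpose). Below is a minimal standalone sketch of the failure mode and the guard; the tensor shapes are made up for illustration, and only the two-line guard itself comes from this commit.

import torch

# A made-up activation that is non-contiguous, e.g. the result of a
# transpose: logical shape (3, 2, 4), memory laid out as (2, 3, 4).
x = torch.randn(2, 3, 4).transpose(0, 1)
assert not x.is_contiguous()

# forward() flattens the leading dims via x.view(m, shape[-1]); view()
# cannot merge dimensions whose strides are incompatible and raises:
try:
    x.view(6, 4)  # m = 3 * 2 = 6, shape[-1] = 4
except RuntimeError as err:
    print(err)  # "view size is not compatible with input tensor's size and stride ..."

# The two lines added by this commit: copy only when actually needed.
if not x.is_contiguous():
    x = x.contiguous()  # materializes a contiguous (row-major) copy
out = x.view(6, 4)      # now succeeds

An alternative would be x.reshape(m, shape[-1]), which copies implicitly when required; the explicit is_contiguous() guard keeps that copy visible and skips it entirely on the common contiguous path.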

Comments (0)