Replies: 2 comments 1 reply
GPU info. I have tried to set …
Oh, I got it. The CPU backend will first quantize …
I am trying to use the Vulkan backend in my project chatllm.cpp, and I am having trouble with the `mat_mul` operator, where `w` is `Q8_0` and `input` & `output` are `F32`. The result differs slightly from the CPU result (`w` and `input` are exactly the same).

Dumped data (here, `input` is just a vector):

Plot of point-wise error:
I think this might be caused by a flag or a missing function call in my code. @0cc4m, could you provide some hints?
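
For anyone trying to reproduce this kind of mismatch outside a backend, here is a minimal sketch of one way two correct implementations can disagree slightly on a `Q8_0` × `F32` product. It assumes, as an illustration only, that one path dequantizes `w` and multiplies by the float input, while the other quantizes the input to 8-bit blocks first and does an integer dot product per block. The quantizer is a simplified stand-in for the `Q8_0` layout (32 values per block, one fp16 scale), all function names are made up for this example, and Python doubles stand in for `F32`. It does not claim to describe what either backend actually does.

```python
import random
import struct

QK = 32  # values per block, as in the Q8_0 layout


def to_f16(x):
    """Round a float to IEEE half precision (the block scale is stored as fp16)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_q8_0(values):
    """Simplified Q8_0-style quantizer: one fp16 scale plus 32 int8 values per block."""
    blocks = []
    for start in range(0, len(values), QK):
        chunk = values[start:start + QK]
        amax = max(abs(v) for v in chunk)
        d = amax / 127.0
        inv_d = 1.0 / d if d else 0.0
        qs = [int(round(v * inv_d)) for v in chunk]
        blocks.append((to_f16(d), qs))
    return blocks


def dequantize(blocks):
    """Expand blocks back into a flat list of floats."""
    return [d * q for d, qs in blocks for q in qs]


def dot_dequantized_weights(w_blocks, x):
    """Path A (assumed): dequantize w, multiply by the untouched float input."""
    return sum(w * xi for w, xi in zip(dequantize(w_blocks), x))


def dot_quantized_input(w_blocks, x):
    """Path B (assumed): quantize the input too, then integer dot product per block."""
    x_blocks = quantize_q8_0(x)
    total = 0.0
    for (dw, qw), (dx, qx) in zip(w_blocks, x_blocks):
        total += dw * dx * sum(a * b for a, b in zip(qw, qx))
    return total


random.seed(0)
n = 256
w_blocks = quantize_q8_0([random.uniform(-1.0, 1.0) for _ in range(n)])
x = [random.uniform(-1.0, 1.0) for _ in range(n)]

y_a = dot_dequantized_weights(w_blocks, x)
y_b = dot_quantized_input(w_blocks, x)
print("dequantized weights, float input:", y_a)
print("quantized input, integer dot    :", y_b)
print("absolute difference             :", abs(y_a - y_b))
```

Both paths see exactly the same `w` and `input`, yet path B adds a rounding error of up to about half a quantization step per input element, so a small point-wise difference between the two results is expected under this assumption rather than a sign of a bug.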