Skip to content

Conversation

@max-krasnyansky
Copy link
Collaborator

Introduce fastdiv and fixtest-backend-ops failures for ADD/SUB/MUL
Thanks @chraac!
Subsumes #17042

Fixed inference with Qwen3-VL models that generate graphs with ne[1] == 0

max-krasnyansky and others added 3 commits November 9, 2025 18:36
llm_graph_context::build_inp_out_ids() can generate tensors with zero nrows.
Somehow other backends seems to handle this without obvious explicit checks.
In the hexagon case we need to check explicitly and skip them.
@chraac
Copy link
Contributor

chraac commented Nov 10, 2025

Great work! I've tested the fix on my device and it works well.

One small thing: init_fastdiv_values is called for each binary operation during each calculation.
However, this should have minimal performance impact since it only contains constant shifts and mpys, so I think it's acceptable.

Looking ahead, have you considered maintaining a similar structure to store these values on the CPU? That could be optimal since the weight dimensions remain fixed throughout all stages.

Copy link
Contributor

@chraac chraac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ggml changes relating to the ggml tensor library for machine learning

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants