cpu: skip NOPs to avoid barriers #17133

max-krasnyansky · 2025-11-10T00:45:40Z

When I was testing matmul chunking I noticed that we're still doing barriers for the NOPs.
This PR adds an explicit check for NOPs in the graph_compute_thread so that we can skip those barriers.
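
A minimal sketch of the idea, not the actual patch: a per-thread loop over graph nodes that skips no-op nodes before reaching the barrier. All `sketch_*` names are hypothetical, the set of ops treated as empty is an assumption made for illustration, and the barrier is only counted rather than executed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical op set for illustration; this is not the real ggml enum.
enum sketch_op {
    SKETCH_OP_NONE,
    SKETCH_OP_RESHAPE,
    SKETCH_OP_VIEW,
    SKETCH_OP_PERMUTE,
    SKETCH_OP_TRANSPOSE,
    SKETCH_OP_MUL_MAT,
    SKETCH_OP_ADD,
};

// Assumption: these ops only touch tensor metadata, so no thread has work to do.
static bool sketch_op_is_empty(enum sketch_op op) {
    switch (op) {
        case SKETCH_OP_NONE:
        case SKETCH_OP_RESHAPE:
        case SKETCH_OP_VIEW:
        case SKETCH_OP_PERMUTE:
        case SKETCH_OP_TRANSPOSE:
            return true;
        default:
            return false;
    }
}

struct sketch_stats {
    int dispatched; // nodes that went through the compute path
    int barriers;   // barriers a worker thread would have waited on
    int skipped;    // no-op nodes skipped before the barrier
};

// Walks the node list the way a single worker thread would and counts
// how many barriers it would hit, with or without the no-op check.
static struct sketch_stats sketch_graph_compute(const enum sketch_op * nodes,
                                                size_t n_nodes, bool skip_nops) {
    assert(nodes != NULL || n_nodes == 0);
    struct sketch_stats st = {0, 0, 0};
    for (size_t i = 0; i < n_nodes; i++) {
        if (skip_nops && sketch_op_is_empty(nodes[i])) {
            st.skipped++;
            continue; // nothing to compute, so no barrier is needed
        }
        st.dispatched++; // stand-in for computing the node
        st.barriers++;   // stand-in for the inter-thread barrier after the node
    }
    return st;
}
```

For a five-node list of MUL_MAT, RESHAPE, VIEW, ADD, PERMUTE, this loop counts 5 barriers without the check and 2 with it, with 3 nodes skipped.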

I instrumented the code to see how many barriers we actually skip for various models, and it's quite a bit.
The numbers below are per graph (i.e. per token, etc.).

Llama-3.2-1B SKIPPED 192
Llama-3.2-3B SKIPPED 336

Qwen3-0.6B   SKIPPED 336
Qwen3-4B     SKIPPED 432
Qwen3-VL-2B  SKIPPED 336

GPT-OSS-20B  SKIPPED 504

The overall speedup is noticeable for smaller models, but this makes sense in general to avoid wasting cycles.
Here is M4 Pro Qwen3-0.6B on the CPU before and after.

 ./scripts/compare-commits.sh master cpu-skip-nops  llama-bench --device none -m ../gguf/Qwen3-0.6B-Q4_0.gguf -p 128 -n 64 -t 8 -fa 1
...
| Model           | Test   |   t/s master |   t/s cpu-skip-nops |   Speedup |
|:----------------|:-------|-------------:|--------------------:|----------:|
| qwen3 0.6B Q4_0 | pp128  |      1889.64 |             1907.99 |      1.01 |
| qwen3 0.6B Q4_0 | tg64   |       315.90 |              331.45 |      1.05 |

I'm seeing a similar bump on the Snapdragons: not so much for prompt processing, but definitely for token generation.

ggml/src/ggml-cpu/ggml-cpu.c

cpu: skip NOPs to avoid barriers

011b2f9

max-krasnyansky requested review from ggerganov and slaren as code owners November 10, 2025 00:45

github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning) Nov 10, 2025

DajanaV mentioned this pull request Nov 10, 2025

UPSTREAM PR #17133: cpu: skip NOPs to avoid barriers auroralabs-loci/llama.cpp#151

Open

ggerganov reviewed Nov 10, 2025

ggml/src/ggml-cpu/ggml-cpu.c (outdated, resolved)

cpu: use ggml_op_is_empty

3b5b5d3

ggerganov approved these changes Nov 10, 2025


max-krasnyansky merged commit 395e286 into ggml-org:master Nov 10, 2025
64 of 66 checks passed
