Commit dfa9f12

Update index.md
1 parent b985c34 commit dfa9f12

File tree

1 file changed (+3, -3)

  • docs/source/user-guide/sparse-attention/index.md

docs/source/user-guide/sparse-attention/index.md

Lines changed: 3 additions & 3 deletions
@@ -2,10 +2,10 @@
## Motivations

Attention mechanisms, especially in LLMs, are often the latency bottleneck during inference due to their computational complexity. Despite their importance in capturing contextual relationships, traditional attention requires processing all token interactions, leading to significant delays.

- ![Attention Overhead](/docs/source/_static/images/attention_overhead.png)
+ ![Attention Overhead](../../_static/images/attention_overhead.png)

Researchers have found that attention in LLMs is highly sparse:
- ![Attention Sparsity](/docs/source/_static/images/attention_sparsity.png)
+ ![Attention Sparsity](../../_static/images/attention_sparsity.png)

This motivates researchers to actively develop sparse attention algorithms that address the latency issue. These algorithms aim to reduce the number of token interactions by focusing only on the most relevant parts of the input, thereby lowering computation and memory requirements.

While these algorithms are promising, the gap between theoretical prototypes and practical implementations in inference frameworks remains a significant challenge.
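A common prototype in this family is per-query top-k selection: score every cached key, keep only the k highest-scoring ones, and run softmax attention over that subset. The sketch below is a minimal NumPy illustration of that general idea; the function `topk_sparse_attention` is invented for this example and is not UCM's algorithm or part of this commit's diff.

```python
# Minimal sketch of per-query top-k sparse attention (illustrative only).
import numpy as np

def topk_sparse_attention(q, k, v, top_k):
    """q: (n_q, d); k, v: (n_kv, d). Each query attends to its top_k keys only."""
    scores = (q @ k.T) / np.sqrt(q.shape[-1])                     # (n_q, n_kv) similarities
    keep = np.argpartition(scores, -top_k, axis=-1)[:, -top_k:]   # indices of the top_k keys
    mask = np.full_like(scores, -np.inf)
    np.put_along_axis(mask, keep, 0.0, axis=-1)                   # 0 for kept keys, -inf otherwise
    masked = scores + mask
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)                # softmax over the kept keys
    return weights @ v                                            # (n_q, d) attention output

# Toy usage: one decoding query against 8 cached tokens, keeping only 2 of them.
rng = np.random.default_rng(0)
q, k, v = rng.normal(size=(1, 16)), rng.normal(size=(8, 16)), rng.normal(size=(8, 16))
print(topk_sparse_attention(q, k, v, top_k=2).shape)              # (1, 16)
```

Only top_k of the n_kv cached key/value rows contribute to each output, which is where the compute and memory savings come from; this toy version still materializes the full score matrix, which a practical kernel would avoid.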
@@ -19,7 +19,7 @@ By utilizing UCM, researchers can efficiently implement rapid prototyping and te
## Architecture

### Overview

The core concept of our UCMSparse attention framework is to offload the complete Key-Value (KV) cache to a dedicated KV cache storage. We then identify the crucial KV pairs relevant to the current context, as determined by our sparse attention algorithms, and selectively load only the necessary portions of the KV cache from storage into High Bandwidth Memory (HBM). This design significantly reduces the HBM footprint while accelerating generation speed.
- ![Sparse Attn Arch](/docs/source/_static/images/sparse_attn_arch.png)
+ ![Sparse Attn Arch](../../_static/images/sparse_attn_arch.png)
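The flow described above can be pictured with a small, self-contained sketch. Everything in it is hypothetical (the `KVStore` class, the max-dot-product relevance score, the block layout); it only shows the shape of the offload-then-selectively-reload pattern, not UCM's actual interfaces.

```python
# Hypothetical sketch of offloading KV blocks and reloading only the relevant ones.
from dataclasses import dataclass, field
import numpy as np

@dataclass
class KVStore:
    """Stands in for the dedicated KV cache storage tier outside HBM."""
    blocks: dict = field(default_factory=dict)

    def put(self, block_id, k, v):
        self.blocks[block_id] = (k, v)                 # offload a KV block out of HBM

    def get(self, block_ids):
        ks, vs = zip(*(self.blocks[b] for b in block_ids))
        return np.concatenate(ks), np.concatenate(vs)  # reload only the selected blocks

def select_relevant_blocks(query, store, top_k):
    """Toy relevance score: max dot product between the query and each block's keys."""
    scores = {b: float((query @ k.T).max()) for b, (k, _) in store.blocks.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Decode step: the full cache lives in the store; only a few blocks enter HBM.
rng = np.random.default_rng(0)
store = KVStore()
for block_id in range(16):
    store.put(block_id, rng.normal(size=(8, 16)), rng.normal(size=(8, 16)))

query = rng.normal(size=(1, 16))
hot_blocks = select_relevant_blocks(query, store, top_k=2)
k_hbm, v_hbm = store.get(hot_blocks)                   # 2 of 16 blocks, i.e. ~1/8 of the cache
print(hot_blocks, k_hbm.shape, v_hbm.shape)
```

In UCMSparse the relevance decision is made by the sparse attention algorithms and the cache lives in a dedicated storage tier rather than a Python dict, but the control flow follows the same pattern: keep the full KV cache off HBM and move only what the algorithm selects.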
### Key Concepts
