Commit 773e124

refactor
1 parent 81b590b commit 773e124

2 files changed: +1 -0 lines changed

_posts/2025-10-05-deepseek-v2--a-strong--economical--and-efficient-mixture-of-experts-language-model.md

Lines changed: 1 addition & 0 deletions
@@ -185,6 +185,7 @@ Following DeepSeek 67B, in DeepSeek-V2 as well, Rotary Position Embedding (RoPE)
MLA with the decoupled RoPE strategy performs the following computation.

$$[\mathbf{q}_{t,1}^R; \mathbf{q}_{t,2}^R; ...; \mathbf{q}_{t,n_h}^R] = \mathbf{q}_t^R = \text{RoPE}(W^{QR} \mathbf{c}_t^Q)$$
+
$$\mathbf{k}_t^R = \text{RoPE}(W^{KR} \mathbf{h}_t)$$

Next, the compressed part and the RoPE part are concatenated.
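
As a rough illustration of the two RoPE projections and the concatenation step in this hunk, here is a minimal pure-Python sketch. It is not taken from the post or from any DeepSeek code. The helper names (`rope`, `matvec`, `dot`, `decoupled_qk`), the toy dimensions, the adjacent-pair rotation convention, the base of 10000, and applying RoPE per head after splitting $W^{QR} \mathbf{c}_t^Q$ are all assumptions made for the example. The compressed parts `q_c` and `k_c` are passed in as given vectors, since their computation lies outside this hunk.

```python
import math


def rope(x, pos, base=10000.0):
    """Rotate each adjacent pair (x[i], x[i+1]) by an angle proportional to pos.

    Assumed convention: the pair starting at even index i uses angle pos * base**(-i/d).
    """
    d = len(x)
    assert d % 2 == 0, "RoPE needs an even dimension"
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        out.extend([x[i] * c - x[i + 1] * s, x[i] * s + x[i + 1] * c])
    return out


def matvec(W, v):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def decoupled_qk(c_q, h, q_c, k_c, W_QR, W_KR, pos):
    """Build per-head queries and keys as [compressed part; RoPE part].

    q_c, k_c: per-head compressed parts (assumed given, one list per head).
    W_QR maps the query latent c_q to n_h * d_r values; W_KR maps h to d_r values.
    """
    n_h = len(q_c)
    d_r = len(W_KR)  # per-head RoPE dimension
    q_r = matvec(W_QR, c_q)
    assert len(q_r) == n_h * d_r
    # One RoPE query slice per head.
    q_r_heads = [rope(q_r[i * d_r:(i + 1) * d_r], pos) for i in range(n_h)]
    # A single RoPE key, shared by every head.
    k_r = rope(matvec(W_KR, h), pos)
    q = [qc + qr for qc, qr in zip(q_c, q_r_heads)]  # list concatenation
    k = [kc + k_r for kc in k_c]
    return q, k


# Toy inputs: 2 heads, compressed dim 3 per head, RoPE dim 2 per head.
c_q = [0.5, -1.0, 0.25]
h = [1.0, 2.0, -0.5, 0.0]
W_QR = [[0.1 * (r + c) for c in range(3)] for r in range(4)]
W_KR = [[0.2 * (r - c) for c in range(4)] for r in range(2)]
q_c = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
k_c = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]

q, k = decoupled_qk(c_q, h, q_c, k_c, W_QR, W_KR, pos=7)
```

In this sketch every head's key ends with the same `k_r` slice, which mirrors the equation's single $\mathbf{k}_t^R$ with no head index, while each head gets its own RoPE query slice. The dot product of two `rope`-rotated vectors depends only on the difference of their positions, so the positional signal lives entirely in the concatenated RoPE part.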

assets/images/summaries.jpg

805 KB