Commit bb174b6
Llama 3.1 RoPE scaling (#205)
* feat(llama): import RoPE scaling code
This is imported from the original Llama reference implementation:
https://github.com/meta-llama/llama-models/blob/7890266c5a3ccd29e739d53a71ea968bcf4ca400/models/llama3/reference_impl/model.py#L45
Note that the function does not have any effect on the original model
code as long as the use_scaled parameter is false (the default).
* feat(llama): add RopeScalingArgs
These mirror the Hugging Face (HF) rope scaling arguments, which makes it
easier to implement RoPE scaling the way Llama3.1 does.
* feat(llama): support rope scaling arguments to improve flexibility
* chore: relax safetensors pattern on download
* feat: untie weights when needed (e.g. Llama3.2-1B)
* feat: add support for Llama3.1 - 3.2 and 3.3 models

1 parent 08e4977 commit bb174b6
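The RoPE scaling described in the commit message can be sketched as follows. This is a minimal, dependency-free illustration and not the code added by this commit: it follows the logic of the linked Llama reference implementation but works on plain Python lists instead of tensors. The field names of `RopeScalingArgs` are assumed to match the Hugging Face `rope_scaling` keys, the default values are the constants from the reference implementation, and the helper names `apply_scaling` and `precompute_freqs` are used here for illustration only.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RopeScalingArgs:
  """Illustrative container; field names assumed to follow HF `rope_scaling`."""
  factor: float = 8.0
  low_freq_factor: float = 1.0
  high_freq_factor: float = 4.0
  original_max_position_embeddings: int = 8192


def apply_scaling(freqs: List[float], args: RopeScalingArgs) -> List[float]:
  """Rescale rotary frequencies depending on their wavelength."""
  old_len = args.original_max_position_embeddings
  low_freq_wavelen = old_len / args.low_freq_factor
  high_freq_wavelen = old_len / args.high_freq_factor
  scaled = []
  for freq in freqs:
    wavelen = 2 * math.pi / freq
    if wavelen < high_freq_wavelen:
      # Short wavelengths (high frequencies) are left untouched.
      scaled.append(freq)
    elif wavelen > low_freq_wavelen:
      # Long wavelengths (low frequencies) are divided by the scale factor.
      scaled.append(freq / args.factor)
    else:
      # In between, interpolate smoothly between the two regimes.
      smooth = (old_len / wavelen - args.low_freq_factor) / (
          args.high_freq_factor - args.low_freq_factor
      )
      scaled.append((1 - smooth) * freq / args.factor + smooth * freq)
  return scaled


def precompute_freqs(
    dim: int,
    theta: float = 500000.0,
    use_scaled: bool = False,
    args: Optional[RopeScalingArgs] = None,
) -> List[float]:
  """Base rotary frequencies; scaling is applied only when use_scaled is true."""
  freqs = [1.0 / (theta ** (i / dim)) for i in range(0, dim, 2)]
  if use_scaled:
    freqs = apply_scaling(freqs, args or RopeScalingArgs())
  return freqs
```

With `use_scaled=False` the frequencies are returned unchanged, which matches the note that the original model code is unaffected by default. Passing a `RopeScalingArgs` instance with a different `factor` would be one way to cover models that use other scaling values.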
File tree
5 files changed: +126 -5 lines changed
- jetstream_pt
- third_party/llama