Commit 5d22b18
Improve codegen of align_offset when stride == 1
Previously, checking for `pmoda == 0` first led LLVM to generate branchy
code, even though for `stride == 1` the offset can be computed without
such a branch, effectively as `-p % a`.
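The two forms can be compared in a minimal sketch, assuming `a` is a nonzero power of two and `p` is the pointer's address. The function names are hypothetical and are not the ones used in the standard library:

```rust
/// Branchy form: special-cases an address that is already aligned.
/// Assumes `a` is a nonzero power of two.
fn offset_branchy(p: usize, a: usize) -> usize {
    let pmoda = p & (a - 1);
    if pmoda == 0 {
        0
    } else {
        a - pmoda
    }
}

/// Branchless form: effectively `-p % a`, using wrapping negation
/// and a mask instead of a comparison.
/// Assumes `a` is a nonzero power of two.
fn offset_branchless(p: usize, a: usize) -> usize {
    p.wrapping_neg() & (a - 1)
}
```

Both functions return the number of bytes to add to `p` to reach the next multiple of `a`; for example, with `p = 13` and `a = 8` both give 3.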
For well-known (constant) alignments, with the new ordering of these
conditionals, we end up generating 2 to 3 cheap instructions on x86_64:
    movq %rdi, %rax
    negl %eax
    andl $7, %eax
instead of 5+ as previously.
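As a rough illustration of the constant-alignment case, the sketch below calls the standard `align_offset` method on a `*const u8` (stride 1) with a fixed alignment of 8 and compares the result with `-p & 7`. The helper name `std_matches_formula` is hypothetical. Since the documentation allows `align_offset` to return `usize::MAX` when it cannot compute an offset, the check tolerates that value:

```rust
/// Returns true if `ptr.align_offset(8)` agrees with the branchless
/// formula `-p & 7`, or if the implementation declined to compute an
/// offset (signalled by `usize::MAX`).
fn std_matches_formula(ptr: *const u8) -> bool {
    let expected = (ptr as usize).wrapping_neg() & 7;
    let off = ptr.align_offset(8);
    off == usize::MAX || off == expected
}
```

Passing pointers into consecutive positions of a byte buffer exercises every residue of the address modulo 8.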
For unknown alignments the new code also generates just 3 instructions:
    negq %rdi
    leaq -1(%rsi), %rax
    andq %rdi, %rax
1 parent e7271da commit 5d22b18
1 file changed: +11 -13 lines
Hunks: lines 1172-1178 and 1220-1242 of the original file (1220-1240 after the change).