Commit 7d59843
Also use contig kernel if simplified iter is 1d and has unit strides
Example where it helps:
```
In [1]: import dpctl, dpctl.tensor as dpt
In [2]: x = dpt.arange(1234*7873, dtype=dpt.int32)
In [3]: xx = dpt.permute_dims(dpt.reshape(x, (2, 617, 7873)), (1,2,0))
In [4]: yy = dpt.permute_dims(dpt.reshape(dpt.empty_like(x, dtype="f4"), (2, 617, 7873)), (1,2,0))
In [5]: %timeit yy[...] = xx
1.07 ms ± 93.8 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
```
On master, the same assignment takes about 2.8 ms on Iris Xe.

Parent: 2669110
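The condition in the commit title can be illustrated with a small pure-Python sketch. This is not the dpctl implementation: `simplify_iteration` and `can_use_contig_kernel` are hypothetical helpers written for illustration, strides are counted in elements, and the arrays are assumed non-empty with positive strides. The idea is that if source and destination share a layout, their axes can be reordered and merged until a single unit-stride axis remains, at which point a contiguous copy kernel is applicable.

```python
def simplify_iteration(shape, src_strides, dst_strides):
    """Reorder and merge axes of an elementwise copy (hypothetical helper).

    Returns the simplified (shape, src_strides, dst_strides) as lists.
    Strides are in elements; assumes non-empty arrays and positive strides.
    """
    # Axes of length 1 do not affect the iteration and can be dropped.
    axes = [(n, s, d) for n, s, d in zip(shape, src_strides, dst_strides) if n != 1]
    if not axes:
        return [1], [1], [1]
    # Visit axes from the largest destination stride to the smallest.
    axes.sort(key=lambda a: (a[2], a[1]), reverse=True)
    merged = [axes[0]]
    for n, s, d in axes[1:]:
        pn, ps, pd = merged[-1]
        # The outer axis steps exactly over the inner one in both arrays,
        # so the two axes behave like one longer axis.
        if ps == n * s and pd == n * d:
            merged[-1] = (pn * n, s, d)
        else:
            merged.append((n, s, d))
    out_shape, out_src, out_dst = zip(*merged)
    return list(out_shape), list(out_src), list(out_dst)


def can_use_contig_kernel(shape, src_strides, dst_strides):
    """True if the simplified iteration is 1D with unit strides in both arrays."""
    shp, ss, ds = simplify_iteration(shape, src_strides, dst_strides)
    return len(shp) == 1 and ss[0] == 1 and ds[0] == 1


# Shape and element strides of xx and yy from the session above:
# reshape to (2, 617, 7873), then permute_dims with (1, 2, 0).
shape = (617, 7873, 2)
strides = (7873, 1, 617 * 7873)
print(can_use_contig_kernel(shape, strides, strides))  # True
```

In this sketch the permuted views collapse back to one axis of 1234*7873 elements with stride 1, even though neither view is C-contiguous on its own. A transposed destination, for example shape `(3, 4)` with source strides `(4, 1)` and destination strides `(1, 3)`, does not collapse and would still need the strided kernel.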
1 file changed: 19 additions and 8 deletions.

| Original file lines | New file lines | Change |
|---|---|---|
| 160 | 160 | 1 line replaced |
| 169-170 | (none) | 2 lines removed |
| 205-209 | 203-220 | 5 lines replaced by 18 |