Every single one has an upper bound that is one higher than it should
be.
- For `thread_idx_[xyz]`: indices are 0-indexed, so the maximum index is
the `block_dim_[xyz]` maximum minus one. Changing `..=` to `..` fixes
it.
- For `block_idx_[xyz]`: likewise, but relative to `grid_dim_[xyz]`.
- For `block_dim_[xyz]`: these were all one too big. Not sure why,
perhaps a `..`/`..=` mix-up?
- For `grid_dim_[xyz]`: likewise. (Yes, these grid maximum dimensions
are all of the form 2^N-1 even though the block maximum dimensions are
all of the form 2^N. I don't know why, but it's what the CUDA docs
say.)
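
For illustration, the corrected ranges can be sketched as below. This is
only a sketch, not the code touched by this commit: the constant and
function names are hypothetical, the lower bound of 1 for dimensions is
an assumption, and the limits are the ones the CUDA docs list for
current compute capabilities.

```rust
use std::ops::{Range, RangeInclusive};

// Maximum block and grid dimensions per axis (x, y, z), as listed in the
// CUDA docs. Block maxima have the form 2^N; grid maxima have the form 2^N-1.
const MAX_BLOCK_DIM: [u32; 3] = [1024, 1024, 64];
const MAX_GRID_DIM: [u32; 3] = [(1 << 31) - 1, 65535, 65535];

// Dimensions are counts, so the maximum itself is a valid value: use `..=`.
// (Assumption: a dimension is always at least 1.)
fn block_dim_range(axis: usize) -> RangeInclusive<u32> {
    1..=MAX_BLOCK_DIM[axis]
}

fn grid_dim_range(axis: usize) -> RangeInclusive<u32> {
    1..=MAX_GRID_DIM[axis]
}

// Indices are 0-indexed, so the largest index is the maximum dimension
// minus one: use `..`.
fn thread_idx_range(axis: usize) -> Range<u32> {
    0..MAX_BLOCK_DIM[axis]
}

fn block_idx_range(axis: usize) -> Range<u32> {
    0..MAX_GRID_DIM[axis]
}
```

With these definitions, `thread_idx_range(0)` contains 1023 but not
1024, while `block_dim_range(0)` contains 1024 but not 1025.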