-
Notifications
You must be signed in to change notification settings - Fork 280
Extend support of batch_cast<...> to upcasting to a type twice as big #1184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@AntoinePrv : this is just some mockup to check if we're good with the API |
9e19732 to
4d848a8
Compare
|
Also cc @JohanMabille & @DiamonDinoia for the API. I didn't want to support the generic conversion, say from uint8 to uint32 with larger arrays, let's start humble |
|
I would call the API On the implementation:
namespace {
template<class T> struct widen;
template<> struct widen<uint8_t> { using type = uint16_t; };
template<> struct widen<uint16_t> { using type = uint32_t; };
template<> struct widen<uint32_t> { using type = uint64_t; };
template<class T>
using widen_t_unsigned = typename widen<T>::type;
} // namespace
template<class T>
struct widen {
static_assert(std::is_integral<T>::value && !std::is_same<T,bool>::value,
"integral non-bool type required");
using U = typename std::make_unsigned<T>::type;
using UWide = detail::widen_t_unsigned<U>;
using type = typename std::conditional<
std::is_signed<T>::value,
typename std::make_signed<UWide>::type,
UWide
>::type;
};
template<class T>
using widen_t = typename widen<T>::type; |
4d848a8 to
affe742
Compare
|
@DiamonDinoia yep, widen looks like a good name, thanks! |
520e96d to
b76733d
Compare
| template <typename T> | ||
| struct widen : widen<typename std::make_unsigned<T>::type> | ||
| { | ||
| }; | ||
|
|
||
| template <> | ||
| struct widen<uint32_t> | ||
| { | ||
| using type = uint64_t; | ||
| }; | ||
| template <> | ||
| struct widen<uint16_t> | ||
| { | ||
| using type = uint32_t; | ||
| }; | ||
| template <> | ||
| struct widen<uint8_t> | ||
| { | ||
| using type = uint16_t; | ||
| }; | ||
| template <> | ||
| struct widen<float> | ||
| { | ||
| using type = double; | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a trait that convert a byte size to a uint/float type could be more generic:
template <>
struct sized_uint<4>
{
using type = uint32_t;
};
...And then use with batch<sized_uint_t<2*sizeof(T)>> (or an alias for that).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not in favor of this. Then we need sized_int and sized_uint and the API will need an std::conditional<std::is_signed<T>, sized_int_t<2*sizeof(T)>, sized_uint_t<2*sizeof(T)>
AntoinePrv
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking the time! Looking good
|
My only suggestion would be to merge masked_memory ops. Or merge lower/upper from it and use it here. |
Yup, I'll wait for it to be merged! |
471c67f to
a470cf8
Compare
| return { batch<T_out, A>::load_aligned(&out_buffer[0]), | ||
| batch<T_out, A>::load_aligned(&out_buffer[batch<T_out, A>::size]) }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
out_buffer
and out_buffer + batch<T_out, A>::size seems clearer to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I mean doing pointer arithmetic directly instead of using the [] and & operators)
JohanMabille
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@DiamonDinoia : Are you fine if I split (haha) that part of your commit in a seperate one so that we don't block each other? |
6e570d1 to
f8136da
Compare
Sure, go ahead. I'll rebase once I have time |
It was relying on an intrinsic only available with avx512dq
f8136da to
6b93d0f
Compare
done (as in: I created a separate PR to extract that part of your patch, merged it and rebased my PR on top of it) |
6b93d0f to
b9c9e8d
Compare
Intel + common implementation + test + doc Fix #1179
b9c9e8d to
7ebdc5e
Compare
Fix #1179