Extend support of batch_cast<...> to upcasting to a type twice as big #1184

serge-sans-paille · 2025-10-29T22:07:26Z

serge-sans-paille · 2025-10-29T22:07:59Z

@AntoinePrv : this is just some mockup to check if we're good with the API

serge-sans-paille · 2025-10-29T22:10:43Z

Also cc @JohanMabille & @DiamonDinoia for the API. I didn't want to support the generic conversion, say from uint8 to uint32 with larger arrays; let's start humble.

DiamonDinoia · 2025-10-30T03:32:58Z

I would call the API extension, extend, or widen, because it does not just cast: it zero-extends or sign-extends. I think in intrinsics terminology this is called widening, or convert with widening.
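To make the zero-extend versus sign-extend distinction concrete, here is a minimal scalar sketch. The helper names (widen_unsigned, widen_signed, bits_of) are hypothetical and are not part of xsimd; they only illustrate what happens to each lane when an 8-bit value is widened to 16 bits.

```cpp
#include <cstdint>

// Widening an unsigned value zero-extends: the new high bits are all 0.
inline std::uint16_t widen_unsigned(std::uint8_t v)
{
    return static_cast<std::uint16_t>(v);
}

// Widening a signed value sign-extends: the new high bits copy the sign bit,
// so the numeric value is preserved.
inline std::int16_t widen_signed(std::int8_t v)
{
    return static_cast<std::int16_t>(v);
}

// Reinterpret a widened signed value as its 16-bit pattern, for inspection.
inline std::uint16_t bits_of(std::int16_t v)
{
    return static_cast<std::uint16_t>(v);
}
```

For example, the byte 0xFF widens to 0x00FF when treated as unsigned, but to 0xFFFF (still -1) when treated as signed.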

On the implementation:

1. Using std::conditional and std::make_signed halves the number of structs you need.
~~2. Proposal load/store masked #1162 introduces lower/upper :)~~

#include <cstdint>
#include <type_traits>

namespace detail {
// Primary template is left undefined: only the unsigned types listed below can be widened.
template<class T> struct widen;

template<> struct widen<uint8_t>  { using type = uint16_t; };
template<> struct widen<uint16_t> { using type = uint32_t; };
template<> struct widen<uint32_t> { using type = uint64_t; };

template<class T>
using widen_t_unsigned = typename widen<T>::type;

} // namespace detail

template<class T>
struct widen {
    static_assert(std::is_integral<T>::value && !std::is_same<T,bool>::value,
                  "integral non-bool type required");

    using U     = typename std::make_unsigned<T>::type;
    using UWide = detail::widen_t_unsigned<U>;

    using type = typename std::conditional<
        std::is_signed<T>::value,
        typename std::make_signed<UWide>::type,
        UWide
    >::type;
};

template<class T>
using widen_t = typename widen<T>::type;
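As a rough check of how such a trait would behave, the following self-contained sketch restates the idea in a hypothetical sketch namespace and verifies a few mappings at compile time. The namespace and the widen_unsigned helper name are assumptions made for illustration; this is not the code that was merged.

```cpp
#include <cstdint>
#include <type_traits>

namespace sketch {
namespace detail {
    // Only the unsigned mappings are spelled out explicitly.
    template <class T> struct widen_unsigned;
    template <> struct widen_unsigned<std::uint8_t>  { using type = std::uint16_t; };
    template <> struct widen_unsigned<std::uint16_t> { using type = std::uint32_t; };
    template <> struct widen_unsigned<std::uint32_t> { using type = std::uint64_t; };
} // namespace detail

// Signed types are routed through their unsigned counterpart,
// then converted back with std::make_signed.
template <class T>
struct widen {
    using U     = typename std::make_unsigned<T>::type;
    using UWide = typename detail::widen_unsigned<U>::type;
    using type  = typename std::conditional<
        std::is_signed<T>::value,
        typename std::make_signed<UWide>::type,
        UWide>::type;
};

template <class T>
using widen_t = typename widen<T>::type;
} // namespace sketch

// Compile-time checks: signedness is preserved and the size doubles.
static_assert(std::is_same<sketch::widen_t<std::uint8_t>, std::uint16_t>::value, "u8 -> u16");
static_assert(std::is_same<sketch::widen_t<std::int8_t>, std::int16_t>::value, "i8 -> i16");
static_assert(std::is_same<sketch::widen_t<std::int32_t>, std::int64_t>::value, "i32 -> i64");
static_assert(sizeof(sketch::widen_t<std::uint16_t>) == 2 * sizeof(std::uint16_t), "size doubles");
```

Three explicit specializations cover six integer types, which is the saving the comment above refers to.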

serge-sans-paille · 2025-10-30T08:17:34Z

@DiamonDinoia yep, widen looks like a good name, thanks!

AntoinePrv · 2025-10-30T16:21:26Z

include/xsimd/types/xsimd_traits.hpp

+        template <typename T>
+        struct widen : widen<typename std::make_unsigned<T>::type>
+        {
+        };
+
+        template <>
+        struct widen<uint32_t>
+        {
+            using type = uint64_t;
+        };
+        template <>
+        struct widen<uint16_t>
+        {
+            using type = uint32_t;
+        };
+        template <>
+        struct widen<uint8_t>
+        {
+            using type = uint16_t;
+        };
+        template <>
+        struct widen<float>
+        {
+            using type = double;
+        };


I think a trait that converts a byte size to a uint/float type could be more generic:

template <> struct sized_uint<4> { using type = uint32_t; }; ...

And then use with batch<sized_uint_t<2*sizeof(T)>> (or an alias for that).

I am not in favor of this. Then we need both sized_int and sized_uint, and the API will need a std::conditional<std::is_signed<T>::value, sized_int_t<2*sizeof(T)>, sized_uint_t<2*sizeof(T)>>.
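For comparison, the size-indexed alternative discussed here could look roughly like the sketch below. Only sized_uint<4> appears in the thread; the other specializations, sized_int, and the widened_by_size_t alias are assumptions filled in to show the extra std::conditional that the objection is about.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Map a byte size to an unsigned integer type of that size.
template <std::size_t N> struct sized_uint;
template <> struct sized_uint<1> { using type = std::uint8_t; };
template <> struct sized_uint<2> { using type = std::uint16_t; };
template <> struct sized_uint<4> { using type = std::uint32_t; };
template <> struct sized_uint<8> { using type = std::uint64_t; };

template <std::size_t N>
using sized_uint_t = typename sized_uint<N>::type;

// Signed counterpart, derived from the unsigned mapping (hypothetical).
template <std::size_t N>
struct sized_int {
    using type = typename std::make_signed<sized_uint_t<N>>::type;
};

template <std::size_t N>
using sized_int_t = typename sized_int<N>::type;

// Hypothetical alias: the widened type must still be selected on signedness.
template <class T>
using widened_by_size_t = typename std::conditional<
    std::is_signed<T>::value,
    sized_int_t<2 * sizeof(T)>,
    sized_uint_t<2 * sizeof(T)>>::type;

static_assert(std::is_same<widened_by_size_t<std::uint8_t>, std::uint16_t>::value, "u8 -> u16");
static_assert(std::is_same<widened_by_size_t<std::int16_t>, std::int32_t>::value, "i16 -> i32");
```

The size-based traits are more general, but every call site that widens has to repeat the signedness dispatch or hide it behind an alias like the one above.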

AntoinePrv

Thanks for taking the time! Looking good

DiamonDinoia · 2025-10-30T16:26:53Z

My only suggestion would be to merge masked_memory ops. Or merge lower/upper from it and use it here.

serge-sans-paille · 2025-10-30T21:35:15Z

> My only suggestion would be to merge masked_memory ops. Or merge lower/upper from it and use it here.

Yup, I'll wait for it to be merged!

DiamonDinoia · 2025-10-30T22:04:51Z

include/xsimd/arch/common/xsimd_common_cast.hpp

+            return { batch<T_out, A>::load_aligned(&out_buffer[0]),
+                     batch<T_out, A>::load_aligned(&out_buffer[batch<T_out, A>::size]) };


out_buffer and out_buffer + batch<T_out, A>::size seem clearer to me

(I mean doing pointer arithmetic directly instead of using the [] and & operators)
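The pattern under review (convert every lane into a buffer, then load the two halves) can be sketched without xsimd as follows. Here std::array stands in for a batch, and load and widen_via_buffer are hypothetical names; the real code uses batch<T_out, A>::load_aligned. The two loads use the plain pointer-arithmetic form suggested above.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>

// Stand-in for batch<T, A>::load_aligned: copy N lanes starting at p.
template <class T, std::size_t N>
std::array<T, N> load(const T* p)
{
    std::array<T, N> out{};
    for (std::size_t i = 0; i < N; ++i)
        out[i] = p[i];
    return out;
}

// Widen M narrow lanes into two groups of M / 2 wide lanes.
template <class T_out, class T_in, std::size_t M>
std::pair<std::array<T_out, M / 2>, std::array<T_out, M / 2>>
widen_via_buffer(const std::array<T_in, M>& in)
{
    static_assert(M % 2 == 0, "lane count must be even");
    alignas(64) T_out out_buffer[M];
    for (std::size_t i = 0; i < M; ++i)
        out_buffer[i] = static_cast<T_out>(in[i]); // per-lane scalar widening

    // out_buffer and out_buffer + M / 2 address the lower and upper halves.
    return { load<T_out, M / 2>(out_buffer),
             load<T_out, M / 2>(out_buffer + M / 2) };
}
```

Both spellings, &out_buffer[k] and out_buffer + k, yield the same address; the choice is purely about readability.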

JohanMabille

LGTM

serge-sans-paille · 2025-10-31T07:54:44Z

> My only suggestion would be to merge masked_memory ops. Or merge lower/upper from it and use it here.

@DiamonDinoia : Are you fine if I split (haha) that part of your commit into a separate one so that we don't block each other?

DiamonDinoia · 2025-10-31T09:18:53Z

> > My only suggestion would be to merge masked_memory ops. Or merge lower/upper from it and use it here.

> @DiamonDinoia : Are you fine if I split (haha) that part of your commit into a separate one so that we don't block each other?

Sure, go ahead. I'll rebase once I have time


serge-sans-paille · 2025-10-31T14:58:25Z

> My only suggestion would be to merge masked_memory ops. Or merge lower/upper from it and use it here.

done (as in: I created a separate PR to extract that part of your patch, merged it and rebased my PR on top of it)


serge-sans-paille force-pushed the bug/1179 branch from 9e19732 to 4d848a8 Compare October 29, 2025 22:09

serge-sans-paille force-pushed the bug/1179 branch from 4d848a8 to affe742 Compare October 30, 2025 07:10

serge-sans-paille force-pushed the bug/1179 branch 5 times, most recently from 520e96d to b76733d Compare October 30, 2025 13:54

AntoinePrv reviewed Oct 30, 2025


serge-sans-paille force-pushed the bug/1179 branch 2 times, most recently from 471c67f to a470cf8 Compare October 30, 2025 21:48

DiamonDinoia reviewed Oct 30, 2025


JohanMabille reviewed Oct 31, 2025


serge-sans-paille force-pushed the bug/1179 branch 2 times, most recently from 6e570d1 to f8136da Compare October 31, 2025 08:32

serge-sans-paille added a commit that referenced this pull request Oct 31, 2025

Move from split_avx / split_avx512 to lower_half/ upper_half

f4d7a34

Related to #1184 and #1162

serge-sans-paille mentioned this pull request Oct 31, 2025

Move from split_avx / split_avx512 to lower_half/ upper_half #1188

Merged

serge-sans-paille added a commit that referenced this pull request Oct 31, 2025

Move from split_avx / split_avx512 to lower_half/ upper_half

fce8da2

Related to #1184 and #1162

Fix portability issue with avx512f upper_half of batch of float

dba009b

It was relying on an intrinsic only available with avx512dq

serge-sans-paille force-pushed the bug/1179 branch from f8136da to 6b93d0f Compare October 31, 2025 14:58

serge-sans-paille force-pushed the bug/1179 branch from 6b93d0f to b9c9e8d Compare October 31, 2025 17:58

New API: widen to widen a batch to a batch twice as big

7ebdc5e

Intel + common implementation + test + doc Fix #1179

serge-sans-paille force-pushed the bug/1179 branch from b9c9e8d to 7ebdc5e Compare October 31, 2025 18:12

serge-sans-paille merged commit 9c7c6de into master Oct 31, 2025
116 of 118 checks passed
