Skip to content

Commit 0ff487a

Browse files
introduce repr(scalable)
Co-authored-by: Jamie Cunliffe <Jamie.Cunliffe@arm.com>
1 parent 192c533 commit 0ff487a

File tree

1 file changed

+326
-0
lines changed

1 file changed

+326
-0
lines changed

text/0000-repr-scalable.md

Lines changed: 326 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,326 @@
1+
- Feature Name: `repr_scalable`
2+
- Start Date: 2025-07-07
3+
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
4+
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)
5+
6+
# Summary
7+
[summary]: #summary
8+
9+
Extends Rust's existing SIMD infrastructure, `#[repr(simd)]`, with a
10+
complementary scalable representation, `#[repr(scalable)]`, to support scalable
11+
vector types, such as Arm's Scalable Vector Extension (SVE), or RISC-V's Vector
12+
Extension (RVV).
13+
14+
Like the existing `repr(simd)` representation, `repr(scalable)` is internal
15+
compiler infrastructure that will be used only in the standard library to
16+
introduce scalable vector types which can then be stablised. Only the
17+
infrastructure to define these types are introduced in this RFC, not the types
18+
or intrinsics that use it.
19+
20+
This RFC builds on Rust's existing SIMD infrastructure, introduced in
21+
[rfcs#1199: SIMD Infrastructure][rfcs#1199]. It depends on
22+
[rfcs#3729: Hierarchy of Sized traits][rfcs#3729].
23+
24+
SVE is used in examples throughout this RFC, but the proposed features should be
25+
sufficient to enable support for similar extensions in other architectures, such
26+
as RISC-V's V Extension.
27+
28+
# Motivation
29+
[motivation]: #motivation
30+
31+
SIMD types and instructions are a crucial element of high-performance Rust
32+
applications and allow for operating on multiple values in a single instruction.
33+
Many processors have SIMD registers of a known fixed length and provide
34+
intrinsics which operate on these registers. For example, Arm's Neon extension
35+
is well-supported by Rust and provides 128-bit registers and a wide range of
36+
intrinsics.
37+
38+
Instead of releasing more extensions with ever increasing register bit widths,
39+
AArch64 has introduced a Scalable Vector Extension (SVE). Similarly, RISC-V has
40+
a Vector Extension (RVV). These extensions have vector registers whose width
41+
depends on the CPU implementation and bit-width-agnostic intrinsics for
42+
operating on these registers. By using scalale vectors, code won't need to be
43+
re-written using new architecture extensions with larger registers, new types
44+
and intrinsics, but instead will work on newer processors with different vector
45+
register lengths and performance characteristics.
46+
47+
Scalable vectors have interesting and challenging implications for Rust,
48+
introducing value types with sizes that can only be known at runtime, requiring
49+
significant changes to the language's notion of sizedness - this support is
50+
being proposed in the [rfcs#3729].
51+
52+
Hardware is generally available with SVE, and key Rust stakeholders want to be
53+
able to use these architecture features from Rust. In a [recent discussion on
54+
SVE, Amanieu, co-lead of the library team, said][quote_amanieu]:
55+
56+
> I've talked with several people in Google, Huawei and Microsoft, all of whom
57+
> have expressed a rather urgent desire for the ability to use SVE intrinsics in
58+
> Rust code, especially now that SVE hardware is generally available.
59+
60+
Without support in the compiler, leveraging the
61+
[*Hierarchy of Sized traits*][rfcs#3729] proposal, it is not possible to
62+
introduce intrinsics and types exposing the scalable vector support in hardware.
63+
64+
# Guide-level explanation
65+
[guide-level-explanation]: #guide-level-explanation
66+
67+
None of the infrastructure proposed in this RFC is intended to be used directly
68+
by Rust users.
69+
70+
`repr(scalable)` as described later in
71+
[*Reference-level explanation*][reference-level-explanation] is perma-unstable
72+
and exists only enables scalable vector types to be defined in the standard
73+
library. The specific vector types are intended to eventually be stabilised, but
74+
none are proposed in this RFC.
75+
76+
## Using scalable vectors
77+
[using-scalable-vectors]: #using-scalable-vectors
78+
79+
Scalable vector types correspond to vector registers in hardware with unknown
80+
size at compile time. However, it will be a known and fixed size at runtime.
81+
Additional properties could be known during compilation, depending on the
82+
architecture, such as a minimum or maximum size or that the size must be a
83+
multiple of some factor.
84+
85+
As previously described, users will not define their own scalable vector types
86+
and instead use intrinsics from `std::arch`, and this RFC is not proposing any
87+
such intrinsics, just the infrastructure. However, to illustrate how the types
88+
and intrinsics that this infrastructure will enable can be used, consider the
89+
following example that sums two input vectors:
90+
91+
```rust
92+
fn sve_add(in_a: Vec<f32>, in_b: Vec<f32>, out_c: &mut Vec<f32>) {
93+
let len = in_a.len();
94+
unsafe {
95+
// `svcntw` returns the actual number of elements that are in a 32-bit
96+
// element vector
97+
let step = svcntw() as usize;
98+
for i in (0..len).step_by(step) {
99+
let a = in_a.as_ptr().add(i);
100+
let b = in_b.as_ptr().add(i);
101+
let c = out_c as *mut f32;
102+
let c = c.add(i);
103+
104+
// `svwhilelt_b32` generates a mask based on comparing the current
105+
// index against the `len`
106+
let pred = svwhilelt_b32(i as _, len as _);
107+
108+
// `svld1_f32` loads a vector register with the data from address
109+
// `a`, zeroing any elements in the vector that are masked out
110+
let sva = svld1_f32(pred, a);
111+
let svb = svld1_f32(pred, b);
112+
113+
// `svadd_f32_m` adds `a` and `b`, any lanes that are masked out will
114+
// take the keep value of `a`
115+
let svc = svadd_f32_m(pred, sva, svb);
116+
117+
// `svst1_f32` will store the result without accessing any memory
118+
// locations that are masked out
119+
svst1_f32(svc, pred, c);
120+
}
121+
}
122+
}
123+
```
124+
125+
From a user's perspective, writing code for scalable vectors isn't too different
126+
from when writing code with a fixed sized vector.
127+
128+
# Reference-level explanation
129+
[reference-level-explanation]: #reference-level-explanation
130+
131+
Types annotated with the `#[repr(simd)]` attribute contains either an array
132+
field or multiple fields to indicate the intended size of the SIMD vector that
133+
the type represents.
134+
135+
Similarly, a `scalable` repr is introduced to define a scalable vector type.
136+
`scalable` accepts an integer to determine the minimum number of elements the
137+
vector contains. For example:
138+
139+
```rust
140+
#[repr(simd, scalable(4))]
141+
pub struct svfloat32_t { _ty: [f32], }
142+
```
143+
144+
As with the existing `repr(simd)`, `_ty` is purely a type marker, used to get
145+
the element type for the codegen backend.
146+
147+
## Properties of scalable vectors
148+
[properties-of-scalable-vector-types]: #properties-of-scalable-vectors
149+
150+
Scalable vectors are necessarily non-`const Sized` (from [rfcs#3729]) as they
151+
behave like value types but the exact size cannot be known at compilation time.
152+
153+
[rfcs#3729] allows these types to implement `Clone` (and consequently `Copy`) as
154+
`Clone` only requires an implementation of `Sized`, irrespective of constness.
155+
156+
Scalable vector types have some further restrictions due to limitations of the
157+
codegen backend:
158+
159+
- Cannot be stored in compound types (structs, enums, etc)
160+
- Including coroutines, so these types cannot be held across
161+
an await boundary in async functions.
162+
- Cannot be used in arrays
163+
- Cannot be the type of a static variable.
164+
165+
Some of these limitations may be able to be lifted in future depending on what
166+
is supported by rustc's codegen backends.
167+
168+
## ABI
169+
[abi]: #abi
170+
171+
Rust currently always passes SIMD vectors on the stack to avoid ABI mismatches
172+
between functions annotated with `target_feature` - where the relevant vector
173+
register is guaranteed to be present - and those without - where the relevant
174+
vector register might not be present.
175+
176+
However, this approach will not work for scalable vector types as the relevant
177+
target feature must to be present to use the instruction that can allocate the
178+
correct size on the stack for the scalable vector.
179+
180+
Therefore, there is an additional restriction that these types cannot be used in
181+
the argument or return types of functions unless those functions are annotated
182+
with the relevant target feature.
183+
184+
## Target features
185+
[target-features]: #target-features
186+
187+
Similarly to the issues with the ABI of scalable vectors, without the relevant
188+
target features, few operations can actually be performed on scalable vectors -
189+
causing issues for the use of scalable vectors in generic code and with traits.
190+
For example, implementations of traits like `Clone` would not be able to
191+
actually perform a clone, and generic functions that are instantiated with
192+
scalable vectors would during instruction selection in the codegen backend.
193+
194+
When a scalable vector is instantiated into a generic function during
195+
monomorphisation, or a trait method is being implemented for a scalable vector,
196+
then the relevant target feature will be added to the function.
197+
198+
For example, when instantiating `std::mem::size_of_val` with a scalable vector
199+
during monomorphisation, the relevant target feature will be added to `size_of_val`
200+
for codegen.
201+
202+
## Implementing `repr(scalable)`
203+
[implementing-repr-scalable]: #implementing-reprscalable
204+
205+
Implementing `repr(scalable)` largely involves lowering scalable vectors to the
206+
appropriate type in the codegen backend. LLVM has robust support for scalable
207+
vectors and is the default backend, so this section will focus on implementation
208+
in the LLVM codegen backend. Other codegen backends can implement support when
209+
scalable vectors are supported by the backend.
210+
211+
Most of the complexity of SVE is handled by LLVM: lowering Rust's scalable
212+
vectors to the correct type in LLVM and the `vscale` modifier that is applied to
213+
LLVM's vector types.
214+
215+
LLVM's scalable vector type is of the form `<vscale x elements x type>`:
216+
217+
- `elements` multiplied by `size_of::<$ty>` gives the smallest allowed register
218+
size and the increment size
219+
- `vscale` is a runtime constant that is used to determine the actual vector
220+
register size
221+
222+
For example, with SVE, the scalable vector register (`Z` register) size has to
223+
be a multiple of 128 bits and a power of 2. Only the value of `elements` can be
224+
chosen by compiler. For `f32`, `elements` must always be four, as with the
225+
minimum `vscale` of one, `1 * 4 * sizeof(f32)` is the 128-bit minimum register
226+
size.
227+
228+
At runtime `vscale` could then be any power of two which would result in
229+
register sizes of 128, 256, 512, 1024 and 2048. `vscale` could be any value
230+
providing it gives a legal vector register size for the architecture.
231+
232+
`repr(scalable)` expects the number of `elements` to be provided rather than
233+
calculating it. This avoids needing to teach the compiler how to calculate the
234+
required `element` count, particularly as some of these scalable types can have
235+
different element counts. For instance, the predicates used in SVE have
236+
different element counts depending on the types they are a predicate for.
237+
238+
While it is possible to change the vector length at runtime using a
239+
[`prctl()`][prctl] call to the kernel, this would require that `vscale` change,
240+
which is unsupported. As Rust cannot prevent users from doing this, it will be
241+
documented as undefined behaviour, consistent with C and C++.
242+
243+
# Drawbacks
244+
[drawbacks]: #drawbacks
245+
246+
- `repr(scalable)` is inherently additional complexity to the language, despite
247+
being largely hidden from users.
248+
249+
# Rationale and alternatives
250+
[rationale-and-alternatives]: #rationale-and-alternatives
251+
252+
Without support for scalable vectors in the language and compiler, it is not
253+
possible to leverage hardware with scalable vectors from Rust. As extensions
254+
with scalable vectors are available in architectures as either the only or
255+
recommended way to do SIMD, lack of support in Rust would severely limit Rust's
256+
suitability on these architectures compared to other systems programming
257+
languages.
258+
259+
By aligning with the approach taken by C (discussed in the
260+
[*Prior art*][prior-art] below), most of the documentation that already exists
261+
for scalable vector intrinsics in C should still be applicable to Rust.
262+
263+
# Prior art
264+
[prior-art]: #prior-art
265+
266+
There are not many languages with support for scalable vectors:
267+
268+
- SVE in C takes a similar approach as this proposal by using sizeless
269+
incomplete types to represent scalable vectors. However, sizeless types are
270+
not part of the C specification and Arm's C Language Extensions (ACLE) provide
271+
[an edit to the C standard][acle_sizeless] which formally define "sizeless
272+
types".
273+
- [.NET 9 has experimental support for SVE][dotnet], but as a managed language,
274+
the design and implementation considerations in .NET are quite different to
275+
Rust.
276+
277+
[rfcs#3268] was a previous iteration of this RFC.
278+
279+
# Unresolved questions
280+
[unresolved-questions]: #unresolved-questions
281+
282+
There are currently no unresolved questions.
283+
284+
# Future possibilities
285+
[future-possibilities]: #future-possibilities
286+
287+
There are a handful of future possibilities enabled by this RFC:
288+
289+
## General mechanism for target-feature-affected types
290+
[general-mechanism-target-feature-types]: #general-mechanism-for-target-feature-affected-types
291+
292+
A more general mechanism for enforcing that SIMD types are only used in
293+
`target_feature`-annotated functions would be useful, as this would enable SVE
294+
types to have fewer distinct restrictions than other SIMD types, and would
295+
enable SIMD vectors to be passed by-register, a performance improvement.
296+
297+
Such a mechanism would need be introduced gradually to existing SIMD types with
298+
a forward compatibility lint. This will be addressed in a forthcoming RFC.
299+
300+
## Relaxed restrictions
301+
[relaxed-restrictions]: #relaxed-restrictions
302+
303+
Some of the restrictions on these types (e.g. use in compound types) could be
304+
relaxed at a later time either by extending rustc's codegen or leveraging newly
305+
added support in LLVM.
306+
307+
However, as C also has restriction and scalable vectors are nevertheless used in
308+
production code, it is unlikely there will be much demand for those restrictions
309+
to be relaxed.
310+
311+
## Portable SIMD
312+
[portable-simd]: #portable-simd
313+
314+
Given that there are significant differences between scalable vectors and
315+
fixed-length vectors, and that `std::simd` is unstable, it is worth
316+
experimenting with architecture-specific support and implementation initially.
317+
Later, there are a variety of approaches that could be taken to incorporate
318+
support for scalable vectors into Portable SIMD.
319+
320+
[acle_sizeless]: https://arm-software.github.io/acle/main/acle.html#formal-definition-of-sizeless-types
321+
[dotnet]: https://github.com/dotnet/runtime/issues/93095
322+
[prctl]: https://www.kernel.org/doc/Documentation/arm64/sve.txt
323+
[rfcs#1199]: https://rust-lang.github.io/rfcs/1199-simd-infrastructure.html
324+
[rfcs#3268]: https://github.com/rust-lang/rfcs/pull/3268
325+
[rfcs#3729]: https://github.com/rust-lang/rfcs/pull/3729
326+
[quote_amanieu]: https://github.com/rust-lang/rust/pull/118917#issuecomment-2202256754

0 commit comments

Comments
 (0)