docs/source/index.rst (4 additions, 0 deletions)
@@ -19,6 +19,9 @@ vendors and compilers.
 `xsimd` provides a unified means for using these features for library authors. Namely, it enables manipulation of batches of numbers with the same arithmetic
 operators as for single values. It also provides accelerated implementation of common mathematical functions operating on batches.
 
+`xsimd` makes it easy to write a single algorithm, generate one version of the algorithm per micro-architecture and pick the best one at runtime, based on the
+running processor capability.
+
 You can find out more about this implementation of C++ wrappers for SIMD intrinsics at `The C++ Scientist`_. The mathematical functions are a
 lightweight implementation of the algorithms also used in `boost.SIMD`_.
 
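The "same arithmetic operators as for single values" idea above can be sketched in plain C++. The following `toy_batch` type is a hypothetical illustration, not xsimd's implementation: a fixed-size batch whose operators apply element-wise, which is the abstraction `xsimd` provides on top of real SIMD registers.

```cpp
#include <array>
#include <cstddef>

// Hypothetical illustration (NOT xsimd's API): a fixed-size "batch" whose
// arithmetic operators act element-wise, mimicking how xsimd lets you use
// the same operators on batches as on single values.
template <class T, std::size_t N>
struct toy_batch
{
    std::array<T, N> data;

    friend toy_batch operator+(toy_batch lhs, const toy_batch& rhs)
    {
        for (std::size_t i = 0; i < N; ++i)
            lhs.data[i] += rhs.data[i];  // element-wise addition
        return lhs;
    }

    friend toy_batch operator*(toy_batch lhs, T scalar)
    {
        for (std::size_t i = 0; i < N; ++i)
            lhs.data[i] *= scalar;  // element-wise scaling
        return lhs;
    }
};

// (a + b) * 0.5 computes the element-wise mean, written exactly as for scalars.
inline toy_batch<double, 4> toy_mean(const toy_batch<double, 4>& a,
                                     const toy_batch<double, 4>& b)
{
    return (a + b) * 0.5;
}
```

With a real `xsimd::batch`, the same expression would compile down to SIMD instructions instead of a scalar loop.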
@@ -80,6 +83,7 @@ This software is licensed under the BSD-3-Clause license. See the LICENSE file f
   api/batch_manip
   api/math_index
   api/aligned_allocator
+  api/dispatching
 
 .. _The C++ Scientist: http://johanmabille.github.io/blog/archives/
docs/source/installation.rst (6 additions, 6 deletions)
@@ -21,27 +21,27 @@
 Installation
 ============
 
-Although ``xsimd`` is a header-only library, we provide standardized means to install it, with package managers or with cmake.
+Although `xsimd` is a header-only library, we provide standardized means to install it, with package managers or with cmake.
 
-Besides the xsimd headers, all these methods place the ``cmake`` project configuration file in the right location so that third-party projects can use cmake's ``find_package`` to locate xsimd headers.
+Besides the `xsimd` headers, all these methods place the ``cmake`` project configuration file in the right location so that third-party projects can use cmake's ``find_package`` to locate `xsimd` headers.
 
 .. image:: conda.svg
 
 Using the conda-forge package
 -----------------------------
 
-A package for xsimd is available for the mamba (or conda) package manager.
+A package for `xsimd` is available for the `mamba <https://mamba.readthedocs.io>`_ (or `conda <https://conda.io>`_) package manager.
 
 .. code::
 
-    mamba install -c conda-forge xsimd
+    mamba install -c conda-forge xsimd
 
 .. image:: spack.svg
 
 Using the Spack package
 -----------------------
 
-A package for xsimd is available on the Spack package manager.
+A package for `xsimd` is available on the `Spack <https://spack.io>`_ package manager.
 
 .. code::
 
@@ -53,7 +53,7 @@ A package for xsimd is available on the Spack package manager.
 From source with cmake
 ----------------------
 
-You can also install ``xsimd`` from source with cmake. On Unix platforms, from the source directory:
+You can also install `xsimd` from source with `cmake <https://cmake.org/>`_. On Unix platforms, from the source directory:
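Once installed by any of these methods, the ``find_package`` workflow mentioned above looks roughly like the following from a consuming project's side. This is a hedged sketch: the project name is made up, and the ``xsimd`` target name is an assumption based on typical exported cmake package configurations.

```cmake
# Hypothetical consumer project; the exported target name is an assumption.
cmake_minimum_required(VERSION 3.15)
project(my_simd_app CXX)

find_package(xsimd REQUIRED)          # locates the installed xsimd cmake config

add_executable(my_simd_app main.cpp)
target_link_libraries(my_simd_app PRIVATE xsimd)  # header-only interface target
```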
docs/source/vectorized_code.rst (50 additions, 10 deletions)
@@ -28,8 +28,8 @@ How can we use `xsimd` to take advantage of vectorization?
 Explicit use of an instruction set
 ----------------------------------
 
-`xsimd` provides the template class ``batch<T, A>`` where ``A`` is the target architecture and ``T`` the type of the values involved in SIMD
-instructions. If you know which intruction set is available on your machine, you can directly use the corresponding specialization
+`xsimd` provides the template class :cpp:class:`xsimd::batch` parametrized by the types ``T`` and ``A``, where ``T`` is the type of the values involved in SIMD
+instructions and ``A`` is the target architecture. If you know which instruction set is available on your machine, you can directly use the corresponding specialization
 of ``batch``. For instance, assuming the AVX instruction set is available, the previous code can be vectorized the following way:
 
 .. code::
@@ -60,19 +60,19 @@ of ``batch``. For instance, assuming the AVX instruction set is available, the p
     }
 
 However, if you want to write code that is portable, you cannot rely on the use of ``batch<double, xsimd::avx>``.
-Indeed this won't compile on a CPU where only SSE2 instruction set is available for instance. Fortuantely, if you don't set the second template parameter, ``xsimd`` picks the best architecture among the one available, based on the compiler flag you use.
+Indeed, this won't compile on a CPU where only the SSE2 instruction set is available, for instance. Fortunately, if you don't set the second template parameter, `xsimd` picks the best architecture among the available ones, based on the compiler flags you use.
 
 
 Aligned vs unaligned memory
 ---------------------------
 
-In the previous example, you may have noticed the ``load_unaligned/store_unaligned`` functions. These
+In the previous example, you may have noticed the :cpp:func:`xsimd::batch::load_unaligned` and :cpp:func:`xsimd::batch::store_unaligned` functions. These
 are meant for loading values from contiguous dynamically allocated memory into SIMD registers and
 reciprocally. When dealing with memory transfer operations, some instruction sets require the memory
 to be aligned by a given amount, while others can handle both aligned and unaligned modes. In the latter case,
-operating on aligned memory is always faster than operating on unaligned memory.
+operating on aligned memory is generally faster than operating on unaligned memory.
 
-`xsimd` provides an aligned memory allocator which follows the standard requirements, so it can be used
+`xsimd` provides an aligned memory allocator, namely :cpp:class:`xsimd::aligned_allocator`, which follows the standard requirements, so it can be used
 with STL containers. Let's change the previous code so it can take advantage of this allocator:
 
 .. code::
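What an aligned allocator does can be sketched in plain C++17 with the aligned form of ``operator new``. This is an illustrative stand-in, not xsimd's actual `aligned_allocator` implementation; the class name and the 32-byte alignment choice are assumptions for the example.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Minimal standard-conforming aligned allocator, sketching what
// xsimd::aligned_allocator provides (this is NOT xsimd's implementation).
template <class T, std::size_t Alignment>
struct aligned_allocator_sketch
{
    using value_type = T;

    template <class U>
    struct rebind { using other = aligned_allocator_sketch<U, Alignment>; };

    aligned_allocator_sketch() = default;
    template <class U>
    aligned_allocator_sketch(const aligned_allocator_sketch<U, Alignment>&) {}

    T* allocate(std::size_t n)
    {
        // C++17 aligned operator new guarantees the requested alignment.
        return static_cast<T*>(::operator new(n * sizeof(T),
                                              std::align_val_t(Alignment)));
    }

    void deallocate(T* p, std::size_t)
    {
        ::operator delete(p, std::align_val_t(Alignment));
    }
};

template <class T, class U, std::size_t A>
bool operator==(const aligned_allocator_sketch<T, A>&,
                const aligned_allocator_sketch<U, A>&) { return true; }
template <class T, class U, std::size_t A>
bool operator!=(const aligned_allocator_sketch<T, A>&,
                const aligned_allocator_sketch<U, A>&) { return false; }

// Usage with an STL container: every buffer is 32-byte aligned (AVX-friendly).
using avx_vector = std::vector<double, aligned_allocator_sketch<double, 32>>;
```

With such a container, the aligned ``load_aligned``/``store_aligned`` entry points can be used safely instead of the unaligned ones.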
@@ -118,7 +118,7 @@ mechanism that allows you to easily write such a generic code:
     #include "xsimd/xsimd.hpp"
 
     template <class C, class Tag>
-    void mean(const C& a, const C& b, C& res)
+    void mean(const C& a, const C& b, C& res, Tag)
     {
         using b_type = xsimd::batch<double>;
         std::size_t inc = b_type::size;
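The ``Tag`` parameter added to ``mean`` above is an instance of the classic tag-dispatch idiom: empty tag types select an overload at compile time, at zero runtime cost. A minimal sketch of the idiom follows; the tag names mirror xsimd's ``aligned_mode``/``unaligned_mode``, but the ``load_hint`` functions are hypothetical illustrations, not xsimd API.

```cpp
#include <string>

// Empty tag types: their only job is to steer overload resolution.
// The names mirror xsimd's aligned_mode / unaligned_mode.
struct aligned_mode {};
struct unaligned_mode {};

// Hypothetical overload pair selected by the tag (not xsimd functions).
inline std::string load_hint(aligned_mode)   { return "aligned load"; }
inline std::string load_hint(unaligned_mode) { return "unaligned load"; }

// A generic algorithm just forwards the tag; the right overload is picked
// at compile time, exactly like the Tag parameter of mean() above.
template <class Tag>
std::string describe(Tag tag)
{
    return load_hint(tag);
}
```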
@@ -139,10 +139,50 @@ mechanism that allows you to easily write such a generic code:
         }
     }
 
-Here, the ``Tag`` template parameter can be ``xsimd::aligned_mode`` or ``xsimd::unaligned_mode``. Assuming the existence
-of a ``get_alignment_tag`` metafunction in the code, the previous code can be invoked this way:
+Here, the ``Tag`` template parameter can be :cpp:struct:`xsimd::aligned_mode` or :cpp:struct:`xsimd::unaligned_mode`. Assuming the existence
+of a ``get_alignment_tag`` meta-function in the code, the previous code can be invoked this way:

[…]

+This can be useful to implement runtime dispatching, based on the instruction set detected at runtime. `xsimd` provides a generic mechanism, :cpp:func:`xsimd::dispatch`, to implement
+this pattern. Based on the above example, instead of calling ``mean{}(arch, a, b, res, tag)``, one can use ``xsimd::dispatch(mean{})(a, b, res, tag)``.
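The runtime-dispatch pattern that ``xsimd::dispatch`` automates can be sketched in plain C++: compile several kernels, probe the CPU once, then route calls through the selected function pointer. Everything here is a hedged stand-in; in particular ``cpu_supports_avx()`` is a hypothetical placeholder for a real CPUID-based capability check, and ``sum_avx`` would be a genuinely AVX-compiled kernel in a real build.

```cpp
#include <cstddef>

namespace sketch
{
    // Baseline kernel, always available.
    inline double sum_scalar(const double* p, std::size_t n)
    {
        double s = 0.0;
        for (std::size_t i = 0; i < n; ++i)
            s += p[i];
        return s;
    }

    // Placeholder: a real build would compile this translation unit with AVX
    // flags and use SIMD batches here.
    inline double sum_avx(const double* p, std::size_t n)
    {
        return sum_scalar(p, n);
    }

    // Hypothetical capability probe (a real one would inspect CPUID).
    inline bool cpu_supports_avx() { return false; }

    using sum_fn = double (*)(const double*, std::size_t);

    // Chosen once, based on the running processor's capability; every later
    // call goes through the selected pointer. This is the pattern that
    // xsimd::dispatch implements generically over all supported architectures.
    inline sum_fn select_sum()
    {
        return cpu_supports_avx() ? sum_avx : sum_scalar;
    }
}
```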