Commit cc26c0f

doxy: improve the faq entry about slow gpu discovery
Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
1 parent 45d5fcc commit cc26c0f

1 file changed

+15
-8
lines changed
doc/hwloc.doxy

Lines changed: 15 additions & 8 deletions
@@ -4031,9 +4031,9 @@ libraries such as NVML.
 To speed up lstopo, you may disable such features with command-line
 options such as <tt>\--no-io</tt>.
 
-When NVIDIA GPU probing is enabled with CUDA or NVML, one should make sure that
-the <em>Persistent</em> mode is enabled (with <tt>nvidia-smi -pm 1</tt>)
-to avoid significant GPU initialization overhead.
+When NVIDIA GPU probing is enabled (e.g. with CUDA or NVML), one may enable
+the <em>Persistent</em> mode (with <tt>nvidia-smi -pm 1</tt>)
+to avoid significant GPU wakeup and initialization overhead.
 
 When AMD GPU discovery is enabled with OpenCL and hwloc is used remotely
 over ssh, some spurious round-trips on the network may significantly
@@ -4042,11 +4042,18 @@ Forcing the <tt>DISPLAY</tt> environment variable to the remote X server
 display (usually <tt>:0</tt>) instead of only setting the <tt>COMPUTE</tt>
 variable may avoid this.
 
-Also remember that these components may be disabled at build-time with
-configure flags such as <tt>\--disable-opencl</tt>, <tt>\--disable-cuda</tt> or <tt>\--disable-nvml</tt>,
-and at runtime with the environment variable
-<tt>HWLOC_COMPONENTS=-opencl,-cuda,-nvml</tt>
-or with hwloc_topology_set_components().
+Also remember that these hwloc components may be disabled.
+At build-time, one may pass configure flags such as <tt>\--disable-opencl</tt>,
+<tt>\--disable-cuda</tt>, <tt>\--disable-nvml</tt>, <tt>\--disable-rsmi</tt>,
+and <tt>\--disable-levelzero</tt>.
+At runtime, one may set the environment variable
+<tt>HWLOC_COMPONENTS=-opencl,-cuda,-nvml,-rsmi,-levelzero</tt>
+or call hwloc_topology_set_components().
+
+Remember that these backends are disabled by default, except in lstopo.
+If hwloc itself is still too slow even after disabling all the I/O devices
+as explained above, see also \ref faq_disable_faster for disabling even more
+features.
 
 
 \subsection faq_privileged Does hwloc require privileged access?
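
The runtime option described in the new FAQ text (setting <tt>HWLOC_COMPONENTS</tt>) can also be applied from inside a program, before the topology is loaded. The following is only a minimal sketch under stated assumptions: the helper names `build_blacklist` and `disable_gpu_components_via_env` are hypothetical and not part of hwloc, the component names are the ones listed in the diff, and POSIX `setenv()` is assumed to be available. It does not call hwloc itself; it only prepares the environment variable that the FAQ entry mentions.

```c
#define _POSIX_C_SOURCE 200112L  /* for setenv() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write "-name1,-name2,..." into out.
 * Returns 0 on success, -1 if out is too small. */
static int build_blacklist(const char *const *names, size_t count,
                           char *out, size_t outlen)
{
    size_t used = 0;

    assert(names != NULL && out != NULL);
    if (outlen == 0)
        return -1;
    out[0] = '\0';
    for (size_t i = 0; i < count; i++) {
        /* Prefix each name with '-' and separate entries with ','. */
        int n = snprintf(out + used, outlen - used, "%s-%s",
                         i > 0 ? "," : "", names[i]);
        if (n < 0 || (size_t)n >= outlen - used)
            return -1;  /* output would have been truncated */
        used += (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: export HWLOC_COMPONENTS with the GPU components
 * named in the FAQ text disabled. Returns 0 on success, -1 on error. */
static int disable_gpu_components_via_env(void)
{
    static const char *const gpu[] = {
        "opencl", "cuda", "nvml", "rsmi", "levelzero"
    };
    char value[128];

    if (build_blacklist(gpu, sizeof gpu / sizeof gpu[0],
                        value, sizeof value) != 0)
        return -1;
    return setenv("HWLOC_COMPONENTS", value, 1);
}
```

A caller would presumably invoke `disable_gpu_components_via_env()` before loading the topology so that the variable is already set when hwloc reads it. The FAQ text names hwloc_topology_set_components() as the in-API alternative, which avoids touching the process environment.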
