@@ -206,8 +206,11 @@ impl f32 {
  /// Fused multiply-add. Computes `(self * a) + b` with only one rounding
  /// error, yielding a more accurate result than an unfused multiply-add.
  ///
- /// Using `mul_add` can be more performant than an unfused multiply-add if
- /// the target architecture has a dedicated `fma` CPU instruction.
+ /// Using `mul_add` *can* be more performant than an unfused multiply-add if
+ /// the target architecture has a dedicated `fma` CPU instruction. However,
+ /// this is not always true, and care must be taken not to overload the
+ /// architecture's available FMA units when using many FMA instructions
+ /// in a row, which can cause a stall and performance degradation.
  ///
  /// # Examples
  ///
@@ -206,8 +206,11 @@ impl f64 {
  /// Fused multiply-add. Computes `(self * a) + b` with only one rounding
  /// error, yielding a more accurate result than an unfused multiply-add.
  ///
- /// Using `mul_add` can be more performant than an unfused multiply-add if
- /// the target architecture has a dedicated `fma` CPU instruction.
+ /// Using `mul_add` *can* be more performant than an unfused multiply-add if
+ /// the target architecture has a dedicated `fma` CPU instruction. However,
+ /// this is not always true, and care must be taken not to overload the
+ /// architecture's available FMA units when using many FMA instructions
+ /// in a row, which can cause a stall and performance degradation.
  ///
  /// # Examples
  ///
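
The "only one rounding error" claim in the doc comment can be illustrated with a minimal sketch. This is not part of the change itself; the helper names `unfused` and `fused` and the chosen inputs are assumptions made for illustration. It only demonstrates the accuracy difference and says nothing about performance, which depends on the target and the surrounding code.

```rust
/// Unfused: the product is rounded to f64 first, then the sum is rounded again.
/// (Hypothetical helper, not part of the standard library.)
fn unfused(x: f64, a: f64, b: f64) -> f64 {
    x * a + b
}

/// Fused: `(x * a) + b` is evaluated with a single rounding at the end.
/// (Hypothetical helper wrapping the real `f64::mul_add`.)
fn fused(x: f64, a: f64, b: f64) -> f64 {
    x.mul_add(a, b)
}

fn main() {
    // x = 1 + 2^-52, so the exact square is 1 + 2^-51 + 2^-104.
    let x = 1.0 + f64::EPSILON;
    // b cancels everything except the 2^-104 term of the exact product.
    let b = -(1.0 + 2.0 * f64::EPSILON);

    println!("unfused: {:e}", unfused(x, x, b));
    println!("fused:   {:e}", fused(x, x, b));
}
```

With these inputs the unfused version loses the `2^-104` term when the product is rounded and returns `0.0`, while `mul_add` keeps it and returns `f64::EPSILON * f64::EPSILON`.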