
Commit 80c0159

author
Jennifer Berringer
committed
cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry
JIRA: https://issues.redhat.com/browse/RHEL-80968

commit fbb4a47
Author: Beata Michalska <beata.michalska@arm.com>
Date: Fri Jan 31 16:24:37 2025 +0000

    cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry

    Currently the CPUFreq core exposes two sysfs attributes that can be used
    to query current frequency of a given CPU(s): namely cpuinfo_cur_freq
    and scaling_cur_freq. Both provide slightly different view on the
    subject and they do come with their own drawbacks.

    cpuinfo_cur_freq provides higher precision though at a cost of being
    rather expensive. Moreover, the information retrieved via this attribute
    is somewhat short lived as frequency can change at any point of time
    making it difficult to reason from.

    scaling_cur_freq, on the other hand, tends to be less accurate but then
    the actual level of precision (and source of information) varies between
    architectures making it a bit ambiguous.

    The new attribute, cpuinfo_avg_freq, is intended to provide more stable,
    distinct interface, exposing an average frequency of a given CPU(s), as
    reported by the hardware, over a time frame spanning no more than a few
    milliseconds. As it requires appropriate hardware support, this
    interface is optional.

    Note that under the hood, the new attribute relies on the information
    provided by arch_freq_get_on_cpu, which, up to this point, has been
    feeding data for scaling_cur_freq attribute, being the source of
    ambiguity when it comes to interpretation. This has been amended by
    restoring the intended behavior for scaling_cur_freq, with a new
    dedicated config option to maintain status quo for those, who may need
    it.

    CC: Jonathan Corbet <corbet@lwn.net>
    CC: Thomas Gleixner <tglx@linutronix.de>
    CC: Ingo Molnar <mingo@redhat.com>
    CC: Borislav Petkov <bp@alien8.de>
    CC: Dave Hansen <dave.hansen@linux.intel.com>
    CC: H. Peter Anvin <hpa@zytor.com>
    CC: Phil Auld <pauld@redhat.com>
    CC: x86@kernel.org
    CC: linux-doc@vger.kernel.org
    Signed-off-by: Beata Michalska <beata.michalska@arm.com>
    Reviewed-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
    Reviewed-by: Sumit Gupta <sumitg@nvidia.com>
    Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
    Acked-by: Rafael J. Wysocki <rafael@kernel.org>
    Link: https://lore.kernel.org/r/20250131162439.3843071-3-beata.michalska@arm.com
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Signed-off-by: Jennifer Berringer <jberring@redhat.com>
1 parent 2a41628 commit 80c0159

File tree

3 files changed, 57 insertions(+), 2 deletions(-)


Documentation/admin-guide/pm/cpufreq.rst

Lines changed: 16 additions & 1 deletion
@@ -248,6 +248,20 @@ are the following:
 	If that frequency cannot be determined, this attribute should not
 	be present.
 
+``cpuinfo_avg_freq``
+	An average frequency (in KHz) of all CPUs belonging to a given policy,
+	derived from a hardware provided feedback and reported on a time frame
+	spanning at most few milliseconds.
+
+	This is expected to be based on the frequency the hardware actually runs
+	at and, as such, might require specialised hardware support (such as AMU
+	extension on ARM). If one cannot be determined, this attribute should
+	not be present.
+
+	Note, that failed attempt to retrieve current frequency for a given
+	CPU(s) will result in an appropriate error, i.e: EAGAIN for CPU that
+	remains idle (raised on ARM).
+
 ``cpuinfo_max_freq``
 	Maximum possible operating frequency the CPUs belonging to this policy
 	can run at (in kHz).
@@ -293,7 +307,8 @@ are the following:
 	Some architectures (e.g. ``x86``) may attempt to provide information
 	more precisely reflecting the current CPU frequency through this
 	attribute, but that still may not be the exact current CPU frequency as
-	seen by the hardware at the moment.
+	seen by the hardware at the moment. This behavior though, is only
+	available via c:macro:``CPUFREQ_ARCH_CUR_FREQ`` option.
 
 ``scaling_driver``
 	The scaling driver currently in use.
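
The documentation change above says that `cpuinfo_avg_freq` may be absent and that a read may fail with EAGAIN for an idle CPU. A minimal userspace sketch of a reader that distinguishes those cases follows. The helper name `read_freq_khz` is hypothetical and not part of this commit; the only assumption about the kernel side is the behavior described in the documentation text.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the commit): read one cpufreq sysfs
 * frequency attribute, expressed in kHz.
 *
 * Returns 0 and stores the value in *khz on success, or a negative errno:
 *   -ENOENT  the attribute does not exist (e.g. no hardware support)
 *   -EAGAIN  the kernel could not provide a value right now (idle CPU)
 *   -EINVAL  the attribute was empty or did not start with a number
 */
static int read_freq_khz(const char *path, long *khz)
{
	char buf[32];
	char *end;
	ssize_t n;
	int fd, err;

	assert(path != NULL && khz != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	n = read(fd, buf, sizeof(buf) - 1);
	err = errno;		/* save before close() can overwrite it */
	close(fd);

	if (n < 0)
		return -err;
	if (n == 0)
		return -EINVAL;

	buf[n] = '\0';
	errno = 0;
	*khz = strtol(buf, &end, 10);
	if (errno != 0 || end == buf)
		return -EINVAL;

	return 0;
}
```

A caller would typically pass a path such as `/sys/devices/system/cpu/cpufreq/policy0/cpuinfo_avg_freq` (policy0 chosen only as an example), treat `-ENOENT` as "attribute not supported on this system", and retry or skip the sample on `-EAGAIN`.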

drivers/cpufreq/Kconfig.x86

Lines changed: 12 additions & 0 deletions
@@ -340,3 +340,15 @@ config X86_SPEEDSTEP_RELAXED_CAP_CHECK
 	  option lets the probing code bypass some of those checks if the
 	  parameter "relaxed_check=1" is passed to the module.
 
+config CPUFREQ_ARCH_CUR_FREQ
+	default y
+	bool "Current frequency derived from HW provided feedback"
+	help
+	  This determines whether the scaling_cur_freq sysfs attribute returns
+	  the last requested frequency or a more precise value based on hardware
+	  provided feedback (as architected counters).
+	  Given that a more precise frequency can now be provided via the
+	  cpuinfo_avg_freq attribute, by enabling this option,
+	  scaling_cur_freq maintains the provision of a counter based frequency,
+	  for compatibility reasons.
+
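
The help text above describes two possible sources for `scaling_cur_freq`. The selection can be sketched as a small userspace model. This is a simplified illustration, not kernel code: all names are hypothetical, `arch_cur_freq_enabled` stands in for `IS_ENABLED(CONFIG_CPUFREQ_ARCH_CUR_FREQ)`, `arch_freq_khz` for the result of `arch_freq_get_on_cpu()`, and the model leaves out the driver `setpolicy`/`get` branch that the real function also has.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of which value scaling_cur_freq reports.
 * With the option disabled the hardware-derived value is never consulted
 * and the last requested frequency is reported instead.
 */
static unsigned int model_scaling_cur_freq(bool arch_cur_freq_enabled,
					   int arch_freq_khz,
					   unsigned int requested_khz)
{
	int freq = arch_cur_freq_enabled ? arch_freq_khz : 0;

	if (freq > 0)
		return (unsigned int)freq;
	return requested_khz;
}

/* Self-check showing the three interesting cases. */
static void model_scaling_cur_freq_check(void)
{
	/* Option enabled, hardware feedback available: counter based value. */
	assert(model_scaling_cur_freq(true, 1795000, 1800000) == 1795000);
	/* Option disabled: last requested frequency. */
	assert(model_scaling_cur_freq(false, 1795000, 1800000) == 1800000);
	/* Option enabled but the arch hook failed: fall back as well. */
	assert(model_scaling_cur_freq(true, -1, 1800000) == 1800000);
}
```

The option defaults to `y` in this hunk, so the first case reflects the default behavior on x86.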

drivers/cpufreq/cpufreq.c

Lines changed: 29 additions & 1 deletion
@@ -733,12 +733,20 @@ __weak int arch_freq_get_on_cpu(int cpu)
 	return -EOPNOTSUPP;
 }
 
+static inline bool cpufreq_avg_freq_supported(struct cpufreq_policy *policy)
+{
+	return arch_freq_get_on_cpu(policy->cpu) != -EOPNOTSUPP;
+}
+
 static ssize_t show_scaling_cur_freq(struct cpufreq_policy *policy, char *buf)
 {
 	ssize_t ret;
 	int freq;
 
-	freq = arch_freq_get_on_cpu(policy->cpu);
+	freq = IS_ENABLED(CONFIG_CPUFREQ_ARCH_CUR_FREQ)
+		? arch_freq_get_on_cpu(policy->cpu)
+		: 0;
+
 	if (freq > 0)
 		ret = sysfs_emit(buf, "%u\n", freq);
 	else if (cpufreq_driver->setpolicy && cpufreq_driver->get)
@@ -783,6 +791,19 @@ static ssize_t show_cpuinfo_cur_freq(struct cpufreq_policy *policy,
 	return sysfs_emit(buf, "<unknown>\n");
 }
 
+/*
+ * show_cpuinfo_avg_freq - average CPU frequency as detected by hardware
+ */
+static ssize_t show_cpuinfo_avg_freq(struct cpufreq_policy *policy,
+				     char *buf)
+{
+	int avg_freq = arch_freq_get_on_cpu(policy->cpu);
+
+	if (avg_freq > 0)
+		return sysfs_emit(buf, "%u\n", avg_freq);
+	return avg_freq != 0 ? avg_freq : -EINVAL;
+}
+
 /*
  * show_scaling_governor - show the current policy for the specified CPU
  */
@@ -945,6 +966,7 @@ static ssize_t show_bios_limit(struct cpufreq_policy *policy, char *buf)
 }
 
 cpufreq_freq_attr_ro_perm(cpuinfo_cur_freq, 0400);
+cpufreq_freq_attr_ro(cpuinfo_avg_freq);
 cpufreq_freq_attr_ro(cpuinfo_min_freq);
 cpufreq_freq_attr_ro(cpuinfo_max_freq);
 cpufreq_freq_attr_ro(cpuinfo_transition_latency);
@@ -1072,6 +1094,12 @@ static int cpufreq_add_dev_interface(struct cpufreq_policy *policy)
 		return ret;
 	}
 
+	if (cpufreq_avg_freq_supported(policy)) {
+		ret = sysfs_create_file(&policy->kobj, &cpuinfo_avg_freq.attr);
+		if (ret)
+			return ret;
+	}
+
 	ret = sysfs_create_file(&policy->kobj, &scaling_cur_freq.attr);
 	if (ret)
 		return ret;
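
The return-value handling in `cpufreq_avg_freq_supported()` and `show_cpuinfo_avg_freq()` can be restated as a userspace model, which may help when reasoning about what a reader of the attribute will observe. The function names below are hypothetical. One deliberate simplification: on success the kernel function returns the byte count from `sysfs_emit()`, whereas this model returns the frequency itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model: the attribute is created unless the arch hook
 * reports -EOPNOTSUPP. Any other result, including a transient error
 * such as -EAGAIN, still counts as "supported".
 */
static bool model_avg_freq_supported(int arch_ret)
{
	return arch_ret != -EOPNOTSUPP;
}

/*
 * Hypothetical model of the read path: a positive value is reported as
 * the frequency, a negative errno is passed through unchanged, and a
 * zero result is turned into -EINVAL.
 */
static int model_show_avg_freq(int arch_ret)
{
	if (arch_ret > 0)
		return arch_ret;
	return arch_ret != 0 ? arch_ret : -EINVAL;
}

/* Self-check covering each branch. */
static void model_avg_freq_check(void)
{
	assert(model_avg_freq_supported(1800000));
	assert(model_avg_freq_supported(-EAGAIN));
	assert(!model_avg_freq_supported(-EOPNOTSUPP));

	assert(model_show_avg_freq(1800000) == 1800000);
	assert(model_show_avg_freq(-EAGAIN) == -EAGAIN);
	assert(model_show_avg_freq(0) == -EINVAL);
}
```

Note that the support check runs once, when the policy's sysfs interface is created, so a CPU that happens to return a transient error at that moment still gets the attribute, and later reads surface the error to userspace.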
