Commit c82044a

Merge: Sched: /proc/schedstat improvements
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/6405

JIRA: https://issues.redhat.com/browse/RHEL-23495

Update /proc/schedstat with fixes and improved information from upstream. AMD requested these and they don't carry a large risk.

Signed-off-by: Phil Auld <pauld@redhat.com>
Approved-by: Juri Lelli <juri.lelli@redhat.com>
Approved-by: Waiman Long <longman@redhat.com>
Approved-by: Rafael Aquini <raquini@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>
Merged-by: Augusto Caringi <acaringi@redhat.com>
2 parents b97e4e0 + 4970c10 commit c82044a

6 files changed: +143 -92 lines changed

Documentation/scheduler/sched-stats.rst

Lines changed: 75 additions & 51 deletions
Original file line number | Diff line number | Diff line change
@@ -2,14 +2,22 @@
22
Scheduler Statistics
33
====================
44

5+
Version 17 of schedstats removed 'lb_imbalance' field as it has no
6+
significance anymore and instead added more relevant fields namely
7+
'lb_imbalance_load', 'lb_imbalance_util', 'lb_imbalance_task' and
8+
'lb_imbalance_misfit'. The domain field prints the name of the
9+
corresponding sched domain from this version onwards.
10+
511
Version 16 of schedstats changed the order of definitions within
612
'enum cpu_idle_type', which changed the order of [CPU_MAX_IDLE_TYPES]
713
columns in show_schedstat(). In particular the position of CPU_IDLE
814
and __CPU_NOT_IDLE changed places. The size of the array is unchanged.
915

1016
Version 15 of schedstats dropped counters for some sched_yield:
1117
yld_exp_empty, yld_act_empty and yld_both_empty. Otherwise, it is
12-
identical to version 14.
18+
identical to version 14. Details are available at
19+
20+
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/scheduler/sched-stats.txt?id=1e1dbb259c79b
1321

1422
Version 14 of schedstats includes support for sched_domains, which hit the
1523
mainline kernel in 2.6.20 although it is identical to the stats from version
@@ -26,7 +34,14 @@ cpus on the machine, while domain0 is the most tightly focused domain,
2634
sometimes balancing only between pairs of cpus. At this time, there
2735
are no architectures which need more than three domain levels. The first
2836
field in the domain stats is a bit map indicating which cpus are affected
29-
by that domain.
37+
by that domain. Details are available at
38+
39+
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/sched-stats.txt?id=b762f3ffb797c
40+
41+
The schedstat documentation is maintained version 10 onwards and is not
42+
updated for version 11 and 12. The details for version 10 are available at
43+
44+
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/sched-stats.txt?id=1da177e4c3f4
3045

3146
These fields are counters, and only increment. Programs which make use
3247
of these will need to start with a baseline observation and then calculate
@@ -71,88 +86,97 @@ Domain statistics
7186
-----------------
7287
One of these is produced per domain for each cpu described. (Note that if
7388
CONFIG_SMP is not defined, *no* domains are utilized and these lines
74-
will not appear in the output.)
89+
will not appear in the output. <name> is an extension to the domain field
90+
that prints the name of the corresponding sched domain. It can appear in
91+
schedstat version 17 and above, and requires CONFIG_SCHED_DEBUG.)
7592

76-
domain<N> <cpumask> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
93+
domain<N> <name> <cpumask> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
7794

7895
The first field is a bit mask indicating what cpus this domain operates over.
7996

80-
The next 24 are a variety of sched_balance_rq() statistics in grouped into types
81-
of idleness (idle, busy, and newly idle):
97+
The next 33 are a variety of sched_balance_rq() statistics in grouped into types
98+
of idleness (busy, idle and newly idle):
8299

83100
1) # of times in this domain sched_balance_rq() was called when the
101+
cpu was busy
102+
2) # of times in this domain sched_balance_rq() checked but found the
103+
load did not require balancing when busy
104+
3) # of times in this domain sched_balance_rq() tried to move one or
105+
more tasks and failed, when the cpu was busy
106+
4) Total imbalance in load when the cpu was busy
107+
5) Total imbalance in utilization when the cpu was busy
108+
6) Total imbalance in number of tasks when the cpu was busy
109+
7) Total imbalance due to misfit tasks when the cpu was busy
110+
8) # of times in this domain pull_task() was called when busy
111+
9) # of times in this domain pull_task() was called even though the
112+
target task was cache-hot when busy
113+
10) # of times in this domain sched_balance_rq() was called but did not
114+
find a busier queue while the cpu was busy
115+
11) # of times in this domain a busier queue was found while the cpu
116+
was busy but no busier group was found
117+
118+
12) # of times in this domain sched_balance_rq() was called when the
84119
cpu was idle
85-
2) # of times in this domain sched_balance_rq() checked but found
120+
13) # of times in this domain sched_balance_rq() checked but found
86121
the load did not require balancing when the cpu was idle
87-
3) # of times in this domain sched_balance_rq() tried to move one or
122+
14) # of times in this domain sched_balance_rq() tried to move one or
88123
more tasks and failed, when the cpu was idle
89-
4) sum of imbalances discovered (if any) with each call to
90-
sched_balance_rq() in this domain when the cpu was idle
91-
5) # of times in this domain pull_task() was called when the cpu
124+
15) Total imbalance in load when the cpu was idle
125+
16) Total imbalance in utilization when the cpu was idle
126+
17) Total imbalance in number of tasks when the cpu was idle
127+
18) Total imbalance due to misfit tasks when the cpu was idle
128+
19) # of times in this domain pull_task() was called when the cpu
92129
was idle
93-
6) # of times in this domain pull_task() was called even though
130+
20) # of times in this domain pull_task() was called even though
94131
the target task was cache-hot when idle
95-
7) # of times in this domain sched_balance_rq() was called but did
132+
21) # of times in this domain sched_balance_rq() was called but did
96133
not find a busier queue while the cpu was idle
97-
8) # of times in this domain a busier queue was found while the
134+
22) # of times in this domain a busier queue was found while the
98135
cpu was idle but no busier group was found
99-
9) # of times in this domain sched_balance_rq() was called when the
100-
cpu was busy
101-
10) # of times in this domain sched_balance_rq() checked but found the
102-
load did not require balancing when busy
103-
11) # of times in this domain sched_balance_rq() tried to move one or
104-
more tasks and failed, when the cpu was busy
105-
12) sum of imbalances discovered (if any) with each call to
106-
sched_balance_rq() in this domain when the cpu was busy
107-
13) # of times in this domain pull_task() was called when busy
108-
14) # of times in this domain pull_task() was called even though the
109-
target task was cache-hot when busy
110-
15) # of times in this domain sched_balance_rq() was called but did not
111-
find a busier queue while the cpu was busy
112-
16) # of times in this domain a busier queue was found while the cpu
113-
was busy but no busier group was found
114136

115-
17) # of times in this domain sched_balance_rq() was called when the
116-
cpu was just becoming idle
117-
18) # of times in this domain sched_balance_rq() checked but found the
137+
23) # of times in this domain sched_balance_rq() was called when the
138+
was just becoming idle
139+
24) # of times in this domain sched_balance_rq() checked but found the
118140
load did not require balancing when the cpu was just becoming idle
119-
19) # of times in this domain sched_balance_rq() tried to move one or more
141+
25) # of times in this domain sched_balance_rq() tried to move one or more
120142
tasks and failed, when the cpu was just becoming idle
121-
20) sum of imbalances discovered (if any) with each call to
122-
sched_balance_rq() in this domain when the cpu was just becoming idle
123-
21) # of times in this domain pull_task() was called when newly idle
124-
22) # of times in this domain pull_task() was called even though the
143+
26) Total imbalance in load when the cpu was just becoming idle
144+
27) Total imbalance in utilization when the cpu was just becoming idle
145+
28) Total imbalance in number of tasks when the cpu was just becoming idle
146+
29) Total imbalance due to misfit tasks when the cpu was just becoming idle
147+
30) # of times in this domain pull_task() was called when newly idle
148+
31) # of times in this domain pull_task() was called even though the
125149
target task was cache-hot when just becoming idle
126-
23) # of times in this domain sched_balance_rq() was called but did not
150+
32) # of times in this domain sched_balance_rq() was called but did not
127151
find a busier queue while the cpu was just becoming idle
128-
24) # of times in this domain a busier queue was found while the cpu
152+
33) # of times in this domain a busier queue was found while the cpu
129153
was just becoming idle but no busier group was found
130154

131155
Next three are active_load_balance() statistics:
132156

133-
25) # of times active_load_balance() was called
134-
26) # of times active_load_balance() tried to move a task and failed
135-
27) # of times active_load_balance() successfully moved a task
157+
34) # of times active_load_balance() was called
158+
35) # of times active_load_balance() tried to move a task and failed
159+
36) # of times active_load_balance() successfully moved a task
136160

137161
Next three are sched_balance_exec() statistics:
138162

139-
28) sbe_cnt is not used
140-
29) sbe_balanced is not used
141-
30) sbe_pushed is not used
163+
37) sbe_cnt is not used
164+
38) sbe_balanced is not used
165+
39) sbe_pushed is not used
142166

143167
Next three are sched_balance_fork() statistics:
144168

145-
31) sbf_cnt is not used
146-
32) sbf_balanced is not used
147-
33) sbf_pushed is not used
169+
40) sbf_cnt is not used
170+
41) sbf_balanced is not used
171+
42) sbf_pushed is not used
148172

149173
Next three are try_to_wake_up() statistics:
150174

151-
34) # of times in this domain try_to_wake_up() awoke a task that
175+
43) # of times in this domain try_to_wake_up() awoke a task that
152176
last ran on a different cpu in this domain
153-
35) # of times in this domain try_to_wake_up() moved a task to the
177+
44) # of times in this domain try_to_wake_up() moved a task to the
154178
waking cpu because it was cache-cold on its own cpu anyway
155-
36) # of times in this domain try_to_wake_up() started passive balancing
179+
45) # of times in this domain try_to_wake_up() started passive balancing
156180

157181
/proc/<pid>/schedstat
158182
---------------------
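
The version 17 domain line documented in the diff above can be read from userspace along the following lines. This is only a sketch under stated assumptions, not code from the commit: the struct, enum, and function names (`sd_stats`, `sd_lb_field`, `parse_domain_line`, `sd_lb`, `sd_delta`) are made up for illustration, the buffer sizes are arbitrary, and `<name>` is treated as optional by counting tokens. It assumes the 45 counters follow `<cpumask>` in the documented order, with three groups of 11 load-balance counters (busy, idle, newly idle) first.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SD_NUM_FIELDS 45  /* counters after <cpumask> in version 17 */
#define SD_PER_IDLE   11  /* counters per idleness group */

/* Group order in version 17: busy, idle, newly idle. */
enum sd_idle_type { SD_BUSY, SD_IDLE, SD_NEWLY_IDLE };

/* Offsets inside one group, following items 1-11 (illustrative names). */
enum sd_lb_field {
    LB_COUNT, LB_BALANCED, LB_FAILED,
    LB_IMB_LOAD, LB_IMB_UTIL, LB_IMB_TASK, LB_IMB_MISFIT,
    LB_GAINED, LB_HOT_GAINED, LB_NOBUSYQ, LB_NOBUSYG
};

struct sd_stats {
    int level;                      /* N in domain<N> */
    char name[32];                  /* empty when the line has no <name> */
    char cpumask[256];
    unsigned int f[SD_NUM_FIELDS];  /* f[0] holds field 1 */
};

/* Parse "domain<N> [<name>] <cpumask> <45 counters>"; returns 0 on success. */
static int parse_domain_line(const char *line, struct sd_stats *out)
{
    char buf[1024];
    char *tok[SD_NUM_FIELDS + 4];   /* room for one token too many */
    size_t n = 0, i = 1;

    if (strlen(line) >= sizeof(buf))
        return -1;
    strcpy(buf, line);
    for (char *t = strtok(buf, " \t\n"); t; t = strtok(NULL, " \t\n")) {
        if (n == sizeof(tok) / sizeof(tok[0]))
            return -1;
        tok[n++] = t;
    }
    if (n == 0 || sscanf(tok[0], "domain%d", &out->level) != 1)
        return -1;

    out->name[0] = '\0';
    if (n == SD_NUM_FIELDS + 3)         /* domain<N> <name> <cpumask> ... */
        snprintf(out->name, sizeof(out->name), "%s", tok[i++]);
    else if (n != SD_NUM_FIELDS + 2)    /* domain<N> <cpumask> ... */
        return -1;
    snprintf(out->cpumask, sizeof(out->cpumask), "%s", tok[i++]);

    for (size_t k = 0; k < SD_NUM_FIELDS; k++)
        out->f[k] = (unsigned int)strtoul(tok[i + k], NULL, 10);
    return 0;
}

/* Look up one load-balance counter by idleness group and offset. */
static unsigned int sd_lb(const struct sd_stats *s, enum sd_idle_type t,
                          enum sd_lb_field f)
{
    return s->f[(size_t)t * SD_PER_IDLE + f];
}

/* Counters only increment, so activity over an interval is after - before. */
static void sd_delta(const struct sd_stats *before,
                     const struct sd_stats *after,
                     unsigned int delta[SD_NUM_FIELDS])
{
    for (size_t k = 0; k < SD_NUM_FIELDS; k++)
        delta[k] = after->f[k] - before->f[k];
}
```

A caller would typically parse the same domain line from two reads of /proc/schedstat and pass both results to `sd_delta()`. The counters are kept as `unsigned int` to mirror the `struct sched_domain` declarations in the diff, so the unsigned subtraction should also tolerate a single wrap-around where the widths match.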

include/linux/sched.h

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -913,6 +913,7 @@ struct task_struct {
913913
unsigned sched_reset_on_fork:1;
914914
unsigned sched_contributes_to_load:1;
915915
unsigned sched_migrated:1;
916+
unsigned sched_task_hot:1;
916917

917918
/* Force alignment to the next boundary: */
918919
unsigned :0;

include/linux/sched/topology.h

Lines changed: 4 additions & 9 deletions
Original file line number | Diff line number | Diff line change
@@ -114,7 +114,10 @@ struct sched_domain {
114114
unsigned int lb_count[CPU_MAX_IDLE_TYPES];
115115
unsigned int lb_failed[CPU_MAX_IDLE_TYPES];
116116
unsigned int lb_balanced[CPU_MAX_IDLE_TYPES];
117-
unsigned int lb_imbalance[CPU_MAX_IDLE_TYPES];
117+
unsigned int lb_imbalance_load[CPU_MAX_IDLE_TYPES];
118+
unsigned int lb_imbalance_util[CPU_MAX_IDLE_TYPES];
119+
unsigned int lb_imbalance_task[CPU_MAX_IDLE_TYPES];
120+
unsigned int lb_imbalance_misfit[CPU_MAX_IDLE_TYPES];
118121
unsigned int lb_gained[CPU_MAX_IDLE_TYPES];
119122
unsigned int lb_hot_gained[CPU_MAX_IDLE_TYPES];
120123
unsigned int lb_nobusyg[CPU_MAX_IDLE_TYPES];
@@ -140,9 +143,7 @@ struct sched_domain {
140143
unsigned int ttwu_move_affine;
141144
unsigned int ttwu_move_balance;
142145
#endif
143-
#ifdef CONFIG_SCHED_DEBUG
144146
char *name;
145-
#endif
146147
union {
147148
void *private; /* used during construction */
148149
struct rcu_head rcu; /* used during destruction */
@@ -202,18 +203,12 @@ struct sched_domain_topology_level {
202203
int flags;
203204
int numa_level;
204205
struct sd_data data;
205-
#ifdef CONFIG_SCHED_DEBUG
206206
char *name;
207-
#endif
208207
};
209208

210209
extern void __init set_sched_topology(struct sched_domain_topology_level *tl);
211210

212-
#ifdef CONFIG_SCHED_DEBUG
213211
# define SD_INIT_NAME(type) .name = #type
214-
#else
215-
# define SD_INIT_NAME(type)
216-
#endif
217212

218213
#else /* CONFIG_SMP */
219214
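
The now-unconditional `SD_INIT_NAME()` macro in the diff above relies on the preprocessor's stringification operator (`#`) to turn a topology level's identifier into its `name` string. The sketch below shows that mechanism in isolation. `struct topo_level` and `demo_levels` are hypothetical stand-ins invented for this example, not the kernel's `struct sched_domain_topology_level` or its default topology table; only the macro definition is taken from the diff.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same definition the diff leaves in place without the CONFIG_SCHED_DEBUG guard. */
#define SD_INIT_NAME(type) .name = #type

/* Hypothetical stand-in carrying only the members needed for the demo. */
struct topo_level {
    int flags;
    const char *name;
};

/* The macro expands to a designated initializer, e.g. .name = "SMT". */
static const struct topo_level demo_levels[] = {
    { .flags = 0, SD_INIT_NAME(SMT) },
    { .flags = 0, SD_INIT_NAME(MC)  },
};

/* Print each level name, similar in spirit to the <name> column of a domain line. */
static void print_level_names(void)
{
    for (size_t i = 0; i < sizeof(demo_levels) / sizeof(demo_levels[0]); i++)
        printf("domain%zu %s\n", i, demo_levels[i].name);
}
```

Because `#type` does not macro-expand its argument, the string is exactly the identifier as written at the use site.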
