Commit 08637d7

Revert "Merge: cgroup: Backport upstream cgroup commits up to v6.8"
This reverts merge request !4128
1 parent 7148081 commit 08637d7

49 files changed, 660 insertions(+), 740 deletions(-)

Documentation/admin-guide/cgroup-v2.rst

Lines changed: 0 additions & 17 deletions
@@ -983,23 +983,6 @@ All cgroup core files are prefixed with "cgroup."
 	killing cgroups is a process directed operation, i.e. it affects
 	the whole thread-group.

-  cgroup.pressure
-	A read-write single value file that allowed values are "0" and "1".
-	The default is "1".
-
-	Writing "0" to the file will disable the cgroup PSI accounting.
-	Writing "1" to the file will re-enable the cgroup PSI accounting.
-
-	This control attribute is not hierarchical, so disable or enable PSI
-	accounting in a cgroup does not affect PSI accounting in descendants
-	and doesn't need pass enablement via ancestors from root.
-
-	The reason this control attribute exists is that PSI accounts stalls for
-	each cgroup separately and aggregates it at each level of the hierarchy.
-	This may cause non-negligible overhead for some workloads when under
-	deep level of the hierarchy, in which case this control attribute can
-	be used to disable PSI accounting in the non-leaf cgroups.
-
   irq.pressure
 	A read-write nested-keyed file.
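
For readers unfamiliar with the interface whose documentation is removed above: on a kernel that provides cgroup.pressure, toggling PSI accounting amounts to writing a single character to that file. The following is a minimal user-space sketch, not part of this commit. The helper name set_cgroup_psi() is an assumption made for illustration, and after this revert the file is not expected to exist, in which case the call simply fails.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Write "1" (enable) or "0" (disable) to <cgroup_dir>/cgroup.pressure.
 * set_cgroup_psi() is a hypothetical helper, not a kernel or libc API.
 * Returns 0 on success and -1 on any failure.
 */
int set_cgroup_psi(const char *cgroup_dir, int enable)
{
	char path[4096];
	int n = snprintf(path, sizeof(path), "%s/cgroup.pressure", cgroup_dir);

	/* Reject paths that did not fit into the buffer. */
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	/* No O_CREAT: the file exists only if the kernel provides it. */
	int fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	ssize_t written = write(fd, enable ? "1" : "0", 1);
	int closed = close(fd);

	return (written == 1 && closed == 0) ? 0 : -1;
}
```

A caller might use set_cgroup_psi("/sys/fs/cgroup/example", 0) to disable accounting in a non-leaf cgroup, where /sys/fs/cgroup/example is a hypothetical cgroup directory. Because the attribute is documented as non-hierarchical, each cgroup would need its own call.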

Documentation/power/freezing-of-tasks.rst

Lines changed: 37 additions & 48 deletions
@@ -14,28 +14,27 @@ architectures).
 II. How does it work?
 =====================

-There is one per-task flag (PF_NOFREEZE) and three per-task states
-(TASK_FROZEN, TASK_FREEZABLE and __TASK_FREEZABLE_UNSAFE) used for that.
-The tasks that have PF_NOFREEZE unset (all user space tasks and some kernel
-threads) are regarded as 'freezable' and treated in a special way before the
-system enters a sleep state as well as before a hibernation image is created
-(hibernation is directly covered by what follows, but the description applies
-to system-wide suspend too).
+There are three per-task flags used for that, PF_NOFREEZE, PF_FROZEN
+and PF_FREEZER_SKIP (the last one is auxiliary). The tasks that have
+PF_NOFREEZE unset (all user space processes and some kernel threads) are
+regarded as 'freezable' and treated in a special way before the system enters a
+suspend state as well as before a hibernation image is created (in what follows
+we only consider hibernation, but the description also applies to suspend).

 Namely, as the first step of the hibernation procedure the function
 freeze_processes() (defined in kernel/power/process.c) is called. A system-wide
-static key freezer_active (as opposed to a per-task flag or state) is used to
-indicate whether the system is to undergo a freezing operation. And
-freeze_processes() sets this static key. After this, it executes
-try_to_freeze_tasks() that sends a fake signal to all user space processes, and
-wakes up all the kernel threads. All freezable tasks must react to that by
-calling try_to_freeze(), which results in a call to __refrigerator() (defined
-in kernel/freezer.c), which changes the task's state to TASK_FROZEN, and makes
-it loop until it is woken by an explicit TASK_FROZEN wakeup. Then, that task
-is regarded as 'frozen' and so the set of functions handling this mechanism is
-referred to as 'the freezer' (these functions are defined in
-kernel/power/process.c, kernel/freezer.c & include/linux/freezer.h). User space
-tasks are generally frozen before kernel threads.
+variable system_freezing_cnt (as opposed to a per-task flag) is used to indicate
+whether the system is to undergo a freezing operation. And freeze_processes()
+sets this variable. After this, it executes try_to_freeze_tasks() that sends a
+fake signal to all user space processes, and wakes up all the kernel threads.
+All freezable tasks must react to that by calling try_to_freeze(), which
+results in a call to __refrigerator() (defined in kernel/freezer.c), which sets
+the task's PF_FROZEN flag, changes its state to TASK_UNINTERRUPTIBLE and makes
+it loop until PF_FROZEN is cleared for it. Then, we say that the task is
+'frozen' and therefore the set of functions handling this mechanism is referred
+to as 'the freezer' (these functions are defined in kernel/power/process.c,
+kernel/freezer.c & include/linux/freezer.h). User space processes are generally
+frozen before kernel threads.

 __refrigerator() must not be called directly. Instead, use the
 try_to_freeze() function (defined in include/linux/freezer.h), that checks
@@ -44,40 +43,31 @@ if the task is to be frozen and makes the task enter __refrigerator().
 For user space processes try_to_freeze() is called automatically from the
 signal-handling code, but the freezable kernel threads need to call it
 explicitly in suitable places or use the wait_event_freezable() or
-wait_event_freezable_timeout() macros (defined in include/linux/wait.h)
-that put the task to sleep (TASK_INTERRUPTIBLE) or freeze it (TASK_FROZEN) if
-freezer_active is set. The main loop of a freezable kernel thread may look
+wait_event_freezable_timeout() macros (defined in include/linux/freezer.h)
+that combine interruptible sleep with checking if the task is to be frozen and
+calling try_to_freeze(). The main loop of a freezable kernel thread may look
 like the following one::

 	set_freezable();
-
-	while (true) {
-		struct task_struct *tsk = NULL;
-
-		wait_event_freezable(oom_reaper_wait, oom_reaper_list != NULL);
-		spin_lock_irq(&oom_reaper_lock);
-		if (oom_reaper_list != NULL) {
-			tsk = oom_reaper_list;
-			oom_reaper_list = tsk->oom_reaper_list;
-		}
-		spin_unlock_irq(&oom_reaper_lock);
-
-		if (tsk)
-			oom_reap_task(tsk);
-	}
-
-	(from mm/oom_kill.c::oom_reaper()).
-
-If a freezable kernel thread is not put to the frozen state after the freezer
-has initiated a freezing operation, the freezing of tasks will fail and the
-entire system-wide transition will be cancelled. For this reason, freezable
-kernel threads must call try_to_freeze() somewhere or use one of the
+	do {
+		hub_events();
+		wait_event_freezable(khubd_wait,
+				!list_empty(&hub_event_list) ||
+				kthread_should_stop());
+	} while (!kthread_should_stop() || !list_empty(&hub_event_list));
+
+	(from drivers/usb/core/hub.c::hub_thread()).
+
+If a freezable kernel thread fails to call try_to_freeze() after the freezer has
+initiated a freezing operation, the freezing of tasks will fail and the entire
+hibernation operation will be cancelled. For this reason, freezable kernel
+threads must call try_to_freeze() somewhere or use one of the
 wait_event_freezable() and wait_event_freezable_timeout() macros.

 After the system memory state has been restored from a hibernation image and
 devices have been reinitialized, the function thaw_processes() is called in
-order to wake up each frozen task. Then, the tasks that have been frozen leave
-__refrigerator() and continue running.
+order to clear the PF_FROZEN flag for each frozen task. Then, the tasks that
+have been frozen leave __refrigerator() and continue running.


 Rationale behind the functions dealing with freezing and thawing of tasks
@@ -106,8 +96,7 @@ III. Which kernel threads are freezable?
 Kernel threads are not freezable by default. However, a kernel thread may clear
 PF_NOFREEZE for itself by calling set_freezable() (the resetting of PF_NOFREEZE
 directly is not allowed). From this point it is regarded as freezable
-and must call try_to_freeze() or variants of wait_event_freezable() in a
-suitable place.
+and must call try_to_freeze() in a suitable place.

 IV. Why do we do that?
 ======================

drivers/android/binder.c

Lines changed: 3 additions & 1 deletion
@@ -3714,9 +3714,10 @@ static int binder_wait_for_work(struct binder_thread *thread,
 	struct binder_proc *proc = thread->proc;
 	int ret = 0;

+	freezer_do_not_count();
 	binder_inner_proc_lock(proc);
 	for (;;) {
-		prepare_to_wait(&thread->wait, &wait, TASK_INTERRUPTIBLE|TASK_FREEZABLE);
+		prepare_to_wait(&thread->wait, &wait, TASK_INTERRUPTIBLE);
 		if (binder_has_work_ilocked(thread, do_proc_work))
 			break;
 		if (do_proc_work)
@@ -3733,6 +3734,7 @@ static int binder_wait_for_work(struct binder_thread *thread,
 	}
 	finish_wait(&thread->wait, &wait);
 	binder_inner_proc_unlock(proc);
+	freezer_count();

 	return ret;
 }

drivers/media/pci/pt3/pt3.c

Lines changed: 2 additions & 2 deletions
@@ -445,8 +445,8 @@ static int pt3_fetch_thread(void *data)
 		pt3_proc_dma(adap);

 		delay = ktime_set(0, PT3_FETCH_DELAY * NSEC_PER_MSEC);
-		set_current_state(TASK_UNINTERRUPTIBLE|TASK_FREEZABLE);
-		schedule_hrtimeout_range(&delay,
+		set_current_state(TASK_UNINTERRUPTIBLE);
+		freezable_schedule_hrtimeout_range(&delay,
 					PT3_FETCH_DELAY_DELTA * NSEC_PER_MSEC,
 					HRTIMER_MODE_REL);
 	}

fs/coredump.c

Lines changed: 3 additions & 2 deletions
@@ -403,8 +403,9 @@ static int coredump_wait(int exit_code, struct core_state *core_state)
 	if (core_waiters > 0) {
 		struct core_thread *ptr;

-		wait_for_completion_state(&core_state->startup,
-					  TASK_UNINTERRUPTIBLE|TASK_FREEZABLE);
+		freezer_do_not_count();
+		wait_for_completion(&core_state->startup);
+		freezer_count();
 		/*
 		 * Wait for all the threads to become inactive, so that
 		 * all the thread context (extended register state, like

fs/nfs/file.c

Lines changed: 1 addition & 2 deletions
@@ -578,8 +578,7 @@ static vm_fault_t nfs_vm_page_mkwrite(struct vm_fault *vmf)
 	}

 	wait_on_bit_action(&NFS_I(inode)->flags, NFS_INO_INVALIDATING,
-			   nfs_wait_bit_killable,
-			   TASK_KILLABLE|TASK_FREEZABLE_UNSAFE);
+			   nfs_wait_bit_killable, TASK_KILLABLE);

 	folio_lock(folio);
 	mapping = folio_file_mapping(folio);

fs/nfs/inode.c

Lines changed: 8 additions & 4 deletions
@@ -72,13 +72,18 @@ nfs_fattr_to_ino_t(struct nfs_fattr *fattr)
 	return nfs_fileid_to_ino_t(fattr->fileid);
 }

-int nfs_wait_bit_killable(struct wait_bit_key *key, int mode)
+static int nfs_wait_killable(int mode)
 {
-	schedule();
+	freezable_schedule_unsafe();
 	if (signal_pending_state(mode, current))
 		return -ERESTARTSYS;
 	return 0;
 }
+
+int nfs_wait_bit_killable(struct wait_bit_key *key, int mode)
+{
+	return nfs_wait_killable(mode);
+}
 EXPORT_SYMBOL_GPL(nfs_wait_bit_killable);

 /**
@@ -1338,8 +1343,7 @@ int nfs_clear_invalid_mapping(struct address_space *mapping)
 	 */
 	for (;;) {
 		ret = wait_on_bit_action(bitlock, NFS_INO_INVALIDATING,
-					 nfs_wait_bit_killable,
-					 TASK_KILLABLE|TASK_FREEZABLE_UNSAFE);
+					 nfs_wait_bit_killable, TASK_KILLABLE);
 		if (ret)
 			goto out;
 		spin_lock(&inode->i_lock);

fs/nfs/nfs3proc.c

Lines changed: 1 addition & 2 deletions
@@ -36,8 +36,7 @@ nfs3_rpc_wrapper(struct rpc_clnt *clnt, struct rpc_message *msg, int flags)
 		res = rpc_call_sync(clnt, msg, flags);
 		if (res != -EJUKEBOX)
 			break;
-		__set_current_state(TASK_KILLABLE|TASK_FREEZABLE_UNSAFE);
-		schedule_timeout(NFS_JUKEBOX_RETRY_TIME);
+		freezable_schedule_timeout_killable_unsafe(NFS_JUKEBOX_RETRY_TIME);
 		res = -ERESTARTSYS;
 	} while (!fatal_signal_pending(current));
 	return res;

fs/nfs/nfs4proc.c

Lines changed: 7 additions & 7 deletions
@@ -421,8 +421,8 @@ static int nfs4_delay_killable(long *timeout)
 {
 	might_sleep();

-	__set_current_state(TASK_KILLABLE|TASK_FREEZABLE_UNSAFE);
-	schedule_timeout(nfs4_update_delay(timeout));
+	freezable_schedule_timeout_killable_unsafe(
+		nfs4_update_delay(timeout));
 	if (!__fatal_signal_pending(current))
 		return 0;
 	return -EINTR;
@@ -432,8 +432,7 @@ static int nfs4_delay_interruptible(long *timeout)
 {
 	might_sleep();

-	__set_current_state(TASK_INTERRUPTIBLE|TASK_FREEZABLE_UNSAFE);
-	schedule_timeout(nfs4_update_delay(timeout));
+	freezable_schedule_timeout_interruptible_unsafe(nfs4_update_delay(timeout));
 	if (!signal_pending(current))
 		return 0;
 	return __fatal_signal_pending(current) ? -EINTR :-ERESTARTSYS;
@@ -7428,8 +7427,7 @@ nfs4_retry_setlk_simple(struct nfs4_state *state, int cmd,
 		status = nfs4_proc_setlk(state, cmd, request);
 		if ((status != -EAGAIN) || IS_SETLK(cmd))
 			break;
-		__set_current_state(TASK_INTERRUPTIBLE|TASK_FREEZABLE);
-		schedule_timeout(timeout);
+		freezable_schedule_timeout_interruptible(timeout);
 		timeout *= 2;
 		timeout = min_t(unsigned long, NFS4_LOCK_MAXTIMEOUT, timeout);
 		status = -ERESTARTSYS;
@@ -7497,8 +7495,10 @@ nfs4_retry_setlk(struct nfs4_state *state, int cmd, struct file_lock *request)
 			break;

 		status = -ERESTARTSYS;
-		wait_woken(&waiter.wait, TASK_INTERRUPTIBLE|TASK_FREEZABLE,
+		freezer_do_not_count();
+		wait_woken(&waiter.wait, TASK_INTERRUPTIBLE,
 			   NFS4_LOCK_MAXTIMEOUT);
+		freezer_count();
 	} while (!signalled());

 	remove_wait_queue(q, &waiter.wait);

fs/nfs/nfs4state.c

Lines changed: 1 addition & 2 deletions
@@ -1317,8 +1317,7 @@ int nfs4_wait_clnt_recover(struct nfs_client *clp)

 	refcount_inc(&clp->cl_count);
 	res = wait_on_bit_action(&clp->cl_state, NFS4CLNT_MANAGER_RUNNING,
-				 nfs_wait_bit_killable,
-				 TASK_KILLABLE|TASK_FREEZABLE_UNSAFE);
+				 nfs_wait_bit_killable, TASK_KILLABLE);
 	if (res)
 		goto out;
 	if (clp->cl_cons_state < 0)
