
Commit 9b62ae5

Merge: rebase locking/futex to 6.11
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/6037
JIRA: https://issues.redhat.com/browse/RHEL-60306
Depends: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5892

since !5892 is already in readyForMerge state and backports the following
commits that would have been backported by this MR:

```
ca4bc2e locking/qspinlock: Fix 'wait_early' set but not used warning
d566c78 locking/rwsem: Clarify that RWSEM_READER_OWNED is just a hint
f22f713 locking/rwsem: Make DEBUG_RWSEMS and PREEMPT_RT mutually exclusive
d84f317 locking/mutex: split out mutex_types.h
```

Omitted-fix: 6623b02

```
locking/pvqspinlock: Correct the type of "old" variable in pv_kick_node()

[...] for LoongArch. Correct the type of "old" variable to "u8".
```

Omitted-fix: 45e15c1

```
csky: Add qspinlock support
```

Signed-off-by: Čestmír Kalina <ckalina@redhat.com>
Approved-by: Tony Camuso <tcamuso@redhat.com>
Approved-by: Phil Auld <pauld@redhat.com>
Approved-by: Rafael Aquini <raquini@redhat.com>
Approved-by: Waiman Long <longman@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>
Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>
2 parents e48532f + 6adaa4a commit 9b62ae5

29 files changed: +476, -229 lines changed

Documentation/locking/futex-requeue-pi.rst

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ Futex Requeue PI
 Requeueing of tasks from a non-PI futex to a PI futex requires
 special handling in order to ensure the underlying rt_mutex is never
 left without an owner if it has waiters; doing so would break the PI
-boosting logic [see rt-mutex-desgin.txt] For the purposes of
+boosting logic [see rt-mutex-design.rst] For the purposes of
 brevity, this action will be referred to as "requeue_pi" throughout
 this document. Priority inheritance is abbreviated throughout as
 "PI".

Documentation/locking/locktorture.rst

Lines changed: 2 additions & 2 deletions
@@ -5,7 +5,7 @@ Kernel Lock Torture Test Operation
 CONFIG_LOCK_TORTURE_TEST
 ========================
 
-The CONFIG LOCK_TORTURE_TEST config option provides a kernel module
+The CONFIG_LOCK_TORTURE_TEST config option provides a kernel module
 that runs torture tests on core kernel locking primitives. The kernel
 module, 'locktorture', may be built after the fact on the running
 kernel to be tested, if desired. The tests periodically output status
@@ -67,7 +67,7 @@ torture_type
 
   - "rtmutex_lock":
       rtmutex_lock() and rtmutex_unlock() pairs.
-      Kernel must have CONFIG_RT_MUTEX=y.
+      Kernel must have CONFIG_RT_MUTEXES=y.
 
   - "rwsem_lock":
       read/write down() and up() semaphore pairs.

Documentation/locking/locktypes.rst

Lines changed: 1 addition & 4 deletions
@@ -211,9 +211,6 @@ raw_spinlock_t and spinlock_t
 raw_spinlock_t
 --------------
 
-raw_spinlock_t is a strict spinning lock implementation regardless of the
-kernel configuration including PREEMPT_RT enabled kernels.
-
 raw_spinlock_t is a strict spinning lock implementation in all kernels,
 including PREEMPT_RT kernels. Use raw_spinlock_t only in real critical
 core code, low-level interrupt handling and places where disabling
@@ -247,7 +244,7 @@ based on rt_mutex which changes the semantics:
   Non-PREEMPT_RT kernels disable preemption to get this effect.
 
   PREEMPT_RT kernels use a per-CPU lock for serialization which keeps
-  preemption disabled. The lock disables softirq handlers and also
+  preemption enabled. The lock disables softirq handlers and also
   prevents reentrancy due to task preemption.
 
 PREEMPT_RT kernels preserve all other spinlock_t semantics:
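For context on the distinction this hunk documents, here is a minimal usage sketch; the lock names and the function are invented for illustration and are not part of this patch:

```
/* Hypothetical illustration only; not from this commit. */
#include <linux/spinlock.h>

static DEFINE_RAW_SPINLOCK(hw_reg_lock);  /* always spins, even on PREEMPT_RT */
static DEFINE_SPINLOCK(stats_lock);       /* rt_mutex based sleeping lock on PREEMPT_RT */

static unsigned long stats_counter;

static void touch_hw_and_stats(void)
{
	unsigned long flags;

	/* Low-level path: interrupts off, strictly spinning in every kernel. */
	raw_spin_lock_irqsave(&hw_reg_lock, flags);
	/* ... poke hardware registers ... */
	raw_spin_unlock_irqrestore(&hw_reg_lock, flags);

	/*
	 * Ordinary data path: on PREEMPT_RT this may sleep and preemption
	 * stays enabled; on !PREEMPT_RT it spins with preemption disabled.
	 */
	spin_lock(&stats_lock);
	stats_counter++;
	spin_unlock(&stats_lock);
}
```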

Documentation/locking/mutex-design.rst

Lines changed: 18 additions & 6 deletions
@@ -101,12 +101,24 @@ features that make lock debugging easier and faster:
  - Detects multi-task circular deadlocks and prints out all affected
    locks and tasks (and only those tasks).
 
-Releasing a mutex is not an atomic operation: Once a mutex release operation
-has begun, another context may be able to acquire the mutex before the release
-operation has fully completed. The mutex user must ensure that the mutex is not
-destroyed while a release operation is still in progress - in other words,
-callers of mutex_unlock() must ensure that the mutex stays alive until
-mutex_unlock() has returned.
+Mutexes - and most other sleeping locks like rwsems - do not provide an
+implicit reference for the memory they occupy, which reference is released
+with mutex_unlock().
+
+[ This is in contrast with spin_unlock() [or completion_done()], which
+  APIs can be used to guarantee that the memory is not touched by the
+  lock implementation after spin_unlock()/completion_done() releases
+  the lock. ]
+
+mutex_unlock() may access the mutex structure even after it has internally
+released the lock already - so it's not safe for another context to
+acquire the mutex and assume that the mutex_unlock() context is not using
+the structure anymore.
+
+The mutex user must ensure that the mutex is not destroyed while a
+release operation is still in progress - in other words, callers of
+mutex_unlock() must ensure that the mutex stays alive until mutex_unlock()
+has returned.
 
 Interfaces
 ----------
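To make the rewritten paragraphs concrete, here is a minimal sketch of the race they warn about; struct foo and both helpers are hypothetical and not taken from the kernel tree:

```
/* Hypothetical illustration of the documented hazard; not part of this commit. */
#include <linux/mutex.h>
#include <linux/slab.h>

struct foo {
	struct mutex lock;
	bool dead;
};

/* Context A: marks the object dead and drops the lock. */
static void foo_shutdown(struct foo *f)
{
	mutex_lock(&f->lock);
	f->dead = true;
	mutex_unlock(&f->lock);	/* may still touch f->lock internally after the lock becomes available */
}

/* Context B: BROKEN - may destroy the mutex while A is still inside mutex_unlock(). */
static void foo_reap(struct foo *f)
{
	mutex_lock(&f->lock);	/* can succeed before A's mutex_unlock() has returned */
	if (f->dead) {
		mutex_unlock(&f->lock);
		kfree(f);	/* frees the mutex that A may still be touching */
		return;
	}
	mutex_unlock(&f->lock);
}
```

A common way out is to hold a reference (for example a kref) across the unlock, or to defer freeing through another mechanism, so the structure outlives the mutex_unlock() call in every context.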

Documentation/locking/ww-mutex-design.rst

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 Wound/Wait Deadlock-Proof Mutex Design
 ======================================
 
-Please read mutex-design.txt first, as it applies to wait/wound mutexes too.
+Please read mutex-design.rst first, as it applies to wait/wound mutexes too.
 
 Motivation for WW-Mutexes
 -------------------------

include/asm-generic/qrwlock.h

Lines changed: 4 additions & 0 deletions
@@ -2,6 +2,10 @@
 /*
  * Queue read/write lock
  *
+ * These use generic atomic and locking routines, but depend on a fair spinlock
+ * implementation in order to be fair themselves. The implementation in
+ * asm-generic/spinlock.h meets these requirements.
+ *
  * (C) Copyright 2013-2014 Hewlett-Packard Development Company, L.P.
  *
  * Authors: Waiman Long <waiman.long@hp.com>

include/asm-generic/qspinlock.h

Lines changed: 30 additions & 1 deletion
@@ -2,6 +2,35 @@
 /*
  * Queued spinlock
  *
+ * A 'generic' spinlock implementation that is based on MCS locks. For an
+ * architecture that's looking for a 'generic' spinlock, please first consider
+ * ticket-lock.h and only come looking here when you've considered all the
+ * constraints below and can show your hardware does actually perform better
+ * with qspinlock.
+ *
+ * qspinlock relies on atomic_*_release()/atomic_*_acquire() to be RCsc (or no
+ * weaker than RCtso if you're power), where regular code only expects atomic_t
+ * to be RCpc.
+ *
+ * qspinlock relies on a far greater (compared to asm-generic/spinlock.h) set
+ * of atomic operations to behave well together, please audit them carefully to
+ * ensure they all have forward progress. Many atomic operations may default to
+ * cmpxchg() loops which will not have good forward progress properties on
+ * LL/SC architectures.
+ *
+ * One notable example is atomic_fetch_or_acquire(), which x86 cannot (cheaply)
+ * do. Carefully read the patches that introduced
+ * queued_fetch_set_pending_acquire().
+ *
+ * qspinlock also heavily relies on mixed size atomic operations, in specific
+ * it requires architectures to have xchg16; something which many LL/SC
+ * architectures need to implement as a 32bit and+or in order to satisfy the
+ * forward progress guarantees mentioned above.
+ *
+ * Further reading on mixed size atomics that might be relevant:
+ *
+ *   http://www.cl.cam.ac.uk/~pes20/popl17/mixed-size.pdf
+ *
  * (C) Copyright 2013-2015 Hewlett-Packard Development Company, L.P.
  * (C) Copyright 2015 Hewlett-Packard Enterprise Development LP
  *
@@ -41,7 +70,7 @@ static __always_inline int queued_spin_is_locked(struct qspinlock *lock)
  */
 static __always_inline int queued_spin_value_unlocked(struct qspinlock lock)
 {
-	return !atomic_read(&lock.val);
+	return !lock.val.counter;
 }
 
 /**

include/asm-generic/spinlock.h

Lines changed: 89 additions & 7 deletions
@@ -1,12 +1,94 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-#ifndef __ASM_GENERIC_SPINLOCK_H
-#define __ASM_GENERIC_SPINLOCK_H
+
 /*
- * You need to implement asm/spinlock.h for SMP support. The generic
- * version does not handle SMP.
+ * 'Generic' ticket-lock implementation.
+ *
+ * It relies on atomic_fetch_add() having well defined forward progress
+ * guarantees under contention. If your architecture cannot provide this, stick
+ * to a test-and-set lock.
+ *
+ * It also relies on atomic_fetch_add() being safe vs smp_store_release() on a
+ * sub-word of the value. This is generally true for anything LL/SC although
+ * you'd be hard pressed to find anything useful in architecture specifications
+ * about this. If your architecture cannot do this you might be better off with
+ * a test-and-set.
+ *
+ * It further assumes atomic_*_release() + atomic_*_acquire() is RCpc and hence
+ * uses atomic_fetch_add() which is RCsc to create an RCsc hot path, along with
+ * a full fence after the spin to upgrade the otherwise-RCpc
+ * atomic_cond_read_acquire().
+ *
+ * The implementation uses smp_cond_load_acquire() to spin, so if the
+ * architecture has WFE like instructions to sleep instead of poll for word
+ * modifications be sure to implement that (see ARM64 for example).
+ *
  */
-#ifdef CONFIG_SMP
-#error need an architecture specific asm/spinlock.h
-#endif
+
+#ifndef __ASM_GENERIC_SPINLOCK_H
+#define __ASM_GENERIC_SPINLOCK_H
+
+#include <linux/atomic.h>
+#include <asm-generic/spinlock_types.h>
+
+static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
+{
+	u32 val = atomic_fetch_add(1<<16, lock);
+	u16 ticket = val >> 16;
+
+	if (ticket == (u16)val)
+		return;
+
+	/*
+	 * atomic_cond_read_acquire() is RCpc, but rather than defining a
+	 * custom cond_read_rcsc() here we just emit a full fence. We only
+	 * need the prior reads before subsequent writes ordering from
+	 * smb_mb(), but as atomic_cond_read_acquire() just emits reads and we
+	 * have no outstanding writes due to the atomic_fetch_add() the extra
+	 * orderings are free.
+	 */
+	atomic_cond_read_acquire(lock, ticket == (u16)VAL);
+	smp_mb();
+}
+
+static __always_inline bool arch_spin_trylock(arch_spinlock_t *lock)
+{
+	u32 old = atomic_read(lock);
+
+	if ((old >> 16) != (old & 0xffff))
+		return false;
+
+	return atomic_try_cmpxchg(lock, &old, old + (1<<16)); /* SC, for RCsc */
+}
+
+static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
+{
+	u16 *ptr = (u16 *)lock + IS_ENABLED(CONFIG_CPU_BIG_ENDIAN);
+	u32 val = atomic_read(lock);
+
+	smp_store_release(ptr, (u16)val + 1);
+}
+
+static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
+{
+	u32 val = lock.counter;
+
+	return ((val >> 16) == (val & 0xffff));
+}
+
+static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
+{
+	arch_spinlock_t val = READ_ONCE(*lock);
+
+	return !arch_spin_value_unlocked(val);
+}
+
+static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
+{
+	u32 val = atomic_read(lock);
+
+	return (s16)((val >> 16) - (val & 0xffff)) > 1;
+}
+
+#include <asm/qrwlock.h>
 
 #endif /* __ASM_GENERIC_SPINLOCK_H */
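As a sanity check on the encoding used above - the upper 16 bits hold the 'next' ticket handed to arrivals, the lower 16 bits the ticket currently being served - here is a tiny standalone illustration of the arithmetic; it is plain userspace C for demonstration only and not part of the patch:

```
#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint32_t lock = 0;		/* next = 0, owner = 0: unlocked */

	lock += 1u << 16;		/* CPU0: fetch_add, takes ticket 0 and owns the lock */
	lock += 1u << 16;		/* CPU1: fetch_add, takes ticket 1 and must spin */

	printf("next=%u owner=%u locked=%d\n",
	       (unsigned)(lock >> 16), (unsigned)(lock & 0xffff),
	       (lock >> 16) != (lock & 0xffff));

	/* CPU0 unlocks: the owner half is bumped by one, the next half is untouched. */
	lock = (lock & 0xffff0000u) | (uint16_t)((lock & 0xffff) + 1);

	printf("next=%u owner=%u\n",
	       (unsigned)(lock >> 16), (unsigned)(lock & 0xffff));	/* CPU1 now owns it */
	return 0;
}
```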

include/asm-generic/spinlock_types.h

Lines changed: 17 additions & 0 deletions

@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ASM_GENERIC_SPINLOCK_TYPES_H
+#define __ASM_GENERIC_SPINLOCK_TYPES_H
+
+#include <linux/types.h>
+typedef atomic_t arch_spinlock_t;
+
+/*
+ * qrwlock_types depends on arch_spinlock_t, so we must typedef that before the
+ * include.
+ */
+#include <asm/qrwlock_types.h>
+
+#define __ARCH_SPIN_LOCK_UNLOCKED	ATOMIC_INIT(0)
+
+#endif /* __ASM_GENERIC_SPINLOCK_TYPES_H */
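For orientation, this is roughly how an architecture would consume the new generic headers; 'foo' is a made-up architecture and the wrapper below is only a sketch under that assumption, not something added by this commit:

```
/* arch/foo/include/asm/spinlock.h -- hypothetical wrapper, for illustration only */
#ifndef __ASM_FOO_SPINLOCK_H
#define __ASM_FOO_SPINLOCK_H

/*
 * Pull in the generic ticket lock, which provides arch_spin_lock() and
 * friends and, via asm/qrwlock.h, the queued rwlock built on top of it.
 * The matching asm/spinlock_types.h would include
 * asm-generic/spinlock_types.h, so arch_spinlock_t is the atomic_t
 * typedef and is declared before qrwlock_types.h is pulled in.
 */
#include <asm-generic/spinlock.h>

#endif /* __ASM_FOO_SPINLOCK_H */
```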

include/linux/lockdep.h

Lines changed: 19 additions & 57 deletions
@@ -82,63 +82,6 @@ struct lock_chain {
 	u64				chain_key;
 };
 
-#define MAX_LOCKDEP_KEYS_BITS		13
-#define MAX_LOCKDEP_KEYS		(1UL << MAX_LOCKDEP_KEYS_BITS)
-#define INITIAL_CHAIN_KEY		-1
-
-struct held_lock {
-	/*
-	 * One-way hash of the dependency chain up to this point. We
-	 * hash the hashes step by step as the dependency chain grows.
-	 *
-	 * We use it for dependency-caching and we skip detection
-	 * passes and dependency-updates if there is a cache-hit, so
-	 * it is absolutely critical for 100% coverage of the validator
-	 * to have a unique key value for every unique dependency path
-	 * that can occur in the system, to make a unique hash value
-	 * as likely as possible - hence the 64-bit width.
-	 *
-	 * The task struct holds the current hash value (initialized
-	 * with zero), here we store the previous hash value:
-	 */
-	u64				prev_chain_key;
-	unsigned long			acquire_ip;
-	struct lockdep_map		*instance;
-	struct lockdep_map		*nest_lock;
-#ifdef CONFIG_LOCK_STAT
-	u64				waittime_stamp;
-	u64				holdtime_stamp;
-#endif
-	/*
-	 * class_idx is zero-indexed; it points to the element in
-	 * lock_classes this held lock instance belongs to. class_idx is in
-	 * the range from 0 to (MAX_LOCKDEP_KEYS-1) inclusive.
-	 */
-	unsigned int			class_idx:MAX_LOCKDEP_KEYS_BITS;
-	/*
-	 * The lock-stack is unified in that the lock chains of interrupt
-	 * contexts nest ontop of process context chains, but we 'separate'
-	 * the hashes by starting with 0 if we cross into an interrupt
-	 * context, and we also keep do not add cross-context lock
-	 * dependencies - the lock usage graph walking covers that area
-	 * anyway, and we'd just unnecessarily increase the number of
-	 * dependencies otherwise. [Note: hardirq and softirq contexts
-	 * are separated from each other too.]
-	 *
-	 * The following field is used to detect when we cross into an
-	 * interrupt context:
-	 */
-	unsigned int irq_context:2; /* bit 0 - soft, bit 1 - hard */
-	unsigned int trylock:1;		/* 16 bits */
-
-	unsigned int read:2;		/* see lock_acquire() comment */
-	unsigned int check:1;		/* see lock_acquire() comment */
-	unsigned int hardirqs_off:1;
-	unsigned int sync:1;
-	unsigned int references:11;	/* 32 bits */
-	unsigned int pin_count;
-};
-
 /*
  * Initialization, self-test and debugging-output methods:
  */
@@ -235,9 +178,27 @@ static inline void lockdep_init_map(struct lockdep_map *lock, const char *name,
 			      (lock)->dep_map.wait_type_outer, \
 			      (lock)->dep_map.lock_type)
 
+/**
+ * lockdep_set_novalidate_class: disable checking of lock ordering on a given
+ * lock
+ * @lock: Lock to mark
+ *
+ * Lockdep will still record that this lock has been taken, and print held
+ * instances when dumping locks
+ */
 #define lockdep_set_novalidate_class(lock) \
 	lockdep_set_class_and_name(lock, &__lockdep_no_validate__, #lock)
 
+/**
+ * lockdep_set_notrack_class: disable lockdep tracking of a given lock entirely
+ * @lock: Lock to mark
+ *
+ * Bigger hammer than lockdep_set_novalidate_class: so far just for bcachefs,
+ * which takes more locks than lockdep is able to track (48).
+ */
+#define lockdep_set_notrack_class(lock) \
+	lockdep_set_class_and_name(lock, &__lockdep_no_track__, #lock)
+
 /*
  * Compare locking classes
  */
@@ -395,6 +356,7 @@ static inline void lockdep_set_selftest_task(struct task_struct *task)
 #define lockdep_set_subclass(lock, sub) do { } while (0)
 
 #define lockdep_set_novalidate_class(lock) do { } while (0)
+#define lockdep_set_notrack_class(lock) do { } while (0)
 
 /*
  * We don't define lockdep_match_class() and lockdep_match_key() for !LOCKDEP
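The two annotations documented above are applied to an already initialized lock; a short hedged sketch of typical usage follows (the mutex and function names are invented for the example):

```
/* Hypothetical illustration only; not part of this commit. */
#include <linux/lockdep.h>
#include <linux/mutex.h>

static struct mutex fw_cmd_lock;

static void fw_init_locks(void)
{
	mutex_init(&fw_cmd_lock);

	/*
	 * Lock ordering involving this lock is not validated, but lockdep
	 * still records it and prints it among held locks when dumping.
	 */
	lockdep_set_novalidate_class(&fw_cmd_lock);

	/*
	 * The bigger hammer would be lockdep_set_notrack_class(&fw_cmd_lock),
	 * which drops the lock from lockdep tracking entirely.
	 */
}
```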
