Skip to content

Commit 8446ff5

Browse files
fdmananagregkh
authored andcommitted
btrfs: avoid load/store tearing races when checking if an inode was logged
[ Upstream commit 986bf6e ] At inode_logged() we do a couple lockless checks for ->logged_trans, and these are generally safe except the second one in case we get a load or store tearing due to a concurrent call updating ->logged_trans (either at btrfs_log_inode() or later at inode_logged()). In the first case it's safe to compare to the current transaction ID since once ->logged_trans is set the current transaction, we never set it to a lower value. In the second case, where we check if it's greater than zero, we are prone to load/store tearing races, since we can have a concurrent task updating to the current transaction ID with store tearing for example, instead of updating with a single 64 bits write, to update with two 32 bits writes or four 16 bits writes. In that case the reading side at inode_logged() could see a positive value that does not match the current transaction and then return a false negative. Fix this by doing the second check while holding the inode's spinlock, add some comments about it too. Also add the data_race() annotation to the first check to avoid any reports from KCSAN (or similar tools) and comment about it. Fixes: 0f8ce49 ("btrfs: avoid inode logging during rename and link when possible") Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
1 parent 3d9c5e1 commit 8446ff5

File tree

1 file changed

+21
-4
lines changed

1 file changed

+21
-4
lines changed

fs/btrfs/tree-log.c

Lines changed: 21 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -3364,15 +3364,32 @@ static int inode_logged(const struct btrfs_trans_handle *trans,
33643364
struct btrfs_key key;
33653365
int ret;
33663366

3367-
if (inode->logged_trans == trans->transid)
3367+
/*
3368+
* Quick lockless call, since once ->logged_trans is set to the current
3369+
* transaction, we never set it to a lower value anywhere else.
3370+
*/
3371+
if (data_race(inode->logged_trans) == trans->transid)
33683372
return 1;
33693373

33703374
/*
3371-
* If logged_trans is not 0, then we know the inode logged was not logged
3372-
* in this transaction, so we can return false right away.
3375+
* If logged_trans is not 0 and not trans->transid, then we know the
3376+
* inode was not logged in this transaction, so we can return false
3377+
* right away. We take the lock to avoid a race caused by load/store
3378+
* tearing with a concurrent btrfs_log_inode() call or a concurrent task
3379+
* in this function further below - an update to trans->transid can be
3380+
* teared into two 32 bits updates for example, in which case we could
3381+
* see a positive value that is not trans->transid and assume the inode
3382+
* was not logged when it was.
33733383
*/
3374-
if (inode->logged_trans > 0)
3384+
spin_lock(&inode->lock);
3385+
if (inode->logged_trans == trans->transid) {
3386+
spin_unlock(&inode->lock);
3387+
return 1;
3388+
} else if (inode->logged_trans > 0) {
3389+
spin_unlock(&inode->lock);
33753390
return 0;
3391+
}
3392+
spin_unlock(&inode->lock);
33763393

33773394
/*
33783395
* If no log tree was created for this root in this transaction, then

0 commit comments

Comments
 (0)