Skip to content

Conversation

@jjezra
Copy link
Contributor

@jjezra jjezra commented Oct 31, 2025

FDBDirectory: When file lock clear is called in parallel by multiple threads, synchronically waiting for the first thread's clear to complete seems to be a better strategy.

Resolves #3713

@jjezra jjezra requested a review from ohadzeliger October 31, 2025 18:32
@jjezra jjezra added the bug fix Change that fixes a bug label Oct 31, 2025
@jjezra jjezra marked this pull request as ready for review November 3, 2025 13:42
@ScottDugas ScottDugas requested review from ScottDugas and removed request for ScottDugas November 3, 2025 17:50
@jjezra jjezra requested a review from ScottDugas November 3, 2025 17:53

private void fileLockClearFlushAndClose(boolean isRecovery) {
synchronized (clearingLockNowLock) {
// Here: this function is being called from too many paths. Until cleanup, this guard is here to avoid recursions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If indeed all that the synchronized block is trying to do is protect a boolean, then an AtomicBoolean can do the job. It seems as if we are not trying to create a totally exclusive code flow but just prevent "too many" runs of the same methods, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to AtomicBoolean.
The code supposed to prevent concurrent lock clears, which seems to be caused by concurrent, possibly recursive, calls to directory close.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had to undo the AtomicBoolean use as some tests were failing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious: Which tests failed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, but it might have been testMultiClose.

  FDBDirectory: When file lock clear is called in parallel by multiple threads, synchronically waiting for the first thread's clear to complete seems to be a better strategy.

  Resolves FoundationDB#3713
@jjezra jjezra force-pushed the lucene_block_parallel_lock_clear branch from 9b5b678 to 62dfa1b Compare November 11, 2025 17:12
@jjezra jjezra requested a review from ohadzeliger November 11, 2025 17:13
@jjezra jjezra force-pushed the lucene_block_parallel_lock_clear branch from 62dfa1b to e81f7d2 Compare November 11, 2025 18:28
closed = flushed; // allow close retry
closingContext = null;
synchronized (clearingLockNowLock) {
// clearing under lock to avoid spotbugsMain's "Inconsistent synchronization" issue.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be in a finally around all the code above?
i.e. what happens if the asyncToSync calls above fail?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!
I've moved it to a fileLockFunc's whenComplete step.

private void fileLockClearFlushAndClose(boolean isRecovery) {
synchronized (clearingLockNowLock) {
// Here: this function is being called from too many paths. Until cleanup, this guard is here to avoid recursions
if (clearingLockNow) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've only seen it for lazy instantiation, but would it be worth to duplicate this before the synchronized lock to avoid entering the Synchronized lock if already clearing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can either do that or remove the synchronized lock altogether (since our concern is recursion, the synchronized lock might be an overkill...)

}
}

private void fileLockClearFlushAndClose(boolean isRecovery) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds like there are some concurrency challenges with FDBDirectoryLock, it would probably be good to add some tests that try to clear this lock concurrently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug fix Change that fixes a bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Prevent parallel file lock clear

3 participants