-
Notifications
You must be signed in to change notification settings - Fork 116
Prevent parallel file lock clear #3714
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
...ne/src/main/java/com/apple/foundationdb/record/lucene/directory/FDBDirectoryLockFactory.java
Outdated
Show resolved
Hide resolved
...ne/src/main/java/com/apple/foundationdb/record/lucene/directory/FDBDirectoryLockFactory.java
Outdated
Show resolved
Hide resolved
|
|
||
| private void fileLockClearFlushAndClose(boolean isRecovery) { | ||
| synchronized (clearingLockNowLock) { | ||
| // Here: this function is being called from too many paths. Until cleanup, this guard is here to avoid recursions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If indeed all that the synchronized block is trying to do is protect a boolean, then an AtomicBoolean can do the job. It seems as if we are not trying to create a totally exclusive code flow but just prevent "too many" runs of the same methods, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to AtomicBoolean.
The code supposed to prevent concurrent lock clears, which seems to be caused by concurrent, possibly recursive, calls to directory close.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had to undo the AtomicBoolean use as some tests were failing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious: Which tests failed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure, but it might have been testMultiClose.
...ne/src/main/java/com/apple/foundationdb/record/lucene/directory/FDBDirectoryLockFactory.java
Outdated
Show resolved
Hide resolved
FDBDirectory: When file lock clear is called in parallel by multiple threads, synchronically waiting for the first thread's clear to complete seems to be a better strategy. Resolves FoundationDB#3713
9b5b678 to
62dfa1b
Compare
62dfa1b to
e81f7d2
Compare
| closed = flushed; // allow close retry | ||
| closingContext = null; | ||
| synchronized (clearingLockNowLock) { | ||
| // clearing under lock to avoid spotbugsMain's "Inconsistent synchronization" issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be in a finally around all the code above?
i.e. what happens if the asyncToSync calls above fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point!
I've moved it to a fileLockFunc's whenComplete step.
| private void fileLockClearFlushAndClose(boolean isRecovery) { | ||
| synchronized (clearingLockNowLock) { | ||
| // Here: this function is being called from too many paths. Until cleanup, this guard is here to avoid recursions | ||
| if (clearingLockNow) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've only seen it for lazy instantiation, but would it be worth to duplicate this before the synchronized lock to avoid entering the Synchronized lock if already clearing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can either do that or remove the synchronized lock altogether (since our concern is recursion, the synchronized lock might be an overkill...)
| } | ||
| } | ||
|
|
||
| private void fileLockClearFlushAndClose(boolean isRecovery) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds like there are some concurrency challenges with FDBDirectoryLock, it would probably be good to add some tests that try to clear this lock concurrently.
FDBDirectory: When file lock clear is called in parallel by multiple threads, synchronically waiting for the first thread's clear to complete seems to be a better strategy.
Resolves #3713