Optimize PointRangeQuery for intra-segment concurrency with segment-level DocIdSet caching
#15446
Conversation
Before I add some tests, I tested this behavior using https://github.com/msfroh/lucene-university (will check in the code here as well).
 * Iterator that filters another iterator to only return docs within a partition range.
 * Reading from a FixedBitSet is thread-safe (just reading from long[]), so multiple
 * partitions can read from the same underlying DocIdSet concurrently.
 */
This is wrong, we cannot share the iterator between partitions (even when the underlying data structure is a FixedBitSet).
Hey Ignacio, the iterator is not shared; each partition still has its own iterator (PartitionFilteredDocIdSetIterator, a DocIdSetIterator implementation). Only the DocIdSet is shared.
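To make that distinction concrete, here is a minimal sketch of what such a per-partition iterator could look like. The class name matches the one in this PR, but the body is an illustrative assumption rather than the actual implementation: each partition builds its own iterator over the shared, read-only DocIdSet and clips it to its own doc range.

```java
import java.io.IOException;
import org.apache.lucene.search.DocIdSet;
import org.apache.lucene.search.DocIdSetIterator;

// Sketch only: every partition constructs its own instance, so no iterator
// state is shared; only the underlying (read-only) DocIdSet is shared.
final class PartitionFilteredDocIdSetIterator extends DocIdSetIterator {
  private final DocIdSetIterator in; // fresh iterator, private to this partition
  private final int minDoc, maxDoc;  // partition bounds, maxDoc exclusive
  private int doc = -1;

  PartitionFilteredDocIdSetIterator(DocIdSet shared, int minDoc, int maxDoc) throws IOException {
    this.in = shared.iterator(); // real code must handle a null (empty) iterator
    this.minDoc = minDoc;
    this.maxDoc = maxDoc;
  }

  @Override
  public int docID() {
    return doc;
  }

  @Override
  public int nextDoc() throws IOException {
    return advance(doc + 1);
  }

  @Override
  public int advance(int target) throws IOException {
    int d = in.advance(Math.max(target, minDoc)); // skip ahead to this partition's range
    return doc = (d < maxDoc) ? d : NO_MORE_DOCS; // stop at the partition boundary
  }

  @Override
  public long cost() {
    return in.cost(); // rough upper bound; the PR still plans to refine cost()
  }
}
```

Because each partition wraps its own iterator instance, only the underlying bit set is read concurrently.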
Oh I see, I was misled by the javadocs for this class (there is no DocIdSet in the class). I think this comment should go where the iterator is created instead.
Yeah, I want to make sure the CI is green on the PR to begin with. Updating the javadocs and code cleanup is pending; I will do that.
Looks like a flaky test?
I am having a hard time understanding why this PR improves the query throughput of IntNRQ. My expectation is that the query spends most of its time traversing the BKD tree and very little time building the result. As this PR still traverses the BKD tree with one thread, I would expect very little change in query latency. I did a local test with one of my favourite datasets and, as expected, I did not see any change in latency. Moreover, I would expect query throughput to be hurt by this change because of all those blocked search threads doing no work, so concurrent queries will be running with fewer resources. Do you happen to know why QPS is improving? I might be missing something.
This idea was inspired by the comments in #13745 (comment) and #15383 (comment). The idea is that instead of performing the same BKD traversal multiple times when a segment is divided into multiple partitions, we do one BKD traversal per segment and share the resulting DocIdSet across all partitions.
I assume you enabled intra-segment concurrency in both cases?
May I know if I could test that locally as well? For now I used https://github.com/mikemccand/luceneutil.
No, for the baseline I used current main with segment-level concurrency only. The candidate is this patch.
I tested with the datasets used for the Lucene geospatial benchmarks: https://benchmarks.mikemccandless.com/geobench.html
Can you test with intra-segment enabled on both? (In the patch from this PR, intra-segment is already enabled.) FYI, here are the past intra-segment search benchmarks from Lucene: #13542 (comment)
I see the same issue as mikemccand/luceneutil#372 (comment) when I try to run the geo benchmark. Let me see if I can still test the geospatial benchmarks with one segment and a bounding-box query.
No need, now I understand the results you are providing. I think you should provide the comparison with main for completeness (e.g., is this solution competitive with the current status quo?).
 */
protected LeafSlice[] slices(List<LeafReaderContext> leaves) {
-  return slices(leaves, MAX_DOCS_PER_SLICE, MAX_SEGMENTS_PER_SLICE, false);
+  return slices(leaves, MAX_DOCS_PER_SLICE, MAX_SEGMENTS_PER_SLICE, true);
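For reference, intra-segment partitioning could also be turned on in application code instead of flipping the library default shown in the diff above. A hedged sketch follows, assuming the four-argument slices overload used in the diff is accessible to subclasses; the slice limits below are arbitrary example values, since IndexSearcher's own constants are private.

```java
import java.util.List;
import java.util.concurrent.Executor;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.IndexSearcher;

// Sketch: enable intra-segment partitions by overriding slices() in a custom
// searcher instead of changing Lucene's default.
class IntraSegmentSearcher extends IndexSearcher {
  private static final int DOCS_PER_SLICE = 250_000; // example value, not Lucene's constant
  private static final int SEGMENTS_PER_SLICE = 5;   // example value, not Lucene's constant

  IntraSegmentSearcher(IndexReader reader, Executor executor) {
    super(reader, executor);
  }

  @Override
  protected LeafSlice[] slices(List<LeafReaderContext> leaves) {
    // 'true' allows a single large segment to be split into multiple partitions
    return slices(leaves, DOCS_PER_SLICE, SEGMENTS_PER_SLICE, true);
  }
}
```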
I will change the default to false before merging the PR.
Thanks! Below are the results without enabling intra-segment search (on both the baseline and the candidate). My approach is to improve PointRangeQuery performance when intra-segment search is enabled, as part of stabilizing the intra-segment work and eliminating per-segment work across segment partitions.
That's not what I meant. I wanted this PR with intra-segment search compared against current main, in order to answer the question: what are the benefits of using this over current main?
Enabled intra-segment on the candidate, with the baseline using current main.
Here are some important results: #15446 (comment). I can clearly see this PR change helped to reduce the regression seen with intra-segment concurrency.
I think you are focusing on the wrong things. Yes, this change makes intra-segment concurrency suck less, but it still sucks and is still unusable. It is still 40% slower than concurrent segment search! We should never ever block search threads. IMO we should focus on how to search the data in a segment concurrently instead.
With the current BKD setup for PointRangeQuery, any thoughts or suggestions on this? FYI, I did the same experiment for
Since the benchmark results are spread across multiple comments (#15446 (comment), #15446 (comment), #15446 (comment)), the following is an overall summary of the IntNRQ benchmark results.
In my opinion we cannot achieve it with the current segment layout. We need to evolve Lucene segments / data structures so they can be searched concurrently.
I will try once again to look for the bottleneck in my current PR to further improve PointRangeQuery with intra-segment. In that case, should we continue to iterate and merge this PR, as it's definitely better than the current state?
In my opinion we should not merge this PR, sorry. It makes no sense to have all this complexity that provides no benefit. If someone else thinks otherwise I am not going to block it, but I don't like this approach.
Thanks for your overall feedback; I'm open to discussion and thoughts. I don't fully agree that it provides no benefit, as the existing implementation for PointRangeQuery on main re-traverses the BKD tree for every partition when intra-segment concurrency is enabled.
The effort isn't lost - at least we know the benchmarks. I also am not particularly fond of this concurrent cache and blocking approach.
Description
This PR optimizes PointRangeQuery to efficiently support intra-segment concurrent search by implementing segment-level DocIdSet caching. When a large segment is split into multiple partitions for parallel processing, all partitions now share a single BKD tree traversal result instead of each partition performing redundant traversals. The solution was derived from the discussion on PR #15383. Related issue for PointRangeQuery with intra-segment: #13745.

Problem
With intra-segment concurrency enabled, a single segment can be split into multiple partitions, each processed by a different thread. In the current implementation, each partition independently traverses the BKD tree and builds its own DocIdSet, resulting in query latency regressions (#13542 (comment)) and redundant/duplicate BKD traversal.
Solution
Implement a segment-level cache that ensures the BKD tree is traversed only once per segment, with the resulting DocIdSet shared across all partitions (a rough sketch follows after this description):

- SegmentDocIdSetSupplier: a new helper class that lazily builds and caches the DocIdSet for an entire segment.
- Segment-level cache: a ConcurrentHashMap<LeafReaderContext, SegmentDocIdSetSupplier> in the Weight that ensures all partitions of the same segment share the same supplier.
- PartitionScorerSupplier: a new ScorerSupplier implementation that references the shared cache and filters results to the partition's doc ID range.
- PartitionFilteredDocIdSetIterator: a lightweight iterator wrapper that filters the shared full-segment DocIdSet to only return docs within the partition's range.

Pending: need to get the cost() methods right and add tests, along with some code cleanup. Here are some local testing details: #15446 (comment). The behavior is the same when intra-segment is disabled, which is handled in the existing scorerSupplier(LeafReaderContext context) method.

Performance Impact: Seen good improvement with IntNRQ. Tested with intra-segment enabled on both candidate and baseline.
Related Issues