First sketch of MDB_DUPSORT. #5558

kenwenzel · 2025-11-10T15:37:06Z

GitHub issue resolved: #4218

Briefly describe the changes proposed in this PR:

PR Author Checklist (see the contributor guidelines for more details):

my pull request is self-contained
I've added tests for the changes I made
I've applied code formatting (you can use mvn process-resources to format from the command line)
I've squashed my commits where necessary
every commit message starts with the issue number (GH-xxxx) followed by a meaningful description of the change

kenwenzel · 2025-11-10T15:40:35Z

I've played a bit with DUPSORT and have the following results:

The database is a bit smaller if only DUPSORT with variable-length values is used.
The database is much larger (at least around 80% more) if DUPFIXED is used.
The benchmarks are a bit slower because both keys and values have to be matched.
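
To make the size effect concrete, here is a minimal sketch of how one index entry might be split into an LMDB key and a DUPSORT value, and how variable-length values compare with the fixed-width values that DUPFIXED requires. It is not the code from this PR: the class name, the `splitPos` parameter, the sample IDs, and the use of LEB128 as the variable-length encoding are all assumptions for illustration, and sort order of the encoded bytes is not addressed.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

public class DupSortLayoutSketch {

    /** Encodes ids[from, to) as unsigned LEB128 varints (a stand-in encoding, assumed). */
    static byte[] encodeVarints(long[] ids, int from, int to) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = from; i < to; i++) {
            long v = ids[i];
            // Emit 7 bits per byte, setting the high bit while more bytes follow.
            while ((v & ~0x7FL) != 0) {
                out.write((int) ((v & 0x7F) | 0x80));
                v >>>= 7;
            }
            out.write((int) v);
        }
        return out.toByteArray();
    }

    /** Encodes ids[from, to) as fixed-width 8-byte values, as a DUPFIXED layout would need. */
    static byte[] encodeFixed(long[] ids, int from, int to) {
        ByteBuffer buffer = ByteBuffer.allocate((to - from) * Long.BYTES);
        for (int i = from; i < to; i++) {
            buffer.putLong(ids[i]);
        }
        return buffer.array();
    }

    public static void main(String[] args) {
        long[] entry = { 5, 300, 70000, 1 }; // hypothetical IDs of one index entry
        int splitPos = 2;                    // hypothetical: first two IDs form the key

        byte[] key = encodeVarints(entry, 0, splitPos);
        byte[] varValue = encodeVarints(entry, splitPos, entry.length);
        byte[] fixedValue = encodeFixed(entry, splitPos, entry.length);

        System.out.println("key bytes:            " + key.length);
        System.out.println("varint value bytes:   " + varValue.length);
        System.out.println("fixed value bytes:    " + fixedValue.length);
    }
}
```

For the sample entry the variable-length value takes 4 bytes and the fixed-width value 16 bytes, which hints at why a DUPFIXED layout can grow the database even though duplicates share a key.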

Performance with DUPSORT:

Benchmark                                                             (numThreads)  Mode  Cnt     Score      Error  Units
QueryBenchmark.complexQuery                                                    N/A  avgt    3     8.351 ±    2.485  ms/op
QueryBenchmark.different_datasets_with_similar_distributions                   N/A  avgt    3     4.911 ±   10.699  ms/op
QueryBenchmark.groupByQuery                                                    N/A  avgt    3     2.189 ±    0.573  ms/op
QueryBenchmark.long_chain                                                      N/A  avgt    3  1459.867 ± 1600.677  ms/op
QueryBenchmark.lots_of_optional                                                N/A  avgt    3   408.424 ±  170.489  ms/op
QueryBenchmark.minus                                                           N/A  avgt    3    19.063 ±   30.869  ms/op
QueryBenchmark.multiple_sub_select                                             N/A  avgt    3   101.625 ±   30.560  ms/op
QueryBenchmark.nested_optionals                                                N/A  avgt    3   292.365 ±   59.615  ms/op
QueryBenchmark.optional_lhs_filter                                             N/A  avgt    3    69.231 ±    7.340  ms/op
QueryBenchmark.optional_rhs_filter                                             N/A  avgt    3   106.239 ±   50.879  ms/op
QueryBenchmark.ordered_union_limit                                             N/A  avgt    3   177.760 ±  602.567  ms/op
QueryBenchmark.pathExpressionQuery1                                            N/A  avgt    3    43.933 ±    6.755  ms/op
QueryBenchmark.pathExpressionQuery2                                            N/A  avgt    3     6.644 ±    4.060  ms/op
QueryBenchmark.query_distinct_predicates                                       N/A  avgt    3   131.304 ±   82.359  ms/op
QueryBenchmark.simple_filter_not                                               N/A  avgt    3    11.713 ±    2.413  ms/op
QueryBenchmark.sub_select                                                      N/A  avgt    3   166.433 ±   36.140  ms/op
QueryBenchmarkFoaf.groupByCount                                                N/A  avgt    5  1203.649 ±   38.835  ms/op
QueryBenchmarkFoaf.groupByCountSorted                                          N/A  avgt    5  1169.111 ±   57.376  ms/op
QueryBenchmarkFoaf.personsAndFriends                                           N/A  avgt    5   299.565 ±   10.615  ms/op
QueryBenchmarkParallel.complexQuery                                              4  avgt    3    21.410 ±   12.238  ms/op
QueryBenchmarkParallel.different_datasets_with_similar_distributions             4  avgt    3    11.305 ±    4.164  ms/op
QueryBenchmarkParallel.groupByQuery                                              4  avgt    3     8.376 ±    1.347  ms/op
QueryBenchmarkParallel.lots_of_optional                                          4  avgt    3  1005.969 ±  160.791  ms/op

Performance on develop:

Benchmark                                                             (numThreads)  Mode  Cnt     Score     Error  Units
QueryBenchmark.complexQuery                                                    N/A  avgt    3     7.803 ±   2.515  ms/op
QueryBenchmark.different_datasets_with_similar_distributions                   N/A  avgt    3     4.154 ±   1.314  ms/op
QueryBenchmark.groupByQuery                                                    N/A  avgt    3     1.868 ±   1.031  ms/op
QueryBenchmark.long_chain                                                      N/A  avgt    3  1257.513 ± 272.696  ms/op
QueryBenchmark.lots_of_optional                                                N/A  avgt    3   392.749 ±  93.742  ms/op
QueryBenchmark.minus                                                           N/A  avgt    3    17.783 ±  15.425  ms/op
QueryBenchmark.multiple_sub_select                                             N/A  avgt    3    96.446 ±  17.314  ms/op
QueryBenchmark.nested_optionals                                                N/A  avgt    3   273.396 ±  49.543  ms/op
QueryBenchmark.optional_lhs_filter                                             N/A  avgt    3    66.929 ±  25.232  ms/op
QueryBenchmark.optional_rhs_filter                                             N/A  avgt    3    92.581 ±  28.442  ms/op
QueryBenchmark.ordered_union_limit                                             N/A  avgt    3   134.559 ±  20.988  ms/op
QueryBenchmark.pathExpressionQuery1                                            N/A  avgt    3    38.629 ±  13.478  ms/op
QueryBenchmark.pathExpressionQuery2                                            N/A  avgt    3     6.942 ±   3.485  ms/op
QueryBenchmark.query_distinct_predicates                                       N/A  avgt    3   155.704 ± 396.561  ms/op
QueryBenchmark.simple_filter_not                                               N/A  avgt    3    11.171 ±   2.426  ms/op
QueryBenchmark.sub_select                                                      N/A  avgt    3   131.768 ±  44.619  ms/op
QueryBenchmarkFoaf.groupByCount                                                N/A  avgt    5  1174.969 ±  57.989  ms/op
QueryBenchmarkFoaf.groupByCountSorted                                          N/A  avgt    5  1091.847 ± 100.255  ms/op
QueryBenchmarkFoaf.personsAndFriends                                           N/A  avgt    5   300.028 ±  31.409  ms/op
QueryBenchmarkParallel.complexQuery                                              4  avgt    3    19.595 ±  11.311  ms/op
QueryBenchmarkParallel.different_datasets_with_similar_distributions             4  avgt    3    10.852 ±   1.590  ms/op
QueryBenchmarkParallel.groupByQuery                                              4  avgt    3     7.837 ±   0.967  ms/op
QueryBenchmarkParallel.lots_of_optional                                          4  avgt    3   965.467 ±  61.832  ms/op

hmottestad · 2025-11-11T08:57:14Z

I tested DUPSORT in my branch and implemented it specifically for the SP** index. It seemed to make some small performance improvements, but I'm not sure they amounted to much.

kenwenzel · 2025-11-11T11:45:54Z

@hmottestad My primary goal is to reduce the DB size on disk. I've also experimented with Morton (Z-order) codes to only have one index, but this performed badly.
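
For readers unfamiliar with Morton codes, the following is a minimal sketch of the general technique: the bits of several IDs are interleaved into a single sortable value, so one index covers all components. The comment above does not say how many components or bits were used; the three 21-bit components, the class name, and the method names here are assumptions and not the experiment's actual code.

```java
public class MortonSketch {

    /** Bits taken from each component so that three of them fit into 63 bits (assumed). */
    static final int BITS = 21;

    /** Interleaves the low 21 bits of s, p and o into one Z-order code. */
    static long interleave(long s, long p, long o) {
        long code = 0;
        for (int bit = 0; bit < BITS; bit++) {
            code |= ((s >>> bit) & 1L) << (3 * bit + 2);
            code |= ((p >>> bit) & 1L) << (3 * bit + 1);
            code |= ((o >>> bit) & 1L) << (3 * bit);
        }
        return code;
    }

    /** Recovers the three components from a Z-order code. */
    static long[] deinterleave(long code) {
        long s = 0, p = 0, o = 0;
        for (int bit = 0; bit < BITS; bit++) {
            s |= ((code >>> (3 * bit + 2)) & 1L) << bit;
            p |= ((code >>> (3 * bit + 1)) & 1L) << bit;
            o |= ((code >>> (3 * bit)) & 1L) << bit;
        }
        return new long[] { s, p, o };
    }

    public static void main(String[] args) {
        long code = interleave(123456, 7, 2000000);
        long[] parts = deinterleave(code);
        System.out.println(code + " -> " + parts[0] + ", " + parts[1] + ", " + parts[2]);
    }
}
```

A general property of this layout, independent of the results reported here, is that a lookup bound on only one component maps to many disjoint ranges of codes rather than one contiguous range.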
Do you think that DUPFIXED with MDB_NEXT_MULTIPLE is worth considering?
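
As background for this question: with DUPFIXED, LMDB's MDB_NEXT_MULTIPLE cursor operation returns up to a page of duplicate data items in one contiguous buffer, so the caller decodes many values per call. The sketch below only simulates that buffer and makes no LMDB call; the 8-byte big-endian item layout and all names are assumptions for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class FixedDupBlockSketch {

    /**
     * Splits a contiguous block of fixed-size duplicates into individual IDs.
     * Assumes each item is one 8-byte big-endian long (hypothetical layout).
     */
    static long[] decodeBlock(ByteBuffer block) {
        if (block.remaining() % Long.BYTES != 0) {
            throw new IllegalArgumentException("block length is not a multiple of the item size");
        }
        // Work on a duplicate so the caller's position is left untouched.
        ByteBuffer view = block.duplicate().order(ByteOrder.BIG_ENDIAN);
        long[] ids = new long[view.remaining() / Long.BYTES];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = view.getLong();
        }
        return ids;
    }

    public static void main(String[] args) {
        // Stand-in for the buffer a multi-item cursor read would hand back.
        ByteBuffer block = ByteBuffer.allocate(3 * Long.BYTES);
        block.putLong(10).putLong(20).putLong(30).flip();

        for (long id : decodeBlock(block)) {
            System.out.println(id);
        }
    }
}
```

The trade-off visible in this thread is that such bulk reads need fixed-size items, which is the layout reported above as noticeably larger on disk.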

kenwenzel self-assigned this Nov 10, 2025

kenwenzel mentioned this pull request Nov 10, 2025

LMDB Store: working on new ID based join iterator #5549

Draft

5 tasks

kenwenzel force-pushed the lmdb-dupsort branch 2 times, most recently from 6f953d9 to a7fa590 Compare November 11, 2025 07:41

kenwenzel force-pushed the lmdb-dupsort branch 9 times, most recently from 09b18cf to d8e4017 Compare November 18, 2025 20:18

First sketch of MDB_DUPSORT.

7f85dc6

kenwenzel force-pushed the lmdb-dupsort branch 2 times, most recently from 5033f2a to 4b8ec4b Compare November 20, 2025 16:24

Re-use iterator state and pool cursors.

c815e5d

kenwenzel force-pushed the lmdb-dupsort branch from 4b8ec4b to c815e5d Compare November 20, 2025 16:35

kenwenzel added 3 commits November 21, 2025 08:30

Correctly go to first value of key if MDB_GET_BOTH_RANGE did not work.

c1def82

Create a common write method for keys and values.

9834893

Extend VarintMatcher for up to 4 values.

f29017d

kenwenzel force-pushed the lmdb-dupsort branch 4 times, most recently from 5dfe593 to eabd5ca Compare November 25, 2025 07:24

Introduce configurable split position for triple indexes.

83371cb

kenwenzel force-pushed the lmdb-dupsort branch from eabd5ca to 83371cb Compare November 25, 2025 12:43

kenwenzel added 3 commits November 25, 2025 22:26

Store multiple elements in a value to reduce size on disk.

46f9a19

Adapt iterator to storage layout.

68f8d49

Use MDB_APPENDDUP

80b37c0
