@LiaCastaneda
Contributor

@LiaCastaneda LiaCastaneda commented Nov 26, 2025

Which issue does this PR close?

Closes #17527

Rationale for this change

Currently, DataFusion computes bounds for all queries that contain a HashJoinExec node whenever the option enable_dynamic_filter_pushdown is set to true (default). It might make sense to compute these bounds only when we explicitly know there is a consumer that will use them.

What changes are included in this PR?

This PR expands the filter pushdown result enum from two variants (Yes/No) to three variants (Exact/Inexact/Unsupported), as suggested in #18856 and #17527.

In handle_child_pushdown_result, HashJoinExec checks the discriminant returned by its probe-side child to determine whether the dynamic filter will be used. If the child returns Unsupported, HashJoinExec skips creating the dynamic filter accumulator, avoiding unnecessary computation.
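
For illustration, the check described above could look roughly like the sketch below. It uses a simplified stand-in for the enum: the variant names come from this PR, but `should_build_dynamic_filter` is a hypothetical helper and not part of DataFusion's API.

```rust
/// Simplified stand-in for the three-variant discriminant proposed in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PushedDown {
    Exact,
    Inexact,
    Unsupported,
}

/// Hypothetical helper: the build side only needs to create the dynamic
/// filter accumulator when the probe-side child can use the filter in some way.
pub fn should_build_dynamic_filter(probe_result: PushedDown) -> bool {
    !matches!(probe_result, PushedDown::Unsupported)
}

fn main() {
    for result in [PushedDown::Exact, PushedDown::Inexact, PushedDown::Unsupported] {
        println!(
            "{result:?}: build dynamic filter = {}",
            should_build_dynamic_filter(result)
        );
    }
}
```

Only Unsupported turns the accumulator off here; Exact and Inexact are treated the same.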

Are these changes tested?

Added a test test_hash_join_dynamic_filter_with_unsupported_scan that verifies that the DynamicFilter placeholder is not present in the probe node.

Are there any user-facing changes?

Yes, the PushedDown enum now has three variants instead of two.

@github-actions github-actions bot added physical-expr Changes to the physical-expr crates physical-plan Changes to the physical-plan crate labels Nov 26, 2025
@LiaCastaneda LiaCastaneda force-pushed the lia/compute-dyn-filters-only-when-consumer-asks-for-it branch from ebc401a to 3b31893 Compare November 26, 2025 10:40
Comment on lines 302 to 305
pub fn is_used(self: &Arc<Self>) -> bool {
// Strong count > 1 means at least one consumer is holding a reference beyond the producer.
Arc::strong_count(self) > 1
}
Contributor Author

I'm not really sure how to test a condition where is_used() returns false without adding too much machinery or making HashJoin's dynamic_filter attribute public, which would make it easy to mess with the Arc reference count.

Contributor

Can’t you just make a new DynamicFilterPhysicalExpr and check that is_used is False?
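
A minimal sketch of that kind of test, assuming a stand-in type: `DynamicFilter` below is hypothetical and much simpler than the real DynamicFilterPhysicalExpr, and only the `is_used` body mirrors the snippet under review.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for `DynamicFilterPhysicalExpr`; the real type carries more state.
#[derive(Debug, Default)]
struct DynamicFilter;

impl DynamicFilter {
    /// Mirrors the reviewed snippet: more than one strong reference means
    /// someone besides the producer is holding the filter.
    fn is_used(self: &Arc<Self>) -> bool {
        Arc::strong_count(self) > 1
    }
}

fn main() {
    // A freshly created filter has exactly one strong reference (the producer).
    let filter = Arc::new(DynamicFilter);
    assert!(!filter.is_used());

    // A consumer that keeps an `Arc::clone` bumps the count to 2.
    let consumer = Arc::clone(&filter);
    assert!(filter.is_used());

    // Once the consumer drops its reference, the filter is unused again.
    drop(consumer);
    assert!(!filter.is_used());
}
```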

Contributor

Looks like you have a test already?

Contributor

@adriangb adriangb left a comment

I know this is draft but the current code looks good to me, I’ll approve once it’s ready :)

@LiaCastaneda
Contributor Author

LiaCastaneda commented Nov 26, 2025

Something funky is going on with the Arc count: some queries are not pushing the filter down to the probe side because the Arc count stays at 1 at execution time. I will take a look...

It doesn't happen in all the tests, though, which is strange.

Edit: For the tests that fail, it seems partition 0 never sees a strong_count > 1:

[partition 0] Arc count at execution: 1
  [partition 0] Is used: false
  [partition 1] Arc count at execution: 2
  [partition 1] Is used: true
  [partition 2] Arc count at execution: 2
  [partition 2] Is used: true
  [partition 3] Arc count at execution: 2
  [partition 3] Is used: true
  [partition 4] Arc count at execution: 2
  [partition 4] Is used: true
  [partition 5] Arc count at execution: 2
  [partition 5] Is used: true
  [partition 6] Arc count at execution: 2
  [partition 6] Is used: true
  [partition 7] Arc count at execution: 2
  [partition 7] Is used: true
  [partition 8] Arc count at execution: 2
  [partition 8] Is used: true
  [partition 9] Arc count at execution: 2
  [partition 9] Is used: true
  [partition 10] Arc count at execution: 2
  [partition 10] Is used: true
  [partition 11] Arc count at execution: 2
  [partition 11] Is used: true

I think this might be because we clone the DynamicFilterPhysicalExpr directly in the execute phase of DataSourceExec (at least for this kind of node)

In any case, seems like the strong_count approach might hit some edge cases here...

@github-actions github-actions bot added optimizer Optimizer rules core Core DataFusion crate datasource Changes to the datasource crate and removed physical-expr Changes to the physical-expr crates labels Nov 27, 2025
@LiaCastaneda LiaCastaneda force-pushed the lia/compute-dyn-filters-only-when-consumer-asks-for-it branch from 62688c7 to 7664b0d Compare November 27, 2025 16:57
@LiaCastaneda LiaCastaneda force-pushed the lia/compute-dyn-filters-only-when-consumer-asks-for-it branch from 7664b0d to 8c467c9 Compare November 27, 2025 17:09
@github-actions github-actions bot added the execution Related to the execution crate label Nov 27, 2025
@alamb alamb added the api change Changes the API exposed to users of the crate label Dec 1, 2025
Contributor

@alamb alamb left a comment

Makes sense to me -- thank you @LiaCastaneda

The only thing I think is needed is to add a note to the upgrade guide

@adriangb do you want to take a look at this PR prior to merge?

  /// Discriminant for the result of pushing down a filter into a child node.
- #[derive(Debug, Clone, Copy)]
+ #[derive(Debug, Clone, Copy, PartialEq, Eq)]
  pub enum PushedDown {
Contributor

Since this is a public enum (doc link) I think this will be an API change (I marked the PR as such)

Can you please add a note to the DataFusion 52 upgrade guide explaining how people need to update their existing code (e.g. if they used to return Yes or No what should they return after this change)?

https://github.com/apache/datafusion/blob/9f725d9c7064813cda0de0f87d115354b68d76e6/docs/source/library-user-guide/upgrading.md#L22-L21

@alamb
Contributor

alamb commented Dec 1, 2025

run benchmarks

@alamb
Contributor

alamb commented Dec 1, 2025

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing lia/compute-dyn-filters-only-when-consumer-asks-for-it (728fde9) to fb14d7c diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@alamb
Contributor

alamb commented Dec 1, 2025

🤖: Benchmark completed

Details

Comparing HEAD and lia_compute-dyn-filters-only-when-consumer-asks-for-it
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ lia_compute-dyn-filters-only-when-consumer-asks-for-it ┃    Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ QQuery 0     │  2697.10 ms │                                             2748.66 ms │ no change │
│ QQuery 1     │  1316.46 ms │                                             1310.41 ms │ no change │
│ QQuery 2     │  2484.15 ms │                                             2498.55 ms │ no change │
│ QQuery 3     │  1116.93 ms │                                             1110.21 ms │ no change │
│ QQuery 4     │  2371.11 ms │                                             2321.44 ms │ no change │
│ QQuery 5     │ 28306.32 ms │                                            28476.49 ms │ no change │
│ QQuery 6     │  4230.39 ms │                                             4237.43 ms │ no change │
│ QQuery 7     │  3647.99 ms │                                             3615.26 ms │ no change │
└──────────────┴─────────────┴────────────────────────────────────────────────────────┴───────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                                                     ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                                                     │ 46170.45ms │
│ Total Time (lia_compute-dyn-filters-only-when-consumer-asks-for-it)   │ 46318.45ms │
│ Average Time (HEAD)                                                   │  5771.31ms │
│ Average Time (lia_compute-dyn-filters-only-when-consumer-asks-for-it) │  5789.81ms │
│ Queries Faster                                                        │          0 │
│ Queries Slower                                                        │          0 │
│ Queries with No Change                                                │          8 │
│ Queries with Failure                                                  │          0 │
└───────────────────────────────────────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ lia_compute-dyn-filters-only-when-consumer-asks-for-it ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     2.53 ms │                                                2.21 ms │ +1.15x faster │
│ QQuery 1     │    50.65 ms │                                               48.48 ms │     no change │
│ QQuery 2     │   138.85 ms │                                              137.64 ms │     no change │
│ QQuery 3     │   166.42 ms │                                              167.48 ms │     no change │
│ QQuery 4     │  1074.24 ms │                                             1107.96 ms │     no change │
│ QQuery 5     │  1500.86 ms │                                             1505.22 ms │     no change │
│ QQuery 6     │     2.19 ms │                                                2.14 ms │     no change │
│ QQuery 7     │    55.99 ms │                                               55.80 ms │     no change │
│ QQuery 8     │  1433.46 ms │                                             1438.73 ms │     no change │
│ QQuery 9     │  1834.71 ms │                                             1855.08 ms │     no change │
│ QQuery 10    │   396.37 ms │                                              397.92 ms │     no change │
│ QQuery 11    │   445.08 ms │                                              450.27 ms │     no change │
│ QQuery 12    │  1337.56 ms │                                             1368.91 ms │     no change │
│ QQuery 13    │  2159.17 ms │                                             2141.45 ms │     no change │
│ QQuery 14    │  1253.87 ms │                                             1285.80 ms │     no change │
│ QQuery 15    │  1214.01 ms │                                             1227.89 ms │     no change │
│ QQuery 16    │  2713.52 ms │                                             2727.51 ms │     no change │
│ QQuery 17    │  2672.03 ms │                                             2696.53 ms │     no change │
│ QQuery 18    │  5312.51 ms │                                             5035.52 ms │ +1.06x faster │
│ QQuery 19    │   130.21 ms │                                              126.79 ms │     no change │
│ QQuery 20    │  2016.14 ms │                                             1991.80 ms │     no change │
│ QQuery 21    │  2342.35 ms │                                             2312.08 ms │     no change │
│ QQuery 22    │  3948.29 ms │                                             3949.44 ms │     no change │
│ QQuery 23    │ 14706.89 ms │                                            13287.83 ms │ +1.11x faster │
│ QQuery 24    │   231.24 ms │                                              218.60 ms │ +1.06x faster │
│ QQuery 25    │   485.79 ms │                                              489.69 ms │     no change │
│ QQuery 26    │   245.70 ms │                                              217.96 ms │ +1.13x faster │
│ QQuery 27    │  2904.01 ms │                                             2833.98 ms │     no change │
│ QQuery 28    │ 23753.40 ms │                                            23583.56 ms │     no change │
│ QQuery 29    │   980.90 ms │                                              973.83 ms │     no change │
│ QQuery 30    │  1374.88 ms │                                             1339.59 ms │     no change │
│ QQuery 31    │  1408.58 ms │                                             1402.35 ms │     no change │
│ QQuery 32    │  4834.01 ms │                                             5151.11 ms │  1.07x slower │
│ QQuery 33    │  5903.63 ms │                                             5954.64 ms │     no change │
│ QQuery 34    │  6095.62 ms │                                             6197.45 ms │     no change │
│ QQuery 35    │  1915.96 ms │                                             1875.74 ms │     no change │
│ QQuery 36    │   118.10 ms │                                              120.20 ms │     no change │
│ QQuery 37    │    52.24 ms │                                               51.90 ms │     no change │
│ QQuery 38    │   121.59 ms │                                              119.37 ms │     no change │
│ QQuery 39    │   198.20 ms │                                              198.99 ms │     no change │
│ QQuery 40    │    42.91 ms │                                               41.56 ms │     no change │
│ QQuery 41    │    40.53 ms │                                               40.88 ms │     no change │
│ QQuery 42    │    33.56 ms │                                               32.87 ms │     no change │
└──────────────┴─────────────┴────────────────────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                                                     ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                                                     │ 97648.75ms │
│ Total Time (lia_compute-dyn-filters-only-when-consumer-asks-for-it)   │ 96164.73ms │
│ Average Time (HEAD)                                                   │  2270.90ms │
│ Average Time (lia_compute-dyn-filters-only-when-consumer-asks-for-it) │  2236.39ms │
│ Queries Faster                                                        │          5 │
│ Queries Slower                                                        │          1 │
│ Queries with No Change                                                │         37 │
│ Queries with Failure                                                  │          0 │
└───────────────────────────────────────────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      HEAD ┃ lia_compute-dyn-filters-only-when-consumer-asks-for-it ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 128.18 ms │                                              139.76 ms │  1.09x slower │
│ QQuery 2     │  27.45 ms │                                               26.80 ms │     no change │
│ QQuery 3     │  38.94 ms │                                               36.01 ms │ +1.08x faster │
│ QQuery 4     │  28.49 ms │                                               28.88 ms │     no change │
│ QQuery 5     │  87.67 ms │                                               86.86 ms │     no change │
│ QQuery 6     │  19.14 ms │                                               19.34 ms │     no change │
│ QQuery 7     │ 216.95 ms │                                              224.91 ms │     no change │
│ QQuery 8     │  34.24 ms │                                               30.97 ms │ +1.11x faster │
│ QQuery 9     │ 105.71 ms │                                              101.77 ms │     no change │
│ QQuery 10    │  63.70 ms │                                               62.73 ms │     no change │
│ QQuery 11    │  19.00 ms │                                               18.50 ms │     no change │
│ QQuery 12    │  53.13 ms │                                               51.61 ms │     no change │
│ QQuery 13    │  44.97 ms │                                               48.78 ms │  1.08x slower │
│ QQuery 14    │  14.27 ms │                                               13.56 ms │     no change │
│ QQuery 15    │  24.14 ms │                                               24.30 ms │     no change │
│ QQuery 16    │  25.07 ms │                                               24.75 ms │     no change │
│ QQuery 17    │ 150.74 ms │                                              152.84 ms │     no change │
│ QQuery 18    │ 286.35 ms │                                              281.81 ms │     no change │
│ QQuery 19    │  39.98 ms │                                               38.59 ms │     no change │
│ QQuery 20    │  49.11 ms │                                               49.62 ms │     no change │
│ QQuery 21    │ 325.29 ms │                                              314.30 ms │     no change │
│ QQuery 22    │  17.56 ms │                                               17.61 ms │     no change │
└──────────────┴───────────┴────────────────────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                                                     ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                                                     │ 1800.07ms │
│ Total Time (lia_compute-dyn-filters-only-when-consumer-asks-for-it)   │ 1794.29ms │
│ Average Time (HEAD)                                                   │   81.82ms │
│ Average Time (lia_compute-dyn-filters-only-when-consumer-asks-for-it) │   81.56ms │
│ Queries Faster                                                        │         2 │
│ Queries Slower                                                        │         2 │
│ Queries with No Change                                                │        18 │
│ Queries with Failure                                                  │         0 │
└───────────────────────────────────────────────────────────────────────┴───────────┘

Contributor

@adriangb adriangb left a comment

> I think this might be because we clone the DynamicFilterPhysicalExpr directly in the execute phase of DataSourceExec (at least for this kind of node)
>
> In any case, seems like the strong_count approach might hit some edge cases here...

This is interesting. Could it indicate a bug? I don't think we downcast into DynamicFilterPhysicalExpr anywhere in the Parquet specific code. So if we keep a reference it must be via Arc::clone.

I ask because I think this approach is what we actually want. As per https://github.com/apache/datafusion/pull/18938/files#r2578193421 I'm not even sure this change ends up having a different behavior at runtime?

Comment on lines +1167 to +1169
// Only create the dynamic filter if the probe side will actually use it (Exact or Inexact).
// If it's Unsupported, don't compute the filter since it won't be used.
let will_be_used = !matches!(filter.discriminant, PushedDown::Unsupported);
Contributor

If this is the case, don't we end up in the same place as Yes/No? I.e. this change only seems helpful if we did something like "only create the filter if the child said Exact".

Contributor Author

I think it lets scans say "I can't use this at all" (Unsupported), so we can skip computing filters entirely if stats pruning is not supported either. The Yes/No system had no way to express that: a scan that used the filters only for stats pruning would fall under the No discriminant, yet we would still need to compute them. I'm also thinking: if we know a scan will only use the filter for stats pruning (Inexact), would it make sense to compute just the min/max bounds instead of both the IN LIST and the bounds?
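
To make the idea concrete, here is a rough sketch of how a producer could pick what to compute from the discriminant. Everything except the variant names is hypothetical (`FilterPlan` and `plan_for` do not exist in DataFusion), and it assumes Inexact means "stats pruning only", which is just my reading above.

```rust
/// Simplified stand-in for the three-variant discriminant proposed in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PushedDown {
    Exact,
    Inexact,
    Unsupported,
}

/// Hypothetical description of what the build side would compute.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FilterPlan {
    min_max_bounds: bool,
    in_list: bool,
}

/// Hypothetical mapping from the probe side's answer to the work done on the build side.
fn plan_for(result: PushedDown) -> FilterPlan {
    match result {
        // Assumption: row-level filtering benefits from both kinds of filter.
        PushedDown::Exact => FilterPlan { min_max_bounds: true, in_list: true },
        // Assumption: stats pruning only needs the min/max bounds.
        PushedDown::Inexact => FilterPlan { min_max_bounds: true, in_list: false },
        // Nothing will consume the filter, so compute nothing.
        PushedDown::Unsupported => FilterPlan { min_max_bounds: false, in_list: false },
    }
}

fn main() {
    for result in [PushedDown::Exact, PushedDown::Inexact, PushedDown::Unsupported] {
        println!("{result:?} -> {:?}", plan_for(result));
    }
}
```

This deliberately ignores format-specific cases where a stats-only consumer might still want the IN LIST.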

Contributor

I think these are both good reasons to make this API change. Maybe we can justify the change by doing what you are saying and skipping pushing down the entire hash table if it won't be used? But then again that is basically free... and where do bloom filters fall into this calculation? i.e. bloom filter pruning only works if HashJoinExec produces an InListExpr and the scan node is a Parquet node (or other format that supports bloom filters). That seems like an awful lot of complex coordination between the producer/consumer that is specific to each file (some may have bloom filters some don't) and the filter being pushed down (min/max vs. InList vs. Hash Table).

Contributor Author

Yeah, it seems like it adds complexity to the “which filter to use” decision. Maybe the only clear use case is:

> it lets scans say "I can't use this at all" (Unsupported), so we can skip computing filters entirely if stats pruning is not supported either

I was also thinking: what if we just let consumers communicate what kinds of filters they support, and the producer only adjusts that decision based on memory or row-count limits? Or would that be an anti-pattern? In any case, that still wouldn’t let the producer understand the purpose of the dynamic filters (whether they are for stats pruning or row-level filtering).

@LiaCastaneda
Contributor Author

LiaCastaneda commented Dec 1, 2025

> I ask because I think this approach is what we actually want.

Yeah, sorry for the confusion -- I switched the approach to Exact/Inexact/Unsupported just as an alternative, since it was getting really hard to get is_used fully right. I will open a separate PR for is_used to avoid confusion, and I will try to solve the bug (or at least find out what's going on) between today and tomorrow.

> This is interesting. Could it indicate a bug?

I took a look the other day, and for some reason the strong count does not account for the leaf node, which supposedly also holds a reference to DynamicFilterPhysicalExpr. 🤔

[partition 0] Arc count at execution: 1
[partition 0] Is used: false
[partition 1] Arc count at execution: 2

The 1 -> 2 change in strong count is actually expected here, because the first partition creates the SharedBuildAccumulator, which also holds a reference to the DynamicFilterPhysicalExpr. But if both the hash join and the leaf kept a reference (which is what we should be seeing), I would have expected something like:

[partition 0] Arc count at execution: 2
[partition 0] Is used: true
[partition 1] Arc count at execution: 3
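
The expected counts can be reproduced with plain Arcs. This is only a sketch of the reference bookkeeping under the assumption that the hash join, the leaf, and the accumulator each hold one `Arc` to the same filter; `DynamicFilter` and `expected_counts` are hypothetical names, not DataFusion code.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for `DynamicFilterPhysicalExpr`.
struct DynamicFilter;

/// Returns the strong counts seen before and after the accumulator takes its
/// reference, assuming the hash join and the leaf each already hold one.
fn expected_counts() -> (usize, usize) {
    let hash_join_ref = Arc::new(DynamicFilter);
    let _leaf_ref = Arc::clone(&hash_join_ref);
    let before = Arc::strong_count(&hash_join_ref);

    // The first partition creates the accumulator, which keeps another reference.
    let _accumulator_ref = Arc::clone(&hash_join_ref);
    let after = Arc::strong_count(&hash_join_ref);

    (before, after)
}

fn main() {
    let (before, after) = expected_counts();
    assert_eq!((before, after), (2, 3));
    println!("[partition 0] Arc count at execution: {before}");
    println!("[partition 1] Arc count at execution: {after}");
}
```

Seeing 1 and 2 instead suggests that one of the assumed holders is not keeping an `Arc` to the same allocation at that point.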

@alamb
Contributor

alamb commented Dec 1, 2025

run benchmarks

@LiaCastaneda
Contributor Author

@alamb I will try to get the other approach (is_used) right first. As @adriangb pointed out, maybe we shouldn't modify the API unless we have a clear need for a third (stats-only) discriminant. Let's not merge this yet 😄

@alamb
Contributor

alamb commented Dec 2, 2025

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing lia/compute-dyn-filters-only-when-consumer-asks-for-it (cb749ad) to fb14d7c diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@alamb
Contributor

alamb commented Dec 2, 2025

show benchmark queue

@alamb
Contributor

alamb commented Dec 2, 2025

🤖 Hi @alamb, you asked to view the benchmark queue (#18938 (comment)).

Job User Benchmarks Comment
18938_3598974879.sh alamb default https://github.com/apache/datafusion/pull/18938#issuecomment-3598974879
18972_3600358896.sh Dandandan tpch_mem https://github.com/apache/datafusion/pull/18972#issuecomment-3600358896

@alamb
Contributor

alamb commented Dec 2, 2025

🤖: Benchmark completed

Details

Comparing HEAD and lia_compute-dyn-filters-only-when-consumer-asks-for-it
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ lia_compute-dyn-filters-only-when-consumer-asks-for-it ┃    Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ QQuery 0     │  2767.67 ms │                                             2740.62 ms │ no change │
│ QQuery 1     │  1262.43 ms │                                             1287.83 ms │ no change │
│ QQuery 2     │  2465.37 ms │                                             2422.75 ms │ no change │
│ QQuery 3     │  1172.47 ms │                                             1119.26 ms │ no change │
│ QQuery 4     │  2364.16 ms │                                             2343.32 ms │ no change │
│ QQuery 5     │ 28802.93 ms │                                            28731.60 ms │ no change │
│ QQuery 6     │  4258.92 ms │                                             4279.77 ms │ no change │
│ QQuery 7     │  3588.21 ms │                                             3490.75 ms │ no change │
└──────────────┴─────────────┴────────────────────────────────────────────────────────┴───────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                                                     ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                                                     │ 46682.16ms │
│ Total Time (lia_compute-dyn-filters-only-when-consumer-asks-for-it)   │ 46415.89ms │
│ Average Time (HEAD)                                                   │  5835.27ms │
│ Average Time (lia_compute-dyn-filters-only-when-consumer-asks-for-it) │  5801.99ms │
│ Queries Faster                                                        │          0 │
│ Queries Slower                                                        │          0 │
│ Queries with No Change                                                │          8 │
│ Queries with Failure                                                  │          0 │
└───────────────────────────────────────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ lia_compute-dyn-filters-only-when-consumer-asks-for-it ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     2.55 ms │                                                2.68 ms │     no change │
│ QQuery 1     │    52.02 ms │                                               50.21 ms │     no change │
│ QQuery 2     │   136.76 ms │                                              137.00 ms │     no change │
│ QQuery 3     │   169.62 ms │                                              167.50 ms │     no change │
│ QQuery 4     │  1187.17 ms │                                             1161.64 ms │     no change │
│ QQuery 5     │  1590.44 ms │                                             1546.91 ms │     no change │
│ QQuery 6     │     2.32 ms │                                                2.12 ms │ +1.09x faster │
│ QQuery 7     │    56.54 ms │                                               54.82 ms │     no change │
│ QQuery 8     │  1511.29 ms │                                             1509.88 ms │     no change │
│ QQuery 9     │  1990.88 ms │                                             1955.38 ms │     no change │
│ QQuery 10    │   403.54 ms │                                              409.87 ms │     no change │
│ QQuery 11    │   451.48 ms │                                              457.59 ms │     no change │
│ QQuery 12    │  1470.94 ms │                                             1485.98 ms │     no change │
│ QQuery 13    │  2217.06 ms │                                             2133.54 ms │     no change │
│ QQuery 14    │  1337.52 ms │                                             1329.19 ms │     no change │
│ QQuery 15    │  1307.39 ms │                                             1301.05 ms │     no change │
│ QQuery 16    │  2752.91 ms │                                             2740.93 ms │     no change │
│ QQuery 17    │  2738.79 ms │                                             2701.23 ms │     no change │
│ QQuery 18    │  5113.70 ms │                                             5073.84 ms │     no change │
│ QQuery 19    │   130.34 ms │                                              130.32 ms │     no change │
│ QQuery 20    │  2059.88 ms │                                             2002.18 ms │     no change │
│ QQuery 21    │  2359.42 ms │                                             2310.55 ms │     no change │
│ QQuery 22    │  4012.46 ms │                                             3913.87 ms │     no change │
│ QQuery 23    │ 13467.65 ms │                                            13188.10 ms │     no change │
│ QQuery 24    │   236.69 ms │                                              234.30 ms │     no change │
│ QQuery 25    │   499.69 ms │                                              491.81 ms │     no change │
│ QQuery 26    │   234.39 ms │                                              218.38 ms │ +1.07x faster │
│ QQuery 27    │  2966.96 ms │                                             2828.76 ms │     no change │
│ QQuery 28    │ 23816.85 ms │                                            23563.87 ms │     no change │
│ QQuery 29    │   943.64 ms │                                              947.57 ms │     no change │
│ QQuery 30    │  1386.29 ms │                                             1399.26 ms │     no change │
│ QQuery 31    │  1412.73 ms │                                             1418.85 ms │     no change │
│ QQuery 32    │  4623.55 ms │                                             4803.31 ms │     no change │
│ QQuery 33    │  5898.12 ms │                                             6005.81 ms │     no change │
│ QQuery 34    │  6020.30 ms │                                             6059.38 ms │     no change │
│ QQuery 35    │  1960.47 ms │                                             1949.46 ms │     no change │
│ QQuery 36    │   122.08 ms │                                              121.00 ms │     no change │
│ QQuery 37    │    51.18 ms │                                               55.14 ms │  1.08x slower │
│ QQuery 38    │   120.04 ms │                                              122.63 ms │     no change │
│ QQuery 39    │   198.46 ms │                                              199.93 ms │     no change │
│ QQuery 40    │    45.03 ms │                                               43.30 ms │     no change │
│ QQuery 41    │    41.59 ms │                                               39.01 ms │ +1.07x faster │
│ QQuery 42    │    33.88 ms │                                               32.37 ms │     no change │
└──────────────┴─────────────┴────────────────────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                                                     ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                                                     │ 97134.62ms │
│ Total Time (lia_compute-dyn-filters-only-when-consumer-asks-for-it)   │ 96300.53ms │
│ Average Time (HEAD)                                                   │  2258.94ms │
│ Average Time (lia_compute-dyn-filters-only-when-consumer-asks-for-it) │  2239.55ms │
│ Queries Faster                                                        │          3 │
│ Queries Slower                                                        │          1 │
│ Queries with No Change                                                │         39 │
│ Queries with Failure                                                  │          0 │
└───────────────────────────────────────────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      HEAD ┃ lia_compute-dyn-filters-only-when-consumer-asks-for-it ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 131.73 ms │                                              140.41 ms │  1.07x slower │
│ QQuery 2     │  27.82 ms │                                               28.02 ms │     no change │
│ QQuery 3     │  39.56 ms │                                               35.74 ms │ +1.11x faster │
│ QQuery 4     │  29.95 ms │                                               29.67 ms │     no change │
│ QQuery 5     │  89.46 ms │                                               87.23 ms │     no change │
│ QQuery 6     │  19.88 ms │                                               19.53 ms │     no change │
│ QQuery 7     │ 228.13 ms │                                              228.75 ms │     no change │
│ QQuery 8     │  36.75 ms │                                               35.39 ms │     no change │
│ QQuery 9     │ 109.11 ms │                                              104.25 ms │     no change │
│ QQuery 10    │  68.04 ms │                                               63.60 ms │ +1.07x faster │
│ QQuery 11    │  18.48 ms │                                               18.83 ms │     no change │
│ QQuery 12    │  51.29 ms │                                               51.85 ms │     no change │
│ QQuery 13    │  47.86 ms │                                               48.06 ms │     no change │
│ QQuery 14    │  14.17 ms │                                               13.95 ms │     no change │
│ QQuery 15    │  24.99 ms │                                               24.89 ms │     no change │
│ QQuery 16    │  25.18 ms │                                               25.17 ms │     no change │
│ QQuery 17    │ 155.49 ms │                                              156.26 ms │     no change │
│ QQuery 18    │ 285.00 ms │                                              277.43 ms │     no change │
│ QQuery 19    │  39.68 ms │                                               37.68 ms │ +1.05x faster │
│ QQuery 20    │  51.35 ms │                                               50.76 ms │     no change │
│ QQuery 21    │ 330.72 ms │                                              333.45 ms │     no change │
│ QQuery 22    │  18.13 ms │                                               17.88 ms │     no change │
└──────────────┴───────────┴────────────────────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                                                     ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                                                     │ 1842.76ms │
│ Total Time (lia_compute-dyn-filters-only-when-consumer-asks-for-it)   │ 1828.79ms │
│ Average Time (HEAD)                                                   │   83.76ms │
│ Average Time (lia_compute-dyn-filters-only-when-consumer-asks-for-it) │   83.13ms │
│ Queries Faster                                                        │         3 │
│ Queries Slower                                                        │         1 │
│ Queries with No Change                                                │        18 │
│ Queries with Failure                                                  │         0 │
└───────────────────────────────────────────────────────────────────────┴───────────┘

@LiaCastaneda
Contributor Author

I haven't had time to get back to this :( I'll be on vacation until the week after next, so I won't be able to look at it again until then. If anyone wants to take over, feel free; otherwise, I'll continue when I'm back!


Labels

api change: Changes the API exposed to users of the crate
core: Core DataFusion crate
datasource: Changes to the datasource crate
execution: Related to the execution crate
optimizer: Optimizer rules
physical-plan: Changes to the physical-plan crate

Development

Successfully merging this pull request may close these issues.

Only compute bounds/ dynamic filters if consumer asks for it
