Skip to content

Conversation

@lcartey
Copy link
Contributor

@lcartey lcartey commented Feb 28, 2025

getCdsDeclaration() experienced a performance regression in larger codebases with the upgrade to 2.20.4. An example of a partially completed set of tuple counts:

Evaluated relational algebra for predicate CDS::UserDefinedApplicationService.getCdsDeclaration/0#dispred#80b6f393@15836fu7 with tuple counts:
                3237       ~0%    {3} r1 = SCAN containerparent OUTPUT In.1, _, In.0
                3237       ~0%    {3}    | REWRITE WITH Out.1 := 2
                3237       ~1%    {5}    | JOIN WITH `Files::Container.splitAbsolutePath/2#dispred#43b82b7b_021#join_rhs` ON FIRST 2 OUTPUT Lhs.2, Lhs.0, _, Rhs.2, _
                 572       ~0%    {3}    | REWRITE WITH Tmp.2 := "", Tmp.4 := ".cds", Out.2 := InverseAppend(Tmp.2,Tmp.4,In.3) KEEPING 3
                9529     ~232%    {4}    | JOIN WITH containerparent ON FIRST 1 OUTPUT Lhs.2, _, Lhs.1, Rhs.1
                9529     ~234%    {4}    | REWRITE WITH Out.1 := 2
               51748    ~1132%    {3}    | JOIN WITH `Files::Container.splitAbsolutePath/2#dispred#43b82b7b_120#join_rhs` ON FIRST 2 OUTPUT Lhs.2, Lhs.3, Rhs.2
          2969847789    ~1182%    {4}    | JOIN WITH `JSON::JsonValue.getJsonFile/0#dispred#2c4d2f29#bf_10#join_rhs` ON FIRST 1 OUTPUT Rhs.1, Lhs.0, Lhs.1, Lhs.2
            23973500  ~630445%    {3}    | JOIN WITH project#CDL::CdlElement#class#12896241 ON FIRST 1 OUTPUT Lhs.1, Lhs.2, Lhs.3
        568488449775  ~197988%    {3}    | JOIN WITH `Locations::Locatable.getFile/0#dispred#58f05aaa_10#join_rhs` ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.2
            24415500  ~145193%    {3}    | JOIN WITH CDL::CdlService#ccbb5e6c ON FIRST 1 OUTPUT Lhs.1, Lhs.2, Lhs.0
           891311496  ~306096%    {3}    | JOIN WITH `DataFlowPrivate::Node.getFile/0#dispred#4a057003_10#join_rhs` ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.2
               95500   ~29220%    {3}    | JOIN WITH CDS::UserDefinedApplicationService#class#e1298e76 ON FIRST 1 OUTPUT Lhs.0, Lhs.1, Lhs.2
                   0       ~0%    {2}    | JOIN WITH `DataFlowPrivate::Node.getFile/0#dispred#4a057003` ON FIRST 2 OUTPUT Lhs.0, Lhs.2
                                  return r1

The join order created a cartesian product on CdlElements and JSON files.

This has been addressed by:

  • Using CdlService.getImplementation() to implement UserDefinedService.getCdsDeclaration(). This avoids duplication of logic.
  • Update getImplementation() to use a simpler path matching definition, to avoid join order problems.

There is a performance issue with getCdsDeclaration, so ensure
we're using a single definition.
This now restricts the files to exist in the same directory, not
just the same name.
Copy link
Contributor

@jeongsoolee09 jeongsoolee09 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with the changes, thank you!

@lcartey lcartey merged commit 94ba8d2 into main Feb 28, 2025
5 checks passed
@lcartey lcartey deleted the lcartey/performance-improvements branch February 28, 2025 13:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants