Skip to content

Conversation

@feeblefakie
Copy link
Contributor

This is an automated backport of the following:

Please merge this PR after all checks have passed.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @feeblefakie, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request is an automated backport that fundamentally enhances the ScalarDB Blob Storage adapter by enabling full data manipulation capabilities. It introduces a robust internal data model and dedicated handlers for processing Get, Scan, Put, and Delete operations, effectively transforming Blob Storage into a more functional distributed storage solution. The changes are thoroughly validated with extensive integration tests covering various operational aspects and edge cases.

Highlights

  • Blob Storage DML Support: This backport introduces comprehensive support for data manipulation language (DML) operations, including Get, Scan, Put, and Delete, within the Blob Storage adapter. This significantly expands the functionality of the Object Storage integration.
  • New Internal Data Model and Handlers: New classes such as ObjectStoragePartition, ObjectStorageRecord, MutateStatementHandler, and SelectStatementHandler have been added to manage data storage, retrieval, and modification efficiently within the object storage context.
  • Enhanced Integration Test Coverage: Numerous new integration test classes have been added for various scenarios, including conditional mutations, cross-partition scans, multiple clustering/partition keys, and specific Consensus Commit behaviors, ensuring robustness and correctness of the new DML features. Previously disabled DML-related tests are now enabled.
  • New Error Codes and Primary Key Validation: Several new error codes have been introduced to provide more specific feedback for Object Storage operations, including conflicts in mutations, errors in selections, and validation for illegal characters in primary keys.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @feeblefakie, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request backports crucial changes to the Blob Storage adapter, enabling comprehensive data manipulation capabilities. It introduces a robust framework for handling Put and Delete operations, including complex conditional mutations, and integrates these features with both distributed storage and consensus commit transactions. The changes are validated through an extensive suite of new and updated integration tests, ensuring the reliability and correctness of the Object Storage backend.

Highlights

  • Data Manipulation Support: The Blob Storage adapter now fully supports data manipulation operations (DML), including Put and Delete, along with conditional mutations, significantly expanding its functionality.
  • Comprehensive Integration Tests: A wide array of new integration tests have been added for Object Storage, covering various distributed storage and consensus commit scenarios, including cross-partition scans, conditional mutations, and handling of large objects.
  • Refactored Internal Logic: New internal classes such as MutateStatementHandler, SelectStatementHandler, ObjectStoragePartition, ObjectStorageRecord, and StreamingRecordIterator have been introduced to manage data operations, serialization, and record retrieval efficiently.
  • Enhanced Error Handling: New error codes have been added to CoreError to specifically address Object Storage-related issues, such as unsupported cross-partition scan ordering, illegal characters in primary keys, and mutation conflicts.
  • Test Suite Cleanup: Temporary disables for DML operations in existing integration tests have been removed, reflecting the new support for these operations in Object Storage.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new Object Storage adapter, enabling data manipulation operations (CRUD) for Object Storage. The changes include new core logic for handling operations, specific error codes, and extensive integration tests to ensure functionality and robustness. Key components like MutateStatementHandler, SelectStatementHandler, ObjectStoragePartition, and ObjectStorageOperationChecker have been added, along with utility classes for data mapping and record handling. The integration test base classes have also been refactored to allow for more flexible table metadata definitions.

Overall, the changes are well-structured and demonstrate a thorough approach to integrating Object Storage capabilities. The comprehensive test suite is particularly commendable for covering various scenarios, including large object writes and primary key validation.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request is a backport that adds data manipulation operation support to the Blob Storage adapter. The changes are extensive, introducing the core logic for handling mutations (Put, Delete) by reading, modifying, and writing entire partitions (objects). It also adds a comprehensive suite of integration tests, correctly disabling those for unsupported features like secondary indexes and cross-partition ordering. The overall implementation appears solid. I've identified one significant bug in the scan logic that could lead to incorrect results with certain table schemas, and a minor issue in a test's cleanup routine that could affect test reliability. My review comments provide details and suggestions for fixes.

Comment on lines +177 to +191
private boolean isScanForDescClusteringOrder(Scan scan, TableMetadata tableMetadata) {
if (scan.getStartClusteringKey().isPresent()) {
Key startClusteringKey = scan.getStartClusteringKey().get();
String lastValueName =
startClusteringKey.getColumns().get(startClusteringKey.size() - 1).getName();
return tableMetadata.getClusteringOrder(lastValueName) == Scan.Ordering.Order.DESC;
}
if (scan.getEndClusteringKey().isPresent()) {
Key endClusteringKey = scan.getEndClusteringKey().get();
String lastValueName =
endClusteringKey.getColumns().get(endClusteringKey.size() - 1).getName();
return tableMetadata.getClusteringOrder(lastValueName) == Scan.Ordering.Order.DESC;
}
return false;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The logic in isScanForDescClusteringOrder is flawed. It determines whether to swap the start and end clustering keys for a scan based on the ordering of the last column in the provided start/end key. This is incorrect for tables with multiple clustering keys, especially those with mixed ordering (e.g., c1 DESC, c2 ASC).

The decision to treat a scan as "descending" (and thus swap start/end keys) should be based on the overall requested order of the scan. When no explicit ordering is given in the Scan object, it should default to the table's natural clustering order, which is primarily determined by the ordering of the first clustering key.

This can lead to incorrect scan results, such as missing records or returning records outside the intended range. The method should be updated to correctly determine the effective scan direction.

  private boolean isScanForDescClusteringOrder(Scan scan, TableMetadata tableMetadata) {
    if (!scan.getOrderings().isEmpty()) {
      // If orderings are specified, use isReverseOrder to determine the direction.
      return isReverseOrder(scan, tableMetadata);
    }

    // If no orderings are specified, check the natural order of the table.
    if (tableMetadata.getClusteringKeyNames().isEmpty()) {
      return false;
    }
    String firstClusteringKeyName = tableMetadata.getClusteringKeyNames().iterator().next();
    return tableMetadata.getClusteringOrder(firstClusteringKeyName) == Scan.Ordering.Order.DESC;
  }

@brfrn169 brfrn169 merged commit c19fad8 into 3 Nov 13, 2025
205 of 209 checks passed
@brfrn169 brfrn169 deleted the 3-pull-3124 branch November 13, 2025 13:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants