Backport to branch(3): Add support for data manipulation operations in Blob Storage adapter #3158
Conversation
Summary of Changes

Hello @feeblefakie, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers get up to speed. This pull request is an automated backport that enhances the ScalarDB Blob Storage adapter with full data manipulation capabilities. It introduces an internal data model and dedicated handlers for processing Get, Scan, Put, and Delete operations, turning Blob Storage into a more functional distributed storage backend. The changes are validated with extensive integration tests covering various operational aspects and edge cases.
Summary of Changes

Hello @feeblefakie, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers get up to speed. This pull request backports changes to the Blob Storage adapter that enable comprehensive data manipulation. It introduces a framework for handling Put and Delete operations, including complex conditional mutations, and integrates these features with both distributed storage and consensus commit transactions. The changes are validated through an extensive suite of new and updated integration tests for the Object Storage backend.
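The conditional mutations mentioned in the summary can be pictured with a minimal sketch in plain Java (all names here are hypothetical; the adapter's actual classes and condition types are not shown in this excerpt): a condition such as put-if-not-exists is checked against the current partition contents before the mutation is applied.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of a conditional mutation (put-if-not-exists) evaluated
// against an in-memory partition. Illustrative only, not ScalarDB's API.
public class ConditionalPutSketch {
  // Returns true if the mutation was applied, false if the condition failed.
  static boolean putIfNotExists(Map<String, String> partition, String key, String value) {
    if (partition.containsKey(key)) {
      return false; // condition failed; partition is left unchanged
    }
    partition.put(key, value);
    return true;
  }

  public static void main(String[] args) {
    Map<String, String> partition = new HashMap<>();
    System.out.println(putIfNotExists(partition, "k1", "v1")); // true: applied
    System.out.println(putIfNotExists(partition, "k1", "v2")); // false: k1 exists
    System.out.println(partition.get("k1"));                   // v1, unchanged
  }
}
```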
Code Review
This pull request introduces a new Object Storage adapter, enabling data manipulation operations (CRUD) for Object Storage. The changes include new core logic for handling operations, specific error codes, and extensive integration tests to ensure functionality and robustness. Key components like MutateStatementHandler, SelectStatementHandler, ObjectStoragePartition, and ObjectStorageOperationChecker have been added, along with utility classes for data mapping and record handling. The integration test base classes have also been refactored to allow for more flexible table metadata definitions.
Overall, the changes are well-structured and demonstrate a thorough approach to integrating Object Storage capabilities. The comprehensive test suite is particularly commendable for covering various scenarios, including large object writes and primary key validation.
...ation-test/java/com/scalar/db/storage/objectstorage/ObjectStorageWrapperIntegrationTest.java
Code Review
This pull request is a backport that adds data manipulation operation support to the Blob Storage adapter. The changes are extensive, introducing the core logic for handling mutations (Put, Delete) by reading, modifying, and writing entire partitions (objects). It also adds a comprehensive suite of integration tests, correctly disabling those for unsupported features like secondary indexes and cross-partition ordering. The overall implementation appears solid. I've identified one significant bug in the scan logic that could lead to incorrect results with certain table schemas, and a minor issue in a test's cleanup routine that could affect test reliability. My review comments provide details and suggestions for fixes.
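The read-modify-write cycle described above can be sketched with plain Java maps (class and method names here are hypothetical, not the adapter's): each mutation loads the whole partition object, applies the change in memory, and writes the object back.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of mutating an object-storage partition by
// read-modify-write. Illustrative names only, not ScalarDB's classes.
public class PartitionMutationSketch {
  // A "partition" is one storage object: a map from a clustering-key
  // representation to a record payload.
  static class PartitionStore {
    private final Map<String, Map<String, String>> objects = new HashMap<>();

    Map<String, String> read(String partitionKey) {
      return new HashMap<>(objects.getOrDefault(partitionKey, new HashMap<>()));
    }

    void write(String partitionKey, Map<String, String> partition) {
      objects.put(partitionKey, partition);
    }
  }

  // Put: read the whole partition object, upsert one record, write it back.
  static void applyPut(PartitionStore store, String partitionKey,
                       String clusteringKey, String value) {
    Map<String, String> partition = store.read(partitionKey);
    partition.put(clusteringKey, value);
    store.write(partitionKey, partition);
  }

  // Delete: the same cycle, removing the record instead.
  static void applyDelete(PartitionStore store, String partitionKey,
                          String clusteringKey) {
    Map<String, String> partition = store.read(partitionKey);
    partition.remove(clusteringKey);
    store.write(partitionKey, partition);
  }

  public static void main(String[] args) {
    PartitionStore store = new PartitionStore();
    applyPut(store, "p1", "c1", "v1");
    applyPut(store, "p1", "c2", "v2");
    applyDelete(store, "p1", "c1");
    System.out.println(store.read("p1")); // only c2 remains in partition p1
  }
}
```

In a real object store, the write-back step would also need a concurrency guard (for example, a conditional write on the object's version) so that two concurrent mutations cannot silently overwrite each other.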
private boolean isScanForDescClusteringOrder(Scan scan, TableMetadata tableMetadata) {
  if (scan.getStartClusteringKey().isPresent()) {
    Key startClusteringKey = scan.getStartClusteringKey().get();
    String lastValueName =
        startClusteringKey.getColumns().get(startClusteringKey.size() - 1).getName();
    return tableMetadata.getClusteringOrder(lastValueName) == Scan.Ordering.Order.DESC;
  }
  if (scan.getEndClusteringKey().isPresent()) {
    Key endClusteringKey = scan.getEndClusteringKey().get();
    String lastValueName =
        endClusteringKey.getColumns().get(endClusteringKey.size() - 1).getName();
    return tableMetadata.getClusteringOrder(lastValueName) == Scan.Ordering.Order.DESC;
  }
  return false;
}
The logic in isScanForDescClusteringOrder is flawed. It determines whether to swap the start and end clustering keys for a scan based on the ordering of the last column in the provided start/end key. This is incorrect for tables with multiple clustering keys, especially those with mixed ordering (e.g., c1 DESC, c2 ASC).
The decision to treat a scan as "descending" (and thus swap start/end keys) should be based on the overall requested order of the scan. When no explicit ordering is given in the Scan object, it should default to the table's natural clustering order, which is primarily determined by the ordering of the first clustering key.
This can lead to incorrect scan results, such as missing records or returning records outside the intended range. The method should be updated to correctly determine the effective scan direction.
private boolean isScanForDescClusteringOrder(Scan scan, TableMetadata tableMetadata) {
if (!scan.getOrderings().isEmpty()) {
// If orderings are specified, use isReverseOrder to determine the direction.
return isReverseOrder(scan, tableMetadata);
}
// If no orderings are specified, check the natural order of the table.
if (tableMetadata.getClusteringKeyNames().isEmpty()) {
return false;
}
String firstClusteringKeyName = tableMetadata.getClusteringKeyNames().iterator().next();
return tableMetadata.getClusteringOrder(firstClusteringKeyName) == Scan.Ordering.Order.DESC;
}
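The difference between the two rules can be illustrated with a standalone sketch (plain Java with hypothetical column names rather than ScalarDB types): for a table clustered as c1 DESC, c2 ASC, a rule keyed on the last boundary-key column reports ascending, while the table's natural direction, taken from the first clustering key, is descending.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrates the review point above with plain data structures.
// Column names and orders are hypothetical, not taken from the PR.
public class ScanDirectionSketch {
  enum Order { ASC, DESC }

  // Flawed rule: direction taken from the ordering of the LAST column
  // present in the boundary key.
  static boolean flawedIsDesc(Map<String, Order> clusteringOrder, String lastBoundaryColumn) {
    return clusteringOrder.get(lastBoundaryColumn) == Order.DESC;
  }

  // Suggested rule: the default direction comes from the FIRST clustering key.
  static boolean naturalIsDesc(Map<String, Order> clusteringOrder) {
    return clusteringOrder.values().iterator().next() == Order.DESC;
  }

  public static void main(String[] args) {
    // Table with mixed ordering: c1 DESC, c2 ASC.
    Map<String, Order> order = new LinkedHashMap<>();
    order.put("c1", Order.DESC);
    order.put("c2", Order.ASC);

    // A boundary key (c1, c2) ends on c2 (ASC), so the flawed rule says
    // "ascending" even though the table's natural order is DESC on c1.
    System.out.println("flawed:  " + flawedIsDesc(order, "c2")); // flawed:  false
    System.out.println("natural: " + naturalIsDesc(order));      // natural: true
  }
}
```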
This is an automated backport of the following:
Please merge this PR after all checks have passed.