Skip to content

Conversation

@XuPeng-SH
Copy link
Contributor

@XuPeng-SH XuPeng-SH commented Dec 1, 2025

User description

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue https://github.com/matrixorigin/MO-Cloud/issues/6733

What this PR does / why we need it:

When a JOIN operation encounters a type mismatch (e.g., INT = VARCHAR), the type conversion fails during evalJoinCondition(), leaving HashmapBuilder.vecs in a partially initialized state. When Reset() is called during cleanup, it attempts to free nil pointers, causing a panic.


PR Type

Bug fix, Tests


Description

  • Add nil pointer checks in Reset() and Free() methods to prevent panics

  • Introduce cleanupPartiallyCreatedVecs() helper to handle partial initialization failures

  • Call cleanup on errors in evalJoinCondition() to prevent memory leaks

  • Add comprehensive regression tests for nil pointer handling in Reset/Free

  • Add BVT test case for LEFT JOIN with type mismatch error handling


Diagram Walkthrough

flowchart LR
  A["evalJoinCondition<br/>encounters error"] -->|calls| B["cleanupPartiallyCreatedVecs"]
  B -->|frees partial vecs| C["hb.vecs = nil"]
  C -->|prevents nil deref| D["Reset/Free safe"]
  E["Reset/Free methods"] -->|check nil| F["vecs[i] != nil"]
  F -->|check nil| G["vecs[i][j] != nil"]
  G -->|safe free| H["No panic"]
Loading

File Walkthrough

Relevant files
Bug fix
hashmap_util.go
Add nil pointer safety checks and cleanup helper                 

pkg/sql/colexec/hashmap_util/hashmap_util.go

  • Add nil pointer checks in Reset() method for hb.vecs and
    hb.UniqueJoinKeys arrays
  • Add nil pointer checks in Free() method for hb.UniqueJoinKeys array
  • Introduce new cleanupPartiallyCreatedVecs() helper function to safely
    free partially initialized vectors
  • Call cleanupPartiallyCreatedVecs() in evalJoinCondition() when errors
    occur during vector evaluation or duplication
+32/-4   
Tests
hashmap_util_test.go
Add comprehensive nil pointer regression tests                     

pkg/sql/colexec/hashmap_util/hashmap_util_test.go

  • Add import for vector package
  • Add TestResetWithNilPointers() regression test for nil pointer
    handling in Reset with needDupVec=true
  • Add TestResetWithMixedNilAndValidPointers() test for mixed nil and
    valid vectors in Reset
  • Add TestFreeWithNilPointers() regression test for nil pointer handling
    in Free
  • Add TestFreeWithMixedNilAndValidPointers() test for mixed nil and
    valid vectors in Free
+95/-0   
join.test
Add BVT test for LEFT JOIN type mismatch                                 

test/distributed/cases/join/join.test

  • Add BVT test case for LEFT JOIN with type mismatch (INT = VARCHAR)
  • Test that type conversion error is returned instead of causing panic
  • Include table creation, data insertion, and error validation
+13/-0   
join.result
Update test results with type mismatch error                         

test/distributed/cases/join/join.result

  • Update expected output for existing JOIN test (whitespace formatting)
  • Add expected error output for LEFT JOIN with type mismatch test case
  • Include test cleanup statements for new test case
+11/-1   

@qodo-code-review
Copy link

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
Missing auditing: The new error-handling paths and resource frees do not include any auditing or logging of
critical actions or outcomes, making it unclear if failures during JOIN condition
evaluation are recorded.

Referred Code
func (hb *HashmapBuilder) evalJoinCondition(proc *process.Process) error {
	for idx1 := range hb.Batches.Buf {
		tmpVes := make([]*vector.Vector, len(hb.executors))
		hb.vecs = append(hb.vecs, tmpVes)
		for idx2 := range hb.executors {
			vec, err := hb.executors[idx2].Eval(proc, []*batch.Batch{hb.Batches.Buf[idx1]}, nil)
			if err != nil {
				// Clean up partially created vecs to prevent nil pointer issues in Reset
				hb.cleanupPartiallyCreatedVecs(proc)
				return err
			}
			if hb.needDupVec {
				hb.vecs[idx1][idx2], err = vec.Dup(proc.Mp())
				if err != nil {
					// Clean up partially created vecs to prevent nil pointer issues in Reset
					hb.cleanupPartiallyCreatedVecs(proc)
					return err
				}
			} else {

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status:
Error exposure: The test asserts an internal-type conversion error message 'invalid argument cast to
int, bad value abc' which may expose internal casting details to end users.

Referred Code
SELECT * FROM t1 LEFT JOIN t2 ON t1.id = t2.name;
invalid argument cast to int, bad value abc
drop table if exists t1;

Learn more about managing compliance generic rules or creating your own custom rules

Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@mergify mergify bot added the kind/bug Something isn't working label Dec 1, 2025
@qodo-code-review
Copy link

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Prevent potential double-free in cleanup

Add a check for hb.needDupVec in cleanupPartiallyCreatedVecs to avoid a
potential double-free of vectors when an error occurs during join condition
evaluation.

pkg/sql/colexec/hashmap_util/hashmap_util.go [179-193]

 // cleanupPartiallyCreatedVecs frees all vectors in hb.vecs that were successfully created.
 // This is used when an error occurs during evalJoinCondition to prevent memory leaks
 // and nil pointer dereferences in Reset.
 func (hb *HashmapBuilder) cleanupPartiallyCreatedVecs(proc *process.Process) {
+	if !hb.needDupVec {
+		hb.vecs = nil
+		return
+	}
 	for i := 0; i < len(hb.vecs); i++ {
 		if hb.vecs[i] != nil {
 			for j := 0; j < len(hb.vecs[i]); j++ {
 				if hb.vecs[i][j] != nil {
 					hb.vecs[i][j].Free(proc.Mp())
 				}
 			}
 		}
 	}
 	hb.vecs = nil
 }
  • Apply / Chat
Suggestion importance[1-10]: 9

__

Why: The suggestion correctly identifies a potential double-free bug in the new cleanupPartiallyCreatedVecs function, which would occur if an error is encountered when hb.needDupVec is false.

High
  • More

@heni02 heni02 merged commit b716774 into matrixorigin:3.0-dev Dec 2, 2025
16 of 17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/bug Something isn't working Review effort 2/5 size/M Denotes a PR that changes [100,499] lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants