-
Notifications
You must be signed in to change notification settings - Fork 682
Description
Currently, Backend.insert() has what I view as a footgun, where if the column names don't match it falls back to relying on column ordering.
ibis/ibis/backends/sql/__init__.py
Lines 456 to 470 in 38b19c7
| def _build_insert_from_table( | |
| self, *, target: str, source, db: str | None = None, catalog: str | None = None | |
| ): | |
| compiler = self.compiler | |
| quoted = compiler.quoted | |
| # Compare the columns between the target table and the object to be inserted | |
| # If source is a subset of target, use source columns for insert list | |
| # Otherwise, assume auto-generated column names and use positional ordering. | |
| target_cols = self.get_schema(target, catalog=catalog, database=db).keys() | |
| columns = ( | |
| source_cols | |
| if (source_cols := source.schema().keys()) <= target_cols | |
| else target_cols | |
| ) |
I think we should require that set(inserted_column_names) <= set(target_table_column_names), and error if this is not the case. The use can do this casting themselves if they want columns to be matched based on position.
I don't think we should worry about a nice deprecation path, of first warning for a full version before breaking. I think this is used rarely enough we can just YOLO it and break people suddenly.
No where else in ibis that I am aware of relies on the ordering of columns.
Started from #11624. This is relevant when trying to add an .upsert() method: we want this to have the same semantics as .insert(). For now we are going to continue that PR with the matching behavior to .insert(). We can change both methods later if this suggestion is accepted.
Metadata
Metadata
Assignees
Labels
Type
Projects
Status