Description
What happened?
The problem
When pow() is called with a small-typed literal (e.g. int8) as the left-hand side (the base), Polars returns an incorrect answer: the calculation overflows whenever the result needs a larger type (e.g. int64).
The bug is triggered in particular because ibis reduces an integer literal to the smallest type that fits before passing the value through to Polars.
The underlying bug is on the Polars side (I have reported it there: pola-rs/polars#25213), but ibis magnifies it by narrowing the literal to int8.
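To see why the reported value is 0 rather than 256, here is a minimal pure-Python sketch of fixed-width signed arithmetic. It assumes the overflow wraps around in two's-complement fashion, which is consistent with the output in this report but is not confirmed against the Polars implementation. The helper `wrap_to_signed` is hypothetical and exists only for illustration.

```python
def wrap_to_signed(value: int, bits: int) -> int:
    """Reduce an integer to a signed two's-complement value of the given width."""
    modulus = 1 << bits
    wrapped = value % modulus
    # Values at or above 2**(bits - 1) represent negative numbers.
    if wrapped >= modulus // 2:
        wrapped -= modulus
    return wrapped


exact = 2 ** 8                            # the mathematically correct result
int8_result = wrap_to_signed(exact, 8)    # what an 8-bit signed result can hold
int64_result = wrap_to_signed(exact, 64)  # what a 64-bit signed result can hold

print(exact, int8_result, int64_result)
```

Under that assumption, 2**8 does not fit in int8 (maximum 127) and wraps to 0, while int64 holds the exact value.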
Possible solution
The call to pow() could force the left-hand side to a larger type rather than passing it straight through to Polars. Perhaps cast all LHS int literals to int64 and all LHS float literals to float64 here:
ibis/ibis/backends/polars/compiler.py, line 721 in 484776f:
@translate.register(ops.Power)
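The widening rule proposed above could be sketched as a small mapping from a literal's type name to the type used for the left-hand side. This is only an illustration of the rule. `widen_literal_dtype` and the string type names are hypothetical and are not part of the ibis or Polars API, and the real change would live in the `ops.Power` translation.

```python
# Hypothetical widening rule for LHS literals of pow().
# All names here are illustrative, not ibis or Polars API.
WIDE_INT = "int64"
WIDE_FLOAT = "float64"


def widen_literal_dtype(dtype: str) -> str:
    """Return the dtype an LHS literal would be cast to before calling pow()."""
    if dtype.startswith("int"):
        return WIDE_INT
    if dtype.startswith("float"):
        return WIDE_FLOAT
    # Anything else (including unsigned ints, which the proposal
    # does not mention) is left unchanged in this sketch.
    return dtype


for name in ("int8", "int32", "float32", "string"):
    print(name, "->", widen_literal_dtype(name))
```

Only literals are widened here, so column operands would keep whatever type the table already declares.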
Reproduction
import ibis
import pandas as pd
import polars as pl
ibis.set_backend('polars')
polars_backend = ibis.polars.connect()
sqlite_backend = ibis.sqlite.connect()
duckdb_backend = ibis.duckdb.connect()
# Create base table
pl_df = pl.DataFrame({'base': [2], 'exp': [8]})
# ibis backend tables
df_polars = polars_backend.create_table('df_polars', pl_df)
df_sqlite = sqlite_backend.create_table('pl_df', pl_df)
df_duckdb = duckdb_backend.create_table('pl_df', pl_df)
# Expression definitions
lit_base = ibis.literal(2)
lit_exp = ibis.literal(8)
# Build the results
# Polars - returns wrong results when the LHS is a literal, because the literal is narrowed to int8
polars_results = df_polars.select(
    ibis.literal("polars").name("backend"),
    lit_base.pow(df_polars.exp).name("lit_base**col_exp"),
    df_polars.base.pow(lit_exp).name("col_base**lit_exp"),
    lit_base.pow(lit_exp).name("lit_base**lit_exp"),
    df_polars.base.pow(df_polars.exp).name("col_base**col_exp"),
).execute()
# SQLite - all correct
sqlite_results = df_sqlite.select(
    ibis.literal("sqlite").name("backend"),
    lit_base.pow(df_sqlite.exp).name("lit_base**col_exp"),
    df_sqlite.base.pow(lit_exp).name("col_base**lit_exp"),
    lit_base.pow(lit_exp).name("lit_base**lit_exp"),
    df_sqlite.base.pow(df_sqlite.exp).name("col_base**col_exp"),
).execute()
# DuckDB - all correct
duckdb_results = df_duckdb.select(
    ibis.literal("duckdb").name("backend"),
    lit_base.pow(df_duckdb.exp).name("lit_base**col_exp"),
    df_duckdb.base.pow(lit_exp).name("col_base**lit_exp"),
    lit_base.pow(lit_exp).name("lit_base**lit_exp"),
    df_duckdb.base.pow(df_duckdb.exp).name("col_base**col_exp"),
).execute()
# Combine the results from all three backends for comparison
pd.concat([polars_results, duckdb_results, sqlite_results])
Output
| backend | lit_base**col_exp | col_base**lit_exp | lit_base**lit_exp | col_base**col_exp |
|---|---|---|---|---|
| polars | 0 | 256 | 0 | 256 |
| duckdb | 256 | 256 | 256 | 256 |
| sqlite | 256 | 256 | 256 | 256 |
What version of ibis are you using?
11.0.0
What backend(s) are you using, if any?
Where the bug occurs: Polars
Used for comparison in example: DuckDB, SQLite
Relevant log output
Code of Conduct
- I agree to follow this project's Code of Conduct