bug: Polars - calling pow() with a small literal integer as the base triggers an overflow in polars #11740

@discreteds

Description

What happened?

The problem

When pow() is called with a small integer literal (e.g. int8) as the left-hand side (the base) of the calculation, polars returns an incorrect answer: the computation overflows whenever the result needs a larger type (e.g. int64).

In practice the bug is triggered because ibis narrows an integer literal to the smallest type that can hold it before passing the value through to polars.

Although the underlying bug is on the polars side (I have reported it there: pola-rs/polars#25213), ibis magnifies the issue by narrowing the literal to int8.
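To make the arithmetic concrete, here is a minimal pure-Python sketch of what an 8-bit wrapping power would produce. It is only a model: the helper names `wrap_int8` and `pow_int8` are made up for illustration, and it assumes the power is evaluated in int8 with two's-complement wrap-around, which is consistent with the 0 seen in the output below but is not taken from the polars source.

```python
def wrap_int8(value: int) -> int:
    """Wrap an integer into the signed 8-bit range [-128, 127]."""
    return (value + 128) % 256 - 128


def pow_int8(base: int, exp: int) -> int:
    """Repeated multiplication where every intermediate is wrapped to int8.

    Hypothetical model of an int8 power; not how polars is implemented.
    """
    result = 1
    for _ in range(exp):
        result = wrap_int8(result * base)
    return result


print(2 ** 8)          # 256: the mathematically correct answer
print(pow_int8(2, 7))  # -128: 128 does not fit in int8 and wraps
print(pow_int8(2, 8))  # 0: matches the wrong value reported for polars
```

Under this assumption, any base literal that ibis types as int8 gives a wrong answer as soon as the true result leaves the range -128 to 127.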

Possible solution

The translation of pow() could force the left-hand side to a larger type, rather than passing it straight through to polars. Perhaps cast all LHS int literals to int64 and all LHS float literals to float64 here:

@translate.register(ops.Power)

Reproduction

import ibis
import pandas as pd
import polars as pl

    
ibis.set_backend('polars')
polars_backend = ibis.polars.connect()
sqlite_backend = ibis.sqlite.connect()
duckdb_backend = ibis.duckdb.connect()

# Create base table
pl_df = pl.DataFrame({'base': [2], 'exp': [8]})

# ibis backend tables
df_polars = polars_backend.create_table('df_polars', pl_df)
df_sqlite = sqlite_backend.create_table('pl_df', pl_df)
df_duckdb = duckdb_backend.create_table('pl_df', pl_df)

# Expression definitions
lit_base = ibis.literal(2)
lit_exp = ibis.literal(8)

# Build the results
# Polars - returns wrong results when the literal is on the LHS, because ibis types it as int8
polars_results = df_polars.select(
    ibis.literal("polars").name("backend"),
    lit_base.pow(df_polars.exp).name("lit_base**col_exp"),
    df_polars.base.pow(lit_exp).name("col_base**lit_exp"),
    lit_base.pow(lit_exp).name("lit_base**lit_exp"),
    df_polars.base.pow(df_polars.exp).name("col_base**col_exp"),
).execute()

# SQLite - all correct
sqlite_results = df_sqlite.select(
    ibis.literal("sqlite").name("backend"),
    lit_base.pow(df_sqlite.exp).name("lit_base**col_exp"),
    df_sqlite.base.pow(lit_exp).name("col_base**lit_exp"),
    lit_base.pow(lit_exp).name("lit_base**lit_exp"),
    df_sqlite.base.pow(df_sqlite.exp).name("col_base**col_exp"),
).execute()

# DuckDB - all correct
duckdb_results = df_duckdb.select(
    ibis.literal("duckdb").name("backend"),
    lit_base.pow(df_duckdb.exp).name("lit_base**col_exp"),
    df_duckdb.base.pow(lit_exp).name("col_base**lit_exp"),
    lit_base.pow(lit_exp).name("lit_base**lit_exp"),
    df_duckdb.base.pow(df_duckdb.exp).name("col_base**col_exp"),
).execute()

pd.concat([polars_results, duckdb_results, sqlite_results])

Output

backend  lit_base**col_exp  col_base**lit_exp  lit_base**lit_exp  col_base**col_exp
polars                   0                256                  0                256
duckdb                 256                256                256                256
sqlite                 256                256                256                256

What version of ibis are you using?

11.0.0

What backend(s) are you using, if any?

Where the bug occurs: Polars

Used for comparison in example: DuckDB, SQLite
