-
Notifications
You must be signed in to change notification settings - Fork 1k
feat(snowflake)!: annotate type for MODE function snowflake #6447
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| pass | ||
|
|
||
|
|
||
| class Mode(AggFunc): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is supported by other dialects.
For example: Databricks mode supports up to 2 arguments:
mode(expr [, deterministic ])
We should verify if other databases support this function (from their docs online), and find a common representation for all of them. Then add parsing tests for each of them validate_identity or validate_all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to support 3 patterns:
Pattern 1: Simple Aggregate
Mode(this=category, deterministic=None)
-> Snowflake: MODE(category)
Pattern 2: With Deterministic Parameter
Mode(this=category, deterministic=TRUE/FALSE)
-> Databricks/Spark: mode(category, true)
Pattern 3: Empty Function with WITHIN GROUP
WithinGroup(
this=Mode(this=None, deterministic=None),
expression=Order(expressions=[Ordered(this=category)])
)
-> PostgreSQL: mode() WITHIN GROUP (ORDER BY category)
VaggelisD
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work Felix! Leaving a few comments to further clean this up, should be good to merge afterwards:
tests/dialects/test_bigquery.py
Outdated
| self.validate_identity("SELECT MODE(category)") | ||
| self.validate_identity("SELECT MODE(price) AS most_common FROM products") | ||
| self.validate_identity("SELECT MODE(status) OVER (PARTITION BY region) FROM orders") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think MODE exists on BQ, couldn't find it in their functions nor does it work:
bigquery> SELECT mode(1);
Function not found: mode; Did you mean mod? at [1:8]There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
tests/dialects/test_snowflake.py
Outdated
| self.validate_identity("SELECT MODE(x)") | ||
| self.validate_identity("SELECT MODE(category) FROM table1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These 2 tests are identical, we can keep one of those:
| self.validate_identity("SELECT MODE(x)") | |
| self.validate_identity("SELECT MODE(category) FROM table1") | |
| self.validate_identity("SELECT MODE(x)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
tests/dialects/test_bigquery.py
Outdated
| self.validate_identity("SELECT MODE(category)") | ||
| self.validate_identity("SELECT MODE(price) AS most_common FROM products") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto, the alias or the FROM <table> shouldn't make a difference in the test, we're mostly interested in testing MODE's args.
| self.validate_identity("SELECT MODE(category)") | |
| self.validate_identity("SELECT MODE(price) AS most_common FROM products") | |
| self.validate_identity("SELECT MODE(category)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
| self.validate_identity("SELECT MODE(category)") | ||
| self.validate_identity("SELECT MODE(price, TRUE) AS deterministic_mode FROM products") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto. Also, instead of repeating tests in each file we can also put the common ones in identity_function.sql such as MODE(x)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the problem is that Postgres only seems to support semantics like "SELECT MODE() WITHIN GROUP (ORDER BY category) FROM table1", not "SELECT MODE(x)"
georgesittas
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to merge after addressing Vaggelis' comments.
4ddfa06 to
80f4644
Compare
annotate type for MODE function snowflake
https://docs.snowflake.com/en/sql-reference/functions/mode