Commit ffffb3f
[SPARK-54397][PYTHON] Make UserDefinedType hashable
### What changes were proposed in this pull request?
Fix the hashability of `UserDefinedType`
### Why are the changes needed?
`UserDefinedType` (UDT) instances are not hashable, so they cannot be used as dict keys or set members, e.g.
```
In [11]: from pyspark.testing.objects import ExamplePointUDT
In [12]: e = ExamplePointUDT()
In [13]: {e: 0}
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[13], line 1
----> 1 {e: 0}
TypeError: unhashable type: 'ExamplePointUDT'
In [14]: from pyspark.ml.linalg import VectorUDT
In [15]: v = VectorUDT()
In [16]: {v: 1}
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[16], line 1
----> 1 {v: 1}
TypeError: unhashable type: 'VectorUDT'
```
See https://docs.python.org/3/reference/datamodel.html#object.__hash__:
> If a class does not define an __eq__() method it should not define a __hash__() operation either; if it defines __eq__() but not __hash__(), its instances will not be usable as items in hashable collections.
> A class that overrides __eq__() and does not define __hash__() will have its __hash__() implicitly set to None. When the __hash__() method of a class is None, instances of the class will raise an appropriate TypeError when a program attempts to retrieve their hash value, and will also be correctly identified as unhashable when checking isinstance(obj, collections.abc.Hashable).
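
The rule quoted above can be reproduced without Spark. The following is a minimal sketch in plain Python; `EqOnlyType`, `EqAndHashType`, and `is_hashable` are hypothetical stand-ins invented for illustration, and the type-based `__eq__` is an assumption, not the code changed by this patch.

```python
from collections.abc import Hashable


class EqOnlyType:
    """Hypothetical stand-in: overrides __eq__ but not __hash__."""

    def __eq__(self, other):
        return type(self) == type(other)


class EqAndHashType:
    """Hypothetical stand-in: defines __hash__ consistently with __eq__."""

    def __eq__(self, other):
        return type(self) == type(other)

    def __hash__(self):
        # Equal objects must hash equally; equality here depends only on the type.
        return hash(type(self))


def is_hashable(obj):
    """Report whether Python treats obj as hashable."""
    return isinstance(obj, Hashable)


# Overriding __eq__ alone implicitly sets __hash__ to None.
assert EqOnlyType.__hash__ is None
assert not is_hashable(EqOnlyType())

try:
    {EqOnlyType(): 0}
except TypeError as exc:
    print(f"TypeError: {exc}")

# With a matching __hash__, instances work as dict keys and set members.
lookup = {EqAndHashType(): 0}
assert lookup[EqAndHashType()] == 0
assert len({EqAndHashType(), EqAndHashType()}) == 1
```

Hashing on the same data that `__eq__` compares keeps the two methods consistent: objects that compare equal always land in the same hash bucket.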
### Does this PR introduce _any_ user-facing change?
Yes, `hash(udt)` works after this fix, so UDT instances can be used as dict keys and set members.
### How was this patch tested?
Added tests.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes apache#53113 from zhengruifeng/type_hashable.
Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
1 parent 7609a10 commit ffffb3f
File tree
3 files changed, +44 -3 lines changed
- python/pyspark
- ml/tests
- sql
- tests