-
Notifications
You must be signed in to change notification settings - Fork 59
odsc-65115 : Transform columns names to make them compatible across models #1046
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| """ | ||
| clean_df = self._remove_trailing_whitespace(data) | ||
| # clean_df = self._normalize_column_names(clean_df) | ||
| if hasattr(self.dataset_info, 'horizon'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a proxy for "if it's forecasting"?
If so it may be better to use kind/type for future compatibility
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the transformations we use OperatorSpec, incase of the forecasting it is ForecastOperatorSpec, type is available in the OperatorConfig which is not required here and adding type to spec will make it redundant, so I have doing that. Updated this to us operatorspec instance type check for better readability .
| self.target_category_columns = [ | ||
| self.raw_column_names.get(col, col) for col in self.target_category_columns | ||
| ] | ||
| df.columns = df.columns.str.replace(" ", "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be doing this replace 2x? Is there a way to do it once so we ensure it's done the same way
| ) | ||
|
|
||
| if self.raw_column_names: | ||
| expected_names.extend(list(self.raw_column_names.values())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these guaranteed to be in historical data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, as we are making this transformation to the historical data columns, expected columns has to be extended using the same post transformed columns
odsc-65115
Add the column name transformations, also save a mapping for reverse transformation as per model requirement.