@@ -122,6 +122,32 @@ The ``.materialise()`` method takes the following parameter:

.. seealso::

    Refer to :ref:`Data types` supported by the feature store

Materialise Stream
==================

You can call the ``materialise_stream() -> FeatureGroupJob`` method of a ``FeatureGroup`` instance to load streaming data into the feature group. Calling ``materialise_stream()`` persists the feature group and saves the feature group data along with the metadata in the feature store.

The ``.materialise_stream()`` method takes the following parameters (see the sketch after this list):

- ``input_dataframe``: Features in a streaming DataFrame to be saved.
- ``query_name``: Optional name for the query, making it easier to recognise in the Spark UI. Defaults to ``None``.
- ``ingestion_mode``: Specifies how data of a streaming DataFrame/Dataset is written to the streaming sink. Defaults to ``"append"``.

  - ``append``: Only the new rows in the streaming DataFrame/Dataset are written to the sink.
  - ``complete``: All the rows in the streaming DataFrame/Dataset are written to the sink every time there is an update.
  - ``update``: Only the rows that were updated in the streaming DataFrame/Dataset are written to the sink every time there is an update. If the query doesn't contain aggregations, this is equivalent to ``append`` mode.

- ``await_termination``: Waits for the termination of the query, either by ``query.stop()`` or by an exception. If the query has terminated with an exception, the exception is rethrown. If ``timeout`` is set, the method returns whether the query terminated within the given number of seconds. Defaults to ``False``.
- ``timeout``: Only relevant in combination with ``await_termination=True``. Defaults to ``None``.
- ``checkpoint_dir``: Checkpoint directory location, used as the reference from which to resume the streaming job. Defaults to ``None``.
- ``write_options``: Additional write options for Spark as key-value pairs. Defaults to ``{}``.
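
For illustration, here is a minimal sketch of how these parameters might be combined. It assumes an existing ``feature_group`` instance and a ``SparkSession`` named ``spark``; the checkpoint location is a hypothetical placeholder, and Spark's built-in ``rate`` source stands in for a real stream such as Kafka or object storage.

.. code-block:: python3

    # A minimal sketch, not a complete example: ``spark`` and the checkpoint
    # path below are assumed/hypothetical placeholders.
    streaming_df = spark.readStream.format("rate").load()

    # Ingest the stream into the feature group and block until the query
    # stops, fails, or the timeout elapses.
    feature_group_job = feature_group.materialise_stream(
        input_dataframe=streaming_df,
        query_name="feature_group_ingestion",  # shows up in the Spark UI
        ingestion_mode="append",
        await_termination=True,
        timeout=600,  # seconds; only used with await_termination=True
        checkpoint_dir="oci://bucket@namespace/checkpoints/",  # hypothetical
    )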

.. seealso::

    :ref:`Feature Group Job`

.. seealso::

    Refer to :ref:`Data types` supported by the feature store

Delete
======

@@ -173,6 +199,9 @@ With a ``FeatureGroup`` instance, you can save the expectation details using ``w

.. image:: figures/validation.png

.. code-block:: python3

    from great_expectations.core import ExpectationSuite, ExpectationConfiguration
    from ads.feature_store.common.enums import TransformationMode, ExpectationType
    from ads.feature_store.feature_group import FeatureGroup

    expectation_suite = ExpectationSuite(
        expectation_suite_name="expectation_suite_name"
    )
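
    # Hypothetical continuation (a sketch, not from the original docs):
    # populate the suite with a rule via great_expectations'
    # ExpectationConfiguration; the column name "user_id" is an
    # illustrative placeholder.
    expectation_suite.add_expectation(
        ExpectationConfiguration(
            expectation_type="expect_column_values_to_not_be_null",
            kwargs={"column": "user_id"},
        )
    )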

@@ -221,6 +250,7 @@ feature group or it can be updated later as well.

.. code-block:: python3

    # Define statistics configuration for selected features
    from ads.feature_store.statistics_config import StatisticsConfig
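
    # Hypothetical sketch of how the import above might be used: enable
    # statistics collection for two illustrative columns. The exact
    # constructor arguments are an assumption and should be checked
    # against the StatisticsConfig API.
    stats_config = StatisticsConfig(
        is_enabled=True,
        columns=["column1", "column2"],  # placeholder feature names
    )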