-The REST PubSub implementation is included in bullet-core, and can be launched along with the Web Service. If it is enabled the Web Service will expose two additional REST endpoints, one for reading/writing Bullet queries, and one
-for reading/writing results.
+The REST PubSub implementation is included in bullet-core, and can be launched along with the Web Service. If it is enabled the Web Service will expose two additional REST endpoints, one for reading/writing Bullet queries, and one for reading/writing results.
## How does it work?
-When the Web Service receives a query from a user, it will create a PubSubMessage and write the message to the "query" RESTPubSub endpoint. This PubSubMessage will contain not only the query, but also some metadata, including the
-appropriate host/port to which the response should be sent (this is done to allow for multiple Web Services running simultaneously). The query is then stored in memory until the backend does a GET from this endpoint, at which
-time the query will be served to the backend, and dropped from the queue in memory.
+When the Web Service receives a query from a user, it will create a PubSubMessage and write the message to the "query" RESTPubSub endpoint. This PubSubMessage will contain not only the query, but also some metadata, including the appropriate host/port to which the response should be sent (this is done to allow for multiple Web Services running simultaneously). The query is then stored in memory until the backend does a GET from this endpoint, at which time the query will be served to the backend, and dropped from the queue in memory.
-Once the backend has generated the results of the query, it will wrap those results in a PubSubMessage. The backend extracts the URL to send the results to from the metadata and writes the results PubSubMessage to the
-"results" REST endpoint with a POST. This result will then be stored in memory until the Web Service does a GET to that endpoint, at which time the Web Service will have the results of the query to send back to the user.
+Once the backend has generated the results of the query, it will wrap those results in a PubSubMessage. The backend extracts the URL to send the results to from the metadata and writes the results PubSubMessage to the "results" REST endpoint with a POST. This result will then be stored in memory until the Web Service does a GET to that endpoint, at which time the Web Service will have the results of the query to send back to the user.
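The round trip described above can be sketched in a few lines of Python. This is only an illustrative model, not Bullet's implementation: the `PubSubMessage` fields, the `InMemoryEndpoint` class, and the `result_url` metadata key are all hypothetical names, and the HTTP POST/GET calls are replaced by plain method calls on in-memory queues.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class PubSubMessage:
    """Hypothetical stand-in for a PubSubMessage: an ID, a payload, and metadata."""
    id: str
    content: str
    metadata: dict = field(default_factory=dict)


class InMemoryEndpoint:
    """Models one REST endpoint: POST stores a message, GET serves and drops the oldest."""

    def __init__(self):
        self._queue = deque()

    def post(self, message):
        self._queue.append(message)

    def get(self):
        # Returns None when nothing is waiting.
        return self._queue.popleft() if self._queue else None


# The two endpoints exposed by the Web Service (URL taken from the settings tables).
RESULT_URL = "http://localhost:9901/api/bullet/pubsub/result"
query_endpoint = InMemoryEndpoint()
result_endpoint = InMemoryEndpoint()
endpoints_by_url = {RESULT_URL: result_endpoint}

# 1. The Web Service wraps the user's query and records where results should go.
query_endpoint.post(PubSubMessage("q1", "example query", {"result_url": RESULT_URL}))

# 2. The backend GETs the query; it is dropped from the in-memory queue.
query = query_endpoint.get()

# 3. The backend wraps the results and POSTs them to the URL from the metadata.
result = PubSubMessage(query.id, "results for " + query.content)
endpoints_by_url[query.metadata["result_url"]].post(result)

# 4. The Web Service GETs the result and can return it to the user.
delivered = result_endpoint.get()
```

Carrying the result URL in the metadata is what lets several Web Services share one backend: each result is posted back to whichever Web Service submitted the query.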
## Setup
@@ -20,13 +16,13 @@ To enable the RESTPubSub and expose the two additional necessary REST endpoints,
bullet.pubsub.builtin.rest.enabled: true
```
-...in the Web Service Application.yaml file. This can also be done from the command line when launching the Web Service jar file by adding the command-line option:
+...in the Web Service ```application.yaml``` configuration file. This can also be done from the command line when launching the Web Service jar file by adding the command-line option:
```bash
--bullet.pubsub.builtin.rest.enabled=true
```
-This will enable the two necessary REST endpoints, the paths for which can be configured in the Application.yaml file with the settings:
+This will enable the two necessary REST endpoints, the paths for which can be configured in the ```application.yaml``` file with the settings:
-* __bullet.pubsub.context.name: "QUERY_PROCESSING"__ - tells the PubSub that it is running in the backend
-* __bullet.pubsub.class.name: "com.yahoo.bullet.kafka.KafkaPubSub"__ - tells Bullet to use this class for it's PubSub
-* __bullet.pubsub.rest.connect.timeout.ms: 5000__ - sets the HTTP connect timeout to a half second
-* __bullet.pubsub.rest.subscriber.max.uncommitted.messages: 100__ - this is the maxiumum number of uncommitted messages allowed before blocking
-* __bullet.pubsub.rest.query.subscriber.min.wait.ms: 10__ - this setting is used to avoid making an http request too rapidly and overloading the http endpoint. It will force the backend to poll the query endpoint at most once every 10ms.
-* __bullet.pubsub.rest.query.urls__ - this should be a list of all the query rest enpoint URLs. If you are only running one Web Service this will only contain one url (the url of your Web Service followed by the full path of the query endpoint).
+| bullet.pubsub.context.name | QUERY_PROCESSING | Tells the PubSub that it is running in the backend |
+| bullet.pubsub.class.name | com.yahoo.bullet.pubsub.rest.RESTPubSub | Tells Bullet to use this class for its PubSub |
+| bullet.pubsub.rest.connect.timeout.ms | 5000 | Sets the HTTP connect timeout to 5 s |
+| bullet.pubsub.rest.subscriber.max.uncommitted.messages | 100 | This is the maximum number of uncommitted messages allowed to be read by the subscriber before blocking |
+| bullet.pubsub.rest.query.subscriber.min.wait.ms | 10 | This is used to avoid making an HTTP request too rapidly and overloading the HTTP endpoint. It will force the backend to poll the query endpoint at most once every 10ms |
+| bullet.pubsub.rest.query.urls | <EXAMPLE DEFAULTS> | This should be a list of all the query REST endpoint URLs. If you are only running one Web Service this will only contain one URL (the URL of your Web Service followed by the full path of the query endpoint) |
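To make the effect of a `min.wait.ms` setting concrete, here is a minimal Python sketch of a poller that skips requests arriving too soon after the previous one. It is an assumption about the general technique, not Bullet's subscriber code: `ThrottledPoller`, `fetch`, and the injectable `clock` are hypothetical names introduced for illustration.

```python
import time


class ThrottledPoller:
    """Calls `fetch` at most once every `min_wait_ms` milliseconds."""

    def __init__(self, fetch, min_wait_ms=10, clock=time.monotonic):
        self._fetch = fetch                      # stand-in for the HTTP GET
        self._min_wait_s = min_wait_ms / 1000.0
        self._clock = clock                      # injectable so the demo is deterministic
        self._last_poll = None

    def poll(self):
        now = self._clock()
        if self._last_poll is not None and now - self._last_poll < self._min_wait_s:
            return None                          # too soon: skip the request entirely
        self._last_poll = now
        return self._fetch()


# Demo with a fake clock (seconds) instead of real time.
fake_now = [0.0]
requests_made = []


def fake_fetch():
    requests_made.append(fake_now[0])
    return "message"


poller = ThrottledPoller(fake_fetch, min_wait_ms=10, clock=lambda: fake_now[0])
first = poller.poll()       # allowed: no earlier poll
fake_now[0] = 0.005
second = poller.poll()      # skipped: only 5 ms since the last request
fake_now[0] = 0.020
third = poller.poll()       # allowed: 20 ms since the last request
```

However often the caller loops, the endpoint sees at most one request per wait interval.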
-* __bullet.pubsub.context.name: "QUERY_SUBMISSION"__ - tells the PubSub that it is running in the Web Service
-* __bullet.pubsub.class.name: "com.yahoo.bullet.kafka.KafkaPubSub"__ - tells Bullet to use this class for it's PubSub
-* __bullet.pubsub.rest.connect.timeout.ms: 5000__ - sets the HTTP connect timeout to a half second
-* __bullet.pubsub.rest.subscriber.max.uncommitted.messages: 100__ - this is the maxiumum number of uncommitted messages allowed before blocking
-* __bullet.pubsub.rest.query.subscriber.min.wait.ms: 10__ - this setting is used to avoid making an http request too rapidly and overloading the http endpoint. It will force the backend to poll the query endpoint at most once every 10ms.
-* __bullet.pubsub.rest.result.url: "http://localhost:9901/api/bullet/pubsub/result"__ - this is the endpoint from which the WebService should read results - it should generally be the hostname of that machine the Web Service is running on (or "localhost").
-* __bullet.pubsub.rest.query.urls__ - in the Web Service this setting should contain __exactly one__ url - the url to which queries should be written - it should generally be the hostname of that machine the Web Service is running on (or "localhost").
+| bullet.pubsub.context.name | QUERY_SUBMISSION | Tells the PubSub that it is running in the Web Service |
+| bullet.pubsub.class.name | com.yahoo.bullet.pubsub.rest.RESTPubSub | Tells Bullet to use this class for its PubSub |
+| bullet.pubsub.rest.connect.timeout.ms | 5000 | Sets the HTTP connect timeout to 5 s |
+| bullet.pubsub.rest.subscriber.max.uncommitted.messages | 100 | This is the maximum number of uncommitted messages allowed to be read by the subscriber before blocking |
+| bullet.pubsub.rest.result.subscriber.min.wait.ms | 10 | This is used to avoid making an HTTP request too rapidly and overloading the HTTP endpoint. It will force the Web Service to poll the result endpoint at most once every 10ms |
+| bullet.pubsub.rest.result.url | http://localhost:9901/api/bullet/pubsub/result | This is the endpoint from which the Web Service should read results. The host should be the hostname of the machine the Web Service is running on (or ```localhost```) |
+| bullet.pubsub.rest.query.urls | http://localhost:9901/api/bullet/pubsub/query | In the Web Service, this should contain *exactly one* URL (the URL to which queries should be written). The host should be the hostname of the machine the Web Service is running on (or ```localhost```) |
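The Web Service constraints in the table (a `QUERY_SUBMISSION` context, a result URL, and exactly one query URL) can be checked with a small script before deploying. The sketch below is a hypothetical helper, not part of Bullet; it assumes the settings have already been loaded into a plain Python dict keyed by the setting names shown above.

```python
def validate_web_service_settings(settings):
    """Return a list of problems found in a Web Service RESTPubSub settings dict."""
    problems = []
    if settings.get("bullet.pubsub.context.name") != "QUERY_SUBMISSION":
        problems.append("bullet.pubsub.context.name should be QUERY_SUBMISSION")
    if not settings.get("bullet.pubsub.rest.result.url"):
        problems.append("bullet.pubsub.rest.result.url is not set")
    urls = settings.get("bullet.pubsub.rest.query.urls", [])
    if len(urls) != 1:
        problems.append(
            "bullet.pubsub.rest.query.urls must contain exactly one URL, found %d" % len(urls)
        )
    return problems


# Values copied from the Web Service table above.
web_service_settings = {
    "bullet.pubsub.context.name": "QUERY_SUBMISSION",
    "bullet.pubsub.class.name": "com.yahoo.bullet.pubsub.rest.RESTPubSub",
    "bullet.pubsub.rest.result.url": "http://localhost:9901/api/bullet/pubsub/result",
    "bullet.pubsub.rest.query.urls": ["http://localhost:9901/api/bullet/pubsub/query"],
}
ok_problems = validate_web_service_settings(web_service_settings)

# A misconfigured copy: two query URLs are only valid in the backend.
bad_settings = dict(web_service_settings)
bad_settings["bullet.pubsub.rest.query.urls"] = [
    "http://localhost:9901/api/bullet/pubsub/query",
    "http://localhost:9902/api/bullet/pubsub/query",
]
bad_problems = validate_web_service_settings(bad_settings)
```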
docs/ui/usage.md: 2 additions & 3 deletions
@@ -147,7 +147,7 @@ This example gets the Top 3 most popular ```type``` values (there are only 6 but
### Approximate
-By adding ```duration``` into the fields, the number of unique values for ```(type, duration)``` is increased. However, because ```duration``` has a tendency to have low values, we will have some *frequent items*. The counts are now estimated.
+By adding ```duration``` into the fields, the number of unique values for ```(type, duration)``` is increased. However, because ```duration``` has a tendency to have low values, we will have some *frequent items*. The counts are now estimated.
@@ -183,7 +183,7 @@ In this example we compute bucket'ed frequency for the "gaussian" field. As the
If the regular chart option is insufficient for your result (for instance, you have too many groups and metrics, or you want to post-aggregate your results or remove outliers), then there is an advanced Pivot mode available when you are in the Chart option.
-The Pivot option provides a drag-and-drop interface to drag fields to break down and aggregate by their values. Operations such as finding standard deviations, variance, etc. are available, as well as easily viewing them as tables and charts.
+The Pivot option provides a drag-and-drop interface to drag fields to break down and aggregate by their values. Operations such as finding standard deviations, variance, etc. are available, as well as easily viewing them as tables and charts.
The following example shows a ```Group``` query with multiple groups and metrics and some interactions with the Pivot table.
@@ -192,4 +192,3 @@ The following example shows a ```Group``` query with multiple groups and metrics
!!! note "Raw data does not have a regular chart mode option"

    This is deliberate since the Chart option tries to infer your independent and dependent columns. When you fetch raw data, this inference is prone to errors, so only the Pivot option is allowed.