Commit ea4d173

Fixing REST with table (#21)
1 parent 8258ec7 commit ea4d173

2 files changed

+28
-35
lines changed

docs/pubsub/rest.md

Lines changed: 26 additions & 32 deletions
@@ -1,16 +1,12 @@
11
# REST PubSub
22

3-
The REST PubSub implementation is included in bullet-core, and can be launched along with the Web Service. If it is enabled the Web Service will expose two additional REST endpoints, one for reading/writing Bullet queries, and one
4-
for reading/writing results.
3+
The REST PubSub implementation is included in bullet-core, and can be launched along with the Web Service. If it is enabled the Web Service will expose two additional REST endpoints, one for reading/writing Bullet queries, and one for reading/writing results.
54

65
## How does it work?
76

8-
When the Web Service receives a query from a user, it will create a PubSubMessage and write the message to the "query" RESTPubSub endpoint. This PubSubMessage will contain not only the query, but also some metadata, including the
9-
appropriate host/port to which the response should be sent (this is done to allow for multiple Web Services running simultaneously). The query is then stored in memory until the backend does a GET from this endpoint, at which
10-
time the query will be served to the backend, and dropped from the queue in memory.
7+
When the Web Service receives a query from a user, it will create a PubSubMessage and write the message to the "query" RESTPubSub endpoint. This PubSubMessage will contain not only the query, but also some metadata, including the appropriate host/port to which the response should be sent (this is done to allow for multiple Web Services running simultaneously). The query is then stored in memory until the backend does a GET from this endpoint, at which time the query will be served to the backend, and dropped from the queue in memory.
118

12-
Once the backed has generated the results of the query, it will wrap those results in PubSubMessage. The backend extracts the URL to send the results to from the metadata and writes the results PubSubMessage to the
13-
"results" REST endpoint with a POST. This result will then be stored in memory until the Web Service does a GET to that endpoint, at which time the Web Service will have the results of the query to send back to the user.
9+
Once the backend has generated the results of the query, it will wrap those results in a PubSubMessage. The backend extracts the URL to send the results to from the metadata and writes the results PubSubMessage to the "results" REST endpoint with a POST. This result will then be stored in memory until the Web Service does a GET to that endpoint, at which time the Web Service will have the results of the query to send back to the user.
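The round trip described above can be sketched in a few lines of Python. This is only an illustration of the flow, not Bullet's implementation: the `InMemoryEndpoint` class, the `result_url` metadata key, and the helper function names are all hypothetical, and plain in-process queues stand in for the real HTTP POST and GET calls.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical stand-in for a PubSubMessage: an id, a payload, and metadata.
@dataclass
class PubSubMessage:
    id: str
    content: str
    metadata: dict = field(default_factory=dict)

class InMemoryEndpoint:
    """Stands in for one REST endpoint backed by an in-memory queue."""

    def __init__(self):
        self._queue = deque()

    def post(self, message):
        # A POST stores the message in memory.
        self._queue.append(message)

    def get(self):
        # A GET serves the oldest message and drops it from the queue.
        return self._queue.popleft() if self._queue else None

QUERY_URL = "http://localhost:9901/api/bullet/pubsub/query"
RESULT_URL = "http://localhost:9901/api/bullet/pubsub/result"
endpoints = {QUERY_URL: InMemoryEndpoint(), RESULT_URL: InMemoryEndpoint()}

def web_service_submit(query_id, query):
    # The metadata tells the backend where to send the response
    # ("result_url" is an assumed key name).
    message = PubSubMessage(query_id, query, {"result_url": RESULT_URL})
    endpoints[QUERY_URL].post(message)

def backend_poll_once():
    # The backend reads a query, computes a placeholder result, and writes it
    # to the URL found in the query's metadata.
    message = endpoints[QUERY_URL].get()
    if message is None:
        return False
    result = PubSubMessage(message.id, f"results for: {message.content}")
    endpoints[message.metadata["result_url"]].post(result)
    return True

def web_service_read_result():
    return endpoints[RESULT_URL].get()
```

Calling `web_service_submit`, then `backend_poll_once`, then `web_service_read_result` walks one query through both endpoints. Carrying the result URL in the metadata is what lets several Web Services share one backend.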
1410

1511
## Setup
1612

@@ -20,13 +16,13 @@ To enable the RESTPubSub and expose the two additional necessary REST endpoints,
2016
bullet.pubsub.builtin.rest.enabled: true
2117
```
2218
23-
...in the Web Service Application.yaml file. This can also be done from the command line when launching the Web Service jar file by adding the command-line option:
19+
...in the Web Service ```application.yaml``` configuration file. This can also be done from the command line when launching the Web Service jar file by adding the command-line option:
2420

2521
```bash
2622
--bullet.pubsub.builtin.rest.enabled=true
2723
```
2824

29-
This will enable the two necessary REST endpoints, the paths for which can be configured in the Application.yaml file with the settings:
25+
This will enable the two necessary REST endpoints, the paths for which can be configured in the ```application.yaml``` file with the settings:
3026

3127
```yaml
3228
bullet.pubsub.builtin.rest.query.path: /pubsub/query
@@ -39,46 +35,44 @@ Configure the backend to use the REST PubSub:
3935

4036
```yaml
4137
bullet.pubsub.context.name: "QUERY_PROCESSING"
42-
bullet.pubsub.class.name: "com.yahoo.bullet.kafka.KafkaPubSub"
43-
38+
bullet.pubsub.class.name: "com.yahoo.bullet.pubsub.rest.RESTPubSub"
4439
bullet.pubsub.rest.connect.timeout.ms: 5000
4540
bullet.pubsub.rest.subscriber.max.uncommitted.messages: 100
46-
bullet.pubsub.rest.result.subscriber.min.wait.ms: 10
4741
bullet.pubsub.rest.query.subscriber.min.wait.ms: 10
4842
bullet.pubsub.rest.query.urls:
49-
- "http://webServiceHostNameA:9901/api/bullet/pubsub/query"
50-
- "http://webServiceHostNameB:9902/api/bullet/pubsub/query"
43+
- "http://<API_HOST_A>:9901/api/bullet/pubsub/query"
44+
- "http://<API_HOST_B>:9901/api/bullet/pubsub/query"
5145
```
52-
53-
* __bullet.pubsub.context.name: "QUERY_PROCESSING"__ - tells the PubSub that it is running in the backend
54-
* __bullet.pubsub.class.name: "com.yahoo.bullet.kafka.KafkaPubSub"__ - tells Bullet to use this class for it's PubSub
55-
* __bullet.pubsub.rest.connect.timeout.ms: 5000__ - sets the HTTP connect timeout to a half second
56-
* __bullet.pubsub.rest.subscriber.max.uncommitted.messages: 100__ - this is the maxiumum number of uncommitted messages allowed before blocking
57-
* __bullet.pubsub.rest.query.subscriber.min.wait.ms: 10__ - this setting is used to avoid making an http request too rapidly and overloading the http endpoint. It will force the backend to poll the query endpoint at most once every 10ms.
58-
* __bullet.pubsub.rest.query.urls__ - this should be a list of all the query rest enpoint URLs. If you are only running one Web Service this will only contain one url (the url of your Web Service followed by the full path of the query endpoint).
46+
| Setting Name | Default Value | Meaning |
47+
| ----------------------------------------------------------------- | --------------------------------------- | ---------------- |
48+
| bullet.pubsub.context.name | QUERY_PROCESSING | Tells the PubSub that it is running in the backend |
49+
| bullet.pubsub.class.name | com.yahoo.bullet.pubsub.rest.RESTPubSub | Tells Bullet to use this class for its PubSub |
50+
| bullet.pubsub.rest.connect.timeout.ms | 5000 | Sets the HTTP connect timeout to 5 s |
51+
| bullet.pubsub.rest.subscriber.max.uncommitted.messages | 100 | This is the maximum number of uncommitted messages allowed to be read by the subscriber before blocking |
52+
| bullet.pubsub.rest.query.subscriber.min.wait.ms | 10 | This is used to avoid making an HTTP request too rapidly and overloading the HTTP endpoint. It will force the backend to poll the query endpoint at most once every 10ms |
53+
| bullet.pubsub.rest.query.urls | <EXAMPLE DEFAULTS> | This should be a list of all the query REST endpoint URLs. If you are only running one Web Service this will only contain one URL (the URL of your Web Service followed by the full path of the query endpoint) |
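The effect of a minimum wait setting such as `bullet.pubsub.rest.query.subscriber.min.wait.ms` can be sketched as a simple throttle. This is an assumption about the general technique, not the RESTPubSub code: `ThrottledPoller`, its `fetch` callback, and the injectable `clock` are hypothetical names used only for illustration.

```python
import time

class ThrottledPoller:
    """Calls `fetch` at most once every `min_wait_ms` milliseconds."""

    def __init__(self, fetch, min_wait_ms=10, clock=time.monotonic):
        self._fetch = fetch                    # hypothetical callable doing the HTTP GET
        self._min_wait_s = min_wait_ms / 1000.0
        self._clock = clock                    # injectable so the behavior can be tested
        self._last_request = None

    def poll(self):
        now = self._clock()
        too_soon = (
            self._last_request is not None
            and now - self._last_request < self._min_wait_s
        )
        if too_soon:
            # Skip the request entirely rather than hit the endpoint again.
            return None
        self._last_request = now
        return self._fetch()
```

A caller can invoke `poll()` in a tight loop; requests that arrive before the minimum wait has elapsed return `None` without touching the endpoint.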
5954

6055
### Plug into the Web Service
6156

6257
Configure the Web Service to use the REST PubSub:
6358

6459
```yaml
6560
bullet.pubsub.context.name: "QUERY_SUBMISSION"
66-
bullet.pubsub.class.name: "com.yahoo.bullet.kafka.KafkaPubSub"
67-
61+
bullet.pubsub.class.name: "com.yahoo.bullet.pubsub.rest.RESTPubSub"
6862
bullet.pubsub.rest.connect.timeout.ms: 5000
6963
bullet.pubsub.rest.subscriber.max.uncommitted.messages: 100
7064
bullet.pubsub.rest.result.subscriber.min.wait.ms: 10
71-
bullet.pubsub.rest.query.subscriber.min.wait.ms: 10
7265
bullet.pubsub.rest.result.url: "http://localhost:9901/api/bullet/pubsub/result"
7366
bullet.pubsub.rest.query.urls:
7467
- "http://localhost:9901/api/bullet/pubsub/query"
7568
```
7669

77-
* __bullet.pubsub.context.name: "QUERY_SUBMISSION"__ - tells the PubSub that it is running in the Web Service
78-
* __bullet.pubsub.class.name: "com.yahoo.bullet.kafka.KafkaPubSub"__ - tells Bullet to use this class for it's PubSub
79-
* __bullet.pubsub.rest.connect.timeout.ms: 5000__ - sets the HTTP connect timeout to a half second
80-
* __bullet.pubsub.rest.subscriber.max.uncommitted.messages: 100__ - this is the maxiumum number of uncommitted messages allowed before blocking
81-
* __bullet.pubsub.rest.query.subscriber.min.wait.ms: 10__ - this setting is used to avoid making an http request too rapidly and overloading the http endpoint. It will force the backend to poll the query endpoint at most once every 10ms.
82-
* __bullet.pubsub.rest.result.url: "http://localhost:9901/api/bullet/pubsub/result"__ - this is the endpoint from which the WebService should read results - it should generally be the hostname of that machine the Web Service is running on (or "localhost").
83-
* __bullet.pubsub.rest.query.urls__ - in the Web Service this setting should contain __exactly one__ url - the url to which queries should be written - it should generally be the hostname of that machine the Web Service is running on (or "localhost").
84-
70+
| Setting Name | Default Value | Meaning |
71+
| ----------------------------------------------------------------- | ---------------------------------------------- | ---------------- |
72+
| bullet.pubsub.context.name | QUERY_SUBMISSION | Tells the PubSub that it is running in the Web Service |
73+
| bullet.pubsub.class.name | com.yahoo.bullet.pubsub.rest.RESTPubSub | Tells Bullet to use this class for its PubSub |
74+
| bullet.pubsub.rest.connect.timeout.ms | 5000 | Sets the HTTP connect timeout to 5 s |
75+
| bullet.pubsub.rest.subscriber.max.uncommitted.messages | 100 | This is the maximum number of uncommitted messages allowed to be read by the subscriber before blocking |
76+
| bullet.pubsub.rest.result.subscriber.min.wait.ms                  | 10                                             | This is used to avoid making an HTTP request too rapidly and overloading the HTTP endpoint. It will force the Web Service to poll the result endpoint at most once every 10ms |
77+
| bullet.pubsub.rest.result.url                                     | http://localhost:9901/api/bullet/pubsub/result | This is the endpoint from which the Web Service should read results. It should generally use the hostname of the machine the Web Service is running on (or ```localhost```) |
78+
| bullet.pubsub.rest.query.urls                                     | http://localhost:9901/api/bullet/pubsub/query  | In the Web Service, this should contain *exactly one* URL (the URL to which queries should be written). It should generally use the hostname of the machine the Web Service is running on (or ```localhost```) |

docs/ui/usage.md

Lines changed: 2 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -147,7 +147,7 @@ This example gets the Top 3 most popular ```type``` values (there are only 6 but
147147

148148
### Approximate
149149

150-
By adding ```duration``` into the fields, the number of unique values for ```(type, duration)``` is increased. However, because ```duration``` has a tendency to have low values, we will have some *frequent items*. The counts are now estimated.
150+
By adding ```duration``` into the fields, the number of unique values for ```(type, duration)``` is increased. However, because ```duration``` has a tendency to have low values, we will have some *frequent items*. The counts are now estimated.
151151

152152
<iframe width="900" height="508" src="https://www.youtube.com/embed/hCHWy229Yhw?autoplay=0&loop=0&playlist=hCHWy229Yhw" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe>
153153

@@ -183,7 +183,7 @@ In this example we compute bucket'ed frequency for the "gaussian" field. As the
183183

184184
If the regular chart option is insufficient for your result (for instance, you have too many groups and metrics, or you want to post-aggregate your results or remove outliers), then there is an advanced Pivot mode available when you are in the Chart option.
185185

186-
The Pivot option provides a drag-and-drop interface to drag fields to breakdown and aggregate by their values. Operations such as finding standard deviations, variance, etc are available as well as easily viewing them as tables and charts.
186+
The Pivot option provides a drag-and-drop interface for dragging fields to break down and aggregate by their values. Operations such as finding standard deviations and variance are available, and the results can easily be viewed as tables and charts.
187187

188188
The following example shows a ```Group``` query with multiple groups and metrics and some interactions with the Pivot table.
189189

@@ -192,4 +192,3 @@ The following example shows a ```Group``` query with multiple groups and metrics
192192
!!! note "Raw data does have a regular chart mode option"
193193

194194
This is deliberate since the Chart option tries to infer your independent and dependent columns. When you fetch raw data, this is prone to errors so only the Pivot option is allowed.
195-
