Commit 9cd62f6

Cleaning up
1 parent 77a50ff commit 9cd62f6

19 files changed, with 138 additions and 558 deletions

docs/about/contact.md

Lines changed: 7 additions & 7 deletions
@@ -6,13 +6,13 @@ If you have any issues with any of the particular Bullet sub-components, feel fr

 | | |
 | ------------- | ------ |
-| Storm | [https://github.com/yahoo/bullet-storm/issues](https://github.com/yahoo/bullet-storm/issues) |
-| Web Service | [https://github.com/yahoo/bullet-service/issues](https://github.com/yahoo/bullet-service/issues) |
-| UI | [https://github.com/yahoo/bullet-ui/issues](https://github.com/yahoo/bullet-ui/issues) |
-| Record | [https://github.com/yahoo/bullet-record/issues](https://github.com/yahoo/bullet-record/issues) |
-| Core | [https://github.com/yahoo/bullet-core/issues](https://github.com/yahoo/bullet-core/issues) |
-| Kafka PubSub | [https://github.com/yahoo/bullet-kafka/issues](https://github.com/yahoo/bullet-kafka/issues) |
-| Documentation | [https://github.com/yahoo/bullet-docs/issues](https://github.com/yahoo/bullet-docs/issues) |
+| Storm | [https://github.com/bullet-db/bullet-storm/issues](https://github.com/bullet-db/bullet-storm/issues) |
+| Web Service | [https://github.com/bullet-db/bullet-service/issues](https://github.com/bullet-db/bullet-service/issues) |
+| UI | [https://github.com/bullet-db/bullet-ui/issues](https://github.com/bullet-db/bullet-ui/issues) |
+| Record | [https://github.com/bullet-db/bullet-record/issues](https://github.com/bullet-db/bullet-record/issues) |
+| Core | [https://github.com/bullet-db/bullet-core/issues](https://github.com/bullet-db/bullet-core/issues) |
+| Kafka PubSub | [https://github.com/bullet-db/bullet-kafka/issues](https://github.com/bullet-db/bullet-kafka/issues) |
+| Documentation | [https://github.com/bullet-db/bullet-docs/issues](https://github.com/bullet-db/bullet-docs/issues) |

 ## Mailing Lists

docs/about/contributing.md

Lines changed: 6 additions & 5 deletions
@@ -4,9 +4,7 @@ We welcome all contributions! We also welcome all usage experiences, stories, an

 ## Contributor License Agreement (CLA)

-Bullet is hosted under the [Yahoo Github Organization](https://github.com/yahoo). In order to contribute to any Yahoo project, you will need to submit a CLA. When you submit a Pull Request to any Bullet repository, a CLABot will ask you to sign the CLA if you haven't signed one already.
-
-Read the [human-readable summary](https://yahoocla.herokuapp.com/) of the CLA.
+Bullet is hosted under the [Bullet Github Organization](https://github.com/bullet-db), a subsidiary of the [Yahoo Github Organization](https://github.com/yahoo). In order to contribute to any Yahoo project, you will need to submit a CLA. When you submit a Pull Request to any Bullet repository, a CLABot will ask you to sign the CLA if you haven't signed one already. Read the [human-readable summary](https://yahoocla.herokuapp.com/) of the CLA.

 ## Future plans

@@ -16,8 +14,11 @@ This list is neither comprehensive nor in any particular order.

 | Feature | Components | Description | Status |
 |-------------------- | ----------- | ------------------------- | ------------- |
-| Security | WS, UI | The obvious enterprise security for locking down access to the data and the instance of Bullet. Considering SSL, Kerberos, LDAP etc. Ideally, without a database | Planning |
 | Bullet on X | BE | With the pub/sub feature, Bullet can be implemented on other Stream Processors like Flink, Kafka Streaming, Samza etc | Open |
-| Bullet on Beam | BE | Bullet can be implemented on [Apache Beam](https://beam.apache.org) as an alternative to implementing it on various Stream Processors | Open |
 | SQL API | BE, WS | WS supports an endpoint that converts a SQL-like query into Bullet queries | In Progress |
+| More Windows | BE | We have implemented a few of the windows we wanted to support initially but there are still more we can add | Open |
+| More Aggregations | BE, UI | We can add more aggregations like Group By Count Distinct etc | Open |
+| Post Aggregations | BE, UI | Post aggregations once the aggregations are done is useful | Open |
+| Bullet on Beam | BE | Bullet can be implemented on [Apache Beam](https://beam.apache.org) as an alternative to implementing it on various Stream Processors | Open |
+| Security | WS, UI | The obvious enterprise security for locking down access to the data and the instance of Bullet. Considering SSL, Kerberos, LDAP etc. Ideally, without a database | Planning |
 | Packaging | UI, BE, WS | Github releases and building from source are the only two options for the UI. Docker images or the like for quick setup and to mix and match various pluggable components would be really useful | Open |

docs/backend/storm-architecture.md

Lines changed: 2 additions & 2 deletions
@@ -22,7 +22,7 @@ The red colored lines are the path for the queries that come in through the PubS

 Bullet can accept arbitrary sources of data as long as they can be read from Storm. You can either:

-1. Write a Storm spout that reads your data from where ever it is (Kafka etc) and [converts it to Bullet Records](ingestion.md). See [Quick Start](../quick-start.md#storm-topology) for an example.
+1. Write a Storm spout that reads your data from where ever it is (Kafka etc) and [converts it to Bullet Records](ingestion.md). See [Quick Start](../quick-start/storm.md#storm-topology) for an example.
 2. Hook up an existing topology that is doing something else directly to Bullet. You will still write and hook up a component that converts your data into Bullet Records in your existing topology.

 | | Pros | Cons |
@@ -35,7 +35,7 @@ Your data is then emitted to the Filter bolt. If you have no queries in your sy

 !!! note "Why support micro-batching?"

-```RAW``` queries do not micro-batch by default, which makes Bullet really snappy when running those queries. As soon as your maximum record limit is reached, the query immediately returns. You can use a setting in [bullet_defaults.yaml](https://github.com/yahoo/bullet-storm/blob/master/src/main/resources/bullet_defaults.yaml) to turn on batching if you like. In the near future, micro-batching will let Bullet provide incremental results - partial results arrive over the duration of the query. Bullet can emit intermediate aggregations as they are all [additive](#combining).
+```RAW``` queries do not micro-batch by default, which makes Bullet really snappy when running those queries. As soon as your maximum record limit is reached, the query immediately returns. You can use a setting in [bullet_defaults.yaml](https://github.com/bullet-db/bullet-storm/blob/master/src/main/resources/bullet_defaults.yaml) to turn on batching if you like. In the near future, micro-batching will let Bullet provide incremental results - partial results arrive over the duration of the query. Bullet can emit intermediate aggregations as they are all [additive](#combining).

 ### Request processing

docs/backend/storm-performance.md

Lines changed: 4 additions & 4 deletions
@@ -21,13 +21,13 @@ You should be familiar with [Storm](http://storm.apache.org), [Kafka](http://kaf

 ## How was this tested?

-All tests run here were using [Bullet-Storm 0.4.2](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.4.2) and [Bullet-Storm 0.4.3](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.4.3). We are working with just the Storm piece without going through the Web Service or the UI. The DRPC REST endpoint provided by Storm lets us do just that.
+All tests run here were using [Bullet-Storm 0.4.2](https://github.com/bullet-db/bullet-storm/releases/tag/bullet-storm-0.4.2) and [Bullet-Storm 0.4.3](https://github.com/bullet-db/bullet-storm/releases/tag/bullet-storm-0.4.3). We are working with just the Storm piece without going through the Web Service or the UI. The DRPC REST endpoint provided by Storm lets us do just that.

 This particular version of Bullet on Storm was **prior to the architecture shift** to a PubSub layer but this would be the equivalent to using the Storm DRPC PubSub layer on a newer version of Bullet on Storm. You can replace DRPC spout and PrepareRequest bolt with Query spout and ReturnResults bolt with Result bolt conceptually. The actual implementation of the DRPC based PubSub layer just uses these spout and bolt implementations underneath anyway for the Publishers and Subscribers so the parallelisms and CPU utilizations should map 1-1.

 Using the pluggable metrics interface in Bullet on Storm, we captured worker level metrics such as CPU time, Heap usage, GC times and types, sent them to a in-house monitoring service for time-slicing and graphing. The figures shown below use this service.

-See [0.3.0](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.3.0) for how to plug in your own metrics collection.
+See [0.3.0](https://github.com/bullet-db/bullet-storm/releases/tag/bullet-storm-0.3.0) for how to plug in your own metrics collection.

 ### Tools used

@@ -76,7 +76,7 @@ bullet.query.aggregation.max.size: 512
 bullet.query.aggregation.raw.max.size: 500
 bullet.query.aggregation.distribution.max.points: 200
 ```
-Any setting not listed here defaults to the defaults in [bullet_defaults.yaml](https://github.com/yahoo/bullet-storm/blob/bullet-storm-0.4.2/src/main/resources/bullet_defaults.yaml). In particular, **metadata collection** and **timestamp injection** is enabled. ```RAW``` type queries also micro-batch by size 1 (in other words, do not micro-batch).
+Any setting not listed here defaults to the defaults in [bullet_defaults.yaml](https://github.com/bullet-db/bullet-storm/blob/bullet-storm-0.4.2/src/main/resources/bullet_defaults.yaml). In particular, **metadata collection** and **timestamp injection** is enabled. ```RAW``` type queries also micro-batch by size 1 (in other words, do not micro-batch).

 The parallelisms, CPU and memory settings for the components are listed below.

@@ -434,7 +434,7 @@ With this configuration, we were able to run **```680```** queries simultaneousl

 !!! note "Measuring latency in Bullet"

-So far, we have been using data being delayed long enough as a proxy for queries failing. [Bullet-Storm 0.4.3](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.4.3) adds an average latency metric computed in the Filter Bolts. For the next tests, we add a timestamp in the Data Source spouts when the record is read and this latency metric tells us exactly how long it takes for the record to be matched against a query and acked. By setting a limit for this latency, we can much more accurately measure acceptable performance.
+So far, we have been using data being delayed long enough as a proxy for queries failing. [Bullet-Storm 0.4.3](https://github.com/bullet-db/bullet-storm/releases/tag/bullet-storm-0.4.3) adds an average latency metric computed in the Filter Bolts. For the next tests, we add a timestamp in the Data Source spouts when the record is read and this latency metric tells us exactly how long it takes for the record to be matched against a query and acked. By setting a limit for this latency, we can much more accurately measure acceptable performance.

 ## Test 6: Scaling for More Data

docs/backend/storm-setup.md

Lines changed: 3 additions & 3 deletions
@@ -4,7 +4,7 @@ This section explains how to set up and run Bullet on Storm. If you're using the

 ## Configuration

-Bullet is configured at run-time using settings defined in a file. Settings not overridden will default to the values in [bullet_defaults.yaml](https://github.com/yahoo/bullet-storm/blob/master/src/main/resources/bullet_defaults.yaml). There are too many to list here. You can find out what these settings do in the comments listed in the defaults.
+Bullet is configured at run-time using settings defined in a file. Settings not overridden will default to the values in [bullet_defaults.yaml](https://github.com/bullet-db/bullet-storm/blob/master/src/main/resources/bullet_defaults.yaml). There are too many to list here. You can find out what these settings do in the comments listed in the defaults.

 ## Installation

@@ -47,9 +47,9 @@ You need a JVM based project that implements one of the two options above. You i

 If you just need the jar artifact directly, you can download it from [JCenter](http://jcenter.bintray.com/com/yahoo/bullet/bullet-storm/).

-You can also add ```<classifier>sources</classifier>``` or ```<classifier>javadoc</classifier>``` if you want the sources or javadoc. We also package up our test code where we have some helper classes to deal with [Storm components](https://github.com/yahoo/bullet-storm/tree/master/src/test/java/com/yahoo/bullet/storm). If you wish to use these to help with testing your topology, you can add another dependency on bullet-storm with ```<type>test-jar</type>```.
+You can also add ```<classifier>sources</classifier>``` or ```<classifier>javadoc</classifier>``` if you want the sources or javadoc. We also package up our test code where we have some helper classes to deal with [Storm components](https://github.com/bullet-db/bullet-storm/tree/master/src/test/java/com/yahoo/bullet/storm). If you wish to use these to help with testing your topology, you can add another dependency on bullet-storm with ```<type>test-jar</type>```.

-If you are going to use the second option (directly pipe data into Bullet from your Storm topology), then you will need a main class that directly calls the submit method with your wired up topology and the name of the component that is going to emit Bullet Records in that wired up topology. The submit method can be found in [Topology.java](https://github.com/yahoo/bullet-storm/blob/master/src/main/java/com/yahoo/bullet/Topology.java). The submit method submits the topology so it should be the last thing you do in your main.
+If you are going to use the second option (directly pipe data into Bullet from your Storm topology), then you will need a main class that directly calls the submit method with your wired up topology and the name of the component that is going to emit Bullet Records in that wired up topology. The submit method can be found in [Topology.java](https://github.com/bulletbullet-storm/blob/master/src/main/java/com/yahoo/bullet/Topology.java). The submit method submits the topology so it should be the last thing you do in your main.

 If you are just implementing a Spout, see the [Launch](#launch) section below on how to use the main class in Bullet to create and submit your topology.

docs/css/extra.css

Lines changed: 10 additions & 4 deletions
@@ -30,15 +30,21 @@ video {
 width: 100%;
 }

+@media (min-width: 1300px) {
+.container {
+width: 1200px;
+}
+}
+
 @media (min-width: 1650px) {
-body > .container {
-width: 1400px;
+.container {
+width: 1500px;
 }
 }

 @media (min-width: 1920px) {
-body > .container {
-width: 1680px;
+.container {
+width: 1750px;
 }
 }

docs/index.md

Lines changed: 5 additions & 4 deletions
@@ -38,7 +38,7 @@ This instance of Bullet also powers other use-cases such as letting analysts val

 # Quick Start

-See [Quick Start](quick-start/bullet-on-spark.md) to set up Bullet locally using spark-streaming. You will generate some synthetic streaming data that you can then query with Bullet.
+See [Quick Start](quick-start/spark.md) to set up Bullet locally using Spark Streaming. You will generate some synthetic streaming data that you can then query with Bullet.

 # Setup Bullet on your streaming data

@@ -61,7 +61,7 @@ To set up Bullet on a real data stream, you need:

 Bullet queries allow you to filter, project and aggregate data. You can also specify a window to get incremental results. Bullet lets you fetch raw (the individual data records) as well as aggregated data.

-* See the [UI Usage section](ui/usage.md) for using the UI to build Bullet queries. This is the same UI you will build in the [Quick Start](quick-start/bullet-on-spark.md)
+* See the [UI Usage section](ui/usage.md) for using the UI to build Bullet queries. This is the same UI you will build in the Quick Starts.

 * See the [API section](ws/api.md) for building Bullet API queries

@@ -162,10 +162,11 @@ Implementations of [Bullet on Storm](backend/storm-architecture.md) and [Bullet
 ## PubSub

 The PubSub is responsible for transmitting queries from the API to the Backend and returning results back from the Backend to the clients. It decouples whatever particular Backend you are using with the API.
-We currently support two different PubSub implementation:
+We currently support three different PubSub implementations:

 * [Kafka](pubsub/kafka.md)
 * [REST](pubsub/rest.md)
+* [Storm DRPC](pubsub/storm-drpc.md) (only for non-windowed queries)

 You can also very easily [implement your own](pubsub/architecture.md#implementing-your-own-pubsub) by defining a few interfaces that we provide.

@@ -182,4 +183,4 @@ The Web Service can be deployed as a standalone Java application (a JAR file) or

 !!! note "Want to know more?"

-In practice, the backend is implemented using the basic components that the Stream processing framework provides. See [Storm Architecture](backend/storm-architecture.md) for details.
+In practice, the backend is implemented using the basic components that the Stream processing framework provides. See [Storm Architecture](backend/storm-architecture.md) and [Spark Architecture](backend/spark-architecture.md) for details.
