Commit 5abc7d0

Adding changes to everything but UI usage videos
1 parent ba41e15 commit 5abc7d0

9 files changed

+1143
-377
lines changed

README.md

Lines changed: 3 additions & 2 deletions
@@ -11,8 +11,9 @@ You need [mkdocs](http://www.mkdocs.org/#installation) installed to build the do
1111
This also uses the mkdocs theme : [Cinder](http://sourcefoundry.org/cinder/).
1212

1313
Since Cinder has not been upgraded in a while, you will need to bring in changes in this [PR](https://github.com/chrissimpkins/cinder/pull/26) of Cinder found here: [twardoch/clinker-mktheme](https://github.com/twardoch/clinker-mktheme/tree/2016-12-22)
14+
and add on this fix in this [PR](https://github.com/twardoch/clinker-mktheme/pull/10)
1415

15-
You will need Python installed.
16+
You will also need Python installed.
1617

1718
```bash
1819
sudo pip install virtualenv
@@ -30,7 +31,7 @@ mkdocs build
3031
You will need [Maven 3](https://maven.apache.org/install.html) and [JDK 8](http://www.oracle.com/technetwork/java/javase/downloads/index.html) installed to build the examples.
3132

3233
```bash
33-
cd bullet-docs/examples/storm && mvn package
34+
cd bullet-docs/examples/ && make
3435
```
3536

3637
Code licensed under the Apache 2 license. See LICENSE file for terms.

docs/about/contributing.md

Lines changed: 0 additions & 3 deletions
@@ -14,11 +14,8 @@ This list is neither comprehensive nor in any particular order.
1414

1515
| Feature | Components | Description | Status |
1616
|------------------- | ----------- | ------------------------- | ------------- |
17-
| TOP K | UI, BE | A TOP K implementation using DataSketches: FrequentItems | In progress |
18-
| DISTRIBUTION | UI, BE | A query to get the distribution/quantiles of a numeric field using DataSketches: Quantiles | In progress |
1917
| Pub-Sub Queue | BE, WS, UI | WS and BE talk through the pub/sub. Bullet Storm uses Storm DRPC for this, which is strictly request-response. This will let us work on other Stream Processors and support incremental updates through WebSockets or SSEs | |
2018
| Incremental updates| BE, WS, UI | Push results back to users as soon as they arrive. Monoidal operations implies additive, so progressive results can be streamed back. Micro-batching and other features come into play | |
2119
| SQL API | BE, WS | WS supports an endpoint that converts a SQL-like query into Bullet queries | |
2220
| LocalForage | UI | Migration to LocalForage to distance ourselves from the relatively small LocalStorage space | [#9](https://github.com/yahoo/bullet-ui/issues/9) |
2321
| UI Packaging | UI | Github releases and building from source are the only two options. Docker or something similar may be more apt | |
24-
| Simple Settings | UI, WS | There are several settings in the UI and WS that are directly tied to the BE. They should be configurable and optimally configurable from one location | |

docs/about/releases.md

Lines changed: 8 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -12,7 +12,7 @@ certain CPU and memory related settings specific to RAS in its configuration. Th
1212

1313
!!! note "Future support"
1414

15-
We will support Storm 0.10 for a bit longer till Storm 2.0 is up and stable. Storm 1.0+ have a lot of performance fixes and features that you should be running with.
15+
We will support Storm 0.10 for a bit longer till Storm 2.0 is up and stable. Storm versions 1.0+ have a lot of performance fixes and features that you should be running with.
1616

1717
| | |
1818
| ----------------------------- | --------------- |
@@ -26,6 +26,9 @@ certain CPU and memory related settings specific to RAS in its configuration. Th
2626

2727
| Date | Storm 1.0 | Storm 0.10 | Highlights |
2828
| ------------ | ---------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------- | ---------- |
29+
| 2017-04-28 | [**0.4.2**](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.4.2) | [**0.4.2**](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.10-0.4.2) | Strict JSON output and fix for no data distributions |
30+
| 2017-04-26 | [**0.4.1**](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.4.1) | [**0.4.1**](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.10-0.4.1) | Result Metadata Concept name mismatch fix |
31+
| 2017-04-21 | [**0.4.0**](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.4.0) | [**0.4.0**](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.10-0.4.0) | DISTRIBUTION and TOP K release. Configuration renames. |
2932
| 2017-03-13 | [**0.3.1**](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.3.1) | [**0.3.1**](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.10-0.3.1) | Extra records accepted after query expiry bug fix |
3033
| 2017-02-27 | [**0.3.0**](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.3.0) | [**0.3.0**](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.10-0.3.0) | Metrics interface, config namespace, NPE bug fix |
3134
| 2017-02-15 | [**0.2.1**](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.2.1) | [**0.2.1**](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.10-0.2.1) | Acking support, Max size and other bug fixes |
@@ -43,6 +46,8 @@ The Web Service implementation that can serve a static schema from a file and ta
4346
| **Last Tag** | [![Latest tag](https://img.shields.io/github/release/yahoo/bullet-service.svg)](https://github.com/yahoo/bullet-service/releases/latest) |
4447
| **Latest Artifact** | [![Download](https://api.bintray.com/packages/yahoo/maven/bullet-service/images/download.svg)](https://bintray.com/yahoo/maven/bullet-service/_latestVersion) |
4548

49+
### Releases
50+
4651
| Date | Release | Highlights |
4752
| ------------ | -------------------------------------------------------------------------------------- | ---------- |
4853
| 2016-12-16 | [**0.0.1**](https://github.com/yahoo/bullet-service/releases/tag/bullet-service-0.0.1) | The first release with support for DRPC and the file-based schema |
@@ -62,6 +67,7 @@ The Bullet UI that lets you build, run, save and visualize results from Bullet.
6267

6368
| Date | Release | Highlights |
6469
| ------------ | -------------------------------------------------------------------------------------- | ---------- |
70+
| 2016-05-01 | [**0.2.0**](https://github.com/yahoo/bullet-ui/releases/tag/v0.2.0) | Release for Top K and Distribution. Supports Bullet Storm 0.4.2+ |
6571
| 2016-02-21 | [**0.1.0**](https://github.com/yahoo/bullet-ui/releases/tag/v0.1.0) | The first release with support for all features included in Bullet Storm 0.2.1+ |
6672

6773
## Bullet Record
@@ -79,4 +85,5 @@ The AVRO container that you need to convert your data into to be consumed by Bul
7985

8086
| Date | Release | Highlights |
8187
| ------------ | ------------------------------------------------------------------------------------ | ---------- |
88+
| 2017-04-17 | [**0.1.1**](https://github.com/yahoo/bullet-record/releases/tag/bullet-record-0.1.0) | Helper methods to remove, rename, check presence and count fields in the Record |
8289
| 2017-02-09 | [**0.1.0**](https://github.com/yahoo/bullet-record/releases/tag/bullet-record-0.1.0) | Map constructor |

docs/backend/storm-architecture.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -57,7 +57,7 @@ Since the data from the Prepare Request bolt (a query and a piece of return info
5757

5858
!!! note "Combining and operations"
5959

60-
In order to be able to combine intermediate results and process data in any order, all aggregations that Bullet does need to be associative and have an identity. In other words, they need to be [Monoids](https://en.wikipedia.org/wiki/Monoid). Luckily for us, the [DataSketches](http://datasketches.github.io) that we use are monoids (actually are commutative monoids). Sketches can be unioned and thus all the aggregations we support - ```SUM```, ```COUNT```, ```MIN```, ```MAX```, ```AVG```, ```COUNT DISTINCT```, ```DISTINCT``` - are monoidal. (```AVG``` is monoidal if you store a ```SUM``` and a ```COUNT``` instead).
60+
In order to be able to combine intermediate results and process data in any order, all aggregations that Bullet does need to be associative and have an identity. In other words, they need to be [Monoids](https://en.wikipedia.org/wiki/Monoid). Luckily for us, the [DataSketches](http://datasketches.github.io) that we use are monoids (actually are commutative monoids). Sketches can be unioned and thus all the aggregations we support - ```SUM```, ```COUNT```, ```MIN```, ```MAX```, ```AVG```, ```COUNT DISTINCT```, ```DISTINCT``` etc - are monoidal. (```AVG``` is monoidal if you store a ```SUM``` and a ```COUNT``` instead).
6161

6262

6363
## Scalability
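
To illustrate the monoid requirement described in the note above: the following is a minimal sketch, not Bullet code and not part of this commit. It assumes a hypothetical `Avg` type that stores a sum and a count, and shows that partial results built this way can be merged in any order, with an empty partial acting as the identity.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Avg:
    """Hypothetical partial result for AVG: a running SUM and COUNT."""
    total: float = 0.0
    count: int = 0

    def combine(self, other: "Avg") -> "Avg":
        # Associative and commutative: adds sums and counts component-wise.
        return Avg(self.total + other.total, self.count + other.count)

    def value(self):
        # The average is only derived at the end; None if nothing was seen.
        return self.total / self.count if self.count else None


def partial_avg(values):
    """Build a partial result from one batch of records."""
    return Avg(float(sum(values)), len(values))


IDENTITY = Avg()  # combining with this changes nothing

# Three batches, as if computed independently by different workers.
parts = [partial_avg([1, 2, 3]), partial_avg([]), partial_avg([4, 5])]

forward = reduce(Avg.combine, parts, IDENTITY)
backward = reduce(Avg.combine, reversed(parts), IDENTITY)
```

Averaging the per-batch averages directly would not combine correctly when batches differ in size, which is why the sum and count are kept separate until the final division.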

docs/index.md

Lines changed: 6 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,8 @@
88

99
* Is a **look-forward** query system. Queries are submitted first and they operate on data that arrive after the query is submitted
1010

11+
* Supports rich queries for filtering and getting **Raw data, Counting Distincts, Distincts, Grouping (Sum, Count, Min, Max, Avg), Distributions, and Top K**
12+
1113
* Is **multi-tenant** and can scale for more queries and/or for more data
1214

1315
* Provides a **UI and Web Service** that are also pluggable for a full end-to-end solution to your querying needs
@@ -91,8 +93,10 @@ The current aggregation types that are supported are:
9193
| GROUP | The resulting output would be a record containing the result of an operation for each unique value combination in your specified fields |
9294
| COUNT DISTINCT | Computes the number of distinct elements in the fields. (May be approximate) |
9395
| LIMIT or RAW | The resulting output would be at most the number specified in size. |
96+
| DISTRIBUTION | Computes distributions of the elements in the field. E.g. Find the median value or various percentile of a field, or get frequency or cumulative frequency distributions |
97+
| TOP K | Returns the top K most frequently appearing values in the column |
9498

95-
Currently we support ```GROUP``` aggregations on the following operations:
99+
Currently we support ```GROUP``` aggregations with the following operations:
96100

97101
| Operation | Meaning |
98102
| -------------- | ------- |
@@ -112,19 +116,10 @@ The Bullet Web Service returns your query result as well as associated metadata
112116

113117
It is often intractable to perform aggregations on an unbounded stream of data and still support arbitrary queries. However, it is possible if an exact answer is not required and the approximate answer's error is exactly quantifiable. There are stochastic algorithms and data structures that let us do this. We use [Data Sketches](https://datasketches.github.io/) to perform aggregations such as counting uniques, and will be using Sketches to implement some future aggregations.
114118

115-
Sketches let us be exact in our computation up to configured thresholds and approximate after. The error is very controllable and quantifiable. All Bullet queries that use Sketches return the error bounds with Standard Deviations as part of the results so you can quantify the error exactly. Using Sketches lets us address otherwise hard to solve problems in sub-linear space.
119+
Sketches let us be exact in our computation up to configured thresholds and approximate after. The error is very controllable and quantifiable. All Bullet queries that use Sketches return the error bounds with Standard Deviations as part of the results so you can quantify the error exactly. Using Sketches lets us address otherwise hard to solve problems in sub-linear space. We use Sketches to compute ```COUNT DISTINCT```, ```GROUP```, ```DISTRIBUTION``` and ```TOP K``` queries.
116120

117121
We also use Sketches as a way to control high cardinality grouping (group by a natural key column or related) and rely on the Sketching data structure to drop excess groups. It is up to you, when setting up Bullet, to set Sketch sizes large or small enough to satisfy the queries that will be performed on that instance of Bullet.
118122

119-
## New query types coming soon
120-
121-
Using Sketches, we have implemented ```COUNT DISTINCT``` and ```GROUP``` and are working on other aggregations including but not limited to:
122-
123-
| Aggregation | Meaning |
124-
| -------------- | ------- |
125-
| TOP K | Returns the top K most frequently appearing values in the column |
126-
| DISTRIBUTION | Computes distributions of the elements in the column. E.g. Find the median value or the 95th percentile of a field or graph the entire distribution as a histogram |
127-
128123
# Architecture
129124

130125
## Backend
