README.md: 3 additions & 2 deletions
@@ -11,8 +11,9 @@ You need [mkdocs](http://www.mkdocs.org/#installation) installed to build the do
 This also uses the mkdocs theme : [Cinder](http://sourcefoundry.org/cinder/).

 Since Cinder has not been upgraded in a while, you will need to bring in changes in this [PR](https://github.com/chrissimpkins/cinder/pull/26) of Cinder found here: [twardoch/clinker-mktheme](https://github.com/twardoch/clinker-mktheme/tree/2016-12-22)
+and add on this fix in this [PR](https://github.com/twardoch/clinker-mktheme/pull/10)

-You will need Python installed.
+You will also need Python installed.

 ```bash
 sudo pip install virtualenv
@@ -30,7 +31,7 @@ mkdocs build
 You will need [Maven 3](https://maven.apache.org/install.html) and [JDK 8](http://www.oracle.com/technetwork/java/javase/downloads/index.html) installed to build the examples.

 ```bash
-cd bullet-docs/examples/storm && mvn package
+cd bullet-docs/examples/ && make
 ```

 Code licensed under the Apache 2 license. See LICENSE file for terms.
-| TOP K | UI, BE | A TOP K implementation using DataSketches: FrequentItems | In progress |
-| DISTRIBUTION | UI, BE | A query to get the distribution/quantiles of a numeric field using DataSketches: Quantiles | In progress |
 | Pub-Sub Queue | BE, WS, UI | WS and BE talk through the pub/sub. Bullet Storm uses Storm DRPC for this, which is strictly request-response. This will let us work on other Stream Processors and support incremental updates through WebSockets or SSEs ||
 | Incremental updates| BE, WS, UI | Push results back to users as soon as they arrive. Monoidal operations implies additive, so progressive results can be streamed back. Micro-batching and other features come into play ||
 | SQL API | BE, WS | WS supports an endpoint that converts a SQL-like query into Bullet queries ||
 | LocalForage | UI | Migration to LocalForage to distance ourselves from the relatively small LocalStorage space |[#9](https://github.com/yahoo/bullet-ui/issues/9)|
 | UI Packaging | UI | Github releases and building from source are the only two options. Docker or something similar may be more apt ||
-| Simple Settings | UI, WS | There are several settings in the UI and WS that are directly tied to the BE. They should be configurable and optimally configurable from one location ||
docs/about/releases.md: 8 additions & 1 deletion
@@ -12,7 +12,7 @@ certain CPU and memory related settings specific to RAS in its configuration. Th
 !!! note "Future support"

-We will support Storm 0.10 for a bit longer till Storm 2.0 is up and stable. Storm 1.0+ have a lot of performance fixes and features that you should be running with.
+We will support Storm 0.10 for a bit longer till Storm 2.0 is up and stable. Storm versions 1.0+ have a lot of performance fixes and features that you should be running with.
+| 2017-04-28 |[**0.4.2**](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.4.2)|[**0.4.2**](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.10-0.4.2)| Strict JSON output and fix for no data distributions |
+| 2017-04-26 |[**0.4.1**](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.4.1)|[**0.4.1**](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.10-0.4.1)| Result Metadata Concept name mismatch fix |
+| 2017-04-21 |[**0.4.0**](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.4.0)|[**0.4.0**](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.10-0.4.0)| DISTRIBUTION and TOP K release. Configuration renames. |
 | 2017-03-13 |[**0.3.1**](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.3.1)|[**0.3.1**](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.10-0.3.1)| Extra records accepted after query expiry bug fix |
 | 2017-02-15 |[**0.2.1**](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.2.1)|[**0.2.1**](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.10-0.2.1)| Acking support, Max size and other bug fixes |
@@ -43,6 +46,8 @@ The Web Service implementation that can serve a static schema from a file and ta
| 2016-12-16 |[**0.0.1**](https://github.com/yahoo/bullet-service/releases/tag/bullet-service-0.0.1)| The first release with support for DRPC and the file-based schema |
@@ -62,6 +67,7 @@ The Bullet UI that lets you build, run, save and visualize results from Bullet.
+| 2016-05-01 |[**0.2.0**](https://github.com/yahoo/bullet-ui/releases/tag/v0.2.0)| Release for Top K and Distribution. Supports Bullet Storm 0.4.2+ |
 | 2016-02-21 |[**0.1.0**](https://github.com/yahoo/bullet-ui/releases/tag/v0.1.0)| The first release with support for all features included in Bullet Storm 0.2.1+ |

 ## Bullet Record
@@ -79,4 +85,5 @@ The AVRO container that you need to convert your data into to be consumed by Bul
| 2017-04-17 |[**0.1.1**](https://github.com/yahoo/bullet-record/releases/tag/bullet-record-0.1.0)| Helper methods to remove, rename, check presence and count fields in the Record |
docs/backend/storm-architecture.md: 1 addition & 1 deletion
@@ -57,7 +57,7 @@ Since the data from the Prepare Request bolt (a query and a piece of return info
 !!! note "Combining and operations"

-In order to be able to combine intermediate results and process data in any order, all aggregations that Bullet does need to be associative and have an identity. In other words, they need to be [Monoids](https://en.wikipedia.org/wiki/Monoid). Luckily for us, the [DataSketches](http://datasketches.github.io) that we use are monoids (actually are commutative monoids). Sketches can be unioned and thus all the aggregations we support - ```SUM```, ```COUNT```, ```MIN```, ```MAX```, ```AVG```, ```COUNT DISTINCT```, ```DISTINCT``` - are monoidal. (```AVG``` is monoidal if you store a ```SUM``` and a ```COUNT``` instead).
+In order to be able to combine intermediate results and process data in any order, all aggregations that Bullet does need to be associative and have an identity. In other words, they need to be [Monoids](https://en.wikipedia.org/wiki/Monoid). Luckily for us, the [DataSketches](http://datasketches.github.io) that we use are monoids (actually are commutative monoids). Sketches can be unioned and thus all the aggregations we support - ```SUM```, ```COUNT```, ```MIN```, ```MAX```, ```AVG```, ```COUNT DISTINCT```, ```DISTINCT``` etc - are monoidal. (```AVG``` is monoidal if you store a ```SUM``` and a ```COUNT``` instead).
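
The monoid requirement in the changed note can be illustrated with a small sketch. This is not Bullet's implementation: `Avg`, `combine`, `IDENTITY`, and `summarize` are hypothetical names chosen for illustration. It only shows why storing a `SUM` and a `COUNT` lets partial `AVG` results be merged in any order.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Avg:
    """Hypothetical partial result for AVG: a running SUM and a running COUNT."""
    total: float = 0.0
    count: int = 0

    def combine(self, other: "Avg") -> "Avg":
        # Associative and commutative: add the sums and add the counts.
        return Avg(self.total + other.total, self.count + other.count)

    def value(self):
        # The average is only derived at the end, from the stored pair.
        return self.total / self.count if self.count else None


# The identity element: combining with it changes nothing.
IDENTITY = Avg()


def summarize(values):
    """Fold a batch of numbers into one partial result."""
    return reduce(Avg.combine, (Avg(v, 1) for v in values), IDENTITY)


# Two workers see different slices of the stream.
left = summarize([1, 2, 3])
right = summarize([4, 5])

# Merging the partial results gives the average over all five values.
merged = left.combine(right)
print(merged.value())
```

Averaging the two partial averages directly (2.0 and 4.5) would give 3.25, which is wrong for slices of unequal size. Carrying the pair avoids that.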
docs/index.md: 6 additions & 11 deletions
@@ -8,6 +8,8 @@
 * Is a **look-forward** query system. Queries are submitted first and they operate on data that arrive after the query is submitted

+* Supports rich queries for filtering and getting **Raw data, Counting Distincts, Distincts, Grouping (Sum, Count, Min, Max, Avg), Distributions, and Top K**
+
 * Is **multi-tenant** and can scale for more queries and/or for more data

 * Provides a **UI and Web Service** that are also pluggable for a full end-to-end solution to your querying needs
@@ -91,8 +93,10 @@ The current aggregation types that are supported are:
 | GROUP | The resulting output would be a record containing the result of an operation for each unique value combination in your specified fields |
 | COUNT DISTINCT | Computes the number of distinct elements in the fields. (May be approximate) |
 | LIMIT or RAW | The resulting output would be at most the number specified in size. |
+| DISTRIBUTION | Computes distributions of the elements in the field. E.g. Find the median value or various percentile of a field, or get frequency or cumulative frequency distributions |
+| TOP K | Returns the top K most frequently appearing values in the column |

-Currently we support ```GROUP``` aggregations on the following operations:
+Currently we support ```GROUP``` aggregations with the following operations:

 | Operation | Meaning |
 | -------------- | ------- |
@@ -112,19 +116,10 @@ The Bullet Web Service returns your query result as well as associated metadata
 It is often intractable to perform aggregations on an unbounded stream of data and still support arbitrary queries. However, it is possible if an exact answer is not required and the approximate answer's error is exactly quantifiable. There are stochastic algorithms and data structures that let us do this. We use [Data Sketches](https://datasketches.github.io/) to perform aggregations such as counting uniques, and will be using Sketches to implement some future aggregations.

-Sketches let us be exact in our computation up to configured thresholds and approximate after. The error is very controllable and quantifiable. All Bullet queries that use Sketches return the error bounds with Standard Deviations as part of the results so you can quantify the error exactly. Using Sketches lets us address otherwise hard to solve problems in sub-linear space.
+Sketches let us be exact in our computation up to configured thresholds and approximate after. The error is very controllable and quantifiable. All Bullet queries that use Sketches return the error bounds with Standard Deviations as part of the results so you can quantify the error exactly. Using Sketches lets us address otherwise hard to solve problems in sub-linear space. We uses Sketches to compute ```COUNT DISTINCT```, ```GROUP```, ```DISTRIBUTION``` and ```TOP K``` queries.

 We also use Sketches as a way to control high cardinality grouping (group by a natural key column or related) and rely on the Sketching data structure to drop excess groups. It is up to you setting up Bullet to determine to set Sketch sizes large or small enough for to satisfy the queries that will be performed on that instance of Bullet.

-## New query types coming soon
-
-Using Sketches, we have implemented ```COUNT DISTINCT``` and ```GROUP``` and are working on other aggregations including but not limited to:
-
-| Aggregation | Meaning |
-| -------------- | ------- |
-| TOP K | Returns the top K most frequently appearing values in the column |
-| DISTRIBUTION | Computes distributions of the elements in the column. E.g. Find the median value or the 95th percentile of a field or graph the entire distribution as a histogram |
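
To pin down what the `TOP K` and `DISTRIBUTION` rows describe, here is a minimal exact sketch over a small in-memory list. It deliberately does not use DataSketches and is not how Bullet computes these results, which may be approximate. The function names `top_k` and `cumulative_frequencies` and the sample data are hypothetical.

```python
from bisect import bisect_right
from collections import Counter
from statistics import median


def top_k(values, k):
    """Exact TOP K: the k most frequent values with their counts."""
    return Counter(values).most_common(k)


def cumulative_frequencies(values, split_points):
    """Exact cumulative distribution: the fraction of values <= each split point."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        return []
    return [bisect_right(ordered, point) / n for point in split_points]


# Hypothetical field values collected while a query was running.
durations = [1, 2, 2, 3, 10]

print(median(durations))                          # the median of the field
print(top_k(durations, 1))                        # the single most frequent value
print(cumulative_frequencies(durations, [2, 5]))  # share of values <= 2 and <= 5
```

An exact version like this has to keep every value, which is what makes it impractical on an unbounded stream and why the documentation above relies on Sketches instead.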