@@ -111,12 +111,12 @@ But the overloaded version `compressor.compress(String)` already calls it automa
111111
112112### Where to store the compressed data?
113113In its purest form, a `String` is just a byte array (`byte[]`), and a compressed `String` is no different.
114- You can store it anywhere you would store a ` byte[] ` .
115- The most common approach is to store each compressed string ordered in memory using a ` byte[][] ` (for binary search) or
116- a B+Tree if you need frequent insertions (coming in the next release).
117- The frequency of reads and writes + business requirements will tell the best media and data structure to use.
114+ You can store it anywhere you would store a `byte[]`. If you are compressing millions of different entries, a very common
115+ approach is to keep the compressed strings sorted in memory in a `byte[][]` (for binary search) or in a B+Tree if you
116+ need frequent insertions (coming in the next release). The frequency of reads and writes, together with your business
117+ requirements, will determine the best storage medium and data structure to use.
118118
119- If the data is ordered before compression and stored in-memory in a ` byte[][] ` , you can use the full power of the binary
119+ If the data is ordered before compression and stored in memory in a `byte[][]` as mentioned above, you can use the full power of binary
120120search directly on the compressed data through `FourBitBinarySearch`, `FiveBitBinarySearch`, and `SixBitBinarySearch`.
121121
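To illustrate why the data must be ordered, here is a minimal sketch of a binary search over a sorted `byte[][]`. It does not use the library's search classes: the class `ByteArraySearchSketch` and its methods are hypothetical, plain ISO-8859-1 bytes stand in for compressed data, and the unsigned lexicographic ordering is an assumption that may differ from the ordering the library relies on.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper for illustration; not part of java-string-compressor.
final class ByteArraySearchSketch {

    // Unsigned lexicographic order over byte arrays (requires Java 9+).
    static final Comparator<byte[]> ORDER = Arrays::compareUnsigned;

    // Encodes each value as ISO-8859-1 bytes and returns the entries sorted by ORDER.
    static byte[][] sortedBytes(String... values) {
        byte[][] data = new byte[values.length][];
        for (int i = 0; i < values.length; i++) {
            data[i] = values[i].getBytes(StandardCharsets.ISO_8859_1);
        }
        Arrays.sort(data, ORDER);
        return data;
    }

    // Returns the index of key in the sorted array, or a negative value if it is absent.
    static int search(byte[][] sorted, String key) {
        byte[] keyBytes = key.getBytes(StandardCharsets.ISO_8859_1);
        return Arrays.binarySearch(sorted, keyBytes, ORDER);
    }

    public static void main(String[] args) {
        byte[][] data = sortedBytes("300", "100", "200");
        System.out.println(search(data, "200"));     // prints 1
        System.out.println(search(data, "150") < 0); // prints true
    }
}
```

The search only works because the array is sorted with the same comparator that is used for the lookup. If entries are appended in arbitrary order, the result is undefined.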
122122### Binary search
@@ -146,18 +146,20 @@ int index = binary.search("63821623849863628763#");
146146
147147if (index >= 0) {
148148 byte[] found = compressedData[index];
149- String decompressed = compressor. decompress(found);
150- }
149+ String decompressed = compressor.decompress(found);
150+ ```
151+ If you are using a custom character set to compress the data, you need to pass it to the binary search constructor:
152+ ```java
153+ public FiveBitBinarySearch(byte[][] compressedData, boolean prefixSearch, byte[] charset)
151154```
152-
153- In case you used a custom character set to compress
154155
155156### B+Tree
156157
157158Coming in the next release.
158159
159160### Bulk / Batch compression
160161
162+ In some rare cases you may need to fetch your data in batches from a remote location or a third-party source.
161163java-string-compressor provides both `BulkCompressor` and `ManagedBulkCompressor` specifically for this task.
162164They help you automate the process of adding each batch to the correct position in the destination array where the
163165compressed data will be stored. Both currently support `byte[][]` as the destination for the compressed data.
@@ -168,67 +170,18 @@ from handle array positions and bounds. This is why we recommend `ManagedBulkCom
168170
169171Both bulk compressors loop through the data in parallel by calling `IntStream.range().parallel()`.
170172
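The idea of filling a destination array batch by batch with a parallel index loop can be sketched as follows. This is not the library's implementation: `BatchFillSketch`, `addAll`, and `size` are hypothetical names, and the `encode` method is a placeholder that returns ISO-8859-1 bytes where a real compressor call would go.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.IntStream;

// Hypothetical helper for illustration; not part of java-string-compressor.
final class BatchFillSketch {
    private final byte[][] destination;
    private int next; // index of the first free slot in destination

    BatchFillSketch(byte[][] destination) {
        this.destination = destination;
    }

    // Placeholder for a real compressor call (assumption).
    private static byte[] encode(String value) {
        return value.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Encodes the batch in parallel and stores it right after the previous batch.
    // Not safe to call from several threads at once: only the inner loop is parallel.
    void addAll(List<String> batch) {
        if (batch.size() > destination.length - next) {
            throw new IllegalStateException("destination array is full");
        }
        final int offset = next;
        // Each index is written by exactly one task, so the writes do not conflict.
        IntStream.range(0, batch.size())
                 .parallel()
                 .forEach(i -> destination[offset + i] = encode(batch.get(i)));
        next = offset + batch.size();
    }

    // Number of slots filled so far.
    int size() {
        return next;
    }

    public static void main(String[] args) {
        byte[][] destination = new byte[5][];
        BatchFillSketch filler = new BatchFillSketch(destination);
        filler.addAll(List.of("alice", "bob"));
        filler.addAll(List.of("carol"));
        System.out.println(filler.size()); // prints 3
    }
}
```

Tracking the offset inside the helper is what spares the caller from handling array positions and bounds by hand.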
171- Let's take ` compactedData ` from the previous example and show how we can populate it with data from all customers:
172-
173173```java
174- byte [][] compactedData = new byte [100000000 ][]; // Data for 100 million customers.
175-
176-
177-
178-
179-
180-
181-
182- byte [] compressed = compressor. compress(input);
183- byte [] decompressed = compressor. decompress(compressed);
184- String string = new String (decompressed, StandardCharsets . ISO_8859_1 );
174+ byte[][] compressedData = new byte[100000000][]; // Storage for a max of 100 million customers.
175+ // ...
176+ ManagedBulkCompressor managed = new ManagedBulkCompressor(compressor, compressedData);
177+ // ...loop...
178+ managed.compressAndAddAll(batch); // batch is the list of strings to be compressed.
185179```
186180
187-
188- ` BulkCompressor ` is a "lower-level" utility where
189-
190-
191-
181+ ### Logging
182+ If you need logging, look at libraries such as ZeroLog, ChronicleLog, Log4j 2 Async Loggers, and similar tools
183+ (we have not tested any of them). You will need a fast logging library, or logging can become a bottleneck.
192184
193185### Other
194- Do not forget to check our JavaDocs with further information about each member.
195-
196-
197-
198-
199-
200-
201-
202-
203-
204-
205-
206-
207- <br >
208- <br >
209- <br >
210- <br >
211- <br >
212- <br >
213- <br >
214- <br >
215- <br >
216- <br >
217- <br >
218- <br >
219- <br >
220- <br >
221- <br >
222- <br >
223-
224-
225-
226-
227-
228-
229-
230-
231-
232-
233-
234- if you need logging , check ZeroLog, ChronicleLog and similar tools
186+ Do not forget to check the JavaDocs for further information about each member.
187+ Also check the test directory for additional examples.