compress.q
This library provides functions to assist with the compression of HDB splayed and partitioned tables.
Provides a symbol reference to a default compression mode for each supported compression type. Each mode is a 3-element list of (logicalBlockSize; algorithm; zipLevel):
| Symbol | Compression |
|---|---|
| `none | (0; 0; 0) |
| `qipc | (17; 1; 0) |
| `gzip | (17; 2; 7) |
| `snappy | (17; 3; 0) |
| `lz4hc | (17; 4; 9) |
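
As an illustration of what these triples mean to kdb+ itself, the sketch below writes a vector with the qipc mode and reads the on-disk details back with -21!. The `modes` dictionary is a hypothetical local copy of the table above (it is not the library's own variable) and the `/tmp/compressDemo` path is an assumption for the example.

```q
/ Hypothetical local copy of the table above: (logicalBlockSize; algorithm; zipLevel)
modes:`none`qipc`gzip`snappy`lz4hc!(0 0 0;17 1 0;17 2 7;17 3 0;17 4 9)

/ Highly repetitive data so that the compression is visible
data:asc 100000?100

/ set accepts (file; logicalBlockSize; algorithm; zipLevel) as its left argument
(`:/tmp/compressDemo/col,modes`qipc) set data

/ -21! returns a dictionary of compression statistics for the file
stats:-21!`:/tmp/compressDemo/col

/ Reading the file back decompresses it transparently
roundTrip:data~get`:/tmp/compressDemo/col
```

qipc is used here because it needs no external library; the gzip, snappy and lz4hc modes rely on the matching shared library being available to the q process.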
Provides the compression statistics (via -21!) for all columns in the specified splayed table folder and returns a table.
Any uncompressed columns will have a null compressedLength value.
q) .compress.getSplayStats `:/tmp/hdb/2021.10.29/trade
column compressedLength uncompressedLength compressMode algorithm logicalBlockSize zipLevel
-------------------------------------------------------------------------------------------
time 8000560 8000016 qipc 1 17 0
sym 1182897 10101304 qipc 1 17 0
price 8000560 8000016 qipc 1 17 0
vol 3500240 8000016 qipc 1 17 0
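
The figures above can be turned into a per-column compression ratio. This is a minimal sketch that rebuilds only the relevant columns of the output by hand, so it runs without the library loaded; `stats` and `ratios` are names chosen for the example.

```q
/ Values copied from the getSplayStats output above
stats:([] column:`time`sym`price`vol;
  compressedLength:8000560 1182897 8000560 3500240;
  uncompressedLength:8000016 10101304 8000016 8000016)

/ ratio > 1: the column shrank on disk; ratio < 1: compression made it larger
ratios:update ratio:uncompressedLength%compressedLength from stats
```

Here sym shrinks to roughly an eighth of its size, while time and price come out slightly larger than the uncompressed files, which suggests qipc is a poor fit for those columns.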
q).compress.getSplayStats `:/tmp/hdb/2021.01.23/trade
column compressedLength uncompressedLength compressMode algorithm logicalBlockSize zipLevel
-------------------------------------------------------------------------------------------
time                    16                 none         0         0                0
sym                     4096               none         0         0                0
price                   16                 none         0         0                0
vol                     16                 none         0         0                0

Provides the compression statistics (via -21!) for all columns in all tables within the specified partition of the specified HDB. The returned table is the same as .compress.getSplayStats with part and table columns added.
Note that this function is not par.txt aware. If using a segmented HDB, the hdbRoot parameter should be the segment root.
q) select sum uncompressedLength by part, table from .compress.getPartitionStats[`:/tmp/hdb; 2021.01.23]
part table | uncompressedLength
--------------------| ------------------
2021.01.23 tbl | 4392
2021.01.23 tbl10 | 40
2021.01.23 tbl2 | 40
2021.01.23 trade | 4144
2021.01.23 tradeComp| 4144
Compresses a splayed table.
- `compressType`: Can either be a symbol (one of `none`, `qipc`, `gzip`, `snappy`, `lz4hc`) or a 3-element integer list describing the compression type
- `options`: A dictionary of options to modify the function's behaviour
  - `recompress`: If true, any compressed files will be recompressed (default is `false`)
  - `inplace`: If true, `targetSplayPath` can be the same as `sourceSplayPath` (default is `false`)
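
A minimal sketch of how the arguments might be built, assuming the options are boolean flags keyed by option name (the text above only says "if true"). The variable names are hypothetical, and the final call is left as a comment so the snippet runs without the library loaded.

```q
/ The compression type as a symbol, or as the equivalent 3-element integer list
byName:`gzip
byList:17 2 7

/ Assumed shape of the options dictionary: option name -> boolean
opts:`recompress`inplace!11b

/ With inplace set, source and target may be the same splay path
splayPath:`:/tmp/hdb/2021.10.29/trade

/ With the library loaded, the call would then look like:
/ .compress.splay[splayPath; splayPath; byList; opts]
```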
The function doesn't always compress every column in the splay. It returns a table describing the operation that was performed on each column; writeMode details what was done and why:
- `compress`: The file was compressed
  - The file is uncompressed, or is compressed and the `recompress` option is true
- `copy`: The file was copied (via the OS-specific copy command)
  - The file is either empty (0 = count) or is already compressed and the `recompress` option is missing or false
- `ignore`: The file was ignored
  - The file would've been copied (as above) but was an inplace copy so nothing to do
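
The rules above can be sketched as a small decision function. This is an illustration only and not the library's code; the function name and argument order are hypothetical, and it assumes that an empty file is copied even when it is uncompressed.

```q
/ Hypothetical helper mirroring the documented writeMode rules
/ All four arguments are boolean atoms
writeMode:{[compressed;inplace;empty;recompress]
  / Copy candidates: empty files, or compressed files when recompress is not set
  wouldCopy:empty or compressed and not recompress;
  $[wouldCopy and inplace; `ignore;
    wouldCopy; `copy;
    `compress]
  }

/ An uncompressed, non-empty column written to a new location
mode:writeMode[0b;0b;0b;0b]
```

The arguments line up with the compressed, inplace and empty columns of the table returned by .compress.splay, plus the `recompress` option.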
q) .compress.getSplayStats `:/tmp/hdb/2021.10.29/trade
column compressedLength uncompressedLength compressMode algorithm logicalBlockSize zipLevel
-------------------------------------------------------------------------------------------
time                    8000016            none         0         0                0
sym                     10101304           none         0         0                0
price                   8000016            none         0         0                0
vol                     8000016            none         0         0                0
q) .compress.splay[`:/tmp/hdb/2021.10.29/trade; `:/tmp/hdb/2021.10.29/tradeComp; `lz4hc; ()!()]
...
column source target compressed inplace empty writeMode
---------------------------------------------------------------------------------------------------------------
time :/tmp/hdb/2021.10.29/trade/time :/tmp/hdb/2021.10.29/tradeComp/time 0 0 0 compress
sym :/tmp/hdb/2021.10.29/trade/sym :/tmp/hdb/2021.10.29/tradeComp/sym 0 0 0 compress
price :/tmp/hdb/2021.10.29/trade/price :/tmp/hdb/2021.10.29/tradeComp/price 0 0 0 compress
vol :/tmp/hdb/2021.10.29/trade/vol :/tmp/hdb/2021.10.29/tradeComp/vol 0 0 0 compress
q) .compress.getSplayStats `:/tmp/hdb/2021.10.29/tradeComp
column compressedLength uncompressedLength compressMode algorithm logicalBlockSize zipLevel
-------------------------------------------------------------------------------------------
time 8000560 8000016 lz4hc 4 17 9
sym 1139049 10101304 lz4hc 4 17 9
price 6728031 8000016 lz4hc 4 17 9
vol 2927839 8000016 lz4hc 4 17 9

Compresses multiple splayed tables within an HDB partition.
- `tbls`: Either a list of tables to compress, or `COMP_ALL` can be specified to compress all tables
- `options`: A dictionary of options to modify the function's behaviour
  - `recompress`: If true, any compressed files will be recompressed (default is `false`)
  - `inplace`: If true, `sourceRoot` can be the same as `targetRoot` (default is `false`)
  - `srcParTxt`: If true, any `par.txt` in `sourceRoot` will be used to find the specified partition (default is `true`)
  - `tgtParTxt`: If true, any `par.txt` in `targetRoot` will be used to write the specified partition (default is `true`)
NOTE: There is no interaction with the sym file in the source or target HDBs with this function. It is expected that the sym file is shared across both the source and target.
The same information is returned as .compress.splay with part and table columns added.
Copyright (C) Sport Trades Ltd 2017 - 2020, John Keys and Jaskirat Rajasansir 2020 - 2024