Commit 4f83f38

Improved stored array documentation
1 parent 121e72e commit 4f83f38

3 files changed: +50 -16 lines changed

docs/source/stored_arrays.rst

Lines changed: 41 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -7,6 +7,9 @@
77
Stored Arrays
88
=============
99

10+
File arrays
11+
-----------
12+
1013
Arrays can be stored on a file system using ``xfile_array``, enabling
1114
persistence of data. This type of array is a file-backed cached ``xarray``,
1215
meaning that you can use it as a normal array, and it will be flushed to the
@@ -16,7 +19,7 @@ file system or Google Cloud Storage, and data can be stored in various formats,
1619
e.g. GZip or Blosc.
1720

1821
File Mode
19-
---------
22+
^^^^^^^^^
2023

2124
A file array can be created using one of the three following file modes:
2225

@@ -30,9 +33,35 @@ A file array can be created using one of the three following file modes:
3033

3134
The default mode is ``load``.
3235

33-
Example : on-disk file array
34-
----------------------------
36+
IO handler
37+
^^^^^^^^^^
38+
39+
Stored arrays can be read from and written to various file systems using an IO
40+
handler, which is a template parameter of ``xfile_array``. The following IO
41+
handlers are currently supported:
42+
43+
- ``xio_disk_handler``: for accessing the local file system.
44+
- ``xio_gcs_handler``: for accessing Google Cloud Storage.
45+
46+
The IO handler is itself templated by a file format.
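
The layering described above (array, then IO handler, then file format) can be sketched with the standard library only. This is a minimal illustration, not xtensor-io code: ``raw_format_sketch``, ``disk_handler_sketch``, ``memory_handler_sketch`` and ``round_trip`` are hypothetical names, the in-memory handler merely stands in for a remote backend, and no compression is performed.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical format: encodes values as raw bytes, without compression.
struct raw_format_sketch
{
    void dump(std::ostream& out, const std::vector<double>& data) const
    {
        out.write(reinterpret_cast<const char*>(data.data()),
                  static_cast<std::streamsize>(data.size() * sizeof(double)));
    }

    std::vector<double> load(std::istream& in, std::size_t count) const
    {
        std::vector<double> data(count);
        in.read(reinterpret_cast<char*>(data.data()),
                static_cast<std::streamsize>(count * sizeof(double)));
        return data;
    }
};

// Hypothetical handler for the local file system, templated by a format.
// It decides where the bytes go; the format decides how they are encoded.
template <class Format>
struct disk_handler_sketch
{
    Format format;

    void write(const std::string& path, const std::vector<double>& data) const
    {
        std::ofstream out(path, std::ios::binary);
        format.dump(out, data);
    }

    std::vector<double> read(const std::string& path, std::size_t count) const
    {
        std::ifstream in(path, std::ios::binary);
        return format.load(in, count);
    }
};

// Hypothetical handler for an in-memory store (a stand-in for a remote
// backend). It reuses the same format type unchanged.
template <class Format>
struct memory_handler_sketch
{
    Format format;
    std::map<std::string, std::string> blobs;

    void write(const std::string& path, const std::vector<double>& data)
    {
        std::ostringstream out;
        format.dump(out, data);
        blobs[path] = out.str();
    }

    std::vector<double> read(const std::string& path, std::size_t count) const
    {
        std::istringstream in(blobs.at(path));
        return format.load(in, count);
    }
};

// Generic round trip: works with any handler exposing write() and read().
template <class Handler>
std::vector<double> round_trip(Handler& handler, const std::string& path,
                               const std::vector<double>& data)
{
    handler.write(path, data);
    return handler.read(path, data.size());
}
```

Because the format is a template parameter of the handler, swapping the storage backend or the encoding only changes a type, for example ``memory_handler_sketch<raw_format_sketch>`` versus ``disk_handler_sketch<raw_format_sketch>`` in this sketch.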
47+
48+
File format
49+
^^^^^^^^^^^
3550

51+
An array is stored in a file system using a file format, which usually performs
52+
some kind of compression. A file format has the ability to store the data, and
53+
optionally the shape of the array. It can also optionally be configured. The
54+
following file formats are currently supported:
55+
56+
- ``xio_binary_config``: raw binary format.
57+
- ``xio_gzip_config``: GZip format.
58+
- ``xio_blosc_config``: Blosc format.
59+
60+
These formats currently only store the data, not the shape. GZip and Blosc
61+
formats are configurable, but not the binary format.
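
A consequence of storing only the data is that the reader has to supply the shape. The sketch below illustrates this with the standard library only; ``dump_data_only``, ``load_data_only`` and ``shape_matches`` are hypothetical helpers, not part of xtensor-io. The stored bytes reveal how many elements there are, but several shapes are compatible with that count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <numeric>
#include <string>
#include <vector>

// Serializes only the flat data, as a data-only format would: no shape header.
inline std::string dump_data_only(const std::vector<double>& data)
{
    std::string bytes(data.size() * sizeof(double), '\0');
    if (!data.empty())
    {
        std::memcpy(&bytes[0], data.data(), bytes.size());
    }
    return bytes;
}

// Recovers the flat data; the element count is all the bytes can tell us.
inline std::vector<double> load_data_only(const std::string& bytes)
{
    std::vector<double> data(bytes.size() / sizeof(double));
    if (!data.empty())
    {
        std::memcpy(data.data(), bytes.data(), data.size() * sizeof(double));
    }
    return data;
}

// A shape can only be checked for compatibility with the element count;
// it cannot be inferred from the stored bytes.
inline bool shape_matches(const std::vector<double>& data,
                          const std::vector<std::size_t>& shape)
{
    const std::size_t expected =
        std::accumulate(shape.begin(), shape.end(), std::size_t(1),
                        std::multiplies<std::size_t>());
    return expected == data.size();
}
```

With six stored values, both ``{2, 3}`` and ``{3, 2}`` pass ``shape_matches``, so the caller has to know which one was intended. This is presumably why the on-disk example resizes ``a2`` explicitly with ``a2.resize(shape)``.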
62+
63+
Example : on-disk file array
64+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
3665

3766
.. code:: cpp
3867
@@ -64,13 +93,13 @@ Example : on-disk file array
6493
a2.resize(shape);
6594
6695
// a1 and a2 are equal
67-
assert(xt:all(xt::equal(a1, a2)));
96+
assert(xt::all(xt::equal(a1, a2)));
6897
6998
return 0;
7099
}
71100
72-
Stored Chunked Arrays
73-
---------------------
101+
Chunked File Arrays
102+
-------------------
74103

75104
As with a "normal" array, a chunked array can be stored on a file system. Under
76105
the hood, it will use ``xfile_array`` to store the chunks. But rather than
@@ -79,7 +108,10 @@ only a limited number of file arrays are used at the same time in a chunk pool.
79108
The container which is responsible for managing the chunk pool (i.e. mapping
80109
logical chunks in the array to physical chunks in the pool) is the
81110
``xchunk_store_manager``, but you should not use it directly. Instead, we
82-
provide factory functions to create a stored chunked array, as shown below:
111+
provide factory functions to create a chunked file array, as shown below:
112+
113+
Example : on-disk chunked file array
114+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
83115

84116
.. code-block:: cpp
85117
@@ -105,5 +137,6 @@ provide factory functions to create a stored chunked array, as shown below:
105137
a1(0, 0) = 5.6; // because the pool is full, this saves chunk (1, 0) to disk
106138
// and assigns to chunk (0, 0) in memory
107139
// when a1 is destroyed, all the modified chunks are saved to disk
108-
// this can be forced with a1.chunks().flush()
140+
// here, only chunks (0, 1) and (0, 0) are saved, since chunk (1, 0) was not changed
141+
// flushing can be triggered manually by calling a1.chunks().flush()
109142
}
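
The swapping behaviour described in the comments above can be modelled with a small standard-library sketch. It is not the ``xchunk_store_manager`` implementation: ``chunk_pool_sketch`` is a hypothetical class, the "disk" is a ``std::map``, chunks hold a fixed four values, and evicting the chunk that was loaded first is an assumption about the policy.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <utility>
#include <vector>

using chunk_index = std::pair<std::size_t, std::size_t>;

// Assumed chunk size for the sketch (e.g. a 2x2 chunk stored flat).
const std::size_t chunk_size = 4;

class chunk_pool_sketch
{
public:
    // A pool needs at least one slot; 0 is treated as 1.
    explicit chunk_pool_sketch(std::size_t pool_size)
        : m_pool_size(pool_size == 0 ? 1 : pool_size)
    {
    }

    // Writes a value into a chunk, bringing the chunk into the pool if needed.
    void assign(const chunk_index& chunk, std::size_t offset, double value)
    {
        slot& s = acquire(chunk);
        s.data.at(offset) = value;
        s.dirty = true;
    }

    // Saves every modified chunk that is still in the pool.
    void flush()
    {
        for (slot& s : m_slots)
        {
            save_if_dirty(s);
        }
    }

    // Log of chunks written to the simulated disk, in order.
    const std::vector<chunk_index>& saved() const { return m_saved; }

private:
    struct slot
    {
        chunk_index index;
        std::vector<double> data;
        bool dirty;
    };

    slot& acquire(const chunk_index& chunk)
    {
        for (slot& s : m_slots)
        {
            if (s.index == chunk)
            {
                return s;
            }
        }
        if (m_slots.size() == m_pool_size)
        {
            // Pool is full: save the oldest chunk if modified, then drop it.
            save_if_dirty(m_slots.front());
            m_slots.pop_front();
        }
        auto stored = m_disk.find(chunk);
        std::vector<double> data = (stored != m_disk.end())
            ? stored->second
            : std::vector<double>(chunk_size, 0.0);
        m_slots.push_back(slot{chunk, std::move(data), false});
        return m_slots.back();
    }

    void save_if_dirty(slot& s)
    {
        if (!s.dirty)
        {
            return;  // unchanged since the last save: nothing to write
        }
        m_disk[s.index] = s.data;
        m_saved.push_back(s.index);
        s.dirty = false;
    }

    std::size_t m_pool_size;
    std::deque<slot> m_slots;
    std::map<chunk_index, std::vector<double>> m_disk;
    std::vector<chunk_index> m_saved;
};
```

With a pool of two slots, assigning to chunks (1, 0), (0, 1) and then (0, 0) saves (1, 0) on the third assignment. A later ``flush()`` saves only (0, 1) and (0, 0), which matches the behaviour described in the example comments. A real container would also flush when it is destroyed.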

include/xtensor-io/xchunk_store_manager.hpp

Lines changed: 8 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -77,11 +77,12 @@ namespace xt
7777
*
7878
* The xchunk_store_manager class implements a multidimensional chunk container.
7979
* Chunks are managed in a pool, allowing for a limited number of chunks
80-
* that can simultaneously be hold in memory.
80+
* that can simultaneously be held in memory, by swapping chunks when the pool
81+
* is full. It should not be used directly; instead, use the chunked_file_array factory
82+
* functions.
8183
*
8284
* @tparam EC The type of a chunk (e.g. xfile_array)
8385
* @tparam IP The type of the index-to-path transformer (default: xindex_path)
84-
* @sa xfile_array, xchunked_array, chunked_array, chunked_file_array
8586
*/
8687
template <class EC, class IP = xindex_path>
8788
class xchunk_store_manager: public xaccessible<xchunk_store_manager<EC, IP>>,
@@ -200,7 +201,7 @@ namespace xt
200201
};
201202

202203
/**
203-
* Creates a stored chunked array.
204+
* Creates a chunked file array.
204205
* This function returns an uninitialized ``xchunked_array<xchunk_store_manager<xfile_array<T, IOH>>>``.
205206
*
206207
* @tparam T The type of the elements (e.g. double)
@@ -226,8 +227,8 @@ namespace xt
226227
layout_type chunk_memory_layout = XTENSOR_DEFAULT_LAYOUT);
227228

228229
/**
229-
* Creates a stored chunked array.
230-
* This function returns a uninitialized ``xchunked_array<xchunk_store_manager<xfile_array<T, IOH>>>``.
230+
* Creates a chunked file array.
231+
* This function returns an uninitialized ``xchunked_array<xchunk_store_manager<xfile_array<T, IOH>>>``.
231232
*
232233
* @tparam T The type of the elements (e.g. double)
233234
* @tparam IOH The type of the IO handler (e.g. xio_disk_handler)
@@ -254,7 +255,7 @@ namespace xt
254255
layout_type chunk_memory_layout = XTENSOR_DEFAULT_LAYOUT);
255256

256257
/**
257-
* Creates a stored chunked array.
258+
* Creates a chunked file array.
258259
* This function returns a ``xchunked_array<xchunk_store_manager<xfile_array<T, IOH>>>`` initialized from an expression.
259260
*
260261
* @tparam T The type of the elements (e.g. double)
@@ -280,7 +281,7 @@ namespace xt
280281
layout_type chunk_memory_layout = XTENSOR_DEFAULT_LAYOUT);
281282

282283
/**
283-
* Creates a stored chunked array.
284+
* Creates a chunked file array.
284285
* This function returns a ``xchunked_array<xchunk_store_manager<xfile_array<T, IOH>>>`` initialized from an expression.
285286
*
286287
* @tparam IOH The type of the IO handler (e.g. xio_disk_handler)

include/xtensor-io/xfile_array.hpp

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -113,7 +113,7 @@ namespace xt
113113
*
114114
* @tparam E The type of the container holding the elements
115115
* @tparam IOH The type of the IO handler (e.g. xio_disk_handler)
116-
* @sa chunked_file_array, xchunk_store_manager
116+
* @sa xchunk_store_manager
117117
*/
118118
template <class E, class IOH>
119119
class xfile_array_container : public xaccessible<xfile_array_container<E, IOH>>,
