
Commit bce2843

DistributedArray -> DistArray ++

1 parent 9e3e04a

18 files changed: +256 -118 lines
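
For readers updating their own code against this rename, here is a minimal sketch assembled from the documentation snippets in the diff below (it is not part of the commit itself; only the names changed, the call signatures are as before)::

    from mpi4py_fft import DistArray   # was: from mpi4py_fft import DistributedArray

    N = (8, 8)
    a = DistArray(N, [0, 1])           # distribute axis 0, keep axis 1 on one processor
    a[:] = 1.0
    print(a.shape)                     # prints the local (per-rank) shape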

docs/source/global.rst

Lines changed: 12 additions & 9 deletions

@@ -11,7 +11,10 @@ global shape :math:`(512, 1024, 2048)`. To lift this array into RAM requires
 machine. If, however, you have access to a distributed architecture, you can
 split the array up and share it between, e.g., four CPUs (most supercomputers
 have either 2 or 4 GB of memory per CPU), which will only need to
-hold 2 GBs of the global array each.
+hold 2 GBs of the global array each. Moreover, many algorithms with varying
+degrees of locality can take advantage of the distributed nature of the array
+to compute local array pieces concurrently, effectively exploiting multiple
+processor resources.
 
 There are several ways of distributing a large multidimensional
 array. Two such distributions for our three-dimensional global array

@@ -55,7 +58,7 @@ classes in the :mod:`.pencil` module:
 * :class:`.Transfer`
 
 These classes are the low-level backbone of the higher-level :class:`.PFFT` and
-:class:`.DistributedArray` classes. To use these low-level classes
+:class:`.DistArray` classes. To use these low-level classes
 directly is not recommended and usually not necessary. However, for
 clarity we start by describing how these low-level classes work together.

@@ -64,9 +67,9 @@ distributed along axis 0. With a high level API we could then simply
 do::
 
     import numpy as np
-    from mpi4py_fft import DistributedArray
+    from mpi4py_fft import DistArray
     N = (8, 8)
-    a = DistributedArray(N, [0, 1])
+    a = DistArray(N, [0, 1])
 
 where the ``[0, 1]`` list decides that the first axis can be distributed,
 whereas the second axis is using one processor only and as such is

@@ -119,7 +122,7 @@ can only be distributed with
 one processor group. If we wanted to distribute the second axis instead
 of the first, then we would have done::
 
-    a = DistributedArray(N, [1, 0])
+    a = DistArray(N, [1, 0])
 
 With the low-level approach we would have had to use ``axis=0`` in the
 creation of ``p0``, as well as ``[1, 0]`` in the creation of ``subcomm``.

@@ -136,11 +139,11 @@ the value of each processors rank (note that it would also work to follow the
 low-level approach and use ``a0``)::
 
     import numpy as np
-    from mpi4py_fft import DistributedArray
+    from mpi4py_fft import DistArray
     from mpi4py import MPI
     comm = MPI.COMM_WORLD
     N = (8, 8)
-    a = DistributedArray(N, [0, 1])
+    a = DistArray(N, [0, 1])
     a[:] = comm.Get_rank()
     print(a.shape)

@@ -293,11 +296,11 @@ processor groups, respectively. On the other hand, if you can get away with it,
 or if you do not have access to a great number of processors, then fewer groups
 are usually found to be faster for the same number of processors in total.
 
-We can implement the global redistribution using the high-level :class:`.DistributedArray`
+We can implement the global redistribution using the high-level :class:`.DistArray`
 class::
 
     N = (8, 8, 8, 8)
-    a3 = DistributedArray(N, [0, 0, 0, 1])
+    a3 = DistArray(N, [0, 0, 0, 1])
     a2 = a3.redistribute(2)
     a1 = a2.redistribute(1)
     a0 = a1.redistribute(0)
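
Taken together, the snippets above amount to the following runnable sketch of the renamed high-level redistribution (a consolidation for reference, not part of the patch); run it with, e.g., ``mpirun -np 4 python redist.py``::

    from mpi4py import MPI
    from mpi4py_fft import DistArray

    N = (8, 8, 8, 8)
    a3 = DistArray(N, [0, 0, 0, 1])     # axes 0-2 may be distributed, axis 3 stays local
    a3[:] = MPI.COMM_WORLD.Get_rank()   # fill the local piece with the owning rank
    a2 = a3.redistribute(2)             # realign so axis 2 is local on every rank
    a1 = a2.redistribute(1)
    a0 = a1.redistribute(0)
    print(MPI.COMM_WORLD.Get_rank(), a0.shape)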

docs/source/indices.rst

Lines changed: 1 addition & 1 deletion

@@ -3,4 +3,4 @@ Indices and tables
 
 * :ref:`genindex`
 * :ref:`modindex`
-* :ref:`search`
\ No newline at end of file
+* :ref:`search`

docs/source/installation.rst

Lines changed: 1 addition & 1 deletion

@@ -133,4 +133,4 @@ This test-suit is run automatically on every commit to github, see, e.g.,
 .. _numpy: https://www.numpy.org
 .. _numba: https://www.numba.org
 .. _conda-build: https://conda.io/docs/commands/build/conda-build.html
-.. _pypi: https://pypi.org/project/shenfun/
\ No newline at end of file
+.. _pypi: https://pypi.org/project/shenfun/

docs/source/introduction.rst

Lines changed: 6 additions & 3 deletions

@@ -1,7 +1,8 @@
 Introduction
 ============
 
-The Python package mpi4py-fft is a tool primarily for working with Fast
+The Python package `mpi4py-fft`_
+is a tool primarily for working with Fast
 Fourier Transforms (FFTs) of (large) multidimensional arrays. There is really
 no limit as to how large the arrays can be, just as long as there is sufficient
 computing powers available. Also, there are no limits as to how transforms can

@@ -15,7 +16,7 @@ the main modules:
 
 * :mod:`.mpifft`
 * :mod:`.pencil`
-* :mod:`.distributedarray`
+* :mod:`.distarray`
 * :mod:`.libfft`
 * :mod:`.fftw`
 

@@ -28,7 +29,7 @@ However, this module is rarely used on its own, unless one simply needs to do
 global redistributions without any transforms at all. The :mod:`.pencil` module
 is used heavily by the :class:`.PFFT` class.
 
-The :mod:`.distributedarray` module contains classes for simply distributing
+The :mod:`.distarray` module contains classes for simply distributing
 multidimensional arrays, with no regards to transforms. The distributed arrays
 created from the classes here can very well be used in any MPI application that
 requires a large multidimensional distributed array.

@@ -42,3 +43,5 @@ because `pyfftw <https://github.com/pyFFTW/pyFFTW>`_ does not include support
 for real-to-real transforms. Through the interface in :mod:`.fftw` we can do
 here, in Python, pretty much everything that you can do in the original
 FFTW library.
+
+.. _`mpi4py-fft`: https://bitbucket.org/mpi4py/mpi4py-fft

docs/source/io.rst

Lines changed: 6 additions & 6 deletions

@@ -16,12 +16,12 @@ reads data in parallel. A simple example of usage is::
 
     from mpi4py import MPI
     import numpy as np
-    from mpi4py_fft import PFFT, HDF5File, NCFile, newDarray
+    from mpi4py_fft import PFFT, HDF5File, NCFile, newDistArray
 
     N = (128, 256, 512)
     T = PFFT(MPI.COMM_WORLD, N)
-    u = newDarray(T, forward_output=False)
-    v = newDarray(T, forward_output=False, val=2)
+    u = newDistArray(T, forward_output=False)
+    v = newDistArray(T, forward_output=False, val=2)
     u[:] = np.random.random(N)
 
     fields = {'u': [u], 'v': [v]}

@@ -43,8 +43,8 @@ The stored dataarrays can be retrieved later on::
 
     f0 = HDF5File('h5test.h5', T, mode='r')
     f1 = NCFile('nctest.nc', T, mode='r')
-    u0 = newDarray(T, forward_output=False)
-    u1 = newDarray(T, forward_output=False)
+    u0 = newDistArray(T, forward_output=False)
+    u1 = newDistArray(T, forward_output=False)
     f0.read(u0, 'u', 0)
     f0.read(u1, 'u', 1)
     f1.read(u0, 'u', 0)

@@ -165,4 +165,4 @@ generate the files using::
     generate_xdmf('variousfields.h5', order='visit')
 
 because for some reason Paraview and Visit require the mesh in the xdmf-files
-to be stored in opposite order.
\ No newline at end of file
+to be stored in opposite order.
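
For reference, a consolidated sketch of the renamed IO round trip. It assumes a ``write(step, fields)`` method on the file object and an MPI-enabled h5py build; both are assumptions of this sketch, not something shown in the diff above::

    from mpi4py import MPI
    import numpy as np
    from mpi4py_fft import PFFT, HDF5File, newDistArray

    N = (128, 256, 512)
    T = PFFT(MPI.COMM_WORLD, N)
    u = newDistArray(T, forward_output=False)
    u[:] = np.random.random(u.shape)

    f0 = HDF5File('h5test.h5', T, mode='w')
    f0.write(0, {'u': [u]})               # assumed signature: write(step, fields)

    u0 = newDistArray(T, forward_output=False)
    f1 = HDF5File('h5test.h5', T, mode='r')
    f1.read(u0, 'u', 0)                   # read back step 0, as in the snippet above
    assert np.allclose(u0, u)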

docs/source/mpi4py_fft.rst

Lines changed: 2 additions & 2 deletions

@@ -38,10 +38,10 @@ mpi4py\_fft.pencil module
    :undoc-members:
    :show-inheritance:
 
-mpi4py\_fft.distributedarray module
+mpi4py\_fft.distarray module
 -----------------------------------
 
-.. automodule:: mpi4py_fft.distributedarray
+.. automodule:: mpi4py_fft.distarray
    :members:
    :undoc-members:
    :show-inheritance:

docs/source/parallel.rst

Lines changed: 14 additions & 14 deletions

@@ -18,7 +18,7 @@ the following code snippet::
 
     import numpy as np
    from mpi4py import MPI
-    from mpi4py_fft import PFFT, newDarray
+    from mpi4py_fft import PFFT, newDistArray
     N = np.array([128, 128, 128], dtype=int)
     fft = PFFT(MPI.COMM_WORLD, N, axes=(0, 1, 2), dtype=np.float, slab=True)
 

@@ -47,7 +47,7 @@ With data aligned in axis 0, we can perform the final transform
 Assume now that all the code in this section is stored to a file named
 ``pfft_example.py``, and add to the above code::
 
-    u = newDarray(fft, False)
+    u = newDistArray(fft, False)
     u[:] = np.random.random(u.shape).astype(u.dtype)
     u_hat = fft.forward(u)
     uj = np.zeros_like(u)

@@ -63,8 +63,8 @@ should raise no exception, and the output should be::
 
 This shows that the first index has been shared between the two processors
 equally. The array ``u`` thus corresponds to :math:`u_{j_0/P,j_1,j_2}`. Note
-that the :func:`.newDarray` function returns a :class:`.DistributedArray`
-object, which in turn is a subclassed Numpy ndarray. The :func:`.newDarray`
+that the :func:`.newDistArray` function returns a :class:`.DistArray`
+object, which in turn is a subclassed Numpy ndarray. The :func:`.newDistArray`
 function uses ``fft`` to determine the size and type of the created distributed
 array, i.e., (64, 128, 128) and ``np.float`` for both processors.
 The ``False`` argument indicates that the shape

@@ -80,7 +80,7 @@ The output array will be distributed in axis 1, so the output array
 shape should be (128, 64, 65) on each processor. We check this by adding
 the following code and rerunning::
 
-    u_hat = newDarray(fft, True)
+    u_hat = newDistArray(fft, True)
     print(MPI.COMM_WORLD.Get_rank(), u_hat.shape)
 
 leading to an additional print of::

@@ -147,7 +147,7 @@ with real-to-complex transforms like this::
     idct = functools.partial(idctn, type=3)
     transforms = {(0,): (rfftn, irfftn), (1, 2): (dct, idct)}
     r2c = PFFT(MPI.COMM_WORLD, N, axes=((0,), (1, 2)), transforms=transforms)
-    u = newDarray(r2c, False)
+    u = newDistArray(r2c, False)
     u[:] = np.random.random(u.shape).astype(u.dtype)
     u_hat = r2c.forward(u)
     uj = np.zeros_like(u)

@@ -170,7 +170,7 @@ A parallel transform object can be created and tested as::
     fft = PFFT(MPI.COMM_WORLD, N, ((0,), (1, 2), (3, 4)), slab=True,
               transforms={(1, 2): (dctn, idctn), (3, 4): (dstn, idstn)})
 
-    A = newDarray(fft, False)
+    A = newDistArray(fft, False)
     A[:] = np.random.random(A.shape)
     C = fftw.aligned_like(A)
     B = fft.forward(A)

@@ -190,11 +190,11 @@ in the PFFT calling::
 
     import numpy as np
     from mpi4py import MPI
-    from mpi4py_fft import PFFT, newDarray
+    from mpi4py_fft import PFFT, newDistArray
 
     N = np.array([128, 128, 128], dtype=int)
     fft = PFFT(MPI.COMM_WORLD, N, axes=(0, 1, 2), dtype=np.float)
-    u = newDarray(fft, False)
+    u = newDistArray(fft, False)
     u[:] = np.random.random(u.shape).astype(u.dtype)
     u_hat = fft.forward(u)
     uj = np.zeros_like(u)

@@ -327,22 +327,22 @@ With mpi4py-fft we can compute this convolution using the ``padding`` keyword
 of the :class:`.PFFT` class::
 
     import numpy as np
-    from mpi4py_fft import PFFT, newDarray
+    from mpi4py_fft import PFFT, newDistArray
     from mpi4py import MPI
 
     comm = MPI.COMM_WORLD
     N = (128, 128)    # Global shape in physical space
     fft = PFFT(comm, N, padding=[1.5, 1.5], dtype=np.complex)
 
     # Create arrays in normal spectral space
-    a_hat = newDarray(fft, True)
-    b_hat = newDarray(fft, True)
+    a_hat = newDistArray(fft, True)
+    b_hat = newDistArray(fft, True)
     a_hat[:] = np.random.random(a_hat.shape) + np.random.random(a_hat.shape)*1j
     b_hat[:] = np.random.random(a_hat.shape) + np.random.random(a_hat.shape)*1j
 
     # Transform to real space with padding
-    a = newDarray(fft, False)
-    b = newDarray(fft, False)
+    a = newDistArray(fft, False)
+    b = newDistArray(fft, False)
     assert a.shape == (192//comm.Get_size(), 192)
     a = fft.backward(a_hat, a)
     b = fft.backward(b_hat, b)
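
Assembled from the snippets above, the renamed helper in one complete forward/backward round trip (illustrative consolidation; plain ``float`` replaces the docs' ``np.float``, which newer NumPy has removed)::

    import numpy as np
    from mpi4py import MPI
    from mpi4py_fft import PFFT, newDistArray

    N = np.array([128, 128, 128], dtype=int)
    fft = PFFT(MPI.COMM_WORLD, N, axes=(0, 1, 2), dtype=float, slab=True)
    u = newDistArray(fft, False)                      # physical-space input
    u[:] = np.random.random(u.shape).astype(u.dtype)
    u_hat = fft.forward(u)                            # distributed forward FFT
    uj = np.zeros_like(u)
    uj = fft.backward(u_hat, uj)                      # and back again
    assert np.allclose(uj, u)                         # round trip recovers the input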

examples/darray.py

Lines changed: 9 additions & 9 deletions

@@ -1,13 +1,13 @@
 import numpy as np
 from mpi4py import MPI
 from mpi4py_fft.pencil import Subcomm
-from mpi4py_fft.distributedarray import DistributedArray, newDarray, Function
+from mpi4py_fft.distarray import DistArray, newDistArray, Function
 from mpi4py_fft.mpifft import PFFT
 
-# Test DistributedArray. Start with alignment in axis 0, then tranfer to 1 and
+# Test DistArray. Start with alignment in axis 0, then tranfer to 1 and
 # finally to 2
 N = (16, 14, 12)
-z0 = DistributedArray(N, dtype=np.float, alignment=0)
+z0 = DistArray(N, dtype=np.float, alignment=0)
 z0[:] = np.random.randint(0, 10, z0.shape)
 s0 = MPI.COMM_WORLD.allreduce(np.sum(z0))
 z1 = z0.redistribute(2)

@@ -17,7 +17,7 @@
 assert s0 == s1 == s2
 
 fft = PFFT(MPI.COMM_WORLD, darray=z2, axes=(0, 2, 1))
-z3 = newDarray(fft, forward_output=True)
+z3 = newDistArray(fft, forward_output=True)
 z2c = z2.copy()
 fft.forward(z2, z3)
 fft.backward(z3, z2)

@@ -28,11 +28,11 @@
 
 print(z3.local_slice(), z3.substart, z3.commsizes)
 
-v0 = newDarray(fft, forward_output=False, rank=1)
+v0 = newDistArray(fft, forward_output=False, rank=1)
 #v0 = Function(fft, forward_output=False, rank=1)
 v0[:] = np.random.random(v0.shape)
 v0c = v0.copy()
-v1 = newDarray(fft, forward_output=True, rank=1)
+v1 = newDistArray(fft, forward_output=True, rank=1)
 
 for i in range(3):
     v1[i] = fft.forward(v0[i], v1[i])

@@ -53,13 +53,13 @@
 
 
 N = (6, 6, 6)
-z = DistributedArray(N, dtype=float, alignment=0)
+z = DistArray(N, dtype=float, alignment=0)
 z[:] = MPI.COMM_WORLD.Get_rank()
 g = z.get_global_slice((0, slice(None), 0))
 if MPI.COMM_WORLD.Get_rank() == 0:
     print(g)
 
-z2 = DistributedArray(N, dtype=float, alignment=2)
+z2 = DistArray(N, dtype=float, alignment=2)
 z.redistribute(darray=z2)
 
 g = z2.get_global_slice((0, slice(None), 0))

@@ -72,7 +72,7 @@
 assert abs(s0-s1) < 1e-12
 
 N = (3, 3, 6, 6, 6)
-z2 = DistributedArray(N, dtype=float, val=1, alignment=2, rank=2)
+z2 = DistArray(N, dtype=float, val=1, alignment=2, rank=2)
 z2[:] = MPI.COMM_WORLD.Get_rank()
 z1 = z2.redistribute(1)
 z0 = z1.redistribute(0)

examples/spectral_dns_solver.py

Lines changed: 11 additions & 11 deletions

@@ -10,7 +10,7 @@
 from time import time
 import numpy as np
 from mpi4py import MPI
-from mpi4py_fft import PFFT, newDarray
+from mpi4py_fft import PFFT, newDistArray
 
 # Set viscosity, end time and time step
 nu = 0.000625

@@ -28,18 +28,18 @@
 FFT_pad = FFT
 
 # Declare variables needed to solve Navier-Stokes
-U = newDarray(FFT, False, rank=1)       # Velocity
-U_hat = newDarray(FFT, rank=1)          # Velocity transformed
-P = newDarray(FFT, False)               # Pressure (scalar)
-P_hat = newDarray(FFT)                  # Pressure transformed
-U_hat0 = newDarray(FFT, rank=1)         # Runge-Kutta work array
-U_hat1 = newDarray(FFT, rank=1)         # Runge-Kutta work array
+U = newDistArray(FFT, False, rank=1)    # Velocity
+U_hat = newDistArray(FFT, rank=1)       # Velocity transformed
+P = newDistArray(FFT, False)            # Pressure (scalar)
+P_hat = newDistArray(FFT)               # Pressure transformed
+U_hat0 = newDistArray(FFT, rank=1)      # Runge-Kutta work array
+U_hat1 = newDistArray(FFT, rank=1)      # Runge-Kutta work array
 a = [1./6., 1./3., 1./3., 1./6.]        # Runge-Kutta parameter
 b = [0.5, 0.5, 1.]                      # Runge-Kutta parameter
-dU = newDarray(FFT, rank=1)             # Right hand side of ODEs
-curl = newDarray(FFT, False, rank=1)
-U_pad = newDarray(FFT_pad, False, rank=1)
-curl_pad = newDarray(FFT_pad, False, rank=1)
+dU = newDistArray(FFT, rank=1)          # Right hand side of ODEs
+curl = newDistArray(FFT, False, rank=1)
+U_pad = newDistArray(FFT_pad, False, rank=1)
+curl_pad = newDistArray(FFT_pad, False, rank=1)
 
 def get_local_mesh(FFT, L):
     """Returns local mesh."""

examples/transforms.py

Lines changed: 4 additions & 4 deletions

@@ -1,7 +1,7 @@
 import functools
 import numpy as np
 from mpi4py import MPI
-from mpi4py_fft import PFFT, DistributedArray
+from mpi4py_fft import PFFT, DistArray
 from mpi4py_fft.fftw import dctn, idctn
 
 # Set global size of the computational box

@@ -17,16 +17,16 @@
 
 assert fft.axes == pfft.axes
 
-u = DistributedArray(pfft=fft, forward_output=False)
+u = DistArray(pfft=fft, forward_output=False)
 u[:] = np.random.random(u.shape).astype(u.dtype)
 
-u_hat = DistributedArray(pfft=fft, forward_output=True)
+u_hat = DistArray(pfft=fft, forward_output=True)
 u_hat = fft.forward(u, u_hat)
 uj = np.zeros_like(u)
 uj = fft.backward(u_hat, uj)
 assert np.allclose(uj, u)
 
-u_padded = DistributedArray(pfft=pfft, forward_output=False)
+u_padded = DistArray(pfft=pfft, forward_output=False)
 uc = u_hat.copy()
 u_padded = pfft.backward(u_hat, u_padded)
 u_hat = pfft.forward(u_padded, u_hat)
