# Copyright (C) 2010  CAMd
# Copyright (C) 2010  Argonne National Laboratory
# Please see the accompanying LICENSE file for further information.
5"""Module for high-level BLACS interface.
7Usage
8=====
10A BLACS grid is a logical grid of processors. To use BLACS, first
11create a BLACS grid. If comm contains 8 or more ranks, this example
12will work::
14 from gpaw.mpi import world
15 from gpaw.blacs import BlacsGrid
16 grid = BlacsGrid(world, 4, 2)
18Use the processor grid to create various descriptors for distributed
19arrays::
21 block_desc = grid.new_descriptor(500, 500, 64, 64)
22 local_desc = grid.new_descriptor(500, 500, 500, 500)
24The first descriptor describes 500 by 500 arrays distributed amongst
25the 8 CPUs of the BLACS grid in blocks of 64 by 64 elements (which is
26a sensible block size). That means each CPU has many blocks located
27all over the array::
29 print(world.rank, block_desc.shape, block_desc.gshape)
31Here block_desc.shape is the local array shape while gshape is the
32global shape. The local array shape varies a bit on each CPU as the
33block distribution may be slightly uneven.
35The second descriptor, local_desc, has a block size equal to the
36global size of the array, and will therefore only have one block.
37This block will then reside on the first CPU -- local_desc therefore
38represents non-distributed arrays. Let us instantiate some arrays::
40 H_MM = local_desc.empty()
42 if world.rank == 0:
43 assert H_MM.shape == (500, 500)
44 H_MM[:, :] = calculate_hamiltonian_or_something()
45 else:
46 assert H_MM.shape[0] == 0 or H_MM.shape[1] == 0
48 H_mm = block_desc.empty()
49 print(H_mm.shape) # many elements on all CPUs
51We can then redistribute the local H_MM into H_mm::
53 from gpaw.blacs import Redistributor
54 redistributor = Redistributor(world, local_desc, block_desc)
55 redistributor.redistribute(H_MM, H_mm)
57Now we can run parallel linear algebra on H_mm. This will diagonalize
58H_mm, place the eigenvectors in C_mm and the eigenvalues globally in
59eps_M::
61 eps_M = np.empty(500)
62 C_mm = block_desc.empty()
63 block_desc.diagonalize_ex(H_mm, C_mm, eps_M)
65We can redistribute C_mm back to the master process if we want::
67 C_MM = local_desc.empty()
68 redistributor2 = Redistributor(world, block_desc, local_desc)
69 redistributor2.redistribute(C_mm, C_MM)
71If somebody wants to do all this more easily, they will probably write
72a function for that.
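For example, a small helper along the following lines (only a sketch,
not part of this module; the function name is made up) could wrap the
redistribute-diagonalize-redistribute steps from above::

    def diagonalize_on_grid(H_MM, local_desc, block_desc):
        # Distribute the serial matrix over the BLACS grid.
        H_mm = block_desc.empty()
        Redistributor(world, local_desc, block_desc).redistribute(H_MM, H_mm)

        # Diagonalize in parallel.
        eps_M = np.empty(block_desc.gshape[0])
        C_mm = block_desc.empty()
        block_desc.diagonalize_ex(H_mm, C_mm, eps_M)

        # Collect the eigenvectors back on the master process.
        C_MM = local_desc.empty()
        Redistributor(world, block_desc, local_desc).redistribute(C_mm, C_MM)
        return eps_M, C_MM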

List of interesting classes
===========================

 * BlacsGrid
 * BlacsDescriptor
 * Redistributor

The other classes in this module are coded specifically for GPAW and
are inconvenient to use otherwise.

The module gpaw.utilities.blacs contains several functions like gemm,
gemv and r2k. These functions may or may not have appropriate
docstrings, and may use Fortran-like variable naming. Also, either
this module or gpaw.utilities.blacs will be renamed at some point.

"""

import numpy as np

import gpaw
from gpaw.mpi import SerialCommunicator
from gpaw.matrix_descriptor import MatrixDescriptor
from gpaw.utilities.scalapack import scalapack_inverse_cholesky, \
    scalapack_diagonalize_ex, scalapack_general_diagonalize_ex, \
    scalapack_diagonalize_dc, scalapack_general_diagonalize_dc, \
    scalapack_diagonalize_mr3, scalapack_general_diagonalize_mr3
import gpaw.cgpaw as cgpaw


INACTIVE = -1
BLOCK_CYCLIC_2D = 1


class BlacsGrid:
    """Class representing a 2D grid of processors sharing a Blacs context.

    A BLACS grid defines a logical nprow by npcol ordering of a
    collection of CPUs. A BLACS grid can be used to create BLACS
    descriptors. On an nprow by npcol BLACS grid, a matrix is
    distributed amongst the nprow by npcol CPUs along rows and
    columns, respectively, while the matrix shape and blocking
    properties are determined by the descriptors.

    Use the method new_descriptor() to create any number of BLACS
    descriptors sharing the same CPU layout.

    Most matrix operations require the involved matrices to all be on
    the same BlacsGrid. Use a Redistributor to redistribute matrices
    from one BLACS grid to another if necessary.

    Parameters::

      * comm: MPI communicator for CPUs of the BLACS grid or None. A BLACS
        grid may use all or some of the CPUs of the communicator.
      * nprow: Number of CPU rows.
      * npcol: Number of CPU columns.
      * order: 'R' or 'C': whether the ranks of comm are mapped onto the
        grid in row-major ('R', the default) or column-major ('C') order.

    Complicated stuff
    -----------------

    It may be useful to know that a BLACS grid is said to be active
    and will evaluate to True on any process where comm is not None
    *and* comm.rank < nprow * npcol. Otherwise it is considered
    inactive and evaluates to False. Ranks where a grid is inactive
    never do anything at all.
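
    A guarded usage pattern could look like this (only a sketch;
    do_something_with is a made-up placeholder)::

        grid = BlacsGrid(world, 2, 2)
        if grid.is_active():
            desc = grid.new_descriptor(100, 100, 32, 32)
            do_something_with(desc)
        # Ranks outside the 2 by 2 grid simply skip the BLACS work.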

    BLACS identifies each grid by a unique ID number called the
    context (frequently abbreviated ConTxt). Grids on inactive ranks
    have context -1."""
    def __init__(self, comm, nprow, npcol, order='R'):
        assert nprow > 0
        assert npcol > 0
        assert len(order) == 1
        assert order in 'CcRr'
        # Setting a default value for the context leads to fewer
        # if statements below.
        self.context = INACTIVE

        self.comm = comm
        self.nprow = nprow
        self.npcol = npcol
        self.ncpus = nprow * npcol
        self.order = order

        # There are three cases to handle:
        # 1. comm is None: the grid is inactive (default).
        # 2. comm is a legitimate communicator.
        # 3. Dry run.
        if gpaw.dry_run:
            self.mycol = INACTIVE
            self.myrow = INACTIVE
            return

        if comm is not None:  # MPI task is part of the communicator
            if nprow * npcol > comm.size:
                raise ValueError(
                    'Impossible: %dx%d Blacs grid with %d CPUs'
                    % (nprow, npcol, comm.size))

            try:
                new = cgpaw.new_blacs_context
            except AttributeError as e:
                raise AttributeError(
                    'BLACS is unavailable. '
                    'GPAW must be compiled with BLACS/ScaLAPACK, '
                    'and must run in an MPI-enabled interpreter '
                    '(gpaw-python). Original error: %s' % e)

            # Note: rows and columns are deliberately swapped in the calls
            # below, presumably because gpaw arrays are stored in C
            # (row-major) order while BLACS assumes Fortran (column-major)
            # order.
            self.context = new(comm.get_c_object(), npcol, nprow, order)
            assert (self.context != INACTIVE) == (comm.rank < nprow * npcol)

            self.mycol, self.myrow = cgpaw.get_blacs_gridinfo(self.context,
                                                              nprow,
                                                              npcol)

    @property
    def coords(self):
        """My (row, column) coordinates in the process grid."""
        return self.myrow, self.mycol

    @property
    def shape(self):
        """Shape (nprow, npcol) of the process grid."""
        return self.nprow, self.npcol

    def coords2rank(self, row, col):
        """Get the rank corresponding to grid coordinates (row, col)."""
        return self.nprow * col + row

    def rank2coords(self, rank):
        """Get the grid coordinates (row, col) corresponding to rank."""
        col, row = divmod(rank, self.nprow)
        return row, col

    def new_descriptor(self, M, N, mb, nb, rsrc=0, csrc=0):
        """Create a new descriptor from this BLACS grid.

        See documentation for BlacsDescriptor.__init__."""
        return BlacsDescriptor(self, M, N, mb, nb, rsrc, csrc)

    def is_active(self):
        """Whether context is active on this rank."""
        return self.context != INACTIVE

    def __bool__(self):
        # Truth value mirrors is_active(), as described in the class
        # docstring.
        return self.is_active()

    def __str__(self):
        classname = self.__class__.__name__
        template = '%s[comm:size=%d,rank=%d; context=%d; %dx%d]'
        string = template % (classname, self.comm.size, self.comm.rank,
                             self.context, self.nprow, self.npcol)
        return string

    def __del__(self):
        if self.is_active():
            cgpaw.blacs_destroy(self.context)


class DryRunBlacsGrid00000000(BlacsGrid):
    def __init__(self, comm, nprow, npcol, order='R'):
        assert (isinstance(comm, SerialCommunicator) or
                isinstance(comm.comm, SerialCommunicator))
        # DryRunCommunicator is a subclass

        if nprow * npcol > comm.size:
            raise ValueError('Impossible: %dx%d Blacs grid with %d CPUs'
                             % (nprow, npcol, comm.size))
        self.context = INACTIVE
        self.comm = comm
        self.nprow = nprow
        self.npcol = npcol
        self.ncpus = nprow * npcol
        self.mycol, self.myrow = INACTIVE, INACTIVE
        self.order = order


class BlacsDescriptor(MatrixDescriptor):
    """Class representing a 2D matrix distribution on a blacs grid.

    A BlacsDescriptor represents a particular shape and distribution
    of matrices. A BlacsDescriptor has a global matrix shape and a
    rank-dependent local matrix shape. The local shape is not
    necessarily equal on all ranks.

    A numpy array is said to be compatible with a BlacsDescriptor if,
    on all ranks, the shape of the numpy array is equal to the local
    shape of the BlacsDescriptor. Compatible arrays can be created
    conveniently with the zeros() and empty() methods.

    An array with a global shape of M by N is distributed such that
    each process gets a number of distinct blocks of size mb by nb.
    The blocks on one process generally reside in very different areas
    of the matrix to improve load balance.

    The following chart describes how different ranks (there are 4
    ranks in this example, 0 through 3) divide the matrix into blocks.
    This is called 2D block cyclic distribution::

        +--+--+--+--+..+--+
        | 0| 1| 0| 1|..| 1|
        +--+--+--+--+..+--+
        | 2| 3| 2| 3|..| 3|
        +--+--+--+--+..+--+
        | 0| 1| 0| 1|..| 1|
        +--+--+--+--+..+--+
        | 2| 3| 2| 3|..| 3|
        +--+--+--+--+..+--+
        ...................
        ...................
        +--+--+--+--+..+--+
        | 2| 3| 2| 3|..| 3|
        +--+--+--+--+..+--+

    Also refer to:
    https://netlib.org/scalapack/slug/index.html

    Parameters:
     * blacsgrid: the BLACS grid of processors on which to distribute
       matrices.
     * M: global row count
     * N: global column count
     * mb: number of rows per block
     * nb: number of columns per block
     * rsrc: process row over which the first row of the matrix is
       distributed
     * csrc: process column over which the first column of the matrix is
       distributed

    Complicated stuff
    -----------------

    If there is trouble with matrix shapes, the below caveats are
    probably the reason.

    Depending on layout, a descriptor may have a local shape of zero
    by N or something similar. If the row blocksize is 7, the global
    row count is 10, and the blacs grid contains 3 row processes: the
    first process will have 7 rows, the next will have 3, and the last
    will have 0. The shapes in this case must still be correctly
    given to BLACS functions, which can be confusing.
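
    In code, the example above looks like this (only a sketch; it
    assumes world contains at least 3 ranks)::

        grid = BlacsGrid(world, 3, 1)
        desc = grid.new_descriptor(10, 10, 7, 7)
        print(world.rank, desc.shape)
        # prints (7, 10), (3, 10) and (0, 10) on grid rows 0, 1 and 2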

    A blacs descriptor must also give the correct local leading
    dimension (lld), which is the local array size along the
    memory-contiguous direction in the matrix, and thus equal to the
    local number of columns, *except* when the local shape is zero;
    in that case the constructor below simply uses lld = 1 to keep
    BLACS functions happy.

    """
    def __init__(self, blacsgrid, M, N, mb, nb, rsrc=0, csrc=0):
        assert M > 0
        assert N > 0
        assert 1 <= mb
        assert 1 <= nb
        if mb > M:
            mb = M
        if nb > N:
            nb = N
        assert 0 <= rsrc < blacsgrid.nprow
        assert 0 <= csrc < blacsgrid.npcol

        self.blacsgrid = blacsgrid
        self.M = M  # global size 1
        self.N = N  # global size 2
        self.mb = mb  # block size along dim 1 (rows per block)
        self.nb = nb  # block size along dim 2 (columns per block)
        # More info:
        # https://www.netlib.org/scalapack/slug/node75.html
        self.rsrc = rsrc
        self.csrc = csrc

        if blacsgrid.is_active():
            locN, locM = cgpaw.get_blacs_local_shape(self.blacsgrid.context,
                                                     self.N, self.M,
                                                     self.nb, self.mb,
                                                     self.csrc, self.rsrc)
            # max(1, ...) is nonsensical, but appears
            # to be required by PBLAS.
            self.lld = max(1, locN)
        else:
            # ScaLAPACK has no requirements as to what these values on an
            # inactive blacsgrid should be. This seemed reasonable to me
            # at the time.
            locN, locM = 0, 0
            # This applies only to empty matrices. Some functions will
            # complain about lld < 1, so lld = 1 will make those happy.
            self.lld = 1

        # locM and locN are not allowed to be negative; negative values
        # make the redistributor fail. They can occur on an active
        # blacsgrid that does not hold any piece of the distributed
        # matrix, which is why locM and locN are clamped here.
        MatrixDescriptor.__init__(self, max(0, locM), max(0, locN))

        # This is the definition of an inactive descriptor; it can occur
        # on an active or inactive blacs grid.
        self.active = locM > 0 and locN > 0

        self.bshape = (self.mb, self.nb)  # Shape of one block
        self.gshape = (M, N)  # Global shape of array

    def asarray(self):
        """Return a nine-element array representing this descriptor.

        In the C/Fortran code, a BLACS descriptor is represented by a
        special array of arcane nature. The value of asarray() must
        generally be passed to BLACS functions in the C code."""
        arr = np.array([BLOCK_CYCLIC_2D, self.blacsgrid.context,
                        self.N, self.M, self.nb, self.mb,
                        self.csrc, self.rsrc, self.lld], np.intc)
        return arr

    def __repr__(self):
        classname = self.__class__.__name__
        template = '%s[context=%d, glob %s, block %s, lld %d, loc %s]'
        string = template % (classname, self.blacsgrid.context,
                             self.gshape,
                             self.bshape, self.lld, self.shape)
        return string

    def index2grid(self, row, col):
        """Get the BLACS grid coordinates storing global index (row, col)."""
        assert row < self.gshape[0], (row, col, self.gshape)
        assert col < self.gshape[1], (row, col, self.gshape)
        gridx = (row // self.bshape[0]) % self.blacsgrid.nprow
        gridy = (col // self.bshape[1]) % self.blacsgrid.npcol
        return gridx, gridy

    def index2rank(self, row, col):
        """Get the rank where global index (row, col) is stored."""
        return self.blacsgrid.coords2rank(*self.index2grid(row, col))

    def diagonalize_dc(self, H_nn, C_nn, eps_N, UL='L'):
        """See documentation in gpaw/utilities/blacs.py."""
        scalapack_diagonalize_dc(self, H_nn, C_nn, eps_N, UL)

    def diagonalize_ex(self, H_nn, C_nn, eps_N, UL='L', iu=None):
        """See documentation in gpaw/utilities/blacs.py."""
        scalapack_diagonalize_ex(self, H_nn, C_nn, eps_N, UL, iu=iu)

    def diagonalize_mr3(self, H_nn, C_nn, eps_N, UL='L', iu=None):
        """See documentation in gpaw/utilities/blacs.py."""
        scalapack_diagonalize_mr3(self, H_nn, C_nn, eps_N, UL, iu=iu)

    def general_diagonalize_dc(self, H_mm, S_mm, C_mm, eps_M,
                               UL='L'):
        """See documentation in gpaw/utilities/blacs.py."""
        scalapack_general_diagonalize_dc(self, H_mm, S_mm, C_mm, eps_M,
                                         UL)

    def general_diagonalize_ex(self, H_mm, S_mm, C_mm, eps_M,
                               UL='L', iu=None):
        """See documentation in gpaw/utilities/blacs.py."""
        scalapack_general_diagonalize_ex(self, H_mm, S_mm, C_mm, eps_M,
                                         UL, iu=iu)

    def general_diagonalize_mr3(self, H_mm, S_mm, C_mm, eps_M,
                                UL='L', iu=None):
        """See documentation in gpaw/utilities/blacs.py."""
        scalapack_general_diagonalize_mr3(self, H_mm, S_mm, C_mm, eps_M,
                                          UL, iu=iu)

    def inverse_cholesky(self, S_nn, UL='L'):
        """See documentation in gpaw/utilities/blacs.py."""
        scalapack_inverse_cholesky(self, S_nn, UL)

    def my_blocks(self, array_mn):
        """Yield the local blocks and their global index limits.

        Yields tuples of the form (Mstart, Mstop, Nstart, Nstop, block),
        for each locally stored block of the array.
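
        A sketch of typical use, with block_desc as in the module
        docstring (block is a writable view into array_mn, so assigning
        to it fills the distributed array)::

            H_mm = block_desc.empty()
            blocks = block_desc.my_blocks(H_mm)
            for Mstart, Mstop, Nstart, Nstop, block in blocks:
                block[:] = Mstart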
439 """
440 if not self.check(array_mn):
441 raise ValueError(f'Bad array shape ({self} vs {array_mn.shape})')
443 grid = self.blacsgrid
444 mb = self.mb
445 nb = self.nb
446 myrow = grid.myrow
447 mycol = grid.mycol
448 nprow = grid.nprow
449 npcol = grid.npcol
450 M, N = self.gshape
452 Mmyblocks = -(-self.shape[0] // mb)
453 Nmyblocks = -(-self.shape[1] // nb)
454 for Mblock in range(Mmyblocks):
455 for Nblock in range(Nmyblocks):
456 myMstart = Mblock * mb
457 myNstart = Nblock * nb
458 Mstart = myrow * mb + Mblock * mb * nprow
459 Nstart = mycol * mb + Nblock * nb * npcol
460 Mstop = min(Mstart + mb, M)
461 Nstop = min(Nstart + nb, N)
462 block = array_mn[myMstart:myMstart + mb,
463 myNstart:myNstart + nb]
465 yield Mstart, Mstop, Nstart, Nstop, block

    def as_serial(self):
        """Return a serial descriptor: same global shape, one block on
        the first CPU of the grid."""
        return self.blacsgrid.new_descriptor(self.M, self.N, self.M, self.N)

    def redistribute(self, otherdesc, src_mn, dst_mn=None,
                     subM=None, subN=None, ia=0, ja=0, ib=0, jb=0, uplo='G'):
        if self.blacsgrid != otherdesc.blacsgrid:
            raise ValueError('Cannot redistribute to other BLACS grid. '
                             'Requires using Redistributor class explicitly')
        if dst_mn is None:
            dst_mn = otherdesc.empty(dtype=src_mn.dtype)
        r = Redistributor(self.blacsgrid.comm, self, otherdesc)
        r.redistribute(src_mn, dst_mn, subM, subN, ia, ja, ib, jb, uplo)
        return dst_mn

    def collect_on_master(self, src_mn, dst_mn=None, uplo='G'):
        """Collect the distributed src_mn onto the first CPU of the grid."""
        desc = self.as_serial()
        return self.redistribute(desc, src_mn, dst_mn, uplo=uplo)

    def distribute_from_master(self, src_mn, dst_mn=None, uplo='G'):
        """Distribute src_mn from the first CPU of the grid."""
        desc = self.as_serial()
        return desc.redistribute(self, src_mn, dst_mn, uplo=uplo)


class Redistributor:
    """Class for redistributing BLACS matrices on different contexts."""
    def __init__(self, supercomm, srcdescriptor, dstdescriptor):
        """Create redistributor.

        Source and destination descriptors may reside on different
        BLACS grids, but the descriptors should describe arrays with
        the same number of elements.

        The communicators of the BLACS grid of srcdescriptor as well
        as that of dstdescriptor *must* both be subcommunicators of
        supercomm.
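
        For example, a matrix can be pulled from a parallel grid onto a
        serial 1 by 1 grid like this (only a sketch; it assumes world
        contains at least 8 ranks)::

            parallel_grid = BlacsGrid(world, 4, 2)
            serial_grid = BlacsGrid(world, 1, 1)
            src_desc = parallel_grid.new_descriptor(500, 500, 64, 64)
            dst_desc = serial_grid.new_descriptor(500, 500, 500, 500)
            redistributor = Redistributor(world, src_desc, dst_desc)
            A_MM = redistributor.redistribute(src_desc.zeros())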

        Allowed values of the uplo argument of redistribute() are: 'G'
        for a general matrix, 'U' for upper triangular and 'L' for
        lower triangular. The latter two are useful for symmetric
        matrices."""
        self.supercomm = supercomm
        self.supercomm_bg = BlacsGrid(self.supercomm, self.supercomm.size, 1)
        self.srcdescriptor = srcdescriptor
        self.dstdescriptor = dstdescriptor

    def redistribute(self, src_mn, dst_mn=None,
                     subM=None, subN=None,
                     ia=0, ja=0, ib=0, jb=0, uplo='G'):
        """Redistribute src_mn into dst_mn.

        src_mn and dst_mn must be compatible with the source and
        destination descriptors of this redistributor.

        If subM and subN are given, redistribute only a subM by subN
        submatrix.

        If any of ia, ja, ib and jb are given, they denote the global
        index (i, j) of the origin of the submatrix inside the source
        and destination (a, b) matrices."""

        # self.supercomm must be a supercommunicator of the communicators
        # corresponding to the contexts of srcmatrix as well as dstmatrix.
        # We should verify this somehow.
        srcdescriptor = self.srcdescriptor
        dstdescriptor = self.dstdescriptor

        dtype = src_mn.dtype
        if dst_mn is None:
            dst_mn = dstdescriptor.zeros(dtype=dtype)

        assert dtype == dst_mn.dtype
        assert dtype == float or dtype == complex

        # Check to make sure the submatrix of the source
        # matrix will fit into the destination matrix
        # plus standard BLACS matrix checks.
        srcdescriptor.checkassert(src_mn)
        dstdescriptor.checkassert(dst_mn)

        if subM is None:
            subM = srcdescriptor.gshape[0]
        if subN is None:
            subN = srcdescriptor.gshape[1]

        assert srcdescriptor.gshape[0] >= subM
        assert srcdescriptor.gshape[1] >= subN
        assert dstdescriptor.gshape[0] >= subM
        assert dstdescriptor.gshape[1] >= subN

        # Switch to Fortran conventions
        uplo = {'U': 'L', 'L': 'U', 'G': 'G'}[uplo]
        cgpaw.scalapack_redist(srcdescriptor.asarray(),
                               dstdescriptor.asarray(),
                               src_mn, dst_mn,
                               subN, subM,
                               ja + 1, ia + 1, jb + 1, ib + 1,  # 1-indexing
                               self.supercomm_bg.context, uplo)
        return dst_mn


def parallelprint(comm, obj):
    import sys
    for a in range(comm.size):
        if a == comm.rank:
            print('rank=%d' % a)
            print(obj)
            print()
            sys.stdout.flush()
        comm.barrier()