The main goal of the project is to test the scalability of Hypre PCG
linear solvers on the UC Denver
Beowulf cluster and to determine the size of the largest linear
system our cluster is capable of solving using Hypre.
We want to test all available Hypre PCG
linear solvers:
- 1=AMG-PCG
- 2=DS-PCG
- 8=ParaSails-PCG
- 12=Schwarz-PCG
- 14=GSMG-PCG
- 43=Euclid-PCG
on all three Hypre installations that we compiled for the previous
class project:
- gcc with scali
- pgcc with scali
- gcc with mpich
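The full set of runs is every solver paired with every installation. A minimal sketch that enumerates these combinations is given here; the names SOLVERS, INSTALLATIONS, and run_matrix are made up for illustration and are not part of Hypre or the driver.

```python
from itertools import product

# Solver IDs and names exactly as listed above.
SOLVERS = {
    1: "AMG-PCG",
    2: "DS-PCG",
    8: "ParaSails-PCG",
    12: "Schwarz-PCG",
    14: "GSMG-PCG",
    43: "Euclid-PCG",
}

# The three Hypre installations as (compiler, MPI implementation) pairs.
INSTALLATIONS = [("gcc", "scali"), ("pgcc", "scali"), ("gcc", "mpich")]


def run_matrix():
    """Return every (solver_id, solver_name, compiler, mpi) combination."""
    return [
        (sid, name, compiler, mpi)
        for (sid, name), (compiler, mpi) in product(SOLVERS.items(), INSTALLATIONS)
    ]


if __name__ == "__main__":
    runs = run_matrix()
    print(len(runs), "solver/installation combinations")
    for sid, name, compiler, mpi in runs:
        print(f"solver {sid:>2} ({name}) on {compiler} with {mpi}")
```

Six solvers on three installations give 18 combinations, each of which is then repeated for every node count in the scalability test.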
In this project, we test only linear solvers, not eigensolvers, so we
need to use the provided "ij_es" driver without specifying the "-lobpcg"
option.
We will use the 3-D 7-point Laplacian, which is built into the driver, as
the linear system matrix and the default choice of the right-hand side
vector, which is a vector with unit components. The size of the problem is
determined by the "-n <nx> <ny> <nz>" option, where
<nx>, <ny>, and <nz> are integers specifying
the number of mesh points for the 3-D Laplacian in the x, y, and z
directions, respectively.
E.g., the "-n 100 100 100" option generates the 3-D Laplacian with 100 mesh
points in each direction, so the matrix in this case has
100*100*100=1,000,000 rows (unknowns).
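The size arithmetic can be checked with a short sketch. The function names are hypothetical, and the nonzero count assumes that each matrix row stores a diagonal entry plus one entry per mesh neighbor that actually exists, which is the usual form of the 7-point stencil matrix but is not stated in this assignment.

```python
def laplacian_size(nx, ny, nz):
    """Number of unknowns (matrix rows) for an nx x ny x nz mesh."""
    return nx * ny * nz


def laplacian_nonzeros(nx, ny, nz):
    """Stored nonzeros of the 7-point stencil matrix (assumed layout).

    An interior row has 7 entries. Points on each of the two boundary
    faces in a direction lack one neighbor, so each face removes one
    entry per face point.
    """
    n = nx * ny * nz
    return 7 * n - 2 * (nx * ny + ny * nz + nx * nz)


if __name__ == "__main__":
    print(laplacian_size(100, 100, 100))      # 1000000
    print(laplacian_nonzeros(100, 100, 100))  # 6940000
```

The nonzero count, rather than the number of rows alone, is what drives the memory needed to store the matrix, so it is a useful figure when looking for the largest problem that fits on one node.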
For the scalability test, you first need to determine the largest possible
problem that the particular combination of solver and installation
is capable of solving on one node; this can be done interactively
(make sure that you are the only one running jobs on this node).
Second, increase the number of nodes and the total problem size
proportionally and record the CPU time needed to solve the problems.
Make sure, of course, that you use both processors on every node. The
results of the tests can be presented as bar charts; see
p.18 of the following set of slides as an example:
Eleventh Copper Mountain Conference on Multigrid Methods,
March 30 - April 4, 2003:
Implementation of a Scalable Preconditioned Eigenvalue Solver Using
Hypre - Andrew Knyazev and Merico E. Argentati (slides)
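Once the CPU times are recorded, one common way to summarize a test in which the problem grows with the node count is the ratio T(1)/T(p), where T(p) is the time on p nodes; a value near 1 means the time stays flat as nodes are added. The sketch below computes this ratio and draws crude text bars. The function names are hypothetical, and the times in the usage example are placeholders, not measurements.

```python
def weak_scaling_efficiency(cpu_times):
    """Map node count -> T(1) / T(p).

    cpu_times maps node count -> recorded CPU time in seconds and must
    contain the one-node run, which serves as the baseline.
    """
    if 1 not in cpu_times:
        raise ValueError("need the one-node time as the baseline")
    base = cpu_times[1]
    return {p: base / t for p, t in sorted(cpu_times.items())}


def text_bars(values, width=40):
    """Return one line per node count with a '#' bar scaled to the maximum."""
    top = max(values.values())
    return [
        f"{p:>3} nodes | {'#' * round(width * v / top)} {v:.2f}"
        for p, v in sorted(values.items())
    ]


if __name__ == "__main__":
    # Placeholder numbers for illustration only, NOT measured results.
    times = {1: 10.0, 2: 12.5, 4: 20.0}
    for line in text_bars(times):
        print(line)
    print(weak_scaling_efficiency(times))
```

Plotting the raw times as bars, one group per solver, matches the style of the slide referred to above; the efficiency ratio is an optional extra summary.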
The preliminary tests show that a 100*100*100=1,000,000
3-D 7-point Laplacian is a good starting point on one node
for all preconditioners. Let us use the default topology, i.e.,
without specifying the -P option, and let us increase the problem size
in the y-direction in proportion to the
number of nodes in the scalability tests, so that the
3-D 7-point Laplacian always has a 100*100*100=1,000,000
block residing on every node, no matter how many nodes are involved.
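The scaling rule above can be written down as a small sketch that produces the "-n" option for a given number of nodes. The helper names and the node counts in the loop are illustrative assumptions; only the "-n <nx> <ny> <nz>" option itself, the 100-point block, and the two processors per node come from this assignment.

```python
BLOCK = 100         # mesh points per direction in the one-node block
PROCS_PER_NODE = 2  # both processors on every node are used


def scaled_problem(nodes, block=BLOCK):
    """Return (nx, ny, nz) with ny grown in proportion to the node count."""
    if nodes < 1:
        raise ValueError("nodes must be a positive integer")
    return block, block * nodes, block


def size_option(nodes, block=BLOCK):
    """Format the driver's "-n <nx> <ny> <nz>" option for a node count."""
    nx, ny, nz = scaled_problem(nodes, block)
    return f"-n {nx} {ny} {nz}"


if __name__ == "__main__":
    for nodes in (1, 2, 4, 8):  # illustrative node counts
        nx, ny, nz = scaled_problem(nodes)
        print(
            f"{nodes} nodes, {nodes * PROCS_PER_NODE} processes: "
            f"{size_option(nodes)} ({nx * ny * nz:,} unknowns)"
        )
```

For example, four nodes correspond to "-n 100 400 100", that is, 4,000,000 unknowns spread over eight processes.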
The conclusion needs to address the following questions:
Which installation is faster, and how does the answer depend on the number
of nodes? We expect that mpich would be faster than scali for a few nodes,
but as the number of nodes increases, scali should
eventually win over mpich.
Which solver is the most efficient, and under which circumstances? Suspected
winners might be 1=AMG-PCG and 2=DS-PCG.
Here are preliminary results for the project report:
Scalability plots