Do not forget to read our Beowulf cluster Web pages!
Installing the latest alpha Hypre with precompiled LAPACK libraries
The install is similar, but not identical, to the one used in a previous project, since this time we want to use the precompiled system-wide BLAS and LAPACK libraries. Thus, you need to do a complete install from scratch.
On Beowulf, use the "env" command to check whether you are running the C or TC shell (csh or tcsh). If not, use the "chsh" command to change your shell to "/bin/tcsh". If you want the shell change on beowulf to be permanent, change the shell on math.
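The shell check can be sketched as a small script. This is only an illustration: the helper name is_csh_family is hypothetical, and the script inspects the SHELL variable (one of the entries that "env" lists), which records the login shell and may differ from a shell started by hand.

```shell
#!/bin/sh
# Hypothetical helper (not a system command): succeed if the given shell
# path names csh or tcsh.
is_csh_family() {
    case "$1" in
        */csh|*/tcsh|csh|tcsh) return 0 ;;
        *) return 1 ;;
    esac
}

# SHELL is the login shell recorded in the environment (what "env" lists).
if is_csh_family "${SHELL:-}"; then
    echo "login shell is ${SHELL}: OK"
else
    echo "login shell is ${SHELL:-unknown}: run chsh and select /bin/tcsh"
fi
```

The script itself runs under /bin/sh regardless of the login shell, so it can be used before and after the chsh change.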
Get the latest Hypre alpha in your beowulf account terminal window:
cd
cp ~aknyazev/hypre_03_19_04.tgz .
tar xzvf hypre_03_19_04.tgz
This creates a new directory, linear_solvers.
To configure the gcc-scali install with the precompiled BLAS and LAPACK libraries, run in this directory:
./configure --with-blas="-L/usr/local/lib -lblas" --with-lapack="-llapack -lg2c"
make test
Installing the latest LOBPCG eigenvalue solver for Hypre
To test the most recent version of our eigensolver, you need to install it additionally. Run in the linear_solvers directory:
mkdir es
cd es
cp ~aknyazev/lobpcg28apr04.tar.gz .
tar xzvf lobpcg28apr04.tar.gz
./build.sh
Check that there are no errors during the install. You will have a warning:
gcc: -lmpi: linker input file unused since linking not done
a few times - this is OK. The lobpcg test drivers, ij_lobpcg and struct_lobpcg, are compiled in the subdirectory test_lobpcg.
Testing the latest LOBPCG eigenvalue solver for Hypre
The lobpcg test drivers, ij_lobpcg and struct_lobpcg, are in the subdirectory test_lobpcg and support most options of the original ij and struct drivers. Here is a list of the most important options supported by the ij_lobpcg and struct_lobpcg drivers:
Preconditioning in the lobpcg code is done by calling Hypre PCG linear solvers and preconditioners. We want to test in the lobpcg eigensolver all available Hypre PCG linear solvers that we have already tested on our Beowulf cluster for linear systems in the previous projects. Solvers to test in the ij_lobpcg driver:
Attention: the input parameters and the defaults in the ij and struct interface drivers are completely different in the present version of Hypre. Namely, in struct, the `-n' option specifies the local problem size PER processor, so the global problem size will be Px*nx by Py*ny by Pz*nz. Also, the default -P option in struct is different from that of ij, as are the default tolerance and the max number of iterations.
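As a quick check of the Px*nx by Py*ny by Pz*nz rule, the following shell sketch computes the global grid that the struct driver would build from a per-processor -n and a -P processor grid. The function name struct_global_size is hypothetical and is not part of Hypre.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hypre): global grid size for the struct
# driver, where -n gives the LOCAL size per processor.
# Usage: struct_global_size nx ny nz Px Py Pz
struct_global_size() {
    echo "$(( $1 * $4 )) $(( $2 * $5 )) $(( $3 * $6 ))"
}

# Local block 100x50x100 on a 1x2x1 processor grid:
struct_global_size 100 50 100 1 2 1
```

With a local block of 100 50 100 and a processor grid of 1 2 1, this prints 100 100 100, i.e. the same global problem that the ij driver gets from -n 100 100 100.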
To change the default max number of iterations and the verbosity level to make them consistent with those of the ij driver, you need to change the following lines of the struct_lobpcg.c file:
HYPRE_PCGSetMaxIter( (HYPRE_Solver)solver, 50 );
HYPRE_PCGSetPrintLevel( (HYPRE_Solver)solver, 1 );
into
HYPRE_PCGSetMaxIter( (HYPRE_Solver)solver, 1000);
HYPRE_PCGSetPrintLevel( (HYPRE_Solver)solver, 2 );
and recompile by running ./build.sh in the es directory.
The original struct driver does not seem to have the command line -tol option, but the present struct_lobpcg driver with the -lobpcg option does accept the -tol value, and it must be specified as -tol 1e-8 to make our tests consistent.
In order to have consistent problem sizes, follow these examples:
mpimon ./ij_lobpcg -lobpcg -solver 1 -tol 1e-8 -pcgitr 0 -vrand 1 -n 100 100 100 -P 1 2 1 -- node1 2
solves the problem with the 3D Laplacian of size 100x100x100 on one node with 2 CPUs, while
mpimon ./ij_lobpcg -lobpcg -solver 1 -tol 1e-8 -pcgitr 0 -vrand 1 -n 100 200 100 -P 1 4 1 -- node1 2 node2 2
solves the problem with the 3D Laplacian of size 100x200x100 on 2 nodes with 2 CPUs each. The corresponding struct_lobpcg runs, with the same global problem sizes, are:
mpimon ./struct_lobpcg -lobpcg -solver 11 -tol 1e-8 -pcgitr 0 -vrand 1 -n 100 50 100 -P 1 2 1 -iout 0 -- node1 2
mpimon ./struct_lobpcg -lobpcg -tol 1e-8 -solver 11 -pcgitr 0 -vrand 1 -n 100 50 100 -P 1 4 1 -iout 0 -- node1 2 node2 2
The -iout 0 option in struct prevents it from generating output files we do not need. Even with the -iout 0 option, the struct driver generates a file called zout.A.00000 that you need to remove manually. I could not find out how to tell the struct driver NOT to generate this file.
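Since the zout.A.00000 file has to be removed by hand after every struct run, the cleanup can be folded into a small wrapper. This is a sketch only: the name run_and_clean is hypothetical, and it assumes the file is written to the directory the driver is started from, which the instructions do not state.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command, then delete the leftover zout.A.00000
# file from the current directory, preserving the command's exit status.
# Assumption: the driver writes zout.A.00000 into the current directory.
run_and_clean() {
    "$@"
    rc=$?
    rm -f zout.A.00000
    return $rc
}

# Example, commented out because it needs the cluster:
# run_and_clean mpimon ./struct_lobpcg -lobpcg -solver 11 -tol 1e-8 -pcgitr 0 -vrand 1 -n 100 50 100 -P 1 2 1 -iout 0 -- node1 2
```

The wrapper returns the exit status of the wrapped command, so a failed run is still visible to the caller after the cleanup.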
The scalability test is similar to that of the previous class projects. We increase the problem size in the y-direction proportionally to the increase of the number of nodes, so that the 3-D 7-point Laplacian always has a block of 100*100*100=1,000,000 unknowns residing on every node, no matter how many nodes are involved.
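The command lines for such a scalability series can be generated rather than typed. The sketch below only prints the struct_lobpcg commands and does not run them. It assumes 2 CPUs per node, a local block of 100 50 100 per CPU stacked along y, and node names of the form node1, node2, ..., all taken from the examples on this page. The function name struct_scaling_cmd is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the struct_lobpcg command line for a
# scalability run on N nodes with 2 CPUs each. Node names node1, node2, ...
# are assumed to follow the pattern of the examples.
struct_scaling_cmd() {
    nodes=$1
    py=$(( 2 * nodes ))          # one process per CPU, stacked along y
    hosts=""
    i=1
    while [ "$i" -le "$nodes" ]; do
        hosts="$hosts node$i 2"  # "nodeK 2" means 2 processes on nodeK
        i=$(( i + 1 ))
    done
    echo "mpimon ./struct_lobpcg -lobpcg -solver 11 -tol 1e-8 -pcgitr 0 -vrand 1 -n 100 50 100 -P 1 $py 1 -iout 0 --$hosts"
}

# Print the commands for 1, 2 and 4 nodes.
for n in 1 2 4; do
    struct_scaling_cmd "$n"
done
```

Each node then holds two 100x50x100 blocks, i.e. 100*100*100 unknowns per node, while the global size in y grows with the number of nodes.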
The results of the tests can be presented using bar charts; see p. 18 as an example from the following set of slides: Eleventh Copper Mountain Conference on Multigrid Methods, March 30 - April 4, 2003: Implementation of a Scalable Preconditioned Eigenvalue Solver Using Hypre, Andrew Knyazev and Merico E. Argentati (slides).
In this project, we test the eigensolver for computing only one approximate eigenvector, corresponding to the smallest eigenvalue of the 3D 7-point Laplacian, using the "-vrand 1" option. The CPU timer for longer runs is broken in both the struct and ij drivers in this Hypre release; e.g., tests with the -vrand 10 option are often long enough to produce negative CPU times.
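If a timing is needed for a run where the drivers' CPU timer is unreliable, one possible workaround is to measure wall-clock time from outside the driver. The wrapper below is a sketch, not part of Hypre: the name timed_run is hypothetical, it assumes that date supports the +%s format (GNU and BSD date do), and it reports elapsed wall-clock seconds, which is not the same quantity as CPU time.

```shell
#!/bin/sh
# Hypothetical wrapper: report elapsed wall-clock seconds for a command.
# Assumes "date +%s" (seconds since the epoch) is available.
timed_run() {
    start=$(date +%s)
    "$@"
    rc=$?
    end=$(date +%s)
    # Print to stderr so the command's own output stays untouched.
    echo "elapsed wall-clock seconds: $(( end - start ))" >&2
    return $rc
}

# Example, commented out because it needs the cluster:
# timed_run mpimon ./ij_lobpcg -lobpcg -solver 1 -tol 1e-8 -pcgitr 0 -vrand 10 -n 100 100 100 -P 1 2 1 -- node1 2
```

The resolution is one second, which is coarse for short runs but adequate for the long runs where the built-in timer fails.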
The conclusion needs to address the following questions: