First of all, read our Beowulf cluster Web pages!
On Beowulf, use the "env" command to check whether you are running the C shell or the TC shell. If not, use the "chsh" command to change your shell to "/bin/tcsh". If you want the shell change on beowulf to be permanent, change the shell on math.
Download the Hypre software package, rev. 1.8.2b, to Beowulf and unpack it using
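The shell check can also be scripted. The following is a minimal sketch in POSIX sh, so that it can be run with "sh" whatever the login shell is. The function name is_csh_family is hypothetical, and the sketch assumes that the login shell is recorded in the SHELL variable that "env" prints.

```shell
#!/bin/sh
# is_csh_family is a hypothetical helper name, not a system command.
# It succeeds when the given shell path names csh or tcsh.
is_csh_family() {
    case "$1" in
        */csh|*/tcsh|csh|tcsh) return 0 ;;
        *) return 1 ;;
    esac
}
```

A possible use is: is_csh_family "$SHELL" || echo "change your shell with chsh"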
tar xzvf hypre-1.8.2b.tar.gz
Rename the hypre-1.8.2b directory:
mv hypre-1.8.2b hypre-1.8.2b_gcc_scali
Get lobpcg_042403.tar.gz and unpack it using
tar xzvf lobpcg_042403.tar.gz
Download the updated configure and ij_es.c files and replace the corresponding files in the lobpcg_042403 directory with them. Note: the beowulf home directory is mounted on math at /data/home/, so from your math account you can copy files to your beowulf account using the cp command.
Remove all the files in the hypre-1.8.2b_gcc_scali/src/parcsr_es/LOBPCG directory:
rm hypre-1.8.2b_gcc_scali/src/parcsr_es/LOBPCG/*
and put there instead all the files from the lobpcg_042403 directory, including the updated configure and ij_es.c files:
cp lobpcg_042403/* hypre-1.8.2b_gcc_scali/src/parcsr_es/LOBPCG/
Make hypre-1.8.2b_gcc_scali/src/configure writable and rename it:
chmod u+w hypre-1.8.2b_gcc_scali/src/configure
mv hypre-1.8.2b_gcc_scali/src/configure hypre-1.8.2b_gcc_scali/src/configure.original
Copy the configure file:
cp hypre-1.8.2b_gcc_scali/src/parcsr_es/LOBPCG/configure hypre-1.8.2b_gcc_scali/src/configure
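The file replacement steps above can be collected into one function, which helps when the same swap has to be repeated for several hypre directories. This is a minimal sketch in POSIX sh (run it with "sh", not from the tcsh prompt). The function name swap_lobpcg is hypothetical, and the sketch assumes both archives are already unpacked and the updated configure and ij_es.c are already in the lobpcg directory.

```shell
#!/bin/sh
# swap_lobpcg is a hypothetical helper. It replaces the LOBPCG sources in a
# hypre tree with the files from an unpacked lobpcg directory, then installs
# the updated configure script, keeping the old one as configure.original.
swap_lobpcg() {
    lobpcg_dir=$1      # e.g. lobpcg_042403
    hypre_dir=$2       # e.g. hypre-1.8.2b_gcc_scali
    target="$hypre_dir/src/parcsr_es/LOBPCG"

    # Refuse to delete anything unless both directories exist.
    [ -d "$lobpcg_dir" ] && [ -d "$target" ] || return 1

    rm -f "$target"/*
    cp "$lobpcg_dir"/* "$target"/ || return 1

    chmod u+w "$hypre_dir/src/configure" || return 1
    mv "$hypre_dir/src/configure" "$hypre_dir/src/configure.original" || return 1
    cp "$target/configure" "$hypre_dir/src/configure"
}
```

With the directory names used here, the call would be: swap_lobpcg lobpcg_042403 hypre-1.8.2b_gcc_scali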
There are several options to compile hypre on CUD Beowulf.
cd to the hypre-1.8.2b_gcc_scali/src directory.
For gcc with scali, run
configure --with-CC=gcc --with-CXX=g++ --with-mpi-include=/opt/scali/include --with-mpi-lib-dirs=/opt/scali/lib --with-mpi-libs=mpi --with-CFLAGS="-O -O3 -D_REENTRANT -Wall -I/opt/scali/include"
Make sure that you create separate hypre directories for every install, e.g. hypre-1.8.2b_gcc_scali as above.
For pgcc (PGI) with scali, first set the correct environment:
setenv PGI /opt/pgi-3.2
set path=($PGI/linux86/bin $path)
and then run
configure --with-CC=pgcc --with-CXX=pgCC --with-mpi-include=/opt/scali/include --with-mpi-lib-dirs=/opt/scali/lib --with-mpi-libs=mpi --with-CFLAGS="-O -O3 -D_REENTRANT -I/opt/scali/include"
For gcc with mpich, run
configure --with-CC=gcc --with-CXX=g++ --with-mpi-include=/opt/mpich/include --with-mpi-lib-dirs=/opt/mpich/lib --with-mpi-libs=mpich --with-CFLAGS="-O3 -D_REENTRANT -Wall -I/opt/mpich/include"
After any of the configure commands above, run "make" in the same directory to compile. Check the make output to make sure that there are no errors (warnings are OK).
To run the lobpcg eigensolver, the procedure depends on whether scali or mpich was used for compilation.
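One way to check the make output is to save it to a file and search it for failures. The following is a minimal sketch in POSIX sh, assuming GNU make, which marks a failed target with "***". The function name make_log_ok and the log file name make.log are hypothetical.

```shell
#!/bin/sh
# make_log_ok is a hypothetical helper. It succeeds when the saved make
# output contains no "***" marker, which GNU make prints when a target fails.
# Compiler warnings do not produce this marker, so they are ignored.
make_log_ok() {
    [ -r "$1" ] || return 2       # log file missing or unreadable
    if grep -n -F '***' "$1"; then
        return 1                  # failing lines were printed with line numbers
    fi
    return 0
}
```

Under tcsh the output can be saved with: make >& make.log
and then checked with the function from a sh script: make_log_ok make.log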
a) For scali: Examples (interactive job):
mpimon ij_es -lobpcg -solver 12 -itr 20 -- node1 2 node2 2
Examples (background job using PBS):
scasub -mpimon -np 6 -npn 2 ij_es -lobpcg -n 50 50 50
When you submit a background job through PBS, it returns the job number, e.g., "7563.master.cluster". When the job is done, it generates an error file, e.g., "ij_es.e7563", which is normally empty, and an output file, e.g., "ij_es.o7563", which should look the same as the output of an interactive run.
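The file names follow from the job name and the numeric part of the job number. A minimal sketch in POSIX sh of that naming rule, as inferred from the examples in this text; the function name pbs_file_name is hypothetical.

```shell
#!/bin/sh
# pbs_file_name is a hypothetical helper. Given the job name, the job number
# returned at submission, and the letter "o" (output) or "e" (error), it
# prints the expected name of the file written when the job finishes.
pbs_file_name() {
    job_name=$1
    job_number=${2%%.*}     # "7563.master.cluster" -> "7563"
    printf '%s.%s%s\n' "$job_name" "$3" "$job_number"
}
```

For example, pbs_file_name ij_es 7563.master.cluster o gives the output file name for the job above.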
If you want to use a script, create a file called, e.g., "script.pbs", that contains the following:
#!/bin/csh -f
#PBS -l nodes=4
printenv
pwd
cd ~/hypre-1.8.2b_gcc_scali/src/parcsr_es/LOBPCG
set nodelist
set nodes=(`cat $PBS_NODEFILE`)
echo $nodes
foreach node ($nodes)
set nodelist = ($nodelist $node 2)
end
echo $nodelist
mpimon ij_es -lobpcg -n 50 50 50 -- $nodelist
where the fifth line should point to the right directory,
and the last line should contain the actual driver you want to run.
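The foreach loop in the script turns the PBS node file into the "node count" pairs that mpimon expects after "--". To preview that list outside a job, the same transformation can be sketched in POSIX sh. The function name mpimon_nodelist is hypothetical, and the sketch assumes a node file with one node name per line.

```shell
#!/bin/sh
# mpimon_nodelist is a hypothetical helper. It reads a node file with one
# node name per line and prints "name count" pairs on a single line, the
# form that follows "--" on the mpimon command line.
mpimon_nodelist() {
    procs_per_node=$2
    list=""
    while read -r node; do
        [ -n "$node" ] || continue          # skip blank lines
        list="$list${list:+ }$node $procs_per_node"
    done < "$1"
    printf '%s\n' "$list"
}
```

For a file containing node1 and node2, mpimon_nodelist file 2 prints the same pairs as the interactive example: node1 2 node2 2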
Submit the script using the command
qsub script.pbs
which returns the job number, e.g., "7581.master.cluster". Check the status using
qstat -a
Note: "qstat" is a PBS command and it can only see the jobs submitted through PBS. It will NOT list interactive jobs! If you want to see the real picture, ssh to the node you are interested in and use the "top" command.
The output will be in a file named, e.g., "script.pbs.o7581".
Hint: if your output file contains the environment, the node list, and nothing else, most likely there is a mistake in the PBS script. The most common mistake is a wrong directory in the fifth line of the script.
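The directory mistake can be caught before submitting. The following is a minimal sketch in POSIX sh, meant to be run on the head node before "qsub". The function name check_run_dir is hypothetical.

```shell
#!/bin/sh
# check_run_dir is a hypothetical helper. It verifies that the directory
# named in the "cd" line of the PBS script exists and contains an executable
# driver, and reports the first problem it finds.
check_run_dir() {
    if [ ! -d "$1" ]; then
        echo "no such directory: $1"
        return 1
    fi
    if [ ! -x "$1/$2" ]; then
        echo "driver not found or not executable: $1/$2"
        return 1
    fi
    return 0
}
```

For the script above, the call would be: check_run_dir "$HOME/hypre-1.8.2b_gcc_scali/src/parcsr_es/LOBPCG" ij_es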
b) For mpich: Examples (interactive job):
Create a file named "machines" that contains the following two lines:
node1:2
node2:2
and run the following command in the corresponding directory:
/opt/mpich/bin/mpirun -np 4 -machinefile machines ij_es -lobpcg -solver 12 -itr 20
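For more than a couple of nodes the "machines" file can be generated instead of typed. A minimal sketch in POSIX sh; the function name write_machines is hypothetical, and the node names are placeholders to be replaced with real ones.

```shell
#!/bin/sh
# write_machines is a hypothetical helper. It prints one "node:count" line
# per node name, the format of the "machines" file shown above.
write_machines() {
    procs_per_node=$1
    shift
    for node in "$@"; do
        printf '%s:%s\n' "$node" "$procs_per_node"
    done
}
```

The two-line file above would be produced by: write_machines 2 node1 node2 > machines
Note that the -np value passed to mpirun should match the total number of processes listed in the file (4 in this example).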
Comment: /opt/mpich/bin/mpirun is just an identical copy of /scratch/opt/mpich-1.2.3/util/mpirun. If you simply run mpirun without specifying the path, it will likely run /opt/scali/bin/mpirun which is not the correct one.
If you want to use a script, create a file called, e.g., "script.pbs", that contains the following:
#!/bin/csh -f
#PBS -l nodes=4
printenv
pwd
cd ~/hypre-1.8.2b_gcc_mpich/src/parcsr_es/LOBPCG
set nodelist
set nodes=(`cat $PBS_NODEFILE`)
echo $nodes
foreach node ($nodes)
set nodelist = ($nodelist$node'\n')
end
echo $nodelist > nodelist
/opt/mpich/bin/mpirun -np 4 -machinefile nodelist ij_es -lobpcg
where the fifth line should point to the right directory,
and the last line should contain the actual driver you want to run.
Submit the script using the command
qsub script.pbs
which returns the job number, e.g., "7581.master.cluster". Check the status using
qstat -a
The output will be in a file named, e.g., "script.pbs.o7581".