Abinit
From FarmShare
Latest revision as of 16:28, 14 October 2015
new
Abinit 7.10.4 non-MPI is available on FarmShare
old
The information below is based on:
/farmshare/software/free/abinit/7.4.2/src/README
Abinit
Parallel Abinit is available on FarmShare. This install uses MPI and ACML. To use it you need to submit a parallel job to the barley cluster.
MPI example on barley cluster
Abinit comes with some sample input files. Here is an excerpt from $ABINITHOME/share/abinit-test/tutoparal/README_dfpt.txt which we will use.
Second test : BaTiO3 slab (29 atoms), computation of the phonon frequencies at qpt 0.0 0.375 0.0

This test, with 29 atoms, is quite slow, but scales very well.

There is one preparatory step, before running the DFPT calculation. The preparatory step can be run on 16 processors at most with the current input file. It might use more processors as well, with the kgb parallelism (but the input file has to be modified). On 8 processors, the preparatory step is about three hours. It generates well-converged wavefunctions. For a quick trial, simply set nstep 1 instead of nstep 50; this will run in about 6 minutes.
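The quick-trial edit mentioned in the README can be scripted. This is only a sketch: set_nstep is a hypothetical helper, and it assumes the input file (probably tdfpt_03.in, the name is not confirmed by the excerpt) sets nstep on a line of its own.

```sh
# Hypothetical helper: rewrite the nstep value of an Abinit input.
# Reads the input on stdin and writes the modified copy to stdout.
# Usage: set_nstep VALUE < input > output
set_nstep() {
    awk -v n="$1" '$1 == "nstep" { sub(/[0-9]+/, n) } { print }'
}

# Demo on a two-line stand-in for the real input file (made-up content).
printf 'nstep 50\necut 10\n' | set_nstep 1
```

On the real files this would be something like set_nstep 1 < tdfpt_03.in > quick.in, where quick.in is a name chosen here for illustration. Writing to a copy keeps the original 50-step input intact.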
The test case itself is an underconverged calculation of the response with respect to one perturbation (atomic displacement). It is underconverged because nstep has been set to 10, while more than 30 are needed. Moreover, obtaining the interatomic force constants would need computing many more perturbations than the present one. In any case, the present test case runs in about 45 minutes on an 8 core machine.

Since the number of k points to be kept for the present perturbation is 8x8x1 with 4 symmetries, that is 16, and the number of bands is 120, the perfectly scalable part of the test case should have a maximum speed up of 1920.
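The arithmetic behind the 1920 figure can be spelled out. The sketch below assumes that "that is 16" means the 8x8x1 grid divided by the 4 symmetries; the README does not state the division explicitly.

```sh
# Figures taken from the README excerpt above.
grid_kpoints=$((8 * 8 * 1))   # 8x8x1 k-point grid
symmetries=4
bands=120

# Assumption: the k points kept are the grid divided by the symmetry count.
kpoints=$((grid_kpoints / symmetries))
max_speedup=$((kpoints * bands))

echo "k points kept: $kpoints"
echo "maximum speed-up of the scalable part: $max_speedup"
```

This bound counts one k point and band pair per process, so it only applies to the perfectly parallel part of the run.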
From tests for the 8 core case, on a total of 20200 secs, there were 305 secs for vtorho3:synchro (sequential) and 260.460 for inwffil (sequential). The latter will not increase with a bigger value of nstep, and for more perturbations, while the former will increase proportionally.

Hence, in the present status, for 8 cores, the sequential part is about 3%, leading to a maximum speed-up with respect to sequential, of about 240. For a larger test case (bigger nstep, more perturbations), the maximum speed up might be twice as large.

Preparatory step 1
(mpirun ...) abinit < tdfpt_03.files > tdfpt_03.log
cp tdfpt_03.o_WFK tdfpt_04.i_WFK
cp tdfpt_03.o_WFK tdfpt_04.i_WFQ

Test case, step 2 (DFPT calculation)
(mpirun ...) abinit < tdfpt_04.files > tdfpt_04.log
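As a quick check of the "about 3%" sequential share quoted in the README excerpt above, the two sequential timings can be divided by the total. seq_percent is a hypothetical helper name. The sketch only reproduces the percentage and does not attempt to rederive the README's speed-up estimate of about 240.

```sh
# Hypothetical helper: percentage of the total time spent in two sequential parts.
# Usage: seq_percent PART_A_SECS PART_B_SECS TOTAL_SECS
seq_percent() {
    awk -v a="$1" -v b="$2" -v t="$3" 'BEGIN { printf "%.1f\n", 100 * (a + b) / t }'
}

# vtorho3:synchro, inwffil, and the total, as quoted in the README.
seq_percent 305 260.460 20200
```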
The commands under "Preparatory step 1" and "Test case, step 2" translate to this job submission script:
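<source lang="sh">
#!/bin/bash

# Grid Engine options: run from the submission directory, under bash.
#$ -cwd
#$ -S /bin/bash
# Job name; the job output goes to <name>.o<jobid>.
#$ -N abinit
# Mail notifications; replace the address with your own.
#$ -M bishopj@stanford.edu
#$ -m beas
#$ -R y
#$ -l mem_free=1G
# Parallel environment and number of slots (cores).
#$ -pe orte 8

echo "Got $NSLOTS slots"
echo "jobid $JOB_ID"

# Build an MPI machine file: one line per slot on each allocated host.
tmphosts=$(mktemp)
awk '{ for (i=0; i < $2; ++i) { print $1} }' "$PE_HOSTFILE" > "$tmphosts"

echo "pwd"
pwd

echo ""
echo "nslots: $NSLOTS"
echo ""

module load abinit acml

# Preparatory step 1
date
mpirun -np $NSLOTS -machinefile "$tmphosts" -x LD_LIBRARY_PATH /farmshare/software/free/abinit/7.4.2/bin/abinit < tdfpt_03.files > tdfpt_03.log
date

# Reuse the step 1 wavefunctions as input for step 2.
cp tdfpt_03.o_WFK tdfpt_04.i_WFK
cp tdfpt_03.o_WFK tdfpt_04.i_WFQ

# Test case, step 2 (DFPT calculation)
date
mpirun -np $NSLOTS -machinefile "$tmphosts" -x LD_LIBRARY_PATH /farmshare/software/free/abinit/7.4.2/bin/abinit < tdfpt_04.files > tdfpt_04.log
date

rm -f "$tmphosts"
</source>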
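The awk line in the script turns the Grid Engine host file into the one-host-per-slot list that mpirun -machinefile reads. The sketch below runs the same awk program on a made-up allocation. The host name barley08 and the last column are illustrative, and expand_hostfile is a hypothetical wrapper.

```sh
# Hypothetical wrapper around the awk program from the job script:
# print each host name once per slot granted on that host.
expand_hostfile() {
    awk '{ for (i = 0; i < $2; ++i) { print $1 } }' "$1"
}

# Made-up two-node allocation in the "host slots queue range" layout.
sample=$(mktemp)
printf '%s\n' \
    'barley07.Stanford.EDU 4 raring.q@barley07.Stanford.EDU UNDEFINED' \
    'barley08.Stanford.EDU 4 raring.q@barley08.Stanford.EDU UNDEFINED' > "$sample"

expand_hostfile "$sample"
rm -f "$sample"
```

With this input the function prints eight lines, four per host, which matches the 8 slots requested with -pe orte 8.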
Here is an example run:
<source lang="sh">
$ module load abinit
$ mkdir abinittest
$ cd abinittest
$ cp -rp $ABINITHOME/share/abinit-test .
$ cd abinit-test/tutoparal/Input/
</source>
Save the job submission script above to abinit.submit in this directory.
Now we will submit the job:
<source lang="sh">
$ qsub abinit.submit
Your job 1143544 ("abinit") has been submitted
bishopj@scorn:~/abinittest/abinit-test/tutoparal/Input$ qstat
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
1143544 0.39219 abinit     bishopj      r     10/11/2013 22:11:12 raring.q@barley07.Stanford.EDU     8
</source>
Wait a few minutes for the job to complete, then look at its output file:
<source lang="sh">
$ cat abinit.o1143544
Got 8 slots
jobid 1143544
pwd
/afs/ir/users/b/i/bishopj/farmshare/abinit/abinit-test/tutoparal/Input

nslots: 8

Fri Oct 11 22:11:27 PDT 2013
Fri Oct 11 22:11:36 PDT 2013
Fri Oct 11 22:11:50 PDT 2013
Fri Oct 11 22:24:13 PDT 2013
</source>
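The four timestamps bracket the two mpirun calls, so the time spent in each step can be read off by subtraction. A minimal sketch, assuming all timestamps fall on the same day; hms_to_secs and elapsed are hypothetical helper names.

```sh
# Convert HH:MM:SS to seconds since midnight.
hms_to_secs() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Elapsed seconds between two clock times on the same day.
elapsed() {
    echo $(( $(hms_to_secs "$2") - $(hms_to_secs "$1") ))
}

prep=$(elapsed 22:11:27 22:11:36)    # first mpirun (preparatory step)
dfpt=$(elapsed 22:11:50 22:24:13)    # second mpirun (DFPT step)
total=$(elapsed 22:11:27 22:24:13)   # first to last timestamp
echo "prep: ${prep}s  dfpt: ${dfpt}s  total: ${total}s"
```

For this run the total is 766 seconds, a little under 13 minutes.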
scaling behavior
The runtimes for 4, 8, and 16 core jobs show good scaling behavior. These are the total wall-clock times I observed, measured from the first to the last timestamp in each job's output:
- 4 cores: 40 minutes
- 8 cores: 13 minutes
- 16 cores: 7 minutes
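The observed times can be turned into speed-up and parallel efficiency figures. The sketch below takes the 4 core run as the baseline, which is a choice made here because no single-core time was measured; scaling is a hypothetical helper.

```sh
# Hypothetical helper: read "cores minutes" pairs on stdin and print the
# speed-up and parallel efficiency relative to the 4 core, 40 minute run.
scaling() {
    awk -v base_cores=4 -v base_min=40 '{
        speedup = base_min / $2
        efficiency = speedup / ($1 / base_cores)
        printf "%d cores: speed-up %.2f, efficiency %.2f\n", $1, speedup, efficiency
    }'
}

printf '4 40\n8 13\n16 7\n' | scaling
```

Efficiencies above 1 come out for the 8 and 16 core runs, so the 4 core run is probably not a clean baseline. Its job output shows about 19 minutes between the first two timestamps, against 9 seconds in the 8 core run.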
To try this, change the 8 in the #$ -pe orte 8 line of the job submission script to the desired number of cores. The job output for the 4, 8, and 16 core runs follows.
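Instead of editing the script by hand for each run, a copy with a different slot count can be generated. This is a sketch only: set_slots is a hypothetical helper, and it assumes the script contains a single line starting with #$ -pe orte.

```sh
# Hypothetical helper: copy a job script from stdin to stdout,
# replacing the slot count on its "#$ -pe orte N" line.
# Usage: set_slots N < script > new_script
set_slots() {
    awk -v n="$1" '/^#\$ -pe orte / { print "#$ -pe orte " n; next } { print }'
}

# Demo on a two-line stand-in for the job script.
printf '#$ -l mem_free=1G\n#$ -pe orte 8\n' | set_slots 16
```

A 16 core variant could then be produced with set_slots 16 < abinit.submit > abinit16.submit (the second file name is made up here) and submitted with qsub. Note that the README excerpt says the preparatory step runs on at most 16 processors with the current input file.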
<source lang="sh">
Got 4 slots
jobid 1143471
pwd
/afs/ir/users/b/i/bishopj/farmshare/abinit/abinit-test/tutoparal/Input

nslots: 4

Fri Oct 11 23:03:12 PDT 2013
Fri Oct 11 23:22:40 PDT 2013
Fri Oct 11 23:22:53 PDT 2013
Fri Oct 11 23:43:49 PDT 2013
</source>
<source lang="sh">
Got 8 slots
jobid 1143469
pwd
/afs/ir/users/b/i/bishopj/farmshare/abinit/abinit-test/tutoparal/Input

nslots: 8

Fri Oct 11 22:11:27 PDT 2013
Fri Oct 11 22:11:36 PDT 2013
Fri Oct 11 22:11:50 PDT 2013
Fri Oct 11 22:24:13 PDT 2013
</source>
<source lang="sh">
Got 16 slots
jobid 1143470
pwd
/afs/ir/users/b/i/bishopj/farmshare/abinit/abinit-test/tutoparal/Input

nslots: 16

Fri Oct 11 22:39:00 PDT 2013
Fri Oct 11 22:39:08 PDT 2013
Fri Oct 11 22:39:21 PDT 2013
Fri Oct 11 22:46:08 PDT 2013
</source>