Not enough memory
Revision as of 13:34, 10 May 2019
First of all, the memory requirements of the serial version can be estimated using the makeparam utility (see Memory requirements). At present, however, there is no way to estimate the memory requirements of the parallel version.
In fact, it can be difficult to run huge jobs on "thin" T3E or SP2 nodes. Most tables (pseudopotentials etc.) and the executable must be held on all nodes (10-20 Mbytes). In addition, one complex array of the size is allocated on each node; during dynamic simulations up to three such arrays are allocated. Upon reading and writing the charge density, a complex array that can hold all data points of the charge density is allocated (8*NGXF*NGYF*NGZF). Finally, three such arrays are allocated (and deallocated) during the charge density symmetrisation, which usually takes the largest amount of memory. All other data are distributed among all nodes.
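The per-node cost of the charge-density arrays described above can be sketched as a back-of-the-envelope calculation. This is only an illustration: it assumes that the expression 8*NGXF*NGYF*NGZF is a size in bytes, the function names are hypothetical, and the grid dimensions in the example are made up rather than taken from a real calculation.

```python
def charge_density_array_bytes(ngxf: int, ngyf: int, ngzf: int) -> int:
    """Size of one array holding the full charge density.

    Uses the 8*NGXF*NGYF*NGZF expression from the text; reading the
    result as bytes is an assumption of this sketch.
    """
    return 8 * ngxf * ngyf * ngzf


def symmetrisation_peak_bytes(ngxf: int, ngyf: int, ngzf: int) -> int:
    """Three such arrays are allocated during charge density symmetrisation."""
    return 3 * charge_density_array_bytes(ngxf, ngyf, ngzf)


def to_mib(n_bytes: int) -> float:
    """Convert a byte count to MiB (2**20 bytes)."""
    return n_bytes / 2**20


if __name__ == "__main__":
    # Hypothetical fine-grid dimensions, chosen only for illustration.
    ngxf, ngyf, ngzf = 200, 200, 200
    one = charge_density_array_bytes(ngxf, ngyf, ngzf)
    peak = symmetrisation_peak_bytes(ngxf, ngyf, ngzf)
    print(f"one array:           {to_mib(one):8.1f} MiB per node")
    print(f"symmetrisation peak: {to_mib(peak):8.1f} MiB per node")
```

Because these arrays are not distributed, the estimate applies to every node regardless of how many nodes are used.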
The following things can be tried to reduce the memory requirements on each node.
- The executable may become smaller if the options -G1 (T3E) and -g are removed from the OFLAG and DEBUG lines in the makefile.
- Switch off symmetrisation (ISYM=0). Symmetrisation is done locally on each node and requires three huge arrays. VASP.4.4.2 (and newer versions) has a switch to run a more memory-conserving symmetrisation, which is selected by specifying ISYM=2. Results might, however, differ somewhat from ISYM=1 (usually by only 1/100th of a meV). Also avoid writing or reading the CHGCAR file (LCHARG=.FALSE.).
- Use NPAR=1.
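The INCAR-related suggestions in the list above can be collected in a small helper. This is a minimal sketch, not part of VASP: the function names and the `keep_symmetry` parameter are hypothetical, and only the tag values named in the list (ISYM, LCHARG, NPAR) are used.

```python
def low_memory_incar_tags(keep_symmetry: bool = False) -> dict:
    """Return the memory-saving INCAR tags named in the list above.

    keep_symmetry=False switches symmetrisation off (ISYM=0);
    keep_symmetry=True selects the memory-conserving variant (ISYM=2).
    """
    return {
        "ISYM": 2 if keep_symmetry else 0,
        "LCHARG": ".FALSE.",  # do not write the CHGCAR file
        "NPAR": 1,
    }


def render_incar(tags: dict) -> str:
    """Format the tags as 'NAME = value' lines, one per tag."""
    return "\n".join(f"{name} = {value}" for name, value in tags.items())


if __name__ == "__main__":
    # Print lines that could be merged by hand into an existing INCAR file.
    print(render_incar(low_memory_incar_tags(keep_symmetry=True)))
```

ISYM=0 and ISYM=2 are alternatives, which is why the sketch exposes the choice as a single parameter rather than setting both.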
VASP relies heavily on dynamic memory allocation (ALLOCATE and DEALLOCATE). As far as we know there is no memory leakage (ALLOCATE without DEALLOCATE), but it is impossible to be entirely sure that none exists. Some users have observed that the memory used by the code grows during dynamic simulations on the T3E. This is most likely due to "problematic" dynamic memory management in the f90 runtime system rather than to a programming error in VASP. Unfortunately, the dynamic memory subsystems of most f90 compilers are still rather inefficient. As a result, memory can become more and more fragmented during a run, so that large pieces of memory can no longer be allocated. We can only hope for improvements in dynamic memory management (for instance, the introduction of garbage collectors).