RMM-DIIS
Latest revision as of 09:43, 14 November 2023
The implementation of the Residual Minimization Method with Direct Inversion in the Iterative Subspace (RMM-DIIS) in VASP[1][2] is based on the original work of Pulay:[3]
- The procedure starts with the evaluation of the preconditioned residual vector for some selected orbital <math>\psi^0_m</math>:
:<math>K\,R(\psi^0_m)</math>
- where <math>K</math> is the preconditioning function, and the residual is computed as:
:<math>R(\psi_m) = \left(H - \epsilon_m\right)\psi_m</math>
- with
:<math>\epsilon_m = \frac{\langle\psi_m|H|\psi_m\rangle}{\langle\psi_m|\psi_m\rangle}</math>
- Then a Jacobi-like trial step is taken in the direction of the vector:
:<math>\psi^1_m = \psi^0_m + \lambda\,K\,R(\psi^0_m)</math>
- and a new residual vector is determined:
:<math>R(\psi^1_m)</math>
- Next a linear combination of the initial orbital <math>\psi^0_m</math> and the trial orbital <math>\psi^1_m</math>
:<math>\bar{\psi}^1 = \sum_{j=0}^{1}\alpha_j\,\psi^j_m</math>
- is sought, such that the norm of the residual vector is minimized. Assuming linearity in the residual vector:
:<math>R(\bar{\psi}^1) = \sum_{j=0}^{1}\alpha_j\,R(\psi^j_m)</math>
- this requires the minimization of:
:<math>\sum_{i,j}\alpha^{*}_{i}\,\alpha_{j}\,\langle R(\psi^i_m)|R(\psi^j_m)\rangle</math>
- with respect to the coefficients <math>\alpha_j</math>, subject to the normalization of <math>\bar{\psi}^1</math>.
- This step is usually called direct inversion in the iterative subspace (DIIS).
- The next trial step (<math>M=2</math>) starts from <math>\bar{\psi}^1</math>, along the direction <math>K\,R(\bar{\psi}^1)</math>. In each iteration <math>M</math> is increased by 1, and a new trial orbital:
:<math>\psi^M_m = \bar{\psi}^{M-1} + \lambda\,K\,R(\bar{\psi}^{M-1})</math>
- and its corresponding residual vector <math>R(\psi^M_m)</math> are added to the iterative subspace, which is subsequently inverted to yield <math>\bar{\psi}^M</math>.
- The algorithm keeps iterating until the norm of the residual has dropped below a certain threshold, or the maximum number of iterations per orbital has been reached (NRMM).
- Replace <math>\psi^0_m</math> by <math>\bar{\psi}^M</math> and move on to start work on the next orbital, e.g. <math>\psi^0_{m+1}</math>.
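The per-orbital loop described above can be sketched numerically. The following is a minimal illustration only, not the VASP implementation: it assumes an identity preconditioner (K = 1), a fixed trial step, and the common sum-to-one (Pulay) constraint in the DIIS minimization; all names and sizes are made up for the example.

```python
import numpy as np

# Toy Hermitian "Hamiltonian"; in VASP, H acts on plane-wave coefficients.
rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n))
H = (A + A.T) / 2

def residual(H, psi):
    """R(psi) = (H - eps) psi, with eps the Rayleigh quotient."""
    eps = psi @ H @ psi / (psi @ psi)
    return H @ psi - eps * psi, eps

def rmm_diis_orbital(H, psi0, lam=0.3, nrmm=8, tol=1e-8):
    """Refine one orbital by RMM-DIIS (identity preconditioner, fixed step)."""
    psi0 = psi0 / np.linalg.norm(psi0)
    R0, eps = residual(H, psi0)
    psis, Rs = [psi0], [R0]
    psib, Rb = psi0, R0                      # current "bar" orbital and residual
    for _ in range(nrmm):
        if np.linalg.norm(Rb) < tol:         # converged for this orbital
            break
        trial = psib + lam * Rb              # Jacobi-like trial step (K = 1)
        Rt, _ = residual(H, trial)
        psis.append(trial)
        Rs.append(Rt)
        # DIIS: minimize |sum_j a_j R_j|^2 subject to sum_j a_j = 1
        # via a Lagrange-multiplier linear system over the subspace.
        m = len(Rs)
        B = np.array([[Ri @ Rj for Rj in Rs] for Ri in Rs])
        M = np.zeros((m + 1, m + 1))
        M[:m, :m] = B
        M[m, :m] = M[:m, m] = 1.0
        rhs = np.zeros(m + 1)
        rhs[m] = 1.0
        a = np.linalg.lstsq(M, rhs, rcond=None)[0][:m]
        psib = sum(ai * p for ai, p in zip(a, psis))
        Rb, eps = residual(H, psib)
    return psib / np.linalg.norm(psib), eps
```

In a full calculation this routine would run once per orbital (or per block of orbitals), with the result replacing the input orbital before moving on to the next one.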
The size of the trial step <math>\lambda</math> is critical for the stability of the algorithm. We have found that a reasonable choice for the trial step can be obtained from the minimization of the Rayleigh quotient along the search direction in the first step; this optimal <math>\lambda</math> is then used for a particular orbital until the algorithm moves on to the next orbital.[1][2]
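Choosing the trial step by minimizing the Rayleigh quotient along the search direction can be illustrated as follows. This is a toy sketch, not the VASP routine: the line minimization over psi + lambda*d is solved exactly as a 2x2 generalized eigenproblem in span{psi, d}, reduced to standard form with a Cholesky factorization.

```python
import numpy as np

# Toy Hermitian matrix standing in for the Hamiltonian.
rng = np.random.default_rng(1)
n = 40
A = rng.standard_normal((n, n))
H = (A + A.T) / 2

def optimal_step(H, psi, d):
    """Step lambda minimizing the Rayleigh quotient of psi + lambda*d.

    Solves the 2x2 generalized eigenproblem Hs c = w Ss c in span{psi, d}
    and returns lambda = c[1]/c[0] for the lowest eigenvector."""
    V = np.column_stack([psi, d])
    Hs = V.T @ H @ V                # projected 2x2 Hamiltonian
    Ss = V.T @ V                    # 2x2 overlap matrix
    L = np.linalg.cholesky(Ss)      # Ss = L L^T, reduce to standard form
    Linv = np.linalg.inv(L)
    w, y = np.linalg.eigh(Linv @ Hs @ Linv.T)
    c = Linv.T @ y[:, 0]            # lowest state, coefficients in {psi, d}
    return c[1] / c[0]
```

In the sketch of the full algorithm, such a lambda would be computed once from the first search direction of each orbital and then reused for that orbital's subsequent trial steps.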
As mentioned before, the optimization of an orbital is stopped when either the maximum number of iterations per orbital (NRMM) or a certain convergence threshold has been reached. The latter may be fine-tuned by means of the EBREAK, DEPER, and WEIMIN tags. Note: we do not recommend doing so; rely on the defaults instead.
The RMM-DIIS algorithm works on a "per-orbital" basis and as such it trivially parallelizes over orbitals, which is the default parallelization strategy of VASP. However, to cast some of the operations involved into the form of matrix-matrix multiplications and leverage the performance of BLAS3 library calls, the RMM-DIIS implementation in VASP works on NSIM orbitals simultaneously.
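The benefit of working on several orbitals at once can be illustrated with a toy example: applying H to a block of nsim orbitals is a single matrix-matrix product (a BLAS3 `gemm`) instead of nsim matrix-vector products (BLAS2). The sizes and names below are illustrative, not taken from the VASP code.

```python
import numpy as np

# Toy Hamiltonian and a set of orbitals stored as columns.
rng = np.random.default_rng(2)
n, nbands, nsim = 200, 32, 4
A = rng.standard_normal((n, n))
H = (A + A.T) / 2
Psi = rng.standard_normal((n, nbands))

# One orbital at a time: nbands matrix-vector products (BLAS2).
HPsi_loop = np.column_stack([H @ Psi[:, m] for m in range(nbands)])

# nsim orbitals at a time: nbands/nsim matrix-matrix products (BLAS3),
# which is how blocking exposes the better-performing kernels.
HPsi_block = np.empty_like(Psi)
for start in range(0, nbands, nsim):
    HPsi_block[:, start:start + nsim] = H @ Psi[:, start:start + nsim]

assert np.allclose(HPsi_loop, HPsi_block)
```

Both variants compute the same result; the blocked form simply trades many small BLAS2 calls for fewer, larger BLAS3 calls.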
Note that, in the self-consistency cycle of VASP, subspace rotation and RMM-DIIS refinement of the orbitals alternate. Furthermore, VASP re-orthonormalizes the orbitals after the RMM-DIIS refinement step. It should be emphasized that, in principle, the RMM-DIIS method should also converge without any explicit subspace diagonalization and/or re-orthonormalization. However, in our experience their inclusion speeds up convergence so substantially that it shortens the time-to-solution of most calculations, even though these operations scale as <math>O(N^3)</math>.[1][2]
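The re-orthonormalization step can be sketched with a QR factorization, which restores orthonormal columns spanning the same space; this is a generic stand-in for the operation, not the routine VASP uses.

```python
import numpy as np

# After independent per-orbital updates the orbitals (columns of Psi)
# are in general no longer orthonormal.
rng = np.random.default_rng(3)
n, nbands = 100, 8
Psi = rng.standard_normal((n, nbands))

# QR factorization: Q has orthonormal columns spanning the same subspace.
Q, _ = np.linalg.qr(Psi)
assert np.allclose(Q.T @ Q, np.eye(nbands))
```

This is one of the O(N^3) operations mentioned above: for nbands orbitals of length n, the factorization costs on the order of n * nbands^2 operations.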
A drawback of the RMM-DIIS method is that it always converges toward the eigenstates closest to the initial trial orbitals. This leads, in principle, to serious problems, because there is no guarantee of convergence to the correct ground state at all: if the initial set of orbitals does not "span" the ground state, some eigenstates may be "missing" from the final solution. To avoid this, the orbitals must be initialized with great care. Therefore, either the number of non-selfconsistent cycles at the start of the self-consistency cycle is chosen to be large (NELMDL = 12 for ALGO = VeryFast), or the non-selfconsistent cycles are done with the blocked-Davidson algorithm before switching over to RMM-DIIS (ALGO = Fast).
RMM-DIIS is approximately a factor of 1.5 to 2 faster than the blocked-Davidson algorithm, but less robust.