Category:Bethe-Salpeter equations: Difference between revisions
Latest revision as of 13:54, 20 December 2024
The formalism of the Bethe-Salpeter equation (BSE) allows for calculating the polarizability with the electron-hole interaction and constitutes the state of the art for calculating absorption spectra in solids.
Theory
Bethe-Salpeter equation
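In the BSE, the excitation energies correspond to the eigenvalues <math>\omega_\lambda</math> of the following linear problem[1]
:<math>
\left(\begin{array}{cc}
\mathbf{A} & \mathbf{B} \\
\mathbf{B}^* & \mathbf{A}^*
\end{array}\right)\left(\begin{array}{l}
\mathbf{X}_\lambda \\
\mathbf{Y}_\lambda
\end{array}\right)=\omega_\lambda\left(\begin{array}{cc}
\mathbf{1} & \mathbf{0} \\
\mathbf{0} & -\mathbf{1}
\end{array}\right)\left(\begin{array}{l}
\mathbf{X}_\lambda \\
\mathbf{Y}_\lambda
\end{array}\right)
</math>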
The matrices <math>\mathbf{A}</math> and <math>\mathbf{A}^*</math> describe the resonant and anti-resonant transitions between the occupied and unoccupied states
The energies and orbitals of these states are usually obtained in a <math>GW</math> calculation, but DFT and hybrid-functional calculations can be used as well. The electron-electron interaction and electron-hole interaction are described via the bare Coulomb potential <math>V</math> and the screened potential <math>W</math>.
The coupling between resonant and anti-resonant terms is described via the terms <math>\mathbf{B}</math> and <math>\mathbf{B}^*</math>
Due to the presence of this coupling, the Bethe-Salpeter Hamiltonian is non-Hermitian.
Tamm-Dancoff approximation
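A common approximation to the BSE is the Tamm-Dancoff approximation (TDA), which neglects the coupling between resonant and anti-resonant terms, i.e., the terms <math>\mathbf{B}</math> and <math>\mathbf{B}^*</math>. Hence, the TDA reduces the BSE to a Hermitian problem
:<math>\mathbf{A}\mathbf{X}_\lambda=\omega_\lambda \mathbf{X}_\lambda</math>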
In reciprocal space, the matrix <math>\mathbf{A}</math> is written as
where <math>\Omega</math> is the cell volume, <math>\bar{V}</math> is the bare Coulomb potential without the long-range part
and the screened Coulomb potential
Here, the dielectric function <math>\epsilon</math> describes the screening in <math>W</math> within the random-phase approximation (RPA)
Although the dielectric function is frequency-dependent, the static approximation <math>W_{\mathbf{G}, \mathbf{G}^{\prime}}(\mathbf{q}, \omega=0)</math> is considered a standard for practical BSE calculations.
Scaling
The steep scaling of BSE with the system size can be a limiting factor for its application in large systems. This should be considered when performing BSE calculations.
Building matrix
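As a first step, the ALGO = BSE/TDHF algorithm requires building the Hamiltonian of rank
:<math>N_{\rm rank} = N_k\times N_c\times N_v</math>,
where <math>N_k</math> is the number of k-points in the full Brillouin zone and <math>N_c</math> and <math>N_v</math> are the number of conduction and valence bands, respectively. This computation scales as
:<math>N_k\times N_q\times (N_v\times N_v\times N_G\times N_c\times N_c)</math>,
where <math>N_q</math> is the number of q-points and <math>N_G</math> is the number of G-vectors. As a rough estimate, this computation scales between <math>N^4</math> and <math>N^5</math> with the system size <math>N</math>.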
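The two expressions above can be evaluated directly for a planned calculation. The following Python sketch is only an illustration: the function names and the example sizes are hypothetical, and the result is a unitless operation count rather than a timing prediction.

```python
def bse_rank(n_k: int, n_c: int, n_v: int) -> int:
    """Rank of the BSE Hamiltonian: N_rank = N_k * N_c * N_v."""
    return n_k * n_c * n_v


def bse_build_cost(n_k: int, n_q: int, n_v: int, n_c: int, n_g: int) -> int:
    """Operation-count estimate N_k * N_q * (N_v * N_v * N_G * N_c * N_c)."""
    return n_k * n_q * (n_v * n_v * n_g * n_c * n_c)


if __name__ == "__main__":
    # Hypothetical example sizes, not taken from a real calculation.
    n_k, n_q, n_v, n_c, n_g = 64, 64, 4, 8, 500
    print("N_rank     =", bse_rank(n_k, n_c, n_v))
    print("build cost =", bse_build_cost(n_k, n_q, n_v, n_c, n_g))
```

Doubling the number of valence and conduction bands in this estimate doubles each of the four band factors, so the build cost grows by a factor of 16 while the rank grows by a factor of 4.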
Solving equation
In the second step, the equation has to be solved. VASP provides different methods for doing that.
Exact diagonalization
The exact diagonalization algorithm (IBSE = 2) scales cubically with the matrix rank, <math>N_{\rm rank}^3</math>, or as <math>N^6</math> with the system size.
Iterative solution
The iterative solutions, as in the time-evolution (IBSE = 1) or Lanczos (IBSE = 3) algorithms, do not require diagonalizing the full matrix but instead require computing matrix-vector multiplications for a number of steps or iterations <math>m</math>. Thus, solving the equation via the time-evolution or Lanczos algorithm scales as <math>N_{\rm rank}^2\times m</math> or <math>N^4</math> with the system size. The number of iterations depends on the algorithm and the required precision, which can be selected via BSEPREC.
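To make the difference between the two solver classes concrete, the sketch below compares the operation-count estimates quoted above. The function names and the example values of the rank and of the number of iterations are hypothetical; the actual number of iterations depends on the algorithm and on BSEPREC.

```python
def exact_diagonalization_cost(n_rank: int) -> int:
    """Cubic estimate for exact diagonalization: N_rank^3."""
    return n_rank ** 3


def iterative_cost(n_rank: int, m: int) -> int:
    """Estimate for m matrix-vector multiplications: N_rank^2 * m."""
    return n_rank ** 2 * m


def cost_ratio(n_rank: int, m: int) -> float:
    """Ratio of the two estimates, which reduces to N_rank / m."""
    return exact_diagonalization_cost(n_rank) / iterative_cost(n_rank, m)


if __name__ == "__main__":
    # Hypothetical values: rank 100000 and 1000 iterations.
    print("cost ratio (diagonalization / iterative):", cost_ratio(100_000, 1_000))
```

In this estimate the iterative solvers pay off whenever the number of iterations is much smaller than the matrix rank.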
Exact diagonalization
The diagonalization of the BSE Hamiltonian can be performed using various eigensolvers provided in the ScaLAPACK, ELPA, and cuSolver libraries. The advantage of this approach is that the eigenvectors can be directly obtained and used for the analysis of the excitons. Using the eigenvalues and eigenvectors of the BSE Hamiltonian, the macroscopic dielectric function, which accounts for the excitonic effects, can be found
The following features are currently supported:
- Calculating the dielectric function and eigenvectors
- Calculations beyond Tamm-Dancoff approximation
- Calculations of for
- Fatband plot
Time evolution
Alternatively, it is possible to use the time-evolution algorithm which applies a short Dirac delta pulse of electric field and then follows the evolution of the dipole moments. The dielectric function is found via a Fourier transform [2]
- ,
where and are the dipole moments.
The solution found this way is strictly equivalent to the solution obtained from exact diagonalization and can be used for obtaining the absorption spectrum, but it does not yield the eigenvectors, which can be limiting for the analysis of the excitons. The advantage of this approach is the quadratic scaling with the rank of the BSE Hamiltonian, <math>N_{\rm rank}^2</math>.
The time-evolution algorithm can be selected by setting IBSE = 1 in a BSE calculation. The required number of steps in the time-evolution calculation depends on the broadening CSHIFT and the maximum energy OMEGAMAX. The precision can be selected via the BSEPREC tag.
Mind: The required number of steps does not depend on the size of the Hamiltonian.
The following features are currently supported:
- Calculating the dielectric function
- Calculations beyond the Tamm-Dancoff approximation
Lanczos algorithm
The expression for the dielectric function can be re-written as a continued fraction
where <math>|u_0\rangle</math> is an initial guess vector computed from the dipole moments, <math>|u_0\rangle = \sum_{cv\mathbf{k}} \langle c\mathbf{k}|r_\alpha|v\mathbf{k}\rangle \langle v\mathbf{k}|r_\beta|c\mathbf{k}\rangle</math>. The <math>a</math> and <math>b</math> coefficients are evaluated iteratively, with the iterative algorithm stopping once the difference between <math>\epsilon(\omega)</math> from two consecutive iterations is below a certain threshold selected by BSEPREC.
Using the dipole moments as the starting point means that the iterative algorithm is sensitive only to optically active transitions, i.e., <math>v\to c</math> transitions with non-zero dipole moment. As such, the algorithm will ignore optically inactive transitions and can reach convergence faster than other methods for larger matrices.
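The idea of generating the <math>a</math> and <math>b</math> coefficients from a starting vector and evaluating a continued fraction can be illustrated with a generic textbook Lanczos recursion. The sketch below is not the VASP implementation: it assumes a small real symmetric matrix stored as nested lists, runs a fixed number of steps instead of a BSEPREC-controlled stopping criterion, and evaluates only the resolvent element <math>\langle u_0|(z-H)^{-1}|u_0\rangle</math> without the prefactors needed for <math>\epsilon(\omega)</math>. All function names are hypothetical.

```python
import math


def matvec(h, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(hij * xj for hij, xj in zip(row, x)) for row in h]


def lanczos_coefficients(h, u0, m):
    """Run up to m Lanczos steps on the real symmetric matrix h starting from u0.

    Returns (a, b, norm2): the diagonal coefficients a, the off-diagonal
    coefficients b (b[0] is an unused placeholder), and <u0|u0>.
    """
    norm0 = math.sqrt(sum(x * x for x in u0))
    q_prev = [0.0] * len(u0)
    q = [x / norm0 for x in u0]
    a, b = [], [0.0]
    for _ in range(m):
        w = matvec(h, q)
        w = [wi - b[-1] * qp for wi, qp in zip(w, q_prev)]
        a_j = sum(qi * wi for qi, wi in zip(q, w))
        w = [wi - a_j * qi for wi, qi in zip(w, q)]
        a.append(a_j)
        b_next = math.sqrt(sum(wi * wi for wi in w))
        if b_next < 1e-12:  # the Krylov space is exhausted
            break
        b.append(b_next)
        q_prev, q = q, [wi / b_next for wi in w]
    return a, b, norm0 ** 2


def continued_fraction(a, b, norm2, z):
    """Evaluate <u0|(z - H)^-1|u0> from the Lanczos coefficients, bottom up."""
    f = z - a[-1]
    for j in range(len(a) - 2, -1, -1):
        f = z - a[j] - b[j + 1] ** 2 / f
    return norm2 / f


if __name__ == "__main__":
    # Toy 2x2 example; z carries a small imaginary broadening.
    h = [[1.0, 0.0], [0.0, 2.0]]
    u0 = [1.0, 1.0]
    z = 3.0 + 0.1j
    a, b, norm2 = lanczos_coefficients(h, u0, m=2)
    print("continued fraction:", continued_fraction(a, b, norm2, z))
    print("direct sum        :", 1 / (z - 1.0) + 1 / (z - 2.0))
```

Only matrix-vector products with the matrix are needed, and components of the spectrum that have no overlap with the starting vector never enter the recursion, which mirrors the behavior described above for optically inactive transitions.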
The following features are currently supported:
- Calculating the dielectric function
Performing BSE calculations on GPU
As of VASP 6.5, the BSE/TDHF calculations with IBSE = 1 or IBSE = 2 can be fully run on NVIDIA GPUs.
To offload the BSE calculations to GPUs, VASP has to be compiled with the cuSOLVERMp and cuBLASMp libraries provided with NVHPC-SDK 24.7 or newer.
To use these libraries, VASP has to be compiled with HPC-X (the MPI shipped with NVHPC-SDK), which can be loaded via
module load nvhpc-hpcx-cuda12/24.7
To enable these libraries in VASP, make sure to include the following lines in your makefile.include
CPP_OPTIONS+= -DCUSOLVERMP -DCUBLASMP
LLIBS += -cudalib=cusolvermp,cublasmp -lnvhpcwrapcal
To perform the BSE calculation on GPUs, VASP needs to store the full BSE Hamiltonian in the GPU memory, which is often the limiting factor. The memory required to store the BSE Hamiltonian can be estimated as <math>N_{\rm rank}^2\times 16\cdot 10^{-9}</math> GB for ANTIRES = 0. In the case of exact diagonalization (IBSE = 2), the eigensolver requires additional scratch space.
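The memory estimate can be checked before submitting a job. The following Python sketch applies the formula above for ANTIRES = 0; the factor 16 presumably corresponds to the 16 bytes of one double-precision complex matrix element. The function names and example numbers are hypothetical, and the additional eigensolver scratch space for IBSE = 2 is not included.

```python
import math


def bse_hamiltonian_memory_gb(n_rank: int) -> float:
    """Estimate N_rank^2 * 16e-9 GB for storing the BSE Hamiltonian (ANTIRES = 0)."""
    return n_rank ** 2 * 16e-9


def max_rank_for_memory_gb(mem_gb: float) -> int:
    """Largest rank whose Hamiltonian fits into mem_gb according to the same estimate."""
    n_elements = int(mem_gb * 1e9) // 16  # number of 16-byte matrix elements
    return math.isqrt(n_elements)


if __name__ == "__main__":
    # Hypothetical example: rank 50000 and 80 GB of total GPU memory.
    print("memory for N_rank = 50000:", bse_hamiltonian_memory_gb(50_000), "GB")
    print("largest rank for 80 GB   :", max_rank_for_memory_gb(80.0))
```

Because the estimate is quadratic in the rank, doubling the number of k-points or bands included in the calculation quadruples the memory requirement.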
Mind: When running BSE calculations on GPUs, we recommend not setting OMEGAMAX or setting it to a larger value so that all the bands selected in NBANDSV and NBANDSO are included in the kernel. Otherwise, additional data transfers between CPU and GPU might be required, which leads to a serious performance degradation on GPUs.
How to
- Practical guide for solving the Bethe-Salpeter equation via diagonalization: BSE calculations
- Practical guide for solving the Casida equation via diagonalization: TDDFT calculations
References
- T. Sander, E. Maggio, and G. Kresse, Beyond the Tamm-Dancoff approximation for extended systems using exact diagonalization, Phys. Rev. B 92, 045209 (2015).
- T. Sander and G. Kresse, Macroscopic dielectric function within time-dependent density functional theory—Real time evolution versus the Casida approach, J. Chem. Phys. 146, 064110 (2017).
Pages in category "Bethe-Salpeter equations"
The following 24 pages are in this category, out of 24 total.