VASPml library: Difference between revisions

From VASP Wiki
No edit summary
(10 intermediate revisions by the same user not shown)
Line 1: Line 1:
VASPml is a C++ library accompanying {{VASP}} that provides functionality related to machine-learned force fields. It is intended to extend, and eventually replace, the original Fortran machine learning code inside {{VASP}}. Currently, it does not offer any training capabilities and focuses on inference instead. VASPml is in a beta-testing stage and provides its first application, an interface to the popular molecular dynamics (MD) software [https://www.lammps.org LAMMPS]. This allows users to combine {{VASP}}-generated machine-learned force fields with the large number of MD-related features provided by LAMMPS, some of which may not be offered in {{VASP}} directly.
{{NB|warning|As of {{VASP}} 6.5.0 the VASPml library is experimental and results should be carefully checked against the standard Fortran code (compile without <code>-Dlibvaspml</code> or set {{TAGO|ML_LIB|.FALSE.}}).}}


= Build instructions =
= Supported features =


In the future, the source of VASPml will be distributed as part of the official {{VASP}} release. The build process of {{VASP}} will include the steps necessary to also compile and link the VASPml library and interfaces. However, at this point (and most likely even when integrated into {{VASP}}) it is possible to build VASPml completely independently of {{VASP}}. The following sections describe the details of such an independent build of VASPml.
* [[Running machine-learned force fields in LAMMPS]]
* Fast prediction-only mode in {{VASP}} ({{TAGO|ML_MODE|run}})


=== Prerequisites ===
If {{VASP}} is compiled with the VASPml library and a requested feature is supported by both the original Fortran code '''and''' the C++ VASPml implementation, then the latter code path is used by default. To override this behavior and explicitly avoid the use of the VASPml library, set {{TAGO|ML_LIB|.FALSE.}} in the {{FILE|INCAR}} file.


# VASPml requires a C++ compiler conforming to the C++17 language standard, for example compilers which are part of:
= Restrictions =
#* GNU Compiler Collection
#* Intel oneAPI Base Toolkit
#* NVIDIA HPC SDK
#* NEC SDK
# Numerical libraries: LAPACK and BLAS, which are distributed for example as part of:
#* OpenBLAS
#* Intel oneAPI Math Kernel Library (part of Base Toolkit)
#* NVIDIA HPC SDK
#* NEC NLC (NEC Numeric Library Collection)
# An MPI (Message Passing Interface) implementation, e.g. in
#* OpenMPI
#* Intel MPI (part of Intel oneAPI HPC Toolkit)
#* NVIDIA HPC SDK (OpenMPI)
#* NEC MPI


=== Build library and applications ===
Since the VASPml library is still under development some features available in the original Fortran code are not yet available:
* No machine learning related file output (e.g. {{FILE|ML_LOGFILE}}) {{NB|tip|When running the fast prediction-only mode in VASP, there are currently only negligible performance gains from the VASPml library. Hence, if file output is important (e.g. when monitoring the [[Best_practices_for_machine-learned_force_fields#Spilling_factor:_error_estimates_during_production_runs|spilling factor]]), we recommend using the original Fortran code ({{TAGO|ML_LIB|.FALSE.}}).|:}}
* Thermodynamic integration ({{TAG|ML_LCOUPLE}})
* Heat flux calculation ({{TAG|ML_LHEAT}})


Like {{VASP}}, VASPml requires compiler details and library paths to be entered into a file named <code>makefile.include</code> before the build process can be started. Templates for this file can be found in the <code>arch</code> subdirectory, and it is usually convenient to start from one of them. Hence, first copy a template to the base directory and rename it to <code>makefile.include</code>, e.g.
= Dependencies =


<pre>cp arch/makefile.include.gnu makefile.include</pre>
The VASPml library depends on the following compilers and external libraries:
Then modify the contents to reflect the compiler and library settings on your machine, for details see the [[#compiler-and-linker-options|compiler and linker options]] section below. Once done, the VASPml library can be built by executing this command in the top directory:
* C++ compiler supporting the C++17 language standard
* MPI
* BLAS and the corresponding C interface CBLAS
* LAPACK and the corresponding C interface LAPACKE
These requirements are usually already covered by the [[Installing_VASP.6.X.X#Requirements|{{VASP}} requirements]].  


<pre>make -j </pre>
= Build instructions =
This will automatically build the two ''targets'' <code>libvaspml</code> and <code>applications</code>, and is equivalent to running two make commands explicitly in this order:
 
<pre>make libvaspml -j
make applications -j</pre>
The <code>libvaspml</code> target builds the library with the same name and places it in the <code>lib</code> folder. The <code>applications</code> target compiles and links the standalone applications present in <code>src/applications</code>. At the moment there is only one application, <code>vaspml-predict</code>, which predicts energy, forces and stress for one <code>POSCAR</code> file with a given <code>ML_FF</code> force field file. The executable is copied to the <code>bin</code> directory.
 
With the <code>-j</code> flag present, <code>make</code> runs the build process in parallel, starting as many parallel jobs as possible. You can limit the maximum load the build process is allowed to cause with the <code>-l</code> flag; see the documentation of GNU <code>make</code> for details.


==== Compiler and linker options ====
The VASPml library is automatically built alongside {{VASP}} if <code>-Dlibvaspml</code> is added to the <code>CPP_OPTIONS</code> [[Precompiler options|precompiler option]] in the [[makefile.include]] file. In addition, a few more compiler settings regarding the C++ compiler, include paths and VASPml options may be required. The [[makefile.include#Archetypical files|makefile.include templates]] provided in {{VASP}}'s <code>arch</code> directory contain pre-filled blocks corresponding to the VASPml build. Uncomment the VASPml-related lines and fill in values appropriate for your [[Toolchains|toolchain]]. For example, when using the GCC toolchain with OpenBLAS the makefile.include section may look like this:
 
...
The following compiler and linker options in the <code>makefile.include</code> should be reviewed and, if necessary, modified before starting the build process:
# For machine learning library vaspml (experimental)
 
CPP_OPTIONS += -Dlibvaspml
* <code>CXX</code>: This should be a C++17-compatible C++ compiler with MPI support.
CPP_OPTIONS += -DVASPML_USE_CBLAS
* <code>CXXFLAGS</code>: Specifies the flags for the C++ compiler.
#CPP_OPTIONS += -DVASPML_DEBUG_LEVEL=3
* <code>INCLUDE</code>: Paths in which to look for headers of required libraries. Here the include directory of BLAS should be listed.
CXX_ML      = mpic++
* <code>FC</code> (deprecated)
CXXFLAGS_ML = -O3 -std=c++17 -pedantic-errors -Wall -Wextra
* <code>FFLAGS</code> (deprecated)
INCLUDE_ML  = -I$(OPENBLAS_ROOT)/include
 
...
==== Compile-time options ====
Apart from the mandatory <code>-Dlibvaspml</code> flag there are the following possible <code>CPP_OPTIONS</code>:
 
* <code>-DVASPML_DEBUG_LEVEL</code>: If set to 1, 2 or 3 enables various sanity checks during runtime with low, medium and high impact on performance, respectively.
* <code>-DVASPML_USE_CBLAS</code>: Use CBLAS (C interface for BLAS routines) for linear algebra. This is the default and should always be used.
* <code>-DVASPML_USE_MKL</code>: Use Intel MKL for linear algebra.
* <code>-DVASPML_DEBUG_LEVEL=[0|1|2|3]</code>: If set to 1, 2 or 3 enables various sanity checks during runtime with low, medium and high impact on performance, respectively. Setting it to 0 or omitting the flag disables runtime checks. {{NB|mind|Do not use this flag for production runs as it may decrease performance.|:}}
* <code>-DVASPML_FORTRAN_MATH</code> (deprecated): Enable legacy Fortran math routines.
* <code>-DVASPML_USE_MKL</code>: Use Intel MKL for linear algebra (must be used in combination with <code>-DVASPML_USE_CBLAS</code>).
 
In addition, VASPml requires its own compiler, flags and include path to be set:
==== Makefile options ====
* <code>CXX_ML</code>: This should be a C++17-compatible C++ compiler with MPI support (usually an MPI wrapper corresponding to the selected toolchain, e.g. <code>mpic++</code>, <code>mpicxx</code>, <code>mpicpx</code> or <code>mpinc++</code>).
 
* <code>CXXFLAGS_ML</code>: Specifies the flags for the C++ compiler. Typically, the optimization level (<code>-O3</code>) and the C++17 language standard (<code>-std=c++17</code>) are specified here.
* <code>--no-color</code>: Disables colored output of makefiles.
* <code>INCLUDE_ML</code>: Include flags for the required dependencies should be added here. {{NB|tip|For some [[toolchains]] it is not necessary to explicitly add paths here because the compilers automatically include the correct directories (e.g. Intel oneAPI, NVHPC). In other cases (e.g. GNU compiler with OpenBLAS) the given path must contain the required headers of the dependencies:
* <code>--no-logo</code>: Disables the VASPml logo output.
* CBLAS: <code>cblas.h</code>
 
* LAPACKE: <code>lapacke.h</code>|:}}
=== Automatic patching and compilations of LAMMPS ===


<pre>make lammps -j</pre>
The VASPml project (source code and related files) is located within the <code>src/vaspml</code> directory relative to the {{VASP}} root folder. Upon compilation it is copied to the <code>build/std</code>, <code>build/gam</code> and/or <code>build/ncl</code> build folders, just like all other {{VASP}} sources. If the VASPml library was compiled successfully, <code>libvaspml.a</code> will be located in <code>build/std/vaspml/lib/</code> (similarly for the <code>gam</code> and <code>ncl</code> versions). However, it is usually not necessary to check for its presence because the {{VASP}} build will handle this (and fail if VASPml cannot be built).
==== Technical details ====


From a technical standpoint LAMMPS and VASPml interact in the following way: on the LAMMPS side a new class <code>PairVASP</code> (inheriting from <code>Pair</code>) is implemented in <code>pair_vasp.cpp/h</code>. Its purpose is to transfer the neighbor lists to VASPml, trigger processing, and receive back the energy and force contributions. VASPml enters the received neighbor list data into its own structures and computes energy and force predictions according to the pre-trained machine-learned force field. A typical build for the combination of the two codes requires first compiling the <code>libvaspml</code> library. Then, LAMMPS is patched with the additional <code>pair_vasp.cpp/h</code> files, which are automatically compiled during the LAMMPS build. In the final stage, LAMMPS is linked to the <code>libvaspml</code> library, resulting in a patched executable. This can be done manually, but VASPml also offers a convenient automated way covering all steps (<code>make lammps</code>).
[[Category:Machine-learned force fields]]

Revision as of 08:33, 24 January 2025

VASPml is a C++ library accompanying VASP that provides functionality related to machine-learned force fields. It is intended to extend, and eventually replace, the original Fortran machine learning code inside VASP. Currently, it does not offer any training capabilities and focuses on inference instead. VASPml is in a beta-testing stage and provides its first application, an interface to the popular molecular dynamics (MD) software LAMMPS. This allows users to combine VASP-generated machine-learned force fields with the large number of MD-related features provided by LAMMPS, some of which may not be offered in VASP directly.

Warning: As of VASP 6.5.0 the VASPml library is experimental and results should be carefully checked against the standard Fortran code (compile without -Dlibvaspml or set ML_LIB = .FALSE.).

Supported features

If VASP is compiled with the VASPml library and a requested feature is supported by both the original Fortran code and the C++ VASPml implementation, then the latter code path is used by default. To override this behavior and explicitly avoid the use of the VASPml library, set ML_LIB = .FALSE. in the INCAR file.
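When comparing the two code paths, it can help to toggle ML_LIB in an existing INCAR without editing the file by hand. The following is a minimal sketch only: `set_incar_tag` is a hypothetical helper and not part of VASP, and it assumes one tag per line (it does not handle several tags on one line separated by `;`).

```shell
#!/bin/sh
# Hypothetical helper (not part of VASP): set or replace a "TAG = value"
# line in an INCAR-style file. Assumes one tag per line.
set_incar_tag() {
    incar="$1"; tag="$2"; value="$3"
    tmp="${incar}.tmp.$$"
    if [ -f "$incar" ]; then
        # Keep every line except an existing assignment of this tag.
        grep -v -i -E "^[[:space:]]*${tag}[[:space:]]*=" "$incar" > "$tmp"
    else
        : > "$tmp"
    fi
    # Append the new assignment at the end of the file.
    printf '%s = %s\n' "$tag" "$value" >> "$tmp"
    mv "$tmp" "$incar"
}

# Usage (run inside a calculation directory):
#   set_incar_tag INCAR ML_LIB .FALSE.
```

Running the same calculation once with and once without this setting gives the two result sets needed for the comparison recommended in the warning above.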

Restrictions

Since the VASPml library is still under development some features available in the original Fortran code are not yet available:

  • No machine learning related file output (e.g. ML_LOGFILE)
Tip: When running the fast prediction-only mode in VASP, there are currently only negligible performance gains from the VASPml library. Hence, if file output is important (e.g. when monitoring the spilling factor), we recommend using the original Fortran code (ML_LIB = .FALSE.).

Dependencies

The VASPml library depends on the following compilers and external libraries:

  • C++ compiler supporting the C++17 language standard
  • MPI
  • BLAS and the corresponding C interface CBLAS
  • LAPACK and the corresponding C interface LAPACKE

These requirements are usually already covered by the VASP requirements.

Build instructions

The VASPml library is automatically built alongside VASP if -Dlibvaspml is added to the CPP_OPTIONS precompiler option in the makefile.include file. In addition, a few more compiler settings regarding the C++ compiler, include paths and VASPml options may be required. The makefile.include templates provided in VASP's arch directory contain pre-filled blocks corresponding to the VASPml build. Uncomment the VASPml-related lines and fill in values appropriate for your toolchain. For example, when using the GCC toolchain with OpenBLAS the makefile.include section may look like this:

...
# For machine learning library vaspml (experimental)
CPP_OPTIONS += -Dlibvaspml
CPP_OPTIONS += -DVASPML_USE_CBLAS
#CPP_OPTIONS += -DVASPML_DEBUG_LEVEL=3
CXX_ML      = mpic++
CXXFLAGS_ML = -O3 -std=c++17 -pedantic-errors -Wall -Wextra
INCLUDE_ML  = -I$(OPENBLAS_ROOT)/include
...

Apart from the mandatory -Dlibvaspml flag there are the following possible CPP_OPTIONS:

  • -DVASPML_USE_CBLAS: Use CBLAS (C interface for BLAS routines) for linear algebra. This is the default and should always be used.
  • -DVASPML_DEBUG_LEVEL=[0|1|2|3]: If set to 1, 2 or 3 enables various sanity checks during runtime with low, medium and high impact on performance, respectively. Setting it to 0 or omitting the flag disables runtime checks.
Mind: Do not use this flag for production runs as it may decrease performance.
  • -DVASPML_USE_MKL: Use Intel MKL for linear algebra (must be used in combination with -DVASPML_USE_CBLAS).

In addition, VASPml requires its own compiler, flags and include path to be set:

  • CXX_ML: This should be a C++17-compatible C++ compiler with MPI support (usually an MPI wrapper corresponding to the selected toolchain, e.g. mpic++, mpicxx, mpicpx or mpinc++).
  • CXXFLAGS_ML: Specifies the flags for the C++ compiler. Typically, the optimization level (-O3) and the C++17 language standard (-std=c++17) are specified here.
  • INCLUDE_ML: Include flags for the required dependencies should be added here.
Tip: For some toolchains it is not necessary to explicitly add paths here because the compilers automatically include the correct directories (e.g. Intel oneAPI, NVHPC). In other cases (e.g. GNU compiler with OpenBLAS) the given path must contain the required headers of the dependencies:
  • CBLAS: cblas.h
  • LAPACKE: lapacke.h
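Before starting a build with an explicit INCLUDE_ML path, one may want to confirm that the directory really contains both headers. The sketch below is an illustration under assumptions: `check_ml_headers` is a hypothetical helper, and using `OPENBLAS_ROOT` as a shell environment variable (rather than only as a makefile variable) is an assumption about the local setup.

```shell
#!/bin/sh
# Hypothetical helper: report whether an include directory provides the
# CBLAS and LAPACKE headers named above. Returns 0 only if both exist.
check_ml_headers() {
    incdir="$1"
    missing=0
    for header in cblas.h lapacke.h; do
        if [ -f "$incdir/$header" ]; then
            echo "found:   $incdir/$header"
        else
            echo "missing: $incdir/$header"
            missing=1
        fi
    done
    return "$missing"
}

# Usage (assumes OPENBLAS_ROOT is exported in the shell):
#   check_ml_headers "$OPENBLAS_ROOT/include"
```

A non-zero return value suggests that INCLUDE_ML should point to a different directory, or that the installed BLAS/LAPACK package lacks its C interfaces.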

The VASPml project (source code and related files) is located within the src/vaspml directory relative to the VASP root folder. Upon compilation it is copied to the build/std, build/gam and/or build/ncl build folders, just like all other VASP sources. If the VASPml library was compiled successfully, libvaspml.a will be located in build/std/vaspml/lib/ (similarly for the gam and ncl versions). However, it is usually not necessary to check for its presence because the VASP build will handle this (and fail if VASPml cannot be built).
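Although the VASP build normally takes care of this, the library locations described above can be inspected with a few lines of shell when debugging a build. This is a sketch only; `find_libvaspml` is a hypothetical helper, and the paths follow the layout stated in the paragraph above.

```shell
#!/bin/sh
# Hypothetical helper: list the VASP build flavors (std, gam, ncl) whose
# build folder contains libvaspml.a. Returns 0 if at least one is found.
find_libvaspml() {
    vasp_root="$1"
    rc=1
    for flavor in std gam ncl; do
        lib="$vasp_root/build/$flavor/vaspml/lib/libvaspml.a"
        if [ -f "$lib" ]; then
            echo "$flavor: $lib"
            rc=0
        fi
    done
    return "$rc"
}

# Usage (run from the VASP root folder):
#   find_libvaspml .
```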