ML ICRITERIA: Difference between revisions

Latest revision as of 09:09, 14 April 2023

Default: ML_ICRITERIA	= 3	for ML_MODE = SELECT
	= 1	else

Description: Decides whether (ML_ICRITERIA > 0) and, if so, how the Bayesian error threshold (ML_CTIFOR) is updated within the machine learning force field method. ML_CTIFOR determines whether a first-principles calculation is performed.

The use of this tag in combination with the learning algorithms is described in Machine learning force field calculations: Basics (section "Threshold for error of forces").

The following options are possible for ML_ICRITERIA:

ML_ICRITERIA = 0: The threshold ML_CTIFOR is not updated. This mode is only recommended for refining an existing force field. For example, if you know that ML_CTIFOR reached a value of 0.03 in previous runs, you can continue collecting training data by setting ML_CTIFOR=0.03 to capture all regions of the potential energy surface where first-principles data are still missing. To achieve extremely robust force fields, it is recommended to run NSW=100000 steps in this mode at a temperature slightly above the highest temperature to be considered.
ML_ICRITERIA = 1: Set ML_CTIFOR to a value proportional to the average of the Bayesian errors stored for ML_MHIS steps. For ML_ICRITERIA = 1, only errors obtained immediately after an update of the force field enter the average. Such updates are quite rare, so updates of ML_CTIFOR are also quite rare in this mode. Furthermore, since first-principles calculations are only performed for configurations with large Bayesian errors ("outliers"), the force field is updated only after such outliers have been taken into account. Therefore, the Bayesian errors included in the averaging are typically larger than the average Bayesian error over all steps. It is therefore recommended to set ML_CX to 0 (default) in this mode.
ML_ICRITERIA = 2: Update the threshold using the moving average of the previous Bayesian errors. This mode averages the errors of all previous predictions in the history (i.e. all previously considered MD steps), while ML_ICRITERIA = 1 averages only the predictions immediately following a retraining. The length of the history in this mode is currently hard-coded and set to 400 steps (or ML_MHIS x 50 in the newer version). This mode tends to continue sampling and is therefore somewhat prone to oversampling: as the Bayesian errors decrease, the threshold is steadily lowered and additional first-principles calculations are initiated. The recommended values for ML_CX in this mode are approximately 0.1 to 0.3. For ML_CX = 0.2, a first-principles calculation is typically performed every 50 steps. This means that if the number of ionic steps is, say, NSW=50000, about 1000 first-principles calculations are performed. For many materials, this results in a reasonably good and robust ML database.
ML_ICRITERIA = 3: This mode is the default for reselecting local reference configurations from an existing ML_AB file (ML_MODE = SELECT) and is only available when ML_MODE = SELECT is set. The ML_AB file must contain an ML_CTIFOR value for each structure it stores. VASP uses these values as Bayesian error thresholds for structure selection, so the tags ML_CTIFOR, ML_CX, ML_CSLOPE, ML_CSIG and ML_MHIS set in INCAR are ignored. If any ML_CTIFOR values are missing from the ML_AB file, VASP stops with an error and reports that values are missing.
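
As a rough illustration of how modes 0 to 2 differ, the following Python sketch models the threshold update. It is a toy model and not VASP's implementation: the class `ThresholdTracker`, the function `updated_threshold`, the scaling of the average by (1 + ML_CX), and the choice to update only once the history is full are all assumptions made for this example. The text above states only that the threshold is proportional to the average Bayesian error.

```python
from collections import deque
from statistics import mean


def updated_threshold(errors, cx):
    """Return the mean Bayesian error scaled by (1 + cx).

    The (1 + cx) scaling is an assumption of this sketch; the text only
    says the new threshold is proportional to the average error.
    """
    return mean(errors) * (1.0 + cx)


class ThresholdTracker:
    """Toy model of updating modes 0, 1 and 2 (hypothetical, not VASP code)."""

    def __init__(self, icriteria, ctifor, cx, history_length):
        self.icriteria = icriteria
        self.ctifor = ctifor
        self.cx = cx
        # Bounded history: the oldest error drops out once it is full.
        self.history = deque(maxlen=history_length)

    def record(self, bayesian_error, just_retrained):
        """Store one Bayesian error and return the (possibly updated) threshold."""
        if self.icriteria == 0:
            return self.ctifor  # mode 0: the threshold stays fixed
        if self.icriteria == 1 and not just_retrained:
            return self.ctifor  # mode 1: only errors right after retraining count
        self.history.append(bayesian_error)
        # Assumption: update only once the history holds enough entries.
        if len(self.history) == self.history.maxlen:
            self.ctifor = updated_threshold(self.history, self.cx)
        return self.ctifor
```

Fed the same stream of errors, the mode 1 tracker changes its threshold only at retraining events, whereas the mode 2 tracker follows every MD step. This is consistent with the description above that mode 2 keeps lowering the threshold as the errors decrease.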

As mentioned above, the ML_CX tag can be used to fine-tune the update of ML_CTIFOR. Whether to use ML_ICRITERIA = 1 or ML_ICRITERIA = 2 is a matter of taste. Just remember that ML_CX must be set differently in the two modes: for ML_ICRITERIA = 1, ML_CX = 0.0 is a good default, whereas for ML_ICRITERIA = 2, ML_CX = 0.2 is a good default. Most of our force fields use ML_ICRITERIA = 1, but this mode sometimes stagnates and stops performing first-principles calculations. On the other hand, and as already mentioned, ML_ICRITERIA = 2 is prone to oversampling, i.e. it may perform too many first-principles calculations.
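
The suggested defaults and the cost estimate quoted for mode 2 can be collected in a small helper. This is a hypothetical convenience script, not part of VASP: the names `incar_lines` and `expected_first_principles_calls` are invented here, and the interval of 50 steps is the typical value quoted above for ML_CX = 0.2, not a guarantee.

```python
# ML_CX defaults suggested in the text for the two self-updating modes.
SUGGESTED_CX = {1: 0.0, 2: 0.2}


def incar_lines(icriteria):
    """Return INCAR lines pairing a mode with its suggested ML_CX."""
    if icriteria not in SUGGESTED_CX:
        raise ValueError("the text suggests ML_CX only for modes 1 and 2")
    return [
        f"ML_ICRITERIA = {icriteria}",
        f"ML_CX = {SUGGESTED_CX[icriteria]}",
    ]


def expected_first_principles_calls(nsw, steps_per_call=50):
    """Estimate the number of first-principles calculations in a run.

    steps_per_call=50 is the typical interval quoted for mode 2 with
    ML_CX = 0.2; real runs will deviate from it.
    """
    return nsw // steps_per_call


if __name__ == "__main__":
    print("\n".join(incar_lines(2)))
    print(expected_first_principles_calls(50000))
```

For NSW=50000 the estimate reproduces the figure of about 1000 first-principles calculations given above.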

@@ Line 1: / Line 1: @@
-{{TAGDEF|ML_FF_LCRITERIA|[logical]|.TRUE.}}
+{{DISPLAYTITLE:ML_ICRITERIA}}
+{{TAGDEF|ML_ICRITERIA|[integer]}}
+{{DEF|ML_ICRITERIA|3|for {{TAG|ML_MODE}} {{=}} SELECT|1|else}}
-Description: Decides whether the threshold in the decision step for the Bayesian error estimation is renewed or not in the machine learning force field methods.
+Description: Decides whether ({{TAG|ML_ICRITERIA}}>0) or how the Bayesian error threshold ({{TAG|ML_CTIFOR}}) is updated within the machine learning force field method. {{TAG|ML_CTIFOR}} determines whether a first-principles calculation is performed.
 ----
+The use of this tag in combination with the learning algorithms is described here: [[Machine learning force field calculations: Basics#Threshold for error of forces|here]].
-This flag is only used if Bayesian error estimation is switched on ({{TAG|ML_FF_IERR}}=2 or 3).
+The following options are possible for {{TAG|ML_ICRITERIA}}:
+* {{TAG|ML_ICRITERIA}} = 0: The threshold {{TAG|ML_CTIFOR}} is not updated. This mode is only recommended for refining an existing force field. For example, if you know that {{TAG|ML_CTIFOR}} reached a value of 0.03 in previous runs, you can continue collecting training data by setting {{TAG|ML_CTIFOR}}=0.03 to capture all regions of the potential energy surface where first-principles data are still missing. To achieve extremely robust force fields, it is recommended to run {{TAG|NSW}}=100000 steps in this mode at a temperature slightly above the highest temperature to be considered.
+* {{TAG|ML_ICRITERIA}} = 1: Set {{TAG|ML_CTIFOR}} to a value proportional to the average of the Bayesian errors stored for {{TAG|ML_MHIS}} steps. For {{TAG|ML_ICRITERIA}} = 1, only errors obtained immediately after an update of the force field enter the average. Such updates are quite rare, so updates of {{TAG|ML_CTIFOR}} are also quite rare in this mode. Furthermore, since first-principles calculations are only performed for configurations with large Bayesian errors ("outliers"), the force field is updated only after such outliers have been taken into account. Therefore, the Bayesian errors included in the averaging are typically larger than the average Bayesian error over all steps. It is therefore recommended to set {{TAG|ML_CX}} to 0 (default) in this mode.
+* {{TAG|ML_ICRITERIA}} = 2: Update the threshold using the moving average of the previous Bayesian errors. This mode averages the errors of all previous predictions in the history (i.e. all previously considered MD steps), while {{TAG|ML_ICRITERIA}} = 1 averages only the predictions immediately following a retraining. The length of the history in this mode is currently hard-coded and set to 400 steps (or {{TAG|ML_MHIS}} x 50 in the newer version). This mode tends to continue sampling and is therefore somewhat prone to oversampling: as the Bayesian errors decrease, the threshold is steadily lowered and additional first-principles calculations are initiated. The recommended values for {{TAG|ML_CX}} in this mode are approximately 0.1 to 0.3. For {{TAG|ML_CX}} = 0.2, a first-principles calculation is typically performed every 50 steps. This means that if the number of ionic steps is, say, {{TAG|NSW}}=50000, about 1000 first-principles calculations are performed. For many materials, this results in a reasonably good and robust ML database.
+* {{TAG|ML_ICRITERIA}} = 3: This mode is the default for reselecting local reference configurations from an existing {{FILE|ML_AB}} file ({{TAG|ML_MODE}} = ''SELECT'') and is only available when {{TAG|ML_MODE}} = ''SELECT'' is set. The {{FILE|ML_AB}} file must contain an {{TAG|ML_CTIFOR}} value for each structure it stores. {{VASP}} uses these values as Bayesian error thresholds for structure selection, so the tags {{TAG|ML_CTIFOR}}, {{TAG|ML_CX}}, {{TAG|ML_CSLOPE}}, {{TAG|ML_CSIG}} and {{TAG|ML_MHIS}} set in {{FILE|INCAR}} are ignored. If any {{TAG|ML_CTIFOR}} values are missing from the {{FILE|ML_AB}} file, {{VASP}} stops with an error and reports that values are missing.
-== Related Tags and Sections ==
+As mentioned above, the {{TAG|ML_CX}} tag can be used to fine-tune the update of {{TAG|ML_CTIFOR}}.
-{{TAG|ML_FF_LMLFF}}, {{TAG|ML_FF_IERR}}, {{TAG|ML_FF_ISAMPLE}}, {{TAG|ML_FF_CSLOPE}}, {{TAG|ML_FF_CSIG}}, {{TAG|ML_FF_MHIS}}
+Whether to use {{TAG|ML_ICRITERIA}} = 1 or {{TAG|ML_ICRITERIA}} = 2 is a matter of taste. Just remember that {{TAG|ML_CX}} must be set differently in the two modes: for {{TAG|ML_ICRITERIA}} = 1, {{TAG|ML_CX}} = 0.0 is a good default, whereas for {{TAG|ML_ICRITERIA}} = 2, {{TAG|ML_CX}} = 0.2 is a good default.
+Most of our force fields use {{TAG|ML_ICRITERIA}} = 1, but this mode sometimes stagnates and stops performing first-principles calculations.
+On the other hand, and as already mentioned, {{TAG|ML_ICRITERIA}} = 2 is prone to oversampling, i.e. it may perform too many first-principles calculations.
-{{sc|ML_FF_LCRITERIA|Examples|Examples that use this tag}}
+== Related tags and articles ==
+{{TAG|ML_LMLFF}}, {{TAG|ML_CTIFOR}}, {{TAG|ML_CSLOPE}}, {{TAG|ML_CSIG}}, {{TAG|ML_MHIS}}, {{TAG|ML_CX}}
+{{sc|ML_ICRITERIA|Examples|Examples that use this tag}}
 ----
+[[Category:INCAR tag]][[Category:Machine-learned force fields]]
-[[Category:INCAR]][[Category:Machine Learning]][[Category:Machine Learned Force Fields]][[Category: Alpha]]