ML ICRITERIA: Difference between revisions
(ML_XMIX reference added)
(ML_ICRITERIA=2 explained in more detail. Attempting to give a more concise overview.)
Revision as of 09:31, 17 September 2022
ML_ICRITERIA = [integer]
Default: ML_ICRITERIA = 1
Description: Decides whether (ML_ICRITERIA>0) and how the Bayesian error threshold (ML_CTIFOR) is updated within the machine learning force field method. ML_CTIFOR determines whether a first-principles calculation is performed.
The use of this tag in combination with the learning algorithms is described in Machine learning force field calculations: Basics, section "Threshold for error of forces".
The following options are possible for ML_ICRITERIA:
- ML_ICRITERIA = 0: The threshold ML_CTIFOR is not updated. We recommend using this mode only to refine an existing force field. For instance, if you know that ML_CTIFOR took a value of 0.03 in previous runs, you might continue acquiring training data with the threshold now fixed to ML_CTIFOR=0.03, in order to catch all outliers and regions of the potential energy surface where first-principles data are still missing. To obtain highly robust force fields, we recommend running for, say, NSW=100000 (one hundred thousand steps) in this mode at the highest temperature to be considered (or slightly above it).
- ML_ICRITERIA = 1: The threshold is updated using the average of the Bayesian errors of the forces; ML_CTIFOR is set proportional to the previous average Bayesian error. In this mode, the average is taken only over the errors obtained after an update of the force field. Such updates are rare, so updates of ML_CTIFOR are also infrequent. Furthermore, since first-principles calculations are performed only for configurations with large Bayesian errors ("outliers"), the force field is updated only after outliers have been encountered. The Bayesian errors that enter the average are therefore typically larger than the overall average Bayesian error. It is thus recommended to set ML_CX to 0 in this mode (default).
(See the description of the method in Machine learning force field calculations: Basics, section "Threshold for error of forces".)
- ML_ICRITERIA = 2: The threshold is updated using a gliding average of all previous Bayesian errors. This mode averages the error over all previous predictions (that is, every previously considered MD step), whereas ML_ICRITERIA = 1 averages only over predictions immediately after retraining. The history length in this mode is currently hard-coded to 400 steps. This mode tends to keep sampling and is thus somewhat prone to oversampling: as the Bayesian errors decrease, the threshold is continuously lowered and further first-principles calculations are initiated. Recommended values for ML_CX are about 0.1 to 0.2 in this mode. For a value around ML_CX = 0.15, a first-principles calculation is typically performed every 50 steps. With the number of ionic steps set to, say, NSW=50000, about 1000 first-principles calculations are therefore performed. This yields a fairly good and robust database for machine learning for many materials.
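The gliding-average behaviour of ML_ICRITERIA = 2 and the step-count estimate can be pictured with a minimal Python sketch. It is an illustration only and not VASP's implementation: the class name `GlidingThreshold`, the helper `expected_first_principles_calls`, and in particular the update rule threshold = mean × (1 + ML_CX) are assumptions made for this example. Only the 400-step history length, the ML_CX value of 0.15, and the "every 50 steps" figure come from the text above.

```python
from collections import deque


class GlidingThreshold:
    """Toy model of a threshold that follows a gliding average of errors."""

    def __init__(self, history_length=400, cx=0.15):
        # Only the most recent `history_length` errors are kept;
        # older entries are discarded automatically by the deque.
        self.errors = deque(maxlen=history_length)
        self.cx = cx

    def update(self, bayesian_error):
        """Store one Bayesian error and return the resulting threshold.

        ASSUMPTION: threshold = mean(history) * (1 + cx). The text only
        states that the threshold is proportional to the average error.
        """
        self.errors.append(bayesian_error)
        mean_error = sum(self.errors) / len(self.errors)
        return mean_error * (1.0 + self.cx)


def expected_first_principles_calls(nsw, steps_per_call=50):
    """Rough number of first-principles calculations in a run of `nsw` steps."""
    return nsw // steps_per_call


if __name__ == "__main__":
    tracker = GlidingThreshold()
    # Hypothetical, steadily decreasing errors: the threshold follows them down.
    for error in (0.05, 0.04, 0.03, 0.02):
        print(f"error={error:.2f} threshold={tracker.update(error):.4f}")
    print(expected_first_principles_calls(50000))
```

Feeding the sketch a decreasing sequence of errors shows why this mode is prone to oversampling: the threshold drops along with the errors, so new configurations keep exceeding it.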
As already hinted above, the tag ML_CX allows fine-tuning of the update of ML_CTIFOR. Whether to use ML_ICRITERIA = 1 or ML_ICRITERIA = 2 is a matter of taste; just recall that ML_CX must be set differently for the two modes. Whereas a good default for ML_ICRITERIA = 1 is ML_CX = 0.0, a sensible default for ML_ICRITERIA = 2 is ML_CX = 0.15. Most of our force fields have been generated using ML_ICRITERIA = 1, but this mode sometimes stagnates and stops performing first-principles calculations. ML_ICRITERIA = 2, on the other hand, tends to oversample and perform too many first-principles calculations, as mentioned above.
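The pairing of mode and ML_CX value can be captured in a small helper that assembles the corresponding INCAR lines. This is a sketch for illustration: the function `incar_fragment` and the dictionary `RECOMMENDED_ML_CX` are hypothetical names, not part of VASP or any VASP tooling. The tag names and numerical values are the ones quoted in the text above.

```python
# ML_CX defaults quoted in the text; the dictionary itself is illustrative.
RECOMMENDED_ML_CX = {1: 0.0, 2: 0.15}


def incar_fragment(ml_icriteria, ml_ctifor=None):
    """Return INCAR lines for the chosen threshold-update mode.

    `ml_ctifor` is optional; it is mainly useful for ML_ICRITERIA = 0,
    where the threshold stays fixed at the given value.
    """
    if ml_icriteria not in (0, 1, 2):
        raise ValueError(f"unsupported ML_ICRITERIA: {ml_icriteria}")
    lines = [f"ML_ICRITERIA = {ml_icriteria}"]
    if ml_ctifor is not None:
        lines.append(f"ML_CTIFOR = {ml_ctifor}")
    if ml_icriteria in RECOMMENDED_ML_CX:
        # Modes 1 and 2 update the threshold, so ML_CX is relevant.
        lines.append(f"ML_CX = {RECOMMENDED_ML_CX[ml_icriteria]}")
    return lines


if __name__ == "__main__":
    # Refinement run with a fixed threshold, as in the ML_ICRITERIA = 0 example.
    print("\n".join(incar_fragment(0, ml_ctifor=0.03)))
    # Gliding-average mode with its suggested ML_CX.
    print("\n".join(incar_fragment(2)))
```

Keeping the two settings together in one place makes it harder to switch modes while forgetting to adjust ML_CX.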
Related tags and articles
ML_LMLFF, ML_CTIFOR, ML_CSLOPE, ML_CSIG, ML_MHIS, ML_CX