arxiv:2603.10992

A Tutorial Review of Bayesian Optimization with Gaussian Processes to Accelerate Stationary Point Searches

Published on Apr 28


Abstract

A Bayesian optimization framework using Gaussian processes with derivative observations and active learning enables efficient surrogate modeling for potential energy surface searches across minimization, saddle point identification, and path-following tasks.

Generated by Qwen/Qwen2.5-Coder-32B-Instruct

Building local surrogates to accelerate stationary point searches on potential energy surfaces spans decades of effort. Done correctly, surrogates can reduce the number of expensive electronic structure evaluations by roughly an order of magnitude while preserving the accuracy of the underlying theory, with the gain depending on oracle cost, search distance, and the availability of analytical forces. We present a unified Bayesian optimization view of minimization, single-point saddle searches, and double-ended path searches: all three share one six-step surrogate loop and differ only in the inner optimization target and the acquisition criterion. The framework uses Gaussian process regression with derivative observations, inverse-distance kernels, and active learning, and we develop optional extensions for production use, including farthest-point sampling with the Earth Mover's Distance, MAP regularization, an adaptive trust radius, and random Fourier features for scaling. Accompanying pedagogical Rust code demonstrates that all three applications use the same Bayesian optimization loop, bridging the gap between theoretical formulation and practical execution.


Community

rgoswami

about 15 hours ago

Surrogate-accelerated stationary point searches on potential energy surfaces are spread across two decades of papers with incompatible notation. This invited tutorial review unifies them: minimization, single-point saddle searches (GP-dimer, i.e., minimum mode following), and double-ended path searches (GP-NEB) are all instances of one six-step Bayesian optimization loop, differing only in the inner optimization target and the acquisition criterion.
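
As a rough illustration of what a surrogate loop of this kind can look like, the following Python sketch minimizes a hypothetical one-dimensional double-well energy with a Gaussian process conditioned on both energies and gradients. It is not the review's six-step loop, which this summary does not enumerate, and it is not taken from ChemGP: the oracle, the kernel length scale, the jitter, the fixed trust radius, the grid-based inner optimization, and every function name are assumptions made for illustration.

```python
import math

LENGTH = 0.8    # assumed squared-exponential length scale
JITTER = 1e-8   # assumed diagonal regularization

def oracle(x):
    """Hypothetical stand-in for an expensive call: returns (energy, gradient)."""
    return (x * x - 1.0) ** 2, 4.0 * x * (x * x - 1.0)

def k(a, b):
    """cov(f(a), f(b)): squared-exponential kernel."""
    return math.exp(-0.5 * ((a - b) / LENGTH) ** 2)

def k_fd(a, b):
    """cov(f(a), f'(b)) = dk/db."""
    return k(a, b) * (a - b) / LENGTH ** 2

def k_dd(a, b):
    """cov(f'(a), f'(b)) = d2k/(da db)."""
    return k(a, b) * (1.0 / LENGTH ** 2 - (a - b) ** 2 / LENGTH ** 4)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def fit(xs, es, gs):
    """Condition the GP on energies es and gradients gs at xs; return its mean function."""
    n = len(xs)
    mean = sum(es) / n  # constant prior mean (an assumption)
    K = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i, a in enumerate(xs):
        for j, b in enumerate(xs):
            K[i][j] = k(a, b)             # energy-energy block
            K[i][n + j] = k_fd(a, b)      # energy-gradient block
            K[n + i][j] = k_fd(b, a)      # gradient-energy block
            K[n + i][n + j] = k_dd(a, b)  # gradient-gradient block
    for d in range(2 * n):
        K[d][d] += JITTER
    w = solve(K, [e - mean for e in es] + list(gs))
    def predict(x):
        return mean + sum(w[i] * k(x, xi) + w[n + i] * k_fd(x, xi)
                          for i, xi in enumerate(xs))
    return predict

def surrogate_minimize(x0, trust=0.5, tol=0.05, max_calls=20):
    """Return (best x, best energy, number of oracle calls)."""
    xs, es, gs = [], [], []
    x = x0
    while len(xs) < max_calls:
        e, g = oracle(x)                    # evaluate the expensive oracle
        xs.append(x); es.append(e); gs.append(g)
        if abs(g) < tol:                    # converge on the true gradient
            break
        predict = fit(xs, es, gs)           # refit the surrogate
        best = xs[es.index(min(es))]        # center of the trust region
        grid = [best - trust + 2.0 * trust * i / 200 for i in range(201)]
        x = min(grid, key=predict)          # inner optimization on the surrogate
        if any(abs(x - xi) < 1e-9 for xi in xs):
            break                           # surrogate proposes nothing new
    i = es.index(min(es))
    return xs[i], es[i], len(xs)
```

Calling `surrogate_minimize(0.3)` starts on the slope leading to the well at x = 1. The first surrogate sees a single energy and gradient, so the first step runs downhill to the edge of the trust region; later steps are guided by the refitted model. Convergence is always judged on the oracle's gradient, never on the surrogate's.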

The surrogate is a Gaussian process with derivative observations on inverse-distance kernels, trained by active learning. The gain is roughly an order-of-magnitude reduction in electronic-structure evaluations, conditioned on oracle cost, search distance, and the availability of analytical forces; the review states the regime rather than claiming a universal speedup.
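
To make the idea of an inverse-distance kernel concrete, here is a minimal Python sketch that maps Cartesian atomic positions to the vector of inverse interatomic distances 1/r_ij and evaluates a squared-exponential kernel on those features. The single shared length scale and the function names are assumptions for illustration; the kernel in the review may be parameterized differently (for example, with separate length scales per atom-pair type).

```python
import math
from itertools import combinations

def inverse_distance_features(coords):
    """Map a list of (x, y, z) positions to [1/r_ij for all atom pairs i < j]."""
    return [1.0 / math.dist(a, b) for a, b in combinations(coords, 2)]

def inverse_distance_kernel(c1, c2, length=1.0):
    """Squared-exponential kernel on inverse-distance features (assumed form)."""
    f1 = inverse_distance_features(c1)
    f2 = inverse_distance_features(c2)
    d2 = sum((u - v) ** 2 for u, v in zip(f1, f2))
    return math.exp(-0.5 * d2 / length ** 2)
```

Because the features depend only on interatomic distances, two geometries that differ by a rigid translation or rotation get a kernel value of 1 (up to rounding), so the surrogate does not have to learn those invariances from data.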

Production extensions are derived in full: farthest-point sampling with an intensive Earth Mover's Distance for pruning, MAP regularization for hyperparameters, an adaptive trust radius, and random Fourier features for scaling.
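
Farthest-point sampling, the first of these extensions, can be sketched in a few lines of Python. This generic greedy version picks a diverse subset of points by repeatedly adding the point farthest from everything selected so far. Plain Euclidean distance is swapped in here for the Earth Mover's Distance named above, and the function name, the `start` argument, and the pluggable `distance` parameter are assumptions for illustration.

```python
import math

def farthest_point_sampling(points, m, distance=math.dist, start=0):
    """Greedily pick up to m indices, each maximizing its distance to the selected set."""
    selected = [start]
    # min_dist[i] is the distance from point i to its nearest selected point
    min_dist = [distance(p, points[start]) for p in points]
    while len(selected) < min(m, len(points)):
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], distance(p, points[nxt]))
    return selected
```

Passing a different `distance` callable, such as one computing a distance between molecular configurations, changes the notion of diversity without touching the selection logic.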

Pedagogical Rust code (ChemGP, https://github.com/lode-org/ChemGP) runs all three applications from the same loop and mirrors the production C++ in gpr_optim used for the published benchmarks, so every equation maps to a line of code.


