
A surrogate model is an engineering method used when an outcome of interest cannot be easily measured or computed, so an approximate model of the outcome is used instead. Most engineering design problems require experiments and/or simulations to evaluate design objective and constraint functions as a function of design variables. For example, to find the optimal airfoil shape for an aircraft wing, an engineer simulates the airflow around the wing for different shape variables (length, curvature, material, etc.). For many real-world problems, however, a single simulation can take many minutes, hours, or even days to complete. As a result, routine tasks such as design optimization, design space exploration, sensitivity analysis and "what-if" analysis become impractical, since they require thousands or even millions of simulation evaluations.

One way of alleviating this burden is to construct approximation models, known as surrogate models, metamodels or emulators, that mimic the behavior of the simulation model as closely as possible while being computationally cheaper to evaluate. Surrogate models are constructed using a data-driven, bottom-up approach. The exact inner workings of the simulation code are not assumed to be known (or even understood); only the input-output behavior matters. A model is constructed from the response of the simulator to a limited number of intelligently chosen data points. This approach is also known as behavioral modeling or black-box modeling, though the terminology is not always consistent. When only a single design variable is involved, the process is known as curve fitting.

Although surrogate models are most commonly used in place of experiments and simulations in engineering design, surrogate modeling may be used in many other areas of science where experiments and/or function evaluations are expensive.
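As a minimal sketch of the single-variable case, the following Python snippet builds a curve-fitting surrogate by polynomial interpolation through five samples. The function `expensive_simulation` is a hypothetical stand-in for a costly simulation, and the sample locations and the choice of Lagrange interpolation are assumptions made purely for illustration.

```python
import math


def expensive_simulation(x):
    """Hypothetical stand-in for a simulation that is costly to run."""
    return math.sin(x) + 0.5 * x


def build_surrogate(sample_x, sample_y):
    """Return the polynomial that interpolates the samples (Lagrange form)."""
    def surrogate(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(sample_x, sample_y)):
            # Lagrange basis polynomial: 1 at xi, 0 at every other sample.
            basis = 1.0
            for j, xj in enumerate(sample_x):
                if j != i:
                    basis *= (x - xj) / (xi - xj)
            total += yi * basis
        return total
    return surrogate


# Only five "expensive" evaluations are spent on building the surrogate.
sample_x = [0.0, 0.75, 1.5, 2.25, 3.0]
sample_y = [expensive_simulation(x) for x in sample_x]
surrogate = build_surrogate(sample_x, sample_y)

# The surrogate can now be queried cheaply at any point in [0, 3].
print(surrogate(1.0), expensive_simulation(1.0))
```

The surrogate reproduces the samples exactly and only approximates the function in between, so its accuracy away from the samples still has to be assessed.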


Goals

The scientific challenge of surrogate modeling is the generation of a surrogate that is as accurate as possible, using as few simulation evaluations as possible. The process comprises three major steps, which may be interleaved iteratively:
* Sample selection (also known as sequential design, optimal experimental design (OED) or active learning)
* Construction of the surrogate model and optimization of the model parameters (bias-variance trade-off)
* Appraisal of the accuracy of the surrogate

The accuracy of the surrogate depends on the number and location of samples (expensive experiments or simulations) in the design space. Various design of experiments (DOE) techniques cater to different sources of errors, in particular errors due to noise in the data or errors due to an improper surrogate model.
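The three steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a recommended workflow: Latin hypercube sampling is used as one possible DOE technique, an inverse-distance-weighted predictor serves as a deliberately simple stand-in surrogate, and leave-one-out cross-validation is one of several ways to appraise accuracy. The function `simulation` and all names are hypothetical.

```python
import math
import random


def latin_hypercube(n_samples, n_dims, seed=0):
    """Sample selection: one point per stratum along every dimension of [0, 1]^d."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        strata = list(range(n_samples))
        rng.shuffle(strata)
        columns.append([(s + rng.random()) / n_samples for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


def fit_idw(points, values, power=2.0):
    """Construction: inverse-distance-weighted predictor (simple stand-in surrogate)."""
    def predict(x):
        num = den = 0.0
        for p, v in zip(points, values):
            d = math.dist(x, p)
            if d == 0.0:
                return v  # exact hit on a training point
            w = d ** -power
            num += w * v
            den += w
        return num / den
    return predict


def leave_one_out_rmse(points, values, fit):
    """Appraisal: refit without each sample in turn and predict the held-out value."""
    squared_errors = []
    for i in range(len(points)):
        model = fit(points[:i] + points[i + 1:], values[:i] + values[i + 1:])
        squared_errors.append((model(points[i]) - values[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def simulation(x):
    """Hypothetical stand-in for an expensive two-variable simulation."""
    return x[0] ** 2 + math.sin(3.0 * x[1])


points = latin_hypercube(20, 2)
values = [simulation(p) for p in points]
loo_error = leave_one_out_rmse(points, values, fit_idw)
print("leave-one-out RMSE:", loo_error)
```

If the estimated error is too large, further samples can be added and the steps repeated, which is the iterative interleaving described above.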


Types of surrogate models

Popular surrogate modeling approaches include polynomial response surfaces; kriging; more generalized Bayesian approaches; gradient-enhanced kriging (GEK); radial basis functions; support vector machines; space mapping (J.W. Bandler, Q. Cheng, S.A. Dakroury, A.S. Mohamed, M.H. Bakr, K. Madsen and J. Søndergaard, "Space mapping: the state of the art," IEEE Trans. Microwave Theory Tech., vol. 52, no. 1, pp. 337-361, Jan. 2004); artificial neural networks; and Bayesian networks. Further methods recently explored are Fourier surrogate modeling and random forests.

For some problems, the nature of the true function is not known a priori, so it is not clear which surrogate model will be most accurate. In addition, there is no consensus on how to obtain the most reliable estimates of the accuracy of a given surrogate. Many other problems have known physics properties. In these cases, physics-based surrogates such as space-mapping based models are the most efficient.

Surrogate-assisted evolutionary optimization techniques have been surveyed in the literature. Spanning two decades of development and engineering applications, Rayas-Sanchez reviews aggressive space mapping exploiting surrogate models. Razavi et al. have published a state-of-the-art review of surrogate models used in the field of water resources management.
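To make one of the listed approaches concrete, here is a minimal sketch of a radial basis function surrogate in plain Python. The Gaussian kernel, the `length_scale` value, the 3x3 training grid and the test function are all assumptions chosen for illustration; practical implementations usually tune the kernel parameters and often add a polynomial trend term.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def gaussian_kernel(r, length_scale):
    return math.exp(-((r / length_scale) ** 2))


def fit_rbf(points, values, length_scale=1.0):
    """Fit weights so that the weighted sum of kernels interpolates the samples."""
    kernel_matrix = [
        [gaussian_kernel(math.dist(p, q), length_scale) for q in points]
        for p in points
    ]
    weights = solve_linear_system(kernel_matrix, values)

    def predict(x):
        return sum(
            w * gaussian_kernel(math.dist(x, p), length_scale)
            for w, p in zip(weights, points)
        )
    return predict


# Nine samples of a hypothetical two-variable response on a 3x3 grid.
training_points = [(i / 2, j / 2) for i in range(3) for j in range(3)]
training_values = [px * px + py * py for px, py in training_points]
surrogate = fit_rbf(training_points, training_values)
print(surrogate((0.25, 0.75)))
```

The surrogate passes through every training sample; how well it predicts elsewhere depends on the kernel and its length scale, which is why the choice of surrogate type matters when the true function is unknown.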


Invariance properties

Recently proposed comparison-based surrogate models (e.g. ranking support vector machines) for evolutionary algorithms, such as CMA-ES, make it possible to preserve some invariance properties of surrogate-assisted optimizers:
1. Invariance with respect to monotonic transformations of the function (scaling)
2. Invariance with respect to orthogonal transformations of the search space (rotation)
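The first property can be illustrated with a short Python sketch. A comparison-based surrogate is trained only on the ordering of candidate solutions, and a strictly increasing transformation of the objective leaves that ordering unchanged. The candidate points, the `sphere` objective and the transformation are hypothetical choices; the sketch does not cover rotation invariance.

```python
def ranking(values):
    """Indices of the candidates sorted from best (smallest value) to worst."""
    return sorted(range(len(values)), key=lambda i: values[i])


def sphere(x):
    """Hypothetical objective function."""
    return sum(xi * xi for xi in x)


def monotonic_transform(v):
    """Any strictly increasing function of the objective value."""
    return v ** 3 + 10.0


candidates = [(0.5, -1.0), (2.0, 0.1), (-0.3, 0.2), (1.5, 1.5)]
raw_values = [sphere(c) for c in candidates]
transformed_values = [monotonic_transform(v) for v in raw_values]

# Both rankings are identical, so a model trained on rankings sees the same data.
print(ranking(raw_values))
print(ranking(transformed_values))
```

A surrogate that regresses on the raw function values, by contrast, would be fitted to different targets after the transformation.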


Applications

An important distinction can be made between two different applications of surrogate models: design optimization and design space approximation (also known as emulation).

In surrogate-model-based optimization, an initial surrogate is constructed using some of the available budget of expensive experiments and/or simulations. The remaining experiments/simulations are run for designs which the surrogate model predicts may have promising performance. The process usually takes the form of the following search/update procedure:
1. Initial sample selection (the experiments and/or simulations to be run)
2. Construct the surrogate model
3. Search the surrogate model (the model can be searched extensively, e.g. using a genetic algorithm, as it is cheap to evaluate)
4. Run the experiment/simulation at the new location(s) found by the search and add the results to the sample
5. Iterate steps 2 to 4 until out of time or the design is "good enough"

Depending on the type of surrogate used and the complexity of the problem, the process may converge on a local or global optimum, or perhaps none at all (Jones, D.R. (2001), "A taxonomy of global optimization methods based on response surfaces," Journal of Global Optimization, 21:345–383).

In design space approximation, one is not interested in finding the optimal parameter vector but rather in the global behavior of the system. Here the surrogate is tuned to mimic the underlying model as closely as needed over the complete design space. Such surrogates are a useful, cheap way to gain insight into the global behavior of the system. Optimization can still occur as a post-processing step, although with no update procedure (see above) the optimum found cannot be validated.
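The search/update procedure for surrogate-based optimization can be sketched as follows. This is a simplified illustration under several assumptions: a single design variable, a least-squares quadratic response surface as the surrogate, an exhaustive grid search in place of a genetic algorithm, and a hypothetical `expensive_simulation`. With such a coarse surrogate the loop may settle on a local optimum or stop early, as noted above.

```python
import math


def expensive_simulation(x):
    """Hypothetical stand-in for a costly simulation of one design variable."""
    return (x - 2.6) ** 2 + 0.3 * math.sin(5.0 * x)


def solve_3x3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        s = sum(m[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = (m[i][3] - s) / m[i][i]
    return x


def fit_quadratic(xs, ys):
    """Least-squares quadratic response surface y ~ c0 + c1*x + c2*x^2."""
    powers = [sum(x ** k for x in xs) for k in range(5)]
    normal = [[powers[i + j] for j in range(3)] for i in range(3)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(3)]
    c0, c1, c2 = solve_3x3(normal, rhs)
    return lambda x: c0 + c1 * x + c2 * x * x


def surrogate_optimize(lower, upper, n_initial=5, budget=6, tol=1e-3):
    # Step 1: initial sample selection (evenly spaced here for simplicity).
    xs = [lower + i * (upper - lower) / (n_initial - 1) for i in range(n_initial)]
    ys = [expensive_simulation(x) for x in xs]
    grid = [lower + i * (upper - lower) / 1000 for i in range(1001)]
    for _ in range(budget):
        surrogate = fit_quadratic(xs, ys)      # Step 2: construct the surrogate
        candidate = min(grid, key=surrogate)   # Step 3: cheap, extensive search
        if min(abs(candidate - x) for x in xs) < tol:
            break                              # proposal adds no new information
        xs.append(candidate)                   # Step 4: run the simulation, update
        ys.append(expensive_simulation(candidate))
    best = min(range(len(xs)), key=lambda i: ys[i])  # Step 5 ends when budget is spent
    return xs[best], ys[best], list(zip(xs, ys))


best_x, best_y, samples = surrogate_optimize(0.0, 4.0)
print("best design:", best_x, "objective:", best_y, "evaluations:", len(samples))
```

Because every proposed design is evaluated with the real simulation before being accepted, the reported optimum is validated, which is the difference from optimizing a fixed design-space approximation.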


Surrogate modeling software

* Surrogate Modeling Toolbox (SMT: https://github.com/SMTorg/smt) is a Python package that contains a collection of surrogate modeling methods, sampling techniques, and benchmarking functions. The package provides a library of surrogate models that is simple to use and facilitates the implementation of additional methods. SMT differs from other surrogate modeling libraries in its emphasis on derivatives, including training derivatives used for gradient-enhanced modeling, prediction derivatives, and derivatives with respect to the training data. It also includes surrogate models that, according to its authors, are not available elsewhere: kriging by partial-least-squares reduction and energy-minimizing spline interpolation (Bouhlel, M.A., Hwang, J.H., Bartoli, N., Lafage, R., Morlier, J., Martins, J.R.R.A. (2019), "A Python surrogate modeling framework with derivatives," Advances in Engineering Software, 135, 102662, doi:10.1016/j.advengsoft.2019.03.005, http://mdolab.engin.umich.edu/content/python-surrogate-modeling-framework-derivatives).
* Surrogates.jl is a Julia package that offers tools such as random forests, radial basis methods and kriging.


See also

* Linear approximation
* Response surface methodology
* Kriging
* Radial basis functions
* Gradient-enhanced kriging (GEK)
* OptiY
* Space mapping
* Surrogate endpoint
* Surrogate data
* Fitness approximation
* Computer experiment
* Conceptual model
* Bayesian regression
* Bayesian model selection


Reading

* Queipo, N.V., Haftka, R.T., Shyy, W., Goel, T., Vaidyanathan, R., Tucker, P.K. (2005), "Surrogate-based analysis and optimization," Progress in Aerospace Sciences, 41, 1–28.
* D. Gorissen, I. Couckuyt, P. Demeester, T. Dhaene, K. Crombecq (2010), "A Surrogate Modeling and Adaptive Sampling Toolbox for Computer Based Design," Journal of Machine Learning Research, Vol. 11, pp. 2051–2055, July 2010.
* T-Q. Pham, A. Kamusella, H. Neubert, "Auto-Extraction of Modelica Code from Finite Element Analysis or Measurement Data," 8th International Modelica Conference, 20–22 March 2011 in Dresden.
* Forrester, Alexander, Andras Sobester, and Andy Keane, Engineering design via surrogate modelling: a practical guide, John Wiley & Sons, 2008.
* Bouhlel, M. A. and Bartoli, N. and Otsmane, A. and Morlier, J. (2016), "Improving kriging surrogates of high-dimensional design models by Partial Least Squares dimension reduction," Structural and Multidisciplinary Optimization 53 (5), 935-952.
* Bouhlel, M. A. and Bartoli, N. and Otsmane, A. and Morlier, J. (2016), "An improved approach for estimating the hyperparameters of the kriging model for high-dimensional problems through the partial least squares method," Mathematical Problems in Engineering.


External links




* Matlab SUrrogate MOdeling Toolbox (Matlab SUMO Toolbox)
* Surrogate Modeling Toolbox (Python)