6. Surrogate Models#
6.1. What is a Surrogate Model#
A Surrogate Model is an approximate model used to mimic the behavior of a more complex and expensive-to-evaluate function or model. In sensitivity analysis and optimization, surrogate models are often employed to reduce the number of costly evaluations (e.g., simulations, real-world experiments) by providing fast predictions of objective or constraint values.
Typical surrogate models include Gaussian Processes (Kriging), Radial Basis Function Networks (RBF), Support Vector Regression (SVR), and others. They share several common characteristics:
Fast prediction speed – once trained, surrogate models can evaluate new inputs quickly compared to the original expensive function.
Good approximation capability – they can capture complex, nonlinear relationships from a limited set of data samples.
Uncertainty estimation – some models (e.g., Kriging/Gaussian Process) provide both predictions and confidence intervals; see the short illustration after this list.
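As a concrete illustration of uncertainty estimation (using scikit-learn's GaussianProcessRegressor purely to show the idea, not UQPyL's API), a Gaussian Process returns both a mean prediction and a standard deviation for each new input:
# Illustration only: scikit-learn, not UQPyL
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
X = np.linspace(0.0, 1.0, 8)[:, None]              # 8 one-dimensional training inputs
y = np.sin(2 * np.pi * X).ravel()                  # noiseless training outputs
gp = GaussianProcessRegressor().fit(X, y)
mean, std = gp.predict(np.array([[0.37]]), return_std=True)
print(mean, std)                                   # prediction and its uncertainty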
Surrogate models show great potential in expensive-to-evaluate problems. Using surrogate modelling is one of the most distinctive and important features of UQPyL.
6.2. Overview of UQPyL.surrogates#
The surrogates module provides a collection of surrogate models designed to
support sensitivity analysis and parameter optimization:
| Abbreviation | Full Name | Features |
|---|---|---|
| KRG | Kriging | Supports |
| GP | Gaussian Process | Supports |
| LR | Linear Regression | Supports |
| PR | Polynomial Regression | Supports |
| RBF | Radial Basis Function | Supports |
| SVM | Support Vector Machine | Uses libsvm as core |
| MARS | Multivariate Adaptive Regression Splines | Uses Earth package as core |
All models above inherit from the base class surrogateABC, whose main
methods are fit() and predict().
6.3. Class surrogateABC#
6.3.1. Constructor#
__init__(...)
6.3.1.1. Description#
Initializes a surrogate model.
Hyper-parameters are divided into:
Fixed parameters (common to all surrogate models)
Model-specific parameters (vary between models)
6.3.1.2. Fixed parameters#
scalers (Tuple(Scaler, Scaler))
A tuple specifying the scalers used for normalization. If given, inputs X and outputs Y are normalized during training and prediction. Each Scaler should be a class from UQPyL.utility; scalers[0] is used for X, and scalers[1] is used for Y.
polyFeature (PolynomialFeatures)
A class from UQPyL.utility used to apply a polynomial transformation to X.
Other hyper-parameters are model-specific; please refer to individual APIs.
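For example, a surrogate can be constructed with scalers roughly as follows (a minimal sketch; the scaler class name and its constructor arguments are assumptions here, so check the UQPyL.utility API for the exact names in your version):
# Sketch only: scaler name/arguments assumed, see UQPyL.utility for the actual API
from UQPyL.surrogates.rbf import RBF
from UQPyL.utility.scalers import MinMaxScaler
# Normalize both the inputs X and the outputs Y during training and prediction
rbf = RBF(scalers=(MinMaxScaler(), MinMaxScaler()))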
6.3.2. Methods#
6.3.2.1. fit#
6.3.2.1.1. Description#
Fit the surrogate model to training data.
6.3.2.1.2. Parameters#
xTrain (ndarray)
2D NumPy array of input samples (decision variables), one row per sample.
yTrain (ndarray)
2D NumPy array of outputs corresponding to each input sample. yTrain should be 2D with a single column.
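If the outputs are only available as a 1D array, they can be reshaped into the required single-column 2D form before calling fit(), for example:
import numpy as np
y = np.array([1.2, 3.4, 5.6])     # 1D outputs
yTrain = y.reshape(-1, 1)         # 2D column vector of shape (3, 1)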
6.3.2.2. predict#
6.3.2.2.1. Description#
Predict outputs for new inputs using the trained surrogate model.
6.3.2.2.2. Parameters#
xPred (ndarray)
2D NumPy array of shape (n_samples, n_features).
6.3.2.2.3. Returns#
ndarray
2D NumPy array of shape (n_samples, 1) with predicted outputs.
6.4. How to import surrogate models#
Each surrogate model is implemented in a separate submodule.
Example: Radial Basis Function (RBF):
# Import the Radial Basis Function surrogate model
from UQPyL.surrogates.rbf import RBF
# Import the Cubic kernel used by the RBF model
from UQPyL.surrogates.rbf.kernel import Cubic
# Optional kernels: Cubic, Gaussian, Linear, Multiquadric, ThinPlateSpline
# Create an instance of the RBF surrogate model with the Cubic kernel
rbf = RBF(kernel=Cubic())
Note
Cubic is a Python class and must be instantiated before being passed as
the kernel.
Note
Gaussian Process and Kriging models also use kernels in a similar way. Each kernel class has its own hyper-parameters; see their specific API references.
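For example, a Gaussian Process model can be constructed with a Matérn kernel in the same way (a complete tuning example is given in Section 6.6.1):
# Gaussian Process with a Matérn kernel, instantiated like the RBF kernels above
from UQPyL.surrogates.gp import GPR
from UQPyL.surrogates.gp.kernel import Matern
gpr = GPR(kernel=Matern())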
6.5. Fitting and predicting with surrogate models#
Example: use an RBF model with a cubic kernel to approximate the Sphere function.
from UQPyL.problems.single_objective import Sphere
# Define the problem: Sphere function with 15 inputs
problem = Sphere(nInput=15, ub=100, lb=-100)
# Generate training data using Latin Hypercube Sampling (LHS)
from UQPyL.DoE import LHS
lhs = LHS()
# 300 training samples
trainX = lhs.sample(nt=300, problem=problem)
trainY = problem.objFunc(trainX)
# 50 test samples and their true outputs
predictX = lhs.sample(nt=50, problem=problem)
testY = problem.objFunc(predictX)
# Create and configure the RBF surrogate model
from UQPyL.surrogates.rbf import RBF
from UQPyL.surrogates.rbf.kernel import Cubic
kernel = Cubic()
model = RBF(kernel=kernel)
# Train the surrogate model
model.fit(trainX, trainY)
# Predict outputs for new inputs
predictY = model.predict(predictX)
# Evaluate model performance on the test samples using the R² metric
from UQPyL.utility.metric import R_square
R2 = R_square(testY, predictY)
print(R2)
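R_square is assumed here to compute the standard coefficient of determination; values close to 1 indicate that the surrogate reproduces the true outputs well:
\[ R^2 = 1 - \frac{\sum_{i}\left(y_i - \hat{y}_i\right)^2}{\sum_{i}\left(y_i - \bar{y}\right)^2} \]
where \(y_i\) are the true outputs, \(\hat{y}_i\) the surrogate predictions, and \(\bar{y}\) the mean of the true outputs.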
6.6. Using the AutoTuner tool#
The construction of a good surrogate model requires suitable hyper-parameters.
To assist this process, UQPyL provides an auto-tuning utility
AutoTuner in UQPyL.surrogates.
AutoTuner supports two search strategies:
Grid Search
Evaluates all combinations in a predefined hyper-parameter grid. Simple and robust for small search spaces.
Evolution Search
Uses an evolutionary algorithm to explore the hyper-parameter space. More suitable for large or complex search spaces but more expensive.
6.6.1. Grid search#
Example: build a Gaussian Process (GP) with a Matérn kernel.
The Matérn kernel is:
\[ k(d_{ij}) = \frac{1}{\Gamma(\nu)\,2^{\nu-1}} \left( \frac{\sqrt{2\nu}}{l}\, d_{ij} \right)^{\nu} K_\nu\!\left( \frac{\sqrt{2\nu}}{l}\, d_{ij} \right) \]
Here, \(d_{ij}\) is the distance between points \(i\) and \(j\), \(K_\nu\) is the Modified Bessel Function of the Second Kind, and \(\Gamma(\nu)\) is the Gamma function.
Kernel hyper-parameters:
\(\nu\) – smoothness parameter (e.g. {0.5, 1.5, 2.5, np.inf})
\(l\) – length scale
The Gaussian Process model also includes a hyper-parameter c, which is added to the kernel matrix to improve smoothness and numerical stability.
Example (Sphere function):
# Step 1: Prepare training and testing data
import numpy as np
from UQPyL.problems.single_objective import Sphere
from UQPyL.DoE import LHS
problem = Sphere(nInput=10)
lhs = LHS()
trainX = lhs.sample(nt=300, problem=problem)
trainY = problem.objFunc(trainX)
testX = lhs.sample(nt=50, problem=problem)
testY = problem.objFunc(testX)
# Step 2: Initialize a Gaussian Process model with a Matérn kernel
from UQPyL.surrogates.gp import GPR
from UQPyL.surrogates.gp.kernel import Matern
kernel = Matern()
gpr = GPR(kernel=kernel)
# Step 3: Get list of tunable hyper-parameters
nameList = gpr.getParaList() # e.g. ['c', 'l', 'nu']
# Step 4: Define hyper-parameter grid
paraGrid = {
'c': [1e-6, 1e-8, 1e-10, 1e-12],
'l': [1, 1e1, 1e2, 1e3, 1e4, 1e5],
'nu': [0.5, 1.5, 2.5, np.inf]
}
# Step 5: Create AutoTuner
from UQPyL.surrogates import AutoTuner
autoTuner = AutoTuner(surrogate=gpr)
# Step 6: Grid search tuning
autoTuner.gridTune(trainX, trainY, paraGrid=paraGrid)
# Step 7: Train GP with best hyper-parameters
gpr.fit(trainX, trainY)
# Step 8: Predict on test data
predictY = gpr.predict(testX)
# Step 9: Evaluate R²
from UQPyL.utility.metric import R_square
R2 = R_square(testY, predictY)
print(R2)
6.6.2. Evolution search#
Example: Linear Regression (LR) with Lasso loss.
Lasso loss adds an \(L_1\) penalty to encourage sparsity; in its standard form,
\[ L(w) = \frac{1}{2n} \sum_{i=1}^{n} \left( y_i - \mathbf{x}_i^{\top} w \right)^2 + c\, \lVert w \rVert_1 \]
Here, \(n\) is the number of training samples and \(w\) is the weight vector.
Hyper-parameter:
\(c\) – regularization parameter controlling the strength of L1 penalty.
Example (Sphere function):
# Step 1: Prepare training and testing data
import numpy as np
from UQPyL.problems.single_objective import Sphere
from UQPyL.DoE import LHS
problem = Sphere(nInput=10)
lhs = LHS()
trainX = lhs.sample(nt=300, problem=problem)
trainY = problem.objFunc(trainX)
testX = lhs.sample(nt=50, problem=problem)
testY = problem.objFunc(testX)
# Step 2: Initialize Linear Regression with Lasso
from UQPyL.surrogates.regression import LinearRegression
lr = LinearRegression(
lossType='Lasso',
C=10,
C_attr={'ub': 100, 'lb': 1e-5, 'type': 'float', 'log': True}
)
# Step 3: Get tunable hyper-parameters
nameList = lr.getParaList()
# Step 4: Create AutoTuner with GA optimizer
from UQPyL.surrogates import AutoTuner
from UQPyL.optimization.single_objective import GA
ga = GA(nPop=50, maxFEs=10000, verboseFlag=False)
autoTuner = AutoTuner(surrogate=lr, optimizer=ga)
autoTuner.opTune(trainX, trainY, nameList)
# Train final model
lr.fit(trainX, trainY)
# Step 5: Predict and evaluate
predictY = lr.predict(testX)
from UQPyL.utility.metric import R_square
R2 = R_square(testY, predictY)
print(R2)
6.7. Sensitivity analysis with surrogate models#
For expensive problems, classical sensitivity analysis (SA) may require too many evaluations. Surrogate models can greatly reduce this burden.
Example: Ishigami-like function (expensive true model).
Assume the function
\[ f(\mathbf{x}) = \sin(x_1) + 7\sin^2(x_2) + 0.1\, x_3^4 \sin(x_1) \]
is expensive to evaluate.
A surrogate-assisted SA framework:
Framework: Sensitivity analysis with surrogate models#
In this framework, expensive evaluations occur only during sampling to train the surrogate. Afterwards, SA uses the surrogate, which is inexpensive.
Code example:
# Step 1: Define the problem
import numpy as np
from UQPyL.problems import Problem
def objFunc(X):
    objs = (np.sin(X[:, 0])
            + 7 * np.sin(X[:, 1])**2
            + 0.1 * X[:, 2]**4 * np.sin(X[:, 0]))
    return objs[:, None]
Ishigami = Problem(
nInput=3,
nOutput=1,
objFunc=objFunc,
ub=np.pi,
lb=-np.pi,
varType=[0, 0, 0],
name="Ishigami"
)
# Step 2: Training data for surrogate model
from UQPyL.DoE import LHS
lhs = LHS()
X = lhs.sample(nt=100, problem=Ishigami)
Y = Ishigami.objFunc(X)
# Step 3: Train surrogate (RBF)
from UQPyL.surrogates.rbf import RBF
rbf = RBF()
rbf.fit(X, Y)
# Step 4: Reconstruct problem with surrogate objFunc
problem_ = Problem(
nInput=3,
nOutput=1,
objFunc=rbf.predict,
ub=np.pi,
lb=-np.pi,
varType=[0, 0, 0],
name="Ishigami"
)
# Step 5: Perform SA using surrogate
from UQPyL.sensibility import FAST
fast = FAST()
X_ = fast.sample(problem=problem_)
Y_ = problem_.objFunc(X_)
res = fast.analyze(problem=problem_, X=X_, Y=Y_)
print(res)
6.8. Single-objective optimization with surrogate models#
In many real-world applications, objective/constraint evaluations are expensive (simulations, experiments, etc.). Surrogate-assisted optimization significantly reduces the number of expensive evaluations.
Several surrogate-assisted algorithms have been developed, such as:
ASMO (2014)
ASMO-PODE (2017)
AMSMO (2023)
ASMO framework (example):
The framework of ASMO#
Idea (a plain-Python sketch of this loop is given after the list):
Use DoE to generate initial samples and evaluate expensive objective.
Build surrogate from database of evaluated points.
Optimize surrogate (cheap).
Re-evaluate best surrogate solutions with true expensive model.
Update database and rebuild surrogate.
Repeat until termination.
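The following minimal, self-contained sketch follows these steps in plain Python, using SciPy's RBFInterpolator and differential_evolution as stand-ins for the surrogate and optimizer; it only illustrates the idea and is not UQPyL's implementation:
import numpy as np
from scipy.interpolate import RBFInterpolator
from scipy.optimize import differential_evolution

def expensive(x):                         # stand-in for an expensive model (Sphere)
    return float(np.sum(x**2))

dim, lb, ub = 5, -5.0, 5.0
rng = np.random.default_rng(0)
# 1. Initial design: random samples evaluated with the expensive model
X = rng.uniform(lb, ub, size=(20, dim))
Y = np.array([expensive(x) for x in X])
for it in range(10):
    # 2. Build the surrogate from the database of evaluated points
    surrogate = RBFInterpolator(X, Y)
    # 3. Optimize the (cheap) surrogate
    res = differential_evolution(lambda x: float(surrogate(x[None, :])),
                                 bounds=[(lb, ub)] * dim, maxiter=50, seed=it)
    # 4. Re-evaluate the promising point with the true expensive model
    xNew, yNew = res.x, expensive(res.x)
    # 5. Update the database; the surrogate is rebuilt in the next iteration
    X = np.vstack([X, xNew])
    Y = np.append(Y, yNew)
print(X[np.argmin(Y)], Y.min())           # best expensively evaluated solution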
UQPyL provides a generalized ASMO supporting various surrogates and optimizers.
Example: RBF + GA in ASMO framework:
# Step 1: Define (expensive) Sphere problem
from UQPyL.problems.single_objective import Sphere
problem = Sphere(nInput=10)
# Step 2: Instantiate RBF surrogate
from UQPyL.surrogates.rbf import RBF
rbf = RBF()
# Step 3: Instantiate GA optimizer
from UQPyL.optimization.single_objective import GA
ga = GA(nPop=50, maxFEs=10000, verboseFlag=False)
# Step 4: Use ASMO framework
from UQPyL.optimization.single_objective import ASMO
asmo = ASMO(surrogate=rbf, optimizer=ga, saveFlag=True)
# Step 5: Run ASMO
res = asmo.run(problem=problem)
6.9. Multi-objective optimization with surrogate models#
For multi-objective problems, MO-ASMO extends ASMO by training one surrogate per objective.
Example: RBF + KRG + NSGA-II in MO-ASMO:
# Step 1: Define ZDT1 problem (expensive)
from UQPyL.problems.multi_objective import ZDT1
zdt1 = ZDT1()
# Step 2: Instantiate surrogate models
from UQPyL.surrogates.rbf import RBF
from UQPyL.surrogates.kriging import KRG
rbf = RBF()
krg = KRG()
# Step 3: Combine multiple surrogates
from UQPyL.surrogates import MultiSurrogates
surrogates = MultiSurrogates(n_surrogates=2, models_list=[rbf, krg])
# Step 4: Instantiate NSGA-II optimizer
from UQPyL.optimization.multi_objective import NSGAII
nsgaii = NSGAII(nPop=50, maxFEs=10000, verboseFlag=False)
# Step 5: Use MO-ASMO framework
from UQPyL.optimization.multi_objective import MOASMO
moasmo = MOASMO(surrogates=surrogates, optimizer=nsgaii, saveFlag=True)
# Step 6: Run MO-ASMO
moasmo.run(problem=zdt1)