Class CMAESOptimizer
The CMA-Evolution Strategy (CMA-ES) is a reliable stochastic optimization method which should be applied if derivative-based methods, e.g. quasi-Newton BFGS or conjugate gradient, fail due to a rugged search landscape (e.g. noise, local optima, outlier, etc.) of the objective function. Like a quasi-Newton method, the CMA-ES learns and applies a variable metric on the underlying search space. Unlike a quasi-Newton method, the CMA-ES neither estimates nor uses gradients, making it considerably more reliable in terms of finding a good, or even close to optimal, solution.
In general, on smooth objective functions the CMA-ES is roughly ten times slower than BFGS (counting objective function evaluations, no gradients provided). For up to variables also the derivative-free simplex direct search method (Nelder and Mead) can be faster, but it is far less reliable than CMA-ES.
The CMA-ES is particularly well suited for non-separable and/or badly conditioned problems. To observe the advantage of CMA compared to a conventional evolution strategy, it will usually take about function evaluations. On difficult problems the complete optimization (a single run) is expected to take roughly between and function evaluations.
This implementation is translated and adapted from the Matlab version
of the CMA-ES algorithm as implemented in module cmaes.m
version 3.51.
For more information, please refer to the following links:
- Since:
- 3.0
-
Nested Class Summary
Nested ClassesModifier and TypeClassDescriptionprivate static class
Used to sort fitness values.private class
Normalizes fitness values to the range [0,1].static class
Population size.static class
Input sigma values.private static class
Stores the value and penalty (for repair of out of bounds point). -
Field Summary
FieldsModifier and TypeFieldDescriptionprivate RealMatrix
Coordinate system.private RealMatrix
B*D, stored for efficiency.private RealMatrix
Covariance matrix.private double
Cumulation constant.private double
Learning rate for rank-one update.private double
Learning rate for rank-one update - diagonalOnlyprivate double
Learning rate for rank-mu update'private double
Learning rate for rank-mu update - diagonalOnlyprivate final int
Determines how often a new random offspring is generated in case it is not feasible / beyond the defined limits, default is 0.private double
Expectation of ||N(0,I)|| == norm(randn(N,1)).private double
Cumulation constant for step-size.private RealMatrix
Scaling.private double
Damping for step-size.private RealMatrix
Diagonal of C, used for diagonalOnly.private RealMatrix
Diagonal of sqrt(D), stored for efficiency.private int
Defines the number of initial iterations, where the covariance matrix remains diagonal and the algorithm has internally linear time complexity.private int
Number of objective variables/problem dimensionprivate double[]
History queue of best values.private final boolean
Indicates whether statistic data is collected.private int
Size of history queue of best values.private double[]
private final boolean
Covariance update mechanism, default is active CMA.private boolean
Number of objective variables/problem dimensionprivate int
Number of iterations already performed.private int
Population size, offspring number.private double
log(mu + 0.5), stored for efficiency.private final int
Maximal number of iterations allowed.private int
Number of parents/points for recombination.private double
Variance-effectiveness of sum w_i x_i.private double
Norm of ps, stored for efficiency.private RealMatrix
Evolution path.private RealMatrix
Evolution path for sigma.private final RandomGenerator
Random generator.private double
Overall standard deviation - search volume.private final List
<RealMatrix> History of D matrix.History of fitness values.private final List
<RealMatrix> History of mean matrix.History of sigma values.private final double
Limit for fitness value.private double
Stop if fun-changes smaller stopTolFun.private double
Stop if back fun-changes smaller stopTolHistFun.private double
Stop if x-changes larger stopTolUpX.private double
Stop if x-change smaller stopTolX.private RealMatrix
Array for weighted recombination.private RealMatrix
Objective variables.Fields inherited from class org.apache.commons.math3.optim.BaseOptimizer
evaluations
-
Constructor Summary
ConstructorsConstructorDescriptionCMAESOptimizer
(int maxIterations, double stopFitness, boolean isActiveCMA, int diagonalOnly, int checkFeasableCount, RandomGenerator random, boolean generateStatistics, ConvergenceChecker<PointValuePair> checker) -
Method Summary
Modifier and TypeMethodDescriptionprivate void
Checks dimensions and values of boundaries and inputSigma if defined.private static void
copyColumn
(RealMatrix m1, int col1, RealMatrix m2, int col2) Copies a column from m1 to m2.private static RealMatrix
diag
(RealMatrix m) private static RealMatrix
divide
(RealMatrix m, RealMatrix n) protected PointValuePair
Performs the bulk of the optimization algorithm.private static RealMatrix
eye
(int n, int m) private void
initializeCMA
(double[] guess) Initialization of the dynamic search parametersprivate static int[]
inverse
(int[] indices) private static RealMatrix
log
(RealMatrix m) private static double
max
(double[] m) private static double
max
(RealMatrix m) private static double
min
(double[] m) private static double
min
(RealMatrix m) private static RealMatrix
ones
(int n, int m) optimize
(OptimizationData... optData) Stores data and performs the optimization.protected void
parseOptimizationData
(OptimizationData... optData) Scans the list of (required and optional) optimization data that characterize the problem.private static void
push
(double[] vals, double val) Pushes the current best fitness value in a history queue.private double[]
randn
(int size) private RealMatrix
randn1
(int size, int popSize) private static RealMatrix
repmat
(RealMatrix mat, int n, int m) private static int[]
reverse
(int[] indices) private static RealMatrix
selectColumns
(RealMatrix m, int[] cols) private static RealMatrix
sequence
(double start, double end, double step) private int[]
sortedIndices
(double[] doubles) Sorts fitness values.private static RealMatrix
sqrt
(RealMatrix m) private static RealMatrix
square
(RealMatrix m) private static RealMatrix
private static RealMatrix
times
(RealMatrix m, RealMatrix n) private static RealMatrix
triu
(RealMatrix m, int k) private void
updateBD
(double negccov) Update B and D from C.private void
updateCovariance
(boolean hsig, RealMatrix bestArx, RealMatrix arz, int[] arindex, RealMatrix xold) Update of the covariance matrix C.private void
updateCovarianceDiagonalOnly
(boolean hsig, RealMatrix bestArz) Update of the covariance matrix C for diagonalOnly > 0private boolean
updateEvolutionPaths
(RealMatrix zmean, RealMatrix xold) Update of the evolution paths ps and pc.private double
valueRange
(CMAESOptimizer.ValuePenaltyPair[] vpPairs) Get range of values.private static RealMatrix
zeros
(int n, int m) Methods inherited from class org.apache.commons.math3.optim.nonlinear.scalar.MultivariateOptimizer
computeObjectiveValue, getGoalType
Methods inherited from class org.apache.commons.math3.optim.BaseMultivariateOptimizer
getLowerBound, getStartPoint, getUpperBound
Methods inherited from class org.apache.commons.math3.optim.BaseOptimizer
getConvergenceChecker, getEvaluations, getIterations, getMaxEvaluations, getMaxIterations, incrementEvaluationCount, incrementIterationCount, optimize
-
Field Details
-
lambda
private int lambdaPopulation size, offspring number. The primary strategy parameter to play with, which can be increased from its default value. Increasing the population size improves global search properties in exchange to speed. Speed decreases, as a rule, at most linearly with increasing population size. It is advisable to begin with the default small population size. -
isActiveCMA
private final boolean isActiveCMACovariance update mechanism, default is active CMA. isActiveCMA = true turns on "active CMA" with a negative update of the covariance matrix and checks for positive definiteness. OPTS.CMA.active = 2 does not check for pos. def. and is numerically faster. Active CMA usually speeds up the adaptation. -
checkFeasableCount
private final int checkFeasableCountDetermines how often a new random offspring is generated in case it is not feasible / beyond the defined limits, default is 0. -
inputSigma
private double[] inputSigma- See Also:
-
dimension
private int dimensionNumber of objective variables/problem dimension -
diagonalOnly
private int diagonalOnlyDefines the number of initial iterations, where the covariance matrix remains diagonal and the algorithm has internally linear time complexity. diagonalOnly = 1 means keeping the covariance matrix always diagonal and this setting also exhibits linear space complexity. This can be particularly useful for dimension > 100.- See Also:
-
isMinimize
private boolean isMinimizeNumber of objective variables/problem dimension -
generateStatistics
private final boolean generateStatisticsIndicates whether statistic data is collected. -
maxIterations
private final int maxIterationsMaximal number of iterations allowed. -
stopFitness
private final double stopFitnessLimit for fitness value. -
stopTolUpX
private double stopTolUpXStop if x-changes larger stopTolUpX. -
stopTolX
private double stopTolXStop if x-change smaller stopTolX. -
stopTolFun
private double stopTolFunStop if fun-changes smaller stopTolFun. -
stopTolHistFun
private double stopTolHistFunStop if back fun-changes smaller stopTolHistFun. -
mu
private int muNumber of parents/points for recombination. -
logMu2
private double logMu2log(mu + 0.5), stored for efficiency. -
weights
Array for weighted recombination. -
mueff
private double mueffVariance-effectiveness of sum w_i x_i. -
sigma
private double sigmaOverall standard deviation - search volume. -
cc
private double ccCumulation constant. -
cs
private double csCumulation constant for step-size. -
damps
private double dampsDamping for step-size. -
ccov1
private double ccov1Learning rate for rank-one update. -
ccovmu
private double ccovmuLearning rate for rank-mu update' -
chiN
private double chiNExpectation of ||N(0,I)|| == norm(randn(N,1)). -
ccov1Sep
private double ccov1SepLearning rate for rank-one update - diagonalOnly -
ccovmuSep
private double ccovmuSepLearning rate for rank-mu update - diagonalOnly -
xmean
Objective variables. -
pc
Evolution path. -
ps
Evolution path for sigma. -
normps
private double normpsNorm of ps, stored for efficiency. -
B
Coordinate system. -
D
Scaling. -
BD
B*D, stored for efficiency. -
diagD
Diagonal of sqrt(D), stored for efficiency. -
C
Covariance matrix. -
diagC
Diagonal of C, used for diagonalOnly. -
iterations
private int iterationsNumber of iterations already performed. -
fitnessHistory
private double[] fitnessHistoryHistory queue of best values. -
historySize
private int historySizeSize of history queue of best values. -
random
Random generator. -
statisticsSigmaHistory
History of sigma values. -
statisticsMeanHistory
History of mean matrix. -
statisticsFitnessHistory
History of fitness values. -
statisticsDHistory
History of D matrix.
-
-
Constructor Details
-
CMAESOptimizer
public CMAESOptimizer(int maxIterations, double stopFitness, boolean isActiveCMA, int diagonalOnly, int checkFeasableCount, RandomGenerator random, boolean generateStatistics, ConvergenceChecker<PointValuePair> checker) - Parameters:
maxIterations
- Maximal number of iterations.stopFitness
- Whether to stop if objective function value is smaller thanstopFitness
.isActiveCMA
- Chooses the covariance matrix update method.diagonalOnly
- Number of initial iterations, where the covariance matrix remains diagonal.checkFeasableCount
- Determines how often new random objective variables are generated in case they are out of bounds.random
- Random generator.generateStatistics
- Whether statistic data is collected.checker
- Convergence checker.- Since:
- 3.1
-
-
Method Details
-
getStatisticsSigmaHistory
- Returns:
- History of sigma values.
-
getStatisticsMeanHistory
- Returns:
- History of mean matrix.
-
getStatisticsFitnessHistory
- Returns:
- History of fitness values.
-
getStatisticsDHistory
- Returns:
- History of D matrix.
-
optimize
public PointValuePair optimize(OptimizationData... optData) throws TooManyEvaluationsException, DimensionMismatchException Stores data and performs the optimization.The list of parameters is open-ended so that sub-classes can extend it with arguments specific to their concrete implementations.
When the method is called multiple times, instance data is overwritten only when actually present in the list of arguments: when not specified, data set in a previous call is retained (and thus is optional in subsequent calls).
Important note: Subclasses must override
BaseOptimizer.parseOptimizationData(OptimizationData[])
if they need to register their own options; but then, they must also callsuper.parseOptimizationData(optData)
within that method.- Overrides:
optimize
in classMultivariateOptimizer
- Parameters:
optData
- Optimization data. In addition to those documented inMultivariateOptimizer
, this method will register the following data:- Returns:
- a point/value pair that satisfies the convergence criteria.
- Throws:
TooManyEvaluationsException
- if the maximal number of evaluations is exceeded.DimensionMismatchException
- if the initial guess, target, and weight arguments have inconsistent dimensions.
-
doOptimize
Performs the bulk of the optimization algorithm.- Specified by:
doOptimize
in classBaseOptimizer<PointValuePair>
- Returns:
- the point/value pair giving the optimal value of the objective function.
-
parseOptimizationData
Scans the list of (required and optional) optimization data that characterize the problem.- Overrides:
parseOptimizationData
in classMultivariateOptimizer
- Parameters:
optData
- Optimization data. The following data will be looked for:
-
checkParameters
private void checkParameters()Checks dimensions and values of boundaries and inputSigma if defined. -
initializeCMA
private void initializeCMA(double[] guess) Initialization of the dynamic search parameters- Parameters:
guess
- Initial guess for the arguments of the fitness function.
-
updateEvolutionPaths
Update of the evolution paths ps and pc.- Parameters:
zmean
- Weighted row matrix of the gaussian random numbers generating the current offspring.xold
- xmean matrix of the previous generation.- Returns:
- hsig flag indicating a small correction.
-
updateCovarianceDiagonalOnly
Update of the covariance matrix C for diagonalOnly > 0- Parameters:
hsig
- Flag indicating a small correction.bestArz
- Fitness-sorted matrix of the gaussian random values of the current offspring.
-
updateCovariance
private void updateCovariance(boolean hsig, RealMatrix bestArx, RealMatrix arz, int[] arindex, RealMatrix xold) Update of the covariance matrix C.- Parameters:
hsig
- Flag indicating a small correction.bestArx
- Fitness-sorted matrix of the argument vectors producing the current offspring.arz
- Unsorted matrix containing the gaussian random values of the current offspring.arindex
- Indices indicating the fitness-order of the current offspring.xold
- xmean matrix of the previous generation.
-
updateBD
private void updateBD(double negccov) Update B and D from C.- Parameters:
negccov
- Negative covariance factor.
-
push
private static void push(double[] vals, double val) Pushes the current best fitness value in a history queue.- Parameters:
vals
- History queue.val
- Current best fitness value.
-
sortedIndices
private int[] sortedIndices(double[] doubles) Sorts fitness values.- Parameters:
doubles
- Array of values to be sorted.- Returns:
- a sorted array of indices pointing into doubles.
-
valueRange
Get range of values.- Parameters:
vpPairs
- Array of valuePenaltyPairs to get range from.- Returns:
- a double equal to maximum value minus minimum value.
-
log
- Parameters:
m
- Input matrix- Returns:
- Matrix representing the element-wise logarithm of m.
-
sqrt
- Parameters:
m
- Input matrix.- Returns:
- Matrix representing the element-wise square root of m.
-
square
- Parameters:
m
- Input matrix.- Returns:
- Matrix representing the element-wise square of m.
-
times
- Parameters:
m
- Input matrix 1.n
- Input matrix 2.- Returns:
- the matrix where the elements of m and n are element-wise multiplied.
-
divide
- Parameters:
m
- Input matrix 1.n
- Input matrix 2.- Returns:
- Matrix where the elements of m and n are element-wise divided.
-
selectColumns
- Parameters:
m
- Input matrix.cols
- Columns to select.- Returns:
- Matrix representing the selected columns.
-
triu
- Parameters:
m
- Input matrix.k
- Diagonal position.- Returns:
- Upper triangular part of matrix.
-
sumRows
- Parameters:
m
- Input matrix.- Returns:
- Row matrix representing the sums of the rows.
-
diag
- Parameters:
m
- Input matrix.- Returns:
- the diagonal n-by-n matrix if m is a column matrix or the column matrix representing the diagonal if m is a n-by-n matrix.
-
copyColumn
Copies a column from m1 to m2.- Parameters:
m1
- Source matrix.col1
- Source column.m2
- Target matrix.col2
- Target column.
-
ones
- Parameters:
n
- Number of rows.m
- Number of columns.- Returns:
- n-by-m matrix filled with 1.
-
eye
- Parameters:
n
- Number of rows.m
- Number of columns.- Returns:
- n-by-m matrix of 0 values out of diagonal, and 1 values on the diagonal.
-
zeros
- Parameters:
n
- Number of rows.m
- Number of columns.- Returns:
- n-by-m matrix of zero values.
-
repmat
- Parameters:
mat
- Input matrix.n
- Number of row replicates.m
- Number of column replicates.- Returns:
- a matrix which replicates the input matrix in both directions.
-
sequence
- Parameters:
start
- Start value.end
- End value.step
- Step size.- Returns:
- a sequence as column matrix.
-
max
- Parameters:
m
- Input matrix.- Returns:
- the maximum of the matrix element values.
-
min
- Parameters:
m
- Input matrix.- Returns:
- the minimum of the matrix element values.
-
max
private static double max(double[] m) - Parameters:
m
- Input array.- Returns:
- the maximum of the array values.
-
min
private static double min(double[] m) - Parameters:
m
- Input array.- Returns:
- the minimum of the array values.
-
inverse
private static int[] inverse(int[] indices) - Parameters:
indices
- Input index array.- Returns:
- the inverse of the mapping defined by indices.
-
reverse
private static int[] reverse(int[] indices) - Parameters:
indices
- Input index array.- Returns:
- the indices in inverse order (last is first).
-
randn
private double[] randn(int size) - Parameters:
size
- Length of random array.- Returns:
- an array of Gaussian random numbers.
-
randn1
- Parameters:
size
- Number of rows.popSize
- Population size.- Returns:
- a 2-dimensional matrix of Gaussian random numbers.
-