Hyperparameters
Info
Hyperparameter optimization is a new feature available since version 0.6.0. In general, this is a challenging and computationally expensive topic, and only a few basics are presented in this guide. If you are interested in contributing or collaborating, please let us know so we can enrich this module with more robust features.
Most algorithms have hyperparameters. For some optimization methods, these parameters are already defined as tunable variables and can be optimized directly. For instance, for Differential Evolution (DE), the parameters can be found by:
[1]:
from pymoo.algorithms.soo.nonconvex.de import DE
from pymoo.core.parameters import get_params, flatten

algorithm = DE()

# retrieve the tunable parameters of the algorithm as a flat dictionary
flatten(get_params(algorithm))
[1]:
{'mating.jitter': <pymoo.core.variable.Choice at 0x104dff1d0>,
'mating.CR': <pymoo.core.variable.Real at 0x104dff1a0>,
'mating.crossover': <pymoo.core.variable.Choice at 0x104dff170>,
'mating.F': <pymoo.core.variable.Real at 0x104dff140>,
'mating.n_diffs': <pymoo.core.variable.Choice at 0x104dff110>,
'mating.selection': <pymoo.core.variable.Choice at 0x103bd5c40>}
If not provided explicitly when initializing a HyperparameterProblem, these variables are used directly for optimization.
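These flat keys can also be used to set values by hand: hierarchical turns a flat dictionary back into a nested one, and set_params applies it to the algorithm. A minimal sketch (the values below are arbitrary and only for illustration):

from pymoo.core.parameters import set_params, hierarchical

# arbitrary example values, not tuned
values = {"mating.CR": 0.7, "mating.F": 0.8}

# nest the flat keys and apply them to the algorithm instance
set_params(algorithm, hierarchical(values))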
Next, one needs to define what exactly should be optimized. For instance, for a single run on a problem (with a fixed random seed) using the well-known hyperparameter optimization toolkit Optuna, the implementation may look as follows:
[2]:
from pymoo.algorithms.hyperparameters import SingleObjectiveSingleRun, HyperparameterProblem
from pymoo.algorithms.soo.nonconvex.g3pcx import G3PCX
from pymoo.algorithms.soo.nonconvex.optuna import Optuna
from pymoo.core.parameters import set_params, hierarchical
from pymoo.optimize import minimize
from pymoo.problems.single import Sphere

algorithm = G3PCX()

problem = Sphere(n_var=10)
n_evals = 500

# assess a parameter setting by a single seeded run on the problem
performance = SingleObjectiveSingleRun(problem, termination=("n_evals", n_evals), seed=1)

# let Optuna search the hyperparameter space for 50 evaluations
res = minimize(HyperparameterProblem(algorithm, performance),
               Optuna(),
               termination=('n_evals', 50),
               seed=1,
               verbose=False)

hyperparams = res.X
print(hyperparams)

# apply the best hyperparameters found and rerun the algorithm
set_params(algorithm, hierarchical(hyperparams))

res = minimize(Sphere(), algorithm, termination=("n_evals", n_evals), seed=1)
print("Best solution found: \nX = %s\nF = %s" % (res.X, res.F))
{'mutation.eta': 13.808552564884089, 'mutation.prob': 0.11558038643127766, 'crossover.zeta': 0.178990412527086, 'crossover.eta': 0.16302613650647435, 'family_size': 4, 'n_parents': 6, 'n_offsprings': 2, 'pop_size': 30}
Best solution found:
X = [0.50007373 0.49994234 0.49992429 0.49998374 0.49992907 0.49991547
0.50006644 0.49986791 0.50017263 0.49987572]
F = [9.4041504e-08]
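Note that the Optuna interface shown above delegates the actual search to the optuna package, which therefore needs to be installed separately (e.g., via pip install optuna).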
Of course, you can also directly use the MixedVariableGA available in our framework:
[3]:
from pymoo.algorithms.hyperparameters import SingleObjectiveSingleRun, HyperparameterProblem
from pymoo.algorithms.soo.nonconvex.g3pcx import G3PCX
from pymoo.core.mixed import MixedVariableGA
from pymoo.core.parameters import set_params, hierarchical
from pymoo.optimize import minimize
from pymoo.problems.single import Sphere

algorithm = G3PCX()

problem = Sphere(n_var=10)
n_evals = 500

performance = SingleObjectiveSingleRun(problem, termination=("n_evals", n_evals), seed=1)

# a mixed-variable GA handles the mixed (real/choice) hyperparameter space
res = minimize(HyperparameterProblem(algorithm, performance),
               MixedVariableGA(pop_size=5),
               termination=('n_evals', 50),
               seed=1,
               verbose=False)

hyperparams = res.X
print(hyperparams)

set_params(algorithm, hierarchical(hyperparams))

res = minimize(Sphere(), algorithm, termination=("n_evals", n_evals), seed=1)
print("Best solution found: \nX = %s\nF = %s" % (res.X, res.F))
{'mutation.eta': 20.846815850616412, 'mutation.prob': 0.1862612654517947, 'crossover.zeta': 0.2737866416405445, 'crossover.eta': 0.1709324724852795, 'family_size': 8, 'n_parents': 3, 'n_offsprings': 2, 'pop_size': 72}
Best solution found:
X = [0.49993119 0.50005912 0.50013164 0.50004701 0.49981859 0.50001513
0.49994978 0.5001145 0.50009071 0.50000453]
F = [8.47902938e-08]
Optimizing the parameters for a single random seed is often not desirable, and this is precisely what makes hyperparameter optimization computationally expensive. So instead of using just a single random seed, we can use the MultiRun performance assessment to average over multiple runs as follows:
[4]:
from pymoo.algorithms.hyperparameters import HyperparameterProblem, MultiRun, stats_single_objective_mean
from pymoo.algorithms.soo.nonconvex.g3pcx import G3PCX
from pymoo.core.mixed import MixedVariableGA
from pymoo.core.parameters import set_params, hierarchical
from pymoo.optimize import minimize
from pymoo.problems.single import Sphere

algorithm = G3PCX()

problem = Sphere(n_var=10)
n_evals = 500
seeds = [5, 50, 500]

# assess each parameter setting by the mean objective over three seeded runs
performance = MultiRun(problem, seeds=seeds, func_stats=stats_single_objective_mean, termination=("n_evals", n_evals))

res = minimize(HyperparameterProblem(algorithm, performance),
               MixedVariableGA(pop_size=5),
               termination=('n_evals', 50),
               seed=1,
               verbose=True)

hyperparams = res.X
print(hyperparams)

set_params(algorithm, hierarchical(hyperparams))

res = minimize(Sphere(), algorithm, termination=("n_evals", n_evals), seed=5)
print("Best solution found: \nX = %s\nF = %s" % (res.X, res.F))
=================================================
n_gen | n_eval | f_avg | f_min
=================================================
1 | 5 | 0.0025603953 | 0.0000961170
2 | 10 | 0.0002927193 | 0.0000961170
3 | 15 | 0.0001224755 | 0.0000255079
4 | 20 | 0.0000718049 | 3.933692E-06
5 | 25 | 0.0000229269 | 3.933692E-06
6 | 30 | 7.309705E-06 | 2.258006E-06
7 | 35 | 3.780789E-06 | 2.258006E-06
8 | 40 | 2.654280E-06 | 1.696268E-06
9 | 45 | 2.134554E-06 | 1.335062E-06
10 | 50 | 1.626838E-06 | 2.232820E-07
{'mutation.eta': 21.327923020272316, 'mutation.prob': 0.21071914771913985, 'crossover.zeta': 0.2099716552753084, 'crossover.eta': 0.13018974582365575, 'family_size': 8, 'n_parents': 3, 'n_offsprings': 3, 'pop_size': 77}
Best solution found:
X = [0.49991024 0.49974101 0.49997067 0.49987257 0.49967221 0.50000813
0.50012018 0.49997465 0.4998346 0.50008974]
F = [2.50238347e-07]
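The aggregation over seeds is controlled by func_stats. As a hedged sketch, assuming func_stats receives the result objects of the seeded runs and returns a dictionary with the aggregated value under the key "F" (mirroring how stats_single_objective_mean is used above; check its implementation in pymoo.algorithms.hyperparameters for the exact contract), a median-based variant could look like this; stats_single_objective_median is a hypothetical name:

from pymoo.algorithms.hyperparameters import MultiRun
import numpy as np

def stats_single_objective_median(results):
    # hypothetical aggregation: median of the best objective over seeds
    # (assumes each entry is a pymoo Result whose F holds the best values)
    return {"F": np.median([res.F.min() for res in results])}

performance = MultiRun(problem, seeds=[5, 50, 500],
                       func_stats=stats_single_objective_median,
                       termination=("n_evals", 500))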
Another performance measure is the number of function evaluations needed until a specific goal has been reached. For single-objective optimization, such a goal is most likely a minimum function value to be found. Thus, for the termination, we use MinimumFunctionValueTermination with a value of 1e-5. We run the method for each random seed until this value has been reached or at most 500 function evaluations have taken place. The performance is then measured by the average number of function evaluations needed to reach the goal (func_stats=stats_avg_nevals).
[5]:
from pymoo.algorithms.hyperparameters import HyperparameterProblem, MultiRun, stats_avg_nevals
from pymoo.algorithms.soo.nonconvex.g3pcx import G3PCX
from pymoo.core.mixed import MixedVariableGA
from pymoo.core.parameters import set_params, hierarchical
from pymoo.core.termination import TerminateIfAny
from pymoo.optimize import minimize
from pymoo.problems.single import Sphere
from pymoo.termination.fmin import MinimumFunctionValueTermination
from pymoo.termination.max_eval import MaximumFunctionCallTermination

algorithm = G3PCX()

problem = Sphere(n_var=10)

# each run stops when f <= 1e-5 is reached or after at most 500 evaluations
termination = TerminateIfAny(MinimumFunctionValueTermination(1e-5), MaximumFunctionCallTermination(500))

# performance is the average number of evaluations needed to reach the goal
performance = MultiRun(problem, seeds=[5, 50, 500], func_stats=stats_avg_nevals, termination=termination)

res = minimize(HyperparameterProblem(algorithm, performance),
               MixedVariableGA(pop_size=5),
               termination=('n_evals', 50),
               seed=1,
               verbose=True)

hyperparams = res.X
print(hyperparams)

set_params(algorithm, hierarchical(hyperparams))

# res.f is the average evaluation budget found to suffice during tuning
res = minimize(Sphere(), algorithm, termination=("n_evals", res.f), seed=5)
print("Best solution found: \nX = %s\nF = %s" % (res.X, res.F))
=================================================
n_gen | n_eval | f_avg | f_min
=================================================
1 | 5 | 5.298000E+02 | 5.030000E+02
2 | 10 | 5.050000E+02 | 5.030000E+02
3 | 15 | 5.042000E+02 | 5.030000E+02
4 | 20 | 5.034000E+02 | 5.030000E+02
5 | 25 | 5.024000E+02 | 5.000000E+02
6 | 30 | 5.018000E+02 | 5.000000E+02
7 | 35 | 5.006000E+02 | 5.000000E+02
8 | 40 | 5.000000E+02 | 5.000000E+02
9 | 45 | 5.000000E+02 | 5.000000E+02
10 | 50 | 5.000000E+02 | 5.000000E+02
{'mutation.eta': 22.551200976898883, 'mutation.prob': 0.538816734003357, 'crossover.zeta': 0.023912790481847794, 'crossover.eta': 0.06246801257677745, 'family_size': 10, 'n_parents': 3, 'n_offsprings': 8, 'pop_size': 20}
Best solution found:
X = [0.50409446 0.55274317 0.57741933 0.50696571 0.52539616 0.50433253
0.44251811 0.55334363 0.44936427 0.38612332]
F = [0.0311862]
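Finally, once a satisfactory configuration has been found, it is often useful to persist it for later reuse. A minimal sketch using the standard library (the .item() conversion guards against numpy scalar types, which json cannot serialize directly):

import json

# convert possible numpy scalars to plain Python types before dumping
serializable = {k: (v.item() if hasattr(v, "item") else v) for k, v in hyperparams.items()}

with open("hyperparams.json", "w") as f:
    json.dump(serializable, f, indent=2)

The saved file can later be loaded with json.load and applied again via set_params(algorithm, hierarchical(...)).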