Hyperparameters¶

Info

Hyperparameter optimization is a new feature available since version 0.6.0. In general, this is quite a challenging and computationally expensive topic, and only a few basics are presented in this guide. If you are interested in contributing or collaborating, please let us know to enrich this module with more robust and better features.

Most algoriths have hyperparameters. For some optimization methods the parameters are already defined and can directly be optimized. For instance, for Differential Evolution (DE) the parameters can be found by:

[1]:

import json
from pymoo.algorithms.soo.nonconvex.de import DE
from pymoo.core.parameters import get_params, flatten, set_params, hierarchical

algorithm = DE()
flatten(get_params(algorithm))

[1]:

{'mating.jitter': <pymoo.core.variable.Choice at 0x1042b3f80>,
 'mating.CR': <pymoo.core.variable.Real at 0x10532a480>,
 'mating.crossover': <pymoo.core.variable.Choice at 0x10532a450>,
 'mating.F': <pymoo.core.variable.Real at 0x10532a420>,
 'mating.n_diffs': <pymoo.core.variable.Choice at 0x10532a3f0>,
 'mating.selection': <pymoo.core.variable.Choice at 0x105328aa0>}

If not provided directly, when initializing a HyperparameterProblem these variables are directly used for optimization.

Secondly, one needs to define what exactly should be optimized. For instance, for a single run on a problem (with a fixed random seed) using the well-known parameter optimization toolkit Optuna, the implementation may look as follows:

[2]:

from pymoo.algorithms.hyperparameters import SingleObjectiveSingleRun, HyperparameterProblem
from pymoo.algorithms.soo.nonconvex.g3pcx import G3PCX
from pymoo.algorithms.soo.nonconvex.optuna import Optuna
from pymoo.core.parameters import set_params, hierarchical
from pymoo.optimize import minimize
from pymoo.problems.single import Sphere

algorithm = G3PCX()

problem = Sphere(n_var=10)
n_evals = 500

performance = SingleObjectiveSingleRun(problem, termination=("n_evals", n_evals), seed=1)

res = minimize(HyperparameterProblem(algorithm, performance),
               Optuna(),
               termination=('n_evals', 50),
               seed=1,
               verbose=False)

hyperparams = res.X
print(hyperparams)
set_params(algorithm, hierarchical(hyperparams))

res = minimize(Sphere(), algorithm, termination=("n_evals", n_evals), seed=1)
print("Best solution found: \nX = %s\nF = %s" % (res.X, res.F))

{'mutation.eta': 13.808552564884089, 'mutation.prob': 0.11558038643127766, 'crossover.zeta': 0.178990412527086, 'crossover.eta': 0.16302613650647435, 'family_size': 4, 'n_parents': 6, 'n_offsprings': 2, 'pop_size': 30}
Best solution found:
X = [0.50007373 0.49994234 0.49992429 0.49998374 0.49992907 0.49991547
 0.50006644 0.49986791 0.50017263 0.49987572]
F = [9.4041504e-08]

Of course, you can also directly use the MixedVariableGA available in our framework:

[3]:

from pymoo.algorithms.hyperparameters import SingleObjectiveSingleRun, HyperparameterProblem
from pymoo.algorithms.soo.nonconvex.g3pcx import G3PCX
from pymoo.algorithms.soo.nonconvex.optuna import Optuna
from pymoo.core.mixed import MixedVariableGA
from pymoo.core.parameters import set_params, hierarchical
from pymoo.optimize import minimize
from pymoo.problems.single import Sphere


algorithm = G3PCX()

problem = Sphere(n_var=10)
n_evals = 500

performance = SingleObjectiveSingleRun(problem, termination=("n_evals", n_evals), seed=1)

res = minimize(HyperparameterProblem(algorithm, performance),
               MixedVariableGA(pop_size=5),
               termination=('n_evals', 50),
               seed=1,
               verbose=False)

hyperparams = res.X
print(hyperparams)
set_params(algorithm, hierarchical(hyperparams))

res = minimize(Sphere(), algorithm, termination=("n_evals", n_evals), seed=1)
print("Best solution found: \nX = %s\nF = %s" % (res.X, res.F))

{'mutation.eta': 20.846815850616412, 'mutation.prob': 0.1862612654517947, 'crossover.zeta': 0.2737866416405445, 'crossover.eta': 0.1709324724852795, 'family_size': 8, 'n_parents': 3, 'n_offsprings': 2, 'pop_size': 72}
Best solution found:
X = [0.49993119 0.50005912 0.50013164 0.50004701 0.49981859 0.50001513
 0.49994978 0.5001145  0.50009071 0.50000453]
F = [8.47902938e-08]

Now, optimizing the parameters for a single random seed is often not desirable. And this is precisely what makes hyper-parameter optimization computationally expensive. So instead of using just a single random seed, we can use the MultiRun performance assessment to average over multiple runs as follows:

[4]:

from pymoo.algorithms.hyperparameters import HyperparameterProblem, MultiRun, stats_single_objective_mean
from pymoo.algorithms.soo.nonconvex.g3pcx import G3PCX
from pymoo.core.mixed import MixedVariableGA
from pymoo.core.parameters import set_params, hierarchical
from pymoo.optimize import minimize
from pymoo.problems.single import Sphere


algorithm = G3PCX()

problem = Sphere(n_var=10)
n_evals = 500
seeds = [5, 50, 500]

performance = MultiRun(problem, seeds=seeds, func_stats=stats_single_objective_mean, termination=("n_evals", n_evals))

res = minimize(HyperparameterProblem(algorithm, performance),
               MixedVariableGA(pop_size=5),
               termination=('n_evals', 50),
               seed=1,
               verbose=True)

hyperparams = res.X
print(hyperparams)
set_params(algorithm, hierarchical(hyperparams))

res = minimize(Sphere(), algorithm, termination=("n_evals", n_evals), seed=5)
print("Best solution found: \nX = %s\nF = %s" % (res.X, res.F))

=================================================
n_gen  |  n_eval  |     f_avg     |     f_min
=================================================
     1 |        5 |  0.0025603953 |  0.0000961170
     2 |       10 |  0.0002927193 |  0.0000961170
     3 |       15 |  0.0001224755 |  0.0000255079
     4 |       20 |  0.0000718049 |  3.933692E-06
     5 |       25 |  0.0000229269 |  3.933692E-06
     6 |       30 |  7.309705E-06 |  2.258006E-06
     7 |       35 |  3.780789E-06 |  2.258006E-06
     8 |       40 |  2.654280E-06 |  1.696268E-06
     9 |       45 |  2.134554E-06 |  1.335062E-06
    10 |       50 |  1.626838E-06 |  2.232820E-07
{'mutation.eta': 21.327923020272316, 'mutation.prob': 0.21071914771913985, 'crossover.zeta': 0.2099716552753084, 'crossover.eta': 0.13018974582365575, 'family_size': 8, 'n_parents': 3, 'n_offsprings': 3, 'pop_size': 77}
Best solution found:
X = [0.49991024 0.49974101 0.49997067 0.49987257 0.49967221 0.50000813
 0.50012018 0.49997465 0.4998346  0.50008974]
F = [2.50238347e-07]

Another way of performance measure is the number of evaluations until a specific goal has been reached. For single-objective optimization, such a goal is most likely until a minimum function value has been found. Thus, for the termination, we use MinimumFunctionValueTermination with a value of 1e-5. We run the method for each random seed until this value has been reached or at most 500 function evaluations have taken place. The performance is then measured by the average number of function evaluations (func_stats=stats_avg_nevals) to reach the goal.

[5]:

from pymoo.algorithms.hyperparameters import HyperparameterProblem, MultiRun, stats_avg_nevals
from pymoo.algorithms.soo.nonconvex.g3pcx import G3PCX
from pymoo.core.mixed import MixedVariableGA
from pymoo.core.parameters import set_params, hierarchical
from pymoo.core.termination import TerminateIfAny
from pymoo.optimize import minimize
from pymoo.problems.single import Sphere
from pymoo.termination.fmin import MinimumFunctionValueTermination
from pymoo.termination.max_eval import MaximumFunctionCallTermination

algorithm = G3PCX()

problem = Sphere(n_var=10)

termination = TerminateIfAny(MinimumFunctionValueTermination(1e-5), MaximumFunctionCallTermination(500))

performance = MultiRun(problem, seeds=[5, 50, 500], func_stats=stats_avg_nevals, termination=termination)

res = minimize(HyperparameterProblem(algorithm, performance),
               MixedVariableGA(pop_size=5),
               ('n_evals', 50),
               seed=1,
               verbose=True)

hyperparams = res.X
print(hyperparams)
set_params(algorithm, hierarchical(hyperparams))

res = minimize(Sphere(), algorithm, termination=("n_evals", res.f), seed=5)
print("Best solution found: \nX = %s\nF = %s" % (res.X, res.F))

=================================================
n_gen  |  n_eval  |     f_avg     |     f_min
=================================================
     1 |        5 |  5.298000E+02 |  5.030000E+02
     2 |       10 |  5.050000E+02 |  5.030000E+02
     3 |       15 |  5.042000E+02 |  5.030000E+02
     4 |       20 |  5.034000E+02 |  5.030000E+02
     5 |       25 |  5.024000E+02 |  5.000000E+02
     6 |       30 |  5.018000E+02 |  5.000000E+02
     7 |       35 |  5.006000E+02 |  5.000000E+02
     8 |       40 |  5.000000E+02 |  5.000000E+02
     9 |       45 |  5.000000E+02 |  5.000000E+02
    10 |       50 |  5.000000E+02 |  5.000000E+02
{'mutation.eta': 22.551200976898883, 'mutation.prob': 0.538816734003357, 'crossover.zeta': 0.023912790481847794, 'crossover.eta': 0.06246801257677745, 'family_size': 10, 'n_parents': 3, 'n_offsprings': 8, 'pop_size': 20}
Best solution found:
X = [0.50409446 0.55274317 0.57741933 0.50696571 0.52539616 0.50433253
 0.44251811 0.55334363 0.44936427 0.38612332]
F = [0.0311862]