How humpday.minimize() picks an algorithm when you don't
specify one — and which algorithms are eligible for which kinds
of problem.
When you call minimize(f, bounds=...) without a
method= argument, three filters decide which optimizer
runs:
n_dim:
GridSearch is exponential; BayesianOpt's
Gaussian process scales cubically in observations and its RBF
kernel degenerates; PRIMA_UOBYQA needs
(n+1)(n+2)/2 interpolation points before it can
build its full quadratic model.
BayesianOpt's GP
doesn't pay off until you've drawn enough samples to fit it;
CMAEvolutionStrategy wants several generations of
4·n evaluations to adapt its covariance;
Powell's PRIMA family must initialize an interpolation set
before the first iteration.
minimize() entry we time your objective four times
and take the median. The resulting per-call cost decides which
algorithms are worth running: when each call is a microsecond,
BayesianOpt's GP fit and CMA-ES's eigendecomposition
would dominate wall-clock and are filtered out; when each call
is a second, every algorithm is eligible and we lean toward the
sample-efficient ones.
The recommender consults a benchmarks-driven grid
(benchmarks/recommendation_grid.json) when one is
available, picking the eligible algorithm with the smallest
median best-value on the nearest (n_dim, n_trials)
cell. When no grid is present it falls through to the rule-based
ranking that mirrors suggest_pure.
Algorithms not listed here are treated as having no cap (linear-in- dim or better).
| Algorithm | Max n_dim | Why |
|---|---|---|
| GridSearch | 4 | n_per_axisn_dim grid explodes |
| BayesianOpt | 10 | GP cost O(n_obs3) + RBF kernel degenerates in high dim |
| PRIMA_UOBYQA | 12 | Needs (n+1)(n+2)/2 interpolation points |
| NelderMead | 20 | Simplex degenerates past ~20 dims |
| PRIMA_NEWUOA | 25 | Underdetermined quadratic model fit |
| Powell | 30 | Direction-set line searches stall |
| PatternSearch | 30 | 2n directions per pattern |
| PRIMA_BOBYQA | 30 | Geometry/BIGLAG steps make each optimize() call minutes past n=30 |
| FireflyAlgorithm | 50 | O(n_pop2) attractions per generation |
| AntColonyOpt | 50 | Archive-sample interactions degrade |
| CMAEvolutionStrategy | 100 | Eigendecomposition O(n3) per generation |
Each algorithm is classified by its per-iteration bookkeeping cost. An algorithm is eligible only when your objective's measured eval-time is at least the tier's threshold — below that, the algorithm's own work would dominate wall-clock.
| Tier | Algorithms | Per-iter overhead | Min eval-time |
|---|---|---|---|
| 0 | RandomSearch, GridSearch | ~µs — pure sampling baselines, no adaptive structure | 0 |
| 1 | HillClimbing, Rechenberg, NelderMead, Powell, DE, PSO, SA, GA, ES, FA, HS, ACO, LBFGSB, PatternSearch, CoordinateDescent | ~100µs — single-trajectory adaptation, simplex updates, population loops | 10 µs |
| 2 | PRIMA_BOBYQA, PRIMA_NEWUOA | ~1ms — small linear-algebra updates per iter | 100 µs |
| 3 | PRIMA_UOBYQA, CMAEvolutionStrategy | ~10ms — full quadratic / eigendecomp per gen | 1 ms |
| 4 | BayesianOpt | ~100ms — GP fit O(n_obs3) per iter | 10 ms |
The auto-selection is the default; you can always override.
from humpday import minimize
# Auto-selection (the default).
result = minimize(my_objective, bounds=[(-5, 5)] * 8)
print(result.method) # "DifferentialEvolution" (say)
print(result.eval_time_measured) # 1.4e-04 (s/call)
print(result.tier) # 1
print(result.fun, result.x)
# Disable the timing call — useful for stochastic objectives.
result = minimize(my_objective, bounds=..., options={"auto_timing": False})
# Force a specific algorithm.
result = minimize(my_objective, bounds=..., method="CMAEvolutionStrategy")
benchmarks/recommendation_grid.json is the empirical
input to auto-selection. Each cell is keyed
"n_dim/n_trials" and holds, for every algorithm that
cleared the structural eligibility filter at that cell, the raw
per-(objective, seed) best-value plus aggregated
borda_score, borda_worst,
median_best, and mean_wall summaries.
The objective suite covers twelve landscape regimes — the
five classics (sphere, Rosenbrock, Rastrigin, Griewank, Salomon),
four deceptive multimodal benchmarks that prevent locally-greedy
algorithms from winning on separable cases (Ackley, Schwefel,
Michalewicz with m=20, Styblinski-Tang), and three
rotated benchmarks (rotated Rosenbrock, rotated
Rastrigin, rotated Ackley) that strip the axis-alignment bonus
from coordinate-wise methods. The rotation matrix Q is
deterministic per (n_dim, seed) so the benchmark stays
reproducible while still penalizing algorithms that rely on
coordinate-axis structure (CoordinateDescent, Powell, NelderMead)
and rewarding ones whose adaptation is rotation-invariant
(CMA-ES, DE, PSO).
Borda mean-rank. The recommender picks by the
Borda score (lower is better): for each
(objective, seed) run, every algorithm in the cell
gets a rank 1..k by best-value; the per-algorithm Borda score
is the mean rank across the suite. This rewards reliability
(consistently in the top few) over occasional brilliance
(winning by 1e-8 on one benchmark while ranking last on
another). borda_worst reports the worst single
rank an algorithm achieved — a tie-breaker for the
risk-averse.
Parallel. Every
(cell, algorithm, objective, seed) tuple is an
independent unit of work; pass --workers N to use a
process pool. The script pins BLAS to one thread per worker so the
numpy-heavy algorithms (CMA-ES, BayesianOpt) don't contend.
Incremental. Re-running the script reads any
existing grid file, skips cells / algorithms / objectives / seeds
that already have a recorded result, and only runs the missing
tuples. Re-running with new --dims or
--trials extends the grid rather than overwriting it,
and the meta.first_built timestamp is preserved.
Writes are atomic (sibling-temp + rename) so a kill mid-write
can't corrupt the file.
# Fast development sweep (~30s, parallel, default cpu_count - 1 workers). python benchmarks/build_recommendation_grid.py --quick # Full default sweep (dims 2,5,10,20,50,100 × trials 50,200,1000). python benchmarks/build_recommendation_grid.py --workers 16 --seeds 5 # Extend an existing grid with a new dimension only. python benchmarks/build_recommendation_grid.py --dims 30 # Retry algorithms previously marked "too slow" (e.g. on a beefier machine). python benchmarks/build_recommendation_grid.py --include-skipped
Algorithms whose probe run exceeds the per-call wall-clock budget
(default 30s) are marked skipped_too_slow for that
cell — the next run won't waste cycles on them unless you
pass --include-skipped.
f(x) once at the cube centre — if your
objective is non-deterministic, that one sample sets random
state. Pass options={"auto_timing": False} to skip
it; the recommender then falls back to dim & trials only.
LBFGSB or
Powell directly; if it's known multimodal with
expensive evaluations, pick BayesianOpt.
method= explicitly when you need byte-for-byte
determinism.
See SOTA status for how each algorithm compares against its canonical reference implementation, and Algorithms for the per-algorithm implementation notes.