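A class of early-stopping-based hyperparameter optimization algorithms is designed for large hyperparameter search spaces, particularly when evaluating a single set of hyperparameters is computationally expensive. Irace implements the iterated racing algorithm, which focuses the search on the most promising configurations and uses statistical tests to discard the poorly performing ones.[28][29]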
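Another early-stopping hyperparameter optimization algorithm is successive halving (SHA),[30] which begins as a random search but periodically prunes low-performing models, thereby concentrating computational resources on the more promising ones. Asynchronous successive halving (ASHA)[31] further improves SHA's resource utilization by removing the need to evaluate and prune models synchronously. Hyperband[32] is a higher-level early-stopping-based algorithm that invokes SHA or ASHA multiple times with varying levels of pruning aggressiveness, which makes it more widely applicable and reduces the number of inputs it requires.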
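The pruning loop of successive halving can be illustrated with a minimal Python sketch. This is not the reference implementation from the cited papers: the function names, the reduction factor `eta`, the starting budget `min_budget`, and the toy loss function are all assumptions chosen for illustration. In each round every surviving configuration is evaluated at the current budget, only the best `1/eta` fraction is kept, and the budget for the next round is multiplied by `eta`.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Return the configuration that survives repeated pruning.

    configs    -- candidate hyperparameter settings (e.g. drawn at random)
    evaluate   -- hypothetical callback: evaluate(config, budget) -> loss,
                  where lower is better
    min_budget -- resource (e.g. epochs) given to each config in round one
    eta        -- reduction factor: keep the best 1/eta of configs per round
    """
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Evaluate every surviving configuration at the current budget
        # and rank them from best (lowest loss) to worst.
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        # Prune: keep only the top 1/eta fraction (at least one config).
        survivors = ranked[:max(1, len(ranked) // eta)]
        # Give the remaining configurations a larger budget next round.
        budget *= eta
    return survivors[0]


def toy_loss(config, budget):
    # Illustrative stand-in for a validation loss after `budget` epochs:
    # it shrinks as the budget grows and is smallest near lr = 0.1.
    return (config["lr"] - 0.1) ** 2 + 1.0 / budget


candidates = [{"lr": lr} for lr in (0.001, 0.01, 0.05, 0.1, 0.2, 0.3, 0.5, 1.0)]
best = successive_halving(candidates, toy_loss, min_budget=1, eta=2)
```

With eight candidates and `eta=2`, the sketch runs three rounds (8, 4, then 2 configurations) at budgets 1, 2, and 4. In practice the initial candidates would be sampled at random, and `evaluate` would resume training from a checkpoint rather than restart it. Hyperband can be thought of as running such a loop several times with different trade-offs between the number of candidates and the starting budget.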
^Shaban, A.; Cheng, C. A.; Hatch, N.; Boots, B. Truncated back-propagation for bilevel optimization. The 22nd International Conference on Artificial Intelligence and Statistics. PMLR. 2019: 1723–1732.
^Miikkulainen, R.; Liang, J.; Meyerson, E.; Rawal, A.; Fink, D.; Francon, O.; Raju, B.; Shahrzad, H.; Navruzyan, A.; Duffy, N.; Hodjat, B. Evolving Deep Neural Networks. 2017. arXiv:1703.00548 [cs.NE].
^Jaderberg, M.; Dalibard, V.; Osindero, S.; Czarnecki, W. M.; Donahue, J.; Razavi, A.; Vinyals, O.; Green, T.; Dunning, I.; Simonyan, K.; Fernando, C.; Kavukcuoglu, K. Population Based Training of Neural Networks. 2017. arXiv:1711.09846 [cs.LG].
^Such, F. P.; Madhavan, V.; Conti, E.; Lehman, J.; Stanley, K. O.; Clune, J. Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning. 2017. arXiv:1712.06567 [cs.NE].
^Li, Ang; Spyra, Ola; Perel, Sagi; Dalibard, Valentin; Jaderberg, Max; Gu, Chenjie; Budden, David; Harley, Tim; Gupta, Pramod. A Generalized Framework for Population Based Training. 2019-02-05. arXiv:1902.01894 [cs.AI].
^López-Ibáñez, Manuel; Dubois-Lacoste, Jérémie; Pérez Cáceres, Leslie; Stützle, Thomas; Birattari, Mauro. The irace package: Iterated Racing for Automatic Algorithm Configuration. Operations Research Perspectives. 2016, 3 (3): 43–58. doi:10.1016/j.orp.2016.09.002. hdl:10419/178265.
^Birattari, Mauro; Stützle, Thomas; Paquete, Luis; Varrentrapp, Klaus. A Racing Algorithm for Configuring Metaheuristics. GECCO 2002. 2002: 11–18.
^Jamieson, Kevin; Talwalkar, Ameet. Non-stochastic Best Arm Identification and Hyperparameter Optimization. 2015-02-27. arXiv:1502.07943 [cs.LG].
^Li, Liam; Jamieson, Kevin; Rostamizadeh, Afshin; Gonina, Ekaterina; Hardt, Moritz; Recht, Benjamin; Talwalkar, Ameet. A System for Massively Parallel Hyperparameter Tuning. 2020-03-16. arXiv:1810.05934v5 [cs.LG].
^Li, Lisha; Jamieson, Kevin; DeSalvo, Giulia; Rostamizadeh, Afshin; Talwalkar, Ameet. Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization. Journal of Machine Learning Research. 2018, 18: 1–52. arXiv:1603.06560.
^Diaz, Gonzalo; Fokoue, Achille; Nannicini, Giacomo; Samulowitz, Horst. An effective algorithm for hyperparameter optimization of neural networks. 2017. arXiv:1705.08520 [cs.AI].