Decadz

Decadz OP t1_j41wx24 wrote

Yes, as you said, that's what I'm trying to find out! It will be interesting to know whether you can combine the two approaches into one technique, or have two separate approaches being used in one system.

I was just enquiring to make sure I'm not going to spend time reinventing the wheel haha. It will also be interesting to have some insight into how the two approaches interact, and whether the benefits stack or overlap.

2

Decadz OP t1_j41w1zt wrote

Thanks for the suggestion! Brandon Amos has many great pieces of research. The linked paper is quite long, so I will need to give it a more complete reading at a later date to be sure. At a glance though, this tutorial is about meta-optimization theory, as opposed to what I was originally asking for, which is the application of meta-optimization techniques to learning parameter initialisations + optimizers.
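To make the distinction concrete, here is a minimal sketch (my own toy illustration, not from the linked tutorial) of meta-optimizing a parameter initialisation: an inner loop adapts to each task, and the shared initialisation is updated with a first-order meta-gradient. The quadratic task losses, step sizes, and the first-order approximation are all simplifying assumptions.

```python
# Toy first-order meta-learning of a parameter initialisation (illustrative only).
import numpy as np

def task_grad(theta, target):
    # Gradient of a per-task quadratic loss 0.5 * ||theta - target||^2.
    return theta - target

meta_init = np.zeros(2)            # the initialisation being meta-learned
inner_lr, meta_lr = 0.1, 0.05      # illustrative step sizes
tasks = [np.array([1.0, 2.0]), np.array([-2.0, 0.5])]  # toy task targets

for meta_step in range(200):
    meta_grad = np.zeros_like(meta_init)
    for target in tasks:
        # Inner loop: one gradient step from the shared initialisation.
        theta = meta_init - inner_lr * task_grad(meta_init, target)
        # First-order meta-gradient: gradient of the post-update loss
        # w.r.t. the initialisation, ignoring second-order terms.
        meta_grad += task_grad(theta, target)
    meta_init -= meta_lr * meta_grad / len(tasks)

print(meta_init)  # converges towards the mean of the toy task targets
```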

3

Decadz OP t1_j41udg4 wrote

Thanks for the recommendation! I was unaware of this follow-up work, which naturally extends Baydin et al.'s original work [1]. Categorically, I would consider this paper to be more about meta-optimization (theory), similar to [2, 3]. I was looking for more applied meta-optimization work. A rough sketch of the hypergradient update from [1] is included after the references below.

[1] Baydin, A. G., et al. (2017). Online learning rate adaptation with hypergradient descent.

[2] Maclaurin, D., et al. (2015). Gradient-based hyperparameter optimization through reversible learning. ICML.

[3] Lorraine, J., et al. (2020). Optimizing millions of hyperparameters by implicit differentiation. AISTATS.
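For concreteness, here is a minimal sketch of the online learning-rate adaptation rule from hypergradient descent [1] on a toy quadratic. The objective, initial learning rate, and hypergradient step size are illustrative choices rather than values from the paper.

```python
# Toy hypergradient descent: adapt the learning rate online (illustrative only).
import numpy as np

def grad(theta):
    # Gradient of the toy objective f(theta) = 0.5 * ||theta||^2.
    return theta

theta = np.array([5.0, -3.0])   # model parameters
alpha = 0.01                    # learning rate, adapted online
beta = 1e-4                     # hypergradient step size (illustrative)
prev_grad = np.zeros_like(theta)

for step in range(100):
    g = grad(theta)
    # The hypergradient of the current loss w.r.t. alpha is -g_t . g_{t-1},
    # so descending on it nudges alpha along g_t . g_{t-1}.
    alpha += beta * np.dot(g, prev_grad)
    theta = theta - alpha * g
    prev_grad = g

print(theta, alpha)
```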

2