Use Bayesian optimisation! Fit a Gaussian process to your model performance as a fn of hyperparams. Run your network on a fraction of your dataset a few times until your GP has a few samples to work on. Search hyperprams by evaluating the GP at different points.
Viewing a single comment thread. View all comments