
Superschlenz t1_iypxi3i wrote

>Could you elaborate on why?

Because relying on random noise basically means "we do not understand the real causes," and a solution cannot be optimal if different random seeds lead to different performance results.
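A toy illustration of that seed-sensitivity point (everything here is hypothetical — a noisy SGD run on a one-dimensional quadratic, not a real model):

```python
import random

def train(seed, steps=200):
    """Toy 'training': minimize f(w) = (w - 3)^2 with noisy SGD.
    The seed controls both the random init and the gradient noise."""
    rng = random.Random(seed)
    w = rng.uniform(-10, 10)      # random initialization
    lr = 0.1                      # hyperparameter, fixed by the developer
    for _ in range(steps):
        grad = 2 * (w - 3) + rng.gauss(0, 1.0)  # noisy gradient estimate
        w -= lr * grad
    return (w - 3) ** 2           # final loss

losses = [train(seed) for seed in range(5)]
print(losses)  # same algorithm, same data, five different results
```

Same algorithm, same "data," yet each seed lands on a different final loss — which is exactly why reporting a single seeded run says little about the method itself.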

>What is the alternative?

I am not competent enough to answer that. But basically, the random seed is itself a hyperparameter, and an optimal learning algorithm should have no hyperparameters at all, so that everything depends on the user's data and learning is not hampered by a wrong hyperparameter choice made by the developer. Maybe Bayesian optimization, with some yet-to-be-invented way to protect it against the curse of dimensionality in high-dimensional data.

0

Oceanboi t1_iyq03g2 wrote

Why do you say an optimal learning algorithm should have zero hyperparameters? Are you saying an optimal neural network would learn things like batch size, learning rate, optimal optimizer (lol), input size, etc. on its own? In that case, wouldn't a model with zero hyperparameters be conceptually the same as a model tuned to the optimal hyperparameter combination?

Theoretically you could make these hyperparameters trainable if you had the coding chops, so why are we, as a community, still tweaking hyperparameters iteratively?
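For instance, the learning rate can be made trainable via hypergradient descent. Here is a toy sketch (all constants are illustrative, and note that the meta-learning rate `beta` is itself a new hyperparameter, which is part of why the problem doesn't simply disappear):

```python
def grad(w):
    return 2 * w  # gradient of the toy objective f(w) = w^2

w = 5.0        # model parameter
lr = 0.01      # learning rate -- now updated during training
beta = 0.001   # meta-learning rate (itself a hyperparameter!)
prev_g = 0.0
for _ in range(100):
    g = grad(w)
    lr += beta * g * prev_g  # hypergradient step: grow lr while successive gradients agree
    w -= lr * g
    prev_g = g
print(w, lr)   # w converges near 0; lr has adapted upward on its own
```

The deliberately bad initial `lr = 0.01` gets corrected automatically, but the choice of `beta` has just moved the tuning problem up one level.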

1

Superschlenz t1_iyq5oy5 wrote

>Why do you say an optimal learning algorithm should have zero hyperparameters?

Because hyperparameters are fixed by the developer, so the developer must know the user's environment in order to tune them. But if it requires a developer, then it is programming, not learning.

>Are you saying an optimal neural network would learn things like batch size, learning rate, optimal optimizer (lol), input size, etc, on its own?

An optimal learning algorithm wouldn't have those hyperparameters at all, not even static hardware.

>In this case wouldn't a model with zero hyperparameters be the same conceptually as a model that has been tuned to the optimal hyperparameter combination?

Users do not tune hyperparameters, and developers do not know the user's environment. The agent can be broadly pretrained in the developer's laboratory to speed up learning at the user's site, but ultimately it has to learn on its own at the user's site, without a developer being around.

>Theoretically you could make these hyperparameters trainable if you had the coding chops, so why are we still as a community tweaking hyperparameters iteratively?

Because you as a community were forced to decide on a career when you were 14 years old, and you chose to become machine learning engineers because you were more talented than others, and now you are putting on the show of the useful engineer.

−1

Optimal-Asshole t1_iyqtp3l wrote

No, the reason for hyperparameter optimization isn't job security. It's that choosing better hyperparameters produces better results, which leads to more success in applications. There are people working on automatic hyperparameter optimization.
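Even a stdlib-only random search automates part of that loop. In this sketch the objective is a made-up surrogate standing in for a real train-and-validate run (any real setup would call actual training code there):

```python
import math
import random

def validation_score(lr, batch_size):
    """Hypothetical surrogate for 'train a model, return validation score'.
    (Made up to peak near lr=0.1 and batch_size=32.)"""
    return -((math.log10(lr) + 1) ** 2) - ((batch_size - 32) / 64) ** 2

rng = random.Random(0)
best = None
for _ in range(50):  # random search over the hyperparameter space
    lr = 10 ** rng.uniform(-4, 0)          # log-uniform learning rate
    bs = rng.choice([8, 16, 32, 64, 128])  # batch size
    score = validation_score(lr, bs)
    if best is None or score > best[0]:
        best = (score, lr, bs)
print(best)  # (best score, best lr, best batch size)
```

Tools like Optuna or Hyperopt do this with smarter samplers, but the structure — propose, evaluate, keep the best — is the same.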

But let's not act like it's due solely to some community-caused phenomenon and engineers putting on a show. Honestly, your message comes off as a little bitter.

2