
trendymoniker t1_ivf0kjq wrote

I think you’ve got the right idea. It makes sense that big companies are developing and pushing big models — they’ve got the resources to train them. But you can often get a lot done with a much smaller, boutique model; that’s one of the next frontiers.

8

IndieAIResearcher t1_ivf77hf wrote

Examples?

3

trendymoniker t1_ivf84sd wrote

Easy answers are compact architectures like EfficientNet or distillations like DistilBERT. You can also get an intuition for the process by taking a small, easy dataset — like MNIST or CIFAR — and running a big hyperparameter search over models. There will be small models which perform close to the best models.
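A minimal sketch of that kind of search, using scikit-learn's `digits` dataset as a tiny MNIST stand-in (the widths searched here are my own illustrative choices, not from the comment):

```python
# Hypothetical sketch: sweep MLP width on a tiny dataset and compare scores.
# The point of the exercise: some of the smallest models land close to the best.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

scores = {}
for width in (8, 32, 128, 512):  # the model-size axis of the search
    clf = MLPClassifier(hidden_layer_sizes=(width,), max_iter=500, random_state=0)
    clf.fit(X_train, y_train)
    scores[width] = clf.score(X_test, y_test)

print(scores)  # the small models typically trail the big one only slightly
```

A real search would also sweep depth, learning rate, and regularization, but even this one axis usually shows the flat region near the top that the comment describes.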

These days nobody uses ResNet or Inception, but there was a time when they were the bleeding edge. Now it’s all smaller, more precise stuff.

The other dimension where you can win over big models is hardcoding in your priors.
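A toy illustration of that last point (the data and features here are invented for the example): if you know the signal is periodic, baking that prior into the features lets a two-parameter linear model fit what a generic model would need far more capacity to learn.

```python
# Hypothetical sketch: hardcode a periodicity prior via sin/cos features,
# then fit a tiny linear model instead of a large generic one.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = rng.uniform(0, 20, size=200)
y = 3.0 * np.sin(x) + 0.5 + rng.normal(scale=0.05, size=200)

# The prior, hardcoded: featurize with sin/cos instead of raw x.
features = np.column_stack([np.sin(x), np.cos(x)])
model = LinearRegression().fit(features, y)
r2 = model.score(features, y)
print(r2)  # near 1.0, because the prior matches the data-generating process
```

The same idea scales up: convolutions encode translation invariance, graph networks encode permutation invariance — each prior you hardcode is capacity the model doesn’t have to spend learning it.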

11