Submitted by naequs t3_yon48p in MachineLearning
I think you’ve got the right idea. It makes sense that big companies are developing and pushing big models. They’ve got the resources to train them. But you can often get a lot done with a much smaller, boutique model — thats one of the next frontiers.
IndieAIResearcher t1_ivf77hf wrote
Examples?
trendymoniker t1_ivf84sd wrote
Easy answers are compact and distilled models like EfficientNet or DistilBERT. You can also get an intuition for the process by taking a small, easy dataset, like MNIST or CIFAR, and running a big hyperparameter search over models: there will be small models that perform close to the best ones.
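A minimal sketch of the standard distillation loss, assuming a PyTorch setup where a frozen teacher and a small student produce logits of the same shape (the function name and hyperparameter values here are illustrative, not from any specific paper):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend soft-target KL loss (teacher) with hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients stay comparable across temperatures
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Usage inside a training loop (teacher frozen, student being trained):
# with torch.no_grad():
#     teacher_logits = teacher(x)
# loss = distillation_loss(student(x), teacher_logits, y)
```

The temperature `T` softens both distributions so the student learns from the teacher's relative confidences across all classes, not just its top prediction.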
These days nobody uses ResNet or Inception, but there was a time when they were the bleeding edge. Now it's all smaller, more precise stuff.
The other dimension where you can win over big models is hardcoding in your priors, i.e. building known structure of the problem directly into the features or the architecture.
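A toy illustration of that last point (a hypothetical example, not from the thread): if you know your signal has a daily cycle, hardcoding the 24-hour period into the features lets a two-parameter linear fit do what would otherwise take a much larger model discovering the periodicity on its own:

```python
import numpy as np

rng = np.random.default_rng(0)
t = rng.uniform(0, 100, size=1000)
y = np.sin(2 * np.pi * t / 24) + 0.1 * rng.normal(size=t.shape)  # noisy daily cycle

# The prior is baked into the features: the period of 24 is hardcoded.
X = np.column_stack([np.sin(2 * np.pi * t / 24), np.cos(2 * np.pi * t / 24)])

# Closed-form least squares: two parameters instead of a big network.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
print("mean squared error:", np.mean((X @ w - y) ** 2))  # ~ the noise variance
```

The same idea is why CNNs beat MLPs on images with far fewer parameters: translation equivariance is a prior hardcoded into the architecture.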
IndieAIResearcher t1_ivf9i33 wrote
Thanks :)