
trendymoniker t1_ivf0kjq wrote

I think you’ve got the right idea. It makes sense that big companies are developing and pushing big models — they’ve got the resources to train them. But you can often get a lot done with a much smaller, boutique model; that’s one of the next frontiers.

8

IndieAIResearcher t1_ivf77hf wrote

Examples?

3

trendymoniker t1_ivf84sd wrote

Easy answers are compact architectures like EfficientNet or distillations like DistilBERT. You can also get an intuition for the process by taking a small, easy dataset — like MNIST or CIFAR — and running a big hyperparameter search over models. There will be small models which perform close to the best models.
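A minimal sketch of that kind of search, using scikit-learn's `digits` dataset as a tiny MNIST stand-in (the widths searched here are my own illustrative choices, not from the comment):

```python
# Hypothetical sketch: sweep MLP width on a tiny dataset and compare scores.
# The point of the exercise: some of the smallest models land close to the best.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

scores = {}
for width in (8, 32, 128, 512):  # the model-size axis of the search
    clf = MLPClassifier(hidden_layer_sizes=(width,), max_iter=500, random_state=0)
    clf.fit(X_train, y_train)
    scores[width] = clf.score(X_test, y_test)

print(scores)  # the small models typically trail the big one only slightly
```

A real search would also sweep depth, learning rate, and regularization, but even this one axis usually shows the flat region near the top that the comment describes.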

These days nobody uses ResNet or Inception, but there was a time when they were the bleeding edge. Now it’s all smaller, more precise stuff.

The other dimension where you can win over big models is hardcoding in your priors.
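A toy illustration of that last point (the data and features here are invented for the example): if you know the signal is periodic, baking that prior into the features lets a two-parameter linear model fit what a generic model would need far more capacity to learn.

```python
# Hypothetical sketch: hardcode a periodicity prior via sin/cos features,
# then fit a tiny linear model instead of a large generic one.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = rng.uniform(0, 20, size=200)
y = 3.0 * np.sin(x) + 0.5 + rng.normal(scale=0.05, size=200)

# The prior, hardcoded: featurize with sin/cos instead of raw x.
features = np.column_stack([np.sin(x), np.cos(x)])
model = LinearRegression().fit(features, y)
r2 = model.score(features, y)
print(r2)  # near 1.0, because the prior matches the data-generating process
```

The same idea scales up: convolutions encode translation invariance, graph networks encode permutation invariance — each prior you hardcode is capacity the model doesn’t have to spend learning it.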

11