
AuspiciousApple OP t1_itwvq1f wrote

>I don't think any model you can run on a single commodity GPU will be on par with GPT-3.

That makes sense. I'm not an NLP person, so I don't have a good intuition on how these models scale or what the benchmark numbers actually mean.

In CV, the difference between a small and a large model might be a few percentage points of accuracy on ImageNet, but even small models work reasonably well. FLAN-T5-XL seems to generate nonsense 90% of the time for the prompts I've tried, whereas GPT-3 produces great output most of the time.

Do you have any experience with these open models?

1

_Arsenie_Boca_ t1_ityccjh wrote

I don't think there is a fundamental difference between CV and NLP. However, we expect language models to be much more generalist than any vision model (have you ever seen a vision model that performs well on both discriminative and generative tasks across domains without fine-tuning?). I believe this is where scale is the enabling factor.

1