Submitted by starstruckmon t3_1027geh in MachineLearning
nutpeabutter t1_j2snx76 wrote
Reply to comment by Taenk in [R] Massive Language Models Can Be Accurately Pruned in One-Shot by starstruckmon
From my personal interactions, it gave off the vibe of having been trained on raw websites, whereas the GPT-3 models (both base and chat) felt much more natural. Maybe something to do with having to learn too many languages?