Submitted by __Maximum__ t3_11l3as6 in MachineLearning
whata_wonderful_day t1_jbhp4gb wrote
Reply to comment by Jepacor in [D] Can someone explain the discrepancy between the findings of LLaMA and Chinchilla? by __Maximum__
Thanks. Alas, I thought it was an encoder model. I've been on the lookout for a big one; the largest I've seen is DeBERTa V2 with 1.5B params.