Submitted by __Maximum__ t3_11l3as6 in MachineLearning
Jepacor t1_jbdrovb wrote
Reply to comment by whata_wonderful_day in [D] Can someone explain the discrepancy between the findings of LLaMA and Chinchilla? by __Maximum__
The link to the model is in the Google Sheet they linked: https://github.com/facebookresearch/fairseq/blob/main/examples/megatron_11b/README.md
whata_wonderful_day t1_jbhp4gb wrote
Thanks! Alas, I'd thought it was an encoder model. I've been on the lookout for a big one; the largest I've seen is DeBERTa V2 at 1.5B parameters.