soraki_soladead OP t1_j0gemzb wrote
Reply to comment by 2600_yay in [D] Trying to find paper about n-grams in early transformer layers by soraki_soladead
It isn’t, but it’s an interesting paper!
soraki_soladead OP t1_j0el3yt wrote
Reply to comment by Axel-Blaze in [D] Trying to find paper about n-grams in early transformer layers by soraki_soladead
Thanks, I’ll take a look!
soraki_soladead OP t1_j0desco wrote
Reply to comment by aps692 in [D] Trying to find paper about n-grams in early transformer layers by soraki_soladead
Reading through it now. It was on my reading list, but it doesn’t look familiar.
soraki_soladead OP t1_j0cmgzs wrote
Reply to comment by Rabrg in [D] Trying to find paper about n-grams in early transformer layers by soraki_soladead
Perfect. Thank you! That explains why I couldn't find it.
EDIT: Spoke too soon. I think this covers some of the same ideas, but it isn't the one I'm remembering. It doesn't propose a method for simplifying the earlier layers of the transformer by exploiting the fact that they primarily learn bigrams. I could have sworn I read about it in an arXiv or OpenReview paper.
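For anyone wondering what I mean by "simplifying the earlier layers": here's a minimal sketch of the idea as I remember it (my own illustration, not code from any paper). The first layer(s) get replaced by a learned lookup over hashed (previous token, current token) pairs, on the theory that early layers mostly capture local bigram statistics:

```python
import torch
import torch.nn as nn

class BigramFrontEnd(nn.Module):
    """Hypothetical stand-in for the first transformer layer(s): a learned
    embedding over (previous token, current token) pairs, hashed into a
    fixed-size table so memory stays manageable."""

    def __init__(self, vocab_size: int, d_model: int, table_size: int = 1 << 20):
        super().__init__()
        self.vocab_size = vocab_size
        self.table_size = table_size
        self.bigram_emb = nn.Embedding(table_size, d_model)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) integer tensor
        prev = torch.roll(token_ids, shifts=1, dims=1)
        prev[:, 0] = 0  # treat position 0 as preceded by a pad token
        # Hash each (prev, cur) pair into the fixed-size table.
        pair_ids = (prev * self.vocab_size + token_ids) % self.table_size
        return self.bigram_emb(pair_ids)

# The later transformer layers would then run on these bigram features
# in place of (or added to) the usual token embeddings, which is where
# any speedup would come from.
```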
Submitted by soraki_soladead t3_zmoxp7 in MachineLearning
soraki_soladead t1_iznhiei wrote
Reply to comment by aussie_punmaster in [R] Large language models are not zero-shot communicators by mrx-ai
Sure, but in the context of ChatGPT and how it was trained, this isn’t a surprising result.
soraki_soladead t1_izjq5j8 wrote
Reply to comment by abecedarius in [R] Large language models are not zero-shot communicators by mrx-ai
It seems obvious that the ambiguity comes from the framing of the question: the model has no way of knowing whether the person has been found, or when the question was posed to Juan. However, if you ask the model to explain Juan’s answer, that is a very different request.
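To make the contrast concrete, here’s roughly what I mean by the two framings (paraphrasing the paper’s Juan example from memory; the exact prompts in the paper may differ):

```python
dialogue = (
    'Esther asked "Can you come to my party on Friday?" '
    'and Juan responded "I have to work."'
)

# Framing 1: the model must resolve the implicature outright,
# with no information about timing or outcome.
prompt_yes_no = dialogue + " Did Juan say yes or no?"

# Framing 2: the model is asked to interpret the utterance itself,
# which is a very different (and arguably easier) request.
prompt_explain = dialogue + " Explain what Juan means by his answer."
```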
soraki_soladead t1_izjpxd6 wrote
Reply to comment by jcasper in [R] Large language models are not zero-shot communicators by mrx-ai
FWIW, it is very difficult to know whether the model has seen the task (or similar tasks) before, given the nature of the data collection.
I feel like "zero-shot" and "few-shot" have taken on a much less rigorous meaning when applied to LLMs.
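As an illustration of why it’s hard: a common contamination check is verbatim n-gram overlap between eval examples and the training corpus, something like the sketch below (my own toy version, not any lab’s actual pipeline; the 13-gram threshold is just a convention I’ve seen used):

```python
def ngrams(tokens, n=13):
    """All contiguous n-grams of a token sequence, as tuples."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def is_contaminated(example_tokens, train_ngrams, n=13):
    """Flag an eval example if any of its n-grams appears verbatim in the
    training corpus. This misses paraphrases and near-duplicates, which is
    exactly why "the model has never seen this task" is hard to establish."""
    return any(g in train_ngrams for g in ngrams(example_tokens, n))

# train_ngrams would be the union of ngrams(doc) over every training
# document; at web scale this can only be approximated (e.g. with
# hashing or Bloom filters).
```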
soraki_soladead t1_j375igy wrote
Reply to comment by Cpt_shortypants in [D] Is it a time to seriously regulate and restrict AI research? by Baturinsky
Regulating machine learning sounds ridiculous, but note that cryptography is regulated, and it also consists of just math and programming. For example, if you publish an app on the App Store that uses cryptography, you need to comply with export regulations: https://help.apple.com/app-store-connect/#/dev88f5c7bf9
Now, that’s for exports and publishing. Regulating personal use is much more difficult, but it’s still possible: perhaps by requiring a photo ID to download certain libraries or to requisition GPUs/TPUs.
Personally, I think it’s unlikely to happen, and the benefits of doing so would be minimal.