soraki_soladead OP t1_j0gemzb wrote
Reply to comment by 2600_yay in [D] Trying to find paper about n-grams in early transformer layers by soraki_soladead
It isn’t, but it’s an interesting paper!
soraki_soladead OP t1_j0el3yt wrote
Reply to comment by Axel-Blaze in [D] Trying to find paper about n-grams in early transformer layers by soraki_soladead
Thanks, I’ll take a look!
soraki_soladead OP t1_j0desco wrote
Reply to comment by aps692 in [D] Trying to find paper about n-grams in early transformer layers by soraki_soladead
Reading through it now. It was on my reading list, but it doesn’t look familiar.
soraki_soladead OP t1_j0cmgzs wrote
Reply to comment by Rabrg in [D] Trying to find paper about n-grams in early transformer layers by soraki_soladead
Perfect. Thank you! That explains why I couldn't find it.
EDIT: Spoke too soon. I think this covers some of the same ideas, but it isn't the one I'm remembering. It doesn't propose a method for simplifying the earlier layers of the transformer by exploiting the fact that they primarily learn bigrams. I could have sworn I read about it in an arXiv or OpenReview paper.
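For anyone wondering what I mean by "simplifying the earlier layers": here's a minimal sketch of the idea as I remember it (my own illustration, not code from any paper). The first layer(s) get replaced by a learned lookup over hashed (previous token, current token) pairs, on the theory that early layers mostly capture local bigram statistics:

```python
import torch
import torch.nn as nn

class BigramFrontEnd(nn.Module):
    """Hypothetical stand-in for the first transformer layer(s): a learned
    embedding over (previous token, current token) pairs, hashed into a
    fixed-size table so memory stays manageable."""

    def __init__(self, vocab_size: int, d_model: int, table_size: int = 1 << 20):
        super().__init__()
        self.vocab_size = vocab_size
        self.table_size = table_size
        self.bigram_emb = nn.Embedding(table_size, d_model)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) integer tensor
        prev = torch.roll(token_ids, shifts=1, dims=1)
        prev[:, 0] = 0  # treat position 0 as preceded by a pad token
        # Hash each (prev, cur) pair into the fixed-size table.
        pair_ids = (prev * self.vocab_size + token_ids) % self.table_size
        return self.bigram_emb(pair_ids)

# The later transformer layers would then run on these bigram features
# in place of (or added to) the usual token embeddings, which is where
# any speedup would come from.
```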
Submitted by soraki_soladead t3_zmoxp7 in MachineLearning
soraki_soladead t1_iznhiei wrote
Reply to comment by aussie_punmaster in [R] Large language models are not zero-shot communicators by mrx-ai
Sure, but in the context of ChatGPT and how it was trained, this isn’t a surprising result.
soraki_soladead t1_izjq5j8 wrote
Reply to comment by abecedarius in [R] Large language models are not zero-shot communicators by mrx-ai
It seems obvious that the ambiguity comes from the framing of the question: the model has no way of knowing whether the person has been found, or when the question was posed to Juan. However, if you ask the model to explain Juan’s answer, that is a very different request.
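To make the contrast concrete, here’s roughly what I mean by the two framings (paraphrasing the paper’s Juan example from memory; the exact prompts in the paper may differ):

```python
dialogue = (
    'Esther asked "Can you come to my party on Friday?" '
    'and Juan responded "I have to work."'
)

# Framing 1: the model must resolve the implicature outright,
# with no information about timing or outcome.
prompt_yes_no = dialogue + " Did Juan say yes or no?"

# Framing 2: the model is asked to interpret the utterance itself,
# which is a very different (and arguably easier) request.
prompt_explain = dialogue + " Explain what Juan means by his answer."
```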
soraki_soladead t1_izjpxd6 wrote
Reply to comment by jcasper in [R] Large language models are not zero-shot communicators by mrx-ai
FWIW, it is very difficult to know whether the model has seen the task (or similar tasks) before, given the nature of the data collection.
I feel like "zero-shot" and "few-shot" have taken on a much less rigorous meaning when applied to LLMs.
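As an illustration of why it’s hard: a common contamination check is verbatim n-gram overlap between eval examples and the training corpus, something like the sketch below (my own toy version, not any lab’s actual pipeline; the 13-gram threshold is just a convention I’ve seen used):

```python
def ngrams(tokens, n=13):
    """All contiguous n-grams of a token sequence, as tuples."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def is_contaminated(example_tokens, train_ngrams, n=13):
    """Flag an eval example if any of its n-grams appears verbatim in the
    training corpus. This misses paraphrases and near-duplicates, which is
    exactly why "the model has never seen this task" is hard to establish."""
    return any(g in train_ngrams for g in ngrams(example_tokens, n))

# train_ngrams would be the union of ngrams(doc) over every training
# document; at web scale this can only be approximated (e.g. with
# hashing or Bloom filters).
```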
soraki_soladead t1_j375igy wrote
Reply to comment by Cpt_shortypants in [D] Is it a time to seriously regulate and restrict AI research? by Baturinsky
Regulating machine learning sounds ridiculous, but note that cryptography is regulated, and it also consists of just math and programming. For example, if you publish an app on the App Store that uses cryptography, you need to comply with export regulations: https://help.apple.com/app-store-connect/#/dev88f5c7bf9
Now, that’s for exports and publishing. Regulating personal use is much more difficult, but it’s still possible: perhaps by requiring a photo ID to download certain libraries or to requisition GPUs/TPUs.
Personally, I think it’s unlikely to happen, and the benefits of doing so would be minimal.