currentscurrents t1_jdmzphs wrote

That's true, but only for the given compute budget used in training.

Right now we're really limited by compute power, while training data is cheap. Chinchilla and LLaMA intentionally trade more data for smaller models at a fixed compute budget. Larger models still perform better than smaller ones given the same amount of data.

In the long run I expect this will flip; computers will get very fast and data will be the limiting factor.


currentscurrents t1_jdmyjrb wrote

Bigger models are more sample efficient, so they should need less data.

But - didn't the Chinchilla paper say bigger models need more data? Yes, but that's only true because right now compute is the limiting factor. They're intentionally trading off more data for less model size.

As computers get faster and models bigger, data will increasingly become the limiting factor, and people will trade off in the opposite direction instead.
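A back-of-the-envelope sketch of that trade-off, using the rough approximation that training compute C ≈ 6·N·D and Chinchilla's ~20-tokens-per-parameter rule (both constants are approximate, not exact figures from the paper):

```python
import math

def chinchilla_optimal(compute_flops: float) -> tuple[float, float]:
    """Given a training compute budget C (in FLOPs), return the roughly
    compute-optimal parameter count N and token count D, assuming
    C ~= 6*N*D and the Chinchilla rule of thumb D ~= 20*N."""
    n = math.sqrt(compute_flops / (6 * 20))
    d = 20 * n
    return n, d

# Plug in Chinchilla's own budget: ~70B params trained on ~1.4T tokens.
n, d = chinchilla_optimal(6 * 70e9 * 1.4e12)
print(f"params: {n:.2e}, tokens: {d:.2e}")  # recovers ~70B params, ~1.4T tokens
```

If data ever becomes the scarce resource instead of compute, you'd deviate from this rule in the other direction: train a bigger N for longer on the same D.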


currentscurrents t1_jdjc1hl wrote

I don't think this is a good test because these questions allow you to trade off knowledge for creativity, and LLMs have vast internet knowledge. It's easy to find listicles with creative uses for all of the objects in the test.

Now, this applies to human creativity too! If you ask me for an alternative use for a pair of jeans, I might say that you could cut them up and braid them into a rug. This isn't my creative idea; I just happen to know there's a hobbyist community that does that.

I think in order to test creativity you need constraints. It's not enough to find uses for jeans, you need to find uses for jeans that solve a specific problem.


currentscurrents t1_jdj9tsl wrote

Oh, definitely. I just checked ChatGPT and it's both aware of the existence of the test and can generate example question/answer pairs. This is a general problem when applying human psychology tests to LLMs.

It does help that this test is open-ended and has no right answer. You can always come up with new objects to ask about.


currentscurrents t1_jdf547h wrote

I expect it's more likely that people will run their own chatbots with proprietary content. (Even if just built on top of the GPT API)

For example you might have a news chatbot that knows the news and has up-to-date information not available to ChatGPT. And you'd pay a monthly subscription to the news company for it, not to OpenAI.
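A minimal sketch of how such a chatbot could work: retrieve the relevant proprietary article, then stuff it into the prompt sent to the GPT API. The function names and the naive keyword-overlap retrieval are my own illustration; a real service would use embeddings and a vector index:

```python
def retrieve(query: str, articles: list[str], k: int = 1) -> list[str]:
    """Naive keyword-overlap retrieval over the proprietary articles."""
    q = set(query.lower().split())
    scored = sorted(articles,
                    key=lambda a: len(q & set(a.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query: str, articles: list[str]) -> str:
    """Assemble the context-stuffed prompt you'd send to the GPT API."""
    context = "\n".join(retrieve(query, articles))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

news = ["The city council approved the new transit budget today.",
        "A storm is expected along the coast this weekend."]
print(build_prompt("What did the city council approve?", news))
```

The news company keeps the articles and the retrieval layer; OpenAI only ever sees the assembled prompt.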


currentscurrents t1_jdaq9xo wrote

Right, but you're still loading the full GPT-4 to do that.

The idea is that domain-specific chatbots might have better performance at a given model size. You can see this with StableDiffusion models, the ones trained on just a few styles have much higher quality than the base model - but only for those styles.

This is basically the idea behind mixture of experts.
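A toy forward pass showing the idea, with soft gating over two "experts" (illustrative only; production MoE layers route each token to the top-k experts and train the router jointly):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two tiny "experts": random linear layers the router can specialize.
experts = [rng.normal(size=(4, 4)) for _ in range(2)]
gate_w = rng.normal(size=(4, 2))  # router weights

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Mix expert outputs by a softmax gate computed from the input."""
    logits = x @ gate_w
    g = np.exp(logits - logits.max())
    g = g / g.sum()  # softmax: gate weights sum to 1
    return sum(gi * (x @ Wi) for gi, Wi in zip(g, experts))

y = moe_forward(rng.normal(size=4))
```

Each expert only has to be good on the inputs the router sends it, which is the same reason a style-specific StableDiffusion checkpoint beats the base model inside its niche.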


currentscurrents t1_jd10ab5 wrote

Llama.cpp uses the Neural Engine, and so does StableDiffusion. And the memory bandwidth is not that far off from VRAM, actually.

>Memory bandwidth is increased to 800GB/s, more than 10x the latest PC desktop chip, and M1 Ultra can be configured with 128GB of unified memory.

By comparison, the Nvidia RTX 4090 clocks in at ~1000GB/s.

Apple is clearly positioning their devices for AI.
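Memory bandwidth is the number that matters here because single-stream LLM decoding is bandwidth-bound: every generated token has to read every weight once, so tokens/s is capped at roughly bandwidth divided by model size. A rough estimate (the model sizes and fp16 assumption are mine):

```python
def max_tokens_per_sec(bandwidth_gb_s: float, params_billion: float,
                       bytes_per_param: int = 2) -> float:
    """Upper bound on decode speed if every weight is read once per token."""
    model_bytes = params_billion * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / model_bytes

print(max_tokens_per_sec(800, 65))   # M1 Ultra, 65B fp16: ~6 tokens/s ceiling
print(max_tokens_per_sec(1000, 65))  # 4090-class bandwidth: ~7.7 tokens/s ceiling
```

By this yardstick the M1 Ultra really is in the same league as a high-end GPU for LLM inference, despite being far behind on raw FLOPs.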


currentscurrents t1_jczods2 wrote

I'm hoping that non-von-Neumann chips will scale up in the next few years. There are some you can buy today, but they're small:

>NDP200 is designed to natively run deep neural networks (DNN) on a variety of architectures, such as CNN, RNN, and fully connected networks, and it performs vision processing with highly accurate inference at under 1mW.

>Up to 896k neural parameters in 8bit mode, 1.6M parameters in 4bit mode, and 7M+ In 1bit mode

An Arduino idles at about 10mW, for comparison.

The idea is that if you're not shuffling the entire network weights across the memory bus every inference cycle, you save ludicrous amounts of time and energy. Someday, we'll use this kind of tech to run LLMs on our phones.
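To put numbers on "ludicrous": using the commonly cited ~640 pJ per 32-bit DRAM access (Horowitz, ISSCC 2014) as a rough per-byte energy cost, just streaming the weights from DRAM dwarfs the NDP200's whole power budget:

```python
# Rough DRAM-traffic energy, assuming ~640 pJ per 32-bit access (~160 pJ/byte).
# The figure is a ballpark from the literature, not a measured value.
PJ_PER_BYTE_DRAM = 640 / 4

def dram_energy_joules(params: float, bytes_per_param: int = 1) -> float:
    """Energy to read every weight from DRAM once, i.e. one inference pass."""
    return params * bytes_per_param * PJ_PER_BYTE_DRAM * 1e-12

# Streaming a 7M-parameter (1-byte) network from DRAM once:
print(dram_energy_joules(7e6))  # ~1.1e-3 J per pass
```

At even one inference per second that's ~1mW spent purely on moving weights, which is why keeping them on-chip is the whole game at this power class.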


currentscurrents t1_jcqzjil wrote

I haven't heard of anybody running LLaMA as a paid API service. I think doing so might violate the license terms against commercial use.

>(or any other) model

OpenAI has a ChatGPT API that costs pennies per request. Anthropic also recently announced one for their Claude language model but I have not tried it.
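"Pennies" checks out against the launch price of $0.002 per 1K tokens for gpt-3.5-turbo (March 2023; check current pricing before relying on this):

```python
def request_cost_usd(tokens: int, price_per_1k: float = 0.002) -> float:
    """Cost of one API call at a flat per-1K-token price (launch-era pricing)."""
    return tokens / 1000 * price_per_1k

print(request_cost_usd(1500))  # a 1.5K-token exchange costs ~$0.003
```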


currentscurrents t1_jch9ulc wrote

Oh, it is clearly structured. Words and phrases and sentences are all forms of structure and we're using them right now.

What it doesn't have is formal structure; it cannot be fully defined by any set of rules. This is why you can't build a rules-based parser that understands English and have to use an 800GB language model instead.

>shared across essentially every language and dialect

Noam Chomsky thinks this, but the idea of a universal grammar is controversial in modern linguistics.