throwawaydthrowawayd t1_jdqisag wrote

Remember, the text of an LLM is literally the thought process of the LLM. Trying to have it instantly write an answer to what you ask makes it nigh impossible to accomplish the task. Microsoft and OpenAI have said that the chatbot format degrades the AI's intelligence, but it's the format that is the most useful/profitable currently. If a human were to try to write a sentence with 8 words, they'd mentally retry multiple times, counting over and over, before finally saying an 8 word sentence. By using a chat format, the AI can't do this.

ALSO, the AI does not speak English. It gets handed a bunch of vectors, which do not directly correspond to word count, and it thinks about those vectors before handing back a number. The fact that these vectors + a number directly translate into human language doesn't mean it's going to have an easy time figuring out how many vectors add up to 8 words. That's just a really hard task for LLMs to learn.
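
To make the token-versus-word mismatch concrete, here is a minimal sketch using a toy greedy tokenizer. The vocabulary and the `toy_tokenize` function are made up for illustration; real LLM tokenizers use much larger learned vocabularies and different splitting rules, but the effect is similar: the number of pieces the model sees is not the number of words.

```python
# Hypothetical toy vocabulary, invented for this example only.
VOCAB = ["count", "ing", "word", "s", " ", "is", "hard"]

def toy_tokenize(text, vocab=VOCAB):
    """Greedy longest-match split of text into vocabulary pieces.

    Characters not covered by the vocabulary become single-character tokens.
    """
    pieces = sorted(vocab, key=len, reverse=True)  # try longest pieces first
    tokens = []
    i = 0
    while i < len(text):
        for piece in pieces:
            if text.startswith(piece, i):
                tokens.append(piece)
                i += len(piece)
                break
        else:
            # No vocabulary piece matched at this position.
            tokens.append(text[i])
            i += 1
    return tokens

sentence = "counting words is hard"
tokens = toy_tokenize(sentence)
word_count = len(sentence.split())

print(tokens)
print(f"{word_count} words, {len(tokens)} tokens")
```

In this toy setup the 4-word sentence becomes 9 tokens, so "write 8 words" does not map onto any fixed number of tokens.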


RadioFreeAmerika OP t1_jdqky02 wrote

Ah, okay, thanks. I have to look more into this vector-number representation.

For the chatbot thing, why can't the LLM generate a non-displayed output, "test it", and try again until it is confident it is right and only then display it? Ideally, with a time counter that at some point lets it just display what it has with a qualifier. Or if the confidence still is very low, just state that it doesn't know.


throwawaydthrowawayd t1_jdqqsur wrote

> For the chatbot thing, why can't the LLM generate a non-displayed output, "test it", and try again

You can! There are systems designed around that. OpenAI even internally had GPT-4 using a multi-stage response system (a read-execute-print loop, they called it) while testing, to give it more power. There are also the "Reflexion" posts on this sub lately, where they have GPT-4 improve on its own writing. But, A, it's too expensive. Using a reflective system means lots of extra words, and each word costs more electricity.

And B, LLMs currently love to get sidetracked. They use the word "hallucinations" to say that the LLM just starts making things up, or acting like you asked a different question, or many other things. Adding an internal thought process dramatically increases the chances of LLMs going off the rails. There are solutions to this (usually, papers on it will describe their solutions as "grounding" the AI), but once again, they cost more money to do.

So that's why all these chatbots aren't as good as they could be. It's just not worth the electricity to them.
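
The "generate a hidden draft, test it, try again" idea can be sketched in a few lines. This is only an illustration under stated assumptions: `fake_llm` is a hypothetical stand-in that returns canned drafts, not a real model call, and `generate_with_check` is an invented helper, not how OpenAI's or Reflexion's systems actually work. Note that every retry is another full model call, which is where the extra cost comes from.

```python
def count_words(sentence):
    """Count whitespace-separated words."""
    return len(sentence.split())

def generate_with_check(draft_fn, target_words, max_attempts=5):
    """Call draft_fn until a draft has exactly target_words words.

    Returns (text, passed, attempts_used). If no draft passes within
    max_attempts, the last draft is returned with passed=False so the
    caller can show it with a qualifier or say it doesn't know.
    """
    last = ""
    for attempt in range(1, max_attempts + 1):
        last = draft_fn()  # hidden draft, not shown to the user
        if count_words(last) == target_words:
            return last, True, attempt
    return last, False, max_attempts

# Hypothetical stand-in for a model: returns canned drafts in order.
_drafts = iter([
    "The cat sat on the mat.",                     # 6 words
    "A quick brown fox jumps over the lazy dog.",  # 9 words
    "The old dog slept soundly by the fire.",      # 8 words
])

def fake_llm():
    return next(_drafts)

text, passed, attempts = generate_with_check(fake_llm, target_words=8)
print(text, passed, attempts)
```

Here the third draft passes, so the user would only ever see the 8-word sentence, at three times the cost of a single answer.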


RadioFreeAmerika OP t1_jdr46f0 wrote

Very insightful! Seems like even without groundbreaking stuff, more efficient hardware will likely make the solutions you mentioned more feasible in the future.


turnip_burrito t1_jdsoxo1 wrote

Yeah, we're really waiting for electricity costs to fall if we want to implement things like this in reality.

At the rough current rate of $0.10/(1000 tokens)/minute/LLM, it costs about $6 per hour to run a single LLM. If you have some ensemble of LLMs checking each other's work and working in parallel, say 10 LLMs, that's $60/hr, or $1440/day. Yikes, I can't afford that. And that ensemble will maybe have performance and problem solving somewhere between a single LLM and one human.

Once the cost falls by a factor of 100, that's $14.40/day. Expensive, but much more reasonable.
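
The arithmetic behind these figures, as a quick sketch. The only input is the rough $0.10 per minute per LLM assumed in this comment; `ensemble_cost` and its parameter names are made up for illustration.

```python
# Rough figure assumed in the comment: $0.10 per minute per running LLM.
COST_PER_LLM_PER_MINUTE = 0.10

def ensemble_cost(n_llms, hours, cost_reduction=1.0):
    """Dollar cost of running n_llms in parallel for the given hours.

    cost_reduction is a hypothetical factor by which prices have fallen.
    """
    return n_llms * COST_PER_LLM_PER_MINUTE * 60 * hours / cost_reduction

single_hour = ensemble_cost(1, 1)
ensemble_hour = ensemble_cost(10, 1)
ensemble_day = ensemble_cost(10, 24)
cheaper_day = ensemble_cost(10, 24, cost_reduction=100)

print(f"1 LLM, 1 hour:        ${single_hour:.2f}")    # $6.00
print(f"10 LLMs, 1 hour:      ${ensemble_hour:.2f}")  # $60.00
print(f"10 LLMs, 1 day:       ${ensemble_day:.2f}")   # $1440.00
print(f"10 LLMs, 1 day /100:  ${cheaper_day:.2f}")    # $14.40
```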


RadioFreeAmerika OP t1_jdufzz4 wrote

But even at $60/h, this might already be profitable if you replace a job that has a higher hourly wage, e.g. lawyers. At $14.40/day, you beat minimum wage. For toying around, yeah, that's a bit expensive.


turnip_burrito t1_jduhcoa wrote

Yeah, for an individual it's no joke.

For a business it may be worth it, depending on the job.