
igorhorst t1_jc372db wrote

> Without a clear path to increasing this vital metric, I struggle to see how modern generative AI models can be used for any important tasks that are sensitive to correctness.

My immediate response is "human-in-the-loop" - let the machine generate solutions and then let a human user validate the correctness of those solutions. That said, this relies on humans being competent enough to validate correctness, which may be a dubious proposition.
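
To make that concrete, here's a rough sketch of the kind of loop I have in mind - `generate_solution` is just a placeholder for whatever model you're calling, not a real API:

```python
# Minimal human-in-the-loop sketch. `generate_solution` stands in for any
# generative model call; it is a placeholder, not a real library function.

def generate_solution(prompt: str) -> str:
    raise NotImplementedError("plug in your model call here")

def human_approves(candidate: str) -> bool:
    answer = input(f"Proposed solution:\n{candidate}\nAccept? [y/N] ")
    return answer.strip().lower() == "y"

def solve_with_review(prompt: str, max_attempts: int = 3) -> str | None:
    for _ in range(max_attempts):
        candidate = generate_solution(prompt)
        if human_approves(candidate):
            return candidate   # the human, not the model, decides correctness
    return None                # escalate or fall back to a manual process
```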

Perhaps a better way forward is to take a general-purpose text generator and finetune it on a more limited corpus whose validity you can guarantee, then use that finetuned model on the tasks that are sensitive to correctness. This is the spirit behind the Othello-GPT paper - train a GPT-style model on transcripts of valid Othello games so that it generates legal Othello moves. You wouldn't trust this Othello-GPT to write code for you, but you don't have to - you would find a model finetuned on code and let that model generate code. It's notable that OpenAI's Codex models, such as "code-davinci-002", are GPT-3 models finetuned on code.
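
As a rough illustration of that workflow (using the Hugging Face `transformers` library; the file name, base model, and hyperparameters here are placeholders of mine, not anything from the Othello-GPT paper):

```python
# Hedged sketch of "finetune a general-purpose model on a narrow, validated corpus".
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

# "valid_games.txt": one validated game transcript per line (assumed to exist).
dataset = load_dataset("text", data_files={"train": "valid_games.txt"})
tokenized = dataset["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="narrow-gpt", num_train_epochs=3,
                           per_device_train_batch_size=8),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```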

This latter approach kinda reminds me of the Bitter Lesson:

>The bitter lesson is based on the historical observations that 1) AI researchers have often tried to build knowledge into their agents, 2) this always helps in the short term, and is personally satisfying to the researcher, but 3) in the long run it plateaus and even inhibits further progress, and 4) breakthrough progress eventually arrives by an opposing approach based on scaling computation by search and learning. The eventual success is tinged with bitterness, and often incompletely digested, because it is success over a favored, human-centric approach.

But the flipside of the Bitter Lesson is that building knowledge into your agent (via approaches like finetuning) will lead to better results in the short term. In the long term, solutions based on scaling computation by search and learning may outperform current solutions - but we shouldn't wait for the long term to show up. We have tasks to solve now, so it's okay to build knowledge into our agents. The resulting agents might become obsolete in a few years, but that's okay: we build tools to solve problems, we solve those problems, and then we retire those tools and move on.

>And certainly we are really far from anything remotely "AGI".

The issue is that we're dealing with "general intelligence" here, and just because a human is terrible at a bunch of subjects, we do not say that human lacks general intelligence. I generally conflate "AGI" with "general-purpose", and while ChatGPT isn't fully general-purpose (at the end of the day, it just generates text - though it surprises me how many tasks can be modeled and expressed as mere text), you could use ChatGPT to generate a bunch of candidate solutions. So I think we're close to getting general-purpose agents that can generate solutions for everything, but the timeline for getting correct solutions for everything may be longer.


buggaby OP t1_jc3dslx wrote

Great resources there, thanks.

I'm quite torn by the Bitter Lesson, since, in my eyes, the types of problems explored since the start of AI research have been, from one perspective, quite simple. Chess and Go (and indeed more recent examples like poker and real-time video games) can be easily simulated - the game is perfectly replicated in the simulation. And speech and image recognition are easily labelled by human labellers. But I wonder whether the goals we're now setting for modern algorithms are dramatically different.
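
To illustrate what I mean by "easily simulated": for a game, the simulator *is* the ground truth, so you can manufacture as many labelled games as you have compute for. A toy example with tic-tac-toe (my own stand-in, chosen for brevity):

```python
# Why perfectly-simulated games make data cheap: a toy random self-play generator.
# Each game yields a history of (state, move) pairs plus the final outcome,
# and any number of games can be produced on demand with no human labelling.
import random

WINS = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(board):
    for a, b, c in WINS:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def self_play_game():
    board, player, history = [" "] * 9, "X", []
    while winner(board) is None and " " in board:
        move = random.choice([i for i, cell in enumerate(board) if cell == " "])
        history.append((tuple(board), move))
        board[move] = player
        player = "O" if player == "X" else "X"
    return history, winner(board) or "draw"

# Generate as much labelled data as compute allows.
dataset = [self_play_game() for _ in range(10_000)]
```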

I quite like the take in this piece about how slowly human brains work and yet how complex they are. That describes a very different learning pattern from the one that results from the ever-increasing computational speed of computers. Humans learn from a relatively small number of exposures to a very complex stream of data (the experienced world), whereas algorithms have always relied on huge amounts of data (even simulated data, in the case of reinforcement learning). When that data is hard to simulate and hard to label, how can simply increasing the computation lead to faster machine learning?

I would argue that much of the world is driven by dynamic complexity, which highlights that data is only so valuable without knowledge of the underlying structure. (One example is the 3-body problem - small changes in the initial conditions result in very quick and dramatic changes in the future trajectory.)
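
To make the sensitivity point concrete without a full 3-body integrator, here's the same effect in the chaotic logistic map (a simpler stand-in I'm using for brevity, not something tied to the 3-body problem itself):

```python
# Sensitive dependence on initial conditions in the logistic map at r = 4.
# Two trajectories start one part in 10^10 apart and rapidly decorrelate.
def logistic(x, r=4.0):
    return r * x * (1.0 - x)

x, y = 0.2, 0.2 + 1e-10
for step in range(1, 51):
    x, y = logistic(x), logistic(y)
    if step % 10 == 0:
        print(f"step {step:2d}: |difference| = {abs(x - y):.3e}")
# After a few dozen steps the trajectories bear no relation to each other:
# more data about the current state only helps if you also know the
# underlying dynamics precisely.
```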

As an aside, I would argue that this is one reason that AI solutions have so rarely been used in healthcare settings: the data is so sparse compared with the complexity of the problem.

It seems to me that the value of computation depends on the volume, correctness, and appropriateness of the data. So many of the systems that we navigate, and that are important to us, have data that is hard to measure, noisy, and relatively sparse given the complexity of the system - and the system's future behaviour is incredibly sensitive to that noise.
