
Neurogence t1_j757fn0 wrote

So not all rumors are fake. People were saying that Semafor article was garbage.

This is interesting so far but not groundbreaking yet. I'm hoping the rumor that GPT-4 can code entire programs is also not fake.

48

tk854 OP t1_j75bo9e wrote

That rumor came from Connor Leahy, CEO of Conjecture, https://twitter.com/npcollapse/ . He is a serious person in the AI/ML world and has direct connections to Sam Altman and other big names, so it would be very unusual if the rumor were not true. When trying to interpret that rumor, though, you have to realize that being able to "code an entire program" could place GPT-4 anywhere on a scale from a high-school programmer who can write a basic CRUD app all the way to John Carmack, which actually doesn't tell us very much.

43

Neurogence t1_j75czbn wrote

Even if it had just the skill set of an entry-level programmer, it would result in massive societal effects and job displacement.

15

tk854 OP t1_j75e2fp wrote

I'm not so sure. 28 million programmers in the world means about 0.3% of all people on earth could be affected by job displacement, but only a small percentage of that 0.3% might lose their jobs, and the work that's being automated won't result in many visible changes except to the financial outlook of the companies that previously employed those workers. The programming abilities of LLMs like GPT-4 need to exceed human ability in a general way before the effects on society could be described as massive.
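The back-of-envelope math above can be sketched out quickly (the 28 million and 8 billion figures are the rough estimates used here, not exact data):

```python
# Rough share of the world population that works as a programmer.
programmers = 28_000_000
world_population = 8_000_000_000  # approximate

share = programmers / world_population
print(f"{share:.2%}")  # 0.35%, i.e. roughly the 0.3% quoted above
```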

3

Old-Owl-139 t1_j75feeu wrote

You are missing the point. The average "knowledge" worker doesn't go beyond Excel spreadsheets. Even an AI with the skills of the average high school graduate would cause massive disruption.

33

visarga t1_j764340 wrote

No, you're thinking AI can do this alone. Let me tell you - it can't. If it has a 1% error rate in information extraction from documents, you need to manually verify everything. Like Tesla's self-driving effort, being 99% of the way there is nothing groundbreaking.

I have been working on this very task for 5+ years. I know every paper and model there is. I tested all public APIs for this task. I extensively used GPT-3 for it, and that's my professional judgement.

As for validating AI output, it can be 10x more comfortable than manual information extraction, but it still requires about 50% of the manual effort. It is not suddenly making people 10x more effective.

Not even OCR is 100% accurate. The best systems have 95% accuracy on noisy document scans. One wrong digit or comma can make a whole transaction absurd; if you send that money without checking, you could go bankrupt.

The best models we have today are good at generating correct answers 90% of the time - code, factual questions, reasoning. They can do it all, but not perfectly. We don't know the risks and can't use this level of confidence without a human in the loop.
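To put that 90% in perspective: per-step accuracy compounds across a multi-step task (the step counts below are just illustrative):

```python
# If a model is right 90% of the time per step, the chance that a
# multi-step task completes with zero errors falls off quickly.
per_step_accuracy = 0.90

for steps in (1, 5, 10, 20):
    p_all_correct = per_step_accuracy ** steps
    print(f"{steps:2d} steps -> {p_all_correct:.1%} chance of no errors")
```

By 10 steps you're under 35%, which is why a human still has to check the output.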

13

X-msky t1_j765co5 wrote

You assume humans have 100% accuracy?

12

visarga t1_j76lslh wrote

Oh, I can tell you stories about human accuracy. At one point I re-labelled the same test set three times and was still finding errors. My models surpass untrained human accuracy, but they still need hand-holding; there's one error per page on average. Humans do more cross-checking and correlating, filling a gap in AI.

6

purepersistence t1_j7689y4 wrote

If you're debugging code, you don't have to be accurate until the problem is fixed; mistakes will be common. Accuracy is not absolutely necessary, but competence is. It will be a long damn time before something like ChatGPT can find and fix subtle bugs in a production system with many interacting services distributed across multiple computers, running software controlled by different corporations.

2

kai_luni t1_j76dmfm wrote

I agree with your point and think about it the same way. Even a great GPT-4 is useless if your Node.js app doesn't work and GPT-4 just gives up on it. A halfway decent software developer is capable of trying until it works. They'll sleep on it, keep trying, talk to other people about it, and learn along the way.


At some point your Node.js app will work and you'll be happy. The question is whether an AI will reach this level. Even an accuracy of 99.9% can still mean the app does not work. Can it fix the last ten bugs on its own? If not, you need to hire someone, a real person, to spend many days on this app.
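That 99.9% point can be made concrete with a quick estimate (the line count and error rate below are illustrative assumptions, not measurements):

```python
# Even a tiny per-line error rate leaves absolute bugs in a real codebase.
lines_of_code = 10_000   # a small-to-medium app, say
error_rate = 0.001       # 99.9% per-line accuracy

expected_bugs = lines_of_code * error_rate
print(f"~{expected_bugs:.0f} bugs left for someone to fix")
```

Ten remaining bugs is enough to keep the app broken until a human steps in.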


Maybe this new technology just leads to better code quality. It could streamline your spaghetti code and give it proper documentation. Maybe a salesperson could ask the AI, "Can our program do x and then y?" and the AI would answer, "Not yet, but with an estimated two weeks of development it could be possible." That would greatly improve information flow in companies.


So let's see if current machine learning can reach a level where it impacts the world. It's an exciting time to be alive.

2

tk854 OP t1_j76ks8y wrote

Your explanation is spot on. My one-line take is that a larger percentage of jobs are AGI-hard than most people assume. Take driving, for example.

I also think that a lot of people are underestimating how difficult most jobs are, even when it's a job that can be described as "just looking at a spreadsheet".

1

futebollounge t1_j76v5ov wrote

The context needed to understand the content of a spreadsheet versus a dynamic physical world (driving) is night and day in complexity.

3

Neurogence t1_j77a6ep wrote

It depends on how complex the program is. I think it will be much harder to have an AI that can code a program such as a browser entirely by itself versus a fully driverless car AI.

1

Stakbrok t1_j77dwl5 wrote

John Carmack? Meh, I'd put Fabrice Bellard at that end of the spectrum.

1

visarga t1_j763vpj wrote

That depends a lot on context window size: if it's 4K or 8K tokens like today, it won't cut it. For full-app coding you need to be able to load dozens of files.
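A rough estimate shows why 4K-8K tokens falls short for whole-app coding (the ~4 characters per token ratio is a common rule of thumb, and the file counts and sizes are made-up assumptions):

```python
# Estimate how many tokens a codebase needs versus the context window.
CHARS_PER_TOKEN = 4  # rough rule of thumb for code and English text

def estimated_tokens(file_sizes_chars):
    """Approximate token count for a list of file sizes (in characters)."""
    return sum(size // CHARS_PER_TOKEN for size in file_sizes_chars)

# Say an app has 30 source files averaging ~6,000 characters each.
files = [6_000] * 30
needed = estimated_tokens(files)
print(f"~{needed:,} tokens needed vs. an 8,192-token window")
```

Even a modest app overflows the window several times over, so the model can never "see" the whole codebase at once.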

Related to this - if we get, say, a 100K context size, we could just "talk to books" or "talk to scientific papers".

5