Comments


TFenrir t1_j9i5653 wrote

Holy shit, 32k token context? That is a complete fucking game changer. That's about 30k words. Current context length is about 4k tokens.

A simple example of why that is relevant: right now it's hard for a model to hold an entire research paper in its context; this could probably handle multiple research papers in context at once.

Code-wise... it's the difference between a 100-line toy app and something like 800 lines.

A context window this much larger also makes so many more apps easier to write, or fundamentally possible when they weren't before. Chat memory extends; short-story writing basically hits new heights. The average book has 500 words a page, ish - that's about 6-7 pages of context currently, jumping up to about 50.

Rule of thumb: 1 token ≈ 4 English characters, and the average English word is about 4.7 characters.
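
A quick back-of-the-envelope check in Python, using those rule-of-thumb numbers (rough conversions, not real tokenizer math):

    # Rough context math: ~4 chars per token, ~4.7 chars per English word,
    # ~500 words per book page.
    CHARS_PER_TOKEN = 4.0
    CHARS_PER_WORD = 4.7
    WORDS_PER_PAGE = 500

    def context_capacity(tokens):
        """Return (approx words, approx book pages) for a token budget."""
        words = tokens * CHARS_PER_TOKEN / CHARS_PER_WORD
        return words, words / WORDS_PER_PAGE

    for budget in (4_000, 32_000):
        words, pages = context_capacity(budget)
        print(f"{budget:>6} tokens ~ {words:,.0f} words ~ {pages:.0f} pages")
    # 4000 tokens ~ 3,404 words ~ 7 pages
    # 32000 tokens ~ 27,234 words ~ 54 pages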

145

turnip_burrito t1_j9i94b1 wrote

Now you got me excited about 2-3 years from now when the order of magnitude jumps 10x again or more.

Right now that's a good amount. But when it increases again by 10x, that would be enough to handle multiple very large papers, or a whole medium-sized novel and then some.

In any case, say hello to loading tons of extra info into short-term context to improve information synthesis.

You could also do computations within the context window by running mini "LLM programs" inside it while working on a larger problem, using the window as a scratch workspace. Something like the sketch below.
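
A totally hypothetical sketch of that idea (call_llm is a placeholder, not a real API):

    # Hypothetical: treat a big context window as a scratch workspace,
    # keeping source material plus intermediate results in one growing
    # prompt. call_llm() is a stand-in for whatever LLM API you use.

    def call_llm(prompt):
        # Placeholder: swap in a real LLM API call here.
        return f"<model output for a {len(prompt)}-char prompt>"

    papers = ["<full text of paper 1>", "<full text of paper 2>"]
    workspace = "\n\n".join(papers)

    # Run mini "LLM programs" against the shared workspace, appending
    # each result so later steps can build on earlier ones.
    steps = [
        "Summarize the key claims of each paper.",
        "List the points where the papers contradict each other.",
        "Propose an experiment that could resolve the contradictions.",
    ]
    for step in steps:
        result = call_llm(workspace + "\n\nTASK: " + step)
        workspace += f"\n\n[RESULT of: {step}]\n{result}"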

52

TFenrir t1_j9iaxdg wrote

I really want to see how coherent and sensible it can be at 32k tokens, combined with a fundamentally better model. Could it write a whole short story off a prompt?

24

turnip_burrito t1_j9ib4kx wrote

That's a really good question. I also want to see how reasoning, coherence, and creativity are affected by a large context length.

13

dasnihil t1_j9jzkpp wrote

Early glimpses of sophisticated and extremely coherent-sounding output.

1

visarga t1_ja8cjid wrote

There's much less long-form data to train on. That's problematic.

1

diabeetis t1_j9irqxa wrote

the order of magnitude will always jump by 10x

7

turnip_burrito t1_j9it0u6 wrote

No, could be 100x or 1000x.

3

redpnd t1_j9j6dl9 wrote

Those are two and three orders of magnitude.

5

turnip_burrito t1_j9j74rx wrote

Yes, I said "when the order of magnitude jumps by 10x or more".

Hence a jump by one order of magnitude, (10x), or two orders of magnitude (100x), or three orders of magnitude (1000x).

You can jump by more than one order of magnitude. diabeetis's comment is wrong, because the order of magnitude can grow by more than one order.

1

diabeetis t1_j9jb6wd wrote

I mean who cares but I think the standard way of expressing that thought would be "jumps by 2 or 3 orders of magnitude"

0

turnip_burrito t1_j9jbkjo wrote

I think the intended meaning is quite clear by context, but your point is taken.

3

grimorg80 t1_j9ke8k6 wrote

Two to three years?? It's gonna happen way sooner than that.

3

nexapp t1_j9m0ja0 wrote

>https://twitter.com/transitive_bs/status/1628118163874516992?s=20

Make that 100x the most likely estimate, given the present lightning-speed progression.

2

gONzOglIzlI t1_j9izs1r wrote

Am I the only one wondering how quantum computers will factor into all of this?
Feels like a hidden wild card that could expand the token budget exponentially.

0

turnip_burrito t1_j9j1pe8 wrote

Why would it expand the token budget exponentially?

Also, we have nowhere near enough qubits to handle these kinds of computations. The number of bits you need to run these models is huge (GPT-3 has ~175 billion, or ~10^11, parameters). Quantum computers nowadays are lucky to have around 10^3 qubits, and they decohere too quickly to be used for very long (about 10^-4 seconds). *Numbers pulled from a quick Google search.

That said, new (classical computer) architectures do exist that can use longer context windows: H3 (Hungry Hungry Hippos) and RWKV.

5

D2MAH t1_j9jb0sf wrote

I’m not quite sure if quantum computing is even needed for AGI. It’s so far behind and we’re so far ahead.

4

ChezMere t1_j9owekb wrote

Quantum computers work nothing like how you think they do, and are completely useless for AI (as well as almost all other classes of problem).

1

gONzOglIzlI t1_j9so4my wrote

"Quantum computers are completely useless for AI."
Bold prediction, we'll see how it ages.
Can't say I'm anything close to an expert, but I do have masters CS, was a competitive programmer and am a professional programmer now with 10y of xp.

0

GPT-5entient t1_j9l4ex1 wrote

32k tokens would mean approximately 150 kB of text. That is a decent-sized code base! Also, with this much context memory the known context-saving tricks would work much better, so this could theoretically be used to create code bases of virtually unlimited size.
This amazes me and (being a software dev) also scares me...
But, as they say, what a time to be alive!
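
If you're curious whether your own project would fit, here's a rough sketch (using the ~4 chars/token rule of thumb; real tokenizers will differ, especially on code):

    # Rough estimate of whether a code base fits in a 32k-token context.
    import os

    CHARS_PER_TOKEN = 4
    CONTEXT_TOKENS = 32_000

    def estimate_tokens(root, exts=(".py", ".js", ".ts", ".cs")):
        total_bytes = 0
        for dirpath, _, filenames in os.walk(root):
            for name in filenames:
                if name.endswith(exts):
                    total_bytes += os.path.getsize(os.path.join(dirpath, name))
        return total_bytes // CHARS_PER_TOKEN

    tokens = estimate_tokens(".")
    print(f"~{tokens:,} tokens; fits in 32k context: {tokens <= CONTEXT_TOKENS}")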

5

GoldenRain t1_j9j3fb1 wrote

I wonder how expensive each prompt is though.

4

GPT-5entient t1_j9l5ph0 wrote

From that table it looks like it will be 6x more expensive than ChatGPT's model: you need 600 units per instance vs. 100. Not sure how that translates into raw token cost, but it seems it's going to be more expensive once they expose serverless pay-as-you-go pricing. text-davinci-003 is $0.02 per 1k tokens, so this could be $0.12 per 1k tokens.
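
If that $0.12 per 1k tokens guess holds, per-query cost would look something like this (pure speculation on the rate):

    # Hypothetical per-query cost at a guessed $0.12 / 1k tokens
    # (vs. $0.02 / 1k tokens for text-davinci-003).
    PRICE_PER_1K_TOKENS = 0.12

    def query_cost(prompt_tokens, completion_tokens):
        return (prompt_tokens + completion_tokens) / 1000 * PRICE_PER_1K_TOKENS

    # A near-maxed-out query: 30k tokens of context + a 2k-token answer.
    print(f"${query_cost(30_000, 2_000):.2f}")  # $3.84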

5

YobaiYamete t1_j9k8s5j wrote

> That's about 30k words.

Dude, give me this, but for a Character.AI-style chat or NovelAI-style RP.

1

visarga t1_ja8c2yk wrote

I tested a paper quickly and it was 20K tokens in 200KB of text.

1

CustardNearby t1_j9ib3xo wrote

Stuff like this is the true start of the AI revolution. Once companies start partnering with OpenAI for their own private, highly specialized ChatGPT, it's over. Layoffs are gonna be massive.

50

ExtraFun4319 t1_j9ijmok wrote

As someone who actually works in a related field, is pretty familiar with the actual field of AI itself, and has met and knows people with all sorts of work backgrounds (which has given me insight into many fields), I am extremely doubtful that ChatGPT (in any capacity) will result in major layoffs.

The technology just isn't there, and I don't see it getting there (to the level where it'd cause the economic damage you're describing) anytime soon.

9

Electronic_Source_70 t1_j9iw65f wrote

The problem with this is that unless they work at Google or Amazon, it's hard to know, because info about the most advanced and powerful models hasn't been released to the public. Also, do they work in computer vision, ML, LLMs, or other deep learning fields? Are they AI engineers with actual credibility, or SEs who watch George Hotz or something? Because I don't believe you, unless your related field is neuroscience; if that's the case, I will shut up and hide in the corner.

5

turnip_burrito t1_j9j3k3y wrote

Neuroscience has basically no relationship to machine learning at this point (Neural networks are just """inspired"""^(TM) by neuroscience) so I wouldn't trust anyone but an AI specialist.

8

[deleted] t1_j9j9o40 wrote

Computational neuroscientists do; they use a lot of the same techniques, but for different purposes.

Plus, a lot of the leading research centres for computational neuroscience tend to also be involved in AI and machine learning.

0

turnip_burrito t1_j9ja3q5 wrote

What problems are the computational neuroscientists trying to solve? Modeling parts of brains using artificial neural networks (the ML kind)?

2

nexapp t1_j9m1k0m wrote

Oh, it will result in massive layoffs, no doubt about it. The whole point is to optimize and reduce redundant / costly workflows. If UBI doesn't catch up, this will most certainly lead to major political upheavals worldwide.

3

visarga t1_ja8dm4q wrote

That's a naive view that doesn't take the second-order effects into consideration. In 5-10 years companies will have to compete with more advanced products that use AI; a lot of that newfound AI productivity will be spent keeping level with the competition instead of raking in absurd profits. And lowering prices will help consumers.

2

NoidoDev t1_j9iuhtl wrote

Could it reduce the number of required people and create more competition by elevating some people using such tools? Could this be done remotely, maybe even without much knowledge of what the company does, so it could be outsourced? Could a combination of input into some AI-based system from the top and the bottom, with oversight from a much smaller number of middle managers, reduce how many of them are needed?

1

iamozymandiusking t1_j9kg9hp wrote

Of course it’s a huge unknown right now how all this will settle out, but it’s also worth remembering that computers were supposedly going to reduce the need for people, but it just upped expectations of productivity. Something similar will happen here. Certainly some jobs will be less valuable, and likely some skills will be more valuable, such as the ability to effectively direct AI tools to a desired result. And then some entirely new roles will come into existence.

0

Artanthos t1_j9labqn wrote

Depending on what you did, there was a massive wave of rightsizing in the 80s, just as computers were becoming more popular.

Things like secretarial pools went away.

Yes, programmers of various flavors came into high demand, eventually creating more jobs than were lost.

The difference is, this time you won't need more people to program the computers; you will need fewer. There will be no new positions created for those displaced.

0

feedmaster t1_j9m117q wrote

Of course ChatGPT won't, but GPT-4, 5, and 6 definitely will. GPT-4 is already coming this year and could be an order of magnitude better than ChatGPT. This change will come quickly.

1

IndependenceRound453 t1_j9iqtbg wrote

>Layoffs are gonna be massive.

No, they aren't, because what you're describing isn't gonna happen (at the very least, not with current/near-term AI). Be realistic.

The hopium on this sub is on another level. You guys upvote comments just because they sound pretty to you, even if they aren't the slightest bit rational.

9

Savings-Juice-9517 t1_j9ix6v8 wrote

Exactly. I'm a full-time programmer, and AI, at least in its current form, definitely improves my productivity, but it is nowhere near the level where it will replace programmers or software engineers. Less than 5% of a programmer's time is spent physically writing code, but this subreddit seems to think that's what programmers do all day.

−4

GPT-5entient t1_j9lbsfr wrote

>Less than 5% of a programmer's time is spent physically writing code

Not sure where you work, but I am a principal SDE with 16 YoE, and even though I spend most of my time in meetings, helping more junior team members, or just on communication in general, I try to shoot for 40-50% of my time actually writing code, and Copilot does help with that (I'd say maybe a 20% productivity increase). Even our dev manager probably spends more than 5% of his time writing actual code (though most dev managers don't, of course).

4

madali0 t1_j9j7siw wrote

As a non-programmer, I tried asking it to make one small change to an indicator in TradingView, and I had to keep feeding it the errors I got and doing additional searches on Google until I figured it out. In the end, all I changed was two lines of code, and it took me a long time.

Basically, what I mean is that the person giving the prompts already needs some programming knowledge to get help.

Even if it becomes more advanced, I bet you'd need workers who know how to give it prompts (or have dedicated AI prompters as a new position) to get the best outcome.

I think the same is true for AI art. I see great AI art online, but when I try it, it usually comes out far worse. That's when I realized that if they're going to replace some lower-level cheap artist, they'd still need an AI art prompter who knows what keywords and filters to give it to get the best art, and you'd probably also need someone to sort through the outputs to see which best fits their needs.

For basic stock pictures, it doesn't really change much. Imagine an outlet rushing out articles: they write one on how drinking water is healthy and need an image of a woman drinking water. It seems cheaper and easier to just pick one from their stock-image subscription.

And if they need something really unique and special for a main product, they can't just let some middle manager type a prompt and use that. They have to call the prompt guy (or, more realistically, they'll outsource it to an AI-generation company with humans who take in what the company needs, do the prompting, choose the best outputs, and edit them to deliver an image that fits).

Basically, for every job they do away with, they'll just create a new human need.

1

[deleted] t1_j9j1pqt wrote

[deleted]

−4

Savings-Juice-9517 t1_j9j43yd wrote

You completely bypassed the points being made and instead tried to make a pedantic semantics argument about two terms that are two sides of the same coin.

0

[deleted] t1_j9j4e9w wrote

[deleted]

1

madali0 t1_j9j7z6a wrote

I'm sorry, but as an AI Language Model, I do not have feelings to be considered "okay". Is there any other question I can help you with?

5

Glad_Laugh_5656 t1_j9igems wrote

If you believe that, then you either aren't very familiar with labor or aren't very familiar with ChatGPT (or both).

I agree with funprize (another redditor who also replied to your comment) that while a future version could someday be a threat to a lot of workers, this one (even if it's fine-tuned) probably won't be.

3

bigseamonsters t1_j9l9dxb wrote

>Could it reduce the number of required people and create more competition by elevating some people using such tools? Could this be done remotely, maybe even without much knowledge...

yeah there's a solid 7-8 years left, nobody should be worried! /s

2

xott t1_j9ihhe1 wrote

Goodbye, middle management roles.

3

TheOneTrueEris t1_j9iscgr wrote

Actually, those people-management roles are probably among the safer ones.

−1

redroverdestroys t1_j9if238 wrote

This stuff is so much fun to read about. Literally none of my real life friends or online gaming ones give a shit about any of this. But I guess none of them seem to be dreamers or creative types either.

47

Akimbo333 t1_j9hzzkg wrote

What are the benefits?

9

maskedpaki t1_j9i1by1 wrote

32k context !!!!!!!

That means 8x ChatGPT, which can remember the last 4k tokens, or roughly 3,000 words.

GPT-4 can remember (in the larger-context version) roughly 24,000 words!

39

Akimbo333 t1_j9i654s wrote

Awesome! Yeah, memory is very much needed!!!

4

arindale t1_j9i16cu wrote

Dedicated compute allows companies to build without worrying that another developer (or the community at large) uses up too much of OpenAI's compute.

Allowing custom models lets companies fine-tune their own models using ChatGPT as a base. Think of ChatGPT as a university grad: intelligent and broadly capable, but not necessarily a specialist in the exact tasks a company may need. But what if that company could train it on 100,000 samples of those tasks?

I am not an AI expert. I welcome anyone to correct me here.
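
For what it's worth, OpenAI's existing fine-tuning flow takes prompt/completion pairs in a JSONL file, something like the sketch below; whether the custom ChatGPT-based models described here use the same format is my assumption, not something the announcement says.

    # Sketch: preparing fine-tuning data in the JSONL prompt/completion
    # format OpenAI's current fine-tuning API accepts. Assumes the new
    # custom models keep a similar interface.
    import json

    samples = [
        {"prompt": "Customer: Where is my order?\nAgent:",
         "completion": " Let me look that up for you right away."},
        # ... up to ~100,000 more task-specific examples ...
    ]

    with open("train.jsonl", "w") as f:
        for s in samples:
            f.write(json.dumps(s) + "\n")

    # Then, with the current OpenAI CLI:
    #   openai api fine_tunes.create -t train.jsonl -m davinci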

12

Akimbo333 t1_j9i3c13 wrote

Oh ok cool! So it could be a doctor GPT or something like CharacterAI GPT.

4

blueSGL t1_j9i64l3 wrote

CallcenterGPT

12

ghaj56 t1_j9im5sy wrote

Please stay on the line after this call to take a brief survey to determine if I have been a good Bing

5

Most-Inflation-1022 t1_j9lk5h0 wrote

Yep, this will be the first widespread use of it. Fine-tuned ChatGPT will most likely generate all written communication. This will reduce the workforce need by at least 50%. And it's only the start.

2

Wyrade t1_j9ig088 wrote

Can someone explain the pricing for me?

I'm just a random guy, not in business, but I'm curious.

For example, is GPT-3.5 Turbo 100 computing units per instance, costing $26/month PLUS $260/month per unit, meaning $26,026 per month for a single instance? And how many people can a single instance serve?

The above doesn't sound right to me, but I'm confused.

5

maskedpaki t1_j9ilkeq wrote

It seems like it's 6 times more expensive than ChatGPT, given the relative pricing.

Since ChatGPT costs a few cents per query, we can expect this to cost a few tens of cents per query.

But the queries can have something like 20,000 words of context, which allows it to write whole programs, entire research papers, short books, etc.

10

jeffkeeg t1_j9ip8yp wrote

Did anyone manage to grab the full google doc before it got deleted?

5

MrEloi t1_j9jneec wrote

Hold your horses!

Research has shown that excessively large contexts can degrade model performance.

That said, I have no idea where the token-size boundary between amazing and broken lies.

5

Lawjarp2 t1_j9ivxdr wrote

So the last one with 32k token context could be GPT-4, or GPT-4 will at least have a 32k token context.

3

Wroisu t1_j9i8tx1 wrote

Damn.

2

BuildingCastlesInAir t1_j9iqaph wrote

Can you train the model on proprietary data? For example, behind a company's firewall, on its wikis, chats, and knowledge bases, so that when you query, you're getting data from inside the company? If not, what data is used to train GPT-4? I think the larger this gets, the more garbage in, garbage out. What's the governance model for the accuracy of the data? How is it scored?

2

Yngstr t1_j9k1dpf wrote

Just to be clear, from a developer perspective, is this just "unlimited" API access?

2

BPlansai t1_j9keq38 wrote

Damn, this is exciting!

1

mrfreeman93 t1_j9yy6pw wrote

One step closer to decent short-term memory is increasing input length. If it works, that's great. There have been architectures like Longformer for a while, though.

1

No_Ninja3309_NoNoYes t1_j9ix5p4 wrote

32K context is the new 640K RAM. The bigger the model, the more resources you need to support it, and the more expensive it gets, with no guarantee about the quality. For example, ChatGPT would produce code like:

    int result = num1 + num2; return result;

That's not technically wrong in itself, but it is unnecessarily long (it could just be return num1 + num2;). Any static analysis tool would have nagged about this. Also, unit tests or compilers would have caught any actual errors. The OpenAI culture is one of PhDs with a certain background; they work in Jupyter notebooks and don't know about standard dev tools.

My friend Fred says he can add value with his code-generation startup because of that. I also think that LLMs combined with more traditional technology are the way to go.

−4

DonOfTheDarkNight t1_j9jgped wrote

It's interesting to see how dismissively you claim that PhDs can't use standard dev tools.

6