
EntireContext OP t1_iynldwj wrote

It remembered previous prompts when I talked about them.

4

red75prime t1_iynlrrc wrote

Make sure the prompt you're testing is 2,000-3,000 words before the question, so it falls outside the likely context window.

7

EntireContext OP t1_iynm425 wrote

No idea what the context window is, but at the end of the day they can just increase it....

It's already commercially useful right now. It doesn't need a bigger context window to be more useful (though the context window will keep growing anyway); it needs qualitatively better intelligence.

0

red75prime t1_iynpyob wrote

It's not feasible to just keep increasing the context window, due to the quadratic growth of the required computation.

> It doesn't need more context window to be more useful

It needs memory to be significantly more useful (as in large-scale disruptive) and, probably, other subsystems/capabilities (error detection, continual learning). Its current applications require significant human participation and scaling alone will not change that.

14

EntireContext OP t1_iynq6u8 wrote

I mean the context window will increase with upcoming models. GPT-1 had a smaller context window than ChatGPT.

2

ChronoPsyche t1_iyp04j8 wrote

It will increase, but the pace of increases will slow without major breakthroughs. You can't predict the rate of future progress solely from the rate of past progress in the short term.

You guys take the "exponential growth" stuff way too seriously. That refers to technological growth over the span of human history; individual time scales don't follow the exact same growth pattern. If they did, we'd have reached the singularity long ago.

Bottlenecks sometimes occur in the short term and the context-window problem is one such bottleneck.

Nobody doubts that we can solve it eventually, but we haven't solved it yet.

There are potential workarounds, like external memory systems, but those are only partial fixes that enable more modest effective context increases. External-memory systems are not feasible for AGI because they are far too slow and don't scale well dynamically, not to mention they sit outside the neural network itself.
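For what that workaround looks like in practice, here's a minimal sketch of an external memory that embeds past exchanges and re-inserts only the most relevant ones into the prompt. The `toy_embed` function and the linear-scan store are invented stand-ins for illustration, not any particular library's API:

```python
import numpy as np

def toy_embed(text, dim=64):
    # Toy stand-in for a real embedding model: hashed bag-of-words.
    v = np.zeros(dim)
    for word in text.lower().split():
        v[hash(word) % dim] += 1.0
    return v

class ExternalMemory:
    """Embed past exchanges; retrieve only the most relevant ones."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn   # maps text -> fixed-size vector
        self.entries = []          # list of (vector, text) pairs

    def store(self, text):
        self.entries.append((self.embed_fn(text), text))

    def retrieve(self, query, k=3):
        q = self.embed_fn(query)

        # Cosine similarity against every stored entry. This linear
        # scan over all of memory is the scaling problem noted above.
        def score(entry):
            v, _ = entry
            return np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v) + 1e-9)

        return [text for _, text in sorted(self.entries, key=score, reverse=True)[:k]]

def build_prompt(memory, question):
    # Only the retrieved snippets re-enter the context window, so the
    # model never attends over the full conversation history.
    context = "\n".join(memory.retrieve(question))
    return f"Relevant history:\n{context}\n\nQuestion: {question}"

mem = ExternalMemory(toy_embed)
mem.store("The user's name is Alex and they like chess.")
mem.store("We discussed transformer context windows yesterday.")
print(build_prompt(mem, "What is the user's name?"))
```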

In the end, we either need an algorithmic breakthrough or quantum computers to solve the context-window problem as it relates to AGI. An algorithmic breakthrough is more likely to arrive before quantum computers become viable. If it doesn't, we may be waiting a long time for AGI.

Look into the concept of computational complexity if you want to better understand the issue we are dealing with here.

2

ReadSeparate t1_iynt062 wrote

They can’t just increase it. The transformer’s self-attention is O(n^2) in the context length, so the total compute grows quadratically, not linearly, as tokens are added.

This is an architectural constraint of transformers. We’ll either need a better algorithm than transformers, or a way to encode/decode important information to, say, a database and insert it back into the prompt when it’s required.
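To make that O(n^2) constraint concrete, here's a bare-bones single-head self-attention in NumPy (a toy sketch, no masking or batching); the (n, n) score matrix is the quadratic term:

```python
import numpy as np

def self_attention(Q, K, V):
    # Q, K, V: (n, d) matrices for a sequence of n tokens.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)      # (n, n) -- every token vs. every token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                 # (n, d)

n, d = 1024, 64
x = np.random.randn(n, d)
out = self_attention(x, x, x)
# Doubling n to 2048 quadruples the (n, n) score matrix: compute and
# memory scale with n**2, which is why the window can't "just" grow.
```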

9

EntireContext OP t1_iyntah2 wrote

Well, they'll make a better algorithm than transformers then (transformers have already been refined into Performers and the like).

At any rate, I still see AGI in 2025.

−4

EpicMasterOfWar t1_iyo3tr2 wrote

Based on what?

3

EntireContext OP t1_iyo9fg4 wrote

The difference between what was possible in 2019 and what the models can do now.

Back when GPT-2 was out it could barely produce coherent sentences.

This ChatGPT model does make mistakes, but it always speaks in a coherent way.

0

ReadSeparate t1_iyo883j wrote

I do agree with this comment. It’s plausible that long-term memory isn’t required for AGI (though I think it probably is), or that hacks like reading/writing to a database will be able to simulate long-term memory.

I think it may take longer than 2025 to replace transformers, though. They’ve been around since 2017 and we haven’t seen any really promising candidates yet.

I can definitely see a scenario where GPT-5 or 6 has prompts built into its training data which are designed to teach it to utilize database reads/writes.

Imagine it says hello to you by name after seeing your name only once, six months ago. It could emit a read-database token with sub-input tokens that fetch your name from a database based on some sort of identifier.

It could probably get really good at doing this too if it’s actually in the training data.
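A toy version of that read-token idea might look like this (the `<READ:key>` format and the wrapper are invented for illustration, not anything from an actual GPT system):

```python
import re

# Invented convention: the model emits <READ:key> tokens in its raw
# output, and a wrapper resolves them against a key-value store
# before the text reaches the user.
user_db = {"user_1234:name": "Alex"}

def resolve_reads(model_output, db):
    # Replace each <READ:key> with the stored value (blank if the
    # key was never written by an earlier write).
    return re.sub(r"<READ:([^>]+)>", lambda m: db.get(m.group(1), ""), model_output)

print(resolve_reads("Hello <READ:user_1234:name>, welcome back!", user_db))
# -> "Hello Alex, welcome back!"
```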

Eventually, I could see the model using its coding knowledge to design the database/prompting system on its own.

2

ChronoPsyche t1_iyp084x wrote

Eventually, sure, but without knowledge of specific breakthroughs arriving very soon, your 2025 estimate is an uninformed guess at best.

1

EntireContext OP t1_iyskmjg wrote

I don't see a need for specific breakthroughs. I believe the rate of progress we've been seeing since 2012 will get us to AGI by 2025.

0

ChronoPsyche t1_iytra7q wrote

Well, you can believe whatever you want, but you're not basing those beliefs on anything substantive.

Honestly, the rate of progress since 2012 has been very slow. It's only in the past few years that things have picked up substantially, and that was only because of recent breakthroughs with transformer models.

That's how the history of AI progress has generally worked: a breakthrough leads to a surge of progress, the surge eventually plateaus and stalls as bottlenecks are hit, and then a new breakthrough sets off another surge.

It's not guaranteed there will be another plateau before AGI, but we're gonna need new breakthroughs to get there, because as I said, we are approaching bottlenecks with the current technology that will slow down the rate of progress.

That's not necessarily a bad thing, by the way. Our society isn't currently ready to handle AGI. It's good to have some time pass to actually integrate the new technology rather than developing it faster than we can even use it.

1