Submitted by terserterseness t3_10fxryj in MachineLearning

All the examples from langchain and on Hugging Face create memory by pasting the entire history into every prompt. That runs into the max input length pretty quickly, and it's expensive. Does ChatGPT use something revolutionary? It forgets everything when you create a new session, so it 'feels' like it's using the conversation itself as memory too.
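For concreteness, the pattern I mean looks roughly like this (a minimal sketch of the langchain-style examples, not any library's actual code):

```python
# Naive "memory": resend the entire transcript with every prompt.
# The prompt grows with the conversation, so it hits the model's
# max input length quickly, and every call re-pays for the history.

history = []

def ask(user_message, llm_call):
    """llm_call is any completion function: prompt str -> reply str."""
    history.append(f"User: {user_message}")
    prompt = "\n".join(history) + "\nAssistant:"
    reply = llm_call(prompt)
    history.append(f"Assistant: {reply}")
    return reply
```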

But then the question: how do they get past prompt limits? Chunking doesn't help, since the model still gets no context between chunks. Maybe they ask the same question with different chunks many times and then ask for a final result?

Apologies if this was answered somewhere, I cannot find it at all and all examples use the same kind of history memory.

35

Comments

DaLameLama t1_j4zhqqj wrote

Does ChatGPT actually get past the token limit? Codex supports ~8000 tokens. You might underestimate how much this is. Has anyone tested the limits?

Unfortunately, OpenAI aren't serious about publishing technical reports anymore.

30

andreichiffa t1_j50x4ky wrote

The reported context size is 2048 tokens, but they likely apply a hard attention mask beyond that. In words that's roughly a quarter less, since an average word is more than one token.
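A rough way to sanity-check the token/word ratio, using the open-source tiktoken tokenizer (GPT-2's BPE is an assumption here; ChatGPT's exact tokenizer isn't public):

```python
import tiktoken  # pip install tiktoken

# GPT-2's BPE as a stand-in for whatever ChatGPT actually uses.
enc = tiktoken.get_encoding("gpt2")

text = "An average English word comes out to a bit more than one token."
tokens = enc.encode(text)
print(len(text.split()), "words ->", len(tokens), "tokens")
```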

9

EmmyNoetherRing t1_j510553 wrote

>Unfortunately, OpenAI aren't serious about publishing technical reports anymore.

Do OpenAI folks show up to any of the major research conferences? These days I mostly come into contact with AI when it wanders into the tech policy/governance world, and this seems like the sort of work that would get you invited to an OSTP workshop, but I'm not sure if that's actually happening.

OpenAI's latest not-so-technical report (on their website) has a few folks from Georgetown contributing to it, and since AAAI is in DC in a few weeks I was hoping OpenAI would be around and available for questions in some capacity, in some room at the conference.

5

DaLameLama t1_j519tns wrote

There was an OpenAI party at NeurIPS, but I wasn't there. No clue about AAAI :)

4

EmmyNoetherRing t1_j51cvjh wrote

Yeah, as an uninformed guess it seems like IJCAI or NeurIPS would be a more natural home, but AAAI is actually in DC, which seems helpful for some categories of conversation, if the right people attend.

3

EmmyNoetherRing t1_j50zesm wrote

I've heard a variety of folks talk about leaving ChatGPT tabs/sessions open for days or weeks and maintaining context plausibly well throughout.

3

Daos-Lies t1_j4zpwjr wrote

This is just a suspicion, but I think it's a matter of embedding the conversation and using that embedding as an input alongside your most recent question (which is just classic recurrence, really).

I'm fairly confident the mechanism is something along those lines, because they made a relatively big fuss about their new embedding service around the same time ChatGPT was released (though obviously that didn't get as much attention as ChatGPT itself).
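A minimal sketch of that suspicion, using their embeddings endpoint (text-embedding-ada-002 is an assumption; how the embedding would actually condition the next generation step is the speculative part):

```python
import openai

def embed(text):
    # The embeddings endpoint OpenAI announced around the ChatGPT launch.
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=[text])
    return resp["data"][0]["embedding"]

# Compress the running conversation into one vector and carry it forward
# next to the newest question -- recurrence in spirit. How that vector
# would feed back into the model is the unknown part.
state = embed("transcript of the conversation so far")
```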

(And in response to u/DaLameLama asking if ChatGPT goes past the token limit: yes, it deffo can go past 8000 tokens; I have had some v v v long conversations with it.)

21

IntelArtiGen t1_j4zr3iq wrote

Yeah, that's also what I would say. I doubt it's anything revolutionary, as it's likely not necessary. It might be an innovative use of conversation embeddings, but I wouldn't call that "revolutionary".

They probably don't use a single embedding for the whole conversation; perhaps they use one embedding per prompt and/or keep some tokens in memory.

1

MysteryInc152 t1_j50pw6e wrote

With embeddings, it theoretically shouldn't have a hard limit at all. But the experiments here suggest a sliding context window of 8096 tokens:

https://mobile.twitter.com/goodside/status/1598874674204618753?t=70_OKsoGYAx8MY38ydXMAA&s=19
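Client-side, a sliding window like the one those experiments suggest would look something like this (a sketch; the 8096 figure comes from the linked thread, and the GPT-2 tokenizer is a stand-in):

```python
import tiktoken

enc = tiktoken.get_encoding("gpt2")  # stand-in tokenizer
WINDOW = 8096  # figure from the linked experiments; unverified

def sliding_window(transcript: str) -> str:
    # Keep only the most recent WINDOW tokens; anything earlier
    # silently falls out of the model's view.
    tokens = enc.encode(transcript)
    return enc.decode(tokens[-WINDOW:])
```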

6

Daos-Lies t1_j50vdq9 wrote

That is indeed fair enough.

Big fan of the concept of screaming at it until it forgets ;)

And I suppose it's very possible that, over the course of my 'v long conversations with it', topics repeated at points (which I'm sure they did), and that could have fooled me into thinking it was remembering things from right at the start.

2

Czl2 t1_j4zqan4 wrote

Ask the model to summarize whatever is about to be cut off as you slide the token window, and replace what is lost with that summary? That way the token window always carries a summarized version of what's missing.
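Something like this, as a sketch (summarize is a placeholder for any completion call; character counts stand in for token counts to keep it self-contained):

```python
def compact(history: str, window: int, summarize) -> str:
    """Keep the last `window` characters verbatim; summarize the rest.

    `summarize` is any LLM call, e.g. a completion request that asks
    "Summarize this conversation: ...".
    """
    if len(history) <= window:
        return history
    cut, kept = history[:-window], history[-window:]
    return f"[Earlier conversation, summarized: {summarize(cut)}]\n{kept}"
```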

8

wind_dude t1_j50pmcc wrote

I would suspect something similar to BlenderBot 2 from Meta and ParlAI.

Chat memory is searched for relevant information, which is sent to the decoder for the final output.
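As a sketch of that retrieve-then-generate pattern (not BlenderBot's actual code; the embeddings and top-3 cutoff are assumptions):

```python
import numpy as np

def retrieve(query_vec, memory, k=3):
    # memory: list of (text, embedding) pairs saved from past turns.
    # Score each stored turn against the current query, keep the top k.
    scored = sorted(memory, key=lambda m: -np.dot(query_vec, m[1]))
    return [text for text, _ in scored[:k]]

# The retrieved snippets are then fed to the decoder (BlenderBot 2)
# or simply prepended to the prompt, instead of the whole history.
```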

https://medium.com/ai-network/is-there-a-chatbot-that-goes-beyond-the-gpt-3-blenderbot-2-0-17e42e674824

https://ai.facebook.com/blog/blender-bot-2-an-open-source-chatbot-that-builds-long-term-memory-and-searches-the-internet/

So it's in the model architecture.

5

drumnation t1_j52yrfo wrote

The API docs don't seem clear on how to recreate the same session memory as in the main app. It looked to me as if it uses stop sequences to achieve this, but I'm still trying to figure out how to emulate conversation memory.
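For what it's worth, the common way to emulate it against the completions API looks like this (a sketch; the Human/AI labels and the stop sequence are conventions, not documented ChatGPT internals):

```python
import openai

history = "The following is a conversation with an AI assistant.\n"

def chat(user_message: str) -> str:
    global history
    history += f"Human: {user_message}\nAI:"
    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt=history,
        max_tokens=256,
        stop=["Human:"],  # stop sequence keeps it from writing both sides
    )
    reply = resp["choices"][0]["text"].strip()
    history += f" {reply}\n"
    return reply
```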

2

kvutxdy t1_j531msz wrote

I asked ChatGPT, and it said an RNN is used in the system as well (probably not true).

1