patient_zer00 t1_izuqszr wrote
It doesn't remember stuff on its own; it's mostly the web app that remembers. It resends the previous messages along with your current one (check the Chrome network logs), then presumably concatenates the prompts and feeds them to the model as one input.
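A minimal sketch of what that client-side behavior might look like, assuming a hypothetical `build_prompt` helper (all names here are illustrative, not OpenAI's actual API):

```python
def build_prompt(history, new_message, max_chars=4000):
    """Join prior turns with the new message into one prompt string,
    dropping the oldest turns if the result gets too long."""
    turns = history + [("User", new_message)]
    # Drop oldest turns until the prompt fits a rough character budget.
    while turns and sum(len(r) + len(t) + 3 for r, t in turns) > max_chars:
        turns.pop(0)
    return "\n".join(f"{role}: {text}" for role, text in turns)

history = [("User", "What is a transformer?"),
           ("Assistant", "A neural network architecture based on attention.")]
prompt = build_prompt(history, "Can you give an example?")
print(prompt.splitlines()[0])  # "User: What is a transformer?"
```

The model itself stays stateless; the "memory" is just the app replaying the transcript each turn.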
master3243 t1_izv48yc wrote
This is it, they have a huge context size and they just feed it in.
I've seen discussion on whether they use some kind of summarization to fit more context into the same model size, but that's only speculation at this point.
In either case, it's nothing we haven't seen in recent papers here and there.
maxToTheJ t1_izvltcw wrote
It probably does some basic checks for adversarial text, like long runs of AAAAAAAAA*, BBBBBBBBBBBBB*, [[[[[[[[*, or repeated profanity, and preprocesses the text before feeding it in.
EDIT: Only mentioning this since some folks will argue ChatGPT has a crazy long memory (10K tokens) because you can sandwich stuff around ~9.5K tokens of trivial repetition. They likely also added defenses against various basic prompt-engineering attacks so people can't get it to say certain things.
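One way such a preprocessing check could work is collapsing absurdly long runs before the text reaches the model. This is pure speculation about a possible defense, not OpenAI's actual pipeline:

```python
import re

def collapse_runs(text, max_run=8):
    """Collapse long runs of a repeated character or word down to max_run."""
    # Collapse runs of a single repeated character (e.g. "AAAAAA...").
    text = re.sub(r'(.)\1{%d,}' % max_run,
                  lambda m: m.group(1) * max_run, text)
    # Collapse runs of a repeated word (e.g. "profanity profanity ...").
    text = re.sub(r'\b(\w+)(?:\s+\1\b){%d,}' % max_run,
                  lambda m: ' '.join([m.group(1)] * max_run), text)
    return text

print(collapse_runs("A" * 100))  # "AAAAAAAA"
```

Under this sketch, a 9.5K-token repetition sandwich would shrink to a few tokens, which would explain the "long memory" illusion.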
zzzthelastuser t1_izx8k9l wrote
> I've seen discussion on whether they use some kind of summarization to be able to fit more context into the same
They could unironically use ChatGPT for this task.
master3243 t1_izxkwzt wrote
True, feeding an LLM's own summary or embedding of the past back to the same LLM as condensed context is a technique I've seen done before.
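The idea above can be sketched roughly like this: compress older turns into a short summary (here a trivial stand-in; in practice you might call the LLM itself) and prepend it so recent turns still fit in the context window. Everything here is illustrative, not a known implementation:

```python
def summarize(turns):
    # Stand-in summarizer: keep only the first sentence of each turn.
    # In practice this call would go to the LLM itself.
    return " ".join(t.split(".")[0] + "." for t in turns)

def pack_context(turns, keep_recent=2):
    """Keep the last few turns verbatim; summarize everything older."""
    old, recent = turns[:-keep_recent], turns[-keep_recent:]
    parts = []
    if old:
        parts.append("Summary of earlier conversation: " + summarize(old))
    parts.extend(recent)
    return "\n".join(parts)
```

The trade-off is lossy memory: the model only sees the summary of early turns, so fine details from the start of a long chat can silently disappear.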
p-morais t1_izvyzit wrote
It’s InstructGPT, which is based on GPT-3.5 with RLHF. People have reverse-engineered that it uses a context window of 8,192 tokens and is primed with a special prompt.
rePAN6517 t1_izw4vqj wrote
Need a source for the 8192 context window. Last I heard it was 4000.
f10101 t1_izwhslb wrote
To confirm, 4k tokens is indeed what their FAQ says. https://help.openai.com/en/articles/6787051-does-chatgpt-remember-what-happened-earlier-in-the-conversation
sandboxsuperhero t1_izw2k3k wrote
Where did you see this? text-davinci-003 (which seems to be GPT-3.5) has a context window of ~4,000 tokens.
42gauge t1_izw3a1a wrote
What's the special prompt?
029187 OP t1_izveeec wrote
That is surprisingly clever.
MaceGrim t1_izvnq8t wrote
It’s definitely some form of large language model implemented as a transformer neural network. GPT references the large language models OpenAI previously built (GPT-3), and it’s likely that ChatGPT is a fine-tuned and/or optimized version dedicated to chatting.
Duckdog2022 t1_izvtrg2 wrote
Pretty unlikely it's that simple.
p-morais t1_izvz91t wrote
Not “pretty unlikely”. The architecture is literally in the name: Generative Pretrained Transformer
5erif t1_izwnbsq wrote
Their comment was colloquially synonymous with
> I doubt it's that simple.
Your comment could just as easily have started with
> You're right, it's not that simple.
But reddit is what you might call a generative adversarial network.
blablanonymous t1_izx7gp9 wrote
Decision tree of life