
j4nds4 t1_iu25gly wrote

Isn't this similar to something that was done with AI Dungeon? I seem to recall in the early post-GPT-3 days that when creating custom scenarios you could include a collection of world data, separate from the actual prompts/story, that would (presumably) be re-injected into the prompt to maintain some structure. How effective it was, though, I'm unsure.
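
If I remember right it was keyword-triggered, something roughly like this sketch (the entries and the trigger logic here are made up, not AI Dungeon's actual implementation):

```python
# Rough sketch of keyword-triggered "world info" re-injection.
# Entries and trigger logic are invented for illustration.

WORLD_INFO = {
    ("kaelen", "the prince"): "Kaelen is the exiled prince of Vharn.",
    ("vharn",): "Vharn is a desert kingdom ruled by a council of mages.",
}

def build_prompt(story_so_far: str, max_context_chars: int = 6000) -> str:
    recent = story_so_far[-max_context_chars:]
    lower = recent.lower()
    # Only inject entries whose trigger keywords appear in the recent text.
    triggered = [entry for keys, entry in WORLD_INFO.items()
                 if any(k in lower for k in keys)]
    return "\n".join(triggered) + "\n\n" + recent

# The model completes build_prompt(story) each turn, so world facts get
# re-seen even after they've scrolled out of the raw story text.
```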

9

yaosio t1_iu2ylrs wrote

NovelAI does it as well, and it doesn't work all that well; many times it completely ignores entries. However, both NovelAI and AI Dungeon have limited output, while this study is about generating 2000+ words without human intervention. I've made enough stories with both NovelAI and AI Dungeon to know that neither stays on topic and both quickly go off the rails. It doesn't matter which model is used, they all go off the rails.

9

0xWTC OP t1_iu3n17z wrote

Because GPT-3 isn't capable of lookback that far; its context window doesn't reach.

1

o_snake-monster_o_o_ t1_iu4d33q wrote

I'm pretty sure this is how we're gonna advance these models to the next step. It's a lot easier to think about these things in the context of coding, because coding is thinking but in a very restricted symbolic world.

For example, the next step for coding language models will be to implement a command language that lets them query the code and get information/intelligence from it (an LSP, for example). Then we use some sort of RL algorithm or a hypernetwork to fine-tune how the context should be written and organized to maximize efficiency, which information to drop to make room for new information, etc.
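
Purely as a toy illustration of what that command language could look like (every name and command here is invented, and a real system would ask an actual LSP instead of a homemade index):

```python
# Toy "command language" loop: the model emits queries about the codebase
# and the harness answers them from an index. All names/commands invented.

import ast

def build_index(source: str) -> dict[str, str]:
    """Map each function name to its signature line."""
    lines = source.splitlines()
    tree = ast.parse(source)
    return {node.name: lines[node.lineno - 1].strip()
            for node in ast.walk(tree) if isinstance(node, ast.FunctionDef)}

def answer(command: str, index: dict[str, str]) -> str:
    verb, _, arg = command.partition(" ")
    if verb == "SIGNATURE":
        return index.get(arg, f"no function named {arg}")
    if verb == "LIST":
        return ", ".join(index)
    return "unknown command"

SOURCE = '''
def parse_config(path: str) -> dict:
    ...

def run(cfg: dict) -> None:
    ...
'''

index = build_index(SOURCE)
# In the imagined loop you'd scan the model's output for commands like
# "SIGNATURE parse_config" and append the answer to its context before
# the next generation step.
print(answer("SIGNATURE parse_config", index))  # def parse_config(path: str) -> dict:
print(answer("LIST", index))                    # parse_config, run
```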

We have this huge GPT context window but we're filling it up with so much noise! Humans work with highly augmented data, for example the syntax highlighting in our code editor, so why are we not augmenting GPT-3's input?
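
Even something dumb like pre-annotating the source with facts a static pass can derive for free would count as augmentation in my book. A completely hypothetical sketch, just to show the idea:

```python
# Toy "input augmentation": tag each function with cheap static facts
# (argument count, whether it's called anywhere) instead of feeding the
# model raw text. Purely illustrative, not a real pipeline.

import ast

def annotate(source: str) -> str:
    tree = ast.parse(source)
    called = {n.func.id for n in ast.walk(tree)
              if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)}
    lines = source.splitlines()
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            status = "called" if node.name in called else "never called"
            lines[node.lineno - 1] += f"  # [{len(node.args.args)} args, {status}]"
    return "\n".join(lines)

print(annotate("def helper(x):\n    return x + 1\n\ndef main():\n    print(helper(2))\n"))
# def helper(x):  # [1 args, called]
#     return x + 1
#
# def main():  # [0 args, never called]
#     print(helper(2))
```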

0