FutureIsMine t1_iu12lki wrote on October 27, 2022 at 7:37 PM

THESE RESEARCHERS JUST TOOK MY IDEA!!!!! Congrats to the team, Im glad they've vindicated that I was on the right path

usrnme878 t1_iu154kd wrote on October 27, 2022 at 7:53 PM

I think you BOTH stole my idea!!! Not sure how, but I was definitely thinking of this as a strategy before @ least I was right!

FutureIsMine t1_iu178nu wrote on October 27, 2022 at 8:07 PM

we did, we're inside your head

0xWTC OP t1_iu15vi2 wrote on October 27, 2022 at 7:58 PM

good one :) I'm actually doing a very similar thing to what they are doing, to generate 1000-2000 word blog posts. The lack of lookback at the entire text with most models (too expensive, currently) is a very interesting problem to circumvent. How to avoid repetitions, and keep the flow logical, while not spending too much GPU time.

yaosio t1_iu2za10 wrote on October 28, 2022 at 4:10 AM

Diffusion for language models provides more coherent output according to various studies I've found. I'm surprised nobody's talking about it considering all the hype about diffusion for image generators. I guess it's not as cool as it sounds. The paper doesn't compare it to GPT models which should have told me something.

https://arxiv.org/abs/2205.14217

https://github.com/xiangli1999/diffusion-lm

There's also a new method that's even faster than diffusion.

https://www.assemblyai.com/blog/an-introduction-to-poisson-flow-generative-models/

I hope you have good luck on your text generating endeavors!

DigThatData t1_iu3eb42 wrote on October 28, 2022 at 7:08 AM

computer vision often overshadows NLP. Hard to compete when something novel is making the rounds with pretty pictures to go with it.

0xWTC OP t1_iu3kyag wrote on October 28, 2022 at 8:44 AM

the paper is actually using GPT-3, as far as I understand. It's hard to compare since you physically can't really generate a 2000 word article with GPT-3 (one shot)

thanks I will look into it

0xWTC OP t1_iu3omcf wrote on October 28, 2022 at 9:37 AM

this is fantastic.

Gordath t1_iu2g4zm wrote on October 28, 2022 at 1:33 AM

First time? 🤣😭

FutureIsMine t1_iu2isi8 wrote on October 28, 2022 at 1:53 AM

It actually is 😆🤣😭

DigThatData t1_iu3dzo3 wrote on October 28, 2022 at 7:04 AM

i like to think of it as "oh sweet, they did the work for me, now I can jump straight into that other idea that built on top of this one."

master3243 t1_iu1gyux wrote on October 27, 2022 at 9:09 PM

Work by group at UC Berkeley

https://people.eecs.berkeley.edu/~yangk/

https://scholar.google.com/citations?hl=en&user=jJRiZR8AAAAJ&view_op=list_works&sortby=pubdate

j4nds4 t1_iu25gly wrote on October 28, 2022 at 12:12 AM

Isn't this similar something that was done with AI Dungeon? I seem to recall in the early post-GPT-3 days that when creating custom scenarios you could include a collection of world data separate from the actual prompts/story that would (presumably) be re-injected into the prompt in order to maintain some structure. How effective it was though I'm unsure.

yaosio t1_iu2ylrs wrote on October 28, 2022 at 4:03 AM

NovelAI does it as well and it doesn't really work that well. Many times it completely ignores entries. However both NovelAI and AI Dungeon have a limited output. This study is on generating 2000+ words without human intervention. I made plenty of stories with both NovelAI and AI Dungeon to know that neither stay on topic and quickly go off the rails. It doesn't matter what model is used, they all go off the rails.

0xWTC OP t1_iu3n17z wrote on October 28, 2022 at 9:15 AM

because GPT-3 is not capable of lookback that goes so far.

o_snake-monster_o_o_ t1_iu4d33q wrote on October 28, 2022 at 1:43 PM

I'm pretty sure this is how we're gonna advance these models to the next step. It's a lot easier to think about these things in the context of coding, because coding is thinking but in a very restricted symbolic world.

For example, the next step for coding language models will be to implement a command language to let them query the code and get information/intelligence from it (a LSP for example). Then we use some sort of RL algorithm or a hypernetwork to finetune how the context should be written and organized to maximize efficiency, which information to drop to make room for new information, etc.

We have this huge GPT context window but we're filling it up with so noise! Humans work with highly augmented data, for example the syntax highlighting in our code editor, so why are we not augmenting GPT-3's input?

andreasblixt t1_iu3hw2a wrote on October 28, 2022 at 7:59 AM

Now please run this on all of /r/WritingPrompts

leepenkman t1_iu3sd0z wrote on October 28, 2022 at 10:27 AM

This is some truly epic prompting, checkout https://text-generator.io for replacing some of the gpt-instruct-13B queries to save cost, you can generate many results in a single inference too and are only charged for a whole request, then can rerank them.

Theres a lot of tricks in here like using a relevant/coherent detection network to rerank and using DPR to select relevant parts of the summary for the context. (see the utils in

```

dpr_query_encoder = SentenceTransformer('sentence-transformers/facebook-dpr-question_encoder-single-nq-base')

dpr_context_encoder = SentenceTransformer('sentence-transformers/facebook-dpr-ctx_encoder-single-nq-base'))

```

a whole bunch of other models are loaded in there like for NER, entailment, QA using unifiedqa-t5-large, i find it a good reference for some good/appropriate to use models https://github.com/yangkevin2/emnlp22-re3-story-generation/blob/20a99853ff4acbdb11865f57f4fa74431af0b628/story_generation/common/util.py#L69

1point21giggawats t1_iu3uzod wrote on October 28, 2022 at 10:59 AM

Great, now someone finish Winds of Winter with this thing, please.

Zachorious t1_iu64j1w wrote on October 28, 2022 at 8:55 PM

Lol, given that GPT-3 is not capable of lookback, I think we're going to end up with some loose threads

[R] "Re3: Generating Longer Stories With Recursive Reprompting and Revision" - Generating stories of 2000+ words (or even much longer)

Comments