
0xWTC OP t1_iu15vi2 wrote

good one :) I'm actually doing something very similar to what they're doing, generating 1000-2000 word blog posts. The lack of lookback over the entire text with most models (too expensive, currently) is a very interesting problem to circumvent: how to avoid repetition and keep the flow logical, without spending too much GPU time.
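One common workaround for the lack of full lookback is to generate the article in chunks, carrying forward only a rolling summary plus the tail of the previous chunk as context. This is a minimal sketch of that idea; `generate_chunk` and `summarize` are hypothetical stubs standing in for real LLM calls, not any actual API.

```python
def generate_chunk(context: str, topic: str) -> str:
    """Hypothetical stand-in for an LLM call conditioned on context."""
    return f"[section on {topic}, continuing from: {context[-40:]!r}]"

def summarize(text: str, max_len: int = 200) -> str:
    """Hypothetical summarizer; a real one would also be an LLM call."""
    return text[:max_len]

def write_article(topics: list[str]) -> str:
    summary = ""   # rolling summary of everything written so far
    tail = ""      # last few characters of the previous chunk, for local flow
    sections = []
    for topic in topics:
        context = summary + " " + tail
        chunk = generate_chunk(context, topic)
        sections.append(chunk)
        tail = chunk[-200:]                          # keep only the tail
        summary = summarize(summary + " " + chunk)   # cheap vs. full lookback
    return "\n\n".join(sections)

article = write_article(["intro", "method", "conclusion"])
```

The point is that each model call sees a bounded context (summary + tail) regardless of how long the article grows, which is what keeps GPU cost flat.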

9

yaosio t1_iu2za10 wrote

Diffusion for language models produces more coherent output, according to various studies I've found. I'm surprised nobody's talking about it considering all the hype around diffusion for image generators. I guess it's not as cool as it sounds. The paper doesn't compare it to GPT models, which should have told me something.

https://arxiv.org/abs/2205.14217

https://github.com/xiangli1999/diffusion-lm
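The core idea in that paper, roughly, is that words live as continuous embeddings: generation starts from Gaussian noise, repeatedly denoises toward the embedding space, and then "rounds" each position to the nearest word embedding. Here's a toy 1-D illustration of that loop (my own sketch, not the paper's code); the `denoiser` is a hypothetical stand-in that already knows the clean target, which a real model would have to learn.

```python
import random

# 1-D "embeddings" for clarity; real models use high-dimensional vectors.
VOCAB = {"the": 0.0, "cat": 1.0, "sat": 2.0}

def denoiser(x: float, target: float) -> float:
    """Hypothetical trained model: predicts the clean embedding."""
    return target

def sample(target: float, steps: int = 50) -> str:
    x = random.gauss(0.0, 3.0)                # start from pure noise
    for t in range(steps, 0, -1):
        x0_hat = denoiser(x, target)          # model's guess at the clean value
        alpha = t / steps                     # crude noise schedule
        x = alpha * x + (1 - alpha) * x0_hat  # move toward the prediction
    # rounding step: snap the continuous value to the nearest word embedding
    return min(VOCAB, key=lambda w: abs(VOCAB[w] - x))

random.seed(0)
word = sample(VOCAB["cat"])  # denoises toward "cat"'s embedding
```

Because every denoising step sees (a noisy version of) the whole sequence at once, the model can revise earlier positions, which is where the claimed coherence gains come from, versus left-to-right autoregressive decoding.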

There's also a new method that's even faster than diffusion.

https://www.assemblyai.com/blog/an-introduction-to-poisson-flow-generative-models/

I hope you have good luck on your text generating endeavors!

7

DigThatData t1_iu3eb42 wrote

Computer vision often overshadows NLP. It's hard to compete when something novel is making the rounds with pretty pictures to go with it.

4

0xWTC OP t1_iu3kyag wrote

The paper is actually using GPT-3, as far as I understand. It's hard to compare, since you physically can't generate a 2000-word article with GPT-3 in one shot.

Thanks, I will look into it.

2