FutureIsMine t1_iu12lki wrote

THESE RESEARCHERS JUST TOOK MY IDEA!!!!! Congrats to the team, I'm glad they've vindicated that I was on the right path.

27

usrnme878 t1_iu154kd wrote

I think you BOTH stole my idea!!! Not sure how, but I was definitely thinking of this as a strategy before. At least I was right!

15

0xWTC OP t1_iu15vi2 wrote

Good one :) I'm actually doing something very similar to what they're doing, to generate 1000-2000 word blog posts. The lack of lookback over the entire text with most models (too expensive, currently) is a very interesting problem to work around: how to avoid repetition and keep the flow logical while not spending too much GPU time.
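
Roughly the shape of what I mean, as a minimal sketch. `llm_complete` and the summarize-then-continue loop are placeholders of my own, not anyone's actual pipeline:

```python
# Minimal sketch of chunk-by-chunk generation without full lookback:
# carry a short running summary plus the tail of the last chunk, so each
# call stays cheap. `llm_complete` is a hypothetical stand-in; swap in
# whatever completion API you actually use.

def llm_complete(prompt: str, max_tokens: int = 300) -> str:
    # Placeholder so the sketch runs end to end; replace with a real call.
    return f"[~{max_tokens}-token continuation of: {prompt[:40]}...]"

def write_post(outline: list[str]) -> str:
    summary, tail, chunks = "", "", []
    for point in outline:
        prompt = (
            f"Summary of the post so far: {summary}\n"
            f"Last sentences written: {tail}\n"
            f"Continue the post, covering: {point}\n"
            f"Do not repeat earlier points."
        )
        chunk = llm_complete(prompt)
        chunks.append(chunk)
        tail = chunk[-400:]  # small sliding window of raw text
        summary = llm_complete(
            f"Update this summary with the new text.\n"
            f"Summary: {summary}\nNew text: {chunk}",
            max_tokens=150,
        )
    return "\n\n".join(chunks)

print(write_post(["intro", "main argument", "conclusion"]))
```

The running summary acts as cheap fake lookback; the raw tail keeps local coherence between chunks.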

9

yaosio t1_iu2za10 wrote

Diffusion for language models produces more coherent output, according to a few papers I've found. I'm surprised nobody's talking about it, considering all the hype around diffusion for image generators. I guess it's not as cool as it sounds; the paper doesn't compare against GPT models, which should have told me something.

https://arxiv.org/abs/2205.14217

https://github.com/xiangli1999/diffusion-lm
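
For anyone curious, the core trick in Diffusion-LM is to run the usual denoising loop in a continuous word-embedding space and then round each vector back to the nearest token. A toy sketch of that reverse loop; the `denoiser` here is a made-up stand-in for the trained transformer and the noise schedule is arbitrary, so see the repo above for the real code:

```python
# Toy sketch of the Diffusion-LM idea: denoise in continuous embedding
# space, then round to the nearest token. Everything here is illustrative.
import numpy as np

rng = np.random.default_rng(0)
VOCAB = ["the", "cat", "sat", "on", "mat"]   # toy vocabulary
EMB = rng.normal(size=(len(VOCAB), 8))       # toy embedding table

def nearest(x):
    # Index of the closest embedding for each position (the rounding step).
    d = ((x[:, None, :] - EMB[None, :, :]) ** 2).sum(-1)
    return np.argmin(d, axis=-1)

def denoiser(x_t, t):
    # Hypothetical stand-in for the learned x0 prediction: just nudge
    # each vector toward its nearest embedding.
    return x_t + 0.5 * (EMB[nearest(x_t)] - x_t)

def sample(seq_len=4, steps=50):
    x = rng.normal(size=(seq_len, 8))        # start from pure noise
    for t in reversed(range(steps)):
        x = denoiser(x, t)
        if t > 0:                            # crude linear noise schedule
            x += 0.1 * (t / steps) * rng.normal(size=x.shape)
    return [VOCAB[i] for i in nearest(x)]

print(sample())
```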

There's also a new method that's even faster than diffusion.

https://www.assemblyai.com/blog/an-introduction-to-poisson-flow-generative-models/

I hope you have good luck in your text-generating endeavors!

7

DigThatData t1_iu3eb42 wrote

Computer vision often overshadows NLP. It's hard to compete when something novel is making the rounds with pretty pictures to go with it.

4

0xWTC OP t1_iu3kyag wrote

The paper is actually using GPT-3, as far as I understand. It's hard to compare, since you physically can't generate a 2000-word article with GPT-3 in one shot; that's more tokens than a single completion window holds.
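
Quick back-of-envelope, assuming the common rule of thumb that one token is about 0.75 English words:

```python
# Rough token math for a 2000-word article (0.75 words/token is the
# usual rule of thumb for English, so this is only an estimate).
words = 2000
tokens = words / 0.75
print(f"~{tokens:.0f} tokens")  # ~2667 tokens, before counting the prompt
```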

Thanks, I'll look into it.

2

DigThatData t1_iu3dzo3 wrote

I like to think of it as, "Oh sweet, they did the work for me; now I can jump straight into that other idea that builds on top of this one."

3