master3243 t1_ir9a5wt wrote on October 6, 2022 at 8:24 AM

Reply to comment by ThePerson654321 in [R] Google announces Imagen Video, a model that generates videos from text by Erosis

Image generation is by definition an easier task, so the two will never catch up.

But do you not think that at some point in the future, video generation in the year 20XX will be better than image generation in 2022?

Even in the year 2050 or 2100?

ThePerson654321 t1_ir9adad wrote on October 6, 2022 at 8:27 AM

Perhaps a few seconds but never a full movie.

tdgros t1_ir9hdy2 wrote on October 6, 2022 at 10:10 AM

Phenaki (https://phenaki.video/#interactive) already shows the generation of 2-minute videos (using lots of prompts). It's not that far-fetched to imagine that working on longer prompts and videos...

master3243 t1_ir9bp3h wrote on October 6, 2022 at 8:47 AM

What about a coherent 30-second silent clip from a short description that is as difficult to distinguish from real video as current SOTA generated images are from real images?

cleverestx t1_irbcdi0 wrote on October 6, 2022 at 6:45 PM

Why not? I admit it IS more challenging, but video is only a series of images...

ThePerson654321 t1_irbck16 wrote on October 6, 2022 at 6:46 PM

They said the same thing about nuclear fusion reactors.

cleverestx t1_irbcqd5 wrote on October 6, 2022 at 6:47 PM

Those reactors are not a series of images.

wtf-hair-do t1_iraoudr wrote on October 6, 2022 at 4:10 PM

they'll just never figure it out and give up