TFenrir OP t1_j0dk5rr wrote

I mostly agree, but I think there is some opportunity here. Using img2img in real time to extend audio indefinitely, and the relationship between images and audio in general, are both quite interesting. Would a model trained only on these images give a "better" result? Would different fine-tuned models give different experiences? How would this be affected by other improvements to the models?
