aperrien t1_j0dsyu9 wrote

I can't believe that running Fourier transforms of sound through Stable Diffusion and transforming the results back into sound actually works. At this point, I really am calling into question what the SD model is actually capturing. Creativity? Pattern consistency? This technology may have legs far beyond what I initially assumed.

23

xoexohexox t1_j0dv1fc wrote

It's just a table of weighted averages, my dude.

6

visarga t1_j0mzjox wrote

Let me tell you one weird trick all artists hate. It's actually averages of gradients collected from training examples, not averages of the training examples themselves. Gradients represent what has been learned from each example, and can be added together regardless of the content of the examples without becoming all jumbled up.

For instance, one can add the gradient derived from an image of a duck to that derived from an image of a horse. This is only possible in the space of gradients, as opposed to the space of images. If it weren't for this trick we would not be discussing art in this sub.
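To make the "adding gradients" point concrete, here is a minimal sketch on a toy two-weight linear model with squared-error loss. It is not how Stable Diffusion is trained; the `duck` and `horse` examples, the `grad` and `add` helpers, and the learning rate are all made up for illustration. It only shows that per-example gradients live in the same space (the space of weights) and can be summed into one update.

```python
# Toy linear model: prediction = w[0]*x[0] + w[1]*x[1], with squared-error loss.
# "duck" and "horse" are hypothetical stand-ins for two training examples.

def grad(w, x, y):
    """Gradient of 0.5 * (w.x - y)**2 with respect to the weights w."""
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [err * xi for xi in x]

def add(g1, g2):
    """Element-wise sum of two gradients of the same shape."""
    return [a + b for a, b in zip(g1, g2)]

def total_loss(w, examples):
    """Sum of the per-example squared-error losses."""
    return sum(0.5 * (sum(wi * xi for wi, xi in zip(w, x)) - y) ** 2
               for x, y in examples)

w = [0.5, -0.25]
duck = ([1.0, 2.0], 1.0)    # (input features, target), illustrative values
horse = ([3.0, -1.0], 2.0)

g_duck = grad(w, *duck)
g_horse = grad(w, *horse)
g_sum = add(g_duck, g_horse)  # one combined gradient, same shape as w

lr = 0.1  # assumed learning rate
w_new = [wi - lr * gi for wi, gi in zip(w, g_sum)]
```

The two inputs could not be meaningfully "added" as examples, but their gradients combine into a single step that lowers the loss on both.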

But are gradients derived from an image subject to copyright restrictions, even when all mixed up over billions of examples? All individual influences are almost "averaged out" by the large number of examples. That's how SD breaks training examples down into first principles and can then generate an astronaut on a horse even though it has never seen one - only possible if you go all the way back to basic concepts.

3