franciscrot t1_irx4mwf wrote on October 11, 2022 at 6:16 PM

Reply to comment by ReasonablyBadass in [D] Reversing Image-to-text models to get the prompt by MohamedRashad

You'd think, but I'm pretty sure not. Different models. Also different types of models, I think? Isn't most image captioning GAN-based?

One thing that's interesting about this question is that the diffusion models, as I understand them (not too well), do already involve a kind of "reversal" in their training: adding more and more noise to an image till it vanishes, then trying to create an image from "pure" noise.
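For anyone who wants the "adding more and more noise" part spelled out, here is a rough sketch of the forward (noising) half only. It assumes a DDPM-style linear noise schedule, which is a common choice but not necessarily what any particular model uses; the names `make_alpha_bar` and `add_noise` and the schedule values are made up for illustration. The learned "create an image from noise" half is not shown.

```python
import math
import random


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bar = []
    running = 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def add_noise(pixels, t, alpha_bar, rng):
    """Jump straight to step t: scale the image down and mix in Gaussian noise."""
    keep = math.sqrt(alpha_bar[t])         # how much of the original survives
    noise = math.sqrt(1.0 - alpha_bar[t])  # how much noise is mixed in
    return [keep * p + noise * rng.gauss(0.0, 1.0) for p in pixels]


rng = random.Random(0)
image = [rng.uniform(-1.0, 1.0) for _ in range(64)]  # toy 8x8 "image", flattened
alpha_bar = make_alpha_bar()

slightly_noisy = add_noise(image, 10, alpha_bar, rng)  # still mostly the image
almost_pure = add_noise(image, 999, alpha_bar, rng)    # essentially just noise
```

With this schedule the fraction of the original image left at the last step is tiny, which is the sense in which the image "vanishes".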

Just in a really non-mathy way, I wonder how OP imagines this accommodating rerolling? Would it provide an image seed?

Related: Can the model produce the exact same image from two slightly different prompts?

ReasonablyBadass t1_irzfwml wrote on October 12, 2022 at 4:37 AM

If stochastic noise is added in the process, "reverse engineering" the prompt shouldn't be possible, right?

Since, as per your last question, the same prompt would generate a different image.

Actually, come to think of it, don't the systems spit out multiple images for a prompt for the user to choose one?
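A toy sketch of the seed idea, just to make it concrete. This is not any real model's API: `starting_noise` is a hypothetical stand-in for however a sampler draws its initial noise. The only point is that the randomness is pinned down by the seed, so the same prompt with a new seed starts from different noise, and a batch of images for one prompt can simply be several seeds.

```python
import random


def starting_noise(seed, size=4):
    """Hypothetical helper: the initial noise a sampler would start from."""
    rng = random.Random(seed)  # all randomness comes from the seed
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


same_a = starting_noise(42)
same_b = starting_noise(42)   # same seed, identical noise
rerolled = starting_noise(43) # "reroll": new seed, different noise

# Several candidates for one prompt: one starting noise per seed.
batch = [starting_noise(seed) for seed in (1, 2, 3, 4)]
```

So if a reversal tool only returned a prompt, it would presumably also need the seed (or the noise itself) to reproduce one specific image.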