HoLeeFaak t1_ivkbo4e wrote on November 8, 2022 at 4:30 PM

Reply to [D] At what tasks are models better than humans given the same amount of data? by billjames1685

Chess

HoLeeFaak t1_irvoxe5 wrote on October 11, 2022 at 12:09 PM

Reply to comment by MohamedRashad in [D] Reversing Image-to-text models to get the prompt by MohamedRashad

What you propose is a cycle loss. It's valid, but the biggest problem is the non-differentiable steps in the loop, and I haven't found a solution to that.

HoLeeFaak t1_irvnlrf wrote on October 11, 2022 at 11:55 AM

Reply to [D] Reversing Image-to-text models to get the prompt by MohamedRashad

That's a pretty hard problem, because text generation involves argmax/sampling, which are not differentiable, so it's hard to optimize a model to generate text that is then fed as input to a text2img model to reproduce a given image. I guess you could do something similar to https://arxiv.org/abs/2111.14447, replacing CLIP with Stable Diffusion and changing the objective a bit, but I think it will be hard to optimize.
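
To make the non-differentiability point concrete, here is a toy sketch in plain Python. It is not taken from the linked paper or from any real text2img pipeline: the three-token vocabulary, the `values` scores, and the helper names are all made up for illustration. It estimates gradients by finite differences and compares hard argmax decoding with a softmax-weighted relaxation, which is one common workaround I'm swapping in here, not something proposed above.

```python
import math

def softmax(logits, temperature=1.0):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp((x - m) / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def hard_pick(logits, values):
    # Argmax decoding: take the downstream score of the top-scoring token.
    best = max(range(len(logits)), key=lambda i: logits[i])
    return values[best]

def soft_pick(logits, values, temperature=1.0):
    # Relaxation: probability-weighted average of the downstream scores.
    probs = softmax(logits, temperature)
    return sum(p * v for p, v in zip(probs, values))

def finite_diff(f, logits, index, eps=1e-4):
    # Central finite-difference estimate of df/dlogits[index].
    up = list(logits)
    down = list(logits)
    up[index] += eps
    down[index] -= eps
    return (f(up) - f(down)) / (2 * eps)

# Hypothetical toy setup: 3 tokens, each with a made-up downstream score
# standing in for "how well the generated image matches the target".
logits = [2.0, 1.0, 0.5]
values = [0.0, 1.0, 2.0]

hard_grad = [finite_diff(lambda l: hard_pick(l, values), logits, i) for i in range(3)]
soft_grad = [finite_diff(lambda l: soft_pick(l, values), logits, i) for i in range(3)]

print("argmax gradient: ", hard_grad)  # all zeros: a small nudge never changes the chosen token
print("softmax gradient:", soft_grad)  # nonzero: every logit gets a learning signal
```

With argmax the loss is piecewise constant in the logits, so the gradient is zero almost everywhere and the text generator gets no signal. The relaxed version does give a gradient, but a real text2img model expects discrete tokens, so that mismatch is still there and is part of why I expect this to be hard to optimize.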