Submitted by yamakeeen t3_ynq05k in MachineLearning
I'm planning to see how a latent diffusion model (LDM) would perform on the task of reconstructing images from brain activity. Specifically, image generation would be conditioned on brain activity instead of text. Has anyone tried conditioning on brain activity, or on any other information besides text? I'm having a hard time digesting the code from the LDM repo and was wondering if anyone has tried coding it (or a simpler version) from scratch.
shawarma_bees t1_ivaa8if wrote
How is the “brain activity” information encoded?