Submitted by pm_me_your_pay_slips t3_10r57pn in MachineLearning
https://twitter.com/eric_wallace_/status/1620449934863642624?s=46&t=GVukPDI7944N8-waYE5qcw
Extracting training data from diffusion models is possible by following, more or less, these steps:
- Compute CLIP embeddings for the images in a training dataset.
- Perform an all-pairs comparison of those embeddings and mark pairs with L2 distance below some threshold as near duplicates (a rough sketch of this step follows the list).
- Use the prompts of the training samples marked as near duplicates to generate N synthetic samples with the trained model (second sketch below).
- Compute the all-pairs L2 distance between the embeddings of the generated samples for a given training prompt. Build a graph where the nodes are generated samples and an edge exists if the L2 distance is less than some threshold. If the largest clique in the resulting graph has size at least 10, the training sample is considered memorized (third sketch below).
- Visually inspect the results to determine if the samples considered to be memorized are similar to the training data samples.
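For illustration, a minimal sketch of the first two steps (embedding the training images and flagging near-duplicate pairs) might look like the following. The CLIP checkpoint, batch size, and distance threshold here are my assumptions, not values from the paper:

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Checkpoint choice is an assumption; the paper's exact CLIP variant may differ.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def clip_embeddings(image_paths, batch_size=32):
    """Return an (N, D) tensor of CLIP image embeddings."""
    feats = []
    for i in range(0, len(image_paths), batch_size):
        imgs = [Image.open(p).convert("RGB") for p in image_paths[i:i + batch_size]]
        inputs = processor(images=imgs, return_tensors="pt")
        with torch.no_grad():
            feats.append(model.get_image_features(**inputs))
    return torch.cat(feats)

def near_duplicate_pairs(embeddings, threshold=1.0):
    """All-pairs L2 comparison; the threshold value here is a placeholder."""
    dists = torch.cdist(embeddings, embeddings)                      # (N, N) L2 distances
    upper = torch.triu(torch.ones_like(dists, dtype=torch.bool), 1)  # keep each pair once, skip self-pairs
    return ((dists < threshold) & upper).nonzero().tolist()          # [[i, j], ...] near duplicates
```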
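Generating the candidate samples for each near-duplicate prompt could then be done with any diffusion pipeline. Here is a sketch using the public Stable Diffusion v1.4 weights via diffusers; the checkpoint and the sample count are assumptions, since the paper attacks specific trained models:

```python
import torch
from diffusers import StableDiffusionPipeline

# Stand-in checkpoint, not necessarily the model evaluated in the paper.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

def generate_samples(prompt, n=500, batch_size=10, seed=0):
    """Generate n images for one prompt (n=500 is a guess at the order of magnitude)."""
    generator = torch.Generator("cuda").manual_seed(seed)
    images = []
    for _ in range(0, n, batch_size):
        out = pipe(prompt, num_images_per_prompt=batch_size, generator=generator)
        images.extend(out.images)
    return images[:n]
```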
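And the clique test over the generated samples, again with a placeholder distance threshold (the size-10 clique criterion comes from the summary above):

```python
import networkx as nx
import torch

def is_memorized(gen_embeddings, dist_threshold=0.5, min_clique=10):
    """Flag a prompt as memorized if enough generated samples are mutually close.

    gen_embeddings: (N, D) CLIP embeddings of the N generated images for one prompt.
    dist_threshold is a placeholder; min_clique=10 follows the criterion above.
    """
    dists = torch.cdist(gen_embeddings, gen_embeddings)
    graph = nx.Graph()
    graph.add_nodes_from(range(len(gen_embeddings)))
    for i in range(len(gen_embeddings)):
        for j in range(i + 1, len(gen_embeddings)):
            if dists[i, j] < dist_threshold:
                graph.add_edge(i, j)  # edge = these two generations look the same to CLIP
    largest = max((len(c) for c in nx.find_cliques(graph)), default=0)
    return largest >= min_clique
```

Maximum clique is NP-hard in general, but the graph here has one node per generated sample for a single prompt, so exact enumeration is usually fast enough in practice.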
With this method, the authors were able to get Stable Diffusion and Imagen to generate samples that are near copies of copyrighted training images.
mongoosefist t1_j6ufv6a wrote
Is this really that surprising? Theoretically, every image from CLIP should be in the latent space in close-to-original form. Obviously these guys went through a fair amount of trouble to recover these images, but it shouldn't surprise anyone that it's possible.