[R] We found nearly half a billion duplicated images on LAION-2B-en. Submitted by von-hust t3_11jyrfj on March 6, 2023 at 1:20 PM in MachineLearning 36 comments 375
alushamir t1_jb6o8jd wrote on March 6, 2023 at 8:59 PM Hi, I'm one of the authors of fastdup. In an analysis we did 5 months ago, we found only around 15% duplicated images in LAION-400M. You can check out a short video on the matter here: https://www.youtube.com/watch?v=s6qamoFzyis For additional info, read here: https://visual-layer.readme.io Permalink 11