JJP77 t1_ire3y9h wrote

Where'd they get 3D training data from?

Smearle t1_iuhhgyz wrote

They don't use any 3D training data. Instead, they capture screenshots of the 3D object from various perspectives, then feed those images into CLIP to score how closely the object resembles the text prompt.
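
To make the scoring idea concrete, here is a rough sketch of how views might be compared against a prompt. It is an illustration under assumptions, not the authors' code: the vectors below are made-up stand-ins for what CLIP's image and text encoders would produce, and `cosine_similarity` and `multiview_score` are hypothetical helper names. Averaging over views is also an assumption; the comment only says the views are fed into CLIP.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def multiview_score(view_embeddings, text_embedding):
    """Average image-text similarity over all rendered views.

    Assumption: each entry of view_embeddings stands in for a CLIP image
    embedding of one screenshot, and text_embedding stands in for the
    CLIP text embedding of the prompt.
    """
    sims = [cosine_similarity(v, text_embedding) for v in view_embeddings]
    return sum(sims) / len(sims)


# Toy placeholder embeddings (NOT real CLIP outputs).
text = [1.0, 0.0, 0.0]
matching_views = [[0.9, 0.1, 0.0], [0.8, 0.0, 0.2]]
unrelated_views = [[0.0, 1.0, 0.0], [0.1, 0.0, 0.9]]

good = multiview_score(matching_views, text)
bad = multiview_score(unrelated_views, text)
print(good > bad)  # views that point the same way as the prompt score higher
```

A score like this could then serve as the signal for adjusting the object so that it matches the prompt better, though the comment doesn't spell out that step.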
