LastVariation t1_itps1fq wrote

Re: the scale of one-hot vectors, it's a little hard to say; it probably depends on your data and task. Essentially you could scale the one-hot vectors up by sqrt(K), where K is the average cosine similarity of two images with the same label. That way, sharing a label contributes the same cosine similarity as two images that are averagely similar for that label. In practice you'd probably want to fit K as a hyperparameter on some training data.
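A quick numpy sketch of that scaling, assuming you already have image embeddings and integer labels (the function name and shapes are illustrative, not from any particular library):

```python
import numpy as np

def scaled_one_hots(embeddings, labels, num_classes):
    """Scale one-hot label vectors by sqrt(K), where K is the average
    cosine similarity over same-label image pairs."""
    # Normalize image embeddings so dot products are cosine similarities.
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)

    # Estimate K: mean cosine similarity over all distinct same-label pairs.
    sims = []
    for c in range(num_classes):
        group = emb[labels == c]
        if len(group) < 2:
            continue
        gram = group @ group.T                   # pairwise cosine similarities
        iu = np.triu_indices(len(group), k=1)    # upper triangle: distinct pairs
        sims.append(gram[iu])
    K = float(np.mean(np.concatenate(sims)))

    # Scaled one-hots: a matching label now contributes K to the dot product,
    # the same as an averagely-similar same-label image pair.
    one_hots = np.eye(num_classes)[labels] * np.sqrt(max(K, 0.0))
    return one_hots, K
```

From there you could concatenate the scaled one-hots onto the image embeddings, or treat K as a tunable hyperparameter rather than estimating it.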

Re: CLIP, you can input categorical labels as raw text and the model is decent at interpreting them. I believe it's common practice to phrase the text as more natural language in that case, so "a photo of a <object>" rather than just "<object>".
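A minimal sketch of that prompt-templating convention (the template string is the common "a photo of a ..." pattern; the label names are just examples):

```python
def label_to_prompt(label: str, template: str = "a photo of a {}") -> str:
    """Wrap a raw categorical label in a natural-language prompt
    before passing it to a CLIP-style text encoder."""
    return template.format(label)

labels = ["cat", "kitten", "dog"]
prompts = [label_to_prompt(l) for l in labels]
# prompts → ["a photo of a cat", "a photo of a kitten", "a photo of a dog"]
# These strings would then be tokenized and run through CLIP's text encoder
# (e.g. via open_clip or HuggingFace transformers) to get label embeddings
# in the same space as the image embeddings.
```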


LastVariation t1_itpa0b2 wrote

Maybe the distance between two similar images is on a different scale from the distance between two different categorical labels. Using one-hot vectors for the categoricals means two different labels are always a cosine distance of 1 apart. It could be worth looking at the cosine distances between all image embeddings with a given label and some average of those embeddings, to get a sense of the scale.
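That diagnostic can be sketched in a few lines of numpy; this assumes unit-length-normalizable embeddings and integer labels (names are illustrative):

```python
import numpy as np

def within_label_scale(embeddings, labels, label):
    """Mean cosine distance from each embedding with a given label to the
    (normalized) mean embedding of that label: a rough within-label scale."""
    group = embeddings[labels == label]
    group = group / np.linalg.norm(group, axis=1, keepdims=True)

    centroid = group.mean(axis=0)
    centroid = centroid / np.linalg.norm(centroid)

    cos_dist = 1.0 - group @ centroid    # cosine distance to the centroid
    return cos_dist.mean()
```

Comparing this number to the fixed distance of 1 between one-hot labels gives a sense of whether the two contributions are on comparable scales.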

Also, one-hot might not be best if the categorical labels aren't actually orthogonal; e.g. you'd expect correlations between images of "cats" and images of "kittens".
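One hedged sketch of a non-orthogonal alternative: start from a label-similarity matrix (the values below are made up for illustration) and factor it, so the resulting label vectors have the desired pairwise dot products instead of being mutually orthogonal:

```python
import numpy as np

# Hypothetical label-similarity matrix S, with S[i, j] the assumed
# similarity between labels i and j; here 0=cat, 1=kitten, 2=car.
S = np.array([
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

# Factor S = L @ L.T via its eigendecomposition; the rows of L are label
# vectors whose dot products reproduce S (unlike orthogonal one-hots).
eigvals, eigvecs = np.linalg.eigh(S)
L = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))

# Now cat . kitten recovers 0.8 while cat . car recovers 0.0.
```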

Have you thought about just using something like CLIP for embedding both image and label?
