I am an engineering student who is very new to machine learning.

I want to build a classification model that infers the attributes of various pieces of clothing, such as color, pattern, type, and fit. Most sources I could find on the subject classify an image into a single class, such as whether something is a cat or a dog.

I wish to train a model that can find multiple attributes associated with an image, like a shirt with attributes ["black", "long sleeves", "fit", "V collar"] or a skirt with attributes ["red", "spotted", "short"]. I have a custom-made dataset, and there are a lot of attributes.

I don't know where to start my research on this. What is the name of this kind of classification? Is there a pre-existing network architecture that I can use? Can you recommend a tutorial video or an academic paper?

Comments

Seankala t1_iruvc69 wrote on October 11, 2022 at 5:31 AM

#75,728

I work at a startup that provides solutions to exactly this problem. It's just multi-label classification; you might want to look into that further.

PassionatePossum t1_irvbp94 wrote on October 11, 2022 at 9:25 AM

#76,314

If you are using CNNs, this is actually very straightforward to solve: you need a separate loss function for every independent attribute, and the optimization objective is to minimize the (weighted) sum of these loss functions.

Add a separate output layer for every independent attribute and attach a separate loss function to every output layer.
During training, for any attribute that does not apply to a given sample, set that layer's target to an arbitrary value and set its loss weight to zero.
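
To make the idea concrete, here is a minimal sketch in plain Python (no deep learning framework) of a weighted sum of per-attribute losses with a zero weight for attributes that do not apply. The attribute names, the logits, and the `multi_head_loss` helper are all made up for illustration; in a real network the logits would come from the separate output layers on top of the CNN.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target_index):
    """Loss of one output head for a single sample."""
    return -math.log(softmax(logits)[target_index])

def multi_head_loss(head_logits, targets, loss_weights=None):
    """Weighted sum of the per-attribute losses.

    head_logits:  attribute name -> raw scores from that attribute's output layer
    targets:      attribute name -> class index, or None if the attribute
                  does not apply to this sample
    loss_weights: optional attribute name -> weight (defaults to 1.0)
    """
    loss_weights = loss_weights or {}
    total = 0.0
    for name, logits in head_logits.items():
        target = targets.get(name)
        weight = loss_weights.get(name, 1.0)
        if target is None:
            # Attribute not applicable: use an arbitrary target and a zero
            # weight, so this head contributes nothing to the objective.
            target = 0
            weight = 0.0
        total += weight * cross_entropy(logits, target)
    return total

# Hypothetical sample: a shirt, so the skirt-length head is unused.
shirt_logits = {
    "color": [2.0, 0.5, -1.0],         # e.g. black, red, white
    "sleeve": [0.1, 1.2],              # e.g. short, long
    "skirt_length": [0.3, 0.3, 0.3],   # irrelevant for a shirt
}
shirt_targets = {"color": 0, "sleeve": 1, "skirt_length": None}
print(multi_head_loss(shirt_logits, shirt_targets))
```

Because the unused head gets a zero weight, whatever it predicts has no effect on the total loss for that sample. Frameworks usually offer their own way to express per-output loss weights, so this is only meant to show the arithmetic.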

AKavun OP t1_irvbvrj wrote on October 11, 2022 at 9:28 AM

#76,318

Replying to PassionatePossum (#76,314)

As I said, I am a beginner to this stuff. Even though I am familiar with every term in that sentence, can you maybe share some articles or videos that are doing or explaining something similar to what you have in mind so that I can understand you better?

AKavun OP t1_irvc02x wrote on October 11, 2022 at 9:30 AM

#76,320

Replying to Seankala (#75,728)

Thank you, I am reading about it right now!

Electronic-Art-2105 t1_irvsyi9 wrote on October 11, 2022 at 12:46 PM

#77,095

This looks like a multi-label or multi-output classification problem to me. Exactly the same thing was done here: https://www.kaggle.com/code/cbrincoveanu/transfer-learning-and-multi-output-tutorial Hope this helps.

AKavun OP t1_irwurmu wrote on October 11, 2022 at 5:12 PM

#79,088

Replying to Electronic-Art-2105 (#77,095)

Yeah, this is mostly similar to what I want to do, but there is a difference.

In the tutorial, there are only binary attributes, such as whether a celebrity is bald or not. But I want to handle multi-valued attributes, like the color of the clothing, which can take many values, not just 1 or 0.

With this in mind, is it still multi-label classification?

Electronic-Art-2105 t1_irwzixb wrote on October 11, 2022 at 5:43 PM

#79,309

Replying to AKavun (#79,088)

I see. In the tutorial, each output uses a Dense layer with a single unit and a sigmoid activation function, along with binary crossentropy as the loss function. You could replace that with an n-unit Dense layer with softmax activation, where n is the number of values the attribute can take, along with categorical crossentropy. So the basic architecture can remain similar; you just have to adapt the outputs.
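
As a rough illustration of the difference between the two kinds of output, here is a small sketch in plain Python rather than the tutorial's framework. The `COLORS` list, the `one_hot` helper, and the example scores are assumptions made up for this example; they are not taken from the tutorial.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(logit, label):
    """Loss for a single-unit sigmoid output; label is 0 or 1."""
    p = sigmoid(logit)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(logits, one_hot_target):
    """Loss for an n-unit softmax output with a one-hot target."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot_target, probs))

# Hypothetical vocabulary for one multi-valued attribute.
COLORS = ["black", "red", "blue", "white"]

def one_hot(value, vocabulary):
    """Encode one attribute value as a vector with a single 1.0."""
    return [1.0 if v == value else 0.0 for v in vocabulary]

# Binary attribute (as in the tutorial): one score, target is 0 or 1.
print(binary_crossentropy(1.5, 1))

# Multi-valued attribute: one score per possible color, one-hot target.
color_scores = [0.2, 2.1, -0.5, 0.0]
print(softmax(color_scores))
print(categorical_crossentropy(color_scores, one_hot("red", COLORS)))
```

The softmax output assigns one probability to each possible value and the probabilities sum to 1, which is why it suits attributes where exactly one value applies at a time.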

AKavun OP t1_irx83tz wrote on October 11, 2022 at 6:38 PM

#79,698

Replying to Electronic-Art-2105 (#79,309)

I will first learn what these things mean, then I will get back to you. Thank you for your guidance.