m98789 t1_ittly1y wrote on October 26, 2022 at 5:32 AM

There’s several strategies to combine multimodal data. Here’s some simple approaches:

First train the cnn classsifer. Then use it as a feature extractor by extracting the feature vector from the penultimate layer. Then augment those image features with the features from your tabular data. And then train it all with a classifier like xgboost.
If you want to train both your feature extractor and classifier end to end, you could try different strategies for encoding the tabular data into the input tensor. A simple and fun way to try is to encode them visually into your images themselves, such as adding a few more pixel rows at bottom of image. One row can represent country (uniquely color by a country index), and so on.

rich_atl OP t1_iu1zu10 wrote on October 27, 2022 at 11:28 PM

Thanks , great ideas. I’ll try them

m98789 t1_iu23g1e wrote on October 27, 2022 at 11:56 PM

Your welcome. Would be great if you could update this thread after your experiment to see what worked best, to help future readers in the same boat.

vedrano- t1_ittm7at wrote on October 26, 2022 at 5:36 AM

Depends on how sophisticated model would you like to build, but the simplest way would be to flatten those data and add (extend 1D array) to flattened CNN part prior to last fully connected layers of CNN model.

rich_atl OP t1_iu1ldim wrote on October 27, 2022 at 9:39 PM

Thanks that makes sense

Dear-Acanthisitta698 t1_ittlyp3 wrote on October 26, 2022 at 5:33 AM

May use embedding for categorical value and linear for continuous value. You can also use bin for making continuous value into categorical.

[D] how to add tabular data to cnn image classifier

Comments