SufficientStautistic t1_iz1c1yp wrote on December 5, 2022 at 7:13 PM

+1 for using a model template rather than experimenting/crafting things by hand for your problem. Many good general-purpose architectures for classification exist, and in my experience they work very well. For the classification problem you describe, you will probably be fine using one of the architectures listed on the Keras CV page (or the equivalent place in the timm/PyTorch docs). I recommend starting from a pretrained model.
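For what it's worth, starting from a pretrained model can look roughly like the sketch below. It assumes TensorFlow/Keras and picks EfficientNetB0 purely as an example backbone. `build_classifier`, `num_classes` and the 224x224 input size are placeholder choices of mine, not anything specific to your problem.

```python
import tensorflow as tf


def build_classifier(num_classes, weights="imagenet", input_shape=(224, 224, 3)):
    """Hypothetical helper: pretrained backbone plus a fresh classification head."""
    # Load the backbone without its ImageNet head; pooling="avg" yields one
    # feature vector per image. Keras' EfficientNet rescales inputs internally,
    # so it expects raw pixel values in the 0-255 range.
    backbone = tf.keras.applications.EfficientNetB0(
        include_top=False,
        weights=weights,
        input_shape=input_shape,
        pooling="avg",
    )
    backbone.trainable = False  # freeze at first; unfreeze later to fine-tune

    inputs = tf.keras.Input(shape=input_shape)
    features = backbone(inputs, training=False)  # keep BatchNorm in inference mode
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(features)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # assumes integer class labels
        metrics=["accuracy"],
    )
    return model
```

You would then call something like `build_classifier(num_classes=5)` and `model.fit(...)` on your own dataset. Training only the new head first and unfreezing the backbone afterwards with a small learning rate is a common recipe, but treat it as a starting point rather than a rule.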

The approach I usually take to solving a CV problem is to survey which architectures are recommended for the problem in the abstract (e.g. classification, segmentation, pose estimation, etc.), try those, and then make modifications based on the specifics of the problem if necessary.

Tbh you might not even need a deep vision model for your problem.
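
One cheap way to check is to run a non-deep baseline first. The sketch below assumes scikit-learn and NumPy and simply fits logistic regression on flattened pixels. `baseline_accuracy` is a made-up helper name, and raw pixels are only one possible feature choice (hand-crafted features may suit your data better).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def baseline_accuracy(images, labels, folds=5):
    """Hypothetical helper: mean cross-validated accuracy of a linear model on raw pixels."""
    # Flatten each image into a single feature vector.
    X = np.asarray(images, dtype=np.float32).reshape(len(images), -1)
    # Standardize features, then fit plain logistic regression.
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    return cross_val_score(clf, X, labels, cv=folds).mean()
```

If this already gets close to the accuracy you need, a deep model may be overkill. If it does badly, that number is still a useful floor to compare the pretrained model against.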