
Technical-Owl-6919 t1_iwch8sv wrote

From my experience, I would ask you to use EfficientNets in the first place. Secondly, please don't unfreeze the model at the very beginning. Train the frozen model with your custom head for a few epochs, and when the loss saturates, reduce the LR, unfreeze the entire network, and train again. Btw, did you try LR scheduling?
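
A minimal sketch of that two-phase recipe, assuming a Keras model that wraps a pretrained backbone. The helper name `two_phase_finetune`, the `make_optimizer` argument, and all learning rates and epoch counts are illustrative assumptions, not values from this thread. The function only relies on `compile()`, `fit()` and the `trainable` attribute.

```python
def two_phase_finetune(model, backbone, train_ds, val_ds, make_optimizer,
                       loss="sparse_categorical_crossentropy",
                       head_lr=1e-3, finetune_lr=1e-5,
                       head_epochs=10, finetune_epochs=10, callbacks=None):
    """Train the head on a frozen backbone, then fine-tune everything at a lower LR.

    `model` is a Keras model that contains `backbone`; `make_optimizer(lr)`
    returns a fresh optimizer for each phase (hypothetical helper argument).
    """
    phases = [
        (False, head_lr, head_epochs),        # phase 1: backbone frozen, head only
        (True, finetune_lr, finetune_epochs), # phase 2: everything unfrozen, lower LR
    ]
    histories = []
    for trainable, lr, epochs in phases:
        backbone.trainable = trainable
        # Keras only picks up a change to `trainable` after compile() is called again.
        model.compile(optimizer=make_optimizer(lr), loss=loss, metrics=["accuracy"])
        histories.append(
            model.fit(train_ds, validation_data=val_ds,
                      epochs=epochs, callbacks=callbacks)
        )
    return histories


# Usage sketch (assumes TensorFlow/Keras is installed and that train_ds / val_ds
# yield (image, integer_label) batches; 8 classes as in the question):
#
# from tensorflow import keras
# backbone = keras.applications.EfficientNetB0(
#     include_top=False, weights="imagenet", pooling="avg")
# inputs = keras.Input(shape=(224, 224, 3))
# features = backbone(inputs, training=False)  # keep BatchNorm in inference mode
# outputs = keras.layers.Dense(8, activation="softmax")(features)
# model = keras.Model(inputs, outputs)
# two_phase_finetune(
#     model, backbone, train_ds, val_ds,
#     make_optimizer=lambda lr: keras.optimizers.Adam(learning_rate=lr),
#     callbacks=[keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True)])
```

The second phase uses a much smaller learning rate so the pretrained weights are only nudged, not overwritten.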

1

Tiny-Mud6713 OP t1_iwcjcrv wrote

In the post I said I unfroze the CNN layers; I meant after the transfer learning part. I run it with all CNN layers frozen until it early-stops, then run it again with the top 200 layers or so unfrozen.
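
For reference, unfreezing only the top layers can be sketched roughly like this. `unfreeze_top_layers` is a hypothetical helper name; it only assumes the backbone exposes `.layers` and `.trainable`, as Keras models do.

```python
def unfreeze_top_layers(backbone, n_top=200):
    """Make only the last `n_top` layers of `backbone` trainable.

    Returns the index of the first trainable layer.
    """
    backbone.trainable = True
    cutoff = max(len(backbone.layers) - n_top, 0)
    for layer in backbone.layers[:cutoff]:
        layer.trainable = False   # lower, more generic layers stay frozen
    for layer in backbone.layers[cutoff:]:
        layer.trainable = True    # top layers get fine-tuned
    return cutoff


# Usage sketch (assumes `backbone` is e.g. a keras.applications model and
# `model` wraps it); recompile afterwards so the change takes effect:
#
# unfreeze_top_layers(backbone, n_top=200)
# model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-5),
#               loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```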

I'm obliged to work in Keras. I don't know if it has an LR scheduling method; I'll check the API. Great advice.
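
For what it's worth, Keras does ship LR scheduling callbacks: `keras.callbacks.LearningRateScheduler` takes a function of `(epoch, lr)`, and `keras.callbacks.ReduceLROnPlateau` lowers the LR when a monitored metric stalls. A small sketch of a schedule function follows; the name `step_decay` and the decay values are assumptions picked for illustration.

```python
def step_decay(epoch, lr, drop=0.5, every=5):
    """Multiply the learning rate by `drop` every `every` epochs.

    Matches the (epoch, lr) signature that LearningRateScheduler calls;
    `epoch` starts at 0 and `lr` is the optimizer's current learning rate.
    """
    if epoch > 0 and epoch % every == 0:
        return lr * drop
    return lr


# Usage sketch (assumes TensorFlow/Keras is installed). Pick one of the two:
#
# from tensorflow import keras
# scheduler = keras.callbacks.LearningRateScheduler(step_decay)
# plateau = keras.callbacks.ReduceLROnPlateau(
#     monitor="val_loss", factor=0.2, patience=3)
# model.fit(train_ds, validation_data=val_ds, epochs=30, callbacks=[scheduler])
```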

1

Tiny-Mud6713 OP t1_iwck8e1 wrote

The problem with EfficientNets is that I ran a test on some models a priori and got this graph. Note that each model was run on the dataset for only 3 epochs.

https://drive.google.com/file/d/1OyXaWg6vMirYeI9zLSeGJ2v_qCz3msu4/view?usp=share_link

1

Technical-Owl-6919 t1_iwckvp7 wrote

Something seems to be wrong; the validation scores should not be so low. Exactly what type of data are you dealing with?

1

Tiny-Mud6713 OP t1_iwcleya wrote

They're pictures of plants: 8 classes for 8 different species of the same type of plant.

1

Technical-Owl-6919 t1_iwclyq8 wrote

So, my friend, then you have to train the network from scratch; it is getting trapped in a local minimum. Maybe a small network might help. Try training a ResNet15 or something similar from scratch. This happened to me once: I was working with simulation images and could not get the AUC score to go above 0.92. Once I trained it from scratch, I got AUC scores close to 0.99, 0.98, etc.
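
In Keras, training from scratch mostly comes down to passing `weights=None`. A rough sketch follows. Note that `keras.applications` has no ResNet15, so ResNet50 is used here purely as a stand-in, which is an assumption and not the small network suggested above; the function name and hyperparameters are also illustrative.

```python
def build_resnet_from_scratch(num_classes=8, input_shape=(224, 224, 3)):
    """Return a compiled ResNet50 classifier with random initial weights."""
    # Imported lazily so the sketch can be loaded without TensorFlow installed.
    from tensorflow import keras

    model = keras.applications.ResNet50(
        weights=None,          # random initialisation, nothing is transferred
        input_shape=input_shape,
        classes=num_classes,   # include_top defaults to True, so a softmax head is built
    )
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-3),
        loss="sparse_categorical_crossentropy",  # expects integer class labels
        metrics=["accuracy"],
    )
    return model


# Usage sketch (assumes train_ds / val_ds yield (image, integer_label) batches):
#
# model = build_resnet_from_scratch(num_classes=8)
# model.fit(train_ds, validation_data=val_ds, epochs=50)
```

With no pretrained weights, expect to need more epochs and stronger augmentation than in the transfer learning setup.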

1

arg_max t1_iwcn3y0 wrote

ImageNet-1k pretraining might not be the best for this, as it contains few plant classes. The bigger ImageNet-21k has a much larger selection of plants and might be better suited for you. timm has EfficientNetV2, BEiT, ViT and ConvNeXt models pretrained on it. I don't use Keras, but you might be able to find them for that framework.

1