Submitted by Oceanboi t3_z30bf2 in MachineLearning
GeneralBh t1_ixkahfh wrote
Could you please point to some papers for "I see a lot of image models (ImageNet, ResNet, etc) that are being used for transfer learning in the audio classification domain"?
I thought YAMNet was trained on AudioSet from scratch. Could you please point to the paper that uses a pretrained image model to train YAMNet?
Oceanboi OP t1_ixkc2dn wrote
It is trained on AudioSet. I listed YAMNet to highlight the lack of large audio models compared to image models, and to point out that its architecture limits what input data you can use.
Also, I mainly see transfer learning with CNNs in Kaggle notebooks, and I could find a few papers where an image network is one of the models used:
https://arxiv.org/pdf/2007.07966
https://research.google/pubs/pub45611/
These are just a few but it seems decently common.
GeneralBh t1_ixkimkd wrote
Thank you! There are a few works on large audio models, e.g. https://arxiv.org/pdf/2109.13226.pdf, that might be interesting.