Oceanboi OP t1_ixkc2dn wrote on November 24, 2022 at 2:11 AM

Reply to comment by GeneralBh in [D] Transfer Learning of Image Trained Network in Audio Domain by Oceanboi

It is trained on AudioSet. I listed YAMNet to highlight the lack of large audio models compared to image models, and to point out that its architecture limits what data you can feed it.
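To make the input limitation concrete, here is a rough sketch. It assumes the constraint in question is YAMNet's fixed front end as described in its public documentation (16 kHz mono audio, 25 ms STFT window, 10 ms hop, 0.96 s patches with 50% overlap). The function name `count_patches` is hypothetical, and the real model may pad clips, so counts near the edges can differ.

```python
# Assumed YAMNet front-end constants, taken from its public documentation,
# not from this thread.
SAMPLE_RATE_HZ = 16_000   # mono waveform expected at 16 kHz
STFT_WINDOW_MS = 25       # STFT window length
STFT_HOP_MS = 10          # STFT hop length
PATCH_FRAMES = 96         # one patch covers 96 frames (0.96 s)
PATCH_HOP_FRAMES = 48     # patches overlap by 50%


def count_patches(num_samples: int) -> int:
    """Rough number of fixed-size patches a clip yields.

    Hypothetical helper: ignores any padding the real model applies.
    """
    window = SAMPLE_RATE_HZ * STFT_WINDOW_MS // 1000  # 400 samples
    hop = SAMPLE_RATE_HZ * STFT_HOP_MS // 1000        # 160 samples
    if num_samples < window:
        return 0
    frames = 1 + (num_samples - window) // hop
    if frames < PATCH_FRAMES:
        return 0
    return 1 + (frames - PATCH_FRAMES) // PATCH_HOP_FRAMES


ten_second_clip = count_patches(10 * SAMPLE_RATE_HZ)
half_second_clip = count_patches(SAMPLE_RATE_HZ // 2)
```

Under these assumptions, audio at a different sample rate has to be resampled first, and a clip much shorter than one patch yields nothing unless it is padded.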

Also, I mainly see transfer learning with CNNs in Kaggle notebooks, but I could find a few papers where an image-trained network is one of the models used.

https://arxiv.org/pdf/2007.07966

https://research.google/pubs/pub45611/

These are just a few, but it seems fairly common.
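For anyone wondering how a spectrogram gets fed to an image-trained CNN in the first place, one common trick is to scale the (log-mel) spectrogram to [0, 1] and copy it into three channels so it matches the RGB input shape. This is a minimal pure-Python sketch of that step only, not the method of any paper linked above; `to_three_channels` is a hypothetical name, and a real pipeline would use array libraries and the pretrained model's own normalization.

```python
def to_three_channels(spec):
    """Turn a 2D spectrogram [freq][time] into a 3-channel, image-like input.

    Hypothetical helper: min-max scales the values to [0, 1], then repeats
    the single channel three times to mimic an RGB layout [3][freq][time].
    """
    flat = [value for row in spec for value in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on a constant input
    scaled = [[(value - lo) / span for value in row] for row in spec]
    # Copy each row so the three channels are independent lists.
    return [[row[:] for row in scaled] for _ in range(3)]


# Tiny made-up example: 2 frequency bins x 2 time frames, values in dB.
image_like = to_three_channels([[-80.0, -40.0], [0.0, -80.0]])
```

The pretrained convolutional layers are then either frozen or fine-tuned, with a new classification head for the audio labels.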

GeneralBh t1_ixkimkd wrote on November 24, 2022 at 3:05 AM

Thank you! There are a few works on huge audio models, e.g. https://arxiv.org/pdf/2109.13226.pdf, that might be interesting.