Submitted by AutoModerator t3_11pgj86 in MachineLearning
ViceOA t1_jccekvp wrote
Seeking Advice on an AI-Supported Audio Classification Model
Hello everyone, I'm Omer.
I am new to this group and writing from Turkey. I would really value advice from the researchers here.
I am a PhD student in the department of music technology. I have been working in sound design and audio post-production for about 8 years. For the last 6 months, I have been doing research on AI-supported audio classification. My goal is to design an audio classifier for organizing audio libraries. To explain with an example: I have a sound bank with 30 different classes and 1000 sounds in each class (such as bird, wind, door closing, footsteps, etc.).
I want to train an artificial neural network with this sound bank. This network will produce labels as output. I also have various complex signals (imagine a single sound track with different sound sources like bird, wind, fire, etc.). When I give a complex signal to this network for testing, it should return all the relevant labels. I have been doing research on this system for 6 months, and if I succeed, I want to write my PhD thesis on this subject. I need some advice from you, my dear friends, about this network. For example, which features should I use for classification? And what kind of algorithm should I use?
If there is an article on this subject that you think I should definitely read, please recommend it. I apologize if I've given you a headache. I really need your advice. Please guide me. Thank you very much in advance.
henkje112 t1_jcxlc7t wrote
Look into Convolutional Neural Networks as your architecture type and different types of spectrograms as your input features. The different layers of the CNN should do the feature transformation, and your final layer should be dense, with a softmax (or any other desired) activation function.
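To make the two ends of that pipeline concrete, here is a minimal sketch using only the Python standard library. It is not a full model: the CNN itself is omitted, the logits and label names are made up for illustration, and the frame size, hop, and sample rate are arbitrary assumptions. In practice a library such as librosa or torchaudio would compute the spectrogram. The second half shows why "any other desired" activation matters here: softmax makes the classes compete, whereas a per-class sigmoid lets several labels be active at once, which may suit a track containing several sound sources.

```python
import cmath
import math


def magnitude_spectrogram(samples, frame_size=64, hop=32):
    """Return STFT magnitudes as a list of frames (time) of bins (frequency).

    Each frame is Hann-windowed and transformed with a naive DFT. Only the
    non-redundant bins 0..frame_size // 2 are kept.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        bins = []
        for k in range(frame_size // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            bins.append(abs(acc))
        frames.append(bins)
    return frames


def softmax(logits):
    """Single-label output: probabilities compete and sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logits):
    """Multi-label output: each class gets an independent probability."""
    return [1.0 / (1.0 + math.exp(-x)) for x in logits]


# Input side: a toy 80 Hz tone sampled at 640 Hz (both values are arbitrary).
sample_rate = 640
samples = [math.sin(2 * math.pi * 80 * n / sample_rate) for n in range(640)]
spec = magnitude_spectrogram(samples)
# Strongest frequency bin in the first frame; bin width is sample_rate / 64.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])

# Output side: hypothetical logits a trained CNN might emit for three classes.
labels = ["bird", "wind", "door"]
logits = [2.0, 1.5, -3.0]
single_label_probs = softmax(logits)
multi_label_probs = sigmoid(logits)
# Keep every class whose independent probability clears a 0.5 threshold.
predicted = [name for name, p in zip(labels, multi_label_probs) if p > 0.5]
```

The spectrogram (a 2D time-by-frequency array) is what the CNN would take as input, like a one-channel image. With the sigmoid output, both "bird" and "wind" clear the threshold for these made-up logits, while softmax would force them to share a single unit of probability.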
ViceOA t1_jd20dzj wrote
>Look into Convolutional Neural Networks as your architecture type and different types of spectrograms as your input features. The different layers of the CNN should do the feature transformation, and your final layer should be dense, with a softmax (or any other desired) activation function.
Thanks for your valuable advice, I'm grateful!