perfopt OP t1_iril212 wrote

The data is MFCCs created from audio files. Sort of like this - https://www.youtube.com/watch?v=szyGiObZymo

1

kingfung1120 t1_iripkba wrote

I haven’t handled audio data before, but it seems like you are flattening data of shape [1723, 13] into a vector (correct me if I am wrong). That is definitely going to affect what the model can learn, since the data is sequential and 2-D.

Unfortunately, I haven’t studied or read anything related to deep learning on audio data, so I can’t give you a more in-depth opinion. But based on my understanding, using a CNN or anything recurrent should improve the model’s performance more than fine-tuning an MLP.
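
To make the flattening point concrete, here is a rough sketch in plain Python. It is not the OP's pipeline: the random values stand in for real MFCCs, and the kernel width of 5 and the helper names `flatten` and `conv1d_over_time` are assumptions made up for illustration. It only contrasts what a flattened input looks like with what a single convolution filter sliding along the time axis sees.

```python
import random

N_FRAMES, N_MFCC = 1723, 13  # shape mentioned in the thread
KERNEL_WIDTH = 5             # assumed for illustration only


def flatten(frames):
    """Concatenate all frames into one long vector, as an MLP input layer needs."""
    return [coeff for frame in frames for coeff in frame]


def conv1d_over_time(frames, kernel):
    """'Valid' 1-D convolution (cross-correlation) along the time axis.

    kernel is a list of `width` rows, each with one weight per MFCC coefficient.
    The same weights are reused at every time step.
    """
    width = len(kernel)
    out = []
    for t in range(len(frames) - width + 1):
        acc = 0.0
        for k in range(width):
            for c in range(len(kernel[k])):
                acc += frames[t + k][c] * kernel[k][c]
        out.append(acc)
    return out


random.seed(0)
# Stand-in data: random numbers in place of real MFCC frames.
mfcc = [[random.gauss(0, 1) for _ in range(N_MFCC)] for _ in range(N_FRAMES)]
kernel = [[random.gauss(0, 1) for _ in range(N_MFCC)] for _ in range(KERNEL_WIDTH)]

flat = flatten(mfcc)
feature_map = conv1d_over_time(mfcc, kernel)

# A single dense unit on the flattened input needs one weight per value,
# while a single conv filter needs only width * N_MFCC weights.
mlp_weights_per_unit = len(flat)
conv_weights_per_filter = KERNEL_WIDTH * N_MFCC

print("flattened length:", len(flat))
print("conv output length:", len(feature_map))
print("weights per dense unit:", mlp_weights_per_unit)
print("weights per conv filter:", conv_weights_per_filter)
```

The filter reuses the same small set of weights at every frame, so a pattern is detected wherever it occurs in time. A dense layer on the flattened vector has to learn a separate weight for every (frame, coefficient) position. In practice you would use a framework layer rather than hand-written loops.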

2

perfopt OP t1_iriqicz wrote

Yes you are correct. I am flattening (1723, 13) shape data.

I will try out CNN as well.

2

kingfung1120 t1_irirqor wrote

Look forward to receiving updates from you ;)

1

perfopt OP t1_iritkyy wrote

Certainly. I've got to travel a couple of days but Tue after work I'll be back on this.

2

perfopt OP t1_is5q78g wrote

Got back to a totally crazy week at work. Finally got time to spend on my project. I think I need to simplify my inputs and give MFCC another try before jumping into CNNs.

2