uhules t1_j6sp5wk wrote
Reply to comment by jiamengial in [D] Audio segmentation - Machine Learning algorithm to segment a audio file into multiple class by PlayfulMenu1395
CTC is better suited to unaligned sequences. If OP has precise timings for the sound events, plain frame-wise classification should work better.
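For illustration, here is a minimal sketch of how precise event timings could be turned into frame-wise training targets. The function name `events_to_frame_labels`, the `(start_s, end_s, class_id)` event format, the hop size, and the use of class 0 as background are all assumptions made for this example, not something stated in the thread.

```python
import math

def events_to_frame_labels(events, num_frames, hop_seconds, background=0):
    """Turn (start_s, end_s, class_id) events into one class label per frame.

    Hypothetical helper: frame i is assumed to cover the time span
    [i * hop_seconds, (i + 1) * hop_seconds).
    """
    labels = [background] * num_frames
    for start_s, end_s, class_id in events:
        # Clamp the event to the valid frame range.
        first = max(0, math.floor(start_s / hop_seconds))
        last = min(num_frames, math.ceil(end_s / hop_seconds))
        for i in range(first, last):
            labels[i] = class_id
    return labels

# Two annotated events in a 5-second clip with 0.5 s frames (made-up values).
events = [(0.0, 1.5, 1), (2.5, 4.0, 2)]
labels = events_to_frame_labels(events, num_frames=10, hop_seconds=0.5)
print(labels)  # [1, 1, 1, 0, 0, 2, 2, 2, 0, 0]
```

With targets like these, each frame can be trained with an ordinary per-frame classification loss, with no alignment step of the kind CTC provides.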
jiamengial t1_j6t854s wrote
That's true. I was thinking that flat frame-wise predictions could lead to incorrect mid-segment predictions, which might be an annoying model error to get.
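To make the mid-segment error concrete, here is a small sketch of what it looks like and one common way to clean it up after prediction: a sliding majority vote over neighboring frames. This is not something proposed in the thread; the function name `majority_smooth` and the window size are assumptions chosen for the example.

```python
from collections import Counter

def majority_smooth(labels, window=5):
    """Replace each frame label with the most common label in a centered window.

    Hypothetical post-processing step; the window is truncated at the
    sequence edges, and `window` is assumed to be odd.
    """
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        neighborhood = labels[max(0, i - half): i + half + 1]
        smoothed.append(Counter(neighborhood).most_common(1)[0][0])
    return smoothed

# Frame 3 is a spurious flip to class 0 in the middle of a class-1 segment.
noisy = [1, 1, 1, 0, 1, 1, 1, 2, 2, 2, 2]
print(majority_smooth(noisy))  # [1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2]
```

The trade-off is that a wider window removes more isolated flips but can also erase genuinely short events and blur segment boundaries.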