scitech_boom

scitech_boom t1_iw6zpa9 wrote

There are multiple reasons. The main one has to do with validation error: it usually follows a U-shaped curve, with a minimum at some epoch. That minimum is the point at which we usually stop training (`early stopping`). Any further training, with or without new data, only makes validation performance worse (I don't have a paper to cite for that).

I also started from the best model, and that did not work. But when I took the checkpoint from 2 epochs before the best one, it worked well. In my case (speech recognition), it was a nice balance between improvement and training time.
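A minimal sketch of that checkpoint choice, assuming you've kept per-epoch checkpoints and recorded the validation loss for each; `pick_warm_start_epoch` and the loss values are hypothetical stand-ins:

```python
def pick_warm_start_epoch(val_losses, offset=2):
    """Return the epoch to warm-start from: `offset` epochs before the
    epoch with the lowest validation loss (clamped at epoch 0)."""
    best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__)
    return max(0, best_epoch - offset)

# Hypothetical U-shaped validation-loss curve; minimum at epoch 4.
losses = [1.0, 0.8, 0.6, 0.5, 0.45, 0.48, 0.55]
print(pick_warm_start_epoch(losses))  # 2 -> load the epoch-2 checkpoint
```

The `offset` of 2 is what worked for me; treat it as a knob to tune.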

1

scitech_boom t1_iw4ck9z wrote

>Concatenate old and new data and train one epoch.

This is what I did in the past, and it worked reasonably well for my cases. But is it the best approach? I don't know.
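A rough sketch of that strategy, with placeholders standing in for a real model and training loop (`train_one_epoch` and the data lists are hypothetical):

```python
import random

def train_one_epoch(model, samples):
    # Placeholder update step: just record which samples were seen.
    model["seen"].extend(samples)

old_data = ["old_1", "old_2", "old_3"]
new_data = ["new_1", "new_2"]

model = {"seen": []}           # warm-started model (stands in for real weights)
combined = old_data + new_data # concatenate old and new data
random.shuffle(combined)       # shuffle so old and new samples interleave
train_one_epoch(model, combined)

print(len(model["seen"]))  # 5: every old and new sample seen exactly once
```

Shuffling matters here: feeding all the old data first and the new data last biases the final updates toward the new distribution.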

Anyhow, you cannot do this:

>Simultaneously, I do want to use this model as starting point,

Instead, pick the weights from 2 or 3 epochs before the best-performing one in the previous training run. That should be the starting point.

Training on top of something that has already hit the bottom won't help, even if we add more data.

5