ant9zzzzzzzzzz

ant9zzzzzzzzzz t1_j6a37a1 wrote

Is there research about order of training examples, or running epochs on batches of data rather than full training set at a time?

I was thinking about how for people we learn better if focus on one problem at a time until grokking it, rather than randomly learning things in different domains.

I am thinking like train some epochs on one label type, then another, rather than all data in the same epoch, for example.

This is also related to state full retraining, like one probably does professionally - you have an existing model checkpoint and retrain on new data. How does it compare to retraining on all data from scratch?

1