
badabummbadabing t1_iu3ubql wrote

My company cares about training time. We are iterating on one main model, and training a series of experiments in a shorter amount of time gives you a faster iteration cycle. I think many people absolutely underappreciate this. There are also critical training-time thresholds you may need to hit for this to really pay off. For example, if your training time is on the order of 2 days or less, you may be able to get 2 (or even 3) iteration cycles in per week. A training time of 3 days reduces this to 1-2 iteration cycles per week, and at 4 days you can only realistically achieve 1 iteration cycle per week.
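
To make that arithmetic concrete, here is a rough back-of-the-envelope sketch (the 5-day vs. 7-day week split is my assumption for where the "1-2" and "2 or even 3" ranges come from, and `cycles_per_week` is just a name I picked):

```python
# Rough sketch: how many train -> evaluate -> adjust cycles fit in a week
# for a given training time? The low end assumes runs only span the 5-day
# work week; the high end assumes runs also continue over the weekend.

def cycles_per_week(training_days: int, week_days: int) -> int:
    """Number of full training runs that fit into one week."""
    return week_days // training_days

for training_days in (2, 3, 4):
    low = cycles_per_week(training_days, week_days=5)   # workdays only
    high = cycles_per_week(training_days, week_days=7)  # weekend runs included
    print(f"{training_days}-day training time: {low}-{high} iteration cycles per week")
```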

Another way of thinking about this is that doubling your training speed effectively doubles the amount of hardware you have at your disposal, and halves the cost per experiment.
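
As a toy illustration of that last point (the cluster size and hourly rate below are made-up numbers, not from the comment):

```python
# Toy cost model: cost per experiment = GPUs * hours * hourly rate.
# Halving the training time halves the GPU-hours per run, so the cost per
# experiment halves and the same cluster fits twice as many runs per week.

gpu_count = 8        # hypothetical cluster size
hourly_rate = 2.0    # hypothetical $/GPU-hour

for training_hours in (48, 24):  # before and after a 2x speedup
    cost = gpu_count * training_hours * hourly_rate
    print(f"{training_hours} h per run -> ${cost:,.0f} per experiment")
```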

2

GPUaccelerated OP t1_iu4xa3i wrote

This perspective and use case are really important to note. Thank you for sharing! Your last point makes so much sense.

1