Ttttrrrroooowwww

Ttttrrrroooowwww t1_iyctkhw wrote

Normally your dataloader fetches single samples from your dataset, e.g. reading images one by one. In that case RAM is never a problem.

If that is not an option for you (though I'm not sure why it wouldn't be), then numpy memmaps might be for you. Basically an array that's read from disk, not from RAM. I use it to handle arrays with billions of values. A rough sketch of the idea is below.
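A minimal sketch (not anyone's exact code; the file name and layout are placeholders) of wrapping a memmapped `.npy` file in a PyTorch `Dataset`, so each sample is copied from disk only when indexed:

```python
import numpy as np
import torch
from torch.utils.data import Dataset

class MemmapDataset(Dataset):
    def __init__(self, path="features.npy"):
        # mmap_mode="r" maps the .npy file read-only; nothing is pulled
        # into RAM until a slice is actually indexed.
        self.data = np.load(path, mmap_mode="r")

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        # np.array(...) copies just this one sample out of the memmap
        return torch.from_numpy(np.array(self.data[idx]))

# loader = torch.utils.data.DataLoader(MemmapDataset(), batch_size=64)
```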

2

Ttttrrrroooowwww OP t1_it5qizp wrote

Reply to comment by suflaj in EMA / SWA / SAM by Ttttrrrroooowwww

Currently my research focuses mostly on the semi-supervised space, where EMA in particular is still relevant. Apparently it's good for reducing confirmation bias caused by the inherent noisiness of pseudo labels.

While that agrees with your statement and answers my question (that I should use EMA because it's relevant), I've found some projects whose codebases include methods that aren't mentioned in their publications.
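For reference, a bare-bones EMA/teacher update of the kind typically used in semi-supervised setups (the decay value and the student/teacher naming are illustrative, not taken from any specific paper):

```python
import copy
import torch

@torch.no_grad()
def ema_update(student: torch.nn.Module, teacher: torch.nn.Module, decay: float = 0.999):
    # teacher = decay * teacher + (1 - decay) * student, parameter by parameter
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(decay).add_(s_param, alpha=1 - decay)
    # buffers (e.g. BatchNorm running stats) are usually just copied over
    for t_buf, s_buf in zip(teacher.buffers(), student.buffers()):
        t_buf.copy_(s_buf)

# teacher = copy.deepcopy(student)   # initialise the EMA model once
# ... then call ema_update(student, teacher) after every optimizer step
```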

1

Ttttrrrroooowwww t1_ir1tfxo wrote

Miniset training: train on a partial dataset that roughly reflects the mean/distribution of your actual dataset. Also, if it is very small, the validation set should be a little larger.

For the learning rate, tune a "base learning rate" and scale it to your desired batch size using the sqrt_k or linear_k rule: https://stackoverflow.com/questions/53033556/how-should-the-learning-rate-change-as-the-batch-size-change. Personally, the sqrt_k rule works very well for me, but linear_k works too (depending on the problem/model). A sketch of the scaling is below.
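Roughly what the scaling looks like in code (the base learning rate and batch sizes here are placeholder values, not recommendations):

```python
import math

def scale_lr(base_lr: float, base_batch: int, new_batch: int, rule: str = "sqrt") -> float:
    """Scale a learning rate tuned at base_batch to a new batch size."""
    k = new_batch / base_batch
    if rule == "linear":               # linear_k rule: lr scales with batch size
        return base_lr * k
    return base_lr * math.sqrt(k)      # sqrt_k rule: lr scales with sqrt(batch size)

# Example: lr tuned as 1e-3 at batch 32, now training at batch 256
# scale_lr(1e-3, 32, 256, rule="sqrt")   -> ~2.83e-3
# scale_lr(1e-3, 32, 256, rule="linear") -> 8e-3
```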

1