xl0 t1_ixegyz5 wrote

Don't just "implement" the models - implement the training loop in "pure" PyTorch, including mixed precision, gradient accumulation and metrics. It's not super hard but gives much-needed insight into why higher-level frameworks (like fastai or lightning) do things the way they do them.
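A minimal sketch of what such a loop can look like, with `autocast`/`GradScaler` for mixed precision and loss scaling for gradient accumulation. The tiny model, random data, and hyperparameters here are all illustrative stand-ins, not a reference implementation:

```python
# Toy sketch of a pure-PyTorch training loop with mixed precision and
# gradient accumulation. Model, data, and hyperparameters are stand-ins.
import torch
from torch import nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2)).to(device)
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

use_amp = device.type == "cuda"      # fp16 autocast only makes sense on GPU
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)
accum_steps = 4                      # optimizer step every 4 micro-batches

# fake dataset: 8 micro-batches of random features/labels
batches = [(torch.randn(8, 16), torch.randint(0, 2, (8,))) for _ in range(8)]

opt.zero_grad(set_to_none=True)
for i, (x, y) in enumerate(batches):
    x, y = x.to(device), y.to(device)
    with torch.autocast(device_type=device.type, enabled=use_amp):
        loss = loss_fn(model(x), y) / accum_steps  # scale loss for accumulation
    scaler.scale(loss).backward()      # grads add up across micro-batches
    if (i + 1) % accum_steps == 0:
        scaler.step(opt)               # unscales grads; skips the step on inf/nan
        scaler.update()
        opt.zero_grad(set_to_none=True)
```

On CPU the scaler and autocast are simply disabled, so the same loop runs everywhere; that device-agnostic pattern is roughly what the higher-level frameworks wrap for you.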

And then actually get the models to train, and see if you can replicate at least some of the results in the paper. You can train on smaller datasets like Imagenette instead of ImageNet if you don't have the resources. If you can spend some money, vast.ai is good for relatively long-running tasks.

itsstylepoint OP t1_ixeisda wrote

Yes, that is how it usually works with my impls! (check out a few vids)

As for mixed precision and metrics - I will be making separate vids for both, and of course, for every implemented model, I will try to find a dataset to demo train/eval.

It is cool that you mentioned mixed precision, as I already have the materials ready for this vid - will be discussing mixed precision, quantization (post-training and quantization-aware training), pruning, etc. Improving perf!
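To give a flavor of two of those techniques, here is a tiny sketch of post-training dynamic quantization and L1 unstructured pruning on a toy model (illustrative only; in practice you would start from a trained network and measure accuracy/latency before and after):

```python
# Toy sketch: dynamic quantization and magnitude pruning in PyTorch.
# The model here is a random stand-in for a trained network.
import torch
from torch import nn
from torch.nn.utils import prune

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2)).eval()

# Post-training dynamic quantization: Linear weights stored as int8,
# activations quantized on the fly at inference time.
qmodel = torch.ao.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)
out = qmodel(torch.randn(1, 16))

# L1 unstructured pruning: zero out the 50% smallest-magnitude weights
# of the first layer (adds a weight_mask reparametrization to the module).
prune.l1_unstructured(model[0], name="weight", amount=0.5)
sparsity = (model[0].weight == 0).float().mean().item()
```

`quantize_dynamic` returns a new model by default, so the float model is untouched and can still be pruned, fine-tuned, or compared against the quantized copy.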

xl0 t1_ixekt7a wrote

Cool, had a glance at a couple of your videos. They are pretty good, the production quality is good enough, and the explanations are clear.

One suggestion - maybe you could use notebooks? Can't overstate the importance of being able to interact with the code and visualize the data bit by bit as you write it. That makes it much easier to follow and understand what's going on.

itsstylepoint OP t1_ixenran wrote

Hey, thanks.

I am not a big fan of notebooks and rarely use them. When I do, I prefer using VS Code notebooks. So maybe I will make a few vids with notebooks in the future, but will likely stick to Neovim.

P.S. As for loss plots, monitoring performance, and those kinds of things, I prefer using tools like WandB, TensorBoard, etc. Will be covering those as well.
