LightGreenSquash

LightGreenSquash OP t1_iwi9q1g wrote

Yep, that's along the lines of what I'm thinking as well. The only drawback I can see is that on such small datasets even "basic" architectures like MLPs can do well enough that you might not be able to see the benefit that, say, a ResNet brings.

It's still very much a solid approach though, and I've used it in the past to deepen my knowledge of stuff I already knew, e.g. coding a very basic computational graph framework and then using it to train an MLP on MNIST. It was really cool to see my "hand-made" graph topological sort and fprop/bprop methods, written for different functions, actually reach 90%+ accuracy.
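To give a feel for what that kind of "hand-made" framework involves, here's a minimal sketch of a scalar computational graph with an eager forward pass, a topological sort, and reverse-mode backprop. All names here (`Node`, `topo_sort`, `backward`) are illustrative, not taken from any particular framework or from the project described above:

```python
class Node:
    """A scalar node in a computational graph."""
    def __init__(self, value, parents=(), grad_fns=()):
        self.value = value        # forward-pass result (computed eagerly)
        self.parents = parents    # input Nodes this one depends on
        self.grad_fns = grad_fns  # local gradient functions, one per parent
        self.grad = 0.0           # accumulated gradient, filled in by bprop

    def __add__(self, other):
        # d(a+b)/da = 1, d(a+b)/db = 1
        return Node(self.value + other.value, (self, other),
                    (lambda g: g, lambda g: g))

    def __mul__(self, other):
        # d(a*b)/da = b, d(a*b)/db = a
        return Node(self.value * other.value, (self, other),
                    (lambda g: g * other.value, lambda g: g * self.value))


def topo_sort(root):
    """Return the graph's nodes in topological order (inputs first)."""
    order, seen = [], set()
    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for p in node.parents:
                visit(p)
            order.append(node)
    visit(root)
    return order


def backward(root):
    """Reverse-mode autodiff: walk the sorted graph from output to inputs."""
    root.grad = 1.0
    for node in reversed(topo_sort(root)):
        for parent, grad_fn in zip(node.parents, node.grad_fns):
            parent.grad += grad_fn(node.grad)


# f(x, y) = x*y + x  =>  df/dx = y + 1, df/dy = x
x, y = Node(3.0), Node(4.0)
f = x * y + x
backward(f)
print(f.value, x.grad, y.grad)  # 15.0 5.0 3.0
```

Training an MLP on MNIST would then just be a matter of extending the `Node` operations to vectors/matrices (matmul, nonlinearities, a loss) and running gradient descent on the `grad` fields after each `backward` call.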

1

LightGreenSquash OP t1_ivxgu6h wrote

Yeah, I think you should definitely have a solid understanding of the basics, otherwise as you said new developments can seem flashy but incomprehensible. Paper discussions and such are definitely useful and a part of the process, but I'm not entirely convinced that the level of understanding you get from them is enough without actually doing some things yourself too.

1

LightGreenSquash OP t1_ivxgmef wrote

I think I mostly agree on keeping track of the field, the only thing that's not clear to me is what should be considered "foundational" in an area where most of the exciting things have happened in the last ten years or so.

But don't you think that just discussing ideas ends up hiding a significant part of the complexity of actually getting them to do something? It's true that learning by doing seems rather time-consuming, but wouldn't we consider it strange if someone said they'd learn algorithms without trying to implement, use, or benchmark them, theorems without solving problems with them, or coding techniques by simply reading about them?

Then again, I guess you'll inevitably end up actually having to do things in your PhD/job/whatever, but I'm concerned that a lack of "foundational" knowledge and experience can greatly hamper you at some time-critical point in that process.

3