Submitted by nullspace1729 t3_y0dk5c in MachineLearning
Small-Reason-8096 t1_irr6g0q wrote
Hands down the best paper I have ever read (and reimplemented) is the ResNets paper:
https://arxiv.org/abs/1512.03385
The descriptions are clear and concise, but with enough detail to reimplement in whatever framework you like. Also, out of the box, the results I got on CIFAR10 matched the paper pretty much perfectly (not always a given!).
Another good paper to try is AWD-LSTM: https://arxiv.org/pdf/1708.02182.pdf
Basically, if you are implementing and training from scratch, focus on something you can train with a smallish dataset in a reasonable period of time. I would generally steer away from LLMs and object detection / segmentation models, as they require more resources to train than are commonly available!
TheInfelicitousDandy t1_irsfw1a wrote
I've tried to reimplement AWD-LSTM in pytorch > 1. and have never been able to get close to the original results. I've also seen other people try and not get close. Pretty sure it has to do with the weight dropout they used.
If anyone knows of any pytorch > 1. version that achieves the same PPL on PTB/Wiki02 I'd very much like to know.
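For readers who have not met the weight dropout mentioned above: in the AWD-LSTM paper it is DropConnect applied to the hidden-to-hidden weight matrices, with one mask sampled per forward pass and reused at every timestep. The sketch below is a framework-free illustration of that idea only, not the Salesforce or fastai code. The names `weight_drop` and `rnn_forward` are hypothetical, and a plain tanh RNN stands in for the LSTM to keep it short.

```python
import math
import random


def weight_drop(weights, p, rng=None, training=True):
    """Return a copy of `weights` with each entry zeroed with probability p.

    Surviving entries are scaled by 1 / (1 - p) (inverted dropout), so the
    expected value of every weight is unchanged.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return [row[:] for row in weights]
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - p)
    return [[w * scale if rng.random() >= p else 0.0 for w in row]
            for row in weights]


def rnn_forward(xs, w_hh, p, rng=None, training=True):
    """Toy tanh RNN (assumption: inputs are already projected to hidden size).

    The key point: the dropped matrix is sampled once, before the loop,
    and the same masked weights are used at every timestep.
    """
    dropped = weight_drop(w_hh, p, rng, training)
    n = len(w_hh)
    h = [0.0] * n
    for x in xs:
        h = [math.tanh(x[i] + sum(dropped[i][j] * h[j] for j in range(n)))
             for i in range(n)]
    return h
```

Dropping weights rather than activations matters for reimplementations because the mask has to be applied to the recurrent weight tensor itself before each forward pass, which is easy to get subtly wrong when a framework's built-in recurrent layers own those parameters.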
Small-Reason-8096 t1_irzvwc8 wrote
That surprises me as there was a good Fastai version:
https://docs.fast.ai/text.models.awdlstm.html
which is built on pytorch. When I played with it ages ago the results seemed comparable to the paper, but I haven't revisited it for a while :)
TheInfelicitousDandy t1_is0ajet wrote
As far as I know that version doesn't give comparable PPL.
Someone else saying the same https://github.com/salesforce/awd-lstm-lm/issues/86#issuecomment-453266265
A major issue here (and for other reproductions) is people saying they have a reproduction because they can run it without errors, without ever actually getting the same results.
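To make "the same results" concrete, here is a minimal sketch of what checking a reproduction's PPL could look like. Perplexity is the exponential of the mean per-token negative log-likelihood. The helper names and the 2% relative tolerance are assumptions for illustration, not a standard from the paper or either repository.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities on held-out text."""
    if not token_log_probs:
        raise ValueError("need at least one token")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


def matches_reported(reproduced_ppl, reported_ppl, rel_tol=0.02):
    """True if the reproduced PPL is within rel_tol of the reported value.

    rel_tol=0.02 is an arbitrary illustrative threshold (assumption).
    """
    return abs(reproduced_ppl - reported_ppl) <= rel_tol * reported_ppl
```

A model that assigns probability 0.25 to every token has a perplexity of about 4, so `perplexity([math.log(0.25)] * 10)` is a quick sanity check. A run that merely finishes without errors says nothing about whether `matches_reported` would pass.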