AccountGotLocked69 t1_jceti2q wrote

I mean... if this holds true for other benchmarks, it would be a huge shock for the entire community. If someone published a paper showing that AlexNet beats ViT on ImageNet when you simply train it for ten million epochs, that would be insane. It would mean that all the architecture research of the last ten years could be replaced by a good hyperparameter search and longer training.
