AccountGotLocked69 t1_jceti2q wrote
Reply to comment by camp4climber in [D] Is there an expectation that epochs/learning rates should be kept the same between benchmark experiments? by TheWittyScreenName
I mean... If this holds true for other benchmarks, it would be a huge shock for the entire community. If someone published a paper showing that AlexNet beats ViT on ImageNet simply by training it for ten million epochs, that would be insane. It would mean all the architecture research of the last ten years could be replaced by a good hyperparameter search and training longer.
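To make that concrete, here's a toy sketch of what "a good hyperparameter search plus training longer" would look like. Everything here is made up for illustration: scikit-learn's digits dataset stands in for ImageNet, a tiny MLP stands in for AlexNet, and the learning rates and epoch budgets are arbitrary.

```python
from itertools import product

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Toy stand-in for "hyperparameter search + training longer":
# sweep learning rate and epoch budget for one fixed, simple architecture.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

best = None
for lr, epochs in product([1e-3, 1e-2, 1e-1], [50, 500, 5000]):
    clf = MLPClassifier(hidden_layer_sizes=(64,),
                        learning_rate_init=lr,
                        max_iter=epochs,        # epoch budget for the stochastic solver
                        random_state=0).fit(X_train, y_train)
    acc = clf.score(X_test, y_test)
    if best is None or acc > best[0]:
        best = (acc, lr, epochs)

print(f"best accuracy {best[0]:.3f} at lr={best[1]}, epochs={best[2]}")
```

The hypothetical paper in question would essentially be claiming that a loop like this, run long enough on an old architecture, closes the gap to a modern one.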
AccountGotLocked69 t1_jcesw8m wrote
Reply to comment by MrTacobeans in [D] Is there an expectation that epochs/learning rates should be kept the same between benchmark experiments? by TheWittyScreenName
I assume by "hallucinate gaps" you mean interpolate? In general it's the opposite: smaller, simpler models are better at generalizing. Of course there are a million exceptions to this rule, but in the simple picture of using stable combinations of batch sizes and learning rates, bigger models will be more prone to overfit the data. Most of this rests on the assumption that the "ground truth" is always a simpler function than one that memorizes the entire dataset.
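A toy way to see that last point, using a plain polynomial fit rather than a neural net (so purely illustrative, not the deep-learning case itself): the "big" model drives training error to near zero by memorizing the noisy points, and then does worse off the training set than the "small" one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground truth is a simple function; the observed data adds noise.
def f(x):
    return np.sin(x)

x_train = rng.uniform(0, 6, 15)
y_train = f(x_train) + rng.normal(0, 0.2, x_train.shape)
x_test = rng.uniform(0, 6, 200)
y_test = f(x_test) + rng.normal(0, 0.2, x_test.shape)

# "Small" vs "big" model: degree-3 vs degree-14 polynomial on 15 points.
for degree in (3, 14):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```

The degree-14 fit nearly interpolates the 15 training points, yet its test error is typically far worse than the degree-3 fit, because the underlying function really is simple.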
AccountGotLocked69 t1_iy77cgi wrote
Reply to comment by ZylonBane in I am Max Florschutz, author of Science-Fiction and Fantasy, back again to celebrate the launch of my latest book, Starforge! by MaxFlorschutzAMA
Hmmm wouldn't a star that's used as a forge be a forgestar?
AccountGotLocked69 t1_jdd0tdc wrote
Reply to comment by Enzo-chan in New 'biohybrid' implant will restore function in paralyzed limbs | "This interface could revolutionize the way we interact with technology." by chrisdh79
For anyone wondering how much reading that is: you'd have to spend an entire year reading 6,200 words per second, with every single word being "revolutionize", to hit Musk's net worth.
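The arithmetic behind that, assuming one dollar per word and a net worth of roughly $190 billion at the time (both figures are my assumptions, not stated in the comment):

```python
seconds_per_year = 365 * 24 * 60 * 60        # 31,536,000 seconds
words_per_second = 6200
words_per_year = words_per_second * seconds_per_year

print(f"{words_per_year:,} words")            # 195,523,200,000, i.e. ~1.96e11
# At an assumed one dollar per word, that lands close to the ~$190B
# net worth reported for Musk around that time (assumed figure).
```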