machinelearner77 t1_ixhcdck wrote
Reply to comment by Insighteous in [D] Schmidhuber: LeCun's "5 best ideas 2012-22" are mostly from my lab, and older by RobbinDeBank
> Like a big fat self-marketing campaign. Disgusting.
You mean the Canada-US researcher circle jerk, don't you?
machinelearner77 t1_iw7omuy wrote
Reply to comment by vwings in Relative representations enable zero-shot latent space communication by 51616
That it works is interesting, especially since I would have thought it might depend too heavily on the choice of anchors (a hyper-parameter), which apparently it doesn't. But why shouldn't you be able to "backprop through this"? It's just cosine similarity; everything is naturally differentiable.
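To make that concrete, here's a minimal sketch of the differentiability point, assuming PyTorch; the shapes, anchor count, and toy loss are made up for illustration, not taken from the paper:

```python
import torch
import torch.nn.functional as F

# Hypothetical shapes: a batch of 8 latent vectors, 4 anchors, dim 16
z = torch.randn(8, 16, requires_grad=True)  # stand-in for encoder outputs
anchors = torch.randn(4, 16)                # fixed anchor embeddings

# Relative representation: cosine similarity of each sample to each anchor
# Broadcasting: (8, 1, 16) vs (1, 4, 16) -> (8, 4)
rel = F.cosine_similarity(z.unsqueeze(1), anchors.unsqueeze(0), dim=-1)

# Any downstream loss backprops through the cosine without issue
loss = rel.sum()  # toy loss, just to trigger backprop
loss.backward()
print(z.grad is not None)  # True: gradients reach the encoder outputs
```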
machinelearner77 t1_iw6l27e wrote
Reply to comment by huehue9812 in Relative representations enable zero-shot latent space communication by 51616
I don't get the "huge" part either. It seems to me like a thorough (and valuable) analysis of something that has probably already been known and tried in one form or another a couple of times, since the idea is so simple. But is it a big, or even huge, finding? I don't know...
machinelearner77 t1_iw21k83 wrote
Reply to comment by samloveshummus in [R] ZerO Initialization: Initializing Neural Networks with only Zeros and Ones by hardmaru
I guess the problem with "seed hacking" is simply that it reduces trust in the proposed method. People want to build on methods that aren't brittle, and if the reported model performance depends (too) much on the random seed, that lowers trust in the method and makes people less likely to build on it.
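For what it's worth, the usual antidote is multi-seed reporting. A minimal sketch, where `train_and_eval` is a hypothetical stand-in for an actual training run:

```python
import random
import statistics

def train_and_eval(seed: int) -> float:
    """Hypothetical stand-in: train a model with the given seed
    and return its test accuracy."""
    random.seed(seed)
    return 0.90 + random.uniform(-0.02, 0.02)  # placeholder result

# Report performance across several seeds instead of a single lucky one
scores = [train_and_eval(seed) for seed in range(5)]
mean, std = statistics.mean(scores), statistics.stdev(scores)
print(f"accuracy: {mean:.3f} +/- {std:.3f} over {len(scores)} seeds")
```

If the standard deviation is large relative to the claimed improvement, that's exactly the brittleness that erodes trust.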
machinelearner77 t1_j1437x9 wrote
Reply to [R] Nonparametric Masked Language Modeling - MetaAi 2022 - NPM - 500x fewer parameters than GPT-3 while outperforming it on zero-shot tasks by Singularian2501
Looks like cool stuff... but if you put a code link in the abstract and publish your paper, it should be a functioning link...