Comments

LtCmdrData t1_ixhtqdw wrote

J. Schmidhuber, "Schmidhuber Is All You Need,"
arXiv:1706.00001 [cs], Jan. 1963.

259

Jonno_FTW t1_ixjgnwo wrote

Why didn't he publish in the Annals of Schmidhuber?

13

ThRiLLeXx t1_ixggy00 wrote

Have you ever heard of the underrated and undercited hidden gem Long short-term memory?!

167

crouching_dragon_420 t1_ixh9qt5 wrote

I thought not. It’s not a paper Yann LeCun would cite. It’s an RNN legend.

85

RobbinDeBank OP t1_ixhcktc wrote

Darth Schmidhuber the Wise is a Dark Lord of ML, so powerful and so wise he could use ML to influence the neurons to create learning models

46

plocco-tocco t1_ixgibh7 wrote

There are thousands of people now going through Schmidhuber's old papers looking for the new revolution in ML

129

gwern t1_ixhmjmb wrote

Sadly, it turns out to be easier to invent the new revolution from scratch than to merely implement SMDH96's old revolution.

41

RobbinDeBank OP t1_ixhuw1g wrote

Not to mention you need to identify which of all his thousands of works are revolutionary. Might as well just reinvent it

24

TheInfelicitousDandy t1_ixi0q8a wrote

Also not to mention the entire process that happens after finding the revolutionary idea where you have to look at it with enough squint to be able to say 'this would be applicable to this other domain or task if we just change a few things', then the 8 pages of explanation you need to write up explaining how the changes are not novel but direct and banal outcomes of the original work (when interpreting the original work with the necessary squint).

19

chaosmosis t1_ixhcvxy wrote

Title's a little misleading. Initially I thought the claim was that the best ideas LeCun's had all came from Schmidhuber. Instead, the claim is that the best ideas anyone's had, as listed by LeCun, all came from Schmidhuber.

Amusingly, that's actually a more arrogant claim, but it's less personal and I don't think the tweet's "striking back" against LeCun.

111

ReginaldIII t1_ixhg0a6 wrote

Title isn't even misleading.

And given that the content behind the title is the length of a tweet, is it not reasonable for a person to actually just read the tweet to get the full content?

How is it a "more" arrogant claim? LeCun of his own volition identified the most important works, and they align with works that Schmidhuber's group has put out in that time frame. That is a matter of fact, not opinion. Those peer-reviewed and published works exist.

22

chaosmosis t1_ixhl90r wrote

"My ideas are the best in {Big Set}" versus "My ideas are the best in {Bigger Set}".

20

Beor_The_Old t1_ixkef7g wrote

It’s a smaller set since it’s just 5 ideas; he didn’t make LeCun list those specific ideas.

1

Blutorangensaft t1_ixgxhya wrote

My favourite line from the blog: "ResNets (not intellectually deep, but useful) "

64

esmooth t1_ixih6ez wrote

that parenthetical applies to like 90% of ML papers

44

MrAcurite t1_ixhhwfx wrote

Should I just add Schmidhuber as a coauthor on my papers, just to make sure he's receiving appropriate credit for the ideas I probably stole from him?

55

RobbinDeBank OP t1_ixhk2f7 wrote

Better play it safe by citing him in your introduction:
“In recent years, machine learning [1] has achieved….
[1] Schmidhuber et al. (Dawn of time)”

On a side note: he’s a brilliant mind with so many ideas that deserve more recognition, but on the other hand, he can’t just claim that nobody else has original ideas. I’m sure many of his ideas are now independently rediscovered in recent breakthroughs by many other researchers with no knowledge of some vaguely related papers from decades ago.

36

Ulfgardleo t1_ixhlloe wrote

But he does not claim that. What he does claim is that developments in his lab predate those ideas. It might be that those ideas were rediscovered independently by others, but as in so many things, who is first matters. And from a scientific point of view, not doing literature review for one's own ideas is bad science, and especially so if a lab strategically avoids citing some other lab.

It shouldn't be so difficult to write "idea X [1] rediscovered by [more prominent 2] has led to" in one's work.

17

MrAcurite t1_ixjgpkd wrote

> And from a scientific point of view, not doing literature review for one's own ideas is bad science

I'd like to agree, but honestly ML is at the point where you couldn't possibly exhaustively review the literature to make sure your own work is original. I think you should make a sincere attempt at it, but the volume of publications per unit time is just beyond the ability of a human being to handle.

6

RobbinDeBank OP t1_ixhmsqs wrote

I don’t mean this very post but his attitude overall on this topic. There are definitely breakthroughs out there whose authors don’t know about the existence of Schmidhuber’s related works from a long time ago under different terminology. He’s probably the most brilliant mind in this field given the number of original ideas he has, but most of those aren’t popularized and might be independently rediscovered decades later.

1

[deleted] t1_ixi25ng wrote

I'd never heard of the guy before he started being famous for his whining.

And I'd had many of the same ideas during the "AI winter".

−2

ureepamuree t1_ixi9uqc wrote

Do share the link to your arXiv submissions (or wherever you published them) and give us a chance to read in depth about your approach to those ideas. Thanks

5

[deleted] t1_ixgmza0 wrote

Schmidhuber might do some fantastic research if he stopped living in the past.

39

mietminderung t1_ixh3fw6 wrote

He can do both. He does have interesting publications.

  • 2022 - 2x ICML, 1x ICLR
  • 2021 - 2x NeurIPS, 2x ICML, 5x ICLR, 1x AAAI

https://people.idsia.ch/~juergen/onlinepub.html#secConferences

One thing I like about Schmidhuber's publications is that the authors are often in small groups of 2 or 3.

72

[deleted] t1_ixi1bqr wrote

Ah I see you follow the religion of publication count.

−17

mietminderung t1_ixi43vq wrote

Better than following an anonymous comment on Reddit. I also read his papers. But the offhand comment doesn’t deserve a literature survey of Schmidhuber’s research.

34

ReginaldIII t1_ixhfqha wrote

Dude has been consistently publishing good ideas the whole time and continues to.

His criticism isn't that he should be lauded solely for past successes. It's that people, like LeCun, actively go out of their way to not cite his lab's contributions out of personal and petty grievances with him as an individual, and often actively espouse the novelty of ideas he has already established.

58

JustOneAvailableName t1_ixid0yr wrote

Schmidhuber would have a way better point if he kept it to quality criticism. His OG papers are very often pretty far from the idea he tries to take credit for. Not that the highly cited papers aren't a special case of something Schmidhuber also wrote a paper about, but in the same vein you can say that every paper is just a special case of a NN.

27

dataslacker t1_ixikye2 wrote

LeCun doesn’t actually credit anyone with those ideas. Likely because they are very broad topics with hundreds of contributors over the years. If Schmidhuber is so arrogant that he’s going to claim them all as his own, I don’t blame other scientists for not taking him seriously.

29

acardosoj t1_ixh18ye wrote

2022 Second Semester's Schmidhuber drama -> Check

See you, folks, in six months for the next Schmidhuber drama!

27

Insighteous t1_ixh7wbc wrote

So this is how research works? Like a big fat self-marketing campaign. Disgusting. Was it always like that? What happened to my (idealized) picture of working together in a big scientific community and enhancing knowledge for everyone? Or what is this about?

23

ReginaldIII t1_ixhgfbo wrote

Imagine your peer reviewed publications were routinely uncited by your immediate peers and they often claimed novelty in ideas you had already published about.

Schmidhuber literally just wants to be cited when people refer to work that they did.

42

new_name_who_dis_ t1_ixhxgdg wrote

> Schmidhuber literally just wants to be cited when people refer to work that they did.

He has some of the most cited papers in the field. What Schmidhuber wants is to be cited for papers that almost no one read and whose ideas can be vaguely relevant to some of the new breakthrough papers, but only if you really squint.

He's a very good researcher and has many cool ideas, and it'd be much better if he was actually encouraging people to adopt them the proper way (like by creating demos and easy-to-use libraries and open-sourcing code/weights) -- instead of trying to prove that the already widely adopted techniques are actually special cases of his own techniques.

30

ReginaldIII t1_ixi0bfl wrote

Look up how often people like LeCun actively avoid citing his "most cited papers in the field" out of little more than unprofessional spite.

> it'd be much better if he was actually encouraging people to adopt them the proper way

He is and does. That's literally why they are highly cited papers in the first place.

His argument for not being cited isn't against the wider community who do cite him. It's against the major players who actively refuse to cite him.

10

mcbainVSmendoza t1_ixjdvwa wrote

"but only if you really squint" Bingo. That's what feels so petty to me. That's where you really see ego behind the wheel.

6

crouching_dragon_420 t1_ixi9vba wrote

Have you ever read some random RNN paper from LeCun's group and noticed they didn't cite the LSTM paper but instead cited the GRU paper, which is a watered-down version of the LSTM?

5

new_name_who_dis_ t1_ixie59m wrote

GRU cites LSTM paper so it's fine imo, especially if they're using the GRU architecture and not the LSTM architecture.

Citing the original LSTM paper is kind of dumb in general since the modern LSTM architecture is not the one described in the paper. You really need to cite one of the later papers that introduced the forget gate, if you are using the default LSTM implementation.
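For anyone who hasn't looked at the difference: a rough toy sketch of the cell-state update with and without a forget gate. This is scalar-only, takes the gate pre-activations as plain arguments, and the function name `cell_update` is made up for illustration, so treat it as a sketch of the idea rather than any library's implementation.

```python
import math


def sigmoid(x):
    """Logistic function used for the gates."""
    return 1.0 / (1.0 + math.exp(-x))


def cell_update(c_prev, i_pre, g_pre, f_pre=None):
    """One scalar cell-state update (hypothetical helper, illustration only).

    f_pre=None  -> no forget gate:   c = c_prev + i * g
    f_pre given -> with forget gate: c = f * c_prev + i * g
    """
    i = sigmoid(i_pre)    # input gate
    g = math.tanh(g_pre)  # candidate value
    f = 1.0 if f_pre is None else sigmoid(f_pre)  # forget gate (1.0 = keep all)
    return f * c_prev + i * g
```

Without the forget gate the old cell state is always carried over in full; with it, the cell can scale the old state down, which is the variant most default implementations use.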

5

crouching_dragon_420 t1_ixilxbs wrote

That's total horseshit when the architecture in the paper is almost the same as the original LSTM. I'm not talking about modern papers. If they cite GRU, they should cite LSTM as well. I don't agree with saying that GRU cites LSTM so it's fine to cite GRU but not LSTM. That shouldn't be how credit assignment works.

5

DigThatData t1_ixinfbc wrote

> If they cite GRU, they should cite LSTM as well.

that's not how citations work...

> GRU cite LSTM so it's fine to cite GRU but not LSTM.

but that's literally how citations work. If you cite paper X, you are implicitly citing everything that paper X cited as well. citation graphs are transitive.

1

new_name_who_dis_ t1_ixiofup wrote

Yea exactly. If you’re citing a paper you’re implicitly citing all of the papers that paper cited.

No one is citing the original perceptron paper even though pretty much every deep learning paper uses some form of a perceptron. That's because the citation is implied: it runs from the more complex architectures you cite, to the simpler ones those papers cite, and so on until you get to the perceptron.

6

alwayslttp t1_ixj9zpy wrote

All metrics are stacked massively in favour of first level citations - many entirely ignore second level and beyond. For example, a paper's "cited by" count is its most prominent metric of influence/importance, and is a count of how many papers directly cite it.

I don't know this particular beef, but it sounds like citing GRU and not LSTM is a potential slight/insult here. Exactly the kind of thing you see in petty academic rivalries. You're explicitly deciding who you're crediting with the key innovations you're building from, and you know that most people aren't chasing every sub-reference of every citation.
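To make the first-level vs. transitive distinction concrete, here is a minimal sketch on a made-up three-paper graph. The graph contents and both function names are hypothetical, and real citation indexes are obviously far messier than this.

```python
from collections import deque

# Hypothetical toy graph: each paper maps to the papers it cites directly.
CITES = {
    "new_paper": ["GRU"],
    "GRU": ["LSTM"],
    "LSTM": [],
}


def direct_cited_by(graph, target):
    """Count papers that list `target` in their own reference list."""
    return sum(target in refs for refs in graph.values())


def transitive_cited_by(graph, target):
    """Count papers that reach `target` through any chain of citations."""
    count = 0
    for paper in graph:
        if paper == target:
            continue
        seen = set()
        queue = deque(graph[paper])
        while queue:
            current = queue.popleft()
            if current == target:
                count += 1
                break
            if current in seen:
                continue
            seen.add(current)
            queue.extend(graph.get(current, []))
    return count
```

In this toy graph LSTM has one direct citation (from GRU) but is reachable from two papers. A plain "cited by" count only sees the first number, which is why leaving out the direct citation matters.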

4

DigThatData t1_ixkghj5 wrote

sounds like the problem here is the metrics then. which also is something I'm pretty sure only even became a thing extremely recently. For a long time, the only citation-based metric anyone talked about was their Erdos number, which was a tongue-in-cheek thing anyway. Concern over metrics like this is more likely than not going to damage research progress by encouraging gamification. The only "cited by" count I ever concern myself with is for sorting stuff on google scholar, which I never presume is an exact count or directly maps to the sorting I really need.

1

MTGTraner t1_ixhasus wrote

> Was it always like that?

Always has been 🌏👨‍🚀🔫👨‍🚀

I recently learned about Rosalind Franklin's sad story behind the scenes of the discovery of the structure of DNA through Walter Isaacson's CRISPR book. I am sure that there are even older examples of gross behavior in the history of science.

18

-gh0stRush- t1_ixi4td8 wrote

Einstein's wife was a mathematician who helped him develop his theories. He privately wrote about her assistance but never publicly credited her for any of his results.

7

machinelearner77 t1_ixhcdck wrote

> Like a big fat self-marketing campaign. Disgusting.

You mean the Canada-US researcher circle jerk, don't you?

11

MartianTomato t1_ixhle92 wrote

Yes. In my conversations with people thinking about what topics / research to work on, I'd say number of citations / marketability is top 3 factor in their decision making. I also see people draw an equivalence between # of citations and impact.

The flaw in this reasoning is that most highly cited work (in "hot" topics) is, by its nature, replaceable. If you don't do it, someone else inevitably will. And this kind of research feels empty in the same way that software engineering does... the researcher has become a replaceable cog in the machine learning machine. Yet somehow, I see people are more motivated to pursue topics they feel they will be "scooped" on if they delay even one conference cycle...

6

DevFRus t1_ixihbol wrote

I feel you.

> I see people are more motivated to pursue topics they feel they will be "scooped" on if they delay even one conference cycle...

I also see this often and so I use the inverse of this as the guiding principle in much of my work. If I feel like this is a topic I'd get 'scooped' on if I didn't publish quickly enough then I look for another topic to work on. It usually feels nicer to do 'slow' research. However, it can be a bit isolating.

2

rolexpo t1_ixhed72 wrote

Have we seen them both in the same room? What if they are the same person?

10

VinnyVeritas t1_ixj2wd6 wrote

If I was Schmidhuber, instead of whining every time someone "stole my idea that didn't really work" and made it work, I'd just revisit my old ideas and make them work myself.

It's like inventing the airplane, there's a big difference between the general idea and actually making one that flies.

4

[deleted] t1_ixhc9sl wrote

[deleted]

3

the_dreaded_OOM t1_ixhncpf wrote

Because he's citing previous works, so you can get a better idea of the problem being communicated through its wider context.

4

patriot2024 t1_ixhwqep wrote

It feels like another collision of Big Egos.

3

lmilasl t1_ixjgbvc wrote

It feels like there are some people in this thread that can tell us what the weather is in Lugano.

3

Due-Philosopher-1426 t1_ixmu8u0 wrote

I am planning on citing Schmidhuber in my next paper on the Middle East crisis, just to be on the safe side in case he proposed a solution to the Israel-Palestine conflict at some point in the past.

2

-xylon t1_ixgrarj wrote

lmaoing @ the 2 blatant bot accounts that posted there 1h ago

1

ThRiLLeXx t1_ixgtlo0 wrote

Lmao, I just headed straight to this subreddit when I saw the tweet

1

nikgeo25 t1_ixiv67r wrote

AGI is solved lol. Just add more compute.

1

Grahabalaya t1_ixkwq6m wrote

Could someone catch me up a little bit with the set of Schmidhuber's ideas and projects?

1

Acceptable-Cress-374 t1_ixgmn8o wrote

Does anyone remember the model of the phone that had a touch-screen before the iPhone became a thing?

Doing research in NNs in the 90s was purely theoretical, and anything done then had 0 real-world application at-the-time. What we have now are SOTA models that are being deployed in production all over the place; a lot of businesses use them daily, and amateurs like us get to deploy on a cheap GPU and play around with the tech.

With all due respect to the pioneers of the 90s, this is tech that we can see, smell and test ourselves.

−17

arg_max t1_ixh2wrq wrote

That's not how scientific citation works though. Just because people in the 90s didn't train on JFT-3B and get close to 90% ImageNet accuracy doesn't mean it's purely theoretical. And if ideas were presented earlier they should be cited.

14

[deleted] t1_ixi2ee8 wrote

Ideas are cheap, implementation is everything.

−8

Acceptable-Cress-374 t1_ixh4mo5 wrote

If they're after citations why not simply say so?

−14

respeckKnuckles t1_ixhedyv wrote

Uh...they do

8

Acceptable-Cress-374 t1_ixipz7u wrote

Well, in that case my bad. I don't do academia, and am not up-to-speed with what's the common way to cite previous work & such. To me, a layman, the blog post read more like whining and shaking a fist at the clouds and not like asking for proper citations. Again, my bad if I misread that.

0

mietminderung t1_ixh34zt wrote

> Doing research in NNs in the 90s was purely theoretical, and anything done then had 0 real-world application at-the-time.

Yet, Yann LeCun is cited for his work in the 90s too.

13