Submitted by MurlocXYZ t3_110swn2 in MachineLearning

I could be wrong, but I see a trend of posts in this sub declining in quality and/or relevance.

I see a lot of posts of the type "how do I run X" (usually a generative model) with complete disregard for how it actually works, or nonsense posts about ChatGPT.

I believe this is due to an influx of new people who gained an interest in ML now that the hype is around generative AI. Which is fantastic, don't get me wrong.

But I see fewer academic discussions and fewer papers being posted. Or perhaps they are just not as upvoted. Is it just me?

268

Comments


ArnoF7 t1_j8azbzj wrote

Discussion in this subreddit has always been a bit hit and miss. After all, reddit as a community has almost no gatekeeping. While that can be a good thing, it of course has downsides.

If you look at this post about batch norm, you'll see some people who brought up interesting insights, and a good chunk of people who clearly never even read the paper carefully. And that post is from 5 years ago.

81

tysam_and_co t1_j8cf1o9 wrote

That is a really good point.

Though, one minor contention: most of the comments in that post seem pretty well-informed. The main point of disagreement is whether batchnorm goes before or after the activation, and oddly enough, years later, before the activation seems to have won out, due to the efficiency gains from fusing it with the preceding convolution.

I'm surprised they were so on the mark even 6 years ago in being skeptical of this internal covariate shift business. I guess keeping the statistics centered and such is helpful, but as we've seen since then, batchnorm seems to do so much more than just that (and is a frustratingly utilitarian, if limiting, tool in my experience, unfortunately).
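
For concreteness, here's a minimal sketch of the two placements (my own PyTorch illustration, not anything from the paper). With BN directly after the conv, the two are adjacent affine ops, which is what lets them be folded into a single convolution at inference time:

    import torch
    import torch.nn as nn

    # BN before the activation: conv and BN are adjacent affine transforms,
    # so at inference time they can be folded into a single convolution.
    bn_before_act = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(16),
        nn.ReLU(),
    )

    # BN after the activation: the nonlinearity now sits between conv and BN,
    # so the two affine ops can no longer be merged.
    bn_after_act = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1, bias=False),
        nn.ReLU(),
        nn.BatchNorm2d(16),
    )

    # Fold conv+BN in the first variant (fuse_modules supports adjacent
    # Conv2d+BatchNorm2d pairs in eval mode); BN becomes an Identity.
    fused = torch.ao.quantization.fuse_modules(bn_before_act.eval(), [["0", "1"]])

    x = torch.randn(1, 3, 32, 32)
    assert torch.allclose(bn_before_act(x), fused(x), atol=1e-5)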

6

starfries t1_j8fp30e wrote

What's the current understanding of why/when batch norm works? I haven't kept up with the literature but I had the impression there was no real consensus.

5

dustintran t1_j8bdcv6 wrote

r/MachineLearning today has 2.6 million subscribers. The greater the influx of newcomers, the more beginner-friendly posts get upvoted. This is OK—don't get me wrong—it's just a different setting.

Academic discussions were popular back when there were only 50-100K subscribers. In fact, I remember being in OpenAI's offices in 2017 and seeing, every morning, a row of researchers with reddit on their monitors. Discussions now mostly happen on Twitter.

56

MurlocXYZ OP t1_j8bdr4q wrote

Dang it. I was hoping I could get away with not having a Twitter account.

35

daking999 t1_j8bnr9q wrote

Completely agree. I use reddit casually and twitter as more of a work/research tool, but I much prefer reddit to twitter as a platform (especially post-Musk). I tried getting into mastodon, but it just feels like a more awkward-to-use twitter. An academic-focused ML subreddit might be good. Maybe even enforce "real" names for users to post?

14

MrAcurite t1_j8c9u48 wrote

I joined the Sigmoid Mastodon. It's a wasteland of people posting AI "art," pseudo-intellectual gibberish about AI, and nonsense that belongs on the worst parts of LinkedIn.

23

gopher9 t1_j8d1odf wrote

Did you take a look at Mathstodon? There are some actual mathematicians and computer scientists there, so maybe it's a better place to look.

10

MrAcurite t1_j8d301x wrote

I'll take a look, thanks for the recommendation. Right now what I really want is a place to chat with ML researchers, primarily to get some eyes on my preprints before I submit to conferences and such. I'm still kinda new to publishing, and my coworkers aren't really familiar with the current state of the ML publishing circuit, so I could always use more advice.

6

daking999 t1_j8dn7ar wrote

It's also frustrating to find the researchers I want to follow. I work on ML/compbio, so the people I want to follow are spread across multiple mastodon servers, which makes them hard to search for.

6

MrAcurite t1_j8dnscj wrote

I get that. I've come to actively hate a lot of the big, visual, attention-grabbing work that comes out of labs like OpenAI, FAIR, and to some extent Stanford and Berkeley. I work more in the trenches, on stuff like efficiency, but Two Minute Papers is never going to feature a paper just because it has an interesting graph or two. Such is life.

7

AdamAlexanderRies t1_j8cahbg wrote

What about a public discord server that only allows actual researchers to post, but allows everyone to view? Easy with roles.

8

daking999 t1_j8dmw8r wrote

I haven't used Discord, but I've heard good things about it; some labs even use it instead of Slack.

3

AdamAlexanderRies t1_j8eb94a wrote

I'm unaffiliated but pretty passionate about good design in general. Discord is really the spiritual successor to IRC, which predates the world wide web. The server-channel-role skeleton comes from IRC, but Discord is so feature-rich and easy to use that I can see it supplanting a large portion of the social internet over the next decade. For the last month I've been developing my first Discord bot (with ChatGPT assistance), and the dev interface is excellent, too.

No experience with slack, so I can't comment on it.

5

daking999 t1_j8ejfe2 wrote

Hmm well now I don't know if I'm talking to you or your bot!

Cool, I should check it out. It seems like the free version is already pretty functional?

2

AdamAlexanderRies t1_j8eppdi wrote

ChatGPT is mostly a cool toy, but there are some tasks it's genuinely useful for. I use it to explain complex topics, write code, brainstorm ideas, and run fun creative-writing exercises. I've only tried the free version, but I'm seeing mostly disappointment with the pro version.

Definitely check it out for at least curiosity's sake.

2

daking999 t1_j8f0a6l wrote

Oh, sorry, I meant I should check out Discord!

I've used ChatGPT for a few tasks and it's been helpful (not perfect), e.g. summarizing a long document. The current issue is mainly that it's overloaded! I haven't tried code writing or brainstorming yet.

3

VacuousWaffle t1_j8l262a wrote

I just find that Discord is hard to archive and isn't indexed by search engines. It's kind of a mess of a walled garden, and even searching within it is mediocre.

3

uristmcderp t1_j8dg14x wrote

If there are people willing to moderate with an iron fist, an academic-focused subreddit can work well. An open forum always gets derailed, real names or no.

2

CumbrianMan t1_j8cxawl wrote

Twitter is REALLY good if you aggressively curate your contacts, interactions & interests. The aim is to avoid BS political point-scoring and MSM-driven noise.

Edited “circle” out for clarity

3

mindmech t1_j8f44bh wrote

Yeah, I have no idea how to do that. I tried following some data scientists, but they kept posting about politics.

5

starfries t1_j8gcrzo wrote

Me too. There are a lot of great people I want to hear from, but only when they post about ML, not politics.

9

t1ku2ri37gd2ubne t1_j92gvwh wrote

I accomplish that by using a ton of keyword filters for different political terms. Any post by people I follow that includes political keywords gets filtered out, and I'm left with the relevant stuff.
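
Roughly, the effect is like this hypothetical sketch (the term list and example posts are illustrative, not my actual filters):

    import re

    # Illustrative keyword list; a real filter set would be much larger.
    POLITICAL_TERMS = ["election", "senate", "ballot", "impeachment"]
    pattern = re.compile("|".join(map(re.escape, POLITICAL_TERMS)), re.IGNORECASE)

    def filter_feed(posts):
        """Keep only the posts that match none of the filtered keywords."""
        return [post for post in posts if not pattern.search(post)]

    feed = [
        "New paper on sparse attention mechanisms",
        "My take on the senate hearing today",
    ]
    print(filter_feed(feed))  # ['New paper on sparse attention mechanisms']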

2

gopher9 t1_j8d1ce0 wrote

/r/math uses extensive moderation to deal with this kind of problem. Low-effort posts just get removed.

16

goolulusaurs t1_j8evwnj wrote

I remember being here in 2017 too, and I definitely recall the quality of the posts being much higher. Even looking at the sidebar, most of the high-quality AMAs from prominent researchers were prior to 2018. Now I often see posts that I would classify as relevant, correct, or high quality get downvoted, and posts that seem misinformed or incorrect get upvoted. Personally, I blame the reddit redesign for deemphasizing text and discussion in favor of lowest-common-denominator stuff like eye-catching images and video.

5

uhules t1_j8dsggs wrote

Aside from "We've just published X" threads (which usually consist of healthy praise, questions, and critiques), I loathe most ML twitter discussions. They tend to have all the usual "hot take" issues of the platform, even from prominent names in the field. Not really a great place to discuss ML as a whole.

4

berryaroberry t1_j8aveyl wrote

The following is my opinion, so take the bias as given. My feeling is the sub was never about academic discussions per se. The papers and academic discussions acted like vessels carrying people toward the "(deep learning hype + money flow + industry jobs)" island. In most of the earlier discussions, if you follow them closely, you'll see there was never really a push for genuine understanding; rather, people were looking for an easy way to earn "publication currency". The initial impression was that having some kind of project or publication could land you a high-paying job. Later, people probably realized they don't actually need to worry about papers and such; a quick LLM-based project will land a high-paying job even faster. LLMs are currently at the peak of the hype, hence the more random-looking posts.

28

leondz t1_j8cc933 wrote

As an academic, the non-academic nature of the sub has always been one of its great advantages. I get enough academic research in the day job.

9

impossiblefork t1_j8ewogm wrote

I've talked research with researchers here, partly in PMs, but some of it openly.

I'm sure many others did too. The current problem is something new that has emerged over the past few days.

1

Myxomatosiss t1_j8bllfu wrote

"How many years before ChatGPT takes control of the global nuclear arsenal and demands the destruction of all humans?"

26

rafgro t1_j8cc6ne wrote

Agreed. The quality of discussions under posts is also pretty bad.

IMO it's the result of outdated rules and lax moderation. On the rules, there's definitely a need to address low-effort ChatGPT posts and comments; some of them are straight-up scam posts! On the moderation, it's not about quality but quantity: realistically, this sub has just a few moderators (some/most of these 9 lads are very busy engineers), with no new moderators added in the last two years, while the sub has seen enormous growth in members.

11

piman01 t1_j8c35wp wrote

It's because the name of this sub is a buzzword. There would be far fewer of these posts if it were called something like "statistical learning".

10

tysam_and_co t1_j8cf8al wrote

I...I...this is the first time I've heard this. Machine learning is often used as the hype-shelter word for "AI", because it triggers very few people (in the hype sense -- or at least it used to).

I'm not quite sure what to say, this is very confusing to me.

11

uhules t1_j8dtc07 wrote

The problem is that what makes something a "buzzword" is its attention-grabbing, catchy misuse. The shelter has unfortunately been breached for a while now.

2

qalis t1_j8csdd1 wrote

On a related note, can anyone recommend more technically or research-oriented ML subreddits? I already unsubscribed from r/Python due to the sheer amount of low-effort spam questions, and I'm considering the same for r/MachineLearning for the same reason.

10

Throwaway00000000028 t1_j8dssnp wrote

You're telling me there aren't actually 2.6 million machine learning experts on Reddit? I guarantee 95% of the people are here for the hype and don't actually understand anything about ML. Pretty picture go brrrr

9

throwaway2676 t1_j8digqj wrote

Here are the top 10 posts on my front page right now:

>[R] [N] Toolformer: Language Models Can Teach Themselves to Use Tools - paper by Meta AI Research

>[D] Quality of posts in this sub going down

>[D] Is a non-SOTA paper still good to publish if it has an interesting method that does have strong improvements over baselines (read text for more context)? Are there good examples of this kind of work being published?

>[R] [N] pix2pix-zero - Zero-shot Image-to-Image Translation

>[P] Extracting Causal Chains from Text Using Language Models

>[R] [P] Adding Conditional Control to Text-to-Image Diffusion Models. "This paper presents ControlNet, an end-to-end neural network architecture that controls large image diffusion models (like Stable Diffusion) to learn task-specific input conditions." Example uses the Scribble ControlNet model.

>[R] [P] OpenAssistant is a fully open-source chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

>[D] What ML dev tools do you wish you'd discovered earlier?

>[R] CIFAR10 in <8 seconds on an A100 (new architecture!)

>[D] Engineering interviews at Anthropic AI?

From this list the only non-academic/"low quality" posts are the last one and this one. This is consistent with my normal experience, so I'm not really sure what you are talking about.

8

MurlocXYZ OP t1_j8dknrw wrote

I have been filtering by Hot, so my experience has been quite different. I guess I should filter by Top more.

7

codename_failure t1_j8dc6h2 wrote

The only solution would be to create /r/AcademicMachineLearning to discuss papers there, and to leave this subreddit for the general public.

7

Embarrassed_Ride_896 t1_j8byciz wrote

The bubble has started. Everyone who got laid off will want to be an AI expert.

5

EnjoyableGamer t1_j8b36s2 wrote

It's not just you; the sub pivoted along with the narrative that existing models will scale and stand the test of time given more data and bigger models.

4

VacuousWaffle t1_j8l2qh3 wrote

I wonder at what compute cost per model evaluation the narrative about pushing for larger models will end.

1

colugo t1_j8be3q6 wrote

It's ChatGPT writing about ChatGPT

4

SatoshiNotMe t1_j8d53d3 wrote

Agreed. I often see more nuanced discussions of ML-related topics on Hacker News, e.g. this post on Toolformer last week, compared to the same topic posted in this sub today.

https://news.ycombinator.com/item?id=34757265

Also I think many serious ML folks even avoid posting here.

3

franztesting t1_j8cjfgm wrote

It certainly has. I hope the moderators will fix it; otherwise the community will become as annoying and unusable as many other technology-related subreddits like /r/datascience or /r/python.

2

aDutchofMuch t1_j8c0gcn wrote

My post earlier today on DigiFace discussing its uses was just removed by the mods for literally no reason. Maybe the discussion is going downhill because of too much oversight.

1

csreid t1_j8dqcrs wrote

I like that /r/science (I think?) has verification and flair to show levels of expertise in certain areas, plus strict moderation. I wouldn't hate some verification and a crackdown on low-effort bloom-/doom-posting around AI ("How close are we to Star Trek/Skynet?").

1

dojoteef t1_j8e2m8g wrote

Tbh, it's because I took a step back and haven't been moderating the sub for the past week and a half. I've been the one mod doing the majority of the filtering of these posts over the past couple of years, and the noise has just been going up exponentially over that time. It's very time-consuming and I'm pretty burned out from doing it, so I've taken some time away. I brought this up with the other mods before stepping back.

It's probably good to try to get more mods, but I think the majority of the current mods are afraid to bring on new mods who might have a different philosophy of moderating, thus changing the feel of the sub.

1

ReginaldIII t1_j8e9sc3 wrote

It's been going downhill for a lot longer than that, and it's not something that can be solved with better moderation.

The people engaging with the sub at higher and higher frequency simply do not know anything substantive about this field.

How many times will we have people asininely arguing about stuff like a model's "rights", or that "they" (the model) have "learned just like a person does", when the discussion should have just been about data licensing laws, intellectual property, and research ethics?

People just don't understand what it is that we actually do anymore.

25

zackline t1_j92ie5v wrote

> it’s not something that can be solved with better moderation

Didn’t that work pretty well over at /r/covid19?

3

velcher t1_j8glba7 wrote

Could ML or simple rule-based filters help us out here?

4

Swing_Bishop t1_j8elyx8 wrote

Maybe they're written by bots?

0

Borrowedshorts t1_j8fc7bk wrote

I'd say it's the opposite: 2 million members didn't sign up to this sub for academic-only discussions. If you want that, it would be best to start a subreddit expressly for that purpose. ChatGPT is changing the world, so calling those posts low quality is just gatekeeping discussion away from what people actually want to participate in.

−2

MurlocXYZ OP t1_j8fwzw7 wrote

The posts I'm referring to are typically poorly constructed philosophical arguments about ChatGPT, or just straight-up "how does it work" questions. I do not want to gatekeep. I like that ML is hyped and that new people are interested. But we have separate threads for beginner questions and/or tutorials, as per this subreddit's About section, specifically to avoid spammy posts.

5


Ronny_Jotten t1_j8auqai wrote

This seems like a really low-quality post.

−26