Submitted by adventurousprogram4 t3_zyiib1 in MachineLearning

Has the research community embraced any of the frameworks or findings published by Anthropic at all? Google Scholar seems to indicate no, but I'm curious. I work on the applied side and not on the research side, so I don't have a good sense for how influential their work on interpretability is.

The motivation for my question is that they have a huge amount of funding (although how long that will last after SBF's downfall remains to be seen), plenty of press attention, and plenty of fans in the rationalist/EA communities, but my feeling is that their work is largely not being adopted or cited in AI research. If I'm right about that, is it because the work is seen as unoriginal, incorrect, or misguided? Or is something else going on?

41

Comments

Ready-Farmer7451 t1_j2620xk wrote

The research is fine.

  1. LLM research is really new.
  2. LLM research is in general a bit shallow and not that interesting.
  3. The ability to do large-scale LLM research is limited to a few labs, so there aren't many that could cite them. The few labs that can tend to focus on their own work rather than the work of others.
24

AGI_aint_happening t1_j26847d wrote

As a former interpretability researcher who has skimmed their work but not read it closely, I just don't find it terribly interesting or novel. Also, frankly, I find the writing style for the papers pretty hard to parse (as they don't follow standard paper formats) and a tad grandiose, as they tend to avoid standard things like comparing against other methods or citing other work. Relatedly, I think their choice to avoid peer review has impacted how people perceive their work, and limited its distribution.

48

veejarAmrev t1_j2724im wrote

As you said, it has something of a cult following in the EA community. Outside of that, no one bothers. They haven't done anything significant enough to be of value to the community.

20

thejaminator t1_j286yqs wrote

I think it's more that they're still pretty new and comparatively unknown.

They have done good work, like releasing their paper and dataset for training an assistant model with RLHF: https://github.com/anthropics/hh-rlhf

You won't get a dataset like that from OpenAI. It's useful for anyone who wants to experiment with RLHF on LLMs, which matters given how much success OpenAI is having with RLHF in InstructGPT and ChatGPT.
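For anyone who wants to poke at it, here's a minimal sketch of loading the preference pairs, assuming the Hugging Face mirror Anthropic/hh-rlhf is available (the GitHub repo ships the same records as gzipped JSONL if you'd rather read those directly):

```python
# Minimal sketch: load the HH-RLHF preference data.
# Assumes the Hugging Face mirror "Anthropic/hh-rlhf" exists; otherwise the
# GitHub repo linked above provides the same data as gzipped JSONL files.
from datasets import load_dataset

hh = load_dataset("Anthropic/hh-rlhf")  # splits: "train" and "test"

example = hh["train"][0]
# Each record pairs two full dialogues sharing the same human prompt:
# "chosen" is the response annotators preferred, "rejected" is the alternative.
print(example["chosen"][:300])
print(example["rejected"][:300])
```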

14

KvanteKat t1_j291xw0 wrote

I'm not sure reading LessWrong will necessarily dissuade someone who is already a bit sceptical of the rationalist/EA community from believing that there is something culty going on. One of the things that really rubbed me the wrong way about that blog back in the day (I'll be up front and say I haven't kept up with it for the past 10 years) was exactly how insular a lot of the writing was, and how little it seriously engaged with existing literature and research, in favor of reinventing the wheel and relying on a private vocabulary that nobody else working in similar fields used. As an example, Yudkowsky is far from the first person to promote naive Bayesianism (basically the idea that if you get good enough at applying Bayes' rule, you will have solved the problem of induction), but if you only read his blog back then, you could easily come away believing he was doing groundbreaking work on the topic, which was far from the case.

10

nic001a t1_j2941zb wrote

Not an expert, but wishing you the best of luck!

−4

papajan18 t1_j29cxmw wrote

Chris Olah's work is very solid, actually some of the best interpretability work I've seen. I haven't heard of anyone else there in particular.

6

Hyper1on t1_j2dzz01 wrote

A bit early to say, but I'd be willing to bet that most of their major papers this year will be widely cited. Their work on RLHF, including Constitutional AI and HH, seems particularly likely to be picked up by other industry labs, since it provides a way to improve LLMs deployed in the wild while reducing the cost of collecting human feedback data.

2

Flag_Red t1_j2ek25d wrote

I, personally, don't consider LessWrong a cult (I lurk the blog, and have even been to an ACX meetup). There's definitely a very insular core community, though, which regularly gets caught up in cults of personality. Yudkowsky is the most obvious person to point to here, but Leverage Research is the best example of cult behaviour coming out of LessWrong and the EA community, IMO.

With regard to machine learning in particular, there are some very extreme views about the mid- and long-term prospects of AI. Yudkowsky himself explicitly believes humanity is doomed and that AI will take over the world within our lifetimes.

3