TeamRocketsSecretary t1_j97xsud wrote

Reply to comment by pyepyepie in [D] Please stop by [deleted]

Why overparameterized networks work at all is still an open theoretical question, but the fact that we don't have the full answer doesn't mean the weights are performing "human-like" processing, just as the gaps in pre-Einstein classical mechanics didn't make the corpuscle theory of light any more valid. You all just love to anthropomorphize everything, and the amount of metaphysical mental snake oil that ChatGPT has generated is ridiculous.

But sure. ChatGPT is mildly sentient 🤷‍♂️


TeamRocketsSecretary t1_j93os17 wrote

Reply to comment by kromem in [D] Please stop by [deleted]

Look, if you think the dismissals are increasingly obsolete, it's because you don't understand the underlying tech: autocomplete isn't autoregression, and autoregression isn't sentience. Your fake example isn't even a good one.

To suggest that it's performing human-like processing of emotions because the internal states of a regression model resemble some notion of intermediate mathematical logic is ridiculous, especially in light of research showing that these autoregressive models struggle with symbolic logic. If you favor that type of discussion, I'm sure there's a philosophy/ethics/metaphysics-focused sub where you can have it. Physics subs suffer from the same problem, especially with anything quantum or black-hole related, where non-practitioners pose absolutely insane thought experiments. That you even think these dismissals of ChatGPT are "parroted" shows your bias, and like I said, there's a relevant sub where you can mentally masturbate over that, but this sub isn't it.
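For what it's worth, "autoregression" just means conditioning each prediction on the tokens generated so far. A minimal sketch of the decoding loop, using a toy bigram lookup table as a stand-in for the model (the table and token names are made up for illustration):

```python
# Minimal autoregressive decoding loop: each step feeds the model's own
# output back in as context. The "model" here is a toy bigram table, not
# a neural network; the point is only the shape of the loop.
TOY_BIGRAMS = {
    "the": "cat",
    "cat": "sat",
    "sat": "on",
    "on": "the",
}

def generate(prompt, steps):
    """Greedy autoregressive generation from a toy bigram 'model'."""
    tokens = prompt.split()
    for _ in range(steps):
        # Condition only on the most recent token (a bigram context);
        # a real LLM conditions on the whole prefix.
        next_token = TOY_BIGRAMS.get(tokens[-1], "<eos>")
        if next_token == "<eos>":
            break
        tokens.append(next_token)
    return " ".join(tokens)

print(generate("the", 4))  # the cat sat on the
```

A real LLM swaps the lookup table for a learned next-token distribution, but the loop itself is the same: predict, append, repeat.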


TeamRocketsSecretary t1_j8z7pqs wrote

Given your reply, I'm unsure why you would want to be able to follow the proofs, then.

Some of the proofs in optimization are particularly rough, so if you want to understand them, the only way is to wade through a textbook or, at the very least, online lecture videos plus slides.


TeamRocketsSecretary t1_j8ywdem wrote

Lol dude, you wanna learn optimization; the details and the length are what make the subject. If you want a high-level overview, read a blog post. At the very least, find an online offering of an optimization course with lecture videos, and watch those and read the slides if you can't be bothered to open a textbook.

All these low-effort posts in this sub from people just looking to cut corners are depressing.


TeamRocketsSecretary t1_j3fmzxw wrote

Fusion of LLMs and vision models is something I'm noticing more work on. Also embodied feedback with a human in the loop, especially for robotics applications. The vision field definitely seems to be co-opting language models, and there is research on making inference with them faster and on bringing recurrence back into the transformer (recurrent transformers), which is interesting since transformers superseded recurrent models once the power of attention came to light.
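A rough sketch of the common fusion pattern, where a learned linear projector maps vision-encoder features into the language model's embedding space so image "tokens" can be prepended to the text sequence. All dimensions and names here are invented for illustration, not taken from any specific system:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: a vision encoder emitting 512-d patch features,
# a language model with a 768-d token embedding space.
VISION_DIM, LM_DIM = 512, 768

# In a real system this projection matrix would be learned; here it is
# random, since only the wiring is being illustrated.
W_proj = rng.normal(size=(VISION_DIM, LM_DIM)) * 0.02

def fuse(image_feats, text_embeds):
    """Project image patch features into the LM embedding space and
    prepend them to the text token embeddings."""
    image_tokens = image_feats @ W_proj               # (n_patches, LM_DIM)
    return np.concatenate([image_tokens, text_embeds], axis=0)

image_feats = rng.normal(size=(16, VISION_DIM))       # 16 image patches
text_embeds = rng.normal(size=(5, LM_DIM))            # 5 text tokens
seq = fuse(image_feats, text_embeds)
print(seq.shape)  # (21, 768)
```

The downstream transformer then attends over the combined sequence, which is why the language model ends up "seeing" the image at all.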

There is also a lot of work to be done on using them in mission-critical applications (healthcare), as well as on "robustifying" them (transformers over raw byte sequences show much more robustness to noise).

So I guess LLMs, which were made for native NLP tasks, are increasingly being used for non-NLP tasks, especially now in reinforcement learning.