
comefromspace t1_j1z64ux wrote

Language is syntax, and LLMs excel at it. I think it's interesting to note that GPT improved after learning programming, because programming languages follow exact syntactic rules, which are rules of symbol manipulation. But it seems those rules also transfer well to ordinary language, which is much fuzzier and more ambiguous. Transformers do seem to be exceptional at capturing syntactic relationships without necessarily knowing what it is they are talking about (so, abstractly). And math is all about manipulating abstract entities.

I think symbol manipulation is something that transformers will continue to excel at. After all, it's not that difficult - Mathematica does it. The model may not understand the consequences of its inventions, but it will definitely be able to come up with proofs, models, theorems, physical laws, etc. If the next GPT is multi-modal, it seems it might be able to reason about its sensory inputs as well.
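To make the "Mathematica does it" point concrete, here is a minimal sketch of rule-based symbol manipulation, using SymPy as an open-source stand-in for Mathematica (the expressions are arbitrary examples, not anything from this thread):

```python
# Minimal sketch: rule-based symbol manipulation with SymPy,
# the kind of mechanical rewriting Mathematica performs.
# (SymPy chosen as an open-source stand-in; expressions are arbitrary examples.)
import sympy as sp

x, n = sp.symbols("x n")

# Differentiation: applying rewrite rules (product rule, chain rule) to a symbol tree
expr = sp.sin(x) * sp.exp(x)
print(sp.diff(expr, x))             # exp(x)*sin(x) + exp(x)*cos(x)

# Integration and summation: more rule application, no "understanding" required
print(sp.integrate(expr, x))        # exp(x)*sin(x)/2 - exp(x)*cos(x)/2
print(sp.summation(n, (n, 1, 10)))  # 55

# Simplification: rewriting an expression to a normal form
print(sp.simplify((x**2 - 1) / (x - 1)))  # x + 1
```

Every step above is pure pattern matching and rewriting over a tree of symbols, which is the sense of "symbol manipulation" meant here.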

−2

seventyducks t1_j1zcf8m wrote

>Language is syntax

Language is much more than syntax; if "language is pure syntax" is your starting point, then it's not really a conversation worth having, IMO.

11

alsuhr t1_j1zftk4 wrote

Language is an action we take to achieve some short- or long-term intent by affecting others' actions. It just so happens that text data is (mostly) symbolic, so it looks like a pure symbol-manipulation problem. The text these models are trained on consists of observations of language production, where utterances are generated from intent (e.g., wanting to convince someone of an argument, wanting to sell something to someone) and context (e.g., what you know about your interlocutor). This doesn't even cover vocal / signed communication, which is much more continuous.

Intent and context are not purely symbolic. Sure, with infinite observations, that generative structure would be perfectly reconstructible. But we are nowhere near that, and humans are completely capable of modeling that generative process with very little data and continuous input (which we learn to discretize).
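A toy sketch of the generative process described above (all intents, contexts, and templates here are hypothetical illustrations, not real data): latent intent and context produce an utterance, but a corpus records only the surface text.

```python
# Toy sketch: latent intent and context generate an utterance,
# but a text corpus keeps only the utterance itself.
# All intents, contexts, and templates are hypothetical illustrations.
import random

INTENTS = ["persuade", "sell", "inform"]
CONTEXTS = ["stranger", "friend"]

TEMPLATES = {
    ("persuade", "stranger"): "Studies show you should reconsider.",
    ("persuade", "friend"): "Come on, you know I'm right.",
    ("sell", "stranger"): "This product will change your life.",
    ("sell", "friend"): "Honestly, you'd love this thing.",
    ("inform", "stranger"): "The meeting is at 3 pm.",
    ("inform", "friend"): "Hey, meeting got moved to 3.",
}

def produce_utterance():
    intent = random.choice(INTENTS)    # latent: why the speaker speaks
    context = random.choice(CONTEXTS)  # latent: what they know about the listener
    return intent, context, TEMPLATES[(intent, context)]

# The "corpus" discards intent and context, keeping only surface text.
corpus = [produce_utterance()[2] for _ in range(5)]
print(corpus)
# A model trained on `corpus` sees symbols but must infer the latent
# variables that generated them -- trivial here, much harder for real language.
```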

10

maxToTheJ t1_j1zq0fa wrote

> Intent and context are not purely symbolic.

Yup. That's why reasoning comes in, and it's what makes what Demis from DeepMind said make sense.

2

madnessandmachines t1_j1zkd85 wrote

Just want to reiterate: if you think language is just syntax, I'd recommend listening to some linguistics lectures or reading a book or two on the subject (i.e., books on language, not on syntax or grammar). John McWhorter has some very approachable and eye-opening Audible courses that might change your perspective.

5

comefromspace t1_j1zm1fv wrote

I am aware of some of the philosophy of language, but I prefer to look at the neuroscientific findings instead. Language is a human construct that doesn't really exist in nature - communication does, which in humans is the exchange of mental states between brains. The structure of language follows from abstracting the physical world into compact communicable units, and syntax is a very important byproduct of this process. I am more interested in seeing how the hierarchical structure of language arises in computational models like LLMs, which are open to empirical investigation. Most folk linguistic theories are largely conjecture with only circumstantial evidence.
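For what "open to empirical investigation" can look like in practice: one common approach is a linear probe on a model's hidden states, in the spirit of Hewitt & Manning's (2019) structural probes. A rough sketch, assuming the HuggingFace `transformers` library, scikit-learn, and the public `gpt2` checkpoint; the sentence and its hand-assigned tree depths are illustrative only:

```python
# Rough sketch of an empirical probe: do an LM's hidden states encode
# syntactic depth? In the spirit of Hewitt & Manning (2019).
# Assumes `transformers`, `scikit-learn`, and the public gpt2 checkpoint;
# the toy sentence and hand-assigned dependency depths are illustrative only.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import Ridge

tokenizer = AutoTokenizer.from_pretrained("gpt2", add_prefix_space=True)
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)

sentence = "The dog that chased the cat barked"
# Hypothetical hand-assigned depth of each word in a dependency tree
depths = [2, 1, 3, 2, 4, 3, 0]  # "barked" is the root (depth 0)

enc = tokenizer(sentence.split(), is_split_into_words=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**enc).hidden_states[8][0]  # one middle layer: (tokens, dim)

# Align subword vectors to words (take the first subword of each word)
word_ids = enc.word_ids()
first_idx = [word_ids.index(i) for i in range(len(depths))]
X = hidden[first_idx].numpy()

# Linear probe: if depth is linearly decodable, hierarchy is at least encoded
probe = Ridge().fit(X, depths)
print("probe predictions:", probe.predict(X).round(2))
# Real studies fit the probe on a treebank and evaluate on held-out sentences.
```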

−4

madnessandmachines t1_j1zok1s wrote

Linguistics is a field of study and analysis, not philosophy. And I am specifically talking about exploring the anthropological and ethnographic study of language, which is where you might lose many of your assumptions. The ways different languages work, and how they change over time, are relevant to anyone working in NLP.

I would argue the number one fallacy of modern LLM design is people disregarding all we have come to know about language in favor of just hoping something interesting will emerge when we throw billions of parameters at it.

7

madnessandmachines t1_j1zousp wrote

Also, "the structure of language follows from abstracting the world into compact communicable units" is itself a "folk theory" of language. Many supposedly neuroscientific theories of language are little more than conjecture based on assumptions.

3

comefromspace t1_j23hsyi wrote

It is a conjecture that can be tested, however, starting with artificial networks. I don't think it's a folk theory, because it's not mainstream at all.

1