
dookiehat t1_j2ylyao wrote

OP, you should check out the Machine Learning Street Talk YouTube channel and watch whatever videos interest you. It is an incredible resource featuring top minds in the field from all around the world. The only caveat is that I wouldn't start with the Noam Chomsky episode. It's a great episode, but a bad introduction to the series, because so much of it is spent explaining the unfortunate technical issues they faced in fixing a messed-up recording.

I'll tell you what I think you are asking, then answer. FWIW, I am NOT an ML engineer. I think you are asking: "can machine learning models, with current technological limits, outperform humans at any given task when the data is possibly compromised by human error? And what about in the future?"

Firstly, self-supervised learning approaches are apparently beginning to outperform human-guided models on many tasks. I'm referring to isolated cognitive tasks that operate within a VERY limited set of specified parameters. Take diffusion image generation: there are perhaps 60 to 80 settings you can tweak before you even get to token processing (processing the prompt; a token is roughly a word or a fragment of a word). My point here is that these models learn the statistics of their training data, so even though the data is not cleaned, the really poorly fitting data will be reflected in the output only to the extent that it is an outlier. Low-quality things are output less often, because within the context of the model, the model may "understand" that this is not a desired output.
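
To give a concrete sense of what I mean by "settings," here is a rough sketch using the Hugging Face diffusers library; the model name and the values are just illustrative, not recommendations:

```python
# Rough sketch of a diffusion image generation call and a few of the
# many knobs you can tweak (illustrative values only).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # example model; swap in whatever you actually use
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="a watercolor painting of a lighthouse at dusk",
    negative_prompt="blurry, low quality",  # steer away from poorly fitting outputs
    num_inference_steps=30,                 # number of denoising steps
    guidance_scale=7.5,                     # how strongly to follow the prompt
    height=512,
    width=512,
).images[0]

image.save("lighthouse.png")
```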

Second, while your question is basic on the surface, it is actually still a major subject of debate. My personal opinion is that there are major structural components and AI architectures that still need to be created to imitate human conscious thinking, executive function, attention, strategizing, and possibly desire and more nuanced reward systems. These must all be integrated in such a way that, when models train themselves on data, they can discover best practices for multi-step, multimodal, and other cognitive approaches. Only then will using them be as intuitive as talking to another person, explaining what you want, and getting the desired result.

While transformers (an AI architecture introduced in 2017) are powerful, appear to learn data and its semantic structure, and can outperform humans at many tasks, there is still something "dumb" about these models: they are highly specialized. This is why the datasets are absolutely ginormous, and yet, if you take the models outside their specialty, they have no clue what is going on.
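
If you want a feel for the core mechanism, here is a bare-bones sketch of the scaled dot-product attention at the heart of a transformer (just the math on toy data, none of the surrounding machinery):

```python
# Minimal scaled dot-product attention, the core operation inside a transformer.
import numpy as np

def attention(queries, keys, values):
    # queries, keys, values: (sequence_length, d_model) arrays
    d_k = keys.shape[-1]
    scores = queries @ keys.T / np.sqrt(d_k)           # similarity of each token to every other token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over the sequence
    return weights @ values                            # each output is a weighted mix of the values

# Toy example: 4 tokens, 8-dimensional embeddings.
tokens = np.random.randn(4, 8)
out = attention(tokens, tokens, tokens)  # self-attention: queries, keys, values all come from the same tokens
print(out.shape)  # (4, 8)
```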

There are actually some interesting counterpoints to this. For instance, Google's Imagen, a diffusion image generator, has (I believe) one of the largest parameter counts of any image generation model. What is particularly interesting is that even though the model is trained on images, it has seen so much text inside those images that it has learned to spell spontaneously. You can ask for an image of a guy holding up a sign that says "a computer wrote this sign" and it will render each letter in order so the sign reads as requested.

While that is incredibly interesting, datasets will eventually approach the size of the internet itself while the resulting models are still only able to do simple tasks like this. As a tiny human, I didn't need to absorb the entire world to learn about its generalities. Ultimately, I think that is the answer you are looking for.

I personally believe that consciousness has to do with integrated multimodal information processing, and that intelligence is simply a brain having ample raw data stored in memory, but stored in extremely efficient, compressed ways, structured from general to specific. It is less that the information is stored as discrete facts, and more that it is stored in configurations of layers of processing, so that many different concepts can be represented within those spaces at any given time.

One strong piece of support for this idea is what human attention actually is. I think attention is less an accessing of raw data and "focusing" on that data than it is a holistic reconstruction of a concept from multiple constituent components. This is why it would be nearly impossible for a person to think in great depth about two very different and unrelated concepts simultaneously. It is also why metaphors and analogies are such powerful explanatory devices for humans: a metaphor takes two simple but seemingly unrelated concepts and makes them fit together in a way that makes sense within the given context. We understand that a metaphor is not literal, which is the only reason it works, and is also why even high-functioning autistic people may have difficulty with metaphors: their "attention" in sensory processing, and therefore in concept representation, is less narrowly focused, so the metaphor gets processed as a broad, heavy multimodal concept that is hard to parse because it is taken as literal data.

My point, though, is that current machine learning models, while they have layers of processing, still lag in general, broad intelligence because they do not have multiple integrated data types and models consulting with one another to form more generalized notions of concepts. They only hold these concepts in one particular form of data at a time, which in turn makes them error-prone no matter the data.

I don't think bad data is the problem so much as the missing context that would allow the model to understand what is and isn't bad data.
