jamesj t1_jbgsbr5 wrote

This is puzzling if you think natural selection acts on the level of organisms, but it is completely explained (along with other altruistically motivated actions) if you think that natural selection acts on the level of genes (selfish gene theory).


jamesj t1_ja4aicc wrote

If you are worried about this (which I think is totally valid) then spend time learning and using the new AI tools. Someone who can keep up with all the changes and know which tools help with which problems will be super valuable over the coming years. So right now, use copilot, stable diffusion, and chatgpt. Learn python, colab notebooks, and HuggingFace. There's so much cool stuff to learn about and use.


jamesj OP t1_ja3weuo wrote

I agree problems of bias and privacy are real and important, but your claim about what anyone in ML believes just isn't true, and the article goes into some depth about it. Experts in machine learning collectively give a 50% probability of AGI by 2061, with huge differences in their individual estimates. Almost all of them say it will happen in the next 75 years.

If experts were saying there was a 90% chance an asteroid would hit the earth in the next 75 years, would you claim we shouldn't start working on a solution now?


jamesj OP t1_ja2acnm wrote

Hey I appreciate the time to engage with the article and provide your thoughts. I'll respond to a few things.

>The first two elements of that is the definition for any model, which is exactly what both AI and deterministic regression algorithms all do.

Yes, under the framework used in the article, an agent using linear regression might be a little intelligent. It can take past state data and use it to make predictions about the future state, and use those predictions to act. That would be more intelligent than an agent which makes random actions.
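To make that minimal case concrete, here's a toy sketch of a "slightly intelligent" agent: it compresses a history of states into two parameters, predicts the next state, and acts on the prediction. Everything here (the trend, the threshold, the policy) is a hypothetical example, not from the article:

```python
# A toy agent: fit a linear model to past 1-D states, predict the next
# state, and choose an action based on that prediction.
import numpy as np

rng = np.random.default_rng(0)

# Past states follow a simple trend plus noise: s_t = 2*t + noise.
t_past = np.arange(10.0)
s_past = 2.0 * t_past + rng.normal(0, 0.1, size=10)

# "Compress" the ten observations into two parameters (slope, intercept).
slope, intercept = np.polyfit(t_past, s_past, deg=1)

# Predict the next state and act on the prediction (hypothetical policy).
s_next_pred = slope * 10.0 + intercept
action = "brake" if s_next_pred > 19.0 else "coast"
print(round(s_next_pred, 1), action)
```

An agent picking actions at random has no such model, so this one should do better whenever the trend persists, which is all the framework's minimal notion of intelligence requires.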

>I'm not saying it's a bad paper or theory, but that this essay doesn't really justify why it brings it up so much

Yes, that is a fair point. I was worried that spending more time on it would have made the article even longer than it already was. But one justification is that it is a good, practical definition of intelligence: it demystifies intelligence by specifying what kind of information processing must be taking place. It is built off of information theory work on information bottlenecks, and is directly related to the motivation for autoencoders.

>The problem is that Schmidhuber 2008 only exists as a preprint and later as a conference paper -- it was never peer-reviewed.

The paper isn't an experiment with data; it was first presented at a conference to put forward an interpretation. It's been cited 189 times. I think it is worth reading; the ideas can be understood pretty easily. But it isn't the only paper that discusses the connection between compression, prediction, and intelligence. Not everyone talks in the language of compression; they may use words like elegance, parameter efficiency, information bottlenecks, or whatever, but we are talking about the same ideas. This paper has some good references; it states, "Several authors [1,5,6,11,7,9] have suggested the relevance of compression to intelligence, especially the inductive inferential (or inductive learning) part of intelligence. M. Hutter even proposed a compression contest (the Hutter prize) which was “motivated by the fact that being able to compress well is closely related to acting intelligently”."
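The compression-prediction link those papers lean on can be seen in a toy sketch: by Shannon's source coding results, a model that assigns probability p to the next symbol can (via arithmetic coding) encode it in about -log2(p) bits, so a better predictor directly means a shorter code. The data and probabilities below are made-up assumptions for illustration:

```python
# Better prediction = better compression: compare the code length a
# pattern-aware predictor achieves on regular data vs a uniform coder.
import math

text = "abababababababab"  # highly regular data

def p_pattern(prev, ch):
    # A predictor that has "learned" the alternating pattern
    # (the 0.99/0.01 probabilities are illustrative, not fit from data).
    expected = "b" if prev == "a" else "a"
    return 0.99 if ch == expected else 0.01

# Ideal code length under each model, in bits.
bits_pattern = sum(-math.log2(p_pattern(text[i - 1], text[i]))
                   for i in range(1, len(text)))
bits_uniform = (len(text) - 1) * 1.0  # 1 bit/symbol for a 2-symbol alphabet

print(f"{bits_pattern:.1f} vs {bits_uniform:.1f} bits")
```

The pattern-aware model needs a fraction of a bit for the whole string where the uniform coder needs 15 bits, which is the sense in which compressing well and predicting well are the same skill.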

>The equation E = mc2 For the newbies out there, this is what's called a red flag.

I was trying to use an example that people would be familiar with. All the example is pointing out is that the equations of physics are highly compressed representations of the data of past physical measurements, which allow us to predict lots of future physical measurements. That could be said of Maxwell's equations or the Standard Model or any successful physical theory. Most physicists prefer more compressed mathematical descriptions, though they would usually call them more elegant rather than use the language of compression.

>This is completely the wrong way to think about it if you're trying to understand these things, so I hope he actually knows this.

I don't think it is wrong to say that what the transformer "knows" about the images in its dataset has been compressed into its weights. In a very real sense, a transformer is a very lossy compression algorithm: it takes in a huge dataset and learns weights which represent patterns in that dataset. So no, I'm not saying that literally every image in the dataset was compressed down to 1.2 bytes each. I'm saying that whatever SD learned about the relationships of the pixels in an image to their text labels is stored in 1.2 bytes per dataset image in its weights. And you can actually use those weights as a good image compression codec. The fact that it has to do this in a limited number of parameters is one of the things that forces it to learn higher-level patterns rather than rely on memorization or other simpler strategies.

Ilya Sutskever talks about this, and was part of a team that published on it, showing that there is a sweet spot in the data-to-parameter ratio: adding parameters improves performance up to a point, beyond which adding even more decreases it. His explanation is that by limiting the number of parameters, the model is forced to generalize. So in Schmidhuber's language, the network is forced to make more compressed representations, so it overfits less and generalizes better.
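For scale, here's a rough back-of-the-envelope version of that bytes-per-image figure. All the numbers below are my own approximate assumptions (Stable Diffusion v1 at roughly 0.86B parameters in fp16, LAION-2B at roughly 2.3B image-text pairs), so the result differs from the article's 1.2 bytes but lands in the same ballpark:

```python
# Rough estimate of weight bytes per training image for Stable Diffusion.
# All inputs are approximate, publicly cited figures, not exact values.
params = 0.86e9          # SD v1 parameter count, approximate
bytes_per_param = 2      # fp16 storage
dataset_images = 2.3e9   # LAION-2B English subset, approximate

bytes_per_image = params * bytes_per_param / dataset_images
print(round(bytes_per_image, 2))  # on the order of 1 byte per image
```

However you pick the inputs, the weights amount to about a byte per training image, versus hundreds of kilobytes for the image itself, which is why only high-level, shared patterns can survive the compression.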

>First, this is the connectivist problem/fallacy in early AI and cog sci -- the notion that because small neuronal systems could be emulated somewhat with neural nets, and because neural nets could do useful biological-looking things, that then the limiting factor to intelligence/ability is simple scale

My argument about this doesn't come from ML systems mimicking biology. It comes from looking at exponential graphs of cost, performance, model parameters, and so on, and projecting that exponential growth will likely continue for a while. The first airplane didn't fly like a bird, it did something a lot simpler than that. In the same way, I'd bet the first AGI will be a lot simpler than a brain. I could be wrong about that.

But, I'm not even claiming that scaling transformers will lead to AGI, or that AGI will definitely be developed soon. All I'm saying is that there is significant expert uncertainty in when AGI will be developed, and it is possible that it could be developed soon. If it were, that would probably be the most difficult type of AGI to align, which is a concern.


jamesj OP t1_j9zv8zv wrote

I'd like to share some of my thoughts and have a discussion regarding the timeline for AGI and the risks inherent in building it. My argument boils down to:

  1. AGI is possible to build
  2. It is possible the first AGI will be built soon
  3. AGI which is possible to build soon is inherently existentially dangerous

So we need more people working on the problems of alignment and of deciding what goals increasingly intelligent AI systems should pursue.


jamesj OP t1_j9p3m4l wrote

>Then at the part where they offer a skewed definition of Intelligence, "First, a few definitions. Intelligence, as defined in this article, is the ability to compress data describing past events, in order to predict future outcomes...". This is not correct. Why not just use some agreed-upon definition? Like "The ability to acquire and apply knowledge and skills."
>I'm just stopping there. Calling BS.

This definition of intelligence comes from Juergen Schmidhuber, whose team was instrumental in the development of LSTMs and advances in deep learning in the 90s.

I recommend reading the paper, it is a very useful view of what the core of intelligence really is. https://arxiv.org/pdf/0812.4360.pdf


jamesj t1_j9e35d1 wrote

Theorizing worlds doesn't make them true. The fact we can imagine other worlds doesn't make them exist. They could exist, they might exist, but I'm still not in control of which one I end up in, even if they do exist.


jamesj t1_j9bzbrb wrote

Yes, there are two important levels where things are outside of my control: first, I didn't choose my place of birth, language, parents, schools, upbringing, or what ideas I was exposed to. Second, I conceive of my self as a subset of my brain and body, and at a deep level I don't believe that part of me is in control. I'm along for the ride, and I experience stories about why things happen, some of those stories involving the idea of choices, but I don't believe all of those stories.


jamesj t1_j99ebw7 wrote

  1. If someone acts of her own free will, then she could have done otherwise.
  2. If determinism is true, no one can do otherwise than one actually does.
  3. Therefore, if determinism is true, no one acts of her own free will.

Is the standard argument.
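The argument's validity (as distinct from the truth of its premises) can be checked mechanically. A minimal propositional sketch in Lean 4, with F standing for "acts of her own free will", O for "could have done otherwise", and D for "determinism is true":

```lean
-- Premise 1: acting freely implies being able to do otherwise (F → O).
-- Premise 2: determinism implies no one can do otherwise (D → ¬O).
-- Conclusion: determinism implies no one acts freely (D → ¬F).
example (F O D : Prop) (h1 : F → O) (h2 : D → ¬O) : D → ¬F :=
  fun hd hf => h2 hd (h1 hf)
```

So any disagreement has to be with a premise, typically premise 1, which is exactly what compatibilists deny.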

What's your argument for your claim?


jamesj t1_j99bqfk wrote

I think it could be true that people exercise real choice. But I don't think it is consistent with determinism.


Scholarly definitions aside, ordinary people generally understand free will as the ability to choose a desired course of action without restraint (Monroe, Dillon & Malle, 2014; Feldman, Wong & Baumeister, 2014; Feldman, 2017). Even if some scholars conceptualize free will in abstract, metaphysical terms (Greene & Cohen, 2004; Montague, 2008; Bargh, 2008), people tend to link free will most closely with the psychological concept of choice, not metaphysical concepts (Vonasch, Baumeister & Mele, 2018).


jamesj t1_j98b5pb wrote

This all makes sense. I suppose I think that the compatibilist redefinition of the terms makes everything less literal and more metaphorical, and that it is less in line with what I believe most people mean by the terms "free will", "morally responsible", and "choose". Also, there's often a real difference in belief between us: I really don't think anyone is in any important sense "morally responsible". This means I support preventative justice but I don't support retributive justice.


jamesj t1_j98aavi wrote

Sure, but why is that called compatibilism and not illusionism, which seems like a much more appropriate label? It just feels to me that compatibilists want the claim that free will is compatible with determinism (because they are physicalists who like the idea of moral responsibility) more than they want clarity around the words "free will".