__ingeniare__ t1_jea74g3 wrote

Currently, AI research on large models (such as ChatGPT) is expensive, since you need large data centers to train and run them. As a result, these powerful models are mostly developed by companies that have a profit incentive not to publish their research.

A well-known non-profit called LAION has started a petition proposing a large, publicly funded international data center that researchers could use to train open source foundation models ("foundation model" means it's a large model used as a base for more specialized models; open source means the models are freely available for everyone to download). It's a bit like how particle accelerators are international and publicly funded for use in particle physics, except here it would be large data centers for AI development.

5

__ingeniare__ t1_ja6z4j2 wrote

Flickering is not solved at the moment, true, but how do you know it's far from being solved? Temporal consistency has already been solved in other areas of generative AI (like frame interpolation for FPS upscaling). I wouldn't be surprised if flickering is solved by the end of this year. Emad of Stability AI (the company behind Stable Diffusion) has already talked about real-time generated video coming very soon thanks to a recent breakthrough in their algorithm that allows for something like a 100x generation speedup.

11

__ingeniare__ t1_j8i7pmf wrote

That has already happened: there's a hybrid ChatGPT/Wolfram Alpha program, but it's not available to the public. It can work out which parts of the user's request should be handed off to Wolfram Alpha and combine the results into the final output.
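A minimal sketch of what that hand-off could look like, purely as an illustration of the mechanism (the routing heuristic and the `query_wolfram_alpha` stand-in below are hypothetical, not the actual implementation):

```python
import re

def needs_exact_computation(fragment: str) -> bool:
    # Crude stand-in for the language model's routing decision:
    # anything that looks like arithmetic gets handed off.
    return bool(re.search(r"\d+\s*[-+*/]\s*\d+", fragment))

def query_wolfram_alpha(fragment: str) -> str:
    # Hypothetical stand-in for a call to the real Wolfram Alpha API.
    expression = re.search(r"\d+\s*[-+*/]\s*\d+", fragment).group()
    return str(eval(expression))  # fine for a toy arithmetic expression

def answer(request: str) -> str:
    parts = []
    for fragment in request.split(","):
        fragment = fragment.strip()
        if needs_exact_computation(fragment):
            parts.append(f"{fragment} = {query_wolfram_alpha(fragment)}")
        else:
            parts.append(fragment)  # left to the language model itself
    return "; ".join(parts)

print(answer("Explain what a prime number is, and compute 17 * 24"))
# -> Explain what a prime number is; and compute 17 * 24 = 408
```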

37

__ingeniare__ t1_j8c4z0x wrote

I think we have different definitions of scalable then. Our minds emerged from computation under the evolutionary pressure to form certain information processing patterns, so it isn't just any computation. Just so I understand you correctly, are you claiming an arbitrary computational system would inevitably lead to theory of mind and other emergent properties by simply scaling it (in other words, adding more compute units like neurons or transistors)?

2

__ingeniare__ t1_j8c0bbz wrote

Let's say you have a computer that simply adds two large numbers. You can scale it indefinitely to add even larger numbers, but it will never do anything interesting beyond that, because it's not a complex system. Computation in itself does not necessarily lead to emergent properties; it is the structure of the information processing that dictates this.
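A toy illustration of that point (just a sketch, not tied to any particular hardware): a ripple-carry adder can be widened to handle arbitrarily large numbers, but scaling it only ever buys you bigger additions; no new behaviour emerges.

```python
def ripple_carry_add(a_bits, b_bits):
    """Add two binary numbers given as lists of bits (least significant first).

    Widening the adder (longer bit lists) lets it handle ever larger
    numbers, but its behaviour stays exactly the same: it adds.
    """
    width = max(len(a_bits), len(b_bits))
    a_bits = a_bits + [0] * (width - len(a_bits))
    b_bits = b_bits + [0] * (width - len(b_bits))
    out, carry = [], 0
    for a, b in zip(a_bits, b_bits):
        total = a + b + carry
        out.append(total % 2)
        carry = total // 2
    out.append(carry)
    return out

# 6 + 3 = 9 -> [1, 0, 0, 1] (least significant bit first)
print(ripple_carry_add([0, 1, 1], [1, 1, 0]))
```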

2

__ingeniare__ t1_j880eru wrote

The difference is the computing architecture. Obviously you can't just scale any computing system and have theory of mind appear as an emergent property; the computations need to have a pattern that allows it.

1

__ingeniare__ t1_j8722ii wrote

You're talking about generative adversarial networks (GANs), an architecture from many years ago. More recent image generators tend to be based on diffusion, and text generators like the one in the article are transformer-based.
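For concreteness, here's a rough sketch of the forward (noising) process that diffusion models are trained to reverse; the schedule and sizes are illustrative assumptions, not taken from any specific model. A GAN, by contrast, pits a generator against a discriminator rather than learning to denoise.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_sample(x0, t, betas):
    # Forward diffusion: mix the clean image x0 with Gaussian noise
    # according to how far along the noise schedule we are (step t).
    alpha_bar = np.prod(1.0 - betas[: t + 1])
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
    return xt, eps  # the denoising network is trained to predict eps

x0 = rng.standard_normal((8, 8))        # stand-in for an image
betas = np.linspace(1e-4, 0.02, 1000)   # illustrative noise schedule
xt, eps = noisy_sample(x0, t=500, betas=betas)
```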

2

__ingeniare__ t1_j4fry4r wrote

Even if it could code better than humans (like AlphaCode, which outperforms most humans in coding competitions), that's not the hard part.

The hard part is the science/engineering aspect of machine learning; programming is just the implementation of the ideas once they are already thought out. Actually coming up with useful improvements is significantly harder and requires a thorough grasp of the mathematical underpinnings of ML. ChatGPT is nowhere near capable of making useful contributions to the machine learning research community (in other words, of writing an ML paper), and therefore it is incapable of improving its own software. AI will most likely reach that level at some point, however, possibly in the near future.

7

__ingeniare__ t1_j0p2vrm wrote

Not really, this isn't necessarily something it saw in the dataset. You can easily reach that conclusion by comparing the size of ChatGPT with the size of its dataset. The model is orders of magnitude smaller than the dataset, so it hasn't just stored things verbatim. Instead, it has compressed the data down to the essence of what people tend to say, which is a vital step towards understanding. That's how it can combine concepts rather than just words, which also allows for potentially novel ideas.
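A back-of-envelope version of that size comparison, using figures reported for GPT-3 as a rough stand-in (an assumption, since ChatGPT's exact training setup isn't public):

```python
# Figures from the GPT-3 paper, used only as illustrative assumptions.
params = 175e9                  # model parameters
bytes_per_param = 2             # fp16 weights
model_bytes = params * bytes_per_param       # ~350 GB of weights

raw_corpus_bytes = 45e12        # ~45 TB of raw Common Crawl text before filtering

print(f"model weights: ~{model_bytes / 1e9:.0f} GB")
print(f"raw text it was distilled from: ~{raw_corpus_bytes / 1e12:.0f} TB")
print(f"ratio: ~{raw_corpus_bytes / model_bytes:.0f}x smaller")
```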

5

__ingeniare__ t1_j0fw5pq wrote

Not exactly; in this case it's called theoretical because it is not of practical concern, despite being true. A theory in science is the highest possible status an idea can achieve; nothing can be conclusively proven.

Quantum randomness is a pretty popular idea, but everything else is known to be deterministic. Whether the universe as a whole is random or deterministic depends on whether quantum randomness is actually true randomness; maybe we'll have an answer in the coming decades.

3

__ingeniare__ t1_j0ftkkp wrote

I'm not talking about practically predictable with current tech, I'm talking about theoretically predictable. Everything that happens can in principle be derived from the laws of the universe. The laws are deterministic (perhaps with the exception of quantum mechanics). Therefore, everything is deterministic and can be predicted. Human emotions are only unpredictable because we don't have an accurate model of the brain of the human we're trying to predict. If we did, and had a computer powerful enough to simulate it, their behaviour could be predicted. Hence - lack of data (a model of the brain) and processing power (a computer to simulate it).

Also, chaos theory has nothing to do with whether prediction is possible, only with how difficult it is. It states that a chaotic system yields very different results for very small differences in initial conditions, not that there is some magic randomness that is impossible to account for in the system. Given the complete initial conditions, you can compute it completely deterministically. Therefore, if you get it wrong because of a chaotic system, it was a lack of data (the incorrect initial conditions).
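A quick way to see that distinction, using the logistic map as a stand-in for any chaotic system (the parameters are just illustrative): the same initial condition always gives the exact same trajectory, while a tiny error in the initial condition quickly blows up.

```python
def logistic_trajectory(x0, r=3.9, steps=50):
    # Iterate the logistic map x_{n+1} = r * x_n * (1 - x_n),
    # which is chaotic for r = 3.9 but entirely deterministic.
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

exact = logistic_trajectory(0.2)
rerun = logistic_trajectory(0.2)              # same data in, same data out
perturbed = logistic_trajectory(0.2 + 1e-9)   # "wrong" initial conditions

print(exact == rerun)                  # True: fully deterministic
print(abs(exact[-1] - perturbed[-1]))  # large: tiny input error has exploded
```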

2