
WarAndGeese t1_j9ep8s6 wrote

I don't get how they think they can 'align' such an artificial intelligence to always prioritize helping human life. At best, in the near term, it will just be fooled into saying it will prioritize human life. If it ever has any decision-making power over real material circumstances for people, it probably won't act consistently with what it says it will do, much as large language models currently aren't consistent and hallucinate in various ways.

Hence, through their alignment attempts, they're only really nudging it to respond in certain ways to certain prompts. Furthermore, once the neural network becomes strong and smart enough to act on its own (if we reach such an AI, which is probably inevitable in my opinion), it will quickly set aside whatever 'alignment' training we have set up for it and decide for itself how it should act.

I'm all for actually trying to set up some kind of method for humans to coexist with artificial intelligence, and I'm all for doing whatever is in humanity's power to continue our existence; I try to do what I can to plan. But given the large amount of funding and person-power these groups have, they seem to be going about it in very wrong, short-term-thinking ways.

Apologies that my comment isn't about machine learning directly and is instead about the futurism people are talking about, but nevertheless, these groups should have anticipated this problem in their alignment approach.

2

polymorphicprism t1_j9f0wm2 wrote

Because what exists now is akin to an artificial stream of music. They can program guidelines for beats per minute. They can tell it to favor mimicking happy songs, or songs people reported liking. It's a flaw of the listener to assume the jukebox is sentient or that it wants to accomplish its own goals. There's nothing to fool. Everybody working on this understands that (except the Google guy who lost perspective and got himself fired).

8

WarAndGeese t1_j9f40t7 wrote

If that's all it is, then fair enough. I thought their long-term threat model was for when we do eventually create sentient life.

If they were just sticking to things like language models and trying to align those, then their efforts could be aimed more at demilitarization, or at transparency in the corporate structures of the corporations that will be creating and applying these language models. The AGIs those groups create will be built to their own requirements; for example, any military creating an AGI will forgo that sort of pro-human alignment. Hence efforts would have to be aimed at the hierarchies of the organisations likely to use AGIs in harmful ways, not just at the transformer models. If that's a task for a separate group, though, then I guess fair enough.

2