ProShortKingAction t1_iurhi4z wrote

How do you prevent the robot from writing unsafe code? If it keeps adding new code without review by devs or a security team, it seems like you'd run into the issue of it always being one instruction away from generating code that contains a dangerous vulnerability.

28

Sashinii t1_iurkiad wrote

They address the potential negatives with built-in safety checks, while also encouraging suggestions for other methods to ensure that the AI is as safe as possible.

12

ProShortKingAction t1_iurmfoc wrote

Sorry, I took that as them saying the built-in safety checks are meant to prevent the robot from performing an unsafe physical action, not to prevent it from writing vulnerable code. I might have misinterpreted that.

Another thing I would like to bring up in favor of this approach: vulnerabilities slip through in regular code all the time, so this doesn't have to be perfect, just safer than the current approach. It's like with driverless cars: they don't have to be perfect, just safer than a car driven by a human, which seems like a low bar. I just don't see anything in this post that suggests a safe way of doing this isn't still rather far off.

Edit: In the Twitter thread by one of the researchers, posted elsewhere in this thread, they very vaguely mention "... and many potential safety risks need to be addressed." It's hard to tell whether this refers to the robot physically interacting with the world, to cybersecurity concerns, or to both.

6

visarga t1_iusk21l wrote

They take a few preventive measures.

> we first check that it is safe to run by ensuring there are no import statements, special variables that begin with __, or calls to exec and eval. Then, we call Python’s exec function with the code as the input string and two dictionaries that form the scope of that code execution: (i) globals, containing all APIs that the generated code might call, and (ii) locals, an empty dictionary which will be populated with variables and new functions defined during exec. If the LMP is expected to return a value, we obtain it from locals after exec finishes.
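
Roughly, that execution step might look like the sketch below. This is my paraphrase of the quoted description, not the authors' actual code; the `allowed_apis` contents, the `return_name` parameter, and the stripping of builtins are assumptions for illustration.

```python
def run_lmp(code_str, allowed_apis, return_name=None):
    """Execute generated code in a restricted scope and optionally return a value.

    allowed_apis: dict of the robot APIs the generated code may call,
                  e.g. {"move_to": move_to, "pick": pick}  (hypothetical names)
    return_name:  variable to read back from locals after execution, if any
    """
    # (i) globals: only the APIs the generated code is allowed to call
    allowed_globals = {"__builtins__": {}}  # assumption: builtins stripped as well
    allowed_globals.update(allowed_apis)

    # (ii) locals: empty dict, populated with variables/functions defined by the code
    local_vars = {}
    exec(code_str, allowed_globals, local_vars)

    # If the LMP is expected to return a value, obtain it from locals after exec
    return local_vars.get(return_name) if return_name else None
```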

3

ProShortKingAction t1_iuskyif wrote

This seems to be using "safe to run" to mean less likely to crash, not to mean free of cybersecurity issues.

3

visarga t1_iuvrqym wrote

It prevents access to various Python APIs, as well as to exec and eval.

It's just a basic check, though.
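
For context, the kind of basic check the paper describes (no imports, no names starting with `__`, no calls to exec or eval) could be as simple as something like this. An illustrative sketch only; the actual implementation may well use a plain string search rather than an AST walk.

```python
import ast

def is_safe_to_run(code_str):
    """Reject generated code containing imports, dunder names, or exec/eval calls."""
    tree = ast.parse(code_str)
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            return False
        if isinstance(node, ast.Name) and node.id.startswith("__"):
            return False
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in ("exec", "eval")):
            return False
    return True
```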

1