Submitted by CheapBreakfast9 t3_11853g5 in MachineLearning

I wanted to share a paper we have just released, where we extended the capabilities of ChatGPT to robotics, and controlled multiple platforms such as robot arms, drones, and home assistant robots intuitively with language: https://www.microsoft.com/en-us/research/group/autonomous-systems-group-robotics/articles/chatgpt-for-robotics/

Video: https://youtu.be/NYd0QcZcS6Q

Technical paper: https://www.microsoft.com/en-us/research/uploads/prod/2023/02/ChatGPT___Robotics.pdf

https://i.redd.it/ya84nryu0kja1.gif

28

Comments

currentscurrents t1_j9g3sj6 wrote

Interesting! I feel like one of the biggest uses for LLMs will be controlling other systems using plain English instructions.

3

limpbizkit4prez t1_j9gmme3 wrote

If there are existing APIs that make these tasks so simple, what's the point of using ChatGPT? Why not just write the 5-10 lines of code?

3

htrp t1_j9gqx65 wrote

I think it's abstracting the human-machine interface that is of value...

Telling Alexa to have your Roomba vacuum only the living room has some value, and eventually builds towards:

Tea, Earl Grey, Hot

5

limpbizkit4prez t1_j9h3nbm wrote

If you don't know how to code, then regardless of how you interface it's going to be difficult to execute. If you do know how to code, then you'll probably want better encapsulation. I guess what I'm most curious about is whether the code examples they give in their paper can actually be run. Are those libraries really that easy to use?

3

sam__izdat t1_j9imyry wrote

I have never seen it generate any code that is correct-in-principle, let alone usable, for any non-trivial problem. It may be useful as a kind of impressionist painting of a solution, for those who are already programmers. And for trivial code, you'd frankly be better off just learning to code.

In other words, I don't really see this being remotely useful to someone who doesn't know how to code. If anything, the barrier to entry is higher, because you will need to debug extremely unusable but convincing-looking programs. It's at best a hint or a template and at worst a hindrance.

3

currentscurrents t1_j9j0gt7 wrote

According to their paper, the LLM is doing task decomposition. You're able to give it high-level instructions like "go to the kitchen and make an omelette", and it breaks it down into actions like get eggs, get pan, get oil, put oil in pan, put eggs in pan, etc.

You could use something like this to give high-level instructions to a robot in plain English.
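
To make the idea concrete, here is a rough sketch of what that decomposition loop might look like. This is not the paper's code: `fake_llm`, `decompose`, and `ALLOWED_ACTIONS` are hypothetical names, and the model call is replaced by a canned response so the example runs on its own. The one design point it tries to show is checking each returned step against a whitelist of robot primitives before anything gets executed.

```python
import re
from typing import Callable, List

# Hypothetical set of primitives a robot might expose; not taken from the paper.
ALLOWED_ACTIONS = {"get", "put"}


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; always returns the same canned plan."""
    return (
        "1. get eggs\n"
        "2. get pan\n"
        "3. get oil\n"
        "4. put oil in pan\n"
        "5. put eggs in pan\n"
    )


def decompose(instruction: str, llm: Callable[[str], str]) -> List[str]:
    """Ask the model for a step list and keep only steps that use known primitives."""
    prompt = (
        "Break this task into one short action per line, "
        f"using only these verbs: {sorted(ALLOWED_ACTIONS)}\n{instruction}"
    )
    steps: List[str] = []
    for line in llm(prompt).splitlines():
        # Strip list numbering such as "1. " or "2) ".
        step = re.sub(r"^\s*\d+[.)]\s*", "", line).strip()
        if not step:
            continue
        verb = step.split()[0].lower()
        if verb not in ALLOWED_ACTIONS:
            # Refuse to pass along anything the robot has no primitive for.
            raise ValueError(f"unsupported action: {step!r}")
        steps.append(step)
    return steps


plan = decompose("go to the kitchen and make an omelette", fake_llm)
for step in plan:
    print(step)
```

In a real setup you would swap `fake_llm` for an actual model call and map each validated step onto whatever API the robot provides.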

2