
alterframe t1_jb5oel8 wrote

RL is one of those concepts where it's easy to fool ourselves into thinking we get it when in reality we don't. We have a fuzzy notion of what RL is and what it is good for, so in our imagination it is a perfect match for our problem. In reality, our problem may look like those RL-friendly tasks on the surface while lacking several of the important properties or challenges that would make RL a reasonable choice.

That doesn't mean RL is not useful at all. Quite the opposite. People are wrongly discouraged from RL by experience with projects where it didn't actually make sense, and they draw false conclusions about its practicality.
