fnbr t1_j1bc8i3 wrote on December 23, 2022 at 1:13 AM

The main problem with tabular Q-learning (I'm assuming that by classical, you mean tabular) is that for most environments that are interesting, the state space is massive, so we can't actually store all states in memory.

In particular for lunar lander, you have a continuous observation space, so you need to apply some sort of discretization; at that point, you might as well just use tile coding or some sort of other function approximator.