deathisnear t1_j6swwf9 wrote

You can implement alpha-beta pruning to further reduce the number of actions evaluated, or look at Monte Carlo tree search as a more scalable option (combined with deep learning, this was used for AlphaGo). Games such as Fire Emblem have a similar setup; they definitely are not using RL for this, and they tend to have reasonable performance.
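
For what it's worth, here is a rough sketch of how alpha-beta pruning cuts down the work compared to plain minimax. It is not tied to any particular game: the tree is a hypothetical nested list where inner lists are positions and numbers are static evaluation scores, and the `stats` dict is just an illustrative way to count how many leaves get evaluated.

```python
import math

def minimax(node, maximizing, stats):
    """Plain minimax: evaluates every leaf of the tree."""
    if not isinstance(node, list):  # leaf: a static evaluation score
        stats["leaves"] += 1
        return node
    values = [minimax(child, not maximizing, stats) for child in node]
    return max(values) if maximizing else min(values)

def alphabeta(node, maximizing, alpha, beta, stats):
    """Minimax with alpha-beta pruning: same value, fewer leaves evaluated."""
    if not isinstance(node, list):  # leaf: a static evaluation score
        stats["leaves"] += 1
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, stats))
            alpha = max(alpha, best)
            if alpha >= beta:  # the minimizer above will never allow this branch
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, stats))
        beta = min(beta, best)
        if alpha >= beta:  # the maximizer above already has a better option
            break
    return best

# Hypothetical two-ply tree: the maximizer picks a branch, the minimizer replies.
tree = [[3, 5], [2, 9], [0, 1]]
full, pruned = {"leaves": 0}, {"leaves": 0}
value_full = minimax(tree, True, full)
value_pruned = alphabeta(tree, True, -math.inf, math.inf, pruned)
print(value_full, value_pruned, full["leaves"], pruned["leaves"])  # 3 3 6 4
```

Both searches return the same value, but the pruned one skips leaves once a branch is already known to be worse than an earlier one. How much you save in a real game depends heavily on move ordering, so treat the numbers here as a toy illustration only.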


jtpaquet OP t1_j6tzmhw wrote

OK, thanks, I'll look into it. I was thinking of maybe using minimax as a base for RL so it already has a starting point to improve from. I considered checking every possible option at first but ruled it out since I thought there would be too many outcomes. Pruning seems to reduce the number of outcomes, so that might be possible after all. Thanks for making me see the problem in another way!
