Viewing a single comment thread. View all comments

harharveryfunny t1_ir76pn9 wrote

Well, the gist of it is that they first transform the minimal-factors matmul problem into decomposition of a 3-D matrix into minimal number of factors, then use RL to perform this decomposition by making it a stepwise decomposition with the reward being mininum number of steps.

That said, I don't understand *why* they are doing it this way.

  1. Why solve the indirect decomposition problem, not just directly search for factors of the matmul itself ?

  2. Why use RL rather than some other solution space search method like an evolutionary algorithm? Brute force checking of all solutions is off the table since the search space is massive.

32

Mr_Smartypants t1_ir7rbxp wrote

At the end of RL training, they don't just have an efficient matrix multiplication algorithm (sequence of steps), they also have the policy they learned.

I don't know what that adds, though. Maybe it will generalize over input size?

19