programmerChilli t1_is7vgbp wrote

I mean... it's hard to write efficient matmuls :)

But... recent developments (e.g. CUTLASS and Triton) do allow NN frameworks to write their own efficient matmuls, so I think you'll start seeing them used to fuse other operators into the matmul :)

You can already see some of that being done in projects like AITemplate.
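
To make "fusing an operator with a matmul" concrete, here is a rough sketch in plain Python, not real GPU code. The function names `matmul_unfused` and `matmul_fused` are made up for illustration, and bias + ReLU is just an assumed example of an operator you might fuse. A real kernel would do this on the GPU (e.g. in Triton), where the saving comes from skipping an extra round trip of the output through memory.

```python
def matmul_unfused(a, b, bias):
    """Matmul first, then a separate pass over the output for bias + ReLU."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
           for i in range(n)]
    # Second pass: on a GPU this would be a separate kernel that re-reads
    # `out` from memory and writes it back again.
    return [[max(0.0, out[i][j] + bias[j]) for j in range(m)] for i in range(n)]


def matmul_fused(a, b, bias):
    """Same result, with bias + ReLU applied as an epilogue inside the matmul loop."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            # Epilogue: runs while the accumulator is still local, so each
            # output element is written exactly once.
            out[i][j] = max(0.0, acc + bias[j])
    return out
```

Both functions return the same values; the fused one just never materializes the intermediate matmul result.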

I will note one other thing though - fusing operators with matmuls is not as big of a bottleneck in training; this optimization primarily helps in inference.