femboyxx98 t1_j4vlsfj wrote

Have you compared it against modern transformer implementations, e.g. ones using FlashAttention, which can provide a 3x-5x speedup by itself?

5