[P] Up to 12X faster GPU inference on Bert, T5 and other transformers with OpenAI Triton kernels
Submitted by pommedeterresautee to r/MachineLearning on October 26, 2022 at 6:10 AM (40 comments)
juliensalinas wrote on October 27, 2022 at 6:59 AM, in reply to pommedeterresautee:
Definitely. I will keep you posted, Michael. Thanks!