
JClub t1_jc5ys39 wrote

Is there any implementation of CAM? Why is it better than the TGlobal (transient global) attention used in LongT5?
