Submitted by super_deap t3_11tmpc5 in MachineLearning
mrpogiface t1_jckmi7d wrote
Reply to comment by kittenkrazy in [D] PyTorch 2.0 Native Flash Attention 32k Context Window by super_deap
Definitely, but you'd need to fine-tune the model further to "teach" it to make use of the additional context.
super_deap OP t1_jcktps3 wrote
This