[R] Is there any work being done on reducing trained weight vector size without reducing computational overhead (e.g. pruning)? Submitted by Moose_a_Lini t3_yjwvav on November 2, 2022 at 5:48 AM in MachineLearning 23 comments
LetterRip t1_iusmrac wrote on November 2, 2022 at 6:50 PM

With bitsandbytes' LLM.int8() you can quantize most weights in large models to 8 bits, keep a small subset (the outlier features) in full precision, and get equivalent output. You could then also use a lookup table to further compress the weights.
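A minimal NumPy sketch of the mixed-precision idea behind LLM.int8(): feature dimensions whose activation magnitude exceeds a threshold stay in full precision, while everything else is absmax-quantized to int8 and computed in an integer matmul. The function name, threshold value, and shapes here are illustrative assumptions, not the bitsandbytes API.

```python
import numpy as np

def mixed_precision_matmul(x, w, threshold=6.0):
    """Approximate x @ w by decomposing the features into an
    fp32 'outlier' path and an int8 'regular' path (a sketch of
    the LLM.int8() decomposition; names/threshold are assumed)."""
    # Feature dimensions with large activation magnitudes are outliers.
    outlier = np.abs(x).max(axis=0) > threshold   # shape (k,)
    regular = ~outlier

    # Outlier features: keep full precision (empty selection is fine).
    out_fp32 = x[:, outlier] @ w[outlier, :]

    # Regular features: absmax-quantize x per row and w per column.
    xr, wr = x[:, regular], w[regular, :]
    sx = np.abs(xr).max(axis=1, keepdims=True) / 127.0
    sw = np.abs(wr).max(axis=0, keepdims=True) / 127.0
    xq = np.round(xr / np.where(sx == 0, 1, sx)).astype(np.int8)
    wq = np.round(wr / np.where(sw == 0, 1, sw)).astype(np.int8)
    # Integer matmul, then rescale back to floating point.
    out_int8 = (xq.astype(np.int32) @ wq.astype(np.int32)) * sx * sw

    return out_fp32 + out_int8
```

For typical activation statistics the int8 path carries the vast majority of the multiply-accumulates, which is why the memory savings come at little accuracy cost.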