Submitted by floppy_llama t3_1266d02 in MachineLearning
lxe t1_je9vkqx wrote
In what way is this different than the existing low rank adaptation method everyone is doing already?
aliasaria t1_jefj33h wrote
It's a very different way to finetune a model efficiently.
All these tools try to nudge an existing large model without having to update all of its weights.
A simplistic explanation of LoRA is that it freezes the pretrained weights and learns a small low-rank update for a few of the model's weight matrices, so only a tiny number of new parameters get trained.
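A minimal sketch of the low-rank idea, with hypothetical dimensions (not any particular library's API): a frozen weight `W` gets an additive update `B @ A` where `A` and `B` are small trainable factors of rank `r`.

```python
import torch

# LoRA-style low-rank update (illustrative sketch, hypothetical sizes):
# instead of training W (d_out x d_in) directly, freeze it and train
# A (r x d_in) and B (d_out x r), so the effective weight is W + B @ A.
d_in, d_out, r = 64, 64, 4
W = torch.randn(d_out, d_in)                          # frozen pretrained weight
A = (0.01 * torch.randn(r, d_in)).requires_grad_()    # trainable low-rank factor
B = torch.zeros(d_out, r, requires_grad=True)         # trainable, zero-init so the
                                                      # update starts as a no-op

def lora_forward(x):
    # x: (batch, d_in); gradients flow only into A and B
    return x @ W.T + (x @ A.T) @ B.T
```

Because `B` starts at zero, the adapted model initially behaves exactly like the frozen one, and training moves only the `r * (d_in + d_out)` adapter parameters.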
This tool, instead, adds new weights to the model (learnable tokens prepended to the prompt) on top of the existing, frozen model.
One advantage of LoRA, in this case, is that you can merge your LoRA finetuned weights back into the original model, and the result is a new model that is exactly the same size and shape as the original. With the technique in this paper, the final model is a different shape from the original. But the concept is arguably simpler.
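The shape difference can be seen in a small sketch (hypothetical names and sizes, not the paper's actual code): the LoRA update folds back into `W`, while prepended soft-prompt weights add extra sequence positions that have no matrix to merge into.

```python
import torch

# Merging LoRA (sketch): the low-rank product collapses into the original shape.
d, r = 64, 4
W = torch.randn(d, d)                 # frozen pretrained weight
A, B = torch.randn(r, d), torch.randn(d, r)   # trained low-rank factors
W_merged = W + B @ A                  # same (d, d) shape as the original

# Prompt-style adapter (sketch): learnable embeddings prepended to the input.
n_prefix, seq_len = 8, 16
prefix = torch.randn(n_prefix, d, requires_grad=True)  # trainable soft prompt
tokens = torch.randn(1, seq_len, d)                    # frozen token embeddings
augmented = torch.cat([prefix.unsqueeze(0), tokens], dim=1)
# The model now processes seq_len + n_prefix positions; these extra weights
# stay separate and cannot be folded into the original weight matrices.
```

So a LoRA checkpoint can ship as a drop-in replacement weight file, whereas the prompt-style weights must be carried alongside the original model at inference time.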
lxe t1_jeg2h5j wrote
Thank you. I much appreciate the explanation.