
machineko t1_je88wj9 wrote

I'm working on an open source library focused on resource-efficient fine-tuning methods called xTuring: https://github.com/stochasticai/xturing

Here's how you would perform int8 LoRA fine-tuning in three lines:

Python script: https://github.com/stochasticai/xturing/blob/main/examples/llama/llama_lora_int8.py

Colab notebook: https://colab.research.google.com/drive/1SQUXq1AMZPSLD4mk3A3swUIc6Y2dclme?usp=sharing
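For reference, the linked script boils down to roughly the following (a sketch based on the quickstart; the dataset path is a placeholder and the exact model key may differ in the repo):

```python
from xturing.datasets.instruction_dataset import InstructionDataset
from xturing.models import BaseModel

# Load an instruction dataset (placeholder path; the linked example uses
# the Alpaca data shipped with the repo).
dataset = InstructionDataset("./alpaca_data")

# Create a LLaMA model with int8 quantization and LoRA adapters attached.
model = BaseModel.create("llama_lora_int8")

# Fine-tune only the small LoRA weights; the int8 base weights stay frozen.
model.finetune(dataset=dataset)
```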

Of course, the Colab still only works with smaller models. In the example above, the 7B model required about 9 GB of VRAM.

12

Evening_Ad6637 t1_jeapgrs wrote

That sounds very interesting. I'm sorry if this question is trivial or stupid, but I'm an absolute newcomer to this field. Is there a way to train the model as you describe here (https://xturing.stochastic.ai/quickstart) using only, or almost only, the CPU? My specs are an i5 @ 3.5 GHz, 16 GB of DDR4 RAM, and only a Radeon Pro 575 with 4 GB of VRAM. But since I've seen how fast Alpaca runs on just my CPU and RAM, I'm hoping I could also fine-tune a LLaMA model with this hardware. I would be very grateful for any information about possibilities in this direction.

1

itsyourboiirow t1_jecqjqd wrote

Training requires significantly more memory than inference, since it has to keep track of a gradient for every trainable parameter (plus optimizer state). I would check how much memory it actually takes up on your computer.
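As a rough back-of-envelope estimate (a sketch assuming a 7B model, fp16 weights, Adam optimizer state, and roughly 0.5% trainable LoRA parameters; activations are ignored):

```python
# Rough memory estimate: full fine-tuning vs. int8 LoRA for a 7B model.
params = 7e9

# Full fine-tuning: fp16 weights (2 B) + fp16 grads (2 B) + Adam m/v (4 B each).
full_ft_gb = params * (2 + 2 + 4 + 4) / 1e9

# int8 LoRA: int8 base weights (frozen, 1 B) + small fp16 adapters with
# grads and Adam state. The 0.5% trainable fraction is an assumption.
lora_params = params * 0.005
lora_gb = (params * 1 + lora_params * (2 + 2 + 4 + 4)) / 1e9

print(f"Full fine-tuning: ~{full_ft_gb:.0f} GB")  # ~84 GB
print(f"int8 LoRA:        ~{lora_gb:.0f} GB")     # ~7-8 GB, plus activations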

2

machineko t1_jecvhyt wrote

16 GB of RAM is not enough for even the smallest LLaMA model (7B). You can try LoRA with int8 as listed above. Did you try the Python script I linked above?

1