Comments


Nezarah t1_jdz1zqc wrote

For specifically personal use and research, and not commercial? LLaMA is a good place to start, and/or Alpaca 7B. Small scale (can run on most hardware locally), can be LoRA-trained and fine-tuned, and has a high token limit (I think it's 2,000 or so?).

Outputs can be comparable to GPT-3, and can be further enhanced with pre-context training.

You can add branching functionality through the LangChain library.

6
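
To make the LoRA route concrete, here is a minimal sketch of what adapter fine-tuning of a LLaMA/Alpaca-style 7B checkpoint might look like with Hugging Face `transformers` and `peft`. The model id, dataset, and hyperparameters are illustrative placeholders, not a tested recipe.

```python
# Minimal LoRA fine-tuning sketch (Hugging Face transformers + peft).
# Model id, dataset, and hyperparameters are illustrative placeholders.
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model

model_id = "huggyllama/llama-7b"  # placeholder: any LLaMA-style 7B causal LM you have access to
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")

# LoRA trains small low-rank adapter matrices instead of the full weights,
# which is what makes 7B-scale fine-tuning feasible on consumer hardware.
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections in LLaMA-style blocks
    task_type="CAUSAL_LM",
))
model.print_trainable_parameters()  # typically well under 1% of all parameters

# Placeholder instruction dataset, truncated to the ~2k-token context window.
data = load_dataset("tatsu-lab/alpaca", split="train")
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=2048),
                remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),  # labels = shifted input_ids
    args=TrainingArguments(output_dir="lora-out", per_device_train_batch_size=1,
                           gradient_accumulation_steps=8, num_train_epochs=1,
                           learning_rate=2e-4),
)
trainer.train()
```
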

andreichiffa t1_jdzcbln wrote

Depends on what hardware you have. A rule of thumb is that, to train efficiently, you need about 3x the model size in VRAM to store the optimizer state, plus some headroom for data.

You also need to use float precision for training, due to stability issues. So unless your GPU supports float8, double the VRAM.
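
As a back-of-envelope illustration of that rule of thumb, here is a small sketch of the arithmetic; the multipliers and byte counts are one reading of the rule, not measured figures.

```python
# Rough VRAM estimate for full fine-tuning, following the rule of thumb above:
# optimizer state ~3x the weights, plus some headroom for data.
# The constants are illustrative, not measured figures.
def estimate_training_vram_gb(params_billions: float,
                              bytes_per_param: int = 4,    # 4 for float32; 2 if 16-bit training is stable for you
                              optimizer_multiplier: float = 3.0,
                              headroom_gb: float = 4.0) -> float:
    weights_gb = params_billions * bytes_per_param      # 1e9 params * bytes / 1e9 bytes per GB
    optimizer_gb = weights_gb * optimizer_multiplier    # gradients + Adam moments, roughly
    return weights_gb + optimizer_gb + headroom_gb

print(estimate_training_vram_gb(2.7))                    # ~47 GB in float32
print(estimate_training_vram_gb(2.7, bytes_per_param=2)) # ~26 GB in 16-bit
```
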

Realistically, if you have an RTX 4090 you can go up to 6-7B models (BLOOM-6B, GPT-J, …). Anything below that, and I would aim at 2.7B models (GPT-Neo).

I would avoid the LLaMA family because of how you get access to the pretrained weights (for liability reasons), and stay with FOSS models. In the latter case you can contribute back and gain some visibility that way, assuming you want some.

4

Eaklony t1_je3dqa5 wrote

I am doing the same thing as you. I am currently playing with GPT-2 since it's extremely small. When I am comfortable, I plan to move on to GPT-J or other ~7B models. Finally, I kind of want to try something with a 20B model as a big final project, since I saw you can fine-tune one on a 4090.

1
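
For anyone starting on the same path, here is a minimal sketch of full fine-tuning GPT-2 with the Hugging Face Trainer; the dataset and hyperparameters are stand-ins, not a tuned recipe.

```python
# Minimal GPT-2 fine-tuning sketch with Hugging Face transformers.
# Dataset and hyperparameters are illustrative placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token      # GPT-2 ships without a pad token
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Any plain-text dataset works; wikitext-2 is used here as a stand-in.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
dataset = dataset.map(lambda b: tokenizer(b["text"], truncation=True, max_length=512),
                      batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),  # causal LM labels
    args=TrainingArguments(output_dir="gpt2-finetuned",
                           per_device_train_batch_size=4,
                           num_train_epochs=1),
)
trainer.train()
```
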