machineko t1_jdjeh6y wrote

We have a similar open-source project focused on personalization of LLMs and efficient fine-tuning: https://github.com/stochasticai/xturing

We actually released code for GPT-J, LLaMA and GPT-2 before these guys, but we are a small team. You can run it on any local machine too.
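
Roughly, basic usage looks like this (check the repo README for the exact API; the dataset path here is a placeholder):

    from xturing.datasets import InstructionDataset
    from xturing.models import BaseModel

    # Load an instruction dataset and create a LoRA-wrapped LLaMA model
    dataset = InstructionDataset("./alpaca_data")
    model = BaseModel.create("llama_lora")

    # Fine-tune, then save the weights for later reuse
    model.finetune(dataset=dataset)
    model.save("path/to/your/weights")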

182

SWESWESWEh t1_jdk8rtn wrote

Doing the lord's work, my friend. Does it work with Apple Silicon Metal shaders? I've trained my own models, since both TF and PyTorch support it, but I've noticed a lot of people use CUDA-only methods, which makes it hard to use open-source stuff.
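
For reference, this is the standard stock-PyTorch idiom I use to check for Metal support and fall back gracefully (nothing project-specific):

    import torch

    # Prefer CUDA, then Apple's Metal (MPS) backend, then CPU
    if torch.cuda.is_available():
        device = torch.device("cuda")
    elif torch.backends.mps.is_available():
        device = torch.device("mps")
    else:
        device = torch.device("cpu")

    model = torch.nn.Linear(8, 2).to(device)  # any module moves the same way
    x = torch.randn(4, 8, device=device)
    print(model(x).device)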

17

machineko t1_jdmm43b wrote

Thanks for the comment. Are you looking to run on M2 or smaller edge devices?

3

light24bulbs t1_jdks13d wrote

Question: I notice there's a focus here on fine-tuning for instruction following, which is clearly different from the main pretraining, where the LLM just reads text and tries to predict the next word.

Is there any easy way to continue that bulk part of the training with some additional data? Everyone seems to be trying to get there by injecting embedded chunks of text into prompts (my team included), but that approach just stinks for a lot of uses.

8

elbiot t1_jdlgxnz wrote

In my understanding, if you have text, it's not a challenge to train on next-word prediction; just keep the learning rate low. The reason there's a focus on instruction-based fine-tuning is that that data is harder to come by.

My only experience is with a sentence-embedding model (using SBERT): I trained on a 50/50 mix of my new text and the original training data, and the model both got better at embedding my text and didn't forget how to do what it was originally trained on.
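
If it helps, here's a minimal sketch of what that looks like for a causal LM with Hugging Face transformers (model name, file paths, and hyperparameters are placeholders; the low learning rate and the 50/50 data mix are the important parts):

    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # Mix new text with original-style data roughly 50/50 to avoid forgetting
    data = load_dataset("text", data_files=["my_corpus.txt", "original_mix.txt"])
    tokenized = data["train"].map(
        lambda b: tokenizer(b["text"], truncation=True, max_length=512),
        batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="out",
                               learning_rate=1e-5,  # keep it low
                               num_train_epochs=1,
                               per_device_train_batch_size=2),
        train_dataset=tokenized,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()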

5

light24bulbs t1_jdlrnll wrote

That's cool; that's exactly what I want to do. I'm hunting around for a ready-made pipeline to do that on top of a good open-source model.

3

visarga t1_jdloh24 wrote

Since RLHF fine-tuning is short, you can continue pretraining your original model and then do RLHF again.

2

baffo32 t1_jdnppmp wrote

This is the same task as instruction tuning; instruction tuning just uses specific datasets where instructions are followed. It's called "fine-tuning", but nowadays people use adapters and PEFT to do this on low-end systems.
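
For the unfamiliar, here's a minimal sketch of what the adapter/PEFT route looks like with the Hugging Face peft library (hyperparameters are illustrative, not from this thread):

    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # Wrap the frozen base model with small trainable LoRA matrices
    config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                        target_modules=["c_attn"],  # GPT-2's attention projection
                        task_type="CAUSAL_LM")
    model = get_peft_model(model, config)
    model.print_trainable_parameters()  # a tiny fraction of the full model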

1

light24bulbs t1_jdntdbb wrote

I'm not hoping to do instruction tuning; I want to do additional pretraining.

1

baffo32 t1_jdo24su wrote

It is the same thing. The Alpaca data is just further pretraining data consisting of instructions and responses. Doing this is called fine-tuning.

1

baffo32 t1_jdrhj77 wrote

I was still confused by your response, and I'm thinking that if you wanted a model to behave as if it had been given different pretraining data, you would probably first fine-tune on the different bulk data, and then after this fine-tune on the target task, such as instruction following.

Instruction following is of course still just predicting the next word, on data where the next word is obedient to the instructions preceding it.

1

light24bulbs t1_jdrm9kh wrote

That's the part I wasn't getting. I assumed fine-tuning involved a different process. I see now that it is in fact just more training data, often templated into a document in such a way that it's framed clearly for the LLM.
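
For example (my own illustration, not from the library's docs), Alpaca-style instruction data is typically flattened into plain-text documents like this before training:

    # Roughly the Alpaca prompt template; each (instruction, response) pair
    # becomes one plain-text training document.
    TEMPLATE = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n"
        "### Response:\n{response}"
    )
    doc = TEMPLATE.format(
        instruction="Summarize the plot of Moby-Dick in one sentence.",
        response="A whaling captain's obsessive hunt for a white whale ...",
    )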

The confusing thing is that most of the LLM-as-a-service companies, OpenAI included, will ONLY take data in the question-answer format, as if that's the only data you'd want to use for fine-tuning.

What if I want to feed a book in so we can talk about the book? A set of legal documents? Documentation of my project? Transcriptions of TV shows?

There are so many use cases for training on top of an already pre-trained LLM that aren't just question answering.

I'm training LLaMA now. I simply took some training code I found, removed the JSON-parsing question-answer templating stuff, and was done.

1

nemorocksharder t1_jdz8kt5 wrote

What you're describing is exactly what I have been looking to do too, and I'm really surprised I'm not hearing more about it. Have you found any useful approaches to essentially adding to the LLM's corpus with target material/text? Or anyone else trying to do this?

1

ephemeralentity t1_jdm6wkc wrote

Playing around with this. Running BaseModel.create("llama_lora") seems to return "Killed". I'm running it on WSL2 under Windows 11, so I'm not sure if that could be the issue. I'm on an RTX 3070 with only 8GB of VRAM, so maybe that's the issue ...

EDIT - Side note: I first tried directly on Windows 11, but it seems the deepspeed dependency is not fully supported: https://github.com/microsoft/DeepSpeed/issues/1769

2

machineko t1_jdnmg8l wrote

Right, 8GB won't be enough for LLaMA 7B: the fp16 weights alone are roughly 14GB (7B parameters × 2 bytes), before activations and optimizer state. You should try the GPT-2 model instead; that should work on 8GB of VRAM.

2

ephemeralentity t1_jdp2pu8 wrote

Thanks, looks like GPT-2 worked! Sorry, stupid question, but how do I save/re-use the results of my model fine-tune? When I re-finetune for 0:2 epochs it gives a reasonable response, but if I try to skip model.finetune, it responds with new lines only (\n\n\n\n\n\n\n\n ...).

1

machineko t1_jdqzmyq wrote

model.save("path/to/your/weights") saves it to that directory. After that, you can load it with:

    model = BaseModel.create("gpt2", "path/to/your/weights")

Can you share the input text you have used? It is possible that GPT-2 is too small and needs custom generation parameters.
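
For illustration, in plain transformers terms (this is not the xturing API, and the values are just starting points), the kind of generation parameters meant here looks like:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    inputs = tokenizer("The meaning of life is", return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=64, do_sample=True,
                            top_p=0.9, temperature=0.7,
                            repetition_penalty=1.2)  # discourages \n\n\n... runs
    print(tokenizer.decode(output[0], skip_special_tokens=True))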

2

ephemeralentity t1_jdt1krp wrote

Thanks a lot! To be honest, I need to spend a bit more time familiarising myself with PyTorch / this package. I'll see if I can figure it out from here.

1

machineko t1_jdtv8jv wrote

If you need help, come find us on our Discord channel.

2

light24bulbs t1_jdmad5n wrote

Hey, I've been looking at this more and it's very cool. One thing I REALLY like is that I see self-training using dataset generation on your roadmap. This is essentially the technique Facebook used to train Toolformer, if I'm reading their paper correctly.

I'd really love to use your library to try to reimplement Toolformer's approach someday.

2

RiyazRockz t1_jdnbroi wrote

Hey, I want to fine-tune a model to solve a pharma-related problem. I want to know if I can fine-tune my model with this. Could you please share your contact details so that I can learn more about this?

1