RoyalCities t1_jcrxlvr wrote

I was talking to GPT-4 about this and it said the approach seems plausible and could dramatically bring down costs.

It called it "knowledge distillation"

It also mentioned that if we had access to the weights from OpenAI, you could use a process called model compression to scale the model down and run it on less powerful GPUs, or on distributed GPUs (like how render farms work).
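One common form of model compression is quantization: storing the weights at lower precision so the same model fits on cheaper hardware. Here's a minimal sketch of symmetric int8 weight quantization in NumPy (the function names and the toy weight matrix are my own illustration, not anything from OpenAI's stack):

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric int8 quantization: store weights as 8-bit integers
    plus a single float scale, cutting memory ~4x vs. float32."""
    scale = np.abs(weights).max() / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 representation."""
    return q.astype(np.float32) * scale

# Toy weight matrix standing in for one layer of a network:
w = np.array([[0.8, -1.2], [0.05, 2.5]], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# Rounding to the nearest int8 level bounds the per-weight error
# by half a quantization step:
print(np.abs(w - w_hat).max() <= 0.5 * scale + 1e-6)  # True
```

Real deployments (e.g. 8-bit or 4-bit quantized LLMs) use per-channel scales and calibration data, but the core trade of precision for memory is the same.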

This also explains why OpenAI is so cagey about releasing weights - the initial training is where the money sink is, but once the weights are out there are ways to make the model run on cheaper hardware.

But I'm wondering: does this mean the smaller model can ONLY respond to the questions you're generating, or will it have latent knowledge beyond the knowledge transfer? For example, would a smaller model trained with this approach also be able to answer questions on topics that are "restricted" in OpenAI's view, which you couldn't ask the original model? Or do you absolutely need to get an initial answer for such restricted content before it can produce a response?

Talking about things like writing malicious code or whatnot. I don't plan on doing that, obviously, but I'm curious whether this means these smaller models will basically be totally unrestricted now - or whether, if it's just trained on tons of Python code, it could create said malicious code from scratch without ever being exposed to examples of "how" to make it (since it has a greater knowledge of the underlying principles of Python).

Edit: Okay, guess it can, per GPT-4.

Damn these things are fascinating.

>Yes, the same concerns can apply to a smaller model being trained from a larger one via knowledge distillation. Knowledge distillation is a technique where the smaller model learns to mimic the larger model's behavior by training on a dataset generated using the larger model's outputs. The smaller model effectively learns from the larger model's knowledge and understanding of language patterns and concepts.

>As a result, the smaller model can also gain latent knowledge about various topics and domains, even if it hasn't been explicitly exposed to specific examples during training. This means that the smaller model could potentially generate undesirable content based on its understanding of the relationships between words and concepts, similar to the larger model.
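The distillation objective GPT-4 is describing can be sketched in a few lines: the student is trained to match the teacher's *softened* output distribution, not just the hard labels, which is how it picks up the teacher's "dark knowledge" about which wrong answers are almost right. This is a toy NumPy sketch (temperature value and logits are made up for illustration, following Hinton et al.'s formulation):

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Softmax with a temperature knob; T > 1 softens the distribution,
    exposing the teacher's relative confidence in non-top classes."""
    z = logits / temperature
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, temperature=4.0):
    """Cross-entropy of the student's softened outputs against the
    teacher's softened outputs -- the core knowledge-distillation loss."""
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return -np.sum(t * np.log(s + 1e-12))

# A teacher confident in class 0, but that sees class 1 as a near miss:
teacher = np.array([6.0, 4.0, -2.0])
good_student = np.array([5.0, 3.5, -1.0])  # mimics the teacher's ranking
bad_student = np.array([-2.0, 1.0, 5.0])   # gets the ranking backwards

# The student matching the teacher's full distribution gets lower loss:
print(distillation_loss(teacher, good_student)
      < distillation_loss(teacher, bad_student))  # True
```

In a real training loop this loss is backpropagated through the student on a large corpus of teacher outputs, which is why the student can generalize to prompts it never saw during distillation - it learned the teacher's probability landscape, not a lookup table of answers.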


RoyalCities t1_j8vfovl wrote

Of course they don't. They're also wrong in saying that men can't gain muscle while in a caloric deficit. It's harder, mind you, since you need to keep muscle protein synthesis higher than muscle protein breakdown by eating more protein than usual, but it IS possible - so I would take whatever they said here with a grain of salt.