wojtek15 t1_jd0p206 wrote
Reply to comment by currentscurrents in [Project] Alpaca-30B: Facebook's 30b parameter LLaMa fine-tuned on the Alpaca dataset by imgonnarelph
Hey, recently I was thinking that Apple Silicon Macs may be the best thing for AI in the future. The most powerful Mac Studio has 128GB of unified memory that can be shared by the CPU, GPU, and Neural Engine. If only memory capacity is considered, even an A100, let alone any consumer-oriented GPU, can't match that. With this amount of memory you could run a GPT-3 Davinci-sized model in 4-bit mode.
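For a rough sense of why 128GB is enough for 4-bit, here is a minimal back-of-the-envelope sketch. It assumes GPT-3 Davinci at 175B parameters and only counts the weights themselves, ignoring the KV cache, activations, and framework overhead, so real requirements would be somewhat higher:

```python
# Rough estimate of model weight memory at different precisions.
# Counts only the weights; KV cache, activations, and runtime overhead
# would push the real number higher.

def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Memory needed just to hold the weights, in GB."""
    return n_params * bits_per_weight / 8 / 1e9

for bits in (16, 8, 4):
    gb = weight_memory_gb(175e9, bits)  # GPT-3 Davinci scale: 175B parameters
    fits = "fits" if gb <= 128 else "does not fit"
    print(f"{bits}-bit: ~{gb:.0f} GB -> {fits} in 128 GB of unified memory")
```

At 4 bits per weight that works out to roughly 87.5 GB, which is under the 128GB ceiling, while 8-bit (~175 GB) and 16-bit (~350 GB) are not.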
wojtek15 t1_jdlpai0 wrote
Reply to comment by ttkciar in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
Exactly, I have seen many inaccurate claims, e.g. that LLaMA-7B fine-tuned with Alpaca is as capable as ChatGPT. From my testing, even the much bigger LLaMA-30B with Alpaca is far worse than ChatGPT: it can't get even the simplest programming and common-knowledge tasks right, while the GPT-3-based ChatGPT gets them right every time without any problems. I have not tried LLaMA-65B with Alpaca yet, because AFAIK it has not been trained yet, but I doubt it will be very different. The GPT-3 behind ChatGPT is 175B parameters; maybe some 100B model can match it, but not a 6B or 7B model. If someone claims that, they clearly don't know what they are talking about.