Submitted by deck4242 t3_125q87z in MachineLearning
machineko t1_je86hwt wrote
Reply to comment by ortegaalfredo in [D] llama 7b vs 65b ? by deck4242
What GPUs are you using to run them? Are you using any compression (i.e. quantization)?
ortegaalfredo t1_jegn9zu wrote
2x RTX 3090. The 65B model runs in int4, the 30B in int8 (int8 is required for LoRA).
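For context on why int4/int8 matter here: quantization shrinks each weight from 2 bytes (fp16) or 4 bytes (fp32) down to 1 byte (int8) or half a byte (int4), which is what lets a 65B-parameter model fit on two 24 GB 3090s. The sketch below illustrates the basic idea with a generic symmetric int8 scheme; it is not the specific method (e.g. GPTQ or bitsandbytes) the commenter used.

```python
import numpy as np

# Symmetric per-tensor int8 quantization: store weights as int8 plus
# one float scale, cutting memory 4x versus fp32 (2x versus fp16).

def quantize_int8(w: np.ndarray):
    """Map float weights to int8 with a single per-tensor scale."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from int8 values."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(1024, 1024)).astype(np.float32)  # toy weight matrix

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

print(f"fp32 bytes: {w.nbytes}")   # 4 bytes per weight
print(f"int8 bytes: {q.nbytes}")   # 1 byte per weight
print(f"max abs error: {np.abs(w - w_hat).max():.4f}")
```

int4 halves the footprint again by packing two weights per byte, at the cost of more rounding error; in practice schemes like GPTQ use per-group scales to keep that error acceptable.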