Submitted by Singularian2501 t3_z1b2rp in MachineLearning
CommunismDoesntWork t1_iyecw45 wrote
Reply to comment by diviramon in [R] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models - Massachusetts Institute of Technology and NVIDIA, Guangxuan Xiao et al - Enables INT8 for LLMs bigger than 100B parameters, including OPT-175B, BLOOM-176B and GLM-130B. by Singularian2501
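(For context on the INT8 quantization the linked paper is about: a minimal sketch of symmetric per-tensor INT8 quantize/dequantize. This is illustrative only, not SmoothQuant's actual method, which additionally migrates activation outliers into the weights before quantizing.)

```python
import numpy as np

def quantize_int8(x):
    # Symmetric per-tensor INT8 quantization (illustrative sketch,
    # NOT the SmoothQuant implementation itself).
    scale = np.abs(x).max() / 127.0          # map the largest magnitude to 127
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    # Recover an approximate float tensor from the INT8 codes.
    return q.astype(np.float32) * scale
```

Round-tripping a tensor this way bounds the per-element error by about half the scale, which is why outlier activations (the problem SmoothQuant targets) hurt: one large value inflates the scale for the whole tensor.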
> MF8
I've never heard of this and Google isn't being helpful. Any links?
diviramon t1_iyejg7z wrote
It is the new Nvidia FP8 data type: https://developer.nvidia.com/blog/nvidia-hopper-architecture-in-depth/
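(To make the format concrete: Hopper's FP8 comes in two variants, E4M3 and E5M2. A small sketch of an E4M3 decoder, assuming the OCP/NVIDIA convention of bias 7, subnormals at exponent 0, a single NaN encoding, and no infinities; this helper is illustrative, not part of any library:)

```python
def decode_e4m3(byte):
    # FP8 E4M3: 1 sign bit, 4 exponent bits (bias 7), 3 mantissa bits.
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp = (byte >> 3) & 0xF
    man = byte & 0x7
    if exp == 0xF and man == 0x7:
        return float("nan")                    # E4M3 has NaN but no infinities
    if exp == 0:
        return sign * 2.0 ** -6 * (man / 8.0)  # subnormal range
    return sign * 2.0 ** (exp - 7) * (1.0 + man / 8.0)
```

With only 3 mantissa bits the relative precision is coarse, but the dynamic range (up to 448 for E4M3) is much wider than INT8's, which is the trade-off the Hopper blog post discusses.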