Submitted by MahmoudAbdAlghany t3_zg71j1 in deeplearning
I am looking for a quantization framework that can transform full models into quantized ones with arbitrary bit widths (e.g. 10-bit weights and 12-bit activations).

TensorFlow and PyTorch seem to only support quantizing to 8-bit integers and 16-bit floating-point values.

Any ideas?
suflaj t1_izfke61 wrote
There are none, unless you plan on emulating them, which you'd have to do yourself.
The available quantization widths correspond to what the hardware can actually execute, and hardware generally supports widths that are multiples of a byte.
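The emulation mentioned above is usually done with "fake quantization": values are snapped to an n-bit integer grid but kept in floating point, so the model still runs on ordinary hardware while behaving as if it were quantized. A minimal sketch of the idea (the helper name and the symmetric uniform scheme here are illustrative, not an API from TensorFlow or PyTorch):

```python
def fake_quantize(values, num_bits=10):
    """Emulate symmetric uniform n-bit quantization in floating point.

    Values are rounded to a signed n-bit integer grid and immediately
    dequantized, so the result is still a list of ordinary floats that
    any hardware can process ("fake quantization").
    """
    qmax = 2 ** (num_bits - 1) - 1            # e.g. 511 for signed 10-bit
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / qmax                    # map the value range onto the grid
    quantized = [max(-qmax - 1, min(qmax, round(v / scale))) for v in values]
    return [q * scale for q in quantized]     # dequantize back to floats
```

With `num_bits=4` the output can take at most 16 distinct levels, which lets you measure the accuracy impact of an arbitrary width without any special hardware support.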