burn_1298 wrote
Reply to [R] Is there any work being done on reduction of training weight vector size but not reducing computational overhead (eg pruning)? by Moose_a_Lini
It sounds like you just need straight data compression: not neural-network compression, but ordinary compression of stored numbers. That's going to be 10x or 100x better than anything pruning will do for you. There is active research on compressing models specifically for storage or transmission.
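For a rough sense of what this looks like, here's a minimal sketch (the `weights` array is a hypothetical stand-in for real model parameters, and zlib is just one general-purpose codec among many): compress the float32 weight vector as-is, then again after a lossy float16 cast. The stored file shrinks, but the weights get restored to full width at load time, so the computational overhead at inference is unchanged.

```python
import zlib
import numpy as np

# Stand-in for a trained model's flattened parameters (hypothetical data;
# real weight tensors usually compress better than random noise).
weights = np.random.randn(1_000_000).astype(np.float32)

# Lossless: general-purpose compression of the raw bytes.
raw = weights.tobytes()
compressed = zlib.compress(raw, 9)
print(f"float32 raw:  {len(raw)} bytes")
print(f"float32 zlib: {len(compressed)} bytes")

# Lossy: cast to float16 first, then compress. Halves the payload before
# the codec even runs, at the cost of some precision.
half = weights.astype(np.float16)
compressed_half = zlib.compress(half.tobytes(), 9)
print(f"float16 zlib: {len(compressed_half)} bytes")

# Round trip at load time: decompress and widen back to float32, so the
# network you actually run is the same shape and dtype as before.
restored = np.frombuffer(
    zlib.decompress(compressed_half), dtype=np.float16
).astype(np.float32)
```

The point of the sketch is the separation of concerns: pruning changes the model to shrink it, while this keeps the model intact and only changes how the numbers sit on disk or on the wire.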