mad_alim t1_iuwfsz0 wrote

If I understood correctly, you want to train a model on an earth-based training system and transmit the whole model to an FPGA in space for inference (i.e., actual usage).

The main paper I remember reading on this is Deep Compression: https://arxiv.org/abs/1510.00149
(they propose a pipeline of pruning, quantization, and Huffman coding to reduce model size while maintaining accuracy).
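
To make the quantization part concrete, here's a minimal sketch of the weight-sharing idea from that paper as I understand it (this is a simplification, not the authors' implementation; the layer shape and cluster count are arbitrary placeholders):

```python
# Sketch of weight-sharing quantization (simplified from Deep Compression).
import numpy as np
from sklearn.cluster import KMeans

def quantize_weights(weights, n_clusters=16):
    """Cluster a layer's weights into n_clusters shared float values.

    Each weight is then representable as a small integer index
    (log2(n_clusters) bits, e.g. 4 bits for 16 clusters) plus one
    shared codebook of floats per layer.
    """
    flat = weights.reshape(-1, 1)
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(flat)
    codebook = km.cluster_centers_.flatten()  # shared float values
    # uint8 here is just for demo convenience; in principle you'd
    # transmit packed 4-bit indices plus the tiny codebook.
    indices = km.labels_.astype(np.uint8).reshape(weights.shape)
    return codebook, indices

def dequantize(codebook, indices):
    """Reconstruct approximate weights on the receiving side."""
    return codebook[indices]

# Example: ~4 bits per weight instead of 32.
w = np.random.randn(64, 64).astype(np.float32)
codebook, idx = quantize_weights(w, n_clusters=16)
w_hat = dequantize(codebook, idx)
print("max reconstruction error:", np.abs(w - w_hat).max())
```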

In general, for embedded applications (or edge AI, low-power settings, etc.) it is very common to quantize ANNs, both weight- and activation-wise, going from 32-bit floats down to 8 bits or lower.
(It is so common that there is a dedicated toolchain for it: https://www.tensorflow.org/lite.)
What particularly interests you is weight quantization (because the weights are the bulk of what you'll transmit). So I'd recommend reading more on quantization.
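
For illustration, a rough sketch of post-training quantization with that toolchain (the toy model is a placeholder; note TFLite targets CPUs/microcontrollers rather than FPGAs, so in your case you'd more likely extract the quantized weights and feed them to your own FPGA inference pipeline):

```python
# Sketch: post-training dynamic-range quantization with TensorFlow Lite.
import tensorflow as tf

# Placeholder model; in your setup this would be the earth-side trained model.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Dynamic-range quantization: weights stored as int8, ~4x smaller payload.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)
```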

Another thing to consider is the architecture itself, which determines how many parameters you have in the first place.
In particular, convolutional layers reuse one small shared kernel (e.g. 3x3 weights) across every spatial position, whereas a dense layer is essentially a single matrix multiplication with n_in*n_out weights.
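
For a rough sense of the scale (arbitrary example sizes):

```python
# Parameter-count comparison, ignoring biases.
c_in, c_out = 64, 64
conv3x3_params = 3 * 3 * c_in * c_out  # shared kernel: 36,864 weights
print("3x3 conv:", conv3x3_params)

n_in, n_out = 4096, 4096
dense_params = n_in * n_out            # 16,777,216 weights
print("dense   :", dense_params)
```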
(Keep in mind that compression is just a tangent to my research topic and that my main education was not in CS/ML, so I might be missing relevant topics)
