YoloSwaggedBased t1_jdp9cge wrote

I can't find it now, but I've read a paper that essentially proposed this, at least for inference. You essentially attach a model output and a task loss after every n layers of the model. At training time you produce outputs at every exit, all the way to the end of the architecture, and at inference time you use some heuristic to decide how much accuracy you're willing to sacrifice in exchange for exiting at an earlier layer.
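Roughly, a minimal PyTorch sketch of that kind of multi-exit setup might look like the following. All the names here (ExitHead, EarlyExitModel, exit_every, confidence_threshold) are my own placeholders, not from the paper, and the confidence-threshold exit rule is just one example of the inference-time heuristic.

```python
import torch
import torch.nn as nn

class ExitHead(nn.Module):
    """Auxiliary classifier attached after a block of layers (illustrative)."""
    def __init__(self, hidden_dim, num_classes):
        super().__init__()
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, h):
        # Mean-pool over the sequence dimension, then classify.
        return self.classifier(h.mean(dim=1))


class EarlyExitModel(nn.Module):
    def __init__(self, num_layers=12, exit_every=3, hidden_dim=256, num_classes=10):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(hidden_dim, nhead=4, batch_first=True)
            for _ in range(num_layers)
        )
        # One exit head after every `exit_every` layers.
        self.exit_after = set(range(exit_every - 1, num_layers, exit_every))
        self.heads = nn.ModuleDict(
            {str(i): ExitHead(hidden_dim, num_classes) for i in self.exit_after}
        )

    def forward(self, x):
        """Training: return logits from every exit so each one gets a task loss."""
        h, all_logits = x, []
        for i, layer in enumerate(self.layers):
            h = layer(h)
            if i in self.exit_after:
                all_logits.append(self.heads[str(i)](h))
        return all_logits

    @torch.no_grad()
    def forward_early_exit(self, x, confidence_threshold=0.9):
        """Inference: stop at the first exit whose max softmax prob clears the threshold."""
        h = x
        for i, layer in enumerate(self.layers):
            h = layer(h)
            if i in self.exit_after:
                logits = self.heads[str(i)](h)
                if logits.softmax(-1).max() >= confidence_threshold:
                    return logits, i + 1  # prediction and number of layers actually run
        return logits, len(self.layers)


# Training: sum (or weight) the task loss over every exit.
model = EarlyExitModel()
x = torch.randn(2, 16, 256)            # (batch, seq_len, hidden_dim)
targets = torch.randint(0, 10, (2,))
loss = sum(nn.functional.cross_entropy(logits, targets) for logits in model(x))
loss.backward()
```

The threshold is where the accuracy/compute trade-off lives: lower it and more inputs exit early (cheaper, less accurate), raise it and more inputs run the full stack.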
