seba07 t1_itd3x25 wrote

A few random things I learned in practice: "Because it worked" is a valid answer to why you chose certain parameters or algorithms. Older architectures like ResNet are still state of the art for certain scenarios. Execution time is crucial, so we often take the smallest models available.

40

Furrealpony t1_itdbrky wrote

I am still shocked that DenseNets are not the standard in the industry. Also, with a cross-stage partial design you split the gradient flow, which makes training much easier. Is it the complexity of implementation that's holding them back, I wonder?

−8

Red-Portal t1_itliz8y wrote

Anybody who has tried to run DenseNet knows that it requires an absurd amount of memory compared to ResNets.
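
To put a rough number on that, here is a back-of-the-envelope sketch. It assumes a naive dense block where each layer's concatenated input is stored as its own tensor, and it only counts channels (spatial size, bottleneck layers and batch-norm intermediates are ignored). The function names and the 64/32 channel settings are made up for illustration, and a real ResNet stage is usually wider than 64 channels, so treat the ratio as indicative only.

```python
def dense_block_stored_channels(c0, growth_rate, num_layers):
    """Channels kept around if every layer's concatenated input is its own tensor.

    Layer i of a dense block sees the block input (c0 channels) plus the
    outputs of the i previous layers (growth_rate channels each).
    """
    return sum(c0 + i * growth_rate for i in range(num_layers))


def residual_stage_stored_channels(channels, num_layers):
    """Channels kept around in a plain residual stage: one activation per layer."""
    return channels * num_layers


if __name__ == "__main__":
    # Hypothetical settings: 64 input channels, growth rate 32.
    for n in (6, 12, 24):
        dense = dense_block_stored_channels(64, 32, n)
        res = residual_stage_stored_channels(64, n)
        print(f"layers={n:2d}  dense={dense:6d}  residual={res:5d}  ratio={dense / res:.2f}")
```

Under these assumptions the dense count grows quadratically with depth while the residual count grows linearly, which is where the memory blows up unless the implementation shares or recomputes the concatenated buffers.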

1