Submitted by _Arsenie_Boca_ t3_118cypl in MachineLearning
Many neural architectures include bottleneck layers, by which I mean projecting activations down to a lower dimension and then back up. This is used, for example, in ResNet blocks.
What is your intuition on why this is beneficial? From an information theory standpoint, it creates potential information loss due to the lower dimensionality. Can we see this as a form of regularisation, that makes the model learn more meaningful representations?
I'm interested in your intuitions on this matter, or in empirical results that might support them. Are you aware of other works that use bottlenecks, and what is their underlying reasoning?
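For concreteness, the bottleneck pattern described above can be sketched with plain numpy. This is a minimal, illustrative version of a ResNet-style block; the dimensions (256 and 64) and the random initialization are arbitrary assumptions, not taken from any particular paper.

```python
import numpy as np

# Hypothetical widths, purely illustrative: full width d, bottleneck width k
d, k = 256, 64

rng = np.random.default_rng(0)
W_down = rng.standard_normal((d, k)) / np.sqrt(d)  # project down to k dims
W_up = rng.standard_normal((k, d)) / np.sqrt(k)    # project back up to d dims

def bottleneck_block(x):
    """ResNet-style bottleneck: down-project, nonlinearity, up-project, residual add."""
    h = np.maximum(x @ W_down, 0.0)  # ReLU applied in the low-dimensional space
    return x + h @ W_up              # residual connection carries the full signal through

x = rng.standard_normal((8, d))
y = bottleneck_block(x)
print(y.shape)  # (8, 256)
```

Note that the residual branch here can add at most a rank-k update (the composition of the two projections has rank at most k < d), while the skip connection passes the input through unchanged, which is one way to see why the dimensionality reduction need not destroy information overall.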
aMericanEthnic t1_j9gf0l3 wrote
Bottlenecks are typically points outside of our control; a purposeful implementation of a bottleneck can only be explained as an attempt at ambiguity, in the sense that it tries to create the feel of a real-world constraint. These "bottlenecks" are unnecessary and should be removed.