
jabowery t1_jdm16ig wrote

Algorithmic information theory: the smallest model that losslessly reproduces ("memorizes") all the data is optimal. "Large" is only there because you sometimes need to expand a representation before you can compress it further. Think decompressing a gz file in order to recompress it with bz2. Countering over-fitting with over-informing (bigger data) yields interpolation at the expense of extrapolation.
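The gz-to-bz2 point can be demonstrated directly with the Python standard library: already-compressed bytes look statistically random to a second compressor, so bz2 gains almost nothing on a .gz stream, while decompressing first lets it exploit the redundancy. A minimal sketch (the sample data and sizes here are illustrative, not from the original comment):

```python
import bz2
import gzip

# Highly redundant data: compresses well under any scheme.
raw = b"the quick brown fox jumps over the lazy dog\n" * 2000

gz = gzip.compress(raw)

# bz2 applied to gzip's output: the deflate stream has little
# residual structure for bz2 to find, so the gain is negligible.
bz2_of_gz = bz2.compress(gz)

# Expanding back to the raw data first ("decompress gz in order to
# compress with bz2") exposes the redundancy to bz2 directly.
bz2_of_raw = bz2.compress(gzip.decompress(gz))

print(len(raw), len(gz), len(bz2_of_gz), len(bz2_of_raw))
```

Running this, `bz2_of_raw` comes out smaller than `bz2_of_gz`: to compress better, you first had to get "larger" again.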

If you understand all of the above, you'll be light-years beyond the current ML industry, including the political/religious bias of the "algorithmic bias experts".
