keepthepace t1_jdzvxl2 wrote

Maybe I am stubborn, but I haven't fully digested the "bitter lesson" and I am not sure I agree with its inevitability. Transformers did not appear magically out of nowhere; they were a solution to the vanishing-gradient problem of RNNs. AlphaGo had to be wrapped in a Monte Carlo tree search to do anything good, and it is hard not to feel that LLMs' grounding issues may be a problem to solve with architecture changes rather than scale.
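
To make the AlphaGo point concrete, here is a minimal sketch of how a learned policy/value network plugs into Monte Carlo tree search. Everything here is a hypothetical stand-in, not DeepMind's implementation: the toy game (`legal_actions`, `apply`), the stubbed `policy_value_net`, and the `c_puct` constant are all assumptions, and the sketch omits the sign-flipping needed for alternating players in a real two-player game.

```python
# Hypothetical sketch: a learned network guiding MCTS, AlphaGo-style.
import math
import random

class Node:
    def __init__(self, state, prior):
        self.state = state
        self.prior = prior        # P(s, a) from the policy network
        self.visits = 0
        self.value_sum = 0.0
        self.children = {}        # action -> Node

    def value(self):
        return self.value_sum / self.visits if self.visits else 0.0

def legal_actions(state):
    return [0, 1, 2]              # hypothetical 3-move toy game

def apply(state, action):
    return state + (action,)      # states are just move histories here

def policy_value_net(state):
    """Stand-in for a trained network: returns (priors, value).
    A real system would evaluate a neural net here; this stub is uniform/random."""
    actions = legal_actions(state)
    priors = {a: 1.0 / len(actions) for a in actions}
    return priors, random.uniform(-1, 1)

def select_child(node, c_puct=1.5):
    """PUCT rule: exploit the value estimate, explore in proportion to the prior."""
    def score(action, child):
        u = c_puct * child.prior * math.sqrt(node.visits) / (1 + child.visits)
        return child.value() + u
    return max(node.children.items(), key=lambda kv: score(*kv))

def mcts(root_state, num_simulations=100):
    priors, _ = policy_value_net(root_state)
    root = Node(root_state, prior=1.0)
    root.children = {a: Node(apply(root_state, a), p) for a, p in priors.items()}
    for _ in range(num_simulations):
        node, path = root, [root]
        # Selection: descend via PUCT until reaching a leaf.
        while node.children:
            action, node = select_child(node)
            path.append(node)
        # Expansion + evaluation: the network's value replaces random rollouts.
        priors, value = policy_value_net(node.state)
        if len(node.state) < 6:   # hypothetical depth cap for the toy game
            node.children = {a: Node(apply(node.state, a), p)
                             for a, p in priors.items()}
        # Backup: propagate the value estimate along the visited path.
        for n in path:
            n.visits += 1
            n.value_sum += value
    # Act by visit count, as AlphaGo-style agents typically do.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]

if __name__ == "__main__":
    print("chosen move:", mcts(()))
```

The point of the sketch is structural: the network alone doesn't play; it supplies priors and value estimates that the search procedure turns into strong moves, which is an architecture choice, not a scaling one.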