Submitted by Johns-schlong t3_zpczfe in singularity
rainy_moon_bear t1_j0utlxp wrote
If you consider transformer models to be progress toward AGI, then I think the answer is hardware.
There really isn't anything too shocking or new about the transformer architecture; it is derived from statistics and ML concepts that have been around for a while.
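For instance, the core operation of a transformer, scaled dot-product attention, is essentially just a softmax-weighted average of vectors. A minimal NumPy sketch (function names are my own, shapes are illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V  (Vaswani et al., 2017)
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # pairwise query/key similarity
    weights = softmax(scores)        # each row is a probability distribution
    return weights @ V               # weighted average of the value vectors

# Toy example: 4 tokens, 8-dimensional embeddings.
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

Nothing here goes beyond matrix multiplication and a softmax; the novelty was less the math than making it cheap enough to scale.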
Of course, advancing the architecture and training methods is important, but the only reason these models did not exist sooner seems to be hardware cost efficiency.