Submitted by These-Assignment-936 t3_10y2mu0 in MachineLearning
avocadoughnut t1_j7yaq8w wrote
Reply to comment by Sm0oth_kriminal in [D] Using LLMs as decision engines by These-Assignment-936
I'm considering a higher-level idea. There's no way that transformers are the be-all and end-all of model architectures. By identifying the mechanisms that large models are learning, I'm hoping a better architecture can be found, one that reduces the total number of multiplications and samples needed for training. It's like feature engineering.