avocadoughnut t1_j7yaq8w wrote on February 10, 2023 at 7:00 AM

Reply to comment by Sm0oth_kriminal in [D] Using LLMs as decision engines by These-Assignment-936

I'm considering a higher-level idea. There's no way that transformers are the be-all and end-all of model architectures. By identifying the mechanisms that large models are learning, I'm hoping a better architecture can be found that reduces the total number of multiplications and samples needed for training. It's like feature engineering.