LowPressureUsername t1_jdq0nsn wrote

It’s mostly the computational power available, AFAIK. More context = more tokens = more processing power required.
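To illustrate the point (my own back-of-the-envelope sketch, not from this thread): in a standard transformer, self-attention compares every token with every other token, so the attention compute grows roughly quadratically with context length. The `d_model` value and the FLOP formula below are illustrative assumptions, not numbers for any specific model.

```python
def attention_flops(n_tokens: int, d_model: int) -> int:
    """Rough FLOP count for one self-attention layer.

    Two n x n x d matrix multiplies dominate (Q @ K^T and attn @ V),
    each costing about 2 * n * n * d multiply-adds.
    """
    return 2 * (2 * n_tokens * n_tokens * d_model)

# Doubling the context quadruples the attention compute:
for n in (2048, 4096, 8192):
    print(n, attention_flops(n, d_model=4096))
```

So going from a 2k to an 8k context multiplies this term by 16, which is one reason longer contexts are expensive even when the parameter count stays the same.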

yaru22 t1_jdron1b wrote

So it's not an inherent limitation on the number of parameters the model has? Or is that what you meant by more processing power? Do you or does anyone have some pointers to papers that talk about this?
