lucidraisin t1_jamtx7b wrote

It cannot; the compute still scales quadratically, although the memory bottleneck is now gone. However, I expect everyone to be training at 8k or even 16k context within two years, which is more than enough for previously inaccessible problems. For context lengths at the next order of magnitude (say, genomics at a million base pairs), we will have to see whether linear attention (RWKV) pans out, or whether recurrent + memory architectures make a comeback.
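To make that distinction concrete, here is a minimal NumPy sketch (an illustration only; function names and the chunk size are arbitrary, and this is a simplification of FlashAttention-style tiling, not the real kernel): chunking the query dimension avoids ever materializing the full (n, n) score matrix, so extra memory stays roughly linear in sequence length, but the number of multiply-adds is unchanged, so compute remains quadratic.

```python
import numpy as np

def softmax_attention(q, k, v):
    """Standard attention: materializes an (n, n) score matrix,
    so both compute and memory scale quadratically in sequence length n."""
    scores = q @ k.T / np.sqrt(q.shape[-1])                      # (n, n)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                                           # (n, d)

def chunked_attention(q, k, v, chunk=256):
    """Memory-efficient variant: processes queries in fixed-size chunks, so
    only a (chunk, n) slice of scores exists at a time (O(n) extra memory for
    a constant chunk size). The multiply-add count is identical, so compute
    is still O(n^2 * d)."""
    out = np.empty_like(q)
    for start in range(0, q.shape[0], chunk):
        out[start:start + chunk] = softmax_attention(q[start:start + chunk], k, v)
    return out

n, d = 4096, 64
q, k, v = (np.random.randn(n, d) for _ in range(3))
assert np.allclose(softmax_attention(q, k, v), chunked_attention(q, k, v))
```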


LetterRip t1_janljeo wrote

Ah, I'd not seen the Block Recurrent Transformers paper before, interesting.
