Thunderbird120
Thunderbird120 t1_jajok9y wrote
Reply to comment by LetterRip in [D] OpenAI introduces ChatGPT and Whisper APIs (ChatGPT API is 1/10th the cost of GPT-3 API) by minimaxir
I'm curious which memory-efficient transformer variant they've figured out how to leverage at scale. They're obviously using one of them, since they're offering models with 32k context, but it's not clear which one.
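For a rough sense of why a 32k context points to some memory-efficient variant, here is a back-of-the-envelope sketch of what it costs to materialize the full attention score matrix in standard attention. The fp16 element size and the helper name `attention_scores_bytes` are assumptions for illustration only; nothing here reflects what OpenAI actually runs.

```python
def attention_scores_bytes(seq_len: int, bytes_per_element: int = 2) -> int:
    """Bytes needed to hold one full seq_len x seq_len attention score matrix.

    bytes_per_element=2 assumes fp16 storage (an assumption, not a known detail).
    """
    return seq_len * seq_len * bytes_per_element


if __name__ == "__main__":
    # Memory grows with the square of the context length.
    for n in (2_048, 4_096, 32_768):
        gib = attention_scores_bytes(n) / 2**30
        print(f"{n:>6} tokens: {gib:.4f} GiB per head, per layer")
```

Under those assumptions, a 32,768-token context needs 2 GiB for a single head's score matrix in a single layer, before multiplying by head count, layer count, or batch size. That quadratic growth is the cost that memory-efficient attention variants are designed to avoid.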
Thunderbird120 t1_jaarrg9 wrote
Thunderbird120 t1_jaaqq3u wrote
Reply to comment by Animal_Prong in TIL the last B-52 Bomber produced for the US left the factory 10/26/1962 - the same day as the climax of the Cuban Missile Crisis - they're still used 60 yrs later. by GoGaslightYerself
Nope, they live on. The B-21 replaces the B-1 and the B-2, but the B-52 continues. There are a lot of roles that don't need stealth but do need significant payload, range, and the ability to bolt large, oddly shaped things under the wings. The B-52s theoretically take some pressure off the B-21s for things like chucking long-range cruise missiles or deploying MALDs. You can technically do that out of cargo planes these days, but the B-52s already exist, are a little better for the role, and don't really cost that much to operate.
Thunderbird120 t1_jakbyew wrote
Reply to comment by lucidraisin in [D] OpenAI introduces ChatGPT and Whisper APIs (ChatGPT API is 1/10th the cost of GPT-3 API) by minimaxir
You're better qualified to know than nearly anyone who posts here, but is flash attention really all that's necessary to make that feasible?