rePAN6517 t1_izilxsq wrote on December 9, 2022 at 11:19 AM

The paper only tested against InstructGPT 175B / text-davinci-002. They did not test against ChatGPT or text-davinci-003.

If they had, I think the paper would obviously be titled "Large language models are zero-shot communicators".

CommunismDoesntWork t1_izj06ql wrote on December 9, 2022 at 1:44 PM

Yeah, we're at the point where models are improving faster than we can evaluate them lol

leliner t1_iznqdj4 wrote on December 10, 2022 at 2:05 PM

https://postimg.cc/cgxTrGYG

[deleted] t1_iziwvfr wrote on December 9, 2022 at 1:16 PM

[deleted]

egrefen t1_iznmjuu wrote on December 10, 2022 at 1:29 PM

Those models weren’t released at the time of writing. I would love it if these models significantly moved the dial on this benchmark, as that would confirm the direction we see with Davinci. Curious to hear why you are so confident, though.