Submitted by mrx-ai t3_zgr7nr in MachineLearning
rePAN6517 t1_izilxsq wrote
Paper only tested against InstructGPT 175B / text-davinci-002. They did not test against ChatGPT or text-davinci-003.
If they had, I think the paper would obviously be titled "Large language models are zero-shot communicators"
CommunismDoesntWork t1_izj06ql wrote
Yeah, we're at the point where models are improving faster than we can evaluate them lol
leliner t1_iznqdj4 wrote
egrefen t1_iznmjuu wrote
Those models weren’t released at time of writing. I would love it if these models significantly moved the dial on this benchmark, as that would confirm the direction we see with Davinci. Curious to hear why you are so confident, though.