Jackdaw99 t1_jcaxztr wrote
Reply to comment by Hypo_Mix in Exam results for recently released GPT 4 compared to GPT 3.5 by balancetheuniverse
I would imagine the language model siphons questions off to a calculator. It would be pretty easy, considering the infinite supply of numbers, to give it a simple arithmetic problem that has never been written down or even devised before. Say, "What's 356.639565777 divided by 1.6873408216337?" I would be very surprised if it didn't get this sort of thing right.
Follow-up: I just tried that calculation on ChatGPT and it got it...wrong. Twice. With different answers each time. Though it was close...
That's bizarre to me, since it couldn't have used a language model to calculate that, and in fact it explicitly told me it was sending the calculation to Python. So I don't know what's going on here.
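For what it's worth, checking the real answer outside the model is trivial (a Python 3 one-off; the value in the comment is just what exact decimal arithmetic gives, roughly):

```python
from decimal import Decimal, getcontext

# Check the "never-asked-before" division with exact decimal arithmetic
# instead of trusting a language model's token-by-token guess.
getcontext().prec = 20
quotient = Decimal("356.639565777") / Decimal("1.6873408216337")
print(quotient)  # roughly 211.3619022...
```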
torchma t1_jcbubdj wrote
I don't get your comment. You know it's a language model and not a calculator, and yet you're surprised that it got a calculation wrong? And no, it doesn't send anything to anything else. It's a language model. It's just predicting that "I'm sending this calculation to Python" is the most likely sequence of words to come next.
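Roughly speaking, the entire mechanism looks like this toy sketch (made-up candidate strings and scores, nothing like the real model's vocabulary or weights):

```python
import numpy as np

# Toy sketch of next-token prediction. The model scores candidate
# continuations and emits the likeliest one; at no point does it
# actually perform the calculation it's talking about.
candidates = ["211.36...", "I'm sending this calculation to Python", "let me think"]
logits = np.array([1.1, 3.7, 0.4])             # hypothetical model scores
probs = np.exp(logits) / np.exp(logits).sum()  # softmax into probabilities
print(candidates[int(np.argmax(probs))])       # the "most likely" next words
```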
Jackdaw99 t1_jccaord wrote
That doesn't make sense to me. It would be the easiest thing in the world to build a calculator into it: have it send questions that look like basic arithmetic to the calculator, then spit the answer back out. Hell, they could even build in access to Wolfram Alpha. Then it wouldn't make basic mistakes and would be much more impressive. And after all, that's what a person would do.
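Something like this crude sketch is all I mean (respond() and language_model() are names I'm inventing for illustration, not anything OpenAI actually exposes):

```python
import operator
import re

# Hypothetical detect-and-delegate wrapper: route prompts that look like
# bare arithmetic to a real calculator; hand everything else to the LLM.
ARITHMETIC = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([-+*/])\s*(\d+(?:\.\d+)?)\s*$")
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def language_model(prompt: str) -> str:
    return "(placeholder for the model's ordinary text generation)"

def respond(prompt: str) -> str:
    match = ARITHMETIC.match(prompt)
    if match:
        a, op, b = match.groups()
        return str(OPS[op](float(a), float(b)))  # calculator path: exact
    return language_model(prompt)                # everything else: the LLM

print(respond("356.639565777 / 1.6873408216337"))  # takes the calculator path
```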
Moreover, if it doesn't have the ability to calculate at all, how did it get so close to the answer when I fed it a problem which, I'm pretty sure, no one has ever tried to solve before?
And finally, how did it do so well on the math SATs if it was just guessing at what people would expect the next digit to be?
I'm not saying you're wrong, I'm just baffled by why they wouldn't implement that kind of functionality. Because as it stands, no one is ever going to use it for anything that requires even basic math skills. "ChatGPT, how many bottles of wine should I buy for a party with 172 attendees?" I'm not going to shop based on its answer.
Maybe this iteration is just a further proof of concept, but if so, all it proves is that the concept is useless for many applications.
torchma t1_jccg5ej wrote
Because basic calculation is already a solved problem. OpenAI is concerned with pushing the frontiers of AI, not with trivial integration of current systems. No one is going to use GPT for basic calculation anyways. People already have ubiquitous access to basic calculators. It's not part of GPT's core competency, so why waste time on it? What is part of GPT's core competency is an advanced ability to process language.
That's not to say that they are necessarily ignoring the math problem. But the approach you are suggesting is not an AI-based approach. You are suggesting a programmatic approach (i.e. "if this, then do this..."). If they were only concerned with turning ChatGPT into a basic calculator, that might work. But that's a dead-end. If OpenAI is addressing the math problem, they would be taking an AI approach to it (developing a model that learns math on its own). That's a much harder problem, but one with much greater returns from solving it.
Jackdaw99 t1_jcclxlr wrote
If they're going to release it and people are going to use it (whatever the warnings may be), I don't think it's trivial at all. Basic math factors into a significant percentage of the conversations we have. And it's certainly not trivial to be able to tell when a question needs it.
I'm not calling for it to be turned into a basic calculator: I'm asking why they don't recognize that a portion of the answers they provide will be useless without being able to solve simple math problems.
They could certainly build in a calculator now and continue to explore ways for it to learn math on its own. I just don't understand why you would release a project that gets so much wrong that could easily be made right. (And nothing I've read on it, which is a non-trivial amount, mentions that it can't (always) calculate basic arithmetic.) If I can't count on the thing to know for sure the answer to a basic division problem, I can't count on it at all -- at which point, there's no reason to use it.
torchma t1_jccsuru wrote
>I'm asking why they don't recognize that a portion of the answers they provide will be useless without being able to solve simple math problems.
?? They absolutely recognize that it's not good at math. It's not meant to be good at math. If you're still using it for math despite being told it's not good for math and despite the obvious results showing it's not good at math, then that's your problem, not theirs. That it's not good at math hardly negates its core competencies, which are remarkable and highly valuable.
>They could certainly build in a calculator now
What? That would be absolutely pointless. They might as well just tell you to use your own calculator. In fact, that's what Bing Chat would tell people if you asked it to do a math problem back before they neutered it.