
Surur t1_jdqfxw6 wrote

I asked ChatGPT:


Large language models, like GPT-4, are primarily designed for natural language processing tasks, such as understanding and generating human-like text. While these models can perform some mathematical operations, they have inherent limitations when it comes to solving complex math problems. Here are some reasons why large language models might struggle with math:

  1. Limited training data: The training data for large language models primarily consists of text from the internet, which contains fewer math problems and solutions than natural language content. This means the models have fewer opportunities to learn mathematical reasoning and problem-solving skills.

  2. Sequential processing: Language models read and process text in a linear, sequential manner, making it challenging for them to handle complex mathematical problems that require multi-step reasoning or manipulation of symbols and equations.

  3. Inability to represent precise numerical values: The token-based nature of language models can make it difficult for them to represent and manipulate exact numerical values, especially when dealing with very large or very small numbers.

  4. Lack of specialized mathematical knowledge: While large language models can learn general mathematical concepts, they lack the specialized knowledge and techniques required to solve advanced mathematical problems, such as those found in higher-level calculus or abstract algebra.

  5. No built-in error-checking: Language models are designed to generate text that sounds fluent and coherent, but they do not have built-in mechanisms to verify the correctness of their mathematical solutions.

  6. Inability to handle symbolic notation: Language models can struggle with the manipulation of mathematical symbols and expressions, which often requires a deeper understanding of mathematical structure and logic.

These limitations make large language models less suited for advanced mathematical tasks. However, they can still be useful for simple arithmetic, understanding math-related natural language queries, or providing approximate answers. For more complex math problems, specialized tools and software, such as computer algebra systems (CAS), are more appropriate.
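To make the CAS point concrete, here is a minimal sketch of offloading the exact math to such a system (assuming the sympy package is installed; this example is mine, not ChatGPT's):

```python
# Minimal sketch: offloading exact math to a computer algebra system (sympy).
# Assumes the sympy package is installed.
import sympy

x = sympy.symbols("x")

# Solve x**2 - 5*x + 6 = 0 exactly instead of predicting digits token by token.
solutions = sympy.solve(sympy.Eq(x**2 - 5*x + 6, 0), x)
print(solutions)  # [2, 3]

# Exact rational arithmetic, with no floating-point or token-level rounding.
print(sympy.Rational(1, 3) + sympy.Rational(1, 6))  # 1/2
```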


I think 2 and 3 are the most significant.

1

RadioFreeAmerika OP t1_jdqix38 wrote

Thanks! I will play around with maths questions expressed solely in language. What I wonder about, however, is not the complex questions but the simple ones, for which incorrect replies are quite common, too.

From the response it seems that, while some problems are inherent to LLMs, most can, and most probably will, be addressed in future releases.

Number 1 just needs more mathematical content in the training data.

Number 2 could be addressed by processing the output a second time before it is returned, or alternatively by running it through another plugin (see the sketch below). Ideally, the processed sequence length would be increased. Non-linear sequence processing might also be an option, but I have no insights into that.
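A hedged sketch of what that second pass could look like; `ask_llm` here is a hypothetical stand-in for whatever API is actually called, not a real library function:

```python
# Hypothetical sketch of a second-pass check: the model's first answer is fed
# back to the model for verification before being shown to the user.
# `ask_llm` is a placeholder, not a real API.

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("replace with a real LLM API call")

def answer_with_second_pass(question: str) -> str:
    draft = ask_llm(f"Solve step by step: {question}")
    verdict = ask_llm(
        f"Question: {question}\nProposed solution: {draft}\n"
        "Check each step. Reply 'OK' or give a corrected solution."
    )
    return draft if verdict.strip() == "OK" else verdict
```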

Number 3 shouldn't be a problem for most everyday maths problems, depending on the definition of "precise": just cut off after two decimal places, for example. For maths that is useful in professional settings, it will be a problem, though.
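To see why exact values are awkward in the first place, a small sketch (assuming the tiktoken package; the exact token boundaries depend on the tokenizer):

```python
# Sketch: long numbers get split into several tokens, so the model never sees
# "1234567.891" as a single unit. Assumes the tiktoken package is installed.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
tokens = enc.encode("1234567.891")
print([enc.decode([t]) for t in tokens])  # e.g. ['123', '456', '7', '.', '891']

# The "cut off after two decimal places" workaround from above:
print(round(1234567.891, 2))  # 1234567.89
```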

Number 4 gets into the hard stuff. I have nothing to offer here besides using more specialized plugins.

Number 5 can easily be addressed. Even without plugins, GPT can identify and fix code errors (at least sometimes, in my experience). That seems quite similar to catching errors in "mathematical code".
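A minimal sketch of such a mechanical check, using sympy to verify a claimed answer instead of trusting fluent-sounding output (the claimed answer is made up for illustration):

```python
# Sketch: verify a model's claimed answer mechanically. Assumes sympy;
# the claimed answer below is illustrative.
import sympy

x = sympy.symbols("x")
equation = sympy.Eq(2*x + 3, 11)
claimed_answer = 5  # suppose the model answered 5 (which is wrong)

# Substitute the claimed answer back in and check both sides match.
is_correct = sympy.simplify(equation.lhs.subs(x, claimed_answer) - equation.rhs) == 0
print(is_correct)  # False; the correct answer is 4
```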

Number 6 is a bit strange to me. Just translate the symbolic notation into the internal working language of the LLM, "solve" it in natural-language space, and retranslate it into symbolic-notation space. Otherwise, use image recognition. If GPT-4 could recognize that a VGA plug doesn't fit into a smartphone and understand this as a joke, it should be able to identify meaning in symbolic notation.
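The translate-solve-retranslate loop can at least be sketched with sympy's parser and LaTeX printer standing in for the two translation steps (the middle step is where an LLM would actually do the work):

```python
# Sketch of translate -> solve -> retranslate: parse symbolic input, solve it,
# and print the result back in symbolic (LaTeX) notation. Assumes sympy;
# a real system would put the LLM in the middle step.
import sympy

expr = sympy.sympify("x**2 - 4")                # "translate" notation into objects
roots = sympy.solve(expr, sympy.symbols("x"))   # "solve" in that space
print(sympy.latex(roots))                       # "retranslate" into notation
```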

Besides all that, now I want a "childlike" AI that I can train until it has "grown up", the student becomes the master, and it can help me better understand things.

2

Surur t1_jdqjdyr wrote

I would add that one issue is that transformers are not Turing complete, so they cannot perform an arbitrary calculation of arbitrary length. Recurrent neural networks, which loop, are, however, so it is not a fundamental issue.

There are also ways to make transformers Turing complete.
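A toy illustration of the control-flow difference, with plain Python loops standing in for the models (not a real transformer or RNN, just the point about fixed versus unbounded computation):

```python
# Toy illustration: a transformer spends a fixed number of layers per input,
# while a recurrent loop can keep going until the computation finishes.
from typing import Optional

def fixed_depth_collatz(n: int, depth: int = 12) -> Optional[int]:
    # "Transformer-like": a fixed number of steps, regardless of the input.
    for _ in range(depth):
        if n == 1:
            return 1
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    return None  # ran out of layers before finishing

def recurrent_collatz(n: int) -> int:
    # "RNN-like": loops for as long as the computation needs.
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    return 1

print(fixed_depth_collatz(27))  # None: 27 needs 111 steps, more than 12 layers
print(recurrent_collatz(27))    # 1
```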

3

FoniksMunkee t1_jdqt5ci wrote

Regarding 2, MS says: "We believe that the ... issue constitutes a more profound limitation."

They say: "...it seems that the autoregressive nature of the model which forces it to solve problems in a sequential fashion sometimes poses a more profound difficulty that cannot be remedied simply by instructing the model to find a step by step solution" and "In short, the problem ... can be summarized as the model’s “lack of ability to plan ahead”."

Notably, MS did not provide a solution for this, and instead pointed at another paper, by LeCun, that suggests a non-LLM model to solve the issue. Which is not super encouraging.

2