jackmountion t1_j9jasgc wrote

Could also be pretraining, though. That's another theory: some stuff from other languages leaks into the pretraining data. But I personally don't buy that, it's simply not enough data. Maybe both theories are slightly right: it's generalizing better than we thought, but it needs some language context at first?


jackmountion t1_j9janho wrote

Well, it doesn't. ChatGPT's training data is largely English with some other languages mixed in, but those are extremely limited (not exactly sure about this). ChatGPT understanding Chinese could be part of a strange phenomenon we don't completely understand yet. There has been a major paper about it, and it seems that these LLMs have an emergent capacity to generalize to languages they weren't trained on. One theory is that during learning the model is actually learning a grammar structure, since that's the most efficient way to "understand" human language, and that structure can easily carry over to other languages. Sorta like if I really learn the ins and outs of Calculus, I can give you a general understanding of what the math in Physics is doing without taking a Physics class. What's amazing, if true, is that this would mean AI generalizes much more easily than anticipated. It might even give insight into how these statistical models seem to have theory-of-mind capabilities.

Here's a dude talking about the study; hopefully you can use this to find it. It's very recent. https://twitter.com/janleike/status/1625207251630960640?t=3z0NEYPFifguL2u8NOCWfA&s=19