CptTombstone t1_jdhkpsp wrote
Reply to comment by Maleficent_Refuse_11 in [D] "Sparks of Artificial General Intelligence: Early experiments with GPT-4" contained unredacted comments by QQII
>At best it can recreate latent patterns in the training data.
Have you actually read the paper? On fresh LeetCode tests, GPT-4 significantly outperforms humans at every difficulty level, reaching nearly double the performance of LeetCode users on medium and hard questions. Those are tests that were recently added to LeetCode's database and were not in the training data. Also, it performs genuinely well at image generation through SVG code. The 3D modelling in JavaScript example (Figure 2.7) is way outside what you would expect from "just a transformer"; it demonstrates real understanding beyond the domain of the training data. It even outperforms purpose-built image generation models like Stable Diffusion in some regards, namely adherence to instructions. The generated images are not as visually pleasing as those from the likes of DALL-E or Stable Diffusion, but that is a very unfair complaint to make about a freaking language model.