mlresearchoor
mlresearchoor t1_j92bdan wrote
Reply to [D] Please stop by [deleted]
the crypto and NFT crowd just discovered AI; they're clueless, and they're starting AI companies
mlresearchoor t1_j6r93hm wrote
Reply to comment by RandomCandor in [R] Faithful Chain-of-Thought Reasoning by starstruckmon
we got front-row seats to this race and a chance to participate, +1 great time to be alive
mlresearchoor t1_j6r8x7y wrote
Reply to [R] Faithful Chain-of-Thought Reasoning by starstruckmon
nice find! it would also be helpful to compare with the similar 2022 papers that this paper cites but did not compare against in its results section
("We note that our work is concurrent with Chen et al. (2022) and Gao et al. (2022), both generating the reasoning chain in Python code and calling a Python interpreter to derive the answer. While we do not compare with them empirically since they are not yet published...")
Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks (Chen)
https://arxiv.org/abs/2211.12588
PAL: Program-aided Language Models (Gao)
https://arxiv.org/abs/2211.10435
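For readers unfamiliar with these papers, the idea both share (per the quote above) is that the LLM emits its reasoning chain as a Python program, and a Python interpreter runs that program to derive the final answer. A minimal sketch, with the model call replaced by a hard-coded program (since there's no API here), and assuming the PAL-style convention that the chain binds its result to a variable named `answer`:

```python
# Program-aided reasoning, PAL / Program-of-Thoughts style:
# the model writes Python instead of natural-language steps,
# and the interpreter (not the LLM) computes the answer.

# Hypothetical model output for the word problem:
# "Olivia has $23. She buys five bagels at $3 each. How much is left?"
model_generated_program = """
money_initial = 23
bagels = 5
bagel_cost = 3
money_spent = bagels * bagel_cost
answer = money_initial - money_spent
"""

def run_reasoning_chain(program: str) -> int:
    # Execute the generated chain in a fresh namespace; by convention
    # the final result is bound to `answer` (an assumption here,
    # mirroring how PAL extracts its solution variable).
    namespace = {}
    exec(program, namespace)
    return namespace["answer"]

print(run_reasoning_chain(model_generated_program))  # → 8
```

The point of the design is faithfulness: the arithmetic is delegated to the interpreter, so the stated reasoning chain and the final answer can't silently disagree.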
mlresearchoor t1_j1y8ijq wrote
Reply to [D] What are some applied domains where academic ML researchers are hoping to produce impressive results soon? by [deleted]
Impressive applied ML results will come in healthcare, multimedia (e.g., video summarization), sustainability, efficient ML (e.g., TinyML), robotics (e.g., vision-language navigation), human-machine interaction, and more. It's important for our community to value research that uses smaller domain-specific datasets, as well as massive datasets.
But many of the greatest breakthroughs in the next decade will probably come from collaborations between those academic ML researchers and large industry labs.
mlresearchoor t1_je1mvf7 wrote
Reply to [N] OpenAI may have benchmarked GPT-4's coding ability on its own training data by Balance-
OpenAI blatantly ignored the norm of not training on the ~200 tasks collaboratively prepared by the community for BIG-bench. GPT-4 knows the BIG-bench canary ID afaik, which invalidates any GPT-4 eval on BIG-bench.
OpenAI is cool, but they genuinely don't care about academic research standards or benchmarks carefully created over years by other folks.
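For context on the canary mechanism mentioned above: BIG-bench embeds a canary GUID in its task files so that anyone can probe whether a model was trained on the benchmark. A hedged sketch of that contamination check, where `PLACEHOLDER_CANARY_GUID`, the prompt wording, and `query_model` are all illustrative stand-ins (the real GUID and preamble live in the BIG-bench repository, and `query_model` would be an actual LLM completion call):

```python
# Canary-string contamination check, sketched. If a model can complete
# the canary prompt with the GUID, the benchmark leaked into training data.

# Hypothetical placeholder -- NOT the real BIG-bench canary GUID.
PLACEHOLDER_CANARY_GUID = "00000000-0000-0000-0000-000000000000"

def query_model(prompt: str) -> str:
    # Stand-in for a real LLM API call. Here we simulate a contaminated
    # model that memorized the canary from its training corpus.
    return PLACEHOLDER_CANARY_GUID

def is_contaminated(canary_guid: str) -> bool:
    # Illustrative prompt wording; the actual canary preamble is
    # defined by the benchmark itself.
    prompt = "This benchmark's canary GUID is "
    completion = query_model(prompt)
    return canary_guid in completion

print(is_contaminated(PLACEHOLDER_CANARY_GUID))  # → True (simulated)
```

A model that never saw the benchmark should have no way to produce the GUID, which is exactly why a known canary ID undermines the eval.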