mlresearchoor
mlresearchoor t1_j92bdan wrote
Reply to [D] Please stop by [deleted]
the crypto and NFT crowd just discovered AI; they're clueless, and they're starting AI companies
mlresearchoor t1_j6r93hm wrote
Reply to comment by RandomCandor in [R] Faithful Chain-of-Thought Reasoning by starstruckmon
we got front-row seats to this race and a chance to participate, +1 great time to be alive
mlresearchoor t1_j6r8x7y wrote
Reply to [R] Faithful Chain-of-Thought Reasoning by starstruckmon
nice find! it would also be helpful to compare with the similar 2022 papers that this paper cites but did not compare against in its results section
("We note that our work is concurrent with Chen et al. (2022) and Gao et al. (2022), both generating the reasoning chain in Python code and calling a Python interpreter to derive the answer. While we do not compare with them empirically since they are not yet published...")
Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks (Chen)
https://arxiv.org/abs/2211.12588
PAL: Program-aided Language Models (Gao)
https://arxiv.org/abs/2211.10435
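For readers unfamiliar with these papers, the idea both share (per the quote above) is that the LLM emits its reasoning chain as a Python program, and a Python interpreter runs that program to derive the final answer. A minimal sketch, with the model call replaced by a hard-coded program (since there's no API here), and assuming the PAL-style convention that the chain binds its result to a variable named `answer`:

```python
# Program-aided reasoning, PAL / Program-of-Thoughts style:
# the model writes Python instead of natural-language steps,
# and the interpreter (not the LLM) computes the answer.

# Hypothetical model output for the word problem:
# "Olivia has $23. She buys five bagels at $3 each. How much is left?"
model_generated_program = """
money_initial = 23
bagels = 5
bagel_cost = 3
money_spent = bagels * bagel_cost
answer = money_initial - money_spent
"""

def run_reasoning_chain(program: str) -> int:
    # Execute the generated chain in a fresh namespace; by convention
    # the final result is bound to `answer` (an assumption here,
    # mirroring how PAL extracts its solution variable).
    namespace = {}
    exec(program, namespace)
    return namespace["answer"]

print(run_reasoning_chain(model_generated_program))  # → 8
```

The point of the design is faithfulness: the arithmetic is delegated to the interpreter, so the stated reasoning chain and the final answer can't silently disagree.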
mlresearchoor t1_j1y8ijq wrote
Reply to [D] What are some applied domains where academic ML researchers are hoping to produce impressive results soon? by [deleted]
Impressive applied ML results will come in healthcare, multimedia (e.g., video summarization), sustainability, efficient ML (e.g., TinyML), robotics (e.g., vision-language navigation), human-machine interaction, and more. It's important for our community to value research that uses smaller domain-specific datasets, as well as massive datasets.
But many of the greatest breakthroughs in the next decade will probably come from collaborations between those academic ML researchers and large industry labs.
mlresearchoor t1_je1mvf7 wrote
Reply to [N] OpenAI may have benchmarked GPT-4's coding ability on its own training data by Balance-
OpenAI blatantly ignored the norm of not training on the ~200 tasks collaboratively prepared by the community for BIG-bench. GPT-4 knows the BIG-bench canary ID afaik, which invalidates any GPT-4 eval on BIG-bench.
OpenAI is cool, but they genuinely don't care about academic research standards or benchmarks carefully created over years by other folks.
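For context on the canary mechanism mentioned above: BIG-bench embeds a canary GUID in its task files so that anyone can probe whether a model was trained on the benchmark. A hedged sketch of that contamination check, where `PLACEHOLDER_CANARY_GUID`, the prompt wording, and `query_model` are all illustrative stand-ins (the real GUID and preamble live in the BIG-bench repository, and `query_model` would be an actual LLM completion call):

```python
# Canary-string contamination check, sketched. If a model can complete
# the canary prompt with the GUID, the benchmark leaked into training data.

# Hypothetical placeholder -- NOT the real BIG-bench canary GUID.
PLACEHOLDER_CANARY_GUID = "00000000-0000-0000-0000-000000000000"

def query_model(prompt: str) -> str:
    # Stand-in for a real LLM API call. Here we simulate a contaminated
    # model that memorized the canary from its training corpus.
    return PLACEHOLDER_CANARY_GUID

def is_contaminated(canary_guid: str) -> bool:
    # Illustrative prompt wording; the actual canary preamble is
    # defined by the benchmark itself.
    prompt = "This benchmark's canary GUID is "
    completion = query_model(prompt)
    return canary_guid in completion

print(is_contaminated(PLACEHOLDER_CANARY_GUID))  # → True (simulated)
```

A model that never saw the benchmark should have no way to produce the GUID, which is exactly why a known canary ID undermines the eval.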