Comments

limpbizkit4prez t1_jahaq8v wrote

The authors kept increasing model size until the model overfit the task. I'm not sure that's high impact. It's cool and everything, but overfitting a dataset is never really valuable.

4

MysteryInc152 OP t1_jahgb2n wrote

Overfitting carries the necessary connotation that the model does not generalize well to instances of the task outside the training data.

As long as what the model creates is novel and works, "overfitting" seems like an unimportant if not misleading distinction.

2

limpbizkit4prez t1_jahhmhd wrote

Lol, I strongly disagree. There are already methods out there that provide architecture design. This is a "that's neat" type of project, but I'd be really disappointed to see this anywhere other than arxiv.

3

_Arsenie_Boca_ t1_jai5zgz wrote

The final evaluation is done on test metrics, right? If so, why does it matter?

2

limpbizkit4prez t1_jai7l96 wrote

It matters because the authors keep increasing model capacity to do better on a single task, and that's it. That scaling strategy was also determined by the authors, not the LLM. It would be way cooler if they constrained the problem to roughly the same number of parameters and showed generalization across multiple tasks. Again, it's neat, just not innovative or sexy.

2

MysteryInc152 OP t1_jah5w3t wrote

>Given the recent impressive accomplishments of language models (LMs) for code generation, we explore the use of LMs as adaptive mutation and crossover operators for an evolutionary neural architecture search (NAS) algorithm. While NAS still proves too difficult a task for LMs to succeed at solely through prompting, we find that the combination of evolutionary prompt engineering with soft prompt-tuning, a method we term EvoPrompting, consistently finds diverse and high performing models. We first demonstrate that EvoPrompting is effective on the computationally efficient MNIST-1D dataset, where EvoPrompting produces convolutional architecture variants that outperform both those designed by human experts and naive few-shot prompting in terms of accuracy and model size. We then apply our method to searching for graph neural networks on the CLRS Algorithmic Reasoning Benchmark, where EvoPrompting is able to design novel architectures that outperform current state-of-the-art models on 21 out of 30 algorithmic reasoning tasks while maintaining similar model size. EvoPrompting is successful at designing accurate and efficient neural network architectures across a variety of machine learning tasks, while also being general enough for easy adaptation to other tasks beyond neural network design.
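For anyone curious what the loop roughly looks like, here's a minimal sketch of the idea as I understand it from the abstract (my own paraphrase, not the authors' code; `lm_generate`, `train_and_eval`, and all parameter names are hypothetical stand-ins, and the soft prompt-tuning step on the best children is omitted):

```python
# Sketch of an EvoPrompting-style search loop (illustrative only).
# The LM acts as a crossover/mutation operator: it sees a few parent
# model programs in its prompt and proposes a child program; children
# are trained and scored, and the fittest seed the next generation.
import random

def lm_generate(prompt: str) -> str:
    """Placeholder for a call to a code LM; returns candidate model code."""
    # Dummy stand-in so the sketch runs: echo the last parent with a note.
    return prompt.split("\n\n")[-2] + "\n# (LM-proposed variant would go here)"

def train_and_eval(code: str) -> float:
    """Placeholder: briefly train the proposed architecture and return a
    fitness score (e.g. accuracy with a model-size penalty)."""
    return random.random()  # dummy score

def evoprompt_search(seed_programs, generations=5, children_per_gen=10, num_parents=2):
    # Score the hand-written seed architectures to form generation 0.
    population = [(train_and_eval(p), p) for p in seed_programs]
    for _ in range(generations):
        children = []
        for _ in range(children_per_gen):
            # "Crossover": show a few parents and ask the LM for an improved variant.
            parents = random.sample(population, min(num_parents, len(population)))
            prompt = "\n\n".join(code for _, code in parents)
            prompt += "\n\n# Propose an improved variant of the models above:\n"
            child = lm_generate(prompt)
            children.append((train_and_eval(child), child))
        # Keep only the fittest individuals as parents for the next round.
        population = sorted(population + children, reverse=True)[:children_per_gen]
    return max(population)

best_score, best_code = evoprompt_search(["model_a = ...", "model_b = ..."])
```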

Between this and being able to generate novel functioning protein structures, I hope the "it can't truly create anything new!" argument for LLMs dies, but I'm sure we'll find more goalposts to move lol

1