Submitted by orangelord234 t3_11129cq in MachineLearning

For a dataset, the top result reports an accuracy ~10% higher than the second-best paper. But this "SOTA" paper uses methods that just don't seem practical for applications at all: for example, an ensemble of 6 different SOTA models, plus training on external data. Of course it performs well, but it's a bit ridiculous because it adds almost nothing of value besides "we combined all the best models and got a better score!".

If I have a novel method that, applied to the second-best paper, improves it by ~5% with the same or better compute efficiency but is still worse than the SOTA method, is it still good research worth submitting to conferences? It's also 40% above the baseline model.

I would think so, because it's a decent improvement (with an interesting motivation + method) over prior work while keeping the model reasonable. Would reviewers agree, or would they just see that it isn't better than SOTA and reject on that basis alone?

26

Comments

Pyramid_Jumper t1_j8cuxdp wrote

Yes, of course. If the research is novel and you believe that the methods are interesting and/or of value then you should definitely seek publication. The goal of research is not to develop SoTA models, but to expand our knowledge in a particular area.

Yes, developing a SoTA method is a great way of getting published, but laying the groundwork for other methods and exploring ideas are all crucial parts of ML research too.

42

GFrings t1_j8dixv3 wrote

In general, absolutely yes. In practice, the review process for most tier 1 and 2 conferences right now is a complete roll of the dice. For example, WACV and some other conferences explicitly state in their reviewer guidelines that reviewers should weigh the novelty of the approach over raw performance, yet I still see many reviews that ding the work for lack of SOTA-ness. The best thing you can do is make your work as academically rigorous as possible (good baseline experiments, ablation studies, analysis, ...) and submit until you get in. Don't worry about what you can't control, namely being randomly assigned a dud reviewer.

12

Affectionate_Leg_686 t1_j8ebfju wrote

I second this, adding that "reviewer roulette" is now the norm in other research communities too. Some conferences are making an effort to improve the reviewing process, e.g., ICML has meta-reviewers and an open back-and-forth discussion between authors and reviewers. Still, that has not solved the problem.

Regarding your work: if possible, define a metric that captures accuracy vs. cost (memory and compute), show how it varies across established models, and then use that as part of your case for why your model is much more "efficient" than the alternative of running X models in parallel.

In my experience, a proxy metric for cost, something like operation counts or bits transferred, goes over better with the ML crowd. Of course, if you can also measure wall-clock time on existing hardware, say a GPU or CPU, that would be best.
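For what it's worth, here is a minimal sketch of the kind of comparison I mean, assuming you tabulate accuracy alongside a compute proxy (e.g., multiply-accumulate counts) for each model; every name and number below is a made-up placeholder, not a real result:

```python
# Rough sketch of an accuracy-vs-cost comparison table.
# All model names and numbers are hypothetical placeholders.

from dataclasses import dataclass

@dataclass
class Result:
    name: str
    accuracy: float  # accuracy on the benchmark, in percent
    gmacs: float     # compute proxy: giga multiply-accumulates per forward pass

def acc_per_gmac(r: Result) -> float:
    """One possible proxy metric: accuracy per unit of compute."""
    return r.accuracy / r.gmacs

results = [
    Result("baseline", 50.0, 4.0),
    Result("our method on second-best", 90.0, 4.5),
    Result("6-model SOTA ensemble", 95.0, 27.0),
]

# Rank models by how much accuracy they deliver per GMAC.
for r in sorted(results, key=acc_per_gmac, reverse=True):
    print(f"{r.name:<28} acc={r.accuracy:5.1f}%  "
          f"GMACs={r.gmacs:5.1f}  acc/GMAC={acc_per_gmac(r):5.2f}")
```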

Good luck!

8

_d0s_ t1_j8d1psc wrote

YES! Improvement is not only about creating the best model, it's also about how you get there. For example, you could argue that your approach is much more computationally efficient.

7

FastestLearner t1_j8esc0c wrote

Neural networks were not SOTA for a very very long time. The world would be very different if everyone had only published SOTA results improving upon existing SOTAs of the 90s.

4

BedroomScientist92 t1_j8fdlbs wrote

That is very true. Connectionists were told to go home and stop pursuing that avenue. Great example!

3

tdgros t1_j8d490l wrote

Yes. You can find several NLP papers, for instance, whose ideas make models competitive with much larger ones.

3

TMills t1_j8eh61k wrote

It doesn't need to be sota in an absolute sense, but it should be interesting in an empirical way. If the model is small, it needs to benchmark against other small models. If it's efficient it should compare against other efficient models. If you just like it aesthetically, or think it's clever, then you need to think about what that cleverness buys you and evaluate it in that dimension.

3

BrotherAmazing t1_j8jd23p wrote

Yes.

“SoTA” is also often ill-defined and while important, can sometimes be a bit overhyped IMO.

Most practitioners and engineers want something that is as good as it can be or is above some threshold in accuracy, given constraints that can often be severe. If a “SoTA” approach cannot meet these real-world constraints, I would argue it’s not “SoTA” for that particular problem of interest.

If you have something that performs very well under such real-world constraints and can demonstrate value to the practitioner, it should be considered for publication by the editors.

3

farmingvillein t1_j8ftdg9 wrote

Some helpful gut checks:

  1. Do you have reason to believe that your method will scale (with parameters and data)? Maybe (probably) you can't actually test things at Google scale, but if you have good theoretical reasons to believe that your method would be accretive at scale, that is a major plus.

Yes, getting things to run really well at small scale can be of (sometimes extreme!) value, but on its own it will simply draw less interest from reviewers. There have been a bazillion hacky ML methods that turn out to be entirely irrelevant once you scale up substantially, and people are wary of such papers/discussions.

If you've got to go down this path, then make sure to position it explicitly as hyper-optimizing small-scale models (e.g., for mobile).

  2. Do you have good reasons to believe that the "top" paper plus your method would further boost SOTA? Even better, can you test it to confirm?

If your method is, at its theoretical core, simply a twist on a subset of the methods that SOTA paper uses, then you're going to see much less interest in the paper, unless you can promise significant improvements in simplicity/efficiency.

> But this "SOTA" paper uses some methods that just don't seem practical for applications at all.

  3. Can you demonstrate the superiority of your method on some of these other applications, so that you can, e.g., claim SOTA on some subset? That can be helpful.
2