
trnka t1_j1heqwa wrote

In actual product work, it's rarely sufficient to look at a single metric. If I'm doing classification, I typically check accuracy, balanced accuracy, and the confusion matrix to assess model quality, among other things. Factors like interpretability/explainability, RAM, and latency also play into whether I can actually use a given model, and those depend on the use case as well.
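As a rough sketch of that metric check, here are those three classification metrics implemented from scratch (the toy labels are mine, just to show why balanced accuracy matters on imbalanced data):

```python
from collections import defaultdict

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so rare classes count equally."""
    per_class = defaultdict(lambda: [0, 0])  # label -> [correct, total]
    for t, p in zip(y_true, y_pred):
        per_class[t][0] += (t == p)
        per_class[t][1] += 1
    return sum(c / n for c, n in per_class.values()) / len(per_class)

def confusion_matrix(y_true, y_pred):
    """Counts of (true label, predicted label) pairs."""
    counts = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        counts[(t, p)] += 1
    return dict(counts)

# Toy imbalanced example: accuracy looks OK, balanced accuracy less so.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
print(accuracy(y_true, y_pred))           # 0.666...
print(balanced_accuracy(y_true, y_pred))  # 0.625
print(confusion_matrix(y_true, y_pred))
```

In practice you'd reach for `sklearn.metrics` rather than hand-rolling these, but the point is that each metric answers a different question about the same predictions.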

I would never feel personally comfortable deploying a model without reviewing a sample of its typical errors. But many people deploy models without that and rely on metrics alone. In that case it's even more important to get your top-level metric right, or to get product-level metrics right and watch for trends in, say, user churn.
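That error-review step can be as simple as pulling a random handful of misclassified examples to read before shipping (a minimal sketch; the function name and toy data are my own, not from the comment):

```python
import random

def sample_errors(y_true, y_pred, examples, k=5, seed=0):
    """Draw a random sample of misclassified examples for manual review."""
    errors = [(t, p, x) for t, p, x in zip(y_true, y_pred, examples) if t != p]
    return random.Random(seed).sample(errors, min(k, len(errors)))

# Illustrative use: read what the model got wrong before deploying.
y_true = ["spam", "ham", "spam", "ham"]
y_pred = ["spam", "spam", "ham", "ham"]
texts = ["buy now", "lunch at noon?", "free prize!!", "see you tomorrow"]
for true, pred, text in sample_errors(y_true, y_pred, texts):
    print(f"true={true} pred={pred} text={text!r}")
```

Fixing the seed keeps the review sample reproducible across runs, which helps when you're comparing error patterns between model versions.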

> Do you see qualitative improvements of models as more or less important in comparison to quantitative?

I generally view quantitative metrics as more important though I think I value qualitative feedback much more than others in the industry. For the example of bias, I'd say that if it's valued by your employer there should be a metric for it. Not that I like having metrics for everything, but having a metric will force you to be specific about what it means.

I'll also acknowledge that there are many qualitative perspectives on quality that don't have metrics *yet*.

> do you ever read papers that just improved SOTA without introducing significant novel ideas?

In my opinion, yes. If your question was why I read them, it's because I don't know whether they contribute useful, new ideas until after I've read the paper.

Hope this helps - I'm not certain I understood all of the question, but let me know if I missed anything.
