Submitted by DisWastingMyTime t3_yfmtd3 in MachineLearning

Hey all, I'm just wondering how other teams approach these tasks. I have a feeling I'm a bit behind in my process when it comes to debugging/improving my models, so what I'm wondering is whether you leverage these kinds of tools (the most recent examples I know of are CartoonX and Pixel RDE) to debug/improve your models. Outside sources, papers, or specific examples from your experience would be great!

Thank you.

106

Comments


jellyfishwhisperer t1_iu4twl9 wrote

Great list. To add, in the CV space you should be very careful with many "xai" methods. Usually they're just fancy edge detectors. Been Kim is pretty good on this stuff.

https://arxiv.org/abs/1810.03292

18

DigThatData t1_iue7pne wrote

Very thought-provoking stuff! I wonder if an alternative interpretation of these observations might be something along the lines of deep image prior, i.e. maybe randomly initialized deep architectures are capable of performing edge detection just by virtue of how the gradient responds to the stacked operators?

1

jellyfishwhisperer t1_iuisl72 wrote

That's about right. Convolutional priors in particular lend themselves to edge detection. CV XAI is weird in general though, so I've stepped back a bit. Is a good explanation one that looks good, one that is faithful to the model, or something else? Everyone disagrees. So I've moved to inputs with interpretable features (text, tables, science, etc.).

2

DisWastingMyTime OP t1_iu52gcw wrote

Thank you for the well-thought-out response, will look into those (or my team will ;) )

2

Borky_ t1_iu659bd wrote

Damn, is there anything for us poor tf/keras users in there? :(

2

DigThatData t1_iu6zlbo wrote

I have tunnel vision on the pytorch ecosystem (with the occasional jax cameo)

2

Borky_ t1_iu8d8o2 wrote

Yeah, seems like you guys are getting all the fun toys recently. Either way, I'll save this post for when I'm eventually forced to switch!

1

DigThatData t1_iue3h89 wrote

I think "recently" started about two years after pytorch was released.

1

PeedLearning t1_iu4r6gf wrote

No, never used them

19

DisWastingMyTime OP t1_iu50s57 wrote

Do you use anything else instead, or 'just' metrics and inference results/distribution?

7

Imnimo t1_iu50biy wrote

I don't use that sort of thing as part of a normal process, but I did run into a situation where I had an image dataset with small objects on potentially distracting backgrounds. Regular old CAM helped me check whether my misclassifications were finding the right object and just not understanding what it was, or missing the object altogether (it was mostly the former).
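
For reference, a minimal sketch of vanilla CAM on a ResNet-style classifier, the kind of check described above. The model and the input tensor here are placeholders, not the commenter's actual setup:

```python
import torch
import torch.nn.functional as F
import torchvision

model = torchvision.models.resnet18(weights="IMAGENET1K_V1").eval()

# Grab the last conv block's feature maps with a forward hook.
features = {}
model.layer4.register_forward_hook(
    lambda module, inp, out: features.update(last_conv=out)  # (1, C, H, W)
)

x = torch.randn(1, 3, 224, 224)  # placeholder for a preprocessed image
with torch.no_grad():
    logits = model(x)
pred = logits.argmax(dim=1).item()

# CAM: weight the last conv feature maps by the FC weights of the predicted class.
weights = model.fc.weight[pred]  # (C,)
cam = (weights[:, None, None] * features["last_conv"][0]).sum(dim=0)
cam = F.relu(cam)
cam = F.interpolate(cam[None, None], size=x.shape[-2:], mode="bilinear")[0, 0]
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # heatmap in [0, 1]
```

Overlaying this heatmap on the image shows whether the model was at least attending to the object when it misclassified.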

5

DisWastingMyTime OP t1_iu54512 wrote

Yeah, that's the main usage I imagined. What would be the course of action for those cases? More adversarial examples/cutouts of the object, leaving the background?

1

anish9208 t1_iu4rwr2 wrote

My day-to-day tasks are more classical ML than DL, but we do take offline DL model scores as features. For classical boosted models, we normally use a tool built on SHAP decision plots, where for each sample you can visualise how much each feature value contributes (in the positive or negative direction) to the final output.

For DL/computer-vision-specific tasks it wouldn't be practical, since each pixel is a feature; however, the shap package provides a way to generate heatmaps for image classification, as far as I can recall.

That being said, I'm also keen to know how other DL/ML practitioners do their model debugging, and especially if someone has done it in the NLP domain, I would really like to hear about their experience.
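
A rough sketch of that decision-plot workflow, assuming an XGBoost classifier on a tabular dataset (the dataset and model below are illustrative, not the commenter's actual pipeline):

```python
import shap
import xgboost as xgb
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = xgb.XGBClassifier(n_estimators=200, max_depth=4).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Decision plot: per-sample feature contributions (positive or negative)
# accumulating toward the model's final output.
shap.decision_plot(explainer.expected_value, shap_values[:20], X.iloc[:20])
```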

2

schwagggg t1_iu4slj7 wrote

no

mostly just common sense, TensorBoard for gradient history is good enough
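
A minimal sketch of logging gradient histograms to TensorBoard during training (the tiny model and random data below are placeholders):

```python
import torch
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="runs/grad_debug")
model = torch.nn.Sequential(torch.nn.Linear(10, 32), torch.nn.ReLU(), torch.nn.Linear(32, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(100):
    x, y = torch.randn(64, 10), torch.randint(0, 2, (64,))
    loss = torch.nn.functional.cross_entropy(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    # One histogram per parameter tensor's gradients at each step.
    for name, param in model.named_parameters():
        if param.grad is not None:
            writer.add_histogram(f"grad/{name}", param.grad, global_step=step)
    optimizer.step()

writer.close()
```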

1

avialex t1_iu6ajry wrote

I use FullGrad religiously, although I've removed the multiplication by the original image so that I'm just seeing the model gradients. I don't really use it to debug; it's more useful as a post-facto indication of what the important features in the data were. Every once in a while I'll see a model is overly focused on corners or something else obviously wrong, and that can be an indication of too much instability, but aside from that it's more of an explanatory tool than a debugging tool.
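
Not FullGrad itself (which also aggregates per-layer bias gradients); as a simpler stand-in for the "gradients without the input multiplication" idea, here is a plain input-gradient saliency sketch, with the model and input as placeholders:

```python
import torch
import torchvision

model = torchvision.models.resnet18(weights="IMAGENET1K_V1").eval()
x = torch.randn(1, 3, 224, 224, requires_grad=True)  # placeholder input

logits = model(x)
logits[0, logits.argmax(dim=1).item()].backward()  # gradient of top logit w.r.t. input

# Gradient-only saliency: no multiplication by the input image.
saliency = x.grad.abs().max(dim=1).values[0]  # (H, W)
saliency = (saliency - saliency.min()) / (saliency.max() - saliency.min() + 1e-8)
```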

1

__mantissa__ t1_iu806a9 wrote

I work on studying the viability of deep learning in a specific scientific field in which it is quite important to assess why the model reaches a certain prediction. That's why we started using these tools, and have even developed new metrics based on them.

1

DigThatData t1_iu9ppwr wrote

have you played with any techniques from causal inference, like counterfactual explanations?

2

__mantissa__ t1_iuh7h76 wrote

> counterfactual explanations

I have not yet explored any causal inference techniques, but it is a nice path to consider for future research directions. Could you recommend any book/survey to read as an introduction to this field?

1