Submitted by DisWastingMyTime t3_yfmtd3 in MachineLearning

Hey all, I'm just wondering how other teams approach these tasks. I have a feeling I'm a bit behind in my process when it comes to debugging/improving my models, so what I'm wondering is whether you leverage these kinds of tools (the most recent examples I know of are CartoonX and Pixel RDE) to debug/improve your models. Outside sources, papers, or specific examples from your experience would be great!

Thank you.

106

Comments


jellyfishwhisperer t1_iu4twl9 wrote

Great list. To add, in the CV space you should be very careful with many "xai" methods. Usually they're just fancy edge detectors. Been Kim is pretty good on this stuff.

https://arxiv.org/abs/1810.03292

18

DigThatData t1_iue7pne wrote

Very thought-provoking stuff! I wonder if an alternative interpretation of these observations might be something along the lines of deep image prior, i.e. maybe randomly initialized deep architectures are capable of performing edge detection just by virtue of how the gradient responds to the stacked operators?

1

jellyfishwhisperer t1_iuisl72 wrote

That's about right. Convolutional priors in particular lend themselves to edge detection. CV XAI is weird in general though, so I've stepped back a bit. Is a good explanation one that looks good, one that is faithful to the model, or something else? Everyone disagrees. So I've moved to inputs with interpretable features (text, tables, science, etc.).

2

DisWastingMyTime OP t1_iu52gcw wrote

Thank you for the well-thought-out response, will look into those (or my team will ;) )

2

Borky_ t1_iu659bd wrote

Damn, is there anything for us poor tf/keras users in there? :(

2

DigThatData t1_iu6zlbo wrote

I have tunnel vision on the pytorch ecosystem (with the occasional jax cameo)

2

Borky_ t1_iu8d8o2 wrote

Yeah, seems like you guys are getting all the fun toys recently. Either way, I'll save this post for when I'm eventually forced to switch!

1

DigThatData t1_iue3h89 wrote

I think "recently" started about two years after pytorch was released.

1

PeedLearning t1_iu4r6gf wrote

No, never used them

19

DisWastingMyTime OP t1_iu50s57 wrote

Do you use anything else instead, or 'just' metrics and inference results/distribution?

7

Imnimo t1_iu50biy wrote

I don't use that sort of thing as part of a normal process, but I did run into a situation where I had an image dataset with small objects on potentially distracting backgrounds. Regular old CAM helped me check whether my misclassifications were finding the right object and just not understanding what it was, or missing the object altogether (it was mostly the former).
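
For reference, a minimal sketch of vanilla CAM on a ResNet-style classifier, the kind of check described above. The model and the input tensor here are placeholders, not the commenter's actual setup:

```python
import torch
import torch.nn.functional as F
import torchvision

model = torchvision.models.resnet18(weights="IMAGENET1K_V1").eval()

# Grab the last conv block's feature maps with a forward hook.
features = {}
model.layer4.register_forward_hook(
    lambda module, inp, out: features.update(last_conv=out)  # (1, C, H, W)
)

x = torch.randn(1, 3, 224, 224)  # placeholder for a preprocessed image
with torch.no_grad():
    logits = model(x)
pred = logits.argmax(dim=1).item()

# CAM: weight the last conv feature maps by the FC weights of the predicted class.
weights = model.fc.weight[pred]  # (C,)
cam = (weights[:, None, None] * features["last_conv"][0]).sum(dim=0)
cam = F.relu(cam)
cam = F.interpolate(cam[None, None], size=x.shape[-2:], mode="bilinear")[0, 0]
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # heatmap in [0, 1]
```

Overlaying this heatmap on the image shows whether the model was at least attending to the object when it misclassified.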

5

DisWastingMyTime OP t1_iu54512 wrote

Yeah, that's the main usage I imagined. What would be the course of action for those cases? More adversarial examples/cutouts of the object, leaving the background?

1

anish9208 t1_iu4rwr2 wrote

My day-to-day tasks are more classical ML than DL, but we do take offline DL model scores as features. For classical boosted models, we normally use a tool built on SHAP decision plots, where for each sample you can visualise how much each feature value contributes (in the positive or negative direction) to the final output.

For DL/computer-vision-specific tasks it wouldn't be practical, since each pixel is a feature; however, the shap package provides a way to generate heatmaps for image classification, as far as I can recall.

That being said, I'm also keen to know how other DL/ML practitioners do their model debugging, and especially if someone has done it in the NLP domain, I would really like to hear about their experience.
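
A rough sketch of that decision-plot workflow, assuming an XGBoost classifier on a tabular dataset (the dataset and model below are illustrative, not the commenter's actual pipeline):

```python
import shap
import xgboost as xgb
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = xgb.XGBClassifier(n_estimators=200, max_depth=4).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Decision plot: per-sample feature contributions (positive or negative)
# accumulating toward the model's final output.
shap.decision_plot(explainer.expected_value, shap_values[:20], X.iloc[:20])
```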

2

schwagggg t1_iu4slj7 wrote

no

mostly just common sense, TensorBoard for gradient history is good enough
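
A minimal sketch of logging gradient histograms to TensorBoard during training (the tiny model and random data below are placeholders):

```python
import torch
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="runs/grad_debug")
model = torch.nn.Sequential(torch.nn.Linear(10, 32), torch.nn.ReLU(), torch.nn.Linear(32, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(100):
    x, y = torch.randn(64, 10), torch.randint(0, 2, (64,))
    loss = torch.nn.functional.cross_entropy(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    # One histogram per parameter tensor's gradients at each step.
    for name, param in model.named_parameters():
        if param.grad is not None:
            writer.add_histogram(f"grad/{name}", param.grad, global_step=step)
    optimizer.step()

writer.close()
```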

1

avialex t1_iu6ajry wrote

I use FullGrad religiously, although I've removed the multiplication by the original image so that I'm just seeing the model gradients. I don't really use it to debug; it's more useful as a post-facto indication of what the important features in the data were. Every once in a while I'll see a model is overly focused on corners or something else obviously wrong, and that can be an indication of too much instability, but aside from that it's more of an explanatory tool than a debugging tool.
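
Not FullGrad itself (which also aggregates per-layer bias gradients); as a simpler stand-in for the "gradients without the input multiplication" idea, here is a plain input-gradient saliency sketch, with the model and input as placeholders:

```python
import torch
import torchvision

model = torchvision.models.resnet18(weights="IMAGENET1K_V1").eval()
x = torch.randn(1, 3, 224, 224, requires_grad=True)  # placeholder input

logits = model(x)
logits[0, logits.argmax(dim=1).item()].backward()  # gradient of top logit w.r.t. input

# Gradient-only saliency: no multiplication by the input image.
saliency = x.grad.abs().max(dim=1).values[0]  # (H, W)
saliency = (saliency - saliency.min()) / (saliency.max() - saliency.min() + 1e-8)
```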

1

__mantissa__ t1_iu806a9 wrote

I work on studying the viability of deep learning in a specific scientific field in which it is quite important to assess why the model reaches a certain prediction. That's why we started using these tools, and have even developed new metrics based on them.

1

DigThatData t1_iu9ppwr wrote

have you played with any techniques from causal inference, like counterfactual explanations?

2

__mantissa__ t1_iuh7h76 wrote

> counterfactual explanations

I have not yet explored any causal inference techniques, but it is a nice path to consider for future research directions. Could you recommend any book/survey to read as an introduction to this field?

1