I_like_sources OP t1_j9565gb wrote

Good questions. Machine learning models are usually black boxes: they either work as expected or they don't.

There is no fine-grained tweaking possible, only retraining. And the specification for what makes good training data is vague at best.

That causes unnecessary frustration, wastes time, and amounts to the blind leading the blind.

The prevailing attitude is "just offer more and more data and hope the AI will figure things out; if not, offer even more". I am sure I am not the only one who sees the fault in this approach.

−32

A1-Delta t1_j9572gt wrote

Tweaking a CNN without retraining makes it sound like you want a no-code option for your machine learning.

Totally agree that model interpretability is a challenge, but there is a whole subsection of our field working on that. The fundamental design of deep learning sort of precludes what you're talking about, at least given our current understanding of model interpretation. At best, a model may be trained to expose options for certain aspects of its output based on its input (we see this all the time), but that doesn't sound like what you want. It sounds like you want to be able to target specific and arbitrary components of an output and intuitively modify the weights of all nodes contributing to that part of the output, presumably in isolation.
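
To make the "in isolation" part concrete, here's a rough PyTorch sketch (toy model, made-up shapes, nothing from your post) of what "finding the weights that contribute to one output component" actually looks like:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in network; the argument is the same for a real CNN.
model = nn.Sequential(
    nn.Linear(8, 16), nn.ReLU(),
    nn.Linear(16, 4),
)

x = torch.randn(1, 8)
out = model(x)  # shape (1, 4)

# Gradient of ONE output component w.r.t. every parameter. Non-zero
# entries mark the weights that influenced that component for this input.
out[0, 0].backward()

for name, p in model.named_parameters():
    frac = (p.grad != 0).float().mean().item()
    print(f"{name}: {frac:.0%} of weights touched output component 0")
```

Run it and you'll see the final layer separates cleanly by output, but everything upstream is shared across all outputs, so "intuitively" editing the responsible weights in an earlier layer drags every other output along with it.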

I think your challenge might lie in a fundamental misunderstanding of how these models actually work. I don't mean that as a dig; they're complicated. I just want to help bring you to a place of understanding about why the field is the way you're experiencing it.

Not a huge fan of massive edits to original posts after people have started responding. Your newly added recommendations put an onerous responsibility on any open source authors who might make their work public as a hobby rather than a career.

16

I_like_sources OP t1_j957v1f wrote

What are your contributions to enabling user customizability of results without retraining?


>Not a huge fan of massive edits to original posts after people have started responding.

I am not here to make you happy.

−35

Ferocious_Armadillo t1_j98sbqm wrote

I think I’m gonna have to respectfully disagree on a lot of this. You’re right that it largely comes down to the training data used. What jumps out at me in the examples you give and in your point (1), though, is that while you do need a large amount of data, especially for networks as large as the ones you suggest, you also want to avoid overfitting your model to that data in the pursuit of accuracy, reliability, or whatever metric you choose to measure how “good” your model is against some ground truth.
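
To illustrate (a toy sketch with made-up data and model, not anyone's actual setup): the standard guard against exactly that is holding out a validation set and stopping once validation loss stops improving:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Fabricated regression data, split into train and validation sets.
X, y = torch.randn(1000, 20), torch.randn(1000, 1)
X_tr, y_tr, X_val, y_val = X[:800], y[:800], X[800:], y[800:]

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

best_val, patience, bad_epochs = float("inf"), 10, 0
for epoch in range(500):
    model.train()
    opt.zero_grad()
    loss_fn(model(X_tr), y_tr).backward()
    opt.step()

    model.eval()
    with torch.no_grad():
        val = loss_fn(model(X_val), y_val).item()

    if val < best_val:
        best_val, bad_epochs = val, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:  # memorizing, not generalizing: stop
            break
```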

And while on the surface NNs can definitely seem like “black boxes” whose structure and workings we can’t accurately describe, that’s largely untrue. In fact, I would claim that it’s precisely because we can design and model NN structure (in terms of number of layers, the connectedness between them, inputs, weights, biases, activation functions, etc.) and pick the structure that lends itself best to a given purpose that the field has come as far as it has, and that the NNs in the examples you provide exist in the first place.
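
As a concrete (toy) illustration of how explicit those design decisions are, here's a minimal CNN in PyTorch where every structural choice just listed is spelled out and inspectable:

```python
import torch.nn as nn

class SmallCNN(nn.Module):
    """Every structural decision is an explicit, readable choice."""

    def __init__(self, n_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # chosen layer width
            nn.ReLU(),                                   # chosen activation
            nn.MaxPool2d(2),                             # chosen downsampling
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Weights and biases live here, initialized and fully inspectable.
        self.classifier = nn.Linear(32 * 8 * 8, n_classes)  # assumes 32x32 input

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))
```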

Sorry about the rant… I didn’t realize I get so passionate about NNs.

1