Erosis t1_ivtxuwn wrote

Are the outputs of your model binary? If so, you could soften the targets of your uncertain data points, setting them somewhere closer to the middle rather than at hard 0s and 1s.
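A minimal sketch of the idea, assuming numpy labels and a boolean mask flagging the unreliable rows (both made up here):

```python
import numpy as np

# Hypothetical binary labels plus a mask marking the uncertain samples.
y = np.array([1.0, 0.0, 1.0, 0.0])
uncertain = np.array([False, True, False, True])

# Pull uncertain targets halfway toward 0.5; trusted rows keep hard 0/1.
y_soft = np.where(uncertain, 0.5 * y + 0.25, y)  # 1 -> 0.75, 0 -> 0.25
```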

If you are training in batches, you could reduce the size of the gradient updates coming from the unreliable data.

7

DreamyPen OP t1_ivty1ow wrote

Unfortunately not; I'm predicting material properties on a continuous scale.

1

Erosis t1_ivtyokc wrote

You could use a custom training loop where you down-weight the gradients of the unreliable samples before you do parameter updates.
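A minimal sketch of such a loop in TensorFlow (the model, the squared-error loss, and the per-sample reliability weights w are all stand-ins for whatever you're actually using):

```python
import tensorflow as tf

# Toy regression model; replace with your own architecture.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),
])
optimizer = tf.keras.optimizers.Adam(1e-3)

@tf.function
def train_step(x, y, w):
    # w has one reliability weight per sample in the batch,
    # e.g. 1.0 for trusted rows and 0.3 for unreliable ones.
    with tf.GradientTape() as tape:
        pred = model(x, training=True)
        per_sample = tf.reduce_mean(tf.square(y - pred), axis=-1)  # (batch,)
        loss = tf.reduce_mean(w * per_sample)  # unreliable rows pull less
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```

Scaling each sample's loss before the reduction is equivalent to shrinking that sample's contribution to the gradient update.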

5

DreamyPen OP t1_ivu0v9o wrote

Thank you for your comment. I am not sure what that custom loop would look like for an ensemble method (trees/gradient boosting), or how to proceed with the down-weighting. Is it a documented technique I can read more about, or more of a workaround you are thinking of?

1

Erosis t1_ivu2gnv wrote

Trees complicate it a bit more. I've never done it for something like that, but look at the instance-weight input to xgboost as an example: the xgboost fit function takes a sample_weight argument.
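A minimal example with the sklearn-style API (the data and the 0.3 weight for the flagged rows are made up):

```python
import numpy as np
from xgboost import XGBRegressor

rng = np.random.default_rng(0)

# Hypothetical continuous-target data; the last 50 rows are the unreliable ones.
X = rng.random((150, 8))
y = rng.random(150)

weights = np.ones(len(y))
weights[100:] = 0.3  # shrink the influence of the unreliable samples

model = XGBRegressor(n_estimators=200)
model.fit(X, y, sample_weight=weights)
```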

I know that TensorFlow has a new-ish library for trees (TensorFlow Decision Forests). You could potentially write a manual gradient-descent loop there with modified minibatch gradients.

1