DreamyPen OP t1_ivty1ow wrote
Reply to comment by Erosis in [Discussion] Can we train with multiple sources of data, some very reliable, others less so? by DreamyPen
Unfortunately not, I'm predicting material properties on a continuous scale.
Erosis t1_ivtyokc wrote
You could use a custom training loop where you down-weight the gradients of the unreliable samples before you do parameter updates.
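A minimal sketch of that idea, assuming a simple linear model trained by gradient descent with NumPy (the 0.3 reliability weight and the synthetic data are illustrative assumptions, not from the thread):

```python
import numpy as np

# Down-weight the gradient contribution of unreliable samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=200)

# Per-sample reliability: 1.0 for trusted rows, 0.3 for the noisy source.
reliability = np.where(np.arange(200) < 150, 1.0, 0.3)

w = np.zeros(3)
lr = 0.1
for _ in range(500):
    residual = X @ w - y  # prediction error per sample
    # Each sample's gradient is scaled by its reliability before averaging.
    grad = X.T @ (reliability * residual) / reliability.sum()
    w -= lr * grad
```

This is equivalent to minimizing a reliability-weighted squared loss, so unreliable rows still inform the fit but pull the parameters less.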
DreamyPen OP t1_ivu0v9o wrote
Thank you for your comment. I am not sure what that custom loop would look like for an ensemble method (trees/gradient boosted), or how to proceed with the down-weighting. Is it a documented technique I can read more about, or more of a workaround you are thinking of?
Erosis t1_ivu2gnv wrote
Trees complicate it a bit more. I've never done it for something like that, but check the instance-weight input to xgboost as an example: the xgboost `fit` function has a `sample_weight` argument.
I know that TensorFlow has a new-ish library for trees (TensorFlow Decision Forests). You could potentially write a manual gradient descent loop there with modified minibatch gradients.