
groman434 OP t1_j2xb0c8 wrote

My question was slightly different. My understanding is that one of the major factors affecting the quality of a model's predictions is its training set. But since the training set could be inaccurate (in other words, made by humans), how can this fact impact the quality of learning and, in turn, the quality of the predictions?

Of course, as u/IntelArtiGen wrote, models can avoid reproducing errors made by humans (I guess because they are able to learn the relevant features during training when the training set is good enough). But I wonder what this "good enough" means exactly (in other words, how the errors humans inevitably make when preparing the training set impact the learning process, and what kinds of errors are acceptable) and how the entire training process can be described mathematically. I have seen many explanations using gradient descent as an example, but none of them incorporated the fact that the training set (or loss function) is imperfect.
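To make the question more concrete, here is a minimal sketch of what I mean (purely illustrative, all numbers made up): gradient descent where the loss is computed against imperfect human labels rather than the unknown ground truth.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend the "true" relationship is y = 2x + 1, but the human-provided
# labels carry zero-mean noise plus a small systematic bias.
n = 200
x = rng.uniform(-1.0, 1.0, size=n)
y_true = 2.0 * x + 1.0
y_labels = y_true + rng.normal(0.0, 0.3, size=n) + 0.2  # noise + bias

# Plain gradient descent on mean squared error, computed against the
# imperfect labels (the only thing we actually have).
w, b, lr = 0.0, 0.0, 0.1
for _ in range(2000):
    err = w * x + b - y_labels
    w -= lr * 2.0 * np.mean(err * x)   # d(MSE)/dw
    b -= lr * 2.0 * np.mean(err)       # d(MSE)/db

print(f"learned w={w:.3f} (true 2.0), b={b:.3f} (true 1.0)")
# The zero-mean noise largely averages out, but the constant +0.2 bias
# does not: b converges near 1.2 instead of 1.0.
```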

5

Ali_M t1_j2xkhl3 wrote

Supervised learning isn't the only game in town, and human demonstrations aren't the only kind of data we can collect. For example, we can record human preferences over model outputs and then use this data to fine-tune models with reinforcement learning (e.g. https://arxiv.org/abs/2203.02155). Even though I'm not a musician, I can still make a meaningful judgement about whether one piece of music is better than another. By analogy, we can use human preferences to train models that are capable of superhuman performance.
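To sketch the idea (a toy illustration with made-up data, not the actual pipeline from the paper): the first step is fitting a reward model from pairwise preference labels, e.g. with a Bradley-Terry style loss; the RL stage then optimizes the policy against that learned reward.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy reward model r(x) = w . x, trained from pairwise human preferences.
# Each data point: features of output A, features of output B, and a label
# prefer_a = 1 if the human preferred A over B.
dim, n_pairs = 5, 1000
w_human = rng.normal(size=dim)                    # hidden "true" preference
xa = rng.normal(size=(n_pairs, dim))
xb = rng.normal(size=(n_pairs, dim))
prefer_a = (xa @ w_human > xb @ w_human).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Bradley-Terry style model: P(A preferred) = sigmoid(r(A) - r(B)).
w, lr = np.zeros(dim), 0.1
for _ in range(2000):
    p = sigmoid((xa - xb) @ w)                    # predicted preference prob
    grad = (xa - xb).T @ (p - prefer_a) / n_pairs # cross-entropy gradient
    w -= lr * grad

agreement = np.mean((xa @ w > xb @ w) == (prefer_a == 1.0))
print(f"reward model agrees with human preferences on {agreement:.1%} of pairs")
```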

8

e_for_oil-er t1_j2xz49i wrote

I guess "errors" in the dataset could be equivalent to introducing noise (like random perturbations with mean 0) or a bias (perturbation with non 0 expectation). I guess those would be the two main kind of innacuracies found in data.

Bias has been the plague of some language models that were trained on internet forum data. The training data was biased towards certain opinions, and the models just spat them out. This has caused the creators of those models to shut them down. I don't know what one could do to correct bias, since this is not at all my area of expertise.

Learning techniques that are resistant to noise (often called robust methods) are an active field of research, and some of them actually perform really well.
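As a toy illustration (not any particular method from that literature), compare an ordinary squared-error fit with a Huber-style fit when a handful of labels are grossly wrong:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression y ~ 3x with a few grossly mislabeled points.
n = 200
x = rng.uniform(-1.0, 1.0, size=n)
y = 3.0 * x + rng.normal(0.0, 0.1, size=n)
y[:10] += 20.0                                    # 5% of labels are wildly off

def fit(residual_grad, lr=0.05, steps=5000):
    """Gradient descent on a loss defined by its gradient w.r.t. the residual."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        r = w * x + b - y
        g = residual_grad(r)
        w -= lr * np.mean(g * x)
        b -= lr * np.mean(g)
    return w, b

# Squared error: gradient 2r. Huber (delta=1): gradient clipped to [-1, 1].
w_mse, b_mse = fit(lambda r: 2.0 * r)
w_hub, b_hub = fit(lambda r: np.clip(r, -1.0, 1.0))

print(f"MSE fit:   w={w_mse:.2f}, b={b_mse:.2f}  (true: w=3, b=0)")
print(f"Huber fit: w={w_hub:.2f}, b={b_hub:.2f}  (true: w=3, b=0)")
# The squared-error intercept gets dragged up by the outliers
# (roughly 0.05 * 20 = 1.0), while the robust fit barely moves.
```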

2