shumpitostick t1_ivfwf6z wrote

That gets us into the realm of causal inference. This is not really what the author was talking about, but yes, it's a field that has a bunch of additional challenges. In this case, more data points might not help, but collecting data about additional variables might. In any case, getting more data will pretty much never cause your model to be worse.
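
To make the "more data points might not help, but more variables might" point concrete, here is a rough simulation sketch. Everything in it is an assumption for illustration: the data-generating process, the coefficients (a true effect of 2.0 for `x` on `y`, and a confounder `z` that drives both), and the helper names `ols_slopes` and `simulate`. Under these assumptions the regression of `y` on `x` alone converges to about 3.5 rather than 2.0, no matter how many rows you collect, while adding `z` recovers the true effect.

```python
import numpy as np

def ols_slopes(y, *columns):
    """Least-squares slopes of y on the given columns (intercept fitted, then dropped)."""
    design = np.column_stack([np.ones_like(y), *columns])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:]

def simulate(n, rng):
    """Hypothetical data: z confounds the x -> y relationship."""
    z = rng.normal(size=n)                       # confounder
    x = z + rng.normal(size=n)                   # x depends on z
    y = 2.0 * x + 3.0 * z + rng.normal(size=n)   # true effect of x on y is 2.0
    return x, y, z

rng = np.random.default_rng(0)
for n in (1_000, 100_000):
    x, y, z = simulate(n, rng)
    naive = ols_slopes(y, x)[0]        # z omitted: biased at any sample size
    adjusted = ols_slopes(y, x, z)[0]  # z included: close to 2.0
    print(f"n={n}: naive={naive:.2f}, adjusted={adjusted:.2f}")
```

The naive estimate gets more precise with more rows, but it gets precise around the wrong number; only measuring the extra variable fixes the bias.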

34

ajt9000 t1_ivics5n wrote

The main way it's gonna make a statistical model worse is by increasing the computational power needed to run it. That's not an argument about the quality of the model results, though. I agree the author's understanding of statistics is really bad.

7

shumpitostick t1_ivlb6zi wrote

I was oversimplifying a bit in my comments. There is the curse of dimensionality. And in causal inference, if you just adjust for every variable as if it were a confounder, your model can also get worse, because you end up blocking forward (causal) paths, for example by controlling for a mediator. But if you know what you're doing, it shouldn't be a problem. And I haven't met any ML practitioner or statistician who doesn't realize the importance of understanding your data and making proper modelling decisions.
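
A small sketch of the "blocking forward paths" problem, under made-up assumptions: `x` affects `y` only through a mediator `m`, with a total effect of 1.5 * 2.0 = 3.0. The coefficients, the sample size, and the helper name `ols_slopes` are all hypothetical. Regressing `y` on `x` alone recovers roughly the total effect, while also "controlling" for `m` pushes the coefficient on `x` toward zero, because the only causal path has been blocked.

```python
import numpy as np

def ols_slopes(y, *columns):
    """Least-squares slopes of y on the given columns (intercept fitted, then dropped)."""
    design = np.column_stack([np.ones_like(y), *columns])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:]

rng = np.random.default_rng(0)
n = 50_000

# Hypothetical causal chain: x -> m -> y, with no direct x -> y edge.
x = rng.normal(size=n)
m = 1.5 * x + rng.normal(size=n)   # mediator
y = 2.0 * m + rng.normal(size=n)   # total effect of x on y is 1.5 * 2.0 = 3.0

total_effect = ols_slopes(y, x)[0]      # leaves the causal path open
over_adjusted = ols_slopes(y, x, m)[0]  # conditioning on m blocks the path

print(f"y ~ x:     {total_effect:.2f}")
print(f"y ~ x + m: {over_adjusted:.2f}")
```

So the extra variable is not harmful in itself; what hurts is treating it as a confounder when the causal graph says it is a mediator.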

1