
VirtualHat t1_j9j2gwx wrote

Large linear models tend not to scale well to large datasets if the solution is not in the model class. Because of this lack of expressivity, linear models tend to do poorly on complex problems.

15

relevantmeemayhere t1_j9khp8m wrote

As you mentioned, this is highly dependent on the functional relationship of the data.

You'd want to use domain knowledge to determine that.

Additionally, non-linear models tend to have their own drawbacks, lack of interpretability and high variance being some of them.

2

GraciousReformer OP t1_j9j7iwm wrote

>Large linear models tend not to scale well to large datasets if the solution is not in the model class

Will you provide me a reference?

−8

VirtualHat t1_j9j8805 wrote

Linear models assume that the solution is of the form y = ax + b. If the true solution is not of this form, then the best fit within the model class is likely to be a poor one.

I think Emma Brunskill's notes explain this quite well. Essentially, the model will underfit because it is too simple. I am making an assumption, though, that a large dataset implies a more complex, non-linear solution, but this is generally the case.
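
To make that concrete, here is a minimal sketch (mine, not from the notes) with a made-up sine target and scikit-learn; the straight-line fit underfits while a more flexible model does not:

```python
# Minimal sketch: a linear model underfitting a non-linear target.
# The sine target and sample size are made up purely for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(1000, 1))
y = np.sin(2 * X[:, 0]) + 0.1 * rng.normal(size=1000)  # not of the form y = ax + b

linear = LinearRegression().fit(X, y)
nonlinear = GradientBoostingRegressor().fit(X, y)

print("linear R^2:   ", linear.score(X, y))     # low: the straight line underfits
print("nonlinear R^2:", nonlinear.score(X, y))  # much closer to 1
```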

10

relevantmeemayhere t1_j9kifhu wrote

Linear models are often preferred for the reasons you mentioned. Underfitting is almost always preferable to overfitting.

1

VirtualHat t1_j9ll5i2 wrote

Yes, that's right. For many problems, a linear model is just what you want. I guess what I'm saying is that the dividing line between when a linear model is appropriate vs when you want a more expressive model is often related to how much data you have.
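
If anyone wants to poke at that dividing line, here is a rough sketch (my own synthetic setup, not a claim about any real dataset) comparing the cross-validated fit of a linear and a more flexible model at a few sample sizes:

```python
# Rough sketch with a made-up non-linear target; the numbers it prints
# are illustrative, not a benchmark.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
for n in (50, 500, 5000):
    X = rng.uniform(-3, 3, size=(n, 1))
    y = np.sin(2 * X[:, 0]) + 0.3 * rng.normal(size=n)
    lin = cross_val_score(Ridge(), X, y, cv=5).mean()
    forest = cross_val_score(RandomForestRegressor(random_state=0), X, y, cv=5).mean()
    print(f"n={n:5d}  linear R^2={lin:.2f}  forest R^2={forest:.2f}")
```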

1

GraciousReformer OP t1_j9j8bsl wrote

Thank you. I understand the math. But I meant a real-world example where "the solution is not in the model class."

−4

VirtualHat t1_j9j8uvr wrote

For example, in the Iris dataset, the class label is not a linear combination of the inputs. Therefore, if your model class is all linear models, you won't find the optimal solution, or in this case, even a good one.

If you extend the model class to include non-linear functions, then your hypothesis space now at least contains a good solution, but finding it might be a bit more tricky.
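
If you want to see that comparison in code, here is a minimal sketch (mine; LogisticRegression and an RBF SVM are just stand-ins for a linear and a non-linear model class):

```python
# Minimal sketch: cross-validated accuracy of a linear vs a non-linear
# classifier on the Iris dataset.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

linear = LogisticRegression(max_iter=1000)  # linear decision boundaries
nonlinear = SVC(kernel="rbf")               # non-linear decision boundaries

print("linear   :", cross_val_score(linear, X, y, cv=5).mean())
print("nonlinear:", cross_val_score(nonlinear, X, y, cv=5).mean())
```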

15

GraciousReformer OP t1_j9jgdmc wrote

But DL is not a linear model. So what, then, is the limit of DL?

−13

terminal_object t1_j9jp51j wrote

You seem confused as to what you yourself are saying.

6

GraciousReformer OP t1_j9jppu7 wrote

"Artificial neural networks are often (demeneangly) called "glorified regressions". The main difference between ANNs and multiple / multivariate linear regression is of course, that the ANN models nonlinear relationships."

https://stats.stackexchange.com/questions/344658/what-is-the-essential-difference-between-a-neural-network-and-nonlinear-regressi

−3

PHEEEEELLLLLEEEEP t1_j9k691x wrote

Regression doesn't just mean linear regression, if that's what you're confused about.

4

Acrobatic-Book t1_j9k94l4 wrote

The simplest example is the XOR problem (aka exclusive or). This was also why multilayer perceptrons, the basis of deep learning, were actually created: a linear model cannot solve it.
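
A minimal sketch of that (mine, not the original construction; scikit-learn's LogisticRegression and MLPClassifier stand in for the linear model and the multilayer perceptron):

```python
# The XOR truth table: no straight line separates the two classes,
# but a network with one hidden layer can.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])  # XOR of the two inputs

linear = LogisticRegression().fit(X, y)
mlp = MLPClassifier(hidden_layer_sizes=(4,), solver="lbfgs",
                    max_iter=2000, random_state=0).fit(X, y)

print("linear accuracy:", linear.score(X, y))  # no linear boundary gets all four right
print("MLP accuracy:   ", mlp.score(X, y))     # a hidden layer can reach 1.0
```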

2

VirtualHat t1_j9lkto4 wrote

Oh wow, super weird to be downvoted just for asking for a reference. r/MachineLearning isn't what it used to be I guess, sorry about that.

1