Viewing a single comment thread. View all comments

PassionatePossum t1_iqv5fo9 wrote

Are we talking about training loss or validation loss? Because the training loss will almost always go down, and that by itself means very little.

3

Imaginary_Carrot4092 OP t1_iqv9s87 wrote

Ok.. I have added a few images to clarify this. Firstly, how can I make sure that I can learn something from my data, based on its distribution?

1

PassionatePossum t1_iqvklvt wrote

Sorry, I still need a little more information. From the plots you have provided I would assume that you have a regression problem with a 1D input and a 1D target, correct? Or are we talking about a time series?

For the moment I'll go with assumption (1). The data you provided looks fairly random. I'm curious what function you want to use to model this. What does the network look like (how many layers), and what exactly are the inputs to your network (are they powers of your input variable, or something else)?

1

Imaginary_Carrot4092 OP t1_iqvqztc wrote

Yes, your assumption (1) is exactly right. The network I am using is very simple, with 2 hidden layers (I am not sure if this model is enough to learn this data). This is not time-series data.

The input is the number of hours it takes for a certain process to complete and the output is one of the process variables.

1

PassionatePossum t1_iqvz2n0 wrote

Yeah, this has no chance of working. Neural networks aren't magic. They are function approximators, nothing more. And a neuron can only learn a linear combination of its inputs.

Since you only have one input, the first layer will only be able to learn scaled copies of the original input, and the second layer will learn how to add them together. So, some non-linearities (from the activations) aside, your model can essentially only learn to add scaled copies of the original input.

And while the universal approximation theorem says that theoretically this is enough to approximate any function if you make your network wide or deep enough, you have no guarantees that the solver will actually find the solution. And in practice, it won't.
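To make that concrete, here's a minimal numpy sketch of a forward pass through a 2-hidden-layer net with a single input (random hypothetical weights, not OP's actual model): each first-layer unit only ever sees `w * x + b`, a scaled and shifted copy of x, and later layers just recombine those copies through the activations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny net: 1 input -> 8 hidden -> 8 hidden -> 1 output
W1, b1 = rng.normal(size=(8, 1)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 8)), rng.normal(size=8)
W3, b3 = rng.normal(size=(1, 8)), rng.normal(size=1)

def relu(z):
    return np.maximum(z, 0.0)

def forward(x):
    # First layer: each unit computes w_i * x + b_i, a scaled/shifted copy of x
    h1 = relu(W1 @ x + b1)
    # Later layers: linear combinations of those copies, with nonlinearities in between
    h2 = relu(W2 @ h1 + b2)
    return (W3 @ h2 + b3)[0]

y = forward(np.array([0.5]))
```

That composition is all the expressiveness a plain MLP on a single raw input has to work with.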

A common trick is to use (1, x, x^2, ..., x^n) as input but I doubt that this will do the trick in your case. If there is a function that describes a relationship between your input variable and the output variable, it has to be a polynomial of extremely high degree.
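If you want to try the polynomial-features trick anyway, numpy's `np.vander` builds exactly that input, e.g. for degree n = 3:

```python
import numpy as np

x = np.array([0.5, 1.0, 2.0])
n = 3
# Columns are the increasing powers 1, x, x^2, x^3
X_poly = np.vander(x, N=n + 1, increasing=True)
# X_poly[2] -> [1., 2., 4., 8.]
```

Each row then replaces the single raw input as the feature vector fed to the network.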

If you have additional inputs you could use, it might help. But just looking at what you have provided, it is not going to work.

2

Tgs91 t1_iryxsqa wrote

A neural network is capable of learning non-linear relationships from a 1d input to a 1d output. The problem is that your data doesn't have any relationship between those variables. You need to find some input variables that are actually related to the output. A neural net can't approximate a relationship that doesn't exist.
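A quick sanity check before training anything: measure whether the variables are related at all. Here's a sketch on synthetic data (correlation only catches roughly linear/monotonic relationships, so a near-zero value isn't proof of independence, but it's a cheap first look):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(size=1000)

y_related = 3 * x + rng.normal(scale=0.1, size=1000)  # genuine relationship
y_random = rng.uniform(size=1000)                     # no relationship

r_related = np.corrcoef(x, y_related)[0, 1]
r_random = np.corrcoef(x, y_random)[0, 1]
# |r_related| comes out near 1, |r_random| near 0
```

If your real input/output pair looks like the second case, no architecture choice will fix it.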

1

BrotherAmazing t1_iqwgcms wrote

You can’t tell if the data is “fairly random” or not just based on that plot though. Once a blue dot of finite size plots over another, two dots nearly on top of one another look identical to the human eye as 10, or 100, or any N > 1 dots plotted almost entirely on top of one another.
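One way around the overplotting problem is to bin the points instead of scattering them. A sketch on synthetic data (a 2D histogram recovers the density that stacked dots hide):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=5000)
y = rng.normal(size=5000)

# Bin the points; the counts reveal density that a scatter plot hides
# once dots start rendering on top of each other
counts, _, _ = np.histogram2d(x, y, bins=20)
max_density = counts.max()  # the densest bin can hold many points
                            # that would render as a single dot
```

Plotting `counts` as a heatmap (or using transparency on the scatter) would show whether OP's cloud is uniform noise or actually has structure.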

Unfortunately, OP doesn’t provide anything close to enough information for anyone here to truly diagnose the problem (what is the theoretical relationship between these inputs/outputs?), or even to say whether there is a problem; i.e., what would we estimate the Bayes Error Rate to be for this problem, and what loss would that yield?

1