Comments

cma_4204 t1_iuyamtp wrote

Data augmentation, dropout?

2

tivotox t1_iuyavzv wrote

The model is equivariant, so there's no dataset augmentation, and no dropout either. The model doesn't overfit, as I said.
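
For context, here's a minimal sketch of how that equivariance claim could be checked numerically, assuming a hypothetical rotation-equivariant model mapping (N, 3) points to (N, 3) vectors (every name here is a placeholder, not the actual code):

```python
import torch

def check_equivariance(model, x, atol=1e-5):
    # Draw a random proper rotation R via QR decomposition.
    q, _ = torch.linalg.qr(torch.randn(3, 3))
    if torch.det(q) < 0:
        q[:, 0] = -q[:, 0]  # flip one column so det(R) = +1
    # Equivariance: f(R x) should equal R f(x).
    return torch.allclose(model(x @ q.T), model(x) @ q.T, atol=atol)
```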

1

cma_4204 t1_iuyb08l wrote

Well, clearly it's not getting any better with what you're trying. Maybe it's time to rethink.

1

tivotox t1_iuybact wrote

But dropout prevents overfitting, and I don't have any overfitting, so it's not the relevant tool here.

0

cma_4204 t1_iuybjkm wrote

My best guess is a coding mistake on your part. Good luck, tivo.

1

tivotox t1_iuyfoup wrote

I mean, the dataset is extremely diverse: millions of clusters, and every entry is noised when it's loaded onto the GPUs.
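
As a rough sketch of that loading step (the dataset class and `sigma` here are placeholder assumptions, not the actual code):

```python
import torch
from torch.utils.data import Dataset

class NoisedDataset(Dataset):
    """Adds fresh Gaussian noise to each entry every time it is loaded."""

    def __init__(self, clean_data, sigma=0.1):
        self.clean_data = clean_data  # tensor of clean examples
        self.sigma = sigma            # noise scale

    def __len__(self):
        return len(self.clean_data)

    def __getitem__(self, idx):
        x = self.clean_data[idx]
        noise = torch.randn_like(x) * self.sigma
        # Return the noised input together with the noise as the target.
        return x + noise, noise
```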

1

suflaj t1_iuy2y71 wrote

Loss doesn't matter, what are the validation metrics?

1

tivotox t1_iuy798z wrote

The loss here is for a denoiser; it can be seen as the variance between the noise and the predicted noise, so in this case it's a good metric.
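
Roughly, as a minimal sketch (the names `model`, `x_noisy`, and `noise` are placeholders):

```python
import torch.nn.functional as F

def denoising_loss(model, x_noisy, noise):
    # The model predicts the injected noise; the loss is the mean squared
    # error between the true noise and the predicted noise.
    noise_pred = model(x_noisy)
    return F.mse_loss(noise_pred, noise)
```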

1

suflaj t1_iuya2f9 wrote

It can be seen as an approximation of the variance between the noise and the noise predicted conditioned on some data.

If it's on the training set, it is not even usable as a metric, and if it is not directly related to the performance, it is not a good metric. You want to see how it behaves on unseen data.

1

tivotox t1_iuyb2oh wrote

The split has been done such that the train and test sets are highly different. The losses are almost equal on both datasets.

1

suflaj t1_iuybshu wrote

That seems very bad. You want your train, dev, and test sets to be different samples of the same distribution, so not very different sets; a minimal sketch of such a split is at the end of this comment.

Furthermore, if you're using the test set for model validation, you will have no dataset left to finally evaluate your model on. Reconsider your process.

Finally, again, I urge you to evaluate your model on an established evaluation metric for the task, not the loss you use to train the model. What is the exact task?
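
As promised, a sketch of what I mean by sampling from the same distribution, assuming a PyTorch dataset (`full_dataset` and the split fractions are placeholders):

```python
import torch
from torch.utils.data import random_split

def make_splits(full_dataset, frac_train=0.8, frac_dev=0.1, seed=0):
    # Random train/dev/test split: all three sets are samples of the same
    # distribution, unlike a split engineered to be "highly different".
    n = len(full_dataset)
    n_train = int(frac_train * n)
    n_dev = int(frac_dev * n)
    gen = torch.Generator().manual_seed(seed)
    return random_split(full_dataset, [n_train, n_dev, n - n_train - n_dev],
                        generator=gen)
```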

2

[deleted] OP t1_iuyf897 wrote

[deleted]

1

suflaj t1_iuyg2am wrote

Well, I couldn't understand what your task was, since you didn't say what it was until now.

Other than that, skimming through the paper, it quite clearly says the following:

> Our present results do not indicate our procedure can generalize to motifs that are not present in the training set

Because what they're doing doesn't generalize, I think the starting assumption (that there will be improvements with a larger model) is wrong, and so the question is moot... The issue is with the method or the data; they don't elaborate more than that.

2