
CyberPun-K t1_iyj4snj wrote

The M3 dataset consists of only 3,003 series, so a minimal improvement from DL is not a surprise. Everybody knows that neural networks require large datasets to show substantial improvements over statistical baselines.

What is truly surprising is the time it takes to train the networks: 13 days for a few thousand series

=> there must be something broken with the experiments

44

HateRedditCantQuitit t1_iyj6yb6 wrote

Call it 14 days: that's about 20k minutes, so roughly 6.7 minutes per time series. I don't know how many models are in the ensemble, but let's assume 13 models for even math, which puts an average deep model at about 30 s to train on an average time series.
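A quick back-of-envelope check (the series count is M3's 3,003; everything else is my assumption, not a figure from the paper):

```python
# Back-of-envelope check of the per-series training budget.
# Every number here is an assumption from this thread, not from the paper.
days = 14                # the reported ~13 days, rounded up for even math
n_series = 3_003         # series in the M3 dataset
n_models = 13            # assumed ensemble size

total_minutes = days * 24 * 60                          # 20,160 min
minutes_per_series = total_minutes / n_series           # ~6.7 min
seconds_per_model = minutes_per_series * 60 / n_models  # ~31 s

print(f"{minutes_per_series:.1f} min per series, "
      f"{seconds_per_model:.0f} s per model per series")
```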

Is that so crazy?

13

CyberPun-K t1_iyj7gb9 wrote

All of the models are global models trained with cross-learning, not single models per series. Unless the experiments were actually done that way.
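For context, a minimal sketch of what global / cross-learned training means here (the window lengths and names are illustrative, not the paper's setup):

```python
import numpy as np

def make_windows(series_list, input_len=12, horizon=6):
    """Pool fixed-length (input, target) windows from every series."""
    X, y = [], []
    for s in series_list:
        for t in range(len(s) - input_len - horizon + 1):
            X.append(s[t:t + input_len])
            y.append(s[t + input_len:t + input_len + horizon])
    return np.array(X), np.array(y)

# Global (cross-learned) training: one network sees windows pooled from
# all series, instead of fitting 3,003 separate per-series models.
# X, y = make_windows(all_m3_series)   # all_m3_series is hypothetical
# model.fit(X, y)                      # a single shared model
```

With one shared network, the wall-clock should scale with the total number of training windows, not with (number of series) × (per-series fit time), which is why the 13-day figure looks suspicious.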

19

I_LOVE_SOURCES t1_iykxl0a wrote

…. am i failing to detect humour/sarcasm? those words don’t appear to say anything

−1

BrisklyBrusque t1_iyj6bja wrote

13 days to tune multiple deep neural networks is not at all unrealistic, depending on the number of GPUs.

7

CyberPun-K t1_iyj6q2r wrote

N-BEATS hyperparameters are only minimally explored in the original paper, and the ensemble was not tuned. There is something broken with the reported times.

17

Historical_Ad2338 t1_iylgux6 wrote

I was thinking the same thing when I looked into this. I'm not sure the experiments are necessarily 'broken' (there may at least be a reasonable justification for why training took 13 days), but the first point about dataset size is a smoking gun.

4

__mantissa__ t1_iylhzuj wrote

I have not read the paper yet, but the time the DL ensemble takes may be due to some kind of hyperparameter search.
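Purely illustrative numbers (not from the paper) showing how a sweep could plausibly add up to roughly the reported wall-clock:

```python
# Rough cost model of a hyperparameter search; every number below is an
# illustrative assumption, not a figure from the paper.
n_configs = 300          # candidate configurations tried in the sweep
n_ensemble = 10          # models retrained for the final ensemble
hours_per_fit = 1.0      # assumed cost of one full training run
n_gpus = 1               # a sweep parallelizes across GPUs if available

total_hours = (n_configs + n_ensemble) * hours_per_fit / n_gpus
print(f"~{total_hours:.0f} GPU-hours, i.e. about {total_hours / 24:.1f} days")
```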

4