Submitted by fedegarzar t3_z9vbw7 in MachineLearning
CyberPun-K t1_iyj4snj wrote
The M3 dataset consists of only 3,003 series, so a minimal improvement from DL is not a surprise. Everybody knows that neural networks require large datasets to show substantial improvements over statistical baselines.
What is truly surprising is the time it takes to train the networks: 13 days for a few thousand series
=> there must be something broken with the experiments
HateRedditCantQuitit t1_iyj6yb6 wrote
14 days is 20k minutes, so it’s about 6.7 minutes per time series. I don’t know how many models are in the ensemble, but let’s assume it’s 13 models for even math, making an average deep model take 30s to train on an average time series.
Is that so crazy?
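A minimal sketch of that back-of-envelope arithmetic, assuming 3,003 series and a 13-member ensemble (the ensemble size is an assumption from this thread, not a figure from the paper):

```python
# Back-of-envelope check of the estimate above.
# All figures come from the thread, not from the paper itself.
days = 14                 # reported wall-clock training time (roughly)
n_series = 3_003          # number of series in M3
ensemble_size = 13        # assumed number of models in the ensemble

total_minutes = days * 24 * 60                                # ~20,160 min
minutes_per_series = total_minutes / n_series                 # ~6.7 min
seconds_per_model = minutes_per_series * 60 / ensemble_size   # ~31 s

print(f"{total_minutes:,.0f} min total, "
      f"{minutes_per_series:.1f} min per series, "
      f"~{seconds_per_model:.0f} s per model per series")
```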
CyberPun-K t1_iyj7gb9 wrote
All the models are global models trained with cross-learning, not single models per series (unless the experiments were actually done that way).
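For anyone unsure what that means: a global model is fit once on training windows pooled from every series in the dataset ("cross-learning"), rather than fitting a separate model per series. A minimal sketch of the distinction, using a placeholder linear model rather than the paper's N-BEATS setup:

```python
# Contrast between per-series (local) and global (cross-learned) training.
# The lag features and linear model are illustrative stand-ins only.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
series = [rng.normal(size=100).cumsum() for _ in range(5)]  # toy dataset

def make_windows(y, n_lags=4):
    """Turn one series into (lag-feature, next-step-target) pairs."""
    X = np.array([y[i:i + n_lags] for i in range(len(y) - n_lags)])
    t = y[n_lags:]
    return X, t

# Local: one model per series.
local_models = [LinearRegression().fit(*make_windows(y)) for y in series]

# Global ("cross-learning"): one model fit on windows pooled across all series.
X_all = np.vstack([make_windows(y)[0] for y in series])
t_all = np.concatenate([make_windows(y)[1] for y in series])
global_model = LinearRegression().fit(X_all, t_all)
```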
I_LOVE_SOURCES t1_iykxl0a wrote
…. Am I failing to detect humour/sarcasm? Those words don't appear to say anything.
BrisklyBrusque t1_iyj6bja wrote
13 days to tune multiple deep neural networks is not at all unrealistic, depending on the number of GPUs.
CyberPun-K t1_iyj6q2r wrote
N-BEATS hyperparameters are only minimally explored in the original paper, and the ensemble was not tuned. There is something broken with the reported times.
Historical_Ad2338 t1_iylgux6 wrote
I was thinking the same thing when I looked into this. I'm not sure the experiments are necessarily 'broken' (there may be at least a reasonable justification for why training took 13 days), but the first point about dataset size is a smoking gun.
__mantissa__ t1_iylhzuj wrote
I have not read the paper yet, but the time the DL ensemble takes may be due to some kind of hyperparameter search.
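If so, even a modest search would multiply the wall-clock time. A purely hypothetical illustration of that multiplier (none of these numbers come from the paper):

```python
# Hypothetical illustration: a hyperparameter search multiplies training time.
hours_per_full_ensemble_fit = 12   # hypothetical cost of training the ensemble once
n_hyperparameter_configs = 26      # hypothetical size of the search grid

total_hours = hours_per_full_ensemble_fit * n_hyperparameter_configs
print(f"{total_hours} hours ≈ {total_hours / 24:.0f} days")  # 312 h ≈ 13 days
```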