
DrXaos t1_isn3k9e wrote

It certainly could be dropout. In its usual form in most packages, dropout is on during training, stochastically perturbing activations, and off at test time.
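
For concreteness, a minimal sketch of that train/eval toggle; the thread doesn't name a framework, so PyTorch and the layer sizes here are assumptions:

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(64, 128), nn.ReLU(),
    nn.Dropout(p=0.5),          # active only in train() mode
    nn.Linear(128, 10),
)
x = torch.randn(4, 64)

model.train()                   # dropout ON: activations randomly zeroed and rescaled
y_train_mode = model(x)

model.eval()                    # dropout OFF: the layer is an identity pass-through
with torch.no_grad():
    y_eval_mode = model(x)
```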

Take out dropout, use other regularization, and report your optimized loss function directly on both train and test: often that is the NLL, if you're using the conventional softmax + CE loss, which is the most common choice for multinomial outcomes.
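
A hedged sketch of reporting the optimized loss on both splits, again assuming PyTorch; `train_loader` and `test_loader` are illustrative names, not anything from the thread:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def mean_nll(model, loader, device="cpu"):
    """Average softmax cross-entropy (NLL) over a dataset; lower is better."""
    model.eval()                 # dropout and similar regularizers disabled
    total_loss, total_n = 0.0, 0
    for inputs, targets in loader:
        inputs, targets = inputs.to(device), targets.to(device)
        logits = model(inputs)
        # sum over the batch so the final average is per-example
        total_loss += F.cross_entropy(logits, targets, reduction="sum").item()
        total_n += targets.size(0)
    return total_loss / total_n

# train_nll = mean_nll(model, train_loader)
# test_nll  = mean_nll(model, test_loader)   # expected: test_nll >= train_nll
```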

3

redditnit21 OP t1_isn465e wrote


Yeah, I am using the conventional softmax + CE loss function, which is the most common for multinomial outcomes. Which regularization method would you suggest, and what's the main reason why test accuracy should be lower than train accuracy?

1

DrXaos t1_isn4fs5 wrote

Top-1 accuracy is a noisy measurement, particularly because each example contributes a binary 0/1 outcome.

A continuous performance statistic is more likely to show the expected behavior of train performance being better than test. Note that for loss functions, lower is better.
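
As a rough illustration of that noise (numbers purely hypothetical), the sampling error of top-1 accuracy on n examples follows the binomial standard error:

```python
import math

n = 2000          # hypothetical test-set size
acc = 0.91        # hypothetical measured top-1 accuracy
se = math.sqrt(acc * (1 - acc) / n)
print(f"top-1 accuracy: {acc:.3f} +/- {se:.3f} (binomial standard error)")
# With n = 2000 the accuracy estimate wobbles by roughly +/-0.6 percentage
# points from sampling alone, which can easily mask a small train/test gap.
```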

There's lots of regularization possible, but start with L2, weight decay, and/or limiting the size of your network.
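
A minimal sketch of adding weight decay through the optimizer, assuming PyTorch; the decay value is an arbitrary starting point, not a recommendation from the thread:

```python
import torch
import torch.nn as nn

model = nn.Linear(64, 10)  # stand-in for the actual network

# weight_decay applies an L2 penalty to the parameters at every update step
optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                            momentum=0.9, weight_decay=1e-4)
```

If you train with Adam, `torch.optim.AdamW` applies decoupled weight decay, which is generally the preferred form there.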

1