aleguida OP t1_itw7yh3 wrote on October 26, 2022 at 7:28 PM

Reply to comment by aleguida in [D] Tensorflow learning differently than Pytorch by aleguida

There was indeed a bug in the val loss calculation. Good catch!
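
The exact bug is not described in this comment. Purely as an illustration, one common mistake of this kind is averaging the per-batch mean losses even though the last batch is smaller than the rest. A minimal sketch in plain Python, assuming made-up batch values (the helper name `epoch_mean_loss` is hypothetical, not from the notebooks):

```python
def epoch_mean_loss(batch_mean_losses, batch_sizes):
    """Per-sample mean loss over an epoch, given per-batch mean losses."""
    total = sum(loss * n for loss, n in zip(batch_mean_losses, batch_sizes))
    return total / sum(batch_sizes)


# Made-up numbers: two full batches of 32 samples and a final batch of 4.
batch_means = [0.50, 0.70, 2.00]
batch_sizes = [32, 32, 4]

# Naive version: every batch counts equally, so the tiny last batch dominates.
naive = sum(batch_means) / len(batch_means)

# Weighted version: every sample counts equally.
weighted = epoch_mean_loss(batch_means, batch_sizes)

print(f"naive={naive:.4f} weighted={weighted:.4f}")
```

The two numbers only agree when all batches have the same size, so it is worth checking that both frameworks reduce the validation loss the same way before comparing curves.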

aleguida OP t1_itvhw6d wrote on October 26, 2022 at 4:42 PM

Reply to comment by edunuke in [D] Tensorflow learning differently than Pytorch by aleguida

It is worth a shot, thanks for the suggestions! I will try it in the next few days and report back here :)

aleguida OP t1_itvhqg8 wrote on October 26, 2022 at 4:40 PM

Reply to comment by _peabody124 in [D] Tensorflow learning differently than Pytorch by aleguida

We should have 50% of the dataset as cats and 50% dogs.

Looks like a bug. The cross entropy loss in PyTorch seems incredibly out of line, as does the training-validation loss gap.

Good point. I need to double-check that. What worries me more is that the TF implementation is struggling to get OK results.
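
As a rough sanity check on the loss scale: with a 50/50 cats/dogs split, a model that always outputs 0.5 gets a binary cross entropy of ln 2 ≈ 0.693, so a loss well above that is worse than guessing and hints at a bug. A minimal sketch in plain Python, assuming binary labels and predicted probabilities (the function below is written for illustration and is not taken from either notebook):

```python
import math


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross entropy for 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Balanced toy labels: 50 cats (0) and 50 dogs (1).
labels = [0, 1] * 50

# A model that has learned nothing and always predicts 0.5.
chance_loss = binary_cross_entropy(labels, [0.5] * len(labels))

print(chance_loss, math.log(2))
```

The same ln 2 baseline applies to a two-class softmax output that assigns 0.5 to each class.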

aleguida OP t1_itvhaee wrote on October 26, 2022 at 4:38 PM

Reply to comment by I_will_delete_myself in [D] Tensorflow learning differently than Pytorch by aleguida

Thanks for the feedback. Turning off pretraining causes PyTorch to learn more slowly (as expected), but TF is stuck and not learning anything. See the Colab notebooks below.

I see many other TF implementations adding a few more FC layers on top of VGG16, but as you also stated, I would expect to see the same problem in PyTorch, whereas I am getting different results with a similar network. Next I will try to build a CNN from scratch using the very same layers in both frameworks.

TF: https://colab.research.google.com/drive/1O6qzopiFzK5tDmLQAzLKmNoNaEMDc4Ze?usp=sharing
PYTORCH: https://colab.research.google.com/drive/1g-1CEpzmWJi9xOiZHzvDSv_-eDlXZO9u?usp=sharing

aleguida OP t1_itvgiu8 wrote on October 26, 2022 at 4:33 PM

Reply to comment by seba07 in [D] Tensorflow learning differently than Pytorch by aleguida

Thanks for the feedback. I tried retraining everything from scratch without downloading any pretrained weights. The Colab links have been updated.

While PyTorch is learning something, TF is not learning anything. This is quite confusing, as I used tf.keras to minimize any possible error on my part. Next I will try to build the same network from scratch in both frameworks.
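
One thing that may be worth ruling out when a model is stuck near chance (an assumption, not something confirmed in the notebooks): a mismatch between what the last layer outputs and what the loss expects. In tf.keras, `BinaryCrossentropy` takes a `from_logits` argument that defaults to `False`, while PyTorch's `BCEWithLogitsLoss` applies the sigmoid internally. If a sigmoid ends up applied twice, the predicted probability is confined to roughly (0.5, 0.73), and the loss for a label-0 sample can never drop below ln 2. A minimal sketch in plain Python with a made-up logit value:

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bce_from_logit(y, logit):
    """Binary cross entropy for one sample; applies the sigmoid itself."""
    p = sigmoid(logit)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


raw_logit = -6.0  # made-up value: network is confident the label is 0
y = 0

# Sigmoid applied once: the loss is close to zero, as it should be.
correct = bce_from_logit(y, raw_logit)

# Sigmoid applied twice (model already outputs a probability, loss treats it
# as a logit): the loss stays above ln 2 no matter how confident the model is.
doubled = bce_from_logit(y, sigmoid(raw_logit))

print(f"correct={correct:.4f} doubled={doubled:.4f} ln2={math.log(2):.4f}")
```

This is only a candidate explanation; building the same small CNN in both frameworks, as planned, would also surface it.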