Submitted by aleguida t3_yde1q8 in MachineLearning
I have been playing around with both TF and PyTorch for a while, and I noticed that PyTorch generally gives me better results than TensorFlow on a simple binary classification task. Baffled by this, I tried to investigate a little further with a simple comparison:
I made two Colab notebooks that try to solve the same binary classification problem (cats vs. dogs) in both frameworks. As far as I can tell, I kept the models' architectures as similar as possible, relying on pretrained VGG16 weights and allowing training on all layers. The plots below show that PyTorch reaches top performance in just one epoch, while TF is not even close after 10 epochs.
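For context, here is a minimal sketch of the kind of TF setup described (the actual notebooks may differ; the learning rate and Adam optimizer here are assumptions, not taken from the notebooks):

```python
# Sketch: pretrained VGG16 base, all layers trainable, binary head.
import tensorflow as tf
from tensorflow.keras.applications import VGG16

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = True  # allow training on all layers, as described in the post

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # binary: cat vs. dog
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),  # assumed LR/optimizer
              loss="binary_crossentropy",
              metrics=["accuracy"])
```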
The learning rate and optimizer are the same. The VGG16 architecture seems slightly different between the two frameworks, with a different number of parameters. Am I missing something obvious?
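A quick way to check the parameter-count difference is to count parameters directly in both frameworks (sketch only; the exact models in the notebooks may differ):

```python
# Compare parameter counts of the two pretrained VGG16s.
import torchvision.models as models
from tensorflow.keras.applications import VGG16

tf_vgg = VGG16(weights="imagenet", include_top=True)
print("TF VGG16 params:", tf_vgg.count_params())

# weights= syntax needs torchvision >= 0.13; older versions use pretrained=True
pt_vgg = models.vgg16(weights="IMAGENET1K_V1")
print("PyTorch VGG16 params:", sum(p.numel() for p in pt_vgg.parameters()))
```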
plot legend
- training = blue line
- validation = orange line
[plots: training vs. validation accuracy/loss curves for TF and PyTorch]
- TF Colab link: https://colab.research.google.com/drive/1YeOlEGNJXXWJ2bkY1kk2iMJ4JCJwpsKm?usp=sharing
- PyTorch Colab link: https://colab.research.google.com/drive/1nSAuyd9x7WAfA3FkwD6NTyBOzuBvGiRQ#scrollTo=Yf22Fq7CyB3F
EDIT 1
On closer inspection, the PyTorch VGG16 uses batch normalization layers while the TF one does not. torchvision also ships an alternative pretrained VGG16 that doesn't use BN; I will try that one instead.
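Swapping between the two variants is a one-line change (sketch, using the torchvision >= 0.13 weights syntax):

```python
import torchvision.models as models

# vgg16_bn inserts a BatchNorm2d after every conv layer
vgg_with_bn = models.vgg16_bn(weights="IMAGENET1K_V1")

# plain vgg16 has no batch norm, matching Keras's VGG16 more closely
vgg_no_bn = models.vgg16(weights="IMAGENET1K_V1")
```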
pornthrowaway42069l t1_itrxwks wrote
Try predicting on a generated dataset, or one of the generic datasets; this will show whether VGG16 is the culprit or it's a broader pattern. GET MORE DATA FOR THE GODS OF DATA
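One hedged way to act on this suggestion: CIFAR-10 already contains cat (label 3) and dog (label 5) classes, so it makes a quick generic binary dataset for sanity-checking either pipeline (sketch only):

```python
import torch
from torchvision import datasets, transforms

tfm = transforms.Compose([
    transforms.Resize(224),  # VGG16 expects 224x224 inputs
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
cifar = datasets.CIFAR10(root="./data", train=True, download=True, transform=tfm)

# filter on .targets so we don't decode every image just to read its label
idx = [i for i, y in enumerate(cifar.targets) if y in (3, 5)]
loader = torch.utils.data.DataLoader(
    torch.utils.data.Subset(cifar, idx), batch_size=32, shuffle=True
)
```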