Submitted by aleguida t3_yde1q8 in MachineLearning
I have been playing around with both TF and PyTorch for a while, and I noticed that PyTorch generally gives me better results than TensorFlow on a simple binary classification task. Baffled by this, I tried to investigate a little further with a simple comparison:
I made two Colab notebooks that try to solve the same binary classification problem (cats vs. dogs) in both frameworks. As far as I can tell, I kept the models' architectures as similar as possible, relying on pretrained VGG16 weights and allowing training on all layers. The plots below show that PyTorch reaches top performance in just one epoch, while TF is not even close after 10 epochs.
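For context, here is a minimal sketch of the kind of TF setup described (the actual notebooks may differ; the learning rate and Adam optimizer here are assumptions, not taken from the notebooks):

```python
# Sketch: pretrained VGG16 base, all layers trainable, binary head.
import tensorflow as tf
from tensorflow.keras.applications import VGG16

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = True  # allow training on all layers, as described in the post

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # binary: cat vs. dog
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),  # assumed LR/optimizer
              loss="binary_crossentropy",
              metrics=["accuracy"])
```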
The learning rate and optimizer are the same. The VGG16 architecture seems slightly different between the two frameworks, with a different number of parameters. Am I missing something obvious?
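A quick way to check the parameter-count difference is to count parameters directly in both frameworks (sketch only; the exact models in the notebooks may differ):

```python
# Compare parameter counts of the two pretrained VGG16s.
import torchvision.models as models
from tensorflow.keras.applications import VGG16

tf_vgg = VGG16(weights="imagenet", include_top=True)
print("TF VGG16 params:", tf_vgg.count_params())

# weights= syntax needs torchvision >= 0.13; older versions use pretrained=True
pt_vgg = models.vgg16(weights="IMAGENET1K_V1")
print("PyTorch VGG16 params:", sum(p.numel() for p in pt_vgg.parameters()))
```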
plot legend
- training = blue line
- validation = orange line
[plots: training vs. validation accuracy/loss curves for TF and PyTorch]
- TF Colab link: https://colab.research.google.com/drive/1YeOlEGNJXXWJ2bkY1kk2iMJ4JCJwpsKm?usp=sharing
- PyTorch Colab link: https://colab.research.google.com/drive/1nSAuyd9x7WAfA3FkwD6NTyBOzuBvGiRQ#scrollTo=Yf22Fq7CyB3F
EDIT 1
On closer inspection, the PyTorch VGG16 uses batch normalization layers while the TF one does not. torchvision also ships an alternative pretrained VGG16 that doesn't use BN; I will try that one instead.
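Swapping between the two variants is a one-line change (sketch, using the torchvision >= 0.13 weights syntax):

```python
import torchvision.models as models

# vgg16_bn inserts a BatchNorm2d after every conv layer
vgg_with_bn = models.vgg16_bn(weights="IMAGENET1K_V1")

# plain vgg16 has no batch norm, matching Keras's VGG16 more closely
vgg_no_bn = models.vgg16(weights="IMAGENET1K_V1")
```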
pornthrowaway42069l t1_itrxwks wrote
Try predicting on a generated dataset, or one of the generic datasets; this will show whether VGG16 is the culprit or it's a broader pattern. GET MORE DATA FOR THE GODS OF DATA
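One hedged way to act on this suggestion: CIFAR-10 already contains cat (label 3) and dog (label 5) classes, so it makes a quick generic binary dataset for sanity-checking either pipeline (sketch only):

```python
import torch
from torchvision import datasets, transforms

tfm = transforms.Compose([
    transforms.Resize(224),  # VGG16 expects 224x224 inputs
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
cifar = datasets.CIFAR10(root="./data", train=True, download=True, transform=tfm)

# filter on .targets so we don't decode every image just to read its label
idx = [i for i, y in enumerate(cifar.targets) if y in (3, 5)]
loader = torch.utils.data.DataLoader(
    torch.utils.data.Subset(cifar, idx), batch_size=32, shuffle=True
)
```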