
melgor89 t1_j6xufba wrote

In my experience, they are equivalent now, especially since we use BatchNorm or LayerNorm. Both normalization layers also rely on mean and std values, which makes it irrelevant which method you use. So I prefer the TensorFlow approach, as it is the simpler one.
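A minimal sketch of the idea, under loud assumptions: the two input conventions and their constants below are illustrative and not taken from this thread, and `standardize` is a hypothetical helper that mimics only the mean/std step of a normalization layer (no learned scale or shift). It shows that two different affine rescalings of the same pixels give the same result once they are standardized. In a real network there are convolutions and per-channel constants in between, so the equivalence is only approximate there.

```python
import math

def standardize(values, eps=1e-12):
    """Subtract the mean and divide by the standard deviation."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]

# Hypothetical pixel values in [0, 255].
pixels = [0.0, 64.0, 128.0, 255.0]

# Convention A (assumed): scale pixels to [-1, 1].
scaled = [p / 127.5 - 1.0 for p in pixels]

# Convention B (assumed): scale to [0, 1], then subtract a dataset mean
# and divide by a dataset std. The constants 0.45 and 0.225 are made up.
mean_std = [(p / 255.0 - 0.45) / 0.225 for p in pixels]

out_scaled = standardize(scaled)
out_mean_std = standardize(mean_std)

# Both inputs are positive affine transforms of the same pixels,
# so standardizing them removes the difference.
same = all(
    math.isclose(a, b, abs_tol=1e-6)
    for a, b in zip(out_scaled, out_mean_std)
)
print(same)
```

The comparison holds because standardization cancels any shift and any positive scale applied uniformly to the input.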

3

netw0rkf10w OP t1_j6z0oia wrote

So no noticeable difference in performance in your experiments?

1