Submitted by Quiet-Investment-734 t3_10ek5oh in deeplearning
ed3203 t1_j4slyft wrote
Reply to comment by WinterExtreme9316 in Increasing number of output nodes on addition of a new class by Quiet-Investment-734
Yes, you may arrive at a different local minimum, which could be more performant, since you give the model more freedom to explore. OP gave no context, though. If it's a huge transformer model, for instance, that would be impractical to retrain, then sure, use the model as is with a different final classification layer.
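For what it's worth, here is a rough sketch of the "keep the model, change the final layer" option in plain Python, with the final layer written as a list of weight rows (one per class). The function names `expand_output_layer` and `logits`, the 0.01 init scale, and the toy numbers are all made up for illustration and are not from OP's setup. In a real framework you would do the equivalent on the model's last linear layer.

```python
import random


def expand_output_layer(weights, biases, num_new_classes, seed=0):
    """Return copies of (weights, biases) with extra rows for new classes.

    weights: list of rows, one row per existing class.
    biases:  list of floats, one per existing class.
    Existing rows are copied unchanged; new rows get a small random init
    (the 0.01 scale is an arbitrary assumption).
    """
    rng = random.Random(seed)
    in_features = len(weights[0])
    new_weights = [row[:] for row in weights]  # keep what was already learned
    new_biases = list(biases)
    for _ in range(num_new_classes):
        new_weights.append([rng.gauss(0.0, 0.01) for _ in range(in_features)])
        new_biases.append(0.0)
    return new_weights, new_biases


def logits(weights, biases, features):
    """Plain linear layer: one dot product plus bias per class."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]


# Toy final layer with 2 classes and 3 input features (hypothetical values).
old_w = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
old_b = [0.1, -0.1]

new_w, new_b = expand_output_layer(old_w, old_b, num_new_classes=1)

features = [1.0, 2.0, 3.0]
old_logits = logits(old_w, old_b, features)
new_logits = logits(new_w, new_b, features)
```

The logits for the existing classes are identical before and after the expansion, so fine-tuning starts from the old solution and only the new row is untrained. Retraining from scratch would instead re-initialise everything, which is the "more freedom to explore" trade-off described above.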