bacon_boat t1_iygvgdf wrote on December 1, 2022 at 8:07 AM

f(f(f(f(x)))) =/= f(x)+f(x)+f(x)+f(x)

eternal-abyss-77 OP t1_iygwbbr wrote on December 1, 2022 at 8:20 AM

Got it bro, thanks

Bro, but let me ask you one more question, please bear with me.

If the result [ f(x)+f(x)+f(x)+f(x) ] >= result [ f(f(f(f(x)))) ]

(By "result" I mean the feature map, i.e. the features retained or extracted.)

Can I conclude that both are the same?

If you have a specific case in mind, I think you need to check it directly.

They are obviously not the same in general.
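A quick way to check this is a toy example. This is only a sketch: `f` here is a made-up one-unit "layer" (ReLU of `w * x + b`), and `w`, `b`, and `x` are arbitrary values picked for illustration, not anything from a real network.

```python
def f(x, w=2.0, b=1.0):
    # Hypothetical one-unit "layer": ReLU(w * x + b)
    return max(0.0, w * x + b)

def stacked(x, depth=4, **params):
    # f(f(f(f(x)))): each application consumes the previous output
    for _ in range(depth):
        x = f(x, **params)
    return x

def summed(x, copies=4, **params):
    # f(x) + f(x) + f(x) + f(x): every copy sees the same input
    return sum(f(x, **params) for _ in range(copies))

print(stacked(1.0), summed(1.0))                # 31.0 12.0
print(stacked(0.0, b=0.0), summed(0.0, b=0.0))  # 0.0 0.0
```

The second line only agrees because of the special choice `b = 0` and `x = 0`. A coincidence like that at one input says nothing about the two expressions being the same function.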

No, because stacking layers is basically what gives neural networks their ability to extract high-level features.
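One standard illustration of this (not from the thread, just a sketch) uses the tent map on [0, 1], which can be written with two ReLUs as `2*relu(x) - 4*relu(x - 0.5)`. Composing it four times produces a function with many more peaks than adding four copies, so the stacked version can represent structure the summed version cannot. The names `tent`, `compose`, and `count_peaks` are my own helpers.

```python
def tent(x):
    # Tent map on [0, 1]: rises to 1 at x = 0.5, then falls back to 0
    return 2 * x if x <= 0.5 else 2 * (1 - x)

def compose(x, depth):
    # Apply tent `depth` times: tent(tent(...tent(x)...))
    for _ in range(depth):
        x = tent(x)
    return x

def count_peaks(ys):
    # Count strict local maxima in a sampled curve
    return sum(1 for i in range(1, len(ys) - 1) if ys[i - 1] < ys[i] > ys[i + 1])

n = 1024  # power of two, so every grid point is exact in floating point
xs = [i / n for i in range(n + 1)]

stacked = [compose(x, 4) for x in xs]  # f(f(f(f(x))))
summed = [4 * tent(x) for x in xs]     # f(x) + f(x) + f(x) + f(x)

print(count_peaks(stacked), count_peaks(summed))  # 8 1
```

Adding copies only rescales the same single bump, while each extra level of composition doubles the number of peaks.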

That’s a good point. Actually, changing your question slightly leads to the problem of neural network width vs. depth. Check these materials:

Do Wide and Deep Networks Learn the Same Things?

Universal approximation theorem.