The purpose of a deep network is to approximate complex nonlinear functions. With ReLU activations the network is piecewise linear. Imagine slicing a space with many planes: locally it's flat, but zoomed out it has a very complex shape, similar to building a 3D model out of triangles. Each layer adds another linear deformation and another slice of the space.
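Here's a minimal sketch of that idea in plain numpy (the weights and sizes are made up just for illustration): a one-hidden-layer ReLU net on a scalar input is linear between the "kinks" where hidden units cross zero, and those kinks are exactly the slices.

```python
import numpy as np

# Tiny 1-hidden-layer ReLU network on a scalar input; weights are arbitrary.
W1 = np.array([[1.0], [-2.0], [0.5]])   # 3 hidden units
b1 = np.array([0.0, 1.0, -0.5])
W2 = np.array([[1.0, -1.0, 2.0]])       # 1 output unit
b2 = np.array([0.3])

def relu_net(x):
    h = np.maximum(0.0, W1 @ np.array([x]) + b1)  # linear map, then a "slice" at zero
    return (W2 @ h + b2)[0]

# Sampling the function shows straight-line segments joined at kinks
# where a hidden unit switches on or off -- piecewise linear overall.
xs = np.linspace(-3, 3, 13)
print([round(relu_net(x), 3) for x in xs])
```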
Read the ResNet paper. It's a great explanation of both why depth matters for performance and how it causes issues for training. Its solution, residual connections, is central to virtually every deep learning architecture since.
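The core trick is just output = x + F(x): the identity "skip" path gives gradients a straight route through many layers. A rough sketch (dimensions and weight scales are made up, not from the paper):

```python
import numpy as np

def residual_block(x, W1, W2):
    # output = x + F(x): the identity path lets the signal (and gradient)
    # pass straight through even when F does little.
    h = np.maximum(0.0, W1 @ x)   # ReLU(W1 x)
    return x + W2 @ h             # add the input back

rng = np.random.default_rng(0)
d = 4
x = rng.standard_normal(d)

# Stack many blocks; thanks to the identity path the signal survives
# a deep stack instead of vanishing or blowing up.
for _ in range(50):
    W1 = 0.1 * rng.standard_normal((d, d))
    W2 = 0.1 * rng.standard_normal((d, d))
    x = residual_block(x, W1, W2)
print(x)
```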
If you want to be someone who understands it very deeply, get REALLY good at linear algebra and develop a REALLY good understanding of multivariate calculus.
The not-so-deep answer to your questions is that your current understanding is right. You have a bunch of functions that each take multiple inputs and spit out one output, and that output is combined with other outputs and fed into other functions. Each function has parameters that can vary, which changes its output. When you train, you give it a bunch of examples that in real life you know (or hope) are related, and the model learns parameters that map input to output.
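To make that concrete, here's the whole idea boiled down to one function with two parameters, fit by gradient descent on toy data (the data and learning rate are invented just to show the loop; a real network is the same thing with far more parameters):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy examples we "hope" are related: here y really is about 3x - 1 plus noise.
x = rng.standard_normal(200)
y = 3.0 * x - 1.0 + 0.1 * rng.standard_normal(200)

# One tiny function with parameters that can vary: f(x) = w*x + b.
w, b = 0.0, 0.0
lr = 0.1
for _ in range(500):
    pred = w * x + b
    err = pred - y
    # Nudge the parameters downhill on the mean squared error.
    w -= lr * np.mean(err * x)
    b -= lr * np.mean(err)

print(w, b)   # ends up near 3 and -1: learned parameters that map input to output
```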