Submitted by MLNoober t3_xuogm3 in MachineLearning
Hi all,
I've been reading up on neural networks, primarily for image processing applications. Given the current capabilities of neural networks, it seems a little simplistic to think that, in the end, we are learning a bunch of linear functions (hyperplanes). Why not use more complex functions to represent neurons, i.e. higher-order functions?
Thanks, MLNoober
------------------------------------------------------------------------------------------
Thank you for the replies.
I understand that neural networks can represent non-linear complex functions.
To clarify more,
My question is that a single neuron still computes F(X) = WX + b, which is linear in X.
Why not use a higher-order function, F(X) = Wn X^n + Wn-1 X^(n-1) + ... + W1 X + b?
I can imagine the increase in computation needed to implement this, but neural networks were also considered too time-consuming until we started using GPUs for parallel computation.
So if we ignore the implementation details to accomplish this for large networks, are there any inherent advantages to using higher-order neurons?
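To make the question concrete, here is a minimal NumPy sketch contrasting a standard neuron with one hypothetical form of "order-2" neuron, F(x) = x^T Q x + w^T x + b (the function names, shapes, and the specific quadratic form are my own illustration, not from any particular paper):

```python
import numpy as np

# A standard neuron: linear combination of the inputs plus bias,
# followed by a nonlinearity (ReLU here).
def linear_neuron(x, w, b):
    return np.maximum(0.0, w @ x + b)

# A hypothetical quadratic neuron: a full second-order polynomial
# of the inputs, F(x) = x^T Q x + w^T x + b, then the same ReLU.
def quadratic_neuron(x, Q, w, b):
    return np.maximum(0.0, x @ Q @ x + w @ x + b)

rng = np.random.default_rng(0)
x = rng.standard_normal(4)   # 4 inputs
w = rng.standard_normal(4)   # n weights for the linear term
Q = rng.standard_normal((4, 4))  # n*n weights for the quadratic term
b = 0.1

print(linear_neuron(x, w, b))
print(quadratic_neuron(x, Q, w, b))
```

Note the parameter count: the quadratic term alone needs n*n weights per neuron versus n for the linear term, which already hints at why higher orders blow up quickly.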
--------------------------------------------------------------------------------------------
Update:
I did some searching and found a few relatively new papers that use quadratic neurons [1-4]. Some have even successfully incorporated them into convolutional layers and show a significant improvement in performance. However, they report needing a significantly larger number of parameters (which may be why I could not find anything of order higher than 2). So, I wonder:
- How would a combination of quadratic and linear neurons in each layer perform?
- Is there a different set of activation functions better suited to quadratic neurons?
[1] F. Fan, W. Cong and G. Wang, "A new type of neurons for machine learning," International Journal for Numerical Methods in Biomedical Engineering, vol. 34, no. 2, p. e2920, 2018.
[2] F. Fan, H. Shan, L. Gjesteby and G. Wang, "Quadratic neural networks for CT metal artifact reduction," in Developments in X-Ray Tomography XII, International Society for Optics and Photonics, vol. 11113, p. 111130, 2019.
[3] Y. Ganesh, R. P. Singh and G. Rama Murthy, "Pattern classification using quadratic neuron: An experimental study," 2017 8th International Conference on Computing, Communication and Networking Technologies (ICCCNT), pp. 1-6, 2017.
[4] P. Mantini and S. K. Shah, "CQNN: Convolutional Quadratic Neural Networks," 2020 25th International Conference on Pattern Recognition (ICPR), 2021, pp. 9819-9826, doi: 10.1109/ICPR48806.2021.9413207.
happy_guy_2015 t1_iqwnq5x wrote
In deep learning, neurons are not represented as a linear function. The output of a neuron is implemented by taking a linear combination of the inputs and then feeding that into a non-linear function, e.g. ReLU. The non-linearity is critical, because without it, you can't approximate non-linear functions well, even with deep networks.
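The collapse without a nonlinearity is easy to demonstrate: composing affine layers with no activation in between is exactly one affine map. A minimal NumPy sketch (shapes and values arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((5, 4))  # layer 1: 4 -> 5
b1 = rng.standard_normal(5)
W2 = rng.standard_normal((3, 5))  # layer 2: 5 -> 3
b2 = rng.standard_normal(3)
x = rng.standard_normal(4)

# Two "linear" layers with no activation in between...
deep = W2 @ (W1 @ x + b1) + b2

# ...are exactly equivalent to one layer with combined weights and bias.
W = W2 @ W1
b = W2 @ b1 + b2
shallow = W @ x + b
print(np.allclose(deep, shallow))  # the two outputs match

# With a ReLU in between, this collapse no longer holds in general,
# which is what lets depth buy expressive power.
deep_relu = W2 @ np.maximum(0.0, W1 @ x + b1) + b2
```

So however many linear layers you stack, you can only ever represent a single hyperplane fit; the pointwise nonlinearity is what breaks that equivalence.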