
abhitopia OP t1_iw6l77r wrote

Thanks for the response.

I have yet to read the work of Millidge, Tschantz, and Song in detail. I agree that this is not PC in the sense that came out of the neuroscience literature. So far I have only thoroughly read the Bogacz 2017 paper, and next on my list is "Can the Brain Do Backpropagation? Exact Implementation of Backpropagation in Predictive Coding Networks" (also co-authored by Bogacz).

>If you look at the equations more closely you find that it basically can not be any more efficient than backpropagation

The interesting bit for me is not the exact correspondence with PC (as described in the neuroscience literature) but rather the property that makes it suitable for asynchronous parallelisation, namely local synaptic plasticity, which I believe still holds. The problem with backprop is not efficiency; in fact it is highly efficient. I just cannot see how backprop-based systems can be scaled out or do online and continual learning.
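To make the local-plasticity point concrete, here is a minimal sketch (my own illustration, not code from any of the papers; the shapes, activation, and learning rate are arbitrary assumptions) of a PC-style weight update that only touches locally available quantities:

```python
# Minimal sketch of a predictive-coding-style weight update that uses only
# local quantities: the pre-synaptic activity and this layer's own prediction
# error. Shapes, activation, and learning rate are illustrative assumptions.
import torch

torch.manual_seed(0)

x_prev = torch.randn(128, 64)      # activity of the layer below (batch of 128)
x_curr = torch.randn(128, 32)      # value nodes of this layer
W = 0.01 * torch.randn(32, 64)     # weights of this layer only

mu = torch.tanh(x_prev) @ W.T      # local prediction of x_curr
eps = x_curr - mu                  # local prediction error

lr = 1e-3
# Hebbian-like update: no error signal from the layers above is required,
# so this layer never has to wait for a global backward pass.
W += lr * eps.T @ torch.tanh(x_prev)
```

In backprop, by contrast, the update for the same weights needs dL/dW, which only becomes available after the error has been propagated back from the output layer.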

>In the case of backpropagation "a" corresponds to backpropagated errors, and the dynamical update equation corresponds to the recursive equations which defines backpropagation. I.e. we are assigning "a" to the value of dL/dt, for a loss L. (it's a little more than this, but I'm drunk so I'll leave that to you to discern). If you look at the equations more closely you find that it basically can not be any more efficient than backpropagation because the error information still has to propagate backwards, albeit indirectly.

Can't we make a first-order approximation, like we do in any gradient descent algorithm? Again, emphasising that the issue is not only the speed of learning.
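For reference, in the Whittington & Bogacz style formulation the value-node dynamics look roughly like this (indexing and sign conventions vary between papers):

$$
\epsilon_l = x_l - W_l f(x_{l-1}), \qquad
\dot{x}_l = -\epsilon_l + f'(x_l) \odot \big( W_{l+1}^\top \epsilon_{l+1} \big)
$$

So the error from the layer above still enters layer l's update; it just travels backwards one layer per relaxation step rather than in a single explicit backward sweep, which I take to be the point you are making.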

I will certainly check out the paper by Robert Rosenbaum; thanks for sharing it. I will comment more once I have read it.

2

liukidar t1_iwc3rkb wrote

> The interesting bit for me is not the exact correspondence with PC (as described in the neuroscience literature) but rather the property that makes it suitable for asynchronous parallelisation, namely local synaptic plasticity, which I believe still holds

Indeed this still holds with all the definitions of PC out there (I guess that's why very different implementations such as FPA are still called PC). In theory, therefore, it is possible to parallelise all the computations across different layers.
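Continuing the sketch from the comment above: with the value nodes held fixed, each layer's weight update reads only its own prediction error and the activity of the layer below, and writes only its own weights, so the loop iterations have no data dependencies on each other (a hedged illustration with made-up names, not from any particular codebase):

```python
# Sketch: given fixed value nodes, each layer's weight update touches only
# that layer's weights, its own prediction error, and the activity of the
# layer below, so the iterations are independent of one another.
import torch

L = 4                                            # number of weight matrices (assumption)
dims = [64, 64, 64, 64, 10]
xs = [torch.randn(32, d) for d in dims]          # value nodes of each layer
Ws = [0.01 * torch.randn(dims[l + 1], dims[l]) for l in range(L)]

def local_update(l, lr=1e-3):
    pre = torch.tanh(xs[l])
    eps = xs[l + 1] - pre @ Ws[l].T              # local prediction error
    Ws[l] += lr * eps.T @ pre                    # writes only layer l's weights

for l in range(L):                               # sequential here, but order-independent
    local_update(l)
```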

However, it seems that deep learning frameworks such as PyTorch and JAX are not able to do this kind of parallelisation across layers on a single GPU (I would be very glad if someone who knows more about this would like to have a chat on the topic; maybe I'm lucky and some JAX/PyTorch/CUDA developers stumble upon this comment :P).
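For what it's worth, one thing that can be tried in PyTorch is issuing each layer's update on its own CUDA stream; whether the kernels actually overlap depends on kernel size and GPU occupancy, so this is a sketch of the attempt rather than a known solution (requires a CUDA device; names are illustrative):

```python
# Sketch: enqueue each layer's independent update on its own CUDA stream.
# This expresses the parallelism to the GPU, but it does not guarantee real
# concurrency: large kernels can already saturate the device, and small ones
# may still be effectively serialised by launch overhead.
import torch

assert torch.cuda.is_available()
dev = torch.device("cuda")

L = 4
dims = [512] * (L + 1)
xs = [torch.randn(256, d, device=dev) for d in dims]
Ws = [0.01 * torch.randn(dims[l + 1], dims[l], device=dev) for l in range(L)]
streams = [torch.cuda.Stream() for _ in range(L)]

torch.cuda.synchronize()                     # make sure the inputs are ready
for l in range(L):
    with torch.cuda.stream(streams[l]):      # layer l's kernels go on stream l
        pre = torch.tanh(xs[l])
        eps = xs[l + 1] - pre @ Ws[l].T
        Ws[l] += 1e-3 * eps.T @ pre
torch.cuda.synchronize()                     # wait for every stream to finish
```

As far as I know, JAX does not expose stream-level control directly, so there you would be relying on XLA's scheduler to overlap the independent operations.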

3

miguelstar98 t1_iwoeafv wrote

👍 Noted. I'll take a look at it when I get some free time. Although someone should probably make a Discord for this...

2

abhitopia OP t1_iwrtr60 wrote

It's a good idea. I am currently still reading the papers on the subject, but I can create a Discord if it's helpful.

1

liukidar t1_iwu5qgv wrote

Sounds good. I can provide more details about the issue in that case.

1