
liukidar t1_iwcnuo2 wrote

Hello. Thank you for your reply. I will go into the details as well, since I think we're building a good review of PC that may help many different kinds of people who are interested in the topic.

I think we should divide the literature into two sets: FPA PC and PC. All the papers we cited (Salvatori, Song, Millidge) indeed belong to FPA PC. The aim of those papers was basically to give a theoretical proof that PC is able to replicate BP in the brain (despite relying on a lot of assumptions about how this can be done).

However, note that the goal of the papers you have cited is to provide an equivalence or approximation between PC and BP, not to use PC with FPA as a general-purpose algorithm. In fact, the same authors have since released several papers that do NOT use FPA and are applied to different machine learning tasks. I believe that the original idea of creating a general library to run these experiments is more focused on applications, not on reimplementing the experiments that show equivalences and approximations of PC. Something interesting to replicate, still from the same authors, is the following: https://arxiv.org/pdf/2201.13180.pdf. And I am not aware of any library that has implemented something similar in an efficient way.

In relation to the accuracy, I'm not sure about what was reported by Kinghorn, but already in Whittington 2017 you can see that they get 98% accuracy on MNIST with standard PC, so the performance of PC on those tasks is not in doubt.

I agree there's a lack of evaluations on deeper and more complex architectures. However, here you can see an example of what the approach you called IL can do: https://arxiv.org/abs/2211.03481 .

3

liukidar t1_iwc3rkb wrote

> The interesting bit for me is not the exact correspondence with PC (as described in Neuroscience) but rather following properties that lend it suitable for asynchronous parallelisation is Local Synaptic Plasticity which I believe still holds good

Indeed, this still holds for all the definitions of PC out there (I guess that's why very different implementations such as FPA are still called PC). In theory, therefore, it is possible to parallelise all the computations across different layers.
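To make the locality point concrete, here is a minimal toy sketch (my own illustration, not the API of any existing library): each weight update needs only the local prediction error and the pre-synaptic activity, so no layer has to wait for a global backward pass.

```python
# Minimal sketch of a "local" PC weight update (toy example; a linear/identity
# activation is assumed for simplicity, and all names are hypothetical).
import jax.numpy as jnp

def local_weight_update(W_l, x_prev, eps_l, lr=1e-3):
    """dW depends only on the local error eps_l of layer l and the
    pre-synaptic activity x_prev of layer l-1."""
    return W_l + lr * jnp.outer(eps_l, x_prev)

# Since each call touches only per-layer quantities, in principle every
# layer's update could run at the same time.
```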

However, it seems that deep learning frameworks such as PyTorch and JAX are not able to perform this kind of parallelisation on a single GPU (I would be very glad if someone who knows more about this would like to have a chat on the topic; maybe I'm lucky and some JAX/PyTorch/CUDA developers stumble upon this comment :P).
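One workaround I can imagine, purely as a sketch and under the assumption that all hidden layers share the same width (so the per-layer tensors can be stacked), is to batch the per-layer updates with jax.vmap so that they run as one fused kernel instead of a Python loop launching one small kernel per layer:

```python
# Sketch only: stack per-layer weights/activities/errors and vmap the local
# update over the layer axis (assumes every layer has the same shape).
import jax
import jax.numpy as jnp

def layer_update(W, x_prev, eps, lr):
    # Same local rule as in the sketch above, for a single layer.
    return W + lr * jnp.outer(eps, x_prev)

batched_update = jax.vmap(layer_update, in_axes=(0, 0, 0, None))

Ws = jnp.zeros((4, 128, 128))   # (layers, post, pre) stacked weights
xs_prev = jnp.ones((4, 128))    # pre-synaptic activities per layer
epss = jnp.ones((4, 128))       # local prediction errors per layer
Ws = batched_update(Ws, xs_prev, epss, 1e-3)
```

This only fuses the arithmetic into one kernel, though; it doesn't give true asynchronous execution of differently-shaped layers, which is exactly the part I don't know how to get out of PyTorch/JAX on a single GPU.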

3

liukidar t1_iwc2tbm wrote

Hello. Since it may be relevant to the conversation, I'd like to point out that the work by Song doesn't use FPA (except here, where they mathematically prove the identity between FPA PC and BP): all the experimental results in his other papers are obtained via "normal" PC, where the prediction is updated at every iteration using gradient descent on the log joint probability (so, as far as my understanding of the theory is correct, this corresponds to MAP inference on a probabilistic model). I'm not 100% sure about which papers by Millidge do and which don't, but I'm quite confident that the majority don't (here, for example, the predictions seem to be updated at every iteration; however, in the paper cited by abhitopia they apparently do use FPA). Unfortunately, I'm not familiar with the work by Tschantz, so I cannot comment on that.
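To be concrete about what I mean by "normal" PC, here is a toy sketch of my own (not code from the cited papers): the latent states are relaxed at every iteration by gradient descent on the energy, i.e. the negative log joint of a hierarchical Gaussian model, so the converged states correspond to a MAP estimate.

```python
# Toy sketch of standard PC inference (illustrative names and shapes only).
import jax
import jax.numpy as jnp

def energy(xs, Ws, y):
    # xs = [x_0 (clamped input), x_1, ..., x_L]; y is the target.
    # Energy = sum of squared prediction errors = negative log joint of a
    # hierarchical Gaussian model (up to constants), with a soft output term.
    e = 0.0
    for l in range(1, len(xs)):
        eps = xs[l] - jnp.tanh(Ws[l - 1] @ xs[l - 1])  # local prediction error
        e += 0.5 * jnp.sum(eps ** 2)
    e += 0.5 * jnp.sum((y - xs[-1]) ** 2)              # output error
    return e

def inference_step(xs, Ws, y, lr=0.1):
    # Gradient descent on the energy w.r.t. the latent states only;
    # the input x_0 stays clamped, and weights are updated afterwards.
    grads = jax.grad(energy)(xs, Ws, y)
    return [xs[0]] + [x - lr * g for x, g in zip(xs[1:], grads[1:])]

# Tiny usage example with made-up sizes:
key = jax.random.PRNGKey(0)
Ws = [jax.random.normal(key, (16, 8)), jax.random.normal(key, (4, 16))]
xs = [jnp.ones(8), jnp.zeros(16), jnp.zeros(4)]
y = jnp.ones(4)
for _ in range(20):
    xs = inference_step(xs, Ws, y)
```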

1