Viewing a single comment thread. View all comments

abhitopia OP t1_ixdnki4 wrote

u/maizeq - I have finished reading the Rosenbaum paper . It is certainly very accessible and useful paper to understand the details and nuances between various PC implementations. So thank you for sharing that.

The objective of the author seems to compare various versions of the algorithm and highlight subtle difference and does a great job at it. It does not however exploit the local synaptic plasticity in its implementation (and uses loops) which is exactly where l think lies the limitation of Pytorch, Jax, and Tensorflow.

For instance, one could imagine each node and each weight in an PC (non FPA) MLP network as a standalone process communicating with other nodes and weights process only via message passing to run completely asynchronously. Furthermore, we can limit the amount of commputation by thresholding the value of error nodes (so weight updates for connected weight processes with happen) in a sense enforcing sparsity.

May be I am wrong, I do not (yet) see why in this simple MLP it should be be possible to add new nodes (in a hot fashion), for example, if the activity in any node increases by certain threshold then scale up automatically preserving 2% activity per layer.

Contrast this with GPU based backward passes, a lot of wasteful computation can be prevented. At the very least, Backward doesn't need to weight for FP in the EM like learning algorithm that PC is.

P.S. - My motivation isn't PC==BP, but rather can PC replace BP and is it worth it.