New Paper on Finite Size Effects in Spiking Neural Networks

Posted on 25 January 2013 by ccc1685

Michael Buice and I have finally published our paper entitled “Dynamic finite size effects in spiking neural networks” in PLoS Computational Biology (link here). Finishing this paper seemed like a Sisyphean ordeal, and it is only the first of a series of papers that we hope to eventually publish. This paper outlines a systematic perturbative formalism to compute fluctuations and correlations in a coupled network of a finite but large number of spiking neurons. The formalism borrows heavily from the kinetic theory of plasmas and statistical field theory and is similar to what we used in our previous work on the Kuramoto model (see here and here) and the “Spike model” (see here). Our heuristic paper on path integral methods is here. Some recent talks and summaries can be found here and here.

The main goal that this paper works towards is to systematically derive a generalization of Wilson-Cowan type mean field rate equations (see here for a review) to include the effects of fluctuations and correlations and even higher order cumulants. The germ of the idea for this quest came to me about a decade ago when I sat in an IBSC centre meeting headed by Jay McClelland. The mission of the grant was to build “biologically informed” models of cognition. Jay had gathered a worldwide group of his colleagues to work on a number of projects that tried to better connect connectionist models to biology. He invited me to join the group because of my work on binocular rivalry. What Jay and others had done was to show how models consisting of coupled rate neurons with simple learning rules like Backpropagation and Contrastive Hebbian Learning could impressively reproduce many human cognitive functions like classifying objects. However, it was not clear what neural mechanisms could implement such learning rules in the brain. Backpropagation is notoriously nonbiological because it requires that the weights be updated in a “backwards” direction to how the signal propagates. Contrastive Hebbian Learning seems more biological in that it alternates episodes of learning and “unlearning”, where weights are increased and decreased respectively. Spike-timing dependent plasticity (STDP), where weights increase or decrease depending on the precise timing of pre- and post-synaptic spikes, had recently been discovered and seemed like a good candidate learning mechanism. However, STDP requires information about spike times that rate models don’t have by design. In order to apply STDP, one had to simulate an entire spiking network. The connectionist rate models already strained the computational power available at the time, and spiking networks would require even more. What was needed, I thought, was a rate model that carried correlation information: a set of generalized activity equations. I imagined a self-consistent system where correlations would change the rate and the rate would influence correlations, as sketched below. It didn’t take me very long to realize that this was a difficult task and that I would need to develop new machinery to do it.
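Schematically (this is my shorthand for the goal, not an equation from any of the papers), a set of generalized activity equations would be a closed, self-consistent system for the mean activity $a_i$ and the covariance $C_{ij}$:

$$\frac{da_i}{dt} = f_i(a, C), \qquad \frac{dC_{ij}}{dt} = g_{ij}(a, C),$$

where ordinary Wilson-Cowan rate equations correspond to dropping the $C$ dependence, and the whole point of the formalism is to derive the functions $f$ and $g$ from an underlying spiking model rather than postulate them.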

At about the same time I was sitting in these meetings, I was also helping Eric Hildebrand finish his PhD. Eric was JD Crawford’s graduate student when JD sadly died before his time. He had already done some interesting work on bifurcations in the Kuramoto model, and I suggested that he finish his thesis by applying kinetic theory to the Kuramoto model to calculate finite size fluctuations, which was an open problem posed by Steve Strogatz and Rennie Mirollo. I came up with the kinetic theory idea when I visited Wulfram Gerstner at EPFL in 1997 and realized that plasma physics and neural dynamics have a lot in common. In the back of my mind, I knew that the kinetic theory technology Eric and I were developing would be useful for the generalized activity equations. However, the calculations were very unwieldy, so I put the project aside.

In 2005, I was invited by Steve Coombes and Gabriel Lord to attend the Mathematical Neuroscience meeting in Edinburgh. Jack Cowan was also invited, and he gave a talk on applying quantum field theory methods to neural dynamics. Jack had been on his own quest to generalize his Wilson-Cowan equations to a full probabilistic theory for decades. Michael Buice was his graduate student and was able to make Jack’s ideas more concrete by applying concepts from the theory of critical phenomena, in particular those of John Cardy. One of the great realizations in modern theoretical physics was that quantum field theory and statistical mechanics are actually very similar, and that the Feynman path integral and Feynman diagram methods used to calculate quantities in quantum field theory are equally adept at calculating quantities in statistical mechanics. In fact, the problems of renormalization in quantum field theory were only fully understood after Ken Wilson formalized the renormalization group method for critical phenomena using field theory methods. Both theories are about representing probability distributions in terms of “sums over states or histories” known as the path integral. Perturbation theory can then be performed using asymptotic methods such as the method of steepest descents. The method works because the Gaussian integral can be generalized to an arbitrary (even uncountably infinite) number of dimensions. The terms of the perturbative expansion also obey systematic rules that can be encapsulated in terms of diagrams. When I saw Jack’s talk, I realized right then that this was the formalism I needed. I then asked when Michael was graduating and recruited him to come to NIH.
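To give a flavor of why this works (these are textbook identities, not results from the paper), the one-dimensional Gaussian integral

$$\int_{-\infty}^{\infty} e^{-\frac{1}{2}ax^2 + Jx}\,dx = \sqrt{\frac{2\pi}{a}}\,e^{J^2/2a}$$

generalizes to $N$ dimensions as

$$\int e^{-\frac{1}{2}x^{\top}Ax + J^{\top}x}\,d^Nx = \sqrt{\frac{(2\pi)^N}{\det A}}\,e^{\frac{1}{2}J^{\top}A^{-1}J},$$

and survives the formal limit $N \to \infty$, which is the Gaussian path integral. Moments follow from differentiating with respect to the source $J$, and the bookkeeping of those derivatives in the perturbative expansion is exactly what Feynman diagrams encode.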

When Michael arrived in the fall of 2005, it took a while for us to learn how to communicate with each other. I was mostly steeped in dynamical systems culture, and he was a hard-core field theorist using a formalism that I wasn’t acquainted with. So I thought a good warm-up problem would be for him to turn Eric’s thesis into a paper. Eric had left for a job in industry before he finished his thesis and didn’t have the time to write a paper. Also, the calculations didn’t fit the simulations very well. Michael immediately understood what we were doing and found a missing factor, which made the fits beautiful. This led to our paper in Phys Rev Letters. Michael also realized that the kinetic theory formalism we were using could be recast in terms of path integrals. (In fact, any Klimontovich equation can be converted into a path integral using this formalism.) This made the calculations much more tractable. Using this formalism, he was able to show in our Phys Rev E paper that the asynchronous stationary state in the Kuramoto model, which is marginally stable in mean field theory, is stabilized by finite size fluctuations.
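For anyone who wants to see finite size fluctuations in the Kuramoto model concretely, here is a minimal simulation sketch (my illustration for this post, not the code behind any of the papers; the parameters are arbitrary). Below the synchronization threshold, mean field theory gives an order parameter of zero, while a finite network of $N$ oscillators fluctuates about it with amplitude of order $1/\sqrt{N}$:

```python
import numpy as np

def kuramoto_order_parameter(N=1000, K=1.0, T=200.0, dt=0.01, seed=0):
    """Euler integration of the Kuramoto model,
        dtheta_i/dt = omega_i + (K/N) * sum_j sin(theta_j - theta_i),
    returning the time series of the order parameter r(t) = |<exp(i*theta)>|.
    """
    rng = np.random.default_rng(seed)
    omega = rng.standard_cauchy(N)            # Lorentzian frequency distribution
    theta = rng.uniform(0.0, 2.0 * np.pi, N)  # random initial phases
    r = np.empty(int(T / dt))
    for t in range(len(r)):
        z = np.exp(1j * theta).mean()         # complex order parameter
        r[t] = np.abs(z)
        # the all-to-all coupling sum reduces exactly to K*r*sin(psi - theta_i)
        theta += dt * (omega + K * r[t] * np.sin(np.angle(z) - theta))
    return r

# K = 1 is below the critical coupling K_c = 2 for a unit-width Lorentzian,
# so r fluctuates around zero with amplitude ~ 1/sqrt(N).
for N in (100, 1000, 10000):
    r = kuramoto_order_parameter(N=N)
    print(N, np.sqrt(N) * r[len(r) // 2:].mean())  # roughly N-independent
```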

After the resounding success of the Kuramoto model, we turned our attention to neural networks. We wanted to apply our method to integrate-and-fire models but ran into some technical hurdles (they are probably not insurmountable – anyone interested?), so we switched our attention elsewhere. One of the things that I thought we could do was to explicitly write down a set of generalized activity equations for the “Spike model” that he and Jack had worked on. This is a simple model where a “neuron” spikes with a rate that depends on its inputs through a gain function and then decays at a fixed rate. The full probabilistic model can be written down as a giant intractable Master equation. In his PhD thesis, Michael wrote down the equivalent path integral representation for the Master equation. Our goal was to extract a set of self-consistent equations for the first two cumulants (i.e. the mean and covariance). Vipul Periwal, my colleague in the LBM who is a former string theorist, pointed us to the two-particle irreducible (2PI) effective action approach, which does exactly what we needed. The result is our paper in Neural Computation. I also suggested that he derive the equations two ways: the first using kinetic theory directly, and the second using the 2PI field theory methods. That way, people could better understand the formalism. This turned out to be a daunting task and required heroic calculations on Michael’s part. One of the issues was that Michael had defined mean field theory for his system to be one where the probability distribution of spike counts is exactly Poisson, which means that all the cumulants are equal to the mean. In order to expand around the Poisson state, one needs to construct the expansion in terms of factorial moments, which we called “normal ordered” cumulants in our paper. This was very painful. The field theory does this automatically through what is called a Doi-Peliti-Janssen transformation. The paper is nice, but in some sense it side-tracked us from our main goal, which was deriving the 2PI equations for a spiking neural network.
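To make the Spike model concrete, here is a hedged simulation sketch of a Master equation of this type (my own illustration: the gain function, weights, and rates below are placeholders, not the ones in the paper). Each unit carries a spike count that is incremented at a rate set by a gain function of its input and decremented at a fixed decay rate, which can be simulated exactly with the Gillespie algorithm:

```python
import numpy as np

def simulate_spike_model(w, f, alpha=1.0, T=50.0, seed=0):
    """Gillespie simulation of a birth-death Master equation of the
    Spike model type. Unit i holds a spike count n_i with two event types:
        birth:  n_i -> n_i + 1   at rate f((w @ n)_i)   (gain of the input)
        decay:  n_i -> n_i - 1   at rate alpha * n_i    (fixed decay rate)
    """
    rng = np.random.default_rng(seed)
    N = w.shape[0]
    n = np.zeros(N)
    t, times, traj = 0.0, [0.0], [n.copy()]
    while t < T:
        rates = np.concatenate([f(w @ n), alpha * n])  # births, then decays
        total = rates.sum()
        t += rng.exponential(1.0 / total)              # time to next event
        k = rng.choice(2 * N, p=rates / total)         # which event fires
        n[k % N] += 1.0 if k < N else -1.0
        times.append(t)
        traj.append(n.copy())
    return np.array(times), np.array(traj)

# Example: 5 units, weak random coupling, a saturating gain function.
rng = np.random.default_rng(1)
w = 0.1 * rng.random((5, 5))
times, traj = simulate_spike_model(w, f=lambda x: 2.0 / (1.0 + np.exp(-x)))
print(traj.mean(axis=0))  # empirical mean spike counts
```

Averaging many such runs gives the moments that the Master equation, and hence the path integral, describes.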

I don’t remember exactly what happened after that, but after a few failed attempts at other projects, we finally sat down in earnest to tackle the generalized activity equations for spiking neurons. We realized that instead of using integrate-and-fire neurons, which have a discontinuity, we should use the Theta neuron model with synaptic coupling. One of the side benefits of our project was that we fully worked out the correct mean field theory for this model. It involves two “fields”, one for the density of neurons at a particular phase and the other for the synaptic drive to the neurons. The resulting path integral representation also has two fields. Since our theory perturbs around mean field theory, a prerequisite for calculating any quantity is that we must first solve the mean field equations. This is actually quite difficult since the mean field equations form an integro-partial differential system. Closed form solutions can be obtained for the asynchronous state, and numerical solutions can be obtained in other regimes. Our PLoS Computational Biology paper considers the Theta model and an even simpler “phase” model. We explicitly compute the complete time-dependent second moment of the firing rate and synaptic drive, including transients. Unfortunately, the equations in the paper are not rendered perfectly, but they should be understandable. Don’t hesitate to ask me or Michael if you have any questions. This paper does not produce the 2PI equations for the Theta model, but it shows how that can be achieved.
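For orientation, the Theta model with synaptic coupling has the standard form (a schematic reconstruction; the precise equations and notation in the paper may differ)

$$\dot\theta_i = 1 - \cos\theta_i + (1 + \cos\theta_i)\,(\eta_i + u(t)),$$

where neuron $i$ fires when its phase $\theta_i$ crosses $\pi$, $\eta_i$ is its external drive, and $u$ is the synaptic drive, which relaxes at a fixed rate and is kicked by the population’s spikes. In the $N \to \infty$ limit the network is described by a density $\rho(\theta, \eta, t)$ obeying the continuity equation

$$\partial_t \rho + \partial_\theta \Big[\big(1 - \cos\theta + (1 + \cos\theta)(\eta + u)\big)\,\rho\Big] = 0,$$

coupled self-consistently to $u$ through the flux of neurons across $\theta = \pi$. These are the two mean field “fields”, and this coupled integro-partial differential system is what must be solved before the perturbation expansion can be applied.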