-
Order from chaos: Interplay of development and learning in recurrent networks of structured neurons
Authors:
Laura Kriener,
Kristin Völk,
Ben von Hünerbein,
Federico Benitez,
Walter Senn,
Mihai A. Petrovici
Abstract:
Behavior can be described as a temporal sequence of actions driven by neural activity. To learn complex sequential patterns in neural networks, memories of past activities need to persist on significantly longer timescales than relaxation times of single-neuron activity. While recurrent networks can produce such long transients, training these networks in a biologically plausible way is challengin…
▽ More
Behavior can be described as a temporal sequence of actions driven by neural activity. To learn complex sequential patterns in neural networks, memories of past activities need to persist on significantly longer timescales than relaxation times of single-neuron activity. While recurrent networks can produce such long transients, training these networks in a biologically plausible way is challenging. One approach has been reservoir computing, where only weights from a recurrent network to a readout are learned. Other models achieve learning of recurrent synaptic weights using propagated errors. However, their biological plausibility typically suffers from issues with locality, resource allocation or parameter scales and tuning. We suggest that many of these issues can be alleviated by considering dendritic information storage and computation. By applying a fully local, always-on plasticity rule we are able to learn complex sequences in a recurrent network comprised of two populations. Importantly, our model is resource-efficient, enabling the learning of complex sequences using only a small number of neurons. We demonstrate these features in a mock-up of birdsong learning, in which our networks first learn a long, non-Markovian sequence that they can then reproduce robustly despite external disturbances.
△ Less
Submitted 26 February, 2024;
originally announced February 2024.
-
Confidence and second-order errors in cortical circuits
Authors:
Arno Granier,
Mihai A. Petrovici,
Walter Senn,
Katharina A. Wilmes
Abstract:
Minimization of cortical prediction errors has been considered a key computational goal of the cerebral cortex underlying perception, action and learning. However, it is still unclear how the cortex should form and use information about uncertainty in this process. Here, we formally derive neural dynamics that minimize prediction errors under the assumption that cortical areas must not only predic…
▽ More
Minimization of cortical prediction errors has been considered a key computational goal of the cerebral cortex underlying perception, action and learning. However, it is still unclear how the cortex should form and use information about uncertainty in this process. Here, we formally derive neural dynamics that minimize prediction errors under the assumption that cortical areas must not only predict the activity in other areas and sensory streams but also jointly project their confidence (inverse expected uncertainty) in their predictions. In the resulting neuronal dynamics, the integration of bottom-up and top-down cortical streams is dynamically modulated based on confidence in accordance with the Bayesian principle. Moreover, the theory predicts the existence of cortical second-order errors, comparing confidence and actual performance. These errors are propagated through the cortical hierarchy alongside classical prediction errors and are used to learn the weights of synapses responsible for formulating confidence. We propose a detailed mapping of the theory to cortical circuitry, discuss entailed functional interpretations and provide potential directions for experimental work.
△ Less
Submitted 26 March, 2024; v1 submitted 27 September, 2023;
originally announced September 2023.
-
Learning beyond sensations: how dreams organize neuronal representations
Authors:
Nicolas Deperrois,
Mihai A. Petrovici,
Walter Senn,
Jakob Jordan
Abstract:
Semantic representations in higher sensory cortices form the basis for robust, yet flexible behavior. These representations are acquired over the course of development in an unsupervised fashion and continuously maintained over an organism's lifespan. Predictive learning theories propose that these representations emerge from predicting or reconstructing sensory inputs. However, brains are known t…
▽ More
Semantic representations in higher sensory cortices form the basis for robust, yet flexible behavior. These representations are acquired over the course of development in an unsupervised fashion and continuously maintained over an organism's lifespan. Predictive learning theories propose that these representations emerge from predicting or reconstructing sensory inputs. However, brains are known to generate virtual experiences, such as during imagination and dreaming, that go beyond previously experienced inputs. Here, we suggest that virtual experiences may be just as relevant as actual sensory inputs in shaping cortical representations. In particular, we discuss two complementary learning principles that organize representations through the generation of virtual experiences. First, "adversarial dreaming" proposes that creative dreams support a cortical implementation of adversarial learning in which feedback and feedforward pathways engage in a productive game of trying to fool each other. Second, "contrastive dreaming" proposes that the invariance of neuronal representations to irrelevant factors of variation is acquired by trying to map similar virtual experiences together via a contrastive learning process. These principles are compatible with known cortical structure and dynamics and the phenomenology of sleep thus providing promising directions to explain cortical learning beyond the classical predictive learning paradigm.
△ Less
Submitted 5 December, 2023; v1 submitted 3 August, 2023;
originally announced August 2023.
-
Learning efficient backprojections across cortical hierarchies in real time
Authors:
Kevin Max,
Laura Kriener,
Garibaldi Pineda García,
Thomas Nowotny,
Ismael Jaras,
Walter Senn,
Mihai A. Petrovici
Abstract:
Models of sensory processing and learning in the cortex need to efficiently assign credit to synapses in all areas. In deep learning, a known solution is error backpropagation, which however requires biologically implausible weight transport from feed-forward to feedback paths.
We introduce Phaseless Alignment Learning (PAL), a bio-plausible method to learn efficient feedback weights in layered…
▽ More
Models of sensory processing and learning in the cortex need to efficiently assign credit to synapses in all areas. In deep learning, a known solution is error backpropagation, which however requires biologically implausible weight transport from feed-forward to feedback paths.
We introduce Phaseless Alignment Learning (PAL), a bio-plausible method to learn efficient feedback weights in layered cortical hierarchies. This is achieved by exploiting the noise naturally found in biophysical systems as an additional carrier of information. In our dynamical system, all weights are learned simultaneously with always-on plasticity and using only information locally available to the synapses. Our method is completely phase-free (no forward and backward passes or phased learning) and allows for efficient error propagation across multi-layer cortical hierarchies, while maintaining biologically plausible signal transport and learning.
Our method is applicable to a wide class of models and improves on previously known biologically plausible ways of credit assignment: compared to random synaptic feedback, it can solve complex tasks with less neurons and learn more useful latent representations. We demonstrate this on various classification tasks using a cortical microcircuit model with prospective coding.
△ Less
Submitted 2 February, 2024; v1 submitted 20 December, 2022;
originally announced December 2022.
-
Latent Equilibrium: A unified learning theory for arbitrarily fast computation with arbitrarily slow neurons
Authors:
Paul Haider,
Benjamin Ellenberger,
Laura Kriener,
Jakob Jordan,
Walter Senn,
Mihai A. Petrovici
Abstract:
The response time of physical computational elements is finite, and neurons are no exception. In hierarchical models of cortical networks each layer thus introduces a response lag. This inherent property of physical dynamical systems results in delayed processing of stimuli and causes a timing mismatch between network output and instructive signals, thus afflicting not only inference, but also lea…
▽ More
The response time of physical computational elements is finite, and neurons are no exception. In hierarchical models of cortical networks each layer thus introduces a response lag. This inherent property of physical dynamical systems results in delayed processing of stimuli and causes a timing mismatch between network output and instructive signals, thus afflicting not only inference, but also learning. We introduce Latent Equilibrium, a new framework for inference and learning in networks of slow components which avoids these issues by harnessing the ability of biological neurons to phase-advance their output with respect to their membrane potential. This principle enables quasi-instantaneous inference independent of network depth and avoids the need for phased plasticity or computationally expensive network relaxation phases. We jointly derive disentangled neuron and synapse dynamics from a prospective energy function that depends on a network's generalized position and momentum. The resulting model can be interpreted as a biologically plausible approximation of error backpropagation in deep cortical networks with continuous-time, leaky neuronal dynamics and continuously active, local plasticity. We demonstrate successful learning of standard benchmark datasets, achieving competitive performance using both fully-connected and convolutional architectures, and show how our principle can be applied to detailed models of cortical microcircuitry. Furthermore, we study the robustness of our model to spatio-temporal substrate imperfections to demonstrate its feasibility for physical realization, be it in vivo or in silico.
△ Less
Submitted 27 October, 2021;
originally announced October 2021.
-
Learning cortical representations through perturbed and adversarial dreaming
Authors:
Nicolas Deperrois,
Mihai A. Petrovici,
Walter Senn,
Jakob Jordan
Abstract:
Humans and other animals learn to extract general concepts from sensory experience without extensive teaching. This ability is thought to be facilitated by offline states like sleep where previous experiences are systemically replayed. However, the characteristic creative nature of dreams suggests that learning semantic representations may go beyond merely replaying previous experiences. We suppor…
▽ More
Humans and other animals learn to extract general concepts from sensory experience without extensive teaching. This ability is thought to be facilitated by offline states like sleep where previous experiences are systemically replayed. However, the characteristic creative nature of dreams suggests that learning semantic representations may go beyond merely replaying previous experiences. We support this hypothesis by implementing a cortical architecture inspired by generative adversarial networks (GANs). Learning in our model is organized across three different global brain states mimicking wakefulness, NREM and REM sleep, optimizing different, but complementary objective functions. We train the model on standard datasets of natural images and evaluate the quality of the learned representations. Our results suggest that generating new, virtual sensory inputs via adversarial dreaming during REM sleep is essential for extracting semantic concepts, while replaying episodic memories via perturbed dreaming during NREM sleep improves the robustness of latent representations. The model provides a new computational perspective on sleep states, memory replay and dreams and suggests a cortical implementation of GANs.
△ Less
Submitted 18 February, 2022; v1 submitted 9 September, 2021;
originally announced September 2021.
-
Evolving Neuronal Plasticity Rules using Cartesian Genetic Programming
Authors:
Henrik D. Mettler,
Maximilian Schmidt,
Walter Senn,
Mihai A. Petrovici,
Jakob Jordan
Abstract:
We formulate the search for phenomenological models of synaptic plasticity as an optimization problem. We employ Cartesian genetic programming to evolve biologically plausible human-interpretable plasticity rules that allow a given network to successfully solve tasks from specific task families. While our evolving-to-learn approach can be applied to various learning paradigms, here we illustrate i…
▽ More
We formulate the search for phenomenological models of synaptic plasticity as an optimization problem. We employ Cartesian genetic programming to evolve biologically plausible human-interpretable plasticity rules that allow a given network to successfully solve tasks from specific task families. While our evolving-to-learn approach can be applied to various learning paradigms, here we illustrate its power by evolving plasticity rules that allow a network to efficiently determine the first principal component of its input distribution. We demonstrate that the evolved rules perform competitively with known hand-designed solutions. We explore how the statistical properties of the datasets used during the evolutionary search influences the form of the plasticity rules and discover new rules which are adapted to the structure of the corresponding datasets.
△ Less
Submitted 8 February, 2021;
originally announced February 2021.
-
Natural-gradient learning for spiking neurons
Authors:
Elena Kreutzer,
Walter M. Senn,
Mihai A. Petrovici
Abstract:
In many normative theories of synaptic plasticity, weight updates implicitly depend on the chosen parametrization of the weights. This problem relates, for example, to neuronal morphology: synapses which are functionally equivalent in terms of their impact on somatic firing can differ substantially in spine size due to their different positions along the dendritic tree. Classical theories based on…
▽ More
In many normative theories of synaptic plasticity, weight updates implicitly depend on the chosen parametrization of the weights. This problem relates, for example, to neuronal morphology: synapses which are functionally equivalent in terms of their impact on somatic firing can differ substantially in spine size due to their different positions along the dendritic tree. Classical theories based on Euclidean gradient descent can easily lead to inconsistencies due to such parametrization dependence. The issues are solved in the framework of Riemannian geometry, in which we propose that plasticity instead follows natural gradient descent. Under this hypothesis, we derive a synaptic learning rule for spiking neurons that couples functional efficiency with the explanation of several well-documented biological phenomena such as dendritic democracy, multiplicative scaling and heterosynaptic plasticity. We therefore suggest that in its search for functional synaptic plasticity, evolution might have come up with its own version of natural gradient descent.
△ Less
Submitted 23 February, 2022; v1 submitted 23 November, 2020;
originally announced November 2020.
-
Cortical oscillations implement a backbone for sampling-based computation in spiking neural networks
Authors:
Agnes Korcsak-Gorzo,
Michael G. Müller,
Andreas Baumbach,
Luziwei Leng,
Oliver Julien Breitwieser,
Sacha J. van Albada,
Walter Senn,
Karlheinz Meier,
Robert Legenstein,
Mihai A. Petrovici
Abstract:
Being permanently confronted with an uncertain world, brains have faced evolutionary pressure to represent this uncertainty in order to respond appropriately. Often, this requires visiting multiple interpretations of the available information or multiple solutions to an encountered problem. This gives rise to the so-called mixing problem: since all of these "valid" states represent powerful attrac…
▽ More
Being permanently confronted with an uncertain world, brains have faced evolutionary pressure to represent this uncertainty in order to respond appropriately. Often, this requires visiting multiple interpretations of the available information or multiple solutions to an encountered problem. This gives rise to the so-called mixing problem: since all of these "valid" states represent powerful attractors, but between themselves can be very dissimilar, switching between such states can be difficult. We propose that cortical oscillations can be effectively used to overcome this challenge. By acting as an effective temperature, background spiking activity modulates exploration. Rhythmic changes induced by cortical oscillations can then be interpreted as a form of simulated tempering. We provide a rigorous mathematical discussion of this link and study some of its phenomenological implications in computer simulations. This identifies a new computational role of cortical oscillations and connects them to various phenomena in the brain, such as sampling-based probabilistic inference, memory replay, multisensory cue combination, and place cell flickering.
△ Less
Submitted 4 April, 2022; v1 submitted 19 June, 2020;
originally announced June 2020.
-
Versatile emulation of spiking neural networks on an accelerated neuromorphic substrate
Authors:
Sebastian Billaudelle,
Yannik Stradmann,
Korbinian Schreiber,
Benjamin Cramer,
Andreas Baumbach,
Dominik Dold,
Julian Göltz,
Akos F. Kungl,
Timo C. Wunderlich,
Andreas Hartel,
Eric Müller,
Oliver Breitwieser,
Christian Mauch,
Mitja Kleider,
Andreas Grübl,
David Stöckel,
Christian Pehle,
Arthur Heimbrecht,
Philipp Spilger,
Gerd Kiene,
Vitali Karasenko,
Walter Senn,
Mihai A. Petrovici,
Johannes Schemmel,
Karlheinz Meier
Abstract:
We present first experimental results on the novel BrainScaleS-2 neuromorphic architecture based on an analog neuro-synaptic core and augmented by embedded microprocessors for complex plasticity and experiment control. The high acceleration factor of 1000 compared to biological dynamics enables the execution of computationally expensive tasks, by allowing the fast emulation of long-duration experi…
▽ More
We present first experimental results on the novel BrainScaleS-2 neuromorphic architecture based on an analog neuro-synaptic core and augmented by embedded microprocessors for complex plasticity and experiment control. The high acceleration factor of 1000 compared to biological dynamics enables the execution of computationally expensive tasks, by allowing the fast emulation of long-duration experiments or rapid iteration over many consecutive trials. The flexibility of our architecture is demonstrated in a suite of five distinct experiments, which emphasize different aspects of the BrainScaleS-2 system.
△ Less
Submitted 9 May, 2022; v1 submitted 30 December, 2019;
originally announced December 2019.
-
Fast and energy-efficient neuromorphic deep learning with first-spike times
Authors:
Julian Göltz,
Laura Kriener,
Andreas Baumbach,
Sebastian Billaudelle,
Oliver Breitwieser,
Benjamin Cramer,
Dominik Dold,
Akos Ferenc Kungl,
Walter Senn,
Johannes Schemmel,
Karlheinz Meier,
Mihai Alexandru Petrovici
Abstract:
For a biological agent operating under environmental pressure, energy consumption and reaction times are of critical importance. Similarly, engineered systems are optimized for short time-to-solution and low energy-to-solution characteristics. At the level of neuronal implementation, this implies achieving the desired results with as few and as early spikes as possible. With time-to-first-spike co…
▽ More
For a biological agent operating under environmental pressure, energy consumption and reaction times are of critical importance. Similarly, engineered systems are optimized for short time-to-solution and low energy-to-solution characteristics. At the level of neuronal implementation, this implies achieving the desired results with as few and as early spikes as possible. With time-to-first-spike coding both of these goals are inherently emerging features of learning. Here, we describe a rigorous derivation of a learning rule for such first-spike times in networks of leaky integrate-and-fire neurons, relying solely on input and output spike times, and show how this mechanism can implement error backpropagation in hierarchical spiking networks. Furthermore, we emulate our framework on the BrainScaleS-2 neuromorphic system and demonstrate its capability of harnessing the system's speed and energy characteristics. Finally, we examine how our approach generalizes to other neuromorphic platforms by studying how its performance is affected by typical distortive effects induced by neuromorphic substrates.
△ Less
Submitted 17 May, 2021; v1 submitted 24 December, 2019;
originally announced December 2019.
-
Ghost Units Yield Biologically Plausible Backprop in Deep Neural Networks
Authors:
Thomas Mesnard,
Gaetan Vignoud,
Joao Sacramento,
Walter Senn,
Yoshua Bengio
Abstract:
In the past few years, deep learning has transformed artificial intelligence research and led to impressive performance in various difficult tasks. However, it is still unclear how the brain can perform credit assignment across many areas as efficiently as backpropagation does in deep neural networks. In this paper, we introduce a model that relies on a new role for a neuronal inhibitory machinery…
▽ More
In the past few years, deep learning has transformed artificial intelligence research and led to impressive performance in various difficult tasks. However, it is still unclear how the brain can perform credit assignment across many areas as efficiently as backpropagation does in deep neural networks. In this paper, we introduce a model that relies on a new role for a neuronal inhibitory machinery, referred to as ghost units. By cancelling the feedback coming from the upper layer when no target signal is provided to the top layer, the ghost units enables the network to backpropagate errors and do efficient credit assignment in deep structures. While considering one-compartment neurons and requiring very few biological assumptions, it is able to approximate the error gradient and achieve good performance on classification tasks. Error backpropagation occurs through the recurrent dynamics of the network and thanks to biologically plausible local learning rules. In particular, it does not require separate feedforward and feedback circuits. Different mechanisms for cancelling the feedback were studied, ranging from complete duplication of the connectivity by long term processes to online replication of the feedback activity. This reduced system combines the essential elements to have a working biologically abstracted analogue of backpropagation with a simple formulation and proofs of the associated results. Therefore, this model is a step towards understanding how learning and memory are implemented in cortical multilayer structures, but it also raises interesting perspectives for neuromorphic hardware.
△ Less
Submitted 15 November, 2019;
originally announced November 2019.
-
Dendritic cortical microcircuits approximate the backpropagation algorithm
Authors:
João Sacramento,
Rui Ponte Costa,
Yoshua Bengio,
Walter Senn
Abstract:
Deep learning has seen remarkable developments over the last years, many of them inspired by neuroscience. However, the main learning mechanism behind these advances - error backpropagation - appears to be at odds with neurobiology. Here, we introduce a multilayer neuronal network model with simplified dendritic compartments in which error-driven synaptic plasticity adapts the network towards a gl…
▽ More
Deep learning has seen remarkable developments over the last years, many of them inspired by neuroscience. However, the main learning mechanism behind these advances - error backpropagation - appears to be at odds with neurobiology. Here, we introduce a multilayer neuronal network model with simplified dendritic compartments in which error-driven synaptic plasticity adapts the network towards a global desired output. In contrast to previous work our model does not require separate phases and synaptic learning is driven by local dendritic prediction errors continuously in time. Such errors originate at apical dendrites and occur due to a mismatch between predictive input from lateral interneurons and activity from actual top-down feedback. Through the use of simple dendritic compartments and different cell-types our model can represent both error and normal activity within a pyramidal neuron. We demonstrate the learning capabilities of the model in regression and classification tasks, and show analytically that it approximates the error backpropagation algorithm. Moreover, our framework is consistent with recent observations of learning between brain areas and the architecture of cortical microcircuits. Overall, we introduce a novel view of learning on dendritic cortical circuits and on how the brain may solve the long-standing synaptic credit assignment problem.
△ Less
Submitted 26 October, 2018;
originally announced October 2018.
-
Stochasticity from function -- why the Bayesian brain may need no noise
Authors:
Dominik Dold,
Ilja Bytschok,
Akos F. Kungl,
Andreas Baumbach,
Oliver Breitwieser,
Walter Senn,
Johannes Schemmel,
Karlheinz Meier,
Mihai A. Petrovici
Abstract:
An increasing body of evidence suggests that the trial-to-trial variability of spiking activity in the brain is not mere noise, but rather the reflection of a sampling-based encoding scheme for probabilistic computing. Since the precise statistical properties of neural activity are important in this context, many models assume an ad-hoc source of well-behaved, explicit noise, either on the input o…
▽ More
An increasing body of evidence suggests that the trial-to-trial variability of spiking activity in the brain is not mere noise, but rather the reflection of a sampling-based encoding scheme for probabilistic computing. Since the precise statistical properties of neural activity are important in this context, many models assume an ad-hoc source of well-behaved, explicit noise, either on the input or on the output side of single neuron dynamics, most often assuming an independent Poisson process in either case. However, these assumptions are somewhat problematic: neighboring neurons tend to share receptive fields, rendering both their input and their output correlated; at the same time, neurons are known to behave largely deterministically, as a function of their membrane potential and conductance. We suggest that spiking neural networks may, in fact, have no need for noise to perform sampling-based Bayesian inference. We study analytically the effect of auto- and cross-correlations in functionally Bayesian spiking networks and demonstrate how their effect translates to synaptic interaction strengths, rendering them controllable through synaptic plasticity. This allows even small ensembles of interconnected deterministic spiking networks to simultaneously and co-dependently shape their output activity through learning, enabling them to perform complex Bayesian computation without any need for noise, which we demonstrate in silico, both in classical simulation and in neuromorphic emulation. These results close a gap between the abstract models and the biology of functionally Bayesian spiking networks, effectively reducing the architectural constraints imposed on physical neural substrates required to perform probabilistic computing, be they biological or artificial.
△ Less
Submitted 24 August, 2019; v1 submitted 21 September, 2018;
originally announced September 2018.
-
Dendritic error backpropagation in deep cortical microcircuits
Authors:
João Sacramento,
Rui Ponte Costa,
Yoshua Bengio,
Walter Senn
Abstract:
Animal behaviour depends on learning to associate sensory stimuli with the desired motor command. Understanding how the brain orchestrates the necessary synaptic modifications across different brain areas has remained a longstanding puzzle. Here, we introduce a multi-area neuronal network model in which synaptic plasticity continuously adapts the network towards a global desired output. In this mo…
▽ More
Animal behaviour depends on learning to associate sensory stimuli with the desired motor command. Understanding how the brain orchestrates the necessary synaptic modifications across different brain areas has remained a longstanding puzzle. Here, we introduce a multi-area neuronal network model in which synaptic plasticity continuously adapts the network towards a global desired output. In this model synaptic learning is driven by a local dendritic prediction error that arises from a failure to predict the top-down input given the bottom-up activities. Such errors occur at apical dendrites of pyramidal neurons where both long-range excitatory feedback and local inhibitory predictions are integrated. When local inhibition fails to match excitatory feedback an error occurs which triggers plasticity at bottom-up synapses at basal dendrites of the same pyramidal neurons. We demonstrate the learning capabilities of the model in a number of tasks and show that it approximates the classical error backpropagation algorithm. Finally, complementing this cortical circuit with a disinhibitory mechanism enables attention-like stimulus denoising and generation. Our framework makes several experimental predictions on the function of dendritic integration and cortical microcircuits, is consistent with recent observations of cross-area learning, and suggests a biological implementation of deep learning.
△ Less
Submitted 29 December, 2017;
originally announced January 2018.
-
Spiking neurons with short-term synaptic plasticity form superior generative networks
Authors:
Luziwei Leng,
Roman Martel,
Oliver Breitwieser,
Ilja Bytschok,
Walter Senn,
Johannes Schemmel,
Karlheinz Meier,
Mihai A. Petrovici
Abstract:
Spiking networks that perform probabilistic inference have been proposed both as models of cortical computation and as candidates for solving problems in machine learning. However, the evidence for spike-based computation being in any way superior to non-spiking alternatives remains scarce. We propose that short-term plasticity can provide spiking networks with distinct computational advantages co…
▽ More
Spiking networks that perform probabilistic inference have been proposed both as models of cortical computation and as candidates for solving problems in machine learning. However, the evidence for spike-based computation being in any way superior to non-spiking alternatives remains scarce. We propose that short-term plasticity can provide spiking networks with distinct computational advantages compared to their classical counterparts. In this work, we use networks of leaky integrate-and-fire neurons that are trained to perform both discriminative and generative tasks in their forward and backward information processing paths, respectively. During training, the energy landscape associated with their dynamics becomes highly diverse, with deep attractor basins separated by high barriers. Classical algorithms solve this problem by employing various tempering techniques, which are both computationally demanding and require global state updates. We demonstrate how similar results can be achieved in spiking networks endowed with local short-term synaptic plasticity. Additionally, we discuss how these networks can even outperform tempering-based approaches when the training data is imbalanced. We thereby show how biologically inspired, local, spike-triggered synaptic dynamics based simply on a limited pool of synaptic resources can allow spiking networks to outperform their non-spiking relatives.
△ Less
Submitted 10 October, 2017; v1 submitted 24 September, 2017;
originally announced September 2017.
-
Feedforward Initialization for Fast Inference of Deep Generative Networks is biologically plausible
Authors:
Yoshua Bengio,
Benjamin Scellier,
Olexa Bilaniuk,
Joao Sacramento,
Walter Senn
Abstract:
We consider deep multi-layered generative models such as Boltzmann machines or Hopfield nets in which computation (which implements inference) is both recurrent and stochastic, but where the recurrence is not to model sequential structure, only to perform computation. We find conditions under which a simple feedforward computation is a very good initialization for inference, after the input units…
▽ More
We consider deep multi-layered generative models such as Boltzmann machines or Hopfield nets in which computation (which implements inference) is both recurrent and stochastic, but where the recurrence is not to model sequential structure, only to perform computation. We find conditions under which a simple feedforward computation is a very good initialization for inference, after the input units are clamped to observed values. It means that after the feedforward initialization, the recurrent network is very close to a fixed point of the network dynamics, where the energy gradient is 0. The main condition is that consecutive layers form a good auto-encoder, or more generally that different groups of inputs into the unit (in particular, bottom-up inputs on one hand, top-down inputs on the other hand) are consistent with each other, producing the same contribution into the total weighted sum of inputs. In biological terms, this would correspond to having each dendritic branch correctly predicting the aggregate input from all the dendritic branches, i.e., the soma potential. This is consistent with the prediction that the synaptic weights into dendritic branches such as those of the apical and basal dendrites of pyramidal cells are trained to minimize the prediction error made by the dendritic branch when the target is the somatic activity. Whereas previous work has shown how to achieve fast negative phase inference (when the model is unclamped) in a predictive recurrent model, this contribution helps to achieve fast positive phase inference (when the target output is clamped) in such recurrent neural models.
△ Less
Submitted 27 June, 2016; v1 submitted 6 June, 2016;
originally announced June 2016.