subscribe to arXiv mailings

arXiv:2407.10828 [pdf]

Towards Enhanced Classification of Abnormal Lung sound in Multi-breath: A Light Weight Multi-label and Multi-head Attention Classification Method

Authors: Yi-Wei Chua, Yun-Chien Cheng

Abstract: This study aims to develop an auxiliary diagnostic system for classifying abnormal lung respiratory sounds, enhancing the accuracy of automatic abnormal breath sound classification through an innovative multi-label learning approach and multi-head attention mechanism. Addressing the issue of class imbalance and lack of diversity in existing respiratory sound datasets, our study employs a lightweig… ▽ More This study aims to develop an auxiliary diagnostic system for classifying abnormal lung respiratory sounds, enhancing the accuracy of automatic abnormal breath sound classification through an innovative multi-label learning approach and multi-head attention mechanism. Addressing the issue of class imbalance and lack of diversity in existing respiratory sound datasets, our study employs a lightweight and highly accurate model, using a two-dimensional label set to represent multiple respiratory sound characteristics. Our method achieved a 59.2% ICBHI score in the four-category task on the ICBHI2017 dataset, demonstrating its advantages in terms of lightweight and high accuracy. This study not only improves the accuracy of automatic diagnosis of lung respiratory sound abnormalities but also opens new possibilities for clinical applications. △ Less

Submitted 15 July, 2024; originally announced July 2024.

arXiv:2407.06939 [pdf, other]

Towards Open-World Mobile Manipulation in Homes: Lessons from the Neurips 2023 HomeRobot Open Vocabulary Mobile Manipulation Challenge

Authors: Sriram Yenamandra, Arun Ramachandran, Mukul Khanna, Karmesh Yadav, Jay Vakil, Andrew Melnik, Michael Büttner, Leon Harz, Lyon Brown, Gora Chand Nandi, Arjun PS, Gaurav Kumar Yadav, Rahul Kala, Robert Haschke, Yang Luo, Jinxin Zhu, Yansen Han, Bingyi Lu, Xuan Gu, Qinyuan Liu, Yaping Zhao, Qiting Ye, Chenxiao Dou, Yansong Chua, Volodymyr Kuzma , et al. (20 additional authors not shown)

Abstract: In order to develop robots that can effectively serve as versatile and capable home assistants, it is crucial for them to reliably perceive and interact with a wide variety of objects across diverse environments. To this end, we proposed Open Vocabulary Mobile Manipulation as a key benchmark task for robotics: finding any object in a novel environment and placing it on any receptacle surface withi… ▽ More In order to develop robots that can effectively serve as versatile and capable home assistants, it is crucial for them to reliably perceive and interact with a wide variety of objects across diverse environments. To this end, we proposed Open Vocabulary Mobile Manipulation as a key benchmark task for robotics: finding any object in a novel environment and placing it on any receptacle surface within that environment. We organized a NeurIPS 2023 competition featuring both simulation and real-world components to evaluate solutions to this task. Our baselines on the most challenging version of this task, using real perception in simulation, achieved only an 0.8% success rate; by the end of the competition, the best participants achieved an 10.8\% success rate, a 13x improvement. We observed that the most successful teams employed a variety of methods, yet two common threads emerged among the best solutions: enhancing error detection and recovery, and improving the integration of perception with decision-making processes. In this paper, we detail the results and methodologies used, both in simulation and real-world settings. We discuss the lessons learned and their implications for future research. Additionally, we compare performance in real and simulated environments, emphasizing the necessity for robust generalization to novel settings. △ Less

Submitted 9 July, 2024; originally announced July 2024.

arXiv:2309.03641 [pdf, other]

Spiking Structured State Space Model for Monaural Speech Enhancement

Authors: Yu Du, Xu Liu, Yansong Chua

Abstract: Speech enhancement seeks to extract clean speech from noisy signals. Traditional deep learning methods face two challenges: efficiently using information in long speech sequences and high computational costs. To address these, we introduce the Spiking Structured State Space Model (Spiking-S4). This approach merges the energy efficiency of Spiking Neural Networks (SNN) with the long-range sequence… ▽ More Speech enhancement seeks to extract clean speech from noisy signals. Traditional deep learning methods face two challenges: efficiently using information in long speech sequences and high computational costs. To address these, we introduce the Spiking Structured State Space Model (Spiking-S4). This approach merges the energy efficiency of Spiking Neural Networks (SNN) with the long-range sequence modeling capabilities of Structured State Space Models (S4), offering a compelling solution. Evaluation on the DNS Challenge and VoiceBank+Demand Datasets confirms that Spiking-S4 rivals existing Artificial Neural Network (ANN) methods but with fewer computational resources, as evidenced by reduced parameters and Floating Point Operations (FLOPs). △ Less

Submitted 20 April, 2024; v1 submitted 7 September, 2023; originally announced September 2023.

arXiv:2307.00789 [pdf, other]

doi 10.1088/1742-6596/2600/14/142009

Utilizing wearable technology to characterize and facilitate occupant collaborations in flexible workspaces

Authors: Kristi Maisha, Mario Frei, Matias Quintana, Yun Xuan Chua, Rishee Jain, Clayton Miller

Abstract: Hybrid working strategies have become, and will continue to be, the norm for many offices. This raises two considerations: newly unoccupied spaces needlessly consume energy, and the occupied spaces need to be effectively used to facilitate meaningful interactions and create a positive, sustainable work culture. This work aims to determine when spontaneous, collaborative interactions occur within t… ▽ More Hybrid working strategies have become, and will continue to be, the norm for many offices. This raises two considerations: newly unoccupied spaces needlessly consume energy, and the occupied spaces need to be effectively used to facilitate meaningful interactions and create a positive, sustainable work culture. This work aims to determine when spontaneous, collaborative interactions occur within the building and the environmental factors that facilitate such interactions. This study uses smartwatch-based micro-surveys using the Cozie platform to identify the occurrence of and spatially place interactions while categorizing them as a collaboration or distraction. This method uniquely circumvents pitfalls associated with surveying and qualitative data collection: occupant behaviors are identified in real-time in a non-intrusive manner, and survey data is corroborated with quantitative sensor data. A proof-of-concept study was deployed with nine hybrid-working participants providing 100 micro-survey cluster responses over approximately two weeks. The results show the spontaneous interactions occurring in hybrid mode are split evenly among the categories of collaboration, wanted socialization, and distraction and primarily occur with coworkers at one's desk. From these data, we can establish various correlations between the occurrence of positive spontaneous interactions and different factors, such as the time of day and the locations in the building. This framework and first deployment provide the foundation for future large-scale data collection experiments and human interaction modeling. △ Less

Submitted 3 July, 2023; originally announced July 2023.

Journal ref: J Phys Conf Ser. 2023;2600: 142009

arXiv:2305.18925 [pdf, other]

Investigating model performance in language identification: beyond simple error statistics

Authors: Suzy J. Styles, Victoria Y. H. Chua, Fei Ting Woon, Hexin Liu, Leibny Paola Garcia Perera, Sanjeev Khudanpur, Andy W. H. Khong, Justin Dauwels

Abstract: Language development experts need tools that can automatically identify languages from fluent, conversational speech, and provide reliable estimates of usage rates at the level of an individual recording. However, language identification systems are typically evaluated on metrics such as equal error rate and balanced accuracy, applied at the level of an entire speech corpus. These overview metrics… ▽ More Language development experts need tools that can automatically identify languages from fluent, conversational speech, and provide reliable estimates of usage rates at the level of an individual recording. However, language identification systems are typically evaluated on metrics such as equal error rate and balanced accuracy, applied at the level of an entire speech corpus. These overview metrics do not provide information about model performance at the level of individual speakers, recordings, or units of speech with different linguistic characteristics. Overview statistics may therefore mask systematic errors in model performance for some subsets of the data, and consequently, have worse performance on data derived from some subsets of human speakers, creating a kind of algorithmic bias. In the current paper, we investigate how well a number of language identification systems perform on individual recordings and speech units with different linguistic properties in the MERLIon CCS Challenge. The Challenge dataset features accented English-Mandarin code-switched child-directed speech. △ Less

Submitted 30 May, 2023; originally announced May 2023.

Comments: Accepted to Interspeech 2023, 5 pages, 5 figures

arXiv:2304.03450 [pdf, other]

Striving for Authentic and Sustained Technology Use In the Classroom: Lessons Learned from a Longitudinal Evaluation of a Sensor-based Science Education Platform

Authors: Yvonne Chua, Sankha Cooray, Juan Pablo Forero Cortes, Paul Denny, Sonia Dupuch, Dawn L Garbett, Alaeddin Nassani, Jiashuo Cao, Hannah Qiao, Andrew Reis, Deviana Reis, Philipp M. Scholl, Priyashri Kamlesh Sridhar, Hussel Suriyaarachchi, Fiona Taimana, Vanessa Tanga, Chamod Weerasinghe, Elliott Wen, Michelle Wu, Qin Wu, Haimo Zhang, Suranga Nanayakkara

Abstract: Technology integration in educational settings has led to the development of novel sensor-based tools that enable students to measure and interact with their environment. Although reports from using such tools can be positive, evaluations are often conducted under controlled conditions and short timeframes. There is a need for longitudinal data collected in realistic classroom settings. However, s… ▽ More Technology integration in educational settings has led to the development of novel sensor-based tools that enable students to measure and interact with their environment. Although reports from using such tools can be positive, evaluations are often conducted under controlled conditions and short timeframes. There is a need for longitudinal data collected in realistic classroom settings. However, sustained and authentic classroom use requires technology platforms to be seen by teachers as both easy to use and of value. We describe our development of a sensor-based platform to support science teaching that followed a 14-month user-centered design process. We share insights from this design and development approach, and report findings from a 6-month large-scale evaluation involving 35 schools and 1245 students. We share lessons learnt, including that technology integration is not an educational goal per se and that technology should be a transparent tool to enable students to achieve their learning goals. △ Less

Submitted 6 April, 2023; originally announced April 2023.

arXiv:2303.07830 [pdf]

Emergent Bio-Functional Similarities in a Cortical-Spike-Train-Decoding Spiking Neural Network Facilitate Predictions of Neural Computation

Authors: Tengjun Liu, Yansong Chua, Yiwei Zhang, Yuxiao Ning, Pengfu Liu, Guihua Wan, Zijun Wan, Shaomin Zhang, Weidong Chen

Abstract: Despite its better bio-plausibility, goal-driven spiking neural network (SNN) has not achieved applicable performance for classifying biological spike trains, and showed little bio-functional similarities compared to traditional artificial neural networks. In this study, we proposed the motorSRNN, a recurrent SNN topologically inspired by the neural motor circuit of primates. By employing the moto… ▽ More Despite its better bio-plausibility, goal-driven spiking neural network (SNN) has not achieved applicable performance for classifying biological spike trains, and showed little bio-functional similarities compared to traditional artificial neural networks. In this study, we proposed the motorSRNN, a recurrent SNN topologically inspired by the neural motor circuit of primates. By employing the motorSRNN in decoding spike trains from the primary motor cortex of monkeys, we achieved a good balance between classification accuracy and energy consumption. The motorSRNN communicated with the input by capturing and cultivating more cosine-tuning, an essential property of neurons in the motor cortex, and maintained its stability during training. Such training-induced cultivation and persistency of cosine-tuning was also observed in our monkeys. Moreover, the motorSRNN produced additional bio-functional similarities at the single-neuron, population, and circuit levels, demonstrating biological authenticity. Thereby, ablation studies on motorSRNN have suggested long-term stable feedback synapses contribute to the training-induced cultivation in the motor cortex. Besides these novel findings and predictions, we offer a new framework for building authentic models of neural computation. △ Less

Submitted 14 March, 2023; originally announced March 2023.

arXiv:2302.08607 [pdf, other]

Adaptive Axonal Delays in feedforward spiking neural networks for accurate spoken word recognition

Authors: Pengfei Sun, Ehsan Eqlimi, Yansong Chua, Paul Devos, Dick Botteldooren

Abstract: Spiking neural networks (SNN) are a promising research avenue for building accurate and efficient automatic speech recognition systems. Recent advances in audio-to-spike encoding and training algorithms enable SNN to be applied in practical tasks. Biologically-inspired SNN communicates using sparse asynchronous events. Therefore, spike-timing is critical to SNN performance. In this aspect, most wo… ▽ More Spiking neural networks (SNN) are a promising research avenue for building accurate and efficient automatic speech recognition systems. Recent advances in audio-to-spike encoding and training algorithms enable SNN to be applied in practical tasks. Biologically-inspired SNN communicates using sparse asynchronous events. Therefore, spike-timing is critical to SNN performance. In this aspect, most works focus on training synaptic weights and few have considered delays in event transmission, namely axonal delay. In this work, we consider a learnable axonal delay capped at a maximum value, which can be adapted according to the axonal delay distribution in each network layer. We show that our proposed method achieves the best classification results reported on the SHD dataset (92.45%) and NTIDIGITS dataset (95.09%). Our work illustrates the potential of training axonal delays for tasks with complex temporal structures. △ Less

Submitted 16 February, 2023; originally announced February 2023.

Comments: Accepted by ICASSP 2023

arXiv:2210.13977 [pdf, other]

doi 10.1088/1742-6596/2600/14/142003

Cozie Apple: An iOS mobile and smartwatch application for environmental quality satisfaction and physiological data collection

Authors: Federico Tartarini, Mario Frei, Stefano Schiavon, Yun Xuan Chua, Clayton Miller

Abstract: Collecting feedback from people in indoor and outdoor environments is traditionally challenging and complex in a reliable, longitudinal, and non-intrusive way. This paper introduces Cozie Apple, an open-source mobile and smartwatch application for iOS devices. This platform allows people to complete a watch-based micro-survey and provide real-time feedback about environmental conditions via their… ▽ More Collecting feedback from people in indoor and outdoor environments is traditionally challenging and complex in a reliable, longitudinal, and non-intrusive way. This paper introduces Cozie Apple, an open-source mobile and smartwatch application for iOS devices. This platform allows people to complete a watch-based micro-survey and provide real-time feedback about environmental conditions via their Apple Watch. It leverages the inbuilt sensors of a smartwatch to collect physiological (e.g., heart rate, activity) and environmental (sound level) data. This paper outlines data collected from 48 research participants who used the platform to report perceptions of urban-scale environmental comfort (noise and thermal) and contextual factors such as who they were with and what activity they were doing. The results of 2,400 micro-surveys across various urban settings are illustrated in this paper showing the variability of noise-related distractions, thermal comfort, and associated context. The results show people experience at least a little noise distraction 58% of the time, with people talking being the most common reason (46%). This effort is novel due to its focus on spatial and temporal scalability and collection of noise, distraction, and associated contextual information. These data set the stage for larger deployments, deeper analysis, and more helpful prediction models toward better understanding the occupants' needs and perceptions. These innovations could result in real-time control signals to building systems or nudges for people to change their behavior. △ Less

Submitted 20 June, 2023; v1 submitted 21 October, 2022; originally announced October 2022.

Comments: Accepted at the CISBAT 2023 The Built Environment in Transition, Hybrid International Conference, EPFL, Lausanne, Switzerland, 13-15 September 2023

Journal ref: J Phys Conf Ser. 2023;2600: 142003

arXiv:2210.04532 [pdf, other]

Training Spiking Neural Networks with Local Tandem Learning

Authors: Qu Yang, Jibin Wu, Malu Zhang, Yansong Chua, Xinchao Wang, Haizhou Li

Abstract: Spiking neural networks (SNNs) are shown to be more biologically plausible and energy efficient over their predecessors. However, there is a lack of an efficient and generalized training method for deep SNNs, especially for deployment on analog computing substrates. In this paper, we put forward a generalized learning rule, termed Local Tandem Learning (LTL). The LTL rule follows the teacher-stude… ▽ More Spiking neural networks (SNNs) are shown to be more biologically plausible and energy efficient over their predecessors. However, there is a lack of an efficient and generalized training method for deep SNNs, especially for deployment on analog computing substrates. In this paper, we put forward a generalized learning rule, termed Local Tandem Learning (LTL). The LTL rule follows the teacher-student learning approach by mimicking the intermediate feature representations of a pre-trained ANN. By decoupling the learning of network layers and leveraging highly informative supervisor signals, we demonstrate rapid network convergence within five training epochs on the CIFAR-10 dataset while having low computational complexity. Our experimental results have also shown that the SNNs thus trained can achieve comparable accuracies to their teacher ANNs on CIFAR-10, CIFAR-100, and Tiny ImageNet datasets. Moreover, the proposed LTL rule is hardware friendly. It can be easily implemented on-chip to perform fast parameter calibration and provide robustness against the notorious device non-ideality issues. It, therefore, opens up a myriad of opportunities for training and deployment of SNN on ultra-low-power mixed-signal neuromorphic computing chips.10 △ Less

Submitted 10 October, 2022; originally announced October 2022.

Comments: Accepted by NeurIPS 2022

arXiv:2208.06918 [pdf, other]

Gradient Mask: Lateral Inhibition Mechanism Improves Performance in Artificial Neural Networks

Authors: Lei Jiang, Yongqing Liu, Shihai Xiao, Yansong Chua

Abstract: Lateral inhibitory connections have been observed in the cortex of the biological brain, and has been extensively studied in terms of its role in cognitive functions. However, in the vanilla version of backpropagation in deep learning, all gradients (which can be understood to comprise of both signal and noise gradients) flow through the network during weight updates. This may lead to overfitting.… ▽ More Lateral inhibitory connections have been observed in the cortex of the biological brain, and has been extensively studied in terms of its role in cognitive functions. However, in the vanilla version of backpropagation in deep learning, all gradients (which can be understood to comprise of both signal and noise gradients) flow through the network during weight updates. This may lead to overfitting. In this work, inspired by biological lateral inhibition, we propose Gradient Mask, which effectively filters out noise gradients in the process of backpropagation. This allows the learned feature information to be more intensively stored in the network while filtering out noisy or unimportant features. Furthermore, we demonstrate analytically how lateral inhibition in artificial neural networks improves the quality of propagated gradients. A new criterion for gradient quality is proposed which can be used as a measure during training of various convolutional neural networks (CNNs). Finally, we conduct several different experiments to study how Gradient Mask improves the performance of the network both quantitatively and qualitatively. Quantitatively, accuracy in the original CNN architecture, accuracy after pruning, and accuracy after adversarial attacks have shown improvements. Qualitatively, the CNN trained using Gradient Mask has developed saliency maps that focus primarily on the object of interest, which is useful for data augmentation and network interpretability. △ Less

Submitted 14 August, 2022; originally announced August 2022.

arXiv:2111.04479 [pdf, other]

ExtremeBB: A Database for Large-Scale Research into Online Hate, Harassment, the Manosphere and Extremism

Authors: Anh V. Vu, Lydia Wilson, Yi Ting Chua, Ilia Shumailov, Ross Anderson

Abstract: We introduce ExtremeBB, a textual database of over 53.5M posts made by 38.5k users on 12 extremist bulletin board forums promoting online hate, harassment, the manosphere and other forms of extremism. It enables large-scale analyses of qualitative and quantitative historical trends going back two decades: measuring hate speech and toxicity; tracing the evolution of different strands of extremist i… ▽ More We introduce ExtremeBB, a textual database of over 53.5M posts made by 38.5k users on 12 extremist bulletin board forums promoting online hate, harassment, the manosphere and other forms of extremism. It enables large-scale analyses of qualitative and quantitative historical trends going back two decades: measuring hate speech and toxicity; tracing the evolution of different strands of extremist ideology; tracking the relationships between online subcultures, extremist behaviours, and real-world violence; and monitoring extremist communities in near real time. This can shed light not only on the spread of problematic ideologies but also the effectiveness of interventions. ExtremeBB comes with a robust ethical data-sharing regime that allows us to share data with academics worldwide. Since 2020, access has been granted to 49 licensees in 16 research groups from 12 institutions. △ Less

Submitted 20 August, 2023; v1 submitted 8 November, 2021; originally announced November 2021.

arXiv:2107.05747 [pdf, other]

doi 10.1088/2634-4386/aca710

SoftHebb: Bayesian Inference in Unsupervised Hebbian Soft Winner-Take-All Networks

Authors: Timoleon Moraitis, Dmitry Toichkin, Adrien Journé, Yansong Chua, Qinghai Guo

Abstract: Hebbian plasticity in winner-take-all (WTA) networks is highly attractive for neuromorphic on-chip learning, owing to its efficient, local, unsupervised, and on-line nature. Moreover, its biological plausibility may help overcome important limitations of artificial algorithms, such as their susceptibility to adversarial attacks, and their high demands for training-example quantity and repetition.… ▽ More Hebbian plasticity in winner-take-all (WTA) networks is highly attractive for neuromorphic on-chip learning, owing to its efficient, local, unsupervised, and on-line nature. Moreover, its biological plausibility may help overcome important limitations of artificial algorithms, such as their susceptibility to adversarial attacks, and their high demands for training-example quantity and repetition. However, Hebbian WTA learning has found little use in machine learning (ML), likely because it has been missing an optimization theory compatible with deep learning (DL). Here we show rigorously that WTA networks constructed by standard DL elements, combined with a Hebbian-like plasticity that we derive, maintain a Bayesian generative model of the data. Importantly, without any supervision, our algorithm, SoftHebb, minimizes cross-entropy, i.e. a common loss function in supervised DL. We show this theoretically and in practice. The key is a "soft" WTA where there is no absolute "hard" winner neuron. Strikingly, in shallow-network comparisons with backpropagation (BP), SoftHebb shows advantages beyond its Hebbian efficiency. Namely, it converges in fewer iterations, and is significantly more robust to noise and adversarial attacks. Notably, attacks that maximally confuse SoftHebb are also confusing to the human eye, potentially linking human perceptual robustness, with Hebbian WTA circuits of cortex. Finally, SoftHebb can generate synthetic objects as interpolations of real object classes. All in all, Hebbian efficiency, theoretical underpinning, cross-entropy-minimization, and surprising empirical advantages, suggest that SoftHebb may inspire highly neuromorphic and radically different, but practical and advantageous learning algorithms and hardware accelerators. △ Less

Submitted 12 November, 2022; v1 submitted 12 July, 2021; originally announced July 2021.

Journal ref: Neuromorphic Computing and Engineering, 2(4), 044017 (2022)

arXiv:2003.11837 [pdf, other]

Rectified Linear Postsynaptic Potential Function for Backpropagation in Deep Spiking Neural Networks

Authors: Malu Zhang, Jiadong Wang, Burin Amornpaisannon, Zhixuan Zhang, VPK Miriyala, Ammar Belatreche, Hong Qu, Jibin Wu, Yansong Chua, Trevor E. Carlson, Haizhou Li

Abstract: Spiking Neural Networks (SNNs) use spatio-temporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power event-driven neuromorphic implementation. Motivated by the success of deep learning, the study of Deep Spiking Neural Networks (DeepSNNs) provides promising directions for artificial intelligence applications. Howeve… ▽ More Spiking Neural Networks (SNNs) use spatio-temporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power event-driven neuromorphic implementation. Motivated by the success of deep learning, the study of Deep Spiking Neural Networks (DeepSNNs) provides promising directions for artificial intelligence applications. However, training of DeepSNNs is not straightforward because the well-studied error back-propagation (BP) algorithm is not directly applicable. In this paper, we first establish an understanding as to why error back-propagation does not work well in DeepSNNs. To address this problem, we propose a simple yet efficient Rectified Linear Postsynaptic Potential function (ReL-PSP) for spiking neurons and propose a Spike-Timing-Dependent Back-Propagation (STDBP) learning algorithm for DeepSNNs. In STDBP algorithm, the timing of individual spikes is used to convey information (temporal coding), and learning (back-propagation) is performed based on spike timing in an event-driven manner. Our experimental results show that the proposed learning algorithm achieves state-of-the-art classification accuracy in single spike time based learning algorithms of DeepSNNs. Furthermore, by utilizing the trained model parameters obtained from the proposed STDBP learning algorithm, we demonstrate the ultra-low-power inference operations on a recently proposed neuromorphic inference accelerator. Experimental results show that the neuromorphic hardware consumes 0.751~mW of the total power consumption and achieves a low latency of 47.71~ms to classify an image from the MNIST dataset. Overall, this work investigates the contribution of spike timing dynamics to information encoding, synaptic plasticity and decision making, providing a new perspective to design of future DeepSNNs and neuromorphic hardware systems. △ Less

Submitted 3 November, 2020; v1 submitted 26 March, 2020; originally announced March 2020.

Comments: This work has been submitted to the IEEE for possible publication. Copyrightmay be transferred without notice, after which this version may no longer beaccessible

arXiv:1912.12419 [pdf, other]

Transfer Learning in General Lensless Imaging through Scattering Media

Authors: Yukuan Yang, Lei Deng, Peng Jiao, Yansong Chua, Jing Pei, Cheng Ma, Guoqi Li

Abstract: Recently deep neural networks (DNNs) have been successfully introduced to the field of lensless imaging through scattering media. By solving an inverse problem in computational imaging, DNNs can overcome several shortcomings in the conventional lensless imaging through scattering media methods, namely, high cost, poor quality, complex control, and poor anti-interference. However, for training, a l… ▽ More Recently deep neural networks (DNNs) have been successfully introduced to the field of lensless imaging through scattering media. By solving an inverse problem in computational imaging, DNNs can overcome several shortcomings in the conventional lensless imaging through scattering media methods, namely, high cost, poor quality, complex control, and poor anti-interference. However, for training, a large number of training samples on various datasets have to be collected, with a DNN trained on one dataset generally performing poorly for recovering images from another dataset. The underlying reason is that lensless imaging through scattering media is a high dimensional regression problem and it is difficult to obtain an analytical solution. In this work, transfer learning is proposed to address this issue. Our main idea is to train a DNN on a relatively complex dataset using a large number of training samples and fine-tune the last few layers using very few samples from other datasets. Instead of the thousands of samples required to train from scratch, transfer learning alleviates the problem of costly data acquisition. Specifically, considering the difference in sample sizes and similarity among datasets, we propose two DNN architectures, namely LISMU-FCN and LISMU-OCN, and a balance loss function designed for balancing smoothness and sharpness. LISMU-FCN, with much fewer parameters, can achieve imaging across similar datasets while LISMU-OCN can achieve imaging across significantly different datasets. What's more, we establish a set of simulation algorithms which are close to the real experiment, and it is of great significance and practical value in the research on lensless scattering imaging. In summary, this work provides a new solution for lensless imaging through scattering media using transfer learning in DNNs. △ Less

Submitted 28 December, 2019; originally announced December 2019.

arXiv:1909.08018 [pdf, other]

Neural Population Coding for Effective Temporal Classification

Authors: Zihan Pan, Jibin Wu, Yansong Chua, Malu Zhang, Haizhou Li

Abstract: Neural encoding plays an important role in faithfully describing the temporally rich patterns, whose instances include human speech and environmental sounds. For tasks that involve classifying such spatio-temporal patterns with the Spiking Neural Networks (SNNs), how these patterns are encoded directly influence the difficulty of the task. In this paper, we compare several existing temporal and po… ▽ More Neural encoding plays an important role in faithfully describing the temporally rich patterns, whose instances include human speech and environmental sounds. For tasks that involve classifying such spatio-temporal patterns with the Spiking Neural Networks (SNNs), how these patterns are encoded directly influence the difficulty of the task. In this paper, we compare several existing temporal and population coding schemes and evaluate them on both speech (TIDIGITS) and sound (RWCP) datasets. We show that, with population neural codings, the encoded patterns are linearly separable using the Support Vector Machine (SVM). We note that the population neural codings effectively project the temporal information onto the spatial domain, thus improving linear separability in the spatial dimension, achieving an accuracy of 95\% and 100\% for TIDIGITS and RWCP datasets classified using the SVM, respectively. This observation suggests that an effective neural coding scheme greatly simplifies the classification problem such that a simple linear classifier would suffice. The above datasets are then classified using the Tempotron, an SNN-based classifier. SNN classification results agree with the SVM findings that population neural codings help to improve classification accuracy. Hence, other than the learning algorithm, effective neural encoding is just as important as an SNN designed to recognize spatio-temporal patterns. It is an often neglected but powerful abstraction that deserves further study. △ Less

Submitted 25 September, 2019; v1 submitted 12 September, 2019; originally announced September 2019.

arXiv:1909.01302 [pdf, other]

An efficient and perceptually motivated auditory neural encoding and decoding algorithm for spiking neural networks

Authors: Zihan Pan, Yansong Chua, Jibin Wu, Malu Zhang, Haizhou Li, Eliathamby Ambikairajah

Abstract: Auditory front-end is an integral part of a spiking neural network (SNN) when performing auditory cognitive tasks. It encodes the temporal dynamic stimulus, such as speech and audio, into an efficient, effective and reconstructable spike pattern to facilitate the subsequent processing. However, most of the auditory front-ends in current studies have not made use of recent findings in psychoacousti… ▽ More Auditory front-end is an integral part of a spiking neural network (SNN) when performing auditory cognitive tasks. It encodes the temporal dynamic stimulus, such as speech and audio, into an efficient, effective and reconstructable spike pattern to facilitate the subsequent processing. However, most of the auditory front-ends in current studies have not made use of recent findings in psychoacoustics and physiology concerning human listening. In this paper, we propose a neural encoding and decoding scheme that is optimized for speech processing. The neural encoding scheme, that we call Biologically plausible Auditory Encoding (BAE), emulates the functions of the perceptual components of the human auditory system, that include the cochlear filter bank, the inner hair cells, auditory masking effects from psychoacoustic models, and the spike neural encoding by the auditory nerve. We evaluate the perceptual quality of the BAE scheme using PESQ; the performance of the BAE based on speech recognition experiments. Finally, we also built and published two spike-version of speech datasets: the Spike-TIDIGITS and the Spike-TIMIT, for researchers to use and benchmarking of future SNN research. △ Less

Submitted 4 September, 2019; v1 submitted 3 September, 2019; originally announced September 2019.

arXiv:1907.01167 [pdf, other]

A Tandem Learning Rule for Effective Training and Rapid Inference of Deep Spiking Neural Networks

Authors: Jibin Wu, Yansong Chua, Malu Zhang, Guoqi Li, Haizhou Li, Kay Chen Tan

Abstract: Spiking neural networks (SNNs) represent the most prominent biologically inspired computing model for neuromorphic computing (NC) architectures. However, due to the non-differentiable nature of spiking neuronal functions, the standard error back-propagation algorithm is not directly applicable to SNNs. In this work, we propose a tandem learning framework, that consists of an SNN and an Artificial… ▽ More Spiking neural networks (SNNs) represent the most prominent biologically inspired computing model for neuromorphic computing (NC) architectures. However, due to the non-differentiable nature of spiking neuronal functions, the standard error back-propagation algorithm is not directly applicable to SNNs. In this work, we propose a tandem learning framework, that consists of an SNN and an Artificial Neural Network (ANN) coupled through weight sharing. The ANN is an auxiliary structure that facilitates the error back-propagation for the training of the SNN at the spike-train level. To this end, we consider the spike count as the discrete neural representation in the SNN, and design ANN neuronal activation function that can effectively approximate the spike count of the coupled SNN. The proposed tandem learning rule demonstrates competitive pattern recognition and regression capabilities on both the conventional frame-based and event-based vision datasets, with at least an order of magnitude reduced inference time and total synaptic operations over other state-of-the-art SNN implementations. Therefore, the proposed tandem learning rule offers a novel solution to training efficient, low latency, and high accuracy deep SNNs with low computing resources. △ Less

Submitted 30 June, 2020; v1 submitted 2 July, 2019; originally announced July 2019.

arXiv:1906.08853 [pdf, other]

Hardware-friendly Neural Network Architecture for Neuromorphic Computing

Authors: Roshan Gopalakrishnan, Yansong Chua, Ashish Jith Sreejith Kumar

Abstract: The hardware-software co-optimization of neural network architectures is becoming a major stream of research especially due to the emergence of commercial neuromorphic chips such as the IBM Truenorth and Intel Loihi. Development of specific neural network architectures in tandem with the design of the neuromorphic hardware considering the hardware constraints will make a huge impact in the complet… ▽ More The hardware-software co-optimization of neural network architectures is becoming a major stream of research especially due to the emergence of commercial neuromorphic chips such as the IBM Truenorth and Intel Loihi. Development of specific neural network architectures in tandem with the design of the neuromorphic hardware considering the hardware constraints will make a huge impact in the complete system level application. In this paper, we study various neural network architectures and propose one that is hardware-friendly for a neuromorphic hardware with crossbar array of synapses. Considering the hardware constraints, we demonstrate how one may design the neuromorphic hardware so as to maximize classification accuracy in the trained network architecture, while concurrently, we choose a neural network architecture so as to maximize utilization in the neuromorphic cores. We also proposed a framework for mapping a neural network onto a neuromorphic chip named as the Mapping and Debugging (MaD) framework. The MaD framework is designed to be generic in the sense that it is a Python wrapper which in principle can be integrated with any simulator tool for neuromorphic chips. △ Less

Submitted 3 April, 2019; originally announced June 2019.

Comments: 18 pages, 14 figures, 4 tables, submitted to Frontiers in Neuroscience

arXiv:1902.05705 [pdf, ps, other]

Deep Spiking Neural Network with Spike Count based Learning Rule

Authors: Jibin Wu, Yansong Chua, Malu Zhang, Qu Yang, Guoqi Li, Haizhou Li

Abstract: Deep spiking neural networks (SNNs) support asynchronous event-driven computation, massive parallelism and demonstrate great potential to improve the energy efficiency of its synchronous analog counterpart. However, insufficient attention has been paid to neural encoding when designing SNN learning rules. Remarkably, the temporal credit assignment has been performed on rate-coded spiking inputs, l… ▽ More Deep spiking neural networks (SNNs) support asynchronous event-driven computation, massive parallelism and demonstrate great potential to improve the energy efficiency of its synchronous analog counterpart. However, insufficient attention has been paid to neural encoding when designing SNN learning rules. Remarkably, the temporal credit assignment has been performed on rate-coded spiking inputs, leading to poor learning efficiency. In this paper, we introduce a novel spike-based learning rule for rate-coded deep SNNs, whereby the spike count of each neuron is used as a surrogate for gradient backpropagation. We evaluate the proposed learning rule by training deep spiking multi-layer perceptron (MLP) and spiking convolutional neural network (CNN) on the UCI machine learning and MNIST handwritten digit datasets. We show that the proposed learning rule achieves state-of-the-art accuracies on all benchmark datasets. The proposed learning rule allows introducing latency, spike rate and hardware constraints into the SNN learning, which is superior to the indirect approach in which conventional artificial neural networks are first trained and then converted to SNNs. Hence, it allows direct deployment to the neuromorphic hardware and supports efficient inference. Notably, a test accuracy of 98.40% was achieved on the MNIST dataset in our experiments with only 10 simulation time steps, when the same latency constraint is imposed during training. △ Less

Submitted 15 February, 2019; originally announced February 2019.

arXiv:1901.00128 [pdf, other]

MaD: Mapping and debugging framework for implementing deep neural network onto a neuromorphic chip with crossbar array of synapses

Authors: Roshan Gopalakrishnan, Ashish Jith Sreejith Kumar, Yansong Chua

Abstract: Neuromorphic systems or dedicated hardware for neuromorphic computing is getting popular with the advancement in research on different device materials for synapses, especially in crossbar architecture and also algorithms specific or compatible to neuromorphic hardware. Hence, an automated mapping of any deep neural network onto the neuromorphic chip with crossbar array of synapses and an efficien… ▽ More Neuromorphic systems or dedicated hardware for neuromorphic computing is getting popular with the advancement in research on different device materials for synapses, especially in crossbar architecture and also algorithms specific or compatible to neuromorphic hardware. Hence, an automated mapping of any deep neural network onto the neuromorphic chip with crossbar array of synapses and an efficient debugging framework is very essential. Here, mapping is defined as the deployment of a section of deep neural network layer onto a neuromorphic core and the generation of connection lists among population of neurons to specify the connectivity between various neuromorphic cores on the neuromorphic chip. Debugging is the verification of computations performed on the neuromorphic chip during inferencing. Together the framework becomes Mapping and Debugging (MaD) framework. MaD framework is quite general in usage as it is a Python wrapper which can be integrated with almost every simulator tools for neuromorphic chips. This paper illustrates the MaD framework in detail, considering some optimizations while mapping onto a single neuromorphic core. A classification task on MNIST and CIFAR-10 datasets are considered for test case implementation of MaD framework. △ Less

Submitted 1 January, 2019; originally announced January 2019.

Comments: 7 pages, 6 figures and 6 tables. Submitting to IJCNN 2019

arXiv:1807.01013 [pdf, other]

doi 10.3389/fnins.2021.608567

Is Neuromorphic MNIST neuromorphic? Analyzing the discriminative power of neuromorphic datasets in the time domain

Authors: Laxmi R. Iyer, Yansong Chua, Haizhou Li

Abstract: The advantage of spiking neural networks (SNNs) over their predecessors is their ability to spike, enabling them to use spike timing for coding and efficient computing. A neuromorphic dataset should allow a neuromorphic algorithm to clearly show that a SNN is able to perform better on the dataset than an ANN. We have analyzed both N-MNIST and N-Caltech101 along these lines, but focus our study on… ▽ More The advantage of spiking neural networks (SNNs) over their predecessors is their ability to spike, enabling them to use spike timing for coding and efficient computing. A neuromorphic dataset should allow a neuromorphic algorithm to clearly show that a SNN is able to perform better on the dataset than an ANN. We have analyzed both N-MNIST and N-Caltech101 along these lines, but focus our study on N-MNIST. First we evaluate if additional information is encoded in the time domain in a neuromoprhic dataset. We show that an ANN trained with backpropagation on frame based versions of N-MNIST and N-Caltech101 images achieve 99.23% and 78.01% accuracy. These are the best classification accuracies obtained on these datasets to date. Second we present the first unsupervised SNN to be trained on N-MNIST and demonstrate results of 91.78%. We also use this SNN for further experiments on N-MNIST to show that rate based SNNs perform better, and precise spike timings are not important in N-MNIST. N-MNIST does not, therefore, highlight the unique ability of SNNs. The conclusion of this study opens an important question in neuromorphic engineering - what, then, constitutes a good neuromorphic dataset? △ Less

Submitted 3 July, 2018; originally announced July 2018.

Journal ref: Frontiers in Neuroscience, 2021

arXiv:1807.00578 [pdf, other]

Classifying neuromorphic data using a deep learning framework for image classification

Authors: Roshan Gopalakrishnan, Yansong Chua, Laxmi R Iyer

Abstract: In the field of artificial intelligence, neuromorphic computing has been around for several decades. Deep learning has however made much recent progress such that it consistently outperforms neuromorphic learning algorithms in classification tasks in terms of accuracy. Specifically in the field of image classification, neuromorphic computing has been traditionally using either the temporal or rate… ▽ More In the field of artificial intelligence, neuromorphic computing has been around for several decades. Deep learning has however made much recent progress such that it consistently outperforms neuromorphic learning algorithms in classification tasks in terms of accuracy. Specifically in the field of image classification, neuromorphic computing has been traditionally using either the temporal or rate code for encoding static images in datasets into spike trains. It is only till recently, that neuromorphic vision sensors are widely used by the neuromorphic research community, and provides an alternative to such encoding methods. Since then, several neuromorphic datasets as obtained by applying such sensors on image datasets (e.g. the neuromorphic CALTECH 101) have been introduced. These data are encoded in spike trains and hence seem ideal for benchmarking of neuromorphic learning algorithms. Specifically, we train a deep learning framework used for image classification on the CALTECH 101 and a collapsed version of the neuromorphic CALTECH 101 datasets. We obtained an accuracy of 91.66% and 78.01% for the CALTECH 101 and neuromorphic CALTECH 101 datasets respectively. For CALTECH 101, our accuracy is close to the best reported accuracy, while for neuromorphic CALTECH 101, it outperforms the last best reported accuracy by over 10%. This raises the question of the suitability of such datasets as benchmarks for neuromorphic learning algorithms. △ Less

Submitted 2 July, 2018; originally announced July 2018.

Comments: 4 pages, 3 figures, submitted to ICARCV 2018

Showing 1–23 of 23 results for author: Chua, Y