IEEE Transactions on Neural Networks and Learning Systems
Fast Algorithms for Deep Octonion Networks
Aleksandr Cariow, Galina Cariowa
Keywords: Neural networks; Transforms; Signal processing algorithms; Learning systems; Symmetric matrices; Sparse matrices; Deep learning; Acceleration of calculations; deep neural networks; fast Walsh–Hadamard transform (FWHT); octonions
Abstract: This brief studies how the arithmetic complexity of computing the basic operations in octonionic neural networks can be reduced and proposes new algorithmic solutions for performing these operations efficiently. We primarily mean the multiplication of two octonions, the dot product of two octonion-valued vectors, and the multiplication of one octonion by several other octonions. To reduce the computational complexity of these operations, we propose using the fast Walsh–Hadamard transform, which is well known in digital signal processing. Using this transform reduces the number of real multiplications and additions required to perform the computations, so the proposed algorithms will speed up computations in octonion-valued neural networks.
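The abstract does not reproduce the paper's factorizations, but the fast Walsh–Hadamard transform it builds on is standard. Below is a minimal NumPy sketch of the radix-2 FWHT (the butterflies use only additions and subtractions), shown purely as the building block the brief refers to; the specific FWHT-based octonion-multiplication algorithm is in the paper, not in this sketch.

```python
# Minimal sketch of the fast Walsh-Hadamard transform (FWHT), the classical
# signal-processing building block the brief relies on.
import numpy as np

def fwht(x):
    """Radix-2 fast Walsh-Hadamard transform of a length-2^k vector."""
    x = np.asarray(x, dtype=float).copy()
    n, h = x.size, 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b   # butterfly: additions/subtractions only
        h *= 2
    return x

# An octonion has 8 real components, so the length-8 FWHT applies directly.
o = np.array([1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 1.5])
print(fwht(o))               # transform-domain representation
print(fwht(fwht(o)) / 8.0)   # FWHT is its own inverse up to the factor 1/n
```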
An RNN-Based Algorithm for Decentralized-Partial-Consensus Constrained Optimization
Zicong Xia, Yang Liu, Jianlong Qiu, Qihua Ruan, Jinde Cao
Keywords: Optimization; Recurrent neural networks; Cost function; Task analysis; Optimization methods; Neurodynamics; Learning systems; Decentralized-partial-consensus optimization (DPCO); nonsmooth analysis; partial-consensus matrix; recurrent neural networks (RNNs)
Abstract: This technical note proposes a decentralized-partial-consensus optimization (DPCO) problem with inequality constraints. A partial-consensus matrix originating from the Laplacian matrix is constructed to handle the partial-consensus constraints. A continuous-time algorithm based on multiple interconnected recurrent neural networks (RNNs) is derived to solve the optimization problem. In addition, based on nonsmooth analysis and Lyapunov theory, the convergence of the continuous-time algorithm is proved. Finally, several examples demonstrate the effectiveness of the main results.
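As a rough illustration of the Laplacian-based coupling that underlies consensus-constrained problems of this kind, the sketch below Euler-discretizes a generic primal-dual consensus gradient flow on a four-agent ring with quadratic local costs. It is an assumed, textbook-style stand-in, not the paper's RNN dynamics or its partial-consensus matrix.

```python
# Generic primal-dual consensus gradient flow (Euler-discretized) on a ring of
# four agents with quadratic local costs; a stand-in, not the paper's algorithm.
import numpy as np

# Undirected ring graph: adjacency A and Laplacian L = D - A.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

# Local costs f_i(x_i) = 0.5 * (x_i - b_i)^2; the consensus minimizer is mean(b).
b = np.array([1.0, 3.0, -2.0, 4.0])
x = np.zeros(4)          # local decision variables
v = np.zeros(4)          # dual variables for the consensus constraint L x = 0

dt = 0.05
for _ in range(4000):
    grad = x - b                         # local gradients
    x_dot = -(grad + L @ x + L @ v)      # primal flow: gradient + consensus coupling
    v_dot = L @ x                        # dual flow: integrates consensus violation
    x, v = x + dt * x_dot, v + dt * v_dot

print(x)                 # each entry approaches mean(b) = 1.5
```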
Neural Networks as Geometric Chaotic Maps
Ziwei Li, Sai Ravela
Keywords: Artificial neural networks; Trajectory; Neurons; Mathematical model; Numerical models; Training data; Training; Chaos; neural networks (NNs); nonlinear dynamical systems; topological mixing
Abstract: The use of artificial neural networks (NNs) as models of chaotic dynamics has been expanding rapidly, yet a theoretical understanding of how NNs learn chaos is still lacking. Here, we employ a geometric perspective to show that NNs can efficiently model chaotic dynamics by becoming structurally chaotic themselves. We first confirm the efficiency of NNs in emulating chaos by showing that a parsimonious NN trained on only a few data points can reconstruct strange attractors, extrapolate outside the training data boundaries, and accurately predict local divergence rates. We then posit that the trained network's map comprises sequential geometric stretching, rotation, and compression operations. These geometric operations indicate topological mixing and chaos, explaining why NNs are naturally suited to emulating chaotic dynamics.
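The geometric viewpoint can be made concrete by decomposing a map's local Jacobian with the SVD: singular values above one indicate stretching, values below one indicate compression, and the orthogonal factors act as rotations (or reflections). The sketch below does this for the Hénon map via finite differences; the Hénon map is only a stand-in for the trained network's map discussed in the abstract.

```python
# SVD decomposition of a map's local Jacobian into orthogonal factors and
# stretching/compression; the Henon map stands in for a trained NN map.
import numpy as np

def henon(p, a=1.4, b=0.3):
    x, y = p
    return np.array([1.0 - a * x**2 + y, b * x])

def jacobian_fd(f, p, eps=1e-6):
    """Central finite-difference Jacobian of a map R^2 -> R^2 at point p."""
    J = np.zeros((2, 2))
    for j in range(2):
        dp = np.zeros(2)
        dp[j] = eps
        J[:, j] = (f(p + dp) - f(p - dp)) / (2 * eps)
    return J

p = np.array([0.1, 0.1])
J = jacobian_fd(henon, p)
U, s, Vt = np.linalg.svd(J)        # J = U @ diag(s) @ Vt
print("singular values:", s)       # s[0] > 1: local stretching, s[1] < 1: compression
print("orthogonal factor determinants (+/-1):",
      round(np.linalg.det(U), 3), round(np.linalg.det(Vt), 3))
```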
Deep Multiview Collaborative Clustering
Xu Yang, Cheng Deng, Zhiyuan Dang, Dacheng Tao
Keywords: Task analysis; Feature extraction; Collaborative work; Correlation; Clustering methods; Kernel; Collaboration; Collaborative learning; heterogeneous graph learning; multiview adaptive fusion; multiview clustering
Abstract: Clustering methods have attracted ever-increasing attention in the machine learning and computer vision communities in recent years. In this article, we focus on real-world applications in which a sample can be represented by multiple views. Traditional methods learn a common latent space for multiview samples without considering the diversity of the multiview representations and use $K$-means to obtain the final results, which is time- and space-consuming. In contrast, we propose a novel end-to-end deep multiview clustering model with collaborative learning that predicts the clustering results directly. Specifically, multiple autoencoder networks are utilized to embed multiview data into various latent spaces, and a heterogeneous graph learning module is employed to fuse the latent representations adaptively, learning specific weights for different views of each sample. In addition, intra-view collaborative learning is framed to optimize each single-view clustering task and provide more discriminative latent representations. Simultaneously, inter-view collaborative learning is employed to obtain complementary information and promote a consistent cluster structure for a better clustering solution. Experimental results on several datasets show that our method significantly outperforms several state-of-the-art clustering approaches.
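A hedged sketch of the adaptive-fusion step only: per-view latent codes are combined per sample with softmax weights. The view scores are random placeholders here, whereas the paper learns them through its heterogeneous graph learning module; the autoencoders and the collaborative-learning losses are not reproduced.

```python
# Per-sample adaptive fusion of multiview latent codes with softmax weights.
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

n, d, n_views = 5, 4, 3
rng = np.random.default_rng(0)
Z = rng.normal(size=(n_views, n, d))    # latent codes from per-view autoencoders
scores = rng.normal(size=(n, n_views))  # per-sample view scores (placeholder for learned scores)
w = softmax(scores)                     # per-sample view weights, each row sums to 1

# Each sample gets its own convex combination of its view-specific codes.
fused = np.einsum('nv,vnd->nd', w, Z)
print(fused.shape)                      # (5, 4)
```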
Robust Visual Tracking via Multitask Sparse Correlation Filters Learning
Ke Nai, Zhiyong Li, Yihui Gan, Qi Wang
Keywords: Target tracking; Visualization; Feature extraction; Correlation; Task analysis; Color; Optimization; Alternating direction method of multipliers (ADMM); correlation filter (CF); multitask sparse learning; visual tracking
Abstract: In this article, a novel multitask sparse correlation filters (MTSCF) model, which introduces multitask sparse learning into the correlation filter (CF) framework, is proposed for visual tracking. Specifically, the proposed MTSCF method exploits multitask learning to take the interdependencies among different visual features (e.g., histogram of oriented gradients (HOG), color names, and CNN features) into account, learning the CFs simultaneously so that the learned filters enhance and complement each other to boost tracking performance. Moreover, it performs feature selection to dynamically select discriminative spatial features from the target region that distinguish the target object from the background. An $l_{2,1}$ regularization term is used to realize multitask sparse learning. To solve the objective model, the alternating direction method of multipliers (ADMM) is utilized for learning the CFs. By considering multitask sparse learning, the proposed MTSCF model can fully utilize the strengths of different visual features and select effective spatial features to better model the appearance of the target object. Extensive experimental results on multiple tracking benchmarks demonstrate that our MTSCF tracker achieves competitive tracking performance in comparison with several state-of-the-art trackers.
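The multitask sparsity in this model comes from the $l_{2,1}$ norm, whose proximal operator (row-wise soft thresholding) is the standard ingredient in ADMM-style solvers. The sketch below shows the norm and its prox as a generic illustration; it is not the MTSCF filter-update equations themselves.

```python
# l_{2,1} norm and its proximal operator (row-wise soft thresholding).
import numpy as np

def l21_norm(W):
    """Sum of the l2 norms of the rows of W: encourages whole rows to be zero."""
    return np.linalg.norm(W, axis=1).sum()

def prox_l21(W, tau):
    """prox_{tau * ||.||_{2,1}}(W): shrink each row's norm by tau, or zero the row."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return scale * W

W = np.array([[3.0, 4.0],     # row norm 5   -> shrunk to norm 4
              [0.3, 0.4],     # row norm 0.5 -> zeroed when tau = 1
              [0.0, 2.0]])    # row norm 2   -> shrunk to norm 1
print(l21_norm(W))            # 7.5
print(prox_l21(W, tau=1.0))
```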
Kernel Path for ν-Support Vector Classification
Bin Gu, Ziran Xiong, Xiang Li, Zhou Zhai, Guansheng Zheng
Keywords: Kernel; Static VAr compensators; Support vector machines; Computational modeling; Training; Analytical models; Machine learning algorithms; Kernel path; model selection; piecewise non-linear; support vector machine (SVM)
Abstract: It is well known that the performance of a kernel method depends strongly on the choice of kernel parameter. However, existing kernel path algorithms are limited to plain support vector machines (SVMs), which have a single equality constraint, and it remains an open question to provide a kernel path algorithm for $\nu$-support vector classification ($\nu$-SVC), which has more than one equality constraint. Compared with the plain SVM, $\nu$-SVC has the advantage of using a regularization parameter $\nu$ to control the number of support vectors and margin errors. To address this problem, in this article, we propose a kernel path algorithm (KP$\nu$SVC) to trace the solutions of $\nu$-SVC exactly with respect to the kernel parameter. Specifically, we first provide an equivalent formulation of $\nu$-SVC with two equality constraints, which avoids possible conflicts while tracing the solutions of $\nu$-SVC. Based on this equivalent formulation, we propose the KP$\nu$SVC algorithm to trace the solutions with respect to the kernel parameter. However, KP$\nu$SVC traces nonlinear solutions of the kernel method rather than the errors of the loss function, and it remains a challenge to provide an algorithm that is guaranteed to find the globally optimal model. To address this challenging problem, we extend the classical error path algorithm to nonlinear kernel solution paths and propose a new kernel error path (KEP) algorithm that is guaranteed to find the globally optimal kernel parameter by minimizing the cross-validation error. We also provide finite convergence analysis and computational complexity analysis for KP$\nu$SVC and KEP. Extensive experimental results on a variety of benchmark datasets not only verify the effectiveness of KP$\nu$SVC but also show the advantage of applying KEP to select the optimal kernel parameter.
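For contrast with the exact solution path traced by KPνSVC/KEP, the sketch below shows the brute-force alternative they improve upon: selecting the RBF kernel parameter of ν-SVC by cross-validation over a finite grid, using scikit-learn's NuSVC. The dataset and the grid are arbitrary placeholders.

```python
# Grid-search baseline for choosing the RBF kernel parameter of nu-SVC by
# cross-validation; the paper instead traces the exact solution path.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import NuSVC

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

best_gamma, best_score = None, -np.inf
for gamma in np.logspace(-3, 2, 12):                    # candidate kernel parameters
    clf = NuSVC(nu=0.3, kernel='rbf', gamma=gamma)      # nu bounds support vectors / margin errors
    score = cross_val_score(clf, X, y, cv=5).mean()     # 5-fold CV accuracy
    if score > best_score:
        best_gamma, best_score = gamma, score

print(f"best gamma ~ {best_gamma:.4g}, CV accuracy ~ {best_score:.3f}")
```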
RelativeNAS: Relative Neural Architecture Search via Slow-Fast Learning
Hao Tan, Ran Cheng, Shihua Huang, Cheng He, Changxiao Qiu, Fan Yang, Ping Luo
Keywords: Computer architecture; Statistics; Sociology; Search problems; Optimization; Neural networks; Estimation; AutoML; convolutional neural network (CNN); neural architecture search (NAS); population-based search; slow-fast learning
Abstract: Despite the remarkable successes of convolutional neural networks (CNNs) in computer vision, it is time-consuming and error-prone to design a CNN manually. Among the various neural architecture search (NAS) methods that aim to automate the design of high-performance CNNs, differentiable NAS and population-based NAS are attracting increasing interest due to their unique characteristics. To benefit from the merits of both while overcoming their deficiencies, this work proposes a novel NAS method, RelativeNAS. As the key to efficient search, RelativeNAS performs joint learning between fast learners (i.e., decoded networks with relatively lower loss values) and slow learners in a pairwise manner. Moreover, since RelativeNAS only requires low-fidelity performance estimation to distinguish each pair of fast and slow learners, it saves considerable computation costs for training the candidate architectures. The proposed RelativeNAS brings several unique advantages: 1) it achieves state-of-the-art performance on ImageNet with a top-1 error rate of 24.88%, outperforming DARTS and AmoebaNet-B by 1.82% and 1.12%, respectively; 2) it spends only 9 h with a single 1080Ti GPU to obtain the discovered cells, that is, $3.75\times$ and $7875\times$ faster than DARTS and AmoebaNet, respectively; and 3) it shows that the discovered cells obtained on CIFAR-10 can be directly transferred to object detection, semantic segmentation, and keypoint detection, yielding competitive results of 73.1% mAP on PASCAL VOC, 78.7% mIoU on Cityscapes, and 68.5% AP on MSCOCO, respectively. The implementation of RelativeNAS is available at https://github.com/EMI-Group/RelativeNAS.
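A minimal sketch of the slow-fast update pattern on continuous encodings: individuals are paired at random, and in each pair the higher-loss (slow) learner moves toward the lower-loss (fast) learner. The toy quadratic loss stands in for RelativeNAS's low-fidelity performance estimation; the actual cell encoding and decoding are not shown.

```python
# Generic pairwise slow-fast learning on continuous encodings (toy objective).
import numpy as np

rng = np.random.default_rng(0)
loss = lambda v: np.sum((v - 0.7) ** 2)        # stand-in for low-fidelity performance

pop = rng.uniform(0, 1, size=(8, 5))           # 8 architecture encodings of dimension 5
for step in range(50):
    order = rng.permutation(len(pop))
    for i, j in zip(order[::2], order[1::2]):          # random pairs
        li, lj = loss(pop[i]), loss(pop[j])
        slow, fast = (i, j) if li > lj else (j, i)     # higher loss = slow learner
        lr = rng.uniform(0, 1)
        pop[slow] += lr * (pop[fast] - pop[slow])      # slow learner chases fast learner
        pop[slow] = np.clip(pop[slow], 0, 1)

print("best loss:", min(loss(v) for v in pop))
```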
Lifelong Mixture of Variational Autoencoders
Fei Ye, Adrian G. Bors
Keywords: Task analysis; Training; Data models; Databases; Computer architecture; Probability; Mixture models; Disentangled representations; lifelong learning; mixture of evidence lower bounds (ELBOs); mixture of variational autoencoders (VAEs); multitask learning
Abstract: In this article, we propose an end-to-end lifelong learning mixture of experts, in which each expert is implemented by a variational autoencoder (VAE). The experts in the mixture system are jointly trained by maximizing a mixture of individual component evidence lower bounds (MELBO) on the log-likelihood of the given training samples. The mixing coefficients in the mixture model control the contribution of each expert to the global representation; they are sampled from a Dirichlet distribution whose parameters are determined through nonparametric estimation during lifelong learning. The model can learn new tasks quickly when these are similar to previously learned ones. The proposed lifelong mixture of VAEs (L-MVAE) expands its architecture with new components when learning a completely new task. After training, our model can automatically determine the relevant expert to be used when fed with new data samples. This mechanism benefits both memory efficiency and computational cost, as only one expert is used during inference. The L-MVAE inference model is able to perform interpolations in the joint latent space across the data domains associated with different tasks and is shown to be efficient for disentangled representation learning.
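A sketch of the mixture bookkeeping only, under the assumption that per-expert ELBO estimates are already available: Dirichlet-sampled mixing coefficients weight the ELBOs into a mixture objective, and at inference each sample is routed to the expert with the highest ELBO. The VAE experts themselves and the lifelong expansion mechanism are not reproduced, and the weighted sum here is only an illustrative form of the mixture objective.

```python
# Illustrative mixture-of-ELBOs bookkeeping with Dirichlet mixing coefficients.
import numpy as np

rng = np.random.default_rng(0)

# Per-sample ELBO estimates produced by each expert (assumed given here).
elbo = np.array([[-95.0, -120.0, -250.0],
                 [-300.0, -110.0, -105.0]])          # 2 samples x 3 experts

alpha = np.array([2.0, 1.0, 1.0])                    # Dirichlet parameters (assumed estimated)
pi = rng.dirichlet(alpha)                            # mixing coefficients, sum to 1

mixture_objective = (elbo * pi).sum(axis=1)          # weighted mixture of ELBOs per sample
chosen_expert = elbo.argmax(axis=1)                  # inference: route to the best expert

print("mixing coefficients:", pi)
print("mixture objective per sample:", mixture_objective)
print("selected expert per sample:", chosen_expert)
```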
A Tandem Learning Rule for Effective Training and Rapid Inference of Deep Spiking Neural Networks
Jibin Wu, Yansong Chua, Malu Zhang, Guoqi Li, Haizhou Li, Kay Chen Tan
Keywords: Neurons; Training; Task analysis; Biological neural networks; Backpropagation; Pattern recognition; Computer architecture; Deep spiking neural network (SNN); efficient neuromorphic inference; event-driven vision; neuromorphic computing (NC); object recognition
Abstract: Spiking neural networks (SNNs) represent the most prominent biologically inspired computing model for neuromorphic computing (NC) architectures. However, due to the nondifferentiable nature of spiking neuronal functions, the standard error backpropagation algorithm is not directly applicable to SNNs. In this work, we propose a tandem learning framework that consists of an SNN and an artificial neural network (ANN) coupled through weight sharing. The ANN is an auxiliary structure that facilitates error backpropagation for training the SNN at the spike-train level. To this end, we consider the spike count as the discrete neural representation in the SNN and design an ANN neuronal activation function that can effectively approximate the spike count of the coupled SNN. The proposed tandem learning rule demonstrates competitive pattern recognition and regression capabilities on both conventional frame-based and event-based vision datasets, with at least an order of magnitude lower inference time and total synaptic operations than other state-of-the-art SNN implementations. The proposed tandem learning rule therefore offers a novel solution for training efficient, low-latency, and high-accuracy deep SNNs with limited computing resources.
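To illustrate why a simple ANN activation can stand in for a spike count, the sketch below compares the spike count of a non-leaky integrate-and-fire neuron (reset by subtraction, constant input over T steps) with a clamped, discretized rate approximation. This is an assumed toy correspondence, not the exact activation function designed in the paper.

```python
# Spike count of a non-leaky integrate-and-fire neuron vs. a clamped rate
# approximation that an ANN activation could implement.
import numpy as np

def if_spike_count(current, T=20, threshold=1.0):
    """Spike count of a non-leaky IF neuron driven by a constant input current."""
    v, count = 0.0, 0
    for _ in range(T):
        v += current
        if v >= threshold:
            count += 1
            v -= threshold          # reset by subtraction
    return count

def ann_activation(current, T=20, threshold=1.0):
    """Clamped, discretized rate approximation of the spike count."""
    return np.clip(np.floor(T * current / threshold), 0, T)

for c in [0.0, 0.05, 0.13, 0.4, 1.2]:
    print(c, if_spike_count(c), ann_activation(c))   # the two columns agree
```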
Hierarchical Passivity Criterion for Delayed Neural Networks via a General Delay-Product-Type Lyapunov–Krasovskii Functional
Fei Long, Chuan-Ke Zhang, Yong He, Qing-Guo Wang, Zhen-Man Gao, Min Wu
Keywords: Delays; Symmetric matrices; Stability criteria; Numerical stability; Neural networks; Time-varying systems; Neurons; Delay-product-type term; delayed neural networks (DNNs); general convexity lemma; Lyapunov–Krasovskii functional (LKF); passivity criterion
Abstract: This article is concerned with the passivity analysis of neural networks with a time-varying delay. Several techniques in the field are improved to establish a new passivity criterion with less conservatism. First, a Lyapunov–Krasovskii functional (LKF) is constructed with two general delay-product-type terms that contain any chosen degree of polynomials in the time-varying delay. Second, a general convexity lemma without conservatism is developed to address the positive definiteness of the LKF and the negative definiteness of its time derivative. Then, with these improved results, a hierarchical passivity criterion of less conservatism is obtained for neural networks with a time-varying delay, whose size and conservatism vary with the maximal degree of the time-varying-delay polynomial in the LKF. It is shown that the conservatism of the passivity criterion does not always decrease as the degree of the time-varying-delay polynomial increases. Finally, a numerical example is given to illustrate the proposed criterion and to benchmark it against existing results.
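As a rough illustration of what "delay-product-type" means here, the LaTeX fragment below writes a quadratic LKF term whose matrix is a degree-N polynomial in the time-varying delay d(t). This is an assumed generic form; the exact augmented vectors, matrices, and degrees used in the paper's construction are not reproduced.

```latex
% Assumed generic form of a delay-product-type LKF term (not the paper's exact
% construction): a quadratic form whose matrix is a degree-N polynomial in the
% time-varying delay d(t).
\[
  V_p(t) \;=\; \eta^{\top}(t)\,
  \bigl( P_0 + d(t)\,P_1 + d^{2}(t)\,P_2 + \cdots + d^{N}(t)\,P_N \bigr)\,
  \eta(t),
\]
% where \eta(t) stacks state- and delayed-state-dependent vectors, the P_i are
% symmetric decision matrices, and N is the chosen polynomial degree indexing
% the hierarchy of criteria.
```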