Bruges, Belgium, April 22-24
Content of the proceedings
Semi-supervised learning
Dimensionality reduction
Signal and image processing
Learning (with) preferences
Learning I
Efficient learning in recurrent networks
Classification and fuzzy logic
Neurosciences
Weightless neural systems
Learning II
Brain Computer Interfaces: from theory to practice
Generative and Bayesian models
Neural maps and learning vector quantization - theory and applications
Learning III
Semi-supervised learning
ES2009-3
Machine Learning with Labeled and Unlabeled Data
Tijl De Bie, Thiago Turchetti Maia, Antônio Braga
Abstract:
The field of semi-supervised learning (SSL) has been expanding rapidly in the past few years, with a sharp increase in the number of related publications. In this paper we present the SSL problem in contrast with supervised and unsupervised learning. In addition, we propose a taxonomy with which we categorize many existing approaches described in the literature based on their underlying framework, data representation, and algorithmic class.
ES2009-9
A Variational Approach to Semi-Supervised Clustering
Peng Li, Yiming Ying, Colin Campbell
Abstract:
We present a Bayesian variational inference scheme for semi-supervised clustering in which data is supplemented with side information in the form of common labels. There is no mutual-exclusion assumption between classes, and samples are represented as a combinatorial mixture over multiple clusters. We illustrate performance on six datasets and find that the method compares favourably with constrained K-means clustering.
ES2009-75
A self-training method for learning to rank with unlabeled data
Tuong Vinh Truong, Massih-Reza Amini, Patrick Gallinari
Abstract:
This paper presents a new algorithm for learning bipartite ranking functions from partially labeled data. The algorithm extends the self-training paradigm developed under the classification framework. We further propose an efficient and scalable optimization method for training linear models, though the approach is general in the sense that it can be applied to any class of scoring functions. Empirical results on several common image and text corpora, measured by the Area Under the ROC Curve (AUC) and the Average Precision, show that using unlabeled data in the training process improves the performance of baseline supervised ranking functions.
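The AUC criterion mentioned above has a simple pairwise interpretation: it is the fraction of positive/negative pairs that the scoring function ranks correctly, with ties counted as one half. A minimal sketch of that definition (the function name is illustrative, not taken from the paper):

```python
def auc(scores, labels):
    """Area Under the ROC Curve from its pairwise definition:
    the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    total = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                total += 1.0      # correctly ordered pair
            elif p == n:
                total += 0.5      # tie counts half
    return total / (len(pos) * len(neg))

print(auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))  # perfect ranking -> 1.0
```

This O(p·n) double loop is fine for illustration; sorting the scores gives the same quantity in O(m log m).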
ES2009-80
Transductively Learning from Positive Examples Only
Kristiaan Pelckmans, Johan Suykens
Abstract:
This paper considers the task of learning a binary labeling of the vertices of a graph, given only a small set of positive examples and knowledge of the desired number of positives. A learning machine is described that maximizes the precision of the prediction, a combinatorial optimization problem which can be rephrased as an S-T mincut problem. For validation, we consider the MovieLens movie recommendation dataset. For each user we are given a collection of (ratings of) movies which are well liked, and the task is to recommend a disjoint set of movies which are most probably of interest to the user.
ES2009-104
Supervised classification of categorical data with uncertain labels for DNA barcoding
Charles Bouveyron, Stephane Girard, Madalina Olteanu
Abstract:
In the supervised classification framework, human supervision is required to label a set of learning data which are then used for building the classifier. However, in many applications, human supervision is either imprecise, difficult or expensive, which gives rise to non-robust classifiers. An interesting application where this situation occurs is DNA barcoding, which aims to develop a standard tool to identify species with no or limited recourse to taxonomic expertise. In some cases, the morphological features describing the reference sample may be misleading and the taxonomists attribute labels incorrectly. This work presents a robust supervised classification method for categorical data based on a multivariate multinomial mixture model. The proposed method is applied to DNA barcoding and compared to classical methods on a real dataset.
ES2009-81
A semi-supervised approach to question classification
David Tomás, Claudio Giuliano
Abstract:
This paper presents a machine learning approach to question classification. We have defined a kernel function based on latent semantic information acquired from unlabeled data. This kernel allows external semantic knowledge to be included in the supervised learning process. We have combined this knowledge with a bag-of-words approach by means of composite kernels to obtain state-of-the-art results. As the semantic information is acquired from unlabeled text, our system can be easily adapted to different languages and domains.
ES2009-56
Improving BAS committee performance with a semi-supervised approach
Ruy Luiz Milidiú, Julio Cesar Duarte
Abstract:
Semi-supervised learning is a machine learning approach that, by making use of both labeled and unlabeled data for training, can significantly improve learning accuracy. Boosting is a machine learning technique that combines several weak classifiers to improve the overall accuracy. At each iteration, the algorithm changes the weights of the examples and builds an additional classifier. A well-known algorithm based on boosting is AdaBoost, which uses an initial uniform distribution. Boosting At Start (BAS) is a boosting framework that generalizes AdaBoost by allowing any initial weight distribution and cost function. Here, we present a scheme that allows the use of unlabeled data in the BAS framework. We examine the performance of the proposed scheme on datasets commonly used in semi-supervised approaches. Our empirical findings indicate that BAS can improve the accuracy of the generated classifiers by taking advantage of unlabeled data.
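The knob that BAS generalizes is AdaBoost's initial example-weight distribution. The sketch below shows plain binary AdaBoost with 1-D threshold stumps and a configurable initial weight vector (uniform weights recover the standard algorithm); it is an illustration of that mechanism, not the BAS algorithm, and all names are illustrative:

```python
import math

def adaboost(xs, ys, init_weights, rounds=5):
    """Binary AdaBoost on 1-D data with threshold stumps.
    init_weights is the initial example distribution; passing
    [1/n]*n gives standard AdaBoost."""
    n = len(xs)
    w = list(init_weights)
    ensemble = []  # list of (alpha, threshold, polarity)
    for _ in range(rounds):
        best = None
        for t in set(xs):                 # candidate thresholds
            for pol in (1, -1):           # stump: pol if x >= t else -pol
                err = sum(wi for wi, x, y in zip(w, xs, ys)
                          if (pol if x >= t else -pol) != y)
                if best is None or err < best[0]:
                    best = (err, t, pol)
        err, t, pol = best
        err = max(err, 1e-10)             # avoid log(0) on a perfect stump
        if err >= 0.5:
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, pol))
        # re-weight: misclassified examples gain weight, then normalize
        w = [wi * math.exp(-alpha * y * (pol if x >= t else -pol))
             for wi, x, y in zip(w, xs, ys)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def predict(ensemble, x):
    s = sum(a * (pol if x >= t else -pol) for a, t, pol in ensemble)
    return 1 if s >= 0 else -1
```

Swapping `init_weights` for a non-uniform distribution is the degree of freedom the BAS framework exploits (together with a cost function).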
ES2009-136
Semi-supervised bipartite ranking with the normalized Rayleigh coefficient
Liva Ralaivola
Abstract:
We propose a new algorithm for semi-supervised learning in the bipartite ranking framework. It is based on the maximization of a so-called normalized Rayleigh coefficient, which differs from the usual Rayleigh coefficient of Fisher's linear discriminant in that the actual covariance matrices are used instead of the scatter matrices. We show that if the class conditional distributions are Gaussian, then the ranking function produced by our algorithm is the optimal linear ranking function. A kernelized version of the proposed algorithm and a semi-supervised formulation are provided. Preliminary numerical results are promising.
ES2009-61
Partially-supervised learning in Independent Factor Analysis
Etienne Côme, Latifa Oukhellou, Patrice Aknin, Thierry Denoeux
Abstract:
Independent Factor Analysis (IFA) is used to recover latent components (or sources) from their linear observed mixtures within an unsupervised learning framework. Both the mixing process and the source densities are learned from the observed data. The sources are assumed to be mutually independent and distributed according to a mixture of Gaussians. This paper investigates the possibility of incorporating partial knowledge of the cluster membership of some samples when estimating the IFA model. Semi-supervised and partially supervised learning cases can thus be handled. Experimental results demonstrate the ability of this approach to enhance estimation accuracy and to remove indeterminacies commonly encountered in unsupervised IFA, such as the permutation of the sources.
Dimensionality reduction
ES2009-133
The Exploration Machine - a novel method for structure-preserving dimensionality reduction
Axel Wismueller
Abstract:
We present a novel method for structure-preserving dimensionality reduction. The Exploration Machine (Exploratory Observation Machine, XOM) computes graphical representations of high-dimensional observations by a strategy of self-organized model adaptation. Although simple and computationally efficient, XOM is surprisingly flexible, contributing simultaneously to several domains of advanced machine learning, scientific data analysis, and visualization, such as structure-preserving dimensionality reduction and data clustering.
ES2009-65
Nonlinear Discriminative Data Visualization
Kerstin Bunte, Barbara Hammer, Petra Schneider, Michael Biehl
Abstract:
Due to the tremendous increase of electronic information, with respect to the size of data sets as well as their dimensionality, visualization of high-dimensional data constitutes one of the key problems of data mining. Since embedding in lower dimensions necessarily entails a loss of information, methods to explicitly control the information kept by a specific visualization technique are highly desirable. The incorporation of supervised class information constitutes an important specific case. In this contribution we propose an extension of prototype-based local matrix learning by a charting technique, which results in an efficient nonlinear discriminative visualization of a given labelled data manifold.
ES2009-114
Does dimensionality reduction improve the quality of motion interpolation?
Sebastian Bitzer, Stefan Klanke, Sethu Vijayakumar
Abstract:
In recent years, nonlinear dimensionality reduction has frequently been suggested for modelling high-dimensional motion data. While it is intuitively plausible to use dimensionality reduction to recover low-dimensional manifolds which compactly represent a given set of movements, there is a lack of critical investigation into the quality of the resulting representations, in particular with respect to generalisability. Furthermore, it is unclear how consistently particular methods achieve good results. Here we use a set of robotic motion data for which ground truth is known to evaluate a range of nonlinear dimensionality reduction methods with respect to the quality of motion interpolation. We show that results are sensitive to parameter settings and to the data set used, and that no dimensionality reduction method significantly outperforms naive interpolation.
ES2009-113
Transformations for variational factor analysis to speed up learning
Jaakko Luttinen, Alexander Ilin, Tapani Raiko
Abstract:
We present a way to speed up the learning of variational Bayesian factor analysis models by performing simple transformations during the learning process. These transformations are motivated by representational ambiguities in the model and are given a theoretical justification from the Bayesian framework. We derive the formulae for variational Bayesian PCA and show experimentally that the transformations may improve the rate of convergence by orders of magnitude. The result can be applied to several factor analysis models that use EM or gradient-based algorithms for learning.
ES2009-117
X-SOM and L-SOM: a nested approach for missing value imputation
Paul Merlin, Antti Sorjamaa, Bertrand Maillet, Amaury Lendasse
Abstract:
In this paper, a new method for the determination of missing values in temporal databases is presented. The method is based on a robust version of a nonlinear classification algorithm, the Self-Organizing Map, and combines two classifications in order to take advantage of both the spatial and the temporal dependencies of the dataset. This nested approach leads to a significant improvement in the estimation of missing values. An application to the determination of missing values in a hedge fund return database is presented.
Signal and image processing
ES2009-70
Sparse differential connectivity graph of scalp EEG for epileptic patients
Ladan Amini, Sophie Achard, Christian Jutten, Hamid Soltanian-Zadeh, Gholam Ali Hossein-Zadeh, Olivier David, Laurent Vercueil
Abstract:
The aim of this work is to integrate, in a connectivity graph, the modulation of the inter-relations between EEG scalp measurements across two brain states. We present a sparse differential connectivity graph (SDCG) to distinguish the effectively modulated connections between epileptiform and non-epileptiform brain states from the common connections created by noise, artifacts, unwanted background activities and their related volume conduction effects. The proposed method is applied to real epileptic EEG data. Clustering the features extracted from the SDCG may provide valuable information about the epileptic foci and their relations.
ES2009-83
Patch-based bilateral filter and local m-smoother for image denoising
Arnaud de Decker, John Lee, Michel Verleysen
Abstract:
In the field of image analysis, denoising is an important preprocessing task. The design of an efficient, robust, and computationally effective edge-preserving denoising algorithm is a widely studied, yet unsolved problem. One of the most efficient edge-preserving denoising algorithms is the bilateral filter, which is an intuitive generalization of the local M-smoother. In this paper, we propose to modify both the bilateral filter and the local M-smoother to use patches of the image instead of single pixels in the denoising process. With this modification, the filtering effect becomes more sensitive to the different areas of the image and the filtering results improve. The denoising quality of these patch-based filters is evaluated on test images and compared to classical bilateral filtering and the local M-smoother.
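The patch-based idea can be sketched on a 1-D signal: in a bilateral filter, each output sample is a weighted average of its neighbours, where the range weight normally compares two single intensities; in the patch variant it compares small windows around the two positions. A minimal illustration (parameter names and default values are assumptions for the sketch, not those of the paper):

```python
import math

def patch_bilateral_1d(signal, radius=2, patch=1, sigma_s=2.0, sigma_r=0.2):
    """Patch-based bilateral filter on a 1-D signal: the range weight
    compares patches around two samples instead of single intensities."""
    n = len(signal)

    def patch_dist2(i, j):
        # mean squared difference between patches centred at i and j,
        # with clamped borders
        d = 0.0
        for k in range(-patch, patch + 1):
            a = signal[min(max(i + k, 0), n - 1)]
            b = signal[min(max(j + k, 0), n - 1)]
            d += (a - b) ** 2
        return d / (2 * patch + 1)

    out = []
    for i in range(n):
        num = den = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            ws = math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2))   # spatial
            wr = math.exp(-patch_dist2(i, j) / (2 * sigma_r ** 2))  # range
            num += ws * wr * signal[j]
            den += ws * wr
        out.append(num / den)
    return out

denoised = patch_bilateral_1d([0, 0, 0, 1, 1, 1])
# the step edge between indices 2 and 3 stays sharp rather than being blurred
```

Because patches straddling the step differ strongly, their range weight collapses and the edge is preserved, which is exactly the edge-preserving behaviour the abstract refers to.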
ES2009-94
Adaptive anisotropic denoising: a bootstrapped procedure
John Lee, Arnaud de Decker, Michel Verleysen
Abstract:
Signal denoising proves to be important in many domains such as pattern recognition and image analysis. This paper investigates several refinements of adaptive local filters that rely on local mode finding. These spatial filters are anisotropic and offer the advantage of attenuating noise without smoothing salient signal features such as discontinuities or other sharp transitions. In particular, a bootstrapped procedure is developed and leads to an improvement of the denoising quality without increasing the computational complexity. Experiments with an artificial benchmark allow the quantification of the performance gain.
Learning (with) preferences
ES2009-4
Supervised learning as preference optimization
Fabio Aiolli, Alessandro Sperduti
Abstract:
Learning with preferences has received increasing attention in the last few years. The goal in this setting is to learn from qualitative or quantitative declared preferences between objects of a domain. In this paper we give a survey of a recent framework for supervised learning based on preference optimization. Many supervised tasks can be seen as particular instances of this preference-based framework, including binary classification, (single- or multi-label) multiclass classification, ranking problems, and (ordinal) regression, to name a few. We show that the proposed general preference learning model (GPLM), which is based on a large-margin principled approach, gives a flexible way to codify cost functions for all the above problems as sets of linear preferences. Examples of how the proposed framework has been effectively used to address a variety of real-world applications are reported, clearly showing the flexibility and effectiveness of the approach.
ES2009-112
Efficient voting prediction for pairwise multilabel classification
Eneldo Loza Mencía, Sang-Hyeun Park, Johannes Fürnkranz
Abstract:
The pairwise approach to multilabel classification reduces the problem to learning and aggregating preference predictions among the possible labels. A key problem is the need to query a quadratic number of preferences for making a prediction. To solve this problem, we extend the recently proposed QWeighted algorithm for efficient pairwise multiclass voting to the multilabel setting, and evaluate the adapted algorithm on several real-world datasets. We achieve an average-case reduction of classifier evaluations from n^2 to n + dn log n, where n is the total number of labels and d is the average number of labels, which is typically quite small in real-world datasets.
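For context, the quadratic cost that QWeighted attacks comes from naive full pairwise voting: a single prediction queries all n(n-1)/2 binary classifiers and counts the votes per label. The sketch below shows that baseline (the classifier dictionary is a hypothetical stand-in for trained binary models, not the paper's implementation):

```python
def pairwise_vote(classifiers, x, labels):
    """Naive pairwise voting: query every binary classifier (a, b)
    once and tally the winners. Returns the vote counts and the
    number of classifier evaluations, which is n*(n-1)/2."""
    votes = {label: 0 for label in labels}
    queries = 0
    for i, a in enumerate(labels):
        for b in labels[i + 1:]:
            queries += 1
            winner = classifiers[(a, b)](x)  # returns either a or b
            votes[winner] += 1
    return votes, queries
```

QWeighted avoids most of these evaluations by querying classifiers in order of current loss and stopping once the outcome is decided, which is how the abstract's average-case n + dn log n figure arises.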
ES2009-122
Multi-task Preference learning with Gaussian Processes
Adriana Birlutiu, Perry Groot, Tom Heskes
Abstract:
We present an EM algorithm for the problem of learning user preferences with Gaussian Processes in the context of multi-task learning. We validate our approach on an audiological data set and show that predictive results for sound quality perception of normal-hearing and hearing-impaired subjects, in the context of pairwise comparison experiments, can be significantly improved using the hierarchical model.
Learning I
ES2009-62
Adaptive Metrics for Content Based Image Retrieval in Dermatology
Kerstin Bunte, Michael Biehl, Nicolai Petkov, Marcel F. Jonkman
Abstract:
We apply distance-based classifiers in the context of a content-based image retrieval task in dermatology. In the present project, only RGB color information is used. We employ two different methods to obtain a discriminative distance measure for classification and retrieval: Generalized Matrix LVQ and the Large Margin Nearest Neighbor approach. Both methods provide a linear transformation of the original features to lower dimensions. We demonstrate that both methods lead to very similar discriminative transformations and significantly improve classification and retrieval performance.
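Both GMLVQ and LMNN learn a linear map Omega and then measure distances in the transformed space, d(x, y) = ||Omega(x - y)||^2. As a minimal illustration of how such a learned map drives retrieval, here is a hypothetical sketch; the `omega` matrix is a placeholder, not the transformation learned by either method.

```python
import numpy as np

def retrieve(query, database, omega, k=5):
    """Rank database items by the adaptive distance d(x, y) = ||Omega (x - y)||^2.

    omega is a (d' x d) linear transformation; with d' < d this doubles
    as a discriminative dimensionality reduction, as in GMLVQ / LMNN.
    """
    diffs = database - query                  # (n, d) differences to the query
    proj = diffs @ omega.T                    # (n, d') transformed differences
    dists = np.einsum("ij,ij->i", proj, proj)  # squared norms per item
    return np.argsort(dists)[:k]               # indices of the k nearest items
```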
ES2009-67
Bayesian periodogram smoothing for speech enhancement
Xueru Zhang, Alexander Ypma, Bert de Vries
Abstract:
Periodogram smoothing of the received noisy signal is a challenging problem in speech enhancement. We present a Bayesian approach, where the instantaneous periodogram is smoothed through an adaptive smoothing parameter. By updating sufficient statistics using new samples of the noisy signal, the smoothing parameter is adjusted on-line. The performance of the novel smoothing algorithm is studied in a speech enhancement context. It is demonstrated that with respect to Mean Square Error, the proposed Bayesian smoothing algorithm performs better than the other non-Bayesian smoothing algorithms in higher signal-to-noise ratio environments.
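For reference, the standard non-Bayesian baseline smooths the instantaneous periodogram with a fixed first-order recursion, S_k = alpha * S_{k-1} + (1 - alpha) * |X_k|^2; the paper's contribution is to make the smoothing parameter adaptive by updating sufficient statistics online. A sketch of the fixed-alpha baseline (the frame shape and alpha value are illustrative choices, not taken from the paper):

```python
import numpy as np

def recursive_periodogram(frames, alpha=0.9):
    """Fixed-parameter recursive smoothing of per-frame periodograms:
    S[k] = alpha * S[k-1] + (1 - alpha) * |FFT(frame_k)|^2."""
    periodograms = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    smoothed = np.empty_like(periodograms)
    smoothed[0] = periodograms[0]
    for k in range(1, len(frames)):
        smoothed[k] = alpha * smoothed[k - 1] + (1 - alpha) * periodograms[k]
    return periodograms, smoothed
```

The recursion trades variance for tracking speed: a larger alpha suppresses the frame-to-frame fluctuations of the raw periodogram but reacts more slowly to changes in the signal, which is exactly the trade-off an adaptive alpha tries to resolve.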
ES2009-78
Improving the transition modelling in hidden Markov models for ECG segmentation
Benoît Frénay, Gaël de Lannoy, Michel Verleysen
Abstract:
The segmentation of ECG signals is a useful tool for the diagnosis of cardiac diseases. However, the state-of-the-art methods use hidden Markov models which do not adequately model the transitions between successive waves. This paper proposes two methods which attempt to overcome this limitation: an HMM state scission scheme which prevents ingoing and outgoing transitions in the middle of the waves, and a Bayesian network where the transitions are emission-dependent. Experiments show that both methods improve the results on pathological ECG signals.
ES2009-74
A robust biologically plausible implementation of ICA-like learning
Felipe Gerhard, Cristina Savin, Jochen Triesch
Abstract:
We present a model that can perform ICA-like computation by simple, local, biologically plausible rules. By combining synaptic learning with homeostatic regulation of neuronal properties and adaptive lateral inhibition, the neural network can robustly learn Gabor-like receptive fields from natural images. With spatially localized inhibitory connections, a topographic map can be achieved. Additionally, the network can solve the Foldiak bars problem, a classical nonlinear ICA task.
ES2009-137
Spline-based neuro-fuzzy Kolmogorov’s network for time series prediction
Vitaliy Kolodyazhniy
Abstract:
A spline-based modification of the previously developed Neuro-Fuzzy Kolmogorov's Network (NFKN) is proposed. In order to improve the approximation accuracy, cubic B-splines are substituted for triangular membership functions. The network is trained with a hybrid learning rule combining least squares estimation for the output layer and gradient descent for the hidden layer. The initialization of the NFKN is deterministic and is based on the PCA procedure. The advantages of the modified NFKN are confirmed by long-range iterated predictions of two chaotic time series: artificial data generated by the Mackey-Glass equation and real data of laser intensity oscillations.
ES2009-110
Gene expression data analysis using spatiotemporal blind source separation
Matthieu Sainlez, Pierre-Antoine Absil, Andrew Teschendorff
Abstract:
We propose a ``time-biased'' and a ``space-biased'' method for spatiotemporal independent component analysis (ICA). The methods rely on computing an orthogonal approximate joint diagonalizer of a collection of covariance-like matrices. In the time-biased version, the time signatures of the ICA modes are constrained to be white, whereas the space-biased version imposes the same condition on the space signatures. We apply the two methods to the analysis of gene expression data, where the genes play the role of the space and the cell samples stand for the time. This study is a step towards addressing a question first raised by Liebermeister, on whether ICA methods for gene expression analysis should impose independence across genes or across cell samples. Our preliminary experiment indicates that both approaches have value, and that exploring the continuum between these two extremes can provide useful information about the interactions between genes and their impact on the phenotype.
ES2009-127
A wavelet-heterogeneous index of market shocks for assessing the magnitude of financial crises
Christophe Boucher, Patrick Kouontchou, Bertrand Maillet, Raymond Hélène
Abstract:
An accurate quantitative definition of financial crisis requires a universal and robust scale for measuring market shocks. Following Zumbach et al. (2000) and Maillet and Michel (2003), we propose a new quantitative measure of financial disturbances, which captures the heterogeneity of investor horizons – from day traders to pension funds. The indicator relies on a multi-resolution analysis of market volatility, each scale corresponding to various investment horizons and different data frequencies. This new risk measure, called “Wavelet-heterogeneous Index of Market Shocks” (WhIMS), is based on the combination of two methods: Wavelet Packet Sub-band Decomposition and constrained Independent Component Analysis (see Kopriva and Sersic, 2007 and Lu and Rajapakse, 2005). We apply this measure to the French stock market (high-frequency CAC40) to date and gauge the severity of financial crises.
ES2009-128
A robust hybrid DHMM-MLP modelling of financial crises measured by the WhIMS
Christophe Boucher, Bertrand Maillet, Paul Merlin
Abstract:
This paper develops a hybrid model combining a Hidden Markov Chain (HMC) and Multilayer Perceptrons (MLP) on the Wavelet-heterogeneous Index of Market Shocks (WhIMS) to dynamically identify regimes in financial turbulence. The WhIMS is an aggregate measure of volatility computed at different frequencies. We estimate the model on a French stock market index (CAC40 Index) and compare the prediction performance of the HMC-MLP model to classical linear and non-linear models. A state separation of financial disturbances based on the WhIMS and the conditional probabilities of the HMC-MLP model is then performed using a Robust SOM.
ES2009-105
A faster model selection criterion for OP-ELM and OP-KNN: Hannan-Quinn criterion
Yoan Miché, Amaury Lendasse
Abstract:
The OP-ELM and OP-KNN algorithms share the same methodology structure, based on a random initialization of a Feedforward Neural Network followed by a ranking of the neurons; a final step uses this ranking to determine the best combination of neurons to retain. This is usually achieved by Leave-One-Out (LOO) cross-validation. This article proposes to use the Hannan-Quinn Information Criterion as a model selection criterion instead of the classical LOO. This criterion proves to be efficient and just as good as (or slightly better than) the previously used LOO criterion for both OP-ELM and OP-KNN, while decreasing computational times by factors of four to five.
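For a least-squares model the Hannan-Quinn criterion can be written as HQ = N ln(MSE) + 2ck ln(ln N), with N samples, k free parameters and a constant c >= 1; unlike LOO it needs only a single fit per candidate, which is where the speed-up comes from. A minimal, hypothetical sketch of selecting a prefix of ranked neurons by this criterion (the list of per-prefix MSE values is assumed to come from fits of the ranked network, which this sketch does not implement):

```python
import math

def hannan_quinn(mse, n_samples, n_params, c=1.0):
    # HQ = N * ln(MSE) + 2 * c * k * ln(ln N); smaller is better.
    return (n_samples * math.log(mse)
            + 2.0 * c * n_params * math.log(math.log(n_samples)))

def select_prefix(mse_per_prefix, n_samples, c=1.0):
    """Pick the number of ranked neurons that minimizes the HQ criterion.

    mse_per_prefix[k] is the training MSE of the model using the
    first k+1 ranked neurons.
    """
    scores = [hannan_quinn(mse, n_samples, k + 1, c)
              for k, mse in enumerate(mse_per_prefix)]
    return scores.index(min(scores)) + 1
```

The ln(ln N) penalty grows more slowly than BIC's ln N but faster than AIC's constant, which is what makes the criterion strongly consistent while staying permissive for moderate sample sizes.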
ES2009-123
Rosen's projection method for SVM training
Jorge López, José Dorronsoro
Abstract:
In this work we give explicit formulae for the application of Rosen's gradient projection method to SVM training, leading to a very simple implementation. We show experimentally that the method provides good descent directions that result in fewer training iterations, particularly when high precision is required. However, a naive kernelization may end up in a procedure requiring more kernel operations (KOs) than SMO, and further work is needed to arrive at an efficient implementation.
ES2009-124
On the huge benefit of quasi-random mutations for multimodal optimization with application to grid-based tuning of neurocontrollers
Guillaume Chaslot, Jean-Baptiste Hoock, Fabien Teytaud, Olivier Teytaud
Abstract:
In this paper, we study the optimization of a neural network used for controlling a Monte-Carlo Tree Search (MCTS/UCT) algorithm. The main results are: (i) the specification of a new multimodal benchmark function, defined in agreement with \cite{multimodalppsn}, which pointed out that most multimodal functions are not satisfactory for some real-world multimodal scenarios (section \ref{sota}); (ii) experiments with Evolution Strategies on this new multimodal benchmark function, showing the great efficiency of quasi-random mutations in this framework (section \ref{artif}); (iii) a proof of concept of the application of ES for grid-based tuning of neural networks controlling MCTS/UCT (section \ref{rw}).
ES2009-32
Support vector machines regression for estimation of Mars surface physical properties
Caroline Bernard Michel, Sylvain Douté, Mathieu Fauvel, Laurent Gardes, Stephane Girard
Abstract:
In this paper, the estimation of physical properties from hyperspectral data with support vector machines is addressed. Several kernel functions are used, from classical to advanced ones. The results are compared with Gaussian Regularized Sliced Inverse Regression and Partial Least Squares, both in terms of accuracy and complexity. Experiments on simulated data show that SVMs produce highly accurate results for some kernels, but at the cost of increased processing time. Inversion of real images shows that SVMs are robust and generalize well. In addition, the analysis of the support vectors makes it possible to detect saturation of the physical model used to generate the simulated data.
ES2009-98
Self-organising map for large scale processes monitoring
Edouard Leclercq, Fabrice Druaux, Dimitri Lefebvre
Abstract:
A feed-forward neural network is proposed for monitoring the operating modes of large scale processes. A Gaussian hidden layer associated with a Kohonen output layer maps the principal features of the measurements of state variables. Subsets of selective neurons are generated in the hidden layer by means of self-adaptation of the centers and dispersion parameters of the Gaussian functions. The output layer operates like a data fusion operator by adapting the hidden-to-output weight matrix through a winner-takes-all strategy. The algorithm is tested with the Tennessee Eastman Challenge Process. The results prove that the proposed neural network clearly maps the different operating modes.
ES2009-100
The Use of ANN for Turbo Engine Applications
René Meier, Lars Frank Große, Franz Joos
Abstract:
To reduce environmental pollution and increase the efficiency of commercially available turbo engines, it is essential to optimize them. The suggestion made in this paper is to use evolution strategies and artificial neural networks (ANN) for turbo engine applications. Optimisation of the impeller and of the combustion process are only two applications in the wide range of possible improvements.
Efficient learning in recurrent networks
ES2009-7
Recent advances in efficient learning of recurrent networks
Barbara Hammer, Benjamin Schrauwen, Jochen J. Steil
Abstract:
Recurrent neural networks (RNNs) carry the promise of implementing efficient and biologically plausible signal processing. They are optimally suited for a wide range of applications dealing with spatiotemporal data or causalities, and provide explanations of cognitive phenomena of the human brain. Recently, a few new fundamental paradigms connected to RNNs have been developed which allow insights into their potential for information processing. They also pave the way towards new efficient training algorithms which overcome the well-known problem of long-term dependencies. This tutorial gives an overview of these recent developments in efficient, biologically plausible recurrent information processing.
ES2009-132
Studies on reservoir initialization and dynamics shaping in echo state networks
Joschka Boedecker, Oliver Obst, N. Michael Mayer, Minoru Asada
Abstract:
The fixed random connectivity of networks in reservoir computing leads to significant variation in performance. Only a few problem-specific optimization procedures are known to date. We study a general initialization method using permutation matrices and derive a new unsupervised learning rule based on intrinsic plasticity (IP) for echo state networks. Using three different benchmarks, we show that networks with permutation matrices for the reservoir connectivity have much longer memory than the other methods, but are also able to perform highly non-linear mappings. We also show that IP based on sigmoid transfer functions is limited concerning the output distributions that can be achieved.
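A permutation-matrix reservoir is straightforward to construct: each unit feeds exactly one other unit, and since a permutation matrix is orthogonal, all its eigenvalues lie on the unit circle, so a single scale factor sets the spectral radius exactly without any eigenvalue computation. A minimal sketch (the reservoir size, scaling, and seed are illustrative choices, not from the paper):

```python
import numpy as np

def permutation_reservoir(n_units, spectral_radius=0.95, seed=0):
    """Reservoir weight matrix with exactly one connection per unit.

    A permutation matrix has all eigenvalues of modulus 1, so scaling
    it by spectral_radius yields that spectral radius exactly.
    """
    rng = np.random.default_rng(seed)
    W = np.zeros((n_units, n_units))
    # Put one weight per row at a randomly permuted column position.
    W[np.arange(n_units), rng.permutation(n_units)] = spectral_radius
    return W
```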
ES2009-63
Non-markovian process modelling with Echo State Networks
Xavier Dutoit, Benjamin Schrauwen, Hendrik Van Brussel
Abstract:
Reservoir Computing (RC) is a relatively recent technique for training Recurrent Neural Networks. It has shown interesting performance in a wide range of tasks despite its simple training rules. We use it here in a logistic regression framework. Considering non-Markovian time series with a hidden variable, we show that RC can be used to estimate the transition probabilities at each time step and also to estimate the hidden variable. We also show that it outperforms classic logistic regression on this task. Finally, it can be used to extract invariants from a stochastic series.
ES2009-17
Stimulus processing and unsupervised learning in autonomously active recurrent networks
Claudius Gros, Gregor Kaczor
Abstract:
Strongly recurrent neural nets may show continuously ongoing self-sustained activity, as is the case for the brain. A new paradigm for learning is needed for such autonomously active neural nets, since standard Hebbian-style online learning would result in uncontrolled reinforcement of accidental activity patterns. Here we propose that autonomously active neural networks processing a time series of stimuli adapt whenever a stimulus successfully influences the ongoing internal dynamics. In this case the incoming stimulus corresponds to a novel signal. We then show that the network effectively performs an unsupervised non-linear independent component analysis of the input data stream. We propose this paradigm to be of relevance for stimulus processing in both natural and artificial neural nets.
ES2009-135
Reservoir computing for static pattern recognition
Mark Embrechts, Luis Alexandre, Jonathan Linton
Abstract:
This paper introduces reservoir computing for static pattern recognition. Reservoir computing networks are neural networks with a sparsely connected recurrent hidden layer (or reservoir) of neurons. The weights from the inputs to the reservoir and the reservoir weights are randomly selected. The weights of the second layer are determined with a linear partial least squares solver. The outputs of the reservoir layer can be considered as an unsupervised data transformation and this stage has a brain-like plausibility. This paper shows that by letting the dynamics of the reservoir evolve to a stable solution, and then applying a sigmoid transfer function, reservoir computing can be applied as a robust and highly accurate pattern classifier. Reservoir computing is applied to 16 difficult multi-class classification benchmark cases, and compared with the best results of state-of-the-art neural network classification methods with entropic error criteria.
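For a static input, letting the reservoir dynamics "evolve to a stable solution" amounts to iterating the state update to its fixed point before reading out: if the spectral norm of the reservoir matrix is below 1, the tanh update is a contraction, so the fixed point is unique and independent of the initial state. A hypothetical sketch of this settling step (the weight construction and tolerances are illustrative, not the paper's setup):

```python
import numpy as np

def settle(W, W_in, u, x0=None, tol=1e-10, max_iter=10_000):
    """Iterate x <- tanh(W_in @ u + W @ x) to its unique fixed point.

    With spectral norm ||W||_2 < 1 the update is a contraction
    (|tanh'| <= 1 everywhere), so the attractor state reached does
    not depend on the starting state x0.
    """
    x = np.zeros(W.shape[0]) if x0 is None else x0
    for _ in range(max_iter):
        x_next = np.tanh(W_in @ u + W @ x)
        if np.max(np.abs(x_next - x)) < tol:
            return x_next
        x = x_next
    raise RuntimeError("reservoir did not settle")
```

The settled state then serves as the fixed-length feature vector for the linear readout, regardless of how the static input was presented.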
ES2009-88
Generalisation of action sequences in RNNPB networks with mirror properties
Raymond Cuijpers, Floran Stuijt, Ida Sprinkhuizen-Kuyper
Abstract:
The human mirror neuron system (MNS) is thought to be involved in the recognition of observed action sequences. However, it remains unclear how such a system could learn to recognise a large variety of action sequences. Here we investigated a neural network with mirror properties, the Recurrent Neural Network with Parametric Bias (RNNPB). We show that the network is capable of recognising noisy action sequences and that it is capable of generalising from a few learnt examples. Such a mechanism may explain how the human brain is capable of dealing with an infinite variety of action sequences.
ES2009-54
Attractor-based computation with reservoirs for online learning of inverse kinematics
R. Felix Reinhart, Jochen J. Steil
Abstract:
We implement completely data-driven and efficient online learning from temporally correlated data in a reservoir network setup. We show that attractor states, rather than transients, are used for computation when learning inverse kinematics for the redundant robot arm PA-10. Our findings also shed light on the role of output feedback.
Classification and fuzzy logic
ES2009-59
Supervised variable clustering for classification of NIR spectra
Catherine Krier, Damien Francois, Fabrice Rossi, Michel Verleysen
Abstract:
Spectrometric data involve very high-dimensional observations representing sampled spectra. The correlation of the resulting spectral variables and their high number are two sources of difficulties in modeling. This paper proposes a supervised feature clustering algorithm that provides dimension reduction for this type of data in a classification context. The new features designed by this method are means of the original spectral variables computed on specific ranges of wavelengths and are therefore easy to interpret. Experiments on real-world data show that the reduction in redundancy and in the number of features leads to better performance, obtained using a very small number of spectral ranges.
ES2009-58
Fuzzy Fleiss-kappa for Comparison of Fuzzy Classifiers
Dietlind Zühlke, Tina Geweniger, Ulrich Heimann, Thomas Villmann
Abstract:
In this paper we show a straightforward extension of the fuzzy Cohen's κ to Fleiss' κ for the determination of classification agreement between fuzzy classifiers. In addition, we investigate the influence of different interpretations of the fuzzy intersection in terms of t-norms. These considerations are illustrated on exemplary artificial data as well as on an image-recognition task for counting pollen grains.
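The ingredients, a kappa statistic generalised by replacing crisp agreement with a fuzzy t-norm intersection of membership values, can be illustrated with a small sketch. This is one plausible fuzzy generalisation of Cohen's κ for two raters, not the paper's exact Fleiss extension to many classifiers, and the membership matrices are invented.

```python
import numpy as np

def t_min(a, b):       # minimum t-norm
    return np.minimum(a, b)

def t_prod(a, b):      # product t-norm
    return a * b

def fuzzy_kappa(U1, U2, tnorm=t_min):
    """Cohen-style kappa for two fuzzy classifiers.

    U1, U2: (n_samples, n_classes) membership matrices whose rows sum to 1.
    Per-sample agreement is the summed t-norm intersection of memberships;
    chance agreement uses the class-wise mean memberships.
    """
    po = tnorm(U1, U2).sum(axis=1).mean()               # observed fuzzy agreement
    pe = tnorm(U1.mean(axis=0), U2.mean(axis=0)).sum()  # agreement expected by chance
    return (po - pe) / (1.0 - pe)

# invented memberships for two fuzzy classifiers on 3 samples, 2 classes
U1 = np.array([[0.8, 0.2], [0.1, 0.9], [0.6, 0.4]])
U2 = np.array([[0.7, 0.3], [0.2, 0.8], [0.5, 0.5]])
k_min = fuzzy_kappa(U1, U2, t_min)
k_prod = fuzzy_kappa(U1, U2, t_prod)
```

Even on this tiny example the choice of t-norm changes the agreement statistic substantially, which is exactly the kind of sensitivity the paper investigates.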
ES2009-23
Lukasiewicz fuzzy logic networks and their ultra low power hardware implementation
Rafal Dlugosz, Witold Pedrycz
Abstract:
In this paper, we propose a new category of current-mode Lukasiewicz OR and AND logic neurons and logic networks and show their ultra low power realization. The introduced circuits can operate with very low input signals that set up the operating point of transistors in the subthreshold region. In this region, the mismatch between transistors has a much stronger impact on the current mirror gain than in the strong inversion region. The proposed solution minimizes this problem by reducing the number of current mirrors between the input and output of the neuron to only one.
ES2009-97
Simultaneous Clustering and Segmentation for Functional Data
Bernard Hugueney, Georges Hébrail, Yves Lechevallier, Fabrice Rossi
Abstract:
We propose in this paper an exploratory analysis algorithm for functional data. The method partitions a set of functions into K clusters and represents each cluster by a piecewise constant prototype. The total number of segments in the prototypes, P, is chosen by the user and optimally distributed into the clusters via two dynamic programming algorithms.
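The single-function building block, finding the best piecewise-constant approximation with a given number of segments by dynamic programming, can be sketched as below. This covers only the segmentation half of the method (the paper's second dynamic program, which distributes the P segments across the K clusters, is omitted), and the data are invented.

```python
import numpy as np

def segment(x, P):
    """Optimal piecewise-constant fit of x with P segments (squared error), by DP."""
    n = len(x)
    c1 = np.concatenate([[0.0], np.cumsum(x)])        # prefix sums
    c2 = np.concatenate([[0.0], np.cumsum(x ** 2)])   # prefix sums of squares

    def cost(i, j):
        # SSE of the best constant fit to x[i..j] (inclusive), in O(1) via prefix sums
        s, s2, m = c1[j + 1] - c1[i], c2[j + 1] - c2[i], j - i + 1
        return s2 - s * s / m

    E = np.full((P + 1, n), np.inf)    # E[k][j]: best error for x[0..j] using k segments
    B = np.zeros((P + 1, n), dtype=int)
    for j in range(n):
        E[1][j] = cost(0, j)
    for k in range(2, P + 1):
        for j in range(k - 1, n):
            for i in range(k - 1, j + 1):              # last segment covers x[i..j]
                e = E[k - 1][i - 1] + cost(i, j)
                if e < E[k][j]:
                    E[k][j], B[k][j] = e, i
    # backtrack the segment start indices
    bounds, j, k = [], n - 1, P
    while k > 1:
        i = B[k][j]
        bounds.append(i)
        j, k = i - 1, k - 1
    return E[P][n - 1], sorted(bounds)

x = np.array([1., 1., 1., 5., 5., 5., 9., 9.])
err, bounds = segment(x, 3)            # three flat pieces, so the fit is exact
```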
Neurosciences
ES2009-50
Cerebellum and spatial cognition: A connectionist approach
Jean-Baptiste Passot, Laure Rondi-Reig, Angelo Arleo
Abstract:
A large body of experimental and theoretical work has investigated the role of the cerebellum in adaptive motor control, movement coordination, and Pavlovian conditioning. Recent experimental findings have also begun to unravel the involvement of the cerebellum in high-level functions such as spatial cognition. We focus on behavioural genetic data suggesting that cerebellar long-term plasticity may mediate the procedural component of spatial learning. We present a spiking neural network model of the cerebellar microcomplex that reproduces these experimental findings. The model brings forth a testable prediction about the interaction between the neural substrates subserving procedural and declarative spatial learning.
ES2009-125
A neural model for binocular vergence control without explicit calculation of disparity
Agostino Gibaldi, Manuela Chessa, Andrea Canessa, Silvio P. Sabatini, Fabio Solari
Abstract:
A computational model for the control of horizontal vergence, based on a population of disparity-tuned complex cells, is presented. The model directly extracts the disparity-vergence response by combining the outputs of the disparity detectors without explicit calculation of the disparity map. The resulting vergence control yields stable fixation and has a short response time over a wide range of disparities. Experimental simulations with synthetic stimuli in depth validate the approach.
Weightless neural systems
ES2009-6
A brief introduction to Weightless Neural Systems
Igor Aleksander, Massimo De Gregorio, Felipe França, Priscila Lima, Helen Morton
Abstract:
Mimicking biological neurons by focusing on the excitatory/inhibitory decoding performed by the dendritic trees is a different and attractive alternative to the integrate-and-fire McCulloch-Pitts neuron stylisation. In this alternative analogy, neurons can be seen as a set of RAM nodes addressed by Boolean inputs and producing Boolean outputs. The shortening of the semantic gap between the synaptic-centric model introduced by the McCulloch-Pitts neuron and the dominant binary digital computational environment is among the interesting benefits of the weightless neural approach. This paper presents an overview of the most representative paradigms of weightless neural systems and corresponding applications, at abstraction levels ranging from pattern recognition to artificial consciousness.
ES2009-28
Phenomenal weightless machines
Igor Aleksander, Helen Morton
Abstract:
This paper describes how early designs of dynamic weightless neural systems were developed to enable some of the states of a state structure to have a phenomenal character. Such states reflect the features of a sensory reality and allow the storage of aspects of sensory experience and access to it. The ‘machine consciousness’ paradigm is summarised in this paper. The paper concludes with a description of the current state-of-the-art of a phenomenal approach to a model of consciousness which is based on the first of a set of introspective axioms.
ES2009-116
Extracting fuzzy rules from “mental” images generated by a modified WISARD perceptron
Bruno Grieco, Priscila Lima, Massimo De Gregorio, Felipe França
Abstract:
The pioneering WISARD weightless neural classifier is based on the collective response of RAM-based neurons. The ability to produce prototypes, analogous to “mental images”, from learned categories was first introduced in the DRASIW model. By counting the frequency of write accesses to each RAM neuron content during the training phase, it is possible to associate the most frequently accessed contents with the corresponding input-field addresses that defined them. This work extracts information from such frequency/location counts in the form of fuzzy rules, as an alternative way to describe the same mental images produced by DRASIW as logical prototypes.
ES2009-10
FPGA-based enhanced probabilistic convergent weightless Network for human Iris recognition
Pierre Lorrentz, Gareth Howells, Klaus McDonald-Maier
Abstract:
This paper investigates how human identification and identity verification can be performed by applying an FPGA-based weightless neural network, the Enhanced Probabilistic Convergent Neural Network (EPCN), to the iris biometric modality. The human iris is processed to extract feature vectors, which are employed to form connectivity during learning and subsequent recognition. The pre-processing of the iris, prior to EPCN training, is minimal. Structural modifications were also made to the Random Access Memory (RAM) based neural network, which enhance its robustness when applied in real time.
ES2009-138
Novel Modular Weightless Neural Architectures for Biometrics-based Recognition
Konstantinos Sirlantzis, Gareth Howells, Bogdan Gherman
Abstract:
We introduce a novel weightless artificial neural architecture based on multiple classifier systems. In this architecture, different modules of a network specialise in recognising specific classes of a multiclass recognition task. Each of these modules comprises individual RAM addresses which store frequency-based probabilistic estimates of how likely it is to observe a given pattern as a feature of the training examples available from a particular class. The class-wise likelihood of observing a combination of addresses is calculated by a sum-based scheme (one of the most commonly used multi-classifier fusion methods). The classification decision is finally obtained by choosing the class with the highest pseudo-posterior probability for an address combination. Tests of our system on a face recognition problem, using Minchinton cell encoding to map regions of interest (ROIs) to the network's input layer, showed very encouraging results.
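The mechanism described, RAM locations holding per-class frequency estimates that are fused across RAMs by the sum rule, can be sketched with a generic frequency-based n-tuple classifier. This is an illustrative reconstruction, not the authors' architecture: the sizes, random wiring, and toy bit patterns are invented, and the Minchinton-cell encoding of ROIs is omitted.

```python
import numpy as np

rng = np.random.default_rng(1)
n_bits, n_tuple = 16, 4
n_rams = n_bits // n_tuple
mapping = rng.permutation(n_bits).reshape(n_rams, n_tuple)  # random input-to-RAM wiring

def addresses(x):
    """One RAM address per tuple of input bits."""
    return [int("".join(str(b) for b in x[m]), 2) for m in mapping]

class FrequencyNTuple:
    """RAM contents hold per-class address frequencies; classes fused by the sum rule."""
    def __init__(self, n_classes):
        self.counts = np.zeros((n_classes, n_rams, 2 ** n_tuple))

    def train(self, x, c):
        for r, a in enumerate(addresses(x)):
            self.counts[c, r, a] += 1

    def classify(self, x):
        n = self.counts.sum(axis=2, keepdims=True) + 1e-9      # per-class totals
        probs = self.counts / n                                # frequency estimates
        score = sum(probs[:, r, a] for r, a in enumerate(addresses(x)))  # sum rule
        return int(np.argmax(score))                           # highest pseudo-posterior

# toy patterns: class 0 mostly zeros, class 1 mostly ones
clf = FrequencyNTuple(2)
for _ in range(20):
    clf.train((rng.random(n_bits) < 0.1).astype(int), 0)
    clf.train((rng.random(n_bits) < 0.9).astype(int), 1)
pred0 = clf.classify(np.zeros(n_bits, dtype=int))
pred1 = clf.classify(np.ones(n_bits, dtype=int))
```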
ES2009-101
Quantum RAM Based Neural Networks
Wilson de Oliveira
Abstract:
A mathematical quantisation of a Random Access Memory (RAM) is proposed, starting from its matrix representation. This quantum RAM (q-RAM) is employed as the neural unit of q-RAM-based Neural Networks, q-RbNN, which can be seen as the quantisation of the corresponding RAM-based ones. The models proposed here are directly realisable in quantum circuits, admit a natural adaptation of the classical learning algorithms, and offer physical feasibility of quantum learning, in contrast to what has been proposed in the literature.
Learning II
ES2009-24
Comparison between linear discrimination analysis and support vector machine for detection of pesticide on spinach leaf by hyperspectral imaging with excitation-emission matrix
Mizuki Tsuta, Gamal El Masry, Takehiro Sugiyama, Kaori Fujita, Junichi Sugiyama
Abstract:
The performances of the support vector machine (SVM) and linear discriminant analysis (LDA) for the detection of pesticide on spinach leaves were investigated. Fluorescence images of spinach leaves without any treatment, treated with pure water, and treated with methamidophos solution were taken under 561 different wavelength conditions to acquire hyperspectral excitation-emission matrix (EEM) data. LDA and SVM were then applied to the EEMs of pixels randomly sampled from the data to classify the treatment. The misclassification rates for LDA and SVM were 18.8% and 9.9%, respectively. It was also found that methamidophos-treated leaves could be distinguished visually from the others after SVM was applied to each pixel of the hyperspectral EEM data.
ES2009-40
SVM-based learning method for improving colour adjustment in automotive basecoat manufacturing
Francisco J. Ruiz, Nuria Agell, Cecilio Angulo
Abstract:
A new iterative method based on Support Vector Machines to perform automated colour adjustment processing in the automotive industry is proposed in this paper. The iterative methodology relies on an SVM trained with patterns provided by expert colourists and an action-generator module. The SVM algorithm enables selecting the most adequate action in each step of an iterated feed-forward loop until the final state satisfies colourimetric bounding conditions. Both the encouraging results obtained and the significant reduction in non-conformance costs justify further industrial efforts to develop an automated software tool for this and similar industrial processes.
ES2009-60
Application of SVM for cell recognition in BCC skin pathology
Tomasz Markiewicz, Stanislaw Osowski, Cezary Jochymski, Joanna Narbutt, Wojciech Kozlowski
Abstract:
The paper presents the application of the Support Vector Machine (SVM) to the recognition of immunopositive and immunonegative cells in basal cell carcinoma. The developed algorithm applies two kinds of SVM: a Gaussian-kernel SVM for direct cell recognition and a linear-kernel SVM as a preprocessing stage for sequential thresholding of the image. The developed computer program was tested on 528 carcinoma images, and the obtained results are in good agreement with the human expert score.
ES2009-87
A neural network model of landmark recognition in the fiddler crab, Uca lactea
Hyunggi Cho, DaeEun Kim
Abstract:
Fiddler crabs, Uca lactea, which live on intertidal mudflats, exhibit a remarkable ability to return to their burrows. It has been reported that the species usually uses path integration, an ideothetic mechanism, for short-range homing. During the mating season, however, the accumulated error of this process increases due to vigorous courtship movements. To compensate, most courting males construct vertical mud structures, called semidomes, at the entrances of their burrows and use them as landmarks. Here, we suggest a possible neural model that demonstrates how visual landmark navigation could be implemented in the fiddler crab's central nervous system. The model, consisting of two levels of neuron populations, is based on the snapshot hypothesis, and a simplified version of Franz's algorithm is used to compute the home vector.
ES2009-82
Classification of high-dimensional data for cervical cancer detection
Charles Bouveyron, Camille Brunet, Vincent Vigneron
Abstract:
In this paper, the performance of different generative methods for the classification of cervical nuclei is compared in order to detect cancer of the cervix. These methods include classical approaches, such as Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA) and Mixture Discriminant Analysis (MDA), as well as a recently developed high-dimensional approach (HDDA). The classification of cervical nuclei presents two main statistical issues, scarce populations and high-dimensional data, which affect the ability to successfully discriminate the different classes. This paper presents an approach that addresses these problems of unbalanced data and high dimensionality.
ES2009-36
Sparse support vector machines by kernel discriminant analysis
Kazuki Iwamura, Shigeo Abe
Abstract:
We discuss sparse support vector machines (SVMs) obtained by selecting linearly independent data in the empirical feature space. First we select training data that maximally separate the two classes in the empirical feature space. As a selection criterion we use linear discriminant analysis in the empirical feature space and select training data by forward selection. Then the SVM is trained in the empirical feature space spanned by the selected training data. We evaluate our method by computer experiments and show that it can realize sparse SVMs with generalization performance comparable to that of regular SVMs.
ES2009-111
Embedding Proximal Support Vectors into Randomized Trees
Cedric Simon, Christophe De Vleeschouwer, Jérôme Meessen
Abstract:
By embedding multiple proximal SVM classifiers into a binary tree architecture, it is possible to turn an arbitrary multi-class problem into a hierarchy of binary classifications. The critical issue then consists in determining, in each node of the tree, how to aggregate the multiple classes into a pair of overlay classes to discriminate. As a fundamental contribution, our paper proposes to deploy an ensemble of randomized trees, instead of a single optimized decision tree, to bypass the question of overlay-class definition. Empirical results on various datasets demonstrate a significant gain in accuracy, both compared to 'one versus one' SVM solutions and to conventional ensembles of decision tree classifiers.
ES2009-99
Echo State networks and Neural network Ensembles to predict Sunspots activity
Friedhelm Schwenker, Amr Labib
Abstract:
Echo state networks (ESNs) and ensembles of neural networks are developed for the prediction of the monthly sunspot series. Both an echo state network and a multilayer perceptron approach were used within the neural network ensembles. Through numerical evaluation on this data it is shown that ESNs outperform feedforward MLPs. Furthermore, it is shown that median fusion leads to robust predictors and can even improve the prediction accuracy of the best individual predictors.
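The median-fusion step described in the abstract can be illustrated with a minimal sketch (the ensemble members below are hypothetical arrays of precomputed predictions; in the paper the rows would come from trained ESN and MLP predictors):

```python
import numpy as np

# Hypothetical predictions of an ensemble of 5 predictors
# for 4 time steps (rows: predictors, columns: time steps).
preds = np.array([
    [0.90, 1.10, 1.00, 1.20],
    [1.00, 1.00, 1.10, 1.10],
    [1.10, 0.90, 5.00, 1.00],   # third value is an outlier
    [0.95, 1.05, 1.00, 1.15],
    [1.05, 0.95, 1.05, 1.05],
])

mean_fused = preds.mean(axis=0)          # sensitive to the outlier
median_fused = np.median(preds, axis=0)  # robust to the outlier
```

The median ignores the outlying 5.00 at the third step, which is the robustness property the abstract refers to.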
ES2009-96
Monotonic Recurrent Bounded Derivative Neural Network
Alexey Minin, Bernhard Lang
Abstract:
Neural networks applied in control loops and safety-critical domains have to meet hard requirements: first, a small approximation error; second, smoothness and monotonicity of selected input-output relations; and finally, time dependencies in time series should be incorporated into the model. Otherwise the stability of the control laws can be lost. A new monotonic Recurrent Bounded Derivative Network (RBDN), built on the basis of the Bounded Derivative Network (BDN), is considered. We compare the two networks and investigate the influence of the recurrent connection as well as the stability and monotonicity of the new network.
ES2009-73
Modeling pigeon behavior using a Conditional Restricted Boltzmann Machine
Matthew Zeiler, Graham Taylor, Nikolaus Troje, Geoffrey Hinton
Abstract:
In an effort to better understand the complex courtship behaviour of pigeons, we have built a model learned from motion capture data. We employ a Conditional Restricted Boltzmann Machine with binary latent features and real-valued visible units. The units are conditioned on information from previous time steps to capture dynamics. We validate a trained model by quantifying the characteristic "head-bobbing" present in pigeons. We also show how to predict missing data by marginalizing out the hidden variables and minimizing free energy.
ES2009-22
Connection strategy and performance in sparsely connected 2D associative memory models with non-random images
Lee Calcraft, Rod Adams, Neil Davey
Abstract:
A sparsely connected associative memory model is tested with different pattern sets, and it is found that pattern recall is highly dependent on the type of patterns used. Performance is also found to depend critically on the connection strategy used to build the networks. Comparisons of topology reveal that connectivity matrices based on Gaussian distributions perform well for all pattern types tested, and that for best pattern recall at low wiring cost, the optimal width of the Gaussian used in creating the connection matrix depends on properties of the pattern set.
ES2009-44
Zero phase-lag synchronization through short-term modulations
Thomas Burwick
Abstract:
Considering coupled phase model oscillator systems with non-identical time delays, we study the possibility of close-to-zero phase-lag synchronization (ZPS) without frequency depression (FD). FD refers to nearly vanishing frequencies of the synchronized oscillators (in comparison to the intrinsic frequencies); its absence is crucial for interpretations related to brain dynamics. Discussing an extension of the Kuramoto model, it is demonstrated that ZPS without FD may arise by allowing for dynamical parameters. Two models are presented: one is based on short-term modulation of the delays, while the other assumes static delays but short-term modulation of coupling strengths. We also speculate on the possible relevance of such mechanisms with respect to assembly formation by relating the frequency of the synchronized oscillation to recently proposed pattern frequency bands.
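As a rough illustration of the kind of dynamics discussed here, the following sketch simulates a classical (delay-free) Kuramoto model with a short-term modulated coupling strength; the paper's actual models with non-identical delays are more involved, and all numbers below are illustrative:

```python
import numpy as np

def kuramoto_step(theta, omega, K, dt=0.01):
    """One Euler step of the classical Kuramoto model:
    dtheta_i/dt = omega_i + (K/N) * sum_j sin(theta_j - theta_i)."""
    n = len(theta)
    coupling = (K / n) * np.sum(np.sin(theta[None, :] - theta[:, None]), axis=1)
    return theta + dt * (omega + coupling)

rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 20)      # random initial phases
omega = rng.normal(1.0, 0.05, 20)          # near-identical intrinsic frequencies
for t in range(5000):
    K = 2.0 + np.sin(0.01 * t)             # illustrative short-term modulation of coupling
    theta = kuramoto_step(theta, omega, K)

# Order parameter r close to 1 indicates near-zero phase lags.
r = abs(np.mean(np.exp(1j * theta)))
```

With coupling well above the synchronization threshold, the order parameter ends up close to 1 while the oscillation frequency stays near the intrinsic frequencies, i.e. synchronization without frequency depression.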
ES2009-48
Learning reconstruction and prediction of natural stimuli by a population of spiking neurons
Michael Gutmann, Aapo Hyvärinen
Abstract:
We propose a model for learning representations of time-dependent data with a population of spiking neurons. Encoding is based on a standard spiking neuron model, and the spike timings of the neurons represent the stimulus. Learning is based on the sole principle of maximization of representation accuracy: the stimulus can be decoded from the spike timings with minimum error. Since the encoding is causal, we propose two different representation strategies: the spike timings represent the stimulus either in a predictive manner or by reconstructing past input. We apply the model to speech data and discuss differences between the emergent representations.
Brain Computer Interfaces: from theory to practice
ES2009-5
Brain-Computer Interfaces: from theory to practice
Dieter Devlaminck, Bart Wyns, Luc Boullart, Patrick Santens, Georges Otte
Abstract:
Brain-Computer Interfaces (BCIs) are a new kind of human-machine interface emerging on the horizon. They form a communication pathway between the brain and a machine, achieved by measuring brain signals and translating them directly into control commands. Such a system allows people with severe motor disabilities to manipulate their environment in an alternative way. However, much work remains to make BCIs usable in daily life. In this contribution we give a tutorial overview of existing methods and possible applications.
ES2009-64
Oscillation in a network model of neocortex
Wim van Drongelen, Hyong Lee, Amber Martell, Jennifer Dwyer, Rick Stevens, Mark Hereld
Abstract:
A basic understanding of the relationships between the activity of individual neurons and the macroscopic electrical activity of local field potentials or the electroencephalogram (EEG) may provide guidance for experimental design in neuroscience, improve the development of therapeutic approaches in neurology, and offer opportunities for computer-aided design of brain-computer interfaces. Here, we study the relationship between subthreshold resonant properties of cortical neurons and the onset and offset of network oscillations in a computational model of neocortex. This model includes two types of pyramidal cells and four types of inhibitory interneurons and is capable of generating network oscillations and bursting activity. Our findings suggest that neuronal resonance is associated with subthreshold oscillation of neurons. This subthreshold behavior affects spike timing and therefore plays a significant role in the generation of the network's extracellular currents as reflected in the EEG. In addition, we find that electrical stimulation to stop bursting in a network is most effective around the resonant frequency of the neurons.
ES2009-51
Sensors selection for P300 speller brain computer interface
Bertrand Rivet, Antoine Souloumiac, Guillaume Gibert, Virginie Attina, Olivier Bertrand
Abstract:
Brain-computer interfaces (BCIs) are communication systems that use brain activity to control a device. The BCI studied here is based on the P300 speller [1]. A new algorithm to select relevant sensors is proposed: it is based on a previously proposed algorithm [2] used to enhance P300 potentials by spatial filters. Data recorded from three subjects were used to evaluate the proposed selection method: it is shown to be efficient and to compare favourably with a reference method [3].
ES2009-139
Multiclass brain computer interface based on visual attention
Rolando Grave de Peralta Menendez, Jorge Dias, José Augusto Soares Prado, Hadi Aliakbarpour, Sara Gonzalez Andino
Abstract:
Recent public demonstrations showed that a system based on imagination does not always work [1]. On the other hand, predicting limb movement from scalp activity has proved to be hazardous [2], and thus other alternatives are needed. This paper describes the asynchronous Geneva-BCI, based on EEG and visual attention to external stimuli, which is able to send commands every 0.5 (or 0.25) seconds with a very high correct classification rate (98.88%) and an optimal theoretical bit rate (178 bits/min). This high performance allows the distant real-time control of robots using four commands.
ES2009-102
Brain Computer Interface for Virtual Reality Control
Christoph Guger, Clemens Holzner, Christoph Groenegress, Günter Edlinger, Mel Slater
Abstract:
An electroencephalogram (EEG) based brain-computer interface (BCI) was connected to a Virtual Reality system in order to control a smart home application. To this end, special control masks were developed that allow using the P300 component of the EEG as the input signal for the BCI system. Control commands for switching TV channels, for opening and closing doors and windows, and for navigation and conversation were realized. Experiments with 12 subjects were conducted to investigate the speed and accuracy that can be achieved when several hundred commands are used to control the smart home environment. The study clearly shows that such a BCI system can be used for smart home control. The Virtual Reality approach is a very cost-effective way of testing the smart home environment together with the BCI system.
ES2009-20
The Possibility of Single-trial Classification of Viewed Characters using EEG Waveforms
Minoru Nakayama, Hiroshi Abe
Abstract:
Electroencephalograms (EEGs) contain responses to visual stimuli; however, signal noise often prevents these from being easily obtained. To classify EEG waveforms, a signal processing procedure using the relationship between the EEG and the ERP, which is the summation of EEG waveforms, was developed. The processing technique involves the prediction of signals using Support Vector Regression. The procedure was developed and applied to a Kanji recognition task, classifying viewed characters as symbols or Kanji. Classification accuracy using EEG waveforms with and without ERP references was compared: accuracy with references was significantly above chance and higher than without references.
ES2009-52
Exploring the impact of alternative feature representations on BCI classification
Ali Bahramisharif, Marcel van Gerven, Tom Heskes
Abstract:
Classification performance in BCIs depends heavily on the features that are used as input to the employed classifier. If the BCI signal is extended in time, we may use as input to the classifier either a representation of the signal at multiple time segments, with a high risk of overfitting, or a representation averaged over time, with a high risk of underfitting. In this paper we present an empirical study which allows us to determine the right balance between these two representations. Using two BCI data sets, we show that our method can significantly improve classification performance.
ES2009-77
Uncued brain-computer interfaces: a variational hidden markov model of mental state dynamics
Cédric Gouy-Pailler, Jérémie Mattout, Marco Congedo, Christian Jutten
Abstract:
This paper describes a method to improve uncued Brain-Computer Interfaces based on motor imagery. Our algorithm aims at filtering the continuous classifier output by incorporating prior knowledge about the mental state dynamics. On dataset IVb of BCI competition III, we compare the performance of four methods, combining either the smoothed probabilities filtered by our algorithm or the direct classifier output with either a static or a dynamic classifier. We demonstrate that the combination of our algorithm with a dynamic classifier yields the best results.
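The general idea of filtering classifier output with a prior on mental state dynamics can be sketched with a plain HMM forward recursion (the paper's variational treatment is richer; the transition matrix and per-frame classifier outputs below are made up for illustration):

```python
import numpy as np

def forward_filter(likelihoods, trans, prior):
    """Standard HMM forward recursion: filter per-frame class
    likelihoods with a transition prior over mental states."""
    alpha = prior * likelihoods[0]
    alpha /= alpha.sum()
    filtered = [alpha]
    for lik in likelihoods[1:]:
        alpha = lik * (trans.T @ alpha)  # predict with the prior, then update
        alpha /= alpha.sum()
        filtered.append(alpha)
    return np.array(filtered)

# Hypothetical per-frame classifier outputs for 2 mental states;
# the third frame looks like a momentary misclassification.
lik = np.array([[0.90, 0.10], [0.80, 0.20], [0.40, 0.60], [0.85, 0.15]])
trans = np.array([[0.95, 0.05], [0.05, 0.95]])  # mental states are persistent
prior = np.array([0.5, 0.5])

smoothed = forward_filter(lik, trans, prior)
```

The persistence prior keeps the filtered estimate on state 0 at the third frame even though the raw classifier output momentarily favours state 1, which is the smoothing effect the abstract describes.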
ES2009-93
Decoding finger flexion using amplitude modulation from band-specific ECoG
Nanying Liang, Laurent Bougrain
Abstract:
EEG-based BCIs have been well studied in the past decades and implemented in several well-known applications, such as the P300 speller and wheelchair controllers. However, these interfaces are indirect due to the low spatial resolution of EEG. Recently, direct ECoG-based BCIs have attracted intense attention because ECoG provides higher spatial resolution and signal quality, making it possible to localize the source of neural signals with respect to certain brain functions. In this article, we present a realization of an ECoG-based BCI for finger flexion prediction on the data provided by BCI competition IV. Methods for finger flexion prediction, including feature extraction and selection, are described. Results show that the predicted finger movement is highly correlated with the true movement when band-specific amplitude modulation is used.
ES2009-103
Neural network pruning for feature selection - Application to a P300 Brain-Computer Interface
Hubert Cecotti, Axel Graeser
Abstract:
A Brain-Computer Interface (BCI) is an interface that enables direct communication between humans and machines by analyzing brain measurements. A P300 speller is based on the oddball paradigm, which generates an event-related potential (ERP), like the P300 wave, on targets selected by the user. The detection of these P300 waves allows characters to be selected visually on the screen. We present a new model for the detection of P300 waves. The technique is based on a neural network that uses convolution layers for creating channels. One practical challenge in improving BCIs is to reduce the number of electrodes and to select the best electrodes for each subject. We propose a feature selection strategy based on salient connections in the first hidden layer of a neural network trained with all the electrodes as input. A new classifier is created from the remaining topology and the desired number of electrodes for the system. The recognition rate of the P300 speller over 2 subjects is 87% when considering only 8 electrodes.
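The electrode-selection idea — keep the inputs with the most salient first-layer connections — can be sketched as follows (the weight matrix is synthetic; in the paper the weights come from a trained convolutional network):

```python
import numpy as np

def select_electrodes(W, k):
    """Rank input electrodes by the total absolute weight of their
    connections into the first hidden layer, and keep the top k.
    W has shape (n_hidden, n_electrodes)."""
    saliency = np.abs(W).sum(axis=0)
    return np.argsort(saliency)[::-1][:k]

# Hypothetical trained first-layer weights: 4 hidden units, 6 electrodes.
rng = np.random.default_rng(1)
W = rng.normal(0.0, 0.1, (4, 6))
W[:, 2] += 1.0   # electrode 2 made artificially salient
W[:, 5] -= 1.0   # electrode 5 as well

selected = select_electrodes(W, 2)
```

A reduced classifier would then be retrained using only the retained electrodes as input, as the abstract outlines.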
ES2009-115
Augmenting Information from Brain-Computer Interfaces through Bayesian Plan Recognition
Eric Demeester, Alexander Huntemann, Jose del R. Millan, Hendrik Van Brussel
Abstract:
For severely disabled people, Brain-Computer Interfaces (BCIs) may provide the means to regain mobility and manipulation capabilities. However, information obtained from current BCIs is uncertain and of limited bandwidth and resolution. This paper presents a Bayesian framework that estimates from uncertain BCI signals a richer representation of the task a robotic mobility or manipulation device should execute, such that these devices can be operated more safely, accurately and efficiently. The framework has been evaluated on a simulated robotic wheelchair.
Generative and bayesian models
ES2009-57
Heterogeneous mixture-of-experts for fusion of locally valid knowledge-based submodels
Jörg Beyer, Kai Heesche, Werner Hauptmann, Clemens Otte
Abstract:
Real-world applications often require the joint use of data-driven and knowledge-based models. While data-driven models are learned from available process data, knowledge-based models are able to provide additional information not contained in the data. In this contribution, we propose a method to divide the input space on the basis of the validity ranges of the knowledge-based models, so that each knowledge-based model is active only in the domain it was designed for. The data-driven models complete the coverage of the input space. We demonstrate the benefits of our approach on a real-world application: the energy management of a hybrid electric vehicle.
ES2009-107
Dirichlet process-based component detection in state-space models
Botond Bocsi, Lehel Csato
Abstract:
An extension of switching-state models that allows an arbitrary number of components is presented. We introduce a Dirichlet process prior over the mixture components of the linear models, which allows inference on the number of linear models in the mixture. We develop a distance measure on the space of linear Kalman filters using the Kullback-Leibler divergence between the conditional probabilities induced by the individual Kalman filters. This distance measure allows us to remove components that are no longer relevant, making the algorithm more effective. We test the proposed algorithm on both artificial and real-world data.
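The distance construction rests on the closed-form KL divergence between the Gaussians that individual Kalman filters induce; a minimal symmetrized version might look like this (means and covariances below are illustrative):

```python
import numpy as np

def kl_gauss(mu0, S0, mu1, S1):
    """Closed-form KL divergence KL(N0 || N1) between two
    multivariate Gaussians N0 = N(mu0, S0) and N1 = N(mu1, S1)."""
    d = len(mu0)
    S1_inv = np.linalg.inv(S1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(S1_inv @ S0)
                  + diff @ S1_inv @ diff
                  - d
                  + np.log(np.linalg.det(S1) / np.linalg.det(S0)))

def symmetric_kl(mu0, S0, mu1, S1):
    """Symmetrized KL, usable as a (non-metric) distance between
    the Gaussians induced by two Kalman filters."""
    return kl_gauss(mu0, S0, mu1, S1) + kl_gauss(mu1, S1, mu0, S0)

mu = np.zeros(2)
S = np.eye(2)
d_self = symmetric_kl(mu, S, mu, S)          # 0 for identical filters
d_other = symmetric_kl(mu, S, mu + 1.0, S)   # positive otherwise
```

Components whose distance to every other component stays large while their posterior weight vanishes are the natural candidates for removal.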
ES2009-29
A variational radial basis function approximation for diffusion processes
Michail Vrettas, Dan Cornford, Yuan Shen
Abstract:
In this paper we present a radial basis function based extension to a recently proposed variational algorithm for approximate inference in diffusion processes. Inference for the state and, in particular, the (hyper-)parameters of diffusion processes is a challenging and crucial task. We show that the new radial basis function approximation converges to the original algorithm and has beneficial characteristics when estimating (hyper-)parameters. We validate our new approach on a non-linear double-well potential dynamical system.
ES2009-106
A regression model with a hidden logistic process for signal parametrization
Faicel Chamroukhi, Allou Samé, Gérard Govaert, Patrice Aknin
Abstract:
A new approach for signal parametrization, which consists of a specific regression model incorporating a discrete hidden logistic process, is proposed. The model parameters are estimated by the maximum likelihood method performed by a dedicated Expectation Maximization (EM) algorithm. The parameters of the hidden logistic process, in the inner loop of the EM algorithm, are estimated using a multi-class Iterative Reweighted Least-Squares (IRLS) algorithm. An experimental study using simulated and real data reveals good performance of the proposed approach.
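The IRLS idea used in the inner loop above can be illustrated in the simpler binary case. A sketch under the assumption of a plain logistic model (the paper's inner loop uses the multi-class analogue, and the small ridge term is an illustrative numerical safeguard):

```python
import numpy as np

def irls_logistic(X, y, iters=25):
    """Binary logistic regression fitted by Iteratively Reweighted
    Least Squares (Newton's method on the log-likelihood)."""
    X = np.hstack([np.ones((len(X), 1)), X])   # prepend intercept column
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        mu = 1.0 / (1.0 + np.exp(-X @ w))      # current probabilities
        S = mu * (1.0 - mu)                    # IRLS weights
        # Newton step: w += (X^T S X)^{-1} X^T (y - mu)
        w += np.linalg.solve(X.T @ (S[:, None] * X) + 1e-8 * np.eye(X.shape[1]),
                             X.T @ (y - mu))
    return w

def predict_proba(X, w):
    X = np.hstack([np.ones((len(X), 1)), X])
    return 1.0 / (1.0 + np.exp(-X @ w))
```

Each iteration solves a weighted least-squares problem, hence the name; the multi-class version replaces the scalar weights with block matrices over the class probabilities.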
Neural maps and learning vector quantization - theory and applications
ES2009-8
Neural Maps and Learning Vector Quantization - Theory and Applications
Frank-Michael Schleif, Thomas Villmann
Abstract:
Neural maps and Learning Vector Quantizers are fundamental paradigms in neural vector quantization based on Hebbian learning. The beginnings of this field date back over twenty years, with strong progress in theory and outstanding applications. Their success lies in their robustness and simplicity in application, whereas the mathematics behind them is rather difficult. We provide an overview of recent achievements and current trends in ongoing research.
ES2009-85
Hyperparameter Learning in Robust Soft LVQ
Petra Schneider, Michael Biehl, Barbara Hammer
Abstract:
We present a technique to extend Robust Soft Learning Vector Quantization (RSLVQ). This algorithm is derived from an explicit cost function and follows the dynamics of a stochastic gradient ascent. The RSLVQ cost function involves a hyperparameter which is kept fixed during training. We propose to adapt the hyperparameter based on the gradient information. Experiments on artificial and real-life data show that the hyperparameter crucially influences the performance of RSLVQ. However, it is not possible to estimate the best value from the data prior to learning. We show that the proposed variant of RSLVQ is very robust with respect to the choice of the hyperparameter.
ES2009-66
Median Variant of Fuzzy c-Means
Tina Geweniger, Dietlind Zühlke, Barbara Hammer, Thomas Villmann
Abstract:
In this paper we introduce Median Fuzzy C-Means (M-FCM). This algorithm extends the Median C-Means (MCM) algorithm by allowing fuzzy values for the cluster assignments. To evaluate the performance of M-FCM we compare the results with the clustering obtained by employing MCM and Median Neural Gas (MNG).
ES2009-108
Topologically Ordered Graph Clustering via Deterministic Annealing
Fabrice Rossi, Nathalie Villa
Abstract:
This paper proposes an organized generalization of Newman and Girvan's modularity measure for graph clustering. Optimized via a deterministic annealing scheme, this measure produces topologically ordered graph partitions that lead to faithful and readable graph representations.
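For reference, the (unordered) Newman-Girvan modularity that this paper generalizes can be computed directly from the adjacency matrix. A minimal sketch, assuming an undirected, unweighted graph:

```python
import numpy as np

def modularity(A, labels):
    """Newman-Girvan modularity Q of a partition of an undirected graph.
    A: symmetric adjacency matrix; labels: cluster index of each node.
    Q = (1/2m) * sum_ij [A_ij - k_i k_j / (2m)] * delta(c_i, c_j)."""
    m2 = A.sum()                              # 2m: twice the edge count
    k = A.sum(axis=1)                         # node degrees
    same = np.equal.outer(labels, labels)     # delta(c_i, c_j) mask
    return ((A - np.outer(k, k) / m2) * same).sum() / m2
```

For two triangles joined by a single edge, the natural two-cluster partition scores Q = 5/14, while the trivial all-in-one partition scores Q = 0; a good clustering maximizes this quantity, and the paper's contribution is to optimize a topologically ordered generalization of it via deterministic annealing.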
ES2009-39
Equilibrium properties of off-line LVQ
Aree Witoelar, Michael Biehl, Barbara Hammer
Abstract:
The statistical physics analysis of off-line learning is applied to cost-function-based learning vector quantization (LVQ) schemes. Typical learning behavior is obtained from a model with data drawn from high-dimensional Gaussian mixtures and a system of two or three prototypes. The analytic approach becomes exact in the limit of high training temperature. We study two cost-function-related LVQ algorithms and the influence of an appropriate weight decay. In our findings, learning from mistakes (LFM) achieves poor generalization ability, while a limiting case of generalized LVQ (GLVQ), termed LVQ+/-, displays much better performance with a properly chosen weight decay.
ES2009-49
Kernelizing Vector Quantization Algorithms
Matthieu Geist, Olivier Pietquin, Gabriel Fricout
Abstract:
The kernel trick is a well-known approach for implicitly casting a linear method into a nonlinear one by replacing every dot product with a kernel function. However, few vector quantization algorithms have been kernelized. Indeed, they usually require computing linear transformations (e.g. moving prototypes), which are not easily kernelizable. This paper introduces the Kernel-based Vector Quantization (KVQ) method, which allows working in an approximation of the feature space and thus kernelizing any Vector Quantization (VQ) algorithm.
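The difficulty the abstract points to can be made concrete: distances in the implicit feature space are expressible with kernel evaluations only, while moving a prototype in that space is not. A sketch of the kernelizable part (the RBF kernel and the names are illustrative choices):

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    """Gaussian RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

def feature_space_dist2(x, y, k=rbf):
    """Squared distance between phi(x) and phi(y) in the implicit feature
    space, using kernel evaluations only (the kernel trick):
    ||phi(x) - phi(y)||^2 = k(x,x) - 2 k(x,y) + k(y,y)."""
    return k(x, x) - 2.0 * k(x, y) + k(y, y)
```

Finding the winning prototype only needs such distances, but the updated prototype phi-space point generally has no pre-image in input space, which is what motivates working in an approximation of the feature space as KVQ does.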
ES2009-134
A computational framework for exploratory data analysis
Axel Wismueller
Abstract:
We introduce the Exploration Machine (Exploratory Observation Machine, XOM) as a novel versatile method for the analysis of multidimensional data. XOM systematically inverts structural and functional components of so-called topology-preserving mappings. It provides a surprising flexibility to simultaneously contribute to complementary domains of unsupervised learning for exploratory pattern analysis, namely both structure-preserving dimensionality reduction and data clustering. We demonstrate XOM’s applicability to synthetic and real-world data.
Learning III
ES2009-92
SOM based methods in early fault detection of nuclear industry
Miki Sirola, Jaakko Talonen, Golan Lampi
Abstract:
Early fault detection in the nuclear industry is studied. Tools have been developed for control room operators and experts in an industrial project. The Self-Organizing Map (SOM) method has been used in combination with other methods. Decision support visualizations are introduced. The usability of the methods has been tested and verified by constructing prototype systems. The use of the SOM method in dynamic systems is discussed. Applications for the industrial domain are presented. Data sets from a Finnish nuclear power plant have been analyzed. Promising results in failure management are achieved.
ES2009-131
Projection of undirected and non-positional graphs using Self Organizing Maps
Markus Hagenbuchner, ShuJia Zhang, Ah Chung Tsoi, Alessandro Sperduti
Abstract:
Kohonen's Self-Organizing Map is a popular method which allows the projection of high dimensional data onto a low dimensional display space. Models of Self-Organizing Maps for the treatment of graphs have also been defined and studied. This paper proposes an extension to the GraphSOM model which substantially improves the stability of the model, and, as a side effect, allows for an acceleration of training. The proposed extension is based on a soft encoding of the information needed to represent the vertices of an input graph. Experimental results versus the original GraphSOM model demonstrate the advantages of the proposed extension.
ES2009-30
Hardware Implementation Issues of the Neighborhood Mechanism in Kohonen Self Organized Feature Maps
Marta Kolasa, Rafal Dlugosz
Abstract:
In this paper, we discuss the important problem of selecting the neighborhood radius in the learning schemes of the Winner Takes Most Kohonen neural network. Optimizing this parameter is essential for hardware realizations of the network, since lower values of the radius can reduce both power dissipation and chip area, by as much as 40-60%, which matters for the application of such networks in low-power devices. The simulation studies reveal that large initial values of the neighborhood radius are usually not optimal. For a wide range of training parameters, optimal values of the neighborhood radius, usually small ones, can be identified that minimize the quantization error.
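How the radius enters the learning scheme can be sketched with a minimal one-dimensional Kohonen map (the parameter values and the Gaussian neighborhood shape are illustrative assumptions; the paper studies hardware-oriented Winner-Takes-Most variants with a discrete neighborhood):

```python
import numpy as np

rng = np.random.default_rng(0)

def train_som(data, grid=10, radius=1.0, lr=0.5, epochs=20):
    """Minimal 1-D Kohonen map with a Gaussian neighborhood of fixed
    radius; a larger radius updates more neurons around the winner."""
    W = rng.uniform(data.min(), data.max(), size=(grid, data.shape[1]))
    pos = np.arange(grid)                      # neuron positions on the map
    for _ in range(epochs):
        for x in rng.permutation(data):
            winner = np.argmin(np.sum((W - x) ** 2, axis=1))
            h = np.exp(-((pos - winner) ** 2) / (2 * radius ** 2))
            W += lr * h[:, None] * (x - W)     # neighborhood-weighted update
    return W

def quantization_error(data, W):
    """Mean squared distance from each sample to its nearest prototype."""
    return np.mean([np.min(np.sum((W - x) ** 2, axis=1)) for x in data])
```

In hardware, the radius determines how many neighbor neurons must be reached per update, which is why shrinking it saves both power and chip area.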
ES2009-31
Reconciling neural fields to self-organization
Lucian Alecu, Hervé Frezza-Buet
Abstract:
Despite being successfully used in the design of various biologically-inspired applications, the paradigm of dynamic neural fields (DNF) does not seem to have been exploited to its full potential yet. Partly because of the difficulties of a comprehensive theoretical study, essential aspects such as learning mechanisms have rarely been addressed in the literature. In this paper, we first show that classical DNF equations fail to offer reliable support for self-organization, unveiling some behavioral issues that prevent the fields from achieving this goal. As an alternative, we then propose a new DNF equation that is indeed capable of deploying a self-organizing mechanism based on neural fields.
ES2009-126
Applying Mutual Information for Prototype or Instance Selection in Regression Problems
Alberto Guillen, Luis Javier Herrera, Gines Rubio, Héctor Pomares, Amaury Lendasse, Ignacio Rojas
Abstract:
The problem of selecting the patterns to be learned by a model is usually treated not at model design time but as a preprocessing step. Information theory provides a robust theoretical framework for input variable selection thanks to the concept of mutual information. This paper presents a new application of mutual information: not to select variables, but to decide which prototypes should belong to the training data set in regression problems. The proposed methodology decides whether a prototype should belong to the training set using the estimated mutual information between the variables as the criterion. The novelty of the approach is its focus on prototype selection for regression problems rather than classification, which is what the majority of the literature addresses. Another element that distinguishes this work from others is that it is proposed not as an outlier identifier but as an algorithm that determines the best subset of input vectors for building a model to approximate them. As the experimental section shows, the new method is able to identify a high percentage of the real data set when applied to highly distorted data sets.
ES2009-43
Forward feature selection using Residual Mutual Information
Erik Schaffernicht, Christoph Möller, Klaus Debes, Horst-Michael Gross
Abstract:
In this paper, we propose a hybrid filter/wrapper approach for fast feature selection using the Residual Mutual Information (RMI) between a classifier output and the remaining features as selection criterion. This approach can handle redundancies in the data as well as the bias of the employed learning machine while keeping the number of required evaluation cycles low. In classification experiments, we compare the Residual Mutual Information algorithm with other basic approaches for feature subset selection that use similar selection criteria. The efficiency and effectiveness of our method are demonstrated by the obtained results on UCI datasets.
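The mutual-information criterion that both of these selection papers build on can be estimated from samples. A minimal histogram-based sketch for two scalar variables (the bin count is an illustrative choice, and more refined estimators exist):

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram estimate of I(X;Y) in nats for 1-D samples x, y:
    I(X;Y) = sum p(x,y) * log(p(x,y) / (p(x) p(y)))."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()                           # joint distribution
    px = pxy.sum(axis=1, keepdims=True)        # marginal of X
    py = pxy.sum(axis=0, keepdims=True)        # marginal of Y
    nz = pxy > 0                               # avoid log(0) terms
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())
```

In a forward-selection loop of the kind described above, one would, for example, evaluate this quantity between each remaining feature and the residual of the current classifier, and greedily pick the maximizer.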
ES2009-68
Gaussian Mixture Models for multiclass problems with performance constraints
Nisrine Jrad, Edith Grall-Maes, Pierre Beauseroy
Abstract:
This paper proposes a method that uses labelled data to learn a decision rule for multiclass problems with class-selective rejection and performance constraints. The method is based on class-conditional density estimates obtained using Gaussian Mixture Models (GMM). The rule is determined by plugging these estimates into the statistical hypothesis framework and solving an optimization problem. Two simulations are carried out to corroborate the efficiency of the proposed method. Experimental results show that it compares well with a non-parametric solution using the Parzen estimator.
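The plug-in construction is easy to sketch: estimate class-conditional densities, then pick the class maximizing prior times likelihood. For brevity this sketch uses a single Gaussian per class rather than a full mixture, and omits the class-selective rejection and performance constraints that are the paper's actual contribution:

```python
import numpy as np

def fit_class_gaussians(X, y):
    """Fit one Gaussian per class (a 1-component 'mixture' for brevity)."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (len(Xc) / len(X),                             # prior
                     Xc.mean(axis=0),                              # mean
                     np.cov(Xc, rowvar=False) + 1e-6 * np.eye(X.shape[1]))
    return params

def log_gauss(x, m, S):
    """Log density of N(m, S) at x."""
    d = x - m
    return -0.5 * (d @ np.linalg.solve(S, d)
                   + np.log(np.linalg.det(S))
                   + len(m) * np.log(2 * np.pi))

def predict(x, params):
    """Plug-in Bayes rule: argmax_c  prior_c * p(x | c)."""
    scores = {c: np.log(p) + log_gauss(x, m, S)
              for c, (p, m, S) in params.items()}
    return max(scores, key=scores.get)
```

Class-selective rejection would replace the plain argmax with a decision over subsets of classes, constrained by the required error rates.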
ES2009-25
On the routing complexity of neural network models - Rent's Rule revisited
Johannes Partzsch, Rene Schüffny
Abstract:
In most models of spiking neural networks, routing complexity and scalability have not been taken into account. In this paper, we analyse recent neural network models on their routing complexity, using a method from circuit design known as Rent's Rule. We find a high complexity in most of the models for a wide range of connectivity levels. As a consequence, these models do not scale well in a two- or three-dimensional substrate, such as neuromorphic hardware or the brain.
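Rent's Rule relates the number of terminals T of a sub-circuit to the number of blocks G it contains via T = t · G^p. The exponent p, which serves as the complexity measure here, can be estimated by a log-log fit; a sketch (the recursive partitioning step that produces the (G, T) pairs from a network is omitted):

```python
import numpy as np

def rent_exponent(gates, terminals):
    """Estimate Rent's parameters from T = t * G**p by a least-squares
    fit in log-log space; returns (t, p)."""
    logG, logT = np.log(gates), np.log(terminals)
    p, logt = np.polyfit(logG, logT, 1)   # slope = p, intercept = log(t)
    return np.exp(logt), p
```

Exponents close to 1 indicate connectivity that cannot be embedded efficiently in a low-dimensional substrate, which is the scaling problem the paper identifies in many network models.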