
A Trainable Feature Extractor Module for Deep Neural Networks and Scanpath Classification

Abstract: Scanpath classification is an area in eye tracking research with possible applications in medicine, manufacturing as well as training systems for students in various domains. In this paper we propose a trainable feature extraction module for deep neural networks. The purpose of this module is to transform a scanpath into a feature vector which is directly usable for the deep neural network architecture. Based on the backpropagated error of the deep neural network, the feature extraction module adapts its parameters to improve the classification performance. Therefore, our feature extraction module is jointly trainable with the deep neural network. The motivation for this feature extraction module is based on classical histogram-based approaches which usually compute distributions over a scanpath. We evaluated our module on three public datasets and compared it to the state-of-the-art approaches.

Submitted 19 March, 2024; originally announced March 2024.

arXiv:2403.11665 [pdf, other]

Normalized Validity Scores for DNNs in Regression based Eye Feature Extraction

Authors: Wolfgang Fuhl

Abstract: We propose an improvement to the landmark validity loss. Landmark detection is widely used in head pose estimation, eyelid shape extraction, as well as pupil and iris segmentation. There are numerous additional applications where landmark detection is used to estimate the shape of complex objects. One part of this process is the accurate and fine-grained detection of the shape. The other part is the validity or inaccuracy per landmark, which can be used to detect unreliable areas, where the shape possibly does not fit, and to improve the accuracy of the entire shape extraction by excluding inaccurate landmarks. We propose a normalization in the loss formulation, which improves the accuracy of the entire approach due to the numerical balance of the normalized inaccuracy. In addition, we propose a margin for the inaccuracy to reduce the impact of gradients, which are produced by negligible errors close to the ground truth.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2311.03996 [pdf, other]

An Initialization Schema for Neuronal Networks on Tabular Data

Authors: Wolfgang Fuhl

Abstract: Nowadays, many modern applications require heterogeneous tabular data, which is still a challenging task in terms of regression and classification. Many approaches have been proposed to adapt neural networks for this task, but still, boosting and bagging of decision trees are the best-performing methods for this task. In this paper, we show that a binomially initialized neural network can be used effectively on tabular data. The proposed approach is a simple but effective way of initializing the first hidden layer in neural networks. We also show that this initialization schema can be used to jointly train ensembles by adding gradient masking to batch entries and using the binomial initialization for the last layer in a neural network. For this purpose, we modified the hinge binary loss and the softmax loss to make them applicable for joint ensemble training. We evaluate our approach on multiple public datasets and showcase the improved performance compared to other neural network-based approaches. In addition, we discuss the limitations and possible further research of our approach for improving the applicability of neural networks to tabular data. Link: https://es-cloud.cs.uni-tuebingen.de/d/8e2ab8c3fdd444e1a135/?p=%2FInitializationNeuronalNetworksTabularData&mode=list

Submitted 24 November, 2023; v1 submitted 7 November, 2023; originally announced November 2023.

arXiv:2303.12757 [pdf, other]

A temporally quantized distribution of pupil diameters as a new feature for cognitive load classification

Authors: Wolfgang Fuhl, Susanne Zabel, Theresa Harbig, Julia Astrid Moldt, Teresa Festl Wiete, Anne Herrmann Werner, Kay Nieselt

Abstract: In this paper, we present a new feature that can be used to classify cognitive load based on pupil information. The feature consists of a temporal segmentation of the eye tracking recordings. For each segment of the temporal partition, a probability distribution of pupil size is computed and stored. These probability distributions can then be used to classify the cognitive load. The presented feature significantly improves the classification accuracy of the cognitive load compared to other statistical values obtained from eye tracking data, which represent the state of the art in this field. The applications of determining cognitive load from pupil data are numerous and could lead, for example, to pre-warning systems for burnouts. Link: https://es-cloud.cs.uni-tuebingen.de/d/8e2ab8c3fdd444e1a135/?p=%2FCognitiveLoadFeature&mode=list

Submitted 3 March, 2023; originally announced March 2023.
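
The feature described in the abstract above (a temporal segmentation with one pupil-size distribution per segment) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name, the number of segments and bins, and the diameter range in millimetres are all assumptions chosen for the example.

```python
from typing import List, Sequence


def pupil_distribution_feature(
    diameters: Sequence[float],
    n_segments: int = 4,  # assumed number of temporal segments
    n_bins: int = 8,      # assumed number of histogram bins
    d_min: float = 2.0,   # assumed lower bound of pupil diameter (mm)
    d_max: float = 8.0,   # assumed upper bound of pupil diameter (mm)
) -> List[List[float]]:
    """Return one normalized pupil-diameter histogram per temporal segment."""
    if n_segments <= 0 or n_bins <= 0 or d_max <= d_min:
        raise ValueError("invalid parameters")
    n = len(diameters)
    feature = []
    for s in range(n_segments):
        # Split the recording into contiguous, nearly equal-length segments.
        start = s * n // n_segments
        end = (s + 1) * n // n_segments
        counts = [0] * n_bins
        for d in diameters[start:end]:
            # Clamp to the assumed range, then map the value to a bin index.
            clamped = min(max(d, d_min), d_max)
            idx = min(int((clamped - d_min) / (d_max - d_min) * n_bins), n_bins - 1)
            counts[idx] += 1
        total = sum(counts)
        # Normalize counts so each segment yields a probability distribution.
        feature.append([c / total if total else 0.0 for c in counts])
    return feature
```

The nested list can be flattened into a single vector before it is passed to a classifier; which classifier the paper uses is not stated in the abstract.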

arXiv:2303.12744 [pdf, other]

Area of interest adaption using feature importance

Authors: Wolfgang Fuhl, Susanne Zabel, Theresa Harbig, Julia Astrid Moldt, Teresa Festl Wiete, Anne Herrmann Werner, Kay Nieselt

Abstract: In this paper, we present two approaches and algorithms that adapt areas of interest (AOI) or regions of interest (ROI), respectively, to the eye tracking data quality and classification task. The first approach uses feature importance in a greedy way and grows or shrinks AOIs in all directions. The second approach is an extension of the first approach, which divides the AOIs into areas and calculates a direction of growth, i.e. a gradient. Both approaches improve the classification results considerably in the case of generalized AOIs, but can also be used for qualitative analysis. In qualitative analysis, the algorithms presented allow the AOIs to be adapted to the data, which means that errors and inaccuracies in eye tracking data can be better compensated for. A good application example is abstract art, where manual AOI annotation is hardly possible, and data-driven approaches are mainly used for initial AOIs. Link: https://es-cloud.cs.uni-tuebingen.de/d/8e2ab8c3fdd444e1a135/?p=%2FAOIGradient&mode=list

Submitted 3 March, 2023; originally announced March 2023.

arXiv:2303.06154 [pdf, other]

Resource saving taxonomy classification with k-mer distributions and machine learning

Authors: Wolfgang Fuhl, Susanne Zabel, Kay Nieselt

Abstract: Modern high throughput sequencing technologies like metagenomic sequencing generate millions of sequences which have to be classified based on their taxonomic rank. Modern approaches either apply local alignment and comparison to existing data sets like MMseqs2 or use deep neural networks as it is done in DeepMicrobes and BERTax. Alignment-based approaches are costly in terms of runtime, especially since databases get larger and larger. For the deep learning-based approaches, specialized hardware is necessary for a computation, which consumes large amounts of energy. In this paper, we propose to use $k$-mer distributions obtained from DNA as features to classify its taxonomic origin using machine learning approaches like the subspace $k$-nearest neighbors algorithm, neural networks or bagged decision trees. In addition, we propose a feature space data set balancing approach, which allows reducing the data set for training and improves the performance of the classifiers. By comparing performance, time, and memory consumption of our approach to those of state-of-the-art algorithms (BERTax and MMseqs2) using several datasets, we show that our approach improves the classification on the genus level and achieves comparable results for the superkingdom and phylum level. Link: https://es-cloud.cs.uni-tuebingen.de/d/8e2ab8c3fdd444e1a135/?p=%2FTaxonomyClassification&mode=list

Submitted 10 March, 2023; originally announced March 2023.
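
A $k$-mer distribution of the kind the abstract above uses as a feature can be computed with a sliding window over the sequence. The sketch below is only an illustration under assumptions: the function name, the default `k = 3`, and the choice to skip windows containing ambiguous bases are not taken from the paper.

```python
from itertools import product
from typing import Dict


def kmer_distribution(sequence: str, k: int = 3) -> Dict[str, float]:
    """Relative frequency of every DNA k-mer over the alphabet ACGT."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Enumerate all 4**k possible k-mers so the feature has a fixed length.
    counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
    seq = sequence.upper()
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # assumption: skip windows with ambiguous bases such as N
            counts[kmer] += 1
            total += 1
    return {kmer: (c / total if total else 0.0) for kmer, c in counts.items()}
```

Because the dictionary always has `4**k` entries in the same order, its values can be used directly as a fixed-length feature vector for classifiers such as the ones named in the abstract.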

arXiv:2303.06039 [pdf, ps, other]

One step closer to EEG based eye tracking

Authors: Wolfgang Fuhl, Susanne Zabel, Theresa Harbig, Julia Astrid Moldt, Teresa Festl Wiete, Anne Herrmann Werner, Kay Nieselt

Abstract: We present a new deep neural network (DNN) that can be used to directly determine gaze position using EEG data. EEG-based eye tracking is a new and difficult research topic in the field of eye tracking, but it provides an alternative to image-based eye tracking with an input data set comparable to conventional image processing. The presented DNN exploits spatial dependencies of the EEG signal and uses convolutions similar to spatial filtering, which is used for preprocessing EEG signals. By this, we improve the direct gaze determination from the EEG signal compared to the state of the art by 3.5 cm MAE (mean absolute error), but unfortunately still do not achieve a directly applicable system, since the inaccuracy is still significantly higher compared to image-based eye trackers. Link: https://es-cloud.cs.uni-tuebingen.de/d/8e2ab8c3fdd444e1a135/?p=%2FEEGGaze&mode=list

Submitted 3 March, 2023; originally announced March 2023.

arXiv:2303.00423 [pdf, other]

Multiperspective Teaching of Unknown Objects via Shared-gaze-based Multimodal Human-Robot Interaction

Authors: Daniel Weber, Wolfgang Fuhl, Enkelejda Kasneci, Andreas Zell

Abstract: For successful deployment of robots in multifaceted situations, the robot's understanding of its environment is indispensable. With advancing performance of state-of-the-art object detectors, the capability of robots to detect objects within their interaction domain is also improving. However, it binds the robot to a few trained classes and prevents it from adapting to unfamiliar surroundings beyond predefined scenarios. In such scenarios, humans could assist robots amidst the overwhelming number of interaction entities and impart the requisite expertise by acting as teachers. We propose a novel pipeline that effectively harnesses human gaze and augmented reality in a human-robot collaboration context to teach a robot novel objects in its surrounding environment. By intertwining gaze (to guide the robot's attention to an object of interest) with augmented reality (to convey the respective class information) we enable the robot to quickly acquire a significant amount of automatically labeled training data on its own. Training in a transfer learning fashion, we demonstrate the robot's capability to detect recently learned objects and evaluate the influence of different machine learning models and learning procedures as well as the amount of training data involved. Our multimodal approach proves to be an efficient and natural way to teach the robot novel objects based on a few instances and allows it to detect classes for which no training dataset is available.
In addition, we make our dataset publicly available to the research community, which consists of RGB and depth data, intrinsic and extrinsic camera parameters, along with regions of interest.

Submitted 1 March, 2023; originally announced March 2023.

arXiv:2206.09697 [pdf, other]

Technical Report: Combining knowledge from Transfer Learning during training and Wide Resnets

Authors: Wolfgang Fuhl

Abstract: In this report, we combine the idea of Wide ResNets and transfer learning to optimize the architecture of deep neural networks. The first improvement of the architecture is the use of all layers as information source for the last layer. This idea comes from transfer learning, which uses networks pre-trained on other data and extracts different levels of the network as input for the new task. The second improvement is the use of deeper layers instead of deeper sequences of blocks. This idea comes from Wide ResNets. Using both optimizations, both high data augmentation and standard data augmentation can produce better results for different models. Link: https://github.com/wolfgangfuhl/PublicationStuff/tree/master/TechnicalReport1/Supp

Submitted 20 June, 2022; originally announced June 2022.

arXiv:2204.12150 [pdf, other]

doi 10.1145/3530887

Where and What: Driver Attention-based Object Detection

Authors: Yao Rong, Naemi-Rebecca Kassautzki, Wolfgang Fuhl, Enkelejda Kasneci

Abstract: Human drivers use their attentional mechanisms to focus on critical objects and make decisions while driving. As human attention can be revealed from gaze data, capturing and analyzing gaze information has emerged in recent years to benefit autonomous driving technology. Previous works in this context have primarily aimed at predicting "where" human drivers look at and lack knowledge of "what" objects drivers focus on. Our work bridges the gap between pixel-level and object-level attention prediction. Specifically, we propose to integrate an attention prediction module into a pretrained object detection framework and predict the attention in a grid-based style. Furthermore, critical objects are recognized based on predicted attended-to areas. We evaluate our proposed method on two driver attention datasets, BDD-A and DR(eye)VE. Our framework achieves competitive state-of-the-art performance in the attention prediction on both pixel-level and object-level but is far more efficient (75.3 GFLOPs less) in computation.

Submitted 22 May, 2022; v1 submitted 26 April, 2022; originally announced April 2022.

Comments: 22 pages

Journal ref: Proceedings of the ACM on Human-Computer Interaction, 2022

arXiv:2203.15651 [pdf, other]

Gaze-based Object Detection in the Wild

Authors: Daniel Weber, Wolfgang Fuhl, Andreas Zell, Enkelejda Kasneci

Abstract: In human-robot collaboration, one challenging task is to teach a robot new yet unknown objects enabling it to interact with them. Thereby, gaze can contain valuable information. We investigate if it is possible to detect objects (object or no object) merely from gaze data and determine their bounding box parameters. For this purpose, we explore different sizes of temporal windows, which serve as a basis for the computation of heatmaps, i.e., the spatial distribution of the gaze data. Additionally, we analyze different grid sizes of these heatmaps, and demonstrate the functionality in a proof of concept using different machine learning techniques. Our method is characterized by its speed and resource efficiency compared to conventional object detectors. In order to generate the required data, we conducted a study with five subjects who could move freely and thus, turn towards arbitrary objects. This way, we chose a scenario for our data collection that is as realistic as possible. Since the subjects move while facing objects, the heatmaps also contain gaze data trajectories, complicating the detection and parameter regression. We make our data set publicly available to the research community for download.

Submitted 25 January, 2023; v1 submitted 29 March, 2022; originally announced March 2022.
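
The heatmap computation mentioned in the abstract above (gaze points from a temporal window binned onto a grid) might look roughly like the following sketch. It assumes gaze coordinates normalized to [0, 1] and non-overlapping windows measured in samples; the function names and these conventions are illustrative assumptions, not details from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # assumed (x, y) gaze position normalized to [0, 1]


def gaze_heatmap(gaze: Sequence[Point], grid_w: int, grid_h: int) -> List[List[float]]:
    """Normalized 2D histogram of gaze points (rows are y cells, columns are x cells)."""
    if grid_w <= 0 or grid_h <= 0:
        raise ValueError("grid size must be positive")
    counts = [[0] * grid_w for _ in range(grid_h)]
    total = 0
    for x, y in gaze:
        if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
            continue  # assumption: ignore samples outside the normalized frame
        col = min(int(x * grid_w), grid_w - 1)
        row = min(int(y * grid_h), grid_h - 1)
        counts[row][col] += 1
        total += 1
    return [[c / total if total else 0.0 for c in row] for row in counts]


def windowed_heatmaps(
    gaze: Sequence[Point], window: int, grid_w: int, grid_h: int
) -> List[List[List[float]]]:
    """One heatmap per non-overlapping temporal window of `window` samples."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        gaze_heatmap(gaze[i:i + window], grid_w, grid_h)
        for i in range(0, len(gaze), window)
    ]
```

Varying `window`, `grid_w` and `grid_h` corresponds to the window-size and grid-size exploration the abstract describes; the downstream detector and bounding-box regressor are outside the scope of this sketch.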

arXiv:2201.08354 [pdf, other]

HPCGen: Hierarchical K-Means Clustering and Level Based Principal Components for Scan Path Genaration

Authors: Wolfgang Fuhl

Abstract: In this paper, we present a new approach for decomposing scan paths and its utility for generating new scan paths. For this purpose, we apply the K-Means clustering procedure to the raw gaze data and then iteratively to the found clusters to find further clusters within them. The found clusters are grouped for each level in the hierarchy, and the most important principal components are computed from the data contained in them. Using this tree hierarchy and the principal components, new scan paths can be generated that match the human behavior of the original data. We show that this generated data is very useful for generating new data for scan path classification but can also be used to generate fake scan paths.

Submitted 19 January, 2022; originally announced January 2022.

arXiv:2201.07692 [pdf, other]

GroupGazer: A Tool to Compute the Gaze per Participant in Groups with integrated Calibration to Map the Gaze Online to a Screen or Beamer Projection

Authors: Wolfgang Fuhl, Daniel Weber, Shahram Eivazi

Abstract: In this paper we present GroupGazer. It is a tool that can be used to calculate the gaze direction and the gaze position of whole groups. GroupGazer calculates the gaze direction of every single person in the image and allows these gaze vectors to be mapped onto a projection, such as one from a projector. In addition to the person-specific gaze direction, the person affiliation of each gaze vector is stored based on the position in the image. Also, it is possible to save the group attention after a calibration. The software is free to use and requires a simple webcam as well as an NVIDIA GPU and the operating system Windows or Linux. Link: https://es-cloud.cs.uni-tuebingen.de/d/8e2ab8c3fdd444e1a135/?p=%2FGroupGazer&mode=list

Submitted 10 March, 2023; v1 submitted 19 January, 2022; originally announced January 2022.

arXiv:2201.06799 [pdf, other]

Pistol: Pupil Invisible Supportive Tool to extract Pupil, Iris, Eye Opening, Eye Movements, Pupil and Iris Gaze Vector, and 2D as well as 3D Gaze

Authors: Wolfgang Fuhl, Daniel Weber, Shahram Eivazi

Abstract: This paper describes a feature extraction and gaze estimation software, named Pistol, that can be used with Pupil Invisible projects and other eye trackers in the future. In offline mode, our software extracts multiple features from the eye, including the pupil and iris ellipse, eye aperture, pupil vector, iris vector, eye movement types from pupil and iris velocities, marker detection, marker distance, 2D gaze estimation for the pupil center, iris center, pupil vector, and iris vector using Levenberg-Marquardt fitting and neural networks. The gaze signal is computed in 2D for each eye and each feature separately and for both eyes in 3D also for each feature separately. We hope this software helps other researchers to extract state-of-the-art features for their research out of their recordings. Link: https://es-cloud.cs.uni-tuebingen.de/d/8e2ab8c3fdd444e1a135/?p=%2FPISTOL&mode=list

Submitted 10 March, 2023; v1 submitted 18 January, 2022; originally announced January 2022.

arXiv:2109.02345 [pdf, other]

Tensor Normalization and Full Distribution Training

Authors: Wolfgang Fuhl

Abstract: In this work, we introduce pixel-wise tensor normalization, which is inserted after rectified linear units and, together with batch normalization, provides a significant improvement in the accuracy of modern deep neural networks. In addition, this work deals with the robustness of networks. We show that the factorized superposition of images from the training set and the reformulation of the multi-class problem into a multi-label problem yields significantly more robust networks. The reformulation and the adjustment of the multi-class log loss also improves the results compared to the overlay with only one class as label. https://atreus.informatik.uni-tuebingen.de/seafile/d/8e2ab8c3fdd444e1a135/?p=%2FTNandFDT&mode=list

Submitted 6 September, 2021; originally announced September 2021.

arXiv:2105.10277 [pdf, other]

Maximum and Leaky Maximum Propagation

Authors: Wolfgang Fuhl

Abstract: In this work, we present an alternative to conventional residual connections, which is inspired by maxout nets. This means that instead of the addition in residual connections, our approach only propagates the maximum value or, in the leaky formulation, propagates a percentage of both. In our evaluation, we show on different public data sets that the presented approaches are comparable to the residual connections and have other interesting properties, such as better generalization with a constant batch normalization, faster learning, and also the possibility to generalize without additional activation functions. In addition, the proposed approaches work very well if ensembles together with residual networks are formed. https://atreus.informatik.uni-tuebingen.de/seafile/d/8e2ab8c3fdd444e1a135/?p=%2FMaximumPropagation&mode=list

Submitted 8 September, 2021; v1 submitted 21 May, 2021; originally announced May 2021.
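
To make the idea in the abstract above concrete, the sketch below replaces the addition of a residual connection with an elementwise maximum, plus a leaky variant that passes a share of both inputs. The abstract does not give the exact leaky formula, so the convex mix controlled by the hypothetical `leak` parameter is an assumption, as are the function names.

```python
from typing import List, Sequence


def maximum_propagation(x: Sequence[float], fx: Sequence[float]) -> List[float]:
    """Combine skip input x and block output fx by elementwise maximum
    (a residual connection would compute x + fx instead)."""
    if len(x) != len(fx):
        raise ValueError("inputs must have the same length")
    return [max(a, b) for a, b in zip(x, fx)]


def leaky_maximum_propagation(
    x: Sequence[float], fx: Sequence[float], leak: float = 0.1
) -> List[float]:
    """Assumed leaky variant: pass (1 - leak) of the larger value and
    `leak` of the smaller one, so both inputs contribute."""
    if len(x) != len(fx):
        raise ValueError("inputs must have the same length")
    if not 0.0 <= leak <= 1.0:
        raise ValueError("leak must be in [0, 1]")
    return [
        (1.0 - leak) * max(a, b) + leak * min(a, b)
        for a, b in zip(x, fx)
    ]
```

With `leak = 0` the leaky variant reduces to plain maximum propagation; in a real network the same operation would be applied to whole activation tensors rather than Python lists.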

arXiv:2102.02115 [pdf, other]

doi 10.1109/ismar52148.2021.00053

TEyeD: Over 20 million real-world eye images with Pupil, Eyelid, and Iris 2D and 3D Segmentations, 2D and 3D Landmarks, 3D Eyeball, Gaze Vector, and Eye Movement Types

Authors: Wolfgang Fuhl, Gjergji Kasneci, Enkelejda Kasneci

Abstract: We present TEyeD, the world's largest unified public data set of eye images taken with head-mounted devices. TEyeD was acquired with seven different head-mounted eye trackers. Among them, two eye trackers were integrated into virtual reality (VR) or augmented reality (AR) devices. The images in TEyeD were obtained from various tasks, including car rides, simulator rides, outdoor sports activities, and daily indoor activities. The data set includes 2D and 3D landmarks, semantic segmentation, 3D eyeball annotation and the gaze vector and eye movement types for all images. Landmarks and semantic segmentation are provided for the pupil, iris and eyelids. Video lengths vary from a few minutes to several hours. With more than 20 million carefully annotated images, TEyeD provides a unique, coherent resource and a valuable foundation for advancing research in the field of computer vision, eye tracking and gaze estimation in modern VR and AR applications. Download: Just connect via FTP as user TEyeDUser and without password to nephrit.cs.uni-tuebingen.de (ftp://nephrit.cs.uni-tuebingen.de).

Submitted 6 June, 2023; v1 submitted 3 February, 2021; originally announced February 2021.

Comments: Download: Just connect via FTP as user TEyeDUser and without password to nephrit.cs.uni-tuebingen.de (ftp://nephrit.cs.uni-tuebingen.de)

arXiv:2102.01921 [pdf, other]

1000 Pupil Segmentations in a Second using Haar Like Features and Statistical Learning

Authors: Wolfgang Fuhl

Abstract: In this paper we present a new approach for pupil segmentation. It can be computed and trained very efficiently, making it ideal for online use for high speed eye trackers as well as for energy saving pupil detection in mobile eye tracking. The approach is inspired by the BORE and CBF algorithms and generalizes the binary comparison by Haar features. Since these features are intrinsically very susceptible to noise and fluctuating light conditions, we combine them with conditional pupil shape probabilities. In addition, we also rank each feature according to its importance in determining the pupil shape. Another advantage of our method is the use of statistical learning, which is very efficient and can even be used online. https://atreus.informatik.uni-tuebingen.de/seafile/d/8e2ab8c3fdd444e1a135/?p=%2FStatsPupil&mode=list

Submitted 3 February, 2021; originally announced February 2021.

arXiv:2101.04318 [pdf]

A Multimodal Eye Movement Dataset and a Multimodal Eye Movement Segmentation Analysis

Authors: Wolfgang Fuhl, Enkelejda Kasneci

Abstract: We present a new dataset with annotated eye movements. The dataset consists of over 800,000 gaze points recorded during a car ride in the real world and in the simulator. In total, the eye movements of 19 subjects were annotated. In this dataset there are several data sources such as the eyelid closure, the pupil center, the optical vector, and a vector into the pupil center starting from the center of the eye corners. These different data sources are analyzed and evaluated individually as well as in combination with respect to their goodness of fit for eye movement classification. These results will help developers of real-time systems and algorithms to find the best data sources for their application. Also, new algorithms can be trained and evaluated on this data set. The data and the Matlab code can be downloaded here https://atreus.informatik.uni-tuebingen.de/seafile/d/8e2ab8c3fdd444e1a135/?p=%2FA%20Multimodal%20Eye%20Movement%20Dataset%20and%20...&mode=list

Submitted 12 January, 2021; originally announced January 2021.

arXiv:2101.03793 [pdf, other]

The Gaze and Mouse Signal as additional Source for User Fingerprints in Browser Applications

Authors: Wolfgang Fuhl, Daniel Weber, Shahram Eivazi

Abstract: In this work, we inspect different data sources for browser fingerprints. We show which disadvantages and limitations browser statistics have and how this can be avoided with other data sources. Since human visual behavior is a rich source of information and also contains person specific information, it is a valuable source for browser fingerprints. However, human gaze acquisition in the browser also has disadvantages, such as inaccuracies via webcam and the restriction that the user must first allow access to the camera. However, it is also known that the mouse movements and the human gaze correlate and therefore, the mouse movements can be used instead of the gaze signal. In our evaluation, we show the influence of all possible combinations of the three information sources for user recognition and describe our simple approach in detail. Link: https://es-cloud.cs.uni-tuebingen.de/d/8e2ab8c3fdd444e1a135/?p=%2FThe%20Gaze%20and%20Mouse%20Signal%20as%20additional%20Source%20...&mode=list

Submitted 10 March, 2023; v1 submitted 11 January, 2021; originally announced January 2021.

arXiv:2010.00873 [pdf, other]

Rotated Ring, Radial and Depth Wise Separable Radial Convolutions

Authors: Wolfgang Fuhl, Enkelejda Kasneci

Abstract: Simple image rotations significantly reduce the accuracy of deep neural networks. Moreover, training with all possible rotations increases the data set, which also increases the training duration. In this work, we address trainable rotation invariant convolutions as well as the construction of nets, since fully connected layers can only be rotation invariant with a one-dimensional input. On the one hand, we show that our approach is rotationally invariant for different models and on different public data sets. We also discuss the influence of purely rotational invariant features on accuracy. The rotationally adaptive convolution models presented in this work are more computationally intensive than normal convolution models. Therefore, we also present a depth wise separable approach with radial convolution. Link to CUDA code https://atreus.informatik.uni-tuebingen.de/seafile/d/8e2ab8c3fdd444e1a135/

Submitted 17 January, 2021; v1 submitted 2 October, 2020; originally announced October 2020.

arXiv:2010.00866 [pdf, other]

Weight and Gradient Centralization in Deep Neural Networks

Authors: Wolfgang Fuhl, Enkelejda Kasneci

Abstract: Batch normalization is currently the most widely used variant of internal normalization for deep neural networks. Additional work has shown that the normalization of weights and additional conditioning as well as the normalization of gradients further improve the generalization. In this work, we combine several of these methods and thereby increase the generalization of the networks. The advantage of the newer methods compared to the batch normalization is not only increased generalization, but also that these methods only have to be applied during training and, therefore, do not influence the running time during use. Link to CUDA code https://atreus.informatik.uni-tuebingen.de/seafile/d/8e2ab8c3fdd444e1a135/

Submitted 17 January, 2021; v1 submitted 2 October, 2020; originally announced October 2020.

arXiv:2010.00821 [pdf, other]

Explainable Online Validation of Machine Learning Models for Practical Applications

Authors: Wolfgang Fuhl, Yao Rong, Thomas Motz, Michael Scheidt, Andreas Hartel, Andreas Koch, Enkelejda Kasneci

Abstract: We present a reformulation of the regression and classification, which aims to validate the result of a machine learning algorithm. Our reformulation simplifies the original problem and validates the result of the machine learning algorithm using the training data. Since the validation of machine learning algorithms must always be explainable, we perform our experiments with the kNN algorithm as well as with an algorithm based on conditional probabilities, which is proposed in this work. For the evaluation of our approach, three publicly available data sets were used and three classification and two regression problems were evaluated. The presented algorithm based on conditional probabilities is also online capable and requires only a fraction of memory compared to the kNN algorithm.

Submitted 17 January, 2021; v1 submitted 2 October, 2020; originally announced October 2020.

arXiv:2006.06969 [pdf, other]

Multi Layer Neural Networks as Replacement for Pooling Operations

Authors: Wolfgang Fuhl, Enkelejda Kasneci

Abstract: Pooling operations, which can be calculated at low cost and serve as a linear or nonlinear transfer function for data reduction, are found in almost every modern neural network. Countless modern approaches have already tackled replacing the common maximum value selection and mean value operations, not to mention providing a function that allows different functions to be selected through changing parameters. Additional neural networks are used to estimate the parameters of these pooling functions. Consequently, pooling layers may require supplementary parameters to increase the complexity of the whole model. In this work, we show that one perceptron can already be used effectively as a pooling operation without increasing the complexity of the model. This kind of pooling allows for the integration of multi-layer neural networks directly into a model as a pooling operation by restructuring the data and, as a result, learning complex pooling operations. We compare our approach to tensor convolution with strides as a pooling operation and show that our approach is both effective and reduces complexity. The restructuring of the data in combination with multiple perceptrons allows for our approach to be used for upscaling, which can then be utilized for transposed convolutions in semantic segmentation.

Submitted 17 January, 2021; v1 submitted 12 June, 2020; originally announced June 2020.

arXiv:2002.10905 [pdf, other]

Fully Convolutional Neural Networks for Raw Eye Tracking Data Segmentation, Generation, and Reconstruction

Authors: Wolfgang Fuhl, Yao Rong, Enkelejda Kasneci

Abstract: In this paper, we use fully convolutional neural networks for the semantic segmentation of eye tracking data. We also use these networks for reconstruction, and in conjunction with a variational auto-encoder to generate eye movement data. The first improvement of our approach is that no input window is necessary, due to the use of fully convolutional networks and therefore any input size can be processed directly. The second improvement is that the used and generated data is raw eye tracking data (position X, Y and time) without preprocessing. This is achieved by pre-initializing the filters in the first layer and by building the input tensor along the z axis. We evaluated our approach on three publicly available datasets and compare the results to the state of the art.

Submitted 17 January, 2021; v1 submitted 17 February, 2020; originally announced February 2020.

arXiv:2002.08972 [pdf, other]

doi 10.1371/journal.pone.0255979

Differential Privacy for Eye Tracking with Temporal Correlations

Authors: Efe Bozkir, Onur Günlü, Wolfgang Fuhl, Rafael F. Schaefer, Enkelejda Kasneci

Abstract: New generation head-mounted displays, such as VR and AR glasses, are coming into the market with already integrated eye tracking and are expected to enable novel ways of human-computer interaction in numerous applications. However, since eye movement properties contain biometric information, privacy concerns have to be handled properly. Privacy-preservation techniques such as differential privacy mechanisms have recently been applied to eye movement data obtained from such displays. Standard differential privacy mechanisms, however, are vulnerable due to temporal correlations between the eye movement observations. In this work, we propose a novel transform-coding based differential privacy mechanism to further adapt it to the statistics of eye movement feature data and compare various low-complexity methods. We extend the Fourier perturbation algorithm, which is a differential privacy mechanism, and correct a scaling mistake in its proof. Furthermore, we illustrate significant reductions in sample correlations in addition to query sensitivities, which provide the best utility-privacy trade-off in the eye tracking literature. Our results provide significantly high privacy without any essential loss in classification accuracies while hiding personal identifiers.

Submitted 20 December, 2021; v1 submitted 20 February, 2020; originally announced February 2020.

Comments: In PLOS ONE

arXiv:2002.06806 [pdf, other]

Reinforcement learning for the privacy preservation and manipulation of eye tracking data

Authors: Wolfgang Fuhl, Efe Bozkir, Enkelejda Kasneci

Abstract: In this paper, we present an approach based on reinforcement learning for eye tracking data manipulation. It is based on two opposing agents, where one tries to classify the data correctly and the second agent looks for patterns in the data, which get manipulated to hide specific information. We show that our approach is successfully applicable to preserve the privacy of the subjects. For this purpose, we evaluate our approach iteratively to showcase the behavior of the reinforcement learning based approach. In addition, we evaluate the importance of temporal, as well as spatial, information of eye tracking data for specific classification goals. In the last part of our evaluation, we apply the procedure to further public data sets without re-training the autoencoder or the data manipulator. The results show that the learned manipulation is generalized and applicable to unseen data as well.

Submitted 2 October, 2020; v1 submitted 17 February, 2020; originally announced February 2020.

arXiv:1905.10073 [pdf, other]

Training Decision Trees as Replacement for Convolution Layers

Authors: Wolfgang Fuhl, Gjergji Kasneci, Wolfgang Rosenstiel, Enkelejda Kasneci

Abstract: We present an alternative layer to convolution layers in convolutional neural networks (CNNs). Our approach reduces the complexity of convolutions by replacing them with binary decisions. Those binary decisions are used as indexes to conditional distributions where each weight represents a leaf in a decision tree. This means that only the indices to the weights need to be determined once, thus reducing the complexity of convolutions by the depth of the output tensor. Index computation is performed by simple binary decisions that require fewer cycles compared to conventionally used multiplications. In addition, we show how convolutions can be replaced by binary decisions. These binary decisions form indices in the conditional distributions and we show how they are used to replace 2D weight matrices as well as 3D weight tensors. These new layers can be trained like convolution layers in CNNs based on the backpropagation algorithm, for which we provide a formalization. Our results on multiple publicly available data sets show that our approach performs similarly to conventional neural networks. Beyond the formalized reduction of complexity and the improved qualitative performance, we show the runtime improvement empirically compared to convolution layers.

Submitted 11 February, 2020; v1 submitted 24 May, 2019; originally announced May 2019.

Comments: Will be published in the proceedings of the AAAI 2020 conference

arXiv:1901.10143 [pdf, other]

doi 10.1117/12.2559517

Learning to Validate the Quality of Detected Landmarks

Authors: Wolfgang Fuhl, Enkelejda Kasneci

Abstract: We present a new loss function for the validation of image landmarks detected via Convolutional Neural Networks (CNN). The network learns to estimate how accurate its landmark estimation is. This loss function is applicable to all regression-based location estimations and allows the exclusion of unreliable landmarks from further processing. In addition, we formulate a novel batch balancing approach which weights the importance of samples based on their produced loss. This is done by computing a probability distribution mapping on an interval from which samples can be selected using a uniform random selection scheme. We conducted experiments on the 300W, AFLW, and WFLW facial landmark datasets. In the first experiments, the influence of our batch balancing approach is evaluated by comparing it against uniform sampling. In addition, we evaluated the impact of the validation loss on the landmark accuracy based on uniform sampling. The last experiments evaluate the correlation of the validation signal with the landmark accuracy. All experiments were performed for all three datasets.

Submitted 11 February, 2020; v1 submitted 29 January, 2019; originally announced January 2019.

Comments: Will be published in the proceedings of the ICMV 2019 conference

MSC Class: 65D19; 93E35

arXiv:1808.09296 [pdf, other]

Eye movement velocity and gaze data generator for evaluation, robustness testing and assess of eye tracking software and visualization tools

Authors: Wolfgang Fuhl, Enkelejda Kasneci

Abstract: Eye movements hold information about human perception, intention, and cognitive state. We propose a novel eye movement simulator that i) probabilistically simulates saccade movements as gamma distributions considering different peak velocities and ii) models smooth pursuit onsets with the sigmoid function. Additionally, it is capable of producing velocity and two-dimensional gaze sequences for static and dynamic scenes using saliency maps or real fixation targets. Our approach is also capable of simulating any sampling rate, even with fluctuations. The simulation is evaluated against publicly available annotated data. The simulator can be used in EyeTrace or downloaded at http://ti.unituebingen. de/Projekte.1801.0.html.

Submitted 10 September, 2018; v1 submitted 27 August, 2018; originally announced August 2018.

Comments: arXiv admin note: substantial text overlap with arXiv:1804.00970

arXiv:1804.00970 [pdf, other]

Eye movement simulation and detector creation to reduce laborious parameter adjustments

Authors: Wolfgang Fuhl, Thiago Santini, Thomas Kuebler, Nora Castner, Wolfgang Rosenstiel, Enkelejda Kasneci

Abstract: Eye movements hold information about human perception, intention and cognitive state. Various algorithms have been proposed to identify and distinguish eye movements, particularly fixations, saccades, and smooth pursuits. A major drawback of existing algorithms is that they rely on accurate and constant sampling rates, impeding straightforward adaptation to new movements such as micro saccades. We propose a novel eye movement simulator that i) probabilistically simulates saccade movements as gamma distributions considering different peak velocities and ii) models smooth pursuit onsets with the sigmoid function. This simulator is combined with a machine learning approach to create detectors for general and specific velocity profiles. Additionally, our approach is capable of using any sampling rate, even with fluctuations. The machine learning approach consists of different binary patterns combined using conditional distributions. The simulation is evaluated against publicly available real data using a squared error, and the detectors are evaluated against state-of-the-art algorithms.

Submitted 28 March, 2018; originally announced April 2018.

arXiv:1712.08900 [pdf, other]

doi 10.1016/j.cviu.2018.02.002

PuRe: Robust pupil detection for real-time pervasive eye tracking

Authors: Thiago Santini, Wolfgang Fuhl, Enkelejda Kasneci

Abstract: Real-time, accurate, and robust pupil detection is an essential prerequisite to enable pervasive eye-tracking and its applications -- e.g., gaze-based human computer interaction, health monitoring, foveated rendering, and advanced driver assistance. However, automated pupil detection has proved to be an intricate task in real-world scenarios due to a large mixture of challenges such as quickly changing illumination and occlusions. In this paper, we introduce the Pupil Reconstructor PuRe, a method for pupil detection in pervasive scenarios based on a novel edge segment selection and conditional segment combination schemes; the method also includes a confidence measure for the detected pupil. The proposed method was evaluated on over 316,000 images acquired with four distinct head-mounted eye tracking devices. Results show a pupil detection rate improvement of over 10 percentage points w.r.t. state-of-the-art algorithms in the two most challenging data sets (6.46 for all data sets), further pushing the envelope for pupil detection. Moreover, we advance the evaluation protocol of pupil detection algorithms by also considering eye images in which pupils are not present. In this aspect, PuRe improved precision and specificity w.r.t. state-of-the-art algorithms by 25.05 and 10.94 percentage points, respectively, demonstrating the meaningfulness of PuRe's confidence measure. PuRe operates in real-time for modern eye trackers (at 120 fps).

Submitted 24 December, 2017; originally announced December 2017.

arXiv:1711.03306 [pdf, other]

Fast camera focus estimation for gaze-based focus control

Authors: Wolfgang Fuhl, Thiago Santini, Enkelejda Kasneci

Abstract: Many cameras implement auto-focus functionality. However, they typically require the user to manually identify the location to be focused on. While such an approach works for temporally-sparse autofocusing functionality (e.g., photo shooting), it presents extreme usability problems when the focus must be quickly switched between multiple areas (and depths) of interest - e.g., in a gaze-based autofocus approach. This work introduces a novel, real-time auto-focus approach based on eye-tracking, which enables the user to shift the camera focus plane swiftly based solely on the gaze information. Moreover, the proposed approach builds a graph representation of the image to estimate depth plane surfaces and runs in real time (requiring ~20ms on a single i5 core), thus allowing for the depth map estimation to be performed dynamically. We evaluated our algorithm for gaze-based depth estimation against state-of-the-art approaches based on eight new data sets with flat, skewed, and round surfaces, as well as publicly available datasets.

Submitted 9 November, 2017; originally announced November 2017.

ACM Class: I.4.5; I.4.6; I.4.7; I.4.8

arXiv:1711.00112 [pdf, other]

PupilNet v2.0: Convolutional Neural Networks for CPU based real time Robust Pupil Detection

Authors: Wolfgang Fuhl, Thiago Santini, Gjergji Kasneci, Wolfgang Rosenstiel, Enkelejda Kasneci

Abstract: Real-time, accurate, and robust pupil detection is an essential prerequisite for pervasive video-based eye-tracking. However, automated pupil detection in real-world scenarios has proven to be an intricate challenge due to fast illumination changes, pupil occlusion, non-centered and off-axis eye recording, as well as physiological eye characteristics. In this paper, we approach this challenge through: I) a convolutional neural network (CNN) running in real time on a single core, II) a novel computationally intensive two-stage CNN for accuracy improvement, and III) a fast probability distribution based refinement method as a practical alternative to II. We evaluate the proposed approaches against the state-of-the-art pupil detection algorithms, improving the detection rate up to ~9 percentage points on average over all data sets (~7% on one CPU core 7ms). This evaluation was performed on over 135,000 images: 94,000 images from the literature, and 41,000 new hand-labeled and challenging images contributed by this work (v1.0).

Submitted 30 October, 2017; originally announced November 2017.

Comments: Pupil detection, pupil center estimation, image processing, CNN. arXiv admin note: substantial text overlap with arXiv:1601.04902

arXiv:1601.04902 [pdf, other]

PupilNet: Convolutional Neural Networks for Robust Pupil Detection

Authors: Wolfgang Fuhl, Thiago Santini, Gjergji Kasneci, Enkelejda Kasneci

Abstract: Real-time, accurate, and robust pupil detection is an essential prerequisite for pervasive video-based eye-tracking. However, automated pupil detection in real-world scenarios has proven to be an intricate challenge due to fast illumination changes, pupil occlusion, non-centered and off-axis eye recording, and physiological eye characteristics. In this paper, we propose and evaluate a method based on a novel dual convolutional neural network pipeline. In its first stage the pipeline performs coarse pupil position identification using a convolutional neural network and subregions from a downscaled input image to decrease computational costs. Using subregions derived from a small window around the initial pupil position estimate, the second pipeline stage employs another convolutional neural network to refine this position, resulting in an increased pupil detection rate up to 25% in comparison with the best performing state-of-the-art algorithm. Annotated data sets can be made available upon request.

Submitted 19 January, 2016; originally announced January 2016.

Comments: 9 pages, 11 figures

arXiv:1511.07732 [pdf, other]

Bayesian Identification of Fixations, Saccades, and Smooth Pursuits

Authors: Thiago Santini, Wolfgang Fuhl, Thomas Kübler, Enkelejda Kasneci

Abstract: Smooth pursuit eye movements provide meaningful insights and information on a subject's behavior and health and may, in particular situations, disturb the performance of typical fixation/saccade classification algorithms. Thus, an automatic and efficient algorithm to identify these eye movements is paramount for eye-tracking research involving dynamic stimuli. In this paper, we propose the Bayesian Decision Theory Identification (I-BDT) algorithm, a novel algorithm for ternary classification of eye movements that is able to reliably separate fixations, saccades, and smooth pursuits in an online fashion, even for low-resolution eye trackers. The proposed algorithm is evaluated on four datasets with distinct mixtures of eye movements, including fixations, saccades, as well as straight and circular smooth pursuits; data was collected with a sample rate of 30 Hz from six subjects, totaling 24 evaluation datasets. The algorithm exhibits high and consistent performance across all datasets and movements relative to a manual annotation by a domain expert (recall: μ= 91.42%, σ= 9.52%; precision: μ= 95.60%, σ= 5.29%; specificity μ= 95.41%, σ= 7.02%) and displays a significant improvement when compared to I-VDT, a state-of-the-art algorithm (recall: μ= 87.67%, σ= 14.73%; precision: μ= 89.57%, σ= 8.05%; specificity μ= 92.10%, σ= 11.21%). For algorithm implementation and annotated datasets, please contact the first author.

Submitted 24 November, 2015; originally announced November 2015.

Comments: 8 pages

ACM Class: I.5.1; I.6.4; J.7

arXiv:1511.06575 [pdf, other]

ElSe: Ellipse Selection for Robust Pupil Detection in Real-World Environments

Authors: Wolfgang Fuhl, Thiago C. Santini, Thomas Kuebler, Enkelejda Kasneci

Abstract: Fast and robust pupil detection is an essential prerequisite for video-based eye-tracking in real-world settings. Several algorithms for image-based pupil detection have been proposed, but their applicability is mostly limited to laboratory conditions. In real-world scenarios, automated pupil detection has to face various challenges, such as illumination changes, reflections (on glasses), make-up, non-centered eye recording, and physiological eye characteristics. We propose ElSe, a novel algorithm based on ellipse evaluation of a filtered edge image. We aim at a robust, resource-saving approach that can be integrated in embedded architectures e.g. driving. The proposed algorithm was evaluated against four state-of-the-art methods on over 93,000 hand-labeled images from which 55,000 are new images contributed by this work. On average, the proposed method achieved a 14.53% improvement on the detection rate relative to the best state-of-the-art performer. download:ftp://emmapupildata@messor.informatik.unituebingen. de (password:eyedata).

Submitted 23 November, 2015; v1 submitted 20 November, 2015; originally announced November 2015.

ACM Class: I.4.3; I.4.8

Showing 1–37 of 37 results for author: Fuhl, W