Showing 101–150 of 153 results for author: Schmidhuber, J

  1. arXiv:1505.00387  [pdf, other]

    cs.LG cs.NE

    Highway Networks

    Authors: Rupesh Kumar Srivastava, Klaus Greff, Jürgen Schmidhuber

    Abstract: There is plenty of theoretical and empirical evidence that depth of neural networks is a crucial ingredient for their success. However, network training becomes more difficult with increasing depth and training of very deep networks remains an open problem. In this extended abstract, we introduce a new architecture designed to ease gradient-based training of very deep networks. We refer to network…

    Submitted 3 November, 2015; v1 submitted 2 May, 2015; originally announced May 2015.

    Comments: 6 pages, 2 figures. Presented at ICML 2015 Deep Learning workshop. Full paper is at arXiv:1507.06228

    MSC Class: 68T01 ACM Class: I.2.6; G.1.6

  2. LSTM: A Search Space Odyssey

    Authors: Klaus Greff, Rupesh Kumar Srivastava, Jan Koutník, Bas R. Steunebrink, Jürgen Schmidhuber

    Abstract: Several variants of the Long Short-Term Memory (LSTM) architecture for recurrent neural networks have been proposed since its inception in 1995. In recent years, these networks have become the state-of-the-art models for a variety of machine learning problems. This has led to a renewed interest in understanding the role and utility of various computational components of typical LSTM variants. In t…

    Submitted 4 October, 2017; v1 submitted 13 March, 2015; originally announced March 2015.

    Comments: 12 pages, 6 figures

    MSC Class: 68T10 ACM Class: I.2.6; I.2.7; I.5.1; H.5.5

    Journal ref: IEEE Transactions on Neural Networks and Learning Systems ( Volume: 28, Issue: 10, Oct. 2017 ) Pages: 2222 - 2232

  3. Assessment of algorithms for mitosis detection in breast cancer histopathology images

    Authors: Mitko Veta, Paul J. van Diest, Stefan M. Willems, Haibo Wang, Anant Madabhushi, Angel Cruz-Roa, Fabio Gonzalez, Anders B. L. Larsen, Jacob S. Vestergaard, Anders B. Dahl, Dan C. Cireşan, Jürgen Schmidhuber, Alessandro Giusti, Luca M. Gambardella, F. Boray Tek, Thomas Walter, Ching-Wei Wang, Satoshi Kondo, Bogdan J. Matuszewski, Frederic Precioso, Violet Snell, Josef Kittler, Teofilo E. de Campos, Adnan M. Khan, Nasir M. Rajpoot, et al. (4 additional authors not shown)

    Abstract: The proliferative activity of breast tumors, which is routinely estimated by counting of mitotic figures in hematoxylin and eosin stained histology sections, is considered to be one of the most important prognostic markers. However, mitosis counting is laborious, subjective and may suffer from low inter-observer agreement. With the wider acceptance of whole slide images in pathology labs, automati…

    Submitted 21 November, 2014; originally announced November 2014.

    Comments: 23 pages, 5 figures, accepted for publication in the journal Medical Image Analysis

  4. arXiv:1410.1165  [pdf, other]

    cs.NE cs.LG

    Understanding Locally Competitive Networks

    Authors: Rupesh Kumar Srivastava, Jonathan Masci, Faustino Gomez, Jürgen Schmidhuber

    Abstract: Recently proposed neural network activation functions such as rectified linear, maxout, and local winner-take-all have allowed for faster and more effective training of deep neural architectures on large and complex datasets. The common trait among these functions is that they implement local competition between small groups of computational units within a layer, so that only part of the network i…

    Submitted 8 April, 2015; v1 submitted 5 October, 2014; originally announced October 2014.

    Comments: 9 pages + 2 supplementary, Accepted to ICLR 2015 Conference track

    MSC Class: 68T30; 68T10 ACM Class: I.2.6

  5. arXiv:1407.3068  [pdf, ps, other]

    cs.CV cs.LG cs.NE

    Deep Networks with Internal Selective Attention through Feedback Connections

    Authors: Marijn Stollenga, Jonathan Masci, Faustino Gomez, Juergen Schmidhuber

    Abstract: Traditional convolutional neural networks (CNN) are stationary and feedforward. They neither change their parameters during evaluation nor use feedback from higher to lower layers. Real brains, however, do. So does our Deep Attention Selective Network (dasNet) architecture. DasNet's feedback structure can dynamically alter its convolutional filter sensitivities during classification. It harnesses t…

    Submitted 28 July, 2014; v1 submitted 11 July, 2014; originally announced July 2014.

    Comments: 13 pages, 3 figures

    MSC Class: 68T45

  6. Deep Learning in Neural Networks: An Overview

    Authors: Juergen Schmidhuber

    Abstract: In recent years, deep artificial neural networks (including recurrent ones) have won numerous contests in pattern recognition and machine learning. This historical survey compactly summarises relevant work, much of it from the previous millennium. Shallow and deep learners are distinguished by the depth of their credit assignment paths, which are chains of possibly learnable, causal links between…

    Submitted 8 October, 2014; v1 submitted 30 April, 2014; originally announced April 2014.

    Comments: 88 pages, 888 references

    Report number: Technical Report IDSIA-03-14

    Journal ref: Neural Networks, Vol 61, pp 85-117, Jan 2015

  7. arXiv:1402.3511  [pdf, other]

    cs.NE cs.LG

    A Clockwork RNN

    Authors: Jan Koutník, Klaus Greff, Faustino Gomez, Jürgen Schmidhuber

    Abstract: Sequence prediction and classification are ubiquitous and challenging problems in machine learning that can require identifying complex dependencies between temporally distant inputs. Recurrent Neural Networks (RNNs) have the ability, in theory, to cope with these temporal dependencies by virtue of the short-term memory implemented by their recurrent (feedback) connections. However, in practice th…

    Submitted 14 February, 2014; originally announced February 2014.

  8. arXiv:1312.6764  [pdf]

    cs.AI

    Bounded Recursive Self-Improvement

    Authors: E. Nivel, K. R. Thórisson, B. R. Steunebrink, H. Dindo, G. Pezzulo, M. Rodriguez, C. Hernandez, D. Ognibene, J. Schmidhuber, R. Sanz, H. P. Helgason, A. Chella, G. K. Jonsson

    Abstract: We have designed a machine that becomes increasingly better at behaving in underspecified circumstances, in a goal-directed way, on the job, by modeling itself and its environment as experience accumulates. Based on principles of autocatalysis, endogeny, and reflectivity, the work provides an architectural blueprint for constructing systems with high levels of operational autonomy in underspecifie…

    Submitted 24 December, 2013; originally announced December 2013.

    Report number: RUTR-SCS13006

  9. arXiv:1312.5548  [pdf, other]

    cs.NE

    My First Deep Learning System of 1991 + Deep Learning Timeline 1962-2013

    Authors: Jürgen Schmidhuber

    Abstract: Deep Learning has attracted significant attention in recent years. Here I present a brief overview of my first Deep Learner of 1991, and its historic context, with a timeline of Deep Learning highlights.

    Submitted 19 December, 2013; originally announced December 2013.

    Comments: 11 pages. As a machine learning researcher I am obsessed with proper credit assignment. This draft is the result of an experiment in rapid massive open online peer review. Since 20 September 2013, subsequent revisions published under http://www.deeplearning.me have absorbed many suggestions for improvements by experts

  10. arXiv:1309.0261  [pdf, other]

    cs.CV

    Multi-Column Deep Neural Networks for Offline Handwritten Chinese Character Classification

    Authors: Dan Cireşan, Jürgen Schmidhuber

    Abstract: Our Multi-Column Deep Neural Networks achieve best known recognition rates on Chinese characters from the ICDAR 2011 and 2013 offline handwriting competitions, approaching human performance.

    Submitted 1 September, 2013; originally announced September 2013.

    Comments: 5 pages, 1 figure, IDSIA tech report

    Report number: IDSIA-05-13

  11. arXiv:1305.0423  [pdf, other]

    cs.LG cs.AI stat.ML

    Testing Hypotheses by Regularized Maximum Mean Discrepancy

    Authors: Somayeh Danafar, Paola M. V. Rancoita, Tobias Glasmachers, Kevin Whittingstall, Juergen Schmidhuber

    Abstract: Do two data samples come from different distributions? Recent studies of this fundamental problem focused on embedding probability distributions into sufficiently rich characteristic Reproducing Kernel Hilbert Spaces (RKHSs), to compare distributions by the distance between their embeddings. We show that Regularized Maximum Mean Discrepancy (RMMD), our novel measure for kernel-based hypothesis tes…

    Submitted 2 May, 2013; originally announced May 2013.

  12. arXiv:1302.1700  [pdf, other]

    cs.CV cs.AI

    Fast Image Scanning with Deep Max-Pooling Convolutional Neural Networks

    Authors: Alessandro Giusti, Dan C. Cireşan, Jonathan Masci, Luca M. Gambardella, Jürgen Schmidhuber

    Abstract: Deep Neural Networks now excel at image classification, detection and segmentation. When used to scan images by means of a sliding window, however, their high computational complexity can bring even the most powerful hardware to its knees. We show how dynamic programming can speed up the process by orders of magnitude, even when max-pooling layers are present.

    Submitted 7 February, 2013; originally announced February 2013.

    Comments: 11 pages, 2 figures, 3 tables, 21 references, submitted to ICIP 2013

    Report number: IDSIA-01-13

    Journal ref: International Conference on Image Processing (ICIP) 2013, Melbourne

  13. arXiv:1302.1690  [pdf, other]

    cs.CV

    A Fast Learning Algorithm for Image Segmentation with Max-Pooling Convolutional Networks

    Authors: Jonathan Masci, Alessandro Giusti, Dan Cireşan, Gabriel Fricout, Jürgen Schmidhuber

    Abstract: We present a fast algorithm for training MaxPooling Convolutional Networks to segment images. This type of network yields record-breaking performance in a variety of tasks, but is normally trained on a computationally expensive patch-by-patch basis. Our new method processes each training image in a single pass, which is vastly more efficient. We validate the approach in different scenarios and r…

    Submitted 7 February, 2013; originally announced February 2013.

  14. arXiv:1212.6521  [pdf, other]

    cs.AI

    A Frequency-Domain Encoding for Neuroevolution

    Authors: Jan Koutník, Juergen Schmidhuber, Faustino Gomez

    Abstract: Neuroevolution has yet to scale up to complex reinforcement learning tasks that require large networks. Networks with many inputs (e.g. raw video) imply a very high dimensional search space if encoded directly. Indirect methods use a more compact genotype representation that is transformed into networks of potentially arbitrary size. In this paper, we present an indirect method where networks are…

    Submitted 28 December, 2012; originally announced December 2012.

  15. arXiv:1212.2546  [pdf, other]

    cs.CV

    A Learning Framework for Morphological Operators using Counter-Harmonic Mean

    Authors: Jonathan Masci, Jesús Angulo, Jürgen Schmidhuber

    Abstract: We present a novel framework for learning morphological operators using counter-harmonic mean. It combines concepts from morphology and convolutional neural networks. A thorough experimental validation analyzes basic morphological operators dilation and erosion, opening and closing, as well as the much more complex top-hat transform, for which we report a real-world application from the steel indu…

    Submitted 11 December, 2012; originally announced December 2012.

    Comments: Submitted to ISMM'13

  16. arXiv:1210.8385  [pdf, other]

    cs.AI cs.LG

    First Experiments with PowerPlay

    Authors: Rupesh Kumar Srivastava, Bas R. Steunebrink, Jürgen Schmidhuber

    Abstract: Like a scientist or a playing child, PowerPlay not only learns new skills to solve given problems, but also invents new interesting problems by itself. By design, it continually comes up with the fastest to find, initially novel, but eventually solvable tasks. It also continually simplifies or compresses or speeds up solutions to previous tasks. Here we describe first experiments with PowerPlay. A…

    Submitted 31 October, 2012; originally announced October 2012.

    Comments: 13 pages, 6 figures. Extends preliminary work presented at ICDL-EpiRob 2012

  17. arXiv:1210.0118  [pdf, ps, other]

    cs.NE

    Self-Delimiting Neural Networks

    Authors: Juergen Schmidhuber

    Abstract: Self-delimiting (SLIM) programs are a central concept of theoretical computer science, particularly algorithmic information & probability theory, and asymptotically optimal program search (AOPS). To apply AOPS to (possibly recurrent) neural networks (NNs), I introduce SLIM NNs. Neurons of a typical SLIM NN have threshold activation functions. During a computational episode, activations are spreadi…

    Submitted 29 September, 2012; originally announced October 2012.

    Comments: 15 pages

    Report number: IDSIA-08-12

  18. arXiv:1209.6048  [pdf, other]

    stat.ME

    Improving the Asymptotic Performance of Markov Chain Monte-Carlo by Inserting Vortices

    Authors: Yi Sun, Faustino Gomez, Juergen Schmidhuber

    Abstract: We present a new way of converting a reversible finite Markov chain into a non-reversible one, with a theoretical guarantee that the asymptotic variance of the MCMC estimator based on the non-reversible chain is reduced. The method is applicable to any reversible chain whose states are not connected through a tree, and can be interpreted graphically as inserting vortices into the state transition…

    Submitted 26 September, 2012; originally announced September 2012.

    Comments: Published in NIPS 2010

  19. arXiv:1209.5853  [pdf, other]

    cs.AI

    Efficient Natural Evolution Strategies

    Authors: Yi Sun, Daan Wierstra, Tom Schaul, Juergen Schmidhuber

    Abstract: Efficient Natural Evolution Strategies (eNES) is a novel alternative to conventional evolutionary algorithms, using the natural gradient to adapt the mutation distribution. Unlike previous methods based on natural gradients, eNES uses a fast algorithm to calculate the inverse of the exact Fisher information matrix, thus increasing both robustness and performance of its evolution gradient estimatio…

    Submitted 26 September, 2012; originally announced September 2012.

    Comments: Published in GECCO'2009

  20. arXiv:1207.1765  [pdf, other]

    cs.CV cs.NE

    Object Recognition with Multi-Scale Pyramidal Pooling Networks

    Authors: Jonathan Masci, Ueli Meier, Gabriel Fricout, Jürgen Schmidhuber

    Abstract: We present a Multi-Scale Pyramidal Pooling Network, featuring a novel pyramidal pooling layer at multiple scales and a novel encoding layer. Thanks to the former the network does not require all images of a given classification task to be of equal size. The encoding layer improves generalisation performance in comparison to similar neural network architectures, especially when training data is sca…

    Submitted 7 July, 2012; originally announced July 2012.

  21. arXiv:1207.1522  [pdf, other]

    cs.CV cs.NE

    Multimodal similarity-preserving hashing

    Authors: Jonathan Masci, Michael M. Bronstein, Alexander A. Bronstein, Jürgen Schmidhuber

    Abstract: We introduce an efficient computational framework for hashing data belonging to multiple modalities into a single representation space where they become mutually comparable. The proposed approach is based on a novel coupled siamese neural network architecture and allows unified treatment of intra- and inter-modality similarity learning. Unlike existing cross-modality similarity learning approaches…

    Submitted 6 July, 2012; originally announced July 2012.

  22. arXiv:1206.4623  [pdf]

    cs.LG stat.ML

    On the Size of the Online Kernel Sparsification Dictionary

    Authors: Yi Sun, Faustino Gomez, Juergen Schmidhuber

    Abstract: We analyze the size of the dictionary constructed from online kernel sparsification, using a novel formula that expresses the expected determinant of the kernel Gram matrix in terms of the eigenvalues of the covariance operator. Using this formula, we are able to connect the cardinality of the dictionary with the eigen-decay of the covariance operator. In particular, we show that under certain tec…

    Submitted 18 June, 2012; originally announced June 2012.

    Comments: ICML2012

  23. arXiv:1202.2745  [pdf, other]

    cs.CV cs.AI

    Multi-column Deep Neural Networks for Image Classification

    Authors: Dan Cireşan, Ueli Meier, Juergen Schmidhuber

    Abstract: Traditional methods of computer vision and machine learning cannot match human performance on tasks such as the recognition of handwritten digits or traffic signs. Our biologically plausible deep artificial neural network architectures can. Small (often minimal) receptive fields of convolutional winner-take-all neurons yield large network depth, resulting in roughly as many sparsely connected neur…

    Submitted 13 February, 2012; originally announced February 2012.

    Comments: 20 pages, 14 figures, 8 tables

    Report number: IDSIA-04-12

    Journal ref: CVPR 2012, p. 3642-3649

  24. arXiv:1201.0292  [pdf, other]

    cs.LG

    T-Learning

    Authors: Vincent Graziano, Faustino Gomez, Mark Ring, Juergen Schmidhuber

    Abstract: Traditional Reinforcement Learning (RL) has focused on problems involving many states and few actions, such as simple grid worlds. Most real world problems, however, are of the opposite type, involving few relevant states and many actions. For example, to return home from a conference, humans identify only few subgoal states such as lobby, taxi, airport etc. Each valid behavior connecting two such…

    Submitted 31 December, 2011; originally announced January 2012.

  25. arXiv:1112.6291  [pdf, other]

    cs.CV cs.NE

    Descriptor learning for omnidirectional image matching

    Authors: Jonathan Masci, Davide Migliore, Michael M. Bronstein, Jürgen Schmidhuber

    Abstract: Feature matching in omnidirectional vision systems is a challenging problem, mainly because complicated optical systems make the theoretical modelling of invariance and construction of invariant feature descriptors hard or even impossible. In this paper, we propose learning invariant descriptors using a training set of similar and dissimilar descriptor pairs. We use the similarity-preserving hashi…

    Submitted 29 December, 2011; originally announced December 2011.

  26. arXiv:1112.5309  [pdf, ps, other]

    cs.AI cs.LG

    POWERPLAY: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem

    Authors: Jürgen Schmidhuber

    Abstract: Most of computer science focuses on automatically solving given computational problems. I focus on automatically inventing or discovering problems in a way inspired by the playful behavior of animals and humans, to train a more and more general problem solver from scratch in an unsupervised fashion. Consider the infinite set of all computable descriptions of tasks with possibly computable solution…

    Submitted 4 November, 2012; v1 submitted 22 December, 2011; originally announced December 2011.

    Comments: 21 pages, additional connections to previous work, references to first experiments with POWERPLAY

  27. arXiv:1112.2113  [pdf, other]

    cs.AI

    Incremental Slow Feature Analysis: Adaptive and Episodic Learning from High-Dimensional Input Streams

    Authors: Varun Raj Kompella, Matthew Luciw, Juergen Schmidhuber

    Abstract: Slow Feature Analysis (SFA) extracts features representing the underlying causes of changes within a temporally coherent high-dimensional raw sensory input signal. Our novel incremental version of SFA (IncSFA) combines incremental Principal Components Analysis and Minor Components Analysis. Unlike standard batch-based SFA, IncSFA adapts along with non-stationary environments, is amenable to episod…

    Submitted 9 December, 2011; originally announced December 2011.

    Journal ref: Neural Computation, 2012, Vol. 24, No. 11, Pages 2994-3024

  28. arXiv:1109.1314  [pdf, ps, other]

    cs.AI

    Measuring Intelligence through Games

    Authors: Tom Schaul, Julian Togelius, Jürgen Schmidhuber

    Abstract: Artificial general intelligence (AGI) refers to research aimed at tackling the full problem of artificial intelligence, that is, create truly intelligent agents. This sets it apart from most AI research which aims at solving relatively narrow domains, such as character recognition, motion planning, or increasing player satisfaction in games. But how do we know when an agent is truly intelligent? A…

    Submitted 6 September, 2011; originally announced September 2011.

  29. arXiv:1106.4487  [pdf, ps, other]

    stat.ML cs.NE

    Natural Evolution Strategies

    Authors: Daan Wierstra, Tom Schaul, Tobias Glasmachers, Yi Sun, Jürgen Schmidhuber

    Abstract: This paper presents Natural Evolution Strategies (NES), a recent family of algorithms that constitute a more principled approach to black-box optimization than established evolutionary algorithms. NES maintains a parameterized distribution on the set of solution candidates, and the natural gradient is used to update the distribution's parameters in the direction of higher expected fitness. We intr…

    Submitted 22 June, 2011; originally announced June 2011.

  30. arXiv:1106.1998  [pdf, other]

    cs.AI

    A Linear Time Natural Evolution Strategy for Non-Separable Functions

    Authors: Yi Sun, Faustino Gomez, Tom Schaul, Juergen Schmidhuber

    Abstract: We present a novel Natural Evolution Strategy (NES) variant, the Rank-One NES (R1-NES), which uses a low rank approximation of the search distribution covariance matrix. The algorithm allows computation of the natural gradient with cost linear in the dimensionality of the parameter space, and excels in solving high-dimensional non-separable problems, including the best result to date on the Rosenb…

    Submitted 13 June, 2011; v1 submitted 10 June, 2011; originally announced June 2011.

  31. arXiv:1103.5708  [pdf, other]

    cs.AI stat.ML

    Planning to Be Surprised: Optimal Bayesian Exploration in Dynamic Environments

    Authors: Yi Sun, Faustino Gomez, Juergen Schmidhuber

    Abstract: To maximize its success, an AGI typically needs to explore its initially unknown world. Is there an optimal way of doing so? Here we derive an affirmative answer for a broad class of environments.

    Submitted 29 March, 2011; originally announced March 2011.

  32. arXiv:1103.4487  [pdf, other]

    cs.LG cs.AI cs.CV cs.NE

    Handwritten Digit Recognition with a Committee of Deep Neural Nets on GPUs

    Authors: Dan C. Cireşan, Ueli Meier, Luca M. Gambardella, Jürgen Schmidhuber

    Abstract: The competitive MNIST handwritten digit recognition benchmark has a long history of broken records since 1998. The most recent substantial improvement by others dates back 7 years (error rate 0.4%). Recently we were able to significantly improve this result, using graphics cards to greatly speed up training of simple but deep MLPs, which achieved 0.35%, outperforming all the previous more complex…

    Submitted 23 March, 2011; originally announced March 2011.

    Comments: 9 pages, 4 figures, 3 tables

    Report number: IDSIA-03-11

  33. arXiv:1102.0183  [pdf, other]

    cs.AI cs.NE

    High-Performance Neural Networks for Visual Object Classification

    Authors: Dan C. Cireşan, Ueli Meier, Jonathan Masci, Luca M. Gambardella, Jürgen Schmidhuber

    Abstract: We present a fast, fully parameterizable GPU implementation of Convolutional Neural Network variants. Our feature extractors are neither carefully designed nor pre-wired, but rather learned in a supervised way. Our deep hierarchical architectures achieve the best published results on benchmarks for object classification (NORB, CIFAR10) and handwritten digit recognition (MNIST), with error rates of…

    Submitted 1 February, 2011; originally announced February 2011.

    Comments: 12 pages, 2 figures, 5 tables

    Report number: IDSIA 1-11

  34. arXiv:1009.2634  [pdf, other]

    physics.hist-ph cs.CY

    Evolution of National Nobel Prize Shares in the 20th Century

    Authors: Juergen Schmidhuber

    Abstract: We analyze the evolution of cumulative national shares of Nobel Prizes since 1901, properly taking into account that most prizes were divided among several laureates. We rank by citizenship at the moment of the award, and by country of birth. Surprisingly, graphs of this type have not been published before, even though they powerfully illustrate the century's migration patterns (brain drains and g…

    Submitted 14 September, 2010; originally announced September 2010.

    Comments: 19 pages, 17 figures

  35. Deep Big Simple Neural Nets Excel on Handwritten Digit Recognition

    Authors: Dan Claudiu Ciresan, Ueli Meier, Luca Maria Gambardella, Juergen Schmidhuber

    Abstract: Good old on-line back-propagation for plain multi-layer perceptrons yields a very low 0.35% error rate on the famous MNIST handwritten digits benchmark. All we need to achieve this best result so far are many hidden layers, many neurons per layer, numerous deformed training images, and graphics cards to greatly speed up learning.

    Submitted 1 March, 2010; originally announced March 2010.

    Comments: 14 pages, 2 figures, 4 listings

    Journal ref: Neural Computation, Volume 22, Number 12, December 2010

  36. arXiv:0812.4360  [pdf, ps, other]

    cs.AI cs.NE

    Driven by Compression Progress: A Simple Principle Explains Essential Aspects of Subjective Beauty, Novelty, Surprise, Interestingness, Attention, Curiosity, Creativity, Art, Science, Music, Jokes

    Authors: Juergen Schmidhuber

    Abstract: I argue that data becomes temporarily interesting by itself to some self-improving, but computationally limited, subjective observer once he learns to predict or compress the data in a better way, thus making it subjectively simpler and more beautiful. Curiosity is the desire to create or discover more non-random, non-arbitrary, regular data that is novel and surprising not in the traditional se…

    Submitted 15 April, 2009; v1 submitted 23 December, 2008; originally announced December 2008.

    Comments: 35 pages, 3 figures, based on KES 2008 keynote and ALT 2007 / DS 2007 joint invited lecture

    Journal ref: Short version: J. Schmidhuber. Simple Algorithmic Theory of Subjective Beauty, Novelty, Surprise, Interestingness, Attention, Curiosity, Creativity, Art, Science, Music, Jokes. Journal of SICE 48(1), 21-32, 2009

  37. Algorithm Selection as a Bandit Problem with Unbounded Losses

    Authors: Matteo Gagliolo, Juergen Schmidhuber

    Abstract: Algorithm selection is typically based on models of algorithm performance, learned during a separate offline training sequence, which can be prohibitively expensive. In recent work, we adopted an online approach, in which a performance model is iteratively updated and used to guide selection on a sequence of problem instances. The resulting exploration-exploitation trade-off was represented as a…

    Submitted 9 July, 2008; originally announced July 2008.

    Comments: 15 pages, 2 figures

    Report number: IDSIA-07-08 ACM Class: F.2.2; G.3; I.1.2; I.2.6; I.2.8

  38. arXiv:0804.3269  [pdf, ps, other]

    cs.CL cs.NE

    Phoneme recognition in TIMIT with BLSTM-CTC

    Authors: Santiago Fernández, Alex Graves, Juergen Schmidhuber

    Abstract: We compare the performance of a recurrent neural network with the best results published so far on phoneme recognition in the TIMIT database. These published results have been obtained with a combination of classifiers. However, in this paper we apply a single recurrent neural network to the same task. Our recurrent neural network attains an error rate of 24.6%. This result is not significantly…

    Submitted 21 April, 2008; originally announced April 2008.

    Comments: 8 pages

    Report number: IDSIA-04-08 ACM Class: I.2.7; I.5.4

  39. arXiv:0709.0674  [pdf, ps, other]

    cs.AI cs.GR

    Simple Algorithmic Principles of Discovery, Subjective Beauty, Selective Attention, Curiosity & Creativity

    Authors: Juergen Schmidhuber

    Abstract: I postulate that human or other intelligent agents function or should function as follows. They store all sensory observations as they come - the data is holy. At any time, given some agent's current coding capabilities, part of the data is compressible by a short and hopefully fast program / description / explanation / world model. In the agent's subjective eyes, such data is more regular and m…

    Submitted 5 September, 2007; originally announced September 2007.

    Comments: 15 pages, 3 highly compressible low-complexity drawings. Joint Invited Lecture for Algorithmic Learning Theory (ALT 2007) and Discovery Science (DS 2007), Sendai, Japan, 2007

    ACM Class: I.2.0

  40. arXiv:0709.0670  [pdf, ps, other]

    cs.DS cs.IT

    Using Data Compressors to Construct Rank Tests

    Authors: Daniil Ryabko, Juergen Schmidhuber

    Abstract: Nonparametric rank tests for homogeneity and component independence are proposed, which are based on data compressors. For homogeneity testing the idea is to compress the binary string obtained by ordering the two joint samples and writing 0 if the element is from the first sample and 1 if it is from the second sample and breaking ties by randomization (extension to the case of multiple samples…

    Submitted 5 September, 2007; originally announced September 2007.

    Journal ref: Applied Mathematics Letters, 22:7, 1029-1032, 2009

  41. arXiv:0708.4311  [pdf, ps, other]

    cs.AI

    2006: Celebrating 75 years of AI - History and Outlook: the Next 25 Years

    Authors: Juergen Schmidhuber

    Abstract: When Kurt Goedel laid the foundations of theoretical computer science in 1931, he also introduced essential concepts of the theory of Artificial Intelligence (AI). Although much of subsequent AI research has focused on heuristics, which still play a major role in many practical AI applications, in the new millennium AI theory has finally become a full-fledged formal science, with important opti…

    Submitted 31 August, 2007; originally announced August 2007.

    Comments: 14 pages; preprint of invited contribution to the Proceedings of the "50th Anniversary Summit of Artificial Intelligence" at Monte Verita, Ascona, Switzerland, 9-14 July 2006

    ACM Class: I.2.0

  42. arXiv:0705.2011  [pdf, other]

    cs.AI cs.CV

    Multi-Dimensional Recurrent Neural Networks

    Authors: Alex Graves, Santiago Fernandez, Juergen Schmidhuber

    Abstract: Recurrent neural networks (RNNs) have proved effective at one dimensional sequence learning tasks, such as speech and online handwriting recognition. Some of the properties that make RNNs suitable for such tasks, for example robustness to input warping, and the ability to access contextual information, are also desirable in multidimensional domains. However, there has so far been no direct way o…

    Submitted 14 May, 2007; originally announced May 2007.

    Comments: 10 pages, 10 figures

    Report number: 04-07

  43. arXiv:cs/0701120  [pdf, ps, other]

    cs.LG cs.AI cs.IT

    Algorithmic Complexity Bounds on Future Prediction Errors

    Authors: A. Chernov, M. Hutter, J. Schmidhuber

    Abstract: We bound the future loss when predicting any (computably) stochastic sequence online. Solomonoff finitely bounded the total deviation of his universal predictor $M$ from the true distribution $\mu$ by the algorithmic complexity of $\mu$. Here we assume we are at a time $t>1$ and already observed $x=x_1...x_t$. We bound the future prediction performance on $x_{t+1}x_{t+2}...$ by a new variant of al…

    Submitted 19 January, 2007; originally announced January 2007.

    Comments: 21 pages

    Journal ref: Information and Computation, Vol. 205, Nr. 2 (2007) 242-261

  44. arXiv:cs/0606081  [pdf, ps, other]

    cs.AI

    New Millennium AI and the Convergence of History

    Authors: Juergen Schmidhuber

    Abstract: Artificial Intelligence (AI) has recently become a real formal science: the new millennium brought the first mathematically sound, asymptotically optimal, universal problem solvers, providing a new, rigorous foundation for the previously largely heuristic field of General AI and embedded agents. At the same time there has been rapid progress in practical methods for learning true sequence-proces…

    Submitted 29 June, 2006; v1 submitted 19 June, 2006; originally announced June 2006.

    Comments: Speed Prior: clarification / 15 pages, to appear in "Challenges to Computational Intelligence"

    Report number: IDSIA-14-06 ACM Class: I.2

  45. arXiv:cs/0603023  [pdf, ps, other]

    cs.RO cs.LG

    Metric State Space Reinforcement Learning for a Vision-Capable Mobile Robot

    Authors: Viktor Zhumatiy, Faustino Gomez, Marcus Hutter, Juergen Schmidhuber

    Abstract: We address the problem of autonomously learning controllers for vision-capable mobile robots. We extend McCallum's (1995) Nearest-Sequence Memory algorithm to allow for general metrics over state-action trajectories. We demonstrate the feasibility of our approach by successfully running our algorithm on a real mobile robot. The algorithm is novel and unique in that it (a) explores the environmen…

    Submitted 7 March, 2006; originally announced March 2006.

    Comments: 14 pages, 8 figures

    Report number: IDSIA-05-06

    Journal ref: Proc. 9th International Conf. on Intelligent Autonomous Systems (IAS 2006) pages 272-281

  46. arXiv:cs/0512062  [pdf, ps, other]

    cs.NE

    Evolino for recurrent support vector machines

    Authors: Juergen Schmidhuber, Matteo Gagliolo, Daan Wierstra, Faustino Gomez

    Abstract: Traditional Support Vector Machines (SVMs) need pre-wired finite time windows to predict and classify time series. They do not have an internal state necessary to deal with sequences involving arbitrary long-term dependencies. Here we introduce a new class of recurrent, truly sequential SVM-like devices with internal adaptive states, trained by a novel method called EVOlution of systems with KEr…

    Submitted 15 December, 2005; originally announced December 2005.

    Comments: 10 pages, 2 figures

    Report number: IDSIA-19-05 version 2.0 ACM Class: F.1.1; I.2.6

  47. arXiv:cs/0309048  [pdf, ps, other]

    cs.LO cs.AI

    Goedel Machines: Self-Referential Universal Problem Solvers Making Provably Optimal Self-Improvements

    Authors: Juergen Schmidhuber

    Abstract: We present the first class of mathematically rigorous, general, fully self-referential, self-improving, optimally efficient problem solvers. Inspired by Kurt Goedel's celebrated self-referential formulas (1931), such a problem solver rewrites any part of its own code as soon as it has found a proof that the rewrite is useful, where the problem-dependent utility function and the hardware and the…

    Submitted 17 December, 2006; v1 submitted 25 September, 2003; originally announced September 2003.

    Comments: 29 pages, 1 figure, minor improvements, updated references

    Report number: IDSIA-19-03 ACM Class: F.4.1

    Journal ref: Variants published in "Adaptive Agents and Multi-Agent Systems II", LNCS 3394, p. 1-23, Springer, 2005: ISBN 978-3-540-25260-3; as well as in Proc. ICANN 2005, LNCS 3697, p. 223-233, Springer, 2005 (plenary talk); as well as in "Artificial General Intelligence", Series: Cognitive Technologies, Springer, 2006: ISBN-13: 978-3-540-23733-4

  48. arXiv:cs/0302012  [pdf, ps, other]

    cs.AI cs.LG quant-ph

    The New AI: General & Sound & Relevant for Physics

    Authors: Juergen Schmidhuber

    Abstract: Most traditional artificial intelligence (AI) systems of the past 50 years are either very limited, or based on heuristics, or both. The new millennium, however, has brought substantial progress in the field of theoretically optimal and practically feasible algorithms for prediction, search, inductive inference based on Occam's razor, problem solving, decision making, and reinforcement learning…

    Submitted 27 November, 2003; v1 submitted 10 February, 2003; originally announced February 2003.

    Comments: 23 pages, updated refs, added Goedel machine overview, corrected computing history timeline. To appear in B. Goertzel and C. Pennachin, eds.: Artificial General Intelligence

    Report number: TR IDSIA-04-03 ACM Class: I.2

  49. arXiv:cs/0207097  [pdf, ps, other]

    cs.AI cs.CC cs.LG

    Optimal Ordered Problem Solver

    Authors: Juergen Schmidhuber

    Abstract: We present a novel, general, optimally fast, incremental way of searching for a universal algorithm that solves each task in a sequence of tasks. The Optimal Ordered Problem Solver (OOPS) continually organizes and exploits previously found solutions to earlier tasks, efficiently searching not only the space of domain-specific algorithms, but also the space of search algorithms. Essentially we ex…

    Submitted 23 December, 2002; v1 submitted 31 July, 2002; originally announced July 2002.

    Comments: 43 pages, 2 figures, short version at NIPS 2002 (added 1 figure and references; streamlined presentation)

    Report number: IDSIA-12-02 ACM Class: I.2.2; I.2.6; I.2.8

    Journal ref: Machine Learning, 54, 211-254, 2004.

  50. arXiv:cs/0111060  [pdf, ps, other]

    cs.AI

    Gradient-based Reinforcement Planning in Policy-Search Methods

    Authors: Ivo Kwee, Marcus Hutter, Juergen Schmidhuber

    Abstract: We introduce a learning method called "gradient-based reinforcement planning" (GREP). Unlike traditional DP methods that improve their policy backwards in time, GREP is a gradient-based method that plans ahead and improves its policy before it actually acts in the environment. We derive formulas for the exact policy gradient that maximizes the expected future reward and confirm our ideas with…

    Submitted 28 November, 2001; originally announced November 2001.

    Comments: This is an extended version of the paper presented at the EWRL 2001 in Utrecht (The Netherlands)

    Report number: 14-01 ACM Class: I.2; I.2.6; I.2.8