Quadratic Observable Operator Models for efficient prediction and classification of stochastic time series
Final Report Abstract
Sequence data made from strings of symbols occur in several highly relevant domains. The most important sources of such data are written texts, gene sequences and protein sequences. A less visible but equally important source is sequences of actions performed by a human or a robot. Such sequences play an important role in the design of computer-user interfaces and in autonomous robots. A primary task in all of these applications is to enable computers to learn the regularities hidden in the empirically encountered sequence data. This is a precondition, for example,
• to enable spam filters to distinguish spam from regular email,
• to automatically classify texts in text retrieval systems,
• to find out how many (and which) species of marine micro-organisms are contained in a sample of deep-sea water,
• to classify newly found enzymes and derive hypotheses about their function by comparison with known enzymes,
• to enable autonomous robots to plan actions in incompletely known environments which are full of random effects.
The main analytical tool in state-of-the-art approaches to such tasks is hidden Markov models (HMMs). The known learning algorithms for HMMs are, however, not perfect: they are computationally expensive, and the resulting capture of regularities in the data does not reach the theoretical optimum. Over the past few years, an alternative to HMMs, called Observable Operator Models (OOMs), has been developed. OOMs admit, in principle, much faster learning algorithms and can yield more accurate models. However, this potential of OOMs had not been fully realized: the algorithms known for OOMs in 2006 were not always stable and, worse, could sometimes lead to the prediction of negative "probabilities" when applied to new data. In this situation, the DFG-funded project "Quadratic Observable Operator Models" (2005 - 2008, Jacobs University Bremen, working group of Herbert Jaeger, principal investigator Dr. Mingjie Zhao) developed new learning algorithms and an altogether novel variant of OOMs, termed norm-OOMs, which prevent negative "probabilities" from occurring. These new analytical and algorithmic results represent an important step toward superseding HMMs by a next-generation toolset based on OOMs. The new mathematical theory yielded a surprising insight: the quantum-physics formalism is closely related to norm-OOMs. Indeed, certain (discrete) quantum systems can be naturally captured by norm-OOMs. This may open doors in physics to learning quantum models from experimental data with highly efficient automated tools.
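The abstract does not spell out how norm-OOMs avoid negative "probabilities". The following Python sketch illustrates one way this can work, under an assumption that is not stated in this abstract: a sequence is assigned the squared Euclidean norm of the state vector obtained by applying one operator per symbol to a unit-norm initial state, and the operators satisfy the condition that the sum of their transposed products with themselves is the identity matrix. The two-symbol alphabet, the operator values and all names (OPERATORS, INITIAL_STATE, sequence_probability) are hypothetical toy choices, not taken from the project.

```python
import math
from itertools import product


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Hypothetical toy operators for the alphabet {"a", "b"}.
# They are chosen so that T_a^T T_a + T_b^T T_b equals the 2x2 identity:
#   T_a^T T_a = diag(0.5, 0.2),  T_b^T T_b = diag(0.5, 0.8).
OPERATORS = {
    "a": [[0.0, -math.sqrt(0.2)], [math.sqrt(0.5), 0.0]],
    "b": [[math.sqrt(0.5), 0.0], [0.0, math.sqrt(0.8)]],
}

# Assumed unit-norm initial state.
INITIAL_STATE = [1.0, 0.0]


def sequence_probability(sequence):
    """Apply one operator per symbol, then return the squared norm of the state."""
    state = INITIAL_STATE
    for symbol in sequence:
        state = mat_vec(OPERATORS[symbol], state)
    # A sum of squares cannot be negative, whatever the operator entries are.
    return sum(x * x for x in state)


# Probabilities of all sequences of length 3 over the toy alphabet.
probs = {"".join(s): sequence_probability(s) for s in product("ab", repeat=3)}
total = sum(probs.values())
```

In this sketch every value in probs is a sum of squares and hence non-negative, and the operator condition makes the values for all sequences of a fixed length add up to one (up to floating-point rounding). The same combination of operators applied in sequence and a squared norm read out as a probability is what makes the comparison with discrete quantum systems natural.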
Publications
- A bound on modeling error in observable operator models and an associated learning algorithm
M. Zhao, H. Jaeger
- Diverse results from the "Quadratic Observable Operator Models" project
M. Zhao
- (2007): The Error Controlling Algorithm for Learning OOMs. Jacobs University Technical Report Nr. 6
M. Zhao, H. Jaeger
- (2007): Norm observable operator models. Jacobs University Technical Report Nr. 8
M. Zhao, H. Jaeger