Input-output HMMs for sequence processing

1 September 1996

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Neural Networks

Vol. 7 (5), 1231-1249
https://doi.org/10.1109/72.536317

Abstract

We consider problems of sequence processing and propose a solution based on a discrete-state model in order to represent past context. We introduce a recurrent connectionist architecture having a modular structure that associates a subnetwork to each state. The model has a statistical interpretation we call input-output hidden Markov model (IOHMM). It can be trained by the expectation-maximization (EM) or generalized EM (GEM) algorithms, considering state trajectories as missing data, which decouples temporal credit assignment and actual parameter estimation. The model presents similarities to hidden Markov models (HMMs), but allows us to map input sequences to output sequences, using the same processing style as recurrent neural networks. IOHMMs are trained using a more discriminant learning paradigm than HMMs, while potentially taking advantage of the EM algorithm. We demonstrate on a benchmark problem that IOHMMs are well suited to grammatical inference. Experimental results are presented for the seven Tomita grammars, showing that these adaptive models can attain excellent generalization.
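
The modular structure described in the abstract, one subnetwork per state with transitions conditioned on the current input, can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation. It assumes each subnetwork is a single linear layer followed by a softmax, and the names `iohmm_forward`, `W_trans`, `W_out`, and `zeta0` are hypothetical, chosen only for this illustration. Training by EM or GEM is not shown.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def iohmm_forward(inputs, W_trans, W_out, zeta0):
    """Propagate a state distribution through an input sequence.

    inputs:  (T, d) array, one input vector per time step.
    W_trans: (n, n, d) array; W_trans[j] is the (assumed linear) state
             subnetwork attached to previous state j.
    W_out:   (n, k, d) array; W_out[i] is the (assumed linear) output
             subnetwork attached to state i.
    zeta0:   (n,) initial state distribution.
    Returns the (T, k) expected output distributions and the final
    (n,) state distribution.
    """
    zeta = zeta0
    outputs = []
    for u in inputs:
        # phi[j, i] = P(state_t = i | state_{t-1} = j, input_t)
        phi = softmax(W_trans @ u)
        # New state distribution: mix transition rows by the old distribution.
        zeta = zeta @ phi
        # eta[i] = output distribution proposed by state i for this input.
        eta = softmax(W_out @ u)
        # Overall output: state-weighted mixture of per-state outputs.
        outputs.append(zeta @ eta)
    return np.array(outputs), zeta

# Example: a binary string encoded one-hot (d = 2), two states, two classes.
rng = np.random.default_rng(0)
symbols = np.array([0, 1, 1, 0, 1])
inputs = np.eye(2)[symbols]
W_trans = rng.normal(size=(2, 2, 2))
W_out = rng.normal(size=(2, 2, 2))
outputs, final_state = iohmm_forward(inputs, W_trans, W_out, np.array([1.0, 0.0]))
```

Because the transition probabilities depend on the input at each step, the state distribution is driven by the input sequence rather than evolving autonomously as in a standard HMM. For a task like grammatical inference, the last row of `outputs` could be read as the accept/reject distribution for the whole string.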

