Scene labeling with LSTM recurrent neural networks
- 1 June 2015
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- No. 10636919, p. 3547-3555
- https://doi.org/10.1109/cvpr.2015.7298977
Abstract
This paper addresses the problem of pixel-level segmentation and classification of scene images with an entirely learning-based approach using Long Short-Term Memory (LSTM) recurrent neural networks, which are commonly used for sequence classification. We investigate two-dimensional (2D) LSTM networks for natural scene images, taking into account the complex spatial dependencies of labels. Prior methods have generally required separate classification and image segmentation stages and/or pre- and post-processing. In our approach, classification, segmentation, and context integration are all carried out by 2D LSTM networks, allowing texture and spatial model parameters to be learned within a single model. The networks efficiently capture local and global contextual information over raw RGB values and adapt well to complex scene images. Our approach, which has a much lower computational complexity than prior methods, achieved state-of-the-art performance on the Stanford Background and the SIFT Flow datasets. In fact, if no pre- or post-processing is applied, LSTM networks outperform other state-of-the-art approaches. Hence, with only a single-core Central Processing Unit (CPU), the running time of our approach is equivalent to or better than that of the compared state-of-the-art approaches, which use a Graphics Processing Unit (GPU). Finally, our networks' ability to visualize feature maps from each layer supports the hypothesis that LSTM networks are overall suited for image processing tasks.
This publication has 17 references indexed in Scilit:
- Supervised texture segmentation using 2D LSTM networks. Published by Institute of Electrical and Electronics Engineers (IEEE), 2014
- Texture Classification Using 2D LSTM Networks. Published by Institute of Electrical and Electronics Engineers (IEEE), 2014
- DeepFace: Closing the Gap to Human-Level Performance in Face Verification. Published by Institute of Electrical and Electronics Engineers (IEEE), 2014
- Learning Hierarchical Features for Scene Labeling. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012
- Offline Arabic Handwriting Recognition with Multidimensional Recurrent Neural Networks. Published by Springer Nature, 2012
- SuperParsing: Scalable Nonparametric Image Parsing with Superpixels. Lecture Notes in Computer Science, 2010
- Multi-dimensional Recurrent Neural Networks. Lecture Notes in Computer Science, 2007
- Multiscale conditional random fields for image labeling. Published by Institute of Electrical and Electronics Engineers (IEEE), 2004
- Long Short-Term Memory. Neural Computation, 1997
- Backpropagation Applied to Handwritten Zip Code Recognition. Neural Computation, 1989