Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification

1 December 2015

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 1026-1034
https://doi.org/10.1109/iccv.2015.123

Abstract

Rectified activation units (rectifiers) are essential for state-of-the-art neural networks. In this work, we study rectifier neural networks for image classification from two aspects. First, we propose a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit. PReLU improves model fitting with nearly zero extra computational cost and little overfitting risk. Second, we derive a robust initialization method that particularly considers the rectifier nonlinearities. This method enables us to train extremely deep rectified models directly from scratch and to investigate deeper or wider network architectures. Based on the learnable activation and advanced initialization, we achieve 4.94% top-5 test error on the ImageNet 2012 classification dataset. This is a 26% relative improvement over the ILSVRC 2014 winner (GoogLeNet, 6.66% [33]). To our knowledge, our result is the first to surpass the reported human-level performance (5.1%, [26]) on this dataset.
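The abstract names two techniques without giving formulas, so the following is only a minimal sketch of how they are commonly written, not the authors' implementation. It assumes PReLU takes the form f(y) = y for y > 0 and a*y otherwise, with a learnable slope a, and that the rectifier-aware initialization draws zero-mean Gaussian weights with standard deviation sqrt(2 / ((1 + a^2) * fan_in)). The function names, the pure-Python layout, and the example slope are hypothetical choices made for illustration. The last helper simply rechecks the relative-improvement arithmetic quoted in the abstract.

```python
import math
import random


def prelu(y, a):
    """Assumed PReLU form: identity for positive inputs, slope `a` otherwise.

    With a = 0 this reduces to the ordinary ReLU.
    """
    return y if y > 0 else a * y


def rectifier_init_std(fan_in, a=0.0):
    """Assumed rectifier-aware std: sqrt(2 / ((1 + a^2) * fan_in)).

    `fan_in` is the number of inputs feeding one unit; `a` is the negative
    slope of the activation (0 for plain ReLU).
    """
    return math.sqrt(2.0 / ((1.0 + a * a) * fan_in))


def init_weights(fan_in, fan_out, a=0.0, seed=0):
    """Draw a fan_out x fan_in weight matrix from a zero-mean Gaussian."""
    rng = random.Random(seed)
    std = rectifier_init_std(fan_in, a)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]


def relative_improvement(baseline_error, new_error):
    """Fraction by which `new_error` improves on `baseline_error`."""
    return (baseline_error - new_error) / baseline_error


if __name__ == "__main__":
    print(prelu(-2.0, 0.25))                  # negative input scaled by the slope
    print(rectifier_init_std(fan_in=8))       # std for a unit with 8 inputs
    print(relative_improvement(6.66, 4.94))   # GoogLeNet 6.66% vs. 4.94%
```

Under these assumptions, the relative improvement of 4.94% over 6.66% works out to (6.66 - 4.94) / 6.66, or about 25.8%, which matches the 26% figure in the abstract after rounding.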

All Related Versions

Version 1, 2015-02-06, ArXiv (Unconfirmed version)

This publication has 13 references indexed in Scilit:

Fast R-CNN
Published by Institute of Electrical and Electronics Engineers (IEEE), 2015
Convolutional neural networks at constrained time cost
Published by Institute of Electrical and Electronics Engineers (IEEE), 2015
Caffe
Published by Association for Computing Machinery (ACM), 2014
DeepFace: Closing the Gap to Human-Level Performance in Face Verification
Published by Institute of Electrical and Electronics Engineers (IEEE), 2014
Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition
Lecture Notes in Computer Science, 2014
On rectified linear units for speech processing
Published by Institute of Electrical and Electronics Engineers (IEEE), 2013
Pushing Stochastic Gradient towards Second-Order Methods – Backpropagation Learning with Transformations in Nonlinearities
Lecture Notes in Computer Science, 2013
Multi-column deep neural networks for image classification
Published by Institute of Electrical and Electronics Engineers (IEEE), 2012
The Pascal Visual Object Classes (VOC) Challenge
International Journal of Computer Vision, 2009
Backpropagation Applied to Handwritten Zip Code Recognition
Neural Computation, 1989

Cited by 11767 articles