Automated analysis of high‐content microscopy data with deep learning

Abstract
Existing computational pipelines for quantitative analysis of high‐content microscopy data rely on traditional machine learning approaches that fail to accurately classify more than a single dataset without substantial tuning and training, requiring extensive analysis. Here, we demonstrate that the application of deep learning to biological image data can overcome the pitfalls associated with conventional machine learning classifiers. Using a deep convolutional neural network (DeepLoc) to analyze yeast cell images, we show improved performance over traditional approaches in the automated classification of protein subcellular localization. We also demonstrate the ability of DeepLoc to classify highly divergent image sets, including images of pheromone‐arrested cells with abnormal cellular morphology, as well as images generated in different genetic backgrounds and in different laboratories. We offer an open‐source implementation that enables updating DeepLoc on new microscopy datasets. This study highlights deep learning as an important tool for the expedited analysis of high‐content microscopy data.
Synopsis
Deep learning is used to classify protein subcellular localization in genome‐wide microscopy screens of GFP‐tagged yeast strains. The resulting classifier (DeepLoc) outperforms previous classification methods and is transferable across image sets.
  • A deep convolutional neural network (DeepLoc) is trained to classify protein subcellular localization in GFP‐tagged yeast cells using over 21,000 labeled single cells.
  • DeepLoc outperformed previous SVM‐based classifiers on the same dataset.
  • DeepLoc was used to assess a genome‐wide screen of GFP‐tagged yeast cells exposed to mating pheromone and identified ~300 proteins with significant localization changes.
  • DeepLoc can be effectively applied to other image sets with minimal additional training.
Funding Information
  • Canada Foundation for Innovation (21475)
  • Ministry of Research and Innovation (21475)
  • Canadian Institutes of Health Research (FDN‐143264, FDN‐143265)
  • National Institutes of Health (R01HG005853)
  • Connaught Fund (GCDF:2013‐14)