Assessing the significance of chromosomal aberrations in cancer: Methodology and application to glioma

Abstract
Statistical analysis is critical in the interpretation of experimental data across the life sciences, including neuroscience. The nature of the data collected has a critical role in determining the best statistical approach to take. One particularly prevalent type of data is referred to as “clustered data.” Clustered data are characterized as data that can be classified into a number of distinct groups or “clusters” within a particular study. Clustered data arise most commonly in neuroscience when data are compiled across multiple experiments, for example in electrophysiological or optical recordings taken from synaptic terminals, with each experiment providing a distinct cluster of data. However, there are many other types of experimental design that can yield clustered data. Here, we provide a statistical model for intracluster correlation and systematically investigate a range of methods for analyzing clustered data. Our analysis reveals that it is critical to take data clustering into account and suggests appropriate statistical approaches that can be used to account for data clustering.