Abstract
We live in an era of data deluge, and as a result the term “big data” appears in many contexts, including meteorology, genomics, complex physics simulations, biological and environmental research, finance, business, and healthcare. As the name implies, big data literally means large collections of data sets containing abundant information. However, it has some special characteristics that distinguish it from “very large data” or “massive data”, which are simply enormous collections of simple-format records, typically equivalent to enormous spreadsheets. Big data, being generally unstructured and heterogeneous, is extremely difficult to handle with traditional approaches, and it requires real-time or near-real-time analysis. A short definition, therefore, is that “big data” refers to data sets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.