The amount of data in the world has been exploding. Organizations capture trillions of bytes of information relevant to their operations, and sensors, multimedia and individuals also generate enormous quantities of data.
Big data is a set of data from both traditional and digital sources that can be used for continuous discovery and analysis. It can include unstructured and multi-structured data. Unstructured data is data that is not stored in an organized system; it cannot easily be interpreted by traditional databases or data models, and it usually consists of large amounts of text. An example of unstructured data is social media posts, such as tweets on Twitter. Multi-structured data is data that contains a variety of data formats and types. An example is log data, which represents a mix of text information and images (Arthur).
Big data has also been defined by the concept of the 4 Vs:
- Volume – the quantity of data. Big data requires processing huge amounts of unstructured data whose value is not known in advance. In some cases tens of terabytes need to be processed; in other cases the figure can be hundreds of petabytes.
- Velocity – the rate at which data is received and processed.
- Variety – the availability of many types of unstructured data.
- Value – the value that is discovered from data. There is an impressive range of investigative techniques for this, for example finding out consumer preferences and sentiments, or identifying a piece of equipment that is about to fail (Oracle).
Big data can create value in several ways:
1) Big data can make data more usable.
2) More accurate and detailed information can be collected and analyzed. As a result, better management decisions can be made, and forecasting can be performed on this data.
3) With big data it is possible to segment customers more narrowly and to tailor products and services much more precisely.
In order to capture the full potential of big data, privacy, security and intellectual property issues will need to be addressed (Manyika, Chui, Brown, Dobbs, Roxburgh, Hung Byers).
Big Data analytics opens up the possibility of breakthrough changes and growth rates. Many companies, such as Coca-Cola and Netflix, have already leveraged Big Data. Generally, the data sets are bulky, complex and fast-changing, which creates additional challenges. Appropriate technologies exist to solve these problems, but implementing a Big Data solution requires an investment of time, money and resources (General Networks).
Many techniques and technologies have been developed for the aggregation, manipulation, analysis and visualization of big data. They draw on several fields, including computer science, statistics, economics and applied mathematics. Techniques for Big Data analysis include A/B testing, association rule learning, classification, cluster analysis, crowdsourcing, data mining and machine learning. Big Data technologies include business intelligence (BI), cloud computing, data marts, Hadoop and MapReduce. Presenting information in a form that is convenient for people is also very important. Data visualization techniques include the tag cloud, clustergram, history flow and spatial information flow (McKinsey Global Institute).
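As a rough illustration of the MapReduce model named among the technologies above, the following sketch counts words across a few short texts in plain Python. It is a single-process toy and does not use Hadoop or any distributed framework; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample posts are assumptions made only for this example.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of texts."""
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle(pairs))


# Hypothetical sample data standing in for unstructured text such as posts.
posts = ["big data is big", "data is everywhere"]
counts = word_count(posts)
print(counts)
```

In a real cluster the map and reduce steps would run in parallel on many machines, each working on its own share of the data; here they run one after another only to show how the work is divided.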
Arthur, Lisa. “What Is Big Data?” Forbes. CWO Media, 15 Aug. 2013. Web. 03 Feb. 2016.
Oracle. “The Foundation for Data Innovation.” Web. 03 Feb. 2016.
Manyika, James, Michael Chui, Brad Brown, Jacques Bughin, Richard Dobbs, Charles Roxburgh, and Angela Hung Byers. “Big data: The next frontier for innovation, competition, and productivity.” McKinsey & Company, May 2011. Web. 03 Feb. 2016.
General Networks. “What is Big Data? And Why Is It Important for Me?” 31 Mar. 2015. Web. 03 Feb. 2016.
McKinsey Global Institute. “Big Data: The next frontier for innovation, competition and productivity.” June 2011. Web. 03 Feb. 2016.