Big data, at its most basic, is the pool of digital data available to an organization. It is generally made up of both structured and unstructured data. Structured data is already in a format designed for analysis, like data in a spreadsheet or database, while unstructured data is freeform and includes things like infographics, presentations, blog posts, and social media posts.
Astounding quantities of data are generated every day. In just one minute, there are 2 million Google searches, 685,000 Facebook updates, and 48 hours of video uploaded to YouTube. How can organizations draw useful analytics from this massive, heterogeneous pool of data? Here are 5 ways.
1. Alerts
Big data analytics can be used to issue alerts that let scientists, activists, and other organizations act quickly. It can be used, for example, to help agencies like Interpol track criminals across international borders. Another example is Global Forest Watch, which monitors deforestation worldwide. Powered by Google Earth Engine, this project crunches data from NASA and US Geological Survey satellites. Tropical zone data is refreshed every 16 days, which makes it possible to track deforestation in vulnerable areas. Users can also sign up for alerts that are generated when the analysis detects signs of illegal logging in specific areas.
2. Price Optimization
The insurance industry has recently drawn attention (not all of it positive) for using big data analytics in price optimization. By law, premiums are based on actuarial risk and the projected cost of providing coverage. However, some insurers are also using big data analytics to help them set rates. In the insurance industry, price optimization is a data mining technique that lets companies determine which customer subsets are more likely to accept premium increases and which are more likely to shop around for a better rate. Insurers call it a way to price more efficiently, while some consumer advocates call it a misuse of risk-based pricing.
3. Predictive Modeling
Big data can be used to accurately predict box office success.
Big data analytics researcher Taha Yasseri, of the Oxford Internet Institute, built a predictive model of financial success of movies based on collective online activities, and has shown that a movie’s popularity can be predicted well ahead of the movie’s release. This predictive modeling technique involves deriving editorial activity and page view data from Wikipedia pages for movies. Yasseri studied things like number of edits and number of unique editors, and calibrated the model based on known financial success of already-released movies. Without performing sentiment analysis, Yasseri was able to predict a movie’s financial success accurately. The number one predictor of financial success for a movie was the number of page views, most likely because people researching movies often reach the Wikipedia page for that movie first.
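The calibration step described above can be illustrated with a minimal sketch. It is not Yasseri’s actual model: it assumes a single predictor (page views), a plain least-squares line, and made-up numbers, and the names `fit_line` and `predict_revenue` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up calibration data (illustration only): pre-release Wikipedia
# page views in thousands, and revenue in millions of dollars, for
# movies that have already been released.
page_views = [50, 120, 200, 310, 400]
revenue = [12, 25, 41, 63, 80]

# Calibrate the model on movies whose financial success is known.
intercept, slope = fit_line(page_views, revenue)


def predict_revenue(views):
    """Predict revenue for an unreleased movie from its page views."""
    return intercept + slope * views


forecast = predict_revenue(250)
print(f"Predicted revenue: {forecast:.1f} million")
```

A fuller version would add the other signals mentioned above, such as the number of edits and the number of unique editors, as extra predictors.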
4. Forecasting
Perhaps the most famous example of forecasting using big data analytics was Google Flu Trends, which used Google searches to track how flu outbreaks spread worldwide. Although there is considerable noise in this data, the massive volume of flu-related searches allowed researchers to identify and track the spread of outbreaks in almost real time. By contrast, when outbreaks are identified from the cases doctors report as they observe them, there’s a lag of up to two weeks. The model wasn’t perfect, however: it overestimated a flu outbreak in the US, possibly because heightened media coverage led to an increase in searches on flu symptoms. But it did provide actionable insights based on correlation.
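To make “insights based on correlation” concrete, here is a small sketch that measures how closely weekly search counts track officially reported cases. It is not how Flu Trends was implemented; the weekly figures are made up, and the helper name `pearson` is hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up weekly figures (illustration only).
flu_searches = [120, 150, 310, 560, 900, 640, 300]   # search counts
reported_cases = [10, 14, 30, 52, 70, 61, 28]        # cases reported by doctors

r = pearson(flu_searches, reported_cases)
print(f"Correlation between searches and reported cases: {r:.2f}")
```

A high correlation says only that the two series move together. If searches rise for another reason, such as media coverage, a model built on that relationship will overestimate cases.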
5. Statistical Analysis
Statistical analysis and big data analytics go hand in hand, and big data analytics allow construction of metrics that evolve quickly. Two economists with the Massachusetts Institute of Technology came up with the Billion Prices Project (BPP), which calculates a daily inflation index from a continually evolving “basket” of products. The data comes from prices listed on the websites of online retailers, and the index is calculated as an average of individual price changes. The BPP showed, for example, that in September 2008, immediately after the Lehman Brothers collapse, businesses started cutting prices, suggesting a drop in demand. The official inflation numbers released by government statistical agencies, by contrast, didn’t show this deflationary trend until November.
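The idea of an index built from an average of individual price changes can be sketched as follows. This is an illustration, not the BPP’s published method: it assumes an unweighted geometric mean of day-over-day price ratios, chained from a base of 100, and it uses made-up prices. The function name `daily_index` is hypothetical.

```python
import math


def daily_index(price_history, base=100.0):
    """Chain a daily index from per-product price changes.

    price_history: list of dicts, one per day, mapping product id -> price.
    Returns one index value per day, starting at `base`.
    """
    index = [base]
    for prev, curr in zip(price_history, price_history[1:]):
        # Only products observed on both days contribute, so the
        # basket can change from one day to the next.
        common = [p for p in curr if p in prev and prev[p] > 0 and curr[p] > 0]
        if not common:
            index.append(index[-1])  # no overlap: carry the index forward
            continue
        # Assumption: unweighted geometric mean of the price ratios.
        mean_log_change = sum(
            math.log(curr[p] / prev[p]) for p in common
        ) / len(common)
        index.append(index[-1] * math.exp(mean_log_change))
    return index


# Made-up prices (illustration only).
history = [
    {"milk": 2.00, "bread": 4.00},
    {"milk": 2.20, "bread": 4.00, "eggs": 3.00},  # eggs enter the basket
    {"milk": 2.20, "bread": 3.60, "eggs": 3.00},  # bread gets cheaper
]
index = daily_index(history)
print([round(value, 2) for value in index])
```

Because each day’s change is computed only from products seen on both days, new products can join the basket without distorting the index on the day they first appear.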
The demand for big data analytics is increasing as organizations of every type discover the actionable insights they can get from big data. Syncsort’s products help customers from over 85 countries around the world – including the majority of the Fortune 100 companies – collect, process, and distribute massive amounts of data quickly, and at lower cost. With Syncsort, organizations don’t have to use expensive and inefficient legacy data workloads to enjoy fast data warehousing and processing as well as the actionable insights produced by big data analytics.