Big Themes in Big Data
Last week GigaOM hosted a conference in New York on Big Data, a sister event to the annual Structure conference held on the West Coast. We were on site for the day and came away awash in big numbers and big ideas. Certain themes came up again and again. The increasing scale and decreasing cost of storage, computer processing, and bandwidth are combining to create a foundation for new applications and new application models. We’re at a major tipping point. It’s not just that we can do things faster. We can approach problems differently because of the new resources we have available.
Taking a step back, one of the ideas the speakers discussed last week was the need for scale-out storage to go hand in hand with greater computational capabilities. The more information there is, the greater the need to process and analyze it. We’ve always collected information, but until recently we haven’t stored most of it in massive quantities. For example, businesses have traditionally stored security camera footage only for a limited period of time. But what if they had stored footage from every moment of every day going back decades? Consider what analysis of that footage could reveal about human behavior, visitor demographics, traffic patterns over time, and more.
In addition to the big data that’s always been available, we also have new data flowing our way. Jeff Jonas, a Distinguished Engineer at IBM, cited a statistic that mobile phone geolocation data generates 600 billion transactions a day in the US alone. Storing it all could be useful, but only if we can also provide deep context for the data and analyze it in near real time.
Which takes us to computation and computational performance. Big data sets require big analysis, and that requires high-performance computing. Our computing capability on individual systems continues to grow, but we can multiply that power many times over by moving to a parallel, distributed computing model. Higher performance isn’t just about speed. It’s about enabling whole new lines of thought. As Jim Baum, CEO of Netezza (also part of IBM), put it last week, if it takes three days to get back an answer to a question, you won’t ask the follow-up question, or the question after that. If you get an answer in three seconds, the dynamic changes significantly.
The implications of big data and big data analysis are astounding in their scale. Some applications are already in use today, like real-time language translation and augmented reality apps, but there are also huge opportunities for causal analysis and even prediction engines. What’s the impact of migration on GDP? How do we better predict the trajectory of a disease outbreak? The more data we have, and the more effectively we can process it, the more we’ll be able to discover and apply to our world for better business and better living. That’s big data in a nutshell. And it’s a big deal.
Want to read more? Here are some of the GigaOM posts covering last week’s Big Data event.
- Reducing Data Latency Leads to Faster Decisions
- Is Big Data Making Us Dumber?
- Big Data Could Be Cloud’s Killer App



