Overlooked Aspects of Big Data Product Management
Ben Woo is Managing Director of Neuralytix, a global leader in contemporary and relevant IT market research.
In a video interview with Aditya Mukerjee, you were asked about the role of the product manager in a big data-enabled enterprise. How do you see this role evolving as big data becomes increasingly second nature?
The role of product manager is going to become increasingly important. Product managers will be the creative leaders of their businesses. They will come up with ideas and leverage the information systems (read: big data) teams to get the data and information needed to test the market. Over time, they may even automate market testing using machine learning, so that big data systems proactively provide recommendations to product managers.
VERY IMPORTANT: This does not take away the human factor. Only humans have the creativity to take the output of a recommendation engine and interpret it for consumption.
Some feel that de-anonymization, which erodes the protection anonymization once afforded, is a serious problem for big data applications in retail and healthcare. Does an industry/market analyst have a responsibility to highlight security and privacy issues when recommending technology adoption, as in the case of big data?
Of course. However, security is too often treated as separate from the overall solution. Security should be part and parcel of the data itself, preferably embedded in the metadata. This will help improve performance and increase security. Additionally, enterprises need to be using Test Data Management (TDM) systems. That said, anonymized data is ultimately useless unless the outcome can be attributed to a specific customer or patient.
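The idea of security traveling with the data in its metadata can be sketched in a few lines. This is a minimal illustration, not a production design; the `Record` class, field names, and role check are all hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Record:
    """A data record whose security policy travels with it as metadata."""
    payload: dict
    # Hypothetical policy fields, e.g. {"classification": "pii", "allowed_roles": [...]}
    metadata: dict = field(default_factory=dict)

def read_record(record: Record, role: str) -> dict:
    """Return the payload only if the caller's role satisfies the embedded policy."""
    allowed = record.metadata.get("allowed_roles", [])
    if role not in allowed:
        raise PermissionError(f"role {role!r} may not read this record")
    return record.payload

patient = Record(
    payload={"name": "Jane Doe", "diagnosis": "..."},
    metadata={"classification": "pii", "allowed_roles": ["clinician"]},
)
print(read_record(patient, "clinician")["name"])
```

Because the policy lives alongside the data rather than in a separate system, any store or pipeline that moves the record also moves its access rules.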
In your rebuttal to a Gartner opinion that big data had “fallen into a trough of disillusionment,” you wrote that maturity for big data “will be reached in about three years.” That was more than a year ago. What say you today?
We are absolutely on course. 2013 was supposed to be the Year of Big Data, but overall, it turned out to be a dud. Over the course of 2014 through 2016, enterprises will accelerate their acceptance of big data from concept to production, and accelerate their deployments across the enterprise. We are still very much in the “hype” phase. Every day, a new big data-related company pops up. Over the next couple of years, these companies will consolidate. Then and only then will we see any peak of maturity, and perhaps then we can consider falling into a trough.
When you wrote about interviewing an academic librarian in Asheville, N.C., you touched on the role of metadata for archivists and others responsible for long term custody of data. Looking ahead three to five years, will the semantic web become more relevant for these practitioners, or is the path still painfully gradualist?
The path will always be painful for any records manager. We don’t live in a static world. The amount of metadata that can be gathered for any given piece of data will always increase, based on new understandings of the data and on derivative data.
When asked to name another analyst you regard highly, you mentioned IDC’s Richard Villars. What makes his work so special?
Rick represents someone who understands his markets in context. He doesn’t have a personal agenda. He has a very wide understanding of the IT market, and his personal as well as professional experience makes him one of the world’s stand-out IT industry analysts.
“Not disappointing” is how you described your own experience with a wearable device, the Pebble Steel “smartwatch.” Do you think today’s apps are sufficiently nuanced to push sensory alerts, or is the smartphone ringtone a special-case alert?
The wearable market is still nascent. I think that we’re still testing the waters in terms of what are the key elements of a smartwatch that make sense in a mass market. Will smartwatches be a collector of data? Will they be a visualization of data? Will they be truly interactive? What market are we actually looking at (everyone having a smartwatch, or a more exclusive market)?
We even have to think about how many smart “things” we have on our bodies. Recently, I had the opportunity to try a Fitbit. I loved it; it collected data that I found very interesting. But after the first few days, I forgot to wear it, making the data useless (or at least out of context). So, if I have a smartwatch, a Fitbit (or similar), and a heart-rate monitor (HRM), will that increase my dressing time by 10 minutes every morning? Will that affect my desire to use these products? As you can see, there are many, many more questions than answers.
Below the waterline, the iceberg managed by Syncsort and other ETL suppliers looms large. When looking at the future Internet of Things, in what ways might it create more opportunities for big data ETL and ELT suppliers?
We collect data in many different places. When we look into the future, one thing is clear – there will be data providers, there will be data aggregators, and there will be data analyzers. At each stage there will be some form of ETL or ELT. As we measure more and more things through IoT, the opportunity (at least in the immediate term) seems limitless.
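The provider → aggregator → analyzer chain described above can be sketched as a tiny pipeline. This is a toy illustration under assumed data: the sensor readings, field names, and the Fahrenheit-to-Celsius transform are all hypothetical stand-ins for real IoT feeds:

```python
# Hypothetical readings from two "data providers" (IoT sensors with different units)
provider_a = [{"sensor": "a", "temp_f": 68.0}, {"sensor": "a", "temp_f": 70.5}]
provider_b = [{"sensor": "b", "temp_c": 21.0}]

def transform(reading: dict) -> dict:
    """Normalize every reading to Celsius (the T in ETL)."""
    if "temp_f" in reading:
        return {"sensor": reading["sensor"], "temp_c": (reading["temp_f"] - 32) * 5 / 9}
    return {"sensor": reading["sensor"], "temp_c": reading["temp_c"]}

def aggregate(streams: list) -> list:
    """The data aggregator: extract from each provider, transform, load into one store."""
    store = []
    for stream in streams:
        store.extend(transform(r) for r in stream)
    return store

def analyze(store: list) -> float:
    """The data analyzer: mean temperature across all normalized readings."""
    return sum(r["temp_c"] for r in store) / len(store)

store = aggregate([provider_a, provider_b])
print(round(analyze(store), 2))
```

Each hand-off in the chain (provider to aggregator, aggregator to analyzer) involves its own extract, transform, and load step, which is where the opportunity for ETL/ELT suppliers multiplies as IoT measurement grows.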
You cite a need for “tools that make data accessible to domain experts and, quite frankly, average knowledge workers. As long as data requires specialists such as data scientists to manipulate and analyze it, the whole idea of big data will refuse to scale. We need for data to stop being like a bug stuck in amber.” Won’t these tools have to become more domain-aware themselves in order to scale in this way?
What we’re seeing are point tools being developed for specific needs and markets. Over time, many of these point tools will be consolidated through M&A activity. Many of these tools, even today, do a very similar job, with edge nuances for each market. So, domain-awareness is necessary, but it will be more packaging than substance.
Two trends are pushing big data analytics in seemingly opposite directions. The first: democratizing and simplifying analytics (perhaps via AaaS) so that more people have access to the technology without requiring data scientists. The second: demand for more complex analytics, requiring more data scientists. Which trend will dominate?
Neither; they are actually complementary. On the one hand, you have portals and apps that make it easier to consume data; on the other, you have apps that make it easier to analyze data. Both are necessary.
In a critique of analyst coverage of Virtual Instruments, you wrote that “instrumentation is an enabling technology,” not just applicable to storage resource management. How much of future big data systems do you think will consist of both producing and consuming performance information about IT?
Instrumentation is a function for infrastructure. Big data is just the “soft” part of the compute process. That said, app performance management, and any other automated optimization, will become increasingly necessary. IT stands for information technology, not infrastructure technology. Anything that we can do to simplify, standardize, and make the infrastructure more predictable is critical.
What standards – formal or de facto – are you watching most closely?
Ah, a simple question to end. Will Hadoop (specifically HDFS) become the de facto data management platform in three to five years’ time?