data management

Last week, I attended two days of sessions at GigaOM’s Structure:Data conference in New York City, where over 700 attendees came together to discuss how Big Data is transforming business and industry, and the latest technologies and approaches for managing it all.

What struck me this year is that the conversation has evolved from treating Big Data as an infrastructure-only issue to realizing that the Big Data stack requires contributions at every level, from the bottom layer of the infrastructure up through the top application layer.

The following key themes emerged from the onsite discussions and will be the focus as the community continues to develop the Big Data stack:

1) It’s all about high-performance computing and speeding up analytics as data volumes grow exponentially. The pain points for unstructured and structured data are different: unstructured data requires better visualization, while structured data requires more cleansing, which makes filtering and grouping much more critical. One of the speakers referenced a quote from Clay Shirky: “Information overload is not the problem. It’s filter failure.”

2) The line between personal and business behavior is blurring as analytics moves out of the IT realm and into the hands of business users. As a result, there is an expectation that data be delivered in forms that are easier to consume, such as through visualization and collaboration capabilities.

3) Real-time decision making through predictive analytics and machine learning is becoming essential, given the growth of sensor data and digital exhaust and the need to gain insight into consumer behavior.

As such, there’s a realization that the Big Data market is fragmented and that there is plenty of opportunity to contribute to building the Big Data stack. For example, software packages and tools need to be built on top of Hadoop to increase enterprise adoption. Currently, most of the available enterprise software is proprietary. Offering applications layered on top of Hadoop will spur the Big Data market, leading to more open source contributions and additional opportunities for startups.

Syncsort has a lot to offer in the areas of performance, data integration, and processing, all critical components of the Big Data stack. We can deliver and run ETL over Hadoop without requiring a brand new development team and skill set. One of the speakers suggested that businesses should consider adopting Hadoop only if they are willing to dedicate a separate team. Syncsort’s offering eliminates this requirement for the enterprise. We can also efficiently move data in and out of Hadoop, which, as John Webster points out in his CNET post, continues to be an issue.

To reach the holy grail of Big Data management, the focus needs to be on building a top-to-bottom Big Data stack, which will require different segments of the market to come together.


Thanks to a recent post by my colleague Steven Totman, some of you may already know that I had the chance to participate on a panel about “The Breakpoints of Big Data” at the FIMA conference in London. One thing that really caught my attention during the entire conference was the number of attendees from the business side of the house. I think this represents further evidence that the information technology landscape continues to change dramatically. More than ever, the business stakeholders are willing to get their hands dirty to learn, participate and immerse themselves in the world of data management and information technology.

Some IT organizations may feel threatened by this trend and may choose to ignore it or minimize it. However, many others are already embracing it with significant benefits to the business. Ultimately, true collaboration between business users and IT empowers companies to make sense of ‘Big Data.’ I believe this holds true for most aspects of IT, especially those that are more strategic like analytics, business intelligence and, of course, data integration.

By enabling greater levels of collaboration between business users and IT, organizations can make a great leap forward and unleash the opportunities of ‘Big Data’ by:

  • Bringing together the right set of skills to understand the data and focus on the right areas
  • Accelerating development cycles, providing the business with the agility it needs to capitalize on opportunities faster than the competition
  • Understanding the business impact of diverse data services to prioritize resources and work towards a common goal
  • Generating more knowledgeable and satisfied business users, resulting in greater ROI and user adoption

That said, there is no perfect world. The new dynamics will require proper tools that allow and facilitate greater levels of collaboration, self-service, and reusability. That is why these concepts are core to Syncsort’s ETL 2.0 approach. Even then, there will always be sources of friction and areas of inefficiency. In this regard, it is really not any different than a couple that has been married for a long time.

The business user has arrived and there is no turning back. The good news for IT is that they just might have found the perfect ally for taming ‘Big Data’ and capitalizing on the opportunities it presents to organizations of all sizes.


I am currently in Orlando for the first of two events happening here over the next couple of weeks. First is the Gartner Symposium / ITxpo, and the second is TDWI, which kicks off later this month. As hard as it is to believe, this is my first time at a Gartner Symposium, and I am very much enjoying the sessions, the insight, and meeting with colleagues, partners, and the Gartner analysts.

I am concentrating specifically on the data management and data integration sessions, but I am also very interested in the sessions on cloud, big data, and Hadoop. I will also be spending time at the ITxpo, where vendors (Syncsort included) toot their own horns and try to impress the masses.

The data management and data integration sessions from analysts like Ted Friedman and Mark Beyer are particularly interesting to me. One of the benefits of my role at Syncsort is having the opportunity to regularly interact with smart analysts like Ted and Mark. It is impossible to participate in a briefing or advisory session with them and not learn something and/or engage in a spirited debate.  Specific sessions that have caught my eye at Symposium include:

I am interested in the trends that Gartner is seeing in the industry, the vendors and technologies that have caught their attention, and, most importantly, what customers are sharing with them. It is always interesting to see how that last piece aligns with what we are hearing from our own customer base. For example, we are hearing more and more from prospects that they need to bring their transformations (the “T” in ETL) back to the ETL layer or engine, because they’re experiencing performance and capacity issues with their current implementations. More on this in the coming weeks, but in the meantime check out this video we recently put out on this issue.

Next is cloud.  I spend a lot of time talking to customers and thinking about the possibilities with the cloud from a data integration perspective. As part of this, I am constantly looking at the latest and greatest market projections and analysis.  I’m sure I will get my fill of this topic at Symposium and will share any key takeaways in this space in upcoming posts.

It would also be impossible to ignore anything Big Data and Hadoop related. Everyone knows data is growing and there is a need to extract and process more of it. While many companies, including Syncsort, like to talk about social media and mobile, we are also seeing businesses wanting to handle more granular data and more historical data. They don’t just want a month’s worth of a customer’s shopping habits or vendor purchases. Their marketing departments want three years’ worth of historical information…and they want it refreshed daily or even multiple times a day. Easy enough, right?

Enter Hadoop…OK, Hadoop entered several years ago. But we are seeing more and more interest in what Syncsort can do for Hadoop. At the conference, it’s going to be interesting to see what Gartner is saying and projecting for Hadoop usage, use cases, and the overall maturity of the Hadoop market and deployments.

Syncsort is also here with a strategic partner of ours, Clerity Solutions. We are in booth 529 showcasing how we can work with leading re-hosting partners like Clerity (Micro Focus and Oracle too) to accelerate mainframe modernization and migration projects. Stay tuned on Tuesday for a product announcement. We’ll also be sharing thoughts throughout the week on Twitter using the hashtag #GartnerSYM.


What is Fast?

March 7, 2011

Recently, I had the chance to join my colleagues Steve Totman and Nikhil Kumar for a great conversation with Philip Howard, research director from Bloor Research, about Syncsort’s technology and the reasons why DMExpress is so fast. We always appreciate the opportunity to speak with someone like Philip – with his impressive 30+ years of experience in the data management world – about our “secret sauce” and love it even more when it seems to leave a strong impression!

But what is fast? In our world, fast means the ability to process large data volumes in less time and with fewer resources. How fast? Well, DMExpress can process data at native I/O speed, which pretty much means there’s nothing faster.

But the real question is, what does that mean to me? Because in the end, it’s all about doing more with less, right? It is about costs, about business agility, and about being able to respond faster to new demands for information.

Faster means you can offload your database by performing transformations in your ETL tool. Faster means you can spend less money on additional database capacity and more on supporting new initiatives. Faster means you get extra time to add new data sources. Faster means you spend less time fine-tuning and more time developing new reports. Faster means you can outpace the competition with information that is timely, relevant, and actionable.

So, how is DMExpress able to accomplish all of this? Well, I think Philip Howard does an excellent job of explaining it. You can read his article, “How Come Syncsort is So Fast and What Does That Mean?,” on Bloor Research’s website.
