Getting Started with Hadoop

Our fourth installment from Mitch Seigle’s time on “The Cube” at Hadoop Summit focuses on getting started with Hadoop. Three initial steps to consider:

  1. Experiment – We strongly recommend an experimentation phase, with significant testing in a Hadoop environment before that environment goes into production
  2. Prioritize Data – Identify the high-value problems and prioritize the Hadoop projects aimed at solving them
  3. Don’t Re-architect…Yet – Test before starting any strategy discussion about rebuilding data integration or data warehousing processes

Syncsort DMExpress can help simplify many Hadoop processes, such as loading data into the Hadoop framework and improving MapReduce performance. Its self-tuning capabilities can also help alleviate the skills gap associated with Hadoop.

If you haven’t already, I strongly encourage you to also check out videos 1, 2, and 3 from the series for more great insights from Mitch.


Authored by Keith Kohl

Vice President, Product Management
