
SQL collides into Hadoop

Tony Baer

It’s going to be quite a whirlwind this week: the Informatica analyst conference tomorrow, wall-to-wall meetings at Strata on Wednesday afternoon, and then off to the mountains of Colorado for the SAS analyst meet next week.

So EMC Greenplum caught us at an opportune time in the news cycle. They chose the day before Strata to saturate the airwaves with the announcement that they are staking their future on Hadoop.

Specifically, they have come up with a product carrying a proliferation of brands. EMC is the parent company; Greenplum is the business unit and the Advanced SQL MPP database; Pivotal HD is the branding of their new SQL/Hadoop offering; and, guess what, did we forget to tell you about the HAWQ engine that provides the interactivity with Hadoop? Sounds like branding by committee.

(OK, HAWQ is the code name for the project, but Greenplum has been promoting it quite prominently. They might want to tone it down, as the HAWQ domain is already claimed.)

But it is an audacious move for Greenplum. While its rivals are keeping Hadoop either at arm’s length or in Siamese-twin deployments, Greenplum is putting its engine directly into Hadoop, sitting atop HDFS. The idea is that, to the enterprise user, it looks like Greenplum, but with the scale-out of HDFS underneath. Greenplum is not alone in looking to a singular Hadoop destination for Big Data analytics; Cloudera is also pushing heavily in that direction with Impala. And while Hortonworks has pushed coexistence through its HCatalog Apache incubator project and close OEM partnerships with Teradata Aster and Microsoft, it is responding with the announcement of the Tez runtime and Stinger interactive Hadoop query projects.

These developments simply confirm what we’ve been saying: SQL is converging with Hadoop, and for good reason. For Big Data (and Hadoop) to be accepted into the mainstream enterprise, it has to become a first-class citizen with IT, the data center, and the business. That means it must (1) map reasonably to the skills (SQL, Java) that already exist, and extend from there; (2) fit in with the databases, applications, and systems management practices (e.g., storage, virtualization) by which the data center operates; and (3) deliver analytics that start by covering the ground the business understands.

For enterprises, these announcements represent vendors placing stakes in the ground; these are newly announced products that in many cases still rely on pre-release technology. But it is important to understand the direction of the announcements, what it means for the analytics your shop produces, and how your future needs will be met.

Clearly, Hadoop and its programming styles (for now, MapReduce remains the best known) offer a new approach for a new kind of analytic: connecting the dots. But for enterprises, the journey to Big Data and Hadoop will be more evolutionary.
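For readers who haven’t touched Hadoop, the MapReduce style mentioned above can be sketched in a few lines of plain Python. This is just an illustration of the map/shuffle/reduce phases on a local list, not the Hadoop API itself; the function names are ours, and a real cluster distributes each phase across many nodes:

```python
from collections import defaultdict

def map_phase(records):
    # Map: emit a (word, 1) pair for every word in every input record
    for record in records:
        for word in record.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group intermediate values by key, as the framework would
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts emitted for each word
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["SQL converges with Hadoop", "Hadoop meets SQL"]
counts = reduce_phase(shuffle(map_phase(docs)))
# counts: {"sql": 2, "converges": 1, "with": 1, "hadoop": 2, "meets": 1}
```

The point of the model is that the map and reduce functions are embarrassingly parallel; the framework handles partitioning, shuffling, and fault tolerance, which is why it scales out but also why it feels so different from writing SQL.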
