A Single Data Fabric to Tame Big Data

By now you probably know I’m an advocate of the data lake approach for expanding the possibilities around big data.  But, I’ve also made the point that data inside the lake isn’t much use if it isn’t accessible and connected to other data sources.  You can dump all the information you want in a data lake, but co-location is not good enough.  You also need coherence!  All the capacity and data architecture in the world will get you little more than chaos without an orchestration layer to “weave” data together across multiple systems and processes in usable ways.

The textile analogy is not a trivial one.  The idea of tightly woven nodes of connection, redistribution and communication points is behind the revolutionary “data fabric” concept for powerful, diverse and closely interconnected computing assets that are scalable and can be dynamically reconfigured as an agile analytic environment.  Throughout, the ability to orchestrate these assets makes the difference between a simple tangle of disjointed information and a cohesive data fabric that you can leverage for insight and value.


We took a big step toward fabric computing at scale with the April 2014 release of Teradata QueryGrid, a set of intelligent connectors and product capabilities that allows you to submit a single query across your entire multi-system architecture, sidestepping limitations of data movement and system boundaries.  Now I’m pleased to say we’ve added new Data Fabric capabilities, enabled by QueryGrid, that further enhance orchestration of multiple enterprise systems to the point where they essentially behave as one and form a single data fabric.

We announced these QueryGrid enhancements at our Partners Conference & Expo in Nashville, Tennessee, and you can read much more about this in our news release. But even a cursory look at these capabilities makes the value clear. Teradata QueryGrid now enables a Data Fabric that scales economically, with architectural flexibility across the customer’s data and analytical environment.  It does so through several layers of capability, all using the same familiar SQL interface to submit powerful queries to the Teradata Database, Aster Data, Hadoop or other data management platforms. The most powerful bond is between two Teradata Databases, and between a Teradata Database and an Aster database. The next layer binds the Teradata Database and Hadoop systems within the Teradata Unified Data Architecture (UDA).

A third, more universal set of features and capabilities comes in the fabric extensions beyond the Teradata UDA to additional relational databases, NoSQL databases and other data repositories. An Adaptive Optimizer capability ensures optimal query performance across even the most diverse multi-system, multi-vendor and mixed technology ecosystems. Taken together, these improvements help Teradata QueryGrid turn a best-of-breed environment into what I call an Orchestrated Analytical Ecosystem. And a lot of the problems inherent in the data lake are solved in the process.

It’s not hard to think of the use cases that are emerging as we continue to bring these capabilities to market over the next few months. Airlines can now use a single layer of analysis to examine sentiment expressed in social media data on Hadoop alongside high-value customer data from transaction history details located in the data warehouse. In a single step, a manufacturer can leverage serial numbers from the data warehouse along with sensor data in a MongoDB database to examine reports of product failures. And insurers can avoid duplicating data while using multiple systems to look at both recent and historical information when examining claims history for a particular medical condition over a decade or more.
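
To make the manufacturer scenario a bit more concrete, here is a minimal Python sketch of the kind of cross-system join involved. It is only an illustration of the idea, not QueryGrid itself: the two lists are hypothetical in-memory stand-ins for warehouse rows and sensor documents, and every name in it (`warehouse_products`, `sensor_readings`, `failed_units_with_peak_temp`, the field names) is an assumption made up for this example.

```python
from collections import defaultdict

# Hypothetical stand-in for product rows held in the data warehouse.
warehouse_products = [
    {"serial": "SN-001", "model": "pump-A", "failure_reported": True},
    {"serial": "SN-002", "model": "pump-A", "failure_reported": False},
    {"serial": "SN-003", "model": "pump-B", "failure_reported": True},
]

# Hypothetical stand-in for sensor documents held in a document store.
sensor_readings = [
    {"serial": "SN-001", "temp_c": 91.5},
    {"serial": "SN-001", "temp_c": 97.0},
    {"serial": "SN-002", "temp_c": 60.2},
    {"serial": "SN-003", "temp_c": 88.1},
]


def failed_units_with_peak_temp(products, readings):
    """Join units with reported failures to their peak sensor temperature."""
    # Reduce the sensor side to one peak value per serial number.
    peak = defaultdict(lambda: float("-inf"))
    for reading in readings:
        serial = reading["serial"]
        peak[serial] = max(peak[serial], reading["temp_c"])

    # Keep only failed units that have at least one sensor reading.
    return [
        {
            "serial": product["serial"],
            "model": product["model"],
            "peak_temp_c": peak[product["serial"]],
        }
        for product in products
        if product["failure_reported"] and product["serial"] in peak
    ]


report = failed_units_with_peak_temp(warehouse_products, sensor_readings)
```

In a real data fabric this join would be expressed once, in SQL, and the platform would decide where each part of the work runs; the sketch only shows why having the serial number as a shared key across systems matters.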

I think it’s cause for celebration that we’re at a point in the analytic revolution where the data fabric has moved from concept to reality, as a means for enterprises to weave together a truly cohesive, flexible, efficient and economical framework for building and integrating a data lake. These latest Teradata QueryGrid enhancements give companies the flexibility to pick their file systems, operating systems, data types, analytic engines and system design characteristics to reap the most insight and value from their complex and customized analytic environments.  This lets everyone focus even more on getting answers to business questions, not the headaches of underlying IT process or infrastructure.

Scott Gnau
