Spark Innovation With Teradata QueryGrid 

by Richard Hackathorn

These are confusing times for IT management. Accepted practices and established technologies seem limited or even irrelevant in light of today’s opportunities and challenges with incorporating big data and advanced analytics into enterprise systems.

To understand these issues, Bolder Technology, Inc., conducted a study entitled “Analytics in Action with Teradata® QueryGrid™” that looked at how nine companies are using analytic solutions. These industry leaders blend data among the Teradata Database, Teradata Aster Database and open-source Apache™ Hadoop®. Their approach is to adopt the Teradata Unified Data Architecture™, an ecosystem for enterprise analytics.

The glue for this architecture is Teradata QueryGrid, whose goal is to orchestrate analytic processes as a single unit of work, based on SQL, across various platforms. It provides a transparent access layer with parallelized data flows and can minimize data movement, so business users neither know nor need to care where data is stored. To leverage the unique processing capabilities of specific platforms, the solution also supports pushdown processing so remote platforms can perform specialized analytic functions.

Access Layer Simplifies Problem Solving

Teradata QueryGrid enables data movement to and from the Hadoop platform to support enterprise analytics. Here is one example:

A travel reservations company uses the solution for its website, call center and A/B testing. One of the organization’s goals is to better understand the “conversion funnel,” the path along which online customers move from browsing to purchasing. Gaining that understanding requires analyzing two data sets: behavior data from website logs and analytic vendors, and booking data.

Web logs provide insights into the conversion funnel so the company can answer questions such as, “Where should we spend marketing funds to improve website bookings?” These Web logs contain a significant amount of data that needs to be parsed to assess the unique customer journey for every individual visit.

While Hadoop is useful in parsing the Web logs, analysis is challenging because the two data sets sit on different platforms: the Web logs are in Hadoop, while booking data is transferred into the integrated data warehouse from an ERP system on a regular basis. Teradata QueryGrid resolves the issue by enabling the Teradata platform to process the finalized customer journey data coming from Hadoop.

Knowing more about the conversion funnel also requires using interactive voice response (IVR) data sets. These complex sequences of text and audio from call centers are stored on the Hadoop platform. Teradata QueryGrid can be used to combine booking and IVR data to uncover customer conversion insights.

In addition, Teradata QueryGrid can support A/B testing for improving website content design. The travel reservations company constantly tests new ideas for website content and style, and assesses the impact of dozens of website changes by observing randomly selected customers.

Using Hadoop, an A/B testing team looks at clickstream data to track each website change. Throughout each day, Teradata QueryGrid transfers booking data from the Teradata platform to Hadoop. The team is then able to match the two data sets to calculate metrics such as dollar volume for each test.
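The matching step described above can be illustrated with a minimal Python sketch. It is not the company's implementation and does not use Teradata QueryGrid; the record layouts, field names (`visitor_id`, `test_id`, `variant`, `amount`) and sample values are all hypothetical, chosen only to show how clickstream assignments and booking amounts might be joined to produce dollar volume per test variant.

```python
from collections import defaultdict

# Hypothetical clickstream rows: which test variant each visitor saw.
clickstream = [
    {"visitor_id": "v1", "test_id": "homepage_banner", "variant": "A"},
    {"visitor_id": "v1", "test_id": "homepage_banner", "variant": "A"},  # repeat click
    {"visitor_id": "v2", "test_id": "homepage_banner", "variant": "B"},
    {"visitor_id": "v3", "test_id": "homepage_banner", "variant": "A"},  # never booked
]

# Hypothetical booking rows: one row per completed booking.
bookings = [
    {"visitor_id": "v1", "amount": 250.0},
    {"visitor_id": "v2", "amount": 100.0},
    {"visitor_id": "v2", "amount": 50.5},
]


def dollar_volume_by_variant(clickstream, bookings):
    """Join the two data sets on visitor_id and sum bookings per (test, variant)."""
    # Total booking amount per visitor.
    spend = defaultdict(float)
    for booking in bookings:
        spend[booking["visitor_id"]] += booking["amount"]

    # Collapse repeated clicks so each visitor counts once per test variant.
    assignments = {
        (row["test_id"], row["variant"], row["visitor_id"]) for row in clickstream
    }

    # Visitors with no booking contribute 0.0 to their variant's total.
    totals = defaultdict(float)
    for test_id, variant, visitor_id in assignments:
        totals[(test_id, variant)] += spend.get(visitor_id, 0.0)
    return dict(totals)


if __name__ == "__main__":
    for key, volume in sorted(dollar_volume_by_variant(clickstream, bookings).items()):
        print(key, volume)
```

Deduplicating assignments before the join matters here: without it, a visitor who clicks several times would have the same booking counted once per click.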

Recurring Themes Emerge With Enterprise Analytics

These seven themes recurred through the majority of the use cases:

  • Cultural bridge. The access layer of Teradata QueryGrid acts as the bridge between two technical cultures, such as the communities around the Teradata and Hadoop platforms, enabling collaboration within the same information ecosystem. This results in a more natural work environment that brings together relational and non-relational data. For example, the efficiency of Teradata QueryGrid allowed a company to load Hadoop data in parallel into the Teradata Aster Database, where text conditioning and analysis detected 50% more bad emails.
  • Data placement. Organizations realize that where data is placed and processed is an important configuration issue. Synchronizing data movement among platforms optimizes processing where the data resides. However, users want to access data through the platform they prefer. Therefore, the tradeoffs of data storage and processing must be balanced among the various platforms.
  • Data marriages. Business value is created when new data is married with older reference data on customers, purchases and the like. Proper analytics increases the value of the data, and that value is further enhanced when insights are combined with reporting and dissemination tools. These marriages drive the justification for data movement among the platforms, such as when a travel reservations company needs to integrate the booking data from the Teradata platform with data in Hadoop to complete its conversion funnel and A/B testing analyses.
  • Messy data storage. Messy data comes from sources including Web logs, sensors, text and social media. Organizations need the ability to quickly and efficiently store this data using Hadoop, then access the information to support business applications. A company can use Teradata QueryGrid to move messy data from the Teradata platform and from Hadoop into the Teradata Aster Discovery Platform to perform analytics for new application development or other business needs.
  • Event sequencing. One critical analytics capability is the Teradata Aster nPath™ function, which discovers the sequence of events that precedes a significant business event, such as a customer switching to a competitor. One organization used Teradata QueryGrid to pull Web log data from Hadoop and billing data from a data warehouse into the Teradata Aster Discovery Platform, where nPath could be employed to answer event-sequencing questions such as, “What is the last step the customer performs before going away from our website?”
  • Parallelizing data streams. The benefits of running parallelized data streams include eliminating bottlenecks and changing workflows. Analysts will be able to ask more questions and get more answers so they can explore more alternatives and better validate business solutions. Teradata QueryGrid initiates a massively parallel connection, resulting in hundreds of concurrent data streams between platforms.
  • SQL views. SQL views simplify usage and ensure security for Teradata QueryGrid. For instance, a company relied on the access layer to enable the Teradata Aster Discovery Platform to collect data from Hadoop and the Teradata Database for analysis. The results were shared using Teradata Aster Lens™ visualizations. The company used Teradata QueryGrid via SQL views to hide Hadoop configurations and protect the data since Hadoop lacks adequate security.
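The event-sequencing question quoted in the themes above, about the last step a customer performs before leaving the website, can be sketched in a few lines of Python. This is an illustration of the question only, not the Teradata Aster nPath function or its syntax; the event tuples, session IDs, timestamps and page names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical Web log events: (session_id, timestamp, page), in arbitrary order.
events = [
    ("s1", 1, "home"), ("s1", 2, "search"), ("s1", 3, "pricing"),
    ("s2", 1, "home"), ("s2", 2, "pricing"),
    ("s3", 5, "checkout"), ("s3", 1, "home"),
]


def last_step_counts(events):
    """Count how often each page is the final step of a session."""
    # Group events by session.
    sessions = defaultdict(list)
    for session_id, timestamp, page in events:
        sessions[session_id].append((timestamp, page))

    # The last step is the event with the latest timestamp in each session.
    last_pages = Counter()
    for steps in sessions.values():
        _, page = max(steps, key=lambda step: step[0])
        last_pages[page] += 1
    return last_pages


if __name__ == "__main__":
    print(last_step_counts(events).most_common())
```

A pattern-matching function such as nPath generalizes this idea to arbitrary sequences of steps; the sketch covers only the simplest case, the final event per session.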

A New Approach

The IT industry is in the beginning stages of redefining the data warehouse and advanced analytics as an integrated information ecosystem within the enterprise. As implied by the use cases in the study, an approach is emerging that supports enterprise analytics at scale. It relies on federated workflows that run across a fabric of closely coupled, purpose-built platforms. The objective is a constant cycle of discovery and innovation, resulting in incremental improvements to business processes.

Richard Hackathorn is founder and president of Bolder Technology, Inc., a consultancy focused on enterprise analytics, business intelligence and data warehousing. 

Read the full article and more in the Q2 2015 issue of Teradata Magazine.

 
