Tagged: Multiple

  • admin 10:14 am on April 27, 2017 Permalink
    Tags: Multiple

    Multiple Data Warehouses in One System 

    Teradata Articles

  • admin 9:51 am on February 8, 2017 Permalink
    Tags: Britain, Multiple, Prosper, Serving

    Lloyds Banking Group: One Ecosystem Serving Multiple Brands Delivering Business Outcomes to Help Britain Prosper 

    Latest imported feed items on Analytics Matters

  • admin 9:52 am on February 23, 2016 Permalink
    Tags: Deploy, Multiple

    Deploy Hadoop Multiple Ways 

    The current Hadoop market is dominated by two players, Cloudera and Hortonworks. Both are built on top of open-source Hadoop and are very similar in their packaging; they differ in a few applications from a software perspective (Impala, Ambari, Ranger, Sentry, etc.) and in their support structures. Standing on the sidelines reminds me of watching a similar game played out over a decade ago in the Linux space, when Red Hat, SUSE and others were all competing in the same market.

    Customers heading down the Hadoop pathway often have different objectives and start from different angles. Some set up a lab environment with a small deployment of no-frills open-source Hadoop, then add packages and build out the cluster from there. The risk lies in judging when the lab is ready for production, and whether to stick with the open-source version or convert to an enterprise-grade distro with support going forward. Others go all in and begin their journey with an enterprise-grade Hadoop from day one. The question on their minds is which one to choose.

    I’m often asked by customers and peers which distro to go with, Cloudera or Hortonworks. I usually preface my answer with commentary on support options, the resources available in the market, who else is using which distro and how they are faring on the journey. I’m in the enviable position of being able to offer views and recommendations backed by a deep understanding of multiple factors. Recently, however, I’ve been challenging those who ask the question: why place all your bets on a single vendor? After all, if the differences aren’t too great, why not go with a dual-vendor strategy?

    A single data lake versus multiple data lakes

    If you’ve heard of the concept of the data lake, then you know it’s the approach of landing data of all shapes and sizes onto a low-cost, schema-free environment. The data lake is then used to refine data and serve it up to multiple analytic environments such as a data warehouse, SAS or Aster, Teradata’s advanced analytics platform. The common approach so far has been to deploy a single data lake for the organisation. This design is similar to the mindset of the 1990s, when we would build a single data warehouse that would be all things to all people. Today some customers have multiple data warehouses, with the primary driver being the requirement to separate data and workloads. In government especially, we see a need for a data warehouse that stores highly classified datasets and keeps them physically separated from other datasets.

    Take this design and now apply it to the data lake concept. Whilst a single data lake has the merit of storing all of the data under one roof, handling different workloads and different security rights, the reality is that it can quickly become a data management nightmare. The driver for having multiple data lakes is not technology but corporate need: isolating different workloads, data security requirements, country boundaries and corporate divisions.

    Deploying your data lake 3 ways

    When it comes to a data lake deployment strategy, you essentially have the choice of three architectures: shared nothing, shared management or shared everything.

    The Shared Nothing deployment

    You may already be familiar with the shared nothing architecture if you’ve been knocking around Massively Parallel Processing (MPP) architectures for a while. The concept is based on the view that each Hadoop cluster has its own dedicated storage, processing and management. An example of this is depicted in the following diagram:


    The Shared Management Deployment

    Using this deployment model, you maintain the separation of clusters but centralize their management under a single management layer. This approach still keeps the data physically separate and meets the numerous compliance and security requirements, while reducing the administrative overhead of managing multiple clusters.


    The Shared Everything deployment

    This approach is how many have deployed their data lakes: a single cluster servicing multiple data types, multiple users and multiple workloads.


    How you choose to deploy Hadoop depends entirely on your data security, workload and geographical boundaries. What you have here is flexibility. Don’t think that your data lake has to be a single lake with a single management layer. If you need to build multiple lakes, don’t be afraid to.
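
    The choice among the three deployments can be sketched as a small decision helper. This is a minimal illustration in Python, not a sizing or design tool: the Requirements flags and the suggest_deployment function are hypothetical names invented for this sketch, and the rule itself is an assumption that compresses the drivers discussed above (workload isolation, data security, country boundaries, corporate divisions) into two yes/no questions.

```python
from dataclasses import dataclass
from enum import Enum


class Deployment(Enum):
    SHARED_NOTHING = "shared nothing"
    SHARED_MANAGEMENT = "shared management"
    SHARED_EVERYTHING = "shared everything"


@dataclass(frozen=True)
class Requirements:
    # Hypothetical flags for illustration; not part of any Hadoop distribution.
    physical_separation: bool  # data must live on physically separate clusters
    separate_management: bool  # each cluster must also have its own management layer


def suggest_deployment(req: Requirements) -> Deployment:
    """Map the two assumed requirements onto one of the three deployments."""
    if not req.physical_separation:
        # One cluster serves all data types, users and workloads.
        return Deployment.SHARED_EVERYTHING
    if req.separate_management:
        # Every cluster keeps its own storage, processing and management.
        return Deployment.SHARED_NOTHING
    # Clusters stay separate but are administered from a single layer.
    return Deployment.SHARED_MANAGEMENT


if __name__ == "__main__":
    classified = Requirements(physical_separation=True, separate_management=False)
    print(suggest_deployment(classified).value)
```

    In practice the decision involves far more than two flags; the point of the sketch is only that physical separation and management separation are independent questions, which is what distinguishes shared management from shared nothing.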

    Ben Davis is a Senior Architect for Teradata Australia based in Canberra. With 18 years of experience in consulting, sales and technical data management roles, he has worked with some of the largest Australian organisations in developing comprehensive data management strategies. He holds a degree in Law and a postgraduate Master’s in Business and Technology, and is currently finishing his PhD in Information Technology with a thesis on executing large-scale algorithms within cloud environments.

    The post Deploy Hadoop Multiple Ways appeared first on International Blog.

    Teradata Blogs Feed

  • admin 9:53 am on May 20, 2015 Permalink
    Tags: Multiple

    Teradata QueryGrid: One Solution to Connect Multiple Systems 

    Organizations want to access and benefit from growing data volumes coming from an expanding array of sources. The challenge is to be able to efficiently retrieve and analyze the data since some non-integrated systems from various vendors have complex processing requirements.

    A solution that lets businesses leverage all data, regardless of where it’s stored, is Teradata® QueryGrid™. It optimizes and simplifies access to data across the Teradata Database, Teradata Aster Database and open-source Apache™ Hadoop® that comprise the Teradata Unified Data Architecture™, as well as other source systems.

    Teradata QueryGrid is enabling software engineered to tightly link specialized processing engines so that they act as one solution from the user’s perspective. This intelligent, seamless and transparent access lets users perform multi-system analytics and have queries, or even parts of queries, sent to the appropriate platforms for execution.

    Want to learn more? Read about the real-world benefits of Teradata QueryGrid in Teradata Magazine. 

    Brett Martin
    Teradata Magazine


    The post Teradata QueryGrid: One Solution to Connect Multiple Systems appeared first on Magazine Blog.

    Teradata Blogs Feed

  • admin 10:33 am on May 4, 2015 Permalink
    Tags: channels, measure, Multiple

    Measure Client Satisfaction across Multiple Channels 

    A technology leader and evangelist, John Thuma is a recognized leader in data warehousing, business intelligence, and advanced analytics. With nearly 30 years of practical experience, John has developed and implemented real world solutions across a variety of industries and disciplines.

    Teradata Events
