Tagged: Path

  • admin 9:51 am on November 3, 2017 Permalink
    Tags: Path

    Behavioral segmentation through path analysis 

    Latest imported feed items on Analytics Matters

     
  • admin 10:33 am on July 18, 2017 Permalink
    Tags: Path

    Introducing the Path Analysis Interface for Teradata 

    Latest imported feed items on Analytics Matters

     
  • admin 9:51 am on May 25, 2017 Permalink
    Tags: Path

    Predicting the Path of Predictive Analytics 

    Latest imported feed items on Analytics Matters

     
  • admin 9:52 am on May 9, 2017 Permalink
    Tags: Bees, Path, Rarely, Straight

    How Are Customers Like Bees? They Rarely Travel a Straight Path or Make a Single Stop 

    Latest imported feed items on Analytics Matters

     
  • admin 9:51 am on April 7, 2017 Permalink
    Tags: Path

    Path Analytics Shouldn’t Be This Difficult! 

    Latest imported feed items on Analytics Matters

     
  • admin 9:53 am on June 11, 2015 Permalink
    Tags: Path

    Teradata’s Secure Path to the Cloud 

    Database users are becoming increasingly comfortable with storing their databases in public or private clouds. For example, Amazon RDS (Relational Database Service) is used by approximately half of all customers of Amazon’s AWS cloud.

    A key feature of cloud computing is multi-tenancy. It allows many different users, potentially from different organizations, to share the same physical resources in the cloud. Since many database-backed applications are not able to fully utilize the CPU, memory, and I/O resources of a single server 24 hours a day, cloud providers leverage multi-tenancy to reduce hardware and labor costs.

    The most straightforward way to implement database multi-tenancy in the cloud is to acquire a virtual machine (VM) in the cloud, install the database system on the VM, load data, and access it as one would any other database system. As an optimization, many cloud providers offer specialized virtual machines with the database preinstalled and preconfigured in order to accelerate the process of setting up the database and making it ready to use.

    Running the database system on a virtual machine is a clean, elegant, and general way to implement multi-tenancy. Multiple databases running on different virtual machines can be mapped to the same physical machine, with negligible concern for any security problems arising from the resulting multi-tenancy. This is because the hypervisor effectively shields each virtual machine from being able to access data from other virtual machines located on the same physical machine.

    For general cloud environments where different users are running arbitrary programs, the virtual machine approach to multi-tenant security is the best option. However, when all users are running the same program (e.g., all users are running the same database server in their virtual machines), this approach is inefficient for several reasons. First, the virtual machines themselves and the operating system software within them consume large amounts of storage and memory, thereby reducing the resources on the machine available to run anything else. Second, the exact same database software must be installed on each virtual machine, with each redundant copy consuming additional storage and memory. Third, the instruction cache on the server is inefficiently utilized, being filled with identical instructions from the different copies of the database server running in different virtual machines. This inefficient use of storage, memory, and cache resources leads to a much larger hardware footprint than necessary, and the extra hardware and labor costs are ultimately passed down to the end user.

    To summarize the above points:
    (1) Allowing multiple database users to share the same physical hardware (multi-tenancy) helps optimize resource utilization in the cloud, and therefore reduces costs.
    (2) Secure multi-tenancy can be easily achieved by giving each user a separate virtual machine and mapping multiple virtual machines to the same physical machine.
    (3) When the different virtual machines are all running the same OS and database software, the virtual machine approach results in inefficient redundancy.
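
    The cost of that redundancy can be made concrete with some back-of-the-envelope arithmetic. The sketch below is only an illustration: the function names and every figure in it (10 tenants, 2 GB per operating system image, 4 GB per database install, 8 GB of working data per tenant) are hypothetical and are not measurements of any real system.

```python
def vm_per_tenant_footprint_gb(tenants, os_gb, db_gb, data_gb):
    """Memory used when every tenant gets its own VM, OS, and database install."""
    return tenants * (os_gb + db_gb + data_gb)


def shared_instance_footprint_gb(tenants, os_gb, db_gb, data_gb):
    """Memory used when all tenants share one OS and one database instance."""
    return os_gb + db_gb + tenants * data_gb


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
per_vm = vm_per_tenant_footprint_gb(tenants=10, os_gb=2, db_gb=4, data_gb=8)
shared = shared_instance_footprint_gb(tenants=10, os_gb=2, db_gb=4, data_gb=8)

print(per_vm, shared, per_vm - shared)  # 140 86 54
```

    Under these assumed numbers, the per-tenant data is the same in both setups; the entire difference comes from the nine redundant copies of the operating system and database software.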

    If all tenants of a multi-tenant system are using the same software, it is far more efficient to install a single instance of that software, and allow all tenants to share it. However, a major concern with this approach is security. In a database system, it is totally unacceptable for different tenants to have access to each other’s data. Even metadata should not be visible across tenants — they should not see each other’s table names, schema information, and data profiles. All data and metadata associated with other tenants should be totally invisible.

    Ideally, even the performance of database queries, transactions, and other types of requests should not be harmed by the requests of other tenants. For example, if tenant A is running a long and resource-intensive query, tenant B should not see its own concurrent requests slow down.

    Unfortunately, for most database systems, when different users share the same instance of the database system, the system does not provide any built-in mechanism to keep the users isolated from each other — either from a security or performance perspective.

    In contrast, with Teradata’s recent announcement of its secure zones feature, Teradata can now achieve multi-tenancy that is both secure and performance-isolated. In particular, each tenant can now be located within its own secure zone. Each zone has its own separate set of users that can only access database objects within that zone. The view that a user has of the database is completely local to the zone in which that user is defined. Even the database metadata (data dictionary) is local to the zone: user queries of metadata only return results for the metadata associated with the zone where the user is defined. Users are not even able to explicitly grant permissions to view database objects of their zone to users of a different zone. Each zone is 100% isolated from the other secure zones (1).

    Figure 1: Secure zones contain database users, tables, profiles, and views
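
    The isolation rules described above can be modeled in a few lines of Python. This is a toy sketch, not Teradata's implementation or API: the class `ZonedCatalog`, the exception `ZoneError`, and the zone, user, and table names are all hypothetical. It shows three properties from the text: users belong to exactly one zone, metadata queries only return objects from the caller's zone, and grants across zone boundaries are refused.

```python
class ZoneError(Exception):
    """Raised when an operation would cross a zone boundary."""


class ZonedCatalog:
    """Toy catalog in which every user and table belongs to exactly one zone."""

    def __init__(self):
        self.user_zone = {}   # user name -> zone name
        self.tables = {}      # (zone name, table name) -> set of readers

    def create_user(self, zone, user):
        self.user_zone[user] = zone

    def create_table(self, owner, table):
        # A table is always created in its owner's zone.
        self.tables[(self.user_zone[owner], table)] = {owner}

    def list_tables(self, user):
        # Metadata lookup: only the caller's own zone is ever visible.
        zone = self.user_zone[user]
        return sorted(name for (z, name) in self.tables if z == zone)

    def grant_read(self, owner, table, grantee):
        zone = self.user_zone[owner]
        if self.user_zone[grantee] != zone:
            raise ZoneError(f"{grantee} is outside zone {zone!r}")
        self.tables[(zone, table)].add(grantee)

    def can_read(self, user, table):
        readers = self.tables.get((self.user_zone[user], table), set())
        return user in readers


catalog = ZonedCatalog()
catalog.create_user("retail", "alice")
catalog.create_user("retail", "carol")
catalog.create_user("insurance", "bob")
catalog.create_table("alice", "orders")
catalog.create_table("bob", "claims")

print(catalog.list_tables("alice"))  # ['orders']
print(catalog.list_tables("bob"))    # ['claims']

catalog.grant_read("alice", "orders", "carol")    # same zone: allowed
try:
    catalog.grant_read("alice", "orders", "bob")  # different zone: rejected
except ZoneError as err:
    print("rejected:", err)
```

    Keying every table by its zone means a lookup from one zone has no way to name an object in another, which is the property the metadata isolation relies on.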

    A key design theme in Teradata’s secure zones feature is the separation of administrative duties from access privileges. For example, in order to create a new tenant, there needs to be a way to create a new secure zone for that tenant. Theoretically, the most straightforward mechanism for accomplishing this would be via a super user analogous to the Linux super user / root that has access to the entire system and can create new users and data on the system at will. This super user could then add and remove new secure zones, create users for those zones, and access data within those zones.

    Unfortunately, this straightforward super user solution is fundamentally antithetical to the general secure zones goal of isolating zones from each other, since the zone boundaries have no effect on the super user. In fact, the presence of a super user would violate regulatory compliance requirements in certain multi-tenant application scenarios.

    Therefore, Teradata’s secure zones feature includes the concept of a zone administrator: a special type of user that can perform high-level zone administration duties, but has no discretionary access rights on any objects or data within a zone. For example, the zone administrator has the power to create and drop zones, and to grant limited access to the zone for specific types of users. Furthermore, the zone administrator determines the root object of a zone. However, the zone administrator cannot read or write that root object, nor any of its descendants.

    Analogous to a zone administrator is a special kind of user called a DBA user. Just as a zone administrator can perform administrative zone management tasks without discretionary access rights in the zones that it manages, a DBA user can perform administrative tasks for a particular zone without super user discretionary access rights in that zone. In particular, DBA users only receive DDL and DCL rights within a zone, along with the power to create and drop users and objects. However, they must be directly assigned DML rights for any objects within a zone that they do not own in order to be able to access them. Thus, if every zone in a Teradata system is managed by a DBA user, then the resulting configuration has complete separation of administrative duties from access privileges — the zone administrator and DBA users perform the administration without any automatic discretionary access rights on the objects in the system.
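
    The separation of administrative duties from access privileges can be sketched as a small permission model. The code below is a simplified illustration under stated assumptions, not Teradata's actual privilege system: the role strings, the `User` and `Table` classes, and the three check functions are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    role: str  # "zone_admin", "dba", or "regular" (hypothetical labels)
    dml_grants: set = field(default_factory=set)  # tables with explicit DML rights


@dataclass
class Table:
    name: str
    owner: str


def can_manage_zones(user):
    """Only the zone administrator may create or drop zones."""
    return user.role == "zone_admin"


def can_run_ddl(user):
    """A DBA user receives DDL and DCL rights within its zone."""
    return user.role == "dba"


def can_access_data(user, table):
    """Data access needs ownership or an explicit DML grant, never a role alone."""
    if user.role == "zone_admin":
        return False  # administrative power carries no access to data
    return table.owner == user.name or table.name in user.dml_grants


zone_admin = User("zadmin", "zone_admin")
dba = User("dba1", "dba")
orders = Table("orders", owner="alice")

print(can_manage_zones(zone_admin), can_access_data(zone_admin, orders))  # True False
print(can_run_ddl(dba), can_access_data(dba, orders))                     # True False

dba.dml_grants.add("orders")         # DML must be assigned directly
print(can_access_data(dba, orders))  # True
```

    The point of the design is visible in `can_access_data`: no role grants data access by itself, so administration and access stay separate unless someone explicitly joins them.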

    The immediate use case for secure zones is Teradata’s new software-defined warehouse, which is essentially a Teradata private cloud within an organization. Teradata’s software-defined warehouse enables multi-tenant clouds and business-to-business services. It consists of a single Teradata system that is able to serve multiple Teradata database instances. If the organization develops a new application that can be served from a Teradata database, instead of acquiring the hardware and software package that composes a new Teradata system, the organization can serve this application from the software-defined warehouse. Multiple existing Teradata database instances can also be consolidated into the software-defined warehouse.

    Figure 2: Software-Defined Warehouse Workloads

    The software-defined warehouse is currently intended for use cases where all applications / database instances that it is managing belong to the same organization. Nonetheless, in many cases, different parts of an organization are not allowed access to data for other parts of that organization. This is especially true for multinationals, franchises, and conglomerates with multiple subsidiaries. Access to subsidiary data must be tightly controlled and restricted to users of the subsidiary or citizens of a specific country. Therefore, each database instance that the software-defined warehouse is managing exists within a secure zone.

    In addition to secure zones, the other major Teradata feature that makes efficient multi-tenancy possible is Teradata workload management. Without workload management, it is possible for system resources to get hogged by a single tenant that is running a resource-intensive task, while other tenants sharing the same hardware see significantly increased latencies and overall degraded performance. For the multiple virtual-machine implementation of the cloud mentioned above, the hypervisor implements coarse-grained workload management, ensuring that each virtual machine gets a guaranteed amount of important system resources such as CPU and memory. Teradata’s virtual partitions work the same way: the system resources are divided up so that each partition is guaranteed a fixed amount of system resources. By placing each Teradata instance inside its own virtual partition, the Teradata workload manager can thus ensure that the database utilization of one instance does not affect the observed performance of other instances. Teradata workload manager also has filters, throttles, and Linux-level priority scheduling, which can provide finer-grained runtime control.
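
    The idea of a guaranteed, fixed share per partition can be illustrated with a minimal allocation function. This is a deliberately simplified sketch and not a description of how the Teradata workload manager is implemented: the function name `allocate`, the tenant names, and the percentages are hypothetical, and the model assumes each partition is simply capped at its guaranteed share, without redistributing idle capacity.

```python
def allocate(guaranteed, demand):
    """Return the CPU percentage each partition actually receives.

    guaranteed: partition name -> guaranteed share of the system (percent)
    demand:     partition name -> share the partition would like to use (percent)
    """
    if sum(guaranteed.values()) > 100:
        raise ValueError("guaranteed shares exceed 100% of the system")
    # Simplifying assumption: a partition never receives more than its share,
    # so a busy tenant cannot eat into the resources reserved for the others.
    return {name: min(demand.get(name, 0), share)
            for name, share in guaranteed.items()}


guaranteed = {"tenant_a": 50, "tenant_b": 30, "tenant_c": 20}
demand = {"tenant_a": 90, "tenant_b": 10, "tenant_c": 20}

print(allocate(guaranteed, demand))
# {'tenant_a': 50, 'tenant_b': 10, 'tenant_c': 20}
```

    In this example tenant A asks for far more than its share and is held to 50%, while tenants B and C receive everything they asked for regardless of what tenant A is doing.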

    When you combine Teradata secure zones and Teradata workload management, you end up with a cloud-like environment, where multiple Teradata databases can be served from a single system. Additional database instances can be created on demand, backed by this same system, without having to wait for procurement of an additional Teradata system. Moreover, this mechanism of cloudifying Teradata is much more efficient than installing the Teradata database software in multiple virtual machines, since all instances are served from a single version of the Teradata codebase, without redundant operating system and database system installations.

    Since I am not a full-time employee of Teradata and have not been briefed on future plans for Teradata in the cloud, I can only speculate about the next steps for Teradata’s plans in this area. Obviously, Teradata’s main focus for secure zones and virtual partitions has been the software-defined warehouse, so that organizations can implement a private cloud or consolidate multiple Teradata instances onto a single system. However, I do not see any fundamental limitations to prevent Teradata from leveraging these technologies in order to build a public Teradata cloud, where Teradata instances from different organizations share the same physical hardware, just like VMs from different organizations share the same hardware in Amazon’s cloud. Whether or not Teradata chooses to go in this direction is likely a business decision that they will have to make, but it’s interesting to see that with secure zones and workload management, they already have the major technological components to proceed in this direction and build a highly efficient database-as-a-service offering.

    (1)  There is a special type of user called a zone guest, which is not associated with any zone, and can have guest access to objects in multiple zones. This special type of user is outside the scope of this post.


    Daniel Abadi is an Associate Professor at Yale University, founder of Hadapt, and a Teradata employee following the recent acquisition. He does research primarily in database system architecture and implementation. He received a Ph.D. from MIT and an M.Phil. from Cambridge. He is best known for his research in column-store database systems (the C-Store project, which was commercialized by Vertica), high-performance transactional systems (the H-Store project, commercialized by VoltDB), and Hadapt (acquired by Teradata). http://twitter.com/#!/daniel_abadi

     

     

    The post Teradata’s Secure Path to the Cloud appeared first on Data Points.

    Teradata Blogs Feed

     
  • admin 9:53 am on February 12, 2015 Permalink
    Tags: Path, Speeding

    Siemens Speeding Down the Path of a Successful Future 


    Teradata Videos

     