Teradata basic mod03

合集下载

Teradata RDBMS教程说明书

About the T utorialTeradata is a popular Relational Database Management System (RDBMS) suitable for large data warehousing applications. It is capable of handling large volumes of data and is highly scalable. This tutorial provides a good understanding of Teradata Architecture, various SQL commands, Indexing concepts and Utilities to import/export data.AudienceThis tutorial is designed for software professionals who are willing to learn Teradata concepts and become a Teradata developer. By the end of this tutorial, you will have gained intermediate level of expertise in Teradata.PrerequisitesYou should have a basic understanding of Relational concepts and basic SQL. It will be good if you have worked with any other RDBMS product.Copyright & DisclaimerCopyright 2018 by Tutorials Point (I) Pvt. Ltd.All the content and graphics published in this e-book are the property of Tutorials Point (I) Pvt. Ltd. The user of this e-book is prohibited to reuse, retain, copy, distribute or republish any contents or a part of contents of this e-book in any manner without written consent of the publisher.We strive to update the contents of our website and tutorials as timely and as precisely as possible, however, the contents may contain inaccuracies or errors. Tutorials Point (I) Pvt. Ltd. provides no guarantee regarding the accuracy, timeliness or completeness of our website or its contents including this tutorial. If you discover any errors on our website or inthistutorial,******************************************T able of ContentsAbout the Tutorial (i)Audience (i)Prerequisites (i)Copyright & Disclaimer (i)Table of Contents (ii)PART 1: TERADATA BASICS (1)1.Teradata - Introduction (2)What is Teradata? (2)History of Teradata (2)Features of Teradata (2)2.Teradata – Installation (4)Installation Steps for Windows (4)Starting BTEQ (8)3.Teradata – Architecture (9)Components of Teradata (9)Storage Architecture (10)Retrieval Architecture (11)4.Teradata – Relational Concepts (12)5.Teradata – Data Types (14)6.Teradata – Tables (16)Table Types (16)Create Table (16)Alter Table (18)Drop Table (19)7.Teradata – Data Manipulation (20)Insert Records (20)Insert from Another Table (21)Update Records (22)Delete Records (23)8.Teradata – SELECT Statement (24)WHERE Clause (25)ORDER BY (25)GROUP BY (26)9.Teradata – Logical & Conditional Operators (27)BETWEEN (28)IN (28)NOT IN (29)10.Teradata – SET Operators (30)UNION (30)UNION ALL (31)MINUS/EXCEPT (33)11.Teradata – String Manipulation (35)12.Teradata – Date/Time Functions (36)Date Storage (36)EXTRACT (36)INTERVAL (37)13.Teradata – Built-in Functions (39)14.Teradata – Aggregate Functions (40)15.Teradata – CASE & COALESCE (42)CASE Expression (42)COALESCE (43)NULLIF (44)16.Teradata – Primary Index (45)Unique Primary Index (UPI) (45)Non Unique Primary Index (NUPI) (46)17.Teradata – Joins (47)INNER JOIN (47)OUTER JOIN (48)CROSS JOIN (50)18.Teradata – SubQueries (51)PART 2: TERADATA ADVANCED (53)19.Teradata – Table Types (54)Derived Table (54)Volatile Table (55)Global Temporary Table (55)20.Teradata – Space Concepts (57)Permanent Space (57)Spool Space (57)Temp Space (57)21.Teradata – Secondary Index (58)Unique Secondary Index (USI) (58)Non Unique Secondary Index (NUSI) (58)22.Teradata – Statistics (59)Collecting Statistics (59)Viewing Statistics (60)23.Teradata – Compression (61)24.Teradata – EXPLAIN (62)Full Table Scan (FTS) (62)Unique Primary Index (63)Unique Secondary Index (63)Additional Terms (64)25.Teradata – Hashing Algorithm (65)26.Teradata – JOIN INDEX (67)Single Table Join Index (67)Multi Table Join Index (69)Aggregate Join Index (69)27.Teradata – Views (71)Create a View (71)Using Views (72)Modifying Views (72)Drop View (73)28.Teradata – Macros (74)Create Macros (74)Executing Macros (75)Parameterized Macros (76)Executing Parameterized Macros (76)29.Teradata – Stored Procedure (77)Creating Procedure (77)Executing Procedures (78)30.Teradata – JOIN Strategies (80)Join Methods (80)Merge Join (80)Nested Join (82)Product Join (82)31.Teradata – Partitioned Primary Index (83)32.Teradata – OLAP Functions (86)33.Teradata – Data Protection (89)Transient Journal (89)Fallback (89)Down AMP Recovery Journal (90)Cliques (90)Hot Standby Node (90)RAID (91)34.Teradata – User Management (92)Users (92)Accounts (93)Grant Privileges (93)Revoke Privileges (94)35.Teradata – Performance Tuning (95)36.Teradata – FastLoad (97)How FastLoad Works (97)Executing a FastLoad Script (98)FastLoad Terms (99)37.Teradata – MultiLoad (100)Limitation (100)How MultiLoad Works (100)Executing a MultiLoad Script (102)38.Teradata – FastExport (103)Executing a FastExport Script (104)FastExport Terms (104)39.Teradata – BTEQ (105)40.Teradata – Questions & Answers (108)Part 1: Teradata Basics1.TeradataWhat is T eradata?Teradata is one of the popular Relational Database Management System. It is mainly suitable for building large scale data warehousing applications. Teradata achieves this by the concept of parallelism. It is developed by the company called Teradata.History of T eradataFollowing is a quick summary of the history of Teradata, listing major milestones.1979– Teradata was incorporated1984– Release of first database computer DBC/10121986–Fortune magazine names Teradata as ‘Product of the Year’1999– Largest database in the world using Teradata with 130 Terabytes2002– Teradata V2R5 released with Partition Primary Index and compression2006– Launch of Teradata Master Data Management solution2008– Teradata 13.0 released with Active Data Warehousing2011– Acquires Teradata Aster and enters into Advanced Analytics Space2012– Teradata 14.0 introduced2014– Teradata 15.0 introducedFeatures of T eradataFollowing are some of the features of Teradata:●Unlimited Parallelism: Teradata database system is based on Massively ParallelProcessing (MPP) Architecture. MPP architecture divides the workload evenly across the entire system. Teradata system splits the task among its processes and runs them in parallel to ensure that the task is completed quickly.●Shared Nothing Architecture:Teradata’s architecture is called as SharedNothing Architecture. Teradata Nodes, its Access Module Processors (AMPs) and the disks associated with AMPs work independently. They are not shared with others.●Linear Scalability:Teradata systems are highly scalable. They can scale up to2048 Nodes. For example, you can double the capacity of the system by doubling the number of AMPs.●Connectivity:Teradata can connect to Channel-attached systems such asMainframe or Network-attached systems.Teradata ●Mature Optimizer:Teradata optimizer is one of the matured optimizer in themarket. It has been designed to be parallel since its beginning. It has been refined for each release.●SQL: Teradata supports industry standard SQL to interact with the data stored intables. In addition to this, it provides its own extension.●Robust Utilities: Teradata provides robust utilities to import/export data from/toTeradata system such as FastLoad, MultiLoad, FastExport and TPT.●Automatic Distribution: Teradata automatically distributes the data evenly to thedisks without any manual intervention.Teradata Teradata provides Teradata express for VMWARE which is a fully operational Teradata virtual machine. It provides up to 1 terabyte of storage. Teradata provides both 40GB and 1TB version of VMware.PrerequisitesSince the VM is 64 bit, your CPU must support 64-bit.Installation Steps for WindowsStep 1: Download the required VM version from the link, /download/database/teradata-express-for-vmware-playerStep 2: Extract the file and specify the target folder.Step 3: Download the VMWare Workstation player from the link, https:///web/vmware/downloads . It is available for both Windows and Linux. Download the VMWARE workstation player for Windows.Step 4: Once the download is complete, install the software.Step 5: After the installation is complete, run the VMWARE client.Step 6: Select 'Open a Virtual Machine'. Navigate through the extracted Teradata VMWare folder and select the file with extension .vmdk.2.Step 7: Teradata VMWare is added to the VMWare client. Select the added Teradata VMware and click ‘Play Virtual Machine’.Step 8: If you get a popup on software updates, you can select ‘Remind Me Later’.Step 9: Enter the user name as root, press tab and enter password as root and again press Enter.Step 10: Once the following screen appears on the desktop, double-click on ‘root’s home’. Then double-click on ‘Genome’s Terminal’. This will open the Shell.Step 11: From the following shell, enter the command /etc/init.d/tpa start. This will start the Teradata server.Starting BTEQBTEQ utility is used to submit SQL queries interactively. Following are the steps to start BTEQ utility.Step 1: Enter the command /sbin/ifconfig and note down the IP address of the VMWare. Step 2: Run the command bteq. At the logon prompt, enter the command.Logon <ipaddress>/dbc,dbc; and enter At the password prompt, enter password as dbc; You can log into Teradata system using BTEQ and run any SQL queries.TeradataTeradata architecture is based on Massively Parallel Processing (MPP) architecture. The major components of Teradata are Parsing Engine, BYNET and Access Module Processors (AMPs). The following diagram shows the high level architecture of a Teradata Node.Components of T eradataThe key components of Teradata are as follows:∙Node: It is the basic unit in Teradata System. Each individual server in a Teradata system is referred as a Node. A node consists of its own operating system, CPU, memory, own copy of Teradata RDBMS software and disk space. A cabinet consists of one or more Nodes.∙Parsing Engine: Parsing Engine is responsible for receiving queries from the client and preparing an efficient execution plan. The responsibilities of parsing engine are:o Receive the SQL query from the client.o Parse the SQL query check for syntax errors.o Check if the user has required privilege against the objects used in the SQLquery.3.o Check if the objects used in the SQL actually exists.o Prepare the execution plan to execute the SQL query and pass it to BYNET.o Receives the results from the AMPs and send to the client.∙Message Passing Layer:Message Passing Layer called as BYNET, is the networking layer in Teradata system. It allows the communication between PE and AMP and also between the nodes. It receives the execution plan from Parsing Engine and sends to AMP. Similarly, it receives the results from the AMPs and sends to Parsing Engine.∙Access Module Processor (AMP): AMPs, called as Virtual Processors (vprocs) are the one that actually stores and retrieves the data. AMPs receive the data and execution plan from Parsing Engine, performs any data type conversion, aggregation, filter, sorting and stores the data in the disks associated with them.Records from the tables are evenly distributed among the AMPs in the system. Each AMP is associated with a set of disks on which data is stored. Only that AMP can read/write data from the disks.Storage ArchitectureWhen the client runs queries to insert records, Parsing engine sends the records to BYNET. BYNET retrieves the records and sends the row to the target AMP. AMP stores these records on its disks. Following diagram shows the storage architecture of Teradata.Retrieval ArchitectureWhen the client runs queries to retrieve records, the Parsing engine sends a request to BYNET. BYNET sends the retrieval request to appropriate AMPs. Then AMPs search their disks in parallel and identify the required records and sends to BYNET. BYNET then sends the records to Parsing Engine which in turn will send to the client. Following is the retrieval architecture of Teradata.End of ebook previewIf you liked what you saw…Buy it from our store @ https://。

Teradata数据仓库产品介绍v1.0

最优性能 = CPU 和 I/O 带宽的平衡
只增加CPU: 没有足够的磁盘保证 CPU得到充分利用
CPU处理能力增强
当前系统
只增加磁盘: 没有足够的 CPU来充分利用这些磁盘
磁盘个数
Teradata 的推荐配置
Balanced Price/Performance Configuration
• 推荐配置需要找到节点CPU和内存、磁盘阵列的I/O（磁盘的多少）之间一种平衡
• 目标就是为系统提供一种性价比合理的配置
CPU Utilization
100%
I/O
Constrained
80%
60%
Balanced System
Super Saturated
40%
20%
Increase in number of drives
I/O = Number Disk Drives and Controllers
F
SOLARIS
ORACLE DBMS
8
10
15
13
10
8
17
14
11.9
4
5 IBM BLADECENTER
B
AIX
DB2
11
9
11
12
10
9
17
14
11.6
5
6 IBM SYSTEM z9
F
ZOS
DB2
11
14
10
8
10
9
17
14
11.5
6
7 HP BLADESYSTEM
B
HP-UX
ORACLE DBMS
数据仓库系统不是简单的主机、存储、数据库的堆积

我的Teradata使用经历

1. Teradata V2R6.0 Installation Instructions12/6/2004注：我是安装在Windows Server 2003下面（据说2000也可以，不过未试过），在XP下面试了好几台机器，都没有安装成功。

另外在Windows Server 2003的机器也不见得全部都能安装成功，若有条件建议重新装一个2003，然后再试安装Teradata （新装的机器我全部都成功，包括虚拟机）。

1.1. U ninstall Teradata Database and BYNET Driver:∙Backup any database tables that you will need later.∙Stop the Teradata database software.∙Click on the “Add/Remove Programs” icon in the Control Panel.∙Select “Teradata BYNET Software”.∙Or select “BYNET Software”.∙Or select “bynet(x86)”.∙Click on the “Change/Remove” button.∙After the BYNET Driver uninstall completes, click on “OK”.∙Don’t restart Microsoft Windows yet.∙(Click on the “Add/Remove Programs” icon in the Control Panel.)∙Select “Teradata Database 5.0 (Demo Version)”.∙Or select “Teradata Database For Microsoft Windows”.∙Click on the “Change/Remove” button.∙Select “All current and non-current versions” and click on “OK”.∙When asked, “Do you want to remove …”, click on “Yes”.∙When asked, “Are you sure …”, click on “Yes”.∙After the Teradata Database un install completes, click on “OK”.∙Restart Microsoft Windows via Start>>Shut Down>>Restart.1.2. I nstall BYNET Driver for V2R6.0:∙Download Demo Version of V2R6.0 BYNET Driver from/tw80demo.cfm.∙Extract the files and execute the bynet(x86).msi program.∙Don’t restart Microsoft Windows yet.注：我安装时老老实实的是重启的。

Teradata数据库配置说明

1．Teradata Client安装说明1、安装程序地址：ftp://10.3.7.141/2、点击Setup.exe，开始运行安装程序3、出现如下安装启动画面：4、选择Custom安装方式（请不要选择Typical）5、选择安装组件请依次选择以下程序：6、点击Next直至完成2．Teradata数据库配置说明1）测试数据库说明2）配置ODBC1、打开odbc数据源管理器、选择系统dsn页点击【添加】按钮2、选择T eradata驱动程序，点击【Finish】按钮3、填DB Source、T eradata Info、Uername、Password等选项点击【ok】按钮完成。

说明：●Data Source：odbc的名称●Teradata Server Info Name(s)：Teradata数据库的IP地址●Username：用户名●Password：密码●Default Database：默认数据库3）配置HOST文件1、打开系统目录－system32－Drivers－etc（如XP系统：C:\Windows\system32\drivers\etc）下的hosts文件2、填写T eradata数据库的CLI接口地址：如127.0.0.1 localhost10.3.7.9 cpcimtcop1说明：第一部分为Teradata的ip地址，后面为任意名字和cop1、cop2的组合3．Teradata客户端配置1） Teradata Administrator配置选择菜单T ools/Options修改General选项，选中用SQL Assistant代替查询窗口选项。

完成OK后可以选择查询按钮进入T eradata SQL Assistant2） Teradata SQL Assistant配置选择菜单T ools/Options修改查询选项卡，选中只提交选中的查询语句选项。

Teradata数据库配置说明

完成OK后可以选择查询按钮进入T eradata SQL Assistant2） Teradata SQL Assistant配置选择菜单T ools/Options修改查询选项卡，选中只提交选中的查询语句选项。

Teradata基础知识（中文）

Teradata基础教程目录第一章数据仓库基本概念 (1)1.1背景介绍 (1)1.2OLTP与OLAP (2)1.3数据仓库系统的查询特点 (3)1.4详细数据与小结数据(D ETAIL D ATA与S UMMARY D ATA) (5)1.5数据仓库与数据集市(D ATA W AREHOUSE与D ATA M ART) (7)1.6T ERADATA的出现 (10)1.7如何衡量数据仓库引擎 (11)1.7.1 TPC-D (12)1.7.2 TPC-H/R (20)1.8NCR可扩展数据仓库方法论与实施框架 (22)1.8.1 NCR可扩展数据仓库方法论 (22)1.8.2 NCR可扩展数据仓库框架 (27)1.8.3 NCR可扩展数据仓库合作伙伴 (29)第二章 TERADATA关系型数据库管理系统概要 (30)2.1T ERADATA数据库的设计思想 (30)2.2T ERADATA数据库的体系结构 (30)2.2.1 Teradata V1/DBC体系结构 (31)2.2.2 Teradata V1/NCR 3600体系结构 (34)2.2.3开放的Teradata V2/SMP体系结构 (40)2.2.4 Teradata V2/MPP体系结构 (45)2.3T ERADATA 的并行处理机制 (51)2.4W INDOWS平台的T ERADATA数据库 (53)2.5T ERADATA多媒体数据库 (54)第三章 TERADATA数据库的数据分配机制 (56)3.1哈希算法、主索引、与数据分配 (56)3.2T ERADATA数据分配示例 (58)3.3主索引与表的创建 (60)3.4哈希冲突与不唯一主索引 (61). I .3.5T ERADATA数据库系统的在线升级 (63)第四章 TERADATA数据库的数据访问机制 (65)4.1基于主索引的数据访问 (65)4.2基于唯一次索引USI的数据访问 (67)4.3基于非唯一次索引NUSI的数据访问 (71)4.4全表扫描 (74)4.5总结 (75)第五章如何选择主索引 (78)5.1T ERADATA数据库中的AMP与PDISK (78)5.2数据记录的分配 (79)5.3选择主索引的基本原则 (83)第六章数据库的空间管理、用户管理、访问权限 (85)6.1T ERADATA中的用户与数据库 (85)6.1.1数据库 (85)6.1.2用户 (88)6.2T ERADATA数据库的层次型结构 (88)6.3拥有者(O WNER)与创建者(C REATOR) (92)6.4T ERADATA数据库的访问权限 (94)6.4.1访问权限概述 (94)6.4.2显示权限 (102)6.4.3监控权限 (103)6.4.4如何检查一个用户或数据库的权限 (103)6.4.5 GRANT命令的操作 (106)6.4.6 REVOKE命令的操作 (107)第七章数据保护与恢复 (109)7.1锁(L OCK) (109)7.2优先权(P RIORITY) (113)7.3交易完整性(T RANSACTION I NTEGRITY) (114)7.4临时流水(T RANSIENT J OURNAL) (115)7.5永久流水(P ERMANENT J OURNAL) (115)7.6F ALL B ACK保护 (116). II .第八章客户端访问TERADATA数据库的方法 (120)8.1概述 (120)8.2T ERADATA数据库的编程接口 (122)8.2.1调用层接口CLI (122)8.2.2嵌入式预处理器 (123)8.2.3 ODBC (123)8.3T ERADATA应用工具 (124)8.3.1 BTEQ (124)8.3.2 FastLoad (125)8.3.3 MultiLoad (126)8.3.4 FastExport (127)8.3.5 TPump (127)第九章使用TERADATA的主要客户分析 (129)9.1零售业 (130)9.2消费品制造与零售业供货商 (131)9.3货运业 (132)9.4客运业 (133)9.5电信业 (134)9.6健康保险业 (135)9.7金融业 (137)9.8共用事业类 (139)9.9其它行业 (140)附录一 CLIENT/SERVER结构下的TERADATA数据库 (153)附录二 TERADATA ODBC驱动程序设置 (155)附录三 QUERYMAN介绍 (158)附录四 WINDDI介绍 (164). III .第一章数据仓库基本概念1.1背景介绍相对许多行业而言，信息处理技术还是一门新兴的技术，但其发展速度却几乎是最快的。

Teradata高级文档

Tera Blog 收藏Teradata SQL调优1.优化过程：依照运行时间，数据量和复杂度来定位瓶颈。

查看sql执行计划，判断其合理性。

性能监控==》目标选取==》性能分析==》过程优化==》运行跟踪（性能监控）注意：每个过程中都会产生必须的文档2.性能分析：? Review PDM --表定义--PI的选择--表的记录数与空间占用? Review SQL --关联的表--逻辑处理复杂度--整体逻辑--多余的处理? 测试运行--响应时间? 查看EXPLAIN --瓶颈定位3.过程优化：? 业务规则理解--合理选取数据访问路径? PDM设计--调整PDM ? SQL写法不优化，忽略了Teradata的机理与特性--调整SQL ? Teradata优化器未得到足够的统计信息--Collect Statistics4.Multiple Insert/select --> Multi-Statement Insert/Select * 并行插入空表不记录Transient Journal * 充分利用Teradata向空表Insert较快以及并行操作的特性如：? 现状INSERT INTO ${TARGETDB}.DES (Party_Id ,Party_Name ... ) SELECT …FROM SRC1 ; INSERT INTO ${TARGETDB}.DES (Party_Id ,Party_Name ... ) SELECT …FROM SRC2 ; INSERT INTO ${TARGETDB}.DES (Party_Id ,Party_Name ... ) SELECT …FROM SRC3 ; 说明：串行执行，多个Transaction ? 优化后：INSERT INTO ${TARGETDB}.DES (Party_Id ,Party_Name ... ) SELECT …FROM SRC1 ;INSERT INTO ${TARGETDB}.DES (Party_Id ,Party_Name ... ) SELECT …FROM SRC2 ;INSERT INTO ${TARGETDB}.DES (Party_Id ,Party_Name ... ) SELECT …FROM SRC3 ; 说明：并行执行，单个Transaction5.Insert/Select with Union/Union all --> Multi-Statement Insert/Select * Union 需要排除重复记录，Union all虽不需要排重，但都需要占用大量的Spool空间，都需要进行重新组织数据如：现状：INSERT INTO ${TARGETDB}.DES (Party_Id ,Party_Name ... ) SELECT …FROM SRC1 ; UNION ALL SELECT …FROM SRC2 ; UNION ALL SELECT …FROM SRC3 ; …调整后: INSERT INTO ${TARGETDB}.DES (Party_Id ,Party_Name ... ) SELECT …FROM SRC1 ;INSERT INTO ${TARGETDB}.T01_DES (Party_Id ,Party_Name ... ) SELECT …FROM SRC2 ;INSERT INTO ${TARGETDB}.T01_DES (Party_Id ,Party_Name ... ) SELECT …FROM SRC3 ;6.排除重复记录* 针对单表内的重复记录使用ROW_ NUMBER函数排重* 排重方式多了一层子查询* 增加了大量的数据重新分布的时间现状：……INSERT INTO ${TARGETDB}.T01_INDIV (Party_Id ,Party_Name ... ) SELECT COALESCE(b1.Party_Id,'-1') , COALESCE(TRIM(b1.Party_name),'') ... FROM ( select party_id party_name, …, ROW_NUMBER() OVER (PARTITION BY Party_Id ORDER BY Party_Name ) as rownum from ${TEMPDB}.T01_INDIV b1 …) AA where AA.rownum ＝1 ……建议做法：INSERT INTO ${TEMPDB}.T01_INDIV …INSERT INTO ${TEMPDB}.T01_INDIV ………INSERT INTO ${TARGETDB}.T01_INDIV (Party_Id ,Party_Name ... ) SELECT party_id party_name, …From ${TEMPDB}.T01_INDIV b1 Qualify ROW_NUMBER() OVER (PARTITION BY Party_Id ORDER BY Party_Name ) = 1 ? 运用Qualify + ROW_ NUMBER函数? SQL语句简洁明了? 避免子查询优化前explain：……4) We do an all-AMPs STAT FUNCTION step from PTEMP.VT_T01_INDIV_cur by way of an all-rows scan with no residual conditions into Spool 5 (Last Use), which is assumed to be redistributed by value to all AMPs. The result rows are put into Spool 3 (all_amps), which is built locally on the AMPs. 5) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan into Spool 1 (all_amps), which is built locally on the AMPs. The result spool file will not be cachedin memory. The size of Spool 1 is estimated with no confidence to be 6,781,130 rows. The estimated time for this step is 16.01 seconds. 6) We do an all-AMPs RETRIEVE step from Spool 1 (Last Use) by way of an all-rows scan with a condition of ("ROWNUMBER = 1") into Spool 8 (all_amps), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 8 by row hash. The result spool file will not be cached in memory. The size of Spool 8 is estimated with no confidence to be 6,781,130 rows. The estimated time for this step is 1 minute. 7) We do an all-AMPs MERGE into PDATA.T01_INDIV from Spool 8 (Last Use). 优化后explain: ……4) We do an all-AMPs STAT FUNCTION step from PTEMP.VT_T01_INDIV_cur by way of an all-rows scan with no residual conditions into Spool 5 (Last Use), which is assumed to be redistributed by value to all AMPs. The result rows are put into Spool 3 (all_amps), which is built locally on the AMPs. 5) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan with a condition of ("Field_10 = 1") into Spool 1 (all_amps), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 1 by row hash. The result spool file will not be cached in memory. The size of Spool 1 is estimated with no confidence to be 6,781,130 rows. The estimated time for this step is 1 minute. 6) We do an all-AMPs MERGE into PDATA.T01_INDIV from Spool 1 (Last Use).BTEQ中不能用length函数LENGTH()不是Teradata 的标准函数，但是Teradata SQL Assitant 支持它。

003TWM 简介

Teradata Warehouse Miner 简介
TWM支持数据挖掘的全过程
统计和分析数据集
• 数据控索 • 数据转换
分析建模
• 多元统计分析 •机器学习算法
模型部署
• 打分和评估 •模型生命周期管理
通过TWM的in-database挖掘，高效实现反复迭代的建模和应用过程
数据仓库汇叫, 连接, 排序,转换

• Scoring Techniques

Linear Regression Model Coefficients and Statistics Step-Wise Linear Regression Factor Analysis PCA, PAF, MLF Orthogonal/Oblique Rotations Logistic Regression Model Coefficients and Statistics Step-Wise Logistic Regression Decision Tree/Rule Induction Success and Lift Tables Entropy (i.e. C4.5/C5.0) Gini/Regression (i.e. CART) Chaid Trees Clustering K-Means/Nearest Neighbor Expectation – Maximization Association/Sequence Analysis Support/Confidence/Lift/Z-Score Exported SQL/SP PMML Consumer Analysis References Publishing Models/ADS

Teradata数据库介绍

SMP 体系架构
Multi-Node MPP
NCR Rack-Based Cabinets
NCR MPP系统的一些特性
Teradata数据库软件:它允许多个SMP运行在Teradata数据库上，并扮演单个实例角色．
可升级的BYNET连接:当增加节点时，相应的增加了带宽．并行可升级性：通过安装/升级多个SMPs 实现软件的并行安装和升级． AWS（Administration Workstation） :单点操作控制及升级服务管理． SMP：SMP只需要负责管理各自资源还有一些冗余的组件：两个BYNET,在一个磁盘组中有两个磁盘控制器,又模
Network-Attached client software Overview
CLI提供对Teradata最大限度的连接和访问性，ODBC作为业内标准是更多的应用程序可以连接到Teradata
Micro Teradata Director Program (MTDP) 是Teradata 提供的网络连接环境下的TDP实现，它和渠道连接下的TDP功能基本一致，唯一的区别是它不负责session在多个PEs之间的分配，此功能由运行在Teradata系统上的Connect and Assign Servers 实现
据集成的企业范围的数据库，保证数据的一致性高可用性并行装载及卸数处理
强大的并行装载，load与unload工具可升级性，这些工具如：Fastload、 Multiload、TPump、and FastExport
主题
What is Teradata? Teradata数据库竞争优势 Teradata RDBMS 架构 Teradata 系统架构 Teradata 数据库与数据库用户的比较数据存储和访问

Teradata数据仓库简介讲义教材

Teradata数据库简介
Teradata数据仓库事业部华南区
Teradata Confidential
Agenda
关于TERADATA Teradata数据库原理
Teradata数据库架构 Teradata数据库工作原理 Teradata特性
Teradata数据仓库构建
基本概念常用工具介绍管理的一些约定
= 103 = 1000 bytes = 106 = 1,000,000 bytes = 109 = 1,000,000,000 bytes = 1012 = 1,000,000,000,000 bytes = 1015 = 1,000,000,000,000,000 bytes = 1018 = 1,000,000,000,000,000,000 bytes = 1021 = 1,000,000,000,000,000,000,000 bytes = 1024 = 1,000,000,000,000,000,000,000,000 bytes
自
AAMgPgr4eg的at数ing据
己
Building Indexes
Row LockinAgMP3的数据TJroaunrsnaaclitziionng
的 Loading 数据
AMP2的数据 AMP1的数据
Backup & Recovery
并行处理性能
其他关系数据库
“有条件的并行”
初始查询查询优化查询并行
• Network Distribution
V-AMP V-AMP V-AMP V-AMP
• Access Module Processors (AMP)
• Disk Partitions

1、下载文档前请自行甄别文档内容的完整性，平台不提供额外的编辑、内容补充、找答案等附加服务。
2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
3、如文档侵犯您的权益，请联系客服反馈,我们会尽快为您处理(人工客服工作时间：9:00-18:30)。

Module 3 Relational Database Concepts
Relational Database Concepts
After completing this module, you should be able to: • • • • Define the terms associated with relational databases. Discuss the function of the Primary Key. Discuss the function of Foreign Keys. List the advantages of a relational database.
Foreign Key
EMPLOYEE (partial listing)
MANAGER EMPLOYEE EMPLOYEE DEPARTMENT NUMBER NUMBER NUMBER PK 1006 1008 1005 1004 1007 1003 FK 1019 1019 0801 1003 0801 FK 301 301 403 401 401 JOB LAST CODE NAME FK 312101 312102 431100 412101 411100 Stein Kanieski Ryan Johnson Villegas Trader John Carol Loretta Darlene Arnando James 861015 870201 861015 861015 870102 860731 631015 680517 650910 560423 470131 570619 3945000 3925000 4120000 4630000 5970000 4785000 FIRST NAME HIRE DATE BIRTH DATE SALARY AMOUNT
DEPARTMENT DEPARTMENT NUMBER NAME PK 501 301 302 403 402 401 201 marketing sales research and development product planning education software support customer support technical operations
MANAGER BUDGET EMPLOYEE AMOUNT NUMBER
80050000 46560000 22600000 93200000 30800000 98230000 29380000 FK 1017 1019 1016 1005 1011 1003 1025
•
• •
Missing (null) FK values may be allowed.
Column
EMPLOYEE MANAGER EMPLOYEE EMPLOYEE DEPARTMENT NUMBER NUMBER NUMBER 1006 1008 1005 1004 1007 1003 1019 1019 0801 1003 0801 301 301 403 401 401 JOB LAST CODE NAME 312101 312102 431100 412101 411100 Stein Kanieski Ryan Johnson Villegas Trader FIRST NAME John Carol Loretta Darlene Arnando James HIRE DATE 861015 870201 861015 861015 870102 860731 BIRTH DATE 631015 680517 650910 560423 470131 570619 SALARY AMOUNT 3945000 3925000 4120000 4630000 5970000 4785000
Primary Key
Primary Key (PK) values uniquely identify each row in a table.
EMPLOYEE
MANAGER EMPLOYEE EMPLOYEE DEPARTMENT NUMBER NUMBER NUMBER PK 1006 1008 1005 1004 1007 1003 1019 1019 0801 1003 0801 301 301 403 401 401 312101 312102 431100 412101 411100 Stein Kanieski Ryan Johnson Villegas Trader John Carol Loretta Darlene Arnando James 861015 870201 861015 861015 870102 860731 631015 680517 650910 560423 470131 570619 3945000 3925000 4120000 4630000 5970000 4785000 JOB LAST CODE NAME FIRST NAME HIRE DATE BIRTH DATE SALARY AMOUNT
Relational Advantages
Advantages of a Relational Database compared to other database methodologies include: • More flexible than other types • Allowing businesses to quickly respond to changing conditions • Being data-driven vs. application driven • Modeling the business, not the processes • Makes applications easier to build because the data does more of the work • Being easy to understand • Supporting trend toward end-user computing • DBA’s and Users don’t need to know the access path • Solidly founded in Set Theory
1. Name the department in which James Trader works. 2. Who manages the Education Department? 3. Identify by name an employee who works for James Trader. 4. James Trader manages which department?
What Is a Teradata Database?
Database—A collection of permanently stored data that is:
> > > >
Logically related—data relates to other data. Shared—many users may access data. Protected—access to data is controlled. Managed—data has integrity and value.
Logical/Relational Modeling
The Logical Model: • Should be designed without regard to usage • accommodates a wide variety of front end tools • allows the database to be created more quickly
Row
The employee table has: > Nine columns of data > Six rows of data—one per employee > No prescribed order for the rows of the table > Only one row format for the entire table > Missing data values represented by nulls
IBM/MVS
Teradata DATABASE
UNIX
Win98
WinNT
W2K
WinXP
Relational Databases
• • A Relational Database consists of a set of logically related tables. A table is a two dimensional representation of data consisting of rows and columns.
• Should be the same regardless of data volume
Teradata Supports Fully Normalized Logical Models: • ability to perform 64 table joins • ability to perform large aggregations during a query
Primary Key Rules
> > > > > >
A Primary Key is required for every table. Only one Primary Key is allowed in a table. Primary Keys may consist of one or more columns. Primary Keys cannot have duplicate values. Primary Keys cannot be null. Primary Keys are considered “non-changing” values.