数据仓库与数据挖掘教程(第2版)课后习题答案 第四章

合集下载

数据仓库与数据挖掘考试习题汇总 3

数据仓库与数据挖掘考试习题汇总 3

1、数据仓库就是一个面向主题的、集成的、相对稳定的、反映历史变化的数据集合。

2、元数据是描述数据仓库内数据的结构和建立方法的数据,它为访问数据仓库提供了一个信息目录,根据数据用途的不同可将数据仓库的元数据分为技术元数据和业务元数据两类。

3、数据处理通常分成两大类:联机事务处理和联机分析处理。

4、多维分析是指以“维”形式组织起来的数据(多维数据集)采取切片、切块、钻取和旋转等各种分析动作,以求剖析数据,使拥护能从不同角度、不同侧面观察数据仓库中的数据,从而深入理解多维数据集中的信息。

5、ROLAP是基于关系数据库的OLAP实现,而MOLAP是基于多维数据结构组织的OLAP实现。

6、数据仓库按照其开发过程,其关键环节包括数据抽取、数据存储于管理和数据表现等。

7、数据仓库系统的体系结构根据应用需求的不同,可以分为以下4种类型:两层架构、独立型数据集合、以来型数据结合和操作型数据存储和逻辑型数据集中和实时数据仓库。

8、操作型数据存储实际上是一个集成的、面向主题的、可更新的、当前值的(但是可“挥发”的)、企业级的、详细的数据库,也叫运营数据存储。

9、“实时数据仓库”以为着源数据系统、决策支持服务和仓库仓库之间以一个接近实时的速度交换数据和业务规则。

10、从应用的角度看,数据仓库的发展演变可以归纳为5个阶段:以报表为主、以分析为主、以预测模型为主、以运营导向为主和以实时数据仓库和自动决策为主。

1、调和数据是存储在企业级数据仓库和操作型数据存储中的数据。

2、抽取、转换、加载过程的目的是为决策支持应用提供一个单一的、权威数据源。

因此,我们要求ETL过程产生的数据(即调和数据层)是详细的、历史的、规范的、可理解的、即时的和质量可控制的。

3、数据抽取的两个常见类型是静态抽取和增量抽取。

静态抽取用于最初填充数据仓库,增量抽取用于进行数据仓库的维护。

4、粒度是对数据仓库中数据的综合程度高低的一个衡量。

粒度越小,细节程度越高,综合程度越低,回答查询的种类越多。

数据挖掘作业集答案

数据挖掘作业集答案

数据挖掘作业集答案《数据挖掘》作业集答案第一章引言一、填空题(1)数据清理,数据集成,数据选择,数据变换,数据挖掘,模式评估,知识表示(2)算法的效率、可扩展性和并行处理(3)统计学、数据库技术和机器学习(4)WEB挖掘(5)一些与数据的一般行为或模型不一致的孤立数据二、单选题(1)B;(2)D;(3)D;(4)B;(5)A;(6)B;(7)C;(8)E;三、简答题(1)什么是数据挖掘?答:数据挖掘指的是从大量的数据中挖掘出那些令人感兴趣的、有用的、隐含的、先前未知的和可能有用的模式或知识。

(2)一个典型的数据挖掘系统应该包括哪些组成部分?答:一个典型的数据挖掘系统应该包括以下部分:数据库、数据仓库或其他信息库数据库或数据仓库服务器知识库数据挖掘引擎模式评估模块图形用户界面(3)请简述不同历史时代数据库技术的演化。

答:1960年代和以前:研究文件系统。

1970年代:出现层次数据库和网状数据库。

1980年代早期:关系数据模型, 关系数据库管理系统(RDBMS)的实现1980年代后期:出现各种高级数据库系统(如:扩展的关系数据库、面向对象数据库等等)以及面向应用的数据库系统(空间数据库,时序数据库,多媒体数据库等等。

1990年代:研究的重点转移到数据挖掘, 数据仓库, 多媒体数据库和网络数据库。

2000年代:人们专注于研究流数据管理和挖掘、基于各种应用的数据挖掘、XML 数据库和整合的信息系统。

(4)请列举数据挖掘应用常见的数据源。

(或者说,我们都在什么样的数据上进行数据挖掘)答:常见的数据源包括关系数据库、数据仓库、事务数据库和高级数据库系统和信息库。

其中高级数据库系统和信息库包括:空间数据库、时间数据库和时间序列数据库、流数据、多媒体数据库、面向对象数据库和对象-关系数据库、异种数据库和遗产(legacy)数据库、文本数据库和万维网(WWW)等。

(5)什么是模式兴趣度的客观度量和主观度量?答:客观度量指的是基于所发现模式的结构和关于它们的统计来衡量模式的兴趣度,比如:支持度、置信度等等;主观度量基于用户对数据的判断来衡量模式的兴趣度,比如:出乎意料的、新颖的、可行动的等等。

数据仓库与数据挖掘教程(第2版)课后习题答案 第五章

数据仓库与数据挖掘教程(第2版)课后习题答案 第五章
⑥净化数据
⑦合并查询
11.
数据库是面向事务的设计,数据仓库是面向主题设计的。
数据库一般存储在线交易数据,数据仓库存储的一般是历史数据。
数据库是为捕获数据而设计,数据仓库是为分析数据而设计,它的两个基本的元素是维表和事实表。
12.说明如何利用数据仓库发现问题并找出产生问题的原因
答:主要是通过三个步骤来完成的:概括分析,抽取,建模。
3.
1非规格化
规范化的作用是产生一种完全没有数据冗余的设计方法。
但是,有时在数据仓库设计中引入一些有限的数据冗余来提高数据访问效果。
2创建数据阵列
创建数据阵列,将相关类型的数据(如:1月、2月、3月等月份中的数据)存储在一起,提高访问效果。
3预连接表格
一个公用键和共同使用的数据将表格合并在一起。
共享一个公用键,可以将多个表格合并到一个物理表格中。这样做可以很大程度的提高数据访问效率。
19.决策支持系统(decision support system,简称dss)是辅助决策者通过数据、模型和知识,以人机交互方式进行半结构化或非结构化决策的计算机应用系统。它是管理信息系统(mis)向更高一级发展而产生的先进信息管理系统。它为决策者提供分析问题、建立模型、模拟决策过程和方案的环境,调用各种信息资源和分析工具,帮助决策者提高决策水平和质量。决策支持系统,是以管理科学、运筹学、控制论、和行为科学为基础,以计算机技术、仿真技术和信息技术为手段,针对半结构化的决策问题,支持决策活动的具有智能作用的人机系统。该系统能够为决策者提供所需的数据、信息和背景资料,帮助明确决策目标和进行问题的识别,建立或修改决策模型,提供各种备选方案,并且对各种方案进行评价和俦优选,通过人机交互功能进行分析、比较和判断,为正确的决策提供必要的支持。

(完整版)数据库系统原理与设计(第2版)课后习题详细答案

(完整版)数据库系统原理与设计(第2版)课后习题详细答案

数据库系统原理与设计习题集第一章绪论一、选择题1. DBS是采用了数据库技术的计算机系统,DBS是一个集合体,包含数据库、计算机硬件、软件和()。

A. 系统分析员B. 程序员C. 数据库管理员D. 操作员2. 数据库(DB),数据库系统(DBS)和数据库管理系统(DBMS)之间的关系是()。

A. DBS包括DB和DBMSB. DBMS包括DB和DBSC. DB包括DBS和DBMSD. DBS就是DB,也就是DBMS3. 下面列出的数据库管理技术发展的三个阶段中,没有专门的软件对数据进行管理的是()。

I.人工管理阶段II.文件系统阶段III.数据库阶段A. I 和IIB. 只有IIC. II 和IIID. 只有I4. 下列四项中,不属于数据库系统特点的是()。

A. 数据共享B. 数据完整性C. 数据冗余度高D. 数据独立性高5. 数据库系统的数据独立性体现在()。

A.不会因为数据的变化而影响到应用程序B.不会因为系统数据存储结构与数据逻辑结构的变化而影响应用程序C.不会因为存储策略的变化而影响存储结构D.不会因为某些存储结构的变化而影响其他的存储结构6. 描述数据库全体数据的全局逻辑结构和特性的是()。

A. 模式B. 内模式C. 外模式D. 用户模式7. 要保证数据库的数据独立性,需要修改的是()。

A. 模式与外模式B. 模式与内模式C. 三层之间的两种映射D. 三层模式8. 要保证数据库的逻辑数据独立性,需要修改的是()。

A. 模式与外模式的映射B. 模式与内模式之间的映射C. 模式D. 三层模式9. 用户或应用程序看到的那部分局部逻辑结构和特征的描述是(),它是模式的逻辑子集。

A.模式B. 物理模式C. 子模式D. 内模式10.下述()不是DBA数据库管理员的职责。

A.完整性约束说明B. 定义数据库模式C.数据库安全D. 数据库管理系统设计选择题答案:(1) C (2) A (3) D (4) C (5) B(6) A (7) C (8) A (9) C (10) D二、简答题1.试述数据、数据库、数据库系统、数据库管理系统的概念。

数据结构(第二版)习题谜底第4章[基础]

数据结构(第二版)习题谜底第4章[基础]

数据结构(第二版)习题答案第4章第4章字符串、数组和特殊矩阵4.1稀疏矩阵常用的压缩存储方法有(三元组顺序存储)和(十字链表)两种。

4.2设有一个10 × 10的对称矩阵 A采用压缩方式进行存储,存储时以按行优先的顺序存储其下三角阵,假设其起始元素 a00的地址为 1,每个数据元素占 2个字节,则 a65的地址为( 53 )。

4.3若串S =“software”,其子串的数目为( 36 )。

4.4常对数组进行的两种基本操作为(访问数据元素)和(修改数组元素)。

4.5 要计算一个数组所占空间的大小,必须已知(数组各维数)和(每个元素占用的空间)。

4.6对于半带宽为 b的带状矩阵,它的特点是:对于矩阵元素 aij,若它满足(|i-j|>b),则 aij = 0。

4.7字符串是一种特殊的线性表,其特殊性体现在(该线性表的元素类型为字符)。

4.8试编写一个函数,实现在顺序存储方式下字符串的 strcompare (S1,S2)运算。

【答】:#include <stdio.h>#include <string.h>#define MAXSIZE 100typedef struct{char str[MAXSIZE];int length;}seqstring;/* 函数 strcompare()的功能是:当 s1>s2时返回 1,当 s1==s2时返回 0,当 s1<s2时返回-1*/int strcompare(seqstring s1,seqstring s2){ int i,m=0,len;len=s1.length<s2.length ?s1.length:s2.length;for(i=0;i<=len;i++)if(s1.str[i]>s2.str[i]){m=1;break;}else if(s1.str[i]<s2.str[i]){m=-1;break;}return m;}int main(){ seqstring s1,s2;int i,m;printf("input char to s1:\n");gets(s1.str);s1.length=strlen(s1.str);printf("input char to s2:\n");gets(s2.str);s2.length=strlen(s2.str);m=strcompare(s1,s2);if(m==1) printf("s1>s2\n");else if(m==-1) printf("s2>s1\n");else if(m==0) printf("s1=s2\n");}4.9试编写一个函数,实现在顺序存储方式下字符串的replace(S,T1,T2)运算。

数据库系统教程课后答案(施伯乐)(第二版)

数据库系统教程课后答案(施伯乐)(第二版)

目录第1部分课程的教与学第2部分各章习题解答及自测题第1章数据库概论1.1 基本内容分析1.2 教材中习题1的解答1.3 自测题1.4 自测题答案第2章关系模型和关系运算理论2.1基本内容分析2.2 教材中习题2的解答2.3 自测题2.4 自测题答案第3章关系数据库语言SQL3.1基本内容分析3.2 教材中习题3的解答3.3 自测题3.4 自测题答案第4章关系数据库的规范化设计4.1基本内容分析4.2 教材中习题4的解答4.3 自测题4.4 自测题答案第5章数据库设计与ER模型5.1基本内容分析5.2 教材中习题5的解答5.3 自测题5.4 自测题答案第6章数据库的存储结构6.1基本内容分析6.2 教材中习题6的解答第7章系统实现技术7.1基本内容分析7.2 教材中习题7的解答7.3 自测题7.4 自测题答案第8章对象数据库系统8.1基本内容分析8.2 教材中习题8的解答8.3 自测题8.4 自测题答案第9章分布式数据库系统9.1基本内容分析9.2 教材中习题9的解答9.3 自测题9.4 自测题答案第10章中间件技术10.1基本内容分析10.2 教材中习题10的解答10.3 自测题及答案第11章数据库与WWW11.1基本内容分析11.2 教材中习题11的解答第12章 XML技术12.1基本内容分析12.2 教材中习题12的解答学习推荐书目1.国内出版的数据库教材(1)施伯乐,丁宝康,汪卫. 数据库系统教程(第2版). 北京:高等教育出版社,2003(2)丁宝康,董健全. 数据库实用教程(第2版). 北京:清华大学出版社,2003(3)施伯乐,丁宝康. 数据库技术. 北京:科学出版社,2002(4)王能斌. 数据库系统教程(上、下册). 北京:电子工业出版社,2002(5)闪四清. 数据库系统原理与应用教程. 北京:清华大学出版社,2001(6)萨师煊,王珊. 数据库系统概论(第3版). 北京:高等教育出版社,2000(7)庄成三,洪玫,杨秋辉. 数据库系统原理及其应用. 北京:电子工业出版社,20002.出版的国外数据库教材(中文版或影印版)(1)Silberschatz A,Korth H F,Sudarshan S. 数据库系统概念(第4版). 杨冬青,唐世渭等译. 北京:机械工业出版社,2003(2)Elmasri R A,Navathe S B. 数据库系统基础(第3版). 邵佩英,张坤龙等译. 北京:人民邮电出版社,2002(3)Lewis P M,Bernstein A,Kifer M. Databases and Transaction Processing:An Application-Oriented Approach, Addison-Wesley, 2002(影印版, 北京:高等教育出版社;中文版,施伯乐等译,即将由电子工业出版社出版)(4)Hoffer J A,Prescott M B,McFadden F R. Modern Database Management. 6th ed. Prentice Hall, 2002(中文版,施伯乐等译,即将由电子工业出版社出版)3.上机实习教材(1)廖疆星,张艳钗,肖金星. PowerBuilder 8.0 & SQL Server 2000数据库管理系统管理与实现. 北京:冶金工业出版社,2002(2)伍俊良. PowerBuilder课程设计与系统开发案例. 北京:清华大学出版社,20034.学习指导书(1)丁宝康,董健全,汪卫,曾宇昆. 数据库系统教程习题解答及上机指导. 北京:高等教育出版社,2003(2)丁宝康,张守志,严勇. 数据库技术学习指导书. 北京:科学出版社,2003(3)丁宝康,董健全,曾宇昆. 数据库实用教程习题解答. 北京:清华大学出版社,2003 (4)丁宝康. 数据库原理题典. 长春:吉林大学出版社,2002(5)丁宝康,陈坚,许建军,楼晓鸿. 数据库原理辅导与练习. 北京:经济科学出版社,2001第1部分课程的教与学1.课程性质与设置目的现在,数据库已是信息化社会中信息资源与开发利用的基础,因而数据库是计算机教育的一门重要课程,是高等院校计算机和信息类专业的一门专业基础课。

(完整word版)数据库系统基础教程第四章答案(word文档良心出品)

(完整word版)数据库系统基础教程第四章答案(word文档良心出品)

SolutionsChapter 4 4.1.14.1.2a)b)c)In c we assume that a phone and address can only belong to a single customer (1-m relationship represented by arrow into customer).d)In d we assume that an address can only belong to one customer and a phone canexist at only one address.If the multiplicity of above relationships were m-to-n, the entity set becomesweak and the key ssNo of customers will be needed as part of the composite keyof the entity set.In c&d, we convert attributes phones and addresses to entity sets. Since entitysets often become relations in relational design,we must consider more efficient alternatives.Instead of querying multiple tables where key values are duplicated, we can also modify attributes:(i) Phones attribute can be converted into HomePhone, OfficePhone and CellPhone. (ii) A multivalued attribute such as alias can be kept as an attribute where asingle column can be used in relational design i.e. concatenate all values. SQLallows a query "like '%Junius%'" to search the multiple values in a column alias.4.1.34.1.4a)b)c)The relationship "played" between Teams and Players is similar to relationship "plays" between Teams and Players.4.1.54.1.6 The information about children can be ascertained from motherOf and fatherOf relationships. Attribute ssNo is required since names are not unique.4.1.74.1.8a)(b)4.1.9AssumptionsA Professor only works in at most one department.A course has at most one TA.A course is only taught by one professor and offered by one department.Students and professors have been assigned unique email ids.A course is uniquely identified by the course no, section no, and semester (e.g. cs157-3 spring 09).4.1.10Given that for each movie, a unique studio exists that produces the movie. Each star is contracted to at most one studio.But stars could be unemployed at a given time. Thus the four-way relationship in fig 4.6 can be easily into converted equivalent relationships.4.2.1Redundancy: The owner address is repeated in AccSets and Addresses entity sets. Simplicity: AccSets does not serve any useful purpose and the design can be more simply represented by creating many-to-many relationship between Customers and Accounts.Right kind of element: The entity set Addresses has a single attribute address.A customer cannot have more than one address.Hence address should be an attribute of entity set Customers.Faithfulness: Customers cannot be uniquely identified by their names. In real world Customers would have a unique attribute such as ssNo or customerNo4.2.2Studios and Presidents can be combined into one entity set Studios withPresidents becoming an attribute of Studios under following circumstances:1. The Presidents entity set only contains a simple attribute viz. presidentName. Additional attributes specific to Presidents might justify making Presidentsinto an entity set.4.2.34.2.4 The entity sets should have single attribute.a) Stars: starNameb) Movies: movieNamec) Studios: studioName. However there exists a many-to-many relationship between Studios and Contracts. Hence, in addition, we need more information aboutstudios involved. If a contract always involves two studios, two attributes such as producingStudio and starStudio can replace theStudios entity set. If a contact can be associated with at most five studios, it may be possible to replace the Studios entity set by five attributes viz.studio1, studio2, studio3, studio4, and studio5. Alternately, a compositeattribute containing concatenation of all studio names in a contact can be considered. A separator character such as "$" can be used. SQL allows searchingof such an attribute using query like '%keyword%'4.2.5From Augmentation rule of Functional Dependency,giventhenBND -> M (N=Nurse, D=Doctor)Hence we can just put an arrow entering mother.a) Put an arrow entering entity set Mothers for the simplest solution (As in fig.4.4, where a multi-way relationship was allowed, even though Movies alone could identify the Studio). However, we can display more accurate information with below figure.b)givenBM -> DthenBMN -> DThus we can just add an arrow entering Doctors to fig 4.15. Below figure represents more accurate information however.4.2.6a)b) Transitivity and Augmentation rules of Functional Dependency allow arrow entering Mothers from Births. However, a new relationship in below figure represents more accurate information.c)Design flaws in abc above 1. As suggested above, using Transitivity and Augmentation rules of Functional Dependency, much simpler design is possible.4.2.7In below figure there exists a many-to-one relationship between Babies and Births and another many-to-one relationship between Births and Mothers. From transitivity of relationships, there is a many-to-one relationship between Babies and Mothers. Hence a baby has a unique mother while a birth can allow more than one baby.4.3.1a)b)A captain cannot exist without a team. However a player can (free agent). A recently formed (or defunct) team can exist without players or colors.c)Children can exist without mother and father (unknown).4.3.2a)The keys of both E1 and E2 are required for uniquely identifying tuples in Rb)The key of E1c)The key of E2d)The key of either E1 or E24.3.3Special Case: All entity sets have arrows going into them i.e. all relationships are 1-to-1Any KiOtherwise: Combination of all Ki's where there does not exist an arrow goingfrom R to Ei.4.4.1No, grade is not part of the key for enrollments. The keys of Students and Courses become keys of the weak entity set Enrollments.4.4.2It is possible to make assignment number a weak key of Enrollments but this is not good design (redundancy since multiple assignments correspond to a course).A new entity set Assignment is created and it is also a weak entity set. Hence the key attributes of Assignment will come from the strong entity sets to which Enrollments is connected i.e. studentID, dept, and CourseNo.4.4.3a)b)4.4.4a)4.5.1Customers(SSNo,name,addr,phone)Flights(number,day,aircraft)Bookings(custSSNo,flightNo,flightDay,row,seat)Relations for toCust and toFlt relationships are not required since the weak4.5.2(a)(b)Schema is changed. Since toCust is no longer an identifying relationship, SSNo is no longer a part of Bookings relation.Bookings(flightNo,flightDay,row,seat)ToCust(custSSNO,flightNo,flightDay,row,seat)The above relations are merged intoBookings(flightNo,flightDay,row,seat,custSSNo)However custSSNo is no longer a key of Bookings relation. It becomes a foreign4.5.3Ships(name, yearLaunched)SisterOf(name, sisterName)4.5.4(a)Stars(name,addr)Studios(name,addr)Movies(title,year,length,genre)Contracts(starName,movieTitle,movieYear,studioName,salary)Depending on other relationships not shown in ER diagram, studioName may not be required as a key of Contracts (or not even required as an attribute of Contracts).(b)Students(studentID)Courses(dept,courseNo)Enrollments(studentID,dept,courseNo,grade)(c)Departments(name)Courses(deptName,number)(d)Leagues(name)Teams(leagueName,teamName)Players(leagueName,teamName,playerName)4.6.1The weak relation Courses has the key from Depts along with number. Hence there is no relation for GivenBy relationship.(a)Depts(name, chair)Courses(number, deptName, room)LabCourses(number, deptName, allocation)(b) LabCourses has all the attributes of Courses.Depts(name, chair)Courses(number, deptName, room)(c) Courses and LabCourses are combined into one relation.Depts(name, chair)Courses(number, deptName, room, allocation)4.6.2(a)Person(name,address)ChildOf(personName,personAddress,childName,childAddress)Child(name,address,fatherName,fatherAddress,motherName,motherAddresss)Father(name,address,wifeName,wifeAddresss)Mother(name,address)Since FatherOf and MotherOf are many-one relationships from Child, there is no need for a separate relation for them. Similarly the one-one relationshipMarried can be included in Father (or Mother). ChildOf is a many-manyrelationship and needs a separate relation.However the ChildOf relation is not required since the relationship can be deduced from FatherOf and MotherOf relationships contained in Child relation. (b)A person cannot be both Mother and Father.Person(name,address)PersonChild(name,address)PersonChildFather(name,address)PersonChildMother(name,address)PersonFather(name,address)PersonMother(name,address)ChildOf(personName,personAddress,childName,childAddress)FatherOf(childName,childAddress,fatherName,fatherAddress)MotherOf(childName,childAddress,motherName,motherAddress)Married(husbandName,husbandAddress,wifeName,wifeAddress)The many-many ChildOf relationship again requires a relation.An entity belongs to one and only one class when using object-oriented approach. Hence, the many-one relations MotherOf and FatherOf could be added as attributes to PersonChild,PersonChildFather, and PersonChildMother relations.Similarly the Married relation can be added as attributes to PersonChildMother and PersonMother (or the corresponding father relations).(c) For the Person relation at least one of husband and wife attributes will be null.Person(personName,personAddress,fatherName,fatherAddress,motherName,motherAddres ss,wifeName,wifeAddresss,husbandName,husbandAddress)ChildOf(personName,personAddress,childName,childAddress)4.6.3(a)People(name,fatherName,motherName)Males(name)Females(name)Fathers(name)Mothers(name)ChildOf(personName,childName)(b)People(name)PeopleMale(name)PeopleMaleFathers(name)PeopleFemale(name)PeopleFemaleMothers(name)ChildOf(personName,childName)FatherOf(childName,fatherName)MotherOf(childName,motherName)People cannot belong to both male and female branch of the ER diagram.Moreover since an entity belongs to one and only one class when using object-oriented approach, no entity belongs to People relation.Again we could replace MotherOf and FatherOf relations by adding as attributesto PeopleMale,PeopleMaleFathers,PeopleFemale, and PeopleFemaleMothers relations.(c)People(name,fatherName,motherName)ChildOf(personName,childName)4.6.4(a)Each entity set results in one relation. Thus both the minimum and maximum number of relations is e.The root relation has a attributes including k keys. Thus the minimum number of attributes is a. All other relations include the k keys from root along withtheir a attributes. Thus the maximum number of attributes is a+k.(b)The relation for root will have a attributes. The relation representing the whole tree will have e*a attributes.The number of relations will depend on the shape of the tree. A tree of eentities where only one child exists(say left child only) would have the minimum number of relations. Thus below figure will only contain 4 subtrees that contain root E1,E1E2,E1E2E3, and E1E2E3E4. With e entity sets, minimum e relations are possible.The maximum number of subtrees result when all the entities(except root) are at depth 1. Thus below figure will contain 8 subtrees that contain rootE1,E1E2,E1E3,E1E4,E1E2E3,E1E3E4,E1E2E4,and E1E2E3E4. With e entity sets, maximum 2^(e-1) relations are possible.(c)The nulls method always results in one relation and contains attributes from all e entities i.e. e*a attributes.Summarizing for a,b, and c above;#Components #RelationsMin Max Min MaxMethodstraight-E/R a a e eobject-oriented a e*a e 2^(e-1)nulls e*a e*a 1 14.7.14.7.2a)b)c)d)4.7.34.7.5Males and Females subclasses are complete. Mothers and Fathers are partial. All subclasses are disjoint.4.7.7We convert the ternary relationship Contracts into three binary relationships between a new entity set Contracts and existing entity sets.4.7.9a)b)c)4.7.10A self-association ParentOf for entity set people has multiplicity 0..2 at parent role end.In a Library database, if a patron can loan at most 12 books, them multiplicity is 0..12.For a FullTimeStudents entity set, a relationship of multiplicity 5..* must exist with Courses (A student must take at least5 courses to be classified FullTime.4.8.1Customers(SSNo,name,addr,phone)Flights(number,day,aircraft)Bookings(row,seat,custSSNo,FlightNumber,FlightDay)Customers("SSNo",name,addr,phone)Flights("number","day",aircraft)Bookings(row,seat,"custSSNo","FlightNumber","FlightDay")4.8.2a)Movies(title,year,length,genre)Studios(name,address)Presidents(cert#,name,address)Owns(movieTitle,movieYear,studioName)Runs(studioName,presCert#)Movies("title","year",length,genre)Studios("name",address)Presidents("cert#",name,address)Owns("movieTitle","movieYear",studioName)Runs("studioName",presCert#)b)Since the subclasses are disjoint, Object Oriented Approach is used. The hierarchy is not complete. Hence four relations are required Movies(title,year,length,genre)MurderMysteries(title,year,length,genre,weapon)Cartoons(title,year,length,genre)Cartoon-MurderMysteries(title,year,length,genre,weapon)Movies("title","year",length,genre)MurderMysteries("title","year",length,genre,weapon)Cartoons("title","year",length,genre)Cartoon-MurderMysteries("title","year",length,genre,weapon)c)Customers(ssNo,name,phone,address)Accounts(number,balance,type)Owns(custSSNo,accountNumber)Customers("ssNo",name,phone,address)Accounts("number",balance,type)Owns("custSSNo","accountNumber")d)Teams(name,captainName)Players(name,teamName)Fans(name,favoriteColor)Colors(colorname)For Displays association,TeamColors(teamName,colorname)RootsFor(fanName,teamName)Admires(fanName,playerName)Teams("name",captainName)Players("name",teamName)Fans("name",favoriteColor)Colors("colorname")For Displays association,TeamColors("teamName","colorname")RootsFor("fanName","teamName")Admires("fanName","playerName")e)People(ssNo,name,fatherSSNo,motherSSNo)People("ssNo",name,fatherssNo,motherssNo)f)Students(email,name)Courses(no,section,semester,professorEmail)Departments(name)Professors(email,name,worksDeptName)Takes(letterGrade,studentEmail,courseNo,courseSection,courseSemester)Students("email",name)Courses("no","section","semester",professorEmail)Departments("name")Professors("email",name,worksDeptName)Takes(letterGrade,"studentEmail","courseNo","courseSection","courseSemester")4.8.3a)Each and every object is a member of exactly one subclass at leaf level. We have nine classes at the leaf of hierarchy. Hence we need nine relations.b)All objects only belong to one subclass and its ancestors. Hence, we need not consider every possible subtree but rather the total number of nodes in tree. Hence we need thirteen relations.c)We need all possible subtrees. Hence 218 relations are required.class Customer (key (ssNo)){attribute integer ssNo;attribute string name;attribute string addr;attribute string phone;relationship Set<Account> ownsAcctsinverse Account::ownedBy;};class Account (key (number)){attribute integer number;attribute string type;attribute real balance;relationship Set<Customer> ownedByinverse Customer::ownsAccts;};4.9.2a)Modify class Account to contain relationship Customer ownedBy (no Set)b)Also remove set in relationship ownsAccts of class Customer.c)ODL allows a collection of primitive types as well as structures. To class Customer add following attributes in place of simple attributes addr and phone: Set<string phone>Set<Struct addr{string street,string city,string state}>d)ODL allows structures and collections recursively.Set<Struct addr{string street,string city,string state},Set<string phone>>Collections are allowed in ODL. Hence, Colors Set can become an attribute of Teams.class Colors(key(colorname)){attribute string colorname;relationship Set<Fans> FavoredByinverse Fans::Favors;relationship set<Teams> DisplayedByinverse Teams::Displays;};class Teams(key(name)){attribute string name;relationship set<Colors> Displaysinverse Colors::DisplayedBy;relationship set<Players> PlayedByinverse Players::Plays;relationship PLayers CaptainedByinverse Platyers::Captains;relationship set<Fans> RootedByinverse Fans::Roots;};class Players(key(name)){attribute string name;relationship Set<Teams> Playsinverse Teams::PlayedBy;relationship Teams Captainsinverse Teams::CaptainedBy;relationship Set<Fans> AdmiredByinverse Fans::Admires;};class Fans(key(name)){attribute string name;relationship Colors Favorsinverse Colors::FavoredBy;relationship Set<Teams> RootedByinverse Teams::Roots;relationship Set<Players> Admiresinverse Players::AdmiredBy;};4.9.4class Person {attribute string name;relationship Person motherOfinverse Person::childrenOfFemale;relationship Person fatherOfinverse Person::childrenOfMale;relationship Set<Person> childreninverse Person::parentsOf;relationship Set<Person> childrenOfFemaleinverse Person::motherOf;relationship Set<Person> childrenOfMaleinverse Person::fatherOf;relationship Set<Person> parentsOfinverse Person::children;};4.9.5The struct education{string degree,string school,string date} cannot have duplication.Hence use of Sets does not make any different as compared to bags, lists, or arrays.Lists will allow faster access/queries due to the already sorted nature.4.9.6a)class Departments(key (name)) {attribute string name;relationship Courses offersinverse Courses::offeredBy;};class Courses(key (number,offeredBy)) {attribute string number;relationship Departments offeredByinverse Departments::offers;};b)class Leagues (key (name)) {attribute name;relationship Teams containsinverse Teams::belongs;};class Teams(key (name,belongs)) {attribute name,relationship Leagues belongsinverse Leagues::contains;relationship Players playinverse Players::plays;};class Players (key(number,plays)) {attribute number,relationship Teams playsinverse Teams::play;4.9.7class Students (key email) {attribute string email;attribute string name;relationship Courses isTAinverse Courses::TA;relationship Courses Takesinverse Courses::TakenBy;};class Professors (key email) {attribute string email;attribute string name;relationship Departments WorksForinverse Department::Works;relationship Courses Teachesinverse Courses::TaughtBy;};class Courses (key (no,semester,section)) {attribute string no;attribute string semester;attribute string section;relationship Students TAinverse Students::isTA;relationship Students TakenByinverse Students::Takes;relationship Professors TaughtByinverse Professors::Teaches;relationship Departments OfferedByinverse Departments::Offer;};class Departments (key name) {attribute name;relationship Courses Offerinverse Courses::OfferedBy;relationship Professors Worksinverse Professors::WorksFor;};4.9.8A relationship is its own inverse when for every attribute pair in the relationship, the inverse pair also exists. A relation with such a relationship is called symmetric in set theory. e.g. A relationship called SiblingOf in Person relation is its own inverse.4.10.1a)Customers(ssNo,name,addr,phone)Account(number,type,balance)Owns(ssNo,accountNumber)b)Accounts(number,balance,type,owningCustomerssNo)Customers(ssNo,name)Addresses(ownerssNo,street,state,city)Phones(ownerssNo,street,state,city,phonearea,phoneno)We can remove Addresses relation since its attributes are a subset of relation Phones.c)Fans(name,colors)RootedBy(fan_name,teamname)Admires(fan_name,playername)Players(name,teamname,is_captain)Teams(name)--remove subset of teamcolorTeamcolors(name,colorname)Colors(colorname)d)class Person {attribute string name;relationship Person motherOfinverse Person::childrenOfFemale;relationship Person fatherOfinverse Person::childrenOfMale;relationship Set<Person> childreninverse Person::parentsOf;relationship Set<Person> childrenOfFemaleinverse Person::motherOf;relationship Set<Person> childrenOfMaleinverse Person::fatherOf;relationship Set<Person> parentsOfinverse Person::children;};Person(name,mothername,fathername)The children relationship is many-many but the information can be deduced from Person relation. Hence below relation is redundant.Parent-Child(parent, child)4.10.2First consider each struct as if it were an atomic value i.e. key and value association pairs can be treated as two attributes. After applying normalization, the attributes can be replaced by the fields of the structs.4.10.3(a)Struct Card { string rank, string suit };(b)class Hand {attribute Set theHand;};(c)Hands(handId, rank, suit)Each tuple corresponds to one card of a hand. HandId is required key to identify a hand.class PokerHand{attribute Array Hand(Card card1,Card card2,Card card3,Cardcard4,Card card5)}PokerHandS(handId,rank1,suit1,,rank2,suit2,rank3,suit3,rank4,suit4,rank5,suit5) (e)class Deal {attribute Set <Struct PlayerHand { string Player, Hand theHand } > theDeal;}(f) PokerDeal consist of a player and array of five card deal.class PokerDeal{string Player,attribute Array Hand(Card card1,Card card2,Cardcard3,Card card4,Card card5)}(g) Above can similarly be represented by key player and a value consisting offive element array.(h)dealID is a key for Deals. Thus the relations for classes Deals and Hands are:Deals(dealID, player, handID)Hands(handID, rank, suit)A simpler relation Deals below can also represents the classes:Deals(dealID, player, rank, suit)(i)The relation Deals(dealID,card) cannot identify the hand to which a card belongs. Also two attributes are required for a card;its rank and suit.Deals(dealID, handID, rank, suit)4.10.4(a)C(a, f, g)(b)C(a, f, g, count)(c)C(a, f, g, position)(d)C(a, f, g, i, j)。

数据库系统基础教程第四章答案(完整资料).doc

数据库系统基础教程第四章答案(完整资料).doc

【最新整理,下载后即可编辑】SolutionsChapter 4 4.1.14.1.2a)b)c)In c we assume that a phone and address can only belong to a single customer (1-m relationship represented by arrow into customer).d)In d we assume that an address can only belong to one customer and a phone can exist at only one address.If the multiplicity of above relationships were m-to-n, the entity set becomes weak and the key ssNo of customers will be needed as part of the composite key of the entity set.In c&d, we convert attributes phones and addresses to entity sets. Since entity sets often become relations in relational design,we must consider more efficient alternatives.Instead of querying multiple tables where key values are duplicated, we can also modify attributes:(i) Phones attribute can be converted into HomePhone, OfficePhone and CellPhone.(ii) A multivalued attribute such as alias can be kept as an attribute where a single column can be used in relational design i.e. concatenate all values. SQL allows a query "like '%Junius%'" to search the multiple values in a column alias.4.1.34.1.4a)b)c)The relationship "played" between Teams and Players is similar to relationship "plays" between Teams and Players.4.1.54.1.6 The information about children can be ascertained from motherOf and fatherOf relationships. Attribute ssNo is required since names are not unique.4.1.74.1.8a)(b)4.1.9AssumptionsA Professor only works in at most one department.A course has at most one TA.A course is only taught by one professor and offered by one department. Students and professors have been assigned unique email ids.A course is uniquely identified by the course no, section no, and semester (e.g. cs157-3 spring 09).4.1.10Given that for each movie, a unique studio exists that produces the movie. Each star is contracted to at most one studio.But stars could be unemployed at a given time. Thus the four-way relationship in fig 4.6 can be easily into converted equivalent relationships.4.2.1Redundancy: The owner address is repeated in AccSets and Addresses entity sets. Simplicity: AccSets does not serve any useful purpose and the design can be more simply represented by creating many-to-many relationship between Customers and Accounts.Right kind of element: The entity set Addresses has a single attribute address. A customer cannot have more than one address.Hence address should be an attribute of entity set Customers. Faithfulness: Customers cannot be uniquely identified by their names. In real world Customers would have a unique attribute such as ssNo or customerNo 4.2.2Studios and Presidents can be combined into one entity set Studios with Presidents becoming an attribute of Studios under following circumstances:1. The Presidents entity set only contains a simple attribute viz. presidentName. Additional attributes specific to Presidents might justify making Presidents into an entity set.4.2.34.2.4 The entity sets should have single attribute.a) Stars: starNameb) Movies: movieNamec) Studios: studioName. However there exists a many-to-many relationship between Studios and Contracts. Hence, in addition, we need more informationabout studios involved. If a contract always involves two studios, two attributes such as producingStudio and starStudio can replace theStudios entity set. If a contact can be associated with at most five studios, it may be possible to replace the Studios entity set by five attributes viz. studio1, studio2, studio3, studio4, and studio5. Alternately, a composite attribute containing concatenation of all studio names in a contact can be considered. A separator character such as "$" can be used. SQL allows searching of such an attribute using query like '%keyword%'4.2.5From Augmentation rule of Functional Dependency,givenB -> M (B=Baby, M=Mother)thenBND -> M (N=Nurse, D=Doctor)Hence we can just put an arrow entering mother.a) Put an arrow entering entity set Mothers for the simplest solution (As in fig. 4.4, where a multi-way relationship was allowed, even though Movies alone could identify the Studio). However, we can display more accurate information with below figure.b)c)Again from Augmentation rule of Functional Dependency, givenBM -> DthenBMN -> DThus we can just add an arrow entering Doctors to fig 4.15. Below figure represents more accurate information however.4.2.6a)b) Transitivity and Augmentation rules of Functional Dependency allow arrow entering Mothers from Births. However, a new relationship in below figure represents more accurate information.c)Design flaws in abc above 1. As suggested above, using Transitivity and Augmentation rules of Functional Dependency, much simpler design is possible.4.2.7In below figure there exists a many-to-one relationship between Babies and Births and another many-to-one relationship between Births and Mothers. From transitivity of relationships, there is a many-to-one relationship between Babiesand Mothers. Hence a baby has a unique mother while a birth can allow more than one baby.4.3.1a)b)A captain cannot exist without a team. However a player can (free agent). A recently formed (or defunct) team can exist without players or colors.c)Children can exist without mother and father (unknown).4.3.2a)The keys of both E1 and E2 are required for uniquely identifying tuples in R b)The key of E1c)The key of E2d)The key of either E1 or E24.3.3Special Case: All entity sets have arrows going into them i.e. all relationships are 1-to-1Any KiOtherwise: Combination of all Ki's where there does not exist an arrow going from R to Ei.4.4.1No, grade is not part of the key for enrollments. The keys of Students and Courses become keys of the weak entity set Enrollments.4.4.2It is possible to make assignment number a weak key of Enrollments but this is not good design (redundancy since multiple assignments correspond to a course).A new entity set Assignment is created and it is also a weak entity set. Hence the key attributes of Assignment will come from the strong entity sets to which Enrollments is connected i.e. studentID, dept, and CourseNo.4.4.3a)b)c)4.4.4a)b)4.5.1Customers(SSNo,name,addr,phone)Flights(number,day,aircraft)Bookings(custSSNo,flightNo,flightDay,row,seat)Relations for toCust and toFlt relationships are not required since the weak entity set Bookings already contains the keys of Customers and Flights.4.5.2(a)(b)Schema is changed. Since toCust is no longer an identifying relationship, SSNo is no longer a part of Bookings relation.Bookings(flightNo,flightDay,row,seat)ToCust(custSSNO,flightNo,flightDay,row,seat)The above relations are merged intoBookings(flightNo,flightDay,row,seat,custSSNo)However custSSNo is no longer a key of Bookings relation. It becomes a foreign key instead.4.5.3Ships(name, yearLaunched)SisterOf(name, sisterName)4.5.4(a)Stars(name,addr)Studios(name,addr)Movies(title,year,length,genre)Contracts(starName,movieTitle,movieYear,studioName,salary)Depending on other relationships not shown in ER diagram, studioName may not be required as a key of Contracts (or not even required as an attribute of Contracts).(b)Students(studentID)Courses(dept,courseNo)Enrollments(studentID,dept,courseNo,grade)(c)Departments(name)Courses(deptName,number)(d)Leagues(name)Teams(leagueName,teamName)Players(leagueName,teamName,playerName)4.6.1The weak relation Courses has the key from Depts along with number. Hence there is no relation for GivenBy relationship.(a)Depts(name, chair)Courses(number, deptName, room)LabCourses(number, deptName, allocation)(b) LabCourses has all the attributes of Courses.Depts(name, chair)Courses(number, deptName, room)LabCourses(number, deptName, room, allocation)(c) Courses and LabCourses are combined into one relation.Depts(name, chair)Courses(number, deptName, room, allocation)4.6.2(a)Person(name,address)ChildOf(personName,personAddress,childName,childAddress)Child(name,address,fatherName,fatherAddress,motherName,motherAddresss) Father(name,address,wifeName,wifeAddresss)Mother(name,address)Since FatherOf and MotherOf are many-one relationships from Child, there is no need for a separate relation for them. Similarly the one-one relationship Married can be included in Father (or Mother). ChildOf is a many-many relationship and needs a separate relation.However the ChildOf relation is not required since the relationship can be deduced from FatherOf and MotherOf relationships contained in Child relation.(b)A person cannot be both Mother and Father.Person(name,address)PersonChild(name,address)PersonChildFather(name,address)PersonChildMother(name,address)PersonFather(name,address)PersonMother(name,address)ChildOf(personName,personAddress,childName,childAddress)FatherOf(childName,childAddress,fatherName,fatherAddress)MotherOf(childName,childAddress,motherName,motherAddress)Married(husbandName,husbandAddress,wifeName,wifeAddress)The many-many ChildOf relationship again requires a relation.An entity belongs to one and only one class when using object-oriented approach. Hence, the many-one relations MotherOf and FatherOf could be added as attributes to PersonChild,PersonChildFather, and PersonChildMother relations.Similarly the Married relation can be added as attributes to PersonChildMother and PersonMother (or the corresponding father relations).(c) For the Person relation at least one of husband and wife attributes will be null. Person(personName,personAddress,fatherName,fatherAddress,motherName,mot herAddresss,wifeName,wifeAddresss,husbandName,husbandAddress) ChildOf(personName,personAddress,childName,childAddress)4.6.3(a)People(name,fatherName,motherName)Males(name)Females(name)Fathers(name)Mothers(name)ChildOf(personName,childName)(b)People(name)PeopleMale(name)PeopleMaleFathers(name)PeopleFemale(name)PeopleFemaleMothers(name)ChildOf(personName,childName)FatherOf(childName,fatherName)MotherOf(childName,motherName)People cannot belong to both male and female branch of the ER diagram. Moreover since an entity belongs to one and only one class when using object-oriented approach, no entity belongs to People relation.Again we could replace MotherOf and FatherOf relations by adding as attributes to PeopleMale,PeopleMaleFathers,PeopleFemale, and PeopleFemaleMothers relations.(c)People(name,fatherName,motherName)ChildOf(personName,childName)4.6.4(a)Each entity set results in one relation. Thus both the minimum and maximum number of relations is e.The root relation has a attributes including k keys. Thus the minimum number of attributes is a. All other relations include the k keys from root along with their a attributes. Thus the maximum number of attributes is a+k.(b)The relation for root will have a attributes. The relation representing the whole tree will have e*a attributes.The number of relations will depend on the shape of the tree. A tree of e entities where only one child exists(say left child only) would have the minimum number of relations. Thus below figure will only contain 4 subtrees that contain rootE1,E1E2,E1E2E3, and E1E2E3E4. With e entity sets, minimum e relations are possible.The maximum number of subtrees result when all the entities(except root) are at depth 1. Thus below figure will contain 8 subtrees that contain rootE1,E1E2,E1E3,E1E4,E1E2E3,E1E3E4,E1E2E4,and E1E2E3E4. With e entity sets, maximum 2^(e-1) relations are possible.The nulls method always results in one relation and contains attributes from all e entities i.e. e*a attributes.Summarizing for a,b, and c above;#Components #RelationsMin Max Min MaxMethodstraight-E/R a a e eobject-oriented a e*a e 2^(e-1)nulls e*a e*a 1 14.7.14.7.2a)b)c)d)4.7.34.7.44.7.5Males and Females subclasses are complete. Mothers and Fathers are partial. All subclasses are disjoint.4.7.6。

  1. 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
  2. 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
  3. 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。

第四章作业 1. 数据仓库的需求分析的任务是什么?P67 需求分析的任务是通过详细调查现实世界要处理的对象(企业、部门用户等),充分了解源系统工作概况,明确用户的各种需求,为设计数据仓库服务。概括地说,需求分析要明确用那些数据经过分析来实现用户的决策支持需求。 2. 数据仓库系统需要确定的问题有哪些?P67、、 (1) 确定主题域 a) 明确对于决策分析最有价值的主题领域有哪些 b) 每个主题域的商业维度是那些?每个维度的粒度层次有哪些? c) 制定决策的商业分区是什么? d) 不同地区需要哪些信息来制定决策? e) 对那个区域提供特定的商品和服务? (2) 支持决策的数据来源 a) 那些源数据与商品的主题有关? b) 在已有的报表和在线查询(OLTP)中得到什么样的信息? c) 提供决策支持的细节程度是怎么样的? (3) 数据仓库的成功标准和关键性指标 a) 衡量数据仓库成功的标准是什么? b) 有哪些关键的性能指标?如何监控? c) 对数据仓库的期望是什么? d) 对数据仓库的预期用途有哪些? e) 对计划中的数据仓库的考虑要点是什么? (4) 数据量与更新频率 a) 数据仓库的总数据量有多少? b) 决策支持所需的数据更新频率是多少?时间间隔是多长? c) 每种决策分析与不同时间的标准对比如何? d) 数据仓库中的信息需求的时间界限是什么? 3. 实现决策支持所需要的数据包括哪些内容?P68 (1)源数据(2)数据转换(3)数据存储(4)决策分析

4.概念:将需求分析过程中得到的用户需求抽象为计算机表示的信息结构,叫做概念模型。 特点: (1)能真实反映现实世界,能满足用户对数据的分析,达到决策支持的要求,它是现实世界的一个真实模型。 (2)易于理解,便利和用户交换意见,在用户的参与下,能有效地完成对数据仓库的成功设计。 (3)易于更改,当用户需求发生变化时,容易对概念模型修改和扩充。 (4)易于向数据仓库的数据模型(星型模型)转换。

5.用长方形表示实体,在数据仓库中就表示主题,椭圆形表示主题的属性,并用无向边把主题与其属性连接起来; 用菱形表示主题之间的联系,用无向边把菱形分别与有关的主题连接; 若主题之间的联系也具有属性,则把属性和菱形也用无向边连接上。

6.数据库的概念模型设计主要采用E-R概念模型的设计方法。 数据仓库的概念模型设计主要采用E-R概念模型和面向对象的分析方法。

7 .图4.1所示的概念模型:商品和客户是两个主题,商品的销售信息等同于客户的购物信息,而每个商品具有本身的商品固有信息和商品号,还有就是商品的库存信息;客户具有自己的固有信息,还有就是客户号。

8.逻辑模型:计算机所支持的有E-R图转换成的数据模型,数据的逻辑结构 数据仓库的逻辑模型:星型模型

9.数据仓库的逻辑模型:用来构建数据仓库的数据库逻辑模型。 在数据库中,逻辑模型有关系、网状、层次,可以清晰的表示各个关系。

10.举例说明从数据仓库的概念模型到逻辑模型的转换? 答: 概念模型是对每个决策与属性及主体之间的关系用E-R图来表示的,E-R图能有效的将现实的世界表示成信息世界,他利于向计算机的表示形式进行转化。而逻辑模型设计是需求分析主题域,将概念模型E-R图转化为逻辑模型,即计算机表示的数据模型,数据仓库的数据模型一般采用星型模型。例如 概念模型设计时,确定了商品和客户两个主题。其中商品对于商场来说是更基本的业务对象,商品的业务有销售、采购、库存。其中商品销售时最重要的业务。它是进行决策分析的重要方面。星型模型的设计如下: 确定决策分析需求,数据仓库是面向决策分析的,决策需求是建立多维数据模型的依据。例如分析销售额趋势,对商品的销售量,促销手段对销售的影响。 从需求中识别出事实,从决策主题确定的情况下,选择或设计反映决策主体业务表。例如在商品主题中,以销售数据为事实表。 确定维,确定影响事实的各种因素,对销售业务的维一般的包括商店,地区,部门,城市,时间,商品等。 确定数据汇总的水平,存在于数据仓库中的数据包括汇总的数据。数据仓库中对数据不同粒度的综合形成了多层次的数据结构。例如 对于时间维,可以用年 月 日 不同水平进行汇总。 设计事实表和维表,设计事实表和维表的属性,再事实表中应该记录哪些属性是有维表的数量来决定的,一般来说,与事实表相关的维表的数量应该适中,太少的维表会影响查询的质量,用户得不到需要的数据,太多的数据会影响查询的速度。

11. 在数据仓库中为什么考虑数据的粒度层次划分? 答: 所谓的粒度是指数据仓库宗数据单元的详细程度和级别,数据越详细,粒度越小,层次级别九月低;数据综合度越高,粒度越大,层次级别就越高。在传统事务处理系统中,对数据的处理,操作都是再详细数据级别上的,即最低的粒度。但是数据仓库环境中主要是分析处理,粒度的划分键直接影响数据仓库中数据量以及所适合的查询类型。一般需要将数据划分为详细数据,轻度综合,高度综合三级或更多及粒度。不同粒度级别的数据用于不同类型的分析处理。力度的划分是数据仓库设计工作的一项重要内容,粒度划分是否适当影响数据仓库性能的一个重要方面。

12.数据仓库的记录系统包括什么内容,举例说明? 答:数据仓库中的数据来源与多个已经存在的事务处理系统外部系统,由于各个原系统的数据是面向应用的,不能完整地描述企业中的主题域,并且多个数据源的数据存在者许多不一致,因此要从数据仓库的概念模型出发,结合主题的多个表的关系模式,需要确定现有系统的哪些数据能较好地适应数据的需求。这就要求选择最完整的、最及时的、最准确的、最接近外部实体源的数据作为记录系统,同时这些数据所在的表的关系模式接近于构成主体的多个标的关系模式。记录系统的定义要记入数据仓库的元数据。

13、什么是物理模型?数据仓库的物理模型设计包括哪些工作? 答:物理模型就是逻辑模型在计算机中的物理结构,其中包括存储结构和存取方法;数据仓库的物理模型设计的工作包括:估计存储容量、确定数据的存储计划、确定索引策略、确定数据存放位置和确定存储分配。

14、为什么数据仓库物理模型设计中要建立汇总计划和确定数据分区方案? 答:如果数据仓库只存储最小粒度的数据,每次查询遍历所有的明细记录,然后生成汇总信息,这会造成很大的开销,因此要建立汇总计划; 分区可以将表分解成易于管理的小表,对事实表的分区医保采用垂直分区或水平分区,这样使得大表被分成小表,因此要建立分区方案。

15、说明图4.8中逻辑模型与物理模型的区别。 答:逻辑模型表现出各数据元素间直接或间接的关系,并体现主题域的结构,而且说明各个表所包含的元素。而物理模型要体现在计算机中的物理结构,所以有各个表元素的类型和长度。在图4.8中,产品维表的主键为产品键,我们只能在逻辑模型中得到这个信息,而在物理模型中,产品键为integer类型,长度为10,这是在计算机中的存储结构。

16.概念模型:E-R图 逻辑模型:星型模型 物理模型:存储结构、索引、数据存放位置、存储分配。

17.(1)位索引技术 ①Bit-Wise索引技术 ②B-Tree索引技术 (2)表示技术 (3)广义索引

18.因为B-Tree索引增加了在数据仓库中构造和维护索引的代价; B-Tree不适合复杂查询

19、数据仓库中采用标识技术有什么好处。 答:使用标准的数据库技术来储存数据仓库是非常昂贵的。较好的替代方法是用基于标识的技术来储存数据仓库。 一旦将基于标识的数据库存放在内存中,处理速度会得到很大的提高。 数据越多,标识数据比标准的、基于记录的数据更有利。 因为数据被大量压缩,所以整个数据库可以存放在内存中。 可以索引所有的行和所有的列。

20、数据仓库的广义索引时什么时候建立的?简单说明原因。 答:在从操作型环境抽取数据并向数据仓库中装载的同时,就可以根据用户的需要建立许多“广义索引”。每次数据仓库装载时,就重新生成这些“广义索引”的内容。这样并不需要为了建立“广义索引”而去扫描数据仓库。而且这些索引都非常小,开销也是相当小,但它给应用所带来的便利却是显而易见的。对于一些经常性的查询,利用一个规模小得多的“广义索引”总比去搜索一个大得多的关系表方便得多。

21、说明数据仓库开发的四个阶段和12个步骤 答:如下图所示发:分为分析设计阶段;数据获取阶段;决策支持阶段;维护与评估阶段。

22. 数据获取阶段包括数据抽取,数据转换,数据装载3个步骤。数据抽取:数据抽取主要进行数据源的确认,确定数据抽取技术,确认数据抽取频率,按照时间要求抽取数据。数据转换:数据抽取得到的数据不能直接存入数据仓库的。数据转换工作包括:数据格式的修改,字段的解码,单个字段的分离,信息的合并,变量单位的转化,时间的转化,数据汇总等。数据装载:数据装载包括初始装载,增量装载,完全刷新。

23. 数据仓库的简历就是要达到决策支持的目的。决策支持阶段包括信息查询和知识探索两个步骤。信息查询:信息查询者使用数据仓库发现目前存在的问题。为适应信息查询者的要求,数据仓库一般采用如下的方法提高信息查询效率:创建数据陈列,预连接表格,预聚集数据,聚类数据。知识探索:只是探索者使用数据仓库能对发现的问题找出原因。

24. 维护与评估阶段包括数据仓库增长,数据仓库维护,数据仓库评价。数据仓库增长:数据仓库建立以后,随着数据用户的不断增加,时间的曾增长,用户查询需求更多,数据会迅速增长。数据仓库维护:数据仓库维护包括适应数据仓库增长的维护和正常系统维护两类。数据仓库评估:数据仓库评估包括系统性能评定,投资回报分析,数据质量评估。

25.概括说明“概念模型、逻辑模型、物理模型”分别是什么样的数据模型? 答:将需求分析过程中得到的用户需求抽象为计算机表示的信息结构,即概念模型。逻辑模型是由概念模型进一步转化成计算机支持的数据模型。物理模型是逻辑模型设计的数据模型适应应用要求在计算机中的存储结构和存取方法。

26.数据仓库索引技术包括哪些内容? 答:位索引技术、标识技术、广义索引。

27.为什么B-Tree索引不适合数据仓库? 答:1、B-Tree只适合于高基数字段,但对于低基数字段毫无价值。 2、B-Tree索引需占一定的空间和时间,增加了在数据仓库中构造和维护索引的代价。 3、数据仓库应用中常常是复杂的查询,并经常带有分组及聚合条件,此时B-Tree索引往往无能为力。

28. 当有一个或多个维表没有直接连接到事实表上,而是通过其他维表连接到事实表上时,其图解就像多个雪花连接在一起,故称雪花模型。雪花模型是对星型模型的扩展。它对星型模型的维表进一步层次化,原有的各维表可能被扩展为小的事实表,形成一些局部的 "层次 " 区域,这些被分解的表都连接到主维度表而不是事实表。 管理大量数据,数据的高效装入和数据压缩,存储介质的管理,元数据的管理,数据仓库语言,高效索引,多维数据仓库和数据管理

相关文档
最新文档