第6章-DM-Association rule

合集下载

Association_Rule

10
Support: 2/8
Confidence: 1/3
Milk Bread
Support: 2/8 Confidence: 2/3
Frequent Itemsets and Strong Rules
Support and Confidence are bounded by thresholds:
C3 {{1, 2,3}}
24
Lk Ck+1
{X Yk | X , Y Lk , X i Yi , i [1, k 1], X k Yk }
Ordered List
L2 {{1, 2},{2,3}} L2 {{1,3},{2,3}} L2 {{1, 2},{1,3},{2,3}} L2 {{1, 2},{1,3}}
Searching for rules in the form of: Bread Butter
6
7
Support of an Itemset
Itemset
Bread
Butter Chips Jelly
Support
6/8
3/8 2/8 3/8
Itemset
Bread, Butter
Support
Minimum support σ
Minimum confidence Φ A frequent (large) itemset is an itemset with support larger than σ. A strong rule is a rule that is frequent and its confidence is higher than Φ. Association Rule Problem Given I, D, σ and Φ, to find all strong rules in the form of XY. The number of all possible association rules is huge. Brute force strategy is infeasible. A smart way is to find frequent itemsets first.

2022年吉林大学数据科学与大数据技术专业《数据库系统原理》科目期末试卷B(有答案)

2022年吉林大学数据科学与大数据技术专业《数据库系统原理》科目期末试卷B（有答案）一、填空题1、数据库内的数据是______的，只要有业务发生，数据就会更新，而数据仓库则是______的历史数据，只能定期添加和刷新。

2、以子模式为框架的数据库是______________；以模式为框架的数据库是______________；以物理模式为框架的数据库是______________。

3、视图是一个虚表，它是从______导出的表。

在数据库中，只存放视图的______，不存放视图对应的______。

4、关系模型由______________、______________和______________组成。

5、数据库恢复是将数据库从______状态恢复到______的功能。

6、数据库系统是利用存储在外存上其他地方的______来重建被破坏的数据库。

方法主要有两种：______和______。

7、在SQL Server 2000中，新建了一个SQL Server身份验证模式的登录账户LOG，现希望LOG在数据库服务器上具有全部的操作权限，下述语句是为LOG授权的语句，请补全该语句。

EXEC sp_addsrvrolemember‘LOG’，_____；8、数据仓库主要是供决策分析用的______，所涉及的数据操作主要是______，一般情况下不进行。

9、数据库管理系统的主要功能有______________、______________、数据库的运行管理以及数据库的建立和维护等4个方面。

10、在SQL语言中，为了数据库的安全性，设置了对数据的存取进行控制的语句，对用户授权使用____________语句，收回所授的权限使用____________语句。

二、判断题11、关系中任何一列的属性取值是不可再分的数据项，可取自不同域中的数据。

（）12、有出现并发操作时，才有可能出现死锁。

（）13、在第一个事务以S锁方式读数据R时，第二个事务可以进行对数据R加S锁并写数据的操作。

数据库系统基础教程第六章答案

Solutions Chapter 6Attributes must be separated by commas. Thus here B is an alias of A.6.1.2a)SELECT address AS Studio_AddressFROM StudioWHERE NAME = 'MGM';b)SELECT birthdate AS Star_BirthdateFROM MovieStarWHERE name = 'Sandra Bullock';c)SELECT starNameFROM StarsInWHERE movieYear = 1980OR movieTitle LIKE '%Love%';However, above query will also return words that have the substring Love e.g. Lover. Below query will only return movies that have title containing the word Love.SELECT starNameFROM StarsInWHERE movieYear = 1980OR movieTitle LIKE 'Love %'OR movieTitle LIKE '% Love %'OR movieTitle LIKE '% Love'OR movieTitle = 'Love';d)SELECT name AS Exec_NameFROM MovieExecWHERE netWorth >= 10000000;e)SELECT name AS Star_NameFROM movieStarWHERE gender = 'M'OR address LIKE '% Malibu %';a)SELECT model,speed,hdFROM PCWHERE price < 1000 ;MODEL SPEED HD----- ---------- ------1002 2.10 2501003 1.42 801004 2.80 2501005 3.20 2501007 2.20 2001008 2.20 2501009 2.00 2501010 2.80 3001011 1.86 1601012 2.80 1601013 3.06 8011 record(s) selected.b)SELECT model ,speed AS gigahertz,hd AS gigabytesFROM PCWHERE price < 1000 ;MODEL GIGAHERTZ GIGABYTES ----- ---------- ---------1002 2.10 2501003 1.42 801004 2.80 2501005 3.20 2501007 2.20 2001008 2.20 2501009 2.00 2501010 2.80 3001011 1.86 1601012 2.80 1601013 3.06 8011 record(s) selected.c)SELECT makerFROM ProductWHERE TYPE = 'printer' ; MAKER-----DDEEEHH7 record(s) selected.d)SELECT model,ram ,screenFROM LaptopWHERE price > 1500 ; MODEL RAM SCREEN ----- ------ -------2001 2048 20.12005 1024 17.02006 2048 15.42010 2048 15.44 record(s) selected.e)SELECT *FROM PrinterWHERE color ;MODEL CASE TYPE PRICE----- ----- -------- ------3001 TRUE ink-jet 993003 TRUE laser 9993004 TRUE ink-jet 1203006 TRUE ink-jet 1003007 TRUE laser 2005 record(s) selected.Note: Implementation of Boolean type is optional in SQL standard (feature IDT031). PostgreSQL has implementation similar to above example. Other DBMS provide equivalent support. E.g. In DB2 the column type can be declare as SMALLINT with CONSTRAINT that the value can be 0 or 1. The result can be returned as Boolean type CHAR using CASE.CREATE TABLE Printer(model CHAR(4) UNIQUE NOT NULL,color SMALLINT ,type VARCHAR(8) ,price SMALLINT ,CONSTRAINT Printer_ISCOLOR CHECK(color IN(0,1)));SELECT model,CASE colorWHEN 1THEN 'TRUE'WHEN 0THEN 'FALSE'ELSE 'ERROR'END CASE ,type,priceFROM PrinterWHERE color = 1;f)SELECT model,hdFROM PCWHERE speed = 3.2AND price < 2000;MODEL HD----- ------1005 2501006 3202 record(s) selected.6.1.4a)SELECT class,countryFROM ClassesWHERE numGuns >= 10 ; CLASS COUNTRY------------------ ------------ Tennessee USA1 record(s) selected.b)SELECT name AS shipName FROM ShipsWHERE launched < 1918 ; SHIPNAME------------------HarunaHieiKirishimaKongoRamilliesRenownRepulseResolutionRevengeRoyal OakRoyal Sovereign11 record(s) selected.c)SELECT ship AS shipName, battleFROM OutcomesWHERE result = 'sunk' ; SHIPNAME BATTLE------------------ ------------------ Arizona Pearl Harbor Bismark Denmark Strait Fuso Surigao Strait Hood Denmark Strait Kirishima Guadalcanal Scharnhorst North Cape Yamashiro Surigao Strait 7 record(s) selected.d)SELECT name AS shipNameFROM ShipsWHERE name = class ;SHIPNAME------------------IowaKongoNorth CarolinaRenownRevengeYamato6 record(s) selected.e)SELECT name AS shipNameFROM ShipsWHERE name LIKE 'R%';SHIPNAME------------------RamilliesRenownRepulseResolutionRevengeRoyal OakRoyal Sovereign7 record(s) selected.Note: As mentioned in exercise 2.4.3, there are some dangling pointers and to retrieve all ships a UNION of Ships and Outcomes is required.Below query returns 8 rows including ship named Rodney.SELECT name AS shipNameFROM ShipsWHERE name LIKE 'R%'UNIONSELECT ship AS shipNameFROM OutcomesWHERE ship LIKE 'R%';f) Only using a filter like '% % %' will incorrectly match name such as ' a b ' since % can match any sequence of 0 or more characters.SELECT name AS shipNameFROM ShipsWHERE name LIKE '_% _% _%' ;SHIPNAME------------------0 record(s) selected.Note: As in (e), UNION with results from Outcomes.SELECT name AS shipNameFROM ShipsWHERE name LIKE '_% _% _%'UNIONSELECT ship AS shipNameFROM OutcomesWHERE ship LIKE '_% _% _%' ;SHIPNAME------------------Duke of YorkKing George VPrince of Wales3 record(s) selected.6.1.5a)The resulting expression is false when neither of (a=10) or (b=20) is TRUE.a = 10b = 20 a = 10 OR b = 20NULL TRUE TRUETRUE NULL TRUEFALSE TRUE TRUETRUE FALSE TRUETRUE TRUE TRUEb)The resulting expression is only TRUE when both (a=10) and (b=20) are TRUE.a = 10b = 20 a = 10 AND b = 20TRUE TRUE TRUEThe expression is always TRUE unless a is NULL.a < 10 a >= 10 a = 10 ANDb = 20TRUE FALSE TRUEFALSE TRUE TRUEd)The expression is TRUE when a=b except when the values are NULL.a b a = bNOT NULL NOT NULL TRUE when a=b; else FALSEe)Like in (d), the expression is TRUE when a<=b except when the values are NULL.a b a <= bNOT NULL NOT NULL TRUE when a<=b; else FALSE6.1.6SELECT *FROM MoviesWHERE LENGTH IS NOT NULL;6.2.1a)SELECT AS starNameFROM MovieStar M,StarsIn SWHERE = S.starNameAND S.movieTitle = 'Titanic'AND M.gender = 'M';b)SELECT S.starNameFROM Movies M ,StarsIn S,Studios TWHERE ='MGM'AND M.year = 1995AND M.title = S.movieTitleAND M.studioName = ;SELECT AS presidentName FROM MovieExec X,Studio TWHERE X.cert# = T.presC#AND = 'MGM';d)SELECT M1.titleFROM Movies M1,Movies M2WHERE M1.length > M2.lengthAND M2.title ='Gone With the Wind' ;e)SELECT AS execNameFROM MovieExec X1,MovieExec X2WHERE Worth > WorthAND = 'Merv Griffin' ;6.2.2a)SELECT R.maker AS manufacturer,L.speed AS gigahertzFROM Product R,Laptop LWHERE L.hd >= 30AND R.model = L.model ; MANUFACTURER GIGAHERTZ------------ ----------A 2.00A 2.16A 2.00B 1.83E 2.00E 1.73E 1.80F 1.60F 1.60G 2.0010 record(s) selected.SELECT R.model,P.priceFROM Product R,PC PWHERE R.maker = 'B'AND R.model = P.model UNIONSELECT R.model,L.priceFROM Product R,Laptop LWHERE R.maker = 'B'AND R.model = L.model UNIONSELECT R.model,T.priceFROM Product R,Printer TWHERE R.maker = 'B'AND R.model = T.model ; MODEL PRICE----- ------1004 6491005 6301006 10492007 14294 record(s) selected.c)SELECT R.makerFROM Product R,Laptop LWHERE R.model = L.model EXCEPTSELECT R.makerFROM Product R,PC PWHERE R.model = P.model ; MAKER-----FG2 record(s) selected.SELECT DISTINCT P1.hd FROM PC P1,PC P2WHERE P1.hd =P2.hdAND P1.model > P2.model ; Alternate Answer:SELECT DISTINCT P.hdFROM PC PGROUP BY P.hdHAVING COUNT(P.model) >= 2 ;e)SELECT P1.model,P2.modelFROM PC P1,PC P2WHERE P1.speed = P2.speed AND P1.ram = P2.ramAND P1.model < P2.model ; MODEL MODEL----- -----1004 10121 record(s) selected.f)SELECT M.makerFROM(SELECT maker,R.modelFROM PC P,Product RWHERE SPEED >= 3.0AND P.model=R.modelUNIONSELECT maker,R.modelFROM Laptop L,Product RWHERE speed >= 3.0AND L.model=R.model) MGROUP BY M.makerHAVING COUNT(M.model) >= 2 ; MAKER-----B1 record(s) selected.6.2.3a)SELECT FROM Ships S,Classes CWHERE S.class = C.classAND C.displacement > 35000;NAME------------------IowaMissouriMusashiNew JerseyNorth CarolinaWashingtonWisconsinYamato8 record(s) selected.b)SELECT ,C.displacement,C.numGunsFROM Ships S ,Outcomes O,Classes CWHERE = O.shipAND S.class = C.classAND O.battle = 'Guadalcanal' ;NAME DISPLACEMENT NUMGUNS------------------ ------------ -------Kirishima 32000 8Washington 37000 92 record(s) selected.Note:South Dakota was also engaged in battle of Guadalcanal but not chosen since it is not in Ships table(Hence, no information regarding it's Class isavailable).SELECT name shipName FROM ShipsUNIONSELECT ship shipName FROM Outcomes ; SHIPNAME------------------ArizonaBismarkCaliforniaDuke of YorkFusoHarunaHieiHoodIowaKing George VKirishimaKongoMissouriMusashiNew JerseyNorth CarolinaPrince of Wales RamilliesRenownRepulseResolutionRevengeRodneyRoyal OakRoyal Sovereign ScharnhorstSouth DakotaTenneseeTennesseeWashingtonWest VirginiaWisconsinYamashiroYamato34 record(s) selected.SELECT C1.countryFROM Classes C1,Classes C2WHERE C1.country = C2.country AND C1.type = 'bb'AND C2.type = 'bc' ; COUNTRY------------Gt. BritainJapan2 record(s) selected.e)SELECT O1.shipFROM Outcomes O1,Battles B1WHERE O1.battle = AND O1.result = 'damaged'AND EXISTS(SELECT B2.dateFROM Outcomes O2,Battles B2WHERE O2.battle= AND O1.ship = O2.shipAND B1.date < B2.date) ;SHIP------------------0 record(s) selected.f)SELECT O.battleFROM Outcomes O,Ships S ,Classes CWHERE O.ship = AND S.class = C.class GROUP BY C.country,O.battleHAVING COUNT(O.ship) > 3; SELECT O.battleFROM Ships S ,Classes C,Outcomes OWHERE C.Class = S.classAND O.ship = GROUP BY C.country,O.battleHAVING COUNT(O.ship) >= 3;6.2.4Since tuple variables are not guaranteed to be unique, every relation Ri should be renamed using an alias. Every tuple variable should be qualified with the alias. Tuple variables for repeating relations will also be distinctlyidentified this way.Thus the query will be likeSELECT A1.COLL1,A1.COLL2,A2.COLL1,…FROM R1 A1,R2 A2,…,Rn AnWH ERE A1.COLL1=A2.COLC2,…6.2.5Again, create a tuple variable for every Ri, i=1,2,...,nThat is, the FROM clause isFROM R1 A1, R2 A2,...,Rn An.Now, build the WHERE clause from C by replacing every reference to some attribute COL1 of Ri by Ai.COL1. In addition apply Natural Join i.e. add condition to check equality of common attribute names between Ri and Ri+1 for all i from 0 to n-1. Also, build the SELECT clause from list of attributes L by replacing every attribute COLj of Ri by Ai.COLj.6.3.1a)SELECT DISTINCT makerFROM ProductWHERE model IN(SELECT modelFROM PCWHERE speed >= 3.0);SELECT DISTINCT R.makerFROM Product RWHERE EXISTS(SELECT P.modelFROM PC PWHERE P.speed >= 3.0AND P.model =R.model);SELECT P1.modelFROM Printer P1WHERE P1.price >= ALL(SELECT P2.priceFROM Printer P2) ;SELECT P1.modelFROM Printer P1WHERE P1.price IN(SELECT MAX(P2.price)FROM Printer P2) ;c)SELECT L.modelFROM Laptop LWHERE L.speed < ANY(SELECT P.speedFROM PC P) ;SELECT L.modelFROM Laptop LWHERE EXISTS(SELECT P.speedFROM PC PWHERE P.speed >= L.speed ) ;SELECT modelFROM(SELECT model,priceFROM PCUNIONSELECT model,priceFROM LaptopUNIONSELECT model,priceFROM Printer) M1WHERE M1.price >= ALL (SELECT priceFROM PCUNIONSELECT priceFROM LaptopUNIONSELECT priceFROM Printer) ;(d) – contd --SELECT modelFROM(SELECT model,priceFROM PCUNIONSELECT model,priceFROM LaptopUNIONSELECT model,priceFROM Printer) M1WHERE M1.price IN(SELECT MAX(price)FROM(SELECT priceFROM PCUNIONSELECT priceFROM LaptopUNIONSELECT priceFROM Printer) M2) ;e)SELECT R.makerFROM Product R,Printer TWHERE R.model =T.model AND T.price <= ALL(SELECT MIN(price)FROM Printer);SELECT R.makerFROM Product R,Printer T1WHERE R.model =T1.model AND T1.price IN(SELECT MIN(T2.price) FROM Printer T2);f)SELECT R1.makerFROM Product R1,PC P1WHERE R1.model=P1.modelAND P1.ram IN(SELECT MIN(ram)FROM PC)AND P1.speed >= ALL(SELECT P1.speedFROM Product R1,PC P1WHERE R1.model=P1.model AND P1.ram IN(SELECT MIN(ram)FROM PC));SELECT R1.makerFROM Product R1,PC P1WHERE R1.model=P1.modelAND P1.ram =(SELECT MIN(ram)FROM PC)AND P1.speed IN(SELECT MAX(P1.speed)FROM Product R1,PC P1WHERE R1.model=P1.model AND P1.ram IN(SELECT MIN(ram)FROM PC));6.3.2a)SELECT C.countryFROM Classes CWHERE numGuns IN(SELECT MAX(numGuns)FROM Classes);SELECT C.countryFROM Classes CWHERE numGuns >= ALL(SELECT numGunsFROM Classes);b)SELECT DISTINCT C.class FROM Classes C,Ships SWHERE C.class = S.classAND EXISTS(SELECT shipFROM Outcomes OWHERE O.result='sunk' AND O.ship = ) ;SELECT DISTINCT C.class FROM Classes C,Ships SWHERE C.class = S.classAND IN(SELECT shipFROM Outcomes OWHERE O.result='sunk' ) ;c)SELECT FROM Ships SWHERE S.class IN(SELECT classFROM Classes CWHERE bore=16) ;SELECT FROM Ships SWHERE EXISTS(SELECT classFROM Classes CWHERE bore =16AND C.class = S.class );SELECT O.battleFROM Outcomes OWHERE O.ship IN(SELECT nameFROM Ships SWHERE S.Class ='Kongo' );SELECT O.battleFROM Outcomes OWHERE EXISTS(SELECT nameFROM Ships SWHERE S.Class ='Kongo' AND = O.ship );SELECT FROM Ships S,Classes CWHERE S.Class = C.ClassAND numGuns >= ALL(SELECT numGunsFROM Ships S2,Classes C2WHERE S2.Class = C2.Class AND C2.bore = C.bore) ;SELECT FROM Ships S,Classes CWHERE S.Class = C.ClassAND numGuns IN(SELECT MAX(numGuns)FROM Ships S2,Classes C2WHERE S2.Class = C2.Class AND C2.bore = C.bore) ;Better answer;SELECT FROM Ships S,Classes CWHERE S.Class = C.ClassAND numGuns >= ALL(SELECT numGunsFROM Classes C2WHERE C2.bore = C.bore) ;SELECT FROM Ships S,Classes CWHERE S.Class = C.ClassAND numGuns IN(SELECT MAX(numGuns)FROM Classes C2WHERE C2.bore = C.bore) ;6.3.3SELECT titleFROM MoviesGROUP BY titleHAVING COUNT(title) > 1 ;6.3.4SELECT FROM Ships S,Classes CWHERE S.Class = C.Class ;Assumption: In R1 join R2, the rows of R2 are unique on the joining columns. SELECT COLL12,COLL13,COLL14FROM R1WHERE COLL12 IN(SELECT COL22FROM R2)AND COLL13 IN(SELECT COL33FROM R3)AND COLL14 IN(SELECT COL44FROM R4) ...6.3.5(a)SELECT ,S.addressFROM MovieStar S,MovieExec EWHERE S.gender ='F'AND Worth > 10000000AND = AND S.address = E.address ;Note: As mentioned previously in the book, the names of stars are unique. However no such restriction exists for executives. Thus, both name and address are required as join columns.Alternate solution:SELECT name,addressFROM MovieStarWHERE gender = 'F'AND (name, address) IN(SELECT name,addressFROM MovieExecWHERE netWorth > 10000000) ;(b)SELECT name,addressFROM MovieStarWHERE (name,address) NOT IN(SELECT name addressFROM MovieExec) ;6.3.6By replacing the column in subquery with a constant and using IN subquery forthe constant, statement equivalent to EXISTS can be found.i.e. replace "WHERE EXISTS (SELECT C1 FROM R1..)" by "WHERE 1 IN (SELECT 1 FROM R1...)"Example:SELECT DISTINCT R.makerFROM Product RWHERE EXISTS(SELECT P.modelFROM PC PWHERE P.speed >= 3.0AND P.model =R.model) ;Above statement can be transformed to below statement.SELECT DISTINCT R.makerFROM Product RWHERE 1 IN(SELECT 1FROM PC PWHERE P.speed >= 3.0AND P.model =R.model) ;6.3.7(a)n*m tuples are returned where there are n studios and m executives. Each studiowill appear m times; once for every exec.(b)There are no common attributes between StarsIn and MovieStar; hence no tuplesare returned.(c)There will be at least one tuple corresponding to each star in MovieStar. Theunemployed stars will appear once with null values for StarsIn. All employedstars will appear as many times as the number of movies they are working in. Inother words, for each tuple in StarsIn(starName), the correspoding tuple fromMovieStar(name)) is joined and returned. For tuples in MovieStar that do nothave a corresponding entry in StarsIn, the MovieStar tuple is returned with nullvalues for StarsIn columns.6.3.8Since model numbers are unique, a full natural outer join of PC, Laptop andPrinter will return one row for each model. We want all information about PCs,Laptops and Printers even if the model does not appear in Product but vice versais not true. Thus a left natural outer join between Product and result above isrequired. The type attribute from Product must be renamed since Printer has atype attribute as well and the two attributes are different.(SELECT maker,model,type AS productTypeFROM Product) RIGHT NATURAL OUTER JOIN ((PC FULL NATURAL OUTER JOIN Laptop) FULL NATURAL OUTER JOIN Printer);Alternately, the Product relation can be joined individually with each ofPC,Laptop and Printer and the three results can be Unioned together. Forattributes that do not exist in one relation, a constant such as 'NA' or 0.0 canbe used. Below is an example of this approach using PC and Laptop.SELECT R.MAKER ,R.MODEL ,R.TYPE ,P.SPEED ,P.RAM ,P.HD ,0.0 AS SCREEN,P.PRICEFROM PRODUCT R,PC PWHERE R.MODEL = P.MODELUNIONSELECT R.MAKER ,R.MODEL ,R.TYPE ,L.SPEED ,L.RAM ,L.HD ,L.SCREEN,L.PRICEFROM PRODUCT R,LAPTOP LWHERE R.MODEL = L.MODEL;6.3.9SELECT *FROM Classes RIGHT NATURALOUTER JOIN Ships ;6.3.10SELECT *FROM Classes RIGHT NATURALOUTER JOIN ShipsUNION(SELECT C2.class ,C2.type ,C2.country ,C2.numguns ,C2.bore ,C2.displacement,C2.class NAME ,FROM Classes C2,Ships S2WHERE C2.Class NOT IN(SELECT ClassFROM Ships)) ;6.3.11(a)SELECT *FROM R,S ;(b)Let Attr consist ofAttrR = attributes unique to RAttrS = attributes unique to SAttrU = attributes common to R and SThus in Attr, attributes common to R and S are not repeated. SELECT AttrFROM R,SWHERE R.AttrU1 = S.AttrU1AND R.AttrU2 = S.AttrU2 ...AND R.AttrUi = S.AttrUi ;(c)SELECT *FROM R,SWHERE C ;6.4.1(a)DISTINCT keyword is not required here since each model only occurs once in PC relation.SELECT modelFROM PCWHERE speed >= 3.0 ;(b)SELECT DISTINCT R.makerFROM Product R,Laptop LWHERE R.model = L.modelAND L.hd > 100 ;(c)SELECT R.model,P.priceFROM Product R,PC PWHERE R.model = P.modelAND R.maker = 'B'UNIONSELECT R.model,L.priceFROM Product R,Laptop LWHERE R.model = L.modelAND R.maker = 'B'UNIONSELECT R.model,T.priceFROM Product R,Printer TWHERE R.model = T.modelAND R.maker = 'B' ;SELECT modelFROM PrinterWHERE color=TRUEAND type ='laser' ;(e)SELECT DISTINCT R.makerFROM Product R,Laptop LWHERE R.model = L.modelAND R.maker NOT IN(SELECT R1.makerFROM Product R1,PC PWHERE R1.model = P.model) ;better:SELECT DISTINCT R.makerFROM Product RWHERE R.type = 'laptop'AND R.maker NOT IN(SELECT R.makerFROM Product RWHERE R.type = 'pc') ;(f)With GROUP BY hd, DISTINCT keyword is not required. SELECT hdFROM PCGROUP BY hdHAVING COUNT(hd) > 1 ;(g)SELECT P1.model,P2.modelFROM PC P1,PC P2WHERE P1.speed = P2.speedAND P1.ram = P2.ramAND P1.model < P2.model ;SELECT R.makerFROM Product RWHERE R.model IN(SELECT P.modelFROM PC PWHERE P.speed >= 2.8)OR R.model IN(SELECT L.modelFROM Laptop LWHERE L.speed >= 2.8)GROUP BY R.makerHAVING COUNT(R.model) > 1 ;(i)After finding the maximum speed, an IN subquery can provide the manufacturer name.SELECT MAX(M.speed)FROM(SELECT speedFROM PCUNIONSELECT speedFROM Laptop) M ;SELECT R.makerFROM Product R,PC PWHERE R.model = P.modelAND P.speed IN(SELECT MAX(M.speed)FROM(SELECT speedFROM PCUNIONSELECT speedFROM Laptop) M)UNIONSELECT R2.makerFROM Product R2,Laptop LWHERE R2.model = L.modelAND L.speed IN(SELECT MAX(N.speed)FROM(SELECT speedFROM PCUNIONSELECT speedFROM Laptop) N) ;Alternately,SELECT COALESCE(MAX(P2.speed),MAX(L2.speed),0) SPEED FROM PC P2FULL OUTER JOIN Laptop L2ON P2.speed = L2.speed ;SELECT R.makerFROM Product R,PC PWHERE R.model = P.modelAND P.speed IN(SELECT COALESCE(MAX(P2.speed),MAX(L2.speed),0) SPEED FROM PC P2FULL OUTER JOIN Laptop L2ON P2.speed = L2.speed)UNIONSELECT R2.makerFROM Product R2,Laptop LWHERE R2.model = L.modelAND L.speed IN(SELECT COALESCE(MAX(P2.speed),MAX(L2.speed),0) SPEED FROM PC P2FULL OUTER JOIN Laptop L2ON P2.speed = L2.speed)SELECT R.makerFROM Product R,PC PWHERE R.model = P.modelGROUP BY R.makerHAVING COUNT(DISTINCT speed) >= 3 ;(k)SELECT R.makerFROM Product R,PC PWHERE R.model = P.modelGROUP BY R.makerHAVING COUNT(R.model) = 3 ;better;SELECT R.makerFROM Product RWHERE R.type='pc'GROUP BY R.makerHAVING COUNT(R.model) = 3 ;6.4.2(a)We can assume that class is unique in Classes and DISTINCT keyword is not required.SELECT class,countryFROM ClassesWHERE bore >= 16 ;(b)Ship names are not unique (In absence of hull codes, year of launch can help distinguish ships).SELECT DISTINCT name AS Ship_NameFROM ShipsWHERE launched < 1921 ;(c)SELECT DISTINCT ship AS Ship_NameFROM OutcomesWHERE battle = 'Denmark Strait'AND result = 'sunk' ;(d)SELECT DISTINCT AS Ship_NameFROM Ships S,Classes CWHERE S.class = C.classAND C.displacement > 35000 ;SELECT DISTINCT O.ship AS Ship_Name,C.displacement ,C.numGunsFROM Classes C ,Outcomes O,Ships SWHERE C.class = S.classAND = O.shipAND O.battle = 'Guadalcanal' ;SHIP_NAME DISPLACEMENT NUMGUNS------------------ ------------ -------Kirishima 32000 8Washington 37000 92 record(s) selected.Note: South Dakota was also in Guadalcanal but its class information is not available. Below query will return name of all ships that were in Guadalcanal even if no other information is available (shown as NULL). The above query is modified from INNER joins to LEFT OUTER joins.SELECT DISTINCT O.ship AS Ship_Name,C.displacement ,C.numGunsFROM Outcomes OLEFT JOIN Ships SON = O.shipLEFT JOIN Classes CON C.class = S.classWHERE O.battle = 'Guadalcanal' ;SHIP_NAME DISPLACEMENT NUMGUNS------------------ ------------ -------Kirishima 32000 8South Dakota - -Washington 37000 93 record(s) selected.(f)The Set opearator UNION guarantees unique results.SELECT ship AS Ship_NameFROM OutcomesUNIONSELECT name AS Ship_NameFROM Ships ;(g)SELECT C.classFROM Classes C,Ships SWHERE C.class = S.classGROUP BY C.classHAVING COUNT() = 1 ;better:SELECT S.classFROM Ships SGROUP BY S.classHAVING COUNT() = 1 ;(h)The Set opearator INTERSECT guarantees unique results.SELECT C.countryFROM Classes CWHERE C.type='bb'INTERSECTSELECT C2.countryFROM Classes C2WHERE C2.type='bc' ;However, above query does not account for classes without any ships belonging to them.SELECT C.countryFROM Classes C,Ships SWHERE C.class = S.classAND C.type ='bb'INTERSECTSELECT C2.countryFROM Classes C2,Ships S2WHERE C2.class = S2.classAND C2.type ='bc' ;SELECT O2.ship AS Ship_Name FROM Outcomes O2,Battles B2WHERE O2.battle = AND B2.date > ANY(SELECT B.dateFROM Outcomes O,Battles BWHERE O.battle = AND O.result ='damaged' AND O.ship = O2.ship);6.4.3a)SELECT DISTINCT R.maker FROM Product R,PC PWHERE R.model = P.modelAND P.speed >= 3.0;b)Models are unique.SELECT P1.modelFROM Printer P1LEFT OUTER JOIN Printer P2 ON (P1.price < P2.price) WHERE P2.model IS NULL ;c)SELECT DISTINCT L.model FROM Laptop L,PC PWHERE L.speed < P.speed ;Due to set operator UNION, unique results are returned.It is difficult to completely avoid a subquery here. One option is to use Views. CREATE VIEW AllProduct ASSELECT model,priceFROM PCUNIONSELECT model,priceFROM LaptopUNIONSELECT model,priceFROM Printer ;SELECT A1.modelFROM AllProduct A1LEFT OUTER JOIN AllProduct A2ON (A1.price < A2.price)WHERE A2.model IS NULL ;But if we replace the View, the query contains a FROM subquery. SELECT A1.modelFROM(SELECT model,priceFROM PCUNIONSELECT model,priceFROM LaptopUNIONSELECT model,priceFROM Printer) A1LEFT OUTER JOIN(SELECT model,priceFROM PCUNIONSELECT model,priceFROM LaptopUNIONSELECT model,priceFROM Printer) A2ON (A1.price < A2.price) WHERE A2.model IS NULL ;e)SELECT DISTINCT R.makerFROM Product R,Printer TWHERE R.model =T.modelAND T.price <= ALL(SELECT MIN(price)FROM Printer);f)SELECT DISTINCT R1.makerFROM Product R1,PC P1WHERE R1.model=P1.modelAND P1.ram IN(SELECT MIN(ram)FROM PC)AND P1.speed >= ALL(SELECT P1.speedFROM Product R1,PC P1WHERE R1.model=P1.modelAND P1.ram IN(SELECT MIN(ram)FROM PC));6.4.4a)SELECT DISTINCT C1.countryFROM Classes C1LEFT OUTER JOIN Classes C2 ON (C1.numGuns < C2.numGuns) WHERE C2.country IS NULL ;。

dm3_精品文档

dm3DM3DM3 (Data Management and Mining) refers to the process of collecting, organizing, analyzing, and extracting meaningful insights and patterns from vast amounts of data. In today's digital age, data is generated at an unprecedented rate, making it crucial for businesses to effectively manage and utilize this data for decision-making and improving business performance. DM3 encompasses various techniques, tools, and methodologies that enable organizations to harness the power of data and derive valuable insights.1. Introduction to DM3With the advent of big data and advancements in technology, organizations are finding themselves drowning in a sea of data. However, raw data alone is not sufficient for decision-making. DM3 involves the systematic management of data from various sources and the application of statistical and machine learning techniques to uncover trends, patterns, and relationships that can drive business growth. This document explores the fundamentals of DM3 and its significance in today's business landscape.2. The Importance of Data ManagementEffective data management is the foundation of DM3. Organizations need to ensure that data is accurate, reliable, and accessible. This involves data collection, storage, integration, cleansing, and security to ensure the quality and integrity of the data. Data management helps organizations streamline their operations, enhance customer experiences, and gain a competitive edge. It also enables organizations to comply with data protection regulations and mitigate risks associated with data breaches.3. Data Mining TechniquesData mining is a key component of DM3. It involves the extraction of patterns and knowledge from large datasets using various algorithms and statistical models. Data mining techniques include:- Classification: This technique involves categorizing data into predefined classes or groups based on their attributes. It is often used for customer segmentation, fraud detection, and risk analysis.- Clustering: Clustering involves grouping similar data points together based on their characteristics. It helps in identifying natural groupings or patterns within the data, enabling organizations to offer personalized services, target marketing campaigns, and optimize resource allocation.- Association Rule Mining: This technique identifies relationships and associations between variables in a dataset. It helps in understanding customer buying patterns, uncovering cross-selling opportunities, and improving recommendation systems.- Regression Analysis: Regression analysis is used to model and predict the relationship between dependent and independent variables. It helps organizations understand the impact of various factors on business outcomes and make data-driven decisions.4. Data Visualization and ReportingData visualization plays a crucial role in DM3. It involves the use of charts, graphs, and other visual formats to represent data in a meaningful and easily understandable way.Visualization helps in identifying patterns, trends, and outliers in the data, making it easier to communicate insights to stakeholders. Reporting tools enable organizations to generate custom reports, dashboards, and automated alerts, providing real-time insights and facilitating data-driven decision-making.5. Challenges and Ethical ConsiderationsWhile DM3 offers immense potential, there are several challenges and ethical considerations that organizations need to address. This includes data privacy and security concerns, ensuring data accuracy and reliability, dealing with biases in data, and complying with regulations such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). Organizations need to establish robust data governance frameworks and adopt ethical practices to ensure responsible and transparent use of data.6. Future Trends in DM3DM3 is a rapidly evolving field, and several future trends are shaping its landscape. These include:- Artificial Intelligence (AI) and Machine Learning (ML): AI and ML techniques are increasingly being integrated into DM3 processes, enabling organizations to automatically analyze and interpret vast amounts of data.- Real-time Data Analysis: With the exponential growth of data, organizations are now focusing on real-time data analysis to make timely decisions. Real-time data streaming and processing technologies are being leveraged to analyze data as it is generated.- Cloud-based DM3: Cloud computing is revolutionizing the way organizations manage and analyze their data. Cloud-based DM3 solutions offer scalability, flexibility, and cost-efficiency by eliminating the need for on-premises infrastructure.- Predictive Analytics: Predictive analytics utilizes historical data along with statistical modeling techniques to make predictions and forecasts about future events. It helps organizations identify emerging trends, anticipate customer behavior, and make proactive decisions.ConclusionDM3 is a critical discipline that enables organizations to leverage the power of data for improving business performance and gaining a competitive edge. By effectively managing and mining data, organizations can uncover valuable insights, make data-driven decisions, and drive innovation. With the continued advancements in technology and the increasing availability of data, DM3 will only become more vital in the future. Organizations that embrace DM3 will be better positioned to succeed in the data-driven economy.。

v512工作室_张利国_Java高端培训系列教材_Oracle实用教程_06章_DML与事务控制

第六章 DML与事务控制数据操作语言语句（Data manipulation language，DML）用于进行数据的检索和更新操作。

其中，数据检索操作使用SELECT语句实现，是数据库应用中使用频率最高的操作类型，前文中已做详细讲解，本章专门介绍其它的DML语句以实现数据的更新操作，这里的“数据更新”属广义范畴，包括数据的插入、更新、删除以及表间数据合并等。

6.1 INSERT操作INSERT操作用于向表或视图（视图相关知识参见第7.3节，这里不必关注）中插入数据，使用INSERT语句可以实现向表中插入单行数据，如使用子查询也可以实现数据的批量插入、或者将子查询结果插入到多个不同的表中，下面分别加以介绍。

6.1.1 使用VALUES子句插入数据INSERT语句的最常用方式是向表中插入单行数据，此时要用到VALUES子句，其基本语法格式为：INSERT INTO table [(column1 [, column2...])]V ALUES (value1 [, value2...]);其中，table用于指定要插入数据的表名，VALUES子句用于指定新插入记录行各字段的值，其数目、类型和顺序必须表中的记录结构相匹配；如果只为新插入记录的部分字段赋值，则应在表名后指定要进行赋值的字段列表，column即为要赋值的字段名，多个字段间用英文/半角逗号隔开。

下面详细展示其用法：准备测试用样本表首先在SCOTT方案下创建测试样本表myemp，具体SQL语句如下：--删除可能存在的样本表myempDROP TABLE myemp;--创建一个简单的样本表CREATE TABLE myemp(name V ARCHAR2(20), salary NUMBER(10,2), birth DATE);插入新记录时对所有字段赋值在使用INSERT语句插入数据时，如果不在表名后给出字段列表，则必须显式指定新插入记录每一个字段的值，如为空值也应指定为NULL，例如：INSERT INTO myemp V ALUES('张三', 2000, '28-4月-1966');INSERT INTO myemp V ALUES('李四', 4500, to_date('1964-12-23','yyyy-mm-dd'));其中，数字值可以直接给出，文本型和日期型数值应使用单引号引起来，此时日期型数值应为系统缺省格式（'DD-MON-YY'，参见第4.1.4节），也可使用to_date转换函数指定任意格式的日期。

2023年 DCA考试题库

达梦DCA 题库第一章达梦数据库简介1. DM数据库由哪三大结构组成？（多选）A.内存结构B. 线程结构C.存储结构D. 物理结构2. DM内存结构中包含哪三个部分？（多选）A. 数据缓冲区B.日记缓冲区C.块缓冲区D. 共享内存池3. DM日记缓冲区的作用？A. 存储数据块B. 解决内存的申请与释放C.用于存放重做日记的内存缓冲区D. 用千存放回滚日记的内容缓冲区4. DM共享内存池的作用？A.提高系统运营效率B. 实现数据共享C.减少数据I/ 0 带来的资源消耗D. 解决DM Ser ver 对于小片内存的申请与释放问题5.下列属千DM线程结构的是？（多选）A. 用户线程B.日记线程C.服务器线程D. 后台线程6. DM用户线程的启动时间是？A. 启动OM 服务器时B. 启动OM 实例时C.用户请求连接到服务器时D. 新用户创建成功时7. 关千DM服务器线程说法错误的是？A. 在用户建立会话时启动。

B. 在启动O M 实例时启动。

C. 调用应用程序或OM 工具时，OM 服务器会通过创建服务器进程来执行应用程序发出的命令。

D. DM服务器还会针对一个实例创建一组后台进程。

8. 下列不属千DM服务器线程的是？A. 监听线程B.工作线程C.存储线程D. 检查点线程9.监听线程的作用是？A. 监听服务器的启动情况。

B. 监听系统故障并发出警报。

C.监听远程信号并作出反映。

D. 监听用户的连接请求，在客户端和服务器之间建立一个连接。

10.下列不属于DM数据库基本文献的是？A.控制文献B. 数据文献C.备份文献D. 回滚日记文献11. 成功运营DM数据库所需的附加文献涉及哪些？（多选）A. 配置文献B. 备份文献C.预警日记文献D. 归档日记文献12.关千数据库、文献组与数据文献的关系说法对的的是？（多选）A. 一个数据库包含多个文献组B. 一个文献组只能包含一个数据文献C.一个文献组包含—个或多个数据文献D. 一个数据文献仅属千—个数据库13. 关千SYSTEM数据库说法错误的是？A. 是必须存在的库B. 可以脱机存在C.用千核心功能D. 是创建数据库时自动创建的14.关千段、簇与页的说法错误的是？A.段可以跨越多个物理文献B. 段由簇组成C.簇是磁盘块的集合D. 簇是数据页的集合15.下列属于物理数据库结构的是？A.模式B.数据库C.文献组D. 数据文献第二章安装及卸载DM 软件l．不属千数据库管理员职责的是？A.拟定数据库服务器的硬件设备B. 安装DM软件C.DM设计与实现D. 创建、移植、打开和备份数据库E. 登记系统用户和制定用户访问DM的计划2.关千DM数据库工具说法错误的是？A. DM管理工具是管理DM数据库系统的图形化工具。

六章节RDBMS扩展-精选

• 相交/非递归的分子对象
— 分子间存在共享对象成份，即两个分子可能在一些抵赖的分子对象上重叠。例如：具有同一平台的两个几何体
• 不相交/递归 — 具有递归定义的不共享的分子结构，例：几何对象的CSG表示是一个递归分子对象的典型例子，它构成一颗二叉树。参照图3.2（支架表示）
• 递归/相交 — 如果允许CSG中的几何部件被共享，则它是一个有向无环图DAG
of aggregate keylist
s1: key R1; … sn: key Rn; end
对该语法的说明
• 对generic关系R，具有n个属性，其中：
－若是一般的类型，仅用类型符标识它－若是一个聚集引用，即若引用一个generic关系类型，则需要加前缀“key”进行区分 • 对R可以特化，其m维的特化属性用sk1…skm表示 • 每一个特化属性ski将R的实例化对象划分成pi个不相交的集合，每一个集合为R的一个特化集
where m.ID in {“cube#1”, “cube#2”}” …… …… ……
相应的QUEL查询语句
• 利用mech-part的FACE-JOIN和faces中的 GEO-JOIN属性代码为： range of m is mech-part retrieve m.all,m.FACE-JOIN where m.ID = “cube#1”
• 由此看出，一个object 有三种类型：
1. Prim_obj （原始对象） 2. Mot_obj （运动对象） 3. Comp_obj （组合对象）
如下图：
扩展CSG实例化关系模型
Mech_obj
ID MAN
ARG_OBJ
TYPE
LEFT_ARG

找寻关联规则.

Rule Generation
Generate high confidence rules from each frequent itemset, where each rule is a binary partitioning of a frequent itemset
Frequent itemset generation is still computationally expensive
Rule Evaluation Metrics
Support (s)
Fraction of transactions that contain both X and Y
Confidence (c)
Measures how often items in Y appear in transactions that contain X
TID Items
1
Bread, Milk
2
Bread, Diaper, Beer, Eggs
3
Milk, Diaper, Beer, Coke
4
Bread, Milk, Diapeper, Coke
6
Support and Confidence of a rule
Example of Rules:
{Milk,Diaper} {Beer} (s=0.4, c=0.67) {Milk,Beer} {Diaper} (s=0.4, c=1.0) {Diaper,Beer} {Milk} (s=0.4, c=0.67) {Beer} {Milk,Diaper} (s=0.4, c=0.67) {Diaper} {Milk,Beer} (s=0.4, c=0.5) {Milk} {Diaper,Beer} (s=0.4, c=0.5)

1、下载文档前请自行甄别文档内容的完整性，平台不提供额外的编辑、内容补充、找答案等附加服务。
2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
3、如文档侵犯您的权益，请联系客服反馈,我们会尽快为您处理(人工客服工作时间：9:00-18:30)。

• Data integration
– Integration of multiple databases, data cubes, or files
• Data transformation
– Normalization and aggregation
• Data reduction
– Obtains reduced representation in volume but produces the same or similar analytical results
– – – – – – – Object-relational database Spatial and temporal data Time-series data Stream data Multimedia database Heterogeneous and legacy database Text databases & WWW
– The Google system uses a mathematical algorithm calledPageRank to estimate the relative importance of individual web pages based on link patterns – The Advanced Scout system analyzes the logs of NBA games to uncover interesting pieces of information (e.g., “when player X is on the floor, player Y’s shot accuracy decreases from 75% to 30%.”) As of 1997 the system was in use by several NBA teams
HD-ITR
9
What is data mining?
• Data mining—the same as knowledge discovery • Data mining—core of knowledge discovery process Pattern Evaluation
Data Mining Task-relevant Data Data Warehouse Data Cleaning Data Integration Databases
interesting • Interestingness measures
– A pattern is interesting if it is easily understood by humans, valid on new
or test data with some degree of certainty, potentially useful, novel, or validates some hypothesis that a user seeks to confirm
– Data mining is a complex procedure
• Why valid?
– Incorrect patterns are valueless
• Why novel?
– Investment on known-knowledge is wasteful
• Why potentially useful?
• Data discretization
– Part of data reduction but with particular importance, especially for numerical data
2016/11/14
HD-ITR
12
2016/11/14
H
• Other pattern-directed or statistical analyses
2016/11/14
HD-ITR
19
Are all the patterns interesting?
• Data mining may generate thousands of patterns: Not all of them are
• Outlier analysis
– Outlier: a data object that does not comply with the general behavior of the data – Noise or exception? No! useful in fraud detection, rare events analysis
2016/11/14
HD-ITR
7
What is data mining?
• Data mining is the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns from huge volume of data --varied from U. Fayyad, et al. ’s definition of
School of Software Engineering
Data Warehouse and Data Mining
Yang Yan
Part 2 Data Mining
2016/11/14
HD-ITR
2
Contents
• Data mining
– – – – – – Introduction Mining Association Rules Classification Cluster Analysis Mining Complex Types of Data Data Mining Systems
– Prediction
• To predict some unknown or missing numerical values
2016/11/14
HD-ITR
17
Data mining functionality
• Cluster analysis
– Class label is unknown: Group data to form new classes – Maximizing intra-class similarity & minimizing interclass similarity
• Objective vs. subjective interestingness measures
– Objective: based on statistics and structures of patterns, e.g., support, confidence, etc. – Subjective: based on user’s belief in the data, e.g., unexpectedness, novelty,
HD-ITR
15
Data Mining: On what kind of data?
• • • • Relational database Data warehouse Transactional database Advanced database and information repository
2016/11/14
HD-ITR
6
Motivation: Why data mining?
• Applications
– – – – telecom Shares information Supermarket: Diaper and beer ……
• Other Applications
– Text mining (news group, email, documents) and Web mining – Stream data mining – Bioinformatics and bio-data analysis
HD-ITR
16
2016/11/14
Data mining functionality
• Concept description: Characterization(描述) and discrimination(区别)
– Generalize(归纳), summarize(总结), and contrast data characteristics, e.g., dry vs. wet regions
– Patterns will be used in decision making for future affairs
• Why ultimately understandable?
– Patterns will be presented to decision makers
2016/11/14
• Integration of data mining system with a DB and DW System
• Major issues in data mining
2016/11/14
HD-ITR
4
Motivation: Why data mining?
• Data mining applications
• Association (correlation and causality)
– Diaper àBeer [0.5%, 75%]
• Classification and Prediction
– Classification
• To find a model for predicting the class labels of unseen data • Presentation: decision-tree, classification rule, neural network