HIT 2008 Machine Learning Exam (Harbin Institute of Technology)

2008 Spring Graduate Machine Learning Exam
Each major question is worth 10 points. Note: when presenting an algorithm, any non-standard (self-designed) parts must be explained; in particular, state the meaning of any parameters and variables you define yourself.

1. Consider the following sequence of positive and negative examples of the concept "pairs of people who live in the same room". Each training example describes an ordered pair of people, and each person is described by sex (male, female), hair color (black, brown), height (tall, short), and nationality (US, French).

+ <<male brown tall US>, <female black short US>>
+ <<male brown short French>, <female black short US>>
- <<female brown tall French>, <female black short French>>
+ <<male black tall French>, <female brown tall US>>
The hypothesis space defined over these instances is as follows: each hypothesis is a pair of 4-tuples, in which each value constraint may be a specific value (e.g. male, tall), "?" (accepting any value), or "Ø" (rejecting all values).

For example, the hypothesis <<male ? tall ?>, <female ? ? French>> represents all ordered pairs in which the first person is a tall male (of any nationality and hair color) and the second person is a French female (of any hair color and height).

1) Hand-trace the Candidate-Elimination algorithm on the training examples and hypothesis representation given above; in particular, write out the specific and general boundaries of the version space after processing each training example.
2) List all hypotheses in the final version space.
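A compact way to check a hand trace like this is to run the boundary updates in code. The sketch below is a minimal, illustrative Candidate-Elimination for purely conjunctive hypotheses, treating each ordered pair as a flat 8-attribute tuple; the attribute domains and helper names (covers, min_specializations, and so on) are my own assumptions for the sketch, not part of the exam.

# Attribute domains for the flattened <person1, person2> representation (an assumption of this sketch)
DOMAINS = [["male", "female"], ["black", "brown"], ["tall", "short"], ["US", "French"]] * 2

def covers(h, x):
    # h covers x if every constraint is '?' or equals the instance value
    return all(hc == "?" or hc == xc for hc, xc in zip(h, x))

def more_general_or_equal(h1, h2):
    return all(a == "?" or a == b for a, b in zip(h1, h2))

def min_generalizations(s, x):
    # minimally generalize s so that it covers the positive example x ("0" denotes the empty constraint)
    return [tuple(sc if sc == xc else ("?" if sc != "0" else xc) for sc, xc in zip(s, x))]

def min_specializations(g, x):
    # all minimal specializations of g that exclude the negative example x
    out = []
    for i, gc in enumerate(g):
        if gc == "?":
            for v in DOMAINS[i]:
                if v != x[i]:
                    out.append(g[:i] + (v,) + g[i + 1:])
    return out

def candidate_elimination(examples):
    S = [tuple("0" for _ in DOMAINS)]   # most specific boundary (rejects everything)
    G = [tuple("?" for _ in DOMAINS)]   # most general boundary (accepts everything)
    for x, positive in examples:
        if positive:
            G = [g for g in G if covers(g, x)]
            S = [s2 for s in S for s2 in min_generalizations(s, x)
                 if any(more_general_or_equal(g, s2) for g in G)]
        else:
            S = [s for s in S if not covers(s, x)]
            newG = []
            for g in G:
                if not covers(g, x):
                    newG.append(g)
                else:
                    newG += [g2 for g2 in min_specializations(g, x)
                             if any(more_general_or_equal(g2, s) for s in S)]
            G = list(dict.fromkeys(newG))
        print("S =", S)
        print("G =", G)

examples = [
    (("male", "brown", "tall", "US", "female", "black", "short", "US"), True),
    (("male", "brown", "short", "French", "female", "black", "short", "US"), True),
    (("female", "brown", "tall", "French", "female", "black", "short", "French"), False),
    (("male", "black", "tall", "French", "female", "brown", "tall", "US"), True),
]
candidate_elimination(examples)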

2. Suppose a neural network has one hidden layer (a network with one hidden layer consists of an input layer, a hidden layer, and an output layer). Write out the steps of the backpropagation algorithm for training this network.
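For reference, here is a minimal NumPy sketch of stochastic backpropagation for a one-hidden-layer network of sigmoid units trained with the squared-error loss. The layer sizes, learning rate, epoch count, and the XOR toy data are illustrative assumptions, not part of the exam.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_backprop(X, T, n_hidden=3, eta=0.1, epochs=1000, seed=0):
    # stochastic gradient descent / backpropagation for a 1-hidden-layer sigmoid network
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], T.shape[1]
    W1 = rng.uniform(-0.5, 0.5, (n_in, n_hidden))   # input -> hidden weights
    b1 = np.zeros(n_hidden)
    W2 = rng.uniform(-0.5, 0.5, (n_hidden, n_out))  # hidden -> output weights
    b2 = np.zeros(n_out)
    for _ in range(epochs):
        for x, t in zip(X, T):
            # forward pass
            h = sigmoid(x @ W1 + b1)                # hidden activations
            o = sigmoid(h @ W2 + b2)                # output activations
            # backward pass: error terms (delta) for output and hidden units
            delta_o = o * (1 - o) * (t - o)
            delta_h = h * (1 - h) * (W2 @ delta_o)
            # weight updates
            W2 += eta * np.outer(h, delta_o); b2 += eta * delta_o
            W1 += eta * np.outer(x, delta_h); b1 += eta * delta_h
    return W1, b1, W2, b2

# toy usage: learn XOR
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)
params = train_backprop(X, T, epochs=5000)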

(Complete Version) Harbin Institute of Technology Database Exam (with Answers)

Paper I (Harbin Institute of Technology)
I. Multiple choice (1 point each, 20 points total)
1. In the development of data management technology, data independence is highest in the (  ) stage.
   A. database system  B. file system  C. manual management  D. data item management
2. (  ) is a structured collection of data stored in a computer.
   A. network system  B. database system  C. operating system  D. database
3. In the three-level schema architecture of a database, the global logical structure and characteristics of all the data are described by the (  ).
   A. external schema  B. internal schema  C. storage schema  D. (conceptual) schema
4. The minimal set of relational operations a relational data system must support is (  ).
   A. sort, index, statistics  B. selection, projection, join  C. association, update, sort  D. display, print, tabulate
5. When GROUP BY Sno is used in a SELECT statement, Sno must appear in the (  ) clause.
   A. WHERE  B. FROM  C. SELECT  D. HAVING
6. In a WHERE condition expression, the wildcard that matches zero or more characters is (  ).
   A. *  B. ?  C. %  D. _
7. When decomposing a relation schema while preserving functional dependencies, the highest normal form that can be guaranteed is (  ).
   A. 2NF  B. 3NF  C. BCNF  D. 4NF
8. In a relation schema R(U, F), Y ⊆ X⁺_F is (  ) for X → Y to hold.
   A. a necessary and sufficient condition  B. a necessary condition  C. a sufficient condition  D. neither necessary nor sufficient
9. In relational database design, the relation schemas are produced in the (  ) phase.
   A. requirements analysis  B. conceptual design  C. logical design  D. physical design
10. The basic E-R diagram is the database's (  ).
   A. external schema  B. logical schema  C. internal schema  D. conceptual schema
11. When constructing an E-R diagram from a data flow diagram, entities are generally identified first from the (  ) in the data flow diagram.
   A. data items  B. data flows  C. processes  D. data stores
12. Which of the following is not a commonly used access method? (  )
   A. indexing  B. clustering  C. hashing  D. linked list
13. Once a transaction commits, its changes to the database are permanent; this is the transaction's (  ).
   A. atomicity  B. consistency  C. isolation  D. durability
14. The fundamental problem that concurrency control must solve is preserving the (  ) of the database state.

Past Graduate Entrance Machine-Test Answers (Harbin Institute of Technology)

Programming problems from the HIT School of Computer Science graduate entrance machine tests (2009-2012). HIT Computer Science machine test (2009). Problem description: With at most n yuan, buy exactly 100 chickens. Large chickens cost 5 yuan each, small chickens 3 yuan each, and another kind of chick costs 1/3 yuan each; denote their counts by x, y, and z respectively.

Write a program to find all possible solutions for x, y, z.

Input: there are multiple test cases; each gives n.

Output: for each test case, print all feasible (x, y, z) triples, ordered by increasing x, then y, then z.

Sample input: 40
Sample output:
x=0,y=0,z=100
x=0,y=1,z=99
x=0,y=2,z=98
x=1,y=0,z=99
Reference answer:
#include <stdio.h>
int main()
{
    int x, y, z;
    float n;
    while (scanf("%f", &n) != EOF) {
        for (x = 0; 5 * x <= n; x++) {
            for (y = 0; 3 * y <= n; y++) {
                z = 100 - x - y;                        /* remaining chicks */
                if (z >= 0 && (5 * x + 3 * y + (float)z / 3) <= n) {
                    printf("x=%d,y=%d,z=%d\n", x, y, z);
                }
            }
        }
    }
    return 0;
}
Problem description: Read 10 numbers and output the maximum among them.

Input: there are multiple test cases; each consists of 10 numbers.

Output: for each test case, print the maximum value (followed by a newline).

Sample input: 10 22 23 152 65 79 85 96 32 1
Sample output: max=152
Reference answer:
#include <stdio.h>
int main()
{
    int i, a[10], maxn;
    while (scanf("%d", &a[0]) != EOF) {
        maxn = a[0];
        for (i = 1; i < 10; i++) {
            scanf("%d", &a[i]);
            if (maxn < a[i]) {
                maxn = a[i];
            }
        }
        printf("max=%d\n", maxn);
    }
    return 0;
}
Problem description: Given a number n, determine whether it is prime (0, 1 and negative numbers are not prime).

(Complete Version) HIT Shenzhen, Algorithm Design and Analysis, Fall 2008 Exam (He Zhenyu)

Harbin Institute of Technology Shenzhen Graduate School, Fall 2008 Final Examination
HIT Shenzhen Graduate School Examination Paper
Course Name: ____    Lecturer: ____
This exam is closed book. You may not use the textbook, your notes, a computer, or any other materials during the exam. No credit will be given for questions left unanswered, so be sure to answer all questions, even if you are only taking your best guess. Write your answer to each question or problem in the paper provided. If necessary, extra sheets will be provided. Make sure your name is written on all of these pages. Please write neatly and answer all questions unambiguously. This exam has a total of 100 points, and you have 120 minutes.
Time: 09:00-11:00, Monday, Dec. 8, 2008

Single choice [10 points]
1. Which of the following sorting algorithms is not stable? ( B )
   (A) Insertion sort  (B) Quick sort  (C) Merge sort  (D) Bubble sort
2. We say that f(n) is asymptotically larger than g(n) if ( D ).
   (A) f(n) = O(g(n))  (B) f(n) = Ω(g(n))  (C) f(n) = o(g(n))  (D) f(n) = ω(g(n))
3. An order-statistic tree is an augmented red-black tree. In addition to its usual fields, each node x has a field size[x], the number of nodes in the subtree rooted at x. For an order-statistic tree with n nodes, the times for insertion, deletion, and maintenance of the size field are ( A ).
   (A) O(lg n), O(lg n), O(lg n)  (B) O(lg n), O(lg n), O(n lg n)  (C) O(lg n), O(lg n), O(1)  (D) O(lg n), O(n lg n), O(n)
4. For a B-tree whose minimum degree is t, every node other than the root must have at least __ keys and at most __ keys, and every internal node other than the root has at least __ children. ( D )
   (A) t-1, 2t, t  (B) t-1, 2t-1, t  (C) t, 2t, t+1  (D) t-1, 2t+1, t
5. Which of the following statements about P, NP, NPC is correct? ( C )
   (A) P = NP, NPC ⊇ NP  (B) P ⊇ NP, NPC ⊇ P  (C) P ⊆ NP, NPC ⊆ NP  (D) P = NPC, P ⊇ NP

HIT Machine Learning Past Exams

1 Give the definitions or your comprehensions of the following terms. (12')
1.1 The inductive learning hypothesis (P17)
1.2 Overfitting (P49)
1.4 Consistent learner (P148)

2 Give brief answers to the following questions. (15')
2.2 If the size of a version space is |VS|, in general what is the smallest number of queries that may be required by a concept learner using the optimal query strategy to perfectly learn the target concept? (P27)
2.3 In general, decision trees represent a disjunction of conjunctions of constraints on the attribute values of instances; what expression does the following decision tree correspond to?

3 Give the explanation of inductive bias, and list the inductive bias of the CANDIDATE-ELIMINATION algorithm, of decision tree learning (ID3), and of the BACKPROPAGATION algorithm. (10')

4 How to solve overfitting in decision trees and neural networks? (10')
Solution:
Decision tree: stop growing the tree earlier; post-pruning.
Neural network: weight decay; use of a validation set.

5 Prove that the LMS weight update rule w_i <- w_i + η(V_train(b) - V̂(b))x_i performs a gradient descent to minimize the squared error. In particular, define the squared error E as in the text. Now calculate the derivative of E with respect to the weight w_i, assuming that V̂(b) is a linear function as defined in the text. Gradient descent is achieved by updating each weight in proportion to -∂E/∂w_i; therefore, you must show that the LMS training rule alters weights in this proportion for each training example it encounters. (E ≡ Σ_{<b, V_train(b)> ∈ training examples} (V_train(b) - V̂(b))²) (8')
Solution: Since V_train(b) <- V̂(Successor(b)), we have E = Σ (V_train(b) - V̂(b))² with V̂(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6, so
∂E/∂w_i = 2(V_train(b) - V̂(b)) · ∂(V_train(b) - V̂(b))/∂w_i = -2(V_train(b) - V̂(b))x_i.
The LMS rule w_i <- w_i + η(V_train(b) - V̂(b))x_i can therefore be written as w_i <- w_i + η'(-∂E/∂w_i) with η' = η/2.
Hence gradient descent is achieved by updating each weight in proportion to -∂E/∂w_i, and the LMS rule alters weights in exactly this proportion for each training example it encounters.

6 True or false: if decision tree D2 is an elaboration of tree D1, then D1 is more-general-than D2. Assume D1 and D2 are decision trees representing arbitrary boolean functions, and that D2 is an elaboration of D1 if ID3 could extend D1 to D2. If true, give a proof; if false, a counterexample. (Definition: let h_j and h_k be boolean-valued functions defined over X; h_j is more_general_than_or_equal_to h_k (written h_j ≥_g h_k) if and only if (∀x ∈ X)[(h_k(x) = 1) → (h_j(x) = 1)], and h_j >_g h_k ⇔ (h_j ≥_g h_k) ∧ ¬(h_k ≥_g h_j).) (10')
The hypothesis is false. One counterexample is A XOR B: if A ≠ B the training examples are all positive, while if A = B the training examples are all negative; then, using ID3 to extend D1, the new tree D2 will be equivalent to D1, i.e., D2 is equal to D1.

7 Design a two-input perceptron that implements the boolean function A ∧ ¬B. Design a two-layer network of perceptrons that implements A XOR B. (10')

8 Suppose a hypothesis space contains three hypotheses h1, h2, h3, and the posterior probabilities of these hypotheses given the training data are 0.4, 0.3 and 0.3 respectively. If a new instance x is encountered, which is classified positive by h1 but negative by h2 and h3, give the result and detailed classification course of the Bayes optimal classifier. (10') (P125)

9 Suppose S is a collection of training-example days described by attributes including Humidity, which can have the values High or Normal. Assume S is a collection containing 10 examples, [7+, 3-]. Of these 10 examples, suppose 3 of the positive and 2 of the negative examples have Humidity = High, and the remainder have Humidity = Normal. Please calculate the information gain due to sorting the original 10 examples by the attribute Humidity. (log2 1=0, log2 2=1, log2 3=1.58, log2 4=2, log2 5=2.32, log2 6=2.58, log2 7=2.8, log2 8=3, log2 9=3.16, log2 10=3.32) (5')
Solution:
(a) Denote S = [7+, 3-]; then Entropy([7+, 3-]) = -(7/10)log2(7/10) - (3/10)log2(3/10) = 0.886.
(b) Gain(S, Humidity) = Entropy(S) - Σ_{v ∈ Values(Humidity)} (|S_v|/|S|)·Entropy(S_v), where Values(Humidity) = {High, Normal}.
S_High = {s ∈ S | Humidity(s) = High} = [3+, 2-], |S_High| = 5, Entropy(S_High) = -(3/5)log2(3/5) - (2/5)log2(2/5) = 0.972.
S_Normal = [4+, 1-], |S_Normal| = 5, Entropy(S_Normal) = -(4/5)log2(4/5) - (1/5)log2(1/5) = 0.72.
Thus Gain(S, Humidity) = 0.886 - (5/10)(0.972) - (5/10)(0.72) = 0.04.

10 Finish the following algorithm. (10')
(1) GRADIENT-DESCENT(training_examples, η)
Each training example is a pair of the form <x, t>, where x is the vector of input values and t is the target output value; η is the learning rate (e.g., 0.05).
- Initialize each w_i to some small random value.
- Until the termination condition is met, do:
  - Initialize each Δw_i to zero.
  - For each <x, t> in training_examples, do:
    - Input the instance x to the unit and compute the output o.
    - For each linear unit weight w_i, do: ______
  - For each linear unit weight w_i, do: ______
(2) FIND-S algorithm
- Initialize h to the most specific hypothesis in H.
- For each positive training instance x:
  - For each attribute constraint a_i in h:
    - If ______, then do nothing; else replace a_i in h by the next more general constraint that is satisfied by x.
- Output hypothesis h.

1. What is the definition of a learning problem? (5) Use a "checkers learning problem" as an example to state how to design a learning system. (15)
Answer: A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience. (5)
Example, a checkers learning problem: T: play checkers (1); P: percentage of games won in a tournament (1); E: opportunity to play against itself (1).
To design a learning system:
Step 1: Choosing the training experience. (4) A checkers learning problem: task T: playing checkers; performance measure P: percent of games won in the world tournament; training experience E: games played against itself. In order to complete the design of the learning system, we must now choose: 1. the exact type of knowledge to be learned; 2. a representation for this target knowledge; 3. a learning mechanism.
Step 2: Choosing the target function. (4)
1. if b is a final board state that is won, then V(b) = 100;
2. if b is a final board state that is lost, then V(b) = -100;
3. if b is a final board state that is drawn, then V(b) = 0;
4. if b is not a final state in the game, then V(b) = V(b'), where b' is the best final board state that can be achieved starting from b and playing optimally until the end of the game (assuming the opponent plays optimally as well).
Step 3: Choosing a representation for the target function. (4)
x1: the number of black pieces on the board; x2: the number of red pieces on the board; x3: the number of black kings on the board; x4: the number of red kings on the board; x5: the number of black pieces threatened by red (i.e., which can be captured on red's next turn); x6: the number of red pieces threatened by black.
Thus, our learning program will represent V̂(b) as a linear function of the form V̂(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6, where w0 through w6 are numerical coefficients, or weights, to be chosen by the learning algorithm. Learned values for the weights w1 through w6 will determine the relative importance of the various board features in determining the value of the board, whereas the weight w0 will provide an additive constant to the board value.

2. Answer (Find-S & Find-G):
Step 1: Initialize S to the most specific hypothesis in H, S0 = {<Ø, Ø, Ø, Ø, Ø, Ø>}, and G to the most general hypothesis in H, G0 = {<?, ?, ?, ?, ?, ?>}. (1)
Step 2: The first example is <Sunny, Warm, Normal, Strong, Warm, Same, +>: S1 = {<Sunny, Warm, Normal, Strong, Warm, Same>}, G1 = {<?, ?, ?, ?, ?, ?>}. (3)
Step 3: The second example is <Sunny, Warm, High, Strong, Warm, Same, +>: S2 = {<Sunny, Warm, ?, Strong, Warm, Same>}, G2 = {<?, ?, ?, ?, ?, ?>}. (3)
Step 4: The third example is <Rainy, Cold, High, Strong, Warm, Change, ->: S3 = {<Sunny, Warm, ?, Strong, Warm, Same>}, G3 = {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, Same>}. (3)
Step 5: The fourth example is <Sunny, Warm, High, Strong, Cool, Change, +>: S4 = {<Sunny, Warm, ?, Strong, ?, ?>}, G4 = {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>}. (3)
Finally, all the hypotheses in the version space are: {<Sunny, Warm, ?, Strong, ?, ?>, <Sunny, ?, ?, Strong, ?, ?>, <Sunny, Warm, ?, ?, ?, ?>, <?, Warm, ?, Strong, ?, ?>, <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>}. (2)

3. Answer: Let flog(x) = -x·log2(x) - (1-x)·log2(1-x).
Step 1, choose the root node: entropy_all = flog(4/10) = 0.971; (2)
gain_outlook = entropy_all - 0.3·flog(1/3) - 0.3·flog(1) - 0.4·flog(1/2) = 0.296; (1)
gain_temperature = entropy_all - 0.3·flog(1/3) - 0.3·flog(1/3) - 0.4·flog(1/2) = 0.02; (1)
gain_humidity = entropy_all - 0.5·flog(2/5) - 0.5·flog(1/5) = 0.125; (1)
gain_wind = entropy_all - 0.6·flog(5/6) - 0.4·flog(1/4) = 0.256; (1)
so the root node is "outlook". (2)
Step 2, choose the second node. For the sunny branch (humidity or temperature): entropy_sunny = flog(1/3) = 0.918; (1) sunny_gain_wind = entropy_sunny - (2/3)·flog(1/2) - (1/3)·flog(1) = 0.252; (1) sunny_gain_humidity = entropy_sunny - (2/3)·flog(1) - (1/3)·flog(1) = 0.918; (1) sunny_gain_temperature = entropy_sunny - (2/3)·flog(1) - (1/3)·flog(1) = 0.918; (1) so choose humidity or temperature. (1)
For the rain branch (wind): entropy_rain = flog(1/2) = 1; (1) rain_gain_wind = entropy_rain - (1/2)·flog(1) - (1/2)·flog(1) = 1; (1) rain_gain_humidity = entropy_rain - (1/2)·flog(1/2) - (1/2)·flog(1/2) = 0; (1) rain_gain_temperature = entropy_rain - (1/4)·flog(1) - (3/4)·flog(1/3) = 0.311. (1)

4. Answer:
A: The primitive neural units are the perceptron, the linear unit and the sigmoid unit. (3)
Perceptron: (2) a perceptron takes a vector of real-valued inputs, calculates a linear combination of these inputs, then outputs 1 if the result is greater than some threshold and -1 otherwise. More precisely, given inputs x1 through xn, the output o(x1, ..., xn) computed by the perceptron is 1 if w0 + w1x1 + ... + wnxn > 0 and -1 otherwise; the perceptron function is sometimes written as o(x) = sgn(w·x).
Linear unit: (2) a linear unit is one for which the output o is given by o = w·x; thus, a linear unit corresponds to the first stage of a perceptron, without the threshold.
Sigmoid unit: (2) like the perceptron, the sigmoid unit first computes a linear combination of its inputs and then applies a threshold to the result; in the case of the sigmoid unit, however, the threshold output is a continuous function of its input. More precisely, the sigmoid unit computes its output as o = σ(w·x), where σ(y) = 1/(1 + e^(-y)).
B: (Because the question contains a printing error, either the perceptron rule or the delta rule is acceptable; the delta-rule derivation is the one given.) Derivation process: (6) perceptron learning rule / delta rule as in the text.

5. Answer:
P(no) = 5/14, P(yes) = 9/14. (1)
P(sunny|no) = 3/5, P(cool|no) = 1/5, P(high|no) = 4/5, P(strong|no) = 3/5. (1 each)
P(no|new instance) = P(no)·P(sunny|no)·P(cool|no)·P(high|no)·P(strong|no) = (5/14)·(3/5)·(1/5)·(4/5)·(3/5) = 0.02057 = 2.057×10^-2. (2)
P(sunny|yes) = 2/9, P(cool|yes) = 3/9, P(high|yes) = 3/9, P(strong|yes) = 3/9. (1 each)
P(yes|new instance) = P(yes)·P(sunny|yes)·P(cool|yes)·P(high|yes)·P(strong|yes) = (9/14)·(2/9)·(3/9)·(3/9)·(3/9) = 0.00529 = 5.29×10^-3. (2)
ANSWER: NO. (2)

6. Answer:
INDUCTIVE BIAS: (8) Consider a concept learning algorithm L for the set of instances X. Let c be an arbitrary concept defined over X, and let D_c = {<x, c(x)>} be an arbitrary set of training examples of c. Let L(x_i, D_c) denote the classification assigned to the instance x_i by L after training on the data D_c. The inductive bias of L is any minimal set of assertions B such that for any target concept c and corresponding training examples D_c: (∀x_i ∈ X)[(B ∧ D_c ∧ x_i) ⊢ L(x_i, D_c)].
The futility of bias-free learning: (7) a learner that makes no a priori assumptions regarding the identity of the target concept has no rational basis for classifying any unseen instances. In fact, the only reason that the learner was able to generalize beyond the observed training examples is that it was biased by the inductive bias. Unfortunately, the only instances that will produce a unanimous vote are the previously observed training examples; for all other instances, taking a vote will be futile: each unobserved instance will be classified positive by precisely half the hypotheses in the version space and negative by the other half.

1 In the EnjoySport learning task, every example day is represented by 6 attributes. Given that the attribute Sky has three possible values, and that AirTemp, Humidity, Wind, Water and Forecast each have two possible values, explain why the size of the hypothesis space is 973. How would the number of possible instances and possible hypotheses increase with the addition of one attribute A that takes on k possible values?
2 Write the algorithm of Candidate-Elimination using the version space. Assume G is the set of maximally general hypotheses in hypothesis space H, and S is the set of maximally specific hypotheses.
3 (a) What is the entropy of the collection of training examples with respect to the target-function classification? (b) According to the 5 training examples, compute the decision tree that would be learned by ID3, and show the decision tree. (log2 3 = 1.585, log2 5 = 2.322)
4 Give several approaches to avoid overfitting in decision tree learning. How to determine the correct final tree size?
5 Write the BACKPROPAGATION algorithm for a feedforward network containing two layers of sigmoid units.
6 Explain the Maximum A Posteriori (MAP) hypothesis.
7 Using a Naive Bayes classifier, classify the new instance <Outlook=sunny, Temperature=cool, Humidity=high, Wind=strong>. Our task is to predict the target value (yes or no) of the target concept PlayTennis for this new instance.
8 Give the definition of the three types of fitness functions in genetic algorithms.

Question one: (give an example, e.g., a navigation system or checkers)
Question two:
Initialize: G = {<?, ?, ?, ?, ?, ?>}, S = {<Ø, Ø, Ø, Ø, Ø, Ø>}.
Step 1: G = {<?, ?, ?, ?, ?, ?>}, S = {<sunny, warm, normal, strong, warm, same>}.
Step 2 (positive instance 2 arrives): G = {<?, ?, ?, ?, ?, ?>}, S = {<sunny, warm, ?, strong, warm, same>}.
Step 3 (negative instance 3 arrives): G = {<Sunny, ?, ?, ?, ?, ?>, <?, warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, same>}, S = {<sunny, warm, ?, strong, warm, same>}.
Step 4 (positive instance 4 arrives): S = {<sunny, warm, ?, strong, ?, ?>}, G = {<Sunny, ?, ?, ?, ?, ?>, <?, warm, ?, ?, ?, ?>}.
Question three:
(a) Entropy(S) = -(3/5)log2(3/5) - (2/5)log2(2/5) = 0.971.
(b) Gain(S, sky) = Entropy(S) - [(4/5)Entropy(S_sunny) + (1/5)Entropy(S_rainy)] = 0.322; Gain(S, AirTemp) = Gain(S, wind) = Gain(S, sky) = 0.322; Gain(S, Humidity) = Gain(S, Forecast) = 0.02; Gain(S, water) = 0.171. Choose any one of AirTemp, wind and sky as the top node. The decision tree follows (if sky is chosen as the top node).
Question four:
Answer: Inductive bias: some form of prior assumption about the target concept, made by the learner in order to have a basis for classifying unseen instances. Suppose L is a machine learning algorithm and D_c a set of training examples; L(x_i, D_c) denotes the classification assigned to x_i by L after training on D_c. Then the inductive bias is a minimal set of assertions B such that, for an arbitrary target concept c and set of training examples D_c: (∀x_i ∈ X)[(B ∧ D_c ∧ x_i) ⊢ L(x_i, D_c)].
CANDIDATE-ELIMINATION: the target concept is contained in the given hypothesis space H (and the training examples are error-free).
ID3: (a) smaller trees are preferred over larger trees; (b) trees that place high-information-gain attributes close to the root are preferred over those that do not.
BACKPROPAGATION: smooth interpolation between data points.
Question five:
Answer: In naive Bayes classification we assume that all attributes are conditionally independent given the target value, whereas a Bayesian belief network specifies a set of conditional independence assumptions together with a set of conditional probability distributions.
Question six: the stochastic gradient descent algorithm.
Question seven: a naive Bayes worked example.
Question eight:
Answer: To select a hypothesis according to the fitness function, three methods are commonly used: roulette wheel selection, tournament selection and rank selection.
Question nine: single-point crossover; two-point crossover; offspring; uniform crossover; point mutation (any mutation is acceptable).

1 Solution: A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Example (point out the T, P, E of the example): a checkers learning problem; a handwriting recognition learning problem; a robot driving learning problem.
2 Solution:
S0 = {<Ø, Ø, Ø, Ø, Ø, Ø>}; S1 = {<Sunny, Warm, Normal, Strong, Warm, Same>}; S2 = {<Sunny, Warm, ?, Strong, Warm, Same>}; G0 = G1 = G2 = {<?, ?, ?, ?, ?, ?>};
S3 = {<Sunny, Warm, ?, Strong, Warm, Same>}; G3 = {<Sunny, ?, ?, ?, ?, ?>} ∪ {<?, Warm, ?, ?, ?, ?>} ∪ {<?, ?, ?, ?, ?, Same>};
S4 = {<Sunny, Warm, ?, Strong, ?, ?>}; G4 = {<Sunny, ?, ?, ?, ?, ?>} ∪ {<?, Warm, ?, ?, ?, ?>}.
3 Solution: (omitted in the source)
4 In general, inductive bias is some form of prior assumption regarding the identity of the target concept, made by a learner in order to have a rational basis for classifying unseen instances. CANDIDATE-ELIMINATION: the target concept c is contained in the given hypothesis space H. Decision tree learning (ID3): shorter trees are preferred over larger trees, and trees that place high-information-gain attributes close to the root are preferred over those that do not. BACKPROPAGATION algorithm: smooth interpolation between data points.
5 Solution: (1)(2)(3) GRADIENT-DESCENT(training_examples, η): each training example is a pair of the form <x, t>, where x is the vector of input values and t is the target output value; η is the learning rate (e.g., 0.05). Initialize each w_i to some small random value; until the termination condition is met, do: initialize each Δw_i to zero; for each <x, t> in training_examples, do: input the instance x to the unit and compute the output o, then update each linear unit weight accordingly.
(a) Denote S = [7+, 3-]; then Entropy([7+, 3-]) = -(7/10)log2(7/10) - (3/10)log2(3/10) = 0.886.
(b) Gain(S, Humidity) = Entropy(S) - Σ_{v ∈ Values(Humidity)} (|S_v|/|S|)·Entropy(S_v); Values(Humidity) = {High, Normal}; S_High = {s ∈ S | Humidity(s) = High}, Entropy(S_High) = -(3/5)log2(3/5) - (2/5)log2(2/5) = 0.972, |S_High| = 5; Entropy(S_Normal) = -(4/5)log2(4/5) - (1/5)log2(1/5) = 0.72, |S_Normal| = 5. Thus Gain(S, Humidity) = 0.886 - (5/10)(0.972) - (5/10)(0.72) = 0.04.
a) n+1
8 Definition: Consider a concept class C defined over a set of instances X of length n and a learner L using hypothesis space H. C is PAC-learnable by L using H if for all c ∈ C, distributions D over X, ε such that 0 < ε < 1/2, and δ such that 0 < δ < 1/2, learner L will with probability at least (1 - δ) output a hypothesis h ∈ H such that error_D(h) ≤ ε, in time that is polynomial in 1/ε, 1/δ, n, and size(c).

Machine Learning Applications in Industrial Automation: Assessment Paper

A. Bagging
B. Boosting
C. Stacking
D. PCA
17. In machine learning, which of the following is not a deep learning model? (  )
A. Convolutional neural network
B. Recurrent neural network
C. Support vector machine
D. Deep belief network
18. Which of the following is not a challenge for machine learning in industrial automation? (  )
A. Data imbalance
B. Data overfitting
C. Poor model generalization
D. Too much data
19. In machine learning, which of the following is not a supervised learning algorithm? (  )
4. Explain what the generalization ability of a machine learning model is, and why it is particularly important in industrial automation. Also discuss at least two methods for improving a model's generalization ability.
Answer Key
I. Single-choice questions
1. B
2. B
3. D
4. C
5. C
6. A
7. D
8. B
9. D
10. D
11. D
12. D
13. B
14. D
15. C
16. D
17. D
A. Decision tree
B. Collaborative filtering
C. Cluster analysis
D. Principal component analysis
14. Which of the following is not a commonly used cross-validation method? (  )
A. Hold-out
B. K-fold cross-validation
C. Leave-one-out
D. Random cross-validation
15. Which of the following methods is typically used to improve the robustness of a machine learning model? (  )
A. Data standardization
B. Data normalization
C. Feature selection
D. Increasing model complexity
16. In industrial automation, which of the following is not an ensemble learning method in machine learning? (  )
D. Decision tree
4. Which of the following are common network architectures in deep learning? (  )
A. Convolutional neural network
B. Recurrent neural network
C. Fully connected network
D. Generative adversarial network
5. Which of the following methods can be used to reduce the variance of a machine learning model? (  )

Machine Learning Fundamentals Quiz

I. Multiple choice
1. What is the main goal of machine learning?
   A. To make machines think like humans  B. To enable machines to learn automatically  C. To increase computing speed  D. To give machines unlimited memory
2. Which is the main characteristic of supervised learning?
   A. It requires labeled training data  B. It needs no human intervention  C. The machine learns independently  D. It can only handle classification problems
3. Which of the following is unsupervised learning?
   A. Image classification  B. Spam filtering  C. Cluster analysis  D. Sentiment analysis
4. In machine learning, what does overfitting mean?
   A. The model cannot adapt to new data  B. The model performs well on the training set but poorly on the test set  C. The model fails to converge  D. The model's accuracy is low
5. Which of the following is a commonly used performance metric in machine learning?
   A. Accuracy  B. Recall  C. F1 score  D. All of the above
II. Fill in the blanks
1. Machine learning is the science of how to make computers __________.

2. In supervised learning, the training data consist of __________ and __________.

3. __________ is an unsupervised learning algorithm used to divide data into similar groups or clusters.

4. Overfitting means the model over-learns the training set, causing it to __________ on the test set.

5. Accuracy is a metric used to evaluate the performance of a __________ model.

III. Short-answer questions
1. Briefly explain the model training process in machine learning.

2. What is feature engineering, and why is it important in machine learning?
3. Explain the concept of cross-validation and its purpose.

4. Explain the relationship between bias and variance in machine learning.

5. What is ensemble learning, and how is it applied in machine learning?
IV. Applied problem
Suppose you are a data scientist at a real-estate company that wants to use a machine learning model to predict house prices over the coming year.

You are asked to develop a model that helps the company predict a house's sale price from its relevant features.

1. List at least five features that might be useful for training the model.

2. Is this a classification problem or a regression problem? Why?
3. How would you evaluate the performance of the model you develop?
4. Describe how you would use cross-validation to improve the model's generalization ability.

5. Besides a single machine learning model, which ensemble learning methods could you consider to improve prediction performance?
Answers:
I. Multiple choice: 1. B  2. A  3. C  4. B  5. D
II. Fill in the blanks: 1. learn automatically  2. features, labels  3. cluster analysis  4. performs poorly  5. classification (classifier)
III. Short answers: 1. The model training process includes choosing a suitable algorithm and model structure, preparing the training data, training the model on that data, evaluating the model's performance, and adjusting the model's parameters based on the evaluation results.
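As an illustration of questions IV.3 to IV.5, the following sketch evaluates a regression model with K-fold cross-validation and compares it against an ensemble method. It assumes scikit-learn is available, and the feature set and synthetic data are invented for the example only.

import numpy as np
from sklearn.model_selection import cross_val_score, KFold
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor

# synthetic stand-in for house data: [area, rooms, age, distance_to_center, floor]
rng = np.random.default_rng(0)
X = rng.uniform([40, 1, 0, 0.5, 1], [250, 6, 50, 30, 30], size=(500, 5))
y = 3000 * X[:, 0] + 20000 * X[:, 1] - 800 * X[:, 2] - 1500 * X[:, 3] + rng.normal(0, 20000, 500)

cv = KFold(n_splits=5, shuffle=True, random_state=0)
for name, model in [("linear regression", LinearRegression()),
                    ("random forest", RandomForestRegressor(n_estimators=100, random_state=0))]:
    scores = cross_val_score(model, X, y, cv=cv, scoring="neg_mean_squared_error")
    print(name, "5-fold CV RMSE =", round(float(np.sqrt(-scores.mean())), 1))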

HIT 2008 (Probability Exam)

Problem 1 (10 points). Let A, B, C be three events. Express the following events in terms of operations on A, B, C.

(1) A occurs but B and C do not; (2) at least one of A, B, C occurs; (3) at most one of A, B, C occurs.

Problem 2 (10 points). Ten people in a room wear badges numbered 1 to 10. Three people are chosen at random and their badge numbers are recorded. (1) Find the probability that the smallest number is 5; (2) find the probability that the largest number is 5.

Problem 3 (10 points). The density function of the random variable X is f(x) = Ax for 0 ≤ x < 1, f(x) = B - x for 1 ≤ x < 2, and f(x) = 0 otherwise.
Find: (1) the constants A and B; (2) the distribution function of X; (3) P(1/2 < X < 3/2).
Problem 4 (10 points). Let the random variable X be uniformly distributed on (0, 1).

(1) Find the probability density of Y = e^X; (2) find the probability density of Y = -2 ln X.

Problem 5 (15 points). Suppose (X, Y) has the joint distribution
        Y=1    Y=2    Y=3
X=1     1/6    1/9    1/18
X=2     1/3     α      β
(1) What condition must α and β satisfy?
(2) If X and Y are independent, what are the values of α and β?
Problem 7 (15 points). Find the expectation and variance of the sum of the points shown when n dice are thrown.

Problem 8 (15 points). Let the random variables X and Y be independent and identically distributed, each uniform on [0, 1]. Find:
(1) the distribution function of Z = |X - Y|; (2) EZ and DZ.
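A quick numerical sanity check for Problem 8 (the closed-form answers are F(z) = 1 - (1 - z)^2 for 0 ≤ z ≤ 1, EZ = 1/3 and DZ = 1/18): the Monte Carlo simulation below is only an illustration, not part of the exam.

import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 1_000_000)
y = rng.uniform(0.0, 1.0, 1_000_000)
z = np.abs(x - y)

print("empirical E[Z] =", z.mean(), " (exact 1/3)")
print("empirical D[Z] =", z.var(), " (exact 1/18, about 0.0556)")
print("empirical P(Z <= 0.5) =", (z <= 0.5).mean(), " (exact 1 - 0.5^2 = 0.75)")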

HIT Shenzhen Machine Learning 2008 (08ML) Exam Answers

1 Give the definitions or your comprehensions of the following terms. (12')
1.1 The inductive learning hypothesis (P17)
1.2 Overfitting (P49)
1.4 Consistent learner (P148)
2 Give brief answers to the following questions. (15')
2.2 If the size of a version space is |VS|, in general what is the smallest number of queries that may be required by a concept learner using the optimal query strategy to perfectly learn the target concept? (P27)
2.3 In general, decision trees represent a disjunction of conjunctions of constraints on the attribute values of instances; what expression does the following decision tree correspond to?
3 Give the explanation of inductive bias, and list the inductive bias of the CANDIDATE-ELIMINATION algorithm, of decision tree learning (ID3), and of the BACKPROPAGATION algorithm. (10')
4 How to solve overfitting in decision trees and neural networks? (10')
Solution:
Decision tree: stop growing the tree earlier; post-pruning.
Neural network: weight decay; use of a validation set.
5 Prove that the LMS weight update rule w_i <- w_i + η(V_train(b) - V̂(b))x_i performs a gradient descent to minimize the squared error. In particular, define the squared error E as in the text. Now calculate the derivative of E with respect to the weight w_i, assuming that V̂(b) is a linear function as defined in the text. Gradient descent is achieved by updating each weight in proportion to -∂E/∂w_i; therefore, you must show that the LMS training rule alters weights in this proportion for each training example it encounters. (E ≡ Σ_{<b, V_train(b)> ∈ training examples} (V_train(b) - V̂(b))²) (8')
Solution: Since V_train(b) <- V̂(Successor(b)), we have E = Σ (V_train(b) - V̂(b))² with V̂(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6, so
∂E/∂w_i = 2(V_train(b) - V̂(b)) · ∂(V_train(b) - V̂(b))/∂w_i = -2(V_train(b) - V̂(b))x_i.
The LMS rule w_i <- w_i + η(V_train(b) - V̂(b))x_i can therefore be written as w_i <- w_i + η'(-∂E/∂w_i) with η' = η/2.
Hence gradient descent is achieved by updating each weight in proportion to -∂E/∂w_i, and the LMS rule alters weights in exactly this proportion for each training example it encounters.
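To make the LMS rule in Question 5 concrete, here is a small illustrative sketch that trains the linear evaluation function V̂(b) = w0 + Σ w_i x_i with the LMS update; the board feature vectors, training values, learning rate and sweep count are fabricated purely to show the update step.

import numpy as np

def lms_update(w, x, v_train, eta=0.05):
    # one LMS step: w_i <- w_i + eta * (V_train(b) - V_hat(b)) * x_i
    x = np.concatenate(([1.0], x))      # x0 = 1 so that w[0] plays the role of w0
    v_hat = w @ x
    return w + eta * (v_train - v_hat) * x

# fabricated examples: (board feature vector x1..x6, training value V_train(b))
examples = [
    (np.array([12, 12, 0, 0, 1, 0], dtype=float), 10.0),
    (np.array([10, 11, 1, 0, 0, 2], dtype=float), 25.0),
    (np.array([8, 12, 0, 2, 3, 0], dtype=float), -40.0),
]

w = np.zeros(7)
for _ in range(100):                    # repeatedly sweep the training examples
    for x, v in examples:
        w = lms_update(w, x, v)
print("learned weights:", np.round(w, 3))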
6 True or false: if decision tree D2 is an elaboration of tree D1, then D1 is more-general-than D2. Assume D1 and D2 are decision trees representing arbitrary boolean functions, and that D2 is an elaboration of D1 if ID3 could extend D1 to D2. If true, give a proof; if false, a counterexample. (Definition: let h_j and h_k be boolean-valued functions defined over X; h_j is more_general_than_or_equal_to h_k (written h_j ≥_g h_k) if and only if (∀x ∈ X)[(h_k(x) = 1) → (h_j(x) = 1)], and h_j >_g h_k ⇔ (h_j ≥_g h_k) ∧ ¬(h_k ≥_g h_j).) (10')
The hypothesis is false. One counterexample is A XOR B: if A ≠ B the training examples are all positive, while if A = B the training examples are all negative; then, using ID3 to extend D1, the new tree D2 will be equivalent to D1, i.e., D2 is equal to D1.
7 Design a two-input perceptron that implements the boolean function A ∧ ¬B. Design a two-layer network of perceptrons that implements A XOR B. (10')
8 Suppose that a hypothesis space contains three hypotheses h1, h2, h3, and the posterior probabilities of these hypotheses given the training data are 0.4, 0.3 and 0.3 respectively. If a new instance x is encountered, which is classified positive by h1 but negative by h2 and h3, give the result and the detailed classification course of the Bayes optimal classifier. (10') (P125)
9 Suppose S is a collection of training-example days described by attributes including Humidity, which can have the values High or Normal. Assume S is a collection containing 10 examples, [7+, 3-]. Of these 10 examples, suppose 3 of the positive and 2 of the negative examples have Humidity = High, and the remainder have Humidity = Normal. Please calculate the information gain due to sorting the original 10 examples by the attribute Humidity. (log2 1=0, log2 2=1, log2 3=1.58, log2 4=2, log2 5=2.32, log2 6=2.58, log2 7=2.8, log2 8=3, log2 9=3.16, log2 10=3.32) (5')
Solution:
(a) Denote S = [7+, 3-]; then Entropy([7+, 3-]) = -(7/10)log2(7/10) - (3/10)log2(3/10) = 0.886.
(b) Gain(S, Humidity) = Entropy(S) - Σ_{v ∈ Values(Humidity)} (|S_v|/|S|)·Entropy(S_v), where Values(Humidity) = {High, Normal}.
S_High = {s ∈ S | Humidity(s) = High} = [3+, 2-], |S_High| = 5, Entropy(S_High) = -(3/5)log2(3/5) - (2/5)log2(2/5) = 0.972.
S_Normal = [4+, 1-], |S_Normal| = 5, Entropy(S_Normal) = -(4/5)log2(4/5) - (1/5)log2(1/5) = 0.72.
Thus Gain(S, Humidity) = 0.886 - (5/10)(0.972) - (5/10)(0.72) = 0.04. (A short numerical check appears after Question 10 below.)
10 Finish the following algorithm. (10')
(1) GRADIENT-DESCENT(training_examples, η)
Each training example is a pair of the form <x, t>, where x is the vector of input values and t is the target output value; η is the learning rate (e.g., 0.05).
- Initialize each w_i to some small random value.
- Until the termination condition is met, do:
  - Initialize each Δw_i to zero.
  - For each <x, t> in training_examples, do:
    - Input the instance x to the unit and compute the output o.
    - For each linear unit weight w_i, do: ______
  - For each linear unit weight w_i, do: w_i <- w_i + Δw_i
(2) FIND-S algorithm
- Initialize h to the most specific hypothesis in H.
- For each positive training instance x:
  - For each attribute constraint a_i in h:
    - If ______, then do nothing; else replace a_i in h by the next more general constraint that is satisfied by x.
- Output hypothesis h.
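A few lines of code can double-check the arithmetic in Question 9. Note that with exact logarithms the gain is about 0.035; the exam's 0.04 comes from the rounded log2 table it provides. This snippet is only a verification aid, not part of the answer sheet.

from math import log2

def entropy(pos, neg):
    # entropy of a collection with pos positive and neg negative examples
    total = pos + neg
    e = 0.0
    for k in (pos, neg):
        if k:
            p = k / total
            e -= p * log2(p)
    return e

s = (7, 3)                      # S = [7+, 3-]
high, normal = (3, 2), (4, 1)   # split on Humidity
gain = entropy(*s) - (5 / 10) * entropy(*high) - (5 / 10) * entropy(*normal)
print("Entropy(S) =", round(entropy(*s), 4))        # about 0.8813
print("Gain(S, Humidity) =", round(gain, 4))        # about 0.0348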

Machine Learning Exam Questions

I. Multiple choice (3 points each, 30 points total)
1. Which of the following is not an application scenario of machine learning? (  )
   A. Image recognition  B. Natural language processing  C. Traditional numerical computation  D. Stock price prediction
2. In supervised learning, when the difference between predicted and true values is large, which of the following is usually used to measure model performance? (  )
   A. Accuracy  B. Recall  C. Mean squared error  D. F1 score
3. Which of the following algorithms is not a clustering algorithm? (  )
   A. K-Means  B. Decision tree  C. Hierarchical clustering  D. Density-based clustering
4. For an overfitted model, which of the following can alleviate the problem? (  )
   A. Increase the amount of training data  B. Reduce model complexity  C. Add a regularization term  D. All of the above
5. Which statement about feature engineering is wrong? (  )
   A. Feature engineering is the process of transforming raw data into more meaningful and useful features  B. Feature selection is part of feature engineering  C. Feature engineering has little effect on the performance of machine learning models  D. Feature scaling can improve training efficiency
6. In deep learning, which of the following is not a common activation function? (  )
   A. Sigmoid  B. ReLU  C. Tanh  D. Logistic
7. Support vector machines (SVM) are mainly used to solve what kind of problem? (  )
   A. Regression  B. Classification  C. Clustering  D. Dimensionality reduction
8. Which optimization algorithm is commonly used to train neural networks? (  )
   A. Stochastic gradient descent (SGD)  B. Newton's method  C. Conjugate gradient  D. All of the above
9. Which statement about ensemble learning is wrong? (  )
   A. Random forest is an ensemble learning algorithm  B. Ensemble learning can improve model stability and generalization  C. The individual learners in an ensemble must be models of the same type  D. Ensemble learning builds a strong learner by combining multiple weak learners
10. For a binary classification problem with the confusion matrix below, what is the model's accuracy? (  )
                        predicted positive   predicted negative
   actual positive            80                   20
   actual negative            10                   90
   A. 80%  B. 90%  C. 70%  D. 85%
II. Fill in the blanks (3 points each, 30 points total)
1. Supervised learning in machine learning includes tasks such as ________, ________ and ________.

2. Common unsupervised learning algorithms include ________, ________ and ________.


2008 Spring Graduate Machine Learning Exam
Each major question is worth 10 points. Note: when presenting an algorithm, any non-standard (self-designed) parts must be explained; in particular, state the meaning of any parameters and variables you define yourself.

1. Consider the following sequence of positive and negative examples of the concept "pairs of people who live in the same room". Each training example describes an ordered pair of people, and each person is described by sex (male, female), hair color (black, brown), height (tall, short), and nationality (US, French).
+ <<male brown tall US>, <female black short US>>
+ <<male brown short French>, <female black short US>>
- <<female brown tall French>, <female black short French>>
+ <<male black tall French>, <female brown tall US>>
The hypothesis space defined over these instances is as follows: each hypothesis is a pair of 4-tuples, in which each value constraint may be a specific value (e.g. male, tall), "?" (accepting any value), or "Ø" (rejecting all values). For example, the hypothesis
<<male ? tall ?>, <female ? ? French>>
represents all ordered pairs in which the first person is a tall male (of any nationality and hair color) and the second person is a French female (of any hair color and height).
1) Hand-trace the Candidate-Elimination algorithm on the training examples and hypothesis representation given above; in particular, write out the specific and general boundaries of the version space after processing each training example.
2) List all hypotheses in the final version space.

2. Suppose a neural network has one hidden layer (a network with one hidden layer consists of an input layer, a hidden layer, and an output layer). Write out the steps of the backpropagation algorithm for training this network.

3. Short answers:
1) Briefly describe one method for handling overfitting in the ID3 algorithm;
2) Briefly describe one method for handling overfitting in neural networks.

4. The training examples are given in the table below:
Day  Outlook   Temperature  Humidity  Wind    PlayTennis
1    Sunny     Hot          High      Weak    No
2    Sunny     Hot          High      Strong  No
3    Overcast  Hot          High      Weak    Yes
4    Sunny     Mild         High      Weak    No
5    Sunny     Mild         Normal    Strong  Yes
6    Overcast  Mild         High      Strong  Yes
7    Overcast  Hot          Normal    Weak    Yes
Using the given training examples, classify with a naive Bayes classifier.
1) Compute p(Sunny|Yes), p(Sunny|No), p(Mild|Yes), p(Mild|No), p(High|Yes), p(High|No), p(Strong|Yes), p(Strong|No);
2) Given the unlabeled example <Outlook=Sunny, Temperature=Mild, Humidity=High, Wind=Strong>, compute its class. (When computing the class, first write out the formula, then substitute the concrete numbers.)
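As a sanity check for Question 4, the following sketch computes the naive Bayes quantities directly from the seven-row table above; it is an illustrative aid only, not the required written derivation.

from collections import Counter

# (Outlook, Temperature, Humidity, Wind, PlayTennis) for days 1-7
data = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
]
new = ("Sunny", "Mild", "High", "Strong")

labels = Counter(row[-1] for row in data)
scores = {}
for label, n_label in labels.items():
    score = n_label / len(data)                 # prior P(label)
    for i, value in enumerate(new):             # likelihoods P(attribute = value | label)
        n_match = sum(1 for row in data if row[-1] == label and row[i] == value)
        score *= n_match / n_label
    scores[label] = score

print(scores)                                   # unnormalized posteriors
print("prediction:", max(scores, key=scores.get))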
5. Write out the AQ algorithm.
6. Write out a conceptual clustering algorithm (or draw its flowchart).
7. 1) Write out a genetic algorithm;
2) Design an encoding scheme. The example set is given in the table below. Assume that in this problem rules take IF-THEN form: the antecedent (the part between IF and THEN) is a conjunction of attribute-value constraints, and the consequent (the part after THEN) is the class of the example. For instance, a concrete rule could be IF (Outlook = Overcast ∨ Rain) ∧ (Wind = Strong) THEN PlayTennis = yes. Design an encoding scheme for such rules;
3) Given the two rules IF (Outlook = Overcast ∨ Sunny) ∧ (Humidity = Normal) THEN PlayTennis = yes and IF (Outlook = Rain) ∧ (Wind = Strong) THEN PlayTennis = no, encode them with your scheme, perform a two-point crossover on the two strings (you may choose the crossover points), and finally state which rules the two offspring strings represent. (An illustrative encoding sketch follows the table below.)

Day  Outlook   Temperature  Humidity  Wind    PlayTennis
1    Sunny     Hot          High      Weak    No
2    Sunny     Hot          High      Strong  No
3    Overcast  Hot          High      Weak    Yes
4    Rain      Mild         High      Weak    Yes
5    Rain      Cool         Normal    Weak    Yes
6    Rain      Cool         Normal    Strong  No
7    Overcast  Cool         Normal    Strong  Yes
8    Sunny     Mild         High      Weak    No
9    Sunny     Cool         Normal    Weak    Yes
10   Rain      Mild         Normal    Weak    Yes
11   Sunny     Mild         Normal    Strong  Yes
12   Overcast  Mild         High      Strong  Yes
13   Overcast  Hot          Normal    Weak    Yes
14   Rain      Mild         High      Strong  No
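For parts 2) and 3) of Question 7, one common choice (in the style of Mitchell's GA rule representation) uses one bit per attribute value plus one bit for the class. The sketch below applies this encoding to the two given rules and performs a two-point crossover; the crossover points (positions 3 and 8) are my own illustrative choice, not prescribed by the exam.

# bit-string encoding: one bit per attribute value (1 = value allowed) plus one class bit
ATTRS = [("Outlook", ["Sunny", "Overcast", "Rain"]),
         ("Temperature", ["Hot", "Mild", "Cool"]),
         ("Humidity", ["High", "Normal"]),
         ("Wind", ["Weak", "Strong"])]

def encode(allowed, play_tennis_yes):
    # allowed: dict attribute -> set of allowed values (missing attribute = any value)
    bits = ""
    for name, values in ATTRS:
        vals = allowed.get(name, set(values))
        bits += "".join("1" if v in vals else "0" for v in values)
    return bits + ("1" if play_tennis_yes else "0")

def decode(bits):
    parts, i = [], 0
    for name, values in ATTRS:
        chunk = bits[i:i + len(values)]
        i += len(values)
        vals = [v for v, b in zip(values, chunk) if b == "1"]
        if len(vals) < len(values):               # omit unconstrained attributes
            parts.append(name + " = " + " v ".join(vals))
    return "IF " + " AND ".join(parts) + " THEN PlayTennis = " + ("yes" if bits[-1] == "1" else "no")

def two_point_crossover(p1, p2, a, b):
    return p1[:a] + p2[a:b] + p1[b:], p2[:a] + p1[a:b] + p2[b:]

r1 = encode({"Outlook": {"Sunny", "Overcast"}, "Humidity": {"Normal"}}, True)   # 11011101111
r2 = encode({"Outlook": {"Rain"}, "Wind": {"Strong"}}, False)                   # 00111111010
c1, c2 = two_point_crossover(r1, r2, 3, 8)        # crossover points chosen for illustration
for s in (r1, r2, c1, c2):
    print(s, "->", decode(s))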
8. Given:
- Target concept: SafeToStack(x, y)
- Training example: the following is a typical positive example of SafeToStack(Obj1, Obj2):
  On(Obj1, Obj2), Owner(Obj1, Fred), Type(Obj1, Box), Owner(Obj2, Louise), Type(Obj2, Endtable), Density(Obj1, 0.3), Color(Obj1, Red), Material(Obj1, Cardboard), Color(Obj2, Blue), Material(Obj2, Wood), Volume(Obj1, 2)
- Domain theory B:
  SafeToStack(x, y) ← ¬Fragile(y)
  SafeToStack(x, y) ← Lighter(x, y)
  Lighter(x, y) ← Weight(x, wx) ∧ Weight(y, wy) ∧ LessThan(wx, wy)
  Weight(x, w) ← Volume(x, v) ∧ Density(x, d) ∧ Equal(w, v*d)
  Weight(x, 5) ← Type(x, Endtable)
  Fragile(x) ← Material(x, Glass)
Give the result of applying analytical learning to this problem. Requirements:
1) Draw the explanation (proof) process;
2) Generalize the explanation with the regression algorithm (do not only show the substitutions of constants and variables for the literals in the explanation graph, but also explain the whole regression process: for each regression step, state which rule is used, what the substitution is, and which literals that substitution is applied to);
3) Give the rule finally obtained by analytical learning.

9. The FOIL algorithm learns Horn clauses. FOCL is an extension of FOIL that uses domain knowledge during learning. Following the procedure of the FOIL algorithm, give the FOCL algorithm.

10. Given a training example set, assume the attribute values of the examples are discrete and there are only two classes (positive and negative).
1) Propose a method that combines the GS algorithm and perceptron learning to accomplish the learning task, i.e., propose a "hybrid" algorithm that is a combination of the GS algorithm and perceptron learning. (A perceptron is a neural network with a single node; its structure, shown in the original figure, is: inputs x0 = 1, x1, ..., xn with weights w0, w1, ..., wn feeding a single thresholded weighted sum. An illustrative training sketch follows this question.)
2) Explain what benefits this hybrid method might bring.
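For reference on the perceptron half of Question 10, here is a minimal sketch of the standard perceptron training rule with {+1, -1} targets; the toy data (the boolean function A AND NOT B), learning rate and epoch count are illustrative assumptions, not part of the exam.

import numpy as np

def train_perceptron(X, y, eta=0.1, epochs=50):
    # perceptron rule: w <- w + eta * (t - o) * x, with o = sgn(w . x)
    X = np.hstack([np.ones((len(X), 1)), X])   # prepend x0 = 1 for the bias weight w0
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, t in zip(X, y):
            o = 1.0 if w @ x > 0 else -1.0
            w += eta * (t - o) * x
    return w

# toy example: learn A AND (NOT B) with inputs in {0, 1} and targets in {+1, -1}
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, 1, -1], dtype=float)
w = train_perceptron(X, y)
print("weights (w0, w1, w2):", w)
print("predictions:", [1 if w @ np.concatenate(([1.0], x)) > 0 else -1 for x in X])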