Beihang University Computer Science: Machine Learning 2011 Review Materials + Exam Papers


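March 2011 National Computer Rank Examination, Level 3 Database: Questions and Answers

I. Multiple-choice questions (1 point each, 60 points in total). For each question below, exactly one of the four options A, B, C, D is correct. Mark the correct option in the corresponding position on the answer sheet; answers written on the exam paper receive no credit.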

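1. One of the core principles of how modern computer systems work is the "stored program" concept. Who first proposed this design idea?
A. Alan Turing  B. Gordon Moore  C. John von Neumann  D. Bill Gates
Answer: C
Explanation: Von Neumann's "stored program" principle has two parts: (1) the finished program and the raw data are stored in the computer's memory, which is the "stored program" part; (2) the computer fetches the stored instructions one at a time, decodes each one, and performs the operation it specifies, which is the "program control" part.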

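2. A bus is a shared information transfer channel that connects the components of a computer. Which of the following is not an I/O bus?
A. PCI  B. DMA  C. USB  D. 1394
Answer: B
Explanation: Common I/O buses today include: (1) PCI, a local bus that is not tied to any particular processor, supports many kinds of peripherals, and maintains high performance at high clock frequencies; (2) USB (Universal Serial Bus), a technical standard for connecting serial I/O devices; (3) 1394, a high-speed serial bus standard developed for consumer electronics. DMA (direct memory access) is a data transfer technique rather than a bus.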

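3. Which of the following statements about local area networks (LANs) is correct?
A. They cover a large geographic area  B. They have a high bit error rate  C. They have a low data transfer rate  D. They do not include all layers of the OSI reference model
Answer: D
Explanation: The main technical characteristics of a LAN are: (1) a LAN covers a limited geographic area; (2) it provides a high-quality transmission environment with high data rates (10 to 1000 Mbps) and a low bit error rate; (3) it is usually owned by a single organization and is easy to build, maintain, and extend; (4) the main technical factors that determine its characteristics are the network topology, the transmission medium, and the medium access control method; (5) in terms of the medium access control method, LANs are divided into shared LANs and switched LANs.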

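4. Which protocol is used to read mail from a mail server?
A. SMTP  B. POP3  C. MIME  D. EMAIL
Answer: B

5. To strengthen security between networks, a function is set up that controls and monitors the exchange of information and access between networks. This function is:
A. Message authentication  B. Access control  C. File protection  D. Firewall
Answer: D
Explanation: A firewall is an integral part of a network security policy. It manages network security effectively by controlling and monitoring information exchange and access between networks.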

Machine Learning Midterm Exam (midterm)

10-701 Midterm Exam, Spring 2011

1. Personal info:
   • Name:
   • Andrew account:
   • E-mail address:
2. There are 14 numbered pages in this exam (including this cover sheet).
3. You can use any material you brought: any book, notes, and print outs. You cannot use materials brought by other students.
4. No computers, PDAs, phones or Internet access.
5. If you need more room to answer a question, use the back of the preceding page.
6. Work efficiently. Consider answering all of the easier questions first.
7. There is one optional extra credit question, which will not affect the grading curve. It will be used to bump your grade up, without affecting anyone else's grade.
8. You have 80 minutes, the test has 100 points. Good luck!

Question  Topic                                  Max. score  Score
1         Short Questions                        20
2         Bayes Nets                             23
3         Decision Surfaces and Training Rules   12
4         Linear Regression                      20
5         Conditional Independence Violation     25
6         [Extra Credit] Violated Assumptions    6

1 [20 Points] Short Questions

1.1 True or False

Answer each of the following True or False. If True, give a short justification. If False, a counter-example or convincing one-sentence explanation.

1. [2 pts] If we train a Naive Bayes classifier using infinite training data that satisfies all of its modeling assumptions (e.g., conditional independence), then it will achieve zero training error over these training examples.

2. [2 pts] If we train a Naive Bayes classifier using infinite training data that satisfies all of its modeling assumptions (e.g., conditional independence), then it will achieve zero true error over test examples drawn from this same distribution.

3. [2 pts] Every Bayes Net defined over 10 variables X1, X2, ..., X10 tells how to factor the joint probability distribution P(X1, X2, ..., X10) into the product of exactly 10 terms.

Consider the three Bayes Nets shown below:

[Figure: Bayes Net graphs A, B, and C]

4. [3 pts] True or false: Every joint distribution P(X1, X2, X3) that can be defined by adding Conditional Probability Distributions (CPD) to Bayes Net graph A can also be expressed by appropriate CPD's for Bayes Net graph B.

5. [3 pts] True or false: Every joint distribution P(X1, X2, X3) that can be defined by adding Conditional Probability Distributions to Bayes Net graph A can also be expressed by appropriate CPD's for Bayes Net graph C.

1.2 Quick questions

Answer each of the following in one or two sentences, in the space provided.

1. [2 pts] Prove that P(X1 | X2) P(X2) = P(X2 | X1) P(X1). (Hint: This is a two-line proof.)

2. [3 pts] Consider a decision tree learner applied to data where each example is described by 10 boolean variables X1, X2, ..., X10. What is the VC dimension of the hypothesis space used by this decision tree learner?

3. [3 pts] Consider the plot below showing training and test set accuracy for decision trees of different sizes, using the same set of training data to train each tree. Describe in one sentence how the training data curve (solid line) will change if the number of training examples approaches infinity. In a second sentence, describe what will happen to the test data curve under the same condition.

[Figure: training and test set accuracy versus decision tree size]

2 [23 Points] Bayes Nets

2.1 [17 pts] Inference

In the following graphical model, A, B, C, and D are binary random variables.

[Figure: Bayes Net over A, B, C, and D]

1. [2 pts] How many parameters are needed to define the Conditional Probability Distributions (CPD's) for this Bayes Net?

2. [2 pts] Write an expression for the probability P(A=1, B=1, C=1, D=1) in terms of the Bayes Net CPD parameters. Use notation like P(C=1 | A=0) to denote specific parameters in the CPD's.

3. [3 pts] Write an expression for P(A=0 | B=1, C=1, D=1) in terms of the Bayes Net Conditional Probability Distribution (CPD) parameters.

4. [2 pts] True or False (give brief justification): C is conditionally independent of B given D.

5. [2 pts] True or False (give brief justification): C is conditionally independent of B given A.

Suppose we use EM to train the above Bayes Net from the partially labeled data given below, first initializing all Bayes net parameters to 0.5.

A  B  C  D
1  0  1  0
1  ?  0  1
1  1  0  ?
0  ?  0  ?
0  1  0  ?

6. [2 pts] How many distinct quantities will be updated during the first M step?

7. [2 pts] How many distinct quantities will be estimated during the first E step?

8. [2 pts] When EM converges, what will be the final estimate for P(C=0 | A=1)? [Hint: You do not need a calculator.]

2.2 [6 pts] Constructing a Bayes net

Draw a Bayes net over the random variables {A, B, C, D} where the following conditional independence assumptions hold. Here, $X \perp Y \mid Z$ means X is conditionally independent of Y given Z, $X \not\perp Y \mid Z$ means X and Y are not conditionally independent given Z, and $\emptyset$ stands for the empty set.

• $A \perp B \mid \emptyset$
• $A \not\perp D \mid B$
• $A \perp D \mid C$
• $A \not\perp C \mid \emptyset$
• $B \not\perp C \mid \emptyset$
• $A \not\perp B \mid D$
• $B \perp D \mid A, C$

3 [12 Points] Decision Surfaces and Training Rules

Consider a classification problem with two boolean variables X1, X2 ∈ {0, 1} and label Y ∈ {0, 1}. In Figure 1 we show two positive ("+") and two negative ("-") examples.

Figure 1: Two positive examples and two negative examples.

Question [2 pts]: Draw (or just simply describe) a decision tree that can perfectly classify the four examples in Figure 1.

Question [3 pts]: In the class we learned the training rule to grow a decision tree: we start from a single root node and iteratively split each node using the "best" attribute selected by maximizing the information gain of the split. We will stop splitting a node if: 1) examples in the node are already pure; or 2) we cannot find any single attribute that gives a split with positive information gain. If we apply this training rule to the examples in Figure 1, will we get a decision tree that perfectly classifies the examples? Briefly explain what will happen.

Question [5 pts]: Suppose we learn a Naive Bayes classifier from the examples in Figure 1, using MLE (maximum likelihood estimation) as the training rule. Write down all the parameters and their estimated values (note: both P(Y) and P(Xi | Y) should be Bernoulli distributions). Also, does this learned Naive Bayes perfectly classify the four examples?

Question [2 pts]: Is there any logistic regression classifier using X1 and X2 that can perfectly classify the examples in Figure 1? Why?

4 [20 Points] Linear Regression

Consider a simple linear regression model in which y is the sum of a deterministic linear function of x, plus random noise $\epsilon$:

$$y = wx + \epsilon$$

where x is the real-valued input; y is the real-valued output; and w is a single real-valued parameter to be learned. Here $\epsilon$ is a real-valued random variable that represents noise, and that follows a Gaussian distribution with mean 0 and standard deviation $\sigma$; that is, $\epsilon \sim N(0, \sigma)$.

(a) [3 pts] Note that y is a random variable because it is the sum of a deterministic function of x, plus the random variable $\epsilon$. Write down an expression for the probability distribution governing y, in terms of N(), $\sigma$, w and x.

(b) [3 pts] You are given n i.i.d. training examples $\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$ to train this model. Let $Y = (y_1, \ldots, y_n)$ and $X = (x_1, \ldots, x_n)$, write an expression for the conditional data likelihood: $p(Y \mid X, w)$.

(c) [9 pts] Here you will derive the expression for obtaining a MAP estimate of w from the training data. Assume a Gaussian prior over w with mean 0 and standard deviation $\tau$ (i.e. $w \sim N(0, \tau)$). Show that finding the MAP estimate $w^*$ is equivalent to solving the following optimization problem:

$$w^* = \arg\min_w \; \frac{1}{2} \sum_{i=1}^{n} (y_i - w x_i)^2 + \frac{\lambda}{2} w^2$$

Also express the regularization parameter $\lambda$ in terms of $\sigma$ and $\tau$.

(d) [5 pts] Above we assumed a zero-mean prior for w, which resulted in the usual $\frac{\lambda}{2} w^2$ regularization term for linear regression. Sometimes we may have prior knowledge that suggests w has some value other than zero. Write down the revised objective function that would be derived if we assume a Gaussian prior on w with mean $\mu$ instead of zero (i.e., if the prior is $w \sim N(\mu, \tau)$).

5 [25 Points] Conditional Independence Violation

5.1 Naive Bayes without Conditional Independence Violation

Table 1: P(Y)
Y = 0   Y = 1
0.8     0.2

Table 2: P(X1 | Y)
        X1 = 0   X1 = 1
Y = 0   0.7      0.3
Y = 1   0.3      0.7

Consider a binary classification problem with variable X1 ∈ {0, 1} and label Y ∈ {0, 1}. The true generative distribution P(X1, Y) = P(Y) P(X1 | Y) is shown as Table 1 and Table 2.

Question [4 pts]: Now suppose we have trained a Naive Bayes classifier, using infinite training data generated according to Table 1 and Table 2. In Table 3, please write down the predictions from the trained Naive Bayes for different configurations of X1. Note that $\hat{Y}(X_1)$ in the table is the decision about the value of Y given X1. For decision terms in the table, write down either $\hat{Y} = 0$ or $\hat{Y} = 1$; for probability terms in the table, write down the actual values (and the calculation process if you prefer, e.g., 0.8 ∗ 0.7 = 0.56).

Table 3: Predictions from the trained Naive Bayes
         P̂(X1, Y=0)   P̂(X1, Y=1)   Ŷ(X1)
X1 = 0
X1 = 1

Question [3 pts]: What is the expected error rate of this Naive Bayes classifier on testing examples that are generated according to Table 1 and Table 2? In other words, $P(\hat{Y}(X_1) \neq Y)$ when (X1, Y) is generated according to the two tables. Hint: $P(\hat{Y}(X_1) \neq Y) = P(\hat{Y}(X_1) \neq Y, X_1 = 0) + P(\hat{Y}(X_1) \neq Y, X_1 = 1)$.

5.2 Naive Bayes with Conditional Independence Violation

Consider two variables X1, X2 ∈ {0, 1} and label Y ∈ {0, 1}. Y and X1 are still generated according to Table 1 and Table 2, and then X2 is created as a duplicated copy of X1.

Question [6 pts]: Now suppose we have trained a Naive Bayes classifier, using infinite training data that are generated according to Table 1, Table 2 and the duplication rule. In Table 4, please write down the predictions from the trained Naive Bayes for different configurations of (X1, X2). For probability terms in the table, you can write down just the calculation process (e.g., one entry might be 0.8 ∗ 0.3 ∗ 0.3 = 0.072, and you can just write down 0.8 ∗ 0.3 ∗ 0.3 to save some time). Hint: the Naive Bayes classifier does assume that X2 is conditionally independent of X1 given Y.

Table 4: Predictions from the trained Naive Bayes
                 P̂(X1, X2, Y=0)   P̂(X1, X2, Y=1)   Ŷ(X1, X2)
X1 = 0, X2 = 0
X1 = 1, X2 = 1
X1 = 0, X2 = 1
X1 = 1, X2 = 0

Question [3 pts]: What is the expected error rate of this Naive Bayes classifier on testing examples that are generated according to Table 1, Table 2 and the duplication rule?

Question [3 pts]: Compared to the scenario without X2, how does the expected error rate change (i.e., increase or decrease)? In Table 4, the decision rule $\hat{Y}$ on which configuration is responsible for this change? What actually happened to this decision rule? (You need to briefly answer: increase or decrease, the responsible configuration, and what happened.)

5.3 Logistic Regression with Conditional Independence Violation

Question [2 pts]: Will logistic regression suffer from having an additional variable X2 that is actually a duplicate of X1? Intuitively, why (hint: model assumptions)?

Now we will go beyond the intuition. We have a training set D1 of L examples $D_1 = \{(X_1^1, Y^1), \ldots, (X_1^L, Y^L)\}$. Suppose we generate another training set D2 of L examples $D_2 = \{(X_1^1, X_2^1, Y^1), \ldots, (X_1^L, X_2^L, Y^L)\}$, where in each example X1 and Y are the same as in D1 and then X2 is a duplicate of X1. Now we learn a logistic regression from D1, which should contain two parameters: w0 and w1; we also learn another logistic regression from D2, which should have three parameters: w0, w1 and w2.

Question [4 pts]: First, write down the training rule (maximum conditional likelihood estimation) we use to estimate (w0, w1) and (w0, w1, w2) from data. Then, given the training rule, what is the relationship between (w0, w1) and (w0, w1, w2) we estimated from D1 and D2? Use this fact to argue whether or not the logistic regression will suffer from having an additional duplicate variable X2.

6 [Extra Credit 6 pts] Violated assumptions

Extra Credit Question: This question is optional; do not attempt it until you have completed the rest of the exam. It will not affect the grade curve for the exam, though you will receive extra points if you answer it.

Let A, B, and C be boolean random variables governed by the joint distribution P(A, B, C). Let D be a dataset consisting of n data points, each of which is an independent draw from P(A, B, C), where all three variables are fully observed.

Consider the following Bayes Net, which does not necessarily capture the correct conditional independencies in P(A, B, C).

[Figure: Bayes Net over A, B, and C]

Let $\hat{P}$ be the distribution learned after this Bayes net is trained using D. Show that for any number $\epsilon$, $0 < \epsilon \le 1$, there exists a joint distribution P(A, B, C) such that P(C=1 | A=1) = 1, but such that the Bayes net shown above, when trained on D, will (with probability 1) learn CPTs where:

$$\hat{P}(C=1 \mid A=1) = \sum_{b \in \{0,1\}} \hat{P}(C=1 \mid B=b)\, \hat{P}(B=b \mid A=1) \le \epsilon$$

as |D| approaches ∞. Assume that the Bayes net is learning on the basis of the MLE.

You should solve this problem by defining a distribution with the above property. Your final solution may be either in the form of a fully specified joint distribution (i.e. you write out the probabilities for each assignment of the variables A, B, and C), or in the form of a Bayes net with fully specified CPTs. (Hint: the second option is easier.)
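The scenario in Section 5 of the exam above (a Naive Bayes classifier trained on infinite data, first on X1 alone and then on X1 plus a duplicated copy X2) can be explored numerically. The sketch below is only an illustration under stated assumptions: with infinite data the learned parameters are taken to equal the true values in Table 1 and Table 2, and the learned P(X2 | Y) is taken to equal P(X1 | Y) because X2 copies X1. The function names `nb_predict` and `expected_error` are hypothetical.

```python
from itertools import product

# True distributions from Table 1 and Table 2 of the exam.
P_Y = {0: 0.8, 1: 0.2}
P_X1_GIVEN_Y = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.3, 1: 0.7}}  # indexed as [y][x1]

def nb_predict(features):
    """Return (predicted label, per-label Naive Bayes scores).

    Every feature is scored with P(X1 | Y); this is the assumption that the
    duplicated feature X2 has the same learned conditional as X1.
    """
    scores = {}
    for y in (0, 1):
        score = P_Y[y]
        for x in features:
            score *= P_X1_GIVEN_Y[y][x]
        scores[y] = score
    return max(scores, key=scores.get), scores

def expected_error(n_copies):
    """Expected error when X1 is presented to the classifier n_copies times."""
    err = 0.0
    for x1, y in product((0, 1), (0, 1)):
        true_joint = P_Y[y] * P_X1_GIVEN_Y[y][x1]  # true P(X1 = x1, Y = y)
        y_hat, _ = nb_predict([x1] * n_copies)
        if y_hat != y:
            err += true_joint
    return err

error_single = expected_error(1)      # classifier sees X1 only
error_duplicated = expected_error(2)  # classifier sees X1 and its copy X2
print(error_single, error_duplicated)
```

Comparing the two printed values shows how counting the same evidence twice changes the decision on one configuration of the features.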

Beihang University 2011 Graduate Entrance Exam Answers: 911 Comprehensive Materials Science

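2011 Principles of Physical Metallurgy: Reference Answers
Question 9
Solution: (1) Basic mechanisms of diffusion:
a) Interstitial mechanism: in an interstitial solid solution, solute atoms diffuse by jumping from one interstitial site to another. Small interstitial atoms such as carbon, nitrogen, and hydrogen diffuse most readily by this mechanism.
b) Vacancy mechanism: crystals contain vacancies. Self-diffusion in pure metals and diffusion in substitutional solid solutions take place by atoms exchanging positions with vacancies. This mode of diffusion is called the vacancy mechanism. In most cases atomic diffusion proceeds by the vacancy mechanism.
Question 10
a) Because the resistance to transformation is large, the undercooling of the transformation is generally large. b) Solid-state transformations all nucleate heterogeneously. c) Crystal defects have a decisive influence on the nucleation and growth of solid-state transformations and on the resulting microstructure and properties. d) The new phase and the parent phase often have a strict crystallographic orientation relationship. e) The transformation path is complex, often passing through a sequence of stages such as solute segregation within the grains, then precipitation of transition phases, then precipitation of the stable phase.
Question 11
1. Controlling the solidification process:
a) Increase the cooling rate to obtain a larger undercooling (lower the pouring temperature, improve the cooling capacity of the mold, reduce the wall thickness of the part, use forced cooling, use internal and external chills, and so on), so that the liquid metal forms a large number of nuclei at the same time. Both the nucleation rate and the crystal growth rate increase, but the nucleation rate increases faster.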
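Question 12
a) Grain refinement strengthening: by the Hall-Petch relation $\sigma_s = \sigma_0 + k d^{-1/2}$, refining the grains increases $\sigma_s$, that is, the yield strength, so grain refinement strengthens the material. The main reasons are that the atomic arrangement at grain boundaries is irregular, impurity atoms segregate at grain boundaries and form various atmospheres, and the grains on either side of a boundary have different orientations. At room temperature, grain boundaries therefore impede dislocation motion and raise the resistance to plastic deformation; macroscopically, grain boundaries have higher strength and hardness than grain interiors.
b) Work hardening: interactions between dislocations cause dislocation multiplication, so the dislocation density rises sharply and dislocations become hard to move. Dislocation intersection forms many jogs that pin dislocations, intersection also forms dislocation networks, and dislocation reactions form Lomer or Lomer-Cottrell locks. All of these hinder dislocation glide and promote hardening.
c) Solid solution strengthening: strengthening produced by the interaction between dislocations and solute atoms. Solute atoms cause lattice distortion and an elastic strain field that impede normal dislocation motion; solute atoms may interact with dislocations elastically, chemically, and electrostatically; and a moving dislocation changes the distribution of solute atoms and raises the energy of the system, which also increases the resistance to slip. (A simpler explanation: solute atoms segregate to dislocations and form solute atmospheres, which lowers the strain energy of the dislocation and the energy of the system, so the dislocation becomes stable and hard to move.)
d) Particle strengthening: (a) Bypassing the particles: the strengthening effect depends on particle size and particle spacing and is independent of the nature of the particles. When a dislocation moving on the slip plane is blocked by second-phase particles, and the particle size and spacing are large, the dislocation line bows around the particles and leaves dislocation loops surrounding them while the original dislocation continues forward; moving in this way, the dislocation meets a large resistance. (b) Cutting the particles: when the second phase is not strong, it can deform together with the matrix, and when the force between the particle and the dislocation is not enough to stop the dislocation at the particle, the dislocation cuts straight through the particle and divides it into two parts. Besides the stress field around the particle, the particle itself also resists dislocation motion. The strengthening effect depends on the nature of the particles (coherency misfit at the interface, interfacial energy, difference in elastic modulus, difference in stacking fault energy, degree of order, and so on).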

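Machine Learning Question Bank

I. Maximum likelihood

1. ML estimation of exponential model (10)
A Gaussian distribution is often used to model data on the real line, but is sometimes inappropriate when the data are often close to zero but constrained to be nonnegative. In such cases one can fit an exponential distribution, whose probability density function is given by

$$p(x) = \frac{1}{b} e^{-x/b}$$

Given N observations $x_i$ drawn from such a distribution:
(a) Write down the likelihood as a function of the scale parameter b.
(b) Write down the derivative of the log likelihood.
(c) Give a simple expression for the ML estimate for b.

2. The same question with a Poisson distribution:

$$p(x \mid \theta) = \frac{\theta^{x} e^{-\theta}}{x!}, \quad x = 0, 1, 2, \ldots$$

$$l(\theta) = \sum_{i=1}^{N} \log p(x_i \mid \theta) = \sum_{i=1}^{N} \left[ x_i \log\theta - \theta - \log x_i! \right] = \log\theta \sum_{i=1}^{N} x_i - N\theta - \sum_{i=1}^{N} \log x_i!$$

II. Bayes

1. Applying Bayes' formula
Suppose that on a multiple-choice exam question a student knows the correct answer with probability p and guesses with probability 1 − p. Assume that a student who knows the correct answer gets the question right with probability 1, and that a student who guesses gets it right with probability 1/m, where m is the number of options.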
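The maximum likelihood questions above can be checked numerically. The sketch below is an illustration rather than a model answer: it assumes the exponential density with scale b and the Poisson mass function with rate θ as stated in the questions, uses small made-up data sets (`exp_data`, `count_data`), and checks that the sample mean zeroes the derivative of the exponential log likelihood and locally maximizes both log likelihoods. All function names are hypothetical.

```python
import math

def exp_log_likelihood(b, xs):
    """Log likelihood of the exponential model p(x) = (1/b) exp(-x/b)."""
    return -len(xs) * math.log(b) - sum(xs) / b

def exp_score(b, xs):
    """Derivative of the exponential log likelihood with respect to b."""
    return -len(xs) / b + sum(xs) / b ** 2

def poisson_log_likelihood(theta, xs):
    """Poisson log likelihood: sum of x*log(theta) - theta - log(x!)."""
    return sum(x * math.log(theta) - theta - math.lgamma(x + 1) for x in xs)

# Made-up observations, used only for illustration.
exp_data = [0.2, 1.5, 0.7, 3.1, 0.5]
count_data = [0, 2, 1, 4, 3]

# Candidate ML estimates: the sample means.
b_hat = sum(exp_data) / len(exp_data)
theta_hat = sum(count_data) / len(count_data)

print("b_hat =", b_hat, "score at b_hat =", exp_score(b_hat, exp_data))
print("theta_hat =", theta_hat)
```

Setting the derivative $-N/b + \sum_i x_i / b^2$ to zero gives $\hat{b} = \frac{1}{N}\sum_i x_i$, and the same steps applied to $l(\theta)$ give $\hat{\theta} = \frac{1}{N}\sum_i x_i$, which is what the script checks on the sample data.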

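2011 Beijing Jiaotong University Computer Science Second-Round Exam: Discrete Mathematics Questions (Recalled Version)

Overall this year's exam was not hard; the questions were all standard.

I know how laborious and frustrating it is to gather material before the exam, so I am leaving this to make things a little easier for next year's candidates.

This took some effort, so thanks for replying!
I. Computation problems
1. Given a well-formed formula, find its principal disjunctive normal form and principal conjunctive normal form, and give its satisfying and falsifying assignments.
2. Given two sets A and B, write down relations R and S from A to B as specified, then find the inverse and the composition of the relations, and their reflexive, symmetric, and transitive closures.
3. I cannot recall this one right now; I will add it later.
II. Short-answer questions
1. Given a verbal description of a graph, decide whether it is an Eulerian graph, a Hamiltonian graph, and a strongly connected graph, and justify each answer.
2. Let A and B be sets with |A| = m and |B| = n. How many relations from A to B are there? How many mappings? How many bijections? Justify your answers.
3. The police are investigating a theft. The known facts are:
(1) A or B stole the tape recorder.
(2) If A stole the tape recorder, then the crime cannot have taken place before midnight.
(3) If B's testimony is true, then the light in the room was still on at midnight.
(4) If B's testimony is false, then the crime took place before midnight.
(5) The light in the room was off at midnight.
Question: who stole the tape recorder, A or B?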
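The theft puzzle above can be checked by brute force over truth assignments. The sketch below is one possible encoding, not the expected exam solution (which would be a formal derivation): each of the five facts becomes a Boolean constraint over five hypothetical propositions (`a_stole`, `b_stole`, `before_midnight`, `b_truthful`, `light_off`), and every assignment that satisfies all five is collected.

```python
from itertools import product

def consistent(a_stole, b_stole, before_midnight, b_truthful, light_off):
    """True when an assignment satisfies all five known facts."""
    return all([
        a_stole or b_stole,                      # (1) A or B stole the recorder
        (not a_stole) or (not before_midnight),  # (2) A stole -> not before midnight
        (not b_truthful) or (not light_off),     # (3) B truthful -> light still on
        b_truthful or before_midnight,           # (4) B lying -> before midnight
        light_off,                               # (5) the light was off at midnight
    ])

# Enumerate all 2^5 truth assignments and keep the consistent ones.
solutions = [v for v in product([False, True], repeat=5) if consistent(*v)]

# Project each solution onto (a_stole, b_stole).
thieves = {(a, b) for a, b, *_ in solutions}
print(solutions)
print(thieves)
```

The enumeration mirrors the chain of inferences one would write by hand: (5) with (3) shows B's testimony is false, (4) then places the crime before midnight, (2) rules out A, and (1) leaves B.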
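III. Proof problems
1. Prove: for any three sets A, B, and C, A − (B ∪ C) = (A − B) ∩ (A − C). (Note: the problem was of this type. I do not remember the exact statement, but the idea and the method are the same. If anyone remembers it, please add it.)
2. Subgroup proof: G is a group and S is the subset of G given by S = {a | a ∈ G, and ga = ag for every g ∈ G}. Prove that S is a subgroup of G.
3. A proof by equivalence calculus (you know the type).
4. Prove that every undirected tree is a bipartite graph.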

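[Wangdao Forum] 2011 National Unified Computer Science Exam: Questions + Explanations

Answer: A. In option A, after reaching 91 the search moves on to 24, which means every key visited later on this path must be smaller than 91, so the later 94 is impossible.

8. Which of the following statements about graphs are correct?
I. A cycle is a simple path
II. For storing a sparse graph, an adjacency matrix uses less space than an adjacency list
III. If a directed graph has a topological ordering, then the graph contains no cycle
A. II only  B. I and II only  C. III only  D. I and III only
Answer: C. I: a cycle corresponds to a path, and a simple cycle corresponds to a simple path. II: exactly the opposite is true. III: this is a necessary condition for a topological ordering. So the answer is C.

9. To improve the lookup efficiency of a hash table, which of the following are correct measures?
I. Increase the load factor
II. Design a hash function with few collisions
III. Avoid clustering when resolving collisions
A. I only  B. II only  C. I and II only  D. II and III only
Answer: B. Statement III is wrong because of the word "avoid".

10. To implement quicksort, which storage scheme is best for the sequence to be sorted?
A. Sequential storage  B. Hashed storage  C. Linked storage  D. Indexed storage
Answer: A. Internal sorting uses a sequential storage structure.

11. The sequence 25, 13, 10, 12, 9 is a max-heap. A new element 18 is inserted at the end of the sequence and the sequence is adjusted back into a max-heap. How many comparisons between elements are made during the adjustment?
A. 1  B. 2  C. 4  D. 5
Answer: B. 18 is first compared with 10 and the two are swapped; it is then compared with 25 and no swap is made. That is two comparisons.

12. Which of the following is a measure of floating-point operation speed?
A. MIPS  B. CPI  C. IPC  D. MFLOPS
Answer: D. A giveaway question.

13. Data of type float are usually represented in IEEE 754 single-precision floating-point format. If the compiler places a float variable x in a 32-bit floating-point register FR1, and x = -8.25, then the content of FR1 is
A. C104 0000H  B. C242 0000H  C. C184 0000H  D. C1C2 0000H
Answer: A. In binary, x = -1000.01 = -1.00001 × 2^11 (the exponent is written in binary, that is, 2^3). The IEEE 754 standard hides the leading "1", and E − 127 = 3, so E = 130 = 1000 0010 (binary). The value is stored as 1 sign bit + 8 exponent bits (including the exponent sign) + 23 fraction bits. FR1 therefore holds 1 | 1000 0010 | 000 0100 0000 0000 0000 0000, that is, 1100 0001 0000 0100 0000 0000 0000 0000, which is C104 0000H.

14. Which of the following kinds of memory does not use random access?
A. EPROM  B. CDROM  C. DRAM  D. SRAM
Answer: B. Optical discs use sequential access.

15. A computer's memory is byte-addressed and its main memory address space is 64 MB. A 32 MB main memory is built from 4M × 8-bit RAM chips. What is the minimum width of the memory address register (MAR)?
A. 22 bits  B. 23 bits  C. 25 bits  D. 26 bits
Answer: D. The main memory address space is 64 MB, so the MAR must be able to address 64M locations, which takes 26 bits. The size of the memory actually installed does not determine the width of the MAR.

16. Displacement addressing forms the effective address by adding the content of a register to a formal address. Which of the following addressing modes is not a displacement addressing mode?
A. Indirect addressing  B. Base addressing  C. Relative addressing  D. Indexed addressing
Answer: A. Indirect addressing does not need a register: EA = (A). Base addressing: EA = A + content of the base register. Relative addressing: EA = A + content of the PC. Indexed addressing: EA = A + content of the index register.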
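Two of the worked answers above (the IEEE 754 encoding in question 13 and the comparison count in question 11) are easy to reproduce in code. The sketch below is only a cross-check under stated assumptions: it uses Python's standard `struct` module to obtain the big-endian single-precision encoding, and a plain sift-up insertion into an array-based max-heap. The helper names `float32_hex`, `float32_fields`, and `insert_counting_comparisons` are hypothetical.

```python
import struct

def float32_hex(value):
    """Big-endian IEEE 754 single-precision encoding as an uppercase hex string."""
    return struct.pack(">f", value).hex().upper()

def float32_fields(value):
    """Split the 32-bit encoding into (sign, biased exponent, fraction)."""
    bits = int.from_bytes(struct.pack(">f", value), "big")
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

def insert_counting_comparisons(heap, value):
    """Append value to an array max-heap and sift it up.

    Returns the new heap and the number of element comparisons made.
    """
    heap = heap + [value]
    i = len(heap) - 1
    comparisons = 0
    while i > 0:
        parent = (i - 1) // 2
        comparisons += 1
        if heap[i] > heap[parent]:
            heap[i], heap[parent] = heap[parent], heap[i]
            i = parent
        else:
            break
    return heap, comparisons

print(float32_hex(-8.25))      # encoding discussed in question 13
print(float32_fields(-8.25))   # sign, biased exponent E, 23-bit fraction
print(insert_counting_comparisons([25, 13, 10, 12, 9], 18))
```

The biased exponent field printed by `float32_fields` is the E = 130 from the explanation, and the heap helper reports the two comparisons (against 10, then against 25) described in the answer to question 11.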

Machine Learning Midterm Exam, Solutions (midterm_sol)

1. [2 pts] If we train a Naive Bayes classifier using infinite training data that satisfies all of its modeling assumptions (e.g., conditional independence), then it will achieve zero training error over these training examples.
will be used to bump your grade up, without affecting anyone else's grade. 8. You have 80 minutes, the test has 100 points. Good luck!
Question Topic
Max. score Score
SOLUTION: The VC dimension is 2^10, because we can shatter 2^10 examples using a tree with 2^10 leaf nodes, and we cannot shatter 2^10 + 1 examples (since in that case we must have duplicated examples and they can be assigned with conflicting labels).
SOLUTION: False. A can represent distributions where X1 can depend on X3 given no information about X2, whereas graph C cannot.

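December 2011 Open University Online Exam: Fundamentals of Computer Applications, Unified Exam Questions

Fundamentals of Computer Applications 1

I. Single-choice questions

1. The first electronic computer was successfully developed in the United States in 1946. Its English abbreviation is ______.
A: ENIAC  B: EDVAC  C: EDSAC  D: MARK
Answer: A

2. Computers can be classified in several ways. Which of the following is not a category in the classification by how the computer processes data?
A: Electronic digital computer  B: General-purpose computer  C: Electronic analog computer  D: Hybrid digital-analog computer
Answer: B

3. Which of the following is not a characteristic of electronic digital computers?
A: Fast computation  B: High computational precision  C: Bulky shape  D: Great versatility
Answer: C

4. Using a computer to imitate high-level human thinking is called ____.
A: Data processing  B: Automatic control  C: Computer-aided systems  D: Artificial intelligence
Answer: D

5. In computing, the attributes of objective things are represented as ______.
A: Data  B: Numeric values  C: Analog quantities  D: Information
Answer: A

6. The main unit of a computer consists mainly of ____.
A: The arithmetic unit and the control unit  B: The central processing unit and main memory  C: The arithmetic unit and peripherals  D: The arithmetic unit and memory
Answer: B

7. The opcode of an instruction indicates ______.
A: What operation to perform  B: Stop the operation  C: The result of the operation  D: The address of the operation
Answer: A

8. An organization's personnel management program is ____.
A: A system program  B: System software  C: Application software  D: Target software
Answer: C

9. The five basic components of a von Neumann computer are the arithmetic unit, memory, input devices, output devices, and ______.
A: The display  B: The control unit  C: The hard disk  D: The mouse
Answer: B

10. A plotter is an output device for computer graphics. Which of the following are also output devices?
A: Printer and display  B: Keyboard and display  C: Mouse and display  D: Scanner and printer
Answer: A

11. A computer's main frequency is its clock frequency; higher frequencies are expressed in gigahertz. The English abbreviation for this unit is ______.
A: MHz  B: GHz  C: GDP  D: MIPS
Answer: B

12. The three numbers in each group below are meant to be, in order, binary, octal, and hexadecimal. Which group meets this requirement?
A: 11, 78, 19  B: 12, 77, 10  C: 12, 80, 10  D: 11, 77, 19
Answer: D

13. Which of the following characters has the smallest ASCII code value?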

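2011 Graduate Entrance Exam, Computer Science Comprehensive Fundamentals: Questions and Answer Explanations

2011 Graduate Entrance Exam, Computer Science Comprehensive Fundamentals

I. Multiple-choice questions

1. Let n be a nonnegative integer describing the problem size. What is the time complexity of the following program fragment?
    x = 2;
    while ( x < n/2 )
        x = 2*x;
A. O(log2 n)  B. O(n)  C. O(n log2 n)  D. O(n^2)

2. Elements a, b, c, d, e are pushed in that order onto an initially empty stack. After being pushed, an element may stay on the stack or be popped, until all elements have been popped. Among all possible pop sequences, how many begin with element d?
A. 3  B. 4  C. 5  D. 6

3. A circular queue is stored in a one-dimensional array A[0..n-1], and when the queue is non-empty, front and rear point to the head element and the tail element respectively. If the queue is initially empty and the first element to enter the queue must be stored at A[0], what are the initial values of front and rear?
A. 0, 0  B. 0, n-1  C. n-1, 0  D. n-1, n-1

4. If a complete binary tree has 768 nodes, how many leaf nodes does it have?
A. 257  B. 258  C. 384  D. 385

5. If the preorder and postorder traversal sequences of a binary tree are 1, 2, 3, 4 and 4, 3, 2, 1 respectively, which of the following cannot be its inorder traversal sequence?
A. 1, 2, 3, 4  B. 2, 3, 4, 1  C. 3, 2, 4, 1  D. 4, 3, 2, 1

6. A tree has 2011 nodes, of which 116 are leaves. In the binary tree corresponding to this tree, how many nodes have no right child?
A. 115  B. 116  C. 1895  D. 1896

7. Which of the following key sequences cannot be a search path in any binary search tree?
A. 95, 22, 91, 24, 94, 71  B. 92, 20, 91, 34, 88, 35  C. 21, 89, 77, 29, 36, 38  D. 12, 25, 71, 68, 33, 34

8. Which of the following statements about graphs are correct?
I. A cycle is a simple path
II. For storing a sparse graph, an adjacency matrix uses less space than an adjacency list
III. If a directed graph has a topological ordering, then the graph contains no cycle
A. II only  B. I and II only  C. III only  D. I and III only

9. To improve the lookup efficiency of a hash table, which of the following are correct measures?
I. Increase the load factor
II. Design a hash function with few collisions
III. Avoid clustering when resolving collisions
A. I only  B. II only  C. I and II only  D. II and III only

10. To implement quicksort, which storage scheme is best for the sequence to be sorted?
A. Sequential storage  B. Hashed storage  C. Linked storage  D. Indexed storage

11. The sequence 25, 13, 10, 12, 9 is a max-heap. A new element 18 is inserted at the end of the sequence and the sequence is adjusted back into a max-heap. How many comparisons between elements are made during the adjustment?
A. 1  B. 2  C. 4  D. 5

12. Which of the following is a measure of floating-point operation speed?
A. MIPS  B. CPI  C. IPC  D. MFLOPS

13. Data of type float are usually represented in IEEE 754 single-precision floating-point format.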



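3. Explain the basic principle of the support vector machine.

Support Vector Machines 5.pdf, page 6.

Main idea: The analysis starts from the linearly separable case. For the linearly inseparable case, a nonlinear mapping transforms samples that are not linearly separable in the low-dimensional input space into a high-dimensional feature space in which they become linearly separable. This makes it possible to apply a linear algorithm in the high-dimensional feature space to analyze the nonlinear features of the samples.

The method is based on structural risk minimization theory. It constructs the optimal separating hyperplane in the feature space, so that the learner reaches a global optimum and the expected risk over the whole sample space satisfies a certain upper bound with some probability.

Derivation:
Find the separating hyperplane that maximizes the margin ρ:
minimize $\frac{1}{2}\|w\|^2$
subject to
$$x_i^T w + b \ge +1 \ \text{ for } y_i = +1, \qquad x_i^T w + b \le -1 \ \text{ for } y_i = -1$$
After solving the problem, the Karush-Kuhn-Tucker (KKT) conditions give the condition for the separating hyperplane to be optimal:
$$\alpha_i \left[ y_i \left( x_i^T w + b \right) - 1 \right] = 0, \quad i = 1, 2, \ldots, n$$
$$w_0 = \sum_{i=1}^{n} \alpha_i^0 y_i x_i = \sum_{\text{support vectors}} \alpha_i^0 y_i x_i$$
Decision function:
$$g(x) = w^T x + b = \sum_{i=1}^{n} \alpha_i y_i x_i^T x + b$$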
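The relations in the SVM derivation above (the weight vector as a weighted sum of training points, the KKT complementarity condition, and the decision function) can be illustrated on a toy problem. The sketch below is an assumption-laden example, not part of the course material: it uses two made-up training points whose hard-margin solution can be worked out by hand (α1 = α2 = 0.25, b = 0), and the names `xs`, `ys`, `alphas`, `g`, and `kkt_residuals` are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

# Toy training set: one positive and one negative point (made up).
xs = [(1.0, 1.0), (-1.0, -1.0)]
ys = [1, -1]

# Hand-derived hard-margin solution for this toy set (an assumption of the
# example, not the output of a solver): both points are support vectors.
alphas = [0.25, 0.25]
b = 0.0

# w = sum_i alpha_i * y_i * x_i
w = [sum(a * y * x[d] for a, y, x in zip(alphas, ys, xs)) for d in range(2)]

def g(x):
    """Decision function g(x) = sum_i alpha_i * y_i * <x_i, x> + b."""
    return sum(a * y * dot(xi, x) for a, y, xi in zip(alphas, ys, xs)) + b

# KKT complementarity: alpha_i * (y_i * (<x_i, w> + b) - 1) should be zero.
kkt_residuals = [a * (y * (dot(xi, w) + b) - 1.0)
                 for a, y, xi in zip(alphas, ys, xs)]

print("w =", w)
print("g((2, 0)) =", g((2.0, 0.0)))
print("KKT residuals =", kkt_residuals)
```

For this toy set the margin 2/‖w‖ equals the distance between the two points, which is a quick sanity check that the hand-derived multipliers are consistent with the maximum-margin hyperplane.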
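5. Derive the backpropagation (BP) algorithm using a neural network with one hidden layer.

Summary of the BP algorithm:
(1) Initialize the weights $\omega_{ij}$;
(2) For the input training sample, compute the output of every node and the output of the final output layer;
(3) For the output layer, compute $\delta_k = y_k - t_k$;
(4) For the hidden layer, compute $\delta_j = h'(a_j) \sum_k \omega_{kj} \delta_k$;
(5) Compute the gradient of the output error with respect to each weight, $\frac{\partial E_n}{\partial \omega_{ji}} = \delta_j z_i$;
(6) Update the weights: $w^{(\tau+1)} = w^{(\tau)} - \eta \nabla E(w^{(\tau)})$;
(7) Repeat steps (2) to (6) until convergence.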
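The seven steps above can be sketched for a network with one hidden layer. The code below is a minimal illustration under assumptions the summary does not state: the hidden activation h is taken to be tanh (so h'(a) = 1 − tanh²(a)), the output units are linear with a sum-of-squares error (which is what makes δ_k = y_k − t_k), biases are omitted, and the training sample, layer sizes, and learning rate are made up. All function names are hypothetical.

```python
import math
import random

def forward(x, W1, W2):
    """Step (2): hidden activations a, hidden outputs z = tanh(a), outputs y."""
    a = [sum(w * xi for w, xi in zip(row, x)) for row in W1]
    z = [math.tanh(aj) for aj in a]
    y = [sum(w * zj for w, zj in zip(row, z)) for row in W2]
    return a, z, y

def squared_error(x, t, W1, W2):
    """Sum-of-squares error E_n = 0.5 * sum_k (y_k - t_k)^2."""
    y = forward(x, W1, W2)[2]
    return 0.5 * sum((yk - tk) ** 2 for yk, tk in zip(y, t))

def backprop(x, t, W1, W2):
    """Steps (3) to (5): deltas and the gradient for every weight."""
    _, z, y = forward(x, W1, W2)
    delta_k = [yk - tk for yk, tk in zip(y, t)]                  # step (3)
    delta_j = [(1.0 - zj ** 2) *                                 # step (4)
               sum(W2[k][j] * delta_k[k] for k in range(len(delta_k)))
               for j, zj in enumerate(z)]
    grad_W2 = [[dk * zj for zj in z] for dk in delta_k]          # step (5)
    grad_W1 = [[dj * xi for xi in x] for dj in delta_j]
    return grad_W1, grad_W2

def gradient_step(x, t, W1, W2, eta):
    """Step (6): w <- w - eta * gradient, returning new weight matrices."""
    grad_W1, grad_W2 = backprop(x, t, W1, W2)
    new_W1 = [[w - eta * g for w, g in zip(wr, gr)] for wr, gr in zip(W1, grad_W1)]
    new_W2 = [[w - eta * g for w, g in zip(wr, gr)] for wr, gr in zip(W2, grad_W2)]
    return new_W1, new_W2

# Step (1): small random initial weights (2 inputs, 3 hidden units, 1 output).
rng = random.Random(0)
W1 = [[rng.uniform(-0.3, 0.3) for _ in range(2)] for _ in range(3)]
W2 = [[rng.uniform(-0.3, 0.3) for _ in range(3)] for _ in range(1)]
x, t = [0.5, -1.0], [0.3]  # a single made-up training sample

error_before = squared_error(x, t, W1, W2)
for _ in range(100):       # step (7): repeat steps (2) to (6)
    W1, W2 = gradient_step(x, t, W1, W2, eta=0.1)
error_after = squared_error(x, t, W1, W2)
print(error_before, error_after)
```

A finite-difference comparison of `squared_error` against the output of `backprop` is a common way to confirm that the delta recursion has been implemented correctly.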

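Evaluation:
Because a gradient method is used to find the extremum of a nonlinear function, the algorithm may get stuck in a local minimum and is not guaranteed to converge to the global minimum; it converges slowly; and newly added samples affect what has already been learned from existing samples.

Choice of initial weights: if the initial weights are all zero or all equal, the hidden units cannot become different from one another and the computation cannot proceed properly.

For this reason, small random numbers (for example, random numbers between -0.3 and 0.3) are usually used as initial weights.

The initial values affect convergence; when the computation does not converge, it is worth trying different initial values.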

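6. Briefly describe the basic principle of Discrete AdaBoost, and discuss its application in some field (other than those already covered in class).

Discrete Adaboost 13.pdf, page 9
Core idea: AdaBoost is an iterative algorithm. Its core idea is to train different classifiers (weak classifiers) on the same training set and then combine these weak classifiers into a stronger final classifier (the strong classifier).

The algorithm works by changing the data distribution. It sets the weight of each sample according to whether that sample was classified correctly in the current round and according to the overall accuracy of the previous round. The reweighted data set is passed to the next classifier for training, and the classifiers obtained in all rounds are finally fused into the final decision classifier.

Using AdaBoost can discard some unnecessary features of the training data and concentrate on the key training data.

Training process: the different training sets in the algorithm are obtained by adjusting the weight attached to each sample.

At the start, every sample has the same weight, and a weak classifier is trained under this sample distribution.

For misclassified samples the weight is increased, and for correctly classified samples the weight is decreased, so the misclassified samples stand out and a new sample distribution is obtained.

Under the new sample distribution a weak classifier is trained again, giving another weak classifier.

Continuing in this way for T rounds gives T weak classifiers. Combining (boosting) these T weak classifiers with certain weights gives the final strong classifier.

Three ways of combining classifiers:
(1) Given T weak classifiers $h^{(t)}$, produce the strong classifier $H_{strong}$ (Bagging):
$$H_{strong} = \frac{1}{T} \sum_{t=1}^{T} h^{(t)}$$
(2) Given T weak classifiers $h^{(t)}$ with individual weights $\alpha^{(t)}$, produce the strong classifier (Boosting):
$$H_{strong} = \sum_{t=1}^{T} \alpha^{(t)} h^{(t)}$$
(3) Going one step further and adapting the weak classifiers and the samples to each other gives Adaptive Boosting, that is, AdaBoost.

The AdaBoost algorithm picks the best features from a large pool of features and turns them into corresponding weak classifiers, which are then used together to classify the target.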
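The reweighting loop described above can be sketched with one-dimensional decision stumps as the weak classifiers. The notes do not give the formula for the classifier weight α or for the weight update, so the sketch below assumes the standard Discrete AdaBoost choices, α = ½ ln((1 − err)/err) and w ← w · exp(−α · y · h(x)) followed by normalization. The data set is a made-up ten-point example, and all function names are hypothetical.

```python
import math

def stump_predict(x, threshold, polarity):
    """Decision stump: predict polarity when x > threshold, else -polarity."""
    return polarity if x > threshold else -polarity

def train_stump(xs, ys, weights):
    """Return (weighted error, threshold, polarity) of the best stump."""
    values = sorted(set(xs))
    candidates = ([values[0] - 1.0]
                  + [(a + b) / 2 for a, b in zip(values, values[1:])]
                  + [values[-1] + 1.0])
    best = None
    for threshold in candidates:
        for polarity in (1, -1):
            err = sum(w for w, x, y in zip(weights, xs, ys)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def adaboost(xs, ys, rounds):
    """Train `rounds` weighted stumps; return a list of (alpha, threshold, polarity)."""
    n = len(xs)
    weights = [1.0 / n] * n            # start from equal sample weights
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)     # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)   # assumed standard formula
        ensemble.append((alpha, threshold, polarity))
        # Raise the weights of misclassified samples, lower the others.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Strong classifier: sign of the alpha-weighted vote of the stumps."""
    score = sum(alpha * stump_predict(x, thr, pol) for alpha, thr, pol in ensemble)
    return 1 if score >= 0 else -1

# Made-up data that no single stump can classify perfectly.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

for rounds in (1, 3):
    ensemble = adaboost(xs, ys, rounds)
    errors = sum(predict(ensemble, x) != y for x, y in zip(xs, ys))
    print(rounds, "round(s):", errors, "training errors")
```

On this data a single stump always leaves some points misclassified, while the weighted vote of a few stumps trained on successively reweighted samples fits the training set, which is the behaviour the training-process description above is pointing at.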

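7. Given a noisy binary image, denoise it using a Markov random field (a similar question may be substituted).

Markov Random Fields 7.pdf, page 5.

Image Denoising 7.pdf, page 10.

Markov random field: "Markov" is usually short for the Markov property. It means that when a sequence of random variables is arranged in time order, the distribution at time N+1 depends only on the value at time N and not on the values of the random variables before time N.

A random field has two elements: sites and a phase space. When each site is randomly assigned a value from the phase space according to some distribution, the whole collection is called a random field.

Applying a Markov random field to binary image denoising mainly means that, when analyzing the image, whether a given pixel is a noise pixel is taken to depend only on a few pixels around it and not on the image as a whole.

So for each pixel, if its color changes abruptly compared with the surrounding pixels, it is judged to be a noise pixel.

The algorithm uses an energy function to describe this comparison. If changing the pixel to the other value lowers the energy, the pixel is judged to be noise and the changed color is kept; otherwise it is judged not to be noise and is changed back to its original color.

Energy functions:
For the pair $x_i, y_i$: $\psi_c(x_c) = e^{-E(x_c)}, \quad E(x_c) = -\eta x_i y_i$
For the pair $x_i, x_j$: $\psi_c(x_c) = e^{-E(x_c)}, \quad E(x_c) = -\beta x_i x_j$
$$\Rightarrow \psi_c(x_c) = e^{-E(x_c)}, \quad E(x_c) = h \sum_i x_i - \beta \sum_{i,j} x_i x_j - \eta \sum_i x_i y_i$$
Gradient descent method: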
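The flip-if-the-energy-drops rule described above can be sketched as a greedy, pixel-by-pixel energy descent (often called iterated conditional modes), which is a coordinate-wise search rather than a literal gradient method. The sketch makes several assumptions not fixed by the notes: pixels take values in {-1, +1}, $y$ is the observed noisy image and $x$ the current estimate, the pairwise sum runs over 4-connected neighbors, and the parameter values h = 0, β = 1.0, η = 2.1 are illustrative. The function names are hypothetical.

```python
def local_energy(x, y, i, j, value, h, beta, eta):
    """Energy terms that involve pixel (i, j) when it takes the given value.

    Uses E = h*sum(x_i) - beta*sum(x_i*x_j) - eta*sum(x_i*y_i), restricted to
    the terms that contain pixel (i, j); neighbors are the 4-connected pixels.
    """
    rows, cols = len(x), len(x[0])
    neighbor_sum = 0
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < rows and 0 <= nj < cols:
            neighbor_sum += x[ni][nj]
    return h * value - beta * value * neighbor_sum - eta * value * y[i][j]

def icm_denoise(y, h=0.0, beta=1.0, eta=2.1, max_sweeps=10):
    """Greedy energy descent: flip a pixel whenever the flip lowers the energy."""
    x = [row[:] for row in y]              # initialize the estimate with the noisy image
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(x)):
            for j in range(len(x[0])):
                current = x[i][j]
                flipped = -current
                if (local_energy(x, y, i, j, flipped, h, beta, eta)
                        < local_energy(x, y, i, j, current, h, beta, eta)):
                    x[i][j] = flipped      # keep the change: lower energy
                    changed = True
        if not changed:                    # no pixel changed in a full sweep
            break
    return x

# Made-up example: a uniform 5x5 image with one corrupted pixel in the center.
clean = [[1] * 5 for _ in range(5)]
noisy = [row[:] for row in clean]
noisy[2][2] = -1

denoised = icm_denoise(noisy)
for row in denoised:
    print(row)
```

Because each accepted flip strictly lowers the total energy, the sweeps stop at a local minimum; with a larger η the estimate stays closer to the observed image, and with a larger β it favors smoother regions.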
