模式识别作业Homework#2

合集下载

模式识别作业题(2)

得证。 3、使用最小最大损失判决规则的错分概率是最小吗？为什么？
答：不是最小的。首先要明确当我们谈到最小最大损失判决规则时，先验概率是未知的，而先验概率的变化会导致错分概率变化，故错分概率也是一个变量。使用最小最大损失判决规则的目的就是保证在先验概率任意变化导致错分概率变化时，错分概率的最坏（即最大）情况在所有判决规则中是最好的（即最小）。 4、若 λ11 = λ22 =0， λ12 = λ21 ，证明此时最小最大决策面是来自两类的错误率相等。证明：最小最大决策面满足（ λ11 - λ22 ）+（ λ21 - λ11 ）容易得到
λ11 P(ω1 | x) + λ12 P(ω2 | x) < λ21 P(ω1 | x) + λ22 P(ω2 | x) ( λ21 - λ11 ) P (ω1 | x) >( λ12 - λ22 ) P (ω2 | x) ( λ21 - λ11 ) P (ω1 ) P ( x | ω1 ) >( λ12 - λ22 ) P (ω2 ) P ( x | ω2 ) p( x | ω1 ) (λ 12 − λ 22) P(ω2 ) > 即 p( x | ω2 ) ( λ 21 − λ 11) P (ω1 )
6、设总体分布密度为 N( μ ，1)，-∞< μ <+∞,并设 X={ x1 ， x2 ，… xN }，分别用最大似然估计和贝叶斯估计计算 μ 。已知 μ 的先验分布 p（ μ ）～N（0,1）。解：似然函数为：
∧Байду номын сангаас
L（ μ ）=lnp(X|u)=
∑ ln p( xi | u) = −
i =1
N
模式识别第三章作业及其解答

模式识别大作业

模式识别大作业1.最近邻/k近邻法一.基本概念：最近邻法：对于未知样本x，比较x与N个已知类别的样本之间的欧式距离，并决策x与距离它最近的样本同类。

K近邻法：取未知样本x的k个近邻，看这k个近邻中多数属于哪一类，就把x归为哪一类。

K取奇数，为了是避免k1=k2的情况。

二.问题分析：要判别x属于哪一类，关键要求得与x最近的k个样本（当k=1时，即是最近邻法），然后判别这k个样本的多数属于哪一类。

可采用欧式距离公式求得两个样本间的距离s=sqrt（（x1-x2）^2+(y1-y2)^2）三.算法分析：该算法中任取每类样本的一半作为训练样本，其余作为测试样本。

例如iris中取每类样本的25组作为训练样本，剩余25组作为测试样本，依次求得与一测试样本x距离最近的k 个样本，并判断k个样本多数属于哪一类，则x就属于哪类。

测试10次，取10次分类正确率的平均值来检验算法的性能。

四.MATLAB代码：最近邻算实现对Iris分类clc;totalsum=0;for ii=1:10data=load('iris.txt');data1=data(1:50,1:4);%任取Iris-setosa数据的25组rbow1=randperm(50);trainsample1=data1(rbow1(:,1:25),1:4);rbow1(:,26:50)=sort(rbow1(:,26:50));%剩余的25组按行下标大小顺序排列testsample1=data1(rbow1(:,26:50),1:4);data2=data(51:100,1:4);%任取Iris-versicolor数据的25组rbow2=randperm(50);trainsample2=data2(rbow2(:,1:25),1:4);rbow2(:,26:50)=sort(rbow2(:,26:50));testsample2=data2(rbow2(:,26:50),1:4);data3=data(101:150,1:4);%任取Iris-virginica数据的25组rbow3=randperm(50);trainsample3=data3(rbow3(:,1:25),1:4);rbow3(:,26:50)=sort(rbow3(:,26:50));testsample3=data3(rbow3(:,26:50),1:4);trainsample=cat(1,trainsample1,trainsample2,trainsample3);%包含75组数据的样本集testsample=cat(1,testsample1,testsample2,testsample3);newchar=zeros(1,75);sum=0;[i,j]=size(trainsample);%i=60,j=4[u,v]=size(testsample);%u=90,v=4for x=1:ufor y=1:iresult=sqrt((testsample(x,1)-trainsample(y,1))^2+(testsample(x,2) -trainsample(y,2))^2+(testsample(x,3)-trainsample(y,3))^2+(testsa mple(x,4)-trainsample(y,4))^2); %欧式距离newchar(1,y)=result;end;[new,Ind]=sort(newchar);class1=0;class2=0;class3=0;if Ind(1,1)<=25class1=class1+1;elseif Ind(1,1)>25&&Ind(1,1)<=50class2=class2+1;elseclass3=class3+1;endif class1>class2&&class1>class3m=1;ty='Iris-setosa';elseif class2>class1&&class2>class3m=2;ty='Iris-versicolor';elseif class3>class1&&class3>class2m=3;ty='Iris-virginica';elsem=0;ty='none';endif x<=25&&m>0disp(sprintf('第%d组数据分类后为%s类',rbow1(:,x+25),ty));elseif x<=25&&m==0disp(sprintf('第%d组数据分类后为%s类',rbow1(:,x+25),'none'));endif x>25&&x<=50&&m>0disp(sprintf('第%d组数据分类后为%s类',50+rbow2(:,x),ty));elseif x>25&&x<=50&&m==0disp(sprintf('第%d组数据分类后为%s类',50+rbow2(:,x),'none'));endif x>50&&x<=75&&m>0disp(sprintf('第%d组数据分类后为%s类',100+rbow3(:,x-25),ty));elseif x>50&&x<=75&&m==0disp(sprintf('第%d组数据分类后为%s类',100+rbow3(:,x-25),'none'));endif (x<=25&&m==1)||(x>25&&x<=50&&m==2)||(x>50&&x<=75&&m==3)sum=sum+1;endenddisp(sprintf('第%d次分类识别率为%4.2f',ii,sum/75));totalsum=totalsum+(sum/75);enddisp(sprintf('10次分类平均识别率为%4.2f',totalsum/10));测试结果：第3组数据分类后为Iris-setosa类第5组数据分类后为Iris-setosa类第6组数据分类后为Iris-setosa类第7组数据分类后为Iris-setosa类第10组数据分类后为Iris-setosa类第11组数据分类后为Iris-setosa类第12组数据分类后为Iris-setosa类第14组数据分类后为Iris-setosa类第16组数据分类后为Iris-setosa类第18组数据分类后为Iris-setosa类第19组数据分类后为Iris-setosa类第20组数据分类后为Iris-setosa类第23组数据分类后为Iris-setosa类第24组数据分类后为Iris-setosa类第26组数据分类后为Iris-setosa类第28组数据分类后为Iris-setosa类第30组数据分类后为Iris-setosa类第31组数据分类后为Iris-setosa类第34组数据分类后为Iris-setosa类第37组数据分类后为Iris-setosa类第39组数据分类后为Iris-setosa类第41组数据分类后为Iris-setosa类第44组数据分类后为Iris-setosa类第45组数据分类后为Iris-setosa类第49组数据分类后为Iris-setosa类第53组数据分类后为Iris-versicolor类第54组数据分类后为Iris-versicolor类第55组数据分类后为Iris-versicolor类第57组数据分类后为Iris-versicolor类第58组数据分类后为Iris-versicolor类第59组数据分类后为Iris-versicolor类第60组数据分类后为Iris-versicolor类第61组数据分类后为Iris-versicolor类第62组数据分类后为Iris-versicolor类第68组数据分类后为Iris-versicolor类第70组数据分类后为Iris-versicolor类第71组数据分类后为Iris-virginica类第74组数据分类后为Iris-versicolor类第75组数据分类后为Iris-versicolor类第77组数据分类后为Iris-versicolor类第79组数据分类后为Iris-versicolor类第80组数据分类后为Iris-versicolor类第84组数据分类后为Iris-virginica类第85组数据分类后为Iris-versicolor类第92组数据分类后为Iris-versicolor类第95组数据分类后为Iris-versicolor类第97组数据分类后为Iris-versicolor类第98组数据分类后为Iris-versicolor类第99组数据分类后为Iris-versicolor类第102组数据分类后为Iris-virginica类第103组数据分类后为Iris-virginica类第105组数据分类后为Iris-virginica类第106组数据分类后为Iris-virginica类第107组数据分类后为Iris-versicolor类第108组数据分类后为Iris-virginica类第114组数据分类后为Iris-virginica类第118组数据分类后为Iris-virginica类第119组数据分类后为Iris-virginica类第124组数据分类后为Iris-virginica类第125组数据分类后为Iris-virginica类第126组数据分类后为Iris-virginica类第127组数据分类后为Iris-virginica类第128组数据分类后为Iris-virginica类第129组数据分类后为Iris-virginica类第130组数据分类后为Iris-virginica类第133组数据分类后为Iris-virginica类第135组数据分类后为Iris-virginica类第137组数据分类后为Iris-virginica类第142组数据分类后为Iris-virginica类第144组数据分类后为Iris-virginica类第148组数据分类后为Iris-virginica类第149组数据分类后为Iris-virginica类第150组数据分类后为Iris-virginica类k近邻法对wine分类：clc;otalsum=0;for ii=1:10 %循环测试10次data=load('wine.txt');%导入wine数据data1=data(1:59,1:13);%任取第一类数据的30组rbow1=randperm(59);trainsample1=data1(sort(rbow1(:,1:30)),1:13);rbow1(:,31:59)=sort(rbow1(:,31:59)); %剩余的29组按行下标大小顺序排列testsample1=data1(rbow1(:,31:59),1:13);data2=data(60:130,1:13);%任取第二类数据的35组rbow2=randperm(71);trainsample2=data2(sort(rbow2(:,1:35)),1:13);rbow2(:,36:71)=sort(rbow2(:,36:71));testsample2=data2(rbow2(:,36:71),1:13);data3=data(131:178,1:13);%任取第三类数据的24组rbow3=randperm(48);trainsample3=data3(sort(rbow3(:,1:24)),1:13);rbow3(:,25:48)=sort(rbow3(:,25:48));testsample3=data3(rbow3(:,25:48),1:13);train_sample=cat(1,trainsample1,trainsample2,trainsample3);%包含89组数据的样本集test_sample=cat(1,testsample1,testsample2,testsample3);k=19;%19近邻法newchar=zeros(1,89);sum=0;[i,j]=size(train_sample);%i=89,j=13[u,v]=size(test_sample);%u=89,v=13for x=1:ufor y=1:iresult=sqrt((test_sample(x,1)-train_sample(y,1))^2+(test_sample(x ,2)-train_sample(y,2))^2+(test_sample(x,3)-train_sample(y,3))^2+( test_sample(x,4)-train_sample(y,4))^2+(test_sample(x,5)-train_sam ple(y,5))^2+(test_sample(x,6)-train_sample(y,6))^2+(test_sample(x ,7)-train_sample(y,7))^2+(test_sample(x,8)-train_sample(y,8))^2+( test_sample(x,9)-train_sample(y,9))^2+(test_sample(x,10)-train_sa mple(y,10))^2+(test_sample(x,11)-train_sample(y,11))^2+(test_samp le(x,12)-train_sample(y,12))^2+(test_sample(x,13)-train_sample(y, 13))^2); %欧式距离newchar(1,y)=result;end;[new,Ind]=sort(newchar);class1=0;class 2=0;class 3=0;for n=1:kif Ind(1,n)<=30class 1= class 1+1;elseif Ind(1,n)>30&&Ind(1,n)<=65class 2= class 2+1;elseclass 3= class3+1;endendif class 1>= class 2&& class1>= class3m=1;elseif class2>= class1&& class2>= class3m=2;elseif class3>= class1&& class3>= class2m=3;endif x<=29disp(sprintf('第%d组数据分类后为第%d类',rbow1(:,30+x),m));elseif x>29&&x<=65disp(sprintf('第%d组数据分类后为第%d类',59+rbow2(:,x+6),m));elseif x>65&&x<=89disp(sprintf('第%d组数据分类后为第%d类',130+rbow3(:,x-41),m));endif (x<=29&&m==1)||(x>29&&x<=65&&m==2)||(x>65&&x<=89&&m==3) sum=sum+1;endenddisp(sprintf('第%d次分类识别率为%4.2f',ii,sum/89));totalsum=totalsum+(sum/89);enddisp(sprintf('10次分类平均识别率为%4.2f',totalsum/10));第2组数据分类后为第1类第4组数据分类后为第1类第5组数据分类后为第3类第6组数据分类后为第1类第8组数据分类后为第1类第10组数据分类后为第1类第11组数据分类后为第1类第14组数据分类后为第1类第16组数据分类后为第1类第19组数据分类后为第1类第20组数据分类后为第3类第21组数据分类后为第3类第22组数据分类后为第3类第26组数据分类后为第3类第27组数据分类后为第1类第28组数据分类后为第1类第30组数据分类后为第1类第33组数据分类后为第1类第36组数据分类后为第1类第37组数据分类后为第1类第43组数据分类后为第1类第44组数据分类后为第3类第45组数据分类后为第1类第46组数据分类后为第1类第49组数据分类后为第1类第54组数据分类后为第1类第56组数据分类后为第1类第57组数据分类后为第1类第60组数据分类后为第2类第61组数据分类后为第3类第63组数据分类后为第3类第65组数据分类后为第2类第66组数据分类后为第3类第67组数据分类后为第2类第71组数据分类后为第1类第72组数据分类后为第2类第74组数据分类后为第1类第76组数据分类后为第2类第77组数据分类后为第2类第79组数据分类后为第3类第81组数据分类后为第2类第82组数据分类后为第3类第83组数据分类后为第3类第84组数据分类后为第2类第86组数据分类后为第2类第87组数据分类后为第2类第88组数据分类后为第2类第93组数据分类后为第2类第96组数据分类后为第1类第98组数据分类后为第2类第99组数据分类后为第3类第102组数据分类后为第2类第104组数据分类后为第2类第105组数据分类后为第3类第106组数据分类后为第2类第110组数据分类后为第3类第113组数据分类后为第3类第114组数据分类后为第2类第115组数据分类后为第2类第116组数据分类后为第2类第118组数据分类后为第2类第122组数据分类后为第2类第123组数据分类后为第2类第124组数据分类后为第2类第133组数据分类后为第3类第134组数据分类后为第3类第135组数据分类后为第2类第136组数据分类后为第3类第140组数据分类后为第3类第142组数据分类后为第3类第144组数据分类后为第2类第145组数据分类后为第1类第146组数据分类后为第3类第148组数据分类后为第3类第149组数据分类后为第2类第152组数据分类后为第2类第157组数据分类后为第2类第159组数据分类后为第3类第161组数据分类后为第2类第162组数据分类后为第3类第163组数据分类后为第3类第164组数据分类后为第3类第165组数据分类后为第3类第167组数据分类后为第3类第168组数据分类后为第3类第173组数据分类后为第3类第174组数据分类后为第3类2.Fisher线性判别法Fisher 线性判别是统计模式识别的基本方法之一。

模式识别大作业

模式识别大作业引言：转眼之间，研一就结束了。

这学期的模式识别课也接近了尾声。

我本科是机械专业，编程和算法的理解能力比较薄弱。

所以虽然这学期老师上课上的很精彩，但是这学期的模式识别课上的感觉还是有点吃力。

不过这学期也加强了编程的练习。

这次的作业花了很久的时间，因为平时自己的方向是主要是图像降噪，自己在看这一块图像降噪论文的时候感觉和模式识别的方向结合的比较少。

我看了这方面的模式识别和图像降噪结合的论文，发现也比较少。

在思考的过程中，我想到了聚类的方法。

包括K均值和C均值等等。

因为之前学过K均值，于是就选择了K均值的聚类方法。

然后用到了均值滤波和自适应滤波进行处理。

正文：k-means聚类算法的工作过程说明如下：首先从n个数据对象任意选择 k 个对象作为初始聚类中心；而对于所剩下其它对象，则根据它们与这些聚类中心的相似度（距离），分别将它们分配给与其最相似的（聚类中心所代表的）聚类；然后再计算每个所获新聚类的聚类中心（该聚类中所有对象的均值）；不断重复这一过程直到标准测度函数开始收敛为止。

一般都采用均方差作为标准测度函数。

k-means 算法接受输入量k ；然后将n个数据对象划分为k个聚类以便使得所获得的聚类满足：同一聚类中的对象相似度较高；而不同聚类中的对象相似度较小。

聚类相似度是利用各聚类中对象的均值所获得一个“中心对象”（引力中心）来进行计算的。

k个聚类具有以下特点：各聚类本身尽可能的紧凑，而各聚类之间尽可能的分开。

均值滤波是常用的非线性滤波方法 ,也是图像处理技术中最常用的预处理技术。

它在平滑脉冲噪声方面非常有效,同时它可以保护图像尖锐的边缘。

均值滤波是典型的线性滤波算法，它是指在图像上对目标像素给一个模板，该模板包括了其周围的临近像素（以目标象素为中心的周围8个象素，构成一个滤波模板，即去掉目标象素本身）。

再用模板中的全体像素的平均值来代替原来像素值。

即对待处理的当前像素点（x，y），选择一个模板，该模板由其近邻的若干像素组成，求模板中所有像素的均值，再把该均值赋予当前像素点（x，y），作为处理后图像在该点上的灰度个g（x，y），即个g（x，y）=1/m ∑f（x，y）m为该模板中包含当前像素在内的像素总个数。

模式识别大作业1

模式识别大作业--fisher线性判别和近邻法学号：021151**姓名：**任课教师:张**I. Fisher线性判别A. fisher线性判别简述在应用统计方法解决模式识别的问题时，一再碰到的问题之一是维数问题．在低维空间里解析上或计算上行得通的方法，在高维里往往行不通．因此，降低维数就成为处理实际问题的关键．我们考虑把维空间的样本投影到一条直线上，形成一维空间，即把维数压缩到一维．这样，必须找一个最好的，易于区分的投影线．这个投影变换就是我们求解的解向量．B.fisher线性判别的降维和判别1．线性投影与Fisher准则函数各类在维特征空间里的样本均值向量：，(1)通过变换映射到一维特征空间后，各类的平均值为：，(2)映射后，各类样本“类内离散度”定义为：，(3)显然，我们希望在映射之后，两类的平均值之间的距离越大越好，而各类的样本类内离散度越小越好。

因此，定义Fisher准则函数：(4)使最大的解就是最佳解向量，也就是Fisher的线性判别式。

2．求解从的表达式可知，它并非的显函数，必须进一步变换。

已知：，, 依次代入上两式，有：，(5)所以：(6)其中：(7)是原维特征空间里的样本类内离散度矩阵，表示两类均值向量之间的离散度大小，因此，越大越容易区分。

将(4.5-6)和(4.5-2)代入(4.5-4)式中：(8)其中：，(9)因此：(10)显然：(11)称为原维特征空间里，样本“类内离散度”矩阵。

是样本“类内总离散度”矩阵。

为了便于分类，显然越小越好，也就是越小越好。

将上述的所有推导结果代入表达式：可以得到：其中，是一个比例因子，不影响的方向，可以删除，从而得到最后解：(12)就使取得最大值，可使样本由维空间向一维空间映射，其投影方向最好。

是一个Fisher线性判断式.这个向量指出了相对于Fisher准则函数最好的投影线方向。

C.算法流程图左图为算法的流程设计图。

II.近邻法A. 近邻法线简述K最近邻(k-Nearest Neighbor，KNN)分类算法，是一个理论上比较成熟的方法，也是最简单的机器学习算法之一。

模式识别作业

第二章主要内容：几种常见的聚类算法已经所使用的准则函数。

作业1：对如下5个6维模式样本，用最小聚类准则进行系统聚类分析已知样本如下：x1: 0, 1, 3, 1, 3, 4；x2: 3, 3, 3, 1, 2, 1；x3: 1, 0, 0, 0, 1, 1；x4: 2, 1, 0, 2, 2, 1；x5: 0, 0, 1, 0, 1, 0 第1步：将每一样本看成单独一类，得(0)(0)(0)112233(0)(0)4455{},{},{}{},{}G x G x G x Gx Gx =====计算各类之间的欧式距离，可得距离矩阵(0)D第2步：矩阵(0)D，它是(0)3G 和(0)5G 之间的距离，将他们合并为一类，得新的分类为(1)(0)(1)(0)(1)(0)(0)(1)(0)112233544{},{},{,},{}G G G G G G G G G ====计算聚类后的距离矩阵(1)D 第3步：由于(1)D 它是(1)3G 与(1)4G 之间的距离，于是合并(1)3G 和(1)4G ，得新的分类为(2)(1)(2)(2)(2)(1)(1)1122334{},{},{,}G G G G G G G ===同样，按最小距离准则计算距离矩阵(2)D，得第4步：同理得(3)(2)(3)(2)(2)11223{},{,}G G G G G == 满足聚类要求，如聚为2类，聚类完毕。

系统聚类算法介绍：第一步：设初始模式样本共有N 个，每个样本自成一类，即建立N 类。

G 1(0), G 2(0) , ……,G N (0)为计算各类之间的距离（初始时即为各样本间的距离），得到一个N*N 维的距离矩阵D(0)。

这里，标号(0)表示聚类开始运算前的状态。

第二步：假设前一步聚类运算中已求得距离矩阵D(n)，n 为逐次聚类合并的次数，则求D(n)中的最小元素。

如果它是Gi(n)和Gj(n)两类之间的距离，则将Gi(n)和Gj(n)两类合并为一类G ij (n+1)，由此建立新的分类：G 1(n+1), G 2(n+1)……第三步：计算合并后新类别之间的距离，得D(n+1)。

模式识别大作业

模式识别大作业(总21页)--本页仅作为文档封面，使用时请直接删除即可----内页可以根据需求调整合适字体及大小--作业1 用身高和/或体重数据进行性别分类（一）基本要求：用和的数据作为训练样本集，建立Bayes分类器，用测试样本数据对该分类器进行测试。

调整特征、分类器等方面的一些因素，考察它们对分类器性能的影响，从而加深对所学内容的理解和感性认识。

具体做法：1．应用单个特征进行实验：以（a）身高或者（b）体重数据作为特征，在正态分布假设下利用最大似然法或者贝叶斯估计法估计分布密度参数，建立最小错误率Bayes分类器，写出得到的决策规则，将该分类器应用到测试样本，考察测试错误情况。

在分类器设计时可以考察采用不同先验概率（如对, 对, 对等）进行实验，考察对决策规则和错误率的影响。

图1-先验概率:分布曲线图2-先验概率:分布曲线图3--先验概率:分布曲线图4不同先验概率的曲线有图可以看出先验概率对决策规则和错误率有很大的影响。

程序：和2．应用两个特征进行实验：同时采用身高和体重数据作为特征，分别假设二者相关或不相关（在正态分布下一定独立），在正态分布假设下估计概率密度，建立最小错误率Bayes分类器，写出得到的决策规则，将该分类器应用到训练/测试样本，考察训练/测试错误情况。

比较相关假设和不相关假设下结果的差异。

在分类器设计时可以考察采用不同先验概率（如 vs. , vs. , vs. 等）进行实验，考察对决策和错误率的影响。

训练样本female来测试图1先验概率 vs. 图2先验概率 vs.图3先验概率 vs. 图4不同先验概率对测试样本1进行试验得图对测试样本2进行试验有图可以看出先验概率对决策规则和错误率有很大的影响。

程序和3．自行给出一个决策表，采用最小风险的Bayes决策重复上面的某个或全部实验。

W1W2W10W20close all;clear all;X=120::200; %设置采样范围及精度pw1=;pw2=; %设置先验概率sample1=textread('') %读入样本samplew1=zeros(1,length(sample1(:,1)));u1=mean(sample1(:,1));m1=std(sample1(:,1));y1=normpdf(X,u1,m1); %类条件概率分布figure(1);subplot(2,1,1);plot(X,y1);title('F身高类条件概率分布曲线');sample2=textread('') %读入样本samplew2=zeros(1,length(sample2(:,1)));u2=mean(sample2(:,1));m2=std(sample2(:,1));y2=normpdf(X,u2,m2); %类条件概率分布subplot(2,1,2);plot(X,y2);title('M身高类条件概率分布曲线');P1=pw1*y1./(pw1*y1+pw2*y2);P2=pw2*y2./(pw1*y1+pw2*y2);figure(2);subplot(2,1,1);plot(X,P1);title('F身高后验概率分布曲线');subplot(2,1,2);plot(X,P2);title('M身高后验概率分布曲线');P11=pw1*y1;P22=pw2*y2;figure(3);subplot(3,1,1);plot(X,P11);subplot(3,1,2);plot(X,P22);subplot(3,1,3);plot(X,P11,X,P22);sample=textread('all ') %读入样本[result]=bayes(sample1(:,1),sample2(:,1),pw1,pw2);%bayes分类器function [result] =bayes(sample1(:,1),sample2(:,1),pw1,pw2); error1=0;error2=0;u1=mean(sample1(:,1));m1=std(sample1(:,1));y1=normpdf(X,u1,m1); %类条件概率分布u2=mean(sample2(:,1));m2=std(sample2(:,1));y2=normpdf(X,u2,m2); %类条件概率分布P1=pw1*y1./(pw1*y1+pw2*y2);P2=pw2*y2./(pw1*y1+pw2*y2);for i = 1:50if P1(i)>P2(i)result(i)=0;pe(i)=P2(i);elseresult(i)=1;pe(i)=P1(i);endendfor i=1:50if result(k)==0error1=error1+1;else result(k)=1error2=error2+1;endendratio = error1+error2/length(sample); %识别率,百分比形式sprintf('正确识别率为%.2f%%.',ratio)作业2 用身高/体重数据进行性别分类（二）基本要求：试验直接设计线性分类器的方法，与基于概率密度估计的贝叶斯分离器进行比较。

模式识别大作业

作业1 用身高和/或体重数据进行性别分类（一）基本要求：用和的数据作为训练样本集，建立Bayes分类器，用测试样本数据对该分类器进行测试。

调整特征、分类器等方面的一些因素，考察它们对分类器性能的影响，从而加深对所学内容的理解和感性认识。

在分类器设计时可以考察采用不同先验概率（如对, 对, 对等）进行实验，考察对决策规则和错误率的影响。

图1-先验概率:分布曲线图2-先验概率:分布曲线图3--先验概率:分布曲线图4不同先验概率的曲线有图可以看出先验概率对决策规则和错误率有很大的影响。

程序：和2．应用两个特征进行实验：同时采用身高和体重数据作为特征，分别假设二者相关或不相关（在正态分布下一定独立），在正态分布假设下估计概率密度，建立最小错误率Bayes 分类器，写出得到的决策规则，将该分类器应用到训练/测试样本，考察训练/测试错误情况。

比较相关假设和不相关假设下结果的差异。

在分类器设计时可以考察采用不同先验概率（如vs. , vs. , vs. 等）进行实验，考察对决策和错误率的影响。

训练样本female来测试图1先验概率vs. 图2先验概率vs.图3先验概率vs. 图4不同先验概率对测试样本1进行试验得图对测试样本2进行试验有图可以看出先验概率对决策规则和错误率有很大的影响。

程序和3．自行给出一个决策表，采用最小风险的Bayes决策重复上面的某个或全部实验。

W1W2W10W20close all;clear all;X=120::200; %设置采样范围及精度pw1=;pw2=; %设置先验概率sample1=textread('') %读入样本samplew1=zeros(1,length(sample1(:,1)));u1=mean(sample1(:,1));m1=std(sample1(:,1));y1=normpdf(X,u1,m1); %类条件概率分布figure(1);subplot(2,1,1);plot(X,y1);title('F身高类条件概率分布曲线');sample2=textread('') %读入样本samplew2=zeros(1,length(sample2(:,1)));u2=mean(sample2(:,1));m2=std(sample2(:,1));y2=normpdf(X,u2,m2); %类条件概率分布subplot(2,1,2);plot(X,y2);title('M身高类条件概率分布曲线');P1=pw1*y1./(pw1*y1+pw2*y2);P2=pw2*y2./(pw1*y1+pw2*y2);figure(2);subplot(2,1,1);plot(X,P1);title('F身高后验概率分布曲线');subplot(2,1,2);plot(X,P2);title('M身高后验概率分布曲线');P11=pw1*y1;P22=pw2*y2;figure(3);subplot(3,1,1);plot(X,P11);subplot(3,1,2);plot(X,P22);subplot(3,1,3);plot(X,P11,X,P22);sample=textread('all ') %读入样本[result]=bayes(sample1(:,1),sample2(:,1),pw1,pw2);%bayes分类器function [result] =bayes(sample1(:,1),sample2(:,1),pw1,pw2);error1=0;error2=0;u1=mean(sample1(:,1));m1=std(sample1(:,1));y1=normpdf(X,u1,m1); %类条件概率分布u2=mean(sample2(:,1));m2=std(sample2(:,1));y2=normpdf(X,u2,m2); %类条件概率分布P1=pw1*y1./(pw1*y1+pw2*y2);P2=pw2*y2./(pw1*y1+pw2*y2);for i = 1:50if P1(i)>P2(i)result(i)=0;pe(i)=P2(i);elseresult(i)=1;pe(i)=P1(i);endendfor i=1:50if result(k)==0error1=error1+1;else result(k)=1error2=error2+1;endendratio = error1+error2/length(sample); %识别率,百分比形式sprintf('正确识别率为%.2f%%.',ratio)作业2 用身高/体重数据进行性别分类（二）基本要求：试验直接设计线性分类器的方法，与基于概率密度估计的贝叶斯分离器进行比较。

模式识别第三章作业

1. 在一个10类的模式识别问题中，有3类单独满足多类情况1，其余的类别满足多类情况2。

问该模式识别问题所需判别函数的最少数目是多少？答：25个判别函数。

将10类问题看作4类满足多类情况1的问题，先将3类单独满足多类情况1的类找出来，再将剩下的7类全部划到第4类中。

再对第四类运用多类情况2的判别法则进行分类，此时需要7*（7-1）/2=21个判别函数。

所有一共需要4+21=25个判别函数；2. 一个三类问题，其判别函数如下：d1(x)=-x1, d2(x)=x1+x2-1, d3(x)=x1-x2-1(1) 设这些函数是在多类情况1条件下确定的，绘出其判别界面和每一个模式类别的区域（2）设为多类情况2，并使：d12(x)= d1(x), d13(x)= d2(x), d23(x)= d3(x)。

绘出其判别界面和多类情况2的区域。

（3）设d1(x), d2(x)和d3(x)是在多类情况3的条件下确定的，绘出其判别界面和每类的区域3．两类模式，每类包括5个3维不同的模式，且良好分布。

如果它们是线性可分的，问权向量至少需要几个系数分量？假如要建立二次的多项式判别函数，又至少需要几个系数分量？（设模式的良好分布不因模式变化而改变。

）解：由总项数公式()!!!rw n rn rN Cr n++==，得1 44N C==；23210N C+==所以如果它们是线性可分的，则权向量至少需要4个系数分量；如要建立二次的多项式判别函数，则至少需要10个系数分量4．用感知器算法求下列模式分类的解向量w:ω1: {(0 0 0)T, (1 0 0)T, (1 0 1)T, (1 1 0)T}ω2: {(0 0 1)T, (0 1 1)T, (0 1 0)T, (1 1 1)T}解：将属于2ω的模式样本乘以（-1）进行第一轮迭代：取C=1，令w(1)= (0 0 0 0)Tw T(1)x①=(0 0 0 0)(0 0 0 1)T=0；故w(2)=w(1)+x①=(0 0 0 1)Tw T(2)x②=(0 0 0 1)(1 0 0 1)T=1>0，故w(3)=w(2)=(0 0 0 1)Tw T(3)x③=(0 0 0 1)(1 0 1 1)T=1>0，故w(4)=w(3)=(0 0 0 1)Tw T(4)x④=(0 0 0 1)(1 1 0 1)T=1>0，故w(5)=w(4)=(0 0 0 1)Tw T(5)x⑤=(0 0 0 1)(0 0 -1 -1)T=-1<0，故w(6)=w(5)+x⑤=(0 0 -1 0)Tw T(6)x⑥=(0 0 -1 0)(0 -1 -1 -1)T=1>0，故w(7)=w(6)=(0 0 -1 0)Tw T(7)x⑦=(0 0 -1 0)(0 -1 0 -1)T=0，故w(8)=w(7)+x⑦=(0 -1 -1 -1)Tw T(8)x⑧=(0 -1 -1 -1)(-1 -1 -1 -1)T=3>0，故w(9)=w(8)=(0 -1 -1 -1)T第二轮迭代：w T(9)x①=(0 -1 -1 -1)(0 0 0 1)T=-1<0；故w(10)=w(9)+x①=(0 -1 -1 0)Tw T(10)x②=(0 -1 -1 0)(1 0 0 1)T=0，故w(11)=w(10)+x②=(1 -1 -1 1)Tw T(11)x③=(1 -1 -1 1)(1 0 1 1)T=1>0，故w(12)=w(11)=(1 -1 -1 1)Tw T(12)x④=(1 -1 -1 1)(1 1 0 1)T=1>0，故w(13)=w(12)=(1 -1 -1 1)Tw T(13)x⑤=(1 -1 -1 1)(0 0 -1 -1)T=0，故w(14)=w(13)+x⑤=(1 -1 -2 0)T w T(14)x⑥=(1 -1 -2 0)(0 -1 -1 -1)T=3>0，故w(15)=w(14)=(1 -1 -2 0)T w T(15)x⑦=(1 -1 -2 0)(0 -1 0 -1)T=1>0，故w(16)=w(15)=(1 -1 -2 0)T w T(16)x⑧=(1 -1 -2 0)(-1 -1 -1 -1)T=2>0，故w(17)=w(16)=(1 -1 -2 0)T 第三轮迭代：…w T(24)x⑧=(2 -2 -2 0)(-1 -1 -1 -1)T=2>0，故w(25)=w(24)=(2 -2 -2 0)T 第四轮迭代：w T(25)x①=(2 -2 -2 0)(0 0 0 1)T=0；故w(26)=w(25)+x①=(2 -2 -2 1)T…w T(32)x⑧=(2 -2 -2 1)(-1 -1 -1 -1)T=1>0，故w(33)=w(32)=(2 -2 -2 1)T 第五轮迭代：….该轮迭代全部大于0所以w=(2 -2 -2 1)TMatlab 运行结果5．用多类感知器算法求下列模式的判别函数：ω1: (-1 -1)Tω2: (0 0)Tω3: (1 1)T解：将模式样本写成增广形式：x①=(-1 -1 1)T, x②=(0 0 1)T, x③=(1 1 1)T取初始值w1(1)=w2(1)=w3(1)=(0 0 0)T，C=1。

1、下载文档前请自行甄别文档内容的完整性，平台不提供额外的编辑、内容补充、找答案等附加服务。
2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
3、如文档侵犯您的权益，请联系客服反馈,我们会尽快为您处理(人工客服工作时间：9:00-18:30)。

Homework #2Note:In some problem (this is true for the entire quarter) you will need to make some assumptions since the problem statement may not fully specify the problem space. Make sure that you make reasonable assumptions and clearly state them.Work alone: You are expected to do your own work on all assignments; there are no group assignments in this course. You may (and are encouraged to) engage in general discussions with your classmates regarding the assignments, but specific details of a solution, including the solution itself, must always be your own work.Problem:In this problem we will investigate the importance of having the correct model for classification. Load file hw2.mat and open it in Matlab using command load hw2. Using command whos, you should see six array c1, c2, c3 and t1, t2, t3, each has size 500 by 2. Arrays c1, c2, c3 hold the training data, and arrays t1, t2, t3 hold the testing data. That is arrays c1, c2, c3 should be used to train your classifier, and arrays t1, t2, t3 should be used to test how the classifier performs on the data it hasn’t seen. Arrays c1 holds training data for the first class, c2 for the second class, c3 for the third class. Arrays t1, t2, t3 hold the test data, where the true class of data in t1, t2, t3 comes from the first, second, third classed respectively. Of course, array ci and ti were drawn from the same distribution for each i. Each training and testing example has 2 features. Thus all arrays are two dimensional, the number of rows is equal to the number of examples, and there are 2 columns, column 1 has the first feature, column 2 has the second feature.(a)Visualize the examples by using Matlab scatter command a plotting each class indifferent color. For example, for class 1 use scatter(c1(:,1),c1(:,2),’r’);. Other possible colors can be found by typing help plot.(b)From the scatter plot in (a), for which classes the multivariate normal distribution lookslike a possible model, and for which classes it is grossly wrong? If you are not sure how to answer this part, do parts (c-d) first.(c)Suppose we make an erroneous assumption that all classed have multivariate normalNμ. Compute the Maximum Likelihood estimates for the means and distributions()∑,covariance matrices (remember you have to do it separately for each class). Make sure you use only the training data; this is the data in arrays c1, c2, and c3.(d)You can visualize what the estimated distributions look like using Matlab contour().Recall that the data should be denser along the smaller ellipse, because these are closer to the estimated mean.(e)Use the ML estimates from the step (c) to design the ML classifier (this is the Bayesclassifier under zero-one loss function with equal priors). Thus we are assuming that priors are the same for each class. Now classify the test example (that is only thoseexamples which are in arrays t1, t2, t3). Compute confusion array which has size 3 by 3, and in ith row and jth column contains the number of examples fro which the true class isi while the class your classifier gives is j. Note that all the off-diagonal elements in theconfusion array are errors. Compute the total classification error, easiest way to do it is to use Matlab function sum() and trace().(f)Inspect the off diagonal elements to see if which types of error are more common thanothers. That should give you an idea of where the decision boundaries lie. Now plot the decision regions experimentally (select a fine 2D grid, classify each point on this grid, and plot the class with distinct color). If you love solving quadratic systems of equations, you can find the decision boundaries analytically. Using your decision boundaries, explain why some errors are more common than others.(g)If the model assumed for the data is wrong, than the ML estimate of the parameters arenot even the best parameters to use for classification with that wrong model. That is because the multivariate normal is the wrong distribution to use with out data, the MLE parameters we computed in part (c) are not the ones which will give us the best classification with our wrong model. To confirm this, find parameters for the means and variances (you can change as many as you like, from one to all) which will give better classification rate than the one you have gotten in part (e). Hint: it pays to try to change covariance matrices, rather than the means.(h)Now let’s try to find a better model for our data. Notice that to determine the class of apoint; it is sufficient to consider the distance of that point from the origin. The distance from the origin is very important for classifying our data, while the direction is totally irrelevant. Convert all the training and testing arrays to polar coordinates using Matlib function cart2pol(). Ignore the first coordinate, which is the angle, and only use the second coordinate, which is the radius (or distance from the origin). Assume now that all classes come from normal distribution with unknown mean and variance. Estimate these unknown parameters using ML estimation again using only the training data (the arrays ci’s). Test how this new classifier works using the testing data (the arrays ti’s) by computing the confusion matrix and the total classification error. How does this classifier compare with the one using the multivariate normal assumption and why is there a difference?(i)Experimentally try to find better parameters than those found by ML method for classifierin (h). If you do find better parameters, do they lead to a significantly better classification error? How does it compare to part (g)? Why can’t you find significantly better parameters than MLE for the classifier in (h)?。