Using the MATLAB Bayes Net Toolbox (BNT): Usage and Examples


Bayesian Networks in Machine Learning and Their Inference Analysis

Zhang Huiying; Ning Yuan; Shao Xiaofei

Abstract: Machine learning, a current research hotspot at home and abroad, has received attention and been put to use in intelligent systems. Bayesian methods are among the core methods of machine learning, and Bayesian networks, built around Bayesian theory, are extending their applications into one problem domain after another. This paper introduces the concept of Bayesian networks and their learning and inference process, and uses the BNT toolbox in MATLAB together with standard data sets from UCI to run simulation tests on Bayesian networks.

Journal: Modern Machinery (现代机械), 2012, no. 2, 4 pages (pp. 91-94)
Keywords: machine learning; Bayesian network; MATLAB; Bayesian learning and inference; BNT toolbox
Affiliation: College of Electrical Engineering, Guizhou University, Guiyang, Guizhou 550003, China
Language: Chinese. CLC classification: TP183

0 Introduction

Machine learning, a current research hotspot at home and abroad, has received attention and been put to use in intelligent systems. Bayesian methods are among the core methods of machine learning, and Bayesian networks, built around Bayesian theory, have extended their applications to many problem domains; wherever probabilistic predictions are needed, Bayesian methods appear. The deep reason is that the real world is itself uncertain and human powers of observation are limited. This is precisely the strength of Bayesian networks, and it makes them worth studying in depth.

1 Machine learning

Machine learning studies how computers can simulate or realize human learning behavior in order to acquire new knowledge or skills, and how they can reorganize existing knowledge structures so as to keep improving their own performance.

Research results in machine learning have quietly entered everyday life: autonomous driving, intelligent robot arms, smart curtains, and many other applications all show machine learning at work. It not only brings convenience to people's lives but is also leading the world into an intelligent, multi-faceted century.

Machine learning aims to build a computational theory of learning, to construct learning systems of various kinds, and to apply these systems across domains. It has four components: the environment, the learning element, the knowledge base, and the performance element [1].

These four components form the flow shown in Figure 1, namely "understand, practice, then understand again", which realizes the machine-learning process.

This dynamic learning process shows that machine learning is in fact a purposeful knowledge-acquisition process: understanding knowledge is the basis of machine-learning research, while the acquisition and improvement of knowledge are its two main concerns.

Using the Matlab BNT

% learning
bnet3 = learn_params(bnet2, data);
Experimental results: the manually specified CPT compared with CPTs learned from nsamples = 20, 200, and 2000 cases (comparison tables omitted in this copy). As the number of training samples increases, the learned conditional probability table gets closer and closer to the manually specified one.
rand('state', seed);
bnet2.CPD{C} = tabular_CPD(bnet2, C);
bnet2.CPD{S} = tabular_CPD(bnet2, S);
bnet2.CPD{R} = tabular_CPD(bnet2, R);
bnet2.CPD{W} = tabular_CPD(bnet2, W);
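For context, here is a hedged sketch of the sampling-and-learning loop these fragments come from (standard BNT usage; bnet is assumed to be the fully specified "true" network, bnet2 the randomly initialized copy above, and N the number of nodes):

nsamples = 200;
samples = cell(N, nsamples);
for i = 1:nsamples
  samples(:, i) = sample_bnet(bnet);   % draw one complete case from the true net
end
data = cell2num(samples);              % BNT helper: cell array -> numeric matrix
bnet3 = learn_params(bnet2, data);     % maximum likelihood estimation from complete data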
% compute the posterior probability of a single node, i.e. do inference
marg1 = marginal_nodes(engine, S);
marg1.T
% compute the joint posterior probability of several nodes
marg2 = marginal_nodes(engine, [S R W]);
marg2.T
% supply "soft evidence", i.e. a probability distribution over a node's possible values
evidence{R} = [];
soft_evidence{R} = [0.6 0.4];
[engine, loglike] = enter_evidence(engine, evidence, 'soft', soft_evidence);
marg3 = marginal_nodes(engine, S);
marg3.T

Experimental results: 1. the Bayesian network; 2. the single-node posterior; 3. the multi-node joint posterior; 4. the posterior under soft evidence.

2. Incinerator waste-emission model and inference (with both discrete and continuous variables): this experiment differs from the first in that the Bayesian network contains continuous node variables, so building the conditional probability distributions differs. Discrete variables use the CPD constructor tabular_CPD; continuous variables use gaussian_CPD. Only that part of the code is given here:

bnet.CPD{B} = tabular_CPD(bnet, B, 'CPT', [0.85 0.15]);

Machine Learning and Bayesian Network Techniques in Matlab

Machine learning is a field that draws on statistics, artificial intelligence, computer science, and other disciplines; it lets computers learn from data and gradually improve their performance at a specific task.

Bayesian networks are a probabilistic graphical model commonly used in machine learning; they can model, and reason about, the dependencies between variables.

This article introduces techniques and methods for applying machine learning and Bayesian networks in Matlab.

I. Machine learning basics. The basic task of machine learning is to build a predictive model by learning from existing data, and then to use that model to make predictions on new data.

In Matlab, we can use common machine learning toolboxes, such as the Statistics and Machine Learning Toolbox and the Neural Network Toolbox, to implement a wide range of machine learning algorithms.

1. Data preparation. Before doing machine learning, we first need to prepare data suitable for modeling. This includes data collection, preprocessing, and feature extraction. Matlab provides rich data handling and visualization functions, such as readtable for loading tabular data and fillmissing and normalize for preprocessing, to help with these tasks.

2. Feature selection. Before modeling, we need to select from the raw data the features that have an important influence on the prediction. Matlab provides feature selection functions such as sequentialfs, relieff, and lasso to help with this.
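As a hedged illustration of sequentialfs (synthetic data; the k-NN wrapper criterion is an assumption, not the only choice), it expects a function handle that scores a candidate feature subset by a misclassification count:

rng(0);                                   % reproducibility
X = [randn(100,3), randn(100,2)];         % 5 features; only some informative
y = double(X(:,1) + X(:,2) > 0) + 1;      % labels depend on features 1 and 2
% criterion: errors of a k-NN classifier trained on XT/yT, tested on Xt/yt
crit = @(XT, yT, Xt, yt) sum(predict(fitcknn(XT, yT), Xt) ~= yt);
selected = sequentialfs(crit, X, y)       % logical mask of selected features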

3. Model training. After data preparation and feature selection, we can train a model with a machine learning algorithm. Depending on the problem and the data type, we choose a suitable algorithm, such as a support vector machine, a decision tree, or a random forest. Matlab provides implementations of these algorithms, such as fitcsvm, fitctree, and TreeBagger (the older svmtrain interface is deprecated), which make model training convenient.

4. Model evaluation. After training, we need to evaluate the model to understand its performance and generalization ability. In Matlab we can compute evaluation metrics such as accuracy, precision, recall, and the F1 score, and in addition use cross-validation, learning curves, and confusion matrices to assess the model.
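As a concrete illustration of steps 3 and 4, here is a minimal sketch on synthetic data (all names are local to the example); fitctree, crossval, kfoldLoss, and confusionmat are Statistics and Machine Learning Toolbox functions:

rng(1);                                  % make the example reproducible
X = [randn(50,2)+1; randn(50,2)-1];      % two Gaussian blobs, two features
y = [ones(50,1); 2*ones(50,1)];          % class labels 1 and 2
mdl = fitctree(X, y);                    % train a decision tree classifier
cvmdl = crossval(mdl, 'KFold', 10);      % 10-fold cross-validation
cvErr = kfoldLoss(cvmdl)                 % estimated generalization error rate
C = confusionmat(y, predict(mdl, X))     % confusion matrix on the training data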

Matlab Bayes Net Toolbox: Fast Manipulation of Multi-Dimensional Arrays

Fast manipulation of multi-dimensional arrays in Matlab
Kevin P. Murphy (**************.edu)
11 September 2002

1 Introduction

Probabilistic inference in graphical models with discrete random variables requires performing various operations on multi-dimensional arrays (discrete potentials). This is true whether we use an exact algorithm like junction tree [CDLS99, HD96] or an approximate algorithm like loopy belief propagation [AM00, KFL01]. These operations consist of element-wise multiplication/division of two arrays of potentially different sizes, and summation (marginalization) over a subset of the dimensions. This report discusses efficient ways to implement these operations in Matlab, with an emphasis on the implementation used in the Bayes Net Toolbox (BNT).

2 Running example

We will introduce an example to illustrate the problems we want to solve. Consider two arrays (tables), Tbig and Tsmall, where Tbig represents a function over the variables X1, X2, X3, X4 and Tsmall represents a function over X1, X3. We will say Tbig's domain is (1,2,3,4), Tsmall's domain is (1,3), and the difference in their domains is (2,4). Let the size of variable Xi (i.e., the number of possible values it can have) be denoted by Si. Here is a straightforward implementation of elementwise multiplication:

for i1=1:S1
  for i2=1:S2
    for i3=1:S3
      for i4=1:S4
        Tbig(i1,i2,i3,i4) = Tbig(i1,i2,i3,i4) * Tsmall(i1,i3);
      end
    end
  end
end

Similarly, here is a straightforward implementation of marginalizing the big array onto the small domain:

for i1=1:S1
  for i3=1:S3
    total = 0;
    for i2=1:S2
      for i4=1:S4
        total = total + Tbig(i1,i2,i3,i4);
      end
    end
    Tsmall(i1,i3) = total;
  end
end

Of course, these are not general solutions, because we have hard-coded the fact that Tbig is 4-dimensional, Tsmall is 2-dimensional, and that we are marginalizing over dimensions 2 and 4. The general solution requires mapping vectors of indices (or subscripts) to 1-dimensional indices (offsets into a 1D array), and vice versa. We discuss these auxiliary functions next, before discussing a variety of different solutions based on these functions.

3 Auxiliary functions

3.1 Converting from multi-dimensional indices to 1D indices

If a d-dimensional array is stored in memory such that the left-most indices toggle fastest (as in Matlab and Fortran; C follows the opposite convention, toggling the right-most indices fastest), then we can compute the 1D index from a vector of indices, subs, as follows:

ndx = 1 + (i1-1) + (i2-1)*S1 + (i3-1)*S1*S2 + ... + (id-1)*S1*S2*...*S(d-1)

where the vector of indices is subs = (i1, ..., id), and the sizes of the dimensions are sz = (S1, ..., Sd).
(We only need to add and subtract 1 if we use 1-based indexing, as in Matlab; in C, we would omit this step.) This can be written more concisely as

ndx = 1 + sum((subs-1) .* [1 cumprod(sz(1:end-1))])

If we pass the subscripts separately, we can use the built-in Matlab function

ndx = sub2ind(sz, i1, i2, ...)

BNT offers a similar function, except the subscripts are passed as a vector:

ndx = subv2ind(sz, subs)

The BNT function is vectorized, i.e., if each row of subs contains a set of subscripts, then ndx will be a vector of 1D indices. This can be computed by simple matrix multiplication. For example, if

subs = [1 1 1; 2 1 1; ...; 2 2 2];

and sz = [2 2 2], then subv2ind(sz, subs) returns [1 2 ... 8]', which can be computed using

cp = [1 cumprod(sz(1:end-1))]';
ndx = (subs-1)*cp + 1;

If all dimensions have size 2, this is equivalent to converting a binary number to decimal.

3.2 Converting from 1D indices to multi-dimensional indices

We convert from 1D to multi-D by dividing (which is slower). For instance, if subs = (1,3,2) and sz = (4,3,3), then ndx will be

ndx = 1 + (1-1)*1 + (3-1)*4 + (2-1)*4*3 = 21

To convert back, we compute

i1 = 1 + floor(((21-1) mod 4) / 1) = 1
i2 = 1 + floor(((21-1) mod 12) / 4) = 3
i3 = 1 + floor(((21-1) mod 36) / 12) = 2

The Matlab and BNT functions ind2sub and ind2subv do this. Note: if all dimensions have size 2, this is equivalent to converting a decimal to a binary number.

3.3 Comparing different domains

In our example, Tbig's domain was (X1,X2,X3,X4) and Tsmall's domain was (X1,X3). The identity of the variables does not matter; what matters is that for each dimension in Tsmall, we can find the corresponding/equivalent dimension in Tbig. This is called a mapping, and can be computed in BNT using

map = find_equiv_posns(small_domain, big_domain)

(In R, this function is called match. In Phrog [Phr], this is called an RvMapping.) If smalldom = (1,3) and bigdom = (1,2,3,4), then map = (1,3); if smalldom = (8,2) and bigdom = (2,7,4,8), then map = (4,1); etc.

4 Naive method

Given the above auxiliary functions, we can implement multiplication as follows:

map = find_equiv_posns(Tsmall.domain, Tbig.domain);
for ibig=1:prod(Tbig.sizes)
  subs_big = ind2subv(Tbig.sizes, ibig);
  subs_small = subs_big(map);
  ismall = subv2ind(Tsmall.sizes, subs_small);
  Tbig.T(ibig) = Tbig.T(ibig) * Tsmall.T(ismall);
end

(Note that this involves a single write to Tbig in sequential order, but multiple reads from Tsmall in a random order; this can affect cache performance.) Marginalization can be implemented as follows (other methods are possible, as we will see below).
Tsmall.T = zeros(Tsmall.sizes);
map = find_equiv_posns(Tsmall.domain, Tbig.domain);
for ibig=1:prod(Tbig.sizes)
  subs_big = ind2subv(Tbig.sizes, ibig);
  subs_small = subs_big(map);
  ismall = subv2ind(Tsmall.sizes, subs_small);
  Tsmall.T(ismall) = Tsmall.T(ismall) + Tbig.T(ibig);
end

(Note that this involves multiple writes to Tsmall in a random order, but a single read of each element of Tbig in sequential order; this can affect cache performance.) The problem is that calling ind2subv and subv2ind inside the inner loop is very slow, so we now seek faster methods.

(Table 1, reconstructed from the flattened original: the subscript vectors of Tbig for 1D indices ndx = 1..16, in order: (1,1,1,1), (2,1,1,1), (1,2,1,1), (2,2,1,1), (1,1,2,1), (2,1,2,1), (1,2,2,1), (2,2,2,1), (1,1,1,2), (2,1,1,2), (1,2,1,2), (2,2,1,2), (1,1,2,2), (2,1,2,2), (1,2,2,2), (2,2,2,2).)

To look up the indices, we need to convert map and big_sizes, both of which are lists of positive (small) integers, into integers. This could be done with a hash function. This makes it possible to hide the presence of the cache inside an object (call it a TableEngine) which can multiply/marginalize (pairs of) tables, e.g.,

function Tbig = multiply(TableEngine, Tsmall, Tbig)
map = find_equiv_posns(Tsmall.domain, Tbig.domain);
cache_entry_num = hash_fn(map, Tbig.sizes);
ndx = TableEngine.ndx_cache{cache_entry_num};
Tbig.T(:) = Tbig.T(:) .* Tsmall.T(ndx);

One big problem with Matlab is that Tbig will be passed by value, since it is modified within the function, and then copied back to the callee. This could be avoided if functions could be inlined, or if pass by reference were supported, but this is not possible with the current version of Matlab. Another problem is that computing the hash function is slow in Matlab, so what I currently do in BNT is explicitly store the cache-entry-num for every pair of potentials that will be multiplied (i.e., for every pair of neighboring cliques in the jtree). Also, for every potential, I store the cache-entry-num for every domain onto which the potential may be marginalized (this corresponds to all families and singletons in the bnet). This allows me to quickly retrieve the relevant cache entry, but unfortunately requires the inference engine to be aware of the presence of the cache. That is, the current code (inside jtree_ndx_inf_engine/collect_evidence) looks more like the following:

ndx = B_NDX_CACHE{engine.mult_cl_by_sep_ndx_id(p,n)};
ndx = double(ndx) + 1;
Tsmall = Tsmall(ndx);
Tbig(:) = Tbig(:) .* Tsmall(:);

where mult_cl_by_sep_ndx_id(p,n) is the cache entry number for multiplying clique p by separator n. By implementing a TableEngine with a hash function, we could hide the presence of the cache and simplify the code. Furthermore, instead of having jtree_inf and jtree_ndx_inf, we would have a single class, jtree_inf, which could use different implementations of the TableEngine object, e.g., with or without caching. Indeed, any code that manipulates tables (e.g., loopy) would be able to use different implementations of TableEngine.
We will now discuss other possible implementations of the TableEngine, which use different kinds of indices, or even no indices at all.

7 Reducing the size of the indices: ndxSD

We can reduce the space requirements for storing the indices from B to S+D, where S = prod(sz(Tsmall.domain)) and D = prod(sz(diff_domain)). To explain how this works, let us first rewrite the marginalization code, so that we write to each element of Tsmall once in sequential order, but do multiple random reads from Tbig (the opposite of before).

small_map = find_equiv_posns(Tsmall.domain, Tbig.domain);
diff = setdiff(Tbig.domain, Tsmall.domain);
diff_map = find_equiv_posns(diff, Tbig.domain);
diff_sz = Tbig.sizes(diff_map);
for ismall=1:prod(Tsmall.sizes)
  subs_small = ind2subv(Tsmall.sizes, ismall);
  subs_big = zeros(1, length(Tbig.domain));
  total = 0;
  for jdiff=1:prod(diff_sz)
    subs_diff = ind2subv(diff_sz, jdiff);
    subs_big(small_map) = subs_small;
    subs_big(diff_map) = subs_diff;
    ibig = subv2ind(Tbig.sizes, subs_big);
    total = total + Tbig.T(ibig);
  end
  Tsmall.T(ismall) = total;
end

Now suppose we have a function that computes ibig given ismall and jdiff; call it index(ismall, jdiff). Then we can rewrite the above as follows:

for ismall=1:prod(Tsmall.sizes)
  total = 0;
  for jdiff=1:prod(diff_sz)
    total = total + Tbig.T(index(ismall, jdiff));
  end
  Tsmall.T(ismall) = total;
end

Similarly, we can implement multiplication by doing a single write to Tbig in a random order, and multiple sequential reads from Tsmall (the opposite of before).

for ismall=1:prod(Tsmall.sizes)
  for jdiff=1:prod(diff_sz)
    ibig = index(ismall, jdiff);
    Tbig.T(ibig) = Tbig.T(ibig) * Tsmall.T(ismall);
  end
end

7.1 Computing the indices

We now explain how to compute the index (what [Phr] calls a FactorMapping) using our running example. Referring to Table 1, we see that entries 1,3,9,11 of Tbig map to Tsmall(1), entries 2,4,10,12 map to 2, entries 5,7,13,15 map to 3, and entries 6,8,14,16 map to 4. Instead of keeping the whole ndx, it is sufficient to keep two tables: one that keeps the first value of each group, call it start (in this case [1,2,5,6]), and one that keeps the offset from the start within each group (in this case [0,2,8,10]). (In BNT, start is called small_ndx and offset is called diff_ndx.) Then we have

index(ismall, jdiff) = start(ismall) + offset(jdiff)

We can compute the start positions by noticing that, in this example, we just increment X1 and X3, keeping the remaining dimensions (the diff domain) constantly clamped to 1; this can be implemented by setting the effective size of the difference dimensions to 1 (so they only have 1 possible value).

diff_domain = mysetdiff(Tbig.domain, Tsmall.domain);
map = find_equiv_posns(diff_domain, Tbig.domain);
diff_sizes = Tbig.sizes(map);
sz = Tbig.sizes;
sz(map) = 1;   % sz(diff) is 1, sz(small) is normal
subs = ind2subv(sz, 1:prod(Tsmall.sizes));
start = subv2ind(Tbig.sizes, subs);

Similarly, we can compute the offsets by incrementing X2 and X4, keeping X1 and X3 fixed.

map = find_equiv_posns(Tsmall.domain, Tbig.domain);
sz = Tbig.sizes;
sz(map) = 1;   % sz(small) is 1, sz(diff) is normal
subs = ind2subv(sz, 1:prod(diff_sz));
offset = subv2ind(Tbig.sizes, subs) - 1;

7.2 Avoiding for-loops

Given start and offset, we can implement multiplication and marginalization using two for-loops, as shown above. This is fast in C, but slow in Matlab. For small problems, it is possible to vectorize both operations, as follows. First we create a matrix of indices, ndx2, from start and offset:

ndx2 = repmat(start(:), 1, prod(diff_sizes)) + repmat(offset(:)', prod(Tsmall.sizes), 1);

In our example, this produces

1 1 1 1     0 2  8 10     1 3  9 11
2 2 2 2  +  0 2  8 10  =  2 4 10 12
5 5 5 5     0 2  8 10     5 7 13 15
6 6 6 6     0 2  8 10     6 8 14 16

Row 1 contains the locations in Tbig which should be summed together and stored in
Tsmall(1), etc. Hence we can write

Tsmall.T = sum(Tbig.T(ndx2), 2);

For multiplication, each element of Tsmall gets mapped to many elements of Tbig, hence

Tbig.T(ndx2(:)) = Tbig.T(ndx2(:)) .* repmat(Tsmall.T(:), prod(diff_sz), 1);

8 Eliminating for-loops and indices

We can eliminate the need for for-loops and indices as follows. We make Tsmall the same size as Tbig by replicating entries where necessary, and then just doing an element-wise multiplication:

temp = extend_domain(Tsmall.T, Tsmall.domain, Tsmall.sizes, Tbig.domain, Tbig.sizes);
Tbig.T = Tbig.T .* temp;

Let us explain extend_domain using our standard example. First we reshape Tsmall to make it have size [2 1 2 1], so it has the same number of dimensions as Tbig. (This involves no real work.)

map = find_equiv_posns(smalldom, bigdom);
sz = ones(1, length(bigdom));
sz(map) = small_sizes;
Tsmall = reshape(Tsmall, sz);

Now we replicate Tsmall along dimensions 2 and 4, to make it the same size as big_sizes (we assume the variables have a standardized order, so there is no need to permute dimensions).

sz = big_sizes;
sz(map) = 1;
Tsmall = repmat(Tsmall, sz(:)');

Now we are ready to multiply: Tbig .* Tsmall. For small problems, this is faster, at least in Matlab, but in C it would be faster to avoid copying memory. Doug Schwarz wrote a genops class using C that can avoid the repmat above. See /~schwarz/genops.html (unfortunately no longer available online).

For marginalization, we can simply call Matlab's built-in sum command on each dimension, and then squeeze the result, to get rid of dimensions that have been reduced to size 1 (we must remember to put back any dimensions that were originally size 1, by reshaping).

(Summary table, layout lost in this copy: for each method and the section describing it, the table lists the kind of index used, the read/write access patterns for Tsmall and Tbig during marginalization, e.g. multiple random reads versus a single sequential write, and whether the method is vectorized.)

References
[KFL01] F. Kschischang, B. Frey, and H.-A. Loeliger. Factor graphs and the sum-product algorithm. IEEE Transactions on Information Theory, February 2001.
[Phr] Phrog: Stanford's C++ Bayes net package. /frog/mappings.html.
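The Section 8 method can be checked in plain Matlab without BNT. Below is a minimal sketch on the running example (all variables binary; the mapped dimensions are hard-coded here as an assumption rather than computed with find_equiv_posns):

% running example: Tbig over (X1,X2,X3,X4), Tsmall over (X1,X3)
big_sizes = [2 2 2 2];
small_sizes = [2 2];
map = [1 3];                              % dims of Tbig matched by Tsmall
Tbig = rand(big_sizes);
Tsmall = rand(small_sizes);

% extend Tsmall to size [2 1 2 1], then replicate along dims 2 and 4
sz = ones(1, 4); sz(map) = small_sizes;
Ts = reshape(Tsmall, sz);
rep = big_sizes; rep(map) = 1;
Tbig2 = Tbig .* repmat(Ts, rep);

% compare against the explicit four-deep loop from Section 2
Tbig3 = Tbig;
for i1=1:2, for i2=1:2, for i3=1:2, for i4=1:2
  Tbig3(i1,i2,i3,i4) = Tbig(i1,i2,i3,i4) * Tsmall(i1,i3);
end, end, end, end
assert(max(abs(Tbig2(:) - Tbig3(:))) < 1e-12)   % the two agree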

BP Neural Network in Matlab: Implementation and Toolbox Usage Example
w = rand(hideNums, outputNums);        % 10*3, like V
deltw = zeros(hideNums, outputNums);   % 10*3
dw = zeros(hideNums, outputNums);      % 10*3
samplelist = [0.1323,0.323,-0.132; 0.321,0.2434,0.456; -0.6546,-0.3242,0.3255];   % 3*3 input samples (3 cases)
expectlist = [0.5435,0.422,-0.642; 0.1,0.562,0.5675; -0.6464,-0.756,0.11];        % 3*3 expected (target) outputs
num1 = 5;      % number of hidden units
num2 = 10000;  % maximum number of iterations
a1 = 0.02;     % target (expected) error
a2 = 0.05;     % learning rate
test = randn(1,5)*0.5;   % 5 random test points
in = -1:.1:1;            % training inputs
expout = [-.9602 -.5770 -.0729 .3771 .6405 .6600 .4609 .1336 -.2013 -.4344 -.5000 -.3930 -.1647 .0988 .3072 .3960 .3449 .1816   % training targets (vector truncated in the source)
% p, t: training inputs and targets; pp: new inputs to evaluate;
% ww: outputs of the BP network trained on (p, t), evaluated on pp
function ww = bpnet(p, t, ynum, maxnum, ex, lr, pp)
plot(p, t, '+');
title('training data');
xlabel('P');
ylabel('t');
[w1, b1, w2, b2] = initff(p, ynum, 'tansig', t, 'purelin');   % initialize a two-layer feed-forward BP net
zhen = 25;       % display interval (epochs between progress reports)
biglr = 1.1;     % factor for increasing the learning rate (while the error keeps falling)
litlr = 0.7;     % factor for decreasing the learning rate (when the error rises)
a = 0.7;         % momentum constant: deltaW(t) = lr*X*delta + a*deltaW(t-1)
tp = [zhen maxnum ex lr biglr litlr a 1.04];   % parameter vector for trainbpx
[w1, b1, w2, b2, ep, tr] = trainbpx(w1, b1, 'tansig', w2, b2, 'purelin', p, t, tp);
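The functions initff and trainbpx come from the early Neural Network Toolbox and were removed long ago. For readers on current MATLAB, here is a hedged modern equivalent using fitnet from the Deep Learning Toolbox; the parameter mapping is approximate:

net = fitnet(ynum);              % one hidden layer with ynum tansig units
net.trainParam.epochs = maxnum;  % maximum training epochs
net.trainParam.goal = ex;        % error goal
net.trainParam.lr = lr;          % learning rate (only used by gradient-descent trainers; default is trainlm)
net = train(net, p, t);          % train on inputs p, targets t (samples as columns)
ww = net(pp);                    % evaluate the trained network on new inputs pp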

Naive Bayes Classification Code in Matlab

The naive Bayes classification algorithm is a statistical learning method based on Bayes' theorem, commonly used for classification problems, including text classification.

Below is a simple example showing how to do naive Bayes classification with MATLAB's Statistics and Machine Learning Toolbox:

% generate example data
data = randn(100, 2);              % random data with two features
labels = randi([1, 2], 100, 1);    % two class labels (1 or 2)

% split into training and test sets
idx = randperm(100);
trainData = data(idx(1:70), :);
trainLabels = labels(idx(1:70));
testData = data(idx(71:end), :);
testLabels = labels(idx(71:end));

% train a naive Bayes classifier
nbClassifier = fitcnb(trainData, trainLabels);

% predict on the test set
predictedLabels = predict(nbClassifier, testData);

% evaluate classifier performance
accuracy = sum(predictedLabels == testLabels) / numel(testLabels);
disp(['Classifier accuracy: ', num2str(accuracy * 100), '%']);

In this example, we first generate some random two-dimensional data and assign each data point a class label.

Then we split the data into a training set and a test set.

Next, we train the naive Bayes classifier with the fitcnb function and make predictions on the test set with the predict function.

Finally, we compute the classifier's accuracy.

Note that this is only a simple demonstration; in real applications you will likely need more complex data sets and feature engineering.

Matlab enter_evidence Explained

enter_evidence is a function in the Bayes Net Toolbox (BNT) for MATLAB.

Its main purpose is to add observed data (evidence) so that subsequent probabilistic inference or learning takes it into account.

A Bayesian network is a probabilistic graphical model in which the dependencies between random variables are represented by a directed acyclic graph.

When using enter_evidence, the usual pattern is to pass the evidence, together with an inference engine built from the network, to the function; the engine then conditions all subsequent queries on that evidence.

The basic syntax of the function is as follows:

[engine, loglik] = enter_evidence(engine, evidence)

where:

● engine is an inference engine object created from a Bayesian network (for example with jtree_inf_engine);

● evidence is a cell array with one cell per node: evidence{i} holds the observed value of node i, and an empty cell means that node is unobserved.

For example, suppose you have a Bayesian network containing two random variables, A and B, numbered 1 and 2. If you observe that A takes the value 1, you can add this observation to the engine with the following code:

evidence = cell(1, 2);
evidence{1} = 1;   % A = 1
[engine, loglik] = enter_evidence(engine, evidence);

After this operation, all posterior probabilities computed from the engine are conditioned on the evidence you provided.

In summary, the enter_evidence function plays a crucial role in Bayesian network analysis: it lets the user update the network's probability distributions according to observed data, which is essential for tasks such as inference, prediction, and decision support.
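Here is a hedged end-to-end sketch (it assumes the classic wet-grass network with nodes C, S, R, W numbered 1 to 4 has already been built as bnet, as elsewhere in this document):

N = 4;
evidence = cell(1, N);                 % one cell per node; empty = unobserved
evidence{3} = 2;                       % observe R (rain) = true
engine = jtree_inf_engine(bnet);       % build an exact inference engine
[engine, loglik] = enter_evidence(engine, evidence);
marg = marginal_nodes(engine, 2);      % posterior of S (sprinkler) given the evidence
marg.T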

How to Use the Bayes Net Toolbox

How to Use the Bayes Net Toolbox (2004-1-7 version)
Translated by Banban (QQ: 23920620); contact: banban23920620@

Contents:
- Installation: installing the Matlab code; installing the C code; useful Matlab tips
- Creating your first Bayes net: creating a model by hand; loading a model from a file; creating a model using the GUI
- Inference: computing marginal distributions; computing joint distributions; virtual (soft) evidence; most probable explanation
- Conditional probability distributions: tabular (multinomial) nodes; noisy-or nodes; other (noisy) deterministic nodes; softmax (multinomial logit) nodes; neural network nodes; root nodes; Gaussian nodes; generalized linear model nodes; classification/regression tree nodes; other continuous distributions; summary of CPD types
- Example models: Gaussian mixture models; PCA, ICA, etc.; mixtures of experts; hierarchical mixtures of experts; QMR; conditional Gaussian models; other hybrid models
- Parameter learning: loading data from a file; maximum likelihood parameter estimation from complete data; parameter priors; (sequential) Bayesian parameter updating from complete data; maximum likelihood estimation with missing data (the EM algorithm); parameter types
- Structure learning: exhaustive search; the K2 algorithm; hill climbing; MCMC; active learning; structural EM; visually inspecting the learned graph structure; constraint-based methods
- Inference engines: junction tree; variable elimination; global inference methods; quickscore; belief propagation; sampling (Monte Carlo); summary of inference engines
- Influence diagrams / decision making
- DBNs, HMMs, Kalman filters, etc.

Installation

Installing the Matlab code:
1. Download the FullBNT.zip file.
2. Unzip the file.
3. Edit "FullBNT/BNT/add_BNT_to_path.m" so that it contains the correct working path:
4. BNT_HOME = '<the FullBNT working path>';
5. Start Matlab.
6. Note that running BNT requires Matlab version 5.2 or later.
7. Go to the BNT folder; for example, under Windows, type:
8. >> cd C:\kpmurphy\matlab\FullBNT\BNT
9. Type "add_BNT_to_path" and execute the command. This adds the paths: all the folders are added to the Matlab path.
10. Type "test_BNT" to check whether everything runs normally. This may produce some numbers and warning messages, but there should be no error messages.


bnet.CPD{S} = tabular_CPD(bnet, S, [0.5 0.9 0.5 0.1]);
bnet.CPD{W} = tabular_CPD(bnet, W, [1 0.1 0.1 0.01 0 0.9 0.9 0.99]);
draw_graph(dag)   % draw the directed graph
Inference

There are many different algorithms for inference in Bayesian networks, with different trade-offs in speed, complexity, generality, and accuracy. BNT therefore provides a wide variety of inference engines.
Structure learning

Structure learning example: use the K2 algorithm on random samples generated from the CPDs of the wet-grass example to learn the structure and build the Bayesian network graph (a sketch follows below). See wetgrassdata.txt and the K2_wetgrassdata file.
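A hedged sketch of that K2 run (the layout of wetgrassdata.txt is an assumption based on the description above; learn_struct_K2 is the BNT routine):

data = load('wetgrassdata.txt')';    % assume one case per row; transpose to nodes x cases
ns = [2 2 2 2];                      % node sizes
order = [1 2 3 4];                   % K2 requires a node ordering (C,S,R,W assumed)
dag2 = learn_struct_K2(data, ns, order);
draw_graph(dag2);                    % compare with the true wet-grass DAG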
Inference

BNT provides a wide variety of inference engines:
• junction tree: jtree_inf_engine
• variable elimination: var_elim_inf_engine
• global inference: enumerative_inf_engine, gaussian_inf_engine, and cond_gauss_inf_engine
• quickscore: quickscore_inf_engine
• sampling: likelihood_weighting_inf_engine, gibbs_sampling_inf_engine
• Calling convention: all inference engines are invoked in the same way:
Network structure

bnet = mk_bnet(dag, [2 2 2 2], 'names', {'C','S','R','W'}, 'discrete', 1:4);
bnet.CPD{C} = tabular_CPD(bnet, C, [0.5 0.5]);
bnet.CPD{R} = tabular_CPD(bnet, R, [0.8 0.2 0.2 0.8]);
bnet.CPD{S} = tabular_CPD(bnet, S, [0.5 0.9 0.5 0.1]);
bnet.CPD{W} = tabular_CPD(bnet, W, [1 0.1 0.1 0.01 0 0.9 0.9 0.99]);
– engine = jtree_inf_engine(bnet);                      % pick an engine
– [engine, loglik] = enter_evidence(engine, evidence);  % add the evidence
– marg = marginal_nodes(engine, S);                     % specify the node to query
– marg.T                                                % display the result
• number of nodes
• relations between the nodes
• node sizes and types
• building the Bayes net

Parameters (parameters)

• conditional probability distributions (CPDs)
• the simplest kind is the CPT: the conditional probability table
Graph visualization

• drawing the directed graph

Inference

• the junction tree engine: the foundation of exact inference
• computing marginal distributions
• computing joint distributions

Network structure (structure)
• Type "test_BNT" to check that everything runs; this may produce some numbers and warning messages (which you can ignore), but there should be no error messages.
Using the MATLAB BNT Toolbox with Examples
Creating your first Bayes net
Network structure (structure)
(Yang Haibin will give a detailed introduction to this later.)
– evidence = cell(1,N);
– evidence{W} = 2;                                      % the evidence is W = 2
– [engine, loglik] = enter_evidence(engine, evidence);  % add the evidence to the engine
– marg = marginal_nodes(engine, S);                     % query the distribution of S
– marg.T                                                % P(S=1|W=2) and P(S=2|W=2)
– marg.T(2)                                             % the marginal probability P(S=2|W=2)
Number of nodes, relations between the nodes, node sizes and types:

N = 4;                    % 4 nodes
% parent -> children: 1 -> 2,3;  2 -> 4;  3 -> 4
dag = zeros(N,N);
dag(1,[2 3]) = 1;
dag(2,4) = 1;
dag(3,4) = 1;
% every node takes two values: F & T
node_sizes = 2*ones(1,N);
% node types:
discrete_nodes = 1:4;

Building the Bayesian network:
Then, make a tabula rasa (random initial parameters):

bnet2 = mk_bnet(dag, node_sizes);
seed = 0;
rand('state', seed);
bnet2.CPD{C} = tabular_CPD(bnet2, C);
bnet2.CPD{R} = tabular_CPD(bnet2, R);
bnet2.CPD{S} = tabular_CPD(bnet2, S);
bnet2.CPD{W} = tabular_CPD(bnet2, W);
3. Problems encountered when testing on the bank's individual-customer data: the data contain both continuous and discrete variables, and we cannot yet handle the continuous ones. Moreover, some discrete attributes, such as day of the month (1-30), month (1-12), age (18 to 87 at most), and call duration (4 s at the shortest, 3500 s at the longest), should not be fed to MATLAB directly; even if they could be, the results would surely be poor, so data preprocessing is required.
Of course, the result can also be a multidimensional array:

marg = marginal_nodes(engine, [S R W]);   % the joint distribution of S, R, W, i.e. P(S,R,W)

With no evidence, the result is:
ans(:,:,1) =
    0.2900    0.0410
    0.0210    0.0009
ans(:,:,2) =
         0    0.3690
    0.1890    0.0891
Learning the parameters of node 4:

dispcpt(CPT3{4})

With samples = 20, 50, 100, the learned parameters of node 4 get closer and closer to their true values. (Slide figure: the original CPT compared with the CPT obtained by parameter learning.)
If data are already available, there is no need to generate samples by hand; just load the data.

• Loading data for parameter learning:
  – first save the samples obtained above to datatest.txt
• Parameter learning:
  – data = load('datatest.txt');
  – samples = data';
  – bnet2 = …… dispcpt(   (the snippet breaks off here; a hedged completion follows below)
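A hedged completion of the truncated snippet (the CPD-unpacking idiom with struct() is the one documented for BNT; node 4 is W):

data = load('datatest.txt');
samples = data';                  % BNT expects nodes x cases
bnet2 = learn_params(bnet2, samples);
CPT3 = cell(1,N);
for i = 1:N
  s = struct(bnet2.CPD{i});       % peek inside the CPD object
  CPT3{i} = s.CPT;
end
dispcpt(CPT3{4})                  % display node 4's learned CPT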
CPTs (conditional probability tables) are stored as multidimensional arrays, e.g. for node W:

CPT = reshape([1 0.1 0.1 0.01 0 0.9 0.9 0.99], [2 2 2]);

Note: the child node is conventionally the last dimension. In MATLAB, array indices start from 1, and by convention false == 1 and true == 2. Take node W as an example.
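For example (a small check, not from the original slides), individual entries of this CPT can be read off directly:

CPT = reshape([1 0.1 0.1 0.01 0 0.9 0.9 0.99], [2 2 2]);
CPT(1,2,2)    % P(W=true | S=false, R=true) = 0.9
CPT(2,2,2)    % P(W=true | S=true,  R=true) = 0.99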
• The junction tree engine is the foundation of all exact inference engines:
  – jtree_inf_engine
• Calling convention:
  – engine = jtree_inf_engine(bnet);
• Inference example:
  – Return to the wet-grass example built above. Both the sprinkler and the rain can wet the grass; we now compute the probability that the sprinkler is responsible for the wet grass.
  – Here the evidence is that the grass is wet, i.e. W = 2.
Using the MATLAB BNT Toolbox with Examples

Installing the toolbox

• Download the FullBNT.zip file and unzip it.
• Edit "FullBNT/BNT/add_BNT_to_path.m" so that it contains the correct working path: BNT_HOME = '<the FullBNT working path>';
• Start Matlab; running BNT requires Matlab version 5.2 or later.
• Type "add_BNT_to_path" and execute the command. This adds all the folders to the Matlab path.
With the evidence R = 2 (rain observed), i.e. evidence{R} = 2, the same query marg.T gives:
ans(:,:,1) =
    0.0820
    0.0018
ans(:,:,2) =
    0.7380
    0.1782
dag = zeros(4,4);
dag(1,[2 3]) = 1;
dag(2,4) = 1;
dag(3,4) = 1;
draw_graph(dag);
Inference examples

• Exact Bayesian network inference: compute the probability P(S=2|W=2).
  var_elim_inf_engine is variable elimination; jtree_inf_engine is the junction tree engine (clique-tree propagation). See inference1.
• Approximate Bayesian network inference: compute the probability P(S=2|W=2).
  likelihood_weighting_inf_engine is importance-sampling (likelihood weighting) inference. See inference2.
engine = jtree_inf_engine(bnet);
evidence = cell(1,4);
[engine, loglik] = enter_evidence(engine, evidence);
marg = marginal_nodes(engine, [R S W]);
marg.T
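A hedged side-by-side sketch of the two computations just mentioned (node numbering C,S,R,W = 1..4 assumed; the BNT documentation gives P(S=2|W=2) of about 0.4298 for this network):

evidence = cell(1,4);
evidence{4} = 2;                                   % W = 2: the grass is wet
engine = jtree_inf_engine(bnet);                   % exact: junction tree
[engine, loglik] = enter_evidence(engine, evidence);
m = marginal_nodes(engine, 2);
m.T(2)                                             % exact P(S=2|W=2), about 0.4298

engine2 = likelihood_weighting_inf_engine(bnet);   % approximate: likelihood weighting
[engine2, loglik2] = enter_evidence(engine2, evidence);
m2 = marginal_nodes(engine2, 2);
m2.T(2)                                            % stochastic estimate of the same quantity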
• Parameters (parameters)

Conditional probability distributions (CPDs); the simplest kind is the CPT (conditional probability table). Creating a CPT element by element:

CPT = zeros(2,2,2);
CPT(1,1,1) = 1.0;
CPT(2,1,1) = 0.1;
……
CPT(2,2,2) = 0.99;

(CPD types mentioned on the slide: tabular nodes; noisy-or nodes, which are boolean.)
(Learned CPT row for node W: 0 0.8963 0.9085 0.9990, close to the true values 0 0.9 0.9 0.99.)
Review

Building a simple Bayesian network by hand; functions called: mk_bnet, tabular_CPD, jtree_inf_engine
Parameter learning: maximum likelihood estimation with complete data; the EM algorithm with missing data
Simple sample-based training; data loading and learning
1. MATLAB study: downloaded the MATLAB tutorial slides and worked through them, typing in code while taking notes, and studied variables and arrays in more detail.
2. This week I gained an initial grasp of the Bayes net toolbox: I can now perform parameter learning and inference on discrete data.