Information Theory (3rd Edition): Solutions to Exercises
Information Theory and Coding, 3rd Edition — Chapter 3 Exercise Solutions

Chapter 3: Lossless Coding of Discrete Sources

Exercise 3.1 A source has the probability distribution

  symbol:      a1    a2    a3    a4    a5    a6    a7
  probability: 0.20  0.19  0.18  0.17  0.15  0.10  0.01

(1) Find the source entropy H(X); (2) construct a binary Shannon code; (3) compute the average code length and the coding efficiency.
Solution:
(1) H(X) = -Σ p(a_i) log2 p(a_i)
        = -(0.2 log 0.2 + 0.19 log 0.19 + 0.18 log 0.18 + 0.17 log 0.17 + 0.15 log 0.15 + 0.1 log 0.1 + 0.01 log 0.01)
        = 2.609 bit/symbol

(2) Binary Shannon code (P_a(a_i) is the cumulative probability, k_i = ⌈-log2 p(a_i)⌉):

  a_i   p(a_i)   P_a(a_i)   k_i   codeword
  a1    0.20     0          3     000
  a2    0.19     0.20       3     001
  a3    0.18     0.39       3     011
  a4    0.17     0.57       3     100
  a5    0.15     0.74       3     101
  a6    0.10     0.89       4     1110
  a7    0.01     0.99       7     1111110

(3) Average code length
K = Σ k_i p(a_i) = 3(0.2 + 0.19 + 0.18 + 0.17 + 0.15) + 4×0.1 + 7×0.01 = 3.14 code symbols/source symbol
Coding efficiency η = H(X)/K = 2.609/3.14 = 83.1%.

Exercise 3.2 Construct a binary Fano code for the source of Exercise 3.1 and compute its coding efficiency.
Solution: Binary Fano code:

  a_i   p(a_i)   codeword   k_i
  a1    0.20     00         2
  a2    0.19     010        3
  a3    0.18     011        3
  a4    0.17     10         2
  a5    0.15     110        3
  a6    0.10     1110       4
  a7    0.01     1111       4

Average code length
K = Σ k_i p(a_i) = 2(0.2 + 0.17) + 3(0.19 + 0.18 + 0.15) + 4(0.1 + 0.01) = 2.74 code symbols/source symbol
Coding efficiency η = H(X)/K = 2.609/2.74 = 95.2%.

Exercise 3.3 Construct binary and ternary Huffman codes for the source of Exercise 3.1 and compute the average code length and coding efficiency of each.
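The entropy, the Shannon codewords and the efficiency obtained in Exercise 3.1 above are easy to verify with a short script; this is a verification sketch, not part of the textbook solution:

```python
import math

# Source distribution from Exercise 3.1
p = [0.20, 0.19, 0.18, 0.17, 0.15, 0.10, 0.01]

# Source entropy H(X) in bits/symbol
H = -sum(pi * math.log2(pi) for pi in p)

# Shannon code: k_i = ceil(-log2 p_i); codeword = first k_i bits of the binary
# expansion of the cumulative probability (probabilities already sorted descending)
codes, F = [], 0.0
for pi in p:
    k = math.ceil(-math.log2(pi))
    word, frac = "", F
    for _ in range(k):              # binary expansion of F, truncated to k bits
        frac *= 2
        bit = int(frac)
        word += str(bit)
        frac -= bit
    codes.append((pi, k, word))
    F += pi

K = sum(pi * k for pi, k, _ in codes)   # average code length
print(f"H(X) = {H:.3f} bit/symbol, K = {K:.2f}, efficiency = {H/K:.1%}")
# Expected: H(X) ≈ 2.609, K = 3.14, efficiency ≈ 83.1%
```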
Information Theory: Answers

Exercise 2.1 A Markov source has three symbols {u1, u2, u3} with transition probabilities p(u1|u1) = 1/2, p(u2|u1) = 1/2, p(u3|u1) = 0, p(u1|u2) = 1/3, p(u2|u2) = 0, p(u3|u2) = 2/3, p(u1|u3) = 1/3, p(u2|u3) = 2/3, p(u3|u3) = 0. Draw the state diagram and find the stationary probability of each symbol.
Solution: (State diagram omitted.) The state transition matrix is

  P = [ 1/2  1/2   0
        1/3   0   2/3
        1/3  2/3   0 ]

Let W1, W2, W3 be the stationary probabilities of states u1, u2, u3. From WP = W and W1 + W2 + W3 = 1:

  W1 = (1/2)W1 + (1/3)W2 + (1/3)W3
  W2 = (1/2)W1 + (2/3)W3
  W3 = (2/3)W2
  W1 + W2 + W3 = 1

Solving gives W1 = 10/25, W2 = 9/25, W3 = 6/25.

Exercise 2.2 A second-order Markov chain over the symbol set {0, 1} has transition probabilities p(0|00) = 0.8, p(0|11) = 0.2, p(1|00) = 0.2, p(1|11) = 0.8, p(0|01) = 0.5, p(0|10) = 0.5, p(1|01) = 0.5, p(1|10) = 0.5. Draw the state diagram and compute the stationary probability of each state.
Solution: Writing the symbol transition probabilities as state transitions,

  p(00|00) = 0.8   p(10|01) = 0.5   p(10|11) = 0.2   p(00|10) = 0.5
  p(01|00) = 0.2   p(11|01) = 0.5   p(11|11) = 0.8   p(01|10) = 0.5

so the state transition matrix (states ordered 00, 01, 10, 11) is

  P = [ 0.8  0.2   0    0
         0    0   0.5  0.5
        0.5  0.5   0    0
         0    0   0.2  0.8 ]

(State diagram omitted.) Let W1, W2, W3, W4 be the stationary probabilities of states 00, 01, 10, 11. From WP = W and Σ Wi = 1:

  0.8 W1 + 0.5 W3 = W1
  0.2 W1 + 0.5 W3 = W2
  0.5 W2 + 0.2 W4 = W3
  0.5 W2 + 0.8 W4 = W4
  W1 + W2 + W3 + W4 = 1

Solving gives W1 = 5/14, W2 = 1/7, W3 = 1/7, W4 = 5/14.

Exercise 2.5 In a certain district, 25% of the girls are college students; 75% of the female college students are taller than 160 cm, and half of all girls are taller than 160 cm.
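The stationary distributions found above can be checked numerically; a minimal sketch (not part of the original answers), here for the four-state chain of Exercise 2.2:

```python
import numpy as np

# State transition matrix for Exercise 2.2 (states ordered 00, 01, 10, 11)
P = np.array([[0.8, 0.2, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5],
              [0.5, 0.5, 0.0, 0.0],
              [0.0, 0.0, 0.2, 0.8]])

# Solve W P = W with sum(W) = 1: stack the normalization row onto (P^T - I)
A = np.vstack([P.T - np.eye(4), np.ones(4)])
b = np.array([0, 0, 0, 0, 1.0])
W, *_ = np.linalg.lstsq(A, b, rcond=None)
print(W)   # ≈ [5/14, 1/7, 1/7, 5/14] ≈ [0.357, 0.143, 0.143, 0.357]
```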
Information Theory: Answers to Selected Exercises (English Edition)

These answers accompany the original English textbook; the problem numbering differs from the Chinese translation, but the content is the same.
The Chinese translation adds additional problems.
2.2 Entropy of functions. Let X be a random variable taking on a finite number of values. What is the (general) inequality relationship between H(X) and H(Y) if (a) Y = 2^X? (b) Y = cos X?

Solution: Let y = g(x). Then p(y) = Σ_{x: g(x)=y} p(x). Consider any set of x's that map onto a single y. For this set,
Σ_{x: g(x)=y} p(x) log p(x) ≤ Σ_{x: g(x)=y} p(x) log p(y) = p(y) log p(y),
since log is a monotone increasing function and p(x) ≤ Σ_{x: g(x)=y} p(x) = p(y). Extending this argument to the entire range of X (and Y), we obtain
H(X) = -Σ_x p(x) log p(x) ≥ -Σ_y p(y) log p(y) = H(Y),
with equality iff g is one-to-one with probability one.
(a) Y = 2^X is one-to-one, and hence the entropy, which is just a function of the probabilities, does not change: H(X) = H(Y).
(b) Y = cos X is not necessarily one-to-one, so all we can say is H(X) ≥ H(Y), with equality if cosine is one-to-one on the range of X.

2.16 Example of joint entropy. Let p(x, y) be given by p(0,0) = p(0,1) = p(1,1) = 1/3 and p(1,0) = 0. Find
(a) H(X), H(Y). (b) H(X|Y), H(Y|X). (c) H(X,Y). (d) H(Y) − H(Y|X). (e) I(X;Y). (f) Draw a Venn diagram for the quantities in (a) through (e).
Solution:
(a) H(X) = H(Y) = (2/3) log(3/2) + (1/3) log 3 = 0.918 bits.
(b) H(X|Y) = (1/3) H(X|Y=0) + (2/3) H(X|Y=1) = 0.667 bits = H(Y|X).
(c) H(X,Y) = 3 × (1/3) log 3 = log 3 = 1.585 bits.
(d) H(Y) − H(Y|X) = 0.251 bits.
(e) I(X;Y) = H(Y) − H(Y|X) = 0.251 bits.
(f) See Fig. 1 of the original solution (a Venn diagram of the quantities above).

2.29 Inequalities. Let X, Y and Z be joint random variables. Prove the following inequalities and find conditions for equality.
(a) H(X,Y|Z) ≥ H(X|Z)
(b) I(X,Y;Z) ≥ I(X;Z)
(c) H(X,Y,Z) − H(X,Y) ≤ H(X,Z) − H(X)
(d) I(X;Z|Y) ≥ I(Z;Y|X) − I(Z;Y) + I(X;Z)
Solution:
(a) Using the chain rule for conditional entropy, H(X,Y|Z) = H(X|Z) + H(Y|X,Z) ≥ H(X|Z), with equality iff H(Y|X,Z) = 0, that is, when Y is a function of X and Z.
(b) Using the chain rule for mutual information, I(X,Y;Z) = I(X;Z) + I(Y;Z|X) ≥ I(X;Z), with equality iff I(Y;Z|X) = 0, that is, when Y and Z are conditionally independent given X.
(c) Using first the chain rule for entropy and then the definition of conditional mutual information,
H(X,Y,Z) − H(X,Y) = H(Z|X,Y) = H(Z|X) − I(Y;Z|X) ≤ H(Z|X) = H(X,Z) − H(X),
with equality iff I(Y;Z|X) = 0, that is, when Y and Z are conditionally independent given X.
(d) Using the chain rule for mutual information,
I(X;Z|Y) + I(Z;Y) = I(X,Y;Z) = I(Z;Y|X) + I(X;Z),
and therefore this inequality is actually an equality in all cases.

4.5 Entropy rates of Markov chains.
(a) Find the entropy rate of the two-state Markov chain with transition matrix
P = [ 1−p01   p01
       p10   1−p10 ]
(b) What values of p01, p10 maximize the rate of part (a)?
(c) Find the entropy rate of the two-state Markov chain with transition matrix
P = [ 1−p   p
        1   0 ]
(d) Find the maximum value of the entropy rate of the Markov chain of part (c). We expect that the maximizing value of p should be less than 1/2, since the 0 state permits more information to be generated than the 1 state.
Solution:
(a) The stationary distribution is easily calculated: π0 = p10/(p01 + p10), π1 = p01/(p01 + p10). Therefore the entropy rate is
H(X2|X1) = π0 H(p01) + π1 H(p10) = [p10 H(p01) + p01 H(p10)] / (p01 + p10).
(b) The entropy rate is at most 1 bit because the process has only two states. This rate is achieved if and only if p01 = p10 = 1/2, in which case the process is actually i.i.d. with Pr(Xi = 0) = Pr(Xi = 1) = 1/2.
(c) As a special case of the general two-state Markov chain, the entropy rate is
H(X2|X1) = π0 H(p) + π1 H(1) = H(p)/(p + 1).
(d) By straightforward calculus, the maximum of part (c) occurs at p = (3 − √5)/2 ≈ 0.382. The maximum value is H(p)/(p + 1) = log2[(1 + √5)/2] ≈ 0.694 bits, the logarithm of the golden ratio.
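The results of parts (c) and (d) are easy to check numerically; the following is a verification sketch, not part of the original solution set:

```python
import numpy as np

def binary_entropy(p):
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

# Entropy rate of the chain in Problem 4.5(c): P = [[1-p, p], [1, 0]].
# Stationary distribution: pi0 = 1/(1+p), pi1 = p/(1+p); rate = H(p)/(1+p).
p = np.linspace(0.01, 0.99, 9801)
rate = binary_entropy(p) / (1 + p)
i = np.argmax(rate)
print(p[i], rate[i])   # ≈ 0.382 = (3 - sqrt(5))/2 and ≈ 0.694 = log2 of the golden ratio
```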
5.4 Huffman coding. Consider the random variable
X = (x1, x2, x3, x4, x5, x6, x7) with probabilities (0.49, 0.26, 0.12, 0.04, 0.04, 0.03, 0.02).
(a) Find a binary Huffman code for X.
(b) Find the expected codelength for this encoding.
(c) Find a ternary Huffman code for X.
Solution:
(a) The Huffman tree for this distribution (shown as a figure in the original) yields codeword lengths (1, 2, 3, 5, 5, 5, 5) for x1 through x7, e.g. x1 = 0, x2 = 10, x3 = 110, x4 = 11100, x5 = 11101, x6 = 11110, x7 = 11111.
(b) The expected length of the codewords for the binary Huffman code is E(L) = Σ p_i l_i = 2.02 bits.
(c) The ternary Huffman tree (also a figure in the original) yields codeword lengths (1, 1, 2, 2, 3, 3, 3) for x1 through x7.

5.9 Optimal code lengths that require one bit above entropy. The source coding theorem shows that the optimal code for a random variable X has an expected length less than H(X) + 1. Give an example of a random variable for which the expected length of the optimal code is close to H(X) + 1, i.e., for any ε > 0, construct a distribution for which the optimal code has L > H(X) + 1 − ε.
Solution: There is a trivial example that requires almost 1 bit above its entropy. Let X be a binary random variable with probability of X = 1 close to 1. Then the entropy of X is close to 0, but the length of its optimal code is 1 bit, which is almost 1 bit above its entropy.

5.25 Shannon code. Consider the following method for generating a code for a random variable X which takes on m values {1, 2, ..., m} with probabilities p1, p2, ..., pm. Assume that the probabilities are ordered so that p1 ≥ p2 ≥ ... ≥ pm. Define F_i = Σ_{k=1}^{i−1} p_k, the sum of the probabilities of all symbols less than i. Then the codeword for i is the number F_i ∈ [0, 1) rounded off to l_i bits, where l_i = ⌈log(1/p_i)⌉.
(a) Show that the code constructed by this process is prefix-free and the average length satisfies H(X) ≤ L < H(X) + 1.
(b) Construct the code for the probability distribution (0.5, 0.25, 0.125, 0.125).
Solution:
(a) Since l_i = ⌈log(1/p_i)⌉, we have log(1/p_i) ≤ l_i < log(1/p_i) + 1, which implies H(X) ≤ L = Σ p_i l_i < H(X) + 1.
By the choice of l_i we have 2^(−l_i) ≤ p_i < 2^(−(l_i−1)). Thus F_j, j > i, differs from F_i by at least 2^(−l_i), and will therefore differ from F_i in at least one place in the first l_i bits of the binary expansion of F_i. Thus the codeword for j, j > i, which has length l_j ≥ l_i, differs from the codeword for i at least once in the first l_i places. Thus no codeword is a prefix of any other codeword.
(b) We build the following table:

  i   p_i     F_i     l_i   codeword
  1   0.5     0.0     1     0
  2   0.25    0.5     2     10
  3   0.125   0.75    3     110
  4   0.125   0.875   3     111

3.5 AEP. Let X1, X2, ... be independent identically distributed random variables drawn according to the probability mass function p(x), x ∈ {1, 2, ..., m}. Thus p(x1, x2, ..., xn) = Π_{i=1}^{n} p(xi). We know that −(1/n) log p(X1, X2, ..., Xn) → H(X) in probability. Let q(x1, x2, ..., xn) = Π_{i=1}^{n} q(xi), where q is another probability mass function on {1, 2, ..., m}.
(a) Evaluate lim −(1/n) log q(X1, X2, ..., Xn), where X1, X2, ... are i.i.d. ~ p(x).
Solution: Since X1, X2, ..., Xn are i.i.d., so are q(X1), q(X2), ..., q(Xn), and hence we can apply the strong law of large numbers to obtain
lim −(1/n) log q(X1, X2, ..., Xn) = −E[log q(X)]  (w.p. 1)
= −Σ p(x) log q(x) = Σ p(x) log[p(x)/q(x)] − Σ p(x) log p(x) = D(p||q) + H(p).
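The expected length in Problem 5.4(b) is easy to verify; the following minimal sketch (not part of the original solutions) derives Huffman codeword lengths with a heap:

```python
import heapq

def huffman_lengths(probs):
    """Return binary Huffman codeword lengths for the given probabilities."""
    heap = [(p, [i]) for i, p in enumerate(probs)]   # (probability, symbols in subtree)
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, s1 = heapq.heappop(heap)
        p2, s2 = heapq.heappop(heap)
        for i in s1 + s2:          # every merge adds one bit to all symbols below it
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, s1 + s2))
    return lengths

p = [0.49, 0.26, 0.12, 0.04, 0.04, 0.03, 0.02]     # Problem 5.4
L = huffman_lengths(p)
print(L, sum(pi * li for pi, li in zip(p, L)))     # lengths (1,2,3,5,5,5,5), E[L] = 2.02 bits
```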
8.1 Preprocessing the output. One is given a communication channel with transition probabilities p(y|x) and channel capacity C = max_{p(x)} I(X;Y). A helpful statistician preprocesses the output by forming Ỹ = g(Y). He claims that this will strictly improve the capacity.
(a) Show that he is wrong.
(b) Under what conditions does he not strictly decrease the capacity?
Solution:
(a) The statistician calculates Ỹ = g(Y). Since X → Y → Ỹ forms a Markov chain, we can apply the data processing inequality. Hence for every distribution on x, I(X;Y) ≥ I(X;Ỹ). Let p̃(x) be the distribution on x that maximizes I(X;Ỹ). Then
C = max_{p(x)} I(X;Y) ≥ I(X;Y) at p(x) = p̃(x) ≥ I(X;Ỹ) at p(x) = p̃(x) = max_{p(x)} I(X;Ỹ) = C̃.
Thus the statistician is wrong, and processing the output does not increase capacity.
(b) We have equality in the above sequence of inequalities only if we have equality in the data processing inequality, i.e., when, for the distribution that maximizes I(X;Ỹ), X → Ỹ → Y forms a Markov chain.

8.3 An additive noise channel. Find the channel capacity of the discrete memoryless channel Y = X + Z, where Pr(Z = 0) = Pr(Z = a) = 1/2, the alphabet for X is {0, 1}, and Z is independent of X. Observe that the channel capacity depends on the value of a.
Solution: A sum channel: Y = X + Z with X ∈ {0, 1} and Z ∈ {0, a}. We distinguish cases depending on the value of a.
a = 0: In this case Y = X and max I(X;Y) = 1, so the capacity is 1 bit per transmission.
a ≠ 0, ±1: In this case Y has four possible values, 0, 1, a, 1 + a. Knowing Y, we know which X was sent, so H(X|Y) = 0 and the capacity is again 1 bit per transmission.
a = 1: In this case Y has three possible values, 0, 1, 2, and the channel is identical to the binary erasure channel with erasure probability f = 1/2. The capacity of this channel is 1 − f = 1/2 bit per transmission.
a = −1: This is similar to the case a = 1, and the capacity is also 1/2 bit per transmission.

8.5 Channel capacity. Consider the discrete memoryless channel Y = X + Z (mod 11), where Z takes the values 1, 2, 3 each with probability 1/3 and X ∈ {0, 1, ..., 10}. Assume that Z is independent of X.
(a) Find the capacity.
(b) What is the maximizing p*(x)?
Solution: The capacity of the channel is C = max_{p(x)} I(X;Y), and
I(X;Y) = H(Y) − H(Y|X) = H(Y) − H(Z|X) = H(Y) − H(Z) ≤ log 11 − log 3,
with equality when Y has a uniform distribution, which occurs when X has a uniform distribution.
(a) The capacity of the channel is log2(11/3) bits per transmission.
(b) The capacity is achieved by a uniform distribution on the inputs: p(X = i) = 1/11 for i = 0, 1, ..., 10.

8.12 Time-varying channels. Consider a time-varying discrete memoryless channel. Let Y1, Y2, ..., Yn be conditionally independent given X1, X2, ..., Xn, with conditional distribution given by p(y|x) = Π_{i=1}^{n} p_i(y_i|x_i). Let X = (X1, ..., Xn), Y = (Y1, ..., Yn). Find max_{p(x)} I(X;Y).
Solution:
I(X;Y) = H(Y) − H(Y|X) = H(Y) − Σ_i H(Yi|Y1, ..., Yi−1, X) = H(Y) − Σ_i H(Yi|Xi)
       ≤ Σ_i H(Yi) − Σ_i H(Yi|Xi) ≤ Σ_i (1 − h(p_i)),
with equality when X1, X2, ..., Xn are chosen i.i.d. with the capacity-achieving (uniform) input distribution. Hence
max_{p(x)} I(X;Y) = Σ_{i=1}^{n} (1 − h(p_i)).
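As a numerical sanity check of Problem 8.5 (not part of the original solutions), the mutual information for a uniform input can be computed directly:

```python
import numpy as np

# Y = X + Z (mod 11), Z uniform on {1, 2, 3}.
# With uniform input, I(X;Y) = H(Y) - H(Z) = log2(11) - log2(3) = log2(11/3).
px = np.full(11, 1 / 11)
pz = {1: 1 / 3, 2: 1 / 3, 3: 1 / 3}

py = np.zeros(11)
pxy = np.zeros((11, 11))
for x in range(11):
    for z, pzv in pz.items():
        y = (x + z) % 11
        pxy[x, y] += px[x] * pzv
        py[y] += px[x] * pzv

H = lambda p: -sum(pi * np.log2(pi) for pi in p if pi > 0)
HY_given_X = sum(px[x] * H(pxy[x] / px[x]) for x in range(11))
print(H(py) - HY_given_X, np.log2(11 / 3))   # both ≈ 1.874 bits
```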
10.2 A channel with two independent looks at Y. Let Y1 and Y2 be conditionally independent and conditionally identically distributed given X.
(a) Show that I(X; Y1, Y2) = 2 I(X; Y1) − I(Y1; Y2).
(b) Conclude that the capacity of the channel X → (Y1, Y2) is less than twice the capacity of the channel X → Y1.
Solution:
(a) I(X; Y1, Y2) = H(Y1, Y2) − H(Y1, Y2|X)
= H(Y1) + H(Y2) − I(Y1; Y2) − H(Y1|X) − H(Y2|X)
= I(X; Y1) + I(X; Y2) − I(Y1; Y2) = 2 I(X; Y1) − I(Y1; Y2).
(b) The capacity of the single-look channel X → Y1 is C1 = max_{p(x)} I(X; Y1). The capacity of the channel X → (Y1, Y2) is
C2 = max_{p(x)} I(X; Y1, Y2) = max_{p(x)} [2 I(X; Y1) − I(Y1; Y2)] ≤ max_{p(x)} 2 I(X; Y1) = 2 C1.

10.3 The two-look Gaussian channel. Consider the ordinary Shannon Gaussian channel with two correlated looks at X, i.e., Y = (Y1, Y2), where
Y1 = X + Z1,  Y2 = X + Z2,
with a power constraint P on X, and (Z1, Z2) ~ N(0, K), where
K = [ N   ρN
      ρN   N ].
Find the capacity C for (a) ρ = 1, (b) ρ = 0, (c) ρ = −1.
Solution: It is clear that the input distribution that maximizes the capacity is X ~ N(0, P). Evaluating the mutual information for this distribution,
C2 = max I(X; Y1, Y2) = h(Y1, Y2) − h(Y1, Y2|X) = h(Y1, Y2) − h(Z1, Z2).
Since (Z1, Z2) ~ N(0, [N, ρN; ρN, N]), we have
h(Z1, Z2) = (1/2) log[(2πe)^2 |K_Z|] = (1/2) log[(2πe)^2 N^2 (1 − ρ^2)].
Since Y1 = X + Z1 and Y2 = X + Z2, we have (Y1, Y2) ~ N(0, [P+N, P+ρN; P+ρN, P+N]), and
h(Y1, Y2) = (1/2) log[(2πe)^2 |K_Y|] = (1/2) log[(2πe)^2 (N^2(1 − ρ^2) + 2PN(1 − ρ))].
Hence
C2 = h(Y1, Y2) − h(Z1, Z2) = (1/2) log[1 + 2P/(N(1 + ρ))].
(a) ρ = 1: C = (1/2) log(1 + P/N), which is the capacity of a single-look channel.
(b) ρ = 0: C = (1/2) log(1 + 2P/N), which corresponds to using twice the power in a single look. The capacity is the same as the capacity of the channel X → (Y1 + Y2)/2.
(c) ρ = −1: C = ∞, which is not surprising, since if we add Y1 and Y2 we can recover X exactly.

10.4 Parallel channels and waterfilling. Consider a pair of parallel Gaussian channels, i.e.,
(Y1, Y2) = (X1, X2) + (Z1, Z2), where (Z1, Z2) ~ N(0, [σ1^2, 0; 0, σ2^2]),
and there is a power constraint E(X1^2 + X2^2) ≤ 2P. Assume that σ1^2 > σ2^2. At what power does the channel stop behaving like a single channel with noise variance σ2^2 and begin behaving like a pair of channels?
Solution: We put all the signal power into the channel with less noise until the total power of signal plus noise in that channel equals the noise power in the other channel. After that, we split any additional power evenly between the two channels. Thus the combined channel begins to behave like a pair of parallel channels when the signal power equals the difference of the two noise powers, i.e., when 2P = σ1^2 − σ2^2.
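The threshold in Problem 10.4 is the classic water-filling behaviour; the sketch below illustrates it with example noise variances and powers (values chosen for illustration, not taken from the text):

```python
import numpy as np

def waterfill(noise, total_power, tol=1e-9):
    """Power allocation P_i = max(mu - N_i, 0) with sum(P_i) = total_power (bisection on mu)."""
    lo, hi = min(noise), max(noise) + total_power
    while hi - lo > tol:
        mu = (lo + hi) / 2
        if sum(max(mu - n, 0.0) for n in noise) > total_power:
            hi = mu
        else:
            lo = mu
    return [max(lo - n, 0.0) for n in noise]

noise = [4.0, 1.0]                       # sigma1^2 > sigma2^2
for Ptot in [2.0, 3.0, 4.0]:             # 2P = 3 = sigma1^2 - sigma2^2 is the threshold
    alloc = waterfill(noise, Ptot)
    C = sum(0.5 * np.log2(1 + p / n) for p, n in zip(alloc, noise))
    print(Ptot, alloc, round(C, 3))      # below the threshold only the quieter channel is used
```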
Fu Zuyun, Information Theory (3rd Edition): Answers

Part One: Information Theory

Information theory is an applied mathematical discipline that uses the methods of probability theory and mathematical statistics to study information, information entropy, communication systems, data transmission, cryptography, data compression, and related problems.
It treats the transfer of information as a statistical phenomenon and gives methods for estimating the capacity of communication channels.
Information transmission and information compression are the two major areas of information-theoretic research, and the two are linked by the information transmission theorems and the source-channel separation theorem.
The field mainly studies the general laws governing the transfer of information that are common to communication and control systems, together with the basic theory of how best to acquire, measure, transform, store, and transmit information.
Three stages in the development of information theory. First stage: in 1948, Shannon of Bell Laboratories systematically set out a theory of information in the paper "A Mathematical Theory of Communication" and thereby founded information theory.
Second stage: in the 1950s information theory pushed into many other disciplines; in the 1960s it entered a period of digestion and consolidation, building substantially on the existing foundations, with research focused on information and source-coding problems.
Third stage: by the 1970s, with digital computers in widespread use and communication systems much more capable, making more effective use of information and processing it better became an increasingly pressing problem.
People came to recognize the importance of information more and more, and to see that information can be fully exploited and shared as a resource, just like materials and energy.
The concepts and methods of information had spread widely into many scientific fields, and there was a pressing need to break through the narrow scope of Shannon's information theory so that it could become the basic theory for the information problems met in every kind of human activity, thereby driving the further development of many other emerging disciplines.
The rapid rise of information science and technology in our time is both logically and historically inevitable.
Information is the object of study of information science.
The concept of information can be defined on two levels. In the ontological sense, information is the state of motion of things and the manner in which that state changes, i.e., the state and manner of a thing's internal structure and external connections.
In the epistemological sense, information is the state of motion of things, and the manner of its change, as perceived and expressed by a cognizing subject, including the form, meaning, and utility of that state and its changes.
Here "things" refers to every possible object of study, including material objects of the external world as well as mental phenomena of the subjective world; "motion" refers to change in every sense, including the movement of thought and social change; "state of motion" means the character and posture that a thing's motion exhibits in space; and "manner of motion" means the process and regularity that a thing's motion exhibits in time.
Information Theory: Reference Answers

Information theory is a discipline that studies the transmission and coding of information; its core concerns are how information is measured and how it is transmitted.
Its development goes back to the 1940s, when Claude Shannon founded it, and it has gradually become an important theoretical basis for computer science, communication engineering, and related fields.
This article discusses information theory from three angles: the definition of information, the measurement of information, and the transmission of information.
1. The definition of information. Information is fact or data that can change the receiver's state of knowledge.
In information theory the basic unit of information is the bit, which represents a single binary choice, 0 or 1.
The bit is the smallest unit in information theory and can represent a simple binary question, such as yes/no or true/false.
In practice, of course, bits are usually grouped into larger units such as bytes and kilobytes.
2. The measurement of information. Measuring information is one of the core problems of information theory.
Claude Shannon introduced the concept of information entropy to measure the uncertainty of information, i.e., its average amount.
The entropy is computed as H(X) = -Σ P(x) log2 P(x), where H(X) is the entropy of the random variable X and P(x) is the probability that X takes the value x.
The larger the entropy, the greater the uncertainty of the information, and vice versa.
Besides entropy, information theory also introduces conditional entropy, relative entropy, and mutual information.
Conditional entropy measures the remaining uncertainty about one random variable when some other information is already known.
Relative entropy measures the difference between two probability distributions, while mutual information measures the degree of dependence between two random variables.
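These quantities are straightforward to compute; the sketch below uses an illustrative joint distribution (the numbers are examples chosen here, not taken from the text):

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# A small example joint distribution p(x, y)
pxy = np.array([[0.4, 0.1],
                [0.1, 0.4]])

px, py = pxy.sum(axis=1), pxy.sum(axis=0)
H_X, H_Y, H_XY = entropy(px), entropy(py), entropy(pxy.ravel())
H_Y_given_X = H_XY - H_X              # chain rule: H(X,Y) = H(X) + H(Y|X)
I_XY = H_X + H_Y - H_XY               # mutual information
D = np.sum(px * np.log2(px / py))     # relative entropy D(p_X || p_Y), zero here by symmetry
print(H_X, H_Y_given_X, I_XY, D)
```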
3. The transmission of information. Transmission is the other central problem of information theory.
Information is transmitted over a channel.
The channel may be wired or wireless, noisy or noiseless.
To guarantee reliable transmission, the information must be encoded and decoded.
Encoding is the process of converting the information into a signal that can be carried over the channel.
Common coding methods include Huffman coding and Shannon-Fano coding.
The goal of encoding is to remove as much redundancy from the information as possible and so raise the efficiency of transmission.
Decoding is the process of recovering the original information from the signal received over the channel.
Its goal is to minimize distortion and keep the information reliable.
Common decoding methods include maximum-likelihood decoding and Viterbi decoding.
Information theory is widely applied: it plays an important role not only in communications but also in data compression, cryptography, artificial intelligence, and other fields.
Information Theory: Basic Theory and Applications — Complete Solutions to Exercises

The probability space for men (red-green color blindness) is

  [X; P] = [ a1    a2
             0.07  0.93 ]

The information obtained when a man answers "yes" is
I(a1) = -log2 0.07 ≈ 3.84 bits/symbol
The information obtained when a man answers "no" is
I(a2) = -log2 0.93 ≈ 0.105 bits/symbol
The average information contained in one answer from a man is
H(X) = -Σ P(x) log P(x) = 0.366 bits/symbol
Similarly, the probability space for women (red-green color blindness) is

  [Y; P] = [ b1     b2
             0.005  0.995 ]
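The self-information and entropy values above can be checked with a few lines of code (a verification sketch, not part of the textbook answer):

```python
import math

def binary_entropy(p):
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p_men, p_women = 0.07, 0.005   # probability of red-green color blindness
print(-math.log2(p_men), -math.log2(1 - p_men))   # ≈ 3.84 and 0.105 bits
print(binary_entropy(p_men))                      # ≈ 0.366 bit/symbol
print(binary_entropy(p_women))                    # ≈ 0.045 bit/symbol
```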
A source S′ has the alphabet A′ = {a_i, i = 1, 2, ..., 2q}, and the symbol probabilities satisfy
P_i′ = (1 − e) P_i,   i = 1, 2, ..., q
P_i′ = e P_{i−q},     i = q + 1, q + 2, ..., 2q
Write the relation between the entropy of source S′ and that of source S.
Solution:
H(S′) = -Σ P′(x) log P′(x)
      = -Σ (1 − e) P_i log[(1 − e) P_i] − Σ e P_i log(e P_i)
      = -(1 − e) log(1 − e) Σ P_i − (1 − e) Σ P_i log P_i − e log e Σ P_i − e Σ P_i log P_i
      = H(S) + H(e),
where H(e) = −e log e − (1 − e) log(1 − e) is the binary entropy of e.
That is, the function f(x) is decreasing, so f(0) ≥ f(e), i.e.,
(p1 − e) log(p1 − e) + (p2 + e) log(p2 + e) ≤ p1 log p1 + p2 log p2,
and therefore H(X) ≤ H(X′) holds.
[Explanation] As the symbol probabilities of a source move toward the uniform distribution, the uncertainty increases, i.e., the entropy increases.
(1) To find the average self-information of the point mass A falling into any one cell, i.e., the entropy, first write the probability space for A falling into any of the 48 cells:

  [X; P] = [ a11   a12   a13   ...   a48
             1/48  1/48  1/48  ...   1/48 ]

The average self-information is
H(X) = log2 48 ≈ 5.58 bits/symbol
(Information Theory) Reference Answers to the Exercises of Chapters 2 and 3

Chapter 2: Reference Answers

2-1 Solution: Two fair dice are thrown at the same time; the two outcomes are independent, so there are 6 × 6 = 36 equally likely states for the pair of upturned faces, each occurring with probability 1/36.
(1) Let A be the event "a 3 and a 5 appear together." A can occur in two ways: (3, 5) and (5, 3). Hence p(A) = 2 × (1/36) = 1/18, and the self-information of A is I(A) = -log2 p(A) = log2 18 = 4.17 bit.
(2) Let B be the event "two 1s appear together." B can occur in only one way, (1, 1), so p(B) = 1/36 and I(B) = -log2 p(B) = log2 36 = 5.17 bit.
(3) Since the pairs are unordered, there are 21 combinations: the six doubles 11, 22, 33, 44, 55, 66 each have probability 1/6 × 1/6 = 1/36, and the other 15 combinations each have probability 2 × 1/6 × 1/6 = 1/18. Hence
H(X) = -Σ p(x_i) log p(x_i) = -[6 × (1/36) log(1/36) + 15 × (1/18) log(1/18)] = 4.337 bit/symbol.
(4) From the same enumeration, the distribution of the sum of the two dice is

  sum:   2     3     4     5    6     7    8     9    10    11    12
  prob:  1/36  1/18  1/12  1/9  5/36  1/6  5/36  1/9  1/12  1/18  1/36

and
H(X) = -[2×(1/36)log(1/36) + 2×(1/18)log(1/18) + 2×(1/12)log(1/12) + 2×(1/9)log(1/9) + 2×(5/36)log(5/36) + (1/6)log(1/6)] = 3.274 bit/symbol.
(5) There are 11 combinations in which at least one of the two dice shows a 1:
p(x_i) = 11 × (1/36) = 11/36, so I(x_i) = -log2 p(x_i) = 1.710 bit.

2-2 Solution:
(1) The red ball x1 and white ball x2 have probabilities (1/2, 1/2), so H(X) = -2 × (1/2) log(1/2) = log 2 = 1 bit.
(2) With probabilities (99/100, 1/100), H(X) = (99/100) log(100/99) + (1/100) log 100 = 0.08 bit.
(3) With four kinds of balls, each of probability 1/4, H(X) = 4 × (1/4) log 4 = 2 bit.

2-5 Solution: A die has six faces, and each face comes up with the same probability, 1/6.
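The entropies in parts (3) and (4) of Exercise 2-1 can be checked by a short enumeration (a verification sketch, not part of the original answer):

```python
from fractions import Fraction
from math import log2

# Entropy of the unordered pair of dice faces (2-1(3)) and of their sum (2-1(4))
pairs = [(i, j) for i in range(1, 7) for j in range(1, 7)]

def entropy(dist):
    return -sum(float(p) * log2(float(p)) for p in dist.values())

unordered, total = {}, {}
for i, j in pairs:
    key = tuple(sorted((i, j)))
    unordered[key] = unordered.get(key, Fraction(0)) + Fraction(1, 36)
    total[i + j] = total.get(i + j, Fraction(0)) + Fraction(1, 36)

print(entropy(unordered))   # ≈ 4.337 bit/symbol
print(entropy(total))       # ≈ 3.274 bit/symbol
```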
Information Theory: Complete Solutions

2.7 To transmit a symbol set made up of the letters A, B, C, D, each letter is encoded as a pair of binary pulses: "00" for A, "01" for B, "10" for C, "11" for D. Each binary pulse is 5 ms wide.
(1) When the letters are equally likely, what is the average information rate of the transmission? (2) If the letters occur with probabilities {1/5, 1/4, 1/4, 3/10}, what is the average information rate?
Solution: (1) When the letters are equally likely, the probability space of the symbol set is uniform over the four letters.
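The original solution is cut off at this point; the following sketch (an illustration computed from the problem's stated data, not the textbook's worked answer) gives both average information rates, using the fact that each letter occupies two 5 ms pulses:

```python
from math import log2

pulse_width = 0.005                     # 5 ms per binary pulse
letter_duration = 2 * pulse_width       # every letter is two pulses = 10 ms

def avg_rate(probs):
    H = -sum(p * log2(p) for p in probs)        # bits per letter
    return H / letter_duration                  # bits per second

print(avg_rate([1/4] * 4))                      # (1) equiprobable: 2 bit / 10 ms = 200 bit/s
print(avg_rate([1/5, 1/4, 1/4, 3/10]))          # (2) ≈ 1.985 bit / 10 ms ≈ 198.5 bit/s
```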
I(a4 = 3) = -log P(a4) = -log(1/8) = log2 8 = 3 (bits)
This message contains 14 symbols "0", 13 symbols "1", 12 symbols "2" and 6 symbols "3", so its self-information is
I = 14 I(a1=0) + 13 I(a2=1) + 12 I(a3=2) + 6 I(a4=3) ≈ 14×1.415 + 13×2 + 12×2 + 6×3 ≈ 87.81 (bits)
Solution: Two fair dice are thrown, each face showing with probability 1/6, so there are 36 equally likely states, each of probability 1/36.
(1) Let A be the event "a 3 and a 5 appear together." Of the 36 states, two qualify, 5+3 and 3+5, so
P(A) = 2/36,  I(A) = -log P(A) = log2 18 ≈ 4.17 (bits)
(2) The message contains 45 source symbols and carries 87.81 bits of information, so the average information carried per symbol in this message is
I2 = 87.81 / 45 ≈ 1.95 (bits)
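A short script reproduces these figures; the source symbol probabilities below (3/8, 1/4, 1/4, 1/8) are an assumption inferred from the per-symbol informations 1.415, 2, 2 and 3 bits used above:

```python
from math import log2

# Assumed symbol probabilities consistent with I(0)=1.415, I(1)=I(2)=2, I(3)=3 bits
p = {'0': 3/8, '1': 1/4, '2': 1/4, '3': 1/8}
counts = {'0': 14, '1': 13, '2': 12, '3': 6}

I_total = sum(n * -log2(p[s]) for s, n in counts.items())
n_symbols = sum(counts.values())
print(I_total, I_total / n_symbols)    # ≈ 87.81 bits total, ≈ 1.95 bits per symbol
```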
2.4
Information Theory (3rd Edition): Solutions to Exercises
Part One: Xidian University (Deng Jiaxian edition), Information Theory and Coding — Chapter 3 Exercise Solutions

3.1 A source [X; p(x)] = [x1 x2; 0.6 0.4] is transmitted through a noisy channel and the received symbols are Y = {y1, y2}. The channel transition probabilities are as shown in Fig. 3.1 (binary channel): p(y1|x1) = 5/6, p(y2|x1) = 1/6, p(y1|x2) = 3/4, p(y2|x2) = 1/4.
Find: (1) the self-information contained in the events x1 and x2 of source X;
(2) the information about xi (i = 1, 2) obtained on receiving the message yj (j = 1, 2);
(3) the entropies of the source X and of the sink Y;
(4) the channel equivocation H(X|Y) and the noise entropy H(Y|X);
(5) the average mutual information obtained after receiving Y.
Solution:
(1) By definition, I(x1) = -log 0.6 = 0.74 bit and I(x2) = -log 0.4 = 1.32 bit.
(2) I(xi; yj) = I(xi) − I(xi|yj) = log[p(xi|yj)/p(xi)] = log[p(yj|xi)/p(yj)]. With p(y1) = 0.6×5/6 + 0.4×3/4 = 0.8 and p(y2) = 0.2,
I(x1; y1) = log[p(y1|x1)/p(y1)] = log[(5/6)/0.8] = 0.059 bit
I(x1; y2) = log[p(y2|x1)/p(y2)] = log[(1/6)/0.2] = −0.263 bit
I(x2; y1) = log[p(y1|x2)/p(y1)] = log[(3/4)/0.8] = −0.093 bit
I(x2; y2) = log[p(y2|x2)/p(y2)] = log[(1/4)/0.2] = 0.322 bit
(3) By definition, H(X) = 0.971 bit/symbol and H(Y) = 0.722 bit/symbol.
(4) H(Y|X) = ΣΣ p(xi) p(yj|xi) log[1/p(yj|xi)] = 0.715 bit/symbol, and
H(X|Y) = H(X) + H(Y|X) − H(Y) = 0.971 + 0.715 − 0.722 = 0.964 bit/symbol.
(5) I(X;Y) = H(X) − H(X|Y) = 0.0075 bit/symbol.
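These figures are easy to reproduce; the sketch below (a verification, not part of the original answer) recomputes the entropies and the average mutual information for this channel:

```python
import numpy as np

px = np.array([0.6, 0.4])                       # source distribution
pyx = np.array([[5/6, 1/6],                     # p(y|x): rows are x1, x2
                [3/4, 1/4]])

pxy = px[:, None] * pyx                         # joint p(x, y)
py = pxy.sum(axis=0)

H = lambda p: -np.sum(p[p > 0] * np.log2(p[p > 0]))
HX, HY, HXY = H(px), H(py), H(pxy.ravel())
HY_given_X = HXY - HX                           # noise entropy H(Y|X)
HX_given_Y = HXY - HY                           # equivocation H(X|Y)
I = HX - HX_given_Y                             # average mutual information
print(HX, HY, HY_given_X, HX_given_Y, I)        # ≈ 0.971, 0.722, 0.715, 0.964, 0.007
```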
3.2 Eight equally likely messages are transmitted over a BSC with crossover probability p. The eight messages are encoded into the codewords m1 = 0000, m2 = 0101, m3 = 0110, m4 = 0011, m5 = 1001, m6 = 1010, m7 = 1100, m8 = 1111. Find:
(1) the mutual information between the first received digit 0 and M;
(2) the additional information about M obtained when the second received digit is also 0;
(3) the further information gained when the third received digit is again 0;
(4) the further information gained when the fourth received digit is again 0.
Solution:
(1) I(0; m1) = log[p(0|m1)/p(0)] = 1 bit
(2) I(00; m1) = log[1/p(00)] = 2 bit; the additional information is 2 − 1 = 1 bit
(3) I(000; m1) = 3 bit; the additional information is 3 − 2 = 1 bit
(4) I(0000; m1) = 4 bit; the additional information is 4 − 3 = 1 bit

3.3 The transition matrix of a binary symmetric channel is
  [ 2/3  1/3
    1/3  2/3 ]
(1) If p(0) = 3/4, p(1) = 1/4, find H(X), H(X|Y), H(Y|X) and I(X;Y); (2) find the channel capacity and the input distribution that achieves it.
Solution:
(1) Given the transition matrix of the binary symmetric channel and the input distribution p(0) = 3/4, p(1) = 1/4, the output and posterior probabilities are
p(y=0) = Σ p(x) p(y=0|x) = (3/4)(2/3) + (1/4)(1/3) = 7/12
p(y=1) = Σ p(x) p(y=1|x) = (3/4)(1/3) + (1/4)(2/3) = 5/12
so p(x=0|y=0) = 6/7, p(x=1|y=0) = 1/7, p(x=0|y=1) = 3/5, p(x=1|y=1) = 2/5. Then
H(X) = -Σ p(x) log p(x) = 0.811 bit/symbol
H(X|Y) = -ΣΣ p(x) p(y|x) log p(x|y) = -[(3/4)(2/3)log(6/7) + (1/4)(1/3)log(1/7) + (3/4)(1/3)log(3/5) + (1/4)(2/3)log(2/5)]
       = 0.111 + 0.234 + 0.184 + 0.220 = 0.749 bit/symbol
H(Y|X) = -ΣΣ p(x) p(y|x) log p(y|x) = H(2/3, 1/3) = 0.918 bit/symbol
I(X;Y) = H(X) − H(X|Y) = 0.062 bit/symbol
(2) This is a binary symmetric channel, so its capacity is
C = 1 − H(p) = 1 − H(2/3) = 0.082 bit/symbol.
By the properties of the binary symmetric channel, the information rate reaches this capacity only when the input symbols are equiprobable, i.e., p(0) = p(1) = 1/2.

Solution (next exercise): Given
[X; p(x)] = [x1 x2; 0.64 0.36],  [Y; p(y)] = [y1 y2; 0.7 0.3],
with p(x1|y1) = 0.8 and p(x2|y1) = 0.2. From
p(x1) = p(y1) p(x1|y1) + p(y2) p(x1|y2),  p(x2) = p(y1) p(x2|y1) + p(y2) p(x2|y2)
we get p(x1|y2) = 0.8/3 and p(x2|y2) = 2.2/3. Hence
H(X|Y) = p(y1)[-p(x1|y1) log p(x1|y1) − p(x2|y1) log p(x2|y1)] + p(y2)[-p(x1|y2) log p(x1|y2) − p(x2|y2) log p(x2|y2)]
       = 0.7[0.258 + 0.464] + 0.3[0.509 + 0.328] = 0.505 + 0.251 = 0.756
H(X) = -0.64 log 0.64 − 0.36 log 0.36 = 0.412 + 0.531 = 0.944
I(X;Y) = H(X) − H(X|Y) = 0.944 − 0.756 = 0.188

3.5 If X, Y, Z are three random variables, prove that
(1) I(X;YZ) = I(X;Y) + I(X;Z|Y) = I(X;Z) + I(X;Y|Z);
(2) I(X;Y|Z) = I(Y;X|Z) = H(X|Z) − H(X|YZ);
(3) I(X;Y|Z) ≥ 0.
Proof:
(1) I(X;Y) + I(X;Z|Y) = Σ p(xy) log[p(x|y)/p(x)] + Σ p(xyz) log[p(x|yz)/p(x|y)] = Σ p(xyz) log[p(x|yz)/p(x)] = I(X;YZ).
Similarly, I(X;Z) + I(X;Y|Z) = Σ p(xz) log[p(x|z)/p(x)] + Σ p(xyz) log[p(x|yz)/p(x|z)] = Σ p(xyz) log[p(x|yz)/p(x)] = I(X;YZ).
(2) I(X;Y|Z) = Σ p(xyz) log[p(x|yz)/p(x|z)] = Σ p(xyz) log[p(xyz) p(z) / (p(xz) p(yz))] = Σ p(xyz) log[p(y|xz)/p(y|z)] = I(Y;X|Z).
Also H(X|Z) − H(X|YZ) = Σ p(xyz) log[1/p(x|z)] − Σ p(xyz) log[1/p(x|yz)] = Σ p(xyz) log[p(x|yz)/p(x|z)] = I(X;Y|Z).
Hence I(X;Y|Z) = I(Y;X|Z) = H(X|Z) − H(X|YZ).
(3) −I(X;Y|Z) = Σ p(xyz) log[p(yz) p(x|z) / p(xyz)] ≤ log Σ p(yz) p(x|z) = 0, so I(X;Y|Z) ≥ 0.

3.6 Three discrete random variables satisfy X + Y = Z, with X and Y statistically independent. Prove that
(1) H(X) ≤ H(Z); (2) H(Y) ≤ H(Z); (3) H(Z) ≤ H(XY) = H(X) + H(Y); (4) I(X;Z) = H(Z) − H(Y); (5) I(XY;Z) = H(Z); (6) I(X;YZ) = H(X); (7) I(Y;Z|X) = H(Y); (8) I(X;Y|Z) = H(X|Z) = H(Y|Z).

Part Two: Information Theory Answers, Chapter 3

3.1 A source [X; p(x)] = [x1 x2; 0.6 0.4] is transmitted through a noisy channel and the received symbols are Y = {y1, y2}; the channel transition matrix is
  [ 5/6  1/6
    1/4  3/4 ]
Find: (1) the self-information contained in the events x1 and x2 of source X; (2) the information about xi (i = 1, 2) obtained on receiving the message yj (j = 1, 2); (3) the entropies of the source X and the sink Y; (4) the equivocation H(X/Y) and the noise entropy H(Y/X); (5) the average mutual information obtained after receiving Y.
Solution:
(1) I(x1) = -log2 p(x1) = -log2 0.6 = 0.737 bit, I(x2) = -log2 p(x2) = -log2 0.4 = 1.322 bit.
(2) p(y1) = p(x1) p(y1/x1) + p(x2) p(y1/x2) = 0.6×5/6 + 0.4×1/4 = 0.6, p(y2) = 0.6×1/6 + 0.4×3/4 = 0.4, so
I(x1; y1) = log2[p(y1/x1)/p(y1)] = log2[(5/6)/0.6] = 0.474 bit
I(x1; y2) = log2[(1/6)/0.4] = −1.263 bit
I(x2; y1) = log2[(1/4)/0.6] = −1.263 bit
I(x2; y2) = log2[(3/4)/0.4] = 0.907 bit
(3) H(X) = -Σ p(xi) log p(xi) = -(0.6 log 0.6 + 0.4 log 0.4) = 0.971 bit/symbol; H(Y) = 0.971 bit/symbol.
(4) H(Y/X) = -ΣΣ p(xi) p(yj/xi) log p(yj/xi) = -[0.6×(5/6)log(5/6) + 0.6×(1/6)log(1/6) + 0.4×(1/4)log(1/4) + 0.4×(3/4)log(3/4)] = 0.715 bit/symbol
H(X/Y) = H(X) + H(Y/X) − H(Y) = 0.971 + 0.715 − 0.971 = 0.715 bit/symbol
(5) I(X;Y) = H(X) − H(X/Y) = 0.971 − 0.715 = 0.256 bit/symbol

3.2 The transition matrix of a binary symmetric channel is [2/3 1/3; 1/3 2/3].
(1) If p(0) = 3/4, p(1) = 1/4, find H(X), H(X/Y), H(Y/X) and I(X;Y); (2) find the channel capacity and the input distribution that achieves it.
Solution:
(1) H(X) = -(3/4 log 3/4 + 1/4 log 1/4) = 0.811 bit/symbol
H(Y/X) = -ΣΣ p(xi) p(yj/xi) log p(yj/xi) = H(2/3, 1/3) = 0.918 bit/symbol
p(y1) = (3/4)(2/3) + (1/4)(1/3) = 0.5833, p(y2) = 0.4167, so H(Y) = 0.980 bit/symbol
H(X/Y) = H(X) + H(Y/X) − H(Y) = 0.811 + 0.918 − 0.980 = 0.749 bit/symbol
I(X;Y) = H(X) − H(X/Y) = 0.811 − 0.749 = 0.062 bit/symbol
(2) C = max I(X;Y) = log2 2 − H(1/3, 2/3) = 0.082 bit/symbol, achieved with equiprobable inputs p(xi) = 1/2.

Solution (resistance/wattage exercise): Model the problem as
[X (resistance): x1 = 2, x2 = 5; p(x) = (0.7, 0.3)],  [Y (wattage): y1 = 1/8, y2 = 1/4; p(y) = (0.64, 0.36)],
with p(y1/x1) = 0.8, p(y2/x1) = 0.2; find I(X;Y). Then
p(x1 y1) = p(x1) p(y1/x1) = 0.7×0.8 = 0.56, p(x1 y2) = 0.7×0.2 = 0.14
p(x2 y1) = p(y1) − p(x1 y1) = 0.64 − 0.56 = 0.08, p(x2 y2) = p(y2) − p(x1 y2) = 0.36 − 0.14 = 0.22
H(X) = -(0.7 log 0.7 + 0.3 log 0.3) = 0.881 bit/symbol
H(Y) = -(0.64 log 0.64 + 0.36 log 0.36) = 0.943 bit/symbol
H(XY) = -ΣΣ p(xi yj) log p(xi yj) = -(0.56 log 0.56 + 0.14 log 0.14 + 0.08 log 0.08 + 0.22 log 0.22) = 1.638 bit/symbol
I(X;Y) = H(X) + H(Y) − H(XY) = 0.881 + 0.943 − 1.638 = 0.186 bit/symbol

(1) I(X;YZ) = I(X;Y) + I(X;Z/Y) = I(X;Z) + I(X;Y/Z).
Proof: I(X;Y) + I(X;Z/Y) = Σ p(xy) log[p(x/y)/p(x)] + Σ p(xyz) log[p(x/yz)/p(x/y)] = Σ p(xyz) log[p(x/yz)/p(x)] = I(X;YZ); in the same way, I(X;Z) + I(X;Y/Z) = Σ p(xz) log[p(x/z)/p(x)] + Σ p(xyz) log[p(x/yz)/p(x/z)] = Σ p(xyz) log[p(x/yz)/p(x)] = I(X;YZ).
(2) I(X;Y/Z) = I(Y;X/Z) = H(X/Z) − H(X/YZ).
Proof: I(X;Y/Z) = Σ p(xyz) log[p(x/yz)/p(x/z)] = Σ p(xyz) log[p(xyz) p(z)/(p(xz) p(yz))] = Σ p(xyz) log[p(y/xz)/p(y/z)] = I(Y;X/Z), and H(X/Z) − H(X/YZ) = Σ p(xyz) log[p(x/yz)/p(x/z)] = I(X;Y/Z).
(3) I(X;Y/Z) ≥ 0.
Proof: −I(X;Y/Z) = Σ p(xyz) log[p(yz) p(x/z)/p(xyz)] ≤ log Σ p(yz) p(x/z) = 0, so I(X;Y/Z) ≥ 0, with equality if and only if p(xyz) = p(yz) p(x/z) for all x, y, z, i.e., X and Y are conditionally independent given Z (X → Z → Y forms a Markov chain).