Hilbert space of real sequences


sec2


II. The Machinery of Quantum Mechanics

Based on the results of the experiments described in the previous section, we recognize that real experiments do not behave quite as we expect. This section presents a mathematical framework that reproduces all of the above experimental observations. I am not going to go into detail about how this framework was developed. Historically, the mathematical development of QM was somewhat awkward; it was only years after the initial work that a truly rigorous (but also truly esoteric) foundation was put forth by von Neumann. At this point, we will take the mathematical rules of QM as a hypothesis that is consistent with all the experimental results we have encountered.

Now, there is no physics or chemistry in what we are about to discuss; the physics always arises from the experiments. However, just as Shakespeare had to learn proper spelling and grammar before he could write Hamlet, so we must understand the mathematics of QM before we can really start using it to make interesting predictions. This is both the beauty and the burden of physical chemistry: the beauty because once you understand these tools you can answer any experimental question without having to ask a more experienced colleague; the burden because the questions are very hard to answer.

A. Measurements Happen in Hilbert Space

All the math of QM takes place in an abstract space that is called Hilbert space. The important point to realize is that Hilbert space has no connection with the ordinary three-dimensional space that we live in. For example, a Hilbert space can (and usually does) have an infinite number of dimensions. These dimensions do not correspond in any way to the length, width and height we are used to. However, QM gives us a set of rules that connect operations in Hilbert space to measurements in real space. Given a particular experiment, one constructs the appropriate Hilbert space, and then uses the rules of QM within that space to make predictions.
1. Hilbert Space Operators Correspond to Observables

The first rule of QM is: all observables are associated with operators in Hilbert space. We have already encountered this rule; we just didn't know the operators lived in Hilbert space. Now, for most intents and purposes, Hilbert space operators behave like variables: you can add them, subtract them, multiply them, etc., and many of the familiar rules of algebra hold (here $\hat{X},\hat{Y},\hat{Z}$ are arbitrary operators):

Addition commutes: $\hat{X}+\hat{Y}=\hat{Y}+\hat{X}$

Addition is associative: $(\hat{X}+\hat{Y})+\hat{Z}=\hat{X}+(\hat{Y}+\hat{Z})$

Multiplication is associative: $(\hat{X}\hat{Y})\hat{Z}=\hat{X}(\hat{Y}\hat{Z})$

However, the multiplication of operators does not commute:

Multiplication does not commute: $\hat{X}\hat{Y}\neq\hat{Y}\hat{X}$

We already knew that this was true; in the case of the polarization operators we showed that $\hat{P}_x$ and $\hat{P}_{x'}$ do not commute:

$\hat{P}_x\hat{P}_{x'}\neq\hat{P}_{x'}\hat{P}_x$

Thus, the association of observables with operators allows us to describe the first quantum effect we discovered in the experiments: non-commuting observations. Also, note that uncertainty comes solely from the fact that the order of measurements matters; hence we can't know the result of both measurements simultaneously.

Now, deciding that operators have all the above features (e.g. associative multiplication, commutative addition) may seem rather arbitrary at first. For example, why does operator multiplication need to be associative? The deep result that motivates this is a theorem asserting that if a set of operators satisfies the above relations (together with a few other benign conditions), then the operators can always be represented by matrices. Hence a better way to remember how to multiply and add operators is to remember that they work just like matrices; any relation that is true for two arbitrary matrices is also true for two arbitrary operators.

2. The System is Described by a State Vector

In Hilbert space, the system is represented by a state $|\psi\rangle$.
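The non-commutativity of operator products can be checked directly with matrices. Below is a minimal numerical sketch (assuming NumPy; the specific projectors, onto the x axis and onto an axis rotated by 45 degrees, are my illustrative stand-ins for the polarization operators $\hat{P}_x$ and $\hat{P}_{x'}$, not taken from the text):

```python
import numpy as np

def projector(theta):
    """Projection matrix onto the direction (cos theta, sin theta)."""
    v = np.array([np.cos(theta), np.sin(theta)])
    return np.outer(v, v)

P_x = projector(0.0)           # polarizer along x (illustrative)
P_xp = projector(np.pi / 4)    # polarizer along x', rotated 45 degrees

# The two products differ, so the operators do not commute.
print(np.allclose(P_x @ P_xp, P_xp @ P_x))   # False

# A projector applied twice is the same as applied once.
print(np.allclose(P_x @ P_x, P_x))           # True
```

Because the two products differ, measuring the two polarizations in different orders gives different results, which is exactly the non-commuting behavior described above.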
Again, we already knew this, but the fact that the states live in Hilbert space lets us know some new facts. First, we note that there are three simple operations one can execute on a state. First, one can multiply it by a constant to obtain a new state:

$|\psi'\rangle = c\,|\psi\rangle$

In general, this constant can be complex. It does not matter which side the constant appears on: $c|\psi\rangle = |\psi\rangle c$. The second thing one can do is to add two states together to make a new state:

$|\psi\rangle = |\psi_1\rangle + |\psi_2\rangle$

As we have seen before, $|\psi\rangle$ is a superposition of the two states $|\psi_1\rangle$ and $|\psi_2\rangle$. Finally, there is one new operation we need to introduce, called Hermitian conjugation. By definition, the Hermitian conjugate (denoted by '†') is given by:

$(c_1|\psi_1\rangle + c_2|\psi_2\rangle)^\dagger = c_1^*\langle\psi_1| + c_2^*\langle\psi_2|$

$(c_1\langle\psi_1| + c_2\langle\psi_2|)^\dagger = c_1^*|\psi_1\rangle + c_2^*|\psi_2\rangle$

where '*' denotes complex conjugation. Thus, the Hermitian conjugate takes kets to bras (and vice versa) and takes the complex conjugate of any constant. Hermitian conjugation in Hilbert space is analogous to the transpose in a traditional vector space, which takes a column vector to a row vector:

$(|\psi\rangle)^\dagger = \langle\psi|$

To be precise, we will ultimately find that Hermitian conjugation is the same as taking the transpose and complex conjugate simultaneously. Finally, we note one important fact about a Hilbert space. There always exists a basis of states, $\{|\phi_\alpha\rangle\}$, such that any other state can be written as a linear combination of the basis states:

$|\psi\rangle = \sum_\alpha c_\alpha\,|\phi_\alpha\rangle$

We have as yet said nothing about the number of these states. In general, the basis for a Hilbert space involves an infinite number of states. The definition above assumes they are denumerable (i.e. we can assign them numbers i = 1, 2, 3, 4, ...). In some situations, the basis will be continuous. In these situations, we can replace the sum by an integral:

$|\psi\rangle = \int c(\alpha)\,|\phi_\alpha\rangle\,d\alpha$

3. Bra-Ket Gives Probability

Now, in order to make predictions, we need to understand a few properties of the bra-ket product. To be mathematically precise, bra and ket states are dual to one another.
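In a finite-dimensional Hilbert space these operations reduce to ordinary complex vector algebra. A small sketch (the two-dimensional space, the basis kets, and the coefficients are illustrative choices of mine, not from the text):

```python
import numpy as np

ket1 = np.array([1.0 + 0j, 0.0])     # |psi_1>, first basis ket
ket2 = np.array([0.0, 1.0 + 0j])     # |psi_2>, second basis ket

# Superposition c1|psi_1> + c2|psi_2> with complex coefficients.
psi = (2 + 1j) * ket1 + 3j * ket2

# Hermitian conjugation: ket -> bra, coefficients complex-conjugated.
bra_psi = psi.conj()                 # components: 2-1j and -3j

# Expansion coefficients recovered from overlaps with the basis:
c1 = ket1.conj() @ psi               # <psi_1|psi> = 2+1j
c2 = ket2.conj() @ psi               # <psi_2|psi> = 3j
print(c1, c2)
```

The overlap with each basis ket returns exactly the coefficient of that ket in the superposition, which is the content of the expansion formula above.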
The illustration in terms of vectors is invaluable in understanding what this means, because column vectors and row vectors are also dual to one another. Thus, essentially all the properties of row and column vectors can be transferred over to bra and ket states. Most notably, one can define an overlap (or inner product) analogous to the dot product for ordinary vectors:

$\langle\chi|\psi\rangle \quad\Leftrightarrow\quad (\text{row vector})\cdot(\text{column vector})$

The overlap between a bra and a ket has all the same intuitive content as the dot product: it tells you how similar the two states are. If the overlap is zero, the two states are orthogonal. We can also define the norm of a state by:

$\|\psi\|^2 = \langle\psi|\psi\rangle$

One of the properties of the bracket product in Hilbert space is that the norm of a state is always greater than or equal to zero, and it can only be zero for the trivial state that corresponds to the origin. It turns out that the norm of the state has no physical relevance; any value between 0 and ∞ gives the same physical answer. In practice it is often easiest to multiply the wavefunction by a normalization constant, $c = \langle\psi|\psi\rangle^{-1/2}$, that makes the norm 1. This does not affect our predictions but often makes the expressions simpler. If two states are both orthogonal to one another and normalized, they are said to be orthonormal.

As mentioned above, operators can be associated with matrices. It is therefore natural to associate an operator acting on a ket state with a matrix-vector product:

$\hat{O}|\psi\rangle \quad\Leftrightarrow\quad (\text{matrix})\times(\text{column vector})$

This allows us to define the Hermitian conjugate (HC) of an operator by forcing the HC of $\hat{O}|\psi\rangle$ to be the HC of $|\psi\rangle$ times the HC of $\hat{O}$:

$\left(\hat{O}|\psi\rangle\right)^\dagger \equiv \langle\psi|\hat{O}^\dagger$

This defines $\hat{O}^\dagger$, the HC of $\hat{O}$, which is also called the adjoint of the operator $\hat{O}$. If an operator is equal to its adjoint, it is Hermitian. This is analogous to a symmetric matrix.

It is important to notice that the order of operations is crucial at this point. Operators will always appear to the left of a ket state and to the right of a bra state.
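The norm, the normalization constant, and the adjoint relation $(\hat{O}|\psi\rangle)^\dagger = \langle\psi|\hat{O}^\dagger$ can all be checked numerically. A sketch, assuming NumPy (the state and the operator below are illustrative choices of mine, not from the text):

```python
import numpy as np

psi = np.array([3.0, 4.0j])              # unnormalized illustrative state

norm_sq = np.vdot(psi, psi).real         # <psi|psi> = 9 + 16 = 25
psi_n = psi / np.sqrt(norm_sq)           # multiply by c = <psi|psi>**(-1/2)
print(np.vdot(psi_n, psi_n).real)        # 1.0: now normalized

# A Hermitian operator equals its own conjugate transpose (its adjoint).
O = np.array([[1.0, 2.0 - 1j],
              [2.0 + 1j, 5.0]])
print(np.allclose(O, O.conj().T))        # True

# (O|psi>)^dagger equals <psi| O^dagger, component by component.
lhs = (O @ psi).conj()                   # HC of the ket O|psi>
rhs = psi.conj() @ O.conj().T            # bra <psi| times O^dagger
print(np.allclose(lhs, rhs))             # True
```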
The expressions $\hat{O}\langle\psi|$ and $|\psi\rangle\hat{O}$ are not incorrect; they are simply useless in describing reality. This might be clearer if we write the associated matrix expressions:

$(\text{matrix})(\text{row vector})$ and $(\text{column vector})(\text{matrix})$

One can give meaning to these expressions (in terms of a tensor product), but the result is not useful.

We are now in a position to restate the third rule of QM: the average value of the observable O for a system in the state $|\psi\rangle$ is given by:

$\langle O\rangle = \frac{\langle\psi|\hat{O}|\psi\rangle}{\langle\psi|\psi\rangle}$

Note that this equation simplifies if $|\psi\rangle$ is normalized, in which case $\langle O\rangle = \langle\psi|\hat{O}|\psi\rangle$.

4. Operators and Eigenvalues

One important fact is that operators in Hilbert space are always linear, which means:

$\hat{O}\left(|\psi_1\rangle + |\psi_2\rangle\right) = \hat{O}|\psi_1\rangle + \hat{O}|\psi_2\rangle$

This is another one of the traits that allows operators to be represented in terms of a matrix algebra (they call it linear algebra for a reason).

Now, one can associate a set of eigenvalues, $o_\alpha$, and eigenstates, $|\psi_\alpha\rangle$, with any linear operator, $\hat{O}$, by finding all of the solutions of the eigenvalue equation:

$\hat{O}|\psi_\alpha\rangle = o_\alpha|\psi_\alpha\rangle$

This allows us to state the final two rules of QM: when measuring the value of the observable O, the only possible outcomes are the eigenvalues of $\hat{O}$. If the spectrum of eigenvalues of $\hat{O}$ is discrete, this immediately implies that the resulting experimental results will be quantized, as we know is quite often the case. If the spectrum of eigenvalues of $\hat{O}$ is continuous, then this rule gives us little information. And, finally, after O has been observed and found to have a value $o_\alpha$, the wavefunction of the system collapses into $|\psi_\alpha\rangle$.

5. Some Interesting Facts

Before moving on to describe the experiments from the previous section in terms of our newly proposed rules, it is useful to define a few concepts. The first is the idea of an outer product. Just as we can write the inner product as (bra)×(ket), we can write the outer product as (ket)×(bra).
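The average-value rule and the eigenvalue rule translate directly into linear algebra: the expectation value is a quadratic form, and the possible outcomes are the eigenvalues of the matrix. A sketch (the 2×2 Hermitian observable and the state are illustrative choices of mine):

```python
import numpy as np

O = np.array([[2.0, 1.0],
              [1.0, 2.0]])        # illustrative Hermitian observable
psi = np.array([1.0, 1.0])        # unnormalized illustrative state

# <O> = <psi|O|psi> / <psi|psi>
expval = (psi.conj() @ O @ psi) / (psi.conj() @ psi)
print(expval)                     # 3.0

# The only possible measurement outcomes are the eigenvalues of O.
evals, evecs = np.linalg.eigh(O)
print(evals)                      # [1. 3.]
```

Note that psi here happens to be (proportional to) the eigenvector with eigenvalue 3, so the average value coincides with that single outcome.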
This is in strict analogy to the case of vectors, where the outer product is a column vector times a row vector:

$|\chi\rangle\langle\psi| \quad\Leftrightarrow\quad (\text{column vector})(\text{row vector})$

As we have seen in the polarization experiments, the outer product is an operator; if we act on a state with it, we get another state back:

$\left(|\chi\rangle\langle\psi|\right)|\phi\rangle = |\chi\rangle\langle\psi|\phi\rangle = c\,|\chi\rangle, \qquad c \equiv \langle\psi|\phi\rangle$

This is, again, in direct analogy with vector algebra, where the outer product of two vectors is a matrix. One interesting operator is the outer product of a ket with its own bra, which is called the density operator:

$\hat{P}_\psi = |\psi\rangle\langle\psi|$

If $|\psi\rangle$ is normalized, this operator happens to be equal to its own square:

$\hat{P}_\psi\hat{P}_\psi = |\psi\rangle\underbrace{\langle\psi|\psi\rangle}_{=1}\langle\psi| = |\psi\rangle\langle\psi| = \hat{P}_\psi$

This property is called idempotency. Hence, we see that the density operator for any quantum state is idempotent. Further, we see that $\hat{P}_\psi$ acting on any state gives back the state $|\psi\rangle$ times a constant:

$\hat{P}_\psi|\phi\rangle = |\psi\rangle\langle\psi|\phi\rangle = c\,|\psi\rangle, \qquad c \equiv \langle\psi|\phi\rangle$

By this token, density operators are also called projection operators, because they project out the part of a given wavefunction that is proportional to $|\psi\rangle$.

One very important fact about Hilbert space is that there is always a complete orthonormal basis, $\{|\phi_i\rangle\}$, of ket states. As the name implies, these states are orthonormal (the overlap between different states is zero and each state is normalized) and they form a basis (any state $|\psi\rangle$ can be written as a linear combination of these states). We can write the orthonormality condition in shorthand as

$\langle\phi_i|\phi_j\rangle = \delta_{ij}$

where we have defined the Kronecker delta, a symbol that is one if i = j and zero otherwise.

The first important results we will prove concern Hermitian operators. Given a Hermitian operator, $\hat{H}$, it turns out that 1) the eigenvalues of $\hat{H}$ are always real, and 2) the eigenstates can be made to form a complete orthonormal basis. Both these facts are extremely important. First, recall that we know experimental results (which correspond to eigenvalues) are always real numbers; thus, it is not surprising that every observable we deal with in this course will be associated with a Hermitian operator.
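The idempotency and projection properties of $\hat{P}_\psi = |\psi\rangle\langle\psi|$ are easy to verify numerically. A sketch (the normalized state and the test ket below are illustrative choices of mine):

```python
import numpy as np

psi = np.array([1.0, 1j]) / np.sqrt(2)   # normalized: <psi|psi> = 1
P = np.outer(psi, psi.conj())            # density operator |psi><psi|

# Idempotency: P squared equals P.
print(np.allclose(P @ P, P))             # True

# P projects any |phi> onto the part proportional to |psi>:
phi = np.array([1.0, 0.0])
out = P @ phi                            # equals <psi|phi> |psi>
print(np.allclose(out, np.vdot(psi, phi) * psi))   # True
```

Note the conjugate on the bra side of `np.outer`: the matrix elements are $\psi_i\psi_j^*$, exactly the (ket)(bra) structure described above.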
Also, note that up to now we have appealed to the existence of an orthonormal basis, but gave no hints about how such a basis was to be constructed. We now see that every Hermitian operator associated with an observation naturally defines its own orthonormal basis!

As with nearly all theorems in chemistry, the most important part of this is the result and not how it is obtained. However, we will outline the proof of this theorem, mostly to get a little more practice with the ins and outs of Dirac notation.

______________________________________________________

1) Consider the eigenvalue equation and its Hermitian conjugate:

$\hat{H}|\psi_\alpha\rangle = h_\alpha|\psi_\alpha\rangle \quad\xrightarrow{\text{Hermitian conjugate}}\quad \langle\psi_\alpha|\hat{H} = h_\alpha^*\langle\psi_\alpha|$

Now we apply one of our tricks and take the inner product of the left equation with $\langle\psi_\alpha|$ and the inner product of the right equation with $|\psi_\alpha\rangle$:

$\langle\psi_\alpha|\hat{H}|\psi_\alpha\rangle = h_\alpha\langle\psi_\alpha|\psi_\alpha\rangle \qquad\qquad \langle\psi_\alpha|\hat{H}|\psi_\alpha\rangle = h_\alpha^*\langle\psi_\alpha|\psi_\alpha\rangle$

We see that the left-hand sides (l.h.s.) of both equations are the same, so we subtract them to obtain:

$0 = \left(h_\alpha - h_\alpha^*\right)\langle\psi_\alpha|\psi_\alpha\rangle$

In order for the right-hand side (r.h.s.) to be zero, either

$h_\alpha - h_\alpha^* = 0 \qquad\text{or}\qquad \langle\psi_\alpha|\psi_\alpha\rangle = 0$

Since we defined our states so that their norms were not zero, we conclude that $h_\alpha - h_\alpha^* = 0$, which implies that $h_\alpha$ is real.

2) Here, we need to prove that the eigenstates are a) normalized, b) orthogonal and c) form a complete basis. We will take these points in turn.

a) The eigenstates can be trivially normalized, since if $|\psi_\alpha\rangle$ is an eigenstate of $\hat{H}$, then so is $c|\psi_\alpha\rangle$:

$\hat{H}\left(c|\psi_\alpha\rangle\right) = c\hat{H}|\psi_\alpha\rangle = ch_\alpha|\psi_\alpha\rangle = h_\alpha\left(c|\psi_\alpha\rangle\right)$

So given an unnormalized eigenstate, we can always normalize it without affecting the eigenvalue.

b) Consider the ket eigenvalue equation for one value of α and the bra equation for α':

$\hat{H}|\psi_\alpha\rangle = h_\alpha|\psi_\alpha\rangle \qquad\qquad \langle\psi_{\alpha'}|\hat{H} = h_{\alpha'}\langle\psi_{\alpha'}|$

where we have already made use of the fact that $h_{\alpha'}^* = h_{\alpha'}$. Now, take the inner product of the first equation with $\langle\psi_{\alpha'}|$ and the second with $|\psi_\alpha\rangle$. Then:

$\langle\psi_{\alpha'}|\hat{H}|\psi_\alpha\rangle = h_\alpha\langle\psi_{\alpha'}|\psi_\alpha\rangle \qquad\qquad \langle\psi_{\alpha'}|\hat{H}|\psi_\alpha\rangle = h_{\alpha'}\langle\psi_{\alpha'}|\psi_\alpha\rangle$

Once again, the l.h.s.
of the equations are equal, and subtracting gives:

$0 = \left(h_\alpha - h_{\alpha'}\right)\langle\psi_{\alpha'}|\psi_\alpha\rangle$

Thus, either

$h_\alpha - h_{\alpha'} = 0 \qquad\text{or}\qquad \langle\psi_{\alpha'}|\psi_\alpha\rangle = 0$

Now, recall that we are dealing with two different eigenstates (i.e. $\alpha \neq \alpha'$). If the eigenvalues are not degenerate (i.e. $h_\alpha \neq h_{\alpha'}$), then the first equation cannot be satisfied and the eigenvectors must be orthogonal. In the case of degeneracy, however, we appear to be out of luck; the first equality is satisfied and we can draw no conclusions about the orthogonality of the eigenvectors. What is going on? Notice that, if $h_\alpha = h_{\alpha'} \equiv h$, then any linear combination of the two degenerate eigenstates, $a|\psi_\alpha\rangle + b|\psi_{\alpha'}\rangle$, is also an eigenstate with the same eigenvalue:

$\hat{H}\left(a|\psi_\alpha\rangle + b|\psi_{\alpha'}\rangle\right) = a\hat{H}|\psi_\alpha\rangle + b\hat{H}|\psi_{\alpha'}\rangle = ah|\psi_\alpha\rangle + bh|\psi_{\alpha'}\rangle = h\left(a|\psi_\alpha\rangle + b|\psi_{\alpha'}\rangle\right)$

So, when we have a degenerate eigenvalue, the definition of the eigenstates that correspond to that eigenvalue is not unique, and not all of these combinations are orthogonal to one another. However, there is a theorem due to Gram and Schmidt (which we will not prove) that asserts that at least one of the possible choices forms an orthonormal set. The difficult part in proving this is that there may be two, three, four... different degenerate states. So, for non-degenerate eigenvalues, the states must be orthogonal, while for a degenerate eigenvalue the states are not necessarily orthogonal, but we are free to choose them to be orthogonal.

c) The final thing we need to prove is that the eigenstates form a complete basis. Abstractly, this means that we can write any other state as a linear combination of the eigenstates:

$|\chi\rangle = \sum_\alpha c_\alpha|\psi_\alpha\rangle$

This turns out to be difficult to prove, and so we simply defer to our math colleagues and assert that it can be proven.

_______________________________________________________

Finally, it is also useful to define the commutator of two operators:

$\left[\hat{A},\hat{B}\right] \equiv \hat{A}\hat{B} - \hat{B}\hat{A}$

If two operators commute, then the order in which they appear does not matter and the commutator vanishes.
Meanwhile, if the operators do not commute, then the commutator measures "how much" the order matters.
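Both the commutator and the two Hermitian-operator theorems proved above can be checked on small matrices. A sketch (the two 2×2 matrices are illustrative choices of mine, not from the text):

```python
import numpy as np

def commutator(A, B):
    """[A, B] = AB - BA."""
    return A @ B - B @ A

A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
B = np.array([[1.0, 0.0],
              [0.0, -1.0]])

print(commutator(A, B))                    # nonzero: A and B do not commute
print(np.allclose(commutator(A, A), 0))    # True: A commutes with itself

# A is Hermitian, so (as proved above) its eigenvalues are real and its
# eigenvectors can be chosen orthonormal:
evals, evecs = np.linalg.eigh(A)
print(evals)                                           # [-1.  1.]
print(np.allclose(evecs.conj().T @ evecs, np.eye(2)))  # True: orthonormal
```

The size of the commutator matrix is a direct numerical measure of "how much" the order of the two measurements matters.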

Problems of the 2nd S.-T. Yau College Student Mathematics Contest (第2届丘成桐大学生数学竞赛试题)


S.-T. Yau College Student Mathematics Contests 2011
Analysis and Differential Equations
Individual, 2:30–5:00 pm, July 9, 2011
(Please select 5 problems to solve)

1. a) Compute the integral $\int_{-\infty}^{\infty}\frac{x\cos x\,dx}{(x^2+1)(x^2+2)}$.
b) Show that there is a continuous function $f:[0,+\infty)\to(-\infty,+\infty)$ such that $f\not\equiv 0$ and $f(4x)=f(2x)+f(x)$.

2. Solve the following problem: $\frac{d^2u}{dx^2}-u(x)=4e^{-x}$, $x\in(0,1)$, $u(0)=0$, $\frac{du}{dx}(0)=0$.

3. Find an explicit conformal transformation of the open set $U=\{|z|>1\}\setminus(-\infty,-1]$ to the unit disc.

4. Assume $f\in C^2[a,b]$ satisfies $|f(x)|\le A$ and $|f''(x)|\le B$ for each $x\in[a,b]$, and that there exists $x_0\in[a,b]$ such that $|f'(x_0)|\le D$. Then $|f'(x)|\le 2\sqrt{AB}+D$ for all $x\in[a,b]$.

5. Let $C([0,1])$ denote the Banach space of real-valued continuous functions on $[0,1]$ with the sup norm, and suppose that $X\subset C([0,1])$ is a dense linear subspace. Suppose $l:X\to\mathbb{R}$ is a linear map (not assumed to be continuous in any sense) such that $l(f)\ge 0$ if $f\in X$ and $f\ge 0$. Show that there is a unique Borel measure $\mu$ on $[0,1]$ such that $l(f)=\int f\,d\mu$ for all $f\in X$.

6. For $s\ge 0$, let $H^s(T)$ be the space of $L^2$ functions $f$ on the circle $T=\mathbb{R}/(2\pi\mathbb{Z})$ whose Fourier coefficients $\hat f_n=\int_0^{2\pi}e^{-inx}f(x)\,dx$ satisfy $\sum(1+n^2)^s|\hat f_n|^2<\infty$, with norm $\|f\|_s^2=(2\pi)^{-1}\sum(1+n^2)^s|\hat f_n|^2$.
a. Show that for $r>s\ge 0$, the inclusion map $i:H^r(T)\to H^s(T)$ is compact.
b. Show that if $s>1/2$, then $H^s(T)$ includes continuously into $C(T)$, the space of continuous functions on $T$, and the inclusion map is compact.

S.-T. Yau College Student Mathematics Contests 2011
Geometry and Topology
Individual, 9:30–12:00 am, July 10, 2011
(Please select 5 problems to solve)

1. Suppose $M$ is a closed smooth $n$-manifold.
a) Does there always exist a smooth map $f:M\to S^n$ from $M$ into the $n$-sphere, such that $f$ is essential (i.e. $f$ is not homotopic to a constant map)? Justify your answer.
b) Same question, replacing $S^n$ by the $n$-torus $T^n$.

2. Suppose $(X,d)$ is a compact metric space and $f:X\to X$ is a map so that $d(f(x),f(y))=d(x,y)$ for all $x,y$ in $X$. Show that $f$ is an onto map.

3. Let $C_1,C_2$ be two linked circles in $\mathbb{R}^3$. Show that $C_1$ cannot be
homotopic to a point in $\mathbb{R}^3\setminus C_2$.

4. Let $M=\mathbb{R}^2/\mathbb{Z}^2$ be the two-dimensional torus, $L$ the line $3x=7y$ in $\mathbb{R}^2$, and $S=\pi(L)\subset M$, where $\pi:\mathbb{R}^2\to M$ is the projection map. Find a differential form on $M$ which represents the Poincaré dual of $S$.

5. A regular curve $C$ in $\mathbb{R}^3$ is called a Bertrand curve if there exists a diffeomorphism $f:C\to D$ from $C$ onto a different regular curve $D$ in $\mathbb{R}^3$ such that $N_xC=N_{f(x)}D$ for any $x\in C$. Here $N_xC$ denotes the principal normal line of the curve $C$ passing through $x$, and $T_xC$ will denote the tangent line of $C$ at $x$. Prove that:
a) The distance $|x-f(x)|$ is constant for $x\in C$, and the angle made between the directions of the two tangent lines $T_xC$ and $T_{f(x)}D$ is also constant.
b) If the curvature $k$ and torsion $\tau$ of $C$ are nowhere zero, then there must be constants $\lambda$ and $\mu$ such that $\lambda k+\mu\tau=1$.

6. Let $M$ be the closed surface generated by carrying a small circle with radius $r$ around a closed curve $C$ embedded in $\mathbb{R}^3$, such that the center moves along $C$ and the circle is in the normal plane to $C$ at each point. Prove that
$\int_M H^2\,d\sigma\ge 2\pi^2$,
and the equality holds if and only if $C$ is a circle with radius $\sqrt{2}\,r$. Here $H$ is the mean curvature of $M$ and $d\sigma$ is the area element of $M$.

S.-T. Yau College Student Mathematics Contests 2011
Algebra, Number Theory and Combinatorics
Individual, 2:30–5:00 pm, July 10, 2011
(Please select 5 problems to solve)

For the following problems, every example and statement must be backed up by proof. Examples and statements without proof will receive no credit.

1. Let $K=\mathbb{Q}(\sqrt{-3})$, an imaginary quadratic field.
(a) Does there exist a finite Galois extension $L/\mathbb{Q}$ which contains $K$ such that $\mathrm{Gal}(L/\mathbb{Q})\cong S_3$? (Here $S_3$ is the symmetric group in 3 letters.)
(b) Does there exist a finite Galois extension $L/\mathbb{Q}$ which contains $K$ such that $\mathrm{Gal}(L/\mathbb{Q})\cong\mathbb{Z}/4\mathbb{Z}$?
(c) Does there exist a finite Galois extension $L/\mathbb{Q}$ which contains $K$ such that $\mathrm{Gal}(L/\mathbb{Q})\cong Q$? Here $Q$ is the quaternion group with 8 elements $\{\pm 1,\pm i,\pm j,\pm k\}$, a finite subgroup of the group of units $\mathbb{H}^\times$ of the ring $\mathbb{H}$ of all Hamiltonian quaternions.

2. Let $f$ be a two-dimensional
(complex) representation of a finite group $G$ such that 1 is an eigenvalue of $f(\sigma)$ for every $\sigma\in G$. Prove that $f$ is a direct sum of two one-dimensional representations of $G$.

3. Let $F\subset\mathbb{R}$ be the subset of all real numbers that are roots of monic polynomials $f(X)\in\mathbb{Q}[X]$.
(1) Show that $F$ is a field.
(2) Show that the only field automorphism of $F$ is the identity automorphism $\alpha(x)=x$ for all $x\in F$.

4. Let $V$ be a finite-dimensional vector space over $\mathbb{R}$ and $T:V\to V$ be a linear transformation such that
(1) the minimal polynomial of $T$ is irreducible;
(2) there exists a vector $v\in V$ such that $\{T^iv\mid i\ge 0\}$ spans $V$.
Show that $V$ contains no non-trivial proper $T$-invariant subspace.

5. Given a commutative diagram
A → B → C → D → E
↓    ↓    ↓    ↓    ↓
A → B → C → D → E
of Abelian groups, such that (i) both rows are exact sequences and (ii) every vertical map, except the middle one, is an isomorphism, show that the middle map $C\to C$ is also an isomorphism.

6. Prove that a group of order 150 is not simple.

S.-T. Yau College Student Mathematics Contests 2011
Applied Math., Computational Math., Probability and Statistics
Individual, 6:30–9:00 pm, July 9, 2011
(Please select 5 problems to solve)

1. Given a weight function $\rho(x)>0$, let the inner product corresponding to $\rho(x)$ be defined as follows:
$(f,g):=\int_a^b\rho(x)f(x)g(x)\,dx$,
and let $\|f\|:=\sqrt{(f,f)}$.
(1) Define a sequence of polynomials as follows:
$p_0(x)=1$, $p_1(x)=x-a_1$,
$p_n(x)=(x-a_n)p_{n-1}(x)-b_np_{n-2}(x)$, $n=2,3,\dots$,
where
$a_n=\frac{(xp_{n-1},p_{n-1})}{(p_{n-1},p_{n-1})}$, $n=1,2,\dots$, and $b_n=\frac{(xp_{n-1},p_{n-2})}{(p_{n-2},p_{n-2})}$, $n=2,3,\dots$.
Show that $\{p_n(x)\}$ is an orthogonal sequence of monic polynomials.
(2) Let $\{q_n(x)\}$ be an orthogonal sequence of monic polynomials corresponding to the $\rho$ inner product. (A polynomial is called monic if its leading coefficient is 1.) Show that $\{q_n(x)\}$ is unique and that it minimizes $\|q_n\|$ amongst all monic polynomials of degree $n$.
(3) Hence or otherwise, show that if $\rho(x)=1/\sqrt{1-x^2}$ and $[a,b]=[-1,1]$, then the corresponding orthogonal sequence is the
Chebyshev polynomials:
$T_n(x)=\cos(n\arccos x)$, $n=0,1,2,\dots$,
and the following recurrence formula holds:
$T_{n+1}(x)=2xT_n(x)-T_{n-1}(x)$, $n=1,2,\dots$.
(4) Find the best quadratic approximation to $f(x)=x^3$ on $[-1,1]$ using $\rho(x)=1/\sqrt{1-x^2}$.

2. If two polynomials $p(x)$ and $q(x)$, both of fifth degree, satisfy
$p(i)=q(i)=\frac{1}{i}$, $i=2,3,4,5,6$,
and $p(1)=1$, $q(1)=2$, find $p(0)-q(0)$.

3. Lay aside $m$ black balls and $n$ red balls in a jug. Suppose $1\le r\le k\le n$. Each time one draws a ball from the jug at random.
1) If each time one draws a ball without return, what is the probability that in the $k$-th time of drawing one obtains exactly the $r$-th red ball?
2) If each time one draws a ball with return, what is the probability that in the first $k$ times of drawings one obtained totally an odd number of red balls?

4. Let $X$ and $Y$ be independent and identically distributed random variables. Show that
$E[|X+Y|]\ge E[|X|]$.
Hint: Consider separately two cases: $E[X^+]\ge E[X^-]$ and $E[X^+]<E[X^-]$.

5. Suppose that $X_1,\dots,X_n$ are a random sample from the Bernoulli distribution with probability of success $p_1$, and $Y_1,\dots,Y_n$ are an independent random sample from the Bernoulli distribution with probability of success $p_2$.
(a) Give a minimum sufficient statistic and the UMVU (uniformly minimum variance unbiased) estimator for $\theta=p_1-p_2$.
(b) Give the Cramer-Rao bound for the variance of the unbiased estimators for $v(p_1)=p_1(1-p_1)$ or the UMVU estimator for $v(p_1)$.
(c) Compute the asymptotic power of the test with critical region $|\sqrt{n}(\hat p_1-\hat p_2)/\sqrt{2\hat p\hat q}|\ge z_{1-\alpha}$ when $p_1=p$ and $p_2=p+n^{-1/2}\Delta$, where $\hat p=0.5\hat p_1+0.5\hat p_2$.

6. Suppose that an experiment is conducted to measure a constant $\theta$. Independent unbiased measurements $y$ of $\theta$ can be made with either of two instruments, both of which measure with normal errors: for $i=1,2$, instrument $i$ produces independent errors with a $N(0,\sigma_i^2)$ distribution. The two error variances $\sigma_1^2$ and $\sigma_2^2$ are known. When a measurement $y$ is made, a record is kept of the instrument used so
that after $n$ measurements the data is $(a_1,y_1),\dots,(a_n,y_n)$, where $a_m=i$ if $y_m$ is obtained using instrument $i$. The choice between instruments is made independently for each observation in such a way that
$P(a_m=1)=P(a_m=2)=0.5$, $1\le m\le n$.
Let $x$ denote the entire set of data available to the statistician, in this case $(a_1,y_1),\dots,(a_n,y_n)$, and let $l_\theta(x)$ denote the corresponding log likelihood function for $\theta$. Let $a=\sum_{m=1}^n(2-a_m)$.
(a) Show that the maximum likelihood estimate of $\theta$ is given by
$\hat\theta=\left(\sum_{m=1}^n 1/\sigma_{a_m}^2\right)^{-1}\sum_{m=1}^n y_m/\sigma_{a_m}^2$.
(b) Express the expected Fisher information $I_\theta$ and the observed Fisher information $I_x$ in terms of $n$, $\sigma_1^2$, $\sigma_2^2$, and $a$. What happens to the quantity $I_\theta/I_x$ as $n\to\infty$?
(c) Show that $a$ is an ancillary statistic, and that the conditional variance of $\hat\theta$ given $a$ equals $1/I_x$. Of the two approximations $\hat\theta\,\dot\sim\,N(\theta,1/I_\theta)$ and $\hat\theta\,\dot\sim\,N(\theta,1/I_x)$, which (if either) would you use for the purposes of inference, and why?

S.-T. Yau College Student Mathematics Contests 2011
Analysis and Differential Equations
Team, 9:00–12:00 am, July 9, 2011
(Please select 5 problems to solve)

1. Let $H^2(\Delta)$ be the space of holomorphic functions in the unit disk $\Delta=\{|z|<1\}$ such that $\int_\Delta|f|^2\,|dz|^2<\infty$. Prove that $H^2(\Delta)$ is a Hilbert space and that for any $r<1$, the map $T:H^2(\Delta)\to H^2(\Delta)$ given by $Tf(z):=f(rz)$ is a compact operator.

2. For any continuous function $f(t)$ of period 1, show that the equation
$\frac{d\varphi}{dt}=2\pi\varphi+f(t)$
has a unique solution of period 1.

3. Let $h(x)$ be a $C^\infty$ function on the real line $\mathbb{R}$. Find a $C^\infty$ function $u(x,y)$ on an open subset of $\mathbb{R}^2$ containing the $x$-axis such that $u_x+2u_y=u^2$ and $u(x,0)=h(x)$.

4. Let $S=\{x\in\mathbb{R}\mid |x-p/q|\le c/q^3$ for all $p,q\in\mathbb{Z}$, $q>0$, $c>0\}$. Show that $S$ is uncountable and its measure is zero.

5. Let $sl(n)$ denote the set of all $n\times n$ real matrices with trace equal to zero, and let $SL(n)$ be the set of all $n\times n$ real matrices with determinant equal to one. Let $\varphi(z)$ be a real analytic function defined in a neighborhood of $z=0$ of the complex plane $\mathbb{C}$
satisfying the conditions $\varphi(0)=1$ and $\varphi'(0)=1$.
(a) If $\varphi$ maps any near-zero matrix in $sl(n)$ into $SL(n)$ for some $n\ge 3$, show that $\varphi(z)=\exp(z)$.
(b) Is the conclusion of (a) still true in the case $n=2$? If it is true, prove it. If not, give a counterexample.

6. Use mathematical analysis to show that:
(a) $e$ and $\pi$ are irrational numbers;
(b) $e$ and $\pi$ are also transcendental numbers.

S.-T. Yau College Student Mathematics Contests 2011
Applied Math., Computational Math., Probability and Statistics
Team, 9:00–12:00 am, July 9, 2011
(Please select 5 problems to solve)

1. Let $A$ be an $N$-by-$N$ symmetric positive definite matrix. The conjugate gradient method can be described as follows:

x_0 = 0, r_0 = b - A x_0, p_0 = r_0
FOR n = 0, 1, ...
    alpha_n = ||r_n||_2^2 / (p_n^T A p_n)
    x_{n+1} = x_n + alpha_n p_n
    r_{n+1} = r_n - alpha_n A p_n
    beta_n  = -(r_{n+1}^T A p_n) / (p_n^T A p_n)
    p_{n+1} = r_{n+1} + beta_n p_n
END FOR

Show:
(a) $\alpha_n$ minimizes $f(x_n+\alpha p_n)$ for all $\alpha\in\mathbb{R}$, where $f(x)\equiv\frac12 x^TAx-b^Tx$.
(b) $p_i^Tr_n=0$ for $i<n$ and $p_i^TAp_j=0$ if $i\neq j$.
(c) $\mathrm{Span}\{p_0,p_1,\dots,p_{n-1}\}=\mathrm{Span}\{r_0,r_1,\dots,r_{n-1}\}\equiv K_n$.
(d) $r_n$ is orthogonal to $K_n$.

2. We use the following scheme to solve the PDE $u_t+u_x=0$:
$u_j^{n+1}=au_{j-2}^n+bu_{j-1}^n+cu_j^n$,
where $a,b,c$ are constants which may depend on the CFL number $\lambda=\frac{\Delta t}{\Delta x}$. Here $x_j=j\Delta x$, $t_n=n\Delta t$, and $u_j^n$ is the numerical approximation to the exact solution $u(x_j,t_n)$, with periodic boundary conditions.
(i) Find $a,b,c$ so that the scheme is second order accurate.
(ii) Verify that the scheme you derived in Part (i) is exact (i.e. $u_j^n=u(x_j,t_n)$) if $\lambda=1$ or $\lambda=2$. Does this imply that the scheme is stable for $\lambda\le 2$? If not, find $\lambda_0$ such that the scheme is stable for $\lambda\le\lambda_0$.
Recall that a scheme is stable if there exist constants $M$ and $C$, which are independent of the mesh sizes $\Delta x$ and $\Delta t$, such that
$\|u^n\|\le Me^{CT}\|u^0\|$
for all $\Delta x$, $\Delta t$ and $n$ such that $t_n\le T$. You can use either the $L^\infty$ norm or the $L^2$ norm to prove stability.

3. Let $X$ and $Y$ be independent random variables, identically distributed according to the Normal distribution with mean 0 and variance 1, $N(0,1)$.
(a) Find the joint probability density function of $(R,\theta)$, where $R=(X^2+Y^2)^{1/2}$ and $\theta=\arctan(Y/X)$.
(b) Are $R$ and $\theta$ independent? Why, or why not?
(c) Find a function $U$ of $R$ which has the uniform distribution on $(0,1)$, $\mathrm{Unif}(0,1)$.
(d) Find a function $V$ of $\theta$ which is distributed as $\mathrm{Unif}(0,1)$.
(e) Show how to transform two independent observations $U$ and $V$ from $\mathrm{Unif}(0,1)$ into two independent observations $X,Y$ from $N(0,1)$.

4. Let $X$ be a random variable such that $E[|X|]<\infty$. Show that
$E[|X-a|]=\inf_{x\in\mathbb{R}}E[|X-x|]$
if and only if $a$ is a median of $X$.

5. Let $Y_1,\dots,Y_n$ be iid observations from the distribution $f(x-\theta)$, where $\theta$ is unknown and $f(\cdot)$ is a probability density function symmetric about zero. Suppose a priori that $\theta$ has the improper prior $\theta\sim$ Lebesgue (flat) on $(-\infty,\infty)$. Write down the posterior distribution of $\theta$. Provide some arguments to show that this flat prior is noninformative. Show that, with the posterior distribution in (a), a 95% probability interval is also a 95% confidence interval.

6. Suppose we have two independent random samples $\{Y_i, i=1,\dots,n\}$ from Poisson with (unknown) mean $\lambda_1$ and $\{Y_i, i=n+1,\dots,2n\}$ from Poisson with (unknown) mean $\lambda_2$. Let $\theta=\lambda_1/(\lambda_1+\lambda_2)$.
(a) Find an unbiased estimator of $\theta$.
(b) Does your estimator have the minimum variance among all unbiased estimators? If yes, prove it. If not, find one that has the minimum variance (and prove it).
(c) Does the unbiased minimum variance estimator you found attain the Fisher information bound? If yes, show it. If no, why not?

S.-T. Yau College Student Mathematics Contests 2011
Geometry and Topology
Team, 9:00–12:00 am, July 9, 2011
(Please select 5 problems to solve)

1. Suppose $K$ is a finite connected simplicial
complex. True or false:
a) If $\pi_1(K)$ is finite, then the universal cover of $K$ is compact.
b) If the universal cover of $K$ is compact, then $\pi_1(K)$ is finite.

2. Compute all homology groups of the $m$-skeleton of an $n$-simplex, $0\le m\le n$.

3. Let $M$ be an $n$-dimensional compact oriented Riemannian manifold with boundary and $X$ a smooth vector field on $M$. If $n$ is the inward unit normal vector of the boundary, show that
$\int_M \mathrm{div}(X)\,dV_M=\int_{\partial M}X\cdot n\,dV_{\partial M}$.

4. Let $F^k(M)$ be the space of all $C^\infty$ $k$-forms on a differentiable manifold $M$. Suppose $U$ and $V$ are open subsets of $M$.
a) Explain carefully how the usual exact sequence
$0\to F(U\cup V)\to F(U)\oplus F(V)\to F(U\cap V)\to 0$
arises.
b) Write down the "long exact sequence" in de Rham cohomology associated to the short exact sequence in part (a) and describe explicitly how the map
$H^k_{deR}(U\cap V)\to H^{k+1}_{deR}(U\cup V)$
arises.

5. Let $M$ be a Riemannian $n$-manifold. Show that the scalar curvature $R(p)$ at $p\in M$ is given by
$R(p)=\frac{1}{\mathrm{vol}(S^{n-1})}\int_{S^{n-1}}\mathrm{Ric}_p(x)\,dS^{n-1}$,
where $\mathrm{Ric}_p(x)$ is the Ricci curvature in direction $x\in S^{n-1}\subset T_pM$, $\mathrm{vol}(S^{n-1})$ is the volume of $S^{n-1}$, and $dS^{n-1}$ is the volume element of $S^{n-1}$.

6. Prove Schur's Lemma: If on a Riemannian manifold of dimension at least three the Ricci curvature depends only on the base point but not on the tangent direction, then the Ricci curvature must be constant everywhere, i.e., the manifold is Einstein.

S.-T. Yau College Student Mathematics Contests 2011
Algebra, Number Theory and Combinatorics
Team, 9:00–12:00 pm, July 9, 2011
(Please select 5 problems to solve)

For the following problems, every example and statement must be backed up by proof. Examples and statements without proof will receive no credit.

1. Let $F$ be a field and $\bar F$ the algebraic closure of $F$. Let $f(x,y)$ and $g(x,y)$ be polynomials in $F[x,y]$ such that $\gcd(f,g)=1$ in $F[x,y]$. Show that there are only finitely many $(a,b)\in\bar F^{\times 2}$ such that $f(a,b)=g(a,b)=0$. Can you generalize this to the cases of more than two variables?

2. Let $D$ be a PID, and $D^n$ the free module of rank $n$ over $D$. Then any submodule of $D^n$ is a
free module of rank $m\le n$.

3. Identify pairs of integers $n\neq m\in\mathbb{Z}^+$ such that the quotient rings $\mathbb{Z}[x,y]/(x^2-y^n)\cong\mathbb{Z}[x,y]/(x^2-y^m)$; and identify pairs of integers $n\neq m\in\mathbb{Z}^+$ such that $\mathbb{Z}[x,y]/(x^2-y^n)\not\cong\mathbb{Z}[x,y]/(x^2-y^m)$.

4. Is it possible to find an integer $n>1$ such that the sum
$1+\frac12+\frac13+\frac14+\cdots+\frac1n$
is an integer?

5. Recall that $\mathbb{F}_7$ is the finite field with 7 elements, and $GL_3(\mathbb{F}_7)$ is the group of all invertible $3\times 3$ matrices with entries in $\mathbb{F}_7$.
(a) Find a 7-Sylow subgroup $P_7$ of $GL_3(\mathbb{F}_7)$.
(b) Determine the normalizer subgroup $N$ of the 7-Sylow subgroup you found in (a).
(c) Find a 2-Sylow subgroup of $GL_3(\mathbb{F}_7)$.

6. For a ring $R$, let $SL_2(R)$ denote the group of invertible $2\times 2$ matrices. Show that $SL_2(\mathbb{Z})$ is generated by $T=\begin{pmatrix}1&1\\0&1\end{pmatrix}$ and $S=\begin{pmatrix}0&1\\-1&0\end{pmatrix}$. What about $SL_2(\mathbb{R})$?

TheKlein-Gordonequation:克莱因戈登方程

TheKlein-Gordonequation:克莱因戈登方程
(24)
where the Lagrangian density satisfies the Euler-Lagrange equations of motions
(25)
such that the Euler-Lagrange equations of motion just give the Klein-Gordon equation (12) and its complex conjugate.
as the basic field equation of the scalar field.
The plane waves (10) are basic solutions and the field (9) is constructed by
a general superposition of the basic states.
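That the plane waves are solutions can be checked symbolically. The sketch below (SymPy; the symbol names and the 1+1-dimensional reduction are my own choices, not from the lecture) verifies that φ = e^{−i(Et − px)} satisfies (∂²_t − ∂²_x + m²)φ = 0 once the relativistic dispersion relation E² = p² + m² is imposed.

```python
import sympy as sp

t, x, p, m = sp.symbols("t x p m", real=True, positive=True)
E = sp.sqrt(p**2 + m**2)  # relativistic dispersion: E^2 = p^2 + m^2

# Plane-wave ansatz (units with hbar = c = 1)
phi = sp.exp(-sp.I * (E * t - p * x))

# Klein-Gordon operator in 1+1 dimensions applied to phi
kg = sp.diff(phi, t, 2) - sp.diff(phi, x, 2) + m**2 * phi

assert sp.simplify(kg) == 0
print("plane wave satisfies the Klein-Gordon equation")
```

The second time derivative pulls down −E², the second space derivative pulls down −p², and the dispersion relation cancels everything against the mass term.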
Quantization
The challenge is to find operator solutions of the Klein-Gordon equation (12) which satisfy eq. (28). In analogy to the Lagrange density (24), the Hamiltonian is
$$H = \int d^3x \left( \pi^* \pi + \nabla\phi^* \cdot \nabla\phi + m^2 \phi^* \phi \right), \qquad \pi = \frac{\partial \mathcal{L}}{\partial \dot{\phi}} = \dot{\phi}^*.$$
Lecture 8
The Klein-Gordon equation
WS 2010/11: "Introduction to Nuclear and Particle Physics"
The bosons in field theory
Bosons with spin 0
scalar (or pseudo-scalar) meson fields

Mathematical English (Revised Edition 2)

dense theorem induced equivalent extension multiplier bound normalize
slice fundamental
notation identity sphere lemma
The convolution of f and g is defined by
$$(f * g)(x) = \int_{\mathbb{R}^d} f(x - y)\, g(y)\, dy,$$
and it is well defined for a.e. x. Obviously ∫ f(x − y) g(y) dy = ∫ f(y) g(x − y) dy, so f ∗ g = g ∗ f; and if f and g are measurable (integrable), then f ∗ g is also measurable (integrable), with ‖f ∗ g‖₁ ≤ ‖f‖₁ · ‖g‖₁.
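The norm inequality ‖f ∗ g‖₁ ≤ ‖f‖₁ · ‖g‖₁ (Young's inequality for L¹) can be illustrated on a grid. The sketch below (Python with NumPy; the example functions are my own choices) approximates the integrals by Riemann sums, where the inequality holds exactly for the discrete sums by the triangle inequality.

```python
import numpy as np

# Discretize two integrable functions on a uniform grid of spacing h.
h = 0.01
x = np.arange(-5, 5, h)
f = np.exp(-x**2) * np.sin(3 * x)      # sign-changing, integrable
g = np.where(np.abs(x) < 1, 1.0, 0.0)  # indicator of (-1, 1)

# Riemann-sum approximations of the L^1 norm and of (f*g)(x) = ∫ f(x-y) g(y) dy.
norm = lambda u: np.sum(np.abs(u)) * h
conv = np.convolve(f, g) * h           # discrete convolution, scaled by h

# Young's inequality ||f*g||_1 <= ||f||_1 ||g||_1 holds for the discrete sums.
assert norm(conv) <= norm(f) * norm(g) + 1e-12

# Commutativity: f*g = g*f.
assert np.allclose(conv, np.convolve(g, f) * h)

print("||f*g||_1 =", round(norm(conv), 4), "<=", round(norm(f) * norm(g), 4))
```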
dust sequence characteristic compact completion hypothesis convex counting rectifiable difference mapping rational preserving
metric series jump positive smooth gradient
(1) There exists a positive continuous function f on R so that f is integrable, yet f does not tend to 0 as x → ∞.
$$\sum_{j=1}^{\infty} m(E_j).$$
6. Countable intersections of open sets are called Gδ sets; considering their complements, countable unions of closed sets are called Fσ sets. A subset E of Rd is measurable (1) if and only if E differs from a Gδ by a set of measure zero.

Mathematical English (Doc Version) 14


Mathematical English: Mathematicians

Leonhard Euler was born on April 15, 1707, in Basel, Switzerland, the son of a mathematician and Calvinist pastor who wanted his son to become a pastor as well. Although Euler had different ideas, he entered the University of Basel to study Hebrew and theology, thus obeying his father. His hard work at the university and remarkable ability brought him to the attention of the well-known mathematician Johann Bernoulli (1667-1748). Bernoulli, realizing Euler's talents, persuaded Euler's father to change his mind, and Euler pursued his studies in mathematics.

At the age of nineteen, Euler's first original work appeared. His paper failed to win the Paris Academy Prize in 1727; however, this loss was compensated for later as he won the prize twelve times.

At the age of 28, Euler competed for the Paris prize for a problem in astronomy which several leading mathematicians had thought would take several months to solve. To their great surprise, he solved it in three days! Unfortunately, the considerable strain that he underwent in his relentless effort caused an illness that resulted in the loss of the sight of his right eye.

At the age of 62, Euler lost the sight of his left eye and thus became totally blind. However, this did not end his interest and work in mathematics; instead, his mathematical productivity increased considerably. On September 18, 1783, while playing with his grandson and drinking tea, Euler suffered a fatal stroke.

Euler was the most prolific mathematician the world has ever seen. He made significant contributions to every branch of mathematics. He had a phenomenal memory: he could remember every important formula of his time. A genius, he could work anywhere and under any condition.

George Cantor (March 3, 1845 - June 1, 1918), the founder of set theory, was born in St. Petersburg into a Jewish merchant family that settled in Germany in 1856. He studied mathematics, physics and philosophy in Zurich and at the University of Berlin.
After receiving his degree in 1867 in Berlin, he became a lecturer at the University of Halle from 1879 to 1905. In 1884, under the strain of opposition to his ideas and his efforts to prove the continuum hypothesis, he suffered the first of many attacks of depression which continued to hospitalize him from time to time until his death.

The thesis he wrote for his degree concerned the theory of numbers; however, he arrived at set theory from his research concerning the uniqueness of trigonometric series. In 1874, he introduced for the first time the concept of cardinal numbers, with which he proved that there were "more" transcendental numbers than algebraic numbers. This result caused a sensation in the mathematical world and became the subject of a great deal of controversy. Cantor was troubled by the opposition of L. Kronecker, but he was supported by J. W. R. Dedekind and G. Mittag-Leffler. In his note on the history of the theory of probability, he recalled the period in which the theory was not generally accepted and cried out "the essence of mathematics lies in its freedom!" In addition to his work on the concept of cardinal numbers, he laid the basis for the concepts of order types, transfinite ordinals, and the theory of real numbers by means of fundamental sequences. He also studied general point sets in Euclidean space and defined the concepts of accumulation point, closed set and open set. He was a pioneer in dimension theory, which led to the development of topology.

Kantorovich was born on January 19, 1912, in St. Petersburg, now called Leningrad. He graduated from the University of Leningrad in 1930 and became a full professor at the early age of 22. At the age of 27, his pioneering contributions in linear programming appeared in a paper entitled Mathematical Methods for the Organization and Planning of Production.
In 1949, he was awarded a Stalin Prize for his contributions in a branch of mathematics called functional analysis and in 1958, he became a member of the Russian Academy of Sciences. Interestingly enough, in 1965, Kantorovich won a Lenin Prize for the same outstanding work in linear programming for which he was awarded the Nobel Prize. Since 1971, he has been the director of the Institute of Economics of Management in Moscow.

Paul R. Halmos is a distinguished professor of Mathematics at Indiana University, and Editor-Elect of the American Mathematical Monthly. He received his Ph.D. from the University of Illinois, and has held positions at Illinois, Syracuse, Chicago, Michigan, Hawaii, and Santa Barbara. He has published numerous books and nearly 100 articles, and has been the editor of many journals and several book series. The Mathematical Association of America has given him the Chauvenet Prize and (twice) the Lester Ford award for mathematical exposition. His main mathematical interests are in measure and ergodic theory, algebraic logic, and operators on Hilbert space.

Vito Volterra, born in the year 1860 in Ancona, showed in his boyhood his exceptional gifts for mathematical and physical thinking. At the age of thirteen, after reading Verne's novel on the voyage from earth to moon, he devised his own method to compute the trajectory under the gravitational field of the earth and the moon; the method was worth later development into a general procedure for solving differential equations. He became a pupil of Dini at the Scuola Normale Superiore in Pisa and published many important papers while still a student. He received his degree in Physics at the age of 22 and was made full professor of Rational Mechanics at the same University only one year later, as a successor of Betti.

Volterra had many interests outside pure mathematics, ranging from history to poetry, to music.
When he was called in 1900 to the University of Rome from Turin, he was invited to give the opening speech of the academic year. Volterra was President of the Accademia dei Lincei in the years 1923-1926. He was also the founder of the Italian Society for the Advancement of Science and of the National Council of Research. For many years he was one of the most productive scientists and a very influential personality in public life. When Fascism took power in Italy, Volterra did not accept any compromise and preferred to leave his public and academic activities.

Vocabulary
pastor 牧师
theology 神学
strain 紧张、疲惫
relentless 无情的
prolific 多产的
depression 抑郁；萧条，不景气
essence 本质，要素
hospitalize 住进医院
thesis 论文
transcendental number 超越数
sensation 感觉，引起兴趣的事
controversy 争论，辩论
transfinite 超限的

Note
0. This text consists of several short passages introducing the lives of mathematicians, written in the biographical genre.

Random Matrices


Random matrix

In probability theory and mathematical physics, a random matrix is a matrix-valued random variable. Many important properties of physical systems can be represented mathematically as matrix problems. For example, the thermal conductivity of a lattice can be computed from the dynamical matrix of the particle-particle interactions within the lattice.

Motivation

Physics
In nuclear physics, random matrices were introduced by Eugene Wigner [1] to model the spectra of heavy atoms. He postulated that the spacings between the lines in the spectrum of a heavy atom should resemble the spacings between the eigenvalues of a random matrix, and should depend only on the symmetry class of the underlying evolution.[2] In solid-state physics, random matrices model the behaviour of large disordered Hamiltonians in the mean field approximation.
In quantum chaos, the Bohigas-Giannoni-Schmit (BGS) conjecture [3] asserts that the spectral statistics of quantum systems whose classical counterparts exhibit chaotic behaviour are described by random matrix theory.
Random matrix theory has also found applications to quantum gravity in two dimensions,[4] mesoscopic physics,[5] and more.[6][7][8][9][10]

Mathematical statistics and numerical analysis
In multivariate statistics, random matrices were introduced by John Wishart for statistical analysis of large samples;[11] see estimation of covariance matrices.
Significant results have been shown that extend the classical scalar Chernoff, Bernstein, and Hoeffding inequalities to the largest eigenvalues of finite sums of random Hermitian matrices.[12] Corollary results are derived for the maximum singular values of rectangular matrices.
In numerical analysis, random matrices have been used since the work of John von Neumann and Herman Goldstine [13] to describe computation errors in operations such as matrix multiplication.
See also [14] for more recent results.

Number theory
In number theory, the distribution of zeros of the Riemann zeta function (and other L-functions) is modelled by the distribution of eigenvalues of certain random matrices.[15] The connection was first discovered by Hugh Montgomery and Freeman J. Dyson. It is connected to the Hilbert-Pólya conjecture.

Gaussian ensembles
The most studied random matrix ensembles are the Gaussian ensembles.
The Gaussian unitary ensemble GUE(n) is described by the Gaussian measure with density
$$\frac{1}{Z_{\mathrm{GUE}(n)}} e^{-\frac{n}{2} \operatorname{tr} H^2}$$
on the space of n × n Hermitian matrices H = (H_ij), i, j = 1, …, n. Here Z_GUE(n) = 2^{n/2} π^{n²/2} is a normalisation constant, chosen so that the integral of the density is equal to one. The term unitary refers to the fact that the distribution is invariant under unitary conjugation. The Gaussian unitary ensemble models Hamiltonians lacking time-reversal symmetry.
The Gaussian orthogonal ensemble GOE(n) is described by the Gaussian measure with density
$$\frac{1}{Z_{\mathrm{GOE}(n)}} e^{-\frac{n}{4} \operatorname{tr} H^2}$$
on the space of n × n real symmetric matrices H = (H_ij), i, j = 1, …, n. Its distribution is invariant under orthogonal conjugation, and it models Hamiltonians with time-reversal symmetry.
The Gaussian symplectic ensemble GSE(n) is described by the Gaussian measure with density
$$\frac{1}{Z_{\mathrm{GSE}(n)}} e^{-n \operatorname{tr} H^2}$$
on the space of n × n quaternionic Hermitian matrices H = (H_ij), i, j = 1, …, n. Its distribution is invariant under conjugation by the symplectic group, and it models Hamiltonians with time-reversal symmetry but no rotational symmetry.
The joint probability density for the eigenvalues λ₁, λ₂, …, λ_n of GUE/GOE/GSE is given by
$$\frac{1}{Z_{\beta,n}} \prod_{k<l} \left|\lambda_k - \lambda_l\right|^{\beta} e^{-\frac{n\beta}{4} \sum_k \lambda_k^2}, \qquad (1)$$
where β = 1 for GOE, β = 2 for GUE, and β = 4 for GSE; Z_{β,n} is a normalisation constant which can be explicitly computed, see Selberg integral.
In the case of GUE (β = 2), the formula (1) describes a determinantal point process.

Generalisations
Wigner matrices are random Hermitian matrices such that the entries above the main diagonal are independent random variables with zero mean and identical second moments.
Invariant matrix ensembles are random Hermitian matrices with density on the space of real symmetric / Hermitian / quaternionic Hermitian matrices of the form
$$\frac{1}{Z_n} e^{-n \operatorname{tr} V(H)},$$
where the function V is called the potential.
The Gaussian ensembles are the only common special cases of these two classes of random matrices.

Spectral theory of random matrices
The spectral theory of random matrices studies the distribution of the eigenvalues as the size of the matrix goes to infinity.

Global regime
In the global regime, one is interested in the distribution of linear statistics of the form N_{f,H} = n⁻¹ tr f(H).

Empirical spectral measure
The empirical spectral measure μ_H of H is defined by
$$\mu_H = \frac{1}{n} \sum_{j=1}^{n} \delta_{\lambda_j},$$
where λ₁, …, λ_n are the eigenvalues of H. Usually, the limit of μ_H is a deterministic measure; this is a particular case of self-averaging. The cumulative distribution function of the limiting measure is called the integrated density of states and is denoted N(λ). If the integrated density of states is differentiable, its derivative is called the density of states and is denoted ρ(λ).
The limit of the empirical spectral measure for Wigner matrices was described by Eugene Wigner, see Wigner's law. A more general theory was developed by Marchenko and Pastur.[16][17]
The limit of the empirical spectral measure of invariant matrix ensembles is described by a certain integral equation which arises from potential theory.[18]

Fluctuations
For the linear statistics N_{f,H} = n⁻¹ ∑ f(λ_j), one is also interested in the fluctuations about ∫ f(λ) dN(λ).
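Wigner's law, mentioned above as the limit of the empirical spectral measure for Wigner matrices, is easy to observe numerically. The sketch below (Python with NumPy; the size and normalization are my own choices) samples a GOE-type matrix H/√n and checks two consequences of the semicircle distribution on [−2, 2]: the spectrum stays near that interval, and its second moment approaches ∫ x² ρ_sc(x) dx = 1.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# GOE-type Wigner matrix: symmetrize an iid Gaussian matrix, then scale by 1/sqrt(n).
X = rng.standard_normal((n, n))
H = (X + X.T) / np.sqrt(2)
eigs = np.linalg.eigvalsh(H / np.sqrt(n))

# The limiting empirical spectral measure is the semicircle on [-2, 2] ...
assert eigs.min() > -2.3 and eigs.max() < 2.3

# ... whose second moment is 1; the empirical second moment should be close.
second_moment = np.mean(eigs**2)
assert abs(second_moment - 1.0) < 0.1

print("spectral range:", (round(float(eigs.min()), 2), round(float(eigs.max()), 2)),
      " second moment:", round(float(second_moment), 3))
```

The finite-size edge fluctuations beyond ±2 are of order n^{-2/3}, which is why a small tolerance is left in the range check.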
For many classes of random matrices, a central limit theorem for these fluctuations is known, see [19][20] etc.

Local regime
In the local regime, one is interested in the spacings between eigenvalues and, more generally, in the joint distribution of eigenvalues in an interval of length of order 1/n. One distinguishes between bulk statistics, pertaining to intervals inside the support of the limiting spectral measure, and edge statistics, pertaining to intervals near the boundary of the support.

Bulk statistics
Formally, fix λ₀ in the interior of the support of N(λ). Then consider the point process
$$\Xi(\lambda_0) = \sum_j \delta\!\left(\,\cdot\, - n \rho(\lambda_0)\,(\lambda_j - \lambda_0)\right),$$
where λ_j are the eigenvalues of the random matrix.
The point process Ξ(λ₀) captures the statistical properties of eigenvalues in the vicinity of λ₀. For the Gaussian ensembles, the limit of Ξ(λ₀) is known;[2] thus, for GUE it is a determinantal point process with the kernel
$$K(x, y) = \frac{\sin \pi(x - y)}{\pi(x - y)}$$
(the sine kernel).
The universality principle postulates that the limit of Ξ(λ₀) as n → ∞ should depend only on the symmetry class of the random matrix (and neither on the specific model of random matrices nor on λ₀). This was rigorously proved for several models of random matrices: for invariant matrix ensembles,[21][22] for Wigner matrices,[23][24] etc.

Edge statistics
See Tracy-Widom distribution.

Other classes of random matrices

Wishart matrices
Main article: Wishart distribution
Wishart matrices are n × n random matrices of the form H = X X*, where X is an n × n random matrix with independent entries, and X* is its conjugate transpose.
In the important special case considered by Wishart, the entries of X are identically distributed Gaussian random variables (either real or complex).The limit of the empirical spectral measure of Wishart matrices was found[16] by Vladimir Marchenko and Leonid Pastur, see Marchenko–Pastur distribution.Random unitary matricesSee circular ensemblesNon-Hermitian random matricesSee circular law.Guide to references∙Books on random matrix theory:[2][25]∙Survey articles on random matrix theory:[14][17][26][27]∙Historic works:[1][11][13]References1.^a b Wigner, E. (1955). "Characteristic vectors ofbordered matrices with infinite dimensions". Ann. Of Math.62 (3): 548–564. doi:10.2307/1970079.2.^a b c Mehta, M.L. (2004). Random Matrices.Amsterdam: Elsevier/Academic Press. ISBN0-120-88409-7.3.^Bohigas, O.; Giannoni, M.J.; Schmit, Schmit (1984)."Characterization of Chaotic Quantum Spectra andUniversality of Level Fluctuation Laws". Phys. Rev. Lett.52: 1–4. Bibcode1984PhRvL..52....1B.doi:10.1103/PhysRevLett.52.1./doi/10.1103/PhysRevLett.52.1.4.^Franchini F, Kravtsov VE (October 2009). "Horizon inrandom matrix theory, the Hawking radiation, and flow ofcold atoms". Phys. Rev. Lett.103 (16): 166401. Bibcode2009PhRvL.103p6401F.doi:10.1103/PhysRevLett.103.166401. PMID19905710./doi/10.1103/PhysRevLett.103.166401.5.^Sánchez D, Büttiker M (September 2004)."Magnetic-field asymmetry of nonlinear mesoscopictransport". Phys. Rev. Lett.93 (10): 106802.arXiv:cond-mat/0404387. Bibcode2004PhRvL..93j6802S.doi:10.1103/PhysRevLett.93.106802. PMID15447435./doi/10.1103/PhysRevLett.93.106802.6.^Rychkov VS, Borlenghi S, Jaffres H, Fert A, WaintalX (August 2009). "Spin torque and waviness in magneticmultilayers: a bridge between Valet-Fert theory and quantum approaches". Phys. Rev. Lett.103 (6): 066602. Bibcode2009PhRvL.103f6602R.doi:10.1103/PhysRevLett.103.066602. PMID19792592./doi/10.1103/PhysRevLett.103.066602.7.^Callaway DJE (April 1991). 
"Random matrices,fractional statistics, and the quantum Hall effect". Phys. Rev.,B Condens. Matter43 (10): 8641–8643. Bibcode1991PhRvB..43.8641C. doi:10.1103/PhysRevB.43.8641.PMID9996505./doi/10.1103/PhysRevB.43.8641.8.^Janssen M, Pracz K (June 2000). "Correlated randomband matrices: localization-delocalization transitions". Phys Rev E Stat Phys Plasmas Fluids Relat Interdiscip Topics61(6 Pt A): 6278–86. arXiv:cond-mat/9911467. Bibcode2000PhRvE..61.6278J. doi:10.1103/PhysRevE.61.6278.PMID11088301./doi/10.1103/PhysRevE.61.6278.9.^Zumbühl DM, Miller JB, Marcus CM, Campman K,Gossard AC (December 2002). "Spin-orbit coupling,antilocalization, and parallel magnetic fields in quantumdots". Phys. Rev. Lett.89 (27): 276803.arXiv:cond-mat/0208436. Bibcode2002PhRvL..89A6803Z.doi:10.1103/PhysRevLett.89.276803. PMID12513231./doi/10.1103/PhysRevLett.89.276803.10.^Bahcall SR (December 1996). "Random Matrix Modelfor Superconductors in a Magnetic Field". Phys. Rev. Lett.77(26): 5276–5279. arXiv:cond-mat/9611136. Bibcode1996PhRvL..77.5276B. doi:10.1103/PhysRevLett.77.5276.PMID10062760./doi/10.1103/PhysRevLett.77.5276.11.^a b Wishart, J. (1928). "Generalized product momentdistribution in samples". Biometrika20A (1–2): 32–52.12.^Tropp, J. (2011). "User-Friendly Tail Bounds for Sumsof Random Matrices". Foundations of ComputationalMathematics. doi:10.1007/s10208-011-9099-z.13.^a b von Neumann, J.; Goldstine, H.H. (1947)."Numerical inverting of matrices of high order". Bull. Amer.Math. Soc.53 (11): 1021–1099.doi:10.1090/S0002-9904-1947-08909-6.14.^a b Edelman, A.; Rao, N.R (2005). "Random matrixtheory". Acta Numer.14: 233–297.doi:10.1017/S0962492904000236.15.^Keating, Jon (1993). "The Riemann zeta-function andquantum chaology". Proc. Internat. School of Phys. Enrico Fermi CXIX: 145–185.16.^a b.Marčenko, V A; Pastur, L A (1967). "Distributionof eigenvalues for some sets of random matrices".Mathematics of the USSR-Sbornik1 (4): 457–483. 
Bibcode 1967SbMat...1..457M.doi:10.1070/SM1967v001n04ABEH001994./0025-5734/1/4/A01;jsessionid=F84 D52B000FBFEFEF32CCB5FD3CEF2A1.c1.17.^a b Pastur, L.A. (1973). "Spectra of random self-adjointoperators". Russ. Math. Surv.28(1): 1−67. Bibcode1973RuMaS..28....1P.doi:10.1070/RM1973v028n01ABEH001396.18.^Pastur, L.; Shcherbina, M.; Shcherbina, M. (1995)."On the Statistical Mechanics Approach in the RandomMatrix Theory: Integrated Density of States". J. Stat. Phys.79 (3–4): 585−611. Bibcode1995JSP....79..585D.doi:10.1007/BF02184872.19.^Johansson, K. (1998). "On fluctuations of eigenvaluesof random Hermitian matrices". Duke Math. J.91 (1):151–204. doi:10.1215/S0012-7094-98-09108-6.20.^Pastur, L.A. (2005). "A simple approach to the globalregime of Gaussian ensembles of random matrices".Ukrainian Math. J.57 (6): 936–966.doi:10.1007/s11253-005-0241-4.21.^Pastur, L.; Shcherbina, M. (1997). "Universality of thelocal eigenvalue statistics for a class of unitary invariantrandom matrix ensembles". J. Statist. Phys.86 (1–2):109−147. Bibcode1997JSP....86..109P.doi:10.1007/BF02180200.22.^Deift, P.; Kriecherbauer, T.; McLaughlin, K.T.-R.;Venakides, S.; Zhou, X. (1997). "Asymptotics forpolynomials orthogonal with respect to varying exponential weights". Internat. Math. Res. Notices16 (16): 759−782.doi:10.1155/S1073792897000500.23.^Erdős, L.; Péché, S.; Ramírez, J.A.; Schlein, B.; Yau,H.T. (2010). "Bulk universality for Wigner matrices". Comm.Pure Appl. Math.63 (7): 895–925.24.^Tao., T.; Vu, V. (2010). "Random matrices:universality of local eigenvalue statistics up to the edge".Comm. Math. Phys.298(2): 549−572. Bibcode2010CMaPh.298..549T. doi:10.1007/s00220-010-1044-5.25.^Anderson, G.W.; Guionnet, A.; Zeitouni, O. (2010).An introduction to random matrices.. Cambridge: CambridgeUniversity Press. ISBN978-0-521-19452-5.26.^Diaconis, Persi (2003). "Patterns in eigenvalues: the70th Josiah Willard Gibbs lecture". American MathematicalSociety. Bulletin. 
New Series 40 (2): 155-178. doi:10.1090/S0273-0979-03-00975-3. MR1962294.
27. ^ Diaconis, Persi (2005). "What is ... a random matrix?". Notices of the American Mathematical Society 52 (11): 1348-1349. ISSN 0002-9920. MR2183871.

External links
Fyodorov, Y. (2011). "Random matrix theory". Scholarpedia 6 (3): 9886.
Weisstein, E. W. "Random Matrix". MathWorld: A Wolfram Web Resource.

A random matrix is a matrix of given type and size whose entries consist of random numbers from some specified distribution. Random matrix theory is cited as one of the "modern tools" used in Catherine's proof of an important result in prime number theory in the 2005 film Proof.

For a real n × n matrix with elements having a standard normal distribution, the expected number of real eigenvalues E_n can be written in closed form in terms of a hypergeometric function and a beta function (Edelman et al. 1994, Edelman and Kostlan 1994); asymptotically E_n grows like √(2n/π). Edelman (1997) showed that the probability that all n eigenvalues are real is 2^{−n(n−1)/4}, which is the smallest of the probabilities of obtaining any given number of real eigenvalues. The entire probability function of the number of real eigenvalues in the spectrum of a Gaussian real random matrix was derived by Kanzieper and Akemann (2005) as a sum over partitions involving zonal polynomials; the arguments depend on the parity of the matrix dimension and are expressed through matrix traces, generalized Laguerre polynomials, and the complementary error function erfc (Kanzieper and Akemann 2005).

Edelman (1997) also proved that the density of a random complex pair of eigenvalues of a real matrix whose elements are taken from a standard normal distribution can be expressed through the complementary error function erfc and the gamma function. Integrating over the upper half-plane (and multiplying by 2) gives the expected number of complex eigenvalues; the first few values appear in Sloane's A052928, A093605, and A046161 (Edelman 1997).

Girko's circular law considers the eigenvalues (possibly complex) of a set of random real matrices with entries independent and taken from a standard normal distribution, and describes their limiting distribution as the matrix size grows. Wigner's semicircle law states that for large symmetric real matrices with elements taken from a distribution satisfying certain rather general properties, the distribution of eigenvalues is the semicircle function.

If 2 × 2 matrices are chosen with probability 1/2 from one of
$$\begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}, \qquad \begin{pmatrix} 0 & 1 \\ 1 & -1 \end{pmatrix},$$
and M_n denotes the product of the first n choices, then
$$\lim_{n \to \infty} \|M_n\|^{1/n} = 1.13198824\ldots$$
(Sloane's A078416), where ‖·‖ denotes the matrix spectral norm (Bougerol and Lacroix 1985, pp. 11 and 157; Viswanath 2000). This is the same constant appearing in the random Fibonacci sequence. The following Mathematica code can be used to estimate (the logarithm of) this constant.

    With[{n = 100000},
      m = Fold[Dot, IdentityMatrix[2],
            {{0, 1}, {1, #}} & /@ RandomChoice[{-1, 1}, {n}]] // N;
      Log[Sqrt[Max[Eigenvalues[Transpose[m] . m]]]]/n]

SEE ALSO: Complex Matrix, Girko's Circular Law, Integer Matrix, Matrix, Random Fibonacci Sequence, Real Matrix, Wigner's Semicircle Law

REFERENCES:
Bougerol, P. and Lacroix, J. Random Products of Matrices with Applications to Schrödinger Operators. Basel, Switzerland: Birkhäuser, 1985.
Chassaing, P.; Letac, G.; and Mora, M.
"Brocot Sequences and Random Walks on ." In Probability Measures on Groups VII(Ed. H. Heyer). New York Springer-Verlag, pp. 36-48, 1984. Edelman, A. "The Probability that a Random Real Gaussian Matrix has Real Eigenvalues, Related Distributions, and the CircularLaw." J. Multivariate Anal.60, 203-232, 1997.Edelman, A. and Kostlan, E. "How Many Zeros of a Random Polynomial are Real?" Bull. Amer. Math. Soc.32, 1-37, 1995. Edelman, A.; Kostlan, E.; and Shub, M. "How Many Eigenvalues of a Random Matrix are Real?" J. Amer. Math. Soc.7, 247-267, 1994.Furstenberg, H. "Non-Commuting Random Products." Trans. Amer. Math. Soc.108, 377-428, 1963.Furstenberg, H. and Kesten, H. "Products of Random Matrices." Ann. Math. Stat.31, 457-469, 1960.Girko, V. L. Theory of Random Determinants. Boston, MA: Kluwer, 1990.Kanzieper, E. and Akemann, G. "Statistics of Real Eigenvalues in Ginibre's Ensemble of Random Real Matrices." Phys. Rev. Lett.95, 230201-1-230201-4, 2005.Katz, M. and Sarnak, P. Random Matrices, Frobenius Eigenvalues, and Monodromy. Providence, RI: Amer. Math. Soc., 1999.Lehmann, N. and Sommers, H.-J. "Eigenvalue Statistics of Random Real Matrices." Phys. Rev. Let.67, 941-944, 1991.Mehta, M. L. Random Matrices, 3rd ed. New York: Academic Press, 1991.Sloane, N. J. A. Sequences A046161,A052928,A078416, andA093605in "The On-Line Encyclopedia of Integer Sequences." Viswanath, D. "Random Fibonacci Sequences and the Number1.13198824...." Math. Comput.69, 1131-1155, 2000.CITE THIS AS:Weisstein, Eric W."Random Matrix." From MathWorld--A Wolfram Web Resource./RandomMatrix.htmlPeter ForresterEmail:**********************.edu.auPhone: +61 (0)3 8344 9683Department of Mathematics and Statistics, The University of Melbourne, Parkville, Vic 3010, Australia.Research InterestsRandom matrices.Random matrix theory is concerned with giving analytic statistical properties of the eigenvalues and eigenvectors of matrices defined by a statistical distribution. 
It is found that the statistical properties are to a large extent independent of the underlying distribution, and dependent only on global symmetry properties of the matrix. Moreover, these same statistical properties are observed in many diverse settings: the spectra of complex quantum systems such as heavy nuclei, the Riemann zeros, the spectra of single-particle quantum systems with chaotic dynamics, the eigenmodes of vibrating plates, amongst other examples.

Imposing symmetry constraints on random matrices leads to relationships with Lie algebras and symmetric spaces, and the internal symmetry of these structures shows itself as a reflection group symmetry exhibited by the eigenvalue probability densities. The calculation of eigenvalue correlation functions requires orthogonal polynomials, skew orthogonal polynomials, determinants and Pfaffians. The calculation of spacing distributions involves many manifestations of integrable systems theory, in particular Painlevé equations, isomonodromy deformation of differential equations, and the Riemann-Hilbert problem. Topics of ongoing study include the asymptotics of spacing distributions, eigenvalue distributions in the complex plane and low-rank perturbations of the classical random matrix ensembles.

Macdonald polynomial theory.
Over forty years ago the many-body Schrödinger operator with 1/r² pair potential was isolated as having special properties. Around fifteen years ago families of commuting differential/difference operators based on root systems were identified and subsequently shown to underlie the theory of Macdonald polynomials, which are multivariable orthogonal polynomials generalizing the Schur polynomials. In fact these commuting operators can be used to write the 1/r² Schrödinger operator in a factorized form, and the multivariable polynomials are essentially the eigenfunctions. This has the consequence that ground state dynamical correlations can be computed.
They explicitly exhibit the fractional statistical charge carried by the elementary excitations. This latter notion is the cornerstone of Laughlin's theory of the fractional quantum Hall effect, which earned him the 1998 Nobel prize for physics. The calculation of correlations requires knowledge of special properties of the multivariable polynomials, much of which follows from thepresence of a Hecke algebra structure. The study of these special structures is an ongoing project.Statistical mechanics and combinatorics.Counting configurations on a lattice is a basic concern in the formalism of equilibrium statistical mechanics. Of the many counting problems encountered in this setting, one attracting a good deal of attention at present involves directed non-intersecting paths on a two-dimensional lattice. There are bijections between such paths and Young tableaux, which in turn are in bijective correspondence with generalized permutations and integer matrices. This leads to a diverse array of model systems which relate to random paths: directed percolation, tilings, asymmetric exclusion and growth models to name a few. The probability density functions which arise typically have the same form as eigenvalue probability density functions in random matrix theory, except the analogue of the eigenvalues are discrete. One is thus led to consider discrete orthogonal polynomials and integrable systems based on difference equations. The Schur functions are fundamentally related tonon-intersecting paths, and this gives rise to interplay with Macdonald polynomial theory.Statistical mechanics of log-potential Coulomb systems.The logarithmic potential is intimately related to topological charge -- for example vortices in a fluid carry a topological charge determined by the circulation, and the energy between two vortices is proportional to the logarithm of the separation. 
The logarithmic potential is also the potential between two-dimensional electric charges, so properties of the two-dimensional Coulomb gas can be directly related to properties of systems with topological charges. In a celebrated analysis, Kosterlitz and Thouless identified a pairing phase transition in the two-dimensional Coulomb gas. They immediately realized that this mechanism, with the vortices playing the role of the charges, was responsible for the superfluid--normal fluid transition in liquid Helium films. In my studies of thetwo-dimensional Coulomb gas I have exploited the fact that at a special value of the coupling the system is equivalent to the Dirac field and so is exactly solvable. This has provided an analytic laboratory on which to test approximate physical theories, and has also led to the discovery of new universal features of Coulomb systems in their conductive phase.My book `Log-gases and Random matrices' (PUP, 2010)I started working on this project in August 1994, and finished (apart from minor changes) 15 years later.It can be browsed from its Princeton University Press web page.Department of Mathematics Personal Home pages. Department of Mathematics home page.Maths home pageRandom MatricesDate: Tuesday 29 May - Friday 1 June 2012Venue: Mathematik-Zentrum, Lipschitz Lecture Hall, Endenicher Allee 60, BonnOrganizers: Holger Rauhut, Patrik Ferrari, Benjamin SchleinAbstractRandom Matrices and their analysis play an important role in various areas, such as mathematical physics, statistics, Banach space geometry, signal processing (compressive sensing), analysis of optimization algorithms, growth models and more. The interaction with application fields, in particular, has triggered high research activity in random matrix theory recently. The proposed workshop aims at bringing together experts and junior researchers working onvarious aspects of random matrices, and to report on recent advances. 
In particular, we aim at identifying possible new directions and methods that may arise from the combination of different expertise.

Random Matrix Theory and Applications in Theoretical Sciences
Date: December 15 - 17, 2011
Convenors: Gernot Akemann (Bielefeld), Igor Krasovsky (London), Dmitry Savin (Brunel), Igor Smolyarenko (Brunel)
The aim of this workshop is to bring together physicists and mathematicians who work in the area of Random Matrix Theory in a broad sense. The concept of matrices with stochastic matrix elements appears in many modern developments in pure and applied mathematics, physics and other sciences. This workshop will be devoted to topics which have seen a very rapid development in recent years. These include the study of universality using complex analysis and probability theory, the physics of quantum computation and entanglement, turbulence, and the evaluation of economic risk. One of the purposes of the conference is to intensify collaborations between Germany and the United Kingdom in these areas of research, where several highly active international centers are located. We will build upon the experience and networking established through previous workshops, especially a series of annual international workshops based at Brunel University London in 2005-2010. We encourage active participation by students and young scientists, who will comprise a significant fraction of the total number of about 40-50 participants.

Large Dimensional Random Matrix Theory and Its Applications
Principal investigator: Bai Zhidong
Lead institution: Northeast Normal University
Keywords: random matrices; distributions; eigenvalues; eigenvectors
Abstract: Large dimensional random matrices first appeared in the 1950s, proposed as a branch of physics in nuclear physics for the analysis of spectroscopic data.

GTM (Graduate Texts in Mathematics) Catalog

vol


1
2 Measure and Category
3
4 A Course in Homological Algebra
5 Categories for the Working Mathematician
6 Projective Planes
7 A Course in Arithmetic
8
9 Introduction to Lie Algebras and Representation Theory
10
11 Functions of One Complex Variable
12
13 Rings and Categories of Modules
14 Stable Mappings and Their Singularities
43
44
45 Probability Theory I
46 Probability Theory II
47
48 General Relativity for Mathematicians
49
50 Fermat's Last Theorem
51
52 Algebraic Geometry
53 A Course in Mathematical Logic
54
55
56
57
58 p-adic Numbers, p-adic Analysis, and Zeta-Functions
59
60 Mathematical Methods of Classical Mechanics
61 Elements of Homotopy Theory
62
63
64
65 Differential Analysis on Complex Manifolds
66 Introduction to Affine Group Schemes
67 Local Fields
68
69
70
71 Riemann Surfaces
72 Classical Topology and Combinatorial Group Theory
73 Algebras
74 Multiplicative Number Theory
75
76 Algebraic Geometry: Birational Geometry of Algebraic Varieties
77
78 A Course in Universal Algebra
79 An Introduction to Ergodic Theory
80
81
82 Differential Forms in Algebraic Topology
83 Introduction to Cyclotomic Fields
84 A Classical Introduction to Modern Number Theory
85

Common Mathematical Vocabulary: Chinese-English Glossary

算术的基础词汇对照表和sum差difference积product商quotient加法addition减法subtraction乘法multiplication除法division余数remainder符号sign实数real number整数integers自然数natural numbers有理数rational number无理数irrational number分数fractions分子numerator分母denominator正positive负negative零zero无限大infinity复素数complex number复素平面complex plane实数部real part虚数部imaginary part绝対值absolute value, modulus 总和summation定数constant系数coefficient变量variable函数function演算operation平方square(d)立方cube(d)平方根square root立方根cubic root乘power比率ratio比例proportional (to)方程式equation根、解root, solution等式equality不等式inequality右边right-hand side左边left-hand side等于equal to大于(小于)greater than ~(less than ~)~以上(以下)greater than or equal to ~(less than or equal to ~)无限infinite有限finite输入input输出output代入substitute变换transform存在exist假定assume证明prove分析analyze代数演算的词汇对照线性演算linear operation因数factor因数分解factorization因数分解factorize对数logarithm对数的底base一次結合linear combination一维空间one-dimensional spacen维空间n-dimensional space向量空间vector space维数dimension次数degree欧几里得空间Euclidean space非线性non-linear非齐次inhomogeneous复数平面complex plane齐次函数homogeneous function联立方程式simultaneous equations矩阵matrix ([pl.] matrices)行row列column逆矩阵matrix inverse矩阵转置transpose of matrix线性无关linearly independent线性相关linearly dependent特征值eigenvalue特征向量eigenvector特征值问题eigenvalue problem行列式determinant迹trace阶数rank对角矩阵diagonal matrix对角元素diagonal elements非对角元素off-diagonal elements对角化diagonalize成员component内积inner product微积分、解析词汇对照连续函数continuous function微分differentiate微分differential可微分differentiable微分算子differential operator差分difference导数derived function, derivative微分方程differential equation常微分方程ordinary differential equation偏微分方程partial differential equation微分方程组simultaneous differential equations 平凡解general solution特殊解particular solution拉格朗日乘子Lagrange multiplier常数项constant term积分integrate积分integral可积integrable不定积分indefinite integral定积分definite integral任意常数arbitrary constant展开expandMaclaurin展开Maclaurin expansionTaylor展开Taylor expansion极大值maximal value极小值minimal value最大值maximum ([pl. ] maxima)最小值minimum ([pl. 
] minima)测度measure可测measurable函数的functional拓扑词汇对照拓扑topology・相对拓扑relative topology・弱拓扑weak topology・离散拓扑discrete topology扩张extension限制restriction分离公理separation axiom求和公理axiom of countability可数集合countable set范数norm距离metric, distance距离空间metric space收敛converge・一致收敛uniform(ly)・逐点收敛pointwise发散diverge闭集合closed set开集合open set闭包closure极限点point of closure内点inner point邻域neighborhood过滤器filter局部locally极限limit基本列fundamental sequence Cauchy列Cauchy sequence子列subsequence连续continuous・一致连续uniformly continuous・同等连续equicontinuous紧compact紧化compactification有限交性finite intersection property 有限覆盖finite covering一致uniformly完备complete上界upper bound下界lower bound有界bounded全有界totally bounded上限least upper bound(L.U.B.), supremum (sup) 下限greatest lower bound(G.L.B.), infimum (inf) 稠密dense可分separable连接connectedHilbert空间Hilbert spaceBanach空间Banach space同胚homeomorphic数学的一般理论词汇英文对照科学science算术arithmetic几何学geometry代数algebra微积分calculus解析学analysis概率论probability theory统计学statistics方法method分析analysis逻辑logic理论theory定义definition命题proposition假说hypothesis公理axiom要件postulate定理theorem证明proof假定assumption结论conclusion证明终止Q.E.D. (quod erat demonstrundum)引理lemma系corollary反例counter-example反证法reductio ad absurdum对偶contraposition逆converse恒等式identity英语文献常用词及其缩写Abstracts Abstr. 文摘Abbreviation 缩语和略语Acta 学报Advances 进展Annals Anna. 纪事Annual Annu. 年鉴,年度Semi-Annual 半年度Annual Review 年评Appendix Appx 附录Archives 文献集Association Assn 协会Author 作者Bibliography 书目,题录Biological Abstract BA 生物学文摘Bulletin 通报,公告Chemical Abstract CA 化学文摘Citation Cit 引文,题录Classification 分类,分类表College Coll. 学会,学院Compact Disc-Read Only Memory CD-ROM 只读光盘Company Co. 公司Content 目次Co-term 配合词,共同词Cross-references 相互参见Digest 辑要,文摘Directory 名录,指南Dissertations Diss. 学位论文Edition Ed. 版次Editor Ed. 编者、编辑Excerpta Medica EM 荷兰《医学文摘》Encyclopedia 百科全书The Engineering Index Ei 工程索引Et al 等等European Patent Convertion EPC 欧洲专利协定Federation 联合会Gazette 报,公报Guide 指南Handbook 手册Heading 标题词Illustration Illus. 插图Index 索引Cumulative Index 累积索引Index Medicus IM 医学索引Institute Inst. 
学会、研究所International Patent Classification IPC 国际专利分类法International Standard Book Number ISBN 国际标准书号International Standard Series Number ISSN 国际标准刊号Journal J. 杂志、刊Issue 期(次)Keyword 关键词Letter Let. 通讯、读者来信List 目录、一览表Manual 手册Medical Literature Analysis and MADLARS 医学文献分析与检索系统Retrieval SystemMedical Subject Headings MeSH 医学主题词表Note 札记Papers 论文Patent Cooperation Treaty PCT 国际专利合作条约Precision Ratio 查准率Press 出版社Procceedings Proc. 会报、会议录Progress 进展Publication Publ. 出版物Recall Ratio 查全率Record 记录、记事Report 报告、报导Review 评论、综述Sciences Abstracts SA 科学文摘Section Sec. 部分、辑、分册See also 参见Selective Dissemination of Information SDI 定题服务Seminars 专家讨论会文集Series Ser. 丛书、辑Society 学会Source 来源、出处Subheadings 副主题词Stop term 禁用词Subject 主题Summary 提要Supplement Suppl. 附刊、增刊Survey 概览Symposium Symp. 专题学术讨论会Thesaurus 叙词表、词库Title 篇名、刊名、题目Topics 论题、主题Transactions 学报、汇刊Volume Vol. 卷World Intellectual Property Organization WIPO 世界知识产权World Patent Index WPI 世界专利索引Yearbook 年鉴一般词汇数学mathematics, maths(BrE), math(AmE)公理axiom定理theorem计算calculation运算operation证明prove假设hypothesis, hypotheses(pl.)命题proposition算术arithmetic加plus(prep.), add(v.), addition(n.)被加数augend, summand加数addend和sum减minus(prep.), subtract(v.), subtraction(n.)被减数minuend减数subtrahend差remainder乘times(prep.), multiply(v.), multiplication(n.)被乘数multiplicand, faciend乘数multiplicator积product除divided by(prep.), divide(v.), division(n.) 
被除数dividend除数divisor商quotient等于equals, is equal to, is equivalent to大于is greater than小于is lesser than大于等于is equal or greater than小于等于is equal or lesser than运算符operator数字digit数number自然数natural number整数integer小数decimal小数点decimal point分数fraction分子numerator分母denominator比ratio正positive负negative零null, zero, nought, nil十进制decimal system二进制binary system十六进制hexadecimal system权weight, significance进位carry截尾truncation四舍五入round下舍入round down上舍入round up有效数字significant digit无效数字insignificant digit代数algebra公式formula, formulae(pl.)单项式monomial多项式polynomial, multinomial系数coefficient未知数unknown, x-factor, y-factor, z-factor 等式,方程式equation一次方程simple equation二次方程quadratic equation三次方程cubic equation四次方程quartic equation不等式inequation阶乘factorial对数logarithm指数,幂exponent乘方power二次方,平方square三次方,立方cube四次方the power of four, the fourth power n次方the power of n, the nth power开方evolution, extraction二次方根,平方根square root三次方根,立方根cube root四次方根the root of four, the fourth root n次方根the root of n, the nth root集合aggregate元素element空集void子集subset交集intersection并集union补集complement映射mapping函数function定义域domain, field of definition 值域range常量constant变量variable单调性monotonicity奇偶性parity周期性periodicity图象image数列,级数series微积分calculus微分differential导数derivative极限limit无穷大infinite(a.) 
infinity(n.)无穷小infinitesimal积分integral定积分definite integral不定积分indefinite integral有理数rational number 无理数irrational number 实数real number虚数imaginary number 复数complex number矩阵matrix行列式determinant几何geometry点point线line面plane体solid线段segment射线radial平行parallel相交intersect角angle角度degree弧度radian锐角acute angle直角right angle钝角obtuse angle平角straight angle周角perigon底base边side高height三角形triangle锐角三角形acute triangle直角三角形right triangle直角边leg斜边hypotenuse勾股定理Pythagorean theorem钝角三角形obtuse triangle不等边三角形scalene triangle等腰三角形isosceles triangle等边三角形equilateral triangle四边形quadrilateral平行四边形parallelogram矩形rectangle长length宽width菱形rhomb, rhombus, rhombi(pl.), diamond 正方形square梯形trapezoid直角梯形right trapezoid等腰梯形isosceles trapezoid 五边形pentagon六边形hexagon七边形heptagon八边形octagon九边形enneagon十边形decagon十一边形hendecagon十二边形dodecagon多边形polygon正多边形equilateral polygon 圆circle圆心centre(BrE), center(AmE) 半径radius直径diameter圆周率pi弧arc半圆semicircle扇形sector环ring椭圆ellipse圆周circumference周长perimeter面积area轨迹locus, loca(pl.)相似similar全等congruent四面体tetrahedron五面体pentahedron六面体hexahedron平行六面体parallelepiped 立方体cube七面体heptahedron八面体octahedron九面体enneahedron十面体decahedron十一面体hendecahedron 十二面体dodecahedron 二十面体icosahedron多面体polyhedron棱锥pyramid棱柱prism棱台frustum of a prism旋转rotation轴axis圆锥cone圆柱cylinder圆台frustum of a cone球sphere半球hemisphere底面undersurface表面积surface area体积volume空间space坐标系coordinates坐标轴x-axis, y-axis, z-axis 横坐标x-coordinate纵坐标y-coordinate原点origin双曲线hyperbola抛物线parabola三角trigonometry正弦sine余弦cosine正切tangent余切cotangent正割secant余割cosecant反正弦arc sine反余弦arc cosine反正切arc tangent反余切arc cotangent反正割arc secant反余割arc cosecant相位phase周期period振幅amplitude内心incentre(BrE), incenter(AmE)外心excentre(BrE), excenter(AmE)旁心escentre(BrE), escenter(AmE)垂心orthocentre(BrE), orthocenter(AmE) 重心barycentre(BrE), barycenter(AmE) 内切圆inscribed circle外切圆circumcircle统计statistics平均数average加权平均数weighted average方差variance标准差root-mean-square deviation, standard deviation 
比例propotion百分比percent百分点percentage百分位数percentile排列permutation组合combination概率,或然率probability分布distribution正态分布normal distribution非正态分布abnormal distribution图表graph条形统计图bar graph柱形统计图histogram折线统计图broken line graph曲线统计图curve diagram扇形统计图pie diagram高等数学词汇Aabbreviation 简写符号;简写absolute error 绝对误差absolute value 绝对值accuracy 准确度acute angle 锐角acute-angled triangle 锐角三角形add 加addition 加法addition formula 加法公式addition law 加法定律addition law(of probability) (概率)加法定律additive property 可加性adjacent angle 邻角adjacent side 邻边algebra 代数algebraic 代数的algebraic equation 代数方程algebraic expression 代数式algebraic fraction 代数分式;代数分数式algebraic inequality 代数不等式algebraic operation 代数运算alternate angle (交)错角alternate segment 交错弓形altitude 高;高度;顶垂线;高线ambiguous case 两义情况;二义情况amount 本利和;总数analysis 分析;解析analytic geometry 解析几何angle 角angle at the centre 圆心角angle at the circumference 圆周角angle between a line and a plane 直?与平面的交角angle between two planes 两平面的交角angle bisection 角平分angle bisector 角平分线?;分角线?angle in the alternate segment 交错弓形的圆周角angle in the same segment 同弓形内的圆周角angle of depression 俯角angle of elevation 仰角angle of greatest slope 最大斜率的角angle of inclination 倾斜角angle of intersection 相交角;交角angle of rotation 旋转角angle of the sector 扇形角angle sum of a triangle 三角形内角和angles at a point 同顶角annum(X% per annum) 年(年利率X%)anti-clockwise direction 逆时针方向;返时针方向anti-logarithm 逆对数;反对数anti-symmetric 反对称approach 接近;趋近approximate value 近似值approximation 近似;略计;逼近Arabic system 阿刺伯数字系统arbitrary 任意arbitrary constant 任意常数arc 弧arc length 弧长arc-cosine function 反余弦函数arc-sin function 反正弦函数arc-tangent function 反正切函数area 面积arithmetic 算术arithmetic mean 算术平均;等差中顶;算术中顶arithmetic progression 算术级数;等差级数arithmetic sequence 等差序列arithmetic series 等差级数arm 边arrow 前号ascending order 递升序ascending powers of X X 的升幂associative law 结合律assumed mean 假定平均数assumption 假定;假设average 平均;平均数;平均值average speed 平均速率axiom 公理axis 轴axis of parabola 拋物线的轴axis of symmetry 对称轴Bback substitution 回代bar chart 棒形图;条线图;条形图;线条图base (1)底;(2)基;基数base angle 底角base area 底面base line 
底线?base number 底数;基数base of logarithm 对数的底bearing 方位(角);角方向(角)bell-shaped curve 钟形图bias 偏差;偏倚binary number 二进数binary operation 二元运算binary scale 二进法binary system 二进制binomial 二项式binomial expression 二项式bisect 平分;等分bisection method 分半法;分半方法bisector 等分线?;平分线? boundary condition 边界条件boundary line 界(线);边界bounded 有界的bounded above 有上界的;上有界的bounded below 有下界的;下有界的bounded function 有界函数brace 大括号bracket 括号breadth 阔度broken line graph 折线图Ccalculation 计算calculator 计算器;计算器cancel 消法;相消canellation law 消去律capacity 容量Cartesian coordinates 笛卡儿坐标Cartesian plane 笛卡儿平面category 类型;范畴central line 中线?central tendency 集中趋centre 中心;心centre of a circle 圆心centroid 形心;距心certain event 必然事件chance 机会change of base 基的变换change of subject 主项变换change of variable 换元;变量的换chart 图;图表checking 验算chord 弦chord of contact 切点弦circle 圆circular 圆形;圆的circular function 圆函数;三角函数circular measure 弧度法circumcentre 外心;外接圆心circumcircle 外接圆circumference 圆周circumradius 外接圆半径circumscribed circle 外接圆class 区;组;类class boundary 组界class interval 组区间;组距class limit 组限;区限class mark 组中点;区中点classification 分类clnometer 测斜仪clockwise dirction 顺时针方向closed convex region 闭凸区域closed interval 闭区间coefficient 系数coincide 迭合;重合collection of terms 并项collinear 共线?collinear planes 共线面column (1)列;纵行;(2) 柱combination 组合common chord 公弦common denominator 同分母;公分母common difference 公差common divisor 公约数;公约common factor 公因子;公因子common logarithm 常用对数common multiple 公位数;公倍common ratio 公比common tangetn 公切? 
commutative law 交换律comparable 可比较的compass 罗盘compass bearing 罗盘方位角compasses 圆规compasses construction 圆规作图complement 余;补余complementary angle 余角complementary event 互补事件complementary probability 互补概率completing the square 配方complex number 复数complex root 复数根composite number 复合数;合成数compound bar chart 综合棒形图compound discount 复折扣compound interest 复利;复利息computation 计算computer 计算机;电子计算器concave 凹concave downward 凹向下的concave polygon 凹多边形concave upward 凹向上的concentric circles 同心圆concept 概念conclusion 结论concurrent 共点concyclic 共圆concyclic points 共圆点condition 条件conditional 条件句;条件式cone 锥;圆锥(体)congruence (1)全等;(2)同余congruent 全等congruent figures 全等图形congruent triangles 全等三角形cconjugate 共轭consecutive integers 连续整数consecutive numbers 连续数;相邻数consequence 结论;推论consequent 条件;后项consistency condition 相容条件consistent 一贯的;相容的constant 常数constant speed 恒速率constant term 常项constraint 约束;约束条件construct 作construction 作图construction of equation 方程的设立continued proportion 连比例continued ratio 连比continuous 连续的continuous data 连续数据continuous function 连续函数continuous proportion 连续比例contradiction 矛盾converse 逆(定理)converse theorem 逆定理conversion 转换convex 凸convex polygon 凸多边形coordinate 坐标coordinate geometry 解析几何;坐标几何coordinate system 坐标系系定理;系;推论correct to 准确至;取值至correspondence 对应corresponding angles (1)同位角;(2)对应角corresponding sides 对应边cosine 余弦cosine formula 余弦公式cost price 成本counter clockwise direction 逆时针方向;返时针方向counter example 反例counting 数数;计数criterion 准则critical point 临界点cross-multiplication 交叉相乘cross-section 横切面;横截面;截痕cube 正方体;立方;立方体cube root 立方根cubic 三次方;立方;三次(的)cubic equation 三次方程cuboid 长方体;矩体cumulative 累积的cumulative frequecy 累积频数;累积频率cumulative frequency curve 累积频数曲?cumulative frequcncy distribution 累积频数分布cumulative frequency polygon 累积频数多边形;累积频率直方图curve 曲线?curve sketching 曲线描绘(法)curve tracing 曲线描迹(法)curved line 曲线?curved surface 曲面curved surface area 曲面面积cyclic quadrilateral 圆内接四边形cylinder 柱;圆柱体cylindrical 圆柱形的Ddata 数据decagon 十边形decay 衰变decay factor 衰变因子decimal 小数decimal place 小数位decimal point 小数点decimal system 
十进制decrease 递减decreasing function 递减函数;下降函数decreasing sequence 递减序列;下降序列decreasing series 递减级数;下降级数decrement 减量deduce 演绎deduction 推论deductive reasoning 演绎推理definite 确定的;定的distance 距离distance formula 距离公式distinct roots 相异根distincr solution 相异解distribution 公布distrivutive law 分配律divide 除dividend (1)被除数;(2)股息divisible 可整除division 除法division algorithm 除法算式divisor 除数;除式;因子divisor of zero 零因子dodecagon 十二边形dot 点double root 二重根due east/ south/ west /north 向东/ 南/ 西/ 北definiton 定义degree (1)度;(2)次degree of a polynomial 多项式的次数degree of accuracy 准确度degree of precision 精确度delete 删除;删去denary number 十进数denary scale 十进法denary system 十进制denominator 分母dependence (1)相关;(2)应变dependent event(s) 相关事件;相依事件;从属事件dependent variable 应变量;应变数depreciation 折旧descending order 递降序descending powers of X X的降序detached coefficients 分离系数(法)deviation 偏差;变差deviation from the mean 离均差diagonal 对角?diagram 图;图表diameter 直径difference 差digit 数字dimension 量;量网;维(数)direct proportion 正比例direct tax, direct taxation 直接税direct variation 正变(分)directed angle 有向角directed number 有向数direction 方向;方位discontinuous 间断(的);非连续(的);不连续(的) discount 折扣discount per cent 折扣百分率discrete 分立;离散discrete data 离散数据;间断数据discriminant 判别式dispersion 离差displacement 位移disprove 反证Eedge 棱;边elimination 消法elimination method 消去法;消元法elongation 伸张;展empirical data 实验数据empirical formula 实验公式empirical probability 实验概率;经验概率enclosure 界限end point 端点entire surd 整方根equal 相等equal ratios theorem 等比定理equal roots 等根equality 等(式)equality sign 等号equation 方程equation in one unknown 一元方程equation in two unknowns(variables) 二元方程equation of a straight line 直线方程equation of locus 轨迹方程equiangular 等角(的)extreme value 极值equidistant 等距(的)equilaeral 等边(的)equilateral polygon 等边多边形equilateral triangle 等边三角形equivalent 等价(的)error 误差escribed circle 旁切圆estimate 估计;估计量Euler's formula 尤拉公式;欧拉公式evaluate 计值even function 偶函数even number 偶数evenly distributed 均匀分布的event 事件exact 真确exact solution 准确解;精确解;真确解exact value 法确解;精确解;真确解example 例excentre 外心exception 例外excess 起exclusive 不包含exclusive events 
互斥事件exercise 练习expand 展开expand form 展开式expansion 展式expectation 期望expectation value, expected value 期望值;预期值experiment 实验;试验experimental 试验的experimental probability 实验概率exponent 指数express…in terms of….. 以………表达expression 式;数式extension 外延;延长;扩张;扩充exterior angle 外角external angle bisector 外分角?external point of division 外分点extreme point 极值点Fface 面factor 因子;因式;商factor method 因式分解法factor theorem 因子定理;因式定理factorial 阶乘factorization 因子分解;因式分解factorization of polynomial 多项式因式分解FALSE 假(的)feasible solution 可行解;容许解Fermat’s last theorem 费尔马最后定理Fibonacci number 斐波那契数;黄金分割数Fibonacci sequence 斐波那契序列fictitious mean 假定平均数figure (1)图(形);(2)数字finite 有限finite population 有限总体finite sequence 有限序列finite series 有限级数first quartile 第一四分位数first term 首项fixed deposit 定期存款fixed point 定点flow chart 流程图foot of perpendicular 垂足for all X 对所有Xfor each /every X 对每一Xform 形式;型formal proof 形式化的证明format 格式;规格formula(formulae) 公式four rules 四则four-figure table 四位数表fourth root 四次方根fraction 分数;分式fraction in lowest term 最简分数fractional equation 分式方程fractional index 分数指数fractional inequality 分式不等式free fall 自由下坠frequency 频数;频率frequency distribution 频数分布;频率分布frequency distribution table 频数分布表frequency polygon 频数多边形;频率多边形frustum 平截头体function 函数function of function 复合函数;迭函数functional notation 函数记号Ggain 增益;赚;盈利gain per cent 赚率;增益率;盈利百分率game (1)对策;(2)博奕general form 一般式;通式general solution 通解;一般解general term 通项geoborad 几何板geometric mean 几何平均数;等比中项geometric progression 几何级数;等比级数geometric sequence 等比序列geometric series 等比级数geometry 几何;几何学given 给定;已知golden section 黄金分割grade 等级gradient (1)斜率;倾斜率;(2)梯度grand total 总计graph 图像;图形;图表graph paper 图表纸graphical method 图解法graphical representation 图示;以图样表达graphical solution 图解greatest term 最大项greatest value 最大值grid lines 网网格线group 组;?grouped data 分组数据;分类数据grouping terms 并项;集项growth 增长growth factor 增长因子Hhalf closed interval 半闭区间half open interval 半开区间head 正面(钱币)height 高(度)hemisphere 半球体;半球heptagon 七边形Heron's formula 希罗公式hexagon 六边形higher order derivative 高阶导数highest common factor(H.C.F) 
最大公因子;最高公因式;最高公因子Hindu-Arabic numeral 阿刺伯数字histogram 组织图;直方图;矩形图horizontal 水平的;水平horizontal line 横线?;水平线?hyperbola 双曲线?hypotenuse 斜边Iidentical 全等;恒等identity 等(式)identity relation 恒等关系式if and only if/iff 当且仅当;若且仅若if…., then 若….则;如果…..则illustration 例证;说明image 像点;像imaginary circle 虚圆imaginary number 虚数imaginary root 虚根implication 蕴涵式;蕴含式imply 蕴涵;蕴含impossible event 不可能事件improper fraction 假分数inclination 倾角;斜角inclined plane 斜面included angle 夹角included side 夹边inclusive 包含的;可兼的inconsistent 不相的(的);不一致(的)increase 递增;增加increasing function 递增函数interior angles on the same side of the transversal 同旁内角interior opposite angle 内对角internal bisector 内分角?internal division 内分割internal point of division 内分点inter-quartile range 四分位数间距intersect 相交intersection (1)交集;(2)相交;(3)交点interval 区间intuition 直观invariance 不变性invariant (1)不变的;(2)不变量;不变式inverse 反的;逆的inverse circular function 反三角函数inverse cosine function 反余弦函数inverse function 反函数;逆函数inverse problem 逆算问题inverse proportion 反比例;逆比例inverse sine function 反正弦函数inverse tangent function 反正切函数inverse variation 反变(分);逆变(分)irrational equation 无理方程irrational number 无理数irreducibility 不可约性irregular 不规则isosceles triangle 等腰三角形increasing sequence 递增序列increasing series 递增级数increment 增量independence 独立;自变independent event 独立事件independent variable 自变量;独立变量indeterminate (1)不定的;(2)不定元;未定元indeterminate coefficient 不定系数;未定系数indeterminate form 待定型;不定型index,indices 指数;指index notation 指数记数法inequality 不等式;不等inequality sign 不等号infinite 无限;无穷infinite population 无限总体infinite sequence 无限序列;无穷序列infinite series 无限级数;无穷级数infinitely many 无穷多infinitesimal 无限小;无穷小infinity 无限(大);无穷(大)initial point 始点;起点initial side 始边initial value 初值;始值input 输入input box 输入inscribed circle 内切圆insertion 插入insertion of brackets 加括号instantaneous 瞬时的integer 整数integral index 整数指数integral solution 整数解integral value 整数值intercept 截距;截段intercept form 截距式intercept theorem 截线定理interchange 互换interest 利息interest rate 利率interest tax 利息税interior angle 内角Jjoint variation 联变(分);连变(分)Kknown 己知LL.H.S. 
末项law 律;定律law of indices 指数律;指数定律law of trichotomy 三分律leading coefficient 首项系数least common multiple, lowest common multiple (L.C.M) 最小公倍数;最低公倍式least value 最小值lemma 引理length 长(度)letter 文字;字母like surd 同类根式like terms 同类项limit 极限line 线;行line of best-fit 最佳拟合?line of greatest slope 最大斜率的直?;最大斜率?line of intersection 交线?line segment 线段linear 线性;一次linear equation 线性方程;一次方程linear equation in two unknowns 二元一次方程;二元线性方程linear inequality 一次不等式;线性不等式linear programming 线性规划literal coefficient 文字系数literal equation 文字方程load 负荷loaded coin 不公正钱币loaded die 不公正骰子locus, loci 轨迹logarithm 对数logarithmic equation 对数方程logarithmic function 对数函数logic 逻辑logical deduction 逻辑推论;逻辑推理logical step 逻辑步骤long division method 长除法loss 赔本;亏蚀loss per cent 赔率;亏蚀百分率lower bound 下界lower limit 下限lower quartile 下四分位数lowest common multiple(L.C.M) 最小公倍数Mmagnitude 量;数量;长度;大小major arc 优弧;大弧major axis 长轴major sector 优扇形;大扇形major segment 优弓形;大弓形mantissa 尾数mantissa of logarithm 对数的尾数;对数的定值部many-sided figure 多边形marked price 标价mathematical induction 数学归纳法mathematical sentence 数句mathematics 数学maximize 极大maximum absolute error 最大绝对误差maximum point 极大点maximum value 极大值mean 平均(值);平均数;中数mean deviation 中均差;平均偏差measure of dispersion 离差的量度measurement 量度median (1)中位数;(2)中线?meet 相交;相遇mensuration 计量;求积法method 方法method of completing square 配方法method of substitution 代换法;换元法metric unit 十进制单位mid-point 中点mid-point formula 中点公式mid-point theorem 中点定理million 百万minimize 极小minimum point 极小点minimum value 极小值minor (1)子行列式;(2)劣;较小的minor arc 劣弧;小弧minor axis 短轴minor sector 劣扇形;小扇形minor segment 劣弓形;小弓形minus 减minute 分mixed number(fraction) 带分数modal class 众数组mode 众数model 模型monomial 单项式multinomial 多项式multiple 倍数multiple root 多重根multiplicand 被乘数multiplication 乘法multiplication law (of probability) (概率)乘法定律multiplicative property 可乘性multiplier 乘数;乘式multiply 乘mutually exclusive events 互斥事件mutually independent 独立; 互相独立mutually perpendicular lines 互相垂直?Nn factorial n阶乘n th root n次根;n次方根natural number 自然数negative 负negative angle 负角negative index 负指数negative 
integer 负整数negative number 负数neighborhood 邻域net 净(值)n-gon n边形nonagon 九边形non-collinear 不共线?non-linear 非线性non-linear equation 非线性方程non-negative 非负的non-trivial 非平凡的non-zero 非零normal (1)垂直的;正交的;法线的(2)正态的(3)正常的;正规的normal curve 正态分记?伲怀1分记?伲徽?媲?伲徽?忧?? normal distribution 正态分布,常态分布normal form 法线式notation 记法;记号number 数number line 数线?number pair 数偶number pattern 数型number plane 数平面number system 数系numeral 数字;数码numeral system 记数系统numerator 分子numerical 数值的;数字的numerical expression 数字式numerical method 计算方法;数值法Ooblique 斜的oblique cone 斜圆锥oblique triangle 斜三角形obtuse angle 钝角obtuse-angled triangle 钝角三角形octagon 八边形octahedron 八面体odd function 奇函数odd number 奇数one-one correspondence 一一对应open interval 开区间open sentence 开句operation 运算opposite angle 对角opposite interior angle 内对角opposite side 对边optimal solution 最优解order (1)序;次序;(2)阶;级ordered pair 序偶origin 原点outcome 结果output 输出overlap 交迭;相交Pparabola 拋物线?parallel 平行(的)parallel lines 平行(直线) parallelogram 平行四边形parameter 参数;参变量partial fraction 部分分数;分项分式polar coordinate system 极坐标系统polar coordinates 极坐标pole 极polygon 多边形polyhedron 多面体polynomial 多项式polynomial equation 多项式方程positive 正positive index 正指数positive integer 正整数positive number 正数power (1)幂;乘方;(2)功率;(3)检定力precise 精密precision 精确度prime 素prime factor 质因子;质因素prime number 素数;质数primitive (1)本原的;原始的;(2)原函数principal (1)主要的;(2)本金prism 梭柱(体);角柱(体)prismoid 平截防庾短?probability 概率problem 应用题produce 延长product 乘积;积product rule 积法则profit 盈利profit per cent 盈利百分率profits tax 利得税progression 级数proof 证(题);证明proper fraction 真分数property 性质property tax 物业税proportion 比例proportional 成比例protractor 量角器pyramid 棱锥(体);角锥(体) Pythagoras’ Theorem 勾股定理Pythagorean triplet 毕氏三元数组partial sum 部分和partial variation 部分变(分) particular solution 特解Pascal’s triangle 帕斯卡斯三角形pattern 模型;规律pegboard 有孔版pentadecagon 十五边形pentagon 五边形per cent 百分率percentage 百分法;百分数percentage decrease 百分减少percentage error 百分误差percentage increase 百分增加percentile 百分位数perfect number 完全数perfecr square 完全平方perimeter 周长;周界period 周期periodic function 周期函数permutation 排列。

Errata: A Wavelet Tour of Signal Processing

Errata

Here is a list of errors detected in the third edition of the book. If you find an error, please let us know about it.

∙ Images 13.6 (a) and 13.6 (b) should be swapped.
∙ On page 22, below eq. 1.15, the right inequalities should read $\sum_{m \in \Gamma} |<f,\tilde{\phi_{p}}>|^2 \leq A^{-1} ||f||^2$.
∙ The Gradient Pursuit algorithm (reference [115], page 672) does not, as stated in the book, "solve the l1 Lagrangian minimisation (12.97)". The algorithm described in the reference is a fast approximation to OMP, and the reference would be better in the OMP section of the book.
∙ Exercise 6.4(a): on the right-hand side of the equation, it should be $\cos(w_0 u)$ rather than $\cos(w_0 t)$.
∙ Follow-up to the second item on this list: on page 22, below and above eq. 1.15, ${m \in \Gamma}$ should be ${p \in \Gamma}$.
∙ On page 112, Example 4.11, 2nd to last line: "quadratic chirp" should be "linear chirp".
∙ On page 114, top line. Current: "This discrete wavelet has $Ka^j$ nonzero values..."; correct: "This discrete wavelet has $N/(K a^j)$ nonzero values...".
∙ On page 114, above eq. 4.65: in two places in the equation for $\phi_J$, $a^j$ should be $a^J$.
∙ On page 115, 3rd line from top: ${a^j}$ should be ${a^j}_{j \in [I,J]}$.
∙ On page 171, equation at top: the leading factor on the right has $1/2^j$; it should be the square root of this (or else the norm has to change in eq. 5.54 and elsewhere).
∙ On page 267, below eq. 7.12, in the numerator of $\phi$ on the right. Current numerator: $t - n$; correct numerator: $t - n 2^j$ (since this is for the orthonormal basis, not the time-invariant version).
∙ Theorem 7.1 on p. 267: the argument in $\phi_{j,n}(t)$ is written as $\phi[(t-n)/2^j]$. It should be $\phi([t - 2^j n] / 2^j)$.
∙ On page 677, the iteration for the sparse analysis resolution should read $\tilde F_{k+1} = S_{\mu T}( \tilde F_k + \mu U^*(Y-U\tilde F_k) )$, i.e., $\gamma$ should be replaced by $\mu$.
∙ In Eq. (8.106), p. 423, one should replace $[n-a_p]$ with $[n-a_p-1/2]$.
∙ On page 739, paragraph "Restricted Isometry and Incoherence", the sentence "If the vectors are normalized vectors then $A_\La \leq 1 \leq B_\La$, and it is equivalent to impose that $\delta_\La = \max(1 - A_\La, B_\La - 1)$ is not too small. To get a stable recovery of all sparse signals, compressive sensing imposes a uniform bound on all sufficiently sparse sets $\La$: $$\delta_\La \geq \delta_M(\DU) > 0 ~~\mbox{if}~~~|\La| \leq M ,$$" should be replaced by: "If the vectors are normalized vectors then $A_\La \leq 1 \leq B_\La$, and it is equivalent to impose that $\delta_\La = \max(1 - A_\La, B_\La - 1)$ is not too close to 1. To get a stable recovery of all sparse signals, compressive sensing imposes a uniform bound on all sufficiently sparse sets $\La$: $$\delta_\La \leq \delta_M(\DU) < 1 ~~\mbox{if}~~~|\La| \leq M,$$"
∙ On page 739, 8 lines before the end of the page: "A simple geometric interpretation explains why random measurement operators define incoherent dictionaries with $\delta_M (\DU) > 0$ for relatively large $M$." should be replaced by "A simple geometric interpretation explains why random measurement operators define incoherent dictionaries with $\delta_M (\DU) < 1$ for relatively large $M$."
∙ On page 107, before equation (3.35): should be "(Exercise 4.6)".
∙ Bottom of page 156: "$\lambda$ is also an eigenvector of $\Phi\Phi^*$".
∙ Page 55, exercise 2.3: references should be (2.18), (2.19) and (2.21).
∙ Page 57, exercise 2.17: should be "$f(x_1,x_2)=1_{x_1^2+x_2^2 \leq 1}$".
∙ Page 86, exercise 3.9: should be "recovers $f(t)$ from $\{f(p s)\}_{p \in \ZZ}$".
∙ Page 86, exercise 3.12: should be "from any $0 \leq k < N$".
∙ Page 86, exercise 3.16 b): should be "if $h[n]= a \delta[n-p]$ for some $p \in \ZZ$ and $a \in \CC$".
∙ Page 676, 5th equation: should be $\tilde a_{k+1}[p] = ...$.
∙ Page 60, before (3.6): should be "the Poisson formula, Theorem 2.4".
∙ Pages 320-322: the superscript "pér" should be replaced by "per".
∙ Exercise 2.9: one needs to impose that $K < M$.
∙ On p. 676, $\tilde{a}^{k+1}=a^{k}+\gamma b^{k+1}$ should be $\tilde{a}^{k+1}=a^{k}-\gamma b^{k+1}$.
∙ On p. 156, at the bottom of the page, in the sentence "Both statements are proved to be equivalent...": "Phi conjugate times Phi" should be "Phi conjugate times Phi and Phi times Phi conjugate".
∙ Exercise 4.4: the result is $O(N^2 \log L)$.
∙ Exercise 4.9: one needs to use $1/(2\pi)$ and not $1/\pi$.
∙ Exercise 4.17: several errors; see the correction.
∙ Exercise 5.1: it is required that $K \geq 1$.
∙ Exercise 5.6: one needs to impose $\|\phi_p\| = 1$.
∙ Exercise 5.9: one needs $\hat g = 1_{[-\omega_0/2,\omega_0/2]}$.
∙ Exercise 5.10: the sum should run from $mM - K/2$ to $mM + K/2-1$.
∙ Exercise 5.15: several errors; see the correction.
∙ Exercise 5.18: one needs to impose that $\hat \phi(\omega)$ tends to 0 when $\omega$ tends to infinity.
∙ On page 218 (paragraph 6.2.1), "See figure 6.5b on page 218 ..." should be "See figure 6.5b on page 220 ...".
∙ On page 37, before Theorem 2.2: the $\exp(itw)$ are the eigenvectors of the convolution operator (and not eigenvalues).
∙ On p. 156, "are equal to...": one should add "on $Im \Phi$".
∙ On p. 156, last line: "$\Phi \Phi^*$ with ..." instead of "$\Phi^* \Phi$ with ...".
∙ On p. 157, eq. 5.6: one should replace $<\phi_n, \phi_p>$ by $<\phi_m, \phi_p>$.
∙ On p. 36, in the proof of Theorem 2.1, it should read "We cannot apply the Fubini Theorem A.2" instead of "We cannot apply the Fubini Theorem reffubini".
∙ On page 103 (Section 4.3.1), the transform using a real-valued wavelet is defined using a capitalized W; however, throughout 4.3.1 a lower-case w is used to indicate the real wavelet transform. On the other hand, on page 109 the analytic wavelet transform is defined using lower-case w, while in the remainder of Section 4.3.2 upper-case W is used.
∙ On page 106: equation (4.40) lacks the integral from 0 to infinity over the variable $s$.
∙ On page 758, in the definition of operator $U$'s image $Im\,U$, in the line before subsection "Supremum Norm": $H_1$ should be $\mathbf{H}_1$, since $H_2$ is a Hilbert space, not an operator.
∙ On page 758, in the first equation of "A.4 LINEAR OPERATORS", $\forall f_1,f_2\in\mathbf{H}$ should be $\forall f_1,f_2\in\mathbf{H}_1$, since $f_1, f_2$ are defined as vectors in $H_1$ while $H$ is not defined.
∙ On page 40, before equation (2.31), "applying (2.22) proves that" should be "applying (2.21) proves that", since it uses the time-derivative property of the Fourier transform.
∙ On page 129, in the definition of $g_{s,u,\xi}(t)$: $\sqrt{s}$ should be $\frac{1}{\sqrt{s}}$.
∙ On page 130, in equation (4.109): before the first equality sign there should be $\frac{\eta(u)}{s}$.
∙ On page 136: the second part of the equation before the last paragraph should be $[-2(u_0-u)-T,-2(u_0-u+T)]$.
∙ On page 136, second line of the last paragraph: $|u_0-u|\le T$ should be $|u_0-u|\le T/2$.
∙ On page 137: the last part of (4.129) should be $P_v g(\frac{u}{s},s\xi)$.
∙ In exercise 2.10, section (b), it should be asked to prove that $\hat{f_r} = H \hat{f_i}$ and $\hat{f_i} = - H \hat{f_r}$.
∙ At p. 685, in the proof of Theorem 12.15, an argument is missing because one might have that the support of $\tilde a_{\Lambda}$ is strictly included in $\Lambda$. One should use the monotonicity of ERC with respect to the support.
∙ Page 51, equation (2.70): $1/N$ should be modified to $N$.
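The p. 677 correction above concerns an iterative soft-thresholding update of the form $\tilde F_{k+1} = S_{\mu T}(\tilde F_k + \mu U^*(Y - U\tilde F_k))$. A minimal numerical sketch of that iteration, with the helper names, the real-matrix stand-in for $U^*$, and all parameter values being illustrative assumptions rather than anything from the book:

```python
import numpy as np

def soft_threshold(x, t):
    # Soft-thresholding operator S_t, applied entrywise:
    # shrinks each coefficient toward zero by t and zeroes the small ones.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(U, Y, T, mu, n_iter=50):
    # Iterative soft-thresholding:
    #   F_{k+1} = S_{mu*T}( F_k + mu * U^T (Y - U F_k) )
    # U^T stands in for the adjoint U^* since everything here is real;
    # mu should satisfy mu <= 1 / ||U||^2 for the iteration to converge.
    F = np.zeros(U.shape[1])
    for _ in range(n_iter):
        F = soft_threshold(F + mu * U.T @ (Y - U @ F), mu * T)
    return F
```

With `U` the identity and `mu = 1`, a single step already lands on the fixed point `soft_threshold(Y, T)`, which makes a quick sanity check on any implementation.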

Kernel methods in machine learning

arXiv:math/0701907v3 [math.ST] 1 Jul 2008

The Annals of Statistics 2008, Vol. 36, No. 3, 1171–1220
DOI: 10.1214/009053607000000677
© Institute of Mathematical Statistics, 2008

KERNEL METHODS IN MACHINE LEARNING

By Thomas Hofmann, Bernhard Schölkopf and Alexander J. Smola
Darmstadt University of Technology, Max Planck Institute for Biological Cybernetics and National ICT Australia

We review machine learning methods employing positive definite kernels. These methods formulate learning and estimation problems in a reproducing kernel Hilbert space (RKHS) of functions defined on the data domain, expanded in terms of a kernel. Working in linear spaces of functions has the benefit of facilitating the construction and analysis of learning algorithms while at the same time allowing large classes of functions. The latter include nonlinear functions as well as functions defined on nonvectorial data. We cover a wide range of methods, ranging from binary classifiers to sophisticated methods for estimation with structured data.

1. Introduction. Over the last ten years, estimation and learning methods utilizing positive definite kernels have become rather popular, particularly in machine learning. Since these methods have a stronger mathematical slant than earlier machine learning methods (e.g., neural networks), there is also significant interest in the statistics and mathematics community for these methods. The present review aims to summarize the state of the art on a conceptual level. In doing so, we build on various sources, including Burges [25], Cristianini and Shawe-Taylor [37], Herbrich [64] and Vapnik [141] and, in particular, Schölkopf and Smola [118], but we also add a fair amount of more recent material which helps unify the exposition. We have not had space to include proofs; they can be found either in the long version of the present paper (see Hofmann et al. [69]), in the references given, or in the above books.

The main idea of all the described methods can be summarized in one
paragraph. Traditionally, theory and algorithms of machine learning and statistics have been very well developed for the linear case. Real world data analysis problems, on the other hand, often require nonlinear methods to detect the kind of dependencies that allow successful prediction of properties of interest. By using a positive definite kernel, one can sometimes have the best of both worlds. The kernel corresponds to a dot product in a (usually high-dimensional) feature space. In this space, our estimation methods are linear, but as long as we can formulate everything in terms of kernel evaluations, we never explicitly have to compute in the high-dimensional feature space.

The paper has three main sections: Section 2 deals with fundamental properties of kernels, with special emphasis on (conditionally) positive definite kernels and their characterization. We give concrete examples for such kernels and discuss kernels and reproducing kernel Hilbert spaces in the context of regularization. Section 3 presents various approaches for estimating dependencies and analyzing data that make use of kernels. We provide an overview of the problem formulations as well as their solution using convex programming techniques. Finally, Section 4 examines the use of reproducing kernel Hilbert spaces as a means to define statistical models, the focus being on structured, multidimensional responses. We also show how such techniques can be combined with Markov networks as a suitable framework to model dependencies between response variables.

2. Kernels.

2.1. An introductory example. Suppose we are given empirical data

(x1, y1), ..., (xn, yn) ∈ X × Y.    (1)

Here, the domain X is some nonempty set that the inputs (the predictor variables) xi are taken from; the yi ∈ Y are called targets (the response variable). Here and below, i, j ∈ [n], where we use the notation [n] := {1, ..., n}.
Note that we have not made any assumptions on the domain X other than it being a set. In order to study the problem of learning, we need additional structure. In learning, we want to be able to generalize to unseen data points. In the case of binary pattern recognition, given some new input x ∈ X, we want to predict the corresponding y ∈ {±1} (more complex output domains Y will be treated below). Loosely speaking, we want to choose y such that (x, y) is in some sense similar to the training examples. To this end, we need similarity measures in X and in {±1}. The latter is easier, as two target values can only be identical or different. For the former, we require a function

k : X × X → R,  (x, x′) → k(x, x′)    (2)

satisfying, for all x, x′ ∈ X,

k(x, x′) = ⟨Φ(x), Φ(x′)⟩,    (3)

where Φ maps into some dot product space H, sometimes called the feature space. The similarity measure k is usually called a kernel, and Φ is called its feature map.

The advantage of using such a kernel as a similarity measure is that it allows us to construct algorithms in dot product spaces. For instance, consider the following simple classification algorithm, described in Figure 1, where Y = {±1}.

[Fig. 1. A simple geometric classification algorithm: given two classes of points (depicted by "o" and "+"), compute their means c+, c− and assign a test input x to the one whose mean is closer. This can be done by looking at the dot product between x − c [where c = (c+ + c−)/2] and w := c+ − c−, which changes sign as the enclosed angle passes through π/2. Note that the corresponding decision boundary is a hyperplane (the dotted line) orthogonal to w (from Schölkopf and Smola [118]).]

The idea is to compute the means of the two classes in the feature space, c+ = (1/n+) Σ{i: yi=+1} Φ(xi) and c− = (1/n−) Σ{i: yi=−1} Φ(xi), where n+ and n− are the number of examples with positive and negative target values, respectively. We then assign a new point Φ(x) to the class whose mean is closer to it. This leads to the prediction rule

y = sgn(⟨Φ(x), c+⟩ − ⟨Φ(x), c−⟩ + b)    (4)

with b = (1/2)(‖c−‖² − ‖c+‖²). Substituting the expressions for c± and writing the dot products ⟨Φ(x), Φ(xi)⟩ as kernel evaluations k(x, xi), this becomes

y = sgn((1/n+) Σ{i: yi=+1} k(x, xi) − (1/n−) Σ{i: yi=−1} k(x, xi) + b),    (5)

where b = (1/2)((1/n−²) Σ{(i,j): yi=yj=−1} k(xi, xj) − (1/n+²) Σ{(i,j): yi=yj=+1} k(xi, xj)).

Let us consider one well-known special case of this type of classifier. Assume that the class means have the same distance to the origin (hence, b = 0), and that k(·, x) is a density for all x ∈ X. If the two classes are equally likely and were generated from two probability distributions that are estimated as

p+(x) := (1/n+) Σ{i: yi=+1} k(x, xi),  p−(x) := (1/n−) Σ{i: yi=−1} k(x, xi),    (6)

then (5) is the estimated Bayes decision rule, plugging in the estimates p+ and p− for the true densities.

The classifier (5) is closely related to the Support Vector Machine (SVM) that we will discuss below. It is linear in the feature space (4), while in the input domain, it is represented by a kernel expansion (5). In both cases, the decision boundary is a hyperplane in the feature space; however, the normal vectors [for (4), w = c+ − c−] are usually rather different.

The normal vector not only characterizes the alignment of the hyperplane, its length can also be used to construct tests for the equality of the two class-generating distributions (Borgwardt et al. [22]).

As an aside, note that if we normalize the targets such that ŷi = yi/|{j : yj = yi}|, in which case the ŷi sum to zero, then ‖w‖² = ⟨K, ŷŷ⊤⟩_F, where ⟨·,·⟩_F is the Frobenius dot product. If the two classes have equal size, then up to a scaling factor involving ‖K‖² and n, this equals the kernel-target alignment defined by Cristianini et al. [38].

2.2. Positive definite kernels. We have required that a kernel satisfy (3), that is, correspond to a dot product in some dot product space. In the present section we show that the class of kernels that can be written in the form (3) coincides with the class of positive definite kernels. This has far-reaching consequences. There are examples of positive definite kernels which can be evaluated efficiently even though they correspond to dot products in infinite dimensional dot product spaces. In such cases, substituting k(x, x′) for ⟨Φ(x), Φ(x′)⟩, as we have done in (5), is crucial. In the machine learning community, this
substitution is called the kernel trick.

Definition 1 (Gram matrix). Given a kernel k and inputs x1, ..., xn ∈ X, the n × n matrix

K := (k(xi, xj))ij    (7)

is called the Gram matrix (or kernel matrix) of k with respect to x1, ..., xn.

Definition 2 (Positive definite matrix). A real n × n symmetric matrix Kij satisfying

Σi,j ci cj Kij ≥ 0    (8)

for all ci ∈ R is called positive definite. If equality in (8) only occurs for c1 = ··· = cn = 0, then we shall call the matrix strictly positive definite.

Definition 3 (Positive definite kernel). Let X be a nonempty set. A function k : X × X → R which for all n ∈ N, xi ∈ X, i ∈ [n] gives rise to a positive definite Gram matrix is called a positive definite kernel. A function k : X × X → R which for all n ∈ N and distinct xi ∈ X gives rise to a strictly positive definite Gram matrix is called a strictly positive definite kernel.

Occasionally, we shall refer to positive definite kernels simply as kernels. Note that, for simplicity, we have restricted ourselves to the case of real valued kernels. However, with small changes, the below will also hold for the complex valued case.

Since Σi,j ci cj ⟨Φ(xi), Φ(xj)⟩ = ⟨Σi ci Φ(xi), Σj cj Φ(xj)⟩ ≥ 0, kernels of the form (3) are positive definite for any choice of Φ. In particular, if X is already a dot product space, we may choose Φ to be the identity. Kernels can thus be regarded as generalized dot products. While they are not generally bilinear, they share important properties with dot products, such as the Cauchy–Schwarz inequality: If k is a positive definite kernel, and x1, x2 ∈ X, then

k(x1, x2)² ≤ k(x1, x1) · k(x2, x2).    (9)

2.2.1. Construction of the reproducing kernel Hilbert space. We now define a map from X into the space of functions mapping X into R, denoted as R^X, via

Φ : X → R^X where x → k(·, x).    (10)

Here, Φ(x) = k(·, x) denotes the function that assigns the value k(x′, x) to x′ ∈ X.

We next construct a dot product space containing the images of the inputs under Φ. To this end, we first turn it into a vector
space by forming linear combinations

f(·) = Σ_{i=1}^{n} αi k(·, xi).    (11)

Here, n ∈ N, αi ∈ R and xi ∈ X are arbitrary. Next, we define a dot product between f and another function g(·) = Σ_{j=1}^{n′} βj k(·, x′j) (with n′ ∈ N, βj ∈ R and x′j ∈ X) as

⟨f, g⟩ := Σ_{i=1}^{n} Σ_{j=1}^{n′} αi βj k(xi, x′j).    (12)

To see that this is well defined although it contains the expansion coefficients and points, note that ⟨f, g⟩ = Σ_{j=1}^{n′} βj f(x′j). The latter, however, does not depend on the particular expansion of f. Similarly, for g, note that ⟨f, g⟩ = Σ_{i=1}^{n} αi g(xi). This also shows that ⟨·,·⟩ is bilinear. It is symmetric, as ⟨f, g⟩ = ⟨g, f⟩. Moreover, it is positive definite, since positive definiteness of k implies that, for any function f, written as (11), we have

⟨f, f⟩ = Σ_{i,j=1}^{n} αi αj k(xi, xj) ≥ 0.    (13)

Next, note that given functions f1, ..., fp, and coefficients γ1, ..., γp ∈ R, we have

Σ_{i,j=1}^{p} γi γj ⟨fi, fj⟩ = ⟨Σ_{i=1}^{p} γi fi, Σ_{j=1}^{p} γj fj⟩ ≥ 0.    (14)

Here, the equality follows from the bilinearity of ⟨·,·⟩, and the right-hand inequality from (13). By (14), ⟨·,·⟩ is a positive definite kernel, defined on our vector space of functions. For the last step in proving that it even is a dot product, we note that, by (12), for all functions (11),

⟨k(·, x), f⟩ = f(x) and, in particular, ⟨k(·, x), k(·, x′)⟩ = k(x, x′).    (15)

By virtue of these properties, k is called a reproducing kernel (Aronszajn [7]). Due to (15) and (9), we have

|f(x)|² = |⟨k(·, x), f⟩|² ≤ k(x, x) · ⟨f, f⟩.    (16)

By this inequality, ⟨f, f⟩ = 0 implies f = 0, which is the last property that was left to prove in order to establish that ⟨·,·⟩ is a dot product.
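The reproducing property (15) can be illustrated numerically. The sketch below is a hypothetical setup (Gaussian kernel, random expansion points, all names illustrative): it builds f = Σᵢ αᵢ k(·, xᵢ) as in (11) and evaluates ⟨k(·, x), f⟩ via the double sum (12); since k(·, x) has the single expansion coefficient 1 at the point x, the double sum collapses to Σᵢ αᵢ k(x, xᵢ) = f(x).

```python
import numpy as np

def k(x, xp, sigma=1.0):
    # Gaussian kernel, a positive definite kernel on R^d
    return np.exp(-np.sum((x - xp) ** 2) / (2 * sigma ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))    # expansion points x_i
alpha = rng.normal(size=5)     # coefficients alpha_i

def f(x):
    # f(.) = sum_i alpha_i k(., x_i), as in (11)
    return sum(a * k(x, xi) for a, xi in zip(alpha, X))

x = rng.normal(size=2)
# <k(., x), f> via (12): k(., x) expands with the single coefficient 1
# at the point x, so the double sum reduces to sum_i alpha_i k(x, x_i).
inner = sum(a * k(x, xi) for a, xi in zip(alpha, X))
assert np.isclose(inner, f(x))  # the reproducing property <k(.,x), f> = f(x)
```

The collapse of (12) is exactly why point evaluation is an inner product in the RKHS: evaluating f at x costs one kernel expansion, never an explicit feature map.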
Skipping some details, we add that one can complete the space of functions (11) in the norm corresponding to the dot product, and thus gets a Hilbert space H, called a reproducing kernel Hilbert space (RKHS).

One can define a RKHS as a Hilbert space H of functions on a set X with the property that, for all x ∈ X and f ∈ H, the point evaluations f → f(x) are continuous linear functionals [in particular, all point values f(x) are well defined, which already distinguishes RKHSs from many L2 Hilbert spaces]. From the point evaluation functional, one can then construct the reproducing kernel using the Riesz representation theorem. The Moore–Aronszajn theorem (Aronszajn [7]) states that, for every positive definite kernel on X × X, there exists a unique RKHS and vice versa.

There is an analogue of the kernel trick for distances rather than dot products, that is, dissimilarities rather than similarities. This leads to the larger class of conditionally positive definite kernels. Those kernels are defined just like positive definite ones, with the one difference being that their Gram matrices need to satisfy (8) only subject to

Σ_{i=1}^{n} ci = 0.    (17)

Interestingly, it turns out that many kernel algorithms, including SVMs and kernel PCA (see Section 3), can be applied also with this larger class of kernels, due to their being translation invariant in feature space (Hein et al. [63] and Schölkopf and Smola [118]).

We conclude this section with a note on terminology. In the early years of kernel machine learning research, it was not the notion of positive definite kernels that was being used. Instead, researchers considered kernels satisfying the conditions of Mercer's theorem (Mercer [99]; see, e.g., Cristianini and Shawe-Taylor [37] and Vapnik [141]). However, while all such kernels do satisfy (3), the converse is not true. Since (3) is what we are interested in, positive definite kernels are thus the right class of kernels to consider.

2.2.2. Properties of positive definite kernels. We begin with some closure
properties of the set of positive definite kernels.

Proposition 4. Below, k1, k2, ... are arbitrary positive definite kernels on X × X, where X is a nonempty set:

(i) The set of positive definite kernels is a closed convex cone, that is, (a) if α1, α2 ≥ 0, then α1 k1 + α2 k2 is positive definite; and (b) if k(x, x′) := lim_{n→∞} kn(x, x′) exists for all x, x′, then k is positive definite.

(ii) The pointwise product k1 k2 is positive definite.

(iii) Assume that for i = 1, 2, ki is a positive definite kernel on Xi × Xi, where Xi is a nonempty set. Then the tensor product k1 ⊗ k2 and the direct sum k1 ⊕ k2 are positive definite kernels on (X1 × X2) × (X1 × X2).

The proofs can be found in Berg et al. [18].

It is reassuring that sums and products of positive definite kernels are positive definite. We will now explain that, loosely speaking, there are no other operations that preserve positive definiteness. To this end, let C denote the set of all functions ψ : R → R that map positive definite kernels to (conditionally) positive definite kernels (readers who are not interested in the case of conditionally positive definite kernels may ignore the term in parentheses). We define

C := {ψ | k is a p.d. kernel ⇒ ψ(k) is a (conditionally) p.d. kernel},
C′ := {ψ | for any Hilbert space F, ψ(⟨x, x′⟩_F) is (conditionally) positive definite},
C′′ := {ψ | for all n ∈ N: K is a p.d. n × n matrix ⇒ ψ(K) is (conditionally) p.d.},

where ψ(K) is the n × n matrix with elements ψ(Kij).

Proposition 5. C = C′ = C′′.

The following proposition follows from a result of FitzGerald et al. [50] for (conditionally) positive definite matrices; by Proposition 5, it also applies for (conditionally) positive definite kernels, and for functions of dot products.
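Proposition 4(i)-(ii) can be spot-checked numerically: Gram matrices built from a sum or a pointwise (Schur) product of positive definite kernels should again satisfy (8), that is, have only nonnegative eigenvalues. A minimal sketch with illustrative kernel choices (Gaussian and a homogeneous quadratic polynomial):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 3))  # six sample points in R^3

def gram(kfun):
    # Gram matrix (7) of kernel kfun with respect to the rows of X
    return np.array([[kfun(x, y) for y in X] for x in X])

K1 = gram(lambda x, y: np.exp(-np.sum((x - y) ** 2)))  # Gaussian kernel
K2 = gram(lambda x, y: (x @ y) ** 2)                   # homogeneous polynomial, p = 2

for K in (K1 + K2, K1 * K2):  # sum, Prop. 4(i)(a); pointwise product, Prop. 4(ii)
    # a symmetric matrix satisfies (8) iff all its eigenvalues are >= 0
    assert np.all(np.linalg.eigvalsh(K) >= -1e-9)
```

A tolerance of 1e-9 absorbs floating-point round-off; the mathematical statement is that the eigenvalues are exactly nonnegative.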
We state the latter case.

Proposition 6. Let ψ : R → R. Then ψ(⟨x, x′⟩_F) is positive definite for any Hilbert space F if and only if ψ is real entire of the form

ψ(t) = Σ_{n=0}^{∞} an t^n    (18)

with an ≥ 0 for n ≥ 0. Moreover, ψ(⟨x, x′⟩_F) is conditionally positive definite for any Hilbert space F if and only if ψ is real entire of the form (18) with an ≥ 0 for n ≥ 1.

There are further properties of k that can be read off the coefficients an:

• Steinwart [128] showed that if all an are strictly positive, then the kernel of Proposition 6 is universal on every compact subset S of R^d in the sense that its RKHS is dense in the space of continuous functions on S in the ‖·‖∞ norm. For support vector machines using universal kernels, he then shows (universal) consistency (Steinwart [129]). Examples of universal kernels are (19) and (20) below.

• In Lemma 11 we will show that the a0 term does not affect an SVM. Hence, we infer that it is actually sufficient for consistency to have an > 0 for n ≥ 1.

We conclude the section with an example of a kernel which is positive definite by Proposition 6. To this end, let X be a dot product space. The power series expansion of ψ(x) = e^x then tells us that

k(x, x′) = e^{⟨x,x′⟩/σ²}    (19)

is positive definite (Haussler [62]). If we further multiply k with the positive definite kernel f(x) f(x′), where f(x) = e^{−‖x‖²/(2σ²)} and σ > 0, this leads to the positive definiteness of the Gaussian kernel

k′(x, x′) = k(x, x′) f(x) f(x′) = e^{−‖x−x′‖²/(2σ²)}.    (20)

2.2.3. Properties of positive definite functions. We now let X = R^d and consider positive definite kernels of the form

k(x, x′) = h(x − x′),    (21)

in which case h is called a positive definite function. The following characterization is due to Bochner [21]. We state it in the form given by Wendland [152].

Theorem 7. A continuous function h on R^d is positive definite if and only if there exists a finite nonnegative Borel measure µ on R^d such that

h(x) = ∫_{R^d} e^{−i⟨x,ω⟩} dµ(ω).    (22)

While normally formulated for complex valued functions, the theorem also holds true for real
functions. Note, however, that if we start with an arbitrary nonnegative Borel measure, its Fourier transform may not be real. Real-valued positive definite functions are distinguished by the fact that the corresponding measures µ are symmetric.

We may normalize h such that h(0) = 1 [hence, by (9), |h(x)| ≤ 1], in which case µ is a probability measure and h is its characteristic function. For instance, if µ is a normal distribution of the form (2π/σ²)^{−d/2} e^{−σ²‖ω‖²/2} dω, then the corresponding positive definite function is the Gaussian e^{−‖x‖²/(2σ²)}; see (20).

Bochner's theorem allows us to interpret the similarity measure k(x, x′) = h(x − x′) in the frequency domain. The choice of the measure µ determines which frequency components occur in the kernel. Since the solutions of kernel algorithms will turn out to be finite kernel expansions, the measure µ will thus determine which frequencies occur in the estimates, that is, it will determine their regularization properties; more on that in Section 2.3.2 below.

Bochner's theorem generalizes earlier work of Mathias, and has itself been generalized in various ways, that is, by Schoenberg [115]. An important generalization considers Abelian semigroups (Berg et al. [18]). In that case, the theorem provides an integral representation of positive definite functions in terms of the semigroup's semicharacters. Further generalizations were given by Krein, for the cases of positive definite kernels and functions with a limited number of negative squares. See Stewart [130] for further details and references.

As above, there are conditions that ensure that the positive definiteness becomes strict.

Proposition 8 (Wendland [152]). A positive definite function is strictly positive definite if the carrier of the measure in its representation (22) contains an open subset.

This implies that the Gaussian kernel is strictly positive definite.

An important special case of positive definite functions, which includes the Gaussian, are radial basis functions. These are
functions that can be written as h(x) = g(‖x‖²) for some function g : [0, ∞[ → R. They have the property of being invariant under the Euclidean group.

2.2.4. Examples of kernels. We have already seen several instances of positive definite kernels, and now intend to complete our selection with a few more examples. In particular, we discuss polynomial kernels, convolution kernels, ANOVA expansions and kernels on documents.

Polynomial kernels. From Proposition 4 it is clear that homogeneous polynomial kernels k(x, x′) = ⟨x, x′⟩^p are positive definite for p ∈ N and x, x′ ∈ R^d. By direct calculation, we can derive the corresponding feature map (Poggio [108]):

⟨x, x′⟩^p = (Σ_{j=1}^{d} [x]j [x′]j)^p = Σ_{j∈[d]^p} [x]_{j1} ··· [x]_{jp} · [x′]_{j1} ··· [x′]_{jp} = ⟨C_p(x), C_p(x′)⟩,    (23)

where C_p maps x ∈ R^d to the vector C_p(x) whose entries are all possible p-th degree ordered products of the entries of x (note that [d] is used as a shorthand for {1, ..., d}). The polynomial kernel of degree p thus computes a dot product in the space spanned by all monomials of degree p in the input coordinates. Other useful kernels include the inhomogeneous polynomial,

k(x, x′) = (⟨x, x′⟩ + c)^p where p ∈ N and c ≥ 0,    (24)

which computes all monomials up to degree p.

Spline kernels. It is possible to obtain spline functions as a result of kernel expansions (Vapnik et al. [144]), simply by noting that convolution of an even number of indicator functions yields a positive kernel function. Denote by I_X the indicator (or characteristic) function on the set X, and denote by ⊗ the convolution operation, (f ⊗ g)(x) := ∫_{R^d} f(x′) g(x′ − x) dx′. Then the B-spline kernels are given by

k(x, x′) = B_{2p+1}(x − x′) where p ∈ N with B_{i+1} := B_i ⊗ B_0.    (25)

Here B_0 is the characteristic function on the unit ball in R^d. From the definition of (25), it is obvious that, for odd m, we may write B_m as the inner product between functions B_{m/2}. Moreover, note that, for even m, B_m is not a kernel.

Convolutions and structures. Let us now move to kernels defined on structured objects (Haussler [62] and
Watkins [151]). Suppose the object x ∈ X is composed of xp ∈ Xp, where p ∈ [P] (note that the sets Xp need not be equal). For instance, consider the string x = ATG and P = 2. It is composed of the parts x1 = AT and x2 = G, or alternatively, of x1 = A and x2 = TG. Mathematically speaking, the set of "allowed" decompositions can be thought of as a relation R(x1, ..., xP, x), to be read as "x1, ..., xP constitute the composite object x."

Haussler [62] investigated how to define a kernel between composite objects by building on similarity measures that assess their respective parts; in other words, kernels kp defined on Xp × Xp. Define the R-convolution of k1, ..., kP as

[k1 ⋆ ··· ⋆ kP](x, x′) := Σ_{x̄∈R(x), x̄′∈R(x′)} Π_{p=1}^{P} kp(x̄p, x̄′p),    (26)

where the sum runs over all possible ways R(x) and R(x′) in which we can decompose x into x̄1, ..., x̄P and x′ analogously [here we used the convention that an empty sum equals zero, hence, if either x or x′ cannot be decomposed, then (k1 ⋆ ··· ⋆ kP)(x, x′) = 0]. If there is only a finite number of ways, the relation R is called finite. In this case, it can be shown that the R-convolution is a valid kernel (Haussler [62]).

ANOVA kernels. Specific examples of convolution kernels are Gaussians and ANOVA kernels (Vapnik [141] and Wahba [148]). To construct an ANOVA kernel, we consider X = S^N for some set S, and kernels k^(i) on S × S, where i = 1, ..., N. For P = 1, ..., N, the ANOVA kernel of order P is defined as

k_P(x, x′) := Σ_{1≤i1<···<iP≤N} Π_{p=1}^{P} k^(ip)(x_{ip}, x′_{ip}).    (27)

Note that if P = N, the sum consists only of the term for which (i1, ..., iP) = (1, ..., N), and k equals the tensor product k^(1) ⊗ ··· ⊗ k^(N). At the other extreme, if P = 1, then the products collapse to one factor each, and k equals the direct sum k^(1) ⊕ ··· ⊕ k^(N). For intermediate values of P, we get kernels that lie in between tensor products and direct sums.

ANOVA kernels typically use some moderate value of P, which specifies the order of the interactions between attributes x_{ip} that we are interested in. The sum then runs over the numerous terms that take into account interactions
of order P; fortunately, the computational cost can be reduced to O(Pd) by utilizing recurrent procedures for the kernel evaluation. ANOVA kernels have been shown to work rather well in multi-dimensional SV regression problems (Stitson et al. [131]).

Bag of words. One way in which SVMs have been used for text categorization (Joachims [77]) is the bag-of-words representation. This maps a given text to a sparse vector, where each component corresponds to a word, and a component is set to one (or some other number) whenever the related word occurs in the document. Using an efficient sparse representation, the dot product between two such vectors can be computed quickly. Furthermore, this dot product is by construction a valid kernel, referred to as a sparse vector kernel. One of its shortcomings, however, is that it does not take into account the word ordering of a document. Other sparse vector kernels are also conceivable, such as one that maps a text to the set of pairs of words that are in the same sentence (Joachims [77] and Watkins [151]).

n-grams and suffix trees. A more sophisticated way of dealing with string data was proposed by Haussler [62] and Watkins [151]. The basic idea is as described above for general structured objects (26): Compare the strings by means of the substrings they contain. The more substrings two strings have in common, the more similar they are. The substrings need not always be contiguous; that said, the further apart the first and last element of a substring are, the less weight should be given to the similarity.

Depending on the specific choice of a similarity measure, it is possible to define more or less efficient kernels which compute the dot product in the feature space spanned by all substrings of documents. Consider a finite alphabet Σ, the set of all strings of length n, Σ^n, and the set of all finite strings, Σ* := ∪_{n=0}^{∞} Σ^n. The length of a string s ∈ Σ* is denoted by |s|, and its elements by s(1) ... s(|s|); the concatenation of s and t ∈ Σ* is written
st. Denote by

k(x, x′) = Σ_s #(x, s) #(x′, s) c_s

a string kernel computed from exact matches. Here #(x, s) is the number of occurrences of s in x and c_s ≥ 0.

Vishwanathan and Smola [146] provide an algorithm using suffix trees, which allows one to compute for arbitrary c_s the value of the kernel k(x, x′) in O(|x| + |x′|) time and memory. Moreover, also f(x) = ⟨w, Φ(x)⟩ can be computed in O(|x|) time if preprocessing linear in the size of the support vectors is carried out. These kernels are then applied to function prediction (according to the gene ontology) of proteins using only their sequence information. Another prominent application of string kernels is in the field of splice form prediction and gene finding (Rätsch et al. [112]).

For inexact matches of a limited degree, typically up to ε = 3, and strings of bounded length, a similar data structure can be built by explicitly generating a dictionary of strings and their neighborhood in terms of a Hamming distance (Leslie et al. [92]). These kernels are defined by replacing #(x, s) by a mismatch function #(x, s, ε) which reports the number of approximate occurrences of s in x. By trading off computational complexity with storage (hence, the restriction to small numbers of mismatches), essentially linear-time algorithms can be designed. Whether a general purpose algorithm exists which allows for efficient comparisons of strings with mismatches in linear time is still an open question.

Mismatch kernels. In the general case it is only possible to find algorithms whose complexity is linear in the lengths of the documents being compared, and the length of the substrings, that is, O(|x| · |x′|) or worse. We now describe such a kernel with a specific choice of weights (Cristianini and Shawe-Taylor [37] and Watkins [151]).

Let us now form subsequences u of strings. Given an index sequence i := (i1, ..., i_|u|) with 1 ≤ i1 < ··· < i_|u| ≤ |s|, we define u := s(i) := s(i1) ... s(i_|u|).
We call l(i) := i_|u| − i1 + 1 the length of the subsequence in s. Note that if i is not contiguous, then l(i) > |u|.

The feature space built from strings of length n is defined to be H_n := R^(Σ^n). This notation means that the space has one dimension (or coordinate) for each element of Σ^n, labeled by that element (equivalently, we can think of it as the space of all real-valued functions on Σ^n). We can thus describe the feature map coordinate-wise for each u ∈ Σ^n via

[Φ_n(s)]_u := Σ_{i: s(i)=u} λ^{l(i)}.    (28)

Here, 0 < λ ≤ 1 is a decay parameter: The larger the length of the subsequence in s, the smaller the respective contribution to [Φ_n(s)]_u. The sum runs over all subsequences of s which equal u.

For instance, consider a dimension of H_3 spanned (i.e., labeled) by the string asd. In this case we have [Φ_3(Nasdaq)]_asd = λ³, while [Φ_3(lass das)]_asd = 2λ⁵. In the first string, asd is a contiguous substring. In the second string, it appears twice as a noncontiguous substring of length 5: the two occurrences select the letters a, s, d at positions (2, 3, 6) and (2, 4, 6) of lass das.
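The coordinate (28) can be computed by brute-force enumeration of index tuples, which is enough to verify the asd example above (exponential cost, for illustration only; efficient computation uses dynamic programming recursions):

```python
from itertools import combinations

def phi_coord(s, u, lam):
    # [Phi_n(s)]_u = sum over index tuples i with s(i) = u of lam^l(i),
    # where l(i) = i_|u| - i_1 + 1 is the span of the subsequence in s, as in (28).
    total = 0.0
    for idx in combinations(range(len(s)), len(u)):
        if all(s[i] == c for i, c in zip(idx, u)):
            total += lam ** (idx[-1] - idx[0] + 1)
    return total

lam = 0.5
assert phi_coord("Nasdaq", "asd", lam) == lam ** 3        # one contiguous occurrence
assert phi_coord("lass das", "asd", lam) == 2 * lam ** 5  # two occurrences of span 5
```

The subsequence kernel between two strings is then the dot product of these feature vectors, summed over all u of length n; the decay λ penalizes occurrences that are spread out.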


JOURNAL OF FORMALIZED MATHEMATICS
Volume 15, Released 2003, Published 2003
Inst. of Computer Science, Univ. of Białystok

Hilbert Space of Real Sequences

Noboru Endou
Gifu National College of Technology

Yasumasa Suzuki
Take, Yokosuka-shi, Japan

Yasunari Shidama
Shinshu University, Nagano

Summary. A continuation of [17]. As the example of real unitary spaces, we introduce the arithmetic addition and multiplication in the set of square summable real sequences and introduce the scalar product also. This set has the structure of a Hilbert space.

MML Identifier: RSSPACE2.

WWW: http://mizar.org/JFM/Vol15/rsspace2.html

The articles [15], [18], [4], [1], [16], [6], [19], [2], [3], [17], [10], [11], [12], [13], [9], [7], [8], [14], and [5] provide the notation and terminology for this paper.

1. HILBERT SPACE OF REAL SEQUENCES

One can prove the following propositions:

(1) The carrier of l2-Space = the set of l2-real sequences; for every set x holds x is an element of l2-Space iff x is a sequence of real numbers and idseq(x) idseq(x) is summable; for every set x holds x is a vector of l2-Space iff x is a sequence of real numbers and idseq(x) idseq(x) is summable; 0_{l2-Space} = Zeroseq; for every vector u of l2-Space holds u = idseq(u); for all vectors u, v of l2-Space holds u + v = idseq(u) + idseq(v); for every real number r and for every vector u of l2-Space holds r·u = r idseq(u); for every vector u of l2-Space holds −u = −idseq(u) and idseq(−u) = −idseq(u); for all vectors u, v of l2-Space holds u − v = idseq(u) − idseq(v); for all vectors v, w of l2-Space holds idseq(v) idseq(w) is summable; and for all vectors v, w of l2-Space holds (v|w) = ∑(idseq(v) idseq(w)).
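Proposition (1) defines the scalar product on l2-Space as the sum of the termwise products, (v|w) = ∑(idseq(v) idseq(w)). For square summable sequences this series converges absolutely; a small numeric sketch with illustrative geometric sequences (the truncation level N is arbitrary):

```python
# v(n) = (1/2)^n and w(n) = (1/3)^n are both square summable, and
# (v|w) = sum_n (1/6)^n = 1 / (1 - 1/6) = 6/5 by the geometric series.
N = 60  # truncation; the neglected tail is of order (1/6)^N
inner = sum((1 / 2) ** n * (1 / 3) ** n for n in range(N))
assert abs(inner - 6 / 5) < 1e-12
```

The same truncation argument is what makes the completeness proof below work: tails of Cauchy sequences in the l2 norm are uniformly small.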

(2) Let x, y, z be points of l2-Space and a be a real number. Then (x|x) = 0 iff x = 0_{l2-Space}, and 0 ≤ (x|x), and (x|y) = (y|x), and ((x + y)|z) = (x|z) + (y|z), and ((a·x)|y) = a·(x|y).

Let us observe that l2-Space is real unitary space-like. Next we state the proposition

(3) For every sequence v1 of l2-Space such that v1 is a Cauchy sequence holds v1 is convergent.

Let us observe that l2-Space is Hilbert and complete.

© Association of Mizar Users

2. MISCELLANEOUS

We now state several propositions:

(4) Let r1 be a sequence of real numbers. Suppose for every natural number n holds 0 ≤ r1(n) and r1 is summable. Then

(i) for every natural number n holds r1(n) ≤ (∑_{α=0}^{κ} r1(α))_{κ∈ℕ}(n),
(ii) for every natural number n holds 0 ≤ (∑_{α=0}^{κ} r1(α))_{κ∈ℕ}(n),
(iii) for every natural number n holds (∑_{α=0}^{κ} r1(α))_{κ∈ℕ}(n) ≤ ∑ r1, and
(iv) for every natural number n holds r1(n) ≤ ∑ r1.

(5) For all real numbers x, y holds (x + y)·(x + y) ≤ 2·x·x + 2·y·y, and for all real numbers x, y holds x·x ≤ 2·(x − y)·(x − y) + 2·y·y.
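Proposition (5) is the elementary quadratic inequality used in the convergence arguments of this section; it can be spot-checked over random reals (a throwaway sketch):

```python
import random

random.seed(0)
for _ in range(1000):
    x, y = random.uniform(-10, 10), random.uniform(-10, 10)
    # (x+y)^2 <= 2x^2 + 2y^2, since 2x^2 + 2y^2 - (x+y)^2 = (x-y)^2 >= 0
    assert (x + y) * (x + y) <= 2 * x * x + 2 * y * y + 1e-9
    # x^2 <= 2(x-y)^2 + 2y^2: the first inequality applied to (x-y) and y
    assert x * x <= 2 * (x - y) * (x - y) + 2 * y * y + 1e-9
```

The small tolerance only guards against floating-point round-off; both inequalities hold exactly over the reals.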

(6) Let e be a real number and s1 be a sequence of real numbers. Suppose s1 is convergent and there exists a natural number k such that for every natural number i such that k ≤ i holds s1(i) ≤ e. Then lim s1 ≤ e.

(7) Let c be a real number and s1 be a sequence of real numbers. Suppose s1 is convergent. Let r1 be a sequence of real numbers. Suppose that for every natural number i holds r1(i) = (s1(i) − c)·(s1(i) − c). Then r1 is convergent and lim r1 = (lim s1 − c)·(lim s1 − c).

(8) Let c be a real number and s1, s2 be sequences of real numbers. Suppose s1 is convergent and s2 is convergent. Let r1 be a sequence of real numbers. Suppose that for every natural number i holds r1(i) = (s1(i) − c)·(s1(i) − c) + s2(i). Then r1 is convergent and lim r1 = (lim s1 − c)·(lim s1 − c) + lim s2.

REFERENCES

[1] Grzegorz Bancerek. The ordinal numbers. Journal of Formalized Mathematics, 1, 1989. http://mizar.org/JFM/Vol1/ordinal1.html.

[2] Czesław Byliński. Functions and their basic properties. Journal of Formalized Mathematics, 1, 1989. http://mizar.org/JFM/Vol1/funct_1.html.

[3] Czesław Byliński. Functions from a set to a set. Journal of Formalized Mathematics, 1, 1989. http://mizar.org/JFM/Vol1/funct_2.html.

[4] Czesław Byliński. Some basic properties of sets. Journal of Formalized Mathematics, 1, 1989. http://mizar.org/JFM/Vol1/zfmisc_1.html.

[5] Noboru Endou, Yasumasa Suzuki, and Yasunari Shidama. Real linear space of real sequences. Journal of Formalized Mathematics, 15, 2003. http://mizar.org/JFM/Vol15/rsspace.html.

[6] Krzysztof Hryniewiecki. Basic properties of real numbers. Journal of Formalized Mathematics, 1, 1989. http://mizar.org/JFM/Vol1/real_1.html.

[7] Jarosław Kotowicz. Convergent sequences and the limit of sequences. Journal of Formalized Mathematics, 1, 1989. http://mizar.org/JFM/Vol1/seq_2.html.

[8] Jarosław Kotowicz. Monotone real sequences. Subsequences. Journal of Formalized Mathematics, 1, 1989. http://mizar.org/JFM/Vol1/seqm_3.html.

[9] Jarosław Kotowicz. Real sequences and basic operations on them. Journal of Formalized Mathematics, 1, 1989. http://mizar.org/JFM/Vol1/seq_1.html.

[10] Jan Popiołek. Real normed space. Journal of Formalized Mathematics, 2, 1990. http://mizar.org/JFM/Vol2/normsp_1.html.

[11] Jan Popiołek. Introduction to Banach and Hilbert spaces — part I. Journal of Formalized Mathematics, 3, 1991. http://mizar.org/JFM/Vol3/bhsp_1.html.

[12] Jan Popiołek. Introduction to Banach and Hilbert spaces — part II. Journal of Formalized Mathematics, 3, 1991. http://mizar.org/JFM/Vol3/bhsp_2.html.
