Generalized State Classes of Time Petri Nets for Timeliness Analysis

Predicate Classes, Promise Classes, and the Acceptance Power of Regular Languages

Inaugural dissertation for the attainment of the doctoral degree of the Combined Faculty of Natural Sciences and Mathematics of the Ruprecht-Karls-Universität Heidelberg, submitted by Diplom-Mathematiker Bernd Borchert from Lünne im Emsland, 1994

For my parents

Predicate Classes, Promise Classes, and the Acceptance Power of Regular Languages

Referees: Prof. Dr. Klaus Ambos-Spies, Universität Heidelberg; Prof. Dr. Klaus W. Wagner, Universität Würzburg
Date of the oral examination: 20 December 1994

Preface

This Ph.D. thesis was written in the time from January 1991 until July 1994 and is submitted to the Department of Mathematics of the University of Heidelberg. It contributes some results to Structural Complexity Theory, which is a subfield of Theoretical Computer Science.

First of all I have to thank my advisor Prof. Klaus Ambos-Spies for his continuing guidance and support. He also gave several crucial hints for the results of this thesis. Also I have to thank Prof. Juris Hartmanis and his former students Richard Chang, Suresh Chari, Desh Ranjan, and Pankaj Rohatgi. The initial results leading to this thesis were observed while I was visiting Cornell University in spring 1992.

This is the right place to express gratitude to Prof. Steven Homer and Prof. Akihiro Kanamori from Boston University, who six years ago led me to the interesting field of Structural Complexity Theory. Also I would like to thank Prof. Klaus Weihrauch, University of Hagen, for supervising my first diploma thesis, and Prof. Wolfgang Schönfeld, IBM Heidelberg, for his guidance while I was working in his group.

For helpful discussions I would like to thank Andreas Eisenblätter, Ulrich Hertrampf, Birgit Jenner, Klaus-Jörn Lange, Pierre McKenzie, Wolfgang Merkle, Andre Nies, Thomas Schwentick, Nikolai Vereshchagin, and Heribert Vollmer. I am grateful to Prof. Klaus W. Wagner, University of Würzburg, who agreed to referee this thesis.

The results of Part I of the thesis were presented at the 9th Annual IEEE Conference of Structure in Complexity Theory 1994, see [Bo94b]; a journal version will be submitted. The results of Part II were presented at the 11th Annual Symposium of Theoretical Aspects of Computer Science (STACS) 1994, see [Bo94a]; a journal version will appear in Theoretical Computer Science.

Heidelberg, July 1994

Contents

1 Introduction
  1.1 Outline of Part I
  1.2 Outline of Part II
  1.3 Related Work

I Predicate Classes and Promise Classes

2 Preliminaries
  2.1 Order-Theoretic Notions
  2.2 Recursively Presentable Classes
  2.3 Polynomial Time Many-One Reducibility
  2.4 Polynomial Time Many-One Degrees
  2.5 Principal Ideals
  2.6 Ideals
  2.7 Computation Trees
3 Predicate Classes
  3.1 The Definition of Predicate Classes
  3.2 The Characterization of Predicate Classes
4 Promise Classes
  4.1 The Definition of Promise Classes
  4.2 The Characterization of the Promise Classes
  4.3 Consequences of the Characterization of the Promise Classes
5 Analogous Results for Other Nondeterministic Computation Models
  5.1 Balanced Polynomial Time Turing Machines
  5.2 Polynomial Time Bit-Reducibility
  5.3 Polynomial Time Nondeterministic Transducers
  5.4 Polynomial Time Function Classes
  5.5 Relativized Predicate Classes

II On the Acceptance Power of Regular Languages

6 Predicate Classes Accepted by Regular Languages
  6.1 Predicate Classes Accepted by Languages
  6.2 The Definition of Regular Languages
  6.3 Predicate Classes Accepted by Regular Languages
7 A Lemma about Regular Languages
  7.1 o-h-Reducibility
  7.2 Generalized Definite Languages
  7.3 The Main Lemma
8 A Result for Classes Accepted by Regular Languages
  8.1 The Main Result
  8.2 A Non-Density Result on the Assumption that PH does not Collapse
  8.3 A Non-Density Result for the Relativized Case
  8.4 An Analogous Result for the Log-Space Case

References
Subject Index
Symbol Index
Index of Classes

1 Introduction

Part I of this thesis observes a close connection between two basic concepts of Structural Complexity Theory, both introduced by Karp in [Ka72]:

1. The concept of polynomial time many-one reducibility, which since its definition was studied intensively, see for example [La75,AS85a].
2. The concept of polynomial time nondeterministic computation, in the slightly more general sense as it is used to define not only the class NP (like in the original paper) but also classes like ⊕P, PP, UP, BPP, and RP.

Part II of this thesis relates the concepts of Part I to the notion of a regular language. More detailed outlines of the two parts and references to related work are given below.

1.1 Outline of Part I

Several complexity classes, like NP, ⊕P, and PP, are defined (say accepted) by a predicate on computation trees produced by polynomial time nondeterministic Turing machine computations. Such classes will be called predicate classes. For example NP is accepted by the predicate on computation trees which is 1 if and only if the tree contains a leaf with label 1. As another example, ⊕P is accepted by the predicate on computation trees which is 1 if and only if the tree contains an odd number of leaves with label 1.

Call a class a principal ideal if with respect to polynomial time many-one reducibility it has a complete set and is closed downward. It is well known that the example classes NP, ⊕P, and PP are principal ideals. This observation can be generalized:

The set of predicate classes is equal to the set of principal ideals.

After the preliminary definitions and observations in Chapter 2 this theorem will be shown in Chapter 3.

In Chapter 4 complexity classes like UP, BPP, and RP will be considered. These classes have in common that their original definition can be seen the following way: there is a {0, 1, ⊥}-valued function, called promise function, on computation trees where it is presumed (= 'promised') for each machine accepting a language in the class that for each input the promise function is not ⊥ for the corresponding computation tree. Such classes will be called promise classes. For example UP is defined (say accepted) by the promise function on computation trees which has the value 0 if the tree does not contain a leaf with label 1, which has the value 1 if the tree contains exactly one leaf with label 1, and which has the value ⊥ if the tree contains more than one leaf with label 1.

Call a class an ideal if with respect to polynomial time many-one reducibility it is closed downward and closed under join. It is easy to see that the example classes UP, BPP, and RP are countable ideals. Like before, this observation can be generalized:

The set of promise classes is equal to the set of countable ideals.

The two characterizations of predicate classes and promise classes described above, and their corresponding versions for the recursive case, are the two main results of Part I of this thesis. In Chapter 5 analogous results for some other models of nondeterministic computation will be shown.

1.2 Outline of Part II

In Part II predicates with a low complexity will be considered: the predicates which are determined by a regular language for the yields of computation trees (the yield is the left-to-right concatenation of the leaf labels). For example, NP is accepted by the predicate determined by the regular language which consists of the words containing at least one letter 1. The main result of Part II will be that if the class determined by a (nontrivial) regular language is not equal to P then the class contains at least one of the classes NP, co-NP, and MOD_p P for p prime.

This will be interpreted as a non-density result in two ways: (1) on the assumption that the Polynomial Time Hierarchy does not collapse, and (2) for the relativized case. Additionally, the analog of the main result for the log-space case is shown.

1.3 Related Work

Work similar to that of Part I was done by Bovet, Crescenzi, and Silvestri in [BCS91,BCS92], by Vereshchagin in [Ve93], by Hertrampf, Lautemann, Schwentick, Vollmer, and Wagner in [HL*93], and by Jenner, McKenzie, and Thérien in [JMT94]. As in Part I of this thesis, also in these papers the
definability of complexity classes with the help of nondeterministic computation models is investigated.

The classes determined by regular languages, see Part II, were first considered by Hertrampf, Lautemann, Schwentick, Vollmer, and Wagner in [HL*93]. These classes are a special case (namely the associative case) of the classes determined by locally definable acceptance types defined by Hertrampf in [Her92a,Her94b]. On the other hand the mod-classes and the classes determined by finite acceptance types, considered systematically in [Her90,Bei91,BG92] and [GW87,Her94a], respectively, are classes which are by definition determined by regular languages.

Part I
Predicate Classes and Promise Classes

2 Preliminaries

First some standard order-theoretic notions will be defined in Section 2.1. The well-known concept of recursively presentable classes will be defined in Section 2.2. Then the polynomial time many-one reducibility and its notions of degrees, principal ideals and ideals are presented in Sections 2.3–2.6. Section 2.7 introduces computation trees.

2.1 Order-Theoretic Notions

The following order-theoretic notions are standard, see for example [Gr78]. A binary relation ≤ on a set M is a subset of M × M. Only the infix notation will be used, i.e. x ≤ y stands for (x, y) ∈ ≤. A binary relation ≤ is reflexive if x ≤ x for all x ∈ M, it is transitive if from x ≤ y and y ≤ z it follows x ≤ z, it is symmetric if from x ≤ y it follows y ≤ x, and it is antisymmetric if from x ≤ y and y ≤ x it follows x = y. A preorder is a reflexive and transitive binary relation on a nonempty set. A partial order is a preorder which is antisymmetric, and an equivalence relation is a preorder which is symmetric.

A binary relation ≤1 on a set M1 and a binary relation ≤2 on a set M2 are called isomorphic if there exists an isomorphism, i.e. a bijective mapping f from M1 to M2 such that x ≤1 y ⟺ f(x) ≤2 f(y).

Let ≤ be a preorder on a set M. An element x is called ≤-complete for a subset A ⊆ M if x ∈ A and y ≤ x holds for all y ∈ A. A ≤-minimum (≤-maximum) is an element x ∈ M such that x ≤ y (y ≤ x) for all y ∈ M. For two elements x, y of M a ≤-supremum (≤-infimum) of x and y is an element z such that x ≤ z and y ≤ z (z ≤ x and z ≤ y) and if also for another element z' it holds that x ≤ z' and y ≤ z' (z' ≤ x and z' ≤ y) then z ≤ z' (z' ≤ z). If it 
is clear from the context that one is dealing with a preorder ,a –supremum will just be called ,this will be done the same way for other order-theoretic notions.ΣΣ()()2.2Recursively Presentable ClassesPredicate Classes and Promise Classesupper semi-lattice lattice distributive -chain -antichain -atom atomic dense language words class recursively presentable recursive language 6Part I Let be a partial order on a set .Note that for a partial order the supremum (infimum)of two elements,if it exists,is unique.The partial order is called an if the supremum exists for every pair of elements,it is called a if both supremum and infimum exist for every pair of elements.A binary relation on a set is called an (upper semi-)sublattice of an (upper semi-)lattice on a set if is a subset of ,is the restriction of to ,and is closed under -suprema (and -infima).Note that an (upper semi-)sublattice is an (upper semi-)lattice.An (upper semi-)lattice is called if for all elements the following holds:if ,where is the supremum of and ,then there are elements such that ,,and is the supremum of and .Let be a partial order on a set .Two elements are called -comparable if or .A subset is called a if any two elements of are comparable,is called an if any two elements are incomparable.If the minimum exists then an element =is called anif no element =exists such that .The partial order is called if for every element =there is an atom such that .A partial order is called if for any two comparable but different elements there is an element properly between them,formally:for all for which but notthere is a such that and but neither nor .A partial order which contains an atom is obviously not dense because there is no element properly between the minimum and the atom.In this thesis a will always be a set of words over the alphabet =01,for basic definitions like the one of see for example [HU79].A is a set of languages.Let be the (+1)st word of in the length-lexicographic order,see for example 
[AS89].For a language and an I N define to be the languagewhere is a usual bijective polynomial time computable pairingfunction,see for example [BDG88].Call,like in [BDG88,AS89],a complexityclass if =I N for some recursive ,forthe notion of a see for example [HU79].ΣΣΣΣΣPreliminaries 2.3Polynomial Time Many-One Reducibility2.4Polynomial Time Many-One Degreesjoin polynomial time many-one equivalent polynomial time many-one degree of a language trivial recursive 7Let FP denote the class of functions which can be computed by a Tur-ing machine running in deterministic polynomial time,see for example [HU79,BDG88]for a more detailed definition.Let denote the polynomial time many-one reducibility among languages,i.e.if there exists a function FP such that ()for all words .The original definition of this reducibility is from Karp in [Ka72].It is easy to see that the binary relation is a preorder on the set of all languages.Define the of two languagesto be the language 01.The join is a -supremum of and :and for all languages :.The following enumeration of FP will be useful.Let in some straightforward way the deterministic Turing machines which compute functions be encoded by words.Define for every the function FP to be the function computed by the following polynomial time deterministic Turing machine:on input the machine simulates the computation of the deterministic Turing machine encoded by and cancels the simulation –with output –if the simulated machine has not terminated after +steps.It is easy to see that FP =I N and that the function which maps to ()is recursive.For the notions of this section see for example [La75,AS85a].Two languagesare called ,in short ,ifand .Note that is an equivalence relation on the set of all languages.Let the ,in short deg (),be the set of languages polynomial time many-one equivalent to ,and let denote the partial order on the -degrees defined bydeg ()deg ():Note that this definition does not depend on the choice of and .The degree deg 
()is the unique -supremum of deg ()and deg ().This shows that is an upper semi-lattice on the set of all polyno-mial time many-one degrees.The two degrees and are called the degrees.Call a degree deg ()if and therefore all languages in deg ()are recursive.ΣΣΣΣ2.5Principal IdealsPredicate Classes and Promise ClassesTheorem 2.1(Ladner 1975,Ambos-Spies 1985a)The following partial orders are distributive upper semi-lattices,the one in (b)is an upper semi-sublattice of the one in (a).(a)The partial order on the set of all nontrivial polynomial time many-one degrees.(b)The partial order on the set of the nontrivial recursive polynomial time many-one degrees.This upper semi-lattice is additionally dense.principal-ideal principal ideal lower cone of downward closure of principal ideal trivial8Part I Many results are known for the partial order .In this thesis only the following basic results about density and distributivity due to Ladner in [La75]and Ambos-Spies in [AS85a],respectively,will be considered.Call a set of languages a ,or simply ,if there exists a language such that =():=.There areseveral other names for the class (),sometimes it is called or .For the choice of the name see the next section.Note that the language is -complete for ().The following classes are examples of principal ideals.The class NP should be mentioned first as an example of a principal plete languages for NP,like the problem SAT,were –for polynomial time Turing reducibility –first presented by Cook in [Co71],their -completeness was shown in [Ka72,Le73].For a list of -complete prob-lems for NP see [GJ79].The fact that for example SAT is not only -complete for NP but also every language -reducible to SAT is in NP is easy to see.The two principal ideals=(),=()will be called principal ideals.The class P is a principal ideal ()where is any language in P.P contains the two trivial principal ideals and is contained in every nontrivial principal ideal,see Figure 1.=2ΣΣΠ∆Preliminaries 9PFigure 1:The 
two trivial principal ideals and PNot only P and NP but also all other classes and of the Polyno-mial Time Hierarchy are principal ideals,the existence of -complete lan-guages was shown in [St77,Wr77].Let be any language.Then the class P consisting of all languages com-putable in polynomial time with oracle (see for example [Co71,BDG88]and also Section 5.5)is a principal (many-one)ideal according to the results in [AS86a],see also Corollary 5.8.As a special case,the classes of the Polynomial Time Hierarchy are principal ideals.The classes NP(n)and co-NP(n)of the Boolean Hierarchy are principal ide-als,the existence of -complete languages was shown in [CG*88].Counting classes like PP,C P,MOD P,P =MOD P,US =1–NP are principal ideals,for the original definitions see [Gi77,Wa86b,BG92,PZ83,BGu82,GW87].The exponential time classes EXPTIME =DTIME(2poly )and NEXPTIME =NTIME(2poly )are principal ideals.More generally,the classes -EXPTIME=DTIME(2...2poly )and -NEXPTIME =NTIME(2...2poly ),where inboth cases the exponentiation tower has height ,can easily be shown to be principal ideals for every 1.For the exact definitions see for example [Jo90].()()()Predicate Classes and Promise Classes Proposition 2.1Proof.Corollary 2.1Proposition 2.2Proof.For all languages it holds:()()For all languages it holds:()()Let be a language.()is recursively presentable is recursiveis recursive.10Part I There is a strong connection between the inclusion order on the principal ideals and the partial order on the polynomial time many-one degrees.deg ()deg ()Note that the second equivalence holds by the definition of.In order to see the first equivalence assume that ()().Because()it holds by the assumption that (),this shows .If on the other side then for each it holds by the transitivity of that ,this shows ()()In other words,the inclusion order on the principal ideals is isomorphic to the -order on the polynomial time many-one degrees.It is clear by the proof that this isomorphism between the 
partial order on the degrees and the inclusion order on the principal ideals exists not only for but for every preorder.The following corollary follows immediately from Proposition 2.1.=deg ()=deg ()By the following proposition the property of being recursively presentable is determined for a principal ideal by any -complete language.deg ()Note that the second equivalence was already mentioned in the definition of the recursiveness of a -degree.In order to see the first equivalence assume that ()=I N forsome recursive language .Then =for some I N.But if is recursivethen also =is.For the other direction let a recursive language be given.Define the language :=(),the functions were defined in Section 2.3.It is easy to see that is recursive and that ()==I N :()2.6IdealsPreliminariesCorollary 2.2Proposition 2.3Proof.The following partial orders are distributive upper semi-lattices.The one in (b)is an upper semi-sublattice of the one in (a).(a)The inclusion order on the set of all nontrivial principal ideals.(b)The inclusion order on the set of all nontrivial recursively presentable principal ideals.This upper semi-lattice is additionally dense.-ideal ideal ideal (a)The principal ideals are exactly the ideals which have a-complete language.(b)The recursively presentable principal ideals are exactly the ideals which have a recursive -complete language.11:()=I N .This finishes the proof of the fact that the class ()is recursively presentable if and only if is recursive.By Propositions 2.1and 2.2the results for the polynomial time many-one de-grees stated in Theorem 2.1can be transfered to the principal ideals immediately.A ,or simply ,is a nonempty set of languages such that if lan-guages and are in then each language with is also in .In other words,an ideal is a nonempty set of languages which is closed under join und closed downward.The name follows the notation in Lattice Theory,see for example Gr¨a tzer [Gr78].The following proposition shows the relation between 
between ideals and prin-cipal ideals.(a)Let a principal ideal()be given.is -complete for ()because is in ()and by definition all languages in ()are -reducible to .It remains to show that ()is an ideal.Let ()and.Then by the supremum property of the join.By the transitivity of it follows that ().Therefore,()is an ideal.For the other direction let an ideal have a -complete language .It will be shown that =().It holds ()because every language in is -reducible to .And it holds ()because and I is closed downward.ΣΣΠΣI N I N 1trivial ideals Predicate Classes and Promise Classes12Part I Part (b)follows from part (a)and Proposition 2.2.The two trivial principal ideals and will be also be called .Like in the case of principal ideals it is easy to see that the ideal P contains the two trivial ideals and every nontrivial ideal contains P.Some examples of classes are given which are ideals but which are not princi-pal ideals or not known to be principal ideals.Classes like UP (defined in [Va76]),BPP,RP (both defined in [Gi77]),FewP (defined as FNP in [Al86]),and AM (defined in [Ba85]),are easily shown to be (recursively presentable)ideals.These classes are not known to be principal ideals,see [Si82,Kow84,AS86b,HH88,Hem88,AS89].In Proposition 2.5it will be shown that pairwise intersections of nontrivial (recursively presentable)ideals are nontrivial (recursively presentable)ide-als,like (for1)or ZPP =RP co-RP.Generally,it is not known if such an intersection is a principal ideal.For a discussion of this question for the class NP co-NP see [Si82,Kow84,HI85,Hem88,AS89].(Effective)infinite unions of increasing sequences of (recursively presentable)ideals,like the class of languages of the Polynomial Time Hierarchy PH=and the class of languages of the Boolean Hierarchy BH=NP(n),are (recursively presentable)ideals which are in general not known to be principal ideals.For the original definitions of PH and BH see [St77,Wr77,CG*88].Let the class ELEMENT ARY be the union -EXPTIME,for 
the defini-tion of the classes -EXPTIME for 1see the examples of principal ide-als in Section 2.5.The class ELEMENTARY is a (recursively presentable)ideal which is provably not a principal ideal because by the Time Hierarchy Theorem of [HS65]it can be shown for each 1that -EXPTIME is a proper subset of (+1)-EXPTIME.The class of all recursive languages is a countable ideal but neither a prin-cipal ideal nor a recursively presentable ideal.The class P/poly defined by Karp and Lipton in [KL80,KL82]can easily be shown to be an ideal.It is not countable,but for example P/poly NP is a countable ideal.PreliminariesProposition 2.4Proof.Proposition 2.5(a)Every recursively presentable ideal is a countable ideal.(b)Every principal ideal is a countable ideal.(c)There is a recursively presentable ideal which is not a principal ideal.(d)There is a principal ideal which is not recursively presentable.(e)There is an ideal which is not countable.The following partial orders are distributive lattices,the one in (b)is a sublattice of the one in (a),and the one in (c)is a sublattice of the ones in (a)and (b).(a)The inclusion order on the set of all nontrivial ideals.(b)The inclusion order on the set of all nontrivial countable ideals.(c)The inclusion order on the set of all nontrivial recursively presentable ideals.In (a),(b)and (c)the infimum of two nontrivial ideals and is given by their intersection,and the supremum is given by the smallest ideal containing both and .13The class of all languages is an ideal but not countable.The following classes are not ideals.The class E =DTIME(2lin )is recursively presentable,has a -complete language,and is closed under join but is not an ideal because it is not closed downward.A polynomial time many-one degree is an ideal if and only if it is one of the two trivial degrees because otherwise it is not closed downward.For two -incomparable recursive languages the class ()()is recursively presentable and closed downward but is not an ideal because 
it is not closed under join.The relation of the different types of ideals introduced so far is described by the following Proposition 2.4,see also Figure 2.(a)Every recursively presentable class is by definition countable.(b)A principal ideal ()is by Proposition 2.3an ideal,and it is countable because there are at most countably many -reductions.A witness for (c)is the class ELEMENTARY,see the examples above,and a witness for (d)is according to Propositions 2.2and 2.3any class ()for a non-recursive .A witness for (e)is the class P/poly,see the examples above.()()Proof.Predicate Classes and Promise Classes14Part I recursively presentable principal idealsrecursively presentable ideals principal idealscountable idealsidealsFigure 2:Types of Ideals(a)Let two nontrivial ideals and be given.It will be shown thatis a nontrivial ideal.Therefore,is the infimum of and .Because both and contain the class P also the class contains P and is not empty.Letfor two languages ,then both and containand .Therefore,they contain also by their property of being an ideal.This shows that is a nontrivial ideal.The supremum of and is the class =:.It is easy to see that this class is an ideal,that it contains both and ,and that it is contained in every ideal containing both and .For the distributivity let be as above and let be contained in .Now it is shown that the supremum of =and =equals .and and therefore also their supremum are contained in .For the other direction it obviously suffices to show that for every language in the principal ideal ()is the supremum of two principal ideals in and ,respectively.By definition of there exist sets such that ,in other words ()is contained in the supremum of ()and ().By the distributivity of the principal ideals there are languages ()()such that ()is the supremum of ()and ().(b)By the property of a sublattice to be a lattice it suffices to observe that for two given countable nontrivial ideals and the classes and from (a)are countable.The 
distributivity follows like in (a).(c)It suffices like in (b)to show that for two recursively presentable ideals=I N and =I N (for recursive languages and )Preliminaries ()()()()()()()()()2.7Computation TreesTheorem 2.2(Ambos-Spies 1986)Theorem 2.3(Shinoda &Slaman 1990)Proposition 2.6exact pair theorems For a recursively presentable ideal thereexist two recursive languages and such that ()().For a countable ideal there exist two languages and such that ()().(a)The nontrivial countable ideals are exactly the pairwise in-tersections of the nontrivial principal ideals.(b)The nontrivial recursively pre-sentable ideals are exactly the pairwise intersections of the nontrivial recursively presentable principal ideals.15the classes and from (a)are also recursively presentable.Define tobe the recursive language and for all with itholds that .It will be shown that this constructionguarantees that =I N .The inclusion from left to right is obvious.For the other direction consider a fixed :if is infinite,then it is also an (infinite)language of both and ,and if is finite then it is in P and thereforein .This shows =I N .To represent the class define tobe the language (),where was definedin Section 2.3.It is easy to check that is recursive,and by construction it holdsthat =I N .The distributivity follows like in (a).The following two results –called –of Ambos-Spies in[AS86b]and Shinoda and Slaman in [ShS90]relate the notions of ideals and prin-cipal ideals more closely.The latter was shown to hold for the polynomial time Turing reducibility but the proof is also valid for the many-one case.==For the nontrivial ideals the two Theorems 2.2and 2.3can by Proposition 2.5be expressed the following way.Consider nondeterministic Turing machines as presented for example in [BDG88].In this thesis it is additionally assumed for a Turing machine that for a state and a tupel of symbols read by the heads at most two transitions are specified by the transition function,and also it is assumed 
that the transition function is given as a linear list.This way it is guaranteed that if nondeterminism appears during a。
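
The introduction's examples of predicates on computation trees (NP, ⊕P, the UP promise function, and acceptance by a regular language on the yield) can be illustrated with a small Python sketch. The encoding of a computation tree as nested pairs with leaf labels 0 and 1, the use of `None` for the undefined value ⊥, and all function names are assumptions made for illustration only; they are not taken from the thesis.

```python
import re
from typing import Optional, Union

# Assumed encoding: a tree is a leaf label (0 or 1) or a pair (left, right).
Tree = Union[int, tuple]


def tree_yield(tree: Tree) -> str:
    """Left-to-right concatenation of the leaf labels of the tree."""
    if isinstance(tree, int):
        return str(tree)
    left, right = tree
    return tree_yield(left) + tree_yield(right)


def np_predicate(tree: Tree) -> int:
    """1 iff the tree contains a leaf with label 1 (the NP example)."""
    return 1 if "1" in tree_yield(tree) else 0


def parity_predicate(tree: Tree) -> int:
    """1 iff the number of leaves with label 1 is odd (the ⊕P example)."""
    return tree_yield(tree).count("1") % 2


def up_promise(tree: Tree) -> Optional[int]:
    """UP promise function: 0 for no 1-leaf, 1 for exactly one, None (⊥) otherwise."""
    ones = tree_yield(tree).count("1")
    return ones if ones <= 1 else None


def regular_predicate(pattern: str, tree: Tree) -> int:
    """1 iff the yield of the tree belongs to the regular language `pattern`."""
    return 1 if re.fullmatch(pattern, tree_yield(tree)) else 0


if __name__ == "__main__":
    tree = ((0, 1), (1, 0))  # yield "0110"
    print(tree_yield(tree), np_predicate(tree), parity_predicate(tree), up_promise(tree))
    # The regular language of words containing at least one letter 1.
    print(regular_predicate(r"[01]*1[01]*", tree))
```

On the tree `((0, 1), (1, 0))` the NP predicate and the regular-language predicate agree, which mirrors the remark that NP is accepted by the regular language of words containing at least one letter 1; the UP promise function is undefined there because the tree has two leaves with label 1.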

Generalised chiral QED2 Anomaly and Exotic Statistics

a r X i v :h e p -t h /9705178v 1 23 M a y 1997Generalized Chiral QED 2:Anomaly and Exotic StatisticsFuad M.SaradzhevInstitute of Physics,Academy of Sciences of Azerbaijan,Huseyn Javid pr.33,370143Baku,AZERBAIJANABSTRACTWe study the influence of the anomaly on the physical quantum picture of the generalized chiral Schwinger model defined on S 1.We show that the anomaly i)results in the background linearly rising electric field and ii)makes the spectrum of the physical Hamiltonian nonrelativistic without a massive boson.The physical matter fields acquire exotic statistics.We construct explicitly the algebra of the Poincare generators and show that it differs from the Poincare one.We exhibit the role of the vacuum Berry phase in the failure of the Poincare algebra to close.We prove that,in spite of the background electric field,such phenomenon as the total screening of external charges characteristic for the standard Schwinger model takes place in the generalized chiral Schwinger model,too.PACS numbers:03.70+k ,11.10.Mn.1IntroductionThe two-dimensional QED with massless fermions,i.e.the Schwinger model(SM),demonstrates such phenomena as the dynamical mass generation and the total screening of the charge[1].Although the Lagrangian of the SM contains only masslessfields,a massive bosonfield emerges out of the interplay of the dynamics that govern the originalfields.This mass generation is due to the complete compensation of any external charge inserted into the vacuum.In the chiral Schwinger model(CSM)[2,3]the right and left chiral components of the fermionic field have different charges.The left-right asymmetric matter content leads to an anomaly.At the quantum level,the local gauge symmetry is not realized by a unitary action of the gauge symmetry group on Hilbert space.The Hilbert space furnishes a projective representation of the symmetry group[4,5,6].In this paper,we aim to study the influence of the anomaly on the physical quantum picture of the CSM.Do the 
dynamical mass generation and the total screening of charges take place also in the CSM?Are there any new physical effects caused just by the left-right asymmetry?These are the questions which we want to answer.To get the physical quantum picture of the CSM we needfirst to construct a self-consistent quantum theory of the model and then solve all the quantum constraints.In the quantization procedure,the anomaly manifests itself through a special Schwinger term in the commutator algebra of the Gauss law generators.This term changes the nature of the Gauss law constraint:instead of beingfirst-class constraint,it turns into second-class one.As a consequence,the physical quantum states cannot be defined as annihilated by the Gauss law generator.There are different approaches to overcome this problem and to consistently quantize the CSM. The fact that the second class constraint appears only after quantization means that the number of degrees of freedom of the quantum theory is larger than that of the classical theory.To keep the Gauss law constraintfirst-class,Faddeev and Shatashvili proposed adding an auxiliaryfield in such a way that the dynamical content of the model does not change[7].At the same time,after quantization it is the auxiliaryfield that furnishes the additional”irrelevant”quantum degrees of freedom.The auxiliaryfield is described by the Wess-Zumino term.When this term is added to the Lagrangian of the original model,a new,anomaly-free model is obtained.Subsequent canonical quantization of the new model is achieved by the Dirac procedure.For the CSM,the correspondig WZ-term is not defined uniquely.It contains the so called Jackiw-Rajaraman parameter a>1.This parameter reflects an ambiguity in the bosonization procedure and in the construction of the WZ-term.The spectrum of the new,anomaly-free model turns out to be relativistic and contains a relativistic boson.However,the mass of the boson also depends on the Jackiw-Rajaraman parameter[2,3].This mass 
corresponds therefore to the "irrelevant" quantum degrees of freedom. The quantum theory with such a parameter in the spectrum is not the physical one, i.e. not that final version of the quantum theory which we would like to obtain. The latter should not contain any nonphysical parameters, otherwise one cannot say anything about a physical quantum picture.

In another approach, also formulated by Faddeev [8], the auxiliary field is not added, so the quantum Gauss law constraint remains second-class. The standard Gauss law is assumed to be regained as a statement valid in matrix elements between some states of the total Hilbert space, and it is these states that are called physical. The theory is regularized in such a way that the quantum Hamiltonian commutes with the nonmodified, i.e. second-class, quantum Gauss law constraint. The spectrum turns out to be non-relativistic [9,10].

Here, we follow the approach given in our previous work [11]. The peculiarity of the CSM is that its anomalous behaviour is trivial in the sense that the second-class constraint which appears after quantization can be turned into a first-class one by a simple redefinition of the canonical variables. This allows us to formulate a modified Gauss law to constrain physical states. The physical states are gauge-invariant up to a phase, the phase being a 1-cocycle of the gauge symmetry group algebra. In [12,13,14], the modification of the Gauss law constraint is obtained by making use of the adiabatic approach.

Contrary to [11], where the CSM is defined on R^1, we suppose here that space is a circle of length L, −L/2 ≤ x < L/2, so the space-time manifold is a cylinder S^1 × R^1. The gauge field then acquires a global physical degree of freedom represented by the non-integrable phase of the Wilson integral on S^1. We show that this brings new features of principle into the physical quantum picture.

Another way of making two-dimensional gauge field dynamics nontrivial is by fixing the spatial asymptotics of the gauge field [15,16]. If we assume that the gauge field defined on R^1 diminishes rather
rapidly at spatial infinities, then it again acquires a global physical degree of freedom. We will see that the physical quantum picture for the model defined on S^1 is equivalent to that obtained in [15,16].

We consider the general version of the CSM with a U(1) gauge field coupled with different charges to both chiral components of a fermionic field. We show that the charges are not arbitrary, but satisfy a quantization condition. The SM, where these charges are equal, is a special case of the generalized CSM. This will allow us at each step of our consideration to see the distinction between the two models.

We work in the temporal gauge A_0 = 0 in the framework of the canonical quantization scheme and Dirac's quantization method for constrained systems [17]. We use the system of units where c = 1.

In Section 2, we quantize our model in two steps. First, the matter fields are quantized, while A_1 is handled as a classical background field. The gauge field A_1 is quantized afterwards, using the functional Schrodinger representation. We derive the anomalous commutators with nonvanishing Schwinger terms which indicate that our model is anomalous.

In Section 3, we show that the Schwinger term in the commutator of the Gauss law generators is removed by a redefinition of these generators and formulate the modified quantum Gauss law constraint. We prove that this constraint can also be obtained by using the adiabatic approximation and the notion of quantum holonomy.

In Section 4, we construct the physical quantum Hamiltonian consistent with the modified quantum Gauss law constraint, i.e. invariant under the modified gauge transformations, both topologically trivial and non-trivial. We introduce the modified topologically non-trivial gauge transformation operator and define θ-states, which are its eigenstates. We consider in detail the case of the SM and demonstrate its equivalence to the free field theory of a massive scalar field. For the generalized CSM, we define the exotic statistics matter field and reformulate
the quantum theory in terms of thisfield.In Section5,we construct two other Poincare generators,i.e.the momentum and the boost.We act in the same way as before with the Hamiltonian,namely we define the physical generators as those which are invariant under both topologically trivial and non-trivial gauge transformations.We show that the algebra of the constructed generators is not a Poincare one and that the failure of the Poincare algebra to close is connected to the nonvanishing vacuum Berry curvature.In Section6,we study the charge screening.We introduce external charges and calculate(i)the energy of the ground state of the physical Hamiltonian with the external charges and(ii)the current density induced by these charges.Section7contains our conclusions and discussion.2Quantization Procedure2.1Classical TheoryThe Lagrangian density of the generalized CSM isL=−10,1,γ0=σ1,γ1=−iσ2,γ0γ1=γ5=σ3,σi(i=2(1±γ5)ψ.In the temporal gauge A0=0,the Hamiltonian density isH=H EM+H F,(2) where H EM=12)=A1(L2)=ψ±(L¯he±λ}ψ±,generated byG=∂1E+e+j++e−j−,λbeing a gauge function,as well as under global gauge transformations of the right-handed and left-handed Diracfields which are generated byQ±=e± L/2−L/2dxj±(x).Due to the gauge invariance,the Hamiltonian density is not uniquely determined.On the con-strained submanifold G≈0of the full phase space,the Hamiltonian density˜H=H+v H·G,(4) where v H is an arbitrary Lagrange multiplier which can be any function of thefield variables and their momenta,reduces to the Hamiltonian density H.In this sense,our theory cannot distinguish between H and˜H,and so both Hamiltonian densities are physically equivalent to each other.For arbitrary e+,e−the gauge transformations do not respect the boundary conditions 3.The gauge transformations compatible with the boundary conditions must be either of the formλ(L2)+¯h2πe+=N,N∈Z,(6)or of the formλ(L2)+¯h2πe−=N∈Z.(7)Eqs.6or7imply the charge quantization condition for our system.Without loss of generality, we 
choose the condition 6.For N=1,e−=e+and we have the standard Schwinger model.For N=0,we get the model in which only the right-handed component of the Diracfield is coupled to the gaugefield.From Eq.5we see that the gauge transformations under consideration are divided into topo-logical classes characterized by the integer n.Ifλ(L2),then the gauge transformation istopologically trivial and belongs to the n=0class.If n=0it is nontrivial and has winding number n.Given Eq.5,the nonintegrable phaseΓ(A)=exp{i¯he+L b(t)}.In contrast toΓ(A),the line integralb(t)=1e+Ln.By a non-trivial gauge transformation of the formg n=exp{i2πe+L].The configurations b=0and b=¯h2πe+L.2.2Quantization and AnomalyThe eigenfunctions and the eigenvalues of thefirst quantized fermionic Hamiltonians ared± x|n;± =±εn,± x|n;± ,wherex|n;± =1L exp{i¯hεn,±·x},εn,±=2π2π).We see that the spectrum of the eigenvalues depends on b.For e+b Le+L,the energies ofεn,+decrease by¯h2πL N.Some of energy levels change sign.However,the spectrum atthe configurations b=0and b=¯h2π2π¯h (and e−b L2π¯h]and{e±b L2π¯h],a†n|vac;A;+ =0for n≤[e+b Landb n |vac;A ;− =0for n ≤[e −b L2π¯h ].(11)Excited states are constructed by operating creation operators on the Fock vacuum.In the ζ–function regularization scheme,we define the action of the functional derivative on first quantized fermionic kets and bras byδδA 1(x )|n ;± ·|λεm,±|−s/2,n ;±|←δδA 1(x )|m ;± m ;±|·|λεm,±|−s/2.From 8we get the action ofδδA 1(x )a n =−lim s →0m ∈Zn ;+|δδA 1(x )a †n=lims →0m ∈Zm ;+|δδA 1(x )on b n ,b †n can be written analogously.Next we define the quantum fermionic currents and fermionic parts of the second-quantized Hamiltonian asˆj s ±(x )=12L /2−L /2dx (ψ†s ±d ±ψs ±−ψs ±d ⋆±ψ†s±).Substituting 8into these expressions,we obtainˆj s ±(x )=n ∈Z1Lnx }ρs ±(n ),whereρs +(n )≡k ∈Z12[b †k ,b k +n ]−·|λεk,−|−s/2|λεk +n,−|−s/2are momentum space charge density (or current)operators,andˆH s ±(x )=n ∈Z1Lnx }H s±(n ),H s ±(n )≡H s 0,±(n )∓e ±bρs±(n ),(12)whereH 
s0,+(n)≡¯hπ2[a†k,a k+n]−·|λεk,+|−s/2|λεk+n,+|−s/2,H s0,−(n)≡¯hπ2[b k+n,b†k]−·|λεk,−|−s/2|λεk+n,−|−s/2. The charges corresponding to the currentsˆj s±(x)areˆQ s±=e± L/2−L/2dxˆj s±(x)=e±ρs±(0).With Eqs.10and11,we have for the vacuum expectation values:vac;A;±|ˆj±(x)|vac;A;± =−12(ξ++ξ−),whereη±≡±lim s→01λ k∈Z|λεk,±|−s+1.Taking the sums,we obtainη±=±22π¯h}−1L(({e±b L2)2−12η±,ˆQ ±=e±:ρ±(0):−L2ξ±,where double dots indicate normal ordering with respect to|vac,A ,ˆH 0,+=¯h2π2π¯h]ka†k a k|λεk,+|−s− k≤[e+b LLlims→0{k>[e−b L2π¯h]kb†k b k|λεk,−|−s}and:ρ+(0):=lims→0{ k>[e+b L2π¯h]a k a†k|λεk,+|−s},:ρ−(0):=lims→0{ k≤[e−b L2π¯h]b k b†k|λεk,−|−s}.The operators:ˆj±(x):and:ˆH±:are well defined when acting onfinitely excited states which have only afinite number of excitations relative to the Fock vacuum.To construct the quantum electromagnetic Hamiltonian,we quantize the gaugefield using the functional Schrodinger representation.In this representation,when the vacuum and excited fermionic Fock states are functionals of A1,the gaugefield operators are represented asˆA1(x)→A1(x),ˆE(x)→−i¯hδL pxαp.Since A1(x)is a real function,αp satisfiesαp=α⋆−p.The Fourier expansion for the canonical momentum conjugate to A1(x)is thenˆE(x)=1L¯h p∈Z p=0e−i2πdαp, whereˆπb≡−i¯h dL exp{i2πL ¯h2ddb−1dα−p+qd2Lˆπ2b−1dαqd2(ξ++ξ−).If we multiply two operators that arefinite linear combinations of the fermionic creation and annihilation operators,theζ–function regulated operator product agrees with the naive product. However,if the operators involve infinite summations their naive product is not generally well defined. 
We then define the operator product by mutiplying the regulated operators with s large and positive and analytically continue the result to s=0.In this way we obtain the following relations[ρ±(m),ρ±(n)]−=±mδm,−n,(15) [H0,±(n),H0,±(m)]−=±¯h2π[ˆH0,±,ρ±(m)]−=∓¯h2πdbρ±(m)=0,d2π¯hδp,±m,d2π¯hδp,±m,(p>0).(16) The quantum Gauss operator isˆG=ˆG0+2πLpx−ˆG−(p)e−i2πLe+ρN(0),ˆG ±(p)≡¯h pd2πρN(±p)andρN=ρ++Nρ−is momentum space total charge density operator.Using15and16,we easily get thatρ+(±p)(andρ−(±p))are gauge invariant.For example, forρ+(±p)we have:[ˆG+(p),ρ+(±q)]−=0,[ˆG−(p),ρ+(±q)]−=0,(p>0,q>0).The operatorsˆG±(p)don’t commute with themselves,[ˆG+(p),ˆG−(q)]−=(1−N2)e2+L24π2d3Quantum Constraints3.1Quantum SymmetryIn non-anomalous gauge theories,Gauss law is considered to be valid for physical states only.This identifies physical states as those which are gauge-invariant.The problem with the anomalous be-haviour of the generalized CSM,in terms of states in Hilbert space,is apparent:owing to theSchwinger terms we cannot require that states be annihilated by the Gauss law generators ˆG±(p ).Let us represent the action of the topologically trivial gauge transformations by the operatorsU 0(τ)=exp {i¯hp>0(ˆG+τ++ˆG −τ−)}(17)with τ0,τ±(p )smooth,thenU −10(τ)α±p U 0(τ)=α±−ipτ∓(p ),U −1(τ)d dα±p∓i 2π)2τ±(p ),(p >0).The composition law for the operators U 0isU 0(τ(1))U 0(τ(2))=exp {2πiω2(τ(1),τ(2))}U 0(τ(1)+τ(2)),whereω2(τ(1),τ(2))≡−i2π¯h )2p>0p (τ(1)−τ(2)+−τ(1)+τ(2)−)is a 2-cocycle of the gauge group algebra.Thus for N =±1we are dealing with a projectiverepresentation.The 2-cocycle ω2(τ(1),τ(2))is trivial,since it can be removed by a simple redefinition of U 0(τ).Indeed,the modified operators˜U0(τ)=exp {i 2πα1(γ;τ)}·U 0(τ),(18)whereα1(γ,τ)≡−12π¯h )2p>0(α−p τ−−αp τ+)is a 1-cocycle,satisfy the ordinary composition law˜U0(τ(1))˜U 0(τ(2))=˜U 0(τ(1)+τ(2)),i.e.the action of the topologically trivial gauge transformations represented by 18is unitary.The modified Gauss law generators 
corresponding to 18areˆ˜G±(p )=ˆG ±(p )±18π2α±p .(19)The generators ˆ˜G±(p )commute:[ˆ˜G+(p ),ˆ˜G −(q )]−=0.This means that Gauss law can be maintained at the quantum level for N=±1,too.We define physical states as those which are annihilated byˆ˜G±(p)[11]:ˆ˜G(p)|phys;A =0.(20)±The zero componentˆG0is a sum of quantum generators of the global gauge transformations of the right-handed and left-handed fermionicfields,so the other quantum constraints are:ρ±(0):|phys;A =0.(21) It follows from20that the physical states|phys;A respond to a gauge transformation from the zero topological class with a phase:U0(τ)|phys;A =exp{−i2πα1(γ;τ)}|phys;A .(22) Only for models without anomaly,i.e.for N=±1,this equation translates into the statement that physical states are gauge invariant.Equation22expresses in an exact form the nature of anomaly in the CSM.At the quantum level the gauge invariance is not broken,but realized projectively.The1-cocycleα1occuring in the projective representation contributes to the commutator of the Gauss law generators by a Schwinger term and produces therefore the anomaly.3.2Adiabatic ApproachLet us show now that we can come to the quantum constraints20and21in a different way,using the adiabatic approximation[23,24].In the adiabatic approach,the dynamical variables are divided into two sets,one which we call fast variables and the other which we call slow variables.In our case, we treat the fermions as fast variables and the gaugefields as slow variables.Let A1be a manifold of all static gaugefield configurations A1(x).On A1a time-dependent gaugefield A1(x,t)corresponds to a path and a periodic gaugefield to a closed loop.We consider the fermionic part of the second-quantized Hamiltonian:ˆH F:which depends on t through the background gaugefield A1and so changes very slowly with time.We consider next the periodic gaugefield A1(x,t)(0≤t<T).After a time T the periodicfield A1(x,t)returns to its original value:A1(x,0)=A1(x,T),so that:ˆH F:(0)=:ˆH F:(T).At 
each instant t we define eigenstates for:ˆH F:(t)by:ˆH F:(t)|F,A(t) =εF(t)|F,A(t) .The state|F=0,A(t) ≡|vac,A(t) is a ground state of:ˆH F:(t),:ˆH F:(t)|vac,A(t) =0.The Fock states|F,A(t) depend on t only through their implicit dependence on A1.They are assumed to be periodic in time,|F,A(T) =|F,A(0) ,orthonormalized,F′,A(t)|F,A(t) =δF,F′,and nondegenerate.The time evolution of the wave function of our system(fermions in a background gaugefield)is clearly governed by the Schrodinger equation:∂ψ(t)i¯h¯h T0dt·εF(t),whileT0dt L/2−L/2dx˙A1(x,t) F,A(t)|iδγBerryF≡δA1(x,t)|F,A(t) ,(24) then= T0dt L/2−L/2dx˙A1(x,t)A F(x,t).γBerryFWe see that upon parallel transport around a closed loop on A1the Fock state|F,A(t) acquiresan additional phase which is integrated exponential of A F(x,t).Whereas the dynamical phaseγdynF provides information about the duration of the evolution,the Berry’s phase reflects the nontrivial holonomy of the Fock states on A1.However,a direct computation of the diagonal matrix elements ofδδδA1(x,t)A F(y,t)−2π2¯h2 n>01L n(x−y))=(1−N2)e2+2ǫ(x−y)−1The corresponding U(1)connection is easily deduced asA F=0(x,t)=−12 T0dt L/2−L/2dx L/2−L/2dy˙A1(x,t)F F=0(x,y,t)A1(y,t).In terms of the Fourier components,the connection A F=0is rewritten as vac,A(t)|ddα±p(t)|vac,A(t) ≡A±(p,t)=±(1−N2)e2+L2pα∓p,so the nonvanishing curvature isF+−(p)≡d dαpA−=(1−N2)e2+L2p.A parallel transportation of the vacuum|vac,A(t) around a closed loop in(αp,α−p)–space(p>0) yields back the same vacuum state multiplied by the phaseγBerry F=0=(1−N2)e2+L2piαp˙α−p.This phase is associated with the projective representation of the gauge group.For N=±1,when the representation is unitary,the curvature F+−and the Berry phase vanish.As mentioned in the beginning of this Section,the projective representation is trivial and the2-cocycle in the composition law of the gauge transformation operators can be removed by a redefinition of these operators.Analogously,if we redefine the momentum operators 
asddα±p≡d8π2¯h21 dα±p|vac,A(t) =0,˜F+−=˜ddαp˜A−=0.However,the nonvanishing curvature F+−(p)shows itself in the algebra of the modified momentum operators which are noncommuting:[˜ddα−q]−=F+−(p)δp,q.Following27,we modify the Gauss law generators asˆG ±(p)−→ˆ˜G±(p)=¯h p˜d2πρN(±p)that coincides with19.The modified Gauss law generators have vanishing vacuum expectation values,vac,A(t)|ˆ˜G±(p,t)|vac,A(t) =0.This justifies the definition20.For the zero componentˆG0,the vacuum expectation valuevac,A(t)|ˆG0|vac,A(t) =−12(e+η++e−η−)=1The quantum theory consistently describing the dynamics of the CSM should be definitely compatible with20.The corresponding quantum Hamiltonian is then defined by the conditions[ˆ˜H,ˆ˜G±(p)]−=0(p>0)(29)which specify thatˆ˜H must be invariant under the modified topologically trivial gauge transformations generated byˆ˜G±(p).We have in29a system of non-homogeneous equations in the Lagrange multipliersˆv H,±which become operators at the quantum level.The solution of these equations isˆv H,±(p)=¯hp2{pd4π¯h)2α∓p}.Substituting this expression forˆv H,±(p)into the quantum counterpart of28,on the physical states |phys;A we obtain1L2¯h2 p>0(d dα−p−1dαp,˜ddα±by˜d2L ˆπ2b−1dαp,˜d2Lˆπ2b+V(ρN;ρN),whereV(ρN;ρN)≡e2+Lp2ρN(−p)ρN(p)is the energy of the Coulomb current-current interaction.In order to make the dependence on N for the Hamiltonian more obvious,let us representρN asρN=12(1−N)σ,whereρ≡ρ1=ρ++ρ−,σ≡ρ−1=ρ+−ρ−,and[ρ(p),ρ(q)]−=[σ(p),σ(q)]−=0,[σ(p),ρ(q)]−=2pδp,−q.Then the Coulomb interaction energy takes the formV(ρN;ρN)=14(1−N)2V(σ;σ)+12Lˆπ2b+V(ρ;ρ).For N=−1,the momentum space electric charge density operator isσ(p)andˆ˜H EM =12π¯h:[e+b L2π¯h]+n,ˆψ+→exp{i2πn2π¯h ]→[e−b LLx}ˆψ−.The action of the topologically nontrivial gauge transformations on the states can be represented by the operatorsU n=exp{−i2π¯h ]−2πd[e+b L nρN(n)and U0is given by17.To identify the gauge transformation as belonging to the n th topological class we use the index n in31.The case 
n=0corresponds to the topologically trivial gauge transformations.The topologically nontrivial gauge transformation operators satisfy the same composition law as the topologically trivial ones.The modified operators are˜U n =exp{−i¯hˆTb})n|phys;A .Among all states|phys;A one may identify the eigenstates of the operators of the physical variables.The action of the topologically nontrivial gauge transformations on such states may, generally speaking,change only the phase of these states by a C–number,since with any gauge transformations both topologically trivial and nontrivial,the operators of the physical variables and the observables cannot be ing|phys;θ to designate these physical states,we haveexp{∓i¯h ˆTb})n|phys;A(so calledθ–states[26,27]),where|phys;A is an arbitrary physical state from20.In one dimension the parameterθis related to a constant background electricfield.To show this, let us introduce states which are invariant even against the topologically nontrivial gauge transfor-mations.Recalling that[e+b L2π¯h]θ}|phys;θ .(32)The new states|phys continue to be annihilated byˆ˜G±(p),and are also invariant under the topo-logically nontrivial gauge transformations.The electromagnetic part of the Hamiltonian transforms asˆHEM→exp{i[e+b L2π¯h]θ}=12L¯h2 p>0[˜d dα−p]+,i.e.in the new Hamiltonian the momentumˆπb is supplemented by the electricfield strength Eθ≡e+The condition34can be then rewritten as a system of linear equations in(β0,β±).We can easilyfind a solution of these equations,which gives us(β0,β±)as functions of[e+b L2π¯h}.However,these constants are irrelevant for our consideration and we neglect them.Finding(β0,β±)from34and substituting them into the expression33,on the physical states we obtainˆ˜H|phys;A =ˆHphys|phys;AwhereˆH phys =ˆH physF+ˆH physEM,ˆH physF=ˆH0,++ˆH0,−−1L¯h(1+N2)([e+b LL ¯h[e+b L2L ˆπ2b+V(ρN;ρN)+e2+L2π¯h] p∈Z p=0(−1)p 24(1−N2)2([e+b LL¯h p>0|λεp,±|−sρs±(−p)ρs±(p).Eqs.35and36give us a physical Hamiltonian invariant under both 
topologically trivial andnontrivial gauge transformations,ˆH physF andˆH physEMbeing invariant separately.The last two terms in35make invariant the free fermionic part of the Hamiltonian,while the ones in36the electromagnetic part.For N=±1,the last two terms in36vanish.These terms are therefore caused by the anomaly and represent new types of interaction which are absent in the nonanomalous models.The new interactions admit the following interpretation.Let us combine the last term in36with the kinetic part of the electromagnetic Hamiltonian,then124(1−N2)2([e+b L2L2 L/2−L/2dx(ˆπb−L E(x))2,i.e.the momentumˆπb is supplemented by the linearly rising electricfield strengthE(x)≡−e+2π¯h].As in four-dimensional models of a relativistic particle moving in an externalfield,we may define a generalized momentum operator in the formˆ˜πb(x)≡ˆπb−L E(x).The commutation relations for ˆ˜πb are[ˆ˜πb (x ),ˆ˜πb (y )]−=i (1−N 2)e 2+LL(1−N 2)[e +b L2L 2ˆ˜π2b→14π2(1−N 2)[e +b Lp 2ρN (p )=−e 2+L2p 2ρbgrd ·ρN (p ).It is just the background linearly rising electric field that couples b to the fermionic physical degreesof freedom in the Coulomb interaction.As a consequence,the eigenstates of the physical Hamiltonian are not a direct product of the purely fermionic Fock states and wave functionals of b .This is a common feature of gauge theories with anomaly.That the Hilbert space in such theories is not a tensor product of the Hilbert space for a gauge field and the fixed Hilbert space for fermions was shown in [6],[7].The background charge interpretation is related to the definition of the Fock vacuum.The definition given in Eqs.10-11depends on [e +b L2π¯h]is fixed.The values of the gauge field in regions of different [e +b L2π¯h]changes,then there is a nontrivial spectral flow,i.e.some of energy levels of the first quantized fermionic Hamiltonians cross zero and change sign.This means that the definition of the Fock vacuum changes.The charge operators ˆQ ±also change.Let :ˆQ (0)±:be charge 
operators defined in the region where[e +b L 2π¯h]the charge operators become :ˆQ (0)±:∓e ±[e ±b L。

University English Major Grammar Courseware 3 - Determiners Determ

• Quantifiers: Quantifiers are words or phrases that indicate the quantity or amount of something. Common quantifiers include words such as "many," "few," "several," "a lot of," etc.
Types of determiners: There are different types of determiners, including articles, demonstratives ("this," "that," "these," "those"), quantifiers ("some," "any," "many," "few"), and possessives, each serving a specific function in relation to the noun.
Classification
Qualifiers can be divided into two main categories: adjectives and adverbs. Adjectives modify nouns or pronouns, adding descriptive details such as color, size, shape, age, etc. Adverbs modify verbs, adjectives, or other adverbs, adding details such as time, place, manner, degree, etc.

Study_on_the_Pretreatment_Methods_for_the_Detectio


Creativity and Innovation
2022, VOL. 6, NO. 5, 75-80
DOI: 10.47297/wspciWSP2516-252712.20220605

Study on the Pretreatment Methods for the Detection of Heavy Metal Stress in Shellfish by Hyperspectral Technology

Shuwen Wang 2*, Jibin Zhang 1, Hengjun Jiang 1, Wenjun Xie 1, Fengliang Chen 1
1 School of Electronic and Electrical Engineering, Lingnan Normal University, Zhanjiang, Guangdong
2 College of Computer and Intelligent Education, Lingnan Normal University, Zhanjiang, Guangdong

ABSTRACT

In this paper, hyperspectral technology was combined with different optimized pretreatment methods to construct a non-destructive identification method for cadmium-contaminated and normal Ruditapes philippinarum. First, 120 samples, both normal and contaminated with the heavy metal cadmium, were collected and their hyperspectral curves were compared. The optimization effects of different preprocessing methods on the original spectra were then explored. Finally, the processing results were used as the input of an extreme learning machine classifier to screen out the best preprocessing method. The results show that SG smoothing, multiplicative scatter correction (MSC), standard normal variate transformation (SNVT), first derivative, second derivative and their combinations are, to a certain extent, beneficial to the discrimination of different classes of spectral curves. With the classification accuracy of the extreme learning machine classifier as the evaluation index, multiplicative scatter correction was determined to be the optimal spectral preprocessing method.

KEYWORDS

Hyperspectral image; Heavy metal cadmium; Pretreated Ruditapes philippinensis

1 Research Background

Ruditapes philippinarum is one of the main shellfish cultured along the coast. It is rich in various amino acids, vitamins and trace elements essential for the human body. It has high nutritional value and a delicious taste, so it is very popular [1]. In recent years, with the rapid development of industry, a large number of pollutants are
directly discharged into the sea, aggravating marine environmental pollution. Cadmium (Cd) is a typical harmful element which accumulates easily in organisms, is difficult to metabolize, and exists widely in the natural environment [2]. Ruditapes philippinarum is widely distributed in coastal areas with relatively serious heavy metal pollution. As a non-selective filter-feeding organism, Ruditapes philippinarum accumulates heavy metal pollutants from sediments and water during feeding. Long-term consumption of heavy metal contaminated Ruditapes philippinarum will cause harm to human health [3]. Therefore, improving the ability to detect heavy metal pollution in Ruditapes philippinarum, and so ensuring its quality and safety as food, has become an urgent problem in food safety science. Aiming at the spectral interference and noise generated in the process of hyperspectral image acquisition of Ruditapes philippinarum, this paper studies the elimination of interference factors by different preprocessing methods, and explores the optimization effect of different single and combined preprocessing methods on the interference in the hyperspectral data of Ruditapes philippinarum [4]. Finally, the extreme learning machine was used to realize rapid non-destructive detection and analysis [5] of normal samples and heavy metal cadmium contaminated samples of Ruditapes philippinarum, and the optimal pretreatment method was determined according to the classification detection results.

*Corresponding Author: Shuwen Wang (1975-04), male, Han, Zhaodong City, Heilongjiang Province, graduate student, associate professor. Interest: Application of intelligent measurement and control technology.

2 Test Materials and Research Methods

(1) Culture of test samples

The samples of Ruditapes philippinarum were purchased from Cunjin Seafood Market, Zhanjiang City, Guangdong Province. The fine sand was disinfected to remove impurities and laid in a plastic culture box
with a size of 119 cm × 108 cm × 32 cm and a volume of 300 L. Seawater was allowed to settle for 24 hours and then filtered for laboratory culture of Ruditapes philippinarum samples. The pH value of the seawater is 8.0, the water temperature is 28 ℃, the dissolved oxygen content is 6.5 mg/L, and the salinity is 30‰. CdCl2·2.5H2O solution with a concentration of 0.8 mg/L was added into the culture box to simulate the marine environment polluted by heavy metal cadmium. The control group was raised in seawater without any heavy metal elements. During the experiment, the filter was closed for 4 hours every day, during which Chlorella was fed. Seawater containing CdCl2·2.5H2O reagent and pure seawater were added to the two culture tanks every day to supplement the loss of seawater in the culture tanks. The Ruditapes philippinarum samples were incubated in a culture box for 10 days to allow for the accumulation of the heavy metal cadmium. At the end of the culture, 60 cadmium-contaminated samples and 60 uncontaminated samples of Ruditapes philippinarum were collected for hyperspectral image acquisition [6].

(2) Hyperspectral image acquisition

In this study, the hyperspectral imaging data of Ruditapes philippinarum samples were collected by an SOC710-VP hyperspectral imager produced by Surface Optics Company in the United States. The system consists of a hyperspectral imager, a light source unit (halogen lamp), and a carrier platform unit (25), as shown in Figure 1. The hyperspectral imager has an acquisition range of 367.7-1051.9 nm with 512 bands. The spectrum at the front and end of the whole spectral range contains a lot of noise, so these two parts of the spectrum are removed, and 450 spectral bands from 400.5 nm to 1000.9 nm are retained. The standard calibration of hyperspectral images, including spectral calibration, radiometric calibration, and reflectance normalization, is performed in SRAnal710 software. Fig. 2 is a hyperspectral image of Ruditapes philippinarum contaminated by heavy metal cadmium.

3 Results and
Analysis

Fig. 3 shows the original spectrum of the Ruditapes philippinarum sample and the spectral curves pretreated by SG smoothing, multiplicative scatter correction (MSC), standard normal variate (SNV), first derivative (FD), second derivative (SD) and their various combinations.

Fig. 1 Hyperspectral image acquisition system
Fig. 2 Hyperspectral image of Ruditapes philippinarum
Fig. 3 Spectral curves of Ruditapes philippinarum under various pretreatment methods and different combinations: (a) raw spectrum, (b) SG, (c) MSC, (d) SNV, (e) FD, (f) SD, (g) SG-FD, (h) SG-SD, (i) MSC-FD, (j) MSC-SD, (k) SNV-FD, (l) SNV-SD

The 60 healthy samples and 60 cadmium-contaminated samples after spectral preprocessing are taken as the experimental data set. For each class, 45 samples are taken as training data and 15 samples as test data, and the extreme learning machine classifier is used for classification. Because the split is random, the modeling was repeated 500 times for each method to reduce random error, and the classification effect was evaluated by the average classification accuracy over the 500 runs. The corresponding average classification accuracies were: SG 92.35%, MSC 95.25%, SNV 88.93%, FD 91.89%, SD 91.89%, SG-FD 83.99%, SG-SD 90.36%, MSC-FD 93.93%, MSC-SD 89.96%, SNV-FD 88.94%, SNV-SD 84.50%.
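To make the pretreatment steps named above concrete, the following is a minimal sketch of SNV, MSC and a first derivative in plain Python. It is not the authors' implementation: the function names are hypothetical, the population standard deviation in SNV and the simple first difference are assumptions (published variants often use the sample standard deviation and a Savitzky-Golay derivative), and each spectrum is assumed to be a list of reflectance values over the retained bands.

```python
from statistics import mean, pstdev


def snv(spectrum):
    """Standard normal variate: centre one spectrum and scale it to unit spread.

    Assumption: population standard deviation; some implementations use ddof=1.
    """
    mu = mean(spectrum)
    sigma = pstdev(spectrum)
    return [(v - mu) / sigma for v in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    Each spectrum s is fitted as s ~ intercept + slope * reference by least
    squares, then corrected as (s - intercept) / slope.
    """
    n_bands = len(spectra[0])
    reference = [mean(s[i] for s in spectra) for i in range(n_bands)]
    ref_mean = mean(reference)
    ref_var = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        slope = sum((r - ref_mean) * (v - s_mean)
                    for r, v in zip(reference, s)) / ref_var
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected


def first_derivative(spectrum):
    """First difference between adjacent bands (a crude stand-in for FD)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]
```

A combined method such as MSC-FD would, under these assumptions, apply `first_derivative` to each row returned by `msc`. Spectra that differ only by an offset and a gain collapse onto the same curve after `msc`, which is the scatter effect the correction is meant to remove.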

Covariance and Time Regained in Canonical General Relativity


arXiv:0803.0125v2 [gr-qc] 9 Mar 2008

Covariance and time regained in canonical general relativity

I. Kouletsis
February 2008

Abstract

Canonical vacuum gravity is expressed in generally-covariant form in order that spacetime diffeomorphisms be represented within its equal-time phase space. In accordance with the principle of general covariance and ideas developed within history phase space formalisms in Refs. [1]-[4], the time mapping T: M → ℝ and the space mapping X: M → Σ that define the Dirac-ADM foliation are incorporated into the framework of the Hilbert variational principle. The resulting canonical action encompasses all individual Dirac-ADM actions, corresponding to different choices of foliating vacuum spacetimes by spacelike hypersurfaces. The equal-time phase space P = {g_ij, p^ij, Y^α, P_α} includes the embeddings Y^α and their conjugate momenta P_α. It is constrained by eight first-class constraints. The constraint surface C is determined by the super-Hamiltonian and super-momentum constraints of vacuum gravity and the vanishing of the embedding momenta. Deformations of the time and space mappings, δT and δX, and spacetime diffeomorphisms, V ∈ LDiffM, induce symplectic diffeomorphisms of P. While the generator D_(δT,δX) of deformations depends on all eight constraints, the generator D_V of spacetime diffeomorphisms depends only on the embedding momentum constraints. As a result, spacetime observables, namely, dynamical variables F on P that are invariant under spacetime diffeomorphisms, {F, D_V}|_C = 0, are not necessarily invariant under the deformations of the mappings, {F, D_(δT,δX)}|_C ≠ 0, nor are they constants of the motion, {F, ∫d^3x H}|_C ≠ 0. Dirac observables form only a subset of spacetime observables that are invariant under the transformations of T and X and do not evolve in time. In this generally-covariant framework, the conventional interpretation of the canonical theory, due to Bergmann and Dirac, amounts to postulating that the transformations of the reference
system (T, X) have no measurable consequences; i.e., that all first-class constraints generate gauge transformations. If this postulate is not deemed necessary, canonical gravity admits no classical problem of time.

1 Introduction

1.1 General covariance, determinism and the problem of evolution

The variational principle for general relativity, with or without sources, introduces a four-dimensional manifold M and an action functional S[Ψ] on M. The principle of general covariance demands that all fields Ψ be subject to variation in the action functional and satisfy generally-covariant field equations. In the case of the vacuum theory, where only the metric field is present on the spacetime manifold, the set of solutions consists of all distinct metric fields G on M that satisfy the vacuum Einstein equations. These solutions do not all correspond to physically distinct states of the system. Considering that the manifold points are physically indistinguishable prior to introducing the fields on M, any two solutions that can be brought into coincidence by an element of the group Diff M are regarded as representations of the same physical state. The group Diff M is treated as the gauge group of the theory, and each physical state is identified with an equivalence class {G} of Diff M-related solutions on M. The set Γ of all such equivalence classes constitutes the set of physically distinct states of the system.

In the canonical formalism, initiated by Dirac [5] and Arnowitt, Deser and Misner [6], the same physical conclusion can be drawn by considering the initial-value problem. General relativity is not a deterministic dynamical system in the strict sense. A characteristic of its canonical formulation is that a given set of instantaneous data at an initial time $t_1$ may evolve, via different choices of lapse and shift, to many different sets of such data at a later time $t_2$. Nevertheless, a well-posed initial-value problem arises if it is stipulated that all these sets of evolved data characterise the same physical
situation [7]-[8]. Within the framework of the Dirac-ADM phase space $P = \{(g_{ij}, p^{ij})\}$, each set of permissible data $(g_{ij}(x), p^{ij}(x))$ on a given hypersurface defines a point on the constraint surface C ⊂ P, where C is determined by the first-class constraints. All points in C to which an initial point can evolve via arbitrary choices of lapse and shift lie in an orbit of the Hamiltonian vector field generated by the first-class constraints. The set ∆ of such distinct orbits in C, equipped with an induced symplectic form, constitutes the so-called reduced phase space of the theory. This set can be brought into a one-to-one correspondence with the set Γ of Diff M-classes of solutions on M [9].

In this way, the original classification of physical states according to the set Γ is recovered, and the inability to physically distinguish between evolved data in the canonical theory may be attributed to the invariance of the spacetime action under Diff M. In addition, the bijective correspondence between the sets Γ and ∆ allows the physical observables of the theory to be perceived either as functions on ∆, the so-called Dirac observables, or as functions on Γ, which may be referred to as spacetime observables. In either case, the physical observables remain invariant under the dynamical evolution generated by the Hamiltonian, a fact which implies that this evolution is not measurable. Only symmetries of the reduced phase space, i.e., symplectic transformations of ∆, and equivalently of Γ, can be contemplated as being measurable [10].
Even if such symmetries are discovered in general relativity, global obstructions are expected to arise in the phase space [11] which may prohibit such symmetries from being interpreted as generators of the evolution of the system in physical time. This problem of evolution may be regarded as the classical core of the problem of time of quantum gravity.

1.2 The missing representations of the group Diff M

Of particular relevance to the problem of evolution is the way in which the group Diff M is considered to act on the phase space of general relativity, and the connection between this action and the dynamical evolution generated by the Hamiltonian. A peculiar feature of the Dirac-ADM formalism is that, despite the bijective correspondence between the sets Γ and ∆, the Diff M-invariance of the spacetime action is reflected only indirectly in the first-class constraints. More precisely, although the canonical transformations generated by the Hamiltonian can be linked to the diffeomorphisms of the spacetime manifold M, the Lie algebra of Diff M cannot be mapped onto the Poisson bracket algebra of the super-Hamiltonian and super-momentum constraints. This inability to recover the action of Diff M directly within the conventional canonical framework is not only noteworthy from the conceptual point of view, but also contributes to the problems that hinder the canonical quantisation of gravity.

The cause of this difficulty was diagnosed by Isham and Kuchař [12]. The absence from the conventional phase space $P = \{(g_{ij}, p^{ij})\}$ of the embedding mappings $Y : \Sigma \to M$ that connect the spacetime manifold M with the space manifold Σ renders the direct canonical description of spacetime objects impossible, and leads to the loss of the representations of Diff M. In order to recover the action of Diff M within the canonical framework, this missing link must be re-established, and the gravitational configuration space must be extended by the space of embeddings from Σ to M. This was achieved in Ref. [12] by parameterising the
Dirac-ADM action.

The process of parameterisation is tantamount to viewing the lapse function and the shift vector as functionals of the embedding mapping $Y : \Sigma \to M$, and then varying Y in the action. When applied to a generally-covariant system such as general relativity, this procedure requires that four of the components of the spacetime metric be limited by coordinate conditions with respect to the foliation structure. The coordinate conditions are needed in order that the lapse function and the shift vector can indeed be regarded as functionals of the embedding variable Y, and not as variables on their own. In addition, these conditions ensure a well-posed initial-value problem. Without them, the spacetime metric built by the canonical dynamical evolution would be determined only up to a spacetime diffeomorphism [13].

As a result of limiting in Ref. [12] the spacetime metric by the coordinate conditions, the original super-Hamiltonian and super-momentum constraints get suspended, and new, modified, constraints arise. In the resulting phase space $\{(g_{ij}, p^{ij}, Y^\alpha, P_\alpha)\}$, augmented by the embeddings $Y^\alpha(x)$ and their conjugate momenta $P_\alpha(x)$, a direct correspondence between the spacetime and the canonical descriptions emerges. This is attested via the construction of a homomorphic mapping from the Lie algebra of Diff M into the Poisson bracket algebra of the dynamical variables on the extended phase space.

Viewed from the perspective of a variational principle, the procedure of breaking the invariance of general relativity by coordinate conditions and restoring it by parameterisation can be associated with the coupling of gravity to matter fields. Kuchař and Torre [14] derived the modified constraints of Isham and Kuchař from an appropriate action functional, and recognised the new terms as the energy-momentum density of a non-rotating, heat-conducting, incoherent dust. Other coordinate conditions lead to different constraint structures, some of which have been investigated in
Refs. [15]-[21].

1.3 Aim, motivation and main concept

In this paper, a reformulation of the canonical method is considered that permits the representation of the Lie algebra of Diff M within a suitable equal-time phase space for vacuum general relativity, without abandoning the standard constraints of this theory. The proposed formalism relies upon ideas and techniques that were developed in collaboration with K. Kuchař in Ref. [1] and yields results that are, in certain ways, parallel to the results of Savvidou [2]-[4], derived within the context of the History Projection Operator formalism for general relativity.

From a technical point of view, the only difference between the present formulation and the conventional formulation of Dirac and ADM is that the foliation is modelled as a variable, and is incorporated into the framework of the Hilbert variational principle. Such an approach is actuated by the desire to harmonise the canonical action with the principle of general covariance, and the recognition of the fact that, strictly speaking, this action is an extension of the Hilbert action. This is because, by construction, the canonical action requires a time foliation of M by spacelike hypersurfaces to be introduced into general relativity as an additional geometric element. Thereby, the notion of time is distinguished from that of a spacetime coordinate and becomes dependent upon the spacetime metric G.

Time is represented by a global scalar mapping $T : M \to \mathbb{R}$ from the spacetime manifold M to a one-dimensional time manifold ℝ which has the topology of the open line. The gradient $T_{,\alpha}$ of this mapping is required to be timelike with respect to G. Accordingly, each choice of T represents a foliation of M by spacelike hypersurfaces. On each such hypersurface, the notion of space is represented by another metric-dependent mapping $X : M \to \Sigma$, whose gradients $X^i_{,\alpha}$ are required to be spacelike.

In order that the canonical theory be cast in generally-covariant form, all fields upon which it is based must
conform to the principle of general covariance. That is, they must be subject to variation in the action functional and satisfy generally-covariant field equations. In particular, field equations must be satisfied by T and X. These equations must enforce the timelike and spacelike character of the gradients of these variables, but must otherwise leave T and X undetermined in order to respect the arbitrariness of the spacelike foliation. As a result, a generally-covariant canonical action must necessarily involve a greater number of non-dynamical variables than the Hilbert action. This causes the breaking of the bijective correspondence between its sets Γ and ∆ and, therefore, has repercussions for the functions defined on these sets; namely, the spacetime observables and the Dirac observables.

As is evident, the breaking of this correspondence is a crucial property of the covariant canonical action. In general, the sets Γ and ∆ reveal different aspects of a generally-covariant theory: on the one hand, the set Γ is derived from the set of solutions by eliminating only the freedom associated with Diff M. On the other hand, the set ∆ is derived from the set of solutions by eliminating the freedom associated with the first-class constraints. In an arbitrary generally-covariant framework, this latter freedom may be wider than the former, because it depends upon the number of non-dynamical variables present in the action; i.e., variables that are left undetermined by the variational principle.

In our particular case of interest, after the non-dynamical variables T and X are incorporated into the framework of the Hilbert variational principle, the dynamical content of the resulting action, expressed by the set ∆, remains unaffected. However, the set Γ of Diff M-classes of solutions is extended by the presence of these arbitrary fields. The set ∆ becomes a subset of Γ, and the Dirac observables form only a subset of the spacetime observables. Since it is just this subset that weakly commutes with the Hamiltonian, the evolution of
the spacetime observables is, in general, non-trivial.

In addition, the breaking of the bijective correspondence between the sets Γ and ∆ is reflected within the equal-time phase space P in the doubling of the first-class constraints. Thus, it becomes possible in the covariant canonical formalism to identify which constraints arise due to the diffeomorphism invariance of the spacetime action and which arise due to the non-dynamical character of the foliation. This lays the foundations for, firstly, representing the Lie algebra of spacetime diffeomorphisms by symplectic diffeomorphisms of P and, secondly, for separating the canonical transformations generated by spacetime diffeomorphisms from those generated by the deformations of the foliation. In comparison, the bijection between the sets Γ and ∆ in the Dirac-ADM formalism leads to the entanglement of distinct concepts and the loss of general covariance, while the preservation of this bijection via the coordinate conditions causes the suspension of the vacuum constraints in the covariant framework of Isham and Kuchař.

1.4 The formalism

A general procedure for incorporating the foliation into the variational principle of a generally-covariant theory was developed in Ref. [1]. It was designed originally for the purpose of representing spacetime diffeomorphisms in the history phase space of an arbitrary generally-covariant system, modelled in Ref. [1] by the Bosonic string. This procedure respects the distinction between the sets Γ and ∆, so it only needs to be adapted to the circumstances of the gravitational theory.

Thus, in the same spirit, a time variable $T : M \to \mathbb{R}$ and a space variable $X : M \to \Sigma$ are incorporated into the Hilbert action as additional variable fields. While the mapping T describes a slicing of the spacetime manifold by spacelike hypersurfaces, namely, a time foliation, the mapping X describes a congruence of timelike reference worldlines, namely, a reference frame. The product mapping $T \times X : M \to \mathbb{R} \times \Sigma$ is inverse to the foliation mapping $Y : \mathbb{R} \times \Sigma \to M$. The
variables T and X are coupled to the spacetime metric G. This coupling preserves the vacuum Einstein equations and also ensures that, at the level of the solutions, the time foliation is spacelike, and the reference frame is timelike, with respect to G. Apart from these essential restrictions, the variables T and X are left undetermined by the variational principle in order to comply with the arbitrariness of the foliation. The resulting set of solutions $\{G_{\alpha\beta}, T, X^i\}$ incorporates the content of all individual Dirac-ADM actions in the sense that it includes all causal reference systems that can be associated with each vacuum spacetime. All the fields in this extended action functional transform covariantly under the diffeomorphisms of M, so the general covariance of the formalism remains manifest.

As in Ref. [1], the transition from the spacetime action to its Lagrangian counterpart on $\mathbb{R} \times \Sigma$ is conceived as a one-to-one transformation from the set of spacetime variables $\{G_{\alpha\beta}, T, X^i\}$ to the set $\{g_{ij}, N, N^i, Y^\alpha\}$ of induced variables on $\mathbb{R} \times \Sigma$. This transformation is followed by a Legendre transformation, which involves the foliation field Y. Since the spacetime and the canonical frameworks remain interlinked, all symmetries of the original field equations on M are transferred to the canonical theory. This provides the basis for studying the transformations induced on the canonical fields $\{g_{ij}, p^{ij}, N, N^i\}$ by the diffeomorphisms of M, as well as by the deformations of the mappings T and X.

Resembling the covariant formulation of Isham and Kuchař, the equal-time phase space $P = \{(g_{ij}, p^{ij}, Y^\alpha, P_\alpha)\}$ includes the embeddings $Y^\alpha(x)$ and their conjugate momenta $P_\alpha(x)$.
However, it is now constrained by eight first-class constraints. The constraint surface C is determined by the standard super-Hamiltonian and super-momentum constraints of vacuum gravity, $H = 0$ and $H_i = 0$, and the vanishing of the embedding momenta, $P_\alpha = 0$. The Hamiltonian $\int d^3x\, \mathcal{H}$ is a linear functional of these eight first-class constraints, $\mathcal{H} := N H + N^i H_i + \Lambda^\alpha P_\alpha$.

1.5 Summary of results

The Hamiltonian $\int d^3x\, \mathcal{H}$ is regarded as the generator of solutions rather than of symmetries; that is, its primary role is considered to be the creation of solutions from permissible instantaneous data. Symmetries of the field equations then act on these solutions. Symmetries are generated by infinitesimal transformations of the field variables that preserve the linearisation of the field equations provided that these equations hold. Each symmetry defines a mapping of solutions to solutions. Key symmetries are induced by the diffeomorphisms of the manifold M and the transformations of the mappings T and X.

The diffeomorphisms of M do not act on solutions in the same way as the transformations of T and X do. Under the action of Diff M, the spacetime metric G and the mappings T and X transform covariantly. This implies, in particular, that the spacelike character of the time foliation and the timelike character of the reference frame are respected. The foliation variable Y, being inverse to $T \times X$, is transformed arbitrarily by Diff M, but the fields $g$, $p$, $N$ and $N^i$ remain unchanged. In contrast, under the transformations of the mappings T and X, the spacetime metric G is left, by definition, unchanged, but the fields $g$, $p$, $N$, $N^i$ and $Y$ are all transformed.

Special kinds of transformations of T and X are induced by the diffeomorphisms of the manifolds ℝ and Σ. These move individual hypersurfaces and individual worldlines, but they keep the time foliation and the reference frame fixed; i.e., the final collection of hypersurfaces and worldlines is the same as the original. On account of this, the spacelike character of the time foliation and the timelike
character of the reference frame, with respect to the unchanged G, are preserved. This is not the case for more general transformations of T and X, unless these transformations are allowed to depend fully upon solutions. Then it is indeed possible to consider generalised symmetries $\delta T[G, T, X]$ and $\delta X[G, T, X]$ that sustain the compatibility between the mappings T, X and the unchanged G.

Within the framework of the extended phase space P, solutions are visualised as curves lying in the subspace $\mathbb{R} \times C$ of $\mathbb{R} \times P$. Symmetries of the field equations act on these curves. The special transformations induced on $\mathbb{R} \times P$ by infinitesimal time diffeomorphisms $w \in \mathrm{LDiff}\,\mathbb{R}$ and infinitesimal space diffeomorphisms $u \in \mathrm{LDiff}\,\Sigma$ are generated, respectively, by the dynamical variables $D_w = -\int d^3x\, w H$ and $D_u = -\int d^3x\, u^i (H_i + P_\alpha Y^\alpha_{,i})$. The more general transformations induced on $\mathbb{R} \times P$ by the symmetries $\delta T[G, T, X]$ and $\delta X[G, T, X]$ are generated by the functional $D_{(\delta T, \delta X)} = -\int d^3x \left( \delta T\, H - \delta X^i (H_i + P_\alpha Y^\alpha_{,i}) \right)$. This reduces to the generator $D_w$ in the case where $\delta T = w(T)$ and $\delta X = 0$, and to the generator $D_u$ in the case where $\delta T = 0$ and $\delta X = u(X)$. Analogous functionals can be constructed within the Dirac-ADM phase space $\{g_{ij}, p^{ij}\}$.

On the other hand, the symmetries induced on $\mathbb{R} \times P$ by infinitesimal spacetime diffeomorphisms $V \in \mathrm{LDiff}\,M$ are generated by a dynamical variable that has no counterpart in the conventional phase space. This is the variable $D_V = \int d^3x\, P_\alpha V^\alpha(Y)$, which depends solely on the embedding variables and the vector field V. This functional provides an anti-homomorphic mapping of vector fields in the Lie algebra LDiff M into the Poisson bracket algebra on the phase space P; i.e., a representation of spacetime diffeomorphisms by symplectic diffeomorphisms of the phase space.

The structure of the generators $D_{(\delta T, \delta X)}$ and $D_V$ reveals two facts about canonical general relativity that lay unexpressed within the conventional canonical formalism. First, the general covariance of the theory is not reflected in the super-Hamiltonian and super-momentum constraints
but, instead, in the embedding momentum constraints. Second, the orbits of the generators $D_V$ and $\int d^3x\, \mathcal{H}$ on the phase space P are distinct, in accordance with the set ∆ being a subset of Γ. This eliminates any possibility of identifying the Hamiltonian functional $\int d^3x\, \mathcal{H}$ with the generator of spacetime diffeomorphisms, in agreement with Kuchař's analysis of this issue in Ref. [39].

Although this distinct role of the Hamiltonian, as opposed to the role of spacetime diffeomorphisms, cannot find an unambiguous mathematical expression within the standard phase space $\{(g_{ij}, p^{ij})\}$ of vacuum gravity, it has been enacted in formulations based on history phase spaces. In Ref. [2], history representations of both the Lie algebra of Diff M and the Dirac algebra of the constraints are constructed within the context of the History Projection Operator formalism for general relativity. The foliation is introduced as a parameter in the formalism and satisfies an equivariance condition [3]-[4]. The invariance of the canonical action under Diff M was thereby established, and the connection between this fact and the problem of time was studied. The issue of the history quantisation of a spacelike foliation was also analysed; see Ref. [40].

Alternative history representations of Diff M were constructed in Ref. [1] in the context of the history phase space of the Bosonic string. The equal-time formalism considered here has inherited several features from that history formalism; among them, the incorporation of the mappings T and X in the variational principle, which makes the correspondence between the sets Γ and ∆ many-to-one. This leads to the enrichment of the notion of instantaneous observables and calls for the revision of their dynamical evolution.

As anticipated, two kinds of observables arise on the equal-time phase space P of the covariant canonical action: spacetime observables and Dirac observables. The spacetime observables are dynamical variables F on P that commute on the constraint surface C with the generator of
spacetime diffeomorphisms, $\{F, D_V\}|_C = 0$. While such functionals weakly commute with the embedding momentum constraints, they do not necessarily weakly commute with the super-Hamiltonian and super-momentum constraints. As a result, they are not necessarily invariant under the deformations of the mappings, $\{F, D_{(\delta T, \delta X)}\}|_C \neq 0$, nor are they constants of the motion, $\{F, \int d^3x\, \mathcal{H}\}|_C \neq 0$. On the other hand, the Dirac observables weakly commute with all eight first-class constraints, and hence also with $\int d^3x\, \mathcal{H}$. These are invariant under both the diffeomorphisms of M and the transformations of the mappings T and X, and form a subset of spacetime observables that remain frozen in time. While the spacetime observables induce functions on Γ, the Dirac observables induce functions on ∆.

1.6 Interpretation

Regarded as an action functional on the spacetime manifold M, the covariant canonical action is equivalent to the Hilbert action coupled to causal reference systems (T, X). Although the presence of these systems does not preclude the conventional interpretation of vacuum gravity based upon the Hilbert action, it does imply that an additional postulate is necessary if this interpretation is to be recovered within the framework of the extended action. More precisely, the covariant canonical formalism accepts two different interpretations, depending on whether physical importance is ascribed to the entire set Γ or solely to its subset ∆ ⊂ Γ.

The second option amounts to the requirement, due to Bergmann [7] and Dirac [8], that all first-class constraints generate gauge transformations. According to this position, spacetime diffeomorphisms and deformations of the mappings T and X have no measurable consequences.
The mappings T and X are deemed unimportant, and the physical observables coincide with the Dirac observables which are independent of these mappings. Since the Dirac observables do not evolve in time, the problem of evolution resurfaces in its standard form, as discussed in the literature [41]-[48]. In this case, the recovery of the representations of Diff M in the phase space of the covariant canonical action is devoid of physical significance.

Needless to say, prominence is given to the first option. According to this position, the set ∆ does not exhaust the observable aspects of the theory. Significance is attributed to the entire set Γ, and the selection of the mappings T and X as additional variables advocates a specific physical proposition. This concerns the issue of what constitutes a physical spacetime in vacuum gravity; a long-standing issue that goes back to the founders of general relativity: Hilbert formalised the notion that the reference system in general relativity should be visualised as a fluid which carries clocks that keep a causal time [49], and Einstein used a similar idealisation in his book [50]. Stachel analysed the issue of observability in general relativity in Ref. [51], and Rovelli introduced the so-called localised and non-localised points of view in Ref. [52].

The concept of the reference fluid is realised in a mathematically precise way by the mappings T and X. These mappings bridge the gap between observers and the system under observation in the absence of a physical process of measurement. Observers are assumed not to influence the gravitational system under observation. Although their trajectories have to be timelike, they do not form part of the physical system in the strict sense. Accordingly, the interaction between the mappings T, X and the metric G is extremely tenuous. There is just enough interaction to distinguish between the points of M, but not enough to disturb the geometry.
This is captured by the vanishing energy-momentum of the fields T and X and the subsequent preservation of the vacuum constraints in the canonical theory.

Regarding determinism, initial data do not uniquely determine the evolution derived from the covariant canonical action, even after the orbits of Diff M have been eliminated. There is still freedom remaining in the theory due to the arbitrariness of the foliation. However, this does not mean that the gravitational system under observation has more freedom to evolve than it had before; i.e., when it was described by Hilbert’s action. The freedom captured by the extended set Γ only refers to the possibilities of observation associated with a given physical state δ ∈ ∆. As we shall see later, there is a whole set of states {γ} in Γ associated with each physical state δ ∈ ∆, all of which are Diff M-invariant but foliation-dependent. Provided that the set Γ is considered meaningful, each such state γ in the class {γ} is accepted as a distinct measurable state of the physical state δ. The underlying assumption is that distinct measurements of a given physical situation remain distinct even in the limit where the physical interaction between the observers and the gravitational system becomes negligible. In contrast, this kind of observability is rejected in the formulation based on Hilbert’s action. The focus

Wooldridge, Introductory Econometrics, Fourth Edition


CHAPTER 3

TEACHING NOTES

For undergraduates, I do not work through most of the derivations in this chapter, at least not in detail. Rather, I focus on interpreting the assumptions, which mostly concern the population. Other than random sampling, the only assumption that involves more than population considerations is the assumption about no perfect collinearity, where the possibility of perfect collinearity in the sample (even if it does not occur in the population) should be touched on. The more important issue is perfect collinearity in the population, but this is fairly easy to dispense with via examples. These come from my experiences with the kinds of model specification issues that beginners have trouble with.

The comparison of simple and multiple regression estimates – based on the particular sample at hand, as opposed to their statistical properties – usually makes a strong impression. Sometimes I do not bother with the “partialling out” interpretation of multiple regression.

As far as statistical properties, notice how I treat the problem of including an irrelevant variable: no separate derivation is needed, as the result follows from Theorem 3.1.

I do like to derive the omitted variable bias in the simple case. This is not much more difficult than showing unbiasedness of OLS in the simple regression case under the first four Gauss-Markov assumptions. It is important to get the students thinking about this problem early on, and before too many additional (unnecessary) assumptions have been introduced.

I have intentionally kept the discussion of multicollinearity to a minimum. This partly indicates my bias, but it also reflects reality. It is, of course, very important for students to understand the potential consequences of having highly correlated independent variables. But this is often beyond our control, except that we can ask less of our multiple regression analysis.
If two or more explanatory variables are highly correlated in the sample, we should not expect to precisely estimate their ceteris paribus effects in the population.

I find extensive treatments of multicollinearity, where one “tests” or somehow “solves” the multicollinearity problem, to be misleading, at best. Even the organization of some texts gives the impression that imperfect multicollinearity is somehow a violation of the Gauss-Markov assumptions: they include multicollinearity in a chapter or part of the book devoted to “violation of the basic assumptions,” or something like that. I have noticed that master’s students who have had some undergraduate econometrics are often confused on the multicollinearity issue. It is very important that students not confuse multicollinearity among the included explanatory variables in a regression model with the bias caused by omitting an important variable.

I do not prove the Gauss-Markov theorem. Instead, I emphasize its implications. Sometimes, and certainly for advanced beginners, I put a special case of Problem 3.12 on a midterm exam, where I make a particular choice for the function $g(x)$. Rather than have the students directly compare the variances, they should appeal to the Gauss-Markov theorem for the superiority of OLS over any other linear, unbiased estimator.

SOLUTIONS TO PROBLEMS

3.1 (i) Yes. Because of budget constraints, it makes sense that the more siblings there are in a family, the less education any one child in the family has. To find the increase in the number of siblings that reduces predicted education by one year, we solve 1 = .094(Δsibs), so Δsibs = 1/.094 ≈ 10.6.

(ii) Holding sibs and feduc fixed, one more year of mother’s education implies .131 years more of predicted education. So if a mother has four more years of education, her son is predicted to have about a half a year (.524) more years of education.
(iii) Since the number of siblings is the same, but meduc and feduc are both different, the coefficients on meduc and feduc both need to be accounted for. The predicted difference in education between B and A is .131(4) + .210(4) = 1.364.

3.2 (i) hsperc is defined so that the larger it is, the lower the student’s standing in high school. Everything else equal, the worse the student’s standing in high school, the lower is his/her expected college GPA.

(ii) Just plug these values into the equation:

$$\widehat{colgpa} = 1.392 - .0135(20) + .00148(1050) = 2.676.$$

(iii) The difference between A and B is simply 140 times the coefficient on sat, because hsperc is the same for both students. So A is predicted to have a score .00148(140) ≈ .207 higher.

(iv) With hsperc fixed, $\Delta\widehat{colgpa} = .00148\,\Delta sat$. Now, we want to find Δsat such that $\Delta\widehat{colgpa} = .5$, so .5 = .00148(Δsat) or Δsat = .5/(.00148) ≈ 338. Perhaps not surprisingly, a large ceteris paribus difference in SAT score – almost two and one-half standard deviations – is needed to obtain a predicted difference in college GPA of a half a point.

3.3 (i) A larger rank for a law school means that the school has less prestige; this lowers starting salaries. For example, a rank of 100 means there are 99 schools thought to be better.

(ii) $\beta_1 > 0$, $\beta_2 > 0$. Both LSAT and GPA are measures of the quality of the entering class. No matter where better students attend law school, we expect them to earn more, on average. $\beta_3, \beta_4 > 0$. The number of volumes in the law library and the tuition cost are both measures of the school quality. (Cost is less obvious than library volumes, but should reflect quality of the faculty, physical plant, and so on.)

(iii) This is just the coefficient on GPA, multiplied by 100: 24.8%.

(iv) This is an elasticity: a one percent increase in library volumes implies a .095% increase in predicted median starting salary, other things equal.

(v) It is definitely better to attend a law school with a lower rank.
If law school A has a ranking 20 less than law school B, the predicted difference in starting salary is 100(.0033)(20) = 6.6% higher for law school A.

3.4 (i) If adults trade off sleep for work, more work implies less sleep (other things equal), so $\beta_1 < 0$.

(ii) The signs of $\beta_2$ and $\beta_3$ are not obvious, at least to me. One could argue that more educated people like to get more out of life, and so, other things equal, they sleep less ($\beta_2 < 0$). The relationship between sleeping and age is more complicated than this model suggests, and economists are not in the best position to judge such things.

(iii) Since totwrk is in minutes, we must convert five hours into minutes: Δtotwrk = 5(60) = 300. Then sleep is predicted to fall by .148(300) = 44.4 minutes. For a week, 45 minutes less sleep is not an overwhelming change.

(iv) More education implies less predicted time sleeping, but the effect is quite small. If we assume the difference between college and high school is four years, the college graduate sleeps about 45 minutes less per week, other things equal.

(v) Not surprisingly, the three explanatory variables explain only about 11.3% of the variation in sleep. One important factor in the error term is general health. Another is marital status, and whether the person has children. Health (however we measure that), marital status, and number and ages of children would generally be correlated with totwrk. (For example, less healthy people would tend to work less.)

3.5 Conditioning on the outcomes of the explanatory variables, we have $E(\hat\theta_1) = E(\hat\beta_1 + \hat\beta_2) = E(\hat\beta_1) + E(\hat\beta_2) = \beta_1 + \beta_2 = \theta_1$.

3.6 (i) No. By definition, study + sleep + work + leisure = 168. Therefore, if we change study, we must change at least one of the other categories so that the sum is still 168.

(ii) From part (i), we can write, say, study as a perfect linear function of the other independent variables: study = 168 − sleep − work − leisure. This holds for every observation, so MLR.3 is violated.
(iii) Simply drop one of the independent variables, say leisure:

\[ GPA = \beta_0 + \beta_1 study + \beta_2 sleep + \beta_3 work + u. \]

Now, for example, \(\beta_1\) is interpreted as the change in GPA when study increases by one hour, where sleep, work, and u are all held fixed. If we are holding sleep and work fixed but increasing study by one hour, then we must be reducing leisure by one hour. The other slope parameters have a similar interpretation.

3.7 We can use Table 3.2. By definition, \(\beta_2 > 0\), and by assumption, \(Corr(x_1, x_2) < 0\). Therefore, there is a negative bias in \(\tilde{\beta}_1\): \(E(\tilde{\beta}_1) < \beta_1\). This means that, on average across different random samples, the simple regression estimator underestimates the effect of the training program. It is even possible that \(E(\tilde{\beta}_1)\) is negative even though \(\beta_1 > 0\).

3.8 Only (ii), omitting an important variable, can cause bias, and this is true only when the omitted variable is correlated with the included explanatory variables. The homoskedasticity assumption, MLR.5, played no role in showing that the OLS estimators are unbiased. (Homoskedasticity was used to obtain the usual variance formulas for the \(\hat{\beta}_j\).) Further, the degree of collinearity between the explanatory variables in the sample, even if it is reflected in a correlation as high as .95, does not affect the Gauss-Markov assumptions. Only if there is a perfect linear relationship among two or more explanatory variables is MLR.3 violated.

3.9 (i) Because \(x_1\) is highly correlated with \(x_2\) and \(x_3\), and these latter variables have large partial effects on y, the simple and multiple regression coefficients on \(x_1\) can differ by large amounts. We have not done this case explicitly, but given equation (3.46) and the discussion with a single omitted variable, the intuition is pretty straightforward.

(ii) Here we would expect \(\tilde{\beta}_1\) and \(\hat{\beta}_1\) to be similar (subject, of course, to what we mean by "almost uncorrelated").
The amount of correlation between \(x_2\) and \(x_3\) does not directly affect the multiple regression estimate on \(x_1\) if \(x_1\) is essentially uncorrelated with \(x_2\) and \(x_3\).

(iii) In this case we are (unnecessarily) introducing multicollinearity into the regression: \(x_2\) and \(x_3\) have small partial effects on y and yet \(x_2\) and \(x_3\) are highly correlated with \(x_1\). Adding \(x_2\) and \(x_3\) likely increases the standard error of the coefficient on \(x_1\) substantially, so \(se(\hat{\beta}_1)\) is likely to be much larger than \(se(\tilde{\beta}_1)\).

(iv) In this case, adding \(x_2\) and \(x_3\) will decrease the residual variance without causing much collinearity (because \(x_1\) is almost uncorrelated with \(x_2\) and \(x_3\)), so we should see \(se(\hat{\beta}_1)\) smaller than \(se(\tilde{\beta}_1)\). The amount of correlation between \(x_2\) and \(x_3\) does not directly affect \(se(\hat{\beta}_1)\).

3.10 From equation (3.22) we have

\[ \tilde{\beta}_1 = \frac{\sum_{i=1}^{n} \hat{r}_{i1} y_i}{\sum_{i=1}^{n} \hat{r}_{i1}^2}, \]

where the \(\hat{r}_{i1}\) are defined in the problem. As usual, we must plug in the true model for \(y_i\):

\[ \tilde{\beta}_1 = \frac{\sum_{i=1}^{n} \hat{r}_{i1} (\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + u_i)}{\sum_{i=1}^{n} \hat{r}_{i1}^2}. \]

The numerator of this expression simplifies because \(\sum_{i=1}^{n} \hat{r}_{i1} = 0\), \(\sum_{i=1}^{n} \hat{r}_{i1} x_{i2} = 0\), and \(\sum_{i=1}^{n} \hat{r}_{i1} x_{i1} = \sum_{i=1}^{n} \hat{r}_{i1}^2\). These all follow from the fact that the \(\hat{r}_{i1}\) are the residuals from the regression of \(x_{i1}\) on \(x_{i2}\): the \(\hat{r}_{i1}\) have zero sample average and are uncorrelated in sample with \(x_{i2}\). So the numerator of \(\tilde{\beta}_1\) can be expressed as

\[ \beta_1 \sum_{i=1}^{n} \hat{r}_{i1}^2 + \beta_3 \sum_{i=1}^{n} \hat{r}_{i1} x_{i3} + \sum_{i=1}^{n} \hat{r}_{i1} u_i. \]

Putting these back over the denominator gives

\[ \tilde{\beta}_1 = \beta_1 + \beta_3 \frac{\sum_{i=1}^{n} \hat{r}_{i1} x_{i3}}{\sum_{i=1}^{n} \hat{r}_{i1}^2} + \frac{\sum_{i=1}^{n} \hat{r}_{i1} u_i}{\sum_{i=1}^{n} \hat{r}_{i1}^2}. \]

Conditional on all sample values on \(x_1\), \(x_2\), and \(x_3\), only the last term is random due to its dependence on \(u_i\). But \(E(u_i) = 0\), and so

\[ E(\tilde{\beta}_1) = \beta_1 + \beta_3 \frac{\sum_{i=1}^{n} \hat{r}_{i1} x_{i3}}{\sum_{i=1}^{n} \hat{r}_{i1}^2}, \]

which is what we wanted to show. Notice that the term multiplying \(\beta_3\) is the regression coefficient from the simple regression of \(x_{i3}\) on \(\hat{r}_{i1}\).

3.11 (i) \(\beta_1 < 0\) because more pollution can be expected to lower housing values; note that \(\beta_1\) is the elasticity of price with respect to nox.
\(\beta_2\) is probably positive because rooms roughly measures the size of a house. (However, it does not allow us to distinguish homes where each room is large from homes where each room is small.)

(ii) If we assume that rooms increases with quality of the home, then log(nox) and rooms are negatively correlated when poorer neighborhoods have more pollution, something that is often true. We can use Table 3.2 to determine the direction of the bias. If \(\beta_2 > 0\) and \(Corr(x_1, x_2) < 0\), the simple regression estimator \(\tilde{\beta}_1\) has a downward bias. But because \(\beta_1 < 0\), this means that the simple regression, on average, overstates the importance of pollution. [\(E(\tilde{\beta}_1)\) is more negative than \(\beta_1\).]

(iii) This is what we expect from the typical sample based on our analysis in part (ii). The simple regression estimate, −1.043, is more negative (larger in magnitude) than the multiple regression estimate, −.718. As those estimates are only for one sample, we can never know which is closer to \(\beta_1\). But if this is a "typical" sample, \(\beta_1\) is closer to −.718.

3.12 (i) For notational simplicity, define \(s_{zx} = \sum_{i=1}^{n} (z_i - \bar{z}) x_i\); this is not quite the sample covariance between z and x because we do not divide by n − 1, but we are only using it to simplify notation. Then we can write \(\tilde{\beta}_1\) as

\[ \tilde{\beta}_1 = \frac{\sum_{i=1}^{n} (z_i - \bar{z}) y_i}{s_{zx}}. \]

This is clearly a linear function of the \(y_i\): take the weights to be \(w_i = (z_i - \bar{z})/s_{zx}\). To show unbiasedness, as usual we plug \(y_i = \beta_0 + \beta_1 x_i + u_i\) into this equation, and simplify:

\[
\begin{aligned}
\tilde{\beta}_1 &= \frac{\sum_{i=1}^{n} (z_i - \bar{z})(\beta_0 + \beta_1 x_i + u_i)}{s_{zx}} \\
&= \frac{\beta_0 \sum_{i=1}^{n} (z_i - \bar{z}) + \beta_1 s_{zx} + \sum_{i=1}^{n} (z_i - \bar{z}) u_i}{s_{zx}} \\
&= \frac{\beta_1 s_{zx} + \sum_{i=1}^{n} (z_i - \bar{z}) u_i}{s_{zx}} \\
&= \beta_1 + \frac{\sum_{i=1}^{n} (z_i - \bar{z}) u_i}{s_{zx}},
\end{aligned}
\]

where we use the fact that \(\sum_{i=1}^{n} (z_i - \bar{z}) = 0\) always. Now \(s_{zx}\) is a function of the \(z_i\) and \(x_i\) and the expected value of each \(u_i\) is zero conditional on all \(z_i\) and \(x_i\) in the sample. Therefore, conditional on these values,

\[ E(\tilde{\beta}_1) = \beta_1 + \frac{\sum_{i=1}^{n} (z_i - \bar{z}) E(u_i)}{s_{zx}} = \beta_1 \]

because \(E(u_i) = 0\) for all i.
(ii) From the fourth equation in part (i) we have (again conditional on the \(z_i\) and \(x_i\) in the sample),

\[ Var(\tilde{\beta}_1) = \frac{Var\left[\sum_{i=1}^{n} (z_i - \bar{z}) u_i\right]}{s_{zx}^2} = \frac{\sum_{i=1}^{n} (z_i - \bar{z})^2 Var(u_i)}{s_{zx}^2} = \sigma^2 \frac{\sum_{i=1}^{n} (z_i - \bar{z})^2}{s_{zx}^2} \]

because of the homoskedasticity assumption [\(Var(u_i) = \sigma^2\) for all i]. Given the definition of \(s_{zx}\), this is what we wanted to show.

(iii) We know that \(Var(\hat{\beta}_1) = \sigma^2 / [\sum_{i=1}^{n} (x_i - \bar{x})^2]\). Now we can rearrange the inequality in the hint, drop \(\bar{x}\) from the sample covariance, and cancel n − 1 everywhere, to get \([\sum_{i=1}^{n} (z_i - \bar{z})^2]/s_{zx}^2 \geq 1/[\sum_{i=1}^{n} (x_i - \bar{x})^2]\). When we multiply through by \(\sigma^2\) we get \(Var(\tilde{\beta}_1) \geq Var(\hat{\beta}_1)\), which is what we wanted to show.

3.13 (i) The shares, by definition, add to one. If we do not omit one of the shares then the equation would suffer from perfect multicollinearity. The parameters would not have a ceteris paribus interpretation, as it is impossible to change one share while holding all of the other shares fixed.

(ii) Because each share is a proportion (and can be at most one, when all other shares are zero), it makes little sense to increase \(share_P\) by one unit. If \(share_P\) increases by .01, which is equivalent to a one percentage point increase in the share of property taxes in total revenue, holding \(share_I\), \(share_S\), and the other factors fixed, then growth increases by \(\beta_1(.01)\). With the other shares fixed, the excluded share, \(share_F\), must fall by .01 when \(share_P\) increases by .01.

SOLUTIONS TO COMPUTER EXERCISES

C3.1 (i) Probably \(\beta_2 > 0\), as more income typically means better nutrition for the mother and better prenatal care.

(ii) On the one hand, an increase in income generally increases the consumption of a good, and cigs and faminc could be positively correlated. On the other, family incomes are also higher for families with more education, and more education and cigarette smoking tend to be negatively correlated.
The sample correlation between cigs and faminc is about −.173, indicating a negative correlation.

(iii) The regressions without and with faminc are

\[ \widehat{bwght} = 119.77 - .514\, cigs, \quad n = 1{,}388,\ R^2 = .023 \]

and

\[ \widehat{bwght} = 116.97 - .463\, cigs + .093\, faminc, \quad n = 1{,}388,\ R^2 = .030. \]

The effect of cigarette smoking is slightly smaller when faminc is added to the regression, but the difference is not great. This is due to the fact that cigs and faminc are not very correlated, and the coefficient on faminc is practically small. (The variable faminc is measured in thousands, so $10,000 more in 1988 income increases predicted birth weight by only .93 ounces.)

C3.2 (i) The estimated equation is

\[ \widehat{price} = -19.32 + .128\, sqrft + 15.20\, bdrms, \quad n = 88,\ R^2 = .632. \]

(ii) Holding square footage constant, \(\Delta\widehat{price} = 15.20\, \Delta bdrms\), and so \(\widehat{price}\) increases by 15.20, which means $15,200.

(iii) Now \(\Delta\widehat{price} = .128\, \Delta sqrft + 15.20\, \Delta bdrms\) = .128(140) + 15.20 = 33.12, or $33,120. Because the size of the house is increasing, this is a much larger effect than in (ii).

(iv) About 63.2%.

(v) The predicted price is −19.32 + .128(2,438) + 15.20(4) = 353.544, or $353,544.

(vi) From part (v), the estimated value of the home based only on square footage and number of bedrooms is $353,544. The actual selling price was $300,000, which suggests the buyer underpaid by some margin. But, of course, there are many other features of a house (some that we cannot even measure) that affect price, and we have not controlled for these.

C3.3 (i) The constant elasticity equation is

\[ \widehat{\log(salary)} = 4.62 + .162 \log(sales) + .107 \log(mktval), \quad n = 177,\ R^2 = .299. \]

(ii) We cannot include profits in logarithmic form because profits are negative for nine of the companies in the sample. When we add it in levels form we get

\[ \widehat{\log(salary)} = 4.69 + .161 \log(sales) + .098 \log(mktval) + .000036\, profits, \quad n = 177,\ R^2 = .299. \]

The coefficient on profits is very small.
Here, profits are measured in millions, so if profits increase by $1 billion, which means Δprofits = 1,000 (a huge change), predicted salary increases by only about 3.6%. However, remember that we are holding sales and market value fixed.

Together, these variables (and we could drop profits without losing anything) explain almost 30% of the sample variation in log(salary). This is certainly not "most" of the variation.

(iii) Adding ceoten to the equation gives

\[ \widehat{\log(salary)} = 4.56 + .162 \log(sales) + .102 \log(mktval) + .000029\, profits + .012\, ceoten, \quad n = 177,\ R^2 = .318. \]

This means that one more year as CEO increases predicted salary by about 1.2%.

(iv) The sample correlation between log(mktval) and profits is about .78, which is fairly high. As we know, this causes no bias in the OLS estimators, although it can cause their variances to be large. Given the fairly substantial correlation between market value and firm profits, it is not too surprising that the latter adds nothing to explaining CEO salaries. Also, profits is a short term measure of how the firm is doing while mktval is based on past, current, and expected future profitability.

C3.4 (i) The minimum, maximum, and average values for these three variables are given in the table below:

Variable   Average   Minimum   Maximum
atndrte    81.71     6.25      100
priGPA     2.59      .86       3.93
ACT        22.51     13        32

(ii) The estimated equation is

\[ \widehat{atndrte} = 75.70 + 17.26\, priGPA - 1.72\, ACT, \quad n = 680,\ R^2 = .291. \]

The intercept means that, for a student whose prior GPA is zero and ACT score is zero, the predicted attendance rate is 75.7%. But this is clearly not an interesting segment of the population. (In fact, there are no students in the college population with priGPA = 0 and ACT = 0, or with values even close to zero.)

(iii) The coefficient on priGPA means that, if a student's prior GPA is one point higher (say, from 2.0 to 3.0), the attendance rate is about 17.3 percentage points higher. This holds ACT fixed.
The negative coefficient on ACT is, perhaps initially, a bit surprising. Five more points on the ACT is predicted to lower attendance by 8.6 percentage points at a given level of priGPA. As priGPA measures performance in college (and, at least partially, could reflect past attendance rates), while ACT is a measure of potential in college, it appears that students that had more promise (which could mean more innate ability) think they can get by with missing lectures.

(iv) We have \(\widehat{atndrte}\) = 75.70 + 17.267(3.65) − 1.72(20) ≈ 104.3. Of course, a student cannot have higher than a 100% attendance rate. Getting predictions like this is always possible when using regression methods for dependent variables with natural upper or lower bounds. In practice, we would predict a 100% attendance rate for this student. (In fact, this student had an actual attendance rate of 87.5%.)

(v) The difference in predicted attendance rates for A and B is 17.26(3.1 − 2.1) − 1.72(21 − 26) = 25.86.

C3.5 The regression of educ on exper and tenure yields

\[ educ = 13.57 - .074\, exper + .048\, tenure + \hat{r}_1, \quad n = 526,\ R^2 = .101. \]

Now, when we regress log(wage) on \(\hat{r}_1\) we obtain

\[ \widehat{\log(wage)} = 1.62 + .092\, \hat{r}_1, \quad n = 526,\ R^2 = .207. \]

As expected, the coefficient on \(\hat{r}_1\) in the second regression is identical to the coefficient on educ in equation (3.19). Notice that the R-squared from the above regression is below that in (3.19). In effect, the regression of log(wage) on \(\hat{r}_1\) explains log(wage) using only the part of educ that is uncorrelated with exper and tenure; separate effects of exper and tenure are not included.

C3.6 (i) The slope coefficient from the regression IQ on educ is (rounded to five decimal places) \(\tilde{\delta}_1\) = 3.53383.

(ii) The slope coefficient from log(wage) on educ is \(\tilde{\beta}_1\) = .05984.

(iii) The slope coefficients from log(wage) on educ and IQ are \(\hat{\beta}_1\) = .03912 and \(\hat{\beta}_2\) = .00586, respectively.

(iv) We have \(\hat{\beta}_1 + \tilde{\delta}_1 \hat{\beta}_2\) = .03912 + 3.53383(.00586) ≈ .05983, which is very close to .05984; the small difference is due to rounding error.

C3.7 (i) The results of the regression are

\[ \widehat{math10} = -20.36 + 6.23 \log(expend) - .305\, lnchprg, \quad n = 408,\ R^2 = .180. \]

The signs of the estimated slopes imply that more spending increases the pass rate (holding lnchprg fixed) and a higher poverty rate (proxied well by lnchprg) decreases the pass rate (holding spending fixed). These are what we expect.

(ii) As usual, the estimated intercept is the predicted value of the dependent variable when all regressors are set to zero. Setting lnchprg = 0 makes sense, as there are schools with low poverty rates. Setting log(expend) = 0 does not make sense, because it is the same as setting expend = 1, and spending is measured in dollars per student. Presumably this is well outside any sensible range. Not surprisingly, the prediction of a −20 pass rate is nonsensical.

(iii) The simple regression results are

\[ \widehat{math10} = -69.34 + 11.16 \log(expend), \quad n = 408,\ R^2 = .030 \]

and the estimated spending effect is larger than it was in part (i), almost double.

(iv) The sample correlation between lexpend and lnchprg is about −.19, which means that, on average, high schools with poorer students spent less per student. This makes sense, especially in 1993 in Michigan, where school funding was essentially determined by local property tax collections.

(v) We can use equation (3.23). Because \(Corr(x_1, x_2) < 0\), which means \(\tilde{\delta}_1 < 0\), and \(\hat{\beta}_2 < 0\), the simple regression estimate, \(\tilde{\beta}_1\), is larger than the multiple regression estimate, \(\hat{\beta}_1\). Intuitively, failing to account for the poverty rate leads to an overestimate of the effect of spending.

C3.8 (i) The average of prpblck is .113 with standard deviation .182; the average of income is 47,053.78 with standard deviation 13,179.29. It is evident that prpblck is a proportion and that income is measured in dollars.

(ii) The results from the OLS regression are

\[ \widehat{psoda} = .956 + .115\, prpblck + .0000016\, income, \quad n = 401,\ R^2 = .064. \]

If, say, prpblck increases by .10 (ten percentage points), the price of soda is estimated to increase by .0115 dollars, or about 1.2 cents. While this does not seem large, there are communities with no black population and others that are almost all black, in which case the difference in psoda is estimated to be almost 11.5 cents.

(iii) The simple regression estimate on prpblck is .065, so the simple regression estimate is actually lower. This is because prpblck and income are negatively correlated (−.43) and income has a positive coefficient in the multiple regression.

(iv) To get a constant elasticity, income should be in logarithmic form. I estimate the constant elasticity model:

\[ \widehat{\log(psoda)} = -.794 + .122\, prpblck + .077 \log(income), \quad n = 401,\ R^2 = .068. \]

If prpblck increases by .20, log(psoda) is estimated to increase by .20(.122) = .0244, or about 2.44 percent.

(v) \(\hat{\beta}_{prpblck}\) falls to about .073 when prppov is added to the regression.

(vi) The correlation is about −.84, which makes sense because poverty rates are determined by income (but not directly in terms of median income).

(vii) There is no argument that they are highly correlated, but we are using them simply as controls to determine if there is price discrimination against blacks. In order to isolate the pure discrimination effect, we need to control for as many measures of income as we can; including both variables makes sense.

This edition is intended for use outside of the U.S. only, with content that may be different from the U.S. Edition. This may not be resold, copied, or distributed without the prior consent of the publisher.
C3.9 (i) The estimated equation is

\[ \widehat{gift} = -4.55 + 2.17\, mailsyear + .0059\, giftlast + 15.36\, propresp, \quad n = 4{,}268,\ R^2 = .0834. \]

The R-squared is now about .083, compared with about .014 for the simple regression case. Therefore, the variables giftlast and propresp help to explain significantly more variation in gifts in the sample (although still just over eight percent).

(ii) Holding giftlast and propresp fixed, one more mailing per year is estimated to increase gifts by 2.17 guilders. The simple regression estimate is 2.65, so the multiple regression estimate is somewhat smaller. Remember, the simple regression estimate holds no other factors fixed.

(iii) Because propresp is a proportion, it makes little sense to increase it by one. Such an increase can happen only if propresp goes from zero to one. Instead, consider a .10 increase in propresp, which means a 10 percentage point increase. Then, gift is estimated to be 15.36(.1) ≈ 1.54 guilders higher.

(iv) The estimated equation is

\[ \widehat{gift} = -7.33 + 1.20\, mailsyear - .261\, giftlast + 16.20\, propresp + .527\, avggift, \quad n = 4{,}268,\ R^2 = .2005. \]

After controlling for the average past gift level, the effect of mailings becomes even smaller: 1.20 guilders, or less than half the effect estimated by simple regression.

(v) After controlling for the average of past gifts, which we can view as measuring the "typical" generosity of the person and is positively related to the current gift level, we find that the current gift amount is negatively related to the most recent gift. A negative relationship makes some sense, as people might follow a large donation with a smaller one.
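The relationship checked numerically in C3.6 (iv), \(\tilde{\beta}_1 = \hat{\beta}_1 + \hat{\beta}_2 \tilde{\delta}_1\), is an algebraic identity that holds exactly in any sample, up to floating-point rounding. The following is a minimal Python sketch of that check on synthetic data, not on the data set used in the exercise. The helper names `simple_slope`, `residuals`, and `partial_slope`, the seed, and the data-generating values are all assumptions made for illustration. The multiple regression coefficients are computed by partialling out, in the spirit of Problem 3.10 and C3.5.

```python
import random


def mean(v):
    return sum(v) / len(v)


def simple_slope(x, y):
    """Slope from the simple OLS regression of y on x (intercept included)."""
    xb, yb = mean(x), mean(y)
    sxy = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y))
    sxx = sum((xi - xb) ** 2 for xi in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals from the simple OLS regression of y on x (intercept included)."""
    b = simple_slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - a - b * xi for xi, yi in zip(x, y)]


def partial_slope(x1, x2, y):
    """Coefficient on x1 in the regression of y on x1 and x2, by partialling out."""
    r1 = residuals(x2, x1)  # the part of x1 that is uncorrelated with x2
    return sum(r * yi for r, yi in zip(r1, y)) / sum(r * r for r in r1)


# Synthetic data (hypothetical values): x2 is correlated with x1.
random.seed(0)
n = 500
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [0.6 * a + random.gauss(0, 1) for a in x1]
y = [1.0 + 0.5 * a + 0.3 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

b1_hat = partial_slope(x1, x2, y)   # multiple regression coefficient on x1
b2_hat = partial_slope(x2, x1, y)   # multiple regression coefficient on x2
delta1 = simple_slope(x1, x2)       # slope from regressing x2 on x1
b1_tilde = simple_slope(x1, y)      # simple regression of y on x1 alone

# The two sides of the identity differ only by rounding error.
gap = abs(b1_tilde - (b1_hat + b2_hat * delta1))
print(gap < 1e-9)
```

Because the estimated coefficient on x2 and the slope from regressing x2 on x1 are both positive in this setup, the simple regression slope exceeds the multiple regression slope, which is the upward-bias case of Table 3.2.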

Attractor states and infrared scaling in de Sitter space

Quantum field theory in curved spacetimes does not contain in itself a unique specification of the quantum state of the system [1]. Even in Minkowski spacetime, where the existence of the Poincaré group singles out a special state, the Minkowski vacuum, it is certainly of interest to consider states that are non-invariant under Poincaré transformations, since they contain all the information about the physical excitations and dynamics of the theory. Such non-vacuum states are also necessary in a general initial value formulation of the back-reaction problem in both curved and flat spacetimes. In flat space the initial value problem for arbitrary physically allowable states has been formulated and studied for both QED and scalar Φ⁴ theory in the large N limit, principally for time varying but spatially homogeneous mean fields [2,3]. The simplest situation in which the back-reaction problem can be studied in curved spacetime is that of a free scalar field in a spatially homogeneous and isotropic Robertson-Walker (RW) cosmology, where the geometry is characterized by just one non-trivial function of time. The wave equation for a free scalar field in such a geometry can be separated and expressed in terms of a complete set of time dependent mode functions. The general initial value problem is specified by giving initial data for this complete set at a given initial time. The back-reaction of the quantum scalar field(s) on the RW geometry can then be studied by constructing the renormalized expectation value of the energy-momentum tensor T_ab of the field(s) and solving (numerically) the semi-classical Einstein equations, augmented by higher derivative terms required by renormalization [4–9]. As in the flat space examples, this semi-classical back-reaction problem becomes exact in the large N limit, with N the number of identical scalar fields [10].
As a prelude to the dynamical back-reaction problem in cosmological spacetimes it is necessary to study non-vacuum states first in fixed RW backgrounds. The maximally symmetric de Sitter spacetime is of particular interest. Most previous work has focused on maximally symmetric O(4, 1) de Sitter invariant states or the special O(4) invariant state found by Allen [11]. Since the universe is not globally O(4, 1) invariant, a more generic set of initial conditions, consistent only with RW symmetry and general principles of renormalization of T_ab, is required for cosmology. The

Machine Learning English Vocabulary


Contents
Part 1
Part 2: Letter A, Letter B, Letter C, Letter D, Letter E, Letter F, Letter G, Letter H, Letter I, Letter K, Letter L, Letter M, Letter N, Letter O, Letter P, Letter R, Letter S, Letter T, Letter U, Letter W, Letter Z
Part 3: A, B, C, D, E, F, G, H, L, J, L, M, N, O, P, Q, R, S, U, V

Part 1
intensity 强度
Regression 回归
Loss function 损失函数
non-convex 非凸函数
neural network 神经网络
supervised learning 监督学习
regression problem 回归问题 (deals with continuous outputs)
classification problem 分类问题 (deals with discrete rather than continuous outputs)
Note: the difference between the two is that a regression problem has a continuous result, while a classification problem has a discrete result.

discrete value 离散值
support vector machines 支持向量机 (used for classification when the input has many dimensions, possibly infinitely many)
learning theory 学习理论
learning algorithms 学习算法
unsupervised learning 无监督学习
gradient descent 梯度下降
linear regression 线性回归
Neural Network 神经网络
gradient descent 梯度下降 (an algorithm used in supervised learning to fit a model)
normal equations
linear algebra 线性代数
superscript 上标
exponentiation 指数
training set 训练集合
training example 训练样本
hypothesis 假设 (denotes the output of the learning algorithm; the letter h is only a historical convention, so there is no need to read much into it)
LMS algorithm "least mean squares" 最小均方算法
batch gradient descent 批量梯度下降 (slow, because every step recomputes the fitting error)
stochastic gradient descent 随机梯度下降 (generally faster than batch gradient descent, but usually less accurate)
iterative algorithm 迭代算法
partial derivative 偏导数
contour 等高线
quadratic function 二次函数
locally weighted regression 局部加权回归
underfitting 欠拟合
overfitting 过拟合
non-parametric learning algorithms 无参数学习算法
parametric learning algorithm 参数学习算法

Other
activation 激活值
activation function 激活函数
additive noise 加性噪声
autoencoder 自编码器
Autoencoders 自编码算法
average firing rate 平均激活率
average sum-of-squares error 均方差
backpropagation 后向传播
basis 基
basis feature vectors 特征基向量
batch gradient ascent 批量梯度上升法
Bayesian regularization method 贝叶斯规则化方法
Bernoulli random variable 伯努利随机变量
bias term 偏置项
binary classification 二元分类
class labels 类型标记
concatenation 级联
conjugate gradient 共轭梯度
contiguous groups 联通区域
convex optimization software 凸优化软件
convolution 卷积
cost function 代价函数
covariance matrix 协方差矩阵
DC component 直流分量
decorrelation 去相关
degeneracy 退化
dimensionality reduction 降维
derivative 导函数
diagonal 对角线
diffusion of gradients 梯度的弥散
eigenvalue 特征值
eigenvector 特征向量
error term 残差
feature matrix 特征矩阵
feature standardization 特征标准化
feedforward architectures 前馈结构算法
feedforward neural network 前馈神经网络
feedforward pass 前馈传导
fine-tuned 微调
first-order feature 一阶特征
forward pass 前向传导
forward propagation 前向传播
Gaussian prior 高斯先验概率
generative model 生成模型
gradient descent 梯度下降
Greedy layer-wise training 逐层贪婪训练方法
grouping matrix 分组矩阵
Hadamard product 阿达马乘积
Hessian matrix Hessian 矩阵
hidden layer 隐含层
hidden units 隐藏神经元
Hierarchical grouping 层次型分组
higher-order features 更高阶特征
highly non-convex optimization problem 高度非凸的优化问题
histogram 直方图
hyperbolic tangent 双曲正切函数
hypothesis 估值,假设
identity activation function 恒等激励函数
IID 独立同分布
illumination 照明
inactive 抑制
independent component analysis 独立成份分析
input domains 输入域
input layer 输入层
intensity 亮度/灰度
intercept term 截距
KL divergence 相对熵
KL divergence KL分散度
k-Means K-均值
learning rate 学习速率
least squares 最小二乘法
linear correspondence 线性响应
linear superposition 线性叠加
line-search algorithm 线搜索算法
local mean subtraction 局部均值消减
local optima 局部最优解
logistic regression 逻辑回归
loss function 损失函数
low-pass filtering 低通滤波
magnitude 幅值
MAP 极大后验估计
maximum likelihood estimation 极大似然估计
mean 平均值
MFCC Mel 倒频系数
multi-class classification 多元分类
neural networks 神经网络
neuron 神经元
Newton's method 牛顿法
non-convex function 非凸函数
non-linear feature 非线性特征
norm 范数
norm bounded 有界范数
norm constrained 范数约束
normalization 归一化
numerical roundoff errors 数值舍入误差
numerically checking 数值检验
numerically reliable 数值计算上稳定
object detection 物体检测
objective function 目标函数
off-by-one error 缺位错误
orthogonalization 正交化
output layer 输出层
overall cost function 总体代价函数
over-complete basis 超完备基
over-fitting 过拟合
parts of objects 目标的部件
part-whole decomposition 部分-整体分解
PCA 主元分析
penalty term 惩罚因子
per-example mean subtraction 逐样本均值消减
pooling 池化
pretrain 预训练
principal components analysis 主成份分析
quadratic constraints 二次约束
RBMs 受限Boltzmann机
reconstruction based models 基于重构的模型
reconstruction cost 重建代价
reconstruction term 重构项
redundant 冗余
reflection matrix 反射矩阵
regularization 正则化
regularization term 正则化项
rescaling 缩放
robust 鲁棒性
run 行程
second-order feature 二阶特征
sigmoid activation function S型激励函数
significant digits 有效数字
singular value 奇异值
singular vector 奇异向量
smoothed L1 penalty 平滑的L1范数惩罚
Smoothed topographic L1 sparsity penalty 平滑地形L1稀疏惩罚函数
smoothing 平滑
Softmax Regression Softmax回归
sorted in decreasing order 降序排列
source features 源特征
sparse autoencoder 稀疏自编码器
Sparsity 稀疏性
sparsity parameter 稀疏性参数
sparsity penalty 稀疏惩罚
square function 平方函数
squared-error 方差
stationary 平稳性(不变性)
stationary stochastic process 平稳随机过程
step-size 步长值
supervised learning 监督学习
symmetric positive semi-definite matrix 对称半正定矩阵
symmetry breaking 对称失效
tanh function 双曲正切函数
the average activation 平均活跃度
the derivative checking method 梯度验证方法
the empirical distribution 经验分布函数
the energy function 能量函数
the Lagrange dual 拉格朗日对偶函数
the log likelihood 对数似然函数
the pixel intensity value 像素灰度值
the rate of convergence 收敛速度
topographic cost term 拓扑代价项
topographic ordered 拓扑秩序
transformation 变换
translation invariant 平移不变性
trivial answer 平凡解
under-complete basis 不完备基
unrolling 组合扩展
unsupervised learning 无监督学习
variance 方差
vectorized implementation 向量化实现
vectorization 矢量化
visual cortex 视觉皮层
weight decay 权重衰减
weighted average 加权平均值
whitening 白化
zero-mean 均值为零

Part 2

Letter A
Accumulated error backpropagation 累积误差逆传播
Activation Function 激活函数
Adaptive Resonance Theory/ART 自适应谐振理论
Additive model 加性模型
Adversarial Networks 对抗网络
Affine Layer 仿射层
Affinity matrix 亲和矩阵
Agent 代理/智能体
Algorithm 算法
Alpha-beta pruning α-β剪枝
Anomaly detection 异常检测
Approximation 近似
Area Under ROC Curve/AUC ROC 曲线下面积
Artificial General Intelligence/AGI 通用人工智能
Artificial Intelligence/AI 人工智能
Association analysis 关联分析
Attention mechanism 注意力机制
Attribute conditional independence assumption 属性条件独立性假设
Attribute space 属性空间
Attribute value 属性值
Autoencoder 自编码器
Automatic speech recognition 自动语音识别
Automatic summarization 自动摘要
Average gradient 平均梯度
Average-Pooling 平均池化

Letter B
Backpropagation Through Time 通过时间的反向传播
Backpropagation/BP 反向传播
Base learner 基学习器
Base learning algorithm 基学习算法
Batch Normalization/BN 批量归一化
Bayes decision rule 贝叶斯判定准则
Bayes Model Averaging/BMA 贝叶斯模型平均
Bayes optimal classifier 贝叶斯最优分类器
Bayesian decision theory 贝叶斯决策论
Bayesian network 贝叶斯网络
Between-class scatter matrix 类间散度矩阵
Bias 偏置/偏差
Bias-variance decomposition 偏差-方差分解
Bias-Variance Dilemma 偏差-方差困境
Bi-directional Long-Short Term Memory/Bi-LSTM 双向长短期记忆
Binary classification 二分类
Binomial test 二项检验
Bi-partition 二分法
Boltzmann machine 玻尔兹曼机
Bootstrap sampling 自助采样法/可重复采样/有放回采样
Bootstrapping 自助法
Break-Even Point/BEP 平衡点

Letter C
Calibration 校准
Cascade-Correlation 级联相关
Categorical attribute 离散属性
Class-conditional probability 类条件概率
Classification and regression tree/CART 分类与回归树
Classifier 分类器
Class-imbalance 类别不平衡
Closed-form 闭式
Cluster 簇/类/集群
Cluster analysis 聚类分析
Clustering 聚类
Clustering ensemble 聚类集成
Co-adapting 共适应
Coding matrix 编码矩阵
COLT 国际学习理论会议
Committee-based learning 基于委员会的学习
Competitive learning 竞争型学习
Component learner 组件学习器
Comprehensibility 可解释性
Computation Cost 计算成本
Computational Linguistics 计算语言学
Computer vision 计算机视觉
Concept drift 概念漂移
Concept Learning System/CLS 概念学习系统
Conditional entropy 条件熵
Conditional mutual information 条件互信息
Conditional Probability Table/CPT 条件概率表
Conditional random field/CRF 条件随机场
Conditional risk 条件风险
Confidence 置信度
Confusion matrix 混淆矩阵
Connection weight 连接权
Connectionism 连结主义
Consistency 一致性/相合性
Contingency table 列联表
Continuous attribute 连续属性
Convergence 收敛
Conversational agent 会话智能体
Convex quadratic programming 凸二次规划
Convexity 凸性
Convolutional neural network/CNN 卷积神经网络
Co-occurrence 同现
Correlation coefficient 相关系数
Cosine similarity 余弦相似度
Cost curve 成本曲线
Cost Function 成本函数
Cost matrix 成本矩阵
Cost-sensitive 成本敏感
Cross entropy 交叉熵
Cross validation 交叉验证
Crowdsourcing 众包
Curse of dimensionality 维数灾难
Cut point 截断点
Cutting plane algorithm 割平面法

Letter D
Data mining 数据挖掘
Data set 数据集
Decision Boundary 决策边界
Decision stump 决策树桩
Decision tree 决策树/判定树
Deduction 演绎
Deep Belief Network 深度信念网络
Deep Convolutional Generative Adversarial Network/DCGAN 深度卷积生成对抗网络
Deep learning 深度学习
Deep neural network/DNN 深度神经网络
Deep Q-Learning 深度Q学习
Deep Q-Network 深度Q网络
Density estimation 密度估计
Density-based clustering 密度聚类
Differentiable neural computer 可微分神经计算机
Dimensionality reduction algorithm 降维算法
Directed edge 有向边
Disagreement measure 不合度量
Discriminative model 判别模型
Discriminator 判别器
Distance measure 距离度量
Distance metric learning 距离度量学习
Distribution 分布
Divergence 散度
Diversity measure 多样性度量/差异性度量
Domain adaptation 领域自适应
Downsampling 下采样
D-separation (Directed separation) 有向分离
Dual problem 对偶问题
Dummy node 哑结点
Dynamic Fusion 动态融合
Dynamic programming 动态规划

Letter E
Eigenvalue decomposition 特征值分解
Embedding 嵌入
Emotional analysis 情绪分析
Empirical conditional entropy 经验条件熵
Empirical entropy 经验熵
Empirical error 经验误差
Empirical risk 经验风险
End-to-End 端到端
Energy-based model 基于能量的模型
Ensemble learning 集成学习
Ensemble pruning 集成修剪
Error Correcting Output Codes/ECOC 纠错输出码
Error rate 错误率
Error-ambiguity decomposition 误差-分歧分解
Euclidean distance 欧氏距离
Evolutionary computation 演化计算
Expectation-Maximization 期望最大化
Expected loss 期望损失
Exploding Gradient Problem 梯度爆炸问题
Exponential loss function 指数损失函数
Extreme Learning Machine/ELM 超限学习机

Letter F
Factorization 因子分解
False negative 假负类
False positive 假正类
False Positive Rate/FPR 假正例率
Feature engineering 特征工程
Feature selection 特征选择
Feature vector 特征向量
Feature learning 特征学习
Feedforward Neural Networks/FNN 前馈神经网络
Fine-tuning 微调
Flipping output 翻转法
Fluctuation 震荡
Forward stagewise algorithm 前向分步算法
Frequentist 频率主义学派
Full-rank matrix 满秩矩阵
Functional neuron 功能神经元

Letter G
Gain ratio 增益率
Game theory 博弈论
Gaussian kernel function 高斯核函数
Gaussian Mixture Model 高斯混合模型
General Problem Solving 通用问题求解
Generalization 泛化
Generalization error 泛化误差
Generalization error bound 泛化误差上界
Generalized Lagrange function 广义拉格朗日函数
Generalized linear model 广义线性模型
Generalized Rayleigh quotient 广义瑞利商
Generative Adversarial Networks/GAN 生成对抗网络
Generative Model 生成模型
Generator 生成器
Genetic Algorithm/GA 遗传算法
Gibbs sampling 吉布斯采样
Gini index 基尼指数
Global minimum 全局最小
Global Optimization 全局优化
Gradient boosting 梯度提升
Gradient Descent 梯度下降
Graph theory 图论
Ground-truth 真相/真实

Letter H
Hard margin 硬间隔
Hard voting 硬投票
Harmonic mean 调和平均
Hesse matrix 海塞矩阵
Hidden dynamic model 隐动态模型
Hidden layer 隐藏层
Hidden Markov Model/HMM 隐马尔可夫模型
Hierarchical clustering 层次聚类
Hilbert space 希尔伯特空间
Hinge loss function 合页损失函数
Hold-out 留出法
Homogeneous 同质
Hybrid computing 混合计算
Hyperparameter 超参数
Hypothesis 假设
Hypothesis test 假设验证

Letter I
ICML 国际机器学习会议
Improved iterative scaling/IIS 改进的迭代尺度法
Incremental learning 增量学习
Independent and identically distributed/i.i.d. 独立同分布
Independent Component Analysis/ICA 独立成分分析
Indicator function 指示函数
Individual learner 个体学习器
Induction 归纳
Inductive bias 归纳偏好
Inductive learning 归纳学习
Inductive Logic Programming/ILP 归纳逻辑程序设计
Information entropy 信息熵
Information gain 信息增益
Input layer 输入层
Insensitive loss 不敏感损失
Inter-cluster similarity 簇间相似度
International Conference on Machine Learning/ICML 国际机器学习大会
Intra-cluster similarity 簇内相似度
Intrinsic value 固有值
Isometric Mapping/Isomap 等度量映射
Isotonic regression 保序回归
Iterative Dichotomiser 迭代二分器

Letter K
Kernel method 核方法
Kernel trick 核技巧
Kernelized Linear Discriminant Analysis/KLDA 核线性判别分析
K-fold cross validation k折交叉验证/k倍交叉验证
K-Means Clustering K-均值聚类
K-Nearest Neighbours Algorithm/KNN K近邻算法
Knowledge base 知识库
Knowledge Representation 知识表征

Letter L
Label space 标记空间
Lagrange duality 拉格朗日对偶性
Lagrange multiplier 拉格朗日乘子
Laplace smoothing 拉普拉斯平滑
Laplacian correction 拉普拉斯修正
Latent Dirichlet Allocation 隐狄利克雷分布
Latent semantic analysis 潜在语义分析
Latent variable 隐变量
Lazy learning 懒惰学习
Learner 学习器
Learning by analogy 类比学习
Learning rate 学习率
Learning Vector Quantization/LVQ 学习向量量化
Least squares regression tree 最小二乘回归树
Leave-One-Out/LOO 留一法
Linear chain conditional random field 线性链条件随机场
Linear Discriminant Analysis/LDA 线性判别分析
Linear model 线性模型
Linear Regression 线性回归
Link function 联系函数
Local Markov property 局部马尔可夫性
Local minimum 局部最小
Log likelihood 对数似然
Log odds/logit 对数几率
Logistic Regression Logistic 回归
Log-likelihood 对数似然
Log-linear regression 对数线性回归
Long-Short Term Memory/LSTM 长短期记忆
Loss function 损失函数

Letter M
Machine translation/MT 机器翻译
Macro-P 宏查准率
Macro-R 宏查全率
Majority voting 绝对多数投票法
Manifold assumption 流形假设
Manifold learning 流形学习
Margin theory 间隔理论
Marginal distribution 边际分布
Marginal independence 边际独立性
Marginalization 边际化
Markov Chain Monte Carlo/MCMC 马尔可夫链蒙特卡罗方法
Markov Random Field 马尔可夫随机场[ ] Maximal clique 最大团[ ] Maximum Likelihood Estimation/MLE 极大似然估计/极大似然法[ ] Maximum margin 最大间隔[ ] Maximum weighted spanning tree 最大带权生成树[ ] Max-Pooling 最大池化[ ] Mean squared error 均方误差[ ] Meta-learner 元学习器[ ] Metric learning 度量学习[ ] Micro-P 微查准率[ ] Micro-R 微查全率[ ] Minimal Description Length/MDL 最小描述长度[ ] Minimax game 极小极大博弈[ ] Misclassification cost 误分类成本[ ] Mixture of experts 混合专家[ ] Momentum 动量[ ] Moral graph 道德图/端正图[ ] Multi-class classification 多分类[ ] Multi-document summarization 多文档摘要[ ] Multi-layer feedforward neural networks 多层前馈神经网络[ ] Multilayer Perceptron/MLP 多层感知器[ ] Multimodal learning 多模态学习[550 ] Multiple Dimensional Scaling 多维缩放[ ] Multiple linear regression 多元线性回归[ ] Multi-response Linear Regression /MLR 多响应线性回归[ ] Mutual information 互信息Letter N[ ] Naive bayes 朴素贝叶斯[ ] Naive Bayes Classifier 朴素贝叶斯分类器[ ] Named entity recognition 命名实体识别[ ] Nash equilibrium 纳什均衡[ ] Natural language generation/NLG 自然语言生成[ ] Natural language processing 自然语言处理[ ] Negative class 负类[ ] Negative correlation 负相关法[ ] Negative Log Likelihood 负对数似然[ ] Neighbourhood Component Analysis/NCA 近邻成分分析[ ] Neural Machine Translation 神经机器翻译[ ] Neural Turing Machine 神经图灵机[ ] Newton method 牛顿法[ ] NIPS 国际神经信息处理系统会议[ ] No Free Lunch Theorem/NFL 没有免费的午餐定理[ ] Noise-contrastive estimation 噪音对比估计[ ] Nominal attribute 列名属性[ ] Non-convex optimization 非凸优化[ ] Nonlinear model 非线性模型[ ] Non-metric distance 非度量距离[ ] Non-negative matrix factorization 非负矩阵分解[ ] Non-ordinal attribute 无序属性[ ] Non-Saturating Game 非饱和博弈[ ] Norm 范数[ ] Normalization 归一化[ ] Nuclear norm 核范数[ ] Numerical attribute 数值属性Letter O[ ] Objective function 目标函数[ ] Oblique decision tree 斜决策树[ ] Occam’s razor 奥卡姆剃刀[ ] Odds 几率[ ] Off-Policy 离策略[ ] One shot learning 一次性学习[ ] One-Dependent Estimator/ODE 独依赖估计[ ] On-Policy 在策略[ ] Ordinal attribute 有序属性[ ] Out-of-bag estimate 包外估计[ ] Output layer 输出层[ ] Output smearing 输出调制法[ ] Overfitting 过拟合/过配[600 ] Oversampling 过采样Letter P[ ] Paired t-test 成对t 检验[ ] Pairwise 成对型[ ] 
Pairwise Markov property 成对马尔可夫性[ ] Parameter 参数[ ] Parameter estimation 参数估计[ ] Parameter tuning 调参[ ] Parse tree 解析树[ ] Particle Swarm Optimization/PSO 粒子群优化算法[ ] Part-of-speech tagging 词性标注[ ] Perceptron 感知机[ ] Performance measure 性能度量[ ] Plug and Play Generative Network 即插即用生成网络[ ] Plurality voting 相对多数投票法[ ] Polarity detection 极性检测[ ] Polynomial kernel function 多项式核函数[ ] Pooling 池化[ ] Positive class 正类[ ] Positive definite matrix 正定矩阵[ ] Post-hoc test 后续检验[ ] Post-pruning 后剪枝[ ] potential function 势函数[ ] Precision 查准率/准确率[ ] Prepruning 预剪枝[ ] Principal component analysis/PCA 主成分分析[ ] Principle of multiple explanations 多释原则[ ] Prior 先验[ ] Probability Graphical Model 概率图模型[ ] Proximal Gradient Descent/PGD 近端梯度下降[ ] Pruning 剪枝[ ] Pseudo-label 伪标记[ ] Letter Q[ ] Quantized Neural Network 量子化神经网络[ ] Quantum computer 量子计算机[ ] Quantum Computing 量子计算[ ] Quasi Newton method 拟牛顿法Letter R[ ] Radial Basis Function/RBF 径向基函数[ ] Random Forest Algorithm 随机森林算法[ ] Random walk 随机漫步[ ] Recall 查全率/召回率[ ] Receiver Operating Characteristic/ROC 受试者工作特征[ ] Rectified Linear Unit/ReLU 线性修正单元[650 ] Recurrent Neural Network 循环神经网络[ ] Recursive neural network 递归神经网络[ ] Reference model 参考模型[ ] Regression 回归[ ] Regularization 正则化[ ] Reinforcement learning/RL 强化学习[ ] Representation learning 表征学习[ ] Representer theorem 表示定理[ ] reproducing kernel Hilbert space/RKHS 再生核希尔伯特空间[ ] Re-sampling 重采样法[ ] Rescaling 再缩放[ ] Residual Mapping 残差映射[ ] Residual Network 残差网络[ ] Restricted Boltzmann Machine/RBM 受限玻尔兹曼机[ ] Restricted Isometry Property/RIP 限定等距性[ ] Re-weighting 重赋权法[ ] Robustness 稳健性/鲁棒性[ ] Root node 根结点[ ] Rule Engine 规则引擎[ ] Rule learning 规则学习Letter S[ ] Saddle point 鞍点[ ] Sample space 样本空间[ ] Sampling 采样[ ] Score function 评分函数[ ] Self-Driving 自动驾驶[ ] Self-Organizing Map/SOM 自组织映射[ ] Semi-naive Bayes classifiers 半朴素贝叶斯分类器[ ] Semi-Supervised Learning 半监督学习[ ] semi-Supervised Support Vector Machine 半监督支持向量机[ ] Sentiment analysis 情感分析[ ] Separating hyperplane 分离超平面[ ] Sigmoid function Sigmoid 
函数[ ] Similarity measure 相似度度量[ ] Simulated annealing 模拟退火[ ] Simultaneous localization and mapping 同步定位与地图构建[ ] Singular Value Decomposition 奇异值分解[ ] Slack variables 松弛变量[ ] Smoothing 平滑[ ] Soft margin 软间隔[ ] Soft margin maximization 软间隔最大化[ ] Soft voting 软投票[ ] Sparse representation 稀疏表征[ ] Sparsity 稀疏性[ ] Specialization 特化[ ] Spectral Clustering 谱聚类[ ] Speech Recognition 语音识别[ ] Splitting variable 切分变量[700 ] Squashing function 挤压函数[ ] Stability-plasticity dilemma 可塑性-稳定性困境[ ] Statistical learning 统计学习[ ] Status feature function 状态特征函[ ] Stochastic gradient descent 随机梯度下降[ ] Stratified sampling 分层采样[ ] Structural risk 结构风险[ ] Structural risk minimization/SRM 结构风险最小化[ ] Subspace 子空间[ ] Supervised learning 监督学习/有导师学习[ ] support vector expansion 支持向量展式[ ] Support Vector Machine/SVM 支持向量机[ ] Surrogat loss 替代损失[ ] Surrogate function 替代函数[ ] Symbolic learning 符号学习[ ] Symbolism 符号主义[ ] Synset 同义词集Letter T[ ] T-Distribution Stochastic Neighbour Embedding/t-SNE T –分布随机近邻嵌入[ ] Tensor 张量[ ] Tensor Processing Units/TPU 张量处理单元[ ] The least square method 最小二乘法[ ] Threshold 阈值[ ] Threshold logic unit 阈值逻辑单元[ ] Threshold-moving 阈值移动[ ] Time Step 时间步骤[ ] Tokenization 标记化[ ] Training error 训练误差[ ] Training instance 训练示例/训练例[ ] Transductive learning 直推学习[ ] Transfer learning 迁移学习[ ] Treebank 树库[ ] Tria-by-error 试错法[ ] True negative 真负类[ ] True positive 真正类[ ] True Positive Rate/TPR 真正例率[ ] Turing Machine 图灵机[ ] Twice-learning 二次学习Letter U[ ] Underfitting 欠拟合/欠配[ ] Undersampling 欠采样[ ] Understandability 可理解性[ ] Unequal cost 非均等代价[ ] Unit-step function 单位阶跃函数[ ] Univariate decision tree 单变量决策树[ ] Unsupervised learning 无监督学习/无导师学习[ ] Unsupervised layer-wise training 无监督逐层训练[ ] Upsampling 上采样Letter V[ ] Vanishing Gradient Problem 梯度消失问题[ ] Variational inference 变分推断[ ] VC Theory VC维理论[ ] Version space 版本空间[ ] Viterbi algorithm 维特比算法[760 ] Von Neumann architecture 冯· 诺伊曼架构Letter W[ ] Wasserstein GAN/WGAN Wasserstein生成对抗网络[ ] Weak learner 弱学习器[ ] Weight 权重[ ] Weight sharing 权共享[ ] 
Weighted voting 加权投票法[ ] Within-class scatter matrix 类内散度矩阵[ ] Word embedding 词嵌入[ ] Word sense disambiguation 词义消歧Letter Z[ ] Zero-data learning 零数据学习[ ] Zero-shot learning 零次学习第三部分A[ ] approximations近似值[ ] arbitrary随意的[ ] affine仿射的[ ] arbitrary任意的[ ] amino acid氨基酸[ ] amenable经得起检验的[ ] axiom公理,原则[ ] abstract提取[ ] architecture架构,体系结构;建造业[ ] absolute绝对的[ ] arsenal军火库[ ] assignment分配[ ] algebra线性代数[ ] asymptotically无症状的[ ] appropriate恰当的B[ ] bias偏差[ ] brevity简短,简洁;短暂[800 ] broader广泛[ ] briefly简短的[ ] batch批量C[ ] convergence 收敛,集中到一点[ ] convex凸的[ ] contours轮廓[ ] constraint约束[ ] constant常理[ ] commercial商务的[ ] complementarity补充[ ] coordinate ascent同等级上升[ ] clipping剪下物;剪报;修剪[ ] component分量;部件[ ] continuous连续的[ ] covariance协方差[ ] canonical正规的,正则的[ ] concave非凸的[ ] corresponds相符合;相当;通信[ ] corollary推论[ ] concrete具体的事物,实在的东西[ ] cross validation交叉验证[ ] correlation相互关系[ ] convention约定[ ] cluster一簇[ ] centroids 质心,形心[ ] converge收敛[ ] computationally计算(机)的[ ] calculus计算D[ ] derive获得,取得[ ] dual二元的[ ] duality二元性;二象性;对偶性[ ] derivation求导;得到;起源[ ] denote预示,表示,是…的标志;意味着,[逻]指称[ ] divergence 散度;发散性[ ] dimension尺度,规格;维数[ ] dot小圆点[ ] distortion变形[ ] density概率密度函数[ ] discrete离散的[ ] discriminative有识别能力的[ ] diagonal对角[ ] dispersion分散,散开[ ] determinant决定因素[849 ] disjoint不相交的E[ ] encounter遇到[ ] ellipses椭圆[ ] equality等式[ ] extra额外的[ ] empirical经验;观察[ ] ennmerate例举,计数[ ] exceed超过,越出[ ] expectation期望[ ] efficient生效的[ ] endow赋予[ ] explicitly清楚的[ ] exponential family指数家族[ ] equivalently等价的F[ ] feasible可行的[ ] forary初次尝试[ ] finite有限的,限定的[ ] forgo摒弃,放弃[ ] fliter过滤[ ] frequentist最常发生的[ ] forward search前向式搜索[ ] formalize使定形G[ ] generalized归纳的[ ] generalization概括,归纳;普遍化;判断(根据不足)[ ] guarantee保证;抵押品[ ] generate形成,产生[ ] geometric margins几何边界[ ] gap裂口[ ] generative生产的;有生产力的H[ ] heuristic启发式的;启发法;启发程序[ ] hone怀恋;磨[ ] hyperplane超平面L[ ] initial最初的[ ] implement执行[ ] intuitive凭直觉获知的[ ] incremental增加的[900 ] intercept截距[ ] intuitious直觉[ ] instantiation例子[ ] indicator指示物,指示器[ ] interative重复的,迭代的[ ] integral积分[ ] 
identical相等的;完全相同的[ ] indicate表示,指出[ ] invariance不变性,恒定性[ ] impose把…强加于[ ] intermediate中间的[ ] interpretation解释,翻译J[ ] joint distribution联合概率L[ ] lieu替代[ ] logarithmic对数的,用对数表示的[ ] latent潜在的[ ] Leave-one-out cross validation留一法交叉验证M[ ] magnitude巨大[ ] mapping绘图,制图;映射[ ] matrix矩阵[ ] mutual相互的,共同的[ ] monotonically单调的[ ] minor较小的,次要的[ ] multinomial多项的[ ] multi-class classification二分类问题N[ ] nasty讨厌的[ ] notation标志,注释[ ] naïve朴素的O[ ] obtain得到[ ] oscillate摆动[ ] optimization problem最优化问题[ ] objective function目标函数[ ] optimal最理想的[ ] orthogonal(矢量,矩阵等)正交的[ ] orientation方向[ ] ordinary普通的[ ] occasionally偶然的P[ ] partial derivative偏导数[ ] property性质[ ] proportional成比例的[ ] primal原始的,最初的[ ] permit允许[ ] pseudocode伪代码[ ] permissible可允许的[ ] polynomial多项式[ ] preliminary预备[ ] precision精度[ ] perturbation 不安,扰乱[ ] poist假定,设想[ ] positive semi-definite半正定的[ ] parentheses圆括号[ ] posterior probability后验概率[ ] plementarity补充[ ] pictorially图像的[ ] parameterize确定…的参数[ ] poisson distribution柏松分布[ ] pertinent相关的Q[ ] quadratic二次的[ ] quantity量,数量;分量[ ] query疑问的R[ ] regularization使系统化;调整[ ] reoptimize重新优化[ ] restrict限制;限定;约束[ ] reminiscent回忆往事的;提醒的;使人联想…的(of)[ ] remark注意[ ] random variable随机变量[ ] respect考虑[ ] respectively各自的;分别的[ ] redundant过多的;冗余的S[ ] susceptible敏感的[ ] stochastic可能的;随机的[ ] symmetric对称的[ ] sophisticated复杂的[ ] spurious假的;伪造的[ ] subtract减去;减法器[ ] simultaneously同时发生地;同步地[ ] suffice满足[ ] scarce稀有的,难得的[ ] split分解,分离[ ] subset子集[ ] statistic统计量[ ] successive iteratious连续的迭代[ ] scale标度[ ] sort of有几分的[ ] squares平方T[ ] trajectory轨迹[ ] temporarily暂时的[ ] terminology专用名词[ ] tolerance容忍;公差[ ] thumb翻阅[ ] threshold阈,临界[ ] theorem定理[ ] tangent正弦U[ ] unit-length vector单位向量V[ ] valid有效的,正确的[ ] variance方差[ ] variable变量;变元[ ] vocabulary词汇[ ] valued经估价的;宝贵的[ ] W [1038 ] wrapper包装。


Generalized State Classes of Time Petri Nets for Timeliness Analysis

DIANXIANG XU+*, JIACUN WANG++ AND RICHARD A. VOLZ+

+Department of Computer Science
Texas A&M University
College Station, TX 77840, USA

++Nortel Networks
2201 Lakeside Blvd, MS-99203G50
Richardson, TX 75082, USA

Submitted as a short paper to IEEE Transactions on Robotics and Automation, Special issue on Analysis and Control of Automated Manufacturing Systems via Timed Petri-Net Models.

ABSTRACT

Time Petri Nets (TPN’s), as an expressive formalism for modeling distributed real-time systems, are usually analyzed through classical state classes or clock-stamped state classes. There are two issues, however, that have not been effectively or efficiently addressed: 1) how to facilitate the timeliness analysis, particularly the end-to-end time delay of task execution, and 2) how to compare state classes for equality in order to reduce state space. In the state class approach, dynamic firing domains consisting of relative firing intervals of individual transitions and timing inequalities for pairs of transitions have to be transformed into a canonical form, which has polynomial complexity. This makes the generation of state class graphs more complicated and the computation of end-to-end delay of task execution inconvenient. In the clock-stamped state class approach, the conditions for determining the fireability of transitions in general TPN’s are loosened indirectly because of the use of global firing domains, which may lead to a bigger reachable set than that obtained by using the state class approach, and only state class trees, rather than graphs, can be generated. To solve these problems, this paper uses dynamic, relative firing intervals for the determination of transition fireability and global firing domains for the timeliness analysis. For general reachability analysis, a state class graph is generated only in terms of relative firing intervals.
Whenever a goal is reached through a certain schedule, global firing domains are applied to analyze the end-to-end delay of the schedule. A simple yet typical manufacturing system is used to demonstrate our approach.

Keywords: Time Petri nets, state class, reachability analysis, timeliness analysis, real-time systems.

* Corresponding author: Dianxiang Xu, xudian@.

I. INTRODUCTION

Petri nets [1, 2] have been used to model a variety of discrete event systems [3, 4] due to their power in modeling asynchronous events, parallelism, contention, and synchronization. There has been considerable interest in extending Petri nets with time features for time-dependent systems [5]. Introducing time into transitions, places or arcs increases the modeling power as well as the complexity of net analysis. Several extended models of Petri nets have been proposed to deal with the timing issues [6], such as timed Petri nets [7, 8], stochastic timed Petri nets [9], and time Petri nets (TPN’s) [10]. Among these models, TPN’s are most widely used for real-time system specification and verification [5, 11-13]. In TPN’s, event synchronization is represented in terms of a set of pre- and post-conditions associated with each individual action of the modeled system, and timing constraints are expressed in terms of the minimum and maximum amount of time elapsing between the enabling and the execution of each action. This allows for explicit modeling of time-dependent concurrency and parallelism. It has been shown [5] that TPN’s can express most temporal constraints, whereas some of these constraints are difficult to represent only in terms of durations, e.g. in Ramchandani’s timed Petri nets [8].

The most fundamental and useful method for analyzing TPN’s, as for many other formal models, is reachability analysis [5].
It permits the automatic translation of behavioral models into a state transition graph made up of a set of states, a set of actions, and a succession relation associating states through actions [11]. The state transition graph makes explicit such properties as deadlock and reachability [14], and allows for the automatic verification of ordering relationships among task execution times [12]. Two major approaches for the reachability analysis of TPN models are classical state classes [5] and clock-stamped state classes [13]. As will be discussed in detail, however, there are two issues that these approaches have not effectively or efficiently addressed: 1) how to facilitate the timeliness analysis, particularly the end-to-end time delay of task execution, and 2) how to compare state classes for equality in order to reduce state space when generating a state class graph. These issues are important for practical analysis of distributed real-time system models.

In the classical state class approach [5], timing inequalities representing the relationships among the firing times of transitions are introduced in dynamic firing domains, because the relative, dynamic firing intervals of individual transitions alone are inaccurate for timeliness analysis. Accordingly, the timing inequalities for transition pairs, although they play no role in determining transition fireability under a given state class, are also considered for the comparison of state classes (i.e. whether or not two state classes are equal depends on their markings, their dynamic firing intervals, and the inequalities). To do so, dynamic firing domains have to be transformed into a canonical form, which has polynomial complexity [5, 11]. In general, the inequalities make the generation of state class graphs more complicated. As a result, this approach is inefficient in deriving the end-to-end delay of a task schedule.
In contrast, the clock-stamped state class approach [13] uses the global time mode for state classes, and the end-to-end delay can be easily obtained. Since two schedules reaching the same marking generally have different global time delays, this approach does not compare clock-stamped state classes for equality and thus only generates state class trees, rather than state class graphs. As a result, the reachability analysis is less efficient due to the increased complexity of the state space. Another critical problem with the clock-stamped state class approach is that it loosens the conditions for determining the fireability of transitions because of the use of global firing domains. As a consequence, it makes some unreachable markings appear reachable in some cases. Therefore, this approach is not quite suitable for the timeliness analysis either.

To solve the above problems, this paper presents an effective approach, using relative firing intervals for the determination of transition fireability and global firing domains for the timeliness analysis. For reachability analysis of a general TPN model, we generate the state class graph only in terms of dynamic firing intervals of individual transitions, without considering the timing inequalities on transition pairs. Whenever a goal is reached through a certain schedule in the state class graph, global firing domains are applied to analyze the end-to-end delay of the schedule. A simple yet typical manufacturing system will be used to demonstrate our approach.

The rest of this paper is organized as follows. To make the paper self-contained, section 2 gives a brief introduction to TPN’s, state classes, and clock-stamped state classes, and describes the problems with the two approaches. In section 3, generalized state classes are introduced for the reachability and timeliness analysis of TPN’s. Section 4 presents the modeling and analysis of a simple manufacturing system. Section 5 concludes this paper.

II. TIME PETRI NETS, STATE CLASSES AND CLOCK-STAMPED STATE CLASSES

In this section, we first define the TPN’s that will be used, then briefly review state classes and clock-stamped state classes and analyze the issues that they have not efficiently or effectively addressed.

A. Time Petri Nets

A TPN is a tuple $(P, T, B, F, M_0, SI)$ where:
(1) $P = \{p_1, p_2, \ldots, p_m\}$ is a finite nonempty set of places.
(2) $T = \{t_1, t_2, \ldots, t_n\}$ is a finite nonempty set of transitions.
(3) $B: P \times T \to N$ is the backward incidence function.
(4) $F: T \times P \to N$ is the forward incidence function.
(5) $M_0$ is the initial marking. ($P$, $T$, $B$, $F$ and $M_0$ together define a Petri net.)
(6) $SI$ is a mapping of static intervals. $\forall t \in T$, $SI(t) = [SEFT(t), SLFT(t)]$, where $SEFT(t)$ and $SLFT(t)$ are non-negative rational numbers, $SEFT(t)$ is the static earliest firing time, and $SLFT(t)$ is the static latest firing time.

A state of a TPN is a pair $S = (M, I)$ consisting of a marking $M$ and a firing interval set $I$. $I$ is a vector of firing times. The number of entries in this vector is given by the number of transitions enabled by $M$. A state is reached from the initial state by a given sequence of firing times corresponding to a firing sequence. Since any reachable marking may be reached from the initial marking by different sequences of firing times corresponding to the same firing sequence, the state space is generally infinite.

B. State Classes

A state class of a TPN, as an aggregated pseudo-state associated with a firing sequence, represents all states reachable from the initial state by firing all feasible firing values corresponding to the same firing sequence [5]. Let $Enabled(M)$ be the set of transitions enabled under marking $M$, and $\tau(t)$ be the firing time of $t$ relative to the time point when $M$ is reached. A state class is a pair $C = (M, D)$, in which $M$ is the marking of all states in the class, and $D$ is the firing domain defined as a set of linear inequalities.
For clarity, we classify the linear inequalities into two categories: the dynamic firing intervals of individual enabled transitions (i.e. $a_i \le \tau(t_i) \le b_i$ for any $t_i \in Enabled(M)$) and the inequalities of timing constraints on pairs of enabled transitions (i.e. $a_{ij} \le \tau(t_i) - \tau(t_j) \le b_{ij}$ for any $t_i, t_j \in Enabled(M)$ with $t_i \ne t_j$). Note that whether or not a transition enabled by $M$ is firable under state class $(M, D)$ depends only on the dynamic firing intervals. The inequalities are in essence used for the timeliness analysis and for the comparison of state classes when the state class graph is generated. In order to make use of dynamic firing domains, they need to be reduced to a canonical form. It has been proven that one such canonical representation exists uniquely, and an algorithm for this purpose has polynomial complexity [5, 11]. Reference [11] shows that timeliness analysis can be conducted by repetitive exploration of the firing intervals and inequalities in the state classes. In comparison, we will show that it can be achieved in a more convenient and effective way by using global firing intervals. The generation of the state class graph is then made more efficient because there is no need to consider the inequalities in state classes.

Without considering the inequalities, we denote a dynamic firing domain $D$ as $\{D(t) = [EFT(t), LFT(t)] : t \in Enabled(M)\}$, where $D(t)$ is the dynamic firing interval of $t$, and $EFT(t)$ and $LFT(t)$ are the dynamic earliest firing time and latest firing time of $t$, respectively. An enabled transition $t$ is firable under state class $C$ if $EFT(t) \le \min\{LFT(t_i) : t_i \in Enabled(M)\}$. Let $Firable(C)$ be the set of firable transitions under state class $C$, and let $MLFT(C) = \min\{LFT(t_i) : t_i \in Firable(C)\}$ be the minimum of the latest firing times of all firable transitions in $Firable(C)$.
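As a minimal illustration of the fireability test and of MLFT(C) as defined above, the following Python sketch represents a firing domain as a dictionary from transition names to (EFT, LFT) pairs. The function names, the transition names and the sample intervals are illustrative assumptions and do not come from the paper.

```python
def firable(domain):
    """Return the transitions t with EFT(t) <= min LFT over all enabled transitions.

    `domain` maps each enabled transition to its dynamic interval (EFT, LFT).
    """
    min_lft = min(lft for _, lft in domain.values())
    return {t for t, (eft, _) in domain.items() if eft <= min_lft}


def mlft(domain):
    """Minimum latest firing time over the firable transitions, MLFT(C)."""
    return min(domain[t][1] for t in firable(domain))


# Hypothetical firing domain with three enabled transitions.
example_domain = {"ta": (3, 5), "tb": (2, 4), "tc": (6, 9)}
firable_set = firable(example_domain)   # tc is excluded because 6 > 4
bound = mlft(example_domain)
```

In this assumed domain, tb must fire no later than time 4, so tc, whose earliest firing time is 6, cannot fire from this class.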
The construction of the state class graph of a TPN can be briefly outlined as follows. The initial state class is $C_0 = (M_0, D_0)$, where $M_0$ is the initial marking and $D_0 = \{[SEFT(t), SLFT(t)] : t \in Enabled(M_0)\}$ (at the initial state, the dynamic firing interval of any enabled transition is equal to its static firing interval). Given a state class $C_k = (M_k, D_k)$, a successor state class $C_{k+1} = (M_{k+1}, D_{k+1})$ can be created for some $t \in Firable(C_k)$ by the following steps:
• The actual firing interval for $t$ under class $C_k$ is $[EFT_k(t), MLFT(C_k)]$. $M_{k+1}$ is obtained according to the firing rule of classical Petri nets.
• For each inherited transition $t_j$ ($t_j \ne t$) that is enabled under both $M_k$ and $M_{k+1}$, $EFT_{k+1}(t_j) = \max(0, EFT_k(t_j) - MLFT(C_k))$ and $LFT_{k+1}(t_j) = LFT_k(t_j) - EFT_k(t)$.
• For each newly enabled transition $t_j$ under $M_{k+1}$ (not enabled under $M_k$), $EFT_{k+1}(t_j) = SEFT(t_j)$ and $LFT_{k+1}(t_j) = SLFT(t_j)$.

Note that dynamic firing intervals of individual transitions alone are inaccurate for the timeliness analysis. For example, suppose the initial marking for the TPN in Fig. 1 is $(1, 0, 1, 0, 0)^T$. According to the semantics of TPN’s, the correct time span for reaching marking $(0, 1, 0, 1, 0)^T$ by firing $t_2$ and $t_1$ is [3, 5]. Using the state class method, the relative firing interval for $t_1$ after $t_2$ fires at some time during [2, 4] is $[\max\{0, 3-4\}, 5-2]$, i.e. [0, 3]. Obviously, the intervals [2, 4] and [0, 3] have no clear relationship with the correct time span [3, 5].

C. Clock-Stamped State Classes

A clock-stamped state class [13] of a TPN consists of three parts: 1) a marking $M$, 2) a "global" firing domain corresponding to the firing intervals (relative to the beginning of execution) of all firable transitions in the state class, and 3) a clock stamp that corresponds to the moment when the state class is reached, with the clock value relative to the beginning of execution.

Fig. 1. A TPN example

The clock-stamped state class method has two critical issues.
First, the fireability of transitions under a clock-stamped state class cannot be correctly determined in terms of the global firing intervals. For the above example, when marking $(0, 1, 0, 1, 0)^T$ is reached, the time stamp for the state class is [3, 5], and the firing domains for the newly enabled transitions $t_3$ and $t_4$ are [1, 3] + [3, 5] = [4, 8] and [4, 5] + [3, 5] = [7, 10], respectively. According to the firing rules of clock-stamped state classes, both $t_3$ and $t_4$ are firable because both $EFT(t_3) = 4$ and $EFT(t_4) = 7$ are less than $\min\{LFT(t_3), LFT(t_4)\} = 8$. This is incorrect with respect to the semantics of TPN’s, because $t_4$ is never firable ($t_3$ must fire before $t_4$, which disables $t_4$). In essence, clock-stamped state classes are not suitable for the reachability analysis of general TPN’s. Second, since time stamps in a firing schedule increase monotonically, the clock-stamped state class method can only generate state class trees, rather than state class graphs. This makes the state space even more complex.

III. TIMELINESS AND REACHABILITY ANALYSIS

In this section, we generalize the concept of state classes by integrating classical state classes with the global time mode for the timeliness analysis of TPN’s, such that:
• relative firing domains are used to determine the fireability of transitions, and
• absolute firing domains are used to calculate the time span of a firing schedule.

For general reachability analysis, the state class graph of a TPN is generated only in terms of dynamic firing intervals. The timeliness analysis of a firing schedule is then conducted separately using global firing intervals.

A. Generalized State Classes

A generalized state class (GSC) is a 4-tuple $C = (M, D, AD, ST)$ where
(1) $M$ is a marking.
(2) $D$ is a relative firing domain, i.e., a set of constraints on the values of the time to fire for transitions enabled by the current marking $M$. For an enabled transition $t_i$, $D(t_i)$ represents its relative firing interval.
Let $EFT(t_i)$ be the left bound of $D(t_i)$ (the relative earliest firing time) and $LFT(t_i)$ be the right bound of $D(t_i)$ (the relative latest firing time).
(3) $AD$ is a global (absolute) firing domain, i.e., a set of constraints on the values of the time to fire for transitions enabled by the current marking $M$. For an enabled transition $t_i$, $AD(t_i)$ represents its absolute firing interval. Let $AEFT(t_i)$ be the left bound of $AD(t_i)$ (the absolute earliest firing time) and $ALFT(t_i)$ be the right bound of $AD(t_i)$ (the absolute latest firing time).
(4) $ST$ is the time stamp of the GSC class, which is a (global) time interval.

Before we define the transition firing rules, let us explain what we want the absolute firing domain $AD$ and the time stamp $ST$ to be. (1) For an enabled transition $t_i$, $AD(t_i)$ gives the global firing time interval of $t_i$, where by "global" we mean that the values are counted relative to the beginning of the net’s execution from the initial GSC class $C_0$, defined as $C_0 = (M_0, D_0, AD_0, ST_0)$, where $M_0$ is the initial marking, $D_0$ and $AD_0$ contain all static firing time intervals of the transitions enabled at $M_0$, and $ST_0 = [0, 0]$. (2) $ST$ indicates the global time delay interval in which the net runs from the initial GSC class $C_0$ to the current GSC class $C$. Theorem 1 in the next subsection will show this property.

Now we consider the firing rules that guide the timeliness analysis of a TPN. An enabled transition $t_j$ is said to be firable at GSC class $C_k$ if $EFT_k(t_j) \le \min\{LFT_k(t_i) : t_i \in En(C_k)\}$, where $En(C_k)$ is the set of all enabled transitions at $C_k$. Let $Fr(C_k)$ be the set of firable transitions at GSC class $C_k$, and

$MLFT(C_k) = \min\{LFT_k(t_i) : t_i \in Fr(C_k)\}$,
$MALFT(C_k) = \min\{ALFT_k(t_i) : t_i \in Fr(C_k)\}$,

where $MLFT(C_k)$ and $MALFT(C_k)$ are the minimum relative and absolute latest firing times, respectively, of all firable transitions in $Fr(C_k)$.
We divide the firable transitions in $Fr(C_k)$ into two groups: (1) inherited firable transitions that were firable before $C_k$ is reached, and (2) newly firable transitions that become firable at $C_k$. The firing of transition $t_f \in Fr(C_k)$ changes GSC class $C_k$ to $C_{k+1}$. Let $C_k = (M_k, D_k, AD_k, ST_k)$ and $C_{k+1} = (M_{k+1}, D_{k+1}, AD_{k+1}, ST_{k+1})$. We define the firing rules as follows:

Step 1. Calculate $\tilde{D}_k(t_f)$, the feasible relative firing interval for firing transition $t_f$. It is obtained by shifting the right bound of $D_k(t_f)$ to $MLFT(C_k)$ while keeping its left bound unchanged, that is,
$\tilde{D}_k(t_f) = [EFT_k(t_f), MLFT(C_k)]$.
Let
$ST_{k+1} = [AEFT_k(t_f), MALFT(C_k)]$.

Step 2. Calculate the firing intervals of inherited firable transitions in GSC class $C_{k+1}$.
2.1 Let $M'_{k+1} = M_k - B(t_f)$ and collect the (inherited) firable transitions at $M'_{k+1}$.
2.2 Let $D_{k+1} = D_k$, $AD_{k+1} = AD_k$ and delete from $D_{k+1}$ and $AD_{k+1}$ all entries whose corresponding transitions are disabled by $M'_{k+1}$.
2.3 For each inherited firable transition $t_j$ ($t_j \ne t_f$) at $M'_{k+1}$, let
$EFT_{k+1}(t_j) = \max\{0, EFT_k(t_j) - MLFT(C_k)\}$,
$LFT_{k+1}(t_j) = LFT_k(t_j) - EFT_k(t_f)$,
$AEFT_{k+1}(t_j) = \max\{AEFT_k(t_j), AEFT_k(t_f)\}$,
$ALFT_{k+1}(t_j) = ALFT_k(t_j)$.

Step 3. Calculate the firing intervals of newly enabled transitions after firing $t_f$.
3.1 Let $M_{k+1} = M'_{k+1} + F(t_f)$ and collect the newly enabled transitions: they are enabled at $M_{k+1}$ but not at the virtual marking $M'_{k+1}$.
3.2 Add to $D_{k+1}$ and $AD_{k+1}$ the entries for the newly enabled transitions at $M_{k+1}$: if $t_j$ ($t_j \ne t_f$) is a newly enabled transition at $M_{k+1}$, then
$D_{k+1}(t_j) = SI(t_j)$,
$AD_{k+1}(t_j) = SI(t_j) + ST_{k+1}$.
3.3 If $t_f$ is still enabled at $M_{k+1}$ after firing, let
$D_{k+1}(t_f) = SI(t_f)$,
$AD_{k+1}(t_f) = SI(t_f) + ST_{k+1}$.

Note that by Step 3.3, a transition that is still enabled after its firing is always treated as a newly enabled one.
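Steps 1 to 3 above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Net` and `GSC` containers, the function names, and the use of dictionaries for markings and domains are hypothetical choices, interval addition is taken bound by bound, and the sample net is a guess at a net compatible with the example discussed earlier (the arcs of t3 and t4 in particular are assumed, since the figure is not reproduced here).

```python
from typing import Dict, NamedTuple, Tuple

Interval = Tuple[int, int]


class Net(NamedTuple):
    pre: Dict[str, Dict[str, int]]    # B: input places (with weights) per transition
    post: Dict[str, Dict[str, int]]   # F: output places (with weights) per transition
    si: Dict[str, Interval]           # SI: static interval (SEFT, SLFT)


class GSC(NamedTuple):
    m: Dict[str, int]                 # marking M
    d: Dict[str, Interval]            # relative firing domain D
    ad: Dict[str, Interval]           # absolute firing domain AD
    st: Interval                      # time stamp ST


def enabled(net, m):
    """Transitions whose input places hold enough tokens under marking m."""
    return {t for t, need in net.pre.items()
            if all(m.get(p, 0) >= w for p, w in need.items())}


def firable(c):
    """Fireability is decided on the relative domain D only."""
    min_lft = min(lft for _, lft in c.d.values())
    return {t for t, (eft, _) in c.d.items() if eft <= min_lft}


def fire(net, c, tf):
    """Successor GSC obtained by firing tf, following Steps 1 to 3."""
    fr = firable(c)
    assert tf in fr, "tf must be firable"
    mlft = min(c.d[t][1] for t in fr)
    malft = min(c.ad[t][1] for t in fr)
    eft_f, aeft_f = c.d[tf][0], c.ad[tf][0]
    # Step 1: time stamp of the successor class.
    st = (aeft_f, malft)
    # Step 2: remove input tokens, keep and shift inherited transitions.
    mid = dict(c.m)
    for p, w in net.pre[tf].items():
        mid[p] = mid.get(p, 0) - w
    inherited = (enabled(net, mid) & set(c.d)) - {tf}
    d = {t: (max(0, c.d[t][0] - mlft), c.d[t][1] - eft_f) for t in inherited}
    ad = {t: (max(c.ad[t][0], aeft_f), c.ad[t][1]) for t in inherited}
    # Step 3: add output tokens; newly enabled transitions (tf included,
    # if still enabled) get their static interval, shifted by ST for AD.
    m = dict(mid)
    for p, w in net.post[tf].items():
        m[p] = m.get(p, 0) + w
    for t in enabled(net, m) - inherited:
        seft, slft = net.si[t]
        d[t] = (seft, slft)
        ad[t] = (seft + st[0], slft + st[1])
    return GSC(m, d, ad, st)


# Assumed net: t1 and t2 run in parallel; t3 and t4 compete for p2 and p4.
net = Net(
    pre={"t1": {"p1": 1}, "t2": {"p3": 1},
         "t3": {"p2": 1, "p4": 1}, "t4": {"p2": 1, "p4": 1}},
    post={"t1": {"p2": 1}, "t2": {"p4": 1},
          "t3": {"p5": 1}, "t4": {"p5": 1}},
    si={"t1": (3, 5), "t2": (2, 4), "t3": (1, 3), "t4": (4, 5)},
)
m0 = {"p1": 1, "p3": 1}
d0 = {t: net.si[t] for t in enabled(net, m0)}
c0 = GSC(m0, d0, dict(d0), (0, 0))
c1 = fire(net, c0, "t2")
c2 = fire(net, c1, "t1")
```

Under these assumptions, `c2.st` is (3, 5) and `c2.ad` assigns (4, 8) to t3 and (7, 10) to t4, in line with the intervals quoted for the example, while `firable(c2)` contains only t3 because fireability is judged on the relative domain.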
This convention simplifies the treatment of states in which a transition has enough tokens in its input places to permit multiple firings. The treatment of this situation, usually referred to as multiple enabledness [5], requires that multiple firing intervals be associated with a single transition and involves a number of semantic subtleties that are not relevant to the objective of this paper.

B. Timeliness Analysis

We have described the transition firing rules that guide the evolution of GSC classes. Now we show why GSC classes are suitable for timeliness analysis. Theorem 1 states exactly what the absolute firing domain $AD$ and the time stamp $ST$ of a GSC class stand for. Corollary 1 indicates what can be gained from the generation of the reachability graph. To facilitate the description, we denote the firing schedule $t_0 t_1 \ldots t_{n-1}$ that transforms $C_0$ into $C_n$ by $(C_0, t_0)(C_1, t_1)\ldots(C_i, t_i)\ldots(C_{n-1}, t_{n-1})C_n$.

Theorem 1. Let $C_i = (M_i, D_i, AD_i, ST_i)$ be a GSC class reachable from $C_0 = (M_0, D_0, AD_0, ST_0)$ through firing schedule $t_0 t_1 \ldots t_{i-1}$. Then
(1) $ST_i$ is the global time (interval) at which GSC class $C_i$ is reached;
(2) if $t_j \in Fr(C_i)$, then $AD_i(t_j)$ is the global firing time interval of $t_j$.

Proof: (1) By the precondition, there is a firing schedule starting at $C_0$ and ending at $C_i$, that is, $(C_0, t_0)(C_1, t_1)\ldots(C_{i-1}, t_{i-1})C_i$. The proof proceeds by induction on $i$. For the base case ($i = 0$), $C_0$ is the initial class. Clearly, 1) for every $t_j \in Fr(C_0)$, $AD_0(t_j)$ is exactly the static firing time interval of $t_j$, which is also its global firing time interval; and 2) $ST_0 = [0, 0]$, which is the arrival time of $C_0$. Therefore, the theorem holds for $i = 0$.

Now assume that the assertion holds for $i \le k$. Consider $i = k + 1$.
By Step 1 of the firing rules,
$$ST_{k+1} = [AEFT_k(t_k),\ MALFT(C_k)],$$
where $AEFT_k(t_k)$ is the absolute earliest firing time of $t_k$, and $MALFT(C_k)$ is the minimum absolute latest firing time of all firable transitions in $Fr(C_k)$, hence the actual latest global firing time of $t_k$. Therefore, $ST_{k+1}$ is the global firing time interval of $t_k$. Because firing a transition takes no time, $ST_{k+1}$ is also the global arrival time of state class $C_{k+1}$, which is reached by firing $t_k$. So (1) holds.

(2) Suppose that a transition $t_j$ is firable at $C_{k+1}$. There are three cases.

Case 1. $t_j$ is a newly enabled transition at $M_{k+1}$ and $t_j \ne t_k$. Then
$$AD_{k+1}(t_j) = SI(t_j) + ST_{k+1} = [SEFT(t_j) + AEFT_k(t_k),\ SLFT(t_j) + MALFT(C_k)],$$
where $AEFT_k(t_k)$ is the earliest (global) arrival time of state class $C_{k+1}$ and $SEFT(t_j)$ is the static (relative) earliest firing time of $t_j$ once it is enabled at $C_{k+1}$, so $SEFT(t_j) + AEFT_k(t_k)$ is the earliest global firing time of $t_j$. Likewise, $MALFT(C_k)$ is the latest (global) arrival time of state class $C_{k+1}$ and $SLFT(t_j)$ is the static (relative) latest firing time of $t_j$ once it is enabled at $C_{k+1}$, so $SLFT(t_j) + MALFT(C_k)$ is the latest global firing time of $t_j$. Therefore, $AD_{k+1}(t_j)$ is the global firing time interval of $t_j$.

Case 2. $t_j = t_k$. Because we ignore multiple enabledness, $t_k$ is viewed as a newly enabled transition at $M_{k+1}$. Thus the conclusion drawn in Case 1 also applies to this case.

Case 3. $t_j$ is an inherited transition. In this case, it follows from Step 2 that
$$AD_{k+1}(t_j) = [\max\{AEFT_k(t_j),\ AEFT_k(t_k)\},\ ALFT_k(t_j)].$$
By the induction hypothesis for $i \le k$, $[AEFT_k(t_j),\ ALFT_k(t_j)]$ is the global firing time interval of transition $t_j$ at state class $C_k$. The latest global firing time of $t_j$ at $C_{k+1}$ is the same as at $C_k$; however, the earliest global firing time of $t_j$ at $C_{k+1}$ must take the larger of $AEFT_k(t_j)$ and $AEFT_k(t_k)$, because $t_j$ can only fire after $C_{k+1}$ is reached.
So $AD_{k+1}(t_j)$ is the global firing time interval of transition $t_j$ at state class $C_{k+1}$. Thus the theorem holds. □

According to Theorem 1, $ST_i$ gives the exact global time interval in which GSC class $C_i$ is visited.

Corollary 1. Let $C_i = (M_i, D_i, AD_i, ST_i)$ and $C_j = (M_j, D_j, AD_j, ST_j)$ be two reachable GSC classes of a TPN, where $C_j$ is reachable from $C_i$. If every transition in $Fr(C_i)$ is newly enabled, then the time span in which the TPN runs from $C_i$ to $C_j$ is $ST_j - ST_i$.

Proof: Because all firable transitions at $C_i$ are newly enabled, the future behavior of the TPN after $C_i$ is reached is independent of the history before $C_i$ is reached. Suppose that, if the TPN starts running from class $C_i$ at time 0, it reaches class $C_j$ in the global time interval $[x, y]$. Then, if the TPN starts running from $C_i$ at time $a$, it reaches $C_j$ in the global time interval $[x+a,\ y+a]$. Furthermore, if the TPN starts running from $C_i$ in the time interval $ST_i = [a, b]$, it reaches $C_j$ in the global time interval $ST_j = [\min\{x+a, x+b\},\ \max\{y+a, y+b\}] = [x+a,\ y+b]$. Because the time span in which the TPN runs from $C_i$ to $C_j$ is independent of the starting time, it follows from Theorem 1 that the time span is $[x, y]$, that is, $ST_j - ST_i$ with corresponding endpoints subtracted. □

Corollary 1 is very useful for timeliness analysis. As mentioned in [12], the key issue of timeliness analysis is to verify whether a marking can be reached under given timing constraints. Corollary 1 shows that the concept of GSC classes helps establish a quantitative timing relationship between any two reachable classes in a firing schedule.

C. Reachability Analysis through State Class Graphs

The major purpose of generalized state classes is the timeliness analysis of firing schedules. General reachability analysis is conducted in two separate steps. 1) Generate the state class graph without considering the time stamps and global firing domains.
Two state classes are viewed as identical if and only if they have the same firable transitions and the same corresponding relative firing intervals. This is similar to the traditional state class graph, except that no timing inequalities on transition pairs are maintained. Since there are no timing inequalities on transition pairs, the functional reachability analysis is generally more efficient.
2) If a goal marking is reached in the state class graph, analyze the time delay of the schedule that reaches the goal. An example illustrating this is given in the next section.

IV. AN EXAMPLE

To illustrate our approach, this section describes the modeling and analysis of a simple yet typical manufacturing system, which is composed of 5 machines and 1 assembler, as shown in Figure 2. The system receives two types of parts (A and B) as inputs. A-parts go to machine 1 for processing, whereas B-parts go to machine 2 for processing. After being processed by machine 2, B-parts go either to machine 3 or to machines 4 and 5 for processing. After a pair of an A-part and a B-part has been processed, the parts are sent for assembly, where a final product is produced. Assume that the processing of an A-part by machine 1 takes 2 to 9 time units, and that the processing of a B-part by machine 2, machine 3, machine 4, and machine 5 takes 1 to 3, 3 to 7, 1 to 2, and 2 to 5 time units, respectively. The assembler takes 2 to 4 time units.

Fig 2. A manufacturing system.

The TPN model of this system is shown in Fig. 3. Table 1 describes its places and transitions.

Fig 3. The TPN model for the example manufacturing system.

Table 1. Legends of the TPN in Fig. 3.

| Place | Description |
|---|---|
| p1 | A-part buffer |
| p2 | B-part buffer |
| p3 | A-part ready for assembly |
| p4 | B-part processed by machine 2 |
| p5 | B-part being processed by machine 5 |
| p6 | B-part ready for assembly |
| p7 | Buffer for final products |

| Transition | Description | Time interval |
|---|---|---|
| t1 | Machine 1 works on A-part | [2, 9] |
| t2 | Machine 2 works on B-part | [1, 3] |
| t3 | Machine 3 works on B-part | [3, 7] |
| t4 | Machine 4 works on B-part | [1, 2] |
| t5 | Machine 5 works on B-part | [2, 5] |
| t6 | Assembler works on a pair of A- and B-parts | [2, 4] |

Let us first analyze the timeliness of schedule $t_2 t_4 t_5 t_1 t_6$. The initial GSC class is $C_0 = (M_0, D_0, AD_0, ST_0)$, where
$$ST_0 = [0, 0],$$
$$M_0 = (1\ 1\ 0\ 0\ 0\ 0\ 0)^T,$$
$$D_0 = \{D_0(t_1): [2, 9],\ D_0(t_2): [1, 3]\},$$
$$AD_0 = \{AD_0(t_1): [2, 9],\ AD_0(t_2): [1, 3]\}.$$
Thus,
$$MLFT(C_0) = \min\{LFT_0(t_1),\ LFT_0(t_2)\} = \min\{9, 3\} = 3,$$
$$MALFT(C_0) = \min\{ALFT_0(t_1),\ ALFT_0(t_2)\} = \min\{9, 3\} = 3.$$

Both $t_1$ and $t_2$ are firable. Firing $t_2$ reaches GSC class $C_1 = (M_1, D_1, AD_1, ST_1)$. Following Step 1 in Section III-A, the dynamic relative firing interval for $t_2$ is $[1, 3]$, and $ST_1 = [AEFT_0(t_2),\ MALFT(C_0)] = [1, 3]$. Following Step 2, we have:
$$M'_1 = M_0 - B(t_2) = (1\ 0\ 0\ 0\ 0\ 0\ 0)^T,$$
$$D_1(t_1) = [0, 8],$$
$$AD_1(t_1) = [2, 9].$$
By Step 3, $M_1 = M'_1 + F(t_2) = (1\ 0\ 0\ 1\ 0\ 0\ 0)^T$, under which there are two newly enabled transitions, $t_3$ and $t_4$:
$$D_1(t_3) = [3, 7],$$
$$D_1(t_4) = [1, 2],$$
$$AD_1(t_3) = [3, 7] + ST_1 = [4, 10],$$
$$AD_1(t_4) = [1, 2] + ST_1 = [2, 5].$$

Under GSC class $C_1$, $t_1$ and $t_4$ are firable. But $t_3$ is not firable, because $EFT_1(t_3) = 3 > \min\{LFT_1(t_1),\ LFT_1(t_3),\ LFT_1(t_4)\} = 2$. Note that $AEFT_1(t_3) = 4 < \min\{ALFT_1(t_1),\ ALFT_1(t_3),\ ALFT_1(t_4)\} = 5$. Hence the global firing intervals cannot be used to determine the firability of $t_3$ (in the clock-stamped state class method, $t_3$ would be firable). If $t_4$ fires, its relative firing interval is $[1, 2]$. Similarly, we obtain a new GSC class, say $C_2$, where:
$$M_2 = (1\ 0\ 0\ 0\ 1\ 0\ 0)^T,$$
$$ST_2 = [2, 5],$$
$$D_2(t_1) = [0, 7],$$
$$AD_2(t_1) = [2, 9],$$
$$D_2(t_5) = [2, 5],$$
$$AD_2(t_5) = [2, 5] + ST_2 = [4, 10].$$

Under $C_2$, both $t_1$ and $t_5$ are firable. The relative firing interval of $t_5$ is $[2, 5]$.
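The derivation of $C_1$ and $C_2$ can be replayed mechanically. The following Python sketch applies Steps 1 to 3 of Section III-A to the example net. It rests on several assumptions: the arc sets `PRE` and `POST` are inferred from Table 1 and from the markings given in the text, since Fig. 3 is not reproduced here; the names `fire`, `enabled`, `PRE`, `POST`, and `SI` are hypothetical; and the Step 2.3 update is applied to every transition that stays enabled at the virtual marking.

```python
from typing import Dict, Set, Tuple

Interval = Tuple[int, int]
GSC = Tuple[Dict[str, int], Dict[str, Interval], Dict[str, Interval], Interval]

# Static intervals SI(t) from Table 1.
SI = {"t1": (2, 9), "t2": (1, 3), "t3": (3, 7),
      "t4": (1, 2), "t5": (2, 5), "t6": (2, 4)}

# ASSUMPTION: input/output places inferred from Table 1 and the markings.
PRE = {"t1": {"p1"}, "t2": {"p2"}, "t3": {"p4"},
       "t4": {"p4"}, "t5": {"p5"}, "t6": {"p3", "p6"}}
POST = {"t1": {"p3"}, "t2": {"p4"}, "t3": {"p6"},
        "t4": {"p5"}, "t5": {"p6"}, "t6": {"p7"}}


def enabled(marking: Dict[str, int]) -> Set[str]:
    """Transitions whose input places all hold at least one token."""
    return {t for t in SI if all(marking[p] >= 1 for p in PRE[t])}


def fire(cls: GSC, tf: str) -> GSC:
    """Fire tf at GSC class (M, D, AD, ST) and return the successor class."""
    M, D, AD, _ = cls
    min_lft = min(lft for _, lft in D.values())
    fr = [t for t, (eft, _) in D.items() if eft <= min_lft]
    if tf not in fr:
        raise ValueError(f"{tf} is not firable")
    mlft = min(D[t][1] for t in fr)
    malft = min(AD[t][1] for t in fr)

    # Step 1: time stamp of the successor class.
    st_next = (AD[tf][0], malft)

    # Step 2: virtual marking and inherited transitions.
    virtual = dict(M)
    for p in PRE[tf]:
        virtual[p] -= 1
    inherited = (enabled(virtual) & set(D)) - {tf}
    D_next = {t: (max(0, D[t][0] - mlft), D[t][1] - D[tf][0]) for t in inherited}
    AD_next = {t: (max(AD[t][0], AD[tf][0]), AD[t][1]) for t in inherited}

    # Step 3: new marking and newly enabled transitions (tf included if re-enabled).
    M_next = dict(virtual)
    for p in POST[tf]:
        M_next[p] += 1
    for t in enabled(M_next) - inherited:
        D_next[t] = SI[t]
        AD_next[t] = (SI[t][0] + st_next[0], SI[t][1] + st_next[1])
    return M_next, D_next, AD_next, st_next


M0 = {"p1": 1, "p2": 1, "p3": 0, "p4": 0, "p5": 0, "p6": 0, "p7": 0}
C0: GSC = (M0, {"t1": (2, 9), "t2": (1, 3)}, {"t1": (2, 9), "t2": (1, 3)}, (0, 0))
C1 = fire(C0, "t2")
C2 = fire(C1, "t4")
print(C1[3], C2[3])  # (1, 3) (2, 5)
```

Under these assumptions, `C1` and `C2` carry the same domains and time stamps as the classes derived by hand above, for instance `C2[1] == {"t1": (0, 7), "t5": (2, 5)}`.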
Firing $t_5$ reaches GSC class $C_3$ such that:
$$M_3 = (1\ 0\ 0\ 0\ 0\ 1\ 0)^T,$$
$$ST_3 = [AEFT_2(t_5),\ \min\{ALFT_2(t_1),\ ALFT_2(t_5)\}] = [4, 9],$$
$$D_3(t_1) = [0, 5],$$
$$AD_3(t_1) = [\max\{AEFT_2(t_1),\ AEFT_2(t_5)\},\ ALFT_2(t_1)] = [4, 9].$$

Under $C_3$, only $t_1$ is enabled and firable. Firing $t_1$ leads to GSC class $C_4$, where:
