Collocations

15. COLLOCATIONS

Kathleen R. McKeown and Dragomir R. Radev
Department of Computer Science, Columbia University, New York
{kathy,radev}@

This chapter describes a class of word groups that lies between idioms and free word combinations. Idiomatic expressions are those in which the semantics of the whole cannot be deduced from the meanings of the individual constituents. Free word combinations have the properties that each of the words can be replaced by another without seriously modifying the overall meaning of the composite unit and if one of the words is omitted, a reader cannot easily infer it from the remaining ones. Unlike free word combinations, a collocation is a group of words that occur together more often than by chance. On the other hand, unlike idioms, individual words in a collocation can contribute to the overall semantics of the compound. We present some definitions and examples of collocations, as well as methods for their extraction and classification. The use of collocations for word sense disambiguation, text generation, and machine translation is also part of this chapter.

15.1 Introduction

While most computational lexicons represent properties of individual words, in general most people would agree that one knows a word from the "company that it keeps" [18]. Collocations are a lexical phenomenon that has linguistic and lexicographic status as well as utility for statistical natural language paradigms. Briefly put, they cover word pairs and phrases that are commonly used in language, but for which no general syntactic or semantic rules apply. Because of their widespread use, a speaker of the language cannot achieve fluency without incorporating them in speech. On the other hand, because they escape characterization, they have long been the object of linguistic and lexicographic study in an effort to both define them and include them in dictionaries of the language. It is precisely because of the fact that they are observable in language that they have been featured in many statistical approaches to natural language processing. Since they occur repeatedly in language, specific collocations can be acquired by identifying words that frequently occur together in a relatively large sample of language; thus, collocation acquisition falls within the general class of corpus-based approaches to language. By applying the same algorithm to different domain-specific corpora, collocations specific to a particular sublanguage can be identified and represented.

Once acquired, collocations are useful in a variety of different applications.
They can be used for disambiguation, including both word-sense and structural disambiguation. This task is based on the principle that a word in a particular sense tends to co-occur with a different set of words than when it is used in another sense. Thus bank might co-occur with river in one sense and savings and loan when used in its financial sense. A second important application is translation: because collocations cannot be characterized on the basis of syntactic and semantic regularities, they cannot be translated on a word-by-word basis. Instead, computational linguists use statistical techniques applied to aligned, parallel, bilingual corpora to identify collocation translations and semi-automatically construct a bilingual collocation lexicon. Such a lexicon can then be used as part of a machine translation program. Finally, collocations have been extensively used as part of language generation systems. Generation systems are able to achieve a level of fluency otherwise not possible, by using a lexicon of collocations and word phrases during the process of word selection.

In this chapter, we first overview the linguistic and lexicographic literature on collocations, providing a partial answer to the question "What is a collocation?". We then turn to algorithms that have been used for acquiring collocations, including word pairs that co-occur in flexible variations, compounds that may consist of two or more words that are more rigidly used in sequence, and multi-word phrases. After discussing both acquisition and representation of collocations, we discuss their use in the tasks of disambiguation, translation and language generation. We will limit our discussion to these topics; however, we would like to mention that some work has been done recently [3] in using collocational phrases in cross-lingual information retrieval.

15.2 Linguistic and lexicographic views of collocations

Collocations are not easily defined. In the linguistic and lexicographic literature, they are often discussed in contrast with free word combinations at one extreme and idiomatic expressions at the other, collocations occurring somewhere in the middle of this spectrum. A free word combination can be described using general rules; that is, in terms of semantic constraints on the words which appear in a certain syntactic relation with a given headword [12]. An idiom, on the other hand, is a rigid word combination to which no generalities apply; neither can its meaning be determined from the meaning of its parts nor can it participate in the usual word-order variations. Collocations fall between these extremes and it can be difficult to draw the line between categories. A word combination fails to be classified as free and is termed a collocation when the number of words which can occur in a syntactic relation with a given headword decreases to the point where it is not possible to describe the set using semantic regularities. Thus, examples of free word combinations include put + [object] or run + [object] (i.e. 'manage') where the words that can occur as object are virtually open-ended. In the case of put, the semantic constraint on the object is relatively open-ended (any physical object can be mentioned) and thus the range of words that can occur is relatively unrestricted. In the case of run (in the sense of 'manage' or 'direct') the semantic restrictions on the object are tighter but still follow a semantic generality: any institution or organization can be managed (e.g.
business, ice cream parlor, etc.). In contrast to these free word combinations, a phrase such as explode a myth is a collocation. In its figurative sense, explode illustrates a much more restricted collocational range. Possible objects are limited to words such as belief, idea, theory. At the other extreme, phrases such as foot the bill or fill the bill function as composites, where no words can be interchanged and variation in usage is not generally allowed. This distinction between free word combinations and collocations can be found with almost any pair of syntactic categories. Thus, excellent/good/useful/useless dictionary are examples of free word adjective+noun combinations, while abridged/bilingual/combinatorial dictionary are all collocations. More examples of the distinction between free word combinations and collocations are shown in Table 1.

Idioms                 Collocations             Free word combinations
to kick the bucket     to trade actively        to take the bus
dead end               table of contents        the end of the road
to catch up            orthogonal projection    to buy a house

Table 1: Collocations vs. idioms and free word combinations.

Because collocations fall somewhere along a continuum between free-word combinations and idioms, lexicographers have faced a problem in deciding when and how to illustrate collocations as part of a dictionary. Thus, major themes in lexicographic papers address the identification of criteria that can be used to determine when a phrase is a collocation, characteristics of collocations, and representation of collocations in dictionaries. Given the fact that collocations are lexical in nature, they have been studied primarily by lexicographers and by relatively fewer linguists, although early linguistic paradigms which place emphasis on the lexicon are exceptions (e.g. [22, 30]). In this section, we first describe properties of collocations that surface repeatedly across the literature. Next we present linguistic paradigms which cover collocations. We close the section with a presentation of the types of characteristics studied by lexicographers and proposals for how to represent collocations in different kinds of dictionaries.

15.2.1 Properties of collocations

Collocations are typically characterized as arbitrary, language- (and dialect-) specific, recurrent in context, and common in technical language (see overview by [37]). The notion of arbitrariness captures the fact that substituting a synonym for one of the words in a collocational word pair may result in an infelicitous lexical combination. Thus, for example, a phrase such as make an effort is acceptable, but make an exertion is not; similarly, a running commentary, commit treason, warm greetings are all true collocations, but a running discussion, commit treachery, and hot greetings are not acceptable lexical combinations [5]. This arbitrary nature of collocations persists across languages and dialects.
Thus, in French, the phrase régler la circulation is used to refer to a policeman who directs traffic, the English collocation. In Russian, German, and Serbo-Croatian, the direct translation of regulate is used; only in English is direct used in place of regulate. Similarly, American and British English exhibit arbitrary differences in similar phrases. Thus, in American English one says set the table and make a decision, while in British English, the corresponding phrases are lay the table and take a decision. In fact, in a series of experiments, Benson [5] presented non-native English speakers, and later a mix of American English and British English speakers, with a set of 25 sentences containing a variety of American and British collocations. He asked them to mark them as either American English, British English, World English, or unacceptable. The non-native speakers got only 22% of them correct, while the American and British speakers got only 24% correct.

While these properties indicate the difficulties in determining what is an acceptable collocation, on the positive side it is clear that collocations occur frequently in similar contexts [5, 12, 22]. Thus, while it may be difficult to define collocations, it is possible to observe collocations in samples of the language. Generally, collocations are those word pairs which occur frequently together in the same environment, but do not include lexical items which have a high overall frequency in language [22]. The latter include words such as go, know, etc., which can combine with just about any other word (i.e. are free word combinations) and thus, are used more frequently than other words. This property, as we shall see, has been exploited by researchers in natural language processing to identify collocations automatically. In addition, researchers take advantage of the fact that collocations are often domain specific; words which do not participate in a collocation in everyday language often do form part of a collocation in technical language. Thus, file collocates with verbs such as create, delete, save when discussing computers, but not in other sublanguages.

Stubbs [42] points out some other interesting properties of collocations. For example, he indicates that the word cause typically collocates with words expressing negative concepts, such as accident, damage, death, or concern. Conversely, provide occurs more often with positive words such as care, shelter, and food.

15.2.2 Halliday and Mel'čuk

Many lexicographers point back to early linguistic paradigms which, as part of their focus on the lexicon, do address the role of collocations in language [30, 22]. Collocations are discussed as one of five means for achieving lexical cohesion in Halliday and Hasan's work. Repeated use of collocations, among other devices such as repetition and reference, is one way to produce a more cohesive text.
Perhaps because they are among the earliest to discuss collocations, Halliday and Hasan present a more inclusive view of collocations and are less precise in their definition of collocations than others. For them, collocations include any set of words whose members participate in a semantic relation. Their point is that a marked cohesive effect in text occurs when two semantically related words occur in close proximity in a text, even though it may be difficult to systematically classify the semantic relations that can occur. They suggest examples of possible relations, such as complementarity (e.g., boy, girl), synonyms and near-synonyms, members of ordered sets (e.g., Monday, Tuesday; dollars, cents), part-whole (e.g., car, brake), as well as relations between words of different parts of speech (e.g., laugh, joke; blade, sharp; garden, dig). They point to the need for further analysis and interpretation of collocations in future work; for their purpose, they simply lump together as collocations all lexical relations that cannot be called referential identity or repetition.

In later years, Mel'čuk provided a more restricted view of collocations. In the meaning–text model, collocations are positioned within the framework of lexical functions (LFs). An LF is a semantico-syntactic relation which connects a word or phrase with a set of words or phrases. LFs formalize the fact that in language there are words, or phrases, whose usage is bound by another word in the language. There are roughly 50 different simple LFs in the meaning–text model, some of which capture semantic relations (e.g. the LF anti posits a relation between antonyms), some of which capture syntactic relations (e.g. A0 represents nouns and derived adjectivals such as sun–solar), while others capture the notion of restricted lexical co-occurrence. The LF magn is one example of this, representing the words which can be used to magnify the intensity of a given word. Thus, magn(need) has as its value the set of words {great, urgent, bad}, while magn(settled) has the value {thickly}, and magn(belief) the value {staunch}. Oper1 is another LF which represents the semantically empty verb which collocates with a given object. Thus, the Oper1 of analysis is {perform}.
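To make the lexical-function view concrete, the sketch below stores a few LF entries in a plain lookup table. The representation (a Python dictionary keyed by LF name and keyword) is our own illustration rather than Mel'čuk's formalism, and the anti entry is a hypothetical example, not one taken from the text.

```python
# Minimal sketch of a lexical-function table for lookup during analysis
# or generation; the entries reproduce the examples discussed above.
LEXICAL_FUNCTIONS = {
    ("magn", "need"): {"great", "urgent", "bad"},
    ("magn", "settled"): {"thickly"},
    ("magn", "belief"): {"staunch"},
    ("oper1", "analysis"): {"perform"},
    ("anti", "victory"): {"defeat"},   # hypothetical antonym entry for illustration
}

def lf(name, keyword):
    """Return the value set of a lexical function applied to a keyword."""
    return LEXICAL_FUNCTIONS.get((name.lower(), keyword.lower()), set())

print(lf("Magn", "need"))   # -> {'great', 'urgent', 'bad'}
```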
15.2.3 Types of collocations

In an effort to characterize collocations, lexicographers and linguists present a wide variety of individual collocations, attempting to categorize them as part of a general scheme [2, 5, 12]. By examining a wide variety of collocates of the same syntactic category, researchers identify similarities and differences in their behavior, in the process coming a step closer to providing a definition. Distinctions are made between grammatical collocations and semantic collocations. Grammatical collocations often contain prepositions, including paired syntactic categories such as verb+preposition (e.g. come to, put on), adjective+preposition (e.g. afraid that, fond of), and noun+preposition (e.g. by accident, witness to). In these cases, the open-class word is called the base and determines the words it can collocate with, the collocators. Often, computational linguists restrict the type of collocations they acquire or use to a subset of these different types (e.g. [11]). Semantic collocations are lexically restricted word pairs, where only a subset of the synonyms of the collocator can be used in the same lexical context. Examples in this category have already been presented.

Another distinction is made between compounds and flexible word pairs. Compounds include word pairs that occur consecutively in language and typically are immutable in function. Noun+noun pairs are one such example, which not only occur consecutively but also function as a constituent. Cowie [12] notes that compounds form a bridge between collocations and idioms, since, like collocations, they are quite invariable, but they are not necessarily semantically opaque. Since collocations are recursive (ibid.), collocational phrases, including more than just two words, can occur. For example, a collocation such as by chance in turn collocates with verbs such as find, discover, notice. Flexible word pairs include collocations between subject and verb, or verb and object; any number of intervening words may occur between the words of the collocation.

15.2.4 Collocations and dictionaries

A final, major recurring theme of lexicographers is where to place collocations in dictionaries. Placement of collocations is determined by which word functions as the base and which functions as the collocator. The base bears most of the meaning of the collocation and triggers the use of the collocator. This distinction is best illustrated by collocations which include "support" verbs: in the collocation take a bath, bath is the base and the support verb take, a semantically empty word in this context, the collocator. In dictionaries designed to help users encode language (e.g. generate text), lexicographers argue that the collocation should be located at the base [23]. Given that the base bears most of the meaning, it is generally easier for a writer to think of the base first. This is especially the case for learners of a language. When dictionaries are intended to help users decode language, then it is more appropriate to place the collocation at the entry for the collocator. The base–collocator pairs listed in Table 2 illustrate why this is the case.

Base         Collocator     Example
noun         verb           set the table
noun         adjective      warm greetings
verb         adverb         struggle desperately
adjective    adverb         sound asleep
verb         preposition    put on

Table 2: Base–collocator pairs

15.3 Extracting collocations from text corpora

Early work on collocation acquisition was carried out by Choueka et al. [8].
They used frequency as a measure to identify a particular type of collocation, a sequence of adjacent words. In their approach, they retrieved a sequence of words that occurs more frequently than a given threshold. While they were theoretically interested in sequences of any length, their implementation is restricted to sequences of two to six words. They tested their approach on an 11 million word corpus from the New York Times archives, yielding several thousand collocations. Some examples of retrieved collocations include home run, fried chicken, and Magic Johnson. This work was notably one of the first to use large corpora and predates many of the more mainstream corpus-based approaches in computational linguistics. Their metric, however, was less sophisticated than later approaches; because it was based on frequency alone, it is sensitive to corpus size.

Church, Hanks, and colleagues [11, 10] used a correlation-based metric to retrieve collocations; in their work, a collocation was defined as a pair of words that appear together more than would be expected by chance. To estimate correlation between word pairs, they used mutual information as defined in Information Theory [34, 17]. If two points (words) x and y have probabilities P(x) and P(y), then their mutual information I(x, y) is defined to be [9]:

    I(x, y) = log2 [ P(x, y) / (P(x) · P(y)) ]

In the formula, P(x, y) is the probability of seeing the two words x and y within a certain window. Whenever P(x, y) = P(x) · P(y), the value of I(x, y) becomes 0, which is an indication that the two words x and y are not members of a collocation pair. If I(x, y) < 0, then the two words are in complementary distribution [9]. Other metrics for computing strength of collocations are discussed in [32].

Church et al.'s approach was an improvement over that of Choueka et al. in that they were able to retrieve interrupted word pairs, such as subject+verb or verb+object collocations. However, unlike Choueka et al., they were restricted to retrieving collocations containing only two words. In addition, the retrieved collocations included words that are semantically related (e.g. doctor–nurse, doctor–dentist) in addition to true lexical collocations.
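As a minimal sketch of how such a mutual-information score can be computed from a corpus (this illustrates the formula above, not Church and Hanks's own implementation; the window size and frequency cutoff are assumptions):

```python
import math
from collections import Counter

def pmi_pairs(tokens, window=5, min_count=5):
    """Score word pairs by I(x, y) = log2(P(x, y) / (P(x) * P(y))),
    where P(x, y) is estimated from co-occurrence within `window` words."""
    unigrams = Counter(tokens)
    pairs = Counter()
    for i, x in enumerate(tokens):
        for y in tokens[i + 1 : i + 1 + window]:
            pairs[(x, y)] += 1
    n = len(tokens)
    scores = {}
    for (x, y), c in pairs.items():
        if c < min_count:
            continue                      # skip unreliable estimates for rare pairs
        p_xy = c / n
        p_x, p_y = unigrams[x] / n, unigrams[y] / n
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores

# Pairs with I(x, y) well above 0 are collocation candidates; values near 0
# suggest independence, and negative values suggest complementary distribution.
```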
Smadja and his colleagues [36, 37, 38] addressed acquisition of a wider variety of collocations than either of the two other approaches. This work featured the use of several filters based on linguistic properties, the use of several stages to retrieve word pairs along with compounds and phrases, and an evaluation of retrieved collocations by a lexicographer to estimate the number of true lexical collocations retrieved. Their system Xtract began by retrieving word pairs using a frequency-based metric. The metric computed the z-score of a pair, by first computing the average frequency of the words occurring within a ten-word radius of a given word and then determining the number of standard deviations above the average frequency for each word pair. Only word pairs with a z-score above a certain threshold were retained. In contrast to Choueka et al.'s metric, this metric ensured that the pairs retrieved were not sensitive to corpus size. This step is analogous to the method used by both Choueka et al. and Church et al., but it differs in the details of the metric.

In addition to the metric, however, Xtract used three additional filters based on linguistic properties. These filters were used to ensure an increase in the accuracy of the retrieved collocations by removing any which were not true lexical collocates. First, Xtract removed any collocations of a given word where the collocate can occur equally well in any of the ten positions around the given word. This filter removed semantically related pairs such as doctor–nurse, where one word can simply occur anywhere in the context of the other; in contrast, lexically constrained collocations will tend to be used more often in similar positions (e.g. an adjective+noun collocation would more often occur with the adjective several words before the noun). A second filter noted patterns of interest, identifying whether a word pair was always used rigidly with the same distance between words or whether there is more than one position. Finally, Xtract used syntax to remove collocations where a given word did not occur significantly often with words of the same syntactic function. Thus, verb+noun pairs were filtered to remove those that did not consistently present the same syntactic relation. For example, a verb+noun pair that occurred equally often in subject+verb and verb+object relations would be filtered out.

After retrieving word pairs, Xtract used a second stage to identify words which co-occurred significantly often with identified collocations. This way, the recursive property of collocations noted by [12] was accounted for. In this stage, Xtract produced all instances of appearance of the two words (i.e. concordances) and analyzed the distributions of words and parts of speech in the surrounding positions, retaining only those words around the collocation that occurred with probability greater than a given threshold. This stage produced rigid compounds (i.e. adjacent sequences of words that typically occurred as a constituent, such as noun compounds) as well as phrasal templates (i.e. idiomatic strings of words possibly including slots that may be replaced by other words). An example of a compound is (1), while an example of a phrasal template is (2).

(1) the Dow Jones industrial average
(2) The NYSE's composite index of all its listed common stocks fell *NUMBER* to *NUMBER*.

Xtract's output was evaluated by a lexicographer in order to identify precision and recall. 4,000 collocations produced by the first two stages of Xtract, excluding the syntactic filter, were evaluated in this manner. Of these, 40% were identified as good collocations. After further passing these through the syntactic filter, 80% were identified as good. This evaluation dramatically illustrated the importance of combining linguistic information with syntactic analysis. Recall was only measured for the syntactic filter. It was noted that of the good collocations identified by the lexicographer in the first step of output, the syntactic filter retained 94% of those collocations.
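The sketch below conveys the flavor of Xtract's first stage; the window size, thresholds, and the spread measure used for the positional filter are simplified assumptions, not the published settings of the system.

```python
import statistics
from collections import Counter, defaultdict

def xtract_stage_one(tokens, base, window=5, k0=1.0, u0=10.0):
    """Keep collocates of `base` that are (a) at least k0 standard deviations
    more frequent than the average collocate and (b) concentrated in a few
    relative positions rather than spread evenly over the window.
    k0, u0 and window are illustrative values."""
    freq = Counter()                  # total co-occurrence count per collocate
    by_offset = defaultdict(Counter)  # collocate -> counts per relative position
    for i, w in enumerate(tokens):
        if w != base:
            continue
        for off in range(-window, window + 1):
            j = i + off
            if off == 0 or j < 0 or j >= len(tokens):
                continue
            freq[tokens[j]] += 1
            by_offset[tokens[j]][off] += 1
    if not freq:
        return []
    mean = statistics.mean(freq.values())
    sd = statistics.pstdev(freq.values()) or 1.0
    offsets = [o for o in range(-window, window + 1) if o != 0]
    kept = []
    for w, f in freq.items():
        z = (f - mean) / sd
        spread = statistics.pvariance([by_offset[w][o] for o in offsets])
        if z >= k0 and spread >= u0:  # frequent AND positionally peaked
            kept.append((w, round(z, 2)))
    return sorted(kept, key=lambda t: -t[1])
```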
15.4 Using collocations for disambiguation

One of the more common approaches to word-sense disambiguation involves the application of additional constraints to the words whose sense is to be determined. Collocations can be used to specify such constraints. Two major types of constraints have been investigated. The first uses the general idea that the presence of certain words near the ambiguous one will be a good indicator of its most likely sense. The second type of constraint can be obtained when pairs of translations of the word in an aligned bilingual corpus are considered. Research performed at IBM in the early 1990s [7] applied a statistical method using as parameters the context in which the ambiguous word appeared. Seven factors were considered: the words immediately to the left or to the right of the ambiguous word, the first noun and the first verb both to the left and to the right, as well as the tense of the word in case it was a verb, or the first verb to the left of the word otherwise. The system developed indicated that the use of collocational information results in a 13% increase in performance over the conventional trigram model in which sequences of three words are considered.

Work by Dagan and Itai [14] took the ideas set forth in Brown et al.'s work further. They augmented the use of statistical translation techniques with linguistic information such as syntactic relations between words. By using a bilingual lexicon and a monolingual corpus of one language, they were able to avoid the manual tagging of text and the use of aligned corpora. Other statistical methods for word sense disambiguation are discussed in Chapter YZX.

Church et al. [9] have suggested a method for word disambiguation in the context of Optical Character Recognition (OCR). They suggest that collocational knowledge helps choose between two words in a given context. For example, the system may have to choose between farm and form when the context is either:

(3) federal ... credit
(4) some ... of

In the first case, the frequency of federal followed by farm is 0.50, while the frequency of federal followed by form is 0.039. Similarly, the frequencies of credit following either farm or form are 0.13 and 0.026, respectively. One can therefore approximate the probabilities for the trigram federal farm credit, which is (0.5 × 10^-6) × (0.13 × 10^-6) = 0.065 × 10^-12, and for federal form credit, which is (0.039 × 10^-6) × (0.026 × 10^-6) = 0.0010 × 10^-12. Since the first of these probabilities is 65 times as large as the second one, the OCR system can safely pick farm over form in (3). Similarly, form is 273 times more likely than farm in (4). Church et al. also note that syntactic knowledge alone would not help in such cases, as both farm and form are often used as nouns.
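A small sketch of the decision rule just described is given below; the function and the frequency table are illustrative, and a real system would estimate bigram probabilities from a large corpus rather than hard-code them.

```python
def choose_by_context(candidates, left, right, bigram_prob):
    """Pick the candidate whose left and right bigrams are jointly most
    probable, mirroring the federal ... credit example in the text."""
    def score(w):
        return bigram_prob(left, w) * bigram_prob(w, right)
    return max(candidates, key=score)

# Frequencies quoted above (per million words), as a toy lookup table:
freqs = {("federal", "farm"): 0.50, ("federal", "form"): 0.039,
         ("farm", "credit"): 0.13, ("form", "credit"): 0.026}
pick = choose_by_context(["farm", "form"], "federal", "credit",
                         lambda a, b: freqs.get((a, b), 1e-9))
print(pick)   # -> "farm", about 65 times more likely than "form" here
```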
15.5 Using collocations for generation

One of the most straightforward applications of collocational knowledge is in natural language generation. There are two typical approaches applied in such systems: the use of phrasal templates in the form of canned phrases and the use of automatically extracted collocations for unification-based generation. We will describe some of the existing projects using both of these approaches. At the end of this section we will also mention some other uses of collocations in generation.

15.5.1 Text generation using phrasal templates

Several early text-generation systems used canned phrases as sources of collocational information to generate phrases. One of them was UC (Unix consultant), developed at Berkeley by Jacobs [25]. The system responded to user questions related to the Unix operating system and used text generation to convey the answers. Another such system was Ana, developed by Kukich [27] at the University of Pittsburgh, which generated reports of activity on the stock market. The underlying paradigm behind generation of collocations in these two systems was related to the reuse of canned phrases, such as the following from [27]: opened strongly, picked up momentum early in trading, got off to a strong start. On the one hand, Kukich's approach was computationally tractable, as there was no processing involved in the generation of the phrases, while on the other, it did not allow for the flexibility that a text generation system requires in the general case. For example, Ana needed to have separate entries in its grammar for two quite similar phrases: opened strongly and opened weakly.

Another system that made extensive use of phrasal collocations was FOG [6]. This was a highly successful system which generated bilingual (French and English) weather reports that contained a multitude of canned phrases such as (5).

(5) Temperatures indicate previous day's high and overnight low to 8 a.m.

In general, canned phrases fall into the category of phrasal templates. They are usually highly cohesive and the algorithms that can generate them from their constituent words are expensive and sophisticated.

15.5.2 Text generation using automatically acquired collocational knowledge

Smadja and McKeown [39] have discussed the use of (automatically retrieved) collocations in text generation.
International Marketing: Exercise Answers

Answer the following questions based on your case study:

1. Describe the difference between ethnocentric, polycentric, regiocentric, and geocentric management orientations.

Ethnocentric: The home country is superior; sees similarities in foreign countries.
Polycentric: Each host country is unique; sees differences in foreign countries.
Regiocentric: Sees similarities and differences within a world region; is ethnocentric or polycentric in its view of the rest of the world.
Geocentric: World view; sees similarities and differences in home and host countries.

2. What are the basic reasons for global marketing? In your view, what are the key reasons for your company? Briefly introduce your company's global marketing performance and results.
Management Strategies for Reducing Resistance to Change

Forces that Stimulate Change
Sources of Resistance to Change
People tend to resist change, even in the face of evidence of its benefits.
• Kotter’s Eight-Step Model of the Change Process
• Organizational Development
Lewin’s Three-Step Model
Unfreezing can be achieved by:
– Increasing driving forces that direct behavior away from the status quo
– Using outside consultants
• Appreciative Inquiry
– Discovering what the organization does right
Creating a Culture for Change
Innovation: a new idea applied to initiating or improving a product, process, or service
Organizational
– Employee Selection
– Organizational Communication
– Goal-setting Programs
– Job Redesign
Intelligent Transportation Systems: Chinese-English Translation of Foreign Literature

(The document contains the English original and a Chinese translation.)

Original text: Traffic Assignment Forecast Model Research in ITS

Introduction

The intelligent transportation system (ITS) develops rapidly along with sustainable urban development, digital city construction, and the development of transportation. One of the main functions of ITS is to improve the transportation environment and alleviate traffic congestion. The most effective way to reach this aim is to forecast the traffic volume of the local network and of the important nodes exactly, using the GIS path-analysis function and related mathematical methods; this leads to better planning of the traffic network. Traffic assignment forecasting is an important phase of traffic volume forecasting: it assigns the forecasted traffic to every route in the traffic sector. If the traffic volume of a certain road is too large, which would bring on traffic jams, planners must consider building new roads or improving existing roads to alleviate the congestion. This study attempts to present an improved traffic assignment forecast model, MPCC, based on an analysis of the advantages and disadvantages of classic traffic assignment forecast models, and to test the validity of the improved model in practice.

1 Analysis of classic models

1.1 Shortcut traffic assignment

Shortcut traffic assignment is a static traffic assignment method. In this method, the impact of traffic load on vehicles' travel is not considered, and the traffic impedance (travel time) is a constant. The traffic volume of every origin-destination couple is assigned to the shortcut between the origin and the destination, while the traffic volume of the other roads in the sector is null. This assignment method has the advantage of simple calculation; however, uneven distribution of the traffic volume is its obvious shortcoming. Using this method, the assigned traffic volume is concentrated on the shortcut, which is obviously not realistic. However, shortcut traffic assignment is the basis of all the other traffic assignment methods.

1.2 Multi-ways probability assignment

In reality, travelers always want to choose the shortcut to the destination, which is called the shortcut factor; however, given the complexity of the traffic network, the path chosen may not necessarily be the shortcut, which is called the random factor. Although every traveler hopes to follow the shortcut, there are some whose choice is in fact not the shortcut. The shorter the path is, the greater the probability of its being chosen; the longer the path is, the smaller the probability of its being chosen. Therefore, the multi-ways probability assignment model is guided by the LOGIT model:

    p_i = exp(-θF_i) / Σ_{j=1}^{n} exp(-θF_j)        (1)

where p_i is the probability of the path section i; F_i is the travel time of the path section i; and θ is the transport decision parameter, which is calculated by the following principle: first, calculate p_i with different θ (from 0 to 1), then find the θ which makes p_i closest to the actual p_i.

The shortcut factor and the random factor are both considered in multi-ways probability assignment; therefore, the assignment result is more reasonable. However, the relationship between traffic impedance and traffic load and road capacity is not considered in this method, which makes the assignment result imprecise in a more crowded traffic network.
We attempt to improve the accuracy by integrating the several elements above in one model, MPCC.

2 Multi-ways probability and capacity constraint model

2.1 Rational path aggregate

In order to make the improved model more reasonable in application, the concept of the rational path aggregate has been proposed. The rational path aggregate, which is the foundation of the MPCC model, constrains the calculation scope. The rational path aggregate refers to the aggregate of paths between the start and the end of the traffic sector, defined by inner nodes ascertained by the following rules: the distance between the next inner node and the start cannot be shorter than the distance between the current one and the start; at the same time, the distance between the next inner node and the end cannot be longer than the distance between the current one and the end. The multi-ways probability assignment model is used only within the rational path aggregate to assign the forecast traffic volume, and this greatly enhances the applicability of the model.

2.2 Model assumptions

1) Traffic impedance is not a constant. It is decided by the vehicle characteristics and the current traffic situation.
2) The traffic impedance which travelers estimate is random and imprecise.
3) Every traveler chooses a path from his or her respective rational path aggregate.

Based on the assumptions above, we can use the MPCC model to assign the traffic volume in the sector of origin-destination couples.

2.3 Calculation of path traffic impedance

Travelers understand path traffic impedance differently, but generally the travel cost, which is mainly made up of forecast travel time, travel length, and forecast travel outlay, is considered the traffic impedance. Eq. (2) displays this relationship:

    C_a = αT_a + βL_a + γF_a        (2)

where C_a is the traffic impedance of the path section a; T_a is the forecast travel time of the path section a; L_a is the travel length of the path section a; F_a is the forecast travel outlay of the path section a; and α, β, γ are the weights of the three elements which impact the traffic impedance. For a certain path section, there are different α, β, and γ values for different vehicles. We can get the weighted average of α, β, and γ of each path section from the statistical percentage of each type of vehicle in the path section.

2.4 Chosen probability in MPCC

Travelers always want to follow the best path (broad-sense shortcut), but because of the impact of the random factor, travelers can only choose the path which has the smallest traffic impedance as they estimate it themselves. This is the key point of MPCC. According to the random utility theory of economics, if traffic impedance is considered as negative utility, the chosen probability p_rs of the origin-destination points couple (r, s) should follow the LOGIT model:

    p_rs = exp(-bC_rs) / Σ_{j=1}^{n} exp(-bC_j)        (3)

where p_rs is the chosen probability of the path section (r, s); C_rs is the traffic impedance of the path section (r, s); C_j is the traffic impedance of each path section in the forecast traffic sector; and b reflects the travelers' cognition of the traffic impedance of paths in the traffic sector, which is inversely proportional to its deviation. If b → ∞, the deviation of the understanding of traffic impedance approaches 0. In this case, all the travelers will follow the path which has the smallest traffic impedance, which equals the assignment results of shortcut traffic assignment. Conversely, if b → 0, travelers' understanding error approaches infinity.
In this case, the paths travelers choose are scattered. There is an objection that b has dimension in Eq. (3). Because the deviation of b would have to be known beforehand, it is difficult to determine the value of b. Therefore, Eq. (3) is improved as follows:

    p_rs = exp(-b C_rs / C_OD) / Σ_{j=1}^{n} exp(-b C_j / C_OD),    C_OD = (1/n) Σ_{j=1}^{n} C_j        (4)

where C_OD is the average of the traffic impedance of all the assigned paths, and b, which is dimensionless, is related only to the rational path aggregate rather than to the traffic impedance. According to actual observation, the range of b, which is an empirical value, is generally between 3.00 and 4.00. For the more crowded city internal roads, b is normally between 3.00 and 3.50.

2.5 Flow of MPCC

The MPCC model combines the idea of multi-ways probability assignment with iterative capacity-constraint traffic assignment.

Firstly, we get the geometric information of the road network and the OD traffic volume from related data. Then we determine the rational path aggregate with the method explained in Section 2.1.

Secondly, we calculate the traffic impedance of each path section with Eq. (2), as explained in Section 2.3.

Thirdly, on the foundation of the traffic impedance of each path section, we calculate the respective forecast traffic volume of every path section with the improved LOGIT model (Eq. (4)) in Section 2.4, which is the key point of MPCC.

Fourthly, through the calculation process above, we get the chosen probability and forecast traffic volume of each path section, but this is not the end. We must recalculate the traffic impedance under the new traffic volume situation. As is shown in Fig. 1 (flowchart of MPCC), because the relationship between traffic impedance and traffic load is considered, the traffic impedance and forecast assignment traffic volume of every path are continually amended. Using the relationship model between average speed and traffic volume, we can calculate the travel time and the traffic impedance of a certain path section under different traffic volume situations. For roads with different technical levels (highway and level 1 through level 4 roads), separate empirical relationship models between average speed V and traffic volume N_A of the path section are applied (Eqs. (5)-(9)).

At the end, we repeat assigning the traffic volume of path sections with the method in the previous step, which is the idea of iterative capacity-constraint assignment, until the traffic volume of every path section is stable.
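A compact sketch of one possible reading of this loop is given below; the impedance weights, the linear speed-volume stand-in, and the fixed iteration count are simplifying assumptions, not the calibrated models of Eqs. (2) and (5)-(9).

```python
import math

def impedance(time_h, length_km, cost, a=1.0, b=0.1, c=0.05):
    """Eq. (2)-style weighted travel cost; the weights here are placeholders."""
    return a * time_h + b * length_km + c * cost

def choice_probabilities(impedances, b_param=3.2):
    """Eq. (4): LOGIT share with impedance normalized by the OD average,
    so b_param is dimensionless (3.00-4.00 as suggested in the text)."""
    c_od = sum(impedances) / len(impedances)
    weights = [math.exp(-b_param * c / c_od) for c in impedances]
    total = sum(weights)
    return [w / total for w in weights]

def assign(od_volume, paths, b_param=3.2, iterations=20):
    """Assign volumes by Eq. (4), update travel times from the new volumes,
    and repeat; the linear speed-volume relation is a stand-in for Eqs. (5)-(9)."""
    volumes = [0.0 for _ in paths]
    for _ in range(iterations):
        imps = []
        for p, v in zip(paths, volumes):
            speed = max(5.0, p["free_speed"] * (1.0 - 0.6 * v / p["capacity"]))
            imps.append(impedance(p["length_km"] / speed, p["length_km"], p["toll"]))
        shares = choice_probabilities(imps, b_param)
        volumes = [od_volume * s for s in shares]
    return volumes

# Two hypothetical paths in one origin-destination pair's rational path aggregate.
paths = [{"length_km": 12.0, "free_speed": 80.0, "capacity": 1800.0, "toll": 0.0},
         {"length_km": 15.0, "free_speed": 100.0, "capacity": 2400.0, "toll": 5.0}]
print(assign(2000.0, paths))
```

In practice the loop would terminate when successive volume vectors agree within a tolerance rather than after a fixed number of iterations.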
Relationships between the trace element composition of sedimentary rocks and upper continental crust

Scott M. McLennan
Department of Geosciences, State University of New York at Stony Brook, Stony Brook, New York 11794-2100 (Scott.McLennan@)

[1] Abstract: Estimates of the average composition of various Precambrian shields and a variety of estimates of the average composition of upper continental crust show considerable disagreement for a number of trace elements, including Ti, Nb, Ta, Cs, Cr, Ni, V, and Co. For these elements and others that are carried predominantly in terrigenous sediment, rather than in solution (and ultimately into chemical sediment), during the erosion of continents, the La/element ratio is relatively uniform in clastic sediments. Since the average rare earth element (REE) pattern of terrigenous sediment is widely accepted to reflect the upper continental crust, such correlations provide robust estimates of upper crustal abundances for these trace elements directly from the sedimentary data. Suggested revisions to the upper crustal abundances of Taylor and McLennan [1985] are as follows (all in parts per million): Sc = 13.6, Ti = 4100, V = 107, Cr = 83, Co = 17, Ni = 44, Nb = 12, Cs = 4.6, Ta = 1.0, and Pb = 17. The upper crustal abundances of Rb, Zr, Ba, Hf, and Th were also directly reevaluated, and K, U, and Rb indirectly evaluated (by assuming Th/U, K/U, and K/Rb ratios), and no revisions are warranted for these elements. In the models of crustal composition proposed by Taylor and McLennan [1985] the lower continental crust (75% of the entire crust) is determined by subtraction of the upper crust (25%) from a model composition for the bulk crust, and accordingly, these changes also necessitate revisions to lower crustal abundances for these elements.

Keywords: Geochemistry; composition of the crust; trace elements. Index terms: Crustal evolution; composition of the crust; trace elements.

Received September 8, 2000; Revised December 3, 2000; Accepted December 11, 2000; Published April 20, 2001.

McLennan, S. M., 2001. Relationships between the trace element composition of sedimentary rocks and upper continental crust, Geochem. Geophys. Geosyst., vol. 2, Paper number 2000GC000109 [8994 words, 10 figures, 5 tables]. Published April 20, 2001.

Theme: Geochemical Earth Reference Model (GERM)    Guest Editor: Hubert Staudigel

1. Introduction

[2] The chemical composition of the upper continental crust is an important constraint on understanding the composition and chemical differentiation of the continental crust as a whole and the Earth in general [e.g., Taylor and McLennan, 1985, 1995; Rudnick and Fountain, 1995]. There have been a variety of estimates of upper crustal composition mostly based on large-scale sampling programs, largely in Precambrian shield areas, geochemical compilations of upper crustal lithologies, and sedimentary rock compositions (mainly shales). If the average chemical composition of the upper crust can be estimated from sedimentary rocks, then an especially powerful insight may be gained into the chemical evolution of the crust (and Earth) over geological time because of the relatively continuous record of sedimentary rocks, dating from ~4 Ga to the present.

[3] For the most part, estimates of upper crustal abundances from sedimentary data have been restricted intentionally to trace elements that are least fractionated by
various sedimentary processes, such as chemical and physical weathering, mineral sorting during transport, and diagenesis [McLennan et al., 1980]. Included are the rare earth elements (REE), Th, and Sc, as well as other elements (K, U, and Rb) that can be estimated indirectly using various so-called canonical ratios (Th/U, K/U, and K/Rb). Recently, however, this general approach has been applied to other trace elements, notably Nb, Ta, and Cs, that at least potentially may be more affected by various sedimentary processes [e.g., McDonough et al., 1992; Plank and Langmuir, 1998; Barth et al., 2000]. In this paper the relationships between the trace element composition of the sedimentary mass and the upper continental crust are evaluated for a variety of trace elements, and new estimates of upper crustal trace element abundances, based on the sedimentary rock record, are presented.

2. Comparison of Upper Crustal Estimates

[4] The most commonly cited estimates of upper crustal abundances are those of Taylor and McLennan [1985] (hereinafter referred to as TM85), which are based on a variety of approaches for different elements, including large-scale sampling programs (e.g., major elements, Sr, and Nb), average igneous compositions (e.g., Pb), compilations from Wedepohl [1969-1978] (e.g., Ba and Zr), sedimentary compositions (e.g., REE, Th, and Sc), and various canonical or assumed ratios, such as Zr/Hf, Th/U, K/U, K/Rb, Rb/Cs, and Nb/Ta (e.g., Hf, U, Rb, Cs, and Ta). Although there is widespread agreement that the upper crust approximates a composition equivalent to the igneous rock type granodiorite, there is in fact considerable disagreement regarding the precise values of a variety of trace elements. In Table 1, estimates of selected trace elements are tabulated for various shield surfaces. Some of these compositions are compared to the upper crustal estimate of TM85 in Figure 1, where it can be seen that discrepancies by nearly a factor 2 or more are common and that in some cases, estimates differ by more than a factor of 3 (Nb, Cr, and Co). These differences are likely due to some combination of inadequate sampling, analytical difficulties, and real regional variations in upper crustal abundances. In Table 2, various other recent estimates of the upper crust (see Table 2 for methods of estimates) are also compared to TM85, and again, some significant differences can be seen.

3. Sedimentary Rocks and Upper Crustal Compositions

[5] The notion that sediments could be used to estimate average igneous compositions at the Earth's surface was first suggested by V. M. Goldschmidt (see discussion by Goldschmidt [1954, pp. 53-56]), and using sedimentary data to derive upper crustal REE abundances was pioneered by S. R. Taylor [e.g., Taylor, 1964, 1977; Jakes and Taylor, 1974; Nance and Taylor, 1976, 1977; McLennan et al., 1980; Taylor and McLennan, 1981, 1985]. Goldschmidt used glacial sediments to estimate the major element composition of average igneous rocks because such sediment is dominated by mechanical rather than chemical processes. However, modern studies have used shale compositions to estimate upper crustal trace element abundances (TM85). This is because shales completely dominate the sedimentary record [Garrels and Mackenzie, 1971], constituting up to 70% of the stratigraphic record (depending on the method of estimating), and because most trace elements are enriched in shales compared to most other sediment types. The result is that shales dominate the sedimentary mass balance
for all but a few trace elements.

[6] Most studies also have been restricted to a few trace elements that are least affected by sedimentary processes and are transferred dominantly into the clastic sedimentary record during continental erosion, notably REE, Y, Sc, and Th. However, there are numerous other trace elements that are transferred from upper crust primarily into the clastic sedimentary mass, including Zr, Hf, Nb, Ta, Rb, Cs, Pb, Cr, V, Ni, and Co. Until recently, these elements have been largely neglected (see discussion by TM85) because of perceived problems of fractionation during mineral sorting, such that shales may not dominate the sedimentary mass balance (e.g., Zr, Hf, Nb, Ta, and Pb), and possible redistribution during weathering and/or diagenesis (e.g., Rb, Cs, Pb, Cr, V, Ni, and Co). Given the large variability among the various upper crustal and shield estimates for these elements (Tables 1 and 2), such processes may well add relatively minor uncertainty to upper crustal estimates derived from the clastic sedimentary record.

3.1. Cs in the Upper Crust

[7] The Cs content of the upper crust is given as 3.7 ppm by TM85 based on a Rb content of 112 ppm and a Rb/Cs ratio of 30. McDonough et al. [1992] argued that there was no fractionation of Rb from Cs during sedimentary processes and determined the average Rb/Cs of ~140 sediments and sedimentary rocks to be 19 (standard deviation of 11), which he took to be equivalent to the upper crust, leading to an upper crustal Cs content of ~6 ppm (using the Shaw et al. [1986] Canadian Shield average of Rb = 110 ppm).

Figure 1. Comparison plots for selected trace elements in two independent estimates of the Canadian Shield surface and various other shields with the estimate of the average upper continental crust from Taylor and McLennan [1985]. Thick solid line represents equal compositions, and dashed lines represent difference by a factor of 2. Data are from Table 1.

Rudnick and Fountain [1995] adopted an upper crustal Rb/Cs ratio of 20 and reported a Cs content of 5.6 ppm (using the TM85 upper crustal value of Rb = 112 ppm). Recently, the TM85 estimate has also been questioned by Plank and Langmuir [1998] on the basis of young marine sedimentary data. They noted a correlation between Cs and Rb in modern deep-sea sediments from a variety of tectonic and sedimentological regimes. Using this correlation and accepting a Rb upper crustal abundance of 112 ppm, they derived a new Cs estimate of 7.3 ppm (implying an upper crustal Rb/Cs of 15.3).

[8] The behavior of Cs in the sedimentary environment, in fact, is not well documented.
On the basis of the data available at the time, McDonough et al. [1992] argued that the Rb/Cs ratio does not change during sedimentary processes. However, this conclusion does not seem to be consistent with the observations that seawater Rb/Cs is ~400, typical river water Rb/Cs is ~50 [e.g., TM85; Lisitzin, 1996], and some tropical river waters have ratios in excess of 1000 [Dupré et al., 1996], whereas all workers seem to agree that the upper crustal Rb/Cs is <40 [TM85; McDonough et al., 1992; Gao et al., 1998; Wedepohl, 1995; Rudnick and Fountain, 1995; Plank and Langmuir, 1998].

[9] Rb/Cs ratios of weathering profiles appear to change systematically as a function of Rb content in both basaltic and granitic terranes (Figure 2), suggesting at least the potential for fractionation between these elements during surficial processes. Dupré et al. [1996] found Congo River suspended sediment, bed load sands, and dissolved load (including colloids) to have the following Rb/Cs ratios (average ± 95% confidence interval): 17 ± 4 (n = 8), 47 ± 8 (n = 15), and 481 ± 454 (n = 8), respectively, and Gaillardet et al. [1997] found Rb/Cs ratios as low as 4 in suspended sediment from the Amazon River. Thus interaction of natural waters with typical upper crust appears to lower the Rb/Cs ratio in the resulting fine-grained clastic sediments, likely due to the preferential exchange of the larger Cs ion onto clay minerals.

Figure 2. Plots of Rb/Cs versus Rb for weathering profiles developed on granodiorite [Nesbitt and Markovics, 1997] and basalt ([Price et al., 1991]; Cs data from S. R. Taylor (personal communication, 1997)) in Australia, suggesting Rb/Cs ratios may be strongly fractionated within weathering profiles. In spite of any fractionation within soil profiles, both of these elements are carried from weathering sites predominantly in the particulate load.

[10] There are few reliable data for Cs in carbonates, evaporites, and siliceous sediments;
Barth et al.[2000]suggested estimates of Nb= 11.5ppm and Ta=0.92ppm on the basis of the abundances of these elements in Australian post-Archean shales(PAAS)and loess.[12]It has long been known that elements concentrated in heavy mineral suites(notably, Zr and Hf but also Sn,Th,LREE,etc.)may be strongly fractionated during mineral sorting of clastic sediments[McLennan et al.,1993]. Although the geochemistry of Ti,Nb,and Ta is likely to be less affected by such processes,these elements may be concentrated in heavy mineral suites(e.g.,rutile,ilmenite,anatase, etc.),and like zircon,rutile and anatase are ``ultrastable''heavy minerals[Pettijohn et al., 1972].Accordingly,some care must be taken in interpreting the Ti,Nb,and Ta content of shales.On the other hand,the discrepancy between estimates of Plank and Langmuir [1998;Barth et al.,2000]and for the Canadian Shield[Shaw et al.,1986]is nearly a factor of 2,much greater than might be expected from any of these sedimentological considerations.3.3.Cr-Ni-V-Co in the Upper Crust[13]Upper crustal ferromagnesian trace element abundances reported by TM85,based largely on the Canadian Shield estimates of Shaw et al. [1967,1976]and Eade and Fahrig[1971, 1973;Fahrig and Eade,1968],are relatively low(e.g.,Cr=35ppm and Ni=20ppm) compared to a number of other shield estimates (Table1and Figure1)and various other upper crustal estimates(Table2).In contrast,the abundances of ferromagnesian trace elements in shales are typically a factor of$2greater than these values(TM85).This discrepancy has rarely been discussed in any detail,although Condie[1993]has proposed significantly higher upper crustal abundances of ferromag-nesian trace element abundances(see Table2).[14]Relatively low upper crustal abundances of these elements were effectively a requirement of the once popular``andesite model''for crustal growth because average andesite has very low abundances for these elements[e.g.,Taylor, 1967,1977;Gill,1981;Gill et al.,1994].For example,Taylor[1977]estimated average ande-site to have Cr=55ppm and Ni=30ppm. During intracrustal partial melting and differ-entiation,enrichments of such elements in the residual lower crust would be expected,but for the andesite model,high ferromagnesian trace element abundances in the upper crust(e.g.,Cr >55ppm)would have predicted theoppositeand thus created mass balance difficulties. Accordingly,the low levels of ferromagnesian trace elements found in the Canadian Shield by Shaw et al.[1967,1976]seemed consistent.[15]However,it is now understood that low abundances of these elements in typical oro-genic andesites are a reflection of the fractio-nated nature of most andesites and that unfractionated mantle-derived arc magmas typically have much higher levels of ferro-magnesian trace elements[e.g.,Gill,1981].In addition,it is now widely accepted that much of the continental crust formed during the Archean and higher ferromagnesian trace ele-ment levels are characteristic of Archean oro-genic igneous rocks[e.g.,Condie,1993]. 
Most models of bulk crustal abundances now reflect these higher levels[Taylor and McLen-nan,1985,1995;Rudnick and Fountain, 1995],but upper crustal abundances of the ferromagnesian trace elements have received little comment.4.Methods4.1.Database[16]The database consists of a variety of com-pilations based on large-scale averages or com-posites of several sedimentary rock types of different grain sizes and from a variety of tectonic and sedimentological settings.Where possible,old sedimentary rocks,especially of Archean through early Proterozoic age,were neglected in order to avoid any issues of secular change in upper crustal composition.In fact, even with this sampling strategy,it is impos-sible to entirely avoid issues of secular varia-tions in composition because most sedimentary rocks are recycled over long periods of geo-logical time[Veizer and Jansen,1979,1985].[17]The Russian Shale average is based on a remarkable number of samples(n$40,000).Apart from this,>1200samples have gone into the various other averages and composites. Table3lists the trace element analyses and data sources used in Figures3±10.There is a small amount of redundancy in some of these averages in that the same samples may be included in more than one of the averages. For example,modern turbidites analyzed by McLennan et al.[1990]are subdivided by lithology and tectonic setting in Table3.How-ever,these samples(n=63)represent$10%of the analyses considered by Plank and Lang-muir[1998]in estimating global subducting sediment(GLOSS).Loess is considered to be a sediment type that perhaps best reflects the upper crustal provenance for many elements because of the relatively minor effects of weathering[Taylor et al.,1983].Accordingly, several regional loess averages are given in Table4,and these are also plotted individually on Figures4±10.[18]It is not possible to fully evaluate formal statistical uncertainties for some of these aver-ages because the primary sources do not provide sufficient information on variance.However, the large number of samples used to estimate many of the averages coupled with the fact that confidence in an average improves as a function of the square root of the number of samples results in relatively small uncertainties in the averages(at95%confidence level).For exam-ple,Plank and Langmuir[1998]reported standard deviations for the GLOSS data that were typically10±20%of the average for most trace elements.Because of the very large num-ber of samples used to formulate the average (>500),this results in95%confidence levels on the means of$1±2%.At the other extreme,the average river suspended sediment data have relatively large standard deviations(25±50% of the average values),probably a result of the fact that these rivers sample upper crust of widely varying tectonic settings and climatic regimes.This coupled with the relativelysmallnumber of analyses(n=7±19,depending on element)results in95%confidence limits on the means of10±30%.In the case of North American shale composite(NASC),the data represent a single analysis of a composite sample,and analytical error likely dominates the uncertainty.4.1.1.Shales,muds,and loess(fine grain)[19]Fine-grained sediment averages and com-posites that are used are described below(see Table3).In estimating the average fine-grained sediment,equal weight was given to each of the various sediment composites and averages.[20]1.For the river suspended value,average suspended sediment is from near the terminus of19major rivers of the world that together drain$13%of 
the exposed land surface [Martin and Meybeck, 1979; Gaillardet et al., 1999]. Not all elements are reported for all rivers, with the most extreme case being Sc (n = 7).

[21] 2. Average loess is determined from the mean of eight regional loess averages from New Zealand, central North America, the Kaiserstuhl region, Spitsbergen, Argentina, the United Kingdom, France, and China (see Table 4 for sources; n = 52).

[22] 3. NASC is a composite of 40 sediments (mainly shales), mostly from North America [Gromet et al., 1984].

[23] 4. Post-Archean average Australian shale is an average of 23 Australian shales of post-Archean age [Nance and Taylor, 1976; McLennan, 1981, 1989; Barth et al., 2000]. The original PAAS [Nance and Taylor, 1976] reported REE data only; however, the remaining elements were compiled by McLennan [1981], and REE data were updated by McLennan [1989]. Ta values used here were recently reported by Barth et al. [2000].

[24] 5. Average Russian shale is an average of 1.6–0.55 Ga shales (4883 samples and 4 composites from 1257 samples) and 0.55–0.0 Ga shales (6552 samples and 1674 composites from 28,288 samples). Samples are mainly from Russia and the former Soviet Union but also include representative samples from North America, Australia, South Africa, Brazil, India, and Antarctica [Ronov et al., 1988].

[25] 6. Average Phanerozoic cratonic shale is from Condie [1993] (n > 100).

[26] 7. GLOSS is an estimate of the average composition of marine sediment reaching subduction zones, based on ≈577 marine sediments [Plank and Langmuir, 1998]. This average differs from the other fine-grained averages in that it includes a significant component of nonterrigenous material, including chemical sediment, pelagic sediment, and coarser-grained turbidites. This leads to some anomalies that are discussed below.

[27] 8. Average passive margin turbidite mud is an average of modern turbidite muds from trailing edges and the Ganges cone [McLennan et al., 1990] (n = 9) and Paleozoic passive margin mudstones from Australia [Bhatia, 1981, 1985a, 1985b] (n = 10).

[28] 9. Average active margin turbidite mud is an average of modern turbidite muds from active margins [McLennan et al., 1990] (n = 18) and average Australian Paleozoic turbidite mudstones from oceanic island arcs (n = 9), continental arcs (n = 12), and Andean-type margins (n = 2) [Bhatia, 1981, 1985a, 1985b].

4.1.2. Sand and sandstones (coarse grain)

[29] Coarser-grained sediment averages that were used are described below (see Table 3).
In estimating the average coarse-grained sediment, equal weight was given to each of the various sediment composites and averages.

[30] 1. Average tillite is derived from the average of Pleistocene till from Saskatchewan [Yan et al., 2000] (n = 33) and late Proterozoic tillite matrix (texturally a sandstone) from Scotland [Panahi and Young, 1997] (n = 21). A coarse-grained glacial sediment average was included to be comparable to the fine-grained loess deposits.

[31] 2. Average Phanerozoic cratonic sandstone is from Condie [1993] (n > 100).

[32] 3. Average Phanerozoic greywacke is from the mean of Paleozoic (n > 100) and Mesozoic-Cenozoic (n > 100) averages [Condie, 1993].

[33] 4. Average passive margin sand is an average of modern turbidite sands from trailing edges and the Ganges cone [McLennan et al., 1990] (n = 11) and Paleozoic passive margin sandstones from Australia [Bhatia, 1981, 1985b; Bhatia and Crook, 1986] (n = 15).

[34] 5. Average active margin sand is an average of modern turbidite sands from active continental margins [McLennan et al., 1990] (n = 25, with aberrantly high Cr and Ni from one sample excluded) and average Australian Paleozoic turbidite sandstones from oceanic island arcs (n = 11), continental arcs (n = 32), and Andean-type margins (n = 10) [Bhatia, 1981, 1985b; Bhatia and Crook, 1986].

4.2. Approach

[35] The approach adopted in this paper for estimating upper continental crustal abundances of certain trace elements makes two basic assumptions: (1) the REE content of clastic sedimentary rocks best reflects upper crustal abundances, and the upper crustal REE estimates of TM85 are adopted (e.g., La = 30 ppm), and (2) the sedimentary mass balance of the elements under consideration is dominated entirely by clastic sedimentary rocks, such that they have low or negligible abundances in other sediments, such as pure carbonates, evaporites, or siliceous sediments. In practice, this assumption is more robust for some elements than others (see section 6). Accordingly, by examining the relationship between a variety of trace elements and REE (using the most incompatible REE, La) in clastic sediments and sedimentary rocks, it is possible to evaluate upper crustal La/element ratios. This approach is similar to that used by McLennan et al. [1980] to estimate upper crustal Th abundances from the sedimentary record [also see McLennan and Xiao, 1998].

[36] Clastic sedimentary data are divided into "fine-grained" lithologies, including shales, muds, and silts (e.g., loess), and "coarse-grained" lithologies, including sands, sandstones, and tillites, as described above. The average composition of each lithology was determined by giving equal weight to each of the individual averages tabulated in Table 3. The upper crustal La/element ratios were calculated from the overall weighted average composition, using the relative proportions of shales (fine grained) to sandstones (coarse grained) found in the geological record (shale/sandstone ratio of 6), and thus taken to be representative of average terrigenous sediment (the arithmetic is sketched below).
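To make the mixing arithmetic of section 4.2 explicit, the following sketch reproduces the calculation in code. It is an illustration written for this summary, not the author's own program; the function name and the element concentrations passed to it are placeholders, and only the shale/sandstone ratio of 6 and the TM85 value of La = 30 ppm come from the text.

# Sketch of the section 4.2 calculation; input concentrations are placeholders,
# not data from Table 3.
def upper_crust_abundance(la_fine, el_fine, la_coarse, el_coarse,
                          shale_to_sand=6.0, la_upper_crust=30.0):
    """Estimate an upper crustal trace element abundance (ppm).

    la_fine, el_fine     : La and element averages for fine-grained sediment (ppm)
    la_coarse, el_coarse : La and element averages for coarse-grained sediment (ppm)
    shale_to_sand        : mass ratio of shale to sandstone in the geological record
    la_upper_crust       : adopted upper crustal La content (TM85, 30 ppm)
    """
    w_fine = shale_to_sand / (shale_to_sand + 1.0)    # 6:1 gives a weight of ~0.857
    w_coarse = 1.0 - w_fine
    la_sed = w_fine * la_fine + w_coarse * la_coarse  # weighted terrigenous La
    el_sed = w_fine * el_fine + w_coarse * el_coarse  # weighted terrigenous element
    return la_upper_crust * el_sed / la_sed           # scale by the upper crustal La

Doubling or halving shale_to_sand changes the result only slightly when the fine- and coarse-grained La/element ratios are similar, which is the sensitivity argument made in paragraph [38].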
Finally, the upper crustal abundances were determined from these La/element ratios, assuming an upper crustal La content of 30 ppm (TM85).

[37] The uncertainties in this approach are likely to be dominated by issues such as weighting factors and representativeness of samples rather than the statistical uncertainty in the various sediment averages. As noted above, the 95% confidence intervals for the various sediment averages listed in Table 3 are generally fairly small (mostly less than 10%). On the other hand, some of these averages are based on only a few sedimentary sequences. For example, the till average is from samples taken from only two sedimentary sequences, and the active margin sand and mud averages are largely based on relatively few sedimentary sequences in Australia. Whether or not these averages are representative of the various sedimentary settings cannot be evaluated and is the subject of further work.

[38] An additional potential source of uncertainty is in the weighting factors used to determine the fine-grained and coarse-grained averages and overall averages. In calculating the fine-grained and coarse-grained averages, an arbitrary weighting factor of 1 was given to each analysis listed in Table 3. In calculating the overall average, the fine-grained and coarse-grained averages were weighted to the ratio of shale to sandstone in the geological record. Although there is some uncertainty in this ratio (e.g., see recent discussion by Lisitzin [1996]), here I adopt the shale to sandstone mass ratio of 6:1, which is approximately midway between the average value measured by a variety of workers (4.3:1; see Garrels and Mackenzie [1971] for a summary) and the theoretical value (7.1:1) calculated by Garrels and Mackenzie [1971]. Because trace element abundances in sandstones on average are significantly less than those in shales and the La/element ratios are generally similar (the greatest difference, for La/Cs, is ≈50%), changing the proportion of coarse-grained sediment by as much as a factor of 2 has only a slight effect (<5%) on the final upper crustal concentrations.

5. Results

5.1. REE, Th, and Sc

[39] On Figure 3, the REE patterns of the various averages and composites are plotted and compared with the TM85 estimate of the upper continental crust. The long-standing observation that post-Archean sedimentary REE patterns are remarkably uniform is apparent. Although there is considerable variability in

Figure 3. (a) Chondrite-normalized REE patterns for various fine-grained and coarse-grained sediment averages and composites listed in Table 3 compared to the upper crustal REE pattern from Taylor and McLennan [1985]. (b) Comparison of weighted average clastic sediment and upper crustal REE patterns.
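The square-root-of-n argument in section 4.1 (paragraph [18]) can be checked with a few lines of arithmetic. The sketch below is illustrative only; the relative standard deviations and sample sizes are mid-range values taken from the ranges quoted in the text, not reported statistics.

import math

def ci95_percent(rel_sd_percent, n):
    """Approximate 95% confidence interval on a mean, as a percentage of the mean."""
    return 1.96 * rel_sd_percent / math.sqrt(n)

# GLOSS-like case: SD ~10-20% of the mean, n > 500  -> roughly 1-2%
print(round(ci95_percent(15, 577), 1))
# River suspended sediment case: SD 25-50%, n = 7-19 -> roughly 10-30%
print(round(ci95_percent(35, 12), 1))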
An Effective Method for Solving Constrained Nonlinear Programming Problems

Master's Thesis, Nanjing University of Aeronautics and Astronautics
List of Figures and Tables
Figure 1.1  Execution framework of the hybrid method
Table 5.1  Execution results of the QPFQN algorithm and Algorithm 2.1
Under assumptions (i), (ii), and (iii) above, Bertsekas [2] considered two different Newton equations. The first equation, formulated at (x, u), yields iterates with a q-quadratic convergence rate; for the second, a q-superlinear convergence rate at x is proved, but when x_{k+1} is determined by solving a linear system of equations, a quadratically constrained subproblem must be solved at every iteration.
Keywords: nonlinear programming, method of multipliers, finite-difference technique, quasi-Newton method, global convergence
ABSTRACT
A new efficient method for solving the nonlinear programming (NLP) problem is studied in this paper. It is a hybrid of the method of multipliers and a quasi-Newton method. To avoid solving a QP subproblem, we develop a new nonlinear system that is equivalent to the KKT conditions of the problem. An NCP function is used in the nonlinear system so that nonnegativity constraints on some variables are avoided. To guarantee global convergence, a method of multipliers is inserted into the iterative process. When the iterate is far from the optimal point, the approximate negative gradient obtained by a finite-difference technique is used as the descent direction, which guarantees rapid descent. When the iterate is near the optimal point, we derive a linear system from the Newton equation and use the BFGS updating formula to approximate the Hessian matrix. The global convergence of the hybrid algorithm is proved and numerical tests of the algorithm are given. The theoretical and numerical results show that the hybrid method is efficient.
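Two of the ingredients named above, the finite-difference gradient approximation and the BFGS update of the Hessian approximation, can be sketched in a few lines. This is a generic illustration of those standard formulas, not the thesis algorithm itself; the function names and the step size h are chosen here for clarity.

import numpy as np

def fd_gradient(f, x, h=1e-6):
    """Forward-difference approximation of the gradient of f at x."""
    g = np.zeros_like(x, dtype=float)
    fx = f(x)
    for i in range(x.size):
        e = np.zeros_like(x, dtype=float)
        e[i] = h
        g[i] = (f(x + e) - fx) / h
    return g

def bfgs_update(B, s, y):
    """BFGS update of the Hessian approximation B for step s and gradient change y."""
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)

In a hybrid scheme of the kind described, -fd_gradient(f, x) would serve as the descent direction far from the solution, while bfgs_update would maintain the matrix used in the Newton-type linear system near the solution.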

Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition

respectively, and utilized an SVM classifier to classify the actions. A skeleton-based dictionary learning method utilizing group sparsity and geometry constraints was also proposed by [8]. An angular skeletal representation over the tree-structured set of joints was introduced in [9], which calculated the similarity of these features over the temporal dimension to build the global representation of the action samples and fed them to an SVM for final classification.

Recurrent neural networks (RNNs), which are a variant of neural nets for handling sequential data with variable length, have been successfully applied to language modeling [10-12], image captioning [13,14], video analysis [15-24], human re-identification [25,26], and RGB-based action recognition [27-29]. They have also achieved promising performance in 3D action recognition [30-32].

Existing RNN-based 3D action recognition methods mainly model the long-term contextual information in the temporal domain to represent motion-based dynamics. However, there is also strong dependency between joints in the spatial domain, and the spatial configuration of joints in video frames can be highly discriminative for the 3D action recognition task.

In this paper, we propose a spatio-temporal long short-term memory (ST-LSTM) network which extends the traditional LSTM-based learning to two concurrent domains (temporal and spatial). Each joint receives contextual information from neighboring joints and also from previous frames to encode the spatio-temporal context. Human body joints are not naturally arranged in a chain, so feeding a simple chain of joints to a sequence learner cannot perform well. Instead, a tree-like graph can better represent the adjacency properties between the joints in the skeletal data. Hence, we also propose a tree-structure based skeleton traversal method to explore the kinematic relationship between the joints for better spatial dependency modeling.

In addition, since the acquisition of depth sensors is not always accurate, we further improve the design of the ST-LSTM by adding a new gating function, the so-called "trust gate", to analyze the reliability of the input data at each spatio-temporal step and give better insight to the network about when to update, forget, or remember the contents of the internal memory cell as the representation of long-term context information.

The contributions of this paper are: (1) the spatio-temporal design of LSTM networks for 3D action recognition, (2) a skeleton-based tree traversal technique to feed the structure of the skeleton data into a sequential LSTM, (3) improving the design of the ST-LSTM by adding the trust gate, and (4) achieving state-of-the-art performance on all the evaluated datasets.

2 Related Work

Human action recognition using 3D skeleton information has been explored in different aspects during recent years [33-50]. In this section, we limit our review to the more recent RNN-based and LSTM-based approaches. HBRNN [30] applied bidirectional RNNs in a novel hierarchical fashion. They divided the entire skeleton into five major groups of joints, and each group was fed

Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition

Jun Liu¹, Amir Shahroudy¹, Dong Xu², and Gang Wang¹(B)

¹ School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore
{jliu029,amir3,wanggang}@.sg
² School of Electrical and Information Engineering, University of Sydney, Sydney, Australia
******************.au

Abstract. 3D action recognition – analysis of human actions based on 3D
skeleton data – has become popular recently due to its succinctness, robustness, and view-invariant representation. Recent attempts on this problem suggested developing RNN-based learning methods to model the contextual dependency in the temporal domain. In this paper, we extend this idea to spatio-temporal domains to analyze the hidden sources of action-related information within the input data over both domains concurrently. Inspired by the graphical structure of the human skeleton, we further propose a more powerful tree-structure based traversal method. To handle the noise and occlusion in 3D skeleton data, we introduce a new gating mechanism within LSTM to learn the reliability of the sequential input data and accordingly adjust its effect on updating the long-term context information stored in the memory cell. Our method achieves state-of-the-art performance on 4 challenging benchmark datasets for 3D human action analysis.

Keywords: 3D action recognition · Recurrent neural networks · Long short-term memory · Trust gate · Spatio-temporal analysis

1 Introduction

In recent years, action recognition based on the locations of major joints of the body in 3D space has attracted a lot of attention. Different feature extraction and classifier learning approaches have been studied for 3D action recognition [1-3]. For example, Yang and Tian [4] represented the static postures and the dynamics of the motion patterns via eigenjoints and utilized a Naïve-Bayes-Nearest-Neighbor classifier. A HMM was applied by [5] for modeling the temporal dynamics of the actions over a histogram-based representation of 3D joint locations. Evangelidis et al. [6] learned a GMM over the Fisher kernel representation of a succinct skeletal feature, called skeletal quads. Vemulapalli et al. [7] represented the skeleton configurations and actions as points and curves in a Lie group.

© Springer International Publishing AG 2016
B. Leibe et al. (Eds.): ECCV 2016, Part III, LNCS 9907, pp. 816-833, 2016. DOI: 10.1007/978-3-319-46487-9_50
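The trust-gate idea described above can be sketched conceptually as follows. This is a rough illustration written for this text, not the authors' formulation or code: the weight names are hypothetical, the exact form of the gate in the paper differs, and a real implementation would learn all parameters end to end.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def st_lstm_step(x, h_time, c_time, h_space, c_space, p):
    """One spatio-temporal LSTM step for a single joint (conceptual sketch).

    x               : input feature of the joint at the current frame
    h_time, c_time  : hidden/cell state of the same joint at the previous frame
    h_space, c_space: hidden/cell state of the previous joint in the traversal
    p               : dict of (hypothetical) weight matrices and biases
    """
    z = np.concatenate([x, h_space, h_time])
    i   = sigmoid(p["W_i"]  @ z + p["b_i"])    # input gate
    f_s = sigmoid(p["W_fs"] @ z + p["b_fs"])   # forget gate over the spatial context
    f_t = sigmoid(p["W_ft"] @ z + p["b_ft"])   # forget gate over the temporal context
    o   = sigmoid(p["W_o"]  @ z + p["b_o"])    # output gate
    u   = np.tanh(p["W_u"]  @ z + p["b_u"])    # candidate derived from the input

    # Trust gate: predict a candidate from context alone and compare it with the
    # input-derived candidate; a large mismatch suggests the input is unreliable.
    ctx = np.concatenate([h_space, h_time])
    u_pred = np.tanh(p["W_p"] @ ctx + p["b_p"])
    trust = np.exp(-p["lam"] * (u - u_pred) ** 2)   # near 1 when the input is trusted

    c = trust * i * u + (1.0 - trust) * (f_s * c_space + f_t * c_time)
    h = o * np.tanh(c)
    return h, c

When trust is low, the cell relies on the spatial and temporal context carried in c_space and c_time rather than the noisy measurement, which is the behaviour the paper attributes to the trust gate.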
Principles of Artificial Intelligence (Peking University, China University MOOC): chapter exercise answers and final exam question bank, 2023

1. The Turing Test is designed to provide what kind of satisfactory operational definition?
Answer: machine intelligence.

2. Considering the differences between agent functions and agent programs, select the correct statement from the following.
Answer: An agent program implements an agent function.

3. There are two main kinds of formulation for the 8-queens problem. Which one is the formulation that starts with all 8 queens on the board and moves them around?
Answer: Complete-state formulation (a small code illustration follows this list).

4. What kind of knowledge is used to describe how a problem is solved?
Answer: Procedural knowledge.

5. Which of the following is used to discover general facts from training examples?
Answer: Inductive learning.

6. Which statement best describes the task of "classification" in machine learning?
Answer: To assign a category to each item.
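As a small illustration of the complete-state formulation mentioned in question 3, the sketch below places all eight queens first and then repairs conflicts by moving them, using the standard min-conflicts heuristic. It is added for illustration only and is not part of the original question bank.

import random

def conflicts(state, col, row):
    """Number of queens attacking square (col, row) under assignment state."""
    return sum(1 for c, r in enumerate(state)
               if c != col and (r == row or abs(r - row) == abs(c - col)))

def min_conflicts_8queens(max_steps=10000):
    state = [random.randrange(8) for _ in range(8)]   # all 8 queens already placed
    for _ in range(max_steps):
        bad = [c for c in range(8) if conflicts(state, c, state[c]) > 0]
        if not bad:
            return state                               # no attacks left: a solution
        col = random.choice(bad)
        # Move the chosen queen to the row with the fewest conflicts in its column.
        state[col] = min(range(8), key=lambda r: conflicts(state, col, r))
    return None  # may occasionally fail within max_steps; restart in that case

print(min_conflicts_8queens())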
Constraint-Based Approaches to the Covering Test Problem

Brahim Hnich, Steven Prestwich, and Evgeny Selensky

Cork Constraint Computation Center, University College, Cork, Ireland
{brahim,s.prestwich,e.selensky}@4c.ucc.ie

Abstract. Covering arrays have been studied for their applications to drug screening and software and hardware testing. In this paper, we model the problem as a constraint program. Our proposed models exploit non-binary (global) constraints, redundant modelling, channelling constraints, and symmetry breaking constraints. Our initial experiments show that with our best integrated model, we are able to either prove optimality of existing bounds or find new optimal values for arrays of moderate size. Local search on a SAT-encoding of the model is able to find improved bounds on larger problems.

1 Introduction

Software and hardware testing play an important role in the process of product development. For instance, software testing may consume up to half of the overall software development cost [15]. Furthermore, even for simple software or hardware products, exhaustive testing is infeasible because the number of possible test cases is typically prohibitively large. For example, suppose we have a machine with 10 switches that have to be set, each with two positions. We wish to test the machine before shipping. Since there are 2^10 possible combinations, it becomes impractical to test them all. Nevertheless, we might want only a small number of test settings such that every subset of, say, three switches gets exercised in all 2^3 possible ways. In such a case, the question becomes: How many test vectors do we need? This problem is an instance of the t-covering array problem. A covering array CA(t,k,g) of size b is a k×b array consisting of k vectors of length b with entries from {0,1,...,g-1} (g is the size of the alphabet) such that the projection of any t coordinates contains all g^t possibilities. The objective consists in finding the minimum b for which a CA(t,k,g) of size b exists.
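The covering property is easy to check mechanically, which helps make the definition concrete. The function below is written for this summary (it is not from the paper) and simply tests every choice of t columns; the 10-vector array used as the example is the solution reproduced later in Figure 1 (Section 3.1), so the call prints True and witnesses CAN(3,5,2) <= 10.

from itertools import combinations

def is_covering_array(rows, t, g):
    """True if every g**t value combination appears in every choice of t columns.

    rows : list of test vectors (each a list of k symbols from {0, ..., g-1})
    """
    k = len(rows[0])
    for cols in combinations(range(k), t):
        seen = {tuple(row[c] for c in cols) for row in rows}
        if len(seen) < g ** t:
            return False
    return True

# The 10 test vectors of Figure 1 (t = 3, k = 5, g = 2):
vectors = [[0,0,0,0,0], [0,0,0,1,1], [0,0,1,0,1], [0,1,0,0,1], [0,1,1,1,0],
           [1,0,0,0,1], [1,0,1,1,0], [1,1,0,1,0], [1,1,1,0,0], [1,1,1,1,1]]
print(is_covering_array(vectors, t=3, g=2))  # True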
Covering arrays have been studied for their applications to drug screening and software and hardware testing. Over the past decade, there has been a body of work done in this field (see [1, 3-5, 11, 24, 25] for examples).

The first author is supported by Science Foundation Ireland and an Ilog license grant. The third author is supported by Bausch & Lomb Ireland and Enterprise Ireland. This work has also received support from Science Foundation Ireland under Grant 00/PI.1/C075.

B. Faltings et al. (Eds.): CSCLP 2004, LNAI 3419, pp. 172-186, 2005. © Springer-Verlag Berlin Heidelberg 2005

Constructions for optimal covering arrays CA(2,k,g) are known when the vectors are binary [17]. In [17], an exhaustive backtrack search is presented that is used to find new lower bounds on the sizes of optimal covering arrays CA(2,k,g) where the alphabet Z_g is non-binary. However, in the general case the problem is NP-complete [11]. Most of the approaches use approximation methods where only upper and lower bounds of b are determined in polynomial time. In this paper we propose modelling this problem, in its most general form, as a constraint program. We explore different models, and show that with a constraint programming approach we are able either to prove optimality of existing bounds, or to find new optimal values for problems of relatively moderate size. When the size of the problem increases our models' performance degrades, but we are able to find improved (though not necessarily optimal) bounds for larger problems by applying a local search algorithm to a SAT-encoding of the constraint model.

The rest of the paper is organised as follows. In Section 2, we describe the covering test problem and give an overview of related work. In Section 3 we detail the proposed constraint models. We then show how we extend our models to handle more general cases in Section 4. Section 5 presents our initial experimental results and a discussion. Finally, we conclude in Section 6 and outline our future directions.

2 The Covering Test Problem

The covering test problem¹ is a direct application of the problem of covering arrays arising in hardware and software testing [11].

Definition 1. (Hartman and Raskin [11]) A covering array CA(t,k,g) of size b and strength t is a k×b array A = (a_ij) over Z_g = {0,1,2,...,g-1} with the property that for any t distinct rows 1 <= r_1 < r_2 < ... < r_t <= k, and any member (x_1,x_2,...,x_t) of Z_g^t, there exists at least one column c such that x_i = a_{r_i c} for all 1 <= i <= t.

Definition 2. (Hartman and Raskin [11]) The covering array number CAN(t,k,g) is the smallest b for which a CA(t,k,g) of size b exists.

The covering test problem is: for a given tuple t, k, g, b, find a CA(t,k,g) such that CAN(t,k,g) = b or show that none exists. Informally, we wish to find a minimum number of test vectors, of k parameters each, over the alphabet Z_g such that the vectors contain all possible t-strings for every t-tuple of k parameters. Clearly, if t = k and g is fixed then the number of test vectors is g^k and it is optimal. However, if t < k then g^k is only an upper bound on the number of tests. The problem of finding the minimum b can be solved iteratively by a series of constraint satisfaction problems with decreasing values of b. The solution with the smallest b is then guaranteed to be optimal.

¹ A description of this problem is also available as problem 45 in CSPLib,

An approach to making software testing more efficient was presented by Cohen et al. [4], using test suites generated from
combinatorial designs. The idea was firstly to identify parameters that induce the space of possible test scenarios, and secondly to select test scenarios so as to cover all the pairwise (or t-wise with t > 2 if necessary) interactions between the values of these parameters.² This is analogous to earlier approaches [1, 24, 25].

A theoretical study [11] establishes properties of covering test suites, in particular the lower bounds on their size, and presents several ways to construct test suites to achieve the bounds asymptotically. The problem of minimising the number of test cases in a t-wise covering test suite for k domains of size n was, according to [11], first studied in [20]. Some papers consider the equivalent problem of maximising the number k of domains of size n in a t-wise covering test suite with a fixed number N of test cases [11]. This problem is referred to as finding the size of the largest family of t-independent n-partitions of an N-set. Determining the minimum number of test vectors for a t-wise covering of parameters with Boolean values is known to be NP-complete [23]. Related problems are finding a test suite with minimum deficiency given a fixed budget for executing a maximum of N tests, and a minimum test suite with a fixed relative deficiency (deficiency over the total number of t-subsets).

[16] discusses a practical issue of extending a given test suite to account for an additional parameter. The authors present an optimal algorithm for adding new rows to the test suite once a new column has been inserted. However, their algorithms for adding a new column are either exponential or suboptimal. [3] presents a technique for reducing the covering test suite problem to graph-coloring. Even though this approach is more general, it is advantageous only for non-uniform coverage.

In [9, 12, 13] applications of covering suite generation are dealt with, ranging from testing a satellite system to diagnosis in digital logic devices. This is known as the diagnosis problem and is generally solved via Built-In-Self-Testing (BIST). BIST is a relatively new area and is the leading approach in industrial testing. It offers low hardware overheads and quick testing capabilities [13]. The authors of [13] establish a link between BIST techniques and combinatorial group testing (CGT) [6]. They formulate the diagnosis problem, discuss the shortcomings of some contemporary BIST approaches, and overview standard CGT diagnosis algorithms such as digging, multi-stage batching, doubling and jumping. With these algorithms they achieve improvements over the BIST techniques, and present new hybrid diagnosis algorithms called batched digging and batched binary search.

To the best of our knowledge, no one to date has looked at this area from a constraint perspective. Given the success of constraint technology in industrial combinatorial optimization, this paper is our first attempt to bridge this gap, and to see if constraint-based approaches can compete with existing methods.
² As [4] points out, the experience of Telcordia Technologies (formerly Bell Communications Research) is that pairwise coverage is sufficient for good code coverage and checking the interactions of system functions.

3 Constraint-Based Approaches

In this section we explore some models that exploit non-binary (global) constraints, redundant modelling, channelling constraints, and other features of Constraint Programming (CP). Many scheduling, assignment, routing and other decision problems can be efficiently and effectively solved by CP models consisting of matrices of decision variables (so-called "matrix models" [8]). We can model the problem of generating test vectors using multiple matrix models. Without loss of generality, in what follows we assume for clarity that we have a Boolean alphabet Z_2 = {0,1}.

3.1 A Naive Matrix Model

As an example, consider generating test vectors for all triples of 5 Boolean parameters (t = 3, k = 5, g = 2). The matrix in Figure 1 is a solution to this Boolean covering test problem, in which b = 10. Note that we highlight all possible combinations of 0 and 1 in the first three columns; this property holds for any triple of columns.

1 2 3 4 5
0 0 0 0 0
0 0 0 1 1
0 0 1 0 1
0 1 0 0 1
0 1 1 1 0
1 0 0 0 1
1 0 1 1 0
1 1 0 1 0
1 1 1 0 0
1 1 1 1 1

Fig. 1. A solution to the example.

A natural way to model the problem would be to introduce a k×b matrix of Boolean variables. However, we find it hard to express the coverage constraints, that is, that every t parameters get combined in all possible 2^t ways. For every t parameters in each row we introduce a Boolean variable for each combination that is set to true whenever these t parameters cover that particular combination, by means of reification constraints. We then impose the constraint that each combination should occur at least once, using a sum constraint over the auxiliary Boolean variables. Unfortunately, posing the coverage constraints on this matrix of decision variables introduces too many auxiliary variables and reification constraints.
Furthermore, such a way of enforcing the coverage constraints makes constraint propagation inefficient and ineffective. We therefore need a different model where such coverage constraints can easily be expressed and propagated efficiently.

3.2 An Alternative Matrix Model

In our previous example, there are (5 choose 3) = 10 triples of the original parameters:

T = {<1,2,3>, <1,2,4>, <1,2,5>, <1,3,4>, <1,3,5>, <1,4,5>, <2,3,4>, <2,3,5>, <2,4,5>, <3,4,5>}

We can exploit an alternative viewpoint of the problem to concisely express the covering constraints. We again introduce a matrix of decision variables. The b rows in this matrix represent a possible setting of the parameters. Each column, however, represents one of the possible t-combinations (in T in our example). The domain of each variable is {0,...,2^t-1}, or {0,...,7} in this example.

In this new matrix, every entry is a problem variable that column-wise represents the above parameter triples T, starting from left to right. The value 0 in this matrix stands for the value combination <0,0,0>, 1 for <0,0,1>, and so on.

Coverage Constraints. Using the alternative matrix model, we can easily express the coverage constraints with the help of global cardinality constraints [19]. For each column we must guarantee that each value occurs at least once and at most b-2^t+1 times. This ensures that we cover all possible values of any t parameters.

Intersection Constraints. Because the variables in the first and the second column share digit positions 1 and 2 in the test vectors, the parameter values (0 or 1) in these positions should be the same. With the alternative model, we introduce the burden of expressing such intersection constraints. So for every row r and every two columns c1 and c2, if the two columns share some positions then we state a binary constraint between the variables (r,c1) and (r,c2) in the alternative matrix. For instance, for each row r the constraint between every two variables M[r,<1,2,3>] and M[r,<1,2,4>] in columns 1 and 2 can be expressed extensionally as follows:

{<0,0>, <0,1>, <1,0>, <1,1>, <2,2>, <2,3>, <3,2>, <3,3>, <4,4>, <4,5>, <5,4>, <5,5>, <6,6>, <6,7>, <7,6>, <7,7>}

<1,2,3> <1,2,4> <1,2,5> <1,3,4> <1,3,5> <1,4,5> <2,3,4> <2,3,5> <2,4,5> <3,4,5>
   0       0       0       0       0       0       0       0       0       0
   0       1       1       1       1       3       1       1       3       3
   1       0       1       2       3       1       2       3       1       5
   2       2       3       0       1       1       4       5       5       1
   3       3       2       3       2       2       7       6       6       6
   4       4       5       4       5       5       0       1       1       1
   5       5       4       7       6       6       3       2       2       6
   6       7       6       5       4       6       5       4       6       2
   7       6       6       6       6       4       6       6       4       4
   7       7       7       7       7       7       7       7       7       7

Fig. 2. The same solution as in Figure 1 but presented as an alternative matrix.

Note that the set of allowed tuples for such binary constraints differs depending on the type of the intersection. The tightness of such constraints [26] also varies. For example, the tightness of the previous constraint is 0.25. Clearly, the tightness of the intersection constraints increases as the number of digit intersections decreases. There are situations when we have only one digit in common between a pair of variables (for instance, variables in the first column and variables in the 10th column). In that case, the constraint tightness is 0.5. Note also that we have b such constraints for every pair of tuples that intersect in at least one digit position.

3.3 An Integrated Model

In the naive matrix model we find it difficult to express the coverage constraints in such a way that we can reason efficiently and effectively about them. This is not the case with the alternative matrix model, where we can use global cardinality constraints for which efficient propagation algorithms exist [19]. The downside is that we have to explicitly express the intersection constraints. In order to benefit from the effectiveness of each
model, we propose integrating them by channelling the variables of the participating models. The disadvantages of this integration are the increased number of variables and the additional channelling constraints to be processed. The advantage is, however, that we can easily state all problem constraints. We enforce the intersection constraints on the alternative matrix by simply channelling into the first matrix, benefiting at the same time from the efficient global cardinality constraints in the alternative model. The channelling constraints associate each variable in the alternative matrix with t corresponding variables in the first matrix. The idea is to associate each possible way of combining the t parameters with a different value. For instance, if t = 3 and the alphabet is binary, then we constrain each variable ABC in the alternative matrix with its t corresponding parameters A, B, C as follows:

<ABC,A,B,C> in {<0,0,0,0>, <1,0,0,1>, <2,0,1,0>, <3,0,1,1>, <4,1,0,0>, <5,1,0,1>, <6,1,1,0>, <7,1,1,1>}

So for any t-covering we have (k choose t) × b constraints of this type, and the arity of each constraint is t+1.

3.4 Symmetry

A common pattern in matrix models is row and column symmetry [7]. A matrix has row symmetry and/or column symmetry when, in any (partial) assignment to the variables, the rows and/or columns can be swapped without affecting whether or not the (partial) assignment satisfies the constraints. Clearly, any permutation of test vectors in a (non-)solution gives a symmetric (non-)solution. This means that the rows of our matrix models are indistinguishable and hence symmetric [7]. However, it is not trivial to see if the naive or the alternative matrix has column symmetry. The alternative matrix has no column symmetry while the naive (original) matrix does. Indeed, in the alternative matrix we associate each element with a particular combination of t parameters, whereas in the naive matrix we do not distinguish where we project columns from as long as we make sure all parameter combinations are covered.

In Figure 3 the naive matrix (b) is the result of swapping columns 1 and 2 of the naive matrix (a). Both matrices represent symmetric solutions that correspond to CAN(2,3,2). It is easy to see that, because of the properties of the covering constraints that enforce every combination to occur at least once, such column swaps do not affect whether or not the (partial) assignment satisfies the constraints. Thus the naive matrix also has column symmetry. The counterpart of such column symmetry in the alternative matrix is a complex combination of partial column symmetry and value symmetry among some variables. Note that the result of swapping the columns 1 and 2 in the naive model matrix (Figure 3) corresponds to the swap of columns 2 and 3 in the alternative matrix (c) and the application of the value symmetry that maps 0 to 0, 1 to 2, 2 to 1, and 3 to 3 to the variables of column 1 in (c), which results in matrix (d).

Thus, the naive matrix exhibits row and column symmetry, while the alternative matrix exhibits row symmetry and a complex form of symmetry (equivalent to the column symmetry in the naive matrix), but not column symmetry. To break row and column symmetry in the naive matrix, we can order the rows and the columns lexicographically [7] using lexicographic ordering constraints [10]. Lexicographic ordering is a total order. Thus, by posing such an ordering constraint between consecutive rows (columns), we break all row (column) symmetry [7]. Whilst it is easy to break all symmetry in one dimension of the matrix, breaking symmetry in both
dimensions is harder, as the rows and columns intersect. After constraining the rows to be lexicographically ordered we distinguish the columns, thus the columns are no longer symmetric. Nevertheless, given a matrix with row and column symmetry, each symmetry class has at least one element where both the rows and columns are lexicographically ordered. Unfortunately, more than one element where both the rows and columns are lexicographically ordered may exist [7], so we cannot break all row and column symmetry. The lexicographic ordering constraint is linear in the size of the vector and it maintains generalized arc consistency.

Fig. 3. Symmetric solutions corresponding to CAN(2,3,2) represented as naive matrices (a, b) and as alternative matrices (c, d).

3.5 A Model for Local Search

Constraint solvers typically alternate variable assignment with constraint propagation; when propagation leads to an empty variable domain, backtracking occurs. An alternative way of finding solutions to constraint problems is local search. Usually starting from a randomly chosen assignment of all variables, single variables (or sometimes more than one) are selected and reassigned to a different value, each reassignment being a local move. The choice of variable and value is made heuristically, with no attempt to maintain completeness of search. This is in contrast to backtrack search, which is complete and can therefore find all solutions, or prove that no solutions exist. The advantage of local search is that it can sometimes solve much larger problems than backtrack search. We decided to evaluate local search on our problem. We chose the Walksat algorithm, which has been successful on many problems and is publicly available. Walksat has several variants, and after some experimentation we selected the G variant [21], modified to break ties by preferring the variable that was flipped least recently (a well-known heuristic for improving search diversification). Walksat operates on Boolean satisfiability (SAT) models, so we must first SAT-encode our problem. However, the best model for local search is not necessarily the best model for backtrack search [18]. Our SAT model is therefore not identical to the integrated matrix model.

As before we define a k×b matrix M of integers in Z_g. For each row i, column j and value s define a Boolean variable m_ijs which is true if s occurs in position (i,j) and false otherwise. We also define the alternative (k choose t)×b matrix A of integers in Z_{g^t}. For each row i', column j and value s' define a Boolean variable a_i'js'. In the following constraints 1 <= i <= k, 1 <= i' <= (k choose t), 1 <= j <= b, s ranges over Z_g and s' over Z_{g^t}. Each M and A position must take exactly one symbol:

    OR_s  m_ijs                       (1)
    not m_ijs  OR  not m_ijs'         (2)
    OR_s' a_i'js'                     (3)
    not a_i'js'  OR  not a_i'js''     (4)

where s < s' in (2) and s' < s'' in (4). The coverage constraints are:

    OR_j  a_i'js'                     (5)

To channel between the two matrices we infer the values of the t entries in M for the corresponding A entries:

    not a_i'js'  OR  m_ijs            (6)

for all i, i', j, s, s' such that M_ij = s and A_i'j = s' do not conflict. We refer to our SAT model as the weakened matrix model because it omits several constraints, as follows. Firstly, the upper bound on the coverage constraints is hard to express in SAT. This is an implied constraint, and though implied clauses sometimes aid local search [2,14] they are not a necessary part of the model. Secondly, symmetry breaking constraints can have
a negative effect on local search performance [18]. Omitting them aids local search by increasing the number of SAT solutions, and also by reducing the size of the model and thus improving the flip rate (number of local moves per second). We therefore omitted upper bound and symmetry breaking constraints from our encoding. The third difference is perhaps less obvious. When applying local search to a SAT-encoded CSP it is common to omit clauses ensuring that each CSP variable is assigned no more than one domain value [22], again improving performance. A CSP solution can still be extracted from a SAT solution by taking any one of the assigned domain values for each CSP variable. Here we may omit clauses (1,3,4). Note that we can still extract a CSP solution from any SAT solution: by clauses (5), in any SAT solution each combination of symbols occurs in at least one row of A for each combination of t columns; by clauses (6), each such occurrence induces the corresponding entries in M; and by clauses (2), no more than one value is possible in each M position. In fact the omitted clauses (1,3,4) are implied by clauses (2,5,6), and experiments suggest that omitting them makes little difference to the search effort. It reduces the size of the encoding but not its space complexity, which is dominated by the channelling constraints and is O((k choose t)·b·t·g^t) literals.

3.6 Summary of the Models

We consider four matrix models for the covering test problem:

– The Naive Matrix Model. This model compactly represents the problem. However, it is difficult to express the coverage constraints in such a way that we can efficiently reason about them. This matrix has both row and column symmetry, which we can efficiently and effectively reduce using lexicographic ordering constraints.
– The Alternative Matrix Model. This model overcomes the disadvantages of the previous model by the use of powerful global cardinality constraints. However, this comes at the cost of introducing the burden of expressing intersection constraints. This matrix has row symmetry that we can reduce using lexicographic ordering constraints. It has another complex form of symmetry that we do not know how to break efficiently and effectively.
– The Integrated Matrix Model. This model is an attempt to combine the complementary strengths of both models. The coverage constraints are stated using the global cardinality constraints, while the intersection constraints become redundant with the channelling constraints. We use the symmetry breaking constraints of the naive model as they are very efficient and effective. The overhead of this integrated model is the increased number of variables and the additional channelling constraints.
– The Weakened Matrix Model. This is a modification of the integrated matrix model, designed for use with a SAT local search algorithm. It omits several constraints with the aim of increasing the number of SAT solutions and reducing runtime overheads.

4 Extensions
is{0,...,8}.The channelling constraints between AB in the alternative matrix and its2corresponding parameters A and B in thefirst matrix become as follows:AB,A,B ∈{ 0,0,0 , 1,0,1 , 2,0,2 , 3,1,0 , 4,1,1 , 5,1,2 , 6,2,0 ,7,2,1 , 8,2,2 }–Heterogeneous Alphabets.The model can easily be extended to allow heterogeneous alphabets.The domains of the variables as well as the chan-nelling constraints need to be slightly changed to reflect this extension,but the essence of the models remains the same.–Partial Coverage.To allow for partial coverage,we simply exclude those values that represent the combinations that need not appear in a solution from the global cardinality constraints.–Side Constraints.Covering array problems can come with side constraints such asfixed columns or forbidden configurations[11].CP is convenient for solving problems with such constraints,which can simply be added to the model.5ExperimentsTo evaluate the different models we ran a small set of experiments for a given alphabet,coverage strengths,and various parameter numbers k.First we re-port on backtracking experiments using a Pentium IV1800MHz512MB RAM machine running Ilog Solver6.0.In our experiments we used instances of the covering test problem with cov-erage strengths t of3and4over a Boolean alphabet Z2={0,1}.In each exper-iment we vary the size k of parameter vectors.Our initial experiments with the naive model showed that it was very inefficient and always outperformed by the other models.For this reason,we decided to exclude it from further analysis.When using the alternative model we can only break row symmetry.How-ever,with the integrated model we can break both row and column symmetry. Furthermore,we can break row symmetry either on the original or on the al-ternative matrix(not both),whereas column symmetry can be eliminated only on the original matrix.Our experiments demonstrated that the best strategy in terms of the amount of search and runtime,when using the integrated model,is to break row symmetry using the alternative matrix.In the experiments we applied four different labeling strategies:–sdf-row:Group the variables by rows from top to bottom,and for each row label the variable that has the smallest domainfirst.Assign the values in the lexicographic order;–sdf-col:Group the variables by columns from left to right,and for each column label the variable that has the smallest domainfirst.Assign the values in the lexicographic order;–lex-row:Group the variables by rows from top to bottom,and label each row lexicographically.Assign the values in the lexicographic order;–lex-col:Group the variables by columns from left to right,and label each column lexicographically.Assign the values in the lexicographic order. 
Experiments that we ran to determine CAN(3,k,2)for varying k showed that the best labeling heuristics were lex-col and sdf-col and that lex-col outperformed sdf-col and the other labeling heuristics on bigger instances.For example,using the alternative model with a time limit of5minutes,only lex-col could determine CAN(3,8,2).Using the integrated model lex-colfinds CAN(3,11,2)in about 141seconds,sdf-col in281seconds,whereas lex-row and sdf-row cannotfind a solution.Tables1and2display the results of the experiments in more detail.In the tables we use bold face to highlight the best result so far,whereas a star symbol means that the respective value is provably optimal.Our results also show that the integration of the different models is beneficial despite the increase in the number of variables.For instance,the best integrated model found the optimal value for CAN(3,8,2)in around22seconds while the best alternative model in around265seconds.Note also that our results use the symmetry breaking constraints in all tested models.In fact,the symmetry breaking constraints play a vital part in the alternative and in the integrated models.For example,with the integrated model when we are solving the problem for k=5and b=10using lex-col labeling strategy together with row and column symmetry breaking we Table 1.Alternative Model:Finding b min=CAN(3,k,2)for different number of parameters k using the alternative model.Upper bounds UB on CAN(3,k,2)are taken from[11]for comparison.The runtime limit is5minutes,using lex-col(or sdf-col)as labeling heuristics.k b Upper bound in[11]runtime(sec)soluble no.of fails no.of choice points 48*80.01+282958120.03–323159120.19–161160510*120.35+27628361012 4.39–96596461112–(32.11)––(6197)–(6196)612*1225.20+50035013712*13128.34+95759591812*13264.77+95759592。