Semantic and Syntactic Interoperability in Transactional Systems
经典深度学习(PPT136页)

目标:. 确保美国的世界领导地位 . 优先投资下一代人工智能技术
1. 推动以数据为中心的知识发 现技术
2. 增强AI系统的感知能力
3. 理论AI能力和上限
4. 通用AI 5. 规模化AI系统
6. 仿人类的AI技术 7. 研发实用,可靠,易用的机
器人 8. AI和硬件的相互推动
• 提升机器人的感知能力,更智能的同复 杂的物理世界交互
1. AI系统从设计上需要符合人 类的道德标准:公平,正义, 透明,责任感
2. 构建符合道德的AI技术
3. 符合道德标准的AI技术的实 现框架
• 两层架构: 由一层专门负责道德建设 • 道德标准植入每一个工程AI步骤
4th November 2016
策略 - IV: 确保人工智能系统的自身和对周围环境安全性
1. 推动以数据为中心的知识发 现技术
2. 增强AI系统的感知能力
3. 理论AI能力和上限 4. 通用AI
• 目前的AI系统均为窄人工智能, “Narrow AI”而不是“General AI”
• GAI: 灵活, 多任务, 有自由意志,在 多认知任务中的通用能力(学习能力, 语言能力,感知能力,推理能力,创造 力,计划,规划能力
• AI系统的自我解释能力 • 目前AI系统的学习方法:大数据,黑盒 • 人的学习方法:小数据,接受正规的指
导规则以及各种暗示 • 仿人的AI系统,可以做智能助理,智能
辅导
4th November 2016
策略- I : 在人工智能研究领域做长期研发投资
目标:. 确保美国的世界领导地位 . 优先投资下一代人工智能技术
语音识别,口音,儿童语音识别,受损 语音识别,语言理解,对话能力
翻译理论知识

《翻译理论与实践》考试理论部分复习提纲一、翻译定义:1. 张培基——翻译是用一种语言把另一种语言所表达的思维内容准确而完整地重新表达出来的语言活动。
?3. 刘宓庆——翻译的实质是语际的意义转换。
?4. 王克非——翻译是将一种语言文字所蕴含的意思用另一种语言文字表达出来的文化活动。
5. 泰特勒——好的翻译应该是把原作的长处完全地移注到另一种语言,以使译入语所属国家的本地人能明白地领悟、强烈地感受,如同使用原作语言的人所领悟、所感受的一样。
?6. 费道罗夫——翻译就是用一种语言把另一种语言在内容与形式不可分割的统一中所有已表达出来的东西准确而完全地表达出来。
?7. 卡特福德——翻译的定义也可以这样说:把一种语言(Source Language)中的篇章材料用另一种语言(Target Language)中的篇章材料来加以代替。
8.奈达——翻译就是在译入语中再现与原语信息最切近的自然对等物,首先就意义而言,其次就是文体而言。
“Translating consists in reproducing in the receptor language the closest natural equivalent of the source language message, first in terms of meaning and secondly in terms of style.” ---Eugene Nida纽马克——通常(虽然不能说总是如此),翻译就是把一个文本的意义按作者所想的方式移译入另一种文字(语言)。
“Translation is a craft consisting in the attempt to replace a written message and/or statement in one language by the same message and/or statement in another language.” --- Peter Newmark10. “Translation is the expression in one language (or target language译入语) of what has been expressed in another language (source language 原语), preserving semantic and stylistic equivalences.” --- Dubois12. 13.Translation or translating is a communicative activity or dynamic process in which the translator makes great effort to thoroughly comprehend a written message or text in the source language and works very hard to achieve an adequate or an almost identical reproduction in the target language version of the written source language message or text.二、翻译标准1. 翻译的标准概括为言简意赅的四个字:“忠实(faithfulness)、通顺(smoothness)”。
语言学重点难点

一、语言和语言学1、语言的区别性特征:Design of features of language任意性 arbitrariness 指语言符号和它代表的意义没有天然的联系二重性 duality 指语言由两层结构组成创造性 creativity 指语言可以被创造移位性 displacement 指语言可以代表时间和空间上不可及的物体、时间、观点2、语言的功能(不是很重要)信息功能 informative人际功能 interpersonal施为功能 performative感情功能 emotive function寒暄功能 phatic communication娱乐功能 recreational function元语言功能 metalingual function3、语言学主要分支语音学 phonetics 研究语音的产生、传播、接受过程,考查人类语言中的声音音位学 phonology 研究语音和音节结构、分布和序列形态学 morphology 研究词的内部结构和构词规则句法学 syntax 研究句子结构,词、短语组合的规则语义学 semantics 不仅关心字词作为词汇的意义,还有语言中词之上和之下的意义。
如语素和句子的意义语用学 pragmatics 在语境中研究意义4、宏观语言学 macrolingustics心理语言学 psycholinguistics 社会语言学 sociolinguistics 人类语言学 anthropological linguistics 计算机语言学 computational linguistics5语言学中的重要区别规定式和描写式:规定式:prescriptive说明事情应该是怎么样的描写式:descriptive 说明事情本来是怎么样的共时研究和历时研究:共时:synchronic 研究某个特定时期语言历时:diachronic 研究语言发展规律语言和言语:语言:langue指语言系统的整体言语:parole指具体实际运用的语言语言能力和语言运用:乔姆斯基(chomsky提出)能力:competence用语言的人的语言知识储备运用:performance 真实的语言使用者在实际中的语言使用二、语音学1、语音学分支发音语音学articulatory phonetics研究语言的产生声学语言学acoustic phonetics 研究语音的物理属性听觉语音学 auditory phonetics 研究语言怎样被感知2 IPA(国际音标)是由daniel Jones琼斯提出的三、音位学1、最小对立体minimal pairs2、音位 phoneme3 音位变体 allophones4 互补分布 complementary distribution5 自由变体 free variation6 区别特征 distinctive features7 超音段特征 suprasegmental feature音节 syllable 重音stress 语调tone 声调intonation四形态学1 词的构成语素morpheme 自由语素free morpheme 粘着语素bound morphemeRoot 词根词缀affix 词干stem屈折词汇和派生词汇 inflectional affix and derivational affix2特有的词汇变化lexical change proper新创词语invention 混拼词blending 缩写词abbreviation首字母缩写词 acronym 逆构词汇back-formation例:editor—edit类推构词analogiacal creation 例:work-worked,,slay-slayed外来词 borrowing五句法学1 范畴category 数number 性gender 格case 时tense 体aspect一致关系concord 支配关系govenrment2 结构主义学派the structure approach组合关系 syntagmatic relation词和词组合在一起聚合关系 paradigmatic 具有共同的语法作用的词聚在一起结构和成分 construction and constituents :句子不仅是线性结构liner structure还是层级结构hierarchical structure (句子或短语被称为结构体,而构成句子或短语即结构体的称为成分) 3直接成分分析法 immediate constitutional analysis指把句子分成直接成分-短语,再把这些短语依次切分,得到下一集直接成分,这样层层切分,直到不能再分4向心结构和离心结构endocentric and exocentric constructions向心:指一个结构中有中心词,例an old man ,中心为man离心:指结构中没有明显的中心词。
1.第一章作业答案

Language1.Fill in the blanks with the proper words.(1)Performative function means language can be used to “do” things.(2)Emotive/expressive function means the use of language to reveal something about the feelings and attitudes of the speaker.(3)Most imperative sentences are associated with conative/directive function.(4)The sentence “What‟s it like?” shows interrogative function.(5)Greetings shows phatic function.(6)“We are most grateful for this.”shows expressive function.(7)Propaganda shows evocative/recreational function.(8)Displacement refers to contexts removed from the immediate of the speaker.(9)Halliday‟s metafunctions include Ideational, Interpersonal and Textual functions.(10)Linguistics should include at least five parameters:_phonological, morphological, syntactic, semantic and pragmatic.2. True or false questions (If it is false correct it)(1)Language distinguishes us from animals because it is far more sophisticated than any animal communication system.T(2)There is not a certain degree of correspondence between the sequence of clauses and the actual happenings. F The order of clauses are always matching the actualsequence of happenings.(3)The theories discussed in the textbook about the origins of language are not at most a speculation猜测.F(4)The definition, “Language is a tool for human communication.” has no problem. (F) The definition just focuses on the function of language, which does notinclude the nature and features of language.(5)The definition, “language is a set of rules”, tells nothing about its functions.TLinguistics1. Explain the following definition of linguistics: Linguistics is the scientific study of language.The word “language” preceded by the zero article in English implies that linguistics studies not any particular language, but languages in general. The word …study‟ does not mean …learning‟, but …investigation‟ or …examination‟. The word …scientific‟ refers to the way in which language is studied. A scientific study is one based on the systematic investigation of data, conducted with reference to some general theory of language structure. The process of linguistic study can be summarized as follows:First, certain linguistic facts are observed, which are found to display some similarities, and generalizations are made about them; next, based on these generalizations, hypotheses are formulated to account for these facts; then, the hypotheses are tested by further observations; finally a linguistic theory is constructedabout what language is and how it works.2. What are the major branches of linguistics? What does each of them study? What makes modern linguistics different from traditional grammar? Point out three aspects.The major branches of linguistics and their study on PP. 17-20.It is generally believed that the beginning of modern linguistics was marked by the publication of F. de Saussure‟s Course in General Linguistics in 1919. Before that language had been studied for centuries, called “traditional grammar”. Modern linguistics differs from traditional grammar in several basic ways:Firstly, modern linguistics is descriptive while traditional grammar is prescriptive. (PP.23-24)Second, modern linguistics regards the spoken language as primary, not the written. Traditional grammar tended to emphasize the importance of the written word. Before the invention of sound recording, it was difficult for people to deal with utterances which existed only for seconds.Thirdly, modern linguistics differs from traditional grammar also in that it does not force languages into a Latin-based framework. In the past, it was assumed that Latin provides a universal framework into all languages fit. As a result, other languages were forced to fit into the Latin patterns and categories. Modern linguistics try to set up a universal grammar based on the features shared by most of the languages used by human beings.3. Is modern linguistics mainly synchronic or diachronic? Why?After Saussure‟s distinction between synchronic and diachronic study, a synchronic approach seems to enjoy priority over a diachronic study. Saussure postulated the priority of synchrony: no knowledge of the historical development of a language is necessary to examine its present system. He arrived at this radical viewpoint due to his conviction that linguistic research must concentrate on the structure of language.4. Which enjoys priority in modern linguistics, speech or writing? Why?Speech and writing are two major media of linguistic communication. Modern linguistics regards speech as the primary one for some reasons. From the point of view of linguistic evolution, speech is prior to writing. The writing system is to record speech. Even today, there are some tribes without writing system. From the view of children‟s development, children acquire his mother tongue before they learn to write.5. How is Saussure‟s distinction between langue and parole similar to Chomsky‟s distinction between competence and performance?The meaning of langue and parole (PP.24-25)The meaning of competence and performance (PP.25-26)Among the two pairs of distinction, parole and performance have a lot in common; they are all the actualization or realization of the abstract language system or knowledge. But langue and competence are similar in one aspect, that is, they allrefer to the constant which underlies the utterances that constitute parole or performance. Their differences are obvious. Langue is a social property while competence is a mental one; langue is a set of conventions while competence is a form of …knowing‟.6. What characteristics of language do you think should be included in a good, comprehensive definition of language?A good and comprehensive definition of language should at least include the following points: the structures of language, the features of language and the functions of language. (cf. exercises in Chapter one)7. What features of human language have been specified by C. Hockett to show that it is essentially different from any animal communication system?8. What is the main task for a linguist? State the importance of linguistics.The main task of a linguist is to discover the nature of the underlying language system, such as how each language is conducted, how it is used by its speakers, and how it is related to other languages, etc.According to your understanding, illustrate the importance of the subject.9. Why is “duality” regarded as an important feature of human language?According to Lyons (1982: 20), by duality is meant the property of having two levels of structures, such that units of the primary level are composed of elements of the secondary level and each of the two levels has its own principles of organization. The feature implies that a large number of meaningful elements are made up of a conveniently small number of meaningless but message-differentiating elements, for instance, a small number of sounds can be grouped and regrouped into a large number of units of meaning and the units of meaning can be arranged and rearranged into an infinite number of sentences. It is the feature of duality that makes language productive or creative. Duality also implies that language is hierarchical (PP. 6-7). 10. Fill in the right word according to the explanations.(1)Linguistics is the scientific study of language.(2)Sense study is the study of the interlinguistic relationships among different linguistic elements of language.(3)Universal grammar is the study of universal features of language(4)Synchronic study is the study of a particular language at the particular point of time. (5)Syntax_ is the study of the structure and both the syntactic and semantic rules of a language(6)Prescriptive study is the study of the rules or principles prescribed for people to follow when they use a language.(7)Macrolinguistics is the study of language is relation to other sciences(8)Transformational-Generative grammar is the study of the nature of human language and the human mind through the study of the U.G.11.True or false questions(1)Sociolinguistics relates the study of language to Psychology.F(2)In modern linguistics, synchronic study seems to enjoy priority over diachronic study. T (3)In the past, traditional grammarians tended to over-emphasize the importance of the written word. T(4)Parole is relatively stable and systematic while langue is subject to personal and situational constraints. F(5)Performance is the actual realization of this knowledge in linguistic communication. T (6)Saussure‟s distinction took a sociological view of language and his notion of langue is a matter of social conventions.T(7)Early grammars were based on “high”(religious, literary)written language.T(8)The study of language as a whole is often called applied linguistics.F(9)To explain what language is seems to be a naïve and simple question.F(10)Language bears certain features distinguishing it from means of communication other forms of life may possess, such as bird songs and bee dances.T(11)Competence and performance refer respectively to a language user‟s underlying knowledge about the system of rules and the actual use of language in concrete situations.T(12) Language is productive in that it makes possible the construction and interpretation of newsignals by its users. T(13) Competence and performance were proposed by the famous linguist Saussure.F(14) Syntax is a branch of linguistics that studies how words are combined to form sentencesand the rules that govern the formation of sentences. T(15) If a linguistic study aims to describe and analyze the language people actually use, it issaid to be prescriptive. F(16) Langue and parole were proposed by the famous linguist Saussure. T(17) Design features refer to the defining properties of human language that distinguish it fromany animal system of communication. T(18) Onomatopoeia indicates a non-arbitrary relationship between form and meaning. T。
词汇语义与句法结构BethLevin原著

词汇语义与句法结构Beth Levin原著Malka Rappaport Hovav詹卫东编译[译者按] 原文是The Handbook of Contemporary Semantic Theory, Shalon Lappin, ed. Oxford: Blackwell, 1996,一书中的第18章,Chapter 18, Lexical Semantics and Syntactic Structure。
Beth Levin是美国西北大学语言学系副教授。
她的研究集中在动词意义的词汇表示,以及词汇语义学、句法学和形态学之间的交互作用方面。
她是English Verb Classes and Alternations: A Preliminary Investigation (University of Chicago Press, 1993) 一书的作者,此外她跟Malka Rappaport Hovav合著过Unaccusativity: At the Syntax-Lexical Semantic Interface (MIT Press, 1995)。
在到西北大学工作前,她曾经是MIT认知科学中心的Lexicon工程的主要负责人。
Malka Rappaport Hovav是美国Bar-Ilan大学英语系副教授。
她从1984年起一直在那里工作。
她还是MIT认知科学中心Lexicon工程的研究人员。
她的主要研究兴趣是词汇语义学和形态学,以及上述两个领域跟句法之间的交互关系。
过去的10年是词汇语义学迅速发展的一个时期。
这也部分地导致了当前许多句法理论所普遍接受的一个假设:一个句子的句法性质的诸多方面都是由句中谓词(predicator)1的意义决定的(有关讨论可以参见Wasow 1985)。
在决定句子的句法结构方面,意义所扮演的角色,最引人注目的示例来自谓词论元的句法表达的规律性上。
我们把这叫做“linking regularities”(这个说法来自Carter 1988)。
Simple OML

Conceptual Knowledge Markup Language: The Central CoreRobert E. KentTOC (The Ontology Consortium)550 Staley Dr.Pullman, WA 99163, USArekent@ABSTRACTThe conceptual knowledge framework OML/CKML needs several components for a successful design (Kent, 1999). One important, but previously overlooked, component is the central core of OML/CKML. The central core provides a theoretical link between the ontological specification in OML and the conceptual knowledge representation in CKML. This paper discusses the formal semantics and syntactic styles of the central core, and also the important role it plays in defining interoperability between OML/CKML, RDF/S and Ontolingua.OVERVIEWThe OML/CKML pair of languages is in various senses both description logic based and frame based.A bird’s eye view of the architectural structure of OML/CKML is visualized in Figure 1.• CKML:knowledge framework for the representation ofphilosophy of Conceptual Knowledge Processing(CKP) (Wille, 1982; Ganter and Wille, 1989), aprincipled approach to knowledge representationand data analysis that “advocates methods andinstruments of conceptual knowledge processingwhich support people in their rational thinking, judgment and acting and promote critical discussion.” The new version of CKML continues to follow this approach, but also incorporates various principles, insights and techniques from Information Flow (IF), the logical design of distributed systems (Barwise and Seligman, 1997). This allows diverse communities of discourse to compare their own information structures, as coded in ontologies, logical theories and theory interpretations, with that of other communities that share a common terminology and semantics.Beyond the elements of OML, CKML also includes the basic elements of information flow: classifications, infomorphisms, theories, interpretations, and local logics. The latter elements are discussed in detail in a future paper in preparation on the CKMLFigure 1: OML/CKML at a glanceknowledge model. Being based upon conceptual graphs, formal concept analysis, and information flow, CKML is closely related to a description logic based approach for modeling ontologies. Conceptual scaling and concept lattice algorithms correspond to subsumption.• OML: This language represents ontological and schematic structure. Ontological structure includes classes, relationships, objects and constraints. How and how well a knowledge representation language expresses constraints is a very important issue. OML has three levels for constraint expression as illustrated in Figure 1.5:o top – sequents o intermediate – calculus of binary relations o bottom – logical expressionsThe top level models the theory constraints of informationimportance of binary relation constraints and the categorytheoretic orientation of the classification-projection semantics in the central core, and the bottom level corresponds to theconceptual graphs knowledge model with assertions (closed expressions) in exact correspondence with conceptual graphs.• Simple OML: This language is intended for interoperability. Simple OML was designed to provide the closest approach within OML to RDF/S, while still remaining in harmony with the underlying principles of CKML. In addition to the central core of CKML, Simple OML represents functions, reification, cardinality constraints, inverse relations, and collections. This paper shows how the first-order form of Simple OML is closely related to the Resource Description Framework with Schemas (RDF/S), and how the higher-order form of Simple OML is intimately related to XOL (XML-Based Ontology Exchange Language), an XML expression of Ontolingua with the knowledge model of Open Knowledge Base Connectivity (OKBC).• The Central Core: This is based upon the fundamental classification-projection semantics illustrated in Figure 2. The expression of types and instances in the central core is very frame-like. In contrast to the practical bridge of the conceptual scaling process, the central core provides a theoretical bridge between OML and CKML.〈 type(), ⊢〉 instance(Entity ) ⊨Entity1instance(target ) type(source )Figure 2: Classification Projection DiagramFigure 1.5: ConstraintsSEMANTICSClassification-Projection DiagramIn this section we define formal semantics for the fundamental classification-projection diagram illustrated by Figure 2. Figure 2 has two dimensions, the instance versus type distinction and the entity versus binary relation distinction. There are no subtype or disjointness constraints along either dimension. In Figure 2, arrows denote projection functions, lines denote classification relations, and type names denote higher order types (meta-types). Not visible in Figure 2 are the two entity types Object and Data. Object is the metaclass for all object types, whereas Data is the metaclass for all datatypes either primitive (such as strings, numbers, dates, etc.) or defined (such as enumerations). The Entity type is partitioned as a disjoint union or type sum, Entity = Object + Data , of the Object type and the Data type. So data values are on a par with object instances, although of course less complex.The top subdiagram of Figure 2 owes much to category theory and type theory. A category is defined to be a collection of objects and a collection of morphisms (arrows), which are connected by two functions called source (domain) and target (codomain). To complete the picture, the composition and identity operators need to be added, along with suitable axioms. Also of interest are the various operators from the calculus of binary relations (Pratt, 1992), such as residuation. The partial orders on objects and arrows represent the type order on entities and binary relations. The bottom subdiagram gives a pointed version of category theory, a subject closely related to elementary topos theory. The classification relation connects the bottom subdiagram (instances) to the top subdiagram (types), and represents the classification relation of Barwise's Information Flow (Barwise and Seligman, 1997).Core ConstraintsAssociated with the classification-projection diagram in Figure 2 are the following axiomatic properties. In the discussion below let r be a relation instance having source entity a and target entity b , let ρ be a relation type having source type α and target type β, and let σ be a relation type having source type γ and target type δ. This is symbolized in Table 1.• preservation of classification:r ⊨ ρ implies ( a ⊨ α and b ⊨ β )In words, if r is an instance of (classified as) type ρ, then entity a is an instance of type α and entity b is an instance of type β. As an example, the citizenship relation is from the type Person to the type Country. If c is an instance of citizenship, and c relates p to n , then p is an instance of type Person and n is an instance of type Country. symbol meaningρ : α → β∂0(ρ) = α, ∂1(ρ) = β σ : γ → δ∂0(σ) = γ, ∂1(σ) = δ r = (a , b )∂0(r ) = a , ∂1(r ) = b r = ρ(a , b ) ∂0(r ) = a , ∂1(r ) = b , r ⊨ ρTable 1: Relational types• preservation of entailment:σ⊢ρimplies( γ⊢αandδ⊢β)The authorship binary relation from type Person to type Book is a subtype of the creatorship binary relation from type Agent to type Work. If a man m is an author of a book b, then the agent m is a creator of the work b. The facts that type Person is a subtype of type Agent and type Book is a subtype of type Work may be necessary conditions for the subtype relation.• inclusion implies subtype:σ≤ρimpliesσ⊢ρThe motherhood binary relation on the type Person is a subtype of the parenthood binary relation on the type Person. If the woman w is the mother of a boy b, then w isa parent of b.• creation of incompatible types:(α, γ⊢orβ, δ⊢)impliesρ, σ⊢The sibling relation on type Person is disjoint from the employment relation from type Person to type Organization. This is implied by the fact that type Person is disjoint from type Organization. This seems to be true in general, both for the source and target projections.• creation of incoherent type:(α⊢orβ⊢)impliesρ⊢If a relation type is specified to have a source (or target) entity type that is later found to be incoherent, then the relation type is also incoherent.Core Type HierarchyThe elaboration of the classification-projection diagram as depicted in Figure 3 illustrates the concepts (basic types) in the central core knowledge model. This model renders more explicitly the connections found in the Core Grammar. As a rule of thumb, XML elements become entity types in the core knowledge model, and attributes and content nonterminals (child embeddings) of XML elements become functions and binary relations. In Figure 3 a type is depicted by a rectangle and an instance is depicted by a bullet. The generic classification and subtype hierarchies have not been included as types (rectangles), since their instances are not needed until the full CKML is specified. When more than one subrectangle (subtype) is present, the subtypes partition the supertype. Instances of core relations and functions are listed and grouped within their appropriate types. The signatures and constraints for the core binary relations and functions are listed in Table 2.Figure 3: Core Type HierarchyBinary Relationsclassification : Instance → Type= classification.BinaryRelation + classification.Entityclassification.BinaryRelation : Instance.BinaryRelation → Type.BinaryRelation classification.Entity : Instance.Entity → Type.Entity= classification.Object + classification.Dataclassification.Object : Instance.Object → Type.Objectsubtype : Type → Type= subtype.BinaryRelation + subtype.Entitysubtype.BinaryRelation : Type.BinaryRelation → Type.BinaryRelationsubtype.Entity : Type.Entity → Type.Entitycomment : Thing → StringFunctionssource.Type : Type.BinaryRelation → Type.Entitytarget.Type : Type.BinaryRelation → Type.Entitysource.Instance : Instance.BinaryRelation → Instance.Entitytarget.Instance : Instance.BinaryRelation → Instance.Entityname : Type → Stringid : Instance → StringTable 2: Core Signatures and ConstraintsCore GrammarBelow we list a grammar for the central core that is relation-centric on types and object-centric on instances. Except for the inclusion of function types and instances, this grammar closely models the classification-projection diagram in Figure 2.oml bracket rule[1] oml ::= ‘<OML>’ ontology | collection ‘</OML>’ontology type rules[2] ontology ::= ‘<Ontology>’ (ext | typ | axm)* ‘</Ontology>’[3] ext ::= ‘<extends’ ontologyAttr prefixAttr ‘/>’[4] typ ::= objType | binrelType | fnType[5] objType ::= ‘<Type.Object’ declTypeAttr ‘/>’[6] binrelType ::= ‘<Type.BinaryRelation’ declTypeAttr srcTypeAttr tgtTypeAttr ‘/>’[7] fnType ::= `<Type.Function’ declTypeAttr srcTypeAttr tgtTypeAttr '/>'[8] axm ::= ‘<subtype’ specificAttr genericAttr? ‘/>’collection instance rules[9] collection ::= ‘<Collection’ idAttr? ontologyAttr? ‘>’ inst* ‘</Collection>’[10] inst ::= objInst[11] objInst ::= ‘<Instance.Object’ idAttr? aboutAttr? ‘>’(classInst | binrelInst | fnInst)*‘</Instance.Object>’[12] binrelInst ::= ‘<Instance.BinaryRelation’ tgtInstAttr ‘>’classInst*‘</Instance.BinaryRelation>’[13] fnInst ::= ‘<Instance.Function’ tgtInstAttr ‘>’classInst*‘</Instance.Function>’[14] classInst ::= ‘<classification’ typAttr ‘/>’attribute rules[15] ontologyAttr ::= ‘ontology = "’ URI-reference ‘"’[16] prefixAttr ::= ‘prefix = "’ name ‘"’[17] declTypeAttr ::= ‘name = "’ name ‘"’[18] srcTypeAttr ::= ‘source.Type = "’ typeNSname ‘"’[19] tgtTypeAttr ::= ‘target.Type = "’ typeNSname ‘"’[20] specificAttr ::= ‘specific = "’ typeNSname ‘"’[21] genericAttr ::= ‘generic = "’ typeNSname ‘"’[22] typAttr ::= ‘type = "’ typeNSname ‘"’[23] tgtInstAttr ::= ‘target.Instance = "’ instanceNSname ‘"’[24] idAttr ::= ‘id = "’ name ‘"’[25] aboutAttr ::= ‘about = "’ URI-reference ‘"’basic XML rules[26] typeNSname ::= [ name ':' ] name[27] instanceNSname ::= [ typeNSname '#' ] name[28] URI-reference ::= string, interpreted per [URI][29] name ::= (any legal XML name symbol)[30] string ::= (any XML text, with "<", ">", and "&" escaped)As indicated in the XML specification document an attribute name must be of the following form. In particular, the ‘.’ is appropriate inside attribute names.NameChar ::= Letter | Digit | ‘.’ | ‘-‘ | ‘_’ | ‘:’ | CombiningChar | ExtenderName ::= (Letter | ‘_’ | ‘:’) (NameChar)*Core DTDThe elements, attributes and entities in the Core DTD below are tightly connected with the nonterminals and rules of the Core Grammar. The type elements are relation-centric (with respect to the subtype relation), whereas the instance elements are object-centric (with respect to the classification relation). The parameter entities OML:Type, OML:Axiom and OML:Instance represent in the DTD the “things” in the Core Type Hierarchy and Classification-Projection Diagram that are not represented by an XML tag. Parameter Entity Declarations<!-- rule [4] of the grammar --><!ENTITY % OML:Type“(OML:Type.Object| OML:Type.BinaryRelation| OML:Type.Function)”><!-- rule [8] of the grammar --><!ENTITY % OML:Axiom“(OML:subtype)”><!-- rule [10] of the grammar --><!ENTITY % OML:Instance“(OML:Instance.Object)”>Element Type Declarationsoml bracket element<!-- rule [1] of the grammar --><!ELEMENT OML:OML (OML:Ontology | OML:Collection)>central core ontology dtd<!-- rule [2] of the grammar --><!ELEMENT OML:Ontology (OML:Extends | &OML:Type; | &OML:Axiom;)*><!-- rules [3], [15], [16] of the grammar --><!ELEMENT OML:extends EMPTY><!ATTLIST OML:extendsontology CDATA #REQUIREDprefix CDATA #IMPLIED><!-- rules [5], [17] of the grammar --><!ELEMENT OML:Type.Object EMPTY><!ATTLIST OML:Type.Objectname CDATA #REQUIRED><!-- rules [6], [17], [18], [19] of the grammar --><!ELEMENT OML:Type.BinaryRelation EMPTY><!ATTLIST OML:Type.BinaryRelationname CDATA #REQUIREDsource.Type CDATA #REQUIREDtarget.Type CDATA #REQUIRED><!-- rules [7], [17], [18], [19] of the grammar --><!ELEMENT OML:Type.Function EMPTY><!ATTLIST OML:Type.Functionname CDATA #REQUIREDsource.Type CDATA #REQUIREDtarget.Type CDATA #REQUIRED><!-- rules [8], [20], [21] of the grammar --><!ELEMENT OML:subtype EMPTY><!ATTLIST OML:subtypespecific CDATA #REQUIREDgeneric CDATA #IMPLIED>central core collection dtd <!-- rule [9], [24], [15] of the grammar --><!ELEMENT OML:Collection (&OML:Instance;)*><!ATTLIST OML:Collectionid CDATA #IMPLIEDontology CDATA #IMPLIED><!-- rules [11], [24], [25] of the grammar --><!ELEMENT OML:Instance.Object(OML:classification | OML:Instance.BinaryRelation | OML:Instance.Function)*><!ATTLIST OML:Instance.Objectid CDATA #IMPLIEDabout CDATA #IMPLIED><!-- rules [12], [22], [23] of the grammar --><!ELEMENT OML:Instance.BinaryRelation (OML:classification)*><!ATTLIST OML:Instance.BinaryRelationtarget.Instance CDATA #REQUIRED><!-- rules [13], [22], [23] of the grammar --> <!ELEMENT OML:Instance.Function (OML:classification)*><!ATTLIST OML:Instance.Functiontarget.Instance CDATA #REQUIRED><!-- rules [14], [22] of the grammar --><!ELEMENT OML:classification EMPTY><!ATTLIST OML:classificationtype CDATA #REQUIRED>Higher-Order Entity TypesA first-order ontology is an ontology without higher-order types. In a first-order ontology the notions of instances and individuals coincide. Higher-order types are types that have other types as their instances. This means that instances can be either individuals or types. Individuals are instances that are not types. With higher-order types the classification relation extends to types on its source, and the source and target projection functions for individual relations also extended to types. Color is an example of a second-order type Color = { Red, Orange, Yellow, Green, Blue, Indigo, Violet }example from (Sowa, 1999), represents the Englishphrase a red ball . Here the characteristic relation (chrc ) links the concept of a ball to the concept of the redcolor [Color: Red] whose type label is the second-order type Color and whose referent is the first-order type Red. The conceptual graph maps to the following logical formula.Figure 4: higher-order type example(∃x:Ball)(color(Red) ∧ chrc(x,Red)).In the central core this can be represented as follows.<Ontology>• • •<Type.Object name=“Color”/><Type.Object name=“Red”/>• • •<classification instance=“Red” type=“Color”/>• • •<Type.Object name=“Ball”/><Type.BinaryRelation name=“chrc” source.Type=“Ball” target.Type=“Color”/></Ontology>/*specific style*/<Collection>• • •<Ball><chrc target.Instance=“Red”/></Ball>• • •</Collection>There are three things that are new here. An instance of the classification relation has been placed inside an ontology. The instance attribute of this classification refers to a type. The target attribute of the individual characteristic relation refers to a type.We may also be interested in representing various relationships between types. For example, an “argument” relation (own slot) is from an object type to a multivalent relation type having that object as one of its arguments. In particular, the “Cast” ternary relation type in a Movie ontology has the “Movie” object type as one of its arguments.<Ontology>• • •<Type.BinaryRelation name=“argument”source.Type=“Type.Object” target.Type=“Type.Relation”/>• • •/*specific style*/<argument source.Instance=“Movie” target.Instance=“Cast”/>• • •</Ontology>There is one thing that is new here. An instance of the argument relation has been placed inside an ontology. Both the source and target attributes refer to types.Figure 4 indicates how to extend the first-order classification-projection diagram of Figure 2 to higher-order entity types. As in the first-order case of Figure 2, the instance(BinaryRelation) metatype is the same as individual(BinaryRelation). However, the instance(Entity) metatype has changed to the sum Entity metatype, since object instances can be either individuals or types. The Entity metatype, representing entity instances, is the type sum (disjoint union) of its type and individual parts.Entity = type(Entity) + individual(Entity)instance(BinaryRelation)= individual(BinaryRelation)The entity classification relation has been extended to include types at its source. This means that we can classify types with other higher-order types, ad infinitem. The sourceand target of individual binary relations have also been extended to include types. Note that the individual(BinaryRelation ) metatype, along with its projection functions, correspond to frame-based own slots, whereas the type(BinaryRelation ) metatype,along with its projection functions, correspond to frame-based template slots (see the With Ontolingua subsection below).Higher-Order Relation TypesFigure 5 displays the classification-projection diagam for higher-order types, not only for entities but also for relations. This is a further extension of, and very similar to, the first-order classification-projection diagram of Figure 2. Here the instance(BinaryRelation ) metatype has changed to the sum BinaryRelation metatype, since relation instances can be either individuals or types. Since the BinaryRelation metatype is a type sum, the source and target functions are defined as copairings with the following definitions.source =[ type(source ) ◦ incl, individual(source ) ] target = [ type(target ) ◦ incl, individual(target ) ]In addition, some explanation should be given for the definition of the classification relation for binary relations, that has now been lifted to types. This relation is the copairing of the following two binary relations.⊨BinaryRelation ׃ type(BinaryRelation ) → type(BinaryRelation )⊨BinaryRelation ׃ individual(BinaryRelation ) → type(BinaryRelation )The first classification relation between relational types is new. The second is the usual first-order classification relation, where we identify individuals with instances (in that case).One possible axiom for higher-order relation classification is the following.〈 type(), ⊢〉 individual(BinaryRelation ) individual(Entity )⊨Entity type(source )Figure 4: Classification-Projection Diagram: Higher-Order Entity Types• preservation of classification:σ ⊨ ρ implies ( γ ⊨ α and δ ⊨ β )Suppose that relational type σ is an instance of relational type ρ. If σ has source type γ and target type δ and ρ has source type α and target type β, then γ is an instance of α and δ is an instance of β. As an example how this might occur, let entity types α and β be any two second level types, and define a second-level binary relation ρ between α and β to be those first-level binary relations between first-level entity type instances of α and β.SERIALIZATION SYNTAXThe National Center for Supercomputing Applications (NCSA) uses a search tool called Emerge that links multiple databases for a specialized community. Each community uses its own specialized markup language (XML application) for interchange of their particular information; for example, the astronomy community uses a special Astronomical Markup Language (AML). On the other hand, OML/CKML is a generic framework for describing information of any kind. What is the difference between a specialized markup language such as AML and a generic markup language (or framework) such as OML/CKML and how are these related? The answer involves coding and parsing styles.The generic markup language XOL (see the section on interoperability) advocates a generic approach for the specification of ontologies. The generic approach means that all ontologically-structured information is specified by a single set of XOL tags (defined by the single XOL DTD). The generic approach is modeled in OML/CKML by the generic style discussed below. In contrast, the Conceptual Graph Interchange Form (CGIF) represents information in a specific style. The primary advantage for the generic approach is simplicity in language processing. The primary disadvantage is lack of a means for type-checking the semantic constraints specified in the ontology. As discussedin this section, OML/CKML offers an approach that subsumes both the generic and the〈 type(), ⊢〉⊨Entity1targettype(source )Figure 5: Classification Projection Diagram: Higher-Ordered Typesspecific approaches for coding ontologies and ontologically-structured information. In a nutshell, we want to investigate whether the equivalence of Figure 6 has any meaning, validity and importance. In fact, we believe it has central Array importance in processing ontologies and XML.Abbreviation StylesFigure 6: Equivalence OML/CKML abbreviation styles are equivalent formalizationsthat have either the advantage of simpler processing (generic style) or the advantages of greater code simplicity and better type-checking (specific style). They are closely tied to the OML/CKML parsing methodology. There are two primary abbreviation styles: generic and specific. Any other style might be termed intermediate. The generic and specific styles are polar opposites, while an intermediate style is a mixture of the two. The generic style (no abbreviation) provides a syntax for a single universal grammar or DTD that is independent of domain and ontology. Each specific OML/CKML ontology can be automatically translated into a specific domain-dependent grammar or DTD. The specific style (full abbreviation) is an instance of that domain-specific ontology, and is parseable with that domain-specific grammar or DTD.The OML/CKML abbreviation styles are based upon the two OML/CKML abbreviation forms; an object-element form and a function-attribute form. These loosely follow two of the three RDF abbreviation forms – the object-element form is essentially the third RDF abbreviation form with the RDF Description element corresponding to the OML/CKML Instance.Object element; the function-attribute form is essentially the first RDF abbreviation form restricted to OML/CKML functions. The object-element abbreviation form in OML/CKML preceded the RDF version by several years, providing the syntax for OML/CKML version 1.5. The generic style must use neither of these abbreviations, whereas the specific style must use both of them.In order to illustrate OML/CKML abbreviation styles, we consider the example of the Movie instance Casablanca (1942). In the reduced representation below there is an object type for movies with metadata for year of appearance and genre. There is also a multivalent (n-ary) relation that links movies, cast members and the character that they played. The central core does not have a separate metatype for these (that comes in full OML), and so these are reified and represented as objects. The full Movie ontology can be automatically translated to the domain-specific movie DTD. Obviously, the specific style for Movie instance collections is much simpler code than the generic style.Movie Ontology<Type.Entity name=“Movie”/><Type.Function name=“year” source.Type=“Movie” target.Type=“Natno”/><Type.BinaryRelation name=“genre” source.Type=“Movie” target.Type=“Genre”/><Type.Entity name=“Cast”/><Type.Function name=“movie” source.Type=“Cast” target.Type=“Movie”/><Type.Function name=“member” source.Type=“Cast” target.Type=“Person”/><Type.Function name=“character” source.Type=“Cast” target.Type=“String”/>Domain-Specific Movie DTD<!ELEMENT Movie (genre)*><!ATTLIST Movieid ID #REQUIREDyear NUMBER #IMPLIED><!ELEMENT genre EMPTY><!ATTLIST genretarget.Instance CDATA #REQUIRED><!ELEMENT Cast EMPTY><!ATTLIST Castmovie CDATA #IMPLIEDmember CDATA #IMPLIEDcharacter CDATA #IMPLIED>The Specific Style Collection<Movie id=“Casablanca_1942” year=“1942”><genre target.Instance=“Drama”/><genre target.Instance=“Romance”/></Movie><Castmovie=“Casablanca_1942”member=“Humphrey_Bogart”character=“Rich Blaine”/>The Generic Style Collection<Instance.Entity id=“Casablanca_1942”><classification type=“Movie”/><Instance.Function target.Instance=“1942”><classification type=“year”/></Instance.Function><Instance.BinaryRelation target.Instance=“Drama”><classification type=“genre”/></Instance.BinaryRelation><Instance.BinaryRelation target.Instance=“Romance”><classification type=“genre”/></Instance.BinaryRelation></Instance.Entity><Instance.Entity id=“cast1”><classification type=“Cast”/><Instance.Function target.Instance=“Casablanca_1942”><classification type=“movie”/></Instance.Function><Instance.Function target.Instance=“Humphrey_Bogart”><classification type=“member”/></Instance.Function><Instance.Function target.Instance=“Rich Blaine”><classification type=“character”/></Instance.Function></Instance.Entity>The XML tags for both the ontology and the generic style instance collection use the generic names for types and instances in the central Core Type Hierarchy of Figure 3. These are listed in Table 3. The subtype and classification relations are special. The subtype relation needs the two additional specific and generic attributes, and the classification relation (since it links instances and types) needs the two additional instance and type attributes.。
02-Task5-Open基于语义知识的汉语句法分析
基于语义知识的汉语句法分析王 进(知识产权出版社 北京 10008)kingsten_88@摘要 语义知识是句法分析的必要基础,缺乏语义知识,自动句法分析将始终处于浅层的结构分析。
基于概念依存的关系,建立一套语义知识形式化体系,使得词语内涵的描述和语法结构的描述得到统一,让各级语言单位之间信息计算成为可能。
文章根据语义知识体系改进了传统chart算法的形式规则,增强了形式规则排除歧义的能力。
关键词 语义知识,原型系统,特征结构,chart算法。
A Semantic-Based Syntactic Parsing for ChineseWang Jin(Intellectual Property Publishing House, Beijing, 10008, China)kingsten_88@Abstract:Semantic knowledge system is essential for automatic syntactic parsing. If lacking ofsemantic knowledge, automatic syntactic parsing will always works on shallow syntactic structure farfrom semantic calculation. For achieving the purpose of semantic calculation between syntactic unitsof all levels, it is important to make the unification of expressions about the senses of words and thecharacteristics of syntactic units through establishing a set of formal systems of semantic knowledgebasing on the dependency rule between concepts. This article improves the syntactic rules oftraditional chart parsing approach according to a truly original semantic knowledge system, whichgreatly depreciates the ambiguity of syntactic rules.Key words: Semantic Knowledge, Prototype System, Feature Structure, and Chart Parsing自然语言处理技术,特别是自动句法分析,其优劣好坏终究取决于对语言知识的掌握程度。
HL7 V3 基础框架
二零零七年十月十八日
HL7 V3 基础框架
12
HL7 v2 和HL7 v3 消息
HL7 V3 Message:
<PRPA_MT400001HT03.EncounterEvent moodCode="EVN" type="PatientEncounter" classCode="ENC"> <id extension="ENC.P0001.1" root="10.301.3.10.0"/> <effectiveTime htb:dataType="IVL_TS" operator="I"> <low value="200601101201"/> <high value="200602101202"/> </effectiveTime> <subject contextControlCode="OP" type="Participation" typeCode="SBJ"> <Patient type="Patient" htb:association="role" classCode="PAT"> <id extension="PAT.P0001" root="10.301.5.10"/> <Person type="Person" determinerCode="INSTANCE" htb:association="player" classCode="PSN"> <id extension="PSN.P0001" root="10.301.1.10"/> <name use="L" htb:dataType="PN"> <family encoding=“TXT” partType=“FAM”>萨达姆</family> </name> <telecom value="TEL:+86 10 21450001" use="H"/> <administrativeGenderCode code="1" codeSystemName="GenderCode_301"/> <birthTime value="198009101209"/> <addr use="H"> <postalCode encoding="TXT" partType="ZIP">X0001</postalCode> <streetAddressLine encoding=“TXT” partType=“SAL”>伊拉克</streetAddressLine> </addr> <raceCode code="2034-7" codeSystemName="Race"/> </Person> <Organization type="Organization" determinerCode="INSTANCE" classCode="ORG"> <id extension="ORG.OG.0004" root="10.301.1.20"/> </Organization> </Patient> </subject> </PRPA_MT400001HT03.EncounterEvent>
语言学知识要点
语言学知识要点第一章1 Linguistics :Linguistics is generally defined as the scientific or systematic study of language2the four principles that make linguistics a science are:exhaustiveness/consistency/economy/objectivity3purposes: one is that it studies the nature of language and tries to establish a theory of language, and describes languages in the light of the theoryestablished. The other is that it examines all the forms of language ingeneral and seeks a scientific understanding of the ways in which it isorganized to fulfill the needs it serves and the functions it performs inhuman life.4 the three basic ways linguistics differ from traditional grammar: firstly,descriptive not prescriptive ; secondly, linguists regards the spokenlanguage as primary, not the written; thirdly, linguistics describes eachlanguage on its own merits (traditional grammar is based on Latin)5scope of linguistics:·Phonetics(语音学) : The scientific study of speech sounds .It studies how speech sounds are articulated, transmitted, and received. For example, vowels and consonants·Phonology(音位学) : The study of how speech soundsfunction in a language .It studies the ways speech sounds are organized. . For example, phone, phoneme, and allophone.·Morphology(形态学):The study of the way in which morphemes are arranged to form words is called morphology.For example,boy and“ish”---boyish,teach---teacher.·Syntax(句法): The study of how morphemes and words are combined to form sentences is called syntax. For example, ”John like linguistics.”·Semantics: (语义学)The study of meaning in language is called semantics. For example, the seal could not be found. The zoo keeper became worried.” The seal could not be fo und,The king became worried.” Here the word seal means different things.·Pragmatics(语用学): The study of meaning in context of use is called pragmatics. For example, “I do” The word do means different contex t. ·Sociolinguistics: The study of language with reference to society is called sociolinguistics. For example, regional dialects ,social variation in language. ·Psycholinguistics: The study of language with reference to workings of mind is called psycholinguistics.1. What is language?Language is a system of arbitrary vocal symbols used for human communication2. Design features of language①Arbitrariness(任意性) means that there is no logical connection betweenmeanings and sounds.②Duality(二重性): The property of having two levels of structures, such that unitsof the primary level are composed of elements of thesecondary level and each of the two levels has its own principles of organization.③Productivity(创造性): it means Language is resourceful because of its duality and recursiveness , in that it makes possible the construction and interpretation of new signals by its users.④Displacement(移位性): Human Languages enable their users to symbolize objects, events and concepts which are not present (in time and space) at moment of communication.5) interchangeability: refers to the fact that man can both produce and receive messages, and his roles as a speaker and a hearer can be exchanged at ease.6) specialization: refers to the fact that man does not have a total physical involvement in the act of communication7) cultural transmission: language is not biologically transmitted from generation to generation,but the details of the linguistic system must be learned anew by each speaker.3. Functions of language①Informative(信息功能): To give information about facts. ( ideational)eg. road cross②Interrogative(人际功能): to establish and maintain social status in a society.(age, sex, language, background, accent, status)eg.how old are you?③Performative(施为功能) : language is used to do things, to perform certain actions. (name, promise, apologize, sorry, declare)eg. I declare the meeting open.④Expressive (情感功能): to express feelings and attitudes of the speaker.⑤Phatic communion(寒暄交流) : to use small and meaningless expressions to establish a comfortable relationship or maintain social contact between people without any factualcontent. (health, weather)eg. ah here you are!⑥Directive: language is used to get the hearer to do something.(祈使句)⑦Evocative: language is used to create certain feelings in the hearers.(jokes advertising)4. What is linguistics?Linguistics is generally defined as the scientific study of language.5. Important distinctions in linguisticsDescriptive & prescriptiveSynchronic & diachronicLangue & paroleCompetence & performance6.Descriptive ( 描写/述性)—describe and analyze linguistic facts or the language people actually use (modern linguistic) Prescriptive ( 规定性)—lay down rules for “correct and standard”linguistic behavior in using language (traditional grammar: “never use a double negative”)7.Synchr onic study ( 共时)—description of a language at some point of time (modern linguistics)Diachronic study (历时)—description of a language as it changes through time (historical development of language over a period of time)第二章(一)1.The study of phonetics can be divided into three mainbranches:(1)articulatory phonetics(2)acoustic phonetics (3)auditory phonetics2.speech organs are the human body involved in the production of speech, including the lungs, the trachea, the throat, the nose and the mouth3sound segments are grouped into consonants and vowels 元音V owel(1)The sounds in the production of which no articulators come very close together and the air stream passes through the vocal tract without obstruction are called vowels.(2)元音的和划分标准:the part of the tongue that is raised—front center back;the height of the tongue—high middle low;the opening of the mouth—close open semi-open semi-close;the shape of the lips—rounded unrounded;the length of the sound—long short辅音Consonants(1)The sounds in the production of which there is an obstruction of the air streamat some point of the vocal tract are called consonants.(2)according to the place of articulation English consonants can be classifiedinto :bilabials(双唇音pbmw),labiodentals(唇齿音fv),dentils or interdentals(齿音th 的两种发音),alveolars(齿龈音tdsznlr),palatals(腭音)velars(软腭音)glottal(声门音h):according to the manner of articulation English consonants can be classified into :stops爆破音nasals 鼻音fricatives擦音approximants通音laterals 边通音affricates破擦音liquids流音glides 滑音(二)Phonology1.音位PhonemeThe basic unit in phonology, it?s a collection of distinctive phonetic features.2.音素phoneA phonetic unit or segment. it doesnot necessarily distinguish meaning, it?s a speech sound we use when speakinga language.3.最小对立对Minimal pairWhen two different forms are identical in every way except for one sound segment which occurs in the same place in the strings, the two words are said to form a minimal pair.4.超切分特征Suprasegmental features(1)The phonemic features that occur above the level of the segment are called suprasegmental features. the main suprasegmental features include stress ,intonation 5.自由变体free variationIf two sounds occurring in the same environment do not contrast; namely,if the substitution of one for the other does not generate a new word form but merely a different pronunciation of the same word, the two sounds then are said to be in free variation6.syllables and consonant clusters:in every language we find that there are constraints on the sequences of phonemes that are used.A word which begins with three -consonant clusters always observes three strict rules:(1)the first consonant must be [s](2)the second phoneme must be [p][t][k](3)the third consonant must be [i][r][w][j]补充知识点1.宽式音标Broad transcriptionThe transcription of speech sounds with letter symbols only.2.窄式音标Narrow transcriptionThe transcription of speech sound with letters symbols and the diacritics.3.清音V oicelessWhen the vocal cords are drawn wide apart ,letting air go through without causing vibration ,the sounds produced in sucha condition are called voiceless sounds.4.浊音V oicingSounds produced while the vocal cords are vibrating are called voiced sounds.Two allophones of the same phoneme are said to be in complementary distribution.5.Phonetic 组成⑴Art iculatory phonetics 发音语音学longest established, mostly developed⑵Auditory phonetics 听觉语音学⑶Acoustic phonetics 声学语音学6.articulatoryApparatus /Organs of SpeechPharyngeal cavity–咽腔Oral ...–口腔greatest source of modification of air stream found here Nasal …–鼻腔7.The tongue is the most flexible, responsible for more varieties of articulation than any other, the extreme back of the tongue can be raised towards the uvula and a speech sound can be thus produced as is used in Arabic and French. 8.Obstruction between the back of the tongue and the velar area results in the pronunciation of[k] and[g],the narrowing of space between the hard palate and the front of the tongue leads to the sound[j];the obstruction created between the tip of the tongue and the alveolar ridge results in the sounds[t]and[d].9.nasal consonants: [m] / [n] / [η]例子11.English has four basic types of intonation:Falling tone;Rising tone;Fall-rise tone; Rise-fall tone第三章Morphology1.词素MorphemeThe basic unit in the study of morphology and the smallest meaningful unit of language. We can make a distinction between two types of morphemes: FreeMorpheme and Bound morphemes.2.自由词素Free Morphemes(1)Free morphemes are independent units of meaning and can be used freely all by themselves.(2)All monomorphemic words (单语素词)are Free Morphemes.(3)Free morphemes can be divided into two categories;[A]the set of the ordinary nouns, verbs, and adjectives which carry the content of the messages we convey .examples are book, look, happy[B] functional morphemes .examples are but, and, if, when3.黏着词素Bound morphemes(1) Bound morphemes are these morphemes that cannot be used by themselves, must be combined with other morphemes to form words that can be used independently.(2) Bound morphemes can be divided into two categories[A]derivational morphemes 派生语素The manifestation of relation between stems and affixes through the addition of derivational affixes.[B]inflectional morphemes曲折语素. The manifestation of grammatical relationships through the addition of inflectional affixes, such as number, tense, degree and case.4.词根RootRoot is the base form of a word which cannot be furtheranalyzed without total loss of identity. All words contain a root morpheme, which may be a free morpheme or a bound morpheme.5.词缀AffixThe collective term for the type of formative that can be used only when added to another morpheme. It has three subtypes , prefix, suffix, and infix. All the affixes are bound morphemes.. Prefixes前缀modify the meaning of the stem ,but usually do not change the part of speech of the original word, exceptions are the prefixes …be-… and …en(m)-…. Suffixes后缀are added to the end of stems, they modify the meaning of the original word and in many cases change its part of speech. 3.In using the morphological rules, we must guard against Over-generalization.6.词干StemA stem is any morpheme or combination of morphemes to which an inflectional affix can be added. It can be equivalent to a root, or a root and a derivational7. 词素变体MorphsMorphs are the smallest meaningful phonetic segments of an utterance on the level of parole.8.语素变体AllomorphSome morphemes have a single form in all contexts, such as dog, cat, some others may have considerable variation. some morphemic shapes represent different morphemes sand thus have different meanings, examples,-s 表示复数,人称,和所有格9构词法Types of word formation(1)Compounding合成法(compound合成词)(2)Derivation派生法(derivative派生词)(3)Conversion转类法(4)Backformation逆构法(5)Clipping拆分法(6)Blending混成法(7)Acronym首字母缩略法第四章1. Parts of speechTraditional grammar defines 9 parts of speech: nouns, verbs, pronouns, adjectives, adverbs, prepositions, conjunctions, determiner, Particle2.Word class(1)Nouns are words used to refer to people, objects, creatures, places, events, qualities, phenomena,(2)Adjectives are words that describe the things, quality, state or action which a noun refers to(3)Verbs are words used to refer to various actions.(4)Adverbs are words that describe or add to the meaning ofa verb, anadjective , another adverb, or a sentence.(5)Prepositions are words used with nouns in phrases providing information about time, place, and other connections involving actions and things.(6)Pronouns are words which may replace nouns or nouns phrases(7)Conjunctions are words used to connect, and indicate relationship betweenevents and things.(8)Determiner限定词(9)Particle 颗粒词3. Preposition is not a word you can end with a sentence with.4.(1) One type of descriptive approach is called structuralanalysis.(2) Immediate constituent analysis成分分析法(IC分析法)[A]Immediate constituent analysis is the analysis of a sentence in terms of its immediate constituents –Word groups, which are in turn analyzed into the immediate constituents of their own, and the process goes on until the ultimate constituents are reached[B]The Immediate constituent analysis of a sentence may be carried out with brackets or with a tree diagram, the criterion for the immediate constituent analysis is Substituted for a single word and the structure remains the same. IC analysis , the internal structure of a sentence may be demonstrated,(3)bracketing analysis6三种分析法的优缺点:优点:反映了语言层级性本质(essence);解析语言生成;反应语言递归性。
[2010]_[D]_[开题报告]_[上下文元数据]_[语义互操作]
- 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
- 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
- 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。
Semantic and Syntactic Interoperability in Transactional SystemsChitoJovellanosforward look, inc.8 Faneuil Hall Marketplace, 3rd FlrBoston, Massachusetts US 02109+1 (617) 763-7011chito@ABSTRACTThis research note describes a middleware-based system that enables semantic and syntactic interoperability of transactional data exchanged in real-time between sending and receiving systems. Our system samples transaction streams, and statistically measures the variances in the semantics and syntax of the underlying data elements exchanged between sending and receiving systems. Automatic transformations of the data elements are applied when a context-specific Data Operability Threshold is crossed. The transaction is then transmitted onward by the sending system (or processed for consumption by the receiving system) in a format that ensures no manual reconciliation would be required to process the transaction. The System is being prototyped for cross-border securities trading and settlement.KeywordsIntegration, Interoperability, Semantics, Syntax, Transformation. 1. INTRODUCTIONEnabling full data interoperability at the business-to-business level (defined as the ability of communicating business systems to generate and consume transactional data in a straight-through-processing (STP) manner) has never been satisfactorily addressed. In securities trade processing (as well as other financial industry verticals), the development of exhaustive message standards is rightfully envisioned as the primary enabler for STP. For example, the Society for Worldwide Interbank Financial Telecommunication (SWIFT) has established a number of standards (eg, ISO 15022), since its founding in the early 1970’s, which stipulate how various securities and related payment transactions are to be communicated between investment institutions, brokerages and custodial banks. Despite committed and concerted efforts by industry participants to comply with the letter and spirit of the standards, a significant amount of manual reconciliation is still required to process these transactions. The latest industry research suggests that as much as 36% of securities transactions do not meet the goal of STP resulting in significant trade management and reconciliation costs [1].The core reasons underlying sub-par STP rates are the semantic and syntactic differences with respect to how individual companies represent the specific data elements that populate the industry-compliant transaction formats they exchange in the normal course of business [2], [4]. Constrained systems implementations (eg, small mutual funds cannot often afford the data quality requirements for global securities cross-references), and proprietary business workflows (eg, an equity trading firm may not effectively support fixed income derivative business) often preclude full compliance with the rigorous application of the generic transaction standard. Four decades of experience with message standardization in the global securities industry shows that standards quickly evolve away from the idealized norm where the variances are conditioned by systems capabilities, internal workflows and shifting business imperatives.2. SOLUTION FRAMEWORKOur approach recognizes that small but significant differences in the application of messaging standards exist. The ability to achieve STP is tied intimately to compensating for the variances that continually arise in the semantics and syntax of data elements that comprise business transactions. We demonstrate the application of “integration on demand” [3], as a complement to “integration in advance” (ie, the naïve premise that data harmonization through message standards is a sufficient solution to the problem of business-to-business integration).Our prototype implementation (System) is nested within existing middleware on the sender’s computer, the receiver’s computer, or both. Our System performs the following major operations :Permission to make digital or hard copies of all or part of this work forpersonal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior written permission.Copyright 2003 forward look, inc. Analyze data streams of tagged transactions (NB, not necessarily XML-based) created by a computer system (eg,a company’s internal order management systems) prior tooutbound transmission to another computer system (eg, the counterparty’s order management system). Similarly, complementary transactions from the counterparty’s systems(eg, trade confirmation systems) are also inspected. Thisuniverse of samples reflects a coherent collection of transactions that support an established business workflow,and provides a complete view of day-to-day “real-world”data usage.Derive transformation algorithms that enable automated restructuring of the data elements of the sender’s transaction(ie, immediately prior to transmission by the sender’s system) into a message format that a receiver’s system canprocess without manual intervention. Conversely, transformation algorithms can also be derived (ie, for use bythe receiver’s system) which restructures the data elementsof the counterparty’s inbound transaction into a form thatthe receiving system can process without manual intervention. These transformation algorithms are enabledusing XSLT.The accuracy of our System improves when the sampled transaction is partitioned into logical semantic segments (Topic Domains) prior to analysis. Each Topic Domain comprises a discrete collection of data elements that delineate a generally understood and logically coherent subject matter area (eg, client information; regulatory compliance data).The variances in the syntax and semantics of sampled transactions are tracked using the Data Interoperability Grid (DIG). The DIG is a set of three dimensional numeric arrays that captures the statistical similarity of data element usage within and across transactions.The variance profile of every data element sampled from the transaction stream is accessible via a Universal Data Structure (UDS). One UDS is available for each field and counterparty combination (eg, “Net Amount” as used with “Goldman Sachs London”). The ensuing collection of UDS objects are inherently simple to use since each and every UDS is consistent in structure and uniformly addressable by an interrogating program.Data transformations are triggered when the semantic and syntactic stability of a transaction stream is perturbed. The Data Operability Threshold (DOT) is a mathematical region (defined by the metrics captured in the DIG) that describes a stable (but non-stationary) optimum of transactional data usage. When the DOT region is pierced, the event triggers the corresponding data transformations (enabled via UDS calls) that dynamically restructure the data elements in the inbound or outbound transactions in order to accommodate the current (or a modified) stable optimum.3. PRELIMINARY RESULTSPreliminary results from our initial deployment of the System showed promising results (Table 1), and areas for improvement. We validated its capabilities in enabling data interoperability between counterparties’ systems by processing corporate actions (eg, notifications for stock splits, mergers and other events that affect the capital structure of a company), and securities trade settlement instructions (eg, a purchase and sale effecting a change in ownership of stock on receipt of payment). Based on concurrent STP metrics obtained for the same set of transactions, usage of our System resulted in significant STP gains of approximately 33% for corporate actions and 12% for settlement instructions.Table 1. Improvement in STP Rates achieved using our System. STP Rate defined as percentage of transactions that did not require manual handling. Sample Size shows average daily number of transactions (daily peaks in parentheses). Results based on 30 contiguous business days. Data represents a global investment institution interacting with their custodian bank. The “training data” set consisted of the samples from the first five days, and were excluded from the “result set” (which comprised the remaining 25 days)STP Rate :ExistingAutomationSTP Rate :With OurSystemSample SizeCorporateActions64% 97% 1000 (3000)TradeSettlement86% 98% 40000 (180000)Corporate action transactions (which are semantically far richer than settlement instructions) each required 3.2 seconds on average to process, while each settlement instruction only took 0.8 seconds on average. Initializing the System for corporate actions required 33 hours of “learning time” based on the training set, while settlement instructions only required 14 hours in total. Follow-on optimizations to the underlying algorithms are now being undertaken.4. REFERENCES[1] Bryant, A. 2002. Presentation at SWIFT SIBOS Conference,Geneva, 30 September 2002.[2] Rahm, E. and Bernstein, P.A. 2001. A survey of approachesto automatic schema matching. The VLDB Journal 10 : 334-350.[3] Stonebraker, M. 2002. Too Much Middleware. SIGMODRecord 31:1 March 2002(/sigmod/record/issues/0203/).[4] Yan, L.L., Miller, R.J., Haas, L.M. and Fagin, R. 2001.Data-Driven Understanding and Refinement of SchemaMappings. Proceedings of the ACM SIGMOD 2001 Conf:485-496.。