An Ontology-Based Document Collection
CoMoTo: An Ontology-Based Context Modeling Tool

Xu Zhaohui, Wu Gang
School of Software, Shanghai Jiao Tong University, Shanghai 200240
Abstract: Context awareness has become a research hotspot of pervasive computing in recent years, and suitable context modeling methods and tools are the foundation for realizing it.
This paper adopts an ontology-based approach to context modeling and, aiming at both generality and ease of use, presents an ontology-based context modeling tool (CoMoTo).
The paper discusses a tiered method for modeling context, describes the analysis and design of the tool, and demonstrates its modeling capability through a case study.
Keywords: context modeling; ontology; modeling tool

1. Introduction
Pervasive computing is human-centered computing, and one of its key characteristics is context awareness: the ability to react appropriately and dynamically as the context of the surrounding environment changes.
Describing and managing context well requires a unified understanding of what context is.
Many papers have defined context, and the definitions differ with the authors' perspectives. The definition by Dey et al. [1] is among the most representative and general: context is any information that can be used to characterize the situation of an entity, where an entity may be a person, a place, or any object relevant to the interaction between a user and an application, including the user and the application themselves.
Throughout this paper, 'context' follows this definition.
In context-aware applications, context information can be obtained through many channels (e.g., sensors, storage, and manual input). This heterogeneity of sources makes describing, managing and using context a complex process.
A context model provides context-aware applications with an abstract description of context, making these complex operations transparent to them and thereby greatly simplifying the construction of context-aware applications.
A well-structured context model is therefore key to building a context-aware system [2].
As a descriptive formalism, an ontology specifies a shared conceptual model explicitly, formally and normatively [3]. Its strong expressiveness allows it to describe complex contexts, and the rich semantics it provides make reasoning over context possible.
Owing to these advantages, researchers widely use ontologies to model context information.
Several software packages exist for building ontologies; Protege, developed at Stanford University, is an excellent example.
However, because Protege targets ontology construction for all domains, its extreme generality makes it complex to operate, and modelers need considerable specialist knowledge.
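The tiered context modeling the abstract mentions is not detailed in this excerpt. As a hedged illustration, a common two-level pattern from the context-awareness literature (a small, general upper ontology extended by domain-specific concepts, in the style of CONON-like models) can be sketched with plain classes; every concrete class and attribute below is an invented example, not CoMoTo's actual model:

```python
# Illustrative two-level context ontology: a general upper level
# (ContextEntity, Person, Location, Activity) extended by a domain
# level (SmartHomeRoom). The subclass relation plays the role of the
# ontology's is-a hierarchy; all names here are assumptions.

class ContextEntity:            # upper ontology: anything that has context
    def __init__(self, name):
        self.name = name

class Person(ContextEntity): pass
class Location(ContextEntity): pass
class Activity(ContextEntity): pass

class SmartHomeRoom(Location):  # domain ontology extends the upper level
    def __init__(self, name, temperature_c):
        super().__init__(name)
        self.temperature_c = temperature_c

living_room = SmartHomeRoom("living room", 22.5)
print(isinstance(living_room, Location))       # domain concept is-a Location
print(isinstance(living_room, ContextEntity))  # ... and is-a ContextEntity
```

The point of the split is that applications can reason against the stable upper level while each deployment adds only its own domain concepts.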
An Ontology-Driven Knowledge Management Methodology Applied to Clinical Medicine

The Ontology-Driven Methodology for Knowledge Management in Clinical Medicine
English abstract: Background and Objective: The term "ontology" means all the core concepts (terms and their relationships) belonging to a specified knowledge domain. Ontology-driven knowledge management
is a methodology that utilizes ontology to coordinate the semantics (meaning) in documents laden with semi-structured knowledge. As far as we know, ontology-driven knowledge management has not been used for clinical purposes to date. In this study, we applied this methodology to construct a knowledge management system for the severe acute respiratory syndrome (SARS)-related knowledge domain, and then tested whether ontology-driven knowledge management could be superior to the conventional method of knowledge management in retrieving
Research on the Military Medical Service Emergency Response Mechanism for Public Health Emergencies

Author: Wu Xiongjie
Degree-granting institution: Third Military Medical University
1. Conference paper: Liu Baoyan, Peng Jin, Hu Jingqing. Building a Clinical Treatment Decision-Support System to Respond Scientifically to Public Health Emergencies, 2005.
SCBR draws on a domain model of the specific application area to define a structural description of cases, stores past cases in structured form as XML, and builds a case base for that domain. When a public health emergency occurs, SCBR abstracts the event into the case structure, evaluates the similarity between the event and every case in the case base, and retrieves the several cases most similar to it. By learning from and revising the solutions of these most similar cases, SCBR produces a final solution for the emergency and directs the emergency response.
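The retrieval loop described above can be sketched in a few lines. This is an illustrative reduction, not the thesis's model: cases are flat attribute dictionaries standing in for the XML case structure, similarity is a simple weighted attribute match, and the attribute names and weights are invented:

```python
# Minimal case-based retrieval sketch: score every past case against a
# new event with a weighted attribute match, return the top-k matches.

def similarity(event, case, weights):
    """Weighted fraction of attributes on which event and case agree."""
    total = sum(weights.values())
    score = sum(w for attr, w in weights.items()
                if event.get(attr) == case.get(attr))
    return score / total

def retrieve(event, case_base, weights, k=3):
    """Return the k cases most similar to the new event."""
    ranked = sorted(case_base,
                    key=lambda c: similarity(event, c, weights),
                    reverse=True)
    return ranked[:k]

case_base = [  # hypothetical past emergencies
    {"id": 1, "type": "epidemic", "scale": "city",     "season": "winter"},
    {"id": 2, "type": "epidemic", "scale": "province", "season": "spring"},
    {"id": 3, "type": "chemical", "scale": "city",     "season": "winter"},
]
weights = {"type": 3.0, "scale": 2.0, "season": 1.0}
event = {"type": "epidemic", "scale": "city", "season": "spring"}

best = retrieve(event, case_base, weights, k=2)
print([c["id"] for c in best])  # ids of the two most similar cases
```

The remaining SCBR steps (revising the retrieved solutions into a final plan) are domain-specific and are not sketched here.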
A Discussion about Ontology

Dong Yunwei: You asked about 'ontology'. Its literal translation is the philosophical doctrine of existence or being; it is now very popular as a basis for conceptual models of systems.
The rough idea is that the objective world is composed of many elements, and these elements stand in various relations to one another; drawing out those elements and relations gives you an ontology.
Here are a few papers for reference, all from international conferences in 2002, so fairly recent:
1. 25030001 Conceptual Modeling and Ontology: Possibilities and Pitfalls
2. 25030003 An Ontology for m-Business Models
3. 25030012 Ontology-Driven Conceptual Modeling: Advanced Concepts
4. 25120174 DAML+OIL: A Reason-Able Web Ontology Language
Hao Kegang, January 15, 2003

Sent: Wednesday, January 15, 2003 8:49 PM
Subject: RE: About ontology
Professor Hao and all: The 10th International Conference on Human-Computer Interaction (2003) has just accepted a paper of mine, "Ontology-based Conceptual Modeling for Interaction Design", with full marks for both novelty and quality :-) Its content actually concerns conceptual modeling of software systems, but it has some bearing on modeling interaction.
I argued about this with Dong Yunwei for two hours, but he did not believe me.
A common definition of ontology is: a formal, explicit specification of a shared conceptualization (though this is debated); its content comprises a concept taxonomy, relations and axioms.
An ontology is generally static and does not include dynamic concepts.
In other words, an ontology describes declarative knowledge rather than procedural knowledge, because the purpose of an ontology is to represent knowledge, not to use it.
'Shared conceptualization' refers to an abstract model of the phenomena in a problem domain whose concepts are commonly agreed upon; 'formal' means machine-processable; and 'explicit' means that the types of the concepts and the constraints on their use are explicitly defined (generally there must be a meta-ontology, also called ontology assumptions, defining the concept types and the relations between types; the concepts and relations of a concrete conceptual model are then its instances).
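The definition above (a concept taxonomy plus relations and axioms, machine-processable) can be made concrete with a toy encoding. The concepts below are invented for illustration; the subsumption check is the simplest piece of "reasoning" a formal encoding buys:

```python
# Toy ontology: an is-a taxonomy plus named relations between concepts.
# subsumes() walks the taxonomy to decide whether one concept is an
# ancestor of another - a tiny example of machine-processable semantics.

is_a = {                      # concept taxonomy (child -> parent)
    "Student": "Person",
    "Professor": "Person",
    "Person": "Agent",
    "University": "Organization",
    "Organization": "Agent",
}
relations = {                 # named relations between concepts
    ("Student", "enrolledIn", "University"),
    ("Professor", "worksFor", "University"),
}

def subsumes(general, specific):
    """True if `general` is `specific` or one of its ancestors."""
    while specific is not None:
        if specific == general:
            return True
        specific = is_a.get(specific)
    return False

print(subsumes("Agent", "Student"))           # Student -> Person -> Agent
print(subsumes("Organization", "Professor"))  # no such path
```

Axioms (e.g. "enrolledIn only relates Persons to Organizations") would sit on top of this as constraints; they are omitted here for brevity.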
Porting the TinyX Display Driver to an ARM Development Board

YUV4:2:2 uses a packed format: a Y component is kept for every pixel, while the U and V components are sampled once for every two pixels horizontally [8]. A macropixel is 4 bytes and actually represents 2 pixels. (The "4:2:2" means that for every 4 Y components there are 2 U components and 2 V components.) In the image data, the YUV components are ordered y0 u y1 v y2 u y3 v ..., where y0 is the luminance of the left pixel, y1 is the luminance of the right pixel, and u and v are the chrominance values shared by both.
Next, a buffer for the YUV data must be created and an RGB-to-YUV conversion routine designed; finally, the contents of the YUV buffer are written to the ADV7179 display chip.
3.1 Design of the YUV data buffer
Since YUV4:2:2 is a packed format, the buffer is likewise laid out in YUV4:2:2 order. For a TV screen of 704 x 576 resolution, the corresponding YUV data buffer is 704 x 576 x 2 bytes.
2 Input signal format of the ADV7179
The input format of the ADV7179 is YUV4:2:2. YUV is another standard for representing color information, widely used in video and television signal transmission; it expresses color with a luminance signal Y and chrominance signals U and V. With only the Y component and no U or V, the image is a black-and-white grayscale image, which is why it can be displayed on black-and-white television.
In the YUV4:2:2 format, two points on a scan line share the same U and V chrominance values; when the two points' Y values differ only slightly, the human eye may not be able to tell the two displayed pixels apart. TV screens are therefore suited to pictures with pronounced transitions, whereas a graphics system demands higher picture quality: even a one-pixel-wide line must display crisply. To solve this sharpness problem, we make the two points' Y values equal as well, so that both points correspond to a single RGB pixel in the virtual screen buffer. Under this scheme, an image TinyX displays on the TV screen is stretched to twice its original width, which turns glyphs into ugly elongated rectangles; to keep glyphs square, the height must be doubled too, i.e. two vertically adjacent points on the TV screen also share the same YUV values and map to one RGB pixel (as shown in Figure 2). The horizontal and vertical resolutions of the TinyX virtual screen buffer are thus both halved relative to the TV screen, to 352 x 288.
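The conversion-and-doubling scheme described above can be sketched as follows. The BT.601 matrix used here is the common one for this kind of conversion; the article does not state which coefficients the driver uses, so treat them as an assumption:

```python
# Sketch of the scheme above: convert each RGB pixel of the 352x288
# virtual screen to one (Y, U, V) triple, then emit the macropixel
# y0 u y1 v with y0 == y1 (the horizontal doubling the article
# describes). BT.601 coefficients are an assumption, not from the text.

def rgb_to_yuv(r, g, b):
    """RGB (0-255) -> clamped integer (Y, U, V), common BT.601 form."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.169 * r - 0.331 * g + 0.500 * b + 128
    v = 0.500 * r - 0.419 * g - 0.081 * b + 128
    clamp = lambda x: max(0, min(255, int(round(x))))
    return clamp(y), clamp(u), clamp(v)

def pack_row(rgb_row):
    """One RGB pixel -> one 4-byte macropixel y0 u y1 v, with y0 == y1."""
    out = bytearray()
    for r, g, b in rgb_row:
        y, u, v = rgb_to_yuv(r, g, b)
        out += bytes([y, u, y, v])   # shared U/V, duplicated Y
    return bytes(out)

row = [(255, 255, 255), (0, 0, 0)]   # one white and one black RGB pixel
packed = pack_row(row)
print(len(packed))                   # 2 macropixels = 8 bytes
print(704 * 576 * 2)                 # size of the 704x576 TV buffer in bytes
```

Vertical doubling is then just writing each packed row twice, which is why a 352x288 virtual screen fills the 704x576x2-byte TV buffer.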

A Web Search Contextual Crawler Using Ontology Relation Mining
Wu Chensheng, Beijing Municipal Institute of Science and Technology Information, Beijing, China
Hou Wei, Shi Yanqin, Liu Tong, School of Information Engineering, University of Science and Technology Beijing; Beijing Municipal Institute of Science and Technology Information, Beijing, China

Abstract—To increase the correctness and the recall of vertical web search systems, a novel web crawler, ORC, is proposed in this paper. By introducing ontologies, the semantic concepts are defined. The relations between ontologies are exploited to evaluate the importance of a web document's context. These ontologies are organized in a network structure that is updated periodically. Thanks to the ontology relation mining, the crawling space is both expanded and better focused. A primitive vertical search system built on ORC is described at the end of the paper.

Keywords—vertical search; ontology; data mining; web crawler

I. INTRODUCTION
Nowadays, the web search engine (or information extraction system) has become an indispensable application. Several powerful search engines have been built, such as Google, Baidu and so on. But do they really return what you want? They can fail when you look for professional or uncommon concepts. The problem is relieved by vertical search engines [1], which concentrate only on certain concepts or regions. However, to the best of our knowledge, it has not yet been solved properly. How to increase correctness and recall is one of the most urgent issues in the web search community.
One ideal scenario is the semantic web [2], which aims to bridge concepts between humans and machines. Although many efforts have been made, and several methods have been proposed for establishing semantic webs, the proportion of standard semantic webs in the world remains very small.
In other words, the semantic web is not the mainstream at present. If web documents were organized in a regular structure such as XML, the situation would be better, but the reality is the opposite. For that reason, another technique has been proposed: ontology. In philosophy, ontology is the study of the nature of being, existence or reality in general, as well as of the basic categories of being and their relations. Traditionally listed as part of the major branch of philosophy known as metaphysics, ontology deals with questions concerning what entities exist or can be said to exist, and how such entities can be grouped, related within a hierarchy, and subdivided according to similarities and differences. In computer science and information science, an ontology is a formal representation of a set of concepts within a domain and the relationships between those concepts. It is used to reason about the properties of the domain, and may be used to define it. In theory, an ontology is a "formal, explicit specification of a shared conceptualization" [3]. An ontology provides a shared vocabulary that can be used to model a domain: the types of objects and/or concepts that exist, and their properties and relations.
In some web search engines, ontologies are used to describe and define concepts. The crawlers (also called spiders), a kind of agent software, explore the World Wide Web and build indices of the concepts. Generally speaking, a web search engine is composed of crawler algorithms, concept index databases, and search algorithms; among these, the crawler algorithm determines recall and correctness to a large extent.
When concepts are closely related, similar concepts are likely to be what users want as well, besides the one the user submitted. Data mining is the process of extracting hidden patterns from large amounts of data.
As more data is gathered, with the amount of data doubling every three years [4], data mining is becoming an increasingly important tool for transforming this data into information. It is commonly used in a wide range of profiling practices, such as marketing, surveillance, fraud detection and scientific discovery.
Constructing relation networks of ontologies by data mining is a novel strategy for organizing the concept index database and exploring heuristically. That is the main topic of this paper.
This paper is organized as follows: the state of the art in web search engines and data mining is introduced in Section II; Section III discusses ontology-based concept description; a novel crawler algorithm, ORC, based on ontology relation mining, is proposed in Section IV; and Section V presents a primitive search website using ORC.

II. BACKGROUND
A web search engine is a tool designed to search for information on the World Wide Web. Research on it dates back to the 1990s. One of the first "full text" crawler-based search engines was WebCrawler, which came out in 1994. Unlike its predecessors, it let users search for any word in any webpage, which has since become the standard for all major search engines. It was also the first to be widely known by the public. Also in 1994, Lycos (which started at Carnegie Mellon University) was launched and became a major commercial endeavor.
978-1-4244-4507-3/09/$25.00 ©2009 IEEE
Search engines were also known as some of the brightest stars in the Internet investing frenzy of the late 1990s [5]. Several companies entered the market spectacularly, receiving record gains during their initial public offerings. Some have taken down their public search engines and now market enterprise-only editions, such as Northern Light. Many search engine companies were caught up in the dot-com bubble, a speculation-driven market boom that peaked in 1999 and ended in 2001.
Nowadays, a series of search engines are booming, such as Yahoo, Google, Live Search and so on.
Web search engines work by storing information about many web pages, which they retrieve from the WWW itself. These pages are retrieved by a Web crawler, an automated Web browser which follows every link it sees; exclusions can be made by the use of robots.txt. The contents of each page are then analyzed to determine how it should be indexed (for example, words are extracted from titles, headings, or special fields called meta tags). Data about web pages are stored in an index database for use in later queries. Some search engines, such as Google, store all or part of the source page as well as information about the web pages, whereas others, such as AltaVista, store every word of every page they find. The cached page always holds the text that was actually indexed, so it can be very useful when the content of the current page has been updated and the search terms no longer appear in it. This problem might be considered a mild form of linkrot, and Google's handling of it increases usability by satisfying the user's expectation that the search terms will be on the returned webpage, in line with the principle of least astonishment. Increased search relevance makes these cached pages very useful, beyond the fact that they may contain data that is no longer available elsewhere.
While data mining can be used to uncover hidden patterns in the data samples that have been "mined", it is important to be aware that using a sample of the data may produce results that are not indicative of the domain: data mining will not uncover patterns that are present in the domain but not in the sample. Humans have been "manually" extracting information from data for centuries, but the increasing volume of data in modern times has called for more automatic approaches.
As data sets and the information extracted from them have grown in size and complexity, direct hands-on data analysis has increasingly been supplemented and augmented with indirect, automatic data processing using more complex and sophisticated tools, methods and models. The proliferation, ubiquity and increasing power of computer technology have aided data collection, processing, management and storage. However, the captured data needs to be converted into information and knowledge to become useful. Data mining is the process of using computing power to apply methodologies, including new techniques for knowledge discovery, to data [6].
Data mining identifies trends within data that go beyond simple data analysis. Through sophisticated algorithms, non-statistician users have the opportunity to identify key attributes of processes and target opportunities. However, shifting control and understanding of the process from statisticians to poorly informed or uninformed users can result in false positives, useless results, and, worst of all, results that are misleading and/or misinterpreted.

III. ONTOLOGY
Historically, ontologies arise out of the branch of philosophy known as metaphysics, which deals with the nature of reality, of what exists. This fundamental branch is concerned with analyzing various types or modes of existence, often with special attention to the relations between particulars and universals, between intrinsic and extrinsic properties, and between essence and existence. The traditional goal of ontological inquiry is to divide the world "at its joints", to discover those fundamental categories, or kinds, into which the world's objects naturally fall.
During the second half of the 20th century, philosophers extensively debated the possible methods or approaches to building ontologies, without actually building any very elaborate ontologies themselves.
By contrast, computer scientists were building some large and robust ontologies with comparatively little debate over how they were built.
Since the mid-1970s, researchers in artificial intelligence have recognized that capturing knowledge is the key to building large and powerful AI systems. AI researchers argued that they could create new ontologies as computational models that enable certain kinds of automated reasoning. In the 1980s, the AI community began to use the term ontology to refer both to a theory of a modeled world and to a component of knowledge systems. Some researchers, drawing inspiration from philosophical ontologies, viewed computational ontology as a kind of applied philosophy [7].
In the early 1990s, the widely cited web page and paper "Toward Principles for the Design of Ontologies Used for Knowledge Sharing" by Tom Gruber [8] gave a deliberate definition of ontology as a technical term in computer science. Gruber introduced the term to mean a specification of a conceptualization; that is, an ontology is a description, like a formal specification of a program, of the concepts and relationships that can exist for an agent or a community of agents. This definition is consistent with the usage of ontology as a set of concept definitions, but more general, and it is a different sense of the word than its use in philosophy.
Ontologies are often equated with taxonomic hierarchies of classes, class definitions, and the subsumption relation, but ontologies need not be limited to these forms. Nor are ontologies limited to conservative definitions, that is, definitions in the traditional logic sense that only introduce terminology and do not add any knowledge about the world.
To specify a conceptualization, one needs to state axioms that do constrain the possible interpretations of the defined terms [9].

IV. ORC
Vertical search, or domain-specific search, part of a larger sub-grouping known as "specialized" search, is a relatively new tier of the Internet search industry, consisting of search engines that focus on specific slices of content. The content in special focus may be selected by topicality or by information type. For example, an intelligent medical search engine is clearly specialized in terms of its topical focus, whereas a video search engine seeks results within content in a video format. Vertical search may thus focus on all manner of differentiating criteria, such as particular locations, multimedia object types and so on.
Broad-based search engines such as Google or Yahoo fetch very large numbers of documents using a Web crawler. Another program, called an indexer, then reads these documents and creates a search index based on the words contained in each document. Each search engine uses a proprietary algorithm to create its indexes so that, ideally, only meaningful results are returned for each query.
Vertical search engines, on the other hand, send their spiders out to a highly refined database, and their indexes contain information about a specific topic. As a result, they are of most value to people interested in a particular area, and the companies that advertise on them reach a sharply focused audience. For example, there are search engines for veterinarians, doctors, patients, job seekers, house hunters, recruiters, travelers and corporate purchasers, to name a few.
Consequently, for a given keyword, the crawlers of a vertical search engine usually explore a broader set of documents than those of general search engines. In this section, the Ontology Relation Crawler, ORC for short, is discussed.
Its distinguishing feature is contextual sensitivity, which is based on mining co-occurrence relations between ontologies.

A. Ontology Organization
In ORC, the concepts described by ontologies are organized as a network, as shown in Figure 1, which depicts an example set of ontologies about science. Every ontology is denoted by a frame and connected to others by curves. These curves represent the relations between the ontologies. Each curve carries a coefficient, called the support, which quantifies the co-occurrence relation between the two ontologies it connects.

Figure 1. Example of ontologies in ORC

The supports of the ontology co-occurrence relations can be computed by frequent pattern mining, a data mining method. The support is easy to understand: if the crawler has explored N documents and s of them contain both ontology A and ontology B, then the support of the pattern A ∧ B is s/N. All supports are updated periodically by ORC using a frequent pattern mining method modified from FP-growth.

B. ORC Algorithm
Figure 2. ORC Algorithm

The ORC algorithm is an iterative process, as shown in Figure 2. Every web document online is identified by a Uniform Resource Locator (URL). The domain ontology set O is preset by experts. In step 7, the current document is indexed under the ontologies found in it.
A template database D is maintained by ORC; it records the co-occurrence relations of ontologies. The supports are updated in steps 11 and 12 by incremental frequent pattern mining over D. The CreateChild function in step 14 defines the next exploration directions in two ways: 1) the links of the current document; 2) the indexed documents that contain co-occurring ontologies.

C. Rank Algorithm
Figure 3. Rank Algorithm

The contextual sensitivity of ORC comes from its ontology network based on co-occurrence relations. It is realized by the rank algorithm shown in Figure 3.
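The co-occurrence support defined above (s of N crawled documents containing both ontologies) and a support-based document score can be computed in a few lines. This is an illustrative reduction, not the paper's implementation, which uses a modified FP-growth over a template database; all ontology names and documents below are invented:

```python
# Count ontology co-occurrences over crawled documents, compute the
# support s/N of each pair, and score a document against a query
# ontology by summing the supports of its links to the query.
from collections import Counter
from itertools import combinations

docs = [                       # ontologies indexed per crawled document
    {"physics", "astronomy"},
    {"physics", "chemistry"},
    {"physics", "astronomy", "telescope"},
    {"biology", "chemistry"},
]

pair_counts = Counter()
for onts in docs:
    for a, b in combinations(sorted(onts), 2):
        pair_counts[(a, b)] += 1

N = len(docs)

def support(a, b):
    """Fraction of documents containing both a and b: s/N."""
    return pair_counts[tuple(sorted((a, b)))] / N

def rank(doc_onts, query):
    """Score a document by summed support between its ontologies and the query."""
    return sum(support(query, o) for o in doc_onts if o != query)

print(support("physics", "astronomy"))           # 2 of 4 documents
print(rank({"physics", "astronomy"}, "telescope"))
```

A real crawler would update `pair_counts` incrementally as documents arrive, which is where the FP-growth-style mining the paper mentions comes in.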
In step 5, the set of ontologies relevant to the keyword is found for a given document; only relevant ontologies are considered. The document's rank is then assessed in step 8 from its degree of relation to the keyword, using the supports in the network. Finally, the rank list is adjusted for each document in step 9.

V. A PRIMITIVE WEB SEARCH ENGINE
The ORC algorithm is used in a vertical search engine for popular science. The system is implemented with Visual Studio 2008, mainly in C#. Its aim is to popularize science to citizens conveniently. The host is configured with a 2.4 GHz CPU and 4 GB of RAM.
TEXTTOONTO [11] was first adopted to build the ontology database automatically, and the database was then refined by experts in popular science. As shown in Figure 4, the ontology database is the foundation of the deep search engine for popular science. The initial HTML database comprised 10000 HTML documents about popular science. After 138 hours of computing, ORC produced a URL index database of 15 million entries over 0.1 million ontologies. The user interface component calls the search function over the URL index database to return what users want. For general queries, the system usually obtains contextually deeper and broader results than Google and Yahoo, especially in Chinese.

Figure 4. The deep search engine of popular science

VI. CONCLUSION
To attack the problem of correctness and recall in vertical web search, a novel web crawler, ORC, is proposed in this paper. It introduces ontologies, on which semantic concepts are defined. The relations between ontologies are exploited to evaluate the importance of a web document's context. These ontologies are organized in a network structure that is updated periodically. Thanks to ontology relation mining, the crawling space is both expanded and better focused.
At the end of this paper, a primitive vertical search system built on ORC is presented.

ACKNOWLEDGMENT
This work was supported in part by the Systems Engineering Study Center, Beijing.

REFERENCES
[1] M. Chau and H. Chen, "Comparison of three vertical search spiders," Computer, 2003, vol. 36, pp. 56-62.
[2] S. A. McIlraith, T. C. Son and H. Zeng, "Semantic Web services," IEEE Intelligent Systems, 2001, vol. 16, pp. 46-53.
[3] T. Gruber, "A translation approach to portable ontology specifications," Knowledge Acquisition, 1993, vol. 5, pp. 199-220.
[4] P. Lyman and H. R. Varian, "How Much Information," /how-much-info-2003, 2003.
[5] N. Gandal, "The dynamics of competition in the internet search engine market," International Journal of Industrial Organization, 2001, vol. 19, pp. 1103-1117, doi:10.1016/S0167-7187(01)00065-0.
[6] M. Kantardzic, "Data Mining: Concepts, Models, Methods, and Algorithms," John Wiley & Sons, 2003.
[7] T. Gruber, "Ontology," in the Encyclopedia of Database Systems, Ling Liu and M. Tamer Özsu (Eds.), Springer-Verlag, 2008.
[8] T. R. Gruber, "Toward Principles for the Design of Ontologies Used for Knowledge Sharing," International Journal of Human-Computer Studies, 1995, vol. 43, pp. 907-928.
[9] T. R. Gruber, "A translation approach to portable ontologies," Knowledge Acquisition, 1993, vol. 5, pp. 199-220.
[10] J. Han, "Data Mining and Knowledge Discovery," Springer Netherlands, 2004.
[11] S. Bloehdorn, A. Hotho and S. Staab, "An Ontology-based framework for text mining," Journal for Computational Linguistics and Language Technology, 2005, vol. 20, pp. 1-20.
A Survey of Ontology Research
Acta Scientiarum Naturalium Universitatis Pekinensis (Journal of Peking University, Natural Science Edition), Vol. 38, No. 5 (Sep. 2002). Review.
A Survey of Ontology Research 1)
Deng Zhihong 2), Tang Shiwei 3), Zhang Ming 2), Yang Dongqing 2), Chen Jie 3)
(2) Department of Computer Science, Peking University; 3) State Key Laboratory for Visual and Auditory Information Processing, Peking University; Beijing 100871)
1) Supported by the National Key Basic Research and Development Program (973 Program), Grant G1999032705. Received 2001-09-11; revised 2002-03-28.
Abstract: An ontology is a conceptual model that describes concepts and the relations between them, expressing the semantics of concepts through those relations.
As an effective model for representing concept hierarchies and semantics, ontology has been widely applied in many fields of computer science.
This paper surveys the current state of ontology research and application, systematically covering the definition of ontology, theoretical research on ontology, its applications in information systems, and its role in the Semantic Web.
Keywords: ontology; information systems; Semantic Web; XML; RDF
CLC number: TP 301; TP 391

0. Introduction
In recent years, as the demands of computer applications have kept growing, computer science and technology have been developing rapidly.
Yet alongside this rapid development come various difficulties.
The main ones include knowledge representation, information organization and software reuse.
In particular, with the rapid growth of the Internet, how to organize, manage and maintain massive information in the face of an ocean of data, and how to provide users with effective services, have become important and urgent research topics.
To meet these demands, ontology, as a conceptual modeling tool that can describe information systems at the semantic and knowledge level, has attracted the attention of many researchers since it was proposed, and has been applied widely across computing: knowledge engineering, digital libraries, software reuse, information retrieval, processing of heterogeneous information on the Web, the Semantic Web, and so on.
This paper presents a systematic analysis of ontology and its related applications and research, in the hope of stimulating further work by colleagues in related fields.
Ontology-based service discovery system and method
Patent title: Ontology-based service discovery system and method for ad hoc networks
Inventors: Young Gook Ha, Joo Chan Sohn, Ho-Sang Ham
Application number: US10867792; filed 2004-06-16
Publication number: US20050138173A1; published 2005-06-23
Abstract: An ontology-based ad hoc service discovery system includes a local service cache, a cache manager, a service description unit, a query processor, a service semantic inference unit and a node daemon. The local service cache restores a service ontology by collecting class information of all services advertised on an ad hoc network and stores the service ontology. The cache manager manages the local service cache and performs various preset operations on the cache. The service description unit stores a description of a corresponding service for use in initializing the local service cache. The query processor starts performing a semantic-based service query protocol upon receiving a service query from a user or an application program. The service semantic inference unit inspects whether the service query transmitted from a client is coincident with the content of the service. The node daemon performs a service cache synchronization protocol with neighboring nodes.
Applicants: Young Gook Ha, Joo Chan Sohn, Ho-Sang Ham
Address: Daejeon, KR
Nationality: KR
Oracle BPM Suite: an introduction to Oracle Corporation's business process management tools
An Ontological Approach to Oracle BPM
Jean Prater, Ralf Mueller, Bill Beauregard
Oracle Corporation, 500 Oracle Parkway, Redwood City, CA 94065, USA
**********************, ***********************, *****************************

The Oracle Business Process Management (Oracle BPM) Suite is composed of tools for business analysts and developers for the modeling of business processes in BPMN 2.0 (an OMG standard), business rules, human workflow, complex events, and many other tools. BPM operates using the common tenets of an underlying Service Oriented Architecture (SOA) runtime infrastructure based on the Service Component Architecture (SCA). Oracle Database Semantic Technologies provides native storage, querying and inferencing that are compliant with W3C standards for semantic (RDF/OWL) data and ontologies, with scalability and security for enterprise-scale semantic applications.
Semantically enabling all artifacts of BPM, from the high-level design of a business process diagram to the deployment and runtime model of a BPM application, promotes continuous process refinement, enables comprehensive impact analysis and prevents unnecessary proliferation of processes and services. This paper presents the Oracle BPM ontology based upon BPMN 2.0, the Service Component Architecture (SCA) and the Web Ontology Language (OWL 2). The implementation of this ontology supports a wide range of use cases in the areas of process analysis, governance, business intelligence and systems management.
It also has the potential to bring together stakeholders across an enterprise, for a truly agile end-to-end enterprise architecture. Example use cases are presented, as well as an outlook on the evolution of the ontology to cover the organizational and social aspects of business process management.

1. Introduction
In the 1968 film 2001: A Space Odyssey, the movie's antagonist, HAL, is a computer capable not only of speech, speech recognition, and natural language processing, but also lip reading, apparent art appreciation, interpreting and reproducing emotional behavior, reasoning, and playing chess, all while maintaining the systems on an interplanetary mission. While the solution we present in this paper does not possess all of the capabilities of HAL, combining semantic technology with Oracle BPM provides the ability to define contextual relationships between business processes, along with the tools to use that context, so that 'software agents' (programs working on behalf of people) can find the right information or processes and make decisions based on the established contextual relationships.
Organizations can more efficiently and effectively optimize their information technology resources through a service-oriented approach that leverages common business processes and semantics throughout the enterprise. The challenge with applications built on Business Process Management (BPM) and Service Oriented Architecture (SOA) technology, however, is that many comprise numerous artifacts spanning a wide range of representation formats. BPMN 2.0, the Service Component Architecture assembly model, Web Service definitions (in the form of WSDL), and XSLT transformations, for example, are all based on well-defined but varying type models.
To answer even simple queries over the entire BPM model, a user is otherwise left with a multitude of APIs and technologies, making the exercise difficult and highly complicated. Oracle has developed an ontology in OWL that encompasses all the artifacts of a BPM application and is stored in Oracle Database Semantic Technologies, which provides a holistic view of the entire model and a unified, standardized way to query it using SPARQL.
Oracle is actively involved in the standards process and is leading industry efforts to use ontologies for metadata analysis. Oracle is also investigating the integration of organizational and social aspects of BPM using FOAF (the Friend of a Friend project). BPMN 2.0 task performers can be associated with a FOAF Person, Group or Organization and then used in Social Web activities to enable business users to collaborate on BPM models.

1.1 Benefits
The benefits of adding semantic technology to the database and to business process management in the middleware, driven by an underlying ontology, are threefold:
1. It promotes continuous process refinement. A less comprehensive process model can evolve into a complete executable process in the same model.
2. It makes it easy to analyze the impact of adding, modifying or deleting processes and process building blocks on existing processes and web services.
3. It helps prevent unnecessary proliferation of processes and services.
Combining semantic technology and business process management allows business users across organizational boundaries to find, share, and combine information and processes more easily by adding contextual relationships.

1.2 Customer Use Case
The US Department of Defense (DoD) is leading the way in the Federal Government for architecture-driven business operations transformation. A vital tenet for success is ensuring that business process models are based on a standardized representation, thus enabling the analysis and comparison of end-to-end business processes.
This will lead to the reuse of the most efficient and effective process patterns (a style guide), comprised of elements (primitives), throughout the DoD Business Mission Area. A key principle in DoD business transformation is its focus on data ontology. The Business Transformation Agency (BTA), under the purview of the Deputy Chief Management Officer (DCMO), has been at the forefront of efforts to develop a common vocabulary and processes in support of business enterprise interoperability through data standardization. The use of primitives and the reuse of process patterns will reduce waste in overhead costs and process duplication, and in building and maintaining enterprise architectures. By aligning the Department of Defense Architecture Framework 2.0 (DoDAF 2.0) with Business Process Modeling Notation 2.0 (BPMN 2.0) and partnering with industry, the BTA is accelerating the adoption of these standards to improve government business process efficiency.

2. The Oracle BPM Ontology
The Oracle BPM ontology encompasses and expands the BPMN 2.0 and SCA ontologies. It is stored in Oracle Database Semantic Technologies and creates a composite model by establishing relationships between the OWL classes of the BPMN 2.0 ontology and the OWL classes of the SCA runtime ontology. For example, the BPMN 2.0 Process, User Task and Business Rule Task are mapped to components in the composite model. Send, Receive and Service Tasks, as well as Message Events, are mapped to appropriate SCA Services and References, and appropriate connections are created between the composite model artifacts.
Figure 1 illustrates the anatomy of the Business Rule Task "Determine Approval Flow" that is a part of a Sales Quote demo delivered with BPM Suite.

Figure 1: Anatomy of a BPMN 2.0 Business Rule Task (visualized using TopBraid Composer)

The diagram shows that the Business Rule Task "Determine Approval Flow" is of BPMN 2.0 type Business Rule Task and is implemented by an SCA Decision Component that is connected to a BPMN Component "RequestQuote". Also of significance is that the Decision Component exposes a Service that refers to a specific XML-Schema, which is also referred to by Data Objects in the BPMN 2.0 process RequestQuote.bpmn.

Note: See /products/BEA_6.2/BEA/products/2009-04-27 Primitives Guidelines for Business Process Models (DoDAF OV-6c).pdf

An Ontological Approach to Oracle BPM, by Jean Prater, Ralf Mueller, Bill Beauregard

3. An Ontology for BPMN 2.0

With the release of the OMG BPMN 2.0 standard, a format based on XMI and XML-Schema was introduced for the Diagram Interchange (DI) and the Semantic Model. Based on the BPMN 2.0 Semantic Model, Oracle created an ontology that is comprised of the following:

- OWL classes and properties for all BPMN 2.0 elements that are relevant for the Business Process Model. The OWL classes, whenever possible, follow the conventions in the BPMN 2.0 UML meta model. OWL properties and restrictions are included by adding all of the data and object properties according to the attributes and class associations in the BPMN 2.0 model.
- OWL classes and properties for instantiations of a BPMN 2.0 process model. These OWL classes cover the runtime aspects of a BPMN 2.0 process when executed by a process engine. The process engine creates BPMN 2.0 flow element instances when the process is executed. Activity logging information is captured, including timestamps for a flow element instance's activation and completion, as well as the performer of the task.
The implicit (unstated) relationships in the Oracle BPM ontology can be automatically discovered using the native inferencing engine included with Oracle Database Semantic Technologies. The explicit and implicit relationships in the ontology can be queried using Oracle Database Semantic Technologies' support for SPARQL (pattern-matching queries) and/or mixed SPARQL-in-SQL queries [6]. Example SPARQL queries are shown below.

Select all User Tasks in all Lanes:

    select ?usertask ?lane
    where { ?usertask rdf:type bpmn:UserTask .
            ?usertask bpmn:inLane ?lane }

Select all flow elements with their sequence flow in lane p1:MyLane (a concrete instance of RDF type bpmn:Lane):

    select ?source ?target
    where { ?flow bpmn:sourceFlowElement ?source .
            ?flow bpmn:targetFlowElement ?target .
            ?target bpmn:inLane p1:MyLane }

Select all activities in process p1:MyProcess that satisfy SLA p1:MySLA:

    select ?activity ?activityInstance
    where { ?activity bpmn:inProcess p1:MyProcess .
            ?activityInstance obpm:instanceOf ?activity .
            ?activityInstance obpm:meetSLA p1:MySLA }

Notes:
- All of the classes of the BPMN 2.0 meta model that exist for technical reasons only (modelling m:n relationships or special containments) are not represented in the ontology.
- The work in [2] describes an ontology based on BPMN 1.x, for which no standardized meta model exists.
- Oracle formulated SPARQL queries for envisioned use cases and added additional properties and restrictions to the ontology to support those use cases.

A unique capability of BPMN 2.0, as compared to BPEL, for instance, is its ability to promote continuous process refinement. A less comprehensive process model, perhaps created by a business analyst, can evolve into a complete executable process that can be implemented by IT in the same model.
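At bottom, SPARQL queries like the ones above perform triple-pattern matching. As a hedged illustration (not Oracle's implementation; the task and lane names are hypothetical), the first query can be emulated over an in-memory triple set in plain Python:

```python
# Minimal sketch of the triple-pattern matching behind the first SPARQL
# query above ("all User Tasks in all Lanes"). The individuals such as
# "ApproveQuote" are invented for illustration.
TRIPLES = {
    ("ApproveQuote", "rdf:type", "bpmn:UserTask"),
    ("ApproveQuote", "bpmn:inLane", "SalesLane"),
    ("EnterQuote",   "rdf:type", "bpmn:UserTask"),
    ("EnterQuote",   "bpmn:inLane", "SalesLane"),
    ("CheckCredit",  "rdf:type", "bpmn:ServiceTask"),
}

def user_tasks_in_lanes(triples):
    """Emulates: SELECT ?usertask ?lane WHERE {
         ?usertask rdf:type bpmn:UserTask .
         ?usertask bpmn:inLane ?lane }"""
    results = []
    for s, p, o in triples:
        if p == "rdf:type" and o == "bpmn:UserTask":
            # join on the shared subject, as a SPARQL engine would
            for s2, p2, lane in triples:
                if s2 == s and p2 == "bpmn:inLane":
                    results.append((s, lane))
    return sorted(results)

print(user_tasks_in_lanes(TRIPLES))
# [('ApproveQuote', 'SalesLane'), ('EnterQuote', 'SalesLane')]
```

A real engine evaluates such basic graph patterns over indexed storage rather than nested loops, but the join-on-shared-variable logic is the same.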
The work cited in Validating Process Refinement with Ontologies [4] suggests an ontological approach for the validation of such BPMN process refinements.

4. An Ontology for the SCA Composite Model

The SCA composite model ontology represents the SCA assembly model and is comprised of OWL classes for Composite, Component, Service, Reference and Wire, which form the major building blocks of the assembly model. The Oracle BPM ontology has OWL classes for concrete services specified by WSDL and data structures specified by XML-Schema. The transformation of the SCA assembly model to the SCA ontology includes creating finer-grained WSDL and XML-Schema artifacts to capture the dependencies and relationships between concrete WSDL operations and messages and the elements of some XML-Schema and its imported schemata.

The SCA ontology was primarily created for the purpose of governance and to act as a bridge between the Oracle BPM ontology and an ontology that would represent a concrete runtime infrastructure. This enables the important ability to perform impact analysis: to determine, for instance, which BPMN 2.0 data objects and/or data associations are impacted by the modification of an XML-Schema element, or which Web Service depends on this element. This feature helps prevent the proliferation of new types and services, and allows IT to ascertain the impact of an XML-Schema modification.

5. The Technologies

As part of the customer use case referenced in section 1.2 above, we implemented a system that takes a BPM project comprised of BPMN 2.0 process definitions, an SCA assembly model, WSDL service definitions, XML-Schema and other metadata, and creates appropriate semantic data (RDF triples) for the Oracle BPM ontology.
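The impact analysis described for the SCA ontology amounts to walking reverse dependencies: everything that directly or transitively depends on a changed artifact is impacted. The sketch below is a simplified illustration under assumed artifact names (Quote.xsd, QuoteService.wsdl, and so on), not the actual Oracle implementation:

```python
from collections import deque

# Hypothetical dependency edges: "X depends on Y" means a change to Y
# impacts X. The artifact names are invented for illustration.
DEPENDS_ON = {
    "QuoteService.wsdl":   ["Quote.xsd"],
    "RequestQuote.bpmn":   ["QuoteService.wsdl", "Quote.xsd"],
    "ApprovalRules.rules": ["Quote.xsd"],
}

def impacted_by(artifact, edges):
    """Breadth-first walk of reverse dependencies: everything that
    directly or transitively depends on `artifact`."""
    reverse = {}
    for src, targets in edges.items():
        for t in targets:
            reverse.setdefault(t, []).append(src)
    seen, queue = set(), deque([artifact])
    while queue:
        node = queue.popleft()
        for dependant in reverse.get(node, []):
            if dependant not in seen:
                seen.add(dependant)
                queue.append(dependant)
    return sorted(seen)

print(impacted_by("Quote.xsd", DEPENDS_ON))
# ['ApprovalRules.rules', 'QuoteService.wsdl', 'RequestQuote.bpmn']
```

In the paper's setting the same traversal would run as a SPARQL property path or via inferencing over the stored triples rather than over an in-memory dictionary.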
The generated triples were then loaded into Oracle Database Semantic Technologies [3] and a SPARQL endpoint was used to accept and process queries.

6. Conclusion

The Oracle BPM ontology encompasses and expands the generic ontologies for BPMN 2.0 and the SOA composite model to cover all artifacts of a BPM application, from a potentially underspecified process model in BPMN 2.0 down to the XML-Schema element and type level at runtime, for process analysis, governance and Business Intelligence. (A BPMN 2.0 model element is considered underspecified if it is valid but not all attribute values relevant for execution are specified.) The combination of RDF/OWL data storage, inferencing and SPARQL querying, as supported by Oracle Database Semantic Technologies, provides the ability to discover implicit relationships in data and to find implicit and explicit relationships with pattern-matching queries that go beyond the classical approaches of XML-Schema, XQuery and SQL.

7. Acknowledgements

We'd like to thank Sudeer Bhoja, Linus Chow, Xavier Lopez, Bhagat Nainani and Zhe Wu for their contributions to the paper and valuable comments.

8. References

[1] Business Process Model and Notation (BPMN) Version 2.0, /spec/BPMN/2.0/
[2] Ghidini C., Rospocher M., Serafini L.: BPMN Ontology, https://dkm.fbk.eu/index.php/BPMN_Ontology
[3] Oracle Database Semantic Technologies, /technetwork/database/options/semantic-tech/
[4] Ren Y., Groener G., Lemcke J., Tirdad R., Friesen A., Yuting Z., Pan J., Staab S.: Validating Process Refinement with Ontologies
[5] Service Component Architecture (SCA)
[6] Kolovski V., Wu Z., Eadon G.: Optimizing Enterprise-Scale OWL 2 RL Reasoning in a Relational Database System, ISWC 2010, pp. 436-452
[7] "Use of End-to-End (E2E) Business Models and Ontology in DoD Business Architectures"; Memorandum from Deputy Chief Management Office; April 4, 2011, Elizabeth A. McGrath, Deputy DCMO
[8] "Primitives and Style: A Common Vocabulary for BPM across the Enterprise"; Dennis Wisnosky, Chief Architect & CTO ODCMO, and Linus Chow, Oracle; BPM Excellence in Practice 2010; Published by Future Strategies, 2010
Academic Traditions of Chinese Document Systematization and Their Reference Value for the Digitization of Ancient Chinese Books (III): Textual Research on the Origin and Evolution of Ancient Books' Editions
Document Research. * This paper is an outcome of the National Social Science Fund project "Research on the Value Realization of Academic Traditions of Document Systematization in the Digitization of Ancient Books" (project no. 17BTQ009).
Abstract: Textual research on the origin and evolution of ancient books' editions is an excellent tradition in Chinese document systematization. Through an investigation of the inheritance relationships among the various editions of the same book in different dynasties, rare editions closely related to the original (ancestral) edition can be discovered and recommended from among the surviving copies. As for the digitization of ancient Chinese books, textual research on the origin and evolution of editions can be helpful for choosing excellent master copies; in addition, previous results of such textual research can be digitized and processed as data along with the texts proper. Taking the Song-dynasty poet Mei Yaochen's Wanling Collection as an example, this paper uses the OWL ontology language to design the structure of a knowledge base of ancient book edition lineages aimed at professional researchers, completes ontology modeling based on the RDF resource description framework with the ontology development tool Protégé, and uses the OntoGraf plug-in to present the book's edition lineage in a visualized form.
Keywords: document systematization; academic traditions; digitization of ancient Chinese books; origin and evolution of ancient books' editions; study of ancient books' editions

Citation: LI Mingjie, LU Tong, GAO Xiaowen. Academic Traditions of Chinese Document Systematization and Their Reference Value for the Digitization of Ancient Chinese Books (III): Textual Research on the Origin and Evolution of Ancient Books' Editions [J]. Library Tribune, 2021, 41(5): 108-117.

The term "edition lineage" (the origin and evolution of editions) has two senses. In the broad sense, it refers to the evolution of the ways documents were produced, such as the lineages of manuscripts, rubbings, block-printed editions and lithographic editions. In the narrow sense, it refers to the intricate edition inheritance relationships that a single work (including collectanea) forms as it is transmitted through copying, reprinting and the like.
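The narrow sense of edition lineage maps naturally onto "derived from" relations like those modeled in the paper's OWL ontology. A minimal sketch follows, with hypothetical edition names rather than the actual editions of the Wanling Collection documented in the paper:

```python
# Hypothetical edition-lineage store: each "derived_from" entry links an
# edition to its immediate source; lineage() walks back to the earliest
# known (ancestral) edition. Edition names are invented for illustration.
DERIVED_FROM = {
    "Ming_woodblock_1545":   "Song_print_1150",
    "Qing_reprint_1780":     "Ming_woodblock_1545",
    "Modern_facsimile_1922": "Qing_reprint_1780",
}

def lineage(edition, derived_from):
    """Return the chain from `edition` back to the earliest known edition."""
    chain = [edition]
    while edition in derived_from:
        edition = derived_from[edition]
        chain.append(edition)
    return chain

print(lineage("Modern_facsimile_1922", DERIVED_FROM))
# ['Modern_facsimile_1922', 'Qing_reprint_1780', 'Ming_woodblock_1545', 'Song_print_1150']
```

In the RDF version described in the paper, each hop would be a triple such as `:Qing_reprint_1780 :derivedFrom :Ming_woodblock_1545`, and OntoGraf renders the resulting graph visually.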
1 Abstract

In this article an approach to the problem of associating documents with a knowledge base is demonstrated in a real-world application. It is based on a combination of annotating documents with concepts from a knowledge base and grouping documents together into clusters. Our knowledge base is an ontology provided by a dedicated ontology server.

2 Introduction

The WWW has lately become the most important communication medium. There are many reasons for this, but the fact is that most people access information on the Internet using web services. Usually, the WWW provides one-way communication from publisher to user. In this case we meet the problem of a huge amount of unstructured information, in which it is not easy to find a relevant document. This is a well-known problem for which many techniques are being developed, such as intelligent search engines or the ambitious Semantic Web initiative.

However, the WWW can also be used successfully in two-way communication between two sides. Such communication involves discussion, polling, chat, predefined reports, questionnaires, query systems etc., and of course classical publishing. Here the problem of too much information arises again, but a new requirement appears in addition. We not only want to avoid being lost in the available information space, but we also want the system to manage our communication, make suggestions, and select or notify the right agent (usually a person) on the other side, so that the communication is efficient. A user-friendly and intelligent communication environment is very important if we want people to regularly visit our site or even to be able to use it. Webocrat is a web-based system supporting direct participation of citizens in democratic processes, which is being developed within the Webocracy project.
The project partners are the University of Technology in Košice, Slovakia; the University of Wolverhampton, UK; the University of Essen, Germany; JUVIER s.r.o., Slovakia; CITEC Engineering Oy Ab, Finland; City Ward Tahanovce, Slovakia; City Ward Furca, Slovakia; and Wolverhampton Metropolitan Borough Council, UK.

From the point of view of the functionality of the system it is possible to break the system down into several parts and/or modules (Mach et al 2001). They can be represented in a layered, sandwich-like structure, which is depicted in Figure 1.

1 Technical University of Kosice, Dept of Cybernetics and Artificial Intelligence

Figure 1: System structure from the point of view of the system's functionality

The central part of this structure is occupied by a knowledge model (KM) module. This system component contains one or more ontological domain models providing a conceptual model of a domain. The purpose of this component is to index all information stored in the system in order to describe the context of this information (in terms of domain-specific concepts). The central position symbolises that the knowledge model is the core (heart) of the system: all parts of the system use this module in order to deal with the information stored in the system (both for organising this information and for accessing it).

Information stored within the system has the form of documents of different types. Since three main document types are expected to be processed by the system, the document space can be divided into three subspaces: the publishing space, the discussion space, and the opinion polling space.
These subspaces contain, respectively, published documents expected to be read by users, users' contributions to discussions on different topics of interest, and records of users' opinions about different issues.

Documents stored in these three document subspaces can be interconnected with hypertextual links: they can contain links to other documents stored in the same subspace, to documents located in another subspace, and to documents outside of the system. Thus, documents within the system are organised in a net-like structure. Moreover, documents located in these subspaces should contain links to elements of a domain model.

Since each document subspace requires a different way of manipulating documents, three of the system's modules are dedicated to them.

An Ontology based document management
Jan Hreno 1 and Robert Kende 1

The Web content management (WCM) module offers means to manage the publishing space. It enables documents to be prepared for publication (e.g. to be linked to elements of a domain model), to be published, and to be accessed after they are published. The discussion space is managed by the discussion forum (DF) module. The module enables users to contribute to discussions they are interested in and/or to read contributions submitted by other users. The opinion polling room (OPR) module represents a tool for performing opinion polling on different topics. Users can express their opinions in the form of polling, selecting the alternatives they prefer.

In order to navigate among the information stored in the system in an easy and effective way, one more layer has been added to the system. This layer is focused on retrieving relevant information from the system in various ways. It is represented by two modules, each enabling easy access to the stored information in a different way. The citizens' information helpdesk (CIH) module is dedicated to search. It represents a search engine based on the indexing of stored documents.
Its purpose is to find all those documents which match the user's requirements expressed in the form of a query.

The other module performing information retrieval is the Reporter (REP) module. This module provides information of two types. The first type represents information in an aggregated form: it enables different reports concerning information stored in the system to be defined and generated. The other type is focused on providing particular documents, but unlike the CIH module it is oriented towards an off-line mode of operation. It monitors the content of the document space on behalf of the user, and if information the user may be interested in appears in the system, it sends an alert to him/her.

The upper layer of the presented functional structure of the system is represented by a user interface. It integrates the functionality of all the modules accessible to a particular user into one coherent portal and provides access to all functions of the system in a uniform way.

3 Using domain model in Webocrat

3.1 Annotation

To give a system some kind of intelligence, it must know the meaning of a document, i.e. its semantics. Standard HTML pages contain almost unstructured information that is understandable only by humans, not by computers. There is no way to tell the computer that an article is about cars unless it contains the word "car" explicitly or semantic analysis is applied. The solution is to annotate the document. This means that explicit information about its meaning is attached to it, whether manually or automatically. Thus, the system can extract relevant information from every annotated document and use it in some intelligent task like searching. The Semantic Web initiative is based on this method. It gives proposals and suggestions for annotating HTML pages, using special meta-tags and XML. The implicit (tacit) information about a document in those tags is not visible to the end user; it is only used by the system.
In knowledge engineering this information is called meta-knowledge. There are many ways to store meta-knowledge; it need not be in meta-tags (which is not technically possible with MS Word documents, for example), but can be stored in special files or databases. Based on meta-knowledge one can perform intelligent retrieval, which gives more relevant results than pure full-text search.

Meta-knowledge can be of two types:

1. A list of keywords or a description in natural language. The document is enriched with a kind of thesaurus here. Full-text search is also performed on this part, giving more precise results.
2. A link to a concept in a predefined vocabulary. This method assumes that there exists some vocabulary of terms or concepts used in the area of our interest. More about this in the next section.

In our work, we concentrated our effort on annotating electronic documents (in our case, any document published in the WCM system) by linking them, together with other relevant documents, to relevant concepts from the knowledge base (in our case, an ontology). It is based on grouping together relevant documents and concepts from the ontology. Such a group of documents and concepts we call an Association. Every association has its name, description, and some other attributes needed later for document retrieval. The basic idea can be seen in Figure 2.

3.2 Domain model

In the previous section the word vocabulary was mentioned. In the simplest case it is just a list of terms, where each term has its own description: a thesaurus. Such a structure is not satisfactory for our purposes, because it does not reflect relations among the terms. What we want is a model of the real world or a part of it. The part of the world we are interested in is called a domain, and its model is called a domain model. A domain model is based on conceptualisation. A conceptualisation is an abstract, simplified view of the world that we wish to represent for some purpose.
It consists of concepts that represent the objects of our interest in the real world and the relationships that hold among them. To formally represent a domain model we use an ontology. An ontology is an explicit specification of a conceptualisation [1].

A domain model allows the system to perform reasoning and thus to find the relevance of a document not only on a lexical but also on a semantic basis. An example of a part of an ontology is shown in Figure 3.

Figure 2: Basic idea of the associations
Figure 3: A part of a sample ontology

4 Using domain model in Webocrat

The main idea behind the whole Webocrat system is to treat documents of various types that are associated with a part of a domain model (an ontology). This way it is possible to annotate discussions, chats, reports, polls or ordinary WWW pages. By ordinary documents we mean all the documents that are published by a local authority, such as news, announcements, reports and other documents that could be interesting for the public. When they are published, they are annotated first, whether manually or semi-automatically. After that they are prepared for intelligent retrieval. When accessing information, the user can compose a query consisting of words for full-text search and of terms (concepts) used in the ontology. The use of concepts in the query ensures that its hidden meaning will also be discovered. The formulation of such a query also allows the user to define his personal profile of interest in terms of the ontology. Personalised reports and newsletters can then be automatically generated and sent to the user. The described scenario assumes that the ontology covers all relevant parts of real life concerning the structure of public institutions, communal matters, ecology etc. Figure 3 shows a sample ontology about institutions. (This is only a testing example; real-life ontologies are being developed at the time of writing this paper.)

So we showed how classical web content can be annotated for the aforementioned one-way communication.
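A query that combines full-text words with ontology concepts could work roughly as sketched below. This is a hedged illustration with an invented mini-ontology and invented documents, not the Webocrat implementation (which the paper does not detail): a document matches if it contains the query words and is annotated with the query concept or any of its subconcepts.

```python
# Tiny sample ontology: child -> parent subclass relation (hypothetical).
SUBCLASS_OF = {
    "TaxOffice": "Office",
    "Office": "Institution",
    "School": "Institution",
}

# Hypothetical annotated documents: text plus linked ontology concepts.
DOCS = {
    "notice-01": {"text": "opening hours of the tax office", "concepts": {"TaxOffice"}},
    "notice-02": {"text": "new school opening hours", "concepts": {"School"}},
}

def subconcepts(concept):
    """The concept itself plus everything below it in the hierarchy."""
    result = {concept}
    changed = True
    while changed:
        changed = False
        for child, parent in SUBCLASS_OF.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result

def search(words, concept):
    """Documents containing all query words AND annotated with the
    query concept or one of its subconcepts."""
    wanted = subconcepts(concept)
    return sorted(doc for doc, d in DOCS.items()
                  if all(w in d["text"] for w in words) and d["concepts"] & wanted)

print(search(["opening hours"], "Institution"))
# ['notice-01', 'notice-02']
```

Querying for the broad concept "Institution" retrieves the tax-office notice even though the word "institution" never appears in its text; this is the semantic gain over pure full-text search that the paper describes.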
But knowledge about the semantics of a document can also play an active role during communication. Discussions are a typical example in Webocrat. We consider a discussion as a thread of documents that are all annotated. In order to enable the retrieval of discussion contributions according to their content, it is necessary to create links to elements of a domain model when creating a new discussion. These elements will represent the topics on which the discussion will be focused. Each contribution added to this discussion later will be linked to the same elements of the domain model automatically (contributions inherit links from their discussion).

In order to enable organising contributions within the discussion not only according to the date and time of submission or the authors of submissions, it is possible to complete a contribution with a set of links. These links can be of two types: links to elements of a domain model and links to other contributions within the discussion. The former type enables the content to be defined in more detail (not only in the sense that the contribution is about exactly the same issues as the discussion as a whole); this includes not only adding more links to the set inherited from the discussion definition but also reducing this inherited set. The latter type enables the user to determine to which existing contribution(s) he/she responds. In addition, it is possible to enrich a contribution with links to documents from inside or outside of the system, e.g. when the submitters refer to those documents in their contributions.

In order to read particular contributions it is necessary to access them. The user has several possibilities for completing this task. First of all, he/she can choose from a list of all available discussions.
Another alternative is to use the linking of contributions to elements of a domain model in order to create groups of contributions dealing with the same set of issues [2].

Using links to the ontology, the system can suggest a discussion on some topic when the user reads a document on that topic. Or, when the user contributes to some discussion, the system can advise where to find more relevant information. This would be impossible without links to the domain model. Even more, when a user links his contribution to some concepts, overriding the linkage of the whole discussion, the system can automatically find a more relevant discussion, if one exists, and suggest it. Similarly, if some contributions drift further and further from the topic of the original thread, the administrator can be notified to split the discussion. The similarity of contributions is measured using the distances of the corresponding concepts in the ontology.

With this discussion example we showed how the domain model can enhance communication and how classical tools can be used more efficiently.

5 Domain model requirements

Using experience from other projects and related work with ontologies, we specified some basic attributes which we expect our ontology to have. They were as follows:

- some constant types are defined, e.g. integer, float, string, date, currency
- basic objects are classes, instances and relations
- classes can be primitive (the definition represents necessary but not sufficient conditions) or non-primitive (both sufficient and necessary)
- a class can be associated with a collection of slots
- slots with predefined semantics: documentation
- a collection of facets can be associated with a slot
- slot facets with predefined semantics (for classes only): value-type (can be a constant type, a constant expression (and, or, not) or an enumerated type), min-cardinality, max-cardinality, range (can be a constant tuple or a list of constant tuples), (not) same value as another slot has, subset-of-values as another slot has, documentation, default value, value
- an instance can inherit a collection of slots
- only one facet can be associated with a slot of an instance: the value and default value of a slot can be a constant or a set of constants
- relations can be n-ary for n = 1, 2, 3, ...
- relations are defined on basic objects
- relations can have defined attributes: inverse-relation (which relation is an inverse to this one), disjoint, covered, equivalent, transitive, symmetric, functional
- predefined relations are: instance-of (between a class and an instance; semantics: inheritance of slots (values, facets)), type-of (an inverse relation to instance-of), subclass-of (between two classes; semantics: inheritance of slots (values, facets)), superclass-of (an inverse relation to subclass-of)
- slot facet values are inherited but can be overwritten (a new value must be more constraining than the old one)
- multiple inheritance (from more than one parent) is allowed
- special classes:
  - THING represents the root of the class hierarchy: every defined class is a subclass of THING, every instance is an instance of THING, and it has a slot "documentation" with value-type STRING
  - CLASS is the class of all classes
  - INSTANCES is the class of all instances

In the current state of the project we needed to offer our partners a tool for creating and editing ontologies.
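Several of the requirements above (classes with slots, subclass inheritance of slot values, and a THING root carrying a documentation slot) can be illustrated with a toy frame model. This is an illustrative sketch under those assumptions, not the Webocrat knowledge model itself:

```python
# Toy frame model: classes with slots, slot lookup walking the
# inheritance chain, and a THING root with a "documentation" slot.
class Frame:
    def __init__(self, name, parent=None, slots=None):
        self.name = name
        self.parent = parent
        self.slots = dict(slots or {})

    def get(self, slot):
        """Look up a slot value, walking up the inheritance chain;
        a value defined lower in the hierarchy overrides the parent's."""
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.parent
        raise KeyError(slot)

# THING is the root; every class below it inherits "documentation".
thing = Frame("THING", slots={"documentation": ""})
institution = Frame("Institution", parent=thing,
                    slots={"documentation": "A public body"})
office = Frame("TaxOffice", parent=institution)  # no own slots

print(office.get("documentation"))  # A public body (inherited)
```

A full implementation would add facets (value-type, cardinality, the "more constraining than the old value" override check) and multiple inheritance, which the sketch omits for brevity.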
Because the Knowledge Module task starts later in our project, we specified some additional requirements for the knowledge editor:

- it has to be flexible, to enable later modifications of the knowledge model
- platform independence
- it should enable importing ontologies from other formats

We therefore decided to use an Open Source knowledge editor programmed in Java, and to modify it for our purposes, instead of programming a new one. The tool which best fitted almost all of our requirements seemed to be Protégé 2000 from Stanford University. Other knowledge editors we tested were OntoEdit, JOT, GEF, Apollo and SiLRI.

6 Using Protégé 2000 for creating ontologies

Protégé-2000 is the latest component-based and platform-independent generation of the ontology editor. Two goals have driven the design and development of Protégé-2000:

1. achieving interoperability with other knowledge-representation systems, and
2. being an easy-to-use and configurable knowledge-acquisition tool.

The first goal is achieved by the compatibility of the knowledge model of Protégé-2000 with OKBC (Open Knowledge Base Connectivity). As a result, Protégé-2000 users can import ontologies from other OKBC-compatible servers and export their ontologies to other OKBC knowledge servers. Protégé-2000 uses the freedom allowed by the OKBC specification to maintain the model of structured knowledge-acquisition tools and to achieve the second design goal of being a usable and extensible tool.

Protégé fitted almost all of our requirements for the knowledge editor. The only noticeable difference was in the form in which relations are represented in Protégé. Because of the freedom of the ontology specification in the Protégé knowledge model, relations are not defined as basic objects [3]. We discuss later in this article how to remedy this lack.
The other modifications we made to Protégé were:

1. Localisation of Protégé into more languages (at this time it is localised into a Slovak version).
2. Adding the ability to graphically view the class structure (Figure 4). This helps the user to easily browse an ontology in a graphical view. The graph layout is computed automatically or can be changed by the user.

Figure 4: Graphview tab for Protégé 2000

7 Representing relations in Protégé

Because relations are not basic Protégé objects, we have to model them. In the discussion within the Protégé community, four possible solutions were proposed.

Option 1: Use own slots. This is probably the easiest way to go, but it is also the most restrictive one. Here the relations are own slots on all subclasses of the class that first specified those slots. The values of the slots are the classes that they are related to in one way or another.

Advantages:
- Very easy to model.
- We already have all the interface and underlying structures for this in Protégé.

Problems:
- We cannot add additional information, such as orientation, in particular when the value of a slot is a list of classes and not a single class.

Option 2 (extension of Option 1): Use facets on own slots (own slots on own slots) to specify orientation and other additional properties.

Problems:
- Too complicated: it is hard even to explain exactly how things would work.

Option 3: Use template slots. Since slots are first-class objects in Protégé (they are themselves frames), it is easy to express attributes of relations such as reflexivity, transitivity, etc., as well as a hierarchy of relations (the same is true for Option 1).

Advantages:
- Can exploit the advantages of inheritance more extensively.
- Own slots on classes are harder to explain and understand; template slots are easier.

Problems:
- It is harder to express additional constraints on relations, such as orientation.

Option 4: Relations are themselves classes. We can go one step further and reify relations as classes themselves.
Relations between particular classes are then instances of these Relation classes.

Advantages:
- Can easily encode meta-information on relations: reflexive, transitive, inverse. All of these properties are own slots on a Relation class.
- Relations can have additional slots, such as orientation, that get instantiated when we define relations between classes.

The first advantage also carries over to most of the earlier options, with the exception that the additional information (relation attributes, hierarchy) would be on slots and not on classes, which is often harder to understand and manipulate.

Problems:
- Specialized browsing that "jumps over" a level to view hierarchies of entities based on each relation will be needed (for example, to view the part-of hierarchy).

All four options can be combined. The price for this is the loss of a uniform approach to describing properties of relations such as transitivity, inverses and so on. Option 4 looks like the most suitable one, but it would be uncomfortable for the user to define a special class for every possible type of relation. Since real applications have not been developed yet, we cannot predict the number of relations needed.

We decided on Option 3. An EXTENDED_SLOT class has been defined with new facets TRANSITIVE and DISJOINT. Other attributes can easily be added at any time. This EXTENDED_SLOT class is set to be the default, so that every new slot created on any class is a subclass of EXTENDED_SLOT and thus automatically contains the required attributes TRANSITIVE and DISJOINT. A relation between two objects is modelled as a slot, where one class of the relation contains that slot and the second class is a value of that slot. Protégé 2000 does not treat the DISJOINT or TRANSITIVE facets in any special way.
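A reasoning mechanism consuming the TRANSITIVE facet would, in essence, compute a transitive closure over the values of a slot marked transitive. A minimal sketch (the part-of example and its pairs are hypothetical, not taken from the Webocrat ontologies):

```python
# Naive transitive closure over a relation given as (x, y) pairs:
# repeatedly chain (x, y) and (y, z) into (x, z) until nothing new appears.
def transitive_closure(pairs):
    """All (x, z) implied by chaining the given (x, y) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for x, y in list(closure):
            for y2, z in list(closure):
                if y == y2 and (x, z) not in closure:
                    closure.add((x, z))
                    changed = True
    return closure

PART_OF = {("Room", "Floor"), ("Floor", "Building")}
print(sorted(transitive_closure(PART_OF)))
# [('Floor', 'Building'), ('Room', 'Building'), ('Room', 'Floor')]
```

The naive fixpoint loop is quadratic per pass; a production reasoner would use semi-naive evaluation or a graph reachability algorithm instead.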
These facets are only used by a reasoning mechanism which will be developed later and will not be a part of Protégé itself.

8 Acknowledgements

This work is done within the Webocracy project "Web Technologies Supporting Direct Participation in Democratic Processes", which is supported by the European Commission DG INFSO under the IST programme, contract no. IST-1999-20364, and within the VEGA project 1/8131/01 "Knowledge Technologies for Information Acquisition and Retrieval" of the Scientific Grant Agency of the Ministry of Education of the Slovak Republic. The content of this publication is the sole responsibility of the authors, and in no way represents the view of the European Commission or its services.

9 References

[1] Gruber, T. R. (1993): A translation approach to portable ontologies. Knowledge Acquisition, 5(2): 199-220.
[2] Mach, M.; Dridi, F.; Furdik, K. (2001): Webocrat System Architecture and Functionality. Webocracy report 2.4.
[3] Noy, N. F.; Fergerson, R. W.; Musen, M. A. (2000): The knowledge model of Protégé-2000: combining interoperability and flexibility. International Conference on Knowledge Engineering and Knowledge Management (EKAW 2000), Juan-les-Pins, France.
[4] Sabol, T.; Jackson, M.; Dridi, F.; Palola, I.; Novacek, E.; Cizmarik, T.; Thompson, P. (2001): Dissemination and Use Plan. Webocracy report 15.2.1.