Applied ontologies in the knowledge management of
Management Research Methodology - Chapter 1: Introduction

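Semantic Web
The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. It is built on the Resource Description Framework (RDF). RDF uses XML as its syntax and URIs as its naming mechanism to integrate a variety of applications, and it gives an abstract representation of data on the Web. The "semantics" in the Semantic Web are machine-processable semantics, not natural-language semantics, human reasoning, or other information that computers cannot currently process.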
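RDF's abstract data model is commonly described as a set of subject-predicate-object statements, each part named by a URI. The following is a minimal sketch of that idea in plain Python, not a real RDF library: the `example.org` URIs, the sample statements, and the `match` helper are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# One statement: (subject, predicate, object), each named by a URI string.
Triple = Tuple[str, str, str]

# Hypothetical namespace and data, for illustration only.
EX = "http://example.org/"
TRIPLES: List[Triple] = [
    (EX + "report42", EX + "author", EX + "alice"),
    (EX + "report42", EX + "topic", EX + "knowledge-management"),
    (EX + "alice", EX + "worksFor", EX + "acme"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that agree with every non-None pattern field."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Everything stated about report42, regardless of which application wrote it.
for triple in match(TRIPLES, s=EX + "report42"):
    print(triple)
```

Because every statement has the same three-part shape and globally unique names, data from different applications can be pooled into one list and queried the same way, which is the kind of sharing the paragraph above describes.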
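Implications for Management Research
• Science is theory about a particular scope under particular conditions and in a particular environment; there is no ultimate theory.
• Scientific method is the norm of a small community of scientists.
The role of research methodology
1. It provides basic norms to follow, so researchers can avoid detours and work more efficiently.
2. It makes it possible to judge the quality of other people's research.
3. Shared norms give a common language, which eases communication and supports international academic discussion.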
Lecture 1: Introduction
Trends in Management Research
A tendency to change the scientific model: the May 2005 Harvard Business Review article "How Business Schools Lost Their Way" argues that the "scientific model" prevalent in business schools should change: "This scientific model, as we call it, is predicated on the faulty assumption that business is an academic discipline like chemistry or geology. ... When applied to business - where judgments are made with messy, incomplete data - statistical and methodological wizardry can blind rather than illuminate."
Criteria of Science
Academic English (Management): Text Translations

Unit 1
When faced with both economic problems and increasing competition, not only from firms in the United States but also from international firms located in other parts of the world, employees and managers now begin to ask the question: What do we do now? Although this is a fair question, it is difficult to answer. Certainly, for a college student taking business courses or a beginning employee just starting a career, the question is even more difficult to answer. And yet there are still opportunities out there for people who are willing to work hard, continue to learn, and possess the ability to adapt to change.
Whether you want to obtain part-time employment to pay college and living expenses, begin your career as a full-time employee, or start a business, you must bring something to the table that makes you different from the next person. Employers and our capitalistic economic system are more demanding than ever before. Ask yourself: What can I do that will make employers want to pay me a salary? What skills do I have that employers need? With these questions in mind, we begin with another basic question: Why study business? There are at least four quite compelling reasons.
Literature Search Report

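Search Report
Class of 2012, School of Information Engineering   Department:   Major:   Student ID:   Name:   Grade:
I. Overview of the Search Topic
(1) Title of the search topic (in Chinese and English)
计算机元数据的数据清洗 / Cleaning Data for Computer Metadata
(2) Research status of the topic
When building an institutional repository, one important task is to normalize the DC (Dublin Core) metadata held in the temporary store of harvested metadata and to write the normalized metadata to the DC metadata center.
Because these metadata come from different processing units, they contain data-entry errors, inconsistent semantic representations, spelling mistakes, and duplicate records. Data quality varies widely, and duplicate records in particular are a serious problem that hurts recall and precision. The metadata therefore need to be cleaned before they are imported into the data center.
Outside China, research on data cleaning first appeared in the United States, beginning with the correction of errors in Social Security numbers nationwide.
The growth of the US information industry and of commerce greatly stimulated research on data cleaning techniques. That research concentrates on four areas: detecting and eliminating data anomalies, detecting and eliminating approximately duplicate records, data integration, and domain-specific data cleaning.
Research on data cleaning techniques in China is still at an early stage.
Few results address data cleaning directly, and even fewer address the cleaning of Chinese-language data.
Most work gives it only a fairly brief treatment within research on data warehouses, decision support, and data mining.
Industries that demand highly accurate customer data, such as banking, insurance, and securities, each clean their own customer data and develop software for their specific applications, but theoretical results are rarely reported.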
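Detecting approximately duplicate records, one of the four research areas listed above, can be illustrated with a small sketch. It assumes records are dictionaries with a hypothetical `title` field and uses the standard-library `difflib.SequenceMatcher` as the similarity measure; the 0.9 threshold and the sample records are illustrative choices, not values taken from the report.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def normalize(title: str) -> str:
    """Lower-case the title and collapse runs of whitespace."""
    return " ".join(title.lower().split())


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalized titles."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def near_duplicates(
    records: List[Dict[str, str]], threshold: float = 0.9
) -> List[Tuple[int, int]]:
    """Return index pairs whose titles are at least `threshold` similar."""
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if similarity(records[i]["title"], records[j]["title"]) >= threshold:
                pairs.append((i, j))
    return pairs


# Hypothetical harvested metadata records.
records = [
    {"title": "Data Cleaning for Metadata"},
    {"title": "data  cleaning for metadata"},
    {"title": "Semantic Web Overview"},
]
print(near_duplicates(records))
```

The pairwise comparison is quadratic in the number of records, so a real cleaning pipeline would likely group candidates first (for example by a sorted key) before comparing; this sketch omits that step for brevity.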
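(3) Overall search approach
Data cleaning of computer metadata is a topic that arises from practical problems: many industries and fields, such as insurance, securities, banking, and libraries, need to clean their raw data.
The disciplines involved include computer science and technology, library document retrieval, and information science.
Data cleaning techniques are developing rapidly both in China and abroad, so the literature to be searched includes journal articles, conference papers, and patents from China and abroad published from 2001 to 2012, mainly in Chinese and English.
II. Record of the Search Process
This part is the main body of the comprehensive search report. It covers searches of books, Chinese journal articles, foreign-language journal articles, dissertations, patent literature, and online resources.
It records the databases used, the years covered, the search terms, the search strategy (that is, the Boolean search expression), and the search results.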
Materials Journals Indexed in SCI (EI)

SCI(EI)收录的材料类期刊1 NATURE NATURE 自然0028-0836 27.955/2 SCIENCE SCIENCE 科学0036-8075 23.329/3 SURF SCI REP SURFACE SCIENCE REPORTS 表面科学报告0167-5729 14.091/science/journal/016757294 Prog Mater Sci Progress In Materials Science 材料科学进展0079-6425 14http//www.elsevier.nl/inca/publications/store/4/1/4/5 Prog Surf Sci Progress In Surface Science 表面科学进展0079-6816 7.96/science/journal/007968166 PHYS REV LETT PHYSICAL REVIEW LETTERS 物理评论快报0031-9007 6.668/7 MA T SCI ENG R MA TERIALS SCIENCE & ENGINEERING R-REPORTS 材料科学与工程报告0927-796X 6.143/science/journal/0927796X8 ADV POL YM SCI ADV ANCES IN POL YMER SCIENCE 聚合物科学发展0065-3195 6.053/science/journal/007967009 ADV MATER ADV ANCED MA TERIALS 先进材料0935-9648 5.579 http://www.wiley-vch.de/publish/en/journals/alphabeticIndex/2089/10 ANNU REV MATER SCI ANNUAL REVIEW OF MA TERIALS SCIENCE 材料科学年度评论0084-6600 5.405/loi/matsci?cookieSet=111 APPL PHYS LETT APPLIED PHYSICS LETTERS 应用物理快报0003-6951 3.849/aplo/12 PROG POL YM SCI PROGRESS IN POL YMER SCIENCE 聚合物科学进展0079-6700 3.738/science/journal/0079670013 CHEM MATER CHEMISTRY OF MATERIALS 材料化学0897-4756 3.69 /journals/cmatex/14 PHYS REV B PHYSICAL REVIEW B 物理评论B 0163-1829 3.07/15 ADV CHEM PHYS ADV ANCES IN CHEMICAL PHYSICS 物理化学发展0065-2385 2.828/WileyCDA/WileyTitle/productCd-0471214531.html16 J MATER CHEM JOURNAL OF MATERIALS CHEMISTRY 材料化学杂志0959-9428 2.736/is/journals/current/jmc/mappub.htm17 ACTA MATER ACTA MA TERIALIA 材料学报1359-6454 2.658 http://www.elsevier.nl/locate/actamat/18 MRS BULL MRS BULLETIN 材料研究学会(美国)公告0883-7694 2.606 /publications/bulletin/19 BIOMATERIALS BIOMA TERIALS 生物材料0142-9612 2.489 /20 CARBON CARBON 碳0008-6223 2.34/inca/publications/store/2/5/8/21 SURF SCI SURFACE SCIENCE 表面科学0039-6028 2.189/science/journal/0169433222 J APPL PHYS JOURNAL OF APPLIED PHYSICS 应用物理杂志0021-8979 2.128/japo/23 CHEM V APOR DEPOS CHEMICAL V APOR DEPOSITION 化学气相沉积0948-1907 2.123http://www.wiley-vch.de/publish/dt/24 J BIOMED MA TER RES JOURNAL OF BIOMEDICAL MA TERIALS RESEARCH 生物医学材料研究0021-9304 
2.105/cgi-bin/jhome/3072825 IEEE J QUANTUM ELECT IEEE JOURNAL OF QUANTUM ELECTRONICS IEEE量子电子学杂志0018-9197 2.086/xpl/RecentIssue.jsp?puNumber=326 CURR OPIN SOLID ST M CURRENT OPINION IN SOLID STATE & MATERIALS SCIENCE 固态和材料科学的动态1359-0286 1.92/science/journal/1359028627 DIAM RELAT MATER DIAMOND AND RELATED MA TERIALS 金刚石及相关材料0925-9635 1.902/science/journal/0925963528 ULTRAMICROSCOPY ULTRAMICROSCOPY 超显微术0304-3991 1.89 /science/journal/0304399129 EUR PHYS J B EUROPEAN PHYSICAL JOURNAL B 欧洲物理杂志B 1434-6028 1.811/app/home/main.asp30 J AM CERAM SOC JOURNAL OF THE AMERICAN CERAMIC SOCIETY 美国陶瓷学会杂志0002-7820 1.748/31 APPL PHYS A-MATER APPLIED PHYSICS A-MATERIALS SCIENCE & PROCESSING 应用物理A-材料科学和进展0947-8396 1.722/app/home/journal.asp32 NANOTECHNOLOGY NANOTECHNOLOGY 纳米技术0957-4484 1.621 /jnn/33 J V AC SCI TECHNOL B JOURNAL OF V ACUUM SCIENCE & TECHNOLOGY B 真空科学与技术杂志B 1071-1023 1.549 /jvstb/34 J MA TER RES JOURNAL OF MATERIALS RESEARCH 材料研究杂志0884-2914 1.539/publications/jmr/35 PHILOS MAG A PHILOSOPHICAL MAGAZINE A-PHYSICS OF CONDENSED MA TTER STRUCTURE DEFECTS AND MECHANICAL PROPERTIES 哲学杂志A凝聚态物质结构缺陷和机械性能物理0141-8610 1.532http///journals/36 INT J NON-EQUILIB PR INTERNATIONAL JOURNAL OF NON-EQUILIBRIUM PROCESSING 非平衡加工技术国际杂志1368-9290 1.5http://www.ifw-dresden.de/biblio/zsbestand/izs.htm37 J NEW MAT ELECTR SYS JOURNAL OF NEW MATERIALS FOR ELECTROCHEMICAL SYSTEMS 电化学系统新材料杂志1480-2422 1.478http://www.newmaterials.polymtl.ca/38 J V AC SCI TECHNOL A JOURNAL OF V ACUUM SCIENCE & TECHNOLOGY A-V ACUUM SURFACES AND FILMS 真空科学与技术A真空表面和薄膜0734-2101 1.448/refs/jvsta/Default.html39 DENT MATER DENTAL MATERIALS 牙齿材料0109-5641 1.441/science/journal/0109564140 J ELECTRON MA TER JOURNAL OF ELECTRONIC MATERIALS 电子材料杂志0361-5235 1.382/pubs/journals/JEM/jem.html41 J NUCL MATER JOURNAL OF NUCLEAR MATERIALS 核材料杂志0022-3115 1.366/science/journal/0022311542 INT MA TER REV INTERNA TIONAL MA TERIALS REVIEWS 国际材料评论0950-6608 1.364/journals/browse/maney/imr43 J NON-CRYST SOLIDS JOURNAL OF NON-CRYSTALLINE SOLIDS 
非晶固体杂志0022-3093 1.363/science/journal/0022309344 J MAGN MAGN MATER JOURNAL OF MAGNETISM AND MAGNETIC MATERIALS 磁学与磁性材料杂志0304-8853 1.329/science/journal/0304885345 OPT MATER OPTICAL MATERIALS 光学材料0925-3467 1.299/science/journal/0925346746 IEEE T APPL SUPERCON IEEE TRANSACTIONS ON APPLIED SUPERCONDUCTIVITY IEEE应用超导性会刊1051-8223 1.278/xpl/RecentIssue.jsp?puNumber=7747 METALL MATER TRANS A METALLURGICAL AND MATERIALS TRANSACTIONS A-PHYSICAL METALLURGY AND MA TERIAL 冶金与材料会刊A——物理冶金和材料1073-5623 1.273http///journals/MT//48 THIN SOLID FILMS THIN SOLID FILMS 固体薄膜0040-6090 1.266/science/journal/0040609049 J PHYS D APPL PHYS JOURNAL OF PHYSICS D-APPLIED PHYSICS 物理杂志D——应用物理0022-3727 1.26/EJ/journal/0022-372750 INTERMETALLICS INTERMETALLICS 金属间化合物0966-9795 1.239/science/journal/0966979551 PHILOS MAG B PHILOSOPHICAL MAGAZINE B-PHYSICS OF CONDENSED MA TTERSTA TISTICAL MECHANICS 哲学杂志B-凝聚态物质统计力学0141-8637 1.238http///journals/default.html52 SURF COAT TECH SURFACE & COATINGS TECHNOLOGY 表面与涂层技术0257-8972 1.236/science/journal/0257897253 J BIOMAT SCI-POLYM E JOURNAL OF BIOMA TERIALS SCIENCE-POL YMER EDITION 生物材料科学—聚合物版0920-5063 1.234/journals/jn-JouBioSciPolEdi.html54 MATER RES INNOV MA TERIALS RESEARCH INNOV ATIONS 材料研究创新1432-8917 1.23/app/home/journal.asp55 BIOMETALS BIOMETALS 生物金属0966-0844 1.229/issn/0966-0844/contents56 INT J PLASTICITY INTERNATIONAL JOURNAL OF PLASTICITY 塑性国际杂志0749-6419 1.212/science/journal/0749641957 SMART MATER STRUCT SMART MATERIALS & STRUCTURES 智能材料与结构0964-1726 1.199/EJ/journal/0964-172658 ADV IMAG ELECT PHYS ADV ANCES IN IMAGING AND ELECTRON PHYSICS 成像和电子物理发展1076-5670 1.188/serials/imaging/59 SYNTHETIC MET SYNTHETIC METALS 合成金属0379-6779 1.158/science/journal/0379677960 J MATER SCI-MATER M JOURNAL OF MA TERIALS SCIENCE-MATERIALS IN MEDICINE 材料科学杂志—医用材料0957-4530 1.144/issn/0957-4530/contents61 scriptA MATER scriptA MATERIALIA 材料快报1359-6462 1.13/science/journal/1359646262 COMPOS PART A-APPL S COMPOSITES PART A-APPLIED SCIENCE AND MANUFACTURING 
复合材料A应用科学与制备1359-835X 1.128/science/journal/1359835X63 MOD PHYS LETT A MODERN PHYSICS LETTERS A 现代物理快报A 0217-7323 1.119/mpla/mpla.shtml64 SEMICOND SCI TECH SEMICONDUCTOR SCIENCE AND TECHNOLOGY 半导体科学与技术0268-1242 1.079/EJ/journal/0268-124265 J EUR CERAM SOC JOURNAL OF THE EUROPEAN CERAMIC SOCIETY 欧洲陶瓷学会杂志0955-2219 1.071/science/journal/0955221966 APPL SURF SCI APPLIED SURFACE SCIENCE 应用表面科学0169-4332 1.068/science/journal/0169433267 MA TER T JIM MATERIALS TRANSACTIONS JIM 日本金属学会材料会刊0916-1821 1.056http//www.sendai.kopas.co.jp/METAL/PUBS/68 PHYS STATUS SOLIDI A PHYSICA STA TUS SOLIDI A-APPLIED RESEARCH 固态物理A——应用研究0031-8965 1.025/cgi-bin/jhome/4000076169 MAT SCI ENG B-SOLID MA TERIALS SCIENCE AND ENGINEERING B-SOLID STATE MATERIALS FOR ADV ANCED TECH 材料科学与工程B—先进技术用固体材料0921-5107 1.022/refs/mseb/Default.html70 CORROS SCI CORROSION SCIENCE 腐蚀科学0010-938X 1.021/science/journal/0010938X71 J PHYS CHEM SOLIDS JOURNAL OF PHYSICS AND CHEMISTRY OF SOLIDS 固体物理与化学杂志0022-3697 1.02/science/journal/0022369772 J ADHES SCI TECHNOL JOURNAL OF ADHESION SCIENCE AND TECHNOLOGY 粘着科学与技术杂志0169-4243 1.01/journals/jn-JouAdhSciTec.html73 INT J REFRACT MET H INTERNATIONAL JOURNAL OF REFRACTORY METALS & HARD MATERIALS 耐火金属和硬质材料国际杂志0263-4368 0.989/science/journal/0263436874 SURF INTERFACE ANAL SURFACE AND INTERFACE ANAL YSIS 表面与界面分析0142-2421 0.987/cgi-bin/jhome/200975 INT J INORG MA TER INTERNATIONAL JOURNAL OF INORGANIC MA TERIALS 无机材料国际杂志1466-6049 0.986/science/journal/1466604976 SURF REV LETT SURFACE REVIEW AND LETTERS 表面评论与快报0218-625X 0.986/srl/srl.shtml77 MAT SCI ENG A-STRUCT MA TERIALS SCIENCE AND ENGINEERING A-STRUCTURAL MA TERIALS PROPERTIES MICROST 材料科学和工程A—结构材料的性能、组织与加工0921-5093 0.978/peoplepages/oreilly.html78 NANOSTRUCT MA TER NANOSTRUCTURED MATERIALS 纳米结构材料0965-9773 0.969/science/journal/0965977379 IEEE T ADV PACKAGING IEEE TRANSACTIONS ON ADV ANCED PACKAGING IEEE 高级封装会刊1521-3323 0.96/xpl/RecentIssue.jsp?puNumber=604080 INT J FATIGUE INTERNATIONAL JOURNAL OF FATIGUE 疲劳国际杂志0142-1123 
0.957/science/journal/0142112381 J ALLOY COMPD JOURNAL OF ALLOYS AND COMPOUNDS 合金和化合物杂志0925-8388 0.953/science/journal/0925838882 J NONDESTRUCT EV AL JOURNAL OF NONDESTRUCTIVE EV ALUA TION 无损检测杂志0195-9298 0.909/issn/0195-9298/current83 MAT SCI ENG C-BIO S MATERIALS SCIENCE & ENGINEERING C-BIOMIMETIC AND SUPRAMOLECULAR SYSTEMS 材料科学与工程C—仿生与超分子系统0928-4931 0.905/inca/publications/store/5/2/4/1/7/5/524175.pub.htt84 J ELECTROCERAM JOURNAL OF ELECTROCERAMICS 电子陶瓷杂志1385-3449 0.904/issn/1385-3449/contents85 ADV ENG MATER ADV ANCED ENGINEERING MATERIALS 先进工程材料1438-1656 0.901http://www.wiley-vch.de/publish/en/journals/alphabeticIndex/2266/86 IEEE T MAGN IEEE TRANSACTIONS ON MAGNETICS IEEE磁学会刊0018-9464 0.891/xpl/RecentIssue.jsp?puNumber=2087 PHYS STATUS SOLIDI B PHYSICA STA TUS SOLIDI B-BASIC RESEARCH 固态物理B —基础研究0370-1972 0.873/cgi-bin/jhome/4000118588 J THERM SPRAY TECHN JOURNAL OF THERMAL SPRAY TECHNOLOGY 热喷涂技术杂志1059-9630 0.862/content/Journals/JournalofThermalSprayTechnology/thermalspray.htm89 MECH COHES-FRICT MAT MECHANICS OF COHESIVE-FRICTIONAL MATERIALS 粘着磨损材料力学1082-5010 0.849/cgi-bin/jhome/615290 ATOMIZATION SPRAY ATOMIZATION AND SPRAYS 雾化和喷涂1044-5110 0.82/journals/6a7c7e10642258cc.html91 COMPOS SCI TECHNOL COMPOSITES SCIENCE AND TECHNOLOGY 复合材料科学与技术0266-3538 0.812/science/journal/0266353892 NEW DIAM FRONT C TEC NEW DIAMOND AND FRONTIER CARBON TECHNOLOGY 新型金刚石和前沿碳技术1344-9931 0.8http://www.kt.rim.or.jp/~myukk/NDFCT/93 MODEL SIMUL MATER SC MODELLING AND SIMULATION IN MATERIALS SCIENCE AND ENGINEERING 材料科学与工程中的建模与模拟0965-0393 0.789/EJ/journal/MSMSE94 INT J THERMOPHYS INTERNATIONAL JOURNAL OF THERMOPHYSICS 热物理学国际杂志0195-928X 0.773/issn/0195-928X/contents95 J SOL-GEL SCI TECHN JOURNAL OF SOL-GEL SCIENCE AND TECHNOLOGY 溶胶凝胶科学与技术杂志0928-0707 0.765/issn/0928-0707/current96 HIGH PERFORM POL YM HIGH PERFORMANCE POL YMERS 高性能聚合物0954-0083 0.758/journals/browse/sage/hip97 MA TER CHEM PHYS MATERIALS CHEMISTRY AND PHYSICS 材料化学与物理0254-0584 0.757/science/journal/0254058498 METALL MATER 
TRANS B METALLURGICAL AND MATERIALS TRANSACTIONS B-PROCESS METALLURGY AND MA TERIALS 冶金和材料会刊B—制备冶金和材料制备科学1073-5615 0.754/laughlin/mmt.html99 COMPOS PART B-ENG COMPOSITES PART B-ENGINEERING 复合材料B工程1359-8368 0.741/science/journal/13598368100 CEMENT CONCRETE RES CEMENT AND CONCRETE RESEARCH 水泥与混凝土研究0008-8846 0.738/science/journal/00088846101 J COMPOS MATER JOURNAL OF COMPOSITE MATERIALS 复合材料杂志0021-9983 0.73/journals/browse/sage/jcm102 J MATER SCI JOURNAL OF MATERIALS SCIENCE 材料科学杂志0022-2461 0.728/issn/0022-2461/current103 J ENG MA TER-T ASME JOURNAL OF ENGINEERING MA TERIALS AND TECHNOLOGY-TRANSACTIONS OF THE ASME 工程材料与技术杂志—美国机械工程师学会会刊0094-4289 0.726http://www.csb-ing.unige.it/ITA/ElePerio/Periopages/PEJ.html104 MA TER RES BULL MATERIALS RESEARCH BULLETIN 材料研究公告0025-5408 0.715/science/journal/00255408105 JOM-J MIN MET MAT S JOM-JOURNAL OF THE MINERALS METALS & MATERIALSSOCIETY 矿物、金属和材料学会杂志1047-4838 0.702http///pubs/journals/JOM/106 J BIOMATER APPL JOURNAL OF BIOMA TERIALS APPLICATIONS 生物材料应用杂志0885-3282 0.697/journal.aspx?pid=309107 FATIGUE FRACT ENG M FATIGUE & FRACTURE OF ENGINEERING MATERIALS & STRUCTURES 工程材料与结构的疲劳与断裂8756-758X 0.693/journals/browse/bsc/ffems108 J ADHESION JOURNAL OF ADHESION 粘着杂志0021-8464 0.68/science/journal/01437496109 COMP MATER SCI COMPUTATIONAL MATERIALS SCIENCE 计算材料科学0927-0256 0.677/science/journal/09270256110 IEEE T SEMICONDUCT M IEEE TRANSACTIONS ON SEMICONDUCTOR MANUFACTURING IEEE半导体制造会刊0894-6507 0.676/xpl/RecentIssue.jsp?puNumber=66111 MECH COMPOS MATER ST MECHANICS OF COMPOSITE MATERIALS AND STRUCTURES 复合材料和结构力学1075-9417 0.675http://www.wkap.nl/prod/b/0-7923-5870-8112 PHASE TRANSIT PHASE TRANSITIONS 相变0141-1594 0.671/~subir/qptweb/toc.html113 MATER LETT MA TERIALS LETTERS 材料快报0167-577X 0.67/science/journal/0167577X114 EUR PHYS J-APPL PHYS EUROPEAN PHYSICAL JOURNAL-APPLIED PHYSICS 欧洲物理杂志—应用物理1286-0042 0.664/journal/index.cfm?edpsname=epjap115 PHYSICA B PHYSICA B 物理B 0921-4526 0.663/science/journal/09214526116 ADV COMPOS LETT ADV ANCED 
COMPOSITES LETTERS 先进复合材料快报0963-6935 0.662/117 POL YM COMPOSITE POL YMER COMPOSITES 聚合物复合材料0272-8397 0.661/macrog/composit.htm118 CORROSION CORROSION 腐蚀0010-9312 0.66/journals/browse/maney/bcj119 PHYS CHEM GLASSES PHYSICS AND CHEMISTRY OF GLASSES 玻璃物理与化学0031-9090 0.66/vl=21681904/cl=15/nw=1/rpsv/cw/sgt/00319090/contp1.htm120 J MATER SCI-MATER EL JOURNAL OF MATERIALS SCIENCE-MATERIALS IN ELECTRONICS 材料科学杂志—电子材料0957-4522 0.641/issn/0957-4522/current121 COMPOS INTERFACE COMPOSITE INTERFACES 复合材料界面0927-6440 0.631/journals/jn-ComInt.html122 AM CERAM SOC BULL AMERICAN CERAMIC SOCIETY BULLETIN 美国陶瓷学会公告0002-7812 0.628/123 APPL COMPOS MATER APPLIED COMPOSITE MA TERIALS 应用复合材料0929-189X 0.627/issn/0929-189X/contents124 RES NONDESTRUCT EV AL RESEARCH IN NONDESTRUCTIVE EV ALUATION 无损检测研究0934-9847 0.621/125 PROG CRYST GROWTH CH PROGRESS IN CRYSTAL GROWTH AND CHARACTERIZATION OF MATERIALS 晶体生长和材料表征进展0960-8974 0.618/science/journal/09608974126 J COMPUT-AIDED MATER JOURNAL OF COMPUTER-AIDED MATERIALS DESIGN 计算机辅助材料设计杂志0928-1045 0.605/issn/0928-1045/current127 CERAM INT CERAMICS INTERNATIONAL 国际陶瓷0272-8842 0.593/science/journal/02728842128 POL YM TEST POL YMER TESTING 聚合物测试0142-9418 0.59/science/journal/01429418129 ADV PERFORM MA TER ADV ANCED PERFORMANCE MATERIALS 先进性能材料0929-1881 0.583/issn/0929-1881/contents130 SEMICONDUCTORS+ SEMICONDUCTORS 半导体1063-7826 0.575/131 J BIOACT COMPAT POL URNAL OF BIOACTIVE AND COMPA TIBLE POL YMERSJO 生物活性与兼容性聚合物杂志0883-9115 0.571http//132 HIGH TEMP MAT PR-ISR HIGH TEMPERATURE MATERIALS AND PROCESSES 高温材料和加工0334-6455 0.57133 ADV POL YM TECH ADV ANCES IN POL YMER TECHNOLOGY 聚合物技术发展0730-6679 0.569/cgi-bin/jhome/35650134 COMPOS STRUCT COMPOSITE STRUCTURES 复合材料结构0263-8223 0.556/science/journal/02638223135 J CERAM SOC JPN JOURNAL OF THE CERAMIC SOCIETY OF JAPAN 日本陶瓷学会杂志0914-5400 0.541http://www.ceramic.or.jp/~ihensyuj/journal_j.html136 BIO-MED MA TER ENG BIO-MEDICAL MATERIALS AND ENGINEERING 生物医用材料与工程0959-2989 
0.537http://www.iospress.nl/site/html/09592989.html137 INT J MOD PHYS B INTERNATIONAL JOURNAL OF MODERN PHYSICS B 现代物理国际杂志B 0217-9792 0.523/ijmpb/ijmpb.shtml138 INT J THEOR PHYS INTERNATIONAL JOURNAL OF THEORETICAL PHYSICS 理论物理国际杂志0020-7748 0.52/issn/0020-7748/contents139 INTEGR FERROELECTR INTEGRATED FERROELECTRICS 集成铁电材料1058-4587 0.512/journals/titles/10584587.html140 MAG CONCRETE RES MAGAZINE OF CONCRETE RESEARCH 混凝土研究杂志0024-9831 0.512/jol/141 ACI MATER J ACI MA TERIALS JOURNAL 美国混凝土学会材料杂志0889-325X 0.503/general/home.asp142 J MA TER SCI LETT JOURNAL OF MATERIALS SCIENCE LETTERS 材料科学杂志快报0261-8028 0.489/issn/0261-8028/current143 FERROELECTRICS FERROELECTRICS 铁电材料0015-0193 0.471/xpl/RecentIssue.jsp?puNumber=58144 B MA TER SCI BULLETIN OF MATERIALS SCIENCE 材料科学公告0250-4707 0.465http://www.ias.ac.in/matersci/145 MATER SCI FORUM MATERIALS SCIENCE FORUM 材料科学论坛0255-5476 0.461/default.cfm?issn=0255-5476&pg=1146 JSME INT J A-SOLID M JSME INTERNATIONAL JOURNAL SERIES A-SOLID MECHANICS AND MA TERIAL ENGINEERIN 日本机械工程学会国际杂志系列A-固体力学与材料工程1344-7912 0.449http://webcat.nii.ac.jp/cgi-bin/shsproc?id=AA10888746147 MATER CHARACT MATERIALS CHARACTERIZATION 材料表征1044-5803 0.447/science/journal/10445803148 SYN REACT INORG MET SYNTHESIS AND REACTIVITY IN INORGANIC AND METAL-ORGANIC CHEMISTRY 无机物及金属—有机物化学的合成和反应0094-5714 0.446/servlet/product/productid/SIM149 MATER HIGH TEMP MA TERIALS AT HIGH TEMPERA TURES 高温材料0960-3409 0.444/materials/scimat.htm150 HIGH TEMP-HIGH PRESS HIGH TEMPERA TURES-HIGH PRESSURES 高温—高压0018-1544 0.438/151 J COMPOS TECH RES JOURNAL OF COMPOSITES TECHNOLOGY & RESEARCH 复合材料技术与研究杂志0884-6804 0.438/JOURNALS/COMPTECH/comptech.html152 ACI STRUCT J ACI STRUCTURAL JOURNAL 美国混凝土学会结构杂志0889-3241 0.435/PUBS/JOURNALS/SJHOME.ASP153 MATER DESIGN MA TERIALS & DESIGN 材料与设计0261-3069 0.434/science/journal/02613069154 MATER STRUCT MA TERIALS AND STRUCTURES 材料与结构1359-5997 0.432/EJ/journal/0964-1726155 MA T SCI SEMICON PROC MATERIALS SCIENCE IN SEMICONDUCTOR PROCESSING 半导体加工的材料科学1369-8001 
0.419/science/journal/13698001156 BRIT CERAM T BRITISH CERAMIC TRANSACTIONS 英国陶瓷会刊0967-9782 0.413/journals/browse/maney/bct157 MECH COMPOS MA TER MECHANICS OF COMPOSITE MA TERIALS 复合材料力学0191-5665 0.405/issn/0191-5665/contents158 J COATING TECHNOL JOURNAL OF COATINGS TECHNOLOGY 涂层技术杂志0361-8773 0.393/Publications/JCT.html159 J REINF PLAST COMP JOURNAL OF REINFORCED PLASTICS AND COMPOSITES 增强塑料和复合材料杂志0731-6844 0.383/journal.aspx?pid=318160 MATER CORROS MA TERIALS AND CORROSION-WERKSTOFFE UND KORROSION 材料与腐蚀0947-5117 0.376http://www.wiley-vch.de/vch/journals/2010/161 SCI CHINA SER E SCIENCE IN CHINA SERIES E-TECHNOLOGICAL SCIENCES 中国科学E技术科学1006-9321 0.376/scienceinchina_e_en.htm162 CEMENT CONCRETE COMP CEMENT & CONCRETE COMPOSITES 水泥与混凝土复合材料0958-9465 0.371/science/journal/09589465163 MATER EV AL MATERIALS EV ALUA TION 材料评价0025-5327 0.371/publications/materialseval/materialseval.htm164 POL YM POL YM COMPOS POL YMERS & POLYMER COMPOSITES 聚合物与聚合物复合材料0967-3911 0.368/Themes/polymers.htm165 J MATER SYNTH PROCES JOURNAL OF MATERIALS SYNTHESIS AND PROCESSING 料合成与加工杂志1064-7562 0.358/issn/1064-7562/current166 ADV COMPOS MA TER ADV ANCED COMPOSITE MA TERIALS 先进复合材料0924-3046 0.357/167 INT J MA TER PROD TEC INTERNATIONAL JOURNAL OF MATERIALS & PRODUCT TECHNOLOGY 材料与生产技术国际杂志0268-1900 0.349/catalogue/m/ijmpt/indexijmpt.html168 J MA TER CIVIL ENG JOURNAL OF MATERIALS IN CIVIL ENGINEERING 土木工程材料杂志0899-1561 0.348/mto/169 HIGH TEMP MATER P-US HIGH TEMPERATURE MATERIAL PROCESSES 高温材料加工1093-3611 0.342/journals/57d172397126f956.html170 CONSTR BUILD MATER CONSTRUCTION AND BUILDING MA TERIALS 结构与建筑材料0950-0618 0.341/science/journal/09500618171 HIGH TEMP+ HIGH TEMPERATURE 高温0018-151X 0.327/issn/0018-151X/contents172 RARE METAL MAT ENG RARE METAL MATERIALS AND ENGINEERING 稀有金属材料与工程1002-185X 0.319http://www.benran.ru/Magazin/El_vin/X/082472.HTM173 INORG MATER+ INORGANIC MA TERIALS 无机材料0020-1685 0.312/science/journal/14666049174 SCI TECHNOL WELD JOI SCIENCE AND TECHNOLOGY OF WELDING AND JOINING 
焊接科学与技术1362-1718 0.295/phase-trans/abstracts/stwj.html175 MATER MANUF PROCESS MATERIALS AND MANUFACTURING PROCESSES 材料与制造工艺1042-6914 0.288/servlet/product/productid/AMP176 FERROELECTRICS LETT FERROELECTRICS LETTERS SECTION 铁电材料快报0731-5171 0.274/journals/titles/07315171.html177 J MA TER SCI TECHNOL JOURNAL OF MATERIALS SCIENCE & TECHNOLOGY 材料科学与技术杂志1005-0302 0.269http://coral.dir.bg/jmst-h.htm178 J MA TER ENG PERFORM JOURNAL OF MATERIALS ENGINEERING AND PERFORMANCE 材料工程与性能杂志1059-9495 0.268/science/journal/10599495179 MET MATER-INT METALS AND MATERIALS INTERNA TIONAL 国际金属及材料1225-9438 0.256http://www.icm.re.kr/doc/paper/index.jsp?flag=kor&jcode=499180 GLASS TECHNOL GLASS TECHNOLOGY 玻璃技术0017-1050 0.255/181 J MATER PROCESS TECH JOURNAL OF MATERIALS PROCESSING TECHNOLOGY 材料加工技术杂志0924-0136 0.255/science/journal/09240136182 J POL YM MATER JOURNAL OF POL YMER MATERIALS 聚合物材料杂志0970-0838 0.229http://balkema.ima.nl/scripts/cgiBalkema.exe/serie?SerNo=40183 ADV POWDER TECHNOL ADV ANCED POWDER TECHNOLOGY 先进粉末技术0921-8831 0.224/journals/jn-AdvPowTec.html184 J ADV MATER JOURNAL OF ADV ANCED MATERIALS 先进材料杂志1070-9789 0.22/JAM.html185 SYNTHESE SYNTHESE 合成0039-7857 0.208/issn/0039-7857186 GLASS SCI TECHNOL GLASS SCIENCE AND TECHNOLOGY-GLASTECHNISCHE BERICHTE 玻璃科学与技术0946-7475 0.189/isi/187 J TEST EV AL JOURNAL OF TESTING AND EV ALUATION 测试及评价杂志0090-3973 0.171/cgi-bin/SoftCart.exe/index.shtml?E+mystore188 MATER SCI TECH-LOND MA TERIALS SCIENCE AND TECHNOLOGY 材料科学与技术0267-0836 0.171/~tw/home.html189 POWDER METALL MET C+ POWDER METALLURGY AND METAL CERAMICS 粉末冶金及金属陶瓷1068-1302 0.161/issn/1068-1302/contents190 MATER SCI+ MATERIALS SCIENCE 材料科学1068-820X 0.15/191 MATER TECHNOL MATERIALS TECHNOLOGY 材料技术1066-7857 0.147/192 ADV MATER PROCESS ADV ANCED MA TERIALS & PROCESSES 先进材料及工艺0882-7958 0.144/193 RARE METALS RARE METALS 稀有金属1001-0521 0.142/194 J WUHAN UNIV TECHNOL JOURNAL OF WUHAN UNIVERSITY OF TECHNOLOGY-MATERIALS SCIENCE EDITION 武汉理工大学学报-材料科学版1000-2413 0.14/journal/english/order.htm195 PLAT 
SURF FINISH PLATING AND SURFACE FINISHING 电镀和表面修整0360-3164 0.14/196 J INORG MA TER JOURNAL OF INORGANIC MATERIALS 无机材料杂志1000-324X 0.131/science/journal/14666049197 MATER WORLD MA TERIALS WORLD 材料世界0967-8638 0.104/198 MET SCI HEAT TREA T+ METAL SCIENCE AND HEAT TREATMENT 金属科学及热处理0026-0673 0.096/issn/0026-0673/current199 METALL METALL 金属0026-0746 0.096http://www.vsb.cz/200 MATER PERFORMANCE MA TERIALS PERFORMANCE 材料性能0094-1492 0.087/nace/content/publications/mediakit.asp#MP201 J MA TER PROCESS MANU JOURNAL OF MATERIALS PROCESSING & MANUFACTURING SCIENCE 材料加工与制造科学杂志1062-0656 0.078/journal.aspx?pid=316202 SCI ENG COMPOS MATER SCIENCE AND ENGINEERING OF COMPOSITE MATERIALS 复合材料科学与工程0334-181X 0.075/exec/obidos/ASIN/B00006KWDC/shoppingsavvy-20/002-4085689-4536 025203 IEEE T COMPON PACK T IEEE TRANSACTIONS ON COMPONENTS AND PACKAGING TECHNOLOGIES IEEE元件及封装技术会刊1521-3331 0.071/xpl/RecentIssue.jsp?puNumber=6144204 JOCCA-SURF COAT INT JOCCA-SURFACE COATINGS INTERNATIONAL JOCCA—国际表面涂层1356-0751/publications/sci/sci_sso205 ADV FUNCT MA TER ADV ANCED FUNCTIONAL MA TERIALS 先进功能材料1616-301X/cgi-bin/jhome/77003362?CRETRY=1206 ANN REV MATER RES ANNUAL REVIEW OF MATERIALS RESEARCH 材料研究年度评论1531-7331/loi/matsci207 MATER TRANS MATERIALS TRANSACTIONS 材料会刊1345-9678/pubs/journals/MT/MT.html。
KBS

Knowledge-Based Systems E-Business Study Topicsa. Bayesian Network P45Bayesian networks:These are probabilistic models based on directed graphs capturingcausal relationships between a number of variables being modelled. These can provide veryaccurate tools for predication and diagnosis. Microsoft is a big supporter basing variousdiagnostic tools on the technology.b. Formal Representation P89Formal representation is the process of coding knowledge for input into a computer using a restricted syntax as defined by a formal grammar developed for the purpose. Formalrepresentation results in statements with which you can reason automatically using computersoftware.c. Expert System Shell P4Expert system shell: the middle ring shows the expert system shell that provides computational facilities for applying the knowledge base to user decision support: ♦the application system at the bottom tests the user's hypotheses or seeks to satisfy his goals by deriving the consequences of facts about a particular situation reported bythe user, using the general facts and inference rules supplied by the expert and editedby the knowledge engineer;♦the explanation system to the lower left answers queries about the way in which facts have been derived in terms of what information and inference rules have been used;♦the acquisition system to the upper left provides tools for interviewing the expert to obtain his vocabulary, and general facts and inference procedures in terms of it;♦the display system at the top provides tools for presenting the knowledge base in an understandable form--the relations between the facts and inference procedures;♦the edit system to the upper right provides tools for editing the knowledge base while maintaining its integrity in terms of the vocabulary, variables and operations used;♦the validation system to the lower right provides tools for checking the knowledge base against specific case histories with known consequences.d. 
Reasoning2.1 What Is Knowledge?p7In A.I. research the data and knowledge two have different meanings. Traditionally the termdata is used to describe simple information such as numbers, strings and Boolean values. Todeal with the real world we need more complex information such as processes, procedures, actions, causality, time, motivations, goals, and common sense reasoning. The term knowledge is used to describe this sort of information of which data is merely a subset. It could more formally be described as a symbolic description (or model) of a domain (or universe of discourse)).It is in artificial intelligence research that knowledge representation lends itself to utilization because knowledge is the second requirement of intelligent behavior (the first one is reasoning).1.Reasoning engine: Inference mechanisms for manipulating the symbolicinformation and knowledge in the knowledge base to form a line ofreasoning in solving a problem. The inference mechanism can range fromsimple modus ponens backward chaining of IF-THEN rules to case-basedreasoning.Reason using representations of human knowledge∙that is a previously encountered problem may even have additional information, such as how the problem was solved or symptoms that are associated with the problem.e. Knowledge Management P9Knowledge Management is the collection of processes that govern the creation, dissemination, and utilization of knowledge. In one form or another, knowledge management has been around for a very long time. Practitioners have included philosophers, priests, teachers, politicians, scribes, Liberians, etc.So if Knowledge Management is such an ageless and broad topic what role does it serve in today's Information Age? These processes exist whether we acknowledge them or not and they have a profound effect on the decisions we make and the actions we take, both of which are enabled by knowledge of some type. 
If this is the case, and we agree that many of our decisions and actions have profound and long lasting effects, it makes sense to recognize and understand the processes that effect or actions and decision and, where possible, take steps to improve the quality these processes and in turn improve the quality of those actions and decisions for which we are responsible?Knowledge management is not a, "a technology thing" or a, "computer thing" If we accept the premise that knowledge management is concerned with the entire process of discovery and creation of knowledge, dissemination of knowledge, and the utilization of knowledge then we are strongly driven to accept that knowledge management is much more than a "technology thing" and that elements of it exist in each of our jobs.f. Knowledge Elicitation P234.2 Knowledge ElicitationThe most important branch of knowledge acquisition is knowledge elicitation- obtaining knowledge from a human expert (or human experts) for use in an expert system.Knowledge elicitation is difficult. 
This is the principle reason why expert systems have not become more widespread - the knowledge elicitation bottleneck.It is necessary to find out what the expert(s) know, and how they use their knowledge.Expert knowledge includes:∙domain-related facts & principles;∙modes of reasoning;∙reasoning strategies;∙explanations and justifications.The knowledge elicitation (and analysis) task involves:∙Finding at least one expert in the domain who:o is willing to provide his/her knowledge;o has the time to provide his/her knowledge;o is able to provide his/her knowledge.∙Repeated interviews with the expert(s), plus task analysis, concept sorting, etc, etc..∙Knowledge structuring: converting the raw data (taken from the expert) into intermediate representations, prior to building a working system.∙This will improve the knowledge engineer's understanding of the subject;∙This provides easily-accessible knowledge for future KEs to work from (knowledge archiving).∙Building a model of the knowledge derived from the expert, for the expert to criticise.From then on, the development proceeds by stepwise refinement.One major obstacle to knowledge elicitation: experts cannot easily describe all they know about their subject.∙They do not necessarily have much insight into the methods they use to solve problems.Their knowledge is "compiled" (c.f. a compiled computer program - fast & efficient, but unreadable).Some of the techniques used in Knowledge Elicitation∙Various different forms of interview:o Unstructured. A general discussion of the domain, designed to provide a list of topics and concepts.Structured. Concerned with a particular concept within the domain.o Problem-solving. The expert is provided with a real-life problem, of a kind that they deal with during their working life, and asked to solve it. As they doso, they are required to describe each step, and their reasons for doing whatthey do. The transcript of their verbal account is called a protocol.o Think-aloud. 
As above, but the expert merely imagines that they are solving the problem presented to them, rather than actually doing it. Once again, theydescribe the steps involved in solving the problem.o Dialogue. The expert interacts with a client, in the way that they would normally do during their normal work routine.o Review. The KE and DE examine the record of one of the sessions described above, together.∙Sample lecture preparation. The expert prepares a lecture, and the KE analyses its content.∙Concept sorting ("card sort").∙Questionnaires. Especially useful when the knowledge is to be elicited from several different experts.∙Repertory grid (particularly the "laddered grid" technique).It is standard practice to tape record KE sessions. However, KEs should be aware of the costs this involves, in time and money.The above techniques will be discussed in detail later.g. Knowledge Acquisition P9Knowledge acquisition: how to translate human knowledge in current, written, conceptual and abstract representations into computer representations.Knowledge Acquisition is the process of obtaining knowledge for use in the knowledge base of an expert system.h. Semantics P11Semantic networks. A semantic network is a method of representing knowledge often used for critical analysis of literary texts. Similar to hypertext technologies in some ways, but with emphasis on typed links among concepts.i. Knowledge Networks P14Knowledge Network is a search tool for a knowledgebase.j. Knowledge Engineers P4♦knowledge engineers act as intermediaries between the expert and the system, helping him to encode his knowledge and validate the operation of the expert system.k. 
Inference Rule P4, P65
♦ The application system at the bottom tests the user's hypotheses or seeks to satisfy his goals by deriving the consequences of facts about a particular situation reported by the user, using the general facts and inference rules supplied by the expert and edited by the knowledge engineer.
Implementation
During the next stage, implementation, the formalized knowledge is mapped or coded into the framework of the development tool to build a working prototype. The contents of knowledge structures, inference rules and control strategies established in the previous stages are organized into a suitable format. Often, knowledge engineers will have been using the program development tool to build a working prototype to document and organize information collected during the formalization stage, so that implementation is completed at this point. If not, the notes from the earlier phases are coded at this time.
The inference engine:
1. Combines the facts of a specific case with the knowledge contained in the knowledge base to come up with a recommendation. In a rule-based expert system, the inference engine controls the order in which production rules are applied.
2. Directs the user interface to query the user for any information it needs for further inferencing.
Inference Engine
∙ General problem-solving knowledge or methods
∙ Interpreter analyzes and processes the rules
∙ Scheduler determines which rule to look at next
∙ The search portion of a rule-based system
∙ It takes advantage of heuristic information
∙ Otherwise, the time to solve a problem could become prohibitively long
∙ This problem is called the combinatorial explosion
∙ Expert-system shell provides customizable inference engine
Knowledge base: the central ring shows the knowledge base of facts and inference rules. 
This extends the facilities of a conventional database by enabling the storage not only of facts but also of rules that enable further facts to be derived.
P1
♦ Interpretation: take observations and infer descriptions, e.g. natural language understanding
♦ Prediction: recognise situations and infer likely consequences, e.g. weather forecasting
P12 and Groupware P12
Intranets
An intranet is a network, internal to the organization, based on Internet and World Wide Web technology. By using common Internet protocols, or core technologies, in conjunction with their own business applications, corporations can easily communicate, distribute information, and facilitate project collaboration across the entire enterprise while keeping unauthorized users out.
Groupware
Groupware is software that was created in recognition of the significance of groups in offices by providing functions and services that support the collaborative activities of work groups.
LOTUS NOTES DOMINO allows users to coordinate work with built-in calendars, scheduling, e-mail, web navigational tools, and integrated support of internet standards.
MS EXCHANGE SERVER is Microsoft's software that competes directly with Lotus Notes for customers. It gives businesses the ability to rely on their messaging and collaboration servers, and provides a comprehensive messaging platform that includes the tools necessary to create rich collaboration applications.
Intellectual Asset Management is the management of the intellectual assets of a corporation, including patents, copyrights, etc.
A company called AURIGIN offers a solution called IPAM (Intellectual Property Asset Management System), which allows organizations to organize, analyze and manage intellectual property using an intranet.
P27 and Observation P27
Interviews
Knowledge engineers elicit knowledge from experts in conversation. 
The process is best started with free-form questions, narrowing in specificity. The expert is in control, which has some advantages, but makes interviews very time-consuming.
Observation of task performance
The expert's performance, while working at a real problem, may be recorded by simply watching or by videotaping the process.
and Sharing P101
Knowledge creating: knowledge modelling and knowledge representation (in fact, all activities useful in a development process) are phases arising from methods elaborated in "knowledge engineering environments".
Knowledge sharing: modern technologies (including intelligent browsers) offer almost unlimited access to knowledge resources from any place. For example, a well-conceived e-commerce application contains knowledge on products and services, and is able to explain how to use products in given contexts, how to connect several devices together, etc.
-Based P33
Rule-based programming is one of the most commonly used techniques for developing expert systems. In this programming paradigm, rules are used to represent heuristics, or "rules of thumb," which specify a set of actions to be performed for a given situation. A rule is composed of an if portion and a then portion. The if portion of a rule is a series of patterns which specify the facts (or data) which cause the rule to be applicable. The process of matching facts to patterns is called pattern matching. The expert system tool provides a mechanism, called the inference engine, which automatically matches facts against patterns and determines which rules are applicable. The if portion of a rule can actually be thought of as the whenever portion of a rule, since pattern matching occurs whenever changes are made to facts. The then portion of a rule is the set of actions to be executed when the rule is applicable. The actions of applicable rules are executed when the inference engine is instructed to begin execution. 
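The match, select and act cycle of a rule-based system can be sketched in a few lines of Python. This is only a minimal illustration, not the API of any particular expert system tool: the `Rule` type, the `run` function, the fact names, and the "first applicable rule wins" selection strategy are all assumptions made for the example.

```python
from typing import NamedTuple

class Rule(NamedTuple):
    name: str
    patterns: frozenset   # "if" portion: facts that must all be present
    additions: frozenset  # "then" portion: facts asserted when the rule fires

def run(rules, initial_facts):
    """Fire applicable rules until none remain; return final facts and firing order."""
    facts = set(initial_facts)
    fired = []
    while True:
        # Pattern matching: a rule is applicable when all of its patterns are
        # in the fact list and firing it would still add something new.
        agenda = [r for r in rules
                  if r.patterns <= facts and not r.additions <= facts]
        if not agenda:
            break  # no applicable rules remain
        rule = agenda[0]          # hypothetical strategy: first applicable rule
        facts |= rule.additions   # executing the actions changes the fact list
        fired.append(rule.name)
    return facts, fired

# Hypothetical rules and facts, invented for illustration only.
RULES = [
    Rule("stable-income", frozenset({"salary-high", "employed"}),
         frozenset({"income-stable"})),
    Rule("approve", frozenset({"income-stable", "credit-good"}),
         frozenset({"approve-credit"})),
]

facts, fired = run(RULES, {"salary-high", "employed", "credit-good"})
print(fired)
```

Firing the first rule adds a fact that makes the second rule applicable, which is the sense in which executing actions can change the list of applicable rules.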
The inference engine selects a rule and then the actions of the selected rule are executed (which may affect the list of applicable rules by adding or removing facts). The inference engine then selects another rule and executes its actions. This process continues until no applicable rules remain.
-Symbolic Knowledge P18
Subsymbolic. The knowledge is stored without the use of symbols. This typically means the architecture uses direct mapping from the inputs to outputs.
P87 and Graphs P86
The decision table formalism supports presentation of conditions and conditioned actions. A decision table usually consists of four parts. The top left part lists the possible conditions, while the bottom left lists the possible actions. The right part indicates, for each set of circumstances (the top right part), the particular action to be taken (the bottom right part). Tables are represented graphically; if each column contains simple states we call it an expanded decision table (DT); otherwise, if contractions or irrelevant conditions are allowed, we say the table is in consolidated form. Using a set of tables in order to modularise the tabular knowledge base, we are able to represent different knowledge bases after transforming them into rules or directly into tables.
Decision tables in validation & verification
Acquiring the correct and complete knowledge is one of the main problems in building knowledge-based systems. Also, maintaining the knowledge base is not a trivial task and often introduces unnoticed inconsistencies or contradictions. Verification and validation (V&V) of knowledge-based systems are receiving increased attention.
It has been reported earlier (e.g. Vanthienen [19], Cragun and Steudel [4], Puuronen [16]) that, in a vast majority of cases, the decision table technique makes it easy to check for common V&V problems, such as contradictions, inconsistencies, incompleteness, redundancy, etc. 
in the problem specification.
10.5 Benefits Of Decision Tables
Decision tables offer some important benefits as described below:
Completeness
Knowledge bases often suffer from missing attribute values or combinations of attribute values, unreachable conclusions, etc. The nature of the (single hit) decision table makes it easy to check for completeness, because the number of simple columns (before contraction) should equal the product of the number of states for every condition. Completeness of combinations of attribute values can, therefore, be enforced automatically.
Consistency
Inconsistency occurs when rules with the same premises but different conclusions exist. When these conclusions are contradictory, the rules are in conflict. If the contradictory conclusions deal with opposite values of the same action, this is called contradiction. When the conclusions do not necessarily contradict each other, the rules are ambiguous. Because in a (correct) decision table all columns are non-overlapping and each column refers to exactly one configuration of conclusions, inconsistency between columns will not occur.
Non-redundancy
By definition, a single hit decision table eliminates redundant rules and premises, as a combination of condition states will be included in only one column.
Correctness
After the decision tables have been designed, the knowledge engineer may want to check the (semantic) correctness of the decision specification, verifying that for each possible case the right action(s) will be executed. The decision table format makes this kind of validation easy.
More recently, other researchers have also realized the importance of identifying inconsistencies and redundancies early in the knowledge base development process. 
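The completeness and non-redundancy checks described under the benefits of decision tables can be sketched as follows. This is a hedged illustration only: the `check_table` function, the condition names, and the three example columns are hypothetical, and the table is assumed to be in expanded (uncontracted) form.

```python
from itertools import product

def check_table(conditions, columns):
    """Check an expanded single-hit decision table.

    conditions: dict mapping condition name -> list of possible states
    columns:    list of dicts, one per column, giving a state for every condition
    Returns (expected column count, missing combinations, duplicate count).
    """
    names = list(conditions)
    # Every combination of condition states should appear in exactly one column,
    # so the expected count is the product of the number of states per condition.
    expected = set(product(*(conditions[n] for n in names)))
    present = [tuple(col[n] for n in names) for col in columns]
    missing = sorted(expected - set(present))
    duplicates = len(present) - len(set(present))
    return len(expected), missing, duplicates

# Hypothetical table: two conditions with two states each, so 2 * 2 = 4 columns.
CONDITIONS = {"salary": ["high", "low"], "credit_history": ["good", "bad"]}
COLUMNS = [
    {"salary": "high", "credit_history": "good", "action": "approve"},
    {"salary": "high", "credit_history": "bad", "action": "refer"},
    {"salary": "low", "credit_history": "good", "action": "refer"},
]

expected_count, missing, duplicates = check_table(CONDITIONS, COLUMNS)
print(expected_count, missing, duplicates)
```

Here the table is incomplete because one combination of condition states has no column; a duplicate count above zero would instead signal a redundant column.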
In Ngwenyama and Bryson [13], a formal method, based on decision table matrices, is presented to solve the problems of redundancy and inconsistency when integrating the rule sets of multiple experts.
P18 and Frames P20
3.3 Rules
Currently, the most popular method of knowledge representation is in the form of rules (also known as production rules or rule-based systems).
The use of rules is illustrated below through a simple rule base of five rules created for the domain of credit approval. There are many questions a loan officer may ask in the process of deciding whether to approve or deny an application for credit. Some of the questions the officer may ask concern:
♦ the current salary of the person,
♦ the credit history of the person, and
♦ their current employment.
A simple (fictitious) rule base that might be applicable to this domain is given below.
3.4 Frames
The use of object-oriented methods in software development has impacted the development of E/KBS as well. Knowledge in an E/KBS can also be represented using the concept of objects to capture both the declarative and procedural knowledge in a particular domain. In E/KBS, the term used to denote the use of objects is frames, and frames are fast becoming a popular and economical method of representing knowledge. Frames are extremely similar to object-oriented technology and provide many of the benefits that have been attributed to object-oriented systems.
A frame is a self-contained unit of knowledge that contains all of the data (knowledge) and the procedures associated with the particular object in the domain. In Figure 1.1, we show a hierarchy of objects using the classification of humans as the particular domain. Each of the frames in Figure 1.1 represents an object in the domain. The top-level object is known as the class. As you proceed down the tree, each of the objects becomes a more specific example of the upper node. 
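A frame hierarchy of this kind can be sketched as a small Python class. The sketch is an assumption-laden illustration rather than any E/KBS tool's actual frame language: the `Frame` class, the slot names (`legs`, `sex`, `age`, `describe`) and their values are invented, and only the frame names Human, Male and Jack come from the source's classification example.

```python
class Frame:
    """A frame: named slots (data and procedures) plus a link to a parent frame."""

    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent
        self.slots = dict(slots)

    def get(self, slot):
        # Look in this frame first, then inherit from more general frames.
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.parent
        raise KeyError(slot)

    def is_a(self, other):
        # True if `other` is this frame or one of its ancestors.
        frame = self
        while frame is not None:
            if frame is other:
                return True
            frame = frame.parent
        return False

# Declarative slots hold values; a procedural slot holds a callable.
human = Frame("Human", legs=2, describe=lambda f: f"{f.name} is a human")
male = Frame("Male", parent=human, sex="male")
jack = Frame("Jack", parent=male, age=30)

print(jack.get("legs"), jack.get("sex"), jack.get("describe")(jack))
```

Slots defined on a general frame are inherited by more specific ones, which is the main benefit frames share with object-oriented systems.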
For instance, Jack is a particular example of a Male and Human; we call Jack an instance of the class Human, while Male is a subclass of Human.
from research laboratories to business applications P169
18.2 Stages In The Process Of Data Mining
Stage 1: Exploration. This stage usually starts with data preparation, which may involve cleaning data, data transformations, selecting subsets of records and, in the case of data sets with large numbers of variables ("fields"), performing some preliminary feature selection operations to bring the number of variables to a manageable range (depending on the statistical methods being considered). Then, depending on the nature of the analytic problem, this first stage of the process of data mining may involve anything from a simple choice of straightforward predictors for a regression model to elaborate exploratory analyses using a wide variety of graphical and statistical methods, in order to identify the most relevant variables and determine the complexity and/or the general nature of models that can be taken into account in the next stage.
Stage 2: Model building and validation. This stage involves considering various models and choosing the best one based on their predictive performance (i.e., explaining the variability in question and producing stable results across samples). This may sound like a simple operation, but in fact it sometimes involves a very elaborate process. A variety of techniques have been developed to achieve that goal, many of which are based on so-called "competitive evaluation of models," that is, applying different models to the same data set and then comparing their performance to choose the best. 
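A minimal sketch of "competitive evaluation of models" follows. It is illustrative only: the two candidate models, the synthetic data, the even/odd train and holdout split, and the use of mean squared error as the performance measure are all assumptions chosen to keep the example self-contained.

```python
def mean_model(train):
    """Baseline: always predict the mean of the training targets."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def linear_model(train):
    """Ordinary least-squares line through the training points."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    sxx = sum((x - mx) ** 2 for x, _ in train)
    sxy = sum((x - mx) * (y - my) for x, y in train)
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)

def mse(model, data):
    """Mean squared prediction error of a fitted model on a data set."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

def pick_best(candidates, train, holdout):
    # Fit every candidate on the same training data, then compare them
    # on the same held-out data and keep the one with the lowest error.
    scores = {name: mse(fit(train), holdout) for name, fit in candidates.items()}
    return min(scores, key=scores.get), scores

# Synthetic data: a linear trend with a small alternating disturbance.
data = [(x, 2 * x + 1 + (0.5 if x % 2 else -0.5)) for x in range(20)]
train, holdout = data[::2], data[1::2]

best, scores = pick_best({"mean": mean_model, "linear": linear_model}, train, holdout)
print(best, scores)
```

Scoring on held-out records rather than on the training records is what lets the comparison reflect stability across samples.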
These techniques, which are often considered the core of predictive data mining, include: Bagging (Voting, Averaging), Boosting, Stacking (Stacked Generalizations), and Meta-Learning.
Stage 3: Deployment. This final stage involves using the model selected as best in the previous stage and applying it to new data in order to generate predictions or estimates of the expected outcome.
The concept of Data Mining is becoming increasingly popular as a business information management tool, where it is expected to reveal knowledge structures that can guide decisions in conditions of limited certainty. Recently, there has been increased interest in developing new analytic techniques specifically designed to address the issues relevant to business Data Mining (e.g., Classification Trees), but Data Mining is still based on the conceptual principles of statistics, including the traditional Exploratory Data Analysis (EDA) and modeling, and it shares with them both some components of its general approaches and specific techniques. However, an important general difference in focus and purpose between Data Mining and traditional EDA is that Data Mining is more oriented towards applications than towards the basic nature of the underlying phenomena. In other words, Data Mining is relatively less concerned with identifying the specific relations between the variables involved. For example, uncovering the nature of the underlying functions or the specific types of interactive, multivariate dependencies between variables is not the main goal of Data Mining. Instead, the focus is on producing a solution that can generate useful predictions. 
Therefore, Data Mining accepts, among others, a "black box" approach to data exploration or knowledge discovery and uses not only the traditional Exploratory Data Analysis (EDA) techniques, but also techniques such as Neural Networks, which can generate valid predictions but are not capable of identifying the specific nature of the interrelations between the variables on which the predictions are based.
P176
18.8 Reasons for the growing popularity of Data Mining
Growing Data Volume
The main reason automated computer systems are needed for intelligent data analysis is the enormous volume of existing and newly appearing data that require processing. The amount of data accumulated each day by various business, scientific, and governmental organizations around the world is daunting. According to information from the GTE research center, scientific organizations alone store about 1 TB (terabyte) of new information each day. And it is well known that the academic world is by far not the leading supplier of new data. It becomes impossible for human analysts to cope with such overwhelming amounts of data.
Limitations of Human Analysis
Two other problems that surface when human analysts process data are the inadequacy of the human brain when searching for complex multifactor dependencies in data, and the lack of objectivity in such an analysis. A human expert is always a hostage of previous experience of investigating other systems. Sometimes this helps, sometimes this hurts, but it is almost impossible to escape.
Low Cost of Machine Learning
One additional benefit of using automated data mining systems is that this process has a much lower cost than hiring an army of highly trained (and paid) professional statisticians. 
While data mining does not eliminate human participation in solving the task completely, it significantly simplifies the job and allows an analyst who is not a professional in statistics and programming to manage the process of extracting knowledge from data.
a. Procedural P7 and Declarative Knowledge P29
b. General and Domain-based Problem Solving Method
c. Tacit P99 and Explicit Knowledge P51, 54
d. Data Driven and Goal Driven Reasoning Method
a. Classification Of Knowledge
Knowledge can be classified as either static or dynamic. If it describes properties of, or relations between, objects and processes, we call it static or descriptive. If it provides tools that help to decide how to use (manipulate, reason with) this static knowledge when solving particular problems, or is goal-oriented, we call it dynamic or procedural knowledge.
The two main forms of knowledge are experimental and theoretical. Experimental knowledge encapsulates experience and contains examples, precedents and situations, while theoretical knowledge includes conceptual components (notions), declarative components (sentences that define relationships between concepts independent of any procedure to manipulate them), and operative components (actions).
Knowledge may also be domain-specific, that is, related to a specific topic; or common-sense, that is, part of the everyday knowledge about the world and its contents that underlies most other reasoning. It may also represent the organization, structure and usage of knowledge itself, in which case it is called meta-level knowledge.
Declarative knowledge representations are, for KA, preferable to procedural ones, as their meaning is explicit (can be "read directly").
Furthermore, the notion of declarative knowledge representations is open-ended, in the sense that it will accommodate changes to the knowledge base. 
Current research at Carnegie-Mellon is aimed at reducing maintenance costs still further through the creation of higher-level "shells" that actively support the process of knowledge acquisition and testing. New prototype versions of the configuration system are now being reengineered using these shells.
b. the primary problem solving method is pattern recognition. Some additional features are the following: a knowledge base viewable through objects, production rules, or decision tables; uncertainty handling through non-numerical means; support for report writing; and a self-documenting knowledge base. ACQUIRE is available for Microsoft Windows (3.1 or higher) only.
The future for expert/knowledge-based systems development is bright; however, there remain
Research status of domain knowledge graph and its application in water conservancy

Journal of Hohai University (Natural Sciences), Vol. 49, No. 1, Jan. 2021
DOI: 10.3876/j.issn.10001980.2021.01.005
Funding: National Key Research and Development Program of China (2018YFC0407901); Key Natural Science Research Project of Anhui Higher Education Institutions (KJ2019A1277)
About the author: FENG Jun (born 1969), female, professor, Ph.D., works mainly on data management, intelligent data processing and data mining, and water conservancy informatization. E-mail: fengjun@
Corresponding author: HANG Tinting, associate professor. E-mail: httsf@
Cite this article: FENG Jun, HANG Tinting, CHEN Ju, et al. Research status of domain knowledge graph and its application in water conservancy[J]. Journal of Hohai University (Natural Sciences), 2021, 49(1): 26-34.
Research status of domain knowledge graph and its application in water conservancy
FENG Jun 1, HANG Tinting 1,2, CHEN Ju 1, WANG Yunfeng 1, WANG Bingfa 1, ZHANG Tao 1
(1. College of Computer and Information, Hohai University, Nanjing 211100, China; 2. Key Laboratory of Unmanned Aerial Vehicle Development and Data Application of Anhui Higher Education Institutes, Maanshan 243031, China)
CLC number: TP391.1; Document code: A; Article ID: 10001980(2021)01002609
Abstract: Firstly, this study summarizes the current research status of domain knowledge graphs. Secondly, it introduces their development trends. Then, it sorts out the difficulties in constructing a water conservancy knowledge graph, proposes a research framework comprising the key modules of knowledge representation, extraction, fusion, reasoning, and storage, and briefly summarizes the research content of each module. Finally, it points out that domain knowledge graph construction suffers from problems such as a single form of representation, scarce extraction samples, multi-source knowledge conflicts, rule
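representation difficulty, and inefficient data management. Therefore, rationalized representation, accurate and comprehensive extraction, real-time fusion, interpretable reasoning, and high-performance storage are regarded as the next research directions for water conservancy knowledge graphs.
Key words: domain knowledge graph; water conservancy; big data; knowledge representation; knowledge extraction; knowledge fusion; knowledge reasoning; knowledge storage
As artificial intelligence research has advanced, its main direction has moved from computational intelligence, characterized by fast calculation and memory storage, to perceptual intelligence, with capabilities such as vision, hearing and touch, and is now moving toward cognitive intelligence, which involves understanding and thinking. Knowledge graphs, and the family of knowledge engineering technologies they represent, are at the core of cognitive intelligence. A knowledge graph is essentially a semantic network that reveals the relationships between entities and can formally describe real-world things and their interrelations [1]. Its strong capabilities for semantic processing and interlinked organization are of practical significance for describing the associations among data and thereby breaking down information silos. Domain-oriented knowledge graphs have already appeared in several fields, for example IMDB [2] in film, BMKN [3] in biomedicine, ECKG [4] in news, and SHKG [5] in health. Existing domain knowledge graphs show that building one draws on the methods of general knowledge graphs but also depends on industry-specific data and carries industry-specific meaning; the construction of domain knowledge graphs is an important direction and trend in current knowledge graph research.
With the development of water conservancy informatization and information technology, long-term business practice in the water sector has accumulated multi-source heterogeneous big data covering real-time monitoring, remote sensing and telemetry, hydrology and meteorology, hydraulic engineering, and socio-economic data, extending water monitoring from points to areas and from static to dynamic. With the rapid development of information acquisition and transmission technology and the progress of informatization in the domain, domain data are continuously updated, data volumes keep growing, and semantic inconsistencies between data sets are common. Multi-source heterogeneous data are massive, dynamic, diverse in content and complex to process. Interlinking data that are stored and managed in a distributed way and differ in semantics, fully exploiting the value of domain data, and promoting the efficient use of information resources are key to advancing smart water conservancy [6]. They are also the basis for applications such as query and recommendation of water information resources, semantic search, smart flood control [7-8] and smart water resources management [9], and are very important for raising the level of intelligent management in the water sector and assisting managers in decision analysis [10]. Research on water conservancy knowledge graphs therefore has both theoretical significance and clear practical value.
This paper summarizes the research status of domain knowledge graph construction, including progress in construction approaches and applications; introduces the development trends of domain knowledge graph construction in recent years; and looks ahead to the construction of a water conservancy knowledge graph, proposing a research framework and specific research content.
1 Research status of domain knowledge graph construction
By coverage, knowledge graphs can be divided into general knowledge graphs and domain knowledge graphs. A general knowledge graph targets the general domain, consists mainly of common-sense knowledge, and is built in a highly automated way; most of what it links is static, objective, explicit factual knowledge in triple form. A domain knowledge graph targets a specific domain, consists mainly of industry data, and is built semi-automatically; it links not only static knowledge but also some dynamic knowledge. This paper focuses on domain knowledge graph construction.
1.1 Construction approaches for domain knowledge graphs
There are currently two main construction approaches: top-down and bottom-up. In the top-down approach, experts in a given industry first define the top-level ontology and data schema, and the extracted entities are then added to the knowledge base. Existing ontology modelling tools are represented by Protégé and PlantData. Protégé is an open-source ontology editor based on Semantic Web standards such as RDF(S) and OWL; it has a graphical interface and suits prototype construction. PlantData is a commercial knowledge graph platform. It supports defining and editing ontology concept classes, relations, attributes and instances while hiding the underlying ontology description language, so users only need to build the domain ontology model at the conceptual level, which makes modelling more convenient. The bottom-up approach relies mainly on open linked data sets and encyclopedia websites: it learns automatically from this structured knowledge and merges the entities, relations and attributes found in the extracted data directly into the knowledge graph [11]. The top-down approach helps in extracting new instances and ensures extraction quality, whereas the bottom-up approach can discover new patterns. Most domain knowledge graphs are therefore currently built by combining the two.
1.2 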
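Application status of domain knowledge graphs
Domain knowledge graphs are usually used to support complex analytical applications or decision support. Applications of domain knowledge graphs now exist in most fields. Because application scenarios and purposes differ, the form of application also varies from field to field. The following describes the application status of several domain knowledge graphs from the perspective of knowledge application.
a. E-commerce knowledge graphs. The main application scenario is shopping guidance, that is, making it easier for consumers to find what they want. To this end, e-commerce knowledge graphs have learned a large number of industry specifications and national standards and handle specialized vocabulary in finer detail. They can also identify recent trending terms from public media and professional communities; when a consumer enters a trending term, related products can be shown. In addition, e-commerce knowledge graphs can recommend scenario-related products through scenario construction.
b. Medical knowledge graphs. The main applications include intelligent assistance in the care process, medical research, and patient services. Intelligent assistance uses the medical knowledge graph to provide services such as clinical decision support and rational drug use. Medical research uses the graph to help medical workers with applications such as disease risk prediction and drug development. Patient services provide everyday services such as health knowledge delivery and health assessment, based on a patient's past medical records and related medical knowledge.
c. Enterprise knowledge graphs. Enterprise knowledge graphs provide risk management for industry customers through methods such as abnormal association mining and ultimate controller identification. Abnormal association mining uses path analysis, association exploration and similar operations to uncover abnormal associations between enterprises, reducing business and financial risk. Ultimate controller identification finds the shareholder with the largest shareholding and traces it back to a natural person or a state-owned assets administration, giving industry users more accurate intelligent services.
d. Venture capital knowledge graphs. The main applications are knowledge retrieval and visual decision support. In knowledge retrieval, the machine identifies the user's search intent and returns accurate answers. Visual decision support uses graph visualization to display comprehensive information about companies, the investment preferences of investment institutions, and so on, supporting investment and financing decisions.
In general, the deep integration of knowledge graphs with various industries has become an important trend. In this process, a series of domain applications have emerged that can address industry pain points.
2 Development trends in domain knowledge graph construction
The construction of a domain knowledge graph mainly involves five aspects: knowledge representation, knowledge extraction, knowledge fusion, knowledge reasoning, and knowledge storage. Although the underlying principles and applications have achieved good results, the field is still developing rapidly. In recent years the trends have shifted as follows:
a. Knowledge representation. Factual knowledge is currently represented mostly as triples. However, applications such as decision-making and reasoning depend on a large amount of expert knowledge and dynamic knowledge, and the representation of expert knowledge goes beyond conventional knowledge representation. Empowered by big data, the focus of knowledge representation will inevitably shift toward dynamic knowledge.
b. Knowledge extraction. Current research concentrates on extraction from plain text. When training samples are plentiful, neural extraction models perform well. However, domain knowledge mostly has to be extracted in few-shot, zero-shot and open-domain settings, so the focus of knowledge extraction will inevitably shift toward few-shot and zero-shot information extraction.
c. Knowledge fusion. Current research focuses on one part of the fusion process or only on the fusion mode; research on conflict detection, entity alignment, attribute alignment and attribute truth discovery lacks continuity. In addition, as large amounts of new knowledge arrive, the focus of knowledge fusion will inevitably shift toward real-time fusion of new knowledge.
d. Knowledge reasoning. Current research mainly uses rule-based and logic-based methods to mine implicit knowledge in domain graphs or to correct erroneous knowledge, but these methods depend heavily on rules. Graph neural networks combine connectionism and symbolism: they allow deep learning models to be applied to non-Euclidean structures such as graphs and also give such models a degree of causal reasoning ability [12]. The focus of knowledge reasoning will inevitably shift toward deep reasoning over graph structures.
e. Knowledge storage. Domain knowledge graphs are currently stored mostly in traditional relational databases. However, because low-selectivity and complex queries are inefficient, the focus of knowledge storage will inevitably shift toward distributed RDF query optimization.
3 Construction of the water conservancy domain knowledge graph
3.1 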
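Difficulties in constructing the water conservancy domain knowledge graph
a. Knowledge representation. Domain applications need dynamic knowledge as well as static knowledge. How to represent the extracted static and dynamic knowledge appropriately is the main technical difficulty. In addition, many pieces of knowledge and many facts hold only under temporal and spatial conditions, so extending knowledge representation along the spatio-temporal dimension is another difficulty to be solved.
b. Knowledge extraction. Extraction from plain text is the main difficulty. Some text extraction algorithms achieve good experimental results on public data sets but generally do not scale well when applied to the water conservancy domain. The difficulty lies in building effective few-shot models that fit the small-sample nature of domain knowledge graphs.
c. Knowledge fusion. The main difficulties are: (a) inaccurate entity correspondence, since the same entity name is often ambiguous across data sources and the sources suffer from a serious multi-source reference problem [11]; (b) different data sources express the same attribute of the same entity differently [13]; (c) different data sources provide conflicting values for the same attribute of the same entity [11].
d. Knowledge reasoning. Because existing water conservancy applications need to obtain information from the graph with high accuracy, reasoning methods based on description logic and rules can be used effectively. The difficulty lies in designing inference rules based on first-order predicate logic.
e. Knowledge storage. The main difficulties are: (a) as water conservancy data keep growing, the scale of RDF data increases, and existing centralized data management systems cannot meet the storage and query performance requirements of large-scale RDF data, so a high-performance distributed data management system [14] is needed to store, index and query it; (b) existing distributed data management systems optimize for specific types of queries [15], but are inefficient for the low-selectivity, large-diameter queries common in the water conservancy domain; (c) existing distributed data management systems cannot adapt dynamically to changes in workload [16].
3.2 Overall framework of the water conservancy domain knowledge graph
To address the five difficulties above and to build a water conservancy domain knowledge graph, this paper proposes the research framework shown in Fig. 1. Within this framework, water conservancy knowledge representation is studied first, establishing two different forms of representation; next, extraction methods are studied for the different types of water conservancy data; then, specific methods for knowledge fusion and reasoning are studied; finally, making full use of water conservancy big data and related storage technologies, the domain knowledge is stored to support applications.
Fig.1 Modeling framework of domain knowledge graph in water conservancy
3.3 Research content of water conservancy domain knowledge graph construction
The construction process can be summarized in five modules: knowledge representation, knowledge extraction, knowledge fusion, knowledge reasoning, and knowledge storage. Knowledge representation expresses water conservancy knowledge as structured knowledge that a computer can store and compute with. Knowledge extraction obtains knowledge elements from large amounts of structured, semi-structured and unstructured water conservancy data. Knowledge fusion removes ambiguity among entities, relations, attributes and objects, and updates old knowledge or adds new knowledge to the graph. Knowledge reasoning mines implicit knowledge or missing facts from existing knowledge, enriching and extending the knowledge base. Knowledge storage designs effective storage schemas to support efficient management of water conservancy data.
3.3.1 Water conservancy knowledge representation
The triple is a general form of representation for knowledge graphs [17]. It consists of two semantically connected water conservancy entities and the relation between them, and is an intuitive representation of water conservancy knowledge. The basic forms of triples are (entity 1, relation, entity 2) and (entity, attribute, attribute value). Concepts are classes of water conservancy objects, such as water resources zones, river basin zones, lakes, gauging stations, rivers, reservoirs and hydropower stations. Entities are the most basic elements of the knowledge graph, such as Huxi District, the Yangtze River basin, Fenhu Lake, Wujiang Waterworks, the Taipu River, Qingshan Reservoir and Longtou Hydropower Station. Relations exist between entities, such as belongs to, located at, flows into, and contains. Attributes are the features and parameters an object may have, such as lake code, lake name and cross-boundary type. Attribute values are the values of specific attributes of an object, such as FH407, FHBA1B00000M and inter-provincial.
Each entity is identified by a globally unique ID; the intrinsic characteristics of an entity are described by attribute and attribute-value pairs, and the links between entities are described by relations. The existence of a triple represents an existing fact. For example, Taihu Lake is described as follows: the Taihu basin includes the southern Jiangsu region of Jiangsu Province, the Hangzhou-Jiaxing-Huzhou region of Zhejiang Province, the mainland part of Shanghai (excluding the three islands of Chongming, Changxing and Hengsha) and a small part of Xuancheng in Anhui Province, with a total area of 36,900 km². The water surface area of the basin is 5551 km²; the total river length is about 120,000 km, and the river density is 3.3 km/km². The basin terrain is dish-shaped, high around the edges and low in the middle; the land is flat, the river gradient is small, and the flow velocity is slow. This description can be represented by the triples in Table 1.
Table 1 Triple representation of Taihu Lake
Basic form: (entity 1, relation, entity 2)
  Entity 1        Relation    Entity 2
  Taihu basin     includes    Southern Jiangsu region
  Taihu basin     includes    Hangzhou-Jiaxing-Huzhou region
  Taihu basin     includes    Mainland part of Shanghai
  Taihu basin     includes    Small part of Xuancheng
Basic form: (entity, attribute, attribute value)
  Entity          Attribute            Attribute value
  Taihu basin     total area           36,900 km²
  Taihu basin     water surface area   5551 km²
  Taihu basin     total river length   120,000 km
  Taihu basin     river density        3.3 km/km²
  Taihu basin     terrain              dish-shaped
  Taihu basin     relief               flat
  Taihu basin     river gradient       small
  Taihu basin     flow velocity        slow
All the triples together form a graph (Fig. 2), in which nodes represent entities, directed edges represent relations between entities, and different relations carry different edge labels.
Fig.2 Schematic diagram of knowledge representation in water conservancy
3.3.2 Water conservancy knowledge extraction
With the rapid development of water conservancy information technology, a great deal of water conservancy knowledge resides in the structured data of information systems, in semi-structured tables and web pages, and in unstructured text. Different extraction methods are used for different types of data.
For structured data, a knowledge graph construction method based on D2R technology is studied, which uses the structured object data in information systems to extract static objects and their relations. Structured data extraction is shown in Fig. 3(a). The basic steps are: (a) analyse the relational database to determine whether two tables that can be linked are associated by a foreign key; if not, a foreign key must be set manually or written into the mapping file; (b) once the foreign key relation is established, mapping the two tables to RDF achieves semantic interlinking. Through these operations, the relation between two entities can be represented appropriately.
For semi-structured data, a wrapper is used to extract attributes and attribute values from semi-structured HTML pages distributed across the Internet. Semi-structured data extraction is shown in Fig. 3(b). The basic steps are: (a) HTML page cleaning and parsing, converting the page into a DOM tree; (b) page denoising, removing information unrelated to the topic; (c) automatic wrapper generation, which automatically obtains the XPath of the required information nodes, defines rule templates, and combines them with XPath expressions to construct extraction rules automatically. Through these operations, the attributes and attribute values related to an entity can be extracted.
For unstructured data, a method based on distant supervision and neural networks is used to extract knowledge from water conservancy text. Unstructured data extraction is shown in Fig. 3(c). The basic steps are: (a) use distant supervision to generate labelled data automatically from the knowledge base, then remove wrong labels by outlier detection; (b) use a supervised neural network, first training on the labelled data and then testing on unlabelled data, to extract the entities and relations contained in unlabelled text. Through these operations, some of the static and dynamic knowledge needed by the knowledge graph can be supplemented.
3.3.3 Water conservancy knowledge fusion
Encyclopedia websites describe one entity per page, have a relatively uniform page structure, and offer relatively high information quality, so they have become the main data source for knowledge fusion in domain knowledge bases [18]; the attribute and attribute-value pairs in their infoboxes are a highly condensed summary of the page's entity. Fusing the knowledge cards that describe the same entity in different encyclopedias yields more comprehensive and higher-quality knowledge about water conservancy objects. To address the difficulties of multi-knowledge-base fusion described earlier, a fusion method based on the knowledge cards of Chinese Wikipedia, Baidu Baike and Hudong Baike is studied. Fig. 4 shows the fusion process. Through multi-feature named entity disambiguation, dictionary-based attribute alignment, and Bayesian-analysis-based attribute truth discovery, ambiguity among entities, relations, attributes and their objects is removed, finally yielding the attributes related to water conservancy objects and the corresponding attribute values.
Fig. 5 shows the query result after fusing the three encyclopedias and the local knowledge base for the water conservancy object "Taihu Lake". Blue squares represent the initial water conservancy domain knowledge graph, red squares Chinese Wikipedia, yellow squares Baidu Baike, and green squares Hudong Baike. The fusion result for "Taihu Lake" shows that the information provided by the local domain knowledge graph has good industry coverage and depth and gives core support for building the water conservancy knowledge graph. Chinese Wikipedia describes the object more from a professional perspective and provides more rigorous knowledge. The knowledge cards of Baidu Baike and Hudong Baike overlap heavily, and the attributes they cover cater more to the leisure needs of the general public, such as the best season to visit Taihu Lake, the suggested length of visit, and ticket prices.
3.3.4 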
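Water conservancy knowledge reasoning
Knowledge reasoning aims to derive new facts from knowledge already in the graph [19]. Because water conservancy knowledge comes from diverse sources, and the collection of knowledge and data is limited to terminal acquisition and lacks an overall view, reasoning methods are needed to supplement the knowledge. For example, suppose the graph contains two triples obtained from different data sources: (Taihu Lake, outlet, Taipu Gate) and (Taipu Gate, belongs to, Taipu River). Knowledge reasoning can then derive the new fact (Taihu Lake, flows into, Taipu River). The main methods for domain knowledge reasoning are currently rule-based reasoning [20], ontology-based reasoning [21-22], representation-model-based methods [23-25], and neural-network-based methods [26]. Analysis of business requirements shows that the water conservancy knowledge graph must support real-time query and decision-making, which imposes a high accuracy requirement on its construction. In addition, the graph is strongly hierarchical: in practical scenarios it can be partitioned according to the layered relations of management units, geographic space, and river and pipe networks, which reduces the search space. Given the high accuracy requirement and this partitionability, rule-based reasoning is the most suitable method. It combines existing water conservancy domain knowledge with manually defined inference rules. The process is: (a) at the concept level, define inference rules expressed in first-order predicate logic; (b) at the instance level, instantiate the rules to find the relation facts that satisfy them. Table 2 lists some of the inference rules and their meanings.
Fig.3 Schematic diagrams of knowledge extraction in water conservancy
Fig.4 Flow chart of knowledge fusion in water conservancy
Fig.5 Schematic diagram of knowledge fusion in water conservancy
Table 2 Rules of knowledge reasoning in water conservancy
No. 1
  Rule: (river, flows into, reservoir), (hydropower station, belongs to, reservoir) → (hydropower station, located on, river)
  Meaning: the hydropower station is on the river where its reservoir is located
No. 2
  Rule: (pumping station, has, water intake), (pumping station, located at, lake), (lake, belongs to, river basin zone) → (water intake, belongs to, river basin zone)
  Meaning: the water intake belongs to the river basin zone of the lake where the pumping station is located
No. 3
  Rule: (bridge, located on, river reach), (river reach, belongs to, river) → (bridge, spans, river)
  Meaning: the bridge spans the river to which the river reach belongs
3.3.5 Water conservancy knowledge storage
The optimization goals of knowledge storage are to reduce the storage of redundant data and to improve query efficiency. To achieve them, the following measures are adopted: (a) because centralized systems cannot meet the storage and query processing needs of large-scale water conservancy RDF data, a shared-nothing cluster is used to process large-scale RDF data in a distributed manner; (b) because low-selectivity, large-diameter queries in the water conservancy domain are inefficient and scale poorly with the query workload, a hybrid relational storage schema based on vertical partitioning and hash partitioning is studied, in which frequent patterns in the query workload are monitored and used to guide incremental repartitioning of the RDF data, improving scalability with respect to the workload; (c) a cost evaluation model is set up for algebraic optimization and join order optimization, improving the efficiency of distributed queries.
The storage process is shown in Fig. 6. Quality-assessed water conservancy knowledge is first hash-partitioned by subject to form triple tables (TT). The hash-partitioned triple tables are then vertically partitioned into vertical partition tables (VP) that contain only subject and object columns. Finally, a query monitor observes the query workload and mines frequent patterns, and semi-joins are computed over the vertical partition tables corresponding to the frequent patterns to form frequent-predicate extended vertical partition tables (FP-ExtVP). The different types of tables above are all stored in Parquet format on each of the cluster's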
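The first two storage steps, subject-based hash partitioning into triple tables followed by vertical partitioning by predicate, can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the paper's implementation: the function names, the use of CRC32 as the hash, the two-worker cluster, and the in-memory lists standing in for Parquet files are all hypothetical, and the FP-ExtVP semi-join step is omitted.

```python
import zlib
from collections import defaultdict

def hash_partition(triples, workers):
    """Assign each triple to a worker by hashing its subject (triple tables, TT)."""
    parts = [[] for _ in range(workers)]
    for s, p, o in triples:
        # CRC32 is used here only because it is deterministic across runs.
        index = zlib.crc32(s.encode("utf-8")) % workers
        parts[index].append((s, p, o))
    return parts

def vertical_partition(triple_table):
    """Split one triple table into per-predicate tables of (subject, object) rows (VP)."""
    vp = defaultdict(list)
    for s, p, o in triple_table:
        vp[p].append((s, o))
    return dict(vp)

# Triples taken from the Taihu Lake reasoning example in the text.
TRIPLES = [
    ("Taihu Lake", "outlet", "Taipu Gate"),
    ("Taipu Gate", "belongs_to", "Taipu River"),
    ("Taihu Lake", "flows_into", "Taipu River"),
]

parts = hash_partition(TRIPLES, workers=2)
vp_tables = [vertical_partition(tt) for tt in parts]
for worker, tables in enumerate(vp_tables):
    print(worker, tables)
```

Hashing on the subject keeps all triples about one entity on the same worker, so queries that join on a shared subject can be answered without moving data between workers.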
biddinggame
© Shreekant W Shiralkar 2016. S. W. Shiralkar, IT Through Experiential Learning, DOI 10.1007/978-1-4842-2421-2_3
Context: Collaborative Learning and Collective Understanding of ERP
During our school days, my friends and I frequently discussed specific topics from our textbooks. Each of us comprehended a specific aspect of the larger subject, and when we shared our understanding of the topic, we found that our collective understanding raised each individual's understanding much faster and deeper than struggling individually to comprehend the subject. Later, we even formalized the process during the examination period, as we found that it helped us learn quickly. During my college days, we took the technique further by forming study groups; when we had difficulty understanding a topic, we broke it into subtopics, distributed them among the group to learn individually, and then shared what we had learned with the rest of the group. The process helped each of us comprehend knowledge that appeared difficult and complex to us as individuals. The results of learning through discussion were impressive and gave me insight into a few aspects of the concept formally known as "cooperative learning," which defines the process of learning together rather than being passive individual receivers of knowledge (e.g., teacher lecturing and students hearing). This process allows learners to use the cognitive skills of questioning and clarifying, extrapolating and summarizing.
In one of my assignments, I was engaged to train the top management of an organization on ERP and the impact of its implementation. I anticipated that it would be a huge challenge to engage top executives in this training, as most would have had some understanding already, and applying a conventional training process risked losing their attention if my co-trainer or I fell short of their expectations. 
While each top executive may individually have had generic knowledge of ERP, they certainly lacked comprehensive knowledge and, more specifically, a seamless collective understanding of the subject without gaps due to individual interpretations or exposures. The task, therefore, was multifaceted: on one hand, I had to get them interested in learning aspects of which they lacked knowledge, and on the other, I had to encourage them to share their individual understandings of the subject, facilitating the development of collective learning.
A top executive is expected to take calculated risks in almost every key decision, whether it is bidding for a large contract or establishing a price point while taking a privately held organization public. The process of bidding involves awareness of collective knowledge of capability, assessment of the competition, and the expertise to apply judgment based on rational (and some irrational) criteria. In the knowledge-driven economy, the contributions of each employee, regardless of level, add up to the collective capability of the organization.
With a view to facilitating collective learning in the shortest possible time for these top executives, I conceived a "Bidding Game" that leveraged cooperative learning to teach the ERP solution and the impact of its implementation in one session. The result of the Bidding Game was outstanding.
This is the premise of the game that will be explained in this chapter. The game also helps induce elements of social skills, such as effective communication and interpersonal and group skills, in learning an otherwise abstract and complex subject.
The Bidding Game is played by all the participants, divided into two or more teams. Teams compete on the strength of their collective knowledge of the subject. The game concludes after collective learning on a specific subject has reached the appropriate level on all the essential aspects. 
The game format encourages each participant to contribute his or her knowledge of the subject and help the team win. A notional value attached to each correct and complete response helps measure the level of knowledge among participants. The competition is premised on the accuracy of the initial bid, which adds a flavor of bidding.

Figure 3-1 will help you visualize the setting created for the participants of the Bidding Game.

Figure 3-1. Instructor inviting bids

In a hall, participants are seated in a U-shaped arrangement, facing the projector screen. The hall has two whiteboards, one on either side of the projector screen. One of the whiteboards is titled “Knowledge Bid” and displays the bids of the participating teams.

The second whiteboard records the actual earnings, or SCORE, of each team. The projector screen is used to publish the question for each bid; the instructor allows the teams to respond in sequence and records the score on the whiteboard on the basis of the accuracy and completeness of each team’s response (Figure 3-2).

Figure 3-2. Instructor inviting response to question

In designing the Bidding Game, the elements of competition and encouraging discussion on each aspect form the core theme.
The competitive aspect triggers speed, the game element induces interest without force or pressure, and finally discussions and sharing of knowledge facilitate the desired coverage of the subject (for instance, technical nuances and features offered by new technology and/or processes), channelling accelerated learning and a collective understanding of the new technology and/or processes.

Bidding Game Design

To design the Bidding Game, I recommend ensuring that the pace of learning is accelerated gradually, and that learning begins with basic aspects and moves on to the advanced and complex aspects in sequence, instead of beginning with complex subjects and then concluding with basics. In the design of the sequence, care has to be exercised in segregating the basic and must-learn aspects from the “nice-to-know” aspects, and the design should ensure that the basic and must-learn aspects are learned while provisioning for the nice-to-know ones based on the interest and appetite of the participants. Design the sequence in such a way that initially the participants need to spend less time and are drawn toward the game and the competition, while later parts of the sequence should ensure that participants spend more time in discussions and in staying ahead of the competition.

The objective, rapid development of collective learning of technology and/or new processes, necessitates a short duration for the Bidding Game.

Let us now examine the task-level details of the Bidding Game, beginning with preparation/planning, recommended rules, and then the process for its execution, including steps to consolidate learning after conclusion. An overview of the entire game is depicted in Figure 3-3. The activities in the process flow are described in detail in the following sections.

Preparation/Planning

• Divide the subject into 20 subtopics that cover the subject comprehensively.
• Create a question for each of the subtopics.
• Create a sequence of questions in a way that gradually raises the level of knowledge.
• Segment the questions into three levels: Rookie, Advanced, and Expert.
• Assign different values to questions from the three sets, for example, $100 per question from the Rookie level, $200 per question from the Advanced level, and $300 per question from the Expert level.
• Develop a clear rule set for the Bidding Game that can be used to explain the game to the participants.
• Have a scoreboard that displays the bid value of each team and also its score during the progress of the game (use the whiteboard marker pens).
• Have a large clock for monitoring time and identify assistants for keeping time and recording the score.

Figure 3-3. Bidding Game process flow

Recommended Rules

• The winner is chosen on the basis of two parameters: a high score, and a score that is closest to the team’s bid.
• Each wrong or incomplete response incurs a loss of value (i.e., negative marking); for example, a $50 penalty for each wrong or incomplete response.
• $50 is deducted from the value of a passed-over question or a partly answered question.
• The completeness of the response to a question can be challenged by competing teams to apply a penalty and reduce the score.
• There’s a limit of 5 minutes for responding to each question. The sequence in which each round begins could be arranged to give all the teams a fair chance.

Once all the preparation is completed, the game can begin.

Execution

1. All the participants are told the context and rationale for the game (i.e., what ERP is and the importance of each of them having a collective understanding of the subject, which would maximize benefit from its implementation). Also, it should be explained how playing a game such as this can increase individual understanding much faster and more deeply than individually struggling to comprehend the subject in isolation.
2. Participants are divided into teams. Team formation can be done in any way that generates nearly equal numbers of participants for each team (dividing the room, counting off by twos, etc.).
3. The instructor/quiz master (QM) invites bids from each of the teams, which are recorded on the whiteboard for everyone to see.
4. The instructor launches the first question on the screen and invites the first team to take its chance, while the timekeeper monitors the time taken by the responding team.
5. On the basis of correctness and completeness of the response, the instructor assigns a score to the team, which is recorded on the second whiteboard.
6. In case the question is passed to the second team and they are able to respond correctly and completely, the reduced score is recorded.
7. In case the question is not answered or is incompletely answered by any of the teams, the instructor shares the correct and complete answer and the subject is discussed and clarified.
8. The process continues until the subject is completely covered.
9. The instructor tallies the scores for the teams and announces the winner on the basis of the high score and the bid accuracy.

Once the game is over, observations from the experience are collected and crystallized into learning, as described in the next section.

Conclusion

• The learning gained through the game needs to be articulated and consolidated.
Debrief is a process that aids in articulating the learning that participants gained during the game.
• The process of debrief begins with each participant sharing learning, specifically something that has changed their understanding of the subject during the game.
• Each participant would have learned something new, be it a very basic addition to earlier knowledge of the subject or very complex information that the participant hadn’t ever known before.
• The individual learnings are recorded on a whiteboard, which helps in crystallizing and consolidating collective understanding of the subject.
• Once the game is over, the learning can be consolidated by presenting additional material by way of slides, videos, and so on.

Sample Artifacts

With a view to facilitating the immediate application of the approach in the chapter, a sample list of questions on ERP and Big Data, along with an illustrative score sheet with result, is provided in the following section. The correct responses among the multiple choices are identified in bold.

Sample Question Cards: ERP

1. What is the extended form of ERP?
   a. Enterprise Retail Process
   b. Enterprise Resource Planning
   c. Earning Revenue and Profit
   d. None of the above
2. Real time in the context of ERP relates to which of the following?
   a. Time shown in the computer system syncs with your watch
   b. Processes/events happen per transaction at the same instant
   c. Both of the above
   d. None of the above
3. What does “SOA” stand for in relation to ERP system architecture?
   a. Service-Oriented Architecture
   b. System of Accounts
   c. Statement of Account
   d. None of the above
4. Which of these is not a packaged ERP?
   a. SAP
   b. Oracle
   c. Windows
   d. JD Edwards
5. In the context of packaged ERP, do “Customization” and “Configuration” refer to the same process, or are they different?
   a. Same
   b. Different
   c. Don’t know
6. Materials Management in ERP helps to ensure which of the following?
   a. Increase of inventory
   b. Inventory is well balanced
   c. Both of the above
   d. None of the above
7. Sales and Distribution Module in ERP helps in which of the following?
   a. Increased customer service
   b. Reduced customer service
   c. Both of the above
   d. None of the above
8. Financial and Controlling Module in ERP helps in which of the following?
   a. Evaluating and responding to changing business conditions with accurate, timely financial data
   b. Easy compliance with financial reporting requirements
   c. Standardizing and streamlining operations
   d. All of the above
   e. None of the above
9. Gain from implementation of ERP results in which of the following?
   a. Improved business performance
   b. Improved decision making
   c. Increased ability to plan and grow
   d. All of the above

Sample Question Cards: Big Data

1. What is Big Data?
   a. Data about big things
   b. Data which is extremely large in size (in petabytes)
   c. Data about data
   d. None of the above
2. Which are not characteristics of Big Data?
   a. Volume
   b. Velocity
   c. Virtuality
   d. Variety
3. Which are key inputs for Big Data?
   a. Increased processing power
   b. Availability of tools and techniques for Big Data
   c. Increased storage capacities
   d. All of the above
4. Which are applications of Big Data?
   a. Targeted advertising
   b. Monitoring telecom network
   c. Customer sentiments
   d. All of the above
5. Which tools are used for Big Data?
   a. NoSQL
   b. MapReduce
   c. Hadoop Distributed File System
   d. All of the above
6. Social media and mobility are key contributors to Big Data: true or false?
   a. True
   b. False
7. Which is not a term related to Big Data?
   a. Databases MongoDB
   b. Data Trigger
   c. Pig
   d. SPARK

Benefit Assessment

After consolidation of the learning, it’s recommended to conduct a benefit assessment exercise to measure the gains from application of the game-based approach. The assessment could be in the form of a written quiz on the subject with multiple-choice answers.
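The recommended rules of the Bidding Game (question values per level, the $50 deduction, and a winner chosen on score and bid accuracy) can be sketched in a few lines of Python, for instance to keep the scoreboard during a session. This is only an illustrative sketch: the names Team, record, bid_gap and pick_winner are hypothetical, and the way the two winning parameters are combined (highest score first, smallest gap to the bid as tie-break) is an assumption, since the rules name both parameters without saying how they are weighed.

```python
from dataclasses import dataclass

# Example values quoted in the preparation and rules sections.
QUESTION_VALUE = {"rookie": 100, "advanced": 200, "expert": 300}
PENALTY = 50  # wrong/incomplete answer, or deduction on a passed-over question


@dataclass
class Team:
    name: str
    bid: int        # the team's initial "knowledge bid"
    score: int = 0  # actual earnings recorded on the second whiteboard

    def record(self, level: str, correct: bool, passed_over: bool = False) -> None:
        """Update the score after one response."""
        if not correct:
            self.score -= PENALTY                          # negative marking
        elif passed_over:
            self.score += QUESTION_VALUE[level] - PENALTY  # reduced score
        else:
            self.score += QUESTION_VALUE[level]

    @property
    def bid_gap(self) -> int:
        """Distance between the final score and the initial bid."""
        return abs(self.score - self.bid)


def pick_winner(teams):
    # Assumption: highest score wins; the smaller bid gap breaks ties.
    return min(teams, key=lambda t: (-t.score, t.bid_gap))


alpha = Team("Alpha", bid=500)
beta = Team("Beta", bid=300)
alpha.record("rookie", correct=True)
alpha.record("advanced", correct=True)
alpha.record("expert", correct=False)
beta.record("rookie", correct=True, passed_over=True)
beta.record("expert", correct=True)

winner = pick_winner([alpha, beta])
print(winner.name, winner.score, winner.bid_gap)  # Beta 350 50
```

A facilitator who prefers to weigh bid accuracy more heavily could change only the key used in pick_winner; the per-question bookkeeping stays the same.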
Developing
Design Research in the Netherlands

7. Developing NPD-Process Knowledge

Jan Buijs
Department of Product Innovation & Management
Sub-Faculty of Industrial Design Engineering
Delft University of Technology

7.1 Introduction

This conference on Design Research in the Netherlands 2000 gives us a nice opportunity to show the results of design research which is being carried out at the Delft School for Product Design (officially the Sub-faculty of Industrial Design Engineering at the Delft University of Technology). Since the 1995 conference a lot has happened. In those days the Delft School of Product Design was the independent Faculty of Industrial Design Engineering. Now we have merged with the Schools of Mechanical Engineering and Naval Architecture into the new Faculty of Design, Construction and Production (DCP). The number of students and staff for product design stayed constant for all those years (ca. 100 fte staff and 1600 students). Originally we had five organisational units: four “Vakgroepen” responsible for teaching and research in the fields of respectively Construction, Ergonomics, Formgiving and Management Sciences, with one shared “Werkgroep” responsible for teaching design.

Now we have three departments (“Afdelingen”), responsible only for research: Industrial Design (ID), Design Engineering (DE) and Product Innovation & Management (PI&M). All education is separately organised, headed by the Director of Education. Design teaching is an integral part of this organisation (although it is separately organised as the Institute for Design Teaching, IvOO = Instituut voor het Ontwerp Onderwijs) and has the same budgetary status as the three research departments ID, DE and PI&M. The Department of Industrial Design is the combination of the former Ergonomics and Formgiving groups, Design Engineering comes from the former Construction group and Product Innovation & Management comes from the Management Sciences group.
Design Methodology was part of the Management Sciences group and is now part of PI&M.

7.2 Design research

It could be argued that all research carried out within a school of product design is a form of design research, but that would be much too pretentious. For instance within the Department of Design Engineering research is done in the field of material sciences on plastics, and within Industrial Design researchers look at the physical limitations of elderly people in order to design better suited products for them. Within the Department of Product Innovation & Management research has been done on market introduction strategies for new products. These and other research projects are not considered as design research projects though.

It would be difficult to make a sharp distinction between what is design research and what is not, especially considering the multi-disciplinary character of design itself. I will limit design research to only those research subjects that are aimed at the development of process knowledge of the New Product Development (NPD) process and not covered by other traditional mono-disciplinary domains. This gap partly exists because the other disciplines are not interested in them (e.g. intuition and creativity by psychologists) or because they are unable to do it within a mono-discipline (e.g. real protocol analysis of product design projects needs both designers or engineers and psychologists).

I will also limit myself to the research work of the Department of Product Innovation & Management. Others at this conference will take care of the research work that is being done in the other departments.

By doing so I will not go into the research carried out within the Marketing group (a sub group of PI&M), because their research is part of the mono-discipline of marketing, even though they have, besides marketeers, economists, psychologists, communication scientists and even product designers in their staff.
I will only report about the developments within the two other groups of PI&M, the Design Methodology Group and the Management & Organisation Group.

7.2.1 The Design Methodology Group

(Permanent research staff per May 1st 2000: ir. Norbert Roozenburg, dr. Peter Lloyd and 2 vacancies. Temporary research staff: 2 vacancies).

This has been the core design research group at our school, right from its beginning in 1964. Design Methodology is one of the key elements in the curriculum of Delft School of Product Design. According to the research of Hanny de Wilde (1997), about the history and development of this school, explicit attention to design methodology was one of the key elements to start the first product design school in the Netherlands at a university level. The founder of our school, an architect called Joost van der Grinten, borrowed the ideas about design methodology both from the Royal College of Art in the UK and from the Hochschule für Gestaltung in Ulm, Germany. The work of Bruce Archer was quite influential.

The graduation work of our first graduate (Norbert Roozenburg in December 1971) was about the application of a specific design method in product design. He still works at the school and is not only very active in the design methodology and design research field, but is also the Director of the School’s Institute for Design Teaching. He is unable to be here because he is currently guest professor at the Danish University of Technology in Copenhagen. So I will be his humble representative.

The first professor in Design Methodology was Johannes Eekels (he became emeritus in 1987). Together with Norbert Roozenburg he produced numerous books and articles. The latest Dutch version of their book was published in 1998 (Roozenburg and Eekels 1998).
An English version was published in 1995 (Roozenburg and Eekels 1995).

Besides this traditional emphasis on the prescriptive and normative ways of designing, which is still of concern, the research in this field now also embraces empirical studies.

The publication of the book on the Delft workshop on protocol analysis is a landmark in this respect (Cross, Christiaans and Dorst 1996). The workshop was organised to discuss, among leading scholars in design research, the results of different analyses from shared data.

The shared data consisted of a protocol study on both individual and group design work. It was based on the same design brief. The experiment itself took place at Xerox PARC in California. The experimenters were Nigel Cross (at that time part-time professor in Design Methodology in Delft), Anita Cross, Henri Christiaans and Kees Dorst; the participating designers came from IDEO, the leading product design firm in the US.

The workshop offered a great deal of insight into how designers actually work. At the workshop invited scholars shared their results, ideas, objections and doubts. It was interesting to watch the discussion because every attendant of the workshop had used the same original data. It proved to be a very effective way of having detailed discussions about both the content of a design process as well as the way of doing protocol studies.

Another interesting project of this group has been the research of Kees Dorst. This empirically based study showed that different paradigms within the design research field can be used to study different aspects of design. Traditionally within the design research domain the rational problem solving paradigm, based on Herbert Simon’s ideas, is dominant (Simon 1967). Kees showed that this paradigm has its limitations, and looked for another paradigm. Donald Schön’s idea of “design as a reflective practice” proved to be this interesting other paradigm (Schön 1983).
Kees showed that using both paradigms to interpret the same empirical data leads to different views and different conclusions about how designers are really working (Dorst 1997). It is my opinion that this multi-paradigmatic analysis of product design will produce more interesting results.

The arrival, last year, of Peter Lloyd from the UK, an ethnographic oriented design researcher, is the next step to continue the current new stream of conducting further empirical studies.

The teaching of this group is focused on a fourth year course in Design Theory and Design Methods for all our design students. Of course the group is very active in the design studio work within the “IvOO”.

7.2.2 The Management & Organisation Group

(Permanent research staff per May 1st 2000: prof. dr. ir. Jan Buijs, ir. Frido Smulders, ir. Rianne Valkenburg, dr. Hanny de Wilde, and 2 vacancies. Temporary research staff: ir. Danielle Hendriks, ir. Remko van der Lugt, and 2 vacancies).

The main objective of the Management & Organisation Group, the group I am responsible for, is to study product design processes in their natural environment, that is in the competitive situation of design projects, within companies, working together with suppliers and customers. Its focus is on design as a business activity. We usually refer to it as “design in context” or “design in business”.

We are looking into product design as the result of teamwork. We are interested in both the communication within the team, as well as the influence of the project leader on team behaviour. This approach looks at team behaviour not in terms of group dynamics, but in terms of design work. Of course design work and group behaviour are intertwined, but we are primarily interested in the content of the product design work.

This shift from individual designers towards design groups has been caused by the very practice of industrial product development. Few product designs are the work of just one lonely designer.
Nowadays complicated consumer and industrial products are always the results of multi-disciplinary design teams.

However we are not only interested in the teamwork itself, but also in the interfaces between those design teams and the rest of the organisation.

We are continually conducting case studies of product development in real corporate situations. This allows us to compare empirical studies with theories of product development and has resulted in two books on Integrated New Product Development and a new course for our first year product design students (Buijs and Valkenburg 1996 and 2000).

During the discussions of the aforementioned Delft workshop on analysing design activities we discovered big differences in the ways psychologists and design researchers were looking at design behaviour. For example two researchers were looking at the same type of a group design activity. Both looked at a specific action on the videotape. However the psychologist looked at body language and group dynamics, while the design researcher looked at the content of the discussions within the design team. So for both there was something interesting to see, but the results were completely different. More surprisingly, sometimes the conclusions were completely different or even opposing.

This has led to some very intriguing research projects. Helga Hohn, a psychologist, started to look at the behaviour of team leaders in helping teams with innovative tasks. She questioned more than 75 international working professionals on how they inspire their (design) teams, how they keep them on track, and how they deal with the company pressure to perform better, quicker or cheaper. Once again process and content were very closely related, with “playing” proving to be very important in keeping teams alive and kicking (Hohn 1999).

Rianne Valkenburg, a design researcher, is looking at team design work on the content level.
She is comparing two teams of students designing during the Philips Design Competition, and two professional design teams, which took part in the earlier Delft experiment at Xerox PARC. Inspired by Kees Dorst’s work she is using Donald Schön’s paradigm to compare these different design teams. She has operationalised Schön’s theory and is heading towards some interesting conclusions about shared understanding and team communication based on the content of the design project (Valkenburg and Dorst 1998). Her thesis will be published at the end of this summer.

Within this team-based research Danielle Hendriks and Hanny de Wilde are doing research about the role and influence of project leaders on the results of the product design team. Besides interviewing project leaders in Dutch design consultancies, they were also allowed to study the archives of one of the leading Dutch design firms. From a knowledge management perspective these archives have not proved useful. However, they have shown that if designers want to learn from their past they have to be more accurate in what and how to file their actual design work. Recently, an e-mail-based way of making weekly diaries has been developed. In analysing these diaries they hope to find some of the heuristics project leaders use to solve their professional problems (Hendriks and De Wilde 1999). They are helped in this by a research student, Sjors Witjes, who is doing empirical research in cooperation with Stanford University. He is observing and interviewing project leaders of product development teams in the US high tech industry. Hopefully we can compare the results from the Netherlands with those from the US. These results will be integrated in our recently developed fourth year course on Product Development Management.

In our attempts to study the real life of designers we have discovered that most designers talk about intuition as an important element in their work.
Although intuition is difficult to study within the traditional way of doing scientific research, we have taken up the challenge. Robin Groeneveld has interviewed about twenty professional designers. Most of them are very explicit about the influence of intuition and about the way they can rely on it. Hopefully his PhD thesis will be published at the end of this year.

Finally within the Management & Organisation Group we are interested in stimulating creativity in product design. Not only have we developed a fourth year course on Creative Problem Solving (CPS), we have also started a research project in this field. Creative Problem Solving (e.g. brainstorming or synectics) is usually verbally based, while product designers tend to be visually oriented. The research project of Remko van der Lugt is trying to bridge the gap between the original CPS-rules and the more visual attitudes of product designers. The first results are promising (Van der Lugt 1998). An extended version of braindrawing, as opposed to brainstorming, seems to be an effective tool for product designers. His PhD thesis is scheduled for early next year.

Besides the already mentioned courses we are also teaching a third year course on Strategy and Organisation (Frido Smulders is responsible) and we all participate in the design studio work.

7.3 Final remarks

The research in both the Design Methodology Group and in the Management & Organisation Group is aimed at getting better insights into the process of New Product Development (NPD). With this insight we hope to improve the quality of product design work.

By sharing a selection of our work with other design researchers in the Netherlands we hope to get enough energy not only to continue, but also to improve.

7.4 References

Buijs, J. A., and Valkenburg, R. (1996). Integrale Produktontwikkeling, Lemma, Utrecht.
Buijs, J. A., and Valkenburg, R. (2000). Integrale Productontwikkeling - Tweede Geheel Herziene Druk, Lemma, Utrecht.
Cross, N. G., Christiaans, H., and Dorst, K. (1996). Analyzing Design Activities, Wiley, Chichester.
Dorst, C. H. (1997). Describing Design: A Comparison of Paradigms, PhD thesis, Delft University of Technology.
Hohn, H. (1999). Playing, Leadership and Team Development in Innovative Teams, PhD thesis, Delft University of Technology.
Hendriks, D., and Wilde, H. de (1999). Project Management for New Product Development Projects: An Empirical Study, in: Proceedings ICED ’99, München.
Lugt, R. van der, and Buijs, J. A. (1998). Creative Problem Solving in Product Development: An Exploration Into the Use of CPS in Design Practice, in: Dingli, S., Creative Thinking, Towards Broader Horizons, Malta University Press.
Roozenburg, N. F. M., and Eekels, J. (1995). Product Design: Fundamentals and Methods, Wiley, Chichester.
Roozenburg, N. F. M., and Eekels, J. (1998). Produktontwerpen, tweede druk, Lemma, Utrecht.
Schön, D. A. (1983). The Reflective Practitioner, Basic Books, New York.
Simon, H. A. (1967). Sciences of the Artificial, The MIT Press, Cambridge MA.
Valkenburg, R., and Dorst, K. (1998). The Reflective Practice of Design Teams, in: Design Studies, 19, pp. 249-271.
Valkenburg, R. (2000). The Reflective Practice of Product Design Teams, PhD thesis, Delft University of Technology, forthcoming in 2000.
Wilde, H. de (1997). Passie Voor Productontwikkeling, Lemma, Utrecht.
Complete genome sequence of C. parvum
Mitchell S. Abrahamsen et al., Complete Genome Sequence of the Apicomplexan, Cryptosporidium parvum, Science 304, 441 (2004); DOI: 10.1126/science.1094786
Copyright 2004 by the American Association for the Advancement of Science; all rights reserved.
(print ISSN 0036-8075; online ISSN 1095-9203) is published weekly, except the last week in December, by the Science o n O c t o b e r 7, 2009w w w .s c i e n c e m a g .o r g D o w n l o a d e d f r o m3.R.Jackendoff,Foundations of Language:Brain,Gram-mar,Evolution(Oxford Univ.Press,Oxford,2003).4.Although for Frege(1),reference was established rela-tive to objects in the world,here we follow Jackendoff’s suggestion(3)that this is done relative to objects and the state of affairs as mentally represented.5.S.Zola-Morgan,L.R.Squire,in The Development andNeural Bases of Higher Cognitive Functions(New York Academy of Sciences,New York,1990),pp.434–456.6.N.Chomsky,Reflections on Language(Pantheon,New York,1975).7.J.Katz,Semantic Theory(Harper&Row,New York,1972).8.D.Sperber,D.Wilson,Relevance(Harvard Univ.Press,Cambridge,MA,1986).9.K.I.Forster,in Sentence Processing,W.E.Cooper,C.T.Walker,Eds.(Erlbaum,Hillsdale,NJ,1989),pp.27–85.10.H.H.Clark,Using Language(Cambridge Univ.Press,Cambridge,1996).11.Often word meanings can only be fully determined byinvokingworld knowledg e.For instance,the meaningof “flat”in a“flat road”implies the absence of holes.However,in the expression“aflat tire,”it indicates the presence of a hole.The meaningof“finish”in the phrase “Billfinished the book”implies that Bill completed readingthe book.However,the phrase“the g oatfin-ished the book”can only be interpreted as the goat eatingor destroyingthe book.The examples illustrate that word meaningis often underdetermined and nec-essarily intertwined with general world knowledge.In such cases,it is hard to see how the integration of lexical meaning and general world knowledge could be strictly separated(3,31).12.W.Marslen-Wilson,C.M.Brown,L.K.Tyler,Lang.Cognit.Process.3,1(1988).13.ERPs for30subjects were averaged time-locked to theonset of the critical words,with40items per condition.Sentences were presented word by word on the centerof a computer screen,with a stimulus onset asynchronyof600ms.While 
subjects were readingthe sentences,their EEG was recorded and amplified with a high-cut-off frequency of70Hz,a time constant of8s,and asamplingfrequency of200Hz.14.Materials and methods are available as supportingmaterial on Science Online.15.M.Kutas,S.A.Hillyard,Science207,203(1980).16.C.Brown,P.Hagoort,J.Cognit.Neurosci.5,34(1993).17.C.M.Brown,P.Hagoort,in Architectures and Mech-anisms for Language Processing,M.W.Crocker,M.Pickering,C.Clifton Jr.,Eds.(Cambridge Univ.Press,Cambridge,1999),pp.213–237.18.F.Varela et al.,Nature Rev.Neurosci.2,229(2001).19.We obtained TFRs of the single-trial EEG data by con-volvingcomplex Morlet wavelets with the EEG data andcomputingthe squared norm for the result of theconvolution.We used wavelets with a7-cycle width,with frequencies ranging from1to70Hz,in1-Hz steps.Power values thus obtained were expressed as a per-centage change relative to the power in a baselineinterval,which was taken from150to0ms before theonset of the critical word.This was done in order tonormalize for individual differences in EEG power anddifferences in baseline power between different fre-quency bands.Two relevant time-frequency compo-nents were identified:(i)a theta component,rangingfrom4to7Hz and from300to800ms after wordonset,and(ii)a gamma component,ranging from35to45Hz and from400to600ms after word onset.20.C.Tallon-Baudry,O.Bertrand,Trends Cognit.Sci.3,151(1999).tner et al.,Nature397,434(1999).22.M.Bastiaansen,P.Hagoort,Cortex39(2003).23.O.Jensen,C.D.Tesche,Eur.J.Neurosci.15,1395(2002).24.Whole brain T2*-weighted echo planar imaging bloodoxygen level–dependent(EPI-BOLD)fMRI data wereacquired with a Siemens Sonata1.5-T magnetic reso-nance scanner with interleaved slice ordering,a volumerepetition time of2.48s,an echo time of40ms,a90°flip angle,31horizontal slices,a64ϫ64slice matrix,and isotropic voxel size of3.5ϫ3.5ϫ3.5mm.For thestructural magnetic resonance image,we used a high-resolution(isotropic voxels of1mm3)T1-weightedmagnetization-prepared 
rapid gradient-echo pulse sequence. The fMRI data were preprocessed and analyzed by statistical parametric mapping with SPM99 software (http://www.fi/spm99).
25. S. E. Petersen et al., Nature 331, 585 (1988).
26. B. T. Gold, R. L. Buckner, Neuron 35, 803 (2002).
27. E. Halgren et al., J. Psychophysiol. 88, 1 (1994).
28. E. Halgren et al., Neuroimage 17, 1101 (2002).
29. M. K. Tanenhaus et al., Science 268, 1632 (1995).
30. J. J. A. van Berkum et al., J. Cognit. Neurosci. 11, 657 (1999).
31. P. A. M. Seuren, Discourse Semantics (Basil Blackwell, Oxford, 1985).
32. We thank P. Indefrey, P. Fries, P. A. M. Seuren, and M. van Turennout for helpful discussions. Supported by the Netherlands Organization for Scientific Research, grant no. 400-56-384 (P.H.).

Supporting Online Material
/cgi/content/full/1095455/DC1
Materials and Methods
Fig. S1
References and Notes

8 January 2004; accepted 9 March 2004
Published online 18 March 2004; 10.1126/science.1095455
Include this information when citing this paper.

Complete Genome Sequence of the Apicomplexan, Cryptosporidium parvum

Mitchell S. Abrahamsen,1,2*† Thomas J. Templeton,3† Shinichiro Enomoto,1 Juan E. Abrahante,1 Guan Zhu,4 Cheryl A. Lancto,1 Mingqi Deng,1 Chang Liu,1‡ Giovanni Widmer,5 Saul Tzipori,5 Gregory A. Buck,6 Ping Xu,6 Alan T. Bankier,7 Paul H. Dear,7 Bernard A. Konfortov,7 Helen F. Spriggs,7 Lakshminarayan Iyer,8 Vivek Anantharaman,8 L. Aravind,8 Vivek Kapur2,9

The apicomplexan Cryptosporidium parvum is an intestinal parasite that affects healthy humans and animals, and causes an unrelenting infection in immunocompromised individuals such as AIDS patients. We report the complete genome sequence of C. parvum, type II isolate. Genome analysis identifies extremely streamlined metabolic pathways and a reliance on the host for nutrients. In contrast to Plasmodium and Toxoplasma, the parasite lacks an apicoplast and its genome, and possesses a degenerate mitochondrion that has lost its genome. Several novel classes of cell-surface and secreted proteins with a potential role in host interactions and pathogenesis were also detected. Elucidation of the core
metabolism, including enzymes with high similarities to bacterial and plant counterparts, opens new avenues for drug development.

1 Department of Veterinary and Biomedical Science, College of Veterinary Medicine, 2 Biomedical Genomics Center, University of Minnesota, St. Paul, MN 55108, USA. 3 Department of Microbiology and Immunology, Weill Medical College and Program in Immunology, Weill Graduate School of Medical Sciences of Cornell University, New York, NY 10021, USA. 4 Department of Veterinary Pathobiology, College of Veterinary Medicine, Texas A&M University, College Station, TX 77843, USA. 5 Division of Infectious Diseases, Tufts University School of Veterinary Medicine, North Grafton, MA 01536, USA. 6 Center for the Study of Biological Complexity and Department of Microbiology and Immunology, Virginia Commonwealth University, Richmond, VA 23198, USA. 7 MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK. 8 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA. 9 Department of Microbiology, University of Minnesota, Minneapolis, MN 55455, USA.
*To whom correspondence should be addressed. E-mail: abe@
†These authors contributed equally to this work.
‡Present address: Bioinformatics Division, Genetic Research, GlaxoSmithKline Pharmaceuticals, 5 Moore Drive, Research Triangle Park, NC 27009, USA.

REPORTS. SCIENCE, VOL. 304, 16 APRIL 2004, p. 441

Cryptosporidium parvum is a globally important intracellular pathogen of humans and animals. The duration of infection and pathogenesis of cryptosporidiosis depends on host immune status, ranging from a severe but self-limiting diarrhea in immunocompetent individuals to a life-threatening, prolonged infection in immunocompromised patients. A substantial degree of morbidity and mortality is associated with infections in AIDS patients. Despite intensive efforts over the past 20 years, there is currently no effective therapy for treating or preventing C. parvum infection in humans.

Cryptosporidium belongs to the phylum Apicomplexa, whose members share a common apical secretory apparatus mediating locomotion and tissue or cellular invasion. Many apicomplexans are of medical or veterinary importance, including Plasmodium, Babesia, Toxoplasma, Neospora, Sarcocystis, Cyclospora, and Eimeria. The life cycle of C. parvum is similar to that of other cyst-forming apicomplexans (e.g., Eimeria and Toxoplasma), resulting in the formation of oocysts that are shed in the feces of infected hosts. C. parvum oocysts are highly resistant to environmental stresses, including chlorine treatment of community water supplies; hence, the parasite is an important water- and food-borne pathogen (1).

The obligate intracellular nature of the parasite’s life cycle and the inability to culture the parasite continuously in vitro greatly impair researchers’ ability to obtain purified samples of the different developmental stages. The parasite cannot be genetically manipulated, and transformation methodologies are currently unavailable. To begin to address these limitations, we have obtained the complete C. parvum genome sequence and its predicted protein complement. (This whole-genome shotgun project has been deposited at DDBJ/EMBL/GenBank under the project accession AAEE00000000. The version described in this paper is the first version, AAEE01000000.)

The random shotgun approach was used to obtain the complete DNA sequence (2) of the Iowa “type II” isolate of C. parvum. This isolate readily transmits disease among numerous mammals, including humans. The resulting genome sequence has roughly 13× genome coverage containing five gaps and 9.1 Mb of total DNA sequence within eight chromosomes. The C. parvum genome is thus quite compact relative to the 23-Mb, 14-chromosome genome of Plasmodium falciparum (3); this size difference is predominantly the result of shorter intergenic regions, fewer introns, and a smaller number of genes (Table
1). Comparison of the assembled sequence of chromosome VI to that of the recently published sequence of chromosome VI (4) revealed that our assembly contains an additional 160 kb of sequence and a single gap versus two, with the common sequences displaying a 99.993% sequence identity (2).

The relative paucity of introns greatly simplified gene predictions and facilitated annotation (2) of predicted open reading frames (ORFs). These analyses provided an estimate of 3807 protein-encoding genes for the C. parvum genome, far fewer than the estimated 5300 genes predicted for the Plasmodium genome (3). This difference is primarily due to the absence of an apicoplast and mitochondrial genome, as well as the presence of fewer genes encoding metabolic functions and variant surface proteins, such as the P. falciparum var and rifin molecules (Table 2). An analysis of the encoded protein sequences with the program SEG (5) shows that these protein-encoding genes are not enriched in low-complexity sequences (34%) to the extent observed in the proteins from Plasmodium (70%).

Our sequence analysis indicates that Cryptosporidium, unlike Plasmodium and Toxoplasma, lacks both mitochondrion and apicoplast genomes. The overall completeness of the genome sequence, together with the fact that similar DNA extraction procedures used to isolate total genomic DNA from C. parvum efficiently yielded mitochondrion and apicoplast genomes from Eimeria sp. and Toxoplasma (6, 7), indicates that the absence of organellar genomes was unlikely to have been the result of methodological error. These conclusions are consistent with the absence of nuclear genes for the DNA replication and translation machinery characteristic of mitochondria and apicoplasts, and with the lack of mitochondrial or apicoplast targeting signals for tRNA synthetases.

A number of putative mitochondrial proteins were identified, including components of a mitochondrial protein import apparatus, chaperones, uncoupling proteins, and solute
translocators (table S1). However, the genome does not encode any Krebs cycle enzymes, nor the components constituting the mitochondrial complexes I to IV; this finding indicates that the parasite does not rely on complete oxidation and respiratory chains for synthesizing adenosine triphosphate (ATP). As in Plasmodium, no orthologs for the γ, δ, or ε subunits or the c subunit of the F0 proton channel were detected (whereas all subunits were found for a V-type ATPase). Cryptosporidium, like Eimeria (8) and Plasmodium, possesses a pyridine nucleotide transhydrogenase integral membrane protein that may couple reduced nicotinamide adenine dinucleotide (NADH) and reduced nicotinamide adenine dinucleotide phosphate (NADPH) redox to proton translocation across the inner mitochondrial membrane. Unlike Plasmodium, the parasite has two copies of the pyridine nucleotide transhydrogenase gene. Also present is a likely mitochondrial membrane–associated, cyanide-resistant alternative oxidase (AOX) that catalyzes the reduction of molecular oxygen by ubiquinol to produce H2O, but not superoxide or H2O2. Several genes were identified as involved in biogenesis of iron-sulfur [Fe-S] complexes with potential mitochondrial targeting signals (e.g., nifS, nifU, frataxin, and ferredoxin), supporting the presence of a limited electron flux in the mitochondrial remnant (table S2).

Table 1. General features of the C. parvum genome and comparison with other single-celled eukaryotes. Values are derived from respective genome project summaries (3, 26–28). ND, not determined.

Feature                                    C. parvum  P. falciparum  S. pombe  S. cerevisiae  E. cuniculi
Size (Mbp)                                 9.1        22.9           12.5      12.5           2.5
(G + C) content (%)                        30         19.4           36        38.3           47
No. of genes                               3807       5268           4929      5770           1997
Mean gene length (bp), excluding introns   1795       2283           1426      1424           ND
Gene density (bp per gene)                 2382       4338           2528      2088           1256
Percent coding                             75.3       52.6           57.5      70.5           90
Genes with introns (%)                     5          53.9           43        5              ND
Intergenic regions
  (G + C) content (%)                      23.9       13.6           32.4      35.1           45
  Mean length (bp)                         566        1694           952       515            129
RNAs
  No. of tRNA genes                        45         43             174       299            44
  No. of 5S rRNA genes                     6          3              30        100–200        3
  No. of 5.8S, 18S, and 28S rRNA units     5          7              200–400   100–200        22

Table 2. Comparison between predicted C. parvum and P. falciparum proteins.

Feature                           C. parvum     P. falciparum*  Common†
Total predicted proteins          3807          5268            1883
Mitochondrial targeted/encoded    17 (0.45%)    246 (4.7%)      15
Apicoplast targeted/encoded       0             581 (11.0%)     0
var/rif/stevor‡                   0             236 (4.5%)      0
Annotated as protease§            50 (1.3%)     31 (0.59%)      27
Annotated as transporter||        69 (1.8%)     34 (0.65%)      34
Assigned EC function¶             167 (4.4%)    389 (7.4%)      113
Hypothetical proteins             925 (24.3%)   3208 (60.9%)    126

*Values indicated for P. falciparum are as reported (3) with the exception of those for proteins annotated as protease or transporter.
†TBLASTN hits (e < –5) between C. parvum and P. falciparum.
‡As reported in (3).
§Predicted proteins annotated as “protease or peptidase” for C. parvum (CryptoGenome database) and P. falciparum (PlasmoDB database).
||Predicted proteins annotated as “transporter, permease of P-type ATPase” for C. parvum (CryptoGenome) and P. falciparum (PlasmoDB).
¶Bidirectional BLAST hit (e < –15) to orthologs with assigned Enzyme Commission (EC) numbers. Does not include EC assignment numbers for protein kinases or protein phosphatases (due to inconsistent annotation across genomes), or DNA polymerases or RNA polymerases, as a result of issues related to subunit inclusion. (For consistency, 46 proteins were excluded from the reported P. falciparum values.)

Our sequence analysis confirms the absence of a plastid genome (7) and, additionally, the loss of plastid-associated metabolic pathways including the type II fatty acid synthases (FASs) and isoprenoid synthetic enzymes that are otherwise localized to the plastid in other apicomplexans. C. parvum fatty acid biosynthesis appears to be cytoplasmic, conducted by a large (8252 amino acids) modular type I FAS (9) and possibly by another large enzyme that is related to the multidomain bacterial polyketide
synthase (10). Comprehensive screening of the C. parvum genome sequence also did not detect orthologs of Plasmodium nuclear-encoded genes that contain apicoplast-targeting and transit sequences (11).

C. parvum metabolism is greatly streamlined relative to that of Plasmodium, and in certain ways it is reminiscent of that of another obligate eukaryotic parasite, the microsporidian Encephalitozoon. The degeneration of the mitochondrion and associated metabolic capabilities suggests that the parasite largely relies on glycolysis for energy production. The parasite is capable of uptake and catabolism of monosugars (e.g., glucose and fructose) as well as synthesis, storage, and catabolism of polysaccharides such as trehalose and amylopectin. Like many anaerobic organisms, it economizes ATP through the use of pyrophosphate-dependent phosphofructokinases. The conversion of pyruvate to acetyl–coenzyme A (CoA) is catalyzed by an atypical pyruvate-NADPH oxidoreductase (CpPNO) that contains an N-terminal pyruvate–ferredoxin oxidoreductase (PFO) domain fused with a C-terminal NADPH–cytochrome P450 reductase domain (CPR). Such a PFO-CPR fusion has previously been observed only in the euglenozoan protist Euglena gracilis (12). Acetyl-CoA can be converted to malonyl-CoA, an important precursor for fatty acid and polyketide biosynthesis. Glycolysis leads to several possible organic end products, including lactate, acetate, and ethanol. The production of acetate from acetyl-CoA may be economically beneficial to the parasite via coupling with ATP production.

Ethanol is potentially produced via two independent pathways: (i) from the combination of pyruvate decarboxylase and alcohol dehydrogenase, or (ii) from acetyl-CoA by means of a bifunctional dehydrogenase (adhE) with acetaldehyde and alcohol dehydrogenase activities; adhE first converts acetyl-CoA to acetaldehyde and then reduces the latter to ethanol.
AdhE predominantly occurs in bacteria but has recently been identified in several protozoans, including vertebrate gut parasites such as Entamoeba and Giardia (13, 14). Adjacent to the adhE gene resides a second gene encoding only the AdhE C-terminal Fe-dependent alcohol dehydrogenase domain. This gene product may form a multisubunit complex with AdhE, or it may function as an alternative alcohol dehydrogenase that is specific to certain growth conditions. C. parvum has a glycerol 3-phosphate dehydrogenase similar to those of plants, fungi, and the kinetoplastid Trypanosoma, but (unlike trypanosomes) the parasite lacks an ortholog of glycerol kinase and thus this pathway does not yield glycerol production. In addition to the modular fatty acid synthase (CpFAS1) and polyketide synthase homolog (CpPKS1), C. parvum possesses several fatty acyl–CoA synthases and a fatty acyl elongase that may participate in fatty acid metabolism. Further, enzymes for the metabolism of complex lipids (e.g., glycerolipid and inositol phosphate) were identified in the genome. Fatty acids are apparently not an energy source, because enzymes of the fatty acid oxidative pathway are absent, with the exception of a 3-hydroxyacyl-CoA dehydrogenase.

C. parvum purine metabolism is greatly simplified, retaining only an adenosine kinase and enzymes catalyzing conversions of adenosine 5′-monophosphate (AMP) to inosine, xanthosine, and guanosine 5′-monophosphates (IMP, XMP, and GMP). Among these enzymes, IMP dehydrogenase (IMPDH) is phylogenetically related to ε-proteobacterial IMPDH and is strikingly different from its counterparts in both the host and other apicomplexans (15). Unlike in other apicomplexans such as Toxoplasma gondii and P. falciparum, no gene encoding hypoxanthine-xanthine-guanine phosphoribosyltransferase (HXGPRT) was detected, contrary to a previous report on the activity of this enzyme in C. parvum sporozoites (16). The absence of HXGPRT suggests that the parasite may rely solely on a single enzyme system including IMPDH
to produce GMP from AMP. In contrast to other apicomplexans, the parasite appears to rely on adenosine for purine salvage, a model supported by the identification of an adenosine transporter. Unlike other apicomplexans and many parasitic protists that can synthesize pyrimidines de novo, C. parvum relies on pyrimidine salvage and retains the ability for interconversions among uridine and cytidine 5′-monophosphates (UMP and CMP), their deoxy forms (dUMP and dCMP), and dAMP, as well as their corresponding di- and triphosphonucleotides. The parasite has also largely shed the ability to synthesize amino acids de novo, although it retains the ability to convert select amino acids, and instead appears to rely on amino acid uptake from the host by means of a set of at least 11 amino acid transporters (table S2).

Most of the Cryptosporidium core processes involved in DNA replication, repair, transcription, and translation conform to the basic eukaryotic blueprint (2). The transcriptional apparatus resembles that of Plasmodium in terms of basal transcription machinery. However, a striking numerical difference is seen in the complements of two RNA binding domains, Sm and RRM, between P. falciparum (17 and 71 domains, respectively) and C. parvum (9 and 51 domains). This reduction results in part from the loss of conserved proteins belonging to the spliceosomal machinery, including all genes encoding Sm domain proteins belonging to the U6 spliceosomal particle, which suggests that this particle activity is degenerate or entirely lost. This reduction in spliceosomal machinery is consistent with the reduced number of predicted introns in Cryptosporidium (5%) relative to Plasmodium (>50%). In addition, key components of the small RNA–mediated posttranscriptional gene silencing system are missing, such as the RNA-dependent RNA polymerase, Argonaute, and Dicer orthologs; hence, RNA interference–related technologies are unlikely to be of much value in targeted disruption of genes in C. parvum.

Cryptosporidium invasion of columnar brush border epithelial cells has
been described as “intracellular, but extracytoplasmic,” as the parasite resides on the surface of the intestinal epithelium but lies underneath the host cell membrane. This niche may allow the parasite to evade immune surveillance but take advantage of solute transport across the host microvillus membrane or the extensively convoluted parasitophorous vacuole. Indeed, Cryptosporidium has numerous genes (table S2) encoding families of putative sugar transporters (up to 9 genes) and amino acid transporters (11 genes). This is in stark contrast to Plasmodium, which has fewer sugar transporters and only one putative amino acid transporter (GenBank identification number 23612372).

As a first step toward identification of multi–drug-resistant pumps, the genome sequence was analyzed for all occurrences of genes encoding multitransmembrane proteins. Notable is a set of four paralogous proteins that belong to the sbmA family (table S2), whose members are involved in the transport of peptide antibiotics in bacteria. A putative ortholog of the Plasmodium chloroquine resistance–linked gene PfCRT (17) was also identified, although the parasite does not possess a food vacuole like the one seen in Plasmodium.

Unlike Plasmodium, C. parvum does not possess extensive subtelomeric clusters of antigenically variant proteins (exemplified by the large families of var and rif/stevor genes) that are involved in immune evasion. In contrast, more than 20 genes were identified that encode mucin-like proteins (18, 19) having hallmarks of extensive Thr or Ser stretches suggestive of glycosylation and signal peptide sequences suggesting secretion (table S2). One notable example is an 11,700–amino acid protein with an uninterrupted stretch of 308 Thr residues (cgd3_720). Although large families of secreted proteins analogous to the Plasmodium multigene families were not found, several smaller multigene clusters were observed that encode predicted secreted proteins, with no detectable similarity to proteins from other organisms (Fig. 1, A and B). Within this group, at
least four distinct families appear to have emerged through gene expansions specific to the Cryptosporidium clade. These families (SKSR, MEDLE, WYLE, FGLN, and GGC) were named after well-conserved sequence motifs (table S2). Reverse transcription polymerase chain reaction (RT-PCR) expression analysis (20) of one cluster, a locus of seven adjacent CpLSP genes (Fig. 1B), shows coexpression during the course of in vitro development (Fig. 1C). An additional eight genes were identified that encode proteins having a periodic cysteine structure similar to the Cryptosporidium oocyst wall protein; these eight genes are similarly expressed during the onset of oocyst formation and likely participate in the formation of the coccidian rigid oocyst wall in both Cryptosporidium and Toxoplasma (21).

Whereas the extracellular proteins described above are of apparent apicomplexan or lineage-specific invention, Cryptosporidium possesses many genes encoding secreted proteins having lineage-specific multidomain architectures composed of animal- and bacterial-like extracellular adhesive domains (fig. S1).

Lineage-specific expansions were observed for several proteases (table S2), including an aspartyl protease (six genes), a subtilisin-like protease, a cryptopain-like cysteine protease (five genes), and a Plasmodium falcilysin-like (insulin degrading enzyme–like) protease (19 genes). Nine of the Cryptosporidium falcilysin genes lack the Zn-chelating “HXXEH” active site motif and are likely to be catalytically inactive copies that may have been reused for specific protein-protein interactions on the cell surface. In contrast to the Plasmodium falcilysin, the Cryptosporidium genes possess signal peptide sequences and are likely trafficked to a secretory pathway. The expansion of this family suggests either that the proteins have distinct cleavage specificities or that their
diversity may be related to evasion of a host immune response.

Completion of the C. parvum genome sequence has highlighted the lack of conventional drug targets currently pursued for the control and treatment of other parasitic protists. On the basis of molecular and biochemical studies and drug screening of other apicomplexans, several putative Cryptosporidium metabolic pathways or enzymes have been erroneously proposed to be potential drug targets (22), including the apicoplast and its associated metabolic pathways, the shikimate pathway, the mannitol cycle, the electron transport chain, and HXGPRT. Nonetheless, complete genome sequence analysis identifies a number of classic and novel molecular candidates for drug exploration, including numerous plant-like and bacterial-like enzymes (tables S3 and S4).

Although the C. parvum genome lacks HXGPRT, a potent drug target in other apicomplexans, it has only the single pathway dependent on IMPDH to convert AMP to GMP. The bacterial-type IMPDH may be a promising target because it differs substantially from eukaryotic enzymes (15). Because of the lack of de novo biosynthetic capacity for purines, pyrimidines, and amino acids, C. parvum relies solely on scavenging from the host via a series of transporters, which may be exploited for chemotherapy. C. parvum possesses a bacterial-type thymidine kinase, and the role of this enzyme in pyrimidine metabolism and its drug target candidacy should be pursued. The presence of an alternative oxidase, likely targeted to the remnant mitochondrion, gives promise to the study of salicylhydroxamic acid (SHAM), ascofuranone, and their analogs as inhibitors of energy metabolism in the parasite (23).

Cryptosporidium possesses at least 15 “plant-like” enzymes that are either absent in or highly divergent from those typically found in mammals (table S3). Within the glycolytic pathway, the plant-like PPi-PFK has been shown to be a potential target in other parasites including T. gondii, and PEPCL and PGI appear
to be plant-type enzymes in C. parvum. Another example is a trehalose-6-phosphate synthase/phosphatase catalyzing trehalose biosynthesis from glucose-6-phosphate and uridine diphosphate–glucose. Trehalose may serve as a sugar storage source or may function as an antidesiccant, antioxidant, or protein stability agent in oocysts, playing a role similar to that of mannitol in Eimeria oocysts (24). Orthologs of putative Eimeria mannitol synthesis enzymes were not found. However, two oxidoreductases (table S2) were identified in C. parvum, one of which belongs to the same families as the plant mannose dehydrogenases (25) and the other to the plant cinnamyl alcohol dehydrogenases. In principle, these enzymes could synthesize protective polyol compounds, and the former enzyme could use host-derived mannose to synthesize mannitol.

References and Notes
1. D. G. Korich et al., Appl. Environ. Microbiol. 56, 1423 (1990).
2. See supporting data on Science Online.
3. M. J. Gardner et al., Nature 419, 498 (2002).
4. A. T. Bankier et al., Genome Res. 13, 1787 (2003).
5. J. C. Wootton, Comput. Chem. 18, 269 (1994).

Fig. 1. (A) Schematic showing the chromosomal locations of clusters of potentially secreted proteins. Numbers of adjacent genes are indicated in parentheses. Arrows indicate direction of clusters containing unidirectional genes (encoded on the same strand); squares indicate clusters containing genes encoded on both strands. Nonparalogous genes are indicated by solid gray squares or directional triangles; SKSR (green triangles), FGLN (red triangles), and MEDLE (blue triangles) indicate three C. parvum–specific families of paralogous genes predominantly located at telomeres. Insl (yellow triangles) indicates an insulinase/falcilysin-like paralogous gene family. CpLSP (white square) indicates the location of a cluster of adjacent large secreted proteins (table S2) that are cotranscriptionally regulated. Identified anchored telomeric repeat sequences are indicated by circles. (B) Schematic showing a select locus containing a
cluster of coexpressed large secreted proteins (CpLSP). Genes and intergenic regions (regions between identified genes) are drawn to scale at the nucleotide level. The length of the intergenic regions is indicated above or below the locus. (C) Relative expression levels of CpLSP (red lines) and, as a control, C. parvum Hedgehog-type HINT domain gene (blue line) during in vitro development, as determined by semiquantitative RT-PCR using gene-specific primers corresponding to the seven adjacent genes within the CpLSP locus as shown in (B). Expression levels from three independent time-course experiments are represented as the ratio of the expression of each gene to that of C. parvum 18S rRNA present in each of the infected samples (20).
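
As a rough check on the genome compactness reported in Table 1, gene density can be recomputed from genome size and gene count. The Python sketch below is illustrative only: the helper name bp_per_gene is hypothetical, and because it starts from the rounded genome sizes in the table (9.1 and 22.9 Mbp) its results differ slightly from the published 2382 and 4338 bp per gene.

```python
def bp_per_gene(genome_size_mbp: float, n_genes: int) -> float:
    """Average genome span per gene, in base pairs (hypothetical helper)."""
    return genome_size_mbp * 1_000_000 / n_genes


# Rounded values taken from Table 1: (genome size in Mbp, number of genes).
genomes = {
    "C. parvum": (9.1, 3807),
    "P. falciparum": (22.9, 5268),
}

for name, (size_mbp, n_genes) in genomes.items():
    # The rounded inputs give roughly 2390 and 4347 bp per gene.
    print(f"{name}: about {bp_per_gene(size_mbp, n_genes):.0f} bp per gene")
```

On these inputs C. parvum carries one gene for roughly every 2.4 kb, a little more than half the span per gene in P. falciparum, which is consistent with the shorter intergenic regions and fewer introns described in the text.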
D2.3.3.v1 SemVersion – Versioning RDF and Ontologies

Max Völkel (University of Karlsruhe)
with contributions from:
Carlos F. Enguix (National University of Ireland, Galway, Ireland)
Sebastian Ryszard Kruk (DERI)
Anna V. Zhdanova (DERI)
Robert Stevens (U Manchester)
York Sure (AIFB)

Abstract.
EU-IST Network of Excellence (NoE) IST-2004-507482 KWEB
Deliverable D2.3.3.v1 (WP2.3)
This paper describes the requirements for a semantic versioning system. The design, implementation and usage of SemVersion are described.

Document Identifier: KWEB/2004/D2.3.3.a/v1.0
Project: KWEB EU-IST-2004-507482
Version: v1.0
Date: June 6th, 2005
State: final
Distribution: internal

Knowledge Web Consortium

This document is part of a research project funded by the IST Programme of the Commission of the European Communities as project number IST-2004-507482.

University of Innsbruck (UIBK) - Coordinator
Institute of Computer Science, Technikerstrasse 13, A-6020 Innsbruck, Austria
Fax: +43(0)5125079872, Phone: +43(0)5125076485/88
Contact person: Dieter Fensel
E-mail address: dieter.fensel@uibk.ac.at

École Polytechnique Fédérale de Lausanne (EPFL)
Computer Science Department, Swiss Federal Institute of Technology, IN (Ecublens), CH-1015 Lausanne, Switzerland
Fax: +41216935225, Phone: +41216932738
Contact person: Boi Faltings
E-mail address: boi.faltings@epfl.ch

France Telecom (FT)
4 Rue du Clos Courtel, 35512 Cesson Sévigné, France. PO Box 91226
Fax: +33299124098, Phone: +33299124223
Contact person: Alain Leger
E-mail address: alain.leger@

Freie Universität Berlin (FU Berlin)
Takustrasse 9, 14195 Berlin, Germany
Fax: +493083875220, Phone: +493083875223
Contact person: Robert Tolksdorf
E-mail address: tolk@inf.fu-berlin.de

Free University of Bozen-Bolzano (FUB)
Piazza Domenicani 3, 39100 Bolzano, Italy
Fax: +390471315649, Phone: +390471315642
Contact person: Enrico Franconi
E-mail address: franconi@inf.unibz.it

Institut National de Recherche en Informatique et en Automatique (INRIA)
ZIRST - 655 avenue de l’Europe - Montbonnot Saint Martin, 38334 Saint-Ismier, France
Fax: +33476615207, Phone: +33476615366
Contact person: Jérôme Euzenat
E-mail address: Jerome.Euzenat@inrialpes.fr

Centre for Research and Technology Hellas / Informatics and Telematics Institute (ITI-CERTH)
1st km Thermi-Panorama road, 57001 Thermi-Thessaloniki, Greece. Po Box 361
Fax: +30-2310-464164, Phone: +30-2310-464160
Contact person: Michael G. Strintzis
E-mail address: strintzi@iti.gr

Learning Lab Lower Saxony (L3S)
Expo Plaza 1, 30539 Hannover, Germany
Fax: +49-511-7629779, Phone: +49-511-76219711
Contact person: Wolfgang Nejdl
E-mail address: nejdl@learninglab.de

National University of Ireland Galway (NUIG)
National University of Ireland, Science and Technology Building, University Road, Galway, Ireland
Fax: +35391526388, Phone: +353876826940
Contact person: Christoph Bussler
E-mail address: chris.bussler@deri.ie

The Open University (OU)
Knowledge Media Institute, The Open University, Milton Keynes, MK7 6AA, United Kingdom
Fax: +441908653169, Phone: +441908653506
Contact person: Enrico Motta
E-mail address: e.motta@

Universidad Politécnica de Madrid (UPM)
Campus de Montegancedo sn, 28660 Boadilla del Monte, Spain
Fax: +34-913524819, Phone: +34-913367439
Contact person: Asunción Gómez Pérez
E-mail address: asun@fi.upm.es

University of Karlsruhe (UKARL)
Institut für Angewandte Informatik und Formale Beschreibungsverfahren - AIFB, Universität Karlsruhe, D-76128 Karlsruhe, Germany
Fax: +497216086580, Phone: +497216083923
Contact person: Rudi Studer
E-mail address: studer@aifb.uni-karlsruhe.de

University of Liverpool (UniLiv)
Chadwick Building, Peach Street, L69 7ZF Liverpool, United Kingdom
Fax: +44(151)7943715, Phone: +44(151)7943667
Contact person: Michael Wooldridge
E-mail address: M.J.Wooldridge@

University of Manchester (UoM)
Room 2.32, Kilburn Building, Department of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL, United Kingdom
Fax: +441612756204, Phone: +441612756248
Contact person: Carole Goble
E-mail address: carole@

University of Sheffield (USFD)
Regent Court, 211 Portobello street, S1 4DP Sheffield, United Kingdom
Fax: +441142221810, Phone: +441142221891
Contact person: Hamish Cunningham
E-mail address: hamish@

University of Trento (UniTn)
Via Sommarive 14, 38050 Trento, Italy
Fax: +390461882093, Phone: +390461881533
Contact person: Fausto Giunchiglia
E-mail address: fausto@dit.unitn.it

Vrije Universiteit Amsterdam (VUA)
De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands
Fax: +31842214294, Phone: +31204447731
Contact person: Frank van Harmelen
E-mail address: Frank.van.Harmelen@cs.vu.nl

Vrije Universiteit Brussel (VUB)
Pleinlaan 2, Building G10, 1050 Brussels, Belgium
Fax: +3226293308, Phone: +3226293308
Contact person: Robert Meersman
E-mail address: robert.meersman@vub.ac.be

Executive Summary

Change management for ontologies becomes a crucial aspect for any kind of ontology management environment, as engineering of ontologies often takes place in distributed settings where multiple independent users have to interact. There is also a variety of ontology languages used. Although RDF Schema and OWL are gaining more and more popularity, a lot of semantic data still resides in other formats, as is the case in the biology domain (cf. Sec. 1.2.3). Until now, no standard versioning system or methodology has arisen that can provide a common way to handle versioning issues.

This deliverable describes the RDF-centric versioning approach and implementation SemVersion. It provides structural (purely triple based) and semantic (ontology language based, like RDFS, OWL and OBOL) versioning. It separates language-neutral features for data management from language-specific features like semantic diffs in design and implementation. This way SemVersion offers a common approach for already widely used RDF models and a wide range of ontology languages.

The requirements for our system are derived from a set of practical scenarios, which are documented in detail in this deliverable.

The project experienced a shift in requirements when Robert Stevens from the University of Manchester joined the group in May 2005. WP2.3 decided to tackle the problem of versioning the Gene Ontology.

In [1] we suggested reification for data storage. As we now face
the large volume of the Gene Ontology data(see1.2.3),we need more powerful storage solutions than for the other use cases.Addressing triple sets(models)is another challenge.In[1] we argued to use reification,which would make models four times as large.To avoid this,we now use native quad stores,which provide a context URI for each triple. We use the context URI to address models more efficiently.A sub-project,Rdf2Go,has been created to deal with various model abstrac-tions and serves as a unifying triple(and quad)store entry point.Rdf2Go is described in Chapter2.A second sub-project of SemVersion,RdfReactor,facilitates the usage of RDF Schema based data in Java significantly.It’s latest version is based on Rdf2Go.In fact,RDFReactor has been designed for SemVersion in thefirst place.RDFReactor is described in Sec.1.5.4.Contents1SemVersion–An RDF Versioning System11.1Introduction (1)1.1.1Term Definitions (3)1.2Requirements for an ontology versioning system (3)1.2.1Use Case1:MarcOnt Collaborative Ontology Development..31.2.2Use Case2:The People’s Portal for Community OntologyDevelopment (6)1.2.3Use Case3:Versioning the Gene Ontology (7)1.2.4Use Case4:Versioning in a Semantic Wiki (10)1.2.5Use Case5:Analysis of Wikipedia (10)1.2.6Requirements Summary (11)1.3Data Management Design (12)1.3.1RDF as the structural core of ontology languages (12)1.3.2Version Data Management (13)1.4Versioning Functionality Design (14)1.4.1Structural Diff (14)1.4.2Semantic Diff (15)1.4.3Blank Nodes and the Diff (16)1.4.4Branch and Merge (17)1.4.5Conflict Detection (18)1.4.6Query Language Extension (18)1.5Implementation (18)1.5.1Storage Layer Access (19)1.5.2Handling Commits (20)1.5.3Generating globally unique URIs (20)1.5.4RDFReactor (20)2RDF2Go222.1What is RDF2Go? 
(22)2.2Working Example:Simple FOAF via RDF2Go (24)2.3Architecture (26)2.4The API (26)2.4.1Model and ContextModel (26)iiD2.3.3.v1SemVersion–Versioning RDF and Ontologies IST Project IST-2004-5074822.4.2Queries (29)2.5How to get started (30)3Using and Extending SemVersion313.1Using SemVersion (31)3.1.1Typical Actions (32)3.1.2Administration (33)3.1.3Usage and Implementation Notes (34)3.1.4SemVersion Usage Examples (34)3.2Extending SemVersion (34)4Conclusions and Outlook36 KWEB/2004/D2.3.3.a/v1.0June6th,2005iiiChapter 1SemVersion –An RDF Versioning System1.1IntroductionAs outlined in the Knowledge Web Deliverable D2.3.1”Specification of a method-ology for syntactic and semantic versioning”[1],there is a clear need for RDF data and ontology versioning.This deliverable is a follow-up of D2.3.1,which explains the underlying concepts in detail.Here we focus on the concrete approach and implementation.Change management for ontologies becomes a crucial aspect for any kind of ontology management environment,as engineering of ontologies often takes place in distributed settings where multiple independent users have to interact.There is also a variety of ontology languages used.Although RDF Schema and OWL are gaining more and more popularity,a lot of semantic data still resides in other formats,as it is the case in the biology domain (c.f.Sec. 
1.2.3).Until now,no standard versioning system or methodology has arisen,that can provide a common way to handle versioning issues.This deliverable describes the RDF-centric versioning approach and implementa-tion SemVersion 1.It provides structural (purely triple based)and semantic (ontol-ogy language based,like RDFS,OWL and OBOL)versioning.It separates language-neutral features for data management from language-specific features like semantic diffs in design and implementation.This way SemVersion offers a common approach for already widely used RDF models and a wide range of ontology languages.SemVersion is published as an open-source software project on the site OntoWare.The current version of the project homepage is depicted in Fig.1.1.1The name resembles the upcoming de-facto standard subversion ( )and is also a short form of ”Semantic Versioning”11.SEMVERSION–AN RDF VERSIONING SYSTEMFigure1.1:Homepage of the SemVersion project2June6th,2005KWEB/2004/D2.3.3.a/v1.0D2.3.3.v1SemVersion–Versioning RDF and Ontologies IST Project IST-2004-507482 Our approach is inspired by the classical CVS system for version management of textual documents(e.g.Java code).Core element of our approach is the sepa-ration of language-specific features(the semantic diff)from general features(such as structural diff,branch and merge,management of projects and metadata).A speciality of RDF is the usage of so-called blank nodes.As part of our approach we present a method for blank node enrichment which helps in versioning of such blank nodes.1.1.1Term DefinitionsRDF is a data model with the types URI,blank node,plain literal,language tagged literal and data typed literal.It consists of triples(also called state-ments).A set of triples is called model(or triple set).An ontology is a model, in which semantics have been assigned to certain URIs and/or triple constructs, according to an ontology language.We use the term concept to denote things ontologies talk about:classes,properties and 
instances.In an RDF context,every-thing that is addressable by URI or by blank node is considered a concept.SemVersion versions models.A model under version control is named a ver-sioned model.A versioned model has a root model,which is a version.A version is a model plus versioning metadata.Versions in SemVersion never change. Instead,every operation that changes the state of a versioned model(commit,merge, ...)results in the creation of a new version.More details about SemVersion’s con-ceptual data model can be found in Sec.1.3.2.1.2Requirements for an ontology versioning sys-temWe gathered different requirements from Knowledge Web partners in order to create a more general design.We tried to gather as concrete usage requirements as possible to obtain a usable(and hence testable)design and implementation.In this section we present the different usage requirements.For each use case we name the stakeholder and provide a use case description, characteristics of the data set,and derived versioning requirements.1.2.1Use Case1:MarcOnt Collaborative Ontology Devel-opmentStakeholder:Sebastian Ryszard Kruk(DERI),sebastian.kruk@KWEB/2004/D2.3.3.a/v1.0June6th,200531.SEMVERSION–AN RDF VERSIONING SYSTEMThe MarcOnt2scenario served as thefirst source of inspiration for SemVersion. MarcOnt is a project to create an ontology for library data exchange.One of the most commonly used bibliographic description format is MARC21. Though it is capable of describing most of the features of the library resources, its semantic content is low.It means that while searching for a resource,one has to look for particular keywords in the resource’s descriptionfields,but one cannot carry out a search be meaning or concept.This can often result in large sets of results.Also the data communication between library systems is very hard to extend. 
On of the earliest shared vocabularies is the Dublin Core Metadata standard for library resource description.Besides the fact that most of the information covered by MARC21is lost,the full potential of the Semantic Web is not being used.The project aims at creating the MarcOnt ontology,based on a social agreement that will combine descriptions from MARC21together with DublinCore and makes use of the full potential of the Semantic Web technologies.This will include transla-tions to/from other ontologies,more efficient searching for resources(ers may have impact on the searching process).The MarcOnt initiative is strongly connected to the Jerome Digital Library project(e-library with semantics,formerly ElvisDL)-which implements a simple library ontology and can be used as a starting point for further work.MarcOnt also assumed that JeromeDL will be a testing platform for an experimental results from the MarcOnt initiative.Data Set Currently there exists only one version of the MarcOnt ontology,which can be downloaded at /index.php?option=com_content&task=view&id=13&Itemid=27.Versioning Requirements The MarcOnt project has a clear view on the process of ontology evolution.It starts with a current main version.Now people can suggest (multiple,independent)changes.Then the community discusses about the proposed changes and selects some.The changes are applied and a new main version is created. 
The process is illustrated in Fig.1.2.The ontology builder of the MarcOnt portal requires not only a GUI for building the ontology through submitting changes.It also needs the ability to:•Manage a main trunk of the ontology(R1.1)3•Manage versions of suggestions(R1.2)•Generate snapshots of the main ontology with some suggestions applied(R1.3) 2/3Requirements are numbered by”use case number”/”.”/running number4June6th,2005KWEB/2004/D2.3.3.a/v1.0Figure1.2:Versions and suggestions in the MarcOnt use caseKWEB/2004/D2.3.3.a/v1.0June6th,20055•Detect and resolve conflicts(R1.4)•Add suggestions to the main trunk(R1.5)•Attach mapping/translation rules(R1.6)•Be able to check out arbitrary versions by HTTP GET with a specific URL (R1.7)1.2.2Use Case2:The People’s Portal for Community On-tology DevelopmentStakeholder:Anna V.Zhdanova(DERI),anna.zhdanova@deri.at People’s portal[2]is an implementation of a human-Semantic Web interactive environment.The environment is named The People’s Portal and it is implemented employing Java,Jena and Tomcat.The basic idea of the People’s portal is to marry a community Semantic Web portal technology with collaborative ontology manage-ment functionalities in order to bring the Semantic Web to masses and overcome limitations of the existing community web portals.Use cases:The People’s portal environment is applied to DERI and used to produce part of the DERI web site.DERI members can login here to enter the environment.DERI web site managers can login here to manage the data in a centralized fashion.Versioning Requirements The system uses a subset of RDF ers of the portal can introduce new classes and properties on thefly.Consensus is partly reached by usage.Properties that are often used and classes that have many instances are considered useful for the community.Hence it is necessary to ask the versioning system:•How many instance does this class have now?Last week?Generalised:How many instances does a concept(rdfs:Class or rdfs:Property)has at 
a specific point in time?(R2.1)•When has this classfirst been instantiated?(R2.2)•How many properties are attached to this class?Since when?(R2.3)number of instances of class,properties NOW(specific point in time also)•Who added this ontology item?(R2.4)•Store new versions and return diffs between arbitrary points in time.(R2.5)•Return predecessor of an ontology item(class,property)in time(R2.6)6June6th,2005KWEB/2004/D2.3.3.a/v1.0•Support the evolution primitives:”add”,”remove”and ”replace”on concept definitions.(R2.7)•Return number of changed instance items (also properties,classes)and show which items changed.(R2.8)•Which concepts appeared within a given time interval?(R2.9)•Queries across change log/activity log:For each attribute,when was it instan-tiated and when have instances been created?(R2.10)•What are hot attributes?Those instantiated or changed often recently.Which are these?(R2.11)1.2.3Use Case 3:Versioning the Gene OntologyStakeholder:Robert Stevens (U Manchester),robert.stevens@ Background An important step was the phone conference on 12.07.2005,in which common goals were identified 4.Robert Stevens from Manchester University has be-come an active member of the work package.Robert is a biologist who is also a doctor in Computer Science.Robert is a Bioinformatics Lecturer in the BioHealth Informatics Group at the University of Manchester.He has around 80publications in international conferences,workshops,journals and so on.He was involved in the TAMBIS project for transparent access and integration of biological databases.Now one of his main interests is in the definition of formal biological ontologies.He is involved in the transformation of the Gene Ontology controlled vocabulary into a description-logics OWL based ontology.He is interested in contributing to the devel-opment of an ontology-based versioning system to the Gene Ontology which is part of the Open Biological Ontologies.Also he want’s to study how conceptualisations change over 
time,hence the need for data analysis.Use case description The gene ontology 5community is where collaborative on-tology construction is practiced a long time comparing to other communities.The GO community showed that involvement of multiple parties is a must for a compre-hensive ontology as a result.The GO community is far ahead of other communities constructing ontologies [3].Hence they are the ideal subject to study real-world change operations.”The goal of the Gene Ontology (GO)consortium is to produce a controlled vocabulary that can be applied to all organisms even as knowledge of gene and 4/wiki/KnowledgeWeb/WP23/MeetingAgenda12July20055KWEB/2004/D2.3.3.a/v1.0June 6th,20057protein roles in cells is accumulating and changing.GO provides three structured networks of defined terms to describe gene product attributes.”6Current Gene Ontology versions are maintained by CVS repositories which han-dle only syntactic differences among ontologies.In other words CVS is not able to differentiate class versions for instance,being able only to differentiate text/file differences.Versioning Requirements Essentially,here SemVersion is used for data analysis.In order to study ontology change operations,SemVersion must cope with multiple versions of the Gene Ontology (GO).The GO is authored in Open Biology Language 7(OBOL),for which usable OWL exports exist.The GO has about 19.000concepts.Assuming about 10statements per concept we estimate a size of roughly 100.000statements –per version.The researchers who study the ontology change patterns (Robert Stevens and his team)would like to use a monthly snapshot for a period of 6years.This amounts to 6years ×12month =72versions.Thus the underlying triple store must be able to handle up to 7million triples and search (maybe even reason)over them.The requirements in short form are thus•Store up to 7million triples (R3.1)•Allow meta-data queries over the 72versions (R3.2)•Allow data queries over all versions (7million 
triples)(R3.3)•OBOL semantic diff(R3.4)•OBOL to RDF converter (R3.5)•A Java interface (R3.6)Data Set The Gene Ontology ”per se”is not an Ontology in the formal sense,it is rather a cross-species controlled biological vocabulary as previously indicated above.The Gene Ontology is divided in three disjoint sub-ontologies,currently stored in big flat files or also stored in persistent repositories such as a relational database (MySQL database).The three sub-ontologies are divided into vocabularies that describe gene products in terms of:Molecular functions,associated biological processes and cellular components.The GO ontology permits to associate biological relationships among molecular functions,the involvement of molecular functions in biological processes and the 6Extracted from the OBO site /7/8June 6th,2005KWEB/2004/D2.3.3.a/v1.0occurrence of biological processes at a given time and space in cells [4].Whereas the molecular function defines what a gene product does at the biochemical level,the bi-ological process normally indicates a transformation process triggered or contributed by a gene product involving multiple molecular functions.Finally the cellular com-ponent indicates the cell structure a gene product is part of.The Gene Ontology contains around 20.000concepts which are convertible to OWL.The latest statistics about the GO could be found at the GO site 8:Current term counts (as of June 20,2005at 6:00Pacific time):•17946terms,94.2%with definitions.•6984(38.9%)Molecular functions•9410(52.4%)Biological processes•1552(8.6%)Cellular components•There are 998obsolete terms not included in the above statistics(Total Terms=18944)Further complexity assessments can be found at /~cjm/obol/doc/go-complexity.html .According to [5]the GO is a handcrafted ontology accepting only ”is-a”and ”part-of”relationships.The hierarchical organization is represented via a directed-acyclic-graph (DAG)structure similar to the representation of Web pages or hypertext systems.Members 
of the Consortium group contribute to updates and revisions of the GO.The Go is maintained by editors and scientific curators who notify GO users of ontology changes via email,or at the GO site by monthly reports 9.Please note that ontology creation and annotation of GO terms in databases (association of GO terms with gene products)are two different operations.Each annotation should include its data provenance or source(a cross database reference,a literature reference,etc).Technically,there are two different data sets,available via public CVS stores.Set I ranges from 1999to 2001and has a snapshot of the GO for each month in GO syntax.The second set runs from 2001up to now and contains for each month a Go snapshot in OBO syntax.As OBO is the newer syntax,we assume the existence of a converter from GO syntax to OBO syntax available from the GO community.In order to use the data sets,one has to decide for a format.There are three options:(a)RDF,(b)OWL generated from DAG-Edit 10or (c)nice OWL generated by Prot´e g´e -Plugin.Whatever choice is made,the exported data should contain the provenance8/GO.downloads.shtml#ont9/MonthlyReports/10/dev/java/dagedit/docs/index.html KWEB/2004/D2.3.3.a/v1.0June 6th,20059information of the source file and the conversion process used.SemVersion offers ways to store such provenance information.1.2.4Use Case 4:Versioning in a Semantic WikiStakeholder:Max V ¨o lkel (U Karl),mvo@aifb.uni-karlsruhe.deA wiki is a browser-based environment to author networked,structured notes,often in a collaborative way.The project SemWiki 11aims at creating a semantic wiki for personal note management.SemWiki extends the wiki syntax with means to enter statements about resources,much like in RDF.In a traditional wiki,users are accustomed to see and compare different versions of a page.In the semantic wiki ”SemWiki”12pages are just a special kind of resource and some attached properties.Hence,a semantic diffhas to be calculated ”by hand”.Data Set A typical 
personal wiki has up to 3000pages with approximately 10versions per page.Each page consists roughly of 50statements.This leads to approximately 1.5million triples for a snapshot-based versioning system.Versioning Requirements SemWiki users need ways to request a semantic diffbetween two page-versions.As pages partly consist of ”background statements”,which do not belong to a particular page,SemWiki needs a model-based versioning approach (R4.1).Sometimes users want to roll-back page changes,thus we need the ability to revert to old states (R4.2).Additionally,users want to track each statement:Who authored it,when has it been introduced,etc.(R4.3).1.2.5Use Case 5:Analysis of WikipediaStakeholder:Denny Vrandecic,Markus Kr ¨o tzsch,Max V ¨o lkel (U Karl){dvr,mkr,mvo}@aifb.uni-karlsruhe.deAn emerging research topic at AIFB is the analysis of changes in the Wikipedia 13.This use case is mostly similar to ”Versioning the Gene Ontology”.Data Set The Wikipedia contains roughly 1.500.000articles across all language versions.11 121310June 6th,2005KWEB/2004/D2.3.3.a/v1.0Versioning Requirements There are no obvious requirements beyond those al-ready mentioned in use case 3.1.2.6Requirements SummaryWe can distinguish rather data management related requirements and rather ontol-ogy language specific features.Data Management Requirements•Store and retrieve versions;store up to 7million triples•Retrieve versions via HTTP or Java function calls;address versions unambigu-ously via URIs and user-friendly via labels•Rich meta data per model /statement:provenance,author,valid time,transaction time•Model based versioning and additionally concept-oriented queries•Queries across versions concerning meta data•Each version can have a number of attached ”suggestions”;ability turn sug-gestions into official versionsOntology Language Requirements•Queries across versions concerning the content•return diffs between arbitrary versions•OBOL semantic diff•OBOL to RDF converter•RDFS semantic 
diff•OWL semantic diff•Semantic Wiki semantic diff•Conflict detection in OWLKWEB/2004/D2.3.3.a/v1.0June 6th,2005111.3Data Management DesignA versioning system has generally two main parts.One deals with general data management issues,the other part with versioning specific functionality such as cal-culating the difference between two versions.Wefirst present the data management parts and then the ontology specific versioning functions.The data management parts can be used no matter which ontology language is used–as long as the data model is encoded as RDF.RDF encoding of data is crucial in order to have a significant re-use of software across ontology languages.We now present some arguments for this claim.A more detailed discussion can be found in the Knowledge Web Deliverable D2.3.1[1].1.3.1RDF as the structural core of ontology languagesThe most elementary modelling primitive that is needed to model a shared con-ceptualisation of some domain is a way to denote entities and to unambiguously reference them.For this purpose RDF uses URIs,identifiers for resources,that are supposed to be globally unique.Every ontology language needs to provide means to denote entities.For global systems the identifier should be globally unique.Hav-ing entities,that can be referenced,the next step is to describe relations between them.As relations are semantic core elements,they should also be unambiguously addressable.Properties in RDF can be seen as binary relations.This is the very basic type of relations between two entities.More complex types of relations can be modelled by defining a special vocabulary for this purpose on top of RDF,like it has been done in OWL.The two core elements for semantic modelling,mechanisms to identify entities and to identify and state relationships between them,are provided by RDF.Ontol-ogy languages that build upon RDF use these mechanisms and define the semantics of certain relationships,entities,and combinations of relationships and entities.So 
RDF provides the structure in which the semantic primitives of the ontology lan-guages are embedded.That means we can distinguish three layers here:syntactic layer(e.g.XML),structural layer(RDF),semantic layer(ontology languages).The various ontology languages differ in their vocabulary,their logical founda-tions,and epistemological elements,but they have in common that they describe structures of entities and their relations.Therefore RDF is the largest common de-nominator of all ontology languages.RDF is not only a way to encode the ontology languages or just an arbitrary data model,but it is a structured data model that matches exactly the structure of ontology languages.12June6th,2005KWEB/2004/D2.3.3.a/v1.0。
Applied ontologies in the knowledge management of construction processes
Marco Masera, Università degli Studi di Firenze, Firenze, Italy (email: marco.masera@unifi.it)
Chiara Cirinnà, Università degli Studi di Firenze, Firenze, Italy (email: chiara.cirinnà@taed.unifi.it)
Saverio Mecca, Università degli Studi di Firenze, Firenze, Italy (email: saverio.mecca@unifi.it)
Valeriano Sandrucci, Università degli Studi di Firenze, Firenze, Italy (email: vsand74@hotmail.com)
Enrico Vicario, Università degli Studi di Firenze, Firenze, Italy (email: vicario@dsi.unifi.it)
Abstract
Integrating technological knowledge in construction projects is time consuming and requires intensive negotiation and information flows aimed at achieving the required design and planning performance. The traditional taxonomic approach to project management, based on task decomposition, could usefully be extended with an ontological approach to task description. Ontological analysis makes it possible to build and test knowledge bases that support communication and management processes. The strategic objective identified here is the development of ontology-supported knowledge bases that analyse domain knowledge, make explicit the domain assumptions held in tacit or informal knowledge, and enable the reuse of domain knowledge, which is one of the core objectives of ontology research. The goal is to reduce the knowledge-acquisition bottleneck in the management process by making knowledge exchange more efficient when knowledge bases are structured. Current ontology-editing environments offer a range of general-purpose tools. The ongoing research is expected to provide an integrated approach to knowledge-base elicitation in construction management, oriented towards the development and implementation of a cooperative knowledge management system.
Keywords: Ontology, Knowledge Base, Construction Management
1. Introduction
Knowledge management concerns the complex process of producing and using knowledge. Conceptualising, communicating, using, reusing, sharing, managing and extending distributed knowledge: each of these activities involves a complex network of interactions among people, machines and the processes that link them. To meet the requirements of these processes, the elements of communication must be explicit, accessible and cognitively adequate [1]. These requirements apply mainly to the concepts, the tools and the vocabulary in use. Research on ontologies addresses this field.
1.1 Ontologies in Knowledge Management
As distinct from Ontology understood as a philosophical discipline, the term ontology is used in knowledge management and information technology to describe a theoretical or computational artefact [2]. T. R. Gruber defined an ontology as a “specification of a conceptualisation” [3]. An ontology is specified and formalised through artefacts that represent the intended meaning in terms of the nature and structure of the entities concerned. In other words, an ontology is an attempt to elaborate a conceptual schema that is exhaustive and rigorous within a specific domain. An ontology is generally expressed as a hierarchical data structure containing all the relevant entities, their explicit relations, and the rules, axioms and constraints specific to the domain [4].
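The description of an ontology as a hierarchy of entities with explicit relations and domain constraints can be illustrated with a minimal Python sketch. It is not taken from the paper: the class `Ontology`, its methods, and the construction terms (`Task`, `Resource`, `Masonry`, `Crane`, `requires`) are hypothetical names chosen for illustration only, and the single constraint type shown (a domain and range restriction on a relation) is an assumed simplification of the rules and axioms a real ontology language would offer.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Toy ontology (illustrative only): is-a hierarchy, relations, constraints."""
    parents: dict = field(default_factory=dict)      # entity -> parent entity (is-a)
    relations: set = field(default_factory=set)      # (subject, predicate, object)
    constraints: list = field(default_factory=list)  # (predicate, domain, range)

    def add_entity(self, name, parent=None):
        self.parents[name] = parent

    def is_a(self, name, ancestor):
        """Walk up the hierarchy to test whether name is subsumed by ancestor."""
        while name is not None:
            if name == ancestor:
                return True
            name = self.parents.get(name)
        return False

    def add_constraint(self, predicate, domain, range_):
        self.constraints.append((predicate, domain, range_))

    def relate(self, subject, predicate, obj):
        """Record a relation only if it satisfies every constraint on the predicate."""
        for pred, domain, range_ in self.constraints:
            if pred == predicate and not (
                self.is_a(subject, domain) and self.is_a(obj, range_)
            ):
                raise ValueError(f"{subject} {predicate} {obj} violates a constraint")
        self.relations.add((subject, predicate, obj))


# Hypothetical construction-domain terms, assumed for the example.
onto = Ontology()
onto.add_entity("Entity")
onto.add_entity("Task", "Entity")
onto.add_entity("Resource", "Entity")
onto.add_entity("Masonry", "Task")
onto.add_entity("Crane", "Resource")
onto.add_constraint("requires", "Task", "Resource")
onto.relate("Masonry", "requires", "Crane")
```

In this sketch a statement such as "Crane requires Masonry" would be rejected, because the assumed constraint only allows a task to require a resource. This is one simple way in which an explicit schema can make a domain assumption checkable.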
In general, and in the ontological study of a technological domain in particular, both foundation ontologies and computational ontologies need to be developed. A foundation ontology is usually associated with the development of a basic glossary. Unlike a glossary, however, it uses taxonomic breakdown structures subdivided into levels, and the elements it defines form the basis for describing every other element of the domain.
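The difference between a glossary and a taxonomy subdivided into levels can be sketched in a few lines of Python. The terms and definitions below are hypothetical and are not drawn from the paper. The sketch assumes a strict tree in which every term has at most one more general parent, which is a simplification.

```python
# A flat glossary only maps terms to definitions (hypothetical entries).
glossary = {
    "Building element": "Physical part of a building",
    "Wall": "Vertical building element",
    "Masonry wall": "Wall built from bricks or blocks",
}

# A taxonomy also records the more general term of each entry,
# which is what gives the structure its levels.
parent_of = {
    "Building element": None,
    "Wall": "Building element",
    "Masonry wall": "Wall",
}


def level(term):
    """Return the depth of a term: 0 for a root, one more than its parent otherwise."""
    depth = 0
    while parent_of[term] is not None:
        term = parent_of[term]
        depth += 1
    return depth


def describe(term):
    """Describe a term through its chain of more general terms, up to the root."""
    chain = [term]
    while parent_of[chain[-1]] is not None:
        chain.append(parent_of[chain[-1]])
    return " -> ".join(chain)


levels = {term: level(term) for term in parent_of}
```

Here `describe("Masonry wall")` expresses the term through the more basic elements above it, which is the sense in which the defined elements serve as a base for describing the rest of the domain.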
A foundation ontology serves as a basic ontology that supports both users and programs and shapes their view of data and events; foundation ontologies are applied, for example, to artificial languages. Every computer program rests on a foundation ontology, constituted by the instruction set of the processor, the libraries of the language, the files present in a file system, or some other list of “existing things”. Building the representation of a knowledge domain on an insufficient basis can lead to incorrect results. Basic ontologies therefore need to be developed and consolidated as the foundation of the task.