Software architecture of PSET: A page segmentation evaluation toolkit


Ten Clear Short Answers

1. What is a computer?
A computer is a device that can store, retrieve, and process data.

2. What is the internet?
The internet is a global network of interconnected computers that allows users to access, exchange, and share data.

3. What is a computer virus?
A computer virus is a malicious program designed to spread by copying itself to other computers or by exploiting computer vulnerabilities.

4. What is hardware?
Hardware refers to the physical components of a computer system, including the processor, memory, power supply, motherboard, and other components.

5. What is software?
Software is a set of instructions that a computer uses to perform specific tasks.

6. What is a database?
A database is a collection of data or information that is organized and stored so that it can be easily accessed and manipulated.

7. What is system software?
System software is a set of programs that control the operation of a computer and manage its resources.

8. What is an operating system?
An operating system is a set of software that manages the resources of a computer system, such as its memory and processor, and allows applications and users to interact with the computer.

9. What is a program?
A program is a set of instructions that a computer can execute to carry out specific tasks.

10. What is a compiler?
A compiler is a program that converts a program written in a programming language into instructions that a computer can read and execute.

Grade 11 English: 50 Multiple-Choice Questions on Software Development

1. The main language used in software development is _____.
A. Python
B. Java
C. C++
D. All of the above
Answer: D.
Python, Java, and C++ are all commonly used programming languages in software development, so the answer is "All of the above".

2. Which one is not a software development tool?
A. Visual Studio
B. IntelliJ IDEA
C. Photoshop
D. Eclipse
Answer: C.
Photoshop is image-editing software, not a software development tool. Visual Studio, IntelliJ IDEA, and Eclipse are all widely used integrated development environments.

3. The process of finding and fixing bugs in software is called _____.
A. debugging
B. coding
C. testing
D. designing
Answer: A.
Debugging means finding and fixing errors in software. Coding, testing, and designing are other development activities.

4. A set of instructions that a computer follows is called a _____.
A. program
B. algorithm
C. data structure
D. variable
Answer: A.
A program is a set of instructions that a computer follows. Algorithm, data structure, and variable name other concepts.

5. Which programming paradigm emphasizes objects and classes?
A. Procedural programming
B. Functional programming
C. Object-oriented programming
D. Logic programming
Answer: C.

Software Architect

Concerned with cross project/solution architecture and communication between different practices in architecture.
Time To Market
Cost and Benefits
Projected life time
Targeted Market
Integration with Legacy System
Roll back Schedule
Business Community view
A list of quality attributes exists in ISO/IEC 9126-2001 Information Technology – Software Product Quality
Solutions Architecture
Specific to a particular business area (or project) but still reliant on being a technical focal point for communications between the domain architect, business interests and development.
Concerned with the business model as it relates to an automated solution.
E-business is a good candidate
Structural part of requirements analysis
Domain Specific

Applying Slicing Technique to Software Architectures

arXiv:cs/0105008v1 [cs.SE] 5 May 2001

Jianjun Zhao
Department of Computer Science and Engineering
Fukuoka Institute of Technology
3-10-1 Wajiro-Higashi, Higashi-ku, Fukuoka 811-0214, Japan
Email: zhao@cs.fit.ac.jp

Abstract

Software architecture is receiving increasing attention as a critical design level for software systems. As software architecture design resources (in the form of architectural specifications) are going to be accumulated, the development of techniques and tools to support architectural understanding, testing, reengineering, maintenance, and reuse will become an important issue. This paper introduces a new form of slicing, named architectural slicing, to aid architectural understanding and reuse. In contrast to traditional slicing, architectural slicing is designed to operate on the architectural specification of a software system, rather than the source code of a program. Architectural slicing provides knowledge about the high-level structure of a software system, rather than the low-level implementation details of a program. In order to compute an architectural slice, we present the architecture information flow graph which can be used to represent information flows in a software architecture. Based on the graph, we give a two-phase algorithm to compute an architectural slice.

1 Introduction

Software architecture is receiving increasing attention as a critical design level for software systems [18]. The software architecture of a system defines its high-level structure, exposing its gross organization as a collection of interacting components. A well-defined architecture allows an engineer to reason about system properties at a high level of abstraction. Architectural description languages (ADLs) are formal languages that can be used to represent the architecture of a software system. They focus on the high-level structure of the overall application rather than the implementation details of any specific source
module. Recently, a number of architectural description languages have been proposed, such as Wright [2], Rapide [13], UniCon [17], and ACME [9], to support formal representation and reasoning of software architectures. As software architecture design resources (in the form of architectural specifications) are going to be accumulated, the development of techniques to support software architectural understanding, testing, reengineering, maintenance and reuse will become an important issue.

One way to support software architecture development is to use slicing technique. Program slicing, originally introduced by Weiser [23], is a decomposition technique which extracts program elements related to a particular computation. A program slice consists of those parts of a program that may directly or indirectly affect the values computed at some program point of interest, referred to as a slicing criterion. The task to compute program slices is called program slicing. To understand the basic idea of program slicing, consider a simple example in Figure 1 which shows: (a) a program fragment and (b) its slice with respect to the slice criterion (Total, 14). The slice consists of only those statements in the program that might affect the value of variable Total at line 14. The lines represented by small rectangles are statements that have been sliced away. We refer to this kind of slicing as traditional slicing to distinguish it from a new form of slicing introduced later.

Traditional slicing has been studied primarily in the context of conventional programming languages [21]. In such languages, slicing is typically performed by using a control flow graph or a dependence graph [5,12,7,16,24,25]. Traditional slicing has many applications in software engineering activities including program understanding [6], debugging [1], testing [3], maintenance [8], reuse [15], reverse engineering [4], and complexity measurement [16].

Applying slicing technique to software architectures promises benefit for software
architecture development at least in two aspects. First, architectural understanding and maintenance should benefit from slicing. When a maintainer wants to modify a component in a software architecture in order to satisfy new design requirements, the maintainer must first investigate which components will affect the modified component and which components will be affected by the modified component. This process is usually called impact analysis. By slicing a software architecture, the maintainer can extract the parts of a software architecture containing those components that might affect, or be affected by, the modified component. The slicing tool which provides such information can assist the maintainer greatly. Second, architectural reuse should benefit from slicing. While reuse of code is important, in order to make truly large gains in productivity and quality, reuse of software designs and patterns may offer the greater potential for return on investment. By slicing a software architecture, a system designer can extract reusable architectures from it, and reuse them in new system designs for which they are appropriate.

(a) A program fragment.

     1  begin
     2    read(X,Y);
     3    Total := 0.0;
     4    Sum := 0.0;
     5    if X <= 1 then
     6      Sum := Y;
     7    else
     8      begin
     9        read(Z);
    10        Total := X * Y;
    11      end;
    12    end if
    13    Write(Total, sum);
    14  end

(b) A slice of (a) on the criterion (Total, 14). Lines 4, 6, 9, and 13 have been sliced away.

     1  begin
     2    read(X,Y);
     3    Total := 0.0;
     4
     5    if X <= 1 then
     6
     7    else
     8      begin
     9
    10        Total := X * Y;
    11      end;
    12    end if
    13
    14  end

Figure 1: A program fragment and its slice on criterion (Total, 14).

While slicing is useful in software architecture development, existing slicing techniques for conventional programming languages cannot be applied to architectural specifications straightforwardly due to the following reasons. Generally, the traditional definition of slicing is concerned with slicing programs written in conventional programming languages which primarily consist of variables and statements, and the slicing notions are usually defined as (1) a slicing
criterion is a pair $(s, V)$ where $s$ is a statement and $V$ is a set of variables defined or used at $s$, and (2) a slice consists of only statements. However, in a software architecture, the basic elements are components and their interconnections, but neither variables nor statements as in conventional programming languages. Therefore, to perform slicing at the architectural level, new slicing notions for software architectures must be defined.

In this paper, we introduce a new form of slicing, named architectural slicing. In contrast to traditional slicing, architectural slicing is designed to operate on a formal architectural specification of a software system, rather than the source code of a conventional program. Architectural slicing provides knowledge about the high-level structure of a software system, rather than the low-level implementation details of a conventional program. Our purpose for development of architectural slicing is different from that for development of traditional slicing. While traditional slicing was designed originally for supporting source code level understanding and debugging of conventional programs, architectural slicing was primarily designed for supporting architectural level understanding and reuse of large-scale software systems. However, just as traditional slicing has many other applications in software engineering activities, we believe that architectural slicing is also useful in other software architecture development activities including architectural testing, reverse engineering, reengineering, and complexity measurement.

Abstractly, our slicing algorithm takes as input a formal architectural specification (written in its associated architectural description language) of a software system, then it removes from the specification those components and interconnections between components which are not necessary for ensuring that the semantics of the specification of the software architecture is maintained. This benefit allows unnecessary components and
interconnections between components to be removed at the architectural level of the system which may lead to considerable space savings, especially for large-scale software systems whose architectures consist of numerous components. In order to compute an architectural slice, we present the architecture information flow graph which can be used to represent information flows in a software architecture. Based on the graph, we give a two-phase algorithm to compute an architectural slice.

The rest of the paper is organized as follows. Section 2 briefly introduces how to represent a software architecture using Wright, an architectural description language. Section 3 shows a motivation example. Section 4 defines some notions about slicing software architectures. Section 5 presents the architecture information flow graph for software architectures. Section 6 gives a two-phase algorithm for computing an architectural slice. Section 7 discusses the related work. Concluding remarks are given in Section 8.

2 Software Architectural Specification in Wright

We assume that readers are familiar with the basic concepts of software architecture and architectural description language, and in this paper, we use the Wright architectural description language [2] as our target language for formally representing software architectures. Wright was selected because it can represent not only the architectural structure but also the architectural behavior of a software architecture.

Below, we use a simple Wright architectural specification taken from [14] as a sample to briefly introduce how to use Wright to represent a software architecture. The specification is shown in Figure 2 which models the system architecture of a Gas Station system [11].

Figure 2: An architectural specification in Wright. [The listing is garbled in the source. Recoverable parts: Configuration GasStation; component types Customer, Cashier, and Pump; connector types Customer_Cashier, Customer_Pump, and Cashier_Pump; instances Customer1: Customer, Customer2: Customer, cashier: Cashier, pump: Pump; an Attachments section; End GasStation.]

2.1 Representing Architectural Structure

Wright uses a configuration to describe architectural structure as a graph of components and connectors.

Components are computation units in the system. In Wright, each component has an interface defined by a set of ports. Each port identifies a point of interaction between the component and its environment.

Connectors are patterns of interaction between components. In Wright, each connector has an interface defined by a set of roles. Each role defines a participant of the interaction represented by the connector.

A Wright architectural specification of a system is defined by a set of component and connector type definitions, a set of instantiations of specific objects of these types, and a set of attachments. Attachments specify which components are linked to which connectors.

For example, in Figure 2 there are three component type definitions, Customer, Cashier and Pump, and three connector type definitions, Customer_Cashier, Customer_Pump and Cashier_Pump. The configuration is composed of a set of instances and a set of attachments to specify the architectural structure of the system.

Figure 3: The architecture of the Gas Station system.

2.2 Representing Architectural Behavior

Wright models architectural behavior according to the significant events that take place in the computation of components, and the
interactions between components as described by the connectors. The notation for specifying event-based behavior is adapted from CSP [10]. Each CSP process defines an alphabet of events and the permitted patterns of events that the process may exhibit. These processes synchronize on common events (i.e., interact) when composed in parallel. Wright uses such process descriptions to describe the behavior of ports, roles, computations and glues.

A computation specification specifies a component's behavior: the way in which it accepts certain events on certain ports and produces new events on those or other ports. Moreover, Wright uses an overbar to distinguish initiated events from observed events. For example, the Customer initiates the Pay action (i.e., pay!x).

As a result, based on formal Wright architectural specifications, we can infer which ports of a component are input ports and which are output ports. Also, we can infer which roles are input roles and which are output roles.

[...] specification of the original one which includes those components and connectors that might affect the component cashier through the ports in the criterion, and a forward architectural slice is a partial specification of the original one which includes those components and connectors that might be affected by the component cashier through the ports in the criterion. The other parts of the specification that might not affect or be affected by the component cashier will be removed, i.e., sliced away from the original specification. The maintainer can thus examine only the contents included in a slice to investigate the impact of modification. Using the algorithm we will present in Section 6, the slice shown in Figure 6 can be computed.

4 Architectural Slicing

Intuitively, an architectural slice may be viewed as a subset of the behavior of a software architecture, similar to the original notion of the traditional static slice.
However, while a traditional slice intends to isolate the behavior of a specified set of program variables, an architectural slice intends to isolate the behavior of a specified set of a component or connector's elements. Given an architectural specification $P = (C_m, C_n, c_g)$, our goal is to compute an architectural slice $S_p = (C'_m, C'_n, c'_g)$ which should be a "sub-architecture" of $P$ and preserve partially the semantics of $P$. To define the meaning of the word "sub-architecture," we introduce the concepts of a reduced component, connector and configuration.

Definition 4.1 Let $P = (C_m, C_n, c_g)$ be an architectural specification and $c_m \in C_m$, $c_n \in C_n$, and $c_g$ be a component, connector, and configuration of $P$ respectively:
• A reduced component of $c_m$ is a component $c'_m$ that is derived from $c_m$ by removing zero or more elements from $c_m$.
• A reduced connector of $c_n$ is a connector $c'_n$ that is derived from $c_n$ by removing zero or more elements from $c_n$.
• A reduced configuration of $c_g$ is a configuration $c'_g$ that is derived from $c_g$ by removing zero or more elements from $c_g$.

The above definition shows that a reduced component, connector, or configuration of a component, connector, or configuration may equal itself in the case that none of its elements has been removed, or an empty component, connector, or configuration in the case that all its elements have been removed.

For example, the following shows a component Customer, a connector Customer_Cashier, and a configuration as well as their corresponding reduced component, connector, and configuration. The small rectangles represent those ports, roles, or instances and attachments that have been removed from the original component, connector, or configuration.

(1) The component Customer and its reduced component (with * mark) in which the port Gas and elements [...]

[The listings for (1) and (2) are garbled in the source. Recoverable fragments: Port Gas = take → ...; Computation = Pay.pay!x ...; Role Givemoney = pay!x → ...; *Connector Customer... ; Glue.]

(3) The configuration and its reduced configuration (with * mark) in which some instances
and attachments have been removed.

[The listing for (3) is garbled in the source. It shows the original Instances and Attachments sections next to the reduced ones (marked *). Legible entries of the reduced configuration include the instances Customer1: Customer, Customer2: Customer, and cashier: Cashier, and the attachments Customer1.Pay as Customer1_cashier.Givemoney and cashier.Customer1 as Customer1_cashier.Getmoney; the removed entries appear as rectangles.]

Having the definitions of a reduced component, connector and configuration, we can define the meaning of the word "sub-architecture".

Definition 4.2 Let $P = (C_m, C_n, c_g)$ and $P' = (C'_m, C'_n, c'_g)$ be two architectural specifications. Then $P'$ is a reduced architectural specification of $P$ if:
• $C'_m = \{c'_{m_1}, c'_{m_2}, \ldots, c'_{m_k}\}$ is a "subset" of $C_m = \{c_{m_1}, c_{m_2}, \ldots, c_{m_k}\}$ such that for $i = 1, 2, \ldots, k$, $c'_{m_i}$ is a reduced component of $c_{m_i}$,
• $C'_n = \{c'_{n_1}, c'_{n_2}, \ldots, c'_{n_k}\}$ is a "subset" of $C_n = \{c_{n_1}, c_{n_2}, \ldots, c_{n_k}\}$ such that for $i = 1, 2, \ldots, k$, $c'_{n_i}$ is a reduced connector of $c_{n_i}$,
• $c'_g$ is a reduced configuration of $c_g$.

Having the definition of a reduced architectural specification, we can define some notions about slicing software architectures.

In a Wright architectural specification, for example, a component's interface is defined to be a set of ports which identify the form of the component interacting with its environment, and a connector's interface is defined to be a set of roles which identify the form of the connector interacting with its environment. To understand how a component interacts with other components and connectors for making changes, a maintainer must examine each port of the component of interest.
Moreover, it has been frequently emphasized that connectors are as important as components for architectural design, and a maintainer may also want to modify a connector during maintenance. To satisfy these requirements, for example, we can define a slicing criterion for a Wright architectural specification as a set of ports of a component or a set of roles of a connector of interest.

Definition 4.3 Let $P = (C_m, C_n, c_g)$ be an architectural specification. A slicing criterion for $P$ is a pair $(c, E)$ such that:
1. $c \in C_m$ and $E$ is a set of elements of $c$, or
2. $c \in C_n$ and $E$ is a set of elements of $c$.

Note that the selection of a slicing criterion depends on what users want to examine. If they are interested in examining a component in an architectural specification, they may use slicing criterion 1. If they are interested in examining a connector, they may use slicing criterion 2. Moreover, the determination of the set $E$ also depends on what users want to examine. If they want to examine a component, then $E$ may be the set of ports or just a subset of ports of the component. If they want to examine a connector, then $E$ may be the set of roles or just a subset of roles of the connector.

Definition 4.4 Let $P = (C_m, C_n, c_g)$ be an architectural specification.
• A backward architectural slice $S_{bp} = (C'_m, C'_n, c'_g)$ of $P$ on a given slicing criterion $(c, E)$ is a reduced architectural specification of $P$ which contains only those reduced components, connectors, and configuration that might directly or indirectly affect the behavior of $c$ through elements in $E$.
• Backward-slicing an architectural specification $P$ on a given slicing criterion is to find the backward architectural slice of $P$ with respect to the criterion.
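Definition 4.3 can be read as a simple membership check. The following is a minimal sketch, assuming a specification is reduced to two dictionaries that map each component to its port names and each connector to its role names; the `Spec` class and the function name are hypothetical and not part of the paper, and the example uses only names that appear in the Gas Station specification.

```python
from dataclasses import dataclass, field


@dataclass
class Spec:
    """Hypothetical, simplified view of an architectural specification."""
    components: dict = field(default_factory=dict)  # component name -> set of port names
    connectors: dict = field(default_factory=dict)  # connector name -> set of role names


def is_slicing_criterion(spec, c, elements):
    """Return True if (c, elements) satisfies case 1 or case 2 of Definition 4.3."""
    elements = set(elements)
    if c in spec.components:              # case 1: c is a component, E is a set of its ports
        return elements <= spec.components[c]
    if c in spec.connectors:              # case 2: c is a connector, E is a set of its roles
        return elements <= spec.connectors[c]
    return False                          # c is neither a component nor a connector


gas_station = Spec(
    components={"cashier": {"Customer1", "Customer2", "Topump"}},
    connectors={"cashier_pump": {"Tell", "Know"}},
)
```

With this sketch, `("cashier", {"Customer1", "Topump"})` is accepted as a criterion, while `("cashier", {"Oil1"})` is rejected because Oil1 is not a port of cashier.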
Definition 4.5 Let $P = (C_m, C_n, c_g)$ be an architectural specification.
• A forward architectural slice $S_{fp} = (C'_m, C'_n, c'_g)$ of $P$ on a given slicing criterion $(c, E)$ is a reduced architectural specification of $P$ which contains only those reduced components, connectors, and configuration that might be directly or indirectly affected by the behavior of $c$ through elements in $E$.
• Forward-slicing an architectural specification $P$ on a given slicing criterion is to find the forward architectural slice of $P$ with respect to the criterion.

From Definitions 4.4 and 4.5, it is obvious that there is at least one backward slice and at least one forward slice of an architectural specification that is the specification itself. Moreover, the architecture represented by $S_{bp}$ or $S_{fp}$ should be a "sub-architecture" of the architecture represented by $P$.

Figure 4: The information flow graph of the architectural specification in Figure 2. Vertex legend:

pv1: Customer1.Pay
pv2: Customer1.Gas
pv3: Customer2.Pay
pv4: Customer2.Gas
pv5: cashier.Customer1
pv6: cashier.Customer2
pv7: cashier.Topump
pv8: pump.Fromcashier
pv9: pump.Oil1
pv10: pump.Oil2

rv1: Customer1_cashier.Givemoney
rv2: Customer1_cashier.Getmoney
rv3: Customer2_cashier.Givemoney
rv4: Customer2_cashier.Getmoney
rv5: cashier_pump.Tell
rv6: cashier_pump.Know
rv7: Customer1_pump.Getoil
rv8: Customer1_pump.Giveoil
rv9: Customer2_pump.Getoil
rv10: Customer2_pump.Giveoil

Defining an architectural slice as a reduced architectural specification of the original one is particularly useful for supporting architectural reuse. By using an architectural slicer, a system designer can automatically decompose an existing architecture (in the case that its architectural specification is available) into some small architectures each having its own functionality which may be reused in new system designs. Moreover, the view of an architectural slice as a reduced architectural specification does not reduce its usefulness when applied to architectural understanding because it also contains enough information for a maintainer
to facilitate the modification.

5 The Information Flow Graph for Software Architectures

In this section, we present the architecture information flow graph for software architectures on which architectural slices can be computed efficiently.

The architecture information flow graph is an arc-classified digraph whose vertices represent the ports of components and the roles of the connectors in an architectural specification, and arcs represent possible information flows between components and/or connectors in the specification.

Definition 5.1 The Architecture Information Flow Graph (AIFG) of an architectural specification $P$ is an arc-classified digraph $(V_{com}, V_{con}, Com, Con, Int)$, where:
• $V_{com}$ is the set of port vertices of $P$;
• $V_{con}$ is the set of role vertices of $P$;
• $Com$ is the set of component-connector flow arcs;
• $Con$ is the set of connector-component flow arcs;
• $Int$ is the set of internal flow arcs.

There are three types of information flow arcs in the AIFG, namely, component-connector flow arcs, connector-component flow arcs, and internal flow arcs.

Component-connector flow arcs are used to represent information flows between a port of a component and a role of a connector in an architectural specification. Informally, if there is an information flow from a port of a component to a role of a connector in the specification, then there is a component-connector flow arc in the AIFG which connects the corresponding port vertex to the corresponding role vertex. For example, from the Wright specification shown in Figure 2, we can know that there is an information flow from the port Topump of the component cashier to the role Tell of the connector cashier_pump. Therefore there is a component-connector flow arc in the AIFG in Figure 4 which connects the port vertex of port Topump to the role vertex of role Tell.

Connector-component flow arcs are used to represent information flows between a role of a connector and a port of a component in an architectural specification.
Informally, if there is an information flow from a role of a connector to a port of a component in the specification, then there is a connector-component flow arc in the AIFG which connects the corresponding role vertex to the corresponding port vertex. For example, from the Wright specification in Figure 2, we can know that there is an information flow from the role Know of the connector cashier_pump to the port Fromcashier of the component pump. Therefore, there is a connector-component flow arc in the AIFG in Figure 4 which connects the role vertex for role Know to the port vertex for port Fromcashier.

Internal flow arcs are used to represent internal information flows within a component or connector in an architectural specification. Informally, for a component in the specification, there is an internal flow from an input port to an output port, and for a connector in the specification, there is an internal flow from an input role to an output role. For example, in Figure 2, there is an internal flow from the role Givemoney to the role Getmoney of the connector Customer1_cashier and also an internal flow arc from the port Fromcashier to the port Oil1 of component pump.

As we introduced in Section 2, Wright uses a CSP-based model to specify the behavior of a component and a connector of a software architecture. Wright allows users to infer which ports of a component are input and which are output, and which roles of a connector are input and which are output, based on a Wright architectural specification. Moreover, it also allows users to infer the direction in which the information transfers between ports and/or roles. As a result, by using a static analysis tool which takes an architectural specification as its input, we can construct the AIFG of a Wright architectural specification automatically.

Figure 4 shows the AIFG of the architectural specification in Figure 2. In the figure, large squares represent components in the specification, and small squares represent the ports of each component. Each port vertex has a name
described by component_name.port_name. For example, pv5 (cashier.Customer1) is a port vertex that represents the port Customer1 of the component cashier. Large circles represent connectors in the specification, and small circles represent the roles of each connector. Each role vertex has a name described by connector_name.role_name. For example, rv5 (cashier_pump.Tell) is a role vertex that represents the role Tell of the connector cashier_pump. The complete specification of each vertex is shown on the right side of the figure.

Solid arcs represent component-connector flow arcs that connect a port of a component to a role of a connector. Dashed arcs represent connector-component flow arcs that connect a role of a connector to a port of a component. Dotted arcs represent internal flow arcs that connect two ports within a component (from an input port to an output port), or two roles within a connector (from an input role to an output role). For example, (rv2, pv5) and (rv6, pv8) are connector-component flow arcs; (pv7, rv5) and (pv9, rv8) are component-connector flow arcs; (rv1, rv2) and (pv8, pv10) are internal flow arcs.

6 Computing Architectural Slices

The slicing notions defined in Section 4 give us only a general view of an architectural slice, and do not tell us how to compute it. In this section we present a two-phase algorithm to compute a slice of an architectural specification based on its information flow graph. Our algorithm contains two phases: (1) computing a slice $S_g$ over the information flow graph of an architectural specification, and (2) constructing an architectural slice $S_p$ from $S_g$.

6.1 Computing a Slice over the AIFG

Let $P = (C_m, C_n, c_g)$ be an architectural specification and $G = (V_{com}, V_{con}, Com, Con, Int)$ be the AIFG of $P$. To compute a slice over $G$, we refine the slicing notions defined in Section 4 as follows:
• A slicing criterion for $G$ is a pair $(c, V_c)$ such that: (1) $c \in C_m$ and $V_c$ is a set of port vertices corresponding to the ports of $c$, or (2) $c \in C_n$ and $V_c$ is a set of role vertices corresponding to roles of $c$.
• The backward slice $S_{bg}(c, V_c)$ of $G$
on a given slicing criterion $(c, V_c)$ is a subset of vertices of $G$ such that for any vertex $v$ of $G$, $v \in S_{bg}(c, V_c)$ iff there exists a path from $v$ to $v' \in V_c$ in the AIFG.
• The forward slice $S_{fg}(c, V_c)$ of $G$ on a given slicing criterion $(c, V_c)$ is a subset of vertices of $G$ such that for any vertex $v$ of $G$, $v \in S_{fg}(c, V_c)$ iff there exists a path from $v' \in V_c$ to $v$ in the AIFG.

According to the above descriptions, the computation of a backward slice or forward slice over the AIFG can be solved by using a usual depth-first or breadth-first graph traversal algorithm to traverse the graph by taking some port or role vertices of interest as the start point of interest.

Figure 5 shows a backward slice over the AIFG with respect to the slicing criterion (cashier, $V_c$) such that $V_c$ = {pv5, pv6, pv7}.

6.2 Computing an Architectural Slice

The slice $S_g$ computed above is only a slice over the AIFG of an architectural specification, which is a set of vertices of the AIFG. Therefore we should map each element in $S_g$ to the source code of the specification. Let $P = (C_m, C_n, c_g)$ be an architectural specification and $G = (V_{com}, V_{con}, Com, Con, Int)$ be the AIFG of $P$. By using the concepts of a reduced component, connector, and configuration introduced in Section 4, a slice $S_p = (C'_m, C'_n, c'_g)$ of an architectural specification $P$ can be constructed in the following steps:

1. Constructing a reduced component $c'_m$ from a component $c_m$ by removing all ports such that their corresponding port vertices in $G$ have not been included in $S_g$ and unnecessary elements in the computation from $c_m$. The reduced components $C'_m$ in $S_p$ have the same relative order as the components $C_m$ in $P$.

2. Constructing a reduced connector $c'_n$ from a connector $c_n$ by removing all roles such that their corresponding role vertices in $G$ have not been included in $S_g$ and unnecessary elements in the glue from $c_n$. The reduced connectors $C'_n$ in $S_p$ have the
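The first phase (Section 6.1) amounts to graph reachability, so it can be sketched with a breadth-first traversal. The sketch below is an illustration under stated assumptions, not the author's implementation: the arc list contains only the arcs that the text of Section 5 names explicitly (the full AIFG of Figure 4 has more), the function names are hypothetical, and a path of length zero is assumed to count, so the criterion vertices themselves belong to the slice.

```python
from collections import deque

# Arcs named explicitly in Section 5; this is only part of the AIFG in Figure 4.
ARCS = [
    ("rv1", "rv2"),   # internal: Givemoney -> Getmoney (Customer1_cashier)
    ("rv2", "pv5"),   # connector-component: Getmoney -> cashier.Customer1
    ("pv7", "rv5"),   # component-connector: cashier.Topump -> cashier_pump.Tell
    ("rv6", "pv8"),   # connector-component: cashier_pump.Know -> pump.Fromcashier
    ("pv8", "pv9"),   # internal: pump.Fromcashier -> pump.Oil1
    ("pv8", "pv10"),  # internal: pump.Fromcashier -> pump.Oil2
    ("pv9", "rv8"),   # component-connector: pump.Oil1 -> Customer1_pump.Giveoil
]


def _reachable(starts, adjacency):
    """Breadth-first search: all vertices reachable from `starts` (inclusive)."""
    seen = set(starts)
    queue = deque(seen)
    while queue:
        vertex = queue.popleft()
        for neighbour in adjacency.get(vertex, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def backward_slice(arcs, criterion):
    """Vertices from which some criterion vertex is reachable (follow arcs backwards)."""
    predecessors = {}
    for src, dst in arcs:
        predecessors.setdefault(dst, []).append(src)
    return _reachable(criterion, predecessors)


def forward_slice(arcs, criterion):
    """Vertices reachable from some criterion vertex (follow arcs forwards)."""
    successors = {}
    for src, dst in arcs:
        successors.setdefault(src, []).append(dst)
    return _reachable(criterion, successors)


criterion = {"pv5", "pv6", "pv7"}          # the ports of cashier, as in Figure 5
backward = backward_slice(ARCS, criterion)
forward = forward_slice(ARCS, criterion)
```

On this partial arc set the backward slice adds rv1 and rv2 to the criterion vertices, and the forward slice adds rv5. The second phase would then keep only the ports and roles whose vertices appear in the resulting set.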

2_what is the software architecture

[Figure: physical model + mapping; elements A, B, C, D; TCP/IP over Ethernet; bandwidth, availability]
1.1.2 Architecture is an abstraction
In modern systems, elements interact with each other by means of interfaces
Interfaces partition the details of an element into public and private parts
Architecture is concerned with the public side of interfaces
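
The public/private partition of an interface can be illustrated in C with an opaque type. The sketch below is only an example under assumptions: the `Counter` element and its functions are hypothetical names chosen for illustration, and the public and private parts are shown in one listing although they would normally sit in a header and a separate source file.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- Public side: everything other elements are allowed to see ---- */
typedef struct Counter Counter;        /* opaque type: layout not exposed */
Counter *counter_create(void);
void     counter_increment(Counter *c);
int      counter_value(const Counter *c);
void     counter_destroy(Counter *c);

/* ---- Private side: would normally live in its own .c file ---- */
struct Counter {
    int value;   /* hidden detail; can change without breaking callers */
};

Counter *counter_create(void) {
    Counter *c = malloc(sizeof *c);
    if (c != NULL)
        c->value = 0;
    return c;                          /* NULL if allocation failed */
}

void counter_increment(Counter *c) {
    assert(c != NULL);
    c->value++;
}

int counter_value(const Counter *c) {
    assert(c != NULL);
    return c->value;
}

void counter_destroy(Counter *c) {
    free(c);                           /* free(NULL) is a no-op */
}
```

Callers depend only on the four declared functions, which is the part an architectural description would record; the `struct Counter` definition is the private side that the architecture abstracts away.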
The software architecture of a system is the set of structures needed to reason about the system, which comprise software elements, relations among them, and properties of both
[Figure: business goals; the final resulting system]
1.1.1.2 C&C structure
C&C structures focus on the way the elements interact with each other at runtime to carry out the system’s functions
For example: the service system

Internet of Things Architecture and Key Technologies

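Chapter 2: Internet of Things Architecture
This autonomic architecture of the Internet of Things consists of four planes: the data plane, the control plane, the knowledge plane, and the management plane.
Knowledge plane: provides a complete view of the information of the entire network and distills it into knowledge of the network system, which is used to guide the adaptive control performed by the control plane.
Control plane: sends configuration information to the data plane in order to optimize the data plane's throughput and improve reliability.
Data plane: mainly used for transmitting data packets.
Management plane: used to coordinate…
Figure 2.1 An autonomic architecture of the Internet of Things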
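Figure 2.3 Schematic of the EPC Internet of Things architecture
Figure 2.3 shows the basic architecture of an enterprise Internet of Things application system. The application system consists of three main parts: the RFID identification system, the middleware system, and the computer Internet system.
The RFID identification system comprises EPC tags and RFID readers, which communicate over the RFID air interface; an EPC tag is attached to each item.
EPC Global describes an Internet of Things as consisting mainly of three parts: the EPC coding system, the radio-frequency identification system, and the EPC information network system.
1. EPC coding system. The Internet of Things aims at real-time sharing of information about items worldwide. Clearly, the first step is a unified coding of items worldwide: every item produced anywhere on Earth must be given an electronic tag. This tag carries an electronic product code that is globally unique. The tag represents the item's basic identifying information, for example "the E-th item of product class D produced by company A at time B in place C". At present, the EPC code supported by Europe and the United States and the UID code supported by Japan are two common electronic product coding systems.
The EPC information discovery service (Discovery Service) includes the Object Name Service (ONS) and its supporting services; based on the electronic product code, it obtains the access-channel information for EPC data. At present, the root ONS system and the supporting discovery service system are entrusted by EPC Global to VeriSign for operation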

Performance Analysis of Software Architectures

Other average performance indices can be derived from π and depend on the blocking type
Exact solution soon becomes numerically intractable
Product-form solution in special cases
Approximate analysis
• Network Topology
– models how service centers are interconnected and how customers move among them
Queueing networks with finite capacity queues
• Queueing network models to represent
– sharing of resources with finite capacity queues
– population constraints
– synchronization constraints
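
As a worked illustration of how performance indices follow from a stationary distribution π, the sketch below treats the simplest finite-capacity case: a single M/M/1/K station where arrivals that find the queue full are lost. The choice of this model and all function names are assumptions made for illustration; the slides concern general networks with several blocking types, which this sketch does not cover.

```c
#include <assert.h>
#include <stddef.h>

/* Fills pi[0..K] with the stationary distribution of an M/M/1/K queue
 * (arrival rate lambda, service rate mu, at most K customers in the station).
 * The unnormalised probabilities are rho^n with rho = lambda / mu. */
void mm1k_stationary(double lambda, double mu, size_t K, double *pi) {
    assert(lambda > 0.0 && mu > 0.0);
    double rho = lambda / mu;
    double term = 1.0;
    double sum = 0.0;
    for (size_t n = 0; n <= K; n++) {
        pi[n] = term;
        sum += term;
        term *= rho;
    }
    for (size_t n = 0; n <= K; n++)
        pi[n] /= sum;                  /* normalise so the entries sum to 1 */
}

/* Probability that an arriving customer finds the station full. */
double mm1k_blocking(const double *pi, size_t K) {
    return pi[K];
}

/* Mean number of customers in the station. */
double mm1k_mean_customers(const double *pi, size_t K) {
    double mean = 0.0;
    for (size_t n = 0; n <= K; n++)
        mean += (double)n * pi[n];
    return mean;
}

/* Rate of customers actually admitted (and, in steady state, served). */
double mm1k_throughput(double lambda, const double *pi, size_t K) {
    return lambda * (1.0 - pi[K]);
}
```

Summing the geometric terms directly, rather than using the closed form, keeps the code valid when lambda equals mu. For networks of such stations the state space grows quickly, which is the intractability noted on the slide.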
Lack of information
How do we measure?
How do we interpret the measures?
Performance Evaluation
Quantitative analysis of systems; based on models and methods both deterministic and stochastic
• Analytic techniques can be exact (e.g. numerical), approximate, or bounding

Client-side cross-site scripting protection

Online Network; Search R&D; GIS R&D; Internet / E-Commerce; Functional Design; linux platform; Scripting language; Data structure and algorithm design

16,410 articles found for: pub-date > 2002 and tak(((E-Commerce) or (Online Network) or (GIS R&D) or (Search R&D)) and ((Functional Design) or (Scripting language) or (Data structure) or (algorithm design) or (linux platform)) and internet)

Platform-based product design and development: A knowledge-intensive support approach
Knowledge-Based Systems
This paper presents a knowledge-intensive support paradigm for platform-based product family design and development. The fundamental issues underlying the product family design and development, including product platform and product family modeling, product family generation and evolution, and product family evaluation for customization, are discussed. A module-based integrated design scheme is proposed with knowledge support for product family architecture modeling, product platform establishment, product family generation, and product variant assessment. A systematic methodology and the relevant technologies are investigated and developed for knowledge supported product family design process. The developed information and knowledge-modeling framework and prototype system can be used for platform product design knowledge capture, representation and management and offer on-line support for designers in the design process. The issues and requirements related to developing a knowledge-intensive support system for modular platform-based product family design are also addressed.
Article Outline
1. Introduction
2. Literature review
3. Platform-based product design and development
4. Product platform and product family modeling
4.1. Product family architecture modeling
4.2. Product family generation and optimization
4.3. Product family evolution representation
4.4. Product family evaluation for customization
5. Module-based product family design process
6. Knowledge support framework for modular product family design
6.1. Knowledge support scheme and key issues
6.2. Product family design knowledge modeling and support
6.2.1. Issues of product family design knowledge modeling
6.2.2. Knowledge modeling and representation for product family design
6.2.3. Knowledge support process for modular product family design
7. Prototype of knowledge-intensive support system for product family design
8. Summary and future work
9. Disclaimer
References

Cost-based admission control for Internet Commerce QoS enhancement
Electronic Commerce Research and Applications
In many e-commerce systems, preserving Quality of Service (QoS) is crucial to keep a competitive edge. Poor QoS translates into poor system resource utilisation, customer dissatisfaction and profit loss. In this paper, a cost-based admission control (CBAC) approach is described which is a novel approach to preserve QoS in Internet Commerce systems. CBAC is a dynamic mechanism which uses a congestion control technique to maintain QoS while the system is online. Rather than rejecting customer requests in a high-load situation, a discount-charge model which is sensitive to system current load and navigational structure is used to encourage customers to postpone their requests. A scheduling mechanism with load forecasting is used to schedule user requests in more lightly loaded time periods. Experimental results showed that the use of CBAC at high load achieves higher profit, better utilisation of system resources and service times competitive with those which are achievable during lightly loaded periods. Throughput is sustained at reasonable levels and request failure at high load is dramatically reduced.
Article Outline
1. Introduction
2. An overview of CBAC
3. Discount-charge pricing model
4. CBAC’s navigational model
5. Customer postponed request scheduling
6. Forecasting system load
7. CBAC-specific web pages
8. Customer behaviour
9. ECBench benchmarking tool
10. CBAC performance analysis
10.1. Service time
10.2. CPU utilisation
10.3. Throughput and failed requests
10.4. Profit
10.5. CBAC overhead
10.6. CBAC load forecasting effect
11. Related work
12. Conclusions
References

ActiveRDF: Embedding Semantic Web data into object-oriented languages
Web Semantics: Science, Services and Agents on the World Wide Web
Semantic Web applications share a large portion of development effort with database-driven Web applications. Existing approaches for development of these database-driven applications cannot be directly applied to Semantic Web data due to differences in the underlying data model. We develop a mapping approach that embeds Semantic Web data into object-oriented languages and thereby enables reuse of existing Web application frameworks.
We analyse the relation between the Semantic Web and the Web, and survey the typical data access patterns in Semantic Web applications. We discuss the mismatch between object-oriented programming languages and Semantic Web data, for example in the semantics of class membership, inheritance relations, and object conformance to schemas.
We present ActiveRDF, an object-oriented API for managing RDF data that offers full manipulation and querying of RDF data, does not rely on a schema and fully conforms to RDF(S) semantics. ActiveRDF can be used with different RDF data stores: adapters have been implemented to generic SPARQL endpoints, Sesame, Jena, Redland and YARS and new adapters can be added easily. We demonstrate the usage of ActiveRDF and its integration with the popular Ruby on Rails framework which enables rapid development of Semantic Web applications.
Article Outline
1. Introduction
1.1. Mapping relational data
1.2. Web application frameworks
1.3. Outline
2. Related work
2.1. Object–relational mappings
2.2. RDF data access
2.3. Semantic Web application development
3. Developing Semantic Web applications
4. Requirements for Semantic Web application development
5. Typical data access and manipulation patterns
6. Programming languages for embedding RDF data
7. A layered architecture for programmatic access to data
7.1. Adapters
7.2. Federation manager
7.3. Query engine
7.4. Object manager
8. Evaluation
9. Example application: exploring online communities
9.1. Domain: social communities on the web
9.2. The Ruby on Rails Web application framework
9.3. Implementing the SIOC explorer
9.3.1. Crawling SIOC data
9.3.2. Integrating the data
9.3.3. Application logic: social context extraction
9.3.4. Faceted navigation with BrowseRDF
9.4. Implemented Semantic Web capabilities
10. Conclusion
References

The hybrid model of neural networks and genetic algorithms for the design of controls for internet-based systems for business-to-consumer electronic commerce
Expert Systems with Applications
Research highlights
► A hybrid model using neural networks and genetic algorithms is proposed.
► The effect of system environments on controls can be estimated.
► The effect of each mode of controls on implementation (volume) can be identified.
► The model can suggest the best set of values for controls to be recommended.
As organizations become increasingly dependent on Internet-based systems for business-to-consumer electronic commerce (ISB2C), the issue of IS security becomes increasingly important. As the usage of security controls is related to the implementation of ISB2C, the extent of ISB2C controls can be adjusted in order to enable the greatest extent of implementation of ISB2C.
This study intends to propose ISB2C-NNGA (ISB2C-controls design using neural networks and genetic algorithms), a hybrid optimization model using neural networks and genetic algorithms for the design of ISB2C controls, which uses back-propagation neural networks (BPN) model as a prediction of controls using system environments, and GA as a pattern directed search mechanism to estimate the exponent of independent variables (i.e., ISB2C controls) in multivariate regression analysis of power model. The effect of system environments on controls can be estimated using BPN model which outperformed linear regression analysis in terms of square root of mean squared error. The effect of each mode of controls on implementation (volume) can be identified using exponents and standardized coefficients in the GA-based nonlinear regression analysis in ISB2C-NNGA. ISB2C-NNGA outperformed conventional linear regression analysis in prediction accuracy in terms of the average R square and sum of squared error. ISB2C can suggest the best set of values for controls to be recommended from several candidate sets of values for controls by identifying the set of values for controls which produce greatest extent of ISB2C implementation. The results of study will support the design of ISB2C controls effectively.
Article Outline
1. Introduction
2. Theoretical background
2.1. Neural networks
2.2. Genetic algorithms
2.3. ISB2C Controls for ISB2C implementation
3. Research model
3.1. Build a neural network model to estimate the effect of system environments on controls
3.2. Build a GA-based nonlinear regression model
3.3. Determine the extent of effect on implementation by each mode of controls
3.4. Recommend the set of controls for maximum implementation of ISB2C from candidate sets of controls
4. Measures and data collection
5.1. Estimation and prediction of ISB2C controls using BPN
5.2. Estimation and prediction of ISB2C implementation using GA-based nonlinear regression model
5.3. Recommendation of controls
6. Conclusion
Appendix A
A.1. System environments
A.2. ISB2C Controls
A.3. ISB2C Implementation
References

Cataclysm: Scalable overload policing for internet applications

E-fulfillment and multi-channel distribution – A review
European Journal of Operational Research
This review addresses supply chain management issues specific to Internet fulfillment in a multi-channel environment. It provides a systematic overview of managerial planning tasks and corresponding quantitative models. Our objective is twofold, namely to enhance the understanding of multi-channel e-fulfillment by documenting the current state of affairs, and to inspire fruitful future research by identifying gaps between relevant managerial issues and available academic literature.
One of the recurrent patterns in today’s e-commerce operations is the combination of ‘bricks-and-clicks’ – the integration of e-fulfillment into a portfolio of multiple alternative distribution channels. From a supply chain management perspective, multi-channel distribution provides opportunities for serving different customer segments, creating synergies, and exploiting economies of scale. However, in order to successfully exploit these opportunities companies must master novel challenges. In particular, the design of a multi-channel distribution system requires a constant trade-off between process integration and separation across multiple channels. In addition, sales and operation decisions are ever more tightly intertwined as delivery and after-sales services are becoming key components of the product offering.
Article Outline
2. Scope and framework
3. Sales and delivery planning
3.1. Delivery service design
3.1.1. Issues
3.1.2. Models
3.2. Pricing and forecasting
3.2.1. Issues
3.2.2. Models
3.3. Order promising and revenue management
3.3.1. Issues
3.3.2. Models
3.4. Transportation planning
3.4.1. Issues
3.4.2. Models
4. Supply management
4.1. Distribution network design
4.1.1. Issues
4.1.2. Models
4.2. Warehouse design
4.2.1. Issues
4.2.2. Models
4.3. Inventory and capacity management
4.3.1. Issues
4.3.2. Models
5. Conclusions
Acknowledgements
References

Distributed algorithm engineering for networks of tiny artifacts
Computer Science Review
In this survey, we describe the state of the art for research on experimentally-driven research on networks of tiny artifacts. The main topics are existing and planned practical testbeds, software simulations, and hybrid approaches; in addition, we describe a number of current studies undertaken by the authors.
Article Outline
1. Introduction
2. Tools of practical assessment
2.1. Experimental facilities
MoteLab
TWIST
TutorNet
MetroSense—Bikenet
TrueMobile
sensLAB
WISEBED
Smart cities
2.2. Software simulators
Ns-2
JSim
OMNeT++
SENSE
TOSSIM
Avrora
WSim
Shawn
2.3. Hybrid approaches
SensorSim
H-TOSSIM
Emstar
X-Sim
Sensornet Checkpointing
Virtual Links
2.4. The wiselib: a library of algorithms
Lack of established programming standards
Compatibility issues
Absence of low-level convenience features
Absence of algorithm building blocks
Architecture
pSTL
pMP
Algorithms
3. Algorithm engineering in FRONTS
3.1. Hardware
3.2. Simulating the testbed
3.3. Experiments repository
4. Case studies
4.1. Adaptive multipath fair communication
Probabilistic data forwarding
Energy fairness
Multiple forwarding phases
Twisting forwarding
Conclusions
4.2. Game-theoretic vertex coloring
Game-theoretic vertex coloring
Centralized implementation
Simplified distributed implementation
Fully distributed implementation
Conclusions
References
Vitae

A multi-resolution model of vector map data for rapid transmission over the Internet
Computers & Geosciences
The rapid transmission of vector map data over the Internet is becoming a bottleneck of spatial data delivery and visualization in web-based environment because of increasing data amount and limited network bandwidth.
In order to improve the transmission performance of vector map data over the Internet, a multi-resolution model of vector map data is proposed, which constructs multiple-resolution representations of vector map data on-line prior to transmission with a vertex decimation method on the server side. The vertex decimation method was developed to extract various resolutions of vector map data on-line, and data reconstruction solution was developed to recover the original vector map data from a low-resolution one on the client side. Secondly, rules of the vertex decimation method are defined to overcome self-intersections and inconsistent topology in the multi-resolution model and corresponding algorithm was developed to test the performance and feasibility of the multi-resolution model for rapid transmission over the Internet. Experimental results reveal that the multi-resolution model of vector map data significantly improves the transmission time and decreases data amount of spatial data over the Internet.
Article Outline
1. Introduction
2. Methodology
2.1. The concept of progressive transmission of vector map data
2.2. Construction of the multi-resolution model of vector map data
2.2.1. The vertex decimation method
2.2.2. Constructing the multi-resolution model of geometry objects
3. Implementation of the multi-resolution model of vector map data
3.1. Data structure for the vertices eliminated
3.2. The transmission and reconstruction of vector map data
4. Experimental study and analysis
5. Conclusions
Acknowledgements
References

Community detection in graphs
Physics Reports
The modern science of networks has brought significant advances to our understanding of complex systems. One of the most relevant features of graphs representing real systems is community structure, or clustering, i.e. the organization of vertices in clusters, with many edges joining vertices of the same cluster and comparatively few edges joining vertices of different clusters.
Such clusters, or communities, can be considered as fairly independent compartments of a graph, playing a similar role like, e.g., the tissues or the organs in the human body. Detecting communities is of great importance in sociology, biology and computer science, disciplines where systems are often represented as graphs. This problem is very hard and not yet satisfactorily solved, despite the huge effort of a large interdisciplinary community of scientists working on it over the past few years. We will attempt a thorough exposition of the topic, from the definition of the main elements of the problem, to the presentation of most methods developed, with a special focus on techniques designed by statistical physicists, from the discussion of crucial issues like the significance of clustering and how methods should be tested and compared against each other, to the description of applications to real networks.
Article Outline
1. Introduction
2. Communities in real-world networks
3. Elements of community detection
3.1. Computational complexity
3.2. Communities
3.2.1. Basics
3.2.2. Local definitions
3.2.3. Global definitions
3.2.4. Definitions based on vertex similarity
3.3. Partitions
3.3.1. Basics
3.3.2. Quality functions: Modularity
4. Traditional methods
4.1. Graph partitioning
4.2. Hierarchical clustering
4.3. Partitional clustering
4.4. Spectral clustering
5. Divisive algorithms
5.1. The algorithm of Girvan and Newman
5.2. Other methods
6. Modularity-based methods
6.1. Modularity optimization
6.1.1. Greedy techniques
6.1.2. Simulated annealing
6.1.3. Extremal optimization
6.1.4. Spectral optimization
6.1.5. Other optimization strategies
6.2. Modifications of modularity
6.3. Limits of modularity
7. Spectral algorithms
8. Dynamic algorithms
8.1. Spin models
8.2. Random walk
8.3. Synchronization
9. Methods based on statistical inference
9.1. Generative models
9.2. Blockmodeling, model selection and information theory
10. Alternative methods
11. Methods to find overlapping communities
11.1. Clique percolation
11.2. Other techniques
12. Multiresolution methods and cluster hierarchy
12.1. Multiresolution methods
12.2. Hierarchical methods
13. Detection of dynamic communities
14. Significance of clustering
15. Testing algorithms
15.1. Benchmarks
15.2. Comparing partitions: Measures
15.3. Comparing algorithms
16. General properties of real clusters
17. Applications on real-world networks
17.1. Biological networks
17.2. Social networks
17.3. Other networks
18. Outlook
Acknowledgements
Appendix. Elements of graph theory
A.1. Basic definitions
A.2. Graph matrices
A.3. Model graphs
References

Architectures for the future networks and the next generation Internet: A survey
Computer Communications
Networking research funding agencies in USA, Europe, Japan, and other countries are encouraging research on revolutionary networking architectures that may or may not be bound by the restrictions of the current TCP/IP based Internet. We present a comprehensive survey of such research projects and activities. The topics covered include various testbeds for experimentations for new architectures, new security mechanisms, content delivery mechanisms, management and control frameworks, service architectures, and routing mechanisms. Delay/disruption tolerant networks which allow communications even when complete end-to-end path is not available are also discussed.
Article Outline
1. Introduction
2. Scope
3. Security
3.1. Relationship-Oriented Networking
3.1.1. Identities
3.1.2. Building and sharing relationships
3.1.3. Relationship applications
3.2. Security architecture for Networked Enterprises (SANE)
3.3. Enabling defense and deterrence through private attribution
3.4. Protecting user privacy in a network with ubiquitous computing devices
3.5. Pervasive and trustworthy network and service infrastructures
3.6. Anti-Spam Research Group (ASRG)
4. Content distribution mechanisms
4.1. Next generation CDN
4.2. Next generation P2P
4.3. Swarming architecture
4.4. Content Centric Networking
5. Challenged network environments
5.1. Delay Tolerant Networks (DTN)
5.2. Delay/fault tolerant mobile sensor networks (DFT-MSN)
5.3. Postcards from the edge
5.4. Disaster day after networks (DAN)
5.5. Selectively Connected Networking (SCN)
6. Network monitoring and control architectures
6.1. 4D architecture
6.2. Complexity Oblivious Network Management (CONMan)
6.3. Maestro
6.4. Autonomic network management
6.5. In-Network Management (INM)
7. Service centric architectures
7.1. Service-Centric End-to-End Abstractions for Network Architecture
7.2. SILO architecture for services integration, control, and optimization for the future Internet
7.3. NetSerV: architecture of a service-virtualized Internet
7.4. SLA@SOI: empowering the Service Economy with SLA-aware Infrastructures
7.5. SOA4All: Service-Oriented Architectures for All
7.6. Internet 3.0: a multi-tier diversified architecture for the next generation Internet based on object abstraction
8. Next generation internetworking architectures
8.1. Algorithmic foundations for Internet architecture: clean slate approach
8.2. Greedy routing on hidden metrics (GROH Model)
8.3. HLP: hybrid link state path-vector inter-domain routing
8.4. eFIT [94] enabling future Internet innovations through transit wire
8.5. Postmodern internetwork architecture
8.6. ID-locater split architectures
8.6.1. HIP
8.6.2. LISP
8.6.3. MILSA
8.7. Other proposals
8.7.1. User controlled routes
8.7.2. Switched Internet Architecture
8.7.3. Routing Control Platform (RCP)
9. Future Internet infrastructure design for experimentation
9.1. Background: a retrospect of PlanetLab, Emulab and others
9.2. Next generation network testbeds: virtualization and federation
9.2.1. Federation
9.2.2. Virtualization
9.3. Next generation network testbeds: implementations
9.3.1. Global Environment for Network Innovations (GENI)
9.3.2. FIRE testbeds
9.3.3. WISEBED
10. Conclusions
11. List of abbreviations


Software Architecture of PSET: A Page Segmentation Evaluation Toolkit
Song Mao and Tapas Kanungo
Language and Media Processing Laboratory
Center for Automation Research
University of Maryland, College Park, MD
1 Introduction
It is important to quantitatively monitor progress in any scientific field. The information retrieval community and the speech recognition community, for example, have yearly competitions in which researchers evaluate their latest algorithms on clearly defined tasks, datasets, and metrics. To make such evaluations possible, researchers have access to standardized datasets, metrics, and freely available software for scoring the results produced by algorithms [18, 1].

In the Document Image Analysis area, regular evaluations of OCR accuracy have been conducted by UNLV [3]. Page segmentation algorithms, which are crucial components of OCR systems, were at one time evaluated by UNLV based on the final OCR results, but not on the geometric results of the segmentation. Recently [14], we empirically compared various commercial and research page segmentation algorithms, using the University of Washington dataset. We used a well-defined (geometric) line-based metric and a sound statistical methodology to score the segmentation results. Furthermore, unlike the UNLV evaluations, we trained the segmentation algorithms prior to evaluating them.

In this paper we describe in detail the software architecture of the package called PSET, which we used in [14] to evaluate page segmentation algorithms. This package was developed by us at the University of Maryland and will be made available to researchers at no cost. Publication of the package will allow researchers to implement our five-step evaluation methodology and evaluate their own algorithms.

Software architecture can be described using methods such as Petri Nets and Data Flow Diagrams [8]. We describe the architecture of PSET, the I/O file formats, etc., using Object-Process Diagrams (OPDs) [5], which are similar in spirit to Petri Nets. The package, called the Page Segmentation Evaluation Toolkit (PSET), is modular, written using the C language, and runs on the SUN/UNIX platform. The software has been structured so that it can be used at the UNIX command line level or compiled into other software packages by calling API functions. The description in this paper will aid users in using, updating, and modifying the PSET package. It will also help users to add new algorithm modules to the package and to interface it with other software tools and packages. The PSET package includes three research page segmentation algorithms; a textline-based benchmarking algorithm; and a Simplex-based optimization algorithm for estimating algorithm parameters from training datasets.

This paper is organized as follows. In Section 2, we discuss the page segmentation problem. In Section 3, we present our five-step page segmentation performance evaluation methodology. In Section 4, we describe the architecture and file formats of our PSET package in detail and show how to implement each step of our five-step performance evaluation methodology. In Section 5, we give the hardware and software requirements for using the PSET package. In Section 6, we discuss our future work. Finally in Section 7, we give a summary of the article. A detailed description of our textline-based metric is given in an Appendix for completeness.
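
The idea of estimating an algorithm parameter from a training dataset, mentioned in the introduction, can be sketched in C as follows. This is a hedged illustration only and does not reproduce the PSET API: the callback type `PageErrorFn` and the functions `mean_error`, `train_param`, and `toy_error` are hypothetical names, the error is assumed to be a number per page and parameter value, and an exhaustive scan over an integer range is used as a plainly simpler stand-in for the Simplex-based optimizer that the package provides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: error of one segmentation run on one training page
 * for a given parameter value. Not part of the PSET API. */
typedef double (*PageErrorFn)(size_t page_index, int param);

/* Average error over all training pages for one parameter value. */
double mean_error(PageErrorFn err, size_t n_pages, int param) {
    assert(err != NULL && n_pages > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n_pages; i++)
        sum += err(i, param);
    return sum / (double)n_pages;
}

/* Returns the parameter in [lo, hi] with the smallest average training error.
 * Exhaustive scan; a stand-in for a Simplex search, which would be needed
 * for several real-valued parameters. Ties keep the smallest parameter. */
int train_param(PageErrorFn err, size_t n_pages, int lo, int hi,
                double *best_error) {
    assert(lo <= hi);
    int best = lo;
    double best_e = mean_error(err, n_pages, lo);
    for (int p = lo + 1; p <= hi; p++) {
        double e = mean_error(err, n_pages, p);
        if (e < best_e) {
            best_e = e;
            best = p;
        }
    }
    if (best_error != NULL)
        *best_error = best_e;
    return best;
}

/* Toy error surface used only to exercise the code: minimum at param == 7. */
double toy_error(size_t page_index, int param) {
    (void)page_index;              /* every page behaves the same here */
    int d = param - 7;
    return (double)(d * d) / 100.0;
}
```

In a real evaluation the callback would run a segmentation algorithm on a training page and score the result against groundtruth with the textline-based metric; the trained parameter would then be used on a separate test dataset.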
MDA 9049-6C-1250 9802167270 N660010028910 IIS9987944 September 2000
Empirical performance evaluation of page segmentation algorithms has become increasingly important due to the numerous algorithms that are being proposed each year. In order to choose between these algorithms for a specific domain it is important to empirically evaluate their performance. To accomplish this task the document image analysis community needs i) standardized document image datasets with groundtruth, ii) evaluation metrics that are agreed upon by researchers, and iii) freely available software for evaluating new algorithms and replicating other researchers' results. In an earlier paper (SPIE Document Recognition and Retrieval 2000) we published evaluation results for various popular page segmentation algorithms using the University of Washington dataset. In this paper we describe the software architecture of the PSET evaluation package, which was used to evaluate the segmentation algorithms. The description of the architecture will allow researchers to understand the software better, replicate our results, evaluate new algorithms, experiment with new metrics and datasets, etc. The software is written using the C language on the SUN/UNIX platform and is being made available to researchers at no cost.