ENSURING THE SATISFACTION OF A TEMPORAL SPECIFICATION AT RUN-TIME

Grace Tsai, Matt Insall and Bruce McMillin
CSC-93-020
August 1, 1993

ABSTRACT

This paper presents an approach to operationally evaluate a temporal specification in a distributed computing environment. First, an algorithm, Compute History, is proposed to allow every distributed processor to collect events executed by itself and by other processors, and to order the collected events by causality. This algorithm employs neither monitors nor a global clock to order the collected events of a processor. The collected events of a processor form an execution history of a distributed computation, and they represent the behaviors of all the processors during execution. Next, the semantics of the operational evaluation of a temporal assertion is presented. The evaluation of a temporal assertion is an examination of the events of a processor's collected history against the assertion. Since the histories are computed in a faulty distributed environment and the assertion denotes a program's specification, operational evaluation ensures satisfaction of a temporal specification in the presence of faults.

1 INTRODUCTION

During the design phase of software development, we often make assumptions about the behavior of a system and its environment. However, the unpredictability of the environment and occurrences of faults can cause violations of a program's specification at run-time. It is desirable that, under these conditions, the run-time behavior of a program still conforms to its specification. Thus, an approach to ensuring satisfaction of a temporal specification in a faulty operational environment is proposed. In this approach, we present an algorithm, Compute History, which allows every processor to collect events occurring within itself and within other processors, and to order the collected events by causality. This algorithm employs neither monitors nor a global clock to collect and order the collected events in a faulty distributed environment. These
collected events of a processor form an execution history of a distributed computation, and they represent a processor's view of the executions of all the processors involved in a computation. Next, we introduce the semantics of the operational evaluation of a temporal assertion (specification). The evaluation of an assertion at run-time is called operational evaluation. Since a processor's collected history represents its view of the computation during execution, it can be utilized to evaluate assertions. Thus, the operational evaluation of an assertion is an examination of the events of a processor's collected history against the assertion. If the history does not satisfy the assertion (specification), then we have detected a violation of the program's specification. Otherwise, we say that the specification is satisfied at run-time.

Other work has used embedded assertions to examine system behavior. For example, [1] embeds system constraints into programs and examines them at run-time. However, they use a centralized monitor to obtain an execution history, while we collect event histories without using monitors. [2], [3], and [4] use assertions as a tool for run-time detection of software faults during debugging and testing, while we use assertions to detect a violation of a program's specification and to ensure satisfaction of a program's specification. Our work on Changeling ([5], [6], and [7]) embeds safety assertions to detect errors in the presence of failures.

This paper focuses on building a framework for the operational evaluation of a temporal assertion (specification). In particular, we consider liveness assertions [8], which are obtained from reasoning about communications, using the temporal proof system of Interleaving Set Temporal Logic [9]. A liveness assertion (denoting a program's specification) is satisfied at run-time only if the operational evaluation of the assertion is true, that is, the collected event history does not violate the assertion. This is the concept of operational evaluation of a temporal
assertion.

The organization of this paper is as follows. Section 2 introduces the notion of an event, our model of distributed computation, and other definitions to be used in this paper. In Section 3, the motivation for an appropriate clock scheme to order events is illustrated by examples. Section 4 defines correct histories and presents the algorithm for distributed processors to obtain execution histories in a non-faulty environment. In Section 5, we define a consistency check, give examples, and provide an algorithm to build histories in a faulty environment. Section 6 presents liveness assertions and the semantics of operational evaluation. Section 7 concludes this paper.

2 BACKGROUND

This section gives a brief introduction to events and our model of distributed computation. Then, the notation and definitions to be used in this paper are presented.

A computation of a distributed program can be viewed as a sequence of event occurrences. Loosely speaking, events represent activities that happen in a system. An event occurrence defines a point in time, in this context. Therefore, properties or assertions can be expressed as relationships among event occurrences in a computation. The events occurring within a process can be totally ordered according to the process's clock. There are three types of events: internal (local) events, send events, and receive events.
In particular, send and receive events are referred to as externally observable events. The definition of an event is bound to the granularity of the actions it models. Since we are not interested in granularity, events and actions (operations) are used interchangeably throughout the paper.

A distributed program consists of n processes, P1, P2, ..., Pn, which cooperate to perform a computation. Each process resides on a unique processor. The mapping between processors and processes is one-to-one. There is no global clock, and processes must communicate to exchange information.

Basic Definitions

Notation 2.1 Event execution in a distributed program is represented by a diagram, where each horizontal line describes one process's behavior, and the horizontal direction of each line denotes time, which increases from left to right. Message exchanges are shown by directed lines. (See Figure 1.)

Definition 2.1 Event execution in a program forms an irreflexive partial order (denoted by →) on the events which occur in the program.

Definition 2.2 Event e precedes event f in an execution, i.e., e → f, iff any one of the following conditions holds [10]:
1. e and f are events of the same process, and e occurs before f,
2. e is a send event, and f is the corresponding receive event, or
3. there exists an event g, such that e → g and g → f.

Definition 2.3 Two events e and f are causally related if either e → f or f → e holds. If neither e → f nor f → e holds, then e and f are considered as concurrent or independent events.
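The happened-before relation of Definition 2.2 can be sketched directly in code. The following is a minimal sketch, assuming events are named strings, per-process total orders are given as lists, and messages map each send to its receive (the event names are illustrative, not from the paper):

```python
from itertools import combinations

def happened_before(process_orders, messages):
    """Compute the irreflexive partial order of Definition 2.2.

    process_orders: list of per-process event lists, in local execution order.
    messages: list of (send_event, receive_event) pairs.
    Returns the set of pairs (e, f) with e -> f.
    """
    order = set()
    # Condition 1: same-process order.
    for events in process_orders:
        for i, j in combinations(range(len(events)), 2):
            order.add((events[i], events[j]))
    # Condition 2: a send precedes its corresponding receive.
    order.update(messages)
    # Condition 3: transitive closure.
    changed = True
    while changed:
        changed = False
        for (e, g) in list(order):
            for (g2, f) in list(order):
                if g == g2 and (e, f) not in order:
                    order.add((e, f))
                    changed = True
    return order

orders = [["a1", "a2"], ["b1", "b2"]]
msgs = [("a1", "b2")]            # a1 is a send whose receive is b2
hb = happened_before(orders, msgs)
assert ("a1", "b2") in hb        # by condition 2
# a2 and b1 are concurrent (Definition 2.3): neither precedes the other.
assert ("a2", "b1") not in hb and ("b1", "a2") not in hb
```

Events left unrelated by the closure are exactly the concurrent events of Definition 2.3.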
Definition 2.4 A history h of a program is a pair (J, σ), where J is the initial interpretation and σ = e1 e2 ... is a sequence of events of the program in causal-relation order. Throughout the paper, we will use the letter J in a history (e.g., (J, σ)) to denote an initial state.

Definition 2.5 Let Σ be a collection of events. Given an initial state J and two sequences of events σ and σ' (σ ≠ σ'), two histories (J, σ) and (J, σ') are equivalent if there exist histories h1, h2, ..., hn with h1 = (J, σ) and hn = (J, σ'), and, for each i, 1 ≤ i < n, hi and hi+1 only differ by the order of two adjacent symbols which are independent according to the causal relation of Definition 2.2 [9].

Example 1. Let an alphabet be {a, b, c}, where events a and b are causally related (a → b), and (a and c) and (b and c) are not causally related. The following shows that the histories (J, abc) and (J, cab) are equivalent. First, we can obtain σ2 = acb by interchanging the independent events b and c in σ1 = abc. Next, a and c can be interchanged, yielding σ3 = cab. Therefore, the two histories (J, abc) and (J, cab) are equivalent. But (J, ab) and (J, ba) are not equivalent, because there exists no sequence of interchanges from ab to ba: a and b are causally related and may not be interchanged.

Definition 2.6 A trace is an equivalence class of histories, denoted by [(J, σ)], where J is an initial state and σ is some member of the equivalence class [9].

Definition 2.7 Let a set Hi be processor Pi's collection of values, including processor Pi's (local) values and those it observes (processor Pj communicates with processor Pi about its local values). The set Hi is used to denote processor Pi's history. Also, processor Pi's knowledge or view of the system execution is based on the values in its history. The values in the history Hi denote the events executed by Pi, and those which have occurred in the other processors and have been obtained by processor Pi via message exchanges. Notice that the history h denotes an execution sequence of a program, while the history Hi of processor Pi is a collection of values denoting the event occurrences that processor Pi observes at run-time.

3 Motivation for Ordering Events

A history is a collection of events observed or performed by a processor (process) during execution. Among these events, some are causally
related, while some are not causally related (independent). How should we systematically order them? First, we show by an example that, without a timestamp, it is not sufficient for a processor to order events from another processor. The following definition is needed before we look at the example.

Definition 3.1 Let s_i1, s_i2, ... (r_i1, r_i2, ...) be the first, second, ..., observable send (receive) events of processor Pi. A send (receive) event s_ij (r_ij) represents an operation of sending (receiving) a set of values to (from) another processor.

Example 2. Figure 1 shows a computation, where s21, s22, and s12 are send events, and r11, r31, and r32 are the corresponding receive events. Events s21 and s22 send the (local) values of x of processor P2 to processors P1 and P3, respectively. Event s12 shows that processor P1 relays the values of x (from processor P2) to processor P3. These values of x are not timestamped. Let events s21, s22, and s12 contain values of x as follows.

s21: x = 0, x = 1    (1)
s22: x = 0, x = 1, x = 0, x = 0, x = 1    (2)
s12: x = 0, x = 1    (3)

When the receive event r11 occurs, processor P1 records the values in s21 into its history H1 as follows.

H1:2 = 0, 1

where H1:2 denotes the view or history from processor P1 to processor P2. When event r31 occurs, the view from processor P3 to P2 is obtained by recording the values of s22 as follows.

H3:2 = 0, 1, 0, 0, 1

When event r32 occurs, processor P3 receives the information of s12 from processor P1. However, processor P3 has no way to know that the values of s12 have already been recorded in its history H3. Therefore, timestamps are necessary to order events from the same processor.

However, the following example shows that even local-clock timestamping is not sufficient for a processor (process) to order events from different processors.

Example 3. Given a history H2 of processor P2, H2 consists of timestamped events from processors P1 and P3 as follows.

e12: x = 1, t = 5
e31: x = 0, t = 7

By looking at the clock values 5 and 7 of processors P1 and P3, processor P2 has no way to decide which event occurred before the other, since there is no known common reference point for these two clocks.

In Example 3, if processor P2 has knowledge of another event e44, and e12 → e44, e44 → e31, then P2 can conclude that e12 precedes e31 by transitivity. However, the message
about e44 may arrive later than the messages about e12 and e31, and yet P2 has to wait for the relevant information to decide whether events e12 and e31 are independent or causally related.

What we would like is a clock scheme that can decide the causality of any two events by comparing their clock values, without resorting to a third event or transitivity. Moreover, the timestamping scheme should not impose any arbitrary ordering on two events which are not originally causally related. Thus, vector clocks ([11], [12]) are utilized to determine causality between any two events.

Vector Clock Scheme

Let Ci be the vector clock of process Pi, and let Ci(e) denote the clock value after the execution of event e. On sending a message, a process timestamps the message by appending its clock value to it. When process Pi executes an event, the following operations are performed on its clock.

Operation 1: for each event, Pi increments its clock on the i-th component of the vector, i.e., Ci[i] := Ci[i] + 1, where Ci[i] denotes the i-th component of vector Ci.

Operation 2: for a receive event with a vector timestamp T, Ci[k] := max(Ci[k], T[k]) for each k, where Ci[k] and T[k] denote the k-th components of vectors Ci and T, respectively. In other words, the value of each component of vector Ci is obtained by taking the maximal value from the corresponding components of vectors Ci and T.

The following example illustrates the vector clock scheme.

Example 4. In Figure 2, the initial clock values of processes P1, P2, and P3 are (0,0,0). When process P2 sends s21 to process P1, P2 changes its clock value to (0,1,0) (by Operation 1).
Then, s21 is timestamped with (0,1,0). Upon receiving this message, P1 changes its clock value to (1,1,0) (by Operations 1 and 2). Likewise, we can obtain the timestamps of s22, s31, s12, and s32, as shown in Figure 2.

The following definition describes the mechanism for deciding the causality of any two timestamped events.

Definition 3.2 Given two timestamps (vectors) Te and Tf, for events e and f, respectively, the relation e → f holds iff

Te[k] ≤ Tf[k] for all k, and Te[k] < Tf[k] for some k.

In other words, e occurs before f if and only if all the components of Te are less than or equal to the corresponding components of Tf, and there exists a component of Te which is strictly less than that of Tf.

4 Correct Histories and Algorithms

In Section 3, we have shown that a processor can utilize vector clock timestamping to order events from other processors. This section defines a criterion for correct histories, presents an algorithm, Compute History, and shows the correctness of a history constructed according to the algorithm.

4.1 Correct Histories

Recall that a history is a collection of events executed or observed by a processor at run-time, and it is a processor's view of all the processors involved in a computation. The objective is to utilize processors' histories to check for a violation of a program's specification at run-time.
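The vector clock operations and the comparison rule of Definition 3.2 can be sketched in code. The following is a minimal sketch, assuming three processes and list-valued clocks (the event names re-enact Example 4 and are illustrative):

```python
def tick(clock, i):
    """Operation 1: increment the executing process's own component i."""
    clock = clock.copy()
    clock[i] += 1
    return clock

def merge(clock, stamp):
    """Operation 2: componentwise maximum with a received timestamp."""
    return [max(c, t) for c, t in zip(clock, stamp)]

def precedes(te, tf):
    """Definition 3.2: e -> f iff te <= tf componentwise with one strict <."""
    return (all(a <= b for a, b in zip(te, tf))
            and any(a < b for a, b in zip(te, tf)))

# Example 4: P2 executes send s21 to P1.
c2 = tick([0, 0, 0], 1)                      # P2's clock becomes [0, 1, 0]
stamp_s21 = c2                               # the message carries this timestamp
c1 = merge(tick([0, 0, 0], 0), stamp_s21)    # P1 receives: [1, 1, 0]

assert c2 == [0, 1, 0]
assert c1 == [1, 1, 0]
assert precedes(stamp_s21, c1)               # s21 precedes the receive event
assert not precedes([1, 0, 0], [0, 1, 0])    # concurrent events: neither
assert not precedes([0, 1, 0], [1, 0, 0])    # timestamp precedes the other
```

Note that, unlike the scalar clocks of Example 3, incomparable vector timestamps directly witness that two events are independent, with no third event needed.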
Thus, it is very important that all the processors agree on the same event occurrences in their collected histories, which, in turn, ensures that they have the same view of the execution.

The following definitions show that a history Hi is correct with respect to the history h iff its continuation (see Definition 4.1) is a member of the equivalence class of histories of h.

Definition 4.1 A history h' is a continuation of a history Hi if (1) every event in Hi is also in h', and (2) causality in Hi implies causality in h'.

Definition 4.2 For a processor Pi, its (run-time) history Hi is correct with respect to the history h iff the following condition holds: there exists a history h', a continuation of Hi, such that h' is a member of the equivalence class of h, i.e., h' ∈ [h].

Notice that, in Definition 4.2, the history h is an expected history of the program, and h' is a history resulting from extending the history Hi. The following proposition shows that a history Hi is correct iff causality among events in h is preserved during execution.

Proposition 4.1 For a processor Pi, its history Hi is correct with respect to a history h iff, for events in Hi, causality in h implies causality in Hi, i.e., causality in h is preserved during execution.

Proof: (only-if part) If the history Hi is correct, then there exists a continuation h' of Hi, and h' ∈ [h]. In other words, h' and h only differ by the order of independent operations, which implies that h' and h have the same orderings for the causally related events, i.e., causality in h implies causality in h'. Since h' is a continuation of Hi, for the events in Hi, causality in h' implies causality in Hi. Therefore, for the events in Hi, causality in h implies causality in Hi. This implies that causality in h is preserved during execution.

(if part) From the assumption that, for events in Hi, causality in h implies causality in Hi, we know that h is a continuation of Hi. Let the history h' be h. Clearly, the history h' is a continuation of Hi, and h' and h are equivalent. Thus, the history Hi is correct with respect to the history h.

4.2 Computing Histories in a Non-Faulty Environment

In this section, we present an algorithm which allows distributed processors to obtain their histories (views) of the system execution in a
non-faulty environment [7]. These histories will be utilized to detect violations in processors' run-time behaviors.

Main idea. Every process relies on communications to find out which events have occurred in other processes, and to collect these events into its history. Thus, whenever there is a communication, processes exchange their observations (histories) of event occurrences in the system. After the exchange, processors incorporate the received histories into their own histories. Thus, the exchanges of histories allow every processor to have a view of all the other processors in the system.

We now describe the contents of messages, the relevant information for processors to compute their histories. Then, examples are given to illustrate how processors exchange their observations, followed by the algorithm Compute History.

Definition 4.3 A tuple is of the form (Pi, var = val, time) or (P1 → P2, time). An observable (send or receive) event is of the form e = (t1, t2, ...), where t1, t2, ... are tuples denoting the values of variables resulting from processor Pi's local computation, (Pi, var = val, time), or representing a timestamped send/receive operation of P1, (P1 → P2, time), where P2 is the corresponding communicating processor.

Figure 3 shows the data structure for a tuple, an event, and a history. Example 5 illustrates tuples, events, and histories.

Example 5. In Figure 4, processor P1 sends the values of x to processor P2, at times 50 and 90, by executing events s11 and s12.

s11 = ((P1, x = 1, 10), (P1 → P2, 50))
s12 = ((P1, x = 5, 70), (P1 → P2, 90))

Event s11 shows that at time 10 the value of x is 1 in processor P1, and P1 sent a message to P2 at time 50. Likewise, event s12 shows that at time 70 the value of x is 5 in processor P1, and P1 sent a message to P2 at time 90. Then, how does processor P2 incorporate the received events into its history H2? The following describes the incorporation of a received event into a history.

Definition 4.4 Let e = (t1, t2, ...) be a receive event, where t1, t2, ... are tuples. The incorporation of the event e into a history Hi = e1 e2 ... en of processor Pi is to insert e into Hi, such that the new history Hi' = e1 ... ek e ek+1 ... en satisfies the following conditions:

1. for each j ≤ k, e does not cause
any ej, and

[Figure 3: the data structures for a tuple, an event, and a history.]

2. e → ek+1, i.e., e occurs before ek+1.

Therefore, in the new sequence e1 ... ek e ek+1 ... en, e does not cause any event preceding it (i.e., e1, ..., ek), and e causes the next event (i.e., ek+1). Notice that there are many ways of incorporating events into a processor's history, since events are timestamped by vector clocks and they form a partial ordering instead of a total ordering. The important thing is that, during execution, causality in a collected history is preserved, and, at termination, the history is a member of the equivalence class of an expected history. The following describes the incorporation of one history into another.

Definition 4.5 Given two histories H1 and H2, a function F̄(H1, H2) returns a history H2', such that if an event e of H1 is not in H2, then, according to Definition 4.4, e is incorporated into H2.

Example 6 illustrates exchanges of observations (histories) and incorporations of histories.

Example 6. In Figure 5, there are three communications, c1, c2, and c3. The values of the variables x, y, and z are relevant among the processors. When a communication occurs, the processes utilize this moment to exchange their observations (histories) of event occurrences in the system. For example, before communication c1 occurs, the histories of processors P1 and P2 contain only events executed by themselves (the tuple values are as shown in Figure 5). During communication c1, processors P1 and P2 exchange their histories. Then, they apply the function F̄ to incorporate the received histories into their own histories, yielding new histories H1 and H2. At this point in time, processor P2 has knowledge of the events executed by processor P1 from communication c1. During communication c2, processors P2 and P3 exchange their histories. Hence, processor P3 also gains knowledge of the events executed by processor P1. Notice that, before communicating with processor P2, P3 has no knowledge of the events executed by
processor P1. However, because of the exchanges of histories, processor P3 learns, from processor P2, the events executed by processor P1. Thus, the history of processor P3 includes events occurring in processors P1, P2, and P3. Similarly, when communication c3 occurs, processors P1 and P3 exchange their histories and update their own histories. Thus, through exchanges of histories, processors can collect events occurring in other processors and obtain their views of the execution without any monitors.

4.3 Algorithm for a Non-Faulty Environment

Assume that there are n isolated processors, which can communicate only by two-party messages. Assume also that the communication delay is not negligible. Figure 6 presents Algorithm Compute History, which computes a history Hi of processor Pi in a non-faulty environment. In this algorithm, processors Pi and Pj exchange their respective histories Hi and Hj during a communication. Then, processor Pi computes its new history by incorporating the events in Hj into Hi (step 1). Finally, processor Pi updates its clock (step 2).

Theorem 4.1 The history Hi, built by the algorithm Compute History of Figure 6, is correct in a non-faulty environment.

Proof: By Proposition 4.1, the history Hi is correct if causality in h is preserved during execution.
Recall that Hi is the set of events collected by processor Pi at run-time. In the algorithm, upon the receipt of the history Hj, processor Pi computes its history using the function F̄, which incorporates the events of Hj into Hi according to Definition 4.4. Therefore, the events in Hi are in a linear order compatible with the causal relations, and causality in h is preserved.

5 Computing Histories in a Faulty Environment

5.1 Consistency Check

This section presents an algorithm to collect and order events in a faulty environment. In this environment, faulty processors may fool non-faulty ones by sending incorrect values. In this case, a consistency check is necessary to deal with faulty processors sending inconsistent values to different processors. The idea behind a consistency check is that if a value of a variable is sent from a non-faulty processor to a set of processors along more than one path, then, under a bounded number of faults, the non-faulty processors will receive the same value of the variable, or an inconsistency is detected. Thus, a consistency check detects inconsistencies in the values received and ensures the same observations of event occurrences, which in turn ensures that non-faulty processors agree on causality in their collected histories with respect to an expected history. First, we define the consistency check, and use Example 7 to illustrate the idea. Then, we present an algorithm to compute correct histories in a faulty environment.

Definition 5.1 Two histories H1 and H2 are inconsistent if there exist two tuples t and t', such that t is a tuple of H1, t' is a tuple of H2, and

1. if t = (Pk, var = val, time) and t' = (Pk, var = val', time), then t and t' only differ in the values of the variable (val ≠ val'), or
2. if t = (P1 ∘ P2, time) and t' = (P1 ∘ P2', time'), then t and t' either differ in the second argument (P2 ≠ P2') or in the third argument (time ≠ time'), where ∘ is → or ←.

Definition 5.2 Given two histories H1 and H2, a function Inconsistent(H1, H2) is defined as follows: Inconsistent(H1, H2) is true if H1 and H2 are inconsistent, and false otherwise.

Definition 5.3 A consistency check by a non-faulty processor Pi is the process of finding an inconsistency Inconsistent(Hi, Hj), where Hi and Hj are the histories of processors Pi and Pj.
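The consistency check of Definitions 5.1–5.3 can be sketched as a tuple-by-tuple comparison of two histories. The following is a minimal sketch under an assumed encoding (not the paper's notation): a local-value tuple is ("val", (proc, var, time), value) and a send/receive tuple is ("msg", (sender, seq), (receiver, time)):

```python
def inconsistent(h1, h2):
    """Definitions 5.1/5.2 sketch: two histories conflict if they record
    different values for the same local-computation tuple, or different
    targets/times for the same send/receive operation."""
    vals1 = {key: v for (kind, key, v) in h1 if kind == "val"}
    vals2 = {key: v for (kind, key, v) in h2 if kind == "val"}
    for key in vals1.keys() & vals2.keys():
        if vals1[key] != vals2[key]:   # same (proc, var, time), different value
            return True
    msgs1 = {key: v for (kind, key, v) in h1 if kind == "msg"}
    msgs2 = {key: v for (kind, key, v) in h2 if kind == "msg"}
    for key in msgs1.keys() & msgs2.keys():
        if msgs1[key] != msgs2[key]:   # same send event, different target/time
            return True
    return False

# Re-enacting the gist of Example 7: P3 reported x = 2, but a faulty
# relay claims P3 reported x = 0 at the same time.
h_from_p3 = [("val", ("P3", "x", 7), 2)]
h_from_p2 = [("val", ("P3", "x", 7), 0)]   # fabricated by the faulty P2
assert inconsistent(h_from_p3, h_from_p2)
assert not inconsistent(h_from_p3, h_from_p3)
```

A non-faulty processor that receives the same event along two paths can thus either confirm agreement or detect the fabrication.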
Example 7 illustrates a consistency check.

Example 7. In Figure 7, processor P3 sends x = 2 to processors P1 and P2 when communications c1 and c2 occur. Since processor P2 is faulty, it fabricates the value of x and sends x = 0 to processor P1. We now apply a consistency check to see how processor P1 can detect the inconsistency in the value of x. In Figure 8, when communication c1 occurs, processors P1 and P3 exchange their observations (histories), and each incorporates the other's history into its own; P1's history now records P3's value x = 2. Next, communication c2 occurs, and processors P2 and P3 exchange their histories; after the exchange and the incorporation of the received histories, processor P2's history also records P3's value x = 2 (the tuple values are as shown in Figure 8). Finally, communication c3 occurs between processors P1 and P2. Since processor P2 is faulty, the history it sends to processor P1 contains the incorrect value x = 0 for P3's event. Processor P1's own history already records x = 2 for the same event, so the consistency check detects the inconsistency.

5.2 Algorithm for a Faulty Environment

Since a processor may receive incorrect values from other processors in a fault-prone system, causality on the events in a processor's history can differ from that of an expected history h. However, the algorithm Compute History for a faulty environment must ensure that non-faulty processors preserve causality and thus obtain correct histories. The following observation describes how to obtain correct histories in a faulty environment.
Observation: Recall that, for a processor Pi, its history Hi is correct with respect to a history h iff causality in h is preserved during execution. However, in a faulty environment it is not known which processors are faulty, and the values relayed by faulty processors can be arbitrary. To preserve causality in a faulty environment, non-faulty processors must apply consistency checks to catch any inconsistency between the events in a received history and the events in their own histories. This ensures that non-faulty processors agree on event occurrences and hence on causality in their histories.

Figure 9 shows the algorithm Compute History for a faulty environment, which is described below.

1. If an inconsistency is detected, then STOP.
2. If there is no inconsistency, then processor Pi incorporates the history Hj into its own history, such that the events of Hj are in order according to Definition 4.4.
3. Processor Pi updates its clock.

Theorem 5.1 Given that t out of n processors are faulty and the processor interconnection network forms a graph with vertex connectivity t + 1, if a history Hj is to be received by a processor from vertex-disjoint paths and consistency checks are applied, then the received values in Hj are correct, or else an inconsistency is detected.

Proof: First, if the history Hj travels along paths containing no faulty processors, then the non-faulty processors will receive the same history. Second, it may travel along paths containing faulty processors, which may relay an incorrect history. Since the graph has vertex connectivity t + 1, the processor will receive the history from at least one path containing no faulty processors, and from paths possibly containing faulty processors. An inconsistency is detected when faulty processors corrupt values in Hj.

Theorem 5.2 The history Hi, built according to the algorithm Compute History of Figure 9, is correct, or an inconsistency is detected.

Proof: In Theorem 4.1, we have shown that the algorithm is correct in a non-faulty environment. What is left to consider is the case when there are faulty processors. The algorithm stops when an inconsistency
is detected, as shown by the above theorem. Otherwise, the histories Hi and Hj are consistent, which implies that processors Pi and Pj agree on event occurrences and hence agree on causality in their histories. Therefore, causality in the history Hi is preserved, and Hi is correct.

[Figure 10: a computation sequence satisfying φ ⇒ EF ψ.]

6 Run-Time Satisfaction

In Sections 4 and 5, we have shown that the algorithm Compute History builds a processor's history (view) of a computation. This section presents the operational evaluation of assertions, which is an evaluation of assertions at run-time. Since, during execution, a processor's history is obtained by collecting events occurring within itself and within other processors, this history represents the run-time behaviors of all the processors and can be utilized to evaluate assertions (the expected behaviors) of the processors. Thus, the operational evaluation of an assertion is an examination of the events of a processor's collected history against the assertion.

The assertion of interest is a liveness assertion, φ ⇒ EF ψ, which ensures what values program variables must possess at a state satisfying assertion ψ, starting from a state satisfying assertion φ. Therefore, it is a progress assertion. Liveness assertions show what a program is supposed to do from one communication point to another. Thus, they can be applied to check, at run-time, whether progress has been made from one communication point to another. Notice that the evaluation of liveness assertions can be performed during execution, since these assertions denote progress from one communication point to another. However, if assertions involve a termination property, then they may not be evaluated before termination, since during execution a processor's history only contains the events executed and observed so far, rather than the events of the whole execution.

6.1 Ensuring One Processor's Behavior

This subsection considers the evaluation of liveness assertions which involve one processor's behavior. A liveness assertion φ ⇒ EF ψ ensures that an execution will progress from a state satisfying
assertion φ to a state satisfying assertion ψ. In other words, the assertion φ ⇒ EF ψ is true if, for every state satisfying φ, there is eventually a state satisfying ψ. For example, the computation sequence of Figure 10 satisfies the assertion φ ⇒ EF ψ.

The assertion φ ⇒ EF ψ ensures a program's behavior from one state to another. However, a processor's observation of the execution is a collection of events in its history. Thus, to operationally evaluate an assertion, we need to relate states to events of a history. A computation is regarded as event driven, i.e., a processor receives a message, processes it, sends messages (possibly zero) to other processors, and waits for the next message. Thus, a
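The operational evaluation of φ ⇒ EF ψ over a collected history can be sketched as a scan of the event sequence: every state satisfying φ must be followed, at that point or later, by a state satisfying ψ. The following is a minimal sketch, assuming states are dictionaries of variable values and φ, ψ are predicates over a state (all names are illustrative assumptions, not the paper's notation):

```python
def evaluate_liveness(history, phi, psi):
    """Operationally evaluate phi => EF psi over a finite collected history.

    history: list of states (dicts), in the causal order built by
    Compute History. Returns True if every state satisfying phi is
    followed (at or after that state) by a state satisfying psi.
    Over a finite prefix this detects violations; a pending phi whose
    psi has not yet occurred may still be satisfied later.
    """
    for i, state in enumerate(history):
        if phi(state) and not any(psi(s) for s in history[i:]):
            return False
    return True

# Illustrative run: after x becomes 1, x must eventually reach 3.
hist = [{"x": 0}, {"x": 1}, {"x": 2}, {"x": 3}]
assert evaluate_liveness(hist, lambda s: s["x"] == 1, lambda s: s["x"] == 3)
assert not evaluate_liveness(hist, lambda s: s["x"] == 3, lambda s: s["x"] == 5)
```

This matches the caveat in Section 6 above: progress assertions between communication points can be checked during execution, while termination-dependent assertions cannot be settled on a partial history.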
Constraint Satisfaction with PreferencesH. Rudová, Ph.D. Thesis, Masaryk University, Brno, 2001. Possibilistic constraint satisfaction problems or "How to handle soft constraints?"T. Schiex, in Proceedings of the Eighth International Conference on Uncertainty in Artificial Intelligence, pp. 268-275, Stanford, 1992.Valued Constraint Satisfaction Problems: Hard and Easy Problems T. Schiex, H. Fargier, G. Verfaillie, in Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, pp. 631-639, San Mateo, 1995.。
A principled approach toward symbolic geometric constraint satisfaction

Journal of Artificial Intelligence Research 4 (1996) 419-443. Submitted 2/96; published 6/96.

A Principled Approach Towards Symbolic Geometric Constraint Satisfaction

Sanjay Bhansali, School of EECS, Washington State University, Pullman, WA 99164-2752
Glenn A. Kramer, Enterprise Integration Technologies, 800 El Camino Real, Menlo Park, CA 94025
Tim J. Hoar, Microsoft Corporation, One Microsoft Way, 2/2069, Redmond, WA 98052

Abstract

An important problem in geometric reasoning is to find the configuration of a collection of geometric bodies so as to satisfy a set of given constraints. Recently, it has been suggested that this problem can be solved efficiently by symbolically reasoning about geometry. This approach, called degrees of freedom analysis, employs a set of specialized routines called plan fragments that specify how to change the configuration of a set of bodies to satisfy a new constraint while preserving existing constraints. A potential drawback, which limits the scalability of this approach, is the difficulty of writing plan fragments. In this paper we address this limitation by showing how these plan fragments can be automatically synthesized using first principles about geometric bodies, actions, and topology.

1. Introduction

An important problem in geometric reasoning is the following: given a collection of geometric bodies, called geoms, and a set of constraints between them, find a configuration - i.e., position, orientation, and dimension - of the geoms that satisfies all the constraints. Solving this problem is an integral task for many applications, such as constraint-based sketching and design, geometric modeling for computer-aided design, kinematics analysis of robots and other mechanisms (Hartenberg & Denavit, 1964), and describing mechanical assemblies.

General-purpose constraint satisfaction techniques are not well suited for solving constraint problems involving complicated geometry.
Such techniques represent geoms and constraints as algebraic equations, whose real solutions yield the numerical values describing the desired configuration of the geoms. Such equation sets are highly non-linear and highly coupled, and in the general case require iterative numerical solution techniques. Iterative numerical techniques are not particularly efficient and can have problems with stability and robustness (Press, Flannery, Teukolsky & Vetterling, 1986). For many tasks (e.g., simulation and optimization of mechanical devices) the same equations are solved repeatedly, which makes a compiled solution desirable. In theory, symbolic manipulation of equations can often yield a non-iterative, closed-form solution. Once found, such a closed-form solution can be executed very efficiently. However, the computational intractability of symbolically solving the equations renders this approach impractical (Kramer, 1992; Liu & Popplestone, 1990).

In earlier work, Kramer (1992, 1993) describes a system called GCE that uses an alternative approach called degrees of freedom analysis. This approach is based on symbolic reasoning about geometry, rather than equations, and was shown to be more efficient than systems based on algebraic equation solvers. The approach uses two models. A symbolic geometric model is used to reason symbolically about how to assemble the geoms so as to satisfy the constraints incrementally. The "assembly plan" thus developed is used to guide the solution of the complex nonlinear equations - derived from the second, numerical model - in a highly decoupled, stylized manner.

The GCE system was used to analyze problems in the domain of kinematics and was shown to perform kinematics simulation of complex mechanisms (including a Stirling engine, an elevator door mechanism, and a sofa-bed mechanism) much more efficiently than pure numerical solvers (Kramer, 1992).
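The contrast between iterating on one large coupled equation system and executing a decoupled, stylized sequence of small solves can be illustrated with a toy step (this is an illustrative sketch, not GCE code): once an assembly plan has isolated a single placement decision, it often reduces to a closed-form computation such as a circle-circle intersection.

```python
import math

def circle_circle_intersection(a, r1, b, r2):
    """Closed-form intersection of two circles (centers a, b; radii r1, r2).

    Returns 0, 1, or 2 intersection points. This is the kind of small,
    decoupled solve an assembly plan reduces each step to, in contrast to
    iterating numerically on the full coupled equation system.
    """
    (ax, ay), (bx, by) = a, b
    d = math.hypot(bx - ax, by - ay)
    if d == 0 or d > r1 + r2 or d < abs(r1 - r2):
        return []                          # degenerate: no isolated solutions
    t = (r1**2 - r2**2 + d**2) / (2 * d)   # distance from a along the a->b axis
    h2 = r1**2 - t**2
    ux, uy = (bx - ax) / d, (by - ay) / d  # unit vector from a to b
    px, py = ax + t * ux, ay + t * uy      # foot of the chord
    if h2 <= 1e-12:
        return [(px, py)]                  # tangent circles: single solution
    h = math.sqrt(h2)
    return [(px - h * uy, py + h * ux), (px + h * uy, py - h * ux)]

# Place a point 5 units from (0,0) and 5 units from (8,0):
print(circle_circle_intersection((0, 0), 5.0, (8, 0), 5.0))  # [(4.0, 3.0), (4.0, -3.0)]
```

Each such step solves for only a few configuration variables at a time, which is what makes the compiled plan efficient and robust compared with a general iterative solver.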
GCE has subsequently been integrated into a commercial system called Bravo™ by Applicon, where it is used to drive the 2D sketcher (Brown-Associates, 1993). Several academic systems currently use degrees of freedom analysis for other applications, such as assembly modeling (Anantha, Kramer & Crawford, 1992), editing and animating planar linkages (Brunkhart, 1994), and feature-based design (Salomons, 1994; Shah & Rogers, 1993).

GCE employs a set of specialized routines called plan fragments to create the assembly plan. A plan fragment specifies how to change the configuration of a geom using a fixed set of operators and the available degrees of freedom, so that a new constraint is satisfied while preserving all prior constraints on the geom. The assembly plan is completed when all constraints have been satisfied or the degrees of freedom are reduced to zero. This approach is canonical: the constraints may be satisfied in any order, and the final status of the geom, in terms of remaining degrees of freedom, is the same (Kramer, 1992, pp. 80-81). The algorithm for finding the assembly procedure has a time complexity of O(cg), where c is the number of constraints and g is the number of geoms (Kramer, 1992, p. 139).

Since the crux of problem solving is handled by the plan fragments, the success of the approach depends on one's ability to construct a complete set of plan fragments meeting the canonical specification. The number of plan fragments needed grows geometrically as the number of geoms and the constraints between them increase. Worse, the complexity of the plan fragments increases exponentially, since the various constraints interact in subtle ways, creating a large number of special cases that must be handled individually. This is potentially a serious limitation in extending the degrees of freedom approach.
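The assembly loop itself is simple; all the domain difficulty lives in the plan-fragment table. A minimal sketch of the loop (the table entries and names here are invented stand-ins, not GCE's actual fragments):

```python
# Hypothetical sketch of the assembly loop: plan fragments are indexed by the
# geom's current state ("signature") plus the constraint type, and applying a
# fragment yields the new signature. One lookup per (constraint, geom) pair
# gives the O(cg) behavior described above.
PLAN_FRAGMENTS = {
    # (current signature, constraint type) -> (actions, resulting signature)
    ("free", "fixed-distance-line"): (["translate-to-displaced-line"], "center-on-locus"),
    ("center-on-locus", "fixed-distance-line"): (["translate-along-locus"], "fixed"),
}

def assemble(signature, constraints):
    """Apply one plan fragment per constraint, threading the geom's signature."""
    plan = []
    for c in constraints:
        actions, signature = PLAN_FRAGMENTS[(signature, c)]
        plan.extend(actions)
    return plan, signature

plan, final = assemble("free", ["fixed-distance-line", "fixed-distance-line"])
print(plan, final)
```

The canonical property quoted above is what makes this loop sound: whatever order the constraints arrive in, the geom ends in the same final signature.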
In this paper we address this problem by showing how plan fragments can be automatically generated using first principles about geoms, actions, and topology.

Our approach is based on planning. Plan fragment generation can be reduced to a planning problem by considering the various geoms and the invariants on them as describing a state. Operators are actions, such as rotate, that can change the configuration of geoms, thereby violating or achieving some constraint. An initial state is specified by the set of existing invariants on a geom, and a final state by the additional constraints to be satisfied. A plan is a sequence of actions that, when applied to the initial state, achieves the final state.

With this formulation, one could presumably use a classical planner, such as STRIPS (Fikes & Nilsson, 1971), to automatically generate a plan fragment. However, the operators in this domain are parametric operators with a real-valued domain; thus, the search space consists of an infinite number of states. Even if the real-valued domain is discretized into real-valued intervals, there is still a very large search space, and finding a plan that satisfies the specified constraints would be an intractable problem. Our approach uses loci information (representing the set of points that satisfy some constraint) to reason about the effects of various operators, and thus reduces the search problem to a problem in topology, involving reasoning about the intersection of various loci.

An issue to be faced in using a conventional planner is the frame problem: how to determine which properties or relationships do not change as a result of an action. A typical solution is to assume that an action does not modify any property or relationship unless that modification is explicitly stated as an effect of the action.
Such an approach works well if one knows a priori all possible constraints or invariants that might be of interest and relatively few constraints are affected by each action - which is not true in our case. We use a novel scheme for representing the effects of actions, based on reifying (i.e., treating as first-class objects) actions in addition to geometric entities and invariant types. With each pair of geom and invariant, we associate a set of actions that can be used to achieve or preserve that invariant for that geom. Whenever a new geom or invariant type is introduced, the corresponding rules for the actions that can achieve or preserve the invariants have to be added. Since there are many more invariant types than actions in this domain, this scheme results in simpler rules. Borgida, Mylopoulos & Reiter (1993) propose a similar approach for reasoning with program specifications. A unique feature of our work is the use of geometry-specific matching rules to determine when two or more general actions that achieve/preserve different constraints can be reformulated into a less general action.

Another shortcoming of using a conventional planner is the difficulty of representing conditional effects of operators. In GCE, an operation's effect depends on the type of geom as well as the particular geometry. For example, the action of translating a body to the intersection of two lines on a plane would normally reduce the body's translational degrees of freedom to zero; however, if the two lines happen to coincide, then the body still retains one degree of translational freedom, and if the two lines are parallel but do not coincide, then the action fails. Such situations are called degeneracies. One approach to handling degeneracies is to use a reactive planner that dynamically revises its plan at run-time. However, this could result in unacceptable performance in many real-time applications. Our approach makes it possible to pre-compile all potential degeneracies into the plan.
We achieve this by dividing the planning algorithm into two phases. In the first phase, a skeletal plan is generated that works in the normal case; in the second phase, this skeletal plan is refined to take care of singularities and degeneracies. The approach is similar to the idea of refining skeletal plans in MOLGEN (Friedland, 1979) and the idea of critics in HACKER (Sussman, 1975) to fix known bugs in a plan. However, the skeletal plan refinement in MOLGEN essentially consisted of instantiating a partial plan to work for specific conditions, whereas in our method a complete plan that works for the normal case is extended to handle special conditions like degeneracies and singularities.

1.1 A Plan Fragment Example

We will use a simple example of a plan fragment specification to illustrate our approach. Domains such as mechanical CAD and computer-based sketching rely heavily on complex combinations of relatively simple geometric elements, such as points, lines, and circles, and a small collection of constraints, such as coincidence, tangency, and parallelism. Figure 1 illustrates some fairly complex mechanisms (all implemented in GCE) built using simple geoms and constraints.

[Figure 1 (panels include a Stirling engine). Modeling complex mechanisms using simple geoms and constraints. All the constraints needed to model the joints in the above mechanisms are solvable using the degrees of freedom approach.]

Our example problem is illustrated in Figure 2 and is specified as follows:

Geom-type: circle
Name: $c
Invariants: (fixed-distance-line $c $L1 $dist1 BIAS_COUNTERCLOCKWISE)
To-be-achieved: (fixed-distance-line $c $L2 $dist2 BIAS_CLOCKWISE)

In this example, a variable-radius circle $c has a prior constraint specifying that the circle is at a fixed distance $dist1 to the left of a fixed line $L1 (or, alternatively, that a line drawn parallel to $L1 at a distance $dist1 from the center of $c is tangent in a counterclockwise direction to the circle).
The new constraint to be satisfied is that the circle be at a fixed distance $dist2 to the right of a second fixed line $L2.

(Footnote 1: We use the following conventions: symbols preceded by $ represent constants; symbols preceded by ? represent variables; expressions of the form (>> parent subpart) denote the subpart of a compound term, parent.)

To solve this problem, three different plans can be used: (a) translate the circle from its current position to a position such that it touches the two lines $L2' and $L1' shown in the figure; (b) scale the circle, while keeping its point of contact with $L1' fixed, so that it touches $L2'; (c) scale and translate the circle so that it touches both $L2' and $L1'.

Each of the above action sequences constitutes one plan fragment that can be used in the above situation and would be available to GCE from a plan-fragment library. Note that some of the plan fragments would not be applicable in certain situations. For example, if $L1 and $L2 are parallel, then a single translation can never achieve both constraints, and plan fragment (a) would not be applicable. In this paper we will show how each of the plan fragments can be automatically synthesized by reasoning from more fundamental principles.

The rest of the paper is organized as follows: Section 2 gives an architectural overview of the system built to synthesize plan fragments automatically, with a detailed description of the various components. Section 3 illustrates the plan fragment synthesis process using the example of Figure 2. Section 4 describes the results from the current implementation of the system. Section 5 relates our approach to other work in geometric constraint satisfaction. Section 6 summarizes the main results and suggests future extensions of this work.

2. Overview of System Architecture

Figure 3 gives an overview of the architecture of our system, showing the various knowledge components and the plan generation process.
The knowledge represented in the system is broadly categorized into a Geom knowledge-base, which contains knowledge specific to particular geometric entities, and a Geometry knowledge-base, which is independent of particular geoms and can be reused for generating plan fragments for any geom.

[Figure 3. Architectural overview of the plan fragment generator.]

2.1 Geom Knowledge-base

The geom-specific knowledge-base can be further decomposed into seven knowledge components.

2.1.1 ACTIONS

These describe operations that can be performed on geoms. In the GCE domain, three actions suffice to change the configuration of a body to an arbitrary configuration: (translate g v), which denotes a translation of geom g by vector v; (rotate g pt ax amt), which denotes a rotation of geom g around point pt, about an axis ax, by an angle amt; and (scale g pt amt), where g is a geom, pt is a point on the geom, and amt is a scalar. The semantics of a scale operation depends on the type of the geom; for example, for a circle a scale indicates a change in the radius of the circle, and for a line-segment it denotes a change in the line-segment's length. Pt is the point on the geom that is held fixed (e.g., the center of a circle).

2.1.2 INVARIANTS

These describe constraints to be solved for the geoms. The initial version of our system has been designed to generate plan fragments for a variable-radius circle and a variable-length line-segment on a fixed workplane, with constraints on the distances between these geoms and points, lines, and other geoms on the same workplane. There are seven invariant types to represent these constraints.
Examples of two such invariants are:

• (Invariant-point g pt glb-coords), which specifies that the point pt of geom g is coincident with the global coordinates glb-coords, and
• (Fixed-distance-point g pt dist bias), which specifies that the geom g lies at a fixed distance dist from point pt; bias can be either BIAS_INSIDE or BIAS_OUTSIDE, depending on whether g lies inside or outside a circle of radius dist around point pt.

2.1.3 LOCI

These represent sets of possible values for a geom parameter, such as the position of a point on a geom. The various kinds of loci can be grouped into either a 1d-locus (representable by a set of parametric equations in one parameter) or a 2d-locus (representable by a set of parametric equations in two variables). For example, a line is a 1d locus specified as (make-line-locus through-pt direc), representing an infinite line passing through through-pt in direction direc. Other loci represented in the system include rays, circles, parabolas, hyperbolas, and ellipses.

2.1.4 MEASUREMENTS

These are used to represent the computation of some function, object, or relationship between objects. These terms are mapped onto a set of service routines which are called by the plan fragments. An example of a measurement term is (0d-intersection 1d-locus1 1d-locus2), which represents the intersection of two 1d-loci. In the normal case, the intersection of two 1-dimensional loci is a point. However, there may be singular cases: for example, when the two loci happen to coincide, their intersection returns one of the loci instead of a point. There may also be degenerate cases: for example, when the two loci do not intersect, the intersection is undefined.
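The three-way case analysis for a 0d-intersection of two line loci - a point in the normal case, a 1d locus in the singular case, undefined in the degenerate case - can be sketched as follows (an illustrative re-implementation, not the system's actual service routine):

```python
def line_line_0d_intersection(p1, d1, p2, d2, eps=1e-9):
    """Intersect two infinite lines, each given as (point, direction) in 2D.

    Normal case:     returns ('point', (x, y)).
    Singular case:   coincident lines -> ('locus', (p1, d1)), a 1d locus.
    Degenerate case: parallel, distinct lines -> ('undefined', None).
    """
    cross = d1[0] * d2[1] - d1[1] * d2[0]   # zero iff directions are parallel
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    if abs(cross) < eps:
        if abs(dx * d1[1] - dy * d1[0]) < eps:
            return ('locus', (p1, d1))      # same line: intersection is a locus
        return ('undefined', None)          # parallel but do not meet
    t = (dx * d2[1] - dy * d2[0]) / cross
    return ('point', (p1[0] + t * d1[0], p1[1] + t * d1[1]))

print(line_line_0d_intersection((0, 0), (1, 0), (2, -1), (0, 1)))   # normal case
print(line_line_0d_intersection((0, 0), (1, 0), (0, 1), (1, 0))[0]) # degenerate case
```

Returning a tagged result rather than raising an error mirrors the idea that exceptional outcomes are first-class data the planner can branch on.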
These exceptional conditions are also represented with each measurement type and are used during the second phase of the plan generation process to elaborate a skeletal plan (see Section 3.3).

2.1.5 GEOMS

These are the objects of interest in solving geometric constraint satisfaction problems. Examples of geoms are lines, line-segments, circles, and rigid bodies. Geoms have degrees of freedom, which allow them to vary in location and size. For example, in 3D space a circle with a variable radius has three translational, two rotational, and one dimensional degree of freedom.

The configuration variables of a geom are defined as the minimal number of real-valued parameters required to specify the geometric entity in space unambiguously. Thus, a circle has six configuration variables (three for the center, one for the radius, and two for the plane normal). In addition, the representation of each geom includes the following:

• name: a unique symbol to identify the geom;
• action-rules: a set of rules that describe how invariants on the geom can be preserved or achieved by actions (see below);
• invariants: the set of current invariants on the geom;
• invariants-to-be-achieved: the set of invariants that need to be achieved for the geom.

2.1.6 ACTION RULES

An action rule describes the effect of an action on an invariant. There are two facts of interest to a planner when constructing a plan: (1) how to achieve an invariant using an action, and (2) how to choose actions that preserve as many of the existing invariants as possible. In general, there are several ways to achieve an invariant and several actions that will preserve an invariant. The intersection of these two sets of actions is the set of feasible solutions.
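These two facts can be phrased as set operations over action templates. A schematic sketch (the invariant names and action templates here are invented stand-ins for illustration, not real GCE rules):

```python
# Feasible actions for a new invariant = actions that achieve it, intersected
# with the actions that preserve every existing invariant on the geom.
def feasible_actions(existing_invariants, new_invariant, preserves, achieves):
    candidates = set(achieves[new_invariant])
    for inv in existing_invariants:
        candidates &= set(preserves[inv])   # must not clobber any prior invariant
    return candidates

# Invented example: a circle whose center is already pinned to a 1d locus.
preserves = {
    "center-on-locus": {"scale-about-center", "translate-along-locus"},
}
achieves = {
    "tangent-to-line": {"translate-along-locus", "translate-toward-line",
                        "scale-about-center"},
}
print(feasible_actions(["center-on-locus"], "tangent-to-line", preserves, achieves))
```

In the real system the "intersection" is not plain set membership but semantic unification of parametric action terms, which is exactly what the geometry-specific matching rules described below provide.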
In our system, the effect of actions is represented as part of geom-specific knowledge in the form of action rules, whereas knowledge about how to compute intersections of two or more sets of actions is represented as geometry-specific knowledge (since it does not depend on the particular geom being acted on).

An action rule is a three-tuple (pattern, to-preserve, to-[re]achieve). Pattern is the invariant term of interest; to-preserve is a list of actions that can be taken without violating the pattern invariant; and to-[re]achieve is a list of actions that can be taken to achieve the invariant, or to re-achieve an existing invariant "clobbered" by an earlier action. These actions are stated in the most general form possible. The matching rules in the Geometry knowledge-base are then used to obtain the most general unifier of two or more actions. An example of an action rule (AR-1), associated with variable-radius circle geoms, is:

pattern:        (1d-constrained-point ?circle (>> ?circle CENTER) ?1dlocus)
to-preserve:    (scale ?circle (>> ?circle CENTER) ?any)
                (translate ?circle (v- (>> ?1dlocus ARBITRARY-POINT) (>> ?circle CENTER)))
to-[re]achieve: (translate ?circle (v- (>> ?1dlocus ARBITRARY-POINT) (>> ?circle CENTER)))

This action rule is used to preserve or achieve the constraint that the center of a circle geom lie on a 1d locus. There are two actions that may be performed without violating this constraint: (1) scale the circle about its center - this changes the radius of the circle, but the position of the center remains the same, and hence the 1d-constrained-point invariant is preserved; (2) translate the circle by a vector that goes from its current center to an arbitrary point on the 1-dimensional locus ((v- a b) denotes a vector from point b to point a).
To achieve this invariant, only one action may be performed: translate the circle so that its center moves from its current position to an arbitrary position on the 1-dimensional locus.

2.1.7 SIGNATURES

For completeness, it is necessary that there exist a plan fragment for each possible combination of constraints on a geom. However, in many cases two or more sets of constraints describe the same situation for a geom (in terms of its degrees of freedom). For example, the constraints that ground the two end-points of a line-segment and the constraints that ground the direction, length, and one end-point of a line-segment both reduce the degrees of freedom of the line-segment to zero, and hence describe the same situation. In order to minimize the number of plan fragments that need to be written, it is desirable to group sets of constraints that describe the same situation into equivalence classes and to represent each equivalence class by a canonical form.

The state of a geom, in terms of the prior constraints on it, is summarized as a signature. A signature scheme for a geom is the set of canonical signatures for which plan fragments need to be written. In Kramer's earlier work (1993), the signature scheme had to be determined manually, by examining each signature obtained by combining constraint types and designating one from each set of equivalent signatures to be canonical. Our approach allows us to construct the signature scheme for a geom automatically by using reformulation rules (described shortly). A reformulation rule rewrites one or more constraints into a simpler form. The signature scheme is obtained by first generating all possible combinations of constraint types, yielding the set of all possible signatures. These signatures are then reduced using the reformulation rules until each signature is in its simplest form.
The set of (unique) signatures that remain constitutes the signature scheme for the geom.

As an example, consider the set of constraint types on a variable-radius circle. The signature for this geom is represented as a tuple <Center, Normal, Radius, FixedPts, FixedLines> where:

• Center denotes the invariants on the center point and can be Free (i.e., no constraint on the center point), L2 (center point constrained to lie on a 2-dimensional locus), L1 (center point constrained to lie on a 1-dimensional locus), or Fixed.
• Normal denotes the invariant on the normal to the plane of the circle and can be Free, L1, or Fixed (in 2D it is always fixed).
• Radius denotes the invariant on the radius and can be Free or Fixed.
• FixedPts denotes the number of Fixed-distance-point invariants and can be 0, 1, or 2.
• FixedLines denotes the number of Fixed-distance-line invariants and can be 0, 1, or 2.

L2 and L1 denote a 2D and a 1D locus, respectively. If we assume a 2D geometry, the L2 invariant on the Center is redundant, and the Normal is always Fixed. There are then 3 x 1 x 2 x 3 x 3 = 54 possible signatures for the geom. However, several of these describe the same situation. For example, the signature

<Center-Free, Radius-Free, FixedPts-0, FixedLines-2>

which describes a circle constrained to be at specific distances from two fixed lines, can be rewritten to

<Center-L1, Radius-Free, FixedPts-0, FixedLines-0>

which describes a circle constrained to be on a 1-dimensional locus (in this case, the angular bisector of two lines).
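Representing signatures as tuples makes both the enumeration and the rewriting mechanical. A sketch of the enumeration and of the single rewrite quoted above (only this one rule is taken from the text; the system's full rule set is not reproduced here):

```python
from itertools import product

# All 2D signatures for a variable-radius circle: 3 x 2 x 3 x 3 = 54
# (Normal is always Fixed in 2D, so it contributes a factor of 1 and is omitted).
CENTERS = ["Free", "L1", "Fixed"]
RADII = ["Free", "Fixed"]
COUNTS = [0, 1, 2]
all_signatures = list(product(CENTERS, RADII, COUNTS, COUNTS))
print(len(all_signatures))  # 54

def rewrite_two_fixed_lines(sig):
    """The rewrite from the text: two fixed-distance-line invariants on a
    free-center circle are equivalent to the center lying on a 1d locus
    (the angular bisector of the two displaced lines)."""
    center, radius, fixed_pts, fixed_lines = sig
    if center == "Free" and fixed_lines == 2:
        return ("L1", radius, fixed_pts, 0)
    return sig

print(rewrite_two_fixed_lines(("Free", "Free", 0, 2)))  # ('L1', 'Free', 0, 0)
```

Iterating rules like this one to a fixed point over all 54 tuples, and collecting the distinct results, is what yields the canonical scheme derived next.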
Using reformulation rules, we can derive the signature scheme for variable-radius circles, consisting of only the 10 canonical signatures given below:

<Center-Free, Radius-Free, FixedPts-0, FixedLines-0>
<Center-Free, Radius-Free, FixedPts-0, FixedLines-1>
<Center-Free, Radius-Free, FixedPts-1, FixedLines-0>
<Center-Free, Radius-Fixed, FixedPts-0, FixedLines-0>
<Center-L1, Radius-Free, FixedPts-0, FixedLines-0>
<Center-L1, Radius-Free, FixedPts-0, FixedLines-1>
<Center-L1, Radius-Free, FixedPts-1, FixedLines-0>
<Center-L1, Radius-Fixed, FixedPts-0, FixedLines-0>
<Center-Fixed, Radius-Free, FixedPts-0, FixedLines-0>
<Center-Fixed, Radius-Fixed, FixedPts-0, FixedLines-0>

Similarly, the number of signatures for line-segments can be reduced from 108 to 19 using reformulation rules.

2.2 Geometry-Specific Knowledge

The geometry-specific knowledge is organized as three different kinds of rules.

2.2.1 MATCHING RULES

These are used to match terms using geometric properties. The planner employs a unification algorithm to match actions and determine whether two actions have a common unifier. However, the standard unification algorithm is not sufficient for our purposes, since it is purely syntactic and does not use knowledge about geometry. To illustrate this, consider the following two actions:

(rotate $g $pt1 ?vec1 ?amt1), and
(rotate $g $pt2 ?vec2 ?amt2).

The first term denotes a rotation of a fixed geom $g around a fixed point $pt1, about an arbitrary axis, by an arbitrary amount. The second term denotes a rotation of the same geom around a different fixed point $pt2, with the rotation axis and amount unspecified as before. Standard unification fails when applied to the above terms, because no binding of variables makes the two terms syntactically equal.² However, using knowledge about geometry, we can match the two terms to yield the following term:

(rotate $g $pt1 (v- $pt2 $pt1) ?amt1)

which denotes a rotation of the geom around the axis passing through points $pt1 and $pt2.
The point around which the body is rotated can be any point on the axis (here it is arbitrarily chosen to be one of the fixed points, $pt1), and the amount of rotation can be anything.

The planner applies the matching rules to match the outermost expression in a term first; if no rule applies, it tries the subterms of that term, and so on. If none of the matching rules apply, the algorithm degenerates to standard unification. The matching rules can also have conditions attached to them. The condition can be any boolean function; however, for the most part the conditions tend to be simple type checks.

(Footnote 2: Specifically, unification fails when it tries to unify $pt1 and $pt2.)

2.2.2 REFORMULATION RULES

As mentioned earlier, there are several ways to specify constraints that restrict the same degrees of freedom of a geom. In GCE, plan fragments are indexed by signatures, which summarize the available degrees of freedom of a geom. To reduce the number of plan fragments that need to be written and indexed, it is desirable to reduce the number of allowable signatures. This is accomplished with a set of invariant reformulation rules, which rewrite pairs of invariants on a geom into an equivalent pair of simpler invariants (using a well-founded ordering). Here, equivalence means that the two sets of invariants produce the same range of motions in the geom. This reduces the number of different combinations of invariants for which plan fragments need to be written.
An example of invariant reformulation is the following:

(fixed-distance-line ?c ?l1 ?d1 BIAS_COUNTERCLOCKWISE)
(fixed-distance-line ?c ?l2 ?d2 BIAS_CLOCKWISE)
⇓ (RR-1)
(1d-constrained-point ?c (>> ?c center)
  (angular-bisector (make-displaced-line ?l1 BIAS_LEFT ?d1)
                    (make-displaced-line ?l2 BIAS_RIGHT ?d2)
                    BIAS_COUNTERCLOCKWISE
                    BIAS_CLOCKWISE))

This rule takes two invariants: (1) a geom is at a fixed distance to the left of a given line, and (2) a geom is at a fixed distance to the right of a given line. The reformulation produces the invariant that the geom lies on the angular bisector of two lines that are parallel to the two given lines and at the specified distance from them. Either of the two original invariants, in conjunction with the new one, is equivalent to the original set of invariants.

Besides reducing the number of plan fragments, reformulation rules also help to simplify action rules. Currently all action rules (for variable-radius circles and line-segments) use only a single action to preserve or achieve an invariant. If we do not restrict the allowable signatures on a geom, it is possible to create examples where we need a sequence of (more than one) actions in the rule to achieve the invariant, or complex conditions that must be checked to determine rule applicability. Allowing sequences and conditionals in the rules increases the complexity of both the rules and the pattern matcher. This makes it difficult to verify the correctness of rules and reduces the efficiency of the pattern matcher.

Using invariant reformulation rules allows us to limit action rules to those that contain a single action. Unfortunately, it seems that we still need conditions to achieve certain invariants. For example, consider the following invariant on a variable-radius circle:

(fixed-distance-point ?circle ?pt ?dist BIAS_OUTSIDE)

which states that a circle ?circle be at distance ?dist from a point ?pt and lie outside a circle around ?pt with radius ?dist.
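Rule RR-1 is a rewrite step over a set of invariants. A minimal sketch of applying it, assuming a tuple encoding of invariants; the helper names mirror the rule text but the implementation is our own:

```python
def apply_rr1(invariants):
    """One application of RR-1: a pair of fixed-distance-line invariants with
    opposite biases on the same geom yields a 1d-constrained-point invariant
    placing the geom's center on an angular bisector.  One of the original
    pair is kept, so the result is equivalent to the original set."""
    for inv1 in invariants:
        for inv2 in invariants:
            if (inv1[0] == inv2[0] == "fixed-distance-line"
                    and inv1[1] == inv2[1]              # same geom ?c
                    and inv1[4] == "BIAS_COUNTERCLOCKWISE"
                    and inv2[4] == "BIAS_CLOCKWISE"):
                c, l1, d1 = inv1[1], inv1[2], inv1[3]
                l2, d2 = inv2[2], inv2[3]
                new = ("1d-constrained-point", c, (">>", c, "center"),
                       ("angular-bisector",
                        ("make-displaced-line", l1, "BIAS_LEFT", d1),
                        ("make-displaced-line", l2, "BIAS_RIGHT", d2),
                        "BIAS_COUNTERCLOCKWISE", "BIAS_CLOCKWISE"))
                # keep inv1, replace inv2 by the simpler invariant
                return [i for i in invariants if i is not inv2] + [new]
    return invariants
```

Applied to the two fixed-distance-line invariants above, this returns the first invariant together with the new 1d-constrained-point invariant.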
One action that may be taken to achieve this constraint is:

(scale ?circle (>> ?circle center)
Finite Model Theory and Its Applications

Springer

1 Constraint Satisfaction

the data complexity of a fragment of existential second-order logic. We then go on in Section 1.6 and offer a logical approach, via definability in Datalog, to establishing the tractability of non-uniform constraint-satisfaction problems. In Section 1.7, we leverage the connection between Datalog and certain pebble games, and show how these pebble games offer an algorithmic approach to solving uniform constraint-satisfaction problems. In Section 1.8, we relate these pebble games to consistency properties of constraint-satisfaction instances, a well-known approach in constraint solving. Finally, in Section 1.9, we show how the same pebble games can be used to identify large "islands of tractability" in the constraint-satisfaction terrain that are based on the concept of bounded treewidth.

Much of the logical machinery used in this chapter is described in detail in Chapter ??. For a book-length treatment of constraint satisfaction from the perspective of graph homomorphism, see [44]. Two books on constraint programming and constraint processing are [3, 23].

1.2 Preliminaries

The standard terminology in AI formalizes an instance P of constraint satisfaction as a triple (V, D, C), where

1. V is a set of variables;
2. D is a set of values, referred to as the domain;
3. C is a collection of constraints C1, ..., Cq, where each constraint Ci is a pair (t, R), with t a k-tuple over V, k ≥ 1, referred to as the scope of the constraint, and R a k-ary relation on D.

A solution of such an instance is a mapping h: V → D such that, for each constraint (t, R) in C, we have h(t) ∈ R, where h is defined on tuples component-wise, that is, if t = (a1, ..., ak), then h(t) = (h(a1), ..., h(ak)).
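As a concrete reading of this definition, solvability can be brute-forced by enumerating all mappings h: V → D and checking every constraint. A minimal sketch (the encoding and function names are our own, not from the chapter):

```python
from itertools import product

def solve_csp(variables, domain, constraints):
    """Brute-force search for a solution h: V -> D of a CSP instance (V, D, C).

    `constraints` is a list of pairs (scope, relation): scope is a tuple of
    variables, relation a set of tuples over the domain.  Returns a solution
    as a dict, or None if the instance is unsolvable."""
    for values in product(domain, repeat=len(variables)):
        h = dict(zip(variables, values))
        # h is a solution iff h(t) is in R for every constraint (t, R)
        if all(tuple(h[v] for v in scope) in rel for scope, rel in constraints):
            return h
    return None

# 3-Colorability of a triangle, phrased as a CSP: every constraint relation
# is the disequality relation on the three colors.
neq = {(a, b) for a in "rbg" for b in "rbg" if a != b}
triangle = [(("u", "v"), neq), (("v", "w"), neq), (("u", "w"), neq)]
```

Here `solve_csp(["u", "v", "w"], "rbg", triangle)` returns a proper 3-coloring of the triangle, while the same instance over a 2-color domain is unsolvable.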
The constraint-satisfaction problem asks whether a given instance is solvable, i.e., whether it has a solution. Note that, without loss of generality, we may assume that all constraints (t, Ri) involving the same scope t have been consolidated into a single constraint (t, R), where R is the intersection of all relations Ri constraining t. Thus, we can assume that each tuple t of variables occurs at most once in the collection C.

Consider the Boolean satisfiability problem 3-Sat: given a 3CNF-formula φ with variables x1, ..., xn and clauses c1, ..., cm, is φ satisfiable? Such an instance of 3-Sat can be thought of as the constraint-satisfaction instance in which the set of variables is V = {x1, ..., xn}, the domain is D = {0, 1}, and the constraints are determined by the clauses of φ. For example, a clause of the form (¬x ∨ ¬y ∨ z) gives rise to the constraint ((x, y, z), {0,1}³ − {(1,1,0)}). In an analogous manner, 3-Colorability can be modelled as a constraint-satisfaction problem. Indeed, an instance G = (V, E) of 3-Colorability can be thought of as the constraint-satisfaction instance in which the set of variables is the set V of the nodes of the graph G, the domain is the set D = {r, b, g} of three colors, and the constraints are the pairs ((u, v), Q), where (u, v) ∈ E and Q = {(r,b), (b,r), (r,g), (g,r), (b,g), (g,b)} is the disequality relation on D.

Let A and B be two relational structures over the same vocabulary. A homomorphism h from A to B is a mapping h: A → B from the universe A of A to the universe B of B such that, for every relation R^A of A and every tuple (a1, ..., ak) ∈ R^A, we have that (h(a1), ..., h(ak)) ∈ R^B. The existence of a homomorphism from A to B is denoted by A → B, or by A →h B when we want to name the homomorphism h explicitly. An important observation made in [29] is that every constraint-satisfaction instance P = (V, D, C) can be viewed as an instance of the homomorphism problem, asking whether there is a homomorphism between two structures A_P and B_P that are obtained from P in the following way:

1. the universe
of A_P is V and the universe of B_P is D;
2. the relations of B_P are the distinct relations R occurring in C;
3. the relations of A_P are defined as follows: for each distinct relation R on D occurring in C, we have the relation R^A = {t : (t, R) ∈ C}. Thus, R^A consists of all scopes associated with R.

We call (A_P, B_P) the homomorphism instance of P. Conversely, it is also clear that every instance of the homomorphism problem between two structures A and B can be viewed as a constraint-satisfaction instance CSP(A, B) by simply "breaking up" each relation R^A on A as follows: we generate a constraint (t, R^B) for each t ∈ R^A. We call CSP(A, B) the constraint-satisfaction instance of (A, B). Thus, as pointed out in [29], the constraint-satisfaction problem can be identified with the homomorphism problem.

To illustrate the passage from the constraint-satisfaction problem to the homomorphism problem, let us consider 3-Sat. A 3CNF-formula φ with variables x1, ..., xn and clauses c1, ..., cm gives rise to a homomorphism instance (Aφ, Bφ) defined as follows:

• Aφ = ({x1, ..., xn}, R0^φ, R1^φ, R2^φ, R3^φ), where Ri^φ is the ternary relation consisting of all triples (x, y, z) of variables that occur in a clause of φ with i negated literals, 0 ≤ i ≤ 3; for instance, R2^φ consists of all triples (x, y, z) of variables such that (¬x ∨ ¬y ∨ z) is a clause of φ (here, we assume without loss of generality that the negated literals precede the positive literals).
• Bφ = ({0, 1}, R0, R1, R2, R3), where Ri consists of all triples that satisfy a 3-clause in which the first i literals are negated; for instance, R2 = {0,1}³ − {(1,1,0)}. Note that Bφ does not depend on φ.

It is clear that φ is satisfiable if and only if there is a homomorphism from Aφ to Bφ (in symbols, Aφ → Bφ).

As another example, 3-Colorability is equivalent to the problem of deciding whether there is a homomorphism h from a given graph G to the complete graph K3 = ({r, b, g}, {(r,b), (b,r), (r,g), (g,r), (b,g), (g,b)}) with 3 nodes.
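The three-step passage from P = (V, D, C) to (A_P, B_P) is mechanical. A small sketch, with an illustrative encoding of structures as (universe, relations) pairs; the names are ours, not the chapter's:

```python
def homomorphism_instance(variables, domain, constraints):
    """Build the homomorphism instance (A_P, B_P) of P = (V, D, C) following
    the three steps above.  Each structure is (universe, relations), where
    `relations` maps a relation (frozen, used as the relation symbol's key)
    to its set of tuples."""
    a_rels, b_rels = {}, {}
    for scope, rel in constraints:
        key = frozenset(rel)            # one symbol per distinct relation R in C
        b_rels[key] = set(rel)          # B_P interprets the symbol as R itself
        a_rels.setdefault(key, set()).add(tuple(scope))  # R^A: the scopes of R
    return (set(variables), a_rels), (set(domain), b_rels)

# 3-Colorability of a path u-v-w: the only relation is disequality on colors.
neq = {(a, b) for a in "rbg" for b in "rbg" if a != b}
P = (["u", "v", "w"], "rbg", [(("u", "v"), neq), (("v", "w"), neq)])
```

The resulting A_P has universe {u, v, w} with the edge scopes as its single relation, and B_P has universe {r, b, g} with the disequality relation.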
More generally, k-Colorability, k ≥ 2, amounts to the existence of a homomorphism from a given graph G to the complete graph Kk with k nodes (also known as the k-clique). Numerous other important NP-complete problems can be viewed as special cases of the Homomorphism Problem (and, hence, also of the Constraint-Satisfaction Problem). For example, consider the Clique problem: given a graph G and an integer k, does G contain a clique of size k? As a homomorphism instance this is equivalent to asking if there is a homomorphism from the complete graph Kk to G. As a constraint-satisfaction instance, the set of variables is {1, 2, ..., k}, the domain is the set V of nodes of G, and the constraints are the pairs ((i, j), E) such that i ≠ j, 1 ≤ i, j ≤ k, and E is the edge relation of G. For another example, consider the Hamiltonicity problem: given a graph G = (V, E), does it have a Hamiltonian cycle? This is equivalent to asking if there is a homomorphism from the structure (V, C_V, ≠) to the structure (V, E, ≠), where C_V is some cycle on the set V of nodes of G and ≠ is the disequality relation on V. NP-completeness of the Homomorphism Problem was pointed out explicitly in [53]. In this chapter, we use both the traditional AI formulation of constraint satisfaction and the formulation as the homomorphism problem, as each has its own advantages.

It turns out that in both formulations constraint satisfaction can be expressed as a database-theoretic problem. We start with the homomorphism formulation, which is intimately related to conjunctive-query evaluation [48]. A conjunctive query Q of arity n is a query definable by a positive existential first-order formula φ(X1, ..., Xn) having conjunction as its only propositional connective, that is, by a formula of the form

∃Z1 ... ∃Zm ψ(X1, ..., Xn, Z1, ..., Zm),

where ψ(X1, ..., Xn, Z1, ..., Zm) is a conjunction of (positive) atomic formulas. The free variables X1, ..., Xn of the defining formula are called the distinguished variables of Q. Such a conjunctive query is usually written as a rule, whose
head is Q(X1, ..., Xn) and whose body is ψ(X1, ..., Xn, Z1, ..., Zm). For example, the formula

∃Z1 ∃Z2 (P(X1, Z1, Z2) ∧ R(Z2, Z3) ∧ R(Z3, X2))

defines a binary conjunctive query Q, which as a rule becomes

Q(X1, X2) :- P(X1, Z1, Z2), R(Z2, Z3), R(Z3, X2).

If the formula defining a conjunctive query Q has no free variables (i.e., if it is a sentence), then Q is a Boolean conjunctive query. For example, the sentence

∃Z1 ∃Z2 ∃Z3 (E(Z1, Z2) ∧ E(Z2, Z3) ∧ E(Z3, Z1))

defines the Boolean conjunctive query "is there a cycle of length 3?".

If D is a database and Q is an n-ary query, then Q(D) is the n-ary relation on D obtained by evaluating the query Q on D, that is, the collection of all n-tuples from D that satisfy the query (cf. Chapter ??). The conjunctive-query evaluation problem asks: given an n-ary query Q, a database D, and an n-tuple a from D, does a ∈ Q(D)? Let Q1 and Q2 be two n-ary queries having the same tuple of distinguished variables. We say that Q1 is contained in Q2, and write Q1 ⊆ Q2, if Q1(D) ⊆ Q2(D) for every database D. The conjunctive-query containment problem asks: given two conjunctive queries Q1 and Q2, is Q1 ⊆ Q2? These concepts can be defined for Boolean conjunctive queries in an analogous manner. In particular, if Q is a Boolean query and D is a database, then Q(D) = 1 if D satisfies Q; otherwise, Q(D) = 0. Moreover, the containment problem for Boolean queries Q1 and Q2 is equivalent to asking whether Q1 logically implies Q2.

It is well known that conjunctive-query containment can be reformulated both as a conjunctive-query evaluation problem and as a homomorphism problem. What links these problems together is the canonical database D_Q associated with Q. This database is defined as follows. Each variable occurring in Q is considered a distinct element in the universe of D_Q. Every predicate in the body of Q is a predicate of D_Q as well; moreover, for every distinguished variable Xi of Q, there is a distinct monadic predicate Pi (not occurring in Q). Every subgoal in the body of Q gives rise to a tuple in the corresponding predicate of
D_Q; moreover, if Xi is a distinguished variable of Q, then Pi(Xi) is also a (monadic) tuple of D_Q. Thus, returning to the preceding example, the canonical database of the conjunctive query ∃Z1 ∃Z2 (P(X1, Z1, Z2) ∧ R(Z2, Z3) ∧ R(Z3, X2)) consists of the facts P(X1, Z1, Z2), R(Z2, Z3), R(Z3, X2), P1(X1), P2(X2). The relationship between conjunctive-query containment, conjunctive-query evaluation, and homomorphisms is provided by the following classical result, due to Chandra and Merlin.

Theorem 1.1. [11] Let Q1 and Q2 be two conjunctive queries having the same tuple (X1, ..., Xn) of distinguished variables. Then the following statements are equivalent.
• Q1 ⊆ Q2.
• (X1, ..., Xn) ∈ Q2(D_Q1).
• There is a homomorphism h: D_Q2 → D_Q1.

It follows that the homomorphism problem can be viewed as a conjunctive-query evaluation problem or as a conjunctive-query containment problem. For this, with every structure A, we view the universe A = {X1, ..., Xn} of A as a set of individual variables and associate with A the Boolean conjunctive query ∃X1 ... ∃Xn ⋀_{t ∈ R^A} R(t); we call this query the canonical conjunctive query of A and denote it by Q_A. It is clear that A is isomorphic to the canonical database associated with Q_A.

Corollary 1.2. Let A and B be two structures over the same vocabulary.
Then the following statements are equivalent.
• A → B.
• B |= Q_A.
• Q_B ⊆ Q_A.

As an illustration, we have that a graph G is 3-colorable iff K3 |= Q_G iff Q_{K3} ⊆ Q_G.

A relational join, denoted by the symbol ⋈, is a conjunctive query with no existentially quantified variables. Thus, relational-join evaluation is a special case of conjunctive-query evaluation. For example, E(Z1, Z2) ∧ E(Z2, Z3) ∧ E(Z3, Z1) is a relational join that, when evaluated on a graph G = (V, E), returns all triples of nodes forming a 3-cycle. There is a well-known connection between the traditional AI formulation of constraint satisfaction and relational-join evaluation that we describe next. Suppose we are given a constraint-satisfaction instance (V, D, C). We can assume without loss of generality that in every constraint (t, R) ∈ C the elements in t are distinct. (Suppose to the contrary that t_i = t_j. Then we can delete from R every tuple in which the i-th and j-th entries disagree, and then project out the j-th column from t and R.) We can thus view every element of V as a relational attribute, every tuple of distinct elements of V as a relational schema, and every constraint (t, R) as a relation R over the schema t (cf. [1]). It now follows from the definition of constraint satisfaction that CSP can be viewed as a relational-join evaluation problem.

Proposition 1.3. [6, 42] A constraint-satisfaction instance (V, D, C) is solvable if and only if ⋈_{(t,R) ∈ C} R is nonempty.

Note that Proposition 1.3 is essentially the same as Corollary 1.2. Indeed, the condition B |= Q_A amounts to the non-emptiness of the relational join obtained from Q_A by dropping all existential quantifiers and using the relations from B as interpretations of the relation symbols in Q_A. Moreover, the homomorphisms from A to B are precisely the tuples in the relational join associated with the constraint-satisfaction instance CSP(A, B).

1.3 Computational Complexity of Constraint Satisfaction

The Constraint-Satisfaction Problem is NP-complete, because it is clearly in NP and also contains NP-hard
problems as special cases, including 3-Sat, 3-Colorability, and Clique. As explained in Garey and Johnson's classic monograph [36], one of the main ways to cope with NP-completeness is to identify polynomial-time solvable cases of the problem at hand that are obtained by imposing restrictions on the possible inputs. For instance, Horn 3-Sat, the restriction of 3-Sat to Horn 3CNF-formulas, is solvable in polynomial time using a unit-propagation algorithm. Similarly, it is known that 3-Colorability restricted to graphs of bounded treewidth is solvable in polynomial time (see [26]).

In the case of constraint satisfaction, the pursuit of tractable cases has evolved over the years from the discovery of isolated cases to the discovery of large "islands of tractability" of constraint satisfaction. In what follows, we will give an account of some of the progress made in this direction. Using the fact that the Constraint-Satisfaction Problem can be identified with the Homomorphism Problem, we begin by introducing some terminology and notation that will enable us to formalize the concept of an "island of tractability" of constraint satisfaction.

In general, an instance of the Homomorphism Problem consists of two relational structures A and B. Thus, all restricted cases of this problem can be obtained by imposing restrictions on the input structures A and B.

Definition 1.4. Let A, B be two classes of relational structures. We write CSP(A, B) to denote the restriction of the Homomorphism Problem to input structures from A and B. In other words, CSP(A, B) = {(A, B) : A ∈ A, B ∈ B and A → B}.

An island of tractability of constraint satisfaction is a pair (A, B) of classes of relational structures such that CSP(A, B) is in the complexity class PTIME of all decision problems solvable in polynomial time. (A more general definition of islands of tractability of constraint satisfaction would consider classes of pairs (A, B) of structures, cf. [28]; we do not pursue this more general definition here.)

The
ultimate goal in the pursuit of islands of tractability of constraint satisfaction is to identify or characterize classes A and B of relational structures such that CSP(A, B) is in PTIME. The basic starting point in this investigation is to consider the cases in which one of the two classes A, B is as small as possible, while the other is as large as possible. This amounts to considering the cases in which one of A, B is the class All of all relational structures over some arbitrary, but fixed, relational vocabulary, while the other is a singleton, consisting of some fixed structure over that vocabulary. Thus, the starting point of the investigation is to determine, for fixed relational structures A, B, the computational complexity of the decision problems CSP({A}, All) and CSP(All, {B}).

Clearly, for each fixed A, the decision problem CSP({A}, All) can be solved in polynomial time, because, given a structure B, the existence of a homomorphism from A to B can be checked by testing all functions h from the universe A of A to the universe B of B (the total number of such functions is |B|^|A|, which is polynomial in the size of the structure B when A is fixed). Thus, having a singleton structure "on the left" is of little interest. At the other extreme, however, the situation is quite different, since the computational complexity of CSP(All, {B}) may very well depend on the particular structure B. Indeed, CSP(All, {K3}) is NP-complete, because it is the 3-Colorability problem; in contrast, CSP(All, {K2}) is in PTIME, because it is the 2-Colorability problem. For simplicity, in what follows, for every fixed structure B, we define CSP(B) = CSP(All, {B}) and call this the non-uniform constraint-satisfaction problem associated with B. For such problems, we refer to B as the template. Thus, the first major goal in the study of the computational complexity of constraint satisfaction is to identify those templates B for which CSP(B) is in PTIME. This goal gives rise to an important open decision problem.

The
Tractability Classification Problem: Given a relational structure B, decide if CSP(B) is in PTIME.

In addition to the family of non-uniform constraint-satisfaction problems CSP(B), where B is a relational structure, we also study decision problems of the form CSP(A, All), where A is a class of structures. We refer to such problems as uniform constraint-satisfaction problems.

It is illuminating to consider the complexity of uniform and non-uniform constraint satisfaction from the perspective of query evaluation. As argued in [67] (see Chapter ??), there are three ways to measure the complexity of evaluating queries (we focus here on Boolean queries) expressible in a query language L:

• The combined complexity of L is the complexity of the following decision problem: given an L-query Q and a structure A, does A |= Q? In symbols, {(Q, A) : Q ∈ L and A |= Q}.
• The expression complexity of L is the complexity of the following decision problems, one for each fixed structure A: {Q : Q ∈ L and A |= Q}.
• The data complexity of L is the complexity of the following decision problems, one for each fixed query Q ∈ L: {A : A |= Q}.

As discussed in Chapter ??, the data complexity of first-order logic is in LOGSPACE, which means that, for each first-order query Q, the problem {A : A |= Q} is in LOGSPACE. In contrast, the combined complexity of first-order logic is PSPACE-complete. Furthermore, the expression complexity of first-order logic is also PSPACE-complete. In fact, for all but trivial structures A, the problem {Q : Q ∈ FO and A |= Q} is PSPACE-complete.
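The brute-force homomorphism test mentioned above for CSP({A}, All) — try every function h from the universe of A to the universe of B — is easy to sketch (illustrative code and encoding, not from the chapter):

```python
from itertools import product

def homomorphic(a_univ, a_rels, b_univ, b_rels):
    """Test A -> B by enumerating all |B|^|A| functions h: A -> B.

    Structures are given as (universe, relations), with `relations` a dict
    mapping relation names to sets of tuples over the universe.  This is
    polynomial in |B| for fixed A, but exponential when A is part of the input."""
    a_univ, b_univ = list(a_univ), list(b_univ)
    for image in product(b_univ, repeat=len(a_univ)):
        h = dict(zip(a_univ, image))
        if all(tuple(h[x] for x in t) in b_rels[name]
               for name, tuples in a_rels.items() for t in tuples):
            return True
    return False

# CSP(All, {K2}) is 2-Colorability: an even cycle maps homomorphically
# onto K2, an odd cycle does not.
K2 = ({0, 1}, {"E": {(0, 1), (1, 0)}})
def cycle(n):
    return (set(range(n)), {"E": {(i, (i + 1) % n) for i in range(n)}})
```

For instance, `homomorphic(*cycle(4), *K2)` succeeds while `homomorphic(*cycle(5), *K2)` fails, matching the even/odd cycle dichotomy discussed later in the chapter.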
This exponential gap between data complexity, on the one hand, and combined and expression complexity, on the other, is typical [67]. For conjunctive queries, on the other hand, both combined and expression complexity are NP-complete.

Consider now the uniform constraint-satisfaction problem CSP(A, All) = {(A, B) : A ∈ A and A → B}, where A is a class of structures. By Corollary 1.2, we have that

CSP(A, All) = {(A, B) : A ∈ A, B is a structure and B |= Q_A}.

Thus, studying the complexity of uniform constraint satisfaction amounts to studying the combined complexity for a class of conjunctive queries, as, for example, in [12, 39, 62]. In contrast, consider the non-uniform constraint-satisfaction problem CSP(B) = {A : A → B}. By Corollary 1.2 we have that CSP(B) = {A : B |= Q_A}. Thus, studying the complexity of non-uniform constraint satisfaction amounts to studying the expression complexity of conjunctive queries with respect to different structures. This is a problem that has not been studied in the context of database theory.

1.4 Non-Uniform Constraint Satisfaction

The first major result in the study of non-uniform constraint-satisfaction problems was obtained by Schaefer [63], who, in effect, classified the computational complexity of all Boolean non-uniform constraint-satisfaction problems. A Boolean structure is simply a relational structure with a 2-element universe, that is, a structure of the form B = ({0, 1}, R1^B, ..., Rm^B). A Boolean non-uniform constraint-satisfaction problem is a problem of the form CSP(B) with a Boolean template B. These problems are also known as Generalized-Satisfiability Problems, because they can be viewed as variants of Boolean-satisfiability problems in which the formulas are conjunctions of generalized connectives [36]. In particular, they contain the well-known problems k-Sat, k ≥ 2, 1-in-3-Sat, Positive 1-in-3-Sat, Not-All-Equal 3-Sat, and Monotone 3-Sat as special cases. For example, as seen earlier, 3-Sat is CSP(B), where B = ({0, 1}, R0, R1, R2, R3) and Ri is the set of all
triples that satisfy a 3-clause in which the first i literals are negated, i = 0, 1, 2, 3 (thus, R0 = {0,1}³ − {(0,0,0)}). Similarly, Monotone 3-Sat is CSP(B), where B = ({0, 1}, R0, R3).

Ladner [51] showed that if PTIME ≠ NP, then there are decision problems in NP that are neither NP-complete nor belong to PTIME. Such problems are called intermediate problems. Consequently, it is conceivable that a given family of NP-problems contains intermediate problems. Schaefer [63], however, showed that the family of all Boolean non-uniform constraint-satisfaction problems contains no intermediate problems.

Theorem 1.5. (Schaefer's Dichotomy Theorem [63])
• If B = ({0, 1}, R1^B, ..., Rm^B) is a Boolean structure, then either CSP(B) is in PTIME or CSP(B) is NP-complete.
• The Tractability Classification Problem for Boolean structures is decidable; in fact, there is a polynomial-time algorithm to decide, given a Boolean structure B, whether CSP(B) is in PTIME or is NP-complete.

Schaefer's Dichotomy Theorem can be described pictorially: the problems CSP(B) fall into exactly two regions, P and NP-complete, with nothing in between. [Figure omitted.]

Schaefer [63] actually showed that there are exactly six types of Boolean structures B such that CSP(B) is in PTIME, and provided explicit descriptions of them. Specifically, he showed that CSP(B) is in PTIME precisely when at least one of the following six conditions is satisfied:

• Every relation Ri^B, 1 ≤ i ≤ m, of B is 0-valid, that is, Ri^B contains the all-zeroes tuple (0, ..., 0).
• Every relation Ri^B, 1 ≤ i ≤ m, of B is 1-valid, that is, Ri^B contains the all-ones tuple (1, ..., 1).
• Every relation Ri^B, 1 ≤ i ≤ m, of B is bijunctive, that is, Ri^B is the set of truth assignments satisfying some 2-CNF formula.
• Every relation Ri^B, 1 ≤ i ≤ m, of B is Horn, that is, Ri^B is the set of truth assignments satisfying some Horn formula.
• Every relation Ri^B, 1 ≤ i ≤ m, of B is dual Horn, that is, Ri^B is the set of truth assignments satisfying some dual Horn formula.
• Every relation Ri^B, 1 ≤ i ≤ m, of B is affine, that is, Ri^B is the set of solutions to a system of linear equations over the
two-element field.

Schaefer's Dichotomy Theorem established a dichotomy and a decidable classification of the complexity of CSP(B) for Boolean templates B. After this, Hell and Nešetřil [43] established a dichotomy theorem for CSP(B) problems in which the template B is an undirected graph: if B is bipartite, then CSP(B) is solvable in polynomial time; otherwise, CSP(B) is NP-complete. To illustrate this dichotomy theorem, let Cn, n ≥ 3, be a cycle with n elements. Then CSP(Cn) is in PTIME if n is even, and is NP-complete if n is odd.

The preceding two dichotomy results raise the challenge of classifying the computational complexity of CSP(B) for arbitrary relational templates B. Addressing this question, Feder and Vardi [29] formulated the following conjecture.

Conjecture 1.6. (Dichotomy Conjecture) [29] If B = (B, R1^B, ..., Rm^B) is an arbitrary relational structure, then either CSP(B) is in PTIME or CSP(B) is NP-complete.

In other words, the Dichotomy Conjecture says that the picture above describes the complexity of non-uniform constraint-satisfaction problems CSP(B) for arbitrary structures B. The basis for the conjecture is not only the evidence from Boolean constraint satisfaction and undirected-graph constraint satisfaction, but also the seeming inability to carry out the diagonalization argument of [51] using the constraint-satisfaction machinery [27].

The Dichotomy Conjecture inspired intensive research efforts that significantly advanced our understanding of the complexity of non-uniform constraint satisfaction. In particular, Bulatov confirmed two important cases of this conjecture. We say that a structure B = (B, R1^B, ..., Rm^B) is a 3-element structure if B contains at most three elements. We say that B is conservative if all possible monadic relations on the universe are included, that is, every non-empty subset of B is one of the relations Ri^B of B.

Theorem 1.7. [8, 9] If B is a 3-element structure or a conservative structure, then either CSP(B) is in PTIME or
CSP(B) is NP-complete. Moreover, in both cases the Tractability Classification Problem is decidable in polynomial time.

In spite of the progress made, the Dichotomy Conjecture remains unresolved in general. The research efforts towards this conjecture, however, have also resulted in the discovery of broad sufficient conditions for tractability and intractability of non-uniform constraint satisfaction. These conditions have provided unifying explanations for numerous seemingly disparate tractability and intractability results, and have also led to the discovery of new islands of tractability of CSP(B). These broad sufficient conditions are based on concepts and techniques from two different areas: universal algebra and logic.

The approach via universal algebra yields sufficient conditions for tractability of CSP(B) in terms of closure properties of the relations in B under certain functions on its universe B. Let R be an n-ary relation on a set B and f: B^k → B a k-ary function. We say that R is closed under f if, whenever t_1 = (t_1^1, t_2^1, ..., t_n^1), ..., t_k = (t_1^k, t_2^k, ..., t_n^k) are k (not necessarily distinct) tuples in R, the tuple (f(t_1^1, ..., t_1^k), f(t_2^1, ..., t_2^k), ..., f(t_n^1, ..., t_n^k)) is also in R. We say that f: B^k → B is a polymorphism of a structure B = (B, R1, ..., Rm) if each of the relations Rj, 1 ≤ j ≤ m, is closed under f. It is easy to see that f is a polymorphism of B if and only if f is a homomorphism from B^k to B, where B^k is the k-th power of B. By definition, the k-th power B^k is the structure (B^k, R'1, ..., R'm) over the same vocabulary as B, with universe B^k and relations R'j, 1 ≤ j ≤ m, defined as follows: if Rj is of arity n, then R'j(s1, ..., sn) holds in B^k if and only if Rj(s1^i, ..., sn^i) holds in B for 1 ≤ i ≤ k.

We write Pol(B) for the set of all polymorphisms of B. As it turns out, the complexity of CSP(B) is intimately connected to the kinds of functions that Pol(B) contains. This connection was first unveiled in [29], and explored in depth by Jeavons and his collaborators; for a
recent survey see [10]. In particular, they showed that if Pol(B1) = Pol(B2) for two structures B1 and B2 (over finite vocabularies), then CSP(B1) and CSP(B2) are polynomially reducible to each other. Thus, the polymorphisms of a template B characterize the complexity of CSP(B). The above-mentioned dichotomy results for 3-element and conservative constraint satisfaction are based on a rather deep analysis of the appropriate sets of polymorphisms.

1.5 Monotone Monadic SNP and Non-Uniform Constraint Satisfaction

We discussed earlier how non-uniform constraint satisfaction is related to the study of the expression complexity of conjunctive queries. We now show that it can also be viewed as the study of the data complexity of second-order logic. This will suggest a way to identify islands of tractability via logic.

As described in Chapters ?? and ??, existential second-order logic ESO defines, by Fagin's Theorem, precisely the complexity class NP. The class SNP (for strict NP) [46, 57] is a fragment of ESO, consisting of all existential second-order sentences with a universal first-order part, namely, sentences of the form (∃S')(∀x)Φ(x, S, S'), where Φ is a first-order quantifier-free formula. We refer to the relations over the input vocabulary S as input relations, while the relations over the quantified vocabulary S' are referred to as existential relations.

3-Sat is an example of an SNP problem. The input structure consists of four ternary relations C0, C1, C2, C3 on the universe {0, 1}, where Ci corresponds to a clause on three variables with the first i of them negated. There is a single existential monadic relation T describing a truth assignment. The condition that must be satisfied states that, for all x1, x2, x3, if C0(x1, x2, x3) then T(x1) or T(x2) or T(x3), and similarly for the remaining Ci, negating T(xj) if j ≤ i. Formally, we can express 3-Sat with the SNP sentence:

(∃T)(∀x1, x2, x3)((C0(x1, x2, x3) → T(x1) ∨ T(x2) ∨ T(x3))
∧ (C1(x1, x2, x3) → ¬T(x1) ∨ T(x2) ∨ T(x3))
∧ (C
2(x1, x2, x3) → ¬T(x1) ∨ ¬T(x2) ∨ T(x3))
∧ (C3(x1, x2, x3) → ¬T(x1) ∨ ¬T(x2) ∨ ¬T(x3))).

It is easy to see that CSP(B) is in SNP for each structure B. For each element a in the universe of B, we introduce an existentially quantified monadic relation T_a; intuitively, T_a(x) indicates that a variable x has been assigned value a by the homomorphism. The sentence φ_B says that the sets T_a cover all elements in the universe, and that the tuples in the input relations satisfy the constraints imposed by the structure B. Thus, if R(a1, ..., an) does not hold in B, then φ_B contains the conjunct ¬(R(x1, ..., xn) ∧ ⋀_{i=1}^{n} T_{a_i}(x_i)). For
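On a finite input, the SNP sentence for 3-Sat can be checked by brute force: enumerate all candidate monadic relations T and test the universal condition. A naive sketch with illustrative names (SNP model checking is of course NP-hard in general):

```python
from itertools import chain, combinations

def snp_3sat(universe, C):
    """Brute-force check of the 3-Sat SNP sentence above: is there a monadic
    relation T on `universe` under which every tuple in C[i] satisfies a
    3-clause whose first i literals are negated?  C maps i to variable triples."""
    def holds(T, i, triple):
        # literal j is negated exactly when j < i (negated literals come first)
        return any((x not in T) if j < i else (x in T)
                   for j, x in enumerate(triple))
    subsets = chain.from_iterable(
        combinations(universe, r) for r in range(len(universe) + 1))
    return any(all(holds(set(T), i, t) for i in C for t in C[i])
               for T in subsets)

# (x ∨ y ∨ y) ∧ (¬x ∨ y ∨ y): satisfiable (set y to true)
clauses_sat = {0: {("x", "y", "y")}, 1: {("x", "y", "y")}}
# (x ∨ x ∨ x) ∧ (¬x ∨ ¬x ∨ ¬x): unsatisfiable
clauses_unsat = {0: {("x", "x", "x")}, 3: {("x", "x", "x")}}
```

The existential quantifier over T becomes the outer `any` over all subsets of the universe; the universal first-order part becomes the inner `all`.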
CogSci02 A Constraint Satisfaction Model of Causal Learning and Reasoning

A Constraint Satisfaction Model of Causal Learning and Reasoning

York Hagmayer (york.hagmayer@bio.uni-goettingen.de)
Department of Psychology, University of Göttingen
Gosslerstr. 14, 37073 Göttingen, Germany

Michael R. Waldmann (michael.waldmann@bio.uni-goettingen.de)
Department of Psychology, University of Göttingen
Gosslerstr. 14, 37073 Göttingen, Germany

Abstract

Following up on previous work by Thagard (1989, 2000) we have developed a connectionist constraint satisfaction model which aims at capturing a wide variety of tasks involving causal cognitions, including causal reasoning, learning, hypothesis testing, and prediction. We will show that this model predicts a number of recent findings, including asymmetries of blocking, and asymmetries of sensitivity to structural implications of causal models in explicit versus implicit tasks.

Introduction

Causal reasoning has been widely investigated during the last decade, which has led to a number of interesting novel findings (see Shanks, Holyoak, & Medin, 1996; Hagmayer & Waldmann, 2001, for overviews). For example, it has been shown that participants' causal judgments are sensitive to the contingency between the cause and the effect, and that people's judgments reflect the causal models underlying the observed learning events (see Hagmayer & Waldmann, 2001; Waldmann, 1996). Moreover, causal reasoning has been studied in the context of a number of different tasks, such as learning, reasoning, categorization, or hypothesis testing.

Most psychological theories and computational models of causal learning and reasoning are rooted in two traditions. They are either based on associationistic or on probabilistic or Bayesian models (see Shanks et al., 1996; Thagard, 2000). Both kinds of models have been criticized. Associationistic learning networks have proven unable to capture the fundamental semantics of causal models because they are insensitive to the differences between learning events that represent causes versus effects (see Waldmann, 1996).
By contrast, Bayesian networks are perfectly capable of representing causal models with links directed from causes to effects (see Pearl, 2000). However, although the goal of these networks is to reduce the complexity of purely probabilistic reasoning, realistic Bayesian models still require fairly complex computations, and they presuppose competencies in reasoning with numerical probabilities which seem unrealistic for untutored people (see Thagard, 2000, for a detailed critique of these models).

The aim of this paper is to introduce a more qualitatively oriented, connectionist constraint satisfaction model of causal reasoning and learning. Our model is inspired by Thagard's (2000) suggestion that constraint satisfaction models may qualitatively capture many insights underlying normative Bayesian network models in spite of the fact that constraint satisfaction models use computationally far simpler, and therefore psychologically more realistic, processes. The model differs from standard associationist learning models (e.g., Rescorla & Wagner, 1972) in that it is capable of expressing basic differences between causal models. Our model embodies a uniform mechanism of learning and reasoning, which assesses the fit between data and causal models. This architecture allows us to model a wide range of different tasks within a unified model, tasks which in the literature have so far been treated as separate, such as learning and hypothesis testing.

Constraint Satisfaction Models

Constraint satisfaction models (Thagard, 1989, 2000) aim at capturing qualitative aspects of reasoning. Their basic assumption is that people hold a set of interconnected beliefs.
The beliefs pose constraints on each other: they either support each other, contradict each other, or are unrelated. Coherence between the beliefs can be achieved by processes which attempt to honor these constraints.

Within a constraint satisfaction model, beliefs are represented as nodes which represent propositions (e.g., "A causes B"). The nodes are connected by symmetric relations. The numerical activation of a node indicates the strength of the belief in the proposition. A belief that is highly activated is held strongly; a belief that is negatively activated is rejected. The activation of a node depends on the activation of all other nodes with which it is connected. More precisely, the net input to a single node j from all other nodes i is defined as the weighted sum of the activations a of all related nodes (following Thagard, 1989, p. 466, eq. 5):

net_j = Σ_i w_ij a_i(t)    (1)

The weights w represent the strength of the connection between the beliefs. In our simulations, they are generally pre-set to default values which are either positive or negative and remain constant throughout the simulation. At the beginning of the simulations, the activations of the nodes representing hypotheses are set to a low default value. However, nodes representing empirical evidence are connected to a special activation node whose activation remains constant at 1.0. This architecture allows us to capture the intuition that more faith is put into empirical evidence than into theoretical hypotheses (see Thagard, 1989).

To update the activations in each cycle of the simulation, first the net input net_j to each node is computed using Equation 1. Second, the activation of each node is updated using the following equation (Thagard, 1989, p. 446, eq. 4):

a_j(t+1) = a_j(t)(1 − θ) + net_j(max − a_j(t))   if net_j > 0
a_j(t+1) = a_j(t)(1 − θ) + net_j(a_j(t) − min)   otherwise    (2)

In Equation 2, θ is a decay parameter that decrements the activity of each node in every cycle, min represents the minimum activation (−1) and max the maximum activation (+1). The
activations of all nodes are updated until a stable equilibrium is reached, which means that the activations of all nodes no longer change substantially. To derive quantitative predictions, it would be necessary to specify rules that map the final activations to different types of responses. This is an important goal which should be addressed in future research. In the present article we only derive ordinal, qualitative predictions from the model.

The Model

Following causal-model theory (Waldmann, 1996), we assume that people typically enter causal tasks with initial assumptions about the causal structure they are going to observe. Even though specific knowledge about causal relations may not always be available, people often bring to bear knowledge about abstract features of the models, such as the distinction between events that refer to potential causes and events that refer to potential effects. In virtually all psychological studies this information can be gleaned from the initial instructions and the materials (see Waldmann, 1996).

Figure 1 displays an example of how the model represents a causal model. The nodes represent either causal hypotheses or observable events. The causal hypothesis node at the top represents a structural causal hypothesis (H1), in this case the hypothesis that the three events e1, e2, and x form a common-effect structure with e1 and e2 as the two alternative causes and x as the common effect. The two nodes on the middle level refer to the two causal relations H2 and H3 that are part of the common-effect model with two causes and a single effect. The nodes on the lowest level refer to all patterns of events that can be observed with three events (a dot represents "and"). On the left side, the nodes represent patterns of three events, in the middle pairs, and on the right side single events. Not only the present but also the corresponding absent events are represented within this model (for example, ~x). The links connecting the nodes represent belief
relations. Thus, they do not represent probabilities or causal relations as in Bayesian models. There are two different kinds of connections between the nodes: solid lines indicate excitatory links, dashed lines inhibitory links.

How are the connections defined? A connection is positive if the propositions support each other. For example, if all three events are present, the observation is in accordance with both hypotheses H2 and H3. This pattern might be observed if both e1 and e2 cause x. Therefore the evidence node e1.e2.x is positively connected to H2 and H3. In general, a hypothesis is positively connected to an evidence node if the events mentioned in the hypothesis are either all present or all absent. If this is not the case, that is, if one of the relevant events specified in the hypothesis is absent, the link is assigned the negative default value. Exploratory studies have shown that participants share a common intuition about whether a certain pattern of events supports or contradicts a hypothesis (Hagmayer & Waldmann, 2001). The assigned weights mirror these general intuitions. The weights of the links remain the same throughout the simulations. Figure 1 does not display the special activation node. This node was pre-set to 1.0 and attached to the event nodes describing present events in the respective experiment.

Figure 1: [caption truncated] ...and reasoning. See text for further explanations.

In Figure 1, the dashed line between the hypotheses H2 and H3, which signifies an inhibitory link, is of special interest. The network represents a common-effect structure. This means that there are two causes, e1 and e2, which compete in explaining the occurrence of effect x. Therefore the two hypotheses referring to the individual causal relations have to be connected by an inhibitory link (see also Thagard, 2000). However, both hypotheses H2 and H3 are positively connected to the structural hypothesis H1. By contrast, a common-cause structure is represented slightly differently.
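As a concrete illustration, the update scheme of Equations 1 and 2 and the link-sign convention just described can be sketched in a few lines. This is a minimal sketch using the ±0.05 default weights reported for the simulations below; the function names and the dictionary encoding are our own, not part of the model:

```python
# Equation 1: net input to node j is the weighted sum of the
# activations of the nodes connected to it.
def net_input(j, weights, acts):
    return sum(w * acts[i] for (i, w) in weights.get(j, ()))

# Equation 2: activation update with decay theta, min = -1, max = +1.
def update_activation(a_j, net, theta=0.05, amin=-1.0, amax=1.0):
    if net > 0:
        return a_j * (1 - theta) + net * (amax - a_j)
    return a_j * (1 - theta) + net * (a_j - amin)

# Link-sign convention: a causal hypothesis (e.g. H2: e1 -> x) is
# excitatorily linked to an event-pattern node if the events it
# mentions are all present or all absent, and inhibitorily otherwise.
def link_weight(hypothesis_events, pattern, exc=0.05, inh=-0.05):
    states = [pattern[e] for e in hypothesis_events]
    return exc if all(states) or not any(states) else inh

# H2: e1 -> x, checked against e1.e2.x (all relevant events present):
print(link_weight(("e1", "x"), {"e1": True, "e2": True, "x": True}))    # 0.05
# ...and against e1.~e2.~x (e1 present but x absent, contradicting H2):
print(link_weight(("e1", "x"), {"e1": True, "e2": False, "x": False}))  # -0.05
# One update step for a hypothesis node starting at the 0.01 default:
print(round(update_activation(0.01, 0.05), 4))                          # 0.059
```

A node that keeps receiving the same positive net input converges to the activation at which decay and excitation balance, which is what the stable equilibrium mentioned above amounts to.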
In such a common-cause structure, event x would be the common cause of the two effects e1 and e2 (i.e., H1: x → e1.e2). A model of this structure looks almost identical to the one for the common-effect structure in Figure 1. There is only one very important difference: because there is no competition between the effects of a common cause, a common-cause model has no inhibitory link between H2 and H3. All other nodes and links in the two models are identical.

Both the common-effect and the common-cause model were implemented using Microsoft Excel. Default values were adopted from the literature if not indicated otherwise (Thagard, 1989). Initial activations were set to 0.01, inhibitory links between nodes to −0.05, and excitatory links to +0.05. The inhibitory link between H2 and H3 within the common-effect model was pre-set to a value of −0.20. The special activation node was attached to all evidence nodes. The additional activation was divided among the evidence nodes according to the relative frequency of the evidence in the learning input. This principle captures the intuition that more faith is put into evidence that is observed more frequently.

Evaluation

In order to evaluate the proposed constraint satisfaction model, different tasks and paradigms from the literature on causal learning and reasoning were modeled. One of our main goals was to show that the same architecture can be used to simulate different types of tasks. However, different tasks required different sections of the model depicted in Figure 1. We used two principles for the construction of task-specific networks. The first principle is that we only included the event nodes that corresponded to the event patterns observed in the learning phase or that corresponded to events that have to be evaluated or predicted in the test phase. For example, to model a task in which only event triples were shown, only the event nodes on the left side of the event layer in Figure 1 would be incorporated in the model. However, if the task following the learning
phase required the prediction of single events, the corresponding nodes for single events would have to be added to the event layer. The second principle is that only those hypothesis nodes were included that represent hypotheses that are given or suggested to participants. These two principles ensure that for each paradigm a minimally sufficient sub-model of the complete model is instantiated.

Test 1: Asymmetries of Blocking

Blocking belongs to the central phenomena observed in associative learning which, among other findings, have motivated learning rules that embody cue competition (e.g., Rescorla & Wagner, 1972). A typical blocking experiment consists of two learning phases. In Phase 1 participants learn that two events e1 and x are either both present or both absent. In Phase 2 a third event e2 is introduced. Now all three events are either present or absent. In both phases, events e1 and e2 represent cues and x the outcome to be predicted. Associative theories generally predict a blocking effect, which means that participants should be reluctant about the causal status of the redundant event e2 that has been constantly paired with the predictive event e1 from Phase 1. This prediction has come under attack by recent findings that have shown that the blocking effect depends on the causal model learners bring to bear on the task (see Waldmann, 1996, 2000). If participants assume that e1 and e2 are the causes of x (common-effect structure), a blocking effect can be seen. In contrast, if participants assume that e1 and e2 are the collateral effects of the common cause x (common-cause structure), no blocking of e2 is observed. In this condition, learners tend to view both e1 and e2 as equally valid diagnostic cues of x.

To model blocking, we used a network that was extended after Phase 1. In Phase 1, the net consisted of a hypothesis node (H2) and the nodes for patterns of two events (e1, x). After Phase 1, the final activation of the hypothesis node was transferred to Phase 2. In Phase 2, the network consisted of two nodes for
the two causal hypotheses (H2 and H3), and nodes that represented the patterns of three events that participants observed within the learning phase. Furthermore, the node H1 was included, which, depending on the condition, either coded a common-cause or a common-effect hypothesis. The nodes for the event pairs from Phase 1 were removed.

Figure 2 shows the activation of the two hypotheses referring to the causal relations in Phases 1 and 2. Figure 2A depicts the activation for the common-cause model and Figure 2B for the common-effect model.

Figure 2A: Simulation of a blocking paradigm (Test 1). Activation of hypothesis nodes for a common-cause model. The solid line represents the activation of H2: x → e1, the dotted line of H3: x → e2. Phase 2 started at the 101st cycle.

The model shows no blocking for event e2 in the context of the common-cause model. It quickly acquires the belief that there is a causal connection between x and e2.

Figure 2B: Simulation of a blocking paradigm (Test 1). Activation of hypothesis nodes for a common-effect structure. The upper line represents the activation of H2: e1 → x, the lower line of H3: e2 → x. Phase 2 started at the 101st cycle.

For the common-effect model the simulation shows blocking of the second cause, that is, the second hypothesis is believed to be wrong. Thus, the simulations closely correspond to the empirical finding that blocking interacts with the structure of the causal model used to interpret the learning data.

Test 2: Testing Complex Causal Hypotheses

The first test of the model used a phenomenon from the literature on causal learning. We now want to turn to a completely different paradigm: hypothesis testing. In experiments on causal learning participants are typically instructed about a causal structure, and the task is to learn about the causal relations within the structure. They are not asked whether they believe that the structure is supported by the learning data or not. In recent experiments (Hagmayer, 2001; Hagmayer & Waldmann, 2001) we gave participants the task to test a
complex causal model hypothesis. For example, we asked them whether three observed events support a common-cause hypothesis or not. Normatively this task should be solved by testing the implications of the given structural hypothesis. For example, a common-cause model implies a (spurious) correlation between the effects of the single common cause. In contrast, a common-effect structure does not imply a correlation between the different causes of the joint effect. Unless there is an additional hidden event that causes a correlation among the causes, they should be uncorrelated.

In the experiment, participants were given data which either displayed a correlation between all three events (data set 1) or correlations between e1-x and e2-x only, that is, e1 and e2 were marginally independent in this data (data set 2). Data set 1 was consistent with a common-cause hypothesis, which implies correlations between all three events. In contrast, data set 2 favors the common-effect hypothesis with x as the effect and e1 and e2 as independent causes. However, in a series of experiments we found that participants were not aware of these differential structural implications when testing the two hypotheses. Instead they checked whether the individual causal relations within the complex structures held (e.g., e1-x). Thus, participants dismissed a hypothesis if one of the assumed causal links was missing. However, they proved unable to distinguish between the common-cause and the common-effect structure when both structures specified causal connections between the same events (regardless of the direction).

To model this task we used the model without the nodes for event pairs and individual events. The special activation node was connected to the patterns of three events. As before, the activation of the individual event patterns was proportional to the frequency of the respective pattern in the data. To test the model, we used three sets of data. Either all three events were correlated (data set 1), e1 and x, and e2 and x were
correlated and e1 and e2 were marginally independent (data set 2), or e1 and x, and e1 and e2 were correlated, and e2 and x were uncorrelated (data set 3). As competing hypotheses we either used a common-cause model with x as the common cause, or a common-effect model with x as the common effect. Figure 3 shows the activation of the node H1, which represents the hypothesis that the respective causal model underlies the observed data.

Figure 3A shows the results for the common-cause hypothesis, Figure 3B for the common-effect hypothesis. The results clearly mirror the judgments of our participants. Whenever the two assumed causal relations within either causal model were represented in the data, the structural hypothesis was accepted (solid lines); if one link was missing, the hypothesis was rejected (dotted line).

One slight deviation from our empirical findings was observed. In early cycles there seems to be an effect favoring the common-effect hypothesis with data consistent with this hypothesis. However, the difference between the hypotheses is relatively small and further decreases after 100 updating cycles. Thus, the results are consistent with participants' insensitivity to structural implications of causal models in hypothesis-testing tasks.

Figure 3A: Activation of hypothesis node H1 for a common-cause model (Test 2). The solid lines represent the activations for data sets 1 and 2, the dotted line the activations for data set 3.

Figure 3B: Activation of hypothesis node H1 for a common-effect model (Test 2). The solid lines represent the activations for data sets 1 and 2, the dashed line at the bottom the activations for data set 3.

Why does the model not differentiate between the two causal structures? The reason is that it is assumed that complex structural hypotheses are not directly linked to empirical evidence. In our model empirical evidence is connected to the hypotheses that represent individual causal links, which in turn are linked to more complex model-related hypotheses. This architecture makes it possible to
model learning and hypothesis testing within the same model. It also seems to capture the empirical finding that participants can easily decide whether a certain pattern of events supports a simple causal hypothesis, but have a hard time relating event patterns to complex causal hypotheses.

Test 3: Causal Inferences

In the previous section we have mentioned studies showing insensitivity to spurious relations implied by causal models. A last test for our model is a task in which participants have to predict other events under the assumption that a certain causal model holds. Interestingly, we have empirically demonstrated sensitivity to structural implications of causal models in this more implicit task (Hagmayer & Waldmann, 2000). In this task participants do not have to evaluate the validity of a causal model in light of observed evidence but rather are instructed to use causal models when predicting individual events. In our experiments we presented participants with two learning phases in which they learned about two causal relations one at a time. Thus, in each phase participants only received information about the presence and absence of two events (x and e1, or x and e2). They never saw patterns of all three events during the experiment. The initial instructions described the two causal relations, which were identically presented across conditions, either as parts of a common-cause model with x as the cause or as parts of a common-effect model with x as the effect. After participants had learned about the two causal relations, we asked them to predict whether e1 and e2 were present given that x was present. We found that participants were more likely to predict that both e1 and e2 would co-occur when x was viewed as the common cause than when it was seen as a common effect. Thus, in this more implicit task the predictions expressed knowledge about structural implications of causal models. In particular, the patterns the participants predicted embodied a spurious correlation among the
effects of a common cause, whereas the causes of a common effect tended to be marginally uncorrelated in the predicted patterns. By contrast, in a more direct task which required explicit judgments about correlations, no such sensitivity was observed, which is consistent with the results reported in the previous section.

To model this experiment we eventually used the complete network depicted in Figure 1, which was successively augmented according to our two principles. In Phase 1, the learning phase, patterns of two events were connected to the hypotheses H2 and H3. Depending on the learning condition, these two hypotheses were either linked to a common-cause or a common-effect hypothesis (H1). The activations of the hypothesis nodes at the end of Phase 1 were used as initial activation values in Phase 2. In Phase 2 the model consisted of the three hypothesis nodes, the nodes for patterns of three events, and the nodes representing single events. The single-event nodes were included because the task required the prediction of individual events. The special activation node was now attached to event x. The model then predicted the other two individual events and the patterns of all three events.

The model quickly learned the causal relations during Phase 1 of the experiment. Figure 4 depicts the results of Phase 2. Figure 4A shows the predictions of the model for the condition in which participants assumed a common-cause model, Figure 4B shows the results for the common-effect condition. The results of the simulations are consistent with the behavior we have observed in our participants.
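The Phase 2 prediction step just described (the special activation node attached to event x, with the single-event nodes read off as predictions) can be sketched as follows. This is a compact, self-contained sketch with a simplified stand-in network: the node names, starting activations for the learned hypotheses, and helper functions are our own, not the exact simulation reported here.

```python
# One synchronous update cycle over symmetric links, following the
# activation rule of Equation 2 (theta = 0.05, min = -1, max = +1).
def step(acts, links, theta=0.05):
    new = {}
    for j in acts:
        net = sum(w * acts[i] for (i, k), w in links.items() if k == j) \
            + sum(w * acts[k] for (i, k), w in links.items() if i == j)
        if net > 0:
            new[j] = acts[j] * (1 - theta) + net * (1 - acts[j])
        else:
            new[j] = acts[j] * (1 - theta) + net * (acts[j] + 1)
    return new

def predict(structure):
    """Clamp x via the special node and read off e1/e2 as predictions."""
    # H2 and H3 start with a high activation, standing in for the
    # causal relations already acquired in Phase 1.
    links = {("H2", "e1"): 0.05, ("H3", "e2"): 0.05,
             ("special", "x"): 0.05, ("x", "e1"): 0.05, ("x", "e2"): 0.05}
    if structure == "common-effect":
        links[("H2", "H3")] = -0.20   # competing causes inhibit each other
    acts = {"H2": 0.5, "H3": 0.5, "e1": 0.01, "e2": 0.01,
            "x": 0.01, "special": 1.0}
    for _ in range(100):
        acts = step(acts, links)
        acts["special"] = 1.0         # special node stays clamped at 1.0
    return acts["e1"], acts["e2"]

print(predict("common-cause"))
print(predict("common-effect"))
```

Qualitatively, the sketch reproduces the reported pattern: with the inhibitory link between the competing cause hypotheses in place, the single-event activations of e1 and e2 end up lower than under the common-cause structure, where the presence of x activates both effects.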
When the model assumes a common-cause model, the presence of x leads to a high positive activation of the two effects e1 and e2. This means that the model tends to prefer the prediction that the two effects of a common cause co-occur. In contrast, for the common-effect structure the model does not show such a preference. In this condition, both causes or either one of them equally qualify as possible explanations of the observed effect. This means that our model, similar to the one Thagard (2000) has proposed, tends to "explain away" the second cause when one of the competing causes is present. This is a consequence of the competition between the two causal hypotheses H2 and H3.

Figure 4A: Implicit causal inferences (Test 3). Activation of single-event nodes for the common-cause model: event x (top), events e1 and e2 (bottom).

Figure 4B: Implicit causal inferences (Test 3). Activation of single-event nodes for the common-effect model: event x (top), event e1 (middle), event e2 (bottom).

Discussion

A constraint satisfaction model of causal learning and reasoning was presented in this paper that extends the architecture and scope of the model proposed by Thagard (2000).
Thagard's model focuses upon causal explanations of singular events and belief updating. Our aim was to create a model that makes it possible to model both learning and reasoning within causal models. The model was successfully applied to three different tasks. It modeled people's sensitivity to structural implications of causal models in tasks involving learning and predictions, whereas the same model also predicted that people would fail in tasks which require explicit knowledge of the statistical implications of causal models.

One question that might be raised is whether the proposed model really captures learning or just models causal judgment. In our view, the concept of learning does not necessarily imply incremental updating of associative weights. Our model embodies a hypothesis-testing approach to learning, which assumes that learners modify the strength of belief in deterministic causal hypotheses based on probabilistic learning input. This view also underlies recent Bayesian models of causality (Pearl, 2000). In the model, the activation (i.e., degree of belief) of the hypothesis nodes is modified based on the learning input. This way the model is capable of modeling trial-by-trial learning as well as learning based on summary data within the same architecture.

Thus far we have pre-set the weights connecting evidence and hypotheses. In our view, the assigned values reflect everyday qualitative intuitions about whether an event pattern supports or contradicts a causal hypothesis. These weights remained constant throughout the simulations. Despite this restriction the model successfully predicted empirical phenomena in learning and reasoning.
However, pre-setting these weights is not a necessary feature of the model. It is possible to add a learning component that acquires knowledge about the relation between event patterns and hypotheses based on feedback in a prior learning phase (see Wang et al., 1998, for a model adding associative learning to ECHO).

In summary, our constraint satisfaction model seems to offer a promising new way to model causal learning and reasoning. It is capable of modeling phenomena in a wide range of different tasks, which thus far have been treated as separate in the literature. Relative to normative Bayesian models, our connectionist model makes it possible to simulate a large number of different tasks and different phenomena while using fairly simple computational routines. It proved capable of capturing a number of recent phenomena that have presented problems to extant models of causal cognition. More tests of the model clearly seem warranted.

References

Hagmayer, Y. (2001). Denken mit und über Kausalmodelle [Reasoning with and about causal models]. Unpublished doctoral dissertation, University of Göttingen.

Hagmayer, Y., & Waldmann, M. R. (2000). Simulating causal models: The way to structural sensitivity. In L. R. Gleitman & A. K. Joshi (Eds.), Proceedings of the Twenty-Second Annual Conference of the Cognitive Science Society (pp. 214-219). Mahwah, NJ: Erlbaum.

Hagmayer, Y., & Waldmann, M. R. (2001). Testing complex causal hypotheses. In M. May & U. Oestermeier (Eds.), Interdisciplinary perspectives on causation (pp. 59-80). Bern: Books on Demand.

Pearl, J. (2000). Causality: Models, reasoning, and inference. Cambridge: Cambridge University Press.

Rescorla, R. A., & Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and non-reinforcement. In A. H. Black & W. F. Prokasy (Eds.), Classical conditioning II: Current research and theory (pp. 64-99). New York: Appleton-Century-Crofts.

Shanks, D. R., Holyoak, K. J., & Medin, D. L. (Eds.) (1996). The psychology of learning and motivation, Vol. 34: Causal learning. San Diego: Academic Press.
Thagard, P. (1989). Explanatory coherence. Behavioral and Brain Sciences, 12, 435-467.

Thagard, P. (2000). Coherence in thought and action. Cambridge, MA: MIT Press.

Waldmann, M. R. (1996). Knowledge-based causal induction. In D. R. Shanks, K. J. Holyoak, & D. L. Medin (Eds.), The psychology of learning and motivation, Vol. 34: Causal learning (pp. 47-88). San Diego: Academic Press.

Waldmann, M. R. (2000). Competition among causes but not effects in predictive and diagnostic learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 53-76.

Wang, H., Johnson, T. R., & Zhang, J. (1998). UEcho: A model of uncertainty management in human abductive reasoning. In M. A. Gernsbacher & S. R. Derry (Eds.), Proceedings of the Twentieth Annual Conference of the Cognitive Science Society (pp. 1113-1118). Mahwah, NJ: Erlbaum.
Extended Abstract: Dynamic Distributed Constraint Satisfaction for Resource Allocation

Extended Abstract: Dynamic Distributed Constraint Satisfaction for Resource Allocation

Pragnesh Jay Modi
University of Southern California / Information Sciences Institute
4676 Admiralty Way, Marina del Rey, CA 90292, USA
modi,@

1 Introduction

Distributed resource allocation is a general problem in which a set of agents must intelligently assign their resources to a set of tasks such that all tasks are performed with respect to certain criteria. This problem arises in many real-world domains such as distributed sensor networks [5], disaster rescue [3], hospital scheduling [1], and others. However, despite the variety of approaches proposed for distributed resource allocation, a systematic formalization of the problem, explaining the different sources of difficulty, and a formal explanation of the strengths and limitations of key approaches are missing.

We propose a formalization of distributed resource allocation that is expressive enough to represent both dynamic and distributed aspects of the problem. These two aspects present some key difficulties. First, a distributed situation results in agents obtaining only local information but facing global ambiguity: an agent may know the results of its local operations, but it may not know the global task and hence may not know what operations others should perform. Second, the situation is dynamic, so a solution to the resource allocation problem at one time may become unsuccessful when the underlying tasks have changed. So the agents must continuously monitor the quality of the solution and must have a way to express such changes in the problem. In order to address this type of resource allocation problem, the paper defines the notion of a Dynamic Distributed Constraint Satisfaction Problem (DDCSP).

The central contribution of the paper is a reusable, generalized mapping from distributed resource allocation to DDCSP. This mapping is proven to correctly perform resource allocation problems of specific difficulty. Ideally, our formalization may enable
researchers to understand the difficulty of their resource allocation problem and choose a suitable mapping using DDCSP, with automatic guarantees for the correctness of the solution.

2 Dynamic DCSP

A Constraint Satisfaction Problem (CSP) is commonly defined by a set of variables, each associated with a finite domain, and a set of constraints on the values of the variables. A solution is a value assignment for the variables which satisfies all the constraints. A distributed CSP (DCSP) is a CSP in which variables and constraints are distributed among multiple agents. Each variable belongs to an agent. A constraint defined only on variables belonging to a single agent is called a local constraint. In contrast, an external constraint involves variables of different agents. Solving a DCSP requires that agents not only solve their local constraints, but also communicate with other agents to satisfy external constraints.

DCSP assumes that the set of constraints is fixed in advance. This assumption is problematic when we attempt to apply DCSP to domains where features of the environment are not known in advance and must be sensed at run-time. For example, in distributed sensor networks, agents do not know where the targets will appear. This makes it difficult to specify the DCSP constraints in advance. Rather, we desire agents to sense the environment and then activate or deactivate constraints depending on the result of the sensing action. We formalize this idea next.

We take the definition of DCSP one step further by defining Dynamic DCSP (DDCSP). DDCSP allows constraints to be conditional on some predicate P. More specifically, a dynamic constraint is given by a tuple (P, C), where P is an arbitrary predicate that is continuously evaluated by an agent and C is a familiar constraint in DCSP. When P is true, C must be satisfied in any DCSP solution. When P is false, C may be violated.
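A dynamic constraint (P, C) as just defined can be sketched in a few lines. This is a minimal illustration; the class, the example predicate, and the sensor-style scenario are our own stand-ins, not part of the paper's formalization:

```python
# A dynamic constraint (P, C): P is a predicate over the sensed
# environment that is re-evaluated continuously; the constraint C
# only needs to hold while P is true.
class DynamicConstraint:
    def __init__(self, predicate, constraint):
        self.predicate = predicate      # P: environment -> bool
        self.constraint = constraint    # C: assignment -> bool

    def satisfied(self, environment, assignment):
        """C must hold in any solution only when P is true."""
        if not self.predicate(environment):
            return True                 # P false: C may be violated
        return self.constraint(assignment)

# Hypothetical example: "if a target is sensed in my sector,
# my local variable must take the value 'track'".
dc = DynamicConstraint(
    predicate=lambda env: env["target_in_sector"],
    constraint=lambda asg: asg["mode"] == "track",
)
print(dc.satisfied({"target_in_sector": False}, {"mode": "idle"}))   # True
print(dc.satisfied({"target_in_sector": True}, {"mode": "idle"}))    # False
```

Note how deactivating P can only enlarge the solution set, which is why, as the text observes next, deleting a constraint requires no extra computation while adding one may force a new search.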
An important consequence of Dynamic DCSP is that agents no longer terminate when they reach a stable state. They must continue to monitor P, waiting to see if it changes. If its value changes, they may be required to search for a new solution. Note that a solution when P is true is also a solution when P is false, so the deletion of a constraint does not require any extra computation. However, the converse does not hold. When a constraint is added to the problem, agents may be forced to compute a new solution. In this work, we only need to address a restricted form of DDCSP, i.e., it is only necessary that local constraints be dynamic.

AWC [6] is a sound and complete algorithm for solving DCSPs. An agent with local variable x_i chooses a value for x_i and sends this value to agents with whom it has external constraints. It then waits for and responds to messages. When the agent receives a variable value (x_j = v_j) from another agent, this value is stored in an AgentView. Therefore, an AgentView is a set of pairs {(x_j, v_j), (x_k, v_k), ...}. Intuitively, the AgentView stores the current values of non-local variables. A subset of an AgentView is a NoGood if an agent cannot find a value for its local variable that satisfies all constraints. For example, an agent with variable x_i may find that the set {(x_j, v_j), (x_k, v_k)} is a NoGood because, given these values for x_j and x_k, it cannot find a value for x_i that satisfies all of its constraints. This means that these value assignments cannot be part of any solution. In this case, the agent will request that the others change their variable values, and the search for a solution continues. To guarantee completeness, a discovered NoGood is stored so that that assignment is not considered in the future.

The most straightforward way to attempt to deal with dynamism in DCSP is to consider AWC as a subroutine that is invoked anew every time a constraint is added.
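The AgentView and NoGood bookkeeping described above can be sketched as follows. This is a minimal sketch, not AWC itself: the function, the toy not-equal constraint, and the variable names are illustrative stand-ins.

```python
# An AgentView is a set of (variable, value) pairs received from
# other agents. A NoGood records a combination of assignments under
# which no local value is consistent.
def find_value(domain, agent_view, consistent):
    """Try each local value against the AgentView; return
    (value, None) on success or (None, nogood) on failure."""
    for v in domain:
        if consistent(v, agent_view):
            return v, None
    # No value works: record the AgentView as a NoGood so this
    # combination of assignments is never considered again.
    return None, frozenset(agent_view.items())

# Toy constraint for local variable x3: differ from both x1 and x2.
consistent = lambda v, view: all(v != w for w in view.values())
view = {"x1": 0, "x2": 1}
print(find_value([0, 1], view, consistent))     # domain exhausted -> NoGood
print(find_value([0, 1, 2], view, consistent))  # value 2 is consistent
```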
Unfortunately, in many domains such as ours, where the problem is dynamic but does not change drastically, starting from scratch may be prohibitively inefficient. Another option, and the one that we adopt, is for agents to continue their computation even as local constraints change asynchronously. The potential problem with this approach is that when constraints are removed, a stored NoGood may now become part of a solution. We solve this problem by requiring agents to store their own variable values as part of non-empty NoGoods. For example, if an agent with variable x_i finds that a value d_i does not satisfy all constraints given the AgentView {(x_j, d_j), (x_k, d_k)}, it will store the set {(x_i, d_i), (x_j, d_j), (x_k, d_k)} as a NoGood. With this modification to AWC, NoGoods remain "no good" even as local constraints change. Let us call this modified algorithm Locally-Dynamic AWC (LD-AWC) and the modified NoGoods "LD-NoGoods" in order to distinguish them from the original AWC NoGoods.

Lemma I: LD-AWC is sound and complete.

The soundness of LD-AWC follows from the soundness of AWC. The completeness of AWC is guaranteed by the recording of NoGoods. A NoGood logically represents a set of assignments that leads to a contradiction. We need to show that this invariant is maintained in LD-NoGoods. An LD-NoGood is a superset of some non-empty AWC NoGood, and since every superset of an AWC NoGood is no good, the invariant is true when an LD-NoGood is first recorded. The only problem that remains is the possibility that an LD-NoGood may later become good due to the dynamism of local constraints. An LD-NoGood contains a specific value of the local variable that is no good, but never contains a local variable exclusively. Therefore, it logically holds information about external constraints only. Since external constraints are not allowed to be dynamic in LD-AWC, LD-NoGoods remain valid even in the face of dynamic local constraints.
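The LD-NoGood modification can be sketched as follows (a hypothetical encoding; variable names and helpers are illustrative): the agent always records its own (variable, value) pair alongside the external assignments, so the stored set rules out exactly the combinations forbidden by external constraints:

```python
# Minimal sketch of the LD-NoGood idea: when an agent finds that its value
# is inconsistent with its AgentView, it records its own (variable, value)
# pair together with the external assignments.

def ld_nogood(own_var, own_value, agent_view):
    """agent_view: dict of external variable -> value."""
    ng = dict(agent_view)
    ng[own_var] = own_value          # always include the local assignment
    return frozenset(ng.items())

def violated(nogood, assignment):
    # A stored NoGood rules out any assignment that contains it as a subset.
    return nogood <= frozenset(assignment.items())

ng = ld_nogood("x3", 2, {"x1": 0, "x2": 1})
assert ("x3", 2) in ng               # the local value is part of the NoGood

full = {"x1": 0, "x2": 1, "x3": 2, "x4": 5}
```

Because the local value is embedded in the NoGood, changing a local constraint never invalidates a stored LD-NoGood, which is the invariant used in Lemma I.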
Thus the completeness of LD-AWC is guaranteed.

3 Formalization of Resource Allocation

A Distributed Resource Allocation Problem is a structure ⟨Ag, Ω, Θ⟩ where:

– Ag is a set of agents, Ag = {A_1, A_2, ..., A_n}.
– Ω = {O_1^1, ..., O_p^i, ...} is a set of operations, where operation O_p^i denotes the p'th operation of agent A_i. Let Op(A_i) denote the set of operations of A_i. Operations in Op(A_i) are mutually exclusive; an agent can only perform one operation at a time.
– Θ is a set of tasks, where a task T is a collection of sets of operations that satisfy the following properties: (i) each member of T is a subset of Ω; (ii) T is nonempty and each member of T is nonempty; (iii) the members of T are pairwise non-inclusive, i.e., no member of T is a subset of another. The members of T are called minimal sets. Two minimal sets conflict if they contain operations belonging to the same agent.

Intuitively, a task is defined by the operations that agents must perform in order to complete it. There may be alternative sets of operations that can complete a given task. Each such set is a minimal set. (Property (iii) requires that each set of operations in a task be minimal in the sense that no other set is a subset of it.) A solution to a resource allocation problem, then, involves choosing a minimal set for each task such that the minimal sets do not conflict. In this way, when the agents perform the operations in those minimal sets, all tasks are successfully completed.

For each task T, we use Op(T) to denote the union of all the minimal sets of T. More formally:

– For every T ∈ Θ, Op(T) = the union of the minimal sets in T.

We use T(O) to denote the set of tasks that include operation O:

– For every O ∈ Ω, T(O) = {T ∈ Θ | O ∈ Op(T)}.

We require that, for every O ∈ Ω, |T(O)| > 0. That is, every operation should serve some task. We call a resource allocation problem unambiguous if, for every O ∈ Ω, |T(O)| = 1, and ambiguous otherwise.

All the tasks in Θ are not always present. We use Θ_current to denote the set of tasks that are currently present. This set is determined by the environment. Agents can execute their operations at any time, but the success of an operation is determined by the set of tasks that are currently present. The following two definitions formalize this interface with the environment.

– Definition 1: If an operation O ∈ Ω is executed and there exists T ∈ Θ_current such that O ∈ Op(T), then O is said to succeed. A task is performed when all
the operations in some minimal set succeed. More formally:

– Definition 2: A task T ∈ Θ_current is performed iff there exists a minimal set t ∈ T such that all the operations in t succeed.

We call a resource allocation problem static if Θ_current is constant over time, and dynamic otherwise. The following assumption states that a task does not involve two operations from the same agent, since a task which violates this assumption cannot be performed.

– Assumption: for every task T ∈ Θ and every agent A_i ∈ Ag, |Op(A_i) ∩ Op(T)| ≤ 1.

Agents must somehow be informed of the set of current tasks. The notification procedure is outside of this formalism. Thus, the following assumption states that at least one agent is notified that a task is present, through the success of one of its operations.

– Notification assumption: for every T ∈ Θ_current, there exists an operation O ∈ Op(T) such that O is executed and succeeds.

We now state some definitions that will allow us to categorize a given resource allocation problem and analyze its complexity. We define two types of conflict-freedom to denote tasks that can be performed concurrently. The Strongly Conflict Free condition implies that all minimal sets from the tasks are non-conflicting.

Definition 7: A resource allocation problem is called strongly conflict free (SCF) if, for all T, T' ∈ Θ_current, the following statement is true: if T ≠ T', then for all minimal sets t ∈ T, t' ∈ T' and all agents A_i ∈ Ag, |t ∩ Op(A_i)| + |t' ∩ Op(A_i)| ≤ 1.

The Weakly Conflict Free condition implies that there exists a choice of minimal sets from the tasks that are non-conflicting.

Definition 8: A resource allocation problem is called weakly conflict free (WCF) if, for all T, T' ∈ Θ_current, the following statement is true: if T ≠ T', then there exist minimal sets t ∈ T, t' ∈ T' s.t. for all agents A_i ∈ Ag, |t ∩ Op(A_i)| + |t' ∩ Op(A_i)| ≤ 1.

4 Generalized Mapping

In this section, we again map our formal model of the Resource Allocation Problem onto DDCSP. Our goal is to provide a general mapping so that any WCF resource allocation problem can be solved in a distributed manner by a set of agents applying this mapping. More formally, given a Resource Allocation Problem ⟨Ag, Ω, Θ⟩, the corresponding DDCSP is defined as follows:

– Variables: for every task T ∈ Θ and every agent A_i whose operations are included in Op(T), create a DDCSP variable T_i and assign it to agent A_i. In this way, each agent has a variable for each task in which its operations are included. The domain of each variable is given
by:

– Domain: for each variable T_i, create a value for each minimal set in T, plus a "NP" value (not present). The NP value allows agents to avoid assigning resources to tasks that are not present and thus do not need to be performed.

Next, we must constrain agents to assign non-NP values to variables only when an operation has succeeded, which indicates the presence of the corresponding task. However, in dynamic problems, an operation may succeed at some time and fail at another time, since tasks are dynamically added and removed from the current set of tasks to be performed. Thus, every variable is constrained by the following dynamic local constraints.

– Dynamic Local (Non-Binary) Constraint (LC1): for every agent A_i and every operation O ∈ Op(A_i), let B = {T_i | T ∈ T(O)}. Then the constraint is defined as a non-binary constraint over the variables in B as follows. P: O succeeds. C: there exists a variable in B whose value is not NP.

– Dynamic Local Constraint (LC2): for every agent A_i and every operation O ∈ Op(A_i), let the constraint be defined on each variable T_i with T ∈ T(O) as follows. P: O does not succeed. C: T_i = NP.

The truth value of P is not known in advance. Agents must execute their operations and, based on the result, locally determine whether C needs to be satisfied. In dynamic problems, where the set of current tasks changes over time, the truth value of P will change, and hence the corresponding DDCSP will also be dynamic. We now define the constraint that defines a valid allocation of resources.

– Static Local Constraint (LC3): within an agent, the value chosen for one variable cannot conflict with the minimal set chosen for another of the agent's variables; NP does not conflict with anything.

Finally, the External Constraint (EC) requires agents to agree on a particular allocation.

We will now prove that our mapping can be used to solve any given WCF Resource Allocation Problem. The first theorem states that our DDCSP always has a solution; the second states that if agents reach a solution, all current tasks are performed.

Theorem V: Given a WCF Resource Allocation Problem ⟨Ag, Ω, Θ⟩, there always exists a solution to the corresponding DDCSP. Proof omitted.

Theorem VI: Given a WCF Resource Allocation Problem ⟨Ag, Ω, Θ⟩ and the corresponding
DDCSP, if an assignment of values to variables in the DDCSP is a solution, then all tasks in Θ_current are performed. Proof omitted.

5 Related Work

In terms of related work, there is significant research in the area of distributed resource allocation; however, a formalization of the general problem in distributed settings is yet to be forthcoming. Our work takes a step in this direction and provides a novel and general DDCSP mapping, with proven guarantees of performance. Some researchers have focused on formalizing resource allocation as a centralized CSP, where the issue of ambiguity does not arise [2]. The fact that resource allocation is distributed, and thus ambiguity must be dealt with, is a main component of our work. Furthermore, we provide a mapping of the resource allocation problem to DDCSP and prove its correctness, an issue not addressed in previous work. The Dynamic Constraint Satisfaction Problem has been studied in the centralized case by [4]; however, there is no distribution or ambiguity during the problem-solving process. The work presented here differs in that we focus on distributed resource allocation, its formalization and its mapping to DDCSP. Indeed, in the future, our formalization may enable researchers to understand the difficulty of their resource allocation problem and choose a suitable mapping using DDCSP, with automatic guarantees for correctness of the solution.

References

1. K. Decker and J. Li. Coordinated hospital patient scheduling. In ICMAS, 1998.
2. C. Frei and B. Faltings. Resource allocation in networks using abstraction and constraint satisfaction techniques. In Proc. of Constraint Programming, 1999.
3. Hiroaki Kitano. RoboCup Rescue: A grand challenge for multi-agent systems. In ICMAS, 2000.
4. S. Mittal and B. Falkenhainer. Dynamic constraint satisfaction problems. In AAAI, 1990.
5. Sanders. ECM challenge problem, /ants/ecm.htm. 2001.
6. M. Yokoo and K. Hirayama. Distributed constraint satisfaction algorithm for complex local problems. In ICMAS, July 1998.
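Returning to the conflict-free conditions of the formalization above, here is a small illustrative sketch (the encoding and names are assumptions, not from the paper): operations are (agent, id) pairs, a task is a list of minimal sets, and two minimal sets conflict when they use operations of the same agent:

```python
from itertools import product, combinations

def conflict(ms1, ms2):
    # Two minimal sets conflict if they use operations of the same agent.
    agents1 = {agent for agent, _ in ms1}
    agents2 = {agent for agent, _ in ms2}
    return bool(agents1 & agents2)

def strongly_conflict_free(tasks):
    # Every pair of minimal sets drawn from distinct tasks is non-conflicting.
    return all(not conflict(m1, m2)
               for t1, t2 in combinations(tasks, 2)
               for m1, m2 in product(t1, t2))

def weakly_conflict_free(tasks):
    # Some choice of one minimal set per task is pairwise non-conflicting.
    return any(all(not conflict(m1, m2) for m1, m2 in combinations(choice, 2))
               for choice in product(*tasks))

t1 = [{("A1", "op1")}, {("A2", "op2")}]   # two alternative minimal sets
t2 = [{("A1", "op3")}]                     # needs agent A1
scf = strongly_conflict_free([t1, t2])     # False: t1's first set clashes with t2
wcf = weakly_conflict_free([t1, t2])       # True: pick the A2 alternative for t1
```

The example shows why WCF is the weaker condition: one alternative of t1 conflicts with t2 (so the problem is not SCF), yet a non-conflicting choice of minimal sets still exists.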
Example sentences with "a constraint that":

1. We need to have a constraint that all our products must comply with environmental standards.
2. The contract includes a constraint that the supplier must deliver the goods within a specified time frame.
3. The company's policy imposes a constraint that all employees must complete safety training within their first month of employment.
4. The law places a constraint that no person can be discriminated against in accessing public services.
5. The design includes a constraint that the structure must withstand a certain amount of pressure and stress.
6. The project has a constraint that all work must be completed within a specified time frame.
A Constraint-based approach for examination timetabling
2 Presentation of the problem

For each subject, there are several different examiners. The examiners are generally high school teachers. For a given day, a number of candidates (say nc) have to take the complete examination, composed of the four following subjects: 1. mathematics (nm examiners for the considered day), 2. physics and chemistry (np examiners), 3. foreign language: either English (ne examiners), German (ng examiners) or Spanish (ns examiners), 4. discussion about a text (nd examiners). The total number of examiners (for a given day) will be denoted nexa. The day is composed of p consecutive periods (a period is the time an examination lasts); in our case, there are p = 15 periods numbered #1, #2, ..., #15. The aim is to provide a schedule of the examinations for the day: for every subject, assign every candidate to one examiner during one period, so that a number of constraints (presented in section 2.2) are satisfied.

2.2 The constraints

There are two types of constraints: required constraints, which must necessarily be satisfied, and secondary constraints, which should preferably be satisfied but can be violated.
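A minimal sketch of this model (all names are assumptions, not from the paper): every (candidate, subject) pair gets an (examiner, period) assignment, and two plausible required constraints are checked, namely that a candidate sits only one exam per period and an examiner sees only one candidate per period:

```python
def consistent(schedule):
    """schedule: (candidate, subject) -> (examiner, period)."""
    cand_slots, exam_slots = set(), set()
    for (cand, _subj), (examiner, period) in schedule.items():
        # Required constraints: no candidate and no examiner is double-booked.
        if (cand, period) in cand_slots or (examiner, period) in exam_slots:
            return False
        cand_slots.add((cand, period))
        exam_slots.add((examiner, period))
    return True

ok = consistent({("c1", "math"): ("m1", 1),
                 ("c1", "physics"): ("p1", 2),
                 ("c2", "math"): ("m1", 2)})
clash = consistent({("c1", "math"): ("m1", 1),
                    ("c1", "physics"): ("p1", 1)})   # c1 double-booked
```

A full solver would search over examiners and the p = 15 periods for each candidate and subject; this checker only illustrates the shape of the required constraints.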
Essay guidance: "Constraints are also a kind of happiness"

Constraints can indeed be a form of happiness. By imposing limits on our actions and choices, constraints can help us to focus our attention, develop clarity, and appreciate the present moment.

One way in which constraints can foster happiness is by providing a sense of purpose and direction. When we have too many options, it can be difficult to know what to do or where to start. Constraints can help us to narrow down our choices and focus on what is truly important. By knowing what we are not allowed to do, we can more easily identify the things that we should do.

Constraints can also help us to develop a sense of appreciation for the things that we have. When we are constantly bombarded with new and tempting options, it can be easy to take our current circumstances for granted. Constraints can help us to break out of this cycle of dissatisfaction and appreciate the beauty and simplicity of what we already have. By knowing what we cannot have, we can more easily recognize the value of what we do have.

Finally, constraints can help us to live in the present moment. When we are constantly thinking about the future or the past, it is difficult to be truly present in the current moment. Constraints can help us to break out of this cycle of distraction and focus on the things that are happening right now. By knowing what we are not allowed to do, we can more easily let go of the things that we cannot control and focus on the things that we can.
Using Soft CSPs for Approximating Pareto-Optimal Solution Sets

Marc Torrens and Boi Faltings
Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
Marc.Torrens@epfl.ch, Boi.Faltings@epfl.ch

Abstract

We consider constraint satisfaction problems where solutions must be optimized according to multiple criteria. When the relative importance of different criteria cannot be quantified, there is no single optimal solution, but a possibly very large set of Pareto-optimal solutions. Computing this set completely is in general very costly and often infeasible in practical applications. We consider several methods that apply algorithms for soft CSP to this problem. We report on experiments, both on random and real problems, that show that such algorithms can compute surprisingly good approximations of the Pareto-optimal set. We also derive variants that further improve the performance.

Introduction

Constraint Satisfaction Problems (CSPs) (Tsang 1993; Kumar 1992) are ubiquitous in applications like configuration, planning, resource allocation, scheduling, timetabling and many others. A CSP is specified by a set of variables and a set of constraints among them. A solution to a CSP is a set of value assignments to all variables such that all constraints are satisfied. In many applications of constraint satisfaction, the objective is not only to find a solution satisfying the constraints, but also to optimize one or more preference criteria. Such problems occur in resource allocation, scheduling and configuration. As an example, we consider in particular electronic catalogs with configuration functionalities: a hard constraint satisfaction problem defines the available product configurations, for example different features of a PC; and the customer has different preference criteria that need to be optimized, for example price, certain functions, speed, etc. More precisely, we assume that optimization criteria are modeled as functions that map each solution into a numerical value that indicates to what extent the
criterion is violated; i.e., the lower the value, the better the solution. (Footnote: MAX-SOFT is a maximum valuation for soft constraints. By using a specific maximum valuation for soft constraints, we can easily differentiate between a hard violation and a soft violation.)

Figure 1: Example of solutions in a CSP with two preference criteria. The two coordinates show the values indicating the degrees to which the horizontal and vertical criteria are violated.

Definition 2. Given two solutions s and s' of a problem, s dominates s' if s is at least as good as s' according to every criterion and strictly better according to at least one.

The idea of Pareto-optimality (Pareto 1896-1987) is to consider all solutions which are not dominated by another one as potentially optimal:

Definition 3. Any solution which is not dominated by another is called Pareto-optimal.

Definition 4. Given an MCOP, the Pareto-optimal set is the set of solutions which are not dominated by any other one.

In Figure 1, the Pareto-optimal set is {1, 3, 4, 6}, as solution 7 is dominated by 4 and 6, 5 is dominated by 3 and 4, and 2 is dominated by 1. Pareto-optimal solutions are hard to compute because, unless the preference criteria involve only a few of the variables, the dominance relation cannot be evaluated on partial solutions.
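The dominance relation and the Pareto-optimal set can be sketched directly (criteria values are violation degrees, so lower is better; the helper names are illustrative):

```python
def dominates(s, t):
    """s, t: tuples of criterion values (violation degrees, lower is better).
    s dominates t if it is no worse everywhere and strictly better somewhere."""
    return all(a <= b for a, b in zip(s, t)) and any(a < b for a, b in zip(s, t))

def pareto_optimal_set(solutions):
    # Keep exactly the solutions not dominated by any other one.
    return [s for s in solutions
            if not any(dominates(t, s) for t in solutions if t != s)]

# A small example with two criteria: (3, 3) is dominated by (2, 2).
sols = [(1, 5), (2, 2), (5, 1), (3, 3)]
front = pareto_optimal_set(sols)
```

This brute-force filter is quadratic in the number of solutions, which is exactly why, as the text notes, computing the full Pareto-optimal set directly is impractical for large solution spaces.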
Research on better algorithms for Pareto-optimality is still ongoing (see, for example, Gavanelli (Gavanelli 2002)), but since it cannot escape this fundamental limitation, generating all Pareto-optimal solutions is likely to always remain computationally very hard. Therefore, Pareto-optimality has so far found little use in practice, despite the fact that it characterizes optimality in a more realistic way. This is especially true when the Pareto-optimal set must be computed very quickly, for example in interactive configuration applications (e.g., electronic catalogs). Another characteristic of the Pareto-optimal set is that it usually contains many solutions; in fact, all solutions could be Pareto-optimal. Thus, it will be necessary to choose which of them to present to the end user.

Electronic catalogs return a list of possibilities that fit the criteria in decreasing degree. In general, these solutions have been calculated assuming a certain weight distribution among constraints. It appears that listing a multitude of nearly optimal solutions is intended to compensate for the fact that these weights, and thus the optimality criterion, are usually not accurate for the particular user. For example, in Figure 1, if we assume that constraints have equal weight, the solutions would be ordered by their summed violation degrees, and the top four according to this weighting are also the Pareto-optimal ones. The questions we address in this paper follow from this observation: how closely do the top-ranked solutions generated by a scheme with known constraint weights, in particular MAX-CSP, approximate the Pareto-optimal set, and can we derive variations that cover this set better while maintaining efficiency? We have performed experiments in the domain of configuration problems that indicate that MAX-CSP can indeed provide a surprisingly close approximation of the Pareto-optimal set both in real settings and in randomly generated problems, and we derive improvements to the methods that could be applied in general settings.

Using Soft CSP Algorithms for
Approximating Pareto-optimal Sets

To approximate the set of Pareto-optimal solutions, the simplest approach is to map the MCOP into an optimization problem with a single criterion obtained by a fixed weighting of the different criteria, called a weighted constrained optimization problem (WCOP):

Definition 5. A WCOP is an MCOP with an associated weight vector w = (w_1, ..., w_m). The optimal solution to a WCOP is a tuple that minimizes the valuation function given by the weighted sum of the criterion values. The best k solutions to a WCOP are the k solutions with the lowest cost; this cost is called the valuation of a solution. We call feasible solutions to a WCOP those solutions which do not violate any hard constraint.

Note that when the weight vector consists of all 1s, WCOP is equivalent to MAX-CSP and is also an instantiation of the semiring CSP framework. WCOPs can be solved by branch-and-bound search algorithms. These algorithms can be easily adapted to return not only the best solution, but an ordered list of the best k solutions. In our work we use Partial Forward Checking (Freuder & Wallace 1992) (PFC), which is a branch-and-bound algorithm with propagation.
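A sketch of this weighted-sum scheme (illustrative only; exhaustive enumeration stands in for the PFC branch-and-bound): solutions are ranked by the weighted sum of their criterion values, the best k are kept, and dominated ones are then filtered out of that set:

```python
def dominates(s, t):
    # s dominates t: no worse on every criterion, strictly better on one.
    return all(a <= b for a, b in zip(s, t)) and any(a < b for a, b in zip(s, t))

def approximate_pareto(solutions, weights, k):
    """Rank by the weighted valuation, keep the best k, drop dominated ones.
    The survivors are Pareto-optimal for the whole problem."""
    best = sorted(solutions,
                  key=lambda s: sum(w * v for w, v in zip(weights, s)))[:k]
    return [s for s in best if not any(dominates(t, s) for t in best if t != s)]

sols = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4)]
approx = approximate_pareto(sols, weights=(1, 1), k=3)
```

With equal weights this is the MAX-CSP ranking mentioned in the text; running it with several weight vectors and merging the results corresponds to the iterated variant.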
Pareto-optimality of WCOP Solutions

As mentioned before, in practice it turns out that among the best solutions to a WCOP, many are also Pareto-optimal. Theorem 1 shows indeed that the optimal solution of a WCOP is always Pareto-optimal, and that furthermore, among the best k solutions, all those which are not dominated by another one in the set are Pareto-optimal for the whole problem:

Theorem 1. Let S be the set of the best k solutions obtained by optimizing with a weight vector w. If s ∈ S and s is not dominated by any s' ∈ S, then s is Pareto-optimal.

Proof. Assume that s is not Pareto-optimal. Then there is a solution s'' which dominates s and, by Definition 2, is at least as good as s on every criterion and strictly better on at least one. As a consequence, s'' has a strictly lower weighted valuation, i.e., s'' must be better than s according to the weighted optimization function. But then s'' also belongs to S, contradicting the fact that s is not dominated by any element of S.

This justifies the use of soft CSPs to find not just one, but a larger set of Pareto-optimal solutions. In particular, by filtering the best k solutions returned by a WCOP algorithm to eliminate the ones which are dominated by another one in the set, we find only solutions which are Pareto-optimal for the entire problem. We can thus bypass the costly step of proving non-dominance on the entire solution set. The first algorithm is thus to find a subset of the Pareto set by modeling the problem as a WCOP with a single weight vector, generating the best k solutions, and filtering them to retain only those which are not dominated (Algorithm 1).

Algorithm 1.
Input: P, an MCOP; k, the maximal number of solutions to compute.
Output: S, an approximation of the Pareto-optimal set.
S := PFC(WCOP(P, w), k); S := eliminateDominatedSolutions(S).

Algorithm 2.
Input: P, an MCOP; k, the maximal number of solutions to compute; W, a collection of weight vectors.
Output: S, an approximation of the Pareto-optimal set.
foreach w in W do S := S ∪ PFC(WCOP(P, w), k/|W|); S := eliminateDominatedSolutions(S).

The iterative methods perform one iteration for each couple of constraints; the iteration for a given couple is performed with a weight vector emphasizing the weights of those two constraints. Method 5 is Algorithm 2 with these per-couple weight vectors.

The plot of Figure 2 shows the number of Pareto-optimal solutions against the number of criteria (soft constraints) for a random
problem with n = 5, d = 10, hc = 20%, with one curve per hard tightness value (20%, 40%, 60%, 80%).

Figure 2: Number of Pareto-optimal solutions depending on how many soft constraints we consider, for randomly generated problems with 5 variables, 10 values per domain and 20% hard unary/binary constraint density. For hard tightness = 80%, the generated problems have 778 solutions on average.

In Figure 2, it is shown that the number of Pareto-optimal solutions clearly increases when the number of criteria increases. The same phenomenon applies for instances with 5 and 10 variables. On the other hand, we have observed that even if the number of Pareto-optimal solutions decreases when the problem gets more constrained (fewer feasible solutions), their percentage with respect to the total number of solutions increases. Thus, the proportion of Pareto-optimal solutions is more important when the problem gets more constrained.

We have evaluated the proposed methods for each type of generated problem. Figure 3 shows the average proportion of Pareto-optimal solutions found by the different methods for problems with 6 soft constraints. We emphasize the results of the methods up to 530 computed solutions, because in real applications it may not be feasible to compute a larger set. When computing up to the maximal number of solutions, the behavior of the different methods does not change significantly. The 50 randomly generated problems used for Figure 3 each had a large number of feasible solutions (satisfying the hard constraints), of which only a fraction were Pareto-optimal. The iterative methods perform better than the single-search algorithm (Method 1) with respect to the total number of solutions computed. It is worth noting that the iterative methods based on Algorithm 2 find more Pareto-optimal solutions when the number of iterations increases. The Lexicographic Fuzzy method (Method 6) results in finding a very low
percentage of Pareto-optimal solutions. With Method 6, Theorem 1 does not apply, so the percentage of Pareto-optimal solutions shown is computed a posteriori, by filtering out the solutions that were not really Pareto-optimal for the entire problem.

Figure 3: Pareto-optimal solutions found by the different proposed methods (in %). The methods are applied to 50 randomly generated problems with 10 variables, 10 values per domain, 40% density of hard unary/binary constraints with 40% hard tightness, and 6 criteria (soft constraints). The total number of computed solutions for each method varies from 30 to 530 in steps of 100.

Figure 4: Number of Pareto-optimal solutions found by the different proposed methods with respect to the computing time. For this plot, the problems have 10 variables, 10 values per domain, 40% hard unary/binary constraints with 40% hard tightness, and 6 criteria (soft constraints).

Another way of comparing the different methods is to compare the number of Pareto-optimal solutions found with respect to the computing time (Figure 4). Using this comparison, Method 1 performs best. The performance of the variants of Method 2 decreases when the number of iterations increases. Method 3 performs better than Method 4, which in turn performs better than Method 5 in terms of computing time. In general, we observe that when the number of iterations of a method increases, its performance with respect to the total number of computed solutions also increases, but its performance with respect to computing time decreases. This is due to the fact that the computing time of finding the best k solutions with PFC in a single search is not linear in k, and differs from finding them over several iterations (with a share of the solutions computed per iteration). For example, computing the solutions with one iteration
takes far less time than computing the same number of solutions with 7 iterations. Even if the methods based on Algorithm 2 take more time than Algorithm 1 to reach the same percentage of Pareto-optimal solutions, they are likely to produce a more representative sample of the Pareto-optimal set.

Using a brute-force algorithm that computes all the feasible solutions and filters out those which are dominated took far longer on average for the same problems as in the above figures. This demonstrates the interest of using approximative methods for computing Pareto-optimal solutions, especially for interactive configuration applications (e.g., electronic catalogs).

Empirical Results in a Real Application

The Air Travel Configurator

The problem of arranging trips is here modeled as a soft CSP (see (Torrens, Faltings, & Pu 2002) for a detailed description of our travel configurator). An itinerary is a set of legs, where each leg is represented by a set of origins, a set of destinations, and a set of possible dates. Leg variables represent the legs of the itinerary, and their domains are the possible flights for the associated leg. Another variable represents the set of possible fares applicable to the itinerary.
The same itinerary can have several different fares depending on the cabin class, airline, schedule and so on. Usually, for each leg there can be about 60 flights, and for each itinerary there can be about 40 fares. The size of the search space is therefore large: with these figures, a round trip yields on the order of 60 × 60 × 40 combinations, and a three-leg trip on the order of 60^3 × 40. Constraint satisfaction techniques are well suited for modeling the travel planning problem. In our model, two types of configuration constraints (hard constraints) guarantee that: 1. a flight for a leg arrives before a flight for the following leg takes off, and 2. a fare is really applicable to a set of flights (depending on the fare rules).

Users normally have preferences about the itinerary they are planning: preferences about the schedule of the itinerary, the airlines, the class of service, and so on. It then suffices to find a certain number of Pareto-optimal solutions, even if this set only represents a small fraction of all the Pareto-optimal solutions. Actually, we consider that the total number of solutions shown to the user must be small because of the limitations of current graphical user interfaces.

Related Work

The most commonly used approach for solving a Multi-Criteria Optimization Problem is to convert the MCOP into several COPs, which can be solved using standard mono-criterion optimization techniques. Each COP then yields a Pareto-optimal solution to the problem. Steuer's book (Steuer 1986) gives a deep study of different ways to translate an MCOP into a set of COPs. The most used strategy is to optimize one linear function of all criteria with positive weights. The drawback of the method is that some Pareto-optimal solutions cannot be found if the efficient frontier is not concave. Our methods are based on this approach.

Gavanelli (Gavanelli 2002; 2001) addresses the problem of multi-criteria optimization in constraint problems directly.
His method is based on a branch-and-bound schema where Pareto dominance is checked against a set of previously found solutions using Point Quad-Trees. Point Quad-Trees are useful for efficiently bounding the search. However, the algorithm can be very costly if the number of criteria or the number of Pareto-optimal solutions is high. Gavanelli's method significantly improves on the approach of Wassenhove-Gelders (Wassenhove & Gelders 1980). The Wassenhove-Gelders method basically consists of performing several search processes, one for each criterion. Each iteration takes the previous solution and tries to improve it by optimizing another criterion. Using this method, each search produces one Pareto-optimal solution, so many search processes must be performed in order to approximate the Pareto-optimal set. The Global Criterion Method solves an MCOP as a COP where the criterion to optimize is the minimization of a distance function to an ideal solution. The ideal solution is precomputed by optimizing each criterion independently (Salukvadze 1974). Incomplete methods have also been developed for solving multi-criteria optimization, basically genetic algorithms (Deb 1999) and methods based on tabu search (Hansen 1997).

Conclusions

This paper deals with a very well-studied topic, Pareto-optimality in multi-criteria optimization. It has been commonly understood that Pareto-optimality is intractable to compute, and therefore it has not been studied further. Instead, many applications have simply mapped multi-criteria search into a single criterion with a particular weighting and returned a list of the best solutions rather than a single best one. This solution allows leveraging the well-developed

Kumar, V. 1992. Algorithms for Constraint Satisfaction Problems: A Survey. AI Magazine 13(1):32-44.
Pareto, V. 1896-1987. Cours d'économie politique professé à l'université de Lausanne. Lausanne: F. Rouge.
Salukvadze, M. E. 1974. On the existence of solutions in problems of optimization under vector-valued criteria. Journal of Optimization Theory and Applications 12(2):203-217.
Steuer, R. E. 1986. Multi Criteria Optimization: Theory, Computation, and Application. New York: Wiley.
Torrens, M.; Faltings, B.; and Pu, P. 2002. SmartClients: Constraint satisfaction as a paradigm for scaleable intelligent information systems. CONSTRAINTS: An International Journal 7:49-69.
Tsang, E. 1993. Foundations of Constraint Satisfaction. London, UK: Academic Press.
Wassenhove, L. N. V., and Gelders, L. F. 1980. Solving a bicriterion scheduling problem. European Journal of Operational Research 4(1):42-48.
Sacre: a Constraint Satisfaction Problem Based Theorem Prover

Jean-Michel Richer, Jean-Jacques Chabrier

LIRSIA, Burgundy University, U.F.R. des Sciences et Techniques, Bâtiment Mirande, 9, Avenue Alain Savary, B.P. 400, 21011 Dijon Cedex, France
{richer,chabrier}@crid.u-bourgogne.fr

Abstract

The purpose of this paper is to present a new approach for solving first-order predicate logic problems stated in conjunctive normal form. We propose to combine resolution with the Constraint Satisfaction Problem (CSP) paradigm to prove the inconsistency or find a model of a problem. The resulting method benefits from resolution and constraint satisfaction techniques and seems very efficient when confronted with some problems of the CADE-13 competition.
Copyright © 2007, American Association for Artificial Intelligence (). All rights reserved.

Introduction

From a general point of view, we can classify methods for solving first-order predicate calculus problems stated in conjunctive normal form into two categories. The first one is consistency searching, or proof searching, and is syntax-oriented. It consists of deriving a contradiction from a set of clauses by applying inference rules. For example, Otter (McCune 94) uses resolution, unit-resolution and hyperresolution, while Setheo (Loveland 78) is based on model elimination. The second one, satisfiability checking, also called model finding, is related to semantics and tries to find a model or a counterexample of a problem. In this last category we can draw a distinction between saturation and extension approaches. In the former case, we iteratively generate ground instantiations of the problem and test the ground clause sets for unsatisfiability with a propositional calculus prover. In the latter case, we try to build a model of the problem by assuming new ground facts. It seems that, in the case of propositional calculus, semantic methods are more efficient than syntactic ones, whereas in predicate calculus it is quite the contrary. One of the first satisfiability approaches was that of Gilmore (Gilmore 60), a saturation approach which proved to be very inefficient. We believe this is because such an approach has to tackle the whole Herbrand base while only a part of it is necessary. Another approach is the Satchmo theorem prover (Manthey 88), which uses syntactic and semantic features to solve problems. It can be qualified as an extension approach. Satchmo is based on the model generation reasoning paradigm. However, Satchmo suffers from certain drawbacks. The first one is range restriction, requiring that each head variable occur in the body of a clause. The second one is the fact that Satchmo might choose a clause irrelevant to the current goal to be solved and thus cause unnecessary model candidate extensions. This may result in a potential explosion of the search space. However, some improvements can be made, such as relevancy testing (Loveland 93), to avoid unnecessary case splittings. Apart from those semantic approaches, Finder (Slaney 95) searches for finite models of first-order theories presented as sets of clauses. Falcon (Zhang 96), in which model generation is viewed as constraint satisfaction, constructs finite algebras from given equational axioms. Finite models are able to provide some kind of semantic guidance that helps refutation-based theorem provers find proofs more quickly (Slaney 94). It is also possible to combine resolution with rewrite techniques so as to guide the search and design more efficient inference rules, such as the problem reduction format (Loveland 78) or the simplified problem reduction format (Plaisted 82), which permits the deletion of unachievable subgoals, or its extension, the modified problem reduction format (Plaisted 88).

The key novelty introduced in this paper is the combination of resolution with the Constraint Satisfaction Problem (CSP) paradigm so as to solve first-order predicate calculus problems stated in conjunctive normal form. This combination is not fortuitous. First, consistency searching and model finding are both common problems related to logic and CSPs. Second, CSP techniques have proved to be very powerful for solving large combinatorial problems by applying strategies and heuristics that help guide the search and improve the resolution process by efficiently pruning the search space.

The resulting method, called Sacre 1, is based on a unique forward chaining rule and combines constraint satisfaction heuristics and techniques with resolution; it is able to prove the inconsistency or find a model of a problem. It is, to our knowledge, the first attempt in this direction. The paper is organized as follows: in section 2, we will set forth some basic definitions of constraint satisfaction problems. The next section is devoted to the Sacre system. Section 4 provides some typical examples in order to outline the domain of application of Sacre. The last section exhibits some results for some of the problems of the CADE-13 competition. Though the constraint satisfaction paradigm has been widely used to efficiently solve propositional calculus problems, so far, little research has been carried out on the resolution of first-order logic problems considered as CSPs. The main reason is due to the fact that first-order logic problems are not well suited to a CSP approach and the major hurdles encountered

1 for SAtisfaction de Contraintes et REsolution - Constraint satisfaction and resolution
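The propositional step of the saturation approach described in the introduction (test a set of ground clauses for unsatisfiability) can be illustrated with a minimal resolution prover. This is our own sketch for exposition, not Sacre's algorithm: clauses are frozensets of string literals, negation is marked with '~', and the clause set is saturated under binary resolution until the empty clause appears or nothing new can be derived.

```python
def negate(lit):
    """Flip the sign of a propositional literal, e.g. 'p' <-> '~p'."""
    return lit[1:] if lit.startswith('~') else '~' + lit

def resolvents(c1, c2):
    """All binary resolvents of two clauses (frozensets of literals)."""
    out = set()
    for lit in c1:
        if negate(lit) in c2:
            out.add(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return out

def unsatisfiable(clauses):
    """Saturate under resolution; deriving the empty clause proves
    unsatisfiability, a fixpoint without it means the set is satisfiable."""
    clauses = set(clauses)
    while True:
        new = set()
        for a in clauses:
            for b in clauses:
                for r in resolvents(a, b):
                    if not r:
                        return True   # empty clause derived
                    new.add(r)
        if new <= clauses:
            return False              # saturated, no empty clause
        clauses |= new

# {p}, {~p v q}, {~q} is inconsistent:
print(unsatisfiable([frozenset({'p'}), frozenset({'~p', 'q'}), frozenset({'~q'})]))  # True
```

In a full saturation method this test would be re-run on ever larger sets of ground instantiations of the first-order clauses, which is exactly why the approach must confront a large part of the Herbrand base.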