Spoken Dialogue Management Using Hierarchical Reinforcement Learning and Dialogue Simulation
Confidence measures for spoken dialogue systems

CONFIDENCE MEASURES FOR SPOKEN DIALOGUE SYSTEMS

Rubén San-Segundo¹, Bryan Pellom, Kadri Hacioglu, Wayne Ward
Center for Spoken Language Research, University of Colorado, Boulder, Colorado 80309-0594, USA

José M. Pardo
Grupo de Tecnología del Habla, Universidad Politécnica de Madrid, Ciudad Universitaria s/n, 28040 Madrid, Spain, http://www-gth.die.upm.es

¹ This work was performed during a visiting internship at the Center for Spoken Language Research and has been supported by Spanish Education Ministry grant AP97-20257252 and by DARPA through ONR grant #N00014-99-1-0418.

ABSTRACT

This paper provides improved confidence assessment for the detection of word-level speech recognition errors, out-of-domain utterances and incorrect concepts in the CU Communicator system. New features from the speech understanding component are proposed for confidence annotation at the utterance and concept levels. A neural network is used to combine all features at each level. Using data collected from a live telephony system, it is shown that 53.2% of incorrectly recognized words, 53.2% of out-of-domain utterances and 50.1% of incorrect concepts are detected at a 5% false rejection rate. In addition, the confidence measures are used to improve word recognition accuracy: several hypotheses from different speech recognizers are compiled into a word graph, which is then searched for the hypothesis with the best confidence. We report a 14.0% relative word error rate reduction after this confidence rescoring.

1. INTRODUCTION

In a spoken dialogue system we can define three different levels for confidence measures:

- Word level: confidence measures provide an estimate of the accuracy of each recognized word. At this level we use decoder and Language Model (LM) features.
- Utterance level: the target is the detection of out-of-domain utterances.
At this level we use acoustic, LM and parsing features.
- Concept level: end-to-end system performance does not change when the phrases in a sentence that belong to concepts are correctly recognized while the "filler" words or phrases are not. At this level we focus on the parts of phrases that are meaningful to the task. Decoder, LM and parsing features are used to tag the concepts with confidence measures.

We use the CU Communicator system as our test-bed for the experiments in this paper. This system is a Hub-compliant implementation of the DARPA Communicator task [1][2][3]. The system combines continuous speech recognition, natural language understanding and flexible dialogue control to enable natural conversational interaction by telephone callers seeking information about airline flights, hotels and rental cars.

2. DATABASE

The data used for the experiments were obtained during telephone data collection from November 1999 through June 2000 [3]. Over 900 calls were collected during this period, totaling approximately 11,500 utterances. We randomly split the data into three sets: 60% for training, 20% for evaluation and 20% for testing. We repeated this split six times, providing six round-robin data sets to verify the results. The results presented are the average over these experiments. For the experiments in Sec. 3.3, we used an independent set of utterances from the NIST Multi-Site Data Collection [3].

3. WORD LEVEL

For word-level confidence we investigated a subset of the most promising features considered in [4][5][6].
We consider decoder and LM features.

Decoder features:
- Normalized score: acoustic score of the word divided by the number of frames it spans.
- Count in the N-best: percentage of times the word appears in the 100-best hypotheses in a similar position.
- Lattice density: number of alternative paths to the word in the word graph generated in the second pass of the recognizer.
- Phone perplexity: average number of phones searched over the frames where the recognized word was active during decoding.

Language Model features [7]:
- Language model back-off behavior: back-off behavior of an N-gram language model over a 5-word context.
- Language model score: the log-probability of each word in a sequence, as computed from a back-off language model over a 5-word context.

3.1 Feature combination

We use a Multi-Layer Perceptron (MLP) to combine all the features. In this study, the features were quantized into 115 binary inputs: 10 bits per feature, except for the 5-context LM back-off features, where 5 bits sufficed to code all possible situations. The features were coded with more resolution in ranges with more training data. The hidden layer consisted of 30 units, and a single output node models the word-level confidence. During weight estimation, a target value of 1 is assigned when the decoder correctly recognizes the word and a value of 0 when it does not (i.e., substitutions and insertions).

3.2 Experiments

Table 1 summarizes the correct detection rates for word-level recognition errors at false rejection rates of 2.5% and 5.0%. The table also shows the minimum classification error and the baseline error, which corresponds to the recognition error rate. It can be seen that LM features provide better indicators for word-level confidence than decoder features.
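The binary input coding of Sec. 3.1 can be sketched with quantile-based bins, which naturally place more resolution in ranges with more training data. This is an illustrative reconstruction under that assumption, not the authors' exact coding scheme:

```python
import numpy as np

def quantile_bin_edges(train_values, n_bins=10):
    """Bin edges from training-data quantiles, so densely populated
    value ranges get finer resolution (as the paper's coding suggests)."""
    qs = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]  # interior quantiles
    return np.quantile(train_values, qs)

def one_hot_quantize(value, edges):
    """Map one continuous feature value to a one-hot binary vector."""
    vec = np.zeros(len(edges) + 1, dtype=int)
    vec[int(np.searchsorted(edges, value))] = 1
    return vec

# Example: quantize a (made-up) normalized acoustic score into 10 bits.
train_scores = np.random.RandomState(0).normal(size=1000)
edges = quantile_bin_edges(train_scores, n_bins=10)
bits = one_hot_quantize(0.3, edges)
```

Concatenating such one-hot vectors for every feature yields the fixed-width binary input layer the MLP is trained on.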
For example, using LM features alone, 42.0% of mis-recognized words were detected at a false rejection rate of 5%, similar to [7]. Using decoder features alone, we reject only 28.5% of the mis-recognized words.

Word Level          2.5% FR    5.0% FR    Classification Error
Baseline               -          -            19.0%
Decoder features     16.9%      28.5%          17.5%
LM features          28.3%      42.0%          15.0%
All features         39.0%      53.2%          12.8%

Table 1. Correct detection of mis-recognized words at 2.5% and 5.0% false rejection (FR) rates. The minimum classification error and the baseline error (substitutions and insertions) are also shown.

The best results are obtained by combining all features: more than half of the incorrect words are rejected at 5% false rejection, and the classification error is reduced by 6.2%.

3.3 Combining multiple hypotheses

The CU Communicator utilizes parallel banks of recognizers to obtain hypotheses from speaker-independent and female-adapted telephone acoustic models. In the next sections we present several methods that use confidence measures to combine hypotheses from different decoders and improve speech recognition accuracy.

3.3.1 Flat List Confidence Rescoring (FLCR)

For each hypothesis output by the bank of decoders, we calculate the average confidence over the whole sentence. The hypothesis with the highest average confidence is selected as the best hypothesis.

3.3.2 Word Graph Confidence Rescoring (WGCR)

Here the key is to build a word graph from all hypotheses and find the path through the graph with the highest average confidence. This path can produce a new hypothesis, different from those used to build the graph; the idea is to pick up the best parts from different sentences.

Word-graph generation: for each decoder hypothesis we tag each word with its confidence value. For example, Fig. 1 shows three hypotheses with their confidence values underneath. Each word is represented as an edge, with a node between two consecutive words.

Figure 1. Reference utterance ("I WANNA GO FROM AUSTIN TO CHICAGO LATE MORNING") and three alternatives with per-word confidence values:
  I WANT TO GO FROM BOSTON TO CHICAGO LATE MORNING  (0.52 0.80 0.71 0.93 0.96 0.95 0.98 0.95 0.93 0.96)
  I'M GONNA GO FROM AUSTIN TO CHICAGO LATE MORNING  (0.52 0.73 0.93 0.96 0.98 0.97 0.95 0.93 0.96)
  I WANNA GO FROM BOSTON TO CHICAGO MORNING         (0.90 0.93 0.92 0.95 0.93 0.98 0.90 0.85)

Next we join nodes from different hypotheses to build the graph: we join the beginning and end nodes of all the hypotheses, as well as the initial nodes of words from different hypotheses situated in similar positions in the phrase. Finally, we prune parallel transitions by keeping the transition with the maximum confidence. The final word graph is shown in Fig. 2.

Best path calculation: using dynamic programming, the best path through the graph is calculated. Our heuristic is the accumulated average confidence, i.e. the average confidence from the initial node to the current node. Fig. 2 shows the sentence finally obtained; following the bolded edges we can see from which hypothesis each word was taken. In this case the resulting phrase matches the reference perfectly and is different from each individual hypothesis.

Figure 2. Word graph generated for the example in Figure 1. The bold arrows indicate the best hypothesis.

3.3.3 Experiments

We conducted experiments with four system configurations. In the baseline system that was evaluated by NIST [3], the CU Communicator initially runs two decoders in parallel: one uses speaker-independent models while the other uses female-adapted models. After 500 frames of input (5 sec.), the system selects one "best" decoder to use for the remainder of the telephone dialog.
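The WGCR best-path search of Sec. 3.3.2 can be sketched as dynamic programming over a word-graph DAG, scoring paths by accumulated average confidence. The graph layout and confidence values below are illustrative, not taken from the system:

```python
def topological_order(graph, start):
    """Reverse DFS post-order; graph maps node -> [(word, conf, next_node)]."""
    seen, order = set(), []
    def dfs(u):
        seen.add(u)
        for _, _, v in graph.get(u, []):
            if v not in seen:
                dfs(v)
        order.append(u)
    dfs(start)
    return list(reversed(order))

def wgcr_best_path(graph, start, end):
    """Keep, at each node, the path with the best accumulated average confidence."""
    best = {start: (0.0, 0.0, 0, [])}  # node -> (avg, total, n_words, words)
    for node in topological_order(graph, start):
        if node not in best:
            continue
        _, total, n, words = best[node]
        for word, conf, nxt in graph.get(node, []):
            t, k = total + conf, n + 1
            if nxt not in best or t / k > best[nxt][0]:
                best[nxt] = (t / k, t, k, words + [word])
    avg, _, _, words = best[end]
    return words, avg

# Toy graph with a confusable pair (AUSTIN vs. BOSTON); confidences are made up.
g = {
    0: [("I", 0.90, 1)],
    1: [("WANNA", 0.93, 2)],
    2: [("GO", 0.92, 3)],
    3: [("FROM", 0.95, 4)],
    4: [("AUSTIN", 0.96, 5), ("BOSTON", 0.93, 5)],
    5: [("TO", 0.98, 6)],
    6: [("CHICAGO", 0.98, 7)],
}
words, avg = wgcr_best_path(g, 0, 7)
```

Because the average is not additive, keeping only the best running average at each node is a heuristic rather than an exact optimization, which matches the paper's description of the method.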
In this work, we consider running the decoders continuously in parallel throughout the dialog. In the second configuration, we select the hypothesis with the best path score output by the decoder search. Finally, we consider the proposed FLCR and WGCR methods. Results are shown in Table 2.

Method            WER
Baseline          27.2%
Best path score   26.2%
FLCR              24.2%
WGCR              23.4%

Table 2. Word error rates for the four system configurations.

From these results we conclude that confidence measures play an important role in reducing the WER when combining hypotheses from different decoders. The WGCR method proposed in this paper yields a 3.8% absolute (14% relative) reduction in WER and performs better than the FLCR method. In these experiments the difference between FLCR and WGCR is small because only two decoders are run in parallel and the average utterance length is short: 2.4 words per utterance.

4. UTTERANCE LEVEL

At this level we use all sources of information: decoder, LM and parsing features.

Decoder and LM features:
- Average word confidence: the average word confidence over the sentence, calculated as in Sec. 3.

Parsing features:
- Number of words parsed in the sentence: the number of words in the sentence belonging to a concept or to a rule used to parse a concept.
- Number of words that can be parsed: the number of words in the sentence belonging to a concept or to any rule in the task grammar.
- Number of concepts: the number of concepts obtained from the sentence.
- Average count in the 100-best: the average percentage of times that a concept appears in the 100-best hypotheses.
- Percentage of hypotheses in the 100-best with any concept: this feature represents how many hypotheses parse with at least one extracted concept.

To combine all features we again use an MLP. In this study the features were not quantized; they were used directly as input to the MLP.
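Because these features are continuous rather than binary-coded, they must be bounded before entering the MLP. A minimal min-max scaling sketch, with per-feature statistics taken from the training set (illustrative values):

```python
import numpy as np

def fit_min_max(train_features):
    """Per-feature minimum and maximum from the training set (rows = utterances)."""
    x = np.asarray(train_features, dtype=float)
    return x.min(axis=0), x.max(axis=0)

def scale(features, lo, hi):
    """Scale features into [0, 1]; values outside the training range are clipped."""
    x = (np.asarray(features, dtype=float) - lo) / (hi - lo)
    return np.clip(x, 0.0, 1.0)

# Two made-up features over three training utterances:
train = [[0.2, 3.0], [0.8, 9.0], [0.5, 6.0]]
lo, hi = fit_min_max(train)
scaled = scale([0.5, 12.0], lo, hi)
```

Clipping is one reasonable way to handle test values outside the training range; the paper does not say how such values were treated.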
Because of this, preprocessing is required to limit the dynamic range of each feature to the (0,1) interval. Here, the normalization consists of scaling by the minimum and maximum values obtained for each feature in the training set.

4.1 Experiments

Table 3 summarizes the correct detection rates for out-of-domain utterances at false rejection rates of 2.5% and 5.0%. It can be seen that parsing features provide better indicators for utterance-level confidence than decoder and LM features. These results are better than those obtained in [7].

Utterance Level         2.5% FR    5.0% FR    Classification Error
Baseline                   -          -             4.8%
Decoder + LM features    41.1%      49.8%           4.2%
Parsing features         43.8%      52.8%           4.1%
All features             45.7%      53.2%           4.0%

Table 3. Correct detection of out-of-domain utterances at 2.5% and 5.0% false rejection (FR) rates. The minimum classification error and the baseline error are also shown.

5. CONCEPT LEVEL

We compute correct and incorrect concepts by passing both hypotheses and references through the parser and comparing them with a dynamic programming algorithm. As with the WER, it is possible to calculate insertions, deletions and substitutions for concepts. Our system has a Concept Error Rate (CER) of 27.9%; the proportion of incorrect concepts (substitutions and insertions) is 16.5%. Similar work at this level is reported in [8].

At this level we use all sources of information: decoder, LM and parsing features:
- Average word confidence over the words belonging to the rule used to extract the concept: the average word confidence, following the definition in Sec. 3, over the words of the rule applied to obtain the concept.
- Average word confidence for the value of the concept: the average word confidence over the words that constitute the value of the concept.
For example, in the sentence "I wanna go to Chicago", the phrase "go to Chicago" contains the words belonging to the rule applied to obtain the concept [Arrival City], and "Chicago" is the value of this concept.
- Number of words in the rule.
- Number of words in the concept value.
- Concept count in the 100-best: each hypothesis in the 100-best is parsed, and the percentage of times that a concept appears in the hypotheses is counted.
- Concept-and-value count in the 100-best: the number of times that a concept appears with the same value over the 100-best hypotheses.

These two features are useful when a confusable pair appears in the hypotheses. For example, consider two highly confusable city names: Austin and Boston. Given the sentence "I wanna go from Denver to Austin", we observe that in the 100-best the word "Austin" is often substituted by "Boston". In this case the concept [Arrival City] has high confidence because it appears in almost all the hypotheses, but its value changes considerably, so this is a good measure of concept-value confidence. A large difference between these two counts means that the concept is probably right but its value is confused.

The next two features are obtained from a concept language model. In our case we trained a 3-gram LM on the concept sequences obtained from the references in the training set. As at the word level, we consider:
- Language model back-off behavior: back-off behavior of an N-gram language model over a 5-concept context.
- Language model score: the log-probability of each concept in a sequence, as computed from a back-off language model over a 5-concept context.

To combine all features we use an MLP with the same characteristics described in the previous section.

5.1 Results

Table 4 summarizes the correct detection rates of incorrect concepts at false rejection rates of 2.5% and 5.0%.
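The two 100-best counting features can be sketched as follows; `parse` here is a hypothetical stand-in for the system's concept parser, and the N-best list is a toy example:

```python
from collections import Counter

def count_features(nbest, parse):
    """Fraction of N-best hypotheses containing each concept, and each
    (concept, value) pair -- the paper's two 100-best count features."""
    concept_counts, pair_counts = Counter(), Counter()
    for hyp in nbest:
        pairs = parse(hyp)
        for concept in {c for c, _ in pairs}:
            concept_counts[concept] += 1
        for pair in set(pairs):
            pair_counts[pair] += 1
    n = len(nbest)
    return ({c: k / n for c, k in concept_counts.items()},
            {p: k / n for p, k in pair_counts.items()})

# Toy N-best where the destination flips between AUSTIN and BOSTON:
def toy_parse(hyp):
    dest = "AUSTIN" if "AUSTIN" in hyp else "BOSTON"
    return [("[Arrival City]", dest)]

nbest = ["GO TO AUSTIN", "GO TO BOSTON", "GO TO AUSTIN", "GO TO BOSTON"]
concept_frac, pair_frac = count_features(nbest, toy_parse)
```

In this toy list the concept fraction is 1.0 while no single (concept, value) pair exceeds 0.5, which is exactly the "concept confident, value confused" pattern the text describes.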
We ran three experiments: using the average word confidence (AWC) features alone, using the remaining features alone, and using all features together. Using only the AWC features gives slightly better results than the remaining proposed features: for example, 47.1% of the incorrect concepts were detected at a false rejection rate of 5%, versus 40.1% with the remaining features at the same false rejection rate. The best results are obtained by combining all features: more than 50% of incorrect concepts are rejected at 5% false rejection. Given a baseline of 16.5% incorrect concepts, these features reduce the classification error by 4.5%.

Concept Level        2.5% FR    5.0% FR    Classification Error
Baseline                -          -            16.5%
AWC features         31.0%      47.1%           12.8%
Remaining features   29.3%      40.1%           13.5%
All features         35.9%      50.1%           12.0%

Table 4. Correct detection of incorrect concepts at 2.5% and 5.0% false rejection (FR) rates. The minimum classification error and the baseline error are also shown.

6. CONCLUSIONS

In this paper we present an analysis of confidence annotation for spoken dialogue systems at three levels: word, utterance and concept. All results are reported on data collected with the CU Communicator over a seven-month period (over 900 calls). At the word level we detect 53.2% of mis-recognized words at a 5% false rejection rate, reducing the classification error by 6.2%. Experimentally, features derived from the LM perform better than decoder features, and the best results are obtained by combining all features. We propose the use of confidence measures as a heuristic for combining several hypotheses from different recognizers, and analyze two options: the Flat List Confidence Rescoring (FLCR) method and the Word Graph Confidence Rescoring (WGCR) method. The word-graph generation algorithm is described and results with this method are reported.
Using the WGCR method we achieve a 14% relative word error rate reduction. At the utterance level, combining all features, 53.2% of out-of-domain utterances are detected at a 5% false rejection rate. At the concept level, a new set of features is proposed that detects more than 50% of incorrect concepts at a 5% false rejection rate. New features from the speech understanding component are proposed for confidence annotation at the utterance and concept levels.

7. FUTURE WORK

In future work, we will analyze each of the proposed speech understanding features separately in order to measure its individual contribution. At the utterance level we will consider detecting not only out-of-domain utterances but also misunderstood utterances caused by poor recognition. We will also consider compiling N-best lists from different decoders into a single word graph to obtain a richer collection of hypotheses. Finally, we will use concept confidence to reduce the semantic errors that have the greatest impact on the system's end-to-end performance.

8. REFERENCES

[1] W. Ward, B. Pellom, "The CU Communicator System," Proc. IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Keystone, Colorado, 1999.
[2][3] B. Pellom, W. Ward, S. Pradhan, "The CU Communicator: an architecture for Dialogue Systems," Proc. ICSLP, Beijing, China, 2000.
[4] L. Chase, "Error-Responsive Feedback Mechanisms for Speech Recognizers," Ph.D. thesis, Carnegie Mellon University, Tech. Report CMU-RI-TR-97-18, April 1997.
[5] L. Chase, "Word and acoustic confidence annotation for large vocabulary speech recognition," Proc. Eurospeech, Rhodes, Greece, 1997, pp. 815-818.
[6] S.O. Kamppari, T.J. Hazen, "Word and phone level acoustic confidence scoring," Proc. ICASSP, Istanbul, Turkey, 2000, pp. III-1799, III-1802.
[7] R. San-Segundo, B. Pellom, W. Ward, and J.M. Pardo, "Confidence measures for dialogue management in the CU Communicator system," Proc. ICASSP, Istanbul, Turkey, 2000, pp. III-1237, III-1240.
[8] T. Hazen, T. Burianek, J. Polifroni, S. Seneff, "Recognition Confidence Scoring for Speech Understanding Systems," Proc. of the ISCA ITRW ASR2000 Workshop on Automatic Speech Recognition: Challenges for the New Millennium, Paris, France, September 2000, pp. 213-220.
A Method for Development of Dialogue Managers for Natural Language Interfaces

A Method for Development of Dialogue Managers for Natural Language Interfaces

Arne Jönsson*
Department of Computer and Information Science
Linköping University
S-58183 Linköping, Sweden
arj@ida.liu.se

Abstract

This paper describes a method for the development of dialogue managers for natural language interfaces. A dialogue manager is presented, designed on the basis of both a theoretical investigation of models for dialogue management and an analysis of empirical material. It is argued that many of the human interaction phenomena accounted for in, for instance, plan-based models of dialogue do not occur in natural language interfaces. Instead, for many applications, dialogue in natural language interfaces can be managed from information on the functional role of an utterance as conveyed in the linguistic structure. This is modelled in a dialogue grammar which controls the interaction. Focus structure is handled using dialogue objects recorded in a dialogue tree which can be accessed through a scoreboard by the various modules for interpretation, generation and background system access.

A sublanguage approach is proposed. For each new application the Dialogue Manager is customized to meet the needs of the application. This requires empirical data, which are collected through Wizard of Oz simulations. The corpus is used when updating the different knowledge sources involved in the natural language interface. This paper also describes the customization of the Dialogue Manager for database information retrieval applications.

Introduction

Research on computational models of discourse can be motivated from two different standpoints. One is to develop general models and theories of discourse for all kinds of agents and situations. The other is to account for a computational model of discourse for a specific application, say a natural language interface (Dahlbäck and Jönsson, 1992). It is not obvious that the two approaches should present similar computational theories for discourse. Instead, the different motivations should be considered when presenting theories of dialogue management for natural language interfaces. Many models for dialogue in natural language interfaces are not only models for dialogue in such interfaces but also account for general discourse. The focus in this work is on dialogue management for natural language interfaces, not on general discourse. Thus, the focus is on efficiency and habitability, i.e. a dialogue manager must correctly and efficiently handle those phenomena that actually occur in typed human-computer interaction, so that the user does not feel constrained or restricted when using the interface. This also means that a dialogue manager should be as simple as possible and not waste effort on complex computations in order to handle phenomena not relevant for natural language interfaces. For instance, the system does not necessarily have to be psycholinguistically plausible or able to mimic all aspects of human dialogue behaviour, such as surprise or irony, if these do not occur in such dialogues.

* This research was financed by the Swedish National Board for Technical Development and the Swedish Council for Research in the Humanities and Social Sciences.

Grosz and Sidner (1986) presented a general computational theory of discourse, both spoken and written, in which they divide the problem of managing discourse into three parts: linguistic structure, attentional state and intentional state.

The need for a component which records the objects, properties and relations that are in the focus of attention, the attentional state, is not much debated, although the details of focusing need careful examination. However, the roles given to the intentional state, i.e. the structure of the discourse purposes, and to the linguistic structure, i.e. the structure of the sequences of utterances in the discourse, provide two competing approaches to dialogue management:

- One approach is the plan-based approach. Here the linguistic structure is used to identify the intentional state in terms of the user's goals and intentions. These are then modelled in plans describing the actions which may possibly be carried out in different situations (cf. Cohen and Perrault, 1979; Allen and Perrault, 1980; Litman, 1985; Carberry, 1990; Pollack, 1990).
- The other approach to dialogue management is to use only the information in the linguistic structure to model the dialogue expectations, i.e. utterances are interpreted on the basis of their functional relation to the previous interaction. The idea is that these constraints on what can be uttered allow us to write a grammar to manage the dialogue (cf. Reichman, 1985; Polanyi and Scha, 1984; Bilange, 1991; Jönsson, 1991).

For the strong AI goal, or the computational linguistics goal of mimicking human language capabilities, the plan recognition approach might be necessary. But for the task of managing the dialogue in a natural language interface, the less sophisticated approach of using a dialogue grammar will do just as well, as will be argued below.

The work presented in this paper is restricted to studying written human-computer interaction in natural language, and natural language interfaces for different applications which belong to the domain that Hayes and Reddy (1983) called simple service systems.
Simple service systems "require in essence only that the customer or client identify certain entities to the person providing the service; these entities are parameters of the service, and once they are identified the service can be provided" (ibid. p. 252).

A method for customization

The method presented in this paper proposes a sublanguage approach (Grishman and Kittredge, 1986) to the development of dialogue managers. A dialogue manager should not account for the interaction behaviour utilized in every application; instead it should be designed to facilitate customization to meet the needs of a certain application.

Kelley (1983) presents a method for developing a natural language interface in six steps. The first two steps are mainly concerned with determining and implementing essential features of the application. In the third step, known as the first Wizard of Oz step, the subjects interact with what they believe is a natural language interface but which in fact is a human simulating such an interface (cf. Dahlbäck et al., 1993; Fraser and Gilbert, 1991). This provides data that are used to build a first version of the interface (step four). Kelley starts without grammar or lexicon; the rules and lexical entries are those used by the users during the simulation. In step five, Kelley improves his interface by conducting new Wizard of Oz simulations, this time with the interface running. However, when the user/subject enters a query that the system cannot handle, the wizard takes over and produces an appropriate response. The advantage is that the user's interaction is not interrupted and a more realistic dialogue is thus obtained. This interaction is logged, and in step six the system is updated to be able to handle the situations where the wizard responded.

The method used by Kelley of running a simulation in parallel with the interface was also used by Good et al. (1984). They developed a command language interface to an e-mail system using this iterative design method, UDI (User-Derived Interface). Kelley and Good et al. focus on updating the lexical and grammatical knowledge and are not concerned with dialogue behaviour.

The Dialogue Manager presented in this paper is customized to a specific application using a process inspired by the method of User-Derived Interfaces. The starting point is a corpus of dialogues collected in Wizard of Oz experiments. From this corpus the knowledge structures used by the Dialogue Manager are customized.

The Dialogue Manager

The Dialogue Manager was initially designed from an analysis of a corpus of 21 dialogues, other than the 30 used for customization (see below), collected in Wizard of Oz experiments using 5 different background systems.¹ It can be viewed as a controller of resources for interpretation, database access and generation. The Dialogue Manager receives input from the interpretation modules, inspects the result and accesses the background system with the information conveyed in the user input. Eventually an answer is returned from the background system access module, and the Dialogue Manager then calls the generation modules to generate an answer to the user. If clarification is needed by any of the resources, it is dealt with by the Dialogue Manager.

The Dialogue Manager uses information from dialogue objects which model the dialogue segments and moves and the information associated with them. The dialogue objects represent the constituents of the dialogue, and the Dialogue Manager records instances of dialogue objects in a dialogue tree as the interaction proceeds.

The dialogue objects are divided into three main classes on the basis of structural complexity: one class corresponding to the size of a dialogue, another class corresponding to the size of a discourse

¹ For further details of the Dialogue Manager, see (Ahrenberg et al., 1990); (Jönsson, 1991) and (Jönsson, 1993).
structured in terms of discourse segments,and a discourse segment in terms of moves and embedded segments.Utterances are not analysed as dialogue objects,but as linguistic objects which function as vehicles of one or more moves.2 The dialogue object descriptions are domain depen-dent and can be modified for each new application. The Dialogue Manager is customized by specifying the dialogue objects;which parameters to use and what values they can take.From the perspective of dialogue management the dialogue objects modelling the dis-course segment are the most interesting.An initiative-response(IR)structure is assumed(cf.adjacency-pairs,Schegloffand Sacks,1973)where an initiative opens a segment by introducing a new goal and the response closes the segment(Dahlb¨a ck,1991).The parameters specified in the dialogue objects reflect the information needed by the various processes accessing information stored in the dialogue tree.A dialogue object consists of a set of parameters for specifying the initiator,responder,context etc.needed in most applications.Another set of parameters spec-ify content.Two of these,termed Objects and Prop-erties,account for the information structure of a move (query),where Objects identify a set of primary refer-ents,and Properties identify a complex predicate as-cribed to this set(cf.Ahrenberg,1987).These are focal parameters in the sense that they can be in focus over a sequence of IR-units.Two principles for maintaining the focus structure are utilized.A general heuristic principle is that ev-erything not changed in an utterance is copied from one IR-node in the dialogue tree to the newly created IR-node.Another principle is that the value for Ob-jects will be updated with the value from the module accessing the database,if provided.The dialogue objects are used to specify the be-haviour of the Dialogue Manager and thus the spec-ification of the dialogue objects must include informa-tion on what actions to take in certain situations.This 
is modelled in two non-focal content parameters, Type and Topic. Type corresponds to the illocutionary type of the move. Hayes and Reddy (1983, p. 266) identify two sub-goals in simple service systems: 1) "specify a parameter to the system" and 2) "obtain the specification of a parameter". Initiatives are categorized accordingly as being of two different types: 1) update, U, where users provide information to the system, and 2) question, Q, where users obtain information from the system. Responses are categorized as answer, A, for database answers from the system or answers to clarification requests. Other Type categories are Greeting, Farewell and Discourse Continuation (DC) (Dahlbäck, 1991), the latter of which is used for utterances from the system whose purpose is to keep the conversation going. Topic describes which knowledge source to consult. In information retrieval applications three different topics are used: the database for solving a task (T), acquiring information about the database, system-related (S), or, finally, the ongoing dialogue (D).

²The use of three categories for hierarchically structuring the dialogue is motivated from the analysis of the corpora. However, there is no claim that they are applicable to all types of dialogue, and even less so, to any type of discourse. When a different number of categories are utilized, the Dialogue Manager can then be customized to capture these other categories.

The empirical basis for customization

The Dialogue Manager is customized on the basis of a corpus of 30 dialogues collected in Wizard of Oz-experiments using the actual applications. Three different applications were used and each application utilized 10 dialogues for customization. The simulations were carefully designed and carried out using a powerful simulation environment (Dahlbäck et al., 1993).
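The dialogue-object parameters described above can be summarized in a small sketch. The class and field names below are hypothetical illustrations of the Type, Topic, Objects and Properties parameters, not code from the original system.

```python
# A minimal sketch of an initiative-response (IR) dialogue object with the
# focal parameters Objects/Properties and the non-focal Type/Topic parameters.
from dataclasses import dataclass, field
from enum import Enum

class MoveType(Enum):
    UPDATE = "U"          # user provides information to the system
    QUESTION = "Q"        # user obtains information from the system
    ANSWER = "A"          # database answer or answer to a clarification
    GREETING = "Greeting"
    FAREWELL = "Farewell"
    DC = "DC"             # discourse continuation, keeps the conversation going

class Topic(Enum):
    TASK = "T"            # consult the database to solve a task
    SYSTEM = "S"          # information about the database/system
    DIALOGUE = "D"        # the ongoing dialogue, e.g. a clarification

@dataclass
class IRSegment:
    initiator: str                                   # e.g. "user" or "system"
    responder: str
    move_type: MoveType
    topic: Topic
    objects: set = field(default_factory=set)        # primary referents in focus
    properties: set = field(default_factory=set)     # predicate ascribed to them

# A task-related question about the price of a set of cars:
seg = IRSegment("user", "system", MoveType.QUESTION, Topic.TASK,
                objects={"Toyota Corolla"}, properties={"price"})
```

Because the focal parameters are plain sets, the copy heuristic for a new IR-node reduces to copying fields that the new utterance leaves unspecified.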
In the experiments there were 14 female and 16 male subjects with varying familiarity with computers. Most subjects were computer novices. The average age was 26 (min. 15, max. 55). Most of the subjects were students but there were also others with varying backgrounds, such as cleaning staff and administrative assistants. The subjects did not realize that it was a simulation and they all, in post-experimental interviews, said that they felt very comfortable with the "system".

In the simulations a scenario is presented to the subjects. In one of the simulations, cars, the scenario presents a situation where the subject, and his/her accompanying person, have just got the message that their old favourite Mercedes had broken down beyond repair and that they would have to consider buying a new car. They had a certain amount of money available and, using the computerized cars system, were asked to select three cars, and also to provide a brief motivation for their choice.

The cars database is implemented in INGRES, and output from the database can be presented directly to the subjects. Thus, answers from the system, after successful requests, are tables with information on properties of used cars. The users/subjects found this type of output very convenient as they could view a particular car in the context of other similar cars. This can be seen as an argument favouring an approach to natural language interfaces where complex reasoning is replaced with fast output of structured information.
Possibly more information than asked for is provided, but as long as it can be presented on one screen, it is convenient.

The dialogues in the other domain, travel, were collected using two scenarios, one where the subjects were asked to gather information on charter trips to the Greek Archipelago and another where they have a certain amount of money available and were asked to use the travel system to order such a charter trip. In travel it is also possible to provide graphical information to the subjects, i.e., maps of the various islands.

The use of empirical material

An important question is how to use empirical material on the one hand and common sense and prior knowledge on human-computer interaction and natural language dialogue on the other. Dahlbäck (1991) claims that this partly depends on the purpose of the study, whether it is aimed at theory development or system development. In the latter case, one always has the possibility to design the system to overcome certain problems encountered in the corpus.

In this work empirical material is used for system development from two different perspectives. The first is to develop a dialogue manager for a natural language interface which can be used in various applications. Here the empirical material needs to be analysed with the aim of designing a dialogue manager general enough to cover all the dialogue phenomena that can occur in realistic human-computer dialogues using various background systems. Thus, phenomena which occur in the empirical material must be accounted for and also certain generalizations must be made so that the Dialogue Manager can later be customized to cover phenomena that are not actually present in the corpus but are likely to occur for other applications.
Empirical material is also used for customizing the Dialogue Manager to actual applications. Here generalization is less emphasized; instead many details of how to efficiently deal with the phenomena in the implementation are more interesting.

How can empirical material be used for customization? One can take the conservative standpoint and say that only those phenomena actually occurring in the dialogues are to be handled by the Dialogue Manager (cf. Kelley, 1983). This has the advantage that a minimal model is developed which is empirically well motivated and which does not waste time on handling phenomena not occurring in the corpus. The drawback is that a very large corpus is needed for coverage of the possible actions taken by a potential user. This was also pointed out by Ogden (1988, p. 296), who claims that "The performance of the system will depend on the availability of representative users prior to actual use, and it will depend on the abilities of the installer to collect and integrate the relevant information".
The other extreme standpoint is to only use the linguistic knowledge available. One problem with this approach is that it is plausible that much effort is spent on handling phenomena which will never occur in the dialogue, while at the same time not accounting for actually occurring phenomena. However, as pointed out by Brown and Yule (1983, p. 21), "A dangerously extreme view of 'relevant data' for a discourse analyst would involve denying the admissibility of a constructed sentence as linguistic data".

For the purpose of customization, two kinds of information can be obtained from a corpus:

• First, it can be used as a source of phenomena which the designer of the natural language interface was not aware of from the beginning.

• Second, it can be used to rule out certain interesting phenomena which are complicated but which do not occur in the corpus.

The first point also includes the use of the corpus to make the system behaviour more accurate. This can be illustrated by the use of clarification subdialogues. In the cars dialogues, when the user initiative is too vague and the system needs a clarification, it first explicitly states the alternatives available and then asks for a clarification. Subjects using the cars system follow up such a clarification subdialogue as intended. However, in the travel system there are certain system clarification requests which are less explicit, and which do not state any alternatives. These clarifications do not always result in a follow-up answer from the user.

To illustrate the second point, consider the use of singular pronouns. Singular pronouns can be used in various ways to refer to a previously mentioned item.
One could argue that if a user utters something like What is the price of a Toyota Corolla?, and the answer is a table with two types of cars of different years, then the user may form a conceptualization of Toyota as a generic car and can therefore utter something like How fast is it? referring to properties of a Toyota Corolla of any year.

In the work on developing the Dialogue Manager, the use of pronouns in the corpus in various situations motivates the need for designing the Dialogue Manager to capture both uses of singular pronouns. However, when customizing the Dialogue Manager the situation is different. For instance, in the cars dialogues the users restrict their use of singular pronouns. Thus, the customized Dialogue Manager for the cars database is not provided with specific means for managing the use of singular pronouns if presented in the context above. If they occur they will result in a clarification subdialogue. However, the "normal" use of singular pronouns is allowed. There is another motivation for this position. Excluding the generic use of a singular pronoun leads to a simpler Dialogue Manager. On the other hand, including the normal use of singular pronouns will not increase the complexity of the Dialogue Manager.

The principle utilized in the customization of the Dialogue Manager is obviously very pragmatic. If the phenomenon is present in the corpus then it should be included. If it is not present, but it is present in other Wizard of Oz-studies using similar background systems and scenarios and implementation is straightforward, the Dialogue Manager should be customized to deal with it. Otherwise, if it is not present and it would increase the complexity of the Dialogue Manager, then it is not included.

This does not prevent the use of knowledge from other sources (cf. Grishman et al., 1986). In the customization of the Dialogue Manager for the cars and travel systems, knowledge on how the database is organised and also how users retrieve information from databases is used in the
customization.

Customizing the Dialogue Manager

Customization of the Dialogue Manager involves two major tasks: 1) Defining the focal parameters of the dialogue objects in more detail and customizing the heuristic principles for changing the values of these parameters. 2) Constructing a dialogue grammar for controlling the dialogue.

The focus structure

In the cars application, task-related questions are about cars, which means that the Objects parameter holds various instances of sets of cars and Properties are various properties of cars. In travel, on the other hand, users switch their attention between objects of different kinds: hotels, resorts and trips. This requires a slightly modified Objects parameter. It can be either a hotel or a resort. However, in travel the appropriate resort can be found from a hotel description by following the relation in the domain model from hotel to resort. Finding the hotel from a resort can be accomplished by a backwards search in the dialogue tree. Therefore, one single focused object, a hotel or a resort, will suffice. The value need not be a single object; it can be a set of hotels or resorts.

The general focusing principles need to be slightly modified to apply to the cars and travel applications. For the cars application the heuristic principles apply well to the Objects parameter. An intensionally specified object description provided in a user initiative will be replaced by the extensional specification provided by the module accessing the database, which means that erroneous objects will be removed, as they will not be part of the response from the database manager. For the travel application the principles for providing information to the Objects parameter are modified to allow hotels to be added if the resort remains the same.

The heuristic principles for the Properties parameter for the cars application need to be modified. The principle is that if the user does not change Objects to a set of cars which is not a subset of Objects, then the attributes
provided in the new user initiative are added to the old set of attributes. This is based on the observation that users often start with a rather large set, in this case a set of cars, and then gradually specify a smaller set by adding restrictions (cf. Kaplan, 1983), for instance in cars using utterances like remove all small size cars. For the travel application the copy principle holds without exception. The modifications of the general principles are minor and are carried out during the customization.

The results from the customizations showed that the heuristic principles applied well. In cars 52% of the user initiatives were fully specified, i.e. they did not need any information from the context to be interpreted. 43% could be interpreted from information found in the current segment as copied from the previous segment. Thus, only 5% required a search in the dialogue tree. For the travel application without ordering, 44% of the user initiatives were fully specified and 50% required local context, while in the ordering dialogues 59% were fully specified and 39% needed local context.

In the travel system there is one more object: the order form. A holiday trip is not fully defined by specifying a hotel at a resort. It also requires information concerning the actual trip: Travel length, Departure date and Number of persons. This information is needed to answer questions on the price of a holiday trip. The order form also contains all the information necessary when ordering a charter trip. In addition to the information on Resort, Hotel, Departure date, etc., the order form includes information about the name, address and telephone number of the user. Furthermore, information on travel insurance, cancellation insurance, departure airport, etc. is found in the order form. The order form is filled with user information during a system controlled phase of the dialogue.
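The focusing heuristics above can be sketched as follows. The function and the cars example are a hypothetical illustration of the copy-and-accumulate principle, assuming initiatives and IR-nodes are represented as simple dictionaries.

```python
# Hypothetical sketch of the focusing heuristics: values not changed by the
# new initiative are copied from the previous IR-node, and in the cars
# application new attributes are *added* when the user keeps restricting
# the same set of cars.
def new_ir_node(prev, initiative, application="cars"):
    """prev/initiative: dicts with 'objects' (set) and 'properties' (set)."""
    node = {
        # Copy Objects from the previous IR-node if the initiative leaves it out.
        "objects": initiative.get("objects") or prev["objects"],
        "properties": set(initiative.get("properties", set())),
    }
    if application == "cars" and node["objects"] <= prev["objects"]:
        # The user is narrowing the same set of cars: accumulate restrictions.
        node["properties"] |= prev["properties"]
    elif not node["properties"]:
        # Plain copy principle: nothing changed, so carry the old value over.
        node["properties"] = set(prev["properties"])
    return node

prev = {"objects": {"car1", "car2", "car3"}, "properties": {"price < 60000"}}
nxt = new_ir_node(prev, {"objects": {"car1", "car2"},
                         "properties": {"year > 1990"}})
# nxt accumulates both restrictions for the narrowed set of cars
```

For the travel application the accumulation branch is simply never taken, which matches the statement that the copy principle there holds without exception.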
The dialogue structure

The dialogue structure parameters Type and Topic also require customization. In the cars system the users never update the database with new information, but in the travel system where ordering is allowed the users update the order form. Here another Type is needed, CONF, which is used to close an ordering session by summarizing the order and implicitly prompting for confirmation. For the ordering phase the Topic parameter O for order is added, which means that the utterance affects the order form.

The dialogue structure can be modelled in a dialogue grammar. The resulting grammar from the customizations of both cars and travel is context free; in fact, it is very simple and consists merely of sequences of task-related initiatives followed by database responses, Q_T/A_T,³ sometimes with an embedded clarification sequence, Q_D/A_D. In cars 60% of the initiatives are of this type. For travel 83% of the initiatives in the non-ordering dialogues and 70% of the ordering dialogues are of this type. Other task-related initiatives result in a response providing system information, Q_T/A_S, or a response stating that the initiative was too vague, Q_T/A_D. There are also a number of explicit calls for system information, Q_S/A_S. The grammar rules discussed here only show two of the parameters of the dialogue objects. In fact, a number of parameters describing speaker, hearer, objects, properties, etc. are used. These descriptors provide additional information for deciding which actions to carry out. However, the complexity of the dialogue is constrained by the grammar.

The dialogue grammar is developed by first constructing a minimal dialogue grammar from an analysis of dialogues from the application, or an application of the same type, i.e., information retrieval from a database. This grammar is generalized and extended, using general knowledge on human-computer natural language interaction, with new rules to cover "obvious" additions not found in the initial grammar. In the cars dialogues it
included, for instance, Greetings and Farewells, which did not appear in the analysis of the dialogues. In the travel system it involved, among other things, allowing for multiple clarification requests and clarification requests not answered by the user. Some extensions not found in any of the dialogues were also added, for instance, a rule for having the system prompt the user with a discourse continuation if (s)he becomes unsure who has the initiative. However, if a phenomenon requires sophisticated and complex mechanisms, it will be necessary to consider what will happen if the grammar is used without that addition. This also includes considering how probable it is that a certain phenomenon may occur.

³For brevity, when presenting the dialogue grammar, Topic type will be indicated with a subscript to the Type. The Initiative is the first Type-Topic pair while the Response is the second, separated by a slash (/).

For each new application, new simulations are needed to determine which phenomena are specific for that application. This is illustrated in the travel system dialogues where ordering is not allowed. In these dialogues some users try to state an order although it is not possible. This resulted in a new rule, U_O/A_S, informing the users that ordering is not possible.

In the work by Kelley (1983) and Good et al. (1984), on lexical and grammatical acquisition, the customization process was saturated after a certain number of dialogues. The results presented here indicate that this is also the case for the dialogue structure. From a rather limited number of dialogues, a context free grammar can be constructed which, with a few generalizations, will cover the interaction patterns occurring in the actual application (Jönsson, 1993).

Summary

This paper has presented a method for the development of dialogue managers for natural language interfaces for various applications. The method uses a general dialogue manager which is customized from a corpus of dialogues, with users interacting
with the actual application, collected in Wizard of Oz-experiments. The corpus is used when customizing dialogue objects with parameters and heuristic principles for maintaining focus structure. It is also used when constructing a dialogue grammar which controls the dialogue.

The customization of the Dialogue Manager for two different applications, database information retrieval and database information retrieval plus ordering, was also presented. Customization was carried out for two different domains: properties of used cars and information on holiday trips. For both domains questions can be described as queries on specifications of domain concepts about objects in the database, and simple heuristic principles are sufficient for modelling the focus structure. A context free dialogue grammar can accurately control the dialogue for both applications. The results on customization are very promising for the approach to dialogue management presented in this paper. They show that the use of dialogue objects which can be customized for various applications in combination with a dialogue grammar is a fruitful way to build application-specific dialogue managers.
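The context-free dialogue grammar summarized above, sequences of task-related initiatives and responses with embedded clarification sequences, can be sketched as a flat pair checker. This is a hypothetical simplification that flattens the embedding of clarification subdialogues; the pair names follow the Q_T/A_T notation of the paper.

```python
# Legal initiative/response pairs from the customized dialogue grammar,
# written here as "Type/Topic" strings. A hypothetical sketch, not the
# actual grammar implementation.
LEGAL_PAIRS = {
    ("Q/T", "A/T"),   # task question answered from the database
    ("Q/T", "A/S"),   # task question answered with system information
    ("Q/T", "A/D"),   # task question judged too vague, triggers clarification
    ("Q/D", "A/D"),   # clarification request and its answer
    ("Q/S", "A/S"),   # explicit call for system information
    ("U/O", "A/S"),   # attempted order in a system that disallows ordering
}

def accepts(moves):
    """moves: flat list of Type/Topic strings, e.g. ['Q/T', 'A/T', ...]."""
    if len(moves) % 2:          # every initiative needs a response
        return False
    return all((i, r) in LEGAL_PAIRS
               for i, r in zip(moves[::2], moves[1::2]))

# A task question that triggers a clarification subdialogue before succeeding:
ok = accepts(["Q/T", "A/D", "Q/D", "A/D", "Q/T", "A/T"])
```

A fuller treatment would push embedded Q_D/A_D sequences onto a stack so that the enclosing Q_T segment is closed by its own A_T, but the pair table already captures which adjacency pairs the grammar licenses.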
Title: Effective Workplace Communication: A Dialogue on Company Management
---
Introduction: In any organization, effective communication among team members and management is crucial for success. This dialogue presents a scenario where employees discuss various aspects of company management and how communication plays a pivotal role in achieving organizational goals.

Dialogue:

Alex: Good morning, everyone. Today, I want to discuss how we can enhance communication within our company to improve overall efficiency and productivity. Any thoughts?

Sara: I believe regular team meetings can be a great platform for fostering communication. It allows us to discuss ongoing projects, address any concerns, and brainstorm ideas collectively.

John: Absolutely, Sara. Additionally, having an open-door policy where employees feel comfortable approaching management with their ideas or issues can greatly improve transparency and trust within the organization.

Emily: I agree with both of you. However, it's essential for management to actively listen to employees' feedback and take appropriate actions. This not only makes employees feel valued but also leads to better decision-making processes.

Alex: That's a valid point, Emily. Actively listening to our employees and implementing their suggestions where feasible can significantly boost morale and foster a culture of innovation.

Sara: Another aspect we should consider is the use of technology to streamline communication processes. Implementing collaboration tools and platforms can facilitate seamless information sharing and project management, especially in today's remote work environment.

John: Absolutely, Sara. Technology can bridge the gap between remote teams and ensure everyone stays connected and informed.
However, we must ensure that proper training is provided to employees to maximize the benefits of these tools.

Emily: In addition to formal communication channels, informal interactions such as team-building activities and social events can also strengthen bonds among employees and promote a positive work environment.

Alex: Indeed, Emily. Building a strong sense of camaraderie among team members fosters collaboration and enhances overall productivity. It's essential to strike a balance between work and social interactions to maintain employee engagement and satisfaction.

Conclusion: Effective communication is the cornerstone of successful company management. By fostering open dialogue, leveraging technology, and nurturing a supportive work culture, organizations can empower their employees and drive sustainable growth. It's imperative for management to prioritize communication strategies that promote transparency, collaboration, and employee engagement to achieve long-term success.

---
Title: On Management Systems

Management systems are essential in any organization, as they provide structure and direction for employees. Without clear management systems, employees may feel lost or unsure about their roles and responsibilities.

One important aspect of management systems is communication. Effective communication channels must be established to ensure that information is shared throughout the organization. This includes both formal channels, such as company meetings and memos, as well as informal channels, such as water cooler conversations.

Another crucial element of management systems is accountability. Employees must be held accountable for their actions and performance. This can be achieved through regular performance evaluations and clear expectations for job responsibilities.

Training and development programs are also necessary components of management systems. Employees should be given opportunities to improve their skills and knowledge, which will not only benefit them personally, but also contribute to the overall success of the organization.

In addition, management systems must be adaptable to change. As the business landscape evolves, organizations must be able to adjust their management systems to remain competitive and relevant. This requires a willingness to embrace new technologies and processes, and a commitment to continuous improvement.

Finally, management systems must be aligned with the organization's values and goals. This ensures that all employees are working towards a common purpose, and helps to create a strong sense of company culture and identity.

Overall, effective management systems are essential for the success of any organization. By establishing clear communication channels, promoting accountability, providing training and development opportunities, embracing change, and aligning with company values and goals, organizations can create a strong foundation for growth and success.
1. Management Communication Basics

Why communicate?

It leads to greater effectiveness. It keeps people in the picture. It gets people involved with the organization and increases motivation to perform well; increases commitment to the organization. It makes for better relationships and understanding between: boss and subordinate; colleagues; people within the organization and outside it. It helps people to understand the need for change: how they should manage it; how to reduce resistance to change.

Philosophy elements (preference, background, experiences and value).

What management communication involves:
1. Communication is, first of all, the transfer of meaning.
2. For communication to succeed, meaning must not only be transferred but also understood.
3. In the communication process, all that passes between the communicators are symbols, not the information itself.
4. Good communication is often mistakenly understood as the two parties

A 1995 survey by the Harvard University careers guidance group found that, of 500 men and women who had been dismissed, 82% were judged unfit for their jobs because of poor interpersonal communication.

"Three cobblers with their wits combined surpass one Zhuge Liang."

Communication is an important factor in personal career success.
Spoken Dialogue Management using Probabilistic Reasoning
Nicholas Roy, Joelle Pineau and Sebastian Thrun
Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213

Abstract

Spoken dialogue managers have benefited from using stochastic planners such as Markov Decision Processes (MDPs). However, so far, MDPs do not handle well noisy and ambiguous speech utterances. We use a Partially Observable Markov Decision Process (POMDP)-style approach to generate dialogue strategies by inverting the notion of dialogue state; the state represents the user's intentions, rather than the system state. We demonstrate that under the same noisy conditions, a POMDP dialogue manager makes fewer mistakes than an MDP dialogue manager. Furthermore, as the quality of speech recognition degrades, the POMDP dialogue manager automatically adjusts the policy.

1 Introduction

The development of automatic speech recognition has made possible more natural human-computer interaction. Speech recognition and speech understanding, however, are not yet at the point where a computer can reliably extract the intended meaning from every human utterance. Human speech can be both noisy and ambiguous, and many real-world systems must also be speaker-independent. Regardless of these difficulties, any system that manages human-machine dialogues must be able to perform reliably even with noisy and stochastic speech input.

Recent research in dialogue management has shown that Markov Decision Processes (MDPs) can be useful for generating effective dialogue strategies (Young, 1990; Levin et al., 1998); the system is modelled as a set of states that represent the dialogue as a whole, and a set of actions corresponding to speech productions from the system. The goal is to maximise the reward obtained for fulfilling a user's request. However, the correct way to represent the state of the dialogue is still an open problem (Singh et al., 1999). A common solution is to restrict the system to a single goal.
For example, in booking a flight in an automated travel agent system, the system state is described in terms of how close the agent is to being able to book the flight.

Such systems suffer from a principal problem. A conventional MDP-based dialogue manager must know the current state of the system at all times, and therefore the state has to be wholly contained in the system representation. These systems perform well under certain conditions, but not all. For example, MDPs have been used successfully for such tasks as retrieving e-mail or making travel arrangements (Walker et al., 1998; Levin et al., 1998) over the phone, task domains that are generally low in both noise and ambiguity. However, the issue of reliability in the face of noise is a major concern for our application. Our dialogue manager was developed for a mobile robot application that has knowledge from several domains, and must interact with many people over time. For speaker-independent systems and systems that must act in a noisy environment, the user's actions and intentions cannot always be used to infer the dialogue state; it may not be possible to reliably and completely determine the state of the dialogue following each utterance.
The poor reliability of the audio signal on a mobile robot, coupled with the expectations of natural interaction that people have with more anthropomorphic interfaces, increases the demands placed on the dialogue manager.

Most existing dialogue systems do not model confidences on recognition accuracy of the human utterances, and therefore do not account for the reliability of speech recognition when applying a dialogue strategy. Some systems do use the log-likelihood values for speech utterances; however, these values are only thresholded to indicate whether the utterance needs to be confirmed (Niimi and Kobayashi, 1996; Singh et al., 1999).

An important concept lying at the heart of this issue is that of observability: the ultimate goal of a dialogue system is to satisfy a user request; however, what the user really wants is at best partially observable. We handle the problem of partial observability by inverting the conventional notion of state in a dialogue. The world is viewed as partially unobservable; the underlying state is the intention of the user with respect to the dialogue task.
The only observations about the user's state are the speech utterances given by the speech recognition system, from which some knowledge about the current state can be inferred. By accepting the partial observability of the world, the dialogue problem becomes one that is addressed by Partially Observable Markov Decision Processes (POMDPs) (Sondik, 1971). Finding an optimal policy for a given POMDP model corresponds to defining an optimal dialogue strategy. Optimality is attained within the context of a set of rewards that define the relative value of taking various actions.

We will show that conventional MDP solutions are insufficient, and that a more robust methodology is required. Note that in the limit of perfect sensing, the POMDP policy will be equivalent to an MDP policy. What the POMDP policy offers is an ability to compensate appropriately for better or worse sensing. As the speech recognition degrades, the POMDP policy acquires reward more slowly, but makes fewer mistakes and blind guesses compared to a conventional MDP policy.

There are several POMDP algorithms that may be the natural choice for policy generation (Sondik, 1971; Monahan, 1982; Parr and Russell, 1995; Cassandra et al., 1997; Kaelbling et al., 1998; Thrun, 1999). However, solving real world dialogue scenarios is computationally intractable for full-blown POMDP solvers, as the complexity is doubly exponential in the number of states. We therefore will use an algorithm for finding approximate solutions to POMDP-style problems and apply it to dialogue management.
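The inference step described above, maintaining a distribution over the user's hidden intentions and updating it from noisy recognizer output, can be sketched as a standard Bayes filter. The states, keywords and probabilities below are hypothetical illustrations, not the authors' model.

```python
# A minimal Bayes-filter sketch of belief tracking in the POMDP view: the
# user's intention is a hidden state, and each noisy keyword observation
# reweights a probability distribution over intentions.
import math

def belief_update(belief, obs, T, O):
    """belief: {state: prob}; T[s][s2]: transition prob; O[s][obs]: obs prob."""
    # Predict: propagate the belief through the transition model.
    predicted = {s2: sum(belief[s1] * T[s1][s2] for s1 in belief) for s2 in T}
    # Correct: reweight by the likelihood of the observed keyword.
    unnorm = {s: O[s].get(obs, 0.0) * p for s, p in predicted.items()}
    z = sum(unnorm.values()) or 1.0
    return {s: p / z for s, p in unnorm.items()}

def belief_entropy(belief):
    """Entropy of the belief, the compression used by the Augmented MDP."""
    return -sum(p * math.log(p) for p in belief.values() if p > 0)

# Two hypothetical user intentions and a noisy keyword observation:
T = {"want_tv": {"want_tv": 0.9, "want_news": 0.1},
     "want_news": {"want_tv": 0.1, "want_news": 0.9}}
O = {"want_tv": {"abc": 0.7, "nbc": 0.3},
     "want_news": {"abc": 0.2, "nbc": 0.8}}
b = belief_update({"want_tv": 0.5, "want_news": 0.5}, "abc", T, O)
# The belief shifts toward 'want_tv', and its entropy drops accordingly.
```

Summarizing the belief by its entropy is what lets the Augmented MDP approximation mentioned above plan over an augmented state instead of the full, intractable belief space.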
This algorithm, the Augmented MDP, was developed for mobile robot navigation (Roy and Thrun, 1999), and operates by augmenting the state description with a compression of the current belief state. By representing the belief state succinctly with its entropy, belief-space planning can be approximated without the expected complexity.

In the first section of this paper, we develop the model of dialogue interaction. This model allows for a more natural description of dialogue problems, and in particular allows for intuitive handling of noisy and ambiguous dialogues. Few existing dialogue systems can handle ambiguous input, typically relying on natural language processing to resolve semantic ambiguities (Aust and Ney, 1998). Secondly, we present a description of an example problem domain, and finally we present experimental results comparing the performance of the POMDP (approximated by the Augmented MDP) to conventional MDP dialogue strategies.

2 Dialogue Systems and POMDPs

A Partially Observable Markov Decision Process (POMDP) is a natural way of modelling dialogue processes, especially when the state of the system is viewed as the state of the user. The partial observability capabilities of a POMDP policy allow the dialogue planner to recover from noisy or ambiguous utterances in a natural and autonomous way. At no time does the machine interpreter have any direct knowledge of the state of the user, i.e., what the user wants. The machine interpreter can only infer this state from the user's noisy input. The POMDP framework provides a principled mechanism for modelling uncertainty about what the user is trying to accomplish.

The POMDP consists of an underlying, unobservable Markov Decision Process. The MDP is specified by:

• a set of states
• a set of actions
• a set of transition probabilities
• a set of rewards
• an initial state

The actions represent the set of responses that the system can carry out. The transition probabilities form a structure over the set of states, connecting the states in a directed
graph with arcs between states with non-zero transition probabilities. The rewards define the relative value of accomplishing certain actions when in certain states. The POMDP adds:

- a set of observations O
- a set of observation probabilities P(o | s)

and replaces

- the initial state s0 with an initial belief b0, and
- the set of rewards with rewards conditioned on observations as well, R(s, a, o).

The observations consist of a set of keywords which are extracted from the speech utterances. The POMDP plans in belief space; each belief consists of a probability distribution over the set of states, representing the respective probability that the user is in each of these states. The initial belief specified in the model is updated every time the system receives a new observation from the user.

The POMDP model, as defined above, first goes through a planning phase, during which it finds an optimal strategy, or policy, which describes an optimal mapping of action to belief, for all possible beliefs. The dialogue manager uses this policy to direct its behaviour during conversations with users. The optimal strategy for a POMDP is one that prescribes action selection that maximises the expected reward. Unfortunately, finding an optimal policy exactly for all but the most trivial POMDP problems is computationally intractable. A near-optimal policy can be computed significantly faster than an exact one, at the expense of a slight reduction in performance. This is often done by imposing restrictions on the policies that can be selected, or by simplifying the belief state and solving for a simplified uncertainty representation.

In the Augmented MDP approach, the POMDP problem is simplified by noticing that the belief state of the system tends to have a certain structure. The uncertainty that the system has is usually domain-specific and localised. For example, it may be likely that a household robot system can confuse TV channels ('ABC' for 'NBC'), but it is unlikely that the system will confuse a TV channel request for a request to get
coffee. By making the localised assumption about the uncertainty, it becomes possible to summarise any given belief vector by a pair consisting of the most likely state, and the entropy of the belief state:

    s_b = argmax_s b(s)                      (1)

    H(b) = - sum_s b(s) log b(s)             (2)

The entropy of the belief state approximates a sufficient statistic for the entire belief state. Given this assumption, we can plan a policy for every possible such (state, entropy) pair, that approximates the POMDP policy for the corresponding belief.

Figure 1: Florence Nightingale, the prototype nursing home robot used in these experiments.

3 The Example Domain

The system that was used throughout these experiments is based on a mobile robot, Florence Nightingale (Flo), developed as a prototype nursing home assistant. Flo uses the Sphinx II speech recognition system (Ravishankar, 1996), and the Festival speech synthesis system (Black et al., 1999). Figure 1 shows a picture of the robot.

Since the robot is a nursing home assistant, we use task domains that are relevant to assisted living in a home environment. Table 1 shows a list of the task domains the user can inquire about (the time, the patient's medication schedule, what is on different TV stations), in addition to a list of robot motion commands. These abilities have all been implemented on Flo. The medication schedule is pre-programmed, the information about the TV schedules is downloaded on request from the web, and the motion commands correspond to pre-selected robot navigation sequences.

Table 1: Task domains and robot motion commands.
- Time
- Medication (Medication 1, Medication 2, ..., Medication n)
- TV schedules for different channels (ABC, NBC, CBS)
- Robot motion commands (To the kitchen, To the bedroom)

4 Experimental Results

We compared the performance of the three algorithms (conventional MDP, POMDP approximated by the Augmented MDP, and exact POMDP) over the example domain. The metric used was to look at the total reward accumulated over the course of an extended test. In order to perform this full test, the observations and states from the underlying MDP
were generated stochastically from the model and then given to the policy. The action taken by the policy was returned to the model, and the policy was rewarded based on the state-action-observation triplet. The experiments were run for a total of 100 dialogues, where each dialogue is considered to be a cycle of observation-action utterances from the start state request_begun through a sequence of states and back to the start state. The time was normalised by the length of each dialogue cycle.

4.1 The Restricted State Space Problem

The exact POMDP policy was generated using the Incremental Improvement algorithm (Cassandra et al., 1997). The solver was unable to complete a solution for the full state space, so we created a much smaller dialogue model, with only 7 states and 2 task domains: time and weather information.

Figure 2: A simplified graph of the basic Markov Decision Process underlying the dialogue manager. Only the maximum-likelihood transitions are shown.

Table 2: An example dialogue (columns: Observation, True State, Belief Entropy, Action, Reward). Note that the robot chooses the correct action in the final two exchanges, even though the utterance is both noisy and ambiguous.

Figure 3 shows the performance of the three algorithms, over the course of 100 dialogues. Notice that the exact POMDP strategy outperformed both the conventional MDP and approximate POMDP; it accumulated the most reward, and did so with the fastest rate of accumulation. The good performance of the exact POMDP is not surprising because it is an optimal solution for this problem, but the time to compute this strategy is high: 729 secs, compared with 1.6 msec for the MDP and 719 msec for the Augmented MDP.
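The evaluation procedure described above (sample an observation of the hidden state, feed it to the policy, reward the returned action, and step the model) can be sketched as follows. The states, actions, probabilities and reward values here are invented for illustration; they are not the paper's actual model parameters.

```python
import random

def evaluate_policy(policy, transition, observe, reward, start_state,
                    n_dialogues=100, max_turns=20, seed=0):
    """Stochastically roll out dialogues against a generative model and
    accumulate reward over state-action-observation triplets."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_dialogues):
        state = start_state
        for _ in range(max_turns):
            # sample a noisy observation of the hidden user state
            obs = rng.choices(*zip(*observe[state].items()))[0]
            action = policy(obs)
            total += reward(state, action, obs)
            # sample the next user state given the system's action
            state = rng.choices(*zip(*transition[(state, action)].items()))[0]
            if state == start_state:      # dialogue cycle completed
                break
    return total

# Toy 2-state model (assumed states/actions, not from the paper):
observe = {"want_time": {"time": 0.8, "tv": 0.2},
           "request_begun": {"hello": 1.0}}
transition = {("request_begun", "greet"): {"want_time": 1.0},
              ("request_begun", "say_time"): {"want_time": 1.0},
              ("want_time", "say_time"): {"request_begun": 1.0},
              ("want_time", "greet"): {"want_time": 1.0}}
reward = lambda s, a, o: 100 if (s, a) in {("want_time", "say_time"),
                                           ("request_begun", "greet")} else -100
policy = lambda obs: {"hello": "greet", "time": "say_time",
                      "tv": "say_time"}.get(obs, "greet")
score = evaluate_policy(policy, transition, observe, reward, "request_begun")
```

In this toy model the policy always takes the rewarded action, so each dialogue cycle earns +200 and the 100 dialogues accumulate 20000 reward in total.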
4.2 The Full State Space Problem

Figure 4 demonstrates the algorithms on the full dialogue model as given in Figure 2. Because of the number of states, no exact POMDP solution could be computed for this problem; the POMDP policy is restricted to the approximate solution. The POMDP solution clearly outperforms the conventional MDP strategy, as it more than triples the total accumulated reward over the lifetime of the strategies, although at the cost of taking longer to reach the goal state in each dialogue.

Figure 3: A comparison of the reward gained over time for the exact POMDP, POMDP approximated by the Augmented MDP, and the conventional MDP for the 7-state problem. In this case, the time is measured in dialogues, or iterations of satisfying user requests.

Figure 4: A comparison of the reward gained over time for the approximate POMDP vs. the conventional MDP for the 13-state problem. Again, the time is measured in number of actions.

Table 3 breaks down the numbers in more detail. The average reward for the POMDP is 18.6 per action, which is the maximum reward for most actions, suggesting that the POMDP is taking the right action about 95% of the time. Furthermore, the average reward per dialogue for the POMDP is 230 compared to 49.7 for the conventional MDP, which suggests that the conventional MDP is making a large number of mistakes in each dialogue. Finally, the standard deviation for the POMDP is much narrower, suggesting that this algorithm is getting its rewards much more consistently than the conventional MDP.

4.3 Verification of Models on Users

We verified the utility of the POMDP approach by testing the approximating model on human users. The user
testing of the robot is still preliminary, and therefore the experiment presented here cannot be considered a rigorous demonstration. However, Table 4 shows some promising results. Again, the POMDP policy is the one provided by the approximating Augmented MDP.

The experiment consisted of having users interact with the mobile robot under a variety of conditions. The users tested both the POMDP and an implementation of a conventional MDP dialogue manager. Both planners used exactly the same model. The users were presented first with one manager, and then the other, although they were not told which manager was first and the order varied from user to user randomly. The user labelled each action from the system as "Correct" (+100 reward), "OK" (-1 reward) or "Wrong" (-100 reward). The "OK" label was used for responses by the robot that were questions (i.e., did not satisfy the user request) but were relevant to the request, e.g., a confirmation of TV channel when a TV channel was requested.

The system performed differently for the three test subjects, compensating for the speech recognition accuracy which varied significantly between them. In user #2's case, the POMDP manager took longer to satisfy the requests, but in general gained more reward per action. This is because the speech recognition system generally had lower word accuracy for this user, either because the user had unusual speech patterns, or because the acoustic signal was corrupted by background noise.

By comparison, user #3's results show that in the limit of good sensing, the POMDP policy approaches the MDP policy. This user had a much higher recognition rate from the speech recogniser, and consequently both the POMDP and conventional MDP acquire rewards at equivalent rates, and satisfied requests at similar rates.

5 Conclusion

This paper discusses a novel way to view the dialogue management problem. The domain is represented as the partially observable state of the user, where the observations are speech utterances from the
user. The POMDP representation inverts the traditional notion of state in dialogue management, treating the state as unknown, but inferrable from the sequences of observations from the user. Our approach allows us to model observations from the user probabilistically, and in particular we can compensate appropriately for more or less reliable observations from the speech recognition system. In the limit of perfect recognition, we achieve the same performance as a conventional MDP dialogue policy. However, as recognition degrades, we can model the effects of actively gathering information from the user to offset the loss of information in the utterance stream.

In the past, POMDPs have not been used for dialogue management because of the computational complexity involved in solving anything but trivial problems. We avoid this problem by using an augmented MDP state representation for approximating the optimal policy, which allows us to find a solution that quantitatively outperforms the conventional MDP, while dramatically reducing the time to solution compared to an exact POMDP algorithm (linear vs. exponential in the number of states).

Table 3: Average rewards for the full model in simulation (mean +/- std. dev.).
                               POMDP             Conventional MDP
  Average Reward Per Action    18.6 +/- 57.1     3.8 +/- 67.2
  Average Dialogue Reward      230.7 +/- 77.4    49.7 +/- 193.7

Table 4: A comparison of the rewards accumulated for the two algorithms using the full model on real users, with results given as mean +/- std. dev.
  User 2                       POMDP             Conventional MDP
  Reward Per Action            36.95             6.19
  Errors per request           0.1 +/- 0.09      0.825 +/- 1.56
  Time to fill request         2.5 +/- 1.22      1.86 +/- 1.47

We have shown experimentally both in simulation and in preliminary user testing that the POMDP solution consistently outperforms the conventional MDP dialogue manager, as a function of erroneous actions during the dialogue. We are able to show with actual users that as the speech recognition performance varies, the dialogue manager is able to compensate appropriately.

While the results of the POMDP approach to the dialogue system are
promising, a number of improvements are needed. The POMDP is overly cautious, refusing to commit to a particular course of action until it is completely certain that it is appropriate. This is reflected in its liberal use of verification questions. This could be avoided by having some non-static reward structure, where information gathering becomes increasingly costly as it progresses.

The policy is extremely sensitive to the parameters of the model, which are currently set by hand. While learning the parameters from scratch for a full POMDP is probably unnecessary, automatic tuning of the model parameters would definitely add to the utility of the model. For example, the optimality of a policy is strongly dependent on the design of the reward structure. It follows that incorporating a learning component that adapts the reward structure to reflect actual user satisfaction would likely improve performance.

6 Acknowledgements

The authors would like to thank Tom Mitchell for his advice and support of this research. Kevin Lenzo and Mathur Ravishankar made our use of Sphinx possible, answered requests for information and made bug fixes willingly. Tony Cassandra was extremely helpful in distributing his POMDP code to us, and answering promptly any questions we had. The assistance of the Nursebot team is also gratefully acknowledged, including the members from the School of Nursing and the Department of Computer Science Intelligent Systems at the University of Pittsburgh.
This research was supported in part by Le Fonds pour la Formation de Chercheurs et l'Aide à la Recherche (Fonds FCAR).

References

Harald Aust and Hermann Ney. 1998. Evaluating dialog systems used in the real world. In Proc. IEEE ICASSP, volume 2, pages 1053-1056.

A. Black, P. Taylor, and R. Caley. 1999. The Festival Speech Synthesis System, 1.4 edition.

Anthony Cassandra, Michael L. Littman, and Nevin L. Zhang. 1997. Incremental pruning: A simple, fast, exact algorithm for partially observable Markov decision processes. In Proc. 13th Ann. Conf. on Uncertainty in Artificial Intelligence (UAI-97), pages 54-61, San Francisco, CA.

Leslie Pack Kaelbling, Michael L. Littman, and Anthony R. Cassandra. 1998. Planning and acting in partially observable stochastic domains. Artificial Intelligence, 101:99-134.

Esther Levin, Roberto Pieraccini, and Wieland Eckert. 1998. Using Markov decision process for learning dialogue strategies. In Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP).

George E. Monahan. 1982. A survey of partially observable Markov decision processes. Management Science, 28(1):1-16.

Yasuhisa Niimi and Yutaka Kobayashi. 1996. Dialog control strategy based on the reliability of speech recognition. In Proc. International Conference on Spoken Language Processing (ICSLP).

Ronald Parr and Stuart Russell. 1995. Approximating optimal policies for partially observable stochastic domains. In Proceedings of the 14th International Joint Conference on Artificial Intelligence.

M. Ravishankar. 1996. Efficient Algorithms for Speech Recognition. Ph.D. thesis, Carnegie Mellon.

Nicholas Roy and Sebastian Thrun. 1999. Coastal navigation with mobile robots. In Advances in Neural Processing Systems, volume 12.

Satinder Singh, Michael Kearns, Diane Litman, and Marilyn Walker. 1999. Reinforcement learning for spoken dialog systems. In Advances in Neural Processing Systems, volume 12.

E. Sondik. 1971. The Optimal Control of Partially Observable Markov Decision Processes. Ph.D. thesis, Stanford University, Stanford, California.
Sebastian Thrun. 1999. Monte Carlo POMDPs. In S. A. Solla, T. K. Leen, and K.-R. Müller, editors, Advances in Neural Processing Systems, volume 12.

Marilyn A. Walker, Jeanne C. Fromer, and Shrikanth Narayanan. 1998. Learning optimal dialogue strategies: a case study of a spoken dialogue agent for email. In Proc. ACL/COLING'98.

Sheryl Young. 1990. Use of dialogue, pragmatics and semantics to enhance speech recognition. Speech Communication, 9(5-6), Dec.
Clarification Dialogues as Measure to Increase Robustness in a Spoken Dialogue System
Dialogue
The dialogue component is realized as a hybrid architecture: it contains statistical and knowledge-based methods. Both parts work with dialogue acts (Bunt, 1981) as basic units of processing. The statistics module is based on data automatically derived from a corpus annotated with dialogue acts. It determines possible follow-up dialogue acts for every utterance. The plan recognizer as knowledge-based module of the dialogue component incorporates a dialogue model, which describes sequences of dialogue acts as occurring in appointment scheduling dialogues (Alexandersson and Reithinger, 1995).
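A statistics module of the kind described above can be approximated by dialogue-act bigram counts over an annotated corpus. The sketch below uses invented dialogue-act labels and a toy corpus, since the paper does not list its act inventory here.

```python
from collections import Counter, defaultdict

def train_followup_model(dialogues):
    """Count dialogue-act bigrams in an annotated corpus and turn them
    into conditional follow-up probabilities P(next act | current act)."""
    counts = defaultdict(Counter)
    for acts in dialogues:
        for cur, nxt in zip(acts, acts[1:]):
            counts[cur][nxt] += 1
    return {cur: {nxt: n / sum(c.values()) for nxt, n in c.items()}
            for cur, c in counts.items()}

def predict_followups(model, act, k=2):
    """Return the k most probable follow-up dialogue acts."""
    dist = model.get(act, {})
    return sorted(dist, key=dist.get, reverse=True)[:k]

# Toy appointment-scheduling corpus (assumed act labels):
corpus = [
    ["GREET", "INIT", "SUGGEST", "ACCEPT", "BYE"],
    ["GREET", "INIT", "SUGGEST", "REJECT", "SUGGEST", "ACCEPT", "BYE"],
    ["GREET", "SUGGEST", "ACCEPT", "BYE"],
]
model = train_followup_model(corpus)
top = predict_followups(model, "SUGGEST")
```

In this toy corpus SUGGEST is followed by ACCEPT three times and REJECT once, so ACCEPT is predicted as the most likely follow-up act.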
{maier, reithinger, alexandersson}@dfki.uni-sb.de
Abstract

A number of methods are implemented in the face-to-face translation system VERBMOBIL to improve its robustness. In this paper, we describe clarification dialogues as one method to deal with incomplete or inconsistent information. Currently, three types of clarification dialogues are realized: subdialogues concerning phonological ambiguities, unknown words and semantic inconsistencies. For each clarification type we discuss the detection of situations and system states which lead to their initialization and explain the information flow during processing.

1 Dialogue Processing in VERBMOBIL

• it provides contextual information for other VERBMOBIL components. These components are allowed to store (intermediate) processing results in the so-called dialogue memory (Maier, 1996);

• the dialogue memory merges the results of the various parallel processing streams, represents them consistently and makes them accessible in a uniform manner (Alexandersson, Reithinger, and Maier, 1997);

• on the basis of the content of the dialogue memory inferences can be drawn that are used to augment the results processed by other VERBMOBIL components;

• taking the history of previous dialogue states into account, the dialogue component predicts which dialogue state is most likely to occur next (Reithinger et al., 1996).

The dialogue component does not only have to be robust against unexpected, faulty or incomplete input, it also corrects and/or improves the input provided by other VERBMOBIL components. Among the measures to achieve this goal is the possibility to carry out clarification dialogues.

1.2 The Architecture of the Component
Spoken Dialogue Management Using Hierarchical Reinforcement Learning and Dialogue Simulation

Heriberto Cuayáhuitl
Supervisors: Prof. Steve Renals and Dr. Oliver Lemon
PhD Research Proposal
Institute for Communicating and Collaborative Systems
School of Informatics
University of Edinburgh
October 2005

Abstract

Speech-based human-computer interaction faces several difficult challenges in order to be more widely accepted. One of the challenges in spoken dialogue management is to control the dialogue flow (dialogue strategy) in an efficient and natural way. Dialogue strategies designed by humans are prone to errors, labour-intensive and non-portable, making automatic design an attractive alternative. Previous work proposed addressing the dialogue strategy design as an optimization problem using the reinforcement learning framework. However, the size of the state space grows exponentially according to the state variables taken into account, making the task of learning dialogue strategies for large-scale spoken dialogue systems difficult. In addition, learning dialogue strategies from real users is a very expensive and time-consuming process, making automatic learning an attractive alternative. To address these research problems three lines of investigation are proposed. Firstly, to investigate a method to simulate task-oriented human-computer dialogues at the intention level in order to design the dialogue strategy automatically. Secondly, to investigate a metric to evaluate the realism of simulated dialogues.
Thirdly, to make a comparative study between hierarchical reinforcement learning methods and reinforcement learning with function approximation, in order to find an effective and efficient method to learn optimal dialogue strategies in large state spaces. Finally, a timeline for the completion of this research is proposed.

Keywords: Spoken dialogue systems, probabilistic dialogue management, human-computer dialogue simulation, user modelling, hidden Markov models, dialogue optimization, dialogue strategies, Markov decision processes, Semi-Markov decision processes, reinforcement learning, hierarchical reinforcement learning, function approximation, dialogue systems evaluation.

Acknowledgements

This research is being sponsored mainly by PROMEP ("PROgrama de MEjoramiento del Profesorado"), part of the Mexican Ministry of Education (http://promep.sep.gob.mx). It is also being sponsored by the Autonomous University of Tlaxcala (www.uatx.mx).

Table of Contents

1 Introduction
  1.1 Spoken Dialogue Systems
  1.2 Motivation
  1.3 Proposal
  1.4 Research Questions
  1.5 Contributions
2 Previous Work
  2.1 Spoken Dialogue Management
  2.2 Human-Computer Dialogue Simulation
  2.3 Reinforcement Learning for Dialogue Management
  2.4 Spoken Dialogue Systems Evaluation
3 Human-Computer Dialogue Simulation Using Hidden Markov Models
  3.1 Introduction
  3.2 Probabilistic Dialogue Simulation
    3.2.1 The System Model
    3.2.2 The User Model
    3.2.3 The Simulation Algorithm
  3.3 Dialogue Similarity
  3.4 Experimental Design
    3.4.1 Training the System and User Model
    3.4.2 Evaluation Metrics
    3.4.3 Experiments and Results
  3.5 Conclusions and Future Directions
  3.6 Proposed Future Work
4 Spoken Dialogue Management Using Hierarchical Reinforcement Learning
  4.1 Introduction
  4.2 The Reinforcement Learning Framework
    4.2.1 Markov Decision Processes
    4.2.2 Semi-Markov Decision Processes
    4.2.3 Reinforcement Learning Methods
  4.3 Hierarchical Reinforcement Learning Methods
    4.3.1 The Options Framework
    4.3.2 The MAXQ Method
    4.3.3 Hierarchies of Abstract Machines
    4.3.4 Comparison of Methods
  4.4 Reinforcement Learning with Function Approximation
  4.5 Experimental Design
    4.5.1 The Agent-Environment
    4.5.2 Evaluation Metrics
    4.5.3 Experiments
  4.6 Proposed Future Work
5 Future Plans
  5.1 Timetable
References

Chapter 1 Introduction

1.1 Spoken Dialogue Systems

Speech-based human-computer interaction faces several difficult challenges in order to be more widely accepted. An important justification for doing research on this topic is the fact that speech is the most efficient and natural form of communication for humans. Currently, human-computer interaction is mainly performed using the following devices: keyboard, mouse and screen. There are many different and attractive reasons for interacting with computers using speech. For instance, in hands-free and eyes-free environments (such as walking or driving a car), in computer applications where providing information is tedious (such as searching/consulting information or booking a service), in mobile environments, or simply to have fun (such as talking with toys). Computer programs supporting interaction with speech are called "Conversational Interfaces", computer programs supporting different modalities (such as speech, pen, and touch among others) are called "Multimodal Conversational Interfaces", and computer programs supporting only speech are typically called "Spoken Dialogue Systems".
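A spoken dialogue system of this kind is typically organised as a pipeline of input components, a dialogue manager and output components. A toy sketch of that turn-taking loop is given below; every component is a rule-based mock with invented intentions and prompts, following the usual ASR -> NLU -> DM -> NLG -> TTS decomposition rather than any real system.

```python
def asr(audio):          # speech -> words (mocked as identity on text)
    return audio

def nlu(words):          # words -> user intention (mocked lookup)
    return {"hello": "greeting",
            "a flight to london": "provide_destination",
            "bye": "goodbye"}.get(words, "unknown")

def dialogue_manager(intention, state):
    # choose a system intention given the user intention and dialogue state
    if intention == "greeting":
        return "ask_destination"
    if intention == "provide_destination":
        state["destination"] = "london"
        return "confirm_destination"
    if intention == "goodbye":
        return "close"
    return "ask_repeat"

def nlg(system_intention):  # system intention -> words
    return {"ask_destination": "Where do you want to go?",
            "confirm_destination": "Did you say London?",
            "close": "Goodbye.",
            "ask_repeat": "Sorry, could you repeat that?"}[system_intention]

def tts(words):          # words -> speech (mocked: return the text)
    return words

state = {}
transcript = []
for user_turn in ["hello", "a flight to london", "bye"]:
    intention = nlu(asr(user_turn))
    system_intention = dialogue_manager(intention, state)
    transcript.append(tts(nlg(system_intention)))
    if system_intention == "close":
        break
```

The loop alternates user and system turns until the dialogue manager emits a closing intention, mirroring the finite iterative loop described in the text.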
The main task of a spoken dialogue system is to recognize user intentions and to provide coherent responses until user goals are achieved. Figure 1.1 illustrates the architecture of a basic spoken dialogue system. Briefly, when the user speaks, the "input" components recognize user intentions and provide them to the "dialogue manager" (DM), which consults information from a database and provides an answer to the user through the "output" components. The conversation is basically driven by three levels of communication: speech, words and intentions. Typically, the user provides speech signals using either a microphone or telephone. The "input" components receive a speech signal and provide user intentions: the speech signal is given to the "Automatic Speech Recognition" (ASR) component, which looks for the words corresponding to the given speech signal and passes them on to the "natural language understanding" (NLU) component, which looks for the intentions corresponding to the given words. The main task of the dialogue manager is to control the flow of the conversation in an effective and natural way by providing the best system intentions given the current user intentions, information from the database and history of the conversation. The "output" components are the counterpart of the input components, receiving a set of system intentions and providing a speech signal to the user: the system intentions are given to the "natural language generation" (NLG) component, which generates a contextually appropriate response and provides the corresponding words to the "text-to-speech" (TTS) component, which provides the corresponding speech signal to the user. In this way, human-computer conversations are compounded by user turns and system turns in a finite iterative loop until user goals are achieved.

Figure 1.1: Basic architecture of a spoken dialogue system.

The process described above is still a challenge for science and engineering. None of the
components described above is perfect, even for simple tasks. The ASR component may provide an incorrect sequence of words due to noise in the channel, noise in the environment or out-of-vocabulary words. The NLU component may provide an incorrect sequence of intentions due to the fact that a word sequence may be incorrect or may have several different interpretations. The DM component has a very challenging task due to the fact that the input components may convey incorrect user intentions, so the DM must work under uncertainty. In order to overcome the misunderstandings of the input components and choose the best system intentions it must use all possible knowledge in the conversation. In addition, the NLG component may provide a word sequence that may be unclear for the user. Finally, the TTS component usually provides an unnatural speech signal that may distract the user's attention.

All components in a spoken dialogue system may be simplified for well defined and simple task-oriented applications. For instance, the ASR may have a small vocabulary, the NLU may only provide semantic representation based on the utterance semantic tags, the DM may have a predefined control flow of the conversation, the NLG may have predefined answers, and finally, pre-recorded prompts may be used instead of TTS. However, even in simple applications sophisticated components are required for successful conversations in real environments. For instance, robust ASR and DM may significantly improve the performance of spoken dialogue systems.

This research is focused on dialogue management for large-scale spoken dialogue systems, where the user may have several different goals in a single conversation (e.g., some goals in the travel domain are: book a multi-leg flight, book a hotel and rent a car).

1.2 Motivation

The main motivation behind this research is the fact that the automatic design of spoken dialogue managers remains problematic, even for simple applications. Dialogue design (control flow of the
conversations) is typically hand-crafted by system developers, based on their intuition about proper dialogue flow. There are at least three motivations for automating dialogue design: a) it is a time-consuming process, b) the difficulty increases according to the dialogue complexity, and c) there may be some design issues that escape system developers. Automating dialogue design using the reinforcement learning framework, based on learning by interaction within an environment, is a current research topic. However, the size of the state space grows exponentially according to the variables taken into account, making the task of learning optimal dialogue strategies in large state spaces difficult.

Another related problem is how to learn the dialogue strategy automatically. Spoken dialogue managers are typically optimized and evaluated with dialogues collected from lengthy cycles of trials with human subjects. But training optimal dialogue strategies usually requires many dialogues to derive an optimal policy, and learning from conversations with real users may be impractical. An alternative is to use simulated dialogues. For dialogue modelling, simulation at the intention level is the most convenient, since the effects of recognition and understanding errors can be modelled and the intricacies of natural language generation can be avoided (Young, 2000). Simulated dialogues must incorporate all possible outcomes that may occur in real environments, including system and user errors. Previous work has addressed this topic by learning a user model in order to plug it into a spoken dialogue system. However, little attention has been given to methods for simulating both sides of the conversation (system and user). Furthermore, there has been a lack of research on evaluating the realism of simulated dialogues. Some potential uses of simulating the system and user are: a) to acquire knowledge from both entities, b) to learn optimal dialogue strategies in the absence of a real spoken dialogue system and real users, and c) to evaluate spoken dialogue systems in early stages of
system and real users,and c)to evaluate spoken dialogue systems in early stages of development.The primary aims of my research proposal are:1.To investigate a method for simulating task-oriented human-computer dialogues on bothsides of the conversation(system and users).This is important because small dialogue data sets may be expanded for purposes of optimization and evaluation,and because knowledge from both entities may be acquired in order to improve the interaction.2.To investigate a metric for evaluating the realism of simulated dialogues.This is impor-tant because different system/user models may be compared in a more reliable way,so that the best models may be used for simulation.3.To investigate an efficient reinforcement learning method for learning optimal dialoguestrategies in large state spaces.This is beneficial to optimize dialogue strategies for dialogue managers with many different variables,which is useful not only for spoken dialogue systems but also for multimodal conversational interfaces,where the number of variables increases.1.3ProposalI propose a two-stage research project:thefirst stage focused on human-computer dialogue simulation and the second stage focused on learning optimal dialogue strategies.In thefirst stage I propose to investigate a novel probabilistic method to simulate task-oriented human-computer conversations at the intention level using hidden Markov models with rich structures.In addition,for evaluating the simulation method I propose to investigate a novel metric to measure the realism of the simulated dialogues,based on a probabilistic ap-proach of dialogue similarity and potentially combined with other metrics previously proposed in the natural language processingfield that may help to correlate dialogue realism.For the second stage I propose to perform a comparative study of three hierarchical rein-forcement learning methods proposed in the machine learningfield.This proposal is motivated by the fact that hierarchical 
methods have proven to learn faster, with less training data, and have not been applied to spoken dialogue systems with large state spaces. In addition, I propose to compare the hierarchical methods against a function approximation method, which is another alternative for addressing optimization in large state spaces. The first stage will be used as a valuable resource for learning dialogue strategies automatically, and the second stage will be a valuable resource for learning dialogue strategies for large-scale spoken dialogue systems.

Finally, and for both stages, I propose to perform experiments using the 2001 DARPA Communicator corpora, which is annotated with the DATE annotation scheme and consists of ~1.2K dialogues in the domain of travel information (multi-leg flight booking, hotel booking and rental car). If time and resources permit I will perform experiments using a real spoken dialogue system and real users within the same domain.

1.4 Research Questions

In this research I aim to provide answers to the following questions:

1. How can a small corpus of dialogue data be expanded with more varied simulated conversations?

There are two challenges in this question: a) how to predict system behaviour, and b) how to predict user behaviour. Due to the fact that probabilistic models will be used for this purpose, incoherent dialogues may be generated. Therefore, exploitation of knowledge from different variables must be taken into account. Preliminary experiments suggest promising results, and answers to this question will be driven by the following assumption: the more knowledge they have, the more accurate they are. An important problem to face in this question is data sparsity due to the small amount of data used to train the models.

2. How can the realism of simulated human-computer dialogues be evaluated?

This question is difficult due to the fact that it is not known if simulated dialogues may occur in real environments. Nevertheless, the answers to this question will be
driven by the following assumption: the more similar the simulated dialogues are to the real ones, the more realistic they are. Another possible direction is through evaluating the utility of simulated dialogues under the following assumption: the more useful (for optimization or evaluation) they are, the more accurate they are.

3. How can optimal dialogue strategies be learnt for spoken dialogue systems with large state spaces?

There are two research fields to address for this question: a) hierarchical reinforcement learning, and b) reinforcement learning with function approximation. There are several methods for each field, and potential methods will be selected in order to perform a comparative study, which may reveal potential applications of specific methods in spoken dialogue systems. Experiments in this topic must be designed carefully in order to evaluate performance, computational cost and portability to other domains.

1.5 Contributions

This research intends to advance the current knowledge in spoken dialogue systems according to the following expected contributions:

1. A method to generate human-computer simulated dialogues at the intention level.
2. A metric to evaluate the realism of human-computer simulated dialogues.
3. A comparative study of reinforcement learning methods to learn optimal dialogue strategies in large state spaces.

Chapter 2

Previous Work

This chapter summarizes previous work that supports the proposal described in this document. The topics investigated and related to this research are: spoken dialogue management, human-computer dialogue simulation, reinforcement learning for dialogue management and evaluation of spoken dialogue systems. A brief description of each topic is provided, with references to relevant related work. Finally, at the end of each section a list of research gaps is provided.
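To make contribution 1 above (intention-level dialogue simulation) concrete, the sketch below samples dialogues as sequences of dialogue acts from a simple Markov chain. This is only a toy stand-in for the richer hidden-Markov-model approach proposed in Chapter 1: the act labels and transition probabilities are invented for illustration, not taken from any corpus.

```python
import random

# Hypothetical dialogue acts and transition probabilities (illustrative only).
TRANSITIONS = {
    "START":            [("SYS:request_info", 1.0)],
    "SYS:request_info": [("USR:provide_info", 0.8), ("USR:ask_help", 0.2)],
    "USR:ask_help":     [("SYS:offer_help", 1.0)],
    "SYS:offer_help":   [("USR:provide_info", 1.0)],
    "USR:provide_info": [("SYS:confirm", 0.6), ("SYS:request_info", 0.4)],
    "SYS:confirm":      [("USR:accept", 0.7), ("USR:reject", 0.3)],
    "USR:reject":       [("SYS:request_info", 1.0)],
    "USR:accept":       [("END", 1.0)],
}

def simulate_dialogue(rng, max_turns=30):
    """Sample one dialogue as a sequence of dialogue acts (intentions)."""
    acts, state = [], "START"
    for _ in range(max_turns):
        choices = TRANSITIONS[state]
        nxt = rng.choices([a for a, _ in choices],
                          [p for _, p in choices])[0]
        if nxt == "END":
            break
        acts.append(nxt)
        state = nxt
    return acts

# Expand a (here nonexistent) small corpus with simulated conversations.
rng = random.Random(0)
corpus = [simulate_dialogue(rng) for _ in range(100)]
print(len(corpus), corpus[0][:4])
```

A real HMM would additionally distinguish hidden dialogue states from observed acts and estimate its parameters from data; the point here is only the mechanics of sampling varied dialogues at the intention level.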
2.1 Spoken Dialogue Management

The main task of a spoken dialogue manager is to control the flow of the conversation between the user and the system. More specifically, a dialogue manager in task-oriented applications must gather information from the user (e.g., “Where do you want to go?”), possibly clarifying information explicitly (e.g., “Did you say London?”) or implicitly (e.g., “A flight to London. For what date?”), and resolve ambiguities that arise due to recognition errors (e.g., “Did you say Boston or London?”) or incomplete specifications (e.g., “On what day would you like to travel?”). In addition, the dialogue manager must guide the user by suggesting subsequent sub-goals (e.g., “Would you like me to summarize your trip?”), offer assistance upon request (e.g., “Try asking for flights between two major cities.”) or clarification, provide alternatives when the information is not available (e.g., “I couldn’t find any flights on United. I have 2 Alaska Airlines flights...”), provide additional constraints (e.g., “I found ten flights, do you have a preferred airline?”), and control the degree of initiative, such as system-initiative (“What city are you leaving from?”) or mixed-initiative (e.g., “How may I help you?”). The dialogue manager can also influence other system components in order to make dynamic adjustments to the system, such as the vocabulary, language models or grammars. In general, the goal of a spoken dialogue manager is to take an active role in directing the conversation towards a successful, effective and natural experience. However, there is a trade-off between increasing user flexibility and increasing system understanding accuracy (Zue & Glass, 2000).

There are several architectures for designing spoken dialogue managers, which can be broadly classified as follows:

• State-Based: In this architecture the dialogue structure is represented in the form of a network, where every node represents a question, and the transitions between nodes represent all the possible dialogues (McTear, 1998).

• Frame-Based: This architecture is
based on frames (forms) that have to be filled by the user, where each frame contains slots that guide the user through the dialogue. Here the user is free to take the initiative in the dialogue (Goddeau et al., 1996) (Chu-Carroll, 1999) (Pieraccini et al., 2001).

• Agenda-Based: This architecture is based on the frame-based one, but builds a more complex data structure (using dynamic trees) for conversations in more complex domains (Rudnicky & Xu, 1999) (Wei & Rudnicky, 2000) (Bohus & Rudnicky, 2003).

• Agent-Based: (Allen et al., 2001a) (Allen et al., 2001b) propose an architecture where the components of a dialogue system are divided into three areas of functionality: interpretation, generation and behaviour. Each area consists of a general control module, and the modules communicate with each other, sharing information and messages. The agent responsible for controlling the flow of the conversation is mainly the behavioural agent. Another approach using collaborative agents is COLLAGEN, which mimics the relationship between two humans (agents) collaborating on a task involving a shared artifact. The collaborative manager supports mixed-initiative by interpreting dialogue acts of the tasks and goals of the agents (Rich & Sidner, 1998).

• Information State Approach: This architecture is based on the notion of information state, which represents information to distinguish dialogues, to represent previous actions and to motivate future actions (Larsson & Traum, 2000).

An important dialogue data collection from several different systems using different features is the DARPA Communicator corpora (available through LDC) (Walker et al., 2002).
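The frame-based architecture described above can be sketched in a few lines: the system prompts for whichever slots remain empty, while the user may fill any subset of slots in any order (mixed initiative). The slot names, prompts, and parse format are invented for illustration; they do not reproduce any particular system.

```python
class FrameDialogueManager:
    """Toy slot-filling dialogue manager (illustrative slot names)."""
    SLOTS = ["departure_city", "arrival_city", "date"]
    PROMPTS = {
        "departure_city": "What city are you leaving from?",
        "arrival_city": "Where do you want to go?",
        "date": "On what day would you like to travel?",
    }

    def __init__(self):
        self.frame = {slot: None for slot in self.SLOTS}

    def update(self, parsed):
        """Fill slots from the understanding component's parse (a dict)."""
        for slot, value in parsed.items():
            if slot in self.frame:
                self.frame[slot] = value

    def next_action(self):
        """Prompt for the first unfilled slot, or query once complete."""
        for slot in self.SLOTS:
            if self.frame[slot] is None:
                return self.PROMPTS[slot]
        return ("Querying flights: {departure_city} to {arrival_city} "
                "on {date}").format(**self.frame)

dm = FrameDialogueManager()
print(dm.next_action())  # asks for the departure city
# The user takes the initiative and fills two other slots first:
dm.update({"arrival_city": "london", "date": "october third"})
print(dm.next_action())  # still asks for the departure city
dm.update({"departure_city": "boston"})
print(dm.next_action())  # frame complete: issue the database query
```

The agenda-based architecture generalizes this by replacing the flat slot list with a dynamic tree of such frames.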
This data was collected with the aim of supporting rapid development of spoken dialogue systems with advanced conversational capabilities. This valuable resource for research collected dialogues from 8 different systems: AT&T (Levin et al., 2000b), BBN (Stallard, 2000), CMU (Rudnicky et al., 1999), COL (Pellom et al., 1999) (Pellom et al., 2001), IBM (Erdoğan, 2001), LUC (Potamianos et al., 2000), MIT (Seneff & Polifroni, 2000) and SRI. Most of the systems adopt the frame-based architecture with some variations in the dialogue strategies; only the CMU Communicator adopted the agenda-based architecture.

Figure 2.1 illustrates a dialogue from the MIT Communicator system. We can observe that manual design of this kind of dialogue is a labour-intensive task due to the many possible strategies that must be taken into account. In addition, manual design is prone to errors: for instance, in the second leg the system implicitly confirmed the user’s departure from “Zurich” (line 31), but in line 43 the ASR recognizes a new departure city, “Newark”. Here the dialogue manager assumed that the user changed his mind, which may be a reasonable action. In line 44 the system explicitly confirms this new assumption, and the recognizer provides a wrong sequence of words due to a mixture of in-vocabulary and out-of-vocabulary words (line 47). In line 48, the system assumes a correct confirmation and the user is annoyed with this new departure city. At this point, a better strategy (considering dialogue history) might be to ignore this new assumption, possibly saving system turns, or simply minimizing user annoyance. Thus some design issues will inevitably escape system developers, which makes automatic design of dialogue strategies an attractive option.

Despite the previous research efforts in spoken dialogue management, more investigations are needed in order to build systems supporting successful, efficient and natural conversations.
The following are some potential research gaps in this area:

• To investigate methods for optimizing dialogue strategies exploiting system and users’ knowledge
• To balance the trade-off between user flexibility and system understanding
• To separate the domain-independent and domain-dependent system features
• To explore how dialogue managers can be abstracted and reused across domains

LINE TURN UTTERANCE
01. SYS: how may i help you?
02. USR: thank you i want to fly from kansas city missouri on october third in the morning to denver colorado
03.      [thank you i want to fly from kansas city missouri on october third in the morning to denver colorado]
04. SYS: okay, flights from kansas city to denver departing 5 a.m. and noon on Wednesday october third...
05. USR: what else is available please
06.      [what are available please]
07. SYS: i have a nonstop united flight leaving at 6:20 a.m. and arriving at 7:01 a.m. would that work?
08. USR: please
09.      [nowhere]
10. SYS: i have a nonstop united flight leaving at 6:20 a.m. and arriving at 7:01 a.m. would that work?
11. USR: no i want to fly out later than six twenty
12.      [no i want to fly later in the morning]
13. SYS: okay, flights from kansas city to denver departing after 6:20 a.m. on Wednesday october third...
14.      would that work? i also have a united flight leaving at 7:40 a.m.
15. USR: i will take the united leaving at seven forty
16.      [i will take the united leaving at seven forty]
17. SYS: united flight 1631 has been added to your itinerary. you depart... and arrive.... what date will you be returning on?
18. USR: i want to leave denver on october fourth in the afternoon and i want to fly from there to zurich
19.      (switzer-) switzerland
20.      [i want to leave denver on october four in the afternoon and i want to fly american zurich switzerland switzerland]
21. SYS: okay, flights from denver to kansas city departing between noon and 6:00 pm on thursday october
22.      fourth. one moment please. i have flights on...
23.
USR: no
24.      [no]
25. SYS: i have a nonstop united flight leaving at 12:38 p.m. and arriving at 3:13 pm? would that work?
26. USR: no i am flying from denver to zurich switzerland
27.      [no i am flying from denver to zurich switzerland]
28. SYS: okay from denver to zurich. what date will you be travelling?
29. USR: october four
30.      [october four]
31. SYS: okay, from denver to zurich on thursday october fourth. can you provide the approximate departure
32.      time or airline?
33. USR: i'd like to leave in mid afternoon
34.      [i would like to leave in the afternoon]
35. SYS: okay, flights from denver to zurich departing between noon and 6:00 pm on thursday october fourth…
36. USR: the american flight please
37.      [the american flight please]
38. SYS: american flight 1078 connecting to american flight 170 in chicago o'hare ... what date will you be returning on?
39. USR: i'm going to return to kansas on october fifteen
40.      [sunday returning in atlanta at four eighteen]
41. SYS: okay, flights from zurich to kansas city departing at 4:18 pm on sunday october seventh… would that work?
42. USR: (-o) i want to leave zurich in the late morning please
43.      [i want to leave newark in the late morning please]
44. SYS: i heard you say you want to depart from newark. is that correct? if you need help, you can say,
45.      for example, what cities do you know in oregon.
46. USR: i i don't want to go to oregon
47.      [i will what about oregon]
48. SYS: okay, flights from newark to kansas city departing between 10:00 am and noon... would that work?
49. USR: no i am not in newark i'm flying from zurich z. u. r. i c. h. switzerland remember you booked me to go to zurich
50.      [no i am not in newark on friday from new york and you are i see switzerland remember you booked
51.      me to voyager]
52. SYS: sorry, i'm not sure what you said.
53. USR: i want to leave zurich switzerland on on october fifteen in the late morning and fly back to kansas city
54.
     [i want to leave zurich switzerland on october fifteen in the late morning and fly back to kansas city]
55. SYS: okay, flights from zurich to kansas city departing between 10:00 am and noon on monday october fifteenth…
56. USR: ...

Figure 2.1: Sample dialogue from the MIT Communicator system illustrating a three-leg flight booking: leg 1 from lines 1 to 16, leg 2 from lines 17 to 37, and leg 3 from lines 38 to 56. The tags SYS and USR correspond to system turns and user turns. The word sequences in square brackets are the speech recognition results.

• To compare the architectures in domains of different complexity
• To extend the architectures for more complex tasks

2.2 Human-Computer Dialogue Simulation

The task of human-computer dialogue simulation consists of generating artificial conversations between a spoken dialogue system and a user. The aim is to automate the optimization and evaluation of spoken dialogue systems. Dialogue simulation may be performed at different levels of communication. The speech and word levels are useful for ASR systems in order to train acoustic and language models, and the intention level is useful for dialogue managers in order to train dialogue strategies. The following factors motivate the use of dialogue simulation for dialogue management:

• Training optimal dialogue strategies requires many dialogues to derive an optimal policy, and learning from real conversations may be impractical since it is expensive, labour-intensive and time-consuming. An alternative is to use simulated dialogues.

• Simulated dialogues can be used to evaluate spoken dialogue systems at early stages of development, and potentially to discover errors, which may help to reduce expensive and lengthy trials with human subjects.

• When a dialogue manager is updated, the previous optimization is no longer valid and another optimization is needed. At this point, simulated dialogues may help to speed up the development and deployment of optimized spoken dialogue systems.

Several research efforts
have been undertaken in this area, and the following dimensions are used to summarize such investigations:

1. Approach: Whilst some approaches are rule-based (Chung, 2004) (Lin & Lee, 2001) (López-Cózar et al., 2003), others are corpus-based (Eckert et al., 1997) (Scheffler & Young, 2000) (Scheffler & Young, 2001) (Georgila et al., 2005b). The advantage of the corpus-based methods lies in minimizing the portability problem (lack of expertise and high development costs) (Sutton et al., 1996). Rule-based methods tend to be ad hoc for the task and domain.

2. Communication Level: Most of the investigations are intention-based, and some use the speech and word levels, depending on the purposes of the simulated dialogues. The investigations based on intentions have the purpose of optimizing dialogue strategies (Eckert et al., 1997) (Scheffler & Young, 2000) (Levin et al., 2000a) (Scheffler & Young, 2001) (Pietquin & Renals, 2002) (Georgila et al., 2005b). (López-Cózar et al., 2003) use the speech and word levels in order to evaluate different speech recognition front-ends and dialogue strategies. (Chung, 2004) uses the speech and word levels in order to train speech recognition and understanding components.

3. Evaluation: A few investigations attempt to evaluate the simulated dialogues. (Eckert et al., 1997) (Scheffler & Young, 2000) (Scheffler & Young, 2001) use the average number of turns. (Schatzmann et al., 2005a) use three dimensions: high-level features (dialogue and turn lengths), dialogue style (speech-act frequency; proportion of goal-directed actions, grounding, formalities, and unrecognized actions; proportion of information provided, reprovided, requested and rerequested), and dialogue efficiency (goal completion rates and times). Finally, (Georgila et al., 2005b) use perplexity and a performance function based on filled slots, confirmed slots, and number of actions performed.
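One of the evaluation ideas mentioned above, perplexity, can be sketched concretely: train an n-gram model over dialogue acts on real dialogues and score simulated dialogues by their perplexity under that model (lower perplexity suggests the simulated dialogues are closer to the real corpus). The bigram model, add-one smoothing, and the tiny toy corpora below are illustrative choices, not a reconstruction of any cited method.

```python
import math
from collections import Counter

def train_bigram(dialogues):
    """Count bigrams over dialogue-act sequences from 'real' dialogues."""
    bigrams, unigrams, vocab = Counter(), Counter(), set()
    for acts in dialogues:
        seq = ["<s>"] + acts
        vocab.update(seq)
        for prev, cur in zip(seq, seq[1:]):
            bigrams[(prev, cur)] += 1
            unigrams[prev] += 1
    return bigrams, unigrams, len(vocab)

def perplexity(model, dialogues):
    """Per-transition perplexity of dialogues under the bigram model."""
    bigrams, unigrams, v = model
    log_prob, n = 0.0, 0
    for acts in dialogues:
        seq = ["<s>"] + acts
        for prev, cur in zip(seq, seq[1:]):
            # add-one smoothing so unseen transitions get nonzero mass
            p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + v)
            log_prob += math.log2(p)
            n += 1
    return 2 ** (-log_prob / n)

# Toy act sequences standing in for real and simulated corpora.
real = [["greet", "request", "provide", "confirm", "accept"]] * 10
good_sim = [["greet", "request", "provide", "confirm", "accept"]] * 5
bad_sim = [["accept", "confirm", "greet", "provide", "request"]] * 5

model = train_bigram(real)
print(perplexity(model, good_sim))  # low: transitions match the real corpus
print(perplexity(model, bad_sim))   # higher: transitions unlike real dialogues
```

As the proposal notes, such a distributional score would likely need to be combined with other metrics (turn counts, dialogue style, efficiency) to capture realism more fully.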