Literature Review: English Literature Review Templates
A Review of Domestic and International Literature (in English)

A literature review is an essential component of academic research, providing a comprehensive overview of existing literature on a particular topic. In this section, we discuss the importance of a literature review, the process of conducting one, and the key considerations for writing one.

Importance of a Literature Review:
A literature review serves several important purposes in academic research. Firstly, it helps to establish the current state of knowledge on a given topic, identifying key concepts, theories, and findings from previous studies. This allows researchers to build upon existing knowledge and identify gaps or areas for further investigation. Additionally, a literature review demonstrates the researcher's familiarity with the relevant literature and provides a theoretical framework for their own study. It also helps to contextualize the research within the broader academic discourse and can be used to support the rationale and significance of the study.

Process of Conducting a Literature Review:
Conducting a literature review involves several key steps. Firstly, researchers must define the scope and focus of the review, identifying the key research questions or objectives. They then conduct a comprehensive search of relevant literature using academic databases, libraries, and other sources. Once the relevant literature has been identified, researchers critically evaluate and analyze the findings, identifying key themes, trends, and gaps in the existing literature. Finally, the findings of the literature review are synthesized and integrated into a coherent and well-structured narrative.

Key Considerations for Writing a Literature Review:
When writing a literature review, there are several key considerations to keep in mind. Firstly, it is important to maintain a critical and analytical approach, evaluating the strengths and limitations of the existing literature. Researchers should also strive to present the information in a clear, logical, and organized manner, highlighting the key findings and their implications. It is also important to properly cite and reference all sources used in the literature review, adhering to the appropriate academic citation style.

In conclusion, a literature review is a critical component of academic research. By conducting a thorough literature review, researchers can build upon existing knowledge, identify research gaps, and provide a theoretical framework for their own study. When writing one, it is important to maintain a critical and analytical approach, present the information in a clear and organized manner, and properly cite all sources.
Literature Review: English Literature Review Template

Text Recognition with Machine Learning based on Text Structure
Literature Review
Yifan Shi
Student ID: 27291944
Email: ys1n13@
MSc Artificial Intelligence
Faculty of Physical Sciences & Eng, University of Southampton

Abstract: The fast-developing Machine Learning algorithms recently introduced to the semantic area have brought a wide range of techniques for text recognition, classification, and processing. However, there is always a tension between accuracy and speed, as higher accuracy generally implies a more complicated system as well as a larger training database. To balance speed and accuracy, many clever designs are used in text processing. In this literature review, these efforts are introduced in three layers: Natural-Language Processing, Text Classification, and the IBM Watson system.

Keywords: Machine Learning, Natural-Language Processing, Text Classification, IBM Watson

I. INTRODUCTION
The growing popularity of the Internet has brought an increasing number of users online, along with a vast amount of messages, blogs, articles, etc. to be dealt with. These texts, known as natural-language texts, may contain useful information but take a long time for humans to read, understand and process. Although popular search engine technology helps users find sources by keyword, many companies also need semantic techniques to make their working environments more user-friendly. In this literature review, I will introduce several important semantic techniques, starting with the most basic, Natural-Language Processing, which concentrates on the meaning of words and sentences, followed by Text Classification, which focuses on paragraphs and articles. Then I will introduce a landmark system named IBM Watson, which has DeepQA as its working pipeline. Finally, a conclusion gives some comments on these techniques.

II. NATURAL LANGUAGE PROCESSING
In order to deal with human natural language, it is necessary to transform unstructured text into well-structured tables of explicit semantics (Ferrucci, 2012). According to Liddy (2001), Natural-Language Processing (NLP) is a series of computational techniques used to analyze and represent naturally occurring text in order to achieve certain tasks and applications. Collobert and Weston (2008) categorize NLP tasks into six types: Part-Of-Speech Tagging, Chunking, Named Entity Recognition, Semantic Role Labeling, Language Models, and Semantically Related Words. In addition, they implemented Multitask Learning with Deep Neural Networks to build a successful unified architecture, trained by backpropagation, which avoids the traditional reliance on a large number of empirical hand-designed features (Collobert et al., 2011).

III. TEXT CLASSIFICATION
One simple way to represent an article for a learning algorithm is to use the number of times that each distinct word appears in the document (Joachims, 2005). However, because of the large number of possible words used in articles, this creates a very high-dimensional feature space. Joachims (1999) suggests Transductive Support Vector Machines for classification because they learn effectively even in a high-dimensional feature space. Rather than using a non-linear Support Vector Machine (SVM), Dumais et al. (1998) compared a linear SVM with four other learning algorithms (Find Similar, Decision Trees, Naive Bayes, and Bayes Nets); their results also support SVMs for text classification because of their high accuracy, fast speed and simple model. Sebastiani (2002) also recommends Neural Networks as a potential choice for text classification, since their accuracy is only slightly lower than that of SVMs.

Hatzivassiloglou et al. (1999) introduce the cross-document comparison of small pieces of text using linguistic features such as noun phrases and synonyms. Two paragraphs are defined as similar when they describe the same action conducted on the same object by the same actor.
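The word-count representation described at the start of this section, paired with the Naive Bayes classifier that appears in the comparison by Dumais et al., can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the method of any cited paper: the names `bag_of_words` and `NaiveBayesTextClassifier`, the whitespace tokenizer, the add-one smoothing, and the toy training sentences are all hypothetical choices made for the example.

```python
import math
from collections import Counter, defaultdict


def bag_of_words(text):
    """Represent a document by how often each distinct word occurs.

    Assumption: lowercasing plus whitespace splitting stands in for a real tokenizer.
    """
    return Counter(text.lower().split())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes over word counts with add-one smoothing (illustrative only)."""

    def fit(self, documents, labels):
        self.class_doc_counts = Counter(labels)   # label -> number of training documents
        self.word_counts = defaultdict(Counter)   # label -> word -> count
        self.vocabulary = set()
        for text, label in zip(documents, labels):
            counts = bag_of_words(text)
            self.word_counts[label].update(counts)
            self.vocabulary.update(counts)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        counts = bag_of_words(text)
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_doc_counts.items():
            total_words = sum(self.word_counts[label].values())
            # Log prior of the class plus log likelihood of each observed word.
            score = math.log(doc_count / self.total_docs)
            for word, n in counts.items():
                if word not in self.vocabulary:
                    continue  # skip words never seen during training
                p = (self.word_counts[label][word] + 1) / (total_words + vocab_size)
                score += n * math.log(p)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up training data (not taken from any cited study).
train_docs = [
    "the team won the football match",
    "the striker scored a late goal",
    "the bank raised interest rates",
    "stock markets fell as rates rose",
]
train_labels = ["sport", "sport", "finance", "finance"]

classifier = NaiveBayesTextClassifier().fit(train_docs, train_labels)
prediction = classifier.predict("the goal decided the match")
print(prediction)
```

Even in this toy setting every distinct word becomes its own feature, which is why the vocabulary, and with it the dimensionality of the feature space, grows so quickly on real corpora.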
Drawing features from nouns and verbs therefore reduces a paragraph to several primitive elements. In addition to matching primitive elements, restrictions such as ordering, distance and primitives (matching noun and verb pairs) are applied to exclude weakly related features.

Feature selection methods can effectively reduce the dimensionality of a dataset (Ikonomakis, 2005) while maintaining classification performance. To decide which words to keep, Soucy and Mineau (2003) introduce an evaluation function that measures how much information is gained by classifying on a single word. Another improvement, by Han et al. (2004), uses Principal Component Analysis (PCA) to reduce the dimensionality when transforming features. Nigam and McCallum (2000) combine Expectation-Maximization with a Naive Bayes classifier: the classifier is trained on a certain amount of labeled text followed by a large number of unlabeled documents, which enables automatic training without a huge amount of hand-labeled training data.

IV. IBM WATSON
The IBM Watson project has shown that a computer system for open-domain question answering (QA) can beat human champions at Jeopardy. As Ferrucci (2012) mentions, the structure of Watson is more complicated than any single agent, as it has hundreds of algorithms working together in the way that Minsky (1988) described in Society of Mind. Broadly, Watson consists of DeepQA, Natural Language Processing (NLP), Machine Learning (ML), and Semantic Web and Cloud Computing components (Gliozzo et al., 2013). The DeepQA system analyzes the question with different algorithms, producing different interpretations of the question and forming queries for each (Ferrucci, 2012). It provides all the possible answers to the question together with the evidence and scores for each candidate, which yields a ranking of candidate answers by likelihood of correctness. Machine Learning algorithms are used to train the weights in its evaluation and analysis algorithms (Gliozzo et al., 2013).

The clue that Watson uses in searching is the lexical answer type (LAT), which tells Watson what the question is asking about and what kind of thing it needs to look for. Before searching, it generates prior knowledge of a type label, known as 'direction', for each candidate answer and searches for evidence for and against this 'type direction' (Ferrucci, 2012). DeepQA also relies heavily on grammar-based and syntactic analysis techniques, for example rule-based relation extraction to find possible relations between words. In addition, the ability to break a question down into sub-questions by logic also improves Watson's performance (Ferrucci, 2012), as it enables Watson to find results for each smaller question and combine them. Correspondingly, it can also generate the score for the original question from the evidence for the sub-questions.

To simulate human knowledge, Watson also uses a self-contained database. Watson needs to perform automatic text analysis and knowledge extraction to update this database, because of the enormous amount of work involved and the need to ensure the accuracy of the input knowledge. However, the self-contained database is costly: only a few institutions can afford the hardware expense, which makes applying Watson expensive. Another limitation is that the structured resource is relatively narrow compared with the vast amount of unstructured natural-language text. One possible improvement is to use online data and an ordinary online search engine to find possibly related articles and analyze them on PC clients. Although this trades accuracy for cost, because online data may be unreliable or incorrect, it makes the technique more feasible in general.

V. CONCLUSION
As can be seen from the above, most techniques used in text analysis are based on 'word feature' extraction, word types, and relations, all of which are semantic techniques, while Watson also uses search techniques to find the exact answer as it appears in text. However, machines lack the ability to summarize the main idea of a paragraph, which is more closely related to abstract logical thinking. Human reading, by contrast, attends not only to vocabulary and meaning but also to the structure of the paragraph and the position of sentences; for example, the first sentence in a paragraph usually guides the following content, which helps indicate the significance of sentences and words. Therefore, using machine learning to analyze the structure of an article, combined with the meaning of every sentence, might yield the ability to summarize the main idea, which could be used in text scanning and classification.

REFERENCES
[1] S. Dumais, J. Platt, D. Heckerman, and M. Sahami, Inductive Learning Algorithms and Representations for Text Categorization, Proceedings of the Seventh International Conference on Information and Knowledge Management, pp. 148-155, 1998.
[2] T. Joachims, Text Categorization with Support Vector Machines: Learning with Many Relevant Features, ECML-98 Proceedings of the 10th European Conference on Machine Learning, pp. 137-142, 1998.
[3] T. Joachims, Transductive Inference for Text Classification using Support Vector Machines, International Conference on Machine Learning (ICML), pp. 200-209, 1999.
[4] V. Hatzivassiloglou, J. Klavans, and E. Eskin, Detecting Text Similarity Over Short Passages: Exploring Linguistic Feature Combinations Via Machine Learning, Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, 2000.
[5] K. Nigam, Text Classification from Labeled and Unlabeled Documents using EM, Machine Learning, Volume 39, pp. 103-134, 2000.
[6] E. Liddy, Natural Language Processing, in Encyclopedia of Library and Information Science, 2nd Ed., NY: Marcel Dekker, Inc., 2001.
[7] S. Tong and D. Koller, Support Vector Machine Active Learning with Applications to Text Classification, Journal of Machine Learning Research, pp. 45-66, 2001.
[8] F. Sebastiani, Machine Learning in Automated Text Categorization, ACM Computing Surveys (CSUR), Volume 34, Issue 1, pp. 1-47, 2002.
[9] P. Soucy and G. Mineau, Feature Selection Strategies for Text Categorization, AI 2003, LNAI 2671, pp. 505-509, 2003.
[10] X. Han, G. Zu, W. Ohyama, T. Wakabayashi, and F. Kimura, Accuracy Improvement of Automatic Text Classification Based on Feature Transformation and Multi-classifier Combination, LNCS, Volume 3309, pp. 463-468, Jan 2004.
[11] M. Ikonomakis, S. Kotsiantis, and V. Tampakas, Text Classification using Machine Learning Techniques, WSEAS Transactions on Computers, Volume 4, Issue 8, pp. 966-974, 2005.
[12] R. Collobert and J. Weston, A unified architecture for natural language processing: deep neural networks with multitask learning, ICML '08 Proceedings of the 25th International Conference on Machine Learning, ACM, New York, USA, pp. 160-167, 2008.
[13] R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu, and P. Kuksa, Natural Language Processing (Almost) from Scratch, Journal of Machine Learning Research, Volume 12, pp. 2493-2537, 2011.
[14] A. Gliozzo, O. Biran, S. Patwardhan, and K. McKeown, Semantic Technologies in IBM Watson, The 10th International Semantic Web Conference, Bonn, Germany, 2011.
[15] D. Ferrucci, Introduction to "This is Watson", IBM Journal of Research and Development, Volume 56, Number 3/4, pp. 1:1-1:15, May/July 2012.
[16] G. Tesauro, D. Gondek, J. Lenchner, J. Fan, and J. Prager, Simulation, learning, and optimization techniques in Watson's game strategies, IBM Journal of Research and Development, Volume 56, Number 3/4, pp. 16:1-16:11, 2012.
A Standard Sample English Literature Review

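How to Write a Literature Review?
I. The Definition of Literature Review
The literature review is one of the important genres of research writing.
It rests on the author's collation, summarization, analysis and comparison of the available literature, and it synthesizes, summarizes and comments on the historical background of a given topic, earlier work, the current state of research, the points of controversy, and the prospects for development.
By reading a literature review, researchers can obtain, in relatively little time, a good deal of systematic and specific information about a topic and learn its current state of research, its open problems and its future directions.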
II. The Purposes of a Literature Review and Its Components
A. The Purposes
On the one hand, it helps you broaden your view and perspective of the topic for your graduation thesis. On the other hand, it helps you narrow down the topic and arrive at a focused research question.
B. Its Components
There are six parts in a complete literature review:
title and author
abstract and key words
introduction
review
conclusion
references

III. Classification of Source Materials
How can we locate the materials relevant to our topics better and faster? Basically, all source materials can be classified into four major kinds of sources.
A. Background sources
Basic information, which can usually be found in dictionaries and encyclopedias compiled by major scholars or founders of the field. Three very good and commonly recommended encyclopedias are the "ABC" encyclopedias, namely Encyclopedia Americana, Encyclopedia Britannica, and Collier's Encyclopedia. There are also more specialized reference works, such as The Encyclopedia of Language and Linguistics for linguistics and TEFL studies. Moreover, you may also find encyclopedias on the web.
B. Primary sources
Those providing direct evidence, such as works of scholars of the field, biographies or autobiographies, memoirs, speeches, lectures, diaries, collections of letters, interviews, case studies, approaches, etc. Primary sources come in various shapes and sizes, and often you have to do a little research about the source to make sure you have correctly identified it. When a first search yields too few results, try searching by a broader topic; when a search yields too many results, refine it by narrowing your search terms.
C. Secondary sources
Those providing indirect evidence, such as research articles or papers, book reviews, essays, journal articles by experts in a given field, studies on authors or writers and their works, etc. Secondary sources will inform most of your writing in college. You will often be asked to research your topic using primary sources, but secondary sources will tell you which primary sources you should use and will help you interpret them. To use them well, however, you need to think critically about them. There are two parts of a source that you need to analyze: the text itself and the argument within the text.
D. Web sources
The sources or information from websites. The Web serves as an excellent resource for your materials. However, you need to select and evaluate Web sources with special care, because Web sources very often lack quality control. You may start with search engines, such as Google, Yahoo, Ask, Excite, etc. It's a good idea to try more than one search engine, since each locates sources in its own way. When using websites for information, be sure to check the authorship and sponsorship. If both are unclear, be critical when you use the information. The currency of website information should also be taken into account. Don't use information that is too out of date for your purpose.

IV. Major Strategies of Selecting Materials for a Literature Review
A. Choosing primary sources rather than secondary sources
If you have two sources, one of them summarizing or explaining a work and the other the work itself, choose the work itself. Never attempt to write a paper on a topic without reading the original source.
B. Choosing sources that give a variety of viewpoints on your thesis
Remember that good argument essays take counterarguments into account. Do not reject a source because it makes an argument against your thesis.
C. Choosing sources that cover the topic in depth
Probably most books on Communicative Language Teaching mention William Littlewood, but if this is your topic, you will find that few sources cover the topic in depth. Choose those.
D. Choosing sources written by acknowledged experts
If you have a choice between an article on Task-based Teaching written by a freelance journalist and one written by a recognized expert like David Nunan, choose the article by the expert.
E. Choosing the most current sources
If your topic involves a current issue, a social problem or a development in a scientific field, it is essential to find the latest possible information. If all the books on these topics are rather old, you probably need to look for information in periodicals.

V. Writing a Literature Review
A. When you review related literature, the major review focuses should be:
1. The prevailing and current theories which underlie the research problem.
2. The main controversies about the issue, and about the problem.
3. The major findings in the area, by whom and when.
4. The studies which can be considered the better ones, and why.
5. Description of the types of research studies which can provide the basis for the current theories and controversies.
6. Criticism of the work in the area.
B. When you write a literature review, the two principles to follow are:
1. Review the sources that are most relevant to your thesis.
2. Describe or write your review as clearly and objectively as you can.
C. Some tips for writing the review:
1. Define key terms or concepts clearly and in relation to your topic.
2. Discuss the least-related references to your question first and the most-related references last.
3. Conclude your review with a brief summary.
4. Start writing your review early.

VI. Detailed Tips and Points to Note for the Main Parts of a Literature Review
Tips on the main parts:
Introduction
The introduction is the opening part of the body of a literature review. It covers two things: first, it raises the question; second, it introduces the scope and content of the review.
A Sample of Literature Review (English Literature Review Template)

A Sample of Literature Review
On Advertising English

Among the many scholars who examine advertising language, G. N. Leech deserves prime attention for his thorough research on advertising in the field of linguistics in his book English in Advertising (66). Vestergaard and Schroder, however, probe into advertising language not only from the perspective of linguistics, but also from those of psychology and ideology. The Language of Advertising (85), written by them, is a revealing study of the strategies of persuasion advertisers use and of the crucial underlying assumptions advertising makes. Focusing on magazine and newspaper advertising, the authors illustrate the range of linguistic and visual techniques advertisers use to achieve emphasis and special effects. Apart from that, Hafer and White contribute to the research on advertising language with a book titled Advertising English, which is conceived as a bridge between rules and suggestions for writing advertisements and advertisements that have been run or aired. And in the Secrets of Successful Copywriting (86), Patrick Quinn tells the reader everything he needs to know, from the drafting of press ads to the scripting of television commercials, from radio to audio-visual, and the concepts, the treatments as well as the wrinkles.

Compared with linguists who study advertising from the angle of language alone, more scholars carry out their research on advertising in a comprehensive way. Their study covers the history of advertising, the work of advertising agencies, the procedure of advertising, etc., with advertising language addressed to a greater or lesser extent. For instance, Essentials of Advertising (80), written by Louis Kaufman, examines in detail every stage of the business of advertising, from the initial concept to execution. It moves from the pragmatic considerations that underlie the finished ad (marketing intelligence and research, and the budget), through media, to the final campaign. Nonetheless, there is a chapter particularly devoted to advertising language. Other examples include Advertising (84) by William M. Weilbacher and Contemporary Advertising (6) by Courtland L. Bove and William F. Arens. The former covers all the advertising issues suggested by the various definitions of advertising that have been presented, and it also tells how advertising is created, produced and used. The latter is a more complete study of advertising, and the language aspects are inevitably involved in it.

In China, there are also some experts who study the language of English advertising or the advertising business in general. Books such as Advertising English by Cui Gang (93) and Advertising English and Examples by Sun Xiaoli (95) are analyses of the language of advertising. The authors illustrate the general characteristics of advertising English, mainly in terms of words, sentences and rhetorical devices, and examine features of different kinds of English advertisements. The books serve as a guide to help students and practitioners attain proficiency in writing advertising copy for different media. In his book Pragmatics in English Learning (7), Professor He Ziran discusses advertising language from the angle of pragmatics and sociolinguistics.
A General-Purpose Sample English Literature Review

Introduction
A literature review is an important component of academic research. It helps researchers to identify existing gaps in knowledge, evaluate the current state of research in a particular area, and generate ideas for future research. A good literature review should be comprehensive, up-to-date and well-organized. This article provides a general guideline on how to write a literature review, including the structure, format and content.

Structure
A literature review should have an introduction, main body and conclusion. The introduction should provide the background information on the topic, the research question, and the purpose of the review. The main body should be divided into sections based on themes or topics. Each section should summarize the key findings from the literature and explain how they relate to the research question. Finally, the conclusion should summarize the main findings from the review, identify the gaps in the existing literature, and suggest possible avenues for future research.

Format
A literature review can be written in different formats depending on the discipline and the research topic. In general, there are two common formats: the narrative review and the systematic review. The narrative review is a descriptive summary of the literature, whereas the systematic review is a more rigorous evaluation of the literature using a predefined search strategy and inclusion/exclusion criteria.

Content
The content of a literature review should be focused on the research question and the themes identified in the main body. The literature reviewed should be relevant, reliable, and recent. The sources can be primary or secondary, depending on the research question and the availability of the literature. They can take different forms, such as articles, books, reports, conference proceedings, and online databases.

Tips
Here are some tips on how to write a good literature review:
- Start early: Begin the literature review as early as possible to allow sufficient time for reading, writing, and revising.
- Define the research question: Clearly define the research question to guide the literature search and the selection of the literature.
- Use appropriate keywords: Use appropriate keywords and search terms to identify the relevant literature.
- Keep records: Keep a record of the literature searched, read and cited to avoid duplication and facilitate referencing.
- Analyze and synthesize: Analyze the literature critically and synthesize the findings into a coherent and organized structure.
- Avoid plagiarism: Acknowledge the sources of the literature accurately and avoid plagiarism by paraphrasing and referencing properly.
- Be critical: Be critical of the literature reviewed and identify the strengths, weaknesses, and limitations of the research in the field.

Conclusion
In summary, a literature review is an essential component of academic research, and it requires careful planning, organizing, and writing. A good literature review should provide a comprehensive and critical evaluation of the existing literature and identify the gaps and limitations in the research field. By following the guidelines and tips provided in this article, researchers can write a well-structured, informative and engaging literature review.
English Literature Review Template

The steps for writing an English literature review from this template are as follows:
1. Title: Use a clear and concise title that reflects the focus of your literature review.
2. Abstract: Provide a brief overview of your literature review, including the research question, methods, key findings, and conclusions.
3. Introduction: Explain the background and importance of your topic, introduce the research question, and outline the aims and objectives of your literature review.
4. Literature Search Methodology: Describe the search strategy you used to identify relevant studies, including databases, keywords, and inclusion/exclusion criteria.
5. Summary of Literature Reviewed: Highlight the key findings and themes from the studies you have included in your literature review, paying attention to their relevance to your research question.
6. Analysis and Discussion: Analyze and compare the findings from the selected studies, exploring patterns, trends, and gaps in the literature. Discuss how these findings contribute to our understanding of the topic and identify any limitations or biases in the research.
7. Conclusion: Summarize the main points of your literature review and highlight its significance. Draw conclusions about the state of research on your topic and identify any gaps or future research directions.
8. References: Cite all the studies included in your literature review using the appropriate referencing style (e.g., APA, MLA).
This is a basic English literature review template; the specifics may need to be adjusted to the research field and topic.
An All-Purpose English Literature Review Template and Sample

Introduction.
A literature review is a comprehensive survey of the existing research on a particular topic. It provides a critical analysis of the literature, identifying the key themes, gaps, and areas for future research. A well-written literature review can help readers quickly and easily understand the current state of knowledge on a topic.

Steps to Writing a Literature Review.
1. Define your topic. The first step is to define the scope of your literature review. This includes identifying the key concepts, variables, and research questions that you will be addressing.
2. Search for relevant literature. Once you have defined your topic, you need to search for relevant literature. This can be done through a variety of sources, including academic databases, Google Scholar, and library catalogs.
3. Evaluate the literature. Once you have found a body of literature, you need to evaluate it to determine its relevance, quality, and credibility. This involves reading the abstracts and full text of the articles and assessing their strengths and weaknesses.
4. Organize your review. Once you have evaluated the literature, you need to organize it into a logical structure. This may involve grouping the articles by theme, methodology, or research question.
5. Write your review. The final step is to write your literature review. This should include a clear introduction, a body that discusses the key findings of the literature, and a conclusion that summarizes your findings and identifies areas for future research.

Tips for Writing a Literature Review.
Be comprehensive. Include all of the relevant literature on your topic, even if it is not supportive of your hypothesis.
Be critical. Evaluate the strengths and weaknesses of the literature, and identify any gaps in the research.
Be clear and concise. Write in a clear and concise style, and avoid using jargon or technical language.
Proofread carefully. Make sure to proofread your literature review carefully before submitting it.
Sample Literature Review for English Majors

An English major literature review is a critical analysis of the existing literature on a specific topic within the field of English studies. It is an essential part of academic research and provides a comprehensive overview of the current state of knowledge in a particular area.

The purpose of an English major literature review is to identify gaps in the existing literature and to highlight areas for future research. It also helps to provide a theoretical framework for the research and to demonstrate the significance of the study within the broader field of English studies.

In conducting an English major literature review, it is important to use a systematic approach to search for and select relevant literature. This may involve searching electronic databases, library catalogs, and other sources of scholarly information. It is also important to critically evaluate the quality and relevance of the literature that is identified, and to synthesize the findings in a clear and coherent manner.

There are several key components of an English major literature review: an introduction that provides an overview of the topic and its significance, a discussion of the existing literature, a critical analysis of the literature, and a conclusion that summarizes the key findings and identifies areas for future research.

In writing an English major literature review, it is important to use clear and concise language and to organize the information in a logical and coherent manner. It is also important to use proper citation and referencing, and to follow the conventions of academic writing.

In conclusion, an English major literature review is an essential part of academic research in the field of English studies. By using a systematic approach to search for and select relevant literature, and by organizing the information in a clear and coherent manner, it is possible to produce a high-quality literature review that makes a significant contribution to the field.
Text Recognition with Machine Learning based on Text Structure
Literature Review
Yifan Shi
Student ID: 27291944
Email: ys1n13@
MSc Artificial Intelligence
Faculty of Physical Sciences & Eng, University of Southampton

Abstract: The fast-developing Machine Learning algorithms introduced to the semantic area have brought a wide range of techniques for text recognition, classification, and processing. However, there is always a tension between accuracy and speed, as higher accuracy generally implies a more complicated system as well as a larger training database. To balance speed against accuracy, many ingenious designs are used in text processing. In this literature review, these efforts are introduced in three layers: Natural-Language Processing, Text Classification, and the IBM Watson system.

Keywords: Machine Learning, Natural-Language Processing, Text Classification, IBM Watson

I. INTRODUCTION
The growing popularity of the Internet has brought an increasing number of users online, with a vast amount of messages, blogs, articles, etc. to be dealt with. These texts, known as natural-language texts, may contain useful information but take a long time for a human to read, understand, and process. Although popular search engine technology helps users to find sources by keyword, semantic techniques are also needed by many companies to make their working environments more user-friendly.
In this literature review, I will introduce several important semantic techniques, starting with the most basic, Natural-Language Processing, which concentrates on the meaning of words and sentences, followed by Text Classification, which focuses on paragraphs and articles. Then I will introduce a landmark system named IBM Watson, which has DeepQA as its working pipeline. Finally, a conclusion comments on these techniques.

II. NATURAL LANGUAGE PROCESSING
In order to deal with human natural language, it is necessary to transform the unstructured text into well-structured tables of explicit semantics (Ferrucci, 2012). According to Liddy (2001), Natural-Language Processing (NLP) is a series of computational techniques used to analyze and represent naturally occurring text in order to achieve certain tasks and applications. Collobert and Weston (2008) categorized NLP tasks into six types: Part-Of-Speech Tagging, Chunking, Named Entity Recognition, Semantic Role Labeling, Language Models, and Semantically Related Words. In addition, they applied Multitask Learning with Deep Neural Networks to build a unified architecture that is trained by backpropagation and so avoids the large number of empirical hand-designed features traditionally required (Collobert et al., 2011).

III. TEXT CLASSIFICATION
One simple way to represent an article for a learning algorithm is to use the number of times that each distinct word appears in the document (Joachims, 2005). However, because of the large number of possible words used in articles, this creates a very high-dimensional feature space. Joachims (1999) suggests Transductive Support Vector Machines for classification because they learn effectively even in a high-dimensional feature space. Rather than using a non-linear Support Vector Machine (SVM), Dumais et al. (1998) compared a linear SVM with four other learning algorithms, namely Find Similar, Decision Trees, Naive Bayes, and Bayes Nets; their results also support the SVM for text classification because of its high accuracy, fast speed, and simple model. Sebastiani (2002) also recommends Neural Networks as a possible choice for text classification, since their accuracy is only slightly lower than that of the SVM.
The cross-document comparison of small pieces of text, using linguistic features such as noun phrases and synonyms, is introduced by Hatzivassiloglou et al. (1999). Two paragraphs are defined as similar when they describe the same action conducted on the same object by the same actor.
Therefore, drawing features from nouns and verbs generally reduces a paragraph to several primitive elements. In addition to matching primitive elements, restrictions on ordering, distance, and primitives (matching noun and verb pairs) are applied to exclude weakly related features.
Feature selection methods can effectively reduce the dimensionality of the dataset (Ikonomakis, 2005) while preserving classification performance. To decide which words to keep, Soucy and Mineau (2003) introduced an evaluation function that measures how much information is gained by classifying on a single word. Another improvement, by Han et al. (2004), uses Principal Component Analysis (PCA) to reduce the dimensionality when transforming features.
Nigam and McCallum (2000) combine Expectation-Maximization with a Naive Bayes classifier, training the classifier first on a limited number of labeled texts and then on a large number of unlabeled documents, which enables automatic training without a huge amount of hand-labeled training data.

IV. IBM WATSON
The IBM Watson project has shown that a computer system for open-domain question answering (QA) can beat human champions at Jeopardy. As Ferrucci (2012) mentioned, the structure of Watson is more complicated than any single agent, as it has hundreds of algorithms working together, in the way that Minsky (1988) described in Society of Mind. Broadly, Watson consists of DeepQA, Natural Language Processing (NLP), Machine Learning (ML), and Semantic Web and Cloud Computing (Gliozzo et al., 2013). The DeepQA system analyzes the question with different algorithms, giving different interpretations of the question and forming queries for each question (Ferrucci, 2012). It produces all the possible answers to the question together with the evidence and a score for each candidate, which yields a ranking of candidate answers by likelihood of correctness. Machine Learning algorithms are used to train the weights in its evaluation and analysis algorithms (Gliozzo et al., 2013).
The clue that Watson uses in searching is called the lexical answer type (LAT), which tells Watson what the question is asking about and what kind of thing it needs to look for. Before searching, it generates a prior type label, known as a 'direction', for each candidate answer and searches for evidence for and against this 'type direction' (Ferrucci, 2012). DeepQA also relies heavily on grammar-based and syntactic analysis techniques, for example rule-based relation extraction to find possible relations between words. In addition, the ability to break a question down logically into sub-questions also improved Watson's performance (Ferrucci, 2012): it enables Watson to find results for each smaller question and combine them. Correspondingly, it can also generate the score for the original question from the evidence for the sub-questions.
To simulate human knowledge, Watson also uses a self-contained database. However, this requirement leads to a high hardware cost. Watson also needs to perform automatic text analysis and knowledge extraction to update its database, because of the enormous amount of work involved and the need to ensure the accuracy of the input knowledge. The cost is high enough that only a few institutions can afford the hardware, which makes applications of Watson expensive. Another limitation is that the structured resources are relatively narrow compared with the vast body of unstructured natural-language text. One possible improvement is to use online data and an ordinary online search engine to find possibly related articles and analyze them with PC clients. Although this trades accuracy for cost, because online data may be false or incorrect, it makes the technique more practical in general.

V. CONCLUSION
As can be seen from the content above, most techniques used in text analysis are based on 'word feature' extraction, word types, and relations, which are all semantic techniques. Watson additionally uses search techniques to find the exact answer as it appears in text. However, machines lack the ability to summarize the main idea of a paragraph, which is more closely related to abstract logical thinking. Human reading, by contrast, attends not only to vocabulary and meaning but also to the structure of the paragraph and the position of sentences; for example, the first sentence of a paragraph usually guides the following content, which helps indicate the significance of sentences and words. Therefore, using machine learning to analyze the structure of an article, combined with the meaning of every sentence, might make it possible to summarize the main idea, which could be used in text scanning and classification.

REFERENCES
[1] S. Dumais, J. Platt, D. Heckerman, and M. Sahami, Inductive Learning Algorithms and Representations for Text Categorization, Proceedings of the Seventh International Conference on Information and Knowledge Management, pp. 148-155, 1998.
[2] T. Joachims, Text Categorization with Support Vector Machines: Learning with Many Relevant, ECML-98 Proceedings of the 10th European Conference on Machine Learning, pp. 137-142, 1998.
[3] T. Joachims, Transductive Inference for Text Classification using Support Vector Machines, International Conference on Machine Learning (ICML), pp. 200-209, 1999.
[4] V. Hatzivassiloglou, J. Klavans, and E. Eskin, Detecting Text Similarity Over Short Passages: Exploring Linguistic Feature Combinations Via Machine Learning, Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, 2000.
[5] K. Nigam, Text Classification from Labeled and Unlabeled Documents using EM, Machine Learning, Volume 39, pp. 103-134, 2000.
[6] E. Liddy, Natural Language Processing, In Encyclopedia of Library and Information Science, 2nd Ed. NY. Marcel Dekker, Inc, 2001.
[7] S. Tong and D. Koller, Support Vector Machine Active Learning with Applications to Text Classification, Journal of Machine Learning Research, pp. 45-66, 2001.
[8] F. Sebastiani, Machine Learning in Automated Text Categorization, ACM Computing Surveys (CSUR), Issue 1, Volume 34, pp. 1-47, 2002.
[9] P. Soucy and G. Mineau, Feature Selection Strategies for Text Categorization, AI 2003, LNAI 2671, pp. 505-509, 2003.
[10] X. Han, G. Zu, W. Ohyama, T. Wakabayashi, and F. Kimura, Accuracy Improvement of Automatic Text Classification Based on Feature Transformation and Multi-classifier Combination, LNCS, Volume 3309, pp. 463-468, Jan 2004.
[11] M. Ikonomakis, S. Kotsiantis, and V. Tampakas, Text Classification using Machine Learning Techniques, WSEAS Transactions on Computers, Issue 8, Volume 4, pp. 966-974, 2005.
[12] R. Collobert and J. Weston, A unified architecture for natural language processing: deep neural networks with multitask learning, ICML '08 Proceedings of the 25th International Conference on Machine Learning, ACM New York, USA, pp. 160-167, 2008.
[13] R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu, and P. Kuksa, Natural Language Processing (Almost) from Scratch, Journal of Machine Learning Research, Volume 12, pp. 2493-2537, 2011.
[14] A. Gliozzo, O. Biran, S. Patwardhan, and K. McKeown, Semantic Technologies in IBM Watson, The 10th International Semantic Web Conference, Bonn, Germany, 2011.
[15] D. Ferrucci, Introduction to “This is Watson”, IBM Journal of Research and Development, Volume 56, Number 3/4, pp. 1:1-1:15, May/July 2012.
[16] G. Tesauro, D. Gondek, J. Lenchner, J. Fan, and J. Prager, Simulation, learning, and optimization techniques in Watson's game strategies, IBM Journal of Research and Development, Volume 56, Number 3/4, pp. 16:1-16:11, 2012.
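
As a supplementary illustration of Section III, the word-count representation and a Naive Bayes classifier of the kind compared by Dumais et al. can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the method of any cited paper: the whitespace tokenizer, the add-one smoothing, all function names, and the toy corpus are hypothetical choices made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bag_of_words(text):
    """Map each distinct word in the text to the number of times it occurs."""
    return Counter(tokenize(text))


def train_naive_bayes(labeled_docs):
    """Collect per-class document counts, per-class word counts, and the vocabulary."""
    class_doc_counts = Counter()
    class_word_counts = {}
    vocabulary = set()
    for text, label in labeled_docs:
        counts = bag_of_words(text)
        class_doc_counts[label] += 1
        class_word_counts.setdefault(label, Counter()).update(counts)
        vocabulary.update(counts)
    return class_doc_counts, class_word_counts, vocabulary


def classify(text, model):
    """Return the class with the highest log-posterior, using add-one smoothing."""
    class_doc_counts, class_word_counts, vocabulary = model
    total_docs = sum(class_doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, doc_count in class_doc_counts.items():
        word_counts = class_word_counts[label]
        total_words = sum(word_counts.values())
        score = math.log(doc_count / total_docs)  # log prior of the class
        for word, n in bag_of_words(text).items():
            if word not in vocabulary:
                continue  # ignore words never seen during training
            p = (word_counts[word] + 1) / (total_words + len(vocabulary))
            score += n * math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus, invented for illustration only.
TRAINING_DOCS = [
    ("the team won the match", "sports"),
    ("the striker scored a goal", "sports"),
    ("the bank raised interest rates", "finance"),
    ("the market fell as rates rose", "finance"),
]

model = train_naive_bayes(TRAINING_DOCS)
print(classify("the striker won the match", model))
print(classify("interest rates rose", model))
```

Even this four-sentence corpus yields one feature dimension per distinct word, which hints at why the feature space of a realistic document collection becomes very high-dimensional and why feature selection or PCA is applied before classification.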