Extracting exact answers from large-scale corpus based on hybrid strategy 44


2023 Chongqing No. 11 Middle School Zhongkao First Mock Exam: English

School: ___________ Name: ___________ Class: ___________ Exam No.: ___________

I. Multiple Choice

1. My daughter plays ________ piano very well.
A. a B. an C. the D. /
2. - Your spoken English is perfect!
- Thank you. I think ________ is better. You read English every morning.
A. you B. your C. yours D. yourself
3. ________ Day is a good chance for us to thank our mothers.
A. Mother B. Mother’s C. Mothers D. Mothers’
4. Usually I make breakfast for my family ________ Saturdays.
A. on B. for C. at D. in
5. You will find it useful to learn by yourself ________ you go to college.
A. when B. though C. until D. since
6. My parents often tell me ________ doing my homework in one and a half hours.
A. finish B. to finish C. finished D. finishing
7. ________ a brave firefighter he is!
A. What B. Why C. Who D. How
8. It’s said that tea _________ for the first time about 5,000 years ago.
A. drinks B. drank C. is drunk D. was drunk
9. - Dad, must I become a doctor like you when I finish university in the future?
- ________. You can make your own decision and do whatever you like.
A. Yes, you must B. No, you needn’t C. Yes, you can D. No, you can’t
10. - Do you know ________?
- Yes. Twice a day.
A. how long they do eye exercises at school
B. how long do they do eye exercises at school
C. how often they do eye exercises at school
D. how often do they do eye exercises at school

II. Cloze

After being told that he was badly ill, Alex Meyer, 9, of Florida, spent a lot of time in the [...] to medical bills,” he says.

Alex’s idea to collect toys started as a family project. But word 14 , and 7,000 toy donations came in from all over.

Alex hopes other kids will hold 15 drives too. “When you give a toy to a kid, he just smiles,” he says. “That one smile can make a really 16 difference in their life.”

“I’m the 17 granddaughter, niece, and daughter in my family, and I always got more toys than any others because they are all boys,” says Aerin Gardner, 11. She wanted to keep them all. But when she was 6, she 18 her county’s Toys for Kids drive.
It gives holiday gifts to kids who wouldn’t get them 19 some reasons.Aerin decided that donating to the drive would be a great idea. Each year since then, she has collected stuffed animals all year long, then given them to Toys for Kids. In 2020, she says. “I love to give people 20 . ”11．A．office B．school C．street D．hospital 12．A．empty B．dirty C．full D．clean 13．A．doctor B．nurse C．money D．water 14．A．shouted B．reached C．left D．spread 15．A．food B．clothes C．toy D．medicine 16．A．small B．big C．long D．short 17．A．only B．right C．new D．old 18．A．sang with B．wrote down C．talked about D．heard about 19．A．to B．from C．for D．about 20．A．money B．happiness C．surprises D．animals三、阅读单选21．Blood Type A people are ________.A．honest B．serious C．outgoing D．creative 22．The least common blood type is ________.A．Type A B．Type B C．Type O D．Type AB 23．According to the chart above, which of the following is right?A．Type O people should only eat meat.B．Type A people should eat more fruits.C．People with different blood types have different personalities.D．People eating different foods have different blood types.Learning a foreign language is not a popular choice at school in Britain. In UK schools it is common for children to start studying a foreign language at the age of 11 and many students give up languages completely at 14.▲ Research suggests that students think that it is more difficult to get good grades in languages than in other subjects such as science or history. The British government is now looking for different ways to improve language learning at school. One idea is to start much younger; there are plans to introduce foreign languages from the age of five.Another plan is to give school children more choices. The languages traditionally studied in British schools have been French, Spanish and German. Now the government is encouraging teachers to increase the choices of foreign languages.Chinese is planned to become the second most popular foreign language learned in UK schools. 
It is already studied by more children than those who studied German or Russian. Only French and Spanish are more popular.Gareth from Wales says, “I am learning Chinese, and find it fun.” Another student, says, “Just telling people that I learn Chinese impresses(使人印象深刻) people.”24．Which of the following sentences can be put in the ▲ ?A．But when do young people stop studying a foreign language?B．So why don’t young people continue with languages at school?C．What can government do to help children with language learning?D．And how many subjects do the children study at school?25．From the passage we know the British government ________.A．encourages its people to speak EnglishB．encourages its people to learn foreign languagesC．doesn’t care about its people’s language learningD．doesn’t allow its people to learn other languages26．The right order of people’s choices of foreign languages in UK is ________.A．French, Spanish, Chinese B．German, French, ChineseC．Chinese, French, Spanish D．French, Chinese, German 27．According to the last paragraph, we know that ________.A．Children find learning Chinese difficult B．Children find learning Chinese useless C．more and more children will come to China D．more and more children likelearning ChineseIt’s a Thursday afternoon. People are walking around as usual. You’re just about to cross a street near a park. Out jumps a man in black. He rushes to you and takes away your bag. What are you going to do?“Help!” You will call out to the people passing by for help, right? Yes, it’s common to call out loudly, but not to a certain person. Hopefully, someone will stand out or at least call the police for you. But what if they don’t stop to help you?Most of the time, most of the people who are near you do not want to offer help. But the bystander effect (旁观者效应) stops them from doing so. What is this effect about? In a group, everyone in the group may think that others will step up to help. As a result, no one takes the action to help. 
In some situation, someone may finally decide to help, but it may already be too late.So, what should you do on earth?In this situation, remember to ask one exact person for help. For example, you may say, “please call the police for me!” Instead of looking at others, this lady is more likely to help you. This way, it will be much easier for you to get help in time!28．How does the writer start the passage?A．By comparing facts.B．By listing numbers.C．By explaining differences.D．By raising questions.29．In which part of a newspaper can we read this passage?A．Safety.B．Sports.C．Travel.D．Science. 30．According to the passage, ________.A．it isn’t a good way to call the people passing by for helpB．most of the people passing by don’t want to help othersC．people usually don’t want to act differently from othersD．the bystander effect is common and we needn’t change it31．According to the passage, which of the following is the best way to call for help?A．Who can give me a hand?B．Here comes a policeman!C．Hey, don’t just stand there! Come and help!D．The man with a blue hat, please call 110 for me!There’s a new AI robot: ChatGPT, and you’d better pay attention, even if you aren’t into artificial intelligence. The tool is an AI chatbot system that OpenAI publicly known in November 2022 to show off and test what a very large, powerful AI system can achieve.ChatGPT remembers the logic of your dialogue, using earlier questions and answers to tell its next answers. It gets its answers from large number of information on the Internet. You can ask ChatGPT anything, like explaining physics, but it can be creative, and its answers can sound downright authoritative（权威的）.A few days after its launch, more than 1 million people were trying out ChatGPT. 
UBS analyst Lloyd Walmsley predicted in February 2023 that ChatGPT reached 100 million monthly users in January, reaching in 2 months what took TikTok about 9 months and Instagram two and a half years.ChatGPT is free to use at the moment because it is still in its research phase. But when too many people “jump” onto the server（服务器）, it overloads and can’t process your request. It just means you should try visiting the site at a later time when fewer people are trying to use it. Then on Feb. 1, 2023, OpenAI has a ChatGPT pro plan, which allows users to be able to use it even during busy times. This service does come at a cost of＄20/month.ChatGPT is an artificial intelligence bot that provides solutions to your questions, but Google is a search engine in which you can search for as much information as possible.ChatGPT has limited knowledge due to its programming but Google has unlimited knowledge which is updated every day.32．What does paragraph 2 mainly tell us about ChatGPT?A．How it works.B．What it can achieve.C．Where it can work.D．When it answers the questions. 
33．The writer lists numbers in paragraph 3 in order to ________.A．show the popularity of ChatGPT B．stress the cost of inventing ChatGPT C．tell number of people using TikTok D．Show how slowly Instagram develops 34．The underlined word “It” in Paragraph 4 refers to “________”.A．OpenAI B．ChatGPT C．ChatGPT Plus100D．Google 35．According to Paragraph 5, we can infer that ________.A．Google cannot replace humans’ work B．Google can give more information than ChatGPTC．ChatGPT can give solutions to your problems D．ChatGPT cannot take the place of Google①Speaking at the first news meeting with reporters at the Astronaut Center of China in Beijing, mission commander Senior Colonel Chen Dong said that every day during their six-month orbital（在轨的）journey（实验舱）, making three spacewalks and giving a science lecture for students.①Chen recalled the moment he and his teammates—Senior Colonel Liu Yang and Senior Colonel Cai Xuzhe—met their fellow astronauts in the Shenzhou XV flight inside the Tiangong station.①“When the Shenzhou XV actually launched, we were watching the live broadcast and we were so happy that we kept clapping our hands for a long time.”①“In the hours before their spacecraft arrived at the station, we were supposed to have a sleep but none of us really went to bed because we were too excited to feel sleepy. The moment I was about to open the hatch（舱门）, I saw them through the window and waved my hand to welcome them to our home,” he said.①“Then the six of us gave thumbs up in the selfie（自拍）to celebrate the first in-orbit gathering, to show our admiration of our space home and also to pay respect to our great motherland.” Chen said.①Liu, the first Chinese woman in space, said the Shenzhou XIV was her second spaceflight and she made a paper “lucky star”each day in the mission.①“I know that there are lots of females in our nation working hard to follow their dreams with courage and determination（决心）,” she said. 
“I wish that each of us could realize our aspirations and that we could become models as light to bring warmth to others.”①Cai, who made his first spaceflight, recalled that eating the vegetables they grew in the space station brought a lot of happiness to the crew, though taking care of the plants was never easy.36．During the Shenzhou XIV’ six-month orbital journey, the three astronauts ________.A．gave three science lectures for studentsB．met another three astronauts inside the Tiangong stationC．made paper “lucky stars” every day in the missionD．found it easier to grow vegetables in the space station37．The underlined word “aspirations” in Paragraph 7 probably means ________.A．dreams B．importance C．mistakes D．difficulties 38．The structure of the passage is ________.A．B．C．D．39．What would be the best title for the passage?A．The Mission of the Astronauts of Shenzhou XIV.B．The Tasks of the Astronauts of Shenzhou XV.C．Astronauts of Shenzhou XIV Share Stories with Public.D．Celebrating the First in-orbit Gathering of the Two Teams.四、补全对话7选5阅读下面的对话，根据上下文，选择恰当的选项补全对话，使句意完整、符合逻辑。

Methods of Keyword Extraction: An English Essay

Keyword extraction is a method used to identify and extract the most important words or phrases from a piece of text. This technique is commonly used in information retrieval, text mining, and natural language processing.

In the process of keyword extraction, various algorithms and techniques can be employed to analyze the frequency, relevance, and co-occurrence of words within the text. These methods include statistical analysis, linguistic analysis, and machine learning algorithms.

One of the main benefits of keyword extraction is its ability to provide a quick and efficient way to summarize the main topics or themes within a large body of text. This can be particularly useful for researchers, content creators, and anyone who needs to quickly understand the key points of a document.

There are several different approaches to keyword extraction, including statistical methods such as TF-IDF (Term Frequency-Inverse Document Frequency), graph-based methods like TextRank, and natural language processing techniques such as part-of-speech tagging and named entity recognition.

In addition to summarization, keyword extraction can also be used for categorization, indexing, and search engine optimization. By identifying the most relevant keywords within a document, it becomes easier to organize, retrieve, and search for information.

Overall, keyword extraction is a valuable tool for making sense of large volumes of text and identifying the most important information. Whether it's for research, content creation, or information retrieval, this method can help to streamline the process of understanding and organizing textual data.
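To make the statistical approach more concrete, here is a minimal sketch of TF-IDF keyword extraction in Python, using only the standard library. The function names (`tokenize`, `tfidf_keywords`), the three sample sentences, and the simple scoring (term frequency times the logarithm of inverse document frequency, with no smoothing or stop-word list) are assumptions chosen for illustration; real toolkits usually add normalization and smoothing on top of this idea.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_keywords(docs, doc_index, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms of docs[doc_index].

    Hypothetical helper: an unsmoothed TF-IDF over a small list of strings.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    tokens = tokenized[doc_index]
    counts = Counter(tokens)
    scores = {}
    for term, count in counts.items():
        tf = count / len(tokens)            # relative frequency in this document
        idf = math.log(n_docs / df[term])   # 0 for terms found in every document
        scores[term] = tf * idf

    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the bird sang a song",
]
print(tfidf_keywords(docs, 0))
```

In this sketch a word that occurs in every document, such as "the", gets a score of zero and therefore never outranks a word that is specific to one document, which is the intuition behind TF-IDF weighting.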

Chinese-to-English Translation Interview Questions (3 Samples)

Part 1

---

Introduction

In today's globalized world, the ability to communicate effectively across languages is a crucial skill for professionals in multinational corporations. This interview question aims to assess the candidate's proficiency in English to Chinese translation, a skill that is essential for roles that involve cross-cultural communication, marketing, and documentation. The question provided below is designed to gauge the candidate's understanding of the source text, their ability to convey the intended meaning accurately, and their attention to detail and cultural appropriateness.

---

Interview Question:

As a marketing manager for a global technology company, you have been tasked with translating a press release about a new software product that is set to revolutionize the way businesses manage their data. The press release is written in English and contains technical jargon, industry-specific terminology, and references to cultural nuances that are unique to the English-speaking market. Below is the English text of the press release. Your task is to translate it into Chinese, ensuring that the translation is accurate, culturally appropriate, and maintains the original tone and intent.

---

English Text:

---

FOR IMMEDIATE RELEASE

GlobalTech Announces the Launch of DataXpress, the Ultimate Solution for Data Management

[City, Country] – GlobalTech, a leading provider of innovative data management solutions, is proud to announce the launch of DataXpress, the latest addition to its suite of cutting-edge products. DataXpress is designed to transform the way businesses store, analyze, and secure their data, offering a comprehensive solution that addresses the evolving challenges of the digital age.

A Game-Changer for Data Management

DataXpress is a revolutionary software platform that leverages advanced machine learning algorithms to optimize data storage and retrieval processes.
With its intuitive user interface and robust security features, DataXpress empowers businesses to manage their data more efficiently and securely than ever before.

“DataXpress is a game-changer for data management,” says John Smith, Chief Technology Officer at GlobalTech. “Our team has poured years of research and development into creating a product that not only meets the demands of today’s data-intensive businesses but also prepares them for the challenges of tomorrow.”

Key Features of DataXpress:

- Intelligent Data Storage: Utilizes machine learning to analyze and categorize data, ensuring optimal storage solutions.
- Advanced Analytics: Offers powerful tools for data analysis, allowing businesses to gain actionable insights from their data.
- Enhanced Security: Implements cutting-edge encryption techniques to protect sensitive data from unauthorized access.
- Scalable Architecture: Designed to handle large volumes of data and scale with the growth of the business.
- Comprehensive Support: Provides 24/7 customer support to ensure smooth implementation and ongoing assistance.

GlobalTech’s Commitment to Innovation

GlobalTech has a long-standing reputation for innovation and excellence in data management. With DataXpress, the company continues its commitment to providing cutting-edge solutions that empower businesses to thrive in the digital era.

“DataXpress is the result of our dedication to driving technological advancements in data management,” says Sarah Johnson, President of GlobalTech. “We are confident that this product will become an essential tool for businesses worldwide.”

Availability and Pricing

DataXpress is now available for purchase through GlobalTech’s official website. Pricing starts at $99 per month for a basic subscription, with discounts available for annual commitments.

About GlobalTech

GlobalTech is a global leader in data management solutions, offering a wide range of products and services designed to help businesses manage their data effectively.
With a focus on innovation and customer satisfaction, GlobalTech has become a trusted partner for businesses around the world.

---

Instructions for the Candidate:

1. Read the entire press release carefully to ensure you understand the context and the intended message.
2. Pay close attention to technical jargon, industry-specific terminology, and cultural nuances.
3. Translate the press release into Chinese, ensuring that the translation is accurate and maintains the original tone and intent.
4. Your translation should be clear, concise, and culturally appropriate.
5. Pay attention to grammar, punctuation, and formatting.
6. Submit your translation in a separate document.

---

Evaluation Criteria:

- Accuracy: The translation should accurately reflect the original text, including technical terms and industry-specific jargon.
- Cultural Appropriateness: The translation should be culturally appropriate, taking into account the target audience and cultural nuances.
- Tone and Intent: The translation should maintain the original tone and intent of the press release.
- Clarity and Conciseness: The translation should be clear and concise, avoiding unnecessary wordiness or ambiguity.
- Grammar and Punctuation: The translation should be grammatically correct and punctuated accurately.

---

This interview question is designed to test the candidate's proficiency in English to Chinese translation, their attention to detail, and their ability to adapt to the specific requirements of the target language and culture.

Part 2

Introduction

The role of a translator is pivotal in the globalized world, where communication across languages is essential for business, culture, and education. This document outlines a comprehensive set of interview questions designed to assess the skills, knowledge, and personality of candidates applying for a translator position.
The questions are categorized into different sections to provide a structured approach to evaluating the candidate's suitability for the role.

Section 1: Language Proficiency and Translation Skills

1. Tell us about your language background. What languages do you fluently speak and write?
2. Can you describe a challenging translation project you have worked on and how you overcame the difficulties?
3. How do you ensure the accuracy and consistency of your translations?
4. What tools and software do you use for translation work? Explain how you utilize them effectively.
5. Discuss the importance of context in translation. Give an example of how you handled a contextually challenging translation.
6. How do you maintain the tone and style of the original text in your translations?
7. Describe a time when you had to translate a technical term or concept. How did you approach it?
8. What is your approach to translating idiomatic expressions?
9. How do you handle cultural differences in your translations?
10. Can you explain the difference between literal translation and free translation? Give an example of each.

Section 2: Specialization and Industry Knowledge

11. What is your area of specialization in translation (e.g., legal, medical, technical, literary)?
12. Can you provide examples of specialized terminology in your field and how you handle them?
13. How do you stay updated with the latest developments in your specialized field?
14. What experience do you have in translating documents related to [specific industry or field]?
15. How do you ensure the cultural relevance of your translations within a specific industry?
16. Can you describe a situation where you had to adapt your translation style to suit a specific audience within an industry?
17. What are the key challenges you face when translating documents from [specific source language] to [specific target language]?
18. How do you ensure the confidentiality of sensitive information in your translations?
19. What are the legal and ethical considerations you take into account when translating documents?

Section 3: Project Management and Work Style

20. How do you prioritize and manage multiple translation projects simultaneously?
21. Can you describe your workflow for a typical translation project?
22. What is your approach to meeting tight deadlines?
23. How do you ensure quality control in your translations?
24. What feedback mechanisms do you use to improve your translation work?
25. How do you handle client queries and revisions?
26. What experience do you have with project management tools and software?
27. How do you ensure effective communication with clients and colleagues?
28. What is your approach to working in a team on translation projects?
29. How do you handle pressure and stress in your work environment?
30. What are your long-term career goals in the field of translation?

Section 4: Professional Development and Learning

31. How do you stay motivated in your translation work?
32. What professional development opportunities have you pursued in the past year?
33. How do you stay current with industry trends and advancements in translation technology?
34. What are your preferred methods for learning new languages and terminology?
35. How do you keep your language skills sharp and up-to-date?
36. What certifications or qualifications do you hold in translation or related fields?
37. What professional organizations or networks are you a part of in the translation industry?
38. How do you approach continuous learning and improvement in your work?
39. What advice would you give to someone starting their career in translation?
40. How do you envision your professional growth over the next five years?

Conclusion

These interview questions are designed to provide a comprehensive evaluation of a candidate's suitability for a translator position.
By asking a wide range of questions, employers can gain insights into the candidate's language proficiency, translation skills, specialization knowledge, project management abilities, work style, and professional development aspirations. It is important to tailor these questions to the specific requirements of the role and the company to ensure the best fit for the position.

Part 3

Introduction:

As a professional Chinese-English interpreter, you are expected to possess not only linguistic proficiency but also cultural understanding, quick thinking, and the ability to adapt to various communication scenarios. This comprehensive set of interview questions is designed to assess your skills, experience, and suitability for a Chinese-English interpreter position.

1. Personal Background and Language Skills

Question 1: Can you please introduce yourself and tell us about your background in language learning and interpretation?

Answer:

[Your name] is a dedicated and highly motivated individual with a strong passion for language and cross-cultural communication. I hold a Bachelor’s degree in Translation and Interpretation from [University Name], where I majored in Chinese-English translation and interpretation. Throughout my academic journey, I have consistently achieved top grades in both language courses and practical interpretation exercises.

My interest in languages began at a young age, and I have since dedicated myself to mastering both Chinese and English. I have completed numerous translation and interpretation projects, including conferences, business meetings, and cultural events. My proficiency in both languages is not only linguistic but also cultural, as I have lived and worked in both China and English-speaking countries, providing me with a deep understanding of the nuances of both languages and cultures.

Question 2: What are the main differences between Chinese and English in terms of grammar, vocabulary, and usage?
How do you handle these differences when interpreting?

Answer:

The main differences between Chinese and English lie in their grammatical structures, vocabulary, and usage. For example, Chinese has no articles, while English requires articles in certain contexts. Additionally, Chinese tends to use more idiomatic expressions and proverbs, which can be challenging to translate directly into English.

To handle these differences, I approach each interpretation task with a keen awareness of the cultural and linguistic nuances involved. I focus on understanding the context of the conversation, identifying the intended meaning behind the words, and then conveying that meaning in a way that is natural and appropriate for the target language. This often involves using synonyms, paraphrasing, or even creating new expressions to ensure the message is accurately and effectively communicated.

2. Professional Experience and Skills

Question 3: Can you describe your experience in interpreting at conferences and business meetings? What were some of the challenges you faced, and how did you overcome them?

Answer:

Throughout my career, I have had the opportunity to interpret at numerous conferences and business meetings, including international trade fairs, seminars, and corporate events. One of the challenges I often face is the need to quickly adapt to the specific terminology and industry jargon used by the participants.

To overcome this challenge, I spend time researching the relevant subject matter before the event and familiarize myself with the key terms and concepts. I also actively seek feedback from the participants to ensure that my interpretations are accurate and clear. Additionally, I maintain a calm and professional demeanor to manage the pressure and ensure a smooth flow of communication.

Question 4: What is your approach to consecutive interpretation?
Can you give an example of a situation where you used consecutive interpretation effectively?

Answer:

Consecutive interpretation requires a high level of concentration, memory, and language skills. My approach to consecutive interpretation involves listening carefully to the speaker, mentally processing the information, and then conveying the message in the target language in a coherent and concise manner.

One example of a situation where I used consecutive interpretation effectively was during a business negotiation between a Chinese company and an international client. The negotiation involved complex technical terms and required a deep understanding of both the business context and the cultural nuances of the conversation. By maintaining a calm demeanor and focusing on the key points, I was able to convey the message accurately and facilitate a successful negotiation.

Question 5: How do you prepare for a major interpreting assignment? What are some of the resources you use?

Answer:

Preparing for a major interpreting assignment involves several steps. First, I research the topic and the participants to understand the context and the key issues at stake. I then familiarize myself with the relevant terminology and industry jargon, using dictionaries, glossaries, and online resources.

I also prepare by practicing the interpretation of sample text and role-playing scenarios to improve my timing and delivery. Additionally, I ensure that I am well-rested and hydrated on the day of the event to maintain peak performance.

3. Adaptability and Problem-Solving

Question 6: Describe a time when you had to interpret in a challenging or unfamiliar environment. How did you handle the situation?

Answer:

During a recent conference, I was asked to interpret in a venue that was extremely noisy due to construction work.
This made it difficult to hear the speakers clearly and to convey the message accurately to the participants.

To handle the situation, I asked the organizers to move the interpreter booth closer to the speakers and to provide noise-cancelling headphones. I also increased my focus and concentration, and made a conscious effort to repeat key points and ask for clarifications when necessary. Despite the challenging environment, I was able to maintain the quality of my interpretation and ensure that the event ran smoothly.

Question 7: How do you handle situations where there is a cultural misunderstanding or miscommunication during an interpretation?

Answer:

Cultural misunderstandings can occur at any time during an interpretation, and it is important to address them promptly and effectively. When I encounter a cultural misunderstanding, I take a few moments to pause and reflect on the context and the likely source of the misunderstanding.

I then clarify the point with the speaker, ensuring that I have a clear understanding of their intentions. If necessary, I seek additional information from the participants to facilitate a more accurate interpretation. By maintaining open communication and showing empathy, I can often resolve cultural misunderstandings and ensure a successful interpretation.

4. Ethics and Professionalism

Question 8: What are your ethical considerations when working as an interpreter? Can you give an example of a situation where you had to adhere to an ethical guideline?

Answer:

As an interpreter, I am bound by a set of ethical guidelines that emphasize confidentiality, neutrality, and professionalism. These guidelines ensure that I maintain the integrity of the communication process and protect the interests of all parties involved.

One example of a situation where I had to adhere to an ethical guideline was during a legal deposition.
I was required to remain neutral and impartial, ensuring that the interpretation accurately reflected the statements of both the plaintiff and the defendant. By adhering to these ethical principles, I was able to maintain the integrity of the legal process and provide a fair and accurate account of the proceedings.

Question 9: How do you ensure the confidentiality of sensitive information during an interpretation?

Answer:

Confidentiality is a crucial aspect of interpretation, and I take it very seriously. To ensure the confidentiality of sensitive information, I follow these steps:

1. Understand the context: Before beginning the interpretation, I clarify the nature of the information being shared and any confidentiality requirements.
2. Establish trust: I build a strong rapport with the participants, ensuring that they trust me to handle sensitive information with care.
3. Maintain confidentiality: I do not discuss the interpretation with anyone outside of the assignment and take steps to secure any physical or digital materials related to the interpretation.
4. Legal compliance: I am aware of the legal requirements for confidentiality in my jurisdiction and ensure that I comply with all relevant laws and regulations.

5. Conclusion

As a professional Chinese-English interpreter, I am committed to providing high-quality, accurate, and culturally sensitive interpretation services. I am confident that my language skills, professional experience, and ethical standards make me a suitable candidate for this position. I am eager to contribute to your team and help facilitate effective communication between Chinese and English-speaking parties. Thank you for considering my application.

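Summary of Key High School English Words and Phrases

Below is a detailed introduction to key high school English words and phrases.

These words and phrases are very important in learning English; mastering them can help you understand and use English better.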

1. Vocabulary - 词汇
Vocabulary refers to the words and phrases used in a particular language or by a particular group of people. It is essential to have a good vocabulary in order to understand and communicate effectively in English.

2. Grammar - 语法
Grammar is the set of rules that govern the structure and use of language. It includes various aspects such as tenses, parts of speech, sentence structure, and punctuation. Understanding and using grammar correctly is important for clear and accurate communication.

3. Reading - 阅读
Reading is an important skill that helps improve vocabulary, comprehension, and critical thinking. It involves understanding written text and extracting meaning from it. Reading various types of texts, such as novels, articles, and essays, can enhance your language skills.

4. Writing - 写作
Writing involves expressing thoughts and ideas using written words. It is a vital skill for academic and professional success. Effective writing requires good grammar, organization, and clarity of expression. Practice and feedback can help improve your writing skills.

5. Listening - 听力
Listening is the ability to understand spoken language. It is a crucial skill for effective communication. Developing good listening skills involves paying attention, understanding context, and recognizing different accents and speech patterns.

6. Speaking - 口语
Speaking is the ability to express thoughts and ideas orally. It involves using correct pronunciation, fluency, and appropriate vocabulary and grammar. Regular practice, conversations, and presentations can help improve your speaking skills.

7. Conversation - 对话
Conversation refers to an informal exchange of ideas and information between two or more people. It involves listening, responding, and engaging in meaningful dialogue. Conversations help improve fluency and communication skills.

8. Pronunciation - 发音
Pronunciation is the way in which words are spoken. It involves the correct articulation of sounds, stress, and intonation patterns. Practicing pronunciation can enhance clarity and understanding in spoken English.

9. Comprehension - 理解能力
Comprehension is the ability to understand written or spoken language. It involves extracting meaning, making inferences, and answering questions based on the given information. Improving comprehension skills requires practice and exposure to different texts.

10. Idioms - 习语
Idioms are phrases that have a different meaning from the literal interpretation of the words used. They are unique to a language and often reflect the cultural context. Learning idioms can help you understand native speakers and communicate more effectively.

11. Phrasal Verbs - 短语动词
Phrasal verbs are combinations of a verb and one or more particles (e.g., "look up," "take off"). They have a different meaning from the original verb and can be challenging for non-native speakers. Understanding and using phrasal verbs is important for natural and fluent communication.

12. Collocations - 搭配词
Collocations are words that frequently occur together. They form natural and common phrases in a language. Learning collocations can enhance vocabulary and improve accuracy in speaking and writing.

13. Synonyms - 同义词
Synonyms are words that have similar meanings. Knowing synonyms can help you avoid repetition and expand your vocabulary. Thesauruses and online resources are useful tools for finding synonyms.

14. Antonyms - 反义词
Antonyms are words that have opposite meanings. Knowing antonyms can help you express contrasting ideas and expand your vocabulary. Online resources and dictionaries can provide lists of antonyms.

15. Homophones - 同音异义词
Homophones are words that have the same pronunciation but different meanings and spellings (e.g., "their," "there," "they're"). Knowing homophones is important for accurate communication and avoiding confusion.

16. Prefixes and Suffixes - 前缀和后缀
Prefixes are added to the beginning of a word to change its meaning (e.g., "un-" in "unhappy"). Suffixes are added to the end of a word to modify its meaning or part of speech (e.g., "-able" in "enjoyable"). Understanding and using prefixes and suffixes can help expand your vocabulary.

17. Context Clues - 上下文线索
Context clues are hints or information surrounding an unknown word that can help infer its meaning. They include the words, phrases, or sentences that provide clues about the unknown word's definition. Recognizing context clues is essential for effective reading comprehension.

18. Figurative Language - 比喻语言
Figurative language refers to the use of words or expressions to convey meaning beyond their literal interpretation. It includes similes, metaphors, personification, and hyperbole. Understanding figurative language enhances language skills and enables more creative expression.

19. Tenses - 时态
Tenses are verb forms that express time relationships. They indicate when an action or state of being occurs. English has several tenses, including past, present, and future forms. Understanding tenses is crucial for accurate communication and writing.

20. Modal Verbs - 情态动词
Modal verbs are auxiliary verbs that express attitudes, abilities, or obligations. Examples include "can," "could," "should," and "must." Understanding and using modal verbs correctly can help convey meaning and express politeness in English.

21. Active Voice - 主动语态
Active voice is a grammatical voice in which the subject of the sentence performs the action. It is generally more direct and clear than passive voice. Using active voice can make your writing and speaking more dynamic and engaging.

22. Passive Voice - 被动语态
Passive voice is a grammatical voice in which the subject of the sentence receives the action. It is often used when the focus is on the action rather than the doer. Understanding when to use passive voice can enhance your writing and speaking skills.

23. Conditionals - 条件句
Conditionals are sentences that express hypothetical or conditional situations. They are used to talk about possibilities, permissions, and imaginary situations. There are four main types of conditionals: zero, first, second, and third conditionals.

24. Reported Speech - 间接引语
Reported speech is used to tell someone what another person said, without using the exact words. It involves changing direct speech into indirect speech, adjusting tenses and pronouns accordingly. Understanding reported speech is important for effective communication.

25. Relative Clauses - 定语从句
Relative clauses are used to provide additional information about a noun or pronoun in a sentence. They begin with a relative pronoun (e.g., "who," "which," "that") and are essential for building complex sentences and expressing relationships between ideas.

26. Adjectives - 形容词
Adjectives are words that describe or modify nouns and pronouns. They provide characteristics or attributes about the person, place, or thing being described. Using a wide range of adjectives can make your writing and speaking more descriptive and engaging.

27. Adverbs - 副词
Adverbs are words that describe or modify verbs, adjectives, or other adverbs. They provide information about how, when, where, or to what extent an action is performed. Using adverbs correctly can enhance the clarity and precision of your communication.

28. Prepositions - 介词
Prepositions are words that show the relationship between a noun or pronoun and other words in a sentence. They indicate time, place, direction, and other relationships. Understanding prepositions is essential for constructing clear and grammatically correct sentences.

29. Conjunctions - 连词
Conjunctions are words that connect clauses, phrases, or words within a sentence. They include coordinating conjunctions (e.g., "and," "but," "or") and subordinating conjunctions (e.g., "because," "although," "unless"). Using conjunctions effectively can improve the flow and coherence of your writing and speaking.

30. Interjections - 感叹词
Interjections are words or phrases that express strong emotions or reactions. They are often used to convey surprise, joy, enthusiasm, or disapproval. Including interjections in your communication can add emphasis and expressiveness.

31. Punctuation - 标点符号
Punctuation is the system of marks used in writing to separate sentences, clauses, and phrases. It includes periods, commas, question marks, exclamation points, and more. Using punctuation correctly is crucial for clear and effective communication.

32. Spelling - 拼写
Spelling refers to the correct arrangement of letters in a word. Good spelling skills are essential for clear and accurate writing. Practice, memorization, and the use of dictionaries and spell checkers can help improve your spelling.

33. Abbreviations - 缩写
Abbreviations are shortened forms of words or phrases, often used to save space or time. Common examples include "A.M." (ante meridiem) and "P.M." (post meridiem). Understanding and using abbreviations correctly can enhance your writing and comprehension skills.

34. Acronyms - 首字母缩略词
Acronyms are words formed from the initial letters of other words, such as "NASA" (National Aeronautics and Space Administration). They are often used in technical and professional contexts. Learning acronyms can help you communicate more efficiently and understand specialized terminology.

35. Plagiarism - 抄袭
Plagiarism is the act of using someone else's work or ideas without giving them proper credit. It is considered unethical and can have serious consequences in academic and professional settings. Understanding how to avoid plagiarism and properly cite sources is essential for maintaining integrity in your work.

36. Paraphrasing - 改写
Paraphrasing involves restating someone else's ideas or information in your own words. It is a valuable skill for avoiding plagiarism and demonstrating understanding. Practice and attention to detail can help improve your paraphrasing skills.

37. Summarizing - 概括
Summarizing involves condensing information or ideas into a shorter form. It is a useful skill for writing reports, essays, and presentations. Developing effective summarizing skills requires extracting key points and presenting them in a clear and concise manner.

Definitions of CEC2014 benchmark suite Part A

function and each basic function. Considering that in real-world problems linkages seldom exist among all variables, in CEC’14 the variables are divided into subcomponents randomly. The rotation matrix for each subcomponent is generated from standard normally distributed entries by Gram-Schmidt ortho-normalization, with condition number c equal to 1 or 2.
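The Gram-Schmidt step mentioned above can be illustrated with a minimal sketch. It is not the benchmark suite's own generator: the function name `random_orthonormal_matrix` is hypothetical, only the orthonormalization of standard normal entries is shown (which corresponds to condition number 1), and how the suite scales a matrix to reach condition number 2 is not reproduced here.

```python
import random

def random_orthonormal_matrix(dim, seed=None):
    """Build a dim x dim matrix with orthonormal rows from Gaussian entries."""
    rng = random.Random(seed)
    rows = []
    while len(rows) < dim:
        # Candidate row with standard normally distributed entries.
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Gram-Schmidt: subtract the projection onto every accepted row.
        for u in rows:
            dot = sum(a * b for a, b in zip(v, u))
            v = [a - dot * b for a, b in zip(v, u)]
        norm = sum(a * a for a in v) ** 0.5
        if norm > 1e-10:  # skip the (very unlikely) degenerate draw
            rows.append([a / norm for a in v])
    return rows

M = random_orthonormal_matrix(4, seed=1)
```

Multiplying `M` by its transpose should give the identity up to rounding error, which is a quick way to check that the rows are orthonormal.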
J. J. Liang1, B. Y. Qu2, P. N. Suganthan3
1 School of Electrical Engineering, Zhengzhou University, Zhengzhou, China
2 School of Electric and Information Engineering, Zhongyuan University of Technology, Zhengzhou, China
$o_{i1} = [o_{i1}, o_{i2}, \ldots, o_{iD}]^T$: the shifted global optimum (defined in “shift_data_x.txt”), which is randomly distributed in $[-80, 80]^D$. Different from CEC’13, each function has a shift data for CEC’14.

The Development of Radar: Chinese-English Parallel Paper Materials (Translated Foreign Literature)

At the beginning of the 20th century, XXX. XXX for the development of radar. However, it was not until before and after World War II that XXX 1922. XXX objects. In 1925, the United XXX and used it to measure the height of objects. The US Navy also discovered continuous wave radar in 1922, which could be used on ships.

During World War II, XXX ns. It was used to detect enemy planes and ships. XXX. After the war, XXX such as air traffic control and XXX.

In the 1950s and 1960s, the development of XXX imaging of objects, and were used in XXX.

Today, XXX, radar will XXX industries.

In the early 1930s, XXX system for aircraft with a range of about 40 kilometers and a n of 457 meters in 1936. Two years later, the United Kingdom deployed a chain of early warning radar XXX.

Overall, the XXX role in the n of XXX off the ionosphere.

Laboratory Procedure for Extracting Anthocyanins from Blueberries

English answer:

The laboratory process for extracting anthocyanins from blueberries involves several steps. Here is a detailed outline of the procedure:

1. Sample Preparation:
Collect fresh blueberries and wash them thoroughly to remove any dirt or impurities.
Pat dry the berries using a clean towel or paper towel.
Weigh the berries to determine the exact amount for extraction.

2. Extraction:
Place the blueberries in a blender or food processor and blend until a smooth puree is obtained.
Transfer the puree into a beaker or a glass container.
Add a suitable solvent, such as ethanol or methanol, to the puree. The solvent should be added in a ratio of 3:1 (solvent: puree).
Stir the mixture gently to ensure proper mixing of the solvent with the puree.

3. Incubation:
Cover the beaker with a lid or aluminum foil to prevent evaporation.
Place the beaker in a dark and cool environment, such as a refrigerator, and allow it to incubate for 24 to 48 hours.
During this incubation period, the solvent will extract the anthocyanins from the blueberry puree.

4. Filtration:
After the incubation period, filter the mixture using filter paper or a fine mesh sieve to separate the liquid extract from the solid residue.
Collect the filtrate in a clean container and discard the residue.

5. Concentration:
Transfer the liquid extract into a rotary evaporator or similar equipment for concentration.
Apply gentle heat and vacuum to evaporate the solvent, leaving behind a concentrated anthocyanin extract.
Monitor the temperature and adjust accordingly to prevent degradation of the anthocyanins.

6. Analysis and Storage:
Analyze the concentrated extract for anthocyanin content using suitable analytical techniques, such as spectrophotometry or high-performance liquid chromatography (HPLC).
Store the extracted anthocyanins in a dark, airtight container at a low temperature to maintain their stability and potency.

中文回答：

蓝莓提取花青素的实验室流程包括以下几个步骤：

1. 样品准备：
收集新鲜的蓝莓，并彻底清洗以去除任何污垢或杂质。

The Origin and Invention of Gunpowder (English Essay)

The Enigmatic Origins and Revolutionary Invention of Gunpowder: Unveiling the Spark that Ignited the Course of History.

Gunpowder, a potent alchemical concoction that revolutionized warfare and indelibly altered the trajectory of human history, had its enigmatic origins shrouded in the mists of time. Its invention marked a seminal moment, a spark that ignited a transformative journey, paving the way for advancements in weaponry, military strategy, and technological prowess.

While the exact circumstances surrounding the discovery of gunpowder remain shrouded in obscurity, it is widely believed to have originated in China sometime during the 9th century. Alchemists, tirelessly pursuing the elusive elixir of immortality, inadvertently stumbled upon this volatile substance while experimenting with various mixtures of saltpeter, sulfur, and charcoal.

As these alchemists patiently heated and combined these ingredients, the serendipitous discovery of gunpowder occurred. A sudden and explosive reaction ensued, releasing a deafening roar and a cloud of acrid smoke. This remarkable phenomenon ignited a new era, forever changing the landscape of warfare.

Initially, gunpowder's potential was not fully recognized. Early Chinese alchemists primarily utilized it for entertainment purposes, creating rudimentary fireworks and firecrackers that emitted colorful sparks and thunderous noises. However, it wasn't long before its explosive potential caught the attention of military strategists.

During the 13th century, the Song Dynasty in China recognized the immense military applications of gunpowder. They devised novel weapons, such as the fire lance and the cannon, which propelled projectiles with unprecedented force and accuracy. These innovations revolutionized siege warfare, enabling armies to breach fortified walls and conquer enemy strongholds with devastating efficiency.

The news of gunpowder's military prowess spread like wildfire throughout Eurasia. By the 14th century, knowledge of its explosive properties had reached Europe through Arab traders and scholars. Europeans eagerly embraced this transformative technology, incorporating it into their own arsenals.

During the Hundred Years' War between England and France, gunpowder played a decisive role in the outcome of several key battles. The English army, equipped with cannons, demonstrated the devastating impact of artillery fire on medieval fortifications. The invention of the musket, a portable firearm, also transformed infantry warfare, providing soldiers with a ranged weapon that could penetrate armor.

The advent of gunpowder not only revolutionized warfare but also had profound implications for society as a whole. The introduction of firearms led to the development of standing armies, a shift from feudal levies to professional soldiers, and the rise of centralized states with the ability to project power over vast distances.

Furthermore, the invention of gunpowder spurred advancements in mining, engineering, and metallurgy. The need for large quantities of saltpeter, the key ingredient in gunpowder, led to the development of new techniques for extracting this substance from natural deposits. The production of cannons and firearms required specialized skills in metalworking and casting, fostering the emergence of skilled artisans and engineers.

As the centuries progressed, gunpowder remained a driving force behind military and technological advancements. From the development of rockets and grenades to the invention of modern firearms, gunpowder's explosive power continued to shape the course of history. Its legacy is etched in the annals of warfare, exploration, and the advancement of human knowledge.

In conclusion, the discovery and invention of gunpowder marked a pivotal turning point in human history. Its transformative impact on warfare, society, and technology cannot be overstated. From its humble origins as an alchemical curiosity to its role as the catalyst for military revolutions and technological advancements, gunpowder stands as a testament to human ingenuity and the profound influence of scientific discovery. As we continue to explore the frontiers of science and technology, let us remember the enigmatic origins and enduring legacy of this extraordinary substance, the spark that ignited the flames of progress and forever altered the course of human history.

Convolutional Networks for Images Speech and Time Series

LeCun & Bengio: Convolutional Networks for Images, Speech, and Time-Series
1 INTRODUCTION
The ability of multilayer back-propagation networks to learn complex, high-dimensional, nonlinear mappings from large collections of examples makes them obvious candidates for image recognition or speech recognition tasks (see PATTERN RECOGNITION AND NEURAL NETWORKS). In the traditional model of pattern recognition, a hand-designed feature extractor gathers relevant information from the input and eliminates irrelevant variabilities. A trainable classifier then categorizes the resulting feature vectors (or strings of symbols) into classes. In this scheme, standard, fully-connected multilayer networks can be used as classifiers. A potentially more interesting scheme is to eliminate the feature extractor, feeding the network with "raw" inputs (e.g. normalized images), and to rely on backpropagation to turn the first few layers into an appropriate feature extractor. While this can be done with an ordinary fully connected feed-forward network with some success for tasks such as character recognition, there are problems. Firstly, typical images, or spectral representations of spoken words, are large, often with several hundred variables. A fully-connected first layer with, say a few 100 hidden units, would already contain several 10,000 weights. Overfitting problems may occur if training data is scarce. In addition, the memory requirement for that many weights may rule out certain hardware implementations. But, the main deficiency of unstructured nets for image or speech applications is that they have no built-in invariance with respect to translations, or

New College English (All-New Edition) Viewing, Listening and Reading 4: Video Listening Transcripts

Unit 1 The Perfect Swarm

Narrator: Damage from swarms of locusts can reach disastrous proportions. A single swarm of desert locusts can consume over 70,000 metric tons of vegetation a day. There is, however, one continent that’s locust-free: North America.

旁白：蝗虫群的伤害可以达到灾难性的程度。

一个单一的沙漠蝗虫可以消耗超过70000吨的植被一天。

然而，有一个大陆是蝗虫自由：美国北部。

Interestingly enough, this wasn’t always true. For hundreds of years, the Rocky Mountain locust was a common pest in the American West. Back in the mid-1800s, thousands of pioneers journeyed across the U.S. in search of free land and new opportunities. They settled on the frontier of the western states, and began to farm the land intensively, growing corn and other crops.

有趣的是，这并不总是真实的。

几百年来，落基山脉的蝗虫是美国西部的一种常见害虫。

早在19世纪中叶，成千上万的先驱者跨越美国在自由的土地和寻找新的机会。

他们定居在西部边境，并开始对土地进行集中耕种，种植玉米和其他农作物。

Then, in 1875, out of nowhere, a rare combination of air currents, drought, and basic biology produced the right conditions for an unthinkable event, the worst storm ever recorded, the “perfect swarm.” It came over the horizon like a strange, dark cloud. Not millions, not billions, but trillions of insects, sweeping through the land like a living tornado. Those who saw the incredible event and survived never forgot what they witnessed.

然后，在1875，走出无处，一个罕见的组合，空气电流，干旱，和基本生物学产生了正确的条件为一个不可想象的事件，最坏的风暴有史以来，“完美的群”，它在地平线上像一个奇怪的，黑暗的云。


Extracting exact answers from large-scale corpus based on hybrid strategy

LI Peng, WANG Xiao-long, WANG Bao-xun
(School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China)

LI Peng, male, Ph.D., member of the China Computer Federation; research fields: network information processing, question answering and natural language processing.
WANG Xiao-long, male, Ph.D., professor, program chair of IEEE ICMLC; research fields: network information processing, artificial intelligence, natural language processing, computational molecular biology and business intelligence.

Abstract: This paper presents a novel and efficient method for extracting exact textual answers from documents returned by a traditional IR system over a large-scale collection of texts. Its main intended contribution is the System Similarity Model (SSM), which can be considered an extension of the vector space model (VSM), for ranking passages. The paper also presents a method of formalized answer extraction based on pattern learning, and applies the binary logistic regression model (LRM), which has seldom been used in IE, to extract specific information from candidate data sets. Because the data used for parameter estimation are severely sparse, we adopt stratified sampling and improve the traditional parameter estimation method of the logistic regression model. A series of experimental results shows that the overall performance of our system is good and that our approach is effective. Our system, Insun05QA1, which participated in the QA track of TREC 2005, obtained excellent results.

Key words: question answering; answer extraction; system similarity model; stratified sampling; logistic regression model

1. Introduction

Open-domain question answering has recently received more and more attention owing to advances in information retrieval (IR), information extraction (IE), and natural language processing (NLP). The goal of a question answering system is to retrieve exact answers to questions, rather than the full documents or passages containing answers that most information retrieval systems return. Question answering systems combine IR and IE technology and focus on fact-based, short-answer natural language questions such as “Who invented the paper clip?” The state of the art in QA research is represented by the question answering track evaluation of the Text REtrieval Conference (TREC)[1]. From the first QA track at TREC-8 in 1999 to TREC 2005, more and more organizations have joined the evaluation, which has become considerably more complicated and has promoted the development of QA. The TREC Question Answering Track has motivated much of the advancement in open-domain QA systems. Our system, Insun05QA1, participated in the QA track of TREC 2005 for the first time and obtained satisfactory results; in particular, it ranked fifth on factoid questions.

The identification, ranking and extraction of relevant answer strings constitute the main goal and challenge of an open-domain Q/A system[2]. Various types of clues have therefore been used to extract answers. For example, Steven Abney extracted answers according to their frequency and position in the passages[3]. Diego et al. utilized dependency theory and a semantic interpreter for answer extraction[4]. Marius and Sandra employed a perceptron-based machine learning approach to answer ranking[5-6]. Abraham Ittycheriah applied maximum entropy and decision trees to answer tagging[7]. The related work shows that many researchers regard answer extraction (AE) as a very important problem and have tried many kinds of methods. However, the results obtained so far are still not fully satisfactory.
People are still constantly exploring new avenues and making every effort to reach better results.

In this article, we focus on three innovative methods, namely the System Similarity Model, pattern learning, and the stratified sampling logistic regression model (LRM), and on their application to passage retrieval, formalized answer extraction and answer ranking, respectively. The System Similarity Model (SSM) can be regarded as an extension of the vector space model (VSM) that overcomes many of the deficiencies of VSM. Formalized answer extraction focuses on finding patterns to formulate a natural answer to a question. Answer ranking is a typical two-category problem, and logistic regression has proved in many fields to be a valid tool for two-category problems. However, the data used for training are severely imbalanced, so traditional logistic regression cannot estimate the parameters well. We have therefore improved the traditional algorithm by means of stratified sampling, so that it estimates better on imbalanced data with many features.

The rest of the article is organized as follows. We present the architecture of our system in the next section. Then we describe the system similarity model and passage retrieval in section 3. Formalized answer extraction based on pattern learning is introduced in section 4. Answer ranking based on stratified sampling logistic regression is introduced in section 5. Finally, experiments and conclusions are described in sections 6 and 7.

2. Brief description of system

Our system can be divided into three main components. The architecture of the system is shown in Fig. 1. Query preprocessing comprises keyword extraction and expansion, and answer type prediction. The IR component consists of document retrieval, Web retrieval and passage retrieval. Answer extraction includes named entity recognition, shallow parsing, answer ranking, and answer selection.
Question sentences are converted into a series of keywords and their expansions using a stop-word list and WordNet[8]. It has been apparent for a long time that the most common words in English provide no benefit to searching for a topic[9]. We therefore discard the stop words in question sentences and take the remaining words as keywords. The keyword information contained in the original question is often insufficient, so we use WordNet to add synonyms of the keywords. Since query preprocessing is not the emphasis of this article, and for reasons of space, we give only a brief introduction here; the details of the system can be found in our companion paper[10].

Fig. 1 Architecture structure of system

3. Passage retrieval with system similarity model

Since our aim is to retrieve the exact answer to a natural language question from a large-scale (3 GB) document collection, whole documents are too large a unit to retrieve. Documents are therefore segmented into a set of passages based on surface clues such as punctuation symbols. We define passages as overlapping sets consisting of a sentence and its two immediate neighbors. The main task of this step is to calculate similarity degrees in order to further narrow the scope of extraction. To calculate the similarity degree we use a novel information retrieval model, the System Similarity Model (SSM), which can be considered an extension of the vector space model (VSM)[11]. It overcomes many of the deficiencies of the vector space model (VSM)[12].
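The keyword step and the passage definition above (a sentence plus its two immediate neighbors) can be illustrated with a minimal sketch. It assumes that the common words are the ones discarded; the tiny `STOP_WORDS` set, the `synonyms` dictionary standing in for a WordNet lookup, and all function names are hypothetical and are not taken from the Insun05QA system.

```python
import re

# Tiny illustrative stop-word list (assumption); a real list is far longer.
STOP_WORDS = {"a", "an", "the", "of", "in", "is", "was", "who", "what", "to"}

def extract_keywords(question, synonyms=None):
    """Drop stop words, then append known synonyms of the remaining words."""
    synonyms = synonyms or {}  # stand-in for a thesaurus lookup (assumption)
    words = re.findall(r"[a-z0-9]+", question.lower())
    keywords = [w for w in words if w not in STOP_WORDS]
    expanded = list(keywords)
    for w in keywords:
        for s in synonyms.get(w, []):
            if s not in expanded:
                expanded.append(s)
    return expanded

def split_sentences(text):
    """Split after sentence-final punctuation (a rough surface clue)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def make_passages(sentences):
    """Passage i = sentence i plus its immediate neighbours, where they exist."""
    passages = []
    for i in range(len(sentences)):
        lo = max(0, i - 1)
        hi = min(len(sentences), i + 2)
        passages.append((i, sentences[lo:hi]))
    return passages

keywords = extract_keywords("Who invented the paper clip?",
                            {"invented": ["devised"]})
passages = make_passages(split_sentences("A b. C d! E f? G h."))
```

Each passage keeps the index of its central sentence, so a score computed for that sentence can be attached to the whole passage.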
The score for passage $i$ is the score of its central sentence $i$, and the scores of the central sentences are calculated with the System Similarity Model (SSM).

Given a system $A = \{\alpha_1, \alpha_2, \ldots, \alpha_m\}$ with $|A| = m$ and a system $B = \{b_1, b_2, \ldots, b_n\}$ with $|B| = n$, let $x_i > 0$ denote the importance of component $\alpha_i$ $(1 \le i \le m)$ and let $y_j > 0$ denote the importance of component $b_j$ $(1 \le j \le n)$. Assume that the number of most similar binary tuples (MSBTs) is $p$ $(p \le \min\{m, n\})$, denoted by $s_1, s_2, \ldots, s_p \in A \times B$, with similarity degrees $\mu_i$ $(1 \le i \le p)$ respectively. The system similarity degree is $Q(A, B)$, then:

2(,)pi ixQ A B μ=∑ (1)

Here, $A$ denotes the user query and $\alpha_1, \alpha_2, \ldots, \alpha_m$ are the keywords of the query. In the same way, $B$ denotes each sentence of the document set or page set retrieved in the prior step, and $b_1, b_2, \ldots, b_n$ are the keywords of the sentence. $x_i$ and $y_j$ are computed by means of variations of the standard TF-IDF term weighting scheme[13]. The similarity degrees $\mu_i$ are computed with WordNet. The top 50 passages are passed on as input to the answer extraction component.

To compute the relevance score between a user query and a passage, SSM uses the system similarity function (1) to compute the system similarity between the query vector and the passage, and returns the result. To describe it in more detail, we present a system similarity-computing algorithm in Fig. 2.

The algorithm is recursive. It takes three input parameters: the thesaurus, and the set representations of system $A$ and system $B$. The thesaurus is a global system resource containing all elements with their importance and the similarity degree between any two elements. System $A$ and system $B$ are the two systems under computation, already represented as sets, for which the importance of components at all levels has been determined.

4. Formalized answer extraction based on pattern learning

In recent years, the combination of web growth, improvements in information technology, and the explosive demand for better information access has increased the interest in QA systems.
Unlike most QA systems, which face the problem of how to find short and correct answers to open-domain questions by searching a large collection of documents, this project focuses on finding patterns to formulate a “complete” and “natural” answer to a question, given the short answer. Finding such patterns is important because they can be used to enhance existing QA systems so that answers are provided to the user in a more “natural way”[14].

Our method extracts answers directly by matching question patterns against answer patterns automatically; we try to formulate all possible answer patterns derived from the question in order to provide complete and natural answers. The method obtains formalized patterns automatically by using a machine learning strategy, which avoids the heavy workload and low coverage of manual methods. The process consists of two independent operations, pattern learning and pattern testing.

Pattern Learning:
(1) Query Retrieval: for the training questions, input the key words and answers of the questions to an information retrieval system (such as Google, Ask Jeeves, etc.) and extract the snippets that contain the answers.
(2) Corpus Construction: tag Query_L and Snippets_L by means of a tagging tool, and generate the training corpus.
(3) Pattern Extraction: generalize the training corpus, and generate the question pattern set and the answer pattern set.

Pattern Testing:
(1) Query Retrieval: for the test questions, input the key words and answer types of the questions to an information retrieval system (such as Google, Ask Jeeves, etc.) and extract the snippets that contain candidates of the answer type.
(2) Pattern Tagging: tag Query_T and Snippets_T by means of a tagging tool, and generate the testing patterns.
(3) Answer Extraction: match the testing patterns against the (Question & Answer) pattern set, and extract the answer string.

Fig. 2 Processing of pattern learning and testing

5. Answers ranking based on stratified sampling logistic regression model

The logistic regression model (LRM) is a standard and effective method of statistical analysis for two-category regression problems. It is applied extensively in fields such as economics[15], sociology[16] and medicine, but less often in the field of information processing. Recently, some researchers have begun to take an interest in it; for example, XU et al. utilized logistic regression to calculate the similarity of text units[17].

Logistic regression is a nonlinear model, so its parameters are generally estimated by maximum likelihood. It has been proved that the maximum-likelihood estimates of logistic regression are consistent, asymptotically efficient and asymptotically normal[18]. Maximum-likelihood estimation methods have a number of attractive attributes. First, they nearly always have good convergence properties as the number of training samples increases. Furthermore, maximum-likelihood estimation can often be simpler than alternative methods, such as Bayesian techniques.

Answer extraction is a typical two-category problem, because a candidate answer can only be in one of two situations: it either is an answer or is not. This kind of problem is therefore well suited to analysis by logistic regression. In practice, however, positive instances (correct answers) are far fewer than negative instances (interference answers), which leads to serious data sparseness. In this case, directly adopting maximum-likelihood estimation results in biased estimates of the model parameters and probabilities.
This paper puts forward a method of parameter estimation that can reduce the estimation bias.

5.1 Binary logistic regression: model and parameter estimation

In logistic regression, a single outcome variable $Y_i$ ($i = 1, \ldots, n$) follows a Bernoulli probability function that takes on the value 1 with probability $P_i$ and 0 with probability $1 - P_i$. $P_i / (1 - P_i)$ is referred to as the odds of an event occurring. Then $P_i$ varies over the observations as an inverse logistic function of a vector $X_i$, which includes a constant and $K$ explanatory variables:

$$Y_i \sim \mathrm{Bernoulli}(Y_i \mid P_i) \qquad (2)$$

$$\ln(\mathrm{odds}) = \ln\frac{P(Y_i = 1)}{1 - P(Y_i = 1)} = \alpha_0 + \sum_{k=1}^{K} \beta_k X_{ik} \qquad (3)$$

The above is referred to as the log odds and also the logit. By taking the antilog of both sides, the model can also be expressed in odds rather than log odds, i.e.

$$\mathrm{odds} = \frac{P(Y_i = 1)}{1 - P(Y_i = 1)} = \exp\Big(\alpha_0 + \sum_{k=1}^{K} \beta_k X_{ik}\Big) \qquad (4)$$

$$= e^{\alpha_0 + \sum_{k=1}^{K} \beta_k X_{ik}} = e^{\alpha_0} * \prod_{k=1}^{K} e^{\beta_k X_{ik}} = e^{\alpha_0} * \prod_{k=1}^{K} \big(e^{\beta_k}\big)^{X_{ik}} \qquad (5)$$

As Aldrich and Nelson note, there are several alternatives to the LRM that might be just as plausible or more plausible in a particular case. However,
(1) The LRM is comparatively easy from a computational standpoint.
(2) There are many tools available which can estimate logistic regression models.
(3) The LRM tends to work fairly well in practice.

Note that, if we know either the odds or the log odds, it is easy to figure out the corresponding probability:

$$P_i = \frac{\mathrm{odds}}{1 + \mathrm{odds}} = \frac{\exp(\alpha_0 + X_i \beta')}{1 + \exp(\alpha_0 + X_i \beta')} \qquad (6)$$

The unknown parameter $\alpha_0$ is a scalar constant term and $\beta'$ is a $k \times 1$ vector with elements corresponding to the explanatory variables. The parameters of the model are estimated by maximum likelihood. The coefficients that make our observed results most “likely” are selected.
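
As a small illustration of formulas (3) to (6), the following Python sketch converts between log odds, odds and probability for a single observation. The coefficient values and the feature vector are hypothetical and chosen only for illustration; they are not estimates from this paper.

```python
import math


def log_odds(alpha0, beta, x):
    """Formula (3): the logit is alpha0 plus the weighted sum of the features."""
    return alpha0 + sum(b * v for b, v in zip(beta, x))


def odds(alpha0, beta, x):
    """Formula (4): the odds are the antilog of the log odds."""
    return math.exp(log_odds(alpha0, beta, x))


def odds_product(alpha0, beta, x):
    """Formula (5): the same odds written as a product of per-feature factors."""
    result = math.exp(alpha0)
    for b, v in zip(beta, x):
        result *= math.exp(b) ** v
    return result


def probability(alpha0, beta, x):
    """Formula (6): P = odds / (1 + odds)."""
    o = odds(alpha0, beta, x)
    return o / (1.0 + o)


# Hypothetical constant term, coefficients and explanatory variables.
alpha0 = -1.0
beta = [0.8, 0.5]
x = [1.0, 2.0]

print("log odds   :", log_odds(alpha0, beta, x))
print("odds       :", odds(alpha0, beta, x))
print("odds (prod):", odds_product(alpha0, beta, x))
print("probability:", probability(alpha0, beta, x))
```

The sum form (4) and the product form (5) give the same odds up to floating-point rounding, which is why each factor $e^{\beta_k}$ can be read as the multiplicative change in the odds per unit increase of the k-th explanatory variable.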

The likelihood function is formed by assuming independence over the observations:

$$L(\alpha_0, \beta) = \prod_{i=1}^{n} P_{x_i}^{Y_i} \, (1 - P_{x_i})^{1 - Y_i} \qquad (7)$$

For a random sample $(x_i, y_i)$, $i = 1, 2, \ldots, n$, taking logs and using formula (3), the log-likelihood simplifies to

$$\ln L(\alpha_0, \beta) = \sum_{i=1}^{n} \Big[ y_i (\alpha_0 + x_i \beta') - \ln\big(1 + \exp(\alpha_0 + x_i \beta')\big) \Big] \qquad (8)$$

The estimators of the unknown parameters $\alpha_0$ and $\beta'$ can be obtained from the following equations by means of maximum-likelihood estimation:

$$\begin{cases} \dfrac{\partial \ln L(\alpha_0, \beta)}{\partial \alpha_0} = \displaystyle\sum_{i=1}^{n} \left[ y_i - \dfrac{\exp(\alpha_0 + x_i \beta')}{1 + \exp(\alpha_0 + x_i \beta')} \right] = 0 \\[2ex] \dfrac{\partial \ln L(\alpha_0, \beta)}{\partial \beta_j} = \displaystyle\sum_{i=1}^{n} \left[ y_i - \dfrac{\exp(\alpha_0 + x_i \beta')}{1 + \exp(\alpha_0 + x_i \beta')} \right] x_{ij} = 0 \end{cases} \qquad j = 1, 2, 3, \ldots, m. \qquad (9)$$

5.2 Stratified sampling logistic regression in imbalanced data

In actual applications, there is often a large gap between the numbers of positive and negative instances: the positive instances are far fewer than the negative instances, so such data have a serious data sparseness problem. If we adopt general logistic regression to estimate parameters on such data, the results are usually poor or even wrong. Therefore, we use stratified sampling to take full advantage of the positive instances. The concrete process is: randomly extract some examples from the positive instances and from the negative instances, and merge them into the training sample for parameter estimation.

Under stratified sampling, the sample distribution is not identical to the population distribution. Then, even when $x$ is known, the probability of observing $y = 1$ is not equal to $P_x$, and likewise the probability of observing $y = 0$ is not equal to $1 - P_x$. In other words, the conditional probability of a sample observation $y = k$ cannot be expressed by formula (7), and formula (9) does not follow directly.

Assume that the population contains $P_0 N$ positive instances and $(1 - P_0) N$ negative instances, and that the number of positive instances with independent variable $x$ divided by the total number of positive instances is $\gamma_x$; then the number of positive instances with independent variable $x$ is $\gamma_x P_0 N$.
We assume that the number of negative instances with independent variable $x$ is $\kappa_x$, namely,

$$P_x = \frac{\gamma_x P_0 N}{\gamma_x P_0 N + \kappa_x} \qquad (10)$$

Then $\kappa_x = \gamma_x P_0 N (1 - P_x) / P_x$, and the number of negative instances with independent variable $x$ divided by the total number of negative instances is $\lambda_x$:

$$\lambda_x = \frac{\gamma_x P_0 (1 - P_x)}{P_x (1 - P_0)} \qquad (11)$$

Adopting the method of stratified sampling, we randomly extract $r_1$ positive instances and $r_2$ negative instances as the sample. The probabilities of the observed values $y = 1$ and $y = 0$ are, respectively:

$$\frac{r_1 \gamma_x}{r_1 \gamma_x + r_2 \lambda_x} = \frac{r_1 (1 - P_0) P_x}{r_1 (1 - P_0) P_x + r_2 P_0 (1 - P_x)} \qquad (12)$$

$$\frac{r_2 \lambda_x}{r_1 \gamma_x + r_2 \lambda_x} = \frac{r_2 P_0 (1 - P_x)}{r_1 (1 - P_0) P_x + r_2 P_0 (1 - P_x)} \qquad (13)$$

Let $\omega_0 = P_0 N / \big((1 - P_0) N\big)$ and $\omega_1 = r_1 / r_2$; that is, $\omega_0$ is the ratio of positive instances to negative instances in the population, and $\omega_1$ is the ratio of positive instances to negative instances in the sample. For a stratified sample $(x_i, y_i)$, $i = 1, 2, \ldots, n$, the log-likelihood function is:

$$\begin{aligned} \ln L(\alpha_0, \beta) &= \sum_{i=1}^{n} \left\{ y_i \ln\frac{\omega_1 P_{x_i}}{\omega_1 P_{x_i} + \omega_0 (1 - P_{x_i})} + (1 - y_i) \ln\frac{\omega_0 (1 - P_{x_i})}{\omega_1 P_{x_i} + \omega_0 (1 - P_{x_i})} \right\} \\ &= \sum_{i=1}^{n} y_i \ln\frac{\omega_1}{\omega_0} + \sum_{i=1}^{n} \left\{ y_i \ln\frac{P_{x_i}}{1 - P_{x_i}} - \ln\left[ 1 + \frac{\omega_1}{\omega_0} \cdot \frac{P_{x_i}}{1 - P_{x_i}} \right] \right\} \end{aligned} \qquad (14)$$

Using formula (3), the log-likelihood simplifies to

$$\ln L(\alpha_0, \beta) = \Omega + \sum_{i=1}^{n} \Big\{ y_i (\alpha_0 + x_i \beta') - \ln\big[ 1 + \exp(\alpha_0 + \omega + x_i \beta') \big] \Big\} \qquad (15)$$

Here, $\Omega = \sum_{i=1}^{n} y_i \omega$ and $\omega = \ln(\omega_1 / \omega_0)$ do not depend on the parameters to be estimated. If we let $\alpha_1 = \alpha_0 + \omega$, then the estimators of the unknown parameters $\alpha_1$ and $\beta'$ can be obtained from the following equations by means of maximum-likelihood estimation:

$$\begin{cases} \dfrac{\partial \ln L(\alpha_1, \beta)}{\partial \alpha_1} = \displaystyle\sum_{i=1}^{n} \left[ y_i - \dfrac{\exp(\alpha_1 + x_i \beta')}{1 + \exp(\alpha_1 + x_i \beta')} \right] = 0 \\[2ex] \dfrac{\partial \ln L(\alpha_1, \beta)}{\partial \beta_j} = \displaystyle\sum_{i=1}^{n} \left[ y_i - \dfrac{\exp(\alpha_1 + x_i \beta')}{1 + \exp(\alpha_1 + x_i \beta')} \right] x_{ij} = 0 \end{cases} \qquad j = 1, 2, 3, \ldots, m. \qquad (16)$$

Formula (16) is the parameter estimation formula of the stratified sampling logistic regression model. Under random sampling, the sample distribution is identical to the population distribution, so $\omega_1 = \omega_0$, hence $\omega = 0$ and $\alpha_1 = \alpha_0$, and formula (16) reduces to formula (9). Therefore, formula (16) can be considered an extension of formula (9) to the condition of stratified sampling.

6. Experiments and evaluation

6.1 The result and analysis of passage retrieval

We use the traditional information retrieval system SMART as the information extraction component, which retrieves the documents related to the question from the TREC data sets. Obviously, the precision of document retrieval determines the upper limit of our system. Fig. 3 shows how the precision of the IR part varies as the number of returned documents increases. Precision increases with the number of documents, but it rises very slowly once more than 50 documents are returned: from 50 to 70 documents it rises by only 3%. This shows that, at such document counts, increasing the number of documents contributes little to precision, while a larger document count has a negative influence on the precision of the follow-up steps, so we set the number of returned documents to 70. The precision of document retrieval is 59%. All remaining processing is done on the basis of this precision, which is not ideal and is restricted by the ability of the SMART system. Developing our own high-level IR system is therefore one of the main research directions of our future work.

The next step is passage retrieval based on the similarity degree. Some loss of precision at each step is unavoidable. Precision generally drops by about 6 percentage points in the passage retrieval step, which is an acceptable loss. This indicates that the System Similarity Model is effective for calculating the similarity degree.
Ultimately, the top 50 passages are passed on as input to the answer extraction component.

Fig. 3 Variation of IR precision as the number of returned documents increases

6.2 The result of formalized answer extraction

The evaluation of formalized answer extraction on the test set of questions resulted in only 80.8% grammatically correct answer formulations. The last column of Table 1 shows, for each type of question, the percentage of matched questions that were formulated in a grammatically correct way.

Table 1 Result of formalized answer extraction for each type of question

# Questions   Type of Question   # Matched Questions   % Matched   # Correct   Accuracy
100           People             64                    64%         51          79.7%
100           Location           67                    67%         56          83.6%
100           Organ.             54                    54%         42          77.8%
100           Date               72                    72%         65          90.3%
100           Other              29                    29%         17          58.6%
500           All                286                   57.2%       231/286     80.8%
                                                                   (231/500)   (46.2%)

6.3 The result and analysis of answer ranking

The answer ranking component is a nonlinear classifier using a set of ten features whose weights are learned by a machine-learning algorithm employing stratified sampling logistic regression. The features used include: the number of keywords and their expansions, the number of different keywords and their expansions, the number of named entities in the passage, formalization, subject or object, the average distance in words between the beginning of the candidate and the keywords and their expansions that also appear in the passage, and so on.

The training data set is selected from the TREC 8 to TREC 2004 factoid question sets. We adopted 550 questions that have correct answers, together with their corresponding passage sets, as the training data set to estimate the parameters by stratified sampling logistic regression. We devised several experiments and evaluated the effect of the different schemes.

From the results in Fig. 4, we can see that Stratified Sampling Logistic Regression (SLR-AE) achieves a satisfactory result. The result of formalized answer extraction (FAE) is better than those of C4.5 and Maximum Entropy (ME).
Because we adopt stratified sampling logistic regression to estimate the parameters on imbalanced data, the effect is clearly improved. This shows that the method of stratified sampling logistic regression is effective for the problem of imbalanced data.

Fig. 4 Answer extraction accuracy of different schemes

6.4 Evaluation of TREC2005 QA track

For the QA Track of TREC 2005, we developed the Q/A system Insun05QA, which adopts the system similarity model and the stratified sampling logistic regression model as the core components of passage retrieval and answer ranking. It submitted answers to three types of questions: factoid questions, list questions and other questions. The evaluation results are given in Table 2.

Table 2 Performance of Insun05QA1 in TREC 2005

TREC 2005 QA Track                              Insun05QA1
Average per-series score                        0.187
Factoid questions   Number of correct           106
                    Number of unsupported       15
                    Number of inexact           16
                    Number of wrong             225
                    Accuracy                    0.293 (median accuracy score 0.153)
                    Precision of NIL            0.057
                    Recall of NIL               0.176

Among seventy-one participants, our system ranks fifth for factoid questions, seventh for list questions, and eighth in synthetic average score [19].

Fig. 5 Performance of Insun05QA1 in factoid questions

7. Conclusions

We have described some core technologies of a Q/A system for automatic answer extraction from large-scale text collections in response to open-domain, natural language questions. The main contribution of this paper is to adopt the System Similarity Model, pattern learning and the stratified sampling logistic regression model as core technologies for passage retrieval, answer extraction and answer ranking respectively. Evaluations indicate that answer extraction is effective and that the method clearly improves the overall performance of the system. At the QA Track of TREC 2005, our system ranks fifth for factoid questions and seventh for list questions among seventy-one participants.
Although some other auxiliary technologies also improve the final system, the satisfactory performance of the core methods plays the crucial role.

There are several possible areas for future work. We may be able to improve performance through more up-to-date machine learning methods and more sophisticated use of NLP techniques. In particular, the semantic and syntactic information of texts may provide significant help. Another direction of our future work is to find new and effective methods to solve the problem of serious data sparseness. Developing a high-level IR system is also an important factor in the development of an excellent Q/A system.

References:
[1] Voorhees, E. Overview of the TREC 2004 question answering track. The Thirteenth Text Retrieval Conference, 2004.
[2] John O’Connor. Retrieval of answer sentences and answer-figures from papers by text searching. Information Processing & Management, 1975, 11(5/7): 155-164.
[3] Abney, S. and Collins, M. Answer extraction. In Proceedings of the Applied Natural Language Processing Conference (ANLP-2000). Seattle, Washington, 2000: 296-301.
[4] Diego Molla and Ben Hutchinson. Dependency-Based Semantic Interpretation for Answer Extraction.
[5] Marius A. Pasca and Sandra M. Harabagiu. High performance question/answering. Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. New Orleans, Louisiana, United States, 2001: 366-374.
[6] Marius A. Pasca. High-Performance, Open-Domain Question Answering from Large Text Collections (Ph.D. Thesis). Southern Methodist University, 2001.
[7] Abraham Ittycheriah. Trainable Question Answering Systems (Ph.D. Thesis). The State University of New Jersey, 2001.
[8] Miller, G. WordNet: An on-line lexical database. International Journal of Lexicography, 1991: 235-312.
[9] H.P. Luhn. A statistical approach to mechanized encoding and searching of literary information. IBM Journal of Research and Development, 1957, 1(4): 309-317.
[10] LI Peng, WANG Xiao-long and GUAN Yi. Extracting answers to natural language questions from large-scale corpus. Proceedings of 2005 IEEE International Conference on Natural Language Processing and Knowledge Engineering (IEEE NLP-KE’05). Wuhan, China, 2005: 690-694.
[11] GUAN Yi, WANG Xiao-long, WANG Qiang. Measurement of system similarity. JSCL-2005. Nanjing, 2005.
[12] S.K.M. Wong, W. Ziarko and P.C.N. Wong. Generalized vector space model in information retrieval. ACM SIGIR, 1985: 18-25.
[13] G. Salton. Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer. Addison-Wesley, 1989.
[14] Glenda Anaya. ANSFORM: Answer Formulation for Question-Answering (Ph.D. Thesis). Concordia University of Canada, 2002.
[15] LIANG Qi. Distress prediction: Application of the PCA in logistic regression. Journal of Industrial Engineering and Engineering Management, 2005, 19(1): 100-104.
[16] Gary King, Michael Tomz and ZENG Lang-che. ReLogit: Rare events logistic regression. Journal of Statistical Software, 2003, 8.
[17] XU Yong-dong, XU Zhi-ming, WANG Xiao-Long. Using multiple features and statistical model to calculate text units similarity. In Proceedings of the Fourth International Conference on Machine Learning and Cybernetics (ICMLC 2005), Guangzhou, China, 2005.
[18] Richard O. Duda, Peter E. Hart, and David G. Stork. Pattern Classification, Second Edition. John Wiley & Sons, Inc., 2001: 84-113.
[19] Ellen M. Voorhees, Hoa Trang Dang. Overview of the TREC 2005 question answering track. TREC 2005, 2005.

(Edited by Rachel, Yunflyer)