Language Testing 3
Key Points of Language Testing Theory in Applied Linguistics

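In applied linguistics, language testing theory is an important branch. It plays a key role in assessing learners' language ability, guiding teaching practice, and advancing language education.
The main points of language testing theory are summarized below.
I. Definition and purposes of language testing
Language testing is a means of measuring and evaluating learners' language ability.
Its main purposes are:
1. To inform educational decisions, for example whether a student is promoted, held back, or allowed to graduate.
2. To evaluate teaching, helping teachers judge how effective their methods are and how students are progressing.
3. To give students feedback on their language level and weaknesses so that they can improve their learning strategies.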
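II. Types of language test
1. Proficiency Test: measures a candidate's overall command of a language, regardless of previous learning experience or the content of any particular course.
Common proficiency tests include IELTS and TOEFL.
2. Achievement Test: checks the language knowledge and skills a candidate has acquired in a particular course or stage of study, and is closely tied to what was taught.
Examples are school end-of-term examinations and unit quizzes.
3. Diagnostic Test: identifies the specific problems and weak points in a candidate's language learning, so that later teaching and learning can be targeted.
4. Aptitude Test: predicts a candidate's potential for learning a language rather than assessing the current level.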
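III. Quality criteria for language tests
1. Validity: the degree to which a test accurately measures the language ability or knowledge it is meant to measure.
Validity is divided into content validity, construct validity, predictive validity, and so on.
Content validity: whether the test content covers the language skills and knowledge points to be examined.
Construct validity: whether the test results are consistent with the theoretical structure of language ability.
Predictive validity: whether test scores effectively predict a candidate's later performance in language learning or real language use.
2. Reliability: the stability and consistency of test results.
It includes test-retest reliability, parallel-forms reliability, split-half reliability, and so on.
Test-retest reliability: the correlation between two sets of results when the same group of candidates takes the same test at different times.
Parallel-forms reliability: the correlation between two sets of results when the same group of candidates takes two papers that are similar in content but not identical.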
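Both test-retest and parallel-forms reliability are defined above as a correlation between two sets of scores. A minimal sketch of that calculation follows, assuming the Pearson correlation is used as the coefficient; the score lists are hypothetical and the function name is illustrative only.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)

# Hypothetical scores of five candidates on two sittings of the same test.
first_sitting = [62, 75, 81, 58, 90]
second_sitting = [65, 73, 84, 60, 88]

retest_reliability = pearson_r(first_sitting, second_sitting)
print(round(retest_reliability, 3))
```

For parallel-forms reliability the same function would be applied to the scores on the two papers instead of the two sittings.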
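Beijing Language and Culture University, foreign linguistics and applied linguistics postgraduate entrance exam: Linguistics, Chapter 3 test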

Part I. Fill in the blanks or answer the questions as required. (50 points)
1. ______ modify the meaning of the stem, but usually do not change the part of speech of the original word.
2. When two different forms, e.g. bat and bit, are identical in every way except for one sound segment, which occurs in the same place in the strings, the two sound combinations are said to form a ______.
3. ______ is a relatively complex form of compounding, in which two words are blended by joining the initial part of the first word and the final part of the second word, or by joining the initial parts of the two words.
4. Categorize the following ten morphemes into inflectional and derivational. (1*10)
(1) -able (2) -ed (past) (3) -ness (4) -ize (5) -ing (6) -s (plural) (7) -lish (8) -ly (9) -ate (10) -s (3rd person singular)
5. How many morphemes does each have? (1*5)
(1) desirability (2) tobacco (3) nationalization (4) typewriter (5) inhabitant
(1) ______ (2) ______ (3) ______ (4) ______ (5) ______
6. ______ are added to an existing form to create a word, which is a very common way in English.
7. ______ is the abstract unit which refers to the smallest unit in the meaning system that can be distinguished from other smaller units and that appears in different grammatical contexts.
8. The word "Laser" is formed by the process of ______.
9. ______ are often thought to be the smallest meaningful units of language by the linguists.
10. ______ are independent units of meaning and can be used freely all by themselves.
11. ______ is a process of forming a new word by combining parts of two words.
12. The rules that govern which affix can be added to what type of stem to form a new word are called ______.
13. Loan translation is also called ______, which may be a word, a phrase, or even a short sentence.
14. ______ suggested that word is the minimum free form.
15. Lexical words carry the main content of a language while grammatical ones serve to link together content parts, so lexical words are also known as content words and grammatical ones as ______.
16. In traditional grammar, ______ is the only word class which can function as a substitute for another item.
17. ______ is the base form of a word that cannot be further analyzed without destroying its meaning.
18. ______ serve to produce different forms of a single lexeme.
19. The head of a nominal or adjective endocentric compound is derived from a ______.
20. Derivation shows a relationship between ______ and ______.
21. ______ is the smallest component of meaning.
22. ______ refers to those words which are used before the noun acting as head of a nominal group, and which determine the kind of reference the nominal group has.
23. Infinitive marker, negative marker and the subordinate units in phrasal verbs belong to the ______ of word class.
24. Word-formation refers to the process of how words are formed, and it can be further divided into two types: derivational and ______.
25. ______ studies the interrelationship between phonology and morphology.
26. All the allomorphs should be in ______.
27. All monomorphemic words are free morphemes, and polymorphemic words which consist of wholly free morphemes are called ______.
28. ______ had been originally a performance error, which was overlooked and gradually accepted by the speech community.
29. Abbreviation refers to the word-formation that a new word is created by cutting some parts of a single word while ______ is made up from the first letters of an organization, etc.
30. ______ is a word-formation process by which a word is altered from one part of speech into another without the addition or deletion of any morpheme.
31. The three kinds of semantic changes are broadening, narrowing, and ______.
32. An allomorph is said to be ______ when its form is dependent on the neighboring phonemes, and it is said to be ______ when its form is arbitrary part of another vocabulary item.
33. ______ mainly refer to quantity including all, both, half, double, twice and other multiplier expressions.
34. What are the linguistic terms for the formation of the following words? (1*8)
a. hippo b. can't c. dawk
II. Terms (5*10)
1. Back-formation 2. Analogical Creation 3. Morphophoneme 4. Allomorph 5. Inflectional Morpheme 6. Morphological rules 7. Lexeme 8. Root 9. Compounding 10. Meaning shift
III. Essays (10*5)
1. Tell what you know about phonological changes.
2. Stem and root are important concepts in morphology. Give your understanding of the difference between stem and root.
3. Language changes through history. Do you know any language change process?
4. How many morphemes are there in morphological analysis? And how do they serve in words?
5. How do we classify words and in what kind of norms?
An Outline of Applied Linguistics, 3rd Edition, Chapter 3: Language Testing

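(3) A human being is a highly complex whole. Even the same test taker will show intellectual, physiological, and psychological differences from one test to the next under different external conditions, and these differences affect the test results.
Section 1: The Application of Experimental Methods in Linguistics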
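III. Reliability and validity of measurement
(1) Measuring reliability
2. Methods of estimating reliability: the test-retest method, the parallel-forms method, the split-half method
3. Rater reliability
Raters often make errors when marking subjective items (such as compositions and oral tests in language testing). This raises the question of rater reliability, which covers intra-rater reliability (the consistency of one rater) and inter-rater reliability (the consistency between raters).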
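Inter-rater reliability can be checked in a simple way by comparing the scores that two raters give independently to the same scripts. The sketch below is one illustrative approach, not a method prescribed by the source: it reports the proportion of exact agreement and flags the scripts whose two scores differ by more than a chosen tolerance. The function name, the tolerance, and the scores are all assumptions.

```python
def rater_agreement(scores_a, scores_b, tolerance=1):
    """Compare two raters' independent scores for the same scripts.

    Returns the exact-agreement rate and the indices of scripts whose
    two scores differ by more than `tolerance` points.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty score lists of equal length")
    pairs = list(zip(scores_a, scores_b))
    exact_rate = sum(1 for a, b in pairs if a == b) / len(pairs)
    flagged = [i for i, (a, b) in enumerate(pairs) if abs(a - b) > tolerance]
    return exact_rate, flagged

# Hypothetical composition scores (0-5 scale) from two raters.
rater_1 = [4, 3, 5, 2, 4, 3]
rater_2 = [4, 3, 3, 2, 5, 3]

exact_rate, to_review = rater_agreement(rater_1, rater_2)
print(exact_rate, to_review)
```

The flagged scripts are the ones a supervisor would look at again; intra-rater reliability could be checked the same way by comparing one rater's scores for the same scripts on two occasions.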
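II. The nature of language testing
(2) Information covered by language testing
Language testing has two sides: language and testing. It therefore has to take many theoretical and practical factors into account, and it pays close attention to three kinds of information:
1. Information on language skills
2. Information on language development
3. Information on language knowledge
Section 2: The Theoretical Framework of Modern Language Testing
Section 3: The Procedure for Producing Test Items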
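II. Scoring
(1) Scoring objective items: (1) manual scoring; (2) manual data entry plus machine scoring; (3) machine reading plus machine scoring.
Large-scale tests should use method (2) or (3) wherever possible.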
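Machine scoring of objective items amounts to comparing each recorded response with an answer key. A minimal sketch follows, assuming one point per item and a key of single-letter options; the key, the responses, and the function name are hypothetical.

```python
def score_objective(responses, answer_key):
    """Return the number of responses that match the answer key.

    Unanswered items are given as None and score zero.
    """
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer key must have the same length")
    return sum(
        1
        for response, key in zip(responses, answer_key)
        if response is not None and response.strip().upper() == key
    )

# Hypothetical four-item key and one candidate's entered responses.
answer_key = ["A", "C", "D", "B"]
responses = ["a", "C", None, "B"]

total = score_objective(responses, answer_key)
print(total)
```

Whether the responses are typed in by hand or read by a scanner only changes how the `responses` list is produced; the comparison with the key is the same.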
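(2) Scoring subjective items
Scoring of subjective items mainly concerns items that test productive skills and productive use.
1. Set a unified marking standard
2. Train raters
3. Improve the scoring procedure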
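[...] monitoring, and it also provides methods of experiment and investigation for language teaching and learning. The contribution of language testing to applied linguistics can be summed up in three points:
(1) It turns the theoretical framework of applied linguistics into practical use.
(2) It gives syllabus design and course planning clear goals and standards.
(3) It offers applied linguistics research a methodological model.
(From Section 2: The Theoretical Framework of Modern Language Testing)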
A Course in Linguistics 3: Test Questions and Answers

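I. Multiple choice (2 points each, 20 points in total)
1. What is the core object of study in linguistics? A. Language B. Literature C. History D. Philosophy
Answer: A
2. Which of the following is not a branch of linguistics? A. Phonetics B. Syntax C. Psychology D. Pragmatics
Answer: C
3. According to Saussure, what are the two basic elements of language? A. Sound and meaning B. Sign and meaning C. Grammar and vocabulary D. Langue and parole
Answer: D
4. What does the principle of arbitrariness refer to? A. The regularity of language B. The systematic nature of language C. There is no necessary connection between a linguistic sign and what it refers to D. There is a necessary connection between a linguistic sign and what it refers to
Answer: C
5. Which scholar proposed the theory of the hierarchical structure of language? A. Saussure B. Chomsky C. Bloomfield D. Derrida
Answer: B
6. Which of the following is not a communicative function of language? A. Conveying information B. Expressing emotion C. Commands and requests D. Appreciating art
Answer: D
7. What does synonymy in language refer to? A. Homophones B. Synonyms C. Antonyms D. Polysemous words
Answer: B
8. The evolution of language proceeds: A. From simple to complex B. From complex to simple C. From uniform to diverse D. From diverse to uniform
Answer: A
9. What do the communicative functions of language include? A. Conveying information B. Expressing emotion C. Commands and requests D. All of the above
Answer: D
10. In which areas do dialect differences mainly appear? A. Pronunciation B. Vocabulary C. Grammar D. All of the above
Answer: D
II. Fill in the blanks (1 point per blank, 10 points in total)
1. Linguistics is the science that studies ______.
Answer: language
2. The two basic functions of language are ______ and ______.
Answer: expressing thought; exchanging information
3. The ______ nature of language is a salient feature of linguistic signs.
Answer: arbitrary
4. The ______ nature of language determines the diversity of languages.
Answer: social
5. The ______ nature of language is the basis on which language can convey information.
Answer: structural
6. The ______ nature of language allows it to express complex thoughts.
Answer: creative
7. The ______ nature of language allows it to adapt to a constantly changing social environment.
Answer: dynamic
8. The ______ nature of language is an important topic of linguistic research.
Answer: systematic
9. The ______ nature of language is the basis on which language can be learned and used.
Answer: rule-governed
10. The ______ nature of language is the key to its adapting to different communicative situations.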
Language Testing Materials 3

Chapter 3: The Reliability of Testing
• The definition of reliability
• The reliability coefficient
• How to make tests more reliable

What is reliability?
Reliability refers to the trustworthiness and stability of candidates' test results. In other words, if a group of students were given the same test twice at different times, the more similar the two sets of scores, the more reliable the test is said to be.

How to establish the reliability of a test?
It is possible to quantify the reliability of a test in the form of a reliability coefficient. Reliability coefficients allow us to compare the reliability of different tests. The ideal reliability coefficient is 1.
• A test with a reliability coefficient of 1 is one which would give precisely the same results for a particular set of candidates regardless of when it happened to be administered.
• A test with a reliability coefficient of zero would give sets of results quite unconnected with each other.
It is between the two extremes of 1 and zero that genuine test reliability coefficients are to be found.

How high a coefficient should we expect for different types of language tests?
Lado says that good vocabulary, structure and reading tests are usually in the 0.9 to 0.99 range, while auditory comprehension tests are more often in the 0.8 to 0.89 range. A reliability coefficient of 0.85 might be considered high for an oral production test but low for a reading test.

Ways to establish the reliability of a test:
1. Test-retest method
Two sets of scores are needed for comparison. The most obvious way of obtaining these is to get a group of subjects to take the same test twice.
2. Split-half method
In this method, the subjects take the test in the usual way, but each subject is given two scores. One score is for one half of the test, the second score is for the other half. The two sets of scores are then used to obtain the reliability coefficient as if the whole test had been taken twice. In order for this method to work, it is necessary for the test to be split into two halves which are really equivalent, through the careful matching of items (in fact, where items in the test have been ordered in terms of difficulty, a split into odd-numbered items and even-numbered items may be adequate).
3. Parallel forms method (the alternate forms method)
Two different forms of the same test are given to the same group of students in immediate succession or within a very short time. However, alternate forms are often simply not available.

How to make tests more reliable
As we have seen, there are two components of test reliability: the performance of candidates from occasion to occasion, and the reliability of the scoring. Here we will begin by suggesting ways of achieving consistent performances from candidates and then turn our attention to scorer reliability.
1. Take enough samples of behavior
Other things being equal, the more items that you have on a test, the more reliable that test will be.
e.g. If we wanted to know how good an archer someone was, we wouldn't rely on the evidence of a single shot at the target. That one shot could be quite unrepresentative of their ability.
To be satisfied that we had a really reliable measure of the ability, we should want to see a large number of shots at the target.
The same is true for language testing. It has been demonstrated empirically that the addition of further items will make a test more reliable.
The additional items should be independent of each other and of existing items.
e.g. A reading test asks the question: "Where did the thief hide the jewels?" If an additional item following that took the form "What was unusual about the hiding place?", would it make a full contribution to an increase in the reliability of the test?
No. Why not? Because it is hardly possible for someone who got the original question wrong to get the supplementary question right. We do not get an additional sample of their behavior, so the reliability of our estimate of their ability is not increased.
Each additional item should as far as possible represent a fresh start for the candidate.
Do you think the longer a test is, the more reliability we will get?
It is important to make a test long enough to achieve satisfactory reliability, but it should not be made so long that the candidates become so bored or tired that the behavior they exhibit becomes unrepresentative of their ability.
2. Do not allow candidates too much freedom
In general, candidates should not be given a choice, and the range over which possible answers might vary should be restricted.
Compare the following writing tasks:
a) Write a composition on tourism.
b) Write a composition on tourism in this country.
c) Write a composition on how we might develop the tourist industry in this country.
d) Discuss the following measures intended to increase the number of foreign tourists coming to this country:
i) More/better advertising and/or information (where? what form should it take?)
ii) Improve facilities (hotels, transportation, communication etc.)
iii) Training of personnel (guides, hotel managers etc.)
The successive tasks impose more and more control over what is written. The fourth task is likely to be a much more reliable indicator of writing ability than the first.
But in restricting the students we must be careful not to distort too much the task that we really want to see them perform.
3. Write unambiguous items
It is essential that candidates should not be presented with items whose meaning is not clear or to which there is an acceptable answer which the test writer has not anticipated.
The best way to arrive at unambiguous items is, having drafted them, to subject them to the critical scrutiny of colleagues, who should try as hard as they can to find alternative interpretations to the ones intended.
4. Provide clear and explicit instructions
This applies both to written and oral instructions.
If it is possible for candidates to misinterpret what they are asked to do, then on some occasions some of them certainly will.
A common fault of tests written for the students of a particular teaching institution is the supposition that the students all know what is intended by carelessly worded instructions.
The frequency of the complaint that students are unintelligent, have been stupid, or have willfully misunderstood what they were asked to do reveals that the supposition is often unwarranted.
Test writers should not rely on the students' powers of telepathy to elicit the desired behavior.
The best means of avoiding problems is the use of colleagues to criticize drafts of instructions (including those which will be spoken).
Spoken instructions should always be read from a prepared text in order to avoid introducing confusion.
5. Ensure that tests are well laid out and perfectly legible
Too often, institutional tests are badly typed (or handwritten), have too much text in too small a space, and are poorly reproduced.
As a result, students are faced with additional tasks which are not ones meant to measure their language ability. Their variable performance on the unwanted tasks will lower the reliability of a test.
6. Candidates should be familiar with format and testing techniques
If any aspect of a test is unfamiliar to candidates, they are likely to perform less well than they would do otherwise. For this reason, every effort must be made to ensure that all candidates have the opportunity to learn just what will be required of them. This may mean the distribution of sample tests (or of past test papers), or at least the provision of practice materials in the case of tests set within teaching institutions.
7. Provide uniform and non-distracting conditions of administration
The greater the differences between one administration of a test and another, the greater the differences one can expect between a candidate's performance on the two occasions.
Great care should be taken to ensure uniformity.
e.g. Timing should be specified and strictly adhered to. The acoustic conditions should be similar for all administrations of a listening test. Every precaution should be taken to maintain a quiet setting with no distracting sounds or movements.

How to obtain scorer reliability
1. Use items that permit scoring which is as objective as possible
This may appear to be a recommendation to use multiple choice items, which permit completely objective scoring. This is not intended. While it would be mistaken to say that multiple choice items are never appropriate, it is certainly true that there are many circumstances in which they are quite inappropriate. What is more, good multiple choice items are notoriously difficult to write and always require extensive pretesting.
An alternative to multiple choice is the open-ended item which has a unique, possibly one-word, correct response which the candidates produce themselves.
This too should ensure objective scoring, but in fact problems with such matters as spelling which makes a candidate's meaning unclear often make demands on the scorer's judgment. The longer the required response, the greater the difficulties of this kind.
One way of dealing with this is to structure the candidate's response by providing part of it.
e.g. The open-ended question "What was different about the results?" may be designed to elicit the response "Success was closely associated with high motivation."
This is likely to cause problems for scoring. Greater scorer reliability will probably be achieved if the question is followed by:
_____ was more closely associated with _____.
2. Make comparisons between candidates as direct as possible
This reinforces the suggestion already made that candidates should not be given a choice of items and that they should be limited in the way that they are allowed to respond.
Scoring compositions that are all on one topic will be more reliable than scoring compositions where the candidates are allowed to choose from six topics, as has been the case in some well-known tests.
3. Provide a detailed scoring key
This should specify acceptable answers and assign points for partially correct responses. For high scorer reliability the key should be as detailed as possible in its assignment of points. It should be the outcome of efforts to anticipate all possible responses and should have been subjected to group criticism. (This advice applies only where responses can be classed as partially or totally "correct", not in the case of compositions, for instance.)
4. Train scorers
This is especially important where scoring is more subjective. The scoring of compositions, for example, should not be assigned to anyone who has not learned to score accurately compositions from past administrations. After each administration, patterns of scoring should be analyzed. Individuals whose scoring deviates markedly and inconsistently from the norm should not be used again.
5. Agree acceptable responses and appropriate scores at the outset of scoring
A sample of scripts should be taken immediately after the administration of the test. Where there are compositions, archetypical representatives of different levels of ability should be selected. Only when all scorers are agreed on the scores to be given to these should real scoring begin.
For short answer questions, the scorers should note any difficulties they have in assigning points (the key is unlikely to have anticipated every relevant response), and bring these to the attention of whoever is supervising that part of the scoring. Once a decision has been taken as to the points to be assigned, the supervisor should convey it to all the scorers concerned.
6. Identify candidates by number, not name
Scorers inevitably have expectations of candidates that they know. Except in purely objective testing, this will affect the way that they score. Studies have shown that even where the candidates are unknown to the scorers, the name on a script (or a photograph) will make a significant difference to the scores given.
e.g. A scorer may be influenced by the gender or nationality of a name into making predictions which can affect the score given. The identification of candidates only by number will reduce such effects.
7. Employ multiple, independent scoring
As a general rule, and certainly where testing is subjective, all scripts should be scored by at least two independent scorers. Neither scorer should know how the other has scored a test paper. Scores should be recorded on separate score sheets and passed to a third, senior, colleague, who compares the two sets of scores and investigates discrepancies.

Reliability and validity
This could well be a reliable test; but it is unlikely to be a valid test of writing.
In our efforts to make tests reliable, we must be wary of reducing their validity. This depends in part on what exactly we are trying to measure by setting the task. If we are interested in candidates' ability to structure a composition, then it would be hard to justify providing them with a structure in order to increase reliability. At the same time we would still try to restrict candidates in ways which would not render their performance on the task invalid.
There will always be some tension between reliability and validity. The tester has to balance gains in one against losses in the other.
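The split-half method and the effect of test length described in this chapter can be sketched in code. The chapter does not name a formula for treating the two half-scores "as if the whole test had been taken twice"; the sketch below assumes the Spearman-Brown formula, a standard choice for this step, and uses the odd/even split the chapter mentions. The item data and all function names are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by `length_factor`
    with comparable, independent items (Spearman-Brown formula)."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)

def split_half_reliability(item_scores):
    """item_scores holds one list of 0/1 item scores per candidate."""
    odd_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_half, even_half)
    # Each half is only half as long as the real test, so step up to full length.
    return spearman_brown(r_half, 2)

# Hypothetical results: six candidates, eight items ordered by difficulty.
item_scores = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 0, 0],
    [1, 1, 1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

full_test_reliability = split_half_reliability(item_scores)
doubled = spearman_brown(full_test_reliability, 2)
print(round(full_test_reliability, 3), round(doubled, 3))
```

The second figure illustrates the point made under "Take enough samples of behavior": under the formula's assumptions, doubling the number of independent items raises the predicted reliability, although the chapter's warning about overlong tests still applies.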
Summary of Key Points on Language Test Types

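There are many kinds of language test, such as written tests, oral tests, listening tests, and reading tests.
When a language test is given, the test method and marking criteria should be chosen to suit the purpose of the test.
Different test components examine different language skills and knowledge, such as vocabulary, grammar, listening, speaking, reading, and writing.
The key points of each are introduced in turn below.
I. Vocabulary
Vocabulary is a basic component of language and the foundation of language use.
In language tests, vocabulary testing usually covers word meaning, word class, word groups, phrases, and context.
Test takers need to know the spelling, pronunciation, usage, and collocation of words.
1. Word meaning: the basic sense of a word, and one of the main focuses of vocabulary testing.
Test takers need to know the basic meanings of words and understand the multiple senses and uses of common words.
2. Word class: an important property of a word, which determines its usage and collocation.
Test takers need to know words of every class and understand their roles and uses in the language.
3. Word groups and phrases: fixed combinations that are common in a language, and also a main focus of language tests.
Test takers need to know common word groups and phrases and understand their meanings and uses.
4. Context: an important basis for word use, which helps in understanding the meaning and usage of words.
Test takers need to use words in different contexts and understand their specific meanings and uses there.
II. Grammar
Grammar is the basic set of rules of a language; it determines the structure and usage of the language.
In language tests, grammar testing usually covers sentence structure, tense, voice, mood, word order, subject-verb agreement, the comparative and superlative of adjectives and adverbs, conjunctions, pronouns, and so on.
1. Sentence structure: one of the basic topics of grammar; the sentence is the basic unit of expression.
Test takers need to know the different types of sentence structure and understand how they are formed and used.
2. Tense: a grammatical form that shows when an action takes place, and also a main focus of language tests.
Test takers need to know how each tense is used and understand how the tenses differ and when each applies.
3. Voice: a grammatical form that shows the relationship between the subject and the predicate of a sentence, and also a main focus of language tests.
Test takers need to know how each voice is used and understand its role in the sentence and how the voices differ.
4. Mood: a grammatical form that shows the speaker's attitude and feeling, and also a main focus of language tests.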
Language Testing

II. Categories of language tests
By test purpose: Proficiency Test, Achievement Test, Scholastic Aptitude Test, Placement Test, Diagnostic Test
Direct Test, Indirect Test
By form of measurement: Discrete-point Test, Integrative Test
By score interpretation: Norm-referenced Test, Criterion-referenced Test
Description (statistical charts), interpretation (results and causes)
[Diagram; surviving labels: exam, course, completion]
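I. Functions of language testing
2. Research function
Research questions and hypotheses (Questions & Hypotheses)
Research objects and sampling (Objects & Sampling)
Research methods and procedures (Methods & Procedures): experimental design, measuring instruments, variables and their types, methods of analysis
Research results and discussion (Results & Discussions)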
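[Worked example on standard scores; the slide layout did not survive extraction. Recoverable values: means μ = 72, 50, 70, 3; standard deviations σ = 8, 2, 10, 1; standard scores for candidates A and B: -0.25 and -0.38, 2.5 and 1.5, 1.9 and 2.5, with sums 4.15 and 3.62; also listed: 1.15 and 0.62, F(z) for A = 0.75, F(z) for B = 0.47, and (1 - 0.47)/2 × 100 = 26.]
[Significance levels shown: α = 0.05, α = 0.01, α = 0.001]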
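The standard scores in this example follow the usual definition z = (x − μ)/σ, which puts scores from tests with different means and spreads on a common scale so that they can be added. A minimal sketch follows, using the first three (mean, standard deviation) pairs listed above. The raw scores are assumptions, chosen only because they reproduce the first candidate's standard scores of -0.25, 2.5 and 1.9; the original raw scores did not survive.

```python
def z_score(raw, mean, sd):
    """Standard score: distance of a raw score from the mean in SD units."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - mean) / sd

# (mean, standard deviation) of three tests, as listed in the example.
test_stats = [(72, 8), (50, 2), (70, 10)]
# Hypothetical raw scores of one candidate on those three tests.
raw_scores = [70, 55, 89]

z_scores = [z_score(raw, mean, sd) for raw, (mean, sd) in zip(raw_scores, test_stats)]
total_z = sum(z_scores)
print(z_scores, round(total_z, 2))
```

Adding raw scores would let the test with the largest spread dominate the total; adding standard scores gives each test equal weight.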
A Course in Linguistics 3: Test Questions and Answers

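I. Multiple choice (2 points each, 20 points in total)
1. What is the main object of study in linguistics? A. The historical development of language B. The structural system of language C. The social functions of language D. The geographical distribution of language
Answer: B
2. Which of the following is not a property of language? A. Arbitrariness B. Linearity C. Discreteness D. Continuity
Answer: D
3. What does phonetics mainly study? A. The grammatical structure of language B. The vocabulary system of language C. The regularities of pronunciation D. The written form of language
Answer: C
4. What is the object of study in grammar? A. The sound system of language B. The vocabulary system of language C. The grammatical structure of language D. The semantic content of language
Answer: C
5. What does pragmatics mainly study? A. The rules of pronunciation B. The rules of grammar C. The context in which language is used D. The rules of writing
Answer: C
6. What is the smallest meaningful unit of language? A. Phoneme B. Word C. Morpheme D. Sentence
Answer: C
7. Which of the following is a communicative function of language? A. Expressing thought B. Conveying information C. Entertainment D. Education and guidance
Answer: B
8. Which factors mainly influence language change? A. Social change B. Geographical isolation C. Cultural exchange D. All of the above
Answer: D
9. What are cognates? A. Words derived from the same root B. Words with similar meanings C. Words with the same form and meaning D. Words that differ in both form and meaning
Answer: A
10. Which of the following belongs to sociolinguistics? A. Sound change in language B. Vocabulary change in language C. The relationship between language and society D. Grammatical change in language
Answer: C
II. Fill in the blanks (2 points each, 20 points in total)
1. Linguistics is the science that studies ________.
Answer: human language
2. The arbitrariness of language means that there is no necessary connection between the ________ and the ________ of language.
Answer: form; meaning
3. The linearity of language means that language is ________ in time.
Answer: continuous
4. The discreteness of language means that the units of language are ________.
Answer: finite
5. Phonetics is the discipline that studies the regularities of ________ in human language.
Answer: pronunciation
6. Grammar is the discipline that studies the ________ and ________ of language.
Answer: structure; rules
7. Pragmatics is the discipline that studies how language is used in ________.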
proficiency test
Designed to measure people's ability in a language regardless of any training they may have had. Not syllabus-based, but with a specification of what test candidates have to be able to do for a particular purpose (e.g. TOEFL, PETS, IELTS).
Lado
3. It is more effective to test grammar, since it is limited, than situation, which is infinite. Hence, grammar items are designed without context.
4. Forms should be the points for language tests because two languages are different when transferred.
If the NR test is properly designed, the scores attained will typically be distributed in the shape of a "normal" bell-shaped curve.
The items or parts will be selected according to how adequately they represent these ability levels or content domains.
achievement test proficiency test aptitude test diagnostic test placement test
achievement test
Directly related to the content of the course, on condition that the teaching material serves the objectives of teaching. Based on the syllabus and given at the end of a term or section. Non-standardized, so scores are not comparable. In terms of the score, the majority should pass, for the test provides information on student progress. Otherwise problems will arise.
placement test
A type of proficiency test. Provides information which will help to place students at the stage of a program most appropriate to their abilities.
Norm-referenced test VS Criterion-referenced test
NR tests are designed and developed to maximize distinctions among individual test takers.
CR tests are designed to be representative of specified levels of the ability or domains of content.
3.4.1discrete point tests
The American linguist R. Lado summarized the opinions of structural linguists on language testing in Language Testing: The Construction and Use of Foreign Language Tests (1961).
Nitko 1971; Nitko 1984)
Very important!!!
It is not a matter of "picking the tallest among the short": every candidate who reaches the specified standard passes.
NR+CR?
Despite the differences, however, it is important to understand that these two frames of reference are not necessarily mutually exclusive (e.g. the Gaokao and the College English Test Bands 4 and 6). It is possible, for example, to develop NR score interpretations on the basis of appropriate groups of test takers.
III. Types of Language Tests
3.1 achievement test, proficiency test, aptitude test, diagnostic test and placement test 3.2 Norm-referenced test and criterion-referenced test 3.3 Subjective test and objective test 3.4 Discrete-point test , integrative test and communicative test
Information- No pass based
Ability/Sylla bus based
placement
rank
No pass
university incomparable
3.2 From interpretation of test score
Scores on a norm-referenced test are interpreted with reference to the performance of other individuals on the test. Criterion-referenced test scores are interpreted as indicators of a level of ability or degree of mastery of the content domain. It is important to point out that it is this level of ability or domain of content that constitutes the criterion, and not the setting of a cut-off score (分数线) for making decisions.
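The two frames of reference can be contrasted on the same data. The sketch below is illustrative only: the norm-referenced reading is expressed as a percentile rank within a reference group, and the criterion-referenced reading as the proportion of a specified content domain answered correctly. The function names and all data are assumptions.

```python
def percentile_rank(score, group_scores):
    """Norm-referenced reading: percentage of the reference group scoring below `score`."""
    if not group_scores:
        raise ValueError("reference group must not be empty")
    below = sum(1 for s in group_scores if s < score)
    return 100 * below / len(group_scores)

def domain_mastery(item_results):
    """Criterion-referenced reading: proportion of domain items answered correctly."""
    if not item_results:
        raise ValueError("need at least one item result")
    return sum(1 for correct in item_results if correct) / len(item_results)

# Hypothetical reference group and one candidate's results.
group_scores = [50, 60, 70, 80, 90]
candidate_score = 70
candidate_items = [True, True, False, True]  # items sampled from the content domain

print(percentile_rank(candidate_score, group_scores))
print(domain_mastery(candidate_items))
```

The first figure changes if the reference group changes; the second depends only on the candidate's own performance on the domain. Any cut-off applied to it for pass/fail decisions is a separate step and not the criterion itself.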
3. 1 Summary
kinds
achievement purpose Assess Content x Syllabus based grading Percent(60) pass Test paper by Use of grade university incomparable
3.3 From ways of grading
Subjective test: The whole examination is essay-type. Assessment of the examinee's work is "subjective" in the sense that its merit has to be evaluated or judged by the examiner, e.g. discuss, compare, contrast, describe.
Objective test: The assessment of the merit of the examinee's work is "objective" in the sense that no evaluative judgment is needed on the part of the examiner.
Mean (average):50; standard deviation:10
Critics: level & content,
The necessary condition for the development of a CR test is the specification of a level of ability or domain of content (Glaser 1963; Glaser &
(e.g.GRE, SAT)
diagnostic test
A type of achievement test, used not for information on what has been learned but to identify students' weaknesses. Based on the syllabus. Examples: class quiz, unit test.
aptitude test
A test of potential ability. Based not on what has been learned but on various domains such as analysis, judgment, and sensitivity.
Importance
An understanding of the different types of language tests and their roles in evaluation is essential to the appropriate use of language tests. (Bachman: 54)