Khan Academy Math Problems - 33 Data collection and conclusions
(Complete Edition) Data Mining: Concepts and Techniques [3rd Edition], Selected Exercise Answers and Explanations
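1.4 How does a data warehouse differ from a database? In what ways are they similar?
Answer: Differences: a data warehouse is a subject-oriented, integrated, nonvolatile, time-variant collection of data used to support management decision making. A database consists of a set of interrelated data and a set of software programs that manage and access the data; it is operation-oriented and is the source data from which a data warehouse is built. It organizes data in tables and uses the ER data model.
Similarities: both provide source data for data mining, and both are collections of data.

1.3 Define the following data mining functionalities: characterization, discrimination, association and correlation analysis, prediction, clustering, and evolution analysis. Give an example of each functionality using a real-life database you are familiar with.
Answer: Characterization is a summarization of the general characteristics or features of a target class of data. For example, the characteristics of students can be summarized to produce a profile of all first-year university computing science students, which may include such information as a high grade point average (GPA) and the maximum number of courses taken.
Discrimination is a comparison of the general features of target-class data objects with the general features of objects from one or more contrasting classes. For example, the general features of students with a high GPA can be compared with those of students with a low GPA. The resulting description could be a general comparative profile of the students, such as: 75% of the students with a high GPA are fourth-year computing science students, while 65% of the students with a low GPA are not.
Association is the discovery of association rules, which show attribute-value conditions that occur frequently together in a given data set. For example, a data mining system may find the association rule
major(X, "computing science") ⇒ owns(X, "personal computer") [support = 12%, confidence = 98%]
where X is a variable representing a student. The rule states that, of the students under study, 12% (support) major in computing science and own a personal computer. The probability that a student in this group owns a personal computer is 98% (confidence, or certainty).
Classification differs from prediction in that the former constructs a set of models (or functions) that describe and distinguish data classes or concepts, whereas the latter builds a model to predict missing or unavailable, and usually numerical, data values.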
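The support and confidence figures attached to the association rule above can be computed directly from records. The following is a minimal sketch, not the textbook's method: the ten student records are hypothetical toy data, and `support_confidence` is a helper name introduced here for illustration.

```python
def support_confidence(records, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent => consequent."""
    total = len(records)
    n_antecedent = sum(1 for r in records if antecedent(r))
    n_both = sum(1 for r in records if antecedent(r) and consequent(r))
    support = n_both / total  # fraction of all records satisfying both sides
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Hypothetical toy data: 10 students, 4 of them computing science majors.
students = (
    [{"major": "computing science", "owns_pc": True}] * 3
    + [{"major": "computing science", "owns_pc": False}] * 1
    + [{"major": "history", "owns_pc": True}] * 2
    + [{"major": "history", "owns_pc": False}] * 4
)

support, confidence = support_confidence(
    students,
    antecedent=lambda s: s["major"] == "computing science",
    consequent=lambda s: s["owns_pc"],
)
print(f"support={support:.0%}, confidence={confidence:.0%}")
```

On this toy data the rule holds for 3 of 10 students (support) and for 3 of the 4 computing science majors (confidence), which mirrors how the 12% and 98% in the rule are read.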
Khan Academy New SAT Grammar Practice Questions

Khan Academy has been steadily releasing official practice material for the new SAT. So far it has published 68 reading passages and 41 math sets, both of which were shared earlier; the grammar practice sets follow.
By June, Khan Academy had released 48 new SAT grammar practice sets. A selection is given below.

Questions 1-5 are based on the following passage.

Searching for Guinevere

Stories of kings and queens have captivated readers for centuries, and arguably, the tales of King Arthur and Guinevere are among the most enchanting. Arthur ruled the kingdom of Camelot, and Guinevere was his queen. But were they real people or fictional characters? The debate has continued for centuries. Though many scholars have found evidence that the legendary Arthur was, at the very least, based on a real person who lived in Britain roughly between 450 and (1) 500 CE. They continue to search for the historical identity of Guinevere.

Guinevere first appeared as King Arthur's queen in one of the most widely studied works of Arthurian literature, (2) The History of the Kings of Britain. This book was written by Geoffrey of Monmouth around 1135 CE. Geoffrey's historical treatment of the legend is often (3) sited as evidence that the queen of Camelot existed, as the book chronicles the lives of a number of historical rulers.

1.
A) NO CHANGE
B) 500 CE. Continuing
C) 500 CE, continuing
D) 500 CE, they continue

2. Which choice most effectively combines the sentences at the underlined portion?
A) The History of the Kings of Britain, and this book
B) The History of the Kings of Britain, which
C) a book called The History of the Kings of Britain, as this
D) a book called The History of the Kings of Britain, and this

3.
A) NO CHANGE
B) insighted
C) cited
D) incited

Guinevere is identified by Geoffrey as a noblewoman of Roman descent who met King Arthur in the court of Duke Cador of Cornwall, where she lived as a ward. (4) In Malory's portrayal, Guinevere had no real power as a monarch but served as a kind of spiritual leader, providing guidance and moral support to the knights in their roles as defenders of the kingdom. Le Morte d'Arthur was also one of the first works to reference Guinevere's romance with the knight, Sir [...] many Arthurian scholars know, the distinction between history and literature was blurred in the Middle Ages. Consequently, the true identity of Guinevere may never be known with certainty. Yet regardless of whether Guinevere was real or fictional, her story (5) had endured centuries—and through each retelling, she continues to live on in the imaginations of people around the world.

4. At this point, the author wants to add a sentence which effectively sets up the portrayal of Guinevere discussed in the rest of the paragraph. Which choice best accomplishes this goal?
A) Three centuries later, however, Thomas Malory painted a very different portrait of Guinevere in Le Morte d'Arthur.
B) Sir Thomas Malory was an English knight and Member of Parliament who also wrote extensively about the history of the British monarchy.
C) Many historians believe that the portrayal of Arthur and Guinevere in Sir Thomas Malory's Le Morte d'Arthur was actually a political commentary on the War of the Roses (1455-1487 CE).
D) In Le Morte d'Arthur, Sir Thomas Malory describes an idyllic England under King Arthur and Guinevere, which eventually collapses into chaos and political unrest.

5.
A) NO CHANGE
B) was enduring
C) would have endured
D) has endured

Questions 1-5 are based on the following passage.

Cometary Missions: Trajectory for Success

Scientists have been launching cometary missions since 1978. The first one, a joint mission by the European Space Agency, and the National Aeronautics and Space Administration (NASA), was a "flyby" in which the spacecraft collected data while passing around Comet Giacobini-Zinner. (1) However, the landing of the Rosetta space probe on comet 67P/Churyumov-Gerasimenko in 2014 was different: it marked the first time that a probe landed on a (2) comet and giving scientists an unprecedented opportunity to study the surface of a comet. In order to continue this valuable research, additional missions are needed; thus, it is critical that more funding be allocated for this [...] 2014 Rosetta mission provided a rare opportunity for scientists to test a number of hypotheses regarding the composition of (3) comets; the distribution of organic compounds in our solar system and the origins of life on Earth. Unlike other cometary missions, the Rosetta spacecraft contained a probe, Philae, that was able to land on the surface of a comet.

1. At this point, the writer wants to add accurate information from the graph. Which choice best accomplishes this goal?
A) From 1978 to 2014, the number of successful missions increased from 28 percent to 72 percent.
B) Before 2014, the majority of attempted cometary missions were considered unsuccessful.
C) Between then and 2014, 72 percent of the cometary missions were successful.
D) Of the missions attempted since then, 44 percent have been successful.

2.
A) NO CHANGE
B) comet, but it gave
C) comet, yet gives
D) comet, giving

3.
A) NO CHANGE
B) comets, the distribution of organic compounds in our solar system,
C) comets, the distribution of organic compounds in our solar system;
D) comets; the distribution of organic compounds in our solar system,
A Journey Through Mathematics: Quiz Questions

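1. Which important mathematical tool did Einstein use when he created general relativity? Answer: Riemannian geometry
2. Does the following equation have integer solutions? [equation missing] Answer: Yes
3. Which of the following is a pair of twin primes? Answer: (17, 19)
4. In which branch of mathematics can a circle and an ellipse be regarded as the same? Answer: Topology
5. Among the following figures with the same perimeter, which has the largest area? Answer: The circle
6. Which of the following Chinese characters can be written in a single stroke without retracing? Answer: 日
7. Which are there more of, even numbers or positive integers? Answer: Equally many
8. What is the weakness of the intuitive definition of a sequence limit tending to 0? Answer: It lacks operability
9. According to the Feynman story told in the course, what is most important for understanding something? Answer: Getting a feel for it
10. Which space can be used to describe the curled-up dimensions in superstring theory? Answer: Calabi-Yau space
11. Who used the method of exhaustion to prove that the area of a circle is proportional to the square of its diameter? Answer: Eudoxus
12. Which of the following results was first obtained by Archimedes? Answer: The area of a parabolic segment
13. What method did Archimedes use to sum a geometric series? Answer: A geometric method
14. In which area did Eudoxus, Archimedes, Liu Hui and others mainly contribute to calculus? Answer: The definite integral
15. Which mathematician wrote "A Method for Developing a New Geometry of Continuous Indivisibles"? Answer: Cavalieri
16. Who introduced the word "function" that we still use today? Answer: Leibniz
17. What is the "most beautiful scenic spot" mentioned in this course? Answer: The Newton-Leibniz formula
18. Which mathematician introduced the ε-δ language that is still in use today? Answer: Weierstrass
19. Which theory, founded by Cantor, is the foundation of the real numbers and of the whole system of calculus? Answer: Set theory
20. Which of the following statements about Riemann integrability and Lebesgue integrability is correct? Answer: The class of Riemann integrable functions is not complete; the class of Lebesgue integrable functions is complete
21. Use Archimedes' method to find the sum of the following geometric series. [series missing]
22. Compute the surface area and the volume of Gabriel's horn, and explain why, if the horn is filled with paint, the volume of paint is finite yet it can coat an infinite surface area.
23. Give an example showing that in Riemann integration the integral sign and the limit sign sometimes cannot be interchanged, and state the conditions under which they can be.
24. Which of the following four definitions cannot serve as a metric (distance) on R^n? [options missing]
25. Which of the following is not one of the three basic properties of a metric? Answer: Continuity (the three properties are the triangle inequality, positive definiteness, and symmetry)
26. Which of the following statements about metrics and norms is correct? Answer: A distance can be defined from a norm, but a norm cannot be defined from a distance
27. For [...], if [...] is a norm on [...], which of the following statements is incorrect? Answer: If [...] is a real number, then [...] [formulas missing]
28. Which principle explains the following phenomenon: in three-dimensional space a propagating wave has sharp leading and trailing fronts, but in two-dimensional space it does not? Answer: Huygens' principle
29. Which of the following sets of vectors cannot form a basis of [...]? Answer: (0,1,1), (2,1,1), (1,0,0)
30. If the vectors a=(1,0,5,2), b=(3,-2,3,-4), c=(-1,1,t,3) are linearly dependent, what is the value of t? Answer: 1
31. What is the angle between the vectors [...] and [...]? [vectors and answer missing]
32. What is a necessary and sufficient condition for a set of vectors to be linearly independent? Answer: The homogeneous linear system has only the zero solution
33. Which of the following properties does an inner product not have? Answer: The triangle inequality (it does have symmetry, linearity in the first argument, and positive definiteness)
34. Given a set [...], verify whether each of the following two families of sets forms a topology on M. 1) [...] 2) [...]
35. With the rapid development of the Internet, people increasingly use e-mail to keep in touch and communicate. [question truncated]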
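The two linear-algebra answers in questions 29 and 30 can be checked by computing matrix rank: a set of vectors is linearly dependent exactly when the rank of the matrix of those vectors is smaller than the number of vectors. The sketch below uses exact rational arithmetic; the helper name `rank` and the search range for t are choices made here for illustration.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over exact rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


# Question 30: a, b, c are linearly dependent exactly when the rank drops below 3.
def vectors(t):
    return [(1, 0, 5, 2), (3, -2, 3, -4), (-1, 1, t, 3)]


dependent_t = [t for t in range(-5, 6) if rank(vectors(t)) < 3]
print(dependent_t)

# Question 29: three vectors span a 3-dimensional space only if their rank is 3.
print(rank([(0, 1, 1), (2, 1, 1), (1, 0, 0)]))
```

Within the searched integers only t = 1 makes the three vectors dependent, and the three vectors from question 29 have rank 2, since (2,1,1) = (0,1,1) + 2(1,0,0).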
Khan Academy Math Problems - 41 Complex numbers [1]
Complex numbers

Unless a question states otherwise, it asks: Which of the following is equivalent to the complex number shown above? Note: i = √−1.

1. (−8+4i)(1−i)
A. −12+4i  B. −12+12i  C. −4+12i  D. −4+4i
Correct answer: C. Difficulty level: 2

2. (4+i)^2
A. 15+8i  B. 15−8i  C. 17+8i  D. 17−8i
Correct answer: A. Difficulty level: 2

3. (8−2i)(4−2i)
A. 28−24i  B. 28+8i  C. 36−24i  D. 36+8i
Correct answer: A. Difficulty level: 2

4. (5+i)(7−3i)
A. 32+8i  B. 32−8i  C. 38+8i  D. 38−8i
Correct answer: D. Difficulty level: 2

5. i^4 + 4i^2 + 4
A. 1  B. −1  C. i+4  D. i−4
Correct answer: A. Difficulty level: 2

6. (−3−i)(4−2i)
A. −14−2i  B. −14+2i  C. −10−2i  D. −10+2i
Correct answer: B. Difficulty level: 2

7. (6+2i)^2
A. 40+4i^2  B. 40+24i  C. 32+24i  D. 32+4i^2
Correct answer: C. Difficulty level: 2

8. (1+i)(1−i)
A. 2−2i  B. 2i  C. 0  D. 2
Correct answer: D. Difficulty level: 2

9. i(7−3i)
A. 4i  B. 10i  C. 7i−3  D. 7i+3
Correct answer: D. Difficulty level: 2

10. (i^2 − 16)/(i + 4)
A. i−4  B. i+4  C. −i−4  D. −i+4
Correct answer: A. Difficulty level: 2

11. (3+i)(2−4i)
A. 2−10i  B. 2−14i  C. 10−10i  D. 10−14i
Correct answer: C. Difficulty level: 2

12. i^101
A. 1  B. −1  C. i  D. −i
Correct answer: C. Difficulty level: 2

13. (5−i)^2
A. 24−10i  B. 24+10i  C. 26−10i  D. 26+10i
Correct answer: A. Difficulty level: 2

14. −8(7i − 3i^2)
A. −80i  B. −56i−24  C. −56+24i  D. −32i
Correct answer: B. Difficulty level: 3

15. 3/(2+i)
A. 2−i  B. 2+i  C. (6+3i)/5  D. (6−3i)/5
Correct answer: D. Difficulty level: 3

16. (3−i)^3
A. 8−26i  B. 18−26i  C. 27−26i  D. 30−26i
Correct answer: B. Difficulty level: 3

17. (5 − 7i + i^2) + (8i^3 + 12)
The complex expression above is equivalent to the expression a+bi for the integer constants a and b. What is the value of a?
A. 16  B. 17  C. 18  D. 19
Correct answer: A. Difficulty level: 3

18. (−3+2i)(1 − i^3)
A. −5−i  B. −5+5i  C. −1−i  D. −1+5i
Correct answer: A. Difficulty level: 3

19. i^11 + i^13
A. −2i  B. 2i  C. 0  D. 2
Correct answer: C. Difficulty level: 3

20. 5/(1+3i)
A. (1+3i)/2  B. (1−3i)/2  C. −5(1+3i)/8  D. −5(1−3i)/8
Correct answer: B. Difficulty level: 3

21. (10 − 8i^3) − (6+i)
A. 4−7i  B. 4+7i  C. 4+9i  D. 4−9i
Correct answer: B. Difficulty level: 3

22. 1/(1−i)
A. 2−2i  B. 2+2i  C. (1−i)/2  D. (1+i)/2
Correct answer: D. Difficulty level: 3

23. 2/(1−i)
A. 1−i  B. 1+i  C. 2−i  D. 2+i
Correct answer: B. Difficulty level: 3

24. 8ix = −5
What is the value of x in the equation above?
A. −8i/5  B. 8i/5  C. −5i/8  D. 5i/8
Correct answer: D. Difficulty level: 3

25. (2−3i)^3
A. −46−9i  B. −26−9i  C. 26−9i  D. 46−9i
Correct answer: A. Difficulty level: 3

26. (1/2 + i)(8−6i)
A. −2+5i  B. 2+2i  C. 10+5i  D. 14+2i
Correct answer: C. Difficulty level: 3

27. 1/(1−6i) − 1/(1+6i)
A. (12/37)i  B. −(12/37)i  C. 12/37  D. −12/37
Correct answer: A. Difficulty level: 3

28. (2/3 + (1/2)i)(12 − (1/3)i)
The complex expression above is equivalent to the expression a+bi for the rational constants a and b. What is the value of b?
A. b = 1/6  B. b = −1/6  C. b = 49/6  D. b = 52/9
Correct answer: D. Difficulty level: 3

29. P(x) = 2x^2 + 3x − 17
If x = 8−2i, what is the value of the polynomial P above?
A. 15−2i  B. 23−6i  C. 127−70i  D. 135−62i
Correct answer: C. Difficulty level: 3

30. (6+i2)(2−2i)
A. 6−8i  B. 8−8i  C. 10−8i  D. 12−8i
Correct answer: B. Difficulty level: 3

31. 3/i + 2i^2
A. 3i+2  B. 3i−2  C. −3i+2  D. −3i−2
Correct answer: D. Difficulty level: 3

32. 2/(1+i)
A. −1+i  B. −1−i  C. 1+i  D. 1−i
Correct answer: D. Difficulty level: 4

33. 3i^10 + i^11
A. 3+i  B. −3+i  C. 3−i  D. −3−i
Correct answer: D. Difficulty level: 4

34. P(n) = n^2 − 5n − 7
What is the value of P(−3i)?
A. −4+15i  B. −7+12i  C. −7+24i  D. −16+15i
Correct answer: D. Difficulty level: 4

35. √3 t^2 + 5t + √27 = 0
Which of the following is a solution to the equation above?
A. t = (−4√11 i)/6  B. t = (−3√33 i)/6  C. t = (−5+√11 i)/6  D. t = (−5√3+√33 i)/6
Correct answer: D. Difficulty level: 4

36. 29 = 3(x+7)^2 + 41
Which of the following is a solution to the equation above?
A. x = −7+2i  B. x = −42−12i  C. x=−76+√4436i  D. x=−7−√1233i
Correct answer: A. Difficulty level: 4

37. (8−2i)^2 (8+2i)
A. 60  B. 68  C. 480−120i  D. 544−136i
Correct answer: D. Difficulty level: 4

38. (√54 i^41)/(√27 i^101)
A. −√2 i  B. −√2  C. √2  D. √2 i
Correct answer: C. Difficulty level: 4

39. 2/(2−i) − 2/(2+i)
A. 4i/5  B. −4i/5  C. 2i/3  D. −2i/3
Correct answer: A. Difficulty level: 4

40. (1+2i)/(1−2i) ÷ (1−2i)/(1+2i)
A. 1  B. −1  C. −7/25 + (24/25)i  D. −7/25 − (24/25)i
Correct answer: D. Difficulty level: 4

41. i^3 + i^2
A. −1  B. −2  C. −1+i  D. −1−i
Correct answer: D. Difficulty level: 4

42. m^2 + 6m + 10 = 0
Which of the following are solutions to the equation above?
I. −3+i
II. −3−i
III. 3+i
A. I only  B. I and II only  C. I and III only  D. I, II, and III
Correct answer: B. Difficulty level: 4

43. 704 i^1776
A. 704  B. −704  C. 704i  D. −704i
Correct answer: A. Difficulty level: 4

44. 1/(2+5i) − (4+3i)/(3−i)
A. (10+26i)/((2+5i)(3−i))  B. (10−26i)/((2+5i)(3−i))  C. (10+27i)/((2+5i)(3−i))  D. (10−27i)/((2+5i)(3−i))
Correct answer: D. Difficulty level: 4

45. (5+7i)/(6−3i)
A. (9+57i)/45  B. (9+57i)/3  C. (51+57i)/45  D. (51+57i)/3
Correct answer: A. Difficulty level: 4

46. (9 + 7i^18)/(4−i)
The complex expression above is equivalent to the expression a+bi for the rational constants a and b. What is the value of b?
A. b = 2/15  B. b = 2/17  C. b = −1  D. b = −7
Correct answer: B. Difficulty level: 4

47. 5 − i + (11−i)z = 40 + 18i
What is the value of z in the equation above?
A. z = −19+24i  B. z = 24+20i  C. z = 3+2i  D. z=3.2+145i
Correct answer: C. Difficulty level: 4

48. 2i + 4h − 14 = 2ih
What is the value of h in the equation above?
A. h = 3+i  B. h = 7/2  C. h = 7−i  D. h = 8/3
Correct answer: A. Difficulty level: 4
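Several of the answer keys above can be spot-checked with Python's built-in complex type, where `1j` plays the role of i. This is only a sanity check on a handful of questions whose expressions are unambiguous in the source, not a full answer key; `i_pow` is a helper introduced here that reduces powers of i using the cycle 1, i, −1, −i.

```python
import cmath


def i_pow(n):
    """Exact value of i**n for a non-negative integer n, using the period-4 cycle."""
    return (1, 1j, -1, -1j)[n % 4]


# (label, computed value, keyed answer) for a few questions from the list above.
checks = [
    ("Q1", (-8 + 4j) * (1 - 1j), -4 + 12j),
    ("Q2", (4 + 1j) ** 2, 15 + 8j),
    ("Q12", i_pow(101), 1j),
    ("Q19", i_pow(11) + i_pow(13), 0),
    ("Q43", 704 * i_pow(1776), 704),
    ("Q47", (40 + 18j - (5 - 1j)) / (11 - 1j), 3 + 2j),
]

for label, value, expected in checks:
    # isclose guards against rounding in the one division above.
    assert cmath.isclose(value, expected, abs_tol=1e-12), label

# Q42: both -3+i and -3-i should satisfy m^2 + 6m + 10 = 0.
roots_ok = all(m * m + 6 * m + 10 == 0 for m in (-3 + 1j, -3 - 1j))
print("all checks passed:", roots_ok)
```

Reducing the exponent modulo 4 avoids raising `1j` to a large power in floating point, which would introduce small rounding errors.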
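Latest Beijing Normal University Press Edition, High School Mathematics Elective 2-3, Chapter 3 "Statistical Case Studies": Test Questions (with Answers and Explanations) (1)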

一、选择题1.下列说法错误的是( )A .在回归直线方程0.2 0.8y x =+中,当解释变量x 每增加1个单位时,预报变量y 平均增加0.2个单位.B .对分类变量X 与Y ,随机变量2K 的观测值k 越大,则判断“X 与Y 有关系”的把握程度越小.C .两个随机变量的线性相关性越强,则相关系数的绝对值就越接近于1.D .回归直线过样本点的中心(),x y .2.为检测某药品服用后的多长时间开始有药物反应,现随机抽取服用了该药品的1000人,其服用后开始有药物反应的时间(分钟)与人数的数据绘成的频率分布直方图如图所示.若将直方图中分组区间的中点值设为解释变量x (分钟),这个区间上的人数为y (人),易见两变量x ,y 线性相关,那么一定在其线性回归直线上的点为( )A .()1.5,0.10B .()2.5,0.25C .()2.5,250D .()3,3003.利用独立性检验的方法调查大学生的性别与爱好某项运动是否有关,通过随机询问400名不同的大学生是否爱好某项运动,利用22⨯列联表,计算可得2K 的观测值7.556k ≈,附表:20()P K k ≥0.15 0.100.050.025 0.010 0.005 0.001 0k 2.0722.7063.8415.0246.6357.87910.828参照附表,得到的正确结论是A .有99%以上的把握认为“爱好该项运动与性别无关”B .有99%以上的把握认为“爱好该项运动与性别有关”C .在犯错误的概率不超过0.5%的前提下,认为“爱好该项运动与性别有关”D .在犯错误的概率不超过1%的前提下,认为“爱好该项运动与性别无关” 4.下列判断错误的是A .若随机变量ξ服从正态分布()()21,,30.72N P σξ≤=,则()10.28P ξ≤-=;B .若n 组数据()()()1122,,,,...,,n n x y x y x y 的散点都在1y x =-+上,则相关系数1r =-;C .若随机变量ξ服从二项分布: 15,5B ξ⎛⎫~ ⎪⎝⎭, 则()1E ξ=; D .am bm >是a b >的充分不必要条件;5.为了普及环保知识,增强环保意识,某大学从理工类专业的A 班和文史类专业的B 班各抽取20名同学参加环保知识测试,统计得到成绩与专业的列联表:( )附:参考公式及数据:(1)统计量:()()()()()22n ad bc K a b c d a c b d -=++++,(n a b c d =+++).(2)独立性检验的临界值表:则下列说法正确的是A .有95%的把握认为环保知识测试成绩与专业有关B .有95%的把握认为环保知识测试成绩与专业无关C .有99%的把握认为环保知识测试成绩与专业有关D .有99%的把握认为环保知识测试成绩与专业无关6.某种产品的广告费支出x 与销售额y (单位:万元)之间有下表关系:y 与x 的线性回归方程为 6.5175ˆ.y x =+,当广告支出5万元时,随机误差的效应(残差)为( ) A .40 B .20 C .30D .107.某市政府调查市民收入与旅游欲望时,采用独立性检验法抽取3 000人,计算发现k 2=6.023,则根据这一数据查阅下表,市政府断言市民收入增减与旅游欲望有关系的把握是( )A .90%B .95%C .97.5%D .99.5%8.某科研机构为了研究中年人秃发与心脏病是否有关,随机调查了一些中年人的情况,具体数据见下表:根据表中数据得到()277520450530015.96820750320455k ⨯⨯-⨯=≈⨯⨯⨯,因为K 2≥10.828,则断定秃发与心脏病有关系,那么这种判断出错的可能性为( ) A .0.1B .0.05C .0.01D .0.0019.某家具厂的原材料费支出x 与销售量y (单位:万元)之间有如下数据,根据表中提供的全部数据,用最小二乘法得出y 与x 的线性回归方程为ˆ8ˆy x b =+,则^b为( )A .5B .15C .10D .2010.已知样本789x y 、、、、的平均数是8xy 值为 A .8B .32C .60D .8011.已知,x y 的取值如下表:( )y1 1.3 3.2 5.6 8.9若依据表中数据所画的散点图中,所有样本点()(,)1,2,3,4,5i i x y i =都在曲线212y x a =+附近波动,则a =( ) A .1B .12C .13D .12-12.某工厂为了调查工人文化程度与月收入的关系,随机抽取了部分工人,得到如下列表:由上表中数据计算得2K =()21051030204555503075⨯⨯-⨯⨯⨯⨯≈6.109,请根据下表,估计有多大把握认为“文化程度与月收入有关系”( )A .1%B .99%C 
.2.5%D .97.5%二、填空题13.给出以下四个命题:①设,,a b c 是空间中的三条直线,若a b ⊥,b c ⊥,则//a c .②在面积为S 的ABC 的边AB 上任取一点P ,则PBC 的面积大于S4的概率为34.③已知一个回归直线方程为 1.545y x =+{}()1,5,7,13,19,1,2,...,5i x i ∈=,则58.5=y . ④数列{}n a 为等差数列的充要条件是其通项公式为n 的一次函数. 其中正确命题的序号为________.(把所有正确命题的序号都填上)14.在一次独立试验中,有200人按性别和是否色弱分类如下表(单位:人)男 女 正常 73 117 色弱73你能在犯错误的概率不超过_____的前提下认为“是否色弱与性别有关”?15.以下结论正确..的序号有_________ (1)根据22⨯列联表中的数据计算得出2K ≥6.635, 而P (2K ≥6.635)≈0.01,则有99% 的把握认为两个分类变量有关系.(2)在残差图中,残差点比较均匀落在水平的带状区域中即可说明选用的模型比较合适,与带状区域的宽度无关.(3)在线性回归分析中,相关系数为r ,r 越接近于1,相关程度越大;r 越小,相关程度越小.(4)在回归直线0.585y x =-中,变量200x =时,变量y 的值一定是15. 16.给出下列命题:①线性相关系数越大,两个变量的线性相关越强;反之,线性相关性越弱; ②由变量和的数据得到其回归直线方程:,则一定经过;③从越苏传递的产品生产流水线上,质检员每10分钟从中抽取一件产品进行某项指标检测,这样的抽样是分层抽样;④在回归分析模型中,残差平方和越小,说明模型的拟合效果越好; ⑤在回归直线方程中,当解释变量每增加一个单位时,预报变量增加0.1个单位,其中真命题的序号是___________.17.在吸烟与患肺病这两个分类变量的计算中,“若2x 的观测值为6.635,我们有99%的把握认为吸烟与患肺病有关系”这句话的意思: ①是指“在100个吸烟的人中,必有99个人患肺病 ②是指“有1%的可能性认为推理出现错误”; ③是指“某人吸烟,那么他有99%的可能性患有肺病”; ④是指“某人吸烟,如果他患有肺病,那么99%是因为吸烟”. 其中正确的解释是______.18.用线性回归模型求得甲、乙、丙3组不同的数据对应的2R 的值分别为0.81,0.98,0.63,其中__________(填甲、乙、丙中的一个)组数据的线性回归的效果最好.19.下列说法正确的个数有_________(1)已知变量x 和y 满足关系23y x =-+,则x 与y 正相关;(2)线性回归直线必过点(),x y ;(3)对于分类变量A 与B 的随机变量2k ,2k 越大说明“A 与B 有关系”的可信度越大 (4)在刻画回归模型的拟合效果时,残差平方和越小,相关指数2R 的值越大,说明拟合的效果越好.20.下列说法中,正确的有_______.①回归直线ˆˆˆy bx a =+恒过点(),x y ,且至少过一个样本点;②根据22⨯列列联表中的数据计算得出2 6.635K ≥,而()26.6350.01P K ≥≈,则有99%的把握认为两个分类变量有关系;③2k 是用来判断两个分类变量是否相关的随机变量,当2k 的值很小时可以推断两个变量不相关;三、解答题21.有治疗某种疾病的A B 、两种药物,为了分析药物的康复效果进行了如下随机抽样调查:AB 、两种药物各有100位病人服用,他们服用药物后的康复时间(单位:天数)及人数记录如下: 服用A 药物:(1)若康复时间低于15天(不含15天),记该种药物对某病人为“速效药物”.当17a >时,请完成下列22⨯列联表,并判断是否有99%的把握认为病人服用药物A 比服用药物B 更速效?A 药物的7人为Ⅰ组,服用B 药物的7人为Ⅱ组.现从Ⅰ、Ⅱ两组中随机各选一人,分别记为甲、乙.①a 为何值时,Ⅰ、Ⅱ两组人康复时间的方差相等(不用说明理由); ②在①成立且12a >的条件下,求甲的康复时间比乙的康复时间长的概率. 
参考数据:参考公式:2()()()()()n ad bc K a b c b a c b d -=++++,其中n =a +b +c +d.22.目前,新冠病毒引发的肺炎疫情在全球肆虐,为了解新冠肺炎传播途径,采取有效防控措施,某医院组织专家统计了该地区500名患者新冠病毒潜伏期的相关信息,数据经过汇总整理得到如图所示的频率分布直方图(用频率作为概率).潜伏期不高于平均数的患者,称为“短潜伏者”,潜伏期高于平均数的患者,称为“长潜伏者”.(1)求这500名患者潜伏期的平均数(同一组中的数据用该组区间的中点值作代表),并计算出这500名患者中“长潜伏者”的人数;(2)为研究潜伏期与患者年龄的关系,从上述500名患者中抽取300人,得到如下列联表,根据列联表判断是否有97.5%的把握认为潜伏期长短与患者年龄有关:短潜伏者 长潜伏者 合计60岁及以上 90 70 160 60岁以下 60 80 140 合计 150150300附表及公式:20P K k ≥()0.15 0.10 0.05 0.025 0.010 0.005 0.001 0k2.0722.7063.8415.0246.6357.87910.82822()()()()()n ad bc K a b c d a c b d -=++++23.我国新型冠状病毒肺炎疫情期间,以网络购物和网上服务所代表的新兴消费展现出了强大的生命力,新兴消费将成为我国消费增长的新动能.某市为了了解本地居民在2020年2月至3月两个月网络购物消费情况,在网上随机对1000人做了问卷调查,得如表频数分布表:(1)作出这些数据的频率分布直方图,并估计本市居民此期间网络购物的消费平均值;(2)在调查问卷中有一项是填写本人年龄,为研究网购金额和网购人年龄的关系,以网购金额是否超过4000元为标准进行分层抽样,从上述1000人中抽取200人,得到如表列联表,请将表补充完整并根据列联表判断,在此期间是否有95%的把握认为网购金额与网购人年龄有关.参考公式和数据:()()()()()22n ad bcKa b c d a c b d-=++++.(其中n a b c d=+++为样本容量)24.2020年3月,因为新冠肺炎疫情的影响,我市全体学生只能在网上在线学习,为了研究学生在线学习情况,市教研院数学学科随机从市区各高中学校抽取120名学生对线上教学情况进行调查(其中,男生与女生的人数之比为3:1),结果发现:男生中有40名对于线上教学满意,女生中有10名表示对于线上教学不满意.(1)请完成如表2×2列联表,并回答能否有95%的把握认为对“线上教学是否满意与性别有关”;态度性别满意不满意合计男生女生合计120(2)采用分层抽样的方法,从被调查的对线上教学满意的学生中,抽取6名学生,再从这6名学生中抽取2名学生,作线上学习的经验介绍,求所选取的2名学生性别不同的概率.附:参考公式及临界值表()()()()()22n ad bc K a b c d a c b d -=++++,其中n a b c d =+++P (K 2>k 0)0.150.100.050.0250.0100.0050.001 k 02.0722.7063.8415.0246.6357.87910.82825.某地为响应国家“脱贫攻坚战”的号召,帮助贫困户脱贫,安排贫困人员参与工厂生产.现用A ,B 两条生产线生产某产品.为了检测该产品的某项质量指标值(记为Z ),现随机抽取这两种这两条生产线的产品各100件,由检测结果得到如下频率分布直方图.(Ⅰ)分别估计A ,B 两条生产线的产品质量指标值的平均数(同一组数据中的数据用该组区间的中点值作代表),从平均数结果看,哪条生产线的质量指标值更好?(Ⅱ)计算A 生产线的产品质量指标值的众数和中位数(中位数计算结果精确到小数点后两位).(Ⅲ)该公司规定当92Z ≥时,产品为超优品.根据所检测的结果填写22⨯列联表,并判断是否有95%的把握认为“生产超优品是否与生产线有关”.附:()()()()()22n ad bc K a b c d a c b d -=++++,n a b c d =+++()20P K k ≥0.050 0.010 0.005 0.001 0k 3.8416.6357.87910.82822⨯列联表A 生产线B 生产线 总计超优品非超优品 总计26.根据国家统计局数据,1999年至2019年我国进出口贸易总额从3万亿元跃升至31.6万亿元,中国在国际市场上的贸易份额越来越大对外贸易在国民经济中的作用日益突出.将年份1999,2004,2009,2014,2019分别用1,2,3,4,5代替,并表示为t ,y 
表示全国进出口贸易总额.(1)根据以上统计数据及图表,给出了下列两个方案,请解决方案1中的问题. 方案1:用y bt a =+作为全国进出口贸易总额y 关于t 的回归方程,根据以下参考数据,求出y 关于t 的回归方程,并求相关指数21R .方案2:用dt y ce =作为全国进出口贸易总额y 关于t 的回归方程,求得回归方程0.57212.3259x y e =,相关指数22R .(2)通过对比(1)中两个方案的相关指数,你认为哪个方案中的回归方程更合适,并利用此回归方程预测2020年全国进出口贸易总额. 参考数据:y()()51=--∑iii t t y y()521ii y y =-∑17.14 74 555.792①0.140.340.66 1.86 2.048.192++++=②222220.140.34 1.86 2.04 2.1412.336++++=③8.1920.0147555.792≈④12.3360.0222555.792≈参考公式:线性回归方程中的斜率和截距的最小二乘法估计公式分别为:()()()121nii i nii xx y yb xx==--=-∑∑,a y bx =-,相关指数()()221211ni ii n ii y y R yy==-=--∑∑.【参考答案】***试卷处理标记,请不要删除一、选择题 1.B 解析:B 【分析】根据线性回归方程,相关系数,独立性检验的相关知识即可判断选项的正误. 【详解】对于选项A :在回归直线方程0.2.8ˆ0yx =+中,当解释变量x 每增加1个单位时,预报变量y 平均增加0.2个单位,正确.对于选项B :对分类变量X 与Y ,随机变量2K 的观测值k 越大,则判断“X 与Y 有关系"的把握程度越大,错误.对于选项C :两个随机变量的线性相关性越强,则相关系数的绝对值就越接近于1,正确. 对于选项D :回归直线过样本点的中心(),x y ,正确. 故选: B 【点睛】本题主要考查了线性回归的有关知识,考查了随机变量的相关性,考查了推理能力,属于中档题.2.C解析:C 【分析】写出四个区间中点的横纵坐标,从而可求出 2.5x =,250y =,进而可选出正确答案. 【详解】解:由频率分布直方图可知, 第一个区间中点坐标,111.0,0.101000100x y ==⨯=, 第二个区间中点坐标,222.0,0.211000210x y ==⨯=, 第三个区间中点坐标,333.0,0.301000300x y ==⨯=, 第四个区间中点坐标,444.0,0.391000390x y ==⨯=, 则()12341 2.54x x x x x =+++=,()123412504y y y y y =+++=, 则一定在其线性回归直线上的点为(),x y ()2.5,250=. 故选:C. 【点睛】本题考查了频率分布直方图,考查了线性回归直线方程的性质.本题的关键是利用线性回归直线方程的性质,即点(),x y 一定在方程上.3.B解析:B 【分析】根据2K 的观测值7.556k ≈,对照表中数据,即可得到相应的结论. 【详解】根据2K 的观测值7.556k ≈,对照表中数据得出有0.01的几率说明这两个变量之间的关系是不可信的,即有10.0199%-=的把握说明两个变量之间有关系,故选B . 【点睛】本题主要考查独立性检验的应用,独立性检验的一般步骤:(1)根据样本数据制成22⨯列联表;(2)根据公式计算2K 的观测值k ;(3)查表比较k 与临界值的大小关系,作统计判断.(注意:在实际问题中,独立性检验的结论也仅仅是一种数学关系,得到的结论也可能犯错误)4.D解析:D 【解析】分析:根据正态分布的对称性求出()1P ξ≤-的值,判断A 正确; 根据线性相关关系与相关系数的定义,判断B 正确; 根据二项分布的均值计算公式求出()E ξ的值,判断C 正确; 判断充分性和必要性是否成立,得出D 错误.详解:对于A ,随机变量ξ服从正态分布()21,N σ,∴曲线关于1ξ=对称,131310.720.28PP P ξξξ∴≤-=≥=-≤=-=()()(),A 正确;对于B ,若n 组数据()()()1122,,,,...,,n n x y x y x y 的散点都在1y x =-+上, 则x y ,成负相关,且相关关系最强,此时相关系数1r =-,B 正确;对于C ,若随机变量ξ服从二项分布: 15,5B ξ⎛⎫~ ⎪⎝⎭,则1515E(),ξ=⨯= C 正确;对于D ,am >bm 时,a >b 不一定成立,即充分性不成立,a b am bm >时,> 不一定成立,即必要性不成立,是既不充分也不必要条件,D 错误. 
故选:D .点睛:本题考查了命题真假的判断问题,是综合题.5.A解析:A 【解析】分析:首先计算观测值k 0的值,然后给出结论即可.详解:由列联表计算观测值:()2401413672804.912 3.8412119202057k ⨯⨯-⨯==≈>⨯⨯⨯, 则有95%的把握认为环保知识测试成绩与专业有关. 本题选择A 选项.点睛:本题主要考查独立性检验及其应用等知识,意在考查学生的转化能力和计算求解能力.6.D解析:D 【解析】∵y 与x 的线性回归方程为 6.5175ˆ.y x =+ 当5x =时,ˆ50y=. 当广告支出5万元时,由表格得:60y = 故随机误差的效应(残差)为605010.-= 故选D .7.C解析:C 【详解】∵2 6.023 5.024K =>∴可断言市民收入增减与旅游欲望有关的把握为97.5%. 故选C.点睛:本题主要考查独立性检验的实际应用.独立性检验的一般步骤:(1)根据样本数据制成22⨯列联表;(2)根据公式22()()()()()n ad bc K a b c d a c b d -=++++,计算出2K 的值;(3)查表比较2K 与临界值的大小关系,作统计判断.8.D解析:D 【解析】010.828,10.0010.99999.90k ≥∴-==,则有0099.9以上的把握认为秃发与患心脏病有关,故这种判断出错的可能性为10.9990.001-=,故选D.【方法点睛】本题主要考查独立性检验的实际应用,属于难题.独立性检验的一般步骤:(1)根据样本数据制成22⨯列联表;(2)根据公式()()()()()22n ad bc K a b a d a c b d -=++++计算2K 的值;(3) 查表比较2K 与临界值的大小关系,作统计判断.(注意:在实际问题中,独立性检验的结论也仅仅是一种数学关系,得到的结论也可能犯错误.)9.C解析:C由题意可得:2456855x ++++==,2535605575505y ++++==,回归方程过样本中心点,则:5285,1ˆˆ0bb =⨯+∴=. 本题选择C 选项.10.C解析:C 【解析】由78982x y++++⎧=⎪⎪=得=60xy ,故选C. 11.A解析:A 【解析】 设2t x = ,则11(014916)6,(1 1.3 3.2 5.68.9)455t y =++++==++++=,所以点(6,4)在直线12y t a =+上,求出1a =,选A. 点睛:本题主要考查了散点图,属于基础题.样本点的中心(),x y 一定在直线回归直线上,本题关键是将原曲线变形为12y t a =+,将点(6,4)代入,求出值. 12.D解析:D 【解析】 试题由题根据二列联表得出;2K =()21051030204555503075⨯⨯-⨯⨯⨯⨯≈6.109,对应参考值得 2 5.024K >,则有10.0250.975-=,即有97.5%的把握认为文化程度与月收入有关系。
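The independence-test solutions above all follow the same two steps: compute the observed value of K² from the 2×2 table, then compare it with the critical values in the attached table. A minimal sketch of both steps follows. The cell values are assumptions read off the worked formulas in questions 8 and 12 (the tables themselves did not survive), and `k_squared` and `smallest_error_bound` are names introduced here for illustration.

```python
def k_squared(a, b, c, d):
    """Observed K^2 for a 2x2 table with rows (a, b) and (c, d)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Critical-value table quoted in the test paper: (P(K^2 >= k0), k0).
CRITICAL = [
    (0.15, 2.072), (0.10, 2.706), (0.05, 3.841), (0.025, 5.024),
    (0.010, 6.635), (0.005, 7.879), (0.001, 10.828),
]


def smallest_error_bound(k):
    """Smallest tabulated error probability p whose critical value k0 satisfies k >= k0."""
    passed = [p for p, k0 in CRITICAL if k >= k0]
    return min(passed) if passed else None


# Question 8 (baldness vs. heart disease), cells inferred from the worked formula.
k8 = k_squared(20, 300, 5, 450)
# Question 12 (education level vs. monthly income), cells inferred likewise.
k12 = k_squared(10, 20, 45, 30)

for name, k in (("Q8", k8), ("Q12", k12)):
    p = smallest_error_bound(k)
    print(f"{name}: K^2 = {k:.3f}, error probability bound = {p}")
```

A bound of p corresponds to the phrasing "there is (1 − p) confidence that the two variables are related", so 0.025 matches the 97.5% answer to question 12 and 0.001 matches the answer to question 8.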
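Data Mining (National University of Defense Technology, China University MOOC): End-of-Chapter Answers and Final Exam Question Bank, 2023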

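1. After studying its sales records, a supermarket finds that customers who buy beer are also very likely to buy diapers. Which kind of data mining problem is this? ( )
Answer: Association rule discovery

2. Which of the following statements about SVM is incorrect? ( )
Answer: Because SVM uses kernel functions, it has no risk of overfitting

3. The main factors that affect the performance of a clustering algorithm are: ( )
Answer: feature selection; clustering criterion; pattern similarity measure

4. The naive Bayes classifier has no data smoothing problem. ( )
Answer: False

5. Which kinds of nodes does a decision tree contain?
Answer: internal node; leaf node; root node

6. The mathematical operation that can be applied to nominal data is:
Answer: the mode

7. In general, the k-NN nearest-neighbor method works well when ( )
Answer: the samples are few but highly representative

8. Consider football matches between two teams, team 0 and team 1. Suppose team 0 wins 65% of the matches, P(Y=0)=0.65, and team 1 wins the remaining matches, P(Y=1)=0.35. Only 30% of the matches won by team 0 are played at team 1's home ground, P(X=1|Y=0)=0.3, while 75% of the matches won by team 1 are home wins, P(X=1|Y=1)=0.75. The probability that team 1 wins when playing at home, P(Y=1|X=1), is: ( )
Answer: 0.57

9. The minimum of a set of data is 12,000 and the maximum is 98,000. Using min-max normalization to map the data to [0,1], the normalized value of 73,600 is: ( )
Answer: 0.716

10. Which of the following classification methods copes better with the problem of imbalanced samples? ( )
Answer: KNN

11. Consider clustering that simply partitions the set of data objects into non-overlapping subsets, so that each data object lies in exactly one subset. Which of the following do not belong to this type of clustering?
Answer: hierarchical clustering; fuzzy clustering; non-exclusive clustering

12. An uneven density distribution of the data points affects the result of K-means clustering.
Answer: True

13. Data integration has to deal with schema integration, entity identification, data conflict detection, and similar problems.
Answer: True

14. One way to handle continuous attributes in a decision tree model is to choose a threshold by information gain and discretize the attribute.
Answer: True

15. When an attribute in a database has many missing values, data cleaning can use the method of ignoring the tuple.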
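The two numerical answers in questions 8 and 9 can be reproduced in a few lines. The sketch below applies Bayes' rule to the football example and the min-max formula to the normalization example; the function names are introduced here for illustration only.

```python
def posterior(prior, likelihood):
    """Bayes' rule: P(Y=y | X=1) from P(Y=y) and P(X=1 | Y=y)."""
    evidence = sum(prior[y] * likelihood[y] for y in prior)  # P(X=1)
    return {y: prior[y] * likelihood[y] / evidence for y in prior}


prior = {0: 0.65, 1: 0.35}            # P(Y=0), P(Y=1)
likelihood_home = {0: 0.30, 1: 0.75}  # P(X=1 | Y=0), P(X=1 | Y=1)
p_team1_wins_at_home = posterior(prior, likelihood_home)[1]


def min_max(value, old_min, old_max, new_min=0.0, new_max=1.0):
    """Min-max normalization of value from [old_min, old_max] to [new_min, new_max]."""
    scale = (value - old_min) / (old_max - old_min)
    return scale * (new_max - new_min) + new_min


print(round(p_team1_wins_at_home, 2))
print(round(min_max(73_600, 12_000, 98_000), 3))
```

The posterior is 0.2625 / 0.4575, which rounds to 0.57. For the normalization, the keyed answer 0.716 corresponds to an input of 73,600, since (73,600 − 12,000) / 86,000 ≈ 0.716; an input of 73,000 would give about 0.709.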
SUMMA: Scalable Universal Matrix Multiplication Algorithm

These are the special cases implemented as part of the widely used sequential Basic Linear Algebra Subprograms [11]. We will assume that each matrix $X$ is of dimension $m_X \times n_X$, $X \in \{A, B, C\}$. Naturally, there are constraints on these dimensions for the multiplications to be well defined: We will assume that the dimensions of $C$ are $m \times n$, while the "other" dimension is $k$.
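The dimension constraints can be made concrete with a small helper. This is a sketch under an assumption: it takes the special cases to be the four transpose combinations of a product of A and B, as in a BLAS-style matrix multiply, which the excerpt itself does not list. The function name `gemm_dims` and its flags are hypothetical and not part of any library.

```python
def gemm_dims(a_shape, b_shape, trans_a=False, trans_b=False):
    """Return (m, n, k) for C = op(A) op(B), or raise if the shapes do not conform.

    a_shape and b_shape are (rows, cols) of A and B as stored; trans_a and
    trans_b select op(X) = X^T instead of op(X) = X.
    """
    m, k = (a_shape[1], a_shape[0]) if trans_a else a_shape
    k_b, n = (b_shape[1], b_shape[0]) if trans_b else b_shape
    if k != k_b:
        raise ValueError(f"inner dimensions differ: {k} vs {k_b}")
    return m, n, k  # C must be m x n; k is the "other" dimension


print(gemm_dims((3, 4), (4, 5)))                              # C = A B
print(gemm_dims((4, 3), (5, 4), trans_a=True, trans_b=True))  # C = A^T B^T
```

In every case C is m × n and k is the dimension that is summed over, which is the constraint the paragraph above describes.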
This work is partially supported by the NASA High Performance Computing and Communications Program's Earth and Space Sciences Project under NRA Grant NAG5-2497. Additional support came from the Intel Research Council. Jerrell Watts is being supported by an NSF Graduate Research Fellowship.
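New Beijing Normal University Press Edition, High School Mathematics Elective 2-3, Chapter 3 "Statistical Case Studies": Test Paper (with Answers and Explanations) (5)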

一、选择题1.在一次对性别与是否说谎有关的调查中,得到如下数据,根据表中数据判断如下结论中正确的是()A.在此次调查中有95%的把握认为是否说谎与性别有关B.在此次调查中有99%的把握认为是否说谎与性别有关C.在此次调查中有99.5%的把握认为是否说谎与性别有关D.在此次调查中没有充分证据显示说谎与性别有关2.某高校为调查学生喜欢“应用统计”课程是否与性别有关,随机抽取了选修课程的55名学生,得到数据如下表:临界值参考:(参考公式:22()()()()()n ad bcKa b c d a c b d-=++++,其中n a b c d=+++)参照附表,得到的正确结论是()A.在犯错误的概率不超过0.1%的前提下,认为“喜欢“应用统计”课程与性别有关”B .在犯错误的概率不超过0.1%的前提下,认为“喜欢“应用统计”课程与性别无关”C .有99.99%以上的把握认为“喜欢“应用统计”课程与性别有关”D .有99.99%以上的把握认为“喜欢“应用统计”课程与性别无关”3.为了调查某校高二学生的身高是否与性别有关,随机调查该校64名高二学生,得到2×2列联表如表:附:K 2()()()()2()n ad bc a b c d a c b d -=++++由此得出的正确结论是( )A .在犯错误的概率不超过0.01的前提下,认为“身高与性别无关”B .在犯错误的概率不超过0.01的前提下,认为“身高与性别有关”C .有99.9%的把握认为“身高与性别无关”D .有99.9%的把握认为“身高与性别有关” 4.已知两个统计案例如下:①为了探究患肺炎与吸烟的关系,调查了339名50岁以上的人,调查结果如下表:②为了解某地母亲与女儿身高的关系,随机测得10对母女的身高如下表:母亲身高(cm) 159 160160 163 159 154 159 158 159 157 女儿身高(cm) 158 159 160 161 161 155 162 157 162 156则对这些数据的处理所应用的统计方法是( ) A .①回归分析,②取平均值 B .①独立性检验,②回归分析 C .①回归分析,②独立性检验D .①独立性检验,②取平均值5.某科研机构为了研究中年人秃发与患心脏病是否有关,随机调查了一些中年人的情况,具体数据如表,根据表中数据则可判定秃发与患心脏病有关,那么这种判定出错的可能性为( ) 患心脏病情况秃发情况 患心脏病无心脏病 秃发 20 300 不秃发5450A .0.1B .0.05C .0.01D .0.996.设导弹发射的事故率为0.01,若发射10次,其出事故的次数为ξ,则下列结论正确的是 ( ) A .0.1E ξ=B .•01D ξ=C .10()0.01?0.99k k P k ξ-==D .1010()0.99?0.01kkkP k C ξ-==7.某研究型学习小组调查研究学生使用智能手机对学习的影响.部分统计数据如下表:附表:经计算2K 的观测值10k =,则下列选项正确的是( ) A .有99.5%的把握认为使用智能手机对学习有影响 B .有99.5%的把握认为使用智能手机对学习无影响C .有99.9%的把握认为使用智能手机对学习有影响D .有99.9%的把握认为使用智能手机对学习无影响8.为了检验设备M 与设备N 的生产效率,研究人员作出统计,得到如下表所示的结果,则( )附:参考公式:22()()()()()n ad bc K a b c d a c b d -=++++,其中n a b c d =+++.A .有90%的把握认为生产的产品质量与设备的选择具有相关性B .没有90%的把握认为生产的产品质量与设备的选择具有相关性C .可以在犯错误的概率不超过0.01的前提下认为生产的产品质量与设备的选择具有相关性D .不能在犯错误的概率不超过0.1的前提下认为生产的产品质量与设备的选择具有相关性 9.为考察数学成绩与物理成绩的关系,在高二随机抽取了300名学生,得到下面的列联表:现判断数学成绩与物理成绩有关系,则犯错误的概率不超过 ( ) A .0.005B .0.01C .0.02D .0.0510.以下四个命题中:①某地市高三理科学生有15000名,在一次调研测试中,数学成绩ξ服从正态分布()2100,N σ,已知()801000.40P ξ<≤=,若按成绩分层抽样的方式抽取100分试卷进行分析,则应从120分以上(包括120分)的试卷中抽取15分;②已知命题:p x ∀∈R ,sin 1x ≤,则:p x ⌝∃∈R ,sin 1x >;③在[]4,3-上随机取一个数m ,能使函数()222f x x mx =++在R 上有零点的概率为37; 
④在某次飞行航程中遭遇恶劣气候,用分层抽样的20名男乘客中有5名晕机,12名女乘客中有8名晕机,在检验这些乘客晕机是否与性别有关时,采用独立性检验,有97%以上的把握认为与性别有关.()2P k k ≥0.15 0.1 0.05 0.025 0k 2.0722.7063.8415.024其中真命题的序号为( ) A .①②③ B .②③④C .①②④D .①③④11.有下列数据: x123y35.9912.01下列四个函数中,模拟效果最好的为( ) A .B .C .D .12.某家具厂的原材料费支出x 与销售量y (单位:万元)之间有如下数据,根据表中提供的全部数据,用最小二乘法得出y 与x 的线性回归方程为ˆ8ˆy x b =+,则^b为( ) x 2 4 5 6 8 y2535605575A .5B .15C .10D .20二、填空题13.回归方程ˆˆ 2.50.2x y=+在样本(4,1.2)处的残差为________. 14.如果根据性别与是否爱好运动的列联表得到K 2≈3.852>3.841,则判断性别与是否爱好运动有关,那么这种判断犯错的可能性不超过________. 15.若两个分类变量X 与Y 的列联表为:y 1 y 2 x 1 10 15 x 24016则“X 与Y 之间有关系”这个结论出错的可能性为________. 16.给出下列命题:①线性相关系数越大,两个变量的线性相关越强;反之,线性相关性越弱; ②由变量和的数据得到其回归直线方程:,则一定经过;③从越苏传递的产品生产流水线上,质检员每10分钟从中抽取一件产品进行某项指标检测,这样的抽样是分层抽样;④在回归分析模型中,残差平方和越小,说明模型的拟合效果越好; ⑤在回归直线方程中,当解释变量每增加一个单位时,预报变量增加0.1个单位,其中真命题的序号是___________.17.某单位为了了解用电量y 度与气温x ℃之间的关系,随机统计了某4天的用电量与当天气温. 气温(℃)14 12 86用电量(度) 22 26 34 38由表中数据得线性方程x b a yˆˆ+=中2ˆ-=b ,据此预测当气温为5℃时,用电量的度数约为 .18.以下4个命题中,正确命题的序号为_________.①“两个分类变量的独立性检验”是指利用随机变量2K 来确定是否能以给定的把握认为“两个分类变量有关系”的统计方法; ②将参数方程cos sin x y θθ=⎧⎨=⎩(θ是参数,[]0,θπ∈)化为普通方程,即为221x y +=;③极坐标系中,22,3A π⎛⎫⎪⎝⎭与()3,0B 19 ④推理:“因为所有边长相等的凸多边形都是正多边形,而菱形是所有边长都相等的凸多边形,所以菱形是正多边形”,推理错误在于“大前提”错误.19.在吸烟与患肺病这两个分类变量的计算中,下列说法正确的是_____________. 
①若K 2的观测值满足K 2≥6.635,我们有99%的把握认为吸烟与患肺病有关系,那么在100个吸烟的人中必有99人患有肺病;②从独立性检验可知有99%的把握认为吸烟与患病有关系时,我们说某人吸烟,那么他有99%的可能患有肺病;③从统计量中得知有95%的把握认为吸烟与患肺病有关系,是指有5%的可能性使得推断出现错误.20.某班主任对全班50名学生的积极性和对待班级工作的态度进行了调查,统计数据如下表所示:则至少有________的把握认为学生的学习积极性与对待班级工作的态度有关.(请用百分数表示).注:独立性检验界值表三、解答题21.为了解某企业生产的某产品的年利润与年广告投入的关系,该企业对最近一些相关数据进行了调查统计,得出相关数据见下表:根据以上数据,研究人员分别借助甲、乙两种不同的回归模型,得到两个回归方程:方程甲,2(1)(1) 2.75yb x =-+^^;方程乙,(2)1.6yc x =-^^.(1)求b ^(结果精确到0.01)与c ^的值.(2)为了评价两种模型的拟合效果,完成以下任务.①完成下表(备注:i i ie y y =-^^,i e ^称为相应于点(x i ,y i )的残差);②分别计算模型甲与模型乙的残差平方和Q1及Q2,并通过比较Q1,Q2的大小,判断哪个模型拟合效果更好.22.为考察某种药物预防禽流感的效果,进行动物家禽试验,调查了100个样本,统计结果为:服用药的共有60个样本,服用药但患病的仍有20个样本,没有服用药且未患病的有20个样本.(1)根据所给样本数据画出22⨯列联表;(2)请问能有多大把握认为药物有效?附公式:()()()()()22=n ad bcKa b c d a c b d-++++.23.为提高全民身体素质,加强体育运动意识,某校体育部从全校随机抽取了男生、女生各100人进行问卷调查,以了解学生参加体育运动的积极性是否与性别有关,得到如下列联表(单位:人):男生 70 30 100 女生 60 40 100 合计13070200(1)根据以上数据,判断能否在犯错误的概率不超过10%的情况下认为该校参加体育运动的积极性与性别有关;(2)用频率估计概率,现从该校所有女生中随机抽取3人.记被抽取的3人中“偶尔运动或不运动”的人数为X ,求X 的分布列、期望()E X 和方差()D X .附:22()()()()()n ad bc K a b c d a c b d -=++++,其中n a b c d =+++.()20P K k 0.150.10 0.05 0.025 0k 2.0722.7063.8415.02424.在中国,不仅是购物,而且从共享单车到医院挂号再到公共缴费,日常生活中几乎全部领域都支持手机支付,出门不带现金的人数正在迅速增加.某机构随机抽取了一组市民,并统计他们各自出门随身携带现金(单位:元)的情况,制作出如图所示的茎叶图.规定:随身携带的现金在100元以下(不含100元)的为“手机支付族”,其他为“非手机支付族”.(1)根据茎叶图的数据,完成答题卡上的22⨯列联表;男生 女生 合计手机支付族非手机支付族合计45(2)根据(1)中的列联表,判断是否有99%的把握认为“手机支付族”与“性别”有关. 
附:()20P K k ≥0.050 0.010 0.001 0k 3.8416.63510.82822()()()()()()n ad bc K n a b c d a b c d a c b d -==+++++++25.某地为响应国家“脱贫攻坚战”的号召,帮助贫困户脱贫,安排贫困人员参与工厂生产.现用A ,B 两条生产线生产某产品.为了检测该产品的某项质量指标值(记为Z ),现随机抽取这两种这两条生产线的产品各100件,由检测结果得到如下频率分布直方图.(Ⅰ)分别估计A ,B 两条生产线的产品质量指标值的平均数(同一组数据中的数据用该组区间的中点值作代表),从平均数结果看,哪条生产线的质量指标值更好?(Ⅱ)计算A 生产线的产品质量指标值的众数和中位数(中位数计算结果精确到小数点后两位).(Ⅲ)该公司规定当92Z ≥时,产品为超优品.根据所检测的结果填写22⨯列联表,并判断是否有95%的把握认为“生产超优品是否与生产线有关”.附:()()()()()22n ad bc K a b c d a c b d -=++++,n a b c d =+++()20P K k ≥0.050 0.010 0.005 0.001 0k3.8416.6357.87910.82822⨯列联表26.某学生兴趣小组随机调查了某市100天中每天的空气质量等级和当天到某公园锻炼的人次,整理数据得到下表(单位:天):(2)求一天中到该公园锻炼的平均人次的估计值(同一组中的数据用该组区间的中点值为代表);(3)若某天的空气质量等级为1或2,则称这天“空气质量好”;若某天的空气质量等级为3或4,则称这天“空气质量不好”.根据所给数据,完成下面的2×2列联表,并根据列联表,判断是否有95%的把握认为一天中到该公园锻炼的人次与该市当天的空气质量有关?附:2()()()()()n ad bc K a b c d a c b d -=++++,【参考答案】***试卷处理标记,请不要删除一、选择题 1.D 解析:D 【解析】根据上表数据可求得20.027 1.323k ≈<,再结合课本上的概率附表可知在此次调查中没有充分证据显示说谎与性别有关,故选D2.A解析:A 【分析】计算212.010.828K ≈>,对比临界值表得到答案. 【详解】()222552020105()53912.010.828()()()()3025302545n ad bc K a b c d a c b d ⨯-⨯-===≈>++++⨯⨯⨯,故在犯错误的概率不超过0.1%的前提下,认为“喜欢“应用统计”课程与性别有关”. 故选:A. 【点睛】本题考查了独立性检验,意在考查学生的计算能力和应用能力.3.D解析:D 【分析】根据22⨯列联表,计算2k ,与临界值表比较即可得出结论. 【详解】K 的观测值:K 2264(862426)34303232⨯⨯-⨯=≈⨯⨯⨯20.330;由于20.330>10.828,∴有99.9%的把握认为“身高与性别有关”,即在犯错误的概率不超过0.001的前提下,认为“身高与性别有关” 故选:D . 【点睛】本题主要考查了独立性检验的应用问题,K 2的计算,22⨯列联表,考查了运算能力,属于中档题.4.B解析:B 【分析】根据独立性检验和回归分析的概念,即可作出判定,得到答案. 【详解】由题意,独立性检验通常是研究两个分类变量之间是否有关系,所以①采用独立性检验, 回归分析通常是研究两个具有相关关系的变量的相关程度,②采用回归分析, 综上可知①是独立性检验,②是回归分析,故选B . 【点睛】本题主要考查了独立性检验和回归分析的概念及其判定,其中解答中熟记独立性检验和回归分析的概念是解答的关键,着重考查了分析问题和解答问题的能力,属于基础题.5.C解析:C 【分析】首先列出22⨯联表,通过计算出2K 的值,然后作统计推断,得出正确的结论. 【详解】列出22⨯联表如下图所示:()277520450530015.96825750455320K ⨯⨯-⨯=≈⨯⨯⨯ 6.635>,故判断错误的概率不超过0.01,故选C .【点睛】本小题主要考查补全22⨯联表,考查2K 的计算以及独立性检验的概念,属于基础题. 独立性检验的步骤:(1)根据样本数据制成22⨯列联表;(2)根据公式22n ad bc K a b c d a c b d -=++++()()()()(),计算2K 的观测值;(3)比较2K 与临界值的大小关系作统计推断. 6.A解析:A 【解析】 【分析】由题意知本题是在相同的条件下发生的试验,发射的事故率都为0.01,实验的结果只有发生和不发生两种结果,故本题符合独立重复试验,由独立重复试验的期望公式得到结果. 
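[Solution] The problem describes trials carried out under identical conditions, and every launch has an accident rate of 0.01, so it fits the model of independent repeated trials, that is, $\xi \sim B(10, 0.01)$. Therefore $E(\xi) = 10 \times 0.01 = 0.1$. Choose A.
[Comment] Problems on the distribution and expectation of a discrete random variable rely mainly on the relevant probability concepts and operations. Also check which distribution the random variable follows; if it follows a special distribution, the computation is much simpler.

7. A
[Analysis] Use the observed value $k$ of $K^2$ and the reasoning of the independence test to reach the correct conclusion.
[Solution] The observed value of $K^2$ is $k = 10 > 7.879$, and the corresponding probability is $0.005 = 0.5\%$. By the reasoning of the independence test, we are 99.5% confident that using a smartphone affects learning. Choose A.
[Comment] The conclusion of an independence test is probabilistic. It only says how likely the conclusion is to hold and cannot confirm it with certainty, which is why the table of critical values exists. Keep this in mind when analyzing a problem and do not draw a definite conclusion, otherwise the result of the statistical computation may be misinterpreted.

8. A
[Explanation] Substituting the data in the table into the formula gives $K^2 = \dfrac{100 \times (48 \times 7 - 2 \times 43)^2}{50 \times 50 \times 91 \times 9} \approx 3.053$. Since $3.053 > 2.706$, we are 90% confident that product quality is related to the choice of equipment, so choose A.

9. D
[Explanation] The observed value of $K^2$ is $k = \dfrac{300 \times (37 \times 143 - 35 \times 85)^2}{122 \times 178 \times 72 \times 228} \approx 4.514 > 3.841$, so with a probability of error not exceeding 0.05 we conclude that mathematics scores are related to physics scores. Choose D.

10. B
[Explanation] For ①: in a survey test, the mathematics score $\xi$ follows the normal distribution $N(100, \sigma^2)$, so the distribution of $\xi$ is symmetric about $\xi = 100$. Since $P(80 < \xi \le 100) = 0.40$, we get $P(\xi > 120) = P(\xi < 80) = 0.5 - 0.40 = 0.1$, so the number of students in the class scoring above 120 is $0.1 \times 100 = 10$, and ① is wrong.
For ②: given the proposition $p$: $\forall x \in \mathbf{R}$, $\sin x \le 1$, its negation is $\neg p$: $\exists x \in \mathbf{R}$, $\sin x > 1$, so ② is correct.
For ③: requiring the discriminant of $f(x)$ to be nonnegative gives $m \le -2$ or $m \ge 2$, so if a number $m$ is chosen at random from $[-4, 3]$, the probability that $f(x)$ has a zero on $\mathbf{R}$ is $\dfrac{3}{7}$, and ③ is correct.
For ④: fill in the $2 \times 2$ contingency table as follows:
The observed value of $K^2$ is then $k = \dfrac{32 \times (15 \times 8 - 5 \times 4)^2}{20 \times 12 \times 19 \times 13} \approx 5.398 > 5.024$, so we are more than 97% confident that airsickness is related to gender, and ④ is correct.
Choose B.

11. A
[Explanation] Substitute $x = 1, 2, 3$ into each candidate equation and compute $y$. The model whose values are closest to the observed $y$ fits best, which shows that A fits best. Choose A.
Topic: choosing a nonlinear regression equation.

12. C
[Solution] From the data, $\bar{x} = \dfrac{2+4+5+6+8}{5} = 5$ and $\bar{y} = \dfrac{25+35+60+55+75}{5} = 50$. The regression line passes through the sample center $(\bar{x}, \bar{y}) = (5, 50)$, and substituting this point into the regression equation determines $\hat{b}$. Choose C.

II. Fill-in-the-blank questions

13. $-9$
[Analysis] Compute directly from the definition of a residual.
[Solution] When $x = 4$, $\hat{y} = 2.5 \times 4 + 0.2 = 10.2$, so $1.2 - 10.2 = -9$. The residual of the regression equation $\hat{y} = 2.5x + 0.2$ at the sample point $(4, 1.2)$ is therefore $-9$.
Answer: $-9$
[Comment] This question mainly tests the concept of a residual and computational skill. It is an easy question.

14. 5%
[Explanation] Since $P(K^2 \ge 3.841) \approx 0.05$, the probability of error in concluding that gender is related to liking sports does not exceed 5%.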
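Comment: compute $K^2$ from the chi-square formula and compare it with the reference values to determine the probability.

15. 1%
[Explanation] The observed value of $K^2$ is $k = \dfrac{(10+15+40+16)(10 \times 16 - 40 \times 15)^2}{(10+15)(40+16)(10+40)(15+16)} \approx 7.227$. Since $P(K^2 \ge 6.635) \approx 1\%$, the probability of error in concluding that "$x$ and $y$ are related" is 1%.

16. ②④⑤
[Explanation] The linear correlation coefficient is $r$. The closer $|r|$ is to 1, the stronger the linear correlation between the two variables; the closer $|r|$ is to 0, the weaker the linear correlation. So ① is wrong.
The regression line $\hat{y} = \hat{b}x + \hat{a}$ obtained from data on the variables $x$ and $y$ must pass through $(\bar{x}, \bar{y})$, so ② is correct.
Taking one product every 10 minutes for inspection of an index is systematic sampling, so ③ is wrong.
The coefficient of determination $R^2$ describes how well the regression fits. Its formula is $R^2 = 1 - \dfrac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2}$, and in a linear model with one explanatory variable it equals the square of the correlation coefficient $r$. A larger $R^2$ means a smaller residual sum of squares, that is, a better fit, so ④ is correct.
The regression line equation indicates that when the explanatory variable increases by one unit, the predicted variable increases by 0.1 units, so ⑤ is correct. The answer is ②④⑤.
Topics: linear correlation, linear regression line equation, sampling methods, residuals.

17.
[Explanation] The regression line passes through the point of sample means $(\bar{x}, \bar{y})$, with $\bar{x} = 10$ and $\bar{y} = 30$. Substituting $\hat{b} = -2$ into $\hat{y} = \hat{b}x + \hat{a}$ gives $\hat{a} = 50$. From $\hat{y} = 50 - 2x$, when the temperature is 5℃ the electricity consumption is approximately 40 kWh.
Topic: properties and applications of the regression equation.

18. ①③④
[Explanation] ① is an application of the independence test, so ① is correct.
In ②, since $\theta \in [0, \pi]$, we have $0 \le y \le 1$, so the curve is clearly a semicircle, and ② is wrong.
In ③, the distance formula for two points in polar coordinates gives $|AB|^2 = \rho_1^2 + \rho_2^2 - 2\rho_1\rho_2\cos(\theta_1 - \theta_2) = 4 + 9 - 12 \times \left(-\dfrac{1}{2}\right) = 19$, so $|AB| = \sqrt{19}$, and ③ is correct.
In ④, "all convex polygons with equal side lengths are regular polygons" is the major premise, and it is false: squeezing a regular polygon so that it deforms but stays convex gives a counterexample. So ④ is correct.
The answer is ①③④.

19. ③
[Explanation] Inferring that among 100 smokers there must be 99 with lung disease is wrong, which rules out ①. Being 99% confident that smoking is associated with lung disease and having a 99% chance of lung disease are two different concepts, which rules out ②. The answer is ③.

20. 99.9%
[Analysis] Compute $K^2$ from the $2 \times 2$ contingency table; the result follows from $K^2 > 10.828$.
[Solution] $K^2 = \dfrac{50 \times (18 \times 19 - 7 \times 6)^2}{25 \times 25 \times 24 \times 26} \approx 11.538 > 10.828$, so we are at least $1 - 0.1\% = 99.9\%$ confident that students' enthusiasm for learning is associated with their attitude toward class work.
Answer: 99.9%.
[Comment] This question tests solving an independence test problem and applying the basic formula.

III. Free-response questions

21. (1) $\hat{b} \approx 0.33$, $\hat{c} = 2$; (2) ① table in the solution, ② $Q_1 < Q_2$, so model A fits better.
[Analysis] (1) For equation A: let $t = (x-1)^2$, so that $\hat{y}^{(1)} = \hat{b}t + 2.75$. Substitute the data to find $\bar{t}$ and $\bar{y}$, then substitute into the equation to find $\hat{b}$. For equation B, find $\bar{x}$ and substitute into the equation to find $\hat{c}$.
(2) ① Substitute the data into each equation and compute to complete the table. ② Compute the residual sums of squares $Q_1$ and $Q_2$ for model A and model B and compare them.
[Solution] (1) For equation A: let $t = (x-1)^2$, so that $\hat{y}^{(1)} = \hat{b}t + 2.75$. Then $\bar{t} = \dfrac{1}{5}(1+4+9+16+25) = 11$ and $\bar{y} = \dfrac{1}{5}(3+4+6+8+11) = 6.4$, so $6.4 = \hat{b} \times 11 + 2.75$, which gives $\hat{b} \approx 0.33$.
For equation B: $\bar{x} = \dfrac{1}{5}(2+3+4+5+6) = 4$, so $6.4 = \hat{c} \times 4 - 1.6$, which gives $\hat{c} = 2$.
(2) ① Computation gives the following table:
② Since $Q_1 < Q_2$, model A fits better.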
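The residual comparison in answer 21 can be checked numerically. The sketch below is only an illustration: the data table did not survive in this copy, so the points x = 2, ..., 6 and y = 3, 4, 6, 8, 11 are an assumption inferred from the means used in the solution, paired in the order listed. The function names are hypothetical.

```python
# Assumed data: inferred from the means in the solution, not from the lost table.
xs = [2, 3, 4, 5, 6]
ys = [3, 4, 6, 8, 11]

def model_a(x, b=0.33):
    """Equation A with the rounded estimate b = 0.33."""
    return b * (x - 1) ** 2 + 2.75

def model_b(x, c=2.0):
    """Equation B with the estimate c = 2."""
    return c * x - 1.6

def residual_sum_of_squares(model):
    """Sum of squared residuals e_i = y_i - model(x_i) over the assumed data."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys))

q1 = residual_sum_of_squares(model_a)
q2 = residual_sum_of_squares(model_b)
print(f"Q1 = {q1:.4f}, Q2 = {q2:.4f}, model A fits better: {q1 < q2}")
```

Under these assumptions the smaller residual sum of squares belongs to model A, which agrees with the conclusion stated in the answer.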
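[Comment] This question tests finding and applying a regression line and computing and analyzing squared residuals. The computation is fairly heavy. It tests analysis, comprehension, and computation, and is of medium difficulty.

22. (1) contingency table in the solution; (2) we are about 90% confident that the drug is effective.
[Analysis] (1) 60 samples received the drug, 20 samples received the drug but still became ill, and 20 samples did not receive the drug and did not become ill. From these figures, set up the table and fill in the data to obtain the contingency table.
(2) Substitute the contingency table data into the formula, compute the observed value, and compare it with the critical value to reach the result.
[Solution] (1) 60 samples received the drug, 20 samples received the drug but still became ill, and 20 samples did not receive the drug and did not become ill, which gives the contingency table.
(2) $K^2 = \dfrac{n(ad-bc)^2}{(a+b)(c+d)(a+c)(b+d)} = \dfrac{100 \times (40 \times 20 - 20 \times 20)^2}{60 \times 40 \times 60 \times 40} \approx 2.778$. Since $P(K^2 \ge 2.706) = 0.10$, we are about 90% confident that the drug is effective.
[Comment] This question tests completing a contingency table and the independence test. It tests data analysis and computation and is a basic question.

23. (1) It cannot be concluded, with a probability of error not exceeding 10%, that enthusiasm for sports at this school is related to gender; (2) distribution in the solution, $E(X) = \dfrac{6}{5}$, $D(X) = \dfrac{18}{25}$.
[Analysis] (1) Substitute into the formula for $K^2$ to reach the conclusion;
(2) $X$ follows a binomial distribution. Compute each probability to obtain the distribution of $X$, then substitute the data to find the expectation and variance.
[Solution] (1) From the contingency table, $k = \dfrac{200 \times (70 \times 40 - 60 \times 30)^2}{130 \times 70 \times 100 \times 100} = \dfrac{200}{91} \approx 2.198$. Since $2.198 < 2.706$, it cannot be concluded, with a probability of error not exceeding 10%, that enthusiasm for sports at this school is related to gender.
(2) By the problem statement, $X \sim B\left(3, \dfrac{2}{5}\right)$, and the possible values of $X$ are 0, 1, 2, 3:
$P(X=0) = C_3^0 \left(\dfrac{3}{5}\right)^3 = \dfrac{27}{125}$,
$P(X=1) = C_3^1 \times \dfrac{2}{5} \times \left(\dfrac{3}{5}\right)^2 = \dfrac{54}{125}$,
$P(X=2) = C_3^2 \times \left(\dfrac{2}{5}\right)^2 \times \dfrac{3}{5} = \dfrac{36}{125}$,
$P(X=3) = C_3^3 \left(\dfrac{2}{5}\right)^3 = \dfrac{8}{125}$.
So the distribution of $X$ is:

$X$     0         1         2         3
$P$     27/125    54/125    36/125    8/125

$E(X) = 3 \times \dfrac{2}{5} = \dfrac{6}{5}$, $D(X) = 3 \times \dfrac{2}{5} \times \left(1 - \dfrac{2}{5}\right) = \dfrac{18}{25}$.
[Comment] This question mainly tests the principle of the independence test and using the binomial distribution to find the expectation and variance. It is of medium difficulty.

24. (1) contingency table in the solution; (2) we are 99% confident that being a "mobile payer" is related to gender.
[Analysis] (1) Count the data given in the stem-and-leaf plot to obtain the contingency table;
(2) compute $K^2$ to reach the conclusion.
[Solution] (1)
(2) Since $K^2 = \dfrac{45 \times (15 \times 16 - 4 \times 10)^2}{19 \times 26 \times 25 \times 20} \approx 7.287 > 6.635$, we are 99% confident that being a "mobile payer" is related to gender.
[Comment] This question tests contingency tables and the independence test. Reading the stem-and-leaf plot correctly is the key to the solution.

25. (I) 81.68; 80.4; line A has the better quality index; (II) the mode is 80; the median is approximately 81.58; (III) contingency table in the solution; yes.
[Analysis] (I) Represent the data in each class by the midpoint of the class interval and combine the midpoints with the frequencies to compute the mean;
(II) the midpoint of the class with the highest frequency is the mode. For the median, find the point at which the cumulative frequency reaches 0.5; it lies in the interval $[76, 84]$.
(III) Read the data off the frequency distribution histogram to obtain the contingency table, then compute $K^2$ to reach the conclusion.
[Solution] (I) Let $\bar{x}$ and $\bar{y}$ be the means of the quality index for lines A and B. From the histogram,
$\bar{x} = (64 \times 0.00625 + 72 \times 0.01875 + 80 \times 0.05375 + 88 \times 0.035 + 96 \times 0.01125) \times 8 = 81.68$,
and similarly $\bar{y} = 80.4$. Since $\bar{x} > \bar{y}$, line A has the better quality index.
(II) The mode of the quality index for line A is 80.
From the frequency distribution histogram for line A, the first two classes have total frequency $0.00625 \times 8 + 0.01875 \times 8 = 0.2 < 0.5$, and the first three classes have total frequency $0.00625 \times 8 + 0.01875 \times 8 + 0.05375 \times 8 = 0.63 > 0.5$, so the median lies in the interval $[76, 84]$. Call it $x$. Then $0.00625 \times 8 + 0.01875 \times 8 + 0.05375 \times (x - 76) = 0.5$, which gives $x \approx 5.58 + 76 = 81.58$, so the median of the quality index for line A is approximately 81.58.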
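The mean and median estimates for line A in answer 25 can be reproduced with a short script. This is a minimal sketch, assuming five classes of width 8 with midpoints 64, 72, 80, 88, 96 and the bar heights (frequency divided by class width) quoted in the solution; the histogram itself is not reproduced here, and the helper name `histogram_median` is hypothetical.

```python
WIDTH = 8
midpoints = [64, 72, 80, 88, 96]
heights = [0.00625, 0.01875, 0.05375, 0.035, 0.01125]  # frequency / class width

# Class frequencies are bar areas: height times width.
freqs = [h * WIDTH for h in heights]

# Mean: each class is represented by its midpoint.
mean = sum(m * f for m, f in zip(midpoints, freqs))

def histogram_median(midpoints, freqs, width):
    """Point where the cumulative frequency reaches 0.5, by linear interpolation."""
    cumulative = 0.0
    for m, f in zip(midpoints, freqs):
        if cumulative + f >= 0.5:
            left_edge = m - width / 2
            return left_edge + (0.5 - cumulative) / f * width
        cumulative += f
    raise ValueError("frequencies sum to less than 0.5")

median = histogram_median(midpoints, freqs, WIDTH)
print(f"mean = {mean:.2f}, median = {median:.2f}")
```

The interpolation step is the same equation solved by hand in part (II): the cumulative frequency before the class $[76, 84]$ is 0.2, and the remaining 0.3 is spread over a class with frequency 0.43.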
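(III) $K^2 = \dfrac{200 \times (9 \times 98 - 2 \times 91)^2}{100 \times 100 \times 11 \times 189} \approx 4.714 > 3.841$, so we are 95% confident that "producing super-premium products is related to the production line".
[Comment] This question tests the frequency distribution histogram, using it to estimate the mode, median, and mean, and the independence test. It tests data handling and computation and is of medium difficulty.

26. (1) The probabilities that the city's air quality grade on a given day is 1, 2, 3, 4 are 0.43, 0.27, 0.21, 0.09 respectively; (2) 350; (3) yes, see the solution for the reasoning.
[Analysis] (1) Use the frequency table to compute the probabilities that the city's air quality grade on a given day is 1, 2, 3, 4;
(2) multiply the midpoint of each class by its frequency count, add the products, and divide by 100;
(3) complete the $2 \times 2$ contingency table from the data in the table, compute the observed value of $K^2$, and use the table of critical values to reach the conclusion.
[Solution] (1) From the frequency table, the probability that the air quality grade on a given day is 1 is $\dfrac{2+16+25}{100} = 0.43$, that it is 2 is $\dfrac{5+10+12}{100} = 0.27$, that it is 3 is $\dfrac{6+7+8}{100} = 0.21$, and that it is 4 is $\dfrac{7+2+0}{100} = 0.09$;
(2) from the frequency table, the average number of visits to the park for exercise per day is $\dfrac{100 \times 20 + 300 \times 35 + 500 \times 45}{100} = 350$;
(3) the $2 \times 2$ contingency table is as follows:
$K^2 = \dfrac{100 \times (33 \times 8 - 37 \times 22)^2}{55 \times 45 \times 70 \times 30} \approx 5.820 > 3.841$, so we are 95% confident that the number of visits to the park for exercise on a day is related to the city's air quality that day.
[Comment] This question tests computing frequencies and the mean from a frequency table, together with applications of the independence test. It tests data handling and is a basic question.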
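Most answers above apply the same two steps: compute $K^2$ from the four cells of a $2 \times 2$ table, then compare it with a tabulated critical value. A minimal Python sketch of that procedure follows. The function names are hypothetical, the critical values are the ones listed in the appendices of this problem set, and the cell counts in the usage lines are read off the worked formulas in answers 23 and 26.

```python
def k_squared(a, b, c, d):
    """K^2 = n(ad - bc)^2 / ((a+b)(c+d)(a+c)(b+d)) for a 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# (probability of error, critical value k0), as listed in the appendices.
CRITICAL_VALUES = [(0.10, 2.706), (0.05, 3.841), (0.01, 6.635), (0.001, 10.828)]

def smallest_error_probability(k):
    """Smallest tabulated error probability whose critical value k reaches, or None."""
    passed = [alpha for alpha, k0 in CRITICAL_VALUES if k >= k0]
    return min(passed) if passed else None

k23 = k_squared(70, 30, 60, 40)   # answer 23: sports enthusiasm vs. gender
k26 = k_squared(33, 22, 37, 8)    # answer 26: park visits vs. air quality
print(round(k23, 3), smallest_error_probability(k23))
print(round(k26, 3), smallest_error_probability(k26))
```

A result of `None` means the statistic does not reach even the weakest tabulated threshold, which matches answer 23; answer 26 reaches the 0.05 threshold, that is, 95% confidence.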
Data collection and conclusions

1. The graph above shows the results of a controlled experiment designed by a scientist to determine the effect of magnetic field strength on the growth of sunflower plants. 500 young sunflower plants were randomly assigned to the control or experimental group. In the control group, the scientist grew 250 sunflower plants under normal local geomagnetic field conditions (30 microteslas). In the experimental group, the scientist grew 250 sunflower plants identically except under a lower geomagnetic field (20 microteslas). Based on the results of this experiment, which conclusion is NOT valid?
A. Sunflower plants grown under lower magnetic field conditions were more likely to weigh more than sunflower plants grown under normal magnetic field conditions.
B. There is evidence of an association between the strength of magnetic field and height in sunflower plants.
C. Sunflower plants grown under lower magnetic field conditions were more likely to be taller than sunflower plants grown under normal magnetic field conditions.
D. Members of the control group were more likely to grow to less than 100 inches than members of the experimental group.
Correct answer: A Difficulty level: 2

2. A researcher wants to conduct a survey to gauge United States (U.S.) voters' opinions about the U.S. Congress. Which of the following should NOT be a component of this survey?
A. The researcher collects data from the survey takers.
B. The researcher analyzes data from the survey takers.
C. The researcher distributes the survey to 10,000 randomly selected U.S. citizens aged 18 and older.
D. The researcher distributes the survey to 10,000 residents of a Washington D.C. neighborhood.
Correct answer: D Difficulty level: 2

3. A scientist wants to collect data about the effects of gravity on the growth of soybean plants. To test her hypothesis that soybeans grow better in a zero-gravity setting, she randomly assigns the plants into one of two groups.
The first group is grown in typical soybean growing conditions in a greenhouse on earth, and the second group is grown in a zero-gravity, yet otherwise identical greenhouse in a space station. Which of the following is the best description of the research design for this study?
A. Controlled experiment
B. Observational study
C. Sample survey
D. None of the above
Correct answer: A Difficulty level: 2

4. A researcher representing a city government wants to measure public opinion about recycling by asking 1,000 randomly selected residents a series of questions on the subject. Which of the following is the best description of the research design for this study?
A. Observational study
B. Sample survey
C. Controlled experiment
D. None of the above
Correct answer: B Difficulty level: 2

5. In order to determine whether children who have just watched cartoons will perform better on cognitive tasks than children who have not just watched cartoons, researchers randomly divided 60 preschoolers into three groups. For nine minutes, one group watched a rapid-paced cartoon, one group watched a slower-paced educational program, and one group colored. They then administered standardized tests to determine the immediate impact of the children's previous nine minutes of activity. Which of the following is the best description of this type of research design?
A. An observational study, a study in which investigators observe subjects and measure variables of interest without assigning treatments to the subjects.
B. A controlled experiment, a study in which an investigator separates subjects into a control group that does not receive a treatment and an experimental group that receives a treatment, and then observes the effect of the treatment on the experimental group.
C. A sample survey, a study that obtains data from a subset of a population, usually through a questionnaire or interview, in order to estimate population attributes.
D. None of the above
Correct answer: B Difficulty level: 2

6. The table above shows the results of an observational study designed to observe the social media habits of different age groups of internet users in the U.S. between 2005 and 2013. Based on the results of this study, which of the following conclusions are valid?
I: In each year of the study, U.S. internet users aged 18-29 were more likely to use social media than any other age group in the study.
II: Over the course of the study, there was growth in the percentage of U.S. internet users that use social media across all of the age groups observed.
III: The rate of social media use by U.S. internet users will continue to rise in the future.
IV: Social media was more likely to be used by a U.S. internet user aged 30-49 in 2013 than it was by a U.S. internet user aged 30-49 in 2005.
A. I only
B. I and IV
C. I, II, and IV
D. I, II, III, and IV
Correct answer: C Difficulty level: 3

7. The table above shows the results of a controlled experiment designed to determine the effect that adding sodium chloride to water has on the boiling point of water at sea level.
Based on the results of this experiment, what conclusion is NOT valid when up to three tablespoons of sodium chloride are added to one quart of water?
A. The more sodium chloride that is added to boiling water, the higher the water's boiling temperature becomes.
B. The more sodium chloride that is added to water, the longer the water will take to boil.
C. There is an association between adding sodium chloride to water and an increase in the boiling temperature of water.
D. There is a linear relationship between sodium chloride added to water and the water's boiling temperature.
Correct answer: B Difficulty level: 3

8. Louis Pasteur conducted a famous experiment that addressed the question: "Can microorganisms generate spontaneously?" To replicate the experiment, in the control group, purify water in closed flasks by boiling them, and then let the water sit in the closed flasks at room temperature for a predetermined period of time. In the experimental group, purify water in identical closed flasks by boiling them. However, before letting the experimental group sit at room temperature for the predetermined period of time, break the top stem of these flasks to expose the water to outside elements.
After the predetermined period of time, if no microorganisms are observed in the control flasks and several thousand microorganisms are observed in each experimental flask, which of the following conclusions are valid?
I: When closed off to outside elements, purified water will not spontaneously generate microorganisms.
II: Exposing water to the elements causes the water to become harmful to humans.
III: Breaking the top stem of the experimental flask allowed the microorganisms to enter the purified water.
IV: Not breaking the stem of the control flask prevented microorganisms from entering the purified water within.
A. III only
B. III and IV
C. I, III, and IV
D. I, II, III, and IV
Correct answer: C Difficulty level: 3

9. A writer for a high school newspaper is conducting a survey to estimate the number of students that will vote for a particular candidate in an upcoming student government election. All students at the high school are eligible to vote in the election, and the writer decides to select a sample of students to take the survey. Which of the following sampling methods is most likely to produce valid results?
A. Survey every fifth student to enter the school library.
B. Survey every fifth student to arrive at school one morning.
C. Survey every fifth senior to arrive at school one morning.
D. Survey every fifth student to enter the school stadium for a football game.
Correct answer: B Difficulty level: 3

10. The graph shown above shows the results of an observational study of corn grain yield, in bushels per acre, versus rate of nitrogen fertilizer solution, in pounds per acre, applied to crops.
Based on the results of this study, which conclusion is best supported by the data?
A. ...ing nitrogen in the soil causes greater grain yield.
B. There is evidence of a linear association between the amount of nitrogen applied to the soil and the grain yield.
C. There is evidence of an association between the amount of nitrogen applied to the soil and the grain yield, but the association does not appear to be linear.
D. Low levels of nitrogen in the soil leads to poor grain yield.
Correct answer: C Difficulty level: 3

11. The table above shows the results of a controlled experiment designed to determine the effect of tailgate position on the fuel consumption of a pickup truck. Based on the results of this experiment, which conclusion is NOT valid?
A. The truck needed the least fuel to travel a set distance when its tailgate was all the way up.
B. The truck needed the most fuel to travel a set distance when its tailgate was all the way down.
C. There is an association between the truck's tailgate position and the amount of fuel needed to travel a set distance.
D. A truck driver who drives with the tailgate up will spend less money on fuel than when the truck driver drives with the tailgate down.
Correct answer: D Difficulty level: 4

12. An ecologist measured the population of brown bears in a North American region and the number of deforested acres in the same region since the year 2000. The study concluded that as the population of brown bears steadily decreased, the number of deforested acres steadily increased during the same time period.
Based on this data, which conclusion is valid?
A. The increase in the number of deforested acres in the North American region since 2000 caused the decrease in the brown bear population there during the same time period.
B. The decrease in the brown bear population in the North American region since 2000 caused the increase in the number of deforested acres there during the same time period.
C. There is no evidence of an association between the brown bear population levels in the North American region and the number of deforested acres there in the years since 2000.
D. There is evidence of an association between the brown bear population levels in the North American region and the number of deforested acres there in the years since 2000.
Correct answer: D Difficulty level: 4

13. The graph to the left shows the results of a controlled experiment designed to determine how effective a new toothpaste is at preventing cavities. A researcher randomly selected 1,000 healthy adults with comparable dental habits and records to participate and randomly assigned participants to either the experimental or control group. In the experimental group, 500 participants were asked to use the new toothpaste for a 6 month period. In the control group, the remaining 500 participants were asked to continue using their normal toothpaste during the same 6 month period. Based on the results of this experiment, which conclusion is NOT valid?
A. There is an association between the participants brushing their teeth every day and not developing new cavities.
B. There is an association between using the new toothpaste and not developing new cavities.
C. Four hundred members of the experimental group reported no new cavities during the study.
D. Members of the control group were more likely to develop cavities than members of the experimental group.
Correct answer: A Difficulty level: 4

14. Adapted from "The Role of Deliberate Practice in the Acquisition of Expert Performance," by K. A. Ericsson, R. Th. Krampe, and C. Tesch-Romer, 1993, Psychological Review, 100(3).
In a famous study on the role of practice in the acquisition of expert performance, researchers compared the amount of time spent on solitary practice, based on diaries and retrospective estimates, for four groups of violinists: professional violinists, the best expert violinists, good expert violinists, and the least accomplished expert violinists (lesser experts). Based on the results of this study, which conclusion is best supported by the data?
A. A violinist who practices about 10,000 hours by the age of 20 will become a professional violinist.
B. By the age of 20, the best experts and professional violinists in the study had practiced more than twice as much as the least accomplished violinists.
C. The least accomplished violinists did not practice as much because they became discouraged.
D. There is no evidence of an association between increased solitary practice before the age of 18 and level of expertise as a violinist.
Correct answer: B Difficulty level: 4

15. The above table shows the percentages of the Canadian population as well as the percentage of Canadian hockey players in the National Hockey League (NHL) residing in cities of various sizes. Based on the results of this study, which conclusion is best supported by the evidence?
A. There is evidence that players from mid-sized cities (100,000-499,999) are overrepresented in the NHL.
B. Players from very small communities (<1,000) do not have as many opportunities for elite training as players from larger communities.
C. Cities with populations larger than 500,000 are underrepresented in terms of players in the NHL because players in larger communities face too much competition.
D. Players in large cities have more opportunities for elite training than do players from smaller cities.
Correct answer: A Difficulty level: 4

16. A local TV news station wants to determine how often and through which medium their viewers check the weather.
Which of the following survey methods is most likely to produce valid results?
A. Ask a random sample of their viewers how much they enjoy the weather portion of the local news.
B. Ask a random sample of their viewers whether they own a smartphone.
C. Ask a random sample of members of the local meteorological society whether they watch the local news.
D. Ask a random sample of their viewers how often and when they use various sources to obtain weather information.
Correct answer: D Difficulty level: 3

17. A writer for a high school newspaper is conducting a survey to estimate the number of students that will vote for a particular candidate in an upcoming student government election. All students at the high school are eligible to vote in the election, and the writer decides to select a sample of students to take the survey. Which of the following sampling methods is most likely to produce valid results?
A. Survey every fifth student to enter the school library.
B. Survey every fifth student to arrive at school one morning.
C. Survey every fifth senior to arrive at school one morning.
D. Survey every fifth student to enter the school stadium for a football game.
Correct answer: B Difficulty level: 3