Thresholding for Making Classifiers Cost-sensitive
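The title refers to making an already-trained classifier cost-sensitive by moving its decision threshold rather than retraining it. A minimal sketch of that idea, assuming a binary classifier that outputs calibrated probabilities and a cost matrix with zero cost for correct decisions (the cost values, arrays, and function names below are illustrative assumptions, not taken from this document): predict the positive class when p(positive | x) exceeds t = C_FP / (C_FP + C_FN).

```python
import numpy as np

def cost_sensitive_threshold(c_fp, c_fn):
    """Decision threshold that minimizes expected cost for a binary classifier
    with calibrated probabilities (correct decisions assumed to cost zero)."""
    return c_fp / (c_fp + c_fn)

def predict_cost_sensitive(proba_pos, c_fp, c_fn):
    """Label an example positive when the expected cost of doing so is lower
    than the expected cost of predicting negative."""
    t = cost_sensitive_threshold(c_fp, c_fn)
    return (np.asarray(proba_pos) >= t).astype(int)

# Illustrative numbers only: a false negative is 5x as costly as a false positive.
probs = np.array([0.10, 0.20, 0.45, 0.80])
print(cost_sensitive_threshold(1.0, 5.0))      # ~0.167
print(predict_cost_sensitive(probs, 1.0, 5.0)) # [0 1 1 1]
```

The point of the sketch is that cost-sensitivity changes only where the probability is cut, not the underlying model.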
Pattern Recognition and Machine Learning: Review Notes (Instructor: Wen Wen, School of Computer Science, Guangdong University of Technology)

Some points worth mentioning
Complexity of a Pattern Recognition System – An Example
An example: "Use an optical sensor to collect information and automatically classify the species of fish on a conveyor belt." Fish Classification: Sea Bass / Salmon
Distinguishing sea bass from salmon; stated abstractly, the problem covers:
• the pattern recognition system
• the design process
Preprocessing involves: isolating the fish from one another and from the background (segmentation), and reducing noise before features are extracted.
Overlap in the histograms is small compared to that of the length feature.
• Decision boundary
• The cost of misclassification
• Model complexity
• Generalization
Partition the feature space into two regions by finding the decision boundary that minimizes the error.
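For a single feature, the fish example above reduces to choosing a threshold (a one-dimensional decision boundary). A minimal sketch under assumed synthetic data (the feature values and labels below are illustrative, not from the lecture): scan candidate thresholds and keep the one with the lowest training error.

```python
import numpy as np

def best_threshold(x, y):
    """Brute-force 1-D decision boundary: try midpoints between sorted feature
    values and keep the threshold with the lowest training error, checking both
    orientations (class 1 above or below the cut)."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    candidates = (xs[:-1] + xs[1:]) / 2.0
    best = (np.inf, None, None)
    for t in candidates:
        for sign in (+1, -1):                      # which side is class 1
            pred = (sign * (xs - t) > 0).astype(int)
            err = np.mean(pred != ys)
            if err < best[0]:
                best = (err, t, sign)
    return best  # (training error, threshold, orientation)

# Synthetic "lightness"-style feature: class 0 darker, class 1 brighter.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(3.0, 1.0, 50), rng.normal(6.0, 1.0, 50)])
y = np.concatenate([np.zeros(50, int), np.ones(50, int)])
print(best_threshold(x, y))
```

Minimizing raw error is only one choice; as the notes point out, the cost of misclassification can differ per class, which shifts the optimal boundary (exactly the cost-sensitive thresholding named in the document title).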
Optical Character Recognition (typography): sample characters A v t u I h D U w K
A new human-computer interaction system: can you see the pattern recognition in it?
Vision
Overview of Top Data Mining Journals

Top conferences: KDD first, followed by the SIAM data mining conference (SDM) and ICDM.

CCF (China Computer Federation) recommended international journals in Databases, Data Mining, and Content Retrieval (entries list abbreviation, full title, publisher, and URL path):

Class A:
1. TODS, ACM Transactions on Database Systems, ACM (/tods/)
2. TOIS, ACM Transactions on Information Systems, ACM (/pubs/tois/)
3. TKDE, IEEE Transactions on Knowledge and Data Engineering, IEEE Computer Society (/tkde/)
4. VLDBJ, VLDB Journal, Springer-Verlag (/dblp/db/journals/vldb/index.html)

Class B:
1. TKDD, ACM Transactions on Knowledge Discovery from Data, ACM (/pubs/tkdd/)
2. AEI, Advanced Engineering Informatics, Elsevier (/wps/find/journaldescription.cws_home/622240/)
3. DKE, Data and Knowledge Engineering, Elsevier (/science/journal/0169023X)
4. DMKD, Data Mining and Knowledge Discovery, Springer (/content/100254/)
5. EJIS, European Journal of Information Systems, The OR Society (/ejis/)
6. GeoInformatica, Springer (/content/1573-7624/)
7. IPM, Information Processing and Management, Elsevier (/locate/infoproman)
8. Information Sciences, Elsevier (/locate/issn/00200255)
9. IS, Information Systems, Elsevier (/information-systems/)
10. JASIST, Journal of the American Society for Information Science and Technology, American Society for Information Science and Technology (/Publications/JASIS/jasis.html)
11. JWS, Journal of Web Semantics, Elsevier (/locate/inca/671322)
12. KIS, Knowledge and Information Systems, Springer (/journal/10115)
13. TWEB, ACM Transactions on the Web, ACM

Class C:
1. DPD, Distributed and Parallel Databases, Springer (/content/1573-7578/)
2. I&M, Information and Management, Elsevier (/locate/im/)
3. IPL, Information Processing Letters, Elsevier (/locate/ipl)
4. Information Retrieval, Springer (/issn/1386-4564)
5. IJCIS, International Journal of Cooperative Information Systems, World Scientific (/ijcis)
6. IJGIS, International Journal of Geographical Information Science, Taylor & Francis (/journals/tf/13658816.html)
7. IJIS, International Journal of Intelligent Systems, Wiley (/jpages/0884-8173/)
8. IJKM, International Journal of Knowledge Management, IGI (/journals/details.asp?id=4288)
9. IJSWIS, International Journal on Semantic Web and Information Systems, IGI
10. JCIS, Journal of Computer Information Systems, IACIS (/web/journal.htm)
11. JDM, Journal of Database Management, IGI-Global (/journals/details.asp?id=198)
12. JGITM, Journal of Global Information Technology Management, Ivy League Publishing (/bae/jgitm/)
13. JIIS, Journal of Intelligent Information Systems, Springer (/content/1573-7675/)
14. JSIS, Journal of Strategic Information Systems, Elsevier (/locate/jsis)

Below are the websites of some leading experts in the data mining field; they contain a great deal of valuable material and can broaden a researcher's thinking:
1. Rakesh Agrawal. Home page: /en-us/people/rakesha/. The founder of association-rule research in data mining; his Apriori algorithm opened up this great field.
Image Processing Terminology (Chinese-English Glossary)

FT 滤波器FFT filtersVGA 调色板和许多其他参数VGA palette and many others 按名称排序sort by name包括角度和刻度including angle and scale保持目标keep targets保存save保存和装载save and load饱和度saturation饱和加法和减法add and subtract with saturate背景淡化background flatten背景发现find background边缘和条纹测量Edge and Stripe/Measurement边缘和条纹的提取find edge and stripe编辑Edit编辑edit编辑或删除相关区域edit or delete relative region编码Code编码条Coda Bar变换forward or reverse fast Fourier transformation变量和自定义的行为variables and custom actions变量检测examine variables变形warping变形系数warping coefficients标题tile标注和影响区域label and zone of influence标准normal标准偏差standard deviation表面弯曲convex并入图像merge to image采集栏digitizer bar采集类型grab type菜单形式menu item参数Preferences参数轴和角度reference axis and angle测量measurement测量方法提取extract measurements from测量结果显示和统计display measurement results and statistics测量转换transfer to measurement插入Insert插入insert插入条件检查Insert condition checks查找最大值find extreme maximum长度length超过50 个不同特征的计算calculate over 50 differentfeatures area撤销undo撤销次数number of undo levels乘multiply尺寸size抽取或融合分量红red/处理Processing处理/采集图像到一个新的窗口processed/grabbed image into new window 窗口window窗口监视watch window窗位window leveling创建create垂直边沿vertical edge从Windows从表格新建new from grid从工具条按钮from toolbar button从用户窗口融合merge from user form粗糙roughness错误纠正error correction错误匹配fit error打开open打开近期的文件或脚本open recent file or script打印print打印设置print setup打印预览print preview大小和日期size and date带通band pass带有调色板的8- bit带有动态预览的直方图和x, y 线曲线椭圆轮廓histogram and x, y line curveellipse profiles with dynamic preview带阻band reject代码类型code type单步single step单一simple单帧采集snap shot导入VB等等etc.低通low pass第一帧first点point调色板预览palette viewer调试方式debug mode调用外部的DLL调整大小resize调整轮廓滤波器的平滑度和轮廓的最小域值adjust smoothness of contour filter and minimum threshold forcontours定点除fixed point divide定位精度positional accuracy定义一个包含有不相关的不一致的或无特征区域的模板define model including mask for irrelevant inconsistent orfeatureless areas定制制定-配置菜单Customize - configure menus动态预览with dynamic preview读出或产生一个条形或矩阵码read or generate bar and matrix codes读取和查验特征字符串erify character strings断点break points对比度contrast对比度拉伸contrast stretch对称symmetry对模板应用“不关心的”像素标注apply don't care pixel mask to model多边形polygon二进制binary二进制分离separate binary二值和灰度binary and grayscale翻转reverse返回return放大或缩小7 个级别zoom in or out 7 levels分类结果sort results分水岭Watershed分析Analysis分组视图view components浮点float腐蚀erode复合视图view composite复合输入combined with input复制duplicate复制duplicateselect all傅立叶变换Fourier transform改变热点值change hotspot values感兴趣区域ROI高级几何学Advanced geometry高通high pass格式栏formatbar更改默认的搜索参数modify default search parameters 工具Utilities工具栏toolbar工具属性tool properties工具条toolbar工作区workspace bar共享轮廓shared contours构件build构造表格construct grid和/或and/or和逆FFT画图工具drawing tools缓存buffer换算convert灰度grayscale恢复目标restore targets回放playback绘图连结connect map获得/装载标注make/load mask获取选定粒子draw selected blobs或从一个相关区域创建一个ROI or create an ROI from a relative region基线score基于校准映射的畸变校正distortion correction based on calibration mapping 极性polarity极坐标转换polar coordinatetransformation几何学Geometry记录record加粗thick加法add间隔spacing兼容compatible简洁compactness剪切cut减法subtract减小缩进outdent交互式的定义字体参数包括搜索限制ine font parameters including search constraints脚本栏script bar角度angle角度和缩放范围angle and scale range接收和确定域值acceptance and certainty thresholds结果栏result bar解开目标unlock targets精确度和时间间隔accuracy and timeout interval矩形rectangle矩形rectangular绝对差分absolute difference绝对值absolute value均匀uniform均值average拷贝copy拷贝序列copy sequence可接收的域值acceptance threshold克隆clone控制control控制controls快捷健shortcut key宽度breadth宽度width拉普拉斯Laplacians拉伸elongation蓝blue类型type粒子Blob粒子blob粒子标注label blobs粒子分离segment blobs粒子内的孔数目number of holes in a blob 
亮度brightness亮度luminance另存为save as滤波器filters绿green轮廓profile overlay轮廓极性contour polarity逻辑运算logical operations面积area模板编辑edit model模板覆盖model coverage模板和目标覆盖model and target coverage 模板索引model index模板探测器Model Finder模板位置和角度model position and angle 模板中心model center模糊mask模块import VB module模块modules模式匹配Pattern matching默认案例default cases目标Targets目标分离separate objects目标评价target score欧拉数Euler number盆basins膨胀dilate匹配率match scores匹配数目number of matches平方和sum of the squares平滑smooth平均average平均averaged平均值mean平移translation前景色foreground color清除缓冲区为一个恒量clear buffer to a constant清除特定部分delete special区域增长region-growing ROI取反negate全部删除delete all缺省填充和相连粒子分离fill holes and separate touching blobs任意指定位置的中心矩和二阶矩central and ordinary moments of any order location: X, Y锐化sharpen三维视图view 3D色度hue删除delete删除帧delete frame设置settings设置相机类型enable digitizer camera type设置要点set main示例demos事件发现数量number of occurrences事件数目number of occurrences视图View收藏collectionDICOM手动manually手绘曲线freehand输出选项output options输出选择结果export selected results输入通道input channel属性页properties page数据矩阵DataMatrix数字化设置Digitizer settings双缓存double buffer双域值two-level水平边沿horizontal edge搜索find搜索和其他应用Windows Finder and other applications 搜索角度search angle搜索结果search results搜索区域search area搜索区域search region搜索速度search speed速度speed算法arithmetic缩放scaling缩放和偏移scale and offset锁定目标lock destination锁定实时图像处理效果预览lock live preview of processing effects on images 锁定预览Lock preview锁定源lock source特定角度at specific angle特定匹配操作hit or miss梯度rank替换replace添加噪声add noise条带直径ferret diameter停止stop停止采集halt grab同步synchronize同步通道sync channel统计Statistics图像Image图像大小image size图像拷贝copy image图像属性image properties图形graph退出exit椭圆ellipse椭圆ellipses外形shape伪彩pseudo-color位置position文本查看view as text文件File文件MIL MFO font file文件load and save as MIL MMF files文件load and save models as MIL MMO files OCR文件中的函数make calls to functions in external DLL files文件转换器file converterActiveMIL Builder ActiveMIL Builder 无符号抽取部分Extract band -细化thin下一帧next显示表现字体的灰度级ayscale representations of fonts显示代码show code线line线lines相对起点relative origin像素总数sum of all pixels向前或向后移动Move to front or back向上或向下up or down校准Calibration校准calibrate新的/感兴趣区域粘贴paste into New/ROI新建new信息/ 图形层DICOM information/overlay形态morphology行为actions修改modify修改路径modify paths修改搜索参数modify default search parameters 序列采集sequence旋转rotation旋转模板rotate model选择select选择selector循环loops移动move移动shift应用过滤器和分类器apply filters and classifiers影响区域zone of influence映射mapping用户定义user defined用基于变化上的控制实时预览分水岭转化结果阻止过分切割live preview of resulting watershed transformations with controlover variation to prevent over segmentation用某个值填充fill with value优化和编辑调色板palette optimization/editor有条件的conditional域值threshold域值thresholding预处理模板优化搜索速度循环全部扫描preprocess model to optimize search speed circular over-scan预览previous元件数目和开始(自动或手动)number of cells and threshold auto or manual元件最小/最大尺寸cell size min/max源source允许的匹配错误率和加权fit error and weight运行run在目标中匹配数目number of modelmatches in target暂停pause增大缩进indent整数除integer divide正FFT正常连续continuous normal支持象征学supported symbologies: BC 412直方图均衡histogram equalization执行execute执行外部程序和自动完成VBA only execute external programs and perform Automation VBA only指定specify指数exponential Rayleigh中值median重复repeat重建reconstruct重建和修改字体restore and modify fonts重新操作redo重心center of gravity周长perimeter注释annotations转换Convert转换convert装载load装载和保存模板为MIL MMO装载和另存为MIL MFO装载和另存为MIL MMF状态栏status bar资源管理器拖放图像drag-and-drop images from Windows ExplorerWindows自动或手动automatic or manual自动或手动模板创建automatic or manual model creation字符产大小string size字符串string字体font最大maximum最大化maximum最大数maxima最后一帧last frame最小minimum最小化minimum最小间隔标准minimum 
separation criteria最小数minima坐标盒的范围bounding box coordinatesAlgebraic operation 代数运算;一种图像处理运算,包括两幅图像对应像素的和、差、积、商。
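The glossary's final entry defines an algebraic operation as a pixelwise sum, difference, product, or quotient of two images, and it also lists saturating add/subtract. A minimal NumPy sketch of these operations (the 8-bit range and the sample arrays are assumptions for illustration, not from the glossary):

```python
import numpy as np

def algebraic_ops(a, b):
    """Pixelwise algebraic operations on two equal-sized images."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    return {
        "sum": a + b,
        "difference": a - b,
        "product": a * b,
        # Guard against division by zero in the quotient image.
        "quotient": np.divide(a, b, out=np.zeros_like(a), where=b != 0),
    }

def add_with_saturate(a, b, lo=0, hi=255):
    """'Saturated' addition for 8-bit images: clip instead of wrapping around."""
    return np.clip(a.astype(np.int32) + b.astype(np.int32), lo, hi).astype(np.uint8)

img1 = np.array([[100, 200], [250, 10]], dtype=np.uint8)
img2 = np.array([[100, 100], [100, 100]], dtype=np.uint8)
print(algebraic_ops(img1, img2)["difference"])
print(add_with_saturate(img1, img2))   # 250 + 100 saturates to 255 rather than wrapping
```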
CAS (Chinese Academy of Sciences) Machine Learning Question Bank (new)

Machine Learning Question Bank

I. Maximum likelihood

1. ML estimation of an exponential model (10 points)
A Gaussian distribution is often used to model data on the real line, but is sometimes inappropriate when the data are often close to zero but constrained to be nonnegative. In such cases one can fit an exponential distribution, whose probability density function is given by
$$p(x)=\frac{1}{b}\,e^{-x/b}$$
Given N observations $x_i$ drawn from such a distribution:
(a) Write down the likelihood as a function of the scale parameter b.
(b) Write down the derivative of the log likelihood.
(c) Give a simple expression for the ML estimate for b.

2. The same exercise for a Poisson distribution:
$$p(x\mid\theta)=\frac{\theta^{x}e^{-\theta}}{x!},\qquad x=0,1,2,\ldots$$
$$l(\theta)=\log\prod_{i=1}^{N}p(x_i\mid\theta)=\sum_{i=1}^{N}\left[x_i\log\theta-\theta-\log x_i!\right]=\log\theta\sum_{i=1}^{N}x_i-N\theta-\sum_{i=1}^{N}\log x_i!$$

3.

II. Bayesian reasoning
Suppose that in a multiple-choice exam the probability that a candidate knows the correct answer is p, and the probability that the candidate guesses is 1 - p. Assume that a candidate who knows the answer answers correctly with probability 1, and that a candidate who guesses answers correctly with probability 1/m, where m is the number of choices.
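A worked sketch of parts (a)-(c) for the exponential model above (supplied here as an answer sketch; it is not part of the original question bank):

```latex
% (a) Likelihood of N i.i.d. observations from p(x) = (1/b) e^{-x/b}
L(b) = \prod_{i=1}^{N} \frac{1}{b} e^{-x_i/b}
     = b^{-N} \exp\!\Big(-\frac{1}{b}\sum_{i=1}^{N} x_i\Big)

% (b) Log-likelihood and its derivative with respect to b
\ell(b) = -N\log b - \frac{1}{b}\sum_{i=1}^{N} x_i ,
\qquad
\frac{d\ell}{db} = -\frac{N}{b} + \frac{1}{b^{2}}\sum_{i=1}^{N} x_i

% (c) Setting the derivative to zero gives the ML estimate: the sample mean
\hat{b}_{\mathrm{ML}} = \frac{1}{N}\sum_{i=1}^{N} x_i
```

Applying the same steps to the Poisson log-likelihood written above gives $\hat{\theta}_{\mathrm{ML}}=\frac{1}{N}\sum_{i=1}^{N}x_i$.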
Multimedia Final Exam Review Notes (Digital Image Processing, School of Information Science and Technology; Instructor: CHEN Junzhou)
[Figure: key stages in digital image processing, including the Problem Domain, Image Acquisition, Image Enhancement, Segmentation, Object Recognition, Colour Image Processing, Image Compression, and Representation & Description]
Exam key points:
• Basic concepts
• Basic ideas
• Common algorithms
• Comprehensive application
Images taken from Gonzalez & Woods, Digital Image Processing (2002)
Key Stages in Digital Image Processing: Segmentation
Image Restoration Morphological Processing
CCD of the Camera
What is a Digital Image? (cont…)
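Segmentation is listed above as one of the key stages. A minimal sketch of the simplest approach, global intensity thresholding (the synthetic image and the iterative threshold-selection routine below are illustrative; the course may also cover more sophisticated methods such as Otsu's):

```python
import numpy as np

def global_threshold(image, t):
    """Segment a grayscale image into foreground/background with a single
    global intensity threshold: pixels > t become 1, the rest 0."""
    return (image > t).astype(np.uint8)

def iterative_threshold(image, eps=0.5):
    """Classic iterative threshold selection: repeatedly set t to the midpoint
    of the mean intensities of the two groups it induces.
    Assumes the image contains both darker and brighter pixels."""
    t = image.mean()
    while True:
        lo, hi = image[image <= t], image[image > t]
        new_t = 0.5 * (lo.mean() + hi.mean())
        if abs(new_t - t) < eps:
            return new_t
        t = new_t

# Synthetic example: dark background (~40) with a brighter square object (~200).
img = np.full((8, 8), 40, dtype=np.uint8)
img[2:6, 2:6] = 200
t = iterative_threshold(img.astype(float))
print(t)
print(global_threshold(img, t))
```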
Decision Trees (translated excerpt from a foreign-language reference)
Undergraduate Graduation Project (Thesis): Translation of a Foreign-Language Reference, with the Original Text
School: School of Management. Major: Information Management and Information Systems. Year and class: Grade 2008, Class 6. Student ID: **********. Student: Zhang Zhongquan. Advisor: Hu Feng. May 2012.

Contents
(I) Translation of the foreign-language reference
4 Decision Trees
4.1 Introduction
4.2 Decision-Making and Pattern Classification
4.2.1 Statistical Pattern Classification
4.2.2 Use of Logical Inter-relationships
4.3 Decision Regions
...
4.6 Decision Tree Examples
(II) Original text of the foreign-language reference
4 Decision Trees
4.1 Introduction
4.2 Decision-Making and Pattern Classification
4.2.1 Statistical Pattern Classification
4.2.2 Use of Logical Inter-relationships
4.3 Decision Regions
...
4.6 Decision Tree Examples

(I) Translation of the foreign-language reference

4 Decision Trees

4.1 Introduction
Statistical decision-making is widely used in the experimental earth sciences, and it plays an even greater role in environmental science: because environmental systems change continually over time, corrective actions (different action strategies) must be adjusted repeatedly on the basis of observations of the system and the possible situations.
The set of possible corrective actions in a decision-making setting is usually called the decision set. The observed values of certain physical attributes (variables) are potentially useful for choosing which corrective action to take. Corrective actions are adjusted continually as new situations arise in the system, with the aim of reducing loss or cost, or of maximizing benefit. Since cost can be viewed as negative benefit, scientists and business practitioners adopt a single combined criterion for a given decision problem: minimum cost. A good decision should therefore satisfy two requirements: (1) minimum overall cost, and (2) optimality of the decision. The process of acquiring and collecting the values of physical variables is also called feature extraction or variable measurement; these variables are sometimes referred to as features, feature variables, or measurements. Some of these feature variables may influence the decision, and identifying which ones do is a challenge.
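The excerpt describes choosing an action from observed feature variables at minimum cost, which is what a learned decision tree does at each split. A minimal sketch using scikit-learn (the synthetic features, labels, and parameters below are illustrative assumptions, not part of the translated text):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic environmental-style data: two feature variables, two possible actions (0/1).
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))
y = ((X[:, 0] + 0.5 * X[:, 1]) > 0).astype(int)   # a simple hidden decision rule

# A shallow tree keeps the decision logic readable, mirroring the idea of basing
# a corrective action on a few informative measurements.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree, feature_names=["feature_1", "feature_2"]))
print(tree.predict([[0.8, -0.1]]))   # recommended action for a new observation
```

Class weights or a cost-sensitive decision threshold (as in the document's title) can then bias such a tree toward the cheaper kind of error.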
A STATISTICAL STUDY OF VISUAL RESOLUTION THRESHOLDS
A STATISTICAL STUDY OF VISUAL RESOLUTION THRESHOLDSbyB. L. H i l l s , R. L. Beurle & M. V. D a n i e l s ,Department of E l e c t r i c a l and E l e c t r o n i c Engineering,U n i v e r s i t y of Nottingham, England.SummaryMeasurements of threshold contrast requiredf o r various r e s o l u t i o n tasks at various back-ground l igh t l e v e l s are r e p o r t e d. The p r e di c -t i o n s of a simple s t a t i s t i c a l model based onphoton-noise l i m i t e d d e t e c t i o n are compared w i t h the e m p i r i c a l observations. Correspondence is encouraging provided care is taken to account f o r the v a r i a t i o n of a l l important parameters such as s p a t i a l and temporal i n t e g r a t i o n , p u p i l area, e t c.I n t r o d u c t i o n Since the e a r l y work of Konig in 1879, it has been known t h a t v i s u a l a c u i t y , i.e. the a b i l i t y t o d i s c r i m i n a t e f i n e d e t a i l i n a n o b j e c t , p r o -g r e s s i v e l y d e t e r i o r a t e s as the l e v e l of i l l u m i -n a t i o n f a l l s. Shlaer 1 has shown t h a t t h i s f a l l in v i s u a l a c u i t y is dependent on the type of a c u i t y p a t t e r n used. Thus at backgroundluminances above 100 t r o l a n d s , the gap in a'Landolt C' p a t t e r n is easier to see than bars of a s i m i l a r w i d t h in a p a r a l l e l bar r e s o l u t i o n p a t t e r n. At lower luminances, the reverse is t r u e. The u l t i m a t e l i m i t of a c u i t y underoptimum viewing c o n d i t i o n s depends c r i t i c a l l y on the nature of the task. For v e r n i e r a c u i t y , a displacement of 2 seconds of arc between the top and bottom halves of a t h i n v e r t i c a l black l i n e has been detected. For the Landolt 'C' and the p a r a l l e l bar p a t t e r n , gaps and bar widths of the order of 25' are the best t h a t have been r e s o l v e d.I t i s g e n e r a l l y agreed t h a t the f a l l i nv i s u a l a c u i t y a t low l e v e l s o f i l l u m i n a t i o n i sconnected w i t h the changeover from using the small cone summation areas of the fovea at high adapta-t i o n l e v e l s , to using the l a r g e r rod summation areas of the p e r i p h e r y at low l e v e l s. A l s o , these summation areas are known to increase in size w i t h decrease in the background l e v e l , to which the eye is adapted. This w i l l also reduce a c u i t y.This account is concerned w i t h the recog-n i t i o n of simple p a t t e r n s of v a r y i n g c o n t r a s t at r e l a t i v e l y low l i g h t l e v e l s and is an extension of work on d e t e c t i o n p r e v i o u s l y described by the a u t h o r s 2. The e m p i r i c a l observations of detec-t i o n threshold described in t h i s reference have been compared w i t h the p r e d i c t i o n s of a simple s t a t i s t i c a l model f o r a photon-noise l i m i t e d d e t e c t o r. This work, which is to be published elsewhere, showed an encouraging correspondence, and the present paper describes the a p p l i c a t i o n of the same technique to r e s o l u t i o n t h r e s h o l d s. The Ideal Detector The original idea of comparing the actual performance of the human visual system with an 'Ideal Detector' is due to Rose 3, although similar theories were developed independently by de Vries and also Pirenne. 
An ideal detector (or ideal picture pickup device) is one whose performance is limited only by the s t a t i s t i c a l fluctuations in the number of incident photons picked up by the device. Presented with the task of detecting a pattern distinguished by a small luminance, A I , superimposed on a uniform background of luminance I, this detector can do no better than count the photons arriving within the area where the pattern is anticipated, and compare this with the mean density of arrival of photons in the background I. Background photons are considered indistinguishable from target photons, and a l l these incident photons are given equal weight in the output of the device (i.e., it is l i n e a r ), so that every effectively absorbed quantum is taken into account in the f i n a l decision. In adapting this concept, the ideal detector has been assumed to be subject to certain l i m i -tation analogous to those of the eye. F or example, it is taken to have the a b i l i t y to integrate temporally for a period which is assumed to vary as empirical observation suggests it does for the eye. Then again it is assumed that there is spatial integration over an area the extent of which depends on the background l i g h t l e v e l. S o m e assumptions must be made about the method of determining I. F or example, it may be assumed that the background can be sampled on a sufficient number of occasions or over a s u f f i c -ient area by the ideal detector for the back-ground luminance I to be known precisely. Alternatively, it may be assumed that the measure-ment of I is subject to error in the same way as the measurement of Now, it is generally accepted that the num-bers of absorbed quanta fluctuate according to a Poisson d i s t r i b u t i o n , the deviations from an average absorption of ft quanta thus having a root mean square value of n° • Thus if i quanta are absorbed from the background, it is these fluctuations that interfere with the detection of a small change AI in illumination. If this small change in illumination, or 'signal', yields An absorbed quanta then, for a given r e l i a b i l i t y , the threshold of detection is given by where k is a constant c a l l e d the threshold s i g n a l --455-t o -n o i s e r a t i o , i t s exact value depending on the r e q u i r e d degree o f c e r t a i n t y i n d e t e c t i o n.Equation (1) is the basic equation of thef l u c t u a t i o n theory of Rose. It i s , however, an approximation, f o r i t ignores the f l u c t u a t i o n s i n the s ig n a l. Thus, a more r i g o r o u s form of the equation is For a l l except the lowest background l e v e l s , t h i s equation approximates to (1), since An >> n, but at very low background l e v e l s when t h i s c o n d i t i o n does not a p p l y , equation (2) must be used. In the experiments on d e t e c t i o n , it was shown d e s i r a b l e to assume t h a t in what has been c a l l e d a primary r e c i p i e n t u n i t 2 the s e n s i t i v i t y v a r i e s r a d i a l l y over the e f f e c t i v e summation area over which s p a t i a l i n t e g r a t i o n takes p l a c e , a f a c t which has been recognised in the l i t e r a t u r e 4,5. The assumed v a r i a t i o n is shown in F igure 1,2,5 t h i s curve r e p r e s e n t i n g a r a d i a l v a r i a t i o n of theform where r m is a f u n c t i o n of the background luminance, as shown in F igure 2. The adoption of t h i s expression has been j u s t i f i e d merely f o r con-venience i n c a l c u l a t i o n. 
I t gives a good f i t t o the range o f e m p i r i c a l r e s u l t s which i t represents, but there is no reason to t h i n k t h a t it has any b i o l o g i c a l s i g n i f i c a n c e. The summation time T and p u p i l area, ap, were assumed to vary w i t h background luminance in accordance w i t h the curve shown in F igure 3 and the data given in Table I. With t h i s knowledge of the s p a t i a l and temporal summation c h a r a c t e r i s t i c s of the r e t i n a , it now becomes p o s s i b l e to c a l c u l a t e the number of photons An in a s i n g l e sample taken by the eye from a p a t t e r n i l l u m i n a t e d by A I. The mean number of quanta n from the background I may s i m i l a r l y be c a l c u l a t e d. S u b s t i t u t i o n of these in equation (1) now gives an equation f o r AI in terms of background luminance I. As a rough approximation t h is is of the form but the precise r e l a t i o n s h i p i s appreciably a f f e c t e d b y the v a r i a t i o n w i t h l i g h t l e v e l o f r m , T and a p . Since the technique of c a l c u l a t i o n is summarised in more d e t a i l in a companion paper 8, i t i s n o t proposed t o elaborate f u r t h e r o n the d e t a i l s here. I t was found t h a t f o r l a r g e o b j e c t s a t higher l i g h t l e v e l s the primary r e c i p i e n t d e t e c t o r was inadequate to e x p l a i n the r e s u l t s , and it wase s s e n t i a l also to assume the existence ofelongated l i n e a r "edge" d e t e c t o r s 2. The evidencesuggested t h a t these had a transverse v a r i a t i o n ofs e n s i t i v i t y s i m i l a r t o the v a r i a t i o n across thediameter o f the u n i t s already r e f e r r e d t o. Thiscould be the r e s u l t of combining the output of a l i n e a r a r r a y of primary summation u n i t s. There was also evidence f o r a drop o f f in s e n s i t i v i t y towards the ends of these long summation areas and a law s i m i l a r to equation (3) was assumed, r e p l a c i n g r m w i t h a l e n g t h constant 1^ which wasalso found to be a s i m i l a r f u n c t i o n of background l e v e l. Values assumed f o r a l l these parameters are given in Table I which also gives a p the area of the n a t u r a l p u p i l assumed in these c a l c u l a t i o n s.Since the d e t a i l e d r e s u l t s of these e x p e r i -ments are t o b e published elsewhere, i t w i l ls u f f i c e here merely to say t h a t f o r the p a t t e r n s t e s t e d , d i s c s , a n n u l i and p a r a l l e l b a r s , the correspondence between e m p i r i c a l thresholds and p r e d i c t e d thresholds was encouraging, d i s c r e p a n -cies being t y p i c a l l y less than 0.2 i n l o g a r i t h m i c u n i t s to the base 10.Resolution MeasurementsThe experiments reported here were designedto compare the t h r e s h o l d c o n t r a s t f o r r e s o l u t i o n a t v a r i o u s l i g h t l e v e l s o f the f o l l o w i n g p a t t e r n shapes:r a r a l l e l barsThe "F oucault F an" p a t t e r nThe double disc p a t t e r nThe 'L a n d o l t C'A square equal to the gap in the 'L a n d o l t C'A v e r n i e r a c u i t y p a t t e r nThe shapes of these p a t t e r n s are shown in F igure 4. 
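Equations (1) and (2) did not survive extraction. The following is a hedged reconstruction based on the standard form of Rose's fluctuation theory as described in the surrounding text; the notation of the original paper may differ:

```latex
% Equation (1): the basic Rose fluctuation-theory threshold. The absorbed
% signal quanta must exceed the r.m.s. fluctuation of the absorbed
% background quanta by the threshold signal-to-noise ratio k.
\Delta n = k\,\sqrt{\bar{n}}
% Equation (2) is described as the more rigorous form that also counts the
% fluctuations contributed by the signal itself; it reduces to (1) whenever
% the background count dominates the signal count.
```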
Three d i f f e r e n t sizes of each p a t t e r n were used, and the o b j e c t was to see whether the v a r i a t i o n s between r e s o l u t i o n thresholds f o r d i f f e r e n tp a t t e r n s could to any e x t e n t be accounted f o r by the a p p l i c a t i o n of the photon noise theory o u t -l i n e d above.In these experiments the subjects were given adequate time to adapt to an evenly i l l u m i n a t e d background, and the r e s o l u t i o n p a t t e r n was super-imposed on t h i s , the colour temperatures beingr e s p e c t i v e l y. The r e s o l u t i o nthresholds were determined by the s u b j e c t , who had c o n t r o l of the i l l u m i n a t i o n of the p a t t e r n. He was asked to s t r a d d l e the p o i n t at which ther e s o l u t i o n f e a t u r e of the p a t t e r n was j u s t v i s i b l e. It is thus p o s s i b l e to make a d i s t i n c t i o n between the "d e t e c t i o n " t h r e s h o l d a t which i t i s p o s s i b l e to see the p a t t e r n as a whole, and the r e c o g n i t i o n t h r e s h o l d at which is is p o s s i b l e to see the r e s o -l u t i o n f e a t u r e (e.g. the bars i n the p a r a l l e l bar r e s o l u t i o n p a t t e r n. With the exception of the square a l l the p a t t e r n s have such a f e a t u r e , e.g. the gap in the "Landolt C", the existence of two separate discs and the d i s c o n t i n u i t y in the v e r t i -c a l l i n e. The square was included f o r comparison w i t h the Landolt C where the task might be des-c r i b e d as "s e e i n g " the "missing square" where the gap is p r e s e n t. The Results The averaged increment thresholds f o r the three subjects are given i n Table I I . F or -456-s i m p l i c i t y , the thresholds f o r the p a t t e r n s w i l l be compared two at a time.One s i m p l i f y i n g feature of the r e s u l t s is t h a t the corresponding bar and fan p a t t e r n s (as defined in F igure A) have v i r t u a l l y i d e n t i c a l increment t h r e s h o l d s. This i s true f o r a l l three sizes and a l l background luminances. This is because the p o i n t e r s i n d i c a t i n g where 'z ! is measured are c e n t r a l l y placed on the fan p a t t e r n. It has been found t h a t if the p o i n t e r s are used too near the t o p , the increment threshold f o r the fan is higher than f o r the b a r s. Thus, w i t h the p r o v i s o t h a t the p o i n t e r s are used in the c e n t r a l r e g i o n , the comparisons of performance that w i l l be made w i t h the p a r a l l e l bar p a t t e r n would be the same if the comparisons had been made w i t h the fan p a t t e r n. In F igure 5, the f i r s t comparison is made between the increment thresholds f o r r e s o l u t i o n of the Landolt 'C' and the p a r a l l e l bar p a t t e r n s. I t i s f i r s t noted t h a t the increment thresholds f o r the bar p a t t e r n s are considerably lower than the e q u i v a l e n t Landolt 'C' as defined in terms of z in F igure 4. There are major d i f f e r e n c e s between the thresholds f o r these two p a t t e r n s. 
For the l a r g e and medium sizes of both p a t t e r n s , the increment threshold p r o g r e s s i v e l y decreases as the background l e v e l is lowered, but f o r the small p a t t e r n s , the increment threshold is constant for background luminances of 10~4 mL and below.I t w i l l b e seen t h a t v e r t i c a l displacement of the curves f o r the bar p a t t e r n by 0.7 log u n i t s gives reasonably close f i t s to the curves f o r the Landolt C. This is t r u e to some extent f o r a l l the p a t t e r n s , excluding the squares, and might suggest t h a t once some allowance has been made f o r the o v e r a l l i n e f f i c i e n c y f a c t o r of each p a t t e r n , they would a l l give much the same r e s u l t s. This i s , however, believed to be an over-s i m p l i f i c a t i o n , and indeed is only a rough approxi-mation. Compared d i r e c t l y , there i s l i t t l e s i m i l a r i t y between the r e s u l t s. F or example, the large Landolt C and the small bar p a t t e r n r e q u i r e the same c o n t r a s t threshold f o r r e s o l u t i o n at a background l e v e l of -1.80 Log mL. Below t h i s adaptation l e v e l , the threshold of the large Landolt C becomes p r o g r e s s i v e l y lower r e l a t i v e to t h a t of the small bar p a t t e r n , w h i l e above t h i s l e v e l , the reverse i s t r u e. Also f o r the highest background l e v e l , a c o n t r a s t can be found f o r which a l l the bar p a t t e r n s but none of theLandolt C's can be r e s o l v e d , whereas at the lowest background luminances a c o n t r a s t l e v e l which j u s t allows a l l the bars p a t t e r n s to be r e s o l v e d , o n l y leaves the small Landolt C unresolved. This again shows t h a t the bar p a t t e r n becomes more v i s i b l e in r e l a t i o n to t h i s 'w h i t e ' Landolt C if the background l e v e l is r a i s e d , at l e a s t f o r the range of background luminances used here. It w i l l be suggested l a t e r t h a t t h i s is because the Landolt C is resolved using primary r e c i p i e n t u n i t s , w h i l e the bar p a t t e r n is resolved by line/edge d e t e c t o r s. In F igure 6, the increment thresholds f o r the r e s o l u t i o n of the 'double d i s c ' and bar p a t t e r n s are compared. The comparison shows some features s i m i l a r to t h a t between the Landolt C's and p a r a l l e l bars. The double disc in general has a higher t h r e s h o l d , and the threshold increases r e l a t i v e to the bar p a t t e r n as the background l e v e l is r a i s e d. At the lowest background l e v e l s the thresholds f o r the double disc are much closer to the thresholds of the p a r a l l e l bars than to those of the Landolt C. Thus, if we a t t r i b u t e the behaviour of the double d i s c to primary r e c i p i e n t u n i t s and t h a t of the bar p a t t e r n s to line/edge d e t e c t o r s , we must also e x p l a i n the d i f f e r e n c e between Landolt C and double d i s c. The increment thresholds f o r the r e s o l u t i o nof the Landolt C and the d e t e c t i o n of the squaresare compared in F igure 7. F or background luminances of 10"3 mL and above, the two sets of curves f o l l o w each other very c l o s e l y. This suggests that in t h i s r e g i o n , r e s o l u t i o n of the gap in these white Landolt C's is determined by the d e t e c t a b i l i t y of a square decrement of l i g h t equal in area to the gap, viewed against a s i m i l a r background. 
This presupposes t h a t increment and decrement thresholds f o r d e t e c t i o n of a square are very s i m i l a r. Below 10~3 mL, the increment thresholds f o r the squares f a l l p r o g r e s s i v e l y below those of the Landolt C. The curves f o r the squares are the steepest of a l l the p a t t e r n s at the very low background l e v e l s. In F igure 8, the increment thresholds f o r the r e s o l u t i o n of the broken l i n e are compared w i t h those f o r the d e t e c t i o n of the square. F or background luminances above l O -4 mL, the curves are s u f f i c i e n t l y close to suggest t h a t the r e s o l u t i o n of the broken l i n e under these con-d i t i o n s is l i m i t e d by the d e t e c t a b i l i t y of a square patch of l i g h t of side equal to the d i s -placement between the top and bottom halves of the l i n e. An i n t e r e s t i n g experiment might t h e r e f o r e be to compare the l i m i t of v e r n i e r a c u i t y as it is normally determined w i t h the size of the smallest d e t e c t a b l e black square, f o r a wide range of background luminances. For background luminances above 10""3 mL, it w i l l be seen from Table II t h a t the increment thresholds f o r r e s o l u t i o n of the medium double disc p a t t e r n and d e t e c t i o n of the large square are very c l o s e. The same is t r u e f o r the small double d i s c and medium sized square at background luminances above 10~2 mL. During the experiment, the subjects were asked whether they could s t i l l detect the presence of a p a t t e r n when i t s increment in luminance had been lowered so t h a t it could no longer be r e s o l v e d. Their v e r b a l r e p o r t s i n d i -cate t h a t at high background l e v e l s , they could a t best only detect the p a t t e r n s very f a i n t l y i n t h i s n o n -r e s o l v i n g c o n d i t i o n. P a r t i c u l a r l y w i t h the large and medium sizes of p a t t e r n , r e s o l u t i o n came w i t h d e t e c t i o n. At the very low background luminances, most of the large p a t t e r n s were s t i l l only f a i n t l y d e t e c t a b l e , but the large Landolt C was b r i g h t enough to be r e a d i l y d e t e c t a b l e. -457-The medium sizes of p a t t e r n were a l l r e a d i l yd e t e c t a b l e , and the small sizes were v e r y b r i g h t indeed compared w i t h the background, and y e t not r e s o l v e d.Extension of the D e t e c t i o nModel to ResolutionThus f a r , the r e s u l t s have only beendiscussed i n q u a l i t a t i v e terms. I n t h i s s e c t i o n , a more q u a n t i t a t i v e approach is taken byextending the model of d e t e c t i o n r e f e r r e d to e a r l i e r , t o cover the c a l c u l a t i o n o f r e s o l u t i o n t h r e s h o l d s. T h e o r e t i c a l curves of increment t h r e s h o l d f o r r e s o l u t i o n , A I R , against background luminance, I , are d e r i v e d f o r the p a r a l l e l b a r s , Landolt C and 'd i s c s ' p a t t e r n s , and the e f f e c t of d i f f e r e n t assumptions i s examined.For d e t e c t i o n of an increment in luminanceA I , on a background luminance I the response of a d e t e c t o r o p t i m a l l y p o s i t i o n e d w i t h respect t o the stimulus so as to sample 1+AI, was compared w i t h the response of a d e t e c t o r sampling the background, I , alone. 
I t was argued t h a t the d i f f e r e n c e between the means of these two responses was the 's i g n a l 1, and t h a t t h i s s i g n a l must exceed the combined f l u c t u a t i o n s in these two responses by a constant r a t i o in order f o r the s i g n a l to be d e t e c t e d.For r e s o l u t i o n , i t i s proposed t h a t theresponses to be compared should be from twoi d e n t i c a l d e t e c t o r s , X and Y, which are centred over d i f f e r e n t regions o f the r e s o l u t i o n p a t t e r n. For the p a r a l l e l bar p a t t e r n , one is centred over a b a r , and the other is centred over an adjacent space between the b a r s. For the Landolt C, one is centred over the gap, and the other over a segment of the r i n g. In the case of the double d i s c p a t t e r n , one is centred on one of the d i s c s , the o t h e r over or near the gap between the two d i s c s. Again, the d i f f e r e n c e between the mean responses of the two d e t e c t o r s is taken as the 's i g n a l ' and i t i s argued t h a t t h i s must exceed the combined f l u c t u a t i o n s of the two responses by a constant r a t i o in order f o r the p a t t e r n to be r e s o l v e d.P r e d i c t e d Resolution ofP a r a l l e l Bar P a t t e r nFor reasons discussed in the l a s t s e c t i o n ,the r e s o l u t i o n of the bars in the p a r a l l e l bar p a t t e r n s w i l l be assumed to be performed by l i n e d e t e c t o r s. F igure 9(a) represents a l i n ed e t e c t o r , X, centred on a bar and a l i n ed e t e c t o r , Y, centred on an adjacent space between b a r 8. The response of each d e t e c t o r to each p o i n t o f the bar p a t t e r n i s p r o p o r t i o n a l t o the product of the l i g h t i n t e n s i t y and thes e n s i t i v i t y of the d e t e c t o r at t h a t p o i n t. Thus the e f f e c t i v e areas, o'E (X) and o'E (Y) of the bar p a t t e r n in causing a response in the d e t e c t o r s X and Y r e s p e c t i v e l y , may be found by i n t e g r a t i o n. The 's i g n a l' now becomes where C is a constant embodying geometrical f a c t o r s and the number of quanta per u n i t of l i g h t. The i n t e g r a t e d background which con-t r i b u t e s to noise becomes where I D r e p r e s e n t i n g an "eigengrau" is unimportant except near absolute t h r e s h o l d. Thus from equation (2), by p u t t i ng Ck - K' At a l l but the lowest background luminances, the terms in I D and A I R on the r i g h t hand side of t h i s equation are n e g l i g i b l e. The values of r m , L M , T and a p were f i r s t a l l considered to be set by the background luminance alone. The t h e o r e t i c a l curves thus derived are compared w i t h the e m p i r i c a l data in F igure 10. The values of K' and I D used were -1.22 and 7_10~7 mL r e s p e c t i v e l y. It w i l l be seen t h a t the t h e o r e t i c a l curves f i t the e m p i r i c a l p o i n t s w e l l f o r the whole range of backgrounds. However, t h i s good f i t is thought to be m i s -l e a d i n g f o r reasons explained l a t e r. I n the s e c t i o n 'R e s u l t s ', i t was noted t h a t at low background luminances, the small bar p a t t e r n had to be very b r i g h t compared w i t h the surrounding background, f o r i t t o b e r e s o l v e d. 
T h e r e f o r e , i n F igure 11, the t h e o r e t i c a l l y f i t t e d curves have been c a l c u l a t e d assuming r m , L m , T and a p are set by (I +A l /2), the mean luminance w i t h i n the bar p a t t e r n. The f i t f o r the medium and l a r g e p a t t e r n s is h a r d l y a f f e c t e d , but f o r the small p a t t e r n at low background luminances, the t h e o r e t i c a l p r e d i c t i o n i s s l i g h t l y o p t i m i s t i c. I n Table I I I , the absolute t h r e s h o l d values c a l c u l a t e d from equation (4) assuming r m , L^, T and <x p are set by A I /2 are compared w i t h the measured v a l u e s. Again, the f i t i s good f o r the l a r g e and medium p a t t e r n s , and the p r e d i c t i o n a l i t t l e o p t i m i s t i c f o r the small bar p a t t e r n a t low background luminances. This w i l l be r e f e r r e d t o l a t e r. Resolution of the Landolt C P a t t e r n For reasons discussed in the l a s t s e c t i o n , we s h a l l take it t h a t the p o s i t i o n of the gap in the Landolt C is discovered by l i g h t i n t e g r a t e d i n primary r e c i p i e n t u n i t s w i t h r a d i a l symmetry as in equation (3). F igure 9(b) i n d i c a t e s the p o s i t i o n s of the centres of the two u n i t s whose responses are compared in order to resolve the p a t t e r n. Unit Y was centred over the gap of the C. In the f i r s t case, u n i t X was considered to be d i a g o n a l l y opposite Y, since t h i s would give the maximum d i f f e r e n c e in response. Thee f f e c t i v e areas a E (X) and a E (Y) of the p a t t e r ncausing responses in the u n i t s X and Y respec-t i v e l y were c a l c u l a t e d by numerical i n t e g r a t i o n.By adapting equation (4), the incrementthreshold f o r r e s o l u t i o n, is given byAt a l l but the lowest background l e v e l s , the terms in I D and A I R on the r i g h t hand side of the equation are again n e g l i g i b l e. Figure 12 compares the e m p i r i c a l data w i t hthe t h e o r e t i c a l curves assuming that r m , " T and oip are f u n c t i o n s of I alone or of '"The values of K and I p used were -1*22 l o g u n i t s and . __ mL r e s p e c t i v e l y. For both sets of assumptions, the theory p r e d i c t s thresholds con-s i d e r a b l y lower than were in f a c t measured.I t was t h e r e f o r e a r b i t r a r i l y decided t oderive the t h e o r e t i c a l curves assuming that the comparison u n i t , X, was located only one gap's w i d t h away from the u n i t , Y, on the Landolt C r i n g (see Figure 9(b )). I n Figure 13, the r e s u l t i n g t h e o r e t i c a l curves are compared w i t h the experimental d a t a. The same values of K and I D were used. r m , Lm, T and a were considered f u n c t i o n s o f I o n l y. The f i t i s s t i l l not good, but is considerably b e t t e r than t h a t found when the u n i t X is d i a g o n a l l y opposite Y. The t r u t h probably l i e s somewhere between, and it may be p o s s i b l e to improve t h i s theory of the r e s o l u t i o n t h r e s h o l d of the Landolt C by more r e f i n e dp o s i t i o n i n g of the u n i t X. The u n i t should perhaps be p o s i t i o n e d a f i x e d distance from Y r a t h e r than a distance r e l a t e d to the gap w i d t h.Resolution of Double Pise P a t t e r nResolution is again assumed to be performed by primary r e c i p i e n t u n i t s. Figure 9(c)represents the two cases considered. 
In both conditions, unit X was located at the centre of a disc (see Figure 9). Unit Y was located (i) at the centre of the gap between the two discs, or (ii) at a distance of 1.64 times the disc radius from the centres of both discs. The effective areas a_E(X) and a_E(Y) of the discs in causing a response in the units X and Y respectively were calculated by numerical integration. Theoretical increment thresholds for resolution were calculated for case (i) and case (ii) by substitution in equation (4). Case (ii) provides the larger signal difference between the X and Y units and thus predicts the higher sensitivity. In Figure 14, the case (ii) predictions are compared with the empirical data for a value of K of -1.16 log units. The fit is reasonable at background luminances below 10^-2 mL, but above this adaptation level the experimentally determined increment thresholds are lower than those predicted. It is possible that part of the discrepancy at high background luminances may be due to edge detection superseding area detection. However, if K had been taken as -1.22, as in Figures 10, 11 and 12, the agreement would have been close at high light levels, but the predicted thresholds would have been 0.6 log units too low at low light levels. A possible reason for this will be discussed later. The theoretical values derived for case (i), with the unit Y positioned centrally between the two discs, have not been plotted because the results showed no improvement.

Detection of Squares

Increment thresholds for the area detection of squares were calculated using equation (5), simplifying the integration by assuming that a square is as detectable as an equal-area disc for this type of detection. It can be shown numerically that the error in the calculated increment threshold in making this assumption is less than 0.5%. The theoretical increment thresholds thus derived are compared with those obtained experimentally in Figure 15. The values of K and I_D used were -0.98 log units and 7×10^-7 mL respectively. The former was chosen to give the best approximation at low light levels, and for background luminances below 10^-2 mL the predicted values are in good agreement with the empirical measurements, but above this background level we find that the predicted thresholds are about 0.25 log units higher than those measured. To test whether or not this discrepancy is due to edge detection predominating at these light levels, theoretical edge detection curves were derived using equation (4), with the value of K' made equal to that of K used in the case of area detection. The theoretical area and edge detection curves for the squares are shown in Figure 16. It will be seen that edge detection only supersedes area detection for the large square at the highest background luminance.
Thus, the simple edge detection so far developed cannot explain the discrepancies at high background luminances. Summation of the four border effects is a possible factor. A feature of the theoretical curve fitting to the data of this experiment is that the value of K which has been chosen for the squares (-0.98 log units) is considerably lower than the value of K' used for the resolution of the bar pattern (-1.22 log units). Indeed, if the value K = -1.22 log units had been taken, the fit would have been good at high levels of background luminance, at the expense of the fit at the lowest light level, which would then be optimistic by 0.24 log units.

General Comments

We have seen that, depending on the choice of K, it is possible to fit theoretical curves either at the upper end or at the lower end of the background luminance range covered in these experiments.
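The detection criterion used throughout the passage above — the 'signal' is the difference between the mean responses of two identical detectors centred on different parts of the pattern, and the pattern is taken to be resolved when that signal exceeds the combined fluctuations of the two responses by a constant ratio — can be sketched numerically. The sketch below is only an illustration under assumed conditions: the Gaussian line-detector profile, the pattern dimensions, the quantum-count scaling and the criterion ratio K are all invented, since the source's own detector shapes, equations and constants are not reproduced here.

```python
import numpy as np

# Illustrative sketch (not from the source) of the two-detector criterion:
# the 'signal' is the difference between the responses of detectors X
# (centred on a bar) and Y (centred on an adjacent space), and the bars are
# taken to be resolved when the signal exceeds the combined Poisson
# fluctuations of the two responses by a constant ratio k.

def detector_profile(x, centre, width):
    """Assumed Gaussian sensitivity profile of a line detector."""
    return np.exp(-0.5 * ((x - centre) / width) ** 2)

def bar_pattern(x, background, delta_i, bar_width):
    """Square-wave parallel-bar pattern: bars at luminance background + delta_i."""
    in_bar = (np.floor(x / bar_width) % 2) == 0
    return np.where(in_bar, background + delta_i, background)

def is_resolved(background, delta_i, bar_width=1.0, k=3.0):
    x = np.linspace(-10.0, 10.0, 4001)
    dx = x[1] - x[0]
    lum = bar_pattern(x, background, delta_i, bar_width)
    s_x = detector_profile(x, 0.5 * bar_width, 0.3 * bar_width)  # over a bar
    s_y = detector_profile(x, 1.5 * bar_width, 0.3 * bar_width)  # over a space
    # Response of each detector: light intensity times sensitivity, integrated
    # over the pattern (treated as a quantum count, so the variance of each
    # response equals its mean).
    r_x = np.sum(lum * s_x) * dx
    r_y = np.sum(lum * s_y) * dx
    signal = abs(r_x - r_y)
    noise = np.sqrt(r_x + r_y)   # combined fluctuation of the two responses
    return signal >= k * noise

print(is_resolved(background=1000.0, delta_i=50.0))   # faint bars -> False
print(is_resolved(background=1000.0, delta_i=300.0))  # brighter bars -> True
```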
Thresholding for Making Classifiers Cost-sensitive

Victor S. Sheng, Charles X. Ling
Department of Computer Science
The University of Western Ontario, London, Ontario N6A 5B7, Canada
{ssheng, cling}@csd.uwo.ca

Abstract

In this paper we propose a very simple, yet general and effective method to make any cost-insensitive classifier (that can produce probability estimates) cost-sensitive. The method, called Thresholding, selects a proper threshold from the training instances according to the misclassification costs. Like other cost-sensitive meta-learning methods, Thresholding can convert any existing (and future) cost-insensitive learning algorithm or technique into a cost-sensitive one. However, compared with the existing cost-sensitive meta-learning methods and with the direct use of the theoretical threshold, Thresholding almost always produces the lowest misclassification cost. Experiments also show that Thresholding is the least sensitive to the misclassification cost ratio. Thus, it is recommended when the difference between misclassification costs is large.

Introduction

Classification is a primary task of inductive learning in machine learning. Many effective inductive learning techniques have been developed, such as naïve Bayes, decision trees, neural networks, and so on. However, most original classification algorithms ignore the differences between misclassification errors; that is, they implicitly assume that all misclassification errors cost equally. In many real-world applications, this assumption is not true. For example, in medical diagnosis, missing a cancer diagnosis (a false negative) is much more serious than the opposite error (a false positive); the patient could lose his or her life because of the delay in treatment. In many real-world applications, the differences between misclassification costs can be quite large. Cost-sensitive learning (Turney, 1995, 2000; Elkan, 2001; Zadrozny and Elkan, 2001; Lizotte, 2003; Ting, 1998) has received much attention in recent years to deal with this issue. Much work on handling different misclassification costs has been done, and it can be categorized into two groups. One is to design cost-sensitive learning algorithms directly (Turney, 1995; Drummond and Holte, 2000). The other is to design a wrapper that converts existing cost-insensitive base learning algorithms into cost-sensitive ones; the wrapper method is also called cost-sensitive meta-learning. Section 2 provides a more detailed review of cost-sensitive meta-learning approaches, such as relabeling (Domingos, 1999; Witten & Frank, 2005), sampling (Zadrozny et al., 2003), and weighting (Ting, 1998).

Cost-sensitive meta-learning methods are useful because they allow us to reuse existing base learning algorithms and their related improvements. Thresholding is another cost-sensitive meta-learning method, and it is applicable to any classifier that can produce probability estimates on training and test examples. Almost all classifiers (such as decision trees, naïve Bayes, and neural networks) can produce probability estimates on examples. Thresholding is very simple: it selects the probability that minimizes the total misclassification cost on the training instances as the threshold for predicting test instances. However, we will show that Thresholding is highly effective. It outperforms previous cost-sensitive meta-learning methods, and even the theoretical threshold, on almost all datasets. It is also the least sensitive when the difference in misclassification costs is high.
In the next section, we give an overview of previous work on cost-sensitive meta-learning, particularly MetaCost (Domingos, 1999) and Weighting (Ting, 1998). Section 3 describes Thresholding, which can convert any cost-insensitive classifier into a cost-sensitive one. The empirical evaluation is presented in Section 4, which is followed by conclusions in the last section.

Review of Previous Work

Cost-sensitive meta-learning converts existing cost-insensitive base learning algorithms into cost-sensitive ones without modifying them. Thus, it can be regarded as a middleware component that pre-processes the training data, or post-processes the output, for cost-insensitive learning algorithms.

Cost-sensitive meta-learning techniques can be classified into two main categories, sampling and non-sampling, in terms of whether the distribution of the training data is modified according to the misclassification costs. Costing (Zadrozny et al., 2003) belongs to the sampling category. This paper focuses on the non-sampling cost-sensitive meta-learning approaches, which can be further classified into three subcategories: relabeling, weighting, and threshold adjusting, described below.

The first is relabeling the classes of instances, by applying the minimum expected cost criterion (Michie, Spiegelhalter, and Taylor, 1994). Relabeling can be further divided into two branches: relabeling the training instances and relabeling the test instances. MetaCost (Domingos, 1999) belongs to the former, and CostSensitiveClassifier (CSC) (Witten & Frank, 2005) belongs to the latter.

Weighting (Ting, 1998) assigns a weight to each instance according to the misclassification cost of its class, so that the learning algorithm favors the class with the higher weight/cost.

The third subcategory is threshold adjusting. Thresholding belongs to this category: it searches for the best probability to use as a threshold for future prediction. We provide a detailed description of it in Section 3. In Section 4, we compare it with the other non-sampling methods, relabeling and weighting.

In (Elkan, 2001), the theoretical threshold for making an optimal decision on classifying instances into the positive class is obtained as

    T = C(0,1) / (C(0,1) + C(1,0)),    (2)

where C(j,i) is the misclassification cost of classifying an instance belonging to class j into class i. In this paper, we assume that there is no cost for true positives and true negatives, i.e., C(0,0) = C(1,1) = 0. (Elkan, 2001) further discusses how to use this formula to rebalance the training instances (e.g., via sampling) to turn cost-insensitive classifiers into cost-sensitive ones. In a later section, we will show that Thresholding, which searches for the best threshold, surprisingly outperforms the direct use of the theoretical threshold defined in (2).
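To make equation (2) concrete, the short sketch below computes the theoretical threshold from a pair of misclassification costs and applies it to some probability estimates. This is only an illustration of the formula, not any implementation used in the paper: the probability values are invented, and the costs are borrowed from the Breast-cancer row of Table 1 (FP = 85, FN = 201) purely as an example.

```python
# Hedged illustration of equation (2): T = C(0,1) / (C(0,1) + C(1,0)),
# where C(j,i) is the cost of classifying a class-j instance into class i.
# The costs and probability estimates below are example values only.

def theoretical_threshold(c_fp, c_fn):
    """c_fp = C(0,1), the false positive cost; c_fn = C(1,0), the false negative cost."""
    return c_fp / (c_fp + c_fn)

def predict_with_threshold(prob_positive, threshold):
    """Predict positive (1) when the estimated probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in prob_positive]

# Example costs roughly matching the Breast-cancer setting in Table 1.
t = theoretical_threshold(c_fp=85, c_fn=201)
print(round(t, 3))                                            # ~0.297
print(predict_with_threshold([0.10, 0.35, 0.60, 0.95], t))    # [0, 1, 1, 1]
```

With false negatives costing more than false positives, the threshold falls below 0.5, so the classifier predicts the positive class more readily than a cost-insensitive classifier would.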
Thresholding

As we discussed in the Introduction, almost all classification methods can produce probability estimates on instances (both training and test instances). Thresholding simply finds the best probability from the training instances to use as the threshold, and uses it to predict the class labels of test instances: a test example with a predicted probability greater than or equal to this threshold is predicted as positive; otherwise it is predicted as negative. Thus, for a given threshold, the total misclassification cost for a set of examples can be calculated, and this cost (M_C) is a function of the threshold (T); that is, M_C = f(T). The curve of this function can be obtained by computing the misclassification cost for each possible threshold. In practice, we only need to calculate the misclassification cost at each distinct probability estimate on the training examples. With this curve, Thresholding can simply choose the threshold that minimizes the total misclassification cost, with the following two improvements for tie breaking and overfitting avoidance.

There are in general three types of curves for the function M_C = f(T), as shown in Figure 1. Figure 1(a) shows a curve of the total misclassification cost with one global minimum. This is the ideal case. However, in practice, there may exist local minima in the curve M_C = f(T), as shown in Figures 1(b) and 1(c). Figure 1(b) shows a case with multiple local minima, one of which is smaller than all the others. In both cases ((a) and (b)) it is straightforward for Thresholding to select the threshold with the minimal total cost. Figure 1(c) shows a case with two or more local minima with the same value. We have designed a heuristic to resolve the tie: we select the local minimum with hills that are less steep on average; in other words, we select the local minimum whose "valley" has the wider span. The rationale behind this tie-breaking heuristic is that we prefer a local minimum that is less sensitive to small changes in the threshold selection. For the case shown in Figure 1(c), the span of the right "valley" is greater than that of the left. Thus, T_2 is chosen as the best threshold.

Figure 1. Typical curves for the total misclassification cost.

The other improvement is overfitting avoidance. Overfitting can occur if the threshold is obtained directly from the training instances: the best threshold obtained directly from the training instances may not generalize well to the test instances. To reduce overfitting, Thresholding searches for the best probability threshold on validation sets. More specifically, an m-fold cross-validation is applied, and the base learning algorithm predicts probability estimates on the validation sets. After this, the probability estimate of each training example is obtained (as it was in the validation set). Thresholding then simply picks the threshold that yields the minimum total misclassification cost (with the tie-breaking heuristic described earlier) and uses it for the test instances. Note that the test instances are not used for searching for the best threshold.
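A minimal sketch of the threshold search just described is given below, assuming the cross-validated probability estimates and true labels of the training examples are already available. The tie-breaking step only approximates the paper's "wider valley" heuristic (it picks the midpoint of the widest contiguous run of cost-minimising candidate thresholds), and the function names and toy data are invented for illustration.

```python
# Sketch of Thresholding's threshold selection: probs are cross-validated
# probability estimates for the training examples, labels are the true classes
# (1 = positive, 0 = negative), and c_fp / c_fn are the false positive and
# false negative costs.

def total_cost(probs, labels, threshold, c_fp, c_fn):
    cost = 0.0
    for p, y in zip(probs, labels):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 0:
            cost += c_fp          # false positive
        elif pred == 0 and y == 1:
            cost += c_fn          # false negative
    return cost

def select_threshold(probs, labels, c_fp, c_fn):
    # Candidate thresholds: the distinct predicted probabilities themselves.
    candidates = sorted(set(probs))
    costs = [total_cost(probs, labels, t, c_fp, c_fn) for t in candidates]
    best = min(costs)
    # Approximate the "wider valley" tie-break: find the widest contiguous run
    # of candidates whose cost equals the minimum, and return its midpoint.
    best_run, run = [], []
    for t, c in zip(candidates, costs):
        if c == best:
            run.append(t)
            if len(run) > len(best_run):
                best_run = run[:]
        else:
            run = []
    return best_run[len(best_run) // 2]

# Toy usage with invented cross-validated estimates.
probs = [0.05, 0.20, 0.30, 0.45, 0.55, 0.70, 0.80, 0.90]
labels = [0, 0, 1, 0, 1, 1, 0, 1]
print(select_threshold(probs, labels, c_fp=1.0, c_fn=5.0))   # 0.3
```

In the full procedure, the probabilities fed to select_threshold would come from the internal m-fold cross-validation, and the returned threshold would then be applied unchanged to the test instances.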
In all experiments, we use 10-fold cross validation in Thresholding .Comparing with Other Meta-Learning MethodsWe choose C4.5 (Quinlan, 1993) as the base learning algorithm. We first conduct experiments to compare the performance of Thresholding with existing meta-learning cost-sensitive methods: MetaCost, CSC and Weighting in CostSensitiveClassifier. Many researchers (Bauer and Kohavi, 1999; Domingos, 1999; Buhlmann and Yu, 2003; Zadrozny et al., 2003) have shown that Bagging (Breiman, 1996) can reliably improve base classifiers. As bagging has already been applied in MetaCost, we also apply bagging (with different numbers of bagging iterations) to Thresholding and CostSensitiveClassifier. We implement Thresholding in the popular machine learning toolbox WEKA (Witten & Frank, 2005). As MetaCost and CostSensitiveClassifier are already implemented in WEKA, we directly use these implementations in our experiments.As misclassification costs are not available for the datasets in the UCI Machine Learning Repository, we reasonably assign their values to be roughly the number ofinstances of the opposite class. This way, the rare class is more expensive if you predict it incorrectly. This is normally the case in the real-world applications. This setting can also reduce the potential effect of the class distribution, because the performance of MetaCost, CSC and Weighting in CostSensitiveClassifier may be affected by the base learners that implicitly make decisions based on the threshold 0.5. Later we will set misclassification costs to be independent of the number of examples.The experimental results, shown in Figure 3, are presented in terms of the average total cost via 10 runs over ten-fold cross-validation applied to all the methods. This is the external cross-validation for Thresholding . Note that Thresholding has an internal cross-validation (i.e., the m-fold cross validation described in Section 3), which is only used to search the proper threshold from the training set in Thresholding . Figure 2 shows the experiment process for Thresholding .1. Apply 10-fold cross-validation. That is, sample 90% data for training, and the rest (none-overlapping) is for testinga. Apply 10-fold cross-validation on the training data to find the proper threshold.Figure 3. Comparing Thresholding with other meta-learning approaches. The lower the total cost, the better.i.Apply the base learner on the internal trainingsetii.Predict probability estimates on the validation setb.Find the best threshold based on the predictedprobabilitiesc.Classify the examples in the test set with thethreshold obtained in step 1(a)2.Obtain the average total costFigure 2. The experiment process of Thresholding.In Figure 3, the vertical axis represents the total misclassification cost, and the horizontal axis represents the number of iterations in bagging. We summarize the experimental results in Table 2.Table 2. Summary of the experimental results. An entry w/t/l means that the approach at the corresponding row wins in w datasets, ties in t datasets, and loses in l datasets, compared to the approach at the corresponding column1.MetaCostCSCWeighting CSC 7/1/4Weighting 9/0/3 10/1/1Thresholding 9/1/2 9/1/2 6/1/5the results shown in Figure 3 and Table 2. First of all, MetaCost almost performs worse than other meta-learning algorithms. MetaCost may overfit the data as it uses the same learning algorithm to build the model as the one to relabel the training examples. 
Bagging does improve MetaCost's performance in all datasets tested, particularly in the first 10 iterations, but the improvements are not as significant as when bagging is applied to the other algorithms, particularly after 10 iterations. Second, CSC performs better than MetaCost in seven out of twelve datasets; in the other datasets, it is similar or worse. Third, overall, Weighting performs much better than MetaCost and CSC. Weighting performs worse than MetaCost only in three datasets (Car, Kr-vs-kp, and Monks-Problems-3); in the others, it outperforms MetaCost significantly. Compared with CSC, Weighting performs better in ten out of twelve datasets; in the other datasets, it is the same (Breast-w) or worse (Kr-vs-kp). Fourth, Thresholding outperforms MetaCost and CSC in nine out of twelve datasets each, and outperforms Weighting in six datasets; in the others, it is similar or worse. As with MetaCost, bagging does improve the performance of Thresholding, but not significantly. Without bagging (i.e., with a single iteration), Thresholding performs the best in nine out of twelve datasets. In all, we can conclude that Thresholding is the best, followed by Weighting, followed by CSC.

Both MetaCost and CSC belong to the relabeling category: the former relabels the training instances and the latter relabels the test instances. This leads us to believe that the relabeling approach is less satisfactory, and that Thresholding and Weighting are better meta-learning approaches. The experimental results in this section show that Thresholding is the best.

¹ As there are four points in each curve, we define that curve A wins over curve B if A has three or more points lower than their corresponding points in B. We also define that A ties with B if A has two points lower and the other two points higher than their corresponding points in B. In the remaining cases, curve A loses to curve B.

Sensitivity to Cost Ratios

In the last section, we compared the performance of the meta-learning methods under specific misclassification costs. In this section, we evaluate the sensitivity of these meta-learning methods to different cost ratios of 2:1, 5:1, 10:1, and 20:1 between false positive and false negative. These cost ratios are independent of the number of positive and negative instances.

Bagging (with 10 iterations) is still applied in all methods. The results, shown in Figure 4, are presented in terms of the average total cost (in units; we set the false negative misclassification cost to one unit) over ten-fold cross-validation. The vertical axis represents the total cost, and the horizontal axis represents the cost ratios. We summarize the results in Table 3.

Table 3. Summary of the experimental results (Figure 4). The definition of the entry w/t/l is the same as in Table 2.

              MetaCost  CSC     Weighting
CSC           2/0/10
Weighting     5/3/4     7/2/3
Thresholding  6/3/3     9/2/1   7/2/3

From the results in Figure 4 and Table 3, we can draw the following conclusions. First, the relative ranking of the meta-learning methods remains the same: Thresholding is the best, followed by Weighting. However, MetaCost is now much better than CSC. This shows that post-relabeling (CSC) becomes worse when the cost ratios increase. Thresholding outperforms all other methods for most cost ratios in seven out of twelve datasets tested. Second, overall, the total misclassification cost increases with increasing values of the cost ratio.
This is expected, as the sum of the false positive and false negative costs increases when the cost ratio increases.

Another interesting conclusion is that each method has a different sensitivity to the increase in the cost ratio. The sensitivity is reflected in how quickly the total misclassification cost increases as the cost ratio increases: the less quickly it increases, the better. We can see from Figure 4 that the increase in the total cost of CSC is almost always the greatest, followed by MetaCost. MetaCost is similar to Weighting only in three datasets (Breast-w, Credit-g, and Spect); Weighting outperforms MetaCost in five of the remaining datasets. Thresholding is again the best (i.e., has the slowest increase) in six out of twelve datasets tested. It performs better than MetaCost in six datasets, better than CSC in nine datasets, and better than Weighting in seven datasets. Except for two datasets (Credit-g and Diabetes), Thresholding is one of the best methods on the rest of the datasets. In all, we can conclude that CSC is the most sensitive to increases in the cost ratio, followed by MetaCost, followed by Weighting. Thresholding is the most resistant (the best) to large cost ratios. Thus, when the cost ratio is large, it is recommended over the other methods.

Theoretical Threshold

Thresholding searches for the best threshold from the training instances; however, it is time-consuming to search for the best threshold via cross-validation. How does it compare with the theoretical threshold reviewed earlier? In this section, we compare Thresholding with the direct use of the theoretical threshold. We conduct the same experiments as in the last subsection. The results are presented in Figure 5. We can see that Thresholding clearly outperforms the theoretical threshold in nine out of twelve datasets; they are exactly the same on the dataset Monks-Problems-3. In addition, Thresholding has a better resistance (insensitivity) to large cost ratios, particularly on the datasets Breast-w, Car, Credit-g, Hepatitis, Kr-vs-kp, Spectf, Spect, and Tic-tac-toe. We can thus conclude that it is worth spending the time to search for the best threshold in Thresholding.

Table 4. Comparing the theoretical threshold with the other approaches.

              MetaCost  CSC     Weighting
Theoretical   6/0/6     9/2/1   5/0/7

The results shown in Figure 5 are summarized against the other methods in Table 4. We can see that the direct use of the theoretical threshold performs much better than CSC, although it ties with MetaCost and is worse than Weighting.

Conclusions and Future Work

Thresholding is a general method to make any cost-insensitive learning algorithm cost-sensitive. It is a simple yet direct approach, as it learns the best threshold from the training instances. Thus, the best threshold chosen reflects not only the different misclassification costs but also the data distribution. We were surprised by its good performance; our repeated experiments show that Thresholding outperforms the other existing cost-sensitive meta-learning methods, such as MetaCost, CSC, and Weighting, as well as the direct use of the theoretical threshold. Thresholding also has the best resistance (insensitivity) to large misclassification cost ratios. Thus, it is recommended especially when the difference in misclassification costs is large. In future work, we plan to apply Thresholding to datasets with multiple classes.

Acknowledgements

We thank the anonymous reviewers for their useful comments and suggestions. The authors thank NSERC for the support of their research.

References
Bauer, E., and Kohavi, R. 1999. An empirical comparison of voting classification algorithms: bagging, boosting and variants. Machine Learning, 36(1/2):105-139.

Blake, C.L., and Merz, C.J. 1998. UCI Repository of Machine Learning Databases (website). Irvine, CA: University of California, Department of Information and Computer Science.

Breiman, L., Friedman, J.H., Olshen, R.A., and Stone, C.J. 1984. Classification and Regression Trees. Wadsworth, Belmont, CA.

Breiman, L. 1996. Bagging predictors. Machine Learning, 24:123-140.

Buhlmann, P., and Yu, B. 2003. Analyzing bagging. Annals of Statistics.

Chawla, N.V., Japkowicz, N., and Kolcz, A. eds. 2004. Special Issue on Learning from Imbalanced Datasets. SIGKDD, 6(1). ACM Press.

Domingos, P. 1999. MetaCost: A general method for making classifiers cost-sensitive. In Proceedings of the Fifth International Conference on Knowledge Discovery and Data Mining, 155-164. ACM Press.

Drummond, C., and Holte, R. 2000. Exploiting the cost (in)sensitivity of decision tree splitting criteria. In Proceedings of the 17th International Conference on Machine Learning, 239-246.

Drummond, C., and Holte, R.C. 2003. C4.5, class imbalance, and cost sensitivity: Why under-sampling beats over-sampling. In Workshop on Learning from Imbalanced Datasets II.

Elkan, C. 2001. The foundations of cost-sensitive learning. In Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence, 973-978. Seattle, Washington: Morgan Kaufmann.

Lizotte, D., Madani, O., and Greiner, R. 2003. Budgeted learning of naïve-Bayes classifiers. In Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence. Acapulco, Mexico: Morgan Kaufmann.

Michie, D., Spiegelhalter, D.J., and Taylor, C.C. 1994. Machine Learning, Neural and Statistical Classification. Ellis Horwood Limited.

Quinlan, J.R. 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann.

Ting, K.M. 1998. Inducing cost-sensitive trees via instance weighting. In Proceedings of the Second European Symposium on Principles of Data Mining and Knowledge Discovery, 23-26. Springer-Verlag.

Turney, P.D. 1995. Cost-sensitive classification: Empirical evaluation of a hybrid genetic decision tree induction algorithm. Journal of Artificial Intelligence Research 2:369-409.

Turney, P.D. 2000. Types of cost in inductive concept learning. In Proceedings of the Workshop on Cost-Sensitive Learning at the Seventeenth International Conference on Machine Learning, Stanford University, California.

Weiss, G., and Provost, F. 2003. Learning when training data are costly: The effect of class distribution on tree induction. Journal of Artificial Intelligence Research 19:315-354.

Witten, I.H., and Frank, E. 2005. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann Publishers.

Zadrozny, B., Langford, J., and Abe, N. 2003. Cost-sensitive learning by cost-proportionate instance weighting. In Proceedings of the Third International Conference on Data Mining.

Zadrozny, B., and Elkan, C. 2001. Learning and making decisions when costs and probabilities are both unknown. In Proceedings of the Seventh International Conference on Knowledge Discovery and Data Mining, 204-213.