Multiple nose region matching for 3D face recognition under varying facial expression


Face recognition and digital image processing: English-Chinese translation of foreign literature for a graduation project (manual translation, original source included)

Translation of face-recognition literature (original and translation, with source). The original text is from: Thomas David Heseltine BSc. Hons., The University of York, Department of Computer Science, for the qualification of PhD, September 2005, "Face Recognition: Two-Dimensional and Three-Dimensional Techniques".

4 Two-dimensional Face Recognition

4.1 Feature Localization

Before discussing the methods of comparing two facial images we now take a brief look at some of the preliminary processes of facial feature alignment. This process typically consists of two stages: face detection and eye localisation. Depending on the application, if the position of the face within the image is known beforehand (for a cooperative subject in a door access system, for example) then the face detection stage can often be skipped, as the region of interest is already known. Therefore, we discuss eye localisation here, with a brief discussion of face detection in the literature review (section 3.1.1).

The eye localisation method is used to align the 2D face images of the various test sets used throughout this section. However, to ensure that all results presented are representative of the face recognition accuracy and not a product of the performance of the eye localisation routine, all image alignments are manually checked and any errors corrected, prior to testing and evaluation.

We detect the position of the eyes within an image using a simple template-based method. A training set of manually pre-aligned images of faces is taken, and each image cropped to an area around both eyes. The average image is calculated and used as a template.

Figure 4-1 - The average eyes. Used as a template for eye detection.

Both eyes are included in a single template, rather than individually searching for each eye in turn, as the characteristic symmetry of the eyes either side of the nose provides a useful feature that helps distinguish between the eyes and other false positives that may be picked up in the background, although this method is highly susceptible to scale (i.e. subject distance from the camera) and also introduces the assumption that eyes in the image appear near horizontal. Some preliminary experimentation also reveals that it is advantageous to include the area of skin just beneath the eyes: in some cases the eyebrows can closely match the template, particularly if there are shadows in the eye-sockets, but the area of skin below the eyes helps to distinguish the eyes from eyebrows (the area just below the eyebrows contains eyes, whereas the area below the eyes contains only plain skin).

A window is passed over the test images and the absolute difference taken to that of the average eye image shown above. The area of the image with the lowest difference is taken as the region of interest containing the eyes. Applying the same procedure using a smaller template of the individual left and right eyes then refines each eye position.

This basic template-based method of eye localisation, although providing fairly precise localisations, often fails to locate the eyes completely. However, we are able to improve performance by including a weighting scheme.

Eye localisation is performed on the set of training images, which is then separated into two sets: those in which eye detection was successful, and those in which eye detection failed. Taking the set of successful localisations we compute the average distance from the eye template (Figure 4-2, top). Note that the image is quite dark, indicating that the detected eyes correlate closely to the eye template, as we would expect.
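The sliding-window search described above is straightforward to prototype. Below is a minimal NumPy sketch (illustrative only, not the thesis's code; function names and array shapes are assumptions): it builds the average-eye template and scores every window position by the sum of absolute differences, with an optional weights image as in the scheme discussed next.

```python
import numpy as np

def build_eye_template(aligned_eye_crops):
    # Average a stack of manually pre-aligned eye-region crops (N x h x w).
    return np.stack(aligned_eye_crops).mean(axis=0)

def locate_eyes(image, template, weights=None):
    # Slide the template over the image and score each window by the
    # (optionally weighted) sum of absolute differences; lowest score wins.
    h, w = template.shape
    H, W = image.shape
    best_score, best_pos = np.inf, (0, 0)
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            diff = np.abs(image[r:r + h, c:c + w] - template)
            if weights is not None:
                diff = diff * weights  # weighting scheme of Figure 4-3
            score = diff.sum()
            if score < best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score
```

Re-running the same routine with the smaller single-eye templates inside the returned region gives the refined per-eye positions.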
However, bright points do occur near the whites of the eye, suggesting that this area is often inconsistent, varying greatly from the average eye template.

Figure 4-2 - Distance to the eye template for successful detections (top) indicating variance due to noise, and failed detections (bottom) showing credible variance due to mis-detected features.

In the lower image (Figure 4-2, bottom), we have taken the set of failed localisations (images of the forehead, nose, cheeks, background etc. falsely detected by the localisation routine) and once again computed the average distance from the eye template. The bright pupils surrounded by darker areas indicate that a failed match is often due to the high correlation of the nose and cheekbone regions overwhelming the poorly correlated pupils. Wanting to emphasise the difference of the pupil regions for these failed matches and minimise the variance of the whites of the eyes for successful matches, we divide the lower image values by the upper image to produce a weights vector as shown in Figure 4-3. When applied to the difference image before summing a total error, this weighting scheme provides a much improved detection rate.

Figure 4-3 - Eye template weights used to give higher priority to those pixels that best represent the eyes.

4.2 The Direct Correlation Approach

We begin our investigation into face recognition with perhaps the simplest approach, known as the direct correlation method (also referred to as template matching by Brunelli and Poggio [29]), involving the direct comparison of pixel intensity values taken from facial images. We use the term "direct correlation" to encompass all techniques in which face images are compared directly, without any form of image space analysis, weighting schemes or feature extraction, regardless of the distance metric used. Therefore, we do not infer that Pearson's correlation is applied as the similarity function (although such an approach would obviously come under our definition of direct correlation). We typically use the Euclidean distance as our metric in these investigations (inversely related to Pearson's correlation, it can be considered a scale- and translation-sensitive form of image correlation), as this persists with the contrast made between image space and subspace approaches in later sections.

Firstly, all facial images must be aligned such that the eye centres are located at two specified pixel coordinates and the image cropped to remove any background information. These images are stored as greyscale bitmaps of 65 by 82 pixels and prior to recognition converted into a vector of 5330 elements (each element containing the corresponding pixel intensity value). Each corresponding vector can be thought of as describing a point within a 5330-dimensional image space. This simple principle can easily be extended to much larger images: a 256 by 256 pixel image occupies a single point in 65,536-dimensional image space and again, similar images occupy close points within that space. Likewise, similar faces are located close together within the image space, while dissimilar faces are spaced far apart. Calculating the Euclidean distance d between two facial image vectors (often referred to as the query image q and gallery image g), we get an indication of similarity. A threshold is then applied to make the final verification decision:

    d = ||q - g||;    d <= threshold => accept,    d > threshold => reject.    Equ. 4-1
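Equ. 4-1 in code form (a minimal sketch; the 65 x 82 image size and the variable names follow the text above):

```python
import numpy as np

def direct_correlation_verify(query_img, gallery_img, threshold):
    # Flatten each 65 x 82 greyscale image into a 5330-element vector
    # and apply the Euclidean-distance decision rule of Equ. 4-1.
    q = np.asarray(query_img, dtype=np.float64).ravel()
    g = np.asarray(gallery_img, dtype=np.float64).ravel()
    d = np.linalg.norm(q - g)
    return d <= threshold  # True = accept, False = reject
```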
4.2.1 Verification Tests

The primary concern in any face recognition system is its ability to correctly verify a claimed identity or determine a person's most likely identity from a set of potential matches in a database. In order to assess a given system's ability to perform these tasks, a variety of evaluation methodologies have arisen. Some of these analysis methods simulate a specific mode of operation (i.e. secure site access or surveillance), while others provide a more mathematical description of data distribution in some classification space. In addition, the results generated from each analysis method may be presented in a variety of formats. Throughout the experimentations in this thesis, we primarily use the verification test as our method of analysis and comparison, although we also use Fisher's Linear Discriminant to analyse individual subspace components in section 7 and the identification test for the final evaluations described in section 8.

The verification test measures a system's ability to correctly accept or reject the proposed identity of an individual. At a functional level, this reduces to two images being presented for comparison, for which the system must return either an acceptance (the two images are of the same person) or rejection (the two images are of different people). The test is designed to simulate the application area of secure site access. In this scenario, a subject will present some form of identification at a point of entry, perhaps as a swipe card, proximity chip or PIN number. This number is then used to retrieve a stored image from a database of known subjects (often referred to as the target or gallery image) and compared with a live image captured at the point of entry (the query image). Access is then granted depending on the acceptance/rejection decision.

The results of the test are calculated according to how many times the accept/reject decision is made correctly. In order to execute this test we must first define our test set of face images. Although the number of images in the test set does not affect the results produced (as the error rates are specified as percentages of image comparisons), it is important to ensure that the test set is sufficiently large such that statistical anomalies become insignificant (for example, a couple of badly aligned images matching well). Also, the type of images (high variation in lighting, partial occlusions etc.) will significantly alter the results of the test. Therefore, in order to compare multiple face recognition systems, they must be applied to the same test set.

However, it should also be noted that if the results are to be representative of system performance in a real-world situation, then the test data should be captured under precisely the same circumstances as in the application environment. On the other hand, if the purpose of the experimentation is to evaluate and improve a method of face recognition, which may be applied to a range of application environments, then the test data should present the range of difficulties that are to be overcome. This may mean including a greater percentage of 'difficult' images than would be expected in the perceived operating conditions, and hence higher error rates in the results produced. Below we provide the algorithm for executing the verification test. The algorithm is applied to a single test set of face images, using a single function call to the face recognition algorithm: CompareFaces(FaceA, FaceB).
This call is used to compare two facial images, returning a distance score indicating how dissimilar the two face images are: the lower the score, the more similar the two face images. Ideally, images of the same face should produce low scores, while images of different faces should produce high scores.

Every image is compared with every other image, no image is compared with itself and no pair is compared more than once (we assume that the relationship is symmetrical). Once two images have been compared, producing a similarity score, the ground truth is used to determine if the images are of the same person or different people. In practical tests this information is often encapsulated as part of the image filename (by means of a unique person identifier). Scores are then stored in one of two lists: a list containing scores produced by comparing images of different people and a list containing scores produced by comparing images of the same person. The final acceptance/rejection decision is made by application of a threshold. Any incorrect decision is recorded as either a false acceptance or false rejection. The false rejection rate (FRR) is calculated as the percentage of scores from the same people that were classified as rejections. The false acceptance rate (FAR) is calculated as the percentage of scores from different people that were classified as acceptances.

    For IndexA = 0 to length(TestSet):
        For IndexB = IndexA + 1 to length(TestSet):
            Score = CompareFaces(TestSet[IndexA], TestSet[IndexB])
            If IndexA and IndexB are the same person:
                Append Score to AcceptScoresList
            Else:
                Append Score to RejectScoresList

    For Threshold = minimum score to maximum score:
        FalseAcceptCount = 0
        FalseRejectCount = 0
        For each Score in RejectScoresList:
            If Score <= Threshold:
                Increase FalseAcceptCount
        For each Score in AcceptScoresList:
            If Score > Threshold:
                Increase FalseRejectCount
        FalseAcceptRate = FalseAcceptCount / length(RejectScoresList)
        FalseRejectRate = FalseRejectCount / length(AcceptScoresList)
        Add plot to error curve at (FalseRejectRate, FalseAcceptRate)

These two error rates express the inadequacies of the system when operating at a specific threshold value. Ideally, both these figures should be zero, but in reality reducing either the FAR or FRR (by altering the threshold value) will inevitably result in increasing the other. Therefore, in order to describe the full operating range of a particular system, we vary the threshold value through the entire range of scores produced. The application of each threshold value produces an additional FAR/FRR pair, which when plotted on a graph produces the error rate curve shown below.

Figure 4-5 - Example error rate curve produced by the verification test (false acceptance rate / %).

The equal error rate (EER) can be seen as the point at which FAR is equal to FRR. This EER value is often used as a single figure representing the general recognition performance of a biometric system and allows for easy visual comparison of multiple methods. However, it is important to note that the EER does not indicate the level of error that would be expected in a real-world application. It is unlikely that any real system would use a threshold value such that the percentage of false acceptances were equal to the percentage of false rejections.
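The listing above in runnable form (a NumPy sketch; since scores are distances, impostor scores at or below the threshold count as false accepts):

```python
import numpy as np

def error_rates(accept_scores, reject_scores):
    # accept_scores: distances from same-person comparisons;
    # reject_scores: distances from different-person comparisons.
    accept = np.asarray(accept_scores, dtype=float)
    reject = np.asarray(reject_scores, dtype=float)
    curve = []
    for t in np.unique(np.concatenate([accept, reject])):
        far = float(np.mean(reject <= t))  # impostors wrongly accepted
        frr = float(np.mean(accept > t))   # genuine pairs wrongly rejected
        curve.append((t, far, frr))
    return curve

def equal_error_rate(curve):
    # EER: the operating point where FAR and FRR are (approximately) equal.
    t, far, frr = min(curve, key=lambda p: abs(p[1] - p[2]))
    return (far + frr) / 2.0, t
```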
Secure site access systems would typically set the threshold such that false acceptances were significantly lower than false rejections: unwilling to tolerate intruders at the cost of inconvenient access denials. Surveillance systems, on the other hand, would require low false rejection rates to successfully identify people in a less controlled environment. Therefore we should bear in mind that a system with a lower EER might not necessarily be the better performer towards the extremes of its operating capability.

There is a strong connection between the above graph and the receiver operating characteristic (ROC) curves also used in such experiments. Both graphs are simply two visualisations of the same results, in that the ROC format uses the true acceptance rate (TAR), where TAR = 1.0 - FRR, in place of the FRR, effectively flipping the graph vertically. Another visualisation of the verification test results is to display both the FRR and FAR as functions of the threshold value. This presentation format provides a reference to determine the threshold value necessary to achieve a specific FRR and FAR. The EER can be seen as the point where the two curves intersect.

Figure 4-6 - Example error rate curve as a function of the score threshold.

The fluctuation of these error curves due to noise and other errors is dependent on the number of face image comparisons made to generate the data. A small dataset that only allows for a small number of comparisons will result in a jagged curve, in which large steps correspond to the influence of a single image on a high proportion of the comparisons made. A typical dataset of 720 images (as used in section 4.2.2) provides 258,840 verification operations, hence a drop of 1% EER represents an additional 2,588 correct decisions, whereas the quality of a single image could cause the EER to fluctuate by up to 0.28%.

4.2.2 Results

As a simple experiment to test the direct correlation method, we apply the technique described above to a test set of 720 images of 60 different people, taken from the AR Face Database [39]. Every image is compared with every other image in the test set to produce a likeness score, providing 258,840 verification operations from which to calculate false acceptance rates and false rejection rates. The error curve produced is shown in Figure 4-7.

Figure 4-7 - Error rate curve produced by the direct correlation method using no image preprocessing.

We see that an EER of 25.1% is produced, meaning that at the EER threshold approximately one quarter of all verification operations carried out resulted in an incorrect classification. There are a number of well-known reasons for this poor level of accuracy. Tiny changes in lighting, expression or head orientation cause the location in image space to change dramatically. Images in face space are moved far apart due to these image capture conditions, despite being of the same person's face. The distance between images of different people becomes smaller than the area of face space covered by images of the same person, and hence false acceptances and false rejections occur frequently. Other disadvantages include the large amount of storage necessary for holding many face images and the intensive processing required for each comparison, making this method unsuitable for applications applied to a large database. In section 4.3 we explore the eigenface method, which attempts to address some of these issues.

A Robust Face Tracking Algorithm Combining Mean Shift with Color Feature Point Extraction

Keywords: face tracking; mean shift; Harris detector; color feature points
CLC number: TP391.4; Document code: A; Article ID: 1005-9490 (2007) 05-1712-04

A mean shift algorithm is used to speed up processing. The skin-color-based mean shift algorithm is insensitive to facial rotation, deformation and background motion, and its computational cost is small. However, tracking on skin-color information alone may erroneously track other near-skin-colored objects. Building on skin-color tracking, this paper extracts feature points from the facial region to obtain more facial feature information, which effectively eliminates the interference of near-skin-colored objects and maintains stable tracking of each face region even when multiple faces cross and overlap.
3 Mean shift tracking combined with feature points

In the proposed algorithm, the face region to be tracked is selected manually in the first frame of the video, and its feature points are extracted with a color Harris detector. In the subsequent image sequence, the mean shift algorithm first estimates the face position roughly from the chrominance information of human skin color; feature points are then extracted from this rough region and matched against the facial feature points extracted earlier. A region satisfying the matching constraint is the face tracked in the previous frame; otherwise it is a near-skin-colored region.
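For orientation, here is a rough OpenCV sketch of this tracking loop (not the authors' code): skin color is modeled with a hue histogram and back-projection rather than the paper's exact chrominance model, OpenCV's grey-level cornerHarris stands in for the color Harris detector, and the point-matching constraint check is left out.

```python
import cv2
import numpy as np

cap = cv2.VideoCapture("input.avi")
ok, frame = cap.read()
x, y, w, h = cv2.selectROI("init", frame)          # manual face selection
track_window = (x, y, w, h)

# Model the skin color of the selected region as a hue histogram.
hsv_roi = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2HSV)
roi_hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    back_proj = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
    # Rough face position from skin color (the mean shift step).
    _, track_window = cv2.meanShift(back_proj, track_window, term_crit)
    x, y, w, h = track_window
    # Feature points inside the rough window (grey-level Harris as a
    # stand-in for the paper's color Harris detector).
    gray = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)
    corners = cv2.cornerHarris(np.float32(gray), 2, 3, 0.04)
    # ...these corners would be matched against the first-frame points to
    # reject near-skin-colored distractors (constraint check omitted).
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("tracking", frame)
    if cv2.waitKey(30) == 27:
        break
```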
1 Skin-color-based mean shift tracking

The mean shift algorithm is a non-parametric density-gradient estimation algorithm, proposed by Fukunaga [1] in 1975 and introduced into the computer field by Cheng [2] in 1995. Compared with the traditional exhaustive search within a window, ...
Received: 2006-10-10. Author: Han Qiulei (1978-), female, PhD candidate; main research interests: digital image processing, face detection and tracking. E-mail: hanqiulei@gmail.com.

Chinese Journal of Electron Devices, Vol. 30, No. 5, October 2007.

7 - HALCON: Introduction to 3D Machine Vision Methods (1)

[Slide: the HALCON calibration data model]

Configuration: create the calibration data model (create_calib_data) and supply the initial parameters and calibration settings ('pose', 'params', ...). Observed points are then added for each camera and calibration-object pose, the optimization parameters are configured, and the calibration result is stored in the model (CalibModelID). Data gathering and parameter setup are handled by flexible operators, e.g.:

    set_calib_data_observ_points (:: CalibDataID, CameraIndex, CalibObjIndex, CalibObjPoseIndex, Row, Column, Index, Pose :)

[Slide: "Select Your Solution" decision flow (reconstructed from the diagram labels)]

- What is known about the object geometry? Branches: planar, geometric primitives, CAD model, no 3D object model / unknown.
- Planar object with clear edges -> perspective, deformable matching; otherwise feature points and vector-to-pose.
- Unknown geometry or no 3D object model -> 3D reconstruction, e.g. multi-view stereo or the sheet-of-light technique; geometric primitives -> primitive fitting.
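The calibration workflow above maps onto a short sequence of operator calls. Below is a rough sketch assuming MVTec's official HALCON/Python bindings (import halcon as ha), which expose operators under their HDevelop names; the exact signatures, the start values, and the 'calibration_object' setup string are assumptions to be checked against the HALCON reference manual.

```python
import halcon as ha  # assumption: MVTec's official Python bindings

def calibrate_single_camera(observations, start_params, start_pose):
    """observations: list of (rows, cols) image points, one entry per
    calibration-object pose; start_params/start_pose: user start values."""
    # Configuration: create the calibration data model.
    calib_id = ha.create_calib_data('calibration_object', 1, 1)
    # Initial parameters: camera type and start values ('params').
    # (A real setup would also call set_calib_data_calib_object to
    # describe the calibration plate.)
    ha.set_calib_data_cam_param(calib_id, 0, 'area_scan_division', start_params)
    # Data gathering: observed points for each calibration-object pose.
    for pose_idx, (rows, cols) in enumerate(observations):
        ha.set_calib_data_observ_points(
            calib_id, 0, 0, pose_idx, rows, cols, 'all', start_pose)
    # Optimization: calibrate, then read the result out of the model.
    error = ha.calibrate_cameras(calib_id)
    cam_params = ha.get_calib_data(calib_id, 'camera', 0, 'params')
    return cam_params, error
```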

Multidimensional Specification Table for Item Writing in a Junior Middle School English Academic Quality Test

Three sample versions are provided below for reference.

Sample 1: Middle School English Academic Quality Inspection Test - Multidimensional Detailed Table for Item Making

Introduction

The middle school English academic quality inspection test is an important measure for evaluating teaching quality and students' learning outcomes. In order to accurately assess students' English proficiency and provide effective feedback to teachers, it is necessary to design a comprehensive, multidimensional detailed table for item making. This table should cover various aspects of language skills, such as listening, speaking, reading, writing, vocabulary, grammar, and cultural knowledge.

Listening

Listening skills are essential for communication in English. In the detailed table, listening tasks should include multiple-choice questions, fill-in-the-blanks, and short answer questions. These tasks should cover various topics, accents, and speech speeds to test students' comprehension abilities. Additionally, it is important to include tasks that require students to infer meaning from context, identify main ideas, and make predictions based on the information provided.

Speaking

Speaking skills are crucial for students to express themselves fluently and accurately in English. The speaking tasks in the detailed table should include role-plays, presentations, interviews, and discussions. These tasks should assess students' ability to use appropriate vocabulary, grammar structures, and pronunciation in different communicative contexts. Furthermore, students should be evaluated on their fluency, coherence, and confidence in spoken English.

Reading

Reading skills are essential for students to understand and interpret written texts in English. The reading tasks in the detailed table should include multiple-choice questions, true/false questions, matching exercises, and short answer questions. These tasks should cover a variety of text types, such as articles, essays, dialogues, and advertisements. Moreover, students should be tested on their ability to identify main ideas, supporting details, tone, and the purpose of the text.

Writing

Writing skills are important for students to convey their ideas clearly and persuasively in English. The writing tasks in the detailed table should include essays, reports, letters, and emails. These tasks should assess students' ability to organize their ideas coherently, use appropriate vocabulary and grammar, and express themselves effectively. Additionally, students should be evaluated on their writing style, creativity, and critical thinking skills.

Vocabulary

Vocabulary knowledge is crucial for students to understand and use a wide range of words in English. The vocabulary tasks in the detailed table should include matching exercises, word formation, collocations, and synonyms/antonyms. These tasks should cover common vocabulary themes, such as daily routines, travel, education, health, and technology. Furthermore, students should be tested on their ability to use context clues to determine the meaning of unfamiliar words.

Grammar

Grammar is the foundation of language learning and essential for students to communicate accurately in English. The grammar tasks in the detailed table should include multiple-choice questions, sentence completion, error correction, and sentence transformation. These tasks should cover various grammar structures, such as tenses, articles, prepositions, conjunctions, and verb forms.
Moreover, students should be tested on their ability to apply grammar rules in different contexts and avoid common errors.

Cultural Knowledge

Cultural knowledge is important for students to understand the customs, traditions, and values of English-speaking countries. The cultural knowledge tasks in the detailed table should include reading passages, videos, and audio clips that introduce different aspects of English-speaking cultures. These tasks should cover topics such as holidays, food, music, sports, and social norms. Additionally, students should be tested on their ability to compare and contrast their own culture with English-speaking cultures.

Conclusion

In conclusion, designing a multidimensional detailed table for middle school English academic quality inspection test items is essential for evaluating students' language skills comprehensively. By covering various aspects of listening, speaking, reading, writing, vocabulary, grammar, and cultural knowledge, this table can provide valuable insights into students' strengths and areas for improvement. It is important for test developers to create tasks that are challenging, engaging, and aligned with the learning objectives of the curriculum. By using this detailed table, teachers can assess students' English proficiency accurately and provide targeted support to help them succeed in their language learning journey.

Sample 2: Middle School English Academic Quality Inspection Test Questionnaire - Multidimensional Detailed Table

I. Listening
  A. Listening Comprehension
    1. Listen to a conversation and answer multiple-choice questions
    2. Listen to a passage and fill in the blanks
    3. Listen to a dialogue and rearrange the sentences in the correct order
  B. Listening Skills
    1. Listen to a passage and identify main ideas
    2. Listen to a dialogue and summarize key points
    3. Listen to a conversation and infer the speakers' intentions

II. Reading
  A. Reading Comprehension
    1. Read a passage and answer questions
    2. Read a text and identify main themes
    3. Read a dialogue and infer characters' emotions
  B. Reading Skills
    1. Read a passage and make predictions about the outcome
    2. Read a text and analyze the author's purpose
    3. Read a dialogue and identify persuasive techniques

III. Writing
  A. Writing Skills
    1. Write a descriptive paragraph about a given topic
    2. Write an argumentative essay supporting a viewpoint
    3. Write a narrative story based on a prompt
  B. Writing Mechanics
    1. Use correct grammar and punctuation in writing
    2. Organize ideas logically in writing
    3. Use a variety of vocabulary and sentence structures in writing

IV. Speaking
  A. Presentation Skills
    1. Give a short presentation on a given topic
    2. Answer questions during a presentation
    3. Use visual aids effectively during a presentation
  B. Interaction Skills
    1. Role-play a conversation in a given scenario
    2. Engage in a group discussion on a topic
    3. Demonstrate active listening skills during a conversation

In conclusion, the Middle School English Academic Quality Inspection Test Questionnaire should encompass various dimensions of language learning, including listening, reading, writing, and speaking. By incorporating diverse question types and tasks, educators can assess students' language proficiency comprehensively and accurately.

Sample 3: Middle School English Academic Quality Inspection Test Questionnaire - Multidimensional Detailed List

I. Vocabulary and Grammar:

1. Fill in the blanks with the correct form of the given words:
  a. The teacher asked the students to (study) for the upcoming exam.
  b. My sister is a (success) lawyer in the city.
  c. The (rain) season in this region lasts for several months.

2. Choose the correct word to complete the sentence:
  a. The dog (lie/lay) in the sun all afternoon.
  b. She is studying (hardly/hard) for the test.
  c. We are going to the park (after/before) lunch.

3. Match the following sentences with the correct tenses:
  a. I have finished my homework. - 1. Present Perfect
  b. She will be here in an hour. - 2. Future Simple
  c. They had already left when I arrived. - 3. Past Perfect

II. Reading Comprehension:

Read the passage below and answer the questions that follow:

The Great Wall of China is one of the most famous landmarks in the world. It was built over 2,000 years ago to protect the Chinese borders from invaders. The wall is over 13,000 miles long and is made of stone, brick, and wood. Many people visit the Great Wall every year to see its beauty and learn about its history.

1. Why was the Great Wall of China built?
2. How long is the Great Wall?
3. What materials were used to build the wall?
4. Why do many people visit the Great Wall?

III. Writing:

Write a paragraph describing your favorite holiday and why you enjoy it. Include details about where you go, what you do, and who you spend time with.

IV. Listening:

Listen to the audio recording and answer the following questions:

1. What is the speaker's name?
2. Where is the speaker from?
3. What is he/she talking about?
4. What does he/she plan to do next weekend?

V. Speaking:

Prepare a short presentation on a topic of your choice and present it to the class. Include relevant information, examples, and visuals to support your presentation.

By using this multidimensional detailed list for the middle school English academic quality inspection test questionnaire, educators can effectively evaluate students' language skills in various areas, including vocabulary, grammar, reading comprehension, writing, listening, and speaking. This comprehensive approach ensures a thorough assessment of students' English proficiency and helps them improve their language abilities.

Robust Matching of 3D Lung Vessel Trees

Robust Matching of 3D Lung Vessel Trees

Dirk Smeets, Pieter Bruyninckx, Johannes Keustermans, Dirk Vandermeulen, and Paul Suetens
K.U.Leuven, Faculty of Engineering, ESAT/PSI, Medical Imaging Research Center, UZ Gasthuisberg, Herestraat 49 bus 7003, B-3000 Leuven, Belgium
Corresponding author: dirk.smeets@uz.kuleuven.be

Abstract. In order to study ventilation or to extract other functional information of the lungs, intra-patient matching of scans at a different inspiration level is valuable as an examination tool. In this paper, a method for robust 3D tree matching is proposed as an independent registration method or as a guide for other, e.g. voxel-based, types of registration. For the first time, the 3D tree is represented by intrinsic matrices: reference-frame-independent descriptions containing the geodesic or Euclidean distance between each pair of detected bifurcations. Marginalization of point-pair probabilities based on the intrinsic matrices provides soft-assign correspondences between the two trees. This global correspondence model is combined with local bifurcation similarity models, based on the shape of the adjoining vessels and the local gray value distribution. As a proof of concept of this general matching approach, the method is applied to matching lung vessel trees acquired from CT images in the inhale and exhale phases of the respiratory cycle.

1 Introduction

Pulmonary ventilation, i.e. the inflow and outflow of air between the lungs and atmosphere, is the result of the movement of the diaphragm or the ribs, leading to small pressure differences between the alveoli and the atmosphere. Quantification of pulmonary ventilation is a clinically important functional component in lung diagnosis. Pulmonary ventilation can be studied using several CT images in one breathing cycle (4D CT) [1]. In radiotherapy treatment, extraction of the lung deformation is important for correction of tumor motion, leading to a more accurate irradiation.

Matching, i.e. spatially aligning (also referred to as registering), inspiration and expiration scans is a challenging task because of the substantial, locally varying deformations during respiration [2]. To capture these deformations, a non-rigid registration is required, which can be image-based, surface-based and/or landmark-based. Generally, a non-rigid registration algorithm requires three components: a similarity measure, a transformation model and an optimization process.

In image registration, common voxel similarity measures are the sum of square differences (SSD), the correlation coefficient (CC) and mutual information (MI) [3, 4]. To regularize the transformation, popular transformation models are elastic models, viscous fluid models, spline-based techniques, finite element models or optical flow techniques [3]. Voxel-similarity-based techniques have been applied to lung matching in [5-7]. They have the advantage that dense and generally accurate correspondences are obtained; disadvantages are the sensitivity to the initialization and the computational demands. Surface-based registration methods use a similarity measure that is a function of the distances between points on the two surfaces. Thin-plate splines are popular for regularization of the transformation. A combination of a voxel-similarity-based and a surface-based registration for lung matching is presented in [8]. Generally, landmark-based non-rigid registration algorithms allow large deformations. Because of their sparsity, they are very efficient. In [9], bifurcations of lung vessels are first detected, after which these landmarks are matched by comparing the 3D local shape context of each pair of bifurcations with the χ2-distance.
In this paper, we present a robust registration method for matching vessel trees. After detecting candidate bifurcations (Sec. 2.1), correspondences are established by combining a global and a local bifurcation similarity model. For the novel global model (Sec. 2.2), the tree is represented by two reference-frame-independent, hence intrinsic, matrices: the Euclidean (rigid-transformation-invariant) and geodesic (isometric-deformation-invariant) distance matrix. Marginalization of point-pair probabilities provides soft (not binary) correspondences between the detected bifurcations of the two trees. The local bifurcation similarity model reflects correspondences based on local intensities and is explained in Sec. 2.3. The optimization to obtain hard correspondences is done using a spectral decomposition technique (Sec. 2.4). The proof of concept of the general matching approach is shown in Sec. 3. Finally, we draw some conclusions.

2 Method

The goal of the proposed matching framework is to establish correspondences between characteristic points in the image, independent of the reference frame (translation and rotation invariant) and, for characteristic points in structures that are assumed to deform isometrically, invariant to isometric deformations. Isometries are defined as distance-preserving isomorphisms between metric spaces, which generally means that structures only bend without stretching. The bifurcations, the splittings of the vessels into two or more parts, are chosen as characteristic points, although their detection is not the main goal of this paper. We then assume the vessel tree to deform isometrically. This assumption of nearly constant vessel length has been made before in a.o. [10] and [11].

2.1 Bifurcation Detection

Bifurcations are locations where a blood vessel splits into two smaller vessels. A subset of the bifurcations that involves only major vessels can be detected in a robust way by analyzing the skeletonization of the segmented vessels.

As preprocessing, the lungs are segmented by keeping the largest connected component of all voxels with an intensity lower than -200 HU. A morphological closing operator includes the lung vessels in the segmentation, and a subsequent erosion operation removes the ribs from the segmentation. Then, a rough segmentation of the major blood vessels within the lung is obtained by thresholding at -200 HU. Cavities in the binary segmentation smaller than 10 voxels are closed. The skeletonization of the segmentation localizes the major bifurcations and is obtained through a 3D distance transformation by homotopic thinning [12]. Since bifurcations are locations where three vessels join, they are characterized as skeleton voxels having three different neighbors belonging to the skeleton. Afterwards, bifurcations are discarded that have a probability lower than 0.5/sqrt(2πσ²) according to a statistical intensity model: the vessel intensities I_k are assumed to be Gaussian distributed,

    P(c_vessel | I_k) = 1/sqrt(2πσ²) · exp(-(I_k - μ)² / (2σ²)),    (1)

with experimentally determined parameters (μ = -40 HU and σ = 65 HU).
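A rough sketch of this detection chain with SciPy/scikit-image (illustrative only: the lung mask is assumed given, the closing/erosion steps are approximated by hole filling, and skeletonize is assumed to accept 3D input as in recent scikit-image versions):

```python
import numpy as np
from scipy import ndimage
from skimage.morphology import skeletonize

def detect_bifurcations(ct_hu, lung_mask, mu=-40.0, sigma=65.0):
    # Rough segmentation of the major vessels inside the lung (> -200 HU).
    vessels = (ct_hu > -200) & lung_mask
    vessels = ndimage.binary_fill_holes(vessels)   # close small cavities
    # Skeletonize; bifurcations are skeleton voxels with three neighbors
    # that also belong to the skeleton.
    skel = skeletonize(vessels)
    kernel = np.ones((3, 3, 3), dtype=np.uint8)
    kernel[1, 1, 1] = 0
    n_neigh = ndimage.convolve(skel.astype(np.uint8), kernel, mode='constant')
    candidates = skel & (n_neigh == 3)
    # Discard candidates implausible under the Gaussian model of Eq. (1):
    # P < 0.5/sqrt(2*pi*sigma^2)  <=>  |I - mu| > sigma*sqrt(2*ln 2).
    plausible = np.abs(ct_hu - mu) <= sigma * np.sqrt(2.0 * np.log(2.0))
    return np.argwhere(candidates & plausible)
```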
2.2 Global Correspondence Model

After the vessel bifurcations are detected, soft correspondences are established based on a global and a local correspondence model, both independent of rigid and isometric deformations of the considered vessel trees.

Intrinsic vessel tree representation. Each tree is intrinsically represented by a Euclidean distance matrix (EDM) E = [d_ij], containing the Euclidean distance between each pair of bifurcations, and a geodesic distance matrix (GDM) G = [g_ij]. Each element g_ij corresponds to the geodesic distance between the bifurcations i and j. This distance is the distance between i and j along the vessels and is computed with the fast marching method in an image containing a soft-segmented vessel tree using Eq. (1). Isometric vessel deformations, by definition, leave these geodesic distances unchanged; the GDM is therefore invariant to bending of the vessels. The EDM, on the other hand, is only invariant to rigid transformations (and, when normalized, to scale variations) of the vessel tree. However, Euclidean distance computation is expected to be more robust against noise than geodesic distance computation, since the error accumulates along the geodesic, i.e. the shortest path. Both the EDM and the GDM are symmetric and uniquely defined up to an arbitrary simultaneous permutation of their rows and columns, due to the arbitrary sampling order of the bifurcations.

Fig. 1. A lung vessel tree with detected bifurcations (a) is represented by a geodesic (b) and a Euclidean (c) distance matrix, containing the geodesic or Euclidean distance between each pair of bifurcations.

Soft correspondences of the global model. A probabilistic framework is used to estimate the probability that a bifurcation i of one tree corresponds with a bifurcation k of the other tree. It is applied twice: once for the EDM and once for the GDM. We explain the framework using the GDM. Given two lung vessel trees, represented by GDMs G_1 and G_2, the probability that the pair of bifurcations (i, j) of the first tree corresponds with the pair (k, l) of the second tree is assumed to be normally distributed,

    P(C_(i,j),(k,l)) = 1/sqrt(2πσ²) · exp(-(g_1,ij - g_2,kl)² / (2σ²)),    (2)

with σ chosen to be 1 and g_1,ij an element of the GDM representing the first tree. This reflects the assumption that geodesic distances between pairs of bifurcations are preserved, obeying an isometric deformation. The probability that a bifurcation i corresponds with k is then given by

    P(C_i,k) = Σ_j Σ_l P(C_(i,j),(k,l)) = m_G,ik,    (3)

from which the soft correspondence matrix M_G = [m_G,ik] is constructed. The same procedure is followed to obtain an assignment matrix M_E based on the EDM as intrinsic tree representation. This matrix is expected to be more robust against noise, but varies under isometric deformations of the vessel tree, contrary to M_G.

2.3 Local Correspondence Model

Complementary to the global bifurcation similarity model, a local model based on intensities is implemented. In the computer vision literature a large number of local image descriptors have been proposed, see for example [13]. In this paper, the n-SIFT (scale invariant feature transform) image descriptor is used [14], which summarizes a cubic voxel region centered at the feature location, in casu the bifurcation location. The cube is divided into 64 subregions, each using a 60-bin histogram to summarize the gradients of the voxels in the subregion. This results in a 3840-dimensional feature vector f, obtained by combining the histograms in a vector after weighting by a Gaussian centered at the feature location. The probability that a bifurcation i corresponds with k is then proportional to

    P(C_i,k) ∝ exp(-||f_i - f_k||²) = m_L,ik.    (4)
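Equations (2)-(3) amount to a double sum over all bifurcation pairs; a direct NumPy transcription (illustrative, not the authors' code):

```python
import numpy as np

def soft_correspondences(D1, D2, sigma=1.0):
    # D1 (n1 x n1) and D2 (n2 x n2): intrinsic distance matrices of the
    # two trees (pass the GDMs to get M_G, the EDMs to get M_E).
    n1, n2 = D1.shape[0], D2.shape[0]
    M = np.empty((n1, n2))
    norm = 1.0 / np.sqrt(2.0 * np.pi * sigma ** 2)
    for i in range(n1):
        for k in range(n2):
            # Eq. (2) over all pairs (j, l), marginalized as in Eq. (3).
            diff = D1[i, :, None] - D2[None, k, :]
            M[i, k] = np.sum(norm * np.exp(-diff ** 2 / (2.0 * sigma ** 2)))
    return M
```

The combined matrix of Sec. 2.4 is then simply the elementwise product of the three models, e.g. M_C = M_G * M_E * M_L.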
2.4 Spectral Matching

The combined match matrix M_C is found as the pointwise product of the match matrices of the separate models, m_C,ik = m_G,ik · m_E,ik · m_L,ik (i.e. the product of the corresponding probabilities). To establish hard correspondences (a one-to-one mapping), the algorithm proposed by Scott and Longuet-Higgins [15] is used. It applies a singular value decomposition to M_C (M_C = UΣV^T) and computes the orthogonal matrix M_C' = U Ĩ_n V^T, with Ĩ_n a pseudo-identity matrix. Two bifurcations i and k match if m'_C,ik is both the greatest element in its row and the greatest in its column.
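A compact NumPy version of this step (illustrative; the pseudo-identity Ĩ_n is realized implicitly by discarding the singular values):

```python
import numpy as np

def spectral_hard_matches(M_C):
    # Scott & Longuet-Higgins [15]: SVD of the combined match matrix,
    # singular values replaced by ones (U @ I~ @ V^T), then keep entries
    # that dominate both their row and their column.
    U, _, Vt = np.linalg.svd(M_C, full_matrices=False)
    P = U @ Vt
    matches = []
    for i in range(P.shape[0]):
        k = int(np.argmax(P[i]))
        if int(np.argmax(P[:, k])) == i:
            matches.append((i, k))  # bifurcation i of tree 1 <-> k of tree 2
    return matches
```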
3 Experimental Results

The described matching framework is evaluated using a publicly available 4D CT thorax dataset of the Léon Bérard Cancer Center & CREATIS lab (Lyon, France), called the "POPI model". It contains ten 3D volumes representing the different phases of the average breathing cycle, with an axial resolution of 0.976562 mm × 0.976562 mm and a slice thickness of 2 mm. Also available are 3D landmarks indicated by medical experts, and vector fields describing the motion of the lung [16].

3.1 Matching Manually Indicated Landmarks

First, the matching framework is evaluated using manually indicated landmarks, to examine the performance independently of the bifurcation detection step. For this purpose, all 40 landmarks are used in one CT image (end-inhalation phase), while the number of landmarks in the other image is varied. The result of this experiment, expressed as the percentage of correct correspondences as a function of the percentage of outliers in the total number of landmarks, is shown in Fig. 2 for different images in the respiratory cycle. Since not all landmarks are located in the vessels, and geodesic distances are therefore not meaningful, M_G is not computed.

Fig. 2. Results of the framework for matching manually indicated landmarks, expressed as the percentage of correct correspondences as a function of the percentage of outliers. The three curves correspond with different scans in the respiratory cycle, in which scan no. 1 is end-inhale and scan no. 6 end-exhale.

These results show that even for a large number of outliers good correspondences are still found. Consequently, it can be expected that not all bifurcations detected in one image must be present in the other image. Also, a small decrease in performance can be noticed when more lung deformation is present (end-exhale vs. end-inhale), probably because the Euclidean distance matrix (used in the global correspondence model) is not invariant to these deformations.

Second, the robustness against landmark positioning errors is examined. Therefore, we add uniformly distributed noise in each direction to one set of landmarks and vary the maximum amplitude of the noise. Fig. 3 shows the percentage of correct correspondences as a function of this maximum amplitude, averaged over 15 runs per noise level. These results show that indication errors of more than 5 mm (±5 voxels in the x and y directions and 2.5 voxels in the z direction) decrease the matching performance. It is therefore expected that the localization of bifurcations during automatic bifurcation detection need not be extremely accurate in order to still obtain good correspondences.

Fig. 3. Indication error dependence, expressed as the percentage of correct correspondences as a function of the maximum amplitude of the added uniform noise. The three curves correspond with different scans in the respiratory cycle, in which scan no. 1 is end-inhale and scan no. 6 end-exhale.

3.2 Matching Automatically Detected Bifurcations

Next, the framework is evaluated for the task of matching 3D vessel trees by means of automatically detected bifurcations. The result of matching the end-inhale with the end-exhale vessel tree is shown in Fig. 4, clarifying that most correspondence pairs are correct. It is also clear that simple thresholding has resulted in oversegmentation in one of the two images. This, however, did not affect the automatic matching.

Fig. 4. Qualitative evaluation for matching 3D vessel trees by means of automatically detected bifurcations.

The performance of matching automatically detected bifurcations is quantified using the dense deformation field that is available in the public dataset (POPI model). This deformation field is obtained using a parametric image-based registration [17, 18, 16]. The average target registration error in the manual landmarks is 1.0 mm with a standard deviation of 0.5 mm.

Fig. 5 illustrates the accuracy of matching the end-inhale scan with all other scans. It shows histograms of the registration error (in each direction and in absolute value) for all bifurcations. The average registration error is 2.31 mm, the median 1.84 mm and the maximum 24.0 mm. For only 0.21 percent of the bifurcation matches is the absolute registration error larger than 1.5 cm, demonstrating the robustness of the matching algorithm.

Fig. 5. Histograms of the registration error (in each direction and in absolute value) for all bifurcations give an impression of the matching accuracy.

4 Conclusion

A robust matching framework is proposed, combining two global and one local landmark similarity model. The results of matching manually indicated landmarks demonstrate the potential of the proposed method for matching landmarks in unimodal medical images independently of the translation and rotation between both images. A high number of outliers is allowed when the landmarks in the image are well located. Moreover, we demonstrated that the matching framework can also be used for automatically detected landmarks, in this case lung bifurcations extracted from CT data. As future work, we see some applications of the soft correspondences for robust initialization of an iterative non-rigid, e.g. voxel-based, registration method.

References

1. Guerrero, T., Sanders, K., Castillo, E., Zhang, Y., Bidaut, L., Pan, T., Komaki, R.: Dynamic ventilation imaging from four-dimensional computed tomography. Phys. Med. Biol. 51 (2006) 777-791
2. Sluimer, I., Schilham, A., Prokop, M., van Ginneken, B.: Computer analysis of computed tomography scans of the lung: a survey. IEEE Trans. Med. Imaging 25(4) (2006) 385-405
3. Crum, W.R., Hartkens, T., Hill, D.L.G.: Non-rigid image registration: theory and practice. Br. J. Radiol. 77 Spec No 2 (2004) S140-53
4. Pluim, J.P.W., Maintz, J.B.A., Viergever, M.A.: Mutual information based registration of medical images: A survey. IEEE Trans. Med. Imaging 22(8) (2003) 986-1004
5. Dougherty, L., Asmuth, J.C., Gefter, W.B.: Alignment of CT lung volumes with an optical flow method. Academic Radiology 10(3) (2003) 249-254
6. Stancanello, J., Berna, E., Cavedon, C., Francescon, P., Loeckx, D., Cerveri, P., Ferrigno, G., Baselli, G.: Preliminary study on the use of nonrigid registration for thoraco-abdominal radiosurgery. Med. Phys. 32(12) (2005) 3777-85
7. Loeckx, D.: Automated Nonrigid Intra-Patient Image Registration Using B-Splines. PhD thesis, K.U.Leuven (2006)
8. Kaus, M., Netsch, T., Kabus, S., Pekar, V., McNutt, T., Fischer, B.: Estimation of organ motion from 4D CT for 4D radiation therapy planning of lung cancer. In: MICCAI. Volume 3217 of LNCS. (2004) 1017-1024
9. Hilsmann, A., Vik, T., Kaus, M., Franks, K., Bissonette, J.P., Purdie, T., Beziak, A., Aach, T.: Deformable 4DCT lung registration with vessel bifurcations. In: CARS. (2007)
10. Klabunde, R.E.: Determinants of resistance to flow. /Hemodynamics/H003.htm (August 2008)
11. Groher, M., Zikic, D., Navab, N.: Deformable 2D-3D registration of vascular structures in a one view scenario. IEEE Trans. Med. Imaging 28(6) (2009) 847-60
12. Selle, D.: Analyse von Gefässstrukturen in medizinischen Schichtdatensätzen für die computergestützte Operationsplanung. PhD thesis, Aachen (2000)
13. Mikolajczyk, K., Schmid, C.: A performance evaluation of local descriptors. IEEE Trans. Pattern Anal. Mach. Intell. 27(10) (2005) 1615-1630
14. Cheung, W., Hamarneh, G.: n-SIFT: n-dimensional scale invariant feature transform. IEEE Trans. Img. Proc. 18(9) (2009) 2012-2021
15. Scott, G.L., Longuet-Higgins, H.C.: An algorithm for associating the features of two images. Proc. R. Soc. Lond. B 244 (1991) 21-26
16. Vandemeulebroucke, J., Sarrut, D., Clarysse, P.: The POPI-model, a point-validated pixel-based breathing thorax model. In: Conference on the Use of Computers in Radiation Therapy. (2007)
17. Rueckert, D., Sonoda, L.I., Hayes, C., Hill, D.L.G., Leach, M.O., Hawkes, D.J.: Nonrigid registration using free-form deformations: Application to breast MR images. IEEE Transactions on Medical Imaging 18 (1999) 712-721
18. Delhay, B., Clarysse, P., Magnin, I.E.: Locally adapted spatio-temporal deformation model for dense motion estimation in periodic cardiac image sequences. In: FIMH'07: Proceedings of the 4th International Conference on Functional Imaging and Modeling of the Heart, Berlin, Heidelberg, Springer-Verlag (2007) 393-402

A space-sweep approach to true multi-image matching

1 Introduction

This paper considers the problem of multi-image stereo reconstruction, namely the recovery of static 3D scene structure from multiple, overlapping images taken by perspective cameras with known extrinsic (pose) and intrinsic (lens) parameters. The dominant paradigm is to first determine corresponding 2D image features across the views, followed by triangulation to obtain a precise estimate of 3D feature location and shape. The first step, solving for matching features across multiple views, is by far the most difficult. Unlike motion sequences, which exhibit a rich set of constraints that lead to efficient matching techniques based on tracking, determining feature correspondences from a set of widely-spaced views is a challenging problem.

2 True Multi-Image Matching

2.1 Definition

This section presents, for the first time, a set of conditions that a stereo matching technique should meet to be called a "true multi-image" method. By this we mean that the technique truly operates in a multi-image manner, and is not just a repeated application of two- or three-camera techniques.

Briefly describe the workflow of a machine vision system

Answer (in English):

1. Image Acquisition. The first step in any machine vision system is image acquisition. This involves capturing an image of the object or scene of interest using a camera or other imaging device. The image is then converted into a digital format for processing by the computer.

2. Image Preprocessing. Once the image has been acquired, it is typically preprocessed to improve its quality and make it more suitable for analysis. This may involve operations such as noise removal, contrast enhancement, and edge detection.

3. Feature Extraction. The next step is to extract features from the preprocessed image. These features are characteristics of the object or scene that are relevant to the task at hand. For example, in a facial recognition system, the features might include the shape of the face, the position of the eyes and nose, and the texture of the skin.

4. Image Segmentation. Image segmentation is the process of dividing the image into different regions or segments. This helps to isolate the objects or features of interest from the background. Segmentation can be performed using a variety of techniques, such as edge detection, region growing, and watershed segmentation.

5. Object Recognition. Object recognition is the process of identifying objects in the image. This can be done using a variety of techniques, such as template matching, shape matching, and feature matching.

6. Image Interpretation. Once the objects in the image have been recognized, they can be interpreted to understand their meaning. This may involve tasks such as scene understanding, object tracking, and motion analysis.

7. Decision Making. The final step in a machine vision system is decision making. This involves using the information obtained from the image interpretation to make a decision about the task at hand. For example, in a quality control system, the decision might be to accept or reject the product.
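A compact end-to-end illustration of steps 1-7 using OpenCV (hedged: reading a file stands in for camera acquisition, and the accept/reject rule and size bounds are invented for the example):

```python
import cv2

# 1. Image acquisition: read a frame (a file stands in for a camera here).
image = cv2.imread("part.png")

# 2. Preprocessing: denoise and enhance contrast.
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (5, 5), 0)
gray = cv2.equalizeHist(gray)

# 3./4. Segmentation: separate the object from the background (Otsu).
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# 5. Feature extraction / recognition: contours as simple shape features.
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
areas = [cv2.contourArea(c) for c in contours]

# 6./7. Interpretation and decision: accept the part if the largest
# detected region falls inside an expected size range (illustrative rule).
decision = "accept" if areas and 1000 < max(areas) < 50000 else "reject"
print(decision)
```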

Research on Visual Inspection Algorithms for Defects in Textured Objects (graduation thesis)

Abstract

In the fiercely competitive, automated industrial production process, machine vision plays a pivotal role in product quality control, and its application to defect inspection is becoming increasingly common. Compared with conventional inspection techniques, automated visual inspection systems are more economical, faster, more efficient and safer. Textured objects are ubiquitous in industrial production: substrates used in semiconductor assembly and packaging, light-emitting diodes, printed circuit boards in modern electronic systems, and cloth and fabrics in the textile industry can all be regarded as objects with texture features. This thesis is devoted to defect inspection techniques for textured objects, providing efficient and reliable inspection algorithms for their automated inspection.

Texture is an important feature for describing image content, and texture analysis has been successfully applied to texture segmentation and texture classification. This work proposes a defect inspection algorithm based on texture analysis and reference comparison. The algorithm tolerates image registration errors caused by object distortion and is also robust to the influence of texture. It aims to provide rich and physically meaningful descriptions of the detected defect regions, such as their size, shape, brightness contrast and spatial distribution. Moreover, where a reference image is available, the algorithm can be used to inspect both homogeneously and non-homogeneously textured objects, and it also achieves good results on non-textured objects.

Throughout the inspection process, we adopt steerable-pyramid texture analysis and reconstruction. Unlike traditional wavelet texture analysis, we add a tolerance control algorithm in the transform domain to handle object distortion and texture influence, achieving tolerance to object distortion and robustness to texture. Finally, the steerable-pyramid reconstruction guarantees that the physical meaning of the defect regions is recovered accurately. In the experimental stage, we tested a series of images of practical application value. The results show that the proposed defect inspection algorithm for textured objects is efficient and easy to implement.
Keywords: defect detection, texture, object distortion, steerable pyramid, reconstruction
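As a loose illustration of the reference-comparison idea described in the abstract (not the thesis's algorithm): the sketch below substitutes an ordinary wavelet decomposition from PyWavelets for the steerable pyramid, and the tolerance control is reduced to a simple per-subband threshold.

```python
import numpy as np
import pywt

def defect_map(test_img, reference_img, wavelet="db2", level=3, tol=3.0):
    # Decompose both images into multiscale subbands.
    t = pywt.wavedec2(np.asarray(test_img, float), wavelet, level=level)
    r = pywt.wavedec2(np.asarray(reference_img, float), wavelet, level=level)
    out = [t[0] - r[0]]                       # approximation difference
    for tb, rb in zip(t[1:], r[1:]):
        bands = []
        for td, rd in zip(tb, rb):
            d = td - rd
            # Crude tolerance control: suppress small differences that
            # misregistration or normal texture variation could explain.
            d[np.abs(d) < tol * rd.std()] = 0.0
            bands.append(d)
        out.append(tuple(bands))
    # Reconstruct so the defect regions keep their size/shape/contrast.
    return pywt.waverec2(out, wavelet)
```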

Multiple Nose Region Matching for 3D Face Recognition under Varying Facial Expression

Kyong I. Chang, Kevin W. Bowyer, Fellow, IEEE, and Patrick J. Flynn, Senior Member, IEEE

Abstract—An algorithm is proposed for 3D face recognition in the presence of varied facial expressions. It is based on combining the match scores from matching multiple overlapping regions around the nose. Experimental results are presented using the largest database employed to date in 3D face recognition studies: over 4,000 scans of 449 subjects. Results show substantial improvement over matching the shape of a single larger frontal face region. This is the first approach to use multiple overlapping regions around the nose to handle the problem of expression variation.

Index Terms—Biometrics, face recognition, three-dimensional face, facial expression.

1 Introduction

Face recognition using 3D shape is believed to offer advantages over the use of intensity images [1], [2], [3]. Research on face recognition using 3D shape has recently begun to look at the problem of handling the variations in shape caused by facial expressions [4], [5], [6], [7], [8]. Various approaches might be employed for this purpose. One is to concentrate on regions of the face whose shape changes the least with facial expression [9], [10]. For example, one might ignore the lips and mouth, since their shape varies greatly with expression. Of course, there is no large subset of the face that is perfectly rigid across a broad range of expressions. Another approach is to enroll a person into the gallery using a set of different expressions. However, the probe shape may still be an expression different from those sampled. A third approach is to have a model of 3D facial expression that can be applied to any face shape. However, there likely is no general model to predict, for example, how each person's neutral expression is transformed into their smile. A smile is different for different persons and for the same person at different times. A fourth approach is to try to compute an expression-invariant representation of the 3D face shape [11], [12]. Given that there is no fully "correct" approach to handling varying facial expression, one question is which approach(es) can be most effectively used to achieve desired levels of performance. In this work, we explore an approach that matches multiple, overlapping surface patches around the nose area and then combines the results from these matches to achieve greater accuracy. Thus, this work seeks to explore what can be achieved by using a subset of the face surface that is approximately rigid across expression variation.
2 Baseline PCA and ICP Performance

We first establish "baseline" performance levels for 3D face recognition on the data set used in this work. Images are obtained using a Minolta 900/910 sensor that produces registered 640×480 range and color images. The sensor takes several seconds to acquire the data, and subject motion can result in artifacts [7]. Images with noticeable artifacts result in recognition errors. See Fig. 1 for examples of the various facial expressions.

Fig. 1. Example images in 2D and 3D with different facial expressions.

For the baseline algorithms, we use a PCA approach similar to previous work [1], [13], [14] and an iterative closest point (ICP) approach similar to previous work [2], [10], [15]. More sophisticated approaches have appeared in the literature [3], [4], [5], [6], [7], [8]. These "baseline" approaches are simply meant to represent common known approaches. See Fig. 2 for examples of the frontal face regions used for these baseline algorithms.

Fig. 2. Example of the large frontal face regions used with the baseline algorithms. For the ICP baseline, note that the probe face is intentionally smaller than the gallery face, to ensure that all of the probe surface has some corresponding part on the gallery surface.

A total of 546 subjects participated in one or more data acquisitions, yielding a total of 4,485 3D scans, as summarized in Table 1. Acquisition sessions took place at intervals over approximately a year, and so a subject may have changes in hair style, weight, and other factors across their set of images. Among the 546 subjects, 449 participated in both a gallery acquisition and one or more probe acquisitions. The earliest scan with a neutral expression is used for the gallery, and all later scans are used as probes. The neutral-expression probe images are divided into nine sets, based on increasing time lapse between acquisition sessions. There are 2,349 neutral-expression probes, one or more for each of the 449 neutral-expression gallery images. The nonneutral-expression probe images fall into eight sets, based on increasing time lapse. There are 1,590 nonneutral probes, one or more for each of 355 subjects with neutral-expression gallery images.

Results for the PCA baseline are created using manually identified landmark points to register the 3D data to create the depth image. The training set for the PCA algorithm contains the 449 gallery images plus the 97 images of subjects for whom only one good scan was available. Results for the ICP baseline use the manually identified landmark points to obtain the initial rotation and translation estimate for the ICP matching. In this sense, the baseline represents an idealized level of performance for these approaches. There is a significant performance decrease when expression varies between gallery and probe: from an average 91 percent to 61.5 percent for the ICP baseline, and from 77.7 percent to 61.3 percent for the PCA baseline. The higher performance obtained by the ICP baseline is likely due to the fact that ICP handles pose variation between gallery and probe better than PCA. These results agree with previously reported observations: one, that ICP approaches outperform PCA approaches for 3D face recognition [10], [16], and two, that expression variation degrades recognition performance [4], [5], [6], [7], [8], [17].

3 Multiple Nose Region Matching

Beginning with an approximately frontal scan, the eye pits, nose tip, and bridge of the nose are automatically located. These landmarks are used to define, for a gallery face shape, one larger surface region around the nose, and for a probe face shape, multiple smaller, overlapping surface regions around the nose. For recognition, the multiple probe shape regions are individually matched to the gallery and their results combined.

3.1 Preprocessing and Facial Region Extraction

Preprocessing steps isolate the face region in the scan. Considering the range image as a binary image in which each pixel has or doesn't have a valid measurement, isolated small regions are removed using a morphological opening operator (radius of 10 pixels). Then, connected component labeling is performed, and the largest region is kept; see Fig. 3. Outliers, which can occur due to range sensor artifacts, are eliminated by examining the variance in Z values. A 3D point is labeled as an outlier when the angle between the optical axis and the point's local surface normal is greater than a threshold value (80 degrees). Next, the 2D color pixels corresponding to these 3D points are transformed into YCbCr color space and used for skin detection. The 3D data points in the detected skin region are subsampled, keeping the points in every fourth row and column. (On average, there are more than 11,000 points in the face region, about 3,000 points in the gallery surface, and 500 to 1,000 points in the probe surfaces.) This reduces computation in later steps, and initial smaller experiments indicated that it does not significantly affect recognition.
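A sketch of these preprocessing steps with SciPy (illustrative only: the structuring-element size approximates the 10-pixel radius, the surface-normal outlier test is omitted, and the YCbCr skin bounds are common textbook values, not the paper's):

```python
import numpy as np
from scipy import ndimage

def face_region(range_valid, color_ycbcr):
    # range_valid: boolean image of pixels with a valid range measurement.
    # Remove small isolated regions (morphological opening, ~10 px radius).
    opened = ndimage.binary_opening(range_valid, structure=np.ones((21, 21)))
    # Connected-component labeling; keep the largest region.
    labels, n = ndimage.label(opened)
    if n == 0:
        return np.zeros_like(range_valid)
    sizes = ndimage.sum(opened, labels, range(1, n + 1))
    face = labels == (int(np.argmax(sizes)) + 1)
    # Skin detection in YCbCr (assumed chrominance bounds).
    cb, cr = color_ycbcr[..., 1], color_ycbcr[..., 2]
    skin = face & (cb > 77) & (cb < 127) & (cr > 133) & (cr < 173)
    # Subsample: keep the points in every fourth row and column.
    mask = np.zeros_like(skin)
    mask[::4, ::4] = skin[::4, ::4]
    return mask
```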
3 MULTIPLE NOSE REGION MATCHING

Beginning with an approximately frontal scan, the eye pits, nose tip, and bridge of the nose are automatically located. These landmarks are used to define, for a gallery face shape, one larger surface region around the nose and, for a probe face shape, multiple smaller, overlapping surface regions around the nose. For recognition, the multiple probe shape regions are individually matched to the gallery and their results combined.

3.1 Preprocessing and Facial Region Extraction

Preprocessing steps isolate the face region in the scan. Considering the range image as a binary image in which each pixel has or does not have a valid measurement, isolated small regions are removed using a morphological opening operator (radius of 10 pixels). Then, connected component labeling is performed and the largest region is kept; see Fig. 3. Outliers, which can occur due to range sensor artifacts, are eliminated by examining the variance in Z values. A 3D point is labeled as an outlier when the angle between the optical axis and the point's local surface normal is greater than a threshold value (80 degrees). Next, the 2D color pixels corresponding to these 3D points are transformed into YCbCr color space and used for skin detection. The 3D data points in the detected skin region are subsampled, keeping the points in every fourth row and column. (On average, there are more than 11,000 points in the face region, about 3,000 points in the gallery surface, and 500 to 1,000 points in the probe surfaces.) This reduces computation in later steps, and initial smaller experiments indicated that it does not significantly affect recognition.

Fig. 3. Illustration of steps in face region extraction.

3.2 Curvature-Based Segmentation and Landmark Detection

We compute surface curvature at each point, create a region segmentation based on curvature type, and then detect landmarks on the face. A local coordinate system for a small region around each point is established prior to the curvature computation, formed by the tangent plane (X-Y plane) and a surface normal (Z axis) at the point. Using a PCA analysis of the points in the local region, the X and Y axes are the eigenvectors of the two largest eigenvalues, and the Z axis is the eigenvector of the smallest eigenvalue, assumed to be the surface normal. Once the local coordinate system is established, a quadratic surface is fit to the local region. After the coefficients for the fit are found, partial derivatives are computed to estimate the mean curvature, H, and the Gaussian curvature, K. The curvature type is labeled based on H and K, and points with the same curvature type are grouped to form regions.

Fig. 4. Steps involved in the facial landmark region detection process.

Fig. 4 illustrates the steps to detect the nose tip (peak region), eye cavities (pit regions), and nose bridge (saddle region). A nose tip is expected to be a peak region (K > T_K and H < -T_H), a pair of eye cavities to be a pair of pit regions (K > T_K and H > T_H), and the nose bridge to be a saddle region (K < T_K and H > T_H), where T_K = 0.0025 and T_H = 0.00005; peaks and pits share K > T_K but differ in the sign of H. Since there may be a number of pit regions, a systematic way is needed to find those corresponding to the eye cavities. First, small pit regions (fewer than 80 points) are removed. Second, a pair of regions that has similar average values in both Y and Z is found. Third, if there are still multiple candidate regions, the ones with higher Y values are chosen. The nose tip is found next: starting between the eye landmark points, the search traverses down, looking for the peak region with the largest difference in Z value from the center of the pit regions. Last, the area located between the two eye cavities is searched for a saddle region corresponding to the nose bridge.
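The curvature computation just described can be sketched compactly. The fragment below follows the recipe in the text (local PCA frame, quadratic fit, analytic differentiation) using the standard Monge-patch formulas for H and K. The neighborhood selection and the sign convention for the normal, which determines whether a peak has negative or positive H, are assumptions on our part, not taken from the paper.

```python
import numpy as np

T_K, T_H = 0.0025, 0.00005  # thresholds from the text

def local_curvature(neighborhood):
    """Estimate mean (H) and Gaussian (K) curvature from an Nx3 array of
    points around the point of interest (N >= 6 for the quadratic fit)."""
    pts = neighborhood - neighborhood.mean(axis=0)
    # Local frame from PCA: the smallest-eigenvalue direction approximates
    # the surface normal (local Z axis); eigh returns eigenvalues ascending.
    _, V = np.linalg.eigh(pts.T @ pts)
    frame = V[:, [2, 1, 0]]                # X, Y = largest two; Z = smallest
    x, y, z = (pts @ frame).T
    # Least-squares quadratic patch z = a*x^2 + b*x*y + c*y^2 + d*x + e*y + f.
    A = np.column_stack([x*x, x*y, y*y, x, y, np.ones_like(x)])
    a, b, c, d, e, f = np.linalg.lstsq(A, z, rcond=None)[0]
    # Partial derivatives at the local origin, then Monge-patch formulas.
    zx, zy, zxx, zxy, zyy = d, e, 2*a, b, 2*c
    g = 1 + zx**2 + zy**2
    K = (zxx*zyy - zxy**2) / g**2
    H = ((1 + zy**2)*zxx - 2*zx*zy*zxy + (1 + zx**2)*zyy) / (2 * g**1.5)
    return H, K

def curvature_type(H, K):
    """HK labeling as in the text, under the assumed normal orientation."""
    if K > T_K and H < -T_H:
        return "peak"
    if K > T_K and H > T_H:
        return "pit"
    if K < T_K and H > T_H:
        return "saddle"
    return "other"
```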
3.3 Extracting Gallery/Probe Surface Patches

The pose is standardized in order to help make probe surface extraction more accurate. Pose correction is performed by aligning an input surface to a generic 3D face model. A circular region around the nose tip is extracted as probe C; see Fig. 5b. Surface registration is performed between probe C and a model surface. One reason to use probe C rather than the whole facial region is to improve the registration in the presence of hair obscuring part(s) of the face. The input data points are then transformed using this registration. For ICP-based matching, the probe surface should be a subset of the gallery surface; see Fig. 5a.

For a probe, three different surfaces are extracted around the nose peak. These probe surfaces are extracted using predefined functions that are located on each face by the automatically found feature points. For example, probe surface N is defined by a rectangle (labeled as 1 in Fig. 5c) based on the automatically found nose tip and eye pit landmarks, with parts cut out based on four predefined ellipse regions (labeled as 2, 3, 4, and 5). Each of the elements is defined by constant offset values from the centroids of the facial landmark regions. Considering several different probe surfaces provides a better chance to select the best match among them. For instance, probe N excludes the nostril portion of the nose, while probe I contains more of the forehead, thus capturing more profile information of the nose. The definitions of these three probe surfaces were determined a priori based on considering results of earlier work [10].

Fig. 5. Matching surface extraction for a gallery and three probe surfaces. (a) A gallery surface, (b) probe C (probe surface in the general center face area), (c) probe N (probe surface in a nose region), and (d) probe I (probe surface in an interior nose region).

Our curvature-based detection of facial landmark regions is fully automatic and has been evaluated on 4,485 3D face images of 546 people with a variety of facial expressions. The accuracy of the facial feature finding method is measured based on the degree of inclusion of the nose area in probe C. The landmarks (eye cavities, nose tip, and nose bridge) were successfully found in 99.4 percent of the images (4,458 of 4,485). In those cases where the landmark points are not found correctly, a recognition error is almost certain to result.
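To make the region-extraction step concrete, here is a schematic sketch of how probe C and probe N might be cut out of a pose-standardized point cloud. Every numeric offset below (radius, rectangle bounds, ellipse axes and placement) is an illustrative placeholder: the paper defines these elements by constant offsets from the detected landmark centroids, whose actual values are not given here.

```python
import numpy as np

def extract_probe_c(points, nose_tip, radius=40.0):
    """Probe C: points within `radius` of the nose tip in the X-Y plane of
    the pose-standardized scan. The radius value is a placeholder."""
    d = np.linalg.norm((points - nose_tip)[:, :2], axis=1)
    return points[d < radius]

def extract_probe_n(points, nose_tip, half_w=25.0, top=35.0, bottom=15.0):
    """Probe N, schematically: a rectangle around the nose (element 1) with
    elliptical cut-outs near the nostrils (two of the paper's four ellipse
    elements). All offsets are illustrative, not the paper's values."""
    rel = points - nose_tip
    keep = (np.abs(rel[:, 0]) < half_w) & (rel[:, 1] > -bottom) & (rel[:, 1] < top)
    for cx in (-0.6 * half_w, 0.6 * half_w):   # assumed nostril positions
        in_ellipse = ((rel[:, 0] - cx) / 12.0) ** 2 \
                   + ((rel[:, 1] + 0.5 * bottom) / 8.0) ** 2 < 1.0
        keep &= ~in_ellipse                     # cut the ellipse out
    return points[keep]
```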
4 EXPERIMENTAL RESULTS

Table 2 gives rank-one recognition rates for the individual probe surfaces. As described earlier, the ICP baseline that matches a larger frontal surface achieves 91.0 percent rank-one recognition in matching neutral-expression probe shapes to neutral-expression gallery shapes. Interestingly, each of the three nose region surfaces individually achieves 95-96 percent rank-one recognition in neutral-expression matching. The fact that using less of the face can result in more accurate recognition may at first seem contradictory. However, even if a subject is asked to make a neutral expression at two different times, the 3D face shape will still differ by some amount. Also, difficulties with hair over the forehead, or with noise around the regions of the eyes, are more likely with the larger frontal face region. Our result suggests that such "accidental" sources of variation are much less of a problem for the nose region than for larger face regions. In the case of expression variation, the ICP baseline using the frontal face resulted in 61.5 percent rank-one recognition. As shown in Table 2, an individual nose region surface such as probe N achieves nearly 84 percent. Probe C has lower performance, possibly because it contains points in regions where more frequent deformation was observed. We next consider recognition from combining the results obtained from multiple nose region surfaces.

TABLE 2. Rank-One Recognition Rates for Individual Probe Surfaces.

4.1 Performance Using Two Surfaces for a Probe

We considered three rules for combining similarity measurements from multiple probe surfaces: sum, product, and minimum. All three showed similar performance when matching a neutral-expression probe to a neutral-expression gallery: 96.59 percent average for product, 96.57 percent for sum, and 96.5 percent for minimum. However, in the presence of expression variation, the product and sum rules achieved 87.1 percent and 86.8 percent, whereas the minimum rule achieved only 82.9 percent. Thus, we selected the product rule, although its results are not significantly different from those of the sum rule.

TABLE 3. Rank-One Recognition Rates Using Multiple Probe Surfaces.

The results for using different combinations of the individual probe surfaces are shown in Table 3. When matching neutral expression to neutral expression, the information in the different nose region surfaces is somewhat redundant. Comparing Table 3 to Table 2, the performance improvement is less than one percent. However, when there is expression change between the probe and the gallery, combining results from multiple nose region surfaces has a larger effect. In this case, the best individual probe surface resulted in 83.5 percent rank-one recognition and the best pair of surfaces resulted in 87.1 percent. Interestingly, while the combination of three surfaces improved slightly over the best combination of two surfaces in the case of matching neutral expressions, the combination of three did not do as well as the best combination of two in the case of matching varying expressions. In the end, the best overall performance comes from the combination of probe surfaces N and I.
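The combination step itself is simple. Below is a minimal sketch, assuming each probe surface has already been matched against every gallery entry and that scores are distance-like, lower being better and strictly positive so that the product rule is well behaved; the function name is ours.

```python
import numpy as np

def fuse_scores(per_surface_scores, rule="product"):
    """Combine dissimilarity scores from several probe surfaces.
    `per_surface_scores` has shape (n_surfaces, n_gallery); lower is better.
    Scores are assumed positive (e.g., ICP RMS distances)."""
    rules = {"sum": np.sum, "product": np.prod, "min": np.min}
    return rules[rule](per_surface_scores, axis=0)

# Usage: fused = fuse_scores(scores); best_match = fused.argmin()
# Rank-one recognition counts how often argmin lands on the correct subject.
```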
One surprising element of this work is that we can achieve such good performance using only a small portion of the face surface. However, there still exists an approximately 10 percent performance degradation, from roughly 97 to 87 percent, in going from matching neutral expressions to matching varying expressions.

The Receiver Operating Characteristic (ROC) curve in Fig. 6 reports results for a verification scenario. The equal-error rate (EER) is the ROC point at which the false accept rate is equal to the false reject rate. The EER for our approach goes from approximately 0.12 for neutral expressions to approximately 0.23 for varying expressions. The EER for the ICP baseline shows a much greater performance degradation in going from all neutral expressions to varying facial expressions. This indicates that our new algorithm is effective in closing part of the performance gap that arises in handling varying facial expressions.

Fig. 6. ROC performance on neutral and non-neutral expression probe sets.

A verification scenario implies 1-to-1 matching of 3D shapes, whereas a recognition scenario implies matching one probe against a potentially large gallery. ICP-based matching of face shapes can be computationally expensive: our current algorithm takes approximately one second to match one probe shape against one gallery shape. Techniques to speed up 3D shape matching in face recognition are a topic of current research [18], [19].

The results in Fig. 7 show the effect on recognition rate of varying the number of enrolled subjects. We begin with probe set #1, randomly select one half of probe set #1 to generate a reduced-size probe set, and do this multiple times. To account for variations in the subject pool, the performance shown for each reduced data set size is the average of 10 randomly selected subsets of that size. There is a trend toward higher observed rank-one recognition rate with smaller gallery size. This effect is much more prominent when expressions are varied. However, the degree of decrease in recognition rate that accompanies a doubling in gallery set size is much less here for 3D than has been reported by others for 2D [20].

Fig. 7. Rank-one recognition rate with varying data set size. (a) Neutral-expression probes and (b) varying-expression probes.
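As a companion to the verification results above, the following is a minimal sketch of how an equal-error rate can be computed from raw genuine and impostor match scores. It assumes distance-like scores where a comparison is accepted when its score falls at or below the operating threshold; the function name is ours.

```python
import numpy as np

def equal_error_rate(genuine, impostor):
    """EER from 1D arrays of genuine (same-person) and impostor
    (different-person) dissimilarity scores; lower scores mean better
    matches. Sweeps every observed score as a candidate threshold."""
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    far = np.array([(impostor <= t).mean() for t in thresholds])  # false accepts
    frr = np.array([(genuine > t).mean() for t in thresholds])    # false rejects
    i = np.argmin(np.abs(far - frr))     # threshold where the two rates cross
    return (far[i] + frr[i]) / 2.0
```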
5 SUMMARY AND DISCUSSION

We consider the issue of facial expression variation in 3D face recognition. Results of our new approach are compared to results from PCA- and ICP-based approaches similar to previous work. Our new approach uses multiple overlapping surfaces from the nose area, since this area appears to have relatively low shape variation across a range of expressions. Surface patches are automatically extracted from a curvature-based segmentation of the face. We consider using as many as three overlapping probe surface patches, but find that three does not improve performance over using two. Our approach substantially outperforms the ICP baseline that uses a frontal face region and manually identified landmark points. However, there is more to be done to solve the problem of handling expression variation, as there is about a 10 percent drop in rank-one recognition rate when going from matching neutral expressions to matching varying expressions.

One possible means to better performance is to use additional probe regions. For example, surface patches from the temples and/or from the chin may carry useful information about face shape and size. Algorithms that use such larger collections of surface patches will need to deal with missing patches and make comparisons across probes that may use different numbers of patches in matching. The work of Cook et al. [3] may be relevant in this regard. They experimented with an approach to 3D face recognition that uses ICP to register the surfaces, then samples the distance between the registered surfaces at a number of points and models the intraperson versus interperson distribution of such feature vectors. It may be possible to adapt this approach to deal with expression variation, either by registering parts of the face surface individually or by detecting elements of interperson variation caused by change in facial expression.

There has been substantial work on dealing with expression variation in 2D face recognition. Yacoob et al. suggested "that there is need to incorporate dynamic analysis of facial expressions in future face recognition systems to better recognize faces" [21]. This seems promising for future work, but sensors for 3D face imaging are currently not as mature as 2D camera technology [17]. Martinez has also worked on 2D face recognition in the context of facial expressions and noted that "different facial expressions influence different parts of the face more than others" [22]. He developed a strategy for "giving more importance to the results obtained from those local areas that are less affected by the current displayed emotion" [22]. This general motivation is similar to that of our work. Also, Heisele has done work looking at "components" of the face [23]. He experimented with 14 local regions, or components, of 2D face appearance using 3D morphable models, and presented "a method for automatically learning a set of discriminatory facial components" in this context [23]. The automatic learning of useful local regions of the 3D face shape is an interesting topic for future research.

ACKNOWLEDGMENTS

The authors would like to thank the associate editor and the anonymous reviewers for their helpful suggestions to improve this paper. Biometrics research at the University of Notre Dame is supported by the US National Science Foundation under grant CNS-013089, by the Central Intelligence Agency, and by the US Department of Justice under grants 2005-DD-BX-1224 and 2005-DD-CX-K078. The data set used in this work is available to other research groups through the Face Recognition Grand Challenge program [24]. An early version of this work was presented at the Workshop on Face Recognition Grand Challenge Experiments [4].

REFERENCES

[1] C. Hesher, A. Srivastava, and G. Erlebacher, "A Novel Technique for Face Recognition Using Range Imaging," Proc. Int'l Symp. Signal Processing and Its Applications, pp. 201-204, 2003.
[2] G. Medioni and R. Waupotitsch, "Face Modeling and Recognition in 3-D," Proc. IEEE Int'l Workshop Analysis and Modeling of Faces and Gestures, pp. 232-233, Oct. 2003.
[3] J. Cook, V. Chandran, S. Sridharan, and C. Fookes, "Face Recognition from 3D Data Using Iterative Closest Point Algorithm and Gaussian Mixture Models," Proc. Second Int'l Symp. 3D Data Processing, Visualization, and Transmission, pp. 502-509, 2004.
[4] K.I. Chang, K.W. Bowyer, and P.J. Flynn, "Adaptive Rigid Multi-Region Selection for Handling Expression Variation in 3D Face Recognition," Proc. IEEE Workshop Face Recognition Grand Challenge Experiments, June 2005.
[5] M. Husken, M. Brauckmann, S. Gehlen, and C. von der Malsburg, "Strategies and Benefits of Fusion of 2D and 3D Face Recognition," Proc. IEEE Workshop Face Recognition Grand Challenge Experiments, June 2005.
[6] X. Lu and A.K. Jain, "Deformation Analysis for 3D Face Matching," Proc. Seventh IEEE Workshop Applications of Computer Vision, pp. 99-104, 2005.
[7] T. Maurer, D. Guigonis, I. Maslov, et al., "Performance of Geometrix ActiveID 3D Face Recognition Engine on the FRGC Data," Proc. IEEE Workshop Face Recognition Grand Challenge Experiments, June 2005.
[8] G. Passalis, I.A. Kakadiaris, T. Theoharis, G. Toderici, and N. Murtuza, "Evaluation of 3D Face Recognition in the Presence of Facial Expressions: An Annotated Deformable Model Approach," Proc. IEEE Workshop Face Recognition Grand Challenge Experiments, June 2005.
[9] C. Chua, F. Han, and Y. Ho, "3D Human Face Recognition Using Point Signature," Proc. Int'l Conf. Automatic Face and Gesture Recognition, pp. 233-238, 2000.
[10] K. Chang, K. Bowyer, and P. Flynn, "Effects on Facial Expression in 3D Face Recognition," Proc. SPIE Conf. Biometric Technology for Human Identification, pp. 132-143, Apr. 2005.
[11] A.M. Bronstein, M.M. Bronstein, and R. Kimmel, "Three-Dimensional Face Recognition," Int'l J. Computer Vision, vol. 64, pp. 5-30, 2005.
[12] A.M. Bronstein, M.M. Bronstein, and R. Kimmel, "Generalized Multidimensional Scaling: A Framework for Isometry-Invariant Partial Surface Matching," Proc. Nat'l Academy of Sciences, pp. 1168-1172, 2006.
[13] K.I. Chang, K.W. Bowyer, and P.J. Flynn, "An Evaluation of Multi-Modal 2D+3D Face Biometrics," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 27, pp. 619-624, 2005.
[14] F. Tsalakanidou, D. Tzovaras, and M. Strintzis, "Use of Depth and Colour Eigenfaces for Face Recognition," Pattern Recognition Letters, pp. 1427-1435, 2003.
[15] X. Lu, A.K. Jain, and D. Colbry, "Matching 2.5D Face Scans to 3D Models," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 28, pp. 31-43, 2006.
[16] B. Gokberk, A. Ali Salah, and L. Akarun, "Rank-Based Decision Fusion for 3D Shape-Based Face Recognition," Proc. Fifth Int'l Conf. Audio- and Video-Based Biometric Person Authentication, pp. 1019-1028, July 2005.
[17] K. Bowyer, K. Chang, and P. Flynn, "A Survey of Approaches and Challenges in 3D and Multi-Modal 3D+2D Face Recognition," Computer Vision and Image Understanding, vol. 101, no. 1, pp. 1-15, 2006.
[18] P. Yan and K.W. Bowyer, "A Fast Algorithm for ICP-Based 3D Shape Biometrics," Proc. IEEE Workshop Automatic Identification Advanced Technologies, Oct. 2005.
[19] M.L. Koudelka, M.W. Koch, and T.D. Russ, "A Prescreener for 3D Face Recognition Using Radial Symmetry and the Hausdorff Fraction," Proc. IEEE Workshop Face Recognition Grand Challenge Experiments, June 2005.
[20] J. Phillips, P. Grother, R. Micheals, D. Blackburn, E. Tabassi, and M. Bone, "Facial Recognition Vendor Test 2002: Evaluation Report," http:///FRVT2002/documents.htm, 2002.
[21] Y. Yacoob, H.-M. Lam, and L. Davis, "Recognizing Faces Showing Expressions," Proc. Int'l Workshop Automatic Face and Gesture Recognition, pp. 278-283, 1995.
[22] A. Martinez, "Recognizing Imprecisely Localized, Partially Occluded, and Expression Variant Faces from a Single Sample per Class," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, pp. 748-763, June 2002.
[23] B. Heisele and T. Koshizen, "Components for Face Recognition," Proc. Sixth IEEE Int'l Conf. Face and Gesture Recognition, pp. 153-158, 2004.
[24] P.J. Phillips et al., "Overview of the Face Recognition Grand Challenge," Proc. Computer Vision and Pattern Recognition, pp. I:947-954, June 2005.
