数字图像处理与边缘检测中英文对照外文翻译文献

合集下载

图像识别中英文对照外文翻译文献

图像识别中英文对照外文翻译文献

中英文对照外文翻译文献(文档含英文原文和中文翻译)Elastic image matchingAbstractOne fundamental problem in image recognition is to establish the resemblance of two images. This can be done by searching the best pixel to pixel mapping taking into account monotonicity and continuity constraints. We show that this problem is NP-complete by reduction from 3-SAT, thus giving evidence that the known exponential time algorithms are justified, but approximation algorithms or simplifications are necessary.Keywords: Elastic image matching; Two-dimensional warping; NP-completeness 1. IntroductionIn image recognition, a common problem is to match two given images, e.g. when comparing an observed image to given references. In that pro-cess, elastic image matching, two-dimensional (2D-)warping (Uchida and Sakoe, 1998) or similar types of invariant methods (Keysers et al., 2000) can be used. For this purpose, we can define cost functions depending on the distortion introduced in the matching andsearch for the best matching with respect to a given cost function. In this paper, we show that it is an algorithmically hard problem to decide whether a matching between two images exists with costs below a given threshold. We show that the problem image matching is NP-complete by means of a reduction from 3-SAT, which is a common method of demonstrating a problem to be intrinsically hard (Garey and Johnson, 1979). This result shows the inherent computational difficulties in this type of image comparison, while interestingly the same problem is solvable for 1D sequences in polynomial time, e.g. the dynamic time warping problem in speech recognition (see e.g. Ney et al., 1992). This has the following implications: researchers who are interested in an exact solution to this problem cannot hope to find a polynomial time algorithm, unless P=NP. Furthermore, one can conclude that exponential time algorithms as presented and extended by Uchida and Sakoe (1998, 1999a,b, 2000a,b) may be justified for some image matching applications. On the other hand this shows that those interested in faster algorithms––e.g. for pattern recognition purposes––are right in searching for sub-optimal solutions. One method to do this is the restriction to local optimizations or linear approximations of global transformations as presented in (Keysers et al., 2000). Another possibility is to use heuristic approaches like simulated annealing or genetic algorithms to find an approximate solution. Furthermore, methods like beam search are promising candidates, as these are used successfully in speech recognition, although linguistic decoding is also an NP-complete problem (Casacuberta and de la Higuera, 1999). 2. Image matchingAmong the varieties of matching algorithms,we choose the one presented by Uchida and Sakoe(1998) as a starting point to formalize the problem image matching. Let the images be given as(without loss of generality) square grids of size M×M with gray values (respectively node labels)from a finite alphabet &={1,…,G}. To define thed:&×&→N , problem, two distance functions are needed,one acting on gray valuesg measuring the match in gray values, and one acting on displacement differences :Z×Z→N , measuring the distortion introduced by t he matching. For these distance ddfunctions we assume that they are monotonous functions (computable in polynomial time) of the commonly used squared Euclid-ean distance, i.ed g (g 1,g 2)=f 1(||g 1-g 2||²)and d d (z)=f 2(||z||²) monotonously increasing. Now we call the following optimization problem the image matching problem (let µ={1,…M} ).Instance: The pair( A ; B ) of two images A and B of size M×M .Solution: A mapping function f :µ×µ→µ×µ.Measure:c (A,B,f )=),(),(j i f ij g B Ad ∑μμ⨯∈),(j i+∑⨯-⋅⋅⋅∈+-+μ}1,{1,),()))0,1(),(())0,1(),(((M j i d j i f j i f dμ⨯-⋅⋅⋅∈}1,{1,),(M j i +∑⋅⋅⋅⨯∈+-+1}-M ,{1,),()))1,0(),(())1,0(),(((μj i d j i f j i f d 1}-M ,{1,),(⋅⋅⋅⨯∈μj iGoal:min f c(A,B,f).In other words, the problem is to find the mapping from A onto B that minimizes the distance between the mapped gray values together with a measure for the distortion introduced by the mapping. Here, the distortion is measured by the deviation from the identity mapping in the two dimensions. The identity mapping fulfills f(i,j)=(i,j),and therefore ,f((i,j)+(x,y))=f(i,j)+(x,y)The corresponding decision problem is fixed by the followingQuestion:Given an instance of image matching and a cost c′, does there exist a ma pping f such that c(A,B,f)≤c′?In the definition of the problem some care must be taken concerning the distance functions. For example, if either one of the distance functions is a constant function, the problem is clearly in P (for d g constant, the minimum is given by the identity mapping and for d d constant, the minimum can be determined by sorting all possible matching for each pixel by gray value cost and mapping to one of the pixels with minimum cost). But these special cases are not those we are concerned with in image matching in general.We choose the matching problem of Uchida and Sakoe (1998) to complete the definition of the problem. Here, the mapping functions are restricted by continuity and monotonicity constraints: the deviations from the identity mapping may locally be at most one pixel (i.e. limited to the eight-neighborhood with squared Euclidean distance less than or equal to 2). This can be formalized in this approach bychoosing the functions f1,f2as e.g.f 1=id,f2(x)=step(x):=⎩⎨⎧.2,)10(,2,0>≤⋅xGxMM3. Reduction from 3-SAT3-SAT is a very well-known NP-complete problem (Garey and Johnson, 1979), where 3-SAT is defined as follows:Instance: Collection of clauses C={C1,···,CK} on a set of variables X={x1, (x)L}such that each ckconsists of 3 literals for k=1,···K .Each literal is a variable or the negation of a variable.Question:Is there a truth assignment for X which satisfies each clause ck, k=1,···K ?The dependency graph D(Ф)corresponding to an instance Ф of 3-SAT is defined to be the bipartite graph whose independent sets are formed by the set of clauses Cand the set of variables X .Two vert ices ck and x1are adjacent iff ckinvolvesx 1or-xL.Given any 3-SAT formula U, we show how to construct in polynomial time anequivalent image matching problem l(Ф)=(A(Ф),B(Ф)); . The two images of l (Ф)are similar according to the cost function (i.e.f:c(A(Ф),B(Ф),f)≤0) iff the formulaФ is satisfiable. We perform the reduction from 3-SAT using the following steps:• From the formula Ф we construct the dependency graph D(Ф).• The dependency graph D(Ф)is drawn in the plane.• The drawing of D(Ф)is refined to depict the logical behaviour of Ф , yielding two images(A(Ф),B(Ф)).For this, we use three types of components: one component to represent variables of Ф , one component to represent clauses of Ф, and components which act as interfaces between the former two types. Before we give the formal reduction, we introduce these components.3.1. Basic componentsFor the reduction from 3-SAT we need five components from which we will construct the in-stances for image matching , given a Boolean formula in 3-DNF,respectively its graph. The five components are the building blocks needed for the graph drawing and will be introduced in the following, namely the representations of connectors,crossings, variables, and clauses. The connectors represent the edges and have two varieties, straight connectors and corner connectors. Each of the components consists of two parts, one for image A and one for image B , where blank pixels are considered to be of the‘background ’color.We will depict possible mappings in the following using arrows indicating the direction of displacement (where displacements within the eight-neighborhood of a pixel are the only cases considered). Blank squares represent mapping to the respective counterpart in the second image.For example, the following displacements of neighboring pixels can be used with zero cost:On the other hand, the following displacements result in costs greater than zero:Fig. 1 shows the first component, the straight connector component, which consists of a line of two different interchanging colors,here denoted by the two symbols◇and□. Given that the outside pixels are mapped to their respe ctive counterparts and the connector is continued infinitely, there are two possible ways in which the colored pixels can be mapped, namely to the left (i.e. f(2,j)=(2,j-1)) or to the right (i.e. f(2,j)=(2,j+1)),where the background pixels have different possibilities for the mapping, not influencing the main property of the connector. This property, which justifies the name ‘connector ’, is the following: It is not possible to find a mapping, which yields zero cost where the relative displacements of the connector pixels are not equal, i.e. one always has f(2,j)-(2,j)=f(2,j')-(2,j'),which can easily be observed by induction over j'.That is, given an initial displacement of one pixel (which will be ±1 in this context), the remaining end of the connector has the same displacement if overall costs of the mapping are zero. Given this property and the direction of a connector, which we define to be directed from variable to clause, wecan define the state of the connector as carrying the‘true’truth value, if the displacement is 1 pixel in the direction of the connector and as carrying the‘false’ truth value, if the displacement is -1 pixel in the direction of the connector. This property then ensures that the truth value transmitted by the connector cannot change at mappings of zero cost.Image A image Bmapping 1 mapping 2Fig. 1. The straight connector component with two possible zero cost mappings.For drawing of arbitrary graphs, clearly one also needs corners,which are represented in Fig. 2.By considering all possible displacements which guarantee overall cost zero, one can observe that the corner component also ensures the basic connector property. For example, consider the first depicted mapping, which has zero cost. On the other hand, the second mapping shows, that it is not possible to construct a zero cost mapping with both connectors‘leaving’the component. In that case, the pixel at the position marked‘? ’either has a conflict (that i s, introduces a cost greater than zero in the criterion function because of mapping mismatch) with the pixel above or to the right of it,if the same color is to be met and otherwise, a cost in the gray value mismatch term is introduced.image A image Bmapping 1 mapping 2Fig. 2. The corner connector component and two example mappings.Fig. 3 shows the variable component, in this case with two positive (to the left) and one negated output (to the right) leaving the component as connectors. Here, a fourth color is used, denoted by ·.This component has two possible mappings for thecolored pixels with zero cost, which map the vertical component of the source image to the left or the right vertical component in the target image, respectively. (In both cases the second vertical element in the target image is not a target of the mapping.) This ensures±1 pixel relative displacements at the entry to the connectors. This property again can be deducted by regarding all possible mappings of the two images.The property that follows (which is necessary for the use as variable) is that all zero cost mappings ensure that all positive connectors carry the same truth value,which is the opposite of the truth value for all the negated connectors. It is easy to see from this example how variable components for arbitrary numbers of positive and negated outputs can be constructed.image A image BImage C image DFig. 3. The variable component with two positive and one negated output and two possible mappings (for true and false truth value).Fig. 4 shows the most complex of the components, the clause component. This component consists of two parts. The first part is the horizontal connector with a 'bend' in it to the right.This part has the property that cost zero mappings are possible for all truth values of x and y with the exception of two 'false' values. This two input disjunction,can be extended to a three input dis-junction using the part in the lower left. If the z connector carries a 'false' truth value, this part can only be mapped one pixel downwards at zero cost.In that case the junction pixel (the fourth pixel in the third row) cannot be mapped upwards at zero cost and the 'two input clause' behaves as de-scribed above. On the other hand, if the z connector carries a 'true' truth value, this part can only be mapped one pixel upwards at zero cost,and the junction pixel can be mapped upwards,thus allowing both x and y to carry a 'false' truth value in a zero cost mapping. Thus there exists a zero cost mapping of the clause component iff at least one of the input connectors carries a truth value.image Aimage B mapping 1(true,true,false)mapping 2 (false,false,true,)Fig. 4. The clause component with three incoming connectors x, y , z and zero cost mappings forthe two cases(true,true,false)and (false, false, true).The described components are already sufficient to prove NP-completeness by reduction from planar 3-SAT (which is an NP-complete sub-problem of 3-SAT where the additional constraints on the instances is that the dependency graph is planar),but in order to derive a reduction from 3-SAT, we also include the possibility of crossing connectors.Fig. 5 shows the connector crossing, whose basic property is to allow zero cost mappings if the truth–values are consistently propagated. This is assured by a color change of the vertical connector and a 'flexible' middle part, which can be mapped to four different positions depending on the truth value distribution.image Aimage Bzero cost mappingFig. 5. The connector crossing component and one zero cost mapping.3.2. ReductionUsing the previously introduced components, we can now perform the reduction from 3-SAT to image matching .Proof of the claim that the image matching problem is NP-complete:Clearly, the image matching problem is in NP since, given a mapping f and two images A and B ,the computation of c(A,B,f)can be done in polynomial time. To prove NP-hardness, we construct a reduction from the 3-SAT problem. Given an instance of 3-SAT we construct two images A and B , for which a mapping of cost zero exists iff all the clauses can be satisfied.Given the dependency graph D ,we construct an embedding of the graph into a 2D pixel grid, placing the vertices on a large enough distance from each other (say100(K+L)² ).This can be done using well-known methods from graph drawing (see e.g.di Battista et al.,1999).From this image of the graph D we construct the two images A and B , using the components described above.Each vertex belonging to a variable is replaced with the respective parts of the variable component, having a number of leaving connectors equal to the number of incident edges under consideration of the positive or negative use in the respective clause. Each vertex belonging to a clause is replaced by the respective clause component,and each crossing of edges is replaced by the respective crossing component. Finally, all the edges are replaced with connectors and corner connectors, and the remaining pixels inside the rectangular hull of the construction are set to the background gray value. Clearly, the placement of the components can be done in such a way that all the components are at a large enough distance from each other, where the background pixels act as an 'insulation' against mapping of pixels, which do not belong to the same component. It can be easily seen, that the size of the constructed images is polynomial with respect to the number of vertices and edges of D and thus polynomial in the size of the instance of 3-SAT, at most in the order (K+L)².Furthermore, it can obviously be constructed in polynomial time, as the corresponding graph drawing algorithms are polynomial.Let there exist a truth assignment to the variables x1,…,xL, which satisfies allthe clauses c1,…,cK. We construct a mapping f , that satisfies c(f,A,B)=0 asfollows.For all pixels (i, j ) belonging to variable component l with A(i,j)not of the background color,set f(i,j)=(i,j-1)if xlis assigned the truth value 'true' , set f(i,j)=(i,j+1), otherwise. For the remaining pixels of the variable component set A(i,j)=B(i,j),if f(i,j)=(i,j), otherwise choose f(i,j)from{(i,j+1),(i+1,j+1),(i-1,j+1)}for xl'false' respectively from {(i,j-1),(i+1,j-1),(i-1,j-1)}for xl'true ',such that A(i,j)=B(f(i,j)). This assignment is always possible and has zero cost, as can be easily verified.For the pixels(i,j)belonging to (corner) connector components,the mapping function can only be extended in one way without the introduction of nonzero cost,starting from the connection with the variable component. This is ensured by thebasic connector property. By choosing f (i ,j )=(i,j )for all pixels of background color, we obtain a valid extension for the connectors. For the connector crossing components the extension is straight forward, although here ––as in the variable mapping ––some care must be taken with the assign ment of the background value pixels, but a zero cost assignment is always possible using the same scheme as presented for the variable mapping.It remains to be shown that the clause components can be mapped at zero cost, if at least one of the input connectors x , y , z carries a ' true' truth value.For a proof we regard alls even possibilities and construct a mapping for each case. In thedescription of the clause component it was already argued that this is possible,and due to space limitations we omit the formalization of the argument here.Finally, for all the pixels (i ,j )not belonging to any of the components, we set f (i ,j )=(i ,j )thus arriving at a mapping function which has c (f ,A ,B )=0。

人脸识别 面部 数字图像处理相关 中英对照 外文文献翻译 毕业设计论文 高质量人工翻译 原文带出处

人脸识别 面部 数字图像处理相关 中英对照 外文文献翻译 毕业设计论文 高质量人工翻译 原文带出处

人脸识别相关文献翻译,纯手工翻译,带原文出处(原文及译文)如下翻译原文来自Thomas David Heseltine BSc. Hons. The University of YorkDepartment of Computer ScienceFor the Qualification of PhD. — September 2005 -《Face Recognition: Two-Dimensional and Three-Dimensional Techniques》4 Two-dimensional Face Recognition4.1 Feature LocalizationBefore discussing the methods of comparing two facial images we now take a brief look at some at the preliminary processes of facial feature alignment. This process typically consists of two stages: face detection and eye localisation. Depending on the application, if the position of the face within the image is known beforehand (fbr a cooperative subject in a door access system fbr example) then the face detection stage can often be skipped, as the region of interest is already known. Therefore, we discuss eye localisation here, with a brief discussion of face detection in the literature review(section 3.1.1).The eye localisation method is used to align the 2D face images of the various test sets used throughout this section. However, to ensure that all results presented are representative of the face recognition accuracy and not a product of the performance of the eye localisation routine, all image alignments are manually checked and any errors corrected, prior to testing and evaluation.We detect the position of the eyes within an image using a simple template based method. A training set of manually pre-aligned images of feces is taken, and each image cropped to an area around both eyes. The average image is calculated and used as a template.Figure 4-1 - The average eyes. Used as a template for eye detection.Both eyes are included in a single template, rather than individually searching for each eye in turn, as the characteristic symmetry of the eyes either side of the nose, provides a useful feature that helps distinguish between the eyes and other false positives that may be picked up in the background. Although this method is highly susceptible to scale(i.e. subject distance from the camera) and also introduces the assumption that eyes in the image appear near horizontal. Some preliminary experimentation also reveals that it is advantageous to include the area of skin justbeneath the eyes. The reason being that in some cases the eyebrows can closely match the template, particularly if there are shadows in the eye-sockets, but the area of skin below the eyes helps to distinguish the eyes from eyebrows (the area just below the eyebrows contain eyes, whereas the area below the eyes contains only plain skin).A window is passed over the test images and the absolute difference taken to that of the average eye image shown above. The area of the image with the lowest difference is taken as the region of interest containing the eyes. Applying the same procedure using a smaller template of the individual left and right eyes then refines each eye position.This basic template-based method of eye localisation, although providing fairly preciselocalisations, often fails to locate the eyes completely. However, we are able to improve performance by including a weighting scheme.Eye localisation is performed on the set of training images, which is then separated into two sets: those in which eye detection was successful; and those in which eye detection failed. Taking the set of successful localisations we compute the average distance from the eye template (Figure 4-2 top). Note that the image is quite dark, indicating that the detected eyes correlate closely to the eye template, as we would expect. However, bright points do occur near the whites of the eye, suggesting that this area is often inconsistent, varying greatly from the average eye template.Figure 4-2 一Distance to the eye template for successful detections (top) indicating variance due to noise and failed detections (bottom) showing credible variance due to miss-detected features.In the lower image (Figure 4-2 bottom), we have taken the set of failed localisations(images of the forehead, nose, cheeks, background etc. falsely detected by the localisation routine) and once again computed the average distance from the eye template. The bright pupils surrounded by darker areas indicate that a failed match is often due to the high correlation of the nose and cheekbone regions overwhelming the poorly correlated pupils. Wanting to emphasise the difference of the pupil regions for these failed matches and minimise the variance of the whites of the eyes for successful matches, we divide the lower image values by the upper image to produce a weights vector as shown in Figure 4-3. When applied to the difference image before summing a total error, this weighting scheme provides a much improved detection rate.Figure 4-3 - Eye template weights used to give higher priority to those pixels that best represent the eyes.4.2 The Direct Correlation ApproachWe begin our investigation into face recognition with perhaps the simplest approach,known as the direct correlation method (also referred to as template matching by Brunelli and Poggio [29 ]) involving the direct comparison of pixel intensity values taken from facial images. We use the term "Direct Conelation, to encompass all techniques in which face images are compared directly, without any form of image space analysis, weighting schemes or feature extraction, regardless of the distance metric used. Therefore, we do not infer that Pearson's correlation is applied as the similarity function (although such an approach would obviously come under our definition of direct correlation). We typically use the Euclidean distance as our metric in these investigations (inversely related to Pearson's correlation and can be considered as a scale and translation sensitive form of image correlation), as this persists with the contrast made between image space and subspace approaches in later sections.Firstly, all facial images must be aligned such that the eye centres are located at two specified pixel coordinates and the image cropped to remove any background information. These images are stored as greyscale bitmaps of 65 by 82 pixels and prior to recognition converted into a vector of 5330 elements (each element containing the corresponding pixel intensity value). Each corresponding vector can be thought of as describing a point within a 5330 dimensional image space. This simple principle can easily be extended to much larger images: a 256 by 256 pixel image occupies a single point in 65,536-dimensional image space and again, similar images occupy close points within that space. Likewise, similar faces are located close together within the image space, while dissimilar faces are spaced far apart. Calculating the Euclidean distance d, between two facial image vectors (often referred to as the query image q, and gallery image g), we get an indication of similarity. A threshold is then applied to make the final verification decision.d . q - g ( threshold accept ) (d threshold ⇒ reject ). Equ. 4-14.2.1 Verification TestsThe primary concern in any face recognition system is its ability to correctly verify a claimed identity or determine a person's most likely identity from a set of potential matches in a database. In order to assess a given system's ability to perform these tasks, a variety of evaluation methodologies have arisen. Some of these analysis methods simulate a specific mode of operation (i.e. secure site access or surveillance), while others provide a more mathematicaldescription of data distribution in some classification space. In addition, the results generated from each analysis method may be presented in a variety of formats. Throughout the experimentations in this thesis, we primarily use the verification test as our method of analysis and comparison, although we also use Fisher's Linear Discriminant to analyse individual subspace components in section 7 and the identification test for the final evaluations described in section 8. The verification test measures a system's ability to correctly accept or reject the proposed identity of an individual. At a functional level, this reduces to two images being presented for comparison, fbr which the system must return either an acceptance (the two images are of the same person) or rejection (the two images are of different people). The test is designed to simulate the application area of secure site access. In this scenario, a subject will present some form of identification at a point of entry, perhaps as a swipe card, proximity chip or PIN number. This number is then used to retrieve a stored image from a database of known subjects (often referred to as the target or gallery image) and compared with a live image captured at the point of entry (the query image). Access is then granted depending on the acceptance/rej ection decision.The results of the test are calculated according to how many times the accept/reject decision is made correctly. In order to execute this test we must first define our test set of face images. Although the number of images in the test set does not affect the results produced (as the error rates are specified as percentages of image comparisons), it is important to ensure that the test set is sufficiently large such that statistical anomalies become insignificant (fbr example, a couple of badly aligned images matching well). Also, the type of images (high variation in lighting, partial occlusions etc.) will significantly alter the results of the test. Therefore, in order to compare multiple face recognition systems, they must be applied to the same test set.However, it should also be noted that if the results are to be representative of system performance in a real world situation, then the test data should be captured under precisely the same circumstances as in the application environment.On the other hand, if the purpose of the experimentation is to evaluate and improve a method of face recognition, which may be applied to a range of application environments, then the test data should present the range of difficulties that are to be overcome. This may mean including a greater percentage of6difficult9 images than would be expected in the perceived operating conditions and hence higher error rates in the results produced. Below we provide the algorithm for executing the verification test. The algorithm is applied to a single test set of face images, using a single function call to the face recognition algorithm: CompareF aces(F ace A, FaceB). This call is used to compare two facial images, returning a distance score indicating how dissimilar the two face images are: the lower the score the more similar the two face images. Ideally, images of the same face should produce low scores, while images of different faces should produce high scores.Every image is compared with every other image, no image is compared with itself and nopair is compared more than once (we assume that the relationship is symmetrical). Once two images have been compared, producing a similarity score, the ground-truth is used to determine if the images are of the same person or different people. In practical tests this information is often encapsulated as part of the image filename (by means of a unique person identifier). Scores are then stored in one of two lists: a list containing scores produced by comparing images of different people and a list containing scores produced by comparing images of the same person. The final acceptance/rejection decision is made by application of a threshold. Any incorrect decision is recorded as either a false acceptance or false rejection. The false rejection rate (FRR) is calculated as the percentage of scores from the same people that were classified as rejections. The false acceptance rate (FAR) is calculated as the percentage of scores from different people that were classified as acceptances.For IndexA = 0 to length(TestSet) For IndexB = IndexA+l to length(TestSet) Score = CompareFaces(TestSet[IndexA], TestSet[IndexB]) If IndexA and IndexB are the same person Append Score to AcceptScoresListElseAppend Score to RejectScoresListFor Threshold = Minimum Score to Maximum Score:FalseAcceptCount, FalseRejectCount = 0For each Score in RejectScoresListIf Score <= ThresholdIncrease FalseAcceptCountFor each Score in AcceptScoresListIf Score > ThresholdIncrease FalseRejectCountF alse AcceptRate = FalseAcceptCount / Length(AcceptScoresList)FalseRej ectRate = FalseRejectCount / length(RejectScoresList)Add plot to error curve at (FalseRejectRate, FalseAcceptRate)These two error rates express the inadequacies of the system when operating at aspecific threshold value. Ideally, both these figures should be zero, but in reality reducing either the FAR or FRR (by altering the threshold value) will inevitably resultin increasing the other. Therefore, in order to describe the full operating range of a particular system, we vary the threshold value through the entire range of scores produced. The application of each threshold value produces an additional FAR, FRR pair, which when plotted on a graph produces the error rate curve shown below.False Acceptance Rate / %Figure 4-5 - Example Error Rate Curve produced by the verification test.The equal error rate (EER) can be seen as the point at which FAR is equal to FRR. This EER value is often used as a single figure representing the general recognition performance of a biometric system and allows for easy visual comparison of multiple methods. However, it is important to note that the EER does not indicate the level of error that would be expected in a real world application. It is unlikely that any real system would use a threshold value such that the percentage of false acceptances were equal to the percentage of false rejections. Secure site access systems would typically set the threshold such that false acceptances were significantly lower than false rejections: unwilling to tolerate intruders at the cost of inconvenient access denials.Surveillance systems on the other hand would require low false rejection rates to successfully identify people in a less controlled environment. Therefore we should bear in mind that a system with a lower EER might not necessarily be the better performer towards the extremes of its operating capability.There is a strong connection between the above graph and the receiver operating characteristic (ROC) curves, also used in such experiments. Both graphs are simply two visualisations of the same results, in that the ROC format uses the True Acceptance Rate(TAR), where TAR = 1.0 - FRR in place of the FRR, effectively flipping the graph vertically. Another visualisation of the verification test results is to display both the FRR and FAR as functions of the threshold value. This presentation format provides a reference to determine the threshold value necessary to achieve a specific FRR and FAR. The EER can be seen as the point where the two curves intersect.Figure 4-6 - Example error rate curve as a function of the score threshold The fluctuation of these error curves due to noise and other errors is dependant on the number of face image comparisons made to generate the data. A small dataset that only allows fbr a small number of comparisons will results in a jagged curve, in which large steps correspond to the influence of a single image on a high proportion of the comparisons made. A typical dataset of 720 images (as used in section 4.2.2) provides 258,840 verification operations, hence a drop of 1% EER represents an additional 2588 correct decisions, whereas the quality of a single image could cause the EER to fluctuate by up to 0.28.422 ResultsAs a simple experiment to test the direct correlation method, we apply the technique described above to a test set of 720 images of 60 different people, taken from the AR Face Database [ 39 ]. Every image is compared with every other image in the test set to produce a likeness score, providing 258,840 verification operations from which to calculate false acceptance rates and false rejection rates. The error curve produced is shown in Figure 4-7.Figure 4-7 - Error rate curve produced by the direct correlation method using no image preprocessing.We see that an EER of 25.1% is produced, meaning that at the EER threshold approximately one quarter of all verification operations carried out resulted in an incorrect classification. Thereare a number of well-known reasons for this poor level of accuracy. Tiny changes in lighting, expression or head orientation cause the location in image space to change dramatically. Images in face space are moved far apart due to these image capture conditions, despite being of the same person's face. The distance between images of different people becomes smaller than the area of face space covered by images of the same person and hence false acceptances and false rejections occur frequently. Other disadvantages include the large amount of storage necessaryfor holding many face images and the intensive processing required for each comparison, making this method unsuitable fbr applications applied to a large database. In section 4.3 we explore the eigenface method, which attempts to address some of these issues.4二维人脸识别4.1功能定位在讨论比较两个人脸图像,我们现在就简要介绍的方法一些在人脸特征的初步调整过程。

图像识别中英文对照外文翻译文献

图像识别中英文对照外文翻译文献

中英文对照外文翻译文献(文档含英文原文和中文翻译)Elastic image matchingAbstractOne fundamental problem in image recognition is to establish the resemblance of two images. This can be done by searching the best pixel to pixel mapping taking into account monotonicity and continuity constraints. We show that this problem is NP-complete by reduction from 3-SAT, thus giving evidence that the known exponential time algorithms are justified, but approximation algorithms or simplifications are necessary.Keywords: Elastic image matching; Two-dimensional warping; NP-completeness 1. IntroductionIn image recognition, a common problem is to match two given images, e.g. when comparing an observed image to given references. In that pro-cess, elastic image matching, two-dimensional (2D-)warping (Uchida and Sakoe, 1998) or similar types of invariant methods (Keysers et al., 2000) can be used. For this purpose, we can define cost functions depending on the distortion introduced in the matching andsearch for the best matching with respect to a given cost function. In this paper, we show that it is an algorithmically hard problem to decide whether a matching between two images exists with costs below a given threshold. We show that the problem image matching is NP-complete by means of a reduction from 3-SAT, which is a common method of demonstrating a problem to be intrinsically hard (Garey and Johnson, 1979). This result shows the inherent computational difficulties in this type of image comparison, while interestingly the same problem is solvable for 1D sequences in polynomial time, e.g. the dynamic time warping problem in speech recognition (see e.g. Ney et al., 1992). This has the following implications: researchers who are interested in an exact solution to this problem cannot hope to find a polynomial time algorithm, unless P=NP. Furthermore, one can conclude that exponential time algorithms as presented and extended by Uchida and Sakoe (1998, 1999a,b, 2000a,b) may be justified for some image matching applications. On the other hand this shows that those interested in faster algorithms––e.g. for pattern recognition purposes––are right in searching for sub-optimal solutions. One method to do this is the restriction to local optimizations or linear approximations of global transformations as presented in (Keysers et al., 2000). Another possibility is to use heuristic approaches like simulated annealing or genetic algorithms to find an approximate solution. Furthermore, methods like beam search are promising candidates, as these are used successfully in speech recognition, although linguistic decoding is also an NP-complete problem (Casacuberta and de la Higuera, 1999). 2. Image matchingAmong the varieties of matching algorithms,we choose the one presented by Uchida and Sakoe(1998) as a starting point to formalize the problem image matching. Let the images be given as(without loss of generality) square grids of size M×M with gray values (respectively node labels)from a finite alphabet &={1,…,G}. To define thed:&×&→N , problem, two distance functions are needed,one acting on gray valuesg measuring the match in gray values, and one acting on displacement differences :Z×Z→N , measuring the distortion introduced by t he matching. For these distance ddfunctions we assume that they are monotonous functions (computable in polynomial time) of the commonly used squared Euclid-ean distance, i.ed g (g 1,g 2)=f 1(||g 1-g 2||²)and d d (z)=f 2(||z||²) monotonously increasing. Now we call the following optimization problem the image matching problem (let µ={1,…M} ).Instance: The pair( A ; B ) of two images A and B of size M×M .Solution: A mapping function f :µ×µ→µ×µ.Measure:c (A,B,f )=),(),(j i f ij g B Ad ∑μμ⨯∈),(j i+∑⨯-⋅⋅⋅∈+-+μ}1,{1,),()))0,1(),(())0,1(),(((M j i d j i f j i f dμ⨯-⋅⋅⋅∈}1,{1,),(M j i +∑⋅⋅⋅⨯∈+-+1}-M ,{1,),()))1,0(),(())1,0(),(((μj i d j i f j i f d 1}-M ,{1,),(⋅⋅⋅⨯∈μj iGoal:min f c(A,B,f).In other words, the problem is to find the mapping from A onto B that minimizes the distance between the mapped gray values together with a measure for the distortion introduced by the mapping. Here, the distortion is measured by the deviation from the identity mapping in the two dimensions. The identity mapping fulfills f(i,j)=(i,j),and therefore ,f((i,j)+(x,y))=f(i,j)+(x,y)The corresponding decision problem is fixed by the followingQuestion:Given an instance of image matching and a cost c′, does there exist a ma pping f such that c(A,B,f)≤c′?In the definition of the problem some care must be taken concerning the distance functions. For example, if either one of the distance functions is a constant function, the problem is clearly in P (for d g constant, the minimum is given by the identity mapping and for d d constant, the minimum can be determined by sorting all possible matching for each pixel by gray value cost and mapping to one of the pixels with minimum cost). But these special cases are not those we are concerned with in image matching in general.We choose the matching problem of Uchida and Sakoe (1998) to complete the definition of the problem. Here, the mapping functions are restricted by continuity and monotonicity constraints: the deviations from the identity mapping may locally be at most one pixel (i.e. limited to the eight-neighborhood with squared Euclidean distance less than or equal to 2). This can be formalized in this approach bychoosing the functions f1,f2as e.g.f 1=id,f2(x)=step(x):=⎩⎨⎧.2,)10(,2,0>≤⋅xGxMM3. Reduction from 3-SAT3-SAT is a very well-known NP-complete problem (Garey and Johnson, 1979), where 3-SAT is defined as follows:Instance: Collection of clauses C={C1,···,CK} on a set of variables X={x1, (x)L}such that each ckconsists of 3 literals for k=1,···K .Each literal is a variable or the negation of a variable.Question:Is there a truth assignment for X which satisfies each clause ck, k=1,···K ?The dependency graph D(Ф)corresponding to an instance Ф of 3-SAT is defined to be the bipartite graph whose independent sets are formed by the set of clauses Cand the set of variables X .Two vert ices ck and x1are adjacent iff ckinvolvesx 1or-xL.Given any 3-SAT formula U, we show how to construct in polynomial time anequivalent image matching problem l(Ф)=(A(Ф),B(Ф)); . The two images of l (Ф)are similar according to the cost function (i.e.f:c(A(Ф),B(Ф),f)≤0) iff the formulaФ is satisfiable. We perform the reduction from 3-SAT using the following steps:• From the formula Ф we construct the dependency graph D(Ф).• The dependency graph D(Ф)is drawn in the plane.• The drawing of D(Ф)is refined to depict the logical behaviour of Ф , yielding two images(A(Ф),B(Ф)).For this, we use three types of components: one component to represent variables of Ф , one component to represent clauses of Ф, and components which act as interfaces between the former two types. Before we give the formal reduction, we introduce these components.3.1. Basic componentsFor the reduction from 3-SAT we need five components from which we will construct the in-stances for image matching , given a Boolean formula in 3-DNF,respectively its graph. The five components are the building blocks needed for the graph drawing and will be introduced in the following, namely the representations of connectors,crossings, variables, and clauses. The connectors represent the edges and have two varieties, straight connectors and corner connectors. Each of the components consists of two parts, one for image A and one for image B , where blank pixels are considered to be of the‘background ’color.We will depict possible mappings in the following using arrows indicating the direction of displacement (where displacements within the eight-neighborhood of a pixel are the only cases considered). Blank squares represent mapping to the respective counterpart in the second image.For example, the following displacements of neighboring pixels can be used with zero cost:On the other hand, the following displacements result in costs greater than zero:Fig. 1 shows the first component, the straight connector component, which consists of a line of two different interchanging colors,here denoted by the two symbols◇and□. Given that the outside pixels are mapped to their respe ctive counterparts and the connector is continued infinitely, there are two possible ways in which the colored pixels can be mapped, namely to the left (i.e. f(2,j)=(2,j-1)) or to the right (i.e. f(2,j)=(2,j+1)),where the background pixels have different possibilities for the mapping, not influencing the main property of the connector. This property, which justifies the name ‘connector ’, is the following: It is not possible to find a mapping, which yields zero cost where the relative displacements of the connector pixels are not equal, i.e. one always has f(2,j)-(2,j)=f(2,j')-(2,j'),which can easily be observed by induction over j'.That is, given an initial displacement of one pixel (which will be ±1 in this context), the remaining end of the connector has the same displacement if overall costs of the mapping are zero. Given this property and the direction of a connector, which we define to be directed from variable to clause, wecan define the state of the connector as carrying the‘true’truth value, if the displacement is 1 pixel in the direction of the connector and as carrying the‘false’ truth value, if the displacement is -1 pixel in the direction of the connector. This property then ensures that the truth value transmitted by the connector cannot change at mappings of zero cost.Image A image Bmapping 1 mapping 2Fig. 1. The straight connector component with two possible zero cost mappings.For drawing of arbitrary graphs, clearly one also needs corners,which are represented in Fig. 2.By considering all possible displacements which guarantee overall cost zero, one can observe that the corner component also ensures the basic connector property. For example, consider the first depicted mapping, which has zero cost. On the other hand, the second mapping shows, that it is not possible to construct a zero cost mapping with both connectors‘leaving’the component. In that case, the pixel at the position marked‘? ’either has a conflict (that i s, introduces a cost greater than zero in the criterion function because of mapping mismatch) with the pixel above or to the right of it,if the same color is to be met and otherwise, a cost in the gray value mismatch term is introduced.image A image Bmapping 1 mapping 2Fig. 2. The corner connector component and two example mappings.Fig. 3 shows the variable component, in this case with two positive (to the left) and one negated output (to the right) leaving the component as connectors. Here, a fourth color is used, denoted by ·.This component has two possible mappings for thecolored pixels with zero cost, which map the vertical component of the source image to the left or the right vertical component in the target image, respectively. (In both cases the second vertical element in the target image is not a target of the mapping.) This ensures±1 pixel relative displacements at the entry to the connectors. This property again can be deducted by regarding all possible mappings of the two images.The property that follows (which is necessary for the use as variable) is that all zero cost mappings ensure that all positive connectors carry the same truth value,which is the opposite of the truth value for all the negated connectors. It is easy to see from this example how variable components for arbitrary numbers of positive and negated outputs can be constructed.image A image BImage C image DFig. 3. The variable component with two positive and one negated output and two possible mappings (for true and false truth value).Fig. 4 shows the most complex of the components, the clause component. This component consists of two parts. The first part is the horizontal connector with a 'bend' in it to the right.This part has the property that cost zero mappings are possible for all truth values of x and y with the exception of two 'false' values. This two input disjunction,can be extended to a three input dis-junction using the part in the lower left. If the z connector carries a 'false' truth value, this part can only be mapped one pixel downwards at zero cost.In that case the junction pixel (the fourth pixel in the third row) cannot be mapped upwards at zero cost and the 'two input clause' behaves as de-scribed above. On the other hand, if the z connector carries a 'true' truth value, this part can only be mapped one pixel upwards at zero cost,and the junction pixel can be mapped upwards,thus allowing both x and y to carry a 'false' truth value in a zero cost mapping. Thus there exists a zero cost mapping of the clause component iff at least one of the input connectors carries a truth value.image Aimage B mapping 1(true,true,false)mapping 2 (false,false,true,)Fig. 4. The clause component with three incoming connectors x, y , z and zero cost mappings forthe two cases(true,true,false)and (false, false, true).The described components are already sufficient to prove NP-completeness by reduction from planar 3-SAT (which is an NP-complete sub-problem of 3-SAT where the additional constraints on the instances is that the dependency graph is planar),but in order to derive a reduction from 3-SAT, we also include the possibility of crossing connectors.Fig. 5 shows the connector crossing, whose basic property is to allow zero cost mappings if the truth–values are consistently propagated. This is assured by a color change of the vertical connector and a 'flexible' middle part, which can be mapped to four different positions depending on the truth value distribution.image Aimage Bzero cost mappingFig. 5. The connector crossing component and one zero cost mapping.3.2. ReductionUsing the previously introduced components, we can now perform the reduction from 3-SAT to image matching .Proof of the claim that the image matching problem is NP-complete:Clearly, the image matching problem is in NP since, given a mapping f and two images A and B ,the computation of c(A,B,f)can be done in polynomial time. To prove NP-hardness, we construct a reduction from the 3-SAT problem. Given an instance of 3-SAT we construct two images A and B , for which a mapping of cost zero exists iff all the clauses can be satisfied.Given the dependency graph D ,we construct an embedding of the graph into a 2D pixel grid, placing the vertices on a large enough distance from each other (say100(K+L)² ).This can be done using well-known methods from graph drawing (see e.g.di Battista et al.,1999).From this image of the graph D we construct the two images A and B , using the components described above.Each vertex belonging to a variable is replaced with the respective parts of the variable component, having a number of leaving connectors equal to the number of incident edges under consideration of the positive or negative use in the respective clause. Each vertex belonging to a clause is replaced by the respective clause component,and each crossing of edges is replaced by the respective crossing component. Finally, all the edges are replaced with connectors and corner connectors, and the remaining pixels inside the rectangular hull of the construction are set to the background gray value. Clearly, the placement of the components can be done in such a way that all the components are at a large enough distance from each other, where the background pixels act as an 'insulation' against mapping of pixels, which do not belong to the same component. It can be easily seen, that the size of the constructed images is polynomial with respect to the number of vertices and edges of D and thus polynomial in the size of the instance of 3-SAT, at most in the order (K+L)².Furthermore, it can obviously be constructed in polynomial time, as the corresponding graph drawing algorithms are polynomial.Let there exist a truth assignment to the variables x1,…,xL, which satisfies allthe clauses c1,…,cK. We construct a mapping f , that satisfies c(f,A,B)=0 asfollows.For all pixels (i, j ) belonging to variable component l with A(i,j)not of the background color,set f(i,j)=(i,j-1)if xlis assigned the truth value 'true' , set f(i,j)=(i,j+1), otherwise. For the remaining pixels of the variable component set A(i,j)=B(i,j),if f(i,j)=(i,j), otherwise choose f(i,j)from{(i,j+1),(i+1,j+1),(i-1,j+1)}for xl'false' respectively from {(i,j-1),(i+1,j-1),(i-1,j-1)}for xl'true ',such that A(i,j)=B(f(i,j)). This assignment is always possible and has zero cost, as can be easily verified.For the pixels(i,j)belonging to (corner) connector components,the mapping function can only be extended in one way without the introduction of nonzero cost,starting from the connection with the variable component. This is ensured by thebasic connector property. By choosing f (i ,j )=(i,j )for all pixels of background color, we obtain a valid extension for the connectors. For the connector crossing components the extension is straight forward, although here ––as in the variable mapping ––some care must be taken with the assign ment of the background value pixels, but a zero cost assignment is always possible using the same scheme as presented for the variable mapping.It remains to be shown that the clause components can be mapped at zero cost, if at least one of the input connectors x , y , z carries a ' true' truth value.For a proof we regard alls even possibilities and construct a mapping for each case. In thedescription of the clause component it was already argued that this is possible,and due to space limitations we omit the formalization of the argument here.Finally, for all the pixels (i ,j )not belonging to any of the components, we set f (i ,j )=(i ,j )thus arriving at a mapping function which has c (f ,A ,B )=0。

数字图像处理英文文献翻译参考

数字图像处理英文文献翻译参考

…………………………………………………装………………订………………线…………………………………………………………………Hybrid Genetic Algorithm Based Image EnhancementTechnologyMu Dongzhou Department of the Information Engineering XuZhou College of Industrial TechnologyXuZhou, China ****************.cnXu Chao and Ge Hongmei Department of the Information Engineering XuZhou College of Industrial TechnologyXuZhou, China ***************.cn,***************.cnAbstract—in image enhancement, Tubbs proposed a normalized incomplete Beta function to represent several kinds of commonly used non-linear transform functions to do the research on image enhancement. But how to define the coefficients of the Beta function is still a problem. We proposed a Hybrid Genetic Algorithm which combines the Differential Evolution to the Genetic Algorithm in the image enhancement process and utilize the quickly searching ability of the algorithm to carry out the adaptive mutation and searches. Finally we use the Simulation experiment to prove the effectiveness of the method.Keywords- Image enhancement; Hybrid Genetic Algorithm; adaptive enhancementI. INTRODUCTIONIn the image formation, transfer or conversion process, due to other objective factors such as system noise, inadequate or excessive exposure, relative motion and so the impact will get the image often a difference between the original image (referred to as degraded or degraded) Degraded image is usually blurred or after the extraction of information through the machine to reduce or even wrong, it must take some measures for its improvement.Image enhancement technology is proposed in this sense, and the purpose is to improve the image quality. Fuzzy Image Enhancement situation according to the image using a variety of special technical highlights some of the information in the image, reduce or eliminate the irrelevant information, to emphasize the image of the whole or the purpose of local features. Image enhancement method is still no unified theory, image enhancement techniques can be divided into three categories: point operations, and spatial frequency enhancement methods Enhancement Act. This paper presents an automatic adjustment according to the image characteristics of adaptive image enhancement method that called hybrid genetic algorithm. It combines the differential evolution algorithm of adaptive search capabilities, automatically determines the transformation function of the parameter values in order to achieve adaptive image enhancement.…………………………………………………装………………订………………线…………………………………………………………………II. IMAGE ENHANCEMENT TECHNOLOGYImage enhancement refers to some features of the image, such as contour, contrast, emphasis or highlight edges, etc., in order to facilitate detection or further analysis and processing. Enhancements will not increase the information in the image data, but will choose the appropriate features of the expansion of dynamic range, making these features more easily detected or identified, for the detection and treatment follow-up analysis and lay a good foundation.Image enhancement method consists of point operations, spatial filtering, and frequency domain filtering categories. Point operations, including contrast stretching, histogram modeling, and limiting noise and image subtraction techniques. Spatial filter including low-pass filtering, median filtering, high pass filter (image sharpening). Frequency filter including homomorphism filtering, multi-scale multi-resolution image enhancement applied [1].III. DIFFERENTIAL EVOLUTION ALGORITHMDifferential Evolution (DE) was first proposed by Price and Storn, and with other evolutionary algorithms are compared, DE algorithm has a strong spatial search capability, and easy to implement, easy to understand. DE algorithm is a novel search algorithm, it is first in the search space randomly generates the initial population and then calculate the difference between any two members of the vector, and the difference is added to the third member of the vector, by which Method to form a new individual. If you find that the fitness of new individual members better than the original, then replace the original with the formation of individual self.The operation of DE is the same as genetic algorithm, and it conclude mutation, crossover and selection, but the methods are different. We suppose that the group size is P, the vector dimension is D, and we can express the object vector as (1):xi=[xi1,xi2,…,xiD] (i =1,…,P)(1) And the mutation vector can be expressed as (2):()321rrriXXFXV-⨯+=i=1,...,P (2) 1rX,2rX,3rX are three randomly selected individuals from group, and r1≠r2≠r3≠i.F is a range of [0, 2] between the actual type constant factor difference vector is used to control the influence, commonly referred to as scaling factor. Clearly the difference between the vector and the smaller the disturbance also smaller, which means that if groups close to the optimum value, the disturbance will be automatically reduced.DE algorithm selection operation is a "greedy " selection mode, if and only if the new vector ui the fitness of the individual than the target vector is better when the individual xi, ui will be retained to the next group. Otherwise, the target vector xi individuals remain in the original group, once again as the next generation of the parent vector.…………………………………………………装………………订………………线…………………………………………………………………IV. HYBRID GA FOR IMAGE ENHANCEMENT IMAGEenhancement is the foundation to get the fast object detection, so it is necessary to find real-time and good performance algorithm. For the practical requirements of different systems, many algorithms need to determine the parameters and artificial thresholds. Can use a non-complete Beta function, it can completely cover the typical image enhancement transform type, but to determine the Beta function parameters are still many problems to be solved. This section presents a Beta function, since according to the applicable method for image enhancement, adaptive Hybrid genetic algorithm search capabilities, automatically determines the transformation function of the parameter values in order to achieve adaptive image enhancement.The purpose of image enhancement is to improve image quality, which are more prominent features of the specified restore the degraded image details and so on. In the degraded image in a common feature is the contrast lower side usually presents bright, dim or gray concentrated. Low-contrast degraded image can be stretched to achieve a dynamic histogram enhancement, such as gray level change. We use Ixy to illustrate the gray level of point (x, y) which can be expressed by (3).Ixy=f(x, y) (3) where: “f” is a linear or nonline ar function. In general, gray image have four nonlinear translations [6] [7] that can be shown as Figure 1. We use a normalized incomplete Beta function to automatically fit the 4 categories of image enhancement transformation curve. It defines in (4):()()()()10,01,111<<-=---⎰βαβαβαdtttBufu(4) where:()()⎰---=1111,dtttBβαβα(5) For different value of α and β, we can get response curve from (4) and (5).The hybrid GA can make use of the previous section adaptive differential evolution algorithm to search for the best function to determine a value of Beta, and then each pixel grayscale values into the Beta function, the corresponding transformation of Figure 1, resulting in ideal image enhancement. The detail description is follows:Assuming the original image pixel (x, y) of the pixel gray level by the formula (4),denoted byxyi,()Ω∈yx,, here Ω is the image domain. Enhanced image is denoted by Ixy. Firstly, the image gray value normalized into [0, 1] by (6).minmaxminiiiig xyxy--=(6)where:maxi andm ini express the maximum and minimum of image gray relatively.Define the nonlinear transformation function f(u) (0≤u≤1) to transform source image…………………………………………………装………………订………………线…………………………………………………………………Finally, we use the hybrid genetic algorithm to determine the appropriate Beta function f (u) the optimal parameters αand β. Will enhance the image Gxy transformed antinormalized.V. EXPERIMENT AND ANALYSISIn the simulation, we used two different types of gray-scale images degraded; the program performed 50 times, population sizes of 30, evolved 600 times. The results show that the proposed method can very effectively enhance the different types of degraded image.Figure 2, the size of the original image a 320 × 320, it's the contrast to low, and some details of the more obscure, in particular, scarves and other details of the texture is not obvious, visual effects, poor, using the method proposed in this section, to overcome the above some of the issues and get satisfactory image results, as shown in Figure 5 (b) shows, the visual effects have been well improved. From the histogram view, the scope of the distribution of image intensity is more uniform, and the distribution of light and dark gray area is more reasonable. Hybrid genetic algorithm to automatically identify the nonlinear…………………………………………………装………………订………………线…………………………………………………………………transformation of the function curve, and the values obtained before 9.837,5.7912, from the curve can be drawn, it is consistent with Figure 3, c-class, that stretch across the middle region compression transform the region, which were consistent with the histogram, the overall original image low contrast, compression at both ends of the middle region stretching region is consistent with human visual sense, enhanced the effect of significantly improved.Figure 3, the size of the original image a 320 × 256, the overall intensity is low, the use of the method proposed in this section are the images b, we can see the ground, chairs and clothes and other details of the resolution and contrast than the original image has Improved significantly, the original image gray distribution concentrated in the lower region, and the enhanced image of the gray uniform, gray before and after transformation and nonlinear transformation of basic graph 3 (a) the same class, namely, the image Dim region stretching, and the values were 5.9409,9.5704, nonlinear transformation of images degraded type inference is correct, the enhanced visual effect and good robustness enhancement.Difficult to assess the quality of image enhancement, image is still no common evaluation criteria, common peak signal to noise ratio (PSNR) evaluation in terms of line, but the peak signal to noise ratio does not reflect the human visual system error. Therefore, we use marginal protection index and contrast increase index to evaluate the experimental results.Edgel Protection Index (EPI) is defined as follows:…………………………………………………装………………订………………线…………………………………………………………………(7)Contrast Increase Index (CII) is defined as follows:minmaxminmax,GGGGCCCEOD+-==(8)In figure 4, we compared with the Wavelet Transform based algorithm and get the evaluate number in TABLE I.Figure 4 (a, c) show the original image and the differential evolution algorithm for enhanced results can be seen from the enhanced contrast markedly improved, clearer image details, edge feature more prominent. b, c shows the wavelet-based hybrid genetic algorithm-based Comparison of Image Enhancement: wavelet-based enhancement method to enhance image detail out some of the image visual effect is an improvement over the original image, but the enhancement is not obvious; and Hybrid genetic algorithm based on adaptive transform image enhancement effect is very good, image details, texture, clarity is enhanced compared with the results based on wavelet transform has greatly improved the image of the post-analytical processing helpful. Experimental enhancement experiment using wavelet transform "sym4" wavelet, enhanced differential evolution algorithm experiment, the parameters and the values were 5.9409,9.5704. For a 256 × 256 size image transform based on adaptive hybrid genetic algorithm in Matlab 7.0 image enhancement software, the computing time is about 2 seconds, operation is very fast. From TABLE I, objective evaluation criteria can be seen, both the edge of the protection index, or to enhance the contrast index, based on adaptive hybrid genetic algorithm compared to traditional methods based on wavelet transform has a larger increase, which is from This section describes the objective advantages of the method. From above analysis, we can see…………………………………………………装………………订………………线…………………………………………………………………that this method.From above analysis, we can see that this method can be useful and effective.VI. CONCLUSIONIn this paper, to maintain the integrity of the perspective image information, the use of Hybrid genetic algorithm for image enhancement, can be seen from the experimental results, based on the Hybrid genetic algorithm for image enhancement method has obvious effect. Compared with other evolutionary algorithms, hybrid genetic algorithm outstanding performance of the algorithm, it is simple, robust and rapid convergence is almost optimal solution can be found in each run, while the hybrid genetic algorithm is only a few parameters need to be set and the same set of parameters can be used in many different problems. Using the Hybrid genetic algorithm quick search capability for a given test image adaptive mutation, search, to finalize the transformation function from the best parameter values. And the exhaustive method compared to a significant reduction in the time to ask and solve the computing complexity. Therefore, the proposed image enhancement method has some practical value.REFERENCES[1] HE Bin et al., Visual C++ Digital Image Processing [M], Posts & Telecom Press,2001,4:473~477[2] Storn R, Price K. Differential Evolution—a Simple and Efficient Adaptive Scheme forGlobal Optimization over Continuous Space[R]. International Computer Science Institute, Berlaey, 1995.[3] Tubbs J D. A note on parametric image enhancement [J].Pattern Recognition.1997,30(6):617-621.[4] TANG Ming, MA Song De, XIAO Jing. Enhancing Far Infrared Image Sequences withModel Based Adaptive Filtering [J] . CHINESE JOURNAL OF COMPUTERS, 2000, 23(8):893-896.[5] ZHOU Ji Liu, LV Hang, Image Enhancement Based on A New Genetic Algorithm [J].Chinese Journal of Computers, 2001, 24(9):959-964.[6] LI Yun, LIU Xuecheng. On Algorithm of Image Constract Enhancement Based onWavelet Transformation [J]. Computer Applications and Software, 2008,8.[7] XIE Mei-hua, WANG Zheng-ming, The Partial Differential Equation Method for ImageResolution Enhancement [J]. Journal of Remote Sensing, 2005,9(6):673-679.…………………………………………………装………………订………………线…………………………………………………………………基于混合遗传算法的图像增强技术Mu Dongzhou 徐州工业职业技术学院信息工程系 XuZhou, China****************.cnXu Chao and Ge Hongmei 徐州工业职业技术学院信息工程系 XuZhou,********************.cn,***************.cn摘要—在图像增强之中,塔布斯提出了归一化不完全β函数表示常用的几种使用的非线性变换函数对图像进行研究增强。

数字图像检测中英文对照外文翻译文献

数字图像检测中英文对照外文翻译文献

中英文对照外文翻译(文档含英文原文和中文翻译)Edge detection in noisy images by neuro-fuzzyprocessing通过神经模糊处理的噪声图像边缘检测AbstractA novel neuro-fuzzy (NF) operator for edge detection in digital images corrupted by impulse noise is presented. The proposed operator is constructed by combining a desired number of NF subdetectors with a postprocessor. Each NF subdetector in the structure evaluates a different pixel neighborhood relation. Hence, the number of NF subdetectors in the structure may be varied to obtain the desired edge detection performance. Internal parameters of the NF subdetectors are adaptively optimized by training by using simple artificial training images. The performance of the proposed edge detector is evaluated on different test images and compared with popular edge detectors from the literature. Simulation results indicate that the proposed NF operator outperforms competing edge detectors and offers superior performance in edge detection in digital images corrupted by impulse noise.Keywords: Neuro-fuzzy systems; Image processing; Edge detection摘要针对被脉冲信号干扰的数字图像进行边缘检测,提出了一种新型的NF边缘检测器,它是由一定数量的NF子探测器与一个后处理器组成。

外文翻译---MATLAB 在图像边缘检测中的应用

外文翻译---MATLAB 在图像边缘检测中的应用

英文资料翻译MATLAB application in image edge detection MATLAB of the 1984 countries MathWorks company to market since, after 10 years of development, has become internationally recognized the best technology application software. MATLAB is not only a kind of direct, efficient computer language, and at the same time, a scientific computing platform, it for data analysis and data visualization, algorithm and application development to provide the most core of math and advanced graphics tools. According to provide it with the more than 500 math and engineering function, engineering and technical personnel and scientific workers can integrated environment of developing or programming to complete their calculation.MATLAB software has very strong openness and adapt to sex. Keep the kernel in under the condition of invariable, MATLAB is in view of the different application subject of launch corresponding Toolbox (Toolbox), has now launched image processing Toolbox, signal processing Toolbox, wavelet Toolbox, neural network Toolbox and communication tools box, etc multiple disciplines special kit, which would place of different subjects research work.MATLAB image processing kit is by a series of support image processing function from the composition, the support of the image processing operation: geometric operation area of operation and operation; Linear filter and filter design; Transform (DCT transform); Image analysis and strengthened; Binary image manipulation, etc. Image processing tool kit function, the function can be divided into the following categories: image display; Image file input and output; Geometric operation; Pixels statistics; Image analysis and strengthened; Image filtering; Sex 2 d filter design; Image transformation; Fields and piece of operation; Binary image operation; Color mapping and color space transformation; Image types and type conversion; Kit acquiring parameters and Settings.1.Edge detection thisUse computer image processing has two purposes: produce more suitable for human observation and identification of the images; Hope can by the automatic computer image recognition and understanding.No matter what kind of purpose to, image processing the key step is to contain a variety of scenery of decomposition of image information. Decomposition of the end result is that break down into some has some kind of characteristics of the smallest components, known as the image of the yuan. Relative to the whole image of speaking, this the yuan more easily to be rapid processing.Image characteristics is to point to the image can be used as the sign of the field properties, it can be divided into the statistical features of the image and image visual, two types of levy. The statistical features of the image is to point to some people the characteristics of definition, through the transform to get, such as image histogram, moments, spectrum, etc.; Image visual characteristics is refers to person visual sense can be directly by the natural features, such as the brightness of the area, and texture or outline, etc. The two kinds of characteristics of the image into a series of meaningful goal or regional p rocess called image segmentation.The image is the basic characteristics of edge, the edge is to show its pixel grayscale around a step change order or roof of the collection of those changes pixels. It exists in target and background, goals and objectives, regional and region, the yuan and the yuan between, therefore, it is the image segmentation dependent on the most important characteristic that the texture characteristics of important information sources and shape characteristics of the foundation, and the image of the texture characteristics and the extraction of shape often dependent on image segmentation. Image edge extraction is also the basis of image matching, because it is the sign of position, the change of the original is not sensitive, and can be used for matching the feature points.The edge of the image is reflected by gray not continuity. Classic edge extraction method is investigation of each pixel image in an area of the gray change, use edge first or second order nearby directional derivative change rule,with simple method of edge detection, this method called edge detection method of local operators.The type of edge can be divided into two types: (1) step representation sexual edge, it on both sides of the pixel gray value varies significantly different; (2) the roof edges, it is located in gray value from the change of increased to reduce the turning point. For order jump sexual edge, second order directional derivative in edge is zero cross; For the roof edges, second order directional derivative in edge take extreme value.If a pixel fell in the image a certain object boundary, then its field will become a gray level with the change. The most useful to change two features is the rate of change and the gray direction, they are in the range of the gradient vector and the direction to said. Edge detection operator check every pixel grayscale rate fields and evaluation, and also include to determine the directions of the most use based on directional derivative deconvolution method for masking.Digital image processing technique has been widely applied to the biomedical field, the use of computer image processing and analysis, and complete detection and recognition of cancer cells can help doctors make a diagnosis of tumor cancers. Need to be made in the identification of cancer cells, the quantitative results, the human eye is difficult to accurately complete such work, and the use of computer image processing to complete the analysis and identification of the microscopic images have made great progress. In recent years, domestic and foreign medical images of cancer cells testing to identify the researchers put forward a lot of theory and method for the diagnosis of cancer cells has very important meaning and practical value.Cell edge detection is the cell area of the number of roundness and color, shape and chromaticity calculation and the basis of the analysis their test results directly affect the analysis and diagnosis of the disease. Classical edge detection operators such as Sobel operator, Laplacian operator, each pixel neighborhood of the image gray scale changes to detect the edge. Although these operators is simple, fast, but there are sensitive to noise, get isolated or in short sections of acontinuous edge pixels, overlapping the adjacent cell edge defects, while the optimal threshold segmentation and contour extraction method of combining edge detection, obtained by the iterative algorithm for the optimal threshold for image segmentation, contour extraction algorithm, digging inside the cell pixels, the last remaining part of the image is the edge of the cell, change the processing order of the traditional edge detection algorithm, by MATLAB programming, the experimental results that can effectively suppress the noise impact at the same time be able to objectively and correctly select the edge detection threshold, precision cell edge detection.2.Edge detection of MATLABMATLAB image processing toolkit defines the edge () function is used to test the edge of gray image.(1) BW = edge (I, "method"), returns and I size binary image BW, includingelements of 1 said is on the edge of the point, 0 means the edge points.Method for the following a string of:1) soble: the default value, with derivative Sobel edge detectionapproximate measure, to return to a maximum gradient edge;2) prewitt: with the derivative prewitt approximate edge detection, amaximum gradient to return to edge;3) Roberts: with the derivative Roberts approximate edge detection margins,return to a maximum gradient edge;4) the log: use the Laplace operation gaussian filter to I carry filtering,through the looking for 0 intersecting detection of edge;5) zerocross: use the filter to designated I filter, looking for 0 intersectingdetection of edge.(2) BW = edge (I, "method", thresh) with thresh designated sensitivitythreshold value, rather than the edge of all not thresh are ignored.(3) BW = edge (I, "method" thresh, direction, for soble and prewitt methodspecified direction, direction for string, including horizontal level said direction; Vertical said to hang straight party; Both said the two directions(the default).(4) BW = edge (I, 'log', thresh, log sigma), with sigma specified standarddeviation.(5) [BW, thresh] = edge (...), the return value of a function in fact have multiple(" BW "and" thresh "), but because the brace up with u said as a matrix, and so can be thought a return only parameters, which also shows the introduction of the concept of matrix MATLAB unity and superiority.st wordMATLAB has strong image processing function, provide a simple function calls to realize many classic image processing method. Not only is the image edge detection, in transform domain processing, image enhancement, mathematics morphological processing, and other aspects of the study, MATLAB can greatly improve the efficiency rapidly in the study of new ideas.MATLAB 在图像边缘检测中的应用MATLAB自1984年由国MathWorks公司推向市场以来,历经十几年的发展,现已成为国际公认的最优秀的科技应用软件。

数字图像处理外文翻译参考文献

数字图像处理外文翻译参考文献

数字图像处理外文翻译参考文献(文档含中英文对照即英文原文和中文翻译)原文:Application Of Digital Image Processing In The MeasurementOf Casting Surface RoughnessAhstract- This paper presents a surface image acquisition system based on digital image processing technology. The image acquired by CCD is pre-processed through the procedure of image editing, image equalization, the image binary conversation and feature parameters extraction to achieve casting surface roughness measurement. The three-dimensional evaluation method is taken to obtain the evaluation parametersand the casting surface roughness based on feature parameters extraction. An automatic detection interface of casting surface roughness based on MA TLAB is compiled which can provide a solid foundation for the online and fast detection of casting surface roughness based on image processing technology.Keywords-casting surface; roughness measurement; image processing; feature parametersⅠ.INTRODUCTIONNowadays the demand for the quality and surface roughness of machining is highly increased, and the machine vision inspection based on image processing has become one of the hotspot of measuring technology in mechanical industry due to their advantages such as non-contact, fast speed, suitable precision, strong ability of anti-interference, etc [1,2]. As there is no laws about the casting surface and the range of roughness is wide, detection parameters just related to highly direction can not meet the current requirements of the development of the photoelectric technology, horizontal spacing or roughness also requires a quantitative representation. Therefore, the three-dimensional evaluation system of the casting surface roughness is established as the goal [3,4], surface roughness measurement based on image processing technology is presented. Image preprocessing is deduced through the image enhancement processing, the image binary conversation. The three-dimensional roughness evaluation based on the feature parameters is performed . An automatic detection interface of casting surface roughness based on MA TLAB is compiled which provides a solid foundation for the online and fast detection of casting surface roughness.II. CASTING SURFACE IMAGE ACQUISITION SYSTEMThe acquisition system is composed of the sample carrier, microscope, CCD camera, image acquisition card and the computer. Sample carrier is used to place tested castings. According to the experimental requirements, we can select a fixed carrier and the sample location can be manually transformed, or select curing specimens and the position of the sampling stage can be changed. Figure 1 shows the whole processing procedure.,Firstly,the detected castings should be placed in the illuminated backgrounds as far as possible, and then through regulating optical lens, setting the CCD camera resolution and exposure time, the pictures collected by CCD are saved to computer memory through the acquisition card. The image preprocessing and feature value extraction on casting surface based on corresponding software are followed. Finally the detecting result is output.III. CASTING SURFACE IMAGE PROCESSINGCasting surface image processing includes image editing, equalization processing, image enhancement and the image binary conversation,etc. The original and clipped images of the measured casting is given in Figure 2. In which a) presents the original image and b) shows the clipped image.A.Image EnhancementImage enhancement is a kind of processing method which can highlight certain image information according to some specific needs and weaken or remove some unwanted informations at the same time[5].In order to obtain more clearly contour of the casting surface equalization processing of the image namely the correction of the image histogram should be pre-processed before image segmentation processing. Figure 3 shows the original grayscale image and equalization processing image and their histograms. As shown in the figure, each gray level of the histogram has substantially the same pixel point and becomes more flat after gray equalization processing. The image appears more clearly after the correction and the contrast of the image is enhanced.Fig.2 Casting surface imageFig.3 Equalization processing imageB. Image SegmentationImage segmentation is the process of pixel classification in essence. It is a very important technology by threshold classification. The optimal threshold is attained through the instmction thresh = graythresh (II). Figure 4 shows the image of the binary conversation. The gray value of the black areas of the Image displays the portion of the contour less than the threshold (0.43137), while the white area shows the gray value greater than the threshold. The shadows and shading emerge in the bright region may be caused by noise or surface depression.Fig4 Binary conversationIV. ROUGHNESS PARAMETER EXTRACTIONIn order to detect the surface roughness, it is necessary to extract feature parameters of roughness. The average histogram and variance are parameters used to characterize the texture size of surface contour. While unit surface's peak area is parameter that can reflect the roughness of horizontal workpiece.And kurtosis parameter can both characterize the roughness of vertical direction and horizontal direction. Therefore, this paper establisheshistogram of the mean and variance, the unit surface's peak area and the steepness as the roughness evaluating parameters of the castings 3D assessment. Image preprocessing and feature extraction interface is compiled based on MATLAB. Figure 5 shows the detection interface of surface roughness. Image preprocessing of the clipped casting can be successfully achieved by this software, which includes image filtering, image enhancement, image segmentation and histogram equalization, and it can also display the extracted evaluation parameters of surface roughness.Fig.5 Automatic roughness measurement interfaceV. CONCLUSIONSThis paper investigates the casting surface roughness measuring method based on digital Image processing technology. The method is composed of image acquisition, image enhancement, the image binary conversation and the extraction of characteristic parameters of roughness casting surface. The interface of image preprocessing and the extraction of roughness evaluation parameters is compiled by MA TLAB which can provide a solid foundation for the online and fast detection of casting surface roughness.REFERENCE[1] Xu Deyan, Lin Zunqi. The optical surface roughness research pro gress and direction[1]. Optical instruments 1996, 18 (1): 32-37.[2] Wang Yujing. Turning surface roughness based on image measurement [D]. Harbin:Harbin University of Science and Technology[3] BRADLEY C. Automated surface roughness measurement[1]. The InternationalJournal of Advanced Manufacturing Technology ,2000,16(9) :668-674.[4] Li Chenggui, Li xing-shan, Qiang XI-FU 3D surface topography measurement method[J]. Aerospace measurement technology, 2000, 20(4): 2-10.[5] Liu He. Digital image processing and application [ M]. China Electric Power Press,2005译文:数字图像处理在铸件表面粗糙度测量中的应用摘要—本文提出了一种表面图像采集基于数字图像处理技术的系统。

外文翻译----基于数字图像处理技术的边缘特征提取

外文翻译----基于数字图像处理技术的边缘特征提取
Edge feature extraction has been applied in many areas widely. This paper mainly discusses about advantages and disadvantages of several edge detection operators applied in the cable insulation parameter measurement. In order to gain more legible image outline, firstly the acquired image is filtered anddenoised. In the process ofdenoising, wavelet transformation is used. And then different operators are applied to detect edge including Differential operator, Log operator,Cannyoperator and Binary morphology operator. Finally the edge pixels of image are connected using the method of bordering closed. Then a clear and complete image outline will be obtained.
The traditionaldenoisingmethod is the use of a low-pass or band-pass filter todenoise. Its shortcoming is that the signal is blurred when noises are removed. There is irreconcilable contradiction between removing noise and edge maintenance. Yet wavelet analysis has been proved to be a powerful tool for image processing. Because Waveletdenoisinguses a different frequency band-pass filters on the signal filtering. It removes the coefficients of some scales which mainly reflect the noise frequency. Then the coefficient of every remaining scale is integrated for inverse transform, so that noise can be suppressed well. So wavelet analysis can be widely used in manyaspects such as image compression, imagedenoising, etc.
  1. 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
  2. 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
  3. 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。

中英文资料对照外文翻译Digital Image Processing and Edge DetectionDigital Image ProcessingInterest in digital image processing methods stems from two principal applica- tion areas: improvement of pictorial information for human interpretation; and processing of image data for storage, transmission, and representation for au- tonomous machine perception.An image may be defined as a two-dimensional function, f(x, y), where x and y are spatial (plane) coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point. When x, y, and the amplitude values of f are all finite, discrete quantities, we call the image a digital image. The field of digital image processing refers to processing digital images by means of a digital computer. Note that a digital image is composed of a finite number of elements, each of which has a particular location and value. These elements are referred to as picture elements, image elements, pels, and pixels. Pixel is the term most widely used to denote the elements of a digital image.Vision is the most advanced of our senses, so it is not surprising that images play the single most important role in human perception. However, unlike humans, who are limited to the visual band of the electromagnetic (EM) spec- trum, imaging machines cover almost the entire EM spectrum, ranging from gamma to radio waves. They can operate on images generated by sources that humans are not accustomed to associating with images. These include ultra- sound, electron microscopy, and computer-generated images. Thus, digital image processing encompasses a wide and varied field of applications.There is no general agreement among authors regarding where image processing stops and other related areas, such as image analysis and computer vi- sion, start. Sometimes a distinction is made by defining image processing as a discipline in which both the input and output of a process are images. We believe this to be a limiting and somewhat artificial boundary. For example, under this definition, even the trivial task of computing the average intensity of an image (which yields a single number) would not be considered an image processing operation. On the other hand, there are fields such as computer vision whose ultimate goal is to use computers to emulate human vision, including learning and being able to make inferences and take actions based on visual inputs. This area itself is a branch of artificial intelligence (AI) whose objective is to emulate human intelligence. The field of AI is in its earliest stages of infancy in terms of development, with progress having been much slower thanoriginally anticipated. The area of image analysis (also called image understanding) is in be- tween image processing and computer vision.There are no clearcut boundaries in the continuum from image processing at one end to computer vision at the other. However, one useful paradigm is to consider three types of computerized processes in this continuum: low-, mid-, and highlevel processes. Low-level processes involve primitive opera- tions such as image preprocessing to reduce noise, contrast enhancement, and image sharpening. A low-level process is characterized by the fact that both its inputs and outputs are images. Mid-level processing on images involves tasks such as segmentation (partitioning an image into regions or objects), description of those objects to reduce them to a form suitable for computer processing, and classification (recognition) of individual objects. A midlevel process is characterized by the fact that its inputs generally are images, but its outputs are attributes extracted from those images (e.g., edges, contours, and the identity of individual objects). Finally, higherlevel processing involves “makin g sense”of an ensemble of recognized objects, as in image analysis, and, at the far end of the continuum, performing the cognitive functions normally associated with vision.Based on the preceding comments, we see that a logical place of overlap between image processing and image analysis is the area of recognition of individual regions or objects in an image. Thus, what we call in this book digital image processing encompasses processes whose inputs and outputs are images and, in addition, encompasses processes that extract attributes from images, up to and including the recognition of individual objects. As a simple illustration to clarify these concepts, consider the area of automated analysis of text. The processes of acquiring an image of the area containing the text, preprocessing that image, extracting (segmenting) the individual characters, describing the characters in a form suitable for computer processing, and recognizing those individual characters are in the scope of what we call digital image processing in this book. Making sense of the content of the page may be viewed as being in the domain of image analysis and even computer vision, depending on the level of complexity implied by the statement “making sense.”As will become evident shortly, digital image processing, as we have defined it, is used successfully in a broad range of areas of exceptional social and economic value.The areas of application of digital image processing are so varied that some form of organization is desirable in attempting to capture the breadth of this field. One of the simplest ways to develop a basic understanding of the extent of image processing applications is to categorize images according to their source (e.g., visual, X-ray, and so on). The principal energy source for images in use today is the electromagnetic energy spectrum. Other important sources of energy include acoustic, ultrasonic, and electronic (in the form of electron beams used in electron microscopy). Synthetic images, used for modeling and visualization, are generated by computer. In this section we discuss briefly how images are generated in these various categories and the areas in which they are applied.Images based on radiation from the EM spectrum are the most familiar, es- pecially images in the X-ray and visual bands of the spectrum. Electromagnet- ic waves can be conceptualized as propagating sinusoidal waves of varyingwavelengths, or they can be thought of as a stream of massless particles, each traveling in a wavelike pattern and moving at the speed of light. Each massless particle contains a certain amount (or bundle) of energy. Each bundle of energy is called a photon. If spectral bands are grouped according to energy per photon, we obtain the spectrum shown in fig. below, ranging from gamma rays (highest energy) at one end to radio waves (lowest energy) at the other. The bands are shown shaded to convey the fact that bands of the EM spectrum are not distinct but rather transition smoothly from one to the other.Image acquisition is the first process. Note that acquisition could be as simple as being given an image that is already in digital form. Generally, the image acquisition stage involves preprocessing, such as scaling.Image enhancement is among the simplest and most appealing areas of digital image processing. Basically, the idea behind enhancement techniques is to bring out detail that is obscured, or simply to highlight certain features of interest in an image. A familiar example of enhancement is when we increase the contrast of an image because “it looks better.” It is important to keep in mind that enhancement is a very subjective area of image processing. Image restoration is an area that also deals with improving the appearance of an image. However, unlike enhancement, which is subjective, image restoration is objective, in the sense that restoration techniques tend to be based on mathematical or probabilistic models of image degradation. Enhancement, on the other hand, is based on human subjective preferences regarding what constitutes a “good”enhancement result.Color image processing is an area that has been gaining in importance because of the significant increase in the use of digital images over the Internet. It covers a number of fundamental concepts in color models and basic color processing in a digital domain. Color is used also in later chapters as the basis for extracting features of interest in an image.Wavelets are the foundation for representing images in various degrees of resolution. In particular, this material is used in this book for image data compression and for pyramidal representation, in which images are subdivided successively into smaller regions.Compression, as the name implies, deals with techniques for reducing the storage required to save an image, or the bandwidth required to transmi it.Although storage technology has improved significantly over the past decade, the same cannot be said for transmission capacity. This is true particularly in uses of the Internet, which are characterized by significant pictorial content. Image compression is familiar (perhaps inadvertently) to most users of computers in the form of image file extensions, such as the jpg file extension used in the JPEG (Joint Photographic Experts Group) image compression standard.Morphological processing deals with tools for extracting image components that are useful in the representation and description of shape. The material in this chapter begins a transition from processes that output images to processes that output image attributes.Segmentation procedures partition an image into its constituent parts or objects. In general, autonomous segmentation is one of the most difficult tasks in digital image processing. A rugged segmentation procedure brings the process a long way toward successful solution of imaging problems that require objects to be identified individually. On the other hand, weak or erratic segmentation algorithms almost always guarantee eventual failure. In general, the more accurate the segmentation, the more likely recognition is to succeed.Representation and description almost always follow the output of a segmentation stage, which usually is raw pixel data, constituting either the bound- ary of a region (i.e., the set of pixels separating one image region from another) or all the points in the region itself. In either case, converting the data to a form suitable for computer processing is necessary. The first decision that must be made is whether the data should be represented as a boundary or as a complete region. Boundary representation is appropriate when the focus is on external shape characteristics, such as corners and inflections. Regional representation is appropriate when the focus is on internal properties, such as texture or skeletal shape. In some applications, these representations complement each other. Choosing a representation isonly part of the solution for trans- forming raw data into a form suitable for subsequent computer processing. A method must also be specified for describing the data so that features of interest are highlighted. Description, also called feature selection, deals with extracting attributes that result in some quantitative information of interest or are basic for differentiating one class of objects from another.Recognition is the process that assigns a label (e.g., “vehicle”) to an object based on its descriptors. As detailed before, we conclude our coverage of digital image processing with the development of methods for recognition of individual objects.So far we have said nothing about the need for prior knowledge or about the interaction between the knowledge base and the processing modules in Fig2 above. Knowledge about a problem domain is coded into an image processing system in the form of a knowledge database. This knowledge may be as sim- ple as detailing regions of an image where the information of interest is known to be located, thus limiting the search that has to be conducted in seeking that information. The knowledge base also can be quite complex, such as an interrelated list of all major possible defects in a materials inspection problem or an image database containing high-resolution satellite images of a region in con- nection with change-detection applications. In addition to guiding the operation of each processing module, the knowledge base also controls the interaction between modules. This distinction is made in Fig2 above by the use of double-headed arrows between the processing modules and the knowledge base, as op- posed to single-headed arrows linking the processing modules.Edge detectionEdge detection is a terminology in image processing and computer vision, particularly in the areas of feature detection and feature extraction, to refer to algorithms which aim at identifying points in a digital image at which the image brightness changes sharply or more formally has discontinuities.Although point and line detection certainly are important in any discussion on segmentation,edge dectection is by far the most common approach for detecting meaningful discounties in gray level.Although certain literature has considered the detection of ideal step edges, the edges obtained from natural images are usually not at all ideal step edges. Instead they are normally affected by one or several of the following effects:1.focal blur caused by a finite depth-of-field and finite point spread function; 2.penumbral blur caused by shadows created by light sources of non-zero radius; 3.shading at a smooth object edge; 4.local specularities or interreflections in the vicinity of object edges.A typical edge might for instance be the border between a block of red color and a block of yellow. In contrast a line (as can be extracted by a ridge detector) can be a small number of pixels of a different color on an otherwise unchanging background. For a line, there may therefore usually be one edge on each side of the line.To illustrate why edge detection is not a trivial task, let us consider the problem of detecting edges in the following one-dimensional signal. Here, we may intuitively say that there should be an edge between the 4th and 5th pixels.5 76 4 152 148 149If the intensity difference were smaller between the 4th and the 5th pixels and if the intensity differences between the adjacent neighbouring pixels were higher, it would not be as easy to say that there should be an edge in the corresponding region. Moreover, one could argue that this case is one in which there are several edges.Hence, to firmly state a specific threshold on how large the intensity change between two neighbouring pixels must be for us to say that there should be an edge between these pixels is not always a simple problem. Indeed, this is one of the reasons why edge detection may be a non-trivial problem unless the objects in the scene are particularly simple and the illumination conditions can be well controlled.There are many methods for edge detection, but most of them can be grouped into two categories,search-based and zero-crossing based. The search-based methods detect edges by first computing a measure of edge strength, usually a first-order derivative expression such as the gradient magnitude, and then searching for local directional maxima of the gradient magnitude using a computed estimate of the local orientation of the edge, usually the gradient direction. The zero-crossing based methods search for zero crossings in a second-order derivative expression computed from the image in order to find edges, usually the zero-crossings of the Laplacian or the zero-crossings of a non-linear differential expression, as will be described in the section on differential edge detection following below. As a pre-processing step to edge detection, a smoothing stage, typically Gaussian smoothing, is almost always applied (see also noise reduction).The edge detection methods that have been published mainly differ in the types of smoothing filters that are applied and the way the measures of edge strength are computed. As many edge detection methods rely on the computation of image gradients, they also differ in the types of filters used for computing gradient estimates in the x- and y-directions.Once we have computed a measure of edge strength (typically the gradient magnitude), the next stage is to apply a threshold, to decide whether edges are present or not at an image point. The lower the threshold, the more edges will be detected, and the result will be increasingly susceptible to noise, and also to picking out irrelevant features from the image. Conversely a high threshold may miss subtle edges, or result in fragmented edges.If the edge thresholding is applied to just the gradient magnitude image, the resulting edges will in general be thick and some type of edge thinning post-processing is necessary. For edges detected with non-maximum suppression however, the edge curves are thin by definition and the edge pixels can be linked into edge polygon by an edge linking (edge tracking) procedure. On a discrete grid, the non-maximum suppression stage can be implemented by estimating the gradient direction using first-order derivatives, then rounding off the gradient direction to multiples of 45 degrees, and finally comparing the values of the gradient magnitude in the estimated gradient direction.A commonly used approach to handle the problem of appropriate thresholds forthresholding is by using thresholding with hysteresis. This method uses multiple thresholds to find edges. We begin by using the upper threshold to find the start of an edge. Once we have a start point, we then trace the path of the edge through the image pixel by pixel, marking an edge whenever we are above the lower threshold. We stop marking our edge only when the value falls below our lower threshold. This approach makes the assumption that edges are likely to be in continuous curves, and allows us to follow a faint section of an edge we have previously seen, without meaning that every noisy pixel in the image is marked down as an edge. Still, however, we have the problem of choosing appropriate thresholding parameters, and suitable thresholding values may vary over the image.Some edge-detection operators are instead based upon second-order derivatives of the intensity. This essentially captures the rate of change in the intensity gradient. Thus, in the ideal continuous case, detection of zero-crossings in the second derivative captures local maxima in the gradient.We can come to a conclusion that,to be classified as a meaningful edge point,the transition in gray level associated with that point has to be significantly stronger than the background at that point.Since we are dealing with local computations,the method of choice to determine whether a value is “significant” or not id to use a threshold.Thus we define a point in an image as being as being an edge point if its two-dimensional first-order derivative is greater than a specified criterion of connectedness is by definition an edge.The term edge segment generally is used if the edge is short in relation to the dimensions of the image.A key problem in segmentation is to assemble edge segments into longer edges.An alternate definition if we elect to use the second-derivative is simply to define the edge ponits in an image as the zero crossings of its second derivative.The definition of an edge in this case is the same as above.It is important to note that these definitions do not guarantee success in finding edge in an image.They simply give us a formalism to look for them.First-order derivatives in an image are computed using the gradient.Second-order derivatives are obtained using the Laplacian.中文对照数字图像处理与边缘检测数字图像处理数字图像处理方法的研究源于两个主要应用领域:其一是为了便于人们分析而对图像信息进行改进:其二是为使机器自动理解而对图像数据进行存储、传输及显示。

相关文档
最新文档