Image Interpolation Using Classification-based Neural Networks
Image sequence fusion using a shift invariant wavelet transform

The development of new imaging sensors creates the need for a meaningful combination of all employed sensors, a problem addressed by image fusion. In this paper, we focus on so-called pixel-level image fusion, i.e. building a composite image from several spatially registered input images or image sequences. A possible application is the fusion of infrared and visible-light image sequences in an airborne sensor platform to aid pilot navigation in poor weather conditions or darkness.
Fig. 1: Shift dependency of the multiresolution fusion methods: root mean square error (RMSE, 0-20) plotted against the shift in [pixels] (0-32); see text for details. The dotted line indicates the shift error when the Haar wavelet is used.

Effective wavelet-based compression method with adaptive quantization threshold and zerotree coding

Artur Przelaskowski, Marian Kazubek, Tomasz Jamrógiewicz
Institute of Radioelectronics, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warszawa, Poland

ABSTRACT

An efficient image compression technique, intended especially for medical applications, is presented. Dyadic wavelet decomposition using the Antonini and Villasenor filter banks is followed by adaptive space-frequency quantization and zerotree-based entropy coding of the wavelet coefficients. Threshold selection and uniform quantization are based on a spatial variance estimate built from the lowest-frequency subband data. The threshold value for each coefficient is evaluated as a linear function of a 9th-order binary context. After quantization, zerotree construction, pruning and arithmetic coding are applied for efficient lossless coding of the data. The presented compression method is less complex than the most effective EZW-based techniques but achieves comparable compression efficiency. Specifically, our method matches the efficiency of SPIHT for MR image compression, is slightly better for the CT image and significantly better for US image compression. Thus the compression efficiency of the presented method is competitive with the best algorithms published in the literature across diverse classes of medical images.

Keywords: wavelet transform, image compression, medical image archiving, adaptive quantization

1. INTRODUCTION

Lossy image compression techniques significantly reduce the length of the original image representation at the cost of certain changes to the original data. At lower bit rates these changes are mostly perceived as distortion, but sometimes an improvement in image quality is visible. What is required is compression of the specific image that preserves all of its important features while removing the noise and the redundancy of the original representation.
The choice of a proper compression method depends on many factors, especially on the statistical characteristics of the image (global and local) and on the application. Medical applications are particularly challenging because of strict demands on preserving image quality (in the sense of diagnostic accuracy). Perfect reconstruction of very small structures, which are often crucial for diagnosis, is possible even at low bit rates by increasing the adaptability of the algorithm. Fitting the data processing method to the changing behaviour of the data within an image, and taking a priori data knowledge into account, allows sufficient compression efficiency to be achieved. Recent achievements clearly show that wavelet-based techniques currently realise these ideas best.

Wavelet transform features are useful for better representation of real nonstationary signals and allow a priori and a posteriori data knowledge to be used for preserving diagnostically important image elements. Wavelets are very efficient for image compression as a complete set of transformation basis functions. This transformation gives a level of data decorrelation similar to the very popular discrete cosine transform and has additional important features. It often provides a more natural basis set than the sinusoids of Fourier analysis, enables a wider set of solutions for constructing effective adaptive scalar or vector quantization in the time-frequency domain together with correlated entropy coding techniques, does not create blocking artefacts, and is well suited to hardware implementation. Wavelet-based compression is naturally multiresolution and scalable across applications, so that a single decomposition provides reconstruction at a variety of sizes and resolutions (limited by the compressed representation) as well as progressive coding and transmission in multiuser environments. Wavelet decomposition can be implemented in terms of filters and realised as a subband coding approach.
The fundamental issue in the construction of efficient subband coding techniques is to select, design or modify the analysis and synthesis filters.1 Wavelets are a good tool for creating a wide class of new filters which prove very effective in compression schemes. The choice of a suitable wavelet family, with criteria such as regularity, linearity, symmetry, orthogonality, or the impulse and step response of the corresponding filter bank, can significantly improve compression efficiency. For compactly supported wavelets the corresponding filter length is proportional to the degree of smoothness and regularity of the wavelet. But when the wavelets are orthogonal (giving the greatest data decorrelation), the associated FIR filters have non-linear phase. Symmetry, compact support and linear phase of the filters may be achieved by applying biorthogonal wavelet bases; quadrature mirror and perfect reconstruction subband filters are then used to compute the wavelet transform. Biorthogonal wavelet-based filters have proved very efficient in compression algorithms. Constructing a wavelet transformation by fitting locally defined basis functions (or finite-length filters) to the characteristics of the image data is possible but very difficult. Because of the nonstationarity of image data, the miscellaneous image features that may be important for good reconstruction, and the significantly varying image quality (signal-to-noise level, spatial resolution, etc.) from different imaging systems, it is very difficult to devise a construction method for filters optimal for compression. Many issues relating to the choice of the most efficient filter bank for image compression remain unresolved.2 The demands of preserving diagnostic accuracy in reconstructed medical images are exacting. Important high-frequency coefficients, which appear at the edges of small structures in CT and MR images, should be preserved.
Accurate reconstruction of global organ shapes in US images and strong noise reduction in nuclear medicine images are also required. It is hard to imagine that one filter bank can do all of this best; rather, choosing the best wavelet family for each modality is to be expected. Our aim is to increase image compression efficiency, especially for medical applications, by applying a suitable wavelet transformation, an adaptive quantization scheme and corresponding entropy coding of the processed decomposition tree. We want to achieve higher acceptable compression ratios for medical images by better preserving the diagnostic accuracy of the images. Many bit allocation techniques applied in quantization schemes are based on assumptions about the data distribution, the quantiser distortion function, etc. Statistical assumptions built on global data characteristics do not exactly capture local data behaviour, and important details of the original image, e.g. small textured areas, may be lost. We therefore decided to build the quantization scheme on local data characteristics, such as the direct two-dimensional data context mentioned earlier. We estimate the data variance from the real data, as a spatial estimate for corresponding coefficient positions in successive subbands. The details of the quantization process and the correlated coding technique, as parts of an effective, simple wavelet-based compression method that achieves high reconstructed image quality at low bit rates, are presented below.

2. THE COMPRESSION TECHNIQUE

The scheme of our algorithm is very simple: a dyadic, 3-level decomposition of the original image (256×256 images were used) performed by the selected filters. For symmetrical filters a symmetric boundary extension at the image borders was used, and for asymmetrical filters a periodic (circular) boundary extension.

Figure 1. Dyadic wavelet image decomposition scheme; the legend marks horizontal relations and parent-children relations.
LL denotes the lowest-frequency subband.

Our approach to filters is a utilitarian one: we use the literature to select proper filters rather than design them. We conducted an experiment using different kinds of wavelet transformation in the presented algorithm. A long list of wavelet families and corresponding filters was tested: Daubechies, Adelson, Brislawn, Odegard, Villasenor, Spline, Antonini, Coiflet, Symmlet, Beylkin, Vaid etc.3 Generally the Antonini4 filters proved the most efficient; the Villasenor, Odegard and Brislawn filters achieve similar compression efficiency. Finally, Antonini 7/9-tap filters are used for MR and US image compression and Villasenor 18/10-tap filters for CT image compression.

2.1 Adaptive space-frequency quantization

The presented space-frequency quantization technique is realised as an initial data pre-selection, threshold selection, and scalar uniform quantization with a step size conditioned by the chosen compression ratio. For adaptive estimation of the threshold and quantization step values, two extra data structures are built. The initial data pre-selection evaluates the set of zero-quantized data and predicts the spatial context of each coefficient. Next, a simple quantization of the lowest-frequency subband (LL) provides a prediction of the quantized coefficient variance as a spatial function across the successive subbands. The value of the quantization step is then slightly modified by a model built on this variance estimate. Additionally, the set of coefficients is reduced by threshold selection: the threshold value is increased in areas dominated by zero-valued coefficients, and the amount of growth depends on the coefficient's spatial position according to the variance estimation function.

First, zero-quantized data prediction is performed. The step size w is assumed constant for all coefficients at each decomposition level. For such a quantization model the threshold value equals w/2.
Each coefficient whose magnitude is less than the threshold is predicted to be zero-valued after quantization (insignificant); otherwise the coefficient is predicted to be non-zero (significant). This creates a predictive map P of zero-quantized coefficients, used for threshold evaluation in the next step. The P map is created as follows:

if |c_i| < w/2 then p_i = 0, else p_i = 1,   (1)

where i = 1, 2, ..., m·n (m, n are the horizontal and vertical image sizes) and c_i is the wavelet coefficient value.

The coefficient variance estimation is made on the basis of the LL data, for coefficients of the subsequent subbands in corresponding spatial positions. Quantization with the step size w is performed in LL and the most frequently occurring coefficient value is estimated; this value is called the MHC (mode of histogram coefficient). The areas where the MHC appears are strongly correlated with the zero-valued data areas in the successive subbands. The absolute difference between the quantized LL data and the MHC is used as the variance estimate for the coefficients of the subsequent subbands in corresponding spatial positions. We tested many different schemes, but this model gives the best results in terms of final compression efficiency. The variance estimate is rather coarse, but this simple adaptive model built on real data needs no additional side information for reconstruction and increases the compression efficiency. Let lc_i, i = 1, 2, ..., lm, be the set of quantized LL coefficient values, with lm the size of this set. The mode of histogram coefficient MHC is estimated as

f(MHC) = max_{lc_i ∈ Al} f(lc_i), MHC ∈ Al,   (2)

where Al is the alphabet of the data source describing the coefficient values, f(lc_i) = n_{lc_i} / lm, and n_{lc_i} is the number of lc_i-valued coefficients. The normalised variance estimates ve_si for the coefficients of the subsequent subbands in spatial positions corresponding to i (parent-children relations from the top to the bottom of the zerotree; see fig.
1) are simply expressed by the following equation:

ve_si = |lc_i − MHC| / ve_max,   (3)

where ve_max is the maximum of the absolute differences. This set of ve_si values is treated as the top-level parent estimate and is applied to all corresponding child nodes in the hierarchical wavelet decomposition tree.

A 9th-order context model is applied for coarser data reduction in 'unimportant' areas (usually of low diagnostic importance). Unimportance means that in these areas the majority of the data are equal to zero and significant values are isolated. If isolated significant values appear in such areas, these high-frequency coefficients are most often caused by noise; the coarser data reduction performed by a higher threshold therefore increases the signal-to-noise ratio by removing noise. At the edges of diagnostically important structures, significant values are grouped together, and the threshold value is lower in those regions. The P map is used to estimate the context of each coefficient. A noncausal prediction of the coefficient's importance is made as a linear function of the surrounding binary data, excluding the significance of the coefficient under consideration. Polynomial, exponential and hyperbolic functions were also tested, but the linear function proved the most efficient. The data context shown in fig. 2 is formed for each coefficient. At previously processed points of the data stream, this context is modified by the results of selection with the actual threshold values at those points instead of w/2 (causal modification). The coefficient importance value cim_i is evaluated for each coefficient c_i as

cim_i = coeff_1 · Σ_{j=1..9} (1 − p_{i,j}),   (4)

where i = 1, 2, ..., m·n and p_{i,j} is the j-th element of the binary context of c_i. Next, the threshold value is evaluated for each coefficient c_i:

th_i = w/2 · (1 + cim_i · (1 − ve_si)),   (5)

where i = 1, 2, ..., m·n and si is the LL parent spatial location corresponding to i at the lower decomposition levels. The modified quantization step model uses the LL-based variance estimate to slightly increase the step size for coefficients of lower variance.
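The prediction and estimation steps of eqs. (1)-(3) can be sketched in a few lines of pure Python; the step size and all coefficient values below are made up for illustration, not taken from the paper's data:

```python
from collections import Counter

w = 2.0  # illustrative quantization step size

def p_map(coeffs):
    # eq. (1): predict which coefficients quantize to zero (p = 0)
    return [0 if abs(c) < w / 2 else 1 for c in coeffs]

def mhc(ll_quantized):
    # eq. (2): mode of the histogram of quantized LL coefficients
    return Counter(ll_quantized).most_common(1)[0][0]

def variance_estimates(ll_quantized):
    # eq. (3): |lc_i - MHC| normalised by its maximum
    m = mhc(ll_quantized)
    diffs = [abs(lc - m) for lc in ll_quantized]
    ve_max = max(diffs) or 1.0
    return [d / ve_max for d in diffs]

ll = [0, 0, 1, 0, 3, 0, 0, 2]          # toy quantized LL subband
print(p_map([0.3, -1.7, 0.9, 2.4]))    # → [0, 1, 0, 1]
print(mhc(ll))                         # → 0
print(variance_estimates(ll))
```

The `ve` values are largest where the quantized LL value differs most from the MHC, i.e. where local activity is high.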
Threshold data selection and uniform quantization are made as follows: each coefficient value is first compared with its threshold value and then quantized, using the step w for LL and the modified step value mw_si for the subsequent subbands. Threshold selection and quantization of each coefficient c_i can be described by the following equations:

if c_i ∈ LL then ĉ_i = c_i / w,
else if |c_i| < th_i then ĉ_i = 0, else ĉ_i = c_i / mw_si,   (6)

where

mw_si = w · (1 + coeff_2 · (1 − ve_si)).   (7)

The values coeff_1 and coeff_2 are fitted to the actual data characteristics by using a priori image knowledge and by performing extensive tests on groups of images with similar characteristics.

Figure 2. a) The 9th-order coefficient context used to evaluate the coefficient importance value in the adaptive threshold procedure; b) the P-map context of a single edge coefficient.

2.2 Zerotree construction and coding

Sophisticated entropy coding methods that can significantly improve compression efficiency should retain a progressive mode of data reconstruction. Progressive reconstruction is simple and natural after wavelet-based decomposition. Thus the wavelet coefficient values are coded subband-sequentially, and spectral selection is made in the way typical of wavelet methods. Subbands of the same scale are coded as follows: first the lowest-frequency subband, then the right-side coefficient block, then the down-left block, and the down-right block at the end. After that, the data blocks of the next larger scale are coded in the same order. To reduce the redundancy of this data representation a zerotree structure is built. The zerotree describes well the correlation between data values in the horizontal and vertical directions, especially between large areas of zero-valued data. These correlated fragments of the zerotree are removed, and the final data streams for entropy coding are significantly shortened. The zerotree structure also allows data streams of different characteristics to be created, increasing the coding efficiency.
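The adaptive threshold and modified-step quantization of eqs. (4)-(7) can be sketched as below, assuming the binary context and the variance estimate ve are already available; coeff_1, coeff_2 and all numeric values are illustrative choices, not the fitted constants of the paper:

```python
w, coeff1, coeff2 = 2.0, 0.05, 0.1  # illustrative constants

def importance(context):
    # eq. (4): cim from the 9-element binary context of a coefficient
    return coeff1 * sum(1 - p for p in context)

def threshold(cim, ve):
    # eq. (5): threshold is raised where zeros dominate and variance is low
    return (w / 2) * (1 + cim * (1 - ve))

def quantize(c, cim, ve):
    # eqs. (6)-(7): threshold selection, then modified-step quantization
    if abs(c) < threshold(cim, ve):
        return 0
    mw = w * (1 + coeff2 * (1 - ve))
    return round(c / mw)

flat_ctx = (0,) * 9   # all-insignificant neighbourhood: threshold is raised
edge_ctx = (1,) * 9   # significant neighbours: threshold stays at w/2
print(quantize(1.3, importance(flat_ctx), 0.0))  # → 0 (suppressed as noise)
print(quantize(1.3, importance(edge_ctx), 0.0))  # → 1 (kept at an edge)
```

The same coefficient value is discarded in a flat, zero-dominated region but survives when its neighbours are significant, which is the behaviour the text describes for edges of diagnostically important structures.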
We used simple arithmetic coders on these data streams, instead of the bit-plane (MSB to LSB) coding applied in many techniques, which requires the construction of an efficient context model. By giving up successive approximation we lose full progression, but we gain simplicity of the algorithm and sometimes even higher coding efficiency. Two slightly different arithmetic coders were used to produce the final data stream.

2.2.1 Construction and pruning of the zerotree

The dyadic hierarchical decomposition of image data is presented in fig. 1. The decomposition tree structure reflects this hierarchical data processing and corresponds strictly to the data streams created in the transformation process. The four lowest-frequency subbands, which belong to the coarsest scale level, are located at the top of the tree. These data have no parent values, but they are the parents of the coefficients at corresponding spatial positions in the lower tree level of greater scale. This correspondence is shown in fig. 1 as parent-children relations. Each parent coefficient has four direct children, and each child is under one direct parent. Additionally, horizontal relations at the top tree level are introduced to describe the data correlation better.

The decomposition tree becomes a zerotree when the node values of the quantized coefficients are labelled with symbols of a binary alphabet. Each tree node is checked for being significant (not equal to zero) or insignificant (equal to zero), and a binary tree is built. For LL nodes the significance test is slightly different. The MHC value is used again, because the areas of MHC appearance in LL are strongly correlated with the zero-valued data areas of the subsequent subbands: a node is labelled significant if its value differs from the MHC value, and insignificant if its value equals the MHC. The MHC value must be sent to the decoder for correct tree reconstruction.

The next step of the algorithm is the pruning of this tree.
Only branches leading to insignificant nodes can be pruned, and the procedure differs slightly at different levels of the zerotree. The pruning procedure starts at the bottom of the wavelet zerotree. Successive groups of four child values and their parent from the higher level are tested: if the parent and the children are all insignificant, the branch with the child nodes is removed and the parent is labelled a pruned branch node (PBN). The tree alphabet is thereby widened to three symbols. At the middle levels, the tree is pruned if the parent value is insignificant and all children are recognised as PBN. From our experiments we found that adding further symbols to the tree alphabet does not help decrease the code bit rate. The zerotree pruning at the top level is different: node values are checked in the horizontal tree direction, exploiting the spatial correlation of the quantized coefficients in the subbands of the coarsest scale (see fig. 1). The four coefficients at the same spatial position in the different subbands are compared with one another; the tree is pruned if the LL node is insignificant and the three corresponding coefficients are PBN. The three branches with their nodes are then removed and the LL node is labelled PBN, meaning that all of its children across the zerotree are insignificant. The horizontal spatial correlation between the data at the other tree levels is not strong enough for its use to increase the coding efficiency.

2.2.2 Forming three data streams and coding

The pruned zerotree structure is convenient for creating the data streams for the final, efficient entropy coding. In place of PBN, zero, or MHC values (for nodes of LL), an additional code value is inserted into the set of coded values. Bit maps of the PBN spatial distribution at different tree levels can also be applied; we optionally used only the PBN bit map of the LL data, to slightly increase the coding efficiency.
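The bottom-level pruning rule above (an insignificant parent whose four children are all insignificant becomes a PBN) can be sketched as follows; the list-based tree representation and symbol names are illustrative assumptions, not the paper's data structures:

```python
INSIG, SIG, PBN = 0, 1, 2  # three-symbol tree alphabet after pruning

def prune_bottom(parents, children):
    # parents: significance symbols of the parent nodes (INSIG/SIG)
    # children: one 4-tuple of child symbols per parent
    out = []
    for p, kids in zip(parents, children):
        if p == INSIG and all(k == INSIG for k in kids):
            out.append((PBN, None))   # branch removed, parent marked PBN
        else:
            out.append((p, kids))     # branch kept unchanged
    return out

parents  = [INSIG, INSIG, SIG]
children = [(INSIG,) * 4, (INSIG, SIG, INSIG, INSIG), (INSIG,) * 4]
pruned = prune_bottom(parents, children)
print(pruned)
# → [(2, None), (0, (0, 1, 0, 0)), (1, (0, 0, 0, 0))]
```

Only the first group is pruned: the second has one significant child, and the third has a significant parent, so both branches must be kept.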
The zerotree coding is performed sequentially from the top to the bottom, to support progressive reconstruction. Because of the varying characteristics of the quantized data and the wider alphabet of the data source model after zerotree pruning, three separate data streams, and optionally a fourth bit-map stream, are produced for efficient data coding. It is well known from information theory that when a data set exhibits significant variability of data statistics, and the data can be grouped by differing statistics (alphabet and estimated conditional probabilities), it is better to separate the groups and encode each independently to increase the coding efficiency. This is especially true when a context-based arithmetic coder is used. The data separation is made on the basis of the zerotree, and then the following data are coded independently:

- the LL data set, which usually has a smaller number of insignificant (MHC-valued) coefficients, fewer PBNs and less spatial data correlation than the subsequent subband data (a word- or character-wise arithmetic coder is less efficient than a bitwise coder); optionally this data stream is divided into a PBN-distribution bit map and a word or character data set without PBNs,
- the rest of the top level (the three next subbands) and the middle-level subband data set, with a considerable number of zero-valued (insignificant) coefficients and PBN code values; the level of data correlation is greater, so a word- or character-wise arithmetic coder is efficient enough,
- the lowest-level data set, usually with a great number of insignificant coefficients and without the PBN code value; the data correlation is very high.

The Urban Koistinen arithmetic coder (DDJ Compression Contest public-domain code, accessible via the internet), with a simple bitwise algorithm, is used for coding the first data stream. For the second and third data streams a 1st-order arithmetic coder built on the code presented in Nelson's book 5 is applied. The Urban coder proved up to 10% more efficient than the Nelson coder for coding the first data stream.
Combining the rest of the top-level data with the middle-level data of similar statistics increases the coding efficiency by up to approximately 3%. The procedure of zerotree construction, pruning and coding is presented in fig. 3.

Figure 3. Coding scheme for the quantized wavelet coefficients using the zerotree structure (construction of the binary zerotree, bitwise arithmetic coding, final compressed data representation). PBN - pruned branch node.

3. TESTS, RESULTS AND DISCUSSION

Many medical images of different modalities were used in our tests. For the presentation of selected results we use three 256×256×8-bit images from various medical imaging systems: CT (computed tomography), MR (magnetic resonance) and US (ultrasound) images, shown in fig. 4. Mean square error (MSE) and peak signal-to-noise ratio (PSNR) were taken as the criteria for evaluating reconstructed image quality. Subjective quality assessment was conducted in a very simple way, only as the psychovisual impression of a non-professional observer.

The adaptive quantization scheme based on the modified threshold value and quantization step size is up to 10% more efficient than simple uniform scalar quantization, in the sense of the better overall compression of the whole algorithm. Applying the zerotree structure and its processing improved the coding efficiency by up to 10% in comparison with direct arithmetic coding of the quantized data set.

To evaluate the efficiency of the presented compression technique, called MBWT (modified basic wavelet-based technique), its compression efficiency was compared with two other methods: a DCT-based algorithm 6,7 and SPIHT 8. The results of the MSE- and PSNR-based evaluation are presented in table 1. The two wavelet-based compression techniques are clearly more efficient than the DCT-based compression in terms of MSE/PSNR, and also in our subjective evaluation, for all cases.
MBWT outperforms SPIHT for US images, and slightly for the CT test image at the lower bit-rate range. The concept of the adaptive threshold and modified quantization step size is effective for strong noise reduction, but at the lower bit-rate range it sometimes proves too coarse and very small details of the image structures are deformed. US images contain a significant noise level, and diagnostically important small structures do not appear (the image resolution is poor); these images can therefore be compressed efficiently by MBWT with the image quality preserved, as clearly shown in fig. 5. The improvement in compression efficiency relative to SPIHT is almost constant over a wide range of bit rates (0.3 - 0.6 dB of PSNR).

Figure 4. Examples of the images used in the tests of compression efficiency evaluation; the results presented in table 1 and fig. 5 were achieved for these images: a) echocardiography image, b) CT head image, c) MR head image.

Table 1. Comparison of the compression efficiency of the three techniques: DCT-based, SPIHT and MBWT. The bit rates are chosen in the diagnostically interesting range (near the borders of acceptance).

Modality - bit rate    DCT-based         SPIHT             MBWT
                       MSE   PSNR[dB]    MSE   PSNR[dB]    MSE   PSNR[dB]
MRI - 0.70 bpp         8.93  38.62       4.65  41.45       4.75  41.36
MRI - 0.50 bpp         13.8  36.72       8.00  39.10       7.96  39.12
CT  - 0.50 bpp         6.41  40.06       3.17  43.12       3.18  43.11
CT  - 0.30 bpp         18.5  35.46       8.30  38.94       8.06  39.07
US  - 0.40 bpp         54.5  30.08       31.3  33.18       28.3  33.61
US  - 0.25 bpp         91.5  28.61       51.5  31.01       46.8  31.43

The level of noise in CT and MR images is lower, and small structures are often important in the image analysis; that is why the benefits of MBWT are smaller in this case. Generally the compression efficiency of MBWT is comparable to SPIHT for these images. The presented method loses its effectiveness at higher bit rates (see the PSNR of the 0.7 bpp MR representation), but at lower bit rates both MR and CT images are compressed significantly better.
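As a sanity check on Table 1, MSE and PSNR for 8-bit images are related by PSNR = 10·log10(255²/MSE). A minimal pure-Python sketch (the tiny image pair is illustrative, not from the test set):

```python
import math

def mse(a, b):
    # mean square error between two equally sized 8-bit images (flat lists)
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def psnr(mse_value, peak=255.0):
    # peak signal-to-noise ratio in dB for a given MSE
    return 10.0 * math.log10(peak * peak / mse_value)

orig = [10, 200, 50, 90]   # tiny illustrative "image" pair
rec  = [12, 198, 47, 95]
print(round(psnr(mse(orig, rec)), 2))  # → 37.92

# cross-check against Table 1: MRI at 0.70 bpp, MSE = 8.93 -> PSNR ≈ 38.62 dB
print(round(psnr(8.93), 2))  # → 38.62
```

The cross-check reproduces the table's DCT-based MRI entry, confirming the MSE and PSNR columns are consistent.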
Perhaps the reason is that in the MBWT threshold selection the coefficients are reduced relatively more strongly at the lower bit-rate range because of the reduction of their importance values.

Figure 5. Comparison of the compression efficiency of SPIHT and the technique presented in this paper (MBWT) in the low bit-rate range (0.2 - 0.8 bits/pixel; PSNR in dB). The US test image was compressed.

4. CONCLUSIONS

The adaptive space-frequency quantization scheme and zerotree-based entropy coding are not time-consuming and achieve significant compression efficiency. Generally our algorithm is simpler than the EZW-based algorithms 9 and other algorithms with extended subband classification or space-frequency quantization models 10, but the compression efficiency of the presented method is competitive with the best algorithms published in the literature across diverse classes of medical images. MBWT-based compression gives slightly better results than SPIHT for high-quality images (CT and MR) and significantly better efficiency for US images. The presented compression technique has proved very useful and promising for medical applications. An appropriate evaluation of reconstructed image quality is desirable in order to delimit the acceptable lossy compression ratios for each medical modality. We intend to improve the efficiency of the method by designing a construction method for adaptive filter banks and a correlated, more adequate quantization scheme; this seems possible by applying a proper a priori model of the image features that determine diagnostic accuracy. More efficient context-based arithmetic coders should also be applied, and more sophisticated zerotree structures should be tested.

REFERENCES

1. Hui, C. W. Kok, T. Q. Nguyen, "Image Compression Using Shift-Invariant Dyadic Wavelet Transform", submitted to IEEE Trans. Image Proc., April 3, 1996.
2. J. D. Villasenor, B. Belzer and J. Liao, "Wavelet Filter Evaluation for Image Compression", IEEE Trans. Image Proc., August 1995.
3. A. Przelaskowski, M. Kazubek, T.
Jamrógiewicz, "Optimalization of the Wavelet-Based Algorithm for Increasing the Medical Image Compression Efficiency", submitted and accepted to TFTS'97, 2nd IEEE UK Symposium on Applications of Time-Frequency and Time-Scale Methods, Coventry, UK, 27-29 August 1997.
4. M. Antonini, M. Barlaud, P. Mathieu and I. Daubechies, "Image coding using wavelet transform", IEEE Trans. Image Proc., vol. IP-1, pp. 205-220, April 1992.
5. M. Nelson, The Data Compression Book, chapter 6, M&T Books, 1991.
6. M. Kazubek, A. Przelaskowski and T. Jamrógiewicz, "Using A Priori Information for Improving the Compression of Medical Images", Analysis of Biomedical Signals and Images, vol. 13, pp. 32-34, 1996.
7. A. Przelaskowski, M. Kazubek and T. Jamrógiewicz, "Application of Medical Image Data Characteristics for Constructing DCT-based Compression Algorithm", Medical & Biological Engineering & Computing, vol. 34, Supplement I, part I, pp. 243-244, 1996.
8. A. Said and W. A. Pearlman, "A New Fast and Efficient Image Codec Based on Set Partitioning in Hierarchical Trees", submitted to IEEE Trans. Circ. & Syst. Video Tech., 1996.
9. J. M. Shapiro, "Embedded Image Coding Using Zerotrees of Wavelet Coefficients", IEEE Trans. Signal Proces., vol. 41, no. 12, pp. 3445-3462, December 1993.
10. Z. Xiong, K. Ramchandran and M. T. Orchard, "Space-Frequency Quantization for Wavelet Image Coding", IEEE Trans. Image Proc., to appear in 1997.
Poisson Image Blending (Poisson Matting)

Laplacian-pyramid blending was introduced in an earlier post; today we look at how to use Poisson blending in OpenCV.
Incidentally, Poisson was a student of Laplace.
The principle of Poisson blending is explained in great detail in another blog post, so it is not repeated here.
OpenCV ships with Poisson blending; the API is seamlessClone(), with the function prototype given below. Poisson blending places a src image into dst, in the region covered by a foreground mask centred on the point P in dst.
The blending process changes the colours and gradients of the src image to achieve a seamless blending effect.
One thing to note is the placement of the centre point P: it is best to first compute the bounding rectangle Rect of the foreground mask and take the centre of Rect as P, making sure Rect fits inside dst without crossing the border.
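In real code, cv::boundingRect(mask) computes this rectangle directly; as a language-neutral illustration of the idea, a minimal pure-Python version for a binary mask stored as a list of rows:

```python
def mask_center(mask):
    # mask: list of rows of 0/255 values; returns the centre (x, y)
    # of the bounding rectangle of all non-zero pixels
    xs = [x for row in mask for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        raise ValueError("empty mask")
    x0, x1 = min(xs), max(xs)
    y0, y1 = min(ys), max(ys)
    return ((x0 + x1) // 2, (y0 + y1) // 2)

mask = [
    [0, 0,   0,   0, 0],
    [0, 255, 255, 0, 0],
    [0, 255, 255, 0, 0],
    [0, 0,   0,   0, 0],
]
print(mask_center(mask))  # → (1, 1)
```

The returned point is what would be passed as the centre P to seamlessClone(), after checking that the rectangle shifted to that centre still lies inside dst.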
The results are shown below: src, dst, mask, blend.

Sample code:

#include <opencv2/opencv.hpp>
#include <iostream>
#include <string>

using namespace std;
using namespace cv;

int main()
{
    Mat imgL = imread("data/apple.jpg");
    Mat imgR = imread("data/orange.jpg");

    int imgH = imgR.rows;
    int imgW = imgR.cols;

    // white (255) marks the foreground region taken from imgL
    Mat mask = Mat::zeros(imgL.size(), CV_8UC1);
    mask(Rect(0, 0, imgW / 2, imgH)).setTo(255);
    cv::imshow("mask", mask);

    // centre of the mask region inside the destination image
    Point center(imgW / 4, imgH / 2);

    Mat blendImg;
    seamlessClone(imgL, imgR, mask, center, blendImg, NORMAL_CLONE);

    cv::imshow("blendimg", blendImg);
    waitKey(0);
    return 0;
}
Evaluation of Reconstructed Radio Image Techniques of CLEAN De-convolution Methods (IJIGSP-V10-N10-3)

Published Online October 2018 in MECS (/) DOI: 10.5815/ijigsp.2018.10.03
Evaluation of Reconstructed Radio Images Techniques of CLEAN De-convolution Methods
M.A. Mohamed 1
Communication and Electronics, Faculty of Engineering, Mansoura University, Egypt Email: mazim12@
I_obs(l, m) = I_psf(l, m) * I_sky(l, m)   (1)

I_obs(l, m) = ∬ S(u, v) V(u, v) e^{2πi(ul + vm)} du dv   (2)

V(u, v) = ∬ I(l, m) e^{-2πi(ul + vm)} dl dm   (3)

V_obs(u, v) = S(u, v) V(u, v)   (4)

F^{-1}[V_obs(u, v)] = F^{-1}[S(u, v) V(u, v)],  i.e.  I_obs(l, m) = F^{-1}[S(u, v)] * F^{-1}[V(u, v)]   (5)

where I_sky(l, m) is the true sky brightness distribution, I_obs(l, m) is the observed (dirty) sky distribution, I_psf(l, m) is the point spread function (dirty beam) F^{-1}[S], S(u, v) is the sampling function and V(u, v) is the sky visibility.

I.J. Image, Graphics and Signal Processing, 2018, 10, 31-39
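Relations (1)-(5) are just the convolution theorem: the dirty image is the inverse transform of the sampled visibilities, which equals the dirty beam convolved with the true sky. A minimal 1-D pure-Python check with a toy sky and sampling function (all values hypothetical):

```python
import cmath

def dft(x):
    # forward discrete Fourier transform, eq. (3) analogue
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    # inverse discrete Fourier transform
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def circ_conv(a, b):
    # circular convolution, matching the DFT's periodicity
    N = len(a)
    return [sum(a[m] * b[(n - m) % N] for m in range(N)) for n in range(N)]

sky = [0, 0, 3, 0, 0, 1, 0, 0]   # toy 1-D "sky" brightness
S   = [1, 1, 0, 1, 1, 0, 1, 1]   # sampling function: 1 = measured, 0 = missing

V     = dft(sky)                       # true visibilities, eq. (3)
V_obs = [s * v for s, v in zip(S, V)]  # sampled visibilities, eq. (4)
dirty = idft(V_obs)                    # dirty image, eqs. (2)/(5)
psf   = idft(S)                        # dirty beam F^-1[S]
conv  = circ_conv(psf, sky)            # eq. (1): psf convolved with sky

# the dirty image equals the dirty beam convolved with the true sky
assert max(abs(d - c) for d, c in zip(dirty, conv)) < 1e-9
```

This is exactly the degradation that CLEAN tries to undo: it deconvolves the known dirty beam out of the dirty image.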
ImageNet Classification with Deep Convolutional Neural Networks

ImageNet Classification with Deep Convolutional Neural Networks
Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton

Abstract

We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into 1000 different classes.
On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, considerably better than the previous state of the art.
The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, three fully-connected layers, and a final 1000-way softmax layer. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers, we employed a recently developed regularization method called "dropout" that has proved to be very effective.
We also entered a variant of this model in the ILSVRC-2012 competition and won with a top-5 test error rate of 15.3%, compared with 26.2% for the second-best entry.
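Top-1 counts a prediction as an error when the highest-scoring class differs from the label; top-5 when the label is not among the five highest-scoring classes. A minimal pure-Python sketch with hypothetical scores:

```python
def topk_error(scores, labels, k):
    # scores: one list of per-class scores per sample; labels: true class ids
    wrong = 0
    for s, y in zip(scores, labels):
        topk = sorted(range(len(s)), key=lambda c: s[c], reverse=True)[:k]
        if y not in topk:
            wrong += 1
    return wrong / len(labels)

# two hypothetical 6-class predictions
scores = [[0.1, 0.5, 0.2, 0.05, 0.1, 0.05],   # argmax = class 1
          [0.3, 0.1, 0.25, 0.2, 0.1, 0.05]]   # argmax = class 0
labels = [1, 2]
print(topk_error(scores, labels, 1))  # → 0.5
print(topk_error(scores, labels, 5))  # → 0.0
```

The second sample is a top-1 error (class 0 scored highest) but not a top-5 error, since the true class 2 is still among the five best-scored classes.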
1 Introduction

Current approaches to object recognition make essential use of machine learning methods. To improve their performance, we can collect larger datasets, learn more powerful models, and use better techniques to prevent overfitting. Until recently, datasets of labeled images were relatively small, on the order of tens of thousands of images (e.g., NORB [16], Caltech-101/256 [8, 9], and CIFAR-10/100 [12]). Simple recognition tasks can be solved quite well with datasets of this size, especially when they are augmented with label-preserving transformations. For example, the current best error rate on the MNIST digit-recognition task (<0.3%) approaches human performance [4]. But objects in realistic settings exhibit considerable variability, so learning to recognize them requires much larger training sets. Indeed, the shortcomings of small image datasets have been widely acknowledged (e.g., Pinto et al. [21]), but only recently has it become possible to collect labeled datasets with millions of images.
On Single Image Scale-Up using Sparse-Representations (a paper by Michael Elad's group)

Roman Zeyde, Michael Elad and Matan Protter
…for an additive i.i.d. white Gaussian noise, denoted by v ~ N(0, σ²I). Given z_l, the problem is to find ŷ ∈ R^{N_h} such that ŷ ≈ y_h. Due to the […] We shall assume hereafter that H applies a known low-pass filter to the image, and S performs a decimation by an integer factor s, by discarding rows/columns from the input.
{romanz,elad,matanpr}@cs.technion.ac.il
Abstract. This paper deals with the single image scale-up problem using sparse-representation modeling. The goal is to recover an original image from its blurred and down-scaled noisy version. Since this problem is highly ill-posed, a prior is needed in order to regularize it. The literature offers various ways to address this problem, ranging from simple linear space-invariant interpolation schemes (e.g., bicubic interpolation), to spatially-adaptive and non-linear filters of various sorts. We embark from a recently-proposed successful algorithm by Yang et al. [13,14], and similarly assume a local Sparse-Land model on image patches, serving as regularization. Several important modifications to the above-mentioned solution are introduced, and are shown to lead to improved results. These modifications include a major simplification of the overall process, both in terms of the computational complexity and the algorithm architecture, the use of a different training approach for the dictionary pair, and the ability to operate without a training set by boot-strapping the scale-up task from the given low-resolution image. We demonstrate the results on true images, showing both visual and PSNR improvements.
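The degradation model described above (a blur H, a decimation S by an integer factor s, and additive white Gaussian noise v, i.e. z_l = S H y_h + v) can be sketched in numpy; the function name and the toy kernel are illustrative assumptions, not from the paper:

```python
import numpy as np

def degrade(y_h, kernel, s, sigma, rng):
    """Toy version of the scale-up degradation model z_l = S H y_h + v:
    blur with a known separable low-pass filter (H), decimate by an
    integer factor s by discarding rows/columns (S), add Gaussian noise (v)."""
    blurred = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="same"), 1, y_h)  # H along rows
    blurred = np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="same"), 0, blurred)  # H along cols
    z = blurred[::s, ::s]                               # S: keep every s-th sample
    return z + sigma * rng.standard_normal(z.shape)     # v ~ N(0, sigma^2 I)
```

The scale-up task is the inverse problem: estimating y_h from z_l under this forward model.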
Super-Resolution Reconstruction of Magnetic Resonance Images Based on an Adaptive Dual Dictionary
LIU Zhen-qi, BAO Li-jun, CHEN Zhong
(Department of Electronic Science, Xiamen University, Xiamen 361005, China)
Electro-Optic Technology Application, Vol. 28, No. 4, August 2013 (Signal and Information Processing)

Abstract: To improve the image quality of magnetic resonance imaging, a super-resolution denoising reconstruction method based on an adaptive dual dictionary is proposed. A denoising capability is built into the super-resolution reconstruction process, so that noise is effectively filtered out while the image resolution is improved, combining super-resolution reconstruction and denoising in one framework. The method uses a clustering-PCA algorithm to extract the main image features and construct a principal-feature dictionary, and uses a training procedure to design a self-learning dictionary that represents image detail; together they form an adaptive dual dictionary with good sparsity and adaptivity. Experiments show that, compared with other super-resolution algorithms, the proposed method reconstructs markedly better, improving both peak signal-to-noise ratio and mean structural similarity.
Zhang Bing: Frontiers of Hyperspectral Image Processing and Information Extraction
3 Hyperspectral image processing and information extraction methods
3.1 Noise assessment and data dimensionality reduction
3.1.1 Noise assessment
The diagnostic spectral features of typical ground objects are the prerequisite for target detection and fine classification in hyperspectral remote sensing. However, because the band channels of an imaging spectrometer are densely spaced, the imaging energy per band is limited, so raising the signal-to-noise ratio of a hyperspectral image is harder than for a panchromatic image. During data acquisition, the spectral features of ground objects are easily "distorted" by noise; for example, detecting a given absorption feature requires the noise level to be at least an order of magnitude lower than the absorption depth. Accurate noise estimation is therefore important both for evaluating sensor performance and for supporting subsequent information-extraction algorithms.
…new breakthroughs can be achieved. Research on hyperspectral image processing and information extraction mainly covers data dimensionality reduction, image classification, mixed-pixel decomposition (spectral unmixing), and target detection (Zhang and Gao, 2011). This paper first reviews the key problems of hyperspectral image processing and information extraction along these four directions; then, for each direction, after recalling the classical theories and models, it introduces representative recent results, development trends, and future research hotspots. In addition, advances in high-performance computing have markedly improved the efficiency of data processing and analysis and have been widely and successfully applied to hyperspectral information extraction, so the state of high-performance processing techniques for hyperspectral imagery is also surveyed.
…the base data for thematic mapping, with great application value in land-cover and resource surveys and in environmental monitoring. Hyperspectral image classification mainly faces the Hughes phenomenon (Hughes, 1968) and the curse of dimensionality (Bellman, 2015), as well as nonlinear data distributions in feature space. Moreover, traditional algorithms mostly classify pixel by pixel without considering the spatial-domain features of remote sensing imagery, so they cannot effectively handle the same-object-different-spectrum problem, and many noisy points appear inside homogeneous regions of the classification maps. (4) The fine spectral features provided by hyperspectral imagery can be used to distinguish targets with subtle differences, including targets highly similar to the natural background. Hyperspectral target detection therefore has great application potential and value in public security and national defense. It requires the target to have diagnostic spectral features; in practice, spectral variability of the target, mismatches between the background distribution and model assumptions, and sub-pixel target sizes can lead to excessive false-alarm rates, so new, stable and reliable methods need to be developed. In addition, the purpose of hyperspectral remote sensing observation is to obtain useful target information, not huge volumes of high-dimensional raw data, and traditional image-processing platforms and information-extraction schemes cannot meet the need for fast information retrieval. Although the rapid development of high-performance processors offers a path to parallel fast processing and on-board real-time information extraction for hyperspectral images, a series of key technical problems remain: both parallel and on-board real-time processing require optimizing algorithm architectures and addressing programming issues dictated by the processing hardware, and on-board real-time processing additionally imposes special requirements on hardware power consumption.
Research on Visual Inspection Algorithms for Defects in Textured Objects

Abstract
In fiercely competitive, automated industrial production, machine vision plays a decisive role in product quality control, and its application to defect detection is becoming increasingly common. Compared with conventional inspection techniques, automated visual inspection systems are more economical, faster, more efficient and safer. Textured objects are ubiquitous in industrial production: substrates used for semiconductor assembly and packaging, light-emitting diodes, printed circuit boards in modern electronic systems, and cloth and fabrics in the textile industry can all be regarded as objects with texture features. This thesis focuses on defect detection techniques for textured objects, aiming to provide efficient and reliable algorithms for their automated inspection. Texture is an important feature for describing image content, and texture analysis has been successfully applied to texture segmentation and classification. This work proposes a defect detection algorithm based on texture analysis and reference comparison. The algorithm tolerates image registration errors caused by object deformation and is robust to the influence of texture. It aims to attach rich and meaningful physical attributes to detected defect regions, such as size, shape, brightness contrast and spatial distribution. When a reference image is available, the algorithm can be applied to both homogeneously and inhomogeneously textured objects, and it also performs well on non-textured objects. Throughout the detection process, steerable-pyramid texture analysis and reconstruction are employed. Unlike traditional wavelet texture analysis, a tolerance-control step is added in the wavelet domain to handle object deformation and texture influence, making the method tolerant to deformation and robust to texture. Finally, steerable-pyramid reconstruction ensures that the physical attributes of defect regions are recovered accurately. In the experimental stage, a series of images of practical value were tested, and the results show that the proposed defect detection algorithm for textured objects is efficient and easy to implement.
Keywords: defect detection, texture, object distortion, steerable pyramid, reconstruction
image_interpolation
• When we see a video clip on a PC, we like to see it in the full screen mode
– We want GOOD images
• If some block of an image gets damaged during the transmission, we want to repair it
Image Interpolation
• Introduction
– What is image interpolation?
– Why do we need it?
• Interpolation Techniques
– 1D linear interpolation (elementary algebra)
– 2D = 2 sequential 1D (divide-and-conquer)
– Directional (adaptive) interpolation*
• Interpolation Applications
EE465: Introduction to Digital Image Processing
A Sentimental Comment
• Haven’t we just learned from discrete sampling (A-D conversion)?
• Yes, image interpolation is about D-A conversion
• Recall the gap between biological vision and artificial vision systems
From 1D to 2D
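The "2D = 2 sequential 1D" idea from the outline can be sketched as follows: bilinear 2x upscaling done as a 1-D linear interpolation along rows, then the same along columns (a minimal numpy sketch; the function name is illustrative, not from the slides):

```python
import numpy as np

def bilinear_upscale_2x(img):
    """Bilinear 2x upscaling as two sequential 1-D linear interpolations
    (first along axis 0, then along axis 1)."""
    def upsample_1d(a, axis):
        a = np.moveaxis(a, axis, 0)
        out = np.empty((2 * a.shape[0] - 1,) + a.shape[1:], dtype=float)
        out[0::2] = a                        # keep the original samples
        out[1::2] = 0.5 * (a[:-1] + a[1:])   # new samples are midpoints
        return np.moveaxis(out, 0, axis)
    return upsample_1d(upsample_1d(img, 0), 1)
```

The divide-and-conquer structure is what makes separable interpolation cheap: each pass is a 1-D problem from elementary algebra.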
Image Interpolation Using Classification-based Neural Networks
Hao Hu, Student Member, IEEE, Paul M. Hofman and Gerard de Haan, Senior Member, IEEE

Abstract — Standard image interpolation methods generally use a uniform interpolation filter on the entire image. To achieve a better performance on specific structures, some content-adaptive interpolation methods, such as Kondo’s method [1], have been introduced. However, these content-adaptive methods are limited to fitting the image data with a linear model in each class. We investigate replacing the linear model by a flexible non-linear model, such as a feed-forward neural network. This results in a new interpolation algorithm based on known classification, but achieving better results. In this paper, such a classification-based neural network approach and its evaluation are presented. Both objective and subjective image quality results indicate that the proposed method gives an additional improvement in the interpolated image quality.

Index Terms — image interpolation, neural network, non-linear models, classification

I. INTRODUCTION
Image interpolation is commonly performed by linear filters. The well-known bilinear and bicubic interpolation algorithms use linear interpolation filters, which have the drawback that they tend to produce undesirable blurring in the interpolated images. To enhance the sharpness and edge performance of image interpolation, some content-adaptive interpolation methods have been introduced, such as Kondo’s method [1] and Atkins’ method [2]. Kondo’s method uses a simple scheme to classify the local image structure in the filter’s aperture and computes the optimal linear filter for each class; it has a better performance on edges than common linear methods [3]. Atkins’ method uses a more complicated classification based on a stochastic model and obtains a high-resolution image by mixing the outputs of optimized linear filters with coefficients that depend on the classification.
Atkins’ method can also reduce edge blurring, but it is computationally more intensive than Kondo’s method.

As an approach to non-linear filtering, neural networks can be used for image interpolation. Some applications use neural networks as function approximators. Plaziac [4] proposed a feed-forward neural network to restore the high-resolution image by filling the gaps in a decimated high-resolution image. This neural network model had 24 input pixels from the low-resolution content and 16 hidden neurons. The paper reported that the method outperformed linear and median filters. A similar method was also proposed by Go [5] to realize color interpolation with a Bayer pattern for digital still cameras.

Other interpolation schemes use the neural network as a pattern classifier. Marsi [6] used a neural network to segment the image according to the presence of oriented edges, then applied a set of different directional adaptive filters to interpolate the image. The filter outputs were weighted according to the output of the neural network. Ahmed [7] proposed to use a neural network to segment and label document images into text, halftone images, and background. Once the segmentation was performed, a specific enhancement or interpolation kernel was applied to each document component.

In this paper, we propose an interpolation method using feed-forward neural networks with coefficients that are specific for a class of local structure.

1 Hao Hu is with Eindhoven University of Technology, Den Dolech 2, 5600 MB Eindhoven, the Netherlands (e-mail: hao.hu@).
Paul M. Hofman is with Philips Research Laboratories Eindhoven, Prof. Holstlaan 4, 5656 AA Eindhoven, the Netherlands (e-mail: paul.hofman@).
Gerard de Haan is with Philips Research Laboratories Eindhoven, Prof. Holstlaan 4, 5656 AA Eindhoven, the Netherlands (e-mail: g.de.haan@).
The proposed method classifies the local structure of a 3×3 pixel neighborhood in the same way as Kondo’s method, but uses non-linear filters (neural networks) for the interpolation, rather than linear filters. The optimal coefficients are obtained on the basis of the same error criteria as Kondo’s method. Experimental results show that the proposed neural network approach has a better interpolation performance than Kondo’s method.

In Section II, we give a brief introduction to Kondo’s method. In Section III we describe how to apply the proposed interpolation method and obtain the optimal coefficients. Section IV contains the experimental results. Finally, the conclusion and discussion are provided in Section V.

II. KONDO’S METHOD
Kondo’s method is a content-adaptive filtering approach, that is, the coefficients of the interpolation filters depend on the pattern of the local structure. A block diagram of the method is presented in Fig. 1.

In a 3×3 aperture, all pixel values are encoded into binary 0 or 1 using Adaptive Dynamic Range Coding (ADRC) [8]. The 1-bit ADRC code of every pixel is defined by:

$$\mathrm{ADRC}(i) = \begin{cases} 1, & i \ge i_{av} \\ 0, & i < i_{av} \end{cases} \quad (1)$$

where i is the value of a pixel in the aperture and i_{av} is the average value of all pixels in the aperture.

Fig. 1 The interpolation process of Kondo’s method: the local image structure is classified using ADRC and the filter coefficients are obtained from the LUT.

The concatenation of ADRC(i) over all pixels in the aperture gives the class code c. The class c is used as the index into a Look-Up Table (LUT) that contains a set of filter coefficients for every class. The high-resolution pixels are computed by weighting the pixel values with the coefficients and summing the results:

$$y_1 = w_{1,c}\,i_1 + w_{2,c}\,i_2 + \dots + w_{9,c}\,i_9 \quad (2)$$

where w_{1,c}, w_{2,c}, ..., w_{9,c} are the coefficients for the class whose index in the LUT is c. To interpolate the value of y_2, the aperture is flipped horizontally as shown in Fig. 2.
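The 1-bit ADRC classification of Eq. (1) can be sketched in a few lines of numpy (the function name is illustrative, not from the paper):

```python
import numpy as np

def adrc_class(aperture):
    """1-bit ADRC of a 3x3 aperture (Eq. 1): each pixel is coded 1 if it
    is >= the aperture mean, else 0; the concatenated bits form the class
    code c used to index the coefficient LUT."""
    bits = (aperture.ravel() >= aperture.mean()).astype(int)
    return int("".join(map(str, bits)), 2)   # class code as an integer
```

With 9 pixels there are at most 2^9 = 512 classes, each with its own filter coefficients.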
Then y_2 can be calculated in the same way as y_1. Similarly, y_3 and y_4 can be calculated using vertical flipping.

Fig. 2 Aperture flipping: the high-resolution pixel y_2 maps to the position of y_1 after the aperture is flipped.

To obtain the optimal interpolation coefficients, Kondo’s method employs original high-resolution images and down-scaled images as training material and uses the Least Mean Squares (LMS) method to get the optimal coefficients for every class. Suppose that in one class c the total number of training samples is N. Let y_orig be the original high-resolution pixel value and y_inpl the interpolated pixel value. The sum of squared errors then is:

$$e_c^2 = \sum_{n=1}^{N} (y_{orig,n} - y_{inpl,n})^2 \quad (3)$$

Inserting (2) into (3), the sum of squared errors becomes:

$$e_c^2 = \sum_{n=1}^{N} \left[ y_{orig,n} - (w_{1,c}\,i_{1,n} + w_{2,c}\,i_{2,n} + \dots + w_{9,c}\,i_{9,n}) \right]^2 \quad (4)$$

To find the minimum of e_c^2, the first derivatives of e_c^2 with respect to w_{1,c}, w_{2,c}, ..., w_{9,c} are set to zero:

$$\frac{\partial e_c^2}{\partial w_{k,c}} = -2 \sum_{n=1}^{N} i_{k,n} \left[ y_{orig,n} - (w_{1,c}\,i_{1,n} + \dots + w_{9,c}\,i_{9,n}) \right] = 0, \quad k = 1, \dots, 9 \quad (5)$$

Equations (5) form a set of linear equations which is solved in a straightforward manner by matrix inversion. This yields the optimal coefficients for class c.

III. THE PROPOSED METHOD
The block diagram of the proposed interpolation method is shown in Fig. 3. The system interpolates high-resolution images using neural networks with coefficients that are selected from the LUT according to the classification.

Fig. 3 The interpolation process of the proposed method. Note that the overall scheme is identical to Fig. 1, except that the linear interpolation model is replaced by a neural network.

A. Interpolation Process
Similar to Kondo’s method, we use all pixels in a 3×3 aperture in the low-resolution image to estimate the corresponding interpolated pixels.
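The per-class training of Eqs. (3)-(5) is an ordinary linear least-squares problem: stacking the aperture pixels of all N samples into a matrix turns the normal equations into (XᵀX)w = Xᵀy. A minimal sketch with synthetic data (names are illustrative, not from the paper):

```python
import numpy as np

def train_class_filter(X, y):
    """Solve Eqs. (3)-(5) for one ADRC class: X is an (N, 9) matrix whose
    rows hold the aperture pixels i_1..i_9 of each training sample, and y
    is the (N,) vector of original high-resolution pixels. Setting the
    derivatives in Eq. (5) to zero gives the normal equations
    (X^T X) w = X^T y, solved here with a numerically stable routine."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w
```

Using a least-squares solver rather than an explicit matrix inversion avoids trouble when XᵀX is poorly conditioned, e.g. for classes with few training samples.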
The nine pixels in the aperture are presented to the neural network as an input vector. The ADRC code of the input is used to choose the corresponding set of coefficients for the neural network from the LUT.

The neural network model (Fig. 4) that we use is a simple three-layer feed-forward architecture with only a few hidden neurons and one output neuron. The non-linear function used in the hidden layer is the hyperbolic tangent, whereas a linear function is used at the output layer. Suppose there are N_h hidden neurons in the network. The output h_n of hidden neuron n is:

$$h_n = \tanh(\vec{w}_n \cdot \vec{i} + b_n) \quad (6)$$

where \vec{i} is the input vector (i_1, i_2, ..., i_9), \vec{w}_n is the weight vector between the input layer and hidden neuron n, and b_n is the bias of hidden neuron n. The output of the network is:

$$y_1 = \vec{u} \cdot \vec{h} + b_0 \quad (7)$$

where \vec{u} is the weight vector between the hidden layer and the output neuron, \vec{h} is the output vector of the hidden layer (h_1, h_2, ..., h_{N_h}), and b_0 is the bias of the output neuron. The other outputs y_2, y_3 and y_4 can be calculated by flipping the aperture as in Kondo’s method (Section II).

Fig. 4 The model uses a three-layer feed-forward architecture, with a non-linear transfer function in the hidden layer and a linear transfer function in the output layer.

The total number of parameters of the neural network in Fig. 4 is about N_h times the number of linear filter coefficients in Kondo’s method. Furthermore, note that the term \vec{w}_n \cdot \vec{i} in (6) represents a single linear filter, so the neural network can be considered a direct extension of the linear model.

B. Selective Training
Our training procedure is shown in Fig. 5. We use the down-scaled image and the original image as the input and the output target.
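The forward pass of Eqs. (6)-(7) can be sketched directly (a minimal numpy sketch; the function name is illustrative, not from the paper):

```python
import numpy as np

def interpolate_pixel(i_vec, W, b, u, b0):
    """Forward pass of the class-specific network, Eqs. (6)-(7):
    W is the (N_h, 9) input-to-hidden weight matrix, b the (N_h,) hidden
    biases, u the (N_h,) hidden-to-output weights, b0 the output bias."""
    h = np.tanh(W @ i_vec + b)   # Eq. (6): hidden-layer outputs
    return u @ h + b0            # Eq. (7): linear output layer
```

With W = 0 the hidden layer contributes nothing and the output reduces to the bias b0, which makes the "direct extension of the linear model" remark easy to see: each row of W is one linear filter passed through tanh.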
Before training, input and output target pairs (\vec{i}, y) are classified using ADRC on the input vector \vec{i}. The pairs that belong to one specific class are used for neural network training, resulting in optimal coefficients for this class. The optimal coefficients are obtained using the Levenberg-Marquardt optimization algorithm [9] with the sum-of-squared-errors criterion. During training, the errors between outputs and targets are computed and the derivatives of the errors are back-propagated to adjust the coefficients of the network and minimize the sum of squared errors. The optimization is done for each class separately. Once the training of one class is complete, its coefficients are stored at the corresponding index of the LUT and the training procedure is repeated for the next class. Because a similar classification is used, the number of coefficient sets is equal to that of Kondo’s method.

Fig. 5 The selective training procedure

IV. EXPERIMENTS AND RESULTS
To compare the performance of Kondo’s method and the proposed method, we obtain the coefficients of both methods by training on the same material. Both methods use algorithms that minimize the sum of squared errors to get the optimal coefficients, therefore we also choose the Mean Square Error (MSE) as the criterion for the objective quality comparison. We use the same evaluation process as Zhao et al. [3].

A. Training Materials
The training material consists of 30 high-resolution pictures with dimensions 3000×4500 pixels. These pictures include a variety of natural images, including people, buildings, animals and landscapes. We only use the luminance component of these images. At least 200,000 samples per class are used for training. Note that the images on which we evaluate the methods are not included in this training set.

B. Neural Network Setting
The neural network used in our experiments has one hidden layer with five hidden neurons.
Our experiments show that about two to five neurons in the hidden layer are required to obtain significantly better results than Kondo’s method. Before training, the input vector and target value are rescaled from the range [0, 255] to [-1, 1], which corresponds to the output range of the hyperbolic tangent function. During the interpolation process, the output of the neural network is rescaled back from [-1, 1] to [0, 255].

C. Down-scaling and MSE Calculation
To generate training material and calculate the MSE for the performance comparison, the high-resolution material has to be down-scaled to a low resolution. Here, we use four-pixel averaging and a 3-tap peaking filter to give the best compensation for the averaging filter, as in [3]. The down-scaled test image or sequence is up-scaled using Kondo’s method as well as the proposed method. The MSE between the interpolated and the original high-resolution material is calculated as:

$$MSE = \frac{1}{JK} \sum_{j=1}^{J} \sum_{k=1}^{K} \left( y_{orig}(j,k) - y_{inpl}(j,k) \right)^2 \quad (8)$$

where J and K are the numbers of columns and rows in the image, and y_orig(j,k) and y_inpl(j,k) are the pixel values of the original and the interpolated image, respectively.

D. Test Images and Sequences
The experiments are performed on the images and sequences used by Zhao et al. [3]. This test material (shown in Fig. 6) includes: a sequence with high-contrast edges in all directions (Bicycle), a sequence with low-contrast details (Football), a still portrait image (Lena), a sequence with detailed lettering and fine structure (Siena), and a sequence with abundant detail (Tokyo).

E. Results
In Table I, the MSE score on every test image or video sequence and the average MSE score are given for Kondo’s method and the proposed method.

(A) Bicycle (B) Football (C) Lena (D) Siena (E) Tokyo
Fig. 6 Test images and video sequences

Table I shows that the proposed method reduces the MSE score by about 10% on every test image and sequence.
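The evaluation criterion of Eq. (8) is straightforward to compute (a minimal numpy sketch; the function name is illustrative, not from the paper):

```python
import numpy as np

def mse(y_orig, y_inpl):
    """Mean square error of Eq. (8) between the original and the
    interpolated image, given as 2-D arrays of equal shape."""
    diff = y_orig.astype(float) - y_inpl.astype(float)
    return float(np.mean(diff ** 2))
```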
Especially on the Bicycle sequence the difference is large, probably because of the many high-contrast edges in this sequence. The results show that by introducing a flexible neural model one can estimate the interpolated pixel more accurately and achieve a lower MSE score.

TABLE I
MSE SCORES ON THE TEST IMAGES AND SEQUENCES

           Kondo’s method   Proposed method
Bicycle    47.4             39.4
Football   59.8             55.2
Lena       20.8             19.4
Siena      80.1             76.9
Tokyo      53.4             51.8
Average    52.3             48.5

To enable a subjective comparison, some image fragments from the Bicycle sequence interpolated by both methods are shown in Fig. 7. The figure demonstrates that the proposed method produces images closer to the originals. The fragments in the first row show that Kondo’s method causes blockiness near the letters, while the proposed method renders the borders more correctly. In the fragments of the second row, the proposed method generates thinner and higher-contrast grid lines. The fragments in the last row illustrate the difference in high-contrast edge performance between the two methods: Kondo’s method causes some overshoots near the edges, while such overshoots are absent in the proposed method.

V. CONCLUSION AND DISCUSSION
We have presented a classification-based neural network approach for image interpolation. The results show that replacing the linear filter in Kondo’s method with highly flexible neural models produces better results and avoids some artifacts of Kondo’s method. By using a large data set it was possible to obtain a stable fit of the neural network despite a significant increase in the number of parameters. The use of pre-classification in the proposed method limits the complexity of the neural network while still achieving good results. In other words, the use of classification simplifies the function approximation task of the neural network.
The analysis of the trade-off between the complexity of the interpolation model and the number of classes merits further attention.

REFERENCES
[1] T. Kondo, T. Fujiwara, Y. Okumura, and Y. Node, “Picture conversion apparatus, picture conversion method, learning apparatus and learning method”, US-patent 6,323,905, Nov. 2001.
[2] C. B. Atkins, C. A. Bouman, and J. P. Allebach, “Optimal image scaling using pixel classification”, 2001 International Conference on Image Processing, vol. 3, pp. 864-867, 2001.
[3] M. Zhao, J. A. Leitao and G. de Haan, “Towards an overview of spatial up-conversion techniques”, Proceedings of ISCE'02, pp. E13-E16, Sep. 2002.
[4] N. Plaziac, “Image interpolation using neural networks,” IEEE Trans. Image Processing, vol. 8, no. 11, pp. 381-390, 1999.
[5] J. Go, K. Sohn and C. Lee, “Interpolation using neural networks for digital still cameras”, IEEE Trans. Consumer Electronics, vol. 46, no. 3, Aug. 2000.
[6] S. Marsi and S. Carrato, “Neural network-based image segmentation for image interpolation”, Neural Networks for Signal Processing V, Proceedings of the 1995 IEEE Workshop, pp. 388-397, Sept. 1995.
[7] M. N. Ahmed, B. E. Cooper, and S. T. Love, “Adaptive image interpolation using a multilayer neural network,” Applications of Artificial Neural Networks in Image Processing VI, Proc. SPIE, vol. 4305, pp. 30-38, Apr. 2001.
[8] T. Kondo and K. Kawaguchi, “Adaptive dynamic range encoding method and apparatus”, US-patent 5,444,487, Aug. 1995.
[9] R. Fletcher, Practical Methods of Optimization, Second Edition, John Wiley & Sons Ltd, ISBN 0-471-49463-1, pp. 100-107, Aug. 2000.

Fig. 7 Image fragments processed by Kondo’s method, by the proposed method, and the original. Note that the proposed method has a better performance on edges, especially in the indicated areas of the image fragments.