Applying the alternating direction method of multipliers for constrained dictionary learning


Principles of Medical Ultrasound, Lecture 8: Ultrasound Transducers
(Shanghai Jiao Tong University)

This book emphasizes basic concepts, principles, and methods, while also covering practical engineering topics such as numerical simulation of acoustic fields, C-language processing of ultrasound images, ultrasound transmitter circuit design, and transducer matching techniques. It is suitable as a textbook for undergraduates in medical ultrasound and related fields, and as a reference for graduate students, researchers, and engineers in the field.
A transducer used to transmit sound waves is called a transmitter. When a transducer is in the transmitting state, it converts electrical energy into mechanical energy, and then into acoustic energy.
A transducer used to receive sound waves is called a receiver. When a transducer is in the receiving state, it converts acoustic energy into mechanical energy, and then into electrical energy.
In some cases, a transducer can serve as both transmitter and receiver; this is the so-called transmit-receive (dual-purpose) transducer.
1. Introduction to Ultrasound Transducers
Basic structure of array transducers
3. Structure of Medical Ultrasound Transducers
(2) Multi-element transducers
Figure 3.6: (a) linear array, (b) phased array, (c) convex (curved) linear array, (d) square (2-D) array.
A phased transducer array is a set of transducer elements arranged along a straight line, with a different phase applied to the signal on each element.
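To make the phase relationship concrete, the sketch below computes the per-element delays that steer the beam of a uniform linear array; the element count, pitch, sound speed, and steering angle are illustrative values, not parameters taken from the lecture.

```python
import numpy as np

# Illustrative beam-steering delay calculation for a uniform linear array.
# All values are assumptions for the example, not from the lecture notes.
c = 1540.0                 # speed of sound in soft tissue (m/s)
pitch = 0.3e-3             # element spacing (m)
n_elements = 64
theta = np.deg2rad(20.0)   # desired steering angle

n = np.arange(n_elements)
delays = n * pitch * np.sin(theta) / c   # per-element delays (s),
print(delays[:4])                        # realized as phase shifts on each element
```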
2. Types of Medical Ultrasound Transducers
3. By transmit/receive mode:
- transmitting transducers
- receiving transducers
- transmit-receive (dual-purpose) transducers
4. By geometric shape:
- circular transducers
- annular (ring) transducers
- square transducers
- rectangular transducers
- horn-shaped transducers
- daisy-shaped transducers

2014_ICASSP_EFFICIENT CONVOLUTIONAL SPARSE CODING

EFFICIENT CONVOLUTIONAL SPARSE CODING

Brendt Wohlberg
Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA

ABSTRACT

When applying sparse representation techniques to images, the standard approach is to independently compute the representations for a set of image patches. This method performs very well in a variety of applications, but the independent sparse coding of each patch results in a representation that is not optimal for the image as a whole. A recent development is convolutional sparse coding, in which a sparse representation for an entire image is computed by replacing the linear combination of a set of dictionary vectors by the sum of a set of convolutions with dictionary filters. A disadvantage of this formulation is its computational expense, but the development of efficient algorithms has received some attention in the literature, with the current leading method exploiting a Fourier domain approach. The present paper introduces a new way of solving the problem in the Fourier domain, leading to substantially reduced computational cost.

Index Terms: Sparse Representation, Sparse Coding, Convolutional Sparse Coding, ADMM

1. INTRODUCTION

Over the past 15 years or so, sparse representations [1] have become a very widely used technique for a variety of problems in image processing. There are numerous approaches to sparse coding, the inverse problem of computing a sparse representation of a signal or image vector s, one of the most widely used being Basis Pursuit DeNoising (BPDN) [2]

\arg\min_x \; \frac{1}{2} \| D x - s \|_2^2 + \lambda \| x \|_1, \quad (1)

where D is a dictionary matrix, x is the sparse representation, and λ is a regularization parameter. When applied to images, decomposition is usually applied independently to a set of overlapping image patches covering the image; this approach is convenient, but often necessitates somewhat ad hoc subsequent handling of the overlap between patches, and results in a representation over the whole image that is suboptimal.

More recently, these techniques have also begun to be applied, with considerable success, to computer vision problems such as face recognition [3] and image classification [4, 5, 6].

(This research was supported by the U.S. Department of Energy through the LANL/LDRD Program.)
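Before moving to the convolutional form, a minimal sketch of solving (1) by iterative soft thresholding (ISTA) may help readers who want to experiment. ISTA is a common BPDN baseline, not the ADMM approach developed in this paper; the dictionary, signal, and iteration count below are placeholders.

```python
import numpy as np

def ista_bpdn(D, s, lam, n_iter=200):
    """Minimal ISTA sketch for arg min_x 0.5*||Dx - s||_2^2 + lam*||x||_1."""
    L = np.linalg.norm(D, 2) ** 2   # Lipschitz constant of the data-term gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        g = D.T @ (D @ x - s)       # gradient of the smooth term
        z = x - g / L               # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x
```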
It is in this application context that convolutional sparse representations were introduced [7], replacing (1) with

\arg\min_{\{x_m\}} \; \frac{1}{2} \Big\| \sum_m d_m * x_m - s \Big\|_2^2 + \lambda \sum_m \| x_m \|_1, \quad (2)

where {d_m} is a set of M dictionary filters, * denotes convolution, and {x_m} is a set of coefficient maps, each of which is the same size as s. Here s is a full image, and the {d_m} are usually much smaller. For notational simplicity s and x_m are considered to be N-dimensional vectors, where N is the number of pixels in an image, and the notation {x_m} is adopted to denote all M of the x_m stacked as a single column vector. The derivations presented here are for a single image with a single color band, but the extension to multiple color bands (for both image and filters) and simultaneous sparse coding of multiple images is mathematically straightforward.

The original algorithm proposed for convolutional sparse coding [7] adopted a splitting technique with alternating minimization of two subproblems, the first consisting of the solution of a large linear system via an iterative method, and the other a simple shrinkage. The resulting alternating minimization algorithm is similar to one that would be obtained within an Alternating Direction Method of Multipliers (ADMM) [8, 9] framework, but requires continuation on the auxiliary parameter to enforce the constraint inherent in the splitting. All computation is performed in the spatial domain, the authors expecting that computation in the Discrete Fourier Transform (DFT) domain would result in undesirable boundary artifacts [7]. Other algorithms that have been proposed for this problem include coordinate descent [10] and a proximal gradient method [11], both operating in the spatial domain.

Very recently, an ADMM algorithm operating in the DFT domain has been proposed for dictionary learning for convolutional sparse representations [12]. The use of the Fast Fourier Transform (FFT) in solving the relevant linear systems is shown to give substantially better asymptotic performance than the original spatial domain method, and evidence is presented to support the claim that the resulting boundary effects are not significant. The present paper describes a convolutional sparse coding algorithm that is derived within the ADMM framework and exploits the FFT for computational advantage. It is very similar to the sparse coding component of the dictionary learning algorithm of [12], but introduces a method for solving the linear systems that dominate the computational cost of the algorithm in time that is linear in the number of filters, instead of cubic as in the method of [12].

2. ADMM ALGORITHM

Rewriting (2) in a form suitable for ADMM by introducing auxiliary variables {y_m}, we have

\arg\min_{\{x_m\},\{y_m\}} \; \frac{1}{2} \Big\| \sum_m d_m * x_m - s \Big\|_2^2 + \lambda \sum_m \| y_m \|_1 \;\; \text{such that} \;\; x_m - y_m = 0 \;\; \forall m, \quad (3)

for which the corresponding iterations (see [8, Sec. 3]), with dual variables {u_m}, are

\{x_m\}^{(k+1)} = \arg\min_{\{x_m\}} \; \frac{1}{2} \Big\| \sum_m d_m * x_m - s \Big\|_2^2 + \frac{\rho}{2} \sum_m \big\| x_m - y_m^{(k)} + u_m^{(k)} \big\|_2^2 \quad (4)

\{y_m\}^{(k+1)} = \arg\min_{\{y_m\}} \; \lambda \sum_m \| y_m \|_1 + \frac{\rho}{2} \sum_m \big\| x_m^{(k+1)} - y_m + u_m^{(k)} \big\|_2^2 \quad (5)

u_m^{(k+1)} = u_m^{(k)} + x_m^{(k+1)} - y_m^{(k+1)}. \quad (6)

Subproblem (5) is solved via shrinkage/soft thresholding as

y_m^{(k+1)} = S_{\lambda/\rho}\big( x_m^{(k+1)} + u_m^{(k)} \big), \quad (7)

where

S_\gamma(u) = \mathrm{sign}(u) \, \max(0, |u| - \gamma), \quad (8)

with sign(·) and |·| of a vector considered to be applied element-wise. The computational cost is O(MN). The only computationally expensive step is solving (4), which is of the form

\arg\min_{\{x_m\}} \; \frac{1}{2} \Big\| \sum_m d_m * x_m - s \Big\|_2^2 + \frac{\rho}{2} \sum_m \| x_m - z_m \|_2^2. \quad (9)
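The shrinkage step (7)-(8) and the dual update (6) are cheap elementwise operations; a sketch, assuming all M coefficient maps are stored in a single NumPy array:

```python
import numpy as np

def soft_threshold(v, gamma):
    """Elementwise shrinkage S_gamma(v) of Eq. (8)."""
    return np.sign(v) * np.maximum(np.abs(v) - gamma, 0.0)

def y_u_update(x, u, lam, rho):
    """One pass of the cheap ADMM updates; x and u stack all M maps."""
    y_new = soft_threshold(x + u, lam / rho)   # Eq. (7)
    u_new = u + x - y_new                      # Eq. (6)
    return y_new, u_new
```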
2.1. DFT Domain Formulation

An obvious approach is to attempt to exploit the FFT for efficient implementation of the convolution via the DFT convolution theorem. (This does involve some increase in memory requirement, since the d_m are zero-padded to the size of the x_m before application of the FFT.) Define linear operators D_m such that D_m x_m = d_m * x_m, and denote the variables D_m, x_m, s, and z_m in the DFT domain by \hat{D}_m, \hat{x}_m, \hat{s}, and \hat{z}_m respectively. It is easy to show via the DFT convolution theorem that (9) is equivalent to

\arg\min_{\{\hat{x}_m\}} \; \frac{1}{2} \Big\| \sum_m \hat{D}_m \hat{x}_m - \hat{s} \Big\|_2^2 + \frac{\rho}{2} \sum_m \| \hat{x}_m - \hat{z}_m \|_2^2, \quad (10)

with the {x_m} minimizing (9) being given by the inverse DFT of the {\hat{x}_m} minimizing (10). Defining

\hat{D} = \begin{pmatrix} \hat{D}_0 & \hat{D}_1 & \cdots \end{pmatrix}, \quad \hat{x} = \begin{pmatrix} \hat{x}_0 \\ \hat{x}_1 \\ \vdots \end{pmatrix}, \quad \hat{z} = \begin{pmatrix} \hat{z}_0 \\ \hat{z}_1 \\ \vdots \end{pmatrix}, \quad (11)

this problem can be expressed as

\arg\min_{\hat{x}} \; \frac{1}{2} \| \hat{D} \hat{x} - \hat{s} \|_2^2 + \frac{\rho}{2} \| \hat{x} - \hat{z} \|_2^2, \quad (12)

the solution being given by

(\hat{D}^H \hat{D} + \rho I) \hat{x} = \hat{D}^H \hat{s} + \rho \hat{z}. \quad (13)

2.2. Independent Linear Systems

Matrix \hat{D} has a block structure consisting of M concatenated N × N diagonal matrices, where M is the number of filters and N is the number of samples in s. \hat{D}^H \hat{D} is an MN × MN matrix, but due to the diagonal block (not block diagonal) structure of \hat{D}, a row of \hat{D}^H with its non-zero element at column n will only have a non-zero product with a column of \hat{D} with its non-zero element at row n. As a result, there is no interaction between elements of \hat{D} corresponding to different frequencies, so that (as pointed out in [12]) one need only solve N independent M × M linear systems to solve (13). Bristow et al. [12] do not specify how they solve these linear systems (and their software implementation was not available for inspection), but since they rate the computational cost of solving them as O(M^3), it is reasonable to conclude that they apply a direct method such as Gaussian elimination. This can be very effective [8, Sec. 4.2.3] when it is possible to precompute and store a Cholesky or similar decomposition of the linear system(s), but in this case it is not practical unless M is very small, having an O(M^2 N) memory requirement for storage of these decompositions. Nevertheless, this remains a reasonable approach, the only obvious alternative being an iterative method such as conjugate gradient (CG).

A more careful analysis of the unique structure of this problem, however, reveals that there is an alternative, and vastly more effective, solution. First, define the m-th block of the right hand side of (13) as

\hat{r}_m = \hat{D}_m^H \hat{s} + \rho \hat{z}_m, \quad (14)

so that

\begin{pmatrix} \hat{r}_0 \\ \hat{r}_1 \\ \vdots \end{pmatrix} = \hat{D}^H \hat{s} + \rho \hat{z}. \quad (15)

Now, denoting the n-th element of a vector x by x(n) to avoid confusion between indexing of the vectors themselves and selection of elements of these vectors, define

v_n = \begin{pmatrix} \hat{x}_0(n) \\ \hat{x}_1(n) \\ \vdots \end{pmatrix}, \quad b_n = \begin{pmatrix} \hat{r}_0(n) \\ \hat{r}_1(n) \\ \vdots \end{pmatrix}, \quad (16)

and define a_n as the column vector containing all of the non-zero entries from column n of \hat{D}^H, i.e. writing

\hat{D} = \begin{pmatrix} \hat{d}_{0,0} & 0 & 0 & \cdots & \hat{d}_{1,0} & 0 & 0 & \cdots \\ 0 & \hat{d}_{0,1} & 0 & \cdots & 0 & \hat{d}_{1,1} & 0 & \cdots \\ 0 & 0 & \hat{d}_{0,2} & \cdots & 0 & 0 & \hat{d}_{1,2} & \cdots \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots \end{pmatrix} \quad (17)

then

a_n = \begin{pmatrix} \hat{d}_{0,n}^* \\ \hat{d}_{1,n}^* \\ \vdots \end{pmatrix}, \quad (18)

where * denotes complex conjugation. The linear system to solve corresponding to element n of the {x_m} is

(a_n a_n^H + \rho I) v_n = b_n. \quad (19)

The critical observation is that the matrix on the left hand side of this system consists of a rank-one matrix plus a scaled identity. Applying the Sherman-Morrison formula

(A + u v^H)^{-1} = A^{-1} - \frac{A^{-1} u v^H A^{-1}}{1 + v^H A^{-1} u} \quad (20)

gives

(\rho I + a a^H)^{-1} = \rho^{-1} \left( I - \frac{a a^H}{\rho + a^H a} \right), \quad (21)

so that the solution to (19) is

v_n = \rho^{-1} \left( b_n - \frac{a_n^H b_n}{\rho + a_n^H a_n} \, a_n \right). \quad (22)

The only vector operations here are inner products, element-wise addition, and scalar multiplication, so that this method is O(M) instead of O(M^3) as in [12]. The cost of solving N of these systems is O(MN), and the cost of the FFTs is O(MN log N). Here it is the cost of the FFTs that dominates, whereas in [12] the cost of solving the DFT domain linear systems dominates the cost of the FFTs.
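A vectorized sketch of the solver just described, applying (22) jointly over all N frequencies; it assumes Dhat holds the DFT-domain filters as an N × M array (row n holding the M filter values at frequency n) and rhat holds the stacked right-hand sides of (14) in the same layout:

```python
import numpy as np

def solve_sherman_morrison(Dhat, rhat, rho):
    """Solve (a_n a_n^H + rho*I) v_n = b_n for all n at once via Eq. (22).

    Dhat, rhat: (N, M) complex arrays; row n holds the filter values
    (so a_n = conj(Dhat[n])) and the right-hand side b_n at frequency n.
    """
    a = np.conj(Dhat)                          # row n is a_n
    aHb = np.sum(Dhat * rhat, axis=1)          # a_n^H b_n for every n
    aHa = np.sum(np.abs(Dhat) ** 2, axis=1)    # a_n^H a_n for every n
    return (rhat - (aHb / (rho + aHa))[:, None] * a) / rho   # Eq. (22)
```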
This approach can be implemented in an interpreted language such as Matlab in a form that avoids explicit iteration over the N frequency indices by passing data for all N indices as a single array to the relevant linear-algebraic routines (commonly referred to as vectorization in Matlab terminology). Some additional computation time improvement is possible, at the cost of additional memory requirements, by precomputing a_n^H / (\rho + a_n^H a_n) in (22).

2.3. Algorithm Summary

The proposed algorithm is summarized in Alg. 1. The stopping criteria are those discussed in [8, Sec. 3.3], together with an upper bound on the number of iterations. The options for the ρ update are (i) fixed ρ (i.e. no update), (ii) the adaptive update strategy described in [8, Sec. 3.4.1], and (iii) the multiplicative increase scheme advocated in [12].

Input: image s, filter dictionary {d_m}, parameters λ, ρ
Precompute: FFTs of {d_m} → {\hat{D}_m}, FFT of s → \hat{s}
Initialize: {y_m} = {u_m} = 0
while stopping criteria not met do
  Compute FFTs of {y_m} → {\hat{y}_m}, {u_m} → {\hat{u}_m}
  Compute {\hat{x}_m} using the method in Sec. 2.2
  Compute inverse FFTs of {\hat{x}_m} → {x_m}
  {y_m} = S_{λ/ρ}({x_m} + {u_m})
  {u_m} = {u_m} + {x_m} − {y_m}
  Update ρ if appropriate
end
Output: coefficient maps {x_m}
Algorithm 1: Summary of proposed ADMM algorithm

The computational cost of the algorithm components is O(MN log N) for the FFTs, O(MN) for the proposed linear solver, and O(MN) for both the shrinkage and dual variable update, so that the cost of the entire algorithm is O(MN log N), dominated by the cost of the FFTs. In contrast, the cost of the algorithm proposed in [12] is O(M^3 N) (there is also an O(MN log N) cost for FFTs, but it is dominated by the O(M^3 N) cost of the linear solver), and the cost of the original spatial-domain algorithm [7] is O(M^2 N^2 L), where L is the dimensionality of the filters.
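A compact, self-contained sketch of Alg. 1 for a single 2-D image with fixed ρ (no ρ update); the array layout and the inline Sherman-Morrison step are one possible realization, not the authors' reference implementation:

```python
import numpy as np

def conv_sparse_code(d, s, lam, rho, n_iter=100):
    """Sketch of Alg. 1; d: (M, h, w) filters, s: (H, W) image."""
    M, (H, W) = d.shape[0], s.shape
    Dhat = np.stack([np.fft.fft2(d[m], s=(H, W)) for m in range(M)])  # zero-padded filter FFTs
    shat = np.fft.fft2(s)
    x = y = u = np.zeros((M, H, W))
    for _ in range(n_iter):
        zhat = np.fft.fft2(y - u, axes=(1, 2))
        rhat = np.conj(Dhat) * shat + rho * zhat            # Eq. (14)
        aHb = np.sum(Dhat * rhat, axis=0)                   # Sherman-Morrison solve,
        aHa = np.sum(np.abs(Dhat) ** 2, axis=0)             # jointly over all
        xhat = (rhat - (aHb / (rho + aHa)) * np.conj(Dhat)) / rho  # frequencies, Eq. (22)
        x = np.real(np.fft.ifft2(xhat, axes=(1, 2)))
        y = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)  # Eq. (7)
        u = u + x - y                                       # Eq. (6)
    return x
```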
3. DICTIONARY LEARNING

The extension of (2) to learning a dictionary from training data involves replacing the minimization with respect to x_m with minimization with respect to both x_m and d_m. The optimization is invariably performed via alternating minimization between the two variables, the most common approach consisting of a sparse coding step followed by a dictionary update [13]. The commutativity of convolution suggests that the DFT domain solution of Sec. 2.1 can be directly applied in minimizing with respect to d_m instead of x_m, but this is not possible since the d_m are of constrained size, and must be zero-padded to the size of the x_m prior to a DFT domain implementation of the convolution. If the size constraint is implemented in an ADMM framework [14], however, the problem is decomposed into a computationally cheap subproblem corresponding to a projection onto the constraint set, and another subproblem that can be efficiently solved by extending the method in Sec. 2.1. This iterative algorithm for the dictionary update can alternate with a sparse coding stage to form a more traditional dictionary learning method [15], or the subproblems of the sparse coding and dictionary update algorithms can be combined into a single ADMM algorithm [12].

4. RESULTS

A comparison of execution times for the algorithm (λ = 0.05) with different methods of solving the linear system, for a set of overcomplete 8×8 DCT dictionaries and the 512×512 greyscale Lena image, is presented in Fig. 1. It is worth emphasizing that this is a large image by the standards of prior publications on convolutional sparse coding; the test images in [12], for example, are 50×50 and 128×128 pixels in size.

The Gaussian elimination solution is computed using a Cholesky decomposition (since it is, in general, impossible to cache this decomposition, it is necessary to recompute it at every solution), as implemented by the Matlab mldivide function, and is applied by iterating over all frequencies in the apparent absence of any practical alternative. The conjugate gradient solution is computed using two different relative error tolerances. A significant part of the computational advantage here of CG over the direct method is that it is applied simultaneously over all frequencies. The two curves for the proposed solver based on the Sherman-Morrison formula illustrate the significant gain from an implementation that simultaneously solves over all frequencies, and show that the relative advantage of doing so decreases with increasing M.

[Fig. 1 (plot omitted; execution time in seconds vs. dictionary size M, for M from 64 to 512): A comparison of execution times for 10 steps of the ADMM algorithm for different methods of solving the linear system: Gaussian elimination (GE), Conjugate Gradient with relative error tolerance 10^-5 (CG 10^-5) and 10^-3 (CG 10^-3), and Sherman-Morrison implemented with a loop over frequencies (SM-L) or jointly over all frequencies (SM-V).]

The performance of the three ρ update strategies discussed in the previous section was compared by sparse coding a 256×256 Lena image using a 9×9×512 dictionary (from [16], by the authors of [17]) with a fixed value of λ = 0.02 and a range of initial ρ values ρ_0. The resulting values of the functional in (2) after 100, 500, and 1000 iterations of the proposed algorithm are displayed in Table 1. The adaptive update strategy uses the default parameters of [8, Sec. 3.4.1], and the increasing strategy uses a multiplicative update by a factor of 1.1 with a maximum of 10^5, as advocated by [12].

Table 1. Comparison of functional value convergence for the same problem with three different ρ update strategies.

              ρ_0:    10^-2   10^-1   10^0    10^1    10^2    10^3
Fixed ρ
  Iter. 100           28.27   27.80   18.10   10.09    9.76   11.60
  Iter. 500           28.05   22.25   11.11    8.89    9.11   10.13
  Iter. 1000          27.80   17.00    9.64    8.82    8.96    9.71
Adaptive ρ
  Iter. 100           21.62   16.97   14.56   10.71   11.14   11.41
  Iter. 500           10.81   10.23    9.81    9.01    9.18    9.09
  Iter. 1000           9.44    9.21    9.06    8.83    8.87    8.84
Increasing ρ
  Iter. 100           14.78    9.82    9.50    9.90   11.51   15.15
  Iter. 500            9.55    9.45    9.46    9.89   11.47   14.51
  Iter. 1000           9.53    9.44    9.45    9.88   11.41   13.97

In summary, a fixed ρ can perform well, but is sensitive to a good choice of parameter. When initialized with a small ρ_0, the increasing ρ strategy provides the most rapid decrease in functional value, but thereafter converges very slowly. Overall, unless rapid computation of an approximate solution is desired, the adaptive ρ strategy appears to provide the best performance, with the least sensitivity to the choice of ρ_0. This issue is complex, however, and further experimentation is necessary before drawing any general conclusions that could be considered valid over a broad range of problems.

5. CONCLUSION

A computationally efficient algorithm is proposed for solving the convolutional sparse coding problem in the Fourier domain. This algorithm has the same general structure as a previously proposed approach [12], but enables a very significant reduction in computational cost by careful design of a linear solver for the most critical component of the iterative algorithm. The theoretical computational cost of the algorithm is reduced from O(M^3) to O(MN log N) (where N is the dimensionality of the data and M is the number of elements in the dictionary), and is also shown empirically to result in greatly reduced computation time. The significant improvement in efficiency of the proposed approach is expected to greatly increase the range of problems that can practically be addressed via convolutional sparse representations.
6. REFERENCES

[1] A. M. Bruckstein, D. L. Donoho, and M. Elad, "From sparse solutions of systems of equations to sparse modeling of signals and images," SIAM Review, vol. 51, no. 1, pp. 34-81, 2009. doi:10.1137/060657704
[2] S. S. Chen, D. L. Donoho, and M. A. Saunders, "Atomic decomposition by basis pursuit," SIAM Journal on Scientific Computing, vol. 20, no. 1, pp. 33-61, 1998. doi:10.1137/S1064827596304010
[3] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210-227, February 2009. doi:10.1109/tpami.2008.79
[4] Y. Boureau, F. Bach, Y. A. LeCun, and J. Ponce, "Learning mid-level features for recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2010, pp. 2559-2566. doi:10.1109/cvpr.2010.5539963
[5] J. Yang, K. Yu, and T. S. Huang, "Supervised translation-invariant sparse coding," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010, pp. 3517-3524. doi:10.1109/cvpr.2010.5539958
[6] J. Mairal, F. Bach, and J. Ponce, "Task-driven dictionary learning," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 4, pp. 791-804, April 2012. doi:10.1109/tpami.2011.156
[7] M. D. Zeiler, D. Krishnan, G. W. Taylor, and R. Fergus, "Deconvolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2010, pp. 2528-2535. doi:10.1109/cvpr.2010.5539957
[8] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, "Distributed optimization and statistical learning via the alternating direction method of multipliers," Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1-122, 2010. doi:10.1561/2200000016
[9] J. Eckstein, "Augmented Lagrangian and alternating direction methods for convex optimization: A tutorial and some illustrative computational results," Rutgers Center for Operations Research, Rutgers University, Rutcor Research Report RRR 32-2012, December 2012. [Online]. Available: /pub/rrr/reports2012/322012.pdf
[10] K. Kavukcuoglu, P. Sermanet, Y. Boureau, K. Gregor, M. Mathieu, and Y. A. LeCun, "Learning convolutional feature hierarchies for visual recognition," in Advances in Neural Information Processing Systems (NIPS 2010), 2010.
[11] R. Chalasani, J. C. Principe, and N. Ramakrishnan, "A fast proximal method for convolutional sparse coding," in Proceedings of the International Joint Conference on Neural Networks (IJCNN), Aug. 2013, pp. 1-5. doi:10.1109/IJCNN.2013.6706854
[12] H. Bristow, A. Eriksson, and S. Lucey, "Fast convolutional sparse coding," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2013, pp. 391-398. doi:10.1109/CVPR.2013.57
[13] B. Mailhé and M. D. Plumbley, "Dictionary learning with large step gradient descent for sparse representations," in Latent Variable Analysis and Signal Separation, ser. Lecture Notes in Computer Science, F. J. Theis, A. Cichocki, A. Yeredor, and M. Zibulevsky, Eds. Springer Berlin Heidelberg, 2012, vol. 7191, pp. 231-238. doi:10.1007/978-3-642-28551-6_29
[14] M. V. Afonso, J. M. Bioucas-Dias, and M. A. T. Figueiredo, "An augmented Lagrangian approach to the constrained optimization formulation of imaging inverse problems," IEEE Transactions on Image Processing, vol. 20, no. 3, pp. 681-695, March 2011. doi:10.1109/tip.2010.2076294
[15] K. Engan, S. O. Aase, and J. H. Husøy, "Method of optimal directions for frame design," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 5, 1999, pp. 2443-2446. doi:10.1109/icassp.1999.760624
[16] J. Mairal, software available from http://lear.inrialpes.fr/people/mairal/denoise_ICCV09.tar.gz.
[17] J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, "Non-local sparse models for image restoration," in Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2009, pp. 2272-2279. doi:10.1109/iccv.2009.5459452

An English Essay on Chess Instruction

Title: Guidance on Playing Chess

Chess is a game of strategy, intellect, and patience. It is played on a square board divided into 64 squares of alternating colors. Each player starts with 16 pieces: one king, one queen, two rooks, two knights, two bishops, and eight pawns. The objective of the game is to checkmate the opponent's king, putting it into a position where it is in check (under threat of capture) and there is no way to move it out of check.

To begin, the board is set up with each player's pieces arranged symmetrically on their respective sides. The player with the white pieces moves first, and then the players take turns moving their pieces until the game is over. Here are some key strategies and tips to help you improve your chess game:

1. Control the Center: Controlling the center of the board is crucial in chess. It allows your pieces to have more mobility and influence over the entire board. Aim to occupy the central squares with your pawns and pieces early in the game.

2. Develop Your Pieces: In the opening phase of the game, focus on developing your pieces to active squares. Knights and bishops are usually developed before the rooks and queen. Avoid moving the same piece multiple times in the opening unless necessary.

3. King Safety: Keep your king safe by castling early in the game. Castling involves moving the king two squares towards a rook on its initial square and then moving the rook to the square next to the king. This helps to protect the king and connect the rooks.

4. Pawn Structure: Pay attention to your pawn structure. Pawns form the backbone of your position and can influence the dynamics of the game. Avoid creating weaknesses in your pawn structure that can be exploited by your opponent.

5. Tactical Awareness: Be vigilant for tactical opportunities such as forks, pins, skewers, and discovered attacks. These tactical motifs can help you gain a material advantage or create threats against your opponent's pieces.

6. Plan Ahead: Develop a plan based on the position of the pieces and the pawn structure. Consider both short-term and long-term goals, such as controlling key squares, targeting weak points in your opponent's position, or launching an attack against the enemy king.

7. Evaluate Trades: Assess the consequences of piece exchanges before making them. Sometimes it's beneficial to trade pieces to simplify the position or to eliminate a strong opponent's piece. Other times, it's better to maintain the tension and keep the pieces on the board to preserve attacking chances.

8. Endgame Fundamentals: Study basic endgame principles such as king and pawn endgames, rook endgames, and basic checkmating patterns. Endgame skills are crucial for converting an advantage into a win or salvaging a draw from a difficult position.

9. Learn from Mistakes: Analyze your games to learn from your mistakes and improve your understanding of chess. Identify key moments where you made inaccuracies or blunders and figure out how you could have played better.

10. Practice Regularly: Like any skill, improving at chess requires practice. Play games regularly against opponents of varying skill levels, solve chess puzzles, and study classic games by grandmasters to deepen your understanding of chess principles and strategies.

In conclusion, chess is a fascinating game that offers endless opportunities for learning and improvement. By applying these strategies and tips, you can enhance your chess skills and enjoy the game even more.

Remember, becoming a strong chess player takes time, dedication, and a willingness to learn from both victories and defeats. So keep practicing, keep learning, and most importantly, have fun!

Foreign-language literature on automotive transmissions: A Short Course on Automatic Transmissions

A Short Course on Automatic Transmissions
by Charles Ofria

The modern automatic transmission is by far the most complicated mechanical component in today's automobile. Automatic transmissions contain mechanical systems, hydraulic systems, electrical systems and computer controls, all working together in perfect harmony which goes virtually unnoticed until there is a problem. This article will help you understand the concepts behind what goes on inside these technological marvels and what goes into repairing them when they fail.

This article is broken down into five sections:
• What is a transmission? breaks down in the simplest terms what the purpose of a transmission is.
• Transmission Components describes the general principles behind each system in simple terms to help you understand how an automatic transmission works.
• Spotting problems before they get worse shows what to look for to prevent a minor problem from becoming major.
• Maintenance talks about preventative maintenance that everyone should know about.
• Transmission repairs describes the types of repairs that are typically performed on transmissions, from minor adjustments to complete overhauls.

What is a transmission?

The transmission is a device that is connected to the back of the engine and sends the power from the engine to the drive wheels. An automobile engine runs at its best at a certain RPM (Revolutions Per Minute) range, and it is the transmission's job to make sure that the power is delivered to the wheels while keeping the engine within that range. It does this through various gear combinations. In first gear, the engine turns much faster in relation to the drive wheels, while in high gear the engine is loafing even though the car may be going in excess of 70 MPH. In addition to the various forward gears, a transmission also has a neutral position, which disconnects the engine from the drive wheels, and reverse, which causes the drive wheels to turn in the opposite direction, allowing you to back up. Finally, there is the Park position. In this position, a latch mechanism (not unlike a deadbolt lock on a door) is inserted into a slot in the output shaft to lock the drive wheels and keep them from turning, thereby preventing the vehicle from rolling.

There are two basic types of automatic transmissions, based on whether the vehicle is rear wheel drive or front wheel drive.

On a rear wheel drive car, the transmission is usually mounted to the back of the engine and is located under the hump in the center of the floorboard alongside the gas pedal position. A drive shaft connects the rear of the transmission to the final drive, which is located in the rear axle and is used to send power to the rear wheels. Power flow on this system is simple and straightforward, going from the engine, through the torque converter, then through the transmission and drive shaft until it reaches the final drive, where it is split and sent to the two rear wheels.

On a front wheel drive car, the transmission is usually combined with the final drive to form what is called a transaxle. The engine on a front wheel drive car is usually mounted sideways in the car with the transaxle tucked under it on the side of the engine facing the rear of the car. Front axles are connected directly to the transaxle and provide power to the front wheels. In this example, power flows from the engine, through the torque converter, to a large chain that sends the power through a 180 degree turn to the transmission that is alongside the engine.
From there, the power is routed through the transmission to the final drive, where it is split and sent to the two front wheels through the drive axles.

There are a number of other arrangements, including front drive vehicles where the engine is mounted front to back instead of sideways, and other systems that drive all four wheels, but the two systems described here are by far the most popular. A much less popular rear drive arrangement has the transmission mounted directly to the final drive at the rear, connected by a drive shaft to the torque converter, which is still mounted on the engine. This system is found on the new Corvette and is used in order to balance the weight evenly between the front and rear wheels for improved performance and handling. Another rear drive system mounts everything, the engine, transmission and final drive, in the rear. This rear engine arrangement is popular on the Porsche.

Transmission Components

The modern automatic transmission consists of many components and systems that are designed to work together in a symphony of clever mechanical, hydraulic and electrical technology that has evolved over the years into what many mechanically inclined individuals consider to be an art form. We try to use simple, generic explanations where possible to describe these systems but, due to the complexity of some of these components, you may have to use some mental gymnastics to visualize their operation.

The main components that make up an automatic transmission include:
• Planetary Gear Sets, which are the mechanical systems that provide the various forward gear ratios as well as reverse.
• The Hydraulic System, which uses a special transmission fluid sent under pressure by an Oil Pump through the Valve Body to control the Clutches and the Bands in order to control the planetary gear sets.
• Seals and Gaskets, which are used to keep the oil where it is supposed to be and prevent it from leaking out.
• The Torque Converter, which acts like a clutch to allow the vehicle to come to a stop in gear while the engine is still running.
• The Governor and the Modulator or Throttle Cable, which monitor speed and throttle position in order to determine when to shift.
• On newer vehicles, shift points are controlled by a Computer, which directs electrical solenoids to shift oil flow to the appropriate component at the right instant.

Planetary Gear Sets

Automatic transmissions contain many gears in various combinations. In a manual transmission, gears slide along shafts as you move the shift lever from one position to another, engaging various sized gears as required in order to provide the correct gear ratio. In an automatic transmission, however, the gears are never physically moved and are always engaged to the same gears. This is accomplished through the use of planetary gear sets.

The basic planetary gear set consists of a sun gear, a ring gear and two or more planet gears, all remaining in constant mesh. The planet gears are connected to each other through a common carrier, which allows the gears to spin on shafts called "pinions" which are attached to the carrier.

One example of a way that this system can be used is by connecting the ring gear to the input shaft coming from the engine, connecting the planet carrier to the output shaft, and locking the sun gear so that it can't move.
In this scenario, when we turn the ring gear, the planets will "walk" along the sun gear (which is held stationary), causing the planet carrier to turn the output shaft in the same direction as the input shaft but at a slower speed, causing gear reduction (similar to a car in first gear).

If we unlock the sun gear and lock any two elements together, this will cause all three elements to turn at the same speed, so that the output shaft will turn at the same rate of speed as the input shaft. This is like a car that is in third or high gear. Another way that we can use a planetary gear set is by locking the planet carrier from moving, then applying power to the ring gear, which will cause the sun gear to turn in the opposite direction, giving us reverse gear.

The illustration (not reproduced here) shows how the simple system described above would look in an actual transmission. The input shaft is connected to the ring gear (blue), and the output shaft is connected to the planet carrier (green), which is also connected to a "multi-disk" clutch pack. The sun gear is connected to a drum (yellow), which is also connected to the other half of the clutch pack. Surrounding the outside of the drum is a band (red) that can be tightened around the drum when required to prevent the drum, with the attached sun gear, from turning.

The clutch pack is used, in this instance, to lock the planet carrier with the sun gear, forcing both to turn at the same speed. If both the clutch pack and the band were released, the system would be in neutral. Turning the input shaft would turn the planet gears against the sun gear, but since nothing is holding the sun gear, it will just spin free and have no effect on the output shaft. To place the unit in first gear, the band is applied to hold the sun gear from moving. To shift from first to high gear, the band is released and the clutch is applied, causing the output shaft to turn at the same speed as the input shaft.

Many more combinations are possible using two or more planetary sets connected in various ways to provide the different forward speeds and reverse that are found in modern automatic transmissions. Some of the clever gear arrangements found in four and now five, six and even seven and eight-speed automatics are complex enough to make a technically astute lay person's head spin trying to understand the flow of power through the transmission as it shifts from first gear through top gear while the vehicle accelerates to highway speed. On modern vehicles (mid '80s to the present), the vehicle's computer monitors and controls these shifts so that they are almost imperceptible.
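As a worked illustration of the behavior described above, the standard planetary relation S·w_sun + R·w_ring = (S + R)·w_carrier (S and R being the sun and ring gear tooth counts) gives the speed of each member; the tooth counts and input speed below are made-up example values, not figures from this article.

```python
# Worked planetary gear-set example; values are illustrative only.
S, R = 40, 80          # sun and ring gear tooth counts (hypothetical)
w_ring = 1000.0        # input speed at the ring gear, RPM

# Fundamental relation: S*w_sun + R*w_ring == (S + R)*w_carrier

# First gear: sun locked (w_sun = 0), carrier is the output.
w_carrier = R * w_ring / (S + R)              # ~667 RPM, slower than the input
print(f"reduction ratio {(S + R) / R:.2f}:1") # 1.50:1

# Reverse: carrier locked (w_carrier = 0), sun is the output.
w_sun = -(R / S) * w_ring                     # negative sign: opposite rotation
print(f"reverse output {w_sun:.0f} RPM")
```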
Clutch Packs

A clutch pack consists of alternating disks that fit inside a clutch drum. Half of the disks are steel and have splines that fit into grooves on the inside of the drum. The other half have a friction material bonded to their surface and have splines on the inside edge that fit grooves on the outer surface of the adjoining hub. There is a piston inside the drum that is activated by oil pressure at the appropriate time to squeeze the clutch pack together so that the two components become locked and turn as one.

One-Way Clutch

A one-way clutch (also known as a "sprag" clutch) is a device that will allow a component such as a ring gear to turn freely in one direction but not in the other. This effect is just like that of a bicycle, where the pedals will turn the wheel when pedaling forward, but will spin free when pedaling backward.

A common place where a one-way clutch is used is in first gear when the shifter is in the drive position. When you begin to accelerate from a stop, the transmission starts out in first gear. But have you ever noticed what happens if you release the gas while it is still in first gear? The vehicle continues to coast as if you were in neutral. Now, shift into Low gear instead of Drive. When you let go of the gas in this case, you will feel the engine slow you down just like a standard shift car. The reason for this is that in Drive, a one-way clutch is used, whereas in Low, a clutch pack or a band is used.

Bands

A band is a steel strap with friction material bonded to the inside surface. One end of the band is anchored against the transmission case while the other end is connected to a servo. At the appropriate time, hydraulic oil is sent to the servo under pressure to tighten the band around the drum to stop the drum from turning.

Torque Converter

On automatic transmissions, the torque converter takes the place of the clutch found on standard shift vehicles. It is there to allow the engine to continue running when the vehicle comes to a stop. The principle behind a torque converter is like taking a fan that is plugged into the wall and blowing air into another fan which is unplugged. If you grab the blade on the unplugged fan, you are able to hold it from turning, but as soon as you let go, it will begin to speed up until it comes close to the speed of the powered fan. The difference with a torque converter is that instead of using air, it uses oil, or transmission fluid to be more precise.

A torque converter is a large doughnut-shaped device (10" to 15" in diameter) that is mounted between the engine and the transmission. It consists of three internal elements that work together to transmit power to the transmission. The three elements of the torque converter are the Pump, the Turbine, and the Stator. The pump is mounted directly to the converter housing, which in turn is bolted directly to the engine's crankshaft and turns at engine speed. The turbine is inside the housing and is connected directly to the input shaft of the transmission, providing power to move the vehicle. The stator is mounted to a one-way clutch so that it can spin freely in one direction but not in the other. Each of the three elements has fins mounted in it to precisely direct the flow of oil through the converter.

With the engine running, transmission fluid is pulled into the pump section and is pushed outward by centrifugal force until it reaches the turbine section, which starts it turning. The fluid continues in a circular motion back towards the center of the turbine, where it enters the stator. If the turbine is moving considerably slower than the pump, the fluid will make contact with the front of the stator fins, which push the stator into the one-way clutch and prevent it from turning. With the stator stopped, the fluid is directed by the stator fins to re-enter the pump at a "helping" angle, providing a torque increase. As the speed of the turbine catches up with the pump, the fluid starts hitting the stator blades on the back side, causing the stator to turn in the same direction as the pump and turbine. As the speed increases, all three elements begin to turn at approximately the same speed.

Since the '80s, in order to improve fuel economy, torque converters have been equipped with a lockup clutch (not shown) which locks the turbine to the pump as the vehicle speed reaches approximately 45 - 50 MPH.
This lockup is controlled by computer and usually won't engage unless the transmission is in 3rd or 4th gear.

Hydraulic System

The hydraulic system is a complex maze of passages and tubes that sends transmission fluid under pressure to all parts of the transmission and torque converter. The diagram (not reproduced here) shows a simple system from a 3-speed automatic from the '60s; the newer systems are much more complex and are combined with computerized electrical components. Transmission fluid serves a number of purposes including: shift control, general lubrication and transmission cooling. Unlike the engine, which uses oil primarily for lubrication, every aspect of a transmission's functions is dependent on a constant supply of fluid under pressure. This is not unlike the human circulatory system (the fluid is even red), where even a few minutes of operation when there is a lack of pressure can be harmful or even fatal to the life of the transmission. In order to keep the transmission at normal operating temperature, a portion of the fluid is sent through one of two steel tubes to a special chamber that is submerged in anti-freeze in the radiator. Fluid passing through this chamber is cooled and then returned to the transmission through the other steel tube. A typical transmission has an average of ten quarts of fluid between the transmission, torque converter, and cooler tank. In fact, most of the components of a transmission are constantly submerged in fluid, including the clutch packs and bands. The friction surfaces on these parts are designed to operate properly only when they are submerged in oil.

Oil Pump

The transmission oil pump (not to be confused with the pump element inside the torque converter) is responsible for producing all the oil pressure that is required in the transmission. The oil pump is mounted to the front of the transmission case and is directly connected to a flange on the torque converter housing. Since the torque converter housing is directly connected to the engine crankshaft, the pump will produce pressure whenever the engine is running, as long as there is a sufficient amount of transmission fluid available. The oil enters the pump through a filter that is located at the bottom of the transmission oil pan and travels up a pickup tube directly to the oil pump. The oil is then sent, under pressure, to the pressure regulator, the valve body and the rest of the components, as required.

Valve Body

The valve body is the control center of the automatic transmission. It contains a maze of channels and passages that direct hydraulic fluid to the numerous valves, which then activate the appropriate clutch pack or band servo to smoothly shift to the appropriate gear for each driving situation. Each of the many valves in the valve body has a specific purpose and is named for that function. For example, the 2-3 shift valve activates the 2nd gear to 3rd gear up-shift, and the 3-2 shift timing valve determines when a downshift should occur.

The most important valve, and the one that you have direct control over, is the manual valve. The manual valve is directly connected to the gear shift handle and covers and uncovers various passages depending on what position the gear shift is placed in. When you place the gear shift in Drive, for instance, the manual valve directs fluid to the clutch pack(s) that activates 1st gear. It also sets up to monitor vehicle speed and throttle position so that it can determine the optimal time and the force for the 1-2 shift.
On computer-controlled transmissions, you will also have electrical solenoids that are mounted in the valve body to direct fluid to the appropriate clutch packs or bands under computer control, to more precisely control shift points.

Computer Controls

The computer uses sensors on the engine and transmission to detect such things as throttle position, vehicle speed, engine speed, engine load, brake pedal position, etc., to control exact shift points as well as how soft or firm the shift should be. Once the computer receives this information, it then sends signals to a solenoid pack inside the transmission. The solenoid pack contains several electrically controlled solenoids that redirect the fluid to the appropriate clutch pack or servo in order to control shifting. Computerized transmissions even learn your driving style and constantly adapt to it so that every shift is timed precisely when you would need it.

Because of computer controls, sports models are coming out with the ability to take manual control of the transmission as though it were a stick shift, allowing the driver to select gears manually. This is accomplished on some cars by passing the shift lever through a special gate, then tapping it in one direction or the other in order to up-shift or down-shift at will. The computer monitors this activity to make sure that the driver does not select a gear that could over-speed the engine and damage it.

Another advantage of these "smart" transmissions is that they have a self-diagnostic mode which can detect a problem early on and warn you with an indicator light on the dash. A technician can then plug test equipment in and retrieve a list of trouble codes that will help pinpoint where the problem is.

Governor, Vacuum Modulator, Throttle Cable

These three components are important in non-computerized transmissions. They provide the inputs that tell the transmission when to shift. The governor is connected to the output shaft and regulates hydraulic pressure based on vehicle speed. It accomplishes this using centrifugal force to spin a pair of hinged weights against pull-back springs. As the weights pull further out against the springs, more oil pressure is allowed past the governor to act on the shift valves that are in the valve body, which then signal the appropriate shifts.

Of course, vehicle speed is not the only thing that controls when a transmission should shift; the load that the engine is under is also important. The more load you place on the engine, the longer the transmission will hold a gear before shifting to the next one.

There are two types of devices that serve the purpose of monitoring the engine load: the throttle cable and the vacuum modulator. A transmission will use one or the other, but generally not both, of these devices. Each works in a different way to monitor engine load.

The throttle cable simply monitors the position of the gas pedal through a cable that runs from the gas pedal to the throttle valve in the valve body.

The vacuum modulator monitors engine vacuum through a rubber vacuum hose which is connected to the engine. Engine vacuum reacts very accurately to engine load, with high vacuum produced when the engine is under light load, diminishing down to zero vacuum when the engine is under a heavy load. The modulator is attached to the outside of the transmission case and has a shaft which passes through the case and attaches to the throttle valve in the valve body.
When an engine is under a light load or no load, high vacuum acts on the modulator, which moves the throttle valve in one direction to allow the transmission to shift early and softly. As the engine load increases, vacuum is diminished, which moves the valve in the other direction, causing the transmission to shift later and more firmly.

Seals and Gaskets

An automatic transmission has many seals and gaskets to control the flow of hydraulic fluid and to keep it from leaking out. There are two main external seals: the front seal and the rear seal. The front seal seals the point where the torque converter mounts to the transmission case. This seal allows fluid to move freely from the converter to the transmission but keeps the fluid from leaking out. The rear seal keeps fluid from leaking past the output shaft.

A seal is usually made of rubber (similar to the rubber in a windshield wiper blade) and is used to keep oil from leaking past a moving part such as a spinning shaft. In some cases, the rubber is assisted by a spring that holds the rubber in close contact with the spinning shaft.

A gasket is a type of seal used to seal two stationary parts that are fastened together. Some common gasket materials are: paper, cork, rubber, silicone and soft metal.

Aside from the main seals, there are also a number of other seals and gaskets that vary from transmission to transmission. A common example is the rubber O-ring that seals the shaft for the shift control lever. This is the shaft that you move when you manipulate the gear shifter. Another example that is common to most transmissions is the oil pan gasket. In fact, seals are required anywhere that a device needs to pass through the transmission case, with each one being a potential source for leaks.

Alternating Direction Method of Multipliers (ADMM)
The Alternating Direction Method of Multipliers (ADMM) is an iterative algorithm that can be used to solve optimization problems.

The method decomposes the original optimization problem into several subproblems, which are then solved in turn, repeating until convergence.

The core idea of ADMM is splitting and coordination: decompose the original problem into multiple subproblems, solve each subproblem iteratively, and coordinate the partial solutions so that they converge to the optimal solution of the original problem.

An advantage of ADMM is that it can be applied to many different types of optimization problems, such as convex and non-convex problems and large-scale problems, giving it broad generality and applicability.
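As a concrete instance of this split-and-coordinate idea, here is a minimal NumPy sketch of ADMM applied to the lasso problem, min over x of (1/2)||Ax - b||^2 + lam*||z||_1 subject to x - z = 0; the problem data and parameter choices are illustrative only.

```python
import numpy as np

def admm_lasso(A, b, lam, rho=1.0, n_iter=200):
    """Minimal ADMM sketch for the lasso: split the smooth term (x) from the l1 term (z)."""
    n = A.shape[1]
    AtA, Atb = A.T @ A, A.T @ b
    # Factor once: the x-update solves the same linear system every iteration.
    L = np.linalg.cholesky(AtA + rho * np.eye(n))
    x = z = u = np.zeros(n)
    for _ in range(n_iter):
        # x-update: quadratic subproblem via the cached Cholesky factor
        x = np.linalg.solve(L.T, np.linalg.solve(L, Atb + rho * (z - u)))
        # z-update: soft thresholding handles the l1 term
        v = x + u
        z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)
        # dual update coordinates the two subproblems
        u = u + x - z
    return z
```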

Edexcel BTEC Level 3 Nationals, Issue 2 (2011): Electrical Equipment Service and Repair

Aim and purpose

The aim of this unit is to provide the learner with the knowledge, understanding and skills required to carry out service and repair on electrical systems within land-based equipment. The learner will need to ensure they comply with current legislation and guidelines to complete this unit. This unit aims to introduce learners to skills and knowledge in the service and repair of electrical systems and how these can be applied in practice. It is designed for learners in centre-based settings looking to progress into the sector or on to further/higher education.

Unit introduction

In this unit learners will develop an understanding of the fundamentals of electrical maintenance and the knowledge and skills required when carrying out electrical maintenance activities. In carrying out these activities, learners will develop knowledge and skills in selecting fault-finding techniques and diagnosing faults. Learners will also develop the skills needed to dismantle, reassemble and carry out routine maintenance on electrical equipment and circuits, such as motors and control systems.

Learners will need to demonstrate an understanding of safe working practices when carrying out fault location and maintenance activities, and take the necessary safeguards to protect their own safety and that of others in the workplace.

Learning outcomes

On completion of this unit a learner should:
1 Be able to perform service and repair operations on electrical systems and their components used in land-based equipment
2 Know the construction, function and operation of electrical systems and circuits and their components.

Unit content

1 Be able to perform service and repair operations on electrical systems and their components used in land-based equipment

Electrical risks: welding, short circuit, battery open circuit, overcharging, reverse polarity

Dismantling and assembly: use of manufacturers' service manuals; parts lists and drawings; approved working procedures

Removal and replacement: eg damaged wires and cables, electrical units/components, termination and connection, soldering and de-soldering; appropriate tools and equipment; approved working procedures

Inspection and maintenance routines: maintenance routines eg power supplies and/or batteries, electrical equipment and circuit components, devices and systems, wiring harnesses, connectors and connections, earthing; inspection and functional testing eg voltage, current, continuity, resistance, battery condition, wear, overheating, missing or loose fittings, carrying out adjustments as necessary; recording of condition; the use of maintenance manuals and documentation

Types of instruments: eg multimeter, light meter, Power Probe

Fault diagnosis techniques: eg use of fault-finding aids, functional charts, diagrams, troubleshooting charts, six point (collect evidence, analyse evidence, locate fault, determine and remove cause, rectify fault, check system), half split, input/output, unit substitution, emergent sequence, component data sheets, operation and maintenance manuals, software-based records and data, visual examination, fault/repair reporting, final test handover procedures

Repairs to manufacturers' specifications: eg starting systems, charging systems, safety and/or circuit protection systems, ignition systems, spark ignition systems, lighting systems, instrumentation systems, ancillary systems

Problem: eg short circuit, open circuit, high resistance, intermittent, partial failure/out-of-specification output, complete breakdowns
Report findings: eg scheduled maintenance report, corrective maintenance report, other company-specific report, job cards, maintenance log

2 Know the construction, function and operation of electrical systems and circuits and their components

System components: electrical supply eg cables and connectors, batteries (lead acid, gel, maintenance free, dry cell); transformers, rectifiers, contactors; circuit components eg capacitors, circuit boards, switches, solenoids, thermistors; devices eg overload protection devices, relays, sensors; use of maker's catalogue or database for selecting replacements

Identification of components and function: series and parallel connections, power supply and battery types, circuit protection devices, fixed and/or variable resistors, diodes, relays, switches, wire types and sizes, electrical consumers

Identification and interpretation of circuit diagrams to include the following: electrical component symbols, colour coding, wire identification and sizing, series and parallel connections; alternating and direct current and the common voltages in use

Principles, construction and function of electrical circuits and their component types: starter circuits eg inertia, pre-engaged; cold start circuits eg heat start, safety start, ignition circuits; charging circuits eg alternators, rectifiers; lighting circuits eg indicators, brake lights, side, head, dip, marker lights, work lights; instrumentation circuits eg fuel, temperature, tachometer, hour meter; spark ignition circuits eg spark generation; ancillary circuits eg wiper motors, stop circuits, ventilation, horn, switches, actuators; safety and/or circuit protection circuits eg battery isolation, safety isolation, fuses and fusible links, thermal switches, over/under voltage switching, relays, RCCB, earth bonding, double insulation
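For the circuit-theory content above (Ohm's law, series and parallel connections), a small illustrative calculation of the kind a learner might verify with a multimeter; the supply voltage and resistor values are made up for the example.

```python
# Illustrative Ohm's law and series/parallel resistance calculations.
V = 12.0                # supply voltage (V), typical of a vehicle battery
R1, R2 = 4.0, 6.0       # example resistances (ohms)

R_series = R1 + R2                    # series: resistances add -> 10 ohms
R_parallel = 1 / (1 / R1 + 1 / R2)    # parallel: reciprocals add -> 2.4 ohms

I_series = V / R_series       # Ohm's law I = V / R -> 1.2 A
I_parallel = V / R_parallel   # -> 5.0 A
print(I_series, I_parallel)
```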
Assessment and grading criteria

In order to pass this unit, the evidence that the learner presents for assessment needs to demonstrate that they can meet all the learning outcomes for the unit. The assessment criteria for a pass grade describe the level of achievement required to pass this unit.

To achieve a pass grade the evidence must show that the learner is able to:
P1 identify electrical circuits and components and their functions from wiring diagrams and visual recognition [IE, CT]
P2 perform tests using equipment and practices to measure and verify the correct operation of electrical systems and their components [SM, IE, TW]
P3 identify and rectify faults in electrical systems and components [RL, EP]
P4 maintain the integrity of electrical systems [EP, IE]
P5 remove, dismantle, rectify faults, repair and reinstate electrical components and circuits to manufacturer's specifications and standards [EP, IE, TW]
P6 identify and interpret electrical circuit diagrams [IE]
P7 summarise Ohm's law, its application and principles [IE]
P8 compare the specification, safe maintenance and charging of different types of battery [RL, IE, TW, CT]
P9 describe the principles, construction and function of electrical circuits and their components [CT]
P10 describe how to remove, dismantle, test, verify, repair and reinstate electrical circuits and their components [CT, RL]
P11 outline risks posed to electrical systems and components by other activities or incidents [IE, RL, SM, EP, CT]

To achieve a merit grade the evidence must show that, in addition to the pass criteria, the learner is able to:
M1 explain the relationship between component faults and the malfunction of a given electrical system
M2 explain the importance of applying safe working practices when carrying out maintenance on an electrical system.

To achieve a distinction grade the evidence must show that, in addition to the pass and merit criteria, the learner is able to:
D1 compare and contrast two fault diagnosis techniques when carrying out maintenance work on an electrical system.

PLTS: This summary references, where applicable in the pass criteria (in the square brackets), the elements of the personal, learning and thinking skills. It identifies opportunities for learners to demonstrate effective application of the referenced elements of the skills.

Key: IE – independent enquirers; CT – creative thinkers; RL – reflective learners; TW – team workers; SM – self-managers; EP – effective participators

Essential guidance for tutors

Delivery

All centres must comply with the requirements of relevant, current legislation and codes of practice, including the Prevention of Accidents to Children in Agriculture Regulations 1998. Learners must be made aware of, and have access to, relevant health and safety legislation and know the importance of the use of risk assessments appropriate to each situation. Appropriate risk assessments must precede all practical machinery activities, and learners must work in a safe manner at all times when using equipment or working with machinery. Learners must be supervised at all times and tutors must not ask learners to undertake tasks that are beyond their physical capabilities.

Delivery of this unit will involve practical assessments, written assessment, visits to suitable collections, and will link to industrial experience placements.

The unit provides an opportunity for learners to work in teams or groups when diagnosing component or system faults.
Delivery of this unit should focus on learners developing diagnostic and practical skills, together with an understanding of electrical components and systems maintenance.

The learning outcomes are ordered logically and it would be reasonable to develop them sequentially throughout the unit. In this way, learners will be able to apply health and safety, and system and component operation, to diagnostic, testing and maintenance techniques. All the learning outcomes suit a practical approach rather than too much time spent in theory lessons: for example, a short introduction to a component (or range of components), the function of the component within the larger system, and the tools necessary to carry out the maintenance task together with any safety considerations, followed by practice. Learners need a broad overview of the different electrical components and systems so they can select and apply the correct maintenance, diagnostic and testing techniques.

Learners will need to ensure they comply with current legislation and guidelines to complete this unit. Evidence may be collected from well-planned investigative assignments or reports of workshop activities. Evidence can be accumulated through learners building up a portfolio from investigations, case studies and maintenance operations through a tutor-led series of assignments, realistic maintenance exercises and tests.

Outline learning plan

The outline learning plan has been included in this unit as guidance and can be used in conjunction with the programme of suggested assignments. The outline learning plan gives an indication of the volume of learning it would take the average learner to achieve the learning outcomes. It is indicative and is one way of achieving the credit value. Learning time should address all learning (including assessment) relevant to the learning outcomes, regardless of where, when and how the learning has taken place.

Topic and suggested assignments/activities and assessment

- Electrical basics: Ohm's law, what is needed in a circuit, series and parallel circuits.
- Assignment 1: Electrical Health and Safety (P10, P11, M2)
- Assignment 2: Maintenance of Electrical Equipment (P1, P2, P4, P5) – practical activities to cover unit content as required.
- Assignment 3: Electrical Theory (P6, P7, P9) – recognising components, working with circuit diagrams, understanding Ohm's law and its application.
- Assignment 4: Batteries (P8) – how batteries work, types of battery, battery application, the future of battery technology.
- Assignment 5: Electrical Fault Finding (P3, M1, D1) – practical assessment activities that capture evidence as described in the unit content.
- Unit review.

Assessment

For P1, P2, P3, P4 and P5, learners are required to demonstrate practices and use of equipment to identify, measure and rectify faults in electrical systems and their components/circuits. All practical tasks and tutor feedback need to be recorded using appropriate documentation.

For P6, P7, P8, P9, P10 and P11, learners must provide information relating to operational task procedures for the service and repair of electrical systems. Evidence may be by way of a project assignment, observed practical test or pictorial presentation with notes using appropriate software, slides or OHPs.

For M1, M2 and D1, learners must provide detailed information on electrical fault diagnosis, malfunctions and safe working practices.
Evidence could be in the form of a report, test or presentation.

Programme of suggested assignments

The following table shows a programme of suggested assignments that cover the pass, merit and distinction criteria in the grading grid. This is for guidance and it is recommended that centres either write their own assignments or adapt any Edexcel assignments to meet local needs and resources.

Criteria covered | Assignment title | Scenario | Assessment method
P10, P11, M2 | Electrical Health and Safety | Electrics can be dangerous, and working on vehicle electrics can damage the components beyond repair. Look at the risks involved with electrical system maintenance and repair. | Assignment.
P1, P2, P4, P5 | Maintenance of Electrical Equipment | As with other systems, electrical systems need maintenance and repair. Carry out tests, maintenance and repairs to electrical systems and components. | Portfolio of evidence.
P6, P7, P9 | Electrical Theory | Electrical systems can become very complex. It is essential that you understand circuit diagrams and the theory of electrical systems. | Open book test.
P8 | Batteries | Though vehicle electrics have changed drastically over a short period of time, the same cannot be said for the humble battery. Investigate the different types of vehicle battery with a view to the future. | Investigative report.
P3, M1, D1 | Electrical Fault Finding | Quick fault diagnosis will save considerable time, effort and expense. Use fault-finding techniques to identify system faults and rectify them. | Practical assessment.

Links to National Occupational Standards, other BTEC units, other BTEC qualifications and other relevant units and qualifications

This unit forms part of the BTEC land-based sector suite. This unit has particular links with:

Level 2: LEO22 Service and Repair Electrical Systems on Land-based Equipment
Level 3: Undertake and Review Work-related Experience in the Land-based Industries

Essential resources

Centres delivering this unit must have access to land-based vehicle standard components and systems, testing instruments and rigs. This unit relies heavily on the learner being able to investigate the manufactured specification of components and service manuals.

Employer engagement and vocational contexts

Visits to vehicle electrics specialist firms in relation to fault finding would be of benefit to learners, as would visits to manufacturing organisations or similar with a focus on electrical components, their installation and service requirements. Learners will be made aware of the vast range and scope of electrical components and sensors used in the land-based engineering sector.

Indicative reading for learners

Textbooks

Health and Safety Executive – Essentials of Health & Safety at Work (HSE, 1995) ISBN 071760716X
Adams J – Electrical Safety: A Guide to the Causes and Prevention of Electrical Hazards (Institution of Electrical Engineers, 1994) ISBN 085296806X

Websites

RS – Europe's leading distributor of electronic, electrical and industrial components.
An online community dedicated to providing visitors with the ability to research, share, and discuss solutions and tips for completing day-to-day tasks and projects.
Health and Safety Executive.

Delivery of personal, learning and thinking skills (PLTS)

The following table identifies the PLTS opportunities that have been included within the assessment criteria of this unit:

Skill | When learners are ...
Independent enquirers | carrying out fault finding on electrical systems
Creative thinkers | carrying out fault finding on electrical systems
Reflective learners | making comparisons between components and systems
Team workers | gathering test data
Self-managers | investigating faults
Effective participators | gathering test data.

Although PLTS opportunities are identified within this unit as an inherent part of the assessment criteria, there are further opportunities to develop a range of PLTS through various approaches to teaching and learning.

Skill | When learners are ...
Independent enquirers | planning and carrying out research activities related to the unit; evaluating and carrying out extended thinking
Creative thinkers | asking questions to extend their thinking during lectures and practical sessions; trying out alternatives or new solutions
Reflective learners | identifying opportunities for their own achievements
Team workers | assisting in group activities
Self-managers | setting own targets for accurate completion of work; asking for assistance
Effective participators | encouraging debate.

Functional Skills – Level 2

Skill | When learners are ...
ICT – Use ICT systems: select, interact with and use ICT systems independently for a complex task to meet a variety of needs; use ICT to effectively plan work and evaluate the effectiveness of the ICT system they have used; manage information storage to enable efficient retrieval; follow and understand the need for safety and security practices; troubleshoot; select and use ICT to communicate and exchange information safely, responsibly and effectively, including storage of messages and contact lists | using ICT-based systems to define component functionality
Mathematics: understand routine and non-routine problems in a wide range of familiar and unfamiliar contexts and situations; identify the situation or problem and the mathematical methods needed to tackle it; select and apply a range of skills to find solutions | describing Ohm's law and its application; using electrical fault-finding techniques
English: speaking and listening – make a range of contributions to discussions and make effective presentations in a wide range of contexts; reading – compare, select, read and understand texts and use them to gather information, ideas, arguments and opinions | discussing fault-finding techniques.

A Sparse Subspace Clustering Method Based on Fractional Function Constraints

Document code: A; CLC number: TP391; doi: 10.3778/j.issn.1002-8331.1909-0147
Abstract: This paper proposes a novel sparse subspace clustering model based on fractional function constraints, in order to overcome a shortcoming of existing sparse subspace clustering algorithms: the coefficient matrix they obtain does not accurately reflect the sparsity of the data distribution in high-dimensional space. The model is solved by an alternating direction iteration method. In the noise-free case, it is proved that the coefficient matrix obtained by this method has a block-diagonal structure, which provides a theoretical guarantee that it recovers the data structure accurately. In the noisy case, the fractional function constraint is also used as the regularization term for outlier noise, improving the robustness of the model. Experimental results on artificial data sets, the Extended Yale B database and the Hopkins155 data set show that the sparse subspace clustering method based on fractional function constraints not only improves the accuracy of the clustering results but also improves robustness to outlier noise.
Key words: fractional function; sparse representation; block diagonal structure; subspace clustering; spectral clustering

Introduction to Structured Low-Rank Matrix Factorization Methods


Find a polynomial close to p(x) with multiple roots.
• The Sylvester matrix of p(x) and p'(x) is a (2n − 1) × (2n − 1) matrix formed from the coefficients of p(x) and p'(x).
• For example, if n = 4 and p(x) = a_4 x^4 + a_3 x^3 + a_2 x^2 + a_1 x + a_0, then

  S_{p,p'} =
  \begin{pmatrix}
  a_4 & a_3 & a_2 & a_1 & a_0 & 0 & 0 \\
  0 & a_4 & a_3 & a_2 & a_1 & a_0 & 0 \\
  0 & 0 & a_4 & a_3 & a_2 & a_1 & a_0 \\
  4a_4 & 3a_3 & 2a_2 & a_1 & 0 & 0 & 0 \\
  0 & 4a_4 & 3a_3 & 2a_2 & a_1 & 0 & 0 \\
  0 & 0 & 4a_4 & 3a_3 & 2a_2 & a_1 & 0 \\
  0 & 0 & 0 & 4a_4 & 3a_3 & 2a_2 & a_1
  \end{pmatrix}

• p(x) has a multiple root ⟺ det(S_{p,p'}) = 0 (this determinant is the resultant of p and p').
• Goal: find a low-rank approximation to S_{p,p'} with the above structure.
• More generally, the Sylvester matrix of p(x) and q(x), of degrees m and n, is an (m + n) × (m + n) matrix formed from the coefficients of p(x) and q(x).
• For example, if m = 3 and n = 2,

  S_{p,q} =
  \begin{pmatrix}
  p_3 & p_2 & p_1 & p_0 & 0 \\
  0 & p_3 & p_2 & p_1 & p_0 \\
  q_2 & q_1 & q_0 & 0 & 0 \\
  0 & q_2 & q_1 & q_0 & 0 \\
  0 & 0 & q_2 & q_1 & q_0
  \end{pmatrix}
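As a concrete illustration (our own sketch, not part of the original notes), the following Python function builds the Sylvester matrix of two polynomials from their coefficient lists and uses det(S_{p,p'}) to test for a multiple root; the example polynomial is chosen to have the double root x = 1.

```python
import numpy as np

def sylvester_matrix(p, q):
    """Sylvester matrix of polynomials given as coefficient lists in
    decreasing degree; p has degree m, q has degree n. Returns the
    (m + n) x (m + n) matrix of shifted coefficient rows."""
    m, n = len(p) - 1, len(q) - 1
    S = np.zeros((m + n, m + n))
    for i in range(n):              # n shifted rows of p's coefficients
        S[i, i:i + m + 1] = p
    for i in range(m):              # m shifted rows of q's coefficients
        S[n + i, i:i + n + 1] = q
    return S

# p(x) = x^4 - 2x^3 + 2x - 1 has the double root x = 1,
# so det S(p, p') should vanish (up to floating-point error).
p = np.array([1.0, -2.0, 0.0, 2.0, -1.0])
dp = np.polyder(p)                  # coefficients of p'(x)
print(np.linalg.det(sylvester_matrix(p, dp)))  # ~ 0
```

The same function reproduces the 5 × 5 example above with p = [p_3, p_2, p_1, p_0] and q = [q_2, q_1, q_0].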

Applying Alternating Direction Method of Multipliers for Constrained Dictionary Learning

A. Rakotomamonjy (*)
LITIS EA4108, University of Rouen
(*) Corresponding author. Email address: alain.rakotomamonjy@insa-rouen.fr (A. Rakotomamonjy). Complete postal address: LITIS EA4108, University of Rouen, Avenue de l'Université, 76800 Saint Etienne du Rouvray, France.

Abstract

This paper revisits the problem of dictionary learning on several points, although we still consider an alternating optimization scheme. Our first contribution consists in providing a simple proof of convergence of this numerical scheme for a large class of constraints and regularizers on the dictionary atoms. We also investigate the use of a well-known optimization method, the alternating direction method of multipliers, for solving each alternate step of the dictionary learning problem. We show that such an algorithm yields several benefits. Indeed, it can be more efficient than competing algorithms such as the Iterative Shrinkage-Thresholding approach and, besides, it allows one to easily deal with mixed constraints or regularizers over the dictionary atoms or approximation coefficients. For instance, we have induced joint sparsity, positivity and smoothness on the dictionary atoms by means of total variation and sparsity-inducing regularizers. Our experimental results show that using these mixed constraints helps in achieving better learned dictionary elements, especially when learning from noisy signals.

1. Introduction

In recent years, a lot of work has been devoted to the problem of sparse representation of signals and images. Several research communities [24, 36, 15] have focused on this problem in order to develop new tools for analyzing signals or images, to select features for discrimination tasks, or to study the theoretical properties of sparse representations. This large success of "sparsity" is essentially due to the fact that many real-world signals or natural images can be represented as a linear combination of a few representative elements, denoted as atoms or dictionary elements.

One of the key problems related to sparse representation is the choice of the dictionary on which the signal of interest is decomposed. One simple approach is to consider an off-the-shelf dictionary such as a wavelet basis, a wavelet packet basis, Gabor atoms or a Discrete Cosine basis. A recent trend, which has achieved state-of-the-art results on many low-level signal and image processing tasks, is to learn the dictionary itself from the data [13, 20, 25, 19, 12]. Several algorithms for dictionary learning have been proposed, and most of them are based on an alternating optimization scheme which involves a signal sparse coding step and a dictionary optimization step [2, 20, 13, 24, 18, 33]. Dictionary learning algorithms have been successfully applied to a wide range of applications, including face recognition [35], image reconstruction [21, 34] and image denoising [12].

Recently, Yaghoobi et al. [33] have proposed a novel algorithm for dictionary learning. Their approach consists in majorizing the data-fidelity term and then performing alternating optimization of this majorizing objective function. Owing to a clever majorization, their algorithm is simply based on several resolutions of constrained or regularized matrix-valued approximations. When looked at in closer detail, at each alternating optimization stage, the algorithm they propose is actually equivalent to a Landweber iteration [17] or an Iterative Shrinkage-Thresholding
algorithm (ISTA) [16]. The very nice point of their dictionary learning method is that it can easily handle constraints over the dictionary atoms.

In this paper, we focus on the problem of constrained dictionary learning, which we also address through an alternating optimization scheme. One of our contributions is to provide a simple and graceful proof of convergence of such an alternating scheme for a large class of constraints and regularizers, independently of the algorithms used for solving each alternating stage. Then, we propose to tackle the dictionary learning problem through efficient algorithms that can also easily handle constraints or regularizers on the dictionary elements. More specifically, we investigate novel algorithms for solving the dictionary learning problem in which the constraints on the dictionary atoms actually involve several constraints or regularizers. The approach we promote is based on the alternating direction method of multipliers (ADMM) [11, 1]. Unlike the algorithm of Yaghoobi et al., which takes into account only first-order gradient-based information, by doing so we involve, at each alternating optimization step, some second-order information (related to the Hessian's inverse) on the problem which, as experimentally shown, helps us in solving the dictionary learning problem efficiently for medium-scale problems and when several complex constraints are considered. As made clear in the sequel, the gain in efficiency brought by the ADMM approach can be substantial.

Another of our contributions is to provide evidence of the benefits of using mixed constraints for dictionary learning. While several works on dictionary learning acknowledge the need to learn atoms under some norm constraints [13], few works have addressed the situation where several mixed constraints or regularizers are in play [33, 24]. Although algorithms dealing with compound constraints or regularizers have recently been proposed in other application domains [1, 28], we provide in this paper a simple instantiation of the ADMM approach for coping with several constraints that can be useful for a dictionary learning problem. Then, owing to this algorithm, we show that alternating optimization methods for dictionary learning can handle constraints and regularizations that induce the learned dictionary atoms to be, for instance, jointly sparse, smooth, positive and unit-norm. Experimental results show the benefit of these mixed constraints, especially when learning a dictionary from noisy signals.

The paper is organized as follows. Section 2 introduces the dictionary learning framework we are dealing with and reviews works related to the one we present. The ADMM algorithm we promote for solving the problem is presented and discussed in Section 3. Numerical experiments on toy signal denoising and image denoising problems are presented in Section 4. They clearly show that our ADMM approach is faster than first-order algorithms and that using appropriate constraints on the dictionary atoms leads to better denoising. The code that has been used for producing the results and figures is available for reviewing purposes at http://asi.insa-rouen.fr/enseignant/~arakoto/code/dicolearn.html.

2. Dictionary learning

In this section, we formally state the dictionary learning problem we are interested in and present the current state-of-the-art methods for solving this problem.

2.1. Simultaneous sparse approximation and dictionary learning

The problem of simultaneous sparse approximation (SSA) [31, 29] consists in looking for a sparse decomposition, over a fixed dictionary, of a set of signals under the
constraint that, to some extent, they all share the same sparsity profile. Formally, this translates as follows. Suppose we have a set of L signals X = [x_1, x_2, ..., x_L], with X ∈ R^{N×L}, and a dictionary D ∈ R^{N×M}; then the SSA problem is:

\min_{A \in \mathbb{R}^{M \times L}} \; \frac{1}{2}\|X - DA\|_F^2 + \lambda \Omega(A)    (1)

where Ω(A) is a sparsity-inducing regularizer on A and λ a trade-off parameter that balances the data-fitting term (the square loss) and the regularization term. Typically, for inducing a shared sparsity profile on the signal approximations, which means that all x_i should preferably be approximated by the same set of dictionary elements, one can consider a mixed-norm regularizer over the rows of A of the form:

\Omega(A) = \left( \sum_{i=1}^{M} \|A_{i,\cdot}\|_q^p \right)^{1/p}    (2)

where typically p = 1 and q ∈ {2, ∞} [10, 31]. Note that, for all matrices, we denote by A_{i,·} and A_{·,j} the i-th row and the j-th column. When the sparsity profile is known to differ across the signals to be approximated, one may instead consider p = 1 and q = 1 [6], which allows Ω(A) to decouple and thus untie the coefficients of A.

Dictionary learning problems go beyond simultaneous sparse approximation by jointly optimizing over the coefficient matrix A and the dictionary D, leading to the optimization problem

\min_{A,\, D \in \mathcal{D}} \; \frac{1}{2}\|X - DA\|_F^2 + \lambda \Omega(A)    (3)

where 𝒟 is a set that imposes some constraints on the dictionary elements, chosen according to prior knowledge on the problem or so as to help in resolving the scale invariance of the problem. The most frequent constraint imposed through 𝒟, already used in other works, is that each column of D has unit ℓ2 norm, or that the Frobenius norm of D is unit [13, 20, 33]. In a more general form, we consider the following problem for dictionary learning:

\min_{A, D} \; \Phi(A, D) = \frac{1}{2}\|X - DA\|_F^2 + \lambda_A \Omega_A(A) + \lambda_D \Omega_D(D)    (4)

where Ω_A(·) and Ω_D(·) are general regularizers on A and D that can involve mixed terms (e.g. non-negativity and sparsity) or indicator functions related to projections onto convex sets defining some constraints (e.g. unit-norm dictionary elements). We will provide more details on these regularizers in the sequel.

2.2. Related works

Several methods for solving the dictionary learning problem given in Equation (4) have been investigated in the last decade. These algorithms usually solve problem (4) by means of an alternating optimization procedure, which is summarized in Algorithm 1. Basically, this algorithm consists in alternately learning the sparse approximation coefficient matrix A with the dictionary considered fixed, and then updating the dictionary with the coefficient matrix A fixed.

Algorithm 1: Alternating optimization for dictionary learning
1: set k = 1, initialize A^(1), D^(1)
2: repeat
3:   A_{k+1} = argmin_A Φ(A, D_k)
4:   D_{k+1} = argmin_D Φ(A_{k+1}, D)
5:   k ← k + 1
6: until stopping criterion is met

For instance, Engan et al. [13] consider the problem with no constraints on A and impose unit-norm dictionary elements. Accordingly, the solution at each alternate step can be straightforwardly obtained by solving the related least-squares problem over A or D. For the D update, assuming that (A_k A_k^⊤)^{-1} exists (for instance when L ≥ M), this gives:

D_{k+1} = X A_k^\top (A_k A_k^\top)^{-1}    (5)

and unit-norm atoms are obtained by normalizing each column of D_{k+1}. In a similar way, Kreutz-Delgado et al. have proposed an alternating optimization algorithm that solves a Bayesian interpretation of the dictionary learning model [20].
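To make Algorithm 1 and the MOD-style update of Equation (5) concrete, here is a minimal Python sketch (ours, not the authors' code); `sparse_code` stands for any SSA solver of problem (1) and is a hypothetical placeholder.

```python
import numpy as np

def mod_dictionary_learning(X, M, sparse_code, n_iter=50):
    """Sketch of Algorithm 1 with the dictionary update of Equation (5):
    D = X A^T (A A^T)^{-1}, followed by column normalization.
    `sparse_code(X, D)` is any routine returning the coefficient matrix A."""
    N, L = X.shape
    rng = np.random.default_rng(0)
    D = rng.standard_normal((N, M))
    D /= np.linalg.norm(D, axis=0)          # unit-norm atoms
    for _ in range(n_iter):
        A = sparse_code(X, D)               # A-step: D fixed
        # D-step (Equation (5)); requires A A^T invertible (L >= M)
        D = X @ A.T @ np.linalg.inv(A @ A.T)
        D /= np.linalg.norm(D, axis=0)      # renormalize the atoms
    return D
```

In practice a small ridge can be added to A A^T before inversion for numerical stability.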
The recent work of Yaghoobi et al. [33] investigates problem (4) when several simple types of constraints and regularizations on D (typically unit-norm constraints or Frobenius-norm boundedness) as well as sparsity-inducing regularizers on A are in play. For instance, they consider both unit Frobenius-norm or unit ℓ2-norm constraints, jointly with sparsity-inducing regularizers of the same form as the one given in Equation (2), on the dictionary atoms. The main idea of their algorithm is based on the use of a surrogate function that majorizes the data-fidelity term ‖X − DA‖_F^2 in problem (4). For instance, when updating the dictionary D, they replace the problem of minimizing Φ(A_k, D) with respect to D by the following surrogate problem:

\min_D \; \frac{1}{2}\|X - DA_k\|_F^2 + \lambda_D \Omega_D(D) + \frac{C_D}{2}\|D - D_{k-1}\|_F^2 - \frac{1}{2}\|DA_k - D_{k-1}A_k\|_F^2    (6)

where C_D > ‖A_k A_k^⊤‖_2 is a constant ensuring that the majorization holds. In order to get a better insight into how this surrogate approach works, we provide below the solution of this minimization problem. Let us denote by J the objective function of Equation (6). By expanding all the square norms, we can show that

2J = \|X\|_F^2 - 2\,\mathrm{tr}(X^\top D A_k) + \|DA_k\|_F^2 + 2\lambda_D \Omega_D(D) + C_D\|D\|_F^2 - 2C_D\,\mathrm{tr}(D^\top D_{k-1}) + C_D\|D_{k-1}\|_F^2 - \|DA_k\|_F^2 + 2\,\mathrm{tr}(A_k^\top D^\top D_{k-1} A_k) - \|D_{k-1}A_k\|_F^2
   = C_D\|D\|_F^2 - 2C_D\,\mathrm{tr}\!\left(D^\top \Big(D_{k-1} + \tfrac{1}{C_D}(X - D_{k-1}A_k)A_k^\top\Big)\right) + 2\lambda_D \Omega_D(D) + G

where G gathers all the terms that do not depend on D. Note that the second equality holds because tr(A_k^⊤ D^⊤ D_{k-1} A_k) = tr(D^⊤ D_{k-1} A_k A_k^⊤) and tr(X^⊤ D A_k) = tr((DA_k)^⊤ X) = tr(D^⊤ X A_k^⊤). According to this equality, we can deduce that the problem given in Equation (6) is equivalent to the following one:

\min_D \; \frac{1}{2}\Big\|D - \Big(D_{k-1} + \tfrac{1}{C_D}(X - D_{k-1}A_k)A_k^\top\Big)\Big\|_F^2 + \frac{\lambda_D}{C_D}\Omega_D(D)    (7)

This problem can easily be solved by introducing the notion of proximal operator, a key element of several well-known algorithms such as forward-backward splitting [8, 7] or the Iterative Shrinkage/Thresholding Algorithm (ISTA) [16]. The proximal operator [26] of a lower semi-continuous convex function Ω is defined as

\mathrm{prox}_{\Omega}(\hat{D}) = \arg\min_D \; \frac{1}{2}\|D - \hat{D}\|_F^2 + \Omega(D)

and is unique and well defined. For interested readers, a comprehensive review of the basic properties of proximal operators and of optimization methods based on them has recently been given in [9]. Note that for several functions Ω(D), this proximal operator has a closed-form solution.
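For illustration, two such closed-form proximal operators can be written in a few lines of Python (our sketch; the function names are ours): entrywise soft thresholding for Ω(D) = λ Σ_{i,j} |d_{i,j}|, and the projection associated with a unit-norm column constraint.

```python
import numpy as np

def prox_l1(D_hat, t):
    """Proximal operator of t * sum_ij |d_ij|:
    entrywise soft thresholding."""
    return np.sign(D_hat) * np.maximum(np.abs(D_hat) - t, 0.0)

def proj_unit_columns(D_hat, t=None):
    """Proximal operator of the indicator of {D : ||d_m||_2 <= 1},
    i.e. projection of each column onto the unit l2 ball
    (the threshold argument is unused and kept for a uniform API)."""
    norms = np.maximum(np.linalg.norm(D_hat, axis=0), 1.0)
    return D_hat / norms
```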
by the machine learning community[22,24].Efforts have essentially been spent for designing appropriate regularizers or constraints on the matrix code A so as to take into account some knowledge over the learning problems.For instance,Jenatton et al.have investigated methods that are able to exploit a hierarchical relationship between the dictionary elements through the definition of appropriate regularizersΩA(A)[18].They have also proposed methods that are able to learn some structures on the dictionary elements[19].As an example,they imposed atoms to have contiguous non-zeros patterns.These works are relevant to ours since the optimization strategy they consider is based,as in Yaghoobi et al.[33]on ISTA and its fast version FISTA.Indeed,they use an alternating optimization scheme in which the building block of the sparse coding and dictionary update involves afirst-order gradient step and the application of a proximal operator.In addition,they introduce some novel regularizers and constraints that can be mixed with other ones through the framework we promote.The K-SVD of Aharon et al.[2]presents a different perspective on the dictionary learning problem.Indeed, their dictionary update step also involves some updates of the approximation coefficients.This update is performed through a SVD decomposition on a representation error matrix of the set of signal X.The resulting decomposition defines a novel dictionary element and the approximation coefficients related to that atom. This SVD decomposition is performed M times(M being the dictionary size)for each dictionary update in the alternating optimization algorithm(1).6It is also interesting to mention that two works have lately investigated the problem of online dictionary learning[30,24].At the contrary of the batch approaches presented above,in the online framework it is supposed that signal examples x i are available on thefly.Skretting et al.proposes an extension of the method of optimal direction based on recursive least-squares for dealing with this online framework while Mairal et al.propose a stochastic approximation algorithm that is able to deal with constraints on A and D. These methods are those tailored to very large-scale(in the number of signals)problems.The approach we describe in this paper is more suitable for medium-scale problems since it needs the computation of an inverse Hessian matrix which dimension depends on the number of dictionary elements.However,our experimental results show that our approach is still competitive compared to other batch algorithms such as the one of Yaghoobi et al.for these medium-scale problems.In this paper,we consider the problem of dictionary learning in a batch framework as given by Equation(4). In particular,we are interested in problems where several constraints on the dictionary atoms are imposed. 
Our objective is to efficiently learn dictionary atoms while taking into account some prior knowledge on the dictionary.We note that the MOD algorithm of Engan et al.[13]indeed uses second-order information )−1in the dictionary update but it can not deal with specific constraints on the dictionary atoms.(A k AkContrarily,the approach of Yaghoobi et al.[33]can easily deal with constraints but it only usesfirst-order information in their update stages.We achieve these two objectives(efficiency and mixed-constraints)by means of an alternating direction of multipliers method(ADMM)approach.Our efficient dictionary learning algorithm is still based on the alternating optimization approach as described in Algorithm 1.The novelty we bring is that we solve each of the alternating problem using an ADMM method,which involves some second-order information in the optimization scheme,making it efficient for medium-scale problems.Since the alternating optimization approach nicely decouples the dictionary atom learning stage and the sparse coding step,This allows us to deal with several constraints imposed on the dictionary atoms.Since these mixed-constraints actually define a novel(global)constraint/regularizerΩ(D),we deal with this newΩ(D)by numerically computing its proximal operator.For this purpose,we again use an algorithm which is based on an ADMM.Interestingly,owing to a clever implementation,computing the proximal operator ofΩ(D)only needs the proximal operator of each single constraint composingΩ(D).From our numerical results,we show that our algorithm is indeed more efficient than otherfirst-order methods and that substantial improvements of the learned dictionary can be achieved by using mixed constraints,in particular in a noisy signal setting.73.Alternating Direction Method of Multiplier and Application to Dictionary LearningIn this section,wefirst review augmented Lagrangian method and the alternating direction method of multipliers.Then,we detail how such a framework can be instantiate for the dictionary learning problem and how it can be used to handle constraints over the dictionary elements.3.1.Augmented Lagrangian and Alternating Direction Method of MultipliersWe are interested in an optimization problem which general form is:minUf1(U)+f2(GU)(9) where f1(·)and f2(·)are convex lower semi-continuous functions over the set of matrices,U∈R n1×n2 and G∈R n ×posite optimization problems based on the sum of two functions are emcompassed into this framework by choosing G as the identity matrix.By introducing an auxiliary variable,the above unconstrained problem is equivalent to the following constrained oneminU,Vf1(U)+f2(V)s.t.GU−V=0(10) One advantage of introducing this auxiliary variable V is that it allows the decoupling of the two functions f1and f2since each of them now applies to one specific optimization variable U or V.Because of this property,in some situations,it thus may be easier to solve the equality-constrained problem(10)than the unconstrained one(9).This occurs for instance when f1or f2has a closed-form proximal operator.One typical way for solving such an equality-constrained problem(10)is to deploy an augmented Lagrangian approach[27].Basically,this consists in defining the problem’s Lagrangian augmented with a quadratic term that penalizes U and V not satisfying the equality contraint:L(U,V,Λ)=f1(U)+f2(V)−tr(ΛT(GU−V))+µ2GU−V 2FwithΛbeing a dense matrix of Lagrangian multipliers related to the equality constraints andµa positive parameter that balances the quadratic 
penalization.As described by Nocedal and Wright[27],the idea of augmented Lagrangian method is tofind a saddle point of the Lagrangian L(U,V,Λ)which coincides with the solution of the original problem(10).This is done byfinding a minimizer of the augmented Lagrangian withΛandµfixed,and then in updating these parameters and repeating these two steps until convergence. ForfixedΛandµ,after some simple algebras,minimizing the Lagrangian can be shown to be equivalentmin U,V f1(U)+f2(V)+µ2GU−V−Λµ2F(11)leading thus to the iterative algorithm given in Algorithm2with Z being1µΛ.8Algorithm2:Augmented Lagrangian method1:set k=1,initialize U1,V1,Z12:repeat3:(U k+1,V k+1)=argmin U,V f1(U)+f2(V)+µ2 GU−V−Z k 24:Z k+1=Z k+V k+1−GU k+15:k←k+16:until stopping criterion is metAlgorithm3:Alternating Direction Method of Multipliers1:set k=1,initialize V(1),Z(1)2:repeat3:U k+1=argmin U f1(U)+µ2 GU−V k−Z k 2F4:V k+1=argmin V f2(V)+µ2 GU k+1−Z k−V 2F5:Z k+1=Z k+V k+1−GU k+16:k←k+17:until stopping criterion is met8:output:U k+1This minimization problem is an unconstrained optimization problemjointly on U and V,involving in the objective function both f1(·),f2(·)and a quadratic term coupling U and V.Hence,it seems that little has been gained by introducing the auxiliary variable V and by considering Augmented Lagrangian approach. However,one should note that problem(11)provides us with an useful feature since f1and f2are now separable in U and V.This separability suggests thus the use of a block-coordinate descent(BCD)for solving problem(11).Alternating direction method of multipliers exploits this idea of block-coordinate optimization instead of a joint optimization on the two variables.The idea of ADMM is to optimize alternatively over U and V. Interestingly,ADMM considers only a single iteration of a block-coordinate descent over U and V and then updates the Lagrangian multipliers.This is different from an exact implementation of a BCD algorithm which would have alternated over U and V until convergence before updating the multipliersΛ.This results in the ADMM approach described in Algorithm3.More details on ADMM can be found in[11,14,4].Note that thefirst minimization problem in the ADMM algorithm involves a regularized least-square problem, where the design matrix is G and the regularizer is f1(U),its resolution thus can be expensive.Instead, the minimization of the second problem makes use of the proximal operator of1µf2(V),which can be cheap to compute if it has a closed-form solution.To make clear this connection,we can simply rewrite Line4of Algorithm3)asargminV 12V−(GU k+1−Z k) 2F+1µf2(V)which by definition is prox1µf2(V)(GU k+1−Z k).One of the very nice feature of this alternating direction method of multipliers is that it converges towards9a minimizer of problem(9)or(10)under the simple hypotheses that G is full-column rank and f1and f2 are proper convex functions[11]and this for an arbitrary choice ofµ.In addition,the convergence proof of ADMM methods given by Eckstein and Bertsekas states that,even if the two minimization steps(Lines 3and4)in the algorithm are not solved exactly,then as long as the two sequences of optimization errors are absolutely summable[11],the sequence generated by the ADMM algorithm still converges towards a solution of the problem(10).This implies that in practice,if the proximal operator of f2has to be computed numerically,the ADMM algorithm is still guaranteed to convergence as long as the numerical optimization error sufficiently decreases,which means that the stopping 
criterion of the proximal operator computation has to be tighter as ADMM iterations go.3.2.Application of ADMM to dictionary learning:global frameworkThe approach we propose for solving the dictionary learning problem is still based on the alternating approach presented in Algorithm1.Each of these minimization problems is here solved by means of an ADMM algorithm.According to the general problem of dictionary learning given in Equation(4),we define the functions f1(·), f2(·)asf1(A,D)=12X−DA 2F and f2(A,D)=λAΩA(A)+λDΩD(D)andΦ(A,D)=f1(A,D)+f2(A,D).NowΦ(A,D k)andΦ(A k,D)are clearly defined for each minimization step of Algorithm1.Before delving into the details of how ADMM have been implemented for solving these problems,we provide a proof of convergence of the alternating minimization scheme given in Algorithm1for dictionary learning.Proposition1.Suppose that the constraints and regularizers on the dictionaryΩD(D)and on the coefficient matrixΩA(A)are convex,possibly non-smooth and lower semicontinuous functions,leading to the dictionary learning problem given in Equation(4),then Algorithm1converges towards a stationary point of Equation (4).Proof.Note that as a quadratic function f1(A,D)is continuous and differentiable on its domain and Φ(A,D k)andΦ(A k,D)are convex respectively in A and D.Furthermore,it can be easily shown that f1(A,D)is coercive and thus the level sets of f1(A,D)+f2(A,D)are bounded.Owing to all these prop-erties,we invoke Theorem5.1of[32]and we can conclude that the sequence of{A k,D k}generated by Algorithm1,which is a block-coordinate descent with a cyclic rule,converges towards a stationary point of problem(4).10We want to emphasize that this proof of convergence is independent to how each alternate step is solved (by means of ADMM or any other methods).Besides being simpler,it is thus more general than the one provided by Yaghoobi et al.[33]which is specific to their algorithm.Interestingly,convergence holds for a large class of contraints and regularizers on the dictionary and the coefficient matrices.In accordance with the alternating optimization scheme given in Algorithm1,we now want to solve either minΦ(A,D k)or minΦ(A k+1,D)using the ADMM scheme described in Algorithm3.In order to make the parallel between the ADMM notations in Equation(9)and the dictionary learning context,we rewrite the dictionary update minimization problem as(with D=U))min U 12X−UA k+1 2Ff1(A k+1,D)+λAΩ(A k+1)+λDΩD(U)f2(A k+1,D)(12)and the weight update problem asmin U 12X−D k U 2Ff1(A,D k)+λAΩ(U)+λDΩD(D k)f2(A,D k)(13)The next subsections describe how we solve each of these alternating step through the ADMM scheme given in Algorithm3.3.3.Solving thefirst ADMM stepFor both the dictionary and weight update problems given in Equations(12)and(13),we naturally set G=I so that the auxiliary variable V is a direct copy of the variable to optimize A or D denoted as U in the sequel,regardless of the problem solved.Hence,the minimization problem which considers f1(·,·)in the ADMM approach(line3of Algorithm3), boils down to be a least-square problem.For the D update problem,i.e U=D,the problem ismin U 12X−UA k+1 2F+µ2U−V k−Z k 2FA closed-form solution can be straightforwardly obtained by setting the gradient of this problem to zero leading toU k+1=(XA T k+1+V k+Z k)(A k+1A T k+1+µI)−1(14) Similarly,when A is updated while D being keptfixed,we have:U k+1=(D T k D k+µI)−1(D T k X+V k+Z k)(15)In these two update equations,the terms A k+1A Tk+1+µI and D TkD k+µI can be interpreted as the regularizedHessian 
of respectively f1(A k+1,D)and f1(A,D k+1).Compared to the dictionary learning algorithm of11。
