2010 A biologically inspired neural network for image enhancement


Primary School (Book 1): 14th English Midterm Exam, Unit 4


Primary School (Book 1) English Unit 4 Midterm Exam, English Test
Part I. Comprehensive questions (50 questions, 1 point each, 100 points in total; unanswered or incorrect items receive no credit)
1. In conclusion, my dream pet would be a ______ because it would bring joy and excitement to my life. I can't wait for the day when I can have one!
2. The ancient Greeks are known for their ________ and philosophy.
3. What is the name of the famous American singer known for her powerful voice and hit song "I Will Always Love You"? A. Aretha Franklin B. Whitney Houston C. Mariah Carey D. Adele  Answer: B. Whitney Houston
4. How many stars are on the U.S. flag? A. 50 B. 48 C. 52 D. 54  Answer: A
5. My favorite subject is _____. (science)
6. We get milk from ______.
7. The seagull is commonly found near ______ (coast).
8. The dog is ______ with its ball. (playing)
9. I created a scavenger hunt with my _________ (toys).
10. What is the name of the famous author who wrote "The Great Gatsby"? A. F. Scott Fitzgerald B. Ernest Hemingway C. Mark Twain D. J.D. Salinger
11. Ancient ________ (religions) influenced people's lives and beliefs.

Primary School (Book 1): 14th English Midterm Exam, Unit 1 (with Answers)


Primary School (Book 1) English Unit 1 Midterm Exam (with Answers), English Test
Part I. Comprehensive questions (100 questions, 1 point each, 100 points in total; unanswered or incorrect items receive no credit)
1. The chemical symbol for nickel is _____.
2. I make _____ (dinner) for my family.
3. We are going to the ___ (beach) this summer.
4. The rabbit hops over the ______.
5. What do you call a collection of poetry published together? A. Anthology B. Collection C. Volume D. Book  Answer: A
6. A _______ (baby whale) can sing songs underwater.
7. A _____ (plant research program) can address global challenges.
8. I enjoy making ________ (birthday cakes) for friends.
9. My mom is a great __________ (parent) who supports us.
10. The __________ (cliff) is dangerous but beautiful.
11. A __________ is a type of chemical bond formed by sharing electrons.
12. A saturated fat is ______ at room temperature.
13. My grandpa enjoys gardening ____.
14. My teacher is _______ (friendly).
15. Solids have tightly packed ______.
16. The classroom is _____ (clean/dirty).
17. What do you call the process of plants making their own food? A. Photosynthesis B. Respiration C. Fermentation D. Transpiration  Answer: A
18. We have a ______ (rich) calendar of events.
19. A jellyfish has a gelatinous ______ (body).
20. _____ (Temperate) plants can survive in seasonal changes.
21. My dad is a strong __________ (supporter) of my education.
22. A cat's purring can soothe ______ (anxious) feelings.
23. The antelope gracefully moves through the grasslands, a testament to speed and ____.
24. My aunt is very _______ (adjective). She always _______ (verb).
25. Many flowers are ______ (annual) and die after one season.
26. The capital of the Cayman Islands is __________.
27. I enjoy playing in the ______ (autumn) leaves when they turn bright ______ (colors).
28. They are ___ a movie. (watching)
29. I enjoy ______ (exploring) the world around me.
30. The element with the chemical symbol Fe is _______.
31. I find _____ (joy) in reading.
32. The chemical formula for silver acetate is _______.
33. The ____ (Renaissance) artists were supported by wealthy patrons.
34. I have _____ (three/four) pets.
35. What is the coldest season of the year? A. Spring B. Summer C. Fall D. Winter  Answer: D. Winter
36. What is the name of the sweet food made from chocolate and cream? A. Ganache B. Frosting C. Mousse D. Pudding  Answer: C
37. A ____ (community development) focuses on improving living conditions.
38. The process of combining elements to form compounds is called ______.
39. A hamster can run for hours on its ______ (wheel).
40. A __________ is a common example of a base.
41. The museum is very _______ (educational).
42. What is the main ingredient in sushi? A. Noodles B. Rice C. Bread D. Potatoes  Answer: B
43. I can ______ (dance) with my friends.
44. What is the name of the famous landmark in the USA? A. Statue of Liberty B. Washington Monument C. Golden Gate Bridge D. All of the above  Answer: D. All of the above
45. She is a friendly ________.
46. I want a pet _______ (fish).
47. I like to _______ (paint) with watercolors.
48. A __________ is a narrow valley.
49. The __________ helps some animals to glide through the air.
50. The chemical formula for boric acid is ______.
51. The playground is ________ (suitable for children).
52. She is a _____ (historian) who studies ancient civilizations.
53. I go to school by ______.
54. What is the name of the famous painting by Van Gogh? A. The Starry Night B. The Scream C. Girl with a Pearl Earring D. The Mona Lisa  Answer: A. The Starry Night
55. The chemical name for H2O is _______.
56. What do we call the famous American holiday celebrated on July 4th? A. Thanksgiving B. Independence Day C. Memorial Day D. Labor Day  Answer: B
58. The ancient Egyptians kept _______ (cats) as pets.
59. The ancient Romans had a system of laws known as ________.
60. The ancient Romans built _____ to celebrate their victories.
61. I love to explore ________ (villages) during vacations.
62. I think animals are very _______ (adjective). They bring joy and _______ (happiness) to our lives.
63. A __________ is a small body of water, usually smaller than a lake.
64. The ____ (Magna Carta) was signed in 1215 to limit the power of the king.
65. The ancient Greeks believed in the importance of ________ (art).
66. What is 60 ÷ 3? A. 15 B. 20 C. 25 D. 30  Answer: B
67. What do you call the person who helps you in a gym? A. Trainer B. Chef C. Doctor D. Teacher  Answer: A
68. The apples are _______ (ripe) and ready to eat.
69. A ______ has a unique pattern on its fur.
70. The ____ is the imaginary line that divides the Earth into northern and southern halves.
71. The chemical formula for magnesium oxide is _____.
72. Which animal lives in a den? A. Wolf B. Eagle C. Fish D. Frog  Answer: A
73. The penguin waddles across the ______ (ice).
74. My mom enjoys __________ (gathering with friends).
75. In _____ (Japan), sushi is a popular dish.
76. My brother is my best _______ who plays games with me.
78. In the garden, I planted _____ (many kinds of) vegetables like carrots and tomatoes.
79. The ______ teaches us about climate change.
80. Carbon dioxide is produced when we __________ (breathe).
81. The crow is known for its ________________ (intelligence).
82. A squirrel's diet consists mainly of ______ (nuts) and grains.
83. The chemical formula for glucose is ______.
84. The chemical symbol for promethium is _____.
85. How many colors are in a standard rainbow? A. 5 B. 6 C. 7 D. 8  Answer: C
86. The Berlin Wall fell in _____.
87. The reaction between an acid and a base produces ______.
88. The forecast says it might ______ (rain) this evening.
89. My teacher teaches us ____. (My teacher teaches us.)

2010 Professional Title English Test (Comprehensive Category, Level A): Past Paper with Answers


Almost Human?
Scientists are racing to build the world's first thinking robot. This is not science fiction: some say they will have made it by the year 2020. Carol Packer reports.

Machines that walk, speak and feel are no longer science fiction. Kismet is the name of an android which scientists have built at the Massachusetts Institute of Technology (MIT). Kismet is different from the traditional robot because it can show human emotions. Its eyes, ears and lips move to show when it feels happy, sad or bored. Kismet is one of the first of a new generation of androids - robots that look like human beings - which can imitate human feelings. Cog, another android invented by the MIT, imitates the actions of a mother. However, scientists admit that so far Cog has the mental ability of a two-year-old.

The optimists say that by the year 2020 we will have created humanoids with brains similar to those of an adult human being. These robots will be designed to look like people to make them more attractive and easier to sell to the public. What kind of jobs will they do? In the future, robots like Robonaut, a humanoid invented by NASA, will be doing dangerous jobs, like repairing space stations. They will also be doing more and more of the household work for us. In Japan, scientists are designing androids that will entertain us by dancing and playing the piano.

Some people worry about what the future holds: will robots become monsters? Will people themselves become increasingly like robots? Experts predict that more and more people will be wearing micro-computers, connected to the Internet, in the future. People will have micro-chips in various parts of their body, which will connect them to a wide variety of gadgets. Perhaps we should not exaggerate the importance of technology, but one wonders whether, in years to come, we will still be falling in love, and whether we will still feel pain. Who knows?

11. Kismet is different from traditional robots because
A. it thinks for itself. B. it is not like science fiction. C. it can look after two-year-olds. D. it seems to have human feelings.
12. What makes Cog special?
A. It looks like a mother. B. It behaves like a child. C. It can imitate the behavior of a mother. D. It has a huge brain.
13. In about 15 years' time from now, robots
A. will become space designers. B. will look like monsters. C. will behave like animals. D. will think like humans.
14. In the future robots will also
A. explore space. B. entertain people. C. move much faster. D. do all of the housework.
15. What is the writer's attitude to robots in the future?
A. Critical. B. Hostile. C. Objective. D. Enthusiastic.

Answers: 11. D  12. C  13. D  14. B  15. C

POLYNOMIAL FEATURES FOR ROBUST FACE AUTHENTICATION


Conrad Sanderson and Kuldip K. Paliwal
School of Microelectronic Engineering, Griffith University, Brisbane, QLD 4111, Australia

ABSTRACT
In this paper we introduce the DCT-mod2 facial feature extraction technique, which utilizes polynomial coefficients derived from 2-D DCT coefficients of spatially neighbouring blocks. We evaluate its robustness and performance against three popular feature sets for use in an identity verification system subject to illumination changes. Results on the multi-session VidTIMIT database suggest that the proposed feature set is the most robust, followed by (in order of robustness and performance): 2-D Gabor wavelets, 2-D DCT coefficients and PCA (eigenface) derived features. Moreover, compared to Gabor wavelets, the DCT-mod2 feature set is over 80 times quicker to compute.

1. INTRODUCTION
A face authentication system verifies the claimed identity (a 2-class task) based on images (or a video sequence) of the claimant's face. This is in contrast to an identification system, which attempts to find the identity of a given person out of a pool of people. Past research on face based systems has concentrated on the identification aspect even though the verification task has the greatest application potential [1]. This is demonstrated in security applications (eg. access control), where the claimant has good reason to cooperate with the system, as well as in forensic applications, where the task is mostly evaluation of each suspect separately rather than choosing one from many persons.

While identification and verification systems share feature extraction techniques and in many cases a large part of the classifier structure, there is no guarantee that an approach used in the identification scenario would work equally well in the verification scenario. There are many approaches to face based systems - ranging from the ubiquitous Principal Component Analysis (PCA) approach (also known as eigenfaces) [2] and Dynamic Link Architecture (also known as elastic graph matching) [3], through Artificial Neural Networks [4], to pseudo-2D Hidden Markov Models (HMM) [5].

These systems differ in terms of the feature extraction procedure and/or the classification technique used. For example, in [2] PCA is used for feature extraction and a nearest neighbour classifier is utilized for recognition. In [3], biologically inspired 2-D Gabor wavelets [6] are used for feature extraction, while the Dynamic Link Architecture is part of the classifier. In [5], features are derived using the 2-D Discrete Cosine Transform (DCT) and the pseudo-2D HMM is the classifier.

PCA derived features have been shown to be sensitive to changes in the illumination direction [7], causing rapid degradation in verification performance. A study by Zhang et al. [8] has shown a system employing 2-D Gabor wavelet derived features to be robust to changes in the illumination direction. However, a different study by Adini et al. [9] shows that the 2-D Gabor wavelet derived features are indeed sensitive to the illumination direction.

Belhumeur et al. [7] proposed robust features based on Fisher's Linear Discriminant. However, to achieve robustness, Belhumeur's system required face images with varying illumination for training purposes. As will be shown, 2-D DCT based features are also sensitive to changes in the illumination direction.

In this paper we introduce four new techniques which are significantly less affected by an illumination change: DCT-delta, DCT-mod, DCT-mod-delta and DCT-mod2. We will show that the DCT-mod2 method, which utilizes polynomial coefficients derived from 2-D DCT coefficients of spatially neighbouring blocks, is the most suitable. We then compare the robustness and performance of the DCT-mod2 method against two popular feature extraction techniques: eigenfaces (PCA) and 2-D Gabor wavelets.

The rest of the paper is organized as follows. In Section 2 we briefly review the 2-D DCT feature extraction technique and describe the proposed feature extraction methods. In Section 3 we describe a Gaussian Mixture Model (GMM) classifier which shall be used as the basis for experiments. In Section 4 we describe the VidTIMIT audio-visual database. The performance of feature extraction techniques is compared in Section 5. The results are discussed and conclusions drawn in Section 6. To keep consistency with traditional matrix notation, pixel locations (and image sizes) are described using the row(s) first, followed by the column(s).

2. FEATURE EXTRACTION

2.1. 2-D Discrete Cosine Transform (DCT)
Here the given face image is analyzed on a block by block basis. Given an N×N image block f(y,x), where y,x = 0, 1, ..., N-1, we decompose it in terms of orthogonal 2-D DCT basis functions (see Fig. 1). The result is an N×N matrix C(v,u) containing DCT coefficients:

  C(v,u) = α(v) α(u) Σ_{y=0}^{N-1} Σ_{x=0}^{N-1} f(y,x) β(y,x,v,u)   (1)

for v,u = 0, 1, ..., N-1, where

  α(v) = sqrt(1/N) for v = 0, and sqrt(2/N) for v = 1, 2, ..., N-1   (2)

and

  β(y,x,v,u) = cos[(2y+1)vπ / 2N] cos[(2x+1)uπ / 2N]   (3)

The coefficients are ordered according to a zig-zag pattern, reflecting the amount of information stored [10] (see Fig. 2). For the block located at (a,b), the DCT feature vector is composed of:

  c^{(a,b)} = [ c_0^{(a,b)}  c_1^{(a,b)}  ...  c_{M-1}^{(a,b)} ]^T   (4)

where c_n^{(a,b)} denotes the n-th DCT coefficient and M is the number of retained coefficients.

[Fig. 1: several 2-D DCT basis functions for N = 8; lighter colours represent larger values.]
[Fig. 2: zig-zag ordering of DCT coefficients for N = 4.]
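The block decomposition of Section 2.1 can be sketched in a few lines of Python. This is our illustration, not the authors' code: it assumes 8×8 blocks (N = 8, as in Fig. 1) and M = 15 retained coefficients (the baseline dimensionality chosen in Section 5); SciPy's orthonormal DCT-II matches the α(v)α(u) scaling of Eqns (1)-(3), and the function names are ours.

```python
# Minimal sketch of Section 2.1: 2-D DCT block features with zig-zag ordering.
# Assumptions (ours, for illustration): 8x8 blocks, M = 15 retained coefficients.
import numpy as np
from scipy.fft import dct

def zigzag_order(N):
    """Return (v, u) index pairs in JPEG-style zig-zag order (Fig. 2)."""
    return sorted(((v, u) for v in range(N) for u in range(N)),
                  key=lambda t: (t[0] + t[1],
                                 t[0] if (t[0] + t[1]) % 2 else t[1]))

def dct_feature_vector(block, M=15):
    """Eqns (1)-(4): orthonormal 2-D DCT of one block, first M zig-zag coeffs."""
    C = dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')
    idx = zigzag_order(block.shape[0])[:M]
    return np.array([C[v, u] for v, u in idx])

rng = np.random.default_rng(0)
block = rng.uniform(0, 255, size=(8, 8))   # stand-in for an 8x8 face-image block
print(dct_feature_vector(block).shape)      # -> (15,)
```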
2.2. DCT-delta
In speech based systems, features based on polynomial coefficients (also known as deltas), representing transitional spectral information, have been successfully used to reduce the effects of background noise and channel mismatch [11]. For images, we define the n-th horizontal delta coefficient for the block located at (a,b) as a 1st order orthogonal polynomial coefficient:

  Δ^h c_n^{(a,b)} = [ Σ_{k=-K}^{K} k h_k c_n^{(a,b+k)} ] / [ Σ_{k=-K}^{K} h_k k² ]   (5)

Similarly, we define the n-th vertical delta coefficient as:

  Δ^v c_n^{(a,b)} = [ Σ_{k=-K}^{K} k h_k c_n^{(a+k,b)} ] / [ Σ_{k=-K}^{K} h_k k² ]   (6)

where h is a (2K+1)-dimensional symmetric window vector. In this work we shall use K = 1 and a rectangular window.

Let us assume that we have three horizontally consecutive blocks X, Y and Z. Each block is composed of two components: facial information and additive noise, eg. X = I_X + I_N. Moreover, let us also suppose that all of the blocks are corrupted with the same noise (a reasonable assumption if the blocks are small and are close or overlapping). To find the deltas for block Y, we apply Eqn. (5) to obtain (ignoring the denominator):

  Δ^h Y = -X + Z = -(I_X + I_N) + (I_Z + I_N) = I_Z - I_X   (7)-(9)

ie. the noise component is removed.

By combining the horizontal and vertical delta coefficients an overall delta feature vector is formed. Hence, given that we extract M DCT coefficients from each block, the delta vector is 2M dimensional. We shall term this feature extraction method as DCT-delta. We interpret these delta coefficients as transitional spatial information (somewhat akin to edges).

2.3. DCT-mod, DCT-mod2 and DCT-mod-delta
By inspecting Eqns (1) and (3), it is evident that the 0th DCT coefficient will reflect the average pixel value (or the DC level) inside each block and hence will be the most affected by any illumination change. Moreover, by inspecting Fig. 1 it is evident that the first and second coefficients represent the average horizontal and vertical pixel intensity change, respectively. As such, they will also be significantly affected by any illumination change. Hence we shall study three additional feature extraction approaches (in all cases we assume the baseline DCT feature vector is M dimensional):

1. Discard the first three coefficients from the baseline DCT feature vector. We shall term this modified feature extraction method as DCT-mod.
2. Discard the first three coefficients from the baseline DCT feature vector and concatenate the resulting vector with the corresponding DCT-delta feature vector. We shall refer to this method as DCT-mod-delta.
3. Replace the first three coefficients with their horizontal and vertical deltas, ie.:

  c^{(a,b)} = [ Δ^h c_0  Δ^v c_0  Δ^h c_1  Δ^v c_1  Δ^h c_2  Δ^v c_2  c_3  c_4  ...  c_{M-1} ]^T   (10)

where the superscript (a,b) was omitted from the right hand side. Let us term this approach as DCT-mod2.

Thus in the DCT-mod-delta and DCT-mod2 approaches transitional spatial information is combined with local texture information.
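Below is a sketch (ours, with made-up array shapes) of how Eqns (5), (6) and (10) combine per-block DCT features into DCT-mod2 vectors, using K = 1 and a rectangular window as in the paper; border blocks are skipped since they lack a left/right or up/down neighbour.

```python
# Minimal sketch of Sections 2.2-2.3: building DCT-mod2 vectors from a grid of
# per-block DCT coefficients. The coefficient grid here is random, standing in
# for Eqn (4) output per block; names and shapes are ours, not the paper's.
import numpy as np

def dct_mod2_grid(coefs):
    """coefs: (rows, cols, M) array of per-block DCT features (zig-zag order).
    Returns (rows-2, cols-2, M+3) DCT-mod2 vectors per Eqn (10)."""
    rows, cols, M = coefs.shape
    out = []
    for a in range(1, rows - 1):
        row_out = []
        for b in range(1, cols - 1):
            # Eqns (5)-(6) with K=1 and rectangular window h = [1, 1, 1]:
            # numerator = (-1)*neighbour_before + (+1)*neighbour_after,
            # denominator = sum over k of h_k * k^2 = 2.
            dh = (coefs[a, b + 1, :3] - coefs[a, b - 1, :3]) / 2.0
            dv = (coefs[a + 1, b, :3] - coefs[a - 1, b, :3]) / 2.0
            head = np.column_stack([dh, dv]).ravel()  # [dh0, dv0, dh1, dv1, dh2, dv2]
            row_out.append(np.concatenate([head, coefs[a, b, 3:]]))
        out.append(row_out)
    return np.array(out)

rng = np.random.default_rng(1)
grid = rng.normal(size=(7, 8, 15))        # 7x8 blocks, M = 15
print(dct_mod2_grid(grid).shape)          # -> (5, 6, 18): 18-dim DCT-mod2 vectors
```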
3. GMM CLASSIFIER
The distribution of feature vectors for each person is modeled by a Gaussian Mixture Model (GMM). Given a set of training vectors, an N_G-mixture GMM is trained using a k-means clustering algorithm followed by 10 iterations of the Expectation Maximization (EM) algorithm [12].

Given a claim for person C's identity and a set of feature vectors X = {x_i}_{i=1}^{N_V} supporting the claim, the average log likelihood of the claimant being the true claimant is calculated using:

  L(X|λ_C) = (1/N_V) Σ_{i=1}^{N_V} log p(x_i | λ_C)   (11)

where

  p(x|λ) = Σ_{m=1}^{N_G} w_m N(x; μ_m, Σ_m)   (12)

and

  N(x; μ, Σ) = (2π)^{-D/2} |Σ|^{-1/2} exp[ -(1/2)(x-μ)^T Σ^{-1} (x-μ) ]   (13)

Here λ_C is the model for person C, N_G is the number of mixtures, w_m is the weight for mixture m (with the constraint Σ_{m=1}^{N_G} w_m = 1), and N(x; μ, Σ) is a multi-variate Gaussian function with mean μ and diagonal covariance matrix Σ. Given a set of B background person models {λ_b}_{b=1}^{B} for person C, the average log likelihood of the claimant being an impostor is found using:

  L(X|λ_C̄) = log[ (1/B) Σ_{b=1}^{B} exp L(X|λ_b) ]   (14)

The set of background person models is found using the method described in [13]. An opinion on the claim is found using:

  Λ(X) = L(X|λ_C) - L(X|λ_C̄)   (15)

The verification decision is reached as follows: given a threshold t, the claim is accepted when Λ(X) ≥ t and rejected when Λ(X) < t.

4. VIDTIMIT AUDIO-VISUAL DATABASE
The VidTIMIT database, created by the authors, is comprised of video and corresponding audio recordings of 43 people (19 female and 24 male), reciting short sentences. It was recorded in 3 sessions, with a mean delay of 7 days between Sessions 1 and 2, and 6 days between Sessions 2 and 3. The sentences were chosen from the test section of the NTIMIT corpus [14]. There are 10 sentences per person. The first six sentences are assigned to Session 1. The next two sentences are assigned to Session 2, with the remaining two to Session 3. The first two sentences for all persons are the same, with the remaining eight generally different for each person. The mean duration of each sentence is 4.25 seconds, or approximately 106 video frames.

The recording was done in a noisy office environment using a broadcast quality digital video camera. The video of each person is stored as a sequence of JPEG images with a resolution of 384×512 pixels. The corresponding audio is stored as a mono, 16 bit, 32 kHz WAV file. For more information on the database, please visit .au/vidtimit/ or contact the authors.

5. EXPERIMENTS
Before feature extraction can occur, the face must first be located [15]. Furthermore, to account for varying distances to the camera, a geometrical normalization must be performed. We treat the problem of face location and normalization as separate from feature extraction.

To find the face, we use template matching with several prototype faces of varying dimensions. Using the distance between the eyes as a size measure, an affine transformation is used [10] to adjust the size of the image, resulting in the distance between the eyes being the same for each person. Finally a fixed-size face window f(y,x), containing the eyes and the nose (the most invariant face area to changes in the expression and hair style), is extracted from the image.

For PCA, the dimensionality of the face window is reduced to 40 (choice based on the work by Samaria [16] and Belhumeur [7]). For DCT and DCT derived methods, each block is 8×8 pixels (N = 8, as in Fig. 1). Moreover, each block overlaps with horizontally and vertically adjacent blocks by 50%. For Gabor features, we follow Duc [3], where the dimensionality of the Gabor feature vectors is 18. The location of the wavelet centers was chosen to be as close as possible to the centers of the blocks used in DCT-mod2 feature extraction.

To reduce the computational burden during modeling and testing, every second video frame was used. For each feature extraction method, 8-mixture client models (GMMs) were generated from features extracted from face windows in Session 1.

An artificial illumination change was introduced to face windows extracted from Sessions 2 and 3. To simulate more illumination on the left side of the face and less on the right, a new face window b is created by transforming f using:

  b(y,x) = f(y,x) + m x + δ   (16)

for y = 0, 1, ..., N_Y - 1 and x = 0, 1, ..., N_X - 1, where

  m = -2δ / (N_X - 1)   (17)

and δ = illumination delta (in pixels). Example face windows for various δ are shown in Fig. 3. It must be noted that the above artificial illumination change is rather restrictive as it does not cover all the effects of illumination changes possible in real life (shadows, etc.).

[Fig. 3: example face windows with varying illumination; left: δ = 0 (no change); middle and right: progressively larger δ.]

To find the performance, Sessions 2 and 3 were used for obtaining example opinions of known impostor and true claims. Four utterances, each from 8 fixed persons (4 male and 4 female), were used for simulating impostor accesses against the remaining 35 persons. As in [13], 10 background person models were used for the impostor likelihood calculation. For each of the remaining 35 persons, their four utterances were used separately as true claims. In total there were 1120 impostor and 140 true claims. The decision threshold was then set so the a posteriori performance is as close as possible to Equal Error Rate (EER) (ie. where the False Acceptance Rate is equal to the False Rejection Rate).

In the first experiment, we found the performance of the DCT approach on face windows with δ = 0 (ie. no illumination change) while varying the dimensionality of the feature vectors. The results are presented in Fig. 4. The performance improves immensely as the number of dimensions is increased from 1 to 3. Increasing the dimensionality from 15 to 21 provides only a relatively small improvement, while significantly increasing the amount of computation time required to generate the models. Based on this we have chosen 15 as the dimensionality of baseline DCT feature vectors; hence the dimensionality of DCT-delta is 30, DCT-mod is 12, DCT-mod-delta is 42 and DCT-mod2 is 18.

In the second experiment we compared the performance of DCT and all of the proposed techniques for increasing δ. Results are shown in Fig. 5.

In the third experiment we compared the performance of PCA, DCT, Gabor and DCT-mod2 features for varying δ. Results are presented in Fig. 6.

Computational burden is an important factor in practical applications, where the amount of required memory and speed of the processor have direct bearing on the final cost. Hence in the final experiment we compared the average time taken to process one face window by the PCA, DCT, Gabor and DCT-mod2 feature extraction techniques. It must be noted that apart from having the transformation data pre-calculated (eg. DCT basis functions), no thorough hand optimization of the code was done. Nevertheless, we feel that this experiment provides figures which are at least indicative. Results are listed in Table 1.
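The two ingredients above can be sketched as follows. This is our illustration, not the authors' code: scikit-learn's GaussianMixture (diagonal covariances, k-means initialization followed by EM) stands in for the paper's GMM training, and all data is random; only the formulas of Eqns (11)-(17) are taken from the text.

```python
# Sketches of the illumination ramp, Eqns (16)-(17), and the GMM opinion,
# Eqns (11)-(15). Data and model settings here are illustrative assumptions.
import numpy as np
from sklearn.mixture import GaussianMixture

def illuminate(f, delta):
    """Eqns (16)-(17): +delta at the left edge, linearly down to -delta at the right."""
    m = -2.0 * delta / (f.shape[1] - 1)
    return f + m * np.arange(f.shape[1])[None, :] + delta

def opinion(X, client, backgrounds):
    """Eqn (15): Lambda(X) = L(X|client) - impostor term of Eqn (14)."""
    L = lambda g: g.score(X)                     # Eqn (11): mean log p(x_i | model)
    return L(client) - np.log(np.mean([np.exp(L(b)) for b in backgrounds]))

rng = np.random.default_rng(2)
window = rng.uniform(0, 255, size=(32, 32))      # stand-in face window
brighter_left = illuminate(window, delta=40)
print(brighter_left[:, 0].mean() - window[:, 0].mean())   # ~ +40 at the left edge

feats = rng.normal(size=(500, 18))               # stand-in DCT-mod2 vectors
client = GaussianMixture(8, covariance_type='diag', max_iter=10,
                         random_state=0).fit(feats)
backgrounds = [GaussianMixture(8, covariance_type='diag', max_iter=10,
                               random_state=s).fit(rng.normal(loc=s, size=(500, 18)))
               for s in (1, 2, 3)]
test = rng.normal(size=(40, 18))
print(opinion(test, client, backgrounds))        # accept the claim if >= threshold t
```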
6. DISCUSSION AND CONCLUSIONS
We can see in Fig. 4 that the first three DCT coefficients contain a significant amount of person dependent information; thus ignoring them incurs a performance penalty. This is verified in Fig. 5, where the DCT-mod features have worse performance than DCT features when there is little or no illumination change. Performance of DCT features is fairly stable for small illumination changes but degrades as δ grows. This is in contrast to DCT-mod features, which have a relatively static performance.

The remaining proposed features (DCT-delta, DCT-mod-delta and DCT-mod2) do not have the performance penalty present in DCT-mod. Moreover, all of them have similarly better performance than DCT features. DCT-mod2 edges out DCT-delta and DCT-mod-delta in terms of stability for large illumination changes. Additionally, the dimensionality of DCT-mod2 is lower than DCT-delta and DCT-mod-delta.

The results suggest that delta features make the system more robust as well as improve performance. The results also suggest that it is only necessary to use deltas of coefficients representing the DC level and low frequency features (ie. the 0th, 1st and 2nd DCT coefficients) while keeping the remaining DCT coefficients unchanged. Hence out of the four proposed feature extraction techniques, the DCT-mod2 approach is the most suitable.

Comparing PCA, DCT, Gabor and DCT-mod2 (Fig. 6), we can see that the DCT-mod2 approach is the most immune to illumination changes; the performance is virtually flat for varying δ. The performance of PCA derived features rapidly degrades as δ increases. Performance of Gabor features is stable at first and then gently deteriorates as δ increases. The results suggest that we can order the features, based on their robustness and performance, as follows: DCT-mod2, Gabor, DCT, and lastly, PCA.

It must be noted that with the introduced illumination change, only part of each face window is strongly affected; the size of the relatively unchanged portion decreases as δ increases. In the PCA approach one feature vector describes the entire face, hence any change to the face would alter the features obtained. This is in contrast to the other approaches (Gabor, DCT and DCT-mod2), where one feature vector describes only a small part of the face. Thus a significant percentage (dependent on δ) of the feature vectors is virtually unchanged, automatically leading to a degree of robustness.

It must also be noted that when using the GMM classifier in conjunction with the Gabor, DCT or DCT-mod2 features, the spatial relation between major face features (eg. eyes and nose) is lost. However, excellent performance is still obtained.

In Table 1 we can see that Gabor features are the most computationally expensive to calculate, taking about 84 times longer than DCT-mod2 features. This is due to the size of the Gabor wavelets as well as the need to compute both real and imaginary inner products. Compared to Gabor features, PCA, DCT and DCT-mod2 features take a relatively similar amount of time to process one face window.

Table 1. Average time taken per face window by the feature extraction techniques (results obtained using a Pentium III 500 MHz, Linux 2.2.18, gcc 2.96).

  Method     Time (msec)
  PCA        11
  DCT        6
  Gabor      675
  DCT-mod2   8

7. REFERENCES
[1] G.R. Doddington et al., "The NIST speaker recognition evaluation - Overview, methodology, systems, results, perspective", Speech Communication, Vol. 31, No. 2-3, 2000.
[2] M. Turk and A. Pentland, "Eigenfaces for Recognition", Journal of Cognitive Neuroscience, Vol. 3, No. 1, 1991.
[3] B. Duc et al., "Face Authentication with Gabor Information on Deformable Graphs", IEEE Trans. Image Processing, Vol. 8, No. 4, 1999.
[4] S. Lawrence et al., "Face Recognition: A Convolutional Neural-Network Approach", IEEE Trans. Neural Networks, Vol. 8, No. 1, 1997.
[5] S. Eickeler et al., "Recognition of JPEG Compressed Face Images Based on Statistical Methods", Image and Vision Computing, Vol. 18, No. 4, 2000.
[6] T.S. Lee, "Image Representation Using 2D Gabor Wavelets", IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 18, No. 10, 1996.
[7] P.N. Belhumeur et al., "Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection", IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 19, No. 7, 1997.
[8] J. Zhang et al., "Face Recognition: Eigenface, Elastic Matching, and Neural Nets", Proceedings of the IEEE, Vol. 85, No. 9, 1997.
[9] Y. Adini et al., "Face Recognition: The Problem of Compensating for Changes in Illumination Direction", IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 19, No. 7, 1997.
[10] R.C. Gonzales and R.E. Woods, Digital Image Processing, Addison-Wesley, 1993.
[11] F.K. Soong and A.E. Rosenberg, "On the Use of Instantaneous and Transitional Spectral Information in Speaker Recognition", IEEE Trans. ASSP, Vol. 36, No. 6, 1988.
[12] T.K. Moon, "Expectation-maximization Algorithm", IEEE Signal Processing Magazine, Vol. 13, No. 6, 1996.
[13] D.A. Reynolds, "Speaker Identification and Verification Using Gaussian Mixture Speaker Models", Speech Communication, Vol. 17, No. 1-2, 1995.
[14] C. Jankowski et al., "NTIMIT: A Phonetically Balanced, Continuous Speech Telephone Bandwidth Speech Database", Proc. Intern. Conf. Acoustics, Speech and Signal Processing, Albuquerque, 1990.
[15] L-F. Chen et al., "Why recognition in a statistics-based face recognition system should be based on the pure face portion: a probabilistic decision-based proof", Pattern Recognition, Vol. 34, No. 7, 2001.
[16] F. Samaria, "Face Recognition Using Hidden Markov Models", PhD Thesis, University of Cambridge, 1994.

A Method for Selecting Temporally Misordered Video Frames Based on 3D Convolution


Miao Yujie (缪宇杰), Wu Zhijun (吴智钧), Gong Jing (宫婧)

Abstract: In order to extract better video features and improve training accuracy, we propose a CNN (convolutional neural network) based model for selecting temporally misordered frames, whose task is to identify the wrongly ordered sequence of frames among several candidate sequences. The wrong sequence is out of temporal order, while the correct sequences follow temporal order. Unsupervised video representation learning is applied to train this model, so labeled data sets are unnecessary. Based on this task and the absence of semantic labels, a multi-branched CNN structure is implemented, which is learned end-to-end. As the model input, several sequences of frames are sampled from one video. These sequences are then encoded with 3D convolution to extract the temporal and spatial features of each sequence. To find the sequence with the wrong temporal order, the model compares all the inputs, analyzes the regularities among them, and identifies the one that violates those regularities. Experiments on the UCF101 dataset verify the effectiveness of the proposed method, and the selection accuracy is high.

Journal: 计算机技术与发展 (Computer Technology and Development)
Year (Volume), Issue: 2018, 28(5)
Pages: 179-181, 186
Keywords: unsupervised learning; convolutional neural network; misordered-frame selection; 3D convolution
Authors' affiliations: School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing 210003, Jiangsu, China (Miao, Wu); School of Science, Nanjing University of Posts and Telecommunications, Nanjing 210003, Jiangsu, China (Gong)
Language: Chinese. CLC number: TP301

0 Introduction
In recent years, with the rise of deep learning and the introduction of frameworks such as CNNs, many machine learning problems have been addressed, for example object recognition in real scenes and human activity analysis.
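The abstract describes a multi-branch, end-to-end 3D-conv encoder with an odd-one-out style objective. The following PyTorch sketch is ours (the paper gives no code): the layer sizes, clip lengths and the shuffled-clip setup are illustrative assumptions, but it shows the shape of such a model: N candidate clips per video, a shared 3D-conv encoder, and a classifier that picks the misordered clip.

```python
# Illustrative sketch (ours) of a multi-branch 3D-conv "find the shuffled clip"
# model: one of N clips sampled from a video has its frames reordered, and the
# network outputs a logit per clip indicating which one is out of order.
import torch
import torch.nn as nn

class OddOneOut3D(nn.Module):
    def __init__(self, n_branches=4, emb=128):
        super().__init__()
        self.encoder = nn.Sequential(            # weights shared across branches
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(32, emb),
        )
        self.head = nn.Linear(n_branches * emb, n_branches)

    def forward(self, clips):                    # clips: (B, N, C, T, H, W)
        B, N = clips.shape[:2]
        z = self.encoder(clips.flatten(0, 1))    # encode all N clips per video
        return self.head(z.view(B, N * z.shape[-1]))   # one logit per branch

model = OddOneOut3D()
clips = torch.randn(2, 4, 3, 8, 32, 32)         # 2 videos, 4 clips of 8 frames
logits = model(clips)                            # which clip is shuffled?
loss = nn.CrossEntropyLoss()(logits, torch.tensor([1, 3]))
print(logits.shape, loss.item())
```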

2010 Wuhan University Postgraduate Entrance Exam, 881 Microbiology (Paper A), with Detailed Answers


I. Define the following terms (10 questions, 4 points each, 40 points in total)

1. Aseptic technique
Answer: Aseptic technique refers to the set of practices used, when isolating, transferring and culturing pure cultures, to prevent them from being contaminated by other microorganisms and to keep the cultures themselves from contaminating the working environment. It is essential for microbiological research to proceed properly.

2. Auxotroph
Answer: An auxotroph is a mutant strain that, after mutation (spontaneous or induced), has lost the ability to synthesize one or more substances essential for its own growth (usually growth factors such as amino acids or vitamins), and must obtain them from the external environment in order to grow and reproduce. The corresponding wild-type strain is called a prototroph.

3. Diauxic growth
Answer: Diauxic growth refers to the growth response of a microorganism cultured in a medium containing both a rapidly utilized carbon (or nitrogen) source and a slowly utilized one: the microorganism first grows on the rapidly utilized source until it is exhausted, and then, after a short lag, resumes growth on the slowly utilized source.

4. Virion
Answer: A virion, also called a virus particle, is a package of genetic material capable of autonomous replication. The genetic material is surrounded by a protein coat, and in some cases also by an envelope, which protect it from environmental damage and serve as the vehicle that delivers the genetic material from one host cell to another. The virion is the extracellular particulate form of a virus and also its infectious form.

5. Glycolysis
Answer: Glycolysis is the process by which, under anaerobic conditions, glucose is broken down into pyruvate in the cytoplasm. In glycolysis, the oxidation of 1 molecule of glucose yields a net gain of 2 molecules of ATP and 2 molecules of pyruvate, while 2 molecules of NAD+ are reduced to NADH. Glycolysis is one type of sugar metabolism and can be divided into two phases: an activation (energy-investment) phase and an energy-yielding phase. The pathway involves three key (rate-limiting) enzymes: hexokinase, 6-phosphofructokinase and pyruvate kinase.

6. Protoplast
Answer: A protoplast is the spherical, osmotically sensitive cell, bounded only by the cell membrane, that is obtained artificially by completely removing the original cell wall with lysozyme or by inhibiting the synthesis of new cell wall with penicillin.

Latest SCI Journal Ranking in Artificial Intelligence (2020-08-14)


Category Scheme: WoS
(Original columns: Rank, Full Journal Title, Total Cites, Journal Impact Factor, Eigenfactor Score; only rank and title survive in this excerpt.)

Rank  Full Journal Title
1     IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
2     Information Fusion
3     IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION
4     MEDICAL IMAGE ANALYSIS
5     IEEE Transactions on Cybernetics
82    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS
83    Genetic Programming and Evolvable Machines
84    ACM Transactions on Autonomous and Adaptive Systems
85    International Journal on Semantic Web and Information Systems
86    JOURNAL OF CHEMOMETRICS
87    ACM Transactions on Interactive Intelligent Systems
88    AI MAGAZINE
89    MACHINE VISION AND APPLICATIONS
90    Journal of Ambient Intelligence and Smart Environments

Primary School (Book 2): 13th English Test, Unit 4

Primary School (Book 2) English Unit 4 Test, English Test
Part I. Comprehensive questions (100 questions, 1 point each, 100 points in total; unanswered or incorrect items receive no credit)
1. The Earth's crust is constantly being reshaped by ______ forces.
2. The __________ (unfolding of history) reflects our journey.
3. The _______ (tiger) is known for its strength.
4. The chemical formula for aluminum sulfate is _____.
5. I have _____ (ten/twenty) fingers.
6. __________ (Noble gases) are used in lighting and welding due to their non-reactive nature.
7. The _____ (squirrel) stores acorns for winter.
8. My __________ likes gardening. (mom)
9. My grandmother loves to __________. (tell stories)
10. I love to ______ (dance) to music.
11. Which holiday celebrates the New Year? A. Christmas B. Thanksgiving C. New Year's Day D. Halloween  Answer: C
12. His favorite color is ________.
13. My hamster loves to explore its ______ (cage).
14. What do you call a person who flies an airplane? A. Pilot B. Engineer C. Mechanic D. Navigator  Answer: A
15. What is the capital of Nigeria? A. Lagos B. Abuja C. Kano D. Port Harcourt  Answer: B
16. What is the sound of a clock? A. Tick-tock B. Ring C. Beep D. Buzz  Answer: A
17. She is _____ (practicing) her dance moves.
18. What is the name of the longest river in the world? A. Amazon B. Nile C. Mississippi D. Yangtze  Answer: B
19. There are _____ (three) birds in the tree.
20. _____ (Root systems) help to absorb water and nutrients.
21. My brother is good at ____ (drawing) cartoons.
22. The ________ (environmental adaptation strategy) is developed over time.
23. The ancient Egyptians built temples for their ________.
24. The sky is _______ (very blue).

A Level Biology 2010 Exam Paper

UNIVERSITY OF CAMBRIDGE INTERNATIONAL EXAMINATIONS
General Certificate of Education, Advanced Subsidiary Level and Advanced Level

BIOLOGY 9700/21, Paper 2 Structured Questions AS, October/November 2010, 1 hour 15 minutes
Candidates answer on the Question Paper. No Additional Materials are required.

READ THESE INSTRUCTIONS FIRST
Write your Centre number, candidate number and name in the spaces provided at the top of this page. Write in dark blue or black pen. You may use a soft pencil for any diagrams, graphs or rough working. Do not use staples, paper clips, highlighters, glue or correction fluid. DO NOT WRITE IN ANY BARCODES. Answer all questions. At the end of the examination, fasten all your work securely together. The number of marks is given in brackets [ ] at the end of each question or part question.

1 (a) Complete the passage with the most appropriate term.
Within each ecosystem there is a ............... of organisms that interact with each other and with their environment. Each species fills a particular ............... within the ecosystem. Feeding relationships in food webs are an example of the interactions species have with each other. In old field ecosystems in North America, producers, such as blue grass, provide energy for grazing animals. These animals form the ............... ............... ............... in the food chain. [3]
(b) Very little of the energy consumed by grazing animals is available to carnivores. State two reasons why this is so. [2]
[Total: 5]

2 (a) Table 2.1 shows some of the structures in different parts of the gas exchange system. Complete Table 2.1 by indicating with a tick (✓) if the structure is present in each part of the gas exchange system or a cross (✗) if it is not.

Table 2.1
structure            | trachea | bronchus | bronchiole | alveolus
ciliated epithelium  |         |          |            |
goblet cells         |         |          |            |
cartilage            |         |          |            |
smooth muscle        |         |          |            |
[4]

(b) An exercise physiologist investigated aspects of breathing in an athlete. The minute volume is the volume of air breathed in during one minute. The data recorded is in Table 2.2.

Table 2.2
vital capacity / dm³ | breathing rate at rest / breaths min⁻¹ | minute volume / dm³
5.8                  | 11                                     | 5.5

(i) Explain how the physiologist would determine the vital capacity of the athlete. [2]
(ii) Calculate the athlete's tidal volume. Answer = ............... [1]

(c) Fig. 2.1 shows a cross section of a coronary artery partially blocked by plaque causing atherosclerosis.
[Fig. 2.1: cross section of a coronary artery partially blocked by plaque]
Explain why atherosclerosis in coronary arteries may limit the ability of people to take vigorous exercise. [3]

(d) Describe the effects of nicotine and carbon monoxide in cigarette smoke on the cardiovascular system. [3]
[Total: 13]

3 Red blood cells are suspended in plasma which has a concentration equivalent to that of 0.9% sodium chloride (NaCl) solution. A student investigated what happens to red blood cells when placed into sodium chloride solutions of different concentration. A small drop of blood was added to 10 cm³ of each sodium chloride solution. Samples were taken from each mixture and observed under the microscope. The number of red blood cells remaining in each sample was calculated as a percentage of the number in the 0.9% solution. The results are shown in Fig. 3.1.
[Fig. 3.1: graph of percentage of cells remaining (0 to 100) against concentration of NaCl (0 to 1.5%)]

(a) With reference to Fig. 3.1, describe the student's results. [3]

The student also measured the cell volumes of the red blood cells in three of the sodium chloride solutions. The results are shown in Table 3.1.

Table 3.1
concentration of sodium chloride / % | mean red cell volume / µm³
0.7                                  | 120
0.9                                  | 90
1.5                                  | 65

Fig. 3.2 shows the appearance of some red blood cells removed from the 1.5% sodium chloride solution.
[Fig. 3.2: micrograph of red blood cells removed from the 1.5% NaCl solution]

(b) Explain the results shown in Fig. 3.1, Table 3.1 and Fig. 3.2, in terms of water potential, for the 0% NaCl solution, the 0.7% NaCl solution and the 1.5% NaCl solution. [6]

Red blood cells each contain about 240 million molecules of haemoglobin that transport oxygen and carbon dioxide.
(c) Describe the role of haemoglobin in the transport of oxygen and of carbon dioxide. [4]

(d) The haematocrit is the proportion of the blood that is composed of red blood cells. Samples of blood were taken from an athlete who had lived at sea level since birth and moved to live and train at an altitude of 5000 m for three weeks. The haematocrit and the number of red blood cells per mm³ were determined before moving to high altitude and after three weeks at that altitude. The results are shown in Table 3.2.

Table 3.2
altitude                    | haematocrit | number of red blood cells ×10⁶ per mm³
sea level                   | 0.45        | 6.1
5000 m (after three weeks)  | 0.53        | 7.3

(i) Calculate the percentage increase in the number of red blood cells per mm³ after three weeks at 5000 m. Show your working. Answer = ............... % [2]
(ii) Explain why the haematocrit increases at altitude. [3]
[Total: 18]

4 Cholera bacteria release the enzyme neuraminidase, which alters some of the surface proteins on the membranes of epithelial cells in the small intestine. These surface molecules become receptors for the toxin, choleragen, released by cholera bacteria. The toxin stimulates the cells to secrete large quantities of chloride ions into the lumen of the small intestine. Sodium ions and water follow the loss of chloride ions.

(a) (i) Name the pathogen that causes cholera. [1]
(ii) Suggest how chloride ions are moved from the epithelial cells into the lumen of the small intestine. [1]
(iii) Explain how cholera bacteria are transmitted from one person to another. [3]

A potential vaccine for choleragen was trialled on volunteers. Fig. 4.1 shows the concentration of antibodies against choleragen in the blood of a volunteer who received a first injection at week 0, followed by a booster injection at week 15.
[Fig. 4.1: graph of antibody concentration against time in weeks]

(b) Using the information in Fig. 4.1, explain the differences between the responses to the first injection and the booster injection. [4]
(c) Discuss the problems involved in preventing the spread of cholera. [4]
[Total: 13]

5 (a) Cellulose is a polysaccharide. Fig. 5.1 shows three sub-units from a molecule of cellulose.
[Fig. 5.1: structural diagram of three sub-units from a molecule of cellulose]
(i) Name the sub-unit molecule of cellulose. [1]
(ii) Name the bonds that attach the sub-unit molecules together within cellulose. [1]

(b) Cellulose has high mechanical strength, which makes it suitable for the cell walls of plants. Explain how cellulose has such a high mechanical strength, making it suitable for the cell walls of plants. [2]

Plant cell walls consist of cellulose that is embedded in a matrix of compounds, such as pectins and proteins. Cell wall material is synthesised inside the cell and transported to the cell surface membrane, as shown in the drawing made from an electron micrograph in Fig. 5.2.
[Fig. 5.2: drawing made from an electron micrograph of a plant cell, with parts labelled by letters including H, J and L]

(c) Locate the parts of the cell labelled in Fig. 5.2 which apply to each of the following statements. You must only give one letter in each case. You may use each letter once, more than once or not at all. The first answer has been completed for you.

statement                                                  | letter from Fig. 5.2
organelle that contains DNA                                | H
transports cell wall material to the cell surface membrane |
site of transcription                                      |
site of ribosome synthesis                                 |
site of photosynthesis                                     |
[4]

(d) Enzymes known as expansins are found in the matrix of cell walls to help the growth of cells. Use the information in Fig. 5.2 to describe how proteins made by the ribosomes reach the matrix of the cell wall. [3]
[Total: 11]

Copyright Acknowledgements: Fig. 2.1 GJLF / Science Photo Library; Fig. 3.2 Steve Gschmeissner / Science Photo Library.

Deep Sparse Rectifier Neural Networks

Xavier Glorot (DIRO, Université de Montréal, Montréal, QC, Canada; glorotxa@iro.umontreal.ca), Antoine Bordes (Heudiasyc, UMR CNRS 6599, UTC, Compiègne, France, and DIRO, Université de Montréal; antoine.bordes@hds.utc.fr), Yoshua Bengio (DIRO, Université de Montréal; bengioy@iro.umontreal.ca)

Abstract
While logistic sigmoid neurons are more biologically plausible than hyperbolic tangent neurons, the latter work better for training multi-layer neural networks. This paper shows that rectifying neurons are an even better model of biological neurons and yield equal or better performance than hyperbolic tangent networks in spite of the hard non-linearity and non-differentiability at zero, creating sparse representations with true zeros, which seem remarkably suitable for naturally sparse data. Even though they can take advantage of semi-supervised setups with extra unlabeled data, deep rectifier networks can reach their best performance without requiring any unsupervised pre-training on purely supervised tasks with large labeled datasets. Hence, these results can be seen as a new milestone in the attempts at understanding the difficulty in training deep but purely supervised neural networks, and closing the performance gap between neural networks learnt with and without unsupervised pre-training.

1 Introduction
Many differences exist between the neural network models used by machine learning researchers and those used by computational neuroscientists. This is in part because the objective of the former is to obtain computationally efficient learners that generalize well to new examples, whereas the objective of the latter is to abstract out neuroscientific data while obtaining explanations of the principles involved, providing predictions and guidance for future biological experiments.

(Appearing in Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (AISTATS) 2011, Fort Lauderdale, FL, USA. Volume 15 of JMLR: W&CP 15. Copyright 2011 by the authors.)
Areas where both objectives coincide are therefore particularly worthy of investigation, pointing towards computationally motivated principles of operation in the brain that can also enhance research in artificial intelligence. In this paper we show that two common gaps between computational neuroscience models and machine learning neural network models can be bridged by using the following linear-by-part activation: max(0, x), called the rectifier (or hinge) activation function. Experimental results will show engaging training behavior of this activation function, especially for deep architectures (see Bengio (2009) for a review), i.e., where the number of hidden layers in the neural network is 3 or more.

Recent theoretical and empirical work in statistical machine learning has demonstrated the importance of learning algorithms for deep architectures. This is in part inspired by observations of the mammalian visual cortex, which consists of a chain of processing elements, each of which is associated with a different representation of the raw visual input. This is particularly clear in the primate visual system (Serre et al., 2007), with its sequence of processing stages: detection of edges, primitive shapes, and moving up to gradually more complex visual shapes. Interestingly, it was found that the features learned in deep architectures resemble those observed in the first two of these stages (in areas V1 and V2 of visual cortex) (Lee et al., 2008), and that they become increasingly invariant to factors of variation (such as camera movement) in higher layers (Goodfellow et al., 2009).

Regarding the training of deep networks, something that can be considered a breakthrough happened in 2006, with the introduction of Deep Belief Networks (Hinton et al., 2006), and more generally the idea of initializing each layer by unsupervised learning (Bengio et al., 2007; Ranzato et al., 2007). Some authors have tried to understand why this unsupervised procedure helps (Erhan et al., 2010), while others investigated why the original training procedure for deep neural networks failed (Bengio and Glorot, 2010). From the machine learning point of view, this paper brings additional results in these lines of investigation.
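As a small illustration of the rectifier mentioned above (ours, not from the paper): after one random linear layer, max(0, x) returns exact zeros for every negative pre-activation, so roughly half of the hidden units are truly off, which is the "true zeros" sparsity the abstract refers to.

```python
# Tiny demo (ours) of the rectifier activation and the exact zeros it produces.
import numpy as np

def rectifier(x):
    return np.maximum(0.0, x)          # the rectifier (hinge) activation max(0, x)

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 100))                    # a batch of 256 random inputs
W = rng.normal(scale=0.1, size=(100, 100))         # one random linear layer
h = rectifier(X @ W)                               # hidden representation
print(f"exact zeros: {np.mean(h == 0.0):.1%}")     # roughly half, by symmetry
```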
We propose to explore the use of rectifying non-linearities as alternatives to the hyperbolic tangent or sigmoid in deep artificial neural networks,in ad-dition to using an L1regularizer on the activation val-ues to promote sparsity and prevent potential numer-ical problems with unbounded activation.Nair and Hinton(2010)present promising results of the influ-ence of such units in the context of Restricted Boltz-mann Machines compared to logistic sigmoid activa-tions on image classification tasks.Our work extends this for the case of pre-training using denoising auto-encoders(Vincent et al.,2008)and provides an exten-sive empirical comparison of the rectifying activation function against the hyperbolic tangent on image clas-sification benchmarks as well as an original derivation for the text application of sentiment analysis.Our experiments on image and text data indicate that training proceeds better when the artificial neurons are either offor operating mostly in a linear regime.Sur-prisingly,rectifying activation allows deep networks to achieve their best performance without unsupervised pre-training.Hence,our work proposes a new contri-bution to the trend of understanding and merging the performance gap between deep networks learnt with and without unsupervised pre-training(Erhan et al., 2010;Bengio and Glorot,2010).Still,rectifier net-works can benefit from unsupervised pre-training in the context of semi-supervised learning where large amounts of unlabeled data are provided.Furthermore, as rectifier units naturally lead to sparse networks and are closer to biological neurons’responses in their main operating regime,this work also bridges(in part)a machine learning/neuroscience gap in terms of acti-vation function and sparsity.This paper is organized as follows.Section2presents some neuroscience and machine learning background which inspired this work.Section3introduces recti-fier neurons and explains their potential benefits and drawbacks in deep networks.Then we propose an experimental study with empirical results on image recognition in Section4.1and sentiment analysis in Section4.2.Section5presents our conclusions.2Background2.1Neuroscience ObservationsFor models of biological neurons,the activation func-tion is the expectedfiring rate as a function of the total input currently arising out of incoming signals at synapses(Dayan and Abott,2001).An activation function is termed,respectively antisymmetric or sym-metric when its response to the opposite of a strongly excitatory input pattern is respectively a strongly in-hibitory or excitatory one,and one-sided when this response is zero.The main gaps that we wish to con-sider between computational neuroscience models and machine learning models are the following:•Studies on brain energy expense suggest that neurons encode information in a sparse and dis-tributed way(Attwell and Laughlin,2001),esti-mating the percentage of neurons active at the same time to be between1and4%(Lennie,2003).This corresponds to a trade-offbetween richness of representation and small action potential en-ergy expenditure.Without additional regulariza-tion,such as an L1penalty,ordinary feedforward neural nets do not have this property.For ex-ample,the sigmoid activation has a steady state regime around12,therefore,after initializing with small weights,all neuronsfire at half their satura-tion regime.This is biologically implausible and hurts gradient-based optimization(LeCun et al., 1998;Bengio and Glorot,2010).•Important divergences between biological and 
• Important divergences between biological and machine learning models concern non-linear activation functions. A common biological model of neuron, the leaky integrate-and-fire (or LIF) (Dayan and Abbott, 2001), gives the following relation between the firing rate and the input current, illustrated in Figure 1 (left):

$$
f(I) =
\begin{cases}
\left[\, t_{ref} + \tau \log \dfrac{E + RI - V_r}{E + RI - V_{th}} \,\right]^{-1}, & \text{if } E + RI > V_{th} \\[2mm]
0, & \text{if } E + RI \le V_{th}
\end{cases}
$$

where t_ref is the refractory period (minimal time between two action potentials), I the input current, V_r the resting potential and V_th the threshold potential (with V_th > V_r), and R, E, τ the membrane resistance, potential and time constant. The most commonly used activation functions in the deep learning and neural networks literature are the standard logistic sigmoid and the hyperbolic tangent (see Figure 1, right), which are equivalent up to a linear transformation. The hyperbolic tangent has a steady state at 0, and is therefore preferred from the optimization standpoint (LeCun et al., 1998; Bengio and Glorot, 2010), but it forces an antisymmetry around 0 which is absent in biological neurons.

[Figure 1. Left: common neural activation function motivated by biological data. Right: commonly used activation functions in the neural networks literature: logistic sigmoid and hyperbolic tangent (tanh).]
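As an illustration of the equation above, a small NumPy sketch; the parameter values (τ, R, E, V_r, V_th, t_ref) are placeholder assumptions, not values from the paper:

```python
import numpy as np

def lif_firing_rate(I, tau=0.02, R=1.0, E=0.0, V_r=0.0, V_th=1.0, t_ref=0.002):
    """Firing rate of a leaky integrate-and-fire neuron vs. input current I:
    f(I) = 1 / (t_ref + tau * log((E + R*I - V_r) / (E + R*I - V_th)))
    when E + R*I > V_th, and f(I) = 0 otherwise.
    All parameter values here are illustrative assumptions."""
    I = np.atleast_1d(np.asarray(I, dtype=float))
    drive = E + R * I
    rate = np.zeros_like(drive)
    above = drive > V_th
    rate[above] = 1.0 / (
        t_ref + tau * np.log((drive[above] - V_r) / (drive[above] - V_th))
    )
    return rate

print(lif_firing_rate([0.5, 1.0, 1.5, 3.0, 10.0]))  # one-sided, like max(0, x)
```

Note how the response is zero below threshold and grows smoothly above it, which is exactly the one-sided behavior that motivates the rectifier.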
2.2 Advantages of Sparsity

Sparsity has become a concept of interest, not only in computational neuroscience and machine learning but also in statistics and signal processing (Candes and Tao, 2005). It was first introduced in computational neuroscience in the context of sparse coding in the visual system (Olshausen and Field, 1997). It has been a key element of deep convolutional networks exploiting a variant of auto-encoders (Ranzato et al., 2007, 2008; Mairal et al., 2009) with a sparse distributed representation, and has also become a key ingredient in Deep Belief Networks (Lee et al., 2008). A sparsity penalty has been used in several computational neuroscience (Olshausen and Field, 1997; Doi et al., 2006) and machine learning models (Lee et al., 2007; Mairal et al., 2009), in particular for deep architectures (Lee et al., 2008; Ranzato et al., 2007, 2008). However, in the latter, the neurons end up taking small but non-zero activation or firing probability. We show here that using a rectifying non-linearity gives rise to real zeros of activations and thus truly sparse representations. From a computational point of view, such representations are appealing for the following reasons:

• Information disentangling. One of the claimed objectives of deep learning algorithms (Bengio, 2009) is to disentangle the factors explaining the variations in the data. A dense representation is highly entangled because almost any change in the input modifies most of the entries in the representation vector. Instead, if a representation is both sparse and robust to small input changes, the set of non-zero features is almost always roughly conserved by small changes of the input.

• Efficient variable-size representation. Different inputs may contain different amounts of information and would be more conveniently represented using a variable-size data structure, which is common in computer representations of information. Varying the number of active neurons allows a model to control the effective dimensionality of the representation for a given input and the required precision.

• Linear separability. Sparse representations are also more likely to be linearly separable, or more easily separable with less non-linear machinery, simply because the information is represented in a high-dimensional space. Besides, this can reflect the original data format. In text-related applications for instance, the original raw data is already very sparse (see Section 4.2).

• Distributed but sparse. Dense distributed representations are the richest representations, being potentially exponentially more efficient than purely local ones (Bengio, 2009). Sparse representations' efficiency is still exponentially greater, with the power of the exponent being the number of non-zero features. They may represent a good trade-off with respect to the above criteria.

Nevertheless, forcing too much sparsity may hurt predictive performance for an equal number of neurons, because it reduces the effective capacity of the model.

3 Deep Rectifier Networks

3.1 Rectifier Neurons

The neuroscience literature (Bush and Sejnowski, 1995; Douglas et al., 2003) indicates that cortical neurons are rarely in their maximum saturation regime, and suggests that their activation function can be approximated by a rectifier. Most previous studies of neural networks involving a rectifying activation function concern recurrent networks (Salinas and Abbott, 1996; Hahnloser, 1998).

The rectifier function rectifier(x) = max(0, x) is one-sided and therefore does not enforce a sign symmetry¹ or antisymmetry¹: instead, the response to the opposite of an excitatory input pattern is 0 (no response). However, we can obtain symmetry or antisymmetry by combining two rectifier units sharing parameters.

Advantages The rectifier activation function allows a network to easily obtain sparse representations. For example, after uniform initialization of the weights, around 50% of the hidden units' continuous output values are real zeros, and this fraction can easily increase with sparsity-inducing regularization. Apart from being more biologically plausible, sparsity also leads to mathematical advantages (see previous section).

As illustrated in Figure 2 (left), the only non-linearity in the network comes from the path selection associated with individual neurons being active or not. For a given input, only a subset of neurons is active. Computation is linear on this subset: once this subset of neurons is selected, the output is a linear function of the input (although a large enough change can trigger a discrete change of the active set of neurons). The function computed by each neuron or by the network output in terms of the network input is thus piecewise linear. We can see the model as an exponential number of linear models that share parameters (Nair and Hinton, 2010). Because of this linearity, gradients flow well on the active paths of neurons (there is no gradient-vanishing effect due to the activation non-linearities of sigmoid or tanh units), and mathematical investigation is easier. Computations are also cheaper: there is no need for computing the exponential function in activations, and sparsity can be exploited.

[Figure 2. Left: sparse propagation of activations and gradients in a network of rectifier units; the input selects a subset of active neurons and computation is linear in this subset. Right: rectifier and softplus activation functions; the second is a smooth version of the first.]

¹ The hyperbolic tangent absolute-value non-linearity |tanh(x)| used by Jarrett et al. (2009) enforces sign symmetry. A tanh(x) non-linearity enforces sign antisymmetry.
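A minimal sketch (our own, with assumed shapes and a zero-mean random initialization) of one rectifier layer's forward pass, making the path selection and the exact zeros explicit:

```python
import numpy as np

rng = np.random.default_rng(0)

def rectifier_layer(x, W, b):
    """Forward pass of one rectifier layer, returning the output and the
    boolean mask of active units (the selected linear path)."""
    pre = W @ x + b
    active = pre > 0
    return np.where(active, pre, 0.0), active

# Toy layer: 1000 hidden units, inputs of dimension 100.
W = rng.normal(scale=0.1, size=(1000, 100))
b = np.zeros(1000)
h, active = rectifier_layer(rng.normal(size=100), W, b)

# Roughly half the units output exact zeros after zero-mean initialization.
print(f"exact zeros: {100.0 * (h == 0).mean():.1f}%")
```

On the active subset the layer is exactly linear in its input, which is the path-selection view described above.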
Potential Problems One may hypothesize that the hard saturation at 0 may hurt optimization by blocking gradient back-propagation. To evaluate the potential impact of this effect we also investigate the softplus activation: softplus(x) = log(1 + e^x) (Dugas et al., 2001), a smooth version of the rectifying non-linearity. We lose the exact sparsity, but may hope to gain easier training. However, experimental results (see Section 4.1) tend to contradict that hypothesis, suggesting that hard zeros can actually help supervised training. We hypothesize that the hard non-linearities do not hurt so long as the gradient can propagate along some paths, i.e., that some of the hidden units in each layer are non-zero. With the credit and blame assigned to these ON units rather than distributed more evenly, we hypothesize that optimization is easier. Another problem could arise due to the unbounded behavior of the activations; one may thus want to use a regularizer to prevent potential numerical problems. Therefore, we use the L1 penalty on the activation values, which also promotes additional sparsity. Also recall that, in order to efficiently represent symmetric/antisymmetric behavior in the data, a rectifier network would need twice as many hidden units as a network of symmetric/antisymmetric activation functions.

Finally, rectifier networks are subject to ill-conditioning of the parametrization. Biases and weights can be scaled in different (and consistent) ways while preserving the same overall network function. More precisely, consider for each layer of depth i of the network a scalar α_i, and scale the parameters as

$$ W_i' = \frac{W_i}{\alpha_i} \quad \text{and} \quad b_i' = \frac{b_i}{\prod_{j=1}^{i} \alpha_j}. $$

The output unit values then change as follows:

$$ s' = \frac{s}{\prod_{j=1}^{n} \alpha_j}. $$

Therefore, as long as $\prod_{j=1}^{n} \alpha_j$ is 1, the network function is identical.

3.2 Unsupervised Pre-training

This paper is particularly inspired by the sparse representations learned in the context of auto-encoder variants, as they have been found to be very useful in training deep architectures (Bengio, 2009), especially for unsupervised pre-training of neural networks (Erhan et al., 2010).

Nonetheless, certain difficulties arise when one wants to introduce rectifier activations into stacked denoising auto-encoders (Vincent et al., 2008). First, the hard saturation below the threshold of the rectifier function is not suited for the reconstruction units. Indeed, whenever the network happens to reconstruct a zero in place of a non-zero target, the reconstruction unit cannot backpropagate any gradient.² Second, the unbounded behavior of the rectifier activation also needs to be taken into account. In the following, we denote by x̃ the corrupted version of the input x, by σ() the logistic sigmoid function, and by θ the model parameters (W_enc, b_enc, W_dec, b_dec), and define the linear reconstruction function as:

$$ f(x, \theta) = W_{dec} \max(W_{enc} x + b_{enc}, 0) + b_{dec}. $$

Here are the several strategies we have experimented with:

1. Use a softplus activation function for the reconstruction layer, along with a quadratic cost: $L(x, \theta) = \| x - \log(1 + \exp(f(\tilde{x}, \theta))) \|^2$.
2. Scale the rectifier activation values coming from the previous encoding layer to bound them between 0 and 1, then use a sigmoid activation function for the reconstruction layer, along with a cross-entropy reconstruction cost: $L(x, \theta) = -x \log(\sigma(f(\tilde{x}, \theta))) - (1 - x) \log(1 - \sigma(f(\tilde{x}, \theta)))$.
3. Use a linear activation function for the reconstruction layer, along with a quadratic cost. We tried to use input unit values either before or after the rectifier non-linearity as reconstruction targets. (For the first layer, raw inputs are directly used.)
4. Use a rectifier activation function for the reconstruction layer, along with a quadratic cost.

² Why is this not a problem for hidden layers too? We hypothesize that it is because gradients can still flow through the active (non-zero) hidden units, possibly helping rather than hurting the assignment of credit.
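The following NumPy sketch illustrates the reconstruction function and the first two training costs; the shapes, initialization, and helper names are our own assumptions, not the authors' code:

```python
import numpy as np

def softplus(x):
    return np.logaddexp(0.0, x)  # numerically stable log(1 + exp(x))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def linear_reconstruction(x_tilde, W_enc, b_enc, W_dec, b_dec):
    """f(x, theta) = W_dec max(W_enc x + b_enc, 0) + b_dec,
    applied to the corrupted input x_tilde."""
    h = np.maximum(W_enc @ x_tilde + b_enc, 0.0)  # rectifier encoder
    return W_dec @ h + b_dec

def cost_strategy_1(x, f_out):
    """Strategy 1: softplus reconstruction with quadratic cost."""
    return np.sum((x - softplus(f_out)) ** 2)

def cost_strategy_2(x, f_out, eps=1e-12):
    """Strategy 2: sigmoid reconstruction with cross-entropy cost;
    assumes targets x lie in [0, 1]."""
    p = np.clip(sigmoid(f_out), eps, 1.0 - eps)
    return -np.sum(x * np.log(p) + (1.0 - x) * np.log(1.0 - p))

rng = np.random.default_rng(0)
x = rng.uniform(size=50)                          # toy input in [0, 1]
x_tilde = x * (rng.uniform(size=50) >= 0.25)      # masking noise, prob. 0.25
W_enc = rng.normal(scale=0.1, size=(20, 50)); b_enc = np.zeros(20)
W_dec = rng.normal(scale=0.1, size=(50, 20)); b_dec = np.zeros(50)
f_out = linear_reconstruction(x_tilde, W_enc, b_enc, W_dec, b_dec)
print(cost_strategy_1(x, f_out), cost_strategy_2(x, f_out))
```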
The first strategy has proven to yield better generalization on image data and the second one on text data. Consequently, the following experimental study presents results using those two.

4 Experimental Study

This section discusses our empirical evaluation of rectifier units for deep networks. We first compare them to hyperbolic tangent and softplus activations on image benchmarks, with and without pre-training, and then apply them to the text task of sentiment analysis.

4.1 Image Recognition

Experimental setup We considered the image datasets detailed below. Each of them has a training set (for tuning parameters), a validation set (for tuning hyper-parameters) and a test set (for reporting generalization performance). They are presented according to their number of training/validation/test examples, their respective image sizes, as well as their number of classes:

• MNIST (LeCun et al., 1998): 50k/10k/10k, 28×28 digit images, 10 classes.

• CIFAR10 (Krizhevsky and Hinton, 2009): 50k/5k/5k, 32×32×3 RGB images, 10 classes.

• NISTP: 81,920k/80k/20k, 32×32 character images from the NIST database 19, with randomized distortions (Bengio et al., 2010), 62 classes. This dataset is much larger and more difficult than the original NIST (Grother, 1995).

• NORB: 233,172/58,428/58,320, taken from Jittered-Cluttered NORB (LeCun et al., 2004). Stereo-pair images of toys on a cluttered background, 6 classes. The data has been preprocessed similarly to (Nair and Hinton, 2010): we subsampled the original 2×108×108 stereo-pair images to 2×32×32 and scaled the images linearly into the range [−1, 1]. We followed the procedure used by Nair and Hinton (2010) to create the validation set.

For all experiments except on the NORB data (LeCun et al., 2004), the models we used are stacked denoising auto-encoders (Vincent et al., 2008) with three hidden layers and 1000 units per layer. The architecture of Nair and Hinton (2010) has been used on NORB: two hidden layers with respectively 4000 and 2000 units. We used a cross-entropy reconstruction cost for tanh networks and a quadratic cost over a softplus reconstruction layer for the rectifier and softplus networks. We chose masking noise as the corruption process: each pixel has a probability of 0.25 of being artificially set to 0. The unsupervised learning rate is constant, and the following values have been explored: {0.1, 0.01, 0.001, 0.0001}. We select the model with the lowest reconstruction error.
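A minimal sketch of this corruption process, assuming the stated masking probability of 0.25:

```python
import numpy as np

def masking_noise(x, p=0.25, rng=None):
    """Corrupt an input for denoising auto-encoder training:
    each component is set to 0 independently with probability p."""
    rng = rng or np.random.default_rng()
    mask = rng.uniform(size=x.shape) >= p
    return x * mask

rng = np.random.default_rng(42)
x = rng.uniform(size=10)
print(masking_noise(x, p=0.25, rng=rng))  # ~25% of entries zeroed
```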
For the supervised fine-tuning we chose a constant learning rate from the same range as the unsupervised learning rate, selected with respect to the supervised validation error. The training cost is the negative log-likelihood −log P(correct class | input), where the probabilities are obtained from the output layer (which implements a softmax logistic regression). We used stochastic gradient descent with mini-batches of size 10 for both unsupervised and supervised training phases. To take into account the potential problem of rectifier units not being symmetric around 0, we use a variant of the activation function for which half of the units' output values are multiplied by −1. This serves to cancel out the mean activation value for each layer and can be interpreted either as inhibitory neurons or simply as a way to equalize activations numerically. Additionally, an L1 penalty on the activations with a coefficient of 0.001 was added to the cost function during pre-training and fine-tuning in order to increase the amount of sparsity in the learned representations.

Main results Table 1 summarizes the results on networks of 3 hidden layers of 1000 hidden units each, comparing all the neuron types³ on all the datasets, with or without unsupervised pre-training. In the latter case, the supervised training phase has been carried out using the same experimental setup as the one described above for fine-tuning.

Table 1: Test error on networks of depth 3. Bold results represent statistical equivalence between similar experiments, with and without pre-training, under the null hypothesis of the pairwise test with p = 0.05.

                 MNIST    CIFAR10   NISTP    NORB
With unsupervised pre-training
  Rectifier      1.20%    49.96%    32.86%   16.46%
  Tanh           1.16%    50.79%    35.89%   17.66%
  Softplus       1.17%    49.52%    33.27%   19.19%
Without unsupervised pre-training
  Rectifier      1.43%    50.86%    32.64%   16.40%
  Tanh           1.57%    52.62%    36.46%   19.29%
  Softplus       1.77%    53.20%    35.48%   17.68%

The main observations we make are the following:

• Despite the hard threshold at 0, networks trained with the rectifier activation function can find local minima of greater or equal quality than those obtained with its smooth counterpart, the softplus. On NORB, we tested a rescaled version of the softplus defined by (1/α) softplus(αx), which allows one to interpolate in a smooth manner between the softplus (α = 1) and the rectifier (α = ∞). We obtained the following α / test-error couples: 1/17.68%, 1.3/17.53%, 2/16.9%, 3/16.66%, 6/16.54%, ∞/16.40%. There is no trade-off between those activation functions. Rectifiers are not only biologically plausible, they are also computationally efficient.

• There is almost no improvement when using unsupervised pre-training with rectifier activations, contrary to what is experienced using tanh or softplus. Purely supervised rectifier networks remain competitive on all 4 datasets, even against the pre-trained tanh or softplus models.

• Rectifier networks are truly deep sparse networks. There is an average exact sparsity (fraction of zeros) of the hidden layers of 83.4% on MNIST, 72.0% on CIFAR10, 68.0% on NISTP and 73.8% on NORB. Figure 3 provides a better understanding of the influence of sparsity. It displays the MNIST test error of deep rectifier networks (without pre-training) at different average sparsity levels, obtained by varying the L1 penalty on the activations. Networks appear to be quite robust to it, as models with 70% to almost 85% of true zeros can achieve similar performances.

[Figure 3. Influence of final sparsity on accuracy: 200 randomly initialized deep rectifier networks were trained on MNIST with various L1 penalties (from 0 to 0.01) to obtain different sparsity levels. Enforcing sparsity of the activations does not hurt final performance until around 85% of true zeros.]

³ We also tested a rescaled version of the LIF and max(tanh(x), 0) as activation functions. We obtained worse generalization performance than those of Table 1, and chose not to report them.
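The rescaled softplus is easy to check numerically; this small sketch (ours) verifies that (1/α) softplus(αx) interpolates towards the rectifier as α grows:

```python
import numpy as np

def rescaled_softplus(x, alpha=1.0):
    """(1/alpha) * softplus(alpha * x): equals softplus at alpha = 1 and
    approaches the rectifier max(0, x) as alpha -> infinity."""
    return np.logaddexp(0.0, alpha * x) / alpha  # stable log(1 + exp(.))

x = np.linspace(-2.0, 2.0, 401)
for alpha in (1.0, 1.3, 2.0, 6.0, 50.0):
    gap = np.max(np.abs(rescaled_softplus(x, alpha) - np.maximum(x, 0.0)))
    print(f"alpha = {alpha:5.1f} -> max gap to rectifier = {gap:.4f}")
```

The largest gap is log(2)/α at x = 0, so the family converges uniformly to the rectifier as α increases.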
With labeled data, deep rectifier networks appear to be attractive models. They are biologically credible and, compared to their standard counterparts, do not seem to depend as much on unsupervised pre-training, while ultimately yielding sparse representations.

This last conclusion differs slightly from those reported in (Nair and Hinton, 2010), in which it is demonstrated that unsupervised pre-training with Restricted Boltzmann Machines using rectifier units is beneficial. In particular, that paper reports that pre-trained rectified Deep Belief Networks can achieve a test error on NORB below 16%. However, we believe that our results are compatible with those: we extend the experimental framework to a different kind of model (stacked denoising auto-encoders) and to different datasets (on which conclusions seem to be different). Furthermore, note that our rectified model without pre-training on NORB is very competitive (16.4% error) and outperforms the 17.6% error of the non-pre-trained model from Nair and Hinton (2010), which is basically what we find with the non-pre-trained softplus units (17.68% error).

Semi-supervised setting Figure 4 presents results of semi-supervised experiments conducted on the NORB dataset. We vary the percentage of the original labeled training set which is used for the supervised training phase of the rectifier and hyperbolic tangent networks, and evaluate the effect of unsupervised pre-training (using the whole training set, unlabeled). Confirming the conclusions of Erhan et al. (2010), the network with hyperbolic tangent activations improves with unsupervised pre-training for any labeled set size (even when all the training set is labeled). However, the picture changes with rectifying activations. In semi-supervised setups (with few labeled data), the pre-training is highly beneficial. But the more the labeled set grows, the closer the models with and without pre-training become. Eventually, when all available data is labeled, the two models achieve identical performance. Rectifier networks can thus maximally exploit labeled and unlabeled information.

[Figure 4. Effect of unsupervised pre-training: on NORB, hyperbolic tangent and rectifier networks, with or without unsupervised pre-training, are fine-tuned on subsets of increasing size of the training set.]

4.2 Sentiment Analysis

Nair and Hinton (2010) also demonstrated that rectifier units were efficient for image-related tasks. They mentioned the intensity equivariance property (i.e., without bias parameters the network function is linearly variant to intensity changes in the input) as an argument to explain this observation. This would suggest that rectifying activation is mostly useful for image data. In this section, we investigate a different modality to cast a fresh light on rectifier units. A recent study (Zhou et al., 2010) shows that Deep Belief Networks with binary units are competitive with the state-of-the-art methods for sentiment analysis.
This indicates that deep learning is appropriate to this text task, which therefore seems ideal for observing the behavior of rectifier units on a different modality, and for providing a data point towards the hypothesis that rectifier nets are particularly appropriate for sparse input vectors, such as those found in NLP. Sentiment analysis is a text-mining area which aims to determine the judgment of a writer with respect to a given topic (see (Pang and Lee, 2008) for a review). The basic task consists in classifying the polarity of reviews, either by predicting whether the expressed opinions are positive or negative, or by assigning them star ratings on 3, 4 or 5 star scales.

Following a task originally proposed by Snyder and Barzilay (2007), our data consists of restaurant reviews extracted from a restaurant review site. We have access to 10,000 labeled and 300,000 unlabeled training reviews, while the test set contains 10,000 examples. The goal is to predict the rating on a 5 star scale, and performance is evaluated using Root Mean Squared Error (RMSE).⁴

⁴ Even though our tasks are identical, our database is
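For reference, a minimal sketch (ours) of the RMSE evaluation on 5-star predictions:

```python
import numpy as np

def rmse(predicted, target):
    """Root Mean Squared Error between predicted and true star ratings."""
    predicted = np.asarray(predicted, dtype=float)
    target = np.asarray(target, dtype=float)
    return np.sqrt(np.mean((predicted - target) ** 2))

print(rmse([4.2, 3.1, 5.0], [4, 3, 5]))  # ~0.129
```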

A BIOLOGICALLY INSPIRED NEURAL NETWORK FOR IMAGE ENHANCEMENT*

Yinghua Li, Tian Pu, Jian Cheng
School of Electronic Engineering, University of Electronic Science and Technology of China, 611731, Chengdu, Sichuan, China
liyinghua666@, courierpt@, pami.cheng@

2010 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS 2010), December 6-8, 2010

ABSTRACT

A promising trend of image processing is to incorporate some knowledge of the human visual system. In this paper, we propose an improved pulse coupled neural network (PCNN) for image enhancement. We apply the passive membrane equation, which is known as a model for describing the ON-OFF opponent property of the receptive fields of the retinal ganglion cells, as the linking field to modulate the feeding field input of the PCNN, and obtain the enhanced neural pulse as the output image. Initially, the RGB image is converted to luminance and chrominance images. Only the achromatic image is enhanced. Finally, the RGB image is reconstructed from the enhanced luminance component along with the original chrominance component. The experimental results show the effectiveness of the method.

Index Terms: pulse coupled neural network; opponent neural network; image enhancement

* This work is supported by grants from the National Basic Research (973) Program of China (No. 2007CB714406), by the Fundamental Research Funds for the Central Universities (No. ZYGX2009X003, ZYGX2009Z005), and by a grant from the Key Program of the Science Foundation for Young Scholars in University of Electronic Science and Technology of China (No. JX0804).

1. INTRODUCTION

Image enhancement is of great importance in many applications, because some features in an image are hardly detected by eye. Since images are observed by humans, it is reasonable to develop psychophysically derived methods. Therefore, finding a suitable visual model is a promising way to develop new image processing methods. There is a wealth of published literature modeling the human visual system [1], [2]. Here, we pay attention to the pulse coupled neural network (PCNN) and the feed-forward ON-OFF shunting neural network.

The pulse coupled neural network (PCNN) is a novel neural network different from traditional ones, developed by Eckhorn and his colleagues based on experimental observations of synchronous pulse bursts in mammalian visual cortex [3]. It has significant potential applications in image processing and has been widely applied to image fusion [4], segmentation [5] and image denoising [6]. The ON-OFF shunting neural network is a model for understanding the real-time adaptive response of the receptive fields of the retinal ganglion cells to complex and dynamic afferent stimuli. Knowing this, our motivation for incorporating these two neural networks is that the combination is more analogous to the retinal processing of the human visual system.

As mentioned above, in this paper we present a method based on an improved PCNN. We use the response of the ON-OFF shunting neural network [7] as the linking field of the PCNN, so that the enhanced contrast components are combined with the input signals through the state-dependent modulation of the PCNN. Applications of this algorithm to various color images yield promising results.

2. COLOR IMAGE ENHANCEMENT BASED ON IMPROVED PCNN

2.1. The Property of Visual Confrontation

Physiological studies show that the receptive fields of ganglion cells in the retina organize in two antagonistic center-surround forms: excitatory ON-center with inhibitory OFF-surround cells, and inhibitory OFF-center with excitatory ON-surround cells [7]. The opponent mechanism in receptive fields means that ganglion cells primarily transfer contrasts rather than light intensities.

The passive membrane equation (PME), initially founded by Hodgkin and Huxley as a model, was expanded by Grossberg [8] to model the neural dynamics of the ON-OFF receptive fields. It is described by the following equation:

$$ \frac{dX_k(i,j)}{dt} = -A\left[X_k(i,j) - D\right] + \left[E - X_k(i,j)\right]C_k(i,j) - \left[F + X_k(i,j)\right]S_k(i,j) \tag{1} $$

where A is the decay rate constant; D is the baseline activity of the cell; E and F are nonnegative constants representing the upper and lower bounds of the neural activity, respectively; C_k(i,j) are the inputs to the center neuron; and S_k(i,j) are the inputs from the surround neurons. Studies show that the PME provides outputs that compute the ratio contrast in the image. We use the PME as a contrast extractor in the processing workflow.
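To illustrate equation (1), a small NumPy sketch (ours) integrates the PME to steady state with a forward-Euler step; all parameter values and the toy inputs are assumptions for demonstration:

```python
import numpy as np

def pme_step(X, C, S, A=1.0, D=0.0, E=1.0, F=1.0, dt=0.01):
    """One forward-Euler step of the passive membrane equation (1):
    dX/dt = -A(X - D) + (E - X) C - (F + X) S.
    A: decay rate, D: baseline activity, E/F: upper/lower activity bounds,
    C: center inputs, S: surround inputs. Values here are illustrative."""
    dX = -A * (X - D) + (E - X) * C - (F + X) * S
    return X + dt * dX

# Center/surround inputs for a toy 2x2 "image"; iterate to steady state.
C = np.array([[2.0, 4.0], [8.0, 16.0]])
S = np.array([[1.0, 2.0], [4.0, 8.0]])
X = np.zeros_like(C)
for _ in range(5000):
    X = pme_step(X, C, S)

# Steady state: X = (A*D + E*C - F*S) / (A + C + S), i.e. a ratio contrast
# that is normalized by the total input rather than its absolute level.
print(X)
print((1.0 * C - 1.0 * S) / (1.0 + C + S))  # analytic check (A=1, D=0, E=F=1)
```

Setting dX/dt = 0 gives the steady state X = (AD + EC − FS)/(A + C + S), which makes explicit why the PME acts as the contrast extractor described above.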