Image classification using multimedia knowledge networks

Ana B. Benitez and Shih-Fu Chang
Dept. of Electrical Engineering, Columbia University, New York, NY 10027
{ana, sfchang} @

* This research is partly supported by a Kodak fellowship awarded to the first author of the paper.

ABSTRACT

This paper presents novel methods for classifying images based on knowledge discovered from annotated images using WordNet. The novelty of this work is the automatic class discovery and the classifier combination using the extracted knowledge. The extracted knowledge is a network of concepts (e.g., image clusters and word-senses) with associated image and text examples. Concepts that are statistically similar are merged to reduce the size of the concept network. Our knowledge classifier is constructed by training a meta-classifier to predict the presence of each concept in images. A Bayesian network is then learned using the meta-classifiers and the concept network. For a new image, the presence of concepts is first detected using the meta-classifiers and refined using Bayesian inference. Experiments have shown that combining classifiers using knowledge-based Bayesian networks yields accuracy superior (by up to 15%) or comparable to that of individual classifiers and purely statistically learned classifier structures. Another contribution of this work is the analysis of the role of visual and text features in image classification. As text or joint text+visual features perform better in classifying images than visual features, we tried to predict text features for images without annotations; however, the accuracy of visual + predicted text features did not consistently improve over visual features.

1. INTRODUCTION

In recent years, there has been a major increase in available multimedia and in technologies to access the multimedia. Users often want to retrieve, filter and navigate multimedia at the semantic level (e.g., people). However, current multimedia applications use features at the perceptual level (e.g., color), failing to meet user needs. For example, the study [5] found that less than 20% of the attributes used by humans in describing images for retrieval were related to visual features. In addition, the most popular user operation in the web image search engine WebSEEk [10] was found to be subject hierarchy browsing. This paper focuses on image classification. Image classifiers can be used to annotate images with semantic labels. However, current approaches lack flexibility: they are often constrained to specific domains and trained on limited data sets.

Prior work on image annotation and classification can be reviewed in terms of input features, classifier structure, and class selection. Many methods rely solely on perceptual features such as color histograms [4][9][11], whereas few also consider text features from annotations or captions [7][8]. Some approaches use only individual classifiers or joint distributions [1], while others combine multiple classifiers for improved accuracy [4][9][11]. Finally, experts handpick the classes in many methods [11][7][8][9], to which the classifiers are often fine-tuned. Exceptions are frameworks where "expert" users define their own classes and relations [4], and approaches that associate words annotating images with new images or regions [1]. The most similar prior work is [7] and [9], which learn Bayesian Networks (BNs) with classifiers as nodes. However, the BN is either manually entered by experts or automatically learned using costly statistical methods.

In this paper, we present novel approaches towards image classification using visual and text features. The main contributions of this work are the automatic selection of salient classes, and the combination of multiple classifiers based on knowledge extracted from annotated images. In addition, this work analyses the role of visual and text features in image classification. As text or joint text+visual features perform better than visual features [8], we try to predict text features for images without annotations. We use the term "knowledge classifier" to refer to our image classification framework, and "knowledge network" to refer to a concept network with associated media examples.

Knowledge networks are constructed from annotated images by clustering the images based on visual and text features (perceptual knowledge), and by disambiguating the senses of the words in the annotations using WordNet [6] and the image clusters (semantic knowledge) [2]. Visual, statistical and semantic relations are discovered among concepts (e.g., image clusters and word senses). Statistically similar concepts can be merged to reduce the number of concepts in the knowledge network. We propose to build a knowledge classifier for a knowledge network in two steps. First, we train a meta-classifier to predict the presence of each concept in images using visual and text features. A meta-classifier can be the result of combining several classifiers of different types or feature inputs. Then, a Bayesian network is learned using the meta-classifiers and the concept network. The presence of concepts in a new image is first detected using the meta-classifiers, and this initial classification is refined using Bayesian inference. Text features are predicted for images without annotations using clustering and statistical approaches based on visual features extracted from the images.

The paper is organized as follows. Section 2 summarizes the knowledge discovery process. Section 3 describes the construction of the knowledge classifier. The way concepts are detected in new images is explained in Section 4. Section 5 presents the experimental setup and results. Finally, Section 6 concludes with a summary and some future work.

2. DISCOVERING KNOWLEDGE NETWORKS

The discovery of knowledge from annotated images consists of four steps (see [2] for details): basic image and text processing, perceptual knowledge extraction, semantic knowledge extraction, and knowledge summarization. The result is a network of concepts with associated image and text examples.

First, images and annotations are processed separately. Images are segmented into regions with homogeneous color and edge. Then, features such as color histograms and region size are extracted from the images and regions, respectively. Similarly, words in annotations are stemmed down to their base form and tagged with their part of speech (e.g., verb). After discarding stopwords and rare words, words are represented as vectors using word-weighting schemes such as tf * idf and log tf * entropy.
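To make the word-weighting step concrete, the sketch below computes both schemes for a toy term-count matrix. It is a minimal illustration rather than the authors' code: the toy counts, the natural logarithm, and the log(1 + tf) damping are assumptions, and several variants of both schemes exist.

```python
import numpy as np

# counts[d, w]: how often stemmed word w occurs in annotation d (toy data).
counts = np.array([[2.0, 0.0, 1.0],
                   [0.0, 3.0, 1.0],
                   [1.0, 1.0, 0.0]])
n_docs = counts.shape[0]

# tf * idf: term frequency scaled by inverse document frequency, so words
# concentrated in few annotations weigh more.
df = np.count_nonzero(counts, axis=0)          # annotations containing each word
tf_idf = counts * np.log(n_docs / df)

# log tf * entropy: damped term frequency scaled by a global weight in [0, 1]
# that is low for words spread evenly over all annotations.
p = counts / counts.sum(axis=0)                # P(annotation | word)
plogp = np.where(p > 0, p * np.log(np.where(p > 0, p, 1.0)), 0.0)
entropy_weight = 1.0 + plogp.sum(axis=0) / np.log(n_docs)
log_tf_entropy = np.log1p(counts) * entropy_weight
```

Either scheme turns each annotation into a dense numeric vector, which is the text-feature representation that the clustering step below operates on.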
Perceptual knowledge is discovered by grouping images into clusters based on their visual and text features. We use well-known clustering algorithms: k-means, k-nearest neighbors, and self-organizing map algorithms, among others. Relationships among clusters are found based on centroid proximity and cluster statistics. For example, a cluster is considered to have similarity relationships with its k-nearest cluster neighbors based on their centroids' distances. Clusters and cluster relations are concepts and concept relations in the knowledge network.
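The sketch below illustrates this perceptual step under stated assumptions: plain k-means over hypothetical per-image feature vectors (the `features` array is random stand-in data), followed by a similarity relation from each cluster to its nearest neighbor clusters by centroid distance. The cluster count and neighbor count are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 16))   # stand-in visual+text vectors, one row per image

def kmeans(X, k, iters=50):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each image to its closest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its members (skip empty clusters).
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, labels

centroids, labels = kmeans(features, k=8)

# Similarity relations: link each cluster to its n nearest clusters by centroid
# distance, mirroring the k-nearest-neighbor rule described above.
n_neighbors = 2
cdist = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=2)
np.fill_diagonal(cdist, np.inf)
relations = [(i, int(j)) for i in range(len(centroids))
             for j in np.argsort(cdist[i])[:n_neighbors]]
```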
For example, a cluster is considered to have similarity relationships with its k nearest cluster neighbors based on the distances between their centroids. Clusters and cluster relations become concepts and concept relations in the knowledge network.

Semantic knowledge is extracted by disambiguating the senses of the words in the annotations using WordNet and the image clusters. WordNet is a dictionary that organizes English words into sets of synonyms (e.g., “rock, stone”) and connects them with semantic relations (e.g., generalization) [6]. We assume that images in the same cluster are often related semantically. The words annotating the images in each cluster are matched to the definitions of the possible senses of each word using word-weighting schemes. Disambiguated senses are added as concepts to the knowledge network. Relationships and intermediate senses on the paths connecting disambiguated senses are found in WordNet and added to the knowledge network.

Finally, the knowledge network can be summarized by merging similar concepts (e.g., image clusters and word senses). Merged concepts inherit all relations from the individual concepts except for relations whose two vertices belong to the same merged concept. The distance between concepts in a knowledge network is calculated using a novel technique based on both concept statistics and network topology. The distance of a relationship between two concepts increases with the concepts' probabilities but decreases with the concepts' conditional probabilities through that relationship. The distance between any two concepts is then the length of the shortest path between them. Figure 1 shows examples of a concept network and a summarized concept network.

3. BUILDING KNOWLEDGE CLASSIFIERS

A knowledge classifier is built for a knowledge network in two steps: training meta-classifiers to predict the presence of concepts in images, and building a Bayesian network using the meta-classifiers and the concept network.

First, one or more classifiers are trained to predict the presence of a concept in images based on visual and text features. The class labels indicate concept presence strength, such as {presence, weak presence, absence}. For image clusters, the labels are the presence or absence of an image in the cluster; for word senses, the quantized disambiguation scores. We use well-known classifiers including Naïve Bayes (NB) and Support Vector Machines (SVM). Several two-class classifiers can learn more than two classes using the one-per-class coding technique. If multiple classifiers are built for a concept (e.g., for different features), the classifiers are combined into a meta-classifier using techniques such as stacking and majority voting.

Bayesian Networks (BNs) are directed graphical models that allow the efficient and compact representation of joint probability distributions over multiple random variables. We propose two approaches to combining meta-classifiers using Bayesian networks (see Figure 1). In the first approach (BN:MC), the nodes of the BN are the meta-classifiers; each node thus indirectly represents a concept. The topology of the BN is set to that of the concept network after removing cycles. Each relation is assigned a direction in accordance with the cause-effect dependencies of a BN, if applicable (e.g., specialization: dog -> animal). Cycles are broken by removing all relations between the first two adjacent concepts (i.e., concepts connected by a relationship) in a cycle.
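The following Python sketch illustrates this topology construction under simplifying assumptions: a toy concept network (not the paper's data) is oriented into directed arcs, and a relation is simply dropped whenever adding it would close a cycle, which is a simplification of the cycle-removal rule described above.

    # Sketch only: orient concept relations as directed BN arcs and keep
    # the graph acyclic. The toy relations are illustrative assumptions.
    import networkx as nx

    # Specialization is oriented child -> parent (e.g., dog -> animal);
    # the last relation below would close a cycle.
    relations = [("dog", "animal"), ("cat", "animal"),
                 ("animal", "fauna"), ("fauna", "dog")]

    bn = nx.DiGraph()
    for child, parent in relations:
        bn.add_edge(child, parent)
        if not nx.is_directed_acyclic_graph(bn):
            # Drop the relation that closes the cycle (simplified rule).
            bn.remove_edge(child, parent)

    assert nx.is_directed_acyclic_graph(bn)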
In the second approach (BN:MC+RC), the BN has meta-classifiers and real concepts as nodes, where a real concept node directly represents the presence of a concept. The arcs connecting real concept nodes in the BN are the relations in the concept network minus cycles. In addition, real concept nodes have incoming arcs from the meta-classifier nodes associated with adjacent concepts in the concept network. In both approaches, the parameters and the structure of the BN can be learned using standard statistical methods.

4. CLASSIFYING IMAGES

Once trained, the knowledge classifier uses the meta-classifiers to predict the presence of concepts in images. This initial prediction is refined using Bayesian inference.

For a new image, visual (and text) features are extracted from the image (and its annotations, if any). The features are input to the meta-classifiers. In BN:MC, the concept labels predicted by the best meta-classifiers are entered as observed values of the corresponding nodes in the BN (phase MC). An expert decides the number of best meta-classifiers. The performance of a meta-classifier is its concept detection accuracy on training images. The labels of the other concepts are inferred using the Bayesian network (phase MC+BN). Unconnected concepts are labeled using only the meta-classifiers. In addition, new concept labels for concepts detected using meta-classifiers can be refined or found using Bayesian inference (phase MC+2BN). In BN:MC+RC, the output labels of the meta-classifiers are observed values of the associated nodes in the BN. The presence of all the real concepts is then predicted using Bayesian inference. In both cases, senses disambiguated in the annotations of new images can be entered as observed values for the corresponding meta-classifier and real concept nodes in BN:MC and BN:MC+RC, respectively.
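To make the refinement step concrete, the toy Python example below (all probabilities are illustrative assumptions, and a real concept network would be much larger) treats a meta-classifier's output as observed evidence and recovers the posteriors of a concept and of a related concept by enumerating the joint distribution, which is what Bayesian inference amounts to at this small scale.

    # Sketch only: refine concept presence by conditioning on an observed
    # meta-classifier output in a 3-node chain animal -> dog -> MC.
    import numpy as np

    p_animal = np.array([0.6, 0.4])               # P(animal = [1, 0])
    p_dog_given_animal = np.array([[0.5, 0.02],   # P(dog=1 | animal=[1, 0])
                                   [0.5, 0.98]])  # P(dog=0 | animal=[1, 0])
    p_mc_given_dog = np.array([[0.9, 0.2],        # P(MC="dog" | dog=[1, 0])
                               [0.1, 0.8]])

    # Observe MC = "dog": enumerate the joint and condition on the evidence.
    joint = p_animal[None, :] * p_dog_given_animal * p_mc_given_dog[0][:, None]
    p_dog_given_mc = joint.sum(axis=1) / joint.sum()     # refined P(dog | MC)
    p_animal_given_mc = joint.sum(axis=0) / joint.sum()  # inferred P(animal | MC)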
Text or joint text+visual features perform better than visual features in image classification [8]. If a new image does not have annotations, we try to predict its text features from its visual features in order to label the image using knowledge classifiers that take text features. We propose to estimate the text features by clustering the training images based on text features and modeling the visual features of the images within each cluster using a Gaussian model (clustering approach). We predict the text features for an image as the center of the cluster associated with the most likely Gaussian model given the visual features of the image. We also adopt the statistical approach proposed for handling missing and unreliable acoustic data in [3]. This technique models the distribution of features for the images of a given class using a mixture of Gaussian models with diagonal-only covariance. The predicted text features for a new image are the mean text features conditioned on the visual features of the image, given a class.
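In the spirit of this statistical approach (a sketch under assumptions; the dimensions, component count, random data, and the helper predict_text are all illustrative, not the paper's implementation), one can fit a diagonal-covariance Gaussian mixture on joint (visual, text) features for a class and predict text features as the component-posterior-weighted mean of the text dimensions given only the visual dimensions:

    # Sketch only: predict text features as E[text | visual] under a
    # diagonal-covariance Gaussian mixture fit on joint features.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)
    V, T = 166, 125                    # visual / text feature dimensions
    train = rng.random((400, V + T))   # joint training features for one class

    gmm = GaussianMixture(n_components=5, covariance_type="diag",
                          random_state=0).fit(train)

    def predict_text(visual_feat):
        # Log-density of the visual dims under each component's diagonal
        # Gaussian, plus the log mixture weight.
        mu, var = gmm.means_[:, :V], gmm.covariances_[:, :V]
        quad = np.log(2 * np.pi * var) + (visual_feat - mu) ** 2 / var
        logp = -0.5 * quad.sum(axis=1) + np.log(gmm.weights_)
        resp = np.exp(logp - logp.max())
        resp /= resp.sum()               # component responsibilities
        # With diagonal covariance, E[text | visual] is the responsibility-
        # weighted average of the component means over the text dimensions.
        return resp @ gmm.means_[:, V:]

    predicted_text = predict_text(rng.random(V))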
5. EVALUATION

From a collection of 2706 annotated nature images, 2437 were used to train knowledge classifiers with different parameters. The remaining 269 were used to test the performance of the classifiers in terms of classification accuracy.

5.1. Experimental setup

The collection of 2706 nature images was taken from Berkeley's CalPhotos collection (http://elib.cs.berkeley.edu/photos). The images in CalPhotos are labeled as plants (857), animals (818), landscapes (660) or people (371). We use a few keywords from the annotations describing the main objects or people depicted in the pictures (e.g., “plant, flower”).

A knowledge network was constructed using the 2437 training images. Color histograms (166 bins) were extracted from the images, and log tf * entropy features (125 dimensions, with latent semantic analysis) from the annotations. Color histograms have proven effective in retrieving natural images; in addition, it is widely accepted that log tf * entropy outperforms other word-weighting schemes in Information Retrieval. A concept network was then constructed using the senses of the words in the annotations. The initial network of 52 semantic concepts, 47 specialization relations and 2 aggregation relations was summarized into 16 concepts and 13 specialization relations. See Table 1 for a list of the most frequent words in the annotations and concepts in the summarized knowledge network. Knowledge classifiers were then built for different classifiers, features, and structures, among others. The first author of this paper generated the ground truth of correct senses for the words in all the image annotations. We used the mean classification accuracy (over the 16 concepts) to compare the resulting classifiers. For a concept, the accuracy is the percentage of testing images to which the concept is correctly assigned. Concept accuracies were weighted by 1 - p log(p), where p is the probability of the concept in the training annotations; very common and very rare concepts are thus given less importance.
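A small Python sketch of this weighting follows; the per-concept accuracies and probabilities are illustrative, and normalizing by the sum of the weights is our assumption about how the weighted accuracies are averaged.

    # Sketch only: mean accuracy with per-concept weights 1 - p*log(p).
    import numpy as np

    acc = np.array([0.92, 0.85, 0.78])  # per-concept accuracy on test images
    p = np.array([0.40, 0.10, 0.02])    # concept probability in training data

    w = 1.0 - p * np.log(p)             # near 1 for very common/rare concepts
    mean_weighted_acc = np.sum(w * acc) / np.sum(w)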
5.2. Experimental results

Table 2 lists the mean classification accuracy of knowledge classifiers built for (1) different features: color histogram (CH), log tf * entropy (LE), log tf * entropy predicted using the clustering (CPLE) and statistical (SPLE) approaches, and combinations of these; (2) different meta-classifiers (or classifiers): SVM and NB; (3) different structures of the Bayesian network: meta-classifiers with no BN (MC no BN), a BN of meta-classifiers (BN:MC), and a BN of meta-classifiers and real concepts (BN:MC+RC); and (4) learning the parameters (PA), and also the structure (+ST), of the BN. The accuracies for BN:MC correspond to the best knowledge classifier at phase MC+BN or MC+2BN using 2 or 8 meta-classifiers in phase MC. In addition, we include results for disambiguating senses in the annotations and activating the corresponding nodes in the BN (+O), another way of using annotations in the classification. As a baseline, randomly deciding the presence of concepts in images resulted in accuracies of about 50%.

The classifiers in Table 2 use the correct senses of the words in the annotations during knowledge network and classifier construction. We do this in order to decouple classification errors from disambiguation errors. When senses were disambiguated automatically, as described in Section 2, only 65% of the words were disambiguated correctly. However, classification accuracies still reached 90% and 80% for SVM and NB, respectively, using color histogram + log tf * entropy and log tf * entropy features. In addition, for both correct and automatically disambiguated senses, we observed similar trends in the results for the same features, classifiers, etc.

As shown in Table 2, if annotations are available for new images, the best performing systems use (1) the individual SVM meta-classifiers (MC no BN) and (2) the BN of SVM meta-classifiers and real concepts (BN:MC+RC), using either text features (LE) or text and visual features (CH+LE). The differences in accuracy between these systems are not significant. When annotations are not available for classification (i.e., only color histograms are input to the meta-classifiers), the highest accuracy is again achieved by (1) the individual SVM meta-classifiers and (2) the BN of SVM meta-classifiers and real concepts. In both cases, with and without annotations, having real concepts in the BN outperformed the BN of meta-classifiers alone (BN:MC) by up to 15%. Although the improvements from the BN of meta-classifiers and real concepts are insignificant with respect to the uncombined classifiers for SVM, gains of up to 15% in accuracy were obtained for NB. These are good indications of the importance of including nodes corresponding to real concepts in the BN. In addition, combining classifiers using a BN can offer significant performance gains that are not tied to specific choices of features and classifiers.

Other conclusions can be drawn from Table 2. First, the structure of the knowledge network discovered from annotated images using WordNet helps in classifying images: BNs of meta-classifiers (and, especially, of real concepts) whose structures were based on discovered knowledge networks consistently outperformed BNs with purely statistically learned structures by up to 15%. In addition, observing values of nodes in the BN based on senses disambiguated in the annotations improves the accuracy and robustness of knowledge classifiers even with text feature inputs (+O). As an example, the most accurate NB-based knowledge classifier used color histogram inputs and +O. Finally, predicting text features from visual features did not improve the most accurate knowledge classifier with color histogram inputs and the SVM classifier. However, it improved the results of MC no BN and BN:MC for the NB classifier. Based on these results, a better way to improve the classification of images without annotations may be to perform Bayesian inference using predicted concept labels as observed values of nodes in the BN (+O with predicted concept labels).

6. CONCLUSIONS

This paper presents novel methods for classifying images based on knowledge discovered from annotated images. The main novelty of this work is the automatic use of the extracted knowledge to discover salient classes and to combine multiple classifiers for improved performance. Experiments have shown that combining classifiers based on knowledge discovered and summarized from annotated images using WordNet yields accuracy superior (by up to 15%) or comparable to that of individual classifiers and of purely statistically learned classifier structures. Another contribution of this work is the analysis of the role of visual and text features in image classification. As text or joint text+visual features perform better than visual features in classifying images, we tried to predict text features for images without annotations; however, the accuracy of visual + predicted text features did not consistently improve over visual features alone. Directions for future work include discovering knowledge from and classifying image regions, determining which concepts are accurately detected using trained classifiers, and distinguishing concepts that are applicable to images and/or regions. We envision using this information to refine discovered knowledge networks.

7. REFERENCES

[1] Barnard, K., P. Duygulu, D. Forsyth, et al., "Matching Words and Pictures", JMLR, in press.
[2] Benitez, A.B., and S.-F. Chang, "Automatic Multimedia Knowledge Discovery, Summarization and Evaluation", submitted to IEEE Trans. on Multimedia.
[3] Cooke, M., P. Green, L. Josifovski and A. Vizinho, "Robust Automatic Speech Recognition with Missing and Uncertain Acoustic Data", Speech Communication, 2001.
[4] Jaimes, A., and S.-F. Chang, "Learning Structured Visual Detectors From User Input at Multiple Levels", IJIG, Special Issue on Image and Video Databases, Aug. 2001.
[5] Jörgensen, C., "Attributes of Images in Describing Tasks", Information Processing & Management, Vol. 34, No. 2/3, 1998.
[6] Miller, G.A., "WordNet: A Lexical Database for English", Comm. of the ACM, Vol. 38, No. 11, pp. 39-41, Nov. 1995.
[7] Paek, S., and S.-F. Chang, "The Case for Image Classification Systems Based on Probabilistic Reasoning", ICME-2000, New York, NY, USA, July 30 - Aug. 2, 2000.
[8] Paek, S., C.L. Sable, V. Hatzivassiloglou, A. Jaimes, et al., "Integration of Visual and Text Based Approaches for the Content Labeling and Classification of Photographs", SIGIR-1999, Berkeley, CA, 1999.
[9] Naphade, M.R., and T.S. Huang, "A Probabilistic Framework for Semantic Video Indexing, Filtering, and Retrieval", IEEE Trans. on Multimedia, Vol. 3, No. 1, March 2001.
[10] Smith, J.R., and S.-F. Chang, "An Image and Video Search Engine for the World-Wide Web", IS&T/SPIE-1997, San Jose, CA, 1997.
[11] Szummer, M., and R. Picard, "Indoor-Outdoor Image Classification", IEEE Workshop on Content-Based Access to Image and Video Databases, Bombay, India, Jan. 1998.

Figure 1: Examples of (a) a concept network, (b) a summarized concept network, (c) a BN of meta-classifier nodes, and (d) a BN of meta-classifier and real concept nodes. (Legend: BN node: meta-classifier; BN node: real concept; BN arc; concept.)

Word          %      Concept                               %
Plant       15.88    Plant, flora, vine, tree            18.66
Animal      15.08    Animal, beast, fauna                14.96
Flower      13.30    Natural object, plant part, flower  13.19
Habitat     12.19    Society, people, group, culture     12.66
Landscape   12.19    Vicinity, country, landscape        12.09
People       6.85    Habitat, geographic area, region    12.09

Table 1: Most frequent words in the annotations and concepts in the knowledge summary, with occurrence probabilities (%).

Features: CH
       MC no BN   BN:MC (PA / +ST / +O)    BN:MC+RC (PA / +ST / +O)
SVM    84.31      81.93 / 83.00 / 84.36    82.30 / 80.40 / 94.27
NB     65.45      65.30 / 60.56 / 68.89    81.33 / 80.40 / 94.96

Features: CH+CPLE                 Features: CH+SPLE
       MC no BN   BN:MC  BN:MC+RC        MC no BN   BN:MC  BN:MC+RC
SVM    79.30      79.19  79.40           43.40      66.32  39.90
NB     70.35      77.24  78.94           60.79      75.68  77.16

Features: CH+LE
       MC no BN   BN:MC (PA / +ST / +O)    BN:MC+RC (PA / +ST / +O)
SVM    99.58      95.59 / 95.58 / 99.48    99.61 / 83.09 / 99.62
NB     82.76      82.10 / 81.10 / 88.29    86.04 / 80.47 / 92.40

Features: LE
       MC no BN   BN:MC (PA / +ST / +O)    BN:MC+RC (PA / +ST / +O)
SVM    99.66      95.72 / 99.66 / 99.56    99.74 / 83.12 / 99.81
NB     85.52      83.87 / 83.88 / 87.74    90.35 / 83.31 / 95.48

Table 2: Mean classification accuracy for different classifiers (SVM: Support Vector Machines, NB: Naïve Bayes), different input features (CH: color histogram, LE: log tf * entropy, CPLE: LE predicted using the clustering approach, SPLE: LE predicted using the statistical approach), and different structures of the BN (MC no BN: only meta-classifiers, BN:MC: BN of meta-classifiers, BN:MC+RC: BN of meta-classifiers and real concepts). Columns PA and +ST are results for learning the parameters, and also the structure, of the BN, respectively. Columns +O are results from observing nodes in the BN+PA for senses disambiguated in the annotations.
