Application tuning and debugging on Linux


Vibrating tuning fork manual (Tuning fork)

Kunpeng Fork Level Switch
Operation Manual

Thank you for choosing our product! Before operation, please read this Operation Manual carefully to ensure normal operation of the instrument.

Compact type fork level switch

Brief introduction to product
Based on tuning fork vibration technology, this fork level switch has a durable stainless steel body and fork, so it can be used in a wide range of applications. The economical 3/4" or 1" level switch can be fixed to a pipeline or storage tank by a threaded connection, or installed in food-industry equipment with sanitary fittings. The switch offers direct load switching, which works with any power supply, or a PNP output that can serve as a direct interface to a PLC.

Product features
◆ Virtually unaffected by flow, turbulence, bubbles, foam, vibration, solids content, coating and liquid properties
◆ Few installation steps; no position marking is required
◆ Polarity-insensitive, with short-circuit protection
◆ Standard plug/socket connection available
◆ No moving parts or crevices, so the switch is maintenance-free
◆ Electronics self-check and status monitoring are shown on the LED display
◆ Magnetic test point makes the function test easier
◆ Compact design keeps the outline small
◆ Quicker response to viscous liquids thanks to the "Dripple" fork design
◆ Sanitary fittings

Measuring principle
The level switch works on the tuning fork principle: a piezoelectric crystal drives the fork at its natural frequency, and the frequency is monitored continuously. When the switch is used as a low-level alarm, the liquid drains away from the fork as the level falls, which changes the natural frequency; the electronics detect this change and switch the output state.
When the switch is used as a high-level alarm, the rising liquid comes into contact with the fork, which again changes the frequency and switches the output.

Short fork technology
The natural frequency of the fork (about 1300 Hz) is chosen to avoid interference that could cause false switching. The short fork extends only a little way into the container, and extensive research has established a stable basis for reliable operation, so the product suits almost any liquid, including coating liquids, aerated liquids and slurries.

Special functions

LED display
The LED can be observed through the window in the housing. In the "CLOSE" state the LED flashes continuously (once every 0.5 s); in the "OPEN" state it stays on.

Magnetic test point
The test point is on the side of the housing and lets the operator carry out a function test. Touching a magnet to the test point changes the output.

Electric connection
A standard DIN 43650 plug/socket gives a quick connection. Polarity insensitivity and short-circuit protection make wiring simple and safe.

Fork design
The "Dripple" design (liquid drips from the end of the fork) speeds up detection; sensitivity is higher, especially for high-viscosity liquids.

Application examples

Overflow prevention
During filling, overflowing material can harm people and the environment and causes production loss and cleaning cost, so the instrument serves as a critical limit switch that can signal overflow at any time.

Pump protection
The short fork has a very short wetted insertion length and can be fitted to a pipeline or container at any angle, which reduces installation cost.
Because the fork extends only 2 inches (50 mm), the instrument can be installed in small-bore pipes. Combined with the direct load switch, it is an ideal choice for reliable pump control and for protecting the pump from running dry.

High and low level alarm
The hard, durable housing allows continuous operation at temperatures up to 302 °F (150 °C) and pressures up to 1450 psig (100 barg), which makes the product well suited to detecting the high and low levels of various storage tanks. An independent level alarm switch is usually fitted as well, so that a backup is available if a failure occurs.

Leakage detection
Flanges, gaskets and seal rings may leak when exposed to harsh conditions such as corrosive liquids. Most storage tanks and containers are installed on a base plate or inside a protective enclosure to contain leaks. The instrument can be used to find a leak quickly and accurately, which reduces the cost of detection.

Pump control
Many processes use batch tanks and elevated storage tanks, so pump control is important for keeping the liquid level within a specified range. These tanks are generally made of thin material and cannot carry heavy instruments.

Sanitary application
The surface finish (Ra) is better than 0.8 μm, so the instrument meets the requirements of food and pharmaceutical manufacturing.
In addition, the instrument is made of stainless steel and is durable, so it withstands steam cleaning (CIP) at up to 302 °F (150 °C).

Application and installation

Installation precautions
◆ Before installation, check that the liquid temperature and pressure are within the specified range (see Technical specification).
◆ Check that the liquid viscosity is within the recommended range: 0.2 to 10000 cP.
   ◆ High-viscosity products include chocolate syrup, tomato ketchup, peanut butter and bitumen. The instrument can still detect these products, but the drain time is very long.
◆ Check that the liquid density is greater than 37.5 lb/ft³ (600 kg/m³).
   ◆ Low-density liquids include acetone, pentane and hexane.
◆ Check whether the fork is at risk of material build-up.
   ◆ Avoid conditions in which dried or coating product accumulates.
   ◆ Make sure the forks cannot be bridged by dense product.
   ◆ Products that can bridge the forks include thick paper pulp and bitumen.
◆ Check whether the liquid contains solids.
   ◆ Clumping of solids causes serious problems.
   ◆ The maximum solid particle size is 0.2" (5 mm).
   ◆ For liquids with solid particles larger than 0.2" (5 mm), special consideration is needed; contact the manufacturer for a solution.
◆ Foam
   ◆ The instrument is generally insensitive to foam, so foam does not affect the result.
   ◆ In some cases, however, dense foam is used as a carrier, for example in ice cream manufacturing, and its influence must be taken into account.

Recommended installation
◆ Before installation, set the switch to the "ON" position.
◆ For high level detection, select "Dry".
◆ For low level detection, select "Wet".
◆ It is strongly recommended to use the magnetic test point to check the system throughout commissioning.
◆ Leave sufficient space around the instrument for installation and electrical connection (see the dimensional drawing).
◆ Do not install the instrument close to the liquid inlet.
◆ Keep the fork away from heavy splashing.
◆ Make sure the fork does not touch the tank wall, internal fittings or any other obstruction.
◆ Keep the fork a sufficient distance from build-up on the tank wall.
FIG1: Correct installation drawing

Technical specification

Physical characteristics
◆ Product: compact level switch
◆ Measuring principle: tuning fork vibration
◆ Application: suitable for most liquids, including coating liquids, aerated liquids and slurries

Mechanical features
◆ Material of main parts: 316L stainless steel (1.4404). For the Tri-Clamp connection, the surface finish is better than 0.8 μm. The gasket for the 1" BSPP (G1) connection is non-asbestos BS7531 grade X carbon fibre with rubber binder.
◆ Housing material:
   Body: 304 stainless steel with polyester label
   LED display window: flame-retardant polyamide (PA12), UL94 V2
   Plug: Daiamid GRP
   Seal ring: nitrile rubber
◆ Mounting connection: 3/4" BSPT (R) or NPT thread; 1" BSPT (R) or BSPP (G) thread; or 2" (51 mm) sanitary Tri-Clamp
◆ Dimensional drawing: see "Dimensional Drawing"
◆ Ingress protection: IP66/67 to EN 60529

Performance
◆ Hysteresis (water): ±0.039" (±1 mm), nominal
◆ Switching point (water): 0.5" (13 mm) from the fork tip (vertical mounting) or from the fork edge (horizontal mounting); this value varies with liquid density.

Functional features
◆ Max. operating pressure: depends on the tank connection
   ◆ Threaded connection: see FIG2
   ◆ Sanitary connection: 435 psig (30 barg)
◆ Temperature: see FIG3
◆ Liquid density: min. 37.5 lb/ft³ (600 kg/m³)
◆ Liquid viscosity range: 0.2 to 10000 cP (centipoise)
◆ Solids content and coating: the recommended maximum particle size in the liquid is 0.2" (5 mm). For coating products, avoid bridging of the forks.
◆ Switching delay: 0.3 to 30 s
◆ CIP (clean-in-place): withstands steam cleaning at 302 °F (150 °C)

Functional characteristics
◆ Switch mode: set with a toggle switch; the user can select either mode
◆ Cable connection: four-way plug to DIN 43650, max. wire size 15 AWG; the plug can be oriented in four directions (90/180/270/360°)
◆ Wire size: max. 1.5 mm² (15 AWG)
◆ Cable gland: PG9, for cable diameters of 0.24 to 0.31" (6 to 8 mm)

Other components
◆ Switch housing: durable stainless steel with a Daiamid LED display window, a four-way DIN 43650 plug that can be oriented in four directions, and a cable gland.
◆ Electronics: with a standard two-core cable (15-28 V AC supply), the instrument is wired to the load and acts as a direct load switch, that is, a single-pole single-throw (SPST) switch. Alternatively, the 24 V DC solid-state PNP output can interface directly to a PLC. A high-capacity relay provides output currents of up to 5 A.
◆ Tank connection and fork: the wetted parts are made of 316 stainless steel. The fork is available in a short or a semi-extended length; see the dimensional drawing for details.
◆ Threaded connection: 3/4" NPT or BSPT (R); 1" BSPT (R) or BSPP (G). Material: 316L stainless steel
◆ Sanitary connection: 2" (51 mm) Tri-Clamp; 1" BSPP (G) with O-ring. Material: 316L stainless steel
Accessories: where hygiene requirements are high, an adapter can be used to match the 1" BSPP or 2" (51 mm) Tri-Clamp connection. The wetted side of the Tri-Clamp connection is hand polished to better than 0.8 μm, which meets strict hygiene standards.
◆ For the dimensional drawing of the product, see the following page.

Dimensional Drawing

Electric connection diagram

Standard Fork Level Switch

Brief introduction to product
The instrument takes its measurement from a fork that is set vibrating by a piezoelectric crystal.
When the fork is damped by the medium, the vibration amplitude drops sharply and the frequency and phase change markedly. The internal electronics detect these changes and convert them into a switching signal. The instrument can therefore be used for high and low level detection, control and alarm on storage tanks, and suits a wide range of liquids, powders and granular solids. Thanks to its high reliability and stable performance it is almost maintenance-free and adapts to many environments. An LED indicates the working state. Three supply options are available (24 V DC, 110 V AC and 220 V AC), along with several output options (DC current output, relay contact output and DC voltage output). All versions provide a high/low level alarm mode selection and adjustable sensitivity.

Product features
◆ Virtually unaffected by flow, turbulence, bubbles, foam, vibration, solids content, coating and liquid properties
◆ Few installation steps; no position marking is required
◆ Polarity-insensitive, with short-circuit protection
◆ Standard plug/socket connection available
◆ No moving parts or crevices, so the switch is maintenance-free
◆ LED indicator lamp that can be adjusted as required
◆ Quicker response to viscous liquids thanks to the "Dripple" fork design
◆ Sanitary fittings

Measuring principle
The level switch works on the tuning fork principle: a piezoelectric crystal drives the fork at its natural frequency, and the frequency is monitored continuously. When the switch is used as a low-level alarm, the liquid drains away from the fork as the level falls, which changes the natural frequency; the electronics detect this change and switch the output state.
When the switch is used as a high-level alarm, the rising liquid comes into contact with the fork, which again changes the frequency and switches the output.

Technical parameters
※ Medium temperature: -20 to 80 ℃
※ Ambient temperature: -20 to 60 ℃
※ Ambient humidity: ≤95%
※ Medium type: liquid, powder or granular solid
※ Medium density: solid ≥0.1 g/cm³; liquid ≥0.7 g/cm³
※ Solid particle size: ≤10 mm
※ Max. liquid viscosity: <1000 mm²/s
※ Angle of repose of the medium: ≥20°
※ Pressure range: ≤1 MPa
※ Housing material: die-cast aluminium
※ Fork material: 1Cr18Ni9Ti
※ Enclosure protection rating: IP65
※ Connection: G1 1/2 thread; flange (optional)
※ Electrical parameters
   1. Power supply: 24 V DC, 220 V AC 50 Hz
   2. Output signal: relay output, 5 A at 220 V AC or 3 A at 24 V DC
※ Power consumption: ≤2 W
※ Fork vibration frequency: 300 ± 50 Hz
※ Ambient vibration grade: V.L.4, acceleration below 1 g
※ Switching signal response time: 1 to 60 s

Electric connection
The instrument is wired as follows: the terminals marked POWER connect to the power supply (220 V AC or 24 V DC); terminals 3, 4 and 5 are the relay output terminals.
Red light: indicates the relay output state
Green light: indicates the fork state
Debugging: when LIMIT SELECTED is in the L position, the red light is on; if the fork is vibrating, the green light is on as well. Terminals 3 and 4 should then be closed, and terminals 3 and 5 open.
When you touch the fork with your hand, the fork stops vibrating, the green light goes off and the relay contacts change over: terminals 3 and 4 open, and terminals 3 and 5 close. Finally, set the action delay within the range 1 to 60 s.
If LIMIT SELECTED is in the H position, the red light is off. While the fork vibrates, the green light is on, terminals 3 and 4 are open and terminals 3 and 5 are closed. When you touch the fork, it stops vibrating, the green light goes off and the relay contacts change over: terminals 3 and 4 close, and terminals 3 and 5 open. Finally, set the delay time within the range 1 to 60 s.
In most cases, leave LIMIT SELECTED in the L position and choose the relay output terminals to obtain either an upper-limit or a lower-limit alarm.
Sensitivity regulation: adjust the sensitivity according to the density of the medium; the lower the density, the higher the sensitivity required. After adjusting, press RST to confirm. The H setting is generally suitable.
After selecting the indicator mode, wire the instrument as described, check the wiring and switch on the power. Then touch the fork by hand and check that the indicator changes state; when you remove your hand, the indicator should return to its previous state immediately. Repeat this several times to confirm normal operation before putting the instrument into service.

Installation method
1. The instrument should normally be installed vertically with the fork pointing downward. Horizontal or inclined installation is also acceptable (for sticky materials, vertical downward installation is recommended).
2. Installation with the fork pointing upward is forbidden.
3. For material containing lumps or hard particles, install the fork vertically or at an angle.
4. Before installation, test the sensitivity of the instrument on a small sample, for example by immersing the instrument in a container of the medium to check its reliability.
5. For the actual installation, several mounting arrangements are possible, including top mounting, side-wall mounting and pipe mounting, as shown in the following FIG:

Points for attention
1. Make sure material agglomeration cannot restrict the vibration of the fork.
2. Where deposits build up on the wall, leave enough space between the fork and the tank wall.
3. For liquid level monitoring, mount the instrument so that the switching point is at the actual level to be detected.
4. Low-viscosity liquids flow through the fork easily, so the instrument can be installed in any of the positions shown in the FIG above.
5. High-viscosity media do not flow through the fork easily, so install the instrument vertically with the fork pointing downward.
6. For solids level monitoring in vertical silos or similar vessels, the mounting position depends not only on the level to be detected but also on the angle of repose and the filling position. For horizontal mounting, the fork tip should be 1/3 of the vessel radius in from the inner wall, and the two fork tines should lie in the same plane. For vertical mounting, the distance between the mounting centre and the inner wall should be 1/3 of the vessel radius.
In addition, mount the instrument away from direct impact and splashing of the medium wherever possible, to avoid false switching or wear. If impact or splashing cannot be avoided, fit a protective canopy above the instrument; the canopy should be longer than the instrument itself.

WARNING!
During installation and operation, do not hold the instrument by the fork or strike the fork. Doing so can deform the fork and even damage the internal electronics.

Dimensional Drawing
Threaded installation: standard type, extension type
Flange installation: standard type, extension type

Accept() scalability on Linux

Macro-Benchmark

The results of the macro-benchmark are very encouraging. While running with a stable load of anywhere between 100 and 1500 simultaneous connections to the web server, the number of requests serviced per second increased dramatically with both the "wake one" and "task exclusive" patches. While the performance impact is not as powerful as that evidenced in the micro-benchmark, a considerable gain is evident in the testing. Whether the number of simultaneous connections is at a low level, or reaching the upper bounds of the test, the performance increase due to either patch remains steady at just over 50%. There is no discernible difference between the two patches.

Conclusion

By thoroughly studying this "thundering herd" problem, we have shown that it is indeed a bottleneck in high-load server performance, and that either patch significantly improves the performance of a high-load server. Even though both patches performed well in the testing, the "wake one" patch is cleaner and easier to incorporate into new or existing code. It also has the advantage of not committing a task to "exclusive" status before it is awakened, so extra code doesn't have to be incorporated for special cases to completely empty the wait-queue. The "wake one" patch can also solve any "thundering herd" problems locally, while the "task exclusive" method may require changes in multiple places where the programmer is responsible for making sure that all adjustments are made. This makes the "wake one" solution easily extensible to all parts of the

For info, email info@ or call +1 (734) 763-2929. Copyright © 1999 University of Michigan Regents. All rights reserved.
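The difference between waking every sleeper and waking only one can be illustrated with a toy model. The sketch below is not the kernel patch from the study: it is a simplified Python simulation, and the function name `simulate_accept_wakeups` and its wait-queue model are assumptions made purely for illustration. It counts how many tasks are woken per incoming connection when every task sleeping in accept() is woken, compared with waking only the first one.

```python
from collections import deque


def simulate_accept_wakeups(num_waiters, num_connections, wake_one):
    """Toy model of tasks sleeping in accept() on one listening socket.

    Returns (total_wakeups, wasted_wakeups). A wakeup is "wasted" when the
    woken task finds no connection left and has to go back to sleep.
    Assumption: the winner handles its connection instantly and rejoins
    the back of the wait queue before the next connection arrives.
    """
    if num_waiters < 1:
        raise ValueError("need at least one waiting task")

    wait_queue = deque(range(num_waiters))
    total_wakeups = 0
    wasted_wakeups = 0

    for _ in range(num_connections):
        if wake_one:
            # "wake one": only the task at the head of the queue runs
            woken = [wait_queue.popleft()]
        else:
            # thundering herd: every sleeper is made runnable
            woken = list(wait_queue)
            wait_queue.clear()

        total_wakeups += len(woken)
        winner, losers = woken[0], woken[1:]
        wasted_wakeups += len(losers)

        # Losers go back to sleep; the winner returns after serving.
        wait_queue.extend(losers)
        wait_queue.append(winner)

    return total_wakeups, wasted_wakeups


if __name__ == "__main__":
    for label, flag in (("wake all", False), ("wake one", True)):
        total, wasted = simulate_accept_wakeups(100, 10, wake_one=flag)
        print(f"{label}: {total} wakeups, {wasted} wasted")
```

The model ignores scheduling cost and lock contention, so it only shows how the number of wakeups scales with the number of waiters; it says nothing about the throughput gain measured in the benchmark.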

11g RAC: Installation and Configuration

CVU
© 2007 Oracle Corporation
cluvfy Stage List
• Valid stage options and stage names are:
$ ./cluvfy stage -list
    -post hwos     : post-check for hardware & operating system
    -pre  cfs      : pre-check for CFS setup
    -post cfs      : post-check for CFS setup
    -pre  crsinst  : pre-check for Clusterware installation
    -post crsinst  : post-check for Clusterware installation
    -pre  dbinst   : pre-check for database installation
    -pre  dbcfg    : pre-check for database configuration
• Use a private, dedicated, non-routable switch or VLAN
• Crossover cables are not supported
• Eliminate any transmission problems
• Packet errors/drops can manifest as more serious outages
• Can install ASM and create a database automatically
Cluster Verification Utility (CVU, cluvfy)

Problems encountered when deploying an application to WebLogic on Oracle Linux


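property when WAR file is not expanded

Problem analysis: this happens because the application was deployed as a WAR package. Unlike Tomcat, which unpacks the WAR before starting the application, WebLogic starts the application directly from the WAR.

Many of our JSP and Servlet files use calls such as this.servletContext.getRealPath("/"). Files inside a WAR have no real file-system path, so getRealPath("/") returns unexpected values, for example null.

Solution: this idiom is used in many places to obtain the web server path, so replacing each occurrence is clearly not a good approach, and deploying directly from a WAR also makes later application updates more cumbersome. We therefore decided to use a different deployment method: directory deployment.

3. Directory deployment

Directory deployment means installing through the WebLogic Administration Server by pointing directly at a local application folder. Any folder that contains WEB-INF\web.xml can be selected for installation.

The next step is therefore to create the installation directory for the application.

Separately from the WebLogic domain directory, we created the following directories under the root path.

/deploy/applications/app
/deploy/applications/plan

app : holds the application. Once the folder has been created, copy the whole application folder (for example wzfy) into app.

plan : after the WebLogic Administration Server has installed the application under app, it automatically creates the application's deployment plan file in this folder.

In the Administration Server, browse to /deploy/applications/app, select wzfy, and start the installation.

The third problem: "Unable to access the selected application."

Exception in AppMerge flows' progression
Exception in AppMerge flows' progression
[J2EE:160111]ERROR: Appc can not write to the working directory,'/deploy/applications/app/wzfy'. Please ensure that you have write permission for this directory and try again.

Read literally, the message says that /deploy/applications/app/wzfy is not writable for the operating-system user that performs the deployment.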
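A quick way to confirm this diagnosis is to test the directory permissions as the same operating-system user that runs WebLogic. The following is a minimal sketch using only the Python standard library; the helper name `check_deploy_dir` is hypothetical, and the script has to be run under the WebLogic user account for the result to be meaningful.

```python
import os
import tempfile


def check_deploy_dir(path):
    """Return a list of problems that would stop the current user from
    using `path` as a working directory (an empty list means none found)."""
    if not os.path.isdir(path):
        return ["not a directory or does not exist"]

    problems = []
    checks = (
        (os.R_OK, "readable"),
        (os.W_OK, "writable"),
        (os.X_OK, "searchable"),
    )
    for mode, name in checks:
        # os.access() tests against the real uid/gid of this process
        if not os.access(path, mode):
            problems.append(f"not {name} by uid {os.getuid()}")
    return problems


if __name__ == "__main__":
    # Demonstration on a throwaway directory; in practice pass the
    # deployment directory, e.g. /deploy/applications/app/wzfy.
    with tempfile.TemporaryDirectory() as demo_dir:
        print(demo_dir, check_deploy_dir(demo_dir))
```

If the check reports the directory as not writable, changing its owner or mode so that the WebLogic user can write to it is the usual remedy; which of the two is appropriate depends on the site's own conventions.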

PetaLinux (2): Enabling PetaLinux kernel debug mode

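Description

To debug a Linux kernel module with Xilinx SDK, KERNEL_DEBUG_INFO and KERNEL_DEBUGGING must be enabled.

This post documents how this is handled in PetaLinux.

Solution

Getting kernel debugging to work with PetaLinux requires some specific configuration settings.

For the complete configuration steps, see the help topic: SDK Help > Xilinx Software Development Kit (SDK) User Guide > Working with Xilinx System Debugger > System Debugger Supported Design Flows > Attach and Debug using Xilinx System Debugger.

The steps involved in configuring Linux kernel debugging with PetaLinux are:

1) Create a Zynq template project in Vivado and export the hardware to SDK.

2) Create a Linux application in SDK from the Hello World example, close the SDK project, and continue with the PetaLinux project in the next step.

3) Create a PetaLinux project with the following command:
petalinux-create --type project --template --name

4) Go to the PetaLinux project directory and run the following command:
petalinux-config --get-hw-description=
Set hw-description to the directory that contains the HDF file (the project_name.sdk directory of the Vivado project you created earlier).

5) As shown in Figure 2, go to Linux Components Selection ---> and then to the kernel (xlnx-4.0) item, and select the remote option.

6) Next, specify the full path to the kernel source code, which can be found on the Xilinx GitHub page.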

Performance analysis and tools for NeoKylin Linux

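sar
• -d for block device
• -n for network
• -P for cpu
• -q for queue length and load averages
• -r for memory
• -S for swap
• -u for cpu

netstat and ss
• Network information
• netstat -p
• ss -o state established

Drill-Down Analysis Method
• Ext4 latency analysis
  – Dynamic tracing / DTrace is well suited here, because it can drill into custom-defined details at every layer

USE Method
• For every resource, check:
  – 1. Utilization: how busy the resource is
  – 2. Saturation: queue length
  – 3. Errors: error count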
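As a worked illustration of the first two USE checks for one resource, the CPU, the sketch below derives utilization from two samples of the aggregate `cpu` line in /proc/stat and expresses saturation as load average per CPU. It is a minimal Python sketch under stated assumptions: the function names are hypothetical, the /proc/stat line is assumed to carry at least the user, nice, system, idle and iowait columns, and the error check of the USE method is not covered.

```python
import os
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into tick counters.

    Only the first eight columns are used (user, nice, system, idle,
    iowait, irq, softirq, steal); guest time is already included in user.
    """
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    values = [int(value) for value in fields[1:9]]
    if len(values) < 5:
        raise ValueError("too few columns in the 'cpu' line")
    return values


def cpu_utilization(prev, curr):
    """Fraction of time the CPUs were busy between two samples (0.0-1.0)."""
    deltas = [c - p for p, c in zip(prev, curr)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    idle = deltas[3] + deltas[4]  # idle + iowait ticks
    return (total - idle) / total


def cpu_saturation(load_average, cpu_count):
    """Runnable tasks per CPU; values above 1.0 suggest queueing."""
    return load_average / cpu_count


def read_cpu_line(path="/proc/stat"):
    with open(path) as stat_file:
        return stat_file.readline()


if __name__ == "__main__":
    first = parse_cpu_line(read_cpu_line())
    time.sleep(1)
    second = parse_cpu_line(read_cpu_line())
    print(f"utilization: {cpu_utilization(first, second):.1%}")
    print(f"saturation:  {cpu_saturation(os.getloadavg()[0], os.cpu_count()):.2f}")
```

Counting iowait as idle time is a design choice: it keeps the figure comparable with the "busy" percentage most tools report, but a workload that is blocked on disk will then look idle from the CPU's point of view and has to be caught by checking the disk resource separately.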
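• Resource-oriented approach
  – Capture and analyze all network traffic visible when "sniffing" the system

• sar -n { keyword [,...] | ALL }
• vnstat
• traceroute
• iptstate
• darkstat

Networking, local view
• ip tool
• Use netstat -ntaupe to list:
  – Active network services
  – Established connections

free
• Memory usage statistics

ping, hping
• Measure network latency
• ping -c100 -q node.ip

nicstat
• Network statistics tool
• Shows NIC utilization and throughput

dstat
• Combines vmstat, iostat and ifstat; friendly interface; output can be saved

Basic tools

Intermediate tools
• sar
• netstat and ss
• pidstat
• strace
• tcpdump
• blktrace
• iotop
• slabtop
• sysctl
• /proc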

Debugging core files on Linux

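When a program exits abnormally, the kernel writes a core file to the current working directory (a memory image of the process, together with debugging information).

Opening the core file with gdb shows the source file and line number where the program failed.

1. Enabling core file generation and limiting the file size

(1) Use the ulimit -c command to check whether core file generation is enabled.

If the result is 0, the feature is disabled and no core file is generated.

A change made with the command above generally affects only the current session. After you log in again you have to re-enter the command, which is tedious.

We can make the system enable core files automatically by editing /etc/profile.

The steps are as follows:

1. Open /etc/profile. The file usually contains the statement: ulimit -S -c 0 > /dev/null 2>&1. Following the example above, simply change the 0 to unlimited.

Then save and exit.

2. Run source /etc/profile to make the setting take effect in the current session.

3. Run ulimit -c to check that core files are now enabled.

This is not the only command that can be added to /etc/profile: anything else that should take effect at every login can go there too, because Linux loads this file at login.

Environment variable settings are one example.

Another method is to edit /etc/security/limits.conf. Log in as root, open /etc/security/limits.conf, and configure it:

# vim /etc/security/limits.conf
<domain> <type> <item> <value>
* soft core unlimited

(2) The ulimit -c filesize command limits the size of the core file (filesize is in kbytes).

ulimit -c unlimited means the size of the core file is not limited.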
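The same limit can also be inspected and raised from inside a program. The following is a minimal sketch using Python's standard `resource` module, which wraps getrlimit/setrlimit; the helper names are hypothetical. Note that the values are in bytes, and that an unprivileged process can raise its soft limit only as far as the hard limit.

```python
import resource


def describe_core_limit(limit):
    """Turn an RLIMIT_CORE value (in bytes) into a readable string."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    if limit == 0:
        return "disabled"
    return f"{limit} bytes"


def raise_soft_core_limit():
    """Raise the soft core-file limit to the hard limit.

    Affects this process and any children it starts afterwards,
    much like running `ulimit -c` in a shell before launching a program.
    Returns the new (soft, hard) pair.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    print("before: soft =", describe_core_limit(soft),
          "hard =", describe_core_limit(hard))
    soft, hard = raise_soft_core_limit()
    print("after:  soft =", describe_core_limit(soft),
          "hard =", describe_core_limit(hard))
```

If the hard limit itself is 0, the soft limit cannot be raised this way and the limit has to be changed by a privileged user, for example through /etc/security/limits.conf as described above.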

CUDA programming and debugging guide

CUDA Programming on the Tegra Xavier - And Debugging
By Kristoffer Robin Stokke

Keep This Under Your Pillow
• Volta Tuning Guide
  • https:///cuda/archive/10.2/volta-tuning-guide/index.html
• CUDA Programmer's Guide
  • https:///cuda/archive/10.2/cuda-c-programming-guide/index.html
• CUDA for Tegra
  • https:///cuda/archive/10.2/cuda-for-tegra-appnote/index.html
• Tegra X1 Whitepaper
  • https:///blog/nvidia-jetson-agx-xavier-32-teraops-ai-robotics/
• Last but not least
  • CUDA-GDB
    • You will need it
  • NVPROF, if you care about performance
    • https:///cuda/archive/10.2/profiler-users-guide/index.html

Tegra Xavier vs. Tegra X1

                       Tegra Xavier                           Tegra X1
High Performance CPU   8 x Carmel                             4 x Cortex-A57
                       2 MB L2 + 128 kB $I, 64 kB $D          2 MB L2 + 48 kB $I, 32 kB $D
Low Power CPU          None                                   4 x Cortex-A53
                                                              512 kB L2 + 32 kB $I/$D
Architecture           Volta                                  Maxwell
Cores                  512 cores (8 Volta SM)                 256 cores (2 SMM, 128 cores / SMM)
Memory bitwidth        256-bit                                64-bit
L2 cache               512 kB                                 256 kB
L1 cache               128 kB (shared memory + L1 RO cache)   64 kB (unique shared memory)
                                                              + 64 kB read-only cache

GPU Compute Capability

Compute Capability   Generation
1.x                  Tesla
2.x                  Fermi
3.x                  Kepler (Tegra K1)
4.x
5.x                  Maxwell (Tegra X1)
6.x                  Pascal (Tegra X2)
7.x (7.2)            Volta (Tegra Xavier)   <- You are here
7.5                  Turing
8.x                  Ampere

• «The functional development of GPUs»
• For Maxwell:
  • Half-precision (16-bit) floating point
  • Dynamic parallelism
    • Kernels can launch kernels
• Newest CC
  • Tensor Core (neural network support)
• CUDA Toolkits
  • Compiler, doc, examples etc.
  • nvcc --version
  • Already installed for you
• If interested:
  • https:///wiki/CUDA

Parallelism in GPUs
• Massive.
• Volta SM: Volta Streaming Multiprocessor
  • Four Warp Schedulers (WS)
  • 4 x 16 CUDA cores
  • 4 x 8 LD/ST units, ~16k 32-bit registers
  • 4 x 4 Special Function Units (SFUs)
  • 8 Tensor Cores
  • 128 kB shared memory*
• At every clock cycle..
  • Each WS selects an eligible warp (a group of 32 threads)..
  • .. and dispatches two instructions
• All threads should follow, more or less, the same execution path

Volta GPU Memories
(listed from slowest to fastest)
• 256-bit RAM interface
• 512 kB, GPU-global L2 cache
  • Shared between all Volta SMs
• 128 kB, read-only L1 cache / shared memory
  • Shared texture / local L1 cache
  • Local to the SM
• SM-local L1 cache
  • Directly addressable through shared memory
  • Local to the SM
• Registers
• A flexible, multi-layered cache hierarchy
  • Improves memory bandwidth
  • WS selects ready (non-stalling) warps
  • Highly programmable

GPU Memories (Continued)
• The CUDA toolkit documentation introduces the following memory spaces and naming conventions..
• Global memory loads: loads from RAM, possibly through caches
• «Local memory»: register spills, code, and other
  • Resides in RAM or «somewhere» in the cache hierarchy, hopefully in the right place
• «Shared memory»: RW L1 cache shared in a thread block
• «L1 RO cache»: caches global, read-only memory loads

CUDA Programmer's Perspective
• Schedule blocks of threads (execution configuration syntax)
• WS schedules eligible 32-thread groups of blocks
From CUDA Programmer's guide.

    #include <cstdint>

    __global__ void memcpy(uint32_t *src, uint32_t *dst) { ... }

    int main(void)
    {
        dim3 block_dim(1024, 1, 1);
        dim3 grid_dim(1024, 1, 1);
        uint32_t *src, *dst;   // must point to device memory (allocation not shown on the slide)

        // <<<grid, block, shared memory per block, CUDA stream (default 0)>>>
        memcpy<<<grid_dim, block_dim, 0, 0>>>(src, dst);
        ...
    }

CUDA Programmer's Perspective (Cont.)

    __global__ void memcpy(uint32_t *src, uint32_t *dst)
    {
        // Global index of this thread across the whole 1-D grid
        int idx = threadIdx.x + (blockIdx.x * blockDim.x);
        dst[idx] = src[idx];
    }

• Special purpose registers
  • threadIdx.[x/y/z] -> coordinates of the thread within its block
  • blockIdx.[x/y/z] -> coordinates of the block within the grid
  • blockDim.[x/y/z] -> block dimension sizes
• In the example:
  • blockDim.x = 1024, blockIdx.x ∈ [0, 1023]
  • Index into contiguous memory

Synchronisation
• ECS: kernel_symbol_name<<< gridDim, blkDim, shared, stream >>>( __VA_ARGS__ )
• Kernel launches are always asynchronous
  • The executing thread returns immediately
• «Worst» sync: cudaDeviceSynchronize()
  • Blocks until all pending GPU activity is done
  • However, good for debugging / testing purposes
• Streams
  • Streams are created with cudaStreamCreate() -> + flags!
  • Run kernel launches and asynchronous memory copies in streams
  • Sync on streams with cudaStreamSynchronize(stream)

Other API Specific Details
• Two APIs
  • Driver API
  • Runtime API <- use this (https:///cuda/archive/10.2/cuda-runtime-api/index.html)
• Other modules you should have a look at
  • Device management
  • Error handling
  • Memory management, unified addressing
• CUDA samples: deviceQuery
• CUDA compiler: nvcc
  • Source files with CUDA code (*.cu) are compiled as .cpp files
  • nvcc extracts CUDA code, passes the rest to the native C++ compiler

When Things Aren't Going Your Way
• cuda-gdb
  • Just like gdb
  • Main advantage: captures error conditions for you
  • But this doesn't mean you can get lazy
• Always check error codes and break on anything != cudaSuccess
  • Make a macro

GPU Performance Analysis
• CUPTI: GPU Hardware Performance Counters (HPCs)
• Usage: nvprof -e <event counters> -m <metrics> <binary> <arguments>
  • Summary modes, counter collection modes....
  • Tells you about resource usage: time, memory, floating point performance, elapsed cycles
  • Takes time to profile: be patient, or use ./c63enc -f 5 <- make sure to trigger ME & MC
• Check HPC availability with nvprof --query-events --query-metrics
  • Notice there are well above 100 HPCs to choose from..
  • ...which ones matter?
  • I will tell you!

GPU Performance Analysis (Continued)
• Memory usage
  • l1_global_load_hit, l1_local_{store/load}_hit, l1_shared_{store/load}_transactions, shared_efficiency
• Instructions
  • inst_integer, inst_bit_convert, inst_control, inst_misc, inst_fp_{16/32/64}
• Causes of stalling
  • Memory, instruction dependencies, sync...
• Other
  • elapsed_cycles_sm
• These are for the TK1, but should be at least similar for the TX1
• Don't get confused by HPCs such as {gld/gst}_throughput

Code Examples
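The index arithmetic used in the memcpy kernel can be checked on the host without a GPU. The sketch below is a plain Python model, not CUDA code: the function names `global_indices` and `blocks_needed` are hypothetical, and the model only covers one-dimensional grids and blocks. It enumerates the index each (block, thread) pair would compute and shows how many blocks are needed to cover a given number of elements.

```python
def global_indices(grid_dim_x, block_dim_x):
    """Indices computed by threadIdx.x + blockIdx.x * blockDim.x
    for every thread of a 1-D grid, in block-major order."""
    return [
        thread_idx + block_idx * block_dim_x
        for block_idx in range(grid_dim_x)
        for thread_idx in range(block_dim_x)
    ]


def blocks_needed(num_elements, block_dim_x):
    """Smallest 1-D grid size whose threads cover num_elements items."""
    return (num_elements + block_dim_x - 1) // block_dim_x


if __name__ == "__main__":
    # A 4 x 8 launch touches each of the 32 elements exactly once.
    print(global_indices(4, 8))
    # 1000 elements with 256-thread blocks need 4 blocks (24 spare threads).
    print(blocks_needed(1000, 256))
```

When the element count is not a multiple of the block size, the last block contains threads whose index is past the end of the buffer, so a real kernel would need a bounds check before reading or writing; the slide's example avoids this by using exactly 1024 x 1024 threads.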
