PCB Simulation and Design for DDR3 Memory

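I. PCB Simulation of DDR3 Memory
PCB (printed circuit board) design is a critical step in building a DDR3 memory subsystem.

Simulating the design before fabrication verifies it, exposes potential problems early, and leaves room for optimization.

1. Power noise simulation
2. Signal integrity simulation
3. Clock distribution simulation
4. Layout rule simulation
II. PCB Design of DDR3 Memory
A DDR3 memory PCB design has to address the following aspects.

1. Placement
2. Layer stackup
3. Clock optimization
4. Signal integrity optimization
5. Ground plane design
A good ground plane provides a stable ground connection and reduces noise coupling.

Plan the extent of the ground plane and the way it is connected, and coordinate both with the assignment of signal layers in the stackup.

In summary, PCB simulation and design are key to the performance and stability of a DDR3 memory interface.

Simulation and careful design expose potential problems so that they can be fixed, which improves both performance and reliability.

The design should weigh placement, layer stackup, clock optimization, signal integrity optimization, and ground plane design carefully to ensure that the DDR3 memory operates correctly.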

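PCB Design and Simulation Based on DDR3

Niu Bin, Shi Lili (Hebei Branch, China Communication Systems Co., Ltd., Shijiazhuang, Hebei)

[Abstract] With the rapid development of computer technology, high-speed memory has become widely used. DDR3 is currently the mainstream memory on the market, and its data rate keeps rising. This places higher demands on PCB design and increases the number of parameters the design must take into account. Memory simulation techniques have emerged in response: by analyzing the key factors that affect signal quality, they enable simulation-based signal integrity analysis and thereby save PCB design time and development cost.

[Keywords] DDR3; simulation; PCB; high-speed transmission

1 Introduction
2 Simulation analysis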

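DDR3 Circuit Design

DDR3 is a popular memory standard that is widely used in computer memory subsystems.

The key to DDR3 circuit design is making sure data is transferred both accurately and at full speed.

This article discusses the main aspects of DDR3 circuit design.

First, a DDR3 circuit design must account for timing.

DDR3 transfers data on both edges of the clock (double data rate) and relies on complex internal circuitry to meet its timing requirements.

The designer therefore needs a good grasp of the clock source and of timing analysis tools to make sure the circuit runs with correct timing.

Pay particular attention to matching timing across the interface, especially input and output timing and the speed grades of the two sides.

Second, power supply stability is an important aspect of DDR3 circuit design.

DDR3 operates at 1.5 V, and its internal timing is fairly strict.

An unstable supply disturbs the internal timing and makes data transfer unreliable.

The design should therefore use a high-quality supply and a regulator circuit that keep the system supply stable.

Third, a DDR3 circuit design must account for impedance matching.

DDR3 signal traces have tight impedance requirements, which are what keep the data signals intact in transit.

The designer must therefore ensure impedance matching, both of the traces themselves and of external components.

Suitable PCB materials and routing practices make better impedance matching achievable.

Finally, a DDR3 circuit design must account for EMI (electromagnetic interference) and ESD (electrostatic discharge).

High-speed data transfer and strict internal timing make DDR3 sensitive to both.

The design should consider EMI and ESD behavior and apply appropriate filtering and suppression, for example well-planned ground and power planes and EMI filters.

In summary, DDR3 circuit design is complex work that requires knowledge of timing, power, impedance, EMI, and ESD.

High-quality components, PCB materials, and routing practices give the DDR3 circuit the best chance of running accurately at full speed.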

Signal Integrity and PCB Design for DDR2 and DDR3

Signal Integrity and PCB layout considerations for DDR2-800 Mb/s and DDR3 Memories

Fidus Systems Inc., 900 Morrison Drive, Ottawa, Ontario, K2H 8K7, Canada
Chris Brennan, Cristian Tudor, Eric Schroeter, Heike Wunschmann, and Syed Bokhari
Session # 8.13

Abstract
The paper addresses the challenge of meeting Signal Integrity (SI) and Power Integrity (PI) requirements of Printed Circuit Boards (PCBs) containing Double Data Rate 2 (DDR2) memories. The emphasis is on low layer count PCBs, typically 4-6 layers using conventional technology. Some design guidelines have been provided.

1. Introduction
DDR2 usage is common today with a push towards higher speeds such as 800 Mbps [1] and more recently, 1066 Mbps. DDR3 [2] targets a data rate of 1600 Mbps. From a PCB implementation standpoint, a primary requirement is delay matching which is dictated by the timing requirement. This brings into it a number of related factors that affect waveform integrity and delay. These factors are interdependent, but where a distinction can be made, they can be termed PCB layer stackup and impedance, interconnect topologies, delay matching, cross talk, PI and timing. Cadence ALLEGRO™ SI-230 and Ansoft’s HFSS™ are used in all computations.

Table 1: Comparison of DDR2 and DDR3 requirements

Signals common to both technologies and a general comparison of DDR2 and DDR3 is shown in Table 1. It must be noted that “matching” includes cases where the clock net may be made longer (termed DELTA in ALLEGRO SigXP). We have assumed a configuration comprising a Controller and two SDRAMs in most illustrations that follow.

2. PCB Layer stackup and impedance
In a layer constrained implementation, a 4 layer PCB (Figure 1) is a minimum with all routing on TOP and BOTTOM layers. One of the internal layers will be a solid ground plane (GND). The other internal plane layer is dedicated to VDD. Vtt and Vref can be derived from VDD. Use of a 6-layer PCB makes the implementation of certain topologies easier.
PI is also enhanced due to the reduced spacing between power and GND planes. The interconnect characteristic impedance for DDR2 implementation can be a constant. A single-ended trace characteristic impedance of 50 Ohms can be used for all single-ended signals. A differential impedance of 100 Ohms can be used for all differential signals, namely CLOCK and DQS. Further, the termination resistor pulled up to VTT can be kept at 50 Ohms and ODT settings can be kept at 50 Ohms.

In the case of DDR3 however, single ended trace impedances of 40 and 60 Ohms used selectively on loaded sections of ADDR/CMD/CNTRL nets have been found to be advantageous. Further, the value of the termination resistor pulled up to Vtt needs to be optimized in combination with the trace impedance through SI simulations. Typically, it is in the range 30 – 70 Ohms. The differential trace impedance can remain at 100 Ohms.

Figure 1: Four and Six layer PCB stackup

3. Interconnect Topologies
In both cases of DDR2 and DDR3, DQ, DM and DQS signals are point-to-point and do not need any topological consideration. An exception is in the case of multi-rank Dual In Line Memory Modules (DIMMs). Waveform integrity is also easily addressed by a proper choice of drive strengths and On Die Termination (ODT). The ADDR/CMD/CNTRL signals, and sometimes the clock signal will involve a multipoint connection where a suitable topology is needed. Possible choices are indicated in Figure 2 for cases involving two SDRAMs. The Fly-By Topology is a special case of a daisy chain with a very short or no stub.

For DDR3, any of these topologies will work, provided that the trace lengths are minimized. The Fly-by topology shows the best waveform integrity in terms of an increased noise margin. This can be difficult to implement on a 4-layer PCB and the need for a 6-layer PCB arises. The daisy chain topology is easier to implement on a 4 layer PCB.
The tree topology on the other hand requires the length of the branch AB to be very close to that of AC (Figure 2). Enforcing this requirement results in the need to increase the length of the branches which affects waveform integrity. Therefore, for DDR3 implementation, the daisy chain topology with minimized stubs proves to be best suited for 4-layer PCBs.

For DDR2-800 Mbps any of these topologies are applicable with the distinction between each other being less dramatic. Again, the daisy chain proves to be superior in terms of both implementation as well as SI.

Where more than two SDRAMs are present, often, the topology can be dictated by constraints on device placement. Figure 3 shows some examples where a topology could be chosen to suit a particular component placement. Of these, only A and D are best suited for 4-layer PCB implementation. Again, for DDR2-800 Mbps operations all topologies yield adequate waveform integrity. For a DDR3 implementation, in particular at 1600 Mbps, only D appears to be feasible.

Figure 2: ADDR/CMD/CNTRL topologies with 2 SDRAMS (panels: tree topology and fly-by topology, each terminated through Rt to Vtt)

Figure 3: ADDR/CMD/CNTRL topologies with four SDRAMS (panels A to D)

4. Delay matching
Implementing matched delay is usually carried out by bending a trace in a trombone shape. Routing blockage may require layer jumping. Unfortunately, while physical interconnect lengths can be made identical in layout, electrically, the two configurations shown in Figure 4 will not be the same.

The case of trombone delay has been well understood, and the case of a via is obvious. The delay of a trombone trace is smaller than the delay of a straight trace of the same center-line length. In the case of a via, the delay is more than that of a straight microstrip trace of length equal to the via length. The problem can be resolved in two different ways. In the first approach, these values can be pre-computed precisely and taken into account while delay matching.
This would become a tedious exercise which could perhaps be eased with user defined constraints in ALLEGRO 16.0. In the second approach, one would use means to reduce the disparity to an acceptable level.

Figure 4: Illustration of Trombone traces and Vias (a trombone trace with segments L1 to L5 is not equivalent to a straight trace; a via, shown in cross-sectional view, is not equivalent to a straight trace)

Figure 5: Circuit for estimation of trombone effect and resulting waveforms.

Consider the case of a trombone trace. It is known that the disparity can be reduced by increasing the length of L3 (Figure 4). Details can be found in reference [3]. A simulation topology can be set up in SigXP to represent parallel arms of a trombone trace as coupled lines. A sweep simulation is carried out with L3 (S in Figure 5) as a variable and the largest reasonable value that reduces the delay difference with respect to a reference trace is selected. For microstrip traces, L3 > 7 times the distance of the trace to ground is needed.

Delay values are affected in a trombone trace due to coupling between parallel trace segments. Another way to reduce coupling without increasing the spacing is to use a saw tooth profile. The saw tooth profile shows better performance as compared to a trombone although it eventually ends up requiring more space. In either case, it is possible to estimate the effect on delay precisely by using a modified equation for the computation of the effective trace length [3]. This would need to be implemented as a user defined constraint in ALLEGRO.

Consider the case of a through hole via on the 6 layer stackup of Figure 1. Ground vias placed close to the signal vias play an important role in the delay. For the illustration, the microstrip traces on TOP and BOTTOM layers are 150 mils long, and 4 mils wide. The via barrel diameter is 8 mils, the pad diameter is 18 mils and the anti-pad diameter is 26 mils.

Three different cases are considered.
In the first case, the interconnect with via does not have any ground vias in its immediate neighborhood. Return paths are provided at the edges of the PCB 250 mils away from the signal via. In the second case, a reference straight microstrip trace of length = 362 mils is considered. The third case is the same as case 1 with four ground vias in the neighborhood of the signal via. Computed s-parameters with 60 Ohm normalization are shown in Figure 6. It can be seen that the use of 4 ground vias surrounding the signal via makes its behavior more like a uniform impedance transmission line and improves the s21 characteristic. In the absence of a return path in the immediate neighborhood, the via impedance increases. For the present purpose, it is important to know the resulting impact on the delay.

A test circuit is set up similar to Figure 5. The driver is a linear source of 60 Ohms output impedance and outputs a trapezoidal signal of rise time = fall time = 100 ps and amplitude = 1V. It is connected to each of the 3 interconnects shown in Figure 6 and the far end is terminated in a 60 Ohm load. The excitation is a periodic signal with a frequency of 800 MHz. The time difference between the driver waveform at V = 0.5 V and the waveform at the receiver gives the switched delay.

Results are illustrated in Figure 7 where only the rising edge is shown. It can be seen that the delay with four neighboring ground vias differs from that of the straight trace by 3 ps. On the other hand, the difference is 8 ps for the interconnect with no ground vias in the immediate neighborhood.

It is therefore clear that increasing the ground via density near signal vias will help. However, in the case of 4 layer PCBs, this will not be possible as the signal traces adjacent to the Power plane will be referenced to a Power plane. Consequently, the signal return path would depend on decoupling.
Therefore, it is very important that the decoupling requirement on 4 layer PCBs addresses return paths in addition to meeting power integrity requirements.

The clock net is differential in both DDR2 and DDR3. In DDR2, DQS can be either single ended or differential although it is usually implemented as differential at higher data rates. The switched delay of a differential trace is less than that of a single ended trace of identical length. Where timing computations indicate the need, the clock and DQS traces may need to be made longer than the corresponding ADDR/CMD/CNTRL nets and DATA nets. This would ensure that the clock and DQS transitions are centered on the associated ADDR/CMD/CNTRL nets and DQ nets.

Since DQ and DM nets run at the maximum speed, it is desirable that all of these nets in any byte lane be routed identically, preferably without vias. Differential nets are less sensitive to discontinuities and where layer jumping is needed, the DQS and CLOCK nets should be considered first.

Figure 6: s-parameters of interconnects with vias (60 Ohm normalization)

Figure 7: Driver and Receiver waveforms for the 3 cases of Figure 6. (Plot colors correspond)

5. Crosstalk
Cross talk contributes to delay uncertainty and is significant for microstrip traces. This is generally reduced by increasing the spacing between adjacent traces for long parallel runs. This has the drawback of increasing the total trace length and therefore a reasonable value must be chosen. Typically the spacing should be greater than twice the trace distance to ground. Again, ground vias play an important role. Near and far end coupling levels are illustrated in Figure 8. Use of multiple ground vias reduces coupling levels by 7 dB. To derive the interconnect budget, a simulation of a victim trace with two aggressors on both sides is adequate. Using a periodic excitation on all nets will yield the cross talk induced jitter.
Using a pseudo random excitation on all nets will show the effect of both cross talk as well as data dependencies. Time domain results are not shown here, but it is easily done by setting up a 5 coupled line circuit in SigXP with the spacing between traces set up for sweeping. Reasonable spacing values that keep the jitter in the waveform due to both cross talk as well as pattern dependence at an acceptable level are chosen.

Figure 8: s-parameters of coupled traces (60 Ohm normalization)

6. Power Integrity
Power Integrity here refers to meeting the Power supply tolerance requirement under a maximum switching condition. Failure to address this requirement properly leads to a number of problems, such as increased clock jitter, increased data dependent jitter, and increased cross talk all of which eventually reduce timing margins.

The theory for decoupling has been very well understood and usually starts with the definition of a “target impedance” as [4]

$$Z_{\text{target}} = \frac{\text{Voltage tolerance}}{\text{Transient current}} \qquad (1)$$

An important requirement here is knowledge of the transient current under worst case switching condition. A second important requirement is the frequency range. This is the range of frequencies over which the decoupling network must ensure that its impedance value is equal to or below the required target impedance. On a printed circuit board, capacitance created by the Power-Ground sandwich and the decoupling capacitors needs to handle a minimum frequency of ~100 kHz up to a maximum frequency of ~100-200 MHz. Frequencies below 100 kHz are easily addressed by the bulk capacitance of the voltage regulator module. Frequencies above 200 MHz should be addressed by the on-die and in some cases on-package decoupling capacitance. Due to the finite inductance of the package, there is no need to provide decoupling on the PCB to handle frequencies greater than 200 MHz.
The actual computation of power integrity can be very complex involving IC package details, simultaneously switched signals and the PCB power distribution network. For PCB design, the use of the target impedance approach to decoupling design is simpler and provides a practical solution with very little computational effort.

The three power rails of concern are the VDD, VTT and Vref. The tolerance requirement on the VDD rail is ~5% and the transient current is determined as the difference between Idd7 and Idd2 as specified by JEDEC [1,4]. This is accomplished by using plane layers for power distribution and a modest number of decoupling capacitors. It is preferable to use decoupling capacitors of 10 different values distributed in the range of 10 nF to 10 uF. Further, the capacitor pad mounting structure should be designed for reduced mounted inductance.

The Vref rail has a tighter tolerance, but it draws very little current. Its target impedance is easily met using narrow traces and one or two decoupling capacitors. It is important however that the capacitors be located very close to the device pins.

The VTT rail proves to be challenging because it not only has a tighter tolerance, but it also draws a transient current close to that of the VDD rail. The transient current is easily calculated as described in reference [5]. Again, the target impedance requirement can be met using an increased number of decoupling capacitors.

On a 4 layer PCB, the planes are too far apart and consequently the advantage of inter-plane capacitance is lost. The number of decoupling capacitors needs to be increased and higher frequency capacitors with values less than 10 nF may be needed. These computations are easily done using ALLEGRO SI Power Integrity option.

7. Timing
Timing computation is carried out as described in reference [6]. A table needs to be set up for the following eight cases:
1. Write Setup analysis DQ vs. DQS
2. Write Hold analysis DQ vs. DQS
3. Read Setup analysis DQ vs. DQS
4. Read Hold analysis DQ vs. DQS
5. Write Setup analysis DQS vs. CLK
6. Write Hold analysis DQS vs. CLK
7. Write Setup analysis ADDR/CMD/CNTRL vs. CLK
8. Write Hold analysis ADDR/CMD/CNTRL vs. CLK

An example is shown for the case of Write setup analysis in Table 2. Actual numbers have been omitted as they are not precisely known yet for DDR3. These numbers are obtained from data sheets of Controller and memory manufacturers. The numbers in the interconnect section are determined by SI simulations. All the eight cases need to be analyzed for DDR2. For DDR3, 5 and 6 are not needed due to its write leveling feature. In the PCB implementation, length match tolerances must ensure that the total margin is positive.

| Element | Skew component | Setup | Units | Comments |
| Controller | a.) DQ vs. DQS skew at transmitter output | | ps | From controller design data |
| Controller | b.) Data / Strobe PLL jitter | | ps | Used if not included in transmitter skew |
| Total Controller | a + b | | ps | |
| SDRAM (or DIMM) | Setup requirement (tDSb @ Vih/Vil level) | | ps | From SDRAM datasheet; this number is to be adjusted based on DQ and DQS slew rates |
| SDRAM (or DIMM) | DQ slew rate | | V/ns | Measured as per JEDEC specification from SI simulation results |
| SDRAM (or DIMM) | DQS slew rate | | V/ns | Measured as per JEDEC specification from SI simulation results |
| Total SDRAM setup requirement | tDSb + slew rate adjustment | | ps | Includes slew rate adjustment |
| Interconnect | a.) Data Xtalk | | ps | 2 aggressors (one each side of the victim); victim – repetitive; aggressor – PRBS |
| Interconnect | b.) DQS Xtalk | | ps | 2 aggressors (one each side of the victim); victim – repetitive; aggressor – PRBS |
| Interconnect | c.) Length matching tolerance | | ps | Extracted from SI simulation results; longest data net, worst case PVT corner |
| Interconnect | d.) Characteristic impedance mismatch | | ps | Can be omitted if routing of DQ and corresponding DQS signals are done on same layer |
| Total Interconnect | Interconnect skew (a + b + c + d) | | ps | |
| Min. Total Setup Budget | 0.24*tck | | ps | From SDRAM datasheet (includes clock duty cycle variation) |
| Setup margin | Min. Total Setup Budget – (Total Controller + Total SDRAM + Total Interconnect) | | ps | Must be positive |

Table 2: Illustration of DDR3 Write Setup timing analysis summary for DQ vs. DQS

8. PCB Layout
Implementation on a PCB involves a number of tradeoffs to meet SI requirements. Often, the question is how far does one need to go? PCB layout tasks are facilitated using the following approach:
1. Set up topology and constraints in ALLEGRO Constraint Manager.
2. Design Controller BGA breakout. A controller pin arrangement with ADDR/CMD/CNTRL pins in the middle and DQ/DQS/DM byte lanes on either side is best suited. Within these groups, individual pins may need to be swapped to ensure routing with minimum cross-over.
3. Attempt routing with reduced stub length and a minimum trace spacing as obtained from cross talk simulation. Often, most stubs can be eliminated but it will not be possible for all the pins. One may try two traces between BGA pads of the memory devices. This would require narrow PCB traces which can increase manufacturing cost. Yet, it will not be possible for all signals unless micro via and via-in-pad technology is used. Complete routing with coarse length matching tolerances.
4. Place Vref decoupling capacitors close to the Vref pins. Vtt decoupling can be placed at the far end of the last SDRAM and will not come in the way of routing. VDD decoupling can be placed close to devices where possible without blocking routing channels. The smaller valued capacitors should be placed closer to the devices. With a proper decoupling design, it will not be necessary to cram all capacitors close to the devices. All decoupling capacitors should use a fan out for the footprint designed for reduced inductance. This is typically two short wide traces perpendicular to the capacitor length. This can be automated by using a user defined capacitor footprint that can be attached to all the decoupling capacitors in the schematic.
5. Implement fine length matching and insert multiple ground vias where signal traces jump layers. It is better to use the delay matching option in ALLEGRO and one must include z-axis delay. Typically, P and N nets of differential pairs should be matched with a tolerance of +/- 2 ps and the tolerance for all other matched nets can be +/- 10 ps or more based on the timing margin computation.

9. DIMM
Considerations described above apply to the case of PCBs containing one or more DIMMs. The only exception is that the decoupling requirement for the memories can be relaxed as it is already accounted for on the DIMM PCB. SI analysis of registered DIMMs is also much simpler where the DIMM is treated as a single load. While the routing topology for ADDR/CMD/CNTRL nets is usually a daisy chain with reduced stubs, tree topologies can also be used for registered DIMMs. Analysis of un-buffered DIMMs can become tedious as the timing requirement at all the SDRAMs must be analyzed. DIMM routing on 4-layer PCBs is relatively simpler compared to the case of SDRAMs.

10. Examples
The detail described above has been used in the implementation of a DDR2 PCB, a DDR3 PCB and a DDR3 – DIMM PCB. The controller is from MOSAID [7] which is designed to provide both DDR2 as well as DDR3 functionality. For the SI simulations, IBIS models have been used. Models for the memories are from MICRON Technology, Inc [8]. The IBIS models for the DDR3 SDRAMs were available at 1333 Mbps speed. These were used at 1600 Mbps. For the unbuffered DDR3 DIMM (MT_DDR3_0542cc) EBD models from Micron Technology were used. All waveforms are for the typical case and are computed at the SDRAM die. The 6 layer PCB stackup of Figure 1 is used with routing on TOP and BOTTOM layers only. The memory consists of 2 SDRAMs routed as a daisy chain. In the case of the DIMM, a single unbuffered DIMM is used. TOP/BOTTOM layer routing and Signal Integrity waveforms are shown in Figures
9-11.

Figure 9: Illustration of TOP and BOTTOM layers of a DDR3 PCB with computed waveforms at the farthest SDRAM. Waveform on left is an ADDRESS net compared to that of the CLOCK net. Waveform on the right is a DATA net compared to that of a DQS net. Clock frequency = 800 MHz and data rate is 1600 Mbps.

Figure 10: Illustration of TOP and BOTTOM layers of a DDR2 PCB with computed waveforms at the farthest SDRAM. Waveform on left is an ADDRESS net compared to that of the CLOCK net. Waveform on the right is a DATA net compared to that of a DQS net. Clock frequency = 400 MHz and data rate is 800 Mbps.

Figure 11: Illustration of TOP and BOTTOM layers of a DDR3 – DIMM PCB with computed waveforms at the 8th (last) SDRAM on DIMM. Waveform on left is an ADDRESS net compared to that of the CLOCK net. Waveform on the right is a DATA net compared to that of a DQS net.

Lastly, Figure 12 shows a comparison of computed and measured DATA eye patterns of an 800 Mbps DDR2. In all cases waveform integrity can be seen to be excellent.

Figure 12: Computed (Red) and Measured (blue) waveforms of a data net of an 800 Mbps DDR2 PCB.

11. Conclusion
In this paper, all aspects related to SI, and PI of DDR2 and DDR3 implementation have been described. Use of Constraint Manager in ALLEGRO™ makes implementation easy. While a four layer PCB implementation of 800 Mbps DDR2 and DDR3 appears to be feasible, DDR3-1600 Mbps will prove to be challenging. It will become clearer as the memory devices become available and one has a good handle on timing numbers.

References
[1] DDR2 SDRAM Specification, JEDEC JESD79-2B, January 2005.
[2] DDR3 SDRAM Standard, JEDEC JESD79-3, June 2007.
[3] Syed Bokhari, “Delay matching on Printed Circuit Boards”, Proceedings of the CDNLIVE 2006, San Jose.
[4] Larry D Smith, and Jeffrey Lee, “Power Distribution System for JEDEC DDR2 memory DIMM”, Proc. IEEE EPEP conference, Princeton, N.J., pp. 121-124, October 2003.
[5] Hardware and layout design considerations for DDR2 SDRAM Memory Interfaces, Freescale semiconductor Application Note, Doc. No. AN2910, Rev. 2, 03/2007.
[6] DDR2 design guide for 2 DIMM systems, Technical Note, Micron Technology Inc. TN-47-01, 2003.
[7] /corporate/products-services/ip/SDRAM_Controller_whitepaper_Oct_2006.pdf
[8] /products/dram/ddr2/partlist.aspx?speed=DDR2-800
[9] /products/dram/ddr3/partlist.aspx?speed=DDR3-1066
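
As a rough illustration of Equation (1) in the Power Integrity section of the paper above, the target impedance of a rail can be estimated from its voltage, its allowed tolerance, and the worst-case transient current. The sketch below is not from the paper: the function name is made up for illustration, and the 0.5 A transient current is a placeholder, since the paper takes the real value from the Idd7 and Idd2 figures in the device data.

```python
def target_impedance(rail_voltage_v, tolerance_fraction, transient_current_a):
    """Target impedance in ohms: (voltage tolerance) / (transient current)."""
    if transient_current_a <= 0:
        raise ValueError("transient current must be positive")
    allowed_ripple_v = rail_voltage_v * tolerance_fraction
    return allowed_ripple_v / transient_current_a


# Hypothetical example: 1.5 V VDD rail, 5 % tolerance, 0.5 A transient step.
z_target = target_impedance(1.5, 0.05, 0.5)
print(f"Z_target is about {z_target * 1000:.0f} mOhm")
```

The decoupling network then has to stay at or below this impedance over the frequency range the paper assigns to the PCB, roughly 100 kHz to 100-200 MHz.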

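DDR3 Hardware Design and Layout Design

...overshoot and undershoot, etc.).
Termination scheme
Designers should adopt the mainstream termination scheme used on commercial PC motherboards: ODT termination on the data bus, with the address/command and control lines also terminated through resistors to VTT. Other termination schemes can work as well, but they are best verified by simulation to confirm that signal quality meets the requirements.
3. Check that the power rating of the selected termination resistors meets the chip manufacturer's requirements.
Power calculation: $P = I^2 \times R_T$
4. If external termination resistors have been added to the data group, check that the data group is kept away from, or isolated from, the other DDR3 signal groups.
Note: internal ODT termination is normally preferred for the DDR3 data group, so additional resistors are not needed. If ODT is not used, however, external resistors must be added.
5. Check that the VTT resistors RT are placed correctly. The RT termination resistors should connect directly to the end of the DDR bus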
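No. / Description
27. ...should be routed on the same layer, with the same number of vias on each net.
Note: some DDR devices have a 32-bit data interface.
Lane 0: MDQ(7:0), MDM(0), MDQS(0), and the complement of MDQS(0)
Lane 1: MDQ(15:8), MDM(1), MDQS(1), and the complement of MDQS(1)
Lane 2: MDQ(23:16), MDM(2), MDQS(2), ...
28. Is VREF properly decoupled? Place a 0.1 µF capacitor at both the source end and the destination end.
29. Does the VREF source vary with VDDQ, temperature, and noise, and does that variation stay within the JEDEC requirements?
30. Can the VREF source supply the current the system (DDR devices and processor) requires?
If VREF is generated by a resistor divider, make sure the resistor values are correct and that the resistors have a tolerance of 1% or better.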
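
The 1% resistor requirement for a VREF divider can be sanity-checked with a worst-case calculation. The sketch below is an illustration, not part of the checklist: the function name and the 1 kΩ values are hypothetical, and the acceptance band of 0.49 to 0.51 times VDDQ is an assumption that should be confirmed against the JEDEC specification and the device datasheet.

```python
def vref_ratio_range(r_top_ohm, r_bottom_ohm, tolerance):
    """Return (min, max) of VREF/VDDQ for a two-resistor divider.

    The minimum occurs with the top resistor high and the bottom resistor low;
    the maximum occurs with the opposite combination.
    """
    low = r_bottom_ohm * (1 - tolerance) / (
        r_top_ohm * (1 + tolerance) + r_bottom_ohm * (1 - tolerance))
    high = r_bottom_ohm * (1 + tolerance) / (
        r_top_ohm * (1 - tolerance) + r_bottom_ohm * (1 + tolerance))
    return low, high


# Hypothetical divider: two 1 kOhm, 1 % resistors between VDDQ and ground.
low, high = vref_ratio_range(1000.0, 1000.0, 0.01)
ASSUMED_BAND = (0.49, 0.51)  # assumption, check the datasheet
within_band = ASSUMED_BAND[0] <= low and high <= ASSUMED_BAND[1]
print(f"VREF/VDDQ between {low:.4f} and {high:.4f}, within band: {within_band}")
```

Under these assumptions a 1% divider uses about half of the band, leaving the rest for noise and load effects; repeating the calculation with 5% parts shows why looser resistors are not recommended.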
Routing
The recommended DDR3 routing order is as follows:

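Graduation Project: PCB Design for DDR

Contents
PCB Design for DDR (Chinese abstract)
The PCB design of DDR (English abstract)
Chapter 1 Introduction
1.1 Overview of DDR
1.2 Differences between DDR and SDRAM
1.3 Verifying the electrical characteristics of DDR memory
Chapter 2 Noise Sources and Analysis
2.1 Reflection noise analysis and termination techniques
2.1.1 Causes of reflection
2.1.2 Impedance matching and termination schemes
2.1.3 Simulation results for the termination schemes
2.2 Crosstalk noise analysis
2.2.1 Crosstalk analysis model for high-speed PCBs
2.2.2 Crosstalk simulation results for high-speed PCBs
2.2.3 Measures to reduce crosstalk noise on high-speed PCBs
Chapter 3 Integrity Analysis
3.1 Power integrity
3.2 Timing analysis
3.2.1 Timing analysis for common-clock synchronous interfaces
3.2.2 Timing analysis for source-synchronous interfaces
3.3 Case study
Chapter 4 Placement and Routing
4.1 PCB stackup and impedance
4.2 Interconnect topologies
4.3 SDRAM placement and routing
4.4 DDR placement and routing
4.4.1 Placement notes
4.4.2 Routing notes
4.4.3 Key routing points
4.6 Power supply
Conclusion
References
Acknowledgements
Appendix: data lines of the same group on the same layer

PCB Design for DDR
Abstract: With the continuing development of microelectronics and computer technology, DDR (double data rate) synchronous dynamic random access memory is becoming increasingly important in communication systems, and as electronic products grow more integrated, the demands on DDR PCB design keep rising.

To help the reader understand DDR, this thesis also introduces SDRAM and covers its design.

The project is a PCB design based on DDR synchronous dynamic random access memory.

The thesis focuses on the signal integrity problems encountered in DDR PCB design.

It explains in detail the theory of reflection, crosstalk, and timing analysis as they bear on signal integrity, and proposes effective measures to reduce reflection and crosstalk.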

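DDR3 Circuit Design

DDR3 is a double data rate SDRAM (synchronous dynamic random access memory) that offers high speed, high density, and low power consumption.

A DDR3 circuit design needs to address the following aspects:
1. Timing design. DDR3 timing is very strict and requires precise clocking and signal synchronization. The design must keep the clock accurate and stable and account for the delay and alignment of the data and control signals.

2. Signal integrity. High-speed DDR3 transfers require attention to signal integrity, including transmission line design, impedance matching, crosstalk, and noise suppression. Lay out the PCB sensibly, keep signal paths short, and use techniques such as differential signaling to improve signal integrity.

3. Power and grounding. DDR3 needs a stable supply and ground for the devices to work correctly. Plan the routing and connection of power and ground to reduce supply noise and keep the supply stable.

4. Self-calibration and timing calibration. DDR3 provides self-calibration and timing calibration features that correct offset and delay between the clock and data signals. The design must provide for implementing and controlling these features.

5. Thermal management. DDR3 generates considerable heat at high speed, so heat dissipation has to be designed in, including heat spreaders and fans.

In summary, a DDR3 circuit design must consider timing, signal integrity, power and grounding, calibration, and thermal management together to ensure that the DDR3 devices work correctly and transfer data reliably at high speed.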


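This article uses time-domain analysis tools to analyze a DDR3 design quantitatively. It introduces the main factors that affect signal integrity, performs timing analysis of the DDR3 interface, and uses the results to improve and optimize the design.

1 Overview
DDR3 memory is now widely used in computer systems, and its data rate has been raised repeatedly, reaching 1866 Mbps.

At these bus speeds, guaranteeing reliable data transfer while meeting the timing requirements of a parallel bus is a major design challenge.

This article uses Cadence time-domain analysis tools to analyze a DDR3 design quantitatively, introduces the main factors that affect signal integrity, performs timing analysis, and uses the results to improve and optimize the design so that signal quality, and with it reliability, improves substantially.

2 Introduction to DDR3
Like DDR2, a DDR3 memory system consists of a controller and memory devices, and both generations use source-synchronous timing: the strobe (clock) is not sent by an independent clock source but by the driving device.

Compared with DDR2, DDR3 has a higher data rate, up to 1866 Mbps; it uses 8-bit prefetch, which markedly increases memory bandwidth; and it operates at 1.5 V, which lowers power at the same frequency.

A DDR3 interface is difficult to implement. DDR3 adopts the fly-by topology and uses write leveling to control timing skew inside the devices.

These measures help with implementation and signal integrity, but they are not sufficient on their own for a high-frequency, high-bandwidth memory system; simulation is still needed to ensure that the design can be implemented with good signal quality.

3 Simulation Analysis
The DDR3 simulation is described here using a concrete project: a 64-bit dual-core PowerPC CPU module that uses the Micron MT41J256M16HA-125IT as memory

and the Freescale P5020 as processor. The module's memory bus is configured for a data rate of 1333 MT/s, and the simulation frequency is 666 MHz.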
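
The relation between the quoted data rate and the simulation frequency follows from double data rate signaling: two transfers per clock cycle. The sketch below works through that arithmetic. The helper names are made up for illustration, and the 64-bit bus width is taken from the length-matching section later in this article.

```python
def clock_mhz_from_data_rate(data_rate_mtps):
    """DDR transfers data on both clock edges, so the clock is half the data rate."""
    return data_rate_mtps / 2.0


def peak_bandwidth_gbytes_per_s(data_rate_mtps, bus_width_bits):
    """Theoretical peak bandwidth, ignoring refresh and protocol overhead."""
    return data_rate_mtps * 1e6 * bus_width_bits / 8 / 1e9


data_rate = 1333  # MT/s, as configured in the example module
clock = clock_mhz_from_data_rate(data_rate)
bandwidth = peak_bandwidth_gbytes_per_s(data_rate, 64)
print(f"clock is about {clock:.1f} MHz, peak bandwidth about {bandwidth:.2f} GB/s")
```

The result, 666.5 MHz, is what the article rounds to 666 MHz, and the corresponding clock period of about 1.5 ns is the value used in the timing calculations further on.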

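3.1 Preparation before simulation
Before the analysis, confirm the PCB stackup with the board fabricator on the basis of the DDR3 impedance requirements.

The key to good transmission line performance at high speed is a continuous characteristic impedance. The impedance of high-speed signal traces has to be held within a defined range so that the board is a controlled-impedance board; this is the basis for the simulation.

The DDR3 bus uses a single-ended impedance of 50 Ω and a differential impedance of 100 Ω.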
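
Before talking to the fabricator, a first-pass estimate of the trace width needed for a 50 Ω surface microstrip can be made with a closed-form approximation. The sketch below uses the commonly quoted IPC-2141 style microstrip formula, which is not taken from this article and is only reasonable for width-to-height ratios of roughly 0.1 to 2. The stackup numbers are hypothetical; a field solver or the fabricator's own calculation should give the final values.

```python
import math


def microstrip_z0(er, h, w, t):
    """Approximate single-ended microstrip impedance in ohms.

    er: relative permittivity of the dielectric
    h:  dielectric height to the reference plane
    w:  trace width
    t:  copper thickness (h, w and t in the same unit, for example mil)
    """
    return 87.0 / math.sqrt(er + 1.41) * math.log(5.98 * h / (0.8 * w + t))


# Hypothetical stackup: FR-4 with er = 4.3, 5 mil dielectric, 1.4 mil copper.
for width_mil in (6.0, 8.0, 10.0):
    z0 = microstrip_z0(4.3, 5.0, width_mil, 1.4)
    print(f"w = {width_mil:4.1f} mil -> Z0 is about {z0:.1f} ohm")
```

With these assumed values a width of about 8 mil lands close to 50 Ω; wider traces lower the impedance and narrower ones raise it.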

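Set the termination voltage of the nets under analysis; assign models to the devices under analysis, including passive components; set the device class properties; and verify the pin properties (input/output, power/ground, and so on).

3.2 Pre-layout simulation
Pre-layout simulation optimizes the circuit before PCB design. It covers reducing reflections and overshoot and determining termination resistor values and trace impedance; the passive components are analyzed in different configurations to select the most suitable parameters.

Figure 1: Topology of the clock lines

(1) Differential clock analysis of the DDR3 bus
In differential signaling, all information is carried by the differential-mode signal, while the common-mode signal radiates energy and can significantly increase EMI. The quality of the differential signal is therefore very important, and the common-mode component should be kept to a minimum.

The analysis of the differential clock has to look beyond its own signal quality: because all other signals are sampled relative to the differential clock, its monotonicity and overshoot also matter.

In this example the differential clock uses the same serial fly-by topology as the address bus. Figure 1 shows the clock topology, with the P5020 processor driving four DDR3 memory devices. With a simple resistive termination at the far end and the pair routed differentially on the PCB, reflection analysis showed large overshoot and undershoot in the waveform at the receiver.

Adding a correctly chosen pull-down termination resistor at the processor output slightly reduces the voltage swing, but it clearly reduces overshoot and undershoot and removes the reflection interference, which in turn reduces the common-mode component on the differential pair.

Figure 2 compares the results.

Figure 2: Reflection waveforms at the DDR3 receiver

(2) Verifying drive strength and ODT options
The data signals of the DDR3 memory bus have two drive-strength modes, FULL and HALF, and the on-die termination (ODT) offers 0 Ω, 20 Ω, 30 Ω, 40 Ω, 60 Ω, and 120 Ω options. Each option corresponds to a different model and is used to control the effect of signal reflections.

To improve signal quality and reduce power, the different modes can be analyzed to select the right parameter model.

The data bus is analyzed here with different ODT options.

Figure 3 shows the data-signal eye diagrams obtained with different ODT settings. With the same driver and routing, a higher ODT impedance gives a higher slew rate and a larger amplitude. With ODT = 60 Ω the received waveform is smooth and the signal quality is best, with the least jitter and no significant overshoot.

Figure 3: Data-signal eye diagrams

Crosstalk is the unwanted voltage noise that electromagnetic coupling induces on adjacent transmission lines as a signal propagates along a line.

It produces forward and backward crosstalk and mainly affects the amplitude and the edges of the signal waveform.

Three adjacent traces are extracted from the DDR3 data bus and analyzed with a three-line model, arranged as in Figure 4. The middle trace is the victim and is held low; the two outer traces are aggressors driven with a 128-bit pseudo-random pattern. Crosstalk is analyzed for different trace widths and spacings, and Table 1 lists the results.

Figure 4: Trace cross-section

Table 1: Crosstalk analysis results

Table 1 shows that crosstalk decreases as the spacing increases. Following the 3W rule, in which the edge-to-edge spacing S is three times the trace width W, greatly reduces the effect of crosstalk.

Larger spacing costs routing space, however, so the two have to be balanced to get the best routing within the space available.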

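3.3 Timing analysis
DDR3 is a parallel bus with source-synchronous timing. Besides meeting signal quality requirements, it must also meet its timing requirements.

In a source-synchronous system, the driving device generates both the data and the clock (strobe) internally, and the two are transmitted in parallel.

DDR3 groups its signals according to their timing relationships, as listed in Table 2.

Table 2: Timing groups

Source-synchronous timing equations:

$$T_{\text{setup\_margin}} = T_{vb} - T_{\text{setup}} - T_{\text{skew}}$$

$$T_{\text{hold\_margin}} = T_{va} - T_{\text{hold}} - T_{\text{skew}}$$

where $T_{\text{setup\_margin}}$ and $T_{\text{hold\_margin}}$ are the setup and hold time margins; $T_{vb}$ and $T_{va}$ are the setup and hold times provided by the driver; $T_{\text{setup}}$ and $T_{\text{hold}}$ are the setup and hold times required by the receiver; and $T_{\text{skew}}$ is the skew of the data or address signals relative to their reference clock.

Tvb, Tva, Tsetup, and Thold can all be taken from the device datasheets. The key quantity is the flight-time skew between the data and strobe signals, which comprises the driver output skew and the skew caused by differences in PCB trace length. It has to be obtained by timing simulation with non-ideal random patterns.

Taking DDR3 data read and write operations as an example, the static timing margin is calculated from the datasheet timing parameters in the table below to derive the timing rules for PCB design.

Table 3: Timing parameters required for simulation (data rate: 1333 MT/s)

Controller:
Write: Tsetup_margin = 0.25 - 0.03 = 0.22 ns; Thold_margin = 0.25 - 0.065 = 0.185 ns
Read: Tsetup_margin = Thold_margin = (0.38 × 1.5 - 0.125)/2 - 0.125 = 0.0975 ns = 97.5 ps

These values are theoretical timing margins. In practice, source-synchronous timing is affected not only by the delays inherent in the devices but also by other factors, all of which count as skew. The main ones are:
(1) Signal integrity problems on the high-speed bus, such as crosstalk, simultaneous switching noise, and inter-symbol interference (ISI), which have to be estimated by signal simulation.

(2) Timing skew in the high-speed bus interconnect, mainly routing mismatch along the link. The whole link, including the traces inside the device package, the PCB traces, and the vias, contributes to timing skew, which can be controlled by length-matched routing.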
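
The static margin arithmetic above can be reproduced with a few lines of code, which also makes it easy to see how much margin remains once a skew budget is subtracted. The sketch below follows the two margin equations as given. Because Table 3 did not survive in this copy, the parameter names are generic, the numbers are simply those quoted in the text, and the 50 ps skew budget is a made-up example.

```python
def timing_margin_ns(t_valid_ns, t_required_ns, t_skew_ns=0.0):
    """Margin = time provided by the driver - time required by the receiver - skew."""
    return t_valid_ns - t_required_ns - t_skew_ns


# Write operation, numbers as quoted in the text (skew not yet included).
write_setup = timing_margin_ns(0.25, 0.03)
write_hold = timing_margin_ns(0.25, 0.065)

# Read operation, expression as quoted in the text; t_ck is the clock period.
t_ck_ns = 1.5
read_margin = (0.38 * t_ck_ns - 0.125) / 2 - 0.125

print(f"write setup margin about {write_setup * 1000:.1f} ps")
print(f"write hold margin  about {write_hold * 1000:.1f} ps")
print(f"read margin        about {read_margin * 1000:.1f} ps")

# Hypothetical skew budget of 50 ps for crosstalk, ISI and routing mismatch.
remaining = timing_margin_ns(0.25, 0.03, t_skew_ns=0.05)
print(f"write setup margin after 50 ps skew about {remaining * 1000:.1f} ps")
```

The read case has the smallest margin, which is consistent with the article's later statement that the data group has the tightest length-matching tolerance.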

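An 8-bit DQ data bus is analyzed as follows: one data line is chosen as the victim, the remaining data lines are driven with inverted pseudo-random patterns, and the drivers are set to their fastest corner. A combined simulation is run under this worst case and the waveform on the victim line is examined.

Comparing Figure 5 and Figure 6 shows that the combined effect of these factors reduces both the eye height and the eye width of the data line, which greatly reduces the timing margin of the data bus.

This reduction is subtracted from the timing margin calculated above, other noise sources are considered, and some margin is kept in reserve based on experience. The remaining timing margin can then be converted into routing length constraints for the PCB design (about 6 in/ns).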
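
The conversion between timing margin and routing length at the quoted rate of about 6 in/ns is a simple proportion. The sketch below shows it in both directions. The function names are illustrative, and the 6 in/ns figure is the article's rule of thumb rather than a value extracted from a particular stackup.

```python
INCHES_PER_NS = 6.0  # propagation rate quoted in the article (approximate)


def delay_ps_to_length_mil(delay_ps):
    """Trace length in mil that corresponds to a given delay in ps."""
    return delay_ps / 1000.0 * INCHES_PER_NS * 1000.0


def length_mil_to_delay_ps(length_mil):
    """Delay in ps that corresponds to a given trace length in mil."""
    return length_mil / 1000.0 / INCHES_PER_NS * 1000.0


print(f"10 ps of skew is about {delay_ps_to_length_mil(10):.0f} mil of trace")
print(f"a 10 mil mismatch is about {length_mil_to_delay_ps(10):.2f} ps")
print(f"a 100 mil mismatch is about {length_mil_to_delay_ps(100):.1f} ps")
```

At this rate the length tolerances used in the design rules that follow translate into a few picoseconds for the data group and under 20 ps for the address and control group relative to the clock.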

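Figure 5: Ideal waveform of the DQ data bus

Figure 6: Actual worst-case waveform of the DQ data bus

4 PCB Design
4.1 Design rule constraints
(1) Length-matching constraints
Lengths are matched per group, with the groups defined as follows.
Data lines with DQS and DM: the 64 data lines are divided into groups of 8 bits, each with its own DQS and DM signals. Because the data timing margin is the smallest, delay within a group is strictly controlled: lengths are matched to the group's DQS within ±10 mil.

Address, control, and clock lines: these must be length matched. The branches of the address and control lines are matched within ±20 mil, address and control lines are matched to the clock within ±100 mil, and the two lines of the differential clock pair are matched within ±5 mil.

(2) Spacing constraints
Traces within the same DDR3 group are spaced at 2 times the trace width; traces of different groups are spaced at 3 times the trace width; and DDR3 traces are kept more than 50 mil from non-DDR3 traces to control crosstalk.

(3) Trace width constraints
Calculate the trace width from the transmission line impedance requirement and the board stackup, and set the width rules accordingly so that the impedance stays consistent.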
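
The length-matching rule for a data group can be checked mechanically once routed lengths are exported from the layout tool. The sketch below is a minimal, tool-independent check: the net names and lengths are made up, and only the ±10 mil tolerance relative to the group's DQS comes from the constraints above.

```python
def length_violations(lengths_mil, reference_net, tolerance_mil=10.0):
    """Return {net: deviation in mil} for nets outside the tolerance.

    lengths_mil maps net names to routed lengths; reference_net is the net
    the group is matched to (the DQS of the byte lane in the rule above).
    """
    reference = lengths_mil[reference_net]
    return {
        net: length - reference
        for net, length in lengths_mil.items()
        if net != reference_net and abs(length - reference) > tolerance_mil
    }


# Hypothetical routed lengths for one byte lane, in mil.
byte_lane_0 = {
    "DQS0": 1500.0,
    "DQ0": 1504.0,
    "DQ1": 1493.0,
    "DQ2": 1515.0,  # 15 mil too long
    "DM0": 1488.0,  # 12 mil too short
}

for net, deviation in length_violations(byte_lane_0, "DQS0").items():
    print(f"{net}: {deviation:+.1f} mil relative to DQS0")
```

The same function can be reused for the address and control group by passing the clock as the reference net with a 100 mil tolerance.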

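4.2 Routing tips
Route the signals of a group on the same layer where possible, with the clock adjacent to a ground layer. Use as few vias as possible; where vias are needed, keep the via count the same within a group so that the nets stay consistent. Traces on adjacent signal layers should cross each other rather than run over each other for long distances; if the spacing between adjacent layers is large enough, this requirement can be relaxed. Avoid right-angle corners and use 45° diagonals or arcs instead.

Follow the 3W rule wherever possible. High-speed traces on a signal layer adjacent to a power layer should not cross splits in the power or ground plane. Pull the power plane back from the edge of the ground plane by 20H (H is the dielectric thickness between the power and ground layers). Isolated copper islands are not allowed.

5 Post-layout Simulation and Verification
Once the DDR3 PCB design is finished, post-layout simulation is run to verify the earlier simulation results.

Post-layout simulation mainly analyzes DDR3 signal quality and timing relationships.

5.1 DDR3 differential clock verification
The DDR3 differential clock is simulated after routing: an actual routed clock pair is extracted and the complete link is analyzed. Figure 7 shows the waveform; both its monotonicity and its overshoot and undershoot meet the requirements.

Figure 7: Differential clock waveform on the routed PCB

5.2 DDR3 timing verification
Timing verification after routing is an equally important step.

After the synchronous signal groups and their strobes have been defined, the Bus Setup function of the Cadence software is used for a combined analysis of the data bus bits and their DQS signal. The clock frequency is set to 666 MHz, the appropriate IBIS models are assigned, and a random bit stream is applied. The timing parameters measured from the results are used to calculate the timing margin and to check whether the PCB routing satisfies the timing relationships.

Figure 8 shows the result.

Figure 8: Data bus write timing

The setup and hold times of the data bus can be measured from Figure 8. A static timing calculation is then made for the corresponding DDR3 data timing, and the remaining influences on timing, including the skew caused by PCB trace length, are estimated to confirm that the setup and hold times at the DDR3 receiver are met. The other timing relationships can be verified in the same way.

6 Conclusion
The analysis of the DDR3 memory design on this PowerPC module shows how reflections, crosstalk, and timing affect a high-speed design. Simulation has become a necessary means of improving the reliability and stability of computer system designs and of safeguarding high-speed digital circuit design.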
