LOW DIAMETER REGULAR GRAPH AS A NETWORK TOPOLOGY IN DIRECT AND HYBRID INTERCONNECTION NETWO

Engineering with Computers (2002) 18: 229–240
Ownership and Copyright 2002 Springer-Verlag London Limited

An Experimental Study of Sliver Exudation

H. Edelsbrunner¹ and Damrong Guoy²

¹Department of Computer Science, Duke University, Durham, NC, USA and Raindrop Geomagic, Research Triangle Park, NC, USA; ²Center for Simulation of Advanced Rockets, Computational Science and Engineering Program, University of Illinois at Urbana-Champaign, IL, USA

Research by both authors is partially supported by NSF under grants CCR-97-12088 and DMS98-73945. Research of the first author is also supported by the NSF under grants EIA-9972879 and CCR-00-86013 and by ARO under grant DAAG55-98-1-0177.

Correspondence and offprint requests to: H. Edelsbrunner, Department of Computer Science, Duke University, Box 90129, Durham, NC, USA

Abstract. We present results on a two-step improvement of mesh quality in three-dimensional Delaunay triangulations. The first step refines the triangulation by inserting sinks and eliminates tetrahedra with large circumradius over shortest edge length ratio. The second step assigns weights to the vertices to eliminate slivers. Our experimental findings provide evidence for the practical effectiveness of sliver exudation.

Keywords. Dynamic triangulation; Mesh generation; Mesh quality; Slivers; Tetrahedra; Weighted Delaunay triangulations

1. Introduction

This paper is generally about improving the mesh quality of three-dimensional Delaunay triangulations and specifically about the practical effectiveness of sliver exudation as a method to eliminate flat tetrahedra. We implement published algorithms and study them experimentally, focusing on mesh quality.

Meshing. A mesh of a three-dimensional domain is a decomposition into simple pieces, called elements. We consider Delaunay triangulations, which decompose the domain into tetrahedral elements. One of the distinguishing properties of Delaunay versus other triangulations is that they are determined by the set of vertices sampled from the domain. Another is that they
have fast and reliable algorithms that make the construction of large and complicated meshes possible. There is an extensive literature studying three-dimensional Delaunay triangulations; see for example the recent text by Edelsbrunner [4].

Since the Delaunay triangulation is unique for a given set of vertices, the problem of constructing a good quality mesh reduces to choosing vertices for which the Delaunay tetrahedra have good quality. In a nutshell this means that all angles are in the intermediate range, avoiding small and large values. The Delaunay refinement algorithm pioneered by Ruppert [10] in two dimensions makes use of this reduction by adding vertices incrementally. Shewchuk [11] extends Ruppert's algorithm to three dimensions and reports good success by generally adding vertices at circumcenters of poor quality tetrahedra. The authors of this paper limit the choice of new vertices to sinks, which are special circumcenters [5]. The thus modified refinement method is used as the first step of the mesh improvement algorithm studied in this paper.

Slivers. The main shortcoming of the Delaunay refinement algorithm in three dimensions is its inability to remove slivers, which are rather flat tetrahedra with relatively small circumspheres. The persistent presence of slivers in large Delaunay triangulations has been observed experimentally at least as early as 1985 [1], but effective methods dealing with them have been found only recently [2,3,6,9]. This paper implements the sliver exudation algorithm of Cheng et al. and studies its effectiveness in practice. The question in focus is how large a minimum dihedral angle this method can achieve.
The positive lower bound proved in Cheng et al. [2] is conservative and exceedingly small, and this paper confirms our intuition that the lower bound that can be achieved in practice is reasonably large, which makes the sliver exudation algorithm a viable method in practice.

Our experimental results are, however, inconclusive for tetrahedra near the mesh boundary. The reason is a fundamental weakness of the sliver exudation method. Boundary treatment methods such as the one described by Li and Teng [9] will have to be added in the future to get software that guarantees good mesh quality throughout the domain.

Outline. Section 2 reviews background material on Delaunay triangulations and mesh quality. Section 3 describes weighted Delaunay triangulations and the sliver exudation algorithm. Section 4 presents our experimental results for five three-dimensional data sets. Section 5 concludes the paper.

2. Delaunay Refinement

In this section, we review the material necessary to understand the first step of our mesh improvement, the Delaunay refinement through sink insertion. We refer to the experimental study in Edelsbrunner and Guoy [5] for details.

Delaunay triangulations. Given a finite set $S$ of points in $\mathbb{R}^3$, the Delaunay triangulation consists of a subset of the tetrahedra spanned by the points. We call the circumsphere of such a tetrahedron empty if all points other than the vertices of that tetrahedron lie outside the sphere. The Delaunay triangulation consists exactly of all tetrahedra with empty circumspheres. In the general case, in which no five points lie on a common sphere, the Delaunay triangulation is unambiguous and has the face-to-face property. In degenerate cases, we construct a triangulation that is the Delaunay triangulation of an infinitesimal perturbation of the point set [7].
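The empty-circumsphere criterion can be checked directly on small inputs. The following Python sketch is illustrative only and is not the implementation used in this study: the function names `circumsphere` and `is_delaunay_tetrahedron` and the tolerance `eps` are assumptions introduced here, and the code uses plain floating-point arithmetic rather than the exact predicates and symbolic perturbation [7] that a robust implementation would need.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def circumsphere(a, b, c, d):
    """Return (center, squared radius) of the sphere through four points."""
    # The center x is equidistant from all vertices, which gives the linear
    # system 2 (p - a) . x = |p|^2 - |a|^2 for p in (b, c, d).
    rows = [[2.0 * (p[k] - a[k]) for k in range(3)] for p in (b, c, d)]
    rhs = [sum(p[k] ** 2 - a[k] ** 2 for k in range(3)) for p in (b, c, d)]
    det = det3(rows)
    if abs(det) < 1e-12:
        raise ValueError("degenerate (flat) tetrahedron")
    center = []
    for col in range(3):  # Cramer's rule, one coordinate at a time
        m = [row[:] for row in rows]
        for r in range(3):
            m[r][col] = rhs[r]
        center.append(det3(m) / det)
    radius2 = sum((center[k] - a[k]) ** 2 for k in range(3))
    return tuple(center), radius2


def is_delaunay_tetrahedron(tet, points, eps=1e-9):
    """True if no point other than the vertices lies inside the circumsphere."""
    center, radius2 = circumsphere(*tet)
    for p in points:
        if p in tet:
            continue
        dist2 = sum((p[k] - center[k]) ** 2 for k in range(3))
        if dist2 < radius2 - eps:
            return False
    return True
```

For the tetrahedron with vertices (0,0,0), (1,0,0), (0,1,0), (0,0,1), the circumcenter is (0.5, 0.5, 0.5), so adding that point to the set makes the tetrahedron non-Delaunay, while a distant point such as (2,2,2) leaves it Delaunay.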
The above definition does not mention any kind of boundary for the domain we mesh. We deal with this problem by constructing conforming Delaunay triangulations that contain a specified two-dimensional triangulation of the domain boundary as a subcomplex. This is achieved by making sure that no vertices lie inside the equator spheres of the boundary triangles. We then remove the tetrahedra outside that boundary and thus obtain a triangulation of the domain. The property of having empty equator spheres of boundary triangles is maintained throughout the mesh improvement process. Figure 1 illustrates the idea with an example in two dimensions, where the Delaunay triangulation consists of all triangles with empty circumcircles, and the boundary edges are protected by empty diameter circles.

Fig. 1. Delaunay triangulation in two dimensions with triangles outside the two boundary curves removed.

Mesh quality. We use the classification of tetrahedra introduced in [2]. It distinguishes poor quality tetrahedra with small Hausdorff distance to a line segment from those with small Hausdorff distance to a planar figure. Among the latter, we distinguish tetrahedra with small Hausdorff distance to a triangle from slivers, which have small Hausdorff distance to a quadrangle. A refinement of this classification into nine smaller classes is shown in Fig. 2. We keep in mind that the classification is fuzzy and depends for example on what exactly we mean by small Hausdorff distance.

We use two measures to quantify what we mean by the quality of a tetrahedron. The first is the ratio of the circumradius over the shortest edge length, $r/\ell$. The regular tetrahedron has the smallest possible ratio of $r/\ell = \sqrt{6}/4 = 0.612\ldots$. The first eight types of tetrahedra in Fig. 2 have large ratio, but slivers may have ratios as small as $\sqrt{2}/2 = 0.707\ldots$ or slightly smaller. The second measure is the minimum dihedral angle, $\varphi$. The regular tetrahedron maximizes the minimum dihedral angle at $\varphi = \arccos(1/3) = 70.528\ldots^\circ$.
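Both measures are easy to evaluate for a single tetrahedron. The Python sketch below is a hedged illustration, not the code used for the experiments: the helper names (`ratio_r_over_l`, `min_dihedral_deg`) are assumptions introduced here, and the circumcenter is obtained from a standard closed-form expression in floating-point arithmetic.

```python
import math
from itertools import combinations


def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))


def dot(p, q):
    return sum(a * b for a, b in zip(p, q))


def cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def circumradius(a, b, c, d):
    """Circumradius via the closed-form circumcenter relative to vertex a."""
    u, v, w = sub(b, a), sub(c, a), sub(d, a)
    vol6 = dot(u, cross(v, w))  # six times the signed volume
    if abs(vol6) < 1e-12:
        raise ValueError("degenerate (flat) tetrahedron")
    vw, wu, uv = cross(v, w), cross(w, u), cross(u, v)
    offset = tuple((dot(u, u) * vw[k] + dot(v, v) * wu[k] + dot(w, w) * uv[k])
                   / (2.0 * vol6) for k in range(3))
    return math.sqrt(dot(offset, offset))


def ratio_r_over_l(tet):
    """Circumradius divided by the shortest edge length."""
    shortest = min(math.sqrt(dot(sub(p, q), sub(p, q)))
                   for p, q in combinations(tet, 2))
    return circumradius(*tet) / shortest


def min_dihedral_deg(tet):
    """Smallest of the six dihedral angles, in degrees."""
    best = 180.0
    for i, j in combinations(range(4), 2):
        k, l = (m for m in range(4) if m not in (i, j))
        e = sub(tet[j], tet[i])
        # Normals of the two faces sharing edge (i, j); the angle between
        # them equals the dihedral angle at that edge.
        n1 = cross(e, sub(tet[k], tet[i]))
        n2 = cross(e, sub(tet[l], tet[i]))
        c = dot(n1, n2) / math.sqrt(dot(n1, n1) * dot(n2, n2))
        best = min(best, math.degrees(math.acos(max(-1.0, min(1.0, c)))))
    return best


regular = ((1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1))
# Four points close to the equator of the unit sphere form a sliver.
sliver = ((1, 0, 0.01), (0, 1, -0.01), (-1, 0, 0.01), (0, -1, -0.01))
```

For `regular`, the functions reproduce the two extreme values quoted above; for `sliver`, the ratio stays close to $\sqrt{2}/2$ while the minimum dihedral angle drops below two degrees, which is why the ratio alone cannot detect slivers.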
Spires, spears, and spindles may have a reasonably large value of $\varphi$, but the remaining six types necessarily have small dihedral angles.

Fig. 2. A classification of poor quality tetrahedra into nine classes.

Figure 3 illustrates the regions of the various types of tetrahedra in the ratio-angle plane. The region near the upper left corner contains what we call good quality tetrahedra. Tetrahedra that are not in this region are identified by large ratio, small angle, or both.

Fig. 3. Each tetrahedron is plotted as a point in the ratio-angle plane. In contrast to other low quality tetrahedra, slivers reach the left boundary of the region.

Sink insertion. The basic idea of the Delaunay refinement algorithm is to identify a tetrahedron with large ratio and add its circumcenter to the vertex set. The empty sphere criterion implies that this tetrahedron is removed as part of the vertex insertion. As suggested in Edelsbrunner and Guoy [5], we modify the general strategy slightly and limit the added vertices to sinks, which are circumcenters that are contained inside their own Delaunay tetrahedra. Figure 4 illustrates the definition by showing the sinks in a two-dimensional Delaunay triangulation. The effect of Delaunay refinement by sink insertion can be seen by comparing Fig. 4 before with Fig. 1 after refinement through iterative sink insertion.

Near the boundary, the refinement strategy has to be modified to preserve the protecting equator spheres of boundary triangles. There are different options and we decide to keep the boundary untouched by prohibiting new vertices inside the equator spheres.

3. Sliver Exudation

The Delaunay refinement algorithm removes all poor quality tetrahedra, except slivers. The second step of our mesh improvement method removes slivers by sliver exudation as described in Cheng et al. [2]. This section provides the necessary background, most importantly the generalization of Delaunay to weighted Delaunay triangulations.

Weighted
Delaunay triangulations. The generalization replaces the Euclidean distance by more general distance functions. We assign to each vertex $u$ a weight $U^2 \in \mathbb{R}$ and define the weighted square distance between $(u, U)$ and $(z, Z)$ as $\|u - z\|^2 - (U^2 + Z^2)$. For zero weights, the weighted square distance is the square of the Euclidean distance. Note that the weighted square distance does not change if we increase the weight of one vertex and decrease the weight of the other by the same amount.

Fig. 4. In two dimensions, the (white) sinks are circumcenters of (dark) non-obtuse Delaunay triangles.

A crucial idea is the interpretation of $(u, U)$ as the sphere with center $u$ and radius $U$. Two spheres are orthogonal if their weighted square distance is zero. Observe that two orthogonal spheres intersect in a circle and have perpendicular tangent planes along this circle. If one of the spheres is only a point then it lies on the other sphere. It follows that the circumsphere of a tetrahedron is the unique sphere orthogonal to the four vertices. If we assign weights to the vertices then we still have a unique orthogonal sphere, known as the orthosphere of the tetrahedron. Figure 5 illustrates this idea in two dimensions. When all points have zero weight, their orthosphere is their circumsphere. After some points gain weights, the orthosphere changes in the manner to preserve orthogonality.

Given a set of spheres, we call the orthosphere of four of them empty if all other spheres have positive weighted square distance from it. The weighted Delaunay triangulation of the set consists of all tetrahedra spanned by the centers that have empty orthospheres. For the special case of zero weights, this is the same as the unweighted Delaunay triangulation. Similar to the unweighted case, the weighted Delaunay triangulation is unambiguous if the spheres are in general position, which includes that no five spheres have a common orthosphere. Figure 6 shows two weighted Delaunay triangulations of eight points on a
cube. Assigning different weights can cause different triangulations.

Fig. 5. Orthospheres are generalizations of circumspheres.

Fig. 6. Two weighted Delaunay triangulations of a cube.

Weight pumping. Consider a vertex $u$ in a weighted Delaunay triangulation. Its star consists of all tetrahedra that contain $u$ as a vertex. By construction, these tetrahedra all have empty orthospheres. If we continuously increase the weight of $u$, as shown in Fig. 7, these orthospheres are pushed away from $u$, invading the space outside the original orthospheres. At discrete moments in time, the weighted square distance between an invading orthosphere and a sphere of a vertex outside the star vanishes. We say the weight of $u$ at that time is critical. To prevent the weighted square distance from becoming negative, we locally change the triangulation by a flip. A flip operation in 3D replaces two tetrahedra by three tetrahedra that occupy the same space, or vice versa. We refer to Edelsbrunner and Shah [8] for details on maintaining a weighted Delaunay triangulation by flipping. Figure 8 illustrates the two basic flips.

Fig. 7. The dotted orthocircles of the triangles in the star of $u$ are pushed away from $u$ as its weight increases.

Fig. 8. Two kinds of flips. (a) The two-to-three flip replaces the two tetrahedra sharing the link triangle of $u$ by the three tetrahedra sharing a star edge of $u$. (b) The three-to-two flip replaces the three tetrahedra, one of which is a sliver, around the link edge of $u$ by the two tetrahedra sharing a star triangle of $u$.

We may compute the critical weights of $u$ in increasing order by breadth-first search using a priority queue. At any moment during the process, the prestar of $u$ consists of all tetrahedra in the initial weighted Delaunay triangulation whose orthospheres have negative weighted square distances to $u$. Between critical weights, the new triangulation is obtained by substituting the current star for the prestar of $u$. We call the above process pumping
$u$. Figure 9 demonstrates this idea schematically.

Fig. 9. Between critical weights, the new triangulation is obtained by substituting the current star for the prestar.

Let $d_u$ be the minimum Euclidean distance between $u$ and any other vertex in the triangulation. The sliver exudation algorithm pumps $u$ to a weight that maximizes the minimum dihedral angle of any tetrahedron in the star, but it does not expand the radius beyond $0.45 \cdot d_u$. Here 0.45 is an arbitrary positive constant less than one half. The reason for that restriction is that overlapping spheres may cause some vertices to be deleted from the weighted Delaunay triangulation. With the mentioned stopping criterion, all spheres are disjoint and all points are vertices of the weighted Delaunay triangulation.

The main result of Cheng et al. [2] is a proof that if we pump all vertices as described then all dihedral angles are larger than some constant $\varepsilon > 0$ that is independent of the set of input spheres. In the proof, that constant is positive but miserably small. As our experiments show, the constant that can be achieved in practice is much larger and possibly around 5°.

Exudation algorithm. The algorithm pumps every vertex in the triangulation. There is no restriction on the scheduling sequence, and pumping each vertex once is enough. Let $S$ be the vertex set of the initial (unweighted) Delaunay triangulation.

    void SliverExudation(S)
      foreach vertex u ∈ S do
        U² = Pump(u);
        substitute (u, U) for u and star for prestar
      endfor.

The optimal weight $U^2$ for a vertex $u$ is computed as explained above. Keep in mind that $U^2$ is optimal only under fairly limiting conditions, namely that all other vertices have the fixed weight they happen to have at the moment and $U$ is less than $0.45 \cdot d_u$. To formally describe the pumping process, we write $U_0^2 = 0$ for the initial weight and $U_1^2 < U_2^2 < \ldots$ for the critical weights of $u$. We let $\varphi_i$ be the minimum dihedral angle of all tetrahedra in the star of $u$ at the weight $U_i^2$. The optimal weight $U_j^2$ is
the one that maximizes the angle $\varphi_i$.

    float Pump(u)
      i = 0; U_0² = 0; j = 0;
      compute φ_0;
      loop
        i = i + 1;
        compute next critical weight U_i²;
        if U_i > 0.45 · d_u then exit endif;
        expand the star of u and compute φ_i;
        if φ_i > φ_j then j = i endif
      forever;
      return U_j²

We note that for efficiency reasons, the star in Function Pump is computed incrementally. The prestar is explicitly constructed and the Delaunay triangulation is updated only once per vertex in Function SliverExudation. In order to maintain the data structure dynamically, we use the prestar to delete and the star to create records for tetrahedra.

The sliver exudation algorithm is modified for vertices near the domain boundary. Specifically, we limit the weight of every interior vertex so it cannot assume a negative weighted square distance to the equator sphere protecting any boundary triangle. Similarly, we limit the weight of a boundary vertex so it cannot assume a negative weighted square distance to the equator sphere protecting any non-incident boundary triangle. We check this condition in the prestar of the vertex when it is pumped.

Let $n = \mathrm{card}\, S$ be the number of vertices. The number of iterations in Function SliverExudation is $n$. Assuming a constant upper bound on the ratios $r/\ell$, Cheng et al. [2] prove that the star of every vertex has constant size. Function Pump thus takes only constant time per vertex, which adds up to a total time of $O(n)$ for sliver exudation.

4. Experimental Results

This section presents experimental results for five three-dimensional data sets. All experiments are done on a Pentium II 450 MHz CPU with 128 MB of memory. For each data set, we evaluate the mesh quality of the Delaunay triangulation initially (I), after refinement (R), and after sliver exudation (X).
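The evaluation at each stage amounts to summarizing the minimum dihedral angles of all tetrahedra. The Python sketch below shows one way such a summary could be computed; it is an assumption-laden illustration rather than the evaluation code of this study. The names `angle_histogram` and `sliver_fraction` are hypothetical, the angle ranges are those used in the distribution tables of this section, and the 5° cut-off is the sliver threshold fixed below.

```python
from bisect import bisect_right

# Upper edges of the first four angle ranges, in degrees; the last range
# collects everything from 40 degrees up to the maximum of about 70.5.
ANGLE_EDGES = (5.0, 10.0, 20.0, 40.0)
ANGLE_LABELS = ("0-5", "5-10", "10-20", "20-40", "40-71")
SLIVER_THRESHOLD = 5.0  # a tetrahedron with a smaller angle counts as a sliver


def angle_histogram(min_angles):
    """Count tetrahedra per range of minimum dihedral angle."""
    counts = [0] * len(ANGLE_LABELS)
    for angle in min_angles:
        # An angle equal to an edge value falls into the higher range,
        # so exactly 5 degrees is not counted as a sliver.
        counts[bisect_right(ANGLE_EDGES, angle)] += 1
    return dict(zip(ANGLE_LABELS, counts))


def sliver_fraction(min_angles):
    """Fraction of tetrahedra below the sliver threshold (non-empty input)."""
    angles = list(min_angles)
    return sum(a < SLIVER_THRESHOLD for a in angles) / len(angles)


# Made-up angles for five tetrahedra, purely for illustration.
example = [0.3, 4.9, 5.0, 12.0, 70.5]
```

With the made-up `example` list, two of the five tetrahedra fall into the lowest range and count as slivers.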
Each data set starts out with a boundary triangulation (B), so the initial Delaunay triangulation has no interior vertices. Our algorithm behaves differently in the interior and near the boundary. To differentiate, we call a tetrahedron next to the boundary if at least one of its vertices lies on the boundary, and interior, otherwise. To simplify the discussion, we fix a threshold and call a tetrahedron a sliver if its minimum dihedral angle is $\varphi < 5^\circ$.

A smooth surface of Permutahedron. Our first example is a triangulated skin surface obtained from 24 spheres, which are connected by blending hyperboloids and inverse sphere patches. The sphere centers are the vertices of a convex polytope and represent the permutations of four objects, hence the name. The first row in Fig. 10 shows the entire mesh and the second an enlarged portion. The first column shows the boundary mesh while the other three columns show aspects of the volume mesh at different stages of the algorithm. The corresponding statistics are presented in Table 1.

The initial Delaunay triangulation consists of more than 76,000 tetrahedra, of which about 38% are slivers and are therefore visible in the second column of Fig. 10. In the sphere regions the slivers tend to be small and parallel to the boundary, while in hyperboloid regions they form stacks of large slivers roughly normal to the axis.

The refinement takes about 53 minutes and decreases the ratio below 1.0 except for 384 tetrahedra in the interior and more than 5,000 next to the boundary. The total number of vertices and tetrahedra goes up by more than a factor of three, which is reasonable because the initial mesh has no vertex in the interior. By construction, all new vertices are in the interior. At the same time, the refinement step eliminates the majority of the slivers, namely the ones with large circumspheres, but 516 slivers with small circumspheres remain.

Sliver exudation takes about 9 minutes and decreases the number of tetrahedra by 2.7% as it replaces flat tetrahedra. The ratio
distribution worsens only slightly. The average minimum dihedral angle improves slightly, but most significantly, all dihedral angles move above the five degree threshold, except for one. That one sliver has $\varphi > 3^\circ$ and lies next to the boundary. We guess that it survives the exudation process because of the limited freedom in assigning weights near the domain boundary.

Fig. 10. Permutahedron data. From left to right, surface mesh (B), initial Delaunay mesh (I), Delaunay mesh after refinement (R), and weighted Delaunay mesh after exudation (X). Volume meshes are displayed by showing transparent boundary together with opaque slivers. No other tetrahedra are shown.

Table 1. Permutahedron data. Evaluation of boundary surface mesh (B), initial Delaunay mesh (I), the mesh after refinement by sink insertion (R), and the mesh after sliver exudation (X). Distribution tables distinguish tetrahedra in the interior (int) from the ones next to the boundary (bd).

                                              $r/\ell$               $\varphi$ (degrees)
        #vert    #tri     #tet      run time  min   avg   max       min     avg     max
    B   14,287   28,622   –         –         0.58  0.72  2.02      –       –       –
    I   14,287   –        76,329    –         0.71  4.65  15.74     0.002   9.68    58.73
    R   43,569   –        230,119   53 min    0.62  0.83  2.14      0.282   45.29   69.90
    X   43,569   –        223,936   9 min     0.62  0.83  2.14      3.376   46.49   69.90

            $r/\ell$ distribution                        $\varphi$ distribution (degrees)
            0–1       1–2      2–3     3–10     ≥10     0–5      5–10     10–20    20–40    40–71
    I,int   0         0        0       0        0       0        0        0        0        0
    I,bd    838       7,570    8,869   58,299   753     28,726   20,009   17,238   9,704    652
    R,int   143,984   384      0       0        0       268      827      3,775    31,892   107,606
    R,bd    80,372    5,374    5       0        0       248      768      2,230    20,795   61,710
    X,int   139,415   887      0       0        0       0        5        934      29,896   109,467
    X,bd    78,162    5,466    6       0        0       1        16       710      19,992   62,915

A surface with sharp corners of Wheel. The second example is one twelfth of a wheel. It has mechanical shape with sharp corners and sharp edges in the boundary. Figure 11 shows the entire shape in the first row and an enlarged portion in the second row. The corresponding statistics are presented in Table 2.

The initial mesh contains more than 34,000 tetrahedra with 2.4% slivers, which can be seen in the second column of Fig. 11. The refinement step leaves a few tetrahedra with ratios that exceed the threshold of
1.0. In about 16 minutes, it almost doubles the number of vertices and almost triples the number of tetrahedra. The size inflation is less dramatic for the Wheel than for the Permutahedron data because the domain is fairly thin and requires only a small number of interior vertices.

The refinement step also reduces the number of slivers but not by the impressive rate we have observed earlier. The exudation step takes only two minutes but leaves three slivers next to the boundary. As in the first example, it reduces the number of tetrahedra, here by 2.4%.

Fig. 11. Wheel data. The display conventions are the same as in Figure 10.

Table 2. Evaluation of the Wheel data.

                                             $r/\ell$               $\varphi$ (degrees)
        #vert    #tri     #tet     run time  min   avg   max       min     avg     max
    B   11,525   23,046   –        –         0.58  0.67  1.78      –       –       –
    I   11,525   –        34,892   –         0.65  1.94  6.57      0.000   34.16   63.92
    R   20,688   –        93,186   16 min    0.61  0.82  2.34      0.000   46.02   70.28
    X   20,688   –        90,969   2 min     0.61  0.82  2.34      0.336   47.10   70.28

            $r/\ell$ distribution                       $\varphi$ distribution (degrees)
            0–1      1–2      2–3     3–10    ≥10      0–5    5–10    10–20   20–40    40–71
    I,int   0        0        0       0       0        0      0       0       0        0
    I,bd    2,402    21,255   7,922   3,313   0        846    2,462   8,078   5,681    17,825
    R,int   32,479   24       0       0       0        65     182     876     7,009    24,371
    R,bd    59,709   841      133     0       0        188    327     1,308   11,984   46,876
    X,int   31,384   109      0       0       0        0      0       227     6,540    24,726
    X,bd    58,345   999      132     0       0        3      74      231     11,463   47,850

Two boundary triangulations: ToothA and ToothB. In the third example, we study the effect of the boundary triangulation on the mesh improvement algorithm. The domain is a human tooth for which we compute the mesh starting with two different boundary triangulations. The two models are shown in Fig. 12, and the statistics are presented in Table 3.

The boundary triangulation is better for ToothB than for ToothA. That difference has apparently no influence on the running time of the algorithm, but it has a significant influence on the mesh quality that refinement and exudation achieve. The difference in mesh quality is most striking after the exudation step: we observe 142 slivers with $\varphi = 0.17^\circ$ in ToothA compared to no sliver and $\varphi = 8.924^\circ$ in ToothB. It is telling that all 142 slivers lie next to the boundary, which is strong evidence that our insistence on maintaining the boundary is the main reason for the slivers in the final mesh.

Fig. 12. The ToothA model in the upper and the ToothB model in the lower row.

Table 3. Evaluation of the ToothA data in the upper and of the ToothB data in the lower table.

                                              $r/\ell$               $\varphi$ (degrees)
        #vert    #tri     #tet      run time  min   avg   max       min     avg     max
    B   13,453   26,092   –         –         0.58  0.82  3.40      –       –       –
    I   13,453   –        47,286    –         0.74  7.88  37.02     0.007   20.19   60.12
    R   52,325   –        291,122   1 h       0.61  0.84  3.60      0.047   44.40   70.48
    X   52,325   –        282,633   15 min    0.61  0.84  3.60      0.170   45.69   70.48

                                              $r/\ell$               $\varphi$ (degrees)
        #vert    #tri     #tet      run time  min   avg   max       min     avg     max
    B   13,453   26,092   –         –         0.58  0.73  1.55      –       –       –
    I   13,453   –        44,438    –         0.79  7.84  27.83     0.021   22.97   59.99
    R   51,274   –        281,978   1 h       0.62  0.82  1.89      0.159   45.49   69.99
    X   51,274   –        274,332   15 min    0.62  0.82  1.89      8.924   46.71   69.99

More examples: Head and Hog. We present experimental results for two additional data sets. The results are similar to what we have seen above, so we can be brief.

The Head data is displayed in Fig. 13 and the statistics are provided in Table 4. Both the input boundary triangulation and the output weighted Delaunay triangulation are the largest of all our examples, which explains the rather long running time.

Fig. 13. The Head data models the solid propellant inside the rocket booster of the Space Shuttle.

Table 4. Evaluation of the Head data.

                                               $r/\ell$               $\varphi$ (degrees)
        #vert     #tri     #tet      run time  min   avg   max       min     avg     max
    B   33,970    67,940   –         –         0.58  1.21  1.60      –       –       –
    I   33,970    –        100,452   –         0.67  4.10  8.67      0.170   20.54   61.21
    R   110,919   –        610,463   5 h       0.62  0.93  2.39      0.034   41.33   69.88
    X   110,919   –        593,098   1.5 h     0.62  0.93  2.39      3.014   42.48   69.88

            $r/\ell$ distribution                         $\varphi$ distribution (degrees)
            0–1       1–2       2–3      3–10     ≥10    0–5      5–10     10–20    20–40     40–71
    I,int   0         0         0        0        0      0        0        0        0         0
    I,bd    482       20,880    19,382   59,708   0      10,119   13,378   20,230   50,535    6,190
    R,int   370,547   8,008     7        0        0      750      2,450    11,243   89,281    274,838
    R,bd    100,766   131,118   17       0        0      1,500    4,345    21,055   118,286   86,715
    X,int   358,948   9,300     5        0        0      1        81       4,095    84,396    279,680
    X,bd    98,389    126,434   22       0        0      38       1,182    16,519   119,244   87,862

The Hog data is displayed in Fig. 14 and the statistics are provided in Table 5. The large volume of
the animal requires a large number of interior vertices, and we observe ratios of final over initial size that exceed the ratios in the other data sets.

Fig. 14. The first two pictures show the surface of the Hog data smoothly rendered and triangulated.

Table 5. Evaluation of the Hog data.

                                              $r/\ell$                $\varphi$ (degrees)
        #vert    #tri     #tet      run time  min   avg   max        min     avg     max
    B   13,474   26,948   –         –         0.58  0.76  2.89       –       –       –
    I   13,474   –        46,491    –         0.71  7.84  297.15     0.000   18.75   59.88
    R   56,982   –        317,565   2 h       0.62  0.84  3.50       0.322   44.98   70.20
    X   56,982   –        308,801   20 min    0.62  0.84  3.50       2.529   46.22   70.20

            $r/\ell$ distribution                        $\varphi$ distribution (degrees)
            0–1       1–2      2–3     3–10     ≥10      0–5      5–10    10–20   20–40    40–71
    I,int   0         0        0       0        0        0        0       0       0        0
    I,bd    433       6,030    5,860   26,455   7,713    10,076   9,391   8,928   11,888   6,208
    R,int   231,103   1,783    4       0        0        406      1,347   6,225   51,811   173,101
    R,bd    71,376    13,247   50      2        0        327      740     2,867   24,392   56,349
    X,int   223,660   2,614    4       0        0        0        10      1,578   48,247   176,443
    X,bd    69,493    12,980   48      2        0        6        73      1,226   23,754   57,464

5. Discussion

The computational experiments presented in this paper provide evidence for the practical viability of sliver exudation as a method to remove slivers from three-dimensional Delaunay triangulations. Generally, the method succeeds in increasing all dihedral angles above 5°.

Our results are not as crisp as one would hope, and the main and perhaps only reason for the shortcoming is the lack of an effective method for improving the boundary mesh. This is not a weakness of our experimental set-up but rather a fundamental limitation of the sliver exudation algorithm as described in Cheng et al. [2]. Our positive results in the interior warrant additional efforts to rethink the way we deal with domain boundaries. Ideally, we would like to integrate the improvement of the boundary triangulation into the mesh improvement algorithm.

We note that the measure of 'sliverness' used in this paper is different from that in Cheng et al. [2].
We use the minimum dihedral angle, while the original exudation paper uses the ratio $v/\ell^3$ of volume over the cube of the shortest edge length. Both measures approach zero as the tetrahedron gets flat, but they are quite different at the other extreme. The angle $\varphi$ assumes its maximum 70.528° for the regular tetrahedron while the ratio $v/\ell^3$ goes to infinity for skinny tetrahedra with short edges, like spires, spears, and splinters. The biggest advantage of the minimum dihedral angle $\varphi$ is that it is intuitive and makes our statistical results easier to comprehend.

Acknowledgement

The second author thanks Timothy J. Baker for the Hog data, Raindrop Geomagic for the Tooth data, and Holun Cheng for his skin software that generates the surface mesh of Permutahedron.

References

1. Cavendish, J.C., Field, D.A., Frey, W.H. (1985) An approach to automatic three-dimensional finite element mesh generation. Internat. J. Numer. Methods Engrg. 21, 329–347
2. Cheng, S.-W., Dey, T.K., Edelsbrunner, H., Facello, M.A., Teng, S.-H. (2000) Sliver exudation. J. ACM 47, 883–904
3. Chew, L.P. (1997) Guaranteed-quality Delaunay meshing in 3D. Proceedings 13th Annual Symposium on Computational Geometry, 391–393
4. Edelsbrunner, H. (2001) Geometry and Topology for Mesh Generation. Cambridge University Press
5. Edelsbrunner, H., Guoy, D. (2002) Sink insertion for mesh improvement. Internat. J. Found. Comput. Sci. 13, 223–242
6. Edelsbrunner, H., Li, X.-Y., Miller, G.L., Stathopoulos, A., Talmor, D., Teng, S.-H., Üngör, A., Walkington, N. (2000) Smoothing and cleaning up slivers. Proceedings 32nd Annual ACM Symposium on the Theory of Computing, 273–277
7. Edelsbrunner, H., Mücke, E.P. (1990) Simulation of Simplicity: a technique to cope with degenerate cases in geometric algorithms. ACM Trans. Graphics 9, 66–104
8. Edelsbrunner, H., Shah, N.R. (1996) Incremental topological flipping works for regular triangulations. Algorithmica 15, 223–241
9. Li, X.-Y., Teng, S.-H. (2001) Generating well-shaped Delaunay meshes in 3D. Proceedings 12th Annual ACM-SIAM Symposium on Discrete Algorithms, 28–37
10. Ruppert, J. (1995) A Delaunay refinement algorithm for
quality 2-dimensional mesh generation. J. Algorithms 18, 548–585
11. Shewchuk, J. (1998) Tetrahedral mesh generation by Delaunay refinement. Proceedings 14th Annual Symposium on Computational Geometry, 86–95
A Comparative Study of Network Performance between ContikiMAC and XMAC Protocols in Data Collection Application with ContikiRPL (IJCNIS-V11-N8-4)

I. J. Computer Network and Information Security, 2019, 8, 32-37
Published Online August 2019 in MECS
DOI: 10.5815/ijcnis.2019.08.04

A Comparative Study of Network Performance between ContikiMAC and XMAC Protocols in Data Collection Application with ContikiRPL

Vu Chien Thang
Faculty of Electronics and Communications Technology, University of Information and Communications Technology, Thai Nguyen, 250000, Viet Nam
E-mail: vcthang@.vn

Received: 03 May 2019; Accepted: 18 July 2019; Published: 08 August 2019

Abstract: This paper presents results evaluating the performance of the ContikiMAC and XMAC protocols in a data collection application with the RPL routing protocol. Simulation results show that ContikiMAC is more efficient than XMAC in both successful data delivery ratio and average energy consumption in the network. ContikiMAC also performs well in high-density networks, whereas the successful data delivery ratio of XMAC drops significantly as the network density increases. These simulation results provide a basis for further development of wireless sensor network applications.

Index Terms: ContikiMAC protocol, XMAC protocol, wireless sensor network, energy-efficient MAC protocol, network performance evaluation.

I. INTRODUCTION

A Wireless Sensor Network (WSN) is an infrastructure of sensing (measuring), processing, and communicating components that gives administrators the ability to measure, observe, and react to events and phenomena in a defined environment. Typical applications of wireless sensor networks include data collection, military, monitoring, and medicine. Wireless sensor nodes usually run on a limited power supply (typically a battery) and must operate for a long time (from several months to several years). Most wireless sensor nodes are equipped with low-power radio transceivers.
These radio transceivers are among the most energy-consuming components. To conserve energy, radio transceivers need to be turned off. When radio transceivers are turned off, they cannot listen to transmissions from other nodes.

A radio that does not listen constrains the network topology that can be built. If nodes simply keep their radios off, only the star topology is appropriate. In a star topology, the central node (sink node) has its radio turned on all the time and is supplied with external power. All other nodes are powered by batteries and keep their radio transceivers turned off to save energy, turning them on only when sending data. The only node they can transmit to is the central node, because all other nodes have their radio transceivers turned off. For some small-scale applications, the star topology is adequate.

In order to expand the range of the network, nodes must be able to communicate with each other. The topology can then provide redundant paths through the network, which increases reliability. If a node runs out of power, the network can reroute traffic around the faulty node. This is the mesh topology. To form a mesh topology, the radio transceivers of the nodes must be turned off when they are not in use but turned on when a neighboring node wants to communicate. Therefore, a common protocol is needed so that nodes can communicate with each other.

In this paper, the ContikiMAC and XMAC protocols are studied and evaluated in a data collection application with the RPL routing protocol. The evaluation metrics are the data delivery ratio, average power consumption, average number of parent node changes, and average hop count in the network. The performance of the network is simulated and evaluated as the network density changes.
These simulation results are useful for developing applications with different network densities, such as smart water [1], smart grid, smart agriculture, and smart environment.

II. RELATED WORKS

A number of protocols have been proposed for wireless sensor networks [2, 3]. Initial studies showed significant energy savings compared with keeping the radio transceiver turned on. Some protocols, such as S-MAC [4], reduced the average radio on-time from 100% to 35%. The WiseMAC protocol [5] reduced it further, to 20%.

Fig. 1. The radio transceivers are turned on/off periodically

One of the simplest energy-saving protocols is the LPL protocol [2]. This protocol achieves low-power operation by keeping the radio off most of the time and periodically turning it on for a short period. These short listening periods are what allow the sensor node to receive transmissions from neighboring nodes. This process is illustrated in Figure 1.

To send a packet to a node, the sender first sends a train of short packets called strobes. When the receiver hears a strobe, it keeps its radio transceiver on to wait for the data packet. The strobe train must be long enough for every neighboring node to listen at least once within the period. This is shown in Figure 2.

Fig. 2. Operation of LPL protocol

Analysis shows, however, that the LPL protocol has some disadvantages. First, the strobes wake up every node, not only the one that is to receive the final packet. This wastes energy at all other neighboring nodes, because they have to turn on their radios to receive packets that are not addressed to them. Second, transmitting each packet takes a long time: if the receiving node keeps its radio off for one second between wake-ups, the strobe train must last a full second. This also costs the sending node energy.

Fig. 3. Operation of XMAC protocol [6]

The XMAC protocol was proposed in [6]. XMAC achieves higher energy efficiency than the LPL protocol. Before sending a data packet, the sender sends a train of short preambles, each carrying the address of the destination node. When a neighboring node receives a short preamble, it checks the destination address. If the destination address matches its own, the node sends an ACK confirmation message to the sender and keeps its radio on to wait for the data packet. After receiving the ACK message, the sender sends the data packet to the receiver. If the destination address does not match, the receiving node turns off its radio. The XMAC protocol is therefore more efficient than the LPL protocol: the waiting time before the data packet is received is shortened, and nodes that are not the destination quickly return to sleep mode to save energy. Figure 3 illustrates the operation of the XMAC protocol.

The ContikiMAC protocol is proposed in [7]. Figure 4 depicts the operation of this protocol.

Fig. 4. Operation of ContikiMAC protocol [7]

To send a data packet, the sender repeatedly sends the same packet until a confirmation message is received. The nodes in the network keep their radios off most of the time and periodically turn them on to check the transmission channel. If a data packet is detected on the channel, the receiver keeps its radio on to receive it. The receiving node then checks the data packet; if the packet is addressed to it, the node confirms reception to the sender with an ACK message.
As such, the ContikiMAC protocol is designed to be simple and easy to implement, and it needs neither signaling messages nor additional headers.

III. CONNECTIVITY MODEL OF WIRELESS SENSOR NETWORK

This paper focuses on evaluating the XMAC and ContikiMAC protocols in a data collection application with the RPL protocol [8]. RPL is designed for Low-Power and Lossy Networks (LLNs), in which resource-constrained nodes are interconnected by lossy links (links that lose packets). RPL is a distance vector protocol. It builds a topology consisting of one or more Destination Oriented Directed Acyclic Graphs (DODAGs). Routes are constructed from the nodes in the network to one of the root nodes of the DODAG [9]. Figure 5 shows the implementation of the RPL protocol in the uIPv6 communication stack of the Contiki operating system [11]. The uIPv6 stack calls the ContikiRPL module when it receives ICMPv6 messages and when it discovers neighbors. The ContikiRPL module calls the uIPv6 stack to install routes in the IPv6 routing tables.

Figure 6 shows the network topology model considered in this paper. The network is divided into many small clusters. Since the clusters are identical, only one cluster is simulated and evaluated.

Fig. 5. Implementation of the RPL protocol in ContikiOS [10]

Fig. 6. The network topology is divided into many different small clusters

The assumptions made for the simulation are as follows. Nodes are heterogeneous and of two types, sensor nodes and root nodes. The sensor nodes read data and send it to the root nodes via other intermediate sensor nodes. Root nodes collect data and send it directly to a gateway. During the entire operation of the network, nodes transmit at a constant power level. No data aggregation is performed in the network.
All data collected by the root nodes are sent to the gateway. The sensor nodes are fixed, so the network is considered static.

The connectivity of a wireless sensor network is described by a graph G = (V, E), where V (vertices) is the set of sensor nodes and E (edges) describes the adjacency relation between nodes. That is, for two devices u, v ∈ V, (u, v) ∈ E if v is adjacent to u. In an undirected graph, if (u, v) ∈ E, then also (v, u) ∈ E; that is, edges can be represented by sets {u, v} ∈ E rather than tuples. The classic connectivity model is the so-called unit disk graph (UDG). The UDG model is idealistic: in reality, radios are not omnidirectional, and even small obstacles such as plants can change connectivity.

In wireless networks, the communication medium is shared and transmissions are exposed to interference. Concretely, a node u may not be able to correctly receive a message from an adjacent node v because there is a concurrent transmission nearby. An interference model describes how concurrent transmissions block each other.

Fig. 7. The UDI model [12]

In this paper, the UDI model is used for simulation [12]. Nodes are situated arbitrarily in the plane. Two nodes can communicate directly if and only if their Euclidean distance is at most 1 and the receiver is not disturbed by a third node whose Euclidean distance to it is at most a constant R ≥ 1.

Figure 7 describes the UDI model considered in this paper. The UDI model has two radii: a transmission radius (length 1) and an interference radius (length R ≥ 1). In this figure, node v is not able to receive a transmission from node u if node x concurrently transmits data to node w, even though v is not adjacent to x.

IV. PERFORMANCE EVALUATION OF CONTIKIMAC AND XMAC PROTOCOLS IN DATA COLLECTION APPLICATION WITH CONTIKIRPL

A. Evaluation Metrics

The performance of the ContikiMAC and XMAC protocols is evaluated and compared using the following metrics.

1. Data delivery ratio: The first metric is the data delivery ratio (DDR). We define DDR as the ratio between the number of data packets received at the root and the total number of data packets sent by all nodes in the whole network.

\[ DDR(\%) = \frac{N_{received}}{N_{data}} \times 100\% \tag{1} \]

In (1), $N_{received}$ is the number of data packets received at the root and $N_{data}$ is the number of data packets sent by all nodes in the network. The higher the DDR, the better the communication efficiency of the network. A DDR of 100% indicates that the network delivers all the data to the root node.

2. Average energy consumption: In this paper, the ContikiMAC and XMAC protocols are evaluated by simulation. The Tmote Sky hardware platform, as emulated in the Cooja simulation tool, was used [13]. To estimate the energy consumption of the Tmote Sky platform, software-based online energy estimation was used. The total energy consumption of a node is defined as [14]:

\[ E_{consumption} = U \left( I_a t_a + I_l t_l + I_t t_t + I_r t_r + \sum_i I_{c_i} t_{c_i} \right) \tag{2} \]

where $U$ is the supply voltage, $I_a$ is the current drawn by the microcontroller when running, $t_a$ is the time for which the microcontroller has been running, $I_l$ and $t_l$ are the current and time of the microcontroller in low-power mode, $I_t$ and $t_t$ are the current and time of the communication device in transmit mode, $I_r$ and $t_r$ are the current and time of the communication device in receive mode, and $I_{c_i}$ and $t_{c_i}$ are the current and time of other components such as sensors and LEDs.

Table 1. Energy Model of Tmote Sky.

Table 1 shows the energy model of Tmote Sky, where the consumption currents are taken from chip manufacturer data sheets [15]. In this energy model, the author considers only the main energy consumers, namely the radio transceiver and the microcontroller; other small contributions are ignored.

3. Average number of times to change parent node: The average number of parent node changes is determined from statistics of the number of parent node changes at each node. Wireless sensor networks operate on lossy radio links, whose quality is often unstable and changes frequently over time. The network topology must therefore also change in order to adapt to the radio communication environment. To evaluate this adaptive change, the author relies on statistics of the average number of parent node changes in the whole network.

4. Average hop count in the network: The hop count refers to the number of intermediate nodes through which data must pass between source and destination. Hop count is a rough measure of the distance between two nodes.

B. The Scenario of Evaluation

With the assumptions set out in Section III of this paper, a cluster model consists of sensor nodes randomly distributed in a grid area of 100 m × 100 m. Nodes periodically send data packets to the root node located at the center of the cluster. Figure 8 illustrates a cluster model of 35 nodes, in which the root node is No. 35.

The parameters used in the simulation are summarized in Table 2. The radio communication model used in the simulation is the UDI model, with an effective transmission range of 30 meters and an interference range of 50 meters. The network layer protocol used in the simulation scenario is the RPL protocol. The MAC layer protocol is configured as ContikiMAC and XMAC, respectively. Under normal operating conditions, each sensor node sends data packets to the root node at a randomized rate of 1 packet per minute.

Fig. 8. A simulated cluster model of 35 nodes

Table 2. Evaluating Simulation Scenario

C. Results of Simulation

Post-simulation data is extracted, analyzed, and graphed to make the comparison.
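As a minimal sketch of how the post-simulation metrics of Eqs. (1) and (2) could be computed from extracted data, the Python snippet below takes packet counts and per-state currents and on-times for a single node. The function names, the 3 V supply voltage, and all current and time values are illustrative assumptions; they are not the Tmote Sky figures of Table 1 and not outputs of the Cooja simulations.

```python
def data_delivery_ratio(n_received, n_data):
    """Eq. (1): data delivery ratio in percent."""
    if n_data <= 0:
        raise ValueError("n_data must be positive")
    return 100.0 * n_received / n_data


def energy_consumption(u, states, others=()):
    """Eq. (2): E = U * (I_a t_a + I_l t_l + I_t t_t + I_r t_r + sum_i I_ci t_ci).

    u      -- supply voltage in volts
    states -- (current in A, time in s) pairs for the four main states:
              MCU active, MCU low-power, radio transmit, radio receive
    others -- optional (current in A, time in s) pairs for other components
    Returns the energy in joules.
    """
    charge = sum(i * t for i, t in states) + sum(i * t for i, t in others)
    return u * charge


# Hypothetical inputs, chosen only to exercise the two formulas.
ddr = data_delivery_ratio(n_received=923, n_data=1000)
energy = energy_consumption(
    u=3.0,
    states=[
        (2e-3, 10.0),   # MCU active
        (5e-6, 590.0),  # MCU low-power mode
        (20e-3, 1.0),   # radio transmit
        (20e-3, 4.0),   # radio receive
    ],
)
print(f"DDR = {ddr:.1f} %")
print(f"E = {energy:.5f} J")
```

Averaging the per-node energy over all nodes in the cluster would give the network-wide average used in the comparison.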
Figures 9, 10, 11 and 12 compare the ContikiMAC and XMAC protocols in terms of data delivery ratio, average power consumption, average number of parent node changes, and average hop count in the network, respectively.

Fig. 9. Comparison in terms of data delivery ratio

The simulation results in Figure 9 show that the network operated by the ContikiMAC protocol achieves a higher data delivery ratio than the network operated by the XMAC protocol. As the density of nodes in the network increases, the XMAC protocol shows a significant decline in the data delivery ratio (from 92.3% to 78.3%). The XMAC protocol uses a train of short preambles to synchronize the transmission time between the sender and the receiver. Therefore, when the node density increases, the number of short preambles sent and received in the network also increases, which causes collisions and packet loss. For the ContikiMAC protocol, by contrast, the data delivery ratio decreases only slightly as the node density increases (from 100% to 99.2%).

Fig. 10. Comparison in terms of average power consumption in the network

Fig. 11. Comparison in terms of average number of parent node changes

Fig. 12. Comparison in terms of average hop count

Figure 10 shows that the network operated by the ContikiMAC protocol achieves better energy efficiency than the network operated by the XMAC protocol. Unlike XMAC, the ContikiMAC protocol does not incur the additional energy cost of sending short preambles.

Regarding network stability (see Figure 11), it can be seen that the network operated by the ContikiMAC protocol is more stable than the network operated by the XMAC protocol. According to the simulation results, for the ContikiMAC protocol the average number of parent node changes in the network does not change much when the network density increases.
However, for the XMAC protocol, the average number of parent node changes in the network increases significantly as the network density increases. As the density increases, noise and collisions during transmission and reception also increase; the data delivery ratio therefore drops, and nodes tend to change their parent node to find alternative routes. This changes the topology. Figure 11 also shows that for low-density networks the topology does not change much under either the XMAC protocol or the ContikiMAC protocol.

Figure 12 shows that the network operated by the XMAC protocol has a lower average hop count than the network operated by the ContikiMAC protocol. Data packets are therefore forwarded over fewer hops under the XMAC protocol than under the ContikiMAC protocol. The average hop count is related to the communication delay in the network.

V. CONCLUSIONS

This paper presented evaluation results comparing the performance of the ContikiMAC and XMAC protocols in a data collection application with the RPL routing protocol. The simulation results show that the XMAC protocol works relatively well in low-density networks. However, when the network density increases, its performance is significantly reduced.
In all simulation scenarios, the ContikiMAC protocol achieves good energy efficiency and a better data delivery ratio than the XMAC protocol.

REFERENCES

[1] Vu Chien Thang, “A Solution for Water Factories in Vietnam using Automatic Meter Reading Technology,” International Journal of Computer Network and Information Security (IJCNIS), Vol.10, No.8, pp.44-50, 2018.
[2] Jean-Philippe Vasseur, Adam Dunkels, “Interconnecting Smart Object with IP: The Next Internet,” Morgan Kaufmann Publishers, 2010.
[3] Areeg Fahad Rasheed, A E Abdelkareem, “Performance Evaluation of MAC Protocols with Multi-Sink for Mobile UWSNs,” International Journal of Computer Network and Information Security (IJCNIS), Vol.11, No.7, pp.1-7, 2019.
[4] Ye W, Silva F, Heidemann J., “Ultra-low duty cycle MAC with scheduled channel polling,” In Proceedings of the 4th International Conference on Embedded Networked Sensor Systems. New York, NY: ACM Press; pp.321–334, 2006.
[5] El-Hoiydi A, Decotignie JD, Enz CC, Le Roux E. “WiseMAC, an ultra low power MAC protocol for the WiseNet wireless sensor network,” In: SenSys, pp. 302–303, 2003.
[6] M. Buettner, G. V. Yee, E. Anderson, and R. Han., “X-MAC: a short preamble mac protocol for duty-cycled wireless sensor networks,” In Proceedings of 2nd ACM conference on Embedded Networked Sensor Systems (SenSys’06), pp. 307–320, 2006.
[7] A. Dunkels, “The ContikiMAC Radio Duty Cycling Protocol,” SICS technical report, December 2011.
[8] Vasseur, J.P., Navneet Agarwal, Jonathan Hui, Zach Shelby, Paul Bertrand, Cedric Chauvenet, “RPL: the IP routing protocol designed for low power and lossy networks,” In: Internet Protocol for Smart Objects (IPSO) Alliance, 2011.
[9] Vu Chien Thang, Nguyen Van Tao, “A Performance Evaluation of Improved IPv6 Routing Protocol for Wireless Sensor Networks,” International Journal of Intelligent Systems and Applications, pp.18-25, 2016.
[10] N. Tsiftes, J. Eriksson, and A. Dunkels, “Low-Power Wireless IPv6 Routing with ContikiRPL,” in Proceedings of the International Conference on Information Processing in Sensor Networks (ACM/IEEE IPSN), Stockholm, Sweden, 2010.
[11] A. Dunkels, B. Grönvall, and T. Voigt, “Contiki - a lightweight and flexible operating system for tiny networked sensors,” in Proc. EmNets, 2004.
[12] Azzedine Boukerche, “Algorithms and Protocols for Wireless Sensor Networks,” John Wiley & Sons Inc., ISBN: 9780470396360, 2008.
[13] Fredrik Österlind, Adam Dunkels, Joakim Eriksson, Niclas Finne, and Thiemo Voigt, “Cross-level sensor network simulation with cooja,” In Proceedings of the First IEEE International Workshop on Practical Issues in Building Sensor Network Applications, Tampa, Florida, USA, 2006.
[14] Adam Dunkels, Fredrik Osterlind, Nicolas Tsiftes, Zhitao He, “Software-based Online Energy Estimation for Sensor Nodes,” Proceedings of the 4th workshop on Embedded networked sensors, 2007.
[15] https:///files/2013/04/tmote-sky-datasheet.pdf.

Authors’ Profiles

Vu Chien Thang received the MSc degree in Electronics and Communication Technology in 2009 from Hanoi University of Science and Technology and the Ph.D. in Telecommunication Engineering in 2015 from Vietnam Research Institute of Electronics, Informatics, and Automation. He is currently a lecturer at Thai Nguyen University of Information and Communication Technology. His research interests include wireless sensor networks, internet of things, and embedded systems.

How to cite this paper: Vu Chien Thang, "A Comparative Study of Network Performance between ContikiMAC and XMAC Protocols in Data Collection Application with ContikiRPL", International Journal of Computer Network and Information Security (IJCNIS), Vol.11, No.8, pp.32-37, 2019. DOI: 10.5815/ijcnis.2019.08.04
Hayati_et_al-2013-Microwave_and_Optical_Technology_Letters

2.88 GHz. For the lower mode with C = 0.6 pF and the higher mode with C = 5 pF, the effects of varying φ on CP performance are given in Figures 4 and 5, respectively. The simulation results suggest that an axial ratio of less than 2 dB can be found when φ ranges between 10° and 25° for the lower mode and between 12° and 16° for the higher mode.

3. RECONFIGURABLE DESIGN AND EXPERIMENTAL RESULTS

An antenna prototype with electrical switching was realized using a varactor diode (BB837, Siemens Semiconductor Group). For the dc bias (V0) used for controlling the varactor, its positive terminal is connected to the feed line through an RF choke, which is composed of a high-impedance meandered microstrip line and a grounded capacitor of 1 nF, and its negative terminal is directly linked to the RF ground plane, as shown in Figure 1. Figure 6 exhibits the experimental results when V0 is switched between two different values. From the measured results, it can be seen that the frequency with minimum axial ratio is 1.83 GHz for the case of V0 = 28 V and 2.96 GHz for the case of V0 = 6 V. The CP bandwidths, determined by 3 dB axial ratio, are 2.7 and 3.3% at the lower and higher CP operating frequencies, respectively. In addition, Figure 6 also demonstrates that a return loss of less than 10 dB is achieved within the two CP bandwidths. Therefore, the antenna can perform dual-frequency operation with a frequency ratio of about 1.6 through switching. The radiation patterns at 1.83 and 2.96 GHz are measured and the results are plotted in Figure 7. Broadside radiation with good CP performance is observed for each operating frequency, and the polarization in the plane of z > 0 is left-handed.
The peak gain at 1.83 GHz is about 3.3 dBic and it is merely 0.2 dB lower than that at 2.96 GHz.

4. CONCLUSION

A design for circularly polarized annular slot antennas with switchable frequency has been presented. Only one diode is required in the reconfigurable design. By controlling the dc bias of the diode, the antenna can perform dual-frequency operation with a high frequency ratio. Moreover, the antenna has almost the same radiation pattern, polarization performance, and peak gain at the two operating frequencies.

REFERENCES

1. Y.K. Jung and B. Lee, Dual-band circularly polarized microstrip RFID reader antenna using metamaterial branch-line coupler, IEEE Trans Antennas Propag 60 (2012), 786–791.
2. Nasimuddin, Z.N. Chen, and X. Qing, Dual-band circularly-polarized S-shaped slotted patch antenna with a small frequency ratio, IEEE Trans Antennas Propag 58 (2010), 2112–2115.
3. J.Y. Sze, C.I.G. Hsu, and J.J. Jiao, CPW-fed circular slot antenna with slit back-patch for 2.4/5 GHz dual-band operation, Electron Lett 42 (2006), 563–564.
4. Y.L. Zhao, Y.C. Jiao, G. Zhao, Z.B. Weng, and F.S. Zhang, A novel polarization reconfigurable ring-slot antenna with frequency agility, Microwave Opt Technol Lett 51 (2009), 540–543.
5. N. Jin, F. Yang, and Y. Rahmat-Samii, A novel patch antenna with switchable slot (PASS): dual-frequency operation with reversed circular polarizations, IEEE Trans Antennas Propag 54 (2006), 1031–1034.
6. T.Y. Lee and J.S. Row, Frequency reconfigurable circularly polarized slot antennas with wide tuning range, Microwave Opt Technol Lett 53 (2011), 1501–1505.

© 2013 Wiley Periodicals, Inc.

DESIGN OF BROADBAND AND HIGH-EFFICIENCY CLASS-E AMPLIFIER WITH pHEMT USING A NOVEL LOW-PASS MICROSTRIP RESONATOR CELL

Mohsen Hayati1,2 and Ali Lotfi1
1 Electrical Engineering Department, Faculty of Engineering, Razi University, Tagh-E-Bostan, Kermanshah-67149, Iran; Corresponding author: mohsen_hayati@
2 Computational Intelligence Research Centre, Razi University, Tagh-E-Bostan, Kermanshah-67149, Iran

Received 31 August 2012

ABSTRACT: In this article, a high-efficiency class-E
amplifier design with low voltage and broadband characteristics using a novel Front Coupled Tapered Compact Microstrip Resonant Cell is presented. The proposed microstrip resonator is used as the harmonic control network in order to suppress higher order harmonics, and it provides optimized impedance matching for the fundamental and the harmonics. The class-E amplifier is realized from 0.7 to 1.8 GHz and obtains a power added efficiency of 72.5–77.5%. The maximum value of power added efficiency (PAE) is 79.7% with 11-dBm input power at 1.5 GHz. The designed class-E amplifier using the proposed harmonic control network gains a 15.34% increment in PAE and a 25.6% reduction in circuit size in comparison with the conventional class-E amplifier. The simulation and measurement results show the validity of the proposed design procedure of the broadband class-E amplifier using a novel microstrip resonator cell. © 2013 Wiley Periodicals, Inc. Microwave Opt Technol Lett 55:1118–1118, 2013; View this article online at . DOI 10.1002/mop.27490

Key words: switch mode; class-E amplifier; tapered cell; microstrip resonant cell; high efficiency; power added efficiency; zero voltage switching; zero voltage derivative switching

1. INTRODUCTION

Modern wireless communication systems need to reduce their power-supply consumption. The main factor in reducing the consumption of the power supply is designing a low-voltage and high-efficiency power amplifier [1]. The switch mode power amplifier is an efficient way of solving the efficiency problem. The class-E power amplifier is a kind of switch mode power amplifier in which the transistor acts as a switch. The class-E power amplifier is tuned by a shunt capacitance. This type of power amplifier theoretically achieves 100% drain efficiency [2]. The class-E amplifier's response conditions are zero voltage switching (ZVS) and zero voltage derivative switching (ZVDS), which lead to zero power loss in the transistor. Therefore, a high-efficiency power amplifier is obtained [3]. The shunt capacitance
in the class-E power amplifier plays a main role in achieving the class-E conditions [4].

The power loss at lower frequencies can be neglected, but as the operation frequency increases, the power dissipation increases and the ideal operation of the class-E power amplifier is lost. Keeping the voltage and current waveforms in antiphase throughout the signal period yields the class-E power amplifier with maximum efficiency [5]. This can be achieved using a wave shaping network. The conventional class-E power amplifier load resistance is very much lower than the transistor ON-resistance. This effect leads to efficiency degradation and a narrowband load matching network [6]. Furthermore, the transistor parasitic resistance in the switch on-state and the parasitic inductance lead to efficiency degradation in radio frequency (RF) and microwave applications [7, 8].

The optimum operation of the class-E power amplifier and the solution to the mentioned drawbacks can be obtained using two main methods, namely active device selection and circuit configuration [9]. The class-E amplifier has various configurations such as the cascade [10] and push–pull [11]. The cascade class-E configuration can double the maximum permissible drain voltage, and the push–pull class-E configuration increases the output power and decreases the harmonic distortion with high efficiency. A new topology for the class-E amplifier has been proposed as the inverse class-E amplifier, which has inductive reactance [12]. The inverse class-E amplifier has higher load resistance and lower peak switch voltage in comparison with the class-E amplifier. Also, because of the absorption of the device output inductances, the value of the inductance in the load network is decreased. However, the inverse class-E amplifier can be used only for small to medium power applications. To solve this drawback, power combining methods have been used [13]. Although this method results in obtaining the inverse class-E
amplifier for higher power applications, the circuit configuration and the design procedure are complicated, and the circuit size increases because two power amplifier circuits are used.

The class-E power amplifier is a high-efficiency power amplifier for microwave applications, which is implemented using a transmission line as the harmonic control network at the output of the amplifier circuit [14]. Furthermore, a section of transmission line is used instead of the RF choke (RFC). The transmission line has been used increasingly as the harmonic control network in class-E power amplifiers using LDMOS [15], GaN HEMT [16–19], SiC MESFET [20], and LDMOSFET [21], because of the simplicity of its structure and high rejection of harmonics. Therefore, the class-E amplifier configuration and operation are the best candidates for the design of amplifiers for modern microwave communication systems [22, 23].

Consequently, designing the load network as a harmonic control network that suppresses harmonics in order to obtain a high-efficiency power amplifier is the main challenge of switch mode power amplifiers. Class-E power amplifier designs using various microstrip structures have been proposed, such as a defected ground structure [24], an asymmetrical spur-line [25], and composite right/left-handed transmission lines [26]. The narrowband load network and low efficiency remain the main challenges for the class-E power amplifier using the conventional microstrip transmission line [27].

A compact microstrip resonant cell (CMRC) is a one-dimensional photonic band gap incorporating the microstrip transmission line, which was first proposed in [28]. The CMRC structure exhibits high rejection of the harmonics with a compact circuit size in comparison with conventional microstrip transmission lines. Therefore, it is used for the linearization and efficiency increment of microwave power amplifiers [29, 30]. The application of the conventional CMRC is limited to
obtain a high-efficiency switch mode amplifier, as a result of the high insertion loss in the passband and the restricted stopband. The front coupled tapered CMRC (FCTCMRC) is proposed in [31] for the implementation of a low-pass filter with high and wide rejection in the stopband and a compact circuit size in comparison with the conventional CMRC. Therefore, it can be widely used for designing high-efficiency and broadband switch mode power amplifiers because of its high and wide suppression of harmonics.

In this article, the harmonic suppression of the class-E amplifier using a novel FCTCMRC as the harmonic control network is explored. A class-E amplifier with higher efficiency over a wider bandwidth in comparison with conventional amplifiers is achieved. The proposed class-E power amplifier is designed and simulated for a frequency of 1.5 GHz using the microstrip resonator structure. The measurement results of the proposed power amplifier validate our design procedure and simulation results.

2. CLASS-E AMPLIFIER FUNDAMENTAL AND DESIGN THEORY

2.1. Class-E Amplifier Operation

The basic circuit configuration of the class-E amplifier and the switch waveforms are shown in Figures 1(a) and 1(b), respectively. The class-E amplifier consists of the switch device, shunt capacitance, series-tuned load network L-C, and an ideal RFC.
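As a hedged numerical illustration of the class-E design relations used in this section, the Python sketch below evaluates the waveform constants of Eqs. (6) and (7) and the nominal shunt capacitance of Eq. (10). The function names and the 50 Ω load resistance are assumptions made only for this example; they are not the element values of the designed amplifier. The 1.5 GHz frequency is the design frequency stated in the Introduction.

```python
import math


def class_e_constants():
    """Return (a, phi) from Eqs. (6) and (7).

    a   = sqrt(1 + pi^2 / 4)
    phi = -arctan(2 / pi), in radians
    """
    a = math.sqrt(1.0 + math.pi ** 2 / 4.0)
    phi = -math.atan(2.0 / math.pi)
    return a, phi


def nominal_shunt_capacitance(f0_hz, r_load_ohm):
    """Eq. (10): C = 0.1836 / (omega_0 * R), in farads."""
    omega0 = 2.0 * math.pi * f0_hz
    return 0.1836 / (omega0 * r_load_ohm)


a, phi = class_e_constants()
# Assumed load resistance of 50 ohm, for illustration only.
c_shunt = nominal_shunt_capacitance(f0_hz=1.5e9, r_load_ohm=50.0)

print(f"a = {a:.4f}")
print(f"phi = {math.degrees(phi):.2f} deg")
print(f"C = {c_shunt * 1e12:.4f} pF")
```

Because Eq. (10) scales as 1/(ω0 R), doubling either the frequency or the load resistance halves the required shunt capacitance, which is why the device output capacitance bounds the maximum operating frequency.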
The switch-on duty ratio is assumed to be 50% in designing the class-E amplifier. This value of the duty ratio leads to optimum operation of the class-E amplifier for obtaining high efficiency [32]. For ideal class-E operation, three requirements for the drain voltage and current should be met [2]:

1. The rise of the voltage across the transistor at turn-off should be delayed until the transistor is off.
2. The drain voltage should be brought back to zero at the time of the transistor turn-on.
3. The slope of the drain voltage should be zero at the time of the transistor turn-on.

Therefore, the class-E power amplifier is constructed based on two conditions, ZVS and ZVDS. These conditions are as follows:

\[ v_s(\theta)\big|_{\theta=\pi} = 0, \tag{1} \]

\[ \frac{dv_s(\theta)}{d\theta}\bigg|_{\theta=\pi} = 0, \tag{2} \]

where $v_s(\theta)$ is the switch voltage and $\theta = \omega t$. The quality factor of the output series resonant circuit is assumed infinite. Therefore, the output current is sinusoidal:

\[ i_o(\theta) = I_m \sin(\theta + \varphi). \tag{3} \]

In the time interval $0 \le \theta < \pi$, the switch device is in the on-state; therefore, using Kirchhoff's current law at the switch, we have

\[ i_s(\theta) = I_{dc}\left(1 + a \sin(\theta + \varphi)\right). \tag{4} \]

This is the current that flows through the shunt capacitance in the switch-off state. Therefore, the voltage across the switch is

\[ v_s(t) = \frac{1}{C_s}\int^{t} i_s(t')\,dt' = \frac{I_{dc}}{\omega C_s}\left(1 + a\left(\cos(\omega t + \varphi) - \cos(\varphi)\right)\right). \tag{5} \]

Applying the class-E ZVS and ZVDS conditions to Eqs. (4) and (5), the values of $a$ and $\varphi$ can be obtained as

\[ a = \sqrt{1 + \frac{\pi^2}{4}}, \tag{6} \]

\[ \varphi = -\tan^{-1}\left(\frac{2}{\pi}\right). \tag{7} \]

The drain voltage waveform is shaped by the harmonics so that the drain voltage and the slope of the drain voltage are zero when the transistor is in the on-state. The reactance for all harmonics is negative and comparable in magnitude to the fundamental frequency load resistance. The ideal class-E amplifier requirements are difficult to meet, so often only the second and third harmonics are tuned to obtain a suboptimum class-E power amplifier solution. The
analysis is performed considering just the output network behavior, thus neglecting the input signal required to operate the active device as an ideal switch.

The optimal fundamental load for achieving perfect class-E operation, obtained by the Fourier-series expansion analysis in [7], can be determined as

\[ Z_{E,f_0} = \frac{0.28}{\omega C_P}\, e^{j49^\circ}. \tag{8} \]

This impedance is inductive. On the other hand, for ideal operation of the class-E power amplifier, the impedances at the higher order harmonics are infinite:

\[ Z_{E,f_n} = \infty, \quad \text{for } n \ge 2. \tag{9} \]

From (8), the nominal class-E amplifier shunt capacitance $C$ is defined by

\[ C = \frac{0.1836}{\omega_0 R}. \tag{10} \]

In order to achieve the maximum operation frequency of the class-E amplifier, the device output capacitance should equal the value given by Eq. (10). A matching network for the class-E power amplifier using a low-pass Chebyshev-form impedance transformer is proposed in [7]. The load network is therefore synthesized using short-circuit and open-circuit stubs instead of lumped capacitors to handle the unwanted harmonics.

2.2. Design of a Class-E Amplifier Using a pHEMT

Achieving the optimum load is the main factor in obtaining high efficiency when designing the class-E power amplifier. On the other hand, the optimum load varies with the operating frequency, as in Eq. (8). Therefore, a load network that can operate over a wide frequency range is needed for designing the class-E power amplifier with the optimum conditions. The maximum operation frequency of the class-E power amplifier is restricted by the shunt capacitance. The shunt capacitance consists of the transistor output capacitance and the external capacitance. Thus, the optimum operating frequency of the class-E power amplifier is achieved by selecting a transistor with lower output capacitance. On the other hand, power loss is caused by the ON-resistance of the transistor [33]. Therefore, an active device with lower ON-resistance is preferred for designing a high-efficiency class-E power amplifier. We selected an
ATF-34143 pHEMT because of its lower ON-resistance and lower shunt parasitic capacitance, which provide lower power dissipation and optimum operation frequency using external capacitance, respectively. The circuit topology of the conventional class-E amplifier is shown in Figure 2(a). It is designed using the design procedure presented in [2, 3]. The element values for an ideal class-E power amplifier are tabulated in Table 1. In the design of the class-E power amplifier, it is assumed that the value of the DC-feed is infinite, but in a real implementation this value is finite, and we used a half-wavelength microstrip transmission line for the DC-feed.

In the conventional class-E amplifier using lumped elements, the second harmonic is located within the pass band. Therefore, the bandwidth is limited to one octave. One way to overcome this drawback is to design multiple matching networks for various bands and select among them with a switching element. This approach increases the complexity of the amplifier circuit and degrades the efficiency.

The use of microstrip transmission lines is a low-cost and simple way of designing a class-E amplifier with wide-band and high-efficiency characteristics. We used the design procedure in Section 2.1 and designed the matching network for the amplifier as shown in Figure 2(b). The dimensions of the transmission lines are given in Table 2. The class-E amplifier is designed on RT/Duroid 5880, a substrate with a dielectric constant of 2.2, height of 15 mil, and loss tangent of 0.0009.

Figure 2 Idealized class-E power amplifier: (a) lumped elements and (b) transmission line

TABLE 1 Element Design for the Nominal Class-E Amplifier
Columns: C_i1 (pF), C_i2 (pF), C_o1 (pF), C_o2 (pF), C_e (pF), C_g1 (pF), C_g2 (pF), C_d1 (pF), C_d2 (pF), L_i1 (nH), L_o1 (nH), L_o2 (nH)
Theoretical: 10010010010 4.2221000.50.2312 4.7 3.3

3. FRONT COUPLED TAPERED CMRC CHARACTERISTICS

A novel FCTCMRC was proposed for the first time in [31], where it is used to synthesize a low-pass filter with high and wide rejection in the stopband. This
microstrip structure exhibits bandstop characteristics and slow-wave effects, which are used for stopband extension and circuit size reduction, respectively. The schematic and equivalent circuit of the resonator are shown in Figures 3(a) and 3(b), respectively. The proposed FCTCMRC has a symmetrical topology. Therefore, even–odd mode analysis [34] can be used to simplify the analysis, as shown in Figures 3(c) and 3(d). Consequently, the resonant condition for the odd mode in Figure 3(c) is obtained by equating the input admittance $Y_{in}^{o}$ of the proposed resonator to zero, which yields:

$$ Z_1\left(\frac{1}{2\omega C_1} - Z_1 \tan\theta_1\right) - Z_2 \tan\theta_2\left(Z_1 + \frac{\tan\theta_1}{2\omega C_1}\right) = 0. \tag{11} $$

Using a similar procedure, the even-mode resonant frequencies are obtained by equating the even-mode admittance $Y_{in}^{e}$ to zero as follows:

$$ Z_2 \tan\theta_1 + Z_1 \tan\theta_2 = 0. \tag{12} $$

The transmission zeros of the equivalent circuit for the proposed FCTCMRC, which is shown in Figure 3(a), are obtained when $Y_{in}^{o} = Y_{in}^{e}$:

$$ Z_2 \sin 2\theta_2 + Z_1 \sin 2\theta_1 = \frac{\cos 2\theta_1}{\omega C_1}. \tag{13} $$

Therefore, the transmission zeros in the stopband can be tuned through the length and width of the tapered cells, as shown in Figures 4(a) and 4(b). The proposed structure is optimized by an EM simulator (ADS). The obtained dimensions are as follows: L_t1 = 2.58, L_2 = 1.94, L_3 = 2.7, W_t1 = 2.71, W_t2 = 5.6, W_1 = 0.1, W_2 = 0.56, L_3 = 0.75, L_f = 2.36, W_f = 0.25 (all in millimeters).

TABLE 2 The Value of the Conventional Transmission Line for the Class-E Amplifier
             TL1   TL2   TLb1  TL3   TL4   TL5   TLb2
Width (mm)   4.73  0.94  0.62  0.71  1.24  4.21  0.72
Length (mm)  6.319.7262.3137.2318.4264.3

Figure 3 (a) Schematic of the proposed resonator. (b) Equivalent circuit. (c) Odd mode. (d) Even mode

Figure 4 (a) Changing of the transmission zeros with the width of tapered cell W_t1. (b) Changing of the transmission zeros with the length of tapered cell L_t. (c) Simulation and measurement results of the proposed harmonic control network. (d) Simulated input impedance of the FCTCMRC

The proposed FCTCMRC is fabricated, and the measurement is performed
using an Agilent N5230A Network Analyzer. The simulation and measurement results of the proposed FCTCMRC are shown in Figure 4(c). As shown, it has attenuation levels of −43 and −33.1 dB at 3.0 and 4.5 GHz, respectively. Therefore, high suppression of the second and third harmonics is obtained. The insertion loss from DC to 2.39 GHz is lower than 0.1 dB. The simulated input impedance of the proposed CMRC at the fundamental and the harmonics is shown in Figure 4(d). As observed, the harmonic impedances are close to an open circuit in comparison with the fundamental impedance. Consequently, the structure can be used as a matching network with high performance and low circuit complexity.

4. CIRCUIT DESIGN AND IMPLEMENTATION

The highly efficient and compact class-E amplifier is designed and implemented for the 1.5-GHz band using an ATF-34143 pHEMT. The proposed circuit is simulated using Agilent's Advanced Design System (ADS) and fabricated on an RT/Duroid 5880 substrate. The active device is biased at $V_d = 3$ V and $V_g = -0.7$ V. The FCTCMRC is used as the harmonic control network (HCN) at the output of the active device. The proposed HCN absorbs the parasitic reactance and capacitance of the active device. Therefore, it does not need any lumped elements in series or parallel with the transistor to compensate for the parasitic elements. The circuit schematic diagram of the designed class-E amplifier is shown in Figure 5(a), and a photograph of the fabricated circuit is shown in Figure 5(b). The RFC is realized using the microstrip transmission line (TLb2) with a quarter wavelength at a frequency of 1.5 GHz. The input matching elements consist of two series and parallel open stubs. The dimensions of the tapered cells and transmission lines in the HCN are tuned in order to optimize harmonic termination in the implemented amplifier circuit. The output matching network using the FCTCMRC as a low-pass topology has been designed and implemented for operation from 0.7 to 1.8 GHz.

The voltage and
current waveforms of the designed class-E amplifier are shown in Figure 5(c). The switch is open during the time interval 0.2–0.4 ns, and the current through it is near zero. The switch is closed during the time interval 0.6–0.8 ns, and the voltage across it is near zero. The class-E ZVS and ZVDS conditions in the switch turn-off state are obtained. Therefore, a high-efficiency class-E amplifier is achieved.

The input signal is generated using an Agilent E4433B signal generator, and the measurement is done by an E4440A PSA series spectrum analyzer. The simulated and measured output power and gain for $P_{in} = 11$ dBm (input power) are shown in Figure 6(a). The maximum output power at 1.5 GHz with $P_{in} = 11$ dBm is 25.3 dBm, and the related gain is 14.3 dB. The conventional class-E amplifier without CMRC has an output power of 18.5 dBm and a gain of 7.5 dB. The class-E amplifier using CMRC has a 36.7% output power improvement in comparison with the one without CMRC.

The simulation and measurement results for the PAE at $P_{in} = 11$ dBm (input power) are shown as a function of the operating frequency in Figure 6(b). The highest value of PAE, at a frequency of 1.5 GHz, was 79.7%. The value of the PAE is 69.1% for the conventional class-E amplifier without CMRC. Therefore, the proposed class-E amplifier using the novel CMRC has a 15.34% PAE improvement in comparison with the one without CMRC.

The output power of the conventional class-E amplifier decreases as the operating frequency increases. As shown in Figure 6(a), this decrease is considerable when the operating frequency is above 1.2 GHz. Therefore, the conventional class-E amplifier has a drawback for broadband applications. The designed class-E amplifier has a 25.6% circuit size reduction in comparison with the conventional class-E amplifier.

5. CONCLUSION

The class-E amplifier with high efficiency and broadband characteristics has been designed and implemented. A novel and simple load-matching technique for the low-voltage microwave class-E amplifier using
a front-coupled tapered compact microstrip resonant cell has been presented. The proposed amplifier achieved an output power of 25.3 dBm, a power added efficiency of 79.7%, and a gain of 14.3 dB at an input power of 11 dBm. It has high-efficiency performance over a significant bandwidth from 0.7 to 1.8 GHz (88%). The proposed compact microstrip resonant cell as the harmonic control network exhibited a 15.34% improvement in PAE and a 25.6% reduction in circuit size in comparison with the conventional class-E amplifier. The extremely low insertion loss at the fundamental frequency and the size reduction characteristics can be used in the design of class-E amplifiers with higher output power and smaller size, which are required in broadband applications.

Figure 5 The pHEMT class-E amplifier. (a) Circuit configuration. (b) A photograph of the fabricated amplifier. (c) Simulated switch voltage and current waveforms.

REFERENCES

1. S.C. Cripps, Advanced techniques in RF power amplifiers design, Artech House, Norwood, MA, 2002.
2. N.O. Sokal and A.D. Sokal, Class E—A new class of high-efficiency tuned single-ended switching power amplifiers, IEEE J Solid-State Circuits 10 (1975), 168–176.
3. F.H. Raab, Idealized operation of the class E tuned power amplifier, IEEE Trans Circuits Syst 25 (1977), 725–735.
4. R.E. Zulinski and J.W. Steadman, Class E power amplifiers and frequency multipliers with finite DC-feed inductance, IEEE Trans Circuits Syst 34 (1987), 1074–1087.
5. R. Negra, F.M. Ghannouchi, and W. Bachtold, Study and design optimization of multi-harmonic transmission-line load networks for class-E and class-F K-band MMIC power amplifiers, IEEE Trans Microwave Theory Tech 55 (2007), 1390–1397.
6. K.L.R. Mertens and M.S.J. Steyaert, A 700-MHz 1-W fully differential CMOS class-E power amplifier, IEEE J Solid-State Circuits 37 (2002), 137–141.
7. T.B. Mader and Z.B. Popovic, The transmission line high-efficiency class-E amplifier, IEEE Microwave Guided Wave Lett 5 (1995), 290–292.
8. T. Suetsugu and M.K. Kazimierczuk, Design procedure for lossless voltage-clamped class E amplifier with a transformer and a diode, IEEE Trans Power Electron 20 (2005), 56–64.
9. H. Jäger, A.V. Grebennikov, E.P. Heaney, and R. Weigel, Broadband high-efficiency monolithic InGaP/GaAs HBT power amplifiers for wireless applications, Int J RF Microwave Comput Aided Eng 13 (2003), 496–510.
10. A. Mazzanti, L. Larcher, R. Brama, and F. Svelto, Analysis of reliability and power efficiency in cascode class-E PAs, IEEE J Solid-State Circuits 41 (2006), 1222–1229.
11. S.C. Wong and C.K. Tse, Design of symmetrical class-E power amplifiers for very low harmonic-content applications, IEEE Trans Circuits Syst I, Reg Papers 52 (2005), 1684–1690.
12. T. Mury and V.F. Fusco, Inverse class-E amplifier with transmission line harmonic suppression, IEEE Trans Circuits Syst I, Reg Papers 54 (2007), 1555–1561.
13. T. Mury and V.F. Fusco, Power combining techniques into unbalanced loads for class-E and inverse class-E amplifiers, IET Microwave Antennas Propag 2 (2008), 529–537.
14. A.J. Wilkinson and J.K.A. Everard, Transmission-line load-network topology for class-E power amplifiers, IEEE Trans Microwave Theory Tech 49 (2001), 1202–1210.
15. J. Lee, S. Kim, J. Nam, J. Kim, I. Kim, and B. Kim, Highly efficient LDMOS power amplifier based on class-E topology, Microwave Opt Technol Lett 48 (2006), 789–791.
16. Y.-S. Lee and Y.-H. Jeong, A high-efficiency class-E GaN HEMT power amplifier for WCDMA applications, IEEE Microwave Wireless Compon Lett 17 (2007), 622–624.
17. H.G. Bae, R. Negra, S. Boumaiza, and F.M. Ghannouchi, High-efficiency GaN class-E power amplifier with compact harmonic-suppression network, Proc 37th Europ Microwave Conf, 2007, pp. 1093–1096.
18. Y.-S. Lee, M.-W. Lee, and Y.-H. Jeong, A 1-GHz GaN HEMT based class-E power amplifier with 80% efficiency, Microwave Opt Technol Lett 50 (2008), 2989–2992.
19. Y.-S. Lee, M.-W. Lee, and Y.-H. Jeong, A 40-W balanced GaN HEMT class-E power amplifier with 71% efficiency for WCDMA base station, Microwave Opt Technol Lett 51 (2009), 842–845.
20. Y.S. Lee and Y.H. Jeong, A high-efficiency class-E power amplifier using SiC MESFET, Microwave Opt Technol Lett 49 (2007), 1447–1449.
21. J.-H. Van, M.-S. Kim, S.-C. Jung, H.-C. Park, G. Ahn, C.-S. Park, B.-S. Kim, and Y. Yang, A high-frequency and high-power quasi-class-E amplifier design using a finite bias feed inductor, Microwave Opt Technol Lett 49 (2007), 1114–1118.
22. R. Beltran, F.H. Raab, and A. Velazquez, High-efficiency outphasing transmitter using class-E power amplifiers and asymmetric combining, Microwave Opt Technol Lett 51 (2009), 2959–2963.
23. C. Park, Y. Kim, H. Kim, and S. Hong, Fully integrated 1.9-GHz CMOS power amplifier for polar transmitter applications, Microwave Opt Technol Lett 48 (2006), 2053–2056.
24. Y.C. Jeong, S.-G. Jeong, J.S. Lim, and S.W. Nam, A new method to suppress harmonics using λ/4 bias line combined by defected ground structure in power amplifiers, IEEE Microwave Wireless Compon Lett 13 (2003), 538–540.
25. L. Wang, W. Chen, P. Wang, X. Xue, J. Dong, and Z. Feng, Design of asymmetrical spur-line filter for a high power SiC MESFET class-E power amplifier, Microwave Opt Technol Lett 52 (2010), 1650–1652.
26. M. Thian and V. Fusco, Design strategies for dual-band class-E power amplifier using composite right/left-handed transmission lines, Microwave Opt Technol Lett 49 (2007), 2784–2788.
27. Y. Qin, S. Gao, A. Sambell, and E. Korolkiewicz, Design of low-cost broadband class-E power amplifier using low-voltage supply, Microwave Opt Technol Lett 44 (2005), 103–106.
28. Q. Xue, K.M. Shum, and C.H. Chan, Novel 1-D microstrip PBG cells, IEEE Microwave Wireless Comp Lett 10 (2000), 403–405.
29. T. Yin, Q. Xue, and C.H. Chan, Amplifier linearization using compact microstrip resonant cell-theory and experiment, IEEE Trans Microwave Theory Tech 52 (2004), 927–934.

Figure 6 Comparison of the conventional amplifier simulation with the simulated and measured results of the proposed amplifier. (a) Output power and gain. (b) Power added efficiency (PAE %).
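As a numerical cross-check of the ideal class-E relations in Section 2.1, the following sketch evaluates Eqs. (6), (7), (8), and (10) and tests the ZVS and ZVDS conditions (1) and (2) at θ = π. It assumes the off-state switch current is written as i_s = I_dc(1 − a sin(θ + φ)), so that the normalized switch voltage is θ + a(cos(θ + φ) − cos φ). The function names, the 50 Ω load, and the choice of 1.5 GHz for the capacitance example are illustrative assumptions, not values taken from the design tables. The last helpers recompute the gain and the relative improvements quoted in Section 4.

```python
import cmath
import math


def class_e_constants():
    """Return (a, phi) from Eqs. (6) and (7); phi is in radians."""
    a = math.sqrt(1.0 + math.pi ** 2 / 4.0)
    phi = -math.atan(2.0 / math.pi)
    return a, phi


def switch_voltage_norm(theta, a, phi):
    """Switch voltage scaled by omega*C_s/I_dc (assumed form of Eq. (5))."""
    return theta + a * (math.cos(theta + phi) - math.cos(phi))


def switch_slope_norm(theta, a, phi):
    """Derivative of the normalized switch voltage with respect to theta."""
    return 1.0 - a * math.sin(theta + phi)


def shunt_capacitance(f0_hz, r_ohm):
    """Nominal shunt capacitance from Eq. (10): C = 0.1836 / (omega_0 * R)."""
    return 0.1836 / (2.0 * math.pi * f0_hz * r_ohm)


def fundamental_load(f_hz, c_farad):
    """Optimal fundamental load from Eq. (8): (0.28 / (omega*C)) at 49 degrees."""
    magnitude = 0.28 / (2.0 * math.pi * f_hz * c_farad)
    return cmath.rect(magnitude, math.radians(49.0))


def gain_db(p_out_dbm, p_in_dbm):
    """Power gain in dB from output and input power in dBm."""
    return p_out_dbm - p_in_dbm


def relative_improvement(new, old):
    """Relative change of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    a, phi = class_e_constants()
    print(f"a = {a:.4f}, phi = {math.degrees(phi):.2f} deg")
    print("ZVS residual :", switch_voltage_norm(math.pi, a, phi))
    print("ZVDS residual:", switch_slope_norm(math.pi, a, phi))

    # Hypothetical example: 50-ohm load at 1.5 GHz (assumed, not from Table 1).
    c = shunt_capacitance(1.5e9, 50.0)
    print(f"C = {c * 1e12:.3f} pF, Z_E,f0 = {fundamental_load(1.5e9, c):.2f} ohm")

    print("gain (dB):", round(gain_db(25.3, 11.0), 1))
    print("PAE improvement (%):", round(relative_improvement(79.7, 69.1), 2))
```

Both residuals come out at the level of floating-point rounding, which is what Eqs. (6) and (7) are meant to guarantee under the stated sign convention. The positive imaginary part of the computed load reflects the inductive character noted after Eq. (8).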
Supramol. chem-3

have sizes in the range of tenths of nanometres.
Microfabrication is a collective term used for various kinds of lithography. The most common microfabrication technique used today is
5.1.2 Nanotechnology: the "top-down" approach
Approaches based on nanotechnology are needed to produce components on the 30 nm scale and below. For example, lithography and
One of the most conceptually obvious ways to carry out chemistry on the nanoscale or to make nanoscale objects is to simply move molecules or atoms around directly. Such a process is termed nanomanipulation and in practice it is extremely difficult to achieve. This is because it is difficult to apply the necessary force on such a small scale. There are a number of modern techniques that can achieve manipu-
Phase Behavior of Medium and High Internal Phase Water-in-Oil

/10.1021/la4032514 | Langmuir 2014, 30, 452−460
Langmuir Table 1. Pickering Emulsions Stabilized by Various Types of Cellulose14−22,24,54−56
type of cellulose microcrystalline cellulose chemical modification oil phase heavy mineral oil sunflower oil vegetable oil kerosene toluene diesel type of emulsion o/w o/w o/w o/w w/o w/o w/o w/o o/w o/w o/w o/w o/w o/w o/w o/w o/w o/w o/w w/o or o/wd w/o
Terms of Use CC-BY
Phase Behavior of Medium and High Internal Phase Water-in-Oil Emulsions Stabilized Solely by Hydrophobized Bacterial Cellulose Nanofibrils
Koon-Yang Lee,†,‡,⊥ Jonny J. Blaker,‡ Ryo Murakami,∥ Jerry Y. Y. Heng,§ and Alexander Bismarck*,†,‡
† Polymer and Composite Engineering (PaCE) Group, Institute of Materials Chemistry and Research, Faculty of Chemistry, University of Vienna, Währinger Strasse 42, A-1090 Vienna, Austria
‡ Polymer and Composite Engineering (PaCE) Group and § Surfaces and Particle Engineering Laboratory (SPEL), Department of Chemical Engineering, Imperial College London, South Kensington Campus, London SW7 2AZ, United Kingdom
∥ Department of Chemistry of Functional Molecules, Konan University, 8-9-1 Okamoto, Kobe 658-8501, Japan

ABSTRACT: Water-in-oil emulsions stabilized solely by bacterial cellulose nanofibers (BCNs), which were hydrophobized by esterification with organic acids of various chain lengths (acetic acid, C2-; hexanoic acid, C6-; dodecanoic acid, C12-), were produced and characterized. When using freeze-dried C6-BCN and C12-BCN, only a maximum water volume fraction (ϕw) of 60% could be stabilized, while no emulsion was obtained for C2-BCN. However, the maximum ϕw increased to 71%, 81%, and 77% for C2-BCN, C6-BCN, and C12-BCN, respectively, 150 h after the initial emulsification, thereby creating high internal phase water-in-toluene emulsions. The observed time-dependent behavior of these emulsions is consistent with the disentanglement and dispersion of freeze-dried modified BCN bundles into individual nanofibers with time. These emulsions exhibited catastrophic phase separation when ϕw was increased, as opposed to catastrophic phase inversion observed for other Pickering emulsions.
ISO/IEC 16022:2006 - Cor.2 (2011) - Data Matrix code corrigendum

ICS 01.080.50; 35.040
Ref. No. ISO/IEC 16022:2006/Cor.2:2011(E)
© ISO/IEC 2011 – All rights reserved
Published in Switzerland

INTERNATIONAL STANDARD ISO/IEC 16022:2006
TECHNICAL CORRIGENDUM 2
Published 2011-02-01

INTERNATIONAL ORGANIZATION FOR STANDARDIZATION • МЕЖДУНАРОДНАЯ ОРГАНИЗАЦИЯ ПО СТАНДАРТИЗАЦИИ • ORGANISATION INTERNATIONALE DE NORMALISATION
INTERNATIONAL ELECTROTECHNICAL COMMISSION • МЕЖДУНАРОДНАЯ ЭЛЕКТРОТЕХНИЧЕСКАЯ КОМИССИЯ • COMMISSION ÉLECTROTECHNIQUE INTERNATIONALE

Information technology — Automatic identification and data capture techniques — Data Matrix bar code symbology specification
TECHNICAL CORRIGENDUM 2

Technologies de l'information — Techniques automatiques d'identification et de capture des données — Spécification de symbologie de code à barres Data Matrix
RECTIFICATIF TECHNIQUE 2

Technical Corrigendum 2 to ISO/IEC 16022:2006 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 31, Automatic identification and data capture techniques.

Page 28

Replace Clause 9 with the following:

9 Reference decode algorithm for Data Matrix

This reference decode algorithm finds a Data Matrix symbol in an image and decodes it.

a) Define measurement parameters and form a digitised image:
    1) Define a distance d_min which is 7,5 times the aperture diameter defined by the application. This will be the minimum length of the "L" pattern's side.
    2) Define a distance g_max which is 7,5 times the aperture diameter. This is the largest gap in the "L" finder that will be tolerated by the finder algorithm in step b).
    3) Define a distance m_min which is 1,25 times the aperture diameter.
       This would be the nominal minimum module size when the aperture size is 80% of the symbol's X dimension.
    4) Form a black/white image using a threshold determined according to the method defined in ISO/IEC 15415.
b) Search horizontal and vertical scan lines for the two outside edges of the Data Matrix "L":
    1) Extend a scan line horizontally in both directions from the centre point of the image. Sample along the scan line. For each white/black or black/white transition found along the scan line resolved to the pixel boundary:
        i) Follow the edge upward sampling pixel by pixel until either it reaches a point 3,5m_min distant from the intersection of the scan line and the edge starting point, or the edge turns back toward the intersection of the scan line and the edge - the starting point.
        ii) Follow the edge downward pixel by pixel until either it reaches a point 3,5m_min distant from the intersection of the scan line and the edge starting point, or the edge turns back toward the intersection of the scan line and the edge - the starting point.
        iii) If the upward edge reaches a point 3,5m_min from the starting point:
            I) Plot a line A connecting the end points of the upward edge.
            II) Test whether the intermediate edge points lie within 0,5m_min from line A. If so, continue to step III. Otherwise proceed to step 1)iv) to follow the edge in the opposite direction.
            III) Continue following the edge upward until the edge departs 0,5m_min from line A. Back up to the closest edge point greater than or equal to m_min from the last edge point along the edge before the departing point and save this as the edge end point. This edge point should be along the "L" candidate outside edge.
            IV) Continue following the edge downward until the edge departs 0,5m_min from line A. Back up to the closest edge point greater than or equal to m_min from the last edge point along the edge before the departing point and save this as the edge end point.
                This edge point should be along the "L" candidate outside edge.
            V) Calculate a new adjusted line A1 that is a "best fit" line to the edge in the two previous steps. The "best fit" line uses the linear regression algorithm (using the end points to select the proper dependent axis, i.e. if closer to horizontal, the dependent axis is x) applied to each point. The "best fit" line terminates lines at points p1 and p2 that are the points on the "best fit" line closest to the endpoints of the edge.
            VI) Save the line A1 segment two end points, p1 and p2. Also save the colour of the left side of the edge viewed from p1 to p2.
        iv) If step iii) failed or did not extend upward by 3,5m_min in step iii)IV), test if the downward edge reaches a point 3,5m_min from the starting point. If so, repeat the steps in iii) but with the downward edge.
        v) If neither steps iii) or iv) were successful, test if both the upward and downward edges terminated at least 2m_min from the starting point. If so, form an edge comprised of the appended 2m_min length upward and downward edge segments and repeat the steps in iii) but with the appended edge.
        vi) Proceed to and process the next transitions on the scan line, repeating from step i), until the boundary of the image is reached.
    2) Extend a scan line vertically in both directions from the centre point of the image. Look for line segments using the same logic in step 1) above but following each edge transition first left and then right.
    3) Search among the saved line A1 segments for pairs of line segments that meet the following four criteria:
        i) If the two lines have the same p1 to p2 directions, verify that the closer of the interline p1 to p2 distances is less than g_max.
           If the two lines have opposite p1 to p2 directions, verify that the closer of the interline p1 to p1 or p2 to p2 distances is less than g_max.
        ii) Verify that the two lines are co-linear within 5 degrees.
        iii) Verify that the two lines have the same saved colour if their p1 to p2 directions are the same or that the saved colours are opposite if their p1 to p2 directions are opposite to each other.
        iv) Form two temporary lines by extending each line to reach the point on the extension that is closest to the furthest end point of the other line segment. Verify that the two extended lines are separated by less than 0,5m_min at any point between the two extended lines.
    4) For each pair of lines meeting the criteria of step 3) above, replace the pair of line segments with a longer A1 line segment that is a "best fit" line to the four end points of the pair of shorter line segments. Also save the colour of the left side of the edge of the new longer line viewed from its p1 endpoint to its p2 endpoint.
    5) Repeat steps 3) and 4) until no more A1 line pairs can be combined.
    6) Select line segments that are at least as long as d_min. Flag them as "L" side candidates.
    7) Look for pairs of "L" side candidates that meet the following three criteria:
        i) Verify that the closest points on each line are separated by less than 1,5g_max.
        ii) Verify that they are perpendicular within 5 degrees.
        iii) Verify that the same saved colour is on the inside of the "L" formed by the two lines. Note that if one or both lines extend past their intersection, then the two or four "L" patterns formed will need to be tested for matching colour and maintaining a minimum length of d_min for the truncated side or sides before they can become "L" candidates.
    8) For each candidate "L" pair found in step 7) form an "L" candidate by extending the segments to their intersection point.
    9) If the "L" candidate was formed from line segments with the colour white on the inside of the "L", form a colour inverted image to decode.
       Attempt to decode the symbol starting with the appropriate normal or inverted image starting from step d) below using each of the "L" candidates from step 8) as the "L" shaped finder. If none decode, proceed to step c).
c) Maintain the line A1 line segments and "L" side candidates from the previous steps. Continue searching for "L" candidates using horizontal and vertical scan lines offset from previous scan lines:
    1) Using a new horizontal scan line 3m_min above the centre horizontal scan line, repeat the process in step b)1), except starting from the offset from the centre point, and then b)3) through b)9). If there is no decode, proceed to the next step.
    2) Using a new vertical scan line 3m_min left of the centre vertical scan line, repeat the process in step b)2), except starting from the offset from the centre point, and then steps b)3) through b)9). If there is no decode, proceed to the next step.
    3) Repeat step 1) above except using a new horizontal scan line 3m_min below the centre horizontal scan line. If there is no decode, repeat step 2) above except using a new vertical scan line 3m_min right of the centre vertical scan line. If there is no decode, proceed to step 4) below.
    4) Continue processing horizontal and vertical scan lines as in steps 1) through 3) that are 3m_min above, then left, then below, then right of the previously processed scan lines until either a symbol is decoded or the boundary of the image is reached.
d) First assume that the candidate area contains a square symbol. If the area fails to decode as a square symbol, then try to find and decode a rectangular symbol starting from procedure j).
   For a square symbol, first plot a normalised graph of transitions for the equal sides of the candidate area in order to find the alternating module finder pattern:
    1) Project a line through the candidate area bisecting the interior angle of the two sides of the "L" found above as shown in Figure 9. Define the two equal areas formed by the bisecting line as the right side and the left side as viewed from the corner of the "L".
    2) For each side, form a line called a "search line" between a point d_min distance from the corner along the "L" line, parallel to the other "L" side line, and extending to the bisecting line as shown in Figure 9.
    3) Move each search line away from the corner of the "L" as shown in Figure 9, lengthening each line as it expands to span its two bounding lines, the "L" line and the bisecting line. Keep each search line parallel to the other "L" side line. As each side is moved by the size of an image pixel, count the number of black/white and white/black transitions, beginning and ending the count with transitions from the colour of the "L" side to the opposite colour. A transition from one colour to the other is to be counted only when the current search line as well as the search lines immediately above and below have the same colour, opposite to the previously counted transition colour. Plot the number of transitions multiplied by the length of the longest "L" side divided by the current length of the search line measured between the two bounding lines:

       T = (number of transitions) × ("L" max. line length) / (search line length)

       This formula normalises T to keep it from increasing because the line lengthens.
       Continue to calculate the T values until the search line is longer than the longest axis of the candidate area plus 50%.

       Figure 9 — Expanding search lines

    4) Form a plot of the T values for each side, where the Y-axis is the T value and the X-axis is the search line's distance from the corner of the "L".
       A sample plot is shown in Figure 10.

       Figure 10 — Example plot of T as the search line expands

    5) Starting from the T value with the smallest X in the right side's plot and then increasing X, find the first instance of a T_S value (T_S = maximum of zero and T - 1) that is less than 15% of the preceding local maximum T value, provided that T value is greater than 1. Increment this X value until the number of transitions stops decreasing. If the number of transitions does not increase, increment the X value once more. Refer to this X value as the valley. Increment the local maximum's X value until the number of transitions decreases and refer to this X as the peak. Refer to the average of the peak and valley X values as the descending line X value. The search line at the peak may correspond to an alternating finder pattern side. At the valley, the search line may correspond to the solid dark interior line or a light quiet zone.
    6) Find the peak and valley in the left side's plot whose descending line X value most closely matches the right peak and valley's descending line X value. If returning to this step from a later step, consider additional left peaks and valleys, ordered in terms of how closely they match the right peak and valley. However, any left peak and valley under consideration must be checked to ensure that the absolute difference between the right and left peak X values is less than 15% of the average of the two peak X values and that the absolute difference between the right and left valley X values is less than 15% of the average of the two valley X values. The 15% specifies the maximum allowed foreshortening.
    7) The right side's valley search line, the left side's valley search line, and the two sides of the "L" outline a possible symbol's data region. Process the data region according to step e). If the decode fails, find the next left peak and valley from step d)6).
       Once all left peaks and valleys have been discarded, discard the right side peak and valley and continue searching from step d)5) for the next right peak and valley.
e) For each of the two sides of the alternating pattern, find the line passing through the centre of the alternating light and dark modules:
    1) For each side, form a rectangular region bounded by the side's peak and valley search lines as the longer two sides of the rectangle, and the "L" side and the other side's valley search line as the shorter two sides, as shown in Figure 11.

       Figure 11 — Rectangular region construction

    2) Within the rectangular region, find pixel edge pairs on the outside boundary of teeth:
        i) Traverse test lines starting with and parallel to the valley line looking for transitions to the opposite colour normally orthogonal to the test line. Select only transitions that are either dark to light or light to dark where the first colour matches the predominant colour of the image along the valley line.
        ii) If the number of transitions found is less than 15% of the number of pixels comprising the valley line, and the test line is not the peak line, move the test line toward the peak line by nominally one pixel and repeat step i), now considering new transitions in addition to those already found. If the 15% criterion is met or the peak line is reached, continue to the next step, otherwise continue searching from step d)6) for the next left peak and valley.
        iii) Calculate a preliminary "best fit line" with linear regression using the points on the edge between the selected pixel pairs.
        iv) Discard the 25% of the points which are furthest from the preliminary "best fit line". Calculate a final "best fit line" with linear regression using the remaining 75% of points.
            This line should pass along the outside of the alternating pattern, shown as the "best fit line" in Figure 12.
    3) For each side, construct a line parallel to the step e)2) line which is offset toward the "L" corner by the perpendicular distance from the "L" corner to the peak search line divided by twice the number of transitions in the peak search line plus one:

       Offset = distance to the peak line / ((number of transitions + 1) * 2)

       Each of the two constructed lines should correspond to the mid-line of the alternating module pattern on that side, see Figure 12.

       Figure 12 — Alternating pattern module centre-line

f) For each side, measure the edge-to-edge distances in the alternating pattern:
    1) Bound the alternating pattern mid-line constructed in step e)3) by the adjacent "L" line and the other alternating pattern mid-line from step e)3). Call the length of this line M_d (see Figure 11).
    2) Along the bounded mid-line, measure the edge-to-edge distances between all the similar edges of all two-element pairs, i.e. dark/light and light/dark element pairs.
Begin and end the edge-to-edge measurements with edges transitioning from the “L” colour to the opposite colour.
  3) Select the median edge-to-edge measurement and set the current edge-to-edge measurement estimate, EE_Dist, to the median measurement.
  4) Discard all element pairs with edge-to-edge measurements that differ by more than 25% from EE_Dist.
g) For each side, find the centre points of the alternating pattern modules:
  1) Using the remaining element pair measurements from f)4), calculate the average ink spread (vertical or horizontal depending on the segment side) as the average of the element pairs’ ink spread, where bar is the dark element width and space is the light element width in a remaining element pair:
    ink_spread = Average ( (bar - ((bar + space) / 2)) / ((bar + space) / 2) )
  2) Calculate the centre of the bar in the median element pair using the following offset into the bar from the outside edge of the bar in the median pair:
    offset = (EE_Dist * (1 + ink_spread)) / 4
  If there is more than one median element pair, choose a single pair using the following process:
    i) Order the edges (excluding the “L” finder edge) by their distance from the “L” finder edge.
There is an odd number of these edges because the edges start and end on a dark to light transition going away from the “L” finder.
    ii) Call the middle edge in the list the centre edge.
    iii) Calculate the (odd number of) element pair edge-to-edge distances and find their median EE_Dist.
    iv) Select the one or more element pairs with length EE_Dist.
    v) Among those pairs identify the one or two element edge pairs that have an edge closest to the centre edge.
    vi) If there is still a tie, take the element pair that has the outer edge of the bar closest to the centre edge.
    vii) If there is still a tie, take the element pair that has an inner edge closest to the “L” finder.
  3) Starting from the centre of the bar in the median element pair from step f)3), and proceeding in the direction of the space in the element pair until reaching the end of the bounded mid-line, calculate each element’s centre, shown by the speckled pattern in Figure 13, by the following steps:
Figure 13 — Edge-to-edge measurements for finding an element centre
(While three bars and two spaces are shown in Figure 13, if a space is the element for which the centre is to be calculated, then the diagram would have three spaces instead of the bars and two bars instead of the spaces.
For light elements adjacent to the element at the end of the mid-line, either D1 or D4 measurements are omitted as they would fall outside the symbol’s or segment’s measurable element boundaries.)
    i) Calculate a point p1 along the mid-line which is EE_Dist/2 from the previously calculated element centre in the direction of the new element.
    ii) Calculate d1 through d4 where:
      d1 = D1 / 2
      d2 = D2
      d3 = D3
      d4 = D4 / 2
    iii) If one of the values d1 through d4 is within 25% of EE_Dist, select the one which is closest to EE_Dist, and set the new EE_Dist to be the average of the current EE_Dist and the selected d1 through d4 distance.
      I) If d1 or d4 is selected, select the corresponding D1 or D4 edge closest to the element, the centre of which is to be calculated. Offset this edge by (ink_spread/2) * (EE_Dist/2) in the appropriate direction (i.e., if ink_spread is positive, the offset will move the edge toward the space included in the distance D1 or D4 and if negative, the offset will move away from this space). Calculate a point p2 along the mid-line which is 0,75 times the selected d1 or d4 value from the offset edge and toward the element centre to be calculated.
      II) If d2 or d3 is selected, select the corresponding D2 or D3 edge closest to the element, the centre of which is to be calculated. Offset this edge by (ink_spread/2) * (EE_Dist/2) in the appropriate direction (i.e., if ink_spread is positive, the offset will move the edge toward the space included in the distance D2 or D3 and if negative, the offset will move away from this space).
Calculate a point p2 along the mid-line which is 0,25 times the selected d2 or d3 value from the offset edge and toward the element centre to be calculated.
      III) Set the element’s centre as halfway between p1 and p2.
    iv) Otherwise if none of the values d1 through d4 is within 25% of EE_Dist, leave EE_Dist at its current value, use p1 as the new element’s centre, and proceed to the next element.
  4) Starting from the bar in the median element pair, and proceeding in the opposite direction from step 3), until reaching the other end of the bounded mid-line, calculate each element’s centre, following the procedures in step 3).
h) If the number of modules in each side does not correspond to a valid first region, continue searching from step d)6) for the next left peak and valley. Otherwise plot the data module sampling grid in the data region by extending the alternating pattern module centres:
  1) Extend each side’s step e)3) mid-line and the opposite side’s “L” line to form the vanishing point of the two nearly parallel or parallel extended lines.
  2) Extend rays from each vanishing point passing through the step g) module centres of the nearly perpendicular step e)3) line.
  3) The intersection of the two sets of nearly perpendicular rays should correspond to the centres of the data modules in the data region, as shown in Figure 14.
Figure 14 — Module sampling grid construction
i) Continue to fill in the remaining data regions:
  1) When a data region is processed, form a new “L” for the next data section to the “left” or “above” using one of two processes:
    i) If the new data region is still bounded on one side by the original “L” from procedure b), repeat from procedure c) to process the new data region using the selected set of points from step e)2) and the set of points on the “L” from step b)2) which lie beyond the step e)2) line.
    ii) If the new data region is bounded on two sides by data regions, repeat from procedure c) to process the new data region using the selected set of points from step e)2) for each of the data regions which are adjacent to and bound the new region on two sides.
  2) If a data region does not match the number of modules in previously processed regions, trim the symbol to the largest number of regions which correspond to a legal symbol.
  3) Decode the symbol with its one or more data regions starting with procedure k).
  4) If the current data region exhausts its last peak and valley, revert to the previous data region and continue searching from step d)6) for the next left peak and valley in that data region.
j) Find the data sections of a rectangular symbol.
  1) For each side of the “L” move a line perpendicular to the side and scanning along the length of the other side of the “L”. Keep each search line parallel to the other “L” side line. As each side is moved by the size of an image pixel, count the number of black/white and white/black transitions, beginning and ending the count with transitions from the colour of the “L” side to the opposite colour. A transition from one colour to the other is to be counted only when the current search line as well as the search lines immediately above and below have the same colour, opposite to the previously counted transition colour. As each side is moved by a pixel, plot the number of transitions, T. Continue until the parallel line moves further than the perpendicular leg of the “L” plus 10%.
  2) Starting from the origin of the plot, for each direction, find the first instance of a T_S value (T_S = maximum of zero and T - 1) that is less than 15% of the preceding local maximum T value, provided that T value is greater than 1. Increment this X value until the T value stops decreasing. If the T value does not increase, increment the X value once more.
Refer to this X value as the valley. Increment the local maximum’s X value until the T value decreases and refer to this X as the peak. Refer to the average of the peak and valley X values as the descending line X value. The valley line at this point may form a side of a symbol or data region.
  3) Find the alternating pattern lines for each side of the region similar to procedure e).
  4) Plot the module sample grid in the data region or symbol as in procedures f), g), and h).
  5) If the data region defined is not a valid rectangular symbol, try to form a new data region using further valid peak to valley plot transitions.
  6) Process any additional regions as in procedure i).
  7) If a valid data region or two regions are detected, attempt to decode the symbol as in procedures k) and l). If the region(s) were not valid or the decode fails, disregard the candidate area.
k) If the number of data modules is even or the symbol forms a valid rectangular symbol, decode the symbol using Reed-Solomon error correction:
  1) Sample the data modules at their predicted centres. Black at the centre is a one and white is a zero.
  2) Convert the eight module samples in the defined codeword patterns into 8-bit symbol character values.
  3) Apply Reed-Solomon error correction to the symbol character values.
  4) Decode the symbol characters into data characters according to the specified encodation schemes.
l) Otherwise the number of data modules is odd, so decode the symbol using convolution code error correction:
  1) Sample the data modules at their predicted centres. Black at the centre is a one and white is a zero.
  2) Apply the black/white balancing mask.
  3) Use the bit ordering table to convert the data into a bit stream.
  4) Apply the appropriate convolution code error correction.
  5) Convert the bit stream to data characters according to the encodation scheme specified.
  6) Verify that the CRC is correct.
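The numeric parts of procedures e) and g) can be sketched in Python as follows. This is only an illustration under stated assumptions, not text from the standard: the function names (fit_line, trimmed_best_fit, midline_offset, ink_spread, bar_centre_offset) are hypothetical, edge points are assumed to be (x, y) tuples, "furthest" is taken to mean perpendicular distance, and the least-squares fit assumes the edge is not vertical in the chosen coordinate frame.

```python
from statistics import mean


def fit_line(points):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    b = sxy / sxx  # assumption: the edge is not vertical in this frame
    return my - b * mx, b


def trimmed_best_fit(points, discard_fraction=0.25):
    """Steps e)2)iii-iv: preliminary fit, drop the furthest 25%, refit."""
    a, b = fit_line(points)
    norm = (1 + b * b) ** 0.5  # converts vertical residual to perpendicular distance
    by_distance = sorted(points, key=lambda p: abs(p[1] - (a + b * p[0])) / norm)
    keep = len(points) - int(len(points) * discard_fraction)
    return fit_line(by_distance[:keep])


def midline_offset(distance_to_peak_line, transitions):
    """Step e)3: offset of the module mid-line toward the "L" corner."""
    return distance_to_peak_line / ((transitions + 1) * 2)


def ink_spread(pairs):
    """Step g)1: mean relative deviation of bar width from half the pair width.

    pairs is an iterable of (bar, space) widths for the remaining element pairs.
    """
    return mean((bar - (bar + space) / 2) / ((bar + space) / 2) for bar, space in pairs)


def bar_centre_offset(ee_dist, spread):
    """Step g)2: distance from the outside edge of the bar to the bar centre."""
    return ee_dist * (1 + spread) / 4
```

With equal bar and space widths the ink spread is zero and the bar-centre offset reduces to EE_Dist / 4, i.e. half of a nominal module width, which is a quick sanity check on the formula in step g)2).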
CS4525_08 Chinese Documentation
Preliminary Product Information
This document contains information for a new product. Cirrus Logic reserves the right to modify this product without notice.

30W Digital Audio Amplifier with Integrated ADC

Digital Amplifier Features
Fully Integrated Power MOSFETs
No Heatsink Required
  – Programmable Power Foldback on Thermal Warning
  – High Efficiency
>100dB Dynamic Range
<0.1% THD+N @ 1W
Configurable Outputs (10% THD+N)
  – 1 x 30W into 4Ω, Parallel Full-Bridge
  – 2 x 15W into 8Ω, Full-Bridge
  – 2 x 7W into 4Ω, Half-Bridge + 1 x 15W into 8Ω, Full-Bridge
Built-In Protection with Error Reporting
  – Overcurrent / Undervoltage / Thermal Overload Shutdown
  – Thermal Warning Reporting
PWM Popguard® for Half-Bridge Mode
Click-free Start-up
Programmable Channel Delay for System Noise & Radiated Emissions Management

ADC Features
Stereo, 24-bit, 48kHz Conversion
Multi-bit Architecture
95dB Dynamic Range (A-wtd)
-86dB THD+N
Supports 2Vrms Input with Passive Components

System Features
Asynchronous 2-channel Digital Serial Port
32kHz to 96kHz Input Sample Rates
Operation with On-chip Oscillator Driver or Applied SYS_CLK at 18.432, 24.576 or 27.000MHz
Integrated Sample Rate Converter (SRC)
  – Eliminates Clock-jitter Effects
  – Input Sample Rate Independent Operation
  – Simplifies System Integration
Spread Spectrum PWM Modulation
  – Reduces EMI Radiated Energy
Low Quiescent Current

Software Mode System Features
Digital Audio Processing
  – 5 Programmable Parametric EQ Filters
  – Selectable High-pass Filter
  – Bass/Treble Tone Control
  – Adaptive Loudness Compensation
  – 2-channel Mixer
  – 2.1 Bass Management
  – 24dB/octave Linkwitz-Riley Crossover Filters
  – De-emphasis Filter
Selectable Serial Audio Interface Formats
  – Left-justified up to 24-bit
  – I²S up to 24-bit
  – Right-justified 16-, 18-, 20-, 24-bits
Digital Serial Connection to Additional CS4525 or DACs for Subwoofer
Digital Interface to External Lip-sync Delay
PWM Switch Rate Shifting Eliminates AM Frequency Interference
Digital Volume Control with Soft Ramp
  – +24 to -103dB in 0.5dB steps
Programmable Peak Detect and Limiter
2-Channel Logic-level PWM Output
  – Programmable Channel Mapping
  – Can Drive an External PWM Amplifier, Headphone Amplifier, or Line-out Amplifier
  – Integrated Headphone Detection
Flexible Power Output Configurations
Thermal Foldback for Interruption-free Power-stage Protection
  – Supports Internal and External Power Stages
Operation from On-chip Oscillator Driver or Applied Systems Clock
Supports I²C® Host Control Interface

Hardware Mode System Features
2-Channel Stereo Full Bridge Power Outputs
Analog and Digital Inputs
I²S and Left-justified Serial Input Formats
Thermal Foldback for Interruption-free Protection of Internal Power Stage
Operation from Applied Systems Clock
External Mute Input

Common Applications
Integrated Digital TVs
Flat Panel TV Monitors
Computer/TV Monitors
Mini/Micro Shelf Systems
Digital Powered Speakers
Portable Docking Stations
Computer Desktop Audio

General Description
The CS4525 is a stereo analog or digital input PWM high efficiency Class D amplifier audio system with an integrated stereo analog-to-digital (A/D) converter. The stereo power amplifiers can deliver up to 15W per channel into 8 Ω speakers from a small space-saving 48-pin QFN package. The PWM amplifier can achieve greater than 85% efficiency. The package is thermally enhanced for optimal heat dissipation which eliminates the need for a heatsink.

The power stage outputs can be configured as two full-bridge channels for 2x15W operation, two half-bridge channels and one full-bridge channel for 2x7W+1x15W operation, or one parallel full-bridge channel for 1x30W operation. The CS4525 integrates on-chip over-current, under-voltage, and over-temperature protection and error reporting as well as a thermal warning indicator and programmable foldback of the output power to allow cooling.

The main digital serial port on the CS4525 can support asynchronous operation with the integrated on-chip sample rate converter (SRC) which eases system integration.
The SRC allows for a fixed PWM switching frequency regardless of incoming sample rate as well as optimal clocking for the A/D modulators.

An on-chip oscillator driver eliminates the need for an external crystal oscillator circuit, reducing overall design cost and conserving circuit board space. The CS4525 automatically uses the on-chip oscillator driver in the absence of an applied master clock.

The CS4525 is available in a 48-pin QFN package in Commercial grade (0° to +70° C). The CRD4525-Q1 4-layer, 1oz. copper and CRD4525-D1 2-layer, 1oz. copper customer reference designs are also available. Please refer to Section 14, “Ordering Information”, for complete ordering information.

TABLE OF CONTENTS
1. PIN DESCRIPTIONS - SOFTWARE MODE
2. PIN DESCRIPTIONS - HARDWARE MODE
2.1 Digital I/O Pin Characteristics
3. TYPICAL CONNECTION DIAGRAMS
4. TYPICAL SYSTEM CONFIGURATION DIAGRAMS
5. CHARACTERISTICS AND SPECIFICATIONS
6. APPLICATIONS
6.1 Software Mode
6.1.1 System Clocking
6.1.1.1 SYS_CLK Input Clock Mode
6.1.1.2 Crystal Oscillator Mode
6.1.2 Power-Up and Power-Down
6.1.2.1 Power-Up Sequence
6.1.2.2 Power-Down Sequence
6.1.3 Input Source Selection
6.1.4 Digital Sound Processing
6.1.4.1 Pre-Scaler
6.1.4.2 Digital Signal Processing High-Pass Filter
6.1.4.3 Channel Mixer
6.1.4.4 De-Emphasis
6.1.4.5 Tone Control
6.1.4.6 Parametric EQ
6.1.4.7 Adaptive Loudness Compensation
6.1.4.8 Bass Management
6.1.4.9 Volume and Muting Control
6.1.4.10 Peak Signal Limiter
6.1.4.11 Thermal Limiter
6.1.4.12 Thermal Foldback
6.1.4.13 2-Way Crossover & Sensitivity Control
6.1.5 Auxiliary Serial Output
6.1.6 Serial Audio Delay & Warning Input Port
6.1.6.1 Serial Audio Delay Interface
6.1.6.2 External Warning Input Port
6.1.7 Powered PWM Outputs
6.1.7.1 Output Channel Configurations
6.1.7.2 PWM Popguard Transient Control
6.1.8 Logic-Level PWM Outputs
6.1.8.1 Recommended PWM_SIG Power-Up Sequence for an External PWM Amplifier
6.1.8.2 Recommended PWM_SIG Power-Down Sequence for an External PWM Amplifier
6.1.8.3 Recommended PWM_SIG Power-Up Sequence for Headphone & Line-Out
6.1.8.4 Recommended PWM_SIG Power-Down Sequence for Headphone & Line-Out
6.1.8.5 PWM_SIG Logic-Level Output Configurations
6.1.9 PWM Modulator Configuration
6.1.9.1 PWM Channel Delay
6.1.9.2 PWM AM Frequency Shift
6.1.10 Headphone Detection & Hardware Mute Input
6.1.11 Interrupt Reporting
6.1.12 Automatic Power Stage Shut-Down
6.2 Hardware Mode
6.2.1 System Clocking
6.2.2 Power-Up and Power-Down
6.2.2.1 Power-Up Sequence
6.2.2.2 Power-Down Sequence
6.2.3 Input Source Selection
6.2.4 PWM Channel Delay
6.2.5 Digital Signal Flow
6.2.5.1 High-Pass Filter
6.2.5.2 Mute Control
6.2.5.3 Warning and Error Reporting
6.2.6 Thermal Foldback
6.2.7 Automatic Power Stage Shut-Down
6.3 PWM Modulators and Sample Rate Converters
6.4 Output Filters
6.4.1 Half-Bridge Output Filter
6.4.2 Full-Bridge Output Filter (Stereo or Parallel)
6.5 Analog Inputs
6.6 Serial Audio Interfaces
6.6.1 I²S Data Format
6.6.2 Left-Justified Data Format
6.6.3 Right-Justified Data Format
6.7 Integrated VD Regulator
6.8 I²C Control Port Description and Timing
7. PCB LAYOUT CONSIDERATIONS
7.1 Power Supply, Grounding
7.2 Output Filter Layout
7.3 QFN Thermal Pad
8. REGISTER QUICK REFERENCE
9. REGISTER DESCRIPTIONS
9.1 Clock Configuration (Address 01h)
9.1.1 SYS_CLK Output Enable (EnSysClk)
9.1.2 SYS_CLK Output Divider (DivSysClk)
9.1.3 Clock Frequency (ClkFreq[1:0])
9.1.4 HP_Detect/Mute Pin Active Logic Level (HP/MutePol)
9.1.5 HP_Detect/Mute Pin Mode (HP/Mute)
9.1.6 Modulator Phase Shifting (PhaseShift)
9.1.7 AM Frequency Shifting (FreqShift)
9.2 Input Configuration (Address 02h)
9.2.1 Input Source Selection (ADC/SP)
9.2.2 ADC High-Pass Filter Enable (EnAnHPF)
9.2.3 Serial Port Sample Rate (SPRate[1:0]) - Read Only
9.2.4 Input Serial Port Digital Interface Format (DIF[2:0])
9.3 AUX Port Configuration (Address 03h)
9.3.1 Enable Aux Serial Port (EnAuxPort)
9.3.2 Delay & Warning Port Configuration (DlyPortCfg[1:0])
9.3.3 Aux/Delay Serial Port Digital Interface Format (AuxI²S/LJ)
9.3.4 Aux Serial Port Right Channel Data Select (RChDSel[1:0])
9.3.5 Aux Serial Port Left Channel Data Select (LChDSel[1:0])
9.4 Output Configuration (Address 04h)
9.4.1 Output Configuration (OutputCfg[1:0])
9.4.2 PWM Signals Output Data Select (PWMDSel[1:0])
9.4.3 Channel Delay Settings (OutputDly[3:0])
9.5 Foldback and Ramp Configuration (Address 05h)
9.5.1 Select VP Level (SelectVP)
9.5.2 Enable Thermal Foldback (EnTherm)
9.5.3 Lock Foldback Adjust (LockAdj)
9.5.4 Foldback Attack Delay (AttackDly[1:0])
9.5.5 Enable Foldback Floor (EnFloor)
9.5.6 Ramp Speed (RmpSpd[1:0])
9.6 Mixer / Pre-Scale Configuration (Address 06h)
9.6.1 Pre-Scale Attenuation (PreScale[2:0])
9.6.2 Right Channel Mixer (RChMix[1:0])
9.6.3 Left Channel Mixer (LChMix[1:0])
9.7 Tone Configuration (Address 07h)
9.7.1 De-Emphasis Control (DeEmph)
9.7.2 Adaptive Loudness Compensation Control (Loudness)
9.7.3 Digital Signal Processing High-Pass Filter (EnDigHPF)
9.7.4 Treble Corner Frequency (TrebFc[1:0])
9.7.5 Bass Corner Frequency (BassFc[1:0])
9.7.6 Tone Control Enable (EnToneCtrl)
9.8 Tone Control (Address 08h)
9.8.1 Treble Gain Level (Treb[3:0])
9.8.2 Bass Gain Level (Bass[3:0])
9.9 2.1 Bass Manager/Parametric EQ Control (Address 09h)
9.9.1 Freeze Controls (Freeze)
9.9.2 Hi-Z PWM_SIG Outputs (HiZPSig)
9.9.3 Bass Cross-Over Frequency (BassMgr[2:0])
9.9.4 Enable Channel B Parametric EQ (EnChBPEq)
9.9.5 Enable Channel A Parametric EQ (EnChAPEq)
9.10 Volume and 2-Way Cross-Over Configuration (Address 55h)
9.10.1 Soft Ramp and Zero Cross Control (SZCMode[1:0])
9.10.2 Enable 50% Duty Cycle for Mute Condition (Mute50/50)
9.10.3 Auto-Mute (AutoMute)
9.10.4 Enable 2-Way Crossover (En2Way)
9.10.5 2-Way Cross-Over Frequency (2WayFreq[2:0])
9.11 Channel A & B: 2-Way Sensitivity Control (Address 56h)
9.11.1 Channel A and Channel B Low-Pass Sensitivity Adjust (LowPass[3:0])
9.11.2 Channel A and Channel B High-Pass Sensitivity Adjust (HighPass[3:0])
9.12 Master Volume Control (Address 57h)
9.12.1 Master Volume Control (MVol[7:0])
9.13 Channel A and B Volume Control (Address 58h & 59h)
9.13.1 Channel X Volume Control (ChXVol[7:0])
9.14 Sub Channel Volume Control (Address 5Ah)
9.14.1 Sub Channel Volume Control (SubVol[7:0])
9.15 Mute/Invert Control (Address 5Bh)
9.15.1 ADC Invert Signal Polarity (InvADC)
9.15.2 Invert Channel PWM Signal Polarity (InvChX)
9.15.3 Invert Sub PWM Signal Polarity (InvSub)
9.15.4 ADC Channel Mute (MuteADC)
9.15.5 Independent Channel A & B Mute (MuteChX)
9.15.6 Sub Channel Mute (MuteSub)
9.16 Limiter Configuration 1 (Address 5Ch)
9.16.1 Maximum Threshold (Max[2:0])
9.16.2 Minimum Threshold (Min[2:0])
9.16.3 Peak Signal Limit All Channels (LimitAll)
9.16.4 Peak Detect and Limiter Enable (EnLimiter)
9.17 Limiter Configuration 2 (Address 5Dh)
9.17.1 Limiter Release Rate (RRate[5:0])
9.18 Limiter Configuration 3 (Address 5Eh)
9.18.1 Enable Thermal Limiter (EnThLim)
9.18.2 Limiter Attack Rate (ARate[5:0])
9.19.1 Automatic Power Stage Retry (AutoRetry)
9.19.2 Enable Over-Current Protection (EnOCProt)
9.19.3 Select VD Level (SelectVD)
9.19.4 Power Down ADC (PDnADC)
9.19.5 Power Down PWM Power Output X (PDnOutX)
9.19.6 Power Down (PDnAll)
9.20 Interrupt (Address 60h)
9.20.1 SRC Lock State Transition Interrupt (SRCLock)
9.20.2 ADC Overflow Interrupt (ADCOvfl)
9.20.3 Channel Overflow Interrupt (ChOvfl)
9.20.4 Amplifier Error Interrupt Bit (AmpErr)
9.20.5 Mask for SRC State (SRCLockM)
9.20.6 Mask for ADC Overflow (ADCOvflM)
9.20.7 Mask for Channel X and Sub Overflow (ChOvflM)
9.20.8 Mask for Amplifier Error (AmpErrM)
9.21 Interrupt Status (Address 61h) - Read Only
9.21.1 SRC State Transition (SRCLockSt)
9.21.2 ADC Overflow (ADCOvflSt)
9.21.3 Sub Overflow (SubOvflSt)
9.21.4 Channel X Overflow (ChXOvflSt)
9.21.5 Ramp-Up Cycle Complete (RampDone)
9.22 Amplifier Error Status (Address 62h) - Read Only
9.22.1 Over-Current Detected On Channel X (OverCurrX)
9.22.2 External Amplifier State (ExtAmpSt)
9.22.3 Under Voltage / Thermal Error State (UVTE[1:0])
9.23 Device I.D. and Revision (Address 63h) - Read Only
9.23.1 Device Identification (DeviceID[4:0])
9.23.2 Device Revision (RevID[2:0])
10. PARAMETER DEFINITIONS
11. REFERENCES
12. PACKAGE DIMENSIONS
13. THERMAL CHARACTERISTICS
13.1 Thermal Flag
14. ORDERING INFORMATION
15. REVISION HISTORY

LIST OF FIGURES
Figure 1. Typical Connection Diagram - Software Mode
Figure 2. Typical Connection Diagram - Hardware Mode
Figure 3. Typical System Configuration 1
Figure 4. Typical System Configuration 2
Figure 5. Typical System Configuration 3
Figure 6. Typical System Configuration 4
Figure 7. Serial Audio Input Port Timing
Figure 8. AUX Serial Port Interface Master Mode Timing
Figure 9. SYS_CLK Timing from Reset
Figure 10. PWM_SIGX Timing
Figure 11. Control Port Timing - I²C
Figure 12. Typical SYS_CLK Input Clocking Configuration
Figure 13. Typical Crystal Oscillator Clocking Configuration
Figure 14. Digital Signal Flow
Figure 15. De-Emphasis Filter
Figure 17. Peak Signal Detection & Limiting
Figure 18. Foldback Process
Figure 19. Popguard Connection Diagram
Figure 20. 2-Channel Full-Bridge PWM Output Delay
Figure 21. 3-Channel PWM Output Delay
Figure 22. Typical SYS_CLK Input Clocking Configuration
Figure 23. Hardware Mode PWM Output Delay
Figure 24. Hardware Mode Digital Signal Flow
Figure 25. Foldback Process
Figure 26. Output Filter - Half-Bridge
Figure 27. Output Filter - Full-Bridge
Figure 28. Recommended Unity Gain Input Filter
Figure 29. Recommended 2V RMS Input Filter
Figure 30. I²S Serial Audio Formats
Figure 31. Left-Justified Serial Audio Formats
Figure 32. Right-Justified Serial Audio Formats
Figure 33. Control Port Timing, I²C Write
Figure 34. Control Port Timing, I²C Read

LIST OF TABLES
Table 1. I/O Power Rails
Table 2. Bass Shelving Filter Corner Frequencies
Table 3. Treble Shelving Filter Corner Frequencies
Table 4. Bass Management Cross-Over Frequencies
Table 5. 2-Way Cross-Over Frequencies
Table 6. Auxiliary Serial Port Data Output
Table 7. Nominal Switching Frequencies of the Auxiliary Serial Output
Table 8. PWM Power Output Configurations
Table 9. Typical Ramp Times for Various VP Voltages
Table 10. PWM Logic-Level Output Configurations
Table 11. PWM Output Switching Rates and Quantization Levels
Table 12. Output of PWM_SIG Outputs
Table 13. SYS_CLK Frequency Selection
Table 14. Input Source Selection
Table 15. Serial Audio Interface Format Selection
Table 16. Thermal Foldback Enable Selection
Table 17. PWM Output Switching Rates and Quantization Levels
Table 18. Low-Pass Filter Components - Half-Bridge
Table 19. DC-Blocking Capacitors Values - Half-Bridge
Table 20. Low-Pass Filter Components - Full-Bridge
Table 21. Power Supply Configuration and Settings

1. PIN DESCRIPTIONS - SOFTWARE MODE
Pin Name | Pin # | Pin Description
INT | 1 | Interrupt (Output) - Indicates an interrupt condition has occurred.
SCL | 2 | Serial Control Port Clock (Input) - Serial clock for the I²C control port.
SDA | 3 | Serial Control Data (Input/Output) - Bi-directional data I/O for the I²C control port.
LRCK | 4 | Left Right Clock (Input) - Determines which channel, Left or Right, is currently active on the serial audio data line.
SCLK | 5 | Serial Clock (Input) - Serial bit clock for the serial audio interface.
SDIN | 6 | Serial Audio Data Input (Input) - Input for two’s complement serial audio data.
HP_DETECT/MUTE | 7 | Headphone Detect / Mute (Input) - Headphone detection or mute input signal as configured via the I²C control port.
RST | 8 | Reset (Input) - The device enters a low power mode and all internal registers are reset to their default settings when this pin is driven low.
LVD | 9 | VD Voltage Level Indicator (Input) - Identifies the voltage level attached to VD. When applying 5.0V to VD, LVD must be connected to VD.
When applying 2.5V or 3.3V to VD, LVD must be connected to DGND.
DGND | 10 | Digital Ground (Input) - Ground for the internal logic and digital I/O.
VD_REG | 11 | Core Logic Power (Output) - Internally generated low voltage power supply for digital logic.
VD | 12 | Power (Input) - Positive power supply for the internal regulators and digital I/O.
VA_REG | 13 | Analog Power (Output) - Internally generated positive power for the analog section and I/O.
AGND | 14 | Analog Ground (Input) - Ground reference for the internal analog section and I/O.
FILT+ | 15 | Positive Voltage Reference (Output) - Positive reference voltage for the internal ADC sampling circuits.
VQ | 16 | Common Mode Voltage (Output) - Filter connection for internal common mode voltage.
AFILTL, AFILTR | 17, 18 | Antialias Filter Connection (Output) - Antialias filter connection for ADC inputs.
AINL, AINR | 19, 20 | Analog Input (Input) - The full-scale input level is specified in the ADC Analog Characteristics specification table.
OCREF | 21 | Over Current Reference Setting (Input) - Sets the reference for over current detection.
PGND | 22, 23, 27, 28, 33, 34, 37, 38 | Power Ground (Input) - Ground for the individual output power half-bridge devices.
RAMP_CAP | 24 | Output Ramp Capacitor (Input) - Used by the PWM Popguard Transient Control to suppress the initial pop in half-bridge-configured outputs.
VP | 25, 30, 31, 36 | High Voltage Power (Input) - High voltage power supply for the individual half-bridge devices.
OUT4, OUT3, OUT2, OUT1 | 26, 29, 32, 35 | PWM Output (Output) - Amplified PWM power outputs.
PWM_SIG2, PWM_SIG1 | 39, 40 | Logic Level PWM Output (Output) - Logic Level PWM switching signals.
DLY_SDOUT | 41 | Delay Serial Audio Data Out (Output) - Output for two’s complement serial audio data.
DLY_SDIN/EX_TWR | 42 | Delay Serial Audio Data Input (Input) - Input for two’s complement serial audio data. External Thermal Warning (Input) - Input for an external thermal warning signal.
Configurable via the I²C control port.
AUX_SDOUT | 43 | Auxiliary Port Serial Audio Data Out (Output) - Output for two’s complement auxiliary port serial data.
AUX_SCLK | 44 | Auxiliary Port Serial Clock (Output) - Serial clock for the auxiliary port serial interface.
AUX_LRCK/AD0 | 45 | Auxiliary Port Left Right Clock (Output) - Determines which channel, Left or Right, is currently active on the serial audio data line. AD0 (Input) - Sets the LSB of the I²C device address. Sensed on the release of RST.
SYS_CLK | 46 | System Clock (Input/Output) - Clock source for the internal logic, processing, and modulators. This pin should be connected through a 10kΩ resistor to ground when unused.
XTO | 47 | Crystal Oscillator Output (Output) - Crystal oscillator driver output.
XTI | 48 | Crystal Oscillator Input (Input) - Crystal oscillator driver input.
Thermal Pad | - | Thermal Pad - Thermal relief pad for optimized heat dissipation. See Section 7.3, “QFN Thermal Pad”, for more information.

2. PIN DESCRIPTIONS - HARDWARE MODE
Pin Name | Pin # | Pin Description
CLK_FREQ0, CLK_FREQ1 | 1, 2 | Clock Frequency (Input) - Determines the frequency of the clock expected to be driven into the SYS_CLK pin. CLK_FREQ1 must be connected to DGND.
ADC/SP | 3 | ADC/Serial Port (Input) - Selects between the Analog to Digital Converter and the Serial Port for audio input.
Selects the ADC when high or the serial port when low.
LRCK | 4 | Left Right Clock (Input) - Determines which channel, Left or Right, is currently active on the serial audio data line.
SCLK | 5 | Serial Clock (Input) - Serial bit clock for the serial audio interface.
SDIN | 6 | Serial Audio Data Input (Input) - Input for two’s complement serial audio data.
MUTE | 7 | Mute (Input) - The PWM outputs will output silence as a 50% duty cycle signal when this pin is driven low.
RST | 8 | Reset (Input) - The device enters a low power mode and all internal registers are reset to their default settings when this pin is driven low.
LVD | 9 | VD Voltage Level Indicator (Input) - Identifies the voltage level attached to VD. When applying 5.0V to VD, LVD must be connected to VD. When applying 2.5V or 3.3V to VD, LVD must be connected to DGND.
DGND | 10 | Digital Ground (Input) - Ground for the internal logic and I/O.
VD_REG | 11 | Core Logic Power (Output) - Internally generated low voltage power supply for digital logic.
VD | 12 | Digital Power (Input) - Positive power supply for the internal regulators and digital I/O.
VA_REG | 13 | Analog Power (Output) - Internally generated positive power for the analog section and I/O.
AGND | 14 | Analog Ground (Input) - Ground reference for the internal analog section and I/O.
FILT+ | 15 | Positive Voltage Reference (Output) - Positive reference voltage for the internal ADC sampling circuits.
VQ | 16 | Common Mode Voltage (Output) - Filter connection for internal common mode voltage.
AFILTL, AFILTR | 17, 18 | Antialias Filter Connection (Output) - Antialias filter connection for ADC inputs.
AINL, AINR | 19, 20 | Analog Input (Input) - The full-scale input level is specified in the ADC Analog Characteristics specification table.
OCREF | 21 | Over Current Reference Setting (Input) - Sets the reference for over current detection.
PGND | 22, 23, 27, 28, 33, 34, 37, 38 | Power Ground (Input) - Ground for the individual output power half-bridge devices.
RAMP_CAP | 24 | Output Ramp Capacitor (Input) - This pin should be connected directly to VP in hardware mode.
VP | 25, 30, 31, 36 | High Voltage Power (Input) - High voltage power supply for the individual half-bridge devices.
OUT4, OUT3, OUT2, OUT1 | 26, 29, 32, 35 | PWM Output (Output) - Amplified PWM power outputs.
TSTO | 39, 40 | Test Output (Output) - These pins are outputs used for the Logic Level PWM switching signals available only in software mode. They must be left unconnected for hardware mode operation.
TWR | 41 | Thermal Warning Output (Output) - Thermal warning output.
ERRUVTE | 42 | Thermal and Undervoltage Error Output (Output) - Error flag for thermal shutdown and undervoltage.
ERROC | 43 | Overcurrent Error Output (Output) - Overcurrent error flag.
EN_TFB | 44 | Enable Thermal Feedback (Input) - Enables the thermal foldback feature when high.
I²S/LJ | 45 | I²S/Left Justified (Input) - Selects between I²S and Left-Justified data format for the serial input port. Selects I²S when high and LJ when low.
SYS_CLK | 46 | System Clock (Input/Output) - Clock source for the delta-sigma modulators.
TSTO | 47 | Test Output (Output) - This pin is an output used for the crystal oscillator driver available only in software mode.
It must be left unconnected for normal hardware mode operation.
TSTI (pin 48): Test Input (Input) - This pin is an input used for the crystal oscillator driver available only in software mode. It must be tied to digital ground for normal hardware mode operation.
Thermal Pad (-): Thermal Pad - Thermal relief pad for optimized heat dissipation. See “QFN Thermal Pad” on page 65 for more information.

2.1 Digital I/O Pin Characteristics
The logic level for each input is set by its corresponding power supply and should not exceed the maximum ratings.

Power Supply | Pin Number | Pin Name | I/O | Driver | Receiver
Software Mode
VD | 1 | INT | Output | 2.5 V to 5.0 V, Open Drain | -
VD | 2 | SCL | Input | - | 2.5 V to 5.0 V, with Hysteresis
VD | 3 | SDA | Input/Output | 2.5 V to 5.0 V, Open Drain | 2.5 V to 5.0 V, with Hysteresis
VD | 7 | HP_DETECT | Input | - | 2.5 V to 5.0 V
VD | 7 | MUTE | Input | - | 2.5 V to 5.0 V
VD | 41 | DLY_SDOUT | Output | 2.5 V to 5.0 V, CMOS | -
VD | 42 | DLY_SDIN | Input | - | 2.5 V to 5.0 V
VD | 42 | EX_TWR | Input | - | 2.5 V to 5.0 V
VD | 43 | AUX_SDOUT | Output | 2.5 V to 5.0 V, CMOS | -
VD | 44 | AUX_SCLK | Output | 2.5 V to 5.0 V, CMOS | -
VD | 45 | AUX_LRCK | Output | 2.5 V to 5.0 V, CMOS | -
VD_REG | 39 | PWM_SIG2 | Output | 2.5 V, CMOS | -
VD_REG | 40 | PWM_SIG1 | Output | 2.5 V, CMOS | -
Hardware Mode
VD | 1 | CLK_FREQ0 | Input | - | 2.5 V to 5.0 V
VD | 2 | CLK_FREQ1 | Input | - | 2.5 V to 5.0 V
VD | 3 | ADC/SP | Input | - | 2.5 V to 5.0 V
VD | 7 | MUTE | Input | - | 2.5 V to 5.0 V
VD | 41 | TWR | Output | 2.5 V to 5.0 V, Open Drain | -
VD | 42 | ERRUVTE | Output | 2.5 V to 5.0 V, Open Drain | -
VD | 43 | ERROC | Output | 2.5 V to 5.0 V, Open Drain | -
VD | 44 | EN_TFB | Input | - | 2.5 V to 5.0 V
VD | 45 | I²S/LJ | Input | - | 2.5 V to 5.0 V
All Modes
VD | 4 | LRCK | Input | - | 2.5 V to 5.0 V
VD | 5 | SCLK | Input | - | 2.5 V to 5.0 V
VD | 6 | SDIN | Input | - | 2.5 V to 5.0 V
VD | 8 | RST | Input | - | 2.5 V to 5.0 V
VD | 9 | LVD | Input | - | 2.5 V to 5.0 V
VD | 46 | SYS_CLK | Input/Output | 2.5 V to 5.0 V, CMOS | 2.5 V to 5.0 V
VP | 26 | OUT4 | Output | 8.0 V to 18.0 V Power MOSFET | -
VP | 29 | OUT3 | Output | 8.0 V to 18.0 V Power MOSFET | -
VP | 32 | OUT2 | Output | 8.0 V to 18.0 V Power MOSFET | -
VP | 35 | OUT1 | Output | 8.0 V to 18.0 V Power MOSFET | -
Table 1. I/O Power Rails

3. TYPICAL CONNECTION DIAGRAMS
Figure 1. Typical Connection Diagram - Software Mode
Figure 2. Typical Connection Diagram - Hardware Mode

4. TYPICAL SYSTEM CONFIGURATION DIAGRAMS
Figure 3. Typical System Configuration 1
Figure 4. Typical System Configuration 2
Figure 5. Typical System Configuration 3
Figure 6.
Typical System Configuration 4

5. CHARACTERISTICS AND SPECIFICATIONS

RECOMMENDED OPERATING CONDITIONS
AGND = DGND = PGND = 0 V; all voltages with respect to ground.

Parameters | Symbol | Min | Nom | Max | Units
DC Power Supply
Digital and Analog Core (Note 1) | VD | 2.375 | 2.5 | 2.625 | V
Digital and Analog Core (Note 1) | VD | 3.135 | 3.3 | 3.465 | V
Digital and Analog Core (Note 1) | VD | 4.75 | 5.0 | 5.25 | V
Amplifier Outputs | VP | 8.0 | - | 18.0 | V
Temperature
Ambient Temperature | TA | 0 | - | +70 | °C
Junction Temperature | TJ | | - | +125 | °C

Notes:
1. For VD = 2.5 V, VA_REG and VD_REG must be connected to VD. See section 6.7 on page 63 for details.

ABSOLUTE MAXIMUM RATINGS
AGND = DGND = PGND = 0 V; all voltages with respect to ground.

Parameters | Symbol | Min | Max | Units
DC Power Supply
Power Stage, Outputs Switching and Under Load | VP | -0.3 | 19.8 | V
Power Stage, No Output Switching | VP | -0.3 | 23.0 | V
Digital and Analog Core | VD | -0.3 | 6.0 | V
Inputs
Input Current (Note 2) | Iin | - | ±10 | mA
Analog Input Voltage (Note 3) | VINA | AGND - 0.7 | VA_REG + 0.7 | V
Digital Input Voltage (Note 3) | VIND | -0.3 | VD + 0.4 | V
Temperature
Ambient Operating Temperature - Power Applied, Commercial | TA | -20 | +85 | °C
Storage Temperature | Tstg | -65 | +150 | °C

WARNING: Operation at conditions beyond the Recommended Operating Conditions may affect device reliability, and functional operation beyond Recommended Operating Conditions is not implied.

Notes:
2. Any pin except supplies. Transient currents of up to ±100 mA on the analog input pins will not cause SCR latch-up.
3. The maximum over/under voltage is limited by the input current.
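The supply rules stated in this section can be collected into a small lookup, which may be handy in a bring-up checklist. The sketch below is illustrative only: the function and dictionary names are hypothetical, and it encodes just three statements from the text (the three recommended VD windows, the LVD strapping rule from the LVD pin description, and Note 1).

```python
# Recommended VD windows in volts, keyed by nominal supply (values from the table above).
VD_WINDOWS = {
    2.5: (2.375, 2.625),
    3.3: (3.135, 3.465),
    5.0: (4.75, 5.25),
}


def classify_vd(vd_volts):
    """Return the nominal VD rail whose recommended window contains vd_volts, else None."""
    for nominal, (low, high) in VD_WINDOWS.items():
        if low <= vd_volts <= high:
            return nominal
    return None


def strap_advice(vd_volts):
    """Summarize the strapping rules the text gives for a measured VD value.

    Hypothetical helper: the returned keys are names chosen for this sketch.
    """
    nominal = classify_vd(vd_volts)
    if nominal is None:
        raise ValueError(f"VD = {vd_volts} V is outside every recommended window")
    return {
        "nominal_vd": nominal,
        # LVD goes to VD for a 5.0 V supply and to DGND for 2.5 V or 3.3 V.
        "lvd_connect_to": "VD" if nominal == 5.0 else "DGND",
        # Note 1: at VD = 2.5 V, VA_REG and VD_REG must be connected to VD.
        "tie_regulators_to_vd": nominal == 2.5,
    }


if __name__ == "__main__":
    for measured in (2.5, 3.3, 5.1):
        print(measured, strap_advice(measured))
```

A value such as 4.0 V falls between windows and raises an error, which matches the table's intent that only the three listed rails are recommended.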
LARGE CAYLEY GRAPHS AND DIGRAPHS WITH SMALL DEGREE AND DIAMETER
graphs and digraphs and present new Cayley digraphs which yield improvements over some of the previously known largest vertex transitive digraphs of given degree and diameter.
CDMTCS Research Report Series Large Cayley Graphs and Digraphs with Small Degree and Diameter
P. R. Hafner
Department of Mathematics University of Auckland
1. Introduction
Interconnection networks (for example of computers, or of components on a microchip) can be modelled conveniently by graphs or digraphs depending on whether the communication between nodes is two-way or only one-way. In practice, such networks are subject to two fundamental restrictions: the number of connections that can be attached at any one node is limited, as is the number of intermediate nodes on the communications path between two nodes. We have arrived at the Degree/Diameter Problem: find (di-)graphs of maximal order with given (in- and out-)degree Δ and diameter D. In this paper we discuss this problem for undirected and directed graphs and present new Cayley digraphs which improve known results in the case of vertex transitive graphs.
2. Notation and Terminology
We will consider directed and undirected graphs Γ. The distance from a vertex x to a vertex y is the length of a shortest path from x to y. The set of all vertices of Γ whose distance from a vertex x equals i is denoted by Γ_i(x). The diameter of the (di)graph Γ is the maximum of all distances between pairs of vertices of Γ. Graphs of degree Δ and diameter D are called (Δ; D) graphs (similarly for digraphs). A graph is said to be Δ-regular if all its vertices have degree Δ; a digraph is called Δ-regular if all its vertices have in- and outdegree Δ. A (di)graph is called vertex transitive if its automorphism group is transitive on the set of vertices; a digraph is called arc transitive if its automorphism group is transitive on the set of arcs. In the context of networks, vertex transitive (di)graphs are advantageous because identical routing algorithms can be used at each vertex.
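The definitions of degree, distance and diameter above can be checked mechanically on small graphs. The following is a minimal sketch, not taken from the report: the helper names are hypothetical, and the two example graphs (the 8-vertex hypercube and the Petersen graph) are chosen here only for illustration. It uses breadth-first search from every vertex to obtain all distances.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path distance from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def degree_and_diameter(adj):
    """Return (degree, diameter) of a connected regular undirected graph."""
    degrees = {len(neighbours) for neighbours in adj.values()}
    if len(degrees) != 1:
        raise ValueError("graph is not regular")
    diameter = 0
    for x in adj:
        dist = bfs_distances(adj, x)
        if len(dist) != len(adj):
            raise ValueError("graph is not connected")
        diameter = max(diameter, max(dist.values()))
    return degrees.pop(), diameter


def hypercube(n):
    """n-dimensional hypercube: vertices are n-bit integers, edges flip one bit."""
    return {v: [v ^ (1 << b) for b in range(n)] for v in range(2 ** n)}


def petersen():
    """Petersen graph: outer 5-cycle, inner pentagram, and five spokes."""
    adj = {v: set() for v in range(10)}
    for i in range(5):
        edges = ((i, (i + 1) % 5), (i, i + 5), (i + 5, (i + 2) % 5 + 5))
        for u, v in edges:
            adj[u].add(v)
            adj[v].add(u)
    return adj


if __name__ == "__main__":
    print("hypercube:", len(hypercube(3)), degree_and_diameter(hypercube(3)))
    print("petersen: ", len(petersen()), degree_and_diameter(petersen()))
```

Both graphs have degree 3, yet the Petersen graph has more vertices and a smaller diameter, which is the kind of trade-off the Degree/Diameter Problem asks about.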
Fabrication of vertical silicon nanowire arrays by a top-down approach
Realization of ultra dense arrays of vertical silicon nanowires with defect free surface and perfect anisotropy using a top-down approach
Xiang-Lei Han a, Guilhem Larrieu a,b,⇑, Pier-Francesco Fazzini b, Emmanuel Dubois a
a IEMN/UMR CNRS 8520, Avenue Poincaré, BP 60069, 59652 Villeneuve d’Ascq, France
b LAAS-CNRS, Université de Toulouse, 7 av. du Col. Roche, 31077 Toulouse, France
Article info
Article history: Available online 4 January 2011
Keywords: Electron-beam lithography; Highly dense arrays of vertical Si nanowires; Oxidation of Si nanostructure; Top-down approach
Abstract
The routine synthesis of ultra dense nanowire arrays appears as an inescapable requirement to implement future generations of nanodevices. In this study, we demonstrate the fabrication of vertical, ultra dense (4 × 10¹⁰ cm⁻²) Si NW arrays using a top-down fabrication strategy. The developed process also features nearly perfect anisotropy (98.5%), 100% yield and an excellent surface cleanliness based on a self-limiting oxidation mechanism that develops in 1D nanostructures.
© 2011 Published by Elsevier B.V.
1. Introduction
Nanodevices based on silicon nanowires (Si NWs) have been identified as potential candidates for ultimate complementary metal-oxide-semiconductor (CMOS) as well as more-than-Moore applications, as reported in the international technology roadmap for semiconductors (ITRS). The fabrication of ultra dense arrays of Si NWs is a requirement to implement future generations of nanodevices [1], including gate-all-around (GAA) MOSFETs [2], highly sensitive biochemical sensors [3] or photovoltaic applications [4].
Top-down and bottom-up approaches have their own advantages and drawbacks: the bottom-up route has the potential to grow NWs with a virtually unlimited variety of materials, while the top-down approach has the capability of quick integration in a standard CMOS flow, with very good reproducibility and control of the vertical NWs (position, diameter and pitch). Unlike bottom-up growth methods based on catalytic
growth, the more conventional top-down approach is free of metallic contamination. In this study, Si NWs are defined using electron-beam lithography over a single layer of hydrogen silsesquioxane (HSQ) that acts as a robust mask to structure NWs by anisotropic plasma etching followed by a tightly controlled oxidation step. Taking advantage of the self-limited oxidation mechanism due to mechanical stress build-up in 1D nanostructures, a thin sacrificial oxide layer is grown to improve both the anisotropy and surface quality of the NWs, resulting in a final tapered profile. The fabrication of Si NW arrays with an ultra high density (4 × 10¹⁰ cm⁻²), a 100% yield, excellent surface cleanliness and 98.5% etching anisotropy is demonstrated.
2. Experimental procedures
2.1. Realization of a hard mask by electron-beam lithography and etching of the vertical Si NW array
A single layer of negative-tone electron-beam resist, namely hydrogen silsesquioxane (HSQ), was used as a hard mask. This choice was motivated by its excellent contrast properties upon e-beam exposure and development as well as its inorganic nature, which provides an excellent etching selectivity with respect to silicon. A solution of HSQ diluted in isobutyl ketone, marketed by Dow Corning under the names FOx-12 and FOx-16, was used. HSQ was spin coated on (100) Si wafers and baked at 80 °C for 60 s to evaporate the solvent. Electron-beam exposure was performed with an EBPG 5000+ system from LEICA at the high energy of 100 keV. A 100 pA beam current gives an extremely small spot size, estimated at 5 nm. An optimum dose of 2750 µC/cm², coupled to correction factors that take into account proximity effects and pattern sizes, was selected to generate HSQ nanocolumns. After exposure, the HSQ resist was developed by manual immersion in 25% tetramethylammonium hydroxide (TMAH) at 20 °C for 60 s, rinsed in methanol and dried using a supercritical carbon dioxide process to avoid the collapse of nanopillars caused by capillary forces that develop in conventional
techniques that use gas blow [5]. Using this process sequence, HSQ nanopillar arrays with very high contrast and density are obtained as shown in Fig. 1 (left) (diameter = 27 nm, pitch = 24 nm).
0167-9317/$ - see front matter © 2011 Published by Elsevier B.V. doi:10.1016/j.mee.2010.12.102
⇑ Corresponding author at: LAAS-CNRS, 7 av. du Col. Roche, 31077 Toulouse, France. E-mail address: rrieu@laas.fr (G. Larrieu).
Under electron beam exposure, HSQ has the remarkable property of evolving from a cage-like monomer to a network-like polymer that approaches the structure of silicon dioxide (SiOx) [6], improving the selectivity against plasma etching. The vertical HSQ nanopillars were transferred to the silicon bulk substrate by reactive ion etching (RIE) using a PlasmaLab 100 chamber from Oxford Instruments. A chlorine based plasma chemistry along with an optimized parameter selection (low pressure, without inductive plasma coupling) was used to obtain an anisotropy of 92% without observing any traces of micro-trenching effect or ‘grass effect’.
More details were presented in our previous work [7]. This approach enabled the realization of vertical NW arrays with a 19 nm diameter and a nanowire density up to 4 × 10¹⁰ cm⁻², as given in Fig. 1 (right), which represents the highest density published up to now.
2.2. Wet oxidation of Si NWs
After patterning, Si NWs were subjected to thermal wet oxidation in a conventional tubular furnace (TEMPRESS) at 850 °C under a flow of 1.5 L/min of O2 and 2.5 L/min of H2 for a variable time. The SiO2 layer grown was then stripped in a diluted HF solution. Based on high resolution SEM characterization, diameters at mid-height of the oxidized Si NW (d_ox) and after stripping of the SiO2 layer (d_Si) were measured and the thickness of the grown SiO2 layer (t_oxide) deduced as follows:
t_oxide = (d_ox − d_Si) / 2    (1)
3. Results and discussion
3.1. Self-limited oxidation phenomenon in 1D Si nanostructure
Thermal oxidation is identified as an effective method to realize ultra-small diameter Si NWs by tapering the dimension [8]. The comprehension of the different mechanisms that compete during the oxidation of a 1D nanostructure, such as surface reaction and oxidant diffusivity, is a prerequisite to perfectly control the process at such dimensions. The oxidation rate of Si NWs is governed by a competition between stress build-up as the volume of the oxide layer expands and stress relaxation by viscous flow. Buttner et al. [9] suggested that the stress increase, which cannot be relaxed by viscous flow of the oxide in the case of dry oxidation, is responsible for the retarded oxidation mechanism. Under wet oxidation conditions, the effect of stress relaxation by viscous flow of the oxide is more significant than in the case of dry oxidation due to the presence of hydroxyl ions [10]. In Fig. 2(a), the thicknesses of the oxide layer grown at 850 °C for different starting diameters of Si NWs are plotted as a function of oxidation time. The oxide thickness resulting from a one-dimensional oxidation of a planar (100) Si bulk wafer (dashed line) is given for
comparison. Firstly, the shape of the curve for the oxidation of the Si wafer is nearly linear compared with the parabolic profile associated with Si NWs. During the 1D wet oxidation of a wafer at 950 °C and below, compressive stress is generated in the SiO2 [11]. A biaxial compressive stress at the SiO2/Si interface due to the increase of atomic volume from Si (20 Å³) to SiO2 (45 Å³) leads to the bending from a plane surface to a convex surface, resulting in a limitation in the diffusion of oxidizing agents. This phenomenon becomes even more significant in the oxidation of nanoscale cylindrical structures, such as NWs. In Fig. 2(a), oxidation of the Si NWs is obviously retarded due to the reduced oxygen diffusion and the decrease of the interface reaction rate caused by the compressive stress normal to the SiO2/Si interface. It is mentioned that this stress is strengthened in the oxidation process. Secondly, the NW oxide grows anomalously faster than on a bulk Si wafer in the initial stage due to a supply of oxidizing species at the SiO2/Si surface which is enhanced by the convex geometry. In this latter case, the surface of the oxide shell is larger than the area at the SiO2/Si interface [12,13]. Furthermore, during short oxidation on the convex surfaces, a tensile stress is created by the stretching of the already grown oxide as it expands to a larger circumference, leading to an increase of the oxidant diffusivity and solubility. As oxidation proceeds, the normal compressive stress becomes more important and retardation of oxidation is observed [13]. Fig. 2(b) shows the effect of the convex NW geometry on the oxidation rate for several starting NW diameters as a function of the oxidation time. It is obvious that the self-limiting behavior is more apparent with smaller NW diameters because of a higher surface/volume ratio which leads to a faster stress build-up. For example, considering a diameter below 50 nm, this dimension remains nearly constant after 20 min of oxidation.
3.2. Improving anisotropy and smoothing surface
of Si NWs
Optimized RIE parameters associated with chlorine based chemistry give an anisotropy of about 90%. The ideal vertical etched profile (i.e. 100% anisotropy) cannot be reached due to the nature of the interaction between ionized chlorine and silicon, which involves physical but also chemical reactions. In particular, lateral etching is introduced by chemical reaction and ion scattering. As previously discussed, the oxidation rate rapidly decreases and saturates to a very low rate with the oxidation duration, as shown in Fig. 2(c). In other words, the rate of Si consumption by thermal oxidation in the outer part of a large Si NW is faster than for smaller diameter NWs. Therefore, this mechanism can be used to improve the anisotropy while controlling the shrinking of the diameter. Experimental results in Fig. 3 show Si NW arrays before (left) and after (right) the wet oxidation step. The Si NW diameter was reduced from 42 to 16 nm while the anisotropy was improved from 92% to 98.5%.
[Fig. 2. (a) Dependence of the oxide thickness on the oxidation time for several diameters obtained at 850 °C. The dashed line is an oxidation reference obtained on a (100) Si bulk wafer. (b) Evolution of several NW diameters as a function of oxidation time. (c) Dependence of the oxidation rate on oxidation time for several NW diameters.]
Finally, plasma-induced damage is recognized as a source of device performance and reliability degradation, including UV radiation, electrostatic discharge and physical damage due to ionic bombardment. Ion-induced point defects and surface or interface nonstoichiometric states resulting from plasma exposure hold a major responsibility in the degradation of carrier mobility and of the sub-threshold metal-oxide-semiconductor characteristic [14]. To cope with this problem, the vertical Si NW arrays were cured by wet oxidation and the SiO2 layer grown was stripped in diluted HF.
Using high-resolution TEM characterization shown in Fig. 4, an atomically abrupt surface is observed after an oxidation step at 850 °C/20 min (right image) compared with the rough surface obtained after plasma etching (left image).
4. Conclusion
In this study, a simple method of top-down fabrication coupled to a perfectly controlled oxidation step is demonstrated to realize vertical Si NW arrays with an ultra high density (4 × 10¹⁰ cm⁻²), a nearly perfect anisotropy (98.5%), 100% yield and a sharply defined clean surface. The understanding of the different mechanisms that compete during the oxidation of 1D nanostructures is presented in order to perfectly control the process at the nanometer scale. The mechanism of self-limited oxidation is advantageously used to effectively improve the anisotropy of vertical Si NWs.
Acknowledgements
The authors thank the CNRS-CEMES laboratory (Toulouse, France) for the use of their TEM facilities. This work was supported by the European Commission through the NANOSIL Network of Excellence (FP7-IST-216171).
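Two of the quantities used in this paper reduce to one-line calculations: the oxide thickness of Eq. (1), and the relation between areal density and pitch. The sketch below is illustrative and rests on stated assumptions: the function names are hypothetical, the 60 nm oxidized diameter is a made-up input rather than a measured value, and the pitch conversion assumes a square lattice, which the paper does not state.

```python
import math


def oxide_thickness_nm(d_ox_nm, d_si_nm):
    """Eq. (1): oxide thickness from the oxidized and stripped mid-height diameters."""
    if d_ox_nm < d_si_nm:
        raise ValueError("oxidized diameter cannot be smaller than the stripped diameter")
    return (d_ox_nm - d_si_nm) / 2.0


def square_lattice_pitch_nm(density_per_cm2):
    """Pitch of a square array with the given areal density (assumed lattice type)."""
    nm_per_cm = 1e7
    return nm_per_cm / math.sqrt(density_per_cm2)


if __name__ == "__main__":
    # Hypothetical diameters: 60 nm with oxide, 16 nm after the HF strip.
    print("t_oxide =", oxide_thickness_nm(60.0, 16.0), "nm")
    # The reported density of 4e10 per square centimetre.
    print("pitch   =", square_lattice_pitch_nm(4e10), "nm")
```

Under the square-lattice assumption, a density of 4 × 10¹⁰ cm⁻² corresponds to a centre-to-centre spacing of about 50 nm.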
References
[1] B.H. Iwai, IWJ (2008) 1.
[2] J. Goldberger, Allon I. Hochbaum, R. Fan, Peidong Yang, Nano Lett. 6 (2006) 973–977.
[3] G.J. Zhang, G. Zhang, J.H. Chua, R.E. Chee, E.H. Wong, A. Agarwal, K.D. Buddharaju, N. Singh, Z. Gao, N. Balasubramanian, Nano Lett. 8 (2008) 1066–1070.
[4] Zhiyong Fan, Haleh Razavi, Jae-won Do, Aimee Moriwaki, Onur Ergen, Yu-Lun Chueh, Paul W. Leu, Johnny C. Ho, Toshitake Takahashi, Lothar A. Reichertz, Steven Neale, Kyoungsik Yu, Ming Wu, Joel W. Ager, X. Ali Javey, Nat. Mat. 8 (2009) 648–653.
[5] Toshihiko Tanaka, Mitsuaki Morigami, Nobufumi Atoda, Jpn. J. Appl. Phys. 32 (1993) 6059–6064.
[6] H. Namatsu, Y. Takahashi, K. Yamazake, T. Yamaguchi, M. Nagase, K. Kurihara, J. Vac. Sci. Technol. B 16 (1998) 69–78.
[7] X.-L. Han, G. Larrieu, E. Dubois, J. Nanosci. Nanotechnol. 10 (2010) 7423–7427.
[8] H.I. Liu, D.K. Biegelsen, F.A. Ponce, N.M. Johnson, R.F.W. Pease, Appl. Phys. Lett. 64 (1994) 1383–1385.
[9] C.C. Buttner, M. Zacharias, Appl. Phys. Lett. 89 (2006) 263106-(1-3).
[10] S.M. Hu, J. Appl. Phys. 64 (1988) 323–330.
[11] E.P. EerNisse, Appl. Phys. Lett. 35 (1979) 8–10.
[12] Dah-Bin Kao, James P. McVittie, William D. Nix, Krishna C. Saraswat, IEEE Trans. Electron Devices 34 (1987) 1008–1017.
[13] Dah-Bin Kao, James P. McVittie, William D. Nix, Krishna C. Saraswat, IEEE Trans. Electron Devices 35 (1988) 25–37.
[14] M.M.A. Hakim, L. Tan, O. Buiu, W. Redman-White, S. Hall, P. Ashburn, Solid-State Electron. 53 (2009) 753–759.
Microelectronic Engineering 88 (2011) 2622–2624
Jilin University, Fundamentals of Materials Science 3
a = 4R/√2 = 2R√2, or R = (√2/4) a
☞ FCC unit cell volume V_C
V_C = a³ = (2R√2)³ = 16R³√2
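The FCC relations between the cube edge length a, the atomic radius R and the unit cell volume can be checked numerically. This is a minimal sketch with hypothetical function names; the radius of 0.128 nm is only an illustrative input (roughly that of copper) and does not come from the slides.

```python
import math


def fcc_edge_length(radius):
    """Cube edge length a = 2*R*sqrt(2), from the face diagonal a^2 + a^2 = (4R)^2."""
    return 2.0 * radius * math.sqrt(2.0)


def fcc_cell_volume(radius):
    """Unit cell volume V_C = a^3 = 16*R^3*sqrt(2)."""
    return 16.0 * radius ** 3 * math.sqrt(2.0)


if __name__ == "__main__":
    R = 0.128  # nm, illustrative value only
    a = fcc_edge_length(R)
    print(f"a   = {a:.4f} nm")
    print(f"V_C = {fcc_cell_volume(R):.5f} nm^3")
    # Both sides of the face-diagonal relation should agree.
    print(math.isclose(2.0 * a ** 2, (4.0 * R) ** 2))
```

Computing the volume both as a³ and from the closed form is a quick way to catch a dropped factor of √2.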
Chapter 3 / Structures of Metals and Ceramics
☞ The number of atoms in the unit cell (n)
Lattice: the regular geometrical arrangement of points in crystal space, that is, a three-dimensional array of points coinciding with atom positions (or sphere centers).
The Body-Centered Cubic (BCC) crystal structure
An aggregate of atoms
☞ Relationship between a and R (a: the cube edge length; R: atomic radius): a² + a² = (4R)²
Significant property differences exist between crystalline and noncrystalline materials having the same composition, e.g. ceramics and polymers.
© Copyright by Yogesh A. Mehta, 2005
LOW DIAMETER REGULAR GRAPH AS A NETWORK TOPOLOGY IN DIRECT AND HYBRID INTERCONNECTION NETWORKS
BY
YOGESH A. MEHTA
B.E., University of Mumbai, 2003
THESIS
Submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science in the Graduate College of the University of Illinois at Urbana-Champaign, 2005
Urbana, Illinois
Abstract
Performance of a parallel computer depends on the computation power of the processors and the performance of the communication network connecting them. With the increasing scale and compute power of today’s parallel machines, interprocessor communication becomes the bottleneck. Communication performance depends on the network topology and routing scheme for packets. This master’s thesis explores the use of low diameter regular (LDR) graphs as a topology for interconnection networks. We generate graphs having the same number of nodes and connections per node as the hypercube, a widely used network topology. These graphs have lower diameter and lower average internode distance than the corresponding hypercubes, which implies that on average, packets travel for a lower number of hops.
With a good routing scheme this would reduce the average message latency and lead to better communication performance. We run experiments with this new topology in a parallel simulation framework for interconnection networks, BigNetSim. We show that LDR graphs achieve better performance than equivalent hypercubes for standard network traffic patterns. We have also developed a framework for implementing hardware collectives and we compare collective communication performance for different topologies. We implement a hybrid topology of a fat-tree and an LDR graph and evaluate its performance in comparison with a hybrid of a fat-tree and a hypercube.
To my parents and my sister.
Acknowledgements
First and foremost, I would like to thank my advisor Laxmikant V. Kale for his guidance and encouragement in my two years at the Parallel Programming Laboratory at University of Illinois at Urbana-Champaign.
I would like to thank all members of the BigSim and BigNetSim project: Terry Wilmarth, who helped in understanding POSE, the simulation environment in which BigNetSim is developed; Sameer Kumar, for his informative assistance with interconnection networks and in particular, the hardware collectives; Praveen Kumar Jagadishprasad, for his initial code walkthroughs of BigNetSim to get me started on this project; Gengbin Zheng and Eric Bohm, for their help with the BigSim simulator; and Nilesh Choudhury, for his patience and effort in debugging and optimizing BigNetSim.
I would also like to thank all my friends at PPL for all the technical help, encouragement and fun during my presence at graduate school.
Finally, I express fullest gratitude for my parents and my sister, who have been my pillars of strength and support and who have always been there for me whenever I needed them.
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
  1.1 Thesis Contribution
  1.2 Thesis Organization
Chapter 2 Parallel Discrete Event Simulation
  2.1 Charm++
  2.2 POSE
Chapter 3 Interconnection Networks
  3.1 Direct Networks
  3.2 Indirect Networks
  3.3 Topology
    3.3.1 Hypercube
    3.3.2 Fat Tree
  3.4 Routing
    3.4.1 Fixed Routing
    3.4.2 Adaptive Routing
  3.5 Simulation model
    3.5.1 Switch
    3.5.2 Channel
    3.5.3 Network Interface Card
    3.5.4 Node and Traffic Patterns
    3.5.5 Topologies and Routing Strategies
Chapter 4 Low Diameter Regular Graphs
  4.1 Background
  4.2 Motivation
  4.3 Generating an LDR graph
  4.4 Implementation of LDR graph topology
  4.5 Routing Algorithm for LDR graphs
  4.6 Performance
    4.6.1 Performance with Fixed Routing
    4.6.2 Performance with Oblivious Routing
    4.6.3 Performance with Adaptive Routing
Chapter 5 Hybrid Networks
  5.1 Designing a hybrid of a hypercube and a fat-tree
  5.2 Designing a hybrid of a LDR graph and a fat-tree
  5.3 Routing on an hybrid
  5.4 Performance
Chapter 6 Collectives
  6.1 On Hypercube
  6.2 On LDR Graphs
  6.3 On Hybrid Networks
  6.4 Performance
Chapter 7 Conclusion and Future Work
References
List of Tables
  4.1 LDR graph v/s Hypercube
  4.2 Simulation Parameters
List of Figures
  2.1 Charm++ Virtualization
  2.2 (a) User’s view of a poser; (b) Internal POSE representation of a poser
  3.1 2-ary 4-cube
  3.2 16 Node fat-tree
  3.3 BigNetSim conceptual model
  4.1 Petersen graph
  4.2 (a) 8 node Hypercube; (b) 8 node LDR graph
  4.3 (a) Initial Spanning Tree; (b) Same Spanning Tree - A different layout; (c) Adding an edge; (d) & (e) Adding more edges one at a time; (f) Complete 8 node LDR graph
  4.4 Message Response Time on a 64 node direct network with fixed routing
  4.5 Message Response Time on a 64 node direct network with oblivious routing
  4.6 Routing on an LDR graph
  4.7 Message Response Time on a 64 node direct network with adaptive routing
  4.8 Message Response Time on a 2048 node network with (a) input buffered switches (top figure) and (b) output buffered switches (bottom figure)
  5.1 32-node hybrid topology with 8-node hypercubes
  5.2 Message Response Time on a 1024 node hybrid network
  5.3 Message Response Time on a 4096 node hybrid network
  5.4 Message Response Time on a 4096 node hybrid network with a larger direct network component
  6.1 Message Response Time for broadcast on a 64 node direct network
  6.2 Message Response Time for broadcast on a 2048 node direct network
  6.3 Message Response Time for broadcast on a 256 node hybrid network
Chapter 1
Introduction
In recent years, there have been remarkable advances in the scale and compute power of parallel computers. New parallel computers with hundreds of thousands of processors that are capable of achieving hundreds of teraflops at peak speed have been built. For example, the BlueGene (BG/L) machine, which is being developed by IBM, when completed will have 128K processors and is expected to achieve 360 teraflops at peak speed. Research projects in varied application areas such as molecular dynamics, astronomy, genomics and engineering design have been undertaken to exploit this tremendous amount of computational power.
Porting existing applications and developing new applications for such large scale machines is a challenging task. If we can simulate the behavior of the application on a large machine, we might be able to improve the design of a machine even before it is built.
The simulation could help in the development of algorithms which will scale well on such machines and thus enable efficient use of the machines. The BigSim [20,21] project aims at developing a simulation framework that would facilitate the development of efficient scalable applications on very large parallel machines. In most cases, there is a significant time gap between the deployment of large scale machines and the development of applications to run on them. Performance prediction of applications using BigSim can allow for optimization of applications in advance, so that they are ready to run as soon as the machines become available. Even after the machines are built, there are often long waiting periods involved in acquiring a large number of nodes on these machines. A simulator like BigSim can serve as a debugging and tuning environment which would be much more easily available than the actual machines.
Parallel applications involve a lot of interprocessor communication. The interconnection networks that connect different computers in a parallel machine are responsible for the communication performance and consequently for the overall performance of the application. For correctly simulating a parallel computing environment, it is necessary to accurately model the interconnection network. A network simulator, BigNetSim [17], has been developed, which simulates the packet level communication on the detailed contention-based network models for large parallel computers. The size of data involved and the large compute power required make sequential simulation impossible, hence we use parallel simulation for BigNetSim. For accurate simulation of the communication time, BigNetSim models in detail various entities of the network, which include the switches, nodes and channels, and network properties such as the topologies, routing algorithms and flow control.
BigNetSim has been developed as a generic framework, so that new topologies and routing algorithms can be added and different types of networks
can be simulated. With the detailed network model, it can accurately simulate the interconnection networks in many of the widely used parallel computers today. Another application of this network simulator is to enable development of new topologies which might be better than the ones used today. While building and testing an actual network with the new topology can be difficult as well as impractical from the point of view of time and money, the network simulator is a much more feasible alternative. Simulation can be used to compare these new ideas with currently used ones, tune them for performance, and then deploy them on actual networks.
One idea is to use low diameter regular (LDR) graphs as an interconnection network topology. We generate LDR graphs that have the same number of connections per node as hypercubes but that have lower diameter and lower average internode distance than corresponding hypercubes. Message latency, i.e. the time taken for messages to travel from source to destination, is an important measure of the communication performance. With lower diameter and lower average internode distance, packets would travel a lower number of hops on average and cause reduced contention. Thus, we would expect LDR graphs to provide lower message latency, thereby improving communication performance. We discuss the motivation and generation of these graphs in detail in Chapter 4. Also, two or more topologies could be combined in the same network to form a hybrid interconnection network. A simulator like BigNetSim can be used as a testbed for trying out new ideas for improving overall network performance.
1.1 Thesis Contribution
Principal contributions of this thesis are:
• Design and implementation of low diameter regular graph topology for interconnection networks.
• Developing and optimizing a shortest-path based routing algorithm for LDR graphs.
• Extending the hardware collective framework for hypercube, LDR graphs, and hybrid topologies for interconnection networks.
• Development of topology and routing
scheme for hybrid networks of fat-trees and LDR graphs.
I also fixed and adapted the original LDR graph generation algorithm to generate input graph data for the LDR graph topology. I was involved in development of specific components of BigNetSim such as the traffic generator and hybrid network design. My contributions towards the debugging and optimization of BigNetSim, in part, have led to a much improved performance of the simulation and a more accurate modeling of interconnection networks.
1.2 Thesis Organization
Chapter 2 describes POSE, the parallel discrete event simulation environment used for developing the interconnection network simulator. Chapter 3 presents an overview of the interconnection networks used in parallel computers, their properties and entities, and how they are simulated in BigNetSim. We motivate the use of LDR graphs as a topology for interconnection networks in Chapter 4. This chapter also explains the generation of these graphs, routing schemes, and their implementation and performance. In Chapter 5, we discuss the design, implementation and performance of hybrid topologies for interconnection networks. Chapter 6 discusses the framework for hardware collectives and collective communication performance for different topologies. Chapter 7 presents some conclusions from our work and directions for future research.
Chapter 2
Parallel Discrete Event Simulation
We have implemented our network simulation using POSE [18], a scalable general-purpose parallel discrete event simulation environment. POSE has been built in Charm++ [8], a C++ based parallel programming system which supports the virtualization programming model. The following overview of Charm++ is based on the detailed description in [8] and [7], and the POSE overview is based on [18] and [19].
2.1 Charm++
Charm++ is an object-based, message-driven parallel programming environment. The basic unit of parallelism in Charm++ is a message driven C++ object known as a chare. Methods can be invoked on a chare asynchronously from remote processors; these
are known as entry methods.

Charm++ is based on the concept of virtualization [7]. Each chare is a separate execution component, and the number of chares (N) is independent of the number of processors (P). In general, N is much greater than P, so applications can run with millions of chares on a much smaller number of processors. With virtualization, the user's view of the program is that of the chares and their interactions. The runtime system takes care of mapping chares to processors. This distinction between the user's view and the actual system implementation is shown in Figure 2.1.

Figure 2.1. Charm++ Virtualization

A dynamic Charm++ scheduler runs on each processor. Messages are stored in a queue that is sorted according to a specific strategy. The scheduler picks the next message from the queue and invokes the corresponding method on the target object. As a result, no chare can hold the processor idle: other chares can run while a particular chare is waiting for a message. This results in good overlap of communication and computation and maximizes the degree of parallelism. On the basis of this virtualization model, Charm++ has been used successfully to simulate challenging applications in molecular dynamics, cosmology, and rocket simulation.

2.2 POSE

POSE stands for Parallel Object-oriented Simulation Environment. It was developed by Terry Wilmarth, a member of the Parallel Programming Laboratory in the Department of Computer Science at the University of Illinois at Urbana-Champaign. POSE is a scalable parallel discrete event simulation environment designed for simulation models with fine granularity of computation.

Figure 2.2. (a) User's view of a poser; (b) Internal POSE representation of a poser

POSE encapsulates simulation entities in posers, which are the equivalents of chares in Charm++. The structure of a poser is shown in Figure 2.2(a). A poser stores its own virtual time, known as Object Virtual Time (OVT). OVT is the virtual time that has passed since the start of the simulation relative
to that object. Each poser has a set of event methods, which are entry methods that receive timestamped messages. These entry methods capture incoming events, store them in a local event queue, and invoke the local synchronization strategy on them. The event queue also stores checkpoints of the object state. This detailed internal representation of the poser is shown in Figure 2.2(b).

There are two ways in which a poser can advance its OVT. The first is the elapse function. Calling elapse with a number of time units as its argument advances the OVT of the poser by that amount. This indicates the time the poser spent doing work. For example, in the context of network simulation, a channel poser can elapse time while it transmits a packet. The second way is to invoke an event method on the poser with an offset. This offset is then added to the OVT of the poser, which indicates either an activity performed in the future or simulation time spent in transit. For example, in network simulation, when a packet is sent across a channel to a switch, the method that receives the packet is invoked on the switch with an offset equal to the time the packet takes to traverse the channel. The OVT of the switch poser is advanced accordingly.

To develop an efficient application with POSE and achieve good performance, it is important to decompose the problem into the smallest posers possible; that is, the degree of virtualization must be high. With smaller posers, checkpoint and rollback overhead is lower and object migration is easier. Smaller posers also allow synchronization strategies to be tuned more closely to each object's behavior. An important drawback of a higher degree of virtualization is that, with more objects in the simulation, context switches between entities for each event are more frequent. The overhead of managing per-object information is also higher. We studied these tradeoffs [17] in
the context of network simulation. We found that a higher degree of virtualization has more advantages than disadvantages. For example, we observed that the 'switch' poser was too large, and breaking it into finer posers (making each port a separate poser) improved performance. A higher degree of virtualization also improved the scalability of our simulation.

We present a brief overview of the optimistic synchronization strategy used by POSE. The strategy is adaptive and can range from cautiously optimistic to highly optimistic. When an object receives an event, it gets control of the processor and invokes the synchronization strategy to process events. The strategy performs any necessary rollbacks and cancellations before beginning forward execution of events. Traditional optimistic approaches execute the earliest arriving event from a sorted list of events. POSE differs in that it maintains a speculative window that determines how far beyond the current global virtual time (GVT) estimate an object may proceed. Events with timestamps greater than the GVT but within the window are executed. All such events are batched together and executed as a multi-event. Batching reduces context-switching overhead and benefits from a warmed cache. These benefits outweigh the additional rollback overhead. The adaptive synchronization strategy and multi-events, along with other features of POSE, are discussed in detail in [19].

Chapter 3
Interconnection Networks

Interconnection networks, as defined in [2], are programmable systems that transport data between terminals. They occur at a variety of scales, from small on-chip networks within a single processor to large-scale local-area or wide-area networks. In the context of our work, we are concerned with networks that connect the processors of a parallel computer system. As processors become faster, computation speeds up, and communication often becomes the bottleneck
for the performance of a parallel computer. Better interconnection networks can improve communication and thereby the performance of the entire system. Interconnection networks can be broadly classified as direct networks and indirect networks.

3.1 Direct Networks

Each node in a direct network is connected to a router, so direct networks are also called router-based networks. Neighboring nodes can be connected by a pair of unidirectional or bidirectional channels. The function of a router can also be performed by the local processor, but parallel computers use dedicated routers so that communication and computation can overlap. Every router has a certain number of input and output channels. Internal channels connect the local processor or memory to the router. External channels connect different routers.

3.2 Indirect Networks

In indirect networks, communication between any two nodes is carried out through switches. Every node has a network adapter that connects to a network switch.
Each switch has a set of ports, and each port has an input link and an output link. A set of ports in each switch is connected to processors or to other switches. The way the switches are interconnected defines the topology. Transmitting a message from one node to another in an indirect network requires travelling to the switch of the first node, hopping across the network to the destination node's switch, and then reaching the destination node itself.

3.3 Topology

Topology is the layout of the connections between nodes in the network. It is important because it determines key properties of the network such as bisection bandwidth, diameter, and average internode distance. Bisection bandwidth is the bidirectional capacity of a network between two equal-sized partitions of nodes, where the cut is taken at the narrowest point in each bisection of the network. Diameter is the length of the longest shortest path between any two nodes in a topology: the largest number of edges that must be traversed to travel from one node to another when paths that backtrack, detour, or loop are excluded. Average internode distance is the average length of the shortest paths between all pairs of nodes in the topology. We briefly discuss two common topologies here.

3.3.1 Hypercube

A hypercube is a network with logarithmic complexity that has the structure of a generalized cube. In this topology, the nodes are placed at the vertices of a 2-ary M-cube, where M is the dimension. For example, a 2-ary 4-cube is shown in Figure 3.1.

Figure 3.1. 2-ary 4-cube

For a hypercube of N nodes, every node has the same degree, log2(N). The diameter of the hypercube is log2(N), and the average internode distance is log2(N)/2.
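The hypercube figures quoted above can be checked numerically. The following Python sketch is illustrative only and is not part of BigNetSim; the function names are hypothetical. It assumes that nodes are numbered 0 to N-1 and that two nodes are neighbors exactly when their binary addresses differ in one bit.

```python
from collections import deque


def hypercube_neighbors(node, dim):
    """Neighbors of `node` in a 2-ary dim-cube: flip each address bit once."""
    return [node ^ (1 << bit) for bit in range(dim)]


def bfs_distances(source, dim):
    """Shortest-path hop counts from `source` to every node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for nxt in hypercube_neighbors(current, dim):
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return dist


def hypercube_metrics(dim):
    """Return (degree, diameter, average internode distance) of a 2-ary dim-cube."""
    n_nodes = 1 << dim
    degree = len(hypercube_neighbors(0, dim))
    all_distances = [
        d for src in range(n_nodes) for d in bfs_distances(src, dim).values()
    ]
    diameter = max(all_distances)
    # Average over all ordered pairs, including each node's zero distance to itself.
    average = sum(all_distances) / len(all_distances)
    return degree, diameter, average


print(hypercube_metrics(4))  # (4, 4, 2.0)
```

For the 16-node cube of Figure 3.1 this gives degree 4, diameter 4, and average distance 2.0, matching log2(N), log2(N), and log2(N)/2. The average equals log2(N)/2 exactly only when each node's zero distance to itself is counted; averaged over distinct pairs it is slightly larger (32/15 for N = 16).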
The hypercube is commonly used as a topology for direct networks.

3.3.2 Fat Tree

The fat-tree network [12] considered here is the k-ary n-tree, a type of fat-tree defined in [14] and [9] as follows:

Definition: A k-ary n-tree is a fat-tree that has two types of vertices: P = k^n processing nodes and n·k^(n−1) switches. The switches are organized hierarchically in n levels, with k^(n−1) switches at each level. Each node can be represented by an n-tuple in {0, 1, ..., k−1}^n, while each switch is defined as an ordered pair ⟨w, l⟩, where w ∈ {0, 1, ..., k−1}^(n−1) and l ∈ {0, 1, ..., n−1}. Here the parameter l represents the level of each switch and w identifies a switch at that level. The root switches are at level l = n−1, while the switches connected to the processing nodes are at level 0.

Fat-tree networks have several advantages, such as high bisection bandwidth, a scalable topology, compact switches, and simple routing. They are used extensively in current-generation high-performance networks such as Quadrics and InfiniBand. A complete 16-node fat-tree is shown in Figure 3.2.

Figure 3.2. 16 Node fat-tree

Both of these topologies are implemented in BigNetSim. It also includes other topologies, including the new topology based on low diameter regular graphs, which we discuss in detail in the next chapter.

3.4 Routing

A route is an ordered set of channels a_1, a_2, a_3, ..., a_n, where the output node of channel a_i is the input node of channel a_(i+1). Depending on the type of network, there may be a single route or multiple routes between a source and a destination. A good routing algorithm balances the load uniformly across channels. Routing algorithms fall into two major classes: fixed and adaptive.

3.4.1 Fixed Routing

Deterministic, or fixed, routing algorithms always choose the same path between any two nodes; the path is a function of the source and destination addresses. This can lead to load imbalance in the network for some load patterns. There can be increased contention in a specific part of the
network, particularly under random traffic patterns. However, these algorithms are simple and inexpensive to implement. Deterministic algorithms are still prevalent today because designing a good randomized adaptive algorithm for irregular topologies is difficult.

3.4.2 Adaptive Routing

A routing technique is said to be adaptive if, for a given source and destination pair, the path taken by a particular packet depends on dynamic network conditions such as network contention, congested channels, or the presence of faults. Adaptive routing provides fault tolerance by introducing alternate paths: the failure of a link effectively leaves the network disconnected under deterministic routing, whereas the network remains connected under adaptive routing. Although the adaptive technique has clear advantages, it adds considerable complexity to the switch, which makes it costly.

3.5 Simulation Model

The model simulates the basic units of a network, namely switches, channels, network interface cards, and finally nodes, which inject messages into the network and receive messages intended for them. The conceptual model of BigNetSim is shown in Figure 3.3. Each of these entities is modeled as a poser.

Figure 3.3. BigNetSim conceptual model

3.5.1 Switch

The switch assumes packet switching and uses virtual cut-through to forward messages. Switches can be distinguished as:

Input Buffered (IB): A packet in a switch is stored at the input port until the next switch in its route is decided, and it leaves the current switch when it finds available space on the next switch in the route.

Output Buffered (OB): The next switch in a packet's route is decided beforehand, and the packet is buffered at the output port until space is available on the next switch along the route.

The switch has a simple and fair arbitration strategy that uses the age of packets to determine which of the packets competing for a port gets higher priority. We use credit-based flow control in the network; the credits are
equivalent to buffer space. A switch computes how many credits it has available on a specific downstream switch and, based on that amount, decides whether it can send a packet. The model also supports configurable strategies for input virtual channel selection and output virtual channel selection. This configurability gives the switch a flexible design that satisfies the requirements of a large number of networks.

3.5.2 Channel

The channel is a simple entity that receives a packet and delivers it to the next object it is connected to, which is either a switch or a destination node. The channel models a delay equal to the time a packet would take to travel from one switch or node to another along that channel.

3.5.3 Network Interface Card

The network interface card (NIC) divides a message into separate packets, based on the maximum transmission unit of the network, and sends them. It models DMA and HCA delays. The delays are categorized for small and large messages and then added to the message send times.
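The packetization step and the size-based delay categories can be illustrated with a short Python sketch. This is not the BigNetSim code: the function names, the single small/large cutoff, and all numeric values are hypothetical and chosen only for illustration.

```python
def packetize(message_bytes, mtu_bytes):
    """Split a message into packet sizes, none larger than the MTU."""
    if message_bytes <= 0 or mtu_bytes <= 0:
        raise ValueError("message size and MTU must be positive")
    full_packets, remainder = divmod(message_bytes, mtu_bytes)
    sizes = [mtu_bytes] * full_packets
    if remainder:
        sizes.append(remainder)  # the last packet carries the leftover bytes
    return sizes


def nic_send_delay(message_bytes, small_limit, small_delay, large_delay):
    """Pick a NIC delay by message-size category (hypothetical two-level model)."""
    return small_delay if message_bytes <= small_limit else large_delay


# Hypothetical values: a 10000-byte message on a network with a 4096-byte MTU.
print(packetize(10000, 4096))  # [4096, 4096, 1808]
print(nic_send_delay(10000, small_limit=1024, small_delay=5, large_delay=20))  # 20
```

In this sketch the chosen delay would simply be added to the send time of the message; the actual DMA and HCA delay values and the boundary between small and large messages depend on the network being modeled.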
The NIC responds to excessive load with an injection threshold that models deteriorating caching effects as it becomes overloaded. At the receiving end, the NIC reassembles the packets into the message and passes the data to the node.

3.5.4 Node and Traffic Patterns

The node generates the packets and injects them into the network. The traffic generator module can be used to generate different traffic patterns. Six traffic patterns are available, each of which determines the destination node for a given source node. Here N is the number of nodes, b is the number of bits in a node address, and s_i and d_i denote bit i of the source and destination addresses:

k-shift: the address of the destination node for node i is (i + k) mod N.

Ring: equivalent to 1-shift.

Bit transpose: the destination address is a transpose of the source address, i.e. d_i = s_((i + b/2) mod b).

Bit reversal: the destination address is the reversal of the bits of the source address, i.e. d_i = s_(b−i−1).

Bit complement: the destination address is the bitwise complement of the source address.

Uniform distribution: random traffic in which each node is equally likely to send to any of the other nodes.

The distribution of traffic generation times can be either deterministic or Poisson.

3.5.5 Topologies and Routing Strategies

Implementing a topology in our model involves defining the neighbors of a switch and the mapping of these neighbors to port numbers on that switch. The routing strategy decides the output port on which a packet is sent. Topologies and routing strategies can be created separately, and architectures can then be created to use them. Several topologies have been implemented, including the hypercube, fat-tree, and 3D-mesh topologies. The corresponding routing strategies have also been implemented: Hamming-distance routing for hypercubes, dimension-ordered and torus routing for 3D-mesh topologies, and Up-Down routing for fat-tree topologies.
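The destination mappings listed in Section 3.5.4 can be sketched in Python as follows. This is an illustration, not the BigNetSim traffic generator, and the function names are hypothetical. It assumes node addresses are b-bit integers with bit 0 as the least significant bit, and it reads the bit transpose as a rotation of the address by b/2 positions, so the bit index wraps modulo b.

```python
import random


def k_shift(src, k, n_nodes):
    """Destination is (src + k) mod N."""
    return (src + k) % n_nodes


def ring(src, n_nodes):
    """Ring traffic is the special case of a 1-shift."""
    return k_shift(src, 1, n_nodes)


def bit_transpose(src, bits):
    """d_i = s_((i + b/2) mod b): rotate the address by half its width."""
    dest = 0
    for i in range(bits):
        source_bit = (src >> ((i + bits // 2) % bits)) & 1
        dest |= source_bit << i
    return dest


def bit_reversal(src, bits):
    """d_i = s_(b - i - 1): reverse the order of the address bits."""
    dest = 0
    for i in range(bits):
        source_bit = (src >> (bits - i - 1)) & 1
        dest |= source_bit << i
    return dest


def bit_complement(src, bits):
    """Flip every bit of the b-bit address."""
    return ~src & ((1 << bits) - 1)


def uniform(src, n_nodes, rng=random):
    """Pick any node other than src with equal probability."""
    dest = rng.randrange(n_nodes - 1)
    return dest if dest < src else dest + 1


# 16 nodes, 4-bit addresses, source node 1 (binary 0001)
print(bit_transpose(1, 4), bit_reversal(1, 4), bit_complement(1, 4))  # 4 8 14
print(k_shift(15, 3, 16), ring(15, 16))  # 2 0
```

The uniform pattern draws from N-1 values and skips over the source, so a node never selects itself, in line with the description that each node sends to any of the other nodes.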