Solving the Trade-off Between Fairness and Throughput: Token Bucket and Leaky Bucket Based W
Load-Balancing Methods in Data Link Layer Technology (Part 2)

Against the backdrop of the rapid development of modern information technology, network communication plays an increasingly important role.
As the second layer of the OSI model, the data link layer is responsible for establishing physical connections and managing data transmission.
Under heavy load, load-balancing techniques become essential for keeping the network stable and efficient.
This article discusses load-balancing methods in data link layer technology.
I. What Is Load Balancing
Load balancing distributes the data traffic of network communication evenly across multiple processors, servers, or links to achieve optimal resource utilization and traffic control.
Load balancing can be implemented with a variety of algorithms, including round robin, weighted round robin, least connections, and IP hashing.
II. Load Balancing at the Data Link Layer
At the data link layer, load balancing is mainly applied to traffic splitting in switches and routers.
For example, when a switch receives a large volume of data flows, it can distribute them evenly across the available ports according to a preconfigured algorithm, avoiding the latency and packet loss caused by overloading any single port.
III. Traffic-Based Load-Balancing Algorithms
1. Round Robin
Round robin is one of the simplest load-balancing algorithms.
The switch or router sends data flows to the ports in turn, following a configured order.
Its advantage is simplicity of implementation; its drawback is that it ignores each port's capacity and current load, so a port may still become overloaded.
2. Weighted Round Robin
Weighted round robin sends data flows to ports of different weights according to the configured port weights.
Ports with higher weights are assigned more flows, achieving a more balanced load distribution.
This algorithm makes better use of each port's resources.
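To make the weighting concrete, here is a minimal sketch of smooth weighted round-robin selection (the variant popularized by Nginx's upstream balancer). The port names and weights are invented for the example, and a real switch performs this selection in hardware rather than in host code like this.

```java
import java.util.List;

// Smooth weighted round robin: each pick raises every port's running
// score by its weight, selects the highest score, then deducts the
// total weight from the winner. Higher-weight ports win more often.
class WeightedRoundRobin {
    static final class Port {
        final String name;
        final int weight;
        int score = 0;
        Port(String name, int weight) { this.name = name; this.weight = weight; }
    }

    private final List<Port> ports;
    WeightedRoundRobin(List<Port> ports) { this.ports = ports; }

    Port next() {
        int total = 0;
        Port best = null;
        for (Port p : ports) {
            p.score += p.weight;
            total += p.weight;
            if (best == null || p.score > best.score) best = p;
        }
        best.score -= total;
        return best;
    }

    public static void main(String[] args) {
        WeightedRoundRobin wrr = new WeightedRoundRobin(List.of(
                new Port("port-1", 5), new Port("port-2", 1), new Port("port-3", 1)));
        for (int i = 0; i < 7; i++) System.out.print(wrr.next().name + " ");
        // port-1 is chosen 5 times out of every 7 picks
    }
}
```

The "smooth" variant spreads the high-weight port's turns across the cycle instead of sending it a burst of consecutive flows.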
IV. Connection-Based Load-Balancing Algorithms
1. Least Connections
The least-connections algorithm selects the most idle port, judged by current connection count, to receive new traffic.
The switch or router inspects each port's connection count to decide which port a data flow should be assigned to.
This prevents any single port from being overloaded, but maintaining the connection statistics adds complexity to the system.
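As an illustration of the selection step, here is a minimal sketch that picks the port with the fewest active connections; the per-port counters are an invented stand-in for the statistics a real switch or router would maintain.

```java
import java.util.Comparator;
import java.util.Map;

// Least connections: route the next flow to the port whose current
// active-connection count is lowest.
final class LeastConnections {
    static String selectPort(Map<String, Integer> activeConnections) {
        return activeConnections.entrySet().stream()
                .min(Comparator.comparingInt(e -> e.getValue()))
                .orElseThrow()
                .getKey();
    }

    public static void main(String[] args) {
        System.out.println(selectPort(Map.of("port-1", 12, "port-2", 3, "port-3", 7)));
        // prints port-2, the most idle port
    }
}
```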
2. IP Hashing
IP hashing takes the source IP address of a data flow modulo the number of ports and uses the result to choose the port.
This guarantees that the same IP address is always assigned to the same port, keeping connections stable.
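A minimal sketch of the modulo selection just described; the port count and addresses are invented for the example.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// IP hashing: hash the source address and take it modulo the number
// of ports, so a given source IP always maps to the same port.
final class IpHashBalancer {
    private final int portCount;
    IpHashBalancer(int portCount) { this.portCount = portCount; }

    int selectPort(InetAddress src) {
        // floorMod keeps the index non-negative even for negative hash codes
        return Math.floorMod(src.hashCode(), portCount);
    }

    public static void main(String[] args) throws UnknownHostException {
        IpHashBalancer lb = new IpHashBalancer(4);
        InetAddress client = InetAddress.getByName("192.0.2.10");
        System.out.println(lb.selectPort(client) == lb.selectPort(client)); // true: stable mapping
    }
}
```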
V. Limitations and Extensions of Load Balancing
Although load-balancing technology can improve the stability and efficiency of network transmission, it also has limitations.
First, the choice and configuration of a load-balancing algorithm must be tuned to the specific network environment and requirements; different application scenarios may suit different algorithms.
An Improvement Scheme for the DPOS Consensus Mechanism

Received 2019-05-20; revised 2019-07-16. About the authors: Gao Ying (1973-), female, Ph.D., master's supervisor; her research interests are information management and distributed systems. Tan Xuecheng (1994-), male, master's student; his research interests are blockchain and distributed networks (18910947922@163.com).

An Improvement Scheme for the DPOS Consensus Mechanism
Gao Ying, Tan Xuecheng (School of Management Engineering, Capital University of Economics and Business, Beijing 100070, China)

Abstract: This paper proposes an improvement scheme for the Delegated Proof of Stake (DPOS) consensus mechanism that addresses inactive node voting and collusion among malicious nodes.
First, it introduces an unstructured-network trust model that computes a comprehensive trust value for each node from the node's history and the recommendation values of other nodes.
Voting is then based on the comprehensive trust value, which makes the elected nodes more trustworthy.
Introducing a recommendation algorithm disperses the nodes' stake and reduces the degree of centralization.
Second, a reward-and-punishment mechanism is added: nodes that vote actively are rewarded with credit, giving them a chance to become consensus nodes, while malicious nodes are penalized in trust value.
Experimental results show that a DPOS consensus mechanism whose voting is computed from comprehensive trust values can quickly eliminate faulty nodes, maintain system stability, and offer high security.
Keywords: delegated proof of stake; reputation voting; reward and punishment mechanism
CLC number: TP302.7; Document code: A; Article ID: 1001-3695(2020)10-042-3086-05; doi: 10.19734/j.issn.1001-3695.2019.05.0234

Improvement of DPOS consensus mechanism
Gao Ying, Tan Xuecheng (School of Management Engineering, Capital University of Economics & Business, Beijing 100070, China)

Abstract: In order to solve the problem of passive node voting and collusion between malicious nodes, this paper proposed an improved DPOS consensus mechanism. Firstly, it introduced an unstructured network trust model: according to the history of each node and the recommendation values of other nodes, it calculated a comprehensive trust value, and voting according to this value made the selected nodes more reliable. By introducing the recommendation algorithm, it dispersed the interests of nodes and reduced the degree of centralization. Secondly, it added a reward and punishment mechanism to reward the credit value of nodes that voted actively, so that they had a chance to become consensus nodes, and to punish the credit value of malicious nodes. The experimental results show that the DPOS consensus mechanism based on comprehensive-trust-value voting can quickly eliminate error nodes, maintain the stability of the system, and achieve high security.
Keywords: DPOS; reputation voting; reward and punishment mechanism

Blockchain is an integrated innovation of the Internet era, combining information technologies such as distributed storage, peer-to-peer transmission, consensus mechanisms, and cryptographic algorithms.
How Sentinel's Leaky Bucket Algorithm Works

Sentinel is an open-source fault-tolerance framework for distributed systems that provides real-time monitoring, statistics, and alerting for large-scale distributed systems.
Sentinel's flow-control module uses the leaky bucket algorithm to rate-limit access to resources.
What is the leaky bucket algorithm? The leaky bucket algorithm is a simple, classic traffic-control algorithm.
Its principle resembles a leaky bucket: imagine water flowing out of the bucket at a constant rate.
When a request arrives, it is placed in the bucket if there is capacity left; if the bucket is already full, the request is rejected.
By controlling the rate at which requests are released, the leaky bucket algorithm shapes and limits traffic.
How the Leaky Bucket Algorithm Works (a code sketch follows this list)
1. Bucket capacity: the bucket in the leaky bucket algorithm is a storage unit of fixed capacity.
When a request arrives, it is placed in the bucket if spare capacity remains.
Otherwise, the request is rejected.
2. Outflow rate: the outflow rate is a constant value; whether or not requests arrive, water flows out at a fixed speed, so the bucket keeps leaking even when no requests come in.
3. Request handling: whenever a request arrives, it is put into the bucket if the bucket is not full; if the bucket is full, the request is rejected.
4. Leakage: regardless of whether requests are rejected, the water in the bucket drains at a constant rate, even when no new requests arrive.
5. Stability guarantee: the overall idea of the leaky bucket algorithm is to hold the request release rate steady at a fixed value, so that the load on the system never exceeds its processing capacity.
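The five points above fit in a few lines of code. Below is a minimal leaky-bucket sketch — a toy model, not Sentinel's actual RateLimiterController: the water level is drained lazily according to elapsed time, and a request is admitted only while the bucket has spare capacity.

```java
// Toy leaky bucket: a fixed-capacity "bucket" drained at a constant
// rate; an arriving request is admitted only if the bucket is not full.
class LeakyBucket {
    private final double capacity;   // maximum water level (requests)
    private final double leakPerMs;  // constant outflow rate
    private double level = 0;        // current water level
    private long lastLeakMs = System.currentTimeMillis();

    LeakyBucket(double capacity, double leakPerMs) {
        this.capacity = capacity;
        this.leakPerMs = leakPerMs;
    }

    synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // The bucket leaks continuously, whether or not requests arrive.
        level = Math.max(0, level - (now - lastLeakMs) * leakPerMs);
        lastLeakMs = now;
        if (level + 1 <= capacity) {
            level += 1;   // room left: put the request into the bucket
            return true;
        }
        return false;     // bucket full: reject the request
    }
}
```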
Advantages of the Leaky Bucket Algorithm
• Intuitive: the principle is simple and direct, easy to understand and implement.
• Stability: limiting the request release rate smooths out traffic fluctuations and prevents bursts of requests from overloading the system.
• Control precision: the leaky bucket algorithm can control the request release rate precisely, keeping the system within an acceptable range and providing better quality of service.
Summary
The leaky bucket algorithm is a classic rate-limiting algorithm used in Sentinel's flow-control module.
It borrows the image of a leaky bucket to cap the request rate, and uses the bucket's steady leak to balance the system's load.
The leaky bucket algorithm is intuitive and easy to understand, and it controls the request release rate stably.
In distributed systems, Sentinel's leaky bucket algorithm can effectively limit concurrent requests, keeping the system stable and reliable.
Data Link Layer Exercises for "Computer Networks"

3.1 What functions does link control at the data link layer include?
Answer:
1. Link management: the establishment, maintenance, and release of data links.
2. Framing: the data link layer's unit of transmission is the frame.
Framing, also called frame synchronization, means that the receiver must be able to determine precisely where each frame begins and ends within the received bit stream.
3. Flow control: the sender must transmit at a rate the receiver can keep up with.
When the receiver cannot keep up, the sender's rate must be throttled in time; this is flow control.
4. Error control: since links generally have a very low error rate, coding techniques are commonly used. There are two classes: one is forward error correction, in which the receiver can automatically correct errors in a received frame.
This approach has high overhead and is not well suited to computer communication. The other is error detection, in which the receiver can detect that a received frame contains errors (without knowing exactly where they are).
It may then do nothing at all, or the data link layer may take responsibility for retransmitting the frame.
5. Separating data from control information: in most cases, data and control information travel in the same frame, so there must be measures that let the receiver distinguish them.
6. Transparent transmission: transparent transmission means that any bit pattern whatsoever can be carried over the link; when the transmitted data happens to contain a bit pattern identical to some control information, reliable measures must ensure the receiver does not mistake that data for control information (see the bit-stuffing sketch after this list).
7. Addressing: every frame must be delivered to the correct destination station.
The receiver should also know which station the sender is.
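One standard measure for the transparent-transmission requirement in point 6 is bit stuffing, sketched below under the usual HDLC convention (flag 01111110, insert a 0 after every five consecutive 1s in the payload); this is textbook background rather than part of the exercises themselves.

```java
// Bit stuffing: after any run of five consecutive 1s in the payload,
// the sender inserts a 0, so the flag pattern 01111110 can never
// appear inside the data. The receiver removes the stuffed 0s.
final class BitStuffing {
    static String stuff(String bits) {
        StringBuilder out = new StringBuilder();
        int ones = 0;
        for (char b : bits.toCharArray()) {
            out.append(b);
            ones = (b == '1') ? ones + 1 : 0;
            if (ones == 5) {
                out.append('0'); // break the run before it can mimic a flag
                ones = 0;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(stuff("01111110")); // prints 011111010
    }
}
```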
3.2 Consider the stop-and-wait protocol algorithm.
At the receiving node, suppose that in step (3) "otherwise go to (6)" is changed to "otherwise go to (2)". What is the result?
Answer: Going directly to (2) means that a duplicated data frame is simply discarded without any further processing.
The sending node will then believe the receiving node still has not received the frame and will retransmit it, which is wasteful.
3.3 The channel rate is 4 kb/s.
The stop-and-wait protocol is used.
The propagation delay is tp = 20 ms. The acknowledgment frame length and the processing time are both negligible.
How long must a frame be for the channel utilization to reach at least 50%?
Answer: Utilization is 50% when the time to transmit one frame equals twice the channel's propagation delay, i.e. the round-trip propagation time. The frame transmission time must therefore be at least 2tp = 40 ms, so the frame must be at least 4 kb/s × 0.04 s = 160 bits long.
The Difference Between the Leaky Bucket and Token Bucket Algorithms

The leaky bucket algorithm and the token bucket algorithm look similar on the surface and are easily confused.
In fact, the two have quite different characteristics and are used for different purposes.
The difference between them is that the leaky bucket algorithm rigidly limits the data transmission rate,
while the token bucket algorithm limits the average transmission rate yet still permits a degree of burst transmission.
It should be noted that in some situations the leaky bucket algorithm cannot use network resources effectively.
Because the leak rate is fixed, even when the network is free of congestion, the leaky bucket algorithm cannot let a single data flow burst up to the port rate.
The leaky bucket algorithm is therefore inefficient for traffic with bursty characteristics,
whereas the token bucket algorithm can accommodate such bursty traffic.
Usually, the leaky bucket and token bucket algorithms are combined to provide more efficient control of network traffic.
1. How leaky bucket rate limiting works
Water drips out of the bucket at a fixed rate and may be poured in at an arbitrary rate; the bucket's capacity never changes.
Inflow: drops are poured into the bucket at an arbitrary rate.
Outflow: drops leave the bucket at a fixed rate.
A drop: a unique, non-repeating identifier (one per request).
Because the bucket's capacity is fixed, if the drop inflow rate exceeds the drop outflow rate, the drops in the bucket may overflow.
The requests represented by the overflowing drops are all denied access, or the service-degradation method is invoked directly.
(This presumes the requests arrive at the same moment.)
2. The token bucket algorithm (Token)
The token bucket involves two actions: action 1, tokens are put into the bucket at a fixed rate; action 2, a client that wants to issue a request must first obtain a token from the bucket.
Guava provides the RateLimiter class for rate limiting.
1. The traditional way of integrating RateLimiter has serious drawbacks: a great deal of duplicated code, and no built-in annotation support.
2. If the rate-limiting code is placed in the gateway, it effectively rate-limits every service interface (methods that should not be limited can be excluded with an exclusion list), which is not very maintainable.
3. In a typical Internet company's projects, not every service interface needs rate limiting; usually only high-traffic interfaces do.
For example: flash sales, 12306 ticket grabbing, and the like.
4. You can solve this by hand-rolling a RateLimiter annotation: tokens are put into the token bucket at a specified rate, a user request must obtain a token from the bucket before it may enter the business-logic method, and if it cannot obtain a token, access is denied.
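A minimal sketch of that idea using Guava's real RateLimiter API (create and tryAcquire); the service class, method, and rate are invented for the example, and a production version would typically wrap this logic in the annotation described above.

```java
import com.google.common.util.concurrent.RateLimiter;

// Token-bucket style limiting: permits are replenished at a steady
// rate, and a request proceeds only if it can take one immediately.
public class SeckillService {
    private final RateLimiter limiter = RateLimiter.create(100.0); // 100 permits/second

    public String buy(String itemId) {
        if (!limiter.tryAcquire()) {
            return "rejected: no token available";  // bucket empty: deny access
        }
        return "order placed for " + itemId;        // token obtained: run business logic
    }
}
```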
Research on Particle Swarm Optimization for Cloud Computing Task Scheduling

Cloud computing task scheduling is the process of assigning tasks to suitable computing resources for execution in a cloud computing environment.
The efficiency and quality of task scheduling directly affect the performance of the cloud computing system and the user experience.
To optimize scheduling results, researchers have proposed many optimization algorithms; one classic choice is Particle Swarm Optimization (PSO).
This article focuses on particle swarm optimization for cloud computing task scheduling.
Particle swarm optimization is an optimization algorithm based on simulating collective behavior, first proposed in the 1990s by Eberhart and Kennedy of Indiana University in the United States.
It solves optimization problems by imitating the way a flock of birds searches for food.
The core idea is that, through iteration and information sharing among the particles, each particle's position and velocity are continually updated until the optimal solution is found.
In cloud task scheduling, particle swarm optimization can represent a task assignment as a particle's position and the computing resources as the particle's velocity.
The basic flow of the algorithm is as follows (the canonical update equations are given after this list):
1. Determine the problem's objective function.
The goal of cloud task scheduling is usually to minimize task execution time, maximize system utilization, or minimize energy consumption.
The objective function must be able to rate how good each candidate solution is.
2. Initialize the particles' positions and velocities.
Each particle represents a task; the position encodes the task's scheduling plan, and the velocity encodes the task's execution time or another indicator.
Positions and velocities can be initialized randomly.
3. Compute each particle's fitness, i.e. the value of the objective function.
Based on the fitness values, update each particle's personal best position and the swarm's global best position.
4. Adjust each particle's velocity and position according to its personal best position and the global best position.
Update formulas from other particle swarm variants can be borrowed here, for example linear or nonlinear velocity and position update formulas.
5. Check the termination condition: if it is satisfied (for example, the maximum number of iterations is reached or the objective value falls below some threshold), output the best solution; otherwise return to step 3 and continue iterating.
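For step 4, the canonical (linear) velocity and position updates take the following form; the symbols below are the conventional ones and are not defined in the original text:

$$v_i^{t+1} = w\,v_i^t + c_1 r_1 \left(p_i - x_i^t\right) + c_2 r_2 \left(g - x_i^t\right), \qquad x_i^{t+1} = x_i^t + v_i^{t+1},$$

where \(x_i^t\) and \(v_i^t\) are particle \(i\)'s position and velocity at iteration \(t\), \(p_i\) its personal best, \(g\) the global best, \(w\) the inertia weight, \(c_1, c_2\) the acceleration coefficients, and \(r_1, r_2\) random numbers drawn uniformly from \([0, 1]\).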
In cloud computing task scheduling, particle swarm optimization has the following strengths:
1. Strong global search ability.
Guided by the global best information, the particles have good global behavior and can find the optimal or a near-optimal solution to the problem.
2. High parallel efficiency.
The particles' updates within an iteration are independent of one another, so parallel computation can be exploited to improve the algorithm's running efficiency.
Research on a Multi-Queue VC Scheduling Algorithm
Wu Shunxian, Liu Yanheng, Tian Ming
College of Computer Science and Technology, Jilin University, Changchun 130012, Jilin. E-mail: wushxian@
Abstract: This paper first analyzes the strengths and weaknesses of the VC and GPS/PGPS packet scheduling algorithms, and on that basis proposes MQVC, a multi-queue VC scheduling algorithm with the properties of the GPS scheduling algorithm.
It fully describes MQVC's design goals and improvements, presents the MQVC algorithm model and an algorithm description, and proves through theorems and lemmas that the model clearly improves on the single-queue VC and PGPS scheduling models in implementation complexity, system scheduling performance, and packet loss.
Network simulation results show that the algorithm achieves good quality-of-service performance.
Keywords: scheduling algorithm, Virtual Clock, MQVC, GPS, quality of service.
CLC number: TP393.02; Document code: A
1. Introduction
In research on high-speed routers, one of the core topics is the packet scheduling algorithm inside the router [1].
A comprehensive comparison of scheduling algorithms [2] shows that, among packet scheduling algorithms for integrated-services networks, VC (Virtual Clock) [3,4] and GPS (Generalized Processor Sharing)/PGPS (Packet-by-packet Generalized Processor Sharing) [5] are two fairly typical scheduling models.
The VC algorithm is a statistical time-division service model based on TDM.
Reference [6] considers the VC scheduling algorithm "unfair", but references [3,4,7,8] demonstrate its effectiveness, feasibility, and advantages both theoretically and by simulation, concluding that VC is a scheduling algorithm that can effectively provide service-rate guarantees, delay guarantees, and per-flow isolation and monitoring.
Relative to the algorithm model proposed in this paper, the original VC algorithm is referred to as the single-queue VC scheduling algorithm.
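For reference, the classic Virtual Clock stamping rule (the textbook formulation of VC, not notation taken from this paper): when a packet of length \(L\) arrives at time \(a\) on flow \(i\) with reserved rate \(r_i\), the flow's virtual clock is advanced as

$$VC_i \leftarrow \max(a,\, VC_i) + \frac{L}{r_i},$$

and the scheduler always transmits the queued packet carrying the smallest \(VC\) stamp, which is what yields the per-flow rate guarantee and isolation mentioned above.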
The GPS scheduling algorithm is an idealized, absolute, flow-based fair queueing algorithm that cannot be implemented under realistic conditions; approximations of GPS have produced a series of scheduling algorithms such as WFQ [1], PGPS [5], WF2Q [9], WF2Q+ [10], and SFQ [11], whose main differences lie in the definition of virtual time and the choice of scheduling conditions.
PGPS is a good packet-by-packet approximation of GPS.
Current improvements to the VC algorithm include the forward-jump virtual clock [12], the core-stateless virtual clock [13], and virtual-clock-based admission control [14].
Congestion Control for Large-Scale RDMA Deployments
Yibo Zhu (1,3), Haggai Eran (2), Daniel Firestone (1), Chuanxiong Guo (1), Marina Lipshteyn (1), Yehonatan Liron (2), Jitendra Padhye (1), Shachar Raindel (2), Mohamad Haj Yahia (2), Ming Zhang (1)
(1) Microsoft, (2) Mellanox, (3) U. C. Santa Barbara

ABSTRACT
Modern datacenter applications demand high throughput (40Gbps) and ultra-low latency (< 10 µs per hop) from the network, with low CPU overhead. Standard TCP/IP stacks cannot meet these requirements, but Remote Direct Memory Access (RDMA) can. On IP-routed datacenter networks, RDMA is deployed using the RoCEv2 protocol, which relies on Priority-based Flow Control (PFC) to enable a drop-free network. However, PFC can lead to poor application performance due to problems like head-of-line blocking and unfairness. To alleviate these problems, we introduce DCQCN, an end-to-end congestion control scheme for RoCEv2. To optimize DCQCN performance, we build a fluid model, and provide guidelines for tuning switch buffer thresholds, and other protocol parameters. Using a 3-tier Clos network testbed, we show that DCQCN dramatically improves throughput and fairness of RoCEv2 RDMA traffic. DCQCN is implemented in Mellanox NICs, and is being deployed in Microsoft's datacenters.

CCS Concepts
• Networks → Transport protocols

Keywords
Datacenter transport; RDMA; PFC; ECN; congestion control

1. INTRODUCTION
Datacenter applications like cloud storage [16] need high bandwidth (40Gbps or more) to meet rising customer demand. Traditional TCP/IP stacks cannot be used at such speeds, since they have very high CPU overhead [29]. The brutal economics of the cloud services business dictates that CPU usage that cannot be monetized should be minimized: a core spent on supporting high TCP throughput is a core that cannot be sold as a VM. Other applications such as distributed memory caches [10, 30] and large-scale machine learning demand ultra-low latency (less than 10 µs per hop) message transfers. Traditional TCP/IP stacks have far higher latency [10].
We are deploying Remote Direct Memory Access (RDMA) technology in Microsoft's datacenters to provide ultra-low latency and high throughput to applications, with very low CPU overhead. With RDMA, network interface cards (NICs) transfer data in and out of pre-registered memory buffers at both end hosts. The networking protocol is implemented entirely on the NICs, bypassing the host networking stack. The bypass significantly reduces CPU overhead and overall latency. To simplify design and implementation, the protocol assumes a lossless networking fabric.
While the HPC community has long used RDMA in special-purpose clusters [11, 24, 26, 32, 38], deploying RDMA on a large scale in modern, IP-routed datacenter networks presents a number of challenges.
One key challenge is the need for a congestion control protocol that can operate efficiently in a high-speed, lossless environment, and that can be implemented on the NIC. We have developed a protocol, called Datacenter QCN (DCQCN), for this purpose. DCQCN builds upon the congestion control components defined in the RoCEv2 standard. DCQCN is implemented in Mellanox NICs, and is currently being deployed in Microsoft's datacenters.
To understand the need for DCQCN, it is useful to point out that historically, RDMA was deployed using InfiniBand (IB) [19, 21] technology. IB uses a custom networking stack, and purpose-built hardware. The IB link layer (L2) uses hop-by-hop, credit-based flow control to prevent packet drops due to buffer overflow. The lossless L2 allows the IB transport protocol (L4) to be simple and highly efficient. Much of the IB protocol stack is implemented on the NIC. IB supports RDMA with so-called single-sided operations, in which a server registers a memory buffer with its NIC, and clients read (write) from (to) it, without further involvement of the server's CPU.
However, the IB networking stack cannot be easily deployed in modern datacenters. Modern datacenters are built with IP and Ethernet technologies, and the IB stack is incompatible with these. DC operators are reluctant to deploy and manage two separate networks within the same datacenter. Thus, to enable RDMA over Ethernet and IP networks, the RDMA over Converged Ethernet (RoCE) [20] standard, and its successor RoCEv2 [22], have been defined. RoCEv2 retains the IB transport layer, but replaces the IB networking layer (L3) with IP and UDP encapsulation, and replaces IB L2 with Ethernet. The IP header is needed for routing, while the UDP header is needed for ECMP [15].
To enable efficient operation, like IB, RoCEv2 must also be deployed over a lossless L2. To this end, RoCE is deployed using Priority-based Flow Control (PFC) [18]. PFC allows an Ethernet switch to avoid buffer overflow by forcing the immediate upstream entity (either another switch or a host NIC) to pause data transmission. However, PFC is a coarse-grained mechanism. It operates at port (or, port plus priority) level, and does not distinguish between flows. This can cause congestion-spreading, leading to poor performance [1, 37].
The fundamental solution to PFC's limitations is a flow-level congestion control protocol. In our environment, the protocol must meet the following requirements: (i) function over lossless, L3 routed, datacenter networks, (ii) incur low CPU overhead on end hosts, and (iii) provide hyper-fast start in the common case of no congestion. Current proposals for congestion control in DC networks do not meet all our requirements. For example, QCN [17] does not support L3 networks. DCTCP [2] and iWarp [35] include a slow start phase, which can result in poor performance for bursty storage workloads. DCTCP and TCP-Bolt [37] are implemented in software, and can have high CPU overhead.
Since none of the current proposals meet all our requirements, we have designed DCQCN. DCQCN is an end-to-end congestion control protocol for RoCEv2, to enable deployment of RDMA in large, IP-routed datacenter networks. DCQCN requires only the standard RED [13] and ECN [34] support from the datacenter switches. The rest of the protocol functionality is implemented on the end host NICs.
DCQCN provides fast convergence to fairness, achieves high link utilization, and ensures low queue buildup and low queue oscillations.
The paper is organized as follows. In §2 we present evidence to justify the need for DCQCN. The detailed design of DCQCN is presented in §3, along with a brief summary of the hardware implementation. In §4 we show how to set the PFC and ECN buffer thresholds to ensure correct operation of DCQCN. In §5 we describe a fluid model of DCQCN, and use it to tune protocol parameters. In §6, we evaluate the performance of DCQCN using a 3-tier testbed and traces from our datacenters. Our evaluation shows that DCQCN dramatically improves throughput and fairness of RoCEv2 RDMA traffic. In some scenarios, it allows us to handle as much as 16x more user traffic. Finally, in §7, we discuss practical issues such as non-congestion packet losses.

2. THE NEED FOR DCQCN
To justify the need for DCQCN, we will show that TCP stacks cannot provide high bandwidth with low CPU overhead and ultra-low latency, while RDMA over RoCEv2 can. Next, we will show that PFC can hurt the performance of RoCEv2. Finally, we will argue that existing solutions to cure PFC's ills are not suitable for our needs.

Conventional TCP stacks perform poorly
We now compare the throughput, CPU overhead and latency of RoCEv2 and conventional TCP stacks. These experiments use two machines (Intel Xeon E5-2660 2.2GHz, 16 core, 128GB RAM, 40Gbps NICs, Windows Server 2012R2) connected via a 40Gbps switch.
Throughput and CPU utilization: To measure TCP throughput, we use Iperf [46] customized for our environment. Specifically, we enable LSO [47], RSS [49], and zero-copy operations, and use 16 threads. To measure RDMA throughput, we use a custom tool that uses the IB READ operation to transfer data. With RDMA, a single thread saturates the link.
Figure 1(a) shows that TCP has high CPU overhead. For example, with 4MB message size, to drive full throughput, TCP consumes, on average, over 20% CPU cycles across all cores. At smaller message sizes, TCP cannot saturate the link as CPU becomes the bottleneck. Marinos et al. [29] have reported similarly poor TCP performance for Linux and FreeBSD. Even the user-level stack they propose consumes over 20% CPU cycles. In contrast, the CPU utilization of the RDMA client is under 3%, even for small message sizes. The RDMA server, as expected, consumes almost no CPU cycles.
Latency: Latency is the key metric for small transfers. We now compare the average user-level latency of transferring a 2K message, using TCP and RDMA. To minimize TCP latency, the connections were pre-established and warmed, and Nagle was disabled. Latency was measured using a high resolution (≤1 µs) timer [48]. There was no other traffic on the network.
Figure 1(c) shows that TCP latency (25.4 µs) is significantly higher than RDMA (1.7 µs for Read/Write and 2.8 µs for Send). Similar TCP latency has been reported in [10] for Windows, and in [27] for Linux.

PFC has limitations
RoCEv2 needs PFC to enable a drop-free Ethernet fabric. PFC prevents buffer overflow on Ethernet switches and NICs. The switches and NICs track ingress queues. When a queue exceeds a certain threshold, a PAUSE message is sent to the upstream entity. The upstream entity then stops sending on that link till it gets a RESUME message. PFC specifies up to eight priority classes.
PAUSE/RESUME messages specify the priority class they apply to. The problem is that the PAUSE mechanism operates on a per port (and priority) basis, not on a per-flow basis. This can lead to head-of-line blocking problems, resulting in poor performance for individual flows. We now illustrate the problems using a 3-tier testbed (Figure 2) representative of modern datacenter networks.

[Figure 1: Throughput, CPU consumption and latency of TCP and RDMA; panels (a) mean throughput, (b) mean CPU utilization, and (c) mean latency, each as a function of message size.]
[Figure 2: Testbed topology. All links are 40Gbps. All switches are Arista 7050QX32. There are four ToRs (T1-T4), four leaves (L1-L4) and two spines (S1-S2). Each ToR represents a different IP subnet. Routing and ECMP is done via BGP. Servers have multiple cores, large RAMs, and 40Gbps NICs.]
[Figure 3: PFC unfairness; (a) topology, (b) throughput of individual senders.]
[Figure 4: Victim flow problem; (a) topology, (b) median throughput of victim flow.]

Unfairness: Consider Figure 3(a). Four senders (H1-H4) send data to the single receiver (R) using the RDMA WRITE operation. All senders use the same priority class. Ideally, the four senders should equally share the bottleneck link (T4 to R). However, with PFC, there is unfairness. When the queue starts building up on T4, it pauses incoming links (ports P2-P4). However, P2 carries just one flow (from H4), while P3 and P4 may carry multiple flows since H1, H2 and H3 must share these two ports, depending on how ECMP maps the flows. Thus, H4 receives higher throughput than H1-H3. This is known as the parking lot problem [14].
This is shown in Figure 3(b), which shows the min, median and max throughput achieved by H1-H4, measured over 1000 4MB data transfers. H4 gets as much as 20Gbps throughput, e.g. when ECMP maps all of H1-H3 to either P3 or P4. H4's minimum throughput is higher than the maximum throughput of H1-H3.
Victim flow: Because PAUSE frames can have a cascading effect, a flow can be hurt by congestion that is not even on its path. Consider Figure 4(a). Four senders (H11-H14) send data to R. In addition, we have a "victim flow": VS sending to VR. Figure 4(b) shows the median throughput (250 transfers of 250MB each) of the victim flow.
When there are no senders under T3, in the median case (two of H11-H14 map to T1-L1, the others to T1-L2; each of H11-H14 gets 10Gbps throughput; VS maps to one of T1's uplinks), one might expect VS to get 20Gbps throughput. However, we see that it only gets 10Gbps. This is due to cascading PAUSEs. As T4 is the bottleneck of the H11-H14 incast, it ends up PAUSEing its incoming links. This in turn leads L3 and L4 to pause their incoming links, and so forth. Eventually, L1 and L2 end up pausing T1's uplinks to them, and T1 is forced to PAUSE the senders. The flows on T1 that use these uplinks are equally affected by these PAUSEs, regardless of their destinations; this is also known as the head-of-the-line blocking problem.
The problem gets worse as we start senders H31 and H32 that also send to R. We see that the median throughput further falls from 10Gbps to 4.5Gbps, even though no path from H31 and H32 to R has any links in common with the path between VS and VR.
This happens because H31 and H32 compete with H11-H14 on L3 and L4, make them PAUSE S1 and S2 longer, and eventually make T1 PAUSE the senders longer.
Summary: These experiments show that flows in RoCEv2 deployments may see lower throughput and/or high variability due to PFC's congestion-spreading characteristics.

Existing proposals are inadequate
A number of proposals have tried to address PFC's limitations. Some have argued that ECMP can mitigate the problem by spreading traffic on multiple links. The experiments in the previous section show that this is not always the case. The PFC standard itself includes a notion of priorities to address the head-of-the-line blocking problem. However, the standard supports only 8 priority classes, and both scenarios shown above can be made arbitrarily worse by expanding the topology and adding more senders. Moreover, flows within the same class will still suffer from PFC's limitations.
The fundamental solution to PFC's problems is to use flow-level congestion control. If appropriate congestion control is applied on a per-flow basis, PFC will be rarely triggered, and thus the problems described earlier in this section will be avoided.
The Quantized Congestion Notification (QCN) [17] standard was defined for this purpose. QCN enables flow-level congestion control within an L2 domain. Flows are defined using source/destination MAC address and a flow id field. A switch computes a congestion metric upon each packet arrival. Its value depends on the difference between the instantaneous queue size and the desired equilibrium queue size, along with other factors. The switch then probabilistically (the probability depends on the severity of the congestion) sends the quantized value of the congestion metric as feedback to the source of the arriving packet. The source reduces its sending rate in response to congestion feedback. Since no feedback is sent if there is no congestion, the sender increases its sending rate using internal timers and counters.
QCN cannot be used in IP-routed networks because the definition of a flow is based entirely on L2 addresses. In IP-routed networks the original Ethernet header is not preserved as the packet travels through the network. Thus a congested switch cannot determine the target to send the congestion feedback to.
We considered extending the QCN protocol to IP-routed networks. However, this is not trivial to implement. At minimum, extending QCN to IP-routed networks requires using the IP five-tuple as the flow identifier, and adding IP and UDP headers to the congestion notification packet to enable it to reach the right destination. Implementing this requires hardware changes to both the NICs and the switches. Making changes to the switches is especially problematic, as the QCN functionality is deeply integrated into the ASICs. It usually takes months, if not years, for ASIC vendors to implement, validate and release a new switch ASIC. Thus, updating the chip design was not an option for us.
In §8 we will discuss why other proposals such as TCP-Bolt [37] and iWarp [35] do not meet our needs. Since the existing proposals are not adequate, for our purpose, we propose DCQCN.

3. THE DCQCN ALGORITHM
DCQCN is a rate-based, end-to-end congestion protocol that builds upon QCN [17] and DCTCP [2].
Most of the DCQCN functionality is implemented in the NICs.
As mentioned earlier, we had three core requirements for DCQCN: (i) the ability to function over lossless, L3 routed, datacenter networks, (ii) low CPU overhead and (iii) hyper-fast start in the common case of no congestion. In addition, we also want DCQCN to provide fast convergence to fair bandwidth allocation, avoid oscillations around the stable point, maintain low queue length, and ensure high link utilization.
There were also some practical concerns: we could not demand any custom functionality from the switches, and since the protocol is implemented in the NIC, we had to be mindful of implementation overhead and complexity.
The DCQCN algorithm consists of the sender (reaction point (RP)), the switch (congestion point (CP)), and the receiver (notification point (NP)).

Algorithm
CP Algorithm: The CP algorithm is the same as DCTCP. At an egress queue, an arriving packet is ECN [34]-marked if the queue length exceeds a threshold. This is accomplished using the RED [13] functionality (Figure 5) supported on all modern switches. To mimic DCTCP, we can set K_min = K_max = K, and P_max = 1. Later, we will see that this is not the optimal setting.

[Figure 5: Switch packet marking algorithm.]
[Figure 6: NP state machine.]

NP Algorithm: ECN-marked packets arriving at the NP indicate congestion in the network. The NP conveys this information back to the sender. The RoCEv2 standard defines explicit Congestion Notification Packets (CNP) [19] for this purpose. The NP algorithm specifies how and when CNPs should be generated.
The algorithm follows the state machine in Figure 6 for each flow. If a marked packet arrives for a flow, and no CNP has been sent for the flow in the last N microseconds, a CNP is sent immediately. Then, the NIC generates at most one CNP packet every N microseconds for the flow, if any packet that arrives within that time window was marked. We use N = 50µs in our deployment. Processing a marked packet and generating the CNP are expensive operations, so we minimize the activity for each marked packet. We discuss the implications in §5.
RP Algorithm: When an RP (i.e. the flow sender) gets a CNP, it reduces its current rate (R_C) and updates the value of the rate reduction factor, α, like DCTCP, and remembers the current rate as the target rate (R_T) for later recovery. The values are updated as follows (the initial value of α is 1):

R_T = R_C,
R_C = R_C (1 − α/2),      (1)
α = (1 − g)α + g

The NP generates no feedback if it does not get any marked packets. Thus, if the RP gets no feedback for K time units, it updates α, as shown in Equation (2). Note that K must be larger than the CNP generation timer. Our implementation uses K = 55µs. See §5 for further discussion.

α = (1 − g)α      (2)

Furthermore, the RP increases its sending rate using a timer and a byte counter, in a manner identical to QCN [17]. The byte counter increases the rate for every B bytes, while the timer increases the rate every T time units. The timer ensures that the flow can recover quickly even when its rate has dropped to a low value. The two parameters can be tuned to achieve the desired aggressiveness.
The rate increase has two main phases: fast recovery, where the rate is rapidly increased towards the fixed target rate for F = 5 successive iterations:

R_C = (R_T + R_C)/2      (3)

Fast recovery is followed by an additive increase, where the current rate slowly approaches the target rate, and the target rate is increased in fixed steps R_AI:

R_T = R_T + R_AI,
R_C = (R_T + R_C)/2      (4)

There is also a hyper increase phase for fast ramp up. Figure 7 shows the state machine. See [17] for more details. Note that there is no slow start phase. When a flow starts, it sends at full line rate, if there are no other active flows from the host (otherwise, the starting rate is defined by local QoS policies). This design decision optimizes the common case where flows transfer a relatively small amount of data, and the network is not congested [25].

[Figure 7: Pseudocode of the RP algorithm.]

Benefits
By providing per-flow congestion control, DCQCN alleviates PFC's limitations. To illustrate this, we repeat the experiments in §2.2, with DCQCN enabled (parameters set according to the guidelines in §4 and §5).
Figure 8 shows that DCQCN solves the unfairness problem depicted in Figure 3. All four flows get an equal share of the bottleneck bandwidth, and there is little variance. Figure 9 shows that DCQCN solves the victim flow problem depicted in Figure 4. With DCQCN, the throughput of the VS-VR flow does not change as we add senders under T3.

[Figure 8: Throughput of individual senders with DCQCN. Compare to Figure 3(b).]
[Figure 9: Median throughput of the "victim" flow with DCQCN. Compare to Figure 4(b).]
[Figure 10: Fluid model closely matches implementation.]

Discussion
CNP generation: DCQCN is not particularly sensitive to congestion on the reverse path, as the send rate does not depend on accurate RTT estimation like TIMELY [31]. Still, we send CNPs with high priority, to avoid missing the CNP deadline, and to enable faster convergence. Note that no CNPs are generated in the common case of no congestion.
Rate based congestion control: DCQCN is a rate-based congestion control scheme. We adopted a rate-based approach because it was simpler to implement than a window based approach, and allowed for finer-grained control.
Parameters: DCQCN is based on DCTCP and QCN, but it differs from each in key respects. For example, unlike QCN, there is no quantized feedback, and unlike DCTCP there is no "per-ack" feedback. Thus, the parameter settings recommended for DCTCP and QCN cannot be blindly used with DCQCN. In §5, we use a fluid model of DCQCN to establish the optimal parameter settings.
The need for PFC: DCQCN does not obviate the need for PFC. With DCQCN, flows start at line rate. Without PFC, this can lead to packet loss and poor performance (§6).
Hardware implementation: The NP and RP state machines are implemented on the NIC. The RP state machine requires keeping one timer and one counter for each flow that is being rate limited, apart from a small amount of other state such as the current value of alpha. This state is maintained on the NIC die. The rate limiting is on a per-packet granularity. The implementation of the NP state machine in ConnectX-3 Pro can generate CNPs at the rate of one per 1-5 microseconds. At a link rate of 40Gbps, the receiver can receive about 166 full-sized (1500 byte MTU) packets every 50 microseconds. Thus, the NP can typically support CNP generation at the required rate for 10-20 congested flows. The current version (ConnectX-4) can generate CNPs at the required rate for over 200 flows.
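To make Equations (1)-(4) concrete, here is a minimal host-language sketch of the RP update rules (the hyper increase phase is omitted); the class and method names are invented, and the real state machine runs in NIC hardware, not host code.

```java
// Sketch of the DCQCN reaction point: rate cut on CNP arrival
// (Eq. 1), alpha decay when no CNP arrives for K time units (Eq. 2),
// and fast recovery / additive increase on timer or byte-counter
// events (Eqs. 3-4).
class DcqcnReactionPoint {
    private double targetRate;       // R_T
    private double currentRate;      // R_C
    private double alpha = 1.0;      // initial value of alpha is 1
    private final double g;          // feedback gain
    private final double rateAi;     // additive-increase step R_AI
    private int fastRecoveryLeft = 0;
    private static final int F = 5;  // fast-recovery iterations

    DcqcnReactionPoint(double lineRate, double g, double rateAi) {
        this.targetRate = lineRate;
        this.currentRate = lineRate; // flows start at line rate: no slow start
        this.g = g;
        this.rateAi = rateAi;
    }

    void onCnp() {                   // Equation (1)
        targetRate = currentRate;
        currentRate *= 1 - alpha / 2;
        alpha = (1 - g) * alpha + g;
        fastRecoveryLeft = F;
    }

    void onAlphaTimer() {            // Equation (2): no CNP for K time units
        alpha = (1 - g) * alpha;
    }

    void onIncreaseEvent() {         // timer fired or B bytes sent
        if (fastRecoveryLeft > 0) {
            fastRecoveryLeft--;      // Equation (3): fast recovery
        } else {
            targetRate += rateAi;    // Equation (4): additive increase
        }
        currentRate = (targetRate + currentRate) / 2;
    }
}
```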
4. BUFFER SETTINGS
Correct operation of DCQCN requires balancing two conflicting requirements: (i) PFC is not triggered too early, i.e. before giving ECN a chance to send congestion feedback, and (ii) PFC is not triggered too late, thereby causing packet loss due to buffer overflow.
We now calculate the values of three key switch parameters, t_flight, t_PFC and t_ECN, to ensure that these two requirements are met even in the worst case. Note that different switch vendors use different terms for these settings; we use generic names. The discussion is relevant to any shared-buffer switch, but the calculations are specific to switches like the Arista 7050QX32 that use the Broadcom Trident II chipset. These switches have 32 full duplex 40Gbps ports, 12MB of shared buffer, and support 8 PFC priorities.
Headroom buffer t_flight: A PAUSE message sent to an upstream device takes some time to arrive and take effect. To avoid packet drops, the PAUSE sender must reserve enough buffer to process any packets it may receive during this time. This includes packets that were in flight when the PAUSE was sent, and the packets sent by the upstream device while it is processing the PAUSE message. The worst-case calculations must consider several additional factors (e.g., a switch cannot abandon a packet transmission it has begun) [8]. Following the guidelines in [8], and assuming a 1500 byte MTU, we get t_flight = 22.4KB per port, per priority.
PFC Threshold t_PFC: This is the maximum size an ingress queue can grow to before a PAUSE message is sent to the upstream device. Each PFC priority gets its own queue at each ingress port. Thus, if the total switch buffer is B, and there are n ports, it follows that t_PFC ≤ (B − 8n·t_flight)/(8n). For our switches, we get t_PFC ≤ 24.47KB. The switch sends a RESUME message when the queue falls below t_PFC by two MTU.
ECN Threshold t_ECN: Once an egress queue exceeds this threshold, the switch starts marking packets on that queue (K_min in Figure 5). For DCQCN to be effective, this threshold must be such that the PFC threshold is not reached before the switch has a chance to mark packets with ECN.
Note, however, that ECN marking is done on the egress queue while PAUSE messages are sent based on the ingress queue. Thus, t_ECN is an egress queue threshold, while t_PFC is an ingress queue threshold. The worst case scenario is that packets pending on all egress queues come from a single ingress queue. To guarantee that PFC is not triggered on this ingress queue before ECN is triggered on any of the egress queues, we need t_PFC > n·t_ECN. Using the upper bound on the value of t_PFC, we get t_ECN < 0.85KB. This is less than one MTU and hence infeasible.
However, not only do we not have to use the upper bound on t_PFC, we do not even have to use a fixed value for t_PFC. Since the switch buffer is shared among all ports, t_PFC should depend on how much of the shared buffer is free. Intuitively, if the buffer is largely empty, we can afford to wait longer to trigger PAUSE. The Trident II chipset in our switch allows us to configure a parameter β such that t_PFC = β(B − 8n·t_flight − s)/8, where s is the amount of buffer that is currently occupied. A higher β triggers PFC less often, while a lower value triggers PFC more aggressively. Note that s is equal to the sum of the packets pending on all egress queues.
Thus, just before ECN is triggered on any egress port, we have s ≤ n·t_ECN. Hence, to ensure that ECN is always triggered before PFC, we set t_ECN < β(B − 8n·t_flight)/(8n(β + 1)). Obviously, a larger β leaves more room for t_ECN. In our testbed, we use β = 8, which leads to t_ECN < 21.75KB.
Discussion: The above analysis is conservative, and ensures that PFC is not triggered on our switches before ECN even in the worst case and when all 8 PFC priorities are used. With fewer priorities, or with larger switch buffers, the threshold values will be different.
The analysis does not imply that PFC will never be triggered. All we ensure is that at any switch, PFC is not triggered before ECN. It takes some time for the senders to receive the ECN feedback and reduce their sending rate. During this time, PFC may be triggered. As discussed before, we rely on PFC to allow senders to start at line rate.
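The §4 arithmetic can be checked directly; the little program below reproduces the reported bounds using decimal units (1 KB = 1000 bytes, 1 MB = 1e6 bytes), which appears to be the convention the numbers above follow.

```java
// Worst-case buffer thresholds for a 32-port, 12MB shared-buffer
// switch with 8 PFC priorities, following the formulas in §4.
public class BufferThresholds {
    public static void main(String[] args) {
        double B = 12e6;          // total shared buffer (bytes)
        int n = 32;               // number of ports
        double tFlight = 22.4e3;  // headroom per port, per priority (bytes)
        double beta = 8;

        double tPfcMax = (B - 8 * n * tFlight) / (8 * n);
        double tEcnMax = beta * (B - 8 * n * tFlight) / (8 * n * (beta + 1));

        System.out.printf("t_PFC <= %.2f KB%n", tPfcMax / 1e3); // prints 24.47 KB
        System.out.printf("t_ECN <  %.2f KB%n", tEcnMax / 1e3); // prints 21.76 KB (reported as 21.75KB)
    }
}
```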
Computer Networking: A Top-Down Approach, 7th Edition — Chapter 7 Answers
Computer Networking: A Top-Down Approach, 7th Edition
Solutions to Review Questions and Problems

Chapter 7 Review Questions
1. In infrastructure mode of operation, each wireless host is connected to the larger network via a base station (access point). If not operating in infrastructure mode, a network operates in ad-hoc mode. In ad-hoc mode, wireless hosts have no infrastructure with which to connect. In the absence of such infrastructure, the hosts themselves must provide for services such as routing, address assignment, DNS-like name translation, and more.
2. a) Single hop, infrastructure-based
b) Single hop, infrastructure-less
c) Multi-hop, infrastructure-based
d) Multi-hop, infrastructure-less
3. Path loss is due to the attenuation of the electromagnetic signal when it travels through matter. Multipath propagation results in blurring of the received signal at the receiver and occurs when portions of the electromagnetic wave reflect off objects and ground, taking paths of different lengths between a sender and receiver. Interference from other sources occurs when the other source is also transmitting in the same frequency range as the wireless network.
4. a) Increasing the transmission power
b) Reducing the transmission rate
5. APs transmit beacon frames. An AP's beacon frames will be transmitted over one of the 11 channels. The beacon frames permit nearby wireless stations to discover and identify the AP.
6. False
7. APs transmit beacon frames. An AP's beacon frames will be transmitted over one of the 11 channels. The beacon frames permit nearby wireless stations to discover and identify the AP.
8. False
9. Each wireless station can set an RTS threshold such that the RTS/CTS sequence is used only when the data frame to be transmitted is longer than the threshold. This ensures that the RTS/CTS mechanism is used only for large frames.
10. No, there wouldn't be any advantage. Suppose there are two stations that want to transmit at the same time, and they both use RTS/CTS. If the RTS frame is as long as a DATA frame, the channel would be wasted for as long as it would have been wasted for two colliding DATA frames. Thus, the RTS/CTS exchange is only useful when the RTS/CTS frames are significantly smaller than the DATA frames.
11. Initially the switch has an entry in its forwarding table which associates the wireless station with the earlier AP. When the wireless station associates with the new AP, the new AP creates a frame with the wireless station's MAC address and broadcasts the frame. The frame is received by the switch. This forces the switch to update its forwarding table, so that frames destined to the wireless station are sent via the new AP.
12. Any ordinary Bluetooth node can be a master node, whereas access points in 802.11 networks are special devices (normal wireless devices like laptops cannot be used as access points).
13. False
14. "Opportunistic scheduling" refers to matching the physical layer protocol to channel conditions between the sender and the receiver, and choosing the receivers to which packets will be sent based on channel condition. This allows the base station to make best use of the wireless medium.
15. UMTS to GSM and CDMA-2000 to IS-95.
16. The data plane role of the eNodeB is to forward datagrams between the UE (over the LTE radio access network) and the P-GW.
Its control plane role is to handle registration and mobility signaling traffic on behalf of the UE.
The mobility management entity (MME) performs connection and mobility management on behalf of the UEs resident in the cell it controls. It receives UE subscription information from the HSS.
The Packet Data Network Gateway (P-GW) allocates IP addresses to the UEs and performs QoS enforcement. As a tunnel endpoint it also performs datagram encapsulation/decapsulation when forwarding a datagram to/from a UE.
The Serving Gateway (S-GW) is the data-plane mobility anchor point, as all UE traffic will pass through the S-GW. The S-GW also performs charging/billing functions and lawful traffic interception.
17. In the 3G architecture, there are separate network components and paths for voice and data: voice goes through the public telephone network, whereas data goes through the public Internet. The 4G architecture is a unified, all-IP network architecture: both voice and data are carried in IP datagrams to/from the wireless device to several gateways and then to the rest of the Internet.
The 4G network architecture clearly separates the data and control planes, which is different from the 3G architecture.
The 4G architecture has an enhanced radio access network (E-UTRAN) that is different from 3G's radio access network, UTRAN.
18. No. A node can remain connected to the same access point throughout its connection to the Internet (hence, not be mobile). A mobile node is one that changes its point of attachment to the network over time. Since the user is always accessing the Internet through the same access point, she is not mobile.
19. A permanent address for a mobile node is its IP address when it is at its home network. A care-of-address is the one it gets when it is visiting a foreign network. The COA is assigned by the foreign agent (which can be the edge router in the foreign network or the mobile node itself).
20. False
21. The home network in GSM maintains a database called the home location register (HLR), which contains the permanent cell phone number and subscriber profile information about each of its subscribers. The HLR also contains information about the current locations of these subscribers. The visited network maintains a database known as the visitor location register (VLR) that contains an entry for each mobile user that is currently in the portion of the network served by the VLR. VLR entries thus come and go as mobile users enter and leave the network.
The edge router in the home network in mobile IP is similar to the HLR in GSM, and the edge router in the foreign network is similar to the VLR in GSM.
22. The anchor MSC is the MSC visited by the mobile when a call first begins; the anchor MSC thus remains unchanged during the call.
Throughout the call's duration, and regardless of the number of inter-MSC transfers performed by the mobile, the call is routed from the home MSC to the anchor MSC, and then from the anchor MSC to the visited MSC where the mobile is currently located.
23. a) Local recovery
b) TCP sender awareness of wireless links
c) Split-connection approaches

Chapter 7 Problems
Problem 1
Output corresponding to bit d1 = [-1, 1, -1, 1, -1, 1, -1, 1]
Output corresponding to bit d0 = [1, -1, 1, -1, 1, -1, 1, -1]
Problem 2
Sender 2 output = [1, -1, 1, 1, 1, -1, 1, 1]; [1, -1, 1, 1, 1, -1, 1, 1]
Problem 3
d_2^1 = (1×1 + 1×1 + (−1)×(−1) + 1×1 + 1×1 + 1×1 + (−1)×(−1) + 1×1)/8 = 1
d_2^2 = (1×1 + 1×1 + (−1)×(−1) + 1×1 + 1×1 + 1×1 + (−1)×(−1) + 1×1)/8 = 1
Problem 4
Sender 1: (1, 1, 1, -1, 1, -1, -1, -1)
Sender 2: (1, -1, 1, 1, 1, 1, 1, 1)
Problem 5
a) The two APs will typically have different SSIDs and MAC addresses. A wireless station arriving to the café will associate with one of the SSIDs (that is, one of the APs). After association, there is a virtual link between the new station and the AP. Label the APs AP1 and AP2. Suppose the new station associates with AP1. When the new station sends a frame, it will be addressed to AP1. Although AP2 will also receive the frame, it will not process the frame because the frame is not addressed to it. Thus, the two ISPs can work in parallel over the same channel. However, the two ISPs will be sharing the same wireless bandwidth. If wireless stations in different ISPs transmit at the same time, there will be a collision. For 802.11b, the maximum aggregate transmission rate for the two ISPs is 11 Mbps.
b) Now if two wireless stations in different ISPs (and hence different channels) transmit at the same time, there will not be a collision. Thus, the maximum aggregate transmission rate for the two ISPs is 22 Mbps for 802.11b.
Problem 6
Suppose that wireless station H1 has 1000 long frames to transmit. (H1 may be an AP that is forwarding an MP3 to some other wireless station.) Suppose initially H1 is the only station that wants to transmit, but that while half-way through transmitting its first frame, H2 wants to transmit a frame. For simplicity, also suppose every station can hear every other station's signal (that is, no hidden terminals). Before transmitting, H2 will sense that the channel is busy, and therefore choose a random backoff value.
Now suppose that after sending its first frame, H1 returns to step 1; that is, it waits a short period of time (DIFS) and then starts to transmit the second frame. H1's second frame will then be transmitted while H2 is stuck in backoff, waiting for an idle channel. Thus, H1 should get to transmit all of its 1000 frames before H2 has a chance to access the channel. On the other hand, if H1 goes to step 2 after transmitting a frame, then it too chooses a random backoff value, thereby giving a fair chance to H2. Thus, fairness was the rationale behind this design choice.
Problem 7
A frame without data is 32 bytes long. Assuming a transmission rate of 11 Mbps, the time to transmit a control frame (such as an RTS frame, a CTS frame, or an ACK frame) is (256 bits)/(11 Mbps) = 23 usec.
The time required to transmit the data frame is (8256 bits)/(11 Mbps) = 751 usec. The total required time is therefore
DIFS + RTS + SIFS + CTS + SIFS + FRAME + SIFS + ACK = DIFS + 3·SIFS + (3×23 + 751) usec = DIFS + 3·SIFS + 820 usec
Problem 8
a) 1 message/2 slots
b) 2 messages/slot
c) 1 message/slot
Problem 9
a) i) 1 message/slot
ii) 2 messages/slot
iii) 2 messages/slot
b) i) 1 message/4 slots
ii) slot 1: message A→B, message D→C; slot 2: ack B→A; slot 3: ack C→D = 2 messages/3 slots
iii) slot 1: message C→D; slot 2: ack D→C, message A→B; slot 3: ack B→A; repeat = 2 messages/3 slots
Problem 10
a) 10 Mbps if it only transmits to node A. This solution is not fair since only A is getting served. By "fair" it is meant that each of the four nodes should be allotted an equal number of slots.
b) For the fairness requirement such that each node receives an equal amount of data during each downstream sub-frame, let n1, n2, n3, and n4 respectively represent the number of slots that A, B, C and D get.
Data transmitted to A in 1 slot = 10t Mbits (assuming the duration of each slot to be t).
Hence, the total amount of data transmitted to A (in n1 slots) = 10t·n1.
Similarly, the total amounts of data transmitted to B, C, and D equal 5t·n2, 2.5t·n3, and t·n4 respectively.
Now, to fulfill the given fairness requirement, we have the condition: 10t·n1 = 5t·n2 = 2.5t·n3 = t·n4.
Hence, n2 = 2n1, n3 = 4n1, n4 = 10n1.
Now, the total number of slots is N. Hence, n1 + n2 + n3 + n4 = N, i.e. n1 + 2n1 + 4n1 + 10n1 = N, i.e. n1 = N/17.
Hence, n2 = 2N/17, n3 = 4N/17, n4 = 10N/17.
The average transmission rate is given by:
(10t·n1 + 5t·n2 + 2.5t·n3 + t·n4)/(tN) = (10N/17 + 5·2N/17 + 2.5·4N/17 + 1·10N/17)/N = 40/17 = 2.35 Mbps
c) Let node A receive twice as much data as nodes B, C, and D during the sub-frame. Hence, 10t·n1 = 2·5t·n2 = 2·2.5t·n3 = 2·t·n4, i.e. n2 = n1, n3 = 2n1, n4 = 5n1.
Again, n1 + n2 + n3 + n4 = N, i.e. n1 + n1 + 2n1 + 5n1 = N, i.e. n1 = N/9.
Now, the average transmission rate is given by: (10t·n1 + 5t·n2 + 2.5t·n3 + t·n4)/(tN) = 25/9 = 2.78 Mbps.
Similarly, considering that nodes B, C, or D receive twice as much data as any other node, different values for the average transmission rate can be calculated.
Problem 11
a) No. All the routers might not be able to route the datagram immediately. This is because the Distance Vector algorithm (as well as the inter-AS routing protocols like BGP) is decentralized and takes some time to terminate. So, during the time when the algorithm is still running as a result of advertisements from the new foreign network, some of the routers may not be able to route datagrams destined to the mobile node.
b) Yes. This might happen when one of the nodes has just left a foreign network and joined a new foreign network. In this situation, the routing entries from the old foreign network might not have been completely withdrawn when the entries from the new network are being propagated.
c) The time it takes for a router to learn a path to the mobile node depends on the number of hops between the router and the edge router of the foreign network for the node.
Problem 12
If the correspondent is mobile, then any datagrams destined to the correspondent would have to pass through the correspondent's home agent. The foreign agent in the network being visited would also need to be involved, since it is this foreign agent that notifies the correspondent's home agent of the location of the correspondent.
Datagrams received by the correspondent's home agent would need to be encapsulated/tunneled between the correspondent's home agent and foreign agent (as in the case of the encapsulated datagram at the top of Figure 6.23).
Problem 13
Because datagrams must first be forwarded to the home agent, and from there to the mobile, the delays will generally be longer than via direct routing. Note that it is possible, however, that the direct delay from the correspondent to the mobile (i.e., if the datagram is not routed through the home agent) could actually be smaller than the sum of the delay from the correspondent to the home agent and from there to the mobile. It would depend on the delays on these various path segments. Note that indirect routing also adds a home agent processing (e.g., encapsulation) delay.
Problem 14
First, we note that chaining was discussed at the end of Section 6.5. In the case of chaining using indirect routing through a home agent, the following events would happen:
- The mobile node arrives at A. A notifies the home agent that the mobile is now visiting A and that datagrams to the mobile should now be forwarded to the specified care-of-address (COA) in A.
- The mobile node moves to B. The foreign agent at B must notify the foreign agent at A that the mobile is no longer resident in A but in fact is resident in B and has the specified COA in B. From then on, the foreign agent in A will forward datagrams it receives that are addressed to the mobile's COA in A to the mobile's COA in B.
- The mobile node moves to C. The foreign agent at C must notify the foreign agent at B that the mobile is no longer resident in B but in fact is resident in C and has the specified COA in C. From then on, the foreign agent in B will forward datagrams it receives (from the foreign agent in A) that are addressed to the mobile's COA in B to the mobile's COA in C.
Note that when the mobile goes offline (i.e., has no address) or returns to its home network, the datagram-forwarding state maintained by the foreign agents in A, B and C must be removed. This teardown must also be done through signaling messages. Note that the home agent is not aware of the mobile's mobility beyond A, and that the correspondent is not at all aware of the mobile's mobility.
In the case that chaining is not used, the following events would happen:
- The mobile node arrives at A. A notifies the home agent that the mobile is now visiting A and that datagrams to the mobile should now be forwarded to the specified care-of-address (COA) in A.
- The mobile node moves to B. The foreign agent at B must notify the foreign agent at A and the home agent that the mobile is no longer resident in A but in fact is resident in B and has the specified COA in B. The foreign agent in A can remove its state about the mobile, since it is no longer in A. From then on, the home agent will forward datagrams it receives that are addressed to the mobile's COA in B.
- The mobile node moves to C. The foreign agent at C must notify the foreign agent at B and the home agent that the mobile is no longer resident in B but in fact is resident in C and has the specified COA in C. The foreign agent in B can remove its state about the mobile, since it is no longer in B. From then on, the home agent will forward datagrams it receives that are addressed to the mobile's COA in C.
When the mobile goes offline or returns to its home network, the datagram-forwarding state maintained by the foreign agent in C must be removed. This teardown must also be done through signaling messages.
Note that the home agent is always aware of the mobile's current foreign network. However, the correspondent is still blissfully unaware of the mobile's mobility.
Problem 15
Two mobiles could certainly have the same care-of-address in the same visited network. Indeed, if the care-of-address is the address of the foreign agent, then this address would be the same. Once the foreign agent decapsulates the tunneled datagram and determines the address of the mobile, then separate addresses would need to be used to send the datagrams separately to their different destinations (mobiles) within the visited network.
Problem 16
If the MSRN is provided to the HLR, then the value of the MSRN must be updated in the HLR whenever the MSRN changes (e.g., when there is a handoff that requires the MSRN to change). The advantage of having the MSRN in the HLR is that the value can be provided quickly, without querying the VLR. By providing the address of the VLR (rather than the MSRN), there is no need to keep refreshing the MSRN in the HLR.
[AEÜ Int. J. Electron. Commun. 1 (2005) No. 1, 1-6; received 01/2005; running head: Abendroth, Eckel, Killat: TB&LB scheduler. Only page headers, scattered equation symbols, and fragments of the LB-WRR fairness-index result tables survive of this paper.]