Best-case complexity of asynchronous Byzantine consensus
拜占庭容错(BFT)算法应用案例

拜占庭容错(BFT,Byzantine Fault Tolerance)算法是一种用于解决分布式系统中的错误容忍问题的算法。
它能够在存在拜占庭错误(Byzantine failures)的情况下,保证系统的正确性和稳定性。
下面介绍一个拜占庭容错算法的应用案例。
应用场景:智能电网智能电网是一个分布式的系统,由多个智能设备组成,用于控制和调节电网的流量和电压。
在智能电网中,需要保证数据的一致性和可靠性,以避免由于设备故障或人为错误导致的电网中断。
解决方案:拜占庭容错算法采用拜占庭容错算法来设计智能电网的通信协议。
该算法通过使用认证、授权和消息确认等技术,确保在存在拜占庭错误的情况下,系统能够正确地处理消息并保证数据的一致性。
具体实现:1. 认证机制:每个智能设备都需要通过身份认证,确保其合法性。
通过使用公钥加密技术,可以实现设备的身份验证和消息的加密传输。
2. 授权机制:只有授权的智能设备才能发送和接收消息。
授权机制可以防止恶意设备的干扰。
3. 消息确认机制:每个接收到的消息都需要进行确认,以确保消息已经被正确地处理。
如果消息未被正确处理,则发送方会重新发送该消息。
4. 错误检测和处理:在分布式系统中,由于网络延迟和通信故障,消息可能会丢失或重复。
拜占庭容错算法能够检测和处理这些错误,并确保数据的一致性。
优势:1. 提高了系统的稳定性和可靠性,减少了由于错误导致的电网中断。
2. 降低了对人工干预的依赖,提高了系统的自动化程度。
3. 适用于大规模的智能电网系统,具有较高的可扩展性。
结论:通过在智能电网中应用拜占庭容错算法,可以有效地解决分布式系统中的错误容忍问题,提高系统的稳定性和可靠性。
该算法适用于大规模的智能电网系统,具有较高的可扩展性和自动化程度。
随着分布式系统的广泛应用,拜占庭容错算法将成为解决错误容忍问题的重要手段之一。
Google_chubby 分布式锁服务

The Chubby lock service for loosely-coupled distributed systemsMike Burrows,Google Inc.AbstractWe describe our experiences with the Chubby lock ser-vice,which is intended to provide coarse-grained lock-ing as well as reliable(though low-volume)storage for a loosely-coupled distributed system.Chubby provides an interface much like a distributedfile system with ad-visory locks,but the design emphasis is on availability and reliability,as opposed to high performance.Many instances of the service have been used for over a year, with several of them each handling a few tens of thou-sands of clients concurrently.The paper describes the initial design and expected use,compares it with actual use,and explains how the design had to be modified to accommodate the differences.1IntroductionThis paper describes a lock service called Chubby.It is intended for use within a loosely-coupled distributed sys-tem consisting of moderately large numbers of small ma-chines connected by a high-speed network.For example, a Chubby instance(also known as a Chubby cell)might serve ten thousand4-processor machines connected by 1Gbit/s Ethernet.Most Chubby cells are confined to a single data centre or machine room,though we do run at least one Chubby cell whose replicas are separated by thousands of kilometres.The purpose of the lock service is to allow its clients to synchronize their activities and to agree on basic in-formation about their environment.The primary goals included reliability,availability to a moderately large set of clients,and easy-to-understand semantics;through-put and storage capacity were considered secondary. Chubby’s client interface is similar to that of a simplefile system that performs whole-file reads and writes,aug-mented with advisory locks and with notification of var-ious events such asfile modification.We expected Chubby to help developers deal with coarse-grained synchronization within their systems,and in particular to deal with the problem of electing a leader from among a set of otherwise equivalent servers.For example,the Google File System[7]uses a Chubby lock to appoint a GFS master server,and Bigtable[3]uses Chubby in several ways:to elect a master,to allow the master to discover the servers it controls,and to permit clients tofind the master.In addition,both GFS and Bigtable use Chubby as a well-known and available loca-tion to store a small amount of meta-data;in effect they use Chubby as the root of their distributed data struc-tures.Some services use locks to partition work(at a coarse grain)between several servers.Before Chubby was deployed,most distributed sys-tems at Google used ad hoc methods for primary elec-tion(when work could be duplicated without harm),or required operator intervention(when correctness was es-sential).In the former case,Chubby allowed a small sav-ing in computing effort.In the latter case,it achieved a significant improvement in availability in systems that no longer required human intervention on failure. Readers familiar with distributed computing will rec-ognize the election of a primary among peers as an in-stance of the distributed consensus problem,and realize we require a solution using asynchronous communica-tion;this term describes the behaviour of the vast ma-jority of real networks,such as Ethernet or the Internet, which allow packets to be lost,delayed,and reordered. (Practitioners should normally beware of protocols based on models that make stronger assumptions on the en-vironment.)Asynchronous consensus is solved by the Paxos protocol[12,13].The same protocol was used by Oki and Liskov(see their paper on viewstamped replica-tion[19,§4]),an equivalence noted by others[14,§6]. Indeed,all working protocols for asynchronous consen-sus we have so far encountered have Paxos at their core. Paxos maintains safety without timing assumptions,but clocks must be introduced to ensure liveness;this over-comes the impossibility result of Fischer et al.[5,§1]. Building Chubby was an engineering effort required tofill the needs mentioned above;it was not research. We claim no new algorithms or techniques.The purpose of this paper is to describe what we did and why,rather than to advocate it.In the sections that follow,we de-scribe Chubby’s design and implementation,and how ithas changed in the light of experience.We describe un-expected ways in which Chubby has been used,and fea-tures that proved to be mistakes.We omit details that are covered elsewhere in the literature,such as the details of a consensus protocol or an RPC system.2Design2.1RationaleOne might argue that we should have built a library em-bodying Paxos,rather than a library that accesses a cen-tralized lock service,even a highly reliable one.A client Paxos library would depend on no other servers(besides the name service),and would provide a standard frame-work for programmers,assuming their services can be implemented as state machines.Indeed,we provide such a client library that is independent of Chubby. Nevertheless,a lock service has some advantages over a client library.First,our developers sometimes do not plan for high availability in the way one would wish.Of-ten their systems start as prototypes with little load and loose availability guarantees;invariably the code has not been specially structured for use with a consensus proto-col.As the service matures and gains clients,availability becomes more important;replication and primary elec-tion are then added to an existing design.While this could be done with a library that provides distributed consensus,a lock server makes it easier to maintain exist-ing program structure and communication patterns.For example,to elect a master which then writes to an ex-istingfile server requires adding just two statements and one RPC parameter to an existing system:One would acquire a lock to become master,pass an additional inte-ger(the lock acquisition count)with the write RPC,and add an if-statement to thefile server to reject the write if the acquisition count is lower than the current value(to guard against delayed packets).We have found this tech-nique easier than making existing servers participate in a consensus protocol,and especially so if compatibility must be maintained during a transition period. Second,many of our services that elect a primary or that partition data between their components need a mechanism for advertising the results.This suggests that we should allow clients to store and fetch small quanti-ties of data—that is,to read and write smallfiles.This could be done with a name service,but our experience has been that the lock service itself is well-suited for this task,both because this reduces the number of servers on which a client depends,and because the consistency fea-tures of the protocol are shared.Chubby’s success as a name server owes much to its use of consistent client caching,rather than time-based caching.In particular, we found that developers greatly appreciated not having to choose a cache timeout such as the DNS time-to-live value,which if chosen poorly can lead to high DNS load, or long client fail-over times.Third,a lock-based interface is more familiar to our programmers.Both the replicated state machine of Paxos and the critical sections associated with exclusive locks can provide the programmer with the illusion of sequen-tial programming.However,many programmers have come across locks before,and think they know to use them.Ironically,such programmers are usually wrong, especially when they use locks in a distributed system; few consider the effects of independent machine fail-ures on locks in a system with asynchronous communi-cations.Nevertheless,the apparent familiarity of locks overcomes a hurdle in persuading programmers to use a reliable mechanism for distributed decision making. Last,distributed-consensus algorithms use quorums to make decisions,so they use several replicas to achieve high availability.For example,Chubby itself usually has five replicas in each cell,of which three must be run-ning for the cell to be up.In contrast,if a client system uses a lock service,even a single client can obtain a lock and make progress safely.Thus,a lock service reduces the number of servers needed for a reliable client system to make progress.In a loose sense,one can view the lock service as a way of providing a generic electorate that allows a client system to make decisions correctly when less than a majority of its own members are up. One might imagine solving this last problem in a dif-ferent way:by providing a“consensus service”,using a number of servers to provide the“acceptors”in the Paxos protocol.Like a lock service,a consensus service would allow clients to make progress safely even with only one active client process;a similar technique has been used to reduce the number of state machines needed for Byzan-tine fault tolerance[24].However,assuming a consensus service is not used exclusively to provide locks(which reduces it to a lock service),this approach solves none of the other problems described above.These arguments suggest two key design decisions:•We chose a lock service,as opposed to a library or service for consensus,and•we chose to serve small-files to permit elected pri-maries to advertise themselves and their parameters, rather than build and maintain a second service. Some decisions follow from our expected use and from our environment:•A service advertising its primary via a Chubbyfile may have thousands of clients.Therefore,we must allow thousands of clients to observe thisfile,prefer-ably without needing many servers.•Clients and replicas of a replicated service may wish to know when the service’s primary changes.Thissuggests that an event notification mechanism would be useful to avoid polling.•Even if clients need not pollfiles periodically,many will;this is a consequence of supporting many devel-opers.Thus,caching offiles is desirable.•Our developers are confused by non-intuitive caching semantics,so we prefer consistent caching.•To avoid bothfinancial loss and jail time,we provide security mechanisms,including access control.A choice that may surprise some readers is that we do not expect lock use to befine-grained,in which they might be held only for a short duration(seconds or less); instead,we expect coarse-grained use.For example,an application might use a lock to elect a primary,which would then handle all access to that data for a consider-able time,perhaps hours or days.These two styles of use suggest different requirements from a lock server. Coarse-grained locks impose far less load on the lock server.In particular,the lock-acquisition rate is usu-ally only weakly related to the transaction rate of the client applications.Coarse-grained locks are acquired only rarely,so temporary lock server unavailability de-lays clients less.On the other hand,the transfer of a lock from client to client may require costly recovery proce-dures,so one would not wish a fail-over of a lock server to cause locks to be lost.Thus,it is good for coarse-grained locks to survive lock server failures,there is little concern about the overhead of doing so,and such locks allow many clients to be adequately served by a modest number of lock servers with somewhat lower availability. Fine-grained locks lead to different conclusions.Even brief unavailability of the lock server may cause many clients to stall.Performance and the ability to add new servers at will are of great concern because the trans-action rate at the lock service grows with the combined transaction rate of clients.It can be advantageous to re-duce the overhead of locking by not maintaining locks across lock server failure,and the time penalty for drop-ping locks every so often is not severe because locks are held for short periods.(Clients must be prepared to lose locks during network partitions,so the loss of locks on lock server fail-over introduces no new recovery paths.) Chubby is intended to provide only coarse-grained locking.Fortunately,it is straightforward for clients to implement their ownfine-grained locks tailored to their application.An application might partition its locks into groups and use Chubby’s coarse-grained locks to allocate these lock groups to application-specific lock servers. Little state is needed to maintain thesefine-grain locks; the servers need only keep a non-volatile,monotonically-increasing acquisition counter that is rarely updated. Clients can learn of lost locks at unlock time,and if a simplefixed-length lease is used,the protocol can be simple and efficient.The most important benefits of thisclient processes5servers of a Chubby cellclientapplicationchubbylibraryclientapplicationchubbylibrary...mRPCs m mastermmmqIFigure1:System structurescheme are that our client developers become responsible for the provisioning of the servers needed to support their load,yet are relieved of the complexity of implementing consensus themselves.2.2System structureChubby has two main components that communicate via RPC:a server,and a library that client applications link against;see Figure1.All communication between Chubby clients and the servers is mediated by the client library.An optional third component,a proxy server,is discussed in Section3.1.A Chubby cell consists of a small set of servers(typi-callyfive)known as replicas,placed so as to reduce the likelihood of correlated failure(for example,in different racks).The replicas use a distributed consensus protocol to elect a master;the master must obtain votes from a majority of the replicas,plus promises that those replicas will not elect a different master for an interval of a few seconds known as the master lease.The master lease is periodically renewed by the replicas provided the master continues to win a majority of the vote.The replicas maintain copies of a simple database,but only the master initiates reads and writes of this database. All other replicas simply copy updates from the master, sent using the consensus protocol.Clientsfind the master by sending master location requests to the replicas listed in the DNS.Non-master replicas respond to such requests by returning the iden-tity of the master.Once a client has located the master, the client directs all requests to it either until it ceases to respond,or until it indicates that it is no longer the master.Write requests are propagated via the consensus protocol to all replicas;such requests are acknowledged when the write has reached a majority of the replicas in the cell.Read requests are satisfied by the master alone; this is safe provided the master lease has not expired,as no other master can possibly exist.If a master fails,the other replicas run the election protocol when their master leases expire;a new master will typically be elected in a few seconds.For example,two recent elections took6s and4s,but we see values as high as30s(§4.1).If a replica fails and does not recover for a few hours,a simple replacement system selects a fresh machine from a free pool and starts the lock server binary on it.It then updates the DNS tables,replacing the IP address of the failed replica with that of the new one.The current mas-ter polls the DNS periodically and eventually notices the change.It then updates the list of the cell’s members in the cell’s database;this list is kept consistent across all the members via the normal replication protocol.In the meantime,the new replica obtains a recent copy of the database from a combination of backups stored onfile servers and updates from active replicas.Once the new replica has processed a request that the current master is waiting to commit,the replica is permitted to vote in the elections for new master.2.3Files,directories,and handlesChubby exports afile system interface similar to,but simpler than that of UNIX[22].It consists of a strict tree offiles and directories in the usual way,with name components separated by slashes.A typical name is:/ls/foo/wombat/pouchThe ls prefix is common to all Chubby names,and stands for lock service.The second component(foo)is the name of a Chubby cell;it is resolved to one or more Chubby servers via DNS lookup.A special cell name local indicates that the client’s local Chubby cell should be used;this is usually one in the same building and thus the one most likely to be accessible.The remain-der of the name,/wombat/pouch,is interpreted within the named Chubby cell.Again following UNIX,each di-rectory contains a list of childfiles and directories,while eachfile contains a sequence of uninterpreted bytes. Because Chubby’s naming structure resembles afile system,we were able to make it available to applications both with its own specialized API,and via interfaces used by our otherfile systems,such as the Google File System.This significantly reduced the effort needed to write basic browsing and name space manipulation tools, and reduced the need to educate casual Chubby users. The design differs from UNIX in a ways that ease dis-tribution.To allow thefiles in different directories to be served from different Chubby masters,we do not expose operations that can movefiles from one directory to an-other,we do not maintain directory modified times,and we avoid path-dependent permission semantics(that is, access to afile is controlled by the permissions on the file itself rather than on directories on the path leading to thefile).To make it easier to cachefile meta-data,the system does not reveal last-access times.The name space contains onlyfiles and directories, collectively called nodes.Every such node has only one name within its cell;there are no symbolic or hard links.Nodes may be either permanent or ephemeral.Any node may be deleted explicitly,but ephemeral nodes are also deleted if no client has them open(and,for directo-ries,they are empty).Ephemeralfiles are used as tempo-raryfiles,and as indicators to others that a client is alive. Any node can act as an advisory reader/writer lock;these locks are described in more detail in Section2.4.Each node has various meta-data,including three names of access control lists(ACLs)used to control reading,writing and changing the ACL names for the node.Unless overridden,a node inherits the ACL names of its parent directory on creation.ACLs are themselves files located in an ACL directory,which is a well-known part of the cell’s local name space.These ACLfiles con-sist of simple lists of names of principals;readers may be reminded of Plan9’s groups[21].Thus,iffile F’s write ACL name is foo,and the ACL directory contains afile foo that contains an entry bar,then user bar is permit-ted to write ers are authenticated by a mechanism built into the RPC system.Because Chubby’s ACLs are simplyfiles,they are automatically available to other ser-vices that wish to use similar access control mechanisms. The per-node meta-data includes four monotonically-increasing64-bit numbers that allow clients to detect changes easily:•an instance number;greater than the instance number of any previous node with the same name.•a content generation number(files only);this in-creases when thefile’s contents are written.•a lock generation number;this increases when the node’s lock transitions from free to held.•an ACL generation number;this increases when the node’s ACL names are written.Chubby also exposes a64-bitfile-content checksum so clients may tell whetherfiles differ.Clients open nodes to obtain handles that are analo-gous to UNIXfile descriptors.Handles include:•check digits that prevent clients from creating or guessing handles,so full access control checks need be performed only when handles are created(com-pare with UNIX,which checks its permissions bits at open time,but not at each read/write becausefile de-scriptors cannot be forged).•a sequence number that allows a master to tell whethera handle was generated by it or by a previous master.•mode information provided at open time to allow the master to recreate its state if an old handle is presented to a newly restarted master.2.4Locks and sequencersEach Chubbyfile and directory can act as a reader-writer lock:either one client handle may hold the lock in exclu-sive(writer)mode,or any number of client handles mayhold the lock in shared(reader)mode.Like the mutexes known to most programmers,locks are advisory.That is,they conflict only with other attempts to acquire the same lock:holding a lock called F neither is necessary to access thefile F,nor prevents other clients from do-ing so.We rejected mandatory locks,which make locked objects inaccessible to clients not holding their locks:•Chubby locks often protect resources implemented by other services,rather than just thefile associated with the lock.To enforce mandatory locking in a meaning-ful way would have required us to make more exten-sive modification of these services.•We did not wish to force users to shut down appli-cations when they needed to access lockedfiles for debugging or administrative purposes.In a complex system,it is harder to use the approach employed on most personal computers,where administrative soft-ware can break mandatory locks simply by instructing the user to shut down his applications or to reboot.•Our developers perform error checking in the conven-tional way,by writing assertions such as“lock X is held”,so they benefit little from mandatory checks.Buggy or malicious processes have many opportuni-ties to corrupt data when locks are not held,so wefind the extra guards provided by mandatory locking to be of no significant value.In Chubby,acquiring a lock in either mode requires write permission so that an unprivileged reader cannot prevent a writer from making progress.Locking is complex in distributed systems because communication is typically uncertain,and processes may fail independently.Thus,a process holding a lock L may issue a request R,but then fail.Another process may ac-quire L and perform some action before R arrives at its destination.If R later arrives,it may be acted on without the protection of L,and potentially on inconsistent data. The problem of receiving messages out of order has been well studied;solutions include virtual time[11],and vir-tual synchrony[1],which avoids the problem by ensuring that messages are processed in an order consistent with the observations of every participant.It is costly to introduce sequence numbers into all the interactions in an existing complex system.Instead, Chubby provides a means by which sequence numbers can be introduced into only those interactions that make use of locks.At any time,a lock holder may request a se-quencer,an opaque byte-string that describes the state of the lock immediately after acquisition.It contains the name of the lock,the mode in which it was acquired (exclusive or shared),and the lock generation number. The client passes the sequencer to servers(such asfile servers)if it expects the operation to be protected by the lock.The recipient server is expected to test whether the sequencer is still valid and has the appropriate mode;if not,it should reject the request.The validity of a sequencer can be checked against the server’s Chubby cache or,if the server does not wish to maintain a ses-sion with Chubby,against the most recent sequencer that the server has observed.The sequencer mechanism re-quires only the addition of a string to affected messages, and is easily explained to our developers.Although wefind sequencers simple to use,important protocols evolve slowly.Chubby therefore provides an imperfect but easier mechanism to reduce the risk of de-layed or re-ordered requests to servers that do not sup-port sequencers.If a client releases a lock in the normal way,it is immediately available for other clients to claim, as one would expect.However,if a lock becomes free because the holder has failed or become inaccessible, the lock server will prevent other clients from claiming the lock for a period called the lock-delay.Clients may specify any lock-delay up to some bound,currently one minute;this limit prevents a faulty client from making a lock(and thus some resource)unavailable for an arbitrar-ily long time.While imperfect,the lock-delay protects unmodified servers and clients from everyday problems caused by message delays and restarts.2.5EventsChubby clients may subscribe to a range of events when they create a handle.These events are delivered to the client asynchronously via an up-call from the Chubby li-brary.Events include:•file contents modified—often used to monitor the lo-cation of a service advertised via thefile.•child node added,removed,or modified—used to im-plement mirroring(§2.12).(In addition to allowing newfiles to be discovered,returning events for child nodes makes it possible to monitor ephemeralfiles without affecting their reference counts.)•Chubby master failed over—warns clients that other events may have been lost,so data must be rescanned.•a handle(and its lock)has become invalid—this typi-cally suggests a communications problem.•lock acquired—can be used to determine when a pri-mary has been elected.•conflicting lock request from another client—allows the caching of locks.Events are delivered after the corresponding action has taken place.Thus,if a client is informed thatfile contents have changed,it is guaranteed to see the new data(or data that is yet more recent)if it subsequently reads thefile. The last two events mentioned are rarely used,and with hindsight could have been omitted.After primary election for example,clients typically need to commu-nicate with the new primary,rather than simply know that a primary exists;thus,they wait for afile modifi-cation event indicating that the new primary has written its address in afile.The conflicting lock event in theory permits clients to cache data held on other servers,using Chubby locks to maintain cache consistency.A notifi-cation of a conflicting lock request would tell a client to finish using data associated with the lock:it wouldfinish pending operations,flush modifications to a home loca-tion,discard cached data,and release.So far,no one has adopted this style of use.2.6APIClients see a Chubby handle as a pointer to an opaque structure that supports various operations.Handles are created only by Open(),and destroyed with Close(). Open()opens a namedfile or directory to produce a handle,analogous to a UNIXfile descriptor.Only this call takes a node name;all others operate on handles. The name is evaluated relative to an existing directory handle;the library provides a handle on”/”that is always valid.Directory handles avoid the difficulties of using a program-wide current directory in a multi-threaded pro-gram that contains many layers of abstraction[18].The client indicates various options:•how the handle will be used(reading;writing and locking;changing the ACL);the handle is created only if the client has the appropriate permissions.•events that should be delivered(see§2.5).•the lock-delay(§2.4).•whether a newfile or directory should(or must)be created.If afile is created,the caller may supply ini-tial contents and initial ACL names.The return value indicates whether thefile was in fact created.Close()closes an open handle.Further use of the han-dle is not permitted.This call never fails.A related call Poison()causes outstanding and subsequent operations on the handle to fail without closing it;this allows a client to cancel Chubby calls made by other threads without fear of deallocating the memory being accessed by them. The main calls that act on a handle are: GetContentsAndStat()returns both the contents and meta-data of afile.The contents of afile are read atom-ically and in their entirety.We avoided partial reads and writes to discourage largefiles.A related call GetStat() returns just the meta-data,while ReadDir()returns the names and meta-data for the children of a directory. SetContents()writes the contents of afile.Option-ally,the client may provide a content generation number to allow the client to simulate compare-and-swap on a file;the contents are changed only if the generation num-ber is current.The contents of afile are always written atomically and in their entirety.A related call SetACL() performs a similar operation on the ACL names associ-ated with the node.Delete()deletes the node if it has no children. Acquire(),TryAcquire(),Release()acquire and release locks.GetSequencer()returns a sequencer(§2.4)that de-scribes any lock held by this handle.SetSequencer()associates a sequencer with a handle. Subsequent operations on the handle fail if the sequencer is no longer valid.CheckSequencer()checks whether a sequencer is valid(see§2.4).Calls fail if the node has been deleted since the han-dle was created,even if thefile has been subsequently recreated.That is,a handle is associated with an instance of afile,rather than with afile name.Chubby may ap-ply access control checks on any call,but always checks Open()calls(see§2.3).All the calls above take an operation parameter in ad-dition to any others needed by the call itself.The oper-ation parameter holds data and control information that may be associated with any call.In particular,via the operation parameter the client may:•supply a callback to make the call asynchronous,•wait for the completion of such a call,and/or •obtain extended error and diagnostic information. Clients can use this API to perform primary election as follows:All potential primaries open the lockfile and attempt to acquire the lock.One succeeds and becomes the primary,while the others act as replicas.The primary writes its identity into the lockfile with SetContents() so that it can be found by clients and replicas,which read thefile with GetContentsAndStat(),perhaps in response to afile-modification event(§2.5).Ideally, the primary obtains a sequencer with GetSequencer(), which it then passes to servers it communicates with; they should confirm with CheckSequencer()that it is still the primary.A lock-delay may be used with services that cannot check sequencers(§2.4).2.7CachingTo reduce read traffic,Chubby clients cachefile data and node meta-data(includingfile absence)in a consis-tent,write-through cache held in memory.The cache is maintained by a lease mechanism described below,and kept consistent by invalidations sent by the master,which keeps a list of what each client may be caching.The pro-tocol ensures that clients see either a consistent view of Chubby state,or an error.Whenfile data or meta-data is to be changed,the mod-ification is blocked while the master sends invalidations for the data to every client that may have cached it;this mechanism sits on top of KeepAlive RPCs,discussed more fully in the next section.On receipt of an invali-dation,a clientflushes the invalidated state and acknowl-。
基于信誉值的实用拜占庭容错改进算法研究

2022年5月25日第6卷第10期现代信息科技Modern Information TechnologyMay.2022 Vol.6 No.1016基于信誉值的实用拜占庭容错改进算法研究王启河(华北电力大学 控制与计算机工程学院,河北 保定 071003)摘 要:共识问题是区块链中的核心问题,针对联盟链常用的实用拜占庭容错算法(PBFT )中主节点选取随意、网络通信量大、公平性较低等问题,提出一种基于信誉值的PBFT 改进算法。
首先改变信誉值主节点选取方式,然后优化共识流程,节点的累计信誉作为判断达成共识的条件。
达成共识时没有参与共识过程的节点或恶意节点的信誉值降低,降低的信誉值均分给成功参与共识的节点。
经过多次共识后,故障或恶意节点对共识的影响变小,提高了算法的公平性。
关键词:联盟链;实用拜占庭容错算法;共识机制;信誉值;公平性中图分类号:TP311.5文献标识码:A文章编号:2096-4706(2022)10-0016-05Research on Improved Algorithm of Practical Byzantine Fault Tolerance Based onReputation ValueWANG Qihe(School of Control and Computer Engineering, North China Electric Power University, Baoding 071003, China)Abstract: The consensus problem is the core problem in the blockchain. In view of the problems of arbitrary master node selection, large amount of network communication and low fairness of the Practical Byzantine Fault Tolerance (PBFT), which is commonly used in coalition chains, an improved algorithm of PBFT based on reputation value is proposed. Firstly, it changes the selection mode of reputation value master node. Then the consensus process is optimized, and the accumulated reputation of nodes is used as a condition to judge reaching consensus. The reputation value of nodes that do not participate in the consensus process or malicious nodes is reduced when consensus is reached, and the reduced reputation value is equally distributed to the nodes that successfully participate in consensus. After multiple consensus, the impact of the fault or malicious node on the consensus becomes small and the fairness of the algorithm is improved.Keywords: coalition chain; practical Byzantine fault tolerant algorithm; consensus mechanism; reputation value; fairness0 引 言近2008年中本聪[1]提出一种点对点的电子现金系统即比特币,使得区块链底层技术得到了社会广泛关注。
拜占庭容错算法的英文缩写

拜占庭容错算法的英文缩写1算法概述拜占庭容错算法(Byzantine Fault Tolerance,BFT)是一种在分布式计算中保证系统正确性、可靠性的重要算法。
BFT源于拜占庭帝国在战争中发生的故事,其目的是在一些节点出现错误、故障或恶意行为的情况下,仍能够达成一致的决策。
2拜占庭问题拜占庭问题指的是在分布式计算系统中,由于节点之间的通信和计算错误,导致无法达成一致的问题。
该问题最早由莱斯利·兰伯特(Leslie Lamport)等人在1982年提出,后被称为拜占庭问题。
为了解决这个问题,进行了大量的研究,其中最著名的是拜占庭容错算法。
3拜占庭容错算法的优点(1)高可靠性:拜占庭容错算法可以给予系统高可靠性保障,即使在一些节点出现问题或恶意攻击时,也能保证系统的可靠性。
(2)高效性:拜占庭容错算法可以在较短的时间内达成一致的决策,对系统的性能有很大的提升。
(3)灵活性:拜占庭容错算法可以适应不同的系统环境和配置,可以在多种不同的架构中实现。
4拜占庭容错算法的工作原理拜占庭容错算法主要分为两个过程:一致性协议和故障检测协议。
(1)一致性协议一致性协议是指系统中的所有节点要达成一致的决策后,才能进行下一步的操作。
一般通过投票方式来实现,节点根据收到的投票结果进行决策。
常见的一致性协议有Paxos、Raft等,它们的共同特点是可以保证系统的可靠性和正确性,但是在一些拜占庭故障的情况下无法达成一致。
(2)故障检测协议故障检测协议是指在系统中检测节点是否存在故障、错误或者恶意行为,以保证整个系统的稳定性。
常见的故障检测协议有Gossip协议、SWIM协议等。
这些协议会定时地检测节点状态,并通过有效的协议交换信息来达到故障检测的目的。
5BFT的实现在实现BFT算法的过程中,需要解决的核心问题是安全性问题。
安全性主要包括共识(Consensus)和状态机复制(State Machine Replication)。
拜占庭容错——精选推荐

拜占庭容错拜占庭容错 拜占庭将军问题提出后,有很多的算法被提出⽤于解决这个问题。
这类算法统称拜占庭容错算法(BFT: Byzantine Fault Tolerance)。
简略来说,拜占庭容错(BFT)不是某⼀个具体算法,⽽是能够抵抗拜占庭将军问题导致的⼀系列失利的系统特点。
这意味着即使某些节点出现缺点或恶意⾏为,拜占庭容错系统也能够继续运转。
本质上来说,拜占庭容错⽅案就是少数服从多数。
拜占庭容错系统需要达成如下两个指标:安全性:任何已经完成的请求都不会被更改,它可以在以后请求看到。
在区块链系统中,可以理解为,已经⽣成的账本不可篡改,并且可以被节点随时查看。
活性:可以接受并且执⾏⾮拜占庭客户端的请求,不会被任何因素影响⽽导致⾮拜占庭客户端的请求不能执⾏。
在区块链系统中,可以理解为,系统需要持续⽣成区块,为⽤户记账,这主要靠挖矿的激励机制来保证。
拜占庭系统⽬前普遍采⽤的假设条件包括:拜占庭节点的⾏为可以是任意的,拜占庭节点之间可以共谋;节点之间的错误是不相关的;节点之间通过异步⽹络连接,⽹络中的消息可能丢失、乱序、延时到达;服务器之间传递的信息,第三⽅可以知晓 ,但是不能窜改、伪造信息的内容和验证信息的完整性;在BFT共识机制中,⽹络中节点的数量和⾝份必须是提前确定好的。
且每⼀次节点的进出都需要对⽹络进⾏初始化,故其⽆法像PoW共识机制那样任何⼈都可以随时加⼊/退出挖矿。
另外,由于节点间基于消息传递达成共识,因此采⽤BFT算法的⽹络⽆法承载⼤量的节点,业内普遍认为100个节点是BFT算法的上限。
所以BFT算法⽆法直接⽤于公有链,⽽更多的应⽤于私有链和联盟链。
业内⼤名⿍⿍的联盟链Hyperledger fabric v0.6采⽤的是PBFT,v1.0⼜推出PBFT的改进版本SBFT。
后续⼜有相当多的⼈对其进⾏了改进,⼒求提⾼其扩展性。
但往往都是基于对⽹络环境的理想假设,以省去部分共识阶段,实现更⾼的节点承载量。
拜占庭容错算法在生活中的例子

拜占庭容错算法在生活中的例子拜占庭容错算法在生活中的例子:拜占庭容错,全称Byzantine Fault Tolerance,简称BFT。
拜占庭错误即为叛徒的存在,也就是恶意节点,它为了阻挠真实信息的传递以及有效一致的达成,会向各个节点发送前后不一致的信息。
能够处理拜占庭错误的这种容错性,就叫做拜占庭容错。
拜占庭容错共识算法,就是假设区块链网络环境包括运行正常的服务器、故障的服务器和破坏者的服务器情况下,如何在正常的节点间形成对网络状态的共识。
战争开始了,分散在不同地方的将军们之间互相传递信息。
每一个收到命令的将军都要去询问附近其他人,他们收到的命令是什么。
每个命令的执行都需要节点间两两交互去核验消息,这产生了比较高的通信代价。
PBFT,本质上就是利用通信次数换取可靠性,它具有速度快、支持高并发、可扩展的特点。
FBFT算法工作流程FBFT是一种状态机副本复制算法,1个主节点和其它副本节点。
每一个副本节点在收到来自主节点的预准备消息之后,都要检查消息的正确性,然后发送准备消息给除了自己以外的其他所有副本节点。
同时它也会收到其他副本节点发来的准备消息。
在收到消息后,副本节点对其他节点的准备消息进行验证,如果正确就将准备消息写入消息日志,集齐规定数量的准备消息之后,它就进入准备状态。
副本节点进入准备状态后,在全网范围内广播commit消息,当副本节点集齐规定数量个验证过的commit消息后,就表示请求处理完毕,当前网络中的大部分节点已经达成共识,于是发送处理结果给客户端,运行客户端的请求。
举例:当节点数大于等于4个的时候,1个无效节点的存在并不会影响消息的传递。
当存在n个无效节点时,只要总节点数超过3n个,消息传递的正确性就能得到保证,这也是拜占庭算法的容错率。
拜占庭容错算法范文
拜占庭容错算法范文拜占庭容错算法(Byzantine Fault Tolerance,简称BFT)是一种用于解决分布式系统中存在故障节点的问题的算法。
它起源于20世纪80年代中期的拜占庭将军问题,该问题模拟了一个拜占庭帝国中的将军们需要达成一致的决策,但其中存在一些可能是叛徒的将军。
拜占庭容错算法的目标是确保在存在故障节点或恶意节点的情况下,分布式系统仍能维持正常运行。
拜占庭容错算法在分布式系统中具有重要的应用价值,尤其是在需要保证系统可靠性和安全性的关键领域,如金融交易、电子支付、航空航天等。
它通过在节点之间进行消息传递和共识机制来实现容错性,确保节点之间的一致性和可信性。
拜占庭容错算法的核心思想是通过选举一个领导节点来达成共识。
领导节点负责收集所有节点的决策,并根据接收到的决策进行最终的决策。
为了保证数据的正确性,拜占庭容错算法采用了消息签名和验证的机制,确保消息的完整性和身份的可信性。
在拜占庭容错算法中,故障节点的存在可能导致系统的不一致性或错误的决策。
为了解决这个问题,算法要求至少2/3的节点必须是正确的,并且能够相互达成一致的决策。
在每个轮次中,节点将自己的决策发送给其他节点,并根据接收到的决策进行投票。
当接收到的投票超过2/3时,节点将接受该决策并且广播给其他节点。
最终,所有节点都能达成一致的决策,并且保证算法的安全性和可靠性。
在实际应用中,拜占庭容错算法还需要考虑一些其他因素,如节点的可信度、网络延迟和通信错误等。
为了提高系统的容错性和性能,还可以采用一些优化方法,如选择合适的领导节点、使用快速消息传递协议和引入超级节点等。
总而言之,拜占庭容错算法是解决分布式系统中故障节点问题的一种重要算法。
它通过选举领导节点、消息签名和投票机制来确保节点之间的一致性和可信性。
虽然算法本身复杂且耗时,但是它在保障系统可靠性和安全性方面具有重要的作用,值得进一步研究和应用。
1.毕设外文翻译(原版)
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. AC-19, NO.
5,
OCTOBER
1974
H. P. Geering, “ O p t k l control theory for nonscalar-valued [22] L. A. Zadeh, “Optimality and non-scalar-valued performance performance criteria, Ph.D. dissertation, Dept. Elec. Eng., criteria,” I E E E Trans. Automat. Contr. (Corresp.), vol. AC-8, Mass. Inst. Technol., Cambridge, Aug. 1971 (available in the pp. 59-60, 1963. form of Microfiche Nr. AD 731213, NTIS, U. S. Dept. of Comm.). theory for H. P. Geering and M. Athans,“Optimalcontrol non-scalar-valued performance criteria,” in Proc. 5th Ann. Princeton Conf. Information Sciences and Systems, Princeton, N.J., Mar. 1971. H. Halkin, “On the necessary condition for optimal control of non-linear systems,” J . Analyse Mathhatique, vol. 12, pp. IHans P. Geering (S’7&M’71) was bornon 82, 1964. June 7, 1942. H e received the degree of H. Halkin, “Topological aspects of optimal control of dynamicalpolysystems,” Contrib. Differential Equations, vol. 3, pp. Diplomierter Elektroingenieur from the Eid377-385. 1964. genoesische Technische Hochschulein Zurich, E. B. Lee and L. Markus, Foundations of Optimal Control Switzerland, in 1966, and the M.S. and Ph.D. Theory. New York: Wiley, 1967. degrees from the Massachusetts Institute of L. W.-Neustadt, “A general theory of extremals,” J . Comput. Technology, Cambridge,in 1969 and 1971, Sci. Syst., V O ~ .3, pp. 57-92, 1969. respectively. C. Olech, “Existence theorems for optimal problems with vecIn 1967 and 1968, he worked with Sprecher tor valued cost function,” Center of Dynamical Systems, & Schuh AG in Suhr, Switzerland, and Brown University, Providence, R.I., Tech. Rep. 67-6, 1967. Oerlikon-Buehrle AG in Zurich, Switzerland, K. Ritter, “Optimization theory in linear spaces-Part I,” Math. Annul., vol. 182, pp. 189-206, 1969. From September, 1968 to August, 1971 he was a Research Assistant , “Optimization theory in linear spaces-Part 11,” Math. in the M.I.T. Electronic Systems Laboratory and a Teaching AsAnnal., vol. 183, pp. 169-180, 1969. sistant in the Department of Electrical Engineering at M.I.T. He is , “Optimization theory in linear spaces-Part 111,”Math. presently with Oerlikon-Buehrle AG. His research interests are in Annul., vol. 184, pp. 133-154, 1970. estimation and control theory. R.. T. ltockafellar, Con.vexAnalysis. Princeton, N.J.: Princeton Dr. Geel.ing is a member of Schweizerische Gesellschaft fur AutoUniv. Press, 1970. E. Tse. “On the oDtimal control of linear svstems with in- matik (IFAC) and Schweizerischer Elektrotechnischer Verein. completeinformati&,” Ph.D. dissertation, U*ep. Elec. Eng., Mass. Inst. Technol., Cambridge, Nov., 1969. [20] B. Z. Vulikh, Introduction to the Theory of Partially Ordered Spaces. Groningen, The Netherlands: Wolters-Noordhoff Sc., 1967. [21] H. S. Witsenhausen, “Minimax control of uncertain system,” Electronic Systems Lab., Mass. Inst. Technol., Cambridge, Michael Athans (S’58-M%-SM’69-F’73) for a photograph and Rep. ESGIt-269, 1966. biography see page 30 of February the issue of TRANSACTIONS. this
2012-57-9-TAC-采样一致+二阶积分器+非一致时变时延
A Sufficient Condition for Convergence of Sampled-DataConsensus for Double-Integrator Dynamics With Nonuniform and Time-Varying Communication Delays Jiahu Qin,Student Member,IEEE,andHuijun Gao,Senior Member,IEEEAbstract—This technical note investigates a discrete-time second-order consensus algorithm for networks of agents with nonuniform and time-varying communication delays under dynamically changing communica-tion topologies in a sampled-data setting.Some new proof techniques are proposed to perform the convergence analysis.It isfinally shown that under certain assumptions upon the velocity damping gain and the sampling pe-riod,consensus is achieved for arbitrary bounded time-varying commu-nication delays if the union of the associated digraphs of the interaction matrices in the presence of delays has a directed spanning tree frequently enough.Index Terms—Double-integrator agents,sampled-data consensus,span-ning tree,time-varying communication delays.I.I NTRODUCTIONIn recent years,consensus problems for agents with single-integrator dynamics have been studied from various perspectives(see,e.g.,[4], [7],[10],[11],[14],[16],[17],[26]).Taking into account that double-integrator dynamics can be used to model more complicated systems in reality,cooperative control for multiple agents with double-integrator dynamics has been studied extensively recently,see[12],[18]–[20], [23],[28]for continuous algorithms and[1]–[3],[5],[6],[8],[13]for discrete-time algorithms.In[8],a sampled-data algorithm is studied for double-integrator dy-namics through a Lyapunov-based approach.The analysis in[8]is lim-ited to an undirected network topology and cannot be extended to deal with the directed case.However,the informationflow might be directed in practical applications.In a similar sampled-data setting,[1]studies two sampled-data consensus algorithms,i.e.,the case with an absolute velocity damping term and the case with a relative velocity damping term,in the context of a directed network topology by extensively using matrix spectral analysis.Reference[2]extends the algorithms in[1]to deal with a dynamic directed network topology.References[5]and[6] mainly investigate sampled-data consensus for the case with a relative velocity damping term under a dynamic network topology.In[5],the network topologies are required to be both balanced and strongly con-nected at each sampling instant.On the other hand,considering that it might be difficult to measure the velocity information in practice,[6] Manuscript received November17,2009;revised September15,2010; August15,2011,and January24,2012;accepted January25,2012.Date of publication February17,2012;date of current version August24,2012.This work was supported in part by the National Natural Science Foundation of China under Grants60825303,60834003,and61021002,by the973Project (2009CB320600),and by the Foundation for the Author of National Excellent Doctoral Dissertation of China(2007B4).Recommended by Associate Editor H.Ito.J.Qin is with Harbin Institute of Technology,Harbin,China,and also with the Australian National University,Canberra,A.C.T.,Australia(e-mail:jiahu. qin@.au).H.Gao is with the Research Institute of Intelligent Control and Systems, Harbin Institute of Technology,Harbin150001,China(e-mail:hjgao@. cn).Color versions of one or more of thefigures in this paper are available online at .Digital Object Identifier10.1109/TAC.2012.2188425proposes a consensus strategy using the measurements of the relative positions between neighboring agents to estimate the relative velocities. In[13],consensus problems of second-order multi-agent systems with nonuniform time delays and dynamically changing topologies is investigated.However,the paper considers a discrete-time model es-timated by using the forward difference approximation method rather than a sampled-data model.In general,a sampled-data model is more realistic.Also,in[13],the weighting factors must be chosen from a finite set.With this background,we study the convergence of sam-pled-data consensus for double-integrator dynamics under dynamically changing topologies and allow the communication delays to be not only different but also time varying.Here,considering the weighting factors of directed edges between neighboring agents usually represent confi-dence or reliability of the transmitted information,it is more natural to consider choosing the weighting factors from an infinite set,which is more general than thefinite set case in[2]and[13].Moreover,dif-ferent from that in[13],A(k),the interaction matrix in the presence of delays at time t=kT,is introduced in this technical note and the dif-ference between A(k)and A(k),the adjacency matrix at time t=kT, is further explored as well.The reason for introducing A(k)is that it is more relevant than A(k)to the strategies investigated in this technical note.It is worth pointing out that the method employed to perform the convergence analysis is totally different from most of the existing liter-ature which heavily relies on analyzing the system matrix by spectral analysis.By using the similar transformation as that used in[13],we can treat the sampled-data consensus for double-integrator dynamics as the consensus for multiple agents modeled byfirst-integrator dynamics. Then,in order to make the transformed system dynamics mathemati-cally tractable,a new graphic method is proposed to specify the rela-tions between0(A(k)),the associated digraph of the interaction matrix in the presence of delays,and the the associated digraph of the trans-formed system matrix.Finally,motivated by the work in[22,Theorem 2.33]and[27],by employing the product properties of row-stochastic matrices from an infinite set,we present a sufficient condition in terms of the associated digraph of the interaction matrix in the presence of delays for the agents to reach consensus.Note here that the proving techniques employed in this technical note can be extended directly to derive similar results by considering the discrete-time model in[13]. The rest of the technical note is organized as follows.In Section II, we formulate the problem to be investigated and also provide some graph theory notations,while the convergence analysis is given in Section III.In Section IV,a numerical example is provided to show the effectiveness of the new result.Finally,some concluding remarks are drawn in Section V.II.B ACKGROUND AND P RELIMINARIESA.NotationsLet I n2n2n and0n;n2n2n denote,respectively,the identity matrix and the zero matrix,and1m2m be the column vector of all ones.Letand+denote,respectively,the set of nonnegative and positive integers.Given any matrix A=[a ij]2n2n,let diag(A) denote the diagonal matrix associated with A with the ith diagonal element equal to a ii.Hereafter,matrices are assumed to be compatible for algebraic operations if their dimensions are not explicitly stated.A matrix M2n2n is nonnegative,denoted as M 0,if all its entries are nonnegative.Let N2n2n.We write M N if M0N 0.A nonnegative matrix M is said to be row stochastic if all its row sums are1.Let k i=1M i=M k M k01111M1denote the left product of the matrices M k;M k01;111;M1.A row-stochastic matrix M is ergodic0018-9286/$31.00©2012IEEE(or indecomposable and aperiodic )if there exists a column vector f2nsuch that lim k !1M k =1n f T .B.Graph Theory NotationsLet G =(V ;E ;A )be a weighted digraph of order n with a finite nonempty set of nodes V =f 1;2;...;n g ,a set of edges E V 2V ,and a weighted adjacency matrix A =[a ij ]2n 2n with nonnegative adjacency elements a ij .An edge of G is denoted by (i;j ),meaning that there is a communication channel from agent i to agent j .The adjacency elements associated with the edges are positive,i.e.,(j;i )2E ,a ij >0.Moreover,we assume a ii =0for all i 2V .The set of neighbors of node i is denoted by N i =f j 2V :(j;i )2Eg .Denote by L =[l ij ]the Laplacian matrix associated with G ,where l ij =0a ij ,i =j ,and l ii=n k =1;k =i a ik .A directed path is a sequence of edges in a digraph of the form (i 1;i 2);(i 2;i 3);....A digraph has a directed spanning tree if there exists at least one node,called the root node,having a directed path to all the other nodes.A spanning subgraph G s of a directed graph G is a directed graph such that the node set V (G s )=V (G )and the edge set E (G s ) E (G ).Given a nonnegative matrix S =[s ij ]2n 2n ,the associated di-graph of S ,denoted by 0(S ),is the directed graph with the node set V =f 1;2;...;n g such that there is an edge in 0(S )from j to i if and only if s ij >0.Note that for arbitrary nonnegative matrices M;N2p 2p satisfying M N ,where >0,if 0(N )has a di-rected spanning tree,then 0(M )also has a directed spanning tree.C.Sampled-Data Consensus Algorithm for Double-Integrator DynamicsEach agent is regarded as a node in a digraph G of order n .Let T >0denote the sampling period and k2denote the discrete-time index.For notational simplicity,the sampling period T will be dropped in the sequel when it is clear from the context.We consider the following sampled-data discrete-time system which has been investigated in [1],[2],and [8]asr i (k +1)0r i (k )=T v i (k )+12T 2u i (k )v i (k +1)0v i (k )=T u i (k )(1)where x i (k )2p ,v i (k )2p and u i (k )2p are,respectively,the position,velocity and control input of agent i at time t =kT .For simplicity,we assume p =1.However,all results still hold for any p2+by introducing the notation of Kronecker product.In this technical note,we mainly consider the following discrete-time second-order consensus algorithm which takes into account the nonuniform and time-varying communication delays as u i (k )=0 v i (k )+j 2N (k )ij (k )(r j (k 0 ij (k ))0r i (k ))(2)where >0denotes the absolute velocity damping gain,N i (k )de-notes the neighbor set of agent i at time t =kT that varies with G (k )(i.e.,the dynamic communication topology at time t =kT ), ij (k )>0if agent i can receive the delayed position r j (k 0 ij (k ))from agent j at time t =kT while ij (k )=0otherwise,and 0 ij (k ) max ,where ij (k )2,is the communication delay from agent j to agent i .Here,we assume ii (t ) 0,that is,the time delays affect only the in-formation that is transmitted from one agent to another.Moreover,we assume that all the nonzero and hence positive weighting factors areboth uniformly lower and upper bounded,i.e., ij (k )2[ ;],where 0< < ,if j 2N i (k ).Remark 1:In general,(j;i )2E (G (k ))or a ij (k )>0,which cor-responds to an available communication channel from agent j to agent i at time t =kT ,does not imply ij (k )>0even if the reverse is true.This is mainly because the communication topologies are dynamicallychanging and the communication delays are time varying,which may destroy the continuity of information.Note that ij (k )>0requires a ij >0for the whole time between k 0 ij (k )and k .DefineA (k )= 11(k )111 1n (k )......... n 1(k )111 nn (k):To distinguish A (k )from the adjacency matrix A (k )at time t =kT ,we call A (k )the interaction matrix in the presence of delays to em-phasize that A (k )is closely related to not only the available commu-nication channel but also the information transmission in the presence of delays.Let L (k )be L (k )=D (k )0A (k ),where D (k )is a diag-onal matrix with the i th diagonal entrybeing n j =1;j =i ij (k ).In fact,0(A (k )),the associated digraph of A (k ),is a spanning subgraph of the communication topology G (k )at time t =kT .To illustrate,consider a team of n =3agents.The possible communication topologies are modeled by the digraph as shown in Fig.1.Assume the communica-tion delays 21(k )and 32(k ),k2,are all larger than 1T ,while the communication topology switches periodically between Ga and Gb at each sampling instant.Clearly,A (k )=03;3at each sampling instant.However,in the special case that there is no communication delay be-tween neighboring agents,0(A (k ))=G (k ).In the case that both the communication topology and the communication delays are time in-variant,0(A (k ))=G (k )after max time steps.We say that consensus is reached for algorithm (2)if for any initial position and velocity states,and any i;j 2Vlim k !1r i (k )=lim k !1r j (k )and lim k !1v i (k )=0:It is assumed that r i (k )=r i (0)and v i (k )=v i (0)for any k <0and i;j 2V .III.M AIN R ESULTSDenote G=f G 1;G 2;...;G m g as the finite set of all possible com-munication topologies for all the n agents.In the sequel,when we men-tion the union of a group of digraphs f G i ;...;G i g G,we mean a digraph with the node set V =f 1;2;...;n g and the edge set given by the union of the edge sets of G i ,j =1;...;k .Firstly,we perform the following model transformation,which helps us deal with the consensus problem for an equivalent trans-formed discrete-time system.Denote r (k )=[r 1(k );111;r n (k )]T ,v (k )=[v 1(k );111;v n (k )]T ,x (k )=(2= )v (k )+r (k ),andy (k )=[r (k )T x (k )T ]T.Then,applying algorithm (2)and by some manipulation,(1)can be written in a matrix form asy (k +1)=40(k )y (k )+`=14`(k )y (k 0`)(3)where we get the equation shown at the bottom of the next page,and 4`(k )=T2A `(k )0n;n2T +12T 2A `(k )0n;n;`=1;2;...; max :Here in 4p (k ),p =0;1;...; max ,the ij th element of A p (k )is either equal to ij (k )if ij (k )=p ,or equal to 0otherwise and L (k )is the Laplacian matrix of the digraph of A (k ).1ObviouslyA 0(k )+A 1(k )+111+A(k )=A (k ):The following lemma will allow us to perform the convergence anal-ysis by using the product properties of row-stochastic matrices.1NoteL (k )is different from the Laplacian matrix of the communicationtopology G(k).Fig.1.Two possible communication topologies for the three agents.Lemma 1:Let d (k )be the largest diagonal element of the Lapla-cian matrix L (k ),i.e.,d (k )=max if n j =1;j =i ij (k )g .If the ve-locity damping gain and the sampling period T satisfy the following condition:4 T 0 T >2and T 01 2T d (k )(4)then 4(k )=40(k )+41(k )+111+4(k );k2+,is a row-stochastic matrix with positive diagonal elements.Proof:It follows from A 0(k )+A 1(k )+111+A(k )=A (k )=diag L (k )0L (k )that4(k )=40(k )+41(k )+111+4(k )=411(k )412(k )421(k )422(k )(5)where 411(k )=(10( =2)T +( 2=4)T 2)I n 0(T 2=2)L (k ),412(k )=(( =2)T 0( 2=4)T 2)I n ,421(k )=(( =2)T +( 2=4)T 2)I n 0((2= )T +(1=2)T 2)L (k )422(k )=(10( =2)T 0( 2=4)T 2)I n .One can easily check from (4)that all the matrices 411(k ),412(k ),421(k ),and 422(k )are nonnegative with positive di-agonal elements.That is,4(k )is a nonnegative with positive diagonal elements.Finally,it follows straightforwardly from L (k )1n =1n that 4(k )is a row-stochastic matrix.Remark 2:By some manipulation,we can get that (4)is equivalent to the following condition:1+1+8T 2d (k )2T <p 501:(6)This is achieved by solving ( T )2+2 T 04<0and T 20 02T d (k ) 0,which can be considered the quadratic inequalities in T and ,respectively.In the sequel,4(k )will be used to denote the row-stochastic matrix as described in Lemma 1.In order to make the transformed system dynamics mathematically tractable in terms of 0(A (k )),the associated digraph of the interaction matrix in the presence of delays,we need to explore the relations be-tween 0(A (k ))and the associated digraph of the transformed system matrix 0(4(k )).To this end,a new graphic method is proposed as follows.Lemma 2:Given any digraph G (V ;E ).Let G 1(V 1;E 1)be a graph with n nodes and an empty edge set,that is,V 1=f n +1;n +2;...;2n g and E 1=.Let ~G(~V ;~E )be a digraph satisfying the fol-lowing conditions:(A)~V=V [V 1=f 1;...;n;n +1;...;2n g ;(B)there is an edge from node n +i to node i ,i.e.,(n +i;i )2~",for any i 2V ;(C)if (j;i )2E ,then (j;n +i )2~Efor any i;j 2V ;i =j .Then,G has a directed spanning tree if and only if ~Ghas a directed spanning tree.Proof:Necessity:Denote G s as a directed spanning tree of the digraph G .Assume,without loss of generality,`is the root node of G s .By rules (B )and (C ),split each edge (i;j )in G s into edges (i;n +j );(n +j;j )and add edge (n +`;`)for the root node `,then we canget a directed spanning tree for ~G.Sufficiency:Let ~Gs be a directed spanning tree of ~G .Note that by the definition of ~G,the digraph G can be obtained by contracting all the edges (n +i;i );i 2V in the digraph ~G.Thus,the operation of the edge contraction on ~Gs will result in a directed spanning tree,say G s ,of the digraph G .Based on the above lemma,now we have the following result.Lemma 3:Suppose that and T satisfy the inequality in (4).Let f z 1;z 2;...;z q g be any finite subsetof +.If the union of the digraphs 0(A (z 1));0(A (z 2));...;0(A (z q ))has a directed spanning tree,then the union of digraphs 0(4(z 1));0(4(z 2));...;0(4(z q ))also has a directed spanning tree.Proof:The union of the digraphs 0(4(z 1));0(4(z 2));...;0(4(z q ))hereby is exactly the digraph0(q l =14(z l )).Because and T satisfy (4),it follows that 4(z l ),l =1;2;...;q ,is a row-stochastic (and hence nonnegative)matrix with positive diagonal entries.Note that L (z l )=diag L (z l )0A (z l ).By observing the equation in (5),we get that there exists a positive number ,say =min f q (( =2)T 0( 2=4)T 2);(2= )T +(1=2)T 2g ,such that we get (7),as shown at the bottom of the page.It thus follows from ~M 12=I n that (n +i;i )20(q l =14(z l ))for any i 2V .On the other hand,~M 21=q l =1A (z l )implies that(j;i )20(q l =1A (z l ))if and only if (j;n +i )20(ql =14(z l ))for any i;j 2V ;i =j .Combining these arguments,we knowthat the digraphs0(q l =14(z l ))and0(ql =1A (z l ))correspondto the digraphs ~G and G ,respectively,as described in Lemma 2.Note that the digraph0(q l =1A (z l ))is just the union of digraphs 0(A (z 1));0(A (z 2));...;0(A (z q )).It then follows from Lemma 2that the digraph0(q l =14(z l ))has a directed spanning tree,which proves the Lemma.Let P be the set of all n by n row-stochastic matrices.Given any row-stochastic matrix P =[p ij ]2P ,define (P )=10mini;j k min f p ik ;p jk g [25].Lemma 4: (1)is continuous on P .40(k )=102T +4T2I n 0T2(diag L (k )0A 0(k))2T 04T2In2T +4T2I n 02T +12T 2(diag L (k )0A 0(k))102T 04T2I nql =14(z l )q2T 04T2I n2T +12T 2diag q l =1L (z l )0q l =1L (z l )0Inql =1A (z l )0= ~M 11~M12~M 21~M22:(7)Proof:2:P can be viewed as a subset of metricspace n .All the functions involved in the definition of (1)are continuous,and since the operations involved are sums and mins,it readily follows that (1)is continuouson n .The restriction of a continuous function is con-tinuous,so (1)is also continuous on P .Two nonnegative matrices M and N are said to be of the same type,denoted by M N ,if they have zero elements and positive elements in the same places.To derive the main result,we need the fol-lowing classical results regarding the infinite product of row-stochastic matrices.Lemma 5:([25])Let M =f M 1;M 2;...;M q g be a finite set of n 2n ergodic matrices with the property that for each se-quence M i ;M i ;...;M i of positive length,the matrix productM i M i111M i is ergodic.Then,for each infinite sequence M i ;M i ;...there exists a column vector c2n such thatlim j !1M i M i111M i =1c T :(8)In addition,when M is an infinite set, (W )<1,where W =S k S k 111Sk,S k 2M ,j =1;2;...;N (n )+1,and N (n )(which may depend on n )is the number of different types of all n 2n ergodic matrices.Furthermore,if there exists a constant 0 d <1satisfying (W ) d ,then (8)still holds.Let d=(n 01) .Assume,in the sequel,that ;T satisfy (4= T )0 T >2and T 01 (2= )T d.Then,by Lemma 1,all possible 4(k )must be nonnegative with positive diagonal elements.In addition,since the set of all 2n ( max +1)22n ( max +1)matrices can be viewed as the metricspace [2n (+1)],for each fixed pair ;T ,all possible 4(k )compose a compact set,denoted by 7( ;T ).This is because all the nonzero and hence positive entries of 4(k )are both uniformly lower and upper bounded,which can be seen by observing the form of 4(k )in (5).Let 3(A )=f B =[b ij ]22n 22n :b ij =a ij or b ij =0;i;j =1;2;...;2n g ,and denote by 5( ;T )the set of matricesM (40;41;...;4)=40411114014I 2n 0111000I 2n 11100 0111I 2nsuch that 40;41;...;423(4(k ))and 40+41+...+4=4(k ),where 4(k )27( ;T ).The set 5( ;T )is compact,since givenany 4(k )27( ;T ),all possible choices of 40;41;...;4are finite.Let (k )=[ 1(k ); 2(k );111; 2n (+1)(k )]T =[y T (k );y T (k 01);111;y T (k 0 max )]T22n (+1).Then,there exists a matrix M (40(k );41(k );...;4(k ))25( ;T )such that system (3)is rewritten as(k +1)=M (40(k );41(k );...;4(k )) (k ):(9)Clearly,the set 5( ;T )includes all possible system matrices of system (9).2Weare indebted to Associate Editor,Prof.Jorge Cortes,for his help with a simpler proof of this lemma.Given any positive integer K,define ~5(;T )=i =1M (4i 0;4i 1;...;4i):M (1)25( ;T )and there exists a integer ;1 K suchthat the union of digraphsj =04ij ;i =1;...; ;has a directed spanningtree :~5(;T )is also a compact set,which can be derived by noticing the following facts:1)5( ;T )is a compact set;2)all possible choices of are finite since is bounded by K;3)all possible choices of the directed spanning trees are finite;and 4)given the directed spanning tree and ,the followingset:i =1M (4i 0;4i 1;...;4i):M (1)25( ;T )and the union of the digraphsj =04ij;i =1;...; ;hasthe speci ed directed spanningtreeis compact (this can be proved by following the similar proof of [27,Lemma 10]).Note that the set ~5(;T )includes all possible products of ; K ,consecutive system matrices of system (9).The following lemma is presented to prove that all the possible prod-ucts of consecutive system matrices of system (9)satisfy the result as stated in Lemma 5,which in turn allow us to use the properties of in-finite products of row-stochastic matrices from an infinite set to derive our main result.Lemma 6:If 81;...;8k 2~5(;T ),where k =N (2n ( max +1))+1,then there exists a constant 0 d <1such that(k i =18i ) d .Proof:We first prove that for any 82~5(;T );8is an er-godic matrix.According to the definition of ~5(;T ),there exist pos-itive integer (1 K),M (4i 0;4i 1;...;4i )25( ;T ),i =1;...; ,such that 8= i =1M (4i 0;4i 1;...;4i)and the union of digraphs0(j =04ij ),i =1;...; ,has a directed span-ning tree.Since M (4i 0;4i 1;...;4i )25( ;T ),j =04ij must be nonnegative matrices with positive diagonal elements.Furthermore,there exists a positive number 1such that diag(j =04ij ) I 2n ,for any M (4i 0;4i 1;...;4i )25( ;T ).Specifically,by observing (5),we can choose as=min 1;10 2T + 24T20T 22(n 01) ;10 2T 0 24T2:Combining this with the condition that the union of digraphs0(j =04ij ),i =1;...; ,has a directed spanning tree,we can prove that matrix 8is ergodic by following the proof of [26,Lemma 7].Letd =max 82~5(;T )ki =18i :From Lemma 5,we know that(k i =18i )<1.This,together withthe fact that ~5( ;T )is a compact set and (1)is continuous (Lemma4),implies d must exist and 0 d <1,which therefore completing the proof.For notational simplicity,we shall denote M (40(k );41(k );...;4(k ))by M (k )if it is self-evident from the context.Based on the preceding work,now we can present our main result as follows.Theorem 1:Assume that and T satisfy (4= T )0 T >2andT 01 (2= )T d.Then,employing algorithm (2),consensus is reached for all the agents if there exists an infinite sequence of con-tiguous,nonempty,uniformly bounded time intervals [k j ;k j +1),j =1;2;...,starting at k 1=0,with the property that the union of the di-graphs 0(A (k j ));0(A (k j +1));...;0(A (k j +101))has a directed spanning tree.Proof:We first prove that consensus can be reached for system (9)using algorithm (2).Let 8(k;k )=I 2n (+1),k 0,and 8(k;l )=M (k 01)111M (l +1)M (l ),k >l 0.Assume,without loss of generality,that the lengths of all the time intervals [k j ;k j +1),j =1;2;...,are bounded by K.It follows from Lemma 3and the condition that the union of the digraphs 0(A (k j ));0(A (k j +1));...;0(A (k j +101))has a directed spanning tree that the union of the digraphs 0(4(k j ));0(4(k j +1));...;0(4(k j +101))also has a directed spanning tree for each j2+,which,together with the proof ofLemma 6,implies that 8(k j +1;k j )=k 01k =k M (k )2~5(;T ).Since 8(k j ;0)=8(k j ;k j 01)8(k j 01;k j 02)1118(k 2;k 1),it then follows from Lemma 5and Lemma 6thatlim j !18(k j ;0)=12n (+1)wT(10)where w22n (+1)and w 0.For each m >0,let k l be the largest nonnegative integer such that k l m .Note that matrix 8(m;k l )is row stochastic,thus we have8(m;0)012n w T =8(m;k l)8(k l ;0)012n wT :The matrix 8(m;k l )is bounded because it is the product of fi-nite matrices which come from a bounded set ~5(;T ).By using (10),we immediately have lim m !18(m;0)=12n (+1)w T .Combining this with the fact that (m )=8(m;0) (0)yields lim m !1 (m )=(w T (0))12n (+1)which,in turn,implies lim m !1x (m )=(w T (0))1n and lim k !1v (m )=0,and there-fore completing the proof.Remark 3:Matrix A (k )is a somewhat complex object to study compared with the adjacency matrix A (k )(see Remark 1).It is worth noting that more general results in which the sufficient conditions for guaranteeing the final consensus are presented in terms of G (k )instead of the interaction matrix in the presence of delays can be provided if some additional conditions are imposed.For example,if in addition to the conditions on and T as that required in Theorem 1,it is further required that a certain communication topology which takes effect at some time will last for at least max +1time steps,then we can get that consensus can be reached if there exists an infinite sequence of contiguous,nonempty,uniformly bounded time intervals [k j ;k j +1),j =1;2;...,starting at k 1=0,with the property that the union of the digraphs G (k j );G (k j +1);...;G (k j +101)has a directed spanning tree.This can be observed by reconstructing a new sequence of con-tiguous,nonempty and uniformly bounded time intervals which satis-fies the condition in Theorem 1by using similar technique as that in in [26,Theor.3].IV .I LLUSTRATIVE E XAMPLEConsider a group of n =6agents interacting between the possible digraphs f Ga;Gb;Gc g (see Fig.2),all of which have 0–0.2weights.Fig.2.Digraphs which model all the possible communicationtopologies.Fig.3.Position and velocity trajectories for agents.Take and T as =2and T =0:6respectively.Assume that the communication delays ij (k )satisfies 21(k )= 32(k )= 43(k )=1T s , 52(k )= 54(k )=2T s ,while 65(k )= 61(k )=3T s ,for any k2+.Moreover,we assume the switching signal is periodically switched,every 3T s in a circular way from Ga to Gb ,from Gb to Gc ,and then from Gc to Ga .Obviously,the union of the digraphs 0(A (k ))across each time in-terval of 9T s is precisely the digraph G d in Fig.2,which therefore has a directed spanning tree.Fig.3shows that consensus is reached for algorithm (2),which is consistent with the result in Theorem 1.V .C ONCLUSIONS AND F UTURE W ORKIn this technical note,we have investigated a discrete-time second-order consensus algorithm for networks of agents with nonuniform and time-varying communication delays under dynamically changing com-munication topologies in a sampled-data setting.By employing graphic method,state argumentation technique as well as the product proper-ties of row-stochastic matrices from an infinite set,we have presented a sufficient condition in terms of the associated digraph of the interac-tion matrix in the presence of delays for the agents to reach consensus.Finally,we have shown the usefulness and advantages of the proposed result through simulation results.It is worth noting that the case with input delays is an interesting topic which deserves further investigation in our future work.。
- 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
- 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
- 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。
TheComplexityofAsynchronousByzantineConsensusParthaDutta,RachidGuerraouiandMarkoVukoli´cDistributedProgrammingLaboratory,EPFLCH-1015Lausanne,Switzerland
Abstract.Thispaperestablishesthefirsttheoremrelatingresilience,roundcomplexityandauthen-ticationindistributedcomputing.WegiveanexactmeasureofthetimecomplexityofconsensusalgorithmsthattolerateByzantinefailuresandarbitrarylongperiodsofasynchronyasintheInternet.Themeasureexpressestheabilityofprocessestoreachaconsensusdecisioninaminimalnumberofroundsofinformationexchange,asafunctionof(a)theabilitytouseauthenticationand(b)thenum-berofactualprocessfailures,inthoserounds,aswellasof(c)thetotalnumberoffailurestoleratedand(d)thesystemconfiguration.Themeasureholdsforaframeworkwherethedifferentrolesofprocessesaredistinguishedsuchthatwecandirectlyderiveameaningfulboundonthetimecomplexityofim-plementingrobustgeneralservicesinpracticaldistributedsystems.Toproveourtheorem,weestablishcertainlowerboundsandwegivealgorithmsthatmatchthesebounds.ThealgorithmsareallvariantsofthesamegenericasynchronousByzantineconsensusalgorithm,whichisinterestinginitsownright.
1Introduction1.1ContextWeestablishatheoremonthecomplexityoftheconsensusprobleminageneraldistributedsystemframeworkcomposedofthreekindsofprocesses[21]:proposers,acceptorsandlearners(Fig.1).Basically,theproblemconsistsforthelearnerstodecideonacommonvalueamongthoseproposedbytheproposers,usingacceptorsaswitnessesthathelpensuretheagreement.Everylearnerissupposedtoeventuallylearnavalue(liveness)thatisthesameproposedvalueforalllearners(safety)[2].Measuringthecomplexityoflearningadecisioninthisframeworkautomaticallyderivesameasureofthecomplexityofstatemachinereplication,ageneraltechniquetobuildrobustdistributedservices[19,31].WestudyconsensusalgorithmsthattolerateByzantinefailures.AByzantinefailurecaneithercorrespondtoacrashoramaliciousbehavior(bydefault,afailuremeansaByzantinefailure).Aprocessismaliciousifitdeviatesfromthealgorithmassignedtoitinawaythatisdifferentfromsimplystoppingallactivities(crashing).Besidesprocessfailures,thealgorithmsweconsideralsotoleratearbitrarilylongperiodsofasynchrony,duringwhichtherelativespeedsofprocessesandcommunicationdelaysareunbounded.Suchalgorithmsaresometimescalledasynchronous[6,21].Weassumehoweverthatthedurationoftheasynchronousperiodsandtheirnumberofoccurrencesarebothfinite,otherwiseconsensusisknowntobeimpossible[14].Processesthatdonotfailarecalledcorrectprocesses,andtheycaneventuallycommunicateamongeachotherinatimelymanner.Themodelassumedhere,calledtheeventuallysynchronousmodel[12],matchespracticalsystemsliketheInternetwhichareoftensynchronousandsometimesasynchronous.Whereasitisimportanttotolerateperiodsofasynchronyandthelargestnumberoffaultspossible,itisalsoimportanttooptimizealgorithmsforfavorable,andmostfrequent,situationswherethesystemissynchronousandveryfewprocessesfail.Clearly,itisneverpossibletolearnadecisioninoneroundofinformationexchange(wesaycommunicationround)andyetensureagreementdespitepossibleByzantinefailures.Therearehoweveralgorithms[6]where,incertainfavorablesituations,adecisionislearnedafterthreecom-municationroundsbyallcorrectlearners:wetalkaboutfastlearning.Infact,asconjecturedin[21],andasweshowinthispaper,thereareevenslightlymorefavorablesituations,whicharestillveryplausibleinpractice,underwhichlearningcanbeachievedintwocommunicationrounds:wetalkinthiscaseaboutveryfastlearningandaproposerfromwhichavaluecanbelearnedveryfastiscalledaprivilegedproposer.Thetheoremweestablishinthispaperdeterminestheexactcondi-tionsunderwhichveryfast(resp.fast)learningcanbeachieved.Underlyingthetheoremliesthenotionoffavorable(resp.veryfavorable)runsthatpreciselycapturesthefavorablesituationswementionedabove.Namely,arunrofaconsensusalgorithmAissaidtobeveryfavorable(resp.favorable)if:(1)rissynchronous,(2)asingle(correct)privilegedproposerplproposesavalueinrand(3)atmostQ≤F(resp.morethanQbutatmostF)acceptorsarefaulty(here,Fisoneoftheresiliencethresholdsdefinebelow).Basically,veryfast(resp.fast)learningisachievedinveryfavorable(resp.favorable)runsofalgorithmA.Ourtheoremisgeneralinthatitisparameterizedby(1)differentresiliencethresholds,(2)differentsystemconfigurations,aswellas(3)theabilityofprocessestouseauthenticationprimitives(public-keycryptography)[30]toachievefast(resp.veryfast)learning.
1.Wedistinguishtworesiliencethresholds:MandF;Mdenotesthemaximumnumberofacceptormaliciousfailuresdespitewhichconsensussafetyisensured(thenumberofacceptorcrash-onlyfailuresdoesnotinfluencesafety);Fdenotesthemaximumnumberofacceptorfailuresdespitewhichconsensuslivenessisensured.ParticularlyinterestingisthecasewhereM>F:consensussafetyshouldbepreserveddespiteMacceptormaliciousfailures,butlivenessisguaranteedonlyifthenumberofacceptorfailuresisatmostF.2.Wedistinguishtwosystemconfigurations:C1andC2;C1istheconfigurationwhereatleastoneprivilegedproposermightnotbeanacceptor,orthereareatleasttwoprivilegedproposers(Fig.1(a));C2istheconfigurationwherethereisonlyoneprivilegedproposer,whichisalsooneoftheacceptors(Fig.1(b)).3.Finally,wealsodistinguishthecasewheretheprocessesareallowedtouseauthenticationtoachieveveryfast(resp.fast)learningfromthecasewheretheyarenot.Notethat,inbothcases,wedonotpreventprocessesfromusingauthenticationinrunsthatdonotenableveryfastorfastlearning,typicallynon-favorablerunswithproposerfailuresandasynchronousperiods.Roughly,authenticationallowstherecipientofthemessagetovalidlyclaimtoathirdpartythatitreceivedthemessagefromtheactualoriginalsenderofthemessage[30].Thisabilityisamajorsourceofoverhead[24,27]andhence,wewouldtypicallyliketoavoidusingauthenticationfor(very)fastlearning.