Distributed Systems: Principles and Paradigms, Exam 2010, Answers


Distributed Systems: Principles and Paradigms, Exam, 2009-12

Department of Computer Science, VU University
Distributed Systems, 15.12.2009
MAKE SURE THAT YOUR HANDWRITING IS READABLE

1a. Explain what is meant by request-level and message-level interceptors in middleware. (5pt)
Request-level interceptors are special local components to which an invocation request is passed before passing it to the underlying middleware. Such an interceptor is invocation aware in the sense that it knows with which invocation it is dealing, and for which server it is intended. Typically, such interceptors can be used to implement replicated calls. A message-level interceptor is a component that is logically placed between the middleware and the underlying operating system. It can thus handle only basic network messages, for example, by fragmenting them into smaller parts (and assembling these parts at the receiver side).

1b. Where does the need for adaptive middleware come from? (5pt)
Middleware is intended to incorporate general-purpose, i.e., application-independent, mechanisms for distributed computing. The problem is that for practical purposes, it is very difficult to separate policy from mechanism, with the effect that many middleware solutions are not right for specific applications. The result is the need to be able to tweak the middleware for the specific needs of an application.

1c. In the underlying feedback control loop, give an example of the analysis component in combination with the reference input. (5pt)
An example that is also discussed in the book is analyzing whether measured performance is as good as it could have been had another replication scenario been used. In this case, the reference input is a cost function that needs to be minimized.

2a. What is the difference between transport-layer switching and content-aware request distribution? (5pt)
With transport-layer switching, a front end to a server cluster accepts incoming TCP connections and hands these off to one of the back-end servers using only information that is available at the TCP level: client address and destination port. In the case of content-aware request distribution, the switch can also inspect the content of requests (such as an HTTP URL) and use that information to decide to which back-end server the request should be forwarded.

2b. Explain how TCP handoff works and why it is difficult to apply to wide-area networks. (5pt)
With TCP handoff, an incoming connection request is forwarded by a switch to a specific server, which then sends the response back directly to the client, using the network address of the switch. This last issue is problematic in a wide-area system, as it essentially involves spoofing the switch, which is often difficult to do across administrative domains.

2c. Explain how content-aware request distribution can be combined with TCP handoff. (5pt)
Your answer should explain what is happening in Figure 12-9. It is essential that you mention the initial handoff to a distributor or dispatcher, which decides what the best server could be based on content, after which the TCP connection is handed off to that server. The switch is subsequently informed.

3a. Traditional RPC mechanisms cannot handle pointers. What is the problem and how can it be addressed? (5pt)
The problem is that pointers passed as parameters refer to a memory location that is local to the caller. That location is not only often meaningless to the recipient; more importantly, the recipient will most likely not have the data structure in its memory that the caller has. There are not many things you can do about this, except copying the entire (dynamic) data structure from the caller to the callee when doing the RPC. An alternative is to replace pointers by global systemwide references, as is done with Java object references.

3b. Where does the need for at-least-once and at-most-once semantics come from? Why can't we have exactly-once semantics? (5pt)
The problem originates from having a (suspected) server crash, detected by the lack of a response in the case of an RPC. What the client-side software can do is either resend the request until it finally gets a response (at-least-once semantics) or immediately report the failure to the client application, thus providing at-most-once semantics. Guaranteeing exactly-once semantics is, in principle, impossible, because you cannot know in general whether the server crashed before or after executing the requested operation.

3c. Consider a client/server system based on RPC, and assume the server is replicated for performance. Sketch an RPC-based solution for hiding replication of the server from the client. (5pt)
Simply take a client-side stub that replicates the call to the respective servers. It is essential that you mention that these calls should be done in parallel and that (for example) the first response is immediately passed to the client. Serializing the RPCs or waiting for all responses is OK for fault tolerance, but certainly not for performance.

4a. Resolve the following key lookups for the shown Chord-based P2P system: (5pt)
source   key
4        15
4        22
21       30
21       27
20       18
15@4: 14–18; 22@4: 20–21–28; 30@21: 28–1; 27@21: 28; 18@20: 4–14–18

4b. Adjust the finger tables of nodes 18 and 14 when a node with ID 24 enters the ring. Also give the finger table of node 24. (5pt)
Node 18: [20,20,24,28,4]; Node 14: [18,18,18,24,1]; Node 24: [28,28,28,1,9].

4c. Chord allows keys to be looked up recursively or iteratively. Explain the differences, as well as the main advantage of iterative over recursive lookup. (5pt)
With recursive lookups, a message is forwarded from peer to peer until it reaches its destination. In contrast, with an iterative lookup, the requester is returned the next peer it should ask for the key. One can argue that in the case of Chord, iterative lookups are much better: recursive lookups do not have the advantage of proximity-awareness. Also, note that iterative lookups have the advantage of letting the client handle failures more easily.

5a. Explain how two-phase commit works. (5pt)
Make sure that you explain (1) coordinator sends vote-request; (2) participants respond; (3) coordinator sends decision; (4) participants ack.

5b. Explain what happens when a participant, who is in the READY state, times out because it hasn't received a response from the coordinator yet. (5pt)
In that case, P can check whether any of the other participants has made a transition to either ABORT or INIT (in which case P can abort) or COMMIT (and commit as well). The difficulty is when all others are in READY: they all need to wait until the coordinator recovers.

5c. If we use two-phase commit for a distributed transaction, can we allow a coordinator to issue two distributed transactions (involving the same participants) at the same time? (5pt)
Yes: the local transaction managers at the participants will handle any concurrency issues. What is seen here is that the use of 2PC is completely independent of the semantics of specific transactions.

6a. How can a Web hosting service help in handling flash crowds? (5pt)
Crucial for a correct answer is that you not only state that content is replicated, but also that the origin server is assumed to remain capable of redirecting requests, though perhaps no longer of returning content-rich responses as well. Note that distributed request distribution is really tricky business.

6b. Akamai uses DNS-based redirection. Explain how resolution of the name would work. (5pt)
The trick is that the regular DNS will resolve the name , pointing to a DNS server that is controlled by Akamai. If we use iterative DNS name resolution, that server will know the IP address of the requesting client, and be able to decide to which server (with logical name ) it can forward the request.

6c. Explain the difference between content-aware and content-blind caching for Web applications by means of an example. (5pt)
With content-aware caching, the cache has knowledge of the data model that is used by the Web application and, with that, can conduct query-containment procedures to see whether a query could possibly be answered by the data that is already cached. For example, if an edge server had once received the query "select ALL FROM books WITH author=Irving", it can cache that result. Later, when receiving a query "select ALL FROM books WITH author=Irving AND date<2008", the edge server should be able to recognize that this is a subquery, and that it can thus look into its local cache. With content-blind caching, the cache simply attaches a unique ID to an entire, specific query in order to check whether that exact query had been issued before. If so, it can possibly return the previously stored response from its cache. In our example, the two queries would each get a unique ID, which is then used to do a cache lookup.

Grading: The final grade is calculated by accumulating the scores per question (maximum: 90 points), and adding 10 bonus points. The maximum total is therefore 100 points.
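The Chord answers to questions 4a and 4b can be checked mechanically. The sketch below is only an illustration: the exam's figure is not reproduced in this text, so the node set `NODES` and the identifier size `M = 5` are assumptions, chosen because they are consistent with the finger tables given in the answer to 4b. The helper names (`succ`, `finger_table`, `lookup`) are hypothetical.

```python
M = 5                      # assumed number of identifier bits: ring of 2**M = 32 IDs
RING = 2 ** M
# Assumed node set (the figure is missing); consistent with the answer to 4b.
NODES = [1, 4, 9, 11, 14, 18, 20, 21, 28]


def succ(k, nodes):
    """First node whose ID is >= k, wrapping around the ring."""
    k %= RING
    ordered = sorted(nodes)
    for n in ordered:
        if n >= k:
            return n
    return ordered[0]


def finger_table(p, nodes):
    """FT_p[i] = succ(p + 2^(i-1)) for i = 1..M."""
    return [succ(p + 2 ** (i - 1), nodes) for i in range(1, M + 1)]


def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b


def lookup(start, key, nodes):
    """Return the nodes visited after `start` while resolving `key`."""
    path, cur = [], start
    while succ(key, nodes) != cur:
        ft = finger_table(cur, nodes)
        if in_half_open(key, cur, ft[0]):
            nxt = ft[0]    # the immediate successor is responsible for the key
        else:
            # farthest finger that does not overshoot the key
            nxt = next(f for f in reversed(ft) if in_half_open(f, cur, key))
        path.append(nxt)
        cur = nxt
    return path


if __name__ == "__main__":
    for source, key in [(4, 15), (4, 22), (21, 30), (21, 27), (20, 18)]:
        print(f"{key}@{source}:", lookup(source, key, NODES))
    ring_with_24 = NODES + [24]
    for node in (18, 14, 24):
        print(f"Node {node}:", finger_table(node, ring_with_24))
```

Under these assumptions the printed paths and tables can be compared directly with the answers to 4a and 4b; with a different node set the results would of course differ.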

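Distributed Systems Fundamentals Exam
(Answers on the last page)

I. Multiple-choice questions

1. What is the definition of a distributed system?
A. A system made up of multiple computers that communicate and coordinate over a network B. A computer system that provides distributed services C. A software technology that lets applications run across multiple hardware platforms and operating systems D. A system that lets multiple users access and operate it at the same time
2. What does the word "distributed" mean in a distributed system?
A. Multiple systems run independently B. Data is stored in multiple locations C. The system is highly available and fault tolerant D. Every node can complete tasks independently
3. What is the core characteristic of a distributed system?
A. Concurrency B. Transparency C. Scalability D. Fault tolerance
4. What types of nodes can a distributed system contain?
A. Servers B. Desktop computers C. Mobile devices D. Any of these
5. Which communication protocols are used in distributed systems?
A. HTTP B. TCP/IP C. UDP D. Any of these
6. What does data consistency mean in a distributed system?
A. The data on all nodes is exactly identical B. The data on all nodes is updated in sync C. The data on all nodes is the same at some point in time D. The data on the nodes may differ
7. What is load balancing in a distributed system?
A. Distributing requests evenly across multiple servers B. Restricting traffic to a single server C. Spreading traffic across multiple servers D. Forwarding all traffic to a single server
8. What is replication in a distributed system?
A. Creating copies of data on multiple nodes B. Storing data at a remote location C. Encrypting data D. Storing data locally
9. What does the CAP theorem refer to?
A. The trade-off among consistency, availability, and partition tolerance B. The trade-off among consistency, availability, and performance C. The trade-off among consistency, availability, and scalability D. The trade-off among consistency, availability, and security
10. What is a distributed transaction?
A. A transaction that must be executed synchronously on multiple nodes B. A transaction that can be executed in parallel on multiple nodes C. A transaction that cannot be executed synchronously on multiple nodes D. A transaction that can be executed synchronously on multiple nodes but does not require consistency
11. What is the definition of a distributed system?
A. A set of independent computers that communicate and cooperate over a network B. A combination of hardware and software that can run on multiple processors C. An internet that provides distributed services D. A system of multiple servers, each with its own resources
12. What does the word "distributed" imply in a distributed system?
A. System components are located in different geographic locations B. System components work together to complete a task C. System components run independently and communicate with one another D. System components share data and resources
13. What types of nodes can a distributed system contain?
A. Master nodes B. Slave nodes C. Clients D. All types of nodes
14. What is the purpose of data replication in a distributed system?
A. To improve system performance B. To prevent data loss C. To improve data availability D. To guarantee data consistency
15. What kind of technique is load balancing in a distributed system?
A. Distributing requests across multiple servers to optimize performance B. Restricting traffic to specific servers to avoid congestion C. Routing client requests directly to the correct server D. Using an algorithm to decide which server should handle which request
16. What is a consensus algorithm in a distributed system?
A. A technique that ensures all nodes agree on the consistency of the data B. A technique for synchronizing data state between nodes C. A technique for detecting and handling network latency D. A technique for managing failures in a distributed system
17. What is a fault-tolerance mechanism in a distributed system?
A. A technique that keeps the system running normally when some components fail B. A technique for detecting and repairing system errors C. A technique for protecting the system from malicious attacks D. A technique for limiting the number of users in the system
18. What is data sharding in a distributed system?
A. Splitting data into small chunks so they can be stored in different locations B. Splitting data into small chunks so they can be stored on different hardware devices C. Splitting data into small chunks so they can be transmitted over different networks D. Splitting data into small chunks so they can be accessed at different points in time
19. What is the message-passing mechanism in a distributed system?
A. A technique for passing messages between nodes B. A technique for synchronizing data between nodes C. A technique for exchanging data between nodes D. A technique for coordinating tasks between nodes
20. What does security mean in a distributed system?
A. Protecting the system against unauthorized access B. Protecting the system against unauthorized modification C. Protecting the system against unauthorized data disclosure D. Protecting the system against all of the above threats
21. What is the definition of a distributed system?
A. A system in which a group of computers communicate and coordinate over the Internet B. A collection of hardware and software that can process large amounts of data in limited time C. An Internet system that provides distributed services D. A network architecture that lets multiple users access and share resources
22. What does the word "distributed" imply in a distributed system?
A. Multiple systems run independently B. Data is stored in multiple locations C. The system is highly available and scalable D. All nodes work together to complete a specific task
23. Which of the following are core characteristics of a distributed system?
A. Reliability B. Availability C. Concurrency D. Fault tolerance
24. Which communication protocol is usually used in distributed systems?
A. HTTP B. TCP/IP C. UDP D. ICMP
25. What does "fault tolerance" mean in a distributed system?
A. The ability of the system to keep running when some components fail B. The ability of the system to recover lost data or processes automatically C. The ability of the system to adjust itself to avoid single points of failure D. The ability of the system to keep all nodes synchronized
26. What is a distributed database?
A. A database that holds multiple data replicas to improve data availability and performance B. A database with only one copy of the data C. A database that adjusts its data distribution dynamically D. A database that supports real-time data updates
27. Which of the following is a design principle of distributed systems?
A. High centralization B. High decentralization C. High scalability
28. What is a "microservice" in a distributed system?
A. A particular programming style or architectural pattern in which an application is split into a set of small services B. An implementation technique for distributed systems C. A single, centralized service D. A particular data storage technology
29. What is the difference between "synchronous" and "asynchronous" in a distributed system?
A. Synchronous means multiple processes or threads access the same resource at the same time B. Asynchronous means multiple processes or threads access the same resource at different times C. Synchronous operation is usually used where data consistency is required D. Asynchronous operation is usually used where system performance needs to be improved
30. How did distributed systems develop, and where are they applied?
A. The development of distributed systems began in the 1980s B. Distributed systems are widely used in big-data processing, cloud computing, the Internet of Things, and similar areas C. The development of distributed systems was influenced by computer networking technology D. Distributed systems are a basic building block of modern computer systems
31. What is the definition of a distributed system?
A. A group of computer systems that communicate over a network B. A combination of hardware and software that can process and store data at multiple locations C. A system that lets multiple servers share resources and data D. A system designed to handle large amounts of data and guarantee data consistency
32. What does the word "distributed" imply in a distributed system?
A. Multiple systems run independently B. Resource sharing C. Data backup D. All of these are correct
33. What is the core characteristic of a distributed system?
B. High availability C. Task independence D. Resource sharing
34. What does "concurrency" mean in a distributed system?
A. Executing multiple tasks at the same time B. Accessing the same resource at the same time C. Processing multiple data streams at the same time D. Modifying the database at the same time
35. Which of the following is not a common synchronization problem in distributed systems?
A. Network latency between machines B. The order in which tasks execute C. Access conflicts on shared resources D. Data consistency problems
36. What does "transparency" mean in a distributed system?
A. Users feel as if all system components run locally B. System administrators can manage all components remotely C. Application data and code are portable between hosts D. All of these are correct
37. Which of the following distributed algorithms is not mentioned in the CAP theorem?
A. Client-server algorithm B. Consistency algorithm C. Partition-tolerance algorithm D. Content-distribution algorithm
38. What does "partition tolerance" mean in a distributed system?
A. The system can still run during a network failure B. The system can keep running during a network partition C. The system can still run during network congestion D. The system can keep running when the network is misconfigured
39. Which of the following is not a common performance metric for distributed systems?
A. Response time B. Scalability C. Fault tolerance D. Resource utilization
40. What is the biggest difference between a distributed system and a traditional centralized system?
A. Higher reliability B. Better scalability C. No dependence on a central point of control D. All of these are correct

II. Short-answer questions

1. What is a distributed system? Briefly describe its basic characteristics.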

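Distributed Systems: Principles and Paradigms

A distributed system consists of multiple computers that are connected by a network and cooperate to complete tasks. It offers high performance, high reliability, and scalability, and is widely used in large-scale data processing, parallel computing, cloud computing, and similar areas. The computers in a distributed system are independent of one another: they have no shared memory and communicate and coordinate by message passing.

The main principles of distributed systems are concurrency, transparency, fault tolerance, scalability, and consistency.

The first principle is concurrency. The computers in a distributed system execute tasks in parallel, which raises the system's processing capacity. Concurrency can be achieved with multiple threads or multiple processes, and it requires solving concurrency-control problems such as synchronization, mutual exclusion, and deadlock.

The second principle is transparency. A distributed system should appear to its users as a single computer system and hide the complexity of the distributed deployment. Transparency includes access transparency (users access the distributed system without noticing that it is distributed), location transparency (users need not care where data is physically located), migration transparency (users need not care that resources move within the system), replication transparency (users need not care about replicas of the data), and concurrency transparency (users need not care about concurrent access).

The third principle is fault tolerance. Nodes in a distributed system may fail, so to keep the system reliable it must detect failures, recover from them, and tolerate them. For example, redundant backups, error detection, and error-correcting codes are used to detect and repair faults so that the system can keep running normally.

The fourth principle is scalability. A distributed system must be able to add computing resources easily to cope with growth in data volume and workload. Scalability can be achieved by scaling out (horizontal scaling), which adds more nodes, or by scaling up (vertical scaling), which increases the processing capacity of a single node.

The last principle is consistency. Data in a distributed system may be stored on different nodes, and the nodes need to maintain a consistent view of that data. Consistency can be provided as strong consistency or weak consistency: strong consistency requires all nodes to observe the same view of the data, whereas weak consistency tolerates a certain amount of inconsistency.

Distributed systems follow several paradigms, including the client-server model, the peer-to-peer model, and the publish-subscribe model. The client-server model is the most common one: one computer acts as a server that provides a service, and other computers act as clients that request it.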

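Distributed Systems Exercises

I. Multiple-choice questions

Overview

1. Which of the following is not a characteristic of a distributed system? ( C )
A. Transparency B. Openness C. Ease of use D. Scalability
3. Which of the following statements is correct? ( A )
A. A middleware-based system is more transparent than a network operating system (√)
B. A network operating system is more transparent than a distributed operating system (×)
C. A middleware-based system is more transparent than a distributed operating system (×)
D. A distributed operating system can run on heterogeneous multicomputer systems
4. From the schematic of a network operating system below, it can be seen that ( B )
A. A network operating system is tightly coupled and can therefore only run on homogeneous multicomputer systems (×)
B. A network operating system does not require the operating systems on the individual computers to be the same (√)
C. Distributed applications running on top of a network operating system can achieve high transparency (×)
D. A network operating system can be managed conveniently as a single global system (×)
5. Adding a middleware layer on top of a network operating system mainly serves to ( D )
A. Make up for the network operating system's shortcomings in scalability B. Make up for the network operating system's shortcomings in openness C. Improve the stability of the network operating system D. Improve the transparency of the network operating system
1. Which of the following is not a goal of distributed systems? ( C )
A. Connecting users and resources B. Transparency C. Heterogeneity D. Openness and scalability
2. Which of the following systems has shared memory? ( B )
A. Homogeneous multicomputer system B. Multiprocessor system C. Heterogeneous multicomputer system D. Local-area network system
3. Which of the following operating systems can run on a homogeneous multicomputer system? ( A )
A. Distributed operating system B. Network operating system C. Middleware system D. Embedded operating system
4. The main means of communication in a multicomputer system is ( B )
A. Shared memory B. Message passing C. File transfer D. The TCP/IP protocols
6. Which of the following does not belong to the three-tier client/server model? ( C )
A. User-interface layer B. Data layer C. Communication layer D. Processing layer
2. The operating system with the highest degree of transparency is ( A )
A. Multiprocessor distributed operating system B. Multicomputer distributed operating system C. Network operating system D. Middleware-based operating system
3. In the typical client/server interaction shown in the figure below, assuming the client blocks, its blocking time is ( A? )
A. T4 - T1 B. T4 - T2 C. T3 - T2 D. T3 - T1
4. The middleware protocols of a distributed system sit at which layer of the network protocol stack? ( D )
A. Transport layer B. Data link layer C. Network layer D. Application layer
6. In the client/server model, which layer implements the core processing functions? ( D )
A. User-interface layer B. Data layer C. Communication layer D. Middle layer
11. A network operating system requires that the computers it manages ( B )
A. Have homogeneous hardware (not required) B. Use identical or mutually compatible communication protocols C. Run the same operating system (not required) D. Have the same middleware installed
1. Transparency of a distributed system means that ( B )
A. Users need not care about any operation B. Users need not care about the details of how the system is implemented C. The system need not care about the details of the user's operations D. The system need not care about the course of the user's operations
3. Which of the following diagrams of the processor-memory relationship shows a multicomputer system? ( D? )
A. B. C. D. (the option diagrams are not reproduced)
4. Compared with a distributed operating system, a middleware system has better ____; compared with a network operating system, it has better ____. ( A )
A. Scalability and openness; transparency and ease of use B. Scalability and transparency; openness and ease of use C. Transparency and ease of use; scalability and openness D. Transparency and openness; scalability and ease of use
17. The system with the highest degree of transparency is ( C )
A. Network operating system B. Middleware system C. Distributed operating system D. Loosely coupled system
5. Middleware protocols sit at which layer of the network protocol stack? ( D )
A. Transport layer B. Session layer C. Network layer D. Application layer

Communication

5. In asynchronous communication, the client process first sends the message to ( A? )
A. The server's buffer B. The server process C. The client's buffer D. The network
10. In RPC, the interface that the client calls is known as ( A? )
A. The client stub B. The server stub C. The remote object interface D. The message interface
14. The communication used by e-mail systems is ( B )
A. Transient communication B. Persistent communication C. Middle-layer communication D. RPC communication
7. In RPC, the server stub packs the result of the server's execution into a message and hands it to ( A )
A. The server's operating system B. The client stub C. The client's operating system D. The server ( A? )
6. In RPC, the client stub and the server stub each contain a set of call interfaces. Do they contain implementations of these interfaces? ( D??? )
A. The client stub does, the server stub does not B. Neither does C. The client stub does not, the server stub does D. Both do

Processes

8. The figure below shows how iterative servers and concurrent servers are organized.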

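Distributed Database Systems Exam
(Answers on the last page)

I. Multiple-choice questions

1. What is the definition of a distributed database system?
A. A technology that stores data in database systems at multiple geographic locations and manages and accesses the data through a distributed computing framework.
B. A single centralized database system in which all data is stored on one server.
C. A database system that splits the data into multiple parts and stores them on different servers.
D. A database system that does not depend on a single server; data can be stored and accessed across multiple servers.

2. What are the advantages of a distributed database system?
A. Faster and more efficient data processing.
B. Lower risk of a single point of failure.
C. Better data redundancy and fault tolerance.
D. Better scalability: new data and nodes can be added more easily.

3. Which of the following is not a common topology in distributed database systems?
A. Star topology B. Ring topology C. Mesh topology D. Tree topology

4. What is fragmentation (sharding) in a distributed database system?
A. Dividing all the data of the database system into multiple parts, each stored on a separate node.
B. Dividing one or more tables of the database system into multiple parts according to some rule.
C. Dividing the data of the database system into multiple parts according to some rule, each stored on a separate node.
D. Dividing one or more tables of the database system into multiple parts according to some rule and storing them on different nodes.

5. What is replication in a distributed database system?
A. Copying the data of the database system to multiple nodes to ensure the reliability and availability of the data.
B. Storing the data of the database system at multiple geographic locations to ensure the reliability and availability of the data.
C. Dividing the data of the database system into multiple parts according to some rule and storing them on different nodes.
D. Dividing one or more tables of the database system into multiple parts according to some rule and storing them on different nodes.

6. What is a distributed transaction in a distributed database system?
A. A form of transaction processing that must update data synchronously on multiple nodes.
B. A form of transaction processing that can be carried out in parallel on multiple nodes.
C. A form of transaction processing that must ensure the consistency and integrity of the data.
D. A form of transaction processing that can execute on multiple nodes at the same time.

7. What does data consistency mean in a distributed database system?
A. The data is kept in a consistent state across multiple nodes.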

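Distributed Databases: Answers to End-of-Chapter Exercises

Chapter 1: Overview of Distributed Database Systems

1.1 Define the following distributed database terms in your own words:

(1) Local data: data needed only by the local applications of its own site.
Global data: data that, although physically stored at the individual sites, takes part in global applications.

(2) Global/local users.
Local user: a user or application that accesses only data at the site where it is registered is called a local user or local application.
Global user: a user or application whose accesses involve data at two or more sites is called a global user or global application.

Global/local DBMS:
1) LDBMS (Local DBMS): the database management system at a local site. It creates and manages the local database, provides site autonomy, and executes local applications and the subqueries of global queries.
2) GDBMS (Global DBMS): the global database management system. Its main functions are to provide distribution transparency, coordinate the execution of global transactions, coordinate the local DBMSs to complete global applications, guarantee the global consistency of the database, perform concurrency control, synchronize updates, and provide global recovery.

(3) Global external schema: the user view of global applications, also called the global view. It is extracted from the logical collection formed by the local databases; that is, the global external schema is a subset of the global conceptual schema. Global users can treat all databases at all sites of the distributed database system as if they were at their own site, and need only be concerned with the part of the data they use.

(4) Global conceptual schema: describes the logical structure and data characteristics of the global data in the distributed database; it is the global conceptual view of the distributed database. In the relational model, the global conceptual schema consists of the definitions of a set of global relations (relation names, the attributes of each relation, the data type and length of each attribute, and so on) and integrity definitions (primary keys, foreign keys, and other integrity constraints).

(5) Fragmentation schema: describes the logical partitioning of the global data. Each global relation can be logically divided into several fragments by the relational operations of selection and projection. The fragmentation schema describes the data fragments (or defines them) and the mapping between global relations and fragments. This mapping is one-to-many.

(6) Allocation schema: based on the chosen data distribution strategy, defines the physical site at which each fragment is stored; that is, it defines the type of the fragment mapping and determines whether the distributed database is redundant or nonredundant, and the degree of redundancy. If a fragment is allocated to multiple sites, the fragment mapping is one-to-many and the distributed database is redundant; otherwise it is nonredundant.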
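To make items (5) and (6) more concrete, here is a minimal sketch of horizontal fragmentation by selection, together with checks for the usual correctness conditions (completeness, disjointness, reconstruction) and for redundancy of an allocation. Everything in it is hypothetical: the relation `EMP`, the fragment predicates, the site names, and the helper functions are made up for illustration and do not come from the textbook.

```python
# Hypothetical global relation EMP(id, name, city), stored as tuples.
EMP = [(1, "Ann", "Beijing"), (2, "Bo", "Shanghai"), (3, "Cy", "Beijing")]


def select(rows, predicate):
    """Relational selection: keep the rows that satisfy the predicate."""
    return [r for r in rows if predicate(r)]


def is_complete(rows, fragments):
    """Completeness: every row of the global relation is in some fragment."""
    return all(any(r in f for f in fragments) for r in rows)


def is_disjoint(fragments):
    """Disjointness: no row appears in more than one fragment."""
    seen = set()
    for fragment in fragments:
        for row in fragment:
            if row in seen:
                return False
            seen.add(row)
    return True


def reconstruct(fragments):
    """Reconstruction of a horizontally fragmented relation by union."""
    return sorted(set().union(*fragments))


def is_redundant(allocation):
    """An allocation is redundant if some fragment is stored at several sites."""
    return any(len(sites) > 1 for sites in allocation.values())


# Fragmentation schema: one fragment per city (one-to-many mapping).
emp_beijing = select(EMP, lambda r: r[2] == "Beijing")
emp_shanghai = select(EMP, lambda r: r[2] == "Shanghai")
fragments = [emp_beijing, emp_shanghai]

# Allocation schema: fragment name -> sites holding a copy (made-up sites).
allocation = {"emp_beijing": ["site1", "site2"], "emp_shanghai": ["site2"]}

if __name__ == "__main__":
    print(is_complete(EMP, fragments), is_disjoint(fragments))
    print(reconstruct(fragments) == sorted(EMP))
    print(is_redundant(allocation))
```

Vertical fragmentation by projection would need a join on the key for reconstruction instead of a union; that case is left out to keep the sketch short.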

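2010 National Self-Study Examination, Principles of Database Systems, Mock Paper (6), with Answers

I. Single-choice questions (15 questions, 2 points each, 30 points in total)
Only one of the four options listed for each question meets the requirements of the question; write its letter in the parentheses after the question. No credit is given for wrong, multiple, or missing selections.

1. When defining the fragments of a distributed database, the conditions that must be satisfied are the completeness condition, the disjointness condition, and ( )
A. The security condition B. The data consistency condition C. The reconstruction condition D. The data integrity condition
Answer: C
2. In the SQL statement "CREATE UNIQUE INDEX…", UNIQUE indicates that in the base table ( )
A. Index key values cannot be decomposed B. Index key values are all unique C. There are no duplicate tuples D. There are no duplicate column values
Answer: B
3. ( ) lies between the fragmentation view and the allocation view.
A. Fragmentation transparency B. Location transparency C. Local data model transparency D. Replication transparency
Answer: B
4. In a multiuser shared system, concurrently executing transactions interfere with one another, which violates the transactions' ( )
A. Atomicity B. Consistency C. Isolation D. Durability
Answer: C
5. SQL is a relational query language that lies between ( ).
A. Relational algebra and domain calculus B. Tuple calculus and domain calculus C. Relational algebra and tuple calculus D. Relational algebra and set operations
Answer: C
6. Suppose the student relation is S(S#,SNAME,SEX,AGE), the course relation is C(C#,CNAME,TEACHER), and the enrollment relation is SC(S#,C#,GRADE). Finding the names of the female students who take the course "COMPUTER" involves the relations ( )
A. S B. SC, C C. S, SC D. S, C, SC
Answer: D
7. The main approaches to data management are ( )
A. Batch processing and file systems B. File systems and distributed systems C. Distributed systems and batch processing D. Database systems and file systems
Answer: D
8. Boxes are used in both E-R diagrams and data flow diagrams. Which of the following statements is wrong? ( )
A. In an E-R diagram a box denotes an entity B. In an E-R diagram a box denotes an attribute C. In a data flow diagram a box denotes a source D. In a data flow diagram a box denotes a sink
Answer: B
9. The relationship between a superclass and a subclass is that ( )
A. Superclass entities inherit all attributes of subclass entities B. Subclass entities inherit all attributes of superclass entities C. Superclass entities inherit the primary key of subclass entities D. Subclass entities inherit the primary key of superclass entities
Answer: B
10. A simple strategy for avoiding livelock is ( )
A. Ordered locking B. One-by-one locking C. Serving requests in priority order D. First come, first served
Answer: D
11. In SQL, the ( ) clause implements the referential rules for relations.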

Distributed Systems: Principles and Paradigms, Exam 2009, Answers

it is sent, or even at the same time it is sent, since it takes a finite, nonzero amount of time to arrive.

7. finger table (chap 5)
Instead of a linear approach toward key lookup, each Chord node maintains a finger table of at most m entries. If FT_p denotes the finger table of node p, then FT_p[i] = succ(p + 2^{i-1}). Put in other words, the i-th entry points to the first node succeeding p by at least 2^{i-1}.

8. out of band data (chap 3)
Data that is to be processed by the server before any other data from that client.

9. MapReduce (5pt)

II. Short-answer questions (70 points in total)

1. Q: What is the difference between a vertical distribution and a horizontal distribution? (chap 2, 5pt)
A: Vertical distribution refers to the distribution of the different layers in a multitiered architecture across multiple machines. In principle, each layer is implemented on a different machine. Horizontal distribution deals with the distribution of a single layer across multiple machines, such as distributing a single database.

2. Q: Is a server that maintains a TCP/IP connection to a client stateful or stateless? (chap 3)
A: Assuming the server maintains no other information on that client, one could justifiably argue that the server is stateless. The issue is that not the server, but the transport layer at the server maintains state on the client. What the local operating systems keep track of is, in principle, of no concern to the server.

3. Q: One way to handle parameter conversion in RPC systems is to have each machine send parameters in its native representation, with the other one doing the translation, if need be. The native system could be indicated by a code in the first byte. However, since locating the first byte in the first word is precisely the problem, can this actually work? (chap 4)
A: First of all, when one computer sends byte 0, it always arrives in byte 0. Thus the destination computer can simply access byte 0 (using a byte instruction) and the code will be in it. It does not matter whether this is the low-order byte or the high-order byte. An alternative scheme is to put the code in all the bytes of the first word. Then no matter which byte is examined, the code will be there.

4. Q: Routing tables in IBM WebSphere, and in many other message-queuing systems, are configured manually. Describe a simple way to do this automatically. (chap 4)
A: The simplest implementation is to have a centralized component in which the topology of the queuing network is maintained. That component simply calculates all best routes between pairs of queue managers using a known routing algorithm, and subsequently generates routing tables for each queue manager. These tables can be downloaded by each manager separately. This approach works in queuing networks where there are only relatively few, but possibly widely dispersed, queue managers.

5. Q: Is an identifier allowed to contain information on the entity it refers to? (chap 5)
A: Yes, but that information is not allowed to change, because that would imply changing the identifier. The old identifier should remain valid, so that changing it would imply that an entity has two identifiers, violating the second property of identifiers.

6. Q: When a node synchronizes its clock to that of another node, it is generally a good idea to take previous measurements into account as well. Why? Also, give an example of how such past readings could be taken into account. (chap 6)
A: The obvious reason is that there may be an error in the current reading. Assuming that clocks need only be gradually adjusted, one possibility is to consider the last N values and compute a median or average. If the measured value falls outside a current interval, it is not taken into account (but is added to the list). Likewise, a new value can be computed by taking a weighted average, or an aging algorithm.
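The answer to question 6 can be illustrated with a small filter over measured clock offsets. This is only a sketch under stated assumptions: the class name `OffsetFilter`, the window size `n`, the acceptance `tolerance` around the median, and the aging weight `alpha` are all made up for illustration and are not prescribed by the exam or the book.

```python
from collections import deque
from statistics import median


class OffsetFilter:
    """Smooths measured clock offsets using the last N readings."""

    def __init__(self, n=8, tolerance=0.05, alpha=0.25):
        self.history = deque(maxlen=n)  # last N measured offsets (seconds)
        self.tolerance = tolerance      # accepted distance from the median
        self.alpha = alpha              # weight of a new reading (aging factor)
        self.estimate = None            # current smoothed offset

    def update(self, measured):
        """Record a reading and return the (possibly unchanged) estimate."""
        if self.history:
            center = median(self.history)
            accepted = abs(measured - center) <= self.tolerance
        else:
            accepted = True             # nothing to compare the first reading to
        # A rejected reading is still added to the list, as the answer suggests.
        self.history.append(measured)
        if accepted:
            if self.estimate is None:
                self.estimate = measured
            else:
                # Aging: older readings keep weight (1 - alpha), the new one alpha.
                self.estimate = (1 - self.alpha) * self.estimate + self.alpha * measured
        return self.estimate


if __name__ == "__main__":
    f = OffsetFilter(n=5, tolerance=0.05, alpha=0.5)
    for reading in (0.10, 0.12, 5.0, 0.11):
        print(reading, "->", f.update(reading))
```

The reading of 5.0 falls outside the accepted interval around the median, so it leaves the estimate unchanged but stays in the history; if the clock really has jumped, later readings will shift the median and start being accepted.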


Distributed Systems, Final Exam, Spring Semester 2010
Department of Computer Science, Peking University, June 7, 2010
Department:        Student ID:        Name:

I. Concept questions (15 points in total)

1. asynchronous RPC (3pt) (chap 4)
With asynchronous RPCs, the server immediately sends a reply back to the client the moment the RPC request is received, after which it calls the requested procedure. The reply acts as an acknowledgment to the client that the server is going to process the RPC. The client will continue without further blocking as soon as it has received the server's acknowledgment.

2. response failure, state transition failure (3pt) (chap 8)
With a response failure, the server's response is simply incorrect. Two kinds of response failures may happen. In the case of a value failure, a server simply provides the wrong reply to a request. The other type of response failure is known as a state transition failure. This kind of failure happens when the server reacts unexpectedly to an incoming request.

3. stateless server, soft state (3pt) (chap 3)
A stateless server does not keep information on the state of its clients, and can change its own state without having to inform any client. A particular form of a stateless design is where the server maintains what is known as soft state. The server promises to maintain state on behalf of the client, but only for a limited time. After that time has expired, the server falls back to default behavior.

4. totally-ordered multicast, causally-ordered multicasting (3pt) (chap 6)
A totally-ordered multicast is a multicast operation by which all messages are delivered in the same order to each receiver. With causally-ordered multicasting, messages that are causally related are delivered in that causal order at every receiver. If two messages are not in any way related to each other, we do not care in which order they are delivered to applications. They may even be delivered in different order at different locations.

5. Monotonic-read consistency (3pt) (chap 7)
If a process reads the value of a data item x, any successive read operation on x by that process will always return that same value or a more recent value. In other words, monotonic-read consistency guarantees that if a process has seen a value of x at time t, it will never see an older version of x at a later time.

II. Short-answer questions (35 points in total)

1. Q: 1) Resolve the following key lookups for the shown Chord-based P2P system: (5pt) (chap 5)
15@4: 14–18; 22@4: 20–21–28; 30@21: 28–1; 27@21: 28; 18@20: 4–14–18

2) Adjust the finger tables of nodes 18 and 14 when a node with ID 24 enters the ring. Also give the finger table of node 24. (5pt) (chap 5)
Node 18: [20,20,24,28,4]; Node 14: [18,18,18,24,1]; Node 24: [28,28,28,1,9].

3) Chord allows keys to be looked up recursively or iteratively. Explain the differences, as well as the main advantage of iterative over recursive lookup. (5pt) (chap 5)
With recursive lookups, a message is forwarded from peer to peer until it reaches its destination. In contrast, with an iterative lookup, the requester is returned the next peer it should ask for the key. One can argue that in the case of Chord, iterative lookups are much better: recursive lookups do not have the advantage of proximity-awareness. Also, note that iterative lookups have the advantage of letting the client handle failures more easily.

2. Q: What is a three-tiered client-server architecture? (5pt) (chap 2)
A three-tiered client-server architecture consists of three logical layers, where each layer is, in principle, implemented at a separate machine. The highest layer consists of a client user interface, the middle layer contains the actual application, and the lowest layer implements the data that are being used.

3. Q: Where does the need for at-least-once and at-most-once semantics come from? Why can't we have exactly-once semantics? (5pt)
The problem originates from having a (suspected) server crash, detected by the lack of a response in the case of an RPC. What the client-side software can do is either resend the request until it finally gets a response (at-least-once semantics) or immediately report the failure to the client application, thus providing at-most-once semantics. Guaranteeing exactly-once semantics is, in principle, impossible, because you cannot know in general whether the server crashed before or after executing the requested operation.

4. Q: 1) Explain how two-phase commit works. (5pt) (chap 8)
Make sure that you explain (1) coordinator sends vote-request; (2) participants respond; (3) coordinator sends decision; (4) participants ack.

2) Explain what happens when a participant, who is in the READY state, times out because it hasn't received a response from the coordinator yet. (5pt) (chap 8)
In that case, P can check whether any of the other participants has made a transition to either ABORT or INIT (in which case P can abort) or COMMIT (and commit as well). The difficulty is when all others are in READY: they all need to wait until the coordinator recovers.
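The two-phase commit answers above can be summarized as two small decision rules. The sketch below is an illustration only, not a full protocol implementation: it omits messaging, logging, and recovery, and the names `State`, `coordinator_decision`, and `on_ready_timeout`, as well as the vote strings, are hypothetical.

```python
from enum import Enum


class State(Enum):
    INIT = "INIT"
    READY = "READY"
    COMMIT = "COMMIT"
    ABORT = "ABORT"


def coordinator_decision(votes):
    """Steps (2)-(3): global commit only if every participant voted to commit."""
    if votes and all(v == "VOTE_COMMIT" for v in votes):
        return State.COMMIT
    return State.ABORT


def on_ready_timeout(peer_states):
    """What participant P, blocked in READY, may conclude from its peers' states."""
    if State.COMMIT in peer_states:
        return State.COMMIT   # the decision was already made: commit as well
    if State.ABORT in peer_states or State.INIT in peer_states:
        return State.ABORT    # a peer aborted, or never got as far as voting
    return State.READY        # all peers are READY: wait for the coordinator


if __name__ == "__main__":
    print(coordinator_decision(["VOTE_COMMIT", "VOTE_COMMIT"]))
    print(coordinator_decision(["VOTE_COMMIT", "VOTE_ABORT"]))
    print(on_ready_timeout([State.READY, State.INIT]))
    print(on_ready_timeout([State.READY, State.READY]))
```

The last case is the blocking situation described in the answer: when every reachable peer is also in READY, no participant can decide on its own.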