分布式系统练习题2009秋(1)

合集下载

分布式系统原理与范型考试-2009-12

Department Computer Science Distributed Systems VU University15.12.2009 MAKE SURE THAT YOUR HANDWRITING IS READABLE1a Explain what is meant by request-level and message-level interceptors in middleware.5pt Request-level interceptors are special local components to which an invocation request is passed before passing it to the underlying middleware.Such an interceptor is invocation aware in the sense that it knows with which invocation it is dealing,and for which server it is intended.Typically,such interceptors can be used to implement replicated calls.A message-level interceptor is a component that is logically placed between the middleware and the underlying operating system.It can thus handle only basic network messages,for example,by fragmenting them into smaller parts(and assembling these parts at the receiver side).1b Where does the need for adaptive middleware come from?5pt Middleware is intended to incorporate general-purpose,i.e.,application-independent,mechanisms for distributed computing.The problem is that for practical purposes,it is very difﬁcult to separate policy from mechanism,with the effect that many middleware solutions are not right for speciﬁc applications.The result is the need to be able to tweak the middleware for the speciﬁc needs of an application.1c In the underlying feedback control loop,give an example of the analysis component in combination with the reference input.5ptAn example that is also discussed in the book,is analyzing whether measured performance is as good as it could have been when another replication scenario would have been used.In this case, the reference input is a cost function that needs to be minimized.2a What is the difference between transport-layer switching and content-aware request distribution?5pt With transport-layer switching,a front end to a server cluster accepts incoming TCP connections and hands these off to one of the back-end servers using only information that is available at the TCP-level:client address and destination port.In the case of content-aware request distribution,the switch can also inspect the content of requests(such as an HTTP URL)and use that information to decide to which back-end server the request should be forwarded.2b Explain how TCP handoff works and why it is difﬁcult to apply to wide-area networks.5pt With TCP handoff,an incoming connection request is forwarded by a switch to a speciﬁc server, which then sends the response back directly to the client,using the network address of the switch.This last issue is problematic is a wide-area system,as it essentially involves spooﬁng the switch, which is often difﬁcult to do across administrative domains.2c Explain how the content-aware request distribution can be combined with TCP handoff.5pt Your answer should explain what is happening in Figure12-9.Essential is that you mention the initial handoff to a distributor or dispatcher to decide what the best server could be based on content, after which the TCP connection is handed off to that server.The switch is subsequently informed.3a Traditional RPC mechanisms cannot handle pointers.What is the problem and how can it be ad-dressed?5pt The problem is that pointers passed as parameters refer to a memory location that is local to the caller.That location is not only often meaningless to the receipient,but more important is that the recipient will most likely not have the data structure in its memory that the caller has.There are not many things you can do about this,except copying the entire(dynamic data structure)from the caller to the callee when doing the RPC.An alternative is to replace pointers by global systemwide references,as is done with Java object references.3b Where does the need for at-least-once and at-most-once semantics come from?Why can’t we have exactly-once semantics?5pt The problem originates from having a(suspected)server crash,detected by the lack of a response in the case of an RPC.What the client-side software can do is either resend the request until itﬁnally gets a response(at-least-once semantics)or immediately reports the failure to the client application, thus providing at-most-once semantics.Guaranteeing exactly-once semantics is,in principle,impos-sible,because you cannot know in general whether the server crashed before or after executing the requested operation.3c Consider a client/server system based on RPC,and assume the server is replicated for performance.Sketch an RPC-based solution for hiding replication of the server from the client.5pt Simply take a client-side stub that replicates the call to the respective servers.It is essential that you mention that these calls should be done in parallel and that(for example)theﬁrst response is immediately passed to the client.Serializing the RPCs or waiting for all responses is OK for fault tolerance,but certainly not for performance.4a Resolve the following key lookups for the shown Chord-based P2P system:5ptsource key41542221302127201815@4:14–18;22@4:20–21–28;30@21:28–1;27@21:28;18@20:4–14–184b Adjust theﬁnger tables of nodes18and14when a node with ID24enters the ring.Also give the ﬁnger table of node24.5pt Node18:[20,20,24,28,4];Node14:[18,18,18,24,1];Node24:[28,28,28,1,9].4c Chord allows keys to be looked up recursively or iteratively.Explain the differences,as well as the main advantage of iterative over recursive lookup.5pt With recursive lookups,a message is forwarded from peer to peer until it reaches its destination.In contrast,with an iterative lookup,the requester is returned the next peer it should ask for the key.One can argue that in the case of Chord,iterative lookups are much better:recursive lookups do not have the advantage of proximity-awareness.Also,note that iterative lookups have the advantage of letting the client handle failures more easily.5a Explain how two-phase commit works.5pt Make sure that you explain(1)coordinator sends vote-request;(2)participants respond;(3)coordi-nator sends decision;(4)participants ack.5b Explain what happens when a participant,who is in the READY state,times out because it hasn’t received a response from the coordinator yet.5pt In that case,P can check whether any of the other particpants has made a transition to either ABORT or INIT(in which case P can abort)or COMMIT(and commit as well).The difﬁculty is when all others are in READY:they all need to wait until the coordinator recovers.5c If we use two-phase commit for a distributed transaction,can we allow a coordinator to issue two distributed transactions(involving the same participants)at the same time?5pt Yes:the local transaction managers at the participants will handle any concurrency issues.What is seen here is that the use of2PC is completely independent of the semantics of speciﬁc transactions.6a How can a Web hosting service help in handlingﬂash crowds?5pt Crucial for a correct answer is that you not only state that content is replicated,but that the origin server is assumed to be capable of redirecting requests,but perhaps no longer in also returning content-rich responses.Note that distributed request distribution is really tricky business.6b Akamai uses DNS-based redirection.Explain how resolution of the name would work.5pt The trick is that the regular DNS will resolve the name ,pointing to a DNS server that is controlled by Akamai.If we use iterative DNS name resolution,that server will know the IP address ot the requesting client,and be able to decide to which server(with logical name )it can forward the request.6c Explain the difference between content-aware and content-blind caching for Web applications by means of an example.5pt With content-aware caching,the cache has knowledge on the data model that is used by the Web application,and with that,can conduct query-containment procedures to see whether a query could possibly be addressed by the data that is already cached.For example,if an edge server had once received the query“select ALL FROM books WITH author=Irving”,it can cache that ter, when receiving a query“select ALL FROM books WITH author=Irving AND date<2008”,the edge server should be able to recognize that this is a subquery,and that it can thus look into its local cache.With content-blind caching,the cache simply attaches a unique id to an entire,speciﬁc query in order to check whether that exact query had been issued before.If so,it can possibly return the previously stored response from its cache.In our example,the two queries would each get a unqiue ID,which is then used to do a cache lookup.Grading:Theﬁnal grade is calculated by accumulating the scores per question(maximum:90points),and adding10bonus points.The maximum total is therefore100points.。

分布式系统复习题与参考答案(答案完全版)

关于分布式系统复习题与参考答案一、填空题(每题n分,答错个扣分,全错全扣，共计m分)1．下面特征分别属于计算机网络和分布式计算机系统，请加以区别：分布式计算机是指系统内部对用户是完全透明的；系统中的计算机即合作又自治；系统可以利用多种物理和逻辑资源，可以动态地给它们分配任务。

计算机网络是指互连的计算机是分布在不同地理位置的多台独立的“自治计算机”。

2．点到点通信子网的拓扑结构主要有以下几种：星型、环型、树型、网状型，请根据其特征填写相应结构。

网状型：结点之间的连接是任意的，没有规律。

环型：节点通过点到点通信线路连接成闭合环路。

星型：节点通过点到点通信线路与中心结点相连；树型：结点按层次进行连接。

3．分布式计算系统可以分为两个子组，它们是集群计算系统和网格计算系统。

4．分布式事务处理具有4个特性，原子性：对外部来说，事务处理是不可见的；一致性：事务处理不会违反系统的不变性；独立性：并发的事务处理不会相互干扰；持久性：事务处理一旦提交，所发生的改变是永久性的。

5．网络协议有三要素组成，时序是对事件实现顺序的详细说明；语义是指需要发出何种控制信息，以及要完成的动作与作出的响应；语法是指用户数据与控制信息的结构与格式6．根据组件和连接器的不同，分布式系统体系结构最重要的有4种，它们是：分层体系结构、基于对象的体系结构、以数据为中心的体系结构、基于事件的体系结构7．在客户-服务器的体系结构中，应用分层通常分为3层，用户接口层、处理层和数据层。

8．有两种类型的分布式操作系统，多处理器操作系统和多计算机操作系统。

9．软件自适应的基本技术有3种，一是要点分离、二是计算映像、三是基于组件的设计。

10．DCE本身是由多个服务构成的，常用的有分布式文件系统、目录服务、安全服务以及分布式时间服务等。

11．TCP/IP体系结构的传输层上定义的两个传输协议为传输控制协议(TCP)和用户数据报协议(UDP)。

12．Windows NT的结构借用了层次模型和客户/服务器两种模型。

分布式系统第四章作业

分布式系统第四章作业2009计算机技术王万军1、在许多分层协议中，每一层都有自己的报头。

如果每个消息前部都只有单个报头，其中包含了所有控制信息，无疑会比使用单独的多个报头具有更高的效率。

为什么不这么做？答：协议的每一层都必须和其它层相独立。

从第k+1层传送至第k层的数据同时包含了报头和数据，但是第k层协议不能对它们进行辨别。

如果使用单个大的报头来包含所有信息的话将会破坏透明性，使得一个协议层的变动会影响到其它层，这显然不是我们所希望的。

2、为什么传输层通信服务常常不适于构建分布式应用程序？答：它们通常不提供分布透明性，这意味着应用程序开发人员需要注意通信的实现，从而导致解决方案的可扩展性很差。

分布式应用程序，例如基于套接字构建的分布式应用程序，将很难移植或者和其它应用程序交互。

3、一种可靠的多播服务允许发送者向一组接收者可靠地传递消息。

这种服务是属于中间件层还是更低层的一部分？答：从理论上来说，一种可靠的多播服务可以很容易的成为传输层，甚至是网络层的一部分。

例如，不可靠的IP多播服务是在网络层实现的。

但是，由于这些服务目前尚无法应用，它们通常使用传输层的服务来实现，传输层的服务将它们放在中间件中。

如果把可扩展性加以考虑的话，只有充分考虑应用程序的需求时可靠性才能得到保证。

用更高、更特殊的网络层来实现这些服务存在一个很大的争议。

6、处理RPC系统中参数转换的一种方法是，每台机器以自己系统使用的表示方式来发送参数，由另一方在必要的情况下进行转换。

可以通过首字节中的代码来表示发送消息机器所用的系统。

然而，由于要在首个字中找到开头的字节这本身也是一个问题，这种方法能行得通吗？答：首先，当一台机器发送字节0时，消息肯定已经送到。

因此目标机器可以访问字节0，而代码就在消息里面。

这种方法不考虑字节是高位优先还是低位优先的字节。

另一个方法是将代码放在第一个单词的所有字节中。

因此不管检查的是哪一个字节，代码都能被找到。

计算机程序设计员实操考核分布式系统题目

计算机程序设计员实操考核分布式系统题目一、题目描述设计一个分布式系统，实现一个简单的分布式文件系统。

该系统应包含以下功能：1.文件上传：用户可以上传文件至分布式文件系统。

2.文件下载：用户可以从分布式文件系统下载文件。

3.文件删除：用户可以在分布式文件系统中删除文件。

二、设计思路为了实现一个分布式文件系统，我们需要考虑以下几个关键问题：数据分布、数据复制和数据一致性。

2.1 数据分布将文件的数据分布在不同的节点上，可以提高系统的并发能力和数据读取速度。

可以使用一致性哈希算法来决定文件数据应存储在哪个节点上。

将文件分成多个块，并选择多个节点进行数据的复制，以提高系统的可用性和容错性。

2.2 数据复制为了提高系统的可用性和容错性，需要在多个节点上复制文件的数据。

可以使用主从复制的方式，其中一个节点作为主节点负责接受文件上传请求，其他节点作为从节点，负责数据的复制和文件的下载请求。

当主节点故障时，从节点可以接替成为新的主节点，保证系统的可用性。

2.3 数据一致性在分布式系统中，数据一致性是一个重要的问题。

当用户上传文件或删除文件时，需要保证系统中所有节点的数据一致。

可以使用分布式一致性协议来解决这个问题，比如使用Paxos协议或Raft协议。

三、系统架构3.1 节点类型在分布式文件系统中，可以定义以下几种节点类型：1.主节点（Master）：负责接受文件上传请求，并将文件数据分发到其他节点上。

2.从节点（Slave）：负责接收主节点发送的文件数据，并在本地进行存储。

3.客户端（Client）：用户使用的接口，可以通过客户端进行文件上传、下载和删除操作。

3.2 节点之间的通信节点之间的通信可以使用RPC框架（如gRPC）来实现。

主节点可以通过RPC调用从节点的接口，将文件数据发送给从节点。

客户端也可以通过RPC调用主节点的接口，实现文件的上传、下载和删除等操作。

3.3 文件分块和数据分布将文件分成多个块，并计算每个块的哈希值。

分布式数据库系统考试

分布式数据库系统考试（答案见尾页）一、选择题1. 分布式数据库系统的定义是什么？A. 一种将数据存储在多个地理位置的数据库系统中，通过分布式计算框架来管理和访问数据的一种技术。

B. 一种单一的集中式数据库系统，所有数据都存储在一个服务器上。

C. 一种将数据分割成多个部分，并分布存储在不同的服务器上的数据库系统。

D. 一种不依赖于单一服务器的数据库系统，数据可以跨多个服务器进行存储和访问。

2. 分布式数据库系统的优点包括哪些？A. 提高数据处理速度和效率。

B. 降低单点故障的风险。

C. 更好的数据冗余和容错能力。

D. 扩展性更强，可以更容易地添加新的数据和节点。

3. 以下哪个不是分布式数据库系统中的常见拓扑结构？A. 星形拓扑B. 环形拓扑C. 网状拓扑D. 树形拓扑4. 在分布式数据库系统中，什么是分片？A. 将整个数据库系统的数据分成多个部分，每个部分存放在一个单独的节点上。

B. 将数据库系统的一个或多个表按照某种规则分成多个部分。

C. 将数据库系统的数据按照某种规则分成多个部分，每个部分存放在一个单独的节点上。

D. 将数据库系统的一个或多个表按照某种规则分成多个部分，并存放在不同的节点上。

5. 在分布式数据库系统中，什么是复制？A. 将数据库系统的数据复制到多个节点上，以确保数据的可靠性和可用性。

B. 将数据库系统的数据存储在多个地理位置，以确保数据的可靠性和可用性。

C. 将数据库系统的数据按照某种规则分成多个部分，并存放在不同的节点上。

D. 将数据库系统的一个或多个表按照某种规则分成多个部分，并存储在不同的节点上。

6. 在分布式数据库系统中，什么是分布式事务？A. 一种需要在多个节点上同步更新数据的事务处理方式。

B. 一种可以在多个节点上并行处理的事务处理方式。

C. 一种需要确保数据的一致性和完整性的事务处理方式。

D. 一种可以在多个节点上同时执行的事务处理方式。

7. 分布式数据库系统中的数据一致性是指什么？A. 数据在多个节点上保持一致的状态。

分布式数据库系统知识点及习题

第9章分布式数据库系统9.1 基本内容分析9.1.1 本章重要概念（1）分布计算的三种形式：处理分布，数据分布，功能分布。

（2）C/S系统，工作模式，技术特征，体系结构，两层、三层、多层C/S结构。

（3）DDBS的定义、特点、优点、缺点和分类；分布式数据存储的两种形式（分片和分配）。

（4）DDB的体系结构：六层模式，分布透明性的三个层次，DDBS的组成，DDBMS的功能和组成。

（5）分布式查询处理的查询代价，基于半联接的优化策略，基于联接的优化策略。

（6）分布式数据库的并发控制和恢复中出现的问题，以及处理机制。

9.1.2 本章的重点篇幅（1）两层、三层、多层C/S结构。

（教材P365-367）（2）分布式数据存储：分片和分配。

（教材P375-377）（3）DDB的体系结构。

（教材P378的图9.10，P381的图9.12）（4）基于半联接的执行示意图。

（教材P389的图9.17）9.2 教材中习题9的解答9.1 名词解释·集中计算：单点数据和单点处理的方式称为集中计算。

·分布计算：随着计算机网络技术的发展，突破集中计算框架，DBMS的运行环境逐渐从单机扩展到网络，对数据的处理从集中式走向分布式、从封闭式走向开放式。

这种计算环境称为分布计算。

·处理分布：指系统中处理是分布的，数据是集中的这种情况。

·数据分布：指系统中数据是分布的，但逻辑上是一个整体这种情况。

·功能分布：将计算机功能分布在不同计算机上执行，譬如把DBMS功能放在服务器上执行，把应用处理功能放在客户机上执行。

·服务器位置透明性：指C/S系统向客户提供服务器位置透明性服务，用户不必知道服务器的位置，就可以请求服务器的服务。

·集中式DBS：所有工作都由一台计算机完成，这种DBS称为集中式DBS。

·DDBS：是物理上分散逻辑上集中的DBS，每一场地既能完成局部应用又能完成全局应用，这种系统称为DDBS。

分布式系统原理与范型考试 2009 答案

1
it is sent, or even at the same time it is sent, since it takes a finite, nonzero amount of time to arrive. 7. finger table (chap 5) Instead of linear approach toward key lookup, each Chord node maintains a finger table of at most m entries. If FTp denotes the finger table of node p, then FTp[i] = succ (p + 2i-1) Put in other words, the i-th entry points to the first node succeeding p by at least 2i-1. 8. out of band data (chap 3) Data is to be processed by the server before any other data from that client. 9. MapReduce (5pt)
2
二、简答题（共70分） 1. Q: What is the difference between a vertical distribution and a horizontal distribution? (chap 2, 5pt) A: Vertical distribution refers to the distribution of the different layers in a multitiered architectures across multiple machines. In principle, each layer is implemented on a different machine. Horizontal distribution deals with the distribution of a single layer across multiple machines, such as distributing a single database. 2. Q: Is a server that maintains a TCP/IP connection to a client stateful or stateless? (chap 3) A: Assuming the server maintains no other information on that client, one could justifiably argue that the server is stateless. The issue is that not the server, but the transport layer at the server maintains state on the client. What the local operating systems keep track of is, in principle, of no concern to the server. 3. Q: One way to handle parameter conversion in RPC systems is to have each machine send parameters in its native representation, with the other one doing the translation, if need be. The native system could be indicated by a code in the first byte. However, since locating the first byte in the first word is precisely the problem, can this actually work? (chap 4) A: First of all, when one computer sends byte 0, it always arrives in byte 0. Thus the destination computer can simply access byte 0 (using a byte instruction) and the code will be in it. It does not matter whether this is the low-order byte or the high-order byte. An alternative scheme is to put the code in all the bytes of the first word. Then no matter which byte is examined, the code will be there. 4. Q: Routing tables in IBM WebSphere, and in many other message-queuing systems, are configured manually. Describe a simple way to do this automatically. (chap 4) A: The simplest implementation is to have a centralized component in which the topology of the queuing network is maintained. That component simply calculates all best routes between pairs of queue managers using a known routing algorithm, and subsequently generates routing tables for each queue manager. These tables can be downloaded by each manager separately. This approach works in queuing networks where there are only relatively few, but possibly widely dispersed, queue managers. 5. Q: Is an identifier allowed to contain information on the entity it refers to? (chap 5) A: Yes, but that information is not allowed to change, because that would imply changing the identifier. The old identifier should remain valid, so that changing it would imply that an entity has two identifiers, violating the second property of identifiers. 6. Q: When a node synchronizes its clock to that of another node, it is generally a good idea to take previous measurements into account as well. Why? Also, give an example of how such past readings could be taken into account. (chap 6) A: The obvious reason is that there may be an error in the current reading. Assuming that clocks need only be gradually adjusted, one possibility is to consider the last N values and compute a median or average. If the measured value falls outside a current interval, it is not taken into account (but is added to the list). Likewise, a new value can be computed by taking a weighted average, or an aging algorithm.

分布式系统练习题

分布式系统练习题
一、选择题
概述
1、下列哪项描述不是分布式系统的特性( C )
A、透明性
B、开放性
C、易用性
D、可扩展性
3、下列描述正确的是( A )
A、基于中间件的系统要比网络操作系统的透明性高√
B、网络操作系统要比分布式操作系统的透明性高×
C、基于中间件的系统要比分布式操作系统的透明性高×
D、分布式操作系统可以运行在异构多计算机系统中 4、从下面关于网络操作系统的原理图中可以看出( B )
A、网络操作系统是紧耦合系统，因而只能运行在同构多计算机系统中×
B、网络操作系统不要求各计算机上的操作系统同构√
C、运行于网络操作系统之上的分布式应用程序可以取得很高的透明性×
D、网络操作系统可以作为一个全局的单一的系统进行方便的管理×
5、在网络操作系统之上采用中间件技术加入中间件层，主要可以( D )
A、弥补网络操作系统在可扩展性方面的缺陷
B、弥补网络操作系统在可开放性方面的缺陷
C、提高网络操作系统的稳定性
D、提高网络操作系统的透明性
1、下列描述不是分布式系统目标的是( C )
A、连接用户和资源
B、透明性
C、异构性
4、多计算机系统的主要通信方式是( B )
D、开放性
以及可扩展性。

2、下列系统中有共享内存的系统是( B )
A、同构多计算机系统
B、多处理器系统
C、异构多计算机系统
D、局域网系统
3、下述系统中，能运行于同构多计算机系统的操作系统是( A )
A、分布式操作系统
B、网络操作系统
C、中间件系统
D、嵌入式操作系统。

分布式系统题库1-0-4

分布式系统题库1-0-4问题：[单选]在分布式数据库中，（）是指各场地数据的逻辑结构对用户不可见。

A.分片透明性B.场地透明性C.场地自治D.局部数据模型透明性在分布式数据库中，分布透明性指用户不必关心数据的逻辑分片，不必关心数据物理位置分配的细节，也不必关系各个场地上数据库数据模型。

分布透明性可归入物理独立性的范围，包括3个层次：分片透明性、位置透明性和局部数据模型透明性。

分片透明性是最高层次的分布透明性，即用户或应用程序只对全局关系进行操作而不必考虑数据的分片。

位置透明性是指用户或应用程序应当了解分片情况，但不必了解片段的存储场地。

位置透明性位于分片视图与分配视图之间。

局部数据模型透明性位于分配视图与局部概念视图之间，指用户或应用程序要了解分片及各片段存储的场地，但不必了解局部场地上使用的是哪种数据模型。

问题：[单选]对与在船上工作的人员而言：（）A.有压力是好事，没有压力人就没有上进的动力；B.长期生活在压力中对人的精神面貌是有积极的作用的；C.人对压力的反应是不同的，如不能适应会导致身体损耗和疾病；D.压力会使人成熟起来，使人能够冷静的对待自己周围所发生的事情。

问题：[单选]与集中式系统相比，分布式系统具有很多优点，其中（）不是分布式系统的优点。

A.提高了系统对用户需求变更的适应性和对环境的应变能力B.系统扩展方便C.可以根据应用需要和存取方式来配置信息资源D.不利于发挥用户在系统开发、维护、管理方面的积极性与主动精神根据硬件、软件、数据等资源在空间的分布情况，信息系统的结构可分为集中式和分布式两大类。

集中式系统的主要优点是：（1）信息资源集中，管理方便，规范统一。

（2）专业人员集中使用，有利于发挥他们的作用，便于组织人员培训和提高工作。

（3）信息资源利用率高。

（4）系统安全措施实施方便。

集中式系统的不足之处是：（1）随着系统规模的扩大和功能的提高，集中式系统的复杂性迅速增长，给管理、维护带来困难。

分布式数据库试题及答案

数据库试题目录1. 九八年秋季试题 (5)1.1. 概念题 (5)1.1.1. 比较半连接方法和枚举法的优缺点。

(5)1.1.2. 2PL协议的基本思想。

(5)1.1.3. WAL协议的主要思想。

(5)1.1.4. SSPARC三级模式体系结构。

(5)1.1.5. 设计OID的数据结构时应考虑哪些问题。

(6)1.2. 某个大学中有若干系，且每个系有若干个班级和教研室，每个教研室有若干个教员，其中教授、副教授每个人带若干名研究生。

每个班有若干名学生，每个学生可选修若干门课程，每门课程可由若干学生选修。

完成下列各种要求： (7)1.3. 下面是某学院的一个学生档案数据库的全局模式： (9)1.3.1. 将全局模式进行分片，写出分片定义和分片条件。

(9)1.3.2. 指出各分片的类型，并画出分片树。

(9)1.3.3. 假设要求查询系号为1的所有学生的姓名和成绩，写出在全局模式上的SQL查询语句，并要求转换成相应的关系代数表示，画出全局查询树，请依次进行全局优化和分片优化，画出优化后的查询树。

要求给出优化变换过程。

(10)1.4. 设数据项x,y存放在S1场地，u,v存放在S2场地，有分布式事务T1和T2,T1在S1场地的操作为R1(x)W1(x)R1(y)W1(y),T2在S1场地的操作为R2(x)R2(y)W2(y);T1在S2场地上的操作作为R1(u)R1(v)W1(u),T2在S2场地上的操作作为W2(u)R2(v)W2(v)。

对下述2种情况，各举一种可能的局部历程（H1和H2），并说明理由。

(11)1.4.1. 局部分别是可串行化，而全局是不可串行化的 (11)1.4.2. 局部和全局都是可串行化的。

要求按照严格的2PL协议，加上适当的加锁和解锁命令，（注意，用rl(x)表示加读锁，wl(x)表示加对x加写锁，ul(x)表示解锁）121.5. 试述面向对象的数据库系统中页面服务器和对象服务器两种Client/Server体系结构的主要特点, (12)2. 九九年春季试题 (13)2.1. DBMS解决了信息处理技术中的哪些挑战？ (13)2.2. 在关系数据库应用设计中，为什么要对数据库模式进行规范化？ (13)2.3. 简述ACID特性。

1、下载文档前请自行甄别文档内容的完整性，平台不提供额外的编辑、内容补充、找答案等附加服务。
2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
3、如文档侵犯您的权益，请联系客服反馈,我们会尽快为您处理(人工客服工作时间：9:00-18:30)。

1、分布式系统中的体系结构样式有那几种？并简述之。

1)分层体系结构：组件组成了不同的层，其中L i层的组件可以调用下面的层L i-1。

其中的关键因素是，其控制系统是从一层到别一层的：请求是从上往下，而请求结果是从下向上。

2）基于对象的体系结构：是一种很松散的组织结构。

基本上每一个对象都对应一个组件，这些组件是通过过程调用机制来连接的。

分层和面向对象的体系结构仍然是大型软件系统最重要的样式。

3)以数据为中心的体系结构，其发展思想：进程通信需要通过一个公用（被动或主动的）仓库。

可以分成两个关键的部分:展现当前状态的中央数据结构和一个对数据进行操作的独立组件的集合4）基于事件的体系结构，进程基本上是通过事件的抟播来通信的，事件传播还可以有选择地携带数据。

基本思想是：进程发布事件，然后，中间件将确保那些订阅了这些事件的进程才接收它们，优点是进程是松散耦合的。

2、简述分布式系统设计所面临的问题及遇到的挑战。

1）：难以合理设计分配策略,在集中式系统中，所有的资源都由系统系统管理和分配，但在分布式系统中，资源属于局部工作站或个人计算机，所以调度的灵活性不如集中式系统，资源的物理分布可能与服务器的分布不匹配，某些资源可能空闲，而另外一些资源可能超载。

2）：部分失效问题：由于分布式系统通常是由若干部分组成的，各个部分由于各种各样的原因可能发生故障，如硬件故障。

如果一个分布式系统不对这些故障对这些问题进行有效的处理，系统某个组成部分的故障可能导致整个系统的瘫痪。

3）性能和可靠性过分依赖于网络：由于分布式系统是建立在网络之上的，而网络本身是不可靠的，可能经常发生故障，网络故障可能导致整个系统的终止；另外，网络超负荷会导致性能下降，增加系统的响应时间。

4）缺乏统一控制：一个分布式系统的控制通常是一个典型的分散式控制，没有统一的中心控制。

因此，分布式系统通常需要相应的同步机制来协调系统中各个部分的工作；5）安全保密性问题：为了获得可扩展性，分布式系统中的许多软件接口都提供给用户，这样的开放结构对于开发人员非常有价值，担同时也为破坏者打开了方便之门挑战：设计与实现一个对用户来说是透明的且具有容错能力的分布式系统。

挑战：1. Heterogeneity异构性，包括网络、硬件、操作系统、编程语言、Implementation 组件，<中间件的作用就是隐藏这些异构，并提供一致的计算模式（模块）>，<虚拟机，编译器成虚拟机使用的code>；2. Openness 开放性：提供Services, Syntax, Semantics，合适的接口定义需要兼顾完整性和中立性，同时互操作性和便携性也是重要的，分布式系统需要flexible，即易于配置且把策略和结构分开来获得flexible；3. Security 安全性：安全性包括有效性，机密性，完整性三个要素，拒绝服务攻击和移动代码安全是两个安全方面的挑战。

4. Scalability 可扩展性（可测性），即在资源和用户增加的时候保持效率，size、geographically、administratively的scaleble。

要求分散（分布式）算法，没有机器有系统状态的完整信息，机器做决定仅仅是取决于本地信息，一个机器的错误不会毁掉整个算法，没有统一时序。

5. Failure Handling 错误处理。

6. Concurrency 协力，一致（并发？）7. Transparency 透明度3、简述远程过程调用的步骤。

Client端:1)发送远程过程调用的消息(以消息包形式)给远程的server端;2) 等待, 直到收到server端对该请求的回复;3) 一旦接收到来自server端的返回执行结果, 就继续执行后面的程序.Server端:1) 倾听状态, 等待client端发送过程调用消息;2)一旦接收到过程调用消息, server就抽取参数并分析它, 然后执行所请求的过程;3)将执行结果以消息包形式回送给client.1、客户进程以正常的方式调用客户存根；2、客户存根生成一个消息，然后调用本地操作系统；3、客户端操作系统将消息发送给远程操作系统；4、远程操作系统将消息交给服务器存根；5、服务器存根将参数提取出来，然后调用服务器；6、服务器执行要求的操作，操作完成后将结果返回给服务器存根；7、服务器存根将结果打包成一个消息，然后调用本地操作系统；8、服务器操作系统将含有结果的消息发送回客户端操作系统；9、客户端操作系统将消息交给客户存根；10、客户存根将结果从消息中提取出来，返回给调用它的客户过程。

4、Consider Figure 1 that shows four processes (P1, P2, P3, P4) with events a, b,c,… and messages communicating between them. Assume that initial logical clock values are all initialized to 0.a)List the Lamport timestamps for each event shown in Figure 1. Assumethat each process maintains a logical clock as a single integer value as aLamport clock. Provide Lamport clock for all labeled events (but consider alllabeled and unlabeled events).b)List the Total Ordering Logical Clock timestamps for each event shown inFigure 1. . Provide Total Ordering Logical Clock for all labeled events (butconsider all labeled and unlabeled events).c)Is there the potential for a causal violation? Explain why.Figure 1: Four Processes P1, P2, P3, P4 run events a,b,c,d,…. to send and receive messages5、 a, b, and c are events and no two events belong to the same process. Prove ordisprove (give counter-example) the following:a. a is concurrent with b and b is before c implies that a is before c.b. a is concurrent with b and b is concurrent with c implies that a is concurrent withc.6、 Consider a group of distributed systems, P1, P2, P3, P4 and P5 (P5 is currentlythe coordinator). P5 fails and P2 notices the failure. If the Bully algorithm is used for election of a new coordinator and the election attribute is the (Max of) processor numbers, show the set of all messages communicated through each communication channel Pij i,j = 1..5 for this election. Show the type of each message as “election”, “response”, “coordinator”.(CH10)7、 How Web Services work ？1客户发现服务器2客户通过建立TCP 连接来绑定服务器3客户建立SOAP 请求4 SOAP 路由器路由请求给合适的服务器5 服务器把请求解包,然后处理,并返回结果6结果通过相反的路径返回到客户端8、简述CDN ？CDN 的基本类型有哪几种，并简述之？9、 The Byzantine generals problem for 3 loyal generals and 1 traitor.a) what they send to each other ？b) what each one got from the other ？c) what each one got in second step ？10、 Consider Figure 2.(C9)a) Where does the waiting/blocking occur?b) What happens in case of a crash? How do we detect a crash?Figure 2: Coordinator and Participant11、以google 文件系统为例，说明分布式文件系统的设计与应用场景密切相关。

CoordinatorParticipant12、At 10:27:540 (hr, min, sec.), Processor B requests time from a time server.At 10:27:610 it receives a reply with time stamp of 10:27:375.a)Find out the dirft of B‟s clock with respect to the server …s clock (assumethere is no processing time at the server for time service).b)Is B‟s clock going too fast or too slow?c)How should B adjust its clock?1.B的发送时间应该是10：27：140 所以该时钟早了：400SEC2.TOO FAST3.将时钟的的SEC减去200 就可以了13、Compile a summary for the coursea)It should be a summary summarizing what we have gone through for thesemester.b)This summary will serve as a quick review on distributed systems and aprefect cue card/note before a job interview that is related to distributedcomputing.c)I will look at the organization of the summary as well as some intelligentways for reminding what you have learnt in the course.14、“End-To-End Arguments”主要观点是什么？主要是关于可靠性的讨论。