Comparison and Analysis of Open Source Cloud Computing Software
A Study of Open Source Cloud Computing Platforms

[...] infrastructure-as-a-service. These various open-source projects provide an important alternative for those who do not wish to use a [...] software platforms. The overall function of these systems is to manage configuration of virtual machines for a cloud providing [...]
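[...] choice. This paper compares and analyzes these systems in some detail, focusing on the distinct characteristics that reflect each project's different goals, to help users choose among them.
Keywords: cloud computing; open source; management platform
CLC number: TP311
Document code: A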
d i 1.9 9 . s.0 627 . 1 . . 6 o: 03 6  ̄i n10 —4 5 0 20 0 s 2 4 0
Key words: cloud computing; open source; management platform
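0 Introduction
Cloud computing is no longer a new, cutting-edge technology; it has become a valuable and important technology that is fundamentally changing how people use and develop applications. Linux and open source technologies provide the foundation for the cloud (both public and private infrastructure). This paper dissects the cloud and explores virtual-machine-based open source cloud computing platforms, along with some typical examples of them, so that users can build their own cloud computing systems on these platforms.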
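Server Virtualization Technology: OpenStack vs. Proxmox VE

With the rapid development of cloud computing, server virtualization has become a part of enterprise management and operations that cannot be ignored.
Among the available platforms, OpenStack and Proxmox VE are two that attract particular attention.
This article compares and analyzes them to help readers better understand their strengths and the scenarios each one suits.
I. Introduction to OpenStack
OpenStack is open source software for building and managing cloud computing platforms. It provides a set of components that implement the virtualization, networking, storage, and other functions a cloud needs.
OpenStack aims to provide an elastic, scalable, and secure cloud computing solution, and it is widely used in public, private, and hybrid cloud environments.
1.1 Features of OpenStack
OpenStack has the following features:
1. Open source: OpenStack is open source software, so users are free to access and modify the code to meet their own needs.
2. Elastic scaling: OpenStack uses a distributed architecture, so compute, storage, and network resources can be scaled as demand requires.
3. Multi-tenancy: OpenStack lets multiple tenants share one set of infrastructure, which improves resource utilization.
4. Robust and reliable: OpenStack offers high availability and automated management features that help keep the cloud platform running stably.
5. A variety of components: OpenStack provides a rich set of components, such as Nova, Neutron, and Cinder, which can be selected and customized as needed.
1.2 OpenStack components
OpenStack includes several important components:
1. Nova: manages and schedules compute resources, providing creation, resizing, and deletion of virtual machine instances.
2. Neutron: manages network resources, providing creation, isolation, and connection of virtual networks.
3. Cinder: manages storage resources, providing a block storage service that gives virtual machine instances persistent storage.
4. Glance: manages image resources, providing image upload, download, and sharing.
5. Keystone: handles identity authentication and access control, providing user and role management and authentication services.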
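The division of labor among the components listed above can be summarized as a small lookup table. This is only an illustrative sketch in Python: the dictionary and the helper `component_for` are hypothetical names made up for this example and are not part of any OpenStack API.

```python
# Illustrative lookup table: which OpenStack component handles which resource.
# Component names come from the list above; the helper name is hypothetical.
OPENSTACK_COMPONENTS = {
    "Nova": "compute",
    "Neutron": "network",
    "Cinder": "block storage",
    "Glance": "image",
    "Keystone": "identity",
}


def component_for(resource: str) -> str:
    """Return the component responsible for a resource type, or raise KeyError."""
    for name, handled in OPENSTACK_COMPONENTS.items():
        if handled == resource:
            return name
    raise KeyError(f"no component listed for resource type: {resource}")
```

For instance, `component_for("network")` returns `"Neutron"`, while an unlisted resource type raises `KeyError`.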
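II. Introduction to Proxmox VE
Proxmox VE (Virtual Environment) is an open source server virtualization platform that offers two virtualization technologies: virtual machines (KVM) and containers (LXC).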
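Big Data Analysis: How to Choose the Right Data Analysis Tool

When carrying out big data analysis, choosing the right data analysis tool is very important.
Different tools have different functions and characteristics, and choosing a suitable one can improve the efficiency and accuracy of the analysis.
This article introduces several commonly used big data analysis tools and presents a few key factors for choosing among them.
I. Commonly used big data analysis tools
1. Hadoop: Hadoop is an open source distributed computing framework suited to processing large data sets.
It is highly reliable, highly scalable, and efficient, and it can process both structured and unstructured data.
Components of the Hadoop ecosystem include HDFS (the Hadoop Distributed File System), MapReduce, Hive, and Pig.
2. Spark: Spark is a fast, general-purpose big data processing engine.
It supports in-memory data processing, which makes it faster than traditional MapReduce for many workloads.
Spark provides a rich API that can be used for data processing, machine learning, graph computation, and other tasks.
3. Python: Python is an easy-to-learn programming language with a rich set of data analysis libraries such as NumPy, Pandas, and Matplotlib.
Python can be used for data cleaning, data visualization, statistical analysis, and similar tasks, and it suits small and medium-sized data analysis.
4. R: R is a programming language designed specifically for statistical analysis and data visualization.
R has a rich set of statistical libraries and visualization tools, and it suits advanced statistical analysis and modeling.
5. Tableau: Tableau is a powerful visualization tool that can connect to a variety of data sources and generate interactive visual reports.
Tableau provides an intuitive interface and a wide range of visualization options, and it suits presenting and sharing analysis results.
II. Key factors in choosing a data analysis tool
1. Data scale: choose a tool according to the size of the data.
If the data volume is large, consider Hadoop or Spark for distributed processing; if it is small, tools such as Python or R can also meet the need.
2. Data type: choose a tool according to the type of data.
If the data is structured, it can be analyzed with SQL; if it is unstructured, it can be processed with Hadoop or Spark.
3. Analysis requirements: choose a tool according to the specific analysis requirements.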
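The first two factors (data scale and data type) can be read as a simple decision rule. The sketch below is one possible encoding in Python. The function name `suggest_tools` and the 1000 GB cut-off for "large" data are assumptions made for illustration; the article gives no numeric threshold.

```python
def suggest_tools(data_size_gb: float, structured: bool,
                  large_threshold_gb: float = 1000.0) -> list:
    """Suggest tools from data scale and data type.

    The default threshold is a placeholder assumption, not a figure
    taken from the article.
    """
    suggestions = []
    # Factor 1: data scale.
    if data_size_gb >= large_threshold_gb:
        suggestions += ["Hadoop", "Spark"]   # distributed processing
    else:
        suggestions += ["Python", "R"]       # single-machine tools suffice
    # Factor 2: data type.
    if structured:
        suggestions.append("SQL")
    else:
        for tool in ("Hadoop", "Spark"):
            if tool not in suggestions:      # avoid listing a tool twice
                suggestions.append(tool)
    return suggestions
```

The third factor, analysis requirements, is left out on purpose because the text does not spell out concrete rules for it.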
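Summary of Data Analysis Frameworks

Introduction
In today's big data era, data analysis is becoming ever more important.
As data volumes grow rapidly, traditional data processing methods can no longer meet the needs of analysts and data scientists.
Data analysis frameworks emerged in response.
This article summarizes and analyzes several common data analysis frameworks and compares their strengths and weaknesses.
1. Apache Hadoop
Apache Hadoop is one of the most popular open source data analysis frameworks.
It is developed under the Apache Software Foundation and is designed to process large data sets.
Hadoop's core components include the Hadoop Distributed File System (HDFS) and the MapReduce computing model.
HDFS is a distributed file system designed for large-scale data storage.
It distributes and replicates data across multiple nodes, which improves reliability and fault tolerance.
MapReduce is a programming model for processing large data sets in parallel.
It splits a computation into many small tasks and runs them in parallel on the nodes.
The MapReduce model processes data in a simple and effective way, but it is not suited to real-time data analysis.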
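The map, group, and reduce steps described above can be illustrated with a single-process word count. This is a minimal sketch of the programming model in plain Python, not Hadoop's actual API; all function names here are chosen for illustration, and on a real cluster each chunk would be mapped on a different node.

```python
from collections import defaultdict


def map_phase(chunk: str) -> list:
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs: list) -> dict:
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: dict) -> dict:
    """Reduce: combine the values for each key into a single result."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks: list) -> dict:
    """Run map over every chunk, then shuffle and reduce the combined output."""
    pairs = []
    for chunk in chunks:  # each iteration stands in for one parallel map task
        pairs.extend(map_phase(chunk))
    return reduce_phase(shuffle(pairs))
```

Because each call to `map_phase` depends only on its own chunk, the map tasks can run independently, which is what makes the model easy to parallelize.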
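Advantages:
- Can process large data sets
- High reliability and fault tolerance
- A mature ecosystem with rich tooling and support
Disadvantages:
- Not suited to real-time data analysis
- Inefficient for small data sets
2. Apache Spark
Apache Spark is a fast, general-purpose data processing engine for large-scale data processing and analysis.
Where Hadoop uses the MapReduce model, Spark uses a higher-level abstraction called the Resilient Distributed Dataset (RDD).
The RDD is one of Spark's core concepts: a data set that can be processed in parallel.
By keeping data sets in memory while operating on them, Spark greatly improves computation speed and efficiency.
Besides supporting programming languages such as Python and Java, Spark also provides SQL and stream processing.
Advantages:
- A fast, general-purpose data processing engine
- Supports multiple programming languages and features
- Efficient in-memory computation, suitable for near-real-time data analysis
Disadvantages:
- High memory requirements for large data sets
- Needs substantial resources
3. Apache Flink
Apache Flink is a scalable stream and batch processing framework.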
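Big Data Analysis Basics: Open Source Big Data Tools (Spark, Hadoop, and Storm)

In recent years, as digital technology and the Internet have continued to develop, people have been generating large amounts of data every day.
This data includes numbers, images, text, and many other types.
How to query and analyze it efficiently has become a pressing problem.
In response, the open source community has produced a number of big data analysis tools, of which the most common and popular are Spark, Hadoop, and Storm.
These tools keep developing and growing, and they are widely used for big data processing in many settings.
I. Spark
Apache Spark is a general-purpose engine that supports distributed computing.
It was originally developed at the AMPLab at UC Berkeley and is an in-memory computing engine.
Compared with Hadoop it is faster, and it can process data at the scale of several petabytes.
Spark can be used with Java, Scala, Python, and other languages, and it provides powerful development tools and a rich API that support many kinds of data analysis.
Spark provides an interactive shell that makes it easy to read data from a variety of sources, process and analyze it, and save the results to a variety of outputs.
It also provides a powerful distributed computing model that lets users work more efficiently when analyzing big data.
II. Hadoop
Apache Hadoop is an open source software framework that supports applications for distributed storage and processing of large data sets.
Hadoop provides a distributed file system (HDFS) and the MapReduce programming model.
In Hadoop, data can be spread across many different servers for storage and processing.
MapReduce lets users run computing tasks on these distributed nodes and finally merges the results into a single result.
Hadoop can run on a group of inexpensive servers rather than on one expensive server, which lowers cost and improves reliability.
Hadoop's main characteristics include high scalability, high reliability, high stability, and strong data consistency.
Hadoop applications can be developed in Java, Python, and other programming languages, but Java is the most common.
Hadoop is also operated through the command-line interfaces common on Linux and similar operating systems, which makes it convenient to use.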
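Analysis of the Advantages and Disadvantages of Open Source Software

Open source is a very popular software development model that is widely used around the world.
As a way of developing software, it has clear advantages and disadvantages.
In this article I discuss those advantages and disadvantages and analyze their impact on, and significance for, software development and developers.
I. Advantages
1. Openness
The greatest advantage of open source software is its openness.
The source code is public, and anyone can use, modify, and distribute it.
This lets community members contribute to the software and improve it.
Openness also lowers the barrier to using the software.
2. Flexibility
Because the source code is public, anyone is free to modify it.
Open source software is flexible and extensible, and it can be customized and optimized to user needs.
It can also be integrated easily with other open source software and in-house modules.
3. Reliability
Because the source code is public, developers and users can review and inspect it, which can lead to higher reliability.
This also helps find and fix potential vulnerabilities and bugs, strengthening the software's security.
4. Quality
Because the source code is public, anyone can contribute to the software, which helps improve its quality.
Open source software that has been reviewed, tested, maintained, and improved many times by many people often has better stability and performance.
5. Cost
Unlike commercial software, open source software requires no purchased license and can be used free of charge, and users need not worry that service will be cut off if a vendor goes out of business.
In range and functionality, open source software is also comparable to commercial software.
For businesses and individuals, open source software provides the needed functions and services at lower cost.
II. Disadvantages
1. Support
Open source software is usually maintained by volunteers and a community, rather than by a dedicated maintenance and technical support team as commercial software is.
This means that if you run into a problem, you have to find a solution yourself or rely on help from other community members, so the barrier to use is relatively high.
2. Documentation
For beginners, open source software often lacks easy-to-understand documentation and instructions.
Users then need to spend more time and effort reading and understanding the source code.
Even developers need to spend more time learning the code if a project lacks good documentation.
3. Compatibility
Because of its openness and flexibility, open source software is prone to compatibility problems, especially when interacting with other software and hardware components.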
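Pros and Cons of Open Source Software Development

Open source software is software that anyone can view, copy, modify, and publish.
Compared with closed source software, it has distinctive strengths and weaknesses in terms of source code, intellectual property, cost, and customizability.
This article discusses the benefits and drawbacks of open source software development from both sides.
I. Advantages of open source software development
1. More transparent source code
The greatest advantage of open source software is that the source code is available for inspection, so developers can understand how the software is implemented and customize it to their own needs.
In addition, the source code is open to public code review, which helps guard against malicious code or backdoors and protects users' privacy and security.
2. Shared intellectual property
Open source licenses place few restrictions on using and copying the software, which is a great convenience for businesses and organizations of all kinds.
For example, through open collaboration a company can obtain the code of a multi-purpose application and use it to build a different software product.
This kind of collaboration saves a great deal of time and development cost and, more importantly, helps give different teams a fair chance to compete.
3. Lower cost
Under the open source model, developers can make full use of existing resources, tools, and what other developers share.
This lowers development cost and lets the team focus on innovation and incremental development, so it can better achieve its development goals and meet business needs.
4. Customizability
Open source software adapts flexibly to different requirements.
For example, with an open source ERP system, developers can adapt it to a company's specific needs, add product features, and configure it, which speeds up application development.
In addition, open source software can be adjusted and debugged more quickly, making it easier to respond to crises and to emerging market demands.
II. Disadvantages of open source software development
1. Greater development difficulty
Compared with closed source development, open source development requires more experience in team development and project management.
Open source software also faces challenges in quality control and reliability: developers may have different goals, which can make a project hard to control and raise the cost of modification and maintenance.
2. Possible lack of standardization
Open source development can suffer from a lack of standardization.
Because developers each have their own motivations, they may create their own forks, and speculative work, experiments, and extensions across projects can look attractive.
This means many open source projects may exist in numerous variants, some of which are built on outdated or insecure code.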
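Open Source Software: Strengths, Weaknesses, and Suitable Scenarios

Open source software is software whose source code is public; its core ideas are sharing and freedom.
Closed source software is software whose source code is not public.
Open source software has its own strengths and weaknesses and suits different scenarios.
This article discusses those strengths and weaknesses and the scenarios where open source software fits.
I. Advantages of open source software
1. High degree of freedom
The source code is public, so users can modify and customize it as needed, which is flexible and convenient.
Open source software can also usually be downloaded and used free of charge, with no license to purchase.
2. Better security
Because the source code is open to public review, security tends to be better.
When a security problem is found, the open source community can often release a fix more quickly, reducing the risk to users.
3. Community support
Open source software usually has a large community that provides technical support and software updates.
The community can also provide plugins and components that extend the software's functionality and performance.
4. Strong customizability
The source code is public, so users can modify and customize it to their own needs, and the software can better fit user and business requirements.
This also gives users better flexibility and extensibility.
II. Disadvantages of open source software
1. High barrier to use
Open source software usually requires some technical knowledge and skill to install and configure correctly.
Non-specialist users may need to spend more time learning to use it.
2. Difficult collaboration
Open source software is usually developed jointly by developers from around the world, who may use different development platforms and tools.
Organizing and managing the distributed development process in a sensible, balanced way is therefore a challenge.
3. Exposed code is hard to protect
Because the source code is public, malicious actors can obtain it and tamper with or misuse it.
Some commercial companies may also use open source software in their own commercial products, which can lead to intellectual property problems.
III. Scenarios where open source software fits
1. Databases
Building on traditional relational databases, open source database software can provide more flexible and efficient data storage and query functions.
The best-known examples are MySQL and PostgreSQL.
2. Front-end development
The most commonly used open source software in front-end development includes jQuery, React, Vue, and Angular.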
SURFnet cloud computing solutions
Universiteit van Amsterdam
Master of Science in System and Network Engineering
Marvin Rambhadjan (marvin.rambhadjan@os3.nl)
Arthur Schutijser (arthur.schutijser@os3.nl)
March 12, 2010
Cloud computing comparison

1 Abstract
SURFnet is the primary supplier of advanced networking to colleges, universities and research institutions. SURFnet (and partners) wishes to optimize its computing capacity, and hopes to realize this with the use of cloud computing. With the rising interest in cloud computing, a lot of new techniques are being developed and SURFnet wishes advice about which technique best fits their needs. In this document we research the five most promising cloud platforms. They are compared on their offload ability, live migration options, high availability options and others. A conclusion is drawn from this comparison about which cloud technique best fits the profile needed for SURFnet and its institutes/partners. From this comparison we describe several use cases in which this cloud platform would be useful.

Contents
1 Abstract
2 Introduction
2.1 General description of the project
2.2 Goal and research questions
2.3 Outline of this report
3 Cloud computing
3.1 Service models
3.2 Deployment Models
3.3 Open standards
4 SURFnet
4.1 Combining resources
4.2 Requirements
5 Techniques/Platforms
5.1 Available cloud techniques
5.2 Researched cloud techniques
6 Hypervisors
7 Test environment
8 Eucalyptus
9 OpenNebula
10 AbiCloud
11 OpenQRM
12 VMware vSphere
13 Comparison
13.1 Compare matrix
13.2 Overall outcome
13.3 Scenarios
14 Deployment
14.1 Networking
14.2 User Interface
14.3 Scheduling
14.4 Multi EC2 Site Support
15 Use Cases
15.1 Green IT
15.2 Private Cloud
15.3 Hybrid Cloud
15.4 Community Cloud
16 Future Research
17 Conclusion
17.1 Advice

2 Introduction
Cloud
computing is becoming an important concept in the IT world. The idea has been around for quite a while, but lately more and more people are getting interested in it. This is because it is a concept which could improve the computing efficiency in datacenters and thus save money. A lot of similar solutions have been used to make large (collaborative) computation power possible. These are clusters, grids and distributed systems. There are different ways to use cloud computing. It is possible to use this technique to outsource a critical business application so it will run on a “public cloud”. In this set-up you only pay for what you use. There are many cloud providers which implement this kind of service. It is called Software as a Service (SaaS). Besides software it is also possible to create a network infrastructure based on a cloud (IaaS). This way computing capacity is used more efficiently than in the traditional set-up of a datacenter. Furthermore, cloud computing could be used to make your datacenter greener and to create redundant servers more easily.

2.1 General description of the project
SURFnet would like to be able to share computing resources between them and their institutions/partners. This way institutions (departments/researchers) do not need to build a very large datacenter to be able to handle large amounts or peaks of computing requests. When sharing the computing resources it is possible to “borrow” computing resources from other institutions to handle the load when needed. With cloud computing this is possible. This is why SURFnet would like advice about possible solutions. In this project available cloud solutions (footnote 1) are researched to implement shared computing resources. If the resulting cloud solution is a great success for SURFnet they will recommend this technique (footnote 2) to the institutions. All of them are then able to be part of the cloud and use more resources from each other without having to purchase extra hardware, storage, power, etc. by reusing overcapacity of
datacenters. In this project we will not cover the ability to offer some sort of billing system for shared resources, as SURFnet wants a cloud for their own internal services and to share unused resources. During this project we will come across many virtualization techniques; this is an important aspect of cloud computing. However, we will not cover this in too much depth, as much information on this can be found on the Internet and we are more interested in the available cloud platforms.

Footnote 1: Because all techniques treated in this project are relatively new, the available documentation and information is limited. This is why the Internet overall is our biggest supplier of information. This is the reason we quote a lot of people and websites and base our decisions on these sources.
Footnote 2: In this paper we will use both “technique” and “platform” to refer to the different programs/implementations available to implement a private cloud. In most cases “platform” will be used.

2.2 Goal and research questions
The research question for this project is:
Which Cloud Computing platform meets the requirements best, to combine multiple clouds to create a hybrid cloud for SURFnet?
Within this project the most promising cloud computing solutions (for SURFnet) which are available at the moment are researched. These solutions are compared on their functionality (the full list can be found in section 4.2). At the end of this project advice is given to SURFnet about the best solutions for them and how to realize their goal as stated before.

2.3 Outline of this report
In the next section cloud computing is fully explained, covering the differences between the types of cloud techniques and how they are used. After this the requirements of SURFnet are listed (section 4.2), with a description of what SURFnet hopes to achieve with cloud computing. Based on these requirements a handful of available techniques are chosen in section 5, which will be looked into. Finally all techniques and
platforms are compared with each other to be able to give the best possible advice to SURFnet, which can be found in section 17 (conclusion).

3 Cloud computing
The name cloud computing comes from the fact that in network designs large parts of a network are drawn as a cloud. For example, the Internet is most often displayed as a cloud. It is displayed like this because you are not sure what is in there, and have no control over it. But when you put some data in, you get some data out. This is the same with cloud computing: you are not sure where the servers (you are communicating with) are located, but you get back the answers you requested.
NIST defined cloud computing as follows: “Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model promotes availability and is composed of five essential characteristics, three service models, and four deployment models” [6]. The characteristics refer to On-demand self-service, Broad network access, Resource pooling, Rapid elasticity, and Measured Service.
The technique behind “Cloud Computing” is based on a collection of many old and a few new concepts in several fields, like computer grids, distributed systems and virtualization. Cloud computing has created much interest in the last few years because of the ease with which you can scale your resources. It is possible to expand or shrink your resources on the fly and only pay for the resources you use. Cloud computing is a new phase in utility computing, which means packaging of computing resources, such as computation and storage, as a metered service similar to a traditional public utility (such as electricity, water, natural gas, or telephone network) [1].
It is possible to build a cloud on top of the existing hardware in a datacenter. Most
cloud solutions rely on virtualization techniques. It is through these techniques that resources can be made available so quickly to customers. With the use of middleware all servers within a datacenter (or multiple datacenters) can be combined to act as a single pool of resources that is delivered to customers. These two techniques are the fundamentals which make cloud computing possible and such a success [2]. In the next section we will cover the different cloud services and possibilities.

3.1 Service models
Cloud computing has several layers in which you can operate. The more basic (and most used) layers are: software, platform and infrastructure. We will explain these three layers in more depth in the next section. These layers can run on top of each other, with software as the highest level and infrastructure at the bottom (Figure 1). The latter can then use virtualization techniques to work more efficiently (we will get back to this later on in this chapter). Besides the three layers just mentioned, there are also other types of services like Data-Storage as a Service (DaaS) and Communication as a Service (CaaS).

Figure 1: Cloud triangle. [3]

We will not cover these two in this paper, only the more common and (for SURFnet) interesting ones, namely:
• Software as a Service (SaaS)
• Platform as a Service (PaaS)
• Infrastructure as a Service (IaaS)

SaaS
With SaaS a company outsources an application to a SaaS provider. Applications are maintained and managed within the cloud of a provider. A company can rent this service for its own use. It is common in this kind of set-up that you pay only for what you use. Because this is built on top of a cloud, resources can be extended easily to match customer needs. SaaS moves the computational load from the user's terminal to datacenters where cloud applications are deployed. This in turn relaxes the hardware requirements for end users, because they do not need to have big computing power, as this is
done in the (remote) datacenter. This all means no upfront investment in servers and licenses [5].
Some examples of Software as a Service are: the Salesforce Customer Relationship Management (CRM) system and Google Apps.

PaaS
“This form of cloud computing delivers development environments as a service. You build your own applications that runs on the provider’s infrastructure and are delivered to your users via the Internet from the provider’s servers.” [5]
“The consumer does not manage or control the underlying cloud infrastructure including network, servers, operating systems, or storage, but has control over the deployed applications and possibly application hosting environment configurations” [6].
Some examples of Platform as a Service are: the Salesforce Apex language and Google App Engine.

IaaS
This layer is (mostly) used in combination with hardware virtualization. It is used to deliver on-demand virtual machines to customers, on which they can run their own services. It is possible for customers to build their entire infrastructure (processing, storage, networks, and other fundamental computing resources) on this cloud layer.
Several IaaS providers offer the ability to build an entire computer infrastructure on their cloud; Amazon Web Services delivers such a service (footnote 3). Users are able to manage their infrastructure through an API delivered by the provider.
“The consumer does not manage or control the underlying cloud infrastructure but has control over operating systems, storage, deployed applications, and possibly limited control of select networking components (e.g., host firewalls).” [6]
Some examples of Infrastructure as a Service are: Amazon Web Services and GoGrid.

3.2 Deployment Models
Cloud computing has several deployment models. Here are the definitions given by NIST [6]:
Private cloud: The cloud infrastructure is operated solely for an organization. It may be managed by the organization or a third party and may exist on premise or off premise.
Community cloud: The
cloud infrastructure is shared by several organizations and supports a specific community that has shared concerns (e.g., mission, security requirements, policy, and compliance considerations). It may be managed by the organizations or a third party and may exist on premise or off premise.
Public cloud: The cloud infrastructure is made available to the general public or a large industry group and is owned by an organization selling cloud services.
Hybrid cloud: The cloud infrastructure is a composition of two or more clouds (private, community, or public) that remain unique entities but are bound together by standardized or proprietary technology that enables data and application portability (e.g., cloud bursting for load-balancing between clouds).

Footnote 3: Amazon offers an IaaS public cloud to their customers under the name Amazon Elastic Compute Cloud (EC2). They also offer other cloud services, but this is the most interesting one for us. Through an API customers are able to manage their virtual machines running on this cloud.

3.3 Open standards
As cloud computing makes its entry into the world, several open standards are being introduced. These are meant to give guidelines for implementing or developing cloud computing as well as possible. These standards are reviewed by multiple parties and (most) are free to use. Holding to these guidelines should improve the overall product, as these standards are based on best practices. Many of the open standards for cloud computing are still in development today, but we will give a short highlight of the most interesting ones as found on [7]:
Cloud Security Alliance (CSA): “The Cloud Security Alliance was created to promote the use of best practices for providing security assurance within Cloud Computing, and provide education on the uses of Cloud Computing to help secure all other forms of computing”.
Open Virtualization Format (OVF): “This specification describes an open, secure, portable, efficient and extensible format for the
packaging and distribution of software to be run in virtual machines.” This is managed by the Distributed Management Task Force (DMTF).
The European Telecommunications Standards Institute (ETSI): “The focus is on scenarios where connectivity goes beyond the local network. This includes not only Grid computing but also the emerging commercial trend towards Cloud computing which places particular emphasis on ubiquitous network access to scalable computing and storage resources”.
Open Cloud Computing Interface (OCCI): “The purpose of this group is the creation of a practical solution to interface with Cloud infrastructures exposed as a service (IaaS).” This is done by the Open Grid Forum (OGF).
Object Management Group (OMG): “OMG’s focus is always on modelling, and the first specific cloud-related specification efforts have only just begun, focusing on modelling deployment of applications & services on clouds for portability, interoperability & reuse”.
Open Cloud Consortium (OCC): “development of standards for cloud computing and frameworks for interoperating between clouds, develops benchmarks for cloud computing and supports reference implementations for cloud computing, preferably open source reference implementations.”
Organization for the Advancement of Structured Information Standards (OASIS): “OASIS drives the development, convergence and adoption of open standards for the global information society”.

4 SURFnet
“SURFnet is the primary supplier of advanced networking to Colleges, Universities and Research Institutions. Their main goal is to share (multimedia) information and data by connecting institutes (higher educations and research groups) in a safe and secure way. SURFnet is constantly searching for new innovations to improve their services” [8].

4.1 Combining resources
More and more companies want to see if cloud computing is a solution for their resource needs, and the same goes for SURFnet and institutes (like other NRENs, see footnote 4). As there are a lot of (big) cloud computing providers which offer interesting services for outsourcing, for example, office-like applications, cloud computing is becoming more popular. Companies like Google offer services like Google Apps (SaaS) and Google App Engine (PaaS). Customers (SURFnet and institutes) can use these services with fewer people needed for maintenance and with fewer computational resources needed within the company, which can reduce the total cost.
The problem with these types of services is that companies want services managed internally, because there might be sensitive information involved. You want to know what happens with this kind of information when these services are outsourced to, for example, the United States.
The current state of affairs with company (including SURFnet and its institutes) resources is that they are all spread over several departments and are not used to their full potential. Extra resources are also added (bought) by one department while others are only using a small portion of theirs. This is why outsourcing certain services to a public cloud becomes interesting. But because of legal issues and sensitive information this is not always an option. This is why SURFnet wants to implement cloud computing for themselves and reduce costs by
sharing resources.
Because there already are large public cloud providers which offer (for reasonable costs) the ability to offload services/requests to a public cloud, it would be useful to include these. Of course this is not always possible due to sensitive information, but it would improve the overall flexibility of the cloud. SURFnet would like to have this option as a backup possibility.

4.2 Requirements
To sum up all requirements of SURFnet, we set the following criteria for the available cloud computing techniques. Based on these criteria several cloud techniques and platforms are researched and compared.

Footnote 4: A national research and education network (NREN) is a specialised internet service provider dedicated to supporting the needs of the research and education communities within a country. [9]

An internal private cloud: By using cloud technology they want to merge the computing of several departments and/or other institutes to avoid unused overcapacity and share resources.
Offloading to other private cloud(s): Using their own cloud, they want to connect with other private clouds (one or more of their institutes) to be able to have more resources when needed, and the other way around.
Offloading to public cloud(s): If other private clouds cannot handle the requested computing load, they would also prefer it to be possible to offload some load to a public cloud like Amazon.

Through this project we advise SURFnet on which cloud solution to implement to meet the requirements. With this cloud solution SURFnet will be able to design and implement a private cloud. If this turns out to be a success for SURFnet, institutes connected to them could also implement this solution, which makes it possible to combine multiple private clouds into a hybrid cloud.

5 Techniques/Platforms
In this section we will give a general description of the available cloud platforms (which are available to us as
this paper is written). From this list only a few are researched further. This section will explain all the relevant and important features of the researched platforms.
Given the requirements of SURFnet, we focused only on Infrastructure as a Service (IaaS) platforms. Only IaaS is able to deliver a private cloud as SURFnet wishes. All other cloud implementations generally make use of public clouds and their providers, and only use specific services like office applications. As described in the requirements (section 4.2), SURFnet would like to build a computing infrastructure on a cloud, so IaaS is the solution for them.

5.1 Available cloud techniques
As cloud computing is rising, more and more companies try to profit from it. Large parties like Microsoft, Google, and Amazon try to offer cloud services. Smaller (less known) parties also come with their own cloud solutions, both public and private.
To find the best cloud solution for SURFnet we considered and looked into several cloud solutions and came up with the following list (footnote 5):
• 3Tera (AppLogic)
• AbiCloud
• Amazon EC2
• Aserver/DAAS
• Deltacloud
• Enomaly’s Elastic Computing Platform (ECP)
• Eucalyptus
• Flexiscale
• Gandi
• GoGrid
• IBM cloudburst
• Nimbus
• OpenNebula
• OpenQRM
• Rackspace cloud (Mosso)
• Ubuntu Enterprise Cloud (UEC)
• VMware vSphere

Footnote 5: Besides these there are a lot of other cloud products (as more and more are being developed), but we found these more interesting for SURFnet than others, as they are well known and/or offer a specific interesting service.

5.2 Researched cloud techniques
Out of all available platforms listed above (section 5.1), we made a short list of the platforms we will look into in more depth. As can be read in section 4.2, the platform has to match certain requirements. Together with SURFnet we have chosen the following platforms to research further.
It came down to these because they are the most promising solutions, as they offer most of the required features (supported operating systems, cooperation
with other cloud/virtualization platforms) and have the biggest potential.
• AbiCloud
• Eucalyptus
• OpenNebula
• OpenQRM
• VMware vSphere
Due to the limited time and the limited documentation found about the platforms (because all platforms are relatively new) we have to make assumptions about the supported features. We assume that if an important feature like live migration (explained below) is not documented, it is not supported by the given platform. We will compare the platforms on the following features (footnote 6):
API: We take into account whether the platform supports multiple APIs or an open standard for this.
Availability zones: Is it possible to isolate parts of the cloud for specific purposes or customers?
Fault tolerance/fail-over: Is there a feature to handle failing hardware? Two instances of the same virtual machine (VM) run on different physical servers; when one physical server fails, the VM on the other server takes over. The same can be done for physical machines by making them redundant.
Live migration: Moving a virtual machine while it is still running. Migration of a virtual machine is always possible when it is shut down, but there would be a high level of flexibility if this could be done while it is running.

Footnote 6: Techniques like fault tolerance and live migration rely on the hypervisor used. When a hypervisor supports this it could be used, although it depends on the cloud platform whether it supports the feature as well, since it needs to know when an instance is moving.

Monitoring: Is there an ability to monitor running (virtual) machines, either with an internal tool or an external plug-in or program?
Multiple clouds: Is there support for multiple clouds? Is it possible to offload processes to other private clouds or a (subscribed) public cloud (provider)? This will be described at the start of every platform as it is an important feature.
Open Virtualization Format (OVF): “This is an open standard for packaging and distributing virtual appliances (software run within virtual
machines). This standard is not bound to a hypervisor or processor architecture” [13]. Is this standard applied within the platform? (footnote 7)
Scaling: Can the cloud infrastructure adapt to requests? Is it possible to expand or shrink the resources automatically when needed, or can this easily be done by hand (automatically shut down or turn on virtual/physical machines when needed)?
User management: Can users be managed and privileges assigned to them? Does the platform have a graphical user interface for interacting with the cloud?
Besides these features we will look into their market share, future development and overall potential. Based on all these points we will make a comparison, found in section 13, to advise SURFnet.

Footnote 7: As this open standard is already implemented in some platforms, we take this into account when comparing the platforms. When no other open standard is mentioned for a platform, it is not supported (yet).

6 Hypervisors
The virtualization techniques that are used by the different cloud platforms from our research will be briefly explained below. We will not go into much depth, as there is enough knowledge about hypervisors within SURFnet and because this was not the main question of our research, although this is an important layer in cloud computing.
Xen: works directly on the physical hardware. It treats the old (host) OS as a guest OS. For Xen, Linux needs patches to make Linux a guest VM on the hypervisor.
KVM (Kernel-based Virtual Machine): a virtualization module for the Linux kernel, which lets the kernel itself act as the hypervisor.
ESX and ESXi: “bare-metal” hypervisors. This means that they are installed directly on top of the physical hardware and partition the resources of the underlying server.
VirtualBox: a general-purpose full virtualizer for hardware, developed by Sun.
Linux VServer: “is a virtual private server implementation done by adding operating system-level virtualization capabilities to the Linux kernel.” [44]
The VMware solutions ESX/ESXi both support
their own features like vMotion, High Availability and others. These features are also supported when this hypervisor is used in an open source cloud solution. However, although their configuration and management are supported by the VMware platform, they are not configurable within the open source cloud solution. This means that these features need to be used from outside the cloud solution.

7 Test environment
For the five cloud platforms (section 5.2) we wanted to set up a test environment. The reason for this was to get some experience with them and get a feeling for their capabilities. For each of the platforms we wanted to test the following features:
• Manageability
• Virtualization features like live migration and fault tolerance
• Offloading capabilities to another cloud
• Support for multiple guest operating systems
To set up these different platforms SURFnet offered us two servers within their test datacenter. One of these servers had virtualization support from the CPU. This is why we used this server as the controller which ran all virtual machines. The network layout of the servers can be seen in Figure 2.

Figure 2: Server network layout.

Because we could not access these servers directly, we later got laptops to test the platforms. This gave us some more flexibility, since with the servers we needed to contact someone whenever something went wrong, as they were behind closed doors to us. Both of these laptops had virtualization support from the CPU. The network layout of the laptops can be seen in Figure 3.

Figure 3: Laptop network layout.

As can be read later on, we ran into some problems in setting up the platforms; this is why we used several different ways to set up a solid test environment.
Besides the servers and laptops we also used virtual machines on our own laptops to test everything. As a result, setting up all these environments cost us a lot of time and we were not able to test all the features.
We will now describe the several cloud solutions we tried. They are listed in the order in which we tested them.
• Eucalyptus
• OpenNebula
• AbiCloud
• OpenQRM
• VMware vSphere
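The test checklist for the five platforms could be tracked in a small platform-by-feature matrix. The following is a minimal sketch in Python and is not taken from the report. The feature keys paraphrase the four test items listed in this section, all function names are hypothetical, and every cell starts as `None` (untested) so that no result is implied.

```python
from typing import Dict, List, Optional, Tuple

PLATFORMS = ["Eucalyptus", "OpenNebula", "AbiCloud", "OpenQRM", "VMware vSphere"]
# Hypothetical keys paraphrasing the four test items listed in this section.
FEATURES = ["manageable", "virtualization_features", "offloading", "multiple_guest_os"]

Matrix = Dict[str, Dict[str, Optional[bool]]]


def empty_matrix() -> Matrix:
    """Create a platform x feature matrix with every cell untested (None)."""
    return {p: {f: None for f in FEATURES} for p in PLATFORMS}


def record(matrix: Matrix, platform: str, feature: str, supported: bool) -> None:
    """Store one test outcome; unknown platforms or features raise KeyError."""
    if platform not in matrix or feature not in matrix[platform]:
        raise KeyError(f"unknown cell: {platform!r} / {feature!r}")
    matrix[platform][feature] = supported


def untested(matrix: Matrix) -> List[Tuple[str, str]]:
    """List the (platform, feature) cells that still have no recorded outcome."""
    return [(p, f) for p, feats in matrix.items()
            for f, v in feats.items() if v is None]
```

Keeping untested cells as `None` rather than `False` separates "not supported" from "could not be tested", which matters here because not all features could be tested.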