GFARM V2: A GRID FILE SYSTEM THAT SUPPORTS HIGH-PERFORMANCE DISTRIBUTED AND PARALLEL DATA COMPUTING
filefrag usage

filefrag is a command-line tool that reports how a file is fragmented on disk.
It is typically used to see how a file is laid out in the underlying file system and how fragmented that file system has become.
Its basic invocation is:

    filefrag [options] filename

Commonly used options include:

    -b blocksize   report extents in blocks of the given size in bytes, instead of the file system block size
    -e             print the map in extent format, even for block-mapped files
    -s             sync the file before requesting its mapping
    -v             verbose output, listing every extent

For example, to inspect the file "example.txt":

    filefrag example.txt

Without options this prints the number of extents the file occupies; with -v it also lists each extent's logical offset, physical offset, length, and flags, which shows whether the data is stored contiguously.
filefrag can also be combined with other commands; for instance, its output can be redirected to a file for later analysis, as in the sketch below.
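A minimal shell sketch of that workflow (the file names and directory are just examples):

    # verbose extent map for one file, saved for later comparison
    filefrag -v example.txt > example_fragmentation.txt

    # quick fragmentation summary for a set of files, most fragmented first
    filefrag /var/log/*.log | sort -t: -k2 -rn | head

Keeping a snapshot of the extent map makes it easy to re-check the same file after it has been rewritten or after defragmentation.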
Saved output like this helps you understand how files are stored on disk and decide whether the file system needs defragmenting.
In short, filefrag is a very useful tool for inspecting file fragmentation and for managing and tuning file system performance.
How gflags works

gflags is Google's command-line flag parsing library; it turns command-line arguments into data structures that a program can use directly.
This article explains how gflags works: its basic usage, the parser's workflow, and how it is implemented.

1. Basic usage

1.1 Installing gflags. To use gflags, first install it in the local development environment. On Linux it is usually available from the package manager; for example, on Ubuntu:

    sudo apt-get install libgflags-dev

On Windows, prebuilt binaries can be downloaded from the gflags GitHub repository and added to the project's linker settings.

1.2 Including the gflags header. Source files that use gflags must include its header:

    #include <gflags/gflags.h>

1.3 Defining command-line flags. Flags are declared with the DEFINE_ macros, which tell gflags how to parse the command line:

    DEFINE_<type>(name, default_value, description);

where <type> is the flag type (bool, int32, int64, uint64, double, or string), name is the flag's name, default_value is its default, and description is the help text.

1.4 Parsing the command line. In the program's entry point, add the parsing call:

    gflags::ParseCommandLineFlags(&argc, &argv, true);

Here argc and argv are main's parameters, passed by reference to the gflags parser. A worked example follows below.
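Putting these steps together, a minimal sketch looks like this (the flag names and program are purely illustrative; build with something like g++ demo.cc -lgflags):

    #include <iostream>
    #include <gflags/gflags.h>

    // Each DEFINE_ macro creates a global FLAGS_<name> variable.
    DEFINE_bool(verbose, false, "Enable verbose output");
    DEFINE_int32(port, 8080, "Port to listen on");
    DEFINE_string(config, "app.conf", "Path to the configuration file");

    int main(int argc, char* argv[]) {
      // The third argument 'true' removes recognized flags from argv,
      // leaving only positional arguments behind.
      gflags::ParseCommandLineFlags(&argc, &argv, true);

      if (FLAGS_verbose) {
        std::cout << "port=" << FLAGS_port
                  << " config=" << FLAGS_config << std::endl;
      }
      return 0;
    }

Run it as, for example, ./demo --port=9000 --verbose --config=/etc/app.conf; gflags also generates --help output from the flag descriptions.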
How GRUB2 works

GRUB2 is a boot manager widely used with Linux; it was designed to solve the problem of booting multiple operating systems.
Its flexible, extensible design has made it one of the most popular boot loaders in use today.
How GRUB2 works is fairly involved, but it can be broken down into a few main steps; a command-line sketch of the same sequence follows the steps below.
Step one: start the boot loader. When the computer powers on, the firmware (BIOS) loads GRUB2 into memory and executes it.
The boot loader's main job at this point is to present a menu from which the user chooses the operating system to start.
Step two: load the kernel. Once the user picks an entry, GRUB2 loads that operating system's kernel image into memory.
The kernel provides the core functionality of the operating system, such as process management and the file system.
Step three: load the initial RAM disk (initrd). After the kernel, GRUB2 loads the initrd image into memory.
The initrd is a temporary file system used to initialize hardware and load drivers so that the system can finish booting.
Step four: start the operating system. GRUB2 hands control to the kernel, which initializes the system, starts user-space processes, and brings the operating system up.
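The same sequence can be driven by hand from the GRUB command line (press c at the menu). The device name and file paths below are illustrative and depend on the actual installation:

    grub> set root=(hd0,msdos1)
    grub> linux /vmlinuz root=/dev/sda1 ro
    grub> initrd /initrd.img
    grub> boot

set root picks the partition to read from, linux and initrd load the kernel and initial RAM disk, and boot transfers control to the kernel.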
GRUB2's design has many strengths: it supports a wide range of file systems and operating systems, boot options can be customized through scripts and configuration files, and it offers password protection and a graphical menu.
It also allows boot entries to be edited at boot time, which is convenient for debugging and one-off configuration changes.
GRUB2 has drawbacks as well: a mistake in the configuration file can leave the system unbootable, and the boot loader is relatively large, which can add to boot time.
Users should therefore read the documentation carefully and configure it with care to avoid problems.
Overall, GRUB2 is a powerful and flexible boot manager that helps users manage booting multiple operating systems.
Although its inner workings are complex, a good understanding of its principles and configuration files lets users take full advantage of it and improve system stability and performance.
StarWind V2V: "error creating file"

StarWind V2V Converter is a tool for converting and migrating virtual machine disk images, typically used to move data from one storage system or image format to another.
An "error creating file" message generally means that something went wrong while the tool was trying to create a file at the destination.
Possible causes and fixes include:
1. Permissions: make sure the account running StarWind V2V has sufficient rights to create files at the target location.
2. Insufficient disk space: check that the destination has enough free space.
3. File already exists: make sure the target file or directory does not already exist; if it does and StarWind V2V is not configured to overwrite existing files, this error can occur.
4. Path problems: verify the path you supplied, check for typos, and use the correct path separator (backslash \ on Windows, forward slash / on Linux/Unix).
5. File system limits: some file systems restrict file size or file names; make sure none of these limits are exceeded.
6. Logs and details: check the StarWind V2V logs, which usually contain a more specific error message and hints toward a solution.
7. Software issues: make sure you are running an up-to-date, or at least a known stable, version of StarWind V2V; bugs in the software can cause this kind of failure.
8. Hardware or network problems: if StarWind V2V is transferring files over the network, make sure the connection is stable and there are no hardware or network faults.
9. Conflicts with other software: make sure no other program or service is holding the target file or directory, or otherwise conflicting with StarWind V2V.
10. Retry: sometimes simply retrying the operation resolves the problem.
If none of this helps, consult the official StarWind V2V documentation or contact StarWind technical support.
Writing GRUB2 configuration (grub.cfg)

GRUB (GNU GRand Unified Bootloader) is a boot loader for systems with multiple operating systems; its current major version is GRUB2.
GRUB2's configuration file defines the operating-system boot entries and the various settings used during the boot process.
This article answers, step by step, common questions about GRUB2 configuration (cfg) files, so that readers can write and modify them correctly.
Step one: open the configuration file. GRUB2's configuration file normally lives in the /boot/grub directory and is named grub.cfg.
It can be opened with any text editor, such as vi or nano.
Note that modifying the configuration usually requires administrator privileges, so make sure you are working as root.
Step two: understand the configuration syntax. grub.cfg is written in GRUB's own command language: a series of commands and arguments arranged according to specific rules and structure.
To write GRUB2 configuration you need to know a handful of common commands and constructs, such as how menu entries are created, how a kernel or operating system is booted, and how boot parameters are set.
Step three: edit menu entries. Every menu entry in the configuration represents one operating system or kernel image.
A new entry is added with the menuentry command, with the entry's name given in quotes. An example entry looks like this:

    menuentry 'Ubuntu 20.04 LTS' {
        set root='hd0,msdos1'
        linux /vmlinuz-5.4.0-26-generic root=/dev/sda1 ro
        initrd /initrd.img-5.4.0-26-generic
    }

In this example the entry is named "Ubuntu 20.04 LTS".
Next, the set command points root at the partition hd0,msdos1.
Building glmark2

glmark2 is a graphics performance benchmark; it can be used to evaluate the graphics capability of a computer or mobile device.
One way to build it is as follows.
1. Download the source. Fetch the source code from the glmark2 project page: find the latest release, download it, and unpack the archive into a suitable directory.
2. Install build dependencies. Before building glmark2, install the required development packages. On a Debian/Ubuntu system:

    sudo apt-get install build-essential cmake libegl1-mesa-dev libgles2-mesa-dev libdrm-dev libgbm-dev libx11-dev libudev-dev libinput-dev libxkbcommon-dev libpng-dev libjpeg-dev libgl1-mesa-dev libglu1-mesa-dev

This installs the compiler toolchain and the libraries glmark2 needs.
3. Create a build directory. Inside the source tree, create a directory to build in:

    mkdir build
    cd build

4. Configure. Run:

    cmake ..

This reads the CMakeLists.txt in the source directory and generates the Makefiles needed for the build.
5. Build glmark2. Run:

    make

This compiles the glmark2 sources and produces the executable.
6. Install glmark2. After the build finishes, install it system-wide:

    sudo make install

Once installed, glmark2 can be run directly from the system.
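A note on the build system, as an aside to the CMake-based steps above: depending on the release, glmark2 may not ship CMake files at all; newer upstream releases build with Meson, roughly like this (treat it as an alternative sketch, not the definitive procedure for every version):

    meson setup build
    ninja -C build
    sudo ninja -C build install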
7. Run glmark2. The benchmark can now be run from a terminal to measure graphics performance; see the invocation sketch below.
glmark2 executes a series of rendering scenes, prints a score for each, and finishes with an overall glmark2 score.
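A typical invocation (the --fullscreen flag and the log file name are just examples; available options vary by version and can be listed with glmark2 --help):

    glmark2
    glmark2 --fullscreen
    glmark2 2>&1 | tee glmark2-results.txt   # keep the per-scene scores for later comparison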
gg.internal2 usage

gg.internal2 is a machine learning and natural language processing (NLP) library that can help improve the performance of NLP algorithms in a variety of practical applications.
This article walks through its usage and its application to NLP, step by step.
Step one: install the gg.internal2 library. To use gg.internal2, it first has to be installed on your machine.
It can be installed with pip:
1. Open a terminal or command prompt.
2. Run:

    pip install gg.internal2

This downloads and installs gg.internal2 and its dependencies.
Step two: import the library. Once installed, import it at the top of your Python script:

    import gg.internal2 as gg

This makes the library's functionality available in the script.
Step three: load a dataset. Before using gg.internal2 you need a dataset for training and testing the NLP model.
The dataset can be a CSV file, a JSON file, or a plain text file containing the text data.
pandas is a convenient way to load and handle it:
1. Install pandas:

    pip install pandas

2. Import it in the Python script:

    import pandas as pd

3. Load a CSV file with read_csv():

    data = pd.read_csv('data.csv')

This loads the CSV file into a pandas DataFrame named data.
Step four: preprocess the data. Once the dataset is loaded, it has to be preprocessed into input the NLP model can consume.
gg.internal2 provides preprocessing utilities such as tokenization, stop-word removal, and text vectorization.
1. Tokenization: split the text into words or tokens; an equivalent sketch using a widely available library is shown below.
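Since gg.internal2's own API is not shown above, here is an equivalent preprocessing sketch using pandas and scikit-learn instead (the file name and the 'text' column are assumptions for illustration). CountVectorizer tokenizes the text, drops English stop words, and produces a document-term matrix in one step:

    import pandas as pd
    from sklearn.feature_extraction.text import CountVectorizer

    # Load the dataset; assumes data.csv has a 'text' column.
    data = pd.read_csv('data.csv')

    # Tokenize, remove English stop words, and vectorize.
    vectorizer = CountVectorizer(stop_words='english')
    X = vectorizer.fit_transform(data['text'])

    print(X.shape)                                   # (n_documents, vocabulary_size)
    print(vectorizer.get_feature_names_out()[:10])   # first few vocabulary terms

The resulting matrix X can then be fed to whatever classifier or model the rest of the pipeline uses.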
WinDbg and GFlags workflow

WinDbg is a debugger for the Windows operating system that helps developers analyze and debug application crashes and performance problems.
GFlags (the Global Flags editor) ships alongside WinDbg in the Debugging Tools for Windows and provides a convenient way to set system-wide and per-image debugging flags.
This article describes a workflow for debugging and performance analysis with WinDbg and GFlags.
Step one: install WinDbg and configure the symbol path. First install the WinDbg debugging tools and set up the symbol file path.
Symbol files contain the application's debugging information and make it much easier to analyze crashes and performance problems.
Download the symbol files into a local folder, for example C:\Sym.
Then, in the WinDbg command window, add that folder to the symbol path:

    .sympath+ C:\Sym

where C:\Sym is the root of the symbol store and can be adjusted as needed.
Step two: start WinDbg and load the application. Open WinDbg and choose Open Executable from the File menu (or Attach to Process for an already running program), then supply the path to the application's executable.
After confirming, WinDbg loads the application and stops at the initial breakpoint.
Step three: set breakpoints and watchpoints. In the WinDbg command window, set a breakpoint with:

    bp symbol

where symbol is the function name (typically module!function) or an address to break at.
A breakpoint interrupts execution when the program reaches that location, so it can be inspected and debugged.
Besides breakpoints, hardware watchpoints can monitor a particular variable or memory address:

    ba r/w size address

where size is the size of the memory block to watch, address is the memory address, and r or w selects break-on-read or break-on-write.
A watchpoint interrupts execution when the watched memory is accessed or changed, which again allows inspection and analysis.
Step four: run the program and hit the breakpoints. Enter g in the command window to let the application continue.
When execution reaches a breakpoint or a watchpoint fires, WinDbg breaks in and shows the relevant debugging information; a sample session follows below.
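A sample session, with a hypothetical module name myapp and illustrative addresses (the Microsoft public symbol server line is optional but common):

    .sympath+ C:\Sym
    .sympath+ srv*C:\Sym*https://msdl.microsoft.com/download/symbols
    .reload
    bp myapp!main
    g
    ba w 4 0x00a73f90
    g
    k

.reload makes the debugger pick up the new symbol path, bp myapp!main breaks on entry to main, ba w 4 breaks when the four bytes at the given address are written, and k prints the call stack at the point where execution stopped.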
GFARM V2: A GRID FILE SYSTEM THAT SUPPORTS HIGH-PERFORMANCE DISTRIBUTED AND PARALLEL DATA COMPUTING

Osamu Tatebe, Satoshi Sekiguchi, AIST, Tsukuba, Japan
Youhei Morita, KEK, Tsukuba, Japan
Noriyuki Soda, SRA, Nagoya, Japan
Satoshi Matsuoka, Titech/NII, Tokyo, Japan

Abstract

Grid Datafarm architecture is designed for facilitating reliable file sharing and high-performance distributed and parallel data computing in a Grid across administrative domains by providing a global virtual file system. Gfarm v2 is an attempt to implement a global virtual file system that supports a complete set of standard POSIX APIs, while still retaining the parallel and distributed data computing feature of the Grid Datafarm architecture. This paper discusses the design and implementation of Gfarm v2, which provides a secure, robust, scalable and high-performance global virtual file system.

INTRODUCTION

Recent research, development and standardization of Grid technologies make it possible to share resources such as CPU and storage across administrative domains. Fundamental problems for resource sharing such as authentication, authorization and security have been resolved to some degree. Problems are gradually shifting to core operating system functionalities such as process management, process scheduling, signal handling, and file systems, in order to exploit resources located at distant sites efficiently.

Our research group proposed the Grid Datafarm architecture for Petascale data-intensive computing facilitating distributed resources in a wide area [1]. The main features of the architecture are to provide (1) a Grid file system that integrates local disks of compute nodes in a Computational Grid, and (2) parallel and distributed computing associating the Computational Grid with the Data Grid.

A Grid file system is a global virtual file system that federates numbers of file systems (or file servers) in a Grid. Integration is achieved by a filesystem metadata server that manages a virtual human-readable namespace. Standardization regarding Grid file systems is ongoing in the Grid File System Working Group of the Global Grid Forum [2].
Physically, each file or each block in a file would be stored in some arbitrary file server in a Grid, while users and applications would access files via a virtualized file system without being concerned about the file location. As such, a Grid file system is a shared network file system scaled to Grid level, allowing easy and transparent sharing of file data without any modifications to existing applications.

In addition, in the Grid Datafarm architecture, the local disks of compute nodes in a Computational Grid compose a Grid file system (Fig. 1); every file server provides not only storage but also computing resources. In other words, the storage of the file servers comprises a Grid file system or a Data Grid, while the computing resources of the file servers comprise a Computational Grid.

Figure 1: Grid Datafarm Architecture. Local disks of compute nodes in a Computational Grid compose a Gfarm file system. A job will be executed on a compute node that has one of the file replicas of the requested file. (The figure shows a client, a job server, file sharing via the Gfarm file system, and a submitted job being executed on a file system node that has the requested file.)

When a user submits a job to the Computational Grid, it will be scheduled and executed on one of the file servers (i.e., a compute node) that has a copy of the requested file, depending on the CPU utilization. This scheduling policy is called file-affinity process scheduling, which enables scalable I/O performance as well as distributed data computing. The key issue is the association of the Computational Grid with the Data Grid. Having separate I/O across the network, independent from the compute nodes, would be disadvantageous for a large-scale distributed system.

Grid Datafarm architecture, moreover, supports high-performance distributed and parallel computing for processing a group of files by a single program, which is a most time-consuming, but also a most typical, task in data-intensive computing such as high energy physics, astronomy, space exploration, and human genome analysis. Such processing can typically be performed independently on every file in parallel, or at least exhibits good locality. In order to facilitate this, in Grid Datafarm, an arbitrary group of files, possibly dispersed across administrative domains, can be managed as a single Gfarm file. Each member file will be accessed in parallel in a new file view called the local file view by a parallel process allocated by file-affinity scheduling based on the replica locations of the member files. File-affinity scheduling and the file view feature naturally derive the owner-computes strategy, or the "move the computation to data" approach, for parallel and distributed data computing on member files of a Gfarm file in a single system image. This is the key distinction of Gfarm over other distributed file systems, where the data will be moved to the computation by default.

Gfarm v1 is a prototype implementation of the Grid Datafarm architecture (Gfarm is a registered trademark in Japan). It provides the subset of the standard POSIX API required for data-intensive computing. Gfarm v1 has demonstrated its architectural advantages and ease of programming for data-intensive applications [3]. However, we found that there are several weaknesses in the system, some from the lack of features typically found in distributed file systems, some concerning robustness and dependability, and some more fundamental to the architecture itself. This paper discusses the design and implementation of Gfarm v2, which attempts to overcome such weaknesses.
Gfarm v2 aims to provide a POSIX-compliant global virtual file system facilitating the features of the Grid Datafarm architecture for Petascale data-intensive computing. It can be used as a general-purpose network file system for a Grid or virtual organization, allowing existing applications to share files securely and dependably, and to access files efficiently across administrative domains.

RELATED WORK

There are several high-performance file systems that support more than a thousand clients and/or file system nodes.

Lustre [4] supports more than a thousand clients in a cluster system. Lustre consists of numbers of file servers or Object Storage Targets (OSTs), metadata servers, and numerous clients. Between file servers and clients, high-speed interconnects such as Gigabit Ethernet, Elan3, and Myrinet are assumed. Each file (or object) can be placed in any OST. Lustre does not facilitate replica management; instead, it uses a writeback cache to improve write performance. A collaborative read cache is being planned to improve read performance. The major difference from the Grid Datafarm architecture is that Lustre separates file system nodes from clients (or compute nodes). This reflects the fact that Lustre assumes an operating environment of large clusters and intra-enterprise high-bandwidth connectivity, where moving data to computation makes sense, in contrast to Gfarm, where data may reside distributed over a wide area, making exploitation of local high bandwidth by moving computation to data essential.

The Google File System [5] supports more than a thousand storage nodes. All files are divided into fixed-size chunks, and each chunk can be placed in any storage node (chunkserver). All chunks have three replicas by default to prevent data loss on failures. All I/O operations are implemented by the user client library with no client or server cache. The Google File System does not provide a complete POSIX API, but tunes itself to operations that support Google's data processing needs. The difference from the Grid Datafarm architecture is that the Google File System also separates I/O from clients, and it divides a file into fixed-size chunks. The latter is disadvantageous for parallel and distributed data computing since data access cannot be localized.

OVERVIEW OF GFARM V1

Gfarm v1 is a prototype implementation that realizes the Grid Datafarm architecture. It is open source software available at /. It consists of the Gfarm I/O library, an I/O server (gfsd), and a set of metadata servers (gfmd and slapd).

The Gfarm I/O library provides interfaces for accessing the Gfarm file system. It includes Gfarm file read/write, file replication, parallel I/O, parallel file transfer, and file-affinity process scheduling. Parallel I/O provides the file view feature, index file view and local file view, for distributed and parallel data computing.

A system call hooking library is provided so that existing binary programs can access the Gfarm file system as if it were mounted at /gfarm. By loading the system call hooking library before program execution, the necessary system calls of existing programs can be trapped without any modification. The system call hooking library determines from the access path or a file descriptor whether a file is in the Gfarm file system or not. If it is a file in the Gfarm file system, the corresponding Gfarm APIs are called to access the file.
Otherwise, it calls the system call as usual.

Gfsd is an I/O daemon running on every file system node. It facilitates file access and file replica creation for the file system nodes. In the Grid Datafarm architecture, a file system node is also assumed to be a compute node, and gfsd has the ability to execute a remote program on the file system node. Moreover, gfsd manages status information and the CPU load average of the file system node for scheduling.

The metadata server consists of gfmd and slapd. All file system metadata, including directories, file status information and the replica catalog, is managed by slapd, which is an LDAP server developed by the OpenLDAP project [6]. Gfmd is a process manager that is used by the Gfarm remote execution commands.

The Gfarm file system can be accessed not only by the Gfarm I/O library but also by standard protocols such as scp, GridFTP, and SMB using the system call hooking library.

Weaknesses of Gfarm v1

Being the first prototype/experimental implementation of the Grid Datafarm architecture, Gfarm v1 has several weaknesses with respect to functionality, robustness, security, and flexibility. Gfarm v1 was developed to investigate the requirements of large-scale data-intensive computing, and to show the effectiveness of the Grid Datafarm architecture. As such, functionalities found commonly in distributed file systems but considered not important for those specific purposes, including file open in read-write mode and advisory file locking, were not designed or implemented (read-write mode has been supported since version 1.0.4). On the other hand, as we expanded the application fields, it became clear that some traditional distributed file system features, as well as some enhancements, were necessary.

In Gfarm v1, it is the role of the Gfarm I/O library to maintain consistency between the metadata and the corresponding physical file. Since it is a user-level library, however, an unexpected application crash can easily break this consistency. Another case is that a file owner can modify or delete physical files directly on file system nodes, bypassing the Gfarm I/O library. This also causes inconsistency between the metadata and the physical file. Although Gfarm v1 provides a maintenance command to check and fix the inconsistency, this is cumbersome and error-prone.

Gfarm v1 does not impose sufficient access control on metadata access. Also, there is no access control for groups, since there is no group management.

The Grid Datafarm architecture supports managing a group of files (a collection, or container) as a single Gfarm file for parallel and distributed data computing. To support this feature, Gfarm v1 has a special type of metadata for a group of files that has any number of member files. On the other hand, for every type of grouping we have had to construct support internally, which lacked flexibility and was in fact quite cumbersome.

DESIGN AND IMPLEMENTATION OF GFARM V2

The goal of Gfarm v2 is to provide a POSIX-compliant, robust, dependable and secure network file system, as well as to support more than ten thousand clients and file server nodes with scalable file I/O performance. POSIX compliance includes supporting read-write file open mode and advisory file locking. It also aims to be a substitute for NFS and AFS, while retaining the parallel data processing capability.

Opening Files in Read-write Mode

When a file has several file replicas, consistency among the file replicas needs to be maintained when it is modified.
The semantics of consistency among file replicas supported by Gfarm v2 is the same as AFS.

1. If there is no advisory file locking, updated file content can be accessed only by a process that opens it after a writing process closes.

2. Otherwise, up-to-date file content can be accessed in the locked region among processes that lock it. Note that this is not always ensured when a process writes the same file without file locking.

To ensure the semantics of consistency, file replicas to be accessed are selected as follows when opening a file.

1. When opening a file in read mode,
   (a) select any file replica of the file.

2. When opening a file in write mode,
   (a) if there is a process that opens the file in write mode, select the file replica already opened in write mode,
   (b) if there are several processes that open the file in read mode, select any file replica or one of the file replicas opened in read mode,
   (c) if there is no process that opens the file, select any file replica.

The selection is done by the metadata server. This selection ensures two different file replicas cannot be opened in write mode at the same time.

The metadata server deletes all invalid metadata and all invalid file replicas when closing a file that is opened in write mode. The reason why all possible invalid metadata and all possible invalid file replicas are not deleted at file open time is that there is a case such that a file is not modified even though it is opened in write mode. Regarding deletion of file replicas whilst one of them is still held open by some process, there is no particular problem since it is still accessed by a valid file descriptor.

Advisory File Locking

Gfarm v2 supports advisory file locking in POSIX. A read lock and a write lock are supported for the whole file or a region of a file.

The basic policy to implement the advisory file locking is that all processes access the same file replica when the file is locked. Moreover, to ensure access to the up-to-date file content, the client cache is disabled in the locked region.

Fig. 2 describes the mechanism to implement advisory file locking with an example. There are two file system nodes, FSN1 and FSN2. A file /grid/jp/file2 has two file replicas stored on FSN1 and FSN2. Process 1 running on FSN1 opens the file /grid/jp/file2 in read-write mode. Since one of the file replicas is stored on the same node, the local file replica is selected to be accessed. Process 2 running on FSN2 opens the same file in read-only mode. Since one of the file replicas is stored on the same node, the local file replica is selected to be accessed. Note that the modification of the file replica on FSN1 does not need to be reflected on to the file replica on FSN2, since Process 2 opens the file before Process 1 closes it.

Process 2 then requests a read lock to the metadata server. Since Process 1 already has the file replica open on FSN1 of the same file in read-write mode, it needs to change the file replica to be accessed. Before changing the file replica, it flushes the client file cache, and disables it in the locked region to ensure access to the up-to-date content.

Figure 2: Advisory file locking. Process 1 opens /grid/jp/file2 in read-write mode. Process 2 opens the same file in read mode, and requests a read lock. In this case, Process 2 flushes and disables the client cache, and changes the file replica to be accessed from FSN2 to FSN1, since the file replica on FSN1 is already opened in read-write mode.

Consistent Update of Metadata

Gfarm v1 maintains consistency between metadata and the corresponding physical file using the Gfarm I/O library.
Moreover,exploiting symbolic links and hard links,it is possible to realize the above three kinds of grouping for the samefile set using standardfile operations.SUMMARY AND FUTURE WORK Gfarm v2aims at being a global virtualfile system hav-ing scalability up to more than ten thousand clients andfile system nodes.It also aims to be POSIX compliant,secure, robust,and dependable.This paper discussed its design and implementation.We are still implementing the Gfarm v2,and have a plan to release thefirst version in March,2005.We would like to evaluate it especially regarding the scalability up to more than ten thousand nodes.We need to investigate several re-search issues including data preservation and efficient al-gorithm of automatic replica creation.ACKNOWLEDGEMENTSWe thank the members of the Gfarm project of AIST, KEK,Tokyo Institute of Technology,and the University of Tokyo for taking the time to discuss many aspects of this work with us,and for their valuable suggestions.We also thank the members of Grid Technology Research Center, AIST,for their cooperation in this work.REFERENCES[1]O.Tatebe,Y.Morita,S.Matsuoka,N.Soda and S.Sekiguchi,“Grid Datafarm Architecture for Petascale Data Intensive Computing”,Proc.2nd IEEE/ACM International Symposium on Cluster Computing and the Grid(2002)[2]https:///projects/gfs-wg/.[3]N.Yamamoto,O.Tatebe and S.Sekiguchi,“Parallel andDistributed Astronomical Data Analysis on Grid Datafarm”, Proc.5th IEEE/ACM International Workshop on Grid Com-puting(2004).[4]/.[5]S.Ghemawat,H.Gobioff and S.Leung,“The Google FileSystem”,Proc.19th ACM Symposium on Operating Systems Principles(2003).[6]/.。