一个使用gromacs进行蛋白质模拟的入门教程

合集下载

GROMACS使用教程

GROMACS教程一 Gromacs基本模拟流程 (4)1 下载pdb文件 (4)2 用pdb2gmx 处理 pdb 文件 (5)3 建立盒子 (5)5 设置能量最小化 (6)6 用grompp程序进行文件处理 (9)7 使用 genion 和tpr文件添加离子 (9)8 用fws_ion.pdb来产生能量最小化的输入文件 (10)9 在后台运行能量最小化（在命令后加&） (10)二设置位置限制性动力学模拟 (11)三设置非限制性动力学模拟 (15)1 如何重启一个计算 (17)2 如何延长一个计算 (17)3 如何设置并行计算 (18)五模拟结果分析 (18)1 如何将特定帧的轨迹保存成*.pdb文件 (18)2 用ngmx观察轨迹文件（也可以用VMD观察轨迹文件) (19)3 比较常用的分析工具 (21)3.3 g_covar 计算斜方差 (24)3.4 g_energy 能量数据作图，如压力、体积、密度等 (24)3.5 g_gyrate 测量回旋半径 (26)3.6 g_rms 与 g_rmsdist 计算结构的RMSD 值 (26)3.7 g_rmsf 计算原子位置的根均方波动（ rmsf ） (27)3.8 do_dssp 计算模型的二级结构 (30)3.9 g_hbond 计算模拟过程中分子间的氢键的数目、距离或角度 (31)3.10 g_saltbr 分析模拟中残基间的盐桥 (32)GROMACS 是一个使用经典分子动力学理论研究蛋白质动力学的高端的高效的工具。

GROMACS是遵守GNU许可的免费软件，可以从以下站点下载：，并且可以在linux和 Windows上使用。

在本教程中，将研究一个从漏斗形蜘蛛的毒液中分离的毒素。

我们将使用显性溶剂动力学的方法来进行研究。

首先比较真空中和溶解的模型。

我们将把毒素肽溶在水盒子里，紧接着用牛顿运动定律加以平衡。

我们还将比较偿离子在显性溶剂动力学中的影响。

gromacs使用手册

gromacs使用手册摘要：1.引言2.gromacs 简介3.gromacs 的基本使用3.1 安装gromacs3.2 创建模拟系统3.3 运行模拟3.4 分析结果4.gromacs 的高级使用4.1 模拟参数设置4.2 力场参数设置4.3 脚本编写与批处理5.gromacs 的常见问题及解决方法6.总结正文：gromacs 使用手册gromacs 是一款开源的分子动力学模拟软件，广泛应用于生物大分子、药物分子、液晶等体系的模拟研究。

本手册将详细介绍gromacs 的使用方法及高级技巧。

1.引言分子动力学模拟是一种通过计算原子之间的相互作用力，来研究分子运动规律的方法。

gromacs 作为一款功能强大的分子动力学模拟软件，得到了广大科研工作者的青睐。

2.gromacs 简介gromacs 是一款基于GROMOS 力场的分子动力学模拟软件，支持多种体系、多种力场的模拟计算。

gromacs 采用高效的算法和技术，可以实现高效的大规模模拟计算。

3.gromacs 的基本使用3.1 安装gromacs用户可以根据gromacs 的官方文档，选择合适的安装方式，如使用编译器进行编译安装，或使用包管理器进行安装。

3.2 创建模拟系统首先需要构建分子模型，包括原子类型、坐标文件、相互作用参数等。

接着，通过gromacs 的脚本或命令行，设定模拟参数，如温度、压力、模拟时间等。

3.3 运行模拟根据设定的参数，运行gromacs 命令，开始模拟计算。

gromacs 会生成一系列模拟结果文件，包括轨迹文件、能量文件、坐标文件等。

3.4 分析结果使用gromacs 提供的分析工具或其他第三方软件，对模拟结果进行后处理，如计算均方根偏差（RMSD）、计算相互作用能等。

4.gromacs 的高级使用4.1 模拟参数设置根据实际需求，调整模拟参数，如采用更高级的力场、改变模拟方法等，以优化模拟效果。

4.2 力场参数设置通过修改力场参数文件，可以自定义力场参数，以适应不同体系的研究需求。

gromacs使用手册

gromacs使用手册【原创实用版】目录1.Gromacs 简介2.Gromacs 的功能和应用领域3.Gromacs 的使用方法和技巧4.Gromacs 的安装与配置5.Gromacs 的常见问题与解决方案6.Gromacs 的未来发展趋势正文1.Gromacs 简介Gromacs 是一个开源的生物大分子模拟软件，主要用于分子动力学模拟和静态模拟。

Gromacs 可以模拟各种生物大分子，如蛋白质、核酸和多糖等，被广泛应用于生物学、化学、材料科学等领域的研究中。

2.Gromacs 的功能和应用领域Gromacs 具有以下主要功能：(1) 分子动力学模拟：Gromacs 可以使用不同的力场和算法进行分子动力学模拟，以研究分子结构和功能的关系。

(2) 静态模拟：Gromacs 可以进行静态模拟，以研究分子的静态结构。

(3) 计算化学：Gromacs 可以进行量子化学计算和分子轨道计算。

(4) 溶剂模型：Gromacs 提供了多种溶剂模型，以模拟不同环境下分子的行为。

Gromacs 的应用领域包括：(1) 蛋白质结构和功能研究(2) 药物设计(3) 材料科学(4) 生物物理学3.Gromacs 的使用方法和技巧(1) 安装 Gromacs：Gromacs 的安装过程相对简单，只需按照官方提供的安装指南进行即可。

(2) 配置 Gromacs：Gromacs 需要配置一些环境变量和参数文件，以保证其正常运行。

(3) 使用 Gromacs：Gromacs 提供了命令行界面和图形界面两种使用方式。

(4) 技巧：在进行模拟时，需要注意选择合适的力场和算法，以及合理设置模拟参数。

4.Gromacs 的安装与配置(1) 安装 Gromacs：可以从 Gromacs 官网下载最新版本的安装包，然后按照官方提供的安装指南进行安装。

(2) 配置 Gromacs：需要配置环境变量和参数文件，以保证 Gromacs 正常运行。

gromacs使用手册

gromacs使用手册【原创版】目录1.Gromacs 简介2.Gromacs 的功能3.Gromacs 的使用方法4.Gromacs 的常见问题与解决方案5.总结正文1.Gromacs 简介Gromacs 是一个开源的生物大分子模拟软件，主要用于分子动力学模拟、势能计算、分子对接等。

它广泛应用于生物物理学、药物设计等领域，为研究者提供了强大的计算工具。

2.Gromacs 的功能Gromacs 具有以下主要功能：(1) 分子动力学模拟：Gromacs 能够模拟不同温度、压力条件下的分子动力学行为，帮助研究者深入了解分子结构和性质。

(2) 势能计算：Gromacs 可以计算分子间的相互作用势能，为分子对接、药物设计等领域提供重要参考数据。

(3) 分子对接：Gromacs 具有高效的分子对接功能，可以帮助研究者快速找到合适的分子结合模式。

(4) 蛋白质结构预测：Gromacs 可以通过分子动力学模拟，预测蛋白质的二级结构和三级结构。

3.Gromacs 的使用方法(1) 安装 Gromacs：首先需要从官方网站下载 Gromacs 源代码并进行编译安装。

(2) 准备模拟系统：根据研究目的，构建分子模型，并设定模拟参数，如温度、压力等。

(3) 运行模拟：使用 Gromacs 提交模拟任务，等待模拟结果。

(4) 分析结果：对模拟结果进行分析，提取所需的信息，如分子动力学轨迹、势能曲线等。

4.Gromacs 的常见问题与解决方案(1) 模拟过程中出现异常：可能是由于模拟参数设置不合理，或者系统配置不足导致。

需要检查模拟参数和系统配置，适当调整以解决问题。

(2) 模拟结果不准确：可能是由于分子模型构建不准确，或者模拟过程中出现误差。

需要检查分子模型，适当延长模拟时间以提高模拟精度。

(3) Gromacs 报错：可能是由于软件版本不兼容，或者安装过程中出现错误。

需要检查软件版本和安装过程，重新安装或升级软件。

5.总结Gromacs 是一个功能强大的生物大分子模拟软件，广泛应用于生物物理学、药物设计等领域。

gromacs使用手册

gromacs使用手册摘要：一、Gromacs简介二、Gromacs的安装与配置三、Gromacs的基本操作1.创建模拟配置文件2.运行模拟3.分析结果四、Gromacs的高级功能1.分子动力学模拟2.热力学计算3.对接与筛选五、Gromacs的优缺点六、Gromacs的未来发展正文：一、Gromacs简介Gromacs是一款用于分子动力学模拟的开源软件，广泛应用于生物化学、材料科学等领域。

它具有高效的计算性能、丰富的功能和友好的用户界面，为科学家提供了强大的分子模拟工具。

二、Gromacs的安装与配置要在计算机上安装Gromacs，首先需要确保满足系统的硬件和软件要求。

接下来，按照官方文档的指引进行安装和配置。

在配置过程中，用户可以根据自己的需求选择相应的模块和参数。

三、Gromacs的基本操作1.创建模拟配置文件要运行Gromacs，首先需要创建一个模拟配置文件（gro文件），其中包含了模拟系统的信息，如原子、盒子、温度、压力等。

通过编辑gro文件，用户可以设置模拟的具体参数。

2.运行模拟在完成gro文件设置后，使用Gromacs提供的脚本（如mdrun）运行模拟。

根据需要，用户可以选择不同的模拟模式，如NVT、NPT等。

3.分析结果Gromacs可以自动生成模拟过程中的数据文件（如gro、xtc、trr等），用户可以通过Gromacs提供的分析工具（如g_analysis）对这些文件进行处理和可视化。

四、Gromacs的高级功能1.分子动力学模拟Gromacs支持多种分子动力学算法，如Verlet积分器、Langevin动力学等。

用户可以根据研究需求选择合适的算法进行模拟。

2.热力学计算Gromacs可以用于计算系统的热力学性质，如比热、熵等。

这些计算有助于深入了解系统的热力学行为。

3.对接与筛选Gromacs提供了对接和筛选工具，用于寻找分子间的最佳结合位点。

这对于药物设计和蛋白质筛选等领域具有重要的应用价值。

一个使用gromacs进行蛋白质模拟的入门教程

Lysozyme in WaterJustin LemkulDepartment of Biochemistry, Virginia TechThis example will guide a new user through the process of setting up a simulation system containing a protein (lysozyme) in a box of water, with ions. Each step will contain an explanation of input and output, using typical settings for general use.This tutorial assumes you are using a GROMACS version in the series.GROMACS TutorialStep One: Prepare the TopologyWe must download the protein structure file we will be working with. For this tutorial, we will utilize hen egg white lysozyme (PDB code 1AKI). Go to the RCSB website and download the PDB text for the crystal structure.Once you have downloaded the structure, you can visualize the structure using a viewing program such as VMD, Chimera, PyMOL, etc. Once you've had a look at the molecule, you are going to want to strip out the crystal waters. Use a plain text editor like vi, emacs (Linux/Mac), or Notepad (Windows). Do not use word processing software! Delete the lines corresponding to these molecules (residue "HOH" in the PDB file). Note that such a procedure is not universally appropriate ., the case of a bound active site water molecule). For our intentions here, we do not need crystal water.Always check your .pdb file for entries listed under the comment MISSING, as these entries indicate either atoms or whole residues that are not present in the crystal structure. Terminal regions may be absent, and may not present a problem for dynamics. Incomplete internal sequences or any amino acid residues that have missing atoms will cause pdb2gmx to fail. These missing atoms/residues must be modeled in using other software packages. Also note that pdb2gmx is not magic. It cannot generate topologies for arbitrary molecules, just the residues defined by the force field (in the *.rtp files - generally proteins, nucleic acids, and a very finite amount of cofactors, like NAD(H) and ATP).Now that the crystal waters are gone and we have verified that all the necessary atoms are present, the PDB file should contain only protein atoms, and is ready to be input into the first GROMACS tool,pdb2gmx. The purpose of pdb2gmx is to generate three files:1.The topology for the molecule.2. A position restraint file.3. A post-processed structure file.The topology by default) contains all the information necessary to define the molecule within a simulation. This information includes nonbonded parameters (atom types and charges) as well as bonded parameters (bonds, angles, and dihedrals). We will take a more detailed look at the topology once it has been generated.Execute pdb2gmx by issuing the following command:pdb2gmx -f-o-water spceThe structure will be processed by pdb2gmx, and you will be prompted to choose a force field:Select the Force Field:From '/usr/local/gromacs/share/gromacs/top':1: AMBER03 force field (Duan et al., J. Comp. Chem. 24, 1999-2012, 2003)2: AMBER94 force field (Cornell et al., JACS 117, 5179-5197, 1995)3: AMBER96 force field (Kollman et al., Acc. Chem. Res. 29, 461-469, 1996)4: AMBER99 force field (Wang et al., J. Comp. Chem. 21, 1049-1074, 2000)5: AMBER99SB force field (Hornak et al., Proteins 65, 712-725, 2006)6: AMBER99SB-ILDN force field (Lindorff-Larsen et al., Proteins 78, 1950-58, 2010)7: AMBERGS force field (Garcia & Sanbonmatsu, PNAS 99, 2782-2787, 2002)8: CHARMM27 all-atom force field (with CMAP) - version9: GROMOS96 43a1 force field10: GROMOS96 43a2 force field (improved alkane dihedrals)11: GROMOS96 45a3 force field (Schuler JCC 2001 22 1205)12: GROMOS96 53a5 force field (JCC 2004 vol 25 pag 1656)13: GROMOS96 53a6 force field (JCC 2004 vol 25 pag 1656)14: OPLS-AA/L all-atom force field (2001 aminoacid dihedrals)15: [DEPRECATED] Encad all-atom force field, using full solvent charges16: [DEPRECATED] Encad all-atom force field, using scaled-down vacuum charges 17: [DEPRECATED] Gromacs force field (see manual)18: [DEPRECATED] Gromacs force field with hydrogens for NMRThe force field will contain the information that will be written to the topology. This is a very important choice! You should always read thoroughly about each force field and decide which is most applicable to your situation. For this tutorial, we will use the all-atom OPLS force field, so type 14 at the command prompt, followed by 'Enter'.There are many other options that can be passed to pdb2gmx. Some are listed here:-ignh: Ignore H atoms in the PDB file; especially useful for NMR structures. Otherwise, if H atoms are present, they must be in the correct order and named exactly how GROMACS expects them to be.-ter: Interactively assign charge states for N- and C-termini.-inter: Interactively assign charge states for Glu, Asp, Lys, Arg, and His; assign disulfides to Cys.You have now generated three new files: , , and . is a GROMACS-formatted structurefile that contains all the atoms defined within the force field ., H atoms have been added to the amino acids in the protein). The file is the system topology (more on this in a minute). The file contains information used to restrain the positions of heavy atoms (more on this later).Step Two: Examine the TopologyLet's look at what is in the output topology . Again, using a plain text editor, inspect its contents. After several comment lines (preceded by ;), you will find the following:#include ""This line calls the parameters within the OPLS-AA force field. It is at the beginning of the file, indicating that all subsequent parameters are derived from this force field. The next important line is [ moleculetype ], below which you will find ; Name nrexclProtein_A 3The name "Protein_A" defines the molecule name, based on the fact that the protein was labeled as chain A in the PDB file. There are 3 exclusions for bonded neighbors. More information on exclusions can be found in the GROMACS manual; a discussion of this information is beyond the scope of this tutorial.The next section defines the [ atoms ] in the protein. The information is presented as columns:[ atoms ]; nr type resnr residue atom cgnr charge mass typeB chargeB massB; residue 1 LYS rtp LYSH q +1 opls_287 1 LYS N 1 ; qtot2 opls_290 1 LYS H1 1 ; qtot3 opls_290 1 LYS H2 1 ; qtot4 opls_290 1 LYS H3 1 ; qtot5 opls_293B 1 LYS CA 1 ; qtot6 opls_140 1 LYS HA 1 ; qtot 1The interpretation of this information is as follows:nr: Atom numbertype: Atom typeresnr: Amino acid residue numberresidue: The amino acid residue nameNote that this residue was "LYS" in the PDB file; the use of .rtp entry "LYSH"indicates that the residue is protonated (the predominant state at neutralpH).atom: Atom namecgnr: Charge group numberCharge groups define units of integer charge; they aid in speeding upcalculationscharge: Self-explanatoryThe "qtot" descriptor keeps a running count of the total charge on the moleculemass: Also self-explanatorytypeB, chargeB, massB: Used for free energy perturbation (not discussed here) Subsequent sections include [ bonds ], [ pairs ], [ angles ], and [ dihedrals ]. Some of these sections are self-explanatory (bonds, angles, and dihedrals). The parameters and function types associated with these sections are elaborated on in Chapter 5 of the GROMACS manual. Special 1-4 interactions are included under "pairs" (section of the GROMACS manual).The remainder of the file involves defining a few other useful/necessary topologies, starting with position restraints. The "" file was generated by pdb2gmx; it defines a force constant used to keep atoms in place during equilibration (more on this later).; Include Position restraint file#ifdef POSRES#include ""#endifThis ends the "Protein_A" moleculetype definition. The remainder of the topology file is dedicated to defining other molecules and providing system-level descriptions. The next moleculetype (by default) is the solvent, in this case SPC/E water. Other typical choices for water include SPC, TIP3P, and TIP4P. We chose this by passing "-water spce" to pdb2gmx. For an excellent summary of the many different water models, click here, but be aware that not all of these models are present within GROMACS.; Include water topology#include ""#ifdef POSRES_WATER; Position restraint for each water oxygen[ position_restraints ]; i funct fcx fcy fcz1 1 1000 1000 1000#endifAs you can see, water can also be position-restrained, using a force constant (kpr) of 1000 kJ mol-1nm-2.Ion parameters are included next:; Include generic topology for ions#include ""Finally come system-level definitions. The [ system ] directive gives the name of the system that will be written to output files during the simulation. The [ molecules ] directive lists all of the molecules in the system.[ system ]; NameLYSOZYME[ molecules ]; Compound #molsProtein_A 1A few key notes about the [ molecules ] directive:1.The order of the listed molecules must exactly match the order of themolecules in the coordinate (in this case, .gro) file.2.The names listed must match the [ moleculetype ] name for each species,not residue names or anything else.If you fail to satisfy these concrete requirements at any time, you will get fatal errors from grompp (discussed later) about mismatched names, molecules not being found, or a number of others.Now that we have examined the contents of a topology file, we can continue building our system.Step Three: Defining the Unit Cell & Adding SolventNow that you are familiar with the contents of the GROMACS topology, it is time to continue building our system. In this example, we are going to be simulating a simple aqueous system. It is possible to simulate proteins and other molecules in different solvents, provided that good parameters are available for all species involved. There are two steps to defining the box and filling it with solvent:1.Define the box dimensions using editconf.2.Fill the box with water using genbox.You are now presented with a choice as to how to treat the unit cell. For the purpose of this tutorial, we will use a simple cubic box as the unit cell. As you becomemore comfortable with periodic boundary conditions and box types, I highly recommend the rhombic dodecahedron, as its volume is ~71% of the cubic box of the same periodic distance, thus saving on the number of water molecules that need to be added to solvate the protein.Let's define the box using editconf:editconf-f-o -c -d -bt cubicThe above command centers the protein in the box (-c), and places it at least nm from the box edge (-d. The box type is defined as a cube (-bt cubic). The distance to the edge of the box is an important parameter. Since we will be using periodic boundary conditions, we must satisfy the minimum image convention. That is, a protein should never see its periodic image, otherwise the forces calculated will be spurious. Specifying a solute-box distance of nm will mean that there are at least nm between any two periodic images of a protein. This distance will be sufficient for just about any cutoff scheme commonly used in simulations.Now that we have defined a box, we can fill it with solvent (water). Solvation is accomplished using genbox:genbox -cp-cs-o -pThe configuration of the protein (-cp) is contained in the output of the previous editconf step, and the configuration of the solvent (-cs) is part of the standard GROMACS installation. We are using , which is a generic equilibrated 3-point solvent model. You can use as the solvent configuration for SPC, SPC/E, or TIP3P water, since they are all three-point water models. The output is called , and we tell genboxthe name of the topology file so it can be modified. Note the changes to the [ molecules ] directive of :[ molecules ]; Compound #molsProtein_A 1SOL 10832What genbox has done is keep track of how many water molecules it has added, which it then writes to your topology to reflect the changes that have been made. Note that if you use any other (non-water) solvent, genbox will not make these changes to your topology! Its compatibility with updating water molecules is hard-coded.Step Four: Adding IonsWe now have a solvated system that contains a charged protein. The output of pdb2gmx told us that the protein has a net charge of +8e (based on its amino acid composition). If you missed this information in the pdb2gmx output, look at the last line of your [ atoms ] directive in ; it should read (in part) "qtot 8." Since life does not exist at a net charge, we must add ions to our system.The tool for adding ions within GROMACS is called genion. What genion does is read through the topology and replace water molecules with the ions that the user specifies. The input is called a run input file, which has an extension of .tpr; this file is produced by the GROMACS tool grompp (GROMACS pre-processor), which will also be used later when we run our first simulation. What grompp does is process the coordinate file and topology (which describes the molecules) to generate anatomic-level input (.tpr). The .tpr file contains all the parameters for all of the atoms in the system.To produce a .tpr file with grompp, we will need an additional input file, with the extension .mdp (molecular dynamics parameter file); grompp will assemble the parameters specified in the .mdp file with the coordinates and topology information to generate a .tpr file.An .mdp file is normally used to run energy minimization or an MD simulation, but in this case is simply used to generate an atomic description of the system. An example .mdp file (the one we will use) can be downloaded here.In reality, the .mdp file used at this step can contain any legitimate combination of parameters. I typically use an energy-minimization script, because they are very basic and do not involve any complicated parameter combinations.Assemble your .tpr file with the following:grompp -f -c-p -oNow we have an atomic-level description of our system in the binary file . We will pass this file to genion:genion -s-o-p -pname NA-nname CL-nn 8When prompted, choose group 13 "SOL" for embedding ions. You do not want to replace parts of your protein with ions.In the genion command, we provide the structure/state file (-s) as input, generate a .gro file as output (-o), process the topology (-p) to reflect the removal of water molecules and addition of ions, define positive and negative ion names (-pname and-nname, respectively), and tell genion to add only the ions necessary to neutralize the net charge on the protein by adding the correct number of negative ions (-nn 8). You could also use genion to add a specified concentration of ions in addition to simply neutralizing the system by specifying the -neutral and -conc options in conjunction. Refer to the genion man page for information on how to use these options. The names of the ions specified with -pname and -nname were force field-specific in previous versions of GROMACS, but have been standardized as of version . The specified atom names are always the elemental symbol in all capital letters, along with the [ moleculetype ]. Residue names may or may not append the sign of the charge (+/-). Refer to for proper nomenclature if you encounter difficulties.Your [ molecules ] directive should now look like:[ molecules ]; Compound #molsProtein_A 1SOL 10824CL 8Step Five: Energy MinimizationThe solvated, electroneutral system is now assembled. Before we can begin dynamics, we must ensure that the system has no steric clashes or inappropriate geometry. The structure is relaxed through a process called energy minimization (EM).The process for EM is much like the addition of ions. We are once again going to use grompp to assemble the structure, topology, and simulation parameters into a binary input file (.tpr), but this time, instead of passing the .tpr to genion, we will run the energy minimization through the GROMACS MD engine, mdrun.Assemble the binary input using grompp using this input parameter file: grompp -f -c -p-oMake sure you have been updating your file when running genbox and genion, or else you will get lots of nasty error messages ("number of coordinates in coordinate file does not match topology," etc).We are now ready to invoke mdrun to carry out the EM:mdrun -v-deffnm emThe -v flag is for the impatient: it makes mdrun verbose, such that it prints its progress to the screen at every step. The -deffnm flag will define the file names of the input and output. So, if you did not name your grompp output "," you will have to explicitly specify its name with the mdrun -s flag. In our case, we will get the following files:: ASCII-text log file of the EM process: Binary energy file: Binary full-precision trajectory: Energy-minimized structureThere are two very important factors to evaluate to determine if EM was successful. The first is the potential energy (printed at the end of the EM process, even without-v). Epot should be negative, and (for a simple protein in water) on the order of 105-106, depending on the system size and number of water molecules. The second important feature is the maximum force, Fmax, the target for which was set in - "emtol = " - indicating a target Fmax of no greater than 1000 kJ mol-1nm-1. It is possible to arrive at a reasonable Epot with Fmax > emtol. If this happens, your system may not be stable enough for simulation. Evaluate why it may be happening, and perhaps change your minimization parameters (integrator, emstep, etc).Let's do a bit of analysis. The file contains all of the energy terms that GROMACS collects during EM. You can analyze any .edr file using the GROMACS tools g_energy: g_energy -f-oAt the prompt, type "10 0" to select Potential (10); zero (0) terminates input. You will be shown the average of Epot, and a file called "" will be written. To plot this data, you will need the Xmgrace plotting tool. The resulting plot should look something like this, demonstrating the nice, steady convergence of Epot:Now that our system is at an energy minimum, we can begin real dynamics.Step Six: EquilibrationEM ensured that we have a reasonable starting structure, in terms of geometry and solvent orientation. To begin real dynamics, we must equilibrate the solvent and ions around the protein. If we were to attempt unrestrained dynamics at this point, the system may collapse. The reason is that the solvent is mostly optimized within itself, and not necessarily with the solute. It needs to be brought to the temperaturewe wish to simulate and establish the proper orientation about the solute (the protein). After we arrive at the correct temperature (based on kinetic energies), we will apply pressure to the system until it reaches the proper density. Remember that file that pdb2gmx generated a long time ago? We're going to use it now. The purpose of is to apply a position restraining force on the heavy atoms of the protein (anything that is not a hydrogen). Movement is permitted, but only after overcoming a substantial energy penalty. The utility of position restraints is that they allow us to equilibrate our solvent around our protein, without the added variable of structural changes in the protein.Equilibration is often conducted in two phases. The first phase is conducted under an NVT ensemble (constant Number of particles, Volume, and Temperature). This ensemble is also referred to as "isothermal-isochoric" or "canonical." The timeframe for such a procedure is dependent upon the contents of the system, but in NVT, the temperature of the system should reach a plateau at the desired value. If the temperature has not yet stabilized, additional time will be required. Typically, 50-100 ps should suffice, and we will conduct a 100-ps NVT equilibration for this exercise. Depending on your machine, this may take a while (just over an hour on a dual-core MacBook). Get the .mdp file here.We will call grompp and mdrun just as we did at the EM step:grompp -f-c -p -omdrun -deffnm nvtA full explanation of the parameters used can be found in the GROMACS manual, in addition to the comments provided. Take note of a few parameters in the .mdp file:gen_vel = yes: Initiates velocity generation. Using different random seeds (gen_seed) gives different initial velocities, and thus multiple (different) simulations can be conducted from the same starting structure.tcoupl = V-rescale: The velocity rescaling thermostat is an improvement upon the Berendsen weak coupling method, which did not reproduce a correct kinetic ensemble.pcoupl = no: Pressure coupling is not applied.Let's analyze the temperature progression, again using g_energy:g_energy -fType "15 0" at the prompt to select the temperature of the system and exit. The resulting plot should look something like the following:From the plot, it is clear that the temperature of the system quickly reaches the target value (300 K), and remains stable over the remainder of the equilibration. For this system, a shorter equilibration period (on the order of 50 ps) may have been adequate.Step Seven: Equilibration, Part 2The previous step, NVT equilibration, stabilized the temperature of the system. Prior to data collection, we must also stabilize the pressure (and thus also the density) of the system. Equilibration of pressure is conducted under an NPT ensemble, wherein the Number of particles, Pressure, and Temperature are allconstant. The ensemble is also called the "isothermal-isobaric" ensemble, and most closely resembles experimental conditions.The .mdp file used for a 100-ps NPT equilibration can be found here. It is not drastically different from the parameter file used for NVT equilibration. Note the addition of the pressure coupling section, using the Parrinello-Rahman barostat.A few other changes:continuation = yes: We are continuing the simulation from the NVT equilibration phasegen_vel = no: Velocities are read from the trajectory (see below)We will call grompp and mdrun just as we did for NVT equilibration. Note that we are now including the -t flag to include the checkpoint file from the NVT equilibration; this file contains all the necessary state variables to continue our simulation. To conserve the velocities produced during NVT, we must include this file. The coordinate file (-c) is the final output of the NVT simulation.grompp -f -c-t-p -omdrun -deffnm nptLet's analyze the pressure progression, again using g_energy:g_energy -f -oType "16 0" at the prompt to select the pressure of the system and exit. The resulting plot should look something like the following:The pressure value fluctuates widely over the course of the 100-ps equilibration phase, but this behavior is not unexpected. The running average of these data are plotted as the red line in the plot. Over the course of the equilibration, the average value of the pressure is bar.Let's take a look at density as well, this time using g_energy and entering "22 0" at the prompt.g_energy -f -oAs with the pressure, the running average of the density is also plotted in red. The average value over the course of 100 ps is kg m-3, close to the experimental value of 1000 kg m-3. The parameters for the SPC/E water model closely replicate experimental values for water. The density values are very stable over time, indicating that the system is well-equilibrated now with respect to pressure and density.Step Eight: Production MDUpon completion of the two equilibration phases, the system is now well-equilibrated at the desired temperature and pressure. We are now ready to release the position restraints and run production MD for data collection. The process is just like we have seen before, as we will make use of the checkpoint file (which in this case now contains preserve pressure coupling information) to grompp. We will run a 1-ns MD simulation, the script for which can be found here.grompp -f -c -t -p -omdrun -deffnm md_0_1The mdrun step is definitely best run in parallel on a cluster, since it will take several hours to complete. The proper command for executing such a job would be: mpirun -np X mdrun_mpi -deffnm md_0_1where X indicates the number of processors to be used for the simulation. Near the end of the grompp output, you should have seen a line like:Estimate for the relative computational load of the PME mesh part:The estimate for PME load will dictate how many processors should be dedicated to the PME calculation, and how many for the PP calculations. Refer to the GROMACS 4 publication and the manual for details. For a cubic box, the optimal setup will have a PME load of (3:1 PP:PME - we're in luck!); for a dodecahedral box, the optimal PME load is (2:1 PP:PME). When executing mdrun, the program should automatically determine the best number of processors to assign for the PP and PME calculations. Thus, make sure you indicate an appropriate number of nodes for your calculation (the value of -np X), so that you can get the best performance. For this system, I achieved roughly 14 ns/day performance on 24 CPU's (18 PP, 6 PME).Step Nine: AnalysisNow that we have simulated our protein, we should run some analysis on the system. What types of data are important? This is an important question to ask before running the simulation, so you should have some ideas about the types of data you will wantto collect in your own systems. For this tutorial, a few basic tools will be introduced.The first is trjconv, which is used as a post-processing tool to strip out coordinates, correct for periodicity, or manually alter the trajectory (time units, frame frequency, etc). For this exercise, we will use trjconv to account for any periodicity in the system. The protein will diffuse through the unit cell, and may appear to "jump" across to the other side of the box. To account for such actions, issue the following:trjconv -s-f -o -pbc mol-ur compactSelect 0 ("System") for output. We will conduct all our analyses on this "corrected" trajectory. Let's look at structural stability first. GROMACS has a built-in utility for RMSD calculations called g_rms. To use g_rms, issue this command:g_rms -s-f-o -tu nsChoose 4 ("Backbone") for both the least squares fit and the group for RMSD calculation. The -tu flag will output the results in terms of ns, even though the trajectory was written in ps. This is done for clarity of the output (especially if you have a long simulation - 1e+06 ps does not look as nice as 100 ns). The output plot will show the RMSD relative to the structure present in the minimized, equilibrated system:If we wish to calculate RMSD relative to the crystal structure, we could issue the following:g_rms -s -f -o -tu nsThe result will be something like:Both plots show the RMSD levels off to ~ nm (1 Å), indicating that the structure is very stable. Subtle differences between the plots indicate that the structure at t = 0 ns is slightly different from this crystal structure. This is to be expected, since it has been energy-minimized, and because the position restraints are not 100% perfect, as discussed previously.The radius of gyration of a protein is a measure of its compactness. If a protein is stably folded, it will likely maintain a relatively steady value of Rg. If a protein unfolds, its Rg will change over time. Let's analyze the radius of gyration for lysozyme in our simulation:g_gyrate -s -f-oWe can see from the reasonably invariant Rg values that the protein remains very stable, in its compact (folded) form over the course of 1 ns at 300 K. This result is not unexpected, but illustrates an advanced capacity of GROMACS analysis that comes built-in.SummaryYou have now conducted a molecular dynamics simulation with GROMACS, and analyzed some of the results. This tutorial should not be viewed as comprehensive. There are many more types of simulations that one can conduct with GROMACS (free energy calculations, non-equilibrium MD, and normal modes analysis, just to name a few).。

gromacs分子计算模拟流程

gromacs分子计算模拟流程English Answer:GROMACS Molecular Simulation Workflow.GROMACS is an open-source molecular simulation software package that is used to simulate the behavior of biomolecules, such as proteins, DNA, and lipids. The GROMACS workflow typically consists of the following steps:1. System preparation: This involves creating a molecular system that includes the molecule(s) of interest, as well as the surrounding solvent and any other necessary components. The system is typically prepared using a molecular editor, such as VMD or PyMOL.2. Force field selection: A force field is a set of equations that describe the interactions between atoms and molecules. The force field is selected based on the system being simulated.3. Simulation setup: This involves setting up the simulation parameters, such as the temperature, pressure, and time step. The simulation is also set up to output the desired data, such as the coordinates of the molecules and their energies.4. Simulation: The simulation is performed using a molecular dynamics engine. The engine integrates the equations of motion to update the positions and velocities of the molecules.5. Analysis: The simulation data is analyzed to extract information about the behavior of the system. This can be done using a variety of tools, such as VMD, PyMOL, and GROMACS utilities.Chinese Answer:GROMACS分子计算模拟流程。

gromacs使用手册

gromacs使用手册（原创实用版）目录1.Gromacs 简介2.Gromacs 的功能3.Gromacs 的使用方法4.Gromacs 的应用实例5.Gromacs 的未来发展正文1.Gromacs 简介Gromacs 是一个开源的生物大分子模拟软件，主要用于分子动力学和势能计算。

它由 Warren P.Moran 等人开发，最早的版本于 1996 年发布。

Gromacs 名字来源于“Groningen 分子模拟器”，这反映出它的起源地——荷兰格罗宁根大学。

2.Gromacs 的功能Gromacs 具有多种功能，包括：- 分子动力学模拟：Gromacs 可以使用不同的力场进行分子动力学模拟，包括经典的力场如 CHARMM、OPLS、AMBER 等，也可以使用更高级的力场如 GB/SA、ITS/SA 等。

- 势能计算：Gromacs 可以计算生物大分子的势能，包括分子间的相互作用能、分子内能等。

- 模拟过程中的能量计算：Gromacs 可以在模拟过程中实时计算各种能量，如动能、势能、内能等。

- 轨迹文件的输出：Gromacs 可以输出轨迹文件，方便用户进行后续的分析。

- 模拟过程中的原子运动轨迹可视化：Gromacs 具有可视化工具，可以实时显示原子在模拟过程中的运动轨迹。

3.Gromacs 的使用方法Gromacs 的使用方法相对简单，用户只需要按照以下步骤进行操作：- 准备输入文件：用户需要根据需要编写输入文件，包括分子结构文件、模拟参数文件、力场文件等。

- 运行 Gromacs：用户需要在终端中输入 Gromacs 命令，指定输入文件和输出文件。

- 查看输出文件：模拟完成后，用户可以在指定的输出文件中查看模拟结果。

4.Gromacs 的应用实例Gromacs 广泛应用于生物大分子的模拟研究，包括蛋白质结构预测、药物设计、生物材料研究等。

1、下载文档前请自行甄别文档内容的完整性，平台不提供额外的编辑、内容补充、找答案等附加服务。
2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
3、如文档侵犯您的权益，请联系客服反馈,我们会尽快为您处理(人工客服工作时间：9:00-18:30)。

Lysozyme in WaterJustin LemkulDepartment of Biochemistry, Virginia TechThis example will guide a new user through the process of setting up a simulation system containing a protein (lysozyme) in a box of water, with ions. Each step will contain an explanation of input and output, using typical settings for general use.This tutorial assumes you are using a GROMACS version in the 4.5.x series.Step One: Prepare the TopologyWe must download the protein structure file we will be working with. For this tutorial, we will utilize hen egg white lysozyme (PDB code 1AKI). Go to the RCSB website and download the PDB text for the crystal structure.Once you have downloaded the structure, you can visualize the structure using a viewing program such as VMD, Chimera, PyMOL, etc. Once you've had a look at the molecule, you are going to want to strip out the crystal waters. Use a plain text editor like vi, em acs (Linux/Mac), or Notepad (Windows). Do not use word processing software! Delete the lines corresponding to these molecules (residue "HOH" in the PDB file). Note that such a procedure is not universally appropriate (i.e., the case of a bound active site water molecule). For our intentions here, we do not need crystal water.Always check your .pdb file for entries listed under the comment MISSING, as these entries indicate either atoms or whole residues that are not present in the crystal structure. Terminal regions may be absent, and may not present a problem for dynamics. Incomplete internal sequences or any amino acid residues that have missing atoms will cause pdb2gmx to fail. These missing atoms/residues must be modeled in using other software packages. Also note that pdb2gmx is not magic. It cannot generate topologies for arbitrary molecules, just the residues defined by the force field (in the *.rtp files - generally proteins, nucleic acids, and a very finite amount of cofactors, like NAD(H) and ATP).Now that the crystal waters are gone and we have verified that all the necessary atoms are present, the PDB file should contain only protein atoms, and is ready to be input into the first GROMACS tool, pdb2gmx. The purpose of pdb2gmx is to generate three files:1.The topology for the molecule.2. A position restraint file.3. A post-processed structure file.The topology (topol.top by default) contains all the information necessary to define the molecule within a simulation. This information includes nonbonded parameters (atom types and charges) as well as bonded parameters (bonds, angles, and dihedrals). We will take a more detailed look at the topology once it has been generated.Execute pdb2gmx by issuing the following command:pdb2gmx -f 1AKI.pdb-o 1AKI_processed.gro-water spceThe structure will be processed by pdb2gmx, and you will be prompted to choose a forcefield:Select the Force Field:From '/usr/local/gromacs/share/gromacs/top':1: AMBER03 force field (Duan et al., J. Comp. Chem. 24, 1999-2012, 2003)2: AMBER94 force field (Cornell et al., JACS 117, 5179-5197, 1995)3: AMBER96 force field (Kollman et al., Acc. Chem. Res. 29, 461-469, 1996)4: AMBER99 force field (Wang et al., J. Comp. Chem. 21, 1049-1074, 2000)5: AMBER99SB force field (Hornak et al., Proteins 65, 712-725, 2006)6: AMBER99SB-ILDN force field (Lindorff-Larsen et al., Proteins 78, 1950-58, 2010)7: AMBERGS force field (Garcia & Sanbonmatsu, PNAS 99, 2782-2787, 2002)8: CHARMM27 all-atom force field (with CMAP) - version 2.09: GROMOS96 43a1 force field10: GROMOS96 43a2 force field (improved alkane dihedrals)11: GROMOS96 45a3 force field (Schuler JCC 2001 22 1205)12: GROMOS96 53a5 force field (JCC 2004 vol 25 pag 1656)13: GROMOS96 53a6 force field (JCC 2004 vol 25 pag 1656)14: OPLS-AA/L all-atom force field (2001 aminoacid dihedrals)15: [DEPRECATED] Encad all-atom force field, using full solvent charges16: [DEPRECATED] Encad all-atom force field, using scaled-down vacuum charges17: [DEPRECATED] Gromacs force field (see manual)18: [DEPRECATED] Gromacs force field with hydrogens for NMRThe force field will contain the information that will be written to the topology. This is a very important choice! You should always read thoroughly about each force field and decide which is most applicable to your situation. For this tutorial, we will use the all-atom OPLS force field, so type 14 at the command prompt, followed by 'Enter'.There are many other options that can be passed to pdb2gmx. Some are listed here:-ignh: Ignore H atoms in the PDB file; especially useful for NMR structures. Otherwise, if H atoms are present, they must be in the correct order and named exactly how GR OMACS expects them to be.-ter: Interactively assign charge states for N- and C-termini.-inter: Interactively assign charge states for Glu, Asp, Lys, Arg, and His; assign disulfides to Cys.You have now generated three new files: 1AKI_processed.gro, topol.top, and posre.itp. 1AKI_processed.gro is a GROMACS-formatted structure file that contains all the atoms defined within the force field (i.e., H atoms have been added to the amino acids in the protein). The topol.top file is the system topology (more on this in a minute). The posre.itp file contains information used to restrain the positions of heavy atoms (more on this later).Step Two: Examine the TopologyLet's look at what is in the output topology (topol.top). Again, using a plain text editor, inspect its contents. After several comment lines (preceded by ;), you will find the following:#include "oplsaa.ff/forcefield.itp"This line calls the parameters within the OPLS-AA force field. It is at the beginning of the file, indicating that all subsequent parameters are derived from this force field. The next important line is [ moleculetype ], below which you will find; Name nrexclProtein_A 3The name "Protein_A" defines the molecule name, based on the fact that the protein was labeled as chain A in the PDB file. There are 3 exclusions for bonded neighbors. More information on exclusions can be found in the GROMACS manual; a discussion of this information is beyond the scope of this tutorial.The next section defines the [ atoms ] in the protein. The information is presented as columns:[ atoms ]; nr type resnr residue atom cgnr charge mass typeB chargeB massB; residue 1 LYS rtp LYSH q +2.01 opls_287 1 LYS N 1 -0.3 14.0067 ; qtot -0.32 opls_290 1 LYS H1 1 0.33 1.008 ; qtot 0.033 opls_290 1 LYS H2 1 0.33 1.008 ; qtot 0.364 opls_290 1 LYS H3 1 0.33 1.008 ; qtot 0.695 opls_293B 1 LYS CA 1 0.25 12.011 ; qtot 0.946 opls_140 1 LYS HA 1 0.06 1.008 ; qtot 1The interpretation of this information is as follows:∙nr: Atom number∙type: Atom type∙resnr: Amino acid residue number∙residue: The amino acid residue nameNote that this residue was "LYS" in the PDB file; the use of .rtp entry "LYSH" indicates that the residue is protonated (the predominant state at neutral pH).∙atom: Atom name∙cgnr: Charge group numberCharge groups define units of integer charge; they aid in speeding up calculations∙charge: Self-explanatoryThe "qtot" descriptor keeps a running count of the total charge on the molecule∙mass: Also self-explanatory∙typeB, chargeB, massB: Used for free energy perturbation (not discussed here)Subsequent sections include [ bonds ], [ pairs ], [ angles ], and [ dihedrals ]. Some of these sections are self-explanatory (bonds, angles, and dihedrals). The parameters and function types associated with these sections are elaborated on in Chapter 5 of the GROMACS manual. Special 1-4 interactions are included under "pairs" (section 5.3.4 of the GROMACS manual).The remainder of the file involves defining a few other useful/necessary topologies, starting with position restraints. The "posre.itp" file was generated by pdb2gmx; it defines a force constant used to keep atoms in place during equilibration (more on this later).; Include Position restraint file#ifdef POSRES#include "posre.itp"#endifThis ends the "Protein_A" moleculetype definition. The remainder of the topology file is dedicated to defining other molecules and providing system-level descriptions. The next moleculetype (by default) is the solvent, in this case SPC/E water. Other typical choices for water include SPC, TIP3P, and TIP4P. We chose this by passing "-water spce" to pdb2gmx. For an excellent summary of the many different water models, click here, but be aware that not all of these models are present within GROMACS.; Include water topology#include "oplsaa.ff/spce.itp"#ifdef POSRES_WATER; Position restraint for each water oxygen[ position_restraints ]; i funct fcx fcy fcz1 1 1000 1000 1000#endifAs you can see, water can also be position-restrained, using a force constant (kpr) of 1000 kJ mol-1 nm-2.Ion parameters are included next:; Include generic topology for ions#include "oplsaa.ff/ions.itp"Finally come system-level definitions. The [ system ] directive gives the name of the system that will be written to output files during the simulation. The [ molecules ] directive lists all of the molecules in the system.[ system ]; NameLYSOZYME[ molecules ]; Compound #molsProtein_A 1A few key notes about the [ molecules ] directive:1.The order of the listed molecules must exactly match the order of the molecules inthe coordinate (in this case, .gro) file.2.The names listed must match the [ moleculetype ] name for each species, not residuenames or anything else.If you fail to satisfy these concrete requirements at any time, you will get fatal errors from grompp (discussed later) about mismatched names, molecules not being found, or a number of others.Now that we have examined the contents of a topology file, we can continue building our system.Step Three: Defining the Unit Cell & Adding SolventNow that you are familiar with the contents of the GROMACS topology, it is time to continue building our system. In this example, we are going to be simulating a simple aqueous system. It is possible to simulate proteins and other molecules in different solvents, provided that good parameters are available for all species involved.There are two steps to defining the box and filling it with solvent:1.Define the box dimensions using editconf.2.Fill the box with water using genbox.You are now presented with a choice as to how to treat the unit cell. For the purpose of this tutorial, we will use a simple cubic box as the unit cell. As you become more comfortable with periodic boundary conditions and box types, I highly recommend the rhombicdodecahedron, as its volume is ~71% of the cubic box of the same periodic distance, thus saving on the number of water molecules that need to be added to solvate the protein.Let's define the box using editconf:editconf-f 1AKI_processed.gro-o 1AKI_newbox.gro-c -d 1.0 -bt cubicThe above command centers the protein in the box (-c), and places it at least 1.0 nm from the box edge (-d 1.0). The box type is defined as a cube (-bt cubic). The distance to the edge of the box is an important parameter. Since we will be using periodic boundary conditions, we must satisfy the minimum image convention. That is, a protein should never see its periodic image, otherwise the forces calculated will be spurious. Specifying a solute-box distance of 1.0 nm will mean that there are at least 2.0 nm between any two periodic images of a protein. This distance will be sufficient for just about any cutoff scheme commonly used in simulations.Now that we have defined a box, we can fill it with solvent (water). Solvation is accomplished using genbox:genbox -cp 1AKI_newbox.gro-cs spc216.gro-o 1AKI_solv.gro-p topol.topThe configuration of the protein (-cp) is contained in the output of the previous editconf step, and the configuration of the solvent (-cs) is part of the standard GROMACS installation. We are using spc216.gro, which is a generic equilibrated 3-point solvent model. You can use spc216.gro as the solvent configuration for SPC, SPC/E, or TIP3P water, since they are all three-point water models. The output is called 1AKI_solv.gro, and we tell genbox the name of the topology file (topol.top) so it can be modified. Note the changes to the [ molecules ] directive of topol.top:[ molecules ]; Compound #molsProtein_A 1SOL 10832What genbox has done is keep track of how many water molecules it has added, which it then writes to your topology to reflect the changes that have been made. Note that if you use any other (non-water) solvent, genbox will not make these changes to your topology! Its compatibility with updating water molecules is hard-coded.Step Four: Adding IonsWe now have a solvated system that contains a charged protein. The output of pdb2gmx told us that the protein has a net charge of +8e (based on its amino acid composition). If you missed this information in the pdb2gmx output, look at the last line of your [ atoms ] directive in topol.top; it should read (in part) "qtot 8." Since life does not exist at a net charge, we must add ions to our system.The tool for adding ions within GROMACS is called genion. What genion does is read through the topology and replace water molecules with the ions that the user specifies. The input is called a run input file, which has an extension of .tpr; this file is produced by the GROMACS tool grompp(GROMACS pre-processor), which will also be used later when we run our first simulation. What grompp does is process the coordinate file and topology (which describes the molecules) to generate an atomic-level input (.tpr). The .tpr file contains all the parameters for all of the atoms in the system.To produce a .tpr file with grompp, we will need an additional input file, with the extension .mdp (molecular dynamics parameter file); grompp will assemble the parameters specified in the .mdp file with the coordinates and topology information to generate a .tpr file.An .mdp file is normally used to run energy minimization or an MD simulation, but in this case is simply used to generate an atomic description of the system. An example .mdp file (the one we will use) can be downloaded here.In reality, the.mdp file used at this step can contain any legitimate combination of parameters. I typically use an energy-minimization script, because they are very basic and do not involve any complicated parameter combinations.Assemble your .tpr file with the following:grompp -f ions.mdp-c 1AKI_solv.gro-p topol.top -o ions.tprNow we have an atomic-level description of our system in the binary file ions.tpr. We will pass this file to genion:genion -s ions.tpr-o 1AKI_solv_ions.gro-p topol.top-pname NA-nname CL-nn 8When prompted, choose group 13 "SOL" for embedding ions. You do not want to replace parts of your protein with ions.In the genion command, we provide the structure/state file (-s) as input, generate a .gro file as output (-o), process the topology (-p) to reflect the removal of water molecules and addition of ions, define positive and negative ion names (-pname and -nname, respectively), and tell genion to add only the ions necessary to neutralize the net charge on the protein by adding the correct number of negative ions (-nn 8). You could also use genion to add a specified concentration of ions in addition to simply neutralizing the system by specifyingthe -neutral and -conc options in conjunction. Refer to the genion man page for information on how to use these options.The names of the ions specified with -pname and -nname were force field-specific in previous versions of GROMACS, but have been standardized as of version 4.5. The specified atom names are always the elemental symbol in all capital letters, along with the [ moleculetype ]. Residue names may or may not append the sign of the charge (+/-). Refer to ions.itp for proper nomenclature if you encounter difficulties.Your [ molecules ] directive should now look like:[ molecules ]; Compound #molsProtein_A 1SOL 10824CL 8Step Five: Energy MinimizationThe solvated, electroneutral system is now assembled. Before we can begin dynamics, we must ensure that the system has no steric clashes or inappropriate geometry. The structure is relaxed through a process called energy minimization (EM).The process for EM is much like the addition of ions. We are once again going to use grompp to assemble the structure, topology, and simulation parameters into a binary input file (.tpr), but this time, instead of passing the .tpr to genion, we will run the energy minimization through the GROMACS MD engine, mdrun.Assemble the binary input using grompp using this input parameter file:grompp -f minim.mdp-c 1AKI_solv_ions.gro -p topol.top-o em.tprMake sure you have been updating your topol.top file when running genbox and genion, or else you will get lots of nasty error messages ("number of coordinates in coordinate file does not match topology," etc).We are now ready to invoke mdrun to carry out the EM:mdrun -v-deffnm emThe -v flag is for the impatient: it makes mdrun verbose, such that it prints its progress to the screen at every step. The -deffnm flag will define the file names of the input and output. So, if you did not name your grompp output "em.tpr," you will have to explicitly specify its name with the mdrun -s flag. In our case, we will get the following files:∙em.log: ASCII-text log file of the EM process∙em.edr: Binary energy file∙em.trr: Binary full-precision trajectory∙em.gro: Energy-minimized structureThere are two very important factors to evaluate to determine if EM was successful. The first is the potential energy (printed at the end of the EM process, even without -v). Epot should be negative, and (for a simple protein in water) on the order of 105-106, depending on the system size and number of water molecules. The second important feature is the maximum force, Fmax, the target for which was set in minim.mdp- "emtol = 1000.0" - indicating a target Fmax of no greater than 1000 kJ mol-1 nm-1. It is possible to arrive at a reasonable Epot with Fmax > emtol. If this happens, your system may not be stable enough for simulation. Evaluate why it may be happening, and perhaps change your minimization parameters (integrator, emstep, etc).Let's do a bit of analysis. The em.edr file contains all of the energy terms that GROMACS collects during EM. You can analyze any .edr file using the GROMACS tools g_energy:g_energy -f em.edr-o potential.xvgAt the prompt, type "10 0" to select Potential (10); zero (0) terminates input. You will be shown the average of Epot, and a file called "potential.xvg" will be written. To plot this data, you will need the Xmgrace plotting tool. The resulting plot should look something like this, demonstrating the nice, steady convergence of Epot:Now that our system is at an energy minimum, we can begin real dynamics.Step Six: EquilibrationEM ensured that we have a reasonable starting structure, in terms of geometry and solvent orientation. To begin real dynamics, we must equilibrate the solvent and ions around the protein. If we were to attempt unrestrained dynamics at this point, the system may collapse. The reason is that the solvent is mostly optimized within itself, and not necessarily with the solute. It needs to be brought to the temperature we wish to simulate and establish the proper orientation about the solute (the protein). After we arrive at the correct temperature (based on kinetic energies), we will apply pressure to the system until it reaches the proper density.Remember that posre.itp file that pdb2gmx generated a long time ago? We're going to use it now. The purpose of posre.itp is to apply a position restraining force on the heavy atoms of the protein (anything that is not a hydrogen). Movement is permitted, but only after overcoming a substantial energy penalty. The utility of position restraints is that they allow us to equilibrate our solvent around our protein, without the added variable of structural changes in the protein.Equilibration is often conducted in two phases. The first phase is conducted under an NVT ensemble (constant Number of particles, Volume, and Temperature). This ensemble is also referred to as "isothermal-isochoric" or "canonical." The timeframe for such a procedure is dependent upon the contents of the system, but in NVT, the temperature of the system should reach a plateau at the desired value. If the temperature has not yet stabilized,additional time will be required. Typically, 50-100 ps should suffice, and we will conduct a 100-ps NVT equilibration for this exercise. Depending on your machine, this may take a while (just over an hour on a dual-core MacBook). Get the .mdp file here.We will call grompp and mdrun just as we did at the EM step:grompp -f nvt.mdp-c em.gro -p topol.top-o nvt.tprmdrun -deffnm nvtA full explanation of the parameters used can be found in the GROMACS manual, in addition to the comments provided. Take note of a few parameters in the .mdp file:∙gen_vel = yes: Initiates velocity generation. Using different random seeds (gen_seed) gives different initial velocities, and thus multiple (different) simulations can be conducted from the same starting structure.∙tcoupl = V-rescale: The velocity rescaling thermostat is an improvement upon the Berendsen weak coupling method, which did not reproduce a correct kinetic ensemble.∙pcoupl = no: Pressure coupling is not applied.Let's analyze the temperature progression, again using g_energy:g_energy -f nvt.edrType "15 0" at the prompt to select the temperature of the system and exit. The resulting plot should look something like the following:From the plot, it is clear that the temperature of the system quickly reaches the target value (300 K), and remains stable over the remainder of the equilibration. For this system, a shorter equilibration period (on the order of 50 ps) may have been adequate.Step Seven: Equilibration, Part 2The previous step, NVT equilibration, stabilized the temperature of the system. Prior to data collection, we must also stabilize the pressure (and thus also the density) of the system. Equilibration of pressure is conducted under an NPT ensemble, wherein the Number of particles, Pressure, and Temperature are all constant. The ensemble is also called the "isothermal-isobaric" ensemble, and most closely resembles experimental conditions.The .mdp file used for a 100-ps NPT equilibration can be found here. It is not drastically different from the parameter file used for NVT equilibration. Note the addition of the pressure coupling section, using the Parrinello-Rahman barostat.A few other changes:∙continuation = yes: We are continuing the simulation from the NVT equilibration phase ∙gen_vel = no: Velocities are read from the trajectory (see below)We will call grompp and mdrun just as we did for NVT equilibration. Note that we are now including the -t flag to include the checkpoint file from the NVT equilibration; this file contains all the necessary state variables to continue our simulation. To conserve the velocities produced during NVT, we must include this file. The coordinate file (-c) is the final output of the NVT simulation.grompp -f npt.mdp -c nvt.gro-t nvt.cpt -p topol.top -o npt.tprmdrun -deffnm nptLet's analyze the pressure progression, again using g_energy:g_energy -f npt.edr -o pressure.xvgType "16 0" at the prompt to select the pressure of the system and exit. The result ing plot should look something like the following:The pressure value fluctuates widely over the course of the 100-ps equilibration phase, but this behavior is not unexpected. The running average of these data are plotted as the red line in the plot. Over the course of the equilibration, the average value of the pressure is 1.05 bar.Let's take a look at density as well, this time using g_energy and entering "22 0" at the prompt.g_energy -f npt.edr -o density.xvgAs with the pressure, the running average of the density is also plotted in red. The average value over the course of 100 ps is 998.3 kg m-3, close to the experimental value of 1000 kg m-3. The parameters for the SPC/E water model closely replicate experimental values for water. The density values are very stable over time, indicating that the system is well-equilibrated now with respect to pressure and density.Step Eight: Production MDUpon completion of the two equilibration phases, the system is now well-equilibrated at the desired temperature and pressure. We are now ready to release the position restraints and run production MD for data collection. The process is just like we have seen before, as we will make use of the checkpoint file (which in this case now contains preserve pressure coupling information) to grompp. We will run a 1-ns MD simulation, the script for which can be found here.grompp -f md.mdp-c npt.gro-t npt.cpt-p topol.top -o md_0_1.tprmdrun -deffnm md_0_1The mdrun step is definitely best run in parallel on a cluster, since it will take several hours to complete. The proper command for executing such a job would be:mpirun -np X mdrun_mpi -deffnm md_0_1where X indicates the number of processors to be used for the simulation. Near the end of the grompp output, you should have seen a line like:Estimate for the relative computational load of the PME mesh part: 0.25The estimate for PME load will dictate how many processors should be dedicated to the PME calculation, and how many for the PP calculations. Refer to the GROMACS 4 publication and the manual for details. For a cubic box, the optimal setup will have a PME load of 0.25 (3:1 PP:PME - we're in luck!); for a dodecahedral box, the optimal PME load is 0.33 (2:1 PP:PME). When executing mdrun, the program should automatically determine the best number of processors to assign for the PP and PME calculations. Thus, make sure you indicate an appropriate number of nodes for your calculation (the value of -np X), so that you can get the best performance. For this system, I achieved roughly 14 ns/day performance on 24 CPU's (18 PP, 6 PME).Step Nine: AnalysisNow that we have simulated our protein, we should run some analysis on the system. What types of data are important? This is an important question to ask before running the simulation, so you should have some ideas about the types of data you will want to collect in your own systems. For this tutorial, a few basic tools will be introduced.The first is trjconv, which is used as a post-processing tool to strip out coordinates, correct for periodicity, or manually alter the trajectory (time units, frame frequency, etc). For this exercise, we will use trjconv to account for any periodicity in the system. The protein will diffuse through the unit cell, and may appear to "jump" across to the other side of the box. To account for such actions, issue the following:trjconv -s md_0_1.tpr-f md_0_1.xtc-o md_0_1_noPBC.xtc -pbc mol-ur compactSelect 0 ("System") for output. We will conduct all our analyses on this "corrected" trajectory. Let's look at structural stability first. GROMACS has a built-in utility for RMSD calculations called g_rms. To use g_rms, issue this command:g_rms -s md_0_1.tpr-f md_0_1_noPBC.xtc-o rmsd.xvg-tu nsChoose 4 ("Backbone") for both the least squares fit and the group for RMSD calculation. The -tu flag will output the results in terms of ns, even though the trajectory was written in ps. This is done for clarity of the output (especially if you have a long simulation - 1e+06 ps does not look as nice as 100 ns). The output plot will show the RMSD relative to the structure present in the minimized, equilibrated system:If we wish to calculate RMSD relative to the crystal structure, we could issue the following:g_rms -s em.tpr -f md_0_1_noPBC.xtc-o rmsd_xtal.xvg -tu nsThe result will be something like:。