隨著(zhu)中國科(ke)研的(de)(de)發展,有限的(de)(de)超(chao)算(suan)中心和(he)自建計算(suan)集群(qun)的(de)(de)方式(shi)已無法滿(man)足(zu)超(chao)算(suan)用(yong)(yong)戶計算(suan)業(ye)務和(he)創新的(de)(de)需(xu)要,特別對降低超(chao)算(suan)總體擁(yong)有成本,無需(xu)排隊(dui)獲(huo)取資源,彈性規模,軟件開箱即用(yong)(yong)和(he)安全運(yun)維等核(he)心需(xu)求(qiu),云上超(chao)算(suan)是唯一解決(jue)之道。
今天,將為大家介紹如何在北鯤云超算平臺上快速完成動力學模擬計算,模擬溶(rong)菌(jun)酶(lysozyme)在水相中的(de)(de)結構變(bian)化(hua)。北鯤云超算平臺是前(qian)段時(shi)間小伙伴安(an)利(li)的(de)(de),經過(guo)一(yi)段時(shi)間測試,效果還不錯。只需要(yao)一(yi)臺上網本就可以(yi)在短時(shi)間內(nei)完成動力學模擬所(suo)有內(nei)容,而且還預裝了(le)Gromacs在內(nei)的(de)(de)300+應用軟件,實現(xian)了(le)開箱即用,非(fei)常的(de)(de)靈(ling)活便捷。接下來,就正式進入今天的(de)(de)主題。

水中(zhong)的溶菌酶(mei)
01 結構處理
溶(rong)菌(jun)酶(mei)(mei)又稱胞壁(bi)質酶(mei)(mei)或N-乙酰胞壁(bi)質聚(ju)糖水(shui)解酶(mei)(mei),是(shi)一種能水(shui)解細菌(jun)中(zhong)黏(nian)多糖的(de)(de)堿(jian)性酶(mei)(mei)。溶(rong)菌(jun)酶(mei)(mei)還可與帶負電荷的(de)(de)病(bing)毒蛋(dan)白(bai)直(zhi)接結(jie)合,與DNA、RNA、脫輔(fu)基蛋(dan)白(bai)形成復合體,使(shi)病(bing)毒失活。這(zhe)里我們從(cong)PDB數據庫下載(zai)溶(rong)菌(jun)酶(mei)(mei)蛋(dan)白(bai)質三(san)維結(jie)構,編號為1AKI,或則利用(yong)PyMOL直(zhi)接從(cong)命令行獲取三(san)維結(jie)構,然(ran)后利用(yong)PyMOL檢查結(jie)構,是(shi)否缺失,并(bing)且(qie)刪除除開(kai)蛋(dan)白(bai)質外的(de)(de)雜原子和水(shui)分(fen)子,并(bing)保(bao)存為protein.pdb文件。
fetch 1AKI
02 將輸入文件上傳到北鯤云超算平臺
打(da)開北鯤云超算平臺,注冊(ce)登錄后在/home/cloudam/jobs下新建文件(jian)(jian)夾Lysozyme,將準(zhun)備(bei)好(hao)的protein.pdb文件(jian)(jian)直接拖拽到該(gai)文件(jian)(jian)中,即(ji)可(ke)完成上傳。上傳文件(jian)(jian)和下載文件(jian)(jian)都可(ke)以利用sftp非常方便(bian)完成。
除了命令行,北(bei)鯤云還有可(ke)(ke)視化和圖形界面(mian)作業(ye)(ye)(ye)提交方(fang)式。可(ke)(ke)視化無需代碼,比較(jiao)適合(he)無計算(suan)機專業(ye)(ye)(ye)背景的同學,圖形界面(mian)適合(he)于單節(jie)點(dian)小規模計算(suan)任務。這兩種(zhong)作業(ye)(ye)(ye)提交方(fang)式大家(jia)就自行體(ti)(ti)驗吧(北(bei)鯤云針對(dui)新(xin)用戶有贈送200元(yuan)體(ti)(ti)驗金的活動,大家(jia)可(ke)(ke)以關注(zhu)一下)。


北鯤云超算平臺(tai)作業(ye)提交入口

北鯤(kun)云超算平臺操作終端窗口
03 生成拓撲文件
在確(que)保(bao)蛋白質沒有雜原子且(qie)結(jie)構完整后(hou),將其上傳(chuan)至計算平(ping)臺,然后(hou)我們利(li)用(yong)Gromacs開(kai)始進行動(dong)力(li)學模(mo)擬,在此(ci)之前需要選擇所使用(yong)的Gromacs版本(ben),在北鯤云超算平(ping)臺上可以使用(yong)的Gromacs如下(xia):
GROMACS/2019.6-fosscuda GROMACS/2020-foss-2019b
GROMACS/2021-gromacs-cpu-new
GROMACS/gromacs-5.1.5-gpu
GROMACS/2019.6-intel-2019b
GROMACS/2020-fosscuda-2019b
GROMACS/4.5.5-openmpi-1.3.1
GROMACS/2019.6-intelmpi-cuda
GROMACS/2021-fosscuda-2019b
GROMACS/gromacs-5.1.2
這里我們使用Gromacs2020版(ban)(ban)進行(xing)本(ben)次(ci)計算,也可自由(you)選(xuan)擇其他版(ban)(ban)本(ben)進行(xing)模(mo)擬(ni)(如果沒有你需要的版(ban)(ban)本(ben),也可找(zhao)技(ji)術支持(chi)安裝(zhuang)),加載命令(ling)如下:
module add Gromacs/2020-fosscuda-2019b
可(ke)以通過通過鍵入gmx或(huo)者gmx_mpi命(ming)令查看是(shi)否加載成功(gong),如果出(chu)現類似如下信息則表明加載成功(gong),可(ke)以進行(xing)后續模擬:
[cloudam@master jobs]$ gmx_mpi
:-) GROMACS - gmx_mpi, 2021 (-:
GROMACS is written by:
Andrey Alekseenko Emile Apol Rossen Apostolov
Paul Bauer Herman J.C. Berendsen Par Bjelkmar
Christian Blau Viacheslav Bolnykh Kevin Boyd
Aldert van Buuren Rudi van Drunen Anton Feenstra
Gilles Gouaillardet Alan Gray Gerrit Groenhof
Anca Hamuraru Vincent Hindriksen M. Eric Irrgang
Aleksei Iupinov Christoph Junghans Joe Jordan
Dimitrios Karkoulis Peter Kasson Jiri Kraus
Carsten Kutzner Per Larsson Justin A. Lemkul
Viveca Lindahl Magnus Lundborg Erik Marklund
Pascal Merz Pieter Meulenhoff Teemu Murtola
Szilard Pall Sander Pronk Roland Schulz
Michael Shirts Alexey Shvetsov Alfons Sijbers
Peter Tieleman Jon Vincent Teemu Virolainen
Christian Wennberg Maarten Wolf Artem Zhmurov
and the project leaders:
Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel
pdb2gmx是(shi)我們用到的(de)第一個GROMACS模塊, 它的(de)作用的(de)是(shi)產生三個文(wen)件(jian):分子拓撲文(wen)件(jian)topol.top、位置限制文(wen)件(jian)posre.itp、后處理結構文(wen)件(jian)protein_processed.gro,具體命(ming)令如(ru)下:
gmx pdb2gmx -f protein.pdb -o protein_processed.gro -water tip3p -ignh -merge all
此時會出現如下選(xuan)擇力場的提示:
Select the Force Field:
From '/public/software/.local/easybuild/software/GROMACS/2021-gromacs-cpu-new/share/gromacs/top':
1: AMBER03 protein, nucleic AMBER94 (Duan et al., J. Comp. Chem. 24, 1999-2012, 2003)
2: AMBER94 force field (Cornell et al., JACS 117, 5179-5197, 1995)
3: AMBER96 protein, nucleic AMBER94 (Kollman et al., Acc. Chem. Res. 29, 461-469, 1996)
4: AMBER99 protein, nucleic AMBER94 (Wang et al., J. Comp. Chem. 21, 1049-1074, 2000)
5: AMBER99SB protein, nucleic AMBER94 (Hornak et al., Proteins 65, 712-725, 2006)
6: AMBER99SB-ILDN protein, nucleic AMBER94 (Lindorff-Larsen et al., Proteins 78, 1950-58, 2010)
7: AMBERGS force field (Garcia & Sanbonmatsu, PNAS 99, 2782-2787, 2002)
8: CHARMM27 all-atom force field (CHARM22 plus CMAP for proteins)
9: GROMOS96 43a1 force field
10: GROMOS96 43a2 force field (improved alkane dihedrals)
11: GROMOS96 45a3 force field (Schuler JCC 2001 22 1205)
12: GROMOS96 53a5 force field (JCC 2004 vol 25 pag 1656)
13: GROMOS96 53a6 force field (JCC 2004 vol 25 pag 1656)
14: GROMOS96 54a7 force field (Eur. Biophys. J. (2011), 40,, 843-856, DOI: 10.1007/s00249-011-0700-9)
15: OPLS-AA/L all-atom force field (2001 aminoacid dihedrals)
這里選擇(ze)力場AMBER99來處理(li)蛋白質(zhi),鍵入(ru)4回車,檢查(cha)生成文件應該包含如下內(nei)容:
posre.itp protein.pdb protein_processed.gro topol.top
04 定義單位盒子并填充溶劑
使(shi)用editconf來定義盒子,命令如下(xia):
gmx editconf -fprotein_processed.gro -o pro_newbox.gro -c -d 1.0 -bt cubic
其中-c表示將蛋白質放在盒子中心,-d 1.0表示蛋白質距離盒子邊緣至少1埃距離,-bt表示盒子形狀為立方體。
然后使用(yong)solvate來填(tian)充水分子,命(ming)令如下:
gmx solvate -cppro_newbox.gro -cs spc216.gro -o pro_solv.gro -p topol.top
-cp指定輸(shu)入文件,-cs指定溶劑(ji)模(mo)型(xing)文件,這里(li)是通用(yong)的三位(wei)點溶劑(ji)模(mo)型(xing)spc216.gro
05 添加離子
完成上述步(bu)驟后,已經獲得了(le)帶(dai)有電荷的(de)(de)(de)溶液體(ti)系,topol.top文件(jian)記錄的(de)(de)(de)電荷信息。因(yin)此需(xu)要添加(jia)對應(ying)的(de)(de)(de)離子來進行體(ti)系中和,至于需(xu)要添加(jia)多少離子,gromacs就比較方便了(le),是不需(xu)要像Amber一(yi)樣進行計算的(de)(de)(de)。添加(jia)離子需(xu)要兩步(bu)命令,第一(yi)步(bu)需(xu)要一(yi)個ion.mdp參數文件(jian),這里我們將(jiang)所有的(de)(de)(de)mdp參數文件(jian)放在MDP文件(jian)夾(jia)下,命令如下:
gmx grompp -f../MDP/ions.mdp -c pro_solv.gro -p topol.top -o ions.tpr
上(shang)述(shu)(shu)命令會(hui)獲(huo)得一個(ge)tpr結(jie)尾的(de)文件,該文件它提供了(le)我們體系的(de)原子級別的(de)描述(shu)(shu)。
第二步進行(xing)利用(yong)Na離(li)子(zi)和(he)Cl離(li)子(zi)替換溶劑部分水分子(zi),使得體(ti)系(xi)電荷被中和(he)。命令如下(xia):
gmx grompp -f ../MDP/minim.mdp -c pro_solv_ions.gro -p topol.top -o em.tpr
溶劑SOL的組編號為13,因此選擇13然后回車,如下所示:
Reading file ions.tpr, VERSION 2021 (single precision)
Reading file ions.tpr, VERSION 2021 (single precision)
Will try to add 0 NA ions and 8 CL ions.
Select a continuous group of solvent molecules
Group 0 ( System) has 33892 elements
Group 1 ( Protein) has 1960 elements
Group 2 ( Protein-H) has 1001 elements
Group 3 ( C-alpha) has 129 elements
Group 4 ( Backbone) has 387 elements
Group 5 ( MainChain) has 517 elements
Group 6 ( MainChain+Cb) has 634 elements
Group 7 ( MainChain+H) has 646 elements
Group 8 ( SideChain) has 1314 elements
Group 9 ( SideChain-H) has 484 elements
Group 10 ( Prot-Masses) has 1960 elements
Group 11 ( non-Protein) has 31932 elements
Group 12 ( Water) has 31932 elements
Group 13 ( SOL) has 31932 elements
Group 14 ( non-Water) has 1960 elements
Select a group: 13
參數文件內容:
; ions.mdp - used as input into grompp to generate ions.tpr
; Parameters describing what to do, when to stop and what to save
integrator = steep ; Algorithm (steep = steepest descent minimization)
emtol = 1000.0 ; Stop minimization when the maximum force < 1000.0 kJ/mol/nm
emstep = 0.01 ; Minimization step size
nsteps = 50000 ; Maximum number of (minimization) steps to perform
; Parameters describing how to find the neighbors of each atom and how to calculate the interactions
nstlist = 1 ; Frequency to update the neighbor list and long range forces
cutoff-scheme = Verlet ; Buffered neighbor searching
ns_type = grid ; Method to determine neighbor list (simple, grid)
coulombtype = cutoff ; Treatment of long range electrostatic interactions
rcoulomb = 1.0 ; Short-range electrostatic cut-off
rvdw = 1.0 ; Short-range Van der Waals cut-off
pbc = xyz ; Periodic Boundary Conditions in all 3 dimensions

VMD檢(jian)查體(ti)系中蛋白質和離(li)子分(fen)布
06 體系能量最小化
命令(ling)如下(xia),本命令(ling)同樣需要mdp參數(shu)(shu)文件,名為minim.mdp, 能量最小化過程與添加離子過程相(xiang)似,利用grompp將 拓撲(pu)和模擬參數(shu)(shu)寫入一個二(er)進制的輸入文件中(.tpr), 然后使用GROMACS MD引擎的mdrun模塊來進行能量最小化
gmx grompp -f ../MDP/minim.mdp -c pro_solv_ions.gro -p topol.top -o em.tpr
MDP參數文(wen)件內容:
; minim.mdp - used as input into grompp to generate em.tpr
; Parameters describing what to do, when to stop and what to save
integrator = steep ; Algorithm (steep = steepest descent minimization)
emtol = 1000.0 ; Stop minimization when the maximum force < 1000.0 kJ/mol/nm
emstep = 0.01 ; Minimization step size
nsteps = 50000 ; Maximum number of (minimization) steps to perform
; Parameters describing how to find the neighbors of each atom and how to calculate the interactions
nstlist = 1 ; Frequency to update the neighbor list and long range forces
cutoff-scheme = Verlet ; Buffered neighbor searching
ns_type = grid ; Method to determine neighbor list (simple, grid)
coulombtype = PME ; Treatment of long range electrostatic interactions
rcoulomb = 1.0 ; Short-range electrostatic cut-off
rvdw = 1.0 ; Short-range Van der Waals cut-off
pbc = xyz ; Periodic Boundary Conditions in all 3 dimensions
然(ran)后鍵入如下命(ming)令進行能量最(zui)小化,這一步應當提(ti)交到計算節點進行,不要在管理節點直接運行。
gmx mdrun -v -deffnm em
能量最小化開始,屏幕中出現運行進度(du)提示:
Steepest Descents:
Tolerance (Fmax) = 1.00000e+03
Number of steps = 50000
Step= 0, Dmax= 1.0e-02 nm, Epot= -2.94025e+05 Fmax= 1.53483e+05, atom= 19880
Step= 1, Dmax= 1.0e-02 nm, Epot= -3.12614e+05 Fmax= 6.09036e+04, atom= 19880
Step= 2, Dmax= 1.2e-02 nm, Epot= -3.35310e+05 Fmax= 2.40700e+04, atom= 19880
Step= 3, Dmax= 1.4e-02 nm, Epot= -3.61996e+05 Fmax= 1.00755e+04, atom= 24188
Step= 4, Dmax= 1.7e-02 nm, Epot= -3.87330e+05 Fmax= 4.66750e+03, atom= 983
Step= 5, Dmax= 2.1e-02 nm, Epot= -4.08800e+05 Fmax= 1.30036e+04, atom= 547
Step= 6, Dmax= 2.5e-02 nm, Epot= -4.13336e+05 Fmax= 2.07112e+04, atom= 547
Step= 7, Dmax= 3.0e-02 nm, Epot= -4.17185e+05 Fmax= 1.90699e+04, atom= 547
Step= 8, Dmax= 3.6e-02 nm, Epot= -4.18940e+05 Fmax= 2.96782e+04, atom= 547
Step= 9, Dmax= 4.3e-02 nm, Epot= -4.21615e+05 Fmax= 2.83858e+04, atom= 547
Step= 11, Dmax= 2.6e-02 nm, Epot= -4.26495e+05 Fmax= 6.89298e+03, atom= 547
Step= 12, Dmax= 3.1e-02 nm, Epot= -4.27186e+05 Fmax= 3.75623e+04, atom= 1785
Step= 13, Dmax= 3.7e-02 nm, Epot= -4.33871e+05 Fmax= 1.81256e+04, atom= 1785
Step= 15, Dmax= 2.2e-02 nm, Epot= -4.35813e+05 Fmax= 1.54546e+04, atom= 1785
Step= 16, Dmax= 2.7e-02 nm, Epot= -4.37056e+05 Fmax= 2.45185e+04, atom= 1785
利用如下命令檢(jian)查是否有效的能量(liang)最(zui)小化(hua):
echo 10 0 | gmx energy -f em.edr -o potential.xvg
利用xmgrace打開potential.xvg文件,結果(guo)如如下所示(shi):

體系勢(shi)能隨時間變化曲線
07 NVT平衡
NVT系(xi)(xi)綜(zong)(粒子數, 體積和溫(wen)度都是恒(heng)定的(de)(de))系(xi)(xi)綜(zong)也被(bei)稱為(wei)等溫(wen)等容系(xi)(xi)綜(zong)或正則系(xi)(xi)綜(zong)。NVT平衡過(guo)程的(de)(de)需要的(de)(de)時(shi)(shi)間(jian)與體系(xi)(xi)的(de)(de)構成(cheng)有(you)關(guan), 但在NVT系(xi)(xi)綜(zong)中, 體系(xi)(xi)的(de)(de)溫(wen)度應達到預期值(zhi)并基本保持不(bu)變(bian). 如(ru)果(guo)溫(wen)度仍然沒有(you)穩定, 那就(jiu)需要更多的(de)(de)時(shi)(shi)間(jian)。通常情況下, 50 ps到100 ps就(jiu)足夠了, 因(yin)此在本例中我們進行100 ps的(de)(de)NVT平衡. 根據(ju)機器的(de)(de)不(bu)同, 運行可能(neng)需要一段時(shi)(shi)間(jian)(在雙核(he)的(de)(de)MacBook上剛(gang)(gang)剛(gang)(gang)超(chao)過(guo)一小時(shi)(shi),而在北(bei)鯤云上幾(ji)分鐘即可完成(cheng))。命令如(ru)下:
gmx grompp -f ../MDP/nvt.mdp -c em.gro -r em.gro -p topol.top -o nvt.tpr
gmx mdrun -deffnm nvt
MDP參(can)數文件(jian)內容如下:
title = OPLS Lysozyme NVT equilibration
define = -DPOSRES ; position restrain the protein
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 50000 ; 2 * 50000 = 100 ps
dt = 0.002 ; 2 fs
; Output control
nstxout = 500 ; save coordinates every 1.0 ps
nstvout = 500 ; save velocities every 1.0 ps
nstenergy = 500 ; save energies every 1.0 ps
nstlog = 500 ; update log file every 1.0 ps
; Bond parameters
continuation = no ; first dynamics run
constraint_algorithm = lincs ; holonomic constraints
constraints = h-bonds ; bonds involving H are constrained
lincs_iter = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
; Nonbonded settings
cutoff-scheme = Verlet ; Buffered neighbor searching
ns_type = grid ; search neighboring grid cells
nstlist = 10 ; 20 fs, largely irrelevant with Verlet
rcoulomb = 1.0 ; short-range electrostatic cutoff (in nm)
rvdw = 1.0 ; short-range van der Waals cutoff (in nm)
DispCorr = EnerPres ; account for cut-off vdW scheme
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling is on
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = Protein Non-Protein ; two coupling groups - more accurate
tau_t = 0.1 0.1 ; time constant, in ps
ref_t = 300 300 ; reference temperature, one for each group, in K
; Pressure coupling is off
pcoupl = no ; no pressure coupling in NVT
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Velocity generation
gen_vel = yes ; assign velocities from Maxwell distribution
gen_temp = 300 ; temperature for Maxwell distribution
gen_seed = -1 ; generate a random seed
同(tong)樣通過使用energy來檢查平(ping)衡是否完成,命(ming)令如(ru)下:
echo 16 0 |gmx energy -f nvt.edr -o temperature.xvg
結果如下所示,溫度基本保持在(zai)300K,如果波動太厲害可(ke)以考(kao)慮增加nstes步(bu)數來進一(yi)步(bu)獲取平(ping)衡體(ti)系。

體系溫度隨(sui)時間變化曲(qu)線
08 NPT平衡
前一步(bu)的(de)NVT平(ping)(ping)衡穩(wen)定了體系的(de)溫度. 在采集(ji)數(shu)據之前, 我們還需要穩(wen)定體系的(de)壓力(因此還包括密度)。壓力平(ping)(ping)衡是在NPT系綜(zong)(zong)下進行(xing)的(de), 其中粒子數(shu), 壓力和溫度都(dou)保(bao)持不變。這個系綜(zong)(zong)也被稱為等溫等壓系綜(zong)(zong), 最接近(jin)實驗條件。100 ps NPT平(ping)(ping)衡的(de)命令(ling)如下:
gmx grompp -f ../MDP/npt.mdp -c nvt.gro -r nvt.gro -t nvt.cpt -p topol.top -o npt.tpr
gmx mdrun -deffnm npt
MDP參數內容:
title = OPLS Lysozyme NPT equilibration
define = -DPOSRES ; position restrain the protein
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 50000 ; 2 * 50000 = 100 ps
dt = 0.002 ; 2 fs
; Output control
nstxout = 500 ; save coordinates every 1.0 ps
nstvout = 500 ; save velocities every 1.0 ps
nstenergy = 500 ; save energies every 1.0 ps
nstlog = 500 ; update log file every 1.0 ps
; Bond parameters
continuation = yes ; Restarting after NVT
constraint_algorithm = lincs ; holonomic constraints
constraints = h-bonds ; bonds involving H are constrained
lincs_iter = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
; Nonbonded settings
cutoff-scheme = Verlet ; Buffered neighbor searching
ns_type = grid ; search neighboring grid cells
nstlist = 10 ; 20 fs, largely irrelevant with Verlet scheme
rcoulomb = 1.0 ; short-range electrostatic cutoff (in nm)
rvdw = 1.0 ; short-range van der Waals cutoff (in nm)
DispCorr = EnerPres ; account for cut-off vdW scheme
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling is on
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = Protein Non-Protein ; two coupling groups - more accurate
tau_t = 0.1 0.1 ; time constant, in ps
ref_t = 300 300 ; reference temperature, one for each group, in K
; Pressure coupling is on
pcoupl = Parrinello-Rahman ; Pressure coupling on in NPT
pcoupltype = isotropic ; uniform scaling of box vectors
tau_p = 2.0 ; time constant, in ps
ref_p = 1.0 ; reference pressure, in bar
compressibility = 4.5e-5 ; isothermal compressibility of water, bar^-1
refcoord_scaling = com
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Velocity generation
gen_vel = no ; Velocity generation is off
檢查平衡結果命令:
echo 18 0| gmx energy -f npt.edr -o pressure.xvg
體系壓力(li)隨時間變化曲線
09 動力學模擬成品模擬
隨著兩個(ge)平衡階段的(de)完成(cheng), 體系已經在需要(yao)(yao)(yao)的(de)溫度和(he)壓(ya)強(qiang)下(xia)平衡好了。我(wo)們(men)現在可以放開(kai)位置限制并進(jin)行(xing)成(cheng)品MD以收集(ji)數(shu)據了, 這個(ge)過程(cheng)跟前(qian)面的(de)類似。運(yun)行(xing)grompp時, 我(wo)們(men)還要(yao)(yao)(yao)用到檢查(cha)點文(wen)件(在這種情(qing)況(kuang)下(xia),其中包(bao)含(han)了壓(ya)力耦合信(xin)息(xi))。 我(wo)們(men)要(yao)(yao)(yao)進(jin)行(xing)一個(ge)20 ns的(de)MD模擬, 命令如下(xia):
gmx grompp -f ../MDP/md.mdp -c npt.gro -t npt.cpt -p topol.top -o md.tpr
gmx mdrun -v -deffnm md
MDP參數(shu)文件內容為:
title = OPLS Lysozyme NPT equilibration
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 10000000 ; 2 * 10000000 = 20000 ps (20 ns)
dt = 0.002 ; 2 fs
; Output control
nstxout = 0 ; suppress bulky .trr file by specifying
nstvout = 0 ; 0 for output frequency of nstxout,
nstfout = 0 ; nstvout, and nstfout
nstenergy = 5000 ; save energies every 10.0 ps
nstlog = 5000 ; update log file every 10.0 ps
nstxout-compressed = 5000 ; save compressed coordinates every 10.0 ps
compressed-x-grps = System ; save the whole system
; Bond parameters
continuation = yes ; Restarting after NPT
constraint_algorithm = lincs ; holonomic constraints
constraints = h-bonds ; bonds involving H are constrained
lincs_iter = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
; Neighborsearching
cutoff-scheme = Verlet ; Buffered neighbor searching
ns_type = grid ; search neighboring grid cells
nstlist = 10 ; 20 fs, largely irrelevant with Verlet scheme
rcoulomb = 1.0 ; short-range electrostatic cutoff (in nm)
rvdw = 1.0 ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling is on
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = Protein Non-Protein ; two coupling groups - more accurate
tau_t = 0.1 0.1 ; time constant, in ps
ref_t = 300 300 ; reference temperature, one for each group, in K
; Pressure coupling is on
pcoupl = Parrinello-Rahman ; Pressure coupling on in NPT
pcoupltype = isotropic ; uniform scaling of box vectors
tau_p = 2.0 ; time constant, in ps
ref_p = 1.0 ; reference pressure, in bar
compressibility = 4.5e-5 ; isothermal compressibility of water, bar^-1
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Dispersion correction
DispCorr = EnerPres ; account for cut-off vdW scheme
; Velocity generation
gen_vel = no ; Velocity generation is off
最后模擬耗時較長,而在北鯤云超算平臺上供用戶選擇的GPU型號非常多,比如這里教程使用的是V100顯卡,計算20ns的上述體系只需要

北鯤云計算平(ping)臺可(ke)供選擇GPU型號示例
動力學模擬運行完畢后提示運行20ns的動力學模擬使用了3個小時20分鐘,速度可達每天143.41 ns,可謂速度飛快~
starting mdrun 'HYDROLASE in water'
10000000 steps, 20000.0 ps.
step 80: timed with pme grid 44 44 44, coulomb cutoff 1.000: 140.1 M-cycles
step 160: timed with pme grid 40 40 40, coulomb cutoff 1.091: 135.9 M-cycles
step 240: timed with pme grid 36 36 36, coulomb cutoff 1.212: 147.5 M-cycles
step 320: timed with pme grid 32 32 32, coulomb cutoff 1.363: 175.3 M-cycles
step 400: timed with pme grid 36 36 36, coulomb cutoff 1.212: 145.9 M-cycles
step 480: timed with pme grid 40 40 40, coulomb cutoff 1.091: 138.7 M-cycles
step 560: timed with pme grid 42 42 42, coulomb cutoff 1.039: 141.3 M-cycles
step 640: timed with pme grid 44 44 44, coulomb cutoff 1.000: 143.6 M-cycles
optimal pme grid 40 40 40, coulomb cutoff 1.091
step 9999900, remaining wall clock time: 0 s
Writing final coordinates.
step 10000000, remaining wall clock time: 0 s
NOTE: The GPU has >25% less load than the CPU. This imbalance causes
performance loss.
Core t (s) Wall t (s) (%)
Time: 322366.259 12049.380 2675.43h20:49
(ns/day) (hour/ns)
Performance: 143.410 0.167
gcq#450: "Above all, don't fear difficult moments. The best comes from them." (Rita Levi-Montalcini)
利用Chimera的(de)MD movie工具(ju)或者其他可視化(hua)軟件對(dui)動力(li)學模(mo)擬(ni)成(cheng)品(pin)進行動畫展示(shi),模(mo)擬(ni)部分(fen)結果如(ru)下所示(shi),綠色原子表示(shi)添加的(de)離(li)子。
,時長00:11
Chimera展示動力學模擬動畫
利用rms工具計算體系中蛋白質骨架的RMSD隨時間波動情況,命令如下
echo 4 4| gmx rms -f md.xtc -s md.tpr -o rmsd
利用xmgrace查看結果
xmgrace -nxy rmsd.xvg

動(dong)力學分析RMSD隨(sui)模擬(ni)時間(jian)變化曲線
10 北鯤云超算平臺作業提交
上述教程非(fei)常(chang)詳細(xi)的展(zhan)示(shi)了如何(he)用(yong)Gromacs進行(xing)(xing)動力學模擬,值得(de)注(zhu)意的是,上述教程中的命令(ling)可(ke)以(yi)在單機完(wan)成(cheng),也可(ke)以(yi)上述所有命令(ling)寫成(cheng)作業腳(jiao)本進行(xing)(xing)提交。提交腳(jiao)本命令(ling):
sbatch -N 1 -p g-v100-1 -c 12 md-gromacs.sh
其中,-N為節(jie)點的(de)數(shu)量(liang),這(zhe)里輸入的(de)是1。-p為選擇的(de)PARTITION,這(zhe)里使(shi)用的(de)是V100卡(ka)(g-v100-1)。
md-gromacs.sh腳本(ben)的內(nei)容涵蓋(gai)上(shang)述(shu)教程中的所有(you)命令,根據北鯤云超算平臺的指(zhi)南(nan)需要在腳本(ben)開頭加上(shang)導入(ru)gromacs模塊(kuai),如果申請了GPU需要將GPU模塊(kuai)也導入(ru)(1-6行),具(ju)體腳本(ben)內(nei)容如下:
module add GROMACS/2020-fosscuda-2019b
export GMX_GPU_DD_COMMS=true
export GMX_GPU_PME_PP_COMMS=true
export GMX_FORCE_UPDATE_DEFAULT_GPU=true
ntomp="$SLURM_CPUS_PER_TASK"
export OMP_NUM_THREADS=$ntomp
echo 4 | gmx pdb2gmx -f protein.pdb -o protein_processed.gro -water tip3p -ignh -merge all
gmx editconf -f protein_processed.gro -o pro_newbox.gro -c -d 1.0 -bt cubic
gmx solvate -cp pro_newbox.gro -cs spc216.gro -o pro_solv.gro -p topol.top
gmx grompp -f ../MDP/ions.mdp -c pro_solv.gro -p topol.top -o ions.tpr
echo 13| gmx genion -s ions.tpr -o pro_solv_ions.gro -p topol.top -pname NA -nname CL -neutral
gmx grompp -f ../MDP/minim.mdp -c pro_solv_ions.gro -p topol.top -o em.tpr
gmx mdrun -v -deffnm em
echo 10 0 | gmx energy -f em.edr -o potential.xvg
gmx grompp -f ../MDP/nvt.mdp -c em.gro -r em.gro -p topol.top -o nvt.tpr
gmx mdrun -deffnm nvt
echo 16 0 |gmx energy -f nvt.edr -o temperature.xvg
gmx grompp -f ../MDP/npt.mdp -c nvt.gro -r nvt.gro -t nvt.cpt -p topol.top -o npt.tpr
gmx mdrun -deffnm npt
echo 18 0| gmx energy -f npt.edr -o pressure.xvg
gmx grompp -f ../MDP/md.mdp -c npt.gro -t npt.cpt -p topol.top -o md.tpr
gmx mdrun -v -deffnm md
echo 4 4| gmx rms -f md.xtc -s md.tpr -o rmsd.xvg
以(yi)上(shang)就是本次(ci)教程的所有(you)內(nei)容,而(er)所有(you)操(cao)作(zuo)只需要(yao)可以(yi)登錄北鯤云超算平臺在線(xian)操(cao)作(zuo)即可,無需自己配備高性能的計算機(ji),和為繁瑣的工具安裝浪費時間了。
經(jing)過(guo)最近這段(duan)時間的測試,以下是(shi)北鯤云超算平臺吸引我的幾點優勢:
高性能(neng)計(ji)(ji)算(suan)資源(yuan),極大的(de)節省計(ji)(ji)算(suan)成本
支持海量的計算工具,且開箱即用(yong)
計算任務(wu)進度實時追蹤提示的貼心服務(wu)
詳細耐心的技術咨(zi)詢