Drug Bioinformatics: Ligand Structure Similarity Strategy


Similarity measures
Cosine coefficient
$similarity = \frac{C}{\sqrt{A \cdot B}} = \frac{6}{\sqrt{13 \times 8}} = 0.588$
not monotonic with the Tanimoto and Dice coefficients, but highly correlated with them
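The non-monotonic behaviour can be checked with a small sketch. The function names and the two extra (A, B, C) triples below are invented for illustration; only the 13 / 8 / 6 counts come from the slides.

```python
import math


def tanimoto_from_counts(a: int, b: int, c: int) -> float:
    """Tanimoto coefficient from bit counts: C / (A + B - C)."""
    return c / (a + b - c)


def cosine_from_counts(a: int, b: int, c: int) -> float:
    """Cosine coefficient from bit counts: C / sqrt(A * B)."""
    return c / math.sqrt(a * b)


# Worked example from the slides: A = 13, B = 8, C = 6.
print(round(cosine_from_counts(13, 8, 6), 3))  # 0.588

# Two made-up (A, B, C) triples, chosen so the two coefficients disagree.
pair_1 = (1, 100, 1)   # Tanimoto 0.01,   cosine 0.1
pair_2 = (20, 20, 1)   # Tanimoto ~0.026, cosine 0.05

# Tanimoto ranks pair_2 as more similar, cosine ranks pair_1 as more similar.
print(tanimoto_from_counts(*pair_2) > tanimoto_from_counts(*pair_1))  # True
print(cosine_from_counts(*pair_2) < cosine_from_counts(*pair_1))      # True
```

A single counterexample like this is enough to show that the cosine coefficient can reorder pairs relative to the Tanimoto coefficient, even though the two are usually highly correlated.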
Pharmacogenomics Jiankai Xu Harbin Medical University
Structural information of small-molecule ligands
Fingerprints
the fragments present in a structure can be represented as a sequence of 0s and 1s
The cosine coefficient is also called the Ochiai coefficient
Similarity Ensemble Approach
SEA algorithm (Michael J. Keiser, nbt1284)
Similarity measures
Dice coefficient
$similarity = \frac{2C}{A + B} = \frac{12}{13 + 8} = 0.57$
does not give the same values as the Tanimoto coefficient, but will rank molecules in the same order of similarity to a target
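The identical ranking follows from a short derivation. Writing $T = \frac{C}{A + B - C}$ for the Tanimoto coefficient and $D = \frac{2C}{A + B}$ for the Dice coefficient (the symbols $T$ and $D$ are introduced here only for this derivation):

$$
A + B = \frac{C}{T} + C = \frac{C\,(1 + T)}{T}
\quad\Longrightarrow\quad
D = \frac{2C}{A + B} = \frac{2T}{1 + T},
\qquad
\frac{dD}{dT} = \frac{2}{(1 + T)^2} > 0 .
$$

Because $D$ is a strictly increasing function of $T$, sorting molecules by either coefficient gives the same order. With the example counts, $T = 0.4$ gives $D = 0.8 / 1.4 \approx 0.57$, which agrees with the value computed directly.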
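Principle: the similarity score between two target proteins is obtained by comparing the overall structural similarity of the sets of ligands that each target can bind.
Theoretical basis: sequence alignment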
Similarity Ensemble Approach
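SEA calculation step 1: build the ligand sets
Convert the Raw Score similarity between ligand sets into an E-value. The E-value removes the effect of ligand-set size on the result, so that scores can be compared.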
$Z = \frac{\text{Raw score} - \mu(x)}{\sigma(x)}$
$P(Z > z) = 1 - \exp\left(-e^{-z\pi/\sqrt{6} - \Gamma'(1)}\right)$
$E(z) = P(z)\, N_{db}$
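The three formulas (z-score, extreme-value tail probability, E-value) can be sketched in Python as follows. This is a minimal illustration rather than the published implementation: the function names are invented, the mean and standard deviation are simply passed in (how $\mu(x)$ and $\sigma(x)$ are fitted is outside this sketch), and it relies on the standard identity $\Gamma'(1) = -\gamma$, where $\gamma \approx 0.5772$ is the Euler-Mascheroni constant.

```python
import math

# Euler-Mascheroni constant; Gamma'(1) equals -EULER_GAMMA.
EULER_GAMMA = 0.5772156649015329


def z_score(raw_score: float, mean: float, std: float) -> float:
    """Z = (raw score - mu) / sigma."""
    return (raw_score - mean) / std


def p_value(z: float) -> float:
    """P(Z > z) = 1 - exp(-exp(-z*pi/sqrt(6) - Gamma'(1)))."""
    inner = math.exp(-z * math.pi / math.sqrt(6) + EULER_GAMMA)
    # -expm1(-x) equals 1 - exp(-x) and stays accurate when x is tiny.
    return -math.expm1(-inner)


def e_value(z: float, n_db: int) -> float:
    """E(z) = P(z) * N_db, with N_db the number of comparisons made."""
    return p_value(z) * n_db


# Made-up inputs, for illustration only.
z = z_score(raw_score=30.0, mean=10.0, std=5.0)
print(z)  # 4.0
print(e_value(z, n_db=1000))
```

A larger z-score gives a smaller P-value and therefore a smaller E-value, so smaller E-values indicate more significant similarity between two ligand sets.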
The Dice coefficient is therefore “monotonic” with the Tanimoto coefficient
The Dice coefficient is also called the Czekanowski or Sørensen coefficient
Thank you
......
Structural information of small-molecule ligands
Commercial data resources: Target Inhibitor Database (GVK Bio), AurSCOPE (Aureus), stARLITe, ChemBioBase Suite, BioPrint, WOMBAT, MDDR
A bitstring that encodes a chemical structure is often called a structure “fingerprint”
Structural information of small-molecule ligands
Fingerprints
Daylight fingerprints
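For ligand sets $L_1 = \{l_{11}, l_{12}, \ldots, l_{1m}\}$ and $L_2 = \{l_{21}, l_{22}, \ldots, l_{2n}\}$, use the Tanimoto score as the similarity measure and compute $Tc(l_{1i}, l_{2j})$ for each ligand pair.
Set a threshold (Threshold) and compute the similarity score between the two ligand sets, Raw Score($L_1$, $L_2$).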
Daylight fingerprints: paths of 2-7 atoms, 2048 bits
MACCS Keys fingerprints
166 molecular fragments
3D ...
Similarity measures
similarity measures are most commonly calculated from structure fingerprints
Structural information of small-molecule ligands
Public data resources
DrugBank
~1000 FDA-approved drugs; ~3000 experimental drugs; ~6000 drug-target relationships (DTRs)
Determining the Tc threshold
Applications of SEA
Collect the ligands of each target to form its ligand set.
Ignore targets with fewer than 5 ligands.
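Building the ligand sets can be sketched as a simple grouping step. The function name, the `(target, ligand)` input layout, and the sample identifiers below are assumptions made for illustration; only the cutoff of 5 ligands comes from the slide.

```python
from collections import defaultdict

MIN_LIGANDS = 5  # targets with fewer ligands than this are ignored


def build_ligand_sets(pairs):
    """Group (target, ligand) pairs into one ligand set per target."""
    ligand_sets = defaultdict(set)
    for target, ligand in pairs:
        ligand_sets[target].add(ligand)
    # Keep only the targets that have enough distinct ligands.
    return {
        target: ligands
        for target, ligands in ligand_sets.items()
        if len(ligands) >= MIN_LIGANDS
    }


# Invented example: target_1 has 5 ligands, target_2 has only 4.
pairs = [("target_1", f"ligand_{i}") for i in range(5)]
pairs += [("target_2", f"ligand_{i}") for i in range(4)]

print(sorted(build_ligand_sets(pairs)))  # ['target_1']
```

Using a set per target means a ligand reported twice for the same target is counted once.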
Similarity Ensemble Approach
SEA calculation step 2: compute the similarity between ligand sets
Daylight SMILES strings
Structural information of small-molecule ligands
Public data resources
PDSP Ki
~6800 chemicals; ~46000 Ki
Binding DB
~18000 chemicals; ~30000 Ki, IC50
PubChem BioAssay
~560000 chemicals
PDBbind-CN
~8700 small-molecule ligands; ~3600 protein DTRs
count the bits that are “on” in both molecules
count the bits that are “on” in each molecule separately
struct A:  00010100010101000101010011110100   13 bits on (A)
struct B:  00000000100101001001000011100000    8 bits on (B)
A AND B:   00000000000101000001000011100000    6 bits on (C)
similarity coefficient can be calculated from A, B and C
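As a minimal sketch of this counting step, the two example bitstrings can be held as Python integers and combined with a bitwise AND. The helper name `bits_on` is an illustrative choice, not something defined in the slides.

```python
# Example bitstrings from the slides, stored as Python integers.
struct_a = int("00010100010101000101010011110100", 2)
struct_b = int("00000000100101001001000011100000", 2)


def bits_on(bitstring: int) -> int:
    """Count the bits that are "on" (hypothetical helper name)."""
    return bin(bitstring).count("1")


a = bits_on(struct_a)             # bits on in A
b = bits_on(struct_b)             # bits on in B
c = bits_on(struct_a & struct_b)  # bits on in both A and B

print(a, b, c)  # 13 8 6
```

The three counts are all that the Tanimoto, Dice and cosine coefficients need as input.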
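Summary
Rich public drug-target databases are the foundation of this research.
The Tanimoto coefficient is the most commonly used measure of ligand similarity.
In the SEA algorithm, the Tc threshold differs from one data set to another.
The SEA algorithm has a wide range of applications.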
Questions for review
Briefly describe the SEA algorithm.
Give one or two examples of possible practical applications of SEA.
$\text{Raw Score}(L_1, L_2) = \sum_{\substack{1 \le i \le m,\; 1 \le j \le n \\ Tc(l_{1i}, l_{2j}) \ge \text{Threshold}}} Tc(l_{1i}, l_{2j})$
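The thresholded sum can be sketched in a few lines of Python. Fingerprints are represented here as plain integers, the toy 4-bit values are invented, and the comparison with the threshold is assumed to be inclusive; none of these choices are prescribed by the slides.

```python
from itertools import product


def tanimoto(fp1: int, fp2: int) -> float:
    """Tanimoto coefficient of two fingerprints held as integers."""
    both = bin(fp1 & fp2).count("1")
    either = bin(fp1 | fp2).count("1")
    return both / either if either else 0.0


def raw_score(set_1, set_2, threshold: float) -> float:
    """Sum Tc over all ligand pairs whose Tc reaches the threshold."""
    total = 0.0
    for fp1, fp2 in product(set_1, set_2):
        tc = tanimoto(fp1, fp2)
        if tc >= threshold:  # assumed inclusive comparison
            total += tc
    return total


# Toy 4-bit fingerprints, invented for illustration.
ligands_1 = [0b1100, 0b1010]
ligands_2 = [0b1100, 0b0011]

# Pairwise Tc values are 1.0, 0.0, 1/3 and 1/3.
print(raw_score(ligands_1, ligands_2, threshold=0.5))  # 1.0
```

Lowering the threshold to 0.3 lets the two weaker pairs contribute as well, which shows how strongly the raw score depends on the chosen Tc threshold.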
Similarity Ensemble Approach
SEA calculation step 3: convert the Raw Score into an E-value
The Tanimoto coefficient is the most commonly used similarity coefficient in chemical informatics
also called the Jaccard coefficient
Structural information of small-molecule ligands
Tanimoto coefficient
$similarity = \frac{C}{A + B - C} = \frac{6}{13 + 8 - 6} = 0.4$
the number of bits set in both molecules divided by the number of bits set in either molecule
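This definition maps directly onto bitwise operations: AND gives the bits set in both molecules and OR gives the bits set in either. The sketch below assumes fingerprints stored as Python integers; returning 0.0 for two empty fingerprints is a convention chosen here, not something the slides specify.

```python
def tanimoto(fp1: int, fp2: int) -> float:
    """Tanimoto coefficient of two fingerprints held as integers."""
    both = bin(fp1 & fp2).count("1")    # C: bits set in both molecules
    either = bin(fp1 | fp2).count("1")  # A + B - C: bits set in either
    return both / either if either else 0.0  # assumed convention for empty input


struct_a = int("00010100010101000101010011110100", 2)
struct_b = int("00000000100101001001000011100000", 2)

print(tanimoto(struct_a, struct_b))  # 0.4
```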
00010100010101000101010011110100
0 means fragment is not present in structure
1 means fragment is present in structure (perhaps multiple times)
each 0 or 1 can be represented as a single bit in the computer (a “bitstring”)
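A toy encoder makes the idea concrete. The four-entry fragment dictionary and the `encode` function below are hypothetical; real fingerprint schemes use far larger dictionaries or hashed paths.

```python
# Hypothetical fragment dictionary: list position = bit position.
FRAGMENTS = ["C=O", "N-H", "O-H", "aromatic ring"]


def encode(present_fragments) -> str:
    """Return a bitstring with 1 where a dictionary fragment is present."""
    return "".join(
        "1" if fragment in present_fragments else "0"
        for fragment in FRAGMENTS
    )


bits = encode({"C=O", "aromatic ring"})
print(bits)          # 1001
print(int(bits, 2))  # 9, the same bitstring stored as an integer
```

Because each position only records presence or absence, a fragment that occurs several times in a structure still sets a single bit.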
Matador
~770 drugs; ~7000 direct and 5000 indirect DTRs
SuperTarget
~1500 drugs; ~7300 DTRs
Therapeutic Target Database (TTD)
~2100 drugs; ~1535 DTRs
Computational methods for predicting drug targets
College of Bioinformatics Science and Technology
Harbin Medical University
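Ligand structure similarity strategy
Jiankai Xu, College of Bioinformatics Science and Technology
Harbin Medical University
Outline
Structural information of small-molecule ligands
Similarity measures
SEA algorithm ※▲
Applications of SEA ※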