Introduction to Probabilistic Graphical Models: Combining Probability Theory and Graph Theory


P(c_i | O) = P(O | c_i) P(c_i) / P(O)
c_i: class i
P(O): normalization factor
What we care about is the conditional probability P(c_i | O), which is the ratio of a joint probability, P(O | c_i) P(c_i) = P(O, c_i), to a marginal probability, P(O).
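A minimal sketch with made-up numbers, showing the posterior computed by normalizing likelihood × prior:

```python
# Posterior P(c_i | O) from likelihoods P(O | c_i) and priors P(c_i),
# using illustrative (made-up) numbers for a two-class problem.
prior = {"c1": 0.7, "c2": 0.3}
likelihood = {"c1": 0.2, "c2": 0.6}        # P(O | c_i) for one observation O

evidence = sum(likelihood[c] * prior[c] for c in prior)   # P(O), the normalization factor
posterior = {c: likelihood[c] * prior[c] / evidence for c in prior}
print(posterior)                            # {'c1': 0.4375, 'c2': 0.5625}
```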
Different thoughts of machine learning
Statistics (modeling uncertainty, detailed information) vs. Logics (modeling complexity, high level information)
Definition of Joint Probability Distribution Check:
∑_{x_i} f_i(x_i, x_{π_i}) = 1
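A minimal sketch of this check on a hypothetical binary conditional probability table:

```python
# Each row f_i(., x_pi) of a (hypothetical) binary CPT must be non-negative
# and sum to one over the values of x_i.
cpt = {(0,): [0.9, 0.1], (1,): [0.4, 0.6]}   # parent configuration -> [f(x_i=0), f(x_i=1)]
for row in cpt.values():
    assert min(row) >= 0 and abs(sum(row) - 1.0) < 1e-9
```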
Representation
Graphical models represent joint probability distributions more economically, using a set of “local” relationships among variables.
Directed PGM
Undirected PGM
Insights of PGM
Outline
Preparations
PGM “is” a universal model
Different thoughts of machine learning
Different training approaches
Different data types
Unifying Logical and Statistical AI. Pedro Domingos, University of Washington. AAAI 2006.
Speech: Statistical information (Acoustic model + Language model + Affect model…) + High level information (Expert/Logics)
Two incoming arrows
Check through reachability
Outline
Preparations
Probabilistic Graphical Models (PGM)
Directed PGM
Undirected PGM
Insights of PGM
Undirected PGM (MRF)
Potential function (local parameterization)
ϕ_{X_C}(x_C): potential function on the possible realizations x_C of the maximal clique X_C
Probability Distribution(2)
p(x, z) = ∑_y p(x, y, z) = p(x) p(z)
Classical Markov chain: "past", "present", "future"
Common cause: Y "explains" all the dependencies between X and Z
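To make the Markov chain case concrete, a small numeric sketch with made-up CPTs, verifying that p(z | x, y) = p(z | y) ("the future is independent of the past given the present"):

```python
import itertools

# Chain x -> y -> z with hypothetical binary CPTs.
px  = [0.6, 0.4]                         # p(x)
pyx = {0: [0.7, 0.3], 1: [0.2, 0.8]}     # p(y | x)
pzy = {0: [0.9, 0.1], 1: [0.5, 0.5]}     # p(z | y)

joint = {(x, y, z): px[x] * pyx[x][y] * pzy[y][z]
         for x, y, z in itertools.product([0, 1], repeat=3)}

# p(z | x, y) equals p(z | y) for every configuration: x tells us nothing more.
for x, y, z in joint:
    p_z_given_xy = joint[(x, y, z)] / sum(joint[(x, y, zz)] for zz in [0, 1])
    assert abs(p_z_given_xy - pzy[y][z]) < 1e-9
```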
Probability Distribution
Representation
Queries
Implementation
Interpretation
Conditional Independence
Probability Distribution(1)
Clique
A clique of a graph is a fully-connected subset of nodes. Local functions should not be defined on domains of nodes that extend beyond the boundaries of cliques.
Chain rule of probability theory (stated below)
Conditional Independence
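For completeness, the chain rule referred to above is the standard identity:
p(x_1, …, x_n) = ∏_i p(x_i | x_1, …, x_{i−1})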
Outline
Preparations
Probabilistic Graphical Models (PGM)
Directed PGM
Undirected PGM
Insights of PGM
Directed PGM (BN)
Probability Distribution
Representation
Queries
Implementation
Interpretation
Conditional Independence
Probability Distribution
f_i(x_i, x_{π_i}) ≥ 0
Bayesian Framework
Chain rule of probability theory
Conditional Independence
Probabilistic Graphical Models (PGM)
Directed PGM
Undirected PGM
Insights of PGM
To model both temporal and spatial data, by unifying
Thoughts: Statistics + Logics
Approaches: Maximum Likelihood Training + Discriminative Training
Furthermore, the directed and undirected models together provide modeling power beyond that which could be provided by either alone.
Common effect: multiple, competing explanations
Conditional Independence (check)
Bayes ball algorithm (rules)
One incoming arrow and one outgoing arrow
Two outgoing arrows
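The ball-passing rules above are one way to decide d-separation. As a sketch, the check below uses the equivalent moralized-ancestral-graph construction instead of the rules themselves; the graph encoding (each node mapped to its set of parents) and the example graphs are assumptions for illustration:

```python
from itertools import combinations

def ancestors(parents, nodes):
    """All ancestors of `nodes` (including the nodes themselves)."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in parents[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def d_separated(parents, xs, zs, given):
    """True iff xs and zs are d-separated by `given` in the DAG `parents`."""
    keep = ancestors(parents, set(xs) | set(zs) | set(given))
    # Moralize the ancestral subgraph: keep edges, marry co-parents, drop directions.
    adj = {v: set() for v in keep}
    for v in keep:
        for p in parents[v]:
            adj[v].add(p)
            adj[p].add(v)
        for a, b in combinations(sorted(parents[v]), 2):
            adj[a].add(b)
            adj[b].add(a)
    # Delete the conditioning nodes, then test reachability (plain graph search).
    reach = frontier = set(xs) - set(given)
    while frontier:
        frontier = {u for v in frontier for u in adj[v]} - reach - set(given)
        reach = reach | frontier
    return not (reach & set(zs))

# Chain x -> y -> z: blocked by observing y.
chain = {"x": set(), "y": {"x"}, "z": {"y"}}
print(d_separated(chain, {"x"}, {"z"}, {"y"}))   # True
# Common effect x -> y <- z: independent a priori, coupled once y is observed.
vee = {"x": set(), "z": set(), "y": {"x", "z"}}
print(d_separated(vee, {"x"}, {"z"}, set()))     # True
print(d_separated(vee, {"x"}, {"z"}, {"y"}))     # False
```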
Lecture notes for Introduction to Artificial Intelligence, College of Computer Science, Zhejiang University
Lecture 10: Introduction to Probabilistic Graphical Models
Chapter 10 Introduction to Probabilistic Graphical Models
Weike Pan, and Congfu Xu
{panweike, xucongfu}@zju.edu.cn
Institute of Artificial Intelligence, College of Computer Science, Zhejiang University
October 12, 2006
References
An Introduction to Probabilistic Graphical Models. Michael I. Jordan. http://www.cs.berkeley.edu/~jordan/graphical.html
Outline
Preparations
Probabilistic Graphical Models (PGM)
Bayesian Framework
Problem description
Observation → Conclusion (classification or prediction)
Likelihood, prior probability
Bayesian rule
Observation → A posteriori probability
Undirected graph (Markov Random Fields, MRF)
Modeling symmetric effects and dependencies: spatial dependence (e.g. image analysis…)
PGM “is” a universal model
Different training approaches
Maximum Likelihood Training: MAP (Maximum a Posteriori) vs. Discriminative Training: Maximum Margin (SVM)
Speech: classical combination – Maximum Likelihood + Discriminative Training
Conditional Independence (basic)
Interpret missing edges in terms of conditional independence
Assert the conditional independence of a node from its ancestors, conditional on its parents.
Z = ∑_x ∏_C ϕ_{X_C}(x_C)
  = ∑_x exp{−H(x)}
Conditional Independence
It’s a “reachability” problem in graph theory.
Representation
Outline
Preparations
Probabilistic Graphical Models (PGM)
Different data types
Directed acyclic graph (Bayesian Networks, BN)
Modeling asymmetric effects and dependencies: causal/temporal dependence (e.g. speech analysis, DNA sequence analysis…)
Maximal cliques
Probability Distribution(3)
Joint probability distribution
Boltzmann distribution
p(x) = (1/Z) ∏_C ϕ_{X_C}(x_C)
     = (1/Z) ∏_C exp{−H_C(x_C)}
     = (1/Z) exp{−∑_C H_C(x_C)}
     = (1/Z) exp{−H(x)}
Z: normalization factor
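A brute-force sketch on a tiny chain MRF, with made-up potential values, confirming that the product-of-potentials form and the Boltzmann form agree and that Z normalizes p to sum to one:

```python
import itertools, math

# Tiny 3-node chain MRF; maximal cliques {x1,x2} and {x2,x3} share one potential.
def phi(a, b):                       # phi = exp(-H_C), so H_C(a, b) = -log phi(a, b)
    return 2.0 if a == b else 0.5

states = [0, 1]
Z = sum(phi(x1, x2) * phi(x2, x3)
        for x1, x2, x3 in itertools.product(states, repeat=3))

def p(x1, x2, x3):
    return phi(x1, x2) * phi(x2, x3) / Z

def H(x1, x2, x3):                   # total energy = sum of clique energies
    return -math.log(phi(x1, x2)) - math.log(phi(x2, x3))

assert abs(sum(p(*x) for x in itertools.product(states, repeat=3)) - 1.0) < 1e-9
assert all(abs(p(*x) - math.exp(-H(*x)) / Z) < 1e-9
           for x in itertools.product(states, repeat=3))
```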
PGM
Nodes represent random variables/states The missing arcs represent conditional independence assumptions
The graph structure implies the decomposition
Conditional Independence (3 canonical graphs)
Conditional Independence vs. Marginal Independence
p(x, y, z) = p(x) p(z) p(y | x, z)
p(x, z) = p(x) p(z)
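A numeric sketch of this "explaining away" effect, with made-up tables: x and z are marginally independent, but become dependent once the common effect y is observed:

```python
import itertools

# Common effect x -> y <- z with hypothetical binary tables.
px, pz = [0.5, 0.5], [0.5, 0.5]
py_xz = {(x, z): [0.9, 0.1] if (x, z) == (0, 0) else [0.2, 0.8]
         for x in [0, 1] for z in [0, 1]}          # p(y | x, z)

joint = {(x, y, z): px[x] * pz[z] * py_xz[(x, z)][y]
         for x, y, z in itertools.product([0, 1], repeat=3)}

def marg(**fixed):                                  # marginal probability of an event
    keys = ("x", "y", "z")
    return sum(p for vals, p in joint.items()
               if all(vals[keys.index(k)] == v for k, v in fixed.items()))

# Marginal independence: p(x, z) = p(x) p(z).
assert abs(marg(x=0, z=0) - marg(x=0) * marg(z=0)) < 1e-9
# But conditioned on y, x and z are coupled: p(x, z | y) != p(x | y) p(z | y).
p_xz_y = marg(x=0, z=0, y=1) / marg(y=1)
p_x_y  = marg(x=0, y=1) / marg(y=1)
p_z_y  = marg(z=0, y=1) / marg(y=1)
print(abs(p_xz_y - p_x_y * p_z_y) > 1e-3)           # True: "explaining away"
```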
Maximal cliques
The maximal cliques of a graph are the cliques that cannot be extended to include additional nodes without losing the property of being fully connected. We restrict ourselves to maximal cliques without loss of generality, as they capture all possible dependencies.
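As an illustration, maximal cliques can be enumerated with networkx on a small assumed graph:

```python
import networkx as nx

# {a, b, c} is a maximal clique; {c, d} is too, since it cannot be extended.
G = nx.Graph([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
print(list(nx.find_cliques(G)))   # [['a', 'b', 'c'], ['c', 'd']] (order may vary)
```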