Mathematical Expectation
4.1 Mathematical Expectation

Example 4.1.2  Let the diameter of a ball be X ~ U(a, b). Find the mathematical expectation E(V) of the ball's volume V = (π/6)X³.
Solution. Since V = (π/6)X³, the density of V is
f_V(y) = (1/(b − a)) · (2/(9π))^(1/3) · y^(−2/3),   for (π/6)a³ ≤ y ≤ (π/6)b³,
and f_V(y) = 0 otherwise.
Hence E(V) = ∫_{−∞}^{+∞} y f_V(y) dy = (π/24)(a + b)(a² + b²).
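As a quick numerical sanity check of this result, the sketch below (assuming arbitrary illustrative endpoints a = 1 and b = 3, and Python with NumPy) estimates E(V) by simulation and compares it with π(a + b)(a² + b²)/24:

```python
import numpy as np

# Illustrative endpoints; any 0 < a < b would do.
a, b = 1.0, 3.0
rng = np.random.default_rng(0)

x = rng.uniform(a, b, size=1_000_000)   # X ~ U(a, b)
v = np.pi / 6 * x**3                    # V = (pi/6) X^3

mc_mean = v.mean()                                  # Monte Carlo estimate of E(V)
closed_form = np.pi / 24 * (a + b) * (a**2 + b**2)  # formula from the example
print(mc_mean, closed_form)             # the two values agree to a few decimals
```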
2. If X ~ B(n, p), then E(X) = np:
E(X) = Σ_{i=1}^{n} i · C(n, i) p^i (1 − p)^{n−i}
     = np Σ_{i=1}^{n} C(n−1, i−1) p^{i−1} (1 − p)^{(n−1)−(i−1)}
     = np [p + (1 − p)]^{n−1}
     = np.
This can also be proved using the additivity of the binomial distribution; see Example 4.1.12.
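The identity above can also be checked numerically; the sketch below (with illustrative, arbitrarily chosen values n = 12 and p = 0.3) sums i·C(n, i)p^i(1 − p)^{n−i} directly and compares the result with np:

```python
from math import comb

# Illustrative parameters, not taken from the text.
n, p = 12, 0.3

# Direct evaluation of the defining sum for the binomial mean.
mean = sum(i * comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1))
print(mean, n * p)   # both print 3.6 (up to floating-point error)
```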
3. If X ~ N(µ, σ²), then E(X) = µ.
Proof. E(X) = ∫_{−∞}^{+∞} x f(x) dx = (1/(σ√(2π))) ∫_{−∞}^{+∞} x e^{−(x−µ)²/(2σ²)} dx; substituting t = (x − µ)/σ gives
E(X) = (1/√(2π)) ∫_{−∞}^{+∞} (µ + σt) e^{−t²/2} dt = µ,
since the σt term integrates to zero by symmetry.
§4.1 Mathematical Expectation
I. The mathematical expectation of a random variable
Definition 4.1.1  Let X be a discrete random variable with distribution law
P{X = x_i} = p_i,   i = 1, 2, 3, ....
If the series Σ_{i=1}^{+∞} |x_i| p_i < +∞, then
E(X) = Σ_{i=1}^{+∞} x_i p_i
is called the mathematical expectation (mean) of X.
Solution.  E(XY) = Σ_i Σ_j x_i y_j P{X = x_i, Y = y_j} = Σ_i Σ_j x_i y_j p_{i·} p_{·j} = (Σ_i x_i p_{i·}) (Σ_j y_j p_{·j}) = E(X)E(Y).
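A small numerical illustration of this factorization for independent X and Y (the two marginal distributions below are hypothetical, chosen only for the example):

```python
import numpy as np

# Hypothetical marginal distributions of two independent discrete variables.
x_vals, p_x = np.array([0.0, 1.0, 2.0]), np.array([0.2, 0.5, 0.3])
y_vals, p_y = np.array([-1.0, 1.0]),     np.array([0.4, 0.6])

# Joint probabilities under independence: p_ij = p_i. * p_.j
joint = np.outer(p_x, p_y)

e_xy = np.sum(np.outer(x_vals, y_vals) * joint)   # sum_i sum_j x_i y_j p_ij
print(e_xy, (x_vals @ p_x) * (y_vals @ p_y))      # identical values: E(XY) = E(X)E(Y)
```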
Chapter 3  Mathematical Expectation

The r-th moment of X about the origin, also called the r-th raw moment, is defined as
µ'_r = E(X^r) = Σ_x x^r f(x)   (discrete variable),
µ'_r = E(X^r) = ∫ x^r f(x) dx   (continuous variable).
Moment generating function
The moment generating function of X is defined as M_X(t) = E(e^{tX}). Provided this expectation converges, it is
M_X(t) = Σ_x e^{tx} f(x)   (discrete variable),
M_X(t) = ∫ e^{tx} f(x) dx   (continuous variable).
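As an illustration of how raw moments are read off from the moment generating function, the following SymPy sketch builds M_X(t) for a small made-up discrete distribution and differentiates it at t = 0 (the values and probabilities are assumptions chosen only for the example):

```python
import sympy as sp

t = sp.symbols('t')

# Hypothetical discrete distribution: P(X=0)=1/4, P(X=1)=1/2, P(X=2)=1/4.
vals = [0, 1, 2]
probs = [sp.Rational(1, 4), sp.Rational(1, 2), sp.Rational(1, 4)]

# M_X(t) = E[e^{tX}] = sum over x of e^{t x} f(x)   (discrete case)
M = sum(p * sp.exp(t * v) for v, p in zip(vals, probs))

# The r-th derivative of M at t = 0 gives the r-th raw moment E[X^r].
m1 = sp.diff(M, t, 1).subs(t, 0)   # E[X]   -> 1
m2 = sp.diff(M, t, 2).subs(t, 0)   # E[X^2] -> 3/2
print(m1, m2)
```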
Definition of mathematical expectation
The mathematical expectation is the expected value of a random variable, or simply its expectation. For a discrete random variable,
E(X) = x_1 P(X = x_1) + x_2 P(X = x_2) + ... + x_n P(X = x_n) = Σ_j x_j P(X = x_j) = Σ_j x_j f(x_j).
If the random variable takes each of its values with equal probability, the expectation reduces to a special case, the arithmetic mean:
E(X) = (x_1 + x_2 + ... + x_n)/n.
Variance and covariance for joint distributions
If X and Y are two continuous random variables with joint density function f(x, y), then the means (expectations) of X and Y are
µ_X = E(X) = ∫∫ x f(x, y) dx dy,
µ_Y = E(Y) = ∫∫ y f(x, y) dx dy,
and the variances are
σ_X² = E[(X − µ_X)²],   σ_Y² = E[(Y − µ_Y)²].
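These double integrals can be evaluated numerically; the sketch below uses SciPy's dblquad with a hypothetical joint density f(x, y) = x + y on the unit square (chosen only because it integrates to 1 there):

```python
from scipy.integrate import dblquad

# Hypothetical joint density on [0,1] x [0,1].
def f(x, y):
    return x + y

# dblquad integrates func(y, x) over y first, then over x.
mu_x, _ = dblquad(lambda y, x: x * f(x, y), 0, 1, 0, 1)                 # E(X)   = 7/12
mu_y, _ = dblquad(lambda y, x: y * f(x, y), 0, 1, 0, 1)                 # E(Y)   = 7/12
var_x, _ = dblquad(lambda y, x: (x - mu_x) ** 2 * f(x, y), 0, 1, 0, 1)  # Var(X) = 11/144
print(mu_x, mu_y, var_x)
```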
Standardized random variables
Let X be a random variable with mean µ and standard deviation σ. The standardized random variable is defined by
X* = (X − µ)/σ.
An important property of X* is that it has mean 0 and variance 1; standardized variables are useful for comparing different distributions.
Moments
The r-th central moment of a random variable X about its mean µ is defined as
µ_r = E((X − µ)^r),   r = 0, 1, 2, ....
It follows that µ_0 = 1, µ_1 = 0, and µ_2 = σ² (the variance).
Correlation coefficient
If X and Y are independent, then Cov(X, Y) = 0. At the other extreme, if X and Y are completely dependent, for example X = Y, then Cov(X, Y) = σ_X σ_Y. This suggests the following measure of the dependence between X and Y:
ρ = σ_XY / (σ_X σ_Y) = Cov(X, Y) / (σ_X σ_Y).
By Theorem 4, −1 ≤ ρ ≤ 1. When ρ = 0 we say that X and Y are uncorrelated; in that case the variables may or may not be independent. Correlation is discussed further in later chapters.
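A brief simulation of this measure (the linear relationship between the samples and the noise level are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical samples: Y depends linearly on X plus independent noise.
x = rng.normal(size=100_000)
y = 2.0 * x + rng.normal(scale=0.5, size=100_000)

cov_xy = np.cov(x, y, ddof=0)[0, 1]
rho = cov_xy / (x.std() * y.std())       # rho = sigma_XY / (sigma_X * sigma_Y)
print(rho, np.corrcoef(x, y)[0, 1])      # both close to 2 / sqrt(4.25) ~ 0.970
```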
Mathematical Expectation (Notes)

Mathematical expectation

Contents: definitions of expectation, conditional expectation, variance, covariance, correlation coefficient, moments.

Definition. A discrete random variable ξ with distribution law taking values x_1, x_2, ..., x_k, ... with probabilities p_1, p_2, ..., p_k, ... has mathematical expectation Eξ = Σ_k x_k p_k, provided the series Σ_k x_k p_k converges absolutely.

Definition. A continuous random variable ξ with density function p(x) has expectation Eξ = ∫_{−∞}^{+∞} x p(x) dx, provided ∫_{−∞}^{+∞} |x| p(x) dx < ∞.

Definition. A random variable ξ with distribution function F(x) has expectation Eξ = ∫_{−∞}^{+∞} x dF(x), provided ∫_{−∞}^{+∞} |x| dF(x) < ∞.

If ξ is a random variable and η = f(ξ), then Eη = ∫_{−∞}^{+∞} f(y) dF_ξ(y); when ξ is continuous with density p(x), Eη = ∫_{−∞}^{+∞} f(y) p(y) dy. Random variables ξ and η have the same distribution if and only if Ef(ξ) = Ef(η) for every bounded continuous function f.

Conditional expectation

Definition. Given ξ = x, let the conditional distribution function of η be F_{η|ξ}(y|x). The conditional expectation is
E(η | ξ = x) = ∫_{−∞}^{+∞} y dF_{η|ξ}(y|x).
If there is a conditional distribution law p_{η|ξ}(y_j | x), then E(η | ξ = x) = Σ_j y_j p_{η|ξ}(y_j | x); if there is a conditional density p_{η|ξ}(y | x), then E(η | ξ = x) = ∫_{−∞}^{+∞} y p_{η|ξ}(y|x) dy. Clearly, if ξ and η are independent, then E(η | ξ = x) = Eη.

Theorem. The conditional expectation E(η | ξ = x) can be viewed as a function of x, written m(x); then m(ξ) is a random variable, called the conditional expectation of η given ξ and written E(η | ξ). Its expectation satisfies
E[E(η | ξ)] = Eη.
Proof. By the definition of expectation,
m(x) = E(η | ξ = x) = ∫ y p_{η|ξ}(y|x) dy = ∫ y p(x, y)/p_ξ(x) dy,
so E[E(η | ξ)] = E[m(ξ)] = ∫ m(x) p_ξ(x) dx; substituting gives ∫∫ y p(x, y) dy dx = Eη. Intuitively, E(η | ξ) is the expectation of η for a given value of ξ; it is a function of ξ, and taking its expectation averages the expectation of η over all values of ξ.

Law of total expectation. When ξ is discrete with p_i = P(ξ = x_i),
Eη = Σ_i p_i E(η | ξ = x_i),
a direct consequence of the identity above.

Properties.
- Additivity: if Eξ_1, ..., Eξ_n exist, then for all constants c_1, ..., c_n and b, E(Σ_{i=1}^n c_i ξ_i + b) = Σ_{i=1}^n c_i Eξ_i + b.
- Multiplicativity: if ξ_1, ..., ξ_n are mutually independent and Eξ_1, ..., Eξ_n exist, then E(ξ_1 ⋯ ξ_n) = Eξ_1 ⋯ Eξ_n.
- Bounded convergence theorem: if lim_{n→∞} ξ_n(ω) = ξ(ω) for every ω ∈ Ω and |ξ_n| ≤ M for all n ≥ 1, then lim_{n→∞} Eξ_n = Eξ.
- E(h(ξ)η | ξ) = h(ξ) E(η | ξ).
- Conditional Cauchy-Schwarz inequality: |E(XY | Z)| ≤ √(E(X² | Z)) · √(E(Y² | Z)).

Variance

Definition. ξ − Eξ is the deviation of ξ from its mean Eξ. If E(ξ − Eξ)² exists and is finite, it is called the variance of ξ, written Var ξ or Dξ:
Var ξ = E(ξ − Eξ)² = Eξ² − (Eξ)².
To keep the units consistent, one often uses the standard deviation √(Var ξ).

Chebyshev's inequality. If the variance exists, then for every ε > 0,
P(|ξ − Eξ| ≥ ε) ≤ Var ξ / ε².
Proof (a neat bounding argument):
P(|ξ − Eξ| ≥ ε) = ∫_{|x−Eξ|≥ε} dF(x) ≤ ∫_{|x−Eξ|≥ε} ((x − Eξ)²/ε²) dF(x) ≤ ∫_{−∞}^{+∞} ((x − Eξ)²/ε²) dF(x) = Var ξ / ε².
Chebyshev's inequality says that the distance of ξ from its mean Eξ is controlled by the variance: ξ falls in (Eξ − ε, Eξ + ε) with probability at least 1 − Var ξ / ε².

Properties.
- Var ξ = 0 ⇔ P(ξ = c) = 1 for some constant c (a direct corollary of Chebyshev's inequality).
- Var(cξ + b) = c² Var ξ.
- Var ξ ≤ E(ξ − c)² for any constant c.
- Additivity: Var(Σ_{i=1}^n ξ_i) = Σ_{i=1}^n Var ξ_i + 2 Σ_{1≤i<j≤n} Cov(ξ_i, ξ_j). If ξ_1, ..., ξ_n are pairwise independent, then Cov(ξ_i, ξ_j) = 0 and Var(Σ_{i=1}^n ξ_i) = Σ_{i=1}^n Var ξ_i.

Covariance

Definition. Let ξ_i, ξ_j have joint distribution F_{ij}(x, y). If E|(ξ_i − Eξ_i)(ξ_j − Eξ_j)| < ∞, then
E(ξ_i − Eξ_i)(ξ_j − Eξ_j) = ∫∫ (x − Eξ_i)(y − Eξ_j) dF_{ij}(x, y)
is called the covariance of ξ_i and ξ_j, written Cov(ξ_i, ξ_j).

Properties.
- Cov(ξ, η) = Cov(η, ξ) = Eξη − Eξ·Eη, since
  E(ξ − Eξ)(η − Eη) = ∫∫ (x − Eξ)(y − Eη) dF(x, y) = ∫∫ (xy − x·Eη − y·Eξ + Eξ·Eη) dF(x, y) = Eξη − 2Eξ·Eη + Eξ·Eη = Eξη − Eξ·Eη.
- Additivity: Cov(Σ_{i=1}^n ξ_i, η) = Σ_{i=1}^n Cov(ξ_i, η).
- Cov(aξ + c, bη + d) = ab Cov(ξ, η).
- Cov(ξ, η) ≤ √(Var ξ) √(Var η).
- Cov(aξ + bη, cξ + dη) = ac Cov(ξ, ξ) + (ad + bc) Cov(ξ, η) + bd Cov(η, η).

Covariance matrix

The entries of the covariance matrix are the pairwise covariances of the components of a random vector:
B = E(ξ − Eξ)(ξ − Eξ)^T = (b_{ij}),   b_{ij} = Cov(ξ_i, ξ_j).
It is easy to see that B is symmetric and positive semi-definite. Under a linear transformation η = Cξ, the covariance matrix of η is
E[C(ξ − Eξ)(C(ξ − Eξ))^T] = C B C^T.
For a two-dimensional random vector the covariance matrix is
C = [ Var ξ, Eξη − Eξ·Eη ; Eξη − Eξ·Eη, Var η ].

Correlation coefficient

Computation: r_{ξη} = Cov(ξ, η) / √(Var ξ · Var η); the variables are uncorrelated when the correlation coefficient is 0.

Definition. Let ξ* = (ξ − Eξ)/√(Var ξ) and η* = (η − Eη)/√(Var η). Then r_{ξη} = Cov(ξ*, η*) = E(ξ*η*) is called the correlation coefficient of ξ and η.

Cauchy-Schwarz inequality. For any random variables ξ, η,
|Eξη|² ≤ Eξ² · Eη²,
with equality if and only if there exists t₀ such that P(η = t₀ξ) = 1.
Proof. Consider u(t) = E(η − tξ)² = t² Eξ² − 2t Eξη + Eη² ≥ 0 and analyse its discriminant.

Properties.
- |r_{ξη}| ≤ 1. When |r_{ξη}| = 1, ξ and η are linearly related with probability 1; when r_{ξη} = 0, ξ and η are uncorrelated.
- If the variances are finite, the following conditions are equivalent: Cov(ξ, η) = 0; ξ and η are uncorrelated; Eξη = Eξ·Eη; Var(ξ + η) = Var ξ + Var η.
- If ξ and η are independent with finite variances, then they are uncorrelated.
- For a bivariate normal random vector, the two components are uncorrelated if and only if they are independent.

Moments

Variance and covariance are, at heart, measures of how spread out a distribution is; the notion of a moment generalizes them.
- Raw moment: m_k = Eξ^k, the k-th moment about the origin.
- Central moment: c_k = E(ξ − Eξ)^k, the k-th moment about the mean.
- Absolute moment: M_α = E|ξ|^α, α ∈ ℝ, the absolute moment of order α.
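As a numerical illustration of Chebyshev's inequality, the sketch below compares the empirical probability P(|ξ − Eξ| ≥ ε) with the bound Var ξ / ε² for an arbitrarily chosen exponential distribution:

```python
import numpy as np

rng = np.random.default_rng(2)
xi = rng.exponential(scale=1.0, size=1_000_000)   # illustrative distribution

mean, var = xi.mean(), xi.var()
for eps in (1.0, 2.0, 3.0):
    lhs = np.mean(np.abs(xi - mean) >= eps)       # empirical P(|xi - E xi| >= eps)
    print(eps, lhs, var / eps**2)                 # the empirical value stays below Var/eps^2
```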
Probability Theory and Mathematical Statistics: Mathematical Expectation

§4.3 Mathematical Expectation of Functions of Random Variables: Examples
Probability Theory and Mathematical Statistics
§4.4 Covariance and Correlation Coefficient
Contents: covariance; correlation coefficient; examples
§4.4 Covariance and Correlation Coefficient: Covariance
1. Definition
2. Formula for computing the covariance
§4.1 Mathematical Expectation
Contents: expectation of discrete random variables; expectation of continuous random variables; properties of expectation
§4.1 Mathematical Expectation: Expectation of Discrete Random Variables
1. Definition
Remarks on the definition
(2) Absolute convergence of the series guarantees that its sum does not change when the terms are rearranged. This requirement is imposed because the mathematical expectation reflects the average of the possible values of the random variable X, and it should not depend on the order in which those values are listed.
§4.4 Covariance and Correlation Coefficient: Correlation Coefficient
3. Definition of uncorrelated random variables
4. Criteria for uncorrelatedness
The following four conditions are equivalent:
(1) ρ_XY = 0;  (2) Cov(X, Y) = 0;  (3) D(X + Y) = D(X) + D(Y);  (4) E(XY) = E(X)E(Y).
§4.3 Mathematical Expectation of Functions of Random Variables: Functions of Two-Dimensional Random Variables
Contents: expectation of functions of one-dimensional random variables; expectation of functions of two-dimensional random variables; examples
§4.3 Mathematical Expectation of Functions of Random Variables: Examples
5. The relationship between uncorrelatedness and mutual independence
§4.4 Covariance and Correlation Coefficient: Examples
3.3 Theorems on Mathematical Expectation

Solution. Define the random variables
X_i = 1 if the i-th component needs adjustment, and X_i = 0 if it does not,   i = 1, 2, 3.
Then X = X_1 + X_2 + X_3.
The distribution of X_1 is
X_1:  0    1
p:    0.9  0.1
so E(X_1) = 0 × 0.9 + 1 × 0.1 = 0.1.
Similarly E(X_2) = 0.2 and E(X_3) = 0.3, so by the properties (linearity) of the mathematical expectation,
E(X) = E(X_1) + E(X_2) + E(X_3) = 0.1 + 0.2 + 0.3 = 0.6.
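The same answer can be reproduced by simulation; the sketch below draws the three indicator variables with the probabilities from the example and averages their sum:

```python
import numpy as np

rng = np.random.default_rng(3)
p = [0.1, 0.2, 0.3]                 # P(X_i = 1) for i = 1, 2, 3, as in the example

# Simulate the three indicator variables and their sum X = X1 + X2 + X3.
samples = (rng.random((1_000_000, 3)) < p).sum(axis=1)
print(samples.mean())               # close to 0.1 + 0.2 + 0.3 = 0.6
```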
Chapter 3  Numerical Characteristics of Random Variables
§3.3 Theorems on Mathematical Expectation
[Theorem 1] The mathematical expectation of a constant equals that constant:
E(C) = C,
where C is a constant.
Proof. A constant C can be regarded as a random variable that takes only the single value C, and it takes this value with probability 1. Therefore
E(C) = C × 1 = C.
[Theorem 2] The mathematical expectation of a constant times a random variable equals the constant times the expectation of the random variable:
E(CX) = C E(X).
For example (using Theorems 2 and 3), if X has a Poisson distribution with parameter 2, then
E(X) = Σ_{k=0}^{∞} k · (2^k / k!) e^{−2} = 2,
and so E(Z) = E(3X − 2) = 3E(X) − 2 = 3 × 2 − 2 = 4.
Proof. For a discrete random variable X we have
E(CX) = Σ_i C x_i p(x_i) = C Σ_i x_i p(x_i) = C E(X),
and for a continuous random variable X we have
E(CX) = ∫ C x f(x) dx = C ∫ x f(x) dx = C E(X).
In both cases E(CX) = C E(X).
[Theorem 3] Let X and Y be two random variables. Then
E(X + Y) = E(X) + E(Y).
More generally, E(Σ_i X_i) = Σ_i E(X_i).
"Mathematical Expectation" Courseware

When carrying out the computation, pay attention to the limits of integration and to the range on which the probability density function is nonzero.
Properties of the mathematical expectation of a continuous random variable
Non-negativity: if X ≥ 0, then E(X) ≥ 0.
Additivity: for any two random variables X and Y, E(X + Y) = E(X) + E(Y); independence is not required.
Linearity: if a and b are constants, then E(aX + b) = aE(X) + b.
The variance is defined in terms of the mathematical expectation and measures how far the values of a random variable deviate from its expectation.
Mathematical expectation of continuous random variables
Definition of a continuous random variable
Continuous random variable: a random variable X whose possible values fill an interval of the real axis.
Probability density function: a function describing how the probability of the continuous random variable X is distributed over the points of its range.
"Mathematical Expectation" PPT Slides
Contents
• Introduction • Basic properties of mathematical expectation • Expectation of discrete random variables • Expectation of continuous random variables • Applications of mathematical expectation • Summary and outlook
Introduction
Definition of mathematical expectation
Mathematical expectation is an important concept in probability theory and statistics: it is the average, or probability-weighted average, of the values taken by a random variable.
The definition rests on the basic principles of probability theory: multiply each possible outcome by its corresponding probability, then add up these products.
Mathematical expectation has several important properties, such as linearity and related invariance properties, which are widely used throughout probability and statistics.
Origin and history of mathematical expectation
The origin of mathematical expectation can be traced back to the 17th century, when mathematicians began to study basic concepts of probability and statistics.
By computing the mathematical expectation of an investment portfolio we can understand its expected return and thereby design a more reasonable investment strategy.
Expectation and Variance of Common Distributions

Properties of the variance
Additivity: for two independent random variables X and Y, Var(X + Y) = Var(X) + Var(Y).
Scaling: for a constant a and a random variable X, Var(aX) = a² Var(X).
Non-negativity: for any random variable X, Var(X) ≥ 0, with Var(X) = 0 if and only if X is a constant.
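The scaling and additivity properties can be checked empirically; the distributions and the constant a below are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(loc=1.0, scale=2.0, size=1_000_000)   # Var(X) = 4
y = rng.uniform(0.0, 1.0, size=1_000_000)            # independent of X, Var(Y) = 1/12
a = 3.0

print(np.var(a * x), a**2 * np.var(x))       # scaling: Var(aX) = a^2 Var(X)
print(np.var(x + y), np.var(x) + np.var(y))  # additivity for independent X and Y
```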
Applications of Expectation and Variance
Applications in statistics
Descriptive statistics: the mathematical expectation and the variance describe the central tendency and the dispersion of a data set, helping us understand its basic features.
Parameter estimation: from the sample mean and sample variance we can estimate population parameters, for example unbiased estimates of the population mean and variance.
Hypothesis testing: expectation and variance are used to construct test statistics and to judge whether the null hypothesis holds.
Expectations of common distributions
Uniform distribution: E(X) = (a + b)/2, where a and b are the lower and upper limits of the uniform distribution.
Cauchy distribution: the mathematical expectation does not exist, because the defining integral fails to converge absolutely (the distribution has heavy tails).
Laplace distribution: E(X) = µ, where µ is the location parameter; the scale parameter does not affect the mean.
Variances of the normal, binomial, and Poisson distributions
Normal distribution: a common continuous distribution whose variance is written σ²; the variance describes how spread out the values of the random variable are.
Binomial distribution: a discrete distribution describing the number of successes in n independent, repeated Bernoulli trials; its variance is σ² = np(1 − p), where n is the number of trials and p is the probability of success in a single trial.
Poisson distribution: a discrete distribution describing the number of times a random event occurs in a period of time; its variance is σ² = λ, where λ is the average rate at which the event occurs.
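A quick empirical check of these three variance formulas (the parameter values n, p, and λ are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(5)
n, p, lam = 20, 0.3, 4.0

binom = rng.binomial(n, p, size=1_000_000)
poisson = rng.poisson(lam, size=1_000_000)
normal = rng.normal(0.0, 1.5, size=1_000_000)

print(binom.var(), n * p * (1 - p))   # both ~ 4.2
print(poisson.var(), lam)             # both ~ 4.0
print(normal.var(), 1.5**2)           # both ~ 2.25
```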
Summary of Common Expectation Formulas (High School)

Mathematical expectation is an important concept in statistics: it measures the average value of a set of data, and it helps researchers analyse data and draw valid conclusions.
In high-school mathematics, the commonly used expectation formulas are the following.
(1) Basic formula. The expectation is the probability-weighted average of the values:
E(X) = Σ_x x P(x),
where E(X) denotes the expectation, the sum runs over every possible value x, and P(x) is the probability of that value.
(2) Expectation of an expectation. Since E(X) is a constant, taking the expectation again changes nothing: E(E(X)) = E(X). When the inner expectation is a conditional expectation, this becomes the law of total expectation in formula (3).
(3) Law of total expectation. The expectation of a conditional expectation equals the unconditional expectation:
E(E(X | Y)) = Σ_y E(X | Y = y) P(Y = y) = E(X),
where the sum runs over the possible values y of the conditioning variable and P(Y = y) is the probability of each condition.
(4) Discrete-distribution formula. For a discrete probability distribution the expected value is again
E(X) = Σ_x x P(x),
the sum being taken over all possible values x with their probabilities P(x).
These are the expectation formulas commonly used in high-school mathematics; they help us analyse data more accurately and draw valid conclusions.
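Formula (3) can also be illustrated by simulation; in the sketch below the two conditional distributions and the probabilities P(Y = k) are hypothetical choices:

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical two-stage experiment: Y in {0, 1} chooses which distribution X comes from.
p_y = [0.4, 0.6]
y = rng.choice([0, 1], p=p_y, size=1_000_000)
x = np.where(y == 0, rng.normal(1.0, 1.0, y.size), rng.normal(5.0, 2.0, y.size))

cond_means = [x[y == k].mean() for k in (0, 1)]            # E(X | Y = k)
total = sum(pk * m for pk, m in zip(p_y, cond_means))      # sum_k E(X|Y=k) P(Y=k)
print(total, x.mean())    # both ~ 0.4*1 + 0.6*5 = 3.4
```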
Expected value

In probability theory, the expected value (or expectation, or mathematical expectation, or mean, or the first moment) of a random variable is the weighted average of all possible values that this random variable can take on. The weights used in computing this average correspond to the probabilities in case of a discrete random variable, or densities in case of a continuous random variable. From a rigorous theoretical standpoint, the expected value is the integral of the random variable with respect to its probability measure.[1][2]

The expected value may be intuitively understood by the law of large numbers: the expected value, when it exists, is almost surely the limit of the sample mean as sample size grows to infinity. More informally, it can be interpreted as the long-run average of the results of many independent repetitions of an experiment (e.g. a dice roll). The value may not be expected in the ordinary sense: the "expected value" itself may be unlikely or even impossible (such as having 2.5 children), just like the sample mean.

The expected value does not exist for some distributions with large "tails", such as the Cauchy distribution.[3]

It is possible to construct an expected value equal to the probability of an event by taking the expectation of an indicator function that is one if the event has occurred and zero otherwise. This relationship can be used to translate properties of expected values into properties of probabilities, e.g. using the law of large numbers to justify estimating probabilities by frequencies.

Definition

Discrete random variable, finite case
Suppose random variable X can take value x1 with probability p1, value x2 with probability p2, and so on, up to value xk with probability pk. Then the expectation of this random variable X is defined as
E[X] = x1 p1 + x2 p2 + ... + xk pk.
Since all probabilities pi add up to one (p1 + p2 + ... + pk = 1), the expected value can be viewed as the weighted average, with the pi's being the weights:
E[X] = (x1 p1 + x2 p2 + ... + xk pk) / (p1 + p2 + ... + pk).
If all outcomes xi are equally likely (that is, p1 = p2 = ... = pk), then the weighted average turns into the simple average. This is intuitive: the expected value of a random variable is the average of all values it can take; thus the expected value is what you expect to happen on average. If the outcomes xi are not equiprobable, then the simple average ought to be replaced with the weighted average, which takes into account the fact that some outcomes are more likely than the others. The intuition however remains the same: the expected value of X is what you expect to happen on average.

(Figure: an illustration of the convergence of sequence averages of rolls of a die to the expected value of 3.5 as the number of rolls (trials) grows.)

Example 1. Let X represent the outcome of a roll of a six-sided die. More specifically, X will be the number of pips showing on the top face of the die after the toss. The possible values for X are 1, 2, 3, 4, 5, 6, all equally likely (each having the probability of 1/6). The expectation of X is
E[X] = (1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5.
If you roll the die n times and compute the average (mean) of the results, then as n grows, the average will almost surely converge to the expected value, a fact known as the strong law of large numbers. One example sequence of ten rolls of the die is 2, 3, 1, 2, 5, 6, 2, 2, 2, 6, which has the average of 3.1, at distance 0.4 from the expected value of 3.5.
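A short simulation (the seed and the numbers of rolls are arbitrary) shows how such running averages approach 3.5:

```python
import numpy as np

rng = np.random.default_rng(7)
rolls = rng.integers(1, 7, size=10_000)        # fair six-sided die, values 1..6

# Running average after each roll.
running_mean = np.cumsum(rolls) / np.arange(1, rolls.size + 1)
print(running_mean[[9, 99, 999, 9999]])        # averages after 10, 100, 1000, 10000 rolls
# The averages drift toward the expected value 3.5 as the number of rolls grows.
```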
The convergence is relatively slow: the probability that the average falls within the range 3.5 ± 0.1 is 21.6% for ten rolls, 46.1% for a hundred rolls and 93.7% for a thousand rolls. See the figure for an illustration of the averages of longer sequences of rolls of the die and how they converge to the expected value of 3.5. More generally, the rate of convergence can be roughly quantified by e.g. Chebyshev's inequality and the Berry-Esseen theorem.

Example 2. The roulette game consists of a small ball and a wheel with 38 numbered pockets around the edge. As the wheel is spun, the ball bounces around randomly until it settles down in one of the pockets. Suppose random variable X represents the (monetary) outcome of a $1 bet on a single number ("straight up" bet). If the bet wins (which happens with probability 1/38), the payoff is $35; otherwise the player loses the bet. The expected profit from such a bet will be
E[X] = −1 × (37/38) + 35 × (1/38) ≈ −$0.0526.

Discrete random variable, countable case
Let X be a discrete random variable taking values x1, x2, ... with probabilities p1, p2, ... respectively. Then the expected value of this random variable is the infinite sum
E[X] = Σ_{i=1}^{∞} xi pi,
provided that this series converges absolutely (that is, the sum must remain finite if we were to replace all xi's with their absolute values). If this series does not converge absolutely, we say that the expected value of X does not exist.

For example, suppose random variable X takes values 1, −2, 3, −4, ..., with respective probabilities c/1², c/2², c/3², c/4², ..., where c = 6/π² is a normalizing constant that ensures the probabilities sum up to one. Then the infinite sum
Σ_i xi pi = c Σ_{i=1}^{∞} (−1)^{i+1}/i = (6/π²) ln 2
converges. However it would be incorrect to claim that the expected value of X is equal to this number; in fact E[X] does not exist, as this series does not converge absolutely (see harmonic series).

Univariate continuous random variable
If the probability distribution of X admits a probability density function f(x), then the expected value can be computed as
E[X] = ∫_{−∞}^{+∞} x f(x) dx.

General definition
In general, if X is a random variable defined on a probability space (Ω, Σ, P), then the expected value of X, denoted by E[X], ⟨X⟩, or X̄, is defined as the Lebesgue integral
E[X] = ∫_Ω X dP.
When this integral exists, it is defined as the expectation of X. Note that not all random variables have a finite expected value, since the integral may not converge absolutely; furthermore, for some it is not defined at all (e.g., Cauchy distribution). Two variables with the same probability distribution will have the same expected value, if it is defined.

It follows directly from the discrete case definition that if X is a constant random variable, i.e. X = b for some fixed real number b, then the expected value of X is also b.

The expected value of an arbitrary function of X, g(X), with respect to the probability density function f(x) is given by the inner product of f and g:
E[g(X)] = ∫_{−∞}^{+∞} g(x) f(x) dx.
This is sometimes called the law of the unconscious statistician. Using representations as a Riemann-Stieltjes integral and integration by parts, the formula can be restated as
E[X] = ∫_0^{∞} (1 − F(x)) dx − ∫_{−∞}^0 F(x) dx,
where F is the cumulative distribution function of X. As a special case, let α denote a positive real number; then
E[X^α] = α ∫_0^{∞} x^{α−1} Pr[X > x] dx.
In particular, for α = 1, this reduces to
E[X] = ∫_0^{∞} (1 − F(x)) dx,   if Pr[X ≥ 0] = 1.
Conventional terminology
When one speaks of the "expected price", "expected height", etc., one means the expected value of a random variable that is a price, a height, etc.
When one speaks of the "expected number of attempts needed to get one successful attempt", one might conservatively approximate it as the reciprocal of the probability of success for such an attempt. Cf. expected value of the geometric distribution.

Properties

Constants
The expected value of a constant is equal to the constant itself; i.e., if c is a constant, then E[c] = c.

Monotonicity
If X and Y are random variables such that X ≤ Y almost surely, then E[X] ≤ E[Y].

Linearity
The expected value operator (or expectation operator) E is linear in the sense that
E[X + c] = E[X] + c,
E[X + Y] = E[X] + E[Y],
E[aX] = a E[X].
Note that the second result is valid even if X is not statistically independent of Y. Combining the results from the previous three equations, we can see that
E[aX + bY] = a E[X] + b E[Y]
for any two random variables X and Y (which need to be defined on the same probability space) and any real numbers a and b.

Iterated expectation

Iterated expectation for discrete random variables
For any two discrete random variables X, Y one may define the conditional expectation:[4]
E[X | Y](y) = E[X | Y = y] = Σ_x x · P(X = x | Y = y),
which means that E[X | Y](y) is a function of y. Then the expectation of X satisfies
E[X] = Σ_y E[X | Y = y] P(Y = y).
Hence, the following equation holds:[5]
E[X] = E[E[X | Y]],
that is, the expectation of the conditional expectation of X given Y equals the expectation of X. The right hand side of this equation is referred to as the iterated expectation and is also sometimes called the tower rule or the tower property. This proposition is treated in the law of total expectation.

Iterated expectation for continuous random variables
In the continuous case, the results are completely analogous. The definition of conditional expectation would use inequalities, density functions, and integrals to replace equalities, mass functions, and summations, respectively. However, the main result still holds:
E[X] = E[E[X | Y]].

Inequality
If a random variable X is always less than or equal to another random variable Y, the expectation of X is less than or equal to that of Y: if X ≤ Y, then E[X] ≤ E[Y].
In particular, if we set Y to |X| we know X ≤ Y and −X ≤ Y. Therefore we know E[X] ≤ E[Y] and E[−X] ≤ E[Y]. From the linearity of expectation we know −E[X] ≤ E[Y]. Therefore the absolute value of the expectation of a random variable is less than or equal to the expectation of its absolute value:
|E[X]| ≤ E[|X|].

Non-multiplicativity
If one considers the joint probability density function of X and Y, say j(x, y), then the expectation of XY is
E[XY] = ∫∫ x y j(x, y) dx dy.
In general, the expected value operator is not multiplicative, i.e. E[XY] is not necessarily equal to E[X]·E[Y]. In fact, the amount by which multiplicativity fails is called the covariance:
Cov(X, Y) = E[XY] − E[X] E[Y].
Thus multiplicativity holds precisely when Cov(X, Y) = 0, in which case X and Y are said to be uncorrelated (independent variables are a notable case of uncorrelated variables).

Now if X and Y are independent, then by definition j(x, y) = f(x)g(y) where f and g are the marginal PDFs for X and Y. Then
E[XY] = ∫∫ x y f(x) g(y) dx dy = (∫ x f(x) dx)(∫ y g(y) dy) = E[X] E[Y]
and Cov(X, Y) = 0. Observe that independence of X and Y is required only to write j(x, y) = f(x)g(y), and this is required to establish the second equality above. The third equality follows from a basic application of the Fubini-Tonelli theorem.

Functional non-invariance
In general, the expectation operator and functions of random variables do not commute; that is,
E[g(X)] ≠ g(E[X]) in general.
A notable inequality concerning this topic is Jensen's inequality, involving expected values of convex (or concave) functions.

Uses and applications
The expected values of the powers of X are called the moments of X; the moments about the mean of X are expected values of powers of X − E[X].
The moments of some random variables can be used to specify their distributions, via their moment generating functions.

To empirically estimate the expected value of a random variable, one repeatedly measures observations of the variable and computes the arithmetic mean of the results. If the expected value exists, this procedure estimates the true expected value in an unbiased manner and has the property of minimizing the sum of the squares of the residuals (the sum of the squared differences between the observations and the estimate). The law of large numbers demonstrates (under fairly mild conditions) that, as the size of the sample gets larger, the variance of this estimate gets smaller.

This property is often exploited in a wide variety of applications, including general problems of statistical estimation and machine learning, to estimate (probabilistic) quantities of interest via Monte Carlo methods, since most quantities of interest can be written in terms of expectation, e.g.
P(X ∈ A) = E[1_A(X)],
where 1_A is the indicator function for the set A, i.e. 1_A(x) = 1 if x ∈ A and 0 otherwise.

In classical mechanics, the center of mass is an analogous concept to expectation. For example, suppose X is a discrete random variable with values xi and corresponding probabilities pi. Now consider a weightless rod on which are placed weights, at locations xi along the rod and having masses pi (whose sum is one). The point at which the rod balances is E[X].

Expected values can also be used to compute the variance, by means of the computational formula for the variance
Var(X) = E[X²] − (E[X])².

A very important application of the expectation value is in the field of quantum mechanics. The expectation value of a quantum mechanical operator A operating on a quantum state vector |ψ⟩ is written as ⟨A⟩ = ⟨ψ|A|ψ⟩. The uncertainty in A can be calculated using the formula (ΔA)² = ⟨A²⟩ − ⟨A⟩².

Expectation of matrices
If X is an m × n matrix, then the expected value of the matrix is defined as the matrix of expected values:
(E[X])_{ij} = E[X_{ij}].
This is utilized in covariance matrices.

Formulas for special cases

Discrete distribution taking only non-negative integer values
When a random variable takes only values in {0, 1, 2, 3, ...} we can use the following formula for computing its expectation (even when the expectation is infinite):
E[X] = Σ_{i=1}^{∞} P(X ≥ i).
Proof: interchanging the order of summation, we have
Σ_{i=1}^{∞} P(X ≥ i) = Σ_{i=1}^{∞} Σ_{j=i}^{∞} P(X = j) = Σ_{j=1}^{∞} Σ_{i=1}^{j} P(X = j) = Σ_{j=1}^{∞} j P(X = j) = E[X],
as claimed. This result can be a useful computational shortcut. For example, suppose we toss a coin where the probability of heads is p. How many tosses can we expect until the first heads (not including the heads itself)? Let X be this number. Note that we are counting only the tails and not the heads which ends the experiment; in particular, we can have X = 0. The expectation of X may be computed by
E[X] = Σ_{i=1}^{∞} P(X ≥ i) = Σ_{i=1}^{∞} (1 − p)^i = (1 − p)/p.
This is because the number of tosses is at least i exactly when the first i tosses yielded tails. This matches the expectation of a random variable with a (shifted) geometric distribution. We used the formula for a geometric progression:
Σ_{i=1}^{∞} r^i = r/(1 − r),   for |r| < 1.
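The tail-sum formula above is easy to verify numerically; the sketch below uses the coin-toss variable X (number of tails before the first head) with an illustrative success probability p = 0.25:

```python
import numpy as np

rng = np.random.default_rng(8)
p = 0.25                                        # illustrative probability of heads

# X = number of tails before the first head; E[X] = (1 - p) / p = 3.
x = rng.geometric(p, size=1_000_000) - 1        # numpy's geometric counts trials, so subtract 1

tail_sum = sum(np.mean(x >= i) for i in range(1, 200))   # sum over i >= 1 of P(X >= i)
print(x.mean(), tail_sum, (1 - p) / p)                   # all three values ~ 3
```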
Continuous distribution taking non-negative values
Analogously with the discrete case above, when a continuous random variable X takes only non-negative values, we can use the following formula for computing its expectation (even when the expectation is infinite):
E[X] = ∫_0^{∞} P(X ≥ x) dx.
Proof: It is first assumed that X has a density f_X(x). We present two techniques:
- Using integration by parts (a special case of the Riemann-Stieltjes restatement above):
  E[X] = ∫_0^{∞} x f_X(x) dx = [−x(1 − F(x))]_0^{∞} + ∫_0^{∞} (1 − F(x)) dx = ∫_0^{∞} (1 − F(x)) dx,
  and the bracket vanishes because[6] 1 − F(x) = o(1/x) as x → ∞.
- Using an interchange in the order of integration:
  ∫_0^{∞} P(X ≥ x) dx = ∫_0^{∞} ∫_x^{∞} f_X(t) dt dx = ∫_0^{∞} ∫_0^{t} f_X(t) dx dt = ∫_0^{∞} t f_X(t) dt = E[X].
In case no density exists, the same formula is seen to hold by interchanging the order of integration in the Riemann-Stieltjes representation E[X] = ∫_0^{∞} x dF(x).

History
The idea of the expected value originated in the middle of the 17th century from the study of the so-called problem of points. This problem is: how to divide the stakes in a fair way between two players who have to end their game before it's properly finished? This problem had been debated for centuries, and many conflicting proposals and solutions had been suggested over the years, when it was posed in 1654 to Blaise Pascal by a French nobleman, the chevalier de Méré. De Méré claimed that this problem couldn't be solved and that it showed just how flawed mathematics was when it came to its application to the real world. Pascal, being a mathematician, was provoked and determined to solve the problem once and for all. He began to discuss the problem in a now famous series of letters to Pierre de Fermat. Soon enough they both independently came up with a solution. They solved the problem in different computational ways but their results were identical because their computations were based on the same fundamental principle. The principle is that the value of a future gain should be directly proportional to the chance of getting it. This principle seemed to have come absolutely naturally to both of them. They were very pleased by the fact that they had found essentially the same solution and this in turn made them absolutely convinced they had solved the problem conclusively. However, they did not publish their findings. They only informed a small circle of mutual scientific friends in Paris about it.[7]

Three years later, in 1657, a Dutch mathematician, Christiaan Huygens, who had just visited Paris, published a treatise (see Huygens (1657)) "De ratiociniis in ludo aleae" on probability theory. In this book he considered the problem of points and presented a solution based on the same principle as the solutions of Pascal and Fermat. Huygens also extended the concept of expectation by adding rules for how to calculate expectations in more complicated situations than the original problem (e.g., for three or more players). In this sense this book can be seen as the first successful attempt at laying down the foundations of the theory of probability.

In the foreword to his book, Huygens wrote: "It should be said, also, that for some time some of the best mathematicians of France have occupied themselves with this kind of calculus so that no one should attribute to me the honour of the first invention. This does not belong to me. But these savants, although they put each other to the test by proposing to each other many questions difficult to solve, have hidden their methods. I have had therefore to examine and go deeply for myself into this matter by beginning with the elements, and it is impossible for me for this reason to affirm that I have even started from the same principle. But finally I have found that my answers in many cases do not differ from theirs." (cited by Edwards (2002)). Thus, Huygens learned about de Méré's problem in 1655 during his visit to France; later on in 1656 from his correspondence with Carcavi he learned that his method was essentially the same as Pascal's; so that before his book went to press in 1657 he knew about Pascal's priority in this subject.

Neither Pascal nor Huygens used the term "expectation" in its modern sense.
In particular, Huygens writes: "That my Chance or Expectation to win any thing is worth just such a Sum, as wou'd procure me in the same Chance and Expectation at a fair Lay. ... If I expect a or b, and have an equal Chance of gaining them, my Expectation is worth (a + b)/2." More than a hundred years later, in 1814, Pierre-Simon Laplace published his tract "Théorie analytique des probabilités", where the concept of expected value was defined explicitly:

... this advantage in the theory of chance is the product of the sum hoped for by the probability of obtaining it; it is the partial sum which ought to result when we do not wish to run the risks of the event in supposing that the division is made proportional to the probabilities. This division is the only equitable one when all strange circumstances are eliminated; because an equal degree of probability gives an equal right for the sum hoped for. We will call this advantage mathematical hope.

The use of the letter E to denote the expected value goes back to W. A. Whitworth (1901), "Choice and Chance". The symbol has become popular since for English writers it meant "Expectation", for Germans "Erwartungswert", and for French "Espérance mathématique".[8]