Chapter 3: Discrete Variables
Commonality Analysis in JMP: Software Operation

Commonality Analysis
Key components of data
Data Integrity
If the data integrity is in doubt, is it still valuable?
Completeness
If there is missing data, will the analysis provide the full picture?
– Grades: First, Second, Third ; Lot Numbers: 101, 102, 103
• Nominal Data
– A discrete quality characteristic is described to be nominal if the possible values are descriptive labels with no order or sequence.
[Table: data types (Continuous, Nominal, Ordinal) marked against analysis options; cell layout not recoverable]
[Figure: Character vs Continuous plot, categories PRE, PRE/PST, POST]
[Figure: Mosaic Plot, proportion of Good vs Bad by Category (A, B)]
Fast Retrieval
If we need 2 weeks to retrieve the data, will it still be relevant?
Discrete and Continuous Random Variables

A random variable takes numerical values that describe the outcomes of some chance process. The probability distribution of a random variable gives its possible values and their probabilities.
Discrete Random Variables and Their Probability Distributions
A discrete random variable X takes a fixed set of possible values with gaps between. The probability distribution of a discrete random variable X lists the values xi and their probabilities pi:
A numerical variable that describes the outcomes of a chance process is called a random variable. The probability model for a random variable is its probability distribution
Example: Consider tossing a fair coin 3 times. Define X = the number of heads obtained.
X = 0: TTT
X = 1: HTT THT TTH
X = 2: HHT HTH THH
X = 3: HHH
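The coin-toss example can be checked by brute-force enumeration. The following is a minimal sketch, assuming a fair coin and independent tosses; the function name `heads_distribution` is introduced here for illustration only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_tosses):
    """Return {x: P(X = x)} where X counts heads in n fair, independent tosses."""
    # Every sequence of H/T is equally likely for a fair coin.
    outcomes = list(product("HT", repeat=n_tosses))
    counts = Counter(outcome.count("H") for outcome in outcomes)
    total = len(outcomes)
    return {x: Fraction(c, total) for x, c in sorted(counts.items())}


dist = heads_distribution(3)
for x, p in dist.items():
    print(f"P(X = {x}) = {p}")
```

Exact fractions are used so that the probabilities can be compared directly with a hand-built table.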
Probability Theory and Mathematical Statistics (English), Chapter 5

5. Random Vectors and Joint Probability Distributions

5.1 Concept of Joint Probability Distributions

(1) Discrete Variables Case

Often, trials are conducted where two random variables (X, Y) are observed simultaneously in order to determine not only their individual behavior but also the degree of relationship between them.

For two discrete random variables X and Y, we write the probability that X will take the value x and Y will take the value y as P(X=x, Y=y). Consequently, P(X=x, Y=y) is the probability of the intersection of the events X=x and Y=y:

$$(X=x,\ Y=y) = (X=x) \cap (Y=y).$$

The distribution of probability is specified by listing the probabilities associated with all possible pairs of values x and y, either by formula or in a table. We refer to the function p(x, y) = P(X=x, Y=y) and the corresponding possible values (x, y) as the joint probability distribution of X and Y. They satisfy

$$p(x, y) \ge 0, \qquad \sum_x \sum_y p(x, y) = 1,$$

where the sum is over all possible values of the variables.

Example 5.1.1 Calculating probabilities from a discrete joint probability distribution

Let X and Y have the joint probability distribution

         x = 0   x = 1   x = 2
y = 0    0.1     0.4     0.1
y = 1    0.2     0.2     0

(a) Find $P(X + Y > 1)$;
(b) Find the probability distribution $p_X(x) = P(X = x)$ of the individual random variable X.

Solution
(a) The event $X + Y > 1$ is composed of the pairs of values (1,1), (2,0), and (2,1). Adding their corresponding probabilities,

$$P(X + Y > 1) = p(1,1) + p(2,0) + p(2,1) = 0.2 + 0.1 + 0 = 0.3.$$

(b) Since the event X = 0 is composed of the two pairs of values (0,0) and (0,1), we add their corresponding probabilities to obtain

$$P(X = 0) = p(0,0) + p(0,1) = 0.1 + 0.2 = 0.3.$$

Continuing, we obtain

$$P(X = 1) = p(1,0) + p(1,1) = 0.4 + 0.2 = 0.6, \qquad P(X = 2) = p(2,0) + p(2,1) = 0.1 + 0 = 0.1.$$

In summary, $p_X(0) = 0.3$, $p_X(1) = 0.6$ and $p_X(2) = 0.1$ is the probability distribution of X. Note that the probability distribution $p_X(x)$ of X appears in the lower margin of this enlarged table. The probability distribution $p_Y(y)$ of Y appears in the right-hand margin of the table.

         x = 0   x = 1   x = 2   p_Y(y)
y = 0    0.1     0.4     0.1     0.6
y = 1    0.2     0.2     0       0.4
p_X(x)   0.3     0.6     0.1     1

Consequently, the individual distributions are called marginal probability distributions.

From the example, we see that for each fixed value of x, the marginal probability distribution is obtained as

$$P(X = x) = p_X(x) = \sum_y p(x, y),$$

where the sum is over all possible values of the second variable. Continuing, we obtain

$$P(Y = y) = p_Y(y) = \sum_x p(x, y).$$

Example 3.5.3
Suppose the number X of patent applications submitted by a company during a 1-year period is a random variable having the Poisson distribution with mean $\lambda$,

$$P(X = n) = \frac{\lambda^n e^{-\lambda}}{n!},$$

and the various applications independently have probability $p \in (0, 1)$ of eventually being approved. Determine the distribution of the number of patent applications during the 1-year period that are eventually approved. (First find the joint distribution, then the marginal distribution.)

Solution Let Y be the number of patent applications eventually approved during the 1-year period. Then the event $\{Y = k\}$ is the union of the mutually exclusive events $\{X = n, Y = k\}$ $(n \ge k)$. If $X = n$, then the random variable Y has the binomial distribution with parameters n and p:

$$P(Y = k \mid X = n) = C_n^k\, p^k (1 - p)^{n-k}, \qquad n \ge k \ge 0.$$

Thus

$$P(X = n, Y = k) = P(X = n)\, P(Y = k \mid X = n) = \frac{\lambda^n e^{-\lambda}}{n!}\, C_n^k\, p^k (1 - p)^{n-k},$$

and when $k > n$, $P(X = n, Y = k) = 0$. Hence the distribution of Y is

$$
\begin{aligned}
P(Y = k) &= \sum_{n=0}^{\infty} P(X = n, Y = k) = \sum_{n=k}^{\infty} P(X = n, Y = k) \\
&= \sum_{n=k}^{\infty} \frac{\lambda^n e^{-\lambda}}{n!}\, C_n^k\, p^k (1 - p)^{n-k} \\
&= \sum_{n=k}^{\infty} \frac{\lambda^n e^{-\lambda}}{n!} \cdot \frac{n!}{k!\,(n-k)!}\, p^k (1 - p)^{n-k} \\
&= \frac{(\lambda p)^k e^{-\lambda}}{k!} \sum_{n=k}^{\infty} \frac{\bigl(\lambda (1 - p)\bigr)^{n-k}}{(n-k)!} \\
&= \frac{(\lambda p)^k e^{-\lambda}}{k!} \sum_{m=0}^{\infty} \frac{\bigl(\lambda (1 - p)\bigr)^{m}}{m!}
 = \frac{(\lambda p)^k}{k!}\, e^{-\lambda}\, e^{\lambda (1 - p)} \\
&= \frac{(\lambda p)^k}{k!}\, e^{-\lambda p}.
\end{aligned}
$$

Thus, Y has the Poisson distribution with mean $\lambda p$.

Exercise
From the five numbers 1, 2, 3, 4, 5, draw 3 in succession at random without replacement, then arrange them in increasing order as $X_1 < X_2 < X_3$. Find the joint distribution of $(X_1, X_3)$. Are $X_1$ and $X_3$ independent?

Homework: Chap 5, 1

(2) Continuous Variables Case

There are many situations in which we describe an outcome by giving the values of several continuous random variables. For instance, we may measure the weight and the hardness of a rock, or the pressure and the temperature of a gas. Suppose that X and Y are two continuous random variables. A function $f(x, y)$ is called the joint probability density of these random variables if the probability that $a \le X \le b$, $c \le Y \le d$ is given by the multiple integral

$$P(a \le X \le b,\ c \le Y \le d) = \int_a^b \int_c^d f(x, y)\, dy\, dx.$$

Thus, a function $f(x, y)$ can serve as a joint probability density if all of the following hold: $f(x, y) \ge 0$ for all values of x and y, f is integrable on $\mathbb{R}^2$, and

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\, dx\, dy = 1.$$

To extend the concept of a cumulative distribution function to the two variables case, we can define

$$F(x, y) = P(X \le x,\ Y \le y),$$

and we refer to the corresponding function F as the joint cumulative distribution function of the two random variables.

Example 5.1.2
If the joint probability density of two random variables is given by

$$f(x, y) = \begin{cases} 6 e^{-2x - 3y} & \text{for } x > 0,\ y > 0 \\ 0 & \text{elsewhere} \end{cases}$$

find the joint distribution function, and use it to find the probability $P(X \le 2,\ Y \le 4)$.

Solution By definition,

$$F(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(u, v)\, dv\, du = \begin{cases} 6 \displaystyle\int_0^x e^{-2u}\, du \int_0^y e^{-3v}\, dv & \text{for } x > 0,\ y > 0 \\ 0 & \text{elsewhere.} \end{cases}$$

Thus,

$$F(x, y) = \begin{cases} (1 - e^{-2x})(1 - e^{-3y}) & \text{for } x > 0,\ y > 0 \\ 0 & \text{elsewhere.} \end{cases}$$

Hence,

$$P(X \le 2,\ Y \le 4) = F(2, 4) = (1 - e^{-4})(1 - e^{-12}) = 0.9817.$$

Example
If the joint probability density of two random variables is given by

$$f(x, y) = \begin{cases} k x y, & x^2 \le y \le 1,\ 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$$

(a) find k; (b) find the probability $P((X, Y) \in D)$, where $D = \{(x, y) \mid x^2 \le y \le x,\ 0 \le x \le 1\}$.

Solution
Since $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\, dx\, dy = 1$ and

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\, dx\, dy = \int_0^1 dx \int_{x^2}^{1} k x y\, dy = \frac{k}{2} \int_0^1 (x - x^5)\, dx = \frac{k}{6},$$

we have k = 6. Then

$$P((X, Y) \in D) = \iint_D 6 x y\, dx\, dy = \int_0^1 dx \int_{x^2}^{x} 6 x y\, dy = 3 \int_0^1 (x^3 - x^5)\, dx = \frac{1}{4}.$$

Marginal densities

Given the joint probability density of two random variables, the probability density of X or Y can be obtained by integrating out the other variable:

$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\, dy, \qquad f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\, dx.$$

The functions $f_X$ and $f_Y$ are called the marginal densities of X and Y, respectively.

Example
The joint probability density of two random variables is given by

$$f(x, y) = \begin{cases} 6 x y, & x^2 \le y \le 1,\ 0 \le x \le 1 \\ 0, & \text{otherwise.} \end{cases}$$

Find the marginal densities from the joint density.

When $x \in [0, 1]$,

$$f_X(x) = \int_{-\infty}^{+\infty} f(x, v)\, dv = \int_{x^2}^{1} 6 x y\, dy = 3x - 3x^5,$$

and when $x \notin [0, 1]$, $f_X(x) = 0$. Hence

$$f_X(x) = \begin{cases} 3x - 3x^5, & 0 \le x \le 1 \\ 0, & \text{elsewhere} \end{cases} \qquad f_Y(y) = \begin{cases} 3y^2, & 0 \le y \le 1 \\ 0, & \text{elsewhere.} \end{cases}$$

Exercise
Find the density and the distribution function of a random vector (X, Y) that is uniformly distributed on B.
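The marginal sums of Example 5.1.1 can be reproduced mechanically. The sketch below is illustrative only: the joint probabilities are the ones used in the example's solution, while the helper names `marginals` and `prob` are assumptions introduced here.

```python
from collections import defaultdict

# Joint pmf p(x, y), using the values that appear in the solution of Example 5.1.1.
joint = {
    (0, 0): 0.1, (0, 1): 0.2,
    (1, 0): 0.4, (1, 1): 0.2,
    (2, 0): 0.1, (2, 1): 0.0,
}


def marginals(joint_pmf):
    """Sum the joint pmf over the other variable to get p_X and p_Y."""
    p_x = defaultdict(float)
    p_y = defaultdict(float)
    for (x, y), p in joint_pmf.items():
        p_x[x] += p
        p_y[y] += p
    return dict(p_x), dict(p_y)


def prob(joint_pmf, event):
    """Probability of an event given as a predicate on (x, y)."""
    return sum(p for (x, y), p in joint_pmf.items() if event(x, y))


p_x, p_y = marginals(joint)
p_sum_gt_1 = prob(joint, lambda x, y: x + y > 1)
print(p_x, p_y, p_sum_gt_1)
```

Because the values are floats, comparisons against the hand-computed results should allow a small tolerance.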
Derivatives Pricing Theory, Chapter 3 (Binomial Tree Methods: Discrete Models of Option Pricing)

$B_0 = \dfrac{35}{1 + 0.01} \approx 34.65$
Then
$c_0 = \dfrac{40 - 34.65}{2} = 2.675$
That is, the investor should pay $2.675 for this stock option.
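The hedging argument of this example can be written as a short calculation. This is a minimal sketch under stated assumptions: one period, simple interest of 12%/12 = 1% for the month, and a riskless portfolio of one share minus n calls; the function name `one_period_call_price` is hypothetical. It keeps the unrounded bond value, so the result is roughly 2.67.

```python
def one_period_call_price(s0, s_up, s_down, strike, period_rate):
    """Price a call by building a riskless portfolio S - n*c over one period."""
    c_up = max(s_up - strike, 0.0)      # option payoff if the stock goes up
    c_down = max(s_down - strike, 0.0)  # option payoff if the stock goes down
    if c_up == c_down:
        raise ValueError("payoffs are equal; no hedge ratio can be formed")
    # Number of calls to short per share so both states give the same payoff.
    n_options = (s_up - s_down) / (c_up - c_down)
    riskless_payoff = s_up - n_options * c_up
    # Present value of the riskless payoff (the bond value B0).
    bond_value = riskless_payoff / (1.0 + period_rate)
    # S0 - n*c0 = B0  =>  c0 = (S0 - B0) / n
    return (s0 - bond_value) / n_options


price = one_period_call_price(40.0, 45.0, 35.0, 40.0, 0.12 / 12)
print(round(price, 4))
```

With the example's numbers the hedge ratio is n = 2 and the riskless payoff is $35 in both states.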
Analysis of the Example
① the idea of hedging: it is possible to
Example cont.1
payoff = $c_T = (S_T - K)^+$
$S_0 = \$40$
$S_T^u = \$45$, $c_T = (45 - 40)^+ = \$5$
$S_T^d = \$35$, $c_T = (35 - 40)^+ = \$0$
Consider a portfolio
$\Phi = S - 2c$
Example cont.2
Chapter 3
Binomial Tree Methods ------ Discrete Models of Option Pricing
An Example
t = 0: $S_0 = \$40$
t = T: $S_T$ is either $S_T^u = \$45$ or $S_T^d = \$35$
Question: At t = 0, an investor buys a call option on the stock with strike price $40 and 1 month maturity. If the risk-free annual interest rate is 12% throughout the period [0, T], how much should the premium for the call option be?
Chapter 3: Markov Chains

• Rules:
– New policyholders start on level 1.
– Following a year with one or more claims, move to the next lower level, or remain at level 1.
– Following a claim-free year:

$\sum_{j \in S} p_{ij} = 1$ holds for all $i$, i.e. each row of the matrix $(p_{ij})$ sums to 1.
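The row-sum condition on a transition matrix is easy to check numerically. The sketch below assumes a hypothetical 3-level matrix `P` made up for illustration (it is not the discount scheme described in the rules above); the helper names are also introduced here.

```python
def is_row_stochastic(matrix, tol=1e-12):
    """True if every entry is non-negative and every row sums to 1."""
    return all(
        all(p >= 0 for p in row) and abs(sum(row) - 1.0) <= tol
        for row in matrix
    )


def mat_mul(a, b):
    """Plain matrix product, used here to form two-step transition probabilities."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Hypothetical one-step transition matrix on three levels (illustration only).
P = [
    [0.25, 0.75, 0.00],
    [0.25, 0.00, 0.75],
    [0.00, 0.25, 0.75],
]
P2 = mat_mul(P, P)  # two-step transition probabilities
print(is_row_stochastic(P), is_row_stochastic(P2))
```

The product of two row-stochastic matrices is again row-stochastic, which is why the check also passes for `P2`.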
Time-inhomogeneous Markov chains
• For a time-inhomogeneous Markov chain, the transition probabilities cannot simply be denoted by $(P)_{ij} = p_{ij}$, because they depend on the absolute value of time, n, rather than just the time difference.
• The value of “time” can be represented by many factors, for example the time of year, age or duration.
Chapter: Limited Dependent Variable Models

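Chapter 1 Limited Dependent Variable Models

This chapter discusses the case where the response variable is only partially observed. Introduce a partially observed latent random variable $y_i^*$, whose actually observed counterpart is $y_i$. Introduce a binary indicator $D_i$: if $a_i < y_i^* < b_i$, then $D_i = 1$; otherwise $D_i = 0$. That is, $D_i$ indicates whether the variable $y_i^*$ can be observed. $(a_i, b_i)$ is called the observation interval.

Suppose actual observations exist for both $D_i = 1$ and $D_i = 0$: when $D_i = 1$ the latent variable equals the observed variable, and when $D_i = 0$ the observed variable still takes a value, but one that differs from the latent variable. The data are then said to be censored: values below $a_i$ are recorded as $a_i$, and values above $b_i$ are recorded as $b_i$. In symbols:

$$y_i = \begin{cases} a_i, & \text{if } y_i^* < a_i \\ y_i^*, & \text{if } a_i \le y_i^* \le b_i \\ b_i, & \text{if } y_i^* > b_i \end{cases} \qquad (1)$$

If the observed variable $y_i$ has data only when $D_i = 1$, that is, the latent variable equals the observed variable when $D_i = 1$ and $y_i$ has no observation when $D_i = 0$, the data are said to be truncated: values below $a_i$ and values above $b_i$ are cut off. The difference between truncated and censored data is therefore that, for values outside the observation interval, censored data collapse them onto a single point, whereas truncated data have no observation at all.

The basic model for the latent random variable $y_i^*$ is specified as

$$y_i^* = \mu_i + \sigma v_i \qquad (2)$$

where $\mu_i$ is the location parameter and $\sigma$ is the scale parameter; $v_i$ is a continuous random disturbance independent of $x_i$, with mean 0 and variance 1, and with distribution function F and density function f. Under these assumptions, $y_i^*$ has mean $\mu_i$, variance $\sigma^2$, distribution function $F\!\left(\frac{y_i^* - \mu_i}{\sigma}\right)$ and probability density function $f\!\left(\frac{y_i^* - \mu_i}{\sigma}\right) / \sigma$ (see Appendix 1 for the proof).

The condition $a_i < y_i^* < b_i$ is equivalent to

$$c_i = \frac{a_i - \mu_i}{\sigma} < v_i < \frac{b_i - \mu_i}{\sigma} = d_i,$$

so the probability that $y_i^*$ is observed is

$$\Pr(a_i < y_i^* < b_i) = \Pr(D_i = 1) = F(d_i) - F(c_i). \qquad (3)$$

The truncated data model and the censored data model are introduced in turn below.

1.1 Truncated Data Model

If the sample data are drawn from only a part of the population, we call such data truncated data.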
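The difference between censoring and truncation, and the observation probability in equation (3), can be sketched in a few lines. This is a minimal illustration, not the chapter's estimator: it assumes the disturbance is standard normal (the text only requires a general F), and the function names and the sample `latent` are hypothetical.

```python
import math


def censor(y_star, a, b):
    """Censoring: values outside (a, b) are recorded at the nearest bound."""
    return min(max(y_star, a), b)


def truncate(sample, a, b):
    """Truncation: values outside (a, b) are dropped from the sample."""
    return [y for y in sample if a < y < b]


def std_normal_cdf(z):
    """Standard normal distribution function, assumed here as F."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_observed(a, b, mu, sigma, cdf=std_normal_cdf):
    """Pr(a < y* < b) = F(d) - F(c), with c = (a - mu)/sigma and d = (b - mu)/sigma."""
    c = (a - mu) / sigma
    d = (b - mu) / sigma
    return cdf(d) - cdf(c)


latent = [-1.5, 0.2, 0.9, 2.4]                  # hypothetical latent values y*
censored = [censor(y, 0.0, 2.0) for y in latent]
truncated = truncate(latent, 0.0, 2.0)
print(censored, truncated, prob_observed(-1.96, 1.96, 0.0, 1.0))
```

The censored sample keeps all four observations, with two of them moved to the bounds, while the truncated sample keeps only the two interior values.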
variables
4. Concepts, indicators and variables
Concepts are converted into variables through indicators: a set of criteria reflective of a concept.
Concepts | Indicators | Variables | Decision level
1. The definition of a variable
An image, perception or concept that is capable of measurement - hence capable of taking on different values - is called a variable. In other words, a concept that can be measured is called a variable.
Fig. Types of variables in a causal relationship: cause (change variables, i.e. independent variables); effect (outcome variables, i.e. dependent variables); variables that affect the relationship (extraneous variables).
Mortality: independent variable
The extent of the use of contraceptives: intervening variable
Fertility: dependent variable
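Chapter 7: Discrete and Limited Dependent Variable Models (3rd edition)
This chapter first deals with a class of problems often faced in economic decision making: choice problems, such as a buyer's decision to purchase a good, a job seeker's choice of occupation, a voter's decision on a candidate, or a bank's lending decision for a customer. Unlike the usual econometric model, which assumes a continuous dependent variable, an econometric model built with such decision outcomes as the dependent variable is called a model with discrete dependent variables, or a discrete choice model (DCM).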
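Chapter 7 Discrete and Limited Dependent Variable Models
Economic analysis often encounters large amounts of survey data on individuals and firms. These data have many features that differ from time series data, and often involve discrete choice, data censoring (truncation), and selective samples; in general they call for quantitative analysis using microeconometric methods. The most prominent problems in microeconometrics are the so-called economic choice and qualitative dependent variable problems.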
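… = 0    (7.1.14)
where $f_i$ denotes the probability density function. If the expressions for the distribution function and the density function, together with the sample values, are known, solving this system of equations gives the maximum likelihood estimators of the parameters. For example, substituting the three distribution and density functions above into equation (7.1.14) gives the maximum likelihood estimates of the parameters of the three models. Equation (7.1.14) is usually nonlinear, however, and must be solved iteratively.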
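7.1.1 The Linear Probability Model and the Form of Binary Choice Models
To understand binary choice models in depth, we start with the simplest case, the linear probability model. Its regression form is:
$y_i = \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_k x_{ki} + u_i, \qquad i = 1, 2, \ldots, N$    (7.1.1)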
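A linear probability model is an ordinary least squares regression with a 0/1 dependent variable. The sketch below is illustrative only: it assumes a single regressor plus an intercept, uses made-up data, and the function name `ols_simple` is hypothetical. It shows one known weakness of the model, namely that fitted "probabilities" can fall outside [0, 1].

```python
def ols_simple(x, y):
    """Least squares fit of y = b0 + b1 * x for one regressor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: y is a binary outcome, x a continuous explanatory variable.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0, 0, 0, 1, 1, 1]

b0, b1 = ols_simple(x, y)
fitted = [b0 + b1 * xi for xi in x]
print(b0, b1)
print(fitted)  # the first value is below 0 and the last is above 1
```

This behavior is one motivation for replacing the linear form with a distribution function, as binary choice models do.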
CHAPTER 7 Random Variables (Saluda County Schools)
DISCRETE Random Variables
• The probability distribution of X lists the values and their probabilities.
Value of X:   x1   x2   x3   …   xk
Probability:  p1   p2   p3   …   pk
DISCRETE Random Variables
• Create a probability distribution for the following:
• The probabilities of a return on an investment of 1000, 2000, 3000 are ½, ¼, and ¼ respectively.
– usually measurements
ounces
DISCRETE or CONTINUOUS?
• weight of a book
• number of chapters in a book
• number of defects in a square yard of fabric
• number of homeruns in a season
• weight of a boat-load of fish
DISCRETE Random Variables
• 25% of women cannot distinguish between the colors red and green.
• Find the probability distribution for 3 randomly chosen women.
• So, what is the probability that exactly two women will be colorblind?
• What is the probability that at least one will not be colorblind?
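The two questions on this slide can be worked out with the binomial formula. This is a minimal sketch, assuming the three women are chosen independently and each is colorblind with probability 0.25 as stated on the slide; the function name `binomial_pmf` is introduced here for illustration.

```python
from math import comb


def binomial_pmf(n, p):
    """Return [P(X = 0), ..., P(X = n)] for X ~ Binomial(n, p)."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


pmf = binomial_pmf(3, 0.25)           # X = number of colorblind women among 3
p_exactly_two = pmf[2]                # exactly two are colorblind
p_at_least_one_not = 1 - pmf[3]       # complement of "all three are colorblind"

for k, p in enumerate(pmf):
    print(f"P(X = {k}) = {p}")
print(p_exactly_two, p_at_least_one_not)
```

The second answer uses the complement rule: "at least one is not colorblind" fails only when all three are colorblind.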
[Science] Probability Theory and Mathematical Statistics (English), Chapter 4
(b) Select 5 bulbs at random from the products of company X. What is the probability that at least 3 of them have a life longer than 450 hrs?
, (4.1.4)
where $\mu$ is the mean of X, and $\sigma$ is referred to as the standard deviation.
We easily get
.(4.1.5)
Example 4.1.5
Determining the mean and variance using the probability density function
Note Equation (4.5.1) really gives a density function, since
Theorem 4.5.1 The mean and variance of a continuous random variable X having exponential distribution with parameter is given by
.
Proof Since the probability density function of X is (4.5.1), we have
Example 4.5.1
Assume that the life Y of bulbs produced by company X has exponential distribution with mean .
Example 2 Roll two dice and let X denote the sum of the numbers shown on the two dice. Solution:
Example 3 Consider the experiment of tossing a coin five times and on each toss observing whether the coin lands with a head or tail on its upward face.
Example 2: The frequency function is
X:  -1     0      1     2       3
p:  0.16   a/10   a^2   2a/10   0.3
Determine the parameter a.
Solution: By the property of the frequency function (the probabilities sum to 1),
Example 1 There are 5 balls in a bag: 2 white balls and 3 black balls. Select three balls at random in a single draw. Solution: Let X = the number of white balls selected.
we can define X = 1 when a tail is observed and X = 0 when a head is observed. The definition is arbitrary but it must be fixed before the experiment is started.
3.2 Probability Distributions For Discrete Random Variables
When probabilities are assigned to various outcomes in S, these in turn determine probabilities associated with the values of any particular rv X. The probability distribution of X says how the total probability of 1 is distributed among (allocated to) the various possible X values.
3 Discrete Random Variables and Probability Distributions
3.1 Random Variables
Definition
For a given sample space S of some experiment, a random variable is any rule that associates a number with each outcome in S.
The Cumulative Distribution Function
Definition The cumulative distribution function (cdf) F(x) of a discrete rv X with pmf P(x) is defined for every number x by
$$F(x) = P(X \le x) = \sum_{y:\, y \le x} P(y).$$
For any number x, F(x) is the probability that the observed value of X will be at most x.
P(a ≤ X ≤ b)=F(b)-F(a-)
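The cdf definition and the interval formula can be tried on a small pmf. The sketch below is illustrative: it uses the pmf {0: 1/10, 1: 6/10, 2: 3/10} from the white-ball example in this document, and the function names are assumptions. Here F(a-) is taken to mean the limit of F from the left at a, i.e. P(X < a).

```python
from fractions import Fraction

# pmf of X = number of white balls (values from the ball-drawing example).
pmf = {0: Fraction(1, 10), 1: Fraction(6, 10), 2: Fraction(3, 10)}


def cdf(pmf, x):
    """F(x) = P(X <= x): sum the pmf over all values not exceeding x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))


def cdf_left_limit(pmf, a):
    """F(a-) = P(X < a): the cdf just to the left of a."""
    return sum((p for value, p in pmf.items() if value < a), Fraction(0))


def prob_between(pmf, a, b):
    """P(a <= X <= b) = F(b) - F(a-)."""
    return cdf(pmf, b) - cdf_left_limit(pmf, a)


print(cdf(pmf, 0.5), cdf(pmf, 1), prob_between(pmf, 1, 2))
```

Evaluating `cdf` between two possible values (for example at 0.5) returns the same number as at the lower value, which is the flat part of the step function.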
Another View Of Probability Mass Function
Solution: a.
X:  1     2     3     …   n
p:  1/n   1/n   1/n   …   1/n

b.
$$P(X = k) = \left(1 - \frac{1}{n}\right)^{k-1} \frac{1}{n}, \qquad k = 1, 2, 3, \ldots$$
Example 5 The defective rate of an automatic production line is p. Whenever a defective product is produced, the line must be adjusted. Let X denote the number of good products produced between two adjustments. Determine the probability distribution of X. Solution:
$F(x) = P(X \le x), \qquad -\infty < x < +\infty$
For X a discrete rv, the graph of F(x) will have a jump at every possible value of X and will be flat between possible values. Such a graph is called a step function.
A Parameter of a Probability Distribution
Definition Suppose P(x) depends on a quantity that can be assigned any one of a number of possible values, with each different value determining a different probability distribution. Such a quantity is called a parameter of the distribution. The collection of all probability distributions for different values of the parameter is called a family of probability distributions.
$0.16 + a/10 + a^2 + 2a/10 + 0.3 = 1$, i.e. $a^2 + 0.3a - 0.54 = 0$, so $(a + 0.9)(a - 0.6) = 0$. Since probabilities cannot be negative, $a = 0.6$.
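The parameter can also be found numerically by solving the quadratic and keeping only the root that yields valid probabilities. This is a minimal sketch; the function names are hypothetical, and the probabilities 0.16, a/10, a^2, 2a/10, 0.3 are the ones in the frequency function of this example.

```python
import math


def candidate_probabilities(a):
    """Probabilities of X = -1, 0, 1, 2, 3 for a given parameter a."""
    return [0.16, a / 10, a * a, 2 * a / 10, 0.3]


def solve_parameter():
    """Solve a^2 + 0.3a + (0.16 + 0.3 - 1) = 0 and keep roots giving valid probabilities."""
    qa, qb, qc = 1.0, 0.3, 0.16 + 0.3 - 1.0
    disc = qb * qb - 4 * qa * qc
    roots = [(-qb + sign * math.sqrt(disc)) / (2 * qa) for sign in (1, -1)]
    return [a for a in roots if all(0 <= p <= 1 for p in candidate_probabilities(a))]


valid = solve_parameter()
print(valid)
print(sum(candidate_probabilities(valid[0])))
```

The negative root is discarded because it would make a/10 and 2a/10 negative.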
Example 3 Roll two dice and let X denote the sum of the numbers shown on the two dice. Determine the probability distribution of X. Solution:
Definition Any random variable whose only possible values are 0 and 1 is called a Bernoulli random variable.
We will often want to define and study several different random variables from the same sample space.
Definition The probability distribution or probability mass function (pmf) of a discrete rv is defined for every number x by P(x) = P(X = x).
The conditions: 1. $P(x) \ge 0$; 2. $\sum_{\text{all possible } x} P(x) = 1$.
$$P(X = 0) = \frac{C_3^3}{C_5^3} = \frac{1}{10}, \qquad P(X = 1) = \frac{C_2^1 C_3^2}{C_5^3} = \frac{6}{10}, \qquad P(X = 2) = \frac{C_2^2 C_3^1}{C_5^3} = \frac{3}{10}.$$
The probability distribution function:
X:  0      1      2
p:  1/10   6/10   3/10
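The counts in this ball-drawing example follow the hypergeometric pattern, which can be verified with exact arithmetic. The sketch below assumes 2 white and 3 black balls with 3 drawn without replacement, as in the example; the function name `white_ball_pmf` is introduced here for illustration.

```python
from fractions import Fraction
from math import comb


def white_ball_pmf(n_white, n_black, n_drawn):
    """pmf of the number of white balls when n_drawn balls are taken without replacement."""
    total = comb(n_white + n_black, n_drawn)
    pmf = {}
    for k in range(min(n_white, n_drawn) + 1):
        if n_drawn - k <= n_black:  # enough black balls to fill the rest of the draw
            ways = comb(n_white, k) * comb(n_black, n_drawn - k)
            pmf[k] = Fraction(ways, total)
    return pmf


pmf = white_ball_pmf(2, 3, 3)
for k, p in pmf.items():
    print(f"P(X = {k}) = {p}")
```

Note that `Fraction` reduces 6/10 to 3/5 when printing; the value is the same.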
Example 1 Example 2
Property 1 The cdf is non-decreasing.
Property 2 $\lim_{x \to -\infty} F(x) = 0$, $\lim_{x \to +\infty} F(x) = 1$.
Property 3 satisfies:
Definition For any two numbers a and b with a ≤ b,
$$P(X = k) = (1 - p)^k\, p, \qquad k = 0, 1, 2, 3, \ldots$$
Example 6 Consider the experiment of tossing a coin five times and on each toss observing whether the coin lands with a head or tail on its upward face. Discuss this experiment with four properties satisfied.
Two Types of Random Variables
Definition A discrete random variable is an rv whose possible values either constitute a finite set or else can be listed in an infinite sequence in which there is a first element, a second element, and so on. A random variable is continuous if its set of possible values consists of an entire interval on the number line.
$P(x) \ge 0$ and $\sum_{\text{all possible } x} P(x) = 1$