R Software Homework


R Exercises

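3.1

x <- c(74.3, 78.8, 68.8, 78, 70.4, 80.5, 80.5, 69.7, 71.2, 73.5, 79.5, 75.6, 75, 78.8, 72, 72, 72, 74.3, 71.2, 72, 75, 73.5, 78.8, 74.3, 75.8, 65, 74.3, 71.2, 69.7, 68, 73.5, 75, 72, 64.3, 75.8, 80.3, 69.7, 74.3, 73.5, 73.5, 75.8, 75.8, 68.8, 76.5, 70.4, 71.2, 81.2, 75, 70.4, 68, 70.4, 72, 76.5, 74.3, 76.5, 77.6, 67.3, 72, 75, 74.3, 73.5, 79.5, 73.5, 74.7, 65, 76.5, 81.6, 75.4, 72.7, 72.7, 67.2, 76.5, 72.7, 70.4, 77.2, 68.8, 67.3, 67.3, 67.3, 72.7, 75.8, 73.5, 75, 73.5, 73.5, 73.5, 72.7, 81.6, 70.3, 74.3, 73.5, 79.5, 70.4, 76.5, 72.7, 77.2, 84.3, 75, 76.5, 70.4)
n <- length(x)
mean(x)  # mean
## [1] 73.67
var(x)  # variance
## [1] 15.52
sd(x)  # standard deviation
## [1] 3.939
max(x) - min(x)  # range
## [1] 20
sd(x)/sqrt(n)  # standard error
## [1] 0.3939
100 * sd(x)/mean(x)  # coefficient of variation (%)
## [1] 5.347
# skewness: the cube must be taken inside sum(), because sum(x - mean(x)) is zero up to rounding
n/((n - 1) * (n - 2)) * sum((x - mean(x))^3)/sd(x)^3
# kurtosis: the fourth power is likewise taken inside sum()
n * (n + 1)/((n - 1) * (n - 2) * (n - 3) * sd(x)^4) * sum((x - mean(x))^4) - 3 * (n - 1)^2/(n - 2)/(n - 3)

3.2 For the data of Exercise 3.1, draw the histogram, the density estimate, the empirical distribution function and the Q-Q plot; compare the density estimate with the normal density curve, and the empirical distribution with the normal distribution function (the mean and standard deviation of the normal curves are the values computed in Exercise 3.1).

# histogram
hist(x)
# density estimate
plot(density(x), col = "blue")
# empirical distribution function
plot(ecdf(x))
# Q-Q plot
qqnorm(x)
qqline(x)
# compare the density estimate with the normal density
plot(density(x), col = "blue")
x1 <- 55:154
lines(x1, dnorm(x1, mean(x), sd(x)), col = "red")
# compare the empirical distribution with the normal distribution function
# verticals = TRUE draws the vertical lines; do.p = FALSE suppresses the points at the data values
plot(ecdf(x), verticals = TRUE, do.p = FALSE)
x1 <- 55:154
lines(x1, pnorm(x1, mean(x), sd(x)))

3.3 For the data of Exercise 3.1, draw the stem-and-leaf plot and the box plot, and compute the five-number summary.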
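The skewness and kurtosis expressions in 3.1 are easy to get wrong because of where the power sits relative to sum(). A minimal sketch, wrapping the same bias-corrected formulas in two helper functions; the names `skewness` and `kurtosis` and the small test vector `sym` are assumptions made for illustration and are not part of the exercise:

```r
# Bias-corrected sample skewness (same formula as in Exercise 3.1).
# The cube is applied to each deviation before summing.
skewness <- function(x) {
  n <- length(x); m <- mean(x); s <- sd(x)
  n / ((n - 1) * (n - 2)) * sum((x - m)^3) / s^3
}

# Bias-corrected excess kurtosis (same formula as in Exercise 3.1).
kurtosis <- function(x) {
  n <- length(x); m <- mean(x); s <- sd(x)
  n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum((x - m)^4) / s^4 -
    3 * (n - 1)^2 / ((n - 2) * (n - 3))
}

sym <- c(-2, -1, 0, 1, 2)   # hypothetical symmetric data
skewness(sym)               # symmetric data: skewness is 0
kurtosis(sym)
```

For symmetric data the skewness is exactly zero, which gives a quick sanity check of the parenthesization.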

R Software Homework

Chapter 4

4.1
> x <- c(0.1,0.2,0.9,0.8,0.7,0.7)
> n <- length(x)
> a <- sum(x)*(1/n)
> f <- function(b){
+   n/(b+1)+sum(log(x))
+ }
> b <- uniroot(f,c(0,1))
> b
$root
[1] 0.211182

$f.root
[1] -3.844668e-05

$iter
[1] 5

$estim.prec
[1] 6.103516e-05

The estimate is 0.211182.

4.2
> x <- c(5,15,25,35,45,55,65)
> v <- c(365,245,150,100,70,45,25)
> y <- x*v
> f <- function(k){
+   1000/k-sum(y)
+ }
> k <- uniroot(f,c(0,100))
> k
$root
[1] 0.05002618

$f.root
[1] -10.46586

$iter
[1] 14

$estim.prec
[1] 6.103516e-05

The estimate is 0.05002618.

4.3
> x <- c(0,1,2,3,4,5,6)
> n <- length(x)
> f <- function(k){
+   10-n*k
+ }
> k <- uniroot(f,c(0,10))
> k
$root
[1] 1.428571

$f.root
[1] 0

$iter
[1] 1

$estim.prec
[1] 8.571429

The conditions of the exercise are satisfied when each litre of water contains 1.428571 E. coli on average.
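The score equation solved in 4.1, n/(b+1) + Σ log(x_i) = 0, also has a closed-form root, which gives a quick check on the uniroot() result. A minimal sketch, assuming (from the form of the score function) that the underlying density is f(x; a) = (a+1)x^a on (0, 1); the names `score`, `a_closed` and `a_numeric` are illustrative:

```r
x <- c(0.1, 0.2, 0.9, 0.8, 0.7, 0.7)
n <- length(x)

# Score function of the assumed density f(x; a) = (a + 1) * x^a on (0, 1)
score <- function(a) n / (a + 1) + sum(log(x))

# Solving score(a) = 0 by hand gives a = -n / sum(log(x)) - 1
a_closed <- -n / sum(log(x)) - 1

# Numerical root with a tighter tolerance than the uniroot() default
a_numeric <- uniroot(score, c(0, 1), tol = 1e-10)$root

c(closed = a_closed, numeric = a_numeric)
```

The two values agree to several decimal places and are close to the 0.211182 reported above; the small difference comes from the default tolerance of uniroot().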

4.4
#### Define the objective function
> g <- function(x){
+   f <- c(-13+x[1]+((5-x[2])*x[2]-2)*x[2], -29+x[1]+((x[2]+1)*x[2]-14)*x[2])
+   sum(f^2)
+ }
#### Call nlm() to solve
> x0 <- c(0.5,-2); nlm(g,x0)
$minimum
[1] 48.98425

$estimate
[1] 11.4127791 -0.8968052

$gradient
[1]  1.415447e-08 -1.435296e-07

$code
[1] 1

$iterations
[1] 16

The optimal solution is x* = (11.4127791, -0.8968052) and the optimal value is 48.98425.

4.5
#### R program for interval estimation of the mean μ, for known and unknown population variance (file name: interval_estimate1.R)
interval_estimate1 <- function(x, sigma=-1, alpha=0.05){
  n <- length(x); xb <- mean(x)
  if (sigma>=0){
    tmp <- sigma/sqrt(n)*qnorm(1-alpha/2); df <- n
  } else {
    tmp <- sd(x)/sqrt(n)*qt(1-alpha/2,n-1); df <- n-1
  }
  data.frame(mean=xb, df=df, a=xb-tmp, b=xb+tmp)
}
#### Enter the data and call interval_estimate1()
> source("interval_estimate1.R")
> y <- c(54,67,68,78,70,66,67,70,65,69)
> interval_estimate1(y)
  mean df       a       b
1 67.4  9 63.1585 71.6415
The 95% confidence interval for the mean pulse of the 10 patients is therefore [63.16, 71.64], with a sample mean of 67.4. Since the whole interval lies below the normal value of 72, the mean pulse of these 10 patients is lower than that of healthy people.

4.6
#### R program for interval estimation of the difference of means μ1-μ2 (file name: interval_estimate2.R)
interval_estimate2 <- function(x, y, sigma=c(-1,-1), var.equal=TRUE, alpha=0.05){
  n1 <- length(x); n2 <- length(y)
  xb <- mean(x); yb <- mean(y)
  if (all(sigma>=0)){
    tmp <- qnorm(1-alpha/2)*sqrt(sigma[1]^2/n1+sigma[2]^2/n2)
    df <- n1+n2
  } else {
    if (var.equal == TRUE){
      Sw <- ((n1-1)*var(x)+(n2-1)*var(y))/(n1+n2-2)
      tmp <- sqrt(Sw*(1/n1+1/n2))*qt(1-alpha/2,n1+n2-2)
      df <- n1+n2-2
    } else {
      S1 <- var(x); S2 <- var(y)
      nu <- (S1/n1+S2/n2)^2/(S1^2/n1^2/(n1-1)+S2^2/n2^2/(n2-1))
      tmp <- qt(1-alpha/2, nu)*sqrt(S1/n1+S2/n2)
      df <- nu
    }
  }
  data.frame(mean=xb-yb, df=df, a=xb-yb-tmp, b=xb-yb+tmp)
}
#### Enter the data, then call interval_estimate2()
> x <- c(140,137,136,140,145,148,140,135,144,141)
> y <- c(135,118,115,140,128,131,130,115,131,125)
> source("interval_estimate2.R")
> interval_estimate2(x, y, var.equal=TRUE)
  mean df       a        b
1 13.8 18 7.53626 20.06374
The required confidence interval is [7.53626, 20.06374].

4.7
#### Enter the data, then call interval_estimate2()
> x <- c(0.143,0.142,0.143,0.137)
> y <- c(0.140,0.142,0.136,0.138,0.140)
> source("interval_estimate2.R")
> interval_estimate2(x, y, var.equal=TRUE)
     mean df            a           b
1 0.00205  7 -0.001996351 0.006096351
The required confidence interval is [-0.001996351, 0.006096351].

4.8
#### R program for interval estimation of the variance ratio σ1^2/σ2^2 (file name: interval_var2.R)
interval_var2 <- function(x, y, mu=c(Inf, Inf), alpha=0.05){
  n1 <- length(x); n2 <- length(y)
  if (all(mu<Inf)){
    Sx2 <- 1/n1*sum((x-mu[1])^2); Sy2 <- 1/n2*sum((y-mu[2])^2)
    df1 <- n1; df2 <- n2
  } else {
    Sx2 <- var(x); Sy2 <- var(y); df1 <- n1-1; df2 <- n2-1
  }
  r <- Sx2/Sy2
  a <- r/qf(1-alpha/2,df1,df2)
  b <- r/qf(alpha/2,df1,df2)
  data.frame(rate=r, df1=df1, df2=df2, a=a, b=b)
}
#### Enter the data and call interval_var2()
> x <- c(140,137,136,140,145,148,140,135,144,141)
> y <- c(135,118,115,140,128,131,130,115,131,125)
> source("interval_var2.R")
> a <- interval_var2(x, y)
> a
       rate df1 df2          a        b
1 0.2353305   9   9 0.05845276 0.947439
#### The two sample variances differ; call interval_estimate2()
> source("interval_estimate2.R")
> interval_estimate2(x, y, var.equal=FALSE)
  mean       df        a        b
1 13.8 13.01367 7.359713 20.24029
> interval_estimate2(x, y)
  mean df       a        b
1 13.8 18 7.53626 20.06374
The interval for the variance ratio does not contain 1, so the variances are treated as unequal and the required confidence interval is [7.359713, 20.24029]; under the equal-variance assumption it would be [7.53626, 20.06374].

4.9
#### Enter the data and compute the mean
> x <- c(7,10,12,8,3,2,0)
> sum(x)/7
[1] 6
Since the number of calls follows a Poisson distribution with unknown parameter, the estimate of λ is 6.
#### R program for interval estimation for a non-normal population (file name: interval_estimate3.R)
interval_estimate3 <- function(x, sigma=-1, alpha=0.05){
  n <- length(x); xb <- mean(x)
  if (sigma>=0)
    tmp <- sigma/sqrt(n)*qnorm(1-alpha/2)
  else
    tmp <- sd(x)/sqrt(n)*qnorm(1-alpha/2)
  data.frame(mean=xb, a=xb-tmp, b=xb+tmp)
}
#### Enter the data, then call interval_estimate3()
> x <- c(7,10,12,8,3,2,0)
> source("interval_estimate3.R")
> interval_estimate3(x)
  mean       a       b
1    6 2.71478 9.28522
So the confidence interval for the number of calls at confidence level 95% is [2.714, 9.285].

4.10
#### R program that can produce either a one-sided or a two-sided confidence interval (file name: interval_estimate4.R)
interval_estimate4 <- function(x, sigma=-1, side=0, alpha=0.05){
  n <- length(x); xb <- mean(x)
  if (sigma>=0){
    if (side<0){
      tmp <- sigma/sqrt(n)*qnorm(1-alpha)
      a <- -Inf; b <- xb+tmp
    } else if (side>0){
      tmp <- sigma/sqrt(n)*qnorm(1-alpha)
      a <- xb-tmp; b <- Inf
    } else {
      tmp <- sigma/sqrt(n)*qnorm(1-alpha/2)
      a <- xb-tmp; b <- xb+tmp
    }
    df <- n
  } else {
    if (side<0){
      tmp <- sd(x)/sqrt(n)*qt(1-alpha,n-1)
      a <- -Inf; b <- xb+tmp
    } else if (side>0){
      tmp <- sd(x)/sqrt(n)*qt(1-alpha,n-1)
      a <- xb-tmp; b <- Inf
    } else {
      tmp <- sd(x)/sqrt(n)*qt(1-alpha/2,n-1)
      a <- xb-tmp; b <- xb+tmp
    }
    df <- n-1
  }
  data.frame(mean=xb, df=df, a=a, b=b)
}
#### Enter the data and call interval_estimate4()
> x <- c(1067,919,1196,785,1126,936,918,1156,920,948)
> source("interval_estimate4.R")
> a <- interval_estimate4(x, side=1)
> a
   mean df        a   b
1 997.1  9 920.8443 Inf
The one-sided 95% lower confidence bound is therefore 920.8443 hours; the solution takes this as the lifetime that about 95% of the bulbs exceed.

Chapter 5

5.1
#### R program for testing the mean of a normal population (file name: mean.test1.R)
mean.test1 <- function(x, mu=0, sigma=-1, side=0){
  source("P_value.R")
  n <- length(x); xb <- mean(x)
  if (sigma>0){
    z <- (xb-mu)/(sigma/sqrt(n))
    P <- P_value(pnorm, z, side=side)
    data.frame(mean=xb, df=n, Z=z, P_value=P)
  } else {
    t <- (xb-mu)/(sd(x)/sqrt(n))
    P <- P_value(pt, t, paramet=n-1, side=side)
    data.frame(mean=xb, df=n-1, T=t, P_value=P)
  }
}
#### R program for computing the P-value (file name: P_value.R)
P_value <- function(cdf, x, paramet=numeric(0), side=0){
  n <- length(paramet)
  P <- switch(n+1,
    cdf(x),
    cdf(x, paramet),
    cdf(x, paramet[1], paramet[2]),
    cdf(x, paramet[1], paramet[2], paramet[3])
  )
  if (side<0) P
  else if (side>0) 1-P
  else
    if (P<1/2) 2*P
    else 2*(1-P)
}
#### Enter the data, then call mean.test1()
> x <- c(220,188,162,230,145,160,238,188,247,113,126,245,164,231,256,183,190,158,224,175)
> source("mean.test1.R")
> a <- mean.test1(x, mu=225, side=0)
> a
    mean df         T     P_value
1 192.15 19 -3.478262 0.002516436
The P-value is below 0.05, so the mean differs from the normal value.

5.2
#### Enter the data, then call mean.test1()
> x <- c(1067,919,1196,785,1126,936,918,1156,920,948)
> source("mean.test1.R")
> mean.test1(x, mu=1000, side=1)
   mean df           T   P_value
1 997.1  9 -0.06971322 0.5270268
The solution therefore takes the probability that a bulb lasts more than 1000 hours to be 1 - 0.5270268 = 0.4729732.

5.3
#### R program for testing the means of two populations (file name: mean.test2.R)
mean.test2 <- function(x, y, sigma=c(-1, -1), var.equal=FALSE, side=0){
  source("P_value.R")
  n1 <- length(x); n2 <- length(y)
  xb <- mean(x); yb <- mean(y)
  if (all(sigma>0)){
    z <- (xb-yb)/sqrt(sigma[1]^2/n1+sigma[2]^2/n2)
    P <- P_value(pnorm, z, side=side)
    data.frame(mean=xb-yb, df=n1+n2, Z=z, P_value=P)
  } else {
    if (var.equal == TRUE){
      Sw <- sqrt(((n1-1)*var(x)+(n2-1)*var(y))/(n1+n2-2))
      t <- (xb-yb)/(Sw*sqrt(1/n1+1/n2))
      nu <- n1+n2-2
    } else {
      S1 <- var(x); S2 <- var(y)
      nu <- (S1/n1+S2/n2)^2/(S1^2/n1^2/(n1-1)+S2^2/n2^2/(n2-1))
      t <- (xb-yb)/sqrt(S1/n1+S2/n2)
    }
    P <- P_value(pt, t, paramet=nu, side=side)
    data.frame(mean=xb-yb, df=nu, T=t, P_value=P)
  }
}
#### Enter the data, then call mean.test2()
> x <- c(113,120,138,120,100,118,138,123)
> y <- c(138,116,125,136,110,132,130,110)
> source("mean.test2.R")
> mean.test2(x, y, var.equal=TRUE, side=0)
    mean df          T   P_value
1 -3.375 14 -0.5659672 0.5803752
The P-value exceeds 0.05, so the null hypothesis is not rejected.

5.4
#### R program for testing the variance ratio, for known and unknown means (file name: var.test2.R)
var.test2 <- function(x, y, mu=c(Inf,Inf), side=0){
  source("P_value.R")
  n1 <- length(x); n2 <- length(y)
  if (all(mu<Inf)){
    Sx2 <- sum((x-mu[1])^2)/n1; Sy2 <- sum((y-mu[2])^2)/n2
    df1 <- n1; df2 <- n2
  } else {
    Sx2 <- var(x); Sy2 <- var(y); df1 <- n1-1; df2 <- n2-1
  }
  r <- Sx2/Sy2
  P <- P_value(pf, r, paramet=c(df1,df2), side=side)
  data.frame(rate=r, df1=df1, df2=df2, F=r, P_value=P)
}
#### Enter the data
> x <- c(-0.70,-5.60,2.00,2.80,0.70,3.50,4.00,5.80,7.10,-0.50,2.50,-1.60,1.70,3.00,0.40,4.50,4.60,2.50,6.00,-1.40)
> a <- shapiro.test(x)
> a

        Shapiro-Wilk normality test

data:  x
W = 0.9699, p-value = 0.7527

The p-value 0.7527 is greater than 0.05.
> y <- c(3.70,6.50,5.00,5.20,0.80,0.20,0.60,3.40,6.60,-1.10,6.00,3.80,2.00,1.60,2.00,2.20,1.20,3.10,1.70,-2.00)
> b <- shapiro.test(y)
> b

        Shapiro-Wilk normality test

data:  y
W = 0.971, p-value = 0.7754

The p-value 0.7754 is greater than 0.05. Neither test rejects normality, so both samples are treated as normally distributed.
#### With x and y as entered above, call mean.test2()
> source("mean.test2.R")
> a <- mean.test2(x, y, var.equal=TRUE, side=0); a
   mean df         T   P_value
1 -0.56 38 -0.641872 0.5248097
> b <- mean.test2(x, y, var.equal=FALSE, side=0); b
   mean       df         T  P_value
1 -0.56 36.08632 -0.641872 0.525013
> c <- t.test(x-y, alternative = "two.sided"); c

        One Sample t-test

data:  x - y
t = -0.6464, df = 19, p-value = 0.5257
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 -2.373146  1.253146
sample estimates:
mean of x
    -0.56

All the P-values above exceed 0.05, so there is no significant difference between the means.
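The hand-written test functions of Chapter 5 can be cross-checked against R's built-in t.test(). A minimal sketch using the data of 5.1 and 5.3; the object names `res1`, `res3`, `x1`, `x3` and `y3` are illustrative and not part of the original solutions:

```r
# Data of Exercise 5.1: one-sample, two-sided test against mu = 225
x1 <- c(220, 188, 162, 230, 145, 160, 238, 188, 247, 113,
        126, 245, 164, 231, 256, 183, 190, 158, 224, 175)
res1 <- t.test(x1, mu = 225)

# Data of Exercise 5.3: two-sample test with pooled (equal) variances
x3 <- c(113, 120, 138, 120, 100, 118, 138, 123)
y3 <- c(138, 116, 125, 136, 110, 132, 130, 110)
res3 <- t.test(x3, y3, var.equal = TRUE)

# Statistic, degrees of freedom and p-value for each test
c(unname(res1$statistic), unname(res1$parameter), res1$p.value)
c(unname(res3$statistic), unname(res3$parameter), res3$p.value)
```

The statistics and p-values should agree with the values printed by mean.test1() and mean.test2() in 5.1 and 5.3.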

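R Software Exercise Answers

[Part 1: Statistical Modeling and R Software, Exercise Answers]

Chapter 2

2.1
x <- c(1,2,3); y <- c(4,5,6)
e <- c(1,1,1)
z <- 2*x+y+e; z
[1]  7 10 13
z1 <- crossprod(x,y); z1
     [,1]
[1,]   32
z2 <- outer(x,y); z2
     [,1] [,2] [,3]
[1,]    4    5    6
[2,]    8   10   12
[3,]   12   15   18

2.2
(1) A <- matrix(1:20,nrow=4); B <- matrix(1:20,nrow=4,byrow=TRUE)
    C <- A+B; C
(2) D <- A%*%t(B); D   # A and B are both 4 x 5, so B must be transposed for the product
(3) E <- A*B; E
(4) F <- A[1:3,1:3]
(5) G <- B[,-3]

2.3
x <- c(rep(1,5),rep(2,3),rep(3,4),rep(4,2)); x

2.4
H <- matrix(nrow=5,ncol=5)
for (i in 1:5)
  for (j in 1:5)
    H[i,j] <- 1/(i+j-1)
(1) det(H)
(2) solve(H)
(3) eigen(H)

2.5
# columns: name, sex, age, height, weight
studentdata <- data.frame(姓名=c("张三","李四","王五","赵六","丁一"),
                          性别=c("女","男","女","男","女"),
                          年龄=c(14,15,16,14,15),
                          身高=c(156,165,157,162,159),
                          体重=c(42,49,41.5,52,45.5))

2.6
write.table(studentdata,file="student.txt")
write.csv(studentdata,file="student.csv")

2.7
count <- function(n){
  if (n<=0)
    print("要求输入一个正整数")   # "please enter a positive integer"
  else{
    repeat{
      if (n%%2==0)
        n <- n/2
      else
        n <- (3*n+1)
      if (n==1) break
    }
    print("运算成功")   # "computation succeeded"
  }
}

Chapter 3

3.1
First enter the data as x.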

Then use the data_outline function.

As follows:
data_outline(x)

3.2
hist(x,freq=FALSE)
lines(density(x),col="red")
y <- min(x):max(x)
lines(y,dnorm(y,73.668,3.9389),col="blue")
plot(ecdf(x),verticals=TRUE,do.p=FALSE)
lines(y,pnorm(y,73.668,3.9389))
qqnorm(x)
qqline(x)

3.3
stem(x)
boxplot(x)
fivenum(x)

3.4
shapiro.test(x)
ks.test(x,"pnorm",73.668,3.9389)

        One-sample Kolmogorov-Smirnov test

data:  x
D = 0.073, p-value = 0.6611
alternative hypothesis: two-sided

Warning message:
In ks.test(x, "pnorm", 73.668, 3.9389) :
  ties should not be present for the Kolmogorov-Smirnov test

The warning appears because the Kolmogorov-Smirnov test assumes a continuous distribution, under which repeated values (ties) should not occur.

3.5
x1 <- c(2,4,3,2,4,7,7,2,2,5,4); x2 <- c(5,6,8,5,10,7,12,12,6,6); x3 <- c(7,11,6,6,7,9,5,5,10,6,3,10)
boxplot(x1,x2,x3,names=c("x1","x2","x3"),col=c(2,3,4))
dev.new()   # open a new graphics window (portable replacement for windows())
plot(factor(c(rep(1,length(x1)),rep(2,length(x2)),rep(3,length(x3)))),c(x1,x2,x3))

3.6
rubber <- data.frame(x1=c(65,70,70,69,66,67,68,72,66,68),
                     x2=c(45,45,48,46,50,46,47,43,47,48),
                     x3=c(27.6,30.7,31.8,32.6,31.0,31.3,37.0,33.6,33.1,34.2))
plot(rubber)


Statistical Modeling and R Software: Reference Answers to the Exercises

Chapter 2

2.1
> x <- c(1,2,3); y <- c(4,5,6)
> e <- c(1,1,1)
> z <- 2*x+y+e; z
[1]  7 10 13
> z1 <- crossprod(x,y); z1
     [,1]
[1,]   32
> z2 <- outer(x,y); z2
     [,1] [,2] [,3]
[1,]    4    5    6
[2,]    8   10   12
[3,]   12   15   18

2.2
(1) > A <- matrix(1:20,nrow=4); B <- matrix(1:20,nrow=4,byrow=TRUE)
    > C <- A+B; C
(2) > D <- A%*%t(B); D   # A and B are both 4 x 5, so B must be transposed for the product
(3) > E <- A*B; E
(4) > F <- A[1:3,1:3]
(5) > G <- B[,-3]

2.3
> x <- c(rep(1,5),rep(2,3),rep(3,4),rep(4,2)); x

2.4
> H <- matrix(nrow=5,ncol=5)
> for (i in 1:5)
+   for (j in 1:5)
+     H[i,j] <- 1/(i+j-1)
(1) > det(H)
(2) > solve(H)
(3) > eigen(H)

2.5
# columns: name, sex, age, height, weight; the numeric columns are stored as numbers rather than strings
> studentdata <- data.frame(姓名=c('张三','李四','王五','赵六','丁一'),
+   性别=c('女','男','女','男','女'), 年龄=c(14,15,16,14,15),
+   身高=c(156,165,157,162,159), 体重=c(42,49,41.5,52,45.5))

2.6
> write.table(studentdata,file='student.txt')
> write.csv(studentdata,file='student.csv')

2.7
count <- function(n){
  if (n<=0)
    print('要求输入一个正整数')   # "please enter a positive integer"
  else{
    repeat{
      if (n%%2==0)
        n <- n/2
      else
        n <- (3*n+1)
      if (n==1) break
    }
    print('运算成功')   # "computation succeeded"
  }
}

Chapter 3

3.1
First enter the data as x.

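R Software Assignment 4

1. Look up the help page for tapply and study it further using the examples in the help file.
> n <- 17
> fac <- factor(rep(1:3, length = n), levels = 1:5)
> fac
 [1] 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2
Levels: 1 2 3 4 5
> table(fac)
fac
1 2 3 4 5
6 6 5 0 0
> tapply(1:n, fac, sum)
 1  2  3  4  5
51 57 45 NA NA
Note: 51 = 1+4+7+10+13+16 is the sum of the positions at which fac equals 1; 57 and 45 are obtained in the same way.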
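tapply() works the same way with any summary function and any grouping factor. A minimal sketch that computes a group mean, borrowing the heights and sexes from Exercise 2.5 purely as illustrative data; the labels "F" and "M" and the variable names are assumptions:

```r
# Illustrative data: heights with a two-level grouping factor
height <- c(156, 165, 157, 162, 159)
sex <- factor(c("F", "M", "F", "M", "F"))

# Apply mean() to the heights within each level of sex
m <- tapply(height, sex, mean)
m

# Counts per group, for comparison with table()
table(sex)
```

Unlike the help-page example above, every factor level here has at least one observation, so no NA entries appear in the result.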

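2. Define an array of dimension (4, 2) whose first column holds the odd numbers starting from 1 and whose second column holds the even numbers starting from 2; compute the mean of the first layer and of the second layer; save the resulting array to a text file, with the data items separated by spaces and line breaks.

> a=seq(from=1,by=2,length=4)
> b=seq(from=2,by=2,length=4)
> s=c(a,b)
> dim(s)=c(4,2)
> s
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6
[4,]    7    8
> cat(s[1,],"\n",s[2,],"\n",s[3,],"\n",s[4,],"\n",file="F:/1.txt")

2. Read the sex, age and height variables from user.txt into S.

Count the number of people for each sex and each age, and compute the mean height of each group.

Combine these variables into a list.

Read the user.txt data in as a data frame in S.

Note: the data set has been uploaded to the FTP server.

3. Save the vector generated by the statement x <- floor(2*runif(100)) to a text file, with the data items separated by spaces and line breaks.

Then read the data from this file into a vector y.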
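No answer is given for task 3; one possible approach is sketched below, assuming write() and scan() are acceptable tools for it. The seed, the temporary file and the choice of 10 values per line are assumptions added for illustration:

```r
set.seed(1)                          # assumption: only to make the sketch reproducible
x <- floor(2 * runif(100))           # vector of 0s and 1s, as in the task

f <- tempfile(fileext = ".txt")      # hypothetical file location
write(x, file = f, ncolumns = 10)    # 10 space-separated values per line

y <- scan(f, quiet = TRUE)           # read all values back as a numeric vector
all(x == y)
```

write() separates items with spaces and starts a new line after every ncolumns values, and scan() treats both spaces and line breaks as separators, so the round trip recovers the original vector.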

R Software Exercise Solutions, Chapter 4

Exercise 4.1
The code is as follows (maximum likelihood estimation):
x <- c(0.1,0.2,0.9,0.8,0.7,0.7)
n <- length(x)
f <- function(a){n/(a+1)+sum(log(x))}
uniroot(f,c(0,1))
$root
[1] 0.211182
$f.root
[1] -3.844668e-05
$iter
[1] 5
$estim.prec
[1] 6.103516e-05

Exercise 4.2 (exponential distribution)
x <- c(5,15,25,35,45,55,65)
v <- c(365,245,150,100,70,45,25)
y <- x*v
f <- function(k){1000/k-sum(y)}
uniroot(f,c(0,100))
$root
[1] 0.05002618
$f.root
[1] -10.46586
$iter
[1] 14
$estim.prec
[1] 6.103516e-05
The estimate is 0.05002618.

Exercise 4.3 (maximum likelihood estimation) (assignment)
Solution: let $x_1, x_2, \dots, x_n$ be the observed values of the sample $\xi_1, \xi_2, \dots, \xi_n$. The likelihood function is
$$L(\lambda) = L(\lambda; x_1, \dots, x_n) = \prod_{i=1}^{n} \frac{\lambda^{x_i}}{x_i!} e^{-\lambda} = \frac{\lambda^{\sum_{i=1}^{n} x_i}}{x_1!\, x_2! \cdots x_n!}\, e^{-n\lambda}.$$
Taking logarithms,
$$\ln L(\lambda) = -n\lambda + \ln\lambda \cdot \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} \ln(x_i!).$$
Differentiating with respect to $\lambda$ and setting the derivative to zero,
$$\frac{\partial \ln L}{\partial \lambda} = -n + \frac{1}{\lambda} \sum_{i=1}^{n} x_i = 0,$$
which gives $\hat{\lambda} = \bar{x}$, so the maximum likelihood estimator of $\lambda$ is $\bar{\xi}$.
x <- rep(0:6,c(17,20,10,2,1,0,0))
mean(x)
[1] 1

Exercise 4.4
y <- function(x){
  f <- c(-13+x[1]+((5-x[2])*x[2]-2)*x[2], -29+x[1]+((x[2]+1)*x[2]-14)*x[2])
  sum(f^2)
}
x0 <- c(0.5,-2)
nlm(y,x0)
$minimum
[1] 48.98425
$estimate
[1] 11.4127791 -0.8968052
$gradient
[1]  1.415447e-08 -1.435296e-07
$code
[1] 1
$iterations
[1] 16

Exercise 4.5
X <- c(54,67,68,78,70,66,67,70,65,69)
t.test(X)

        One Sample t-test

data:  X
t = 35.947, df = 9, p-value = 4.938e-11
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 63.1585 71.6415
sample estimates:
mean of x
     67.4

The 95% confidence interval for the mean pulse of the 10 patients is therefore [63.16, 71.64], with a sample mean of 67.4. Since the whole interval lies below the normal value of 72, the mean pulse of these 10 patients is lower than that of healthy people.

Exercise 4.6 (assignment)
x <- c(140,137,136,140,145,148,140,135,144,141)
y <- c(135,118,115,140,128,131,130,115,131,125)
t.test(x,y)

        Welch Two Sample t-test

data:  x and y
t = 4.6287, df = 13.014, p-value = 0.0004712
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  7.359713 20.240287
sample estimates:
mean of x mean of y
    140.6     126.8

So the 95% confidence interval for μ1-μ2 from this Welch (unequal-variance) test is [7.359713, 20.240287]; under the equal-variance assumption, t.test(x,y,var.equal=TRUE) gives [7.53626, 20.06374].

Exercise 4.7
x <- c(0.143,0.142,0.143,0.137)
y <- c(0.140,0.142,0.136,0.138,0.140)
t.test(x,y,var.equal=TRUE)   ## note: equal variances must be declared with var.equal=TRUE

        Two Sample t-test

data:  x and y
t = 1.198, df = 7, p-value = 0.2699
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.001996351  0.006096351
sample estimates:
mean of x mean of y
  0.14125   0.13920

Exercise 4.8 (assignment)
x <- c(140,137,136,140,145,148,140,135,144,141)
y <- c(135,118,115,140,128,131,130,115,131,125)
var.test(x,y)

        F test to compare two variances

data:  x and y
F = 0.2353, num df = 9, denom df = 9, p-value = 0.04229
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.05845276 0.94743902
sample estimates:
ratio of variances
         0.2353305

t.test(x,y)

        Welch Two Sample t-test

data:  x and y
t = 4.6287, df = 13.014, p-value = 0.0004712
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  7.359713 20.240287
sample estimates:
mean of x mean of y
    140.6     126.8

So the required confidence interval is [7.359713, 20.240287].

Exercise 4.9
X <- rep(0:6,c(7,10,12,8,3,2,0))
t.test(X)

        One Sample t-test

data:  X
t = 9.0895, df = 41, p-value = 2.238e-11
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 1.481556 2.327968
sample estimates:
mean of x
 1.904762

So the estimate is 1.904762 and the confidence interval is [1.481556, 2.327968].

Exercise 4.10 (assignment)
X <- c(1067,919,1196,785,1126,936,918,1156,920,948)
t.test(X,alternative="greater")

        One Sample t-test

data:  X
t = 23.9693, df = 9, p-value = 9.148e-10
alternative hypothesis: true mean is greater than 0
95 percent confidence interval:
 920.8443      Inf
sample estimates:
mean of x
    997.1

The one-sided 95% lower confidence bound is 920.8443 h; the solution takes this as the lifetime that about 95% of the bulbs exceed.

Exercise 5.1
x <- c(220, 188, 162, 230, 145, 160, 238, 188, 247, 113, 126, 245, 164, 231, 256, 183, 190, 158, 224, 175)
t.test(x,mu=225)

        One Sample t-test

data:  x
t = -3.4783, df = 19, p-value = 0.002516
alternative hypothesis: true mean is not equal to 225
95 percent confidence interval:
 172.3827 211.9173
sample estimates:
mean of x
   192.15

Null hypothesis: the platelet count of the painters does not differ from that of normal adult men.

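Statistical Modeling and R Software: Answers to the Exercises

In a course on statistical modeling with R, students are regularly set exercises to consolidate what they have learned.

These exercises cover both the theory and the practice of statistical modeling, as well as the use of R for data analysis and model fitting.

This article gives answers to some common exercises on statistical modeling with R, in the hope of helping students understand the course material better.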

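1. Linear regression model

Exercise: use R to fit a linear regression to a given data set, and report the regression equation and the correlation coefficient.

Answer: in R, linear regression is carried out with the lm() function.

For example, for a data set data, the following code fits the model:

```
model <- lm(y ~ x, data=data)
summary(model)
```

Here y is the dependent variable and x the independent variable.

summary() reports the coefficient estimates, which define the regression equation, together with their standard errors and the coefficient of determination R²; in simple linear regression the correlation coefficient is cor(data$x, data$y), and its square equals R².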
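The pieces needed for the regression equation and the correlation coefficient can be pulled out of the fitted object directly. A minimal sketch on simulated data; the data set, the seed and the true coefficients 2 and 0.5 are assumptions chosen only for illustration:

```r
set.seed(42)                                  # assumption: for reproducibility only
data <- data.frame(x = 1:20)
data$y <- 2 + 0.5 * data$x + rnorm(20, sd = 0.3)   # simulated linear relationship

model <- lm(y ~ x, data = data)

coef(model)                     # intercept and slope of the fitted regression equation
r2 <- summary(model)$r.squared  # coefficient of determination
r  <- cor(data$x, data$y)       # sample correlation coefficient
c(r.squared = r2, r = r)
```

In simple linear regression r² equals the R² reported by summary(), which is why the correlation coefficient can be recovered from either quantity (its sign is the sign of the slope).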

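2. Logistic regression model

Exercise: use R to fit a logistic regression to a given data set, and report the regression equation and the goodness of fit.

Answer: logistic regression is carried out with the glm() function.

For example, for a data set data, the following code fits the model:

```
model <- glm(y ~ x, data=data, family=binomial)
summary(model)
```

Here y is the dependent variable and x the independent variable; family=binomial specifies a binomial response with the default logit link, that is, logistic regression.

summary() reports the coefficient estimates, which give the regression equation on the log-odds scale, and goodness-of-fit measures such as the null deviance, the residual deviance and the AIC.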
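The quantities mentioned above can be extracted from the fitted object without reading the printed summary. A minimal sketch on simulated data; the seed, the sample size of 200 and the true coefficients -0.5 and 1.5 are assumptions for illustration:

```r
set.seed(1)                                   # assumption: for reproducibility only
x <- rnorm(200)
# Simulated binary response whose log-odds are linear in x
y <- rbinom(200, size = 1, prob = plogis(-0.5 + 1.5 * x))
data <- data.frame(x, y)

model <- glm(y ~ x, data = data, family = binomial)

coef(model)                                   # coefficients on the log-odds scale
fit <- c(null = model$null.deviance,          # deviance of the intercept-only model
         residual = deviance(model),          # deviance of the fitted model
         AIC = AIC(model))
fit
```

A residual deviance well below the null deviance indicates that x improves the fit; exp(coef(model)) converts the coefficients to odds ratios.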

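3. Analysis of variance

Exercise: use R to carry out an analysis of variance on a given data set, and state whether the differences between the groups are significant.

Answer: in R, analysis of variance is carried out with the aov() function.

For example, for a data set data, the following code performs the analysis:

```
model <- aov(y ~ group, data=data)
summary(model)
```

Here y is the dependent variable and group is the grouping factor.

summary() reports the F statistic and its p-value; the group means differ significantly at the 5% level when the p-value is below 0.05.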
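The p-value of the F test can be read programmatically from the summary table. A minimal sketch on simulated data with three groups, one of which has a shifted mean; the group labels, means, seed and sample sizes are assumptions for illustration:

```r
set.seed(7)                                   # assumption: for reproducibility only
data <- data.frame(group = factor(rep(c("A", "B", "C"), each = 10)))
true_means <- c(A = 10, B = 10, C = 13)       # hypothetical group means
data$y <- rnorm(30, mean = true_means[as.character(data$group)], sd = 1)

model <- aov(y ~ group, data = data)

# The first element of summary() is the ANOVA table; row 1 is the group effect
p <- summary(model)[[1]][["Pr(>F)"]][1]
p < 0.05                                      # TRUE means the group means differ at the 5% level

TukeyHSD(model)                               # pairwise comparisons between groups
```

The F test only says that some group differs; TukeyHSD() is one standard way to see which pairs of groups account for the difference.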


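Chapter 4

4.1
x <- c(0.1,0.2,0.9,0.8,0.7,0.7)
n <- length(x)
a <- sum(x)*(1/n)
f <- function(b){n/(b+1)+sum(log(x))}
b <- uniroot(f,c(0,1))
b
$root
[1] 0.211182
$f.root
[1] -3.844668e-05
$iter
[1] 5
$estim.prec
[1] 6.103516e-05
The estimate is 0.211182.

4.2
> s <- seq(5,65,by=10)
> x <- rep(s,c(365,245,150,100,70,45,25))
> length(x)/sum(x)
[1] 0.05

4.3
Correct solution: for the Poisson distribution, P(X=k) = λ^k/k! * e^(-λ), the mean and the variance are equal, both being λ, which here is the average number of E. coli per litre of water.

It is enough to take the mean.

> x <- rep(0:6,c(17,20,10,2,1,0,0))
> mean(x)
[1] 1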
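That the sample mean is the maximum likelihood estimate of the Poisson parameter can be confirmed numerically by maximizing the log-likelihood. A minimal sketch using the data of 4.3; the function name `loglik` and the search interval (0.01, 10) are assumptions:

```r
x <- rep(0:6, c(17, 20, 10, 2, 1, 0, 0))

# Poisson log-likelihood of the whole sample as a function of lambda
loglik <- function(lambda) sum(dpois(x, lambda, log = TRUE))

# One-dimensional numerical maximization over an assumed search interval
opt <- optimize(loglik, interval = c(0.01, 10), maximum = TRUE)

c(mle = opt$maximum, mean = mean(x))
```

The numerical maximizer agrees with mean(x) up to the tolerance of optimize().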

4.4
obj <- function(x){
  f <- c(-13+x[1]+((5-x[2])*x[2]-2)*x[2], -29+x[1]+((x[2]+1)*x[2]-14)*x[2])
  sum(f^2)
}   # sum of squares of the two expressions; minimising it is an unconstrained optimisation problem
> x0 <- c(0.5,-2)
> nlm(obj,x0)
$minimum
[1] 48.98425
$estimate
[1] 11.4127791 -0.8968052
$gradient
[1]  1.411401e-08 -1.493206e-07
$code
[1] 1
$iterations
[1] 16

4.5
> x <- c(54,67,68,78,70,66,67,70,65,69)
> t.test(x)

        One Sample t-test

data:  x
t = 35.947, df = 9, p-value = 4.938e-11
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 63.1585 71.6415
sample estimates:
mean of x
     67.4

The 95% confidence interval for the mean pulse of the 10 patients is therefore [63.16, 71.64], with a sample mean of 67.4. Whether this is lower than the normal mean pulse of 72 is checked with a one-sided test:

> t.test(x,alternative="less",mu=72)   # t.test() for a one-sided interval, single normal sample

        One Sample t-test

data:  x
t = -2.4534, df = 9, p-value = 0.01828
alternative hypothesis: true mean is less than 72
95 percent confidence interval:
     -Inf 70.83705
sample estimates:
mean of x
     67.4

The p-value is below 0.05, so the null hypothesis is rejected: the mean pulse is lower than that of healthy people.

4.6
> x <- c(140,137,136,140,145,148,140,135,144,141); x
 [1] 140 137 136 140 145 148 140 135 144 141
> y <- c(135,118,115,140,128,131,130,115,131,125); y
 [1] 135 118 115 140 128 131 130 115 131 125
> t.test(x,y,var.equal=TRUE)   # the two sample variances are assumed equal

        Two Sample t-test

data:  x and y
t = 4.6287, df = 18, p-value = 0.0002087
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  7.53626 20.06374
sample estimates:
mean of x mean of y
    140.6     126.8

The 95% confidence interval for the difference of the means is (7.53626, 20.06374).

Key point: t.test() can estimate the difference of the means of two normal samples.

In this example the two sample variances are taken to be equal.

4.7
> x <- c(0.143,0.142,0.143,0.137)
> y <- c(0.140,0.142,0.136,0.138,0.140)
> t.test(x,y,var.equal=TRUE)

        Two Sample t-test

data:  x and y
t = 1.198, df = 7, p-value = 0.2699
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.001996351  0.006096351
sample estimates:
mean of x mean of y
  0.14125   0.13920

The 95% interval estimate for the difference of the means is (-0.001996351, 0.006096351).

4.8
> x <- c(140,137,136,140,145,148,140,135,144,141); x
 [1] 140 137 136 140 145 148 140 135 144 141
> y <- c(135,118,115,140,128,131,130,115,131,125); y
 [1] 135 118 115 140 128 131 130 115 131 125
> var.test(x,y)

        F test to compare two variances

data:  x and y
F = 0.2353, num df = 9, denom df = 9, p-value = 0.04229
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.05845276 0.94743902
sample estimates:
ratio of variances
         0.2353305

Key point: var.test() can estimate the ratio of the variances of two samples.

On the basis of this result the variances can be regarded as unequal.

Consequently, in Ex4.6 the difference of the means should be computed with the unequal-variance setting.

> t.test(x,y)

        Welch Two Sample t-test

data:  x and y
t = 4.6287, df = 13.014, p-value = 0.0004712
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  7.359713 20.240287
sample estimates:
mean of x mean of y
    140.6     126.8

The 95% confidence interval for the difference of the means is (7.359713, 20.240287).
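The two-step procedure of 4.8 (test the variance ratio first, then choose the matching t interval) can be written as one short script. A minimal sketch, assuming a 5% level for the variance test; the names `equal_var` and `ci` are illustrative:

```r
x <- c(140, 137, 136, 140, 145, 148, 140, 135, 144, 141)
y <- c(135, 118, 115, 140, 128, 131, 130, 115, 131, 125)

# Step 1: F test for equality of variances (assumed 5% significance level)
equal_var <- var.test(x, y)$p.value >= 0.05

# Step 2: pooled t interval if the variances look equal, Welch interval otherwise
ci <- t.test(x, y, var.equal = equal_var)$conf.int

equal_var
as.numeric(ci)
```

Here the F test rejects equal variances, so the script returns the Welch interval shown above. Choosing the interval from a preliminary test is the approach taken in this solution; always using the Welch interval is a common alternative.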

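Key points: t.test(x,y,var.equal=TRUE) estimates the difference of the means of two normal samples with equal variances; t.test(x,y) does so for two normal samples with unequal variances.

4.9
x <- rep(0:6,c(7,10,12,8,3,2,0))
t.test(x)

        One Sample t-test

data:  x
t = 9.0895, df = 41, p-value = 2.238e-11
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 1.481556 2.327968
sample estimates:
mean of x
 1.904762

The mean number of calls is 1.90, and the 0.95 confidence interval is (1.48, 2.33).

4.10
> x <- c(1067,919,1196,785,1126,936,918,1156,920,948)
> t.test(x,alternative="greater")

        One Sample t-test

data:  x
t = 23.9693, df = 9, p-value = 9.148e-10
alternative hypothesis: true mean is greater than 0
95 percent confidence interval:
 920.8443      Inf
sample estimates:
mean of x
    997.1

The one-sided lower confidence bound for the mean bulb lifetime at confidence level 95% is 920.8443.

Key point: t.test() can produce one-sided confidence intervals.

Chapter 5

5.1 (Ex5.1)
> x <- c(220, 188, 162, 230, 145, 160, 238, 188, 247, 113, 126, 245, 164, 231, 256, 183, 190, 158, 224, 175)
> t.test(x,mu=225)

        One Sample t-test

data:  x
t = -3.4783, df = 19, p-value = 0.002516
alternative hypothesis: true mean is not equal to 225
95 percent confidence interval:
 172.3827 211.9173
sample estimates:
mean of x
   192.15

Null hypothesis: the platelet count of the painters does not differ from that of normal adult men.

Alternative hypothesis: the platelet count of the painters differs from that of normal adult men.

The p-value is below 0.05, so the null hypothesis is rejected and the platelet count of the painters is considered to differ from that of normal adult men.

The test above is two-sided.

A one-sided test can also be used.

Alternative hypothesis: the platelet count of the painters is lower than that of normal adult men.

> t.test(x,mu=225,alternative="less")

        One Sample t-test

data:  x
t = -3.4783, df = 19, p-value = 0.001258
alternative hypothesis: true mean is less than 225
95 percent confidence interval:
     -Inf 208.4806
sample estimates:
mean of x
   192.15

This likewise leads to the conclusion that the platelet count of the painters is lower than that of normal adult men.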
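The two p-values in 5.1 are related: because the t statistic is negative, the two-sided p-value is exactly twice the one-sided ("less") p-value. A minimal sketch that checks this relation on the same data; the names `p_two` and `p_less` are illustrative:

```r
x <- c(220, 188, 162, 230, 145, 160, 238, 188, 247, 113,
       126, 245, 164, 231, 256, 183, 190, 158, 224, 175)

p_two  <- t.test(x, mu = 225)$p.value                        # two-sided test
p_less <- t.test(x, mu = 225, alternative = "less")$p.value  # one-sided test

c(two.sided = p_two, less = p_less, ratio = p_two / p_less)
```

The relation holds only when the observed statistic falls on the side named by the one-sided alternative; with a positive t statistic the "less" p-value would instead exceed 0.5.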
