Econometrics with Stata: English Paper Final Draft


A Collection of 12 Selected Econometrics Papers


Introduction. This document compiles 12 selected papers in econometrics, intended to give readers an overview of recent research results and important findings in the field.

The papers span several subfields of econometrics, including time series analysis, panel data models, and econometric modeling.

This document briefly introduces the topic and main content of each paper.

Paper 1: A comparative study of time series analysis methods. The paper compares traditional ARIMA models with deep learning methods based on machine learning for time series analysis.

The results indicate that deep learning methods can achieve better forecasts in some settings.

Paper 2: Fixed effects versus random effects in panel data models. Through an empirical study, the paper compares the relative merits of fixed-effects and random-effects specifications for panel data.

The study finds that, in some settings, the fixed-effects model better accounts for the variation in the panel data.

Paper 3: Causal inference methods in econometric models. The paper reviews commonly used causal inference methods, including instrumental variables and differencing approaches.

Through empirical applications, the paper verifies the effectiveness of these methods for analyzing causal relationships.
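As a minimal, hedged sketch of how an instrumental-variables estimate is obtained in Stata (the dataset and variable names below, such as lwage, educ, and distcollege, are hypothetical and not taken from the papers), two-stage least squares runs as follows:

* Hypothetical example: log wages regressed on education, with education
* instrumented by distance to the nearest college.
use mydata.dta, clear
ivregress 2sls lwage exper (educ = distcollege), vce(robust)
estat firststage        // check instrument relevance
estat endogenous        // test whether educ is in fact endogenous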

Paper 4: Heteroskedasticity in econometric models. The paper examines the heteroskedasticity problems that commonly arise in econometric models and proposes a correction based on weighted least squares.

The results show that the method deals with heteroskedasticity effectively.
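A hedged sketch of the weighted least squares idea in Stata (variable names are hypothetical, and the variance function is assumed rather than estimated): if the error variance is taken to be proportional to a positive variable z, observations can be weighted by 1/z.

* WLS with analytic weights, assuming Var(u|x) is proportional to z
generate w = 1/z
regress y x1 x2 [aweight = w], vce(robust)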

Paper 5: A comparison of variable selection methods in econometric models. The paper compares commonly used variable selection methods, including stepwise regression and LASSO regression.

The study finds that LASSO regression performs better for variable selection.
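As a hedged illustration (variable names are hypothetical; the lasso commands require Stata 16 or later), LASSO selection with a cross-validated penalty looks like this:

lasso linear y x1-x50, selection(cv)   // penalty chosen by cross-validation
lassocoef                              // list the variables the lasso selected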

Paper 6: Testing the time series properties of econometric models. The paper reviews commonly used unit-root and stationarity tests, including the ADF and KPSS tests.

The results show that these tests are effective at detecting whether a series is stationary.
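A hedged sketch of these tests in Stata (the series name gdp and the lag choice are hypothetical; the KPSS test is a user-written command rather than built in):

tsset year
dfuller gdp, lags(4) trend   // augmented Dickey-Fuller unit-root test
ssc install kpss             // install the user-written KPSS command
kpss gdp                     // KPSS stationarity test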

Paper 7: Error correction models in econometrics. The paper studies error correction models and their application to the analysis of long-run relationships.

The results show that error correction models capture the long-run relationships among variables well.

Paper 8: Heteroskedasticity-robust standard error estimation in econometric models. The paper proposes an estimation approach based on heteroskedasticity-robust standard errors to address heteroskedasticity in econometric models.

The results show that the approach yields more reliable standard errors and inference.

Stata Code for Wooldridge's Econometrics (Sixth Edition)


Topic: the use and significance of the Stata code for Wooldridge's Econometrics, sixth edition. 1. Introduction. Econometrics, an important branch of economics, applies methods from mathematics, statistics, and computer science to the analysis of economic problems and phenomena, providing theory and tools for empirical economic research.

Wooldridge's Econometrics (sixth edition), a classic textbook in the field, is widely used for empirical research and teaching.

This article takes a close look at the Stata code in the textbook and discusses its use and significance.

2. Significance of the Stata code. In the sixth edition of Econometrics, Wooldridge uses Stata code to demonstrate the methods and workflow of empirical analysis.

The code serves more than a teaching purpose; more importantly, it shows readers how to study real economic problems with econometric methods.

By working through the Stata code, readers can acquire the basic skills of empirical analysis, learning how to handle real data, build models, and carry out estimation and inference, so that they can apply econometric methods flexibly in their own research.

3. Understanding the code in depth. In the sixth edition, the Stata code covers estimation methods ranging from simple OLS regression to panel data models, touching on a wide range of empirical problems and analytical tools.
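As a minimal, hedged sketch of that progression (the datasets and variable names below are hypothetical and are not the textbook's own examples), a cross-sectional OLS regression and a fixed-effects panel regression in Stata look like this:

* Cross-sectional OLS with robust standard errors (hypothetical wage data)
use wagedata.dta, clear
regress lwage educ exper expersq, vce(robust)

* Fixed-effects panel regression (hypothetical firm panel)
use firmpanel.dta, clear
xtset firm_id year
xtreg lprofit lcapital llabor, fe vce(cluster firm_id)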

Studying the code closely helps readers gradually understand and master the core of econometrics, including data processing and cleaning, model construction and estimation, and hypothesis testing and inference.

This depth of understanding enables readers to apply econometric methods more effectively to real economic problems and supports critical thinking and original research.

4. Personal perspective. As a researcher and teacher of econometrics, I fully appreciate how important it is to learn and master the Stata code in Wooldridge's sixth edition.

The code is not merely a tool; it embodies a way of thinking and a methodology, and it is a sharp instrument for studying economic phenomena and problems.

With continued study and practice, I believe we can understand and apply econometric methods better and bring more insight and progress to economic research and practice.

5. Summary. This article has examined the use and significance of the Stata code in the sixth edition of Econometrics.

The code exists not only to teach us how to carry out empirical analysis; more importantly, it helps us deeply understand and master the ideas and methods of econometrics.

Microeconometrics Using Stata


Contents (chapter level)

1 Stata basics
2 Data management and graphics
3 Linear regression basics
4 Simulation
5 GLS regression
6 Linear instrumental-variables regression
7 Quantile regression
8 Linear panel-data models: Basics
9 Linear panel-data models: Extensions
10 Nonlinear regression methods
11 Nonlinear optimization methods
12 Testing methods
13 Bootstrap methods
14 Binary outcome models
15 Multinomial models
16 Tobit and selection models

Chapter Summaries of Wooldridge's Econometrics (English Edition)


CHAPTER 1 TEACHING NOTES

You have substantial latitude about what to emphasize in Chapter 1. I find it useful to talk about the economics of crime example and the wage example so that students see, at the outset, that econometrics is linked to economic reasoning, even if the economics is not complicated theory. I like to familiarize students with the important data structures that empirical economists use, focusing primarily on cross-sectional and time series data sets, as these are what I cover in a first-semester course. It is probably a good idea to mention the growing importance of data sets that have both a cross-sectional and time dimension. I spend almost an entire lecture talking about the problems inherent in drawing causal inferences in the social sciences. I do this mostly through the agricultural yield, return to education, and crime examples. These examples also contrast experimental and nonexperimental (observational) data. Students studying business and finance tend to find the term structure of interest rates example more relevant, although the issue there is testing the implication of a simple theory, as opposed to inferring causality. I have found that spending time talking about these examples, in place of a formal review of probability and statistics, is more successful (and more enjoyable for the students and me).

CHAPTER 2 TEACHING NOTES

This is the chapter where I expect students to follow most, if not all, of the algebraic derivations. In class I like to derive at least the unbiasedness of the OLS slope coefficient, and usually I derive the variance. At a minimum, I talk about the factors affecting the variance. To simplify the notation, after I emphasize the assumptions in the population model, and assume random sampling, I just condition on the values of the explanatory variables in the sample. Technically, this is justified by random sampling because, for example, E(ui | x1, x2, …, xn) = E(ui | xi) by independent sampling. I find that students are able to focus on the key assumption and subsequently take my word about how conditioning on the independent variables in the sample is harmless. (If you prefer, the appendix to Chapter 3 does the conditioning argument carefully.) Because statistical inference is no more difficult in multiple regression than in simple regression, I postpone inference until Chapter 4. (This reduces redundancy and allows you to focus on the interpretive differences between simple and multiple regression.) You might notice how, compared with most other texts, I use relatively few assumptions to derive the unbiasedness of the OLS slope estimator, followed by the formula for its variance. This is because I do not introduce redundant or unnecessary assumptions. For example, once the zero conditional mean assumption is imposed, nothing further about the relationship between u and x is needed to obtain the unbiasedness of OLS under random sampling.

CHAPTER 3 TEACHING NOTES

For undergraduates, I do not work through most of the derivations in this chapter, at least not in detail. Rather, I focus on interpreting the assumptions, which mostly concern the population. Other than random sampling, the only assumption that involves more than population considerations is the assumption about no perfect collinearity, where the possibility of perfect collinearity in the sample (even if it does not occur in the population) should be touched on. The more important issue is perfect collinearity in the population, but this is fairly easy to dispense with via examples.
These come from my experiences with the kinds of model specification issues that beginners have trouble with. The comparison of simple and multiple regression estimates – based on the particular sample at hand, as opposed to their statistical properties – usually makes a strong impression. Sometimes I do not bother with the "partialling out" interpretation of multiple regression. As far as statistical properties, notice how I treat the problem of including an irrelevant variable: no separate derivation is needed, as the result follows from the relevant theorem. I do like to derive the omitted variable bias in the simple case. This is not much more difficult than showing unbiasedness of OLS in the simple regression case under the first four Gauss-Markov assumptions. It is important to get the students thinking about this problem early on, and before too many additional (unnecessary) assumptions have been introduced. I have intentionally kept the discussion of multicollinearity to a minimum. This partly indicates my bias, but it also reflects reality. It is, of course, very important for students to understand the potential consequences of having highly correlated independent variables. But this is often beyond our control, except that we can ask less of our multiple regression analysis. If two or more explanatory variables are highly correlated in the sample, we should not expect to precisely estimate their ceteris paribus effects in the population. I find extensive treatments of multicollinearity, where one "tests" or somehow "solves" the multicollinearity problem, to be misleading, at best. Even the organization of some texts gives the impression that imperfect multicollinearity is somehow a violation of the Gauss-Markov assumptions: they include multicollinearity in a chapter or part of the book devoted to "violation of the basic assumptions," or something like that. I have noticed that master's students who have had some undergraduate econometrics are often confused on the multicollinearity issue. It is very important that students not confuse multicollinearity among the included explanatory variables in a regression model with the bias caused by omitting an important variable. I do not prove the Gauss-Markov theorem. Instead, I emphasize its implications. Sometimes, and certainly for advanced beginners, I put a special case of one of the end-of-chapter problems on a midterm exam, where I make a particular choice for the function g(x). Rather than have the students directly compare the variances, they should appeal to the Gauss-Markov theorem for the superiority of OLS over any other linear, unbiased estimator.

CHAPTER 4 TEACHING NOTES

At the start of this chapter is a good time to remind students that a specific error distribution played no role in the results of Chapter 3. That is because only the first two moments were derived under the full set of Gauss-Markov assumptions. Nevertheless, normality is needed to obtain exact normal sampling distributions (conditional on the explanatory variables). I emphasize that the full set of CLM assumptions are used in this chapter, but that in Chapter 5 we relax the normality assumption and still perform approximately valid inference. One could argue that the classical linear model results could be skipped entirely, and that only large-sample analysis is needed. But, from a practical perspective, students still need to know where the t distribution comes from because virtually all regression packages report t statistics and obtain p-values off of the t distribution.
I then find it very easy to cover Chapter 5 quickly, by just saying we can drop normality and still use t statistics and the associated p-values as being approximately valid. Besides, occasionally students will have to analyze smaller data sets, especially if they do their own small surveys for a term project. It is crucial to emphasize that we test hypotheses about unknown population parameters. I tell my students that they will be punished if they write something like H0: β̂1 = 0 on an exam or, even worse, H0: .632 = 0. One useful feature of Chapter 4 is its illustration of how to rewrite a population model so that it contains the parameter of interest in testing a single restriction. I find this is easier, both theoretically and practically, than computing variances that can, in some cases, depend on numerous covariance terms. The example of testing equality of the return to two- and four-year colleges illustrates the basic method, and shows that the respecified model can have a useful interpretation. Of course, some statistical packages now provide a standard error for linear combinations of estimates with a simple command, and that should be taught, too. One can use an F test for single linear restrictions on multiple parameters, but this is less transparent than a t test and does not immediately produce the standard error needed for a confidence interval or for testing a one-sided alternative. The trick of rewriting the population model is useful in several instances, including obtaining confidence intervals for predictions in Chapter 6, as well as for obtaining confidence intervals for marginal effects in models with interactions (also in Chapter 6). The major league baseball player salary example illustrates the difference between individual and joint significance when explanatory variables (rbisyr and hrunsyr in this case) are highly correlated. I tend to emphasize the R-squared form of the F statistic because, in practice, it is applicable a large percentage of the time, and it is much more readily computed. I do regret that this example is biased toward students in countries where baseball is played. Still, it is one of the better examples of multicollinearity that I have come across, and students of all backgrounds seem to get the point.

CHAPTER 5 TEACHING NOTES

Chapter 5 is short, but it is conceptually more difficult than the earlier chapters, primarily because it requires some knowledge of asymptotic properties of estimators. In class, I give a brief, heuristic description of consistency and asymptotic normality before stating the consistency and asymptotic normality of OLS. (Conveniently, the same assumptions that work for finite sample analysis work for asymptotic analysis.) More advanced students can follow the proof of consistency of the slope coefficient in the bivariate regression case. A later section contains a full matrix treatment of asymptotic analysis appropriate for a master's level course. An explicit illustration of what happens to standard errors as the sample size grows emphasizes the importance of having a larger sample. I do not usually cover the LM statistic in a first-semester course, and I only briefly mention the asymptotic efficiency result. Without full use of matrix algebra combined with limit theorems for vectors and matrices, it is very difficult to prove asymptotic efficiency of OLS. I think the conclusions of this chapter are important for students to know, even though they may not fully grasp the details.
On exams I usually include true-false type questions, with explanation, to test the students' understanding of asymptotics. [For example: "In large samples we do not have to worry about omitted variable bias." (False). Or "Even if the error term is not normally distributed, in large samples we can still compute approximately valid confidence intervals under the Gauss-Markov assumptions." (True).]

CHAPTER 6 TEACHING NOTES

I cover most of Chapter 6, but not all of the material in great detail. I use the example in the text's table to quickly run through the effects of data scaling on the important OLS statistics. (Students should already have a feel for the effects of data scaling on the coefficients, fitted values, and R-squared because it is covered in Chapter 2.) At most, I briefly mention beta coefficients; if students have a need for them, they can read this subsection. The functional form material is important, and I spend some time on more complicated models involving logarithms, quadratics, and interactions. An important point for models with quadratics, and especially interactions, is that we need to evaluate the partial effect at interesting values of the explanatory variables. Often, zero is not an interesting value for an explanatory variable and is well outside the range in the sample. Using the methods from Chapter 4, it is easy to obtain confidence intervals for the effects at interesting x values. As far as goodness-of-fit, I only introduce the adjusted R-squared, as I think using a slew of goodness-of-fit measures to choose a model can be confusing to novices (and does not reflect empirical practice). It is important to discuss how, if we fixate on a high R-squared, we may wind up with a model that has no interesting ceteris paribus interpretation. I often have students and colleagues ask if there is a simple way to predict y when log(y) has been used as the dependent variable, and to obtain a goodness-of-fit measure for the log(y) model that can be compared with the usual R-squared obtained when y is the dependent variable. The methods described in the text are easy to implement and, unlike other approaches, do not require normality. The section on prediction and residual analysis contains several important topics, including constructing prediction intervals. It is useful to see how much wider the prediction intervals are than the confidence interval for the conditional mean. I usually discuss some of the residual-analysis examples, as they have real-world applicability.

CHAPTER 7 TEACHING NOTES

This is a fairly standard chapter on using qualitative information in regression analysis, although I try to emphasize examples with policy relevance (and only cross-sectional applications are included). In allowing for different slopes, it is important, as in Chapter 6, to appropriately interpret the parameters and to decide whether they are of direct interest. For example, in the wage equation where the return to education is allowed to depend on gender, the coefficient on the female dummy variable is the wage differential between women and men at zero years of education. It is not surprising that we cannot estimate this very well, nor should we want to. In this particular example we would drop the interaction term because it is insignificant, but the issue of interpreting the parameters can arise in models where the interaction term is significant. In discussing the Chow test, I think it is important to discuss testing for differences in slope coefficients after allowing for an intercept difference.
In many applications, a significant Chow statistic simply indicates intercept differences. (See the example on student-athlete GPAs in the text.) From a practical perspective, it is important to know whether the partial effects differ across groups or whether a constant differential is sufficient. I admit that an unconventional feature of this chapter is its introduction of the linear probability model. I cover the LPM here for several reasons. First, the LPM is being used more and more because it is easier to interpret than probit or logit models. Plus, once the proper parameter scalings are done for probit and logit, the estimated effects are often similar to the LPM partial effects near the mean or median values of the explanatory variables. The theoretical drawbacks of the LPM are often of secondary importance in practice. One of the computer exercises is a good one to illustrate that, even with over 9,000 observations, the LPM can deliver fitted values strictly between zero and one for all observations. If the LPM is not covered, many students will never know about using econometrics to explain qualitative outcomes. This would be especially unfortunate for students who might need to read an article where an LPM is used, or who might want to estimate an LPM for a term paper or senior thesis. Once they are introduced to the purpose and interpretation of the LPM, along with its shortcomings, they can tackle nonlinear models on their own or in a subsequent course. A useful modification of the LPM estimated in the text is to drop kidsge6 (because it is not significant) and then define two dummy variables, one for kidslt6 equal to one and the other for kidslt6 at least two. These can be included in place of kidslt6 (with no young children being the base group). This allows a diminishing marginal effect in an LPM. I was a bit surprised when a diminishing effect did not materialize.

CHAPTER 8 TEACHING NOTES

This is a good place to remind students that homoskedasticity played no role in showing that OLS is unbiased for the parameters in the regression equation. In addition, you probably should mention that there is nothing wrong with the R-squared or adjusted R-squared as goodness-of-fit measures. The key is that these are estimates of the population R-squared, 1 – [Var(u)/Var(y)], where the variances are the unconditional variances in the population. The usual R-squared, and the adjusted version, consistently estimate the population R-squared whether or not Var(u|x) depends on x. Of course, heteroskedasticity causes the usual standard errors, t statistics, and F statistics to be invalid, even in large samples, with or without normality. By explicitly stating the homoskedasticity assumption as conditional on the explanatory variables that appear in the conditional mean, it is clear that only heteroskedasticity that depends on the explanatory variables in the model affects the validity of standard errors and test statistics. The version of the Breusch-Pagan test in the text, and the White test, are ideally suited for detecting forms of heteroskedasticity that invalidate inference obtained under homoskedasticity. If heteroskedasticity depends on an exogenous variable that does not also appear in the mean equation, this can be exploited in weighted least squares for efficiency, but only rarely is such a variable available. One case where such a variable is available is when an individual-level equation has been aggregated.
I discuss this case in the text but I rarely have time to teach it. As I mention in the text, other traditional tests for heteroskedasticity, such as the Park and Glejser tests, do not directly test what we want, or add too many assumptions under the null. The Goldfeld-Quandt test only works when there is a natural way to order the data based on one independent variable. This is rare in practice, especially for cross-sectional applications. Some argue that weighted least squares estimation is a relic, and is no longer necessary given the availability of heteroskedasticity-robust standard errors and test statistics. While I am sympathetic to this argument, it presumes that we do not care much about efficiency. Even in large samples, the OLS estimates may not be precise enough to learn much about the population parameters. With substantial heteroskedasticity we might do better with weighted least squares, even if the weighting function is misspecified. As discussed in the text on pages 288-289, one can, and probably should, compute robust standard errors after weighted least squares. For asymptotic efficiency comparisons, these would be directly comparable to the heteroskedasticity-robust standard errors for OLS. Weighted least squares estimation of the LPM is a nice example of feasible GLS, at least when all fitted values are in the unit interval. Interestingly, in the LPM examples in the text and the LPM computer exercises, the heteroskedasticity-robust standard errors often differ by only small amounts from the usual standard errors. However, in a couple of cases the differences are notable, as in one of the computer exercises.

CHAPTER 9 TEACHING NOTES

The coverage of RESET in this chapter recognizes that it is a test for neglected nonlinearities, and it should not be expected to be more than that. (Formally, it can be shown that if an omitted variable has a conditional mean that is linear in the included explanatory variables, RESET has no ability to detect the omitted variable. Interested readers may consult my chapter in Companion to Theoretical Econometrics, 2001, edited by Badi Baltagi.) I just teach students the F statistic version of the test. The Davidson-MacKinnon test can be useful for detecting functional form misspecification, especially when one has in mind a specific alternative, nonnested model. It has the advantage of always being a one degree of freedom test. I think the proxy variable material is important, but the main points can be made with a couple of the examples. The first shows that controlling for IQ can substantially change the estimated return to education, and the omitted ability bias is in the expected direction. Interestingly, education and ability do not appear to have an interactive effect. The second is a nice example of how controlling for a previous value of the dependent variable – something that is often possible with survey and nonsurvey data – can greatly affect a policy conclusion. One of the computer exercises is also a good illustration of this method. I rarely get to teach the measurement error material, although the attenuation bias result for classical errors-in-variables is worth mentioning. The result on exogenous sample selection is easy to discuss, with more details given in Chapter 17. The effects of outliers can be illustrated using the examples.
I think the infant mortality example is useful for illustrating how a single influential observation can have a large effect on the OLS estimates. With the growing importance of least absolute deviations, it makes sense to at least discuss the merits of LAD, at least in more advanced courses. One of the computer exercises is a good example to show how mean and median effects can be very different, even though there may not be "outliers" in the usual sense.

CHAPTER 10 TEACHING NOTES

Because of its realism and its care in stating assumptions, this chapter puts a somewhat heavier burden on the instructor and student than traditional treatments of time series regression. Nevertheless, I think it is worth it. It is important that students learn that there are potential pitfalls inherent in using regression with time series data that are not present for cross-sectional applications. Trends, seasonality, and high persistence are ubiquitous in time series data. By this time, students should have a firm grasp of multiple regression mechanics and inference, and so you can focus on those features that make time series applications different from cross-sectional ones. I think it is useful to discuss static and finite distributed lag models at the same time, as these at least have a shot at satisfying the Gauss-Markov assumptions. Many interesting examples have distributed lag dynamics. In discussing the time series versions of the CLM assumptions, I rely mostly on intuition. The notion of strict exogeneity is easy to discuss in terms of feedback. It is also pretty apparent that, in many applications, there are likely to be some explanatory variables that are not strictly exogenous. What the student should know is that, to conclude that OLS is unbiased – as opposed to consistent – we need to assume a very strong form of exogeneity of the regressors. Chapter 11 shows that only contemporaneous exogeneity is needed for consistency. Although the text is careful in stating the assumptions, in class, after discussing strict exogeneity, I leave the conditioning on X implicit, especially when I discuss the no serial correlation assumption. As this is a new assumption I spend some time on it. (I also discuss why we did not need it for random sampling.) Once the unbiasedness of OLS, the Gauss-Markov theorem, and the sampling distributions under the classical linear model assumptions have been covered – which can be done rather quickly – I focus on applications. Fortunately, the students already know about logarithms and dummy variables. I treat index numbers in this chapter because they arise in many time series examples. A novel feature of the text is the discussion of how to compute goodness-of-fit measures with a trending or seasonal dependent variable. While detrending or deseasonalizing y is hardly perfect (and does not work with integrated processes), it is better than simply reporting the very high R-squareds that often come with time series regressions with trending variables.

CHAPTER 11 TEACHING NOTES

Much of the material in this chapter is usually postponed, or not covered at all, in an introductory course. However, as Chapter 10 indicates, the set of time series applications that satisfy all of the classical linear model assumptions might be very small. In my experience, spurious time series regressions are the hallmark of many student projects that use time series data.
Therefore, students need to be alerted to the dangers of using highly persistent processes in time series regression equations. (The spurious regression problem and the notion of cointegration are covered in detail in Chapter 18.) It is fairly easy to heuristically describe the difference between a weakly dependent process and an integrated process. Using the MA(1) and the stable AR(1) examples is usually sufficient. When the data are weakly dependent and the explanatory variables are contemporaneously exogenous, OLS is consistent. This result has many applications, including the stable AR(1) regression model. When we add the appropriate homoskedasticity and no serial correlation assumptions, the usual test statistics are asymptotically valid. The random walk process is a good example of a unit root (highly persistent) process. In a one-semester course, the issue comes down to whether or not to first difference the data before specifying the linear model. While unit root tests are covered in Chapter 18, just computing the first-order autocorrelation is often sufficient, perhaps after detrending. The examples in the text illustrate how different first-difference results can be from estimating equations in levels. One section is novel in an introductory text, and simply points out that, if a model is dynamically complete in a well-defined sense, it should not have serial correlation. Therefore, we need not worry about serial correlation when, say, we test the efficient market hypothesis. Another section further investigates the homoskedasticity assumption, and, in a time series context, emphasizes that what is contained in the explanatory variables determines what kind of heteroskedasticity is ruled out by the usual OLS inference. These two sections could be skipped without loss of continuity.

CHAPTER 12 TEACHING NOTES

Most of this chapter deals with serial correlation, but it also explicitly considers heteroskedasticity in time series regressions. The first section allows a review of what assumptions were needed to obtain both finite sample and asymptotic results. Just as with heteroskedasticity, serial correlation itself does not invalidate R-squared. In fact, if the data are stationary and weakly dependent, R-squared and adjusted R-squared consistently estimate the population R-squared (which is well-defined under stationarity). An equation in the text is useful for explaining why the usual OLS standard errors are not generally valid with AR(1) serial correlation. It also provides a good starting point for discussing serial correlation-robust standard errors later in the chapter. The subsection on serial correlation with lagged dependent variables is included to debunk the myth that OLS is always inconsistent with lagged dependent variables and serial correlation. I do not teach it to undergraduates, but I do to master's students. The section on testing is somewhat untraditional in that it begins with an asymptotic t test for AR(1) serial correlation (under strict exogeneity of the regressors). It may seem heretical not to give the Durbin-Watson statistic its usual prominence, but I do believe the DW test is less useful than the t test. With nonstrictly exogenous regressors I cover only the regression form of Durbin's test, as the h statistic is asymptotically equivalent and not always computable. The section on GLS and FGLS estimation is fairly standard, although I try to show how comparing OLS estimates and FGLS estimates is not so straightforward.
Unfortunately, at the beginning level (and even beyond), it is difficult to choose a course of action when they are very different. I do not usually cover the section on serial correlation-robust inference in a first-semester course, but, because some econometrics packages routinely compute fully robust standard errors, students can be pointed to it if they need to learn something about what the corrections do. I do cover it for a master's level course in applied econometrics (after the first-semester course). I also do not cover the section on ARCH in class; again, this is more to serve as a reference for more advanced students, particularly those with interests in finance. One important point is that ARCH is heteroskedasticity and not serial correlation, something that is confusing in many texts. If a model contains no serial correlation, the usual heteroskedasticity-robust statistics are valid. I have a brief subsection on correcting for a known form of heteroskedasticity and AR(1) errors in models with strictly exogenous regressors.

CHAPTER 13 TEACHING NOTES

While this chapter falls under "Advanced Topics," most of this chapter requires no more sophistication than the previous chapters. (In fact, I would argue that, with the possible exception of one section, this material is easier than some of the time series chapters.) Pooling two or more independent cross sections is a straightforward extension of cross-sectional methods. Nothing new needs to be done in stating assumptions, except possibly mentioning that random sampling in each time period is sufficient. The practically important issue is allowing for different intercepts, and possibly different slopes, across time. The natural experiment material and extensions of the difference-in-differences estimator are widely applicable and, with the aid of the examples, easy to understand. Two years of panel data are often available, in which case differencing across time is a simple way of removing unobserved heterogeneity. If you have covered Chapter 9, you might compare this with a regression in levels using the second year of data, but where a lagged dependent variable is included. (The second approach only requires collecting information on the dependent variable in a previous year.) These often give similar answers. Two years of panel data, collected before and after a policy change, can be very powerful for policy analysis. Having more than two periods of panel data causes slight complications in that the errors in the differenced equation may be serially correlated. (However, the traditional assumption that the errors in the original equation are serially uncorrelated is not always a good one. In other words, it is not always more appropriate to use

Econometrics and Stata: Application Operations

Econometrics is an important subfield of economics. It studies how to apply mathematical and statistical methods to economic problems, in particular by building economic models and using real data to analyze and test them.

Its goal is to use observational data to test economic theory and to draw scientific conclusions about economic phenomena and policy.

In practice, researchers usually rely on specialized econometric software for data processing and analysis.

Among these packages, Stata is one of the most widely used: it offers a rich set of data-management, model-estimation, and statistical-inference features and is applied extensively in econometric research.

In Stata, the most common data-management tasks are data import, data cleaning, and data transformation.

Data import brings external data files into Stata for subsequent analysis; files in many formats can be imported, such as Excel, CSV, and SPSS.
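As a minimal sketch (the file names and the sheet name below are hypothetical, not taken from the text), the usual import routes look like this:

* Import an Excel worksheet, treating the first row as variable names
. import excel using "survey_data.xlsx", sheet("Sheet1") firstrow clear
* Import a comma-separated (CSV) file
. import delimited using "survey_data.csv", clear
* Open a native Stata dataset
. use "survey_data.dta", clear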

Data cleaning checks the imported data, corrects errors, and removes outliers to ensure data quality and reliability.

Data transformation converts the raw data into a form suitable for estimation and analysis, for example converting variable types, sorting observations, and merging datasets.
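A hedged sketch of what such cleaning and transformation steps might look like in Stata (all variable and file names are hypothetical):

* Convert a numeric variable stored as text, and deal with implausible values
. destring income, replace
. replace income = . if income < 0
. drop if missing(income)
* Sort the observations and merge in a second dataset
. sort province year
. merge 1:1 province year using "macro_controls.dta"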

The core steps of an econometric analysis in Stata are specifying an economic model, estimating its parameters, and carrying out statistical inference.

Specifying the model involves choosing appropriate economic theory and model structure and defining the dependent variable, the explanatory variables, and the control variables.

Parameter estimation uses the observed data to compute coefficient estimates; common methods include ordinary least squares, maximum likelihood, and instrumental variables.

Statistical inference subjects the estimates to significance tests and confidence-interval construction in order to assess the reliability and economic meaning of the model.
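A short sketch of these steps with hypothetical variable names: regress fits the model by least squares, test performs a Wald test of a restriction, and the level() option controls the reported confidence intervals.

* Estimate the model with heteroskedasticity-robust standard errors
. regress y x1 x2 x3, vce(robust)
* Wald test of a linear restriction on two coefficients
. test x2 = x3
* Report 99% confidence intervals instead of the default 95%
. regress y x1 x2 x3, level(99)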

Beyond these basic operations, Stata offers a rich set of advanced features, such as panel-data analysis, time-series analysis, and quantitative policy calculations.

Panel-data analysis handles data on many individuals observed at several points in time, allowing for individual and time fixed effects or random effects.
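For example (a sketch with hypothetical names, where id identifies the individual and year the period), the fixed- and random-effects estimators and a Hausman comparison between them might be run as:

* Declare the panel structure
. xtset id year
* Fixed-effects (within) estimation, then random-effects GLS
. xtreg y x1 x2, fe
. estimates store fe
. xtreg y x1 x2, re
. estimates store re
* Hausman test of fixed versus random effects
. hausman fe re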

Time-series analysis handles time-dependent data and features such as trends, cycles, and seasonality.
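A minimal illustration with a hypothetical yearly series: the data are declared as time series, persistence is inspected, and a simple dynamic regression with a linear trend is fitted.

* Declare the time variable
. tsset year
* Autocorrelations as a rough check on persistence
. corrgram y, lags(8)
* A regression with a linear time trend and one lag of the dependent variable
. regress y year L.y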

Quantitative policy calculations use the estimation results for policy evaluation and forecasting, for example impact evaluation, decision-tree analysis, and Monte Carlo simulation.

In short, econometrics and its implementation in Stata are an indispensable part of economic research.

It solves practical problems by building economic models, estimating parameters, and drawing statistical inferences; as a widely used econometrics package, Stata provides the functions and tools that let researchers carry out data management, estimation, and inference quickly and conveniently, and thereby reach accurate and reliable conclusions.

《计量经济学及Stata应用》 (Econometrics and Stata Applications)

© Chen Qiang, 《计量经济学及 Stata 应用》, 2014. Please do not upload or redistribute.
Chapter 8: Model Specification and Data Problems. If the model specification is inappropriate, for example a poor choice of explanatory variables, measurement error, or an unsuitable functional form, a "specification error" arises. The data themselves may also be problematic, for instance because of multicollinearity.
… as n → ∞, … increases.
8.3 Modeling strategy: "specific to general" or "general to specific"?

The "specific to general" approach starts from the simplest small model and gradually adds explanatory variables. But a small model is likely to omit relevant variables, making the estimators inconsistent and invalidating the t and F tests, so it is hard to decide which variables to keep.

In contrast, the "general to specific" modeling approach…
Example: In agronomy, plots of land are randomly divided into three groups (because it is hard to find plots with identical soil conditions), each group receives a different amount of fertilizer, and the effect of fertilization is then examined.

In economics, the experiments conducted in "experimental economics" are essentially randomized experiments.

Consider the following regression model:
y = α + βx + ε
where x is determined completely at random (for example, by a coin flip or a computer-generated random number). Since x and ε are independent, Cov(x, ε) = 0, so no matter how many explanatory variables are omitted, OLS remains consistent.
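A small simulation sketch makes this point concrete (all names and numbers are invented for illustration): x is assigned by a coin flip, an important variable z is omitted from the regression, and yet the OLS coefficient on x stays close to its true value of 2.

* Simulated data: randomly assigned x, omitted variable z
. clear
. set obs 10000
. set seed 12345
. generate x = runiform() < 0.5
. generate z = rnormal()
. generate y = 1 + 2*x + 3*z + rnormal()
* OLS omitting z; the estimated coefficient on x should be close to 2
. regress y x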
In general, however, one cannot hold all the other variables {x2, …, xK} fixed (even in experiments on mice, the mice still differ from one another), so a strictly controlled experiment is infeasible.
Ronald Fisher, the father of modern statistics, proposed the idea of the randomized experiment. The experimental subjects (or units) are usually divided at random into two groups: the "treatment group" receives the real drug, while the "control group" receives a placebo.

Econometrics Experiment Report Using Stata

Introduction. Econometrics is an important branch of economics that applies statistical and mathematical tools to study economic phenomena and to test the validity of economic theory.

Empirical research is one of its core activities, and Stata, a powerful statistical package, is widely used for empirical work in econometrics.

This report uses a concrete example to illustrate how Stata can be used for an econometric study.

Background and purpose. Empirical research is the process of collecting real data and using statistical methods to test and verify economic theory.

Its purpose is to reveal the underlying regularities of economic phenomena and to provide a scientific basis for policy making and economic decisions.

In this study, the GDP growth rate of a given country is the main object of interest; we examine how it relates to the population growth rate, the investment rate, and the export growth rate.

Data collection and processing. First, the relevant data must be collected: the GDP growth rate, the population growth rate, the investment rate, and the export growth rate.

These data can be obtained from the national statistics bureau or other relevant agencies.

Once collected, the data must be processed to ensure accuracy and consistency.

In Stata, the data can be read in with commands such as use or import, and the describe and summarize commands can then be used to inspect the variables and obtain descriptive statistics.
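A hedged sketch of this import-and-inspect step (the file name and variable names are hypothetical):

* Read in the raw data and inspect it
. import delimited using "growth_data.csv", clear
. describe
. summarize gdp_growth pop_growth inv_rate exp_growth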

Model specification and estimation. After the data have been processed, an economic model is specified and estimated.

In this study, a multiple linear regression model is used to explore the relationship between the GDP growth rate and the population growth rate, the investment rate, and the export growth rate.

The model is specified as:

GDP growth rate = β0 + β1 * population growth rate + β2 * investment rate + β3 * export growth rate + ε

where β0, β1, β2, and β3 are parameters to be estimated and ε is the error term.

In Stata, the regress command runs the regression and estimates the model parameters.

The significance of the model and its goodness of fit can then be judged from the regression output and from post-estimation commands such as test and estat.
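A sketch of how the estimation and the follow-up tests might be run, assuming the variables are named as below (these names are not given in the report):

* Estimate the multiple regression
. regress gdp_growth pop_growth inv_rate exp_growth
* Joint significance of the three regressors
. test pop_growth inv_rate exp_growth
* Breusch-Pagan test for heteroskedasticity
. estat hettest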

Results and discussion. After the model has been estimated, the results are analyzed and discussed.

First, the estimated coefficients indicate how the variables are related.

A positive coefficient indicates a positive relationship between the variables; a negative coefficient indicates a negative one.

Advanced Econometrics and Stata Applications

This article aims to give a comprehensive, detailed, and in-depth treatment of advanced econometrics and its application in Stata, from theory to practice, so that readers can understand the field and use Stata effectively for data analysis.

The discussion is organized as follows:
1. Basic concepts and fields of application of mathematical statistics
2. The development of advanced econometrics and its main methods
3. How Stata is used in advanced econometrics
4. A worked example of economic data analysis

Basic concepts and fields of application of mathematical statistics

1.1 Concepts. Mathematical statistics is a discipline that, building on probability theory, applies mathematical and statistical methods to study statistical regularities and uses these regularities to describe, analyze, and interpret statistical problems.

By collecting, organizing, and analyzing real data, it derives statistical laws and provides a scientific basis for decision making.

1.2 Fields of application. Mathematical statistics is widely applied across disciplines, especially in economics.

It helps economists analyze economic phenomena, produce forecasts, and evaluate policy effects.

It is also used in medical research, sociological surveys, the design of psychological experiments, and other fields.

The development of advanced econometrics and its main methods

2.1 Development. Advanced econometrics is a branch of econometrics that emphasizes the combination of economic theory with quantitative methods, studying economic phenomena through mathematical models and statistical analysis.

The field has developed rapidly since the 1950s, passing through the establishment of its basic theory, the development of econometric models, and the innovation of econometric methods.

2.2 Main methods. Several methods are widely used in advanced econometrics, such as panel-data models, time-series analysis, and instrumental variables.

These methods help researchers deal with endogeneity, dependence in the data, and related problems.
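As one hedged illustration of how an endogeneity problem might be handled (all variable names are hypothetical): two-stage least squares with z1 and z2 as excluded instruments for the endogenous regressor x2, followed by standard post-estimation diagnostics.

* 2SLS with robust standard errors
. ivregress 2sls y x1 (x2 = z1 z2), vce(robust)
* Test whether x2 is in fact endogenous
. estat endogenous
* Overidentification test of the instruments
. estat overid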

How Stata is used in advanced econometrics

3.1 Overview. Stata is a widely used package for economic data analysis; it can clean, process, analyze, and visualize data.

Its powerful econometric features make it an important tool for research in advanced econometrics.

3.2 Basic Stata operations. Research in advanced econometrics with Stata requires a command of some basic operations.

These include data import, data processing, and model estimation.

In addition, Stata offers a rich set of statistical commands and graphing tools that help researchers analyze their data in detail and present the results.
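For instance (hypothetical variable names), two common ways of displaying results graphically:

* Histogram of a variable with a normal density overlaid
. histogram wage, normal
* Scatter plot with a fitted regression line
. twoway (scatter wage educ) (lfit wage educ)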

[Students of the School of Economics] Econometric Methods in Papers Published in the Major English-Language Economics Journals

Authors: Xiao Jinchuan, Ren Fei, and Liu Yu, School of Economics, Fudan University.

The article analyzes how the papers published between 2001 and 2012 in the five internationally recognized top English-language economics journals (the American Economic Review, the Quarterly Journal of Economics, the Journal of Political Economy, Econometrica, and the Review of Economic Studies) changed over that period, in order to document objectively how rigorous empirical research has developed.

1. Analysis of papers published in the top journals, 2001-2012. All papers are divided into three broad classes: "theoretical" (containing only formal model derivations), "theoretical-empirical" (containing both formal derivations and real data with large-sample estimation methods), and "empirical" (no formal derivations, only real data with large-sample statistical inference and estimation). The "theoretical-empirical" class includes studies whose main contribution is empirical and whose theory is merely illustrative.

In aggregate, the total number of papers published in the five journals over the twelve years 2001-2012 was relatively stable, in most years between 260 and 280.

Over these twelve years, both "theoretical-empirical" and "empirical" papers trended upward: compared with 2001, their counts in 2012 rose by 55.8% and 42.1%, respectively.

By contrast, the number of purely "theoretical" papers was on a downward trend overall.

This is preliminary evidence that empirical methods have become popular in economic research and are increasingly welcomed by the mainstream journals.

Comparing 2001-2006 with 2007-2012, in the later period "empirical" and "theoretical-empirical" papers accounted for 24.6% and 23.1% of all papers, up from 24.3% and 16.4% in the earlier period.

Correspondingly, the share of "theoretical" papers fell considerably, from 59.3% in the earlier period to 52.3% in the later one.

One important reason the number of empirical papers grew rapidly over 2001-2012 is that improved data availability lowered the cost of empirical research.

This clearly shows that good economic research is not only pure theoretical derivation and mathematical simulation: using empirical methods to connect with reality, test models, and uncover problems can also produce first-rate papers, and judging by the current trend this may well become the mainstream of future economic research.

An Econometrics Term Paper

Econometrics and experimental economics are important methods and tools for empirical analysis in economics.

Below is a sample econometrics term paper, compiled by the editor Yuli and provided for reference.

Sample term paper 1: An analysis of the current state of enterprise economic statistics in China and its reform and innovation. Against the background of the knowledge economy, enterprise economic statistics faces a new round of development opportunities and challenges.

Traditional ways of thinking not only fail to meet the needs of practical statistical work but also hinder the further development of China's economy and society.

It is therefore necessary to keep reforming and innovating in light of the current state of economic statistics so as to adapt better to the development of the economic era.

1. The importance of innovation in enterprise economic statistics.

The establishment of the modern enterprise system has brought Chinese firms a new round of opportunities and challenges and places higher demands on enterprise management.

As an important auxiliary tool for enterprise development, economic statistics not only provides managers with accurate information and a basis for decisions but also helps ensure that production and business activities run smoothly.

Rising standards of enterprise management also make information-system construction more demanding; since economic statistics involves a great deal of information-based activity, statisticians must keep improving the functions of the enterprise's statistical information network so that its information systems can be fully developed.

Formulating development strategy and managing the firm smoothly also require statisticians to design a more scientific and reasonable system of statistical indicators, so that more accurate and complete data can be compiled, production and operations can be assessed more precisely, and the firm can develop in an all-round way.

Innovation in economic statistics is therefore vital to the construction and development of the enterprise.

2. The current state of enterprise economic statistics.

First, many firms lack a sound statistical system.

At present, many firms have not established a complete, unified system for statistical work, so staff cannot obtain accurate and complete standardized reports, the ledgers and original records are disorderly, and managers cannot get high-quality statistics when formulating strategy and making decisions.

At the same time, a number of new firms are growing with the market economy, and a large share of them have not set up standard statistical systems: statistical departments and staff are neither clearly defined nor sensibly allocated, statistical reports are not filed by professional statisticians, and employees do not carry out statistical work, report calculation, or tabulation according to the relevant statistical rules.


A Quantitative Analysis of Changes in the Number of Graduate Students

一、The research question. This paper takes the total number of enrolled graduate students as the dependent variable and carries out a multivariate analysis of the factors listed below: we collect the relevant data, build a model, and quantify the relationship between enrollment and its main determinants. From the size of the estimated coefficients we assess the relative importance of each factor, identify which ones play the key role in the change in graduate enrollment, and offer suggestions based on the likely future trend.

The main factors considered are:
1. Per capita GDP: an important determinant of graduate enrollment (graduate school is costly, and only those with some economic base have more opportunity to pursue it).
2. Total population: another important determinant (it can be regarded as the base from which students are drawn).
3. The number of unemployed persons: a direct determinant (precisely because unemployment is high, more people choose the graduate entrance examination to strengthen their position in the job market).
4. The number of colleges and universities: a non-trivial determinant (the emergence of more institutions of higher learning allows more people to take the entrance examination).

二、The model

Y = α + β1 X1 + β2 X2 + β3 X3 + β4 X4 + u

where
Y  = total number of enrolled graduate students (dependent variable)
X1 = per capita GDP (explanatory variable)
X2 = total population (explanatory variable)
X3 = number of unemployed persons (explanatory variable)
X4 = number of colleges and universities (explanatory variable)

三、Data collection

1. Data description. Time-series data for a single region (i.e., China) are used for the fitting.

2. Data. The series cover 1986 to 2005; the details are shown in Table 1.

Table 1
Year    Y        X1      X2      X3     X4
1986    110371   963     107507  264.4  1054
1987    120191   1112    109300  276.6  1063
1988    112776   1366    111026  296.2  1075
1989    101339   1519    112704  377.9  1075
1990    93018    1644    114333  383.2  1075
1991    88128    1893    115823  352.2  1075
1992    94164    2311    117171  363.9  1053
1993    106771   2998    118517  420.1  1065
1994    127935   4044    119850  476.4  1080
1995    145443   5046    121121  519.6  1054
1996    163322   5846    122389  552.8  1032
1997    176353   6420    123626  576.8  1020
1998    198885   6796    124761  571    1022
1999    233513   7159    125786  575    1071
2000    301239   7858    126743  595    1041
2001    393256   8622    127627  681    1225
2002    500980   9398    128453  770    1396
2003    651260   10542   129227  800    1552
2004    819896   12336   129988  827    1731
2005    978610   14040   130756  839    1792

四、Model parameter estimation, testing, and correction

1. Parameter estimation, economic significance, and statistical inference. We first plot each explanatory variable against Y:

. twoway (scatter Y X1)

[Figure: scatter plot of Y against X1]

. twoway (scatter Y X2)

[Figure: scatter plot of Y against X2]

. twoway (scatter Y X3)

[Figure: scatter plot of Y against X3]

. twoway (scatter Y X4)

[Figure: scatter plot of Y against X4]
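A hedged sketch of how the estimation in this section could be carried out, assuming the year variable in the dataset is named year (the scatter plots above only describe the data):

* Declare the time series and estimate the model
. tsset year
. regress Y X1 X2 X3 X4
* Basic diagnostics: heteroskedasticity and first-order serial correlation
. estat hettest
. estat dwatson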
