Statistical model - PLS
Stata Beginner Tutorial 5: Estimating Linear Regression Models

1.2.3 exposure
exposure(varname) constrains the coefficient of ln(varname) in the model to 1. This option appears mostly in count models.
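As a rough sketch of what the constraint means: fixing the coefficient of ln(exposure) at 1 turns the count model into a rate model, because exp(xb + ln(exposure)) = exp(xb) * exposure. The numbers below are hypothetical, not Stata output:

```python
import math

# Count model with an exposure offset:
#   ln E[y] = b0 + b1*x + 1.0 * ln(exposure)
# Fixing the coefficient of ln(exposure) at 1 makes E[y]
# proportional to exposure, i.e. the model describes a rate.
b0, b1 = 0.5, 0.2          # hypothetical coefficients
x, exposure = 1.0, 10.0    # one hypothetical observation

mu = math.exp(b0 + b1 * x + 1.0 * math.log(exposure))
rate = math.exp(b0 + b1 * x)   # expected count per unit of exposure

print(mu, rate * exposure)     # identical by construction
```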
Econometric Software Applications, 2020/6/13
Menu: Statistics > Postestimation > Reports and statistics
Situations that cause perfect collinearity: (1) one independent variable is a constant multiple of another; (2) one independent variable can be expressed exactly as a linear function of two or more other independent variables. When either occurs, the independent variables are multicollinear.
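The second case can be checked numerically: if one column is an exact linear combination of the others, the Gram matrix X'X is singular (zero determinant). A small sketch with made-up data:

```python
# Perfect collinearity: one regressor is an exact linear function of
# the others, so X'X is singular and OLS has no unique solution.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [0.0, 1.0, 0.0, 1.0]
x3 = [2 * a + b for a, b in zip(x1, x2)]   # exact linear combination

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Gram matrix X'X for X = [x1 x2 x3]
cols = [x1, x2, x3]
g = [[dot(u, v) for v in cols] for u in cols]

# 3x3 determinant; zero signals perfect collinearity
det = (g[0][0] * (g[1][1] * g[2][2] - g[1][2] * g[2][1])
       - g[0][1] * (g[1][0] * g[2][2] - g[1][2] * g[2][0])
       + g[0][2] * (g[1][0] * g[2][1] - g[1][1] * g[2][0]))
print(det)  # 0 up to rounding: X'X is singular
```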
* Sample variation in the independent variables: within the sample, an independent variable is not a constant.
Homoskedasticity: var(u | x1, x2, x3, ...) = σ².
Methods for estimating the autocorrelation coefficient rho. The available methods are:
- dw: rho_dw = 1 - dw/2, where dw is the Durbin-Watson statistic
- regress: rho_regress from the residual regression et = rho_regress*et-1 + vt
- freg: rho_freg from the residual regression et = rho_freg*et+1 + vt
- tscorr: rho = e'et-1 / e'e, where e and et-1 are the residuals and the one-period-lagged residuals
- theil: rho = rho_tscorr * (N - k)/N
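The dw, tscorr, and theil formulas can be sketched directly from a residual series. The residuals and k below are invented for illustration; this is not output from Stata's prais:

```python
# Hypothetical OLS residuals e_1..e_N
e = [0.5, 0.7, 0.2, -0.4, -0.6, -0.1, 0.3, 0.6, 0.4, -0.2]
N = len(e)
k = 2  # number of estimated parameters, assumed for the theil correction

# Durbin-Watson statistic and the dw-based estimate rho = 1 - dw/2
dw = sum((e[t] - e[t - 1]) ** 2 for t in range(1, N)) / sum(x * x for x in e)
rho_dw = 1 - dw / 2

# tscorr: rho = e'e_{-1} / e'e using the one-period-lagged residuals
rho_tscorr = sum(e[t] * e[t - 1] for t in range(1, N)) / sum(x * x for x in e)

# theil: small-sample correction rho = rho_tscorr * (N - k) / N
rho_theil = rho_tscorr * (N - k) / N

print(rho_dw, rho_tscorr, rho_theil)
```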
     rconsum |      Coef.   Std. Err.      t    P>|t|
-------------+---------------------------------------
       rneti |   .6478134   .0387183   16.73   0.000
       _cons |   482.8383    265.268    1.82   0.079
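As a sanity check on output like the above, each t statistic equals the coefficient divided by its standard error:

```python
# Reproduce the t statistics in the regression output above: t = coef / se
coef_rneti, se_rneti = 0.6478134, 0.0387183
coef_cons, se_cons = 482.8383, 265.268

t_rneti = coef_rneti / se_rneti
t_cons = coef_cons / se_cons
print(round(t_rneti, 2), round(t_cons, 2))  # 16.73 1.82
```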
Introduction to PLS

Summary of PCA & PLS literature reading. Overall note: these methods suit analyses where the underlying pattern is not obvious; for data governed by a known formula with a clear pattern, the results will inevitably be worse than computing from the formula.

1. Title: Chapter 30: Partial Least Squares Regression. Author: none. Path: D:\ProjectWork\文献资料\PCA&PLS\PLS\算法大全第30章__偏最小二乘回归. Keywords: an introduction to PLS-related algorithms; the chapter includes worked examples, and the matlab folder in the same path contains MATLAB programs.

1) Overview: as the title suggests, this is an overview of the partial least squares algorithm and can serve as an introductory tutorial.

2) Key theory, explanations, and passage markers. a) Key theory. b) Key explanations. c) Key passages. (1) Overview of the advantages and characteristics of PLS: partial least squares regression provides a method for many-to-many linear regression modeling. Especially when the two sets of variables are both numerous and multicollinear, and the number of observations (the sample size) is small, a model built with partial least squares regression has advantages that traditional classical regression analysis lacks.

In its modeling process, partial least squares regression combines the characteristics of principal component analysis, canonical correlation analysis, and linear regression analysis. Consequently, besides providing a more reasonable regression model, the analysis can also accomplish tasks similar to principal component analysis and canonical correlation analysis, delivering richer and deeper information.
2. Title: Mathematical Statistics and MATLAB Data Processing. Author: none. Path: D:\ProjectWork\文献资料\PCA&PLS\PLS\HerveAbdi_MatlabPrograms4PLS\数理统计与MATLAB数据处理. Keywords: a comparatively accessible introductory PLS tutorial, and the first one I encountered.

1) Overview: this is a book whose Chapter 6 is a tutorial on PLS.

2) Key theory: the characteristics of PLS: it has unique advantages for problems with small sample sizes, many explanatory variables, and severe multicollinearity among variables, and it can simultaneously accomplish regression modeling, simplification of the data structure, and correlation analysis between two sets of variables.

Suppose there are q dependent variables y1, y2, ..., yq and p independent variables x1, x2, ..., xp. To study the statistical relationship between them, n sample points are observed, forming the "sample x variable" data matrices X = (xij)n×p and Y = (yij)n×q. In its modeling process, PLS uses information synthesis and screening: instead of regressing the dependent system Y directly on the independent system X, it extracts from X m new composite variables t1, ..., tm (m <= p), also called components, that have the best explanatory power for both X and Y. It first builds the MLR regression of yk on the components t1, ..., tm, and then converts it back into the PLS regression of yk on the original variables x1, x2, ..., xp, for k = 1, 2, ..., q.
Partial Least Squares (PLS): An Introduction

The partial least squares method is a relatively new multivariate statistical data analysis method, first proposed in 1983 by S. Wold and C. Albano. In recent decades it has developed rapidly in theory, methodology, and application. For a long time, the boundary between model-based methods and cognition-based (exploratory) methods was drawn very sharply. Partial least squares unifies them organically: within a single algorithm it can simultaneously perform regression (multivariate linear regression), data-structure simplification (principal component analysis), and correlation analysis between two sets of variables (canonical correlation analysis). This is a leap forward in multivariate statistical data analysis. The importance of the partial least squares method in statistical applications is reflected in the following aspects: partial least squares regression is a modeling method for regressing multiple response variables on multiple predictors; it can solve many problems that ordinary regression cannot; and it is called a second-generation regression method because it integrates several data analysis methods into a comprehensive whole. The main purpose of principal component regression is to extract the relevant information hidden in the matrix X and then use it to predict the values of the variable Y.
This approach ensures that only the informative part of the predictors is used and the noise is eliminated, improving the quality of the prediction model. However, principal component regression still has a defect: when a useful variable is only weakly correlated with the others, it is easily discarded during component selection, reducing the reliability of the final prediction model, and examining every candidate component individually is too laborious. Partial least squares regression solves this problem. It decomposes both X and Y, extracting components (usually called factors) from X and Y simultaneously, and then orders the factors by the correlation between them, from largest to smallest. To build a model, we then only have to choose how many factors to include.

The basic concept. Partial least squares regression is an extension of the multiple linear regression model. In its simplest form, a linear model describes the relationship between the response variable Y and the predictor variables X: Y = b0 + b1X1 + b2X2 + ... + bpXp. In this equation, b0 is the intercept and the bi (i = 1, ..., p) are the regression coefficients estimated from the data points. For example, we can think of a person's weight as a function of height and gender; with regression coefficients estimated from sample points, we can roughly predict someone's weight from measured height and gender. For many data analysis methods, the central problem is describing the data accurately and making reasonable predictions for new observations.
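The single-predictor case of the linear model above has a closed-form least squares solution. A self-contained sketch with hypothetical height/weight data (the figures are invented for illustration):

```python
# OLS for y = b0 + b1*x with one predictor, closed form:
#   b1 = cov(x, y) / var(x),  b0 = mean(y) - b1 * mean(x)
height = [1.60, 1.65, 1.70, 1.75, 1.80]   # hypothetical heights (m)
weight = [55.0, 59.0, 63.0, 67.0, 71.0]   # hypothetical weights (kg)

n = len(height)
mx = sum(height) / n
my = sum(weight) / n
b1 = sum((x - mx) * (y - my) for x, y in zip(height, weight)) / \
     sum((x - mx) ** 2 for x in height)
b0 = my - b1 * mx

predicted = b0 + b1 * 1.72   # predicted weight for a 1.72 m person
print(round(b1, 2), round(predicted, 1))
```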
To handle more complex data analysis problems, multiple linear regression has been extended by other algorithms, such as discriminant analysis, principal component regression, and canonical correlation analysis, all of which are multivariate statistical methods built on the multiple linear regression model. These multivariate methods share two important constraints on the data: the factors are extracted from the X'X and Y'Y matrices, so the factors cannot simultaneously express the correlation between the variables X and Y; and the number of prediction equations can never exceed the number of variables in Y and X. Partial least squares regression, unlike multiple linear regression and its extensions, does not need these data constraints. In partial least squares regression, the prediction equations are described by factors extracted from the matrix Y'XX'Y, and, to be more representative, the maximum number of prediction equations extracted may exceed the number of variables in X and Y. In short, partial least squares regression may be the least restrictive of all multivariate calibration methods, and this flexibility makes it suitable for many situations where traditional multivariate calibration methods do not apply, for example when there are fewer observations than predictor variables. Moreover, partial least squares regression can be used as an exploratory analysis tool: before fitting a traditional linear regression model, it can select the appropriate number of predictor variables and remove noise.
Therefore, partial least squares regression is widely used for modeling in many fields, such as chemistry, medicine, economics, psychology, and pharmaceutical science; its ability to accommodate an arbitrary number of variables is an especially prominent advantage. In chemometrics, partial least squares regression has become a standard multivariate modeling tool.

The calculation process. The basic model: as a multiple-linear-regression method, the main purpose of partial least squares regression is to build the linear model Y = XB + E, where Y is a response matrix with m variables and n sample points, X is a predictor matrix with p variables and n samples, B is the regression coefficient matrix, and E is a noise (calibration-error) term with the same dimensions as Y. In general, the variables X and Y are standardized before the calculation, i.e., centered by subtracting their means and scaled by dividing by their standard deviations. Like principal component regression, partial least squares regression uses factor scores, linear combinations of the original predictor variables, as the basis of prediction, so the factor scores used to build the prediction model must be linearly independent. For example, suppose we have a set of response variables (the matrix Y) and a large number of predictor variables (the matrix X), some of them severely collinear. We extract factors from this data with a factor-extraction method, compute the factor score matrix T = XW for an appropriate weight matrix W, and then fit the linear regression model Y = TQ + E, where Q is the matrix of regression coefficients of T and E is an error term. Once Q is computed, the previous equation is equivalent to Y = XB + E with B = WQ, which can be used directly as a predictive regression model.
The difference between partial least squares regression and principal component regression lies in how the factor scores are extracted. In short, the weight matrix W produced by principal component regression reflects the covariance among the predictor variables X, whereas the weight matrix W produced by partial least squares regression reflects the covariance between the predictor variables X and the response variables Y. In the model, partial least squares regression produces a p × c weight matrix W whose column vectors are used to compute the column vectors of the n × c factor score matrix T of the variables X. These weights are computed so that the covariance between each factor score and the response is maximized. Ordinary least squares regression of Y on T then yields Q, the matrix of loadings (or weights) of Y, giving the regression equation Y = TQ + E. Once Q is computed, we obtain the equivalent equation Y = XB + E with B = WQ, and the final prediction model is built.

The nonlinear iterative partial least squares method. The standard algorithm for computing partial least squares regression is nonlinear iterative partial least squares (NIPALS); there are many variants of this algorithm, some standardized and some not. The following is considered one of the most effective nonlinear iterative partial least squares methods. For h = 1, ..., c, with c the number of components and A0 = X'Y, M0 = X'X, C0 = I:
- compute qh, the dominant eigenvector of Ah'Ah;
- wh = ChAhqh, wh = wh/||wh||, and store wh as a column of W;
- ph = Mhwh, ch = wh'Mhwh, ph = ph/ch, and store ph as a column of P;
- qh = Ah'wh/ch, and store qh as a column of Q;
- Ah+1 = Ah - chphqh', Mh+1 = Mh - chphph', Ch+1 = Ch - whph'.
The factor score matrix T is then computed as T = XW, and the partial least squares regression coefficients B by B = WQ'.
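The NIPALS recursion can be illustrated for the common single-response case (PLS1). The following is a simplified plain-Python sketch, assuming column-centered data and a single y vector, not the full multi-response algorithm; the toy data are invented:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pls1(X, y, n_comp):
    # Minimal PLS1 via NIPALS-style deflation; X (list of rows) and y
    # are assumed column-centered. Returns the score vectors and the
    # fitted values accumulated over the extracted components.
    Xd = [row[:] for row in X]
    yd = y[:]
    p_vars = len(X[0])
    scores = []
    yhat = [0.0] * len(y)
    for _ in range(n_comp):
        cols = [[row[j] for row in Xd] for j in range(p_vars)]
        w = [dot(c, yd) for c in cols]        # direction of max covariance with y
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]             # normalized weight vector
        t = [dot(row, w) for row in Xd]       # component scores t = X w
        tt = dot(t, t)
        p = [dot(c, t) / tt for c in cols]    # X loadings
        q = dot(yd, t) / tt                   # y loading
        Xd = [[Xd[i][j] - t[i] * p[j] for j in range(p_vars)]
              for i in range(len(Xd))]        # deflate X
        yd = [yd[i] - q * t[i] for i in range(len(yd))]       # deflate y
        yhat = [yhat[i] + q * t[i] for i in range(len(y))]
        scores.append(t)
    return scores, yhat

# Toy centered data: y = 2*x1 + x2 exactly, columns not orthogonal
X = [[-1.0, -1.0], [0.0, 1.0], [0.0, -1.0], [1.0, 1.0]]
y = [-3.0, 1.0, -1.0, 3.0]
scores, yhat = pls1(X, y, 2)
print(dot(scores[0], scores[1]))   # successive scores are orthogonal (about 0)
```

With as many components as the rank of X, the PLS1 fit reproduces the least squares fit, so yhat matches y here.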
The SIMPLS algorithm. Another method of estimating the partial least squares components is the SIMPLS algorithm. For h = 1, ..., c, with A0 = X'Y, M0 = X'X, C0 = I:
- compute qh, the dominant eigenvector of Ah'Ah;
- wh = Ahqh, ch = wh'Mhwh, wh = wh/sqrt(ch), and store wh as a column of W;
- ph = Mhwh, and store ph as a column of P;
- qh = Ah'wh, and store qh as a column of Q;
- vh = Chph, vh = vh/||vh||;
- Ch+1 = Ch - vhvh', Mh+1 = Mh - phph', Ah+1 = ChAh.
As with NIPALS, T is computed by T = XW and B by B = WQ'.

Related literature
- Xu Lu. Chemometrics Methods. Science Press, Beijing, 1995.
- Wang Huiwen. Partial Least Squares Regression: Method and Applications. National Science and Technology Press, Beijing, 1996.
- Chin, W. W., and Newsted, P. R. (1999). Structural Equation Modeling Analysis with Small Samples Using Partial Least Squares. In Rick Hoyle (Ed.), Statistical Strategies for Small Sample Research. Sage Publications.
- Chin, W. W. (1998). The Partial Least Squares Approach for Structural Equation Modeling. In George A. Marcoulides (Ed.), Modern Methods for Business Research. Lawrence Erlbaum Associates.
- Barclay, D., Higgins, C., and Thompson, R. (1995). The Partial Least Squares (PLS) Approach to Causal Modeling: Personal Computer Adoption and Use as an Illustration. Technology Studies, 2(2), 285-309.
- Chin, W. W. (1995). Partial Least Squares Is to LISREL as Principal Components Analysis Is to Common Factor Analysis.
PLS Sample Paper

Research on Service Innovation Performance Structural Model

Ying Sun, School of Management, Hebei University of Technology, Tianjin, China, sunying99@
Wei Mao, School of Management, Hebei University of Technology, Tianjin, China, maowei97@

Abstract: Modern enterprises need to continuously provide customers with excellent services to maintain their competitive edge, so they attach great importance to service innovation activities. How to assess the success of service innovation, and thereby safeguard the success rate of service innovation, has become a pressing practical problem for enterprises. Starting from this problem, this paper uses empirical research methods to explore a model of service innovation performance. The data were analyzed with a variety of statistical methods to test the research hypotheses, and the results are explained and discussed. Ultimately, a four-dimensional model of service innovation performance is proposed and verified. The results show that service innovation performance comprises four dimensions: financial targets, business growth, customer indicators, and internal indicators. On the one hand, this conclusion complements the theory of service innovation performance; on the other hand, it provides an evaluation framework for service innovation activities and helps companies determine whether such activities have succeeded.

Key Words: Service Innovation; Service Innovation Performance; Structural Model

I. INTRODUCTION
Essentially, every modern enterprise can be called a service enterprise: even manufacturing firms must provide customers with comprehensive services (Liu Yifu, 2007). In particular, manufacturers have begun to compete extensively on service, trying to strengthen product competitiveness through "service" and to treat it as a new source of value (Lin Lei and Wu Guisheng, 2006).
Modeling the Calorific Value of Coal Fired in Thermal Power Units Based on an Improved PLS Algorithm

Modeling the Calorific Value of Coal Fired in Thermal Power Units Based on an Improved PLS Algorithm
Li Jing; Sun Lingfang

Abstract: Based on an analysis of the proximate-analysis data of as-fired coal and their relationship with calorific value, five proximate-analysis components, namely moisture, ash, volatile matter, fixed carbon, and total sulfur, were selected as model inputs, with the coal's calorific value as the model output, and a calorific-value prediction model for a power plant's coal was built on an improved partial least squares (PLS) algorithm. The number of latent variables in the prediction model is determined by the PRESS criterion. Prediction results show that the model is highly accurate and the prediction errors meet engineering requirements.
Journal: 《化工自动化及仪表》 (Control and Instruments in Chemical Industry)
Year (volume), issue: 2016, 43(5)
Pages: 5 (501-504, 516)
Keywords: partial least squares algorithm; coal proximate analysis; calorific value; modeling
Authors: Li Jing; Sun Lingfang
Affiliation: School of Automation Engineering, Northeast Electric Power University, Jilin 132012, Jilin, China
CLC number: TH89

The calorific value of coal, also called its heating value, is the amount of heat released when a unit mass of coal burns completely.
The calorific value is not only the basis for pricing steam coal but also the basis for calculating the heat balance, coal consumption, and thermal efficiency of coal-fired processes. It directly determines the economic value of the coal, and it is essentially the combined reflection of the carbon, hydrogen, oxygen, and sulfur in the coal. At present, both in China and abroad, calorific value is measured mainly offline. The measurement conditions prescribed by the national standard are demanding and require tightly controlled reaction conditions, so the calorific value of a test sample cannot be determined quickly. Alternatively, empirical formulas derived from proximate-analysis and ultimate-analysis data can be used for prediction; such formulas give quick estimates of the calorific value, but some of them carry large errors.

Researchers abroad have explored the measurement of coal calorific value extensively. Franco A [1], Querol X [2], and Goodarzi F [3] used elaborate thermal-analysis methods, such as thermogravimetry, derivative thermogravimetry, and differential thermal analysis, to determine calorific value, while Hassanzadeh S [4] and Peter H G [5] established corresponding empirical formulas. With the development of computer applications, many new methods, such as artificial neural networks, have been applied to calculating the calorific value of coal [6,7]. In China, Mei Xiaoren et al. built a regression model between calorific value and ash content [8]; Han Zhongxu et al. constructed a soft-sensing algorithm for calorific value from the law of energy conservation [9]; Liu Zhihua obtained the as-received net calorific value by solving equations with the Mendeleev formula [10]; Li Z et al. built a calorific-value model using adaptive near-infrared frequency-domain analysis [11]; Zhou Jiemin et al. predicted calorific value with neural networks [12]; Guan Yuebo et al. built a prediction model based on support vector machine theory [13]; Min Fanfei and Wang Longgui built a GM(0,3) model from weighted-mean generated data [14]; and Zhang Xichun et al. derived a mathematical model relating steam-coal calorific value to proximate-analysis indices [15]. In this paper, the authors analyze the proximate-analysis data and calorific value of as-fired coal at a power plant, build a calorific-value model based on an improved partial least squares (PLS) algorithm, and use it to predict the calorific value of the coal fired in another supercritical unit at the same plant.
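The PRESS value mentioned in the abstract (predicted residual error sum of squares) is a leave-one-out criterion: each observation is removed, the model is refit, and the squared errors of predicting the held-out points are summed. A generic sketch on a simple one-predictor model with invented data, not the paper's coal data:

```python
# PRESS = sum over i of (y_i - yhat_(-i))^2, where yhat_(-i) is the
# prediction for observation i from a model fitted without it.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]   # hypothetical data, roughly y = 2x

def ols_fit(xs, ys):
    # Closed-form simple linear regression: returns (intercept, slope)
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b1 = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / \
         sum((a - mx) ** 2 for a in xs)
    return my - b1 * mx, b1

press = 0.0
for i in range(len(x)):
    xs = x[:i] + x[i + 1:]          # leave observation i out
    ys = y[:i] + y[i + 1:]
    b0, b1 = ols_fit(xs, ys)
    press += (y[i] - (b0 + b1 * x[i])) ** 2
print(round(press, 4))
```

In PLS modeling the same loop is run for each candidate number of latent variables, and the count that minimizes PRESS is kept.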
PLS-DR: Classification of Several Cancers Based on Gene Expression Profiles (IJITCS-V3-N4-1)

I. INTRODUCTION
A vast amount of data is generated through gene microarrays, [...] advantage of the huge information offered by the data. As a generally used statistical modeling method, PLS was first used in econometric path modeling by Herman Wold and was afterwards used in chemometric and spectrometric modeling as a multivariate regression tool [11-13]. Nguyen and Rocke proposed using PLSDR (PLS-based dimension reduction) for dimension [...] important genes. In this article, we consider only the binary classification problem, so the samples belong to two classes and the t-statistic score is given as: [...]
PLSDR has been applied in a great many areas across various domains [7-8]; in addition, comparisons have been [...]
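A common form of the two-class t-statistic score used for gene ranking is the two-sample (Welch-type) statistic; the sketch below assumes that form, with invented expression values:

```python
import math

# Two-class t-statistic score for ranking one gene:
#   t = (mean1 - mean2) / sqrt(s1^2/n1 + s2^2/n2)
class1 = [2.3, 2.1, 2.6, 2.4]   # hypothetical expression values, class 1
class2 = [1.1, 1.4, 1.0, 1.3]   # hypothetical expression values, class 2

def t_score(a, b):
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)   # sample variances
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    return (ma - mb) / math.sqrt(va / na + vb / nb)

print(round(t_score(class1, class2), 3))
```

Genes with the largest absolute scores discriminate most strongly between the two classes.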
Introduction to Statistical Analysis and Applications: SPSS (Chinese Edition) + SmartPLS 4 (Chinese Edition) Manual
Preface. Scientific research is a continual inquiry into the truth of people, affairs, and things; its aim is truth, goodness, and beauty. Even if perfection cannot be reached, we try to stay as close to the facts as possible. Drawing on more than 20 years of study and hands-on experience in multivariate analysis, we provide correct reference examples for research papers using multivariate analysis: scale development, descriptive statistics, correlation analysis, chi-square tests, comparison of means, factor analysis, regression analysis, discriminant analysis and logistic regression, one-way analysis of variance, multivariate analysis of variance, canonical correlation analysis, reliability and validity analysis, conjoint analysis, multidimensional scaling and cluster analysis, regression models, path analysis and the Process macro, and the second-generation statistical technique of structural equation modeling (SEM). We have finally completed 《統計分析入門與應用SPSS (中文版) + SmartPLS 4 (PLS-SEM)》, hoping to help more people who need data analysis, and especially to help them report multivariate results correctly.

In recent years, multivariate statistical analysis has gradually undergone great changes, for example the evolution of SEM toward assessing the fit of research models. Topics covered include scale development; the differences between CB-SEM and PLS-SEM; model specification; the development and specification of reflective and formative indicators; the use of second-order and higher-order latent variables; the application of mediating and moderating variables; the assessment of formative measures; the five types of mediation; the various types of moderation; measurement invariance; worked MGA examples; mediated moderation; and moderated mediation.

Through many lectures, workshops, training courses, and seminars, we found that many participants were unsure how to report analysis results correctly; moreover, after reviewing many journal submissions, we found that many otherwise well-written papers were rejected because their analyses or reported results were imprecise. The completion of 《統計分析入門與應用SPSS (中文版) + SmartPLS 4 (PLS-SEM)》 can help more researchers who need to report multivariate analysis correctly to publish their findings in conference proceedings, journals, and theses. We thank our many readers for their support of 《多變量分析最佳入門實用書SPSS + LISREL》, 《統計分析SPSS (中文版) + PLS_SEM (SmartPLS)》, and the second and third editions of 《統計分析入門與應用SPSS (中文版) + SmartPLS 3 (PLS_SEM)》; this book has been updated to SmartPLS 4.
Research on Extraction Process Trajectory and Online Quality Monitoring of Danhong Injection (Huang Hongxia)
3.3 Online monitoring of the extraction trajectory. Once the statistical control model is established, new batches can be monitored. Test batch 6 (simulating a raw-material ratio problem) and test batch 7 (simulating abnormal heating temperature) were used to assess the monitoring capability of the model; their extraction trajectories are shown in Figure 5. Figures 5(a) and 5(b) are the score control charts for the first and second principal components. For both test batches, all or some of the sampling points fall outside the control limits, indicating that both batches involve process anomalies. Figure 5(b) shows that batch 6, owing to its raw-material ratio problem, deviated from the normal trajectory from the very start, and most of its extraction trajectory lay far outside the control limits; batch 7, with abnormal heating temperature, deviated briefly from the normal trajectory at sampling point 9 and remained off-normal during the 30 to 45 min sampling window.
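The score control charts described above flag a batch when its principal component scores leave limits estimated from normal batches. A generic sketch of the idea with invented scores (not the paper's data), using mean plus or minus three standard deviations as the limits:

```python
# Control limits for a PC-score chart: mean +/- 3*std of the scores
# observed in normal (in-control) training batches.
normal_scores = [0.1, -0.3, 0.2, 0.0, -0.1, 0.3, -0.2, 0.1]  # hypothetical PC1 scores

n = len(normal_scores)
mean = sum(normal_scores) / n
std = (sum((s - mean) ** 2 for s in normal_scores) / (n - 1)) ** 0.5
upper, lower = mean + 3 * std, mean - 3 * std

test_batch = [0.2, 0.9, 1.4, 1.1]   # hypothetical scores from an abnormal batch
out_of_control = [s for s in test_batch if s < lower or s > upper]
print(len(out_of_control) > 0)      # batch flagged as abnormal
```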
The safflower was sourced from Xinjiang and the Salvia miltiorrhiza (danshen) from Shandong. 1.2 Extraction process: danshen and safflower were weighed in a fixed ratio, mixed evenly, and placed in a 2 L multi-neck round-bottom flask (apparatus shown in Figure 1); an appropriate amount of distilled water was added, and the reflux apparatus, transflectance probe, and thermometer were fitted; after the water-bath heating system raised the temperature to the target value, it was held there for a set period.

Fig. 1 Schematic of the extraction experiment set-up

Fig. 3 Extraction process trajectory of Danhong injection based on PC scores

Vol. 38, No. 11, June 2013
PLS Tutorial
Communications of the Association for Information Systems (Volume 16, 2005) 91-109

A PRACTICAL GUIDE TO FACTORIAL VALIDITY USING PLS-GRAPH: TUTORIAL AND ANNOTATED EXAMPLE

David Gefen, Drexel University, gefend@
Detmar Straub, Georgia State University

ABSTRACT
This tutorial explains in detail what factorial validity is and how to run its various aspects in PLS. The tutorial is written as a teaching aid for doctoral seminars that may cover PLS and for researchers interested in learning PLS. An annotated example with data is provided as an additional tool to assist the reader in reconstructing the detailed example.

Keywords: PLS, factorial validity, convergent validity, discriminant validity, confirmatory factor analysis, AVE.

I. INTRODUCTION
Since we first published our tutorial "Structural Equation Modeling Techniques and Regression: Guidelines for Research Practice" in Communications of AIS [Gefen et al., 2000] and its follow-up "Validation Guidelines for IS Positivist Research" [Straub et al., 2004], we have received many emails about the practicalities of running PLS (Partial Least Squares) and LISREL. In consultation with the editor of CAIS, we are publishing this addendum to the Guidelines. The objective of this short guide is to describe how to run factorial validity and how to examine it through PLS. Specifically, the tutorial discusses and demonstrates convergent and discriminant validity, including the AVE analysis. The tutorial is aimed at researchers who wish to adopt PLS-Graph but are still unaware of how to assess factorial validity through it.

II. THEORETICAL BACKGROUND
Factorial validity is important in the context of establishing the validity of latent constructs. Latent constructs, also known as latent variables, are research abstractions that cannot be measured directly, variables such as beliefs and perceptions.
Quantitative positivist researchers assume that while some variables, such as gender and age, can be measured directly and with little error, a major difficulty arises with surrogates, where the abstraction is removed from objective reality.1

1 See :88/quant/ for greater detail.

Because such abstractions cannot easily be measured through direct means, agreed-upon practice dictates that they be measured indirectly through several items in a research instrument [Anderson and Gerbing, 1988; Bagozzi, 1977; Campbell and Fiske, 1959; Churchill, 1979]. Each measurement item, i.e., each actual scale item on an instrument, is thus assumed to reflect one and only one latent variable. This property of the scale, having each of its measurement items relate to it better than to any other, is known as unidimensionality. Unidimensionality is discussed in detail by Gerbing and Anderson [1988] and is delineated in another CAIS tutorial [Gefen, 2003]. Unidimensionality cannot be measured with PLS but is assumed to be there a priori [Gefen, 2003; Gerbing and Anderson, 1988]. However, two elements of factorial validity can and must be examined in PLS, as they must be with latent variables in general [Churchill, 1979; Gerbing and Anderson, 1988]. The two elements, convergent validity and discriminant validity, are components of a larger scientific measurement concept known as construct validity [Straub et al., 2004]. These two validities capture some of the aspects of the goodness of fit of the measurement model, i.e., how well the measurement items relate to the constructs.
When factorial validity is acceptable, each measurement item correlates strongly with the one construct it is related to, while correlating weakly or not significantly with all other constructs. Typically, because of the way factorial validity is established in PLS, this pattern is divided into convergent validity and discriminant validity. Convergent validity is shown when each measurement item correlates strongly with its assumed theoretical construct, while discriminant validity is shown when each measurement item correlates weakly with all constructs except the one to which it is theoretically associated.

In first-generation regression models, factorial validity was most frequently assessed with an Exploratory Factor Analysis, or EFA.2 Several estimation methods can be used in an EFA; their objective is generally the same:
• to establish that the measurement items converge into the appropriate number of theoretical factors,
• that each item loads with a high coefficient on only one factor, and
• that this one factor is the same factor for all the measurement items that supposedly relate to the same latent construct [SPSS, 2003].

As a rule of thumb, a measurement item loads highly if its loading coefficient is above .60 and does not load highly if the coefficient is below .40 [Hair et al., 1998]. Technically, an EFA identifies the underlying latent variables, or factors, that explain the pattern of correlations within a set of measurement items. Once this data reduction identifies a small number of factors that explain most of the variance in the measurement items, the loading pattern of these measurement items is determined and revealed in the statistical output. The number of factors selected by default is the number of factors with an eigenvalue exceeding 1.0.
Sometimes more or fewer factors are selected by the researcher based on a scree test or on theory [Hair et al., 1998].

2 In an EFA, the number of factors is not stated in advance by the researcher. The computer program, such as SPSS or SAS, calculates the relationships between all the measurement items, placing those most closely related (highly correlated) into factors, which are then matched to the researcher's theoretically posited constructs. A researcher can also specify a certain number of factors to be extracted within an EFA and rotate the matrix. An EFA involves two statistical stages. In the first stage the factors are extracted. In the optional second stage, the factors are rotated to provide a better picture of the underlying factors of the measurement items. There are several methods of extracting factors. The most common one that we see in IS studies is a Principal Components Analysis (PCA). An EFA enables specifying the expected number of factors, but although this is a move away from being entirely exploratory, it is not a confirmatory analysis in the sense of a CFA, where the pattern by which measurement items load onto certain factors is specified in advance.

These two steps are typically carried out through a Principal Components Analysis, or PCA, which extracts the factors assuming uncorrelated linear combinations of the measurement items. The loading pattern is then rotated to simplify the interpretation of the results. Typically this rotation is a Varimax rotation, which creates orthogonal factors with minimized high loadings of the measurement items on other factors. Another common rotation method is the Direct Oblimin method, which performs a nonorthogonal, or oblique, rotation [SPSS, 2003].
Nonorthogonal rotations can produce a neater pattern of loadings and so make the interpretation of the factors easier, but at the cost of increased multicollinearity from the loss of orthogonality.3

Both EFA and PCA are run via programs like SPSS, which calls this approach "data reduction." In a sense, researchers are attempting to achieve data reduction in that items that do not load properly are dropped, the instrument thereby "purified," and the larger number of measurement items reduced to a smaller number of factors [Churchill, 1979]. With the advent of structural equation modeling (SEM) tools, such as PLS and LISREL, an argument has been made for not purifying measures and for treating an instrument more holistically [MacCallum and Austin, 2000; Straub et al., 2004], but there is no clear resolution about whether measurement error should be modeled and accounted for or simply eliminated.

PLS FACTORIAL VALIDITY
In contrast to EFA, PLS performs a Confirmatory Factor Analysis (CFA). In a CFA, the pattern of loadings of the measurement items on the latent constructs is specified explicitly in the model. Then the fit of this pre-specified model is examined to determine its convergent and discriminant validities. This factorial validity deals with whether the pattern of loadings of the measurement items corresponds to the theoretically anticipated factors.4 The example presented in Section III details how this analysis is performed.

Convergent validity is shown when each of the measurement items loads with a significant t-value on its latent construct. Typically, the p-value of this t-value should be significant at least at the 0.05 alpha protection level.

Discriminant validity is shown when two things happen:
1. The correlation of the latent variable scores with the measurement items needs to show an appropriate pattern of loadings, one in which the measurement items load highly on their theoretically assigned factor and not highly on other factors. Established thresholds do not yet exist for loadings to establish convergent and discriminant validity. In fact, comparing a CFA in PLS with an EFA on the same data and model, Gefen et al. [2000] showed that loadings in PLS could be as high as .50 when the same loadings in an EFA are below the .40 threshold. Nonetheless, in our opinion, all the loadings of the measurement items on their assigned latent variables should be an order of magnitude larger than any other loading. For example, if one of the measurement items loads with a .70 coefficient on its latent construct, then the loadings of all the measurement items on any latent construct but their own should be below .60.

3 Other than the statistical assumption of independence of antecedent variables (this is why they are called "independent variables," in fact), there is no inherent scientific reason to prefer orthogonal rotations to oblique rotations. Oblique rotations are, perhaps, more in keeping with the real world, where constructs frequently overlap both conceptually and statistically.

4 The discussion assumes that the measurement items are reflections, or "reflective," of the construct, which means that all items should correlate highly with each other. We do not deal with the issue of how to validate an instrument when the items (or sub-constructs) are thought to be "formative." For all intents and purposes, formative measures are still an open issue in the metrics literature. Initial guidelines on constructing indexes with formative measurement items are discussed by Diamantopoulos and Winklhofer [2001].

2. Establishing discriminant validity in PLS also requires an appropriate AVE (Average Variance Extracted) analysis. In an AVE analysis, we test whether the square root of every AVE (there is one for each latent construct) is much larger than any correlation among any pair of latent constructs. AVE, which is a test of discriminant validity offered through PLS, is calculated as:

(Σλi²) / ((Σλi²) + Σ(1 − λi²))

where λi is the loading of each measurement item on its corresponding construct. AVEs are generated automatically using the bootstrap technique by the latest version of PLS-Graph (i.e., version 03.00 build 1126 of 2003). AVE measures the variance captured by a latent construct, that is, the explained variance. For each specific construct, it shows the ratio of the sum of its measurement item variance as extracted by the construct relative to the measurement error attributed to its items. As a rule of thumb, the square root of the AVE of each construct should be much larger than the correlation of the specific construct with any of the other constructs in the model [Chin, 1998a] and should be at least .50 [Fornell and Larcker, 1981a].5 Unfortunately, guidelines about how much larger the AVE should be than these correlations are not available. Conceptually, the AVE test is equivalent to saying that the correlation of the construct with its measurement items should be larger than its correlation with the other constructs. This comparison harkens back to the tests of correlations in multi-trait multi-method matrices [Campbell and Fiske, 1959], and indeed the logic is quite similar.
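The AVE formula above can be computed directly from item loadings, together with the Fornell-Larcker rule of thumb; the loadings and the construct correlation below are hypothetical:

```python
import math

def ave(loadings):
    # AVE = sum(l_i^2) / (sum(l_i^2) + sum(1 - l_i^2))
    explained = sum(l * l for l in loadings)
    error = sum(1 - l * l for l in loadings)
    return explained / (explained + error)

peou_loadings = [0.82, 0.79, 0.88, 0.75]   # hypothetical item loadings
ave_peou = ave(peou_loadings)

corr_peou_pu = 0.55   # hypothetical correlation between two latent constructs
print(round(ave_peou, 3))
# Rule of thumb: AVE >= .50 and sqrt(AVE) larger than the construct correlations
print(ave_peou >= 0.50 and math.sqrt(ave_peou) > corr_peou_pu)
```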
III. PRACTICAL EXAMPLE
To show how these principles apply in research practice, we next illustrate the testing of factorial validity via PLS. The data used below are from a study that deals with purchasing tickets online, tested via the Technology Acceptance Model (TAM) [Davis, 1989]. The study, Gefen [2003], is useful in this context since it shows how to apply tests of discriminant validity. A subset of the items was selected for this practical example (Table 1). Basically, as in TAM, the perceived ease of use (PEOU) of an IT, which is the website in this case, affects its perceived usefulness (PU), and both PEOU and PU affect intended use (USE). However, as in other studies, we do not expect PEOU to have a direct effect on USE (Intention to Use) because PEOU is not of intrinsic value to the information technology being used [Gefen and Straub, 2000]. USE is represented as the "Buy Tickets" behavioral intention in the figures that follow. The raw data are shown in Appendix I. The raw data were collected from subjects who answered each item on a 1 to 7 Likert scale ranging from Strongly Disagree through Neutral to Strongly Agree.

5 An alternative and more stringent approach to comparing the AVE with the correlations of the latent constructs is presented by Gefen et al. [2000] and by House et al. [1991], who suggest comparing the AVE, rather than the square root of the AVE, with the correlations. If the AVE is larger than the correlation, then the square root of the AVE will always be larger too. The logic behind Gefen et al.'s [2000] more stringent approach reflects the over-estimation of paths by PLS [Chin et al., 2003].
Measurement Items in the Example Item WordingItem Code is easy to use PEOU1 It is easy to become skillful at using PEOU2 Learning to operate is easyPEOU3 is flexible to interact withPEOU4 improves my performance in flight searching and buyingPU1 enables me to search and buy flights fasterPU2 enhances my effectiveness in flight searching and buyingPU3 makes it easier to search for and purchase flightsPU4 I would use my credit card to purchase from USE1 I would not hesitate to provide information about my habits to Travelocity USE2ASSESSING FACTORIAL VALIDITY IN PLSConvergent ValidityTo assess factorial validity, we first examine the convergent validity of the scales. To do so, wemust first build the PLS-Graph model. The model as run in the example is shown in Figure 1.Figure 1. PLS-Graph ModelNext, we generate the t-values with a bootstrap, as shown in Figure 2.66The two options to generate t-values in PLS are bootstrap and jackknife. In this example, we use bootstrapbecause it also generates the AVEs in the latest version of PLS-Graph.96 Communications of the Association for Information Systems (Volume 16, 2005) 91-109A Practical Guide to Factorial Validity Using PLS-Graph: Tutorial and Annotated Example by D. Gefen andD. StraubFigure 2 Extracting PLS-Graph ModelThe generated t-values are not shown graphically. We need to access the results file to viewthese results. Carrying this process out involves two steps. First, we must change the requestedoutput to *.out. We select the View menu and click on Show *.out, (Figure 3).Figure 3. Selecting the View the Out fileAnd then again in the View menu, select Results, as shown in Figure 4.Communications of the Association for Information Systems (Volume 16, 2005) 91-109 97A Practical Guide to Factorial Validity Using PLS-Graph: Tutorial and Annotated Example by D. Gefen andD. StraubFigure 4. Selecting the ResultsThis selection opens a Notepad file with the results displayed. Figure 5,shows part of what thefile looks like. 
Convergent validity is shown when the t-values of the Outer Model Loadings are above 1.96. The t-values of the loadings are, in essence, equivalent to t-values in least-squares regressions. Each measurement item is explained by the linear regression of its latent construct and its measurement error.

DISCRIMINANT VALIDITY: PROCEDURE 1

As described in Section II, two procedures are used for assessing discriminant validity:

1. Examine item loadings to construct correlations.
2. Examine the ratio of the square root of the AVE of each construct to the correlations of this construct with all the other constructs.

Extracting the necessary data requires a change to the default output file. To make this change, first select the Output option in the Options menu, as shown in Figure 6, and then request the Latent variable scores, as shown in Figure 7.

[Figure 6. Selecting Set the Output Options]

[Figure 7. Setting the Output Options]

The item loadings on the constructs (latent variables) are calculated based on these scores. Once these scores are generated, we can extract the relevant values, as shown in Figure 8. Figure 8 already shows the graphical results of this extraction. The number above each path from item (in boxes) to latent variable (in circles) is the item loading. The number below each path, in brackets, is the item weight. The number below each circle is the construct R², which is calculated and displayed for each variable that is a dependent variable in the model, in this case PU and Buy Tickets.

[Figure 8. Displaying the PLS-Graph Model]

To view the detailed output, we must first revert to the 1st output file by clicking on it in the View menu, as shown in Figure 9.

[Figure 9. Selecting to View the 1st File]

To view the 1st file, select Results in the View menu. The Notepad file will contain several sections, one of which is labeled 'Eta … Latent Variables' (Figure 10). This section appears only because we explicitly requested latent variable scores. The Etas of the first 20 observations, or data points, are shown there; the number of Etas equals the number of data points in the data. PLS-Graph copies the label of each construct as the header of each column in the output.

[Figure 10. Eta … Latent Variables]

To correlate these latent variable scores with the original items, we copy them, after some minor editing, into SPSS together with the original data in the Appendix, as shown in Figure 11. The arrow shows these scores copied from the PLS output file into an SPSS file.

[Figure 11. Analyzing Example Data in SPSS with the Latent Variable Scores from PLS]

With this step completed, bivariate correlations can be run. If the data are deemed to be interval or ratio data with a normal distribution, then Pearson correlations (Figure 12) are acceptable. If the data could violate distributional assumptions or are ordinal, then use the nonparametric Spearman correlations [SPSS, 2003]. These values will be very close to the Pearson correlations and have only one small disadvantage: their power is slightly lower [Siegel and Castellan, 1988] (see Footnote 7).

Footnote 7: An excellent tutorial on this topic is available at /textbook/stbasic.html

[Figure 12. The Correlations as Produced by SPSS]

Next, with some editing in Excel, copy the correlation table produced in SPSS and shown in Figure 12 to produce the correlation table shown in Figure 13. The bold-faced formatting of the numbers was added manually in Figure 13 to emphasize the loading of the measurement items on the constructs to which they are assigned in the CFA.

[Figure 13. Excel Editing of the Correlation Table]

Although the loadings might seem high, it is common to have much higher loadings in PLS than in a PCA. To demonstrate this, the same data are also shown in a PCA, where they show much lower loadings (Figure 14). The high loadings per construct are emphasized in bold font.

Figure 14. PCA with a Varimax Rotation of the Same Data

           Component
            1       2       3
    eou3  .894    .092    .072
    eou2  .784    .178    .115
    eou1  .782    .167    .114
    eou4  .771    .310    .047
    pu2   .097    .856   -.034
    pu1   .159    .810    .164
    pu3   .261    .772    .260
    pu4   .337    .700    .294
    Use1  .030    .186    .883
    Use2  .186    .144    .870

    Extraction Method: Principal Component Analysis.
    Rotation Method: Varimax with Kaiser Normalization. Rotation converged in 5 iterations.

DISCRIMINANT VALIDITY: PROCEDURE 2

The second procedure necessary to show discriminant validity is the AVE analysis.
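The AVE analysis can be scripted directly from the standardized item loadings using the formula from Section II. The sketch below uses hypothetical loadings and correlations for illustration; they are not the estimates from this study.

```python
import math

def ave(loadings):
    """Average Variance Extracted: sum(l^2) / (sum(l^2) + sum(1 - l^2))."""
    explained = sum(l * l for l in loadings)
    error = sum(1.0 - l * l for l in loadings)
    return explained / (explained + error)

# Hypothetical standardized loadings for a four-item construct
peou = [0.89, 0.78, 0.78, 0.77]
root_ave = math.sqrt(ave(peou))

# Discriminant validity check: the square root of the AVE should exceed this
# construct's correlation with every other construct (also hypothetical values)
correlations_with_others = [0.41, 0.33]
print(all(root_ave > r for r in correlations_with_others))
```

Note that when every item loads at 1.0 the error term vanishes and the AVE is exactly 1; the .50 rule of thumb corresponds to the construct explaining at least half of its items' variance.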
The square root of the AVE of each construct needs to be much larger than any correlation between this construct and any other construct, although there are no guidelines about how much larger. The AVEs were already extracted in the bootstrap shown in Figure 5. We take the square root of each of these and compare them with the construct correlations in the 1st file, shown in Figure 15. In the case of these data, all the square roots are much larger than any correlation, which, combined with the correlation of the scores to the items, shows a necessary aspect of the discriminant validity of the latent constructs.

[Figure 15. Correlations in the 1st File Compared with the Square Root of the AVE]

IV. CONCLUSION

In this tutorial supplement to Gefen et al. [2000], we demonstrate the practical side of using PLS-Graph to argue for the factorial validity of constructs. As explained in Straub et al. [2004], factorial validity is a form of construct validity that uses statistical tools that work with factor structures. The purpose of factorial validity is the same as in any examination of the validity of constructs: to show that constructs that are posited to be made up of certain measurement items are, indeed, made up of those items, and not made up of items posited to be part of another construct. In short, these tests show the convergent and discriminant validity of the constructs [Campbell and Fiske, 1959].

IS as a field often selects PLS as a tool of choice, along with LISREL and standard regression. It is important, therefore, that quantitative positivist researchers use these tools properly and to their maximal advantage.
This paper is designed to contribute to this goal.

ADDITIONAL READING

On PLS in general and guidelines: [Barclay et al., 1995; Chin, 1998a; Chin, 1998b; Fornell and Bookstein, 1982; Fornell and Larcker, 1981a; Fornell and Larcker, 1981b; Gefen et al., 2000]

On interaction effects in PLS: [Chin et al., 2003]

Editor's Note: This article was received on June 8, 2005 and was published on July __, 2005.

REFERENCES

Anderson, J. C. and D. W. Gerbing (1988) "Structural Equation Modeling in Practice: A Review and Recommended Two-Step Approach," Psychological Bulletin (103)3, Fall, pp. 411-423.

Bagozzi, R. P. (1977) "Structural Equation Models in Experimental Research," Journal of Marketing Research (14), pp. 209-236.

Barclay, D., R. Thompson, and C. Higgins (1995) "The Partial Least Squares (PLS) Approach to Causal Modeling: Personal Computer Adoption and Use as an Illustration," Technology Studies (2)2, pp. 285-309.

Campbell, D. T. and D. W. Fiske (1959) "Convergent and Discriminant Validation by the Multitrait-Multimethod Matrix," Psychological Bulletin (56)2, March, pp. 81-105.

Chin, W. W. (1998a) "Issues and Opinion on Structural Equation Modeling," MIS Quarterly (22)1, March, pp. vii-xvi.

Chin, W. W. (1998b) "The Partial Least Squares Approach to Structural Equation Modeling," in G. A. Marcoulides (Ed.) Modern Methods for Business Research, Lawrence Erlbaum Associates, Mahwah, New Jersey, pp. 295-336.

Chin, W. W., B. L. Marcolin, and P. R. Newsted (2003) "A Partial Least Squares Latent Variable Modeling Approach for Measuring Interaction Effects: Results from a Monte Carlo Simulation Study and an Electronic-Mail Emotion/Adoption Study," Information Systems Research (14)2, pp. 189-217.

Churchill, G. A., Jr. (1979) "A Paradigm for Developing Better Measures of Marketing Constructs," Journal of Marketing Research (16)1, February, pp. 64-73.

Davis, F. D. (1989) "Perceived Usefulness, Perceived Ease of Use and User Acceptance of Information Technology," MIS Quarterly (13)3, September, pp. 319-340.

Diamantopoulos, A. and H. M. Winklhofer (2001) "Index Construction with Formative Indicators: An Alternative to Scale Development," Journal of Marketing Research (38)2, pp. 269-277.

Fornell, C. and F. L. Bookstein (1982) "Two Structural Equation Models: LISREL and PLS Applied to Consumer Exit-Voice Theory," Journal of Marketing Research (19)4, pp. 440-452.

Fornell, C. and D. Larcker (1981a) "Evaluating Structural Equation Models with Unobservable Variables and Measurement Error," Journal of Marketing Research (18)1, pp. 39-50.

Fornell, C. and D. Larcker (1981b) "Evaluating Structural Equation Models with Unobservable Variables and Measurement Error: Algebra and Statistics," Journal of Marketing Research (18)3, pp. 382-388.

Gefen, D. (2003) "Unidimensional Validity: An Explanation and Example," Communications of the Association for Information Systems (12)2, pp. 23-47.

Gefen, D. and D. Straub (2000) "The Relative Importance of Perceived Ease-of-Use in IS Adoption: A Study of e-Commerce Adoption," Journal of the Association for Information Systems (1)8, pp. 1-30.

Gefen, D., D. Straub, and M. Boudreau (2000) "Structural Equation Modeling Techniques and Regression: Guidelines for Research Practice," Communications of the Association for Information Systems (7)7, August, pp. 1-78.

Gerbing, D. W. and J. C. Anderson (1988) "An Updated Paradigm for Scale Development Incorporating Unidimensionality and Its Assessment," Journal of Marketing Research (25), May, pp. 186-192.

Hair, J. F., Jr., R. E. Anderson, R. L. Tatham, and W. C. Black (1998) Multivariate Data Analysis with Readings, 5th Edition. Englewood Cliffs, NJ: Prentice Hall.

House, R. J., W. D. Spangler, and J. Woycke (1991) "Personality and Charisma in the U.S. Presidency: A Psychological Theory of Leader Effectiveness," Administrative Science Quarterly (36)3, pp. 364-396.

MacCallum, R. C. and J. T. Austin (2000) "Applications of Structural Equation Modeling in Psychological Research," Annual Review of Psychology (51), pp. 201-226.

Siegel, S. and N. J. Castellan (1988) Nonparametric Statistics for the Behavioural Sciences, 2nd edition. New York, NY: McGraw-Hill.

SPSS (2003) SPSS 12 Help Manual. Chicago, Illinois: SPSS.

Straub, D., M.-C. Boudreau, and D. Gefen (2004) "Validation Guidelines for IS Positivist Research," Communications of the Association for Information Systems (14), pp. 380-426.
Econometrics (Key Terms and Definitions)
Econometric model: an equation relating the dependent variable to a set of explanatory variables and unobserved disturbances, in which unknown population parameters determine the ceteris paribus effect of each explanatory variable.
Unlike economic analysis, econometric analysis requires that the functional form relating the variables be specified before the analysis is carried out.
Empirical Analysis: in formal econometric work, a study that uses data to test a theory, estimate a relationship, or evaluate the effectiveness of a policy.
Misspecification analysis: the process of determining the possible biases caused by omitted variables, measurement error, simultaneity, or some other kind of model misspecification.

Linear Probability Model (LPM): a binary response model in which the response probability is linear in its parameters.
Nonnested models: two (or more) models in which neither can be written as a special case of the other by imposing restrictions on the parameters.
Finite Distributed Lag (FDL) Model: a dynamic model in which one or more explanatory variables are allowed to have lagged effects on the dependent variable.
Breusch-Godfrey Test: an asymptotically justified test for AR(p) serial correlation, with the AR(1) version being the most popular; the test allows for lagged dependent variables and other regressors that are not strictly exogenous.
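The Breusch-Godfrey mechanics can be sketched as an auxiliary regression of the OLS residuals on the original regressors plus lagged residuals, with LM statistic (n − p)·R². The data below are simulated purely for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 400
x = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):                 # simulate AR(1) errors with rho = 0.6
    u[t] = 0.6 * u[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + u

X = np.column_stack([np.ones(n), x])
resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]

# Auxiliary regression: e_t on x_t and e_{t-1} (AR(1) case, so p = 1)
Xa = np.column_stack([X[1:], resid[:-1]])
e = resid[1:]
fitted = Xa @ np.linalg.lstsq(Xa, e, rcond=None)[0]
r2 = 1.0 - np.sum((e - fitted) ** 2) / np.sum((e - e.mean()) ** 2)

lm = (n - 1) * r2                     # one observation lost to the lag
p_value = stats.chi2.sf(lm, df=1)     # LM is chi-square with p = 1 df
print(lm, p_value)                    # a small p-value indicates serial correlation
```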
Breusch-Pagan Test (BP Test): a test for heteroskedasticity in which the squared OLS residuals are regressed on the explanatory variables of the model.
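The Breusch-Pagan auxiliary regression is easy to sketch with NumPy alone: regress the squared OLS residuals on the explanatory variables and compare the LM statistic n·R² against a chi-square distribution with k degrees of freedom. The data here are simulated for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=(n, 2))
# Simulated heteroskedastic errors: the variance grows with x[:, 0]
y = 1.0 + 2.0 * x[:, 0] - x[:, 1] + rng.normal(size=n) * np.exp(0.5 * x[:, 0])

X = np.column_stack([np.ones(n), x])          # design matrix with intercept
beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # OLS fit
resid = y - X @ beta

# Auxiliary regression: squared residuals on the explanatory variables
u2 = resid ** 2
gamma, *_ = np.linalg.lstsq(X, u2, rcond=None)
aux_fitted = X @ gamma
r2 = 1.0 - np.sum((u2 - aux_fitted) ** 2) / np.sum((u2 - u2.mean()) ** 2)

lm = n * r2                                   # LM statistic, chi-square with k = 2 df
p_value = stats.chi2.sf(lm, df=2)
print(lm, p_value)                            # a small p-value flags heteroskedasticity
```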
Davidson-MacKinnon test: if one model is correct, then the fitted values obtained from the other, nonnested model should be insignificant in it. It is therefore a test of one model against a nonnested alternative. It is carried out by including the fitted values of the rival model as a regressor and applying a t test to those fitted values.
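This fitted-values t test (the Davidson-MacKinnon approach) can be sketched with simulated data, in which the first model is true and the rival regressor is irrelevant; all values below are illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 400
x1, x2 = rng.normal(size=(2, n))
y = 1.0 + 2.0 * x1 + rng.normal(size=n)   # model 1 is the true model

def fit(X, y):
    """OLS coefficients and fitted values."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, X @ beta

X1 = np.column_stack([np.ones(n), x1])    # model 1 regressors
X2 = np.column_stack([np.ones(n), x2])    # nonnested rival model
_, yhat2 = fit(X2, y)

# Include the rival model's fitted values in model 1 and t-test their coefficient
Xa = np.column_stack([X1, yhat2])
beta, fitted = fit(Xa, y)
resid = y - fitted
sigma2 = resid @ resid / (n - Xa.shape[1])
cov = sigma2 * np.linalg.inv(Xa.T @ Xa)
t_stat = beta[-1] / np.sqrt(cov[-1, -1])
p = 2 * stats.t.sf(abs(t_stat), df=n - Xa.shape[1])
print(t_stat, p)                          # an insignificant t supports model 1 over the rival
```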
Regression Specification Error Test (RESET): a general approach to testing the functional form of a multiple regression model. It is an F test of the joint significance of the squares, cubes, and possibly higher powers of the fitted values from the initial OLS estimation.
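The RESET procedure just described can be sketched as a restricted-versus-unrestricted F test, where the unrestricted model adds the squared and cubed fitted values. The data are simulated (the true model is quadratic, so a linear fit should fail the test):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 300
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + 1.5 * x ** 2 + rng.normal(size=n)  # true relation is quadratic

def ssr(X, y):
    """Sum of squared residuals from an OLS fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ beta) ** 2)

X_r = np.column_stack([np.ones(n), x])                # restricted: linear model
yhat = X_r @ np.linalg.lstsq(X_r, y, rcond=None)[0]
X_u = np.column_stack([X_r, yhat ** 2, yhat ** 3])    # add powers of fitted values

ssr_r, ssr_u = ssr(X_r, y), ssr(X_u, y)
q, df_u = 2, n - X_u.shape[1]                         # 2 added terms
F = ((ssr_r - ssr_u) / q) / (ssr_u / df_u)
p_value = stats.f.sf(F, q, df_u)
print(F, p_value)                                     # small p-value signals functional-form misspecification
```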
White Test: a test for heteroskedasticity that involves regressing the squared OLS residuals on the OLS fitted values and on the squares of the fitted values. In its most general form, the squared OLS residuals are regressed on the explanatory variables, the squares of the explanatory variables, and all nonredundant interaction terms among the explanatory variables.
QUANTITATIVE RESEARCH METHODS
SAMPLE OF PLS PROCEDURES
Prepared by Michael Ling

Reference: Limayem, M., S. G. Hirt, and C. M. K. Cheung (2007) "How Habit Limits the Predictive Power of Intention: The Case of Information Systems Continuance," MIS Quarterly, Vol. 31, No. 4, pp. 705-737.

INTRODUCTION

Past research on continued IS usage was limited to the study of initial IS adoption, conducted under the assumption that usage is primarily driven by intention. The authors recognized that this assumption ignored the effect of frequently performed behaviours on IS continuance.

This paper contributed to IS research by exploring the roles that IS Habit plays in the context of continued IS usage. It proposed that IS Habit has a moderating effect on IS Continuance Intention, to the extent that the effect of intention on IS Continuance Usage diminishes as the usage behaviour becomes more habitual.

Drawing from the habit literature, the IS Habit construct and its four antecedents were developed: frequency of prior behaviour, satisfaction, stable context, and comprehensiveness of usage. PLS was employed as the research method, and three competing models were compared for the effect of IS Habit on IS Continuance Usage. The moderator model was found to possess the best explanatory power.

SUMMARY

Data collection was divided into three rounds over a four-week period to measure university students' usage of the WWW. A total of 553 respondents answered the first questionnaire, and 227 respondents participated in all three rounds. The first round collected data for Perceived Usefulness, Confirmation, Satisfaction, and IS Continuance Intention; the second and third rounds measured IS Continuance Usage. In particular, IS Continuance Usage was measured by two items: frequency of WWW usage (how often?) and intensity of usage (how many hours?).

The authors developed a six-item IS Habit scale.
However, only the best three items, which had a composite reliability of 0.88, were used.

The data were analysed using PLS-Graph, which was selected for the following reasons: (i) the formative nature of some of the measures and the non-normality of the data; (ii) it was better suited to testing moderation effects; (iii) it allowed for small to medium-sized samples.

Regarding convergent validity, all reflective items had significant path loadings at the 0.01 level and acceptable levels of composite reliability (0.773 or above) and average variance extracted (0.630 or above). The two formative items of IS Continuance Usage had weights of 0.67 (t = 7.6) and 0.500 (t = 4.924).

Regarding discriminant validity, each construct shared greater variance with its own block of measures than with other constructs representing a different block. The reflective measures fulfilled the cross-loadings criteria.

A relatively large correlation (r = 0.751) was found between IS Continuance Intention and IS Habit, which suggested that the measurements might have drawn from the same construct. Nevertheless, the authors defended this point on theoretical grounds and by citing similar empirical results from Towler and Shepherd (1991-1992) and Trafimow (2000).

Regarding common method bias, a LISREL analysis was conducted on six indicators (three from each of the IS Continuance Intention and IS Habit measures), two latent variables (IS Habit and IS Continuance Intention), and a method factor. The findings showed that the fit of the model did not improve significantly.

Regarding non-response bias, the demographics of respondents who participated in the first round, but not in the last, were compared to those who participated in all three rounds. No significant differences were found.

Three models were tested to determine which one provided the best explanatory power for IS Continuance Usage.
The three were a baseline model that did not incorporate the IS Habit construct (R² = 0.180), a second model that treated IS Habit as having a direct effect (R² = 0.211), and a third model that treated habit as a moderator (R² = 0.261). All path coefficients were reported significant at the 0.01 level. The hierarchical difference test showed that the interaction effect had an effect size f of 0.063, which, according to the authors, represented a medium effect.

CRITIQUE

SEM was appropriate in this research, as it allowed the relationships among the constructs, and the measures underlying the constructs, to be specified concurrently, so that the measures of the constructs and the hypothesized model could be analysed simultaneously.

The selection of PLS-Graph, a component-based partial least squares methodology, was appropriate compared to covariance-based SEM (such as LISREL) because PLS-Graph is better suited to theory development and predictive applications.

The authors developed the antecedents of IS Habit: satisfaction, frequency of past behaviour, comprehensiveness of usage, and stability of context. However, stability of context was not used, since "data are collected in only one context and we therefore control for its impacts." Nevertheless, the authors characterized stability of context as "the presence of similar situational cues and goals across more or less regularly occurring situations." It is arguable that variations exist in universities, just as in any other social institution; for example, the availability of facilities and examination periods were likely to influence students' usage of the WWW. As the research was conducted over a period of four weeks, the probability that the respondents experienced such unstable events cannot be overlooked.
The inclusion of the stability construct might have increased the explanatory power of the model.

The authors defended the high correlation (r = 0.751) between IS Usage Intention and IS Habit by citing theoretical references and similar high correlations found previously. Nevertheless, the high correlation was a concern. The IS Habit measure was a new scale which, for all intents and purposes, would be different from other habit measures previously used. Thus, it was not convincing to support their correlation results by appeal to previous habit scales. The authors could have run the model unconstrained and then again with the correlation between the two constructs constrained to 1.0; if the two models differed significantly on a chi-square difference test, then the two constructs would be distinct.

Common method variance is a type of spurious internal consistency that occurs when the apparent correlations among indicators are due to a common source. Since the data were based on self-reports, the correlation might be due to the propensity of the subjects to answer similarly to multiple items even when there was no true correlation between the constructs. The LISREL test concluded that common method variance was not an issue.

Convergent validity could be assessed in several ways: (i) the correlations among the items that make up the scale (internal consistency validity); (ii) the correlations of the given scale with measures of the same construct using scales proposed by other researchers and, preferably, already accepted in the field (criterion validity); (iii) the correlations of relationships involving the given scale across samples or across methods. The results of Cronbach's alpha and the average variance extracted (AVE) provided evidence for internal consistency construct validity. The authors demonstrated criterion validity for Perceived Usefulness, Confirmation, Satisfaction, and IS Continuance Intention by referring to scales that had been validated in prior research.
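Cronbach's alpha, cited above as internal-consistency evidence, follows directly from the item and total-score variances. A minimal sketch on made-up Likert responses (not the study's data):

```python
import numpy as np

def cronbach_alpha(items):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    items = np.asarray(items, dtype=float)   # shape (n_respondents, k_items)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_vars.sum() / total_var)

# Made-up Likert responses for a three-item scale (rows = respondents)
scores = [[5, 4, 5], [3, 3, 4], [6, 5, 6], [2, 2, 3], [7, 6, 6], [4, 4, 5]]
print(round(cronbach_alpha(scores), 3))
```

When the items are perfectly correlated, alpha reaches 1.0; values around .70 or higher are the conventional threshold for acceptable internal consistency.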
The authors developed a six-item habit scale and used the best three items in this research, but fell short of providing detail on that decision. It would have been helpful to compare the new habit scale against previously developed habit scales. None of the constructs was tested for convergent validity using cross samples or cross methods.

Discriminant validity refers to testing statistically whether two constructs differ. Evidence for discriminant validity was provided as follows: (i) the item loadings were higher on their corresponding constructs than on others; (ii) the square root of the AVE for a given construct was greater than the correlations between it and all other constructs.

The authors did not provide any reference to content or face validity. It was therefore a concern whether the items measure the full domain implied by their label: the indicators might exhibit construct validity, yet the label attached to the concept might be inappropriate. Surveys, panels of content experts, or focus groups are methods by which content validity might be established.

Internal validity was not adequately addressed by the authors. The number of respondents who participated differed across rounds: 553 in the first round and 227 in all three rounds. It was not clear what sample size was used in the model testing. The authors did not address the issue of mortality bias, which was an obviously important issue here; for example, was there an attrition bias?

Another internal validity issue that was not addressed was compensatory rivalry. As the data collection took three weeks, the students might have developed competitive attitudes that could have biased the results.

The latent constructs associated with reflective measurement items were Confirmation, Habit, IS Continuance Intention, Perceived Usefulness, and Satisfaction.
The loadings of the reflective items were reported as significant.

Overall Assessment

The choice of PLS-Graph was appropriate, and the derivation of a new scale for IS Habit was a significant contribution to IS research. The authors obtained the highest R² in the IS Habit moderated model relative to the baseline and direct-effect models. Although the R² value (0.261) of the moderating model was low, the conclusion that the moderating model had the best explanatory power was correct. The exclusion of the stability-of-context antecedent from the IS Habit construct might have reduced the variance explained by the model, and the high correlation between IS Usage Intention and IS Habit was a potential concern. Convergent validity, discriminant validity, and common method bias were largely in order; content validity and internal validity were not adequately addressed. On balance, there were more strengths than weaknesses in the paper.

CONCLUSION

The key contribution of the paper rested on the scale development of the IS Habit construct and on the finding that IS Habit moderates the relationship between IS Continuance Intention and IS Continuance Usage.

The choice of the component-based PLS tool, PLS-Graph, was appropriate for the analysis. Three competing models were compared, and the moderating model was found to have the highest explanatory power. All loadings and weights of the indicators were acceptable.

The research could have been improved by addressing the concerns raised here: in particular, a further developed measurement scale for IS Habit, the inclusion of stability of context in the model, and consideration of interaction effects among Satisfaction, Comprehensiveness of Usage, and Frequency of Behaviour.