
Control of a solution copolymerization reactor using multi-model predictive control

Leyla Ozkan a, Mayuresh V. Kothare a,∗, Christos Georgakis b

a Department of Chemical Engineering, Chemical Process Modeling and Control Research Center, 111 Research Drive, Lehigh University, Bethlehem, PA 18015, USA
b Department of Chemistry, Chemical Engineering and Material Science, 728 Rogers Hall, Polytechnic University, Six Metrotech Center, Brooklyn, NY 11201, USA

Received 20 December 2001; received in revised form 4 September 2002; accepted 2 October 2002

∗ Corresponding author. Tel.: +610-758-6654; fax: +610-758-5057. E-mail address: mayuresh.kothare@ (M. V. Kothare).

0009-2509/03/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved. PII: S0009-2509(02)00559-6
Keywords: Process control; Polymer; System engineering; Nonlinear dynamics; Model predictive control; Linear matrix inequalities

Driving forces behind the stagnancy of China’s energy-related CO2 emissions from 1996 to 1999 the r


findings indicate that energy efficiency improvements in the industrial sector play the most important role in the evolution of China's energy use; the structural shifts within the manufacturing sub-sectors, or from primary to secondary or tertiary industry, play only a nominal role. However, such tendencies do not necessarily persist in the long run, and they do not by themselves account for the sudden reversal in energy consumption trends (i.e., the decline in consumption) in the late 1990s.
Since fossil fuel combustion is responsible for three-quarters of anthropogenic CO2 emissions in China (Streets et al., 2001), changes in energy consumption and production are expected to directly influence CO2 emissions. As shown in Fig. 2, the decline in CO2 emissions is a direct result of the decline in energy consumption and production. This decline happened despite a persistently high growth rate of the gross domestic product. Energy intensity, defined as total final energy consumption per unit of GDP, has continued to decline during the last two decades. Meanwhile, the income elasticity of energy consumption (defined as the change in total final energy consumption divided by the change in economic growth) remained at
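The two indicators defined above are simple ratios, which can be made concrete with a short sketch. All figures below are illustrative placeholders, not the Chinese statistics discussed in the text:

```python
def energy_intensity(final_energy_pj: float, gdp_billion: float) -> float:
    """Total final energy consumption per unit of GDP."""
    return final_energy_pj / gdp_billion

def income_elasticity(delta_energy_pct: float, delta_gdp_pct: float) -> float:
    """Change in total final energy consumption divided by change in economic growth."""
    return delta_energy_pct / delta_gdp_pct

# Hypothetical two-year comparison (illustrative numbers only):
# energy consumption falls slightly while GDP keeps growing.
e0, e1 = 40_000.0, 39_200.0   # total final energy consumption, PJ
g0, g1 = 900.0, 963.0         # GDP, billion (constant prices)

i0 = energy_intensity(e0, g0)
i1 = energy_intensity(e1, g1)
eta = income_elasticity((e1 - e0) / e0 * 100, (g1 - g0) / g0 * 100)

print(round(i0, 2), round(i1, 2), round(eta, 3))
```

With falling consumption and positive GDP growth, intensity declines and the elasticity turns negative, which is exactly the stagnant-emissions pattern the passage describes.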

The-impact-of-cross-border-mergers-and-acquisitions-on-the-acquirers-R-amp-D-Firm-level-evidence


The impact of cross-border mergers and acquisitions on the acquirers' R&D — firm-level evidence ☆

Joel Stiebale ⁎

University of Nottingham, Nottingham University Business School, United Kingdom; University of Nottingham, Nottingham Centre for Research on Globalisation and Economic Policy (GEP), United Kingdom; RWI, Germany

Article history: Received 6 October 2011; received in revised form 17 April 2013; accepted 23 April 2013; available online 6 May 2013

JEL classification: D21; F23; G34; C31; O31; O33

Keywords: Multinational enterprises; Mergers and acquisitions; Innovation

Abstract. This paper provides empirical evidence on the relationship between cross-border acquisitions and innovation activities of the acquirer. For the empirical analysis, a unique firm-level data set is constructed that combines survey data for German firms with a merger and acquisition database. After a cross-border acquisition, investing firms display a higher rate of domestic expenditures for research and development. Controlling for endogeneity of foreign acquisitions by estimating a two-equation system with limited dependent variables and applying instrumental variable techniques, it is found that part of this correlation stems from a causal effect. The estimated effects are robust towards alternative identification strategies and are higher in industries with high knowledge intensity. The analysis is complemented by an investigation of the effects on tangible investment spending and by a comparison of the effects of cross-border acquisitions to those of greenfield foreign direct investments and domestic acquisitions.

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

Foreign direct investment (FDI) flows have increased all over the world over the past decades to reach a volume of more than US$1.6 trillion in 2011. Much of this increase can be attributed to the rising number of cross-border mergers and acquisitions (M&As).1 From the home countries' perspective, cross-border M&As can on the one hand enable
market access and the transfer of knowledge from abroad, which may strengthen domestic technological capabilities. On the other hand, there might be negative effects if domestic activities are replaced with similar investments abroad. From the host countries' perspective, many policy makers try to prevent foreign takeovers of domestic firms, especially in knowledge intensive industries.2 The global effects of mutual restrictions on cross-border M&As depend on the effects on both the acquirer and the target firm. Thus, it is important to complement existing knowledge on the effects on innovation in target firms with empirical evidence on the investing firms. Cross-border acquisitions constitute the main form of FDI in industries with a high R&D intensity (UNCTAD, 2007). The effects of international M&As on R&D have important policy implications since innovative activity is regarded as a key factor to spur productivity and growth. Existing empirical evidence on the effects of cross-border

International Journal of Industrial Organization 31 (2013) 307–321

☆ I would like to thank two anonymous referees and a co-editor for helpful comments and suggestions. Further, I would like to thank the KfW Bankengruppe for hospitality and access to their survey data, and Frank Reize for sharing his data preparation files and his experience with the data set. Helpful comments by Thomas K. Bauer, Dirk Engel, Ingo Geishecker, Christoph M. Schmidt, and Michaela Trax are gratefully acknowledged. I would also like to thank seminar participants in Düsseldorf, Göttingen, Kiel, Aachen and Duisburg as well as participants of the 37th conference of the EARIE, the annual meeting of the German Economic Association, 2010, and the PhD presentation meeting of the Royal Economic Society 2011 for helpful comments and suggestions.

⁎ University of Nottingham, Nottingham University Business School, Jubilee Campus, South Building, Wollaton Road, Nottingham NG8 1BB, United Kingdom. Tel.: +44 115 951 5093. E-mail address: joel.stiebale@
.1/ReportFolders/reportFolders.aspx?sRF_ActivePath=P,5,27&sRF_Expanded=,P,5,27.

2 One example is the announced acquisition of the Spanish energy company Endesa by the German energy provider E.ON in the year 2006 that was blocked by the Spanish government. Similarly, in 2005, the French government decided to impose restrictions on foreign acquisitions in several strategically important industries with high knowledge intensity like information systems and biotechnology.

0167-7187/$ – see front matter © 2013 Elsevier B.V. All rights reserved. doi: 10.1016/j.ijindorg.2013.04.005

M&As is mostly limited to target firms, while little is known about the effects on the acquiring firms.3 Only recently, cross-border acquisitions as a type of FDI started to receive more attention in the international trade literature. Recent theoretical contributions analyze the role of firm heterogeneity and different motives that determine the choice of foreign market entry modes (Nocke and Yeaple, 2007; Norbäck and Persson, 2007). These models argue that international M&As are mainly driven by the desire to acquire complementary assets and technology, while greenfield investments (new firms or production units founded by foreign investors) do not provide direct access to foreign knowledge and are rather undertaken to exploit existing firm-specific assets of the acquiring firm or factor price differences across countries. If complementarities between acquiring and target firm play a role for cross-border acquisitions, and these involve innovative activities, it is likely that the effects on domestic R&D are quite different from those of greenfield investments. Hence, it is not possible to derive conclusions about the effects of cross-border M&As from existing studies on greenfield investments or aggregate FDI. It is also likely that the effects of
international acquisitions are different from those of domestic transactions since previous research argues that the motives and characteristics of cross-border M&As are different(see Shimizu et al.,2004,for instance).Theory suggests that the characteristics offirms that self-select into international acquisitions are quite different from those that engage in domestic acquisitions(see e.g.Nocke and Yeaple,2008).Market access–for instance via access to existing networks or market specific knowledge like marketing capabil-ities–might be a more important motive for international than for do-mestic M&As(see e.g.Nocke and Yeaple,2008;Guadalupe et al.,2012; Blonigen et al.,2012).Improved market access from the perspective of the acquiringfirm may increase the incentives to invest in cost reducing or quality enhancing innovations as these can be applied to a larger production output.Further,as efficiency differences within an industry are likely to be more pronounced across than within countries(Neary, 2007)it is likely that foreign and domestic acquisition targets have dif-ferent characteristics.This may result in different feedback effects on the investingfirm as well.The purpose of this paper is to investigate the impact of cross-border acquisitions on R&D activities of the investingfirm.This paper contributes to the existing literature in several aspects.First, empirical evidence on the effects of international acquisitions on innovation activities of the acquirer is sparse.4Further,I contribute to the industrial organization and the international economics litera-ture by comparing the effects of cross-border acquisitions to those of domestic acquisitions and greenfield foreign direct investments. 
Heterogeneous effects according to industries and target countries with different characteristics are provided.For this purpose a unique firm-level data set is constructed that combines survey data for Germanfirms with balance sheet data and an M&A database.The case of Germany is in particular interesting as it is one of the most technologically advanced countries in the world and is considerably engaged in FDI and global M&As.The empirical framework accounts for unobservedfirm heteroge-neity and the possible endogeneity of cross-border acquisitions.The main results are based on a non-linear two-equation model in which the decision to engage in an international acquisition as well as the decision of how much to spend on R&D is explained simulta-neously.Identification is achieved by exploiting unexpected shocks to foreign market growth rates and variation in the distance to foreign markets acrossfirms.The robustness of the results towards alternative empirical models and identifying assumptions is checked.This paper is organized as follows.In Section2,I summarize the related literature.Section3describes the empirical model and Section4provides a description of the data.Results of the empirical analysis are presented in Section5.Section6concludes.2.Cross-border acquisitions and R&DThis paper is related to several strands of theoretical and empirical literature that look at M&As from the perspective of industrial organi-zation(IO)economics,strategic management,or corporatefinance.5 As the M&A literature often does not distinguish explicitly between cross-border and domestic acquisitions or between effects on acquir-ingfirms and acquisition targets it is worth taking a look at the litera-ture on international trade and FDI as well.Cross-border acquisition can affect the investingfirm's innovation activities through a variety of channels.First,there might be direct effects via relocation of R&D activities.Second,acquisitions may have an impact on other determi-nants of 
R&D that have been identified in the theoretical and empirical innovation literature such as afirm's size,market share,competition, technological opportunities,external knowledge sources,market demand,andfinancial factors(see,for instance,Cohen and Levine, 1989or Hall and Mairesse,2006for an overview on the determinants of R&D).The main motives for M&As within the IO literature are the strengthening of market power(Kamien and Zang,1990)and the re-alization of efficiency gains(Röller et al.,2001).The effects on market power and efficiency also belong to the main channels through which M&As can affect R&D.M&As might be undertaken to gain access to targetfirms'assets such as production capabilities or intangible assets (e.g.Jovanovic and Rousseau,2008).Efficiency gains after an acquisi-tion may,for instance,stem from the diffusion of know-how within the merged entity(Röller et al.,2001)or the reallocation of technol-ogy to more efficient uses(Jovanovic and Rousseau,2008).Synergies resulting from M&As might entail an increase in the efficiency of R&D which might increase the incentives to innovate.Regarding the strategic aspect,a reduction in competition has a theoretically ambiguous effect on innovation incentives.This effect depends on market characteristics,the type of innovation,and the de-gree of R&D spillovers(see,for instance,Gilbert,2006;Vives,2008; Schmutzler,2010for a recent discussion).Reduced competition will increase afirm's residual demand–and thus the output to which cost reductions or quality improvements can be applied–but at the same time it tends to decrease the elasticity of demand and thus the impact of price reductions.However,if a merger solely reduces the number offirms in a market,it is likely that this induces a positive ef-fect on innovation incentives(Vives,2008).Further,the internaliza-tion of technology spillovers that have previously been captured by competitors can also increase the incentives for R&D(Kamien et al., 1992).Gilbert and 
Newbery(1982)argue thatfirms with monopoly power have additional incentives to engage in R&D due to the possibil-ity of preemptive patenting.Acquisitions that are motivated by strategic reasons also play a role in the international economics literature(e.g.Horn and Persson,2001; Neary,2007).Cost differences betweenfirms might be more pro-nounced across than within countries and this may increase the incen-tives for cross-border M&As(Bertrand and Zitouna,2006;Bjorvatn, 2004;Neary,2007).In Neary(2007),for instance,cross-border acqui-sitions are accompanied by a reallocation of production from less3The effects of cross-border M&As on targetfirms have received considerable atten-tion with respect to productivity(Arnold and Javorcik,2009;Benfratello and Sembenelli,2006)and employment(Almeida,2007).Recently,particular attention has been paid to the effects of foreign acquisitions on innovation activity(Bertrand, 2009;Bertrand et al.,2012;Guadalupe et al.,2012;Stiebale and Reize,2011).4Bertrand and Zuninga(2006)analyze effects of domestic and international M&As on R&D at the industry level.Firm-level studies that analyze differences between ef-fects of domestic and international acquisitions on the acquirers'innovation includeDesyllas and Hughes(2010),Cloodt et al.(2006)and Ahuja and Katila(2001),al-though analyzing effects of cross-border M&As is not at the core of their analysis.5The literature on cross-border M&As from the perspective of the management lit-erature is surveyed in Shimizu et al.(2004).308J.Stiebale/International Journal of Industrial Organization31(2013)307–321efficient acquisition targets to more efficient foreign investors.If M&As are primarily motivated by efficiency differences between firms across countries we would expect an increase in economic activ-ity in acquiringfirms at the expense of targetfirms.6The impact of cross-border acquisitions on R&D in acquiringfirms can be different from the effects on efficiency and the scale of 
produc-tion.Acquirers might relocate R&D facilities from targetfirms to the cor-porate headquarters,but keep production sites running(or vice versa). Manyfirms tend to cluster their R&D activities close to their headquar-ters or their main corporate production unit due to the aim of managers to keep track of these activities(Howell,1984).Sanna-Randaccio and Veugelers(2007)show in a theoretical model that centralizing R&D in the home country increases the appropriability of the results of R&D efforts as it prevents knowledge spillovers to foreign competitors in the host country.Centralizing R&D may also avoid costs of coordination and may allow a multinational enterprise to exploit economies of scale in R&D(Kumar,2001).Hence,it is well possible that relocation effects for R&D are more pronounced than for production activities.Cross-border acquisitions are a mode of FDI and thus might in addition be motivated by differences in production costs across countries,the desire to enter foreign markets,or the access to country specific assets.7In most theoretical trade models incorporatingfirm heterogeneity,market access is the most important motive for FDI (for instance,Helpman et al.,2004).This type of market-seeking FDI is usually referred to as horizontal investment.Horizontal FDI might reduce domestic production if it comes along with a substitution of exports.Contrarily,FDI might spur headquarter activities such as marketing activities and R&D as these investments can be applied to a larger production output after a foreign investment(Fors and Svensson,2002).This might in turn increase growth in the acquirers' home country.Vertical FDI in analogy to Head and Ries(2003)is motivated by differences in factor costs across countries.However,the motives for cross-border M&As might be quite different from greenfield investments(even in a monopolistic com-petition framework where they are not driven by strategic aspects). 
Theoretical trade models with heterogeneousfirms that differentiate between the modes of foreign market entry usually argue that green-field investments are chosen for FDI motivated by production cost differences(Nocke and Yeaple,2007,2008).In contrast,these models argue that cross-border M&As are aimed to achieve access to comple-mentaryfirm-specific assets of acquisition targets(Nocke and Yeaple, 2008),country-specific assets(Norbäck and Persson,2007),export networks(Blonigen et al.,2012),or capabilities that are non-mobile across countries(Nocke and Yeaple,2007).8If the exploitation of complementary assets entails innovation activities this might in-crease the returns to these activities and thus spur R&D expenditures.There are,however,also counterarguments regarding the effects of international M&As on acquiringfirms'R&D.Cross-border acquisi-tions might come along with a substitution of domestic by foreign activities.There might also be a reduction of duplicate R&D activities after a merger if the overlap between the research projects of acquirer and targetfirm is large(Veugelers,2006).Further,M&As may lead to a reduction in the competition in technology markets which may reduce the incentives of mergingfirms to engage in R&D activities further(Arrow,1962).There are also some counterarguments which can be derived from thefinancial economics and the manage-ment literature.M&As are oftenfinanced with a high amount of debt which might raise the costs for raising external funds for R&D and there is empirical evidence that especially after a leveraged buyout targets display declining expenditures for capital(Kaplan,1989) and R&D(Long and Ravenscraft,1993).Further,M&As might also arise out of a manager's utility maximization(Shleifer and Vishny, 1988)who wants a large empire under control and conducts M&As at the expense of other investment projects including R&D activities. 
Finally,M&As might reduce R&D due to increased organizational complexity and tighterfinancial controls(Hitt and Hoskisson,1990; Hitt et al.,1991)or due to a disruption of established routines (Ahuja and Katila,2001).Hence,from a theoretical point of view the relationship between foreign acquisitions and acquirers'R&D is unclear and thus boils down to an empirical matter.Empirical studies that deal with the effects of domestic M&As(or do not explicitly differentiate between domestic and international M&As)find in the majority negative effects(Cassiman et al.,2005).But the results seem to depend on product and technology market characteristics.Cassiman et al. (2005)argue that the impact of M&As on R&D in the merged entity depends on technological and market relatedness between acquirer and target.They suggest that M&As between rivalfirms lead to an overall reduction of R&D efforts,while they predict the opposite when the merged entities are technologically complementary. Studies that deal with the effects on innovation activities in foreign acquisition targets have so far yielded mixed results.For instance, Guadalupe et al.(2012)and Bertrand(2009)find positive effects of foreign acquisitions on innovation,while Stiebale and Reize(2011)find large negative effects once endogeneity and selection bias are taken into account,and Bertrand and Zuninga(2006)find no signifi-cant effect on average but some positive effects in industries with a medium technological intensity.Existing empirical studies that ana-lyze the impact of cross-border acquisitions on innovation activities at thefirm level are mostly limited to the evidence on the impact on targetfirms.9Marin and Alvarez(2009)find that acquisitions undertaken by foreign ownedfirms in Spain have a negative impact on the acquirers'innovation activities,in contrast to acquisitions by domestically ownedfirms,but they do not analyze the impact of cross-border acquisitions.Ahuja and Katila(2001)as well as Cloodt et 
al. (2006) analyze differences in a sample of merging firms according to cultural distance between acquirer and target firm. Desyllas and Hughes (2010) find that cross-border M&As have a more pronounced negative effect on the acquirer's R&D intensity than domestic M&As.

3. Empirical strategy

Two main problems have to be addressed in the empirical analysis. First, structural zeros arise because a lot of firms report zero R&D expenditures. Second, endogeneity might arise from the fact that unobserved factors influencing R&D might also be correlated with a foreign acquisition. Thus, a model that accounts for both structural zeros and endogeneity is specified to evaluate the impact of international acquisitions on the acquirer's innovation. To evaluate the effect of outward cross-border acquisitions on domestic R&D expenditures, a two-equation model is specified:

$$RD^{*}_{it} = x'_{it}\beta_1 + \delta\, IMA_{it} + \varepsilon_{it} \qquad (1)$$

$$IMA^{*}_{it} = x'_{it}\beta_2 + z_{it}\gamma + u_{it}, \qquad IMA_{it} = \begin{cases} 1, & IMA^{*}_{it} > 0 \\ 0, & \text{else} \end{cases}, \qquad RD_{it} = \max\left(RD^{*}_{it},\, 0\right). \qquad (2)$$

The error terms of the two equations are assumed to be jointly normally distributed:

$$\begin{pmatrix} \varepsilon_{it} \\ u_{it} \end{pmatrix} \sim N_2\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_\varepsilon^2 & \rho\sigma_\varepsilon \\ \rho\sigma_\varepsilon & 1 \end{pmatrix} \right),$$

where the variance of u_it is normalized to one for identification.

6 Stiebale and Trax (2011) provide evidence that acquirers' domestic sales and employment tend to increase after international M&As.
7 See Helpman (2006) for an overview on the theoretical literature on firms and FDI choices.
8 There are several further possible motives for cross-border acquisitions. In a model of Head and Ries (2008), cross-border acquisitions arise due to the possibility to shift ownership to a more efficient usage. Cross-border acquisitions (and FDI in general) may also be motivated by building an export platform in a tariff-free block such as the European Union (Neary, 2002). Cross-border and domestic acquisitions may also involve vertical integration. However, while cross-border M&As often take place across industries, they are rarely associated with input–output linkages (e.g., Hijzen et al., 2008).
9 A detailed discussion about studies that analyze the relationship between foreign ownership and innovation can be found in Stiebale and Reize (2011).

RD_it denotes the domestic R&D to sales ratio, multiplied by 100, of firm i in period t. IMA_it is a dummy variable that takes the value of one if a firm acquired a target in an international M&A between t−2 and t. An acquisition is defined as an increase in the ownership share from below to above 50% of equity, either directly or indirectly through a parent or a holding company.

x_it is a vector of firm- and industry-level variables that enters both equations. It contains variables that are usually used in innovation studies and which are likely to affect both R&D expenditures and international acquisitions.10 A firm's age is measured in years and serves as a proxy for experience and the stage of the product life cycle. Firm size enters the equations as the logarithm of the number of employees. Human capital intensity is approximated by the share of employees with a university degree. Capital intensity controls for past accumulation of tangible assets. The ability to raise equity for financing investment is captured by a dummy variable that takes the value of one if the firm has financed part of its tangible investment by equity. Further, a dummy variable for incorporated enterprises is added to the model that captures differences in corporate governance and the ability to raise external finance. A dummy variable for Eastern Germany accounts for the transition process and regional differences. The model also includes a control variable for foreign ownership. Two dummy variables take the value of one if a firm cooperates with other firms or public scientific institutions, respectively.11

Further, x_it contains several variables that account for the competitive environment and market conditions. The firm's lagged domestic market share captures the potential to spread the gain from new or improved
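To make the structure of Eqs. (1)–(2) concrete, the following sketch simulates data from such a model and shows the two problems the author names: structural zeros from censoring, and endogeneity from correlated errors. All coefficient values and the instrument z are hypothetical stand-ins, and this is not the paper's estimation code (the paper uses full maximum likelihood in Stata):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Exogenous control x and excluded instrument z (hypothetical stand-ins)
x = rng.normal(size=n)
z = rng.normal(size=n)

# Jointly normal errors: corr(eps, u) = rho, Var(u) normalized to 1
rho, sigma_eps = 0.5, 2.0
u = rng.normal(size=n)
eps = sigma_eps * (rho * u + np.sqrt(1 - rho**2) * rng.normal(size=n))

# Selection equation (2): latent propensity and acquisition dummy IMA
ima_star = 0.3 * x + 0.8 * z + u
ima = (ima_star > 0).astype(float)

# Outcome equation (1): latent R&D intensity, censored at zero
delta, beta1 = 1.5, 1.0
rd_star = beta1 * x + delta * ima + eps
rd = np.maximum(rd_star, 0.0)   # structural zeros: many firms report RD = 0

# A naive OLS regression of rd on (1, x, ima) is inconsistent for the true
# delta = 1.5: IMA is correlated with eps (rho != 0) and RD is censored.
X = np.column_stack([np.ones(n), x, ima])
delta_ols = np.linalg.lstsq(X, rd, rcond=None)[0][2]
print(delta_ols)
```

The simulation illustrates why the paper specifies a joint model rather than OLS: the censoring and the non-zero error correlation each distort the single-equation estimate.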
products and processes over production output.This variable also accounts for the selection of larger and more productive firms into foreign markets.The domestic market growth rate -measured at the two-digit level -controls for time-varying changes in market size at the industry level.To account for changes in competition,a further variable measures the net entry rate on the domestic market (see Aghion et al.,2009for an analysis on the effect of entry on innovation).It is also controlled for a firm's main regional market by a set of dummy variables that take the value of one if a firm's main market is international,national,or regional,respectively (for instance,Aw et al.,2007,2008analyze the role of exporting for R&D).Industry dummies at the two-digit level control for time invariant product and market characteristics and time dummies capture macroeconomic shocks.z it includes variables that are assumed to affect the propensity to engage in a cross-border acquisition but not domestic R&D expendi-tures.These variables are discussed in detail below.Endogeneity of IMA it ,in the two equation model,stems from a non-zero correlation between the two error terms (ρ≠0).A prerequi-site for logical consistency is that a recursive structure is imposed,i.e.RD it does not appear in Eq.(2)(see e.g.Maddala,1983).This prerequisite ismet in the chosen speci fication and seems reasonable,as an acquisition in the past on current R&D expenditures is evaluated.The model does not contain firm-fixed effects.The reason is that introducing fixed effects in non-linear models leads to inconsistent estimates of all parameters.12Estimation is carried out by full maximum likelihood.Full maximum likelihood is more demanding than a two-step control function approach as it requires specifying a joint distribution of the equation system,but it assures most ef ficient estimation if the model is correctly speci fied.13The robustness of the results towards the distributional as-sumptions is checked 
by using a linear instrumental variable estimator. Standard errors are clustered as some firms appear more than once in the sample. Irrespective of the estimation procedure, it is necessary for identification that there is at least one valid exclusion restriction, i.e. a variable that affects the probability to engage in a cross-border acquisition but not domestic R&D expenditures. In the context of the two-equation model, this is a variable that enters z_it but not x_it.14 Two exclusion restrictions are used in the empirical analysis. Score tests are computed to test the joint and individual validity of the two exclusion restrictions, and the results of these tests support the model's identifying assumptions. The first instrumental variable is the distance to foreign markets, which is measured as the minimum distance to Western European countries. This variable captures the well-known proximity–concentration tradeoff (see e.g. Brainard, 1997): in models of horizontal FDI, firms face a trade-off between exporting on the one hand and producing locally via FDI on the other hand. The former requires them to pay higher transport costs for the goods shipped to the foreign market, but exporters can benefit from concentrating production and thereby achieving scale economies. FDI, in contrast, involves paying higher sunk and fixed costs for the affiliate abroad but lower transport costs due to the proximity to consumers.15

For this instrument to be valid, it is crucial that omitted regional factors that are correlated with distance to foreign markets do not affect R&D expenditures. I argue that most of the systematic differences in innovativeness across regions are captured by the control variables, i.e. variables in x_it, like industry dummies, external knowledge sources, and other firm- and industry-level variables. One might be concerned that firms choose a certain location because they plan to engage in cross-border acquisitions. However, only a few firms change their location after foundation, and the average firm age at the time of acquisition is more than 35 years in our sample. Hence, it seems unlikely that M&As affect the location choice of firms.

10 See e.g. Cohen and Levine (1989) and Hall and Mairesse (2006) for an overview of empirical innovation studies.
11 The survey questions underlying these variables refer to cooperation with firms and institutions in general and not to cooperation on R&D as in CIS innovation surveys. Hence they do not imply but might affect R&D activities.
12 A further problem is that many firms in the data set appear at most twice in the sample. However, some regressions in first differences and with controls for lagged values of the dependent variable on a reduced sample are presented to convey an impression of the importance of time-invariant unobserved firm heterogeneity.
13 Estimation was carried out in Stata®, version 10.1. The likelihood function of this model can be found in Appendix B, available on the web, and the program code for estimation is available from the author upon request. Alternative models such as the instrumental variable Tobit model developed by Smith and Blundell (1986) are not applicable, as they do not allow for discrete endogenous regressors. Similarly, the fractional response estimators suggested by Papke and Wooldridge (2008) cannot deal with binary endogenous regressors either. Abadie (2003) proposes a semi-parametric estimator, but this estimator requires that a binary instrumental variable be available, which is not the case in this application. Angrist (2001) proposes to use two-stage least squares, but this method is only consistent for censored outcome variables in special cases. Nonetheless, the robustness of the main results to using two-stage least squares is checked in Section 5.3.
14 Due to nonlinearity, the model is identified even without exclusion restrictions, but the results are not very reliable in this case, as they critically hinge on distributional and functional form assumptions.
15 Nonetheless, the relationship between cross-border acquisitions and geographic distance is not unambiguous, as this variable might capture other influences like cultural distance or vertical relations. Hijzen et al. (2008) find a negative relation between cross-border M&As and distance, measured at the industry–country level, which is more pronounced for non-horizontal M&As. However, a positive correlation between a firm's distance to the border and foreign acquisitions does not rule out a negative correlation between M&As and distance on a macroeconomic level. Firms may be induced by distance to engage in cross-border acquisitions as opposed to serving a foreign market via exports, but they may (conditional on this choice) choose a close-by target firm to minimize trade and transaction costs.

J. Stiebale / International Journal of Industrial Organization 31 (2013) 307–321
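Footnote 13 notes that robustness to two-stage least squares is checked in Section 5.3. The mechanics of 2SLS with a binary endogenous regressor can be sketched with numpy (a purely illustrative example: the synthetic data, the distance instrument, and all coefficients below are hypothetical assumptions, not the paper's estimator or data):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Synthetic data: distance instrument, binary endogenous acquisition
# dummy, and an R&D outcome contaminated by unobserved heterogeneity u.
dist = rng.exponential(1.0, n)                  # instrument: distance to foreign markets
u = rng.normal(0.0, 1.0, n)                     # unobserved firm heterogeneity
acq = (0.8 - 0.5 * dist + u + rng.normal(0.0, 1.0, n) > 0).astype(float)
rd = 1.0 + 2.0 * acq + u + rng.normal(0.0, 1.0, n)   # true causal effect of acq is 2.0

X = np.column_stack([np.ones(n), acq])          # second-stage regressors
Z = np.column_stack([np.ones(n), dist])         # instruments (incl. constant)

# Just-identified 2SLS: beta = (Z'X)^{-1} Z'y
beta_iv = np.linalg.solve(Z.T @ X, Z.T @ rd)
beta_ols = np.linalg.solve(X.T @ X, X.T @ rd)
print(beta_iv[1], beta_ols[1])  # IV close to 2.0; OLS biased upward by u
```

Because u enters both the acquisition decision and the R&D outcome, OLS overstates the effect, while the instrument (which shifts the acquisition decision but not R&D directly) recovers it.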

APPLICATION OF SYMMETRY ANALYSIS TO A PDE ARISING IN THE CAR WINDSHIELD DESIGN


APPLICATION OF SYMMETRY ANALYSIS TO A PDE ARISING IN THE CAR WINDSHIELD DESIGN*

NICOLETA BÎLĂ†

SIAM J. APPL. MATH. © 2004 Society for Industrial and Applied Mathematics, Vol. 65, No. 1, pp. 113–130

Abstract. A new approach to parameter identification problems from the point of view of symmetry analysis theory is given. A mathematical model that arises in the design of car windshields, represented by a linear second order mixed type PDE, is considered. Following a particular case of the direct method (due to Clarkson and Kruskal), we introduce a method to study the group invariance between the parameter and the data. The equivalence transformations associated with this inverse problem are also found. As a consequence, the symmetry reductions relate the inverse and the direct problem and lead us to a reduced order model.

Key words. symmetry reductions, parameter identification problems

AMS subject classifications. 58J70, 70G65, 35R30, 35R35

DOI. 10.1137/S0036139903434031

1. Introduction. Symmetry analysis theory links differential geometry to PDE theory [18], symbolic computation [9], and, more recently, numerical analysis theory [3], [6]. The notion of continuous transformation groups was introduced by Sophus Lie [14], who also applied them to differential equations. Over the years, Lie's method has proven to be a powerful tool for studying a remarkable number of PDEs arising in mathematical physics (more details can be found, for example, in [2], [10], and [21]). In the last several years a variety of methods have been developed in order to find special classes of solutions of PDEs which cannot be determined by applying the classical Lie method. Olver and Rosenau [20] showed that the common theme of all these methods has been the appearance of some form of group invariance. On the other hand, parameter identification problems arising in inverse problems theory are concerned with the identification of physical parameters from observations of the evolution of a system. In general, these are ill-posed
problems, in the sense that they do not fulfill Hadamard's postulates for all admissible data: a solution exists, the solution is unique, and the solution depends continuously on the given data. Arbitrarily small changes in data may lead to arbitrarily large changes in the solution. The iterative approach to studying parameter identification problems is a functional-analytic setup with a special emphasis on iterative regularization methods [8]. The aim of this paper is to show how parameter identification problems can be analyzed with the tools of group analysis theory. This is a new direction of research in the theory of inverse problems, although symmetry analysis theory is a common approach for studying PDEs. We restrict ourselves to the case of a parameter identification problem modeled by a PDE of the form

F(x, w^(m), E^(n)) = 0,   (1.1)

where the unknown function E = E(x) is called the parameter and, respectively, the arbitrary function w = w(x) is called the data, with x = (x_1, ..., x_p) ∈ Ω ⊂ R^p a given domain (here w^(m) denotes the function w together with its partial derivatives up to order m). Assume that the parameters and the data are analytical functions. The PDE (1.1), sometimes augmented with certain boundary conditions, is called the inverse problem associated with a direct problem. The direct problem is the same equation, but the unknown function is the data, for which certain boundary conditions are required. The classical Lie method allows us to find the symmetry group related to a PDE.

*Received by the editors September 4, 2003; accepted for publication (in revised form) May 4, 2004; published electronically September 24, 2004. This work was supported by the Austrian Science Foundation FWF, Project SFB 1308 "Large Scale Inverse Problems." /journals/siap/65-1/43403.html
†Institute for Industrial Mathematics, Johannes Kepler University, 69 Altenbergerstrasse, Linz, A-4040, Austria (bila@indmath.uni-linz.ac.at).
This is a (local) Lie group of transformations acting on the space of the independent variables and the space of the dependent variables of the equation, with the property that it leaves the set of all analytical solutions invariant. Knowledge of these classical symmetries allows us to reduce the order of the studied PDE and to determine group-invariant solutions (or similarity solutions) which are invariant under certain subgroups of the full symmetry group (for more details see [18]). Bluman and Cole [1] introduced the nonclassical method that allows one to find the conditional symmetries (also called nonclassical symmetries) associated with a PDE. These are transformations that leave only a subset of the set of all analytical solutions invariant. Note that any classical symmetry is a nonclassical symmetry but not conversely. Another procedure for finding symmetry reductions is the direct method (due to Clarkson and Kruskal [5]). The relation between these last two methods has been studied by Olver [19]. Moreover, for a PDE with coefficients depending on an arbitrary function, Ovsiannikov [21] introduced the notion of equivalence transformations, which are (local) Lie groups of transformations acting on the space of the independent variables, the space of the dependent variables, and the space of the arbitrary functions that leave the equation unchanged. Notice that these techniques based on group theory do not take into account the boundary conditions attached to a PDE.

To find symmetry reductions associated with the parameter identification problem (1.1), one can seek classical and nonclassical symmetries related to this equation. Two cases can occur when applying the classical Lie method or the nonclassical method, depending on whether the data w is known or not. From the symbolic computation point of view, the task of finding symmetry reductions for a PDE depending on an arbitrary function might be a difficult one, due to the lack of symbolic manipulation programs that can handle these kinds of
equations. Another method to determine symmetry reductions for (1.1) might be a particular case of the direct method, which has been applied by Zhdanov [24] to certain multidimensional PDEs arising in mathematical physics. Based on this method, and taking into account that (1.1) depends on an arbitrary function, we introduce a procedure to find the relation between the data and the parameter in terms of a similarity variable (see section 2). As a consequence, the equivalence transformations related to (1.1) must be considered as well. These final symmetry reductions are found by using any symbolic manipulation program designed to determine classical symmetries for a PDE system (now both the data and the parameter are unknown functions in (1.1)). The equivalence transformations relate the direct problem and the inverse problem. Moreover, one can find special classes of data and parameters, respectively, written in terms of the invariants of the group action, the order of the studied PDE can be reduced at least by one, and analytical solutions of (1.1) can be found.

At the first step, the group approach to the free boundary problem related to (1.1) can be considered and, afterwards, the invariance of the boundary conditions under particular group actions has to be analyzed (see [2]). In the case of parameter identification problems we sometimes have to deal with two pairs of boundary conditions, for the data and the parameter as well; otherwise we might only know the boundary conditions for the data. Thus, the problem of finding symmetry reductions for a given data can be more complicated. At least by finding the equivalence transformations related to the problem, the invariants of the group actions can be used to establish suitable domains Ω on which the order of the model can be reduced.

In this paper we consider a mathematical model arising in car windshield design. Let us briefly explain the gravity sag bending process, one of the main industrial
processes used in the manufacture of car windshields. A piece of glass is placed over a rigid frame with the desired edge curvature and heated from below. The glass becomes viscous due to the temperature rise and sags under its own weight. The final shape depends on the viscosity distribution of the glass obtained from varying the temperature. It has been shown that the sag bending process can also be controlled (in a first approximation) in terms of Young's modulus E, a spatially varying glass material parameter, and the displacement of the glass w can be described by thin linear elastic plate theory (see [11], [16], and [17] and the references therein). The model is based on the linear plate equation

(E(w_xx + ν w_yy))_xx + 2(1−ν)(E w_xy)_xy + (E(w_yy + ν w_xx))_yy = 12(1−ν²) f / h³  on Ω,   (1.2)

where w = w(x,y) represents the displacement of the glass sheet (the target shape) occupying a domain Ω ⊂ R², E = E(x,y) is Young's modulus, a positive function that can be influenced by adjusting the temperature in the process of heating the glass, f is the gravitational force, ν ∈ (0, 1/2) is the glass Poisson ratio, and h is the thickness of the plate. The direct problem (or the forward problem) is the following: for a given Young modulus E, find the displacement w of a glass sheet occupying a domain Ω before the heating process. Note that the PDE (1.2) is an elliptic fourth order linear PDE for the function w. Until now, two problems related to (1.2) have been studied: the clamped plate case and the simply supported plate case (more details can be found, for example, in [15]). In this paper we consider the clamped case, in which the following boundary conditions are required: the plate is placed over a rigid frame, i.e.,

w(x,y)|_∂Ω = 0,   (1.3)

and, respectively,

∂w/∂n |_∂Ω = 0,   (1.4)

which means the (outward) normal derivative of w must be zero, i.e., the sheet of glass is not allowed to rotate freely around the tangent to ∂Ω. The associated inverse problem consists of finding Young's modulus E for a given data w in (1.2). This is
a linear second order PDE for Young's modulus, which can be written as

(w_xx + ν w_yy) E_xx + 2(1−ν) w_xy E_xy + (w_yy + ν w_xx) E_yy + 2(Δw)_x E_x + 2(Δw)_y E_y + (Δ²w) E = 1   (1.5)

after the scaling transformation w → (1/k) w or E → (1/k) E, with k = 12(1−ν²) f / h³. In (1.5), Δ denotes the Laplace operator. The main problem in car windshield design is that the prescribed target shape w is frequently such that the discriminant

D = (1−ν)² w_xy² − (w_xx + ν w_yy)(w_yy + ν w_xx)

of (1.5) changes sign in the domain Ω, so that we get a mixed type PDE. This is one of the reasons for which optical defects might occur during the process. Note that (1.5) would naturally call for boundary conditions for E on ∂Ω in the purely elliptic case (when D < 0), and Cauchy data on a suitable (noncharacteristic) part Γ ⊂ ∂Ω in the purely hyperbolic part (for D > 0). There is recent interest in studying this inverse problem (see, e.g., [13]). It is known [15] that a constant Young's modulus corresponds to a data which satisfies the nonhomogeneous biharmonic equation (2.29). A survey on this subject can be found in [23]. Salazar and Westbrook [22] studied the case when the data and the parameter are given by radial functions; Kügler [12] used a derivative-free iterative regularization method for analyzing the problem on rectangular frames; and a simplified model for the inverse problem on circular domains was considered by Engl and Kügler [7]. So far it is not obvious which shapes can be made by using this technique. Hence, we try to answer this question by finding the symmetry reductions related to the PDE (1.5) hidden by the nonlinearity that occurs between the data and the parameter.
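The mixed-type behaviour of (1.5) can be checked symbolically for a candidate target shape. The sketch below is illustrative only: the shape w = x²y² and the value ν = 0.22 are hypothetical choices, not examples from the paper; it merely shows the discriminant D changing sign on a square domain.

```python
import sympy as sp

x, y = sp.symbols("x y", real=True)
nu = sp.Rational(22, 100)   # assumed Poisson ratio in (0, 1/2); not a value from the paper
w = x**2 * y**2             # hypothetical target shape

wxx, wyy, wxy = sp.diff(w, x, 2), sp.diff(w, y, 2), sp.diff(w, x, y)
# Discriminant of the second order PDE (1.5) for Young's modulus E
D = (1 - nu)**2 * wxy**2 - (wxx + nu * wyy) * (wyy + nu * wxx)

# D changes sign on the unit square, so (1.5) is of mixed type there:
print(D.subs({x: 1, y: 1}))  # positive: hyperbolic region
print(D.subs({x: 1, y: 0}))  # negative: elliptic region
```

Where D > 0 the equation for E is hyperbolic and where D < 0 it is elliptic, matching the boundary-condition discussion above.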
In this sense,we determine(see section3)the group of transformations that leave the equation unchanged,and so,its mixed type form.Knowledge of the invariants of these group actions allows us to write the target shape and the parameter in terms of them, and,therefore,to reduce the order of the studied equation.Wefind again the obvious result that a Young’s modulus constant corresponds to data which is a solution of a nonhomogeneous biharmonic equation.The circular case problem considered by Salazar and Westbrook is,in fact,a particular case of our study.We show that other target shapes which are not radial functions can be considered.We prove that(1.5) is invariant under scaling transformations.It follows that target shapes modeled by homogeneous functions can be analyzed as well.In particular,we are interested in target shapes modeled by homogeneous polynomials defined on elliptical domains or square domains with rounded corners.The paper is structured as follows.To reduce the order of the PDE(1.5)we propose in section2a method for studying the relation between the data and the pa-rameter in terms of the similarity variables.The equivalence transformations related to this equation are given in section3.The symbolic manipulation program DESOLV, authors Carminati and Vu[4]has been used for this purpose.Table1contains a com-plete classification of these symmetry reductions.In the last section,we discuss the PDE(1.5)augmented with the boundary conditions(1.3)and(1.4),namely,how to use the invariants of the group actions(on suitable bounded domainsΩ)in order to incorporate the boundary conditions.In this sense,certain examples of exact and of numerical solutions of the reduced ODEs are given.2.Conditional symmetries.The direct method approach to a second order PDEF(x,y,E(2))=0consists of seeking solutions written in the form(2.1)E(x,y)=Φ(x,y,F(z)),where z=z(x,y),(x,y)∈Ω.In this case the function z is called similarity variable and its level sets{z=k}are named similarity 
curves.After substituting(2.1)into the studied second order PDE, we require that the result to be an ODE for the arbitrary function F=F(z).Hence, certain conditions are imposed upon the functionsΦ,z and their partial derivatives.SYMMETRY ANALYSIS AND PARAMETER IDENTIFICATION PROBLEMS 117The particular caseE (x,y )=F (z (x,y ))(2.2)consists of looking for solutions depending only on the similarity variable z .If z is an invariant of the group action then the solutions of the form (2.2)are as well.Assume that the similarity variable is such that ∇z =0on ¯Ω.In this section we apply this particular approach to (1.5)in order to study if the parameter and the data are functionally independent,which means whether or not they can depend on the same similarity variable.Assume that Young’s modulus takes the form (2.2).In this case we get the relation(2.3)F (z ) z 2x (w xx +νw yy )+2z x z y (1−ν)w xy +z 2y (w yy +νw xx )+F (z )[z xx (w xx +νw yy )+2(1−ν)z xy w xy ++z yy (w yy +νw xx )+2z x (∆w )x +2z y (∆w )y ]+F (z )(∆2w )=1,which must be an ODE for the unknown function F =F (z ).This condition is satisfied if the coefficients of the partial derivatives of F are function of z only (note that these coefficients are also invariant under the same group action).Denote them byΓ1(z )=z 2x (w xx +νw yy )+2z x z y (1−ν)w xy +z 2y(w yy +νw xx ),Γ2(z )=z xx (w xx +νw yy )+2(1−ν)z xy w xy +z yy (w yy +νw xx )+2z x (∆w )x +2z y (∆w )y ,Γ3(z )=∆2w.(2.4)If these relations hold,then the PDE (1.5)is reduced to the second order linear ODE Γ1(z )F (z )+Γ2(z )F (z )+Γ3(z )F (z )=1.(2.5)2.1.Data and parameter invariant under the same group.If the target shape is invariant under the same group action as Young’s modulus,thenw (x,y )=G (z (x,y )),(2.6)where G =G (z ).Substituting (2.6)into the relations (2.4)we get Γ1=G (z 2x +z 2y )2+G (z 2x +νz 2y )z xx +2(1−ν)z x z y z xy +(z 2y +νz 2x )z yy ,Γ2=2G (z 2x +z 2y )2+G [7z 2x +(ν+2)z 2y ]z xx +2(5−ν)z x z y z xy +[7z 2y +(ν+2)z 2x ]z yy +G (∆z 
)2+2(1−ν)(z 2xy −z xx z yy )+2[z x (∆z )x +z y (∆z )y ]},Γ3=G (z 2x +z 2y )2+2G (3z 2x +z 2y )z xx +4z x z y z xy +(z 2x +3z 2y )z yy +G 3(∆z )2+4(z 2xy −z xx z yy )+4[z x (∆z )x +z y (∆z )y ] +G ∆2z.(2.7)Next,the coefficients of the partial derivatives of the function G ,denoted by Γi ,must depend only on z ,i.e.,Γ1=α4G +a 1G ,Γ2=2α4G +a 2G +a 3G ,Γ3=α4G +2a 4G +a 5G +a 6G ,118NICOLETA B ˆIL ˘Awhereα2(z )=z 2x +z 2y ,a 1(z )=(z 2x +νz 2y )z xx +2(1−ν)z x z y z xy +(z 2y +νz 2x )z yy ,a 2(z )= 7z 2x +(ν+2)z 2y z xx +2(5−ν)z x z y z xy + 7z 2y +(ν+2)z 2xz yy ,a 3(z )=(∆z )2+2(1−ν)(z 2xy −z xx z yy )+2[z x (∆z )x +z y (∆z )y ],a 4(z )=(3z 2x +z 2y )z xx +4z x z y z xy +(z 2x +3z 2y )z yy ,a 5(z )=3(∆z )2+4(z 2xy −z xx z yy )+4[z x (∆z )x +z y (∆z )y ],a 6(z )=∆2z.(2.8)The first relation in (2.8)is a two-dimensional (2D)eikonal equation.From this we getz 2xz xx +2z x z y z xy +z 2y z yy =α3(z )α (z ),z xx =α(z )α (z )−z y z x z xy ,z yy =α(z )α (z )−z x z y z xy .The last two equations implyz 2y z xx −2z x z y z xy +z 2x z yy =α3(z )α (z )−α4(z )z xy z x z y.(2.9)Assume that there is a function β=β(z )such thatz xy =β(z )z x z y .(2.10)Indeed,since the left-hand side in (2.9)depends only on z ,one can easily check if z satisfies both the 2D eikonal equation in (2.8)and (2.10),then all the functions a i =a i (z )defined by (2.8)are written in terms of αand β.Therefore,the problem of finding the similarity variable z is reduced to that of integrating the 2D eikonal equation and the PDE system⎧⎪⎪⎨⎪⎪⎩z xx =αα −βz 2y ,z xy =βz x z y ,z yy =αα −βz 2x .(2.11)The system (2.11)is compatible if the following relation holds:αα +α 2−3βαα +α2 β2−β =0.Denote µ=12α2.In this case,the above compatibility condition can be written asµ −3βµ +2µ β2−β =0.(2.12)On the other hand,if the function βis given byβ(z )=−λ (z ) ,(2.13)SYMMETRY ANALYSIS AND PARAMETER IDENTIFICATION PROBLEMS119 whereλis a nonconstant function,then(2.10)turns into(λ(z))xy=0.The general solution of this equation is 
given byλ(z(x,y))=a(x)+b(y),(2.14)with a and b being arbitrary functions.Substitutingβfrom(2.13)into the compati-bility condition(2.12)and after integrating once,we getµ λ +2µλ =k,(2.15)where k is an arbitrary constant.Case1.If k=0,then after integrating(2.15)and substituting backµ=12α2,wegetα2(z)=2kλ(z)+C1λ 2(z).(2.16)The relation(2.14)impliesλ (z)z x=a (x),andλ (z)z y=b (y).We substitute these relations,(2.14)and(2.16),into the2D eikonal equation(see(2.8)).It follows that the functions a=a(x)and b=b(y)are solutions of the following respective ODEs:a 2(x)−2ka(x)=C2andb 2(y)−2kb(y)=C3,with C2+C3=C1(here C i are real constants).The above ODEs admit the noncon-stant solutionsa(x)=12kk2(x−C4)2−C2and b(y)=12kk2(y−C5)2−C3,and so(2.14)takes the formλ(z(x,y))=k2(x−C4)2+(y−C5)2−C12k.(2.17)Notice that1k1λorλ+k2defines the same functionβas the functionλdoes.Moreover,since the PDE(1.5)is invariant under translations in the(x,y)-space,we can considerλ(z(x,y))=x2+y2.(2.18)If √λis a bijective function on a suitable interval,and if we denote byΦ=(√λ)−1its inverse function,then the similarity variable written in the polar coordinates(r,θ) (where x=r cos(θ),y=r sin(θ))is given byz(x,y)=Φ(r).(2.19)For simplicity,we considerΦ=Id,and from that we getE=F(r)and w=G(r),where z(x,y)=r.(2.20)Hence,the ODE(2.5)turns into(2.21)G +νrGF +2G +ν+2rG −1r2GF+G +2rG −1r2G +1r3GF=1,120NICOLETA B ˆIL ˘Awhich can be reduced to the first order ODEG +νG F + G +1G −1G F =r 2−r 20+γ,(2.22)where r 0∈[0,1]with the property that γ= (rG +νG )F + rG +G −1rG F |r =r 0is finite.The smoothness condition G (0)=0implies that (2.22)can be written as [15] G +νrG F + G +1r G −1r 2G F =r 2.(2.23)Case 2.If k =0,similarly we getz (x,y )=Φ(k 1x +k 2y ),(2.24)where k 1and k 2are real constants such that k 21+k 22>0.In this case,for Φ=Id,the parameter and the data are written asE =F (z )and w =G (z ),where z (x,y )=k 1x +k 2y,(2.25)and the ODE (2.5)turns into G (z )F (z )+2G (z )F (z )+G (z )F (z )=1(k 21+k 22)2,(2.26)with 
{z |G (z )=0}the associated set of singularities.Integrating the above ODE on the set {z |G (z )=0}we obtain that Young’s modulus is given byE (x,y )=(k 1x +k 2y )2+C 1(k 1x +k 2y )+C 22(k 21+k 22)2G (k 1x +k 2y ),where C i are arbitrary constants.2.2.Data and parameter invariant under different groups.Consider two functionally independent functions on Ω,say,z =z (x,y )and v =v (x,y ),and let w =H (v (x,y ))(2.27)be the target shape.In this case,the data and the parameter do not share the same invariance.Similar to the above,substituting (2.27)into the relations (2.4)we get Γ1=H (z x v x +z y v y )2+ν(z y v x −z x v y )2+H z 2x v xx +2z x z y v xy +z 2y v yy +ν z 2x v yy −2z x z y v xy +z 2y v xx ,Γ2=H (v 2x +v 2y )(z x v x +z y v y )+H v 2x z xx +2v x v y z xy +v 2y z yy +ν v 2y z xx −2v x v y z xy +v 2x z yy +2z x v x v xx +2(z x v y +z y v x )v xy +2z y v y v yy+(z x v x +z y v y )(∆v )]+H [z xx v xx +2z xy v xy +z yy v yy +ν(z xx v yy −2z xy v xy +z yy v xx )+z x (∆v )x +z y (∆v )y ],Γ3=H (v 2x +v 2y )2+2H (3v 2x +v 2y )v xx +4v x v y v xy +(v 2x +3v 2y )v yy +H 3v 2xx +4v 2xy +3v 2yy +2v xx v yy +4v x (∆v )x +4v y (∆v )y +H ∆2v.(2.28)SYMMETRY ANALYSIS AND PARAMETER IDENTIFICATION PROBLEMS121 Recall thatΓi’s are functions of z=z(x,y)only.Since each right-hand side in the above relations contains the function H=H(v)and its derivatives,we require that the coefficients of the derivatives of H to be functions of v.It follows thatΓi must be constant and denote them byγi.Therefore,the last condition in(2.28)becomes∆2(w)=γ3,(2.29)which is the biharmonic equation.According to the above assumption,we seek solu-tions of(2.29)that are functions of v only.Similar to section2.1,we getv(x,y)=Ψ(r),or v(x,y)=Ψ(k1x+k2y),(2.30)and thus,forΨ=Id,the target shape is written asw(x,y)=H(r),or w(x,y)=H(k1x+k2y).(2.31)Since z=z(x,y)and v=v(x,y)are functionally independent,we getz(x,y)=k1x+k2y,v(x,y)=x2+y2(2.32)orz(x,y)=x2+y2,v(x,y)=k1x+k2y.(2.33)One can prove that if the coefficientsγi 
are constant,and if z and v are given by (2.32)or(2.33),respectively,thenγ1=γ2=0,andγ3=0.On the other hand,the solutions of the biharmonic equation(2.29)of the form(2.31)are the following:w(x,y)=γ364z4+C1z2+C2ln(z)+C3z2ln(z)+C4for z=x2+y2,and,respectively,w(x,y)=γ324(k21+k22)2v4+C1v3+C2v2+C3v+C4for v=k1x+k2y,and these correspond to the constant Young’s modulusE(x,y)=1γ3.(2.34)Notice that only particular solutions of the biharmonic equation have been found in this case(i.e.,solutions invariant under rotations and translations).Since this PDE is also invariant under scaling transformations,which act not only on the space of the independent variables but on the data space as well,it is obvious to extend our study and to seek other types of symmetry reductions.3.Equivalence transformations.Consider a one-parameter Lie group of trans-formations acting on an open set D⊂Ω×W×E,where W is the space of the data functions,and E is the space of the parameter functions,given by⎧⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎩x∗=x+εζ(x,y,w,E)+O(ε2), y∗=y+εη(x,y,w,E)+O(ε2), w∗=w+εφ(x,y,w,E)+O(ε2), E∗=E+εψ(x,y,w,E)+O(ε2),(3.1)122NICOLETA BˆIL˘Awhereεis the group parameter.LetV=ζ(x,y,w,E)∂x+η(x,y,w,E)∂y+φ(x,y,w,E)∂w+ψ(x,y,w,E)∂E (3.2)be its associated general infinitesimal generator.The group of transformations(3.1) is called an equivalence transformation associated to the PDE(1.5)if this leaves the equation invariant.This means that the form of the equation in the new coordinates remains unchanged and the set of the analytical solutions is invariant under this trans-formation.The equivalence transformations can be found by applying the classical Lie method to(1.5),with E and w both considered as unknown functions(for more details see[10]and[21]).Following this method we obtain⎧⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎩ζ(x,y,w,E)=k1+k5x−k4y,η(x,y,w,E)=k2+k4x+k5y,φ(x,y,w,E)=k3+k7x+k6y+(4k5−k8)w,ψ(x,y,w,E)=k8E,(3.3)where k i are real constants.The vectorfield(3.2)is written as V= 8i=1k i V 
i,whereV1=∂x,V2=∂y,V3=∂w,V4=−y∂x+x∂y,V5=x∂x+y∂y+4w∂w,V6=y∂w,V7=x∂w,V8=−w∂w+E∂E.(3.4)Proposition3.1.The equivalence transformations related to the PDE(1.5)are generated by the infinitesimal generators(3.4).Thus,the equation is invariant under translations in the x-space,y-space,w-space,rotations in the space of the independent variables(x,y),scaling transformations in the(x,y,w)-space,Galilean transforma-tions in the(y,w)and(x,w)spaces,and scaling transformations in the(w,E)-space, respectively.Notice that the conditional symmetries found in section2represent particular cases of the equivalence transformations.Since each one-parameter group of trans-formations generated by V i is a symmetry group,if(w=G(x,y),E=F(x,y))is a pair of known solutions of(1.5),so are the following:w(1)=G(x−ε1,y),E(1)=F(x−ε1,y),w(2)=G(x,y−ε2),E(2)=F(x,y−ε2),w(3)=G(x,y)+ε3,E(3)=F(x,y),w(4)=G(˜x,˜y),E(4)=F(˜x,˜y),w(5)=e4ε5G(e−ε5x,e−ε5y),E(5)=F(e−ε5x,e−ε5y),w(6)=G(x,y)+ε6y,E(6)=F(x,y),w(7)=G(x,y)+ε7x,E(7)=F(x,y),w(8)=e−ε8G(x,y),E(8)=eε8F(x,y),(3.5)SYMMETRY ANALYSIS AND PARAMETER IDENTIFICATION PROBLEMS123where ˜x =x cos(ε4)+y sin(ε4),˜y=−x sin(ε4)+y cos(ε4),and εi are real constants.Moreover,the general solution of (1.5)constructed from a known one is given by w (x,y )=e 4ε5−ε8G (e −ε5(˜x −˜k 1),e −ε5(˜y −˜k 2))+e 4ε5−ε8ε6y +e 4ε5−ε8ε7x +e 4ε5−ε8ε3,E (x,y )=e ε8F (e −ε5(˜x −˜k 1),e −ε5(˜y −˜k 2)),where ˜k1=ε1cos(ε4)+ε2sin(ε4),and ˜k 2=ε1sin(ε4)−ε2cos(ε4).The equivalence transformations form a Lie group G with an eight-dimensional associated Lie algebra A .Using the adjoint representation of G ,one can find the optimal system of one-dimensional subalgebras of A (more details can be found in [18,pp.203–209]).This optimal system is spanned by the vector fields given in Table 1.Denote by z ,I ,and J the invariants related to the one-parameter group of transformations generated by each vector field V i .Here F and G are arbitrary functions,(r,θ)are the polar coordinates,and a,b,c are nonzero 
constants.To reduce the order of the PDE (1.5)one can also integrate the first order PDE systemζ(x,y,w,E )w x +η(x,y,w,E )w y =φ(x,y,w,E ),ζ(x,y,w,E )E x +η(x,y,w,E )E y =ψ(x,y,w,E ),(3.6)which defines the characteristics of the vector field (3.2).In Table 1,the associated reduced ODEs are listed.The invariance of (1.5)under the one-parameter groups of transformations generated by V 1,V 2,V 1+cV 6,and V 2+cV 7,respectively,leads us to the same ODE,F (z )G (z )+2F (z )G (z )+F (z )G (z )=1,(3.7)with the general solution F (z )=z 2+C 1z +C 22G (z )(3.8)on the set {z |G (z )=0}.The invariance under the scaling transformation generated by the vector field V 5yields the reduced ODEG z 2+1 2−6z (z 2+1)G +12(z 2+ν)G F +2z 2+1 2G−5z (z 2+1)G +3(4z 2+ν+1)G−12zG F+z 2+1 2G−4z (z 2+1)G +4(3z 2+1)G −24zG+24G F =1.(3.9)The ODEz 2+1 2G+2(c −3)z (z 2+1)G +(c −3)(c −4)(z 2+ν)G F+ 2 z 2+1 2G +2(2c −5)z (z 2+1)G +2(c −3)[z 2(c −4)+ν(c −1)−1]G −2(c −3)(c −4)zG }F+z 2+1 2G +2(c −2)z (z 2+1)G +(c −3)(c −4)z 2−2(c −2)+νc (c −1)]G −2(c −4)(c −3)zG +2(c −4)(c −3)G }F =1(3.10)124NICOLETA BˆIL˘ATable1Infintesimal generator Invariants w=w(x,y)E=E(x,y)ODE1.V1z=y w=G(z)E=F(z)(3.7)I=wJ=E2.V2z=x w=G(z)E=F(z)(3.7)I=wJ=E3.V4z=r w=G(z)E=F(z)(2.21)I=wJ=E4.V5z=yx w=x4G(z)E=F(z)(3.9)I=x−4wJ=E5.cV3+V4z=r w=cθ+G(z)E=F(z)(2.21)I=w−cθJ=E6.V5+cV8z=yx w=x4−c G(z)E=x c F(z)(3.10)I=x c−4wJ=x−c E7.V4+cV8z=r w=e−cθG(z)E=e cθF(z)(3.11)I=e cθwJ=e−cθE8.V4+cV5z=re−cθw=r4G(z)E=F(z)(3.13)I=r−4wJ=E9.V4+cX5+bV8z=re−cθw=r4−b c G(z)E=r b c F(z)(3.14)I=r b c−4wJ=r−b c E10.V1+cV6z=y w=cxy+G(z)E=F(z)(3.7)I=w−cxyJ=E11.V2+cV7z=x w=cxy+G(z)E=F(z)(3.7)I=w−cxyJ=E12.V1+cV8z=y w=e−cx G(z)E=e cx F(z)(3.15)I=e cx wJ=e−cx E13.V2+cV8z=x w=e−cy G(z)E=e cy F(z)(3.15)I=e cy wJ=e−cy Eis obtained in case6of Table1.The reduced equationG +νrG +νc2r2GF +2G +ν+2rG +2νc2−1r2G −c2(1+2ν)r3GF(3.11)+G +2rG +c2ν−1r2G +1−c2(2ν+1)r3G +2c2(ν+1)r4GF=1。

SUBMITTED TO IEEE TRANSACTIONS ON IMAGE PROCESSING: Group Testing for Image Compression Using Alternative Transforms


Group Testing for Image Compression Using Alternative Transforms

Edwin S. Hong*, Richard Ladner
Department of Computer Science and Engineering, University of Washington, Box 352350, Seattle, WA 98195-2350
edhong,ladner@

Eve A. Riskin
Department of Electrical Engineering, University of Washington, Box 352500, Seattle, WA 98195-2500
riskin@

Abstract—This paper extends the Group Testing for Wavelets [1] algorithm to code coefficients from the wavelet packet transform, the discrete cosine transform, and various lapped transforms. In terms of compression performance, these new algorithms are competitive with many recent state-of-the-art image coders that use the same transforms. We also show that group testing offers a noticeable improvement over zerotree coding techniques. These new algorithms show the inherent flexibility of the group testing methodology.

I. INTRODUCTION

Much of the recent compression work has focused on efficient methods for encoding the transform coefficients of an image. Although the wavelet transform has received the most attention in recent years, alternative transforms such as wavelet packets and various block transforms have also been effectively applied to images. In this paper, we extend the Group Testing for Wavelets (GTW) algorithm [1] to apply to alternative transforms which have previously been used for image compression, including the wavelet packet transform, the discrete cosine transform (DCT), and several versions of the lapped transform [4], [5]. As presented in [1], the group testing framework transforms an image and then encodes the resulting transform coefficients in bit-plane order with many different adaptive group testers. For efficient compression, the coefficients are divided into classes whose coefficients have similar statistical characteristics. In order to apply this framework effectively to alternative transforms, new class definitions need to be defined. One main goal of this work is to discover the appropriate class definitions for each type of transform that will
result in good performance.

Our work was partially motivated by previous work that applied zerotree coding (introduced in [6]) to these alternative transforms (see [7], [8], [9], [10], [11]) with some success. The zerotree technique was motivated by the multi-resolution structure of the dyadic wavelet decomposition, where coefficients could be organized into trees formed across different subbands. Since there is a mismatch between the zerotree structure and the statistical characteristics of the coefficients generated from the alternative transforms we study, using zerotree coding on these coefficients leads to inefficiencies in coding performance. Furthermore, there does not appear to be a natural way to define the parent-child relationships between the alternative transform coefficients, as there is in the dyadic wavelet decomposition.

Short preliminary versions of some sections of this paper appeared in the 2001 Data Compression Conference and the 35th Asilomar Conference on Signals, Systems, and Computers. Research supported by NSF grant CCR-9732828, NSF Grant EIA-9973531, and NSF Grant CCR-0104800. EDICS: 1-STIL
As a generalization of zerotree coding,group testing is not hampered by the zerotree structure and can easily be adapted to more efficiently code these transform coefficients.Our results indicate that our group testing technique achieves better PSNR performance than previous zerotree coding techniques when us-ing the same transform.Our new results show significant perfomance improvements over GTW on the Barbara image.On this image,the algorithm using the best lapped transform performed about dB better than GTW at a wide range of bit-rates.Similarly,the wavelet packets version performed about dB better than GTW.Other images also showed some improvement,although not quite as much.In addition,the algorithms also compare quite favorably to the JPEG2000standard.This paper is organized as follows:Section II reviews the main elements of the framework that was used in the GTW al-gorithm.This includes a brief overview of group testing,image coding,and the GTW algorithm.Section III presents the group testing for wavelet packets(GTWP)algorithm,which includes a brief overview of wavelet packet image compression,the GTWP algorithm,and GTWP’s rate-distortion performance.Section IV presents the group testing for block transforms algorithm,in-cluding an overview of block transforms,and the performance results.We summarize our overall results in section V.II.G ROUP T ESTING FOR I MAGE C OMPRESSIONA.IntroductionGroup testing is a technique used for identifying a few sig-nificant items out of a large pool of items.In this framework, the significant items can be identified only through a series of group tests.A group test consists of picking a subset of items and testing them together.There are two possible outcomes of a group test on set:either is insignificant(meaning all items in are insignificant),or is significant(meaning there is at least one significant item in).The goal is to minimize the number of group tests required to identify all the significant items.In this paradigm,the 
cost of testing any set of items for significance is the same as the cost of testing a single item. As shown in [1], group testing can be viewed as a generalized form of zerotree coding, where the groups tested together do not have to be coefficients organized strictly into trees. The encoded output would simply be a series of bits representing the group test results; this is exactly like using bits to represent whether a tree of coefficients is significant in zerotree coding. Group testing for image compression replaces the zerotree coding process of a typical embedded zerotree coder with a technique based on group testing.

B. Group Testing Framework Overview

In our group testing framework, we follow the standard practice of applying a linear transform to the image data and then coding the transform coefficients. The transform coefficients are coded in a bit-plane by bit-plane fashion, with each bit-plane coded by two passes: a significance pass that identifies newly significant coefficients in the current bit-plane, and a refinement pass that gives an additional bit of precision to already significant coefficients.

The significance pass uses an adaptive form of group testing based on group iterations (described in Section II-C.1). Since this adaptive method is known to work well on i.i.d. sources, we try to ensure that the coefficients we code are approximately i.i.d. We accomplish this by dividing the coefficients into classes, where each class is coded by a different adaptive group tester. The classes are designed so that coefficients within one class are well approximated by an i.i.d. source. Note that dividing the coefficients into classes is similar to choosing a different method of coding coefficients based on context.

Since the statistical characteristics of the transform coefficients depend on the transform used, the classes should be designed separately for each transform. In [1], the GTW classes were designed for the dyadic wavelet decomposition of an image. In this
work, we design new classes for the alternative transforms that we use. We present several different definitions of classes in Sections III-D.1, III-D.2, and IV-E.1.

For the purposes of obtaining good embedded performance, we code the classes in order of the probability of significance of their coefficients. Classes with coefficients that have a higher probability of being significant should be encoded first. Since the probability of significance of the coefficients in any class depends on the class definition, we must choose a method of ordering the classes that depends upon the class definition.

C. Some Significance Pass Details

We first describe group iterations, the method by which our significance pass is encoded. We then describe our adaptive group testing strategy. Finally, we end this subsection with a description of how the group testing framework encodes the different classes using adaptive group testing. This section presents only an overview of our significance pass; for full details, see [1].

C.1 Group Iterations

A group iteration is a simple procedure that is given a set of items and uses group tests to identify at most one significant item along with some number of insignificant items. At the end of a group iteration, there may be some unidentified items that must be tested in a future group iteration. If the set contains a significant item, the group iteration uses group tests in a recursive, binary-search-like process to identify one significant item; otherwise it uses exactly one group test to identify the set as containing only insignificant items.

C.2 Adaptive Group Testing

We adaptively pick the group iteration size depending upon the statistical characteristics of the items being encoded. We start out in a doubling phase, with group iteration size 1, and double the size of each successive group iteration as long as no significant items have been found. Once a significant item has been found, we move to the steady-state estimation phase, where we choose a group iteration
size that results in optimal coding performance based on our estimate of the probability of significance. Our estimate is calculated as the percentage of significant items seen so far.

C.3 Significance Pass Algorithm

As previously described, our method divides the coefficients of one bit-plane into classes and uses the adaptive group testing technique above to code each class. Given the class ordering and the definition of classes, the algorithm for encoding the significance pass is conceptually very simple: pick the first class (according to the class ordering) that contains enough coefficients. Then perform a group iteration of size k on that class, where k is chosen according to the statistics in the adaptive group tester for that class. Then update the coefficients as necessary with the information learned from the group tests (coefficients could change classes at this point). Finally, repeat this entire procedure until all coefficients are coded.

III. GROUP TESTING FOR WAVELET PACKETS

A. Wavelet Packets Background

As described in [12], wavelet packets are a generalization of the standard dyadic wavelet decomposition of a signal. The standard dyadic wavelet transform decomposes the signal by applying a series of successive filters to the lowest frequency subband. Wavelet packets generalize this so that the successive filters can be applied to any subband of any orientation, not just the lowest frequency LL subband. Any one particular choice of subbands to decompose is known as a basis; the choice of exactly which basis to use depends on the characteristics of the input. Figure 1 shows the subbands after transforming an image with the wavelet packet transform using one particular basis.

HONG, LADNER, AND RISKIN: GROUP TESTING

Fig. 1. Sample subbands of a wavelet packet-transformed image.

A basis that adapts well to the input signal can be chosen via Coifman and Wickerhauser's entropy-based technique [13] or by Ramchandran and Vetterli's rate-distortion
optimization technique [14]. These methods work by fully decomposing all subbands to a predefined maximum depth, thus forming a decomposition tree where each decomposed subband is represented in the tree by a parent node with four child nodes. Then the best basis is found by pruning this decomposition tree in a recursive bottom-up fashion. The entropy-based technique prunes the tree to minimize the overall estimated entropy of the wavelet packet structure. The rate-distortion method is given a particular target bit rate for the image and prunes the tree to minimize the distortion of the image.

Xiong et al. [15] first explored the combination of a wavelet packet decomposition of an image with the space-frequency quantization (SFQ) coder, a coder that uses zerotree quantization techniques. The difficulty in applying zerotree quantization to wavelet packets is that it is no longer clear how to define the parent-child relationships in the trees. As noted by Rajpoot et al. [7], there is a parenting conflict, where some child coefficients could have multiple parents. This problem has typically been solved by limiting the space of possible wavelet packet decompositions so that no parenting conflict occurs, or by assigning the parent-child relationships in a somewhat ad hoc manner (see [15], [7], [8]).

B. Group Testing for Wavelet Packets

We propose a new coder, Group Testing for Wavelet Packets (GTWP), that applies our group testing framework to the wavelet packet transform. The first step is to find the best basis for the input image and encode the structure of this basis in the first bits of our compressed image. Then we define the GTWP classes based on the characteristics of the wavelet packet decomposition of the image, so that the classes are encoded efficiently. Along with the class definition, we also specify the order in which we will code the classes. Once both the GTWP classes and the ordering between them are defined, we can code each class with a different group tester and proceed as described in
the group testing framework for image compression. We first describe how we choose the best basis and encode it; then we describe two different methods for defining the GTWP classes with their associated orderings.

C. Best Basis

We investigated both the entropy-based technique and the rate-distortion technique for computing the best wavelet packet basis. For the entropy-based technique, we explored many different metrics for calculating the entropy of a particular subband. Let x_i represent the coefficients of a subband. Then the entropy metrics we tried are as follows:
- log energy metric: sum_i log(x_i^2).
- Shannon metric (used in [13]): -sum_i x_i^2 log(x_i^2).
- L-norm metric: (sum_i |x_i|^p)^(1/p).
- threshold metric: given a threshold value t, calculate the number of coefficients with |x_i| > t.
- first-order entropy metric (used in [8]): given a quantization step size, divide the coefficients into quantization bins, estimate the probability of bin b occurring by p_b = n_b / N, where n_b is the number of coefficients in that bin and N is the total number of coefficients, and calculate -sum_b p_b log(p_b).

We also tried the rate-distortion optimization technique, optimizing for a wide variety of bit-rates and various possible scalar quantizers. Note that this technique is not well suited to our problem because it forces us to pick artificial parameters, namely, the final bit-rate for which to optimize and the quantizer step sizes to consider. Since GTWP is an embedded coder, the final bit-rate we choose for the purpose of obtaining the best basis does not correspond to the actual final bit-rate to which we encode the image. Furthermore, since GTWP codes the transform coefficients bit-plane by bit-plane, it cannot choose to code a subband with a particular quantizer step size; the step size it ends up using may not have any relation to the quantizer step size parameters with which we ran the rate-distortion optimization technique.

It is interesting to note that for the Barbara image, the optimal calculated quantizer step sizes for all the subbands under the rate-distortion technique differed from each other by no more
than a factor of two. In the bit-plane encoding technique, if we stop coding in the middle of a bit-plane, then the coefficients that have not yet been coded in the current bit-plane are quantized with a step size of 2 times the step size of those coefficients that have been coded. This suggests that GTWP's bit-plane encoding technique may be a good approximation to the quantization step sizes that the rate-distortion optimization best basis produces.

Fig. 2. Illustration of the best basis for the Barbara image.

SUBMITTED TO IEEE TRANSACTIONS ON IMAGE PROCESSING

The log energy, Shannon, and L-norm metrics are the simplest in that they do not require additional parameters (such as a threshold value or quantization step size) to compute. The top performers for our algorithm were the log energy metric and the rate-distortion optimization metric. Seeing that the log-energy metric was simpler and did not require selecting artificial parameters, we used it exclusively. As an example, we show the best basis chosen by the log energy metric on the Barbara image in Figure 2. For simplicity, we show only a few levels of decomposition, fewer than the maximum number of levels our algorithm uses.
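A minimal sketch of this bottom-up pruning with the log-energy metric follows. The dictionary-based tree representation and the epsilon guard inside the logarithm are our own assumptions for illustration, not details taken from the paper:

```python
import numpy as np

def log_energy(x, eps=1e-12):
    # Additive log-energy cost of one subband's coefficients.
    # eps guards log(0); it is our own numerical choice.
    x = np.asarray(x, dtype=float)
    return float(np.sum(np.log(x * x + eps)))

def best_basis(node):
    # node: {'coeffs': array} for a subband, plus optionally 'children':
    # a list of four sub-nodes holding its further-decomposed versions.
    # Returns (cost, pruned_node): a split survives only when the
    # children's total cost beats the cost of the undecomposed subband.
    own_cost = log_energy(node['coeffs'])
    children = node.get('children')
    if not children:
        return own_cost, {'coeffs': node['coeffs']}
    split_cost, pruned_children = 0.0, []
    for child in children:
        c, p = best_basis(child)
        split_cost += c
        pruned_children.append(p)
    if split_cost < own_cost:
        return split_cost, {'coeffs': node['coeffs'], 'children': pruned_children}
    return own_cost, {'coeffs': node['coeffs']}
```

A subband whose decomposition concentrates energy in few coefficients (most children near zero) keeps its split; otherwise the split is pruned away.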
To encode the decomposition tree, we simply perform a depth-first traversal of the tree, and encode one bit value when a node is split into children and the other bit value when the node is a leaf.

D. GTWP Classes

To illustrate the flexibility of the group testing methodology, we implemented two different ways of choosing the GTWP classes: GTWP-S and GTWP-J. The first is a simplification of the GTW class definition (S for Simple), and the second is based on the contexts used in the JPEG 2000 image coder (J for JPEG 2000).

D.1 GTWP-S

The GTWP-S classes are a simplification of the GTW classes and are defined by two characteristics: the subband level and the significant neighbor metric.

D.1.a Subband Level. The lowest frequency subband, representing the average of the entire image, counts as subband level one. There is one additional subband level for each level of the wavelet transform. Figure 3 shows the subband levels for a wavelet transform of several levels. Note that because we are using wavelet packets, each subband level may contain more than the usual number of subbands.

Fig. 3. Subband levels of a wavelet-packet transformed image in GTWP-S. Solid lines separate subband levels; dotted lines separate subbands.

D.1.b Significant Neighbor Metric. To finesse the problem of defining parent-child relationships in the wavelet packet transform, we restrict the neighbors of a coefficient to the up to eight spatially adjacent coefficients in the same subband. Like GTW, there are four values in the significant neighbor metric, 0, 1, 2, and 3+, depending on whether 0, 1, 2, or more than 2 neighbors are significant. Because a coefficient has at most eight such neighbors, the neighbor count never exceeds eight.

Overall, the number of classes is the product of the number of subband levels and the four significant neighbor types. Note that the subband level and significant neighbor metric are similar to the corresponding characteristics of the original GTW classes. Our new
definition omits the pattern type characteristic found in the GTW classes, because using it did not produce significantly better results. These classes are ordered according to the same ordering as the GTW classes, namely, with the significant neighbor metric rated as more important than the subband level characteristic.

D.2 GTWP-J

The GTWP-J classes are based on the contexts in Taubman's EBCOT coder [16], which are also found in the JPEG 2000 coder. In this class definition, only two characteristics define the classes: the orientation type and the neighborhood significance label.

D.2.a Orientation Type. The orientation type of a subband is based on the orientation of the largest parent subband, in the previous subband level, that contains it. Here, the subband levels are defined as specified for the GTWP-S classes. There are only three orientation types: LH (vertically high-pass), HL (horizontally high-pass), and HH. The subband at subband level one (the LL subband) is considered to have orientation type LH. The orientation type of a coefficient is the orientation type of the subband containing it. This is illustrated in Figure 4.

Fig. 4. Illustration of orientation type of a wavelet-packet transformed image. All shaded coefficients have orientation type HL.

D.2.b Neighborhood Significance Label. Let h, v, and d represent the number of significant neighbors that a coefficient has which are adjacent to it horizontally, vertically, and diagonally, respectively. Thus, h and v each have a value of at most 2, whereas d has a maximum value of 4. The neighborhood significance label is assigned according to Table I. Note that the labeling depends on the orientation type of the coefficient. This label is taken from the context classifier in the EBCOT coder.

With three orientation types and a set of significant neighbor labels for each orientation type, the total number of classes is their product. The classes are ordered according to the group iteration size. Classes with smaller group iteration size are coded first, since they are more likely to be
significant. Ties are broken arbitrarily.

E. Results

Here we present our results on some standard 8-bit monochrome images: the images Barbara, Goldhill, and Lena (available from [17]); and a fingerprint image from the FBI's fingerprint compression standard [18]. We present results for several different algorithms, including GTW, GTWP-S, GTWP-J, JPEG 2000, and SFQ-WP [19]. All algorithms use the Daubechies 9/7-tap filters [20]. JPEG 2000 results were produced with a beta version of a codec [21] for the JPEG 2000 image compression standard. SFQ-WP represents the practical version of Xiong et al.'s SFQ algorithm applied with wavelet packets; results are taken from [19]. To our knowledge, SFQ-WP is the current state-of-the-art method for image compression with wavelet packets.

TABLE I: NEIGHBORHOOD SIGNIFICANCE LABEL (assigned label).

Fig. 5. Comparing performance of GTW, GTWP, and JPEG-2000.

Figure 5 compares the PSNR curves for GTW, GTWP-S, GTWP-J, and JPEG 2000 on the Barbara image. As can be seen, there is little difference between the two GTWP variants, and using the wavelet packets increases the PSNR by about dB over GTW. Table II lists PSNR results for all four images on the different algorithms.

The table shows that the amount of improvement from using wavelet packets instead of the dyadic wavelet decomposition is highly dependent on the type of image. Some images (like Barbara) benefit significantly from the wavelet packet decomposition; some images (like Goldhill) benefit slightly; and some (like Lena) do not benefit at all. In fact, the best wavelet packet basis for the Lena image was calculated to be the standard dyadic wavelet decomposition with one additional decomposition of a highest frequency subband. Thus, as expected, the results for GTWP on Lena are roughly the same as those for GTW. The slight performance differences are due mostly to differing significant neighbor metrics.

If we compare our results with the published results of previous zerotree coding techniques on
wavelet packets, we see that we outperform Rajpoot et al.'s technique by over dB, and we outperform Khalil et al.'s technique by about dB on the Barbara image.

It is interesting to note that there was little difference between GTWP-S and GTWP-J. It appears that as long as something reasonable is chosen, the exact method of classifying coefficients by which neighbors are significant does not matter that much. In fact, we also tested GTW-S, a version of GTW simplified so that the significant neighbor metric did not include spatially co-located neighbors in different subbands (making GTW-S similar to GTWP-S). There was also little difference between GTW-S and GTW. It appears that the significance of a coefficient depends almost entirely on its immediately adjacent neighbors, and very little on the parents and other neighbors in different subbands. This agrees with the findings in [16].

As can be seen in the table, GTWP's performance is often better than JPEG 2000, and never significantly worse. Furthermore, GTWP's performance is not too far from that of SFQ-WP. Where GTWP is worse, note that GTWP is an embedded coder, while SFQ-WP is not. Furthermore, GTWP is much simpler than SFQ in that it does not use arithmetic coding and does not perform rate-distortion optimization.

IV. GROUP TESTING FOR BLOCK TRANSFORMS

In this section, we show the results of applying our group testing framework to some standard block transforms. We first overview the use of block transforms for image compression. We then define the classes that we use for the block transforms, and conclude with a discussion of our results.

TABLE II: COMPARING PSNR RESULTS OF VARIOUS ALGORITHMS, WITH BEST RESULTS IN BOLDFACE. GTW IS USED AS THE BASELINE. (Rows: rates in bits/pixel for GTW, JPEG 2000, GTWP-S, GTWP-J, and SFQ-WP on the Barbara, Goldhill, Lena, and fingerprint images.)

A. Block Transform Overview

A.1 Block Transform Background

When applying standard block transforms such as the DCT to images, the input pixels are divided into blocks, and each block is separately transformed into
an output block of the same size. The coefficient at the upper-left position in each output block is known as the DC coefficient, and all other coefficients are known as AC coefficients. Note that the DC coefficient of a particular block represents the average of the values of the pixels in the corresponding input block.

A lapped transform [4] is a generalization of the standard block transform where the input is divided into overlapping blocks of length L, with each block transformed into an output block of a smaller size n; we call this an L × n lapped transform. An L × n lapped transform can be computed by multiplying an input row vector of length L with an L × n matrix representing the transform, resulting in an output block of length n. In a typical example where L = 2n, each input data point is used in two adjacent output blocks. In this case, the inverse transform to recover one original block of n input data points is computed by taking two adjacent output blocks of coefficients (2n coefficients total) and multiplying them with another matrix representing the inverse transform. In the two-dimensional case, we can view a lapped transform as mapping overlapping input blocks of size L × L into output blocks of size n × n.
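As a concrete illustration of a non-lapped block transform, here is a minimal orthonormal 8 × 8 DCT-II sketch. This is our own illustration, not the paper's implementation, which would use an optimized transform:

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II basis matrix; row k is the k-th cosine basis vector.
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0, :] /= np.sqrt(2)
    return C * np.sqrt(2.0 / n)

def block_dct2(block):
    # Separable 2-D DCT of one square block: C @ X @ C^T.
    C = dct_matrix(block.shape[0])
    return C @ block @ C.T
```

For a constant 8 × 8 block of value v, only the upper-left (DC) coefficient is nonzero, and it equals 8v, matching the statement that the DC coefficient carries the block average.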
Unlike non-lapped block transforms, lapped transforms can take correlation between adjacent blocks into account; this makes them more efficient at decorrelating the input. Lapped transforms can also reduce blocking artifacts because their basis functions decay smoothly to near zero at the boundaries. Both lapped orthogonal transforms (LOT) and lapped biorthogonal transforms (LBT) have been studied. LBTs have more degrees of freedom than LOTs since the biorthogonality condition is weaker than the orthogonality condition. Aase and Ramstad [22] have shown that these extra degrees of freedom can be used to design better lapped transforms for image coding.

B. Organization into Subbands

The block-transform coefficients of an image are typically stored in a block-by-block fashion, so that the output of a block transform that uses n × n blocks consists of a grid of blocks, where each block represents an n × n block of the original input image. However, we can conceptually reorder the transform coefficients into a grid of subbands. This reordering puts all the DC coefficients into one subband, ordered so that the DC coefficient of block (i, j) is at position (i, j) in the DC subband. Similarly, there will be a separate subband for each AC coefficient; subband (u, v) will contain the AC coefficients from position (u, v) within their block, ordered so that the AC coefficient at position (u, v) in block (i, j) will be located at position (i, j) in subband (u, v). Figure 6 illustrates this reorganization.

Fig. 6. Block transform coefficients on the left (Block View) are reorganized into subbands on the right (Subband View). The DC coefficients are represented as circles and end up together in one subband; black coefficients from one block are scattered out to all subbands.

In this reorganized picture, each of the subbands represents the entire original image at a different frequency decomposition. Note that with this organization, these subbands are similar to the subbands from a dyadic wavelet decomposition in that coefficients in a subband represent the same frequency decomposition of an image over differing spatial locations. Furthermore, the upper-left block of DC coefficients (see Figure 6) represents a postage-stamp size overview of the entire image, much like the lowest frequency subband in a dyadic wavelet decomposition. The principal difference between the dyadic wavelet decomposition and this reorganized block transform picture is that all the subbands from block transforms are the same size, whereas in the wavelet transform, the subbands' sizes decrease by a factor of two with every additional level of the DWT performed. In other words, block transforms offer a uniform-band frequency partitioning of the input, in contrast to the octave-band frequency partitioning of the wavelet transform (see [11]).

For the DCT transform, the DC coefficient of an output block represents an average of the input block. Since adjacent image blocks often are similar, adjacent coefficients in the DC subband will be correlated. For the lapped transforms, each traditional output block is computed from a larger input block of the original image. Most of the energy in the DC coefficient of the lapped transform is from the average of the entire block. Since blocks are overlapping, some image pixels are used in more than one average and contribute their energy to adjacent coefficients in the DC subband.

C. Relation to the Wavelet Transform

With the subband organization, it becomes clear that we can also perform several levels of block transforms by recursively reapplying the block transform to the DC subband. We use the term hierarchical block transform to refer to any block transform scheme that decorrelates its DC subband by applying another transform. Note that hierarchical block transforms are similar to the levels of the DWT in a dyadic wavelet decomposition. Since the DC subband represents a small low-resolution overview of the entire image, we expect there to be significant correlation in the DC subband. Hierarchically reapplying a
block transform to the DC subband should decorrelate it further and enable better compression performance. We could continue to perform levels of the block transform as long as the lowest-frequency DC subband is not too small. Note that after every block transform step, we always reorganize the transform coefficients so that a DC subband is always present. Also note that in principle, any transform could be used to decorrelate the DC subband; in addition to the lapped transforms and the DCT, even a DWT could be used.

Another relationship between lapped transforms and the DWT is that a lapped transform can be thought of as a generalization of one level of the DWT. Recall that the output coefficients of a wavelet transform can be computed via convolution. For a k-tap wavelet transform, any one output coefficient depends on at most k consecutive input coefficients. Thus, a lapped transform can use the overlap of data points on the input to compute the convolution of the input with the wavelet filter coefficients as would be done by the DWT. In other words, the DWT can be implemented as a lapped transform. Furthermore, hierarchical lapped transforms can completely implement DWTs that use many levels. In its full generality, the hierarchical block transform has the potential to perform better than the DWT.

D. Previous Zerotree Coders

The most widespread image compression format using the DCT is the standard JPEG [23] format. It uses 8 × 8 DCT blocks.
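The reorganization of block-by-block coefficients into subbands described in Section IV-B can be sketched with a reshape and a transpose. The NumPy layout details here are our own assumptions for illustration:

```python
import numpy as np

def blocks_to_subbands(coeffs, n=8):
    # coeffs: (H, W) block-transform output stored block-by-block, with
    # H and W multiples of n.  Returns the subband view: all coefficients
    # occupying intra-block position (u, v) are gathered into one
    # (H//n, W//n) tile, and that tile sits at tile position (u, v),
    # so tile (0, 0) collects every block's DC coefficient.
    H, W = coeffs.shape
    by, bx = H // n, W // n
    t = coeffs.reshape(by, n, bx, n)   # axes: (block_row, u, block_col, v)
    t = t.transpose(1, 0, 3, 2)        # axes: (u, block_row, v, block_col)
    return t.reshape(n * by, n * bx)
```

Tagging each cell with its intra-block position and running the function makes the scattering visible: every position id ends up in its own contiguous tile.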
Xiong et al.'s embedded zerotree DCT algorithm (EZ-DCT) [9] applied the zerotree technique to the DCT-transformed coefficients of an image. Although the coefficients of a DCT transform are not naturally tree-structured, this coder showed that by imposing a somewhat arbitrary tree structure on the coefficients, reasonable performance could be achieved, certainly better than JPEG.

Malvar applied the zerotree technique to lapped transform coefficients [9], [10]. He basically used the same method as EZ-DCT, but replaced the DCT with lapped transforms. He defined an LOT transform as well as an LBT transform that were optimized for both image compression efficiency and low computational requirements. We use EZ-LOT (EZ-LBT) to refer to Xiong et al.'s embedded zerotree technique when applied to Malvar's fast version of the LOT (LBT) transforms.

Tran et al. [9], [10], [11] focused on designing the best lapped transforms for image compression, and did not consider the speed of computation to be a crucial factor. They designed several lapped transforms, including the generalized LBT (GLBT). This transform was optimized solely for good coding performance on images. Tran et al. used a hierarchical coder

Fast Approximate Energy Minimization via Graph Cuts


Many early vision problems require estimating some spatially varying quantity (such as intensity or disparity) from noisy measurements. Such quantities tend to be piecewise smooth; they vary smoothly on the surface of an object, but change dramatically at object boundaries. Every pixel p ∈ P must be assigned a label in some finite set L. For motion or stereo, the labels are disparities, while for image restoration they represent intensities. The goal is to find a labeling f that assigns each pixel p ∈ P a label f_p ∈ L, where f is both piecewise smooth and consistent with the observed data. These vision problems can be naturally formulated in terms of energy minimization. In this framework, one seeks the labeling f that minimizes the energy
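The energy in question has the standard two-term form used throughout this literature, a data term plus a smoothness term:

```latex
E(f) = E_{\text{smooth}}(f) + E_{\text{data}}(f),
\qquad
E_{\text{data}}(f) = \sum_{p \in P} D_p(f_p),
\qquad
E_{\text{smooth}}(f) = \sum_{\{p,q\} \in N} V_{p,q}(f_p, f_q),
```

where D_p measures how well label f_p fits pixel p given the observed data, N is the set of neighboring pixel pairs, and V_{p,q} penalizes label disagreement between neighbors. This is the standard decomposition for such energies; the exact notation here is our reconstruction.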
Abstract: Many tasks in computer vision involve assigning a label (such as disparity) to every pixel. A common constraint is that the labels should vary smoothly almost everywhere while preserving sharp discontinuities that may exist, e.g., at object boundaries. These tasks are naturally stated in terms of energy minimization. In this paper, we consider a wide class of energies with various smoothness constraints. Global minimization of these energy functions is NP-hard even in the simplest discontinuity-preserving case. Therefore, our focus is on efficient approximation algorithms. We present two algorithms based on graph cuts that efficiently find a local minimum with respect to two types of large moves, namely expansion moves and swap moves. These moves can simultaneously change the labels of arbitrarily large sets of pixels. In contrast, many standard algorithms (including simulated annealing) use small moves where only one pixel changes its label at a time. Our expansion algorithm finds a labeling within a known factor of the global minimum, while our swap algorithm handles more general energy functions. Both of these algorithms allow important cases of discontinuity preserving energies. We experimentally demonstrate the effectiveness of our approach for image restoration, stereo and motion. On real data with ground truth, we achieve 98 percent accuracy.

Index Terms: Energy minimization, early vision, graph algorithms, minimum cut, maximum flow, stereo, motion, image restoration, Markov Random Fields, Potts model, multiway cut.
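As an illustration of the expansion moves described in the abstract, here is a brute-force sketch with a Potts smoothness term. The real algorithm finds the optimal expansion move with a single graph cut in low-order polynomial time; the exhaustive subset search below (feasible only for a handful of pixels) merely stands in for that step, and all names and the tiny example are our own:

```python
import itertools

def potts_energy(labels, data_cost, neighbors, lam):
    # E(f) = sum_p D_p(f_p) + lam * sum_{(p,q) in N} [f_p != f_q]
    e = sum(data_cost[p][labels[p]] for p in range(len(labels)))
    e += lam * sum(labels[p] != labels[q] for p, q in neighbors)
    return e

def best_expansion(labels, alpha, data_cost, neighbors, lam):
    # Optimal alpha-expansion by exhaustive search: every pixel either
    # keeps its label or switches to alpha.
    best = list(labels)
    best_e = potts_energy(labels, data_cost, neighbors, lam)
    for mask in itertools.product((False, True), repeat=len(labels)):
        cand = [alpha if m else l for m, l in zip(mask, labels)]
        e = potts_energy(cand, data_cost, neighbors, lam)
        if e < best_e:
            best, best_e = cand, e
    return best

def expansion_minimize(labels, label_set, data_cost, neighbors, lam):
    # Cycle over labels until no expansion move lowers the energy:
    # a local minimum with respect to expansion moves.
    labels = list(labels)
    improved = True
    while improved:
        improved = False
        for alpha in label_set:
            new = best_expansion(labels, alpha, data_cost, neighbors, lam)
            if new != labels:
                labels, improved = new, True
    return labels
```

On a four-pixel 1-D "image" observed as [0, 0, 5, 5] with quadratic data costs, starting from the all-zero labeling, a single alpha = 5 expansion recovers the piecewise-constant answer with one discontinuity.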

Effect of Empagliflozin Dosage on Treatment Effect and Adverse Reactions of Type 2 Diabetes Mellitus


Effect of Empagliflozin Dosage on Treatment Effect and Adverse Reactions of Type 2 Diabetes Mellitus

ZHANG Min, JI Zhongqiu
Department of Pharmacy, Yancheng Third People's Hospital of Jiangsu Province, Yancheng, Jiangsu 224000, China

[Abstract] Objective: To analyze how different doses of empagliflozin affect clinical efficacy and the occurrence of adverse reactions in the treatment of type 2 diabetes.

Methods: The clinical data of 180 patients with type 2 diabetes treated with empagliflozin at Yancheng Third People's Hospital from January 2021 to January 2022 were retrospectively analyzed. According to dosage, patients were divided into a high-dose group (25 mg) and a low-dose group (10 mg), with 90 cases in each group.

The clinical treatment effect and the occurrence of adverse reactions were compared between the two groups.

Results: The difference in the incidence of adverse reactions between the two groups was not statistically significant (P > 0.05). After treatment, fasting blood glucose, 2-hour postprandial blood glucose, and glycated protein levels in the high-dose group were lower than those in the low-dose group, and the difference was statistically significant (P < 0.05).

Conclusion: When empagliflozin is used to treat type 2 diabetes, increasing the dose from 10 mg to 25 mg can effectively control patients' blood glucose levels without increasing the incidence of adverse reactions.

[Key words] Empagliflozin; Type 2 diabetes mellitus; Clinical treatment effect; Adverse reaction
[CLC number] R4 [Document code] A [Article ID] 1672-4062(2023)04(a)-0076-04

Effect of Empagliflozin Dosage on Treatment Effect and Adverse Reactions of Type 2 Diabetes Mellitus
ZHANG Min, JI Zhongqiu
Department of Pharmacy, Yancheng Third People's Hospital of Jiangsu Province, Yancheng, Jiangsu Province, 224000 China

[Abstract] Objective To analyze the effect of different doses of empagliflozin on the clinical efficacy and adverse reactions in the treatment of type 2 diabetes. Methods A retrospective analysis was made of the clinical data of 180 patients with type 2 diabetes who were treated with empagliflozin in Yancheng Third People's Hospital from January 2021 to January 2022. They were divided into a high-dose group (25 mg) and a low-dose group (10 mg) according to the dosage, with 90 cases in each group. Results There was no statistically significant difference in the incidence of adverse reactions between the two groups of patients (P>0.05). After treatment, the fasting blood glucose, 2-hour postprandial blood glucose, and glycated protein levels in the high-dose group were significantly lower than those in the low-dose group; the difference was statistically significant (P<0.05). Conclusion When empagliflozin is used in the treatment of type 2 diabetes, increasing the dosage from 10 mg to 25 mg can effectively control the blood glucose level of patients without increasing the incidence of adverse reactions.
[Key words] Empagliflozin; Type 2 diabetes mellitus; Clinical treatment effect; Adverse reaction

Diabetes is a chronic, lifelong metabolic disease. To control blood glucose effectively and reduce complications, most patients require lifelong medication, so as to avoid adverse effects on their health and safety [1].

Skip Sequencing: An Important Problem in Questionnaire Design


The Annals of Applied Statistics
2008, Vol. 2, No. 1, 264-285
DOI: 10.1214/07-AOAS134
© Institute of Mathematical Statistics, 2008

SKIP SEQUENCING: A DECISION PROBLEM IN QUESTIONNAIRE DESIGN

BY CHARLES F. MANSKI AND FRANCESCA MOLINARI
Northwestern University and Cornell University

This paper studies questionnaire design as a formal decision problem, focusing on one element of the design process: skip sequencing. We propose that a survey planner use an explicit loss function to quantify the trade-off between cost and informativeness of the survey and aim to make a design choice that minimizes loss. We pose a choice between three options: ask all respondents about an item of interest; use skip sequencing, thereby asking the item only of respondents who give a certain answer to an opening question; or do not ask the item at all. The first option is most informative but also most costly. The use of skip sequencing reduces respondent burden and the cost of interviewing, but may spread data quality problems across survey items, thereby reducing informativeness. The last option has no cost but is completely uninformative about the item of interest. We show how the planner may choose among these three options in the presence of two inferential problems, item nonresponse and response error.

1. Introduction. Designing a questionnaire for administration to a sample of respondents requires many decisions about the items to be asked, the wording and ordering of the questions, and so on. Considerable research has investigated the item response rates and patterns associated with alternative designs. See Krosnick (1999) for a recent review of the literature. Researchers have also called attention to the tension between the desire to reduce the costs and increase the informativeness of surveys. See, for example, Groves (1987) and Groves and Heeringa (2006).
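A toy numeric sketch of the decision problem the abstract describes: the option names follow the paper, but the linear loss and all numbers are invented purely for illustration.

```python
def choose_design(options, weight):
    # options: name -> (cost, uninformativeness), both nonnegative.
    # Loss = interviewing cost + weight * uninformativeness; pick the minimizer.
    return min(options, key=lambda k: options[k][0] + weight * options[k][1])

designs = {
    "ask_all":       (1.0, 0.1),  # most costly, most informative
    "skip_sequence": (0.6, 0.3),  # cheaper, but data-quality problems spread
    "do_not_ask":    (0.0, 1.0),  # free, completely uninformative
}
```

As the planner's weight on informativeness grows, the preferred option shifts from not asking the item, to skip sequencing, to asking everyone.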
However, survey researchers have not studied questionnaire design as a formal decision problem in which one uses an explicit loss function to quantify the trade-off between cost and informativeness and aims to make a design choice that minimizes loss. This paper takes an initial step in that direction. We consider one element of the design problem, the use of skip sequencing.

Received July 2007; revised September 2007.
[1] Supported in part by National Institute of Aging Grants R21 AG028465-01 and 5P01 AG026571-02, and by NSF Grant SES-05-49544.
[2] Supported in part by National Institute of Aging Grant R21 AG028465-01 and by NSF Grant SES-06-17482.
Key words and phrases. Skip sequencing, questionnaire design, item nonresponse, response error, partial identification.

Skip sequencing is a widespread survey practice in which the response to an opening question is used to determine whether a respondent should be asked certain subsequent questions. The objective is to eliminate inapplicable questions, thereby reducing respondent burden and the cost of interviewing. However, skip sequencing can amplify data quality problems. In particular, skip sequencing exacerbates the identification problems caused by item nonresponse and response errors.

A respondent may not answer the opening question. When this happens, a common practice is to label the subsequent questions as inapplicable. However, they may be applicable, in which case the item nonresponse problem is amplified. Another practice is to impute the answer to the opening question and, if the imputation is positive, to also impute answers to the subsequent questions. Some of these imputations will inevitably be incorrect. A particularly odd situation occurs when the answer to the opening question should be negative but the imputation is positive.
Then answers are imputed to subsequent questions that actually are inapplicable.

A respondent may answer the opening question with error. An error may cause subsequent questions to be skipped when they should be asked, or vice versa. An error of the first type induces nonresponse to the subsequent questions. The consequences of an error of the second type depend on how the respondent answers the subsequent questions, having answered the opening one incorrectly.

ILLUSTRATION 1. The 2006 wave of the Health and Retirement Study (HRS) asked current Social Security recipients about their expectations for the future of the Social Security system. An opening question asked broadly: "Thinking of the Social Security program in general and not just your own Social Security benefits: On a scale from 0 to 100 (where 0 means no chance and 100 means absolutely certain), what is the percent chance that Congress will change Social Security sometime in the next 10 years, so that it becomes less generous than now?" If the answer was a number greater than zero, a follow-up question asked "We just asked you about changes to Social Security in general. Now we would like to know whether you think these Social Security changes might affect your own benefits.
On a scale from 0 to 100, what do you think is the percent chance that the benefits you yourself are receiving from Social Security will be cut some time over the next 10 years?" If a person did not respond to the opening question or gave an answer of 0, the follow-up question was not asked.

ILLUSTRATION 2. The 1990 wave of the National Longitudinal Survey of Older Men (NLSOM) queried respondents about their limitations in activities of daily living (ADLs). An opening question asked broadly: "Because of a health or physical problem, do you ever need help from anyone in looking after personal care such as dressing, bathing, eating, going to the bathroom, or other such daily activities?" If the answer was positive, the respondent was then asked if he/she receives help from another person in each of six specific ADLs (bathing/showering, dressing, eating, getting in or out of a chair or bed, walking, using the toilet). If the answer was negative or missing, the subsequent questions were skipped out.

These illustrative uses of skip sequencing save survey costs by asking a broad question first and by following up with a more specific question only when the answer to the broad question meets specified criteria. However, nonresponse or response error to the opening question may compromise the quality of the data obtained.

This paper studies skip sequencing as a decision problem in questionnaire design. We suppose that a survey planner is considering whether and how to ask about an item of interest. Three design options follow:

Option All (A): ask all respondents the question.
Option Skip (S): ask only those respondents who respond positively to an opening question.
Option None (N): do not ask the question at all.

These options vary in the cost of administering the questions and in the informativeness of the data they yield. Option (A) is most costly and is potentially most informative. Option (S) is less costly but may be less informative if the opening question has nonresponse or response
errors. Option (N) has no cost but is uninformative about the item of interest. We suppose that the planner must choose among these options, weighing cost and informativeness as he deems appropriate. We suggest an approach to this decision problem and give illustrative applications.

The paper is organized as follows. As a prelude, Section 2 summarizes the few precedent studies that consider the data quality aspects of skip sequencing. These studies do not analyze skip sequencing as a decision problem. Section 3 formalizes the problem of choice among design options. We assume that the survey planner wants to minimize a loss function whose value depends on the cost of a design option and its informativeness. Thus, evaluation of the design options requires that the planner measure their cost and informativeness.

Suppose that a planner wants to combine sample data on an item with specified assumptions in order to learn about a population parameter of interest. When the sample size is large, we propose that informativeness be measured by the size of the identification region that a design option yields for this parameter. As explained in Manski (2003), the identification region for the parameter is the set of values that remain feasible when unlimited observations from the sampling process are combined with the maintained assumptions. The parameter is point-identified when this set contains a single value and is partially identified when the set is smaller than the parameter's logical range, but is not a single point. In survey settings with large samples of respondents, where identification rather than statistical inference is the dominant inferential problem, we think it natural to measure informativeness by the size of the identification region. The smaller the identification region, the better. Section 6 discusses measurement of informativeness when the sample size is small. Then confidence intervals for the partially identified parameter may be used to measure informativeness.
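For a mean of a function normalized to [0, 1], the identification region under worst-case knowledge of the missing data (derived formally in Section 4) has a simple closed form. The following sketch, with function names of our own choosing, illustrates how the region's width equals the nonresponse rate:

```python
# Worst-case identification region for E[g(y)] when g is normalized to [0, 1]
# and nothing is assumed about nonrespondents. Helper names are ours, not the
# paper's; this is an illustrative sketch of the partial-identification idea.

def identification_region(mean_among_respondents, response_rate):
    """Interval of feasible values for E[g(y)] given only the observables."""
    # Lower bound: every nonrespondent has g(y) = 0
    lower = mean_among_respondents * response_rate
    # Upper bound: every nonrespondent has g(y) = 1
    upper = lower + (1.0 - response_rate)
    return lower, upper

lo, hi = identification_region(0.40, 0.90)
print(round(lo, 2), round(hi, 2))  # 0.36 0.46
```

The width (here 0.10) does not depend on the observed mean, only on the nonresponse rate, which is why the paper measures informativeness by that width.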
Sections 4 and 5 apply the general ideas of Section 3 in two polar settings having distinct inferential problems. Section 4 studies cases in which there may be nonresponse to the questions posed but it is assumed that there are no response errors. We first derive the identification regions under options A, S and N. We then show the circumstances in which a survey planner should choose each option. To illustrate, we consider choice among options for querying respondents about their expectations for future personal Social Security benefits. The HRS 2006 used skip sequencing, as described in Illustration 1. Another option would be to ask all respondents both the broad and the personal question. A third option would be to ask only the broad question, omitting the one about future personal benefits.

Section 5 studies the other polar setting in which there is full response but there may be response errors. Again, we first derive the identification regions under the three design options and then show when a survey planner should choose each option. To illustrate, we consider choice among options for querying respondents about limitations in ADLs. The NLSOM used skip sequencing, as described in Illustration 2. Another survey, the 1993 wave of the Assets and Health Dynamics Among the Oldest Old (AHEAD), asked all respondents about a set of specific ADLs. A third option would be to not ask about specific ADLs at all.

Section 6 concludes by calling for further analysis of questionnaire design as a decision problem.

2. Previous studies of skip sequencing. As far as we are aware, there has been no precedent research studying skip sequencing as a decision problem in questionnaire design. Messmer and Seymour (1982) and Hill (1991, 1993) are the only precedent studies recognizing that skip sequencing may amplify data quality problems.

Messmer and Seymour studied the effect of skip sequencing on item nonresponse in a large scale mail survey. Their analysis asked whether the difficult structure of the survey, particularly the fact that
respondents were instructed to skip to other questions perhaps several pages away in the questionnaire, increased the number of unanswered questions. Their analysis indicates that branching instructions significantly increased the rate of item nonresponse for questions following a branch, and that this effect was higher for older individuals. This work is interesting but it does not have direct implications for modern surveys, where skip sequencing is automated rather than performed manually.

Hill used data from five interview/reinterview sequence pairs in the 1984 Survey of Income and Program Participation (SIPP) Reinterview Program. He examined data errors that manifest themselves through a discrepancy between the responses given in the two interviews, and categorized these discrepancies in three groups. In his terminology, a response discrepancy occurs when a different answer is recorded for an opening question in the interview and in the reinterview. A response induced sequencing discrepancy occurs when, as a consequence of different answers to the opening question, a subsequent question is asked in only one of the two interviews. A procedurally induced sequencing discrepancy occurs when, in one of the two interviews but not both, an opening question is not asked and, therefore, the subsequent question is not asked either.

Hill used a discrete contagious regression model to assess the relative importance of these errors in reducing data quality. The contagion process was used to express the idea that error spreads from one question to the next via skip sequencing. Within this model, the "conditional population at risk of contagion" expresses the idea that the number of remaining questions in the sequence at the point where the initiating error occurs gives an upper bound on the number of errors that can be induced. Hill's results suggest that the losses of data reliability caused by induced sequencing errors are at least as large as those induced by
response errors. Moreover, the relative importance of sequencing errors strongly increases with the sequence length. This suggests that the reliability of individual items will be lower, all else equal, the later they appear in the sequence.

3. A formal design problem.

3.1. The choice setting. We pose here a formal questionnaire design problem that highlights how skip sequencing may affect data quality. To focus on this matter, we find it helpful to simplify the choice setting in three major respects.

First, we suppose that a large random sample of respondents is drawn from a much larger population. This brings identification to the fore as the dominant inferential problem, the statistical precision of sample estimates receding into the background as a minor concern. We also suppose that all sample members agree to be interviewed. Hence, inferential problems arise only from item nonresponse and response errors, not from interview nonresponse.

Second, we perform a "marginalist" analysis that supposes the entire design of the questionnaire has been set except for one item. The only decision is whether and how to ask about this item. Marginalist analysis enormously simplifies the decision problem. In practice, a survey planner must choose the entire structure of the questionnaire, and the choice made about one item may interact with choices made about others. We recognize this but, nevertheless, find it useful for exposition to focus on a single aspect of the global design problem, holding fixed the remainder of the questionnaire.

Third, we assume that the design chosen for the specific item in our marginalist analysis affects only the informativeness of that item. In practice, the choice of how to ask a specific item affects the length of the entire survey, which may influence respondents' willingness or ability to provide reliable responses to other items.
We recognize this but, nevertheless, find it useful for exposition to suppose that the effect on other items is negligible.

Let y denote the item under consideration. As indicated in the Introduction, the design options are as follows:

A: ask all respondents to report y.
S: ask only those respondents who respond positively to an opening question.
N: do not ask about y at all.

The population parameter of interest is labeled τ[P(y)], where P is the population distribution of y. For example, τ[P(y)] might be the population mean or median value of y.

3.2. Measuring the cost, informativeness, and loss of the design options. The design options differ in their costs and in their informativeness about τ[P(y)]. Abstractly, let c_k denote the cost of option k, let d_k denote its informativeness, and let L_k = L(c_k, d_k) be the loss that the survey planner associates with option k. We suppose that the planner wants to choose a design option that minimizes L(c_k, d_k) over k ∈ (A, S, N).

To operationalize this abstract optimization problem, a survey planner must decide how to measure loss, cost, and informativeness. Loss presumably increases with cost and decreases with informativeness. We will not be more specific about the form of the loss function here. We will, for simplicity, use a linear form in our applications.

Cost presumably increases with the fraction of respondents who are asked the item. In some settings, cost may be proportional to this fraction. Then c_k = γ f_k, where γ > 0 is the cost per respondent of data collection and f_k is the fraction of respondents asked the item under option k. It is the case that 1 = f_A ≥ f_S ≥ f_N = 0. Hence, c_A = γ, c_S = γ f_S, c_N = 0.

As indicated in the Introduction, we propose measurement of the informativeness of a design option by the size of the identification region obtained for the parameter of interest. In general, the size of an identification region depends on the specified parameter, the data produced by a design option, and the assumptions that the planner is willing to
maintain. Sections 4 and 5 show how in some leading cases.

4. Question design with nonresponse. This section examines how nonresponse affects choice among the three design options. To focus attention on the inferential problem created by nonresponse, we assume that when sample members do respond, all answers are accurate. Section 4.1 considers identification of the parameter τ[P(y)]. Section 4.2 shows how to use the findings to choose a design. Section 4.3 uses questions on future generosity of Social Security to illustrate.

4.1. Identification with nonresponse. It has been common in survey research to impute missing values and to use these imputations as if they are real data. Standard imputation methods presume that data are missing at random (MAR), conditional on specified observable covariates; see Little and Rubin (1987). If the maintained MAR assumptions are correct, then parameter τ[P(y)] is point-identified under both of design options A and S. Option S is less costly, so there is no reason to contemplate option A from the perspective of identification. If option A is used in practice, the reason must be to provide a larger sample of observations in order to improve statistical inference.

Identification becomes the dominant concern when, as is often the case, a survey planner has only a weak understanding of the distribution of missing data. We focus here on the worst-case setting, in which the planner knows nothing at all about the missing data. It is straightforward to determine the identification region for τ[P(y)] under design options A and S. We draw on Manski [(2003), Chapter 1] to show how.

Option A. To formalize the identification problem created by nonresponse, let each member j of a population J have an outcome y_j in a space Y ≡ [0, s]. Here s can be finite or can equal ∞, in which case Y is the nonnegative part of the extended real line. The assumption that y is nonnegative is not crucial for our analysis, but it simplifies the exposition and notation.

The population is a probability
space and y: J → Y is a random variable with distribution P(y). Let a sampling process draw persons at random from J. However, not all realizations of y are observable. Let the realization of a binary random variable z_y^A indicate observability; y is observable if z_y^A = 1 and not observable if z_y^A = 0. The superscript A shows the dependence of observability of y on design option A.

By the Law of Total Probability,

(1)  P(y) = P(y | z_y^A = 1) P(z_y^A = 1) + P(y | z_y^A = 0) P(z_y^A = 0).

The sampling process reveals P(y | z_y^A = 1) and P(z_y^A), but it is uninformative regarding P(y | z_y^A = 0). Hence, the sampling process partially identifies P(y). In particular, it reveals that P(y) lies in the identification region

(2)  H^A[P(y)] ≡ [P(y | z_y^A = 1) P(z_y^A = 1) + ψ P(z_y^A = 0), ψ ∈ Ψ_Y].

Here Ψ_Y is the space of all probability distributions on Y and the superscript A on H shows the dependence of the identification region on the design option.

The identification region for a parameter of P(y) follows immediately from H^A[P(y)]. Consider inference on the parameter τ[P(y)]. The identification region consists of all possible values of the parameter. Thus,

(3)  H^A{τ[P(y)]} ≡ {τ(η), η ∈ H^A[P(y)]}.

Result (3) is simple but is too abstract to be useful as stated. Research on partial identification has sought to characterize H^A{τ[P(y)]} for different parameters. Manski (1989) does this for means of bounded functions of y, Manski (1994) for quantiles, and Manski [(2003), Chapter 1] for all parameters that respect first-order stochastic dominance. Blundell et al. (2007) and Stoye (2005) characterize the identification regions for spread parameters such as the variance, interquartile range and the Gini coefficient.

The results for means of bounded functions are easy to derive and instructive, so we focus on these parameters here. To further simplify the exposition, we restrict attention to monotone functions. Let R̄ denote the extended real line. Let g(·) be a monotone function that maps Y into R̄ and that attains finite lower and upper bounds g_0 ≡ min_{y∈Y} g(y) = g(0) and
g_1 ≡ max_{y∈Y} g(y). Without loss of generality, by a normalization, we set g_0 = 0 and g_1 = 1. The problem of interest is to infer E[g(y)]. The Law of Iterated Expectations gives

(4)  E[g(y)] = E[g(y) | z_y^A = 1] P(z_y^A = 1) + E[g(y) | z_y^A = 0] P(z_y^A = 0).

The sampling process reveals E[g(y) | z_y^A = 1] and P(z_y^A), but it is uninformative regarding E[g(y) | z_y^A = 0], which can take any value in the interval [0, 1]. Hence, the identification region for E[g(y)] is the closed interval

(5)  H^A{E[g(y)]} = [E[g(y) | z_y^A = 1] P(z_y^A = 1),  E[g(y) | z_y^A = 1] P(z_y^A = 1) + P(z_y^A = 0)].

H^A{E[g(y)]} is a proper subset of [0, 1] whenever P(z_y^A = 0) is less than one. The width of the region is P(z_y^A = 0). Thus, the severity of the identification problem varies directly with the prevalence of missing data.

Option S. There are two sources of nonresponse under option S. First, a sample member may not respond to the opening question, in which case she is not asked about item y. Second, a sample member may respond to the opening question but not to the subsequent question about item y.

Let x denote the item whose value is sought in the opening question. As in Illustrations 1 and 2, we suppose that x is a broad item and that y is a more specific one. For simplicity, we suppose here that x ∈ {0, 1} and that x = 0 ⇒ y = 0. A respondent is asked about y only if she answers the opening question and reports x = 1. For example, consider Illustration 2 discussed in the Introduction. If a respondent does not have any limitation in ADLs (x = 0), then clearly the respondent does not have a limitation in bathing/showering (y = 0). Hence, the NLSOM asks about y only when a respondent reports x = 1.

To formalize the identification problem, we need two response indicators, z_x^S and z_y^S, the superscript S showing the dependence of nonresponse on design option S. Let z_x^S = 1 if a respondent answers the opening question and let z_x^S = 0 otherwise. Let z_y^S = 1 if a respondent who is asked the follow-up question gives a response, with z_y^S = 0 otherwise. Hence, z_y^S = 1 ⇒ z_x^S = 1. This and the Law of Iterated Expectations and the fact that g(0) = 0 give

E[g(y)] = E[g(y) | x = 1] P(x = 1) + E[g(y) | x = 0] P(x = 0)
        = E[g(y) | x = 1, z_y^S = 1] P(z_y^S = 1, x = 1)
        + E[g(y) | x = 1, z_x^S = 1, z_y^S = 0] P(z_x^S = 1, z_y^S = 0, x = 1)
        + E[g(y) | x = 1, z_x^S = 0] P(z_x^S = 0, x = 1).

The sampling process reveals E[g(y) | x = 1, z_y^S = 1], P(z_x^S = 1, z_y^S = 0, x = 1), and P(z_y^S = 1) = P(z_y^S = 1, x = 1), where the last equality holds because z_y^S = 1 ⇒ x = 1. The data are uninformative about E[g(y) | x = 1, z_x^S = 1, z_y^S = 0] and E[g(y) | x = 1, z_x^S = 0], which can take any values in [0, 1]. The data are partially informative about P(z_x^S = 0, x = 1), which can take any value in [0, P(z_x^S = 0)]. It follows that the identification region for E[g(y)] is the closed interval

(6)  H^S{E[g(y)]} = [E[g(y) | z_y^S = 1] P(z_y^S = 1),  E[g(y) | z_y^S = 1] P(z_y^S = 1) + P(z_x^S = 1, z_y^S = 0, x = 1) + P(z_x^S = 0)].

Thus, the severity of the identification problem varies directly with the prevalence of nonresponse to the opening question and to the follow-up question in the subpopulation in which it is asked.

4.2. Choosing a design. Now consider choice among the three design options (A, S, N). The widths of the identification regions for E[g(y)] under these options are as follows:

d_A = P(z_y^A = 0),
d_S = P(z_x^S = 1, z_y^S = 0, x = 1) + P(z_x^S = 0),
d_N = 1.

For specificity, let the loss function have the linear form L_k = γ f_k + d_k. The first component measures survey cost and the second measures the informativeness of the design option. We set the coefficient on d_k equal to one as a normalization of scale. The parameter γ measures the importance that the survey planner gives to cost relative to informativeness. There is no universally "correct" value of this parameter. Its value is something that the survey planner must specify, depending on the survey context and the nature of item y.

It follows from the above and from the derivations of Section 4.1 that the losses associated with the three design options are as follows:

L_A = γ + P(z_y^A = 0),
L_S = γ P(z_x^S = 1, x = 1) + P(z_x^S = 1, z_y^S = 0, x = 1) + P(z_x^S = 0),
L_N = 1.

Thus, it is optimal to administer item y to all sample members if

γ + P(z_y^A = 0) ≤ min{1, γ P(z_x^S = 1, x = 1) + P(z_x^S = 1, z_y^S = 0, x = 1) + P(z_x^S = 0)}.

Skip sequencing is optimal if

γ P(z_x^S = 1, x = 1) + P(z_x^S = 1, z_y^S = 0, x = 1) + P(z_x^S = 0) ≤ min{1, γ + P(z_y^A = 0)}.

If neither of these inequalities holds, it is optimal not to ask the item at all.

Determination of the optimal design option requires knowledge of the response rates that would occur under options A and S. This is where the body of survey research reviewed by Krosnick (1999) has a potentially important role to play. Through the use of randomized experiments embedded in surveys, researchers have developed considerable knowledge of the response rates that occur when various types of questions are posed to diverse populations. In many cases, this body of knowledge can be brought to bear to provide credible values for the response rates that determine loss under options A and S.

When the literature does not provide credible values for these response rates, a survey planner may want to perform his own pretest, randomly assigning sample members to options A and S. The size of the pretest sample only needs to be large enough to determine with reasonable confidence which design option is best. It does not need to be large enough to give precise estimates of the response rates.

4.3. Questioning about expectations on the generosity of Social Security. Consider the questions on expectations for the future generosity of the Social Security program cited in Illustration 1. The opening question was posed to 10,748 respondents to the 2006 HRS who currently receive Social Security benefits, and the follow-up was asked of the sub-sample of 9356 persons who answered the opening question and gave a response greater than zero. We assume here that the only data problem is nonresponse. The nonresponse rate to the opening question was 7.23%.
The nonresponse rate to the follow-up question, for the subsample asked this question, was 2.27%. It is plausible that someone may not be willing to respond to the first question and yet be willing to respond to the second one. In particular, this would happen if a person does not want to speculate on what Congress will do but, nevertheless, is sure that if Congress does act, it would only change benefits for future retirees, not for those already in the system. The HRS use of skip sequencing prevents observation of y in such cases.

To cast this application into the notation of the previous section, we let x = 1 if a respondent places a positive probability on Congress acting, with x = 0 otherwise. The rest of the notation is the same as above.

An early release of the HRS data provides these empirical values for the quantities that determine the identification region for E[g(y)] and loss under design option S:

P(z_x^S = 1, z_y^S = 0, x = 1) = 0.0197,  P(z_x^S = 1, x = 1) = 0.8705,
P(z_y^S = 1) = 0.8508,  P(z_x^S = 0) = 0.0723,  E[g(y) | z_y^S = 1] = 0.4039,

where g(y) ≡ y/100. Hence, the identification region for E[g(y)] under option S is H^S{E[g(y)]} = [0.3436, 0.4356] and loss is L_S = 0.8705γ + 0.0920.

The HRS data do not reveal the quantities that determine the identification region for E[g(y)] and loss under design option A. For this illustration, we conjecture that the mean response to item y that would be obtained under option A equals the mean response that is observed under option S. Thus, E[g(y) | z_y^A = 1] = 0.4039. We suppose further that the nonresponse probability would be P(z_y^A = 0) = 0.08. Then the identification region for E[g(y)] under option A is H^A{E[g(y)]} = [0.3716, 0.4516] and loss is L_A = γ + 0.08.

It follows from the above that it is optimal to administer item y to all sample members if γ ≤ 0.0927. Skip sequencing is optimal if 0.0927 ≤ γ ≤ 1.0431. If neither of these inequalities holds, it is optimal not to ask the item at all.

5. Question design with data errors. This section examines how response errors affect choice among the three design
options. To focus attention on the inferential problem created by such errors, we assume that all sample members respond to the questions posed. Section 5.1 considers identification. Section 5.2 shows how to use the findings to choose a design. Section 5.3 uses questions on limitations in ADLs to illustrate.

5.1. Identification with response errors. Section 4 showed that assumptions about the distribution of missing data are unnecessary for partially informative inference in the presence of nonresponse. In contrast, assumptions on the nature or prevalence of response errors are a prerequisite for inference. In cases where y is discrete, it is natural to think of data errors as classification errors. We conceptualize response error here through a misclassification model previously used
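Returning to the nonresponse analysis, the Section 4.2 decision rule and the Section 4.3 HRS numbers can be checked mechanically. The sketch below uses function and variable names of our own choosing (not the paper's), plugging in the reported probabilities:

```python
# Loss comparison for the three design options under nonresponse (Section 4.2),
# with the linear loss L_k = gamma * f_k + d_k. Names are our own shorthand.

def losses(gamma, p_zA0, p_zx1_x1, p_zx1_zy0_x1, p_zx0):
    """Return {option: loss} for A (ask all), S (skip sequence), N (omit).

    p_zA0        : P(z_y^A = 0), nonresponse under option A
    p_zx1_x1     : P(z_x^S = 1, x = 1), fraction asked the follow-up under S
    p_zx1_zy0_x1 : P(z_x^S = 1, z_y^S = 0, x = 1), follow-up nonresponse
    p_zx0        : P(z_x^S = 0), nonresponse to the opening question
    """
    return {
        "A": gamma * 1.0 + p_zA0,                      # f_A = 1, d_A = P(z_y^A = 0)
        "S": gamma * p_zx1_x1 + p_zx1_zy0_x1 + p_zx0,  # f_S and the width of region (6)
        "N": 1.0,                                      # f_N = 0, d_N = 1
    }

def best_option(gamma, **rates):
    ls = losses(gamma, **rates)
    return min(ls, key=ls.get)

# Section 4.3 HRS illustration: reported (and, for option A, conjectured) values
hrs = dict(p_zA0=0.08, p_zx1_x1=0.8705, p_zx1_zy0_x1=0.0197, p_zx0=0.0723)

# Identification region under option S: lower bound plus width
lb = 0.4039 * 0.8508
width_S = hrs["p_zx1_zy0_x1"] + hrs["p_zx0"]
print(round(lb, 4), round(lb + width_S, 4))      # 0.3436 0.4356

# Gamma thresholds separating the optimal options
gamma_AS = (width_S - hrs["p_zA0"]) / (1 - hrs["p_zx1_x1"])  # A vs. S indifference
gamma_SN = (1 - width_S) / hrs["p_zx1_x1"]                   # S vs. N indifference
print(round(gamma_AS, 4), round(gamma_SN, 4))    # 0.0927 1.0431

print(best_option(0.05, **hrs), best_option(0.5, **hrs), best_option(1.2, **hrs))  # A S N
```

The computed thresholds match the paper's reported values: ask all respondents when γ ≤ 0.0927, skip sequence for 0.0927 ≤ γ ≤ 1.0431, and omit the item for larger γ.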
