
ECONOMIC STUDIES DEPARTMENT OF ECONOMICS SCHOOL OF ECONOMICS AND COMMERCIAL LAW GÖTEBORG UNIVERSITY 119 _______________________ STUDIES ON THE POST-COMMUNIST TRANSITION Violeta Piculescu


ECONOMIC STUDIES DEPARTMENT OF ECONOMICS

SCHOOL OF ECONOMICS AND COMMERCIAL LAW GÖTEBORG UNIVERSITY

119

_______________________

STUDIES ON THE POST-COMMUNIST TRANSITION

Violeta Piculescu

Studies on the Post-Communist Transition

Violeta Piculescu PhD Thesis

Department of Economics Göteborg University

October 2002

This work is dedicated to my parents, and to my brother Marius

Acknowledgements

I am indebted to my supervisor, Professor Douglas Hibbs, for teaching me the discipline of self-sustaining research. Whether I have been a good apprentice, only the future can really tell. Professor Hibbs has been instrumental at all stages of my work.

The studies included in the thesis bear many imprints of our numerous discussions on my research interests. However, any inconsistencies that the reader may find in the thesis are entirely mine.

I am also grateful to Professor Lennart Hjalmarsson in many respects. I recall the days when, as an undergraduate student of the Faculty of Economic Cybernetics, ASE Bucharest, I was attending the lectures of the Swedish professor on topics in Microeconomics. Little was I to know that it was going to be in his country where I’d significantly build on the foundations of my professional and personal life. Professor Hjalmarsson proved an excellent advisor and a valuable supporter of my work here in Sweden. When you owe so much to a person there are hardly any words that can accurately reflect the real extent of the associated gratitude.

Time and energy are scarce resources to us all, and yet Wlodeck Bursztyn proved so generous in spending his time and energy on my writings. The opinions he expressed and the various debates we had, often in a genuine forceful Eastern European style, influenced the analyses in this thesis in many respects. I have been very fortunate to have him as a colleague in the Department of Economics, here in Göteborg.

The work on this thesis heavily relied on the professional skills that had been nurtured by my mentors in the Department of Economic Cybernetics of ASE Bucharest. First my teachers, and later my colleagues in the academic world, Professor Dumitru Marin, Professor Vasile Nica, Professor Gheorghe Oprescu, Professor Emil Scarlat, and Professor Eugen Tiganescu laid the bedrock of my professional development. I am indebted to them all, and the thought of all of the colleagues I had in Bucharest warms my heart.

I also benefited from the support of other people I met here in Nationalekonomi. I thank Professor Arne Bigsten for helping with some of the data employed in this thesis. I am grateful to Eva Jonason for administrative support and the kind touch she provided in such a consistent manner. Further thanks I express in relation to the support I received from Eva-Lena Neth and Gunilla Leander.

It will soon be time for me to move further and in doing so I take with me the memory of the good times I owe to Dana Andren, Anke Brederlau, Sorin Maruster and Dana Roughsedge. There is yet another friend with whom I shared many of the first experiences in Sweden and in the school in Göteborg, and it is with sadness that I write that he is not among us anymore. He once engaged in the journey of a PhD program like all of us did, but he did not have a chance to complete it. Emi Lazar was a very gifted person and a good friend.

Joys and frustrations are best carried when shared with the ones closest to your heart.

I’ve drawn my inspiration and strength from a dear friend who spends a lot of time and energy in discussing my ideas and in encouraging me in my weak moments. My thoughts are now with Saurabh, my beloved “Indian summer”, whose love and witty spirit brighten up the winters of my soul.

While doing this work I benefited from the financial support extended by the Department of Economics in Göteborg University, and by the Jan Wallander and Tom Hedelius Foundation, Sweden.

Violeta Piculescu Göteborg, October 2002

Preface

The studies in this thesis will engage the reader in a journey with three main destinations: 1/ the issue of measurement of the economic and institutional features of the transition process in the post-communist countries in Eastern Europe and Eurasia; 2/ the empirical analysis of factors that can, at least partly, explain the relative economic success of the transition process in these countries; 3/ a theoretical model that links the quality of institutions to tax evasion and bureaucratic corruption. The first two studies in the thesis are empirical, while the third study is purely theoretical1.

Study 1 is the result of the frustration I experienced when I first attempted to collect and employ indicators that relate to the economic and institutional dimensions of the transition process. While I found that there is a wealth of such indicators, the empirical handling of a large set of measures proved challenging. The objective in Study 1 is therefore to introduce and employ the method of Confirmatory Factor Analysis (CFA) in order to estimate latent factors that summarize and reflect the complementary nature of specific dimensions of transition. Such specific dimensions are considered in terms of initial conditions, economic reforms and institutions. The empirical analysis in the study focuses on 25 transition economies in Eastern Europe and Eurasia, and on the transition period until 2001. The CFA method has the advantage that it supports the empirical testing of the hypothesis that a given set of observed indicators combines into a more abstract theoretical construct in a consistent manner. My analysis in the study produces two main categories of results. First, there are successful cases in which CFA applied to observed measures produces reliable summarizing indicators in the form of latent factors. This is the case for the latent factors constructed for time-invariant initial conditions, liberalization of relative prices, reforms of state-owned enterprises, reforms in the financial system, and the summarizing indicator of political environment. Second, there are also cases in which the analysis indicates that aggregation of a given set of observed measures is not warranted by the structure of the sample data. This situation is encountered when an attempt is made to estimate a latent factor of initial structural economic imbalances. Empirical difficulties that can occur in the estimation of latent factors are also illustrated. In instances with acute problems of missing data, the possibility of constructing empirical measures of more abstract concepts is impaired. This is despite the fact that the associated CFA measurement model indicates a potentially high validity of the measures employed. I encounter this situation when I analyze the possibility of constructing latent factors of state governance (laws and regulations, and political interference in business activities).

Having obtained a set of reliable summarizing indicators of, at least, some of the dimensions of the post-communist transition process, I then turned to the objective of the analysis of determinants of the aggregate economic performance in these countries, as observed until year 2000.

In Study 2, the overriding objective is to disentangle the relative roles that initial conditions and the process of reforms and institution building specific to the transition period played in supporting and/or hindering economic activities in the post-communist countries. There are two main defining features of the analysis that I develop. First, in the spirit of the theoretical literature of Optimal Speed of Transition, I analyze the issue that observed aggregate growth is the net result of the expansion of the private sector and of the collapse of the state sector. For this purpose I decompose aggregate growth rates into growth in value added in the private sector and the corresponding developments in the state sector. I find that even among the most advanced transition economies there are significant differences in terms of the way in which developments in the two sectors combine into aggregate economic performance. The second main feature of the current analysis is the focus on interactions among the determinants of economic performance of the two sectors. I introduce and employ the method of Path Analysis in order to analyze an empirical simultaneous equation model that connects initial conditions, reforms, institutions, and growth in the private and the state sectors.

1 This thesis is included, in electronic format, in the database ‘Doctoral Theses from Göteborg University’, accessible at http://www.ub.gu.se/Gdig/dissdatabas/index.html

The advantage of the method of path analysis is that it enables the empirical analysis of direct, indirect and total effects of factors included in the model. This empirical set-up allows me to qualify previous empirical findings in several respects, and also to analyze empirically new hypotheses that have not been tested before. In line with my predecessors, I find sizeable total effects of initial conditions on economic developments in the state and private sectors. However, the effects of initial conditions on growth in the private sector appear to be more of an indirect nature, as they are mediated by reforms and by the process of institution building. On the non-linear effect of reforms, I also find that reforms have positive total effects on activities in the private sector and negative total effects on the state sector. In terms of direct effects, I obtain that reforming the state-owned enterprises and the financial system had a significant influence on the observed activities in the private sector. In this respect I estimate that the largest direct effect is associated with changes in the regulatory and economic conditions in the financial sector. However, price liberalization appears to have only an indirect (although sizeable) effect on the performance of the private sector, mainly due to its influence on the other types of reforms. As regards developments in the state sector, I find significant negative direct effects on growth in the sector for all the economic reforms considered in the analysis. In this respect the largest direct effect is associated with cumulative past changes in reforms in the financial system. The endogenous nature of the reform process is mediated in the model by the role of institutions built during transition. The sample data that I employ support the hypothesis of endogenous institutions. I find significant direct effects from the expansion in the private sector to contemporaneous changes in the political environment. Furthermore, weak economic growth in both sectors (private and state) relates to increases in (international observers’) perceptions of pervasive corruption in transition countries. Changes in the institutional environment are further propagated in the system via their feedback effects on the reform process.

In the aftermath of the empirical endeavor in Study 2 I got to understand better that there is more to the process of economic transition than an analysis at the aggregate level of initial conditions, reforms and institutions can reveal. I then focused on the empirical evidence that we have at the firm level in the transition literature, as provided by survey studies. My specific interest related to the role of institutions in enhancing the economic success of transition, and to adverse developments in terms of flourishing shadow activities and pervasive corruption in some of the post-communist countries. In doing so I realized that, in terms of our understanding of the effects of institutions and corruption on economic activities, we would benefit greatly if we combined related empirical results with adequate theoretical models. I therefore engaged in the adventure of building a theoretical model that formalizes the links among institutions, unofficial activities and bureaucratic corruption.

In Study 3 I emphasize that there are two main hypotheses that can support the empirical observation that the volume of the shadow activities is positively correlated with the levels of corruption in transition economies. The first hypothesis presumes that excessive government regulations and the associated bureaucratic corruption complement taxation as factors that push private firms away from the official sector and into the unofficial economy. This hypothesis is formalized in the theoretical model of Friedman, Johnson et al. (2000). The second hypothesis stands on the belief that, as a result of their illegal activities, private firms that engage in unofficial activities also need to engage in corrupting public officials. This is the cornerstone of the theoretical analysis in Study 3. In a partial equilibrium framework, I find that the effect of taxation on the extent of firms’ participation in the unofficial sector is best interpreted if considered in connection with two other aspects: the benefits that firms extract from their legal activities, and the factors that facilitate activities in the shadow sector. In a business environment characterized by a well-functioning financial system, and with high quality of public goods such as contract enforcement and protection of property, firms may be willing to tolerate higher levels of taxes without necessarily migrating to the underground sector. Firms’ incentives to be present in the official sector also relate to low incentives that the bureaucrats may have to engage in corrupt deals with non-compliant firms. However, in economic environments with weak institutions and/or with poorly motivated bureaucrats even low levels of taxes may prove high enough to strengthen the temptation to undertake shadow activities.
I find that when circumstances are such that activities in the underground are profitable, a government that experiences vanishing tax revenues should concentrate on enticing the non-compliant firms to be active in the official sector, rather than attempt to squeeze more taxes from the existing official activities. Policy implications related to incentives that the bureaucrats may have to engage in corrupt deals with non-compliant firms are also analyzed.

Finally, the reader should be aware of the fact that I now regard the analysis in Study 3 more as a starting point rather than a destination in itself.

Contents

Study 1: Dimensions of transition in the post-communist countries: a latent variable approach

Introduction
Part I: Latent Variable Methodology: Concepts, Methods and Empirical Issues
I.1 Definitions of latent variables and modelling approaches
I.2 Theoretical fundamentals of CFA Measurement Models and empirical issues
Part II: Latent Dimensions of Transition in the Post-Communist Countries in Central and Eastern Europe, and Eurasia
Introduction to Part II
II.1 Brief comments on data
II.2 Latent factors of initial conditions
II.3 Latent factors of reforms
II.4 Latent factors of state governance and political environment
II.5 Summary of results and comparisons of the latent factors
Appendix I Summary of variables, data definitions and sources
Appendix II Factors of initial conditions
Appendix III Factors of reforms
Appendix IV Factors of state governance and political environment
Appendix V Estimated scores of the latent factors
References

Study 2: Effects of initial conditions, reforms and institutions on economic developments during transition: a path analysis approach

Introduction
I. Introduction to path analysis
II. Stylized facts of transition
III. Review of the theoretical and empirical literature
IV. Empirical analysis of economic growth in transition
IV.1 Decomposition of aggregate growth
IV.2 An empirical model of transition
IV.3 Some limitations of the analysis
V. Conclusions
Appendix I Summary of variables and data sources
Appendix II Calculated private and state growth rates
Appendix III The Base Model
Appendix IV Adding price, foreign exchanges and trade liberalization
Appendix V Adding financial sector reforms
Appendix VI Adding the judicial system and property rights protection
Appendix VII Introducing the political environment
Appendix VIII Introducing perceptions of corruption
References

Study 3: Hidden activities and bureaucratic corruption

Introduction
I. Typologies and related models
II. The Model
II.1 The firm’s problem
II.2 The bureaucrat’s problem
II.3 Comparative statics analysis
III. Summary of results
IV. Discussion on assumptions and possible extensions
Mathematical Appendix
References

STUDY 1

Dimensions of Transition in the Post-Communist Countries: A Latent Variable Approach

Abstract

While there is a wealth of indicators that reflect specific features of the reform strategies and of the process of institution building implemented in the post-communist countries during the transition period, empirical handling of a large set of such measures proves challenging. The objective in this study is to introduce and employ the method of Confirmatory Factor Analysis (CFA) in order to estimate latent factors that summarize and reflect the complementary nature of specific dimensions of transition. Such specific dimensions are considered in terms of initial conditions, economic reforms and institutions. The empirical analysis in the study focuses on 25 transition economies in Eastern Europe and Eurasia, and on the transition period until 2001. The CFA method that I employ has the advantage that it supports the empirical testing of the hypothesis that a given set of observed indicators combines into a more abstract theoretical construct in a consistent manner. The current analysis produces two main categories of results. First, there are successful cases in which the CFA method applied to specific observed measures produces reliable summarizing indicators in the form of latent factors. This is the case for the latent factors constructed for time-invariant initial conditions, liberalization of relative prices, reforms of state-owned enterprises, reforms in the financial system, and the summarizing indicator of political environment. Second, there are also cases in which the analysis indicates that aggregation of a given set of observed measures is not warranted by the structure of the sample data. This situation is encountered when an attempt is made to estimate a latent factor of initial structural economic imbalances. Empirical difficulties that can occur in the estimation of latent factors are also illustrated. In instances with acute problems of missing data, the possibility of constructing empirical measures of more abstract concepts is impaired. This is despite the fact that the associated CFA measurement model indicates a potentially high validity of the measures employed. I encounter this situation when I analyze the possibility of constructing latent factors of state governance (laws and regulations, and political interference in business activities).

Keywords: transition, latent variable, confirmatory factor analysis, structural equation modeling, initial conditions, reforms, institutions, democracy, Eastern Europe, Eurasia

Dimensions of transition in the post-communist countries: a latent variable approach

INTRODUCTION

Economic policy debates on the transition process in the post-communist countries in Eastern Europe and Eurasia are usually conducted in terms of general concepts such as economic liberalization, structural and regulatory reforms of the productive and the financial sectors, state governance and political institutions. Agendas of empirical research parallel such debates by attempting to disentangle the relative roles of initial conditions, reforms and institutions in spurring economic growth in transition economies. The relative success of the research efforts is often driven by the methodology they employ in order to summarize the multitude of existing measures of specific aspects of the process of economic reforms and institution building into more aggregate indicators, such that they accurately reflect the concerted policy efforts invested in economic and institutional change during the transition process.

While there is a wealth of measures that reflect specific features of the reform strategies and institution building during transition, empirical handling of a large set of such measures proves challenging. Studies that attempt to employ them as separate independent variables in a multivariate regression analysis framework invariably mention that the estimated results are affected by problems of empirical multicollinearity.

In order to circumvent the empirical problems induced by the use of a large set of explanatory variables we need to find ways to summarize the information included in separate indicators into more aggregate indicators, while still preserving their informational content. The objective in this study is to introduce the method of Confirmatory Factor Analysis (CFA) that can be efficiently used for constructing empirical measures of more abstract theoretical concepts. I employ the CFA method in order to estimate aggregate constructs that summarize and reflect the complementarity of various measures of initial conditions at the start of transition, and of the process of economic reforms and institution building in the post-communist countries. The empirical analysis in the study focuses on 25 transition economies in Eastern Europe and Eurasia, and on the period starting with the first year of transition in each country and until year 2001.

The method of Confirmatory Factor Analysis is covariance based and it is currently employed in the fields of psychology, sociology, political science, and to a very limited extent in economics, for the analysis of latent variables. A latent variable can be conceived as a multidimensional theoretical construct for which only partial reflections can be measured. The method of Confirmatory Factor Analysis can help test statistically whether a given set of observed indicators consistently combine into a more abstract theoretical concept. Based on CFA models we can therefore learn whether the aggregation of the observed measures in a summarizing indicator is warranted by the structure of the sample data.
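To make the phrase ‘covariance based’ more concrete, the following minimal Python sketch may help. It assumes the simplest possible measurement model, which is not necessarily the specification estimated later in this study: a single latent factor with unit variance, three observed indicators, and uncorrelated measurement errors. The loadings, error variances and function names are hypothetical and chosen only for illustration. Under these assumptions the covariance between two indicators equals the product of their loadings, so the loadings can be recovered from the covariances alone.

```python
import math

# Hypothetical loadings and error variances (illustration only, not thesis data).
LOADINGS = [0.9, 0.7, 0.5]
ERROR_VARS = [0.19, 0.51, 0.75]


def implied_covariance(loadings, error_vars):
    """Covariance matrix implied by a one-factor model with unit factor variance."""
    k = len(loadings)
    return [
        [loadings[i] * loadings[j] + (error_vars[i] if i == j else 0.0)
         for j in range(k)]
        for i in range(k)
    ]


def loadings_from_covariance(cov):
    """Recover three loadings from the off-diagonal covariances.

    Assumes positive loadings, so the positive square root is taken.
    """
    first = math.sqrt(cov[0][1] * cov[0][2] / cov[1][2])
    return [first, cov[0][1] / first, cov[0][2] / first]


cov = implied_covariance(LOADINGS, ERROR_VARS)
recovered = loadings_from_covariance(cov)
print("implied covariance matrix:", cov)
print("recovered loadings:", recovered)
```

With exactly three indicators the model reproduces any covariance matrix of this kind. With four or more indicators it imposes restrictions on the covariances, and it is restrictions of this type that make statistical testing possible.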

The existing literature on transition economies already includes empirical efforts that focused on the possibilities to summarize information comprised in a given (usually large) set of variables into more aggregate indicators. A representative example is the study of De Melo, Denizer et al. (2001) that pioneered the introduction of aggregate indicators of initial conditions in the literature on transition. The authors employ the method of principal component analysis in order to extract principal components that summarize information comprised in the given observed measures of initial conditions.

In the factor analysis literature I find that there is wide acceptance of the fact that, while at the conceptual level the principal components are assimilated to latent variables, the design of the method proves deficient for the construction of latent variables. The shortcomings of the principal components method are discussed in the present study when theoretical fundamentals of Confirmatory Factor Analysis are introduced.

An alternative approach currently employed in order to obtain summarizing indicators consists in the use of subjective weights assigned to the component measures that are to be summarized. In this study I argue that this approach suffers from lack of transparency and limited scope for empirical replication.

Furthermore, a common point that applies to both approaches mentioned above is that they do not allow for statistical testing of the hypothesis that the given set of observed dimensions actually combine in a common summarizing factor in a consistent manner. If the hypothesis is not supported by the data, then their aggregation carries the risk of distorting their informational potential.

The analysis in this study produces two categories of results. First, there are successful cases in which Confirmatory Factor Analysis, applied to observed measures of fixed initial conditions and of various aspects of implemented reforms and the political environment in transition economies, produces reliable summarizing indicators in the form of latent factors. Second, there are also cases in which Confirmatory Factor Analysis indicates that aggregation of a given set of observed variables is not warranted. This situation is encountered when an attempt is made to estimate initial structural economic imbalances at the beginning of transition. Empirical difficulties in estimating latent factors are also illustrated for cases when acute problems with missing data limit the possibility to construct an empirical measure for a latent factor, despite the fact that the associated CFA model indicates a potentially high validity of the measures employed.

The current study is divided into two main parts. Part I is dedicated to the theoretical and empirical aspects related to the method of Confirmatory Factor Analysis. In Section I.1 I discuss various existing definitions for the concept of latent variables. Section I.2 provides the theoretical intuition behind the method of CFA. Empirical problems expected to occur in practical applications of the CFA method are also discussed in the section. Part II of the study concentrates on some possible applications of the CFA method in the case of transition economies. After including brief initial comments on data preparation in Section II.1, I organize the subsequent sections according to the groups of latent factors constructed in the analysis. Section II.2 focuses on latent factors of initial conditions that characterized transition economies in the aftermath of the demise of the communist regime; Section II.3 includes models of latent factors of reforms, with an emphasis on three main types of reforms: liberalization of relative prices, reforms in the enterprise sector and reforms in the financial sector. Section II.4 includes the analysis of latent factors of institutions in transition, in the form of latent factors of state governance, corruption and political environment. Section II.5 summarizes the results and concludes.

PART I: Latent Variable Methodology: Concepts, Methods and Empirical Issues

I.1 Definitions of latent variables and modeling approaches

Ever since the dawn of philosophy the human mind has striven to define ideal concepts for which only partial reflections exist in reality. We often exercise our imagination to identify dimensions of conceptual constructs such as happiness and freedom around which we center our existence. In the less philosophical realm of economics, concepts of development and economic freedom are often employed in order to characterize and compare alternative modes of economic organization. Whether we are aware of it or not, we operate with latent concepts both in theory and in empirical analyses whenever we attempt to describe and generalize relationships among classes of events with a higher degree of abstraction. This is not always apparent due to the lack of consensus on what a latent variable actually is. Bollen (2002) summarizes the various attempts to define latent variables into two broad categories: non-formal definitions and formal definitions.

The non-formal definitions of latent variables range from considering latent variables as hypothetical constructs (purely imaginative constructs with little correspondence to reality) to more concrete views that consider latent variables as simply variables that are unobservable or non-measurable, or as devices employed for data reduction.

The definition that addresses the possibility of measuring a latent variable is basically the definition we operate with in empirical economics most of the time. Given a theoretical concept with a broad definition domain, say economic freedom, we try to identify its possible dimensions that can be measured in reality and then approximate the latent construct by aggregating the information across the identified dimensions. The simplest method of aggregation, which often has little theoretical justification, is a simple average of the observed scores for the dimensions of the concept. More elaborate approaches use different subjective weights for the components, but even in that situation there is little transparency and limited possibility of replication of the method employed to identify the weights.

For the theoretical constructs with a more limited definition domain, the measurement definition of a latent variable translates into a formal one stating that even though the concept potentially has a direct representation in reality, we are not able to observe it accurately. This is the ‘expected value’ definition of a latent variable employed in the classical test theory. The latent concept is referred to as a ‘true score’ that would be obtained if we were able to perform an infinite number of replicated experiments and obtain the mean of the observed results. The formal representation of this definition of a latent concept is the following:

$$x_i = X_i^{*} + \varepsilon_i$$

where:

$x_i$ = the observed indicator of the latent factor

$X_i^{*}$ = the ‘true score’, which is not observable

$\varepsilon_i$ = measurement error

The true score latent variable model relies on the following assumptions: the scale of the latent factor is defined by the expected value of the observed indicator; the measurement errors are not correlated with the latent factor; and the true score has a proportionally direct effect on the observed measure, but the observed measure does not have any (direct or indirect) effect on the true score.
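The ‘expected value’ reading of the true score can be illustrated with a minimal Python simulation. The sketch assumes zero-mean Gaussian measurement errors, and the numerical values (a true score of 3.0 and an error standard deviation of 1.0) as well as the function names are hypothetical; none of them comes from the data used in this study. It only shows that the mean of replicated measurements approaches the true score as the number of replications grows.

```python
import random
import statistics

TRUE_SCORE = 3.0  # hypothetical value of the unobservable true score X*
ERROR_SD = 1.0    # hypothetical standard deviation of the measurement error


def observed_scores(true_score, error_sd, n_replications, seed=0):
    """Draw replicated measurements x = X* + eps with zero-mean Gaussian errors."""
    rng = random.Random(seed)
    return [true_score + rng.gauss(0.0, error_sd) for _ in range(n_replications)]


# The mean of a few replications is a noisy estimate of the true score;
# the mean of many replications is close to it.
mean_few = statistics.fmean(observed_scores(TRUE_SCORE, ERROR_SD, 10))
mean_many = statistics.fmean(observed_scores(TRUE_SCORE, ERROR_SD, 100_000))

print("mean of 10 replications:     ", mean_few)
print("mean of 100,000 replications:", mean_many)
```

In empirical work we usually have one or very few measurements per unit, which is why the measurement error cannot simply be averaged away and needs to be modeled explicitly.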

In a classical regression analysis, we assume that the observed measures are accurate representations of the theoretical concepts, with no measurement errors. This is a widely recognized problem in that, if not true, it adversely affects the estimated results1.
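A minimal sketch of one well-known consequence, under assumptions that are mine and not taken from the references cited here: with a single regressor and measurement error that is uncorrelated with both the true regressor and the regression disturbance, the OLS slope is biased toward zero by the factor var(X*) / (var(X*) + var(ε)). The simulated values below are hypothetical (a true slope of 2.0 and unit variances for both the true regressor and its measurement error, so the factor is 0.5).

```python
import random


def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(42)
N = 50_000   # hypothetical sample size
BETA = 2.0   # hypothetical true slope

true_x = [rng.gauss(0.0, 1.0) for _ in range(N)]          # unobserved true score
y = [BETA * t + rng.gauss(0.0, 0.5) for t in true_x]      # outcome
noisy_x = [t + rng.gauss(0.0, 1.0) for t in true_x]       # observed indicator

slope_true = ols_slope(true_x, y)    # regression on the true score
slope_noisy = ols_slope(noisy_x, y)  # regression on the error-ridden indicator

print("slope using the true score:        ", slope_true)
print("slope using the observed indicator:", slope_noisy)
print("theoretical attenuated slope:      ", BETA * 1.0 / (1.0 + 1.0))
```

With several regressors measured with error the direction of the bias is no longer as easy to sign, which is part of the motivation for modeling measurement errors explicitly.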

An alternative formal definition of complex latent variables is introduced in Bentler (1982) in the context of linear systems. As cited in Bollen (2002), ‘a variable in a linear structural equation system is a latent variable if the equations [in the system] cannot be manipulated so as to express the variable as a function of manifest variables only’. Called ‘a non-deterministic function of observed variables definition’, this definition envisions a multi-dimensional latent construct as a linear combination of some observed measures, while assuming we have limited possibilities to fully measure all the relevant dimensions. The definition is less restrictive in that it allows for correlated measurement errors and the observed indicators are allowed to directly or indirectly influence each other. Models formulated accordingly allow us to estimate a predicted value for the latent variables, while asserting that an exact prediction is next to impossible to obtain.

1 See Greene (1997) and Bollen (1989) for extensive discussions on the consequences of ignoring the measurement errors in regression analysis.

A more practical, non-formal, definition of latent variables considers them as data reduction devices. Given the restrictions in terms of degrees of freedom and multicollinearity often imposed by econometric analysis, a latent factor is used as a means to summarize a large number of variables in fewer factors. This definition focuses less on the theoretical content of the latent constructs, while addressing their descriptive function. The disadvantage of data reduction methods is that they carry the risk of losing much of the informational content in the original observed variables employed, if those variables have considerably different definitions. The net result is often a latent factor with such a broad definition that it has little substance left and is difficult to interpret.

In practice, the use of latent concepts usually combines two or more of these alternative definitions. Depending on the degree to which one relies comparatively more on a particular definition, three main advantages of explicit modeling of latent variables can be identified:

- At the conceptual level, the construction of more abstract latent variables allows for the generalization of conclusions related to certain events or processes that are derived from empirical analyses.

- With latent factors, we explicitly address the assumption of measurement errors (both systematic and random) in the observed variables employed in empirical studies.

- When used appropriately, latent variables provide a means to efficiently summarize observed/collected information, provided that the observed indicators relate to a common theoretical concept in a coherent manner. This has the empirical advantage of reducing the number of variables in econometric analyses, with the associated gains in terms of degrees of freedom and reduced collinearity among independent variables.


A currently used methodology, specifically dedicated to the analysis of latent variables, is known as Structural Equation Modeling (SEM)2, and it originates in the method of factor analysis first introduced by Spearman(1904)3. Factor analysis was originally conceived to help identify the main dimensions of an abstract conceptual construct, such as human intelligence. Given a complex set of observed variables, statistical methods of factor analysis use the information contained in the correlation matrix of the variables in order to group them according to a common underlying trend.

A latent factor corresponds to each of the groups that are identified, and it is estimated as a linear combination of the variables assigned to it. In its early variants, factor analysis was based solely on data properties in that it looked at correlations between the observed variables in order to identify the number of latent factors and the dimensions of each resulting factor. Such methods of factor analysis, included in the category of Exploratory Factor Analysis (EFA), parallel the data reduction definition of latent variables only. While useful in a preliminary stage of an analysis, the EFA methods have the disadvantage that they use few (if any) theoretical criteria in clustering observed variables and assigning them to factors, and they do not offer the possibility of statistically testing hypotheses related to the construction of factors. The consequence is that there is an infinite number of solutions that can be obtained by rotating factors, and there is little guidance on how to select the ‘appropriate’ solution.
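The rotation problem mentioned above can be illustrated with a minimal sketch. It assumes two orthogonal factors with unit variance and uses purely hypothetical loadings for four indicators; none of these numbers come from the thesis. Rotating the loading matrix by any angle leaves the common part of the correlation matrix unchanged, so the data alone cannot discriminate between the two solutions.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]

def rotation(theta):
    """2x2 orthogonal rotation matrix for an angle theta in radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def common_part(lam):
    """Common-factor part of the correlation matrix: Lambda * Lambda'."""
    return matmul(lam, transpose(lam))

# Hypothetical loadings of four indicators on two orthogonal factors.
loadings = [[0.8, 0.1], [0.7, 0.2], [0.2, 0.6], [0.1, 0.9]]

# Rotate the factors by an arbitrary angle (30 degrees here).
rotated = matmul(loadings, rotation(math.radians(30)))

original_fit = common_part(loadings)
rotated_fit = common_part(rotated)
max_gap = max(abs(original_fit[i][j] - rotated_fit[i][j])
              for i in range(4) for j in range(4))
print(max_gap)  # differs from zero only by floating-point rounding
```

The loadings themselves change under rotation, while the reproduced correlations do not, which is why an exploratory solution needs an additional (often ad hoc) criterion to pick one rotation.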

The method of latent variables embedded in Structural Equation Modeling, designated as Confirmatory Factor Analysis, meets most of the definitions of latent variables introduced earlier: it is theory-based when defining the content of a latent variable, it controls for measurement errors (systematic or not) in observed indicators, and it achieves data reduction in a consistent manner with a relatively lower risk of informational losses. This methodology is currently used in the fields of psychology, sociology and political science, although there have been attempts to introduce it in economics as early as the 1970s (Goldberger(1972), and Goldberger and Duncan(1973))4. A variant of SEM models, in the form of MIMIC5 models, is currently used in the field of empirical economics for obtaining estimates of the underground economy6.

2 Alternative names are ‘covariance structure models’ or ‘latent variable path analysis’.

3 See Kline (1994) for an introduction to the methodology of factor analysis.

4 For a historical background of SEM see Bollen(1989) and references therein.

5 Models with observed variables that are Multiple Indicators and Multiple Causes of a single latent variable.

When using latent variables7, a SEM analysis includes two main stages8: (1) the conceptual definition and empirical construction of latent variables (the construction of measurement models for latent variables), and (2) the specification of a model that includes structural links among the latent factors identified at stage (1).

In this study I concentrate on the first stage of a SEM analysis only: the construction of measurement models for latent variables. As mentioned, this type of analysis is often referred to as Confirmatory Factor Analysis in order to reflect the fact that it helps confirm prior hypotheses on the definition of a latent factor, rather than achieving an ad-hoc mixture of observed variables. The following section introduces the basic intuition and the main theoretical foundations on which CFA models are based.

I.2 Theoretical fundamentals of CFA Measurement Models9 and empirical issues

When working with the concept of latent variables, a first question that can be asked is the following: what are the main observed dimensions that can be attributed to a latent (unobserved or abstract) concept? In the early 1900s Spearman focused on the possible dimensions of human intelligence. He considered the positive correlations of human abilities such as achievement in school, social class, ability to concentrate and the quality of education, and designed the method of factor analysis in order to identify the dimensions that would combine best in explaining the positive correlations observed between the specific dimensions of human abilities. This would later have important implications for the field of human psychology.

The most common view of observed dimensions of a latent variable is as reflections of the latent concept. What we observe in reality, however, is often an imperfect reflection of the abstract concept we operate with theoretically. Depending on how broad the definition of a latent concept is, there are several reasons why the observed dimensions we operate with do not perfectly combine to characterize the concept. A first reason is that we cannot observe all the relevant dimensions of the concept, and therefore obtain only a partial characterization of it. A second reason may be that the observed indicators that we employ are not homogeneous enough in their informational content, such that they can also be assigned to another factor. A third reason can be that, if the observed dimensions are produced by the same method, the set-up of the method induces an inherent bias in the obtained results that contributes to a higher correlation among measures than would otherwise be the case. Yet another reason may be that we are not able to observe the dimensions of a factor without (random) measurement errors. The difference between the measurement errors and the method bias is that, while the former are random, the method bias is of a systematic nature.

6 See Schneider and Enste(2000) for a review of the empirical literature on the shadow economies.

7 SEM methods also include path analysis of observed variables only, which is very similar to the technique of simultaneous equation systems. For a description of the method of path analysis see the next study in this thesis.

8 For introductory texts on SEM methods see Bollen(1989), Maruyama (1998), Kline (1998) and also Steiger(2001) for critical comments on the existing introductory SEM literature. Advanced topics in the area are to be found in Marcoulides and Schumacker(1996).

9

Theoretically, we can decompose the information contained in an observed indicator into three distinct components: a systematic part related to the underlying theoretical concept (true score common variance), a systematic component that is not related to the theoretical concept at hand, but to other factors and/or method bias (true score unique variance), and a third part that represents purely random measurement errors (error variance).

Existing methods of building latent variables differ in their potential to distinguish and capture the three components in observed measures. A main distinction is to be made between the method of principal component analysis and the methods of factor analysis.

Principal component analysis is designed to group a complex set of observed variables into factors based on the correlations among the variables. As many principal components can be extracted as there are variables in the set, although researchers often limit their search to a smaller number of factors.

Principal components are successively generated by the method according to the degree to which they capture the variance in the correlation matrix of the observed variables. The first principal component explains this variance to the highest extent, the second principal component is the second highest in terms of the variance explained, and so on.
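The successive extraction of components can be sketched as follows. The correlation matrix is hypothetical, and power iteration with deflation is used here only as one simple way to obtain the leading eigenvectors; statistical packages typically rely on a full eigendecomposition instead.

```python
import math

def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def leading_component(m, iterations=500):
    """Power iteration: largest eigenvalue and its eigenvector for a symmetric matrix."""
    v = normalize([float(i + 1) for i in range(len(m))])  # arbitrary starting vector
    for _ in range(iterations):
        v = normalize(mat_vec(m, v))
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(m, v)))
    return eigenvalue, v

def deflate(m, eigenvalue, v):
    """Remove the variance captured by one component from the matrix."""
    n = len(m)
    return [[m[i][j] - eigenvalue * v[i] * v[j] for j in range(n)] for i in range(n)]

# Hypothetical correlation matrix of three observed variables.
corr = [[1.0, 0.6, 0.3],
        [0.6, 1.0, 0.4],
        [0.3, 0.4, 1.0]]

first_value, first_vector = leading_component(corr)
second_value, second_vector = leading_component(deflate(corr, first_value, first_vector))

# The total variance in a correlation matrix equals the number of variables.
total_variance = sum(corr[i][i] for i in range(len(corr)))
share_first = first_value / total_variance
share_second = second_value / total_variance
print(round(share_first, 3), round(share_second, 3))
```

Each component is chosen to absorb as much of the remaining variance as possible, and the second is orthogonal to the first by construction.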


The main disadvantages of the method are the following: the objective of the method is to explain as much variance as possible in the observed correlation matrix, with no regard for the error structure in the observed measures; the design of the method is such that the principal components generated are orthogonal, and therefore no correlation between components is allowed.

Principal component analysis has a ‘data mining’ nature in that it extracts the variance in a correlation matrix of observed variables with no regard for the fact that such correlations may actually be due to method biases, or to partial correlations attributable to other factors that are not identified by the method. It is therefore solely data driven, and it often poses problems in the interpretation of the resulting principal components, as we do not have direct control over the structure of variance in the observed measures.

The methods of factor analysis are designed to overcome the disadvantages of principal component analysis. We can distinguish between the methods of Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA). Exploratory Factor Analysis is similar in nature to the principal components method, in that it first establishes the number of factors that can be constructed based on a given set of observed variables, but it has the advantage that it does not aim at explaining all the variance in the correlation matrix. Exploratory factor analysis offers the possibility to account for the fact that part of the correlations among observed measures may be due to other factors, not identified by the method.

Confirmatory Factor Analysis takes a different view. Having collected a set of observed indicators, the question we ask is whether they combine into a single factor in a coherent manner. We therefore test the hypothesis that the set of observed dimensions can be assigned to, and therefore combined into, a latent (aggregate) construct. Measurement models of Confirmatory Factor Analysis can be constructed to include several latent factors, but that does not change the nature of the question. What we test for is whether a given set of observed dimensions can be assigned to a unique latent factor.

There are two main differences between Exploratory Factor Analysis and Confirmatory Factor Analysis. First, with EFA we explore how many factors we can extract based on a given set of observed variables. This means that we do not know the number of latent factors to operate with in advance, and we therefore do not operate with a specific model that can be tested. Second, in an exploratory empirical exercise, all the observed measures relate to all factors and the measurement errors in observed indicators are not allowed to correlate. By analogy with CFA models, under-identification of parameters in EFA is usually the rule.

In confirmatory factor analysis we first design a model that specifies the structural relationships between the observed dimensions and the latent factors. The latent factors are conceptually established by the researcher prior to the empirical analysis. The objective of confirmatory factor analysis is to extract the true score common variance in the measures assigned to a latent factor. Given that the ‘residual’ information left in the observed measures, after extracting the true score common components, need not be purely random (to the extent it contains true score unique variance), the CFA model allows for possible correlations between the error terms. The model is then tested in order to evaluate the degree to which it is supported by the available data. This is the method of factor analysis I operate with in this study.

In order to grasp the intuition of confirmatory factor analysis better, consider the following example. Assume we want to construct a latent factor that characterizes corruption in a country, and the available observed dimensions we have are the following: an index of legislative corruption ($x_1$), describing corruption in the lawmaking process; an index of executive corruption ($x_2$), indicating the extent of graft among the members of the executive; a measure of judicial corruption ($x_3$), capturing the extent of corruption in courts; and a measure of bureaucratic corruption ($x_4$). Confirmatory factor analysis can help us understand whether the four observed dimensions combine in a general factor of corruption in a coherent manner. If corruption in general is of a pervasive nature at all levels of the society, then it is possible to characterize a country as having high levels of corruption along all of the four dimensions and therefore aggregate them into one general measure of corruption.

However, there may be a possibility that a country registers high levels of corruption along only two of the four dimensions (say $x_1$ and $x_2$), while another country displays high degrees of corruption along the other two dimensions of corruption ($x_3$ and $x_4$). In such a situation, the four dimensions of corruption would not combine in a systematic manner in a general factor of corruption across countries, and therefore we would need to be more specific in terms of the type of corruption we analyze when referring to a particular country.

For simplicity, assume that the four measures of corruption are constructed by different methods (sources) such that the possibility of method bias in the observed measures is reduced.

The measurement model for our example of the latent factor of corruption can be written as follows:

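(1)
$$
\begin{aligned}
x_1 &= \lambda_{11}\xi_1 + \delta_1 \\
x_2 &= \lambda_{21}\xi_1 + \delta_2 \\
x_3 &= \lambda_{31}\xi_1 + \delta_3 \\
x_4 &= \lambda_{41}\xi_1 + \delta_4
\end{aligned}
$$

where:

$x_i$ = one of the observed measures of corruption ($i = 1, 2, 3, 4$)

$\lambda_{i1}$ = factor loadings

$\xi_1$ = the latent (unobserved) factor of corruption

$\delta_i$ = error terms (measurement error)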

Note that model (1) interprets the latent factor of corruption $\xi_1$ as ‘causing’ the observed dimensions of corruption. This is to say that if corruption is pervasive in a country then it will be reflected at all levels in the society. The presence of the error terms is meant to indicate that we believe we cannot observe the extent of corruption at the four levels without measurement error. The model in (1) can be represented in a diagram as illustrated in Figure 1.1.

The specification of the model in matrix form can be written as follows:

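(2)
$$
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}
=
\begin{bmatrix} \lambda_{11} \\ \lambda_{21} \\ \lambda_{31} \\ \lambda_{41} \end{bmatrix}
\begin{bmatrix} \xi_1 \end{bmatrix}
+
\begin{bmatrix} \delta_1 \\ \delta_2 \\ \delta_3 \\ \delta_4 \end{bmatrix}
$$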


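Figure 1.1 Path diagram of the measurement model in (1): the latent factor ξ1 and the observed measures x1, x2, x3 and x4, each with its own error term (δ1, δ2, δ3, δ4).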

The diagram of a measurement model illustrates the assumptions of the model. In the diagram above it is assumed that the factor is reflected in the observed measures with an error δi, and the parameter attached to the error term is constrained to the value of 1, which is the usual assumption made for identification purposes in any classical regression analysis. The model also imposes the constraint that the covariances between the error terms are zero. As mentioned, this assumption can be relaxed if there are theoretical justifications for non-zero correlations between the error terms. A possible instance when the error terms may be correlated is the situation when we obtain the observed measures based on the same method, or from the same source. Another possible reason is that some of the observed measures relate to another common latent factor not considered in the model at hand. In our example of corruption it is more difficult to conceive such a situation, although not impossible.

A specific constraint necessary in any measurement model addresses the issue of choosing a scale for the latent factor. As factors represent theoretical constructs, it is seldom the case that we have a clear definition for the measurement scale of the concept. For our example, we do not know in advance what type of scale to choose for the general factor of corruption. Moreover, the observed measures we employ for constructing the factor often come with different measurement scales. In order to make estimation possible, a factor scale needs to be assigned based on the scales of the employed measures. This amounts to constraining one of the factor loadings to a fixed value. The convention is that the fixed value is set to 1, although there is no particular reason why other fixed values cannot be used. As illustrated in Maruyama(1998), the empirical choice of the factor loading set to a fixed value for measurement scale purposes does not have any effect on the estimation results in terms of absolute magnitudes, model fit or statistical significance of parameter estimates10.
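One way to see why the choice of the reference loading leaves model fit unaffected is the following minimal sketch. The loadings and the factor variance are hypothetical values chosen for illustration. Moving the unit loading from the first indicator to the second rescales the loadings and the factor variance, but the covariances they imply for the observed measures stay the same.

```python
def implied_common_cov(loadings, factor_var):
    """Common-factor part of the covariance matrix: lambda_i * lambda_j * phi."""
    return [[li * lj * factor_var for lj in loadings] for li in loadings]

def rescale(loadings, factor_var, reference):
    """Re-express the model so that the loading of indicator `reference` equals 1."""
    c = loadings[reference]
    return [l / c for l in loadings], factor_var * c * c

# Hypothetical values, with the scale set by the first indicator (loading fixed to 1).
loadings_a, phi_a = [1.0, 0.8, 0.6, 0.5], 2.0

# The same model, with the scale set by the second indicator instead.
loadings_b, phi_b = rescale(loadings_a, phi_a, reference=1)

cov_a = implied_common_cov(loadings_a, phi_a)
cov_b = implied_common_cov(loadings_b, phi_b)
max_gap = max(abs(cov_a[i][j] - cov_b[i][j]) for i in range(4) for j in range(4))
print(loadings_b, phi_b, max_gap)
```

Only the products of loadings and factor variance are identified by the data, so the two parameterizations describe the same covariance structure.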

Measurement models range from simple (i.e. including one factor only and the measures associated to it) to complex (when several factors are estimated together in a common model). The number of measures associated to a factor depends on both the definition of the theoretical concept underlying the factor and the availability of suitable measures. The general specification of a measurement model written in matrix form is the following11:

(3)
$$
x = \Lambda_x \xi + \delta
$$

where:

$x$ = ($q \times 1$) vector of observed variables

$\xi$ = ($p \times 1$) vector of latent (unobserved) factors

$\Lambda_x$ = ($q \times p$) matrix of parameters associated with the factors (factor loadings)

$\delta$ = ($q \times 1$) error term including the two components: true score unique variance and error variance.

The assumptions on the error term are similar to the ones in the multiple regression analysis. Disturbances in δ are assumed to be uncorrelated with the factors ξ, that is E(ξδ)=0. It is also assumed that E(δ)=0. Correlations among the error terms across equations are allowed, provided they are theoretically justifiable. For estimation purposes (and more specifically hypothesis testing) normality is desirable, but there are possibilities to construct reliable test statistics that take into account the degree of non-normality in the data.

Elements in the parameter matrix $\Lambda_x$, denoted $\lambda_{ij}$ and referred to as ‘factor loadings’, indicate the extent to which the factor is reflected by (that is, it ‘loads into’) each of the observed variables. If a measure is linked to only one factor in the model, then the factor loading (in a standardized solution) is simply the correlation between the factor and the observed indicator. Each column in matrix $\Lambda_x$ corresponds to one latent variable, and each row to one observed indicator.

10 A more careful consideration of the choice for the factor scale is required only when some of the observed measures are negatively correlated.

11 The observed variables x are expressed in deviations from the mean. For a general

The objective of the analysis is to estimate the parameters in the matrix $\Lambda_x$, together with all the other unknown parameters in the system, based on the information provided by the sample data, in such a manner that the solution obtained is admissible and replicates the original sample information very closely. As will be described, once we have estimated the factor loadings, it is possible to construct the scores for the latent factor(s) ξ.

In the general model (3) above consider post-multiplying the matrix equation by the transpose of vector x, and then taking expectations:

(4)
$$
E\left(xx^{T}\right) = E\left[(\Lambda_x\xi + \delta)(\Lambda_x\xi + \delta)^{T}\right] = \Lambda_x \Phi \Lambda_x^{T} + \Theta_\delta
$$

The term $E(xx^{T})$ gives us the population variance-covariance matrix of the observed variables x. The matrix Φ is the variance-covariance matrix of the latent variables. The matrix $\Theta_\delta$ is the variance-covariance matrix of the error terms.
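The covariance structure in (4) can be checked numerically with a small simulation. This is a sketch under stated assumptions: a single factor with four indicators, hypothetical values for the loadings, the factor variance and the error variances, uncorrelated errors (a diagonal error covariance matrix), and normally distributed factor and errors. None of these values come from the thesis.

```python
import random

lam = [1.0, 0.8, 0.6, 0.5]     # hypothetical factor loadings (first one fixed to 1)
phi = 2.0                      # hypothetical variance of the latent factor
theta = [0.5, 0.4, 0.6, 0.7]   # hypothetical error variances (diagonal Theta)

# Implied covariance matrix: Lambda * Phi * Lambda' + Theta.
implied = [[lam[i] * phi * lam[j] + (theta[i] if i == j else 0.0)
            for j in range(4)] for i in range(4)]

def simulate(n, seed=0):
    """Draw n observations of x = Lambda * xi + delta with normal xi and delta."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        xi = rng.gauss(0.0, phi ** 0.5)
        rows.append([lam[i] * xi + rng.gauss(0.0, theta[i] ** 0.5) for i in range(4)])
    return rows

def sample_cov(rows):
    """Sample variance-covariance matrix of the columns of rows."""
    n, q = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(q)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(q)] for i in range(q)]

sample = sample_cov(simulate(20000))
largest_gap = max(abs(sample[i][j] - implied[i][j]) for i in range(4) for j in range(4))
print(round(largest_gap, 3))
```

With a large sample the simulated covariance matrix lies close to the implied one, which is the sense in which estimation looks for parameter values that replicate the sample information.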

For our example in (1), the three matrices have the following analytic form.

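- The population variance-covariance matrix of the observed variables:

$$
E\left(xx^{T}\right) =
\begin{bmatrix}
\mathrm{VAR}(x_1) & & & \\
\mathrm{COV}(x_2, x_1) & \mathrm{VAR}(x_2) & & \\
\mathrm{COV}(x_3, x_1) & \mathrm{COV}(x_3, x_2) & \mathrm{VAR}(x_3) & \\
\mathrm{COV}(x_4, x_1) & \mathrm{COV}(x_4, x_2) & \mathrm{COV}(x_4, x_3) & \mathrm{VAR}(x_4)
\end{bmatrix}
$$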

- The variance-covariance matrix of the latent factors includes only one element as there is only one latent factor in the model:

$$
\Phi = \begin{bmatrix} \phi_{11} \end{bmatrix}, \quad \text{with } \phi_{11} = \mathrm{VAR}(\xi_1)
$$
