The Forecasting Power of Economic Growth Models

UPPSALA UNIVERSITET
Department of Economics
Master's Thesis

Spring term 2007

Author: Andreas Bryhn
Supervisor: Johan Lyhagen

Abstract

High forecasting power is essential for understanding scientific relationships. In economics, forecasting power may be decisive for the success or failure of a particular policy. The forecasting power of economic growth models is investigated in this study. Regressions from one dataset including the gross domestic product (GDP), GDP growth, trade openness, the quality of public institutions and secondary education generate insufficient forecasting power with respect to growth. Furthermore, the International Monetary Fund's one-year growth forecasts are compared to outcome. Forecasts for 1999-2006 were found to be significantly different from outcome during 7 years out of 8. The forecast error slightly exceeded 1 percentage unit, which is similar to results from earlier studies on forecast error and equal to the forecast/hindcast error from a simple multivariate model constructed from historical growth data. Possible reasons behind poor forecast quality are discussed, including the tradition of building models using assumptions from irrefutable theoretical constructs.

Sammanfattning (Summary)

High forecasting power is necessary for understanding scientific relationships. In economics, forecasting power can be decisive for whether a given economic policy succeeds or fails. This study examines the forecasting power of economic growth models. Regressions from a dataset containing the gross domestic product, its growth, openness to trade, the quality of public institutions, and the share of the population with secondary education generate insufficient forecasting power with respect to growth. Furthermore, the International Monetary Fund's one-year forecasts are compared with outcome. Forecasts for 1999-2006 differed significantly from outcome in 7 years out of 8. The forecast error amounted to just over 1 percentage unit, which corresponds to results from earlier studies of forecast error, as well as to the forecast error from a simple multivariate model based on historical growth data. Possible reasons for the low forecast quality are discussed, among them the tradition of building models with the help of assumptions from theoretical constructs that cannot be falsified.

TABLE OF CONTENTS

1. INTRODUCTION
2. METHODS AND DATA
2.1. Statistical methods and considerations
2.2. The Gallup dataset
2.3. The IMF dataset
3. RESULTS
3.1. Cross-country growth regressions
3.2. The IMF's growth forecasts
4. DISCUSSION
5. CONCLUSIONS
REFERENCES

1. INTRODUCTION

Forecasting the future has been a highly desirable goal for humans ever since the Stone Age.

With the emergence and growth of science, our forecasting methods have improved considerably, as forecasting power has become a fundamental part and a distinguishing feature of scientific knowledge. The ability to make accurate forecasts of important goal variables is essential for the understanding of scientific relationships, particularly with respect to causality (Blaug, 1980; DeLurgio, 1998; Fildes and Stekler, 2002). High forecasting power is necessary but not sufficient for causal analysis in economics, because strongly correlated variables may have an external, cointegrating causal factor (Clements and Hendry, 1999).

Nevertheless, without high forecasting power, economists will have poor quantitative knowledge about the effect of their suggested policies on, e. g., economic growth. Insufficient forecasting power may therefore result in disappointing and expensive policy failure.

The forecasting power of a model can be estimated by calculating the coefficient of determination (R²) for the relationship between model forecasts and actual outcome. Two crucial issues in this context are (1) the relative quality of models, i.e., how much better is, e.g., R² = 0.9 compared to R² = 0.8 in a regression between forecast and outcome, and (2) the lower limit of acceptability, as measured in R². If there is a large enough amount of data, a correlation with R² very close to 0 can still be significant at high confidence levels, but it will be just as useless for forecasting as a correlation with R² = 0, because in both cases the data form a beeswarm-like scatterplot from which forecasts will have very high uncertainty.
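The point about sample size can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the original analysis: the function names are hypothetical, and the cut-off of 1.96 is the large-sample normal approximation to the two-sided 95 % critical value, so it is slightly too lenient for small samples.

```python
import math

def r_squared(forecast, outcome):
    """R² of a simple linear regression between forecast and outcome."""
    n = len(forecast)
    mean_f = sum(forecast) / n
    mean_o = sum(outcome) / n
    sxy = sum((f - mean_f) * (o - mean_o) for f, o in zip(forecast, outcome))
    sxx = sum((f - mean_f) ** 2 for f in forecast)
    syy = sum((o - mean_o) ** 2 for o in outcome)
    return sxy ** 2 / (sxx * syy)

def t_statistic(r2, n):
    """t statistic for testing a bi-variate correlation against zero."""
    return math.sqrt(r2 * (n - 2) / (1.0 - r2))

def is_significant(r2, n, t_crit=1.96):
    """True if the correlation passes the (approximate) 95 % test."""
    return t_statistic(r2, n) > t_crit
```

With R² = 0.03, `is_significant(0.03, 190)` is true (t is about 2.4), whereas `is_significant(0.03, 30)` is false (t is about 0.9). The same weak relationship thus becomes "significant" merely because more observations are available, without the scatter becoming any tighter.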

One widely used lower limit of acceptability regarding R² and forecasting power was first motivated by Prairie (1996), who presented what has become known in the natural sciences as Prairie's staircase. Prairie's staircase method is illustrated in Figure 1A. A thick, solid line starts at the lower boundary line of the 95 % confidence level from the regression and is drawn upwards until it reaches the upper 95 % confidence limit, and is then drawn to the right until the lower boundary line is reached again. This procedure is repeated so that the thick, solid line takes the shape of a staircase (Figure 1A). When Prairie (1996) reiterated this exercise for a large number of correlations, he found a non-linear relationship between R² and the number of staircase risers, which is depicted in Figure 1B. According to this figure, the number of risers is low and fairly constant for R² values from 0 to about 0.65, after which the


forecasting power, Prairie (1996) argued that R² = 0.65 should be regarded as the lower limit of acceptability. Furthermore, Figure 1B can be used to compare forecasting power. For instance, the difference in forecasting power between R² = 0 and R² = 0.9 can be regarded as more than six times as great as the difference between R² = 0 and R² = 0.45.

Figure 1. Prairie's staircase, suggesting a non-linear relationship between the coefficient of determination (R²) and forecasting power. From Prairie (1996).

The difficulty in making correct growth forecasts has concerned economists for many decades (Hutchison, 1938; Kenny and Williams, 2001). Table 1 shows R² values from some bi-variate regressions found in the literature. None of the R² values exceed 0.65, although appreciably higher R² values can be found in multi-variate regressions (Gylfason, 2001). This study aims at examining the forecasting power in economic growth models. The structure of this work is as follows: first, two datasets used in the study will be described and statistical methods and considerations will be presented and motivated. Second, correlations and forecasting power will be analysed. Finally, the results will be discussed based on the intention to increase future forecasting power in growth models.


Table 1. Correlations between some x-variables and GDP growth or GDP per capita growth as a y-variable.

x-variable             Correlation sign   R²     Reference
Fertility              -                  0.61   Perotti, 1996
Liquid liabilities     +                  0.55   King and Levine, 1993
Private loans          +                  0.50   King and Levine, 1993
Investment             +                  0.35   Levine and Renelt, 1992
Natural resources      -                  0.28   Gylfason, 2001
Inequality             -                  0.22   Persson and Tabellini, 1994
Savings                +                  0.20   Levine and Zervos, 1998
Education              +                  0.17   Gylfason, 2001
Telecommunications     +                  0.17   Bougheas et al., 2000
Black market           -                  0.14   Levine and Renelt, 1992
Revolutions and coups  -                  0.13   Levine and Renelt, 1992
Bank credit            +                  0.12   Levine and Zervos, 1998
Paved roads            +                  0.11   Bougheas et al., 2000
War casualties         -                  0.10   Easterly et al., 1993
Inflation              -                  0.08   Carkovic and Levine, 2002
Government size        +                  0.06   Carkovic and Levine, 2002
Marginal tax           +                  0.03   Perotti, 1996
Trade                  +                  0.02   Dollar and Kraay, 2001
Welfare spending       +                  0.01   Perotti, 1996
GDP                    +                  ≈ 0    Barro, 1996

2. METHODS AND DATA

2.1. Statistical methods and considerations

Relationships were analysed with single, and forward stepwise multiple, linear regression.

Stepwise multiple regression makes it possible to distinguish the strongest co-varying parameter, followed by all other parameters that may add additional explanatory power to the regression. Several potential x-variables can all show strong individual correlations with the y-variable that they are being used to forecast, although these individual correlations may not be additive if the x-variables are also correlated with each other. Multi-variate models may take such co-variation between x-variables into account if they are developed with stepwise multiple regression techniques (DeLurgio, 1998). Criteria for x-variables to be used to forecast the various y-variables were (1) they had to correlate significantly with y-variables individually as well as within the multiple regression and (2) these correlations had to be of the same sign; i.e., an x-variable that was positively correlated with a y-variable was excluded


if its contribution in the multiple regression was negative, and vice versa. These criteria are not commonly used in econometrics (DeLurgio, 1998) and their relevance will therefore be discussed later in this paper. Changes in linear slopes were detected with a trend shift analysis method from Rodionov and Overland (2005). This method consists of a downloadable application to Microsoft Excel and makes it possible to detect at which points a trend changes at a specified significance level, given that the trend changes at all. Statistical significance was always determined at the 95 % confidence level, since Figure 1 is defined at that level.
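The two selection criteria can be made concrete with a small sketch. This is not the software used in the study; it is a minimal pure-Python illustration under stated assumptions: the helper names `ols` and `forward_stepwise` are hypothetical, significance is judged with the large-sample cut-off |t| > 1.96 instead of an exact t-distribution, and the data are synthetic (one informative variable and one pure-noise variable).

```python
import math
import random

def ols(columns, y):
    """Fit y = b0 + b1*x1 + ... by least squares.

    Returns (coefficients, t_statistics); index 0 is the intercept."""
    n, k = len(y), len(columns) + 1
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    # Augmented matrix [X'X | X'y | I]; Gauss-Jordan elimination turns it
    # into [I | coefficients | inverse of X'X].
    a = []
    for p in range(k):
        xtx = [sum(r[p] * r[q] for r in rows) for q in range(k)]
        xty = sum(r[p] * yi for r, yi in zip(rows, y))
        identity = [1.0 if q == p else 0.0 for q in range(k)]
        a.append(xtx + [xty] + identity)
    for p in range(k):
        best = max(range(p, k), key=lambda r: abs(a[r][p]))
        a[p], a[best] = a[best], a[p]
        pivot = a[p][p]
        a[p] = [v / pivot for v in a[p]]
        for r in range(k):
            if r != p:
                factor = a[r][p]
                a[r] = [v - factor * w for v, w in zip(a[r], a[p])]
    beta = [a[p][k] for p in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(rows, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    t = [beta[p] / math.sqrt(sigma2 * a[p][k + 1 + p]) for p in range(k)]
    return beta, t

def forward_stepwise(candidates, y, t_crit=1.96):
    """Forward selection that enforces criteria (1) and (2) of Section 2.1."""
    # Criterion 1, first half: keep only individually significant variables
    # and remember the sign of each individual correlation.
    single_sign = {}
    for name, col in candidates.items():
        beta, t = ols([col], y)
        if abs(t[1]) > t_crit:
            single_sign[name] = math.copysign(1, beta[1])
    selected = []
    while True:
        best = None
        for name in single_sign:
            if name in selected:
                continue
            trial = selected + [name]
            beta, t = ols([candidates[v] for v in trial], y)
            # Criterion 1, second half: significant within the multiple regression.
            all_significant = all(abs(tj) > t_crit for tj in t[1:])
            # Criterion 2: same sign as in the individual regression.
            same_sign = all(math.copysign(1, beta[j + 1]) == single_sign[v]
                            for j, v in enumerate(trial))
            if all_significant and same_sign and (best is None or abs(t[-1]) > best[0]):
                best = (abs(t[-1]), name)
        if best is None:
            return selected
        selected.append(best[1])

# Synthetic illustration (arbitrary numbers, not data from the study).
random.seed(0)
n_obs = 60
openness = [random.random() for _ in range(n_obs)]
noise = [random.gauss(0.0, 1.0) for _ in range(n_obs)]
growth = [0.8 + 2.6 * x + random.gauss(0.0, 0.5) for x in openness]
chosen = forward_stepwise({"open": openness, "noise": noise}, growth)
```

Here `chosen` starts with `"open"`, the only variable that actually drives the synthetic growth series. A variable whose sign flips between the single and the multiple regression is never admitted, which is the behaviour that criterion (2) asks for.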

2.2. The Gallup dataset

The first dataset (CID, 2007) of two used in this study has been described by Gallup et al. (1999) and was used in regression 1, Table 3 in the same study. It consists of six variables: average purchasing power parity (PPP) adjusted annual gross domestic product (GDP) per capita growth between 1965 and 1990 (hereafter called yG), PPP adjusted initial GDP per capita in 1965 (hereafter referred to as YG), average years of secondary schooling among the population in 1965 (EduG), the log value of life expectancy (LifeG), openness to international trade (OpenG), and finally, the quality of public administration (PublG). This dataset was used in Section 3.1, first to estimate bi-variate correlations with yG, and subsequently, to quantitatively assess multi-variate correlations with yG using the criteria stated in 2.1. The dataset was then divided into groups according to the gradient of one variable which was insignificantly correlated with yG, to investigate whether the effect of such a variable on other regressions could be estimated without violating the criteria in 2.1.

2.3. The IMF dataset

The second dataset consisted of actual and forecasted data on PPP adjusted GDP growth in 29 advanced economies, taken from 10 of the International Monetary Fund's (IMF) April or May issues of World Economic Outlook (IMF, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007). In this work, yt denotes historical IMF data on growth at time t (in years), and it is worth noting that IMF data are not given as per capita values, as opposed to yG. One-year growth forecasts (y^for_t+1) were first compared to the actual outcome (yt+1) in a bi-variate regression, and the resulting R² value was compared to R² values found between historical data at the time of the forecast and yt+1.


A statistical model based on the presented historical data at the time of the forecast was then used as a baseline from which to evaluate the forecast quality. This statistical model was developed from a stepwise multiple regression with yt+1 as a y-variable and all historical data available at the time of the forecast as potential x-variables. The baseline model was based on the time period 1998 ≤ t ≤ 2004, i.e., on growth outcome from 7 years and 29 OECD countries. In order to test the criteria in 2.1 by violating them, 100 normally distributed random variables were added to the data set, allowing all significant x-variables to enter into the forward stepwise multiple regression with yt+1 as a y-variable. If any of the random variables entered, that would indicate that the criteria in 2.1 could decrease the risk of adding nonsensical information to the regression.

The stability of the model constants was subsequently studied by omitting data from one year at a time. Forecasts/hindcasts from the statistical model were tested against yt+1 by using those model constants which were valid for all years except for the particular forecast/hindcast year, in order to perform a test against "independent" data in the sense that they were not used to develop the statistical model. For example, when hindcasting yt+1 for t = 2001, the model constants used were those that were valid for a multiple regression when t = 2001 had been omitted. Furthermore, the baseline model with constants valid for 1998 ≤ t ≤ 2004 was tested to forecast yt+1 for t = 2005, a year from which data had previously not been used at all in the study. These forecasts were compared to the IMF's forecasts (IMF, 2005) and to the actual outcome (IMF, 2007). Finally, error terms from all forecasts and hindcasts in this section were studied and compared.

3. RESULTS

3.1. Cross-country growth regressions

Table 2 displays cross-correlations (R² values) between the six variables in Gallup et al. (1999), described in Section 2. All correlations carried a positive sign. According to Table 2, EduG, LifeG, OpenG and PublG were all mutually correlated and also positively correlated with yG. YG was insignificantly - although, if anything, positively - correlated with yG.


Table 2. Cross-correlations (n=75) between six variables described in Section 2 and by Gallup et al. (1999). * = significant at the 95 % confidence level.

         YG      yG      EduG    LifeG   OpenG
yG       0.00
EduG     0.39*   0.14*
LifeG    0.51*   0.22*   0.60*
OpenG    0.32*   0.38*   0.33*   0.40*
PublG    0.61*   0.19*   0.39*   0.48*   0.53*

Thus, all variables except YG could be used in a multi-variate model according to the criteria set in Section 2. The baseline equation for this model is:

yG = a + b · EduG + c · LifeG + d · OpenG + e · PublG (1)

where a-e are constants. However, only OpenG could enter as a significant x-variable in a forward stepwise multiple regression with yG as a y-variable. Therefore, constants b, c, and e were not determined, while a (including its standard error) was 0.787 ± 0.244 (p = 0.002, t = 3.23) and d was 2.62 ± 0.38 (p < 0.001, t = 6.74).

Table 3 shows how the R² value changed when the data were divided into groups along the YG gradient. When the data were grouped into 2, the R² for OpenG vs. yG increased from 0.39 to 0.45 for the poorest countries (cases 1-48), and from 0.39 to 0.40 for the richest countries (cases 49-96). Growth in the group with the poorest countries could also be forecasted with a multiple regression that rendered an R² value of 0.53 and included openness and log values of life expectancy in 1965 as x-variables.

Table 3. Regression analysis with yG as a y-variable. The data set was sorted ascendingly according to the level of YG.

Cases   n    Highest R²,          Variable in          R², multiple   Variable(s) in
             single regression    single regression    regression     multiple regression
1-96    94   0.39                 OpenG                0.39           OpenG
1-48    48   0.45                 OpenG                0.53           OpenG, LifeG
49-96   46   0.40                 OpenG                0.40           OpenG
1-32    32   0.39                 OpenG                0.39           OpenG
33-64   31   0.42                 OpenG                0.58           OpenG, LifeG
65-96   31   0.50                 OpenG                0.50           OpenG

When data were grouped into 3, R² increased further for the richest third (cases 65-96) to 0.50 and the middle group (cases 33-64) allowed log life expectancy to enter as an additional x-variable in the multiple regression, yielding an R² value of 0.58. However, when the data were divided into more than 3 groups, regressions became insignificant in many of the groups. A trend shift analysis of the OpenG vs. yG relationship using the method from Rodionov and Overland (2005) showed that there were no significant slope deviations in the openness-growth relationship along the GDP per capita gradient. Likewise, a t-test revealed that the regression slope was slightly but insignificantly steeper among the poorest half of the countries compared to the richest half.

3.2. The IMF's growth forecasts

The first row in Table 4 gives the correlation between the IMF's GDP growth forecasts and the actual outcome one year after the forecast. The subsequent rows contain correlations between historical GDP growth data at the time of the forecast, and the GDP growth one year after. All significant correlations were positive. It is worth noting that two historical variables in Table 4 yielded equal or higher R² values against outcome as compared to IMF's forecasts.

Table 4. Correlations (n=190) between the IMF's growth forecasts or historical growth data, and growth one year after the forecast. Data from IMF (1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006). All correlations were significant at the 95 % confidence level. MV denotes the mean value.

Variable                      R²
y^for_t+1                     0.15
yt-1                          0.04
yt-2                          0.08
yt-3                          0.10
yt-4                          0.18
yt-5                          0.06
yt-6                          0.03
yt-7                          0.04
yt-8                          0.09
MV(yt-18, yt-17, …, yt-9)     0.15

A multiple regression with yt-1, yt-2, …, yt-8, and the mean value of yt-18, yt-17, …, yt-9 as x-variables and the future growth outcome yt+1 as a y-variable allowed three historical data variables to enter at the 5 % significance level, raising the R² value to 0.26. From this regression, the following statistical model was constructed:

yt+1 = f + g · yt-2 + h · yt-4 + i · MV(yt-18, yt-17, …, yt-9) (2)

where f-i were constants which were determined to be f = 0.670, g = 0.162, h = 0.273 and i = 0.246.
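As an illustration of how Equation 2 turns historical growth into a one-year forecast, the following sketch evaluates it for a single country. It assumes that Equation 2 reads yt+1 = f + g · yt-2 + h · yt-4 + i · MV(yt-18, …, yt-9); the pairing of h with yt-4 is stated in the text, while the pairing of g with yt-2 is inferred. The function name and the dictionary layout of the input are hypothetical.

```python
from statistics import mean

# Constants reported for the baseline model (1998 <= t <= 2004).
F, G, H, I = 0.670, 0.162, 0.273, 0.246

def equation_2(history, f=F, g=G, h=H, i=I):
    """One-year growth forecast (percent) from historical growth.

    `history` maps a lag k to the growth rate y(t-k) in percent and must
    contain the lags 2, 4 and 9 through 18 (hypothetical input layout)."""
    long_run_mean = mean(history[k] for k in range(9, 19))  # MV(y(t-18), ..., y(t-9))
    return f + g * history[2] + h * history[4] + i * long_run_mean

# A country that grew by 3 % in every historical year.
steady_history = {k: 3.0 for k in range(1, 19)}
steady_forecast = equation_2(steady_history)
```

For the steady 3 % history, the forecast is 0.670 + 3 · (0.162 + 0.273 + 0.246) = 2.713 %. Passing other values for `f`, `g`, `h` and `i` makes it possible to apply constants estimated with one year omitted.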


When 100 normally distributed random variables were added to the data set, violating the criteria stated in 2.1, the resulting regression generated an R² value of 0.35, and included yt-2, yt-4, yt-7, yt-8, and 5 of the random variables as significant x-variables. None of the random variables was significantly correlated with yt+1 in a single regression. When the random variables were removed from the regression, yt-7 remained as a significant x-variable, but with a negative sign. In an attempt to use yt-7 as an explanatory variable in compliance with the criteria in 2.1, 8 new variables were generated with yt-7 as a denominator and the 8 other historical variables from the IMF dataset as numerators. However, none of these ratios were correlated with yt+1. Likewise, no significant correlation with yt+1 could be generated from variables that consisted of yt-7 subtracted from any of the remaining historical variables in the IMF dataset.

The stability of the constants in Equation 2 was tested by omitting yt+1 from one year at a time and the results are displayed in Table 5. g could not be significantly separated from 0 for one of the years (t+1 = 2001). The coefficient of variation (CV) for h was only 3.3%, indicating a comparatively stable contribution to Equation 2 from yt-4 in relation to the other historical variables.

Table 5. The stability of the constants in Equation 2 when yt+1 from one year at a time was omitted. The first column describes which one of the forecasted years was excluded from the analysis. MV denotes mean value, SD is the standard deviation and CV is the coefficient of variation and equals SD/MV. *Based on t+1 = 1999, 2000, 2002-2005. **Not significantly different from zero.

Excluded t+1   f       g       h       i       R²
None           0.670   0.162   0.273   0.246   0.26
1999           0.708   0.265   0.270   0.266   0.22
2000           0.806   0.107   0.257   0.234   0.23
2001           0.628   **      0.283   0.405   0.31
2002           0.632   0.157   0.271   0.258   0.24
2003           0.558   0.217   0.276   0.230   0.28
2004           0.616   0.269   0.281   0.143   0.28
2005           0.630   0.163   0.261   0.263   0.23
MV*            0.658   0.196   0.269   0.232   0.25
SD*            0.087   0.065   0.009   0.046   0.03
CV*            0.132   0.331   0.033   0.199   0.11
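The summary rows of Table 5 can be recomputed from the rows above them. The sketch below uses the six years named in the table footnote (1999, 2000, 2002-2005) and assumes that SD is the sample standard deviation with an n - 1 denominator; under that assumption the tabulated MV, SD and CV are reproduced to three decimals. The variable and function names are illustrative.

```python
from statistics import mean, stdev

# Constants from Table 5 for excluded t+1 = 1999, 2000, 2002, 2003, 2004, 2005
# (2001 is left out, as stated in the table footnote marked *).
table5 = {
    "f": [0.708, 0.806, 0.632, 0.558, 0.616, 0.630],
    "g": [0.265, 0.107, 0.157, 0.217, 0.269, 0.163],
    "h": [0.270, 0.257, 0.271, 0.276, 0.281, 0.261],
    "i": [0.266, 0.234, 0.258, 0.230, 0.143, 0.263],
}

def stability(values):
    """Return (MV, SD, CV) with CV = SD / MV."""
    mv = mean(values)
    sd = stdev(values)  # sample SD (n - 1 denominator), an assumption
    return mv, sd, sd / mv

summary = {name: stability(values) for name, values in table5.items()}
```

For h this gives MV ≈ 0.269, SD ≈ 0.009 and CV ≈ 0.033, i.e. the 3.3 % quoted in the text, while g has the largest relative spread (CV ≈ 0.33).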

The IMF's forecasts were compared to growth forecasts/hindcasts generated by Equation 2 by regressing forecasts and hindcasts against GDP growth outcome, and these regressions are shown in Figure 2. As motivated in the previous section, constants f-i were taken from Table 5 in order to test hindcasts from Equation 2 against independent data; i.e., the growth outcome (yt+1) of, e.g., 2003 was compared to hindcasts from Equation 2 with constant values for f-i taken from the 2003 line in Table 5. It is worth noting that for 1999 ≤ t+1 ≤ 2004, Equation 2 yielded hindcasts since constants were also generated from yt+1 data from subsequent years, while the same equation yielded forecasts for 2005 since the information on the 2005 line in Table 5 strictly emanates from years that preceded 2005. The regression slopes in Figure 2 are both less than 1, indicating that forecasts and hindcasts may be systematically higher than the outcome. However, mean values of forecasts, hindcasts and outcome all slightly exceeded 3% and were not significantly different from one another. Thus, forecasts/hindcasts < 3% showed a tendency to be overestimated compared to the real outcome yt+1, while forecasts/hindcasts > 3% tended to be underestimated, which explains why the regressions in Figure 2 differed from the unit line (y=x).

Figure 2. Comparison between one-year forecast/hindcasts and outcome in yearly GDP growth (%) 1999-2005.

A. Forecasts from the IMF's World Economic Outlook. B. Forecasts/hindcasts from Equation 2.

The R² value in Figure 2B was low, at 0.19, although marginally higher than in Figure 2A (0.15). According to Figure 2A, none of the IMF's growth forecasts were negative, and none exceeded 7.2 percent, although negative growth was observed in 9 cases out of 195 and higher growth than 7.2 percent was observed in 12 cases.

Forecasts from IMF (2005) and Equation 2 compared to outcome (IMF, 2007) regarding GDP growth in 2006 are displayed in Figure 3. The IMF's forecasts (Figure 3A) yielded a rather


from Equation 2 (0.19, see Figure 3B). However, the two regression equations in Figure 3 show that R² values rather inaccurately reflected the forecasting power in this case, because forecasts from both sources were systematically and significantly lower than outcome.

Figure 3. Comparison between one-year forecasts and outcome in GDP growth (%) 2006. A. Forecasts from the IMF's World Economic Outlook. B. Forecasts from Equation 2.

The differences between the IMF's forecasts, forecasts/hindcasts from Equation 2, and outcome, are further illustrated in Figure 4. Apparently, errors from both methods followed a very similar pattern with respect to mean error and standard deviation of the error. During an average year, the absolute value of the mean error was 1.1 percentage units while the standard deviation of the error slightly exceeded 1.5 percentage units. Figures 4B and 4C suggest that forecasts improved over time, given the slightly decreasing mean error deviation from 0 and the slightly decreasing standard deviation of the error. However, the difference between the IMF's forecasts and growth outcome (including the bars representing the mean value error) in Figure 4A shows that the IMF's forecasts were significantly different from outcome for all years except for 2005. Forecasts/hindcasts from Equation 2 significantly deviated from outcome for all years except for 1999 and 2005. Thus, 2005 appeared to be an outstandingly predictable year (mean relative error < 10% according to Figure 4D) while the year 2001 was very unpredictable (mean relative error > 70%) with respect to GDP growth.


Figure 4. Accuracy in GDP growth forecasting. A. Mean GDP growth forecasts for 29 OECD countries from the IMF's World Economic Outlook, compared to mean values generated by Equation 2 (hindcasts for 1999-2004 and forecasts for 2005-2006), and to outcome, including the mean value error. Values in percent. B. The mean forecast/hindcast error in percentage units of GDP growth. C. The standard deviation (SD) of the growth forecast/hindcast error. D. The relative error (forecast or hindcast error divided by outcome) in percent.
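The three error measures shown in Figure 4 can be computed for one forecast year as sketched below. Two choices are assumptions rather than something stated in the text: the error is taken as forecast minus outcome, and the relative error is taken as the mean error divided by the mean outcome. The function name and the sample numbers are hypothetical.

```python
from statistics import mean, stdev

def error_summary(forecasts, outcomes):
    """Mean error, SD of the error and relative error for one forecast year.

    Inputs are growth rates in percent, one value per country."""
    errors = [f - o for f, o in zip(forecasts, outcomes)]  # sign convention is an assumption
    mean_error = mean(errors)
    return {
        "mean_error": mean_error,                     # percentage units
        "sd_error": stdev(errors),                    # percentage units
        "relative_error_pct": 100.0 * mean_error / mean(outcomes),
    }

# Made-up growth rates for three countries, for illustration only.
example = error_summary([2.0, 3.0, 4.0], [3.0, 3.0, 5.0])
```

In this made-up case the forecasts are on average 0.67 percentage units too low, which corresponds to a relative error of about -18 %.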

4. DISCUSSION

Many economic growth models are motivated by economic theory and contain common theoretical assumptions about, e. g., perfect competition (Ventura, 2005). There is reason to question whether this method is optimal for constructing growth models. Karl Popper's strict, generic, and unambiguous demarcation line between science and non-science (metaphysics or pseudo-science) has since its first publication in 1934 widely gained respect in the scientific


community as a fundamental part of the hypothetico-deductive method. A primary criterion for a scientific theory according to this demarcation method is that the theory must be refutable, i.e., it must be possible to falsify the theory with evidence of the opposite. A subsequent criterion is that the theory must pass some kind of empirical test. These criteria have laid the ground for substantial progress in many academic disciplines. Irrefutable constructs may very well have a heuristic value or play other important roles in inspiring the future development of scientific theories. However, only refutable scientific theory has the potential to forecast and to explain relationships by including probable outcomes and excluding improbable outcomes. Separating scientific theory from metaphysics and specifying their separate roles have the potential to improve the forecasting power and thus the understanding of scientific relationships. Every time a scientific theory is refuted and improved, the new theory brings us one step closer to the unattainable goal: the "truth" (Popper, 1972).

The need to use refutability as a scientific criterion in economics has been acknowledged by many economists (e.g., Hutchison, 1938; Blaug, 1980; Eichner, 1985; Stanley, 1998; Bernhofen, 2005). Those who oppose this criterion (see, e.g., Hands, 2001 and references therein) have thus far failed to present an alternative demarcation line which unambiguously classifies astrology, alchemy and religion as non-science. Many economic theories which have been found irrefutable assume conditions that cannot be observed, such as steady-state, perfect competition and ceteris paribus, or contain empirically empty assumptions such as rational behaviour, although every kind of behaviour may be considered rational by those who display it (Hutchison, 1938; Blaug, 1980).

An example of an irrefutable theory is the well-known principle of comparative advantage. This principle can be demonstrated by a logical exercise which shows that trade is always beneficial for everyone involved. Evidently, no real observations of trade being harmful (e.g., arms trade between countries at war with each other, or narcotics trade with a subsequent increase in drug addiction and social costs) have the potential to falsify, or even affect, this logical exercise. In other words, the principle has no empirical content and cannot convey any information about economic relationships in the real world. Bernhofen (2005) objected that the principle of comparative advantage is indeed refutable, since the opposite of the principle (trade being harmful) is a possible outcome. However, refutability requires that a theory does not only have an opposing hypothesis, but that the theory may also be falsified with evidence


of the opposite (Popper, 1972). Opposing evidence is therefore more important than confirmative evidence for a refutability test. Consequently, the principle of comparative advantage can certainly be used to illustrate the correlation between openness to trade and growth, as found in Table 2, but the principle should not be trusted as a pathfinder in the quest for forecasting power. Likewise, neo-classical growth theory, which forecasts growth convergence, hinges on conditions that have yet to be observed, such as perfect competition and market equilibrium. As a result, observed growth divergence cannot possibly refute the convergence hypothesis (Meeusen, 2003), because convergence can always be suspected to occur sometime in the near or distant future, even if there are no present signs of it.

This line of reasoning brings us back to the reasons for not allowing YG to enter as an x-variable with a negative sign in a multiple regression with yG as a y-variable. This choice has been made in the present study, as opposed to, e.g., Gallup et al. (1999) and references therein. The first reason why this choice was made here is that such a regression would forecast that growth is lower in rich countries than in poor countries, which would contradict Table 2, which instead forecasts that the relationship between YG and yG is insignificant, and, if anything, positive (also supported by Barro, 1996). Furthermore, the analysis of the OpenG vs. yG relationship showed that the regression slope was rather constant all along the YG gradient, indicating no significant effect from YG in this respect. Second, the "normal" reason for ignoring results from individual correlations when constructing multi-variate models consists of references to economic theory (DeLurgio, 1998). The convergence theory, which would be needed to justify the use of YG in a negative correlation with yG, is irrefutable, as has been demonstrated above, and therefore ill-suited for forecasting. Third, since YG is non-stationary and yG is its derivative, any linear combination between YG and yG will have a non-stationary residual with an infinite variance (Jones, 1995; Clements and Hendry, 1999). Instead of using YG as a determinant of yG, a non-linear model can be constructed which varies according to YG. Table 3 indicates that such a model can increase the forecasting power compared to a linear model, since the R² value increased when the dataset was divided into groups according to YG values.

Future use of the criteria suggested in 2.1 is also motivated by the observation that when these criteria were violated, several random variables could enter as significant determinants of GDP growth in a multiple regression including historical growth data. This implies that if a


the contribution of such a variable may very well be spurious. This may explain why many multivariate growth regressions which generate high R² values sometimes contain explanatory variables with contradicting policy implications (Kenny and Williams 2001). Most of the multivariate growth models examined by Levine and Renelt (1992) were fragile to small changes, which should be another reason for questioning the commonly used criteria for explanatory variables in multiple regression analysis. A final defence of the criteria set in 2.1 is that these criteria were successfully used to develop Equation 2, whose forecasts/hindcasts were not less certain than the IMF's forecasts (Figure 4).

Similar methodological inconsistencies as those discussed above may have been fed into many growth regressions, as well as into other types of forecasting models. Conspicuously enough, the IMF's growth forecasts were found in this study to be of equally poor quality as simple forecasts and hindcasts constructed from historical statistics (see Equation 2 and Figures 2-4). These results are consistent with the findings regarding the reliability of the World Bank's (Verbeek, 1999) and the OECD's (Pons, 1999) growth forecasts in the 1990s. In an extensive review, Fildes and Stekler (2002) found that typical one-year growth forecasts deviate slightly more than 1 percentage unit from outcome, which is similar to the findings in this work (1.1 percentage units). Growth forecasts tend to be more uncertain the longer the time-horizon (Pons, 1999) and many long-term forecasts have turned out to be grossly inaccurate (Kenny and Williams, 2001).

Prairie's staircase (Prairie, 1996) provided a useful acceptance limit for forecasting power in the comparison between growth forecasts/hindcasts and outcome. The IMF's forecasts and Equation 2 generated different R² values when regressed against outcome and all values were below 0.65 (Figures 2 and 3) although both methods yielded rather similar errors (Figure 4).

However, the variation in regression equations from Figures 2 and 3 indicates that the regression slope must, in addition to the R² value, be taken into consideration in the evaluation of forecasting power. None of the other growth regressions examined in this study exceeded the acceptance limit (see Section 1) of R² = 0.65. Given that many of the R² values in Table 1 may be at least partly additive, it may very well be possible in the future to develop robust growth models with sufficient forecasting power, although there is reason to question the prospects of such attempts. Forecasts of growth and other macroeconomic indicators have not improved over time, despite extensive macroeconomic research (Fildes and Stekler, 2002). Ormerod and Mounfield (2000) argued that the great inherent variability in growth statistics makes forecast failure inevitable.
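The point that the slope must be considered together with R² can be illustrated with a short sketch of an ordinary least squares regression between forecast and outcome. The sketch assumes that outcome is regressed on forecast, so that a perfect forecast gives slope 1, intercept 0 and R² = 1. The function names and the data are hypothetical; only the acceptance limit of 0.65 is taken from the text.

```python
ACCEPTANCE_LIMIT = 0.65  # acceptance limit for R² used in this study


def regress_outcome_on_forecast(forecasts, outcomes):
    """Simple OLS of outcome (y) on forecast (x).

    Returns (slope, intercept, r_squared).
    """
    n = len(forecasts)
    if n != len(outcomes) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(forecasts) / n
    mean_y = sum(outcomes) / n
    sxx = sum((x - mean_x) ** 2 for x in forecasts)
    syy = sum((y - mean_y) ** 2 for y in outcomes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(forecasts, outcomes))
    if sxx == 0 or syy == 0:
        raise ValueError("forecasts and outcomes must both vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


def exceeds_acceptance_limit(r_squared, limit=ACCEPTANCE_LIMIT):
    """True if R² is above the acceptance limit."""
    return r_squared > limit


# Hypothetical example: outcomes are exactly twice the forecasts.
# R² is 1, yet every forecast is off by a factor of two, which only
# the slope reveals.
forecasts = [1.0, 2.0, 3.0, 4.0]
outcomes = [2.0, 4.0, 6.0, 8.0]
slope, intercept, r_squared = regress_outcome_on_forecast(forecasts, outcomes)
print(slope, intercept, r_squared, exceeds_acceptance_limit(r_squared))
```

In this invented case R² exceeds the acceptance limit although the forecasts systematically underestimate outcome, which is why a high R² alone does not demonstrate sufficient forecasting power.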

The forecasting power of growth models should not be seen as a marginal issue. As stressed in Section 1, the demonstrated shortcomings in contemporary growth forecasting may have extensive impacts on policy outcome. Global aggregate GDP per capita growth, as well as GDP per capita growth in many countries, has been substantially lower during recent decades, a period often referred to as the "age of globalisation" or the "neoliberal order", than during preceding decades, despite the fact that policy has often been designed according to the mainstream view among economists about the causes of growth (Rodrik, 1999; Maddison, 2001; Milanovic, 2003; Weisbrot et al., 2006). If growth forecasts were to improve in the future, this would imply that the present poor predictive understanding of economic growth has been part of the reason why long-term growth during recent decades has generally not surpassed, or even reached, the high levels achieved during the "Golden Age" of the 1950s and 1960s.

5. CONCLUSIONS

This study has evaluated the forecasting power of economic growth models and found that all investigated models yield very uncertain forecasts, and that these findings have strong support in the literature. The IMF's growth forecasts were significantly different from outcome in 7 years out of 8, and the forecast error was similar to that of a simple statistical model based on historical data. One reason for the observed poor forecasting power may be the frequent use of metaphysical constructs in growth models. Another reason may be that many growth models include explanatory variables which show no individual correlation with growth, or which are assumed to have an opposite, "concealed" effect in relation to what is implied by the variables' individual correlations with growth. Such growth models may be fragile to small changes and may even produce contradictory policy implications, in addition to yielding unreliable forecasts which may in turn cause extensive policy failure.

REFERENCES

Barro, R. J., 1996. Determinants of Economic Growth: A Cross-Country Empirical Study. NBER Working Paper 5698. NBER, Cambridge, Massachusetts, 118 p.

Bernhofen, D. M., 2005. The Empirics of Comparative Advantage: Overcoming the Tyranny of Nonrefutability. Review of International Economics, 13: 1017-1023.

Blaug, M., 1980. The Methodology of Economics. Cambridge University Press, Cambridge, 314 p.

Bolaky, B. and C. Freund, 2004. Trade, Regulations, and Growth. Research working paper WPS 3255, World Bank, Washington, 40 p.

Bougheas, S., Demetriades, P. O., and Mamuneas, T. P., 2000. Infrastructure, Specialization, and Economic Growth. The Canadian Journal of Economics, 33: 506-522.

Carkovic, M. and Levine, R., 2002. Does Foreign Direct Investment Accelerate Economic Growth? Working paper, University of Minnesota.

CID, 2007. http://www.cid.harvard.edu/ciddata/ciddata.html

Clements, M. P., and Hendry, D. F., 1999. Forecasting Non-stationary Economic Time Series. The MIT Press, Cambridge, Massachusetts, 362 p.

DeLurgio, S. A., 1998. Forecasting principles and applications. Irwin/McGraw-Hill, Boston, 802 p.

Dollar, D. and A. Kraay, 2004. Trade, Growth, and Poverty. The Economic Journal, 114: F22-F49.

Easterly, W., Kremer, M., Pritchett, L., and Summers, L. H., 1993. Good Policy or Good Luck? Country Growth Performance and Temporary Shocks. Journal of Monetary Economics, 32: 459-483.

Eichner, A. S., 1985. The lack of progress in Economics. Nature, 313: 427-428.

Fildes, R., and Stekler, H., 2002. The state of macroeconomic forecasting. Journal of Macroeconomics, 24: 435-468.

Gallup, J. L., Sachs, J. D. and Mellinger, A. D., 1999. Geography and economic development. International Regional Science Review, 22: 179-232.

Gylfason, T., 2001. Natural resources, education, and economic development. European Economic Review, 45: 847-859.

Hands, D. W., 2001. Economic methodology is dead - long live economic methodology: thirteen theses on the new economic methodology. Journal of Economic Methodology, 8: 49-63.

Hutchison, T. W., 1938. The Significance and Basic Postulates of Economic Theory. Macmillan and Co., London, 192 p.

IMF, 1998. IMF World Economic Outlook (May issue). IMF, Washington, D. C., 221 p.

IMF, 2002. IMF World Economic Outlook (April issue). IMF, Washington, D. C., 225 p.

Jones, C. I., 1995. Time Series Tests of Endogenous Growth Models. Quarterly Journal of Economics, 110: 495-525.

Kenny, C. and D. Williams, 2001. What Do We Know About Economic Growth? Or, Why Don't We Know Very Much? World Development, 29: 1-22.

King, R. G., and Levine, R., 1993. Finance and Growth: Schumpeter Might be Right. The Quarterly Journal of Economics, 108: 717-737.

Levine, R., and Renelt, D., 1992. A Sensitivity Analysis of Cross-Country Growth Regressions. The American Economic Review, 82: 942-963.

Levine, R., and Zervos, S., 1998. Stock Markets, Banks, and Economic Growth. The American Economic Review, 88: 537-558.

Maddison, A., 2001. The World Economy: A Millennial Perspective. OECD, Paris, 388 p.

Meeusen, W., 2003. Economic Convergence, the ‘Stylised Facts of Growth’ and Technological Progress. An introduction from the perspective of the theory of growth. CESIT Discussion paper No 2003/03. University of Antwerp, Antwerp, 21 p.

Milanovic, B., 2003. The Two Faces of Globalization: Against Globalization as We Know It. World Development, 31: 667-683.

Ormerod, P., and Mounfield, C., 2000. Random matrix theory and the failure of macroeconomic forecasts. Physica A: Statistical Mechanics and its Applications, 280: 497-504.

Perotti, R., 1996. Growth, Income Distribution, and Democracy: What the Data Say. Journal of Economic Growth, 1: 149-187.

Persson, T. and Tabellini, G., 1994. Is Inequality Harmful for Growth? Theory and Evidence. American Economic Review, 84: 600-621.

Pons, J., 1999. Evaluating the OECD's Forecasts for Economic Growth. Applied Economics, 31: 893-902.

Popper, K. R., 1972. Conjectures and Refutations; The Growth of Scientific Knowledge, 4th ed. Routledge and Kegan Paul, London and Henley, 431 p.

Prairie, Y. T., 1996. Evaluating the Predictive Power of Regression Models. Canadian Journal of Fisheries and Aquatic Sciences, 53: 490-492.

Rodionov, S. N., and J. E. Overland, 2005. Application of a sequential regime shift detection method to the Bering Sea ecosystem. ICES Journal of Marine Science, 62: 328-332.

Rodrik, D., 1999. Where Did All the Growth Go? External Shocks, Social Conflict, and Growth Collapses. Journal of Economic Growth, 4: 385-412.

Stanley, T. D., 1998. Empirical Economics? An Econometric Dilemma with Only a Methodological Solution. Journal of Economic Issues, 32: 191-218.

Verbeek, J., 1999. The World Bank's Unified Survey Projections: How Accurate Are They? An Ex-Post Evaluation of US91-US97. Policy Research Working Paper 2071. World Bank, Washington, D.C., 60 p.

Ventura, J., 2005. A Global View of Economic Growth. In: Aghion, P., and Durlauf, S. (eds.), Handbook of Economic Growth, Vol 1B. North Holland, Amsterdam, pp. 1419-1497.

Weisbrot, M., Baker, D., Rosnick, D., 2006. The Scorecard on Development: 25 Years of Diminished Progress. International Journal of Health Services, 36: 211-234.