
DEGREE PROJECT IN TECHNOLOGY, FIRST CYCLE, 15 CREDITS

STOCKHOLM, SWEDEN 2018

The Swedish Housing Market:

An Analysis of Contributing Factors on the Sales Price

With Discussion on the Amortization Requirement's Effects

TINTIN CLAESSON

VERONICA EHRSTRÖM EKLÖF


Degree Projects in Applied Mathematics and Industrial Economics

Degree Programme in Industrial Engineering and Management

KTH Royal Institute of Technology, year 2018

Supervisor at Valueguard AB: Lars-Erik Ericson

Supervisors at KTH: Daniel Berglund, Hans Lööf

Examiner at KTH: Henrik Hult


TRITA-SCI-GRU 2018:196 MAT-K 2018:15


Abstract

The purpose of this report was to determine the impact of different factors on the sales price of one-room apartments in Stockholm. An analysis follows of the repercussions that the amortization requirement has had on the sales price. The problem has been widely discussed, as many depend on the housing market. To regulate the housing market in Sweden, the Swedish government, together with Finansinspektionen, has chosen to introduce an amortization requirement. This means that every new loan taker must amortize 1 or 2 percent of the total loan.

The method of analysis was multiple linear regression, in which several different variables were applied. The most important parameters were different measurements of time and place, but less intuitive variables, such as quality of water, were also included. The data used for the analysis were supplied by Valueguard AB and cover 2013 to the present day, comprising approximately 20 000 data points.

The result of the analysis was not very surprising. It is concluded that the sales price of a condominium in Stockholm City follows a price curve that depends on the time of year. It is also clear that external factors such as distance to water make a difference in price. As far as the amortization requirement is concerned, the steep price curve over time has started to flatten since 2016, but not by much. This could be a consequence of the scope of the analysis, as only one-room apartments were included.


Sammanfattning (Summary)

The purpose of this report was to investigate what impact different factors have on the sales price of one-room apartments in Stockholm. It further discusses how the amortization requirement has affected the sales price. The chosen research question has been widely debated, since large parts of society are affected by changes in the housing market. To regulate the housing market in Sweden, the Swedish government, together with Finansinspektionen, has introduced an amortization requirement, which means that all new borrowers are obliged to amortize one or two percent of the total loan.

The analysis was based on multiple linear regression, in which several different variables were applied. The most important variables were different measures of time and place. Less common variables, such as water quality, were also included in the analyses. The data used were provided by Valueguard AB and extend from 2013 onward; the dataset comprised approximately 20 000 points.

The result of the analysis was hardly surprising. The final price of a home in Stockholm follows a price curve that depends on the time of year at which it is sold. It was also clear that external factors, among them the distance to water, affected the price. Regarding the amortization requirement, the analysis shows that the steep curve of price over time has been dampened somewhat since the requirement was introduced. This may be a consequence of the chosen delimitation of the analysis, since only one-room apartments in the City of Stockholm are included.


Acknowledgments

Regards to the tutors at Kungliga Tekniska Högskolan in Stockholm and a very special thanks to Lars-Erik Ericson and his crew at Valueguard AB, who generously lent us the dataset.


Contents

List of Figures
List of Tables
1 Introduction
  1.1 Background
  1.2 Research Question
  1.3 Goal and Purpose
  1.4 Scope and Limitations
2 Theoretical Framework
  2.1 Economic Theory
    2.1.1 Loan-to-Value Ratio
    2.1.2 Macroeconomic Consequences of a High Debt Ratio
    2.1.3 History of the Amortization Requirement
    2.1.4 Supply and Demand
    2.1.5 Hedonic Pricing
    2.1.6 Rational Choice Theory
    2.1.7 Behavioral Finance
  2.2 Mathematical Theory
    2.2.1 Definition of the Model
    2.2.2 Dummy Variables
    2.2.3 Model Assumptions
    2.2.4 Ordinary Least Square Estimate
    2.2.5 Definition of Residuals
    2.2.6 Standardized Residuals
    2.2.7 Studentized Residuals
    2.2.8 Multicollinearity
    2.2.9 Model Validation
    2.2.10 Variable Selection
    2.2.11 Data Selection
    2.2.12 Transformation on the Regressor Variables
3 Methodology
  3.1 Variable Analysis
    3.1.1 Explanation of Variables
    3.1.2 Excluded Variables
    3.1.3 Model definitions
    3.1.4 Analysis of AIC, BIC and $R^2$
    3.1.5 Best Fitted Models
  3.2 Data Analysis
    3.2.1 Processing of Data
    3.2.2 Data Reduction
    3.2.3 Final Dataset
  3.3 Transformation of the Regression Model
    3.3.1 Box-Cox Transformation
    3.3.2 Quadratic Transformation
  3.4 Final Model
4 Result
  4.1 Presentation of Result
  4.2 Regression Equation
5 Discussion
  5.1 Mathematical Discussion
  5.2 Economic Discussion
References

List of Figures

1 The Debt to Income in Sweden
2 The Debt to Income Broken Down by County. Source: Ibid
3 The Response Variable vs the Regressors
4 Residual Analysis
5 QQ-plot of Standardized and Studentized Residuals
6 CovRatio with Marked Potential Outliers
7 Cook's Distance with Marked Potential Outliers
8 Residuals vs Fitted Value after Removal of Outliers
9 Cook's Distance after Removal of Outliers
10 Residual Plot after Box-Cox Transformation
11 Residuals after Quadratic Transformation
12 Residual Plot of Final Model
13 Distance to Water
14 Building Year
15 Recounted Price
16 Change of Price by Date
17 Seasonal Changes in Price

List of Tables

1 ANOVA table of Full Model
2 Variables used in the Analysis
3 Result of Variance Inflation Factors
4 Results of Stepwise Regression
5 Model Specifications
6 AIC, BIC and $R^2$
7 ANOVA Table of Model 6
8 ANOVA Table of Model 8
9 Value of Estimates

1 Introduction

1.1 Background

A condominium can be created in two ways: by converting a rented apartment, or when new housing is built and sold. New construction is costly and time-consuming. The conversion of rented homes into condominiums has reduced the supply of rented apartments and increased the demand for condominiums. The combination of increased demand for condominiums and a persistently low level of interest and mortgage rates has driven prices higher, making it necessary for buyers to increase their housing loans (1).

Over a long period in Sweden, debt has grown faster than the country's income. This means that the debt ratio increased in the same years, from 1990 to 2010, leading to extended recessions. Furthermore, Swedish housing prices have increased rapidly since 1990, leading to a greater need for Swedish households to take housing loans. The consequences are closely tied to macroeconomic risks: with a higher debt ratio, households are less eager to spend money, which slows down the economy (1).

Finansinspektionen and the Swedish government have therefore recommended an amortization requirement on Swedish housing loans on behalf of the European Parliament. Finansinspektionen introduced the new rule in June 2016, and it applies to every bank in Sweden. The amortization requirement means that the loan taker is obligated to amortize 1 or 2 percent of the total loan, depending on the loan-to-value ratio of the household. The amortization requirement covers all housing loans (1).

The purpose of the requirement is to increase the amortization ratio and accordingly decrease the debt ratio, but since the rule is relatively new, the consequences have not yet been detected in the housing market. One goal of this study is to discuss the repercussions of the amortization requirement in the housing market (2).

1.2 Research Question

The research question has been formulated as:

What factors contribute to the sales price of condominiums in Stockholm City, and what are the effects of the amortization requirement?

The part of the research question regarding the sales price of condominiums will be addressed using parameter analysis in multiple linear regression, employing data provided by Valueguard AB. As a part of the analysis, the feasibility of the provided data will be discussed. The second part of the research question, regarding the amortization requirement, will be discussed based on a literature study.

1.3 Goal and Purpose

The aim of this study is, as previously stated, to identify factors that determine the price of condominiums and to analyze the effects of the amortization requirement on the housing market in Stockholm, using multiple linear regression and economic research. Since the amortization requirement was introduced in June 2016, it is of interest to examine whether any consequences have appeared in the market. The amortization requirement is likely to have a great impact on the housing market, so a study of its implications is of great relevance.

The relevance of this study is also significant considering the changes in the Swedish housing market over the past ten years. These changes include the major rise in sales prices and the increase in the supply of condominiums (2). The stakeholders for whom this is relevant are buyers, sellers or owners, politicians, architects and contractors.

1.4 Scope and Limitations

The purpose of this study is to gain insight into the Swedish housing market. It strictly handles condominiums, as the market for rented housing is continuously changing and has a significant amount of illegal trade, meaning that people rent apartments without valid contracts. Furthermore, the report is limited to Stockholm. The time period investigated runs from 2013 to the present day.

The limitation of this analysis concerning the amortization requirement, introduced in June 2016, is that only short-term consequences can be analyzed. The implications of the additional amortization requirement, introduced in March 2018, will not be investigated to any extent. Therefore, in the remaining parts of the report, the amortization requirement refers to that of June 2016 unless stated otherwise.


2 Theoretical Framework

2.1 Economic Theory

2.1.1 Loan-to-Value Ratio

The Loan-to-Value ratio, LTV, is a financial measure used in risk assessment when examining a mortgage application. It is given as a decimal or a percentage and is calculated as the ratio of the mortgage amount to the appraised value of the property. A high LTV ratio means that the mortgage is large relative to the value of the property, and the mortgage is therefore at greater risk. It thus indicates how vulnerable a household is to falling house prices. As a result, a loan with a high LTV ratio generally carries a higher interest rate (3).

$$\mathrm{LTV} = \frac{\text{Mortgage Amount}}{\text{Appraised Value of the Property}} \tag{1}$$

2.1.2 Macroeconomic Consequences of a High Debt Ratio

Debt Ratio

Debt Ratio is a financial ratio that measures total debt in relation to total assets. It is given as a decimal or a percentage and can be interpreted as the proportion of a company's or society's assets that is financed by debt (4).

The consequences of high debt in a company or society are closely related to fluctuations in the economy. When debt is high, a larger share of economic assets is tied up, which reduces liquidity. Implications include, among other things, greater vulnerability to financial shocks or changes in interest rates, as the company needs a major part of its assets to reduce and handle debt (5).

$$\text{Debt Ratio} = \frac{\text{Total Debt}}{\text{Total Assets}} \tag{2}$$

Household Debt

For a household to be able to spread its consumption over a lifetime, one of the most important requirements is a functioning credit market. To increase their possibilities of buying a home, households must be able to take loans. However, if households carry too much debt, negative consequences are likely. The main reason for increasing household debt is rising house prices, which can result from low interest rates, because demand for condominiums increases when interest rates are low. Other reasons for higher household debt are a growing population and a smaller supply of new construction (6).

The individual household is exposed to risk because of the debt ratio. First, when taking a loan the household accepts a payment responsibility for a long period of time. Changes in the economy are therefore a great risk, as economic shocks and adjustments may not have been taken into account when the loan was taken. Second, a large debt in relation to the value of the house makes the household sensitive to falling housing prices. In extreme cases there is a risk that the value of the house falls below the size of the loan, leading to negative equity. Furthermore, a large debt in relation to the household's income is also a vulnerability in the event of a loss of income or a rise in interest rates (7). In the case of negative economic changes, households have to prioritize their mortgage and interest payments. Such behavior may lead to cutbacks in consumption and consequently to recession (7).

Debt Ratio in Sweden

The debt ratio in Sweden grows with each passing year, see Figure 1. The total amount of debt is SEK 3.8 trillion, of which approximately SEK 3 trillion is mortgage loans, which is 77 percent of the total debt. The Swedish LTV ratio is 64 percent (8).

Figure 1: The Debt to Income in Sweden.

The debts are quite irregularly distributed in the country, but the obvious con-


Figure 2: The Debt to Income Broken Down by County. Source: Ibid

and Göteborg where the prices are generally the highest in Sweden. The higher prices have led to an increase of 80 percent in debt (1).

Households with a high debt ratio are more sensitive to fluctuations in the market than households with a lower debt ratio. Fluctuations can include changes in interest rates or loss of income. Households with a high debt ratio are more sensitive because they pay a larger portion of their income in interest and amortization. Therefore, if the interest rate increases, these households have to decrease savings and consumption or are forced to move to a cheaper home, which can lead to recession (1).

To force home buyers to decrease their debt, Finansinspektionen introduced an amortization requirement in June 2016, which obligates loan takers to amortize 1 or 2 percent depending on the loan-to-value ratio. According to Finansinspektionen, the amortization requirement has led to people buying cheaper real estate, taking smaller loans and using a more substantial part of their savings to finance their home purchases, with the repercussion that households are less sensitive to fluctuations in the stock or housing market. As a result of the initial amortization requirement, households borrowed 9 percent less and bought houses that were 3 percent cheaper (1).

Nevertheless, risks remain. There are still households taking loans with a high debt ratio, mainly since prices have increased significantly more than the average income. This could have serious macroeconomic consequences (1).

In response, a supplementary amortization requirement was introduced in March 2018, requiring households with debt ratios higher than 450 percent to amortize one additional percent of their loans each year (1).
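The two requirements described above can be sketched as a small rule. This is only an illustration: the function names are hypothetical, and the loan-to-value bands of 50 and 70 percent are an assumption, since the text states only that the rate is 1 or 2 percent depending on the loan-to-value ratio. The 450 percent debt-ratio limit is the one quoted above.

```python
def loan_to_value(mortgage, appraised_value):
    """LTV as in Equation (1): mortgage amount / appraised value."""
    if appraised_value <= 0:
        raise ValueError("appraised value must be positive")
    return mortgage / appraised_value


def required_amortization_rate(ltv, debt_to_income):
    """Yearly amortization as a fraction of the loan.

    ASSUMPTION: the LTV bands (above 50 % -> 1 %, above 70 % -> 2 %) are
    not stated in the text. The 450 % debt-to-income limit is from the text.
    """
    if ltv > 0.70:
        rate = 0.02
    elif ltv > 0.50:
        rate = 0.01
    else:
        rate = 0.0
    if debt_to_income > 4.5:  # supplementary requirement, March 2018
        rate += 0.01
    return rate


# Hypothetical household: SEK 2.4 million loan on a SEK 3 million home,
# with total debt equal to five times yearly income.
loan = 2_400_000
ltv = loan_to_value(loan, 3_000_000)
rate = required_amortization_rate(ltv, debt_to_income=5.0)
yearly_payment = rate * loan
print(f"LTV = {ltv:.2f}, rate = {rate:.2%}, yearly amortization = SEK {yearly_payment:,.0f}")
```

A household below both limits gets a rate of zero in this sketch, which matches the idea that the requirement targets highly leveraged borrowers.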


2.1.4 Supply and Demand

The concept of Supply and Demand is one of the most basic concepts in economic theory. Supply indicates how much of a good or service is available in the market, and demand refers to how much of a good or service is desired by consumers. The relationship between the two is reflected in the price of the good or service. The price can therefore be altered by the behavior of buyers and sellers. When the price decreases, the willingness of buyers to purchase increases and the willingness of sellers to sell decreases. As a result, an equilibrium is obtained and a price is set (9).

2.1.5 Hedonic Pricing

In contrast to the classic theory of supply and demand, Hedonic Pricing recognizes that prices are determined by both internal and external characteristics. It is commonly used when pricing real estate, as several factors can affect the price of a home. The internal factors can include size, condition, appearance and features such as balconies or solar panels. The external factors can be the crime rate in the neighborhood or the access to schools (10).

The hedonic pricing method is used to estimate the extent to which each factor affects the price (10).

2.1.6 Rational Choice Theory

The Rational Choice Theory states that consumers always make logical decisions in order to maximize gain and minimize loss, so that their decisions result in the greatest benefit or satisfaction. The Rational Choice Theory underlies the assumptions of many economic theories. However, newer theories such as Behavioral Finance challenge these assumptions (11).

2.1.7 Behavioral Finance

Studies in Behavioral Finance have emerged to complement the theories of Rational Choice. Behavioral Finance is based on the belief that people sometimes make irrational investment decisions. These irrational decisions can explain why bubbles and panics occur or why individuals make expensive investments (12).


2.2 Mathematical Theory

The mathematical theory is written with Introduction to Linear Regression Analysis by Douglas C. Montgomery, Elizabeth A. Peck and G. Geoffrey Vining as reference unless stated otherwise.

2.2.1 Definition of the Model

A multiple linear model with i predictor variables $X_1, X_2, \dots, X_i$ and one response variable $Y$ can be expressed as

$$Y_j = \beta_0 + \beta_1 x_{j1} + \dots + \beta_i x_{ji} + \varepsilon_j, \quad j = 1, \dots, n, \tag{3}$$

where $\varepsilon_j$ is a normally distributed random deviation with mean 0 and variance $\sigma^2$; that is, $\varepsilon_j \sim N(0, \sigma^2)$ for all $j$.

This model can also be written in matrix form. With

$$Y = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}, \quad X = \begin{pmatrix} 1 & x_{11} & x_{12} & \dots & x_{1i} \\ 1 & x_{21} & x_{22} & \dots & x_{2i} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & x_{n2} & \dots & x_{ni} \end{pmatrix},$$

$\beta = (\beta_0, \beta_1, \dots, \beta_i)^T$ and $\varepsilon = (\varepsilon_1, \varepsilon_2, \dots, \varepsilon_n)^T$, (3) can be expressed as

$$Y = X\beta + \varepsilon. \tag{4}$$

Thus, (4) implies that $Y \sim N(X\beta, \sigma^2 I)$, where $I$ is an $(n \times n)$ identity matrix.

Here $Y$ is the response variable that is predicted by the covariates $x_{jk}$. The unknown parameter vector $\beta$ is estimated using data on the covariates. $\varepsilon$ is the error term, with the requirement that it is normally distributed.

2.2.2 Dummy Variables

Dummy variables, or indicator variables, are used to account for the effect that a qualitative variable has on the response. A dummy variable is set to 1 or 0: 1 if the corresponding category applies to the observation and 0 if it does not. The dummy variables are defined so as to avoid potential problems with multicollinearity.
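As an illustration of how dummy coding can avoid multicollinearity, the sketch below encodes a month as 11 indicators instead of 12. The function name and the choice of January as reference month are assumptions made for the example, not something the report specifies. Leaving one category out means the indicators never sum to the intercept column.

```python
def month_dummies(month, reference=1):
    """Encode a month (1-12) as 11 indicator variables.

    ASSUMPTION: the reference month (January by default) is a hypothetical
    choice. It gets no column, so the dummies are never an exact linear
    combination of the intercept column.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    levels = [m for m in range(1, 13) if m != reference]
    return [1 if month == m else 0 for m in levels]


january = month_dummies(1)  # reference month: all indicators are zero
march = month_dummies(3)    # exactly one indicator is set
print(january)
print(march)
```

With this coding, the coefficient of each indicator is read as the price difference relative to the reference month.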

2.2.3 Model Assumptions

There are five major assumptions that are required in regression analysis. The validity of these assumptions should be considered doubtful and subject to analysis.

1. Linear Relationship

The relationship between the response variable and the regressors is linear, at least approximately.

2. Strict exogeneity

The error term $\varepsilon$ has zero mean.

3. No Multicollinearity

The number of observations is greater than the number of variables. There exists no exact linear relationship between the variables.

4. Homoscedasticity

The errors have the same variance.

5. No auto-correlation

The errors are uncorrelated. They are also assumed to be normally distributed.

In order to detect the validity of these assumptions several diagnostics can be useful. These will be presented at later stages of the report.

2.2.4 Ordinary Least Square Estimate

The method of Ordinary Least Squares can be used to estimate the regression coefficients. The ordinary least squares estimator provides an estimate of $\beta$, namely $\hat{\beta}$, by minimizing the sum of squared errors. The assumptions presented in 2.2.3 are required.

$$S(\beta) = (y - X\beta)^T (y - X\beta) \tag{5}$$

$$\left.\frac{\partial S}{\partial \beta}\right|_{\hat{\beta}} = -2X^T y + 2X^T X \hat{\beta} = 0 \tag{6}$$

$$\hat{\beta} = (X^T X)^{-1} X^T y \tag{7}$$
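Equation (7) can be sketched in a few lines of plain Python by solving the normal equations $X^T X \hat{\beta} = X^T y$ directly. The helper names and the toy data are hypothetical; the report itself used R, and for real data a numerical library would be preferable to this hand-written solver.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular (perfect multicollinearity)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols(x_rows, y):
    """Return beta_hat solving (X^T X) beta = X^T y; a column of ones is prepended."""
    X = [[1.0] + list(row) for row in x_rows]
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(xi * yi for xi, yi in zip(col, y)) for col in Xt]
    return solve(XtX, Xty)


# Toy data generated without noise from y = 1 + 2*x1 + 3*x2
x_rows = [(1, 0), (2, 1), (3, 0), (4, 2), (5, 1)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in x_rows]
beta_hat = ols(x_rows, y)
print([round(b, 6) for b in beta_hat])
```

Because the toy response is an exact linear function of the regressors, the estimate recovers the generating coefficients up to rounding. The singularity check is where perfect multicollinearity would surface.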

2.2.5 Definition of Residuals

Residuals are defined as


several important assumptions. As listed in chapter 2.2.3 the residual has zero mean and average variance estimated by

$$\frac{\sum_{i=1}^{n} (e_i - \bar{e})^2}{n - p} = \frac{\sum_{i=1}^{n} e_i^2}{n - p} = \frac{SS_{Res}}{n - p} = MS_{Res} \tag{9}$$

2.2.6 Standardized Residuals

Standardized residuals are a handy tool for detecting outliers and influential points. They are calculated by scaling the residuals with the square root of the approximate average variance $MS_{Res}$:

$$d_i = \frac{e_i}{\sqrt{MS_{Res}}}, \quad i = 1, 2, \dots, n \tag{10}$$

A large standardized residual, $|d_i| > 3$, indicates an outlier.
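The scaling in Equation (10) and the cut-off of 3 can be sketched as follows. The residual values are made up for the illustration, and `p` denotes the number of estimated parameters as in Equation (9).

```python
import math


def standardized_residuals(residuals, p):
    """d_i = e_i / sqrt(MS_Res), with MS_Res = sum(e_i^2) / (n - p)."""
    n = len(residuals)
    if n <= p:
        raise ValueError("need more observations than parameters")
    ms_res = sum(e * e for e in residuals) / (n - p)
    return [e / math.sqrt(ms_res) for e in residuals]


# Hypothetical residuals: 19 small values and one large one at index 19
residuals = [0.1, -0.1] * 9 + [0.05, 5.0]
d = standardized_residuals(residuals, p=2)

# Indices of observations flagged by the |d_i| > 3 rule
flagged = [i for i, value in enumerate(d) if abs(value) > 3]
print(flagged)
```

Since $MS_{Res}$ is itself inflated by a large residual, $|d_i|$ cannot exceed $\sqrt{n - p}$, so the rule is only informative when the sample is reasonably large.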

2.2.7 Studentized Residuals

Studentized residuals are also a good tool for identifying influential points and outliers. In this case the residuals are scaled with their estimated standard deviation:

$$r_i = \frac{e_i}{\sqrt{MS_{Res}(1 - h_{ii})}}, \quad i = 1, 2, \dots, n \tag{11}$$

where $h_{ii}$ is the i-th diagonal element of the hat matrix $H = X(X^T X)^{-1} X^T$. The studentized residuals have constant variance $Var(r_i) = 1$ regardless of the location of $x_i$ when the form of the model is accurate. A point with both a large residual and a large $h_{ii}$ is probably of high influence.

2.2.8 Multicollinearity

Multicollinearity occurs when one regressor can be linearly predicted from the other regressors with high accuracy, meaning that there is very high intercorrelation between the independent variables. This severely affects the tested model: the coefficient estimates become unstable, so a regression model with multicollinearity may not be reliable.

Diagnostics

Multicollinearity can be detected with several methods, a few will be presented below.

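The Variance Inflation Factor, VIF, measures how much the variance of each estimated regression coefficient is inflated by linear dependence among the regressors. The limit for an acceptable VIF has been set to 10; values over 10 therefore indicate multicollinearity. The VIF is given by

$$VIF_j = \frac{1}{1 - R_j^2}, \tag{12}$$

where $R_j^2$ is the coefficient of determination obtained when regressor $x_j$ is regressed on the remaining regressors.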
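To give a feel for Equation (12), the sketch below evaluates the VIF for a few hypothetical values of $R_j^2$. The function name and the sample values are assumptions for illustration only.

```python
def vif(r_squared_j):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    regressor j on all the other regressors."""
    if not 0.0 <= r_squared_j < 1.0:
        raise ValueError("R_j^2 must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared_j)


LIMIT = 10.0  # acceptance limit used in this report

# Hypothetical R_j^2 values, not taken from the dataset
for r2 in (0.0, 0.5, 0.8, 0.95):
    value = vif(r2)
    verdict = "multicollinearity suspected" if value > LIMIT else "acceptable"
    print(f"R_j^2 = {r2:.2f} -> VIF = {value:.2f} ({verdict})")
```

The limit of 10 corresponds to roughly 90 percent of a regressor's variation being explained by the other regressors.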


Variance decomposition proportions are used to detect eigenvalues that dramatically inflate the variance and, as a consequence, indicate multicollinearity. Values greater than 0.5 should be excluded.

Treatments

Multicollinearity can be treated by:

1. Collection of additional data

2. Model respecification

   a. Redefine the regressors

   b. Variable elimination

3. Use of estimation methods designed to combat multicollinearity

In the case of non-orthogonal data, Ridge Regression can be used as a supplement to least squares.

2.2.9 Model Validation

The following subsections present methods used to assess feasibility and the validity of the tested model.

$R^2$

The $R^2$ value can be used to assess the adequacy of the model. $R^2$ measures the proportion of the variation in the response that is explained by the regressors, i.e. how well the model fits the data. $R^2$ is given by Formula 13.

$$R^2 = \frac{SS_R}{SS_T} = 1 - \frac{SS_{Res}}{SS_T} \tag{13}$$

where

$$SS_T = SS_R + SS_{Res} \tag{14}$$

$$SS_R = \hat{\beta}^T X^T y - \frac{1}{n}\left(\sum_{i=1}^{n} y_i\right)^2 \tag{15}$$


hypothesis should be rejected. The P-value is given by

$$p = \Pr(X \geq F), \quad X \sim F_{k,\,n-k-1} \tag{17}$$

F-test

The F-test is a test whose statistic follows an F-distribution under the null hypothesis. The F-statistic is compared to the corresponding value in an F-table. The F-statistic can also be used to assess homoscedasticity, meaning that all random variables in a sequence have the same finite variance.

The null hypothesis $H_0$ states that all regression coefficients are zero,

$$H_0: \beta_1 = \dots = \beta_k = 0, \tag{18}$$

and it can be rejected if at least one of the regression coefficients is significant. To test the null hypothesis, the F-value is computed as

$$F_0 = \frac{SS_R / k}{SS_{Res} / (n - k - 1)} = \frac{MS_R}{MS_{Res}}, \tag{19}$$

where $k$ is the number of regressors and $n$ the number of observations. Under $H_0$ the $F_0$-value follows an $F_{k,\,n-k-1}$-distribution, and if $F_0 > F_{\alpha,k,n-k-1}$ then the null hypothesis is rejected.
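Formulas (13), (14) and (19) reduce to simple arithmetic once the sums of squares are known. The sketch below uses made-up sums of squares, and the critical value is an assumption read from a standard F-table for $\alpha = 0.05$ with 4 and 20 degrees of freedom.

```python
def r_squared(ss_r, ss_res):
    """R^2 = SS_R / SS_T with SS_T = SS_R + SS_Res (Formulas 13 and 14)."""
    return ss_r / (ss_r + ss_res)


def f_statistic(ss_r, ss_res, n, k):
    """F_0 = (SS_R / k) / (SS_Res / (n - k - 1)) (Formula 19)."""
    ms_r = ss_r / k
    ms_res = ss_res / (n - k - 1)
    return ms_r / ms_res


# Hypothetical values, not taken from the report's ANOVA tables
ss_r, ss_res, n, k = 80.0, 20.0, 25, 4

r2 = r_squared(ss_r, ss_res)
f0 = f_statistic(ss_r, ss_res, n, k)

# ASSUMPTION: critical value F_{0.05, 4, 20} from a standard F-table
f_critical = 2.87
reject_h0 = f0 > f_critical
print(f"R^2 = {r2:.3f}, F0 = {f0:.2f}, reject H0: {reject_h0}")
```

In practice the comparison is usually made through the p-value reported by the statistical software rather than a printed table.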

Residual Analysis

Residual analysis can be an efficient way to check the validity of the model. The residuals are plotted against the fitted values and the plot is then analyzed.

2.2.10 Variable Selection

Variable selection is used to detect unwanted variables. The method of choice for variable selection is stepwise regression with AIC as a pre-specified criterion.

The AIC and BIC can also be used for variable selection.

Stepwise Regression

Stepwise regression can be used for variable selection and has three different procedures: forward, backward and both-way regression. The overall goal is to compare and analyze models with different numbers of regressors in order to select the final, best-fitting one. The methods differ in the way variables are added or deleted. For an accurate analysis, all three methods are recommended.

AIC

Akaike's Information Criterion, AIC, is a statistical tool that compares the relative quality of a group of statistical models. The AIC is based on maximizing the entropy, a measure of expected information, of the model. Thus, the AIC identifies the best model of the group but does not indicate the overall quality of that model. Small values of the AIC are desired.

The definition of the AIC is given by

$$AIC = 2k + n\ln|\varepsilon|^2 \tag{20}$$

BIC

The Bayesian Information Criterion, BIC, is used in statistics for model selection among a finite set of models. The BIC is based on the likelihood function and is closely related to Akaike's Information Criterion, AIC. In contrast to the AIC, the BIC places a greater penalty on adding regressors to the model. As with the AIC, the smallest value of the BIC is preferred.

The definition of the BIC is given by

$$BIC = k\ln(n) + n\ln\left(\frac{SS_{Res}}{n}\right) \tag{21}$$
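The different penalties of the two criteria can be illustrated with a small comparison. The BIC below follows Equation (21); for the AIC the sketch assumes the common least-squares form $2k + n\ln(SS_{Res}/n)$, which may differ from the report's Equation (20) by a term that is constant across models. All numbers are hypothetical.

```python
import math


def aic(ss_res, n, k):
    # ASSUMPTION: least-squares form 2k + n ln(SS_Res / n)
    return 2 * k + n * math.log(ss_res / n)


def bic(ss_res, n, k):
    # Equation (21): k ln(n) + n ln(SS_Res / n)
    return k * math.log(n) + n * math.log(ss_res / n)


n = 100
small = {"ss_res": 52.0, "k": 3}  # fewer regressors, slightly worse fit
large = {"ss_res": 48.0, "k": 5}  # more regressors, slightly better fit

aic_small, aic_large = aic(n=n, **small), aic(n=n, **large)
bic_small, bic_large = bic(n=n, **small), bic(n=n, **large)

print("AIC prefers:", "large" if aic_large < aic_small else "small")
print("BIC prefers:", "large" if bic_large < bic_small else "small")
```

With these numbers the AIC favors the larger model while the BIC favors the smaller one, because each extra regressor costs $\ln(n)$ in the BIC but only 2 in the AIC.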

2.2.11 Data Selection

The following methods are used to find influential points that should be considered for removal from the dataset.

CovRatio

CovRatio measures the impact of the i-th observation on the precision of estimation. If CovRatio > 1 the observation improves the precision; if CovRatio < 1 the observation degrades the precision.

Cook's Distance

Cook's Distance is a deletion diagnostic. It is based on the squared distance between the least-squares estimate $\hat{\beta}$ based on all points and the estimate obtained by deleting the i-th point; this is Cook's distance measure, D.

D > 2

√ (22)


method to select the transformations is to look at the residual scatter plot. If a pattern can be seen in the residual plot, this indicates non-linearity in the model and transformation is needed.

Box-Cox transformation

The Box-Cox transformation is also referred to as the power transformation.

As the name indicates, the transformation is done by raising y to the power of λ, which is a parameter to be determined.

The Box-Cox transformation provides a way to estimate both the parameters of the regression model and λ simultaneously, using the maximum likelihood method. The transformation is done to correct non-normality and non-constant variance in the model.
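A minimal sketch of the power transformation is given below. It assumes the common form $(y^{\lambda} - 1)/\lambda$ with the logarithm as the limiting case for $\lambda = 0$; the text does not spell out the exact form used, and the estimation of λ by maximum likelihood is not shown. The sample prices are hypothetical.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a positive value y.

    ASSUMPTION: the form (y**lam - 1) / lam is used, with ln(y) for lam = 0.
    """
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


# Hypothetical sales prices in SEK million
prices = [1.8, 2.4, 3.1, 4.0, 6.5]
for lam in (1.0, 0.5, 0.0):
    transformed = [round(box_cox(p, lam), 3) for p in prices]
    print(f"lambda = {lam}: {transformed}")
```

Values of λ below 1 compress large responses more than small ones, which is why the transformation can stabilize a variance that grows with the price level.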


3 Methodology

This section presents the results of the tests and regressions, based on the theory presented above. The analysis started with an Analysis of Variance, ANOVA, presented in Table 1.

Table 1: ANOVA table of Full Model

| Variable Name | DF | SumSq | MeanSq | Fvalue | Pr(>F) | Significance |
|---|---|---|---|---|---|---|
| id number | 1 | 1.6662e+15 | 1.6662e+15 | 71045.740 | <2.22e-16 | *** |
| x-coordinate | 1 | 4.0144e+14 | 4.0144e+14 | 17117.286 | <2.22e-16 | *** |
| y-coordinate | 1 | 1.5554e+15 | 1.5554e+15 | 66323.362 | <2.22e-16 | *** |
| building type | 3 | 4.6798e+12 | 1.5599e+12 | 66.515 | <2.22e-16 | *** |
| living area | 1 | 1.0273e+15 | 1.0273e+15 | 43803.498 | <2.22e-16 | *** |
| monthly fee | 1 | 3.1371e+14 | 3.1371e+14 | 13376.529 | <2.22e-16 | *** |
| construction year | 1 | 3.0810e+14 | 3.0810e+14 | 13137.423 | <2.22e-16 | *** |
| floor | 1 | 6.5839e+13 | 6.5839e+13 | 2807.861 | <2.22e-16 | *** |
| floors in building | 1 | 2.3421e+13 | 2.3421e+13 | 998.646 | <2.22e-16 | *** |
| recounted price | 1 | 2.3051e+15 | 2.3051e+15 | 98287.533 | <2.22e-16 | *** |
| assembly | 1 | 7.3627e+11 | 7.3627e+11 | 31.395 | 2.134e-8 | *** |
| water distance | 1 | 3.8135e+11 | 3.8135e+11 | 16.261 | 5.540e-5 | *** |
| water quality | 1 | 1.0707e+12 | 1.0707e+12 | 45.654 | 1.450e-11 | *** |
| local area | 1 | 1.3621e+12 | 1.3621e+12 | 58.081 | 2.628e-14 | *** |
| month of the year | 11 | 2.9331e+13 | 2.6664e+12 | 113.697 | <2.22e-16 | *** |
| date | 1 | 2.6661e+14 | 2.6661e+14 | 11368.392 | <2.22e-16 | *** |
| Residuals | 19804 | 4.6445e+14 | 2.3452e+10 | | | |

3.1 Variable Analysis

3.1.1 Explanation of Variables

The variables given to us are presented in Table 2.

Comments on the Variable Month of the Year

Month of the year is a special variable that is formed as a vector with 12 covariates. This is because it counts how many sales are executed each month, which makes it possible to distinguish the price changes during the year to be able to


Table 2: Variables used in the Analysis

| Variable Name | Variable Meaning | Referred in text as |
|---|---|---|
| didnr | ID number of the sale | id number |
| xcoord | Geographic coordinate in the x-plane | x-coordinate |
| ycoord | Geographic coordinate in the y-plane | y-coordinate |
| htype | Building type | building type |
| harea | Living area, measured in m² | living area |
| hmonthlyfee | Monthly fee of the apartment, measured in SEK | monthly fee |
| byear | Year of construction of the building | construction year |
| hfloor | Floor of the apartment | floor |
| bfloors | Total number of floors in the building | floors in building |
| recountedprice | Present value of apartment, measured in SEK | recounted price |
| alkf | Local assembly, gives a hint of the location | assembly |
| wdist | Distance to water | water distance |
| wqual | Quality of water | water quality |
| omr | Local area | local area |
| mn | Indicator of the month of the year | month of the year |
| ym | Date measured in year and month | date |
| hrooms | Number of rooms in the apartment | rooms |
| tatortfolkmangd | Population of the urban location | population |
| month | Month number starting from January 2013 | month index |

Figure 3: The Response Variable vs the Regressors

month index and population should be excluded from analysis. This was concluded from the nature of the plots and confirmed by the alias function in R.

Multicollinearity

Multicollinearity was tested using the Variance Inflation Factor, VIF. The result is presented in Table 3. The conclusion of the analysis was that no variables should be excluded, as all included variables had a VIF < 5.

Table 3: Result of Variance Inflation Factors

Variable Name        GVIF    DF   GVIF^(1/(2DF))
id number            4.833    1   2.198
x-coordinate         1.943    1   1.394
y-coordinate         2.501    1   1.582
building type        1.020    3   1.003
living area          2.631    1   1.622
monthly fee          1.971    1   1.404
construction year    1.613    1   1.270
floor                1.282    1   1.132
floors in building   1.427    1   1.195
recounted price      3.043    1   1.744
assembly             2.346    1   1.525
water distance       1.281    1   1.131
water quality        1.318    1   1.148
local area           1.245    1   1.116
month of the year    1.110   11   1.005
date                 4.873    1   2.207
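The last column of Table 3 is the generalized VIF raised to the power 1/(2·DF), which makes terms with several degrees of freedom (such as month of the year) comparable with single-coefficient terms. The following minimal sketch recomputes that column from the GVIF and DF values of a few rows. The function name `adjusted_gvif` is an assumption for this example, and small deviations are expected because the tabulated GVIF values are rounded.

```python
def adjusted_gvif(gvif, df):
    """Return GVIF^(1/(2*DF)), the scaled value reported in Table 3."""
    return gvif ** (1.0 / (2 * df))


# (GVIF, DF, reported scaled value) for a few rows of Table 3.
rows = {
    "id number": (4.833, 1, 2.198),
    "building type": (1.020, 3, 1.003),
    "month of the year": (1.110, 11, 1.005),
    "date": (4.873, 1, 2.207),
}

for name, (gvif, df, reported) in rows.items():
    computed = adjusted_gvif(gvif, df)
    print(f"{name}: computed {computed:.3f}, reported {reported}")
```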

Stepwise Regression

The result of the Stepwise regression is presented in Table 4. As shown in the table, Both-way Stepwise regression and Backward Stepwise regression gave the same result, while Forward Stepwise regression proposed the full model as the final model.

Regression   Proposed Excluded Variables
Forward      None
Backward     building type, monthly fee, floor
Both         building type, monthly fee, floor

Table 5: Model Specifications

Variables Full Model M1 M2 M3 M4 M5 M6 M7 M8

id number * * * * * * * * *

x-coordinate * * * * * * * * *

y-coordinate * * * * * * * * *

building type * * * * *

living area * * * * * * * * *

monthly fee * * * * *

construction year * * * * * * * * *

floor * * * * *

floors in building * * * * * * * * *

recounted price * * * * * * * * *

assembly * * * * * * * * *

water distance * * * * * * * * *

water quality * * * * * * * * *

local area * * * * * * * * *

month of the year * * * * * * * * *

date * * * * * * * * *

rooms *

population *

month index *

3.1.4 Analysis of AIC, BIC and R²

In Table 6 the values of AIC, BIC and R² are presented for each model.

Table 6: AIC, BIC and R²

Models       AIC           BIC           R²
Full Model   526915.6184   527136.6587   0.9515617708
Model 1      526915.6184   527136.6587   0.9515617708
Model 2      526914.6206   527127.7666   0.9515593211
Model 3      526916.6983   527129.8443   0.9515542421
Model 4      526914.0165   527127.1624   0.9515607978
Model 5      526915.7852   527121.0369   0.9515515848
Model 6      526913.0254   527118.2771   0.9515583315
Model 7      526914.9419   527120.1936   0.9515536464
Model 8      526914.0325   527111.3899   0.9515509803
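For a least-squares fit with Gaussian errors, AIC and BIC can be written in terms of the residual sum of squares. The sketch below assumes the convention used by R for linear models, in which the parameter count includes the error variance, and applies it to Model 6 using the residual sum of squares and degrees of freedom from its ANOVA table (Table 7). The sample size is not stated in the text; it is inferred here as the residual degrees of freedom plus 25 coefficients, which is an assumption.

```python
import math


def gaussian_aic_bic(rss, n, n_coef):
    """AIC and BIC of a least-squares fit with Gaussian errors.

    n_coef counts the regression coefficients including the intercept;
    one extra parameter is added for the error variance.
    """
    k = n_coef + 1
    log_lik = -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)
    aic = -2 * log_lik + 2 * k
    bic = -2 * log_lik + math.log(n) * k
    return aic, bic


# Model 6: residual SumSq and DF from Table 7 (rounded values).
rss = 4.0818e14
residual_df = 19792
n_coef = 25                # 13 single-coefficient regressors + 11 month indicators + intercept
n = residual_df + n_coef   # inferred sample size (assumption)

aic, bic = gaussian_aic_bic(rss, n, n_coef)
print(round(aic, 1), round(bic, 1))
```

Because the input is rounded to five significant digits, the result can only be expected to match the Model 6 row of Table 6 approximately. The difference between BIC and AIC depends only on the sample size and the number of parameters, which is why BIC penalizes the larger models more heavily.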

3.1.5 Best Fitted Models

Based on the AIC, BIC and R² criteria, Model 6 and Model 8 were identified as the two best models, with 14 and 13 regressors respectively. To decide which one is the better fit, the ANOVA tables calculated for the two models were analyzed, with results presented in Tables 7 and 8.

Table 7: ANOVA Table of Model 6

Variable Name DF SumSq MeanSq Fvalue Pr(>F) Significance

id number 1 1.6631e+15 1.6631e+15 80642.650 <2.22e-16 ***

x-coordinate 1 4.0408e+14 4.0408e+14 19592.893 <2.22e-16 ***

y-coordinate 1 1.5545e+15 1.5545e+15 75375.620 <2.22e-16 ***

living area 1 1.0267e+15 1.0267e+15 49783.080 <2.22e-16 ***

monthly fee 1 3.1281e+14 3.1281e+14 15167.461 <2.22e-16 ***

construction year 1 3.0812e+14 3.0812e+14 14840.344 <2.22e-16 ***

floors in building 1 6.2124e+13 6.2124e+13 3012.277 <2.22e-16 ***

recounted price 1 2.3923e+15 2.3923e+15 115999.004 <2.22e-16 ***

assembly 1 1.3899e+11 1.3899e+11 6.7398 0.0094 **

water distance 1 2.3528e+11 2.3528e+11 11.4083 0.0007 ***

water quality 1 7.5426e+11 7.5426e+11 36.5726 1.4972e-9 ***

local area 1 1.2985e+12 1.2985e+12 62.9630 2.2177e-15 ***

month of the year 11 2.9493e+13 2.6811e+12 130.0035 <2.22e-16 ***

date 1 2.6234e+14 2.6234e+14 12720.8191 <2.22e-16 ***

Residuals 19792 4.0818e+14 2.0624e+10
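Each F value in an ANOVA table is the mean square of the term divided by the residual mean square, and R² is one minus the residual share of the total sum of squares. The sketch below recomputes both from the rounded entries of Table 7. It is a consistency check under the assumption that the sums of squares in the table add up to the total sum of squares, and the helper name `f_value` is introduced only for this example.

```python
# (DF, SumSq) per term, copied from Table 7 (Model 6).
terms = {
    "id number": (1, 1.6631e15),
    "x-coordinate": (1, 4.0408e14),
    "y-coordinate": (1, 1.5545e15),
    "living area": (1, 1.0267e15),
    "monthly fee": (1, 3.1281e14),
    "construction year": (1, 3.0812e14),
    "floors in building": (1, 6.2124e13),
    "recounted price": (1, 2.3923e15),
    "assembly": (1, 1.3899e11),
    "water distance": (1, 2.3528e11),
    "water quality": (1, 7.5426e11),
    "local area": (1, 1.2985e12),
    "month of the year": (11, 2.9493e13),
    "date": (1, 2.6234e14),
}
residual_df, residual_ss = 19792, 4.0818e14
residual_ms = residual_ss / residual_df


def f_value(df, sum_sq):
    """F statistic of a term: its mean square over the residual mean square."""
    return (sum_sq / df) / residual_ms


# R^2 from the decomposition of the total sum of squares.
total_ss = sum(ss for _, ss in terms.values()) + residual_ss
r_squared = 1 - residual_ss / total_ss

print(round(f_value(*terms["assembly"]), 3))
print(round(f_value(*terms["month of the year"]), 3))
print(round(r_squared, 6))
```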

Table 8: ANOVA Table of Model 8

Variable Name DF SumSq MeanSq Fvalue Pr(>F) Significance

id number 1 1.6631e+15 1.6631e+15 80634.488 <2.22e-16 ***

x-coordinate 1 4.0408e+14 4.0408e+14 19590.910 <2.22e-16 ***

y-coordinate 1 1.5545e+15 1.5545e+15 75367.991 <2.22e-16 ***

living area 1 1.0267e+15 1.0267e+15 49778.142 <2.22e-16 ***

construction year 1 4.3633e+14 4.3633e+14 21154.658 <2.22e-16 ***

floors in building 1 7.8848e+13 7.8848e+13 3822.821 <2.22e-16 ***

recounted price 1 2.5601e+15 2.5601e+15 124123.205 <2.22e-16 ***

assembly 1 1.2374e+11 1.2374e+11 5.9991 0.0143 *

water distance 1 2.5254e+11 2.5254e+11 12.2442 0.0005 ***

water quality 1 7.8082e+11 7.8082e+11 37.8569 7.759e-10 ***

local area 1 1.2764e+12 1.2764e+12 61.8825 3.832e-15 ***

month of the year 11 2.9492e+13 2.6811e+12 129.9887 <2.22e-16 ***


3.2 Data Analysis

3.2.1 Processing of Data

The data used was processed data from Valueguard AB. The first dataset obtained included every apartment sold in Stockholm County from 2003 until the present day and comprised approximately 200 000 data points. Given the size of the dataset, the computer’s capacity was too weak to process the data efficiently, which made it impossible to analyze the regression model.

The dataset used in the analysis consists of data from 2013 until 2018, with approximately 20 000 data points in total. The provided data included one bedroom apartments in Stockholm City. Valueguard AB is considered a reliable source and its data is therefore regarded as trustworthy.

3.2.2 Data Reduction

Residual Analysis

When the residuals were plotted against the fitted values, see Figure 4, three influential points were detected: 1183, 8419 and 10387. These points were then deleted from the set. After the initial residual analysis, QQ-plots of the studentized and standardized residuals were produced, see Figure 5.

Figure 4: Residual Analysis

CovRatio

The marked points in the CovRatio plot are rated as potential outliers and influential points, see Figure 6.

Cook’s Distance

The Cook’s distance procedure was executed, which resulted in several marked outliers, with the limit of the Cook’s distance measure calculated to D > 0.015. The result from the Cook’s distance procedure is presented in Figure 7.

Figure 5: QQ-plot of Standardized and Studentized Residuals

Figure 6: CovRatio with Marked Potential Outliers

Figure 7: Cook’s Distance with Marked Potential Outliers
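Cook's distance combines the size of a residual with the leverage of the observation: D_i = e_i² h_ii / (p s² (1 − h_ii)²), where p is the number of coefficients and s² the residual mean square. The following minimal sketch computes it for a simple linear regression on artificial data, since the thesis data is not available here. The data, the function name `cooks_distances` and the single shifted observation are all assumptions made for illustration.

```python
def cooks_distances(x, y):
    """Cook's distance for each point of a simple linear regression y ~ x."""
    n = len(x)
    p = 2  # intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    distances = []
    for xi, e in zip(x, residuals):
        h = 1 / n + (xi - x_bar) ** 2 / s_xx  # leverage of this point
        distances.append(e ** 2 * h / (p * mse * (1 - h) ** 2))
    return distances


# Artificial data: an exact line with one shifted observation at the end.
x = list(range(10))
y = [2 * xi + 1 for xi in x]
y[9] += 30

d = cooks_distances(x, y)
most_influential = max(range(len(d)), key=d.__getitem__)
print(most_influential, round(d[most_influential], 2))
```

With a sample of roughly 20 000 observations, each point has very small leverage, which is consistent with a cutoff as low as the 0.015 used in the text. In a ten-point toy sample the distances are naturally much larger.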

plot, see Figure 9, was executed. A new CovRatio plot was not executed as the outlier points detected were not deemed influential.

Figure 8: Residuals vs Fitted Value after Removal of Outliers

The new residual plot shows generally much smaller residuals, though it also shows a slight sign of a quadratic relationship with the predicted variable.

3.3 Transformation of the Regression Model

To examine whether a transformation of the regression model was needed, two transformations were performed. The first was the Box-Cox transformation and the second was a quadratic transformation.

Figure 9: Cook’s Distance after Removal of Outliers

3.3.1 Box-Cox Transformation

The transformation was applied as follows:

$$y^{\lambda}, \quad \text{with } \lambda = 0.707071$$

As seen in the scatter plot in Figure 10, the residuals seem to have no pattern, which is a sign of normality, yet the values of the residuals are much greater than before the transformation. This implies that the transformed model is worse than the original, hence the transformation will not be used.
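As a small illustration of the transformation, the sketch below applies the power form written above with the reported λ, next to the standard Box-Cox form (y^λ − 1)/λ, which differs from the plain power only by a shift and a rescaling. The price values and function names are hypothetical. How λ was estimated is not stated in the text, so the value is simply taken as given.

```python
import math

LAMBDA = 0.707071  # value reported in the text


def power_transform(y, lam=LAMBDA):
    """Plain power transform y**lam, the form written in the text."""
    return y ** lam


def box_cox(y, lam=LAMBDA):
    """Standard Box-Cox form; reduces to log(y) when lam is 0."""
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1) / lam


prices = [1_500_000, 2_500_000, 4_000_000]  # hypothetical prices in SEK
for price in prices:
    print(price, round(power_transform(price)), round(box_cox(price)))
```

Residuals of a model fitted to the transformed response are expressed in transformed units, which is worth keeping in mind when their magnitude is compared with residuals of the untransformed model.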

3.3.2 Quadratic Transformation

The transformed model is given below:

$$y = \beta_{didnr}(x_{didnr})^2 + \dots + \beta_{ym}(x_{ym})^2$$

Given Figure 11, the quadratic transformation made the plot change direction, but it could not be determined from the plot whether the residuals had improved. To determine if the changes were positive, a new R² was calculated for this method:

$$R^2 = 0.9313724737$$

This R² value is poorer than that of the original model. This model is accordingly deemed worse than the original, hence the transformation will not be used.

Figure 11: Residuals after Quadratic Transformation

3.4 Final Model

The final model is Model 8 as stated above, and the final residual plot is shown in Figure 12. The scatter plot is relatively randomly distributed, although there is a slight sign of a quadratic pattern. However, since the transformations did not show a significant improvement, transforming the model was not an optimal strategy. Moreover, the R² value of Model 8 was 0.952 and the significance level of the variables was 99 percent, which supports the choice of model.

Figure 12: Residual Plot of Final Model

4 Result

4.1 Presentation of Result

After thorough analysis several conclusions have been drawn. When the data was analyzed, several outliers were found and consequently removed from the dataset. Following analysis of the residual plots, the Cook’s distance procedure and the CovRatio, the following data points were removed: 644, 901, 1094, 1183, 1637, 2755, 2757, 5651, 5653, 7503, 8419, 10387, 13654, 13892, 13897 and 19097. For further information see section 3.2.2.

The variables were assessed using analysis of the plots of each regressor versus the response variable, multicollinearity and stepwise regression in forward, backward and both-way direction. The following variables have been concluded to contribute to the price of condominiums in Stockholm City: id number, x-coordinate, y-coordinate, living area, construction year, floors in building, recounted price, assembly, water distance, water quality, local area, february, march, april, may, june, july, august, september, october, november, december and date. The coefficients of the final model are presented in Table 9.

The final model was then subject to analysis. When computing a residual plot it was clear that the residuals did not follow a completely random pattern. Two transformations were proposed and tested: a Box-Cox transformation and a quadratic transformation. After the transformations were executed it was clear that neither of them improved the result of the final model.

For further information on the transformations see sections 3.3.1 and 3.3.2.

The R² of the final model was 0.952 and the general level of significance was 99 percent. These were assessed to be satisfactory. See section 4.2 for the final model equation.

4.2 Regression Equation

The final regression equation is based on the analysis and presented in Formula 23.

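$$
\begin{aligned}
\mathit{Price} ={} & \beta_0 + \beta_{didnr} x_{didnr} + \beta_{xcoord} x_{xcoord} + \beta_{ycoord} x_{ycoord} + \beta_{harea} x_{harea} \\
& + \beta_{byear} x_{byear} + \beta_{bfloors} x_{bfloors} + \beta_{recountedprice} x_{recountedprice} + \beta_{alkf} x_{alkf} \\
& + \beta_{wdist} x_{wdist} + \beta_{wqual} x_{wqual} + \beta_{omr} x_{omr} \\
& + \beta_{mn2} x_{mn2} + \beta_{mn3} x_{mn3} + \beta_{mn4} x_{mn4} + \beta_{mn5} x_{mn5} + \beta_{mn6} x_{mn6} + \beta_{mn7} x_{mn7} \\
& + \beta_{mn8} x_{mn8} + \beta_{mn9} x_{mn9} + \beta_{mn10} x_{mn10} + \beta_{mn11} x_{mn11} + \beta_{mn12} x_{mn12} \\
& + \beta_{ym} x_{ym}
\end{aligned}
\tag{23}
$$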

Table 9: Value of Estimates

Variable Name        Value of Estimate
intercept            -3.6550e+8
id number            0.1275
x-coordinate         3.9342
y-coordinate         2.1976
living area          -584.6677
construction year    -72.2026
floors in building   916.9771
recounted price      0.9453
assembly             -586.5029
water distance       107.3398
water quality        1.1312
local area           -4.1426
february             8177.2460
march                1.1051e+5
april                1.1684e+5
may                  1.1408e+5
june                 1.1161e+5
july                 1.3570e+5
august               1.6305e+5
september            1.6890e+5
october              1.6117e+5
november             1.2850e+5
december             1.1281e+5
date                 1719.0750
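The month coefficients in Table 9 can be read as indicator (dummy) effects. Only February to December appear, so the sketch below assumes that January is the reference level with an effect of zero. Under that assumption each coefficient is the estimated price difference in SEK relative to a January sale, holding the other regressors fixed. The dictionary and variable names are introduced only for this example.

```python
# Month coefficients from Table 9; January is assumed to be the baseline.
month_effect = {
    "january": 0.0,
    "february": 8177.2460,
    "march": 1.1051e5,
    "april": 1.1684e5,
    "may": 1.1408e5,
    "june": 1.1161e5,
    "july": 1.3570e5,
    "august": 1.6305e5,
    "september": 1.6890e5,
    "october": 1.6117e5,
    "november": 1.2850e5,
    "december": 1.1281e5,
}

# Month with the largest estimated premium and the spread over the year.
peak = max(month_effect, key=month_effect.get)
low = min(month_effect, key=month_effect.get)
spread = month_effect[peak] - month_effect[low]

print(peak, low, spread)
```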

5 Discussion

5.1 Mathematical Discussion

The conclusions drawn from the regression have been presented in Section 4, Result. A discussion will follow concerning some chosen regressors and their individual impact on the final model. A few irregularities were found when each regressor was plotted against the response variable, which will also be commented on in the following section.

Figure 13: Distance to Water

Secondly, the building year variable can be seen as quite irregular, see Figure 14. The plot shows that few apartments built before the turn of the century, from 1800 to 1900, are still being sold. Another conclusion that can be drawn is that apartments built at the turn of the century are more expensive than new constructions. This can be the case for several reasons. Firstly, it is a trend to live in an older apartment, which implies that Hedonic Pricing is applied. Secondly, older apartments often belong to tenant-owner associations with smaller loans, as these have had the opportunity to amortize over a longer period of time. This yields a smaller monthly fee, so the overall monthly cost of the apartment is lower, giving the loan taker the opportunity to afford bigger loans (15). As concluded in the analysis, the monthly fee variable did not have a significant impact on the model and could therefore be excluded. One reason for this could be a possible correlation between the two variables, as construction year probably captures the impact mentioned above.

The plot of the price versus the recounted price, see Figure 15, is shaped as a cone. The idea of this variable is to take into account how inflation has an impact on the price. The fact that this plot is not entirely linear can indicate that factors other than inflation have had an impact on the price, which means that the price has increased more than inflation.

As can be seen clearly in Figure 16, there was a slight decrease in the otherwise steep price curve at approximately the middle of 2016, when the amortization requirement was introduced. It can also be seen that prices, in general, continue to increase, contrary to the purpose of the amortization requirement. These phenomena will be discussed further in Section 5.2.

Figure 14: Building Year

Figure 15: Recounted Price


Figure 16: Change of Price by Date

general and the idea of moving is not as common (13). One reason why our plot did not show a greater difference may be that many young people move to Stockholm at the end of summer or at the beginning of the year to start studying and are in general in need of small one bedroom apartments (13).

Figure 17: Seasonal Changes in Price

One conclusion that can be drawn from our mathematical discussion is that the price of a condominium is affected by both internal and external factors. Intuitive factors such as living area and date of sale are of importance to the price, yet we can also see that external factors such as distance to water, quality of water and the building year play an important role for the price. As previously stated, this can imply that Hedonic Pricing is used when apartments are bought and sold. Recall that Hedonic Pricing, defined in section 2.1.5, recognizes the importance of both external and internal characteristics of a good or service.

Following the discussion of the contributing variables and the result of the analysis, it can be summarized that the presented model yields a satisfactory explanation of which factors contribute to the price of condominiums.

5.2 Economic Discussion

The amortization requirement arose due to an irrational gap in the debt-to-income ratio of Sweden. Since the imbalance can contribute to growing macroeconomic consequences, both the Swedish government and the European Parliament insisted on the establishment of the requirement. Direct effects of the requirement are both positive and negative. The general conclusion seems to be that the requirement forces households to buy cheaper apartments and in that way forces them to decrease the household debt. This was the purpose of the new rule and it seems that especially households with a high loan-to-value ratio were affected. Moreover, the requirement decreases the debt ratio of Sweden, since households are both taking smaller loans and paying off their debts faster, yet the general view is that the effect is not satisfactory. The reason for this is that even though it is more difficult to take loans, apartment sales prices are still increasing.

Since households are so indebted but still choose to buy apartments beyond their means, other factors must be contributing. The immediate consequence of households continuing to buy apartments beyond their economic capacity is that prices continue to increase. As mentioned in section 2.1.5, Hedonic Pricing is an economic theory based on the idea that external factors can contribute to economic choices. This, in combination with irrational economic choices as described by Behavioral Finance, contributes to a fairly weak effect from the requirement. Recall from section 2.1.7 the definition of Behavioral Finance as

Furthermore, as Finansinspektionen states in their memo Regulations regarding mortgage amortisation requirement, there are some negative socioeconomic consequences following the requirement. First of all, there is a certain risk that when the mortgages are increasing, households will cut down on their consumption.

Second of all, the requirement will affect already indebted households the most. These households are probably already exposed economically. Although the requirement is more substantial the bigger the loan is, it is also designed to apply to households with a loan-to-value ratio above 50 percent, that is, households with less economic mobility. The conclusion is that, in the worst case, this can slow down the economy if several households simultaneously cut down their consumption (14). Lastly, the results of the amortization requirement are still clear, but less extensive than hoped for. This is the major reason why Finansinspektionen has chosen to introduce a new, stricter requirement. The wish is that similar effects would occur, but at a faster pace.
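To make the size of the requirement concrete, the sketch below computes the minimum yearly amortization for a few hypothetical loans. The rates of 1 and 2 percent and the 50 percent loan-to-value threshold are mentioned in this report. The 70 percent threshold that separates the two rates is not stated here and is an assumption based on the regulation as commonly described, so it should be checked against Finansinspektionen's text. The function name and the loan figures are illustrative only.

```python
def annual_amortization(loan, property_value):
    """Minimum yearly amortization in SEK under the assumed thresholds."""
    ltv = loan / property_value  # loan-to-value ratio
    if ltv > 0.70:       # assumed upper threshold: 2 percent of the loan
        rate = 0.02
    elif ltv > 0.50:     # threshold mentioned in the text: 1 percent
        rate = 0.01
    else:
        rate = 0.0
    return rate * loan


value = 3_000_000  # hypothetical apartment price in SEK
for loan in (2_550_000, 1_800_000, 1_200_000):
    yearly = annual_amortization(loan, value)
    print(loan, round(yearly), round(yearly / 12))
```

In this sketch the household with the highest loan-to-value ratio pays both the higher rate and on a larger loan, which illustrates why the requirement weighs most heavily on highly leveraged buyers.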

Along these lines, when comparing the socioeconomic risks to the macroeconomy with the risk of how households would behave in case of an economic shock, the overall effects of the amortization requirement are assessed to be positive.

References

[1] Finansinspektionen, Konsekvenser av ett skärpt amorteringskrav, Stockholm, 2017

[2] Finansmarknadsavdelningen, Amorteringskrav, Finansdepartementet, Stockholm, 2015

[3] Investopedia, Loan-to-Value Ratio - LTV Ratio, https://www.investopedia.com/terms/l/loantovalue.asp, 2018

[4] Investopedia, Debt Ratios: The Debt Ratio, https://www.investopedia.com/terms/d/debtratio.asp, 2018

[5] Jim Mueller, Understanding Financial Liquidity, Investopedia, https://www.investopedia.com/articles/basics/07/liquidity.asp, 2018

[6] Finansinspektionen, The Swedish Mortgage Market, Stockholm, 2017

[7] Sveriges Riksbank, Household Debt, Stockholm, 2018

[8] SEB, 10 questions and answers about Swedish household debt, https://sebgroup.com/press/news/10-questions-and-answers-about-swedish-household-debt, 2017

[9] Paul Krugman and Robin Wells, Economics 4th edition, USA, 2015

[10] Investopedia, Hedonic Pricing, https://www.investopedia.com/terms/h/hedonicpricing.asp, 2018

[11] Investopedia, Rational Choice Theory, https://www.investopedia.com/terms/r/rational-choice-theory.asp, 2018

[12] Investopedia, Behavioral Finance, https://www.investopedia.com/terms/b/behavioralfinance.asp, 2018

[13] Matilda Adelborg, Nu är det högsäsong för små lägenheter!, Booli, Stockholm, 2016
