
Parameterization and Conditioning of Hinging Hyperplane Models

Predrag Pucar

Department of Electrical Engineering, Linköping University

S-581 83 Linköping, Sweden

Email: predrag@isy.liu.se Voice: +46 13 282803

Jonas Sjoberg

Department of Electrical Engineering, Linköping University

S-581 83 Linköping, Sweden

Email: sjoberg@isy.liu.se

Submitted to IFAC96

Abstract

Recently a new model class has emerged in the field of non-linear black-box modeling: the hinging hyperplane models. The hinging hyperplane model is closely related to the well-known neural net models.

In this contribution the parameterization of hinging hyperplane models is addressed. It is shown that the original setting is overparameterized and a new parameterization involving fewer parameters is suggested.

Moreover, it is shown that there is nothing to lose in terms of negative effects in the numerical search when fewer parameters are used. The positive effect of a model class parameterized with fewer parameters is a decrease in computational complexity.

In addition to the parameterization issues, another related question is discussed, namely whether the estimation problem is ill-conditioned.

Keywords:

Non-linear black-box modeling, system identification, function approximation.

1 Introduction

Recently a new interesting approach to non-linear function approximation, named hinging hyperplanes (HH), was reported in [2]. In that article a number of advantages of HH were pointed out. For example, they share with neural nets and projection pursuit models the nice feature of avoiding the curse of dimensionality [1]. In [2] there is also an estimation algorithm suggested, which is inspired by linear regression trees and projection pursuit. However, [11] shows that this is just a Newton-type algorithm. In this paper we consider the parameterization of the HH model. It will be shown that the number of parameters can be reduced without restricting the model's capability of approximating unknown functions, i.e., the superfluous parameters do not influence the range of the model structure.

Most non-linear black-box models can be expressed as a basis function expansion

$$f(\varphi) = \sum_{i=1}^{K} h_i(\varphi), \qquad (1)$$

see, e.g., [15]. The basis function $h_i$ in the expansion above is crucial and is the detail by which a number of approaches differ, e.g., neural nets, projection pursuit, etc. The idea is that although relatively simple building blocks, the basis functions $h_i$, are used, a broad class of non-linear functions can be well approximated. In [15] and [5] a general framework for non-linear black-box models is further developed.

When a parameterized model structure has been chosen, i.e., when the specific form of the basis expansion (1) has been decided, it remains to estimate the parameters using data collected from the unknown system.


Figure 1: Hinge, hinging hyperplane and hinge function. (a) 2-dimensional, (b) 3-dimensional.

This is typically done by computing the parameter value minimizing a criterion of fit, i.e.,

$$\hat\theta_N = \arg\min_{\theta} V_N(\theta),$$

where $N$ indicates the number of data available for the estimation. In the original algorithm for HH presented in [2] the criterion $V_N(\theta)$ is the sum of squared errors (this is explained in [11])

$$V_N(\theta) = \frac{1}{N}\sum_{n=1}^{N}\bigl(y_n - f(\varphi_n)\bigr)^2, \qquad (2)$$

where $\{y_n, \varphi_n\}_{n=1}^{N}$ is the available data. Since the approximating function is non-linear, the minimization problem fits into the non-linear least squares setting. The minimization is performed with a numerical search routine, see Section 2.1 for a more detailed treatment of the search algorithm.

The HH approach uses hinge functions as basis functions. The hinge function is maybe most easily illustrated by a figure, see Figure 1. As can be seen from Figure 1, the hinge function consists of two joined hyperplanes. Assume the two hyperplanes are given by

$$y_1 = \varphi^T\theta^{+} \quad\text{and}\quad y_2 = \varphi^T\theta^{-}, \qquad (3)$$

where $\varphi = [1\ \varphi_1\ \varphi_2\ \cdots\ \varphi_d]^T$ is the regressor vector and $y$ is the output variable. These two hyperplanes are joined at

$$\{\varphi : \varphi^T(\theta^{+} - \theta^{-}) = 0\},$$

which is defined as the hinge of the two hyperplanes. The solid/shaded part of the two hyperplanes in Figure 1 is explicitly given by

$$y = \max(\varphi^T\theta^{+}, \varphi^T\theta^{-}) \quad\text{or}\quad y = \min(\varphi^T\theta^{+}, \varphi^T\theta^{-}) \qquad (4)$$

and is defined as the hinge function. A HH model then consists of a sum of hinge functions of the sort (4).
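As a concrete illustration of (4) and of the expansion (1) with hinge basis functions, here is a minimal NumPy sketch of a hinge function and of a HH model formed as a sum of such functions. The dimensions and parameter values are arbitrary choices for the example, not taken from the paper.

import numpy as np

def hinge(phi, theta_plus, theta_minus):
    # Hinge function (4): pointwise max of the two hyperplanes phi^T theta+ and phi^T theta-.
    return np.maximum(phi @ theta_plus, phi @ theta_minus)

def hh_model(phi, hinges):
    # HH model: a sum of hinge functions.
    return sum(hinge(phi, tp, tm) for tp, tm in hinges)

rng = np.random.default_rng(0)
phi = np.array([1.0, 0.3, -1.2])                                              # regressor [1, phi_1, phi_2], so d = 2
hinges = [(rng.standard_normal(3), rng.standard_normal(3)) for _ in range(2)]  # M = 2 hinges
print(hh_model(phi, hinges))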

The above presentation of the hinging hyperplane model follows the original way the model was introduced in [2]. In the next section we will discuss the parameterization of the hinging hyperplane model, and it is shown that the original presentation is overparameterized. A new parameterization is suggested with fewer parameters than the original one. In this new parameterization the similarity between feed-forward neural nets and HH becomes striking, and this motivates a comparison between these model structures. In Section 3 the problem when an overly flexible model structure is fitted to data is considered. Such situations usually give overfitting, but this can, e.g., in neural net applications, be prevented by applying some form of regularization. The criterion of fit has flat valleys in some directions of the parameter space, and these directions correspond to parameters which can be excluded from the fit by the regularization without increasing the bias contribution to the error substantially. Moreover, by excluding these parameters the variance part of the total error decreases, and all in all this gives a better model. In Section 3 HH models are investigated to see whether they can be expected to have similar features.


2 Parameterization

When discussing overparameterization there are two cases that can occur which are fundamentally different. The first case, which we will call truly overparameterized, means that the model structure can be described by fewer parameters. Then there exists a mapping from the original parameter space to a parameter space with lower dimension. The other case of overparameterization actually concerns the balancing of the bias and variance contributions to the total error. See, e.g., [6] or [15] for a general discussion on this. Whether a model is truly overparameterized or not is only a matter of how the model structure is parameterized, and it does not influence the flexibility of the model structure. However, for the second type of overparameterization the problem is that the model structure is unnecessarily flexible; the model contains too many parameters with respect to the unknown relationship to be identified and the number of available data for the identification.

There are indications that truly overparameterized models, in some cases, perform better than models that are not overparameterized when a numerical search routine is used for minimization of the loss function [8, 7]. This will be investigated for the HH model, and it will be shown that there are no such advantages with an overparameterized HH model. Being truly overparameterized means that the HH model (1) becomes identical for a manifold of parameters, i.e., the parameter vector can itself be parameterized as $\theta = \theta(\bar\theta, k)$, where $\dim(\bar\theta) < \dim(\theta)$ and where only the parameters in $\bar\theta$ influence the function. The consequence is that the function $f(\varphi, \theta(\bar\theta, k))$ becomes independent of the parameters in $k$, which are the superfluous parameters.

Now, let us describe the HH function (4) in a slightly different way. Assume the input space is split into two sets $S^{+}$ and $S^{-}$, and the two parameter vectors $\theta^{+}$ and $\theta^{-}$ are estimated on the two sets respectively. The hinge function (3) can be rewritten as

$$f(\varphi) = \varphi^T\theta^{+} I_{\{S^{+}\}}(\varphi) + \varphi^T\theta^{-}\bigl(1 - I_{\{S^{+}\}}(\varphi)\bigr),$$

where the indicator function $I_S$ is defined as

$$I_S(\varphi) = \begin{cases} 1 & \text{if } \varphi \in S \\ 0 & \text{if } \varphi \notin S. \end{cases}$$

The hinging hyperplane model when $M$ hinge functions are used is given by

$$f(\varphi) = \sum_{i=1}^{M}\Bigl[\varphi^T\theta_i^{+} I_{\{S_i^{+}\}} + \varphi^T\theta_i^{-}\bigl(1 - I_{\{S_i^{+}\}}\bigr)\Bigr], \qquad (5)$$

where the dependence of the indicator function on $\varphi$ is suppressed.

The model above can be rewritten in the following equivalent way

$$f(\varphi) = \sum_{i=1}^{M}\varphi^T\theta_i I_{\{S_i\}} + \varphi^T\theta_0, \qquad (6)$$

where $\theta_i = \theta_i^{+} - \theta_i^{-}$ and $\theta_0 = \sum_{i=1}^{M}\theta_i^{-}$. The rewriting results in $(d+1)(M+1)$ parameters being used. For $M > 1$ this is fewer than the $2(d+1)M$ parameters of the original description. The border between the two half-spaces, across which the indicator function switches from zero to one, is $\varphi^T(\theta_i^{+} - \theta_i^{-}) = 0$, or equivalently $\varphi^T\theta_i = 0$.
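The reduction is easy to check numerically. The sketch below evaluates the original form (5) and the reduced form (6) at random points and confirms that they coincide; it assumes the max case of (4), so that the active set of hinge $i$ is $S_i^{+} = \{\varphi : \varphi^T(\theta_i^{+} - \theta_i^{-}) > 0\}$. All numerical values are arbitrary choices for the illustration.

import numpy as np

rng = np.random.default_rng(1)
d, M, N = 3, 4, 200
Phi = np.hstack([np.ones((N, 1)), rng.standard_normal((N, d))])    # regressors [1, phi_1, ..., phi_d]
theta_plus = rng.standard_normal((M, d + 1))
theta_minus = rng.standard_normal((M, d + 1))

# Original parameterization (5): 2(d+1)M parameters
ind = (Phi @ (theta_plus - theta_minus).T > 0)                     # indicator of S_i^+ (max case)
f_orig = np.sum(ind * (Phi @ theta_plus.T) + (~ind) * (Phi @ theta_minus.T), axis=1)

# Reduced parameterization (6): (d+1)(M+1) parameters
theta = theta_plus - theta_minus                                   # theta_i
theta0 = theta_minus.sum(axis=0)                                   # theta_0
f_red = np.sum(ind * (Phi @ theta.T), axis=1) + Phi @ theta0

print(np.allclose(f_orig, f_red))                                  # True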

The similarity with neural nets is striking. Both the HH model and the neural net model can be described by

$$f(\varphi) = \sum_{i=1}^{M}\sigma(\varphi^T\theta_i) + \varphi^T\theta_0,$$

where

$$\sigma(x) = \begin{cases} 0 & \text{for } x < 0 \\ x & \text{for } x > 0 \end{cases}$$

for the HH model, and $\sigma(x) = \kappa(x)$ for the neural net model with a direct term from input to output. The activation function $\kappa(\cdot)$ is usually chosen as $\kappa(x) = 1/(1 + e^{-x})$. The parameters $\theta_1, \dots, \theta_M$ and $\theta_0$ are stored in $\theta$. The HH model is built up by a basis which is constant on one half-space and linear on the other half-space. The basis function for the neural net model also divides the space into two half-spaces, but instead of a linear relation it takes a constant value on each half-space and makes a smooth transition between these constant values.
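For concreteness, a minimal sketch of the two basis functions described above, evaluated side by side (the test points are arbitrary):

import numpy as np

def sigma_hh(x):
    # HH basis: zero on one half-space, linear on the other.
    return np.maximum(x, 0.0)

def kappa(x):
    # Common neural net activation: smooth transition between two constant levels.
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-3.0, 3.0, 7)
print(sigma_hh(x))    # 0 below the break point, grows linearly above it
print(kappa(x))       # saturates at 0 and 1, smooth in between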

The HH model does not have a smooth derivative. It is, however, possible to replace the indicator function in (6) with a sigmoid function, which acts as a "smooth indicator", to obtain smooth hinging hyperplanes, see [10].

The resulting HH model (6) is no longer a sum of hinge functions like the one depicted in Figure 1, but rather a sum of one hyperplane and a number of basis functions which are zero for $\{\varphi : \varphi \notin S_i\}$ and a hyperplane otherwise. See Figure 2, where such functions with $\varphi \in \mathbb{R}$ and $\varphi \in \mathbb{R}^2$ are depicted.


Figure 2: Form of hyperplanes when reparameterized hinging hyperplane models are used.

2.1 Equivalence of Parameterizations

The relation between the two parameter vectors in the two parameterizations (5) and (6) turns out to be a projection of the original parameter space onto the reduced one. Consider the case of $M$ hinges. The relation between the original parameter vector and the reduced one is then given by

$$\begin{pmatrix}\theta_1 \\ \vdots \\ \theta_M \\ \theta_0\end{pmatrix} =
\begin{pmatrix}
I & -I & 0 & 0 & \cdots & 0 & 0 \\
\vdots & & \ddots & & & & \vdots \\
0 & 0 & \cdots & 0 & 0 & I & -I \\
0 & I & 0 & I & \cdots & 0 & I
\end{pmatrix}
\begin{pmatrix}\theta_1^{+} \\ \theta_1^{-} \\ \vdots \\ \theta_M^{+} \\ \theta_M^{-}\end{pmatrix}, \qquad (7)$$

where $I$, in this case, is a $(d+1)\times(d+1)$ identity matrix. The relation between the two parameter vectors can thus be written as $\theta^R = A\theta$, where the superscript $R$ indicates that the vector lies in the reduced parameter space. The following questions arise, and we will try to provide answers to them in the sequel.

1. Are the two models output equivalent in the sense that for every value of the parameter vector $\theta$ in the original parameterization, the outputs of the original model and the reduced model using the parameter vector $\theta^R$ according to (7) are equivalent?


2. Are the properties of the numerical search routine used for minimizing the loss function and finding the optimal parameter values affected? Whatever the answer to this question is, what are the advantages or drawbacks?

Question 1) is readily answered. Since the reparameterization is in fact just a reordering and lumping of some parameters in the original parameterization, the output space is not changed. This follows from (5) and (6).
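For concreteness, here is a small sketch (with arbitrary dimensions) that assembles the projection matrix $A$ of (7) with Kronecker products and checks that $\theta^R = A\theta$ reproduces exactly the lumped parameters $\theta_i = \theta_i^{+} - \theta_i^{-}$ and $\theta_0 = \sum_i\theta_i^{-}$ used in (6):

import numpy as np

d, M = 2, 3
I = np.eye(d + 1)

# Block pattern of A: block row i picks theta_i^+ - theta_i^-, the last block row sums all theta_i^-.
pattern = np.zeros((M + 1, 2 * M))
for i in range(M):
    pattern[i, 2 * i] = 1.0
    pattern[i, 2 * i + 1] = -1.0
    pattern[M, 2 * i + 1] = 1.0
A = np.kron(pattern, I)                               # ((M+1)(d+1)) x (2M(d+1))

rng = np.random.default_rng(2)
theta_plus = rng.standard_normal((M, d + 1))
theta_minus = rng.standard_normal((M, d + 1))
theta = np.concatenate([np.concatenate([tp, tm]) for tp, tm in zip(theta_plus, theta_minus)])

theta_R = A @ theta                                   # reduced parameter vector
expected = np.concatenate([*(theta_plus - theta_minus), theta_minus.sum(axis=0)])
print(np.allclose(theta_R, expected))                 # True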

Question 2) is somewhat more complicated than question 1). It originates in the experience that in some model structures, truly overparameterized models tend to perform better when numerical search methods are applied [8, 7]. The risk of getting stuck at local minima decreases. The intuitive reason is the additional dimensions in the parameter space, which give more alternative paths for the numerical search algorithm. We will show that in the case discussed here, no such advantages are present.

The routine that we use in the search for the optimal parameter estimate is the well-known damped Newton method for minimization of the criterion (2), see [3], and [11] for connections with HH models. Newton's scheme for finding the minimum of the criterion (2) is

$$\theta_{k+1} = \theta_k - (\nabla^2 V)^{\dagger}\nabla V, \qquad (8)$$

where $A^{\dagger}$ denotes the pseudo-inverse of a matrix $A$, $\nabla V$ is the gradient of $V$ with respect to $\theta$, and $\nabla^2 V$ is the Hessian of $V$. Question 2) can now be restated as follows. Assume an arbitrary value of the parameter vector $\theta_0$ is chosen as the initial value for the search algorithm used in the original parameter space, and the projection of $\theta_0$, i.e., $\theta_0^R = A\theta_0$, is used as the initial value of the algorithm in the reduced parameter space. Will the two algorithms, executed in parallel, give the same path in the reduced parameter set if the projection of $\theta$ onto the $\Theta^R$-space is compared to the path of $\theta^R$ at every step of the algorithm?

Straightforward calculations lead to the following relations between the gradients and the Hessians in the two parameter spaces:

$$\nabla V = A^T\nabla V^R \quad\text{and}\quad \nabla^2 V = A^T\nabla^2 V^R A. \qquad (9)$$

Given an arbitrary $\theta_0$, the next step is

$$\theta_1 = \theta_0 - (\nabla^2 V)^{\dagger}\nabla V. \qquad (10)$$

If the above equation is multiplied by the projection matrix $A$ we obtain

$$\theta_1^R = \theta_0^R - A\bigl(A^T\nabla^2 V^R A\bigr)^{\dagger}A^T\nabla V^R,$$

where the gradient and Hessian are expressed in the $\Theta^R$-space. The strategy is to show that the parameter update term on the right hand side of (10) is equivalent to the parameter update term of the algorithm using the reparameterized parameter vector. The parameter update term in the algorithm executed in the reduced parameter space is $(\nabla^2 V^R)^{\dagger}\nabla V^R = (\nabla^2 V^R)^{-1}\nabla V^R$ (since the Hessian has full rank if the reduced parameter vector is used). If the equivalence of the parameter update terms can be shown, the arbitrary choice of $\theta_0$ ensures that the result does not depend on which parameterization is chosen. The idea of the proof is illustrated in Figure 3.

If the expressions in (9) are substituted into the parameter update term, straightforward calculations, see [12], give the equivalence of the two spaces with respect to the behavior of the numerical algorithm:

$$A\bigl(A^T\nabla^2 V^R A\bigr)^{\dagger}A^T = \bigl(\nabla^2 V^R\bigr)^{-1}.$$

The conclusion is that the two parameterizations are equivalent also when numerical aspects are taken

into consideration. There is, thus, nothing to gain by using the truly overparameterized model. There is

an obvious disadvantage though, namely the computational complexity. In the numerical algorithms that

are used, one of the most time consuming steps is taking the pseudo-inverse/inverse of the Hessian in the

parameter update equation. Due to limited space we will not discuss the computational complexity further.
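The key identity above is easy to check numerically. The sketch below uses arbitrary dimensions, a random full-row-rank matrix standing in for the projection $A$, and a random positive definite matrix standing in for the reduced Hessian; it then compares one Newton step taken in each space, as in the proof idea of Figure 3.

import numpy as np

rng = np.random.default_rng(3)
p_red, p_full = 6, 10                      # dimensions of the reduced and original parameter spaces
A = rng.standard_normal((p_red, p_full))   # full row rank, stand-in for the A in (7)

B = rng.standard_normal((p_red, p_red))
H_R = B @ B.T + p_red * np.eye(p_red)      # positive definite stand-in for grad^2 V^R
g_R = rng.standard_normal(p_red)           # stand-in for grad V^R

# Identity behind the proof: A (A^T H_R A)^+ A^T = H_R^{-1}
lhs = A @ np.linalg.pinv(A.T @ H_R @ A) @ A.T
print(np.allclose(lhs, np.linalg.inv(H_R)))                            # True

# Consequently, projecting a Newton step taken in the original space gives the
# same point as a Newton step taken directly in the reduced space.
theta0 = rng.standard_normal(p_full)
theta1 = theta0 - np.linalg.pinv(A.T @ H_R @ A) @ (A.T @ g_R)          # step in original space, via (9)
print(np.allclose(A @ theta1, A @ theta0 - np.linalg.inv(H_R) @ g_R))  # True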


Figure 3: Idea of the proof of "numerical equivalence" of the two parameter spaces. The solid arrows are the steps taken by the numerical algorithm in the respective space ($\Theta$-space and $\Theta^R$-space). The dashed arrows are the projections, through $A$, of parameter values in the original set onto the reduced one. The question is whether the rightmost projection arrow will point at $\theta_1^R$.

The conclusion is that for large $M$ the complexity of the numerical algorithm when using the reparameterized model is about 30% of the complexity when using the original model.

Other related algorithms for numerical minimization are often variants on the Newton algorithm. The Hessian is altered either to avoid ill-conditioning or to decrease the computational burden. If such variants of the algorithm are used, the conclusions of the discussion above will not be changed. If the algorithm is in the area of attraction of a minimum, the solution will be equivalent. The path, however, may not be the same in all cases.

3 Ill-conditioning of the Estimation Algorithm

The second kind of overparameterization, which was discussed in the introduction, will now be addressed in this section. This overparameterization means that the model structure is too flexible and that the number of parameters is too large. For neural net models this is often the case, see [14, 9]. Neural net estimation problems are often ill-conditioned [13], which means that some of the directions in the parameter space have little influence on the model behavior. These parameters decrease the bias part of the error very little, but their contribution to the variance part of the error is as large as that of the important parameters. It is then often advantageous if these parameters are excluded from the fit by some kind of regularization.

For neural nets it turns out that for most estimation problems only a subset of the parameters is of substantial importance. It is, however, often impossible to point out these parameters a priori. Instead the efficient number of parameters can be controlled by applying regularization, see [14, 9]. This means that an additional term is added to the criterion (2) which penalizes large parameter values.

We will in this section see that HH models also can be expected to be ill-conditioned and that, hence, regularization can be expected to be of large importance also for HH models.

An estimation procedure for a parameterized model is a numerical minimization algorithm applied to minimize a specified loss function. An often occurring case is that the loss function is quadratic, and the minimization algorithm is then a non-linear least squares algorithm where the estimate is given by

$$\hat\theta = \arg\min_{\theta}\ \frac{1}{2}\,\varepsilon^T\varepsilon,$$

where

$$\varepsilon = \bigl[\,y_1 - f(\varphi_1, \theta)\ \ \cdots\ \ y_N - f(\varphi_N, \theta)\,\bigr]^T$$

and $f$ is the sum of basis functions used for the approximation.


There are a number of algorithms for finding the estimate that minimizes the quadratic loss function. We will use a damped Newton algorithm (8) for the minimization. The Jacobian, i.e., the derivative of the error $\varepsilon$ with respect to the parameters, plays an important role in the minimization algorithm (in our case the Hessian of $V$ is $J^T J$). The consequence of having an ill-conditioned Jacobian in the minimization routine is bad convergence properties, i.e., the time for finding the estimate $\hat\theta$ increases dramatically, see [3].
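As a minimal sketch of the kind of update described here (a damped Newton step with the Hessian taken as $J^T J$; the function name and damping value are illustrative assumptions, not the authors' code):

import numpy as np

def damped_newton_step(theta, J, eps, lam=0.5):
    # One damped Newton step for the quadratic loss, with the Hessian taken as J^T J
    # (J is the Jacobian of the residual eps with respect to theta, as defined below).
    grad = J.T @ eps                          # gradient of the loss (up to a constant factor)
    hess = J.T @ J                            # Hessian used in the damped Newton iteration
    return theta - lam * (np.linalg.pinv(hess) @ grad)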

Definition 3.1 The condition number of $A$ is defined as $\kappa(A) = \sigma_1/\sigma_{d+1}$, where $\sigma_1$ is the largest singular value and $\sigma_{d+1}$ is the smallest one, and $A$ is said to be ill-conditioned if $\kappa(A)$ is large.

A closer look will now be taken at how the Jacobian is constructed, and expressions are derived that relate the directions of the Jacobian's column vectors to how ill-conditioned the Jacobian is. We start by stating the connection between the condition number of a matrix and the angle between two column vectors of the same matrix.

[4] Let $B$ be a sub-matrix of $A$ consisting of columns of $A$. Then $\kappa(B) \le \kappa(A)$, where $\kappa$ denotes the condition number of a matrix.

[13] Let $A = [x\ \ y] \in \mathbb{R}^{n\times 2}$ and suppose that the angle $\phi$ between $x$ and $y$ satisfies $\cos(\phi) = 1 - \delta$ for some $\delta \in (0, 1)$. Then

$$\kappa^2(A) \ge \frac{1}{4\,\delta(2-\delta)}\left(\frac{\|y\|^2}{\|x\|^2} + \frac{\|x\|^2}{\|y\|^2} + 2\right).$$
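A quick numerical sanity check of the bound above (the vectors are chosen arbitrarily, with $y$ nearly parallel to $x$ so that the angle is small):

import numpy as np

rng = np.random.default_rng(4)
n = 50
x = rng.standard_normal(n)
y = x + 0.05 * rng.standard_normal(n)         # y nearly parallel to x

A = np.column_stack([x, y])
kappa = np.linalg.cond(A)                     # sigma_max / sigma_min

delta = 1.0 - (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))    # cos(angle) = 1 - delta
bound = (1.0 / (4.0 * delta * (2.0 - delta))) * (
    np.linalg.norm(y) ** 2 / np.linalg.norm(x) ** 2
    + np.linalg.norm(x) ** 2 / np.linalg.norm(y) ** 2
    + 2.0
)
print(kappa ** 2 >= bound)                    # True: nearly parallel columns force a large condition number
print(kappa, np.sqrt(bound))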

It is thus sufficient to investigate angles between vectors in the Jacobian of a HH model to get an insight into when the Jacobian will be ill-conditioned. First the Jacobian for HH models has to be derived. The gradient of the loss function is

$$\nabla V = J^T\varepsilon.$$

The hinge model $f$ is defined in (6), and the elements of the Jacobian $J$ are defined as $[J_{ij}] = \partial\varepsilon(i)/\partial\theta(j) = -\partial f(i)/\partial\theta(j)$. The first index in $[J_{ij}]$ denotes the time instant or sample number. Hence, $J$ has the dimensions $N\times(d+1)(M+1)$, where $N$ is the number of available data and $(d+1)(M+1)$ is the number of parameters. Taking the derivative of $\varepsilon$ gives the following expression for row $i$ of the Jacobian:

$$\left[\frac{\partial\varepsilon(i)}{\partial\theta_1}\ \cdots\ \frac{\partial\varepsilon(i)}{\partial\theta_M}\ \ \frac{\partial\varepsilon(i)}{\partial\theta_0}\right] = -\left[\varphi^T I_{\{S_1\}}(\varphi)\ \cdots\ \varphi^T I_{\{S_M\}}(\varphi)\ \ \varphi^T\right].$$

Assume that the dimension of the input space is one, i.e., $\dim\varphi = d = 1$, and that $N$ is the number of samples of the input signal. The following four columns are an example of a part of the Jacobian for the above HH model; the columns correspond to two of the $M$ hinge functions:

$$J = -\begin{bmatrix}
0 & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots \\
0 & 0 & 1 & \varphi_1(i) \\
\vdots & \vdots & \vdots & \vdots \\
1 & \varphi_1(j+1) & 1 & \varphi_1(j+1) \\
\vdots & \vdots & \vdots & \vdots \\
1 & \varphi_1(N) & 1 & \varphi_1(N)
\end{bmatrix}$$

The subscript of $\varphi$ denotes the number of the input; in the example above, where $d = 1$, there is only one input. The number in parentheses is the sample number.

The cosine of the angle between two vectors $a$ and $b$ is given by

$$\cos(\phi) = \frac{a^T b}{\sqrt{(a^T a)(b^T b)}},$$

and three cases that can occur in the Jacobian of the HH model will now be discussed. The first case corresponds to checking the angle between, for example, columns two and four in the example above. Note that the two columns originate from the same input $\varphi_1$, which is the only one in this example. If $d > 1$ it would be possible to check angles between columns originating from different inputs, but nothing consistent can be said regarding the angle between such columns. The cosine of the angle between columns two and four is

$$\cos(\phi) = \left(\frac{1}{1 + \dfrac{j-i}{N-j}\cdot\dfrac{\sigma_{j,i}^2 + \mu_{j,i}^2}{\sigma_{N,j}^2 + \mu_{N,j}^2}}\right)^{1/2}. \qquad (11)$$

The notation above needs further explanation. Except for the subscripts on $\varphi$, the subscripts denote the interval over which a quantity is calculated. For example, $\mu_{N,j}$ denotes the mean of $\{\varphi_1(k)\}_{k=j}^{N}$. Similarly, $\sigma_{N,j}$ denotes the standard deviation of $\{\varphi_1(k)\}_{k=j}^{N}$ over the same interval.

Similar calculations, see [12], give the result for the other two combinations of vectors, namely columns one and three,

$$\cos(\phi) = \left(\frac{1}{1 + \dfrac{j-i}{N-j}}\right)^{1/2}, \qquad (12)$$

and columns one and four,

$$\cos(\phi) = \frac{1}{\left[\left(1 + \dfrac{j-i}{N-j}\cdot\dfrac{\sigma_{j,i}^2 + \mu_{j,i}^2}{\sigma_{N,j}^2 + \mu_{N,j}^2}\right)\dfrac{\sigma_{N,j}^2 + \mu_{N,j}^2}{\mu_{N,j}^2}\right]^{1/2}}. \qquad (13)$$

What do the expressions say about the angle between the columns? Expressions (11) and (12) will be close to one if $j - i$ is small, or in other words if two hinges are located very close to each other. In expression (13) it is not sufficient to have a small $j - i$ to obtain an ill-conditioned Jacobian; in addition, the variance of the input has to be small compared with its mean.
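To illustrate, the sketch below builds the four Jacobian columns of the example above for a one-dimensional input with two hinges placed close together (the hinge positions and input distribution are arbitrary choices) and computes the column angles and the condition number:

import numpy as np

rng = np.random.default_rng(5)
N = 500
phi1 = np.sort(rng.uniform(0.0, 1.0, N))    # 1-D input, sorted so the hinges split the samples

i, j = 300, 310                             # active sets start at samples i and j; j - i is small
col1 = np.zeros(N)
col1[j:] = 1.0                              # indicator column of hinge 1
col2 = np.zeros(N)
col2[j:] = phi1[j:]                         # phi-column of hinge 1
col3 = np.zeros(N)
col3[i:] = 1.0                              # indicator column of hinge 2
col4 = np.zeros(N)
col4[i:] = phi1[i:]                         # phi-column of hinge 2
J = -np.column_stack([col1, col2, col3, col4])

def cosine(a, b):
    return (a @ b) / np.sqrt((a @ a) * (b @ b))

print(cosine(col2, col4))    # close to 1 when the hinges are close, cf. (11)
print(cosine(col1, col3))    # close to 1 when the hinges are close, cf. (12)
print(cosine(col1, col4))    # cf. (13): also close to 1 here, since the input's variance is small relative to its mean
print(np.linalg.cond(J))     # large: nearly parallel columns make the Jacobian ill-conditioned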

A second phenomenon that can occur and result in ill-conditioning is if, e.g., two consecutive (in time) inputs are used in the regressor, i.e., if for example $u(t-1)$ and $u(t-2)$ are used, and the input contains little information so that $u(t-1) \approx u(t-2)$. This phenomenon is not unique to HH models. It has to do with the excitation capabilities of the input, which is a well-known issue in system identification.

4 Conclusions

This report has considered a number of topics relating to the parameterization of the hinging hyperplane model presented in [2]. We have seen that:

1) Hinging hyperplane models are truly overparameterized in the original form used in [2]. The number of parameters can be reduced to $(d+1)(M+1)$, compared to $2(d+1)M$ in the original parameterization, without restricting the flexibility of the model class.

2) The properties of the parameter estimation algorithm (Newton's algorithm) are not negatively affected by the parameterization. The benefit of using fewer parameters is a computationally less demanding estimation algorithm.

3) From the new parameterization it is evident that hinging hyperplane models are very similar to neural net models. Both models divide the space into half-spaces, and the difference lies in the way they model the behavior in the directions perpendicular to the hyperplanes dividing the space.

4) Similarly to neural net models, hinging hyperplane models are often overparameterized. The model structure is too flexible, and to avoid overfitting regularization usually has to be applied. The model structure can be expected to be ill-conditioned, which means that some directions in the parameter space are much more important than others. The regularization limits the efficient number of parameters by excluding the unimportant directions from the fit.


References

[1] A.R. Barron. "Universal Approximation Bounds for Superpositions of a Sigmoidal Function". IEEE Trans. on Information Theory, 39:930-945, May 1993.

[2] L. Breiman. Hinging hyperplanes for regression, classification and function approximation. IEEE Trans. Information Theory, 39(3):999-1013, 1993.

[3] J.E. Dennis and R.B. Schnabel. Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Prentice-Hall, Englewood Cliffs, New Jersey, 1983.

[4] G. Golub and C. van Loan. Matrix Computations. The Johns Hopkins University Press, Baltimore, MD, USA, 1989.

[5] A. Juditsky, H. Hjalmarsson, A. Benveniste, B. Delyon, L. Ljung, J. Sjöberg, and Q. Zhang. "Nonlinear Black-Box Models in System Identification: Mathematical Foundations". Automatica, 1995.

[6] L. Ljung. System Identification: Theory for the User. Prentice-Hall, Englewood Cliffs, NJ, 1987.

[7] T. McKelvey. Fully parametrized state-space models in system identification. In Proc. of the 10th IFAC Symposium on System Identification, volume 2, pages 373-378, Copenhagen, Denmark, July 1994.

[8] T. McKelvey. Personal communication, 1995.

[9] J.E. Moody. The effective number of parameters: An analysis of generalization and regularization in nonlinear learning systems. In J.E. Moody, S.J. Hanson, and R.P. Lippmann, editors, Advances in Neural Information Processing Systems 4. Morgan Kaufmann Publishers, San Mateo, CA, 1992.

[10] P. Pucar and M. Millnert. Smooth hinging hyperplanes - an alternative to neural networks. In Proc. 3rd ECC, Italy, 1995.

[11] P. Pucar and J. Sjöberg. On the hinge finding algorithm for hinging hyperplanes. Technical Report LiTH-ISY-R-1720, Department of Electrical Engineering, Linköping University, S-581 83 Linköping, Sweden, February 1995. Available by anonymous ftp from 130.236.24.1.

[12] P. Pucar and J. Sjöberg. On the parameterization of hinging hyperplane models. Technical Report LiTH-ISY-R-1727, Department of Electrical Engineering, Linköping University, Sweden, 1995.

[13] S. Saarinen, R. Bramley, and G. Cybenko. "Ill-Conditioning in Neural Network Training Problems". SIAM Journal Sci. Computing, 14(3):693-714, May 1993.

[14] J. Sjöberg and L. Ljung. Overtraining, regularization, and searching for minimum in neural networks. In Preprint 4th IFAC Symposium on Adaptive Systems in Control and Signal Processing, pages 669-674, Grenoble, France, 1992.

[15] J. Sjöberg, Q. Zhang, L. Ljung, A. Benveniste, B. Delyon, P-Y. Glorennec, H. Hjalmarsson, and A. Juditsky. "Non-Linear Black-Box Modeling in System Identification: a Unified Overview". Automatica, 1995.
