
Mixtures of traces of Wishart and inverse Wishart matrices


Communications in Statistics - Theory and Methods. ISSN: 0361-0926 (Print) 1532-415X (Online). Journal homepage: https://www.tandfonline.com/loi/lsta20

To cite this article: Jolanta Pielaszkiewicz & Thomas Holgersson (2019): Mixtures of traces of Wishart and inverse Wishart matrices, Communications in Statistics - Theory and Methods, DOI: 10.1080/03610926.2019.1691733

To link to this article: https://doi.org/10.1080/03610926.2019.1691733

© 2019 The Author(s). Published with license by Taylor & Francis Group, LLC. Published online: 19 Nov 2019.

Jolanta Pielaszkiewicz (a, b) and Thomas Holgersson (a)
(a) Department of Economics and Statistics, Linnaeus University, Växjö, Sweden; (b) Department of Statistics, Stockholm University, Stockholm, Sweden

ARTICLE HISTORY: Received 30 December 2018; Accepted 28 October 2019

ABSTRACT
Traces of Wishart matrices appear in many applications, for example in finance, discriminant analysis, Mahalanobis distances and angles, loss functions and many more. These applications typically involve mixtures of traces of Wishart and inverse Wishart matrices, which are the subject of this paper. Of particular interest are the sampling moments and their limiting joint distribution. The covariance matrix of the marginal positive and negative spectral moments is derived in closed form (the covariance matrix of $Y = [p^{-1}\mathrm{Tr}\{W^{-1}\},\ p^{-1}\mathrm{Tr}\{W\},\ p^{-1}\mathrm{Tr}\{W^{2}\}]'$, where $W \sim W_p(\Sigma = I, n)$). The results are obtained through convenient recursive formulas for $E\big[\prod_{i=0}^{k}\mathrm{Tr}\{W^{-m_i}\}\big]$ and $E\big[\mathrm{Tr}\{W^{-m_k}\}\prod_{i=0}^{k-1}\mathrm{Tr}\{W^{m_i}\}\big]$. Moreover, we derive an explicit central limit theorem for the scaled vector $Y$, when $p/n \to d < 1$, $p, n \to \infty$, and present a simulation study on the convergence to normality and on a skewness measure.

KEYWORDS: covariance matrix; central limit theorem; eigenvalue distribution; inverse Wishart matrix; Wishart matrix

MATHEMATICS SUBJECT CLASSIFICATION: 15B52; 60B20; 60F05

1. Introduction
Many application areas consider traces of Wishart ($W$) and inverse Wishart ($W^{-1}$) matrices. In particular the sample covariance matrix $S$, where $nS \sim W(\Sigma, n)$, is often involved. Finance is an example of an application area where such traces appear (Glombek 2014; Okhrin and Schmid 2006).
Similarly, discriminant analysis (Girko and Pavlenko 1989), results regarding Mahalanobis distances and angles (Mardia 1977; Dai and Holgersson 2018) and derivations on loss functions (Efron and Morris 1976) evaluate expressions built on traces of Wishart matrices. These applications typically involve mixtures of $\mathrm{Tr}[W^{k}]$ and $\mathrm{Tr}[W^{-k}]$, $k = 1, 2, \ldots$, whose expectations can be seen as positive and negative spectral moments. Those mixtures are our concern in this paper. Notice that a particular case of the trace of a Wishart matrix was discussed earlier in Glueck and Muller (1998), where the spectral decomposition was utilized to obtain the underlying characteristic function.

Such traces appear, for example, in the problem of finding an estimator of $\Sigma^{-1}$ which is optimal with respect to the loss function $L(H) := p^{-1}\mathrm{Tr}[(\hat{\Sigma}^{-1} - \Sigma^{-1})^{2}H]$. Common choices of $H$ include $H = I$, $H = S$ and $H = \Sigma^{2}$, where $S$ is the traditional sample covariance matrix. Estimators of $\Sigma^{-1}$ are typically of the general form $\hat{\Sigma}^{-1} = aS^{-1} + b\,g(S)I$, where $a$ and $b$ are scalar constants whose optimal values are to be determined through $L(H)$, and $g(S)$ is a mapping $g : \mathbb{R}^{p \times p} \to \mathbb{R}^{+}$. Specific choices such as $g(S) = 1/\sqrt{\mathrm{Tr}[S^{2}]}$ or $g(S) = \mathrm{Tr}[S]/\mathrm{Tr}[S^{2}]$ will then involve functions of traces of Wishart and inverse Wishart matrices (Efron and Morris 1976; Haff 1979).

The covariance matrix of the vector of marginal traces
$$Y = \big[p^{-1}\mathrm{Tr}\{S^{-1}\},\ p^{-1}\mathrm{Tr}\{S\},\ p^{-1}\mathrm{Tr}\{S^{2}\}\big]',$$
where $nS \sim W_p(\Sigma = I, n)$ (Wishart distribution), is derived in this paper, whereas the asymptotic normality is well established (see, for example, Theorem 3.4 in Yao, Zheng, and Bai (2015)). Explicit derivations of normality of the random variables $\sqrt{np}\,\big(p^{-1}\mathrm{Tr}\{S^{k}\} - E[p^{-1}\mathrm{Tr}\{S^{k}\}]\big)$ for $k = 1, 2$ can be found in Pielaszkiewicz, von Rosen, and Singull (2018). The asymptotic distribution of (smooth) functions of the elements in our CLT can be established through the delta method (Birke and Dette 2005). The result for the scaled covariance matrix of $Y$ is derived under $p/n \to d$ as $p, n \to \infty$.

This paper is organized as follows. Section 2 describes earlier results on moments and distributions of linear spectral statistics as a background for the article. In Section 3 some recursive results, utilized later for calculations regarding the covariance matrix of $Y$ in Section 4, are derived. Asymptotic results are presented in Section 4 together with some graphs illustrating the rate of convergence. Section 5 contains a simulation study on normality and skewness of the scaled vector $Y$; further simulations on results for expectations of mixed traces of Wishart and inverse Wishart matrices can be found in Appendix A. Section 6 summarizes the paper.

CONTACT: Jolanta Pielaszkiewicz, Jolanta.Pielaszkiewicz@stat.su.se, Department of Statistics, Stockholm University, SE-106 91 Stockholm, Sweden. Color versions of one or more of the figures in the article can be found online at www.tandfonline.com/lsta. © 2019 The Author(s). Published with license by Taylor & Francis Group, LLC. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

2. Background
Formulas for $E[(\mathrm{Tr}\{W\})^{k}]$ and $E[\mathrm{Tr}\{W^{k}\}]$, $W \sim W_p(\Sigma, n)$, have been proposed in a number of publications, for both complex and real Wishart matrices.
The techniques used to reach the desired results are rather diverse, including moment methods and combinatorics, Stieltjes transforms, density expansions, free probability and others. A survey of some methods for deriving properties of linear spectral statistics is given in Bai and Silverstein (2004). Here, we shall only mention a few, which are of particular relevance for this paper. The marginal expectations of powers of the trace of the Wishart matrix have been derived by Nel (1971) and then extended to a formula involving zonal polynomials by Gupta and Nagar (2000). The case of real Wishart matrices with $\Sigma = I$ was considered by, among others, Subrahmaniam (1976) (using zonal polynomials) and Letac and Massam (2004). In a similar way the expectation of the trace of the power of the Wishart matrix, i.e., $E[\mathrm{Tr}\{W^{k}\}]$, was studied. Results for a non-central Wishart distributed $W$ can be found in Gupta and Nagar (2000), and for a Wishart matrix with general covariance matrix $\Sigma$, i.e., $W \sim W_p(\Sigma, n)$, we refer to results by Fujikoshi, Ulyanov, and Shimizu (2011) and Letac and Massam (2008). The analogous formulas for complex Wishart matrices (where the conjugate transpose replaces the matrix transpose) are derived in explicit form by Hanlon, Stanley and Stembridge (1992) and in recursive form by Haagerup and Thorbjornsen (2003).

A recursive formula for expectations of products of traces of powers of Wishart matrices was given by Pielaszkiewicz, von Rosen, and Singull (2017), in the following form.

Theorem 2.1. (Pielaszkiewicz, von Rosen, and Singull 2017). Let $W \sim W_p(I, n)$. Then, the following recursive formula holds for all $k \in \mathbb{N}$ and all $m_0, m_1, \ldots, m_k$ such that $m_0 = 0$, $m_k \in \mathbb{N}$, $m_i \in \mathbb{N}_0$, $i = 1, \ldots, k-1$:
$$
\begin{aligned}
E\Big[\prod_{i=0}^{k}\mathrm{Tr}\{W^{m_i}\}\Big]
&= (n - p + m_k - 1)\,E\Big[\mathrm{Tr}\{W^{m_k-1}\}\prod_{i=0}^{k-1}\mathrm{Tr}\{W^{m_i}\}\Big] \\
&\quad + 2\sum_{i=0}^{k-1} m_i\,E\Big[\mathrm{Tr}\{W^{m_k+m_i-1}\}\prod_{\substack{j=0 \\ j \neq i}}^{k-1}\mathrm{Tr}\{W^{m_j}\}\Big] \\
&\quad + \sum_{i=0}^{m_k-1} E\Big[\mathrm{Tr}\{W^{i}\}\,\mathrm{Tr}\{W^{m_k-1-i}\}\prod_{j=0}^{k-1}\mathrm{Tr}\{W^{m_j}\}\Big].
\end{aligned}
\tag{1}
$$

Theorem 2.1 above is of particular interest here as it is to be generalized to also include the case of inverse Wishart matrices and mixed traces of Wishart and inverse Wishart matrices in Section 3 to follow. Although the mixed moments are of primary importance in this paper we are also interested in the joint limiting (weak) convergence of the sample Wishart traces. While rigorous central limit theorems for linear spectral statistics are available elsewhere in the literature, we shall here refer to a simplified result which is sufficient for our purpose.

Proposition 2.1. (Bai and Silverstein 2004). Let $X_f$ denote a linear spectral statistic, i.e., $X_f = p^{-1}\sum_{j=1}^{p} f(\lambda_j)$, where $\lambda_j(X'TX)$ is the eigenvalue of a random matrix $X = (X_{ij})_{1 \le i, j \le p}$ such that $X_{ij} \overset{iid}{\sim} N(0, 1)$ and $T$ a nonrandom matrix. Let $f_1, \ldots, f_k$ be continuous functions. Then $(X_{f_1}, \ldots, X_{f_k})$, with suitable normalizer, converges weakly to a Gaussian vector.

Remark 2.1. Proposition 2.1 is a simplified version of Bai and Silverstein (2004) Theorem 1.1. Their original theorem does not require $X_{ij}$ to be Gaussian, nor does it require $f$ to be continuous or independent of $\{\lambda_j\}$.
However, unlike most other available results on weak limits of Wishart traces, the above theorem allows us to use negative as well as positive powers, i.e., $f(\lambda) = \lambda^{k}$, $k \in \mathbb{Z}$, which in turn is necessary for our purpose. Hence, although the limiting normality of $(X_{f_1}, \ldots, X_{f_k})$ is merely a consequence of Bai and Silverstein (2004), the covariance matrix of the limiting normal distribution is currently unavailable in explicit form.

3. Recursive results
In this section we present recursive formulas for
$$E\Big[\prod_{i_1=0}^{k_1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_{i_1}}\}\prod_{i_2=0}^{k_2}\mathrm{Tr}\{(W\Sigma^{-1})^{m_{i_2}}\}\Big],$$
where $W \sim W_p(\Sigma, n)$, in two cases:

• for any $k_1 \in \mathbb{N}$, when $k_2 = 0$ and $m_{i_1}, m_{i_2} \in \mathbb{N}_0$, $i_1 = 1, \ldots, k_1$, $i_2 = 1, \ldots, k_2$,
• for any $k_2 \in \mathbb{N}$, when $k_1 = 1$ and $m_{i_1}, m_{i_2} \in \mathbb{N}_0$, $i_1 = 1, \ldots, k_1$, $i_2 = 1, \ldots, k_2$.

In other words we present an extension of the results of Pielaszkiewicz, von Rosen, and Singull (2017) to the negative spectral moments and to mixtures of positive and negative spectral moments of real Wishart matrices. The proofs of Theorems 3.1–3.2 to follow hold in multivariate settings when $n \ge p + k$, where $k \in \mathbb{N}^{+}$ is an increasing number depending on the degree of power traces to be included. For example, in order to get a nonsingular covariance matrix of $Y = [p^{-1}\mathrm{Tr}\{S^{-1}\},\ p^{-1}\mathrm{Tr}\{S\},\ p^{-1}\mathrm{Tr}\{S^{2}\}]'$, we need $n \ge p + 4$ as shown in Section 4 to follow.

The operator $\frac{d}{dX}$ is used to differentiate the Wishart density function with respect to a symmetric matrix $\Sigma$. For $Y \in \mathbb{R}^{q \times r}$ and $X \in \mathbb{R}^{p \times p}$ we define our differential operator as
$$
\frac{dY}{dX} = \sum_{\mathcal{I}} \frac{\partial y_{ij}}{\partial x_{kl}}\,(g_l \otimes g_k)\,\epsilon_{kl}\,(e_j \otimes d_i)', \qquad
\epsilon_{kl} = \begin{cases} 1 & : k = l, \\ \tfrac{1}{2} & : k \neq l, \end{cases}
\tag{2}
$$
where $\mathcal{I} = \{i, j, k, l : 1 \le i \le q,\ 1 \le j \le r,\ 1 \le k \le p,\ 1 \le l \le n\}$, $d_i$, $e_j$ and $g_k$ are the $i$-th, $j$-th and $k$-th columns of $I_q$, $I_r$ and $I_p$, respectively, and $\otimes$ denotes the Kronecker product. The properties of the operator (2) are listed in the Appendix of Pielaszkiewicz, von Rosen, and Singull (2017) or originally in Kollo and von Rosen (2005). In the derivations presented further in the paper, the rule for differentiating a negative power of a matrix will be essential due to the appearance of the inverse Wishart matrix in our formulations:
$$
\frac{dY^{-n}}{dX} = -\frac{dY}{dX}\Bigg(\sum_{\substack{i + j = n - 1 \\ i, j \ge 0}} Y^{-i-1} \otimes (Y')^{-j-1}\Bigg).
\tag{3}
$$

We will also use the vec-operator $\mathrm{vec}(\cdot)$ and the commutation matrix $K_{p,p}$ defined as
$$K_{p,p} = \sum_{\mathcal{I}} (e_i e_j') \otimes (e_j e_i'),$$
where $\mathcal{I} = \{i, j : 1 \le i \le p,\ 1 \le j \le p\}$ and $e_i$ is the $i$-th column of $I_p$. Then, assuming sizes are appropriate,
$$\mathrm{Tr}\{AB\} = \mathrm{vec}'(A')\,\mathrm{vec}(B) \tag{4}$$
$$\mathrm{Tr}\{A \otimes B\} = \mathrm{Tr}\{A\}\,\mathrm{Tr}\{B\} \tag{5}$$
$$\mathrm{Tr}\{K_{p,p}(A \otimes B)\} = \mathrm{Tr}\{AB\} \tag{6}$$
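The trace identities (4)–(6) are easy to sanity-check numerically. The following is a minimal sketch in plain Python, not taken from the paper: the helper names (`matmul`, `kron`, `vec`, `commutation`) and the two test matrices are illustrative assumptions, and `vec` is assumed to stack columns.

```python
# Illustrative check of identities (4)-(6) on small integer matrices.
# All helper names are hypothetical; nothing here comes from the paper's code.

def matmul(A, B):
    """Ordinary matrix product of two dense list-of-lists matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def kron(A, B):
    """Kronecker product of A and B."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def vec(A):
    """Stack the columns of A into one flat list (column-major order)."""
    return [A[i][j] for j in range(len(A[0])) for i in range(len(A))]

def commutation(p):
    """K_{p,p} = sum over i, j of (e_i e_j') kron (e_j e_i')."""
    K = [[0] * (p * p) for _ in range(p * p)]
    for i in range(p):
        for j in range(p):
            K[i * p + j][j * p + i] = 1
    return K

A = [[1, 2], [3, 4]]
B = [[0, 5], [6, 7]]

lhs4 = trace(matmul(A, B))
rhs4 = sum(x * y for x, y in zip(vec(transpose(A)), vec(B)))   # identity (4)
lhs5 = trace(kron(A, B))                                        # identity (5)
lhs6 = trace(matmul(commutation(2), kron(A, B)))                # identity (6)
```

For these two matrices `lhs4`, `rhs4` and `lhs6` all equal `Tr{AB}`, while `lhs5` equals `Tr{A} Tr{B}`.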
Let us first prove the recursive formula for the expectation of the product of traces of powers of inverse Wishart matrices as given in Theorem 3.1.

Theorem 3.1. Let $W \sim W_p(I, n)$. Then, the following recursive formula holds for all $k \in \mathbb{N}$ and all $m_0, m_1, \ldots, m_k$ such that $m_0 = 0$, $m_k \in \mathbb{N}$, $m_i \in \mathbb{N}_0$, $i = 1, \ldots, k-1$:

$$
\begin{aligned}
(n - p - m_k)\,E\Big[\prod_{i=0}^{k}\mathrm{Tr}\{W^{-m_i}\}\Big]
&= E\Big[\mathrm{Tr}\{W^{-m_k+1}\}\prod_{i=0}^{k-1}\mathrm{Tr}\{W^{-m_i}\}\Big] \\
&\quad + 2\sum_{i=0}^{k-1} m_i\,E\Big[\mathrm{Tr}\{W^{-m_k-m_i}\}\prod_{\substack{j=0 \\ j \neq i}}^{k-1}\mathrm{Tr}\{W^{-m_j}\}\Big] \\
&\quad + \sum_{i=0}^{m_k-2} E\Big[\mathrm{Tr}\{W^{-i-1}\}\,\mathrm{Tr}\{W^{-m_k+1+i}\}\prod_{j=0}^{k-1}\mathrm{Tr}\{W^{-m_j}\}\Big].
\end{aligned}
\tag{7}
$$

Proof. The formula to be proven is recursive with respect to the power $m_k$. We denote by $\Sigma$ the covariance of a Wishart matrix $W \sim W_p(\Sigma, n)$, as we derive the result through differentiating over $\Sigma$. For such a matrix $W$ we have the following equality:
$$
\underbrace{\Sigma^{-1} E\Big[\prod_{i=0}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_i}\}\,(W\Sigma^{-1})^{-m_k}\Big]}_{L}
= \underbrace{\int \Sigma^{-1}\prod_{i=0}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_i}\}\,(W\Sigma^{-1})^{-m_k} f_W\,dW}_{R}
\tag{8}
$$
where $f_W$ denotes the density function of $W$. The trace function $\mathrm{Tr}\{\cdot\}$ is applied after differentiation using the operator $\frac{d}{d\Sigma^{-1}}$. Then, since $E\big[\prod_{i=0}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_i}\}(W\Sigma^{-1})^{-m_k}\big]$ does not depend on $\Sigma$, we obtain the following equation system
$$
\begin{cases}
\mathrm{Tr}\Big\{\dfrac{d(L)}{d\Sigma^{-1}}\Big\} = \dfrac{p+1}{2}\,E\Big[\prod_{i=0}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_i}\}\,\mathrm{Tr}\{(W\Sigma^{-1})^{-m_k}\}\Big] \\[2ex]
\mathrm{Tr}\Big\{\dfrac{d(R)}{d\Sigma^{-1}}\Big\} = \mathrm{Tr}\,B + \mathrm{Tr}\,C
\end{cases}
$$
where
$$
\begin{aligned}
B &= \int \frac{df_W}{d\Sigma^{-1}}\,\mathrm{vec}'\Big(\Sigma^{-1}\prod_{i=0}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_i}\}\,(W\Sigma^{-1})^{-m_k}\Big)\,dW, \\
C &= \int \frac{d\Big(\prod_{i=0}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_i}\}\,\Sigma^{-1}(W\Sigma^{-1})^{-m_k}\Big)}{d\Sigma^{-1}}\,f_W\,dW.
\end{aligned}
\tag{9}
$$
The operator $\frac{d}{d\Sigma^{-1}}$ applied to the density function $f_W$ gives the following identity
$$\frac{df_W}{d\Sigma^{-1}} = \frac{1}{2}\big(n\,\mathrm{vec}\,\Sigma - \mathrm{vec}\,W\big) f_W,$$
which in turn gives
$$
\mathrm{Tr}\,B = \frac{n}{2}\,E\Big[\prod_{i=0}^{k}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_i}\}\Big]
- \frac{1}{2}\,E\Big[\prod_{i=0}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_i}\}\,\mathrm{Tr}\{(W\Sigma^{-1})^{-m_k+1}\}\Big].
$$
Applying the elementary properties of the differential operator, such as (3), we may rewrite $C$ in (9) as a sum $C = C_1 + C_2$, where

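$$
C_2 = E\Bigg[\frac{d\Big(\prod_{i=0}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_i}\}\Big)}{d\Sigma^{-1}}\,\mathrm{vec}'\big(\Sigma^{-1}(W\Sigma^{-1})^{-m_k}\big)\Bigg]
$$
and
$$
\begin{aligned}
C_1 &= E\Bigg[\frac{d\big(\Sigma^{-1}(W\Sigma^{-1})^{-m_k}\big)}{d\Sigma^{-1}}\prod_{i=0}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_i}\}\Bigg] \\
&= E\Bigg[\frac{d\big((W\Sigma^{-1})^{-m_k+1}\big)}{d\Sigma^{-1}}\,(I \otimes W^{-1})\prod_{i=0}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_i}\}\Bigg].
\end{aligned}
$$
For $m_k = 1$ we have $C_1 = 0$, while for $m_k \ge 2$ we obtain, writing $\sum_{i,j}$ for the sum over $i + j = m_k - 2$, $i, j \ge 0$,
$$
\begin{aligned}
C_1 &= -E\Bigg[\frac{d(W\Sigma^{-1})}{d\Sigma^{-1}}\Big\{\sum_{i,j}(W\Sigma^{-1})^{-i-1} \otimes \big((W\Sigma^{-1})'\big)^{-j-1}\Big\}(I \otimes W^{-1})\prod_{l=0}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_l}\}\Bigg] \\
&= -E\Bigg[\frac{I + K_{p,p}}{2}\,(I \otimes W)\Big\{\sum_{i,j}(W\Sigma^{-1})^{-i-1} \otimes \big((W\Sigma^{-1})'\big)^{-j-1}\Big\}(I \otimes W^{-1})\prod_{l=0}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_l}\}\Bigg] \\
&= -\frac{1}{2}E\Bigg[\Big\{\sum_{i,j}(W\Sigma^{-1})^{-i-1} \otimes W\big((W\Sigma^{-1})'\big)^{-j-1}W^{-1}\Big\}\prod_{l=0}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_l}\}\Bigg] \\
&\quad -\frac{1}{2}E\Bigg[K_{p,p}\Big\{\sum_{i,j}(W\Sigma^{-1})^{-i-1} \otimes W\big((W\Sigma^{-1})'\big)^{-j-1}W^{-1}\Big\}\prod_{l=0}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_l}\}\Bigg].
\end{aligned}
$$
Hence,
$$
\begin{aligned}
\mathrm{Tr}\{C_1\} &= -\frac{1}{2}\sum_{\substack{i + j = m_k - 2 \\ i, j \ge 0}} E\Big[\mathrm{Tr}\{(W\Sigma^{-1})^{-i-1}\}\,\mathrm{Tr}\{(W\Sigma^{-1})^{-j-1}\}\prod_{l=0}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_l}\}\Big] \\
&\quad - \frac{m_k - 1}{2}\,E\Big[\mathrm{Tr}\{(W\Sigma^{-1})^{-m_k}\}\prod_{l=0}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_l}\}\Big].
\end{aligned}
\tag{10}
$$
Next, the expectation $C_2$ can be rewritten as $C_2 = D_1 + D_2$, where
$$
\begin{aligned}
D_1 &= E\Bigg[\frac{d\Big(\prod_{i=0}^{k-2}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_i}\}\Big)}{d\Sigma^{-1}}\,\mathrm{Tr}\{(W\Sigma^{-1})^{-m_{k-1}}\}\,\mathrm{vec}'\big(\Sigma^{-1}(W\Sigma^{-1})^{-m_k}\big)\Bigg], \\
D_2 &= E\Bigg[\frac{d\big(\mathrm{Tr}\{(W\Sigma^{-1})^{-m_{k-1}}\}\big)}{d\Sigma^{-1}}\prod_{i=0}^{k-2}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_i}\}\,\mathrm{vec}'\big(\Sigma^{-1}(W\Sigma^{-1})^{-m_k}\big)\Bigg],
\end{aligned}
$$
using the relation $\frac{dW}{dX} = \frac{dY}{dX}\frac{dW}{dY} + \frac{dZ}{dX}\frac{dW}{dZ}$, where $W = W(Y(X), Z(X))$, for the operator $\frac{d}{d\Sigma^{-1}}$. Using the chain rule we obtain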

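$$
D_2 = E\Bigg[-\frac{1}{2}(I + K_{p,p})\Big\{\sum_{\substack{i + j = m_{k-1} - 1 \\ i, j \ge 0}}(W\Sigma^{-1})^{-i-1} \otimes W\big((W\Sigma^{-1})'\big)^{-j-1}\Big\}\,\mathrm{vec}\,I\,\prod_{i=0}^{k-2}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_i}\}\,\mathrm{vec}'\big(\Sigma^{-1}(W\Sigma^{-1})^{-m_k}\big)\Bigg].
$$
The corresponding expectation of the trace is given by
$$
\mathrm{Tr}\{D_2\} = -m_{k-1}\,E\Big[\mathrm{Tr}\{(W\Sigma^{-1})^{-m_k-m_{k-1}}\}\prod_{i=0}^{k-2}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_i}\}\Big].
$$
Repeating a similar calculation for $D_1$ we have
$$
\mathrm{Tr}\{C_2\} = -\sum_{i=0}^{k-1} m_i\,E\Big[\mathrm{Tr}\{(W\Sigma^{-1})^{-m_k-m_i}\}\prod_{\substack{j=0 \\ j \neq i}}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_j}\}\Big].
\tag{11}
$$
Taken together we reach the final equation
$$
\begin{aligned}
\frac{p+1}{2}\,E\Big[\prod_{l=0}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_l}\}\,\mathrm{Tr}\{(W\Sigma^{-1})^{-m_k}\}\Big]
&= \frac{n}{2}\,E\Big[\prod_{l=0}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_l}\}\,\mathrm{Tr}\{(W\Sigma^{-1})^{-m_k}\}\Big] \\
&\quad - \frac{1}{2}\,E\Big[\prod_{l=0}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_l}\}\,\mathrm{Tr}\{(W\Sigma^{-1})^{-m_k+1}\}\Big] \\
&\quad - \frac{1}{2}\sum_{\substack{i + j = m_k - 2 \\ i, j \ge 0}} E\Big[\mathrm{Tr}\{(W\Sigma^{-1})^{-i-1}\}\,\mathrm{Tr}\{(W\Sigma^{-1})^{-j-1}\}\prod_{l=0}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_l}\}\Big] \\
&\quad - \frac{m_k - 1}{2}\,E\Big[\mathrm{Tr}\{(W\Sigma^{-1})^{-m_k}\}\prod_{l=0}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_l}\}\Big] \\
&\quad - \sum_{i=0}^{k-1} m_i\,E\Big[\mathrm{Tr}\{(W\Sigma^{-1})^{-m_k-m_i}\}\prod_{\substack{j=0 \\ j \neq i}}^{k-1}\mathrm{Tr}\{(W\Sigma^{-1})^{-m_j}\}\Big],
\end{aligned}
$$
which is equivalent to the statement of the theorem. □

Examples 3.1–3.2 below demonstrate the use of Theorem 3.1 above. Example 3.1 gives $E[\mathrm{Tr}\{W^{-1}\}]$ while Example 3.2 gives $E[\mathrm{Tr}\{W^{-1}\}\mathrm{Tr}\{W^{-1}\}]$ and $E[\mathrm{Tr}\{W^{-2}\}]$.

Example 3.1. Using formula (7) we obtain $E[\mathrm{Tr}\{W^{-1}\}]$ as follows:
$$
\begin{aligned}
(n - p - 1)\,E\big[\mathrm{Tr}\{W^{0}\}\,\mathrm{Tr}\{W^{-1}\}\big] &= E\big[\mathrm{Tr}\{W^{0}\}\,\mathrm{Tr}\{W^{0}\}\big] \\
(n - p - 1)\,p\,E\big[\mathrm{Tr}\{W^{-1}\}\big] &= p^{2} \\
E\big[\mathrm{Tr}\{W^{-1}\}\big] &= \frac{p}{n - p - 1}
\end{aligned}
$$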
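As a cross-check that is not part of the original derivation, the value in Example 3.1 agrees with the standard first moment of the inverse Wishart distribution, $E[W^{-1}] = \Sigma^{-1}/(n - p - 1)$ for $n > p + 1$. Taking traces with $\Sigma = I$ gives the same expression; the numerical instance $n = 10$, $p = 4$ is the setting used in Appendix A.

```latex
E\big[\mathrm{Tr}\{W^{-1}\}\big]
  = \mathrm{Tr}\big\{E[W^{-1}]\big\}
  = \mathrm{Tr}\Big\{\frac{1}{n-p-1}\,I_p\Big\}
  = \frac{p}{n-p-1},
\qquad
n = 10,\ p = 4:\quad \frac{4}{10-4-1} = 0.8 .
```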

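Example 3.2. Using formula (7) and Example 3.1 we obtain closed-form expressions for $E[\mathrm{Tr}\{W^{-1}\}\mathrm{Tr}\{W^{-1}\}]$ and $E[\mathrm{Tr}\{W^{-2}\}]$ as the solution to the following system of equations
$$
\begin{cases}
(n - p - 1)\,E\big[\mathrm{Tr}\{W^{0}\}\mathrm{Tr}\{W^{-1}\}\mathrm{Tr}\{W^{-1}\}\big] = \dfrac{p^{3}}{n - p - 1} + 2E\big[\mathrm{Tr}\{W^{-2}\}\mathrm{Tr}\{W^{0}\}\big] \\[2ex]
(n - p - 2)\,E\big[\mathrm{Tr}\{W^{0}\}\mathrm{Tr}\{W^{-2}\}\big] = \dfrac{p^{2}}{n - p - 1} + E\big[\mathrm{Tr}\{W^{0}\}\mathrm{Tr}\{W^{-1}\}\mathrm{Tr}\{W^{-1}\}\big]
\end{cases}
$$
that can be simplified to
$$
\begin{cases}
(n - p - 1)\,E\big[\mathrm{Tr}\{W^{-1}\}\mathrm{Tr}\{W^{-1}\}\big] = \dfrac{p^{2}}{n - p - 1} + 2E\big[\mathrm{Tr}\{W^{-2}\}\big] \\[2ex]
(n - p - 2)\,E\big[\mathrm{Tr}\{W^{-2}\}\big] = \dfrac{p}{n - p - 1} + E\big[\mathrm{Tr}\{W^{-1}\}\mathrm{Tr}\{W^{-1}\}\big]
\end{cases}
$$
We then obtain
$$
E\big[\mathrm{Tr}\{W^{-1}\}\mathrm{Tr}\{W^{-1}\}\big]
= \frac{\dfrac{p^{2}}{n - p - 1} + \dfrac{2p}{(n - p - 1)(n - p - 2)}}{n - p - 1 - \dfrac{2}{n - p - 2}}
= \frac{p\,(p(n - p - 2) + 2)}{(n - p - 3)(n - p - 1)(n - p)}
$$
and
$$
E\big[\mathrm{Tr}\{W^{-2}\}\big] = \frac{(n - 1)\,p}{(n - p - 3)(n - p - 1)(n - p)}.
$$

Next, we will establish results for products of traces of Wishart and inverse Wishart matrices. This is obtained as follows:

Theorem 3.2. Let $W \sim W_p(I, n)$. Then, the following recursive formula holds for all $k \in \mathbb{N}$ and all $m_0, m_1, \ldots, m_k$ such that $m_0 = 0$, $m_k \in \mathbb{N}$, $m_i \in \mathbb{N}_0$, $i = 1, \ldots, k-1$:
$$
\begin{aligned}
(n - p - m_k)\,E\Big[\mathrm{Tr}\{W^{-m_k}\}\prod_{i=0}^{k-1}\mathrm{Tr}\{W^{m_i}\}\Big]
&= E\Big[\mathrm{Tr}\{W^{-m_k+1}\}\prod_{i=0}^{k-1}\mathrm{Tr}\{W^{m_i}\}\Big] \\
&\quad - 2\sum_{i=0}^{k-1} m_i\,E\Big[\mathrm{Tr}\{W^{-m_k+m_i}\}\prod_{\substack{j=0 \\ j \neq i}}^{k-1}\mathrm{Tr}\{W^{m_j}\}\Big] \\
&\quad + \sum_{i=0}^{m_k-2} E\Big[\mathrm{Tr}\{W^{-i-1}\}\,\mathrm{Tr}\{W^{-m_k+1+i}\}\prod_{j=0}^{k-1}\mathrm{Tr}\{W^{m_j}\}\Big].
\end{aligned}
\tag{12}
$$
The proof is omitted due to its analogy to the proof of Theorem 3.1. However, for the purpose of demonstration we present an example as follows:

Example 3.3. Using formula (12) and $E[\mathrm{Tr}\{W\}] = np$ we get
$$
E\big[\mathrm{Tr}\{W\}\,\mathrm{Tr}\{W^{-1}\}\big] = \frac{p\,(np - 2)}{n - p - 1}.
$$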
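The algebra in Examples 3.2 and 3.3 can be replayed in exact rational arithmetic. The sketch below is an illustration rather than the authors' code: the function names are hypothetical, and it simply solves the two-equation system of Example 3.2 by substitution and evaluates the single recursion step behind Example 3.3, assuming $n > p + 3$.

```python
from fractions import Fraction

def inverse_trace_moments(n, p):
    """Return (E[Tr W^-1], E[(Tr W^-1)^2], E[Tr W^-2]) for W ~ W_p(I, n).

    Hypothetical helper: solves the system of Example 3.2 exactly,
    assuming n > p + 3 so that every denominator is nonzero.
    """
    n, p = Fraction(n), Fraction(p)
    a = n - p - 1
    m1 = p / a  # Example 3.1
    # System of Example 3.2 with A = E[(Tr W^-1)^2], B = E[Tr W^-2]:
    #   a * A       = p^2 / a + 2 * B
    #   (a - 1) * B = p / a + A
    # Substituting B from the second equation into the first gives A.
    A = (p**2 / a + 2 * p / (a * (a - 1))) / (a - 2 / (a - 1))
    B = (p / a + A) / (a - 1)
    return m1, A, B

def mixed_moment(n, p):
    """E[Tr W Tr W^-1] from one step of (12), using E[Tr W] = n p."""
    n, p = Fraction(n), Fraction(p)
    # (n - p - 1) * p * E[Tr W Tr W^-1] = p^2 * (n p) - 2 p^2
    return (n * p**3 - 2 * p**2) / ((n - p - 1) * p)

def closed_forms(n, p):
    """Closed-form expressions stated in Examples 3.2 and 3.3."""
    n, p = Fraction(n), Fraction(p)
    den = (n - p - 3) * (n - p - 1) * (n - p)
    return (p * (p * (n - p - 2) + 2) / den,
            (n - 1) * p / den,
            p * (n * p - 2) / (n - p - 1))
```

For $n = 10$, $p = 4$ (the setting of Appendix A), `inverse_trace_moments(10, 4)` returns the fractions 4/5, 4/5 and 2/5, matching the theoretical values 0.8 and 0.4 reported for $E[\mathrm{Tr}\{W^{-1}\}]$ and $E[\mathrm{Tr}\{W^{-2}\}]$.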

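4. Covariance matrix and a central limit theorem
Let us consider the vector
$$
Y_W = \big[p^{-1}\mathrm{Tr}\{W^{-1}\},\ p^{-1}\mathrm{Tr}\{W\},\ p^{-1}\mathrm{Tr}\{W^{2}\}\big]'
= \big[(np)^{-1}\mathrm{Tr}\{S^{-1}\},\ np^{-1}\mathrm{Tr}\{S\},\ n^{2}p^{-1}\mathrm{Tr}\{S^{2}\}\big]',
$$
where $nS = W \sim W_p(I, n)$, $n \ge p + 4$. By Theorem 3.1 given by Pielaszkiewicz, von Rosen, and Singull (2017), restated above as formula (1), we obtain the following moments.
$$
\begin{aligned}
E\big[p^{-1}\mathrm{Tr}\{W\}\big] &\overset{(1)}{=} n \\
E\big[p^{-2}\mathrm{Tr}\{W\}\mathrm{Tr}\{W\}\big] &\overset{(1)}{=} \frac{n}{p}(np + 2) \\
\mathrm{Var}\big[p^{-1}\mathrm{Tr}\{W\}\big] &= \frac{n}{p}(np + 2) - n^{2} = 2\,\frac{n}{p} \\
E\big[p^{-1}\mathrm{Tr}\{W^{2}\}\big] &\overset{(1)}{=} n(n + p + 1) \\
E\big[p^{-2}\mathrm{Tr}\{W^{2}\}\mathrm{Tr}\{W^{2}\}\big] &\overset{(1)}{=} \frac{n}{p}\Big((np + 4)(n + p + 1)^{2} + 4\big(np + (n + p + 1)(n + p + 2) + 2\big)\Big) \\
\mathrm{Var}\big[p^{-1}\mathrm{Tr}\{W^{2}\}\big] &= \frac{4n}{p}\big(2n^{2} + 5n(p + 1) + p(2p + 5) + 5\big) \\
E\big[p^{-2}\mathrm{Tr}\{W\}\mathrm{Tr}\{W^{2}\}\big] &\overset{(1)}{=} \frac{n}{p}(np + 4)(n + p + 1) \\
\mathrm{Cov}\big[p^{-1}\mathrm{Tr}\{W\},\ p^{-1}\mathrm{Tr}\{W^{2}\}\big] &= \frac{n}{p}(np + 4)(n + p + 1) - n^{2}(n + p + 1) = 4\,\frac{n}{p}(n + p + 1)
\end{aligned}
$$
By applying Theorems 3.1–3.2 we are now able to give a closed-form expression for the covariance matrix of $Y$ as a function of $n$ and $p$.
$$
\begin{aligned}
E\big[p^{-1}\mathrm{Tr}\{W^{-1}\}\big] &\overset{\text{Th. 3.1}}{=} \frac{1}{n - p - 1} \\
E\big[p^{-1}\mathrm{Tr}\{W^{-1}\}\,p^{-1}\mathrm{Tr}\{W^{-1}\}\big] &\overset{\text{Ex. 3.2}}{=} \frac{p(n - p - 2) + 2}{p(n - p - 3)(n - p - 1)(n - p)} \\
\mathrm{Var}\big[p^{-1}\mathrm{Tr}\{W^{-1}\}\big] &= \frac{2(n - 1)}{p(n - p - 3)(n - p)(p - n + 1)^{2}} \\
E\big[p^{-1}\mathrm{Tr}\{W\}\,p^{-1}\mathrm{Tr}\{W^{-1}\}\big] &\overset{\text{Ex. 3.3}}{=} \frac{np - 2}{p(n - p - 1)} \\
\mathrm{Cov}\big[p^{-1}\mathrm{Tr}\{W\},\ p^{-1}\mathrm{Tr}\{W^{-1}\}\big] &= \frac{2}{p^{2} - np + p} \\
E\big[p^{-1}\mathrm{Tr}\{W^{2}\}\,p^{-1}\mathrm{Tr}\{W^{-1}\}\big] &\overset{\text{Th. 3.2}}{=} \frac{n(np + p^{2} + p - 4)}{p(n - p - 1)} \\
\mathrm{Cov}\big[p^{-1}\mathrm{Tr}\{W^{2}\},\ p^{-1}\mathrm{Tr}\{W^{-1}\}\big] &= \frac{4n}{p^{2} - np + p}
\end{aligned}
$$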
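The step from the second moment of $p^{-1}\mathrm{Tr}\{W^{2}\}$ to its variance is stated without intermediate algebra. A short worked version, using only the moments listed above and the shorthand $s = n + p + 1$ (a symbol introduced here for convenience, not in the paper), is:

```latex
\begin{aligned}
\mathrm{Var}\big[p^{-1}\mathrm{Tr}\{W^{2}\}\big]
 &= \frac{n}{p}\Big((np+4)\,s^{2} + 4\big(np + s(s+1) + 2\big)\Big) - n^{2}s^{2} \\
 &= \frac{n}{p}\Big(4s^{2} + 4np + 4s(s+1) + 8\Big)
   \qquad\text{since } \tfrac{n}{p}\,(np)\,s^{2} = n^{2}s^{2} \\
 &= \frac{4n}{p}\Big(2s^{2} + s + np + 2\Big) \\
 &= \frac{4n}{p}\Big(2n^{2} + 5n(p+1) + p(2p+5) + 5\Big),
\end{aligned}
```

where the last line expands $2s^{2} + s = 2n^{2} + 2p^{2} + 4np + 5n + 5p + 3$.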

Presented in a covariance matrix we have thus obtained the following:
$$
\Sigma_{Y_W} =
\begin{pmatrix}
\dfrac{2(n - 1)}{p(n - p - 3)(n - p)(p - n + 1)^{2}} & \dfrac{2}{p^{2} - np + p} & 4\,\dfrac{n}{p}\,\dfrac{1}{p - n + 1} \\[2ex]
\star & 2\,\dfrac{n}{p} & 4\,\dfrac{n}{p}(n + p + 1) \\[2ex]
\star & \star & 4\,\dfrac{n}{p}\big(2n^{2} + 5n(p + 1) + p(2p + 5) + 5\big)
\end{pmatrix}
$$
In terms of sample covariance matrices (i.e., replacing $W$ with $S$) we finally reach the covariance matrix for the random vector $Y = [p^{-1}\mathrm{Tr}\{S^{-1}\},\ p^{-1}\mathrm{Tr}\{S\},\ p^{-1}\mathrm{Tr}\{S^{2}\}]'$,
$$
\Sigma_{Y} =
\begin{pmatrix}
\dfrac{2(n - 1)n^{2}}{p(n - p - 3)(n - p)(p - n + 1)^{2}} & \dfrac{2}{p^{2} - np + p} & 4\,\dfrac{1}{p}\,\dfrac{1}{p - n + 1} \\[2ex]
\star & 2\,\dfrac{1}{np} & 4\,\dfrac{1}{pn^{2}}(n + p + 1) \\[2ex]
\star & \star & 4\,\dfrac{1}{pn^{3}}\big(2n^{2} + 5n(p + 1) + p(2p + 5) + 5\big)
\end{pmatrix}
$$
The matrix $\Sigma_Y$ vanishes asymptotically as the sample size $n$ and dimension $p$ increase. However, a proper scaling can be applied so that the covariance matrix of the scaled $Y$ converges to a non-trivial matrix whose elements are functions of $d$ (the asymptotics are $n, p \to \infty$ under the Kolmogorov condition $p/n \to d$). We propose a symmetric scaling $\sqrt{np}$ of the vector $Y$ that leads to the asymptotic covariance matrix presented below:
$$
\tilde{\Sigma}_{\sqrt{np}\,Y} =
\begin{pmatrix}
\dfrac{2}{(1 - d)^{4}} & \dfrac{2}{d - 1} & \dfrac{4}{d - 1} \\[1.5ex]
\star & 2 & 4(1 + d) \\[1ex]
\star & \star & 4(2 + 5d + 2d^{2})
\end{pmatrix}
$$
while $p, n \to \infty$ and $p/n \to d < 1$. Finally, we state a result on the covariance matrix in the form of a theorem:

Theorem 4.1. Let $nS \sim W_p(I, n)$. Then, the random vector
$$Y = \big[p^{-1}\mathrm{Tr}\{S^{-1}\},\ p^{-1}\mathrm{Tr}\{S\},\ p^{-1}\mathrm{Tr}\{S^{2}\}\big]'$$
has covariance matrix
$$
\Sigma_{Y} =
\begin{pmatrix}
\dfrac{2(n - 1)n^{2}}{p(n - p - 3)(n - p)(p - n + 1)^{2}} & \dfrac{2}{p^{2} - np + p} & 4\,\dfrac{1}{p}\,\dfrac{1}{p - n + 1} \\[2ex]
\star & 2\,\dfrac{1}{np} & 4\,\dfrac{1}{pn^{2}}(n + p + 1) \\[2ex]
\star & \star & 4\,\dfrac{1}{pn^{3}}\big(2n^{2} + 5n(p + 1) + p(2p + 5) + 5\big)
\end{pmatrix}
\tag{13}
$$
and hence the covariance matrix of $\sqrt{np}\,Y$ converges asymptotically to

$$
\tilde{\Sigma}_{\sqrt{np}\,Y} =
\begin{pmatrix}
\dfrac{2}{(1 - d)^{4}} & \dfrac{2}{d - 1} & \dfrac{4}{d - 1} \\[1.5ex]
\star & 2 & 4(1 + d) \\[1ex]
\star & \star & 4(2 + 5d + 2d^{2})
\end{pmatrix}
$$
while $p, n \to \infty$, $p/n \to d < 1$.

Let $nS \sim W_p(I, n)$, $n \ge p + 4$, $Y = [p^{-1}\mathrm{Tr}\{S^{-1}\},\ p^{-1}\mathrm{Tr}\{S\},\ p^{-1}\mathrm{Tr}\{S^{2}\}]'$ and let $\Sigma_Y$ be defined as in Theorem 4.1. We summarize the paper in the following central limit theorem:

Theorem 4.2. Let $nS \sim W_p(I, n)$ as in Theorem 4.1. Then it holds that
$$
\sqrt{np}\,(Y - E[Y]) \xrightarrow{\ \mathcal{L}\ } N\big(0,\ \tilde{\Sigma}_{\sqrt{np}\,Y}\big)
$$
as $n, p \to \infty$, $p/n \to d < 1$.

Proof. Follows from Theorem 4.1 and Proposition 2.1. □

Although the exact covariance matrix $\Sigma_{\sqrt{np}\,Y}$ and the simplified version $\tilde{\Sigma}_{\sqrt{np}\,Y}$ are asymptotically equivalent, we wish to investigate the rate of convergence to better understand the difference between those matrices and hence of Theorem 4.2. For this purpose we will use
$$
\varphi(p, d) = \frac{1}{p}\,\mathrm{Tr}\Big\{\big(\tilde{\Sigma}_{\sqrt{np}\,Y}\,(\Sigma_{\sqrt{np}\,Y})^{-1} - I\big)^{2}\Big\}
$$
as a measure of the discrepancy between $\Sigma_{\sqrt{np}\,Y}$ and $\tilde{\Sigma}_{\sqrt{np}\,Y}$. In Figure 1 we display the graphs of $\varphi(p, 0.2)$ and $\varphi(p, 0.8)$. All values are obtained analytically. As expected, faster convergence of $\Sigma_{\sqrt{np}\,Y}$ to $\tilde{\Sigma}_{\sqrt{np}\,Y}$ is observed for small values of the ratio $d$, i.e., when $p$ is significantly smaller than $n$.

Figure 1. $\varphi(p, d) = \frac{1}{p}\mathrm{Tr}\big\{\big(\tilde{\Sigma}_{\sqrt{np}\,Y}(\Sigma_{\sqrt{np}\,Y})^{-1} - I\big)^{2}\big\}$ as a function of $p$ with ratio $d \in \{0.2, 0.8\}$.
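To make the relation between the exact matrix (13) and its limit concrete, the entries can be evaluated directly. The following is a minimal sketch under stated assumptions: the function names `sigma_Y`, `sigma_limit`, `scaled` and `max_abs_gap` are hypothetical, the exact entries are computed with rational arithmetic, and the comparison uses the largest entrywise gap rather than the measure $\varphi(p, d)$ from the text.

```python
from fractions import Fraction

def sigma_Y(n, p):
    """Exact covariance matrix (13) of Y for nS ~ W_p(I, n), n >= p + 4."""
    n, p = Fraction(n), Fraction(p)
    s11 = 2 * (n - 1) * n**2 / (p * (n - p - 3) * (n - p) * (p - n + 1) ** 2)
    s12 = 2 / (p**2 - n * p + p)
    s13 = 4 / (p * (p - n + 1))
    s22 = 2 / (n * p)
    s23 = 4 * (n + p + 1) / (p * n**2)
    s33 = 4 * (2 * n**2 + 5 * n * (p + 1) + p * (2 * p + 5) + 5) / (p * n**3)
    return [[s11, s12, s13], [s12, s22, s23], [s13, s23, s33]]

def sigma_limit(d):
    """Limiting covariance matrix of sqrt(np) * Y as p/n -> d < 1."""
    t12, t13, t23 = 2 / (d - 1), 4 / (d - 1), 4 * (1 + d)
    return [[2 / (1 - d) ** 4, t12, t13],
            [t12, 2.0, t23],
            [t13, t23, 4 * (2 + 5 * d + 2 * d * d)]]

def scaled(n, p):
    """Covariance matrix of sqrt(np) * Y, i.e. n * p * Sigma_Y, as floats."""
    return [[float(n * p * x) for x in row] for row in sigma_Y(n, p)]

def max_abs_gap(n, p):
    """Largest entrywise gap between the scaled exact matrix and its limit."""
    exact, limit = scaled(n, p), sigma_limit(p / n)
    return max(abs(exact[i][j] - limit[i][j]) for i in range(3) for j in range(3))
```

With $n = 10$, $p = 4$, `sigma_Y` reproduces the theoretical matrix listed in Table A2, and `max_abs_gap` shrinks as $n$ and $p$ grow with $p/n = 0.4$ held fixed.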

Figure 2. Simulation for $p = 20$, $d = 0.4$. (a) Empirical and theoretical distribution function. (b) Difference between empirical and theoretical distribution function.

Remark 4.1. Since applications of Theorem 4.1 (or Theorem 4.2) are likely to involve the delta method in one way or another, it should be noticed that the delta method generally does not work for $d = \lim_{n, p \to \infty} p/n = 0$ (Birke and Dette 2005). For example, it is directly seen that $\Sigma_Y$ becomes singular in this case. There are, however, at least three ways to handle the case $d = 0$: (i) For random quantities which are essentially chi-square distributed we may represent the distribution in terms of sums of independent chi-square variables, as suggested by Birke and Dette (2005), Lemma 1. (ii) The case $d = 0$ essentially represents a fixed-dimension asymptotic case, and we may therefore treat it by using the traditional fixed-dimension asymptotic machinery, rather than insisting on deriving a unified theory for $\{d : d = 0\} \cup \{d : d > 0\}$. (iii) Since any real-data analysis will necessarily involve finite values of $n$ and $p$ we may condition our analysis on the event $d = p/n$ according to the conditionality principle (Cox and Hinkley 1974). This approach is in all senses equivalent to assuming $d > 0$, i.e., that $d = \lim_{n, p \to \infty} p/n > 0$.

5. Simulation study
In this section we present a simulation study on asymptotic normality of the vector $Y$ (as given in Theorem 4.2) as well as on its skewness. Simulations confirming the results of Theorems 3.1, 3.2 and 4.1 are presented in Appendix A.

5.1. Normality
Theorem 4.2 in Section 4 states that the vector $Y = [p^{-1}\mathrm{Tr}\{S^{-1}\},\ p^{-1}\mathrm{Tr}\{S\},\ p^{-1}\mathrm{Tr}\{S^{2}\}]'$ is asymptotically Gaussian, in the sense that $\sqrt{np}\,\Sigma_{\sqrt{np}\,Y}^{-1/2}(Y - \mu_Y) \xrightarrow{\ \mathcal{L}\ } N(0, I)$. In order to investigate the rate of this convergence we will conduct a simulation study by using a suitable measure of the difference between the distribution of the vector $\sqrt{np}\,\Sigma_{\sqrt{np}\,Y}^{-1/2}(Y - \mu_Y)$ and the standard normal distribution. It is well known that a random vector $z$ is multivariate normally distributed iff every linear combination $a'z$, for nonrandom $a$, is normally distributed (see Anderson (2003)). We may use a weaker version of this characterization and assess the closeness of
$$X = \frac{\sqrt{np}}{\sqrt{3}}\,\mathbf{1}'\,\Sigma_{\sqrt{np}\,Y}^{-1/2}(Y - \mu_Y)$$
to a univariate normal distribution. Let $F(x) = P(X \le x)$ and $\Phi(x)$ denote the cumulative distribution function (CDF) of $X$ and of a standard normal variate, respectively. We then apply the Kolmogorov distance (KD) defined by $d_d = \sup_x |F(x) - \Phi(x)|$ as our measure of normality of $\sqrt{np}\,\Sigma_{\sqrt{np}\,Y}^{-1/2}(Y - \mu_Y)$. The simulation results for the KD are displayed in Figures 2(a)–(b) and 3(a)–(b) for some $p$ and $d$. The simulation confirms convergence to normality under $p \to \infty$, $p/n \to d < 1$.

Figure 3. Kolmogorov distance $d_d$ as a function of $p$. The distance $d_d$ is obtained based on 1000 simulated Wishart matrices. (a) $d = 0.2$. (b) $d = 0.4$.

Figure 4. Measure of sampling skewness $\gamma$ as a function of $p$ for $d \in \{0.1, 0.6\}$. The results are based on 1000 pairs of simulated independent realizations of the vector $Y$.

5.2. Skewness
To investigate the weak convergence further, we will also consider the skewness of $Y = [p^{-1}\mathrm{Tr}\{S^{-1}\},\ p^{-1}\mathrm{Tr}\{S\},\ p^{-1}\mathrm{Tr}\{S^{2}\}]'$. This quantity is of great importance since the rate of convergence to the normal distribution is often determined by the skewness (which

appears as the dominating term in the expansion of the characteristic function). Moreover, since the exact mean and covariance matrix are known functions of $n$ and $p$, we may consider $X = \sqrt{np}\,\Sigma_{\sqrt{np}\,Y}^{-1/2}(Y - \mu_Y)$ and use Mardia's skewness measure, which is defined by $\tilde{\gamma} = E[(X_a' X_b)^{3}]$, where $X_a$ and $X_b$ are two independent realizations of $X$. When $X : g \times 1$ is normally distributed the sample skewness measure has mean zero and variance $8g(g + 2)$. To simplify interpretation of the simulations we will use the standardized statistic $\gamma = \frac{1}{\sqrt{120}}(X_a' X_b)^{3}$ as our measure of skewness of $X$. Figure 4 presents simulation results for $\gamma$ with $d \in \{0.1, 0.6\}$. See Mardia (1977) for further details about this measure.

6. Summary
In this paper we provide a recursive tool for obtaining moments of traces of Wishart matrices. Previous work in the field is extended to involve mixtures of Wishart and inverse Wishart matrices. Such spectral statistics in turn appear in risk functions, finance, classification analysis and many other applications. Some exact moments are derived, while a scaled version of the underlying vector of linear spectral statistics is shown to satisfy a multivariate central limit theorem which is valid under increasing dimension asymptotics, i.e., as $n, p \to \infty$ such that $p/n < 1$. Simulations of the rate of weak convergence, which show that the normality approximation deteriorates as the ratio $p/n$ gets larger, are presented.

Acknowledgment
The authors would like to acknowledge valuable comments and suggestions from the Editor and anonymous Referees which helped to improve this paper.

References
Anderson, T. W. 2003. An introduction to multivariate statistical analysis. 3rd ed. New York: Wiley.
Bai, Z. D., and J. W. Silverstein. 2004. CLT for linear spectral statistics of large-dimensional sample covariance matrices. The Annals of Probability 32 (1A):553–605. doi:10.1214/aop/1078415845.
Birke, M., and H. Dette. 2005. A note on testing the covariance matrix for large dimension. Statistics & Probability Letters 74:281–9. doi:10.1016/j.spl.2005.04.051.
Cox, D. R., and D. V. Hinkley. 1974. Theoretical statistics. London: Chapman & Hall.
Dai, D., and T. Holgersson. 2018. High-dimensional CLTs for individual Mahalanobis distances. In Trends and perspectives in linear statistical inference. Contributions to statistics, ed. M. Tez, D. von Rosen. Cham: Springer.
Efron, B., and C. Morris. 1976. Multivariate empirical Bayes and estimation of covariance matrices. The Annals of Statistics 4 (1):22–32. https://projecteuclid.org/euclid.aos/1176343345 doi:10.1214/aos/1176343345.
Fujikoshi, Y., V. V. Ulyanov, and R. Shimizu. 2011. Multivariate statistics: High-dimensional and large-sample approximations. Hoboken: John Wiley & Sons.
Glombek, K. 2014. Statistical inference for high-dimensional global minimum variance portfolios. Scandinavian Journal of Statistics 41 (4):845–65. doi:10.1111/sjos.12066.
Glueck, D. H., and K. E. Muller. 1998. On the trace of a Wishart. Communications in Statistics - Theory and Methods 27 (9):2137–41. doi:10.1080/03610929808832218.

Girko, V., and T. Pavlenko. 1989. G-estimates of the quadratic discriminant function. (Russian, English summary). Ukrainian Mathematical Journal 41 (12):1469–73. doi:10.1007/BF01056118.
Gupta, A. K., and D. K. Nagar. 2000. Matrix variate distributions, monographs and surveys in pure and applied mathematics; 104. Boca Raton: Chapman & Hall/CRC.
Haagerup, U., and S. Thorbjornsen. 2003. Random matrices with complex Gaussian entries. Expositiones Mathematicae 21 (4):293–337. doi:10.1016/S0723-0869(03)80036-1.
Haff, L. R. 1979. Estimation of the inverse covariance matrix: Random mixtures of the inverse Wishart matrix and the identity. The Annals of Statistics 7 (6):1264–76. doi:10.1214/aos/1176344845.
Hanlon, P. J., R. P. Stanley, and J. R. Stembridge. 1992. Some combinatorial aspects of the spectra of normally distributed random matrices. Contemporary Mathematics 138:151–74.
Kollo, T., and D. von Rosen. 2005. Advanced multivariate statistics with matrices. Dordrecht: Springer.
Letac, G., and H. Massam. 2004. All invariant moments of the Wishart distribution. Scandinavian Journal of Statistics 31 (2):295–318. doi:10.1111/j.1467-9469.2004.01-043.x.
Letac, G., and H. Massam. 2008. The noncentral Wishart as an exponential family, and its moments. Journal of Multivariate Analysis 99 (7):1393–417. doi:10.1016/j.jmva.2008.04.006.
Mardia, K. 1977. Mahalanobis distances and angles. In Multivariate analysis IV, ed. P. R. Krishnaiah, 495–511. Amsterdam: North-Holland. https://scholar.google.com/scholar?hl=en&as_sdt=0,5&cluster=12286584190613962445.
Nel, D. G. 1971. The h-th moment of the trace of a noncentral Wishart matrix. South African Statistical Journal 5:41–52.
Okhrin, Y., and W. Schmid. 2006. Distributional properties of portfolio weights. Journal of Econometrics 134 (1):235–56. doi:10.1016/j.jeconom.2005.06.022.
Pielaszkiewicz, J. M., D. von Rosen, and M. Singull. 2017. On $E[\prod_{i=0}^{k}\mathrm{Tr}\{W^{m_i}\}]$, where $W \sim W_p(I, n)$. Communications in Statistics - Theory and Methods 46:2990–3005. doi:10.1080/03610926.2015.1053942.
Pielaszkiewicz, J. M., D. von Rosen, and M. Singull. 2018. On n/p-Asymptotic distribution of vector of weighted traces of powers of Wishart matrices. Electronic Journal of Linear Algebra 33 (1):24–40. doi:10.13001/1081-3810.3732.
Subrahmaniam, K. 1976. Recent trends in multivariate normal distribution theory: On the zonal polynomials and other functions of matrix argument. Sankhyā: The Indian Journal of Statistics, Series A (1961-2002) 38 (3):221–58.
Yao, J., S. Zheng, and Z. Bai. 2015. Large sample covariance matrices and high-dimensional data analysis (Cambridge Series in Statistical and Probabilistic Mathematics). Cambridge: Cambridge University Press.

Appendix A. Simulation study on Theorems 3.1, 3.2 and 4.1
In the appendix, we illustrate the proven results of Theorems 3.1, 3.2 and 4.1. We perform simulation studies for Wishart matrices $W \sim W_p(I, n)$ with $n = 10$ and $p = 4$ for results regarding matrices of fixed size, and keep the ratio $p/n = 0.4$ while increasing $n$ and $p$. The Mathematica code used to generate the Wishart matrices, and hence to calculate expectations of traces of powers, is given below.

    (* Fixed-size setting used in the appendix *)
    n = 10; p = 4;
    (* Number of simulated replicates *)
    Repl = 10^4;
    (* Repl independent p x n matrices with iid N(0, 1) entries *)
    datamatrixX = RandomVariate[
       MatrixNormalDistribution[IdentityMatrix[p], IdentityMatrix[n]], Repl];
    (* i-th Wishart matrix XX' *)
    W[i_] := datamatrixX[[i, All, All]] . Transpose[datamatrixX[[i, All, All]]]
    (* For a list k of powers: product of Tr[W^k[[j]]] for each replicate *)
    ExpectationTrace[k_] := Table[
       Product[Tr[MatrixPower[W[i], k[[j]]]], {j, 1, Length[k]}], {i, 1, Repl}];
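For readers without Mathematica, a comparable Monte Carlo check of $E[\mathrm{Tr}\{W^{-1}\}] = p/(n - p - 1)$ can be sketched with the Python standard library alone. This is an illustration and not the authors' code: the function names, the fixed seed and the replicate count are all assumptions, and the inverse trace is computed by a small Gauss-Jordan routine written for this example.

```python
import random

def wishart(p, n, rng):
    """Draw W = X X' where X is a p x n matrix of iid N(0, 1) entries."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
    return [[sum(X[i][k] * X[j][k] for k in range(n)) for j in range(p)]
            for i in range(p)]

def trace_inverse(A):
    """Tr{A^-1} via Gauss-Jordan elimination with partial pivoting."""
    m = len(A)
    # Augment A with the identity; the right half becomes A^-1.
    M = [list(A[i]) + [1.0 if i == j else 0.0 for j in range(m)]
         for i in range(m)]
    for c in range(m):
        piv = max(range(c, m), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        d = M[c][c]
        M[c] = [v / d for v in M[c]]
        for r in range(m):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return sum(M[i][m + i] for i in range(m))

def estimate_mean_trace_inverse(p, n, repl, seed=0):
    """Average of Tr{W^-1} over repl simulated W ~ W_p(I, n)."""
    rng = random.Random(seed)
    return sum(trace_inverse(wishart(p, n, rng)) for _ in range(repl)) / repl

estimate = estimate_mean_trace_inverse(p=4, n=10, repl=2000)
theoretical = 4 / (10 - 4 - 1)  # p / (n - p - 1) from Example 3.1
```

The estimate should land close to the theoretical value 0.8, with Monte Carlo error of roughly the size seen in Table A1 for a similar number of replicates.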

Table A1. Theoretical and estimated values of $E[\mathrm{Tr}\{W^{-1}\}]$, $E[\mathrm{Tr}\{W^{0}\}\mathrm{Tr}\{W^{-1}\}]$ and $E[\mathrm{Tr}\{W^{-2}\}]$ for 100, 1000 and 10000 simulated Wishart matrices $W$ with $n = 10$, $p = 4$. The difference between the estimated and the theoretical value is given in parentheses.

| No. of replicates | $E[\mathrm{Tr}\{W^{-1}\}]$ | $E[\mathrm{Tr}\{W^{0}\}\mathrm{Tr}\{W^{-1}\}]$ | $E[\mathrm{Tr}\{W^{-2}\}]$ |
|---|---|---|---|
| 100 | 0.790619 (-0.009381) | 3.16247 (-0.03753) | 0.460621 (0.060621) |
| 1000 | 0.805752 (0.005752) | 3.22301 (0.02301) | 0.391524 (-0.008476) |
| 10000 | 0.799844 (-0.000156) | 3.19938 (-0.000624) | 0.408825 (0.008825) |
| Theoretical value | $\frac{p}{n-p-1} = 0.8$ | $\frac{p^2}{n-p-1} = 3.2$ | $\frac{(n-1)p}{(n-p-3)(n-p-1)(n-p)} = 0.4$ |

Table A2. Theoretical covariance matrix $\Sigma_Y$ as presented in Theorem 4.1 and its estimate $\hat{\Sigma}_Y$ based on 10, 100 and 1000 simulated Wishart matrices $W$ with $n = 10$, $p = 4$.

| No. of replicates | $\hat{\Sigma}_Y$ | $\|\hat{\Sigma}_Y - \Sigma_Y\|_F / \|\Sigma_Y\|_F$ |
|---|---|---|
| 10 | $\begin{pmatrix} 0.180095 & -0.0599653 & -0.134392 \\ -0.0599653 & 0.0273467 & 0.0693022 \\ -0.134392 & 0.0693022 & 0.18785 \end{pmatrix}$ | 0.754446 |
| 100 | $\begin{pmatrix} 0.85547 & -0.127183 & -0.249362 \\ -0.127183 & 0.0460792 & 0.120738 \\ -0.249362 & 0.120738 & 0.354638 \end{pmatrix}$ | 0.192749 |
| 1000 | $\begin{pmatrix} 0.870109 & -0.102145 & -0.204345 \\ -0.102145 & 0.0503067 & 0.150253 \\ -0.204345 & 0.150253 & 0.503454 \end{pmatrix}$ | 0.109794 |
| Theoretical $\Sigma_Y$ | $\begin{pmatrix} 1 & -0.1 & -0.2 \\ -0.1 & 0.05 & 0.15 \\ -0.2 & 0.15 & 0.507 \end{pmatrix}$ | |

Table A3. Convergence of the simulated covariance matrix $\hat{\Sigma}_{\sqrt{np}\,Y}$ to the theoretical asymptotic covariance matrix $\Sigma^{n,p\to\infty}_{\sqrt{np}\,Y}$ given in Theorem 4.1. Averages of 1000 simulated Wishart matrices $W$ with $n = 5k$, $p = 2k$, $k \in \{1, 2, 4, 10, 20\}$ are presented.

| $(n, p)$ | $\hat{\Sigma}_{\sqrt{np}\,Y}$ | $\|\hat{\Sigma}_{\sqrt{np}\,Y} - \Sigma^{n,p\to\infty}_{\sqrt{np}\,Y}\|_F / \|\Sigma^{n,p\to\infty}_{\sqrt{np}\,Y}\|_F$ |
|---|---|---|
| (5, 2) | $\begin{pmatrix} 112.11 & -4.66102 & -9.69508 \\ -4.66102 & 1.90553 & 6.0842 \\ -9.69508 & 6.0842 & 22.7998 \end{pmatrix}$ | 3.6268 |
| (10, 4) | $\begin{pmatrix} 33.6838 & -4.30948 & -8.91127 \\ -4.30948 & 2.10385 & 6.19747 \\ -8.91127 & 6.19747 & 20.3428 \end{pmatrix}$ | 0.705047 |
| (50, 20) | $\begin{pmatrix} 20.4182 & -3.99165 & -8.0334 \\ -3.99165 & 2.06537 & 5.72042 \\ -8.0334 & 5.72042 & 17.4649 \end{pmatrix}$ | 0.203291 |
| (100, 40) | $\begin{pmatrix} 16.41 & -3.19128 & -6.26881 \\ -3.19128 & 1.84435 & 5.1616 \\ -6.26881 & 5.1616 & 16.0745 \end{pmatrix}$ | 0.0666588 |
| Theoretical $\Sigma^{n,p\to\infty}_{\sqrt{np}\,Y}$ | $\begin{pmatrix} 15.4321 & -3.33333 & -6.66667 \\ -3.33333 & 2 & 5.6 \\ -6.66667 & 5.6 & 17.28 \end{pmatrix}$ | |
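The tabulated quantities can be checked by direct arithmetic. The following is a minimal sketch, not the authors' code: it evaluates the closed-form expressions from the last row of Table A1 and recomputes the relative Frobenius error for the 1000-replicate row of Table A2. The names `theory`, `sigmaY`, `sigmaHat` and `relErr` are hypothetical, and the covariances between the inverse trace and the positive-power traces are entered as negative, which is an assumption about the signs of those entries.

```mathematica
(* Fixed-size setting of Tables A1 and A2 *)
n = 10; p = 4;

(* Closed-form expectations quoted in the last row of Table A1 *)
theory = {
   p/(n - p - 1),                               (* E[Tr{W^-1}] *)
   p^2/(n - p - 1),                             (* E[Tr{W^0} Tr{W^-1}] *)
   (n - 1) p/((n - p - 3) (n - p - 1) (n - p))  (* E[Tr{W^-2}] *)
   };
N[theory]  (* {0.8, 3.2, 0.4} *)

(* Theoretical covariance matrix and the 1000-replicate estimate of Table A2;
   the negative signs in the first row and column are an assumption *)
sigmaY = {{1., -0.1, -0.2}, {-0.1, 0.05, 0.15}, {-0.2, 0.15, 0.507}};
sigmaHat = {{0.870109, -0.102145, -0.204345},
   {-0.102145, 0.0503067, 0.150253},
   {-0.204345, 0.150253, 0.503454}};

(* Relative error measured in the Frobenius norm *)
relErr = Norm[sigmaHat - sigmaY, "Frobenius"]/Norm[sigmaY, "Frobenius"]
```

The value of `relErr` should come out close to the 0.1098 reported for 1000 replicates. The ratio is unchanged if the signs of the same off-diagonal entries are flipped in both matrices, so this check does not depend on the sign assumption.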

Estimates of $E[\prod_{i=0}^{k} \mathrm{Tr}\{W^{m_i}\}]$, for $m_i \in \mathbb{Z}$, can be obtained through the command

    Mean[ExpectationTrace[list_of_powers]];

In Table A1, we compare the theoretical values of $E[\mathrm{Tr}\{W^{-1}\}]$, $E[\mathrm{Tr}\{W^{0}\}\mathrm{Tr}\{W^{-1}\}]$ and $E[\mathrm{Tr}\{W^{-2}\}]$ derived in Ex. 3.1 and Ex. 3.2 with the estimates obtained by averaging over $10^t$, $t \in \{2, 3, 4\}$, simulated Wishart random matrices. Finally, to illustrate the results regarding the mean and covariance matrix in Theorem 4.1, we generate the vector $Y = [p^{-1}\mathrm{Tr}\{S^{-1}\}, p^{-1}\mathrm{Tr}\{S\}, p^{-1}\mathrm{Tr}\{S^{2}\}]'$ and provide its covariance matrix in Table A2. We observe a reasonable fit to the theoretical result already when averaging over 100 Wishart matrices, as the relative error in $\hat{\Sigma}_Y$, given by the ratio of Frobenius norms, satisfies $\|\hat{\Sigma}_Y - \Sigma_Y\|_F / \|\Sigma_Y\|_F < 0.2$. We continue the simulation in Table A3 to confirm the asymptotic result given in Theorem 4.1 for the particular case $d = 0.4$. The rightmost column shows the rate of convergence to the limiting matrix.
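As an illustration of how the Table A1 comparison might be reproduced, the sketch below repeats the definitions from the appendix in self-contained form and evaluates the three estimates side by side with the closed-form values. It is a sketch under stated assumptions, not the authors' exact script: the replicate count `Repl = 10^3`, the seed, and the names `estimates` and `theory` are choices made here, and the argument patterns are narrowed to `i_` and `k_List` for clarity.

```mathematica
n = 10; p = 4;   (* fixed-size setting of Table A1 *)
Repl = 10^3;     (* assumed replicate count, chosen to keep the run short *)
SeedRandom[1];   (* arbitrary seed, only for reproducibility *)

(* Repl independent p x n standard Gaussian data matrices *)
datamatrixX = RandomVariate[
   MatrixNormalDistribution[IdentityMatrix[p], IdentityMatrix[n]], Repl];

(* i-th simulated Wishart matrix XX' *)
W[i_] := datamatrixX[[i]] . Transpose[datamatrixX[[i]]]

(* product of traces of the powers in the list k, one value per replicate *)
ExpectationTrace[k_List] := Table[
   Product[Tr[MatrixPower[W[i], k[[j]]]], {j, 1, Length[k]}], {i, 1, Repl}]

estimates = {Mean[ExpectationTrace[{-1}]],
   Mean[ExpectationTrace[{0, -1}]],
   Mean[ExpectationTrace[{-2}]]};
theory = N[{p/(n - p - 1), p^2/(n - p - 1),
    (n - 1) p/((n - p - 3) (n - p - 1) (n - p))}];

(* rows: estimate, theoretical value, difference *)
TableForm[{estimates, theory, estimates - theory},
 TableHeadings -> {{"estimate", "theory", "difference"},
   {"E[Tr W^-1]", "E[Tr W^0 Tr W^-1]", "E[Tr W^-2]"}}]
```

Since $\mathrm{Tr}\{W^{0}\} = p$ exactly, the second estimate is always $p$ times the first, which matches the pattern visible in Table A1. The simulated values themselves depend on the seed and the number of replicates.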
