Exact probabilities for typical ranks of 2 × 2 × 2 and 3 × 3 × 2 tensors

Göran Bergqvist

Linköping University Post Print

N.B.: When citing this work, cite the original article.

Original Publication:

Göran Bergqvist, Exact probabilities for typical ranks of 2 × 2 × 2 and 3 × 3 × 2 tensors, 2013, Linear Algebra and its Applications, (438), 2, 663-667.

http://dx.doi.org/10.1016/j.laa.2011.02.041

Copyright: Elsevier

http://www.elsevier.com/

Postprint available at: Linköping University Electronic Press

Matematiska institutionen, Linköpings universitet, SE-581 83 Linköping, Sweden

gober@mai.liu.se

May 12, 2010

Abstract

We show that a 2 × 2 × 2 tensor with elements drawn from a standard normal distribution has rank 2 with probability π/4, and that a 3 × 3 × 2 tensor has rank 3 with probability 1/2. The proof applies results on the expected number of real generalized eigenvalues of random matrices. For n × n × 2 tensors with n ≥ 4 we also present some new aspects of their rank.

Keywords: tensors, multi-way arrays, typical rank, random matrices

AMS classification codes: 15A69, 15B52

1 Introduction

The rank concept for multi-way arrays or tensors is not as simple as for matrices. If T is a real m × n × p 3-way array (or 3-tensor), then it can be expanded as

$$T = \sum_{i=1}^{r} c_i\, u_i \otimes v_i \otimes w_i \qquad (c_i \in \mathbb{R},\ u_i \in \mathbb{R}^m,\ v_i \in \mathbb{R}^n,\ w_i \in \mathbb{R}^p) \tag{1}$$

[2, 8, 10], where ⊗ denotes the tensor (or outer) product. This is called the CP expansion (or CANDECOMP for canonical decomposition, or PARAFAC for parallel factors) of T, and the extension to higher order tensors or arrays is obvious.
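As an illustration of the expansion (1), the following minimal sketch assembles a tensor from given CP factors with NumPy. The function name `cp_to_tensor` and the convention of storing the vectors $u_i$, $v_i$, $w_i$ as columns of arrays `U`, `V`, `W` are assumptions made for this example and do not come from the paper.

```python
import numpy as np

def cp_to_tensor(c, U, V, W):
    """Sum of r rank-one terms c[i] * u_i (x) v_i (x) w_i, as in (1).

    c has shape (r,); U, V, W have shapes (m, r), (n, r), (p, r),
    with the vectors u_i, v_i, w_i stored as columns (an assumed layout).
    """
    return np.einsum("i,ai,bi,ci->abc", c, U, V, W)

rng = np.random.default_rng(0)
m, n, p, r = 2, 2, 2, 3
c = rng.standard_normal(r)
U = rng.standard_normal((m, r))
V = rng.standard_normal((n, r))
W = rng.standard_normal((p, r))
T = cp_to_tensor(c, U, V, W)

# The same tensor built term by term from explicit outer products.
T_loop = sum(
    c[i] * np.multiply.outer(np.multiply.outer(U[:, i], V[:, i]), W[:, i])
    for i in range(r)
)
```

Any tensor produced this way has rank at most r; the rank itself is the smallest r for which such factors exist.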

There are several rank concepts for tensors [1, 3, 4, 7, 10, 11]. The rank of T is the minimal possible value of r in the CP expansion (1) and is always well defined. The column (mode-1) rank $r_1$ of T is the dimension of the subspace of $\mathbb{R}^m$ spanned by the np columns of T (for every fixed pair of values of j and k we have such a column). The row (mode-2) rank $r_2$ and the mode-3 rank $r_3$ are defined analogously. The triple $(r_1, r_2, r_3)$ is called the multirank of T. A typical rank of T is a rank which appears with nonzero probability if the elements $T_{ijk}$ are randomly chosen from a continuous probability distribution. A generic rank is a typical rank which appears with probability 1. In the matrix case, the number of terms r in the singular value expansion is always equal to the column and row ranks of the matrix. However, for tensors, r, $r_1$, $r_2$ and $r_3$ can all be different.

For matrices the typical and generic ranks of an m × n matrix are always min(m, n). However, for a higher order tensor a generic rank over the real numbers does not necessarily exist (over the complex numbers a generic rank always exists, but in this paper we only consider real tensors and real CP expansions). Both the typical and generic ranks of an m × n × p tensor may be strictly greater than min(m, n, p), and are in general hard to calculate.

In 1989, using numerical simulations with each tensor element drawn from a normal distribution with zero mean, Kruskal [12] reported that the probabilities for a 2 × 2 × 2 tensor to have rank 2 or 3 are approximately 79% and 21% respectively. Hence, ranks 2 and 3 are typical, and no generic rank exists (over the complex numbers it is 2). It was later shown [15] for all n ≥ 2, that for n × n × 2 tensors there is no generic rank and that the typical ranks are n and n + 1. The generic and typical ranks of m × n × p tensors for several small values of m, n and p have recently been determined [3, 16]. While a generic rank seems to exist for most (m, n, p), another case with two typical ranks is 5 × 3 × 3 tensors for which 5 and 6 are typical ranks.

Until now, the probabilities for random tensors to be of the different possible typical ranks have only been studied by numerical simulations. Below we derive the first exact values of such probabilities, namely for 2 × 2 × 2 and 3 × 3 × 2 tensors.

For other aspects of the CP expansion, such as uniqueness of the expansion, estimates of maximal rank, and low rank approximations, we refer to the review paper [10] and the references therein. Other types of tensor decompositions (e.g., Tucker and higher order singular value decompositions) and applications are also described there.

2 A characterization of n × n × 2 tensors of rank n

We now assume that T is a real n × n × 2 tensor whose elements $T_{ijk}$, 1 ≤ i, j ≤ n, 1 ≤ k ≤ 2, are picked from some continuous probability distribution. The theorem presented in this section is essentially given by ten Berge [14], although the probabilistic view was not used there (but see [13]). Define the n × n matrices $T_1$ and $T_2$, the frontal slices of T, by $(T_1)_{ij} = T_{ij1}$ and $(T_2)_{ij} = T_{ij2}$. We have

Theorem 1. With probability 1:

$$\operatorname{rank}(T) = n \iff \det(T_2 - \lambda T_1) = 0 \ \text{has } n \text{ real solutions}$$

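Proof. Suppose first that rank(T) = n. Then we can write

$$T = \sum_{i=1}^{n} u_i \otimes v_i \otimes \begin{pmatrix} d_i \\ e_i \end{pmatrix} \tag{2}$$

Following [14], we define the n × n matrices U with columns $u_i$, V with columns $v_i$, D diagonal with elements $d_i$ on the diagonal, and E diagonal with elements $e_i$ on the diagonal. Then $T_1 = UDV^T$ and $T_2 = UEV^T$. If we assume that $T_1$ is invertible (true with probability 1), then $\det U \neq 0 \neq \det V$ and all $d_i \neq 0$. Hence the generalized eigenvalue equation

$$0 = \det(T_2 - \lambda T_1) = \det U \, \det(E - \lambda D) \, \det V = \det U \, \det V \prod_{i=1}^{n} (e_i - \lambda d_i)$$

has the n real solutions $\lambda_i = e_i/d_i$.

Conversely, if $\det(T_2 - \lambda T_1) = 0$ has n real solutions, then (again assuming $T_1$ invertible) $\det(T_2T_1^{-1} - \lambda I) = 0$ has n real solutions. With probability 1, all eigenvalues $\lambda_i$ of $T_2T_1^{-1}$ are distinct, so that $T_2T_1^{-1} = P\Lambda P^{-1}$ with P real and invertible and $\Lambda$ diagonal with the $\lambda_i$ on the diagonal (if there are multiple eigenvalues there can be non-trivial Jordan blocks). Define $U = P$ and $V^T = P^{-1}T_1$. Then $T_1 = PV^T = UIV^T$ and $T_2 = P\Lambda P^{-1}T_1 = U\Lambda V^T$, and T has the expansion

$$T = \sum_{i=1}^{n} u_i \otimes v_i \otimes \begin{pmatrix} 1 \\ \lambda_i \end{pmatrix} \tag{3}$$

so its rank is at most n. Since $T_1$ is invertible the rank cannot be smaller than n, and T is therefore of rank n.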

Notice that in the proof, the CP expansion of T is actually constructed. The exceptional (probability 0) cases were discussed in some detail by ten Berge [14] but are not relevant for the results of this paper.
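The construction in the proof can be sketched numerically as follows. This is an illustrative sketch under stated assumptions, not code from the paper: the function name `rank_n_cp`, the tolerance `tol`, and the choice to redraw random tensors until all eigenvalues are real are all introduced here for the example.

```python
import numpy as np

def rank_n_cp(T1, T2, tol=1e-9):
    """Return (U, V, lam) with T1 = U @ V.T and T2 = U @ diag(lam) @ V.T,
    or None if T2 T1^{-1} has a non-real eigenvalue (hypothetical helper)."""
    M = T2 @ np.linalg.inv(T1)
    lam, P = np.linalg.eig(M)
    if np.max(np.abs(lam.imag)) > tol:
        return None
    lam, P = lam.real, P.real
    Vt = np.linalg.solve(P, T1)  # V^T = P^{-1} T1
    return P, Vt.T, lam

rng = np.random.default_rng(1)
n = 3
# Redraw until the generalized eigenvalues are all real (the rank-n case).
while True:
    T = rng.standard_normal((n, n, 2))
    result = rank_n_cp(T[:, :, 0], T[:, :, 1])
    if result is not None:
        break

U, V, lam = result
W = np.vstack([np.ones(n), lam])  # column i is (1, lambda_i), as in (3)
T_rebuilt = np.einsum("ai,bi,ci->abc", U, V, W)
```

Up to rounding errors, `T_rebuilt` agrees with `T`, which exhibits an explicit expansion with n terms.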

3 The expected number of real generalized eigenvalues of random matrices

Consider random matrices whose elements are independent random variables and normally distributed with mean 0 and variance 1. Kanzieper and Akemann [9] (see also Edelman [6]) have found the probabilities for such a random n × n matrix A to have k real eigenvalues, i.e., the probability for having k real solutions to det(A − λI) = 0. Since $T_2T_1^{-1}$ is not normally distributed we cannot apply their result to it. The same problem for the generalized eigenvalue problem det(A − λB) = 0, with B having the same distribution as A, seems to be unsolved.

In [5], however, Edelman, Kostlan and Shub found both the expected number of real eigenvalues and the expected number of real generalized eigenvalues for such random matrices. Let $E_n$ be the expectation value for the number of real solutions to det(A − λB) = 0 with A and B as above. Then [5]

$$E_n = \frac{\sqrt{\pi}\,\Gamma\!\left(\frac{n+1}{2}\right)}{\Gamma\!\left(\frac{n}{2}\right)} \tag{4}$$

It turns out that this information is sufficient to solve our problem for n × n × 2 tensors with n = 2 and n = 3.
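Formula (4) is easy to evaluate directly. The short sketch below uses Python's standard `math.gamma`; the function name `expected_real_generalized_eigenvalues` is chosen here for illustration only.

```python
import math

def expected_real_generalized_eigenvalues(n):
    """E_n = sqrt(pi) * Gamma((n + 1) / 2) / Gamma(n / 2), formula (4)."""
    return math.sqrt(math.pi) * math.gamma((n + 1) / 2) / math.gamma(n / 2)

for n in range(2, 6):
    print(n, expected_real_generalized_eigenvalues(n))
```

For n = 2, 3, 4, 5 the values agree, up to floating point rounding, with the closed forms π/2, 2, 3π/4 and 8/3 used in Section 4.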

4 Probabilities for typical ranks of 2 × 2 × 2 and 3 × 3 × 2 tensors

Now assume that T is an n × n × 2 tensor whose elements are independent random variables and normally distributed with mean 0 and variance 1. Then the slices $T_1$ and $T_2$ are random n × n matrices. The expected number $E_n$ of real solutions to $\det(T_2 - \lambda T_1) = 0$ is given by (4). We also have

$$E_n = \sum_{k=1}^{n} k\, p_n(k) \tag{5}$$

where $p_n(k)$ is the probability for having k real generalized eigenvalues. Since complex eigenvalues come in pairs we can write

$$E_n = \sum_{k=0}^{[(n-1)/2]} (n - 2k)\, p_n(n - 2k) \tag{6}$$

Theorem 2. Suppose that T is an n × n × 2 tensor whose elements are independent random variables which are normally distributed with mean 0 and variance 1, and let $P_n$ denote the probability that T has rank n. Then $P_2 = \pi/4$ and $P_3 = 1/2$.

Proof. By Theorem 1, $P_n = p_n(n)$ is equal to the probability that $\det(T_2 - \lambda T_1) = 0$ has n real solutions. For n = 2, by (4), the expected number of real solutions is $E_2 = \sqrt{\pi}\,\Gamma(3/2)/\Gamma(1)$. Recall that $\Gamma(m) = (m-1)!$ if m is an integer and $\Gamma(m/2) = \sqrt{\pi}\,(m-2)!!/2^{(m-1)/2}$ if m is an odd integer. Hence

$$E_2 = \frac{\sqrt{\pi}\cdot\sqrt{\pi}/2}{1} = \frac{\pi}{2}.$$

By (6), $E_2 = 2p_2(2)$ and we conclude that $P_2 = p_2(2) = \pi/4$. For n = 3, again by (4),

$$E_3 = \frac{\sqrt{\pi}\,\Gamma(2)}{\Gamma(3/2)} = \frac{\sqrt{\pi}\cdot 1}{\sqrt{\pi}/2} = 2.$$

By (6), $E_3 = 3p_3(3) + 1p_3(1) = 3p_3(3) + 1(1 - p_3(3)) = 2p_3(3) + 1 = 2P_3 + 1$, so we conclude that $P_3 = 1/2$.

Since the only other typical rank of an n × n × 2 tensor is n + 1 we immediately have:

Corollary 3. Suppose that T is an n × n × 2 tensor whose elements are independent random variables which are normally distributed with mean 0 and variance 1, and let $\tilde{P}_n$ denote the probability that T has rank n + 1. Then $\tilde{P}_2 = 1 - \pi/4$ and $\tilde{P}_3 = 1/2$.

For n ≥ 4, there are at least three unknown probabilities $p_n(n - 2k)$, so (6) together with the condition $\sum_{k=0}^{[n/2]} p_n(n - 2k) = 1$ is not sufficient to determine $P_n = p_n(n)$. For n = 4 we get

$$2p_4(2) + 4p_4(4) = E_4 = \frac{\sqrt{\pi}\,\Gamma(5/2)}{\Gamma(2)} = \frac{3\pi}{4}; \qquad p_4(0) + p_4(2) + p_4(4) = 1 \tag{7}$$

and for n = 5

$$1p_5(1) + 3p_5(3) + 5p_5(5) = E_5 = \frac{\sqrt{\pi}\,\Gamma(3)}{\Gamma(5/2)} = \frac{8}{3}; \qquad p_5(1) + p_5(3) + p_5(5) = 1 \tag{8}$$

and so on for increasing values of n.
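Although the individual probabilities are not determined by these relations for n ≥ 4, they can be estimated by simulation and compared with the known expectation. The following sketch does this for n = 4; the function name `real_eigenvalue_counts`, the tolerance `tol`, the seed and the number of trials are assumptions made for this example, and the resulting estimates are Monte Carlo approximations, not exact values.

```python
import math
import numpy as np

def real_eigenvalue_counts(n, trials, rng, tol=1e-9):
    """Estimate p_n(k), k = 0..n, the probability that
    det(T2 - lambda * T1) = 0 has exactly k real solutions."""
    counts = np.zeros(n + 1)
    for _ in range(trials):
        T1 = rng.standard_normal((n, n))
        T2 = rng.standard_normal((n, n))
        # Generalized eigenvalues are the eigenvalues of T1^{-1} T2.
        lam = np.linalg.eigvals(np.linalg.solve(T1, T2))
        k = int(np.sum(np.abs(lam.imag) <= tol))
        counts[k] += 1
    return counts / trials

rng = np.random.default_rng(2)
n = 4
p = real_eigenvalue_counts(n, 10_000, rng)
E_est = sum(k * p[k] for k in range(n + 1))
E_exact = math.sqrt(math.pi) * math.gamma((n + 1) / 2) / math.gamma(n / 2)
print(p, E_est, E_exact)
```

The estimated mean `E_est` should lie close to the exact value 3π/4 from (7), while `p[4]` gives a numerical estimate of $P_4$.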

Remark. It is interesting to interpret the above results as statements about how often curves intersect. For the case n = 2, expanding $\det(T_2 - xT_1) = 0$ to an equation of the type $(a_1x - a_2)(a_3x - a_4) = (a_5x - a_6)(a_7x - a_8)$, we see that the probability for two parabolas with real roots to intersect is π/4, when the coefficients $a_i$ are chosen from a standard normal distribution. Since the equation has real solutions if $B^2 - 4AC \geq 0$, where $A = a_1a_3 - a_5a_7$, $B = a_5a_8 + a_6a_7 - a_1a_4 - a_2a_3$, and $C = a_2a_4 - a_6a_8$ (for T, $B^2 - 4AC$ will just be its hyperdeterminant [4]), it is very simple to verify that the value of $P_2$ must be close to π/4 by running a large number of tests with each $a_i$ from the normal distribution N(0, 1) and checking how often $B^2 \geq 4AC$. Since π/4 ≈ 0.7854, we see that the value 79% of Kruskal's original simulation [12] is a good approximation. For n = 3 a result for cubic equations is obtained and one can also by simple simulations see that the value of $P_3$ must be near the exact value 1/2 found above.
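The test described in the remark can be sketched in a few lines of NumPy. The seed and the number of trials are arbitrary choices made for this example; the estimate is a Monte Carlo approximation whose accuracy depends on the number of trials.

```python
import numpy as np

rng = np.random.default_rng(3)
trials = 200_000

# Eight independent N(0, 1) coefficients per trial.
a1, a2, a3, a4, a5, a6, a7, a8 = rng.standard_normal((8, trials))

A = a1 * a3 - a5 * a7
B = a5 * a8 + a6 * a7 - a1 * a4 - a2 * a3
C = a2 * a4 - a6 * a8

# Fraction of trials in which the quadratic has real solutions.
P2_estimate = np.mean(B**2 >= 4 * A * C)
print(P2_estimate, np.pi / 4)
```

The printed estimate can be compared with π/4; increasing `trials` reduces the statistical error roughly as the inverse square root of the number of trials.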

References

[1] G Bergqvist and E G Larsson "The higher-order singular value decomposition: theory and an application" IEEE Signal Proc. Mag. 27 (2010) 151–154

[2] J D Carroll and J J Chang "Analysis of individual differences in multidimensional scaling via N-way generalization of Eckart-Young decomposition" Psychometrika 35 (1970) 283–319

[3] P Comon, J M F ten Berge, L De Lathauwer and J Castaing "Generic and typical ranks of multi-way arrays" Lin. Alg. Appl. 430 (2009) 2997–3007

[4] V De Silva and L-H Lim "Tensor rank and the ill-posedness of the best low-rank approximation problem" SIAM J. Matrix Anal. Appl. 30 (2008) 1084–1127

[5] A Edelman, E Kostlan and M Shub "How many eigenvalues of a random matrix are real?" J. Amer. Math. Soc. 7 (1994) 247–267

[6] A Edelman "The probability that a random real Gaussian matrix has real eigenvalues, related distributions, and the circular law" J. Multivariate Anal. 60 (1997) 203–232

[7] S Friedland "On the generic rank of 3-tensors" preprint, 2009 (arXiv:0805.3777v3)

[8] R Harshman "Foundations of the PARAFAC procedure: Models and conditions for an explanatory multi-modal factor analysis" UCLA working papers in phonetics 16 (1970) 1–84

[9] E Kanzieper and G Akemann "Statistics of real eigenvalues in Ginibre's ensemble of random real matrices" Phys. Rev. Lett. 95 (2005) 230201

[10] T G Kolda and B W Bader "Tensor decompositions and applications" SIAM Review 51 (2009) 455–500

[11] J B Kruskal "Three-way arrays: rank and uniqueness of trilinear decompositions, with applications to arithmetic complexity and statistics" Lin. Alg. Appl. 18 (1977) 95–138

[12] J B Kruskal "Rank, decomposition, and uniqueness for 3-way and N-way arrays" in Multi-way data analysis 7–18, North-Holland (Amsterdam), 1989

[13] A Stegeman "Degeneracy in CANDECOMP/PARAFAC explained for p × p × 2 arrays of rank p + 1 or higher" Psychometrika 71 (2006) 483–501

[14] J M F ten Berge "Kruskal's polynomial for 2 × 2 × 2 arrays and a generalization to 2 × n × n arrays" Psychometrika 56 (1991) 631–636

[15] J M F ten Berge and H A L Kiers "Simplicity of core arrays in three-way principal component analysis and the typical rank of p × q × 2 arrays" Lin. Alg. Appl. 294 (1999) 169–179

[16] J M F ten Berge and A Stegeman "Symmetry transformations for squared sliced three-way