
U.U.D.M. Project Report 2020:52

Degree project in mathematics, 30 credits. Supervisor: Rolf Larsson

Examiner: Julian Külshammer. December 2020

Department of Mathematics Uppsala University

Clustering using k-means algorithm in

multivariate dependent models with factor structure

Dimitris Dineff


Contents

1 Introduction
2 k-means Clustering
3 Model
  3.1 A model in dimension 2
  3.2 A model in dimension 5
  3.3 A model in dimension 7
4 Model Selection & Simulations
  4.1 Model Selection
  4.2 Dimension 5
  4.3 Dimension 7
  4.4 Observations and Comparison
5 Empirical Example
6 Conclusion
7 Appendix
  7.1 Dimension 5
  7.2 Dimension 7


1 Introduction

One of the major growing topics in various fields of science is machine learning. This rapid development is mainly due to the huge need for interpretation and analysis of the countless data that arise and are collected on a daily basis. Machine learning is divided into two big categories, supervised and unsupervised. In unsupervised machine learning, there is no supervisor who provides the correct values of an output; we only have input data and try to learn regularities from them. There is a structure to the input space such that certain patterns occur more often than others, and we want to see what generally happens and what does not. In statistics, this is called density estimation. One method for density estimation is clustering, where the aim is to find clusters or groupings of the input ([8]).

One of the major clustering approaches is based on the sum of squares criterion and on the algorithm that is today well known under the name 'k-means' ([9]). The k-means algorithm is the most widely used clustering method. It constructs a partition of a set of objects into k clusters that minimizes some objective function, usually a squared error function, which implies round-shaped clusters. The input parameter k is fixed and must be given in advance, which limits its applicability to streaming and evolving data ([8]).

The models to which we are going to apply k-means are described thoroughly in a relevant paper ([2]). These are dependent models with factor structure containing discrete data, which we generated in Matlab. Working with k-means clustering requires determining the input parameter k. In this paper, we select k depending on the factor structure of our dependent models. For example, suppose we have a model where a number of variates are described as a linear combination of a factor U1 and some independent random variables, while the rest of the variables are described as a linear combination of a second factor U2 and some other independent random variables. Then we 'divide' our model into two groups, one where the variables are linked through factor U1 and another where the variables are linked through factor U2; thus, we select k equal to two. If we also had some variables that were only equal to some independent random variables, we would have had a third group, and as a result we would select k equal to three.

First, we begin with a simple dependent Poisson model in dimension 2 ([3]), and then we construct dependent, discrete Poisson, binomial, and mixed Poisson and binomial factor models in dimensions 5 and 7.

Our goal is to explore with k-means clustering which model structures are easier to find than others. We do this by calculating the accuracy for all the models, which we exhibit in a simulation study. Also, we use different parameters for the factors and the variables and investigate how the accuracy is affected. Last but not least, we compare our performance with that of a relevant paper ([2]).

Finally, a few words about the structure. In Section 2, we present the k-means method. In Section 3, we analyze the models and their components. In Section 4, we present the results of our simulations. In Section 5, we perform k-means clustering on an empirical example with ordinal data, previously analyzed by Jöreskog ([7]).

Finally, Section 6 contains the conclusion.


2 k-means Clustering

Clustering is a data analysis technique that, when applied to a set of heterogeneous items, identifies homogeneous subgroups as defined by a given model or measure of similarity. One feature of clustering is that the process is unsupervised, that is, there is no predefined grouping that the clustering seeks to reproduce. In unsupervised machine learning, only the inputs are available and the task is to reveal aspects of the underlying distribution of the input data. Clustering is a technique for exploratory data analysis and is used increasingly in preliminary analyses of large data sets of medium and high dimensionality as a method of selection, diversity analysis and data reduction ([4]).

If a data set is analyzed in an iterative way, such that at each step a pair of clusters is merged or a single cluster is divided, the result is hierarchical, with a parent-child relationship being established between clusters at each successive level of the iteration. If the data set is analyzed to produce a single partition of the items resulting in a set of clusters, the result is non-hierarchical ([4]).

The k-means method is a non-hierarchical relocation clustering technique in which each item is assigned to the cluster having the nearest centroid (mean). A relocation method is one in which items are moved from one cluster to another, to try to improve on the initial estimation of the clusters. The relocation is typically accomplished by improving a cost function describing the "goodness" of each resultant cluster ([4]). The method solves the clustering problem, which amounts to grouping similar items. First, we choose k initial cluster centers, which are called centroids. At the second step, the algorithm computes the point-to-centroid distances of all observations to each centroid. At the third step, it assigns each observation to the cluster with the closest centroid. After that, it computes the average of the observations in each cluster to obtain k new centroid locations. Finally, it repeats the second through fourth steps until the cluster assignments do not change or the maximum number of iterations is reached ([6]). The final assignment of items to clusters will be, to some extent, dependent upon the initial partition or the initial selection of seed points ([1], p. 696).
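The steps above can be sketched as follows. This is a minimal Python illustration of the relocation loop (the thesis simulations were done in Matlab); the function name and defaults are ours.

```python
import numpy as np

def kmeans(X, k, max_iter=100, rng=None):
    """Lloyd's algorithm: the four steps described above, repeated until stable."""
    rng = np.random.default_rng(rng)
    # Step 1: choose k distinct observations as the initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(max_iter):
        # Step 2: point-to-centroid distances for all observations.
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        # Step 3: assign each observation to the closest centroid.
        labels = d.argmin(axis=1)
        # Step 4: new centroid = average of the observations in each cluster.
        new_centroids = np.array(
            [X[labels == j].mean(axis=0) if (labels == j).any() else centroids[j]
             for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break  # assignments no longer change
        centroids = new_centroids
    return labels, centroids
```

With well-separated data the recovered partition does not depend on the random initialization; in general, as noted above, it may.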

The k-means clustering algorithm amounts to selecting the clusters R_1, ..., R_k such that the sum of pairwise squared Euclidean distances within each cluster is minimized:

arg min_{R_1,...,R_k} Σ_{j=1}^{k} (1/|R_j|) Σ_{x,x' ∈ R_j} ||x − x'||_2^2    (1)

where |Rj| is the number of data points in cluster Rj.

The intention of (1) is to select the clusters such that all the points within each cluster are as similar as possible.

Solving (1) is equivalent to selecting the clusters such that the squared distance to the cluster center, summed over all data points, is minimized:

arg min_{R_1,...,R_k} Σ_{j=1}^{k} Σ_{x ∈ R_j} ||x − µ_j||_2^2    (2)

where µ_j is the center of cluster R_j, i.e. the average of all data points x in R_j ([5]).
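The two formulations have the same minimizers because, within each cluster, the pairwise sum in (1) equals exactly twice the centroid sum in (2). A quick numerical check of this identity (our own illustration, in Python):

```python
import numpy as np

rng = np.random.default_rng(0)
R = rng.normal(size=(8, 3))   # one cluster of 8 points in R^3
mu = R.mean(axis=0)           # its centroid

# Cluster term of (1): pairwise form, summed over ordered pairs (x, x')
pairwise = sum(np.sum((x - y) ** 2) for x in R for y in R) / len(R)
# Cluster term of (2): squared distances to the centroid
centroid = np.sum((R - mu) ** 2)

assert np.isclose(pairwise, 2 * centroid)
```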


3 Model

The general model is a model with factor structure, as we can see below:

Y_1 = X_1
...
Y_{n_0} = X_{n_0}
Y_{n_0+1} = U_1 + X_{n_0+1}
...
Y_{n_0+n_1} = U_1 + X_{n_0+n_1}
Y_{n_0+n_1+1} = U_2 + X_{n_0+n_1+1}
...
Y_{n_0+n_1+n_2} = U_2 + X_{n_0+n_1+n_2}
...
Y_{n_0+...+n_{k-1}+1} = U_k + X_{n_0+...+n_{k-1}+1}
...
Y_{n_0+...+n_k} = U_k + X_{n_0+...+n_k}    (3)

where N = n_0 + ... + n_k, U_1, ..., U_k are the factors, Y_1, ..., Y_N the dependent variables and X_1, ..., X_N the independent ones. All the variables follow the Poisson distribution. The type of the model is (n_1, n_2, ..., n_k, 1, ..., 1), where there are n_0 ones at the end.
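The generating scheme of model (3) can be sketched as follows. This is an illustrative Python version (the thesis data were generated in Matlab); the function name, arguments, and parameter values are our own.

```python
import numpy as np

def simulate_factor_model(sizes, n_singletons, lam, mu, n_obs, rng=None):
    """Draw n_obs samples of (Y_1, ..., Y_N) from model (3).

    sizes: (n_1, ..., n_k), group sizes of variables sharing a factor;
    n_singletons: n_0, variables with no factor;
    factors U_i ~ Poisson(lam), noise X_j ~ Poisson(mu)."""
    rng = np.random.default_rng(rng)
    cols = []
    # Singleton variables: Y_j = X_j
    for _ in range(n_singletons):
        cols.append(rng.poisson(mu, size=n_obs))
    # Grouped variables: Y_j = U_i + X_j, with U_i shared within group i
    for n_i in sizes:
        U = rng.poisson(lam, size=n_obs)
        for _ in range(n_i):
            cols.append(U + rng.poisson(mu, size=n_obs))
    return np.column_stack(cols)

# e.g. the (3, 2) model in dimension 5
Y = simulate_factor_model(sizes=(3, 2), n_singletons=0,
                          lam=0.5, mu=0.5, n_obs=1000, rng=1)
```

Variables sharing a factor are positively correlated, which is what the clustering later exploits.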

3.1 A model in dimension 2

Here, we have the Karlis bivariate model ([3]), which is:

Y_1 = U + X_1
Y_2 = U + X_2    (4)

where U, X_1, X_2 are independent non-negative valued Poisson variables.

We want to estimate the parameters of the above model by maximum likelihood. Let f(u; λ) and g(x; µ_j) be the probability mass functions of U and X_1, X_2, respectively. We have a set of observation pairs (y_11, y_12), ..., (y_n1, y_n2).


Since Y_1 and Y_2 are conditionally independent given U, and U, X_1, X_2 follow the Poisson distribution, the likelihood is:

L(λ, µ_1, µ_2) = Π_{i=1}^{n} Σ_{u=0}^{min(y_i1, y_i2)} [λ^u exp{−λ}/u!] [µ_1^{y_i1−u} exp{−µ_1}/(y_i1−u)!] [µ_2^{y_i2−u} exp{−µ_2}/(y_i2−u)!]

= exp{−n(λ + µ_1 + µ_2)} Π_{i=1}^{n} Σ_{u=0}^{min(y_i1, y_i2)} [λ^u/u!] [µ_1^{y_i1−u}/(y_i1−u)!] [µ_2^{y_i2−u}/(y_i2−u)!]    (5)

After taking the logarithm of the right hand side of the above expression, we can numerically maximize only over the parameter λ by inserting µ̂_1 = ȳ_1 − λ̂, µ̂_2 = ȳ_2 − λ̂. This is a result of the following proposition.

Proposition 1. The parameters that maximize (5), λ̂, µ̂_1, ..., µ̂_m, satisfy the equalities

ȳ_k = µ̂_k + λ̂,  k = 1, 2, ..., m,    (6)

where ȳ_k = (1/n) Σ_{i=1}^{n} y_ik for all k ([2]).
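Concretely, the profile maximization can be sketched as follows: the log of (5) is evaluated on a grid of λ with µ̂_1 = ȳ_1 − λ and µ̂_2 = ȳ_2 − λ substituted, as the proposition suggests. This is a Python sketch (the thesis used Matlab); the rates 0.5, 0.7, 0.9 and the grid are illustrative only.

```python
import numpy as np
from math import lgamma, exp, log

def loglik(lam, mu1, mu2, y):
    """Log of the likelihood (5) of the bivariate Poisson factor model."""
    ll = 0.0
    for y1, y2 in y:
        # finite latent sum: u can be at most min(y1, y2)
        s = sum(exp(u * log(lam) - lgamma(u + 1)
                    + (y1 - u) * log(mu1) - lgamma(y1 - u + 1)
                    + (y2 - u) * log(mu2) - lgamma(y2 - u + 1))
                for u in range(min(y1, y2) + 1))
        ll += log(s) - (lam + mu1 + mu2)
    return ll

# Simulate from Y1 = U + X1, Y2 = U + X2 with illustrative rates
rng = np.random.default_rng(0)
n = 2000
U = rng.poisson(0.5, n)
y = np.column_stack([U + rng.poisson(0.7, n), U + rng.poisson(0.9, n)])

# Profile over lambda only, inserting mu_k = ybar_k - lambda (Proposition 1)
ybar1, ybar2 = y.mean(axis=0)
grid = np.linspace(0.01, min(ybar1, ybar2) - 0.01, 60)
lam_hat = max(grid, key=lambda l: loglik(l, ybar1 - l, ybar2 - l, y))
```

The grid search stands in for any one-dimensional optimizer; the point is that only λ needs to be searched.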

3.2 A model in dimension 5

Let's introduce the (3, 2) model.

Y_1 = U_1 + X_1
Y_2 = U_1 + X_2
Y_3 = U_1 + X_3
Y_4 = U_2 + X_4
Y_5 = U_2 + X_5

We will estimate the parameters of the above model by maximum likelihood. Let f(u; λ_i) and g(x; µ_j) be the probability mass functions of U_i, i = 1, 2, and X_j, j = 1, ..., 5, respectively. We have n five-dimensional observations (y_11, ..., y_15), ..., (y_n1, ..., y_n5). Since Y_1, Y_2 and Y_3 are conditionally independent given U_1, and Y_4, Y_5 are conditionally independent given U_2, the likelihood is:

L(λ_1, λ_2, µ_1, ..., µ_5) = Π_{i=1}^{n} Σ_{u1=0}^{min(y_i1, y_i2, y_i3)} Σ_{u2=0}^{min(y_i4, y_i5)} f(u1; λ_1) f(u2; λ_2) g(y_i1−u1; µ_1) g(y_i2−u1; µ_2) g(y_i3−u1; µ_3) g(y_i4−u2; µ_4) g(y_i5−u2; µ_5)

= Π_{i=1}^{n} Σ_{u1} Σ_{u2} [λ_1^{u1} exp{−λ_1}/u1!] [λ_2^{u2} exp{−λ_2}/u2!] [µ_1^{y_i1−u1} exp{−µ_1}/(y_i1−u1)!] [µ_2^{y_i2−u1} exp{−µ_2}/(y_i2−u1)!] [µ_3^{y_i3−u1} exp{−µ_3}/(y_i3−u1)!] [µ_4^{y_i4−u2} exp{−µ_4}/(y_i4−u2)!] [µ_5^{y_i5−u2} exp{−µ_5}/(y_i5−u2)!]

= exp{−n(λ_1 + λ_2 + Σ_{j=1}^{5} µ_j)} Π_{i=1}^{n} Σ_{u1} Σ_{u2} [λ_1^{u1} λ_2^{u2} µ_1^{y_i1−u1} µ_2^{y_i2−u1} µ_3^{y_i3−u1} µ_4^{y_i4−u2} µ_5^{y_i5−u2}] / [u1! u2! (y_i1−u1)! (y_i2−u1)! (y_i3−u1)! (y_i4−u2)! (y_i5−u2)!]

After taking the logarithm of the above expression, we numerically maximize over the parameters λ_1, λ_2, using Proposition 1.
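As a numerical illustration, the double-sum log-likelihood can be evaluated directly; the Python sketch below (the thesis used Matlab; the helper name and all parameter values are ours) couples y_1, y_2, y_3 through u_1 and y_4, y_5 through u_2.

```python
import numpy as np
from math import lgamma, exp, log

def loglik_32(lam1, lam2, mu, y):
    """Log-likelihood of the (3,2) Poisson factor model:
    u1 is shared by y1..y3, u2 by y4, y5 (double latent sum)."""
    mu = np.asarray(mu, dtype=float)
    ll = 0.0
    for row in y:
        s = 0.0
        for u1 in range(int(min(row[:3])) + 1):
            for u2 in range(int(min(row[3:])) + 1):
                us = (u1, u1, u1, u2, u2)
                t = (u1 * log(lam1) - lgamma(u1 + 1)
                     + u2 * log(lam2) - lgamma(u2 + 1))
                for yj, muj, uj in zip(row, mu, us):
                    t += (yj - uj) * log(muj) - lgamma(yj - uj + 1)
                s += exp(t)
        # exp{-(lam1 + lam2 + sum mu_j)} contributes once per observation
        ll += log(s) - (lam1 + lam2 + mu.sum())
    return ll
```

On simulated data the log-likelihood is (with high probability, for a reasonable sample size) larger at the true parameters than at a distant alternative, which is a simple sanity check of the formula.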


3.3 A model in dimension 7

Let's introduce the (3, 2, 1, 1) model.

Y_1 = U_1 + X_1
Y_2 = U_1 + X_2
Y_3 = U_1 + X_3
Y_4 = U_2 + X_4
Y_5 = U_2 + X_5
Y_6 = X_6
Y_7 = X_7

We will estimate the parameters of the above model by maximum likelihood. Let f(u; λ_i) and g(x; µ_j) be the probability mass functions of U_i, i = 1, 2, and X_j, j = 1, ..., 7, respectively. We have n seven-dimensional observations (y_11, ..., y_17), ..., (y_n1, ..., y_n7). Since Y_1, Y_2 and Y_3 are conditionally independent given U_1, Y_4, Y_5 are conditionally independent given U_2, and Y_6, Y_7 are independent, the likelihood is:

L(λ_1, λ_2, µ_1, ..., µ_7) = Π_{i=1}^{n} Σ_{u1=0}^{min(y_i1, y_i2, y_i3)} Σ_{u2=0}^{min(y_i4, y_i5)} f(u1; λ_1) f(u2; λ_2) g(y_i1−u1; µ_1) g(y_i2−u1; µ_2) g(y_i3−u1; µ_3) g(y_i4−u2; µ_4) g(y_i5−u2; µ_5) g(y_i6; µ_6) g(y_i7; µ_7)

= Π_{i=1}^{n} Σ_{u1} Σ_{u2} [λ_1^{u1} exp{−λ_1}/u1!] [λ_2^{u2} exp{−λ_2}/u2!] [µ_1^{y_i1−u1} exp{−µ_1}/(y_i1−u1)!] [µ_2^{y_i2−u1} exp{−µ_2}/(y_i2−u1)!] [µ_3^{y_i3−u1} exp{−µ_3}/(y_i3−u1)!] [µ_4^{y_i4−u2} exp{−µ_4}/(y_i4−u2)!] [µ_5^{y_i5−u2} exp{−µ_5}/(y_i5−u2)!] [µ_6^{y_i6} exp{−µ_6}/y_i6!] [µ_7^{y_i7} exp{−µ_7}/y_i7!]

= exp{−n(λ_1 + λ_2 + Σ_{j=1}^{7} µ_j)} Π_{i=1}^{n} Σ_{u1} Σ_{u2} [λ_1^{u1} λ_2^{u2} µ_1^{y_i1−u1} µ_2^{y_i2−u1} µ_3^{y_i3−u1} µ_4^{y_i4−u2} µ_5^{y_i5−u2} µ_6^{y_i6} µ_7^{y_i7}] / [u1! u2! (y_i1−u1)! (y_i2−u1)! (y_i3−u1)! (y_i4−u2)! (y_i5−u2)! y_i6! y_i7!]    (7)

After taking the logarithm of the above expression, we numerically maximize over the parameters λ_1, λ_2, using Proposition 1 again.


4 Model Selection & Simulations

4.1 Model Selection

Here, the model selection is simple. Our goal is to see which model structures are more easily found in the correct form using k-means. We will present tables with the accuracy for each model in dimensions 5 and 7, and after that we will compare our results with those from Larsson's paper ([2]).

Larsson ([2]) uses the AIC as a proposed method for model selection. Due to the large number of potential models, he focuses on finding models where one factor loads on each variable. The approach is the same for all dimensions.

Considering dimension 7, his model selection algorithm starts by computing the AIC for the independence model and comparing it with all the (2,1,...,1) models. If the independence model has the lowest AIC, the algorithm stops. If not, he estimates all (3,1,...,1) models where the pair of variables that had the same factor in the first step is joined by one of the other variables, as well as all (2,2,1,...,1) models where he adds a new pair of variables consisting of any two that were not in the first pair. If none of the (3,1,...,1) or (2,2,1,...,1) models that he tried is better than the previously chosen (2,1,...,1) model, the algorithm stops and chooses the previous model. If not, it continues to test new models in the way described above.

4.2 Dimension 5

We started simulating with the models (5) and (1, 1, 1, 1, 1), and every time we achieved 100% accuracy. The simulations have been done with 100,000 replications.

In the (4,1) model, the variables Xj∼ P (µ), j = 1, 2, 3, 4, X5∼ P (1) and U1∼ P (λ).

In the (3,2) model, the variables Xj∼ P (µ), j = 1, 2, 3, 4, 5 and Ui∼ P (λ), i = 1, 2.

In the (3,1,1) model, the variables Xj ∼ P (µ), j = 1, 2, 3, X4,5 ∼ P (1) and U1∼ P (λ).

In the (2,2,1) model, the variables Xj ∼ P (µ), j = 1, 2, 3, 4, X5∼ P (1) and Ui∼ P (λ), i = 1, 2.

In the (2,1,1,1) model, the variables Xj ∼ P (µ), j = 1, 2, X3,4,5∼ P (1) and U1∼ P (λ).

The above choices are explained by the fact that, for the variables that are linked with a factor, it holds that:

E[Y ] = E[U ] + E[X] (8)

For the variables that are not linked with a factor, it holds :

E[Y ] = E[X] (9)

The variables follow the Poisson distribution with parameters µ = 0.5 and λ = 0.5.

Accuracy (%) by sample size:

Model          n = 25    n = 50    n = 100   n = 1000   n = 10000
(4,1)           47.20     51.44     51.92      52.00       52.03
(3,2)           66.82     74.76     76.41      77.00       79.27
(3,1,1)         36.91     41.69     43.40      44.70       44.97
(2,2,1)         45.81     53.27     55.80      56.45       57.19
(2,1,1,1)       47.08     52.97     55.24      56.93       57.33
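A minimal way to reproduce an entry of such a table is sketched below in Python (the thesis simulations were done in Matlab). It assumes, as one plausible reading of the procedure, that the five variables are clustered as points in R^n via their observation vectors, with k = 2 chosen from the factor structure; accuracy is the share of replications in which the true grouping {Y1, Y2, Y3}, {Y4, Y5} of the (3,2) model is recovered. Function names and defaults are ours.

```python
import numpy as np

def kmeans_labels(P, k, rng, n_init=3, iters=50):
    """Tiny k-means (Lloyd) with a few random restarts; returns labels."""
    best_lab, best_obj = None, np.inf
    for _ in range(n_init):
        C = P[rng.choice(len(P), k, replace=False)].astype(float)
        for _ in range(iters):
            lab = np.linalg.norm(P[:, None] - C[None], axis=2).argmin(axis=1)
            newC = np.array([P[lab == j].mean(axis=0) if (lab == j).any() else C[j]
                             for j in range(k)])
            if np.allclose(newC, C):
                break
            C = newC
        lab = np.linalg.norm(P[:, None] - C[None], axis=2).argmin(axis=1)
        obj = ((P - C[lab]) ** 2).sum()
        if obj < best_obj:
            best_lab, best_obj = lab, obj
    return best_lab

def accuracy_32(lam, mu, n, reps, seed=0):
    """Share of replications where k-means (k = 2) recovers {Y1,Y2,Y3}, {Y4,Y5}."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        U1, U2 = rng.poisson(lam, n), rng.poisson(lam, n)
        # rows = variables, columns = the n observations
        Y = np.vstack([U1 + rng.poisson(mu, n) for _ in range(3)]
                      + [U2 + rng.poisson(mu, n) for _ in range(2)])
        lab = kmeans_labels(Y, 2, rng)
        hits += (len(set(lab[:3])) == 1 and len(set(lab[3:])) == 1
                 and lab[0] != lab[3])
    return hits / reps
```

Consistent with the tables, accuracy is high when λ is large relative to µ (strong within-group correlation) and low in the opposite regime.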

The variables follow the Poisson distribution with parameters µ = 0.2 and λ = 0.8.

Accuracy (%) by sample size:

Model          n = 25    n = 50    n = 100   n = 1000   n = 10000
(4,1)           79.06     80.89     82.27      84.25       85.92
(3,2)           96.75     98.42     99.32     100         100
(3,1,1)         69.98     72.39     73.78      74.50       74.69
(2,2,1)         80.17     83.04     85.39      89.94       90.23
(2,1,1,1)       76.89     79.91     81.55      86.62       87.02


The variables follow the Poisson distribution with parameters µ = 0.8 and λ = 0.2.

Accuracy (%) by sample size:

Model          n = 25    n = 50    n = 100   n = 1000   n = 10000
(4,1)           21.59     26.00     31.87      43.53       43.51
(3,2)           24.34     32.29     43.88      64.86       64.84
(3,1,1)         13.59     17.44     22.75      33.96       34.08
(2,2,1)         13.18     19.72     27.68      44.95       45.63
(2,1,1,1)       21.41     26.48     32.30      44.65       45.27

In the (4,1) model, the variables Xj∼ Bin(n, p1), j = 1, 2, 3, 4, X5∼ Bin(n, p2) and U1∼ Bin(n, p3).

In the (3,2) model, the variables Xj∼ Bin(n, p1), j = 1, 2, 3, 4, 5 and Ui∼ Bin(n, p3), i = 1, 2.

In the (3,1,1) model, the variables Xj ∼ Bin(n, p1), j = 1, 2, 3, X4,5∼ Bin(n, p2) and U1∼ Bin(n, p3).

In the (2,2,1) model, the variables Xj ∼ Bin(n, p1), j = 1, 2, 3, 4, X5∼ Bin(n, p2) and Ui∼ Bin(n, p3), i = 1, 2.

In the (2,1,1,1) model, the variables Xj ∼ Bin(n, p1), j = 1, 2, X3,4,5∼ Bin(n, p2) and U1∼ Bin(n, p3).

The variables follow the binomial distribution with parameters n = 5, p1 = 0.1, p2 = 0.2 and p3 = 0.1.

Accuracy (%) by sample size:

Model          n = 25    n = 50    n = 100   n = 1000   n = 10000
(4,1)           44.97     49.60     51.02      50.82       50.82
(3,2)           67.83     74.80     76.56      76.85       79.50
(3,1,1)         34.34     39.71     41.46      42.37       42.39
(2,2,1)         44.82     52.53     54.38      55.42       55.47
(2,1,1,1)       44.35     50.10     52.58      54.18       54.42

The variables follow the binomial distribution with parameters n = 10, p1 = 0.05, p2 = 0.1 and p3 = 0.05.

Accuracy (%) by sample size:

Model          n = 25    n = 50    n = 100   n = 1000   n = 10000
(4,1)           46.10     50.31     51.44      51.38       51.48
(3,2)           67.47     74.47     76.20      77.19       79.24
(3,1,1)         28.77     36.52     39.24      40.53       41.38
(2,2,1)         45.29     52.96     54.85      56.14       56.34
(2,1,1,1)       46.26     51.64     53.93      55.43       55.97

The variables follow the binomial distribution with parameters n = 10, p1= 0.08, p2= 0.1 and p3= 0.02.

Accuracy (%) by sample size:

Model          n = 25    n = 50    n = 100   n = 1000   n = 10000
(4,1)           21.03     25.89     31.71      43.57       43.95
(3,2)           25.97     34.83     47.05      65.09       65.34
(3,1,1)         13.74     17.84     23.18      34.09       34.09
(2,2,1)         14.81     20.66     28.93      45.27       45.23
(2,1,1,1)       21.43     26.13     32.34      44.43       44.88

The variables follow the binomial distribution with parameters n = 10, p1= 0.02, p2= 0.1 and p3= 0.08.

Accuracy (%) by sample size:

Model          n = 25    n = 50    n = 100   n = 1000   n = 10000
(4,1)           77.19     79.04     80.35      80.60       79.03
(3,2)           96.40     98.12     99.13     100         100
(3,1,1)         69.73     70.44     71.81      72.64       71.43
(2,2,1)         78.59     81.45     83.15      88.64       88.07
(2,1,1,1)       74.96     77.65     79.22      82.56       78.59

Also, we can calculate a confidence interval for the estimated proportions using the binomial distribution and the number r of replicates; that is, an interval that covers the probability p that k-means finds the right model, given the estimate p̂. For example, for the model (4,1) when X_j ∼ Bin(5, 0.1), U_i ∼ Bin(5, 0.1), at n = 25 the estimated proportion is 0.4497. Thus, we have:

p̂ ± 1.96 √(p̂(1 − p̂)/r)    (10)

= 0.4497 ± 1.96 √(0.4497(1 − 0.4497)/100,000) = [0.4466, 0.4528]

is a 95% confidence interval for p.
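Formula (10) is straightforward to compute; a small Python helper (the function name is ours) reproduces the interval above.

```python
import math

def wald_ci(p_hat, r, z=1.96):
    """Normal-approximation (Wald) CI for a proportion estimated from r replicates."""
    half = z * math.sqrt(p_hat * (1 - p_hat) / r)
    return p_hat - half, p_hat + half

# The example above: p_hat = 0.4497 from 100,000 replications
lo, hi = wald_ci(0.4497, 100_000)  # close to [0.4466, 0.4528]
```

With r = 100,000 replications the intervals are very narrow, which is why the accuracies in the tables are reported to two decimals.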

In the next table, we calculated a few more confidence intervals.

Model       Distributions                                 n = 100            n = 1000           n = 10,000
(4,1)       Xj ∼ Bin(5, 0.1), Ui ∼ Bin(5, 0.1)            [0.5071, 0.5133]   [0.5051, 0.5113]   [0.5051, 0.5113]
(4,1)       Xj ∼ Bin(10, 0.05), Ui ∼ Bin(10, 0.05)        [0.5113, 0.5175]   [0.5107, 0.5169]   [0.5117, 0.5179]
(4,1)       Xj ∼ Bin(10, 0.08), Ui ∼ Bin(10, 0.02)        [0.3142, 0.3200]   [0.4326, 0.4388]   [0.4284, 0.4346]
(3,1,1)     Xj ∼ Bin(10, 0.08), Ui ∼ Bin(10, 0.02)        [0.2292, 0.2344]   [0.3380, 0.3438]   [0.3380, 0.3438]
(2,2,1)     Xj ∼ Bin(10, 0.08), Ui ∼ Bin(10, 0.02)        [0.2865, 0.2921]   [0.4496, 0.4558]   [0.4492, 0.4554]
(4,1)       Xj ∼ Bin(10, 0.02), Ui ∼ Bin(10, 0.08)        [0.8010, 0.8060]   [0.8036, 0.8085]   [0.7878, 0.7928]
(3,1,1)     Xj ∼ Bin(10, 0.02), Ui ∼ Bin(10, 0.08)        [0.7928, 0.7209]   [0.7236, 0.7292]   [0.7115, 0.7171]
(2,2,1)     Xj ∼ Bin(10, 0.02), Ui ∼ Bin(10, 0.08)        [0.8292, 0.8338]   [0.8844, 0.8884]   [0.8787, 0.8827]
(2,1,1,1)   Xj ∼ Bin(10, 0.02), Ui ∼ Bin(10, 0.08)        [0.7897, 0.7947]   [0.8233, 0.8280]   [0.7834, 0.7884]

Going back to the simulations, in the next table we have:

In the (4,1) model, the variables Xj∼ P (µ), j = 1, 2, 3, 4, X5∼ P (1) and U1∼ Bin(n, p1).

In the (3,2) model, the variables Xj∼ P (µ), j = 1, 2, 3, 4, 5 and Ui∼ Bin(n, p1), i = 1, 2.

In the (3,1,1) model, the variables Xj ∼ P (µ), j = 1, 2, 3, X4,5 ∼ P (1) and U1∼ Bin(n, p1).

In the (2,2,1) model, the variables Xj ∼ P (µ), j = 1, 2, 3, 4, X5 ∼ P (1) and Ui ∼ Bin(n, p1), i = 1, 2.

In the (2,1,1,1) model, the variables Xj ∼ P (µ), j = 1, 2, X3,4,5∼ P (1) and U1∼ Bin(n, p1).

The variables Xj follow the Poisson distribution with parameter µ = 0.5 and the factors Ui follow the binomial distribution with parameters n = 10, p1 = 0.05.

Accuracy (%) by sample size:

Model          n = 25    n = 50    n = 100   n = 1000   n = 10000
(4,1)           46.83     50.68     51.55      51.83       51.89
(3,2)           65.39     73.20     75.33      75.45       75.98
(3,1,1)         36.74     41.46     43.72      44.34       44.73
(2,2,1)         44.97     52.10     55.05      56.40       56.98
(2,1,1,1)       47.00     52.58     55.00      56.74       57.26

Here, Xj follow the Poisson distribution with parameter µ = 0.2 and Ui follow the binomial distribution with parameters n = 10, p1 = 0.08.

Accuracy (%) by sample size:

Model          n = 25    n = 50    n = 100   n = 1000   n = 10000
(4,1)           78.05     80.05     81.34      82.46       84.13
(3,2)           96.07     97.79     98.97     100         100
(3,1,1)         69.64     72.08     74.20      76.80       79.49
(2,2,1)         79.60     82.40     84.95      90.50       91.68
(2,1,1,1)       77.26     80.07     82.30      92.04       99.09

Here, Xj follow the Poisson distribution with parameter µ = 0.8 and Ui follow the binomial distribution with parameters n = 10, p1 = 0.02.

Accuracy (%) by sample size:

Model          n = 25    n = 50    n = 100   n = 1000   n = 10000
(4,1)           21.17     25.77     31.78      43.56       43.41
(3,2)           23.60     31.92     43.16      65.02       64.81
(3,1,1)         13.61     17.34     22.62      33.83       34.23
(2,2,1)         13.91     19.47     27.32      45.03       45.38
(2,1,1,1)       21.65     26.19     32.01      45.01       45.11

In the (4,1) model, the variables Xj∼ Bin(n, p1), j = 1, 2, 3, 4, X5∼ Bin(n, p2) and U1∼ P (µ).

In the (3,2) model, the variables Xj∼ Bin(n, p1), j = 1, 2, 3, 4, 5 and Ui∼ P (µ), i = 1, 2.

In the (3,1,1) model, the variables Xj ∼ Bin(n, p1), j = 1, 2, 3, X4,5∼ Bin(n, p2) and U1∼ P (µ).

In the (2,2,1) model, the variables Xj ∼ Bin(n, p1), j = 1, 2, 3, 4, X5∼ Bin(n, p2) and Ui∼ P (µ), i = 1, 2.

In the (2,1,1,1) model, the variables Xj ∼ Bin(n, p1), j = 1, 2, X3,4,5∼ Bin(n, p2) and U1∼ P (µ).

Here, Xj follow the binomial distribution with parameters n = 10, p1 = 0.05 and Ui follow the Poisson distribution with parameter µ = 0.5.

Accuracy (%) by sample size:

Model          n = 25    n = 50    n = 100   n = 1000   n = 10000
(4,1)           46.72     50.61     51.73      51.71       51.68
(3,2)           68.75     75.66     77.39      78.99       80.38
(3,1,1)         37.15     41.90     43.50      44.34       44.72
(2,2,1)         46.25     53.43     55.51      56.25       56.43
(2,1,1,1)       46.48     51.67     53.93      55.83       56.06

Here, Xj follow the binomial distribution with parameters n = 10, p1 = 0.08 and Ui follow the Poisson distribution with parameter µ = 0.2.

Accuracy (%) by sample size:

Model          n = 25    n = 50    n = 100   n = 1000   n = 10000
(4,1)           21.27     25.93     32.10      43.24       43.44
(3,2)           26.29     35.56     47.39      65.40       65.19
(3,1,1)         13.85     18.18     23.63      34.10       34.30
(2,2,1)         15.07     20.83     29.40      45.22       45.20
(2,1,1,1)       21.67     26.23     32.77      44.30       44.95

Here, Xj follow the binomial distribution with parameters n = 10, p1 = 0.02 and Ui follow the Poisson distribution with parameter µ = 0.8.

Accuracy (%) by sample size:

Model          n = 25    n = 50    n = 100   n = 1000   n = 10000
(4,1)           78.26     80.14     81.53      82.63       84.19
(3,2)           97.06     98.68     99.49     100         100
(3,1,1)         69.27     71.24     72.36      72.23       70.51
(2,2,1)         79.45     81.86     84.06      87.90       86.78
(2,1,1,1)       75.12     77.26     78.47      78.49       75.94

[Plots of accuracy versus sample size, drawn with the Matlab command semilogx, which plots the x- and y-coordinates using a base-10 logarithmic scale on the x-axis and a linear scale on the y-axis, are omitted here.]

4.3 Dimension 7

At first, we started simulating with the models (7) and (1,1,1,1,1,1,1), and we achieved 100% accuracy in all of them. All the simulations have been done with 100,000 replications.

In the (6,1) model, the variables Xj∼ P (µ), j = 1, 2, 3, 4, 5, 6, X7∼ P (1) and U1∼ P (λ).

In the (5,2) model, the variables Xj∼ P (µ), j = 1, 2, 3, 4, 5, 6, 7 and Ui∼ P (λ), i = 1, 2.

In the (5,1,1) model, the variables Xj ∼ P (µ), j = 1, 2, 3, 4, 5, X6,7 ∼ P (1) and U1∼ P (λ).

In the (4,3) model, the variables Xj∼ P (µ), j = 1, 2, 3, 4, 5, 6, 7 and Ui∼ P (λ), i = 1, 2.

In the (4,2,1) model, the variables Xj ∼ P (µ) , j = 1, 2, 3, 4, 5, 6, X7∼ P (1) and Ui∼ P (λ), i = 1, 2.

In the (4,1,1,1) model, the variables Xj ∼ P (µ), j = 1, 2, 3, 4, X5,6,7∼ P (1) and U1∼ P (λ).

In the (3,3,1) model, the variables Xj ∼ P (µ), j = 1, 2, 3, 4, 5, 6, X7∼ P (1) and Ui∼ P (λ), i = 1, 2.

In the (3,2,2) model, the variables Xj ∼ P (µ), j = 1, 2, 3, 4, 5, 6, 7 and Ui ∼ P (λ), i = 1, 2, 3.

In the (3,2,1,1) model, the variables Xj ∼ P (µ), j = 1, 2, 3, 4, 5, X6,7∼ P (1) and Ui∼ P (λ), i = 1, 2.

In the (3,1,1,1,1) model, the variables Xj∼ P (µ), j = 1, 2, 3, X4,5,6,7 ∼ P (1) and U1∼ P (λ).

In the (2,2,2,1) model, the variables Xj ∼ P (µ), j = 1, 2, 3, 4, 5, 6, X7∼ P (1) and Ui∼ P (λ), i = 1, 2, 3.

In the (2,2,1,1,1) model, the variables Xj∼ P (µ), j = 1, 2, 3, 4, X5,6,7∼ P (1) and Ui∼ P (λ), i = 1, 2.

In the (2,1,1,1,1,1) model, the variables Xj ∼ P (µ), j = 1, 2, X3,4,5,6,7∼ P (1) and U1∼ P (λ).

The variables Xj follow the Poisson distribution with parameter µ = 0.5 and the variables Ui also follow the Poisson distribution with parameter λ = 0.5.

Accuracy (%) by sample size:

Model           n = 25    n = 50    n = 100   n = 1000   n = 10000
(6,1)            33.74     37.79     38.55      38.80       38.94
(5,2)            55.02     63.80     66.87      69.80       70.72
(5,1,1)          18.70     22.56     24.02      24.72       25.29
(4,3)            65.67     75.88     79.74      81.86       82.22
(4,2,1)          26.84     33.54     36.04      37.90       38.39
(4,1,1,1)        15.16     19.19     20.87      22.21       22.46
(3,3,1)          29.07     36.85     40.11      41.80       41.69
(3,2,2)          37.12     48.72     52.03      53.94       56.08
(3,2,1,1)        20.37     27.60     29.91      31.96       32.34
(3,1,1,1,1)      18.53     23.73     25.96      27.67       28.23
(2,2,2,1)        17.87     22.56     24.52      24.83       25.48
(2,2,1,1,1)      21.46     28.48     32.82      34.67       35.26
(2,1,1,1,1,1)    34.21     40.69     43.69      46.61       47.35

The variables Xj follow the Poisson distribution with parameter µ = 0.2 and the variables Ui also follow the Poisson distribution with parameter λ = 0.8.

Accuracy (%) by sample size:

Model           n = 25    n = 50    n = 100   n = 1000   n = 10000
(6,1)            66.23     69.97     72.09      72.58       72.50
(5,2)            92.59     93.55     93.73      92.94       92.92
(5,1,1)          51.28     54.27     55.86      57.44       57.65
(4,3)            97.65     98.88     99.38      99.97      100
(4,2,1)          65.17     68.32     70.07      74.01       76.10
(4,1,1,1)        46.69     49.91     52.29      55.52       57.71
(3,3,1)          67.04     70.30     71.39      72.61       72.70
(3,2,2)          83.12     86.56     88.09      89.51       89.55
(3,2,1,1)        56.16     59.99     62.85      66.68       67.30
(3,1,1,1,1)      51.13     54.59     56.80      59.66       59.89
(2,2,2,1)        64.89     67.19     68.92      72.22       72.16
(2,2,1,1,1)      58.02     62.27     65.74      73.04       73.50
(2,1,1,1,1,1)    66.71     70.09     72.88      78.40       78.87

References
