Representation theory of the symmetric group


DEGREE PROJECTS IN MATHEMATICS

DEPARTMENT OF MATHEMATICS, STOCKHOLM UNIVERSITY

Representation theory of the symmetric group

by

Hannes Vestberg

2014 - No 18

DEPARTMENT OF MATHEMATICS, STOCKHOLM UNIVERSITY, 106 91 STOCKHOLM


Independent project in mathematics, 15 higher education credits, first cycle. Supervisor: Rikard Bögvad

2014


Abstract

In this thesis we cover some basic properties of the symmetric group $S_n$ and how to represent it. The aim is to represent the group $S_n$ with the help of Specht modules. To do so we need to study matrix representations, linear transformations, λ-tableaux, modules and group rings, all of which appear in this thesis.


1 The Symmetric group

In the first part of this essay we describe the symmetric group and some of its properties. The reader should be familiar with the basic theory of groups; otherwise we recommend Abstract Algebra by John A. Beachy and William D. Blair [1], from which the material in this section is taken unless stated otherwise.

We will begin with some useful definitions.

Definition 1.0.1

The symmetric group, $S_n$, is the group consisting of all bijections from $\{1, 2, \dots, n\}$ to itself, with composition as multiplication.

This group clearly has order $n!$. The elements $\sigma \in S_n$ are called permutations.

We multiply from right to left: if $\pi, \sigma \in S_n$ and we want to compute the bijection $\pi\sigma$, we first apply $\sigma$ and then $\pi$.

Example 1.0.1

Let $\pi, \sigma \in S_6$ be given by

\[ \pi(1) = 4,\quad \pi(2) = 2,\quad \pi(3) = 1,\quad \pi(4) = 6,\quad \pi(5) = 3,\quad \pi(6) = 5, \]

\[ \sigma(1) = 2,\quad \sigma(2) = 3,\quad \sigma(3) = 4,\quad \sigma(4) = 5,\quad \sigma(5) = 6,\quad \sigma(6) = 1. \]

Then

\[ \pi\sigma(1) = \pi(\sigma(1)) = \pi(2) = 2,\quad \pi\sigma(2) = 1,\quad \pi\sigma(3) = 6,\quad \pi\sigma(4) = 3,\quad \pi\sigma(5) = 5,\quad \pi\sigma(6) = 4. \]

We can display $\sigma$ using cycle notation. If $\sigma \in S_n$ and $k \in \{1, 2, \dots, n\}$, then the elements in the sequence $k, \sigma(k), \sigma^2(k), \sigma^3(k), \dots$ cannot all be distinct. Take the smallest positive power $q$ such that $\sigma^q(k) = k$; then we can write, in cycle notation, $(k, \sigma(k), \sigma^2(k), \dots, \sigma^{q-1}(k)) = (k, l, m, \dots, r)$. This means that $\sigma$ sends $k$ to $l$, $l$ to $m$, and so on, and $r$ back to $k$.

Theorem 1.0.1

Every permutation in Sn can be written as a product of disjoint cycles. The cycles of length ≥ 2 that appear in the product are unique.

This is clear. Take the same $\sigma$ and $\pi$ as in example 1.0.1. Then $\sigma$ can be written as $(1, 2, 3, 4, 5, 6)$, $\pi$ as $(1, 4, 6, 5, 3)(2)$, and $\pi\sigma$ becomes $(1, 2)(3, 6, 4)(5)$.

A cycle containing $k$ elements is called a $k$-cycle. The permutation $\pi\sigma = (1, 2)(3, 6, 4)(5)$ contains one 2-cycle, one 3-cycle and one 1-cycle. We call a 1-cycle a fixed point and a 2-cycle a transposition. The cycle type of a permutation is an expression of the form $(1^{e_1}, 2^{e_2}, \dots, n^{e_n})$, where $e_i$ is the number of $i$-cycles. Our example $\pi\sigma$ has the cycle type $(1^1, 2^1, 3^1, 4^0, 5^0, 6^0)$. We say that $\sigma$ is an involution if $\sigma^2 = \varepsilon$, where $\varepsilon$ is the identity element. To clarify, $\sigma^2(k) = k$ for all $k \in \{1, 2, \dots, n\}$. It is clear that $\sigma$ is an involution if and only if $\sigma$ only contains cycles of length one or two.
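The conventions above (multiplication from right to left, cycle notation, cycle type) can be checked with a small Python sketch. The tuple encoding `perm[i-1] = image of i` and the helper names `compose`, `cycles` and `cycle_type` are assumptions made for this illustration only; they do not come from the thesis.

```python
def compose(p, q):
    """Product p*q: apply q first, then p (tuples with p[i-1] = image of i)."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

def cycles(perm):
    """Disjoint cycles of perm, fixed points included as 1-cycles."""
    seen, result = set(), []
    for start in range(1, len(perm) + 1):
        cycle, k = [], start
        while k not in seen:
            seen.add(k)
            cycle.append(k)
            k = perm[k - 1]
        if cycle:
            result.append(tuple(cycle))
    return result

def cycle_type(perm):
    """Return [e_1, ..., e_n], where e_i is the number of i-cycles."""
    lengths = [len(c) for c in cycles(perm)]
    return [lengths.count(i) for i in range(1, len(perm) + 1)]

pi = (4, 2, 1, 6, 3, 5)       # pi from example 1.0.1
sigma = (2, 3, 4, 5, 6, 1)    # sigma from example 1.0.1
pi_sigma = compose(pi, sigma)
print(cycles(pi_sigma))       # [(1, 2), (3, 6, 4), (5,)]
print(cycle_type(pi_sigma))   # [1, 1, 1, 0, 0, 0]
```

The printed cycle type lists the exponents $e_1, \dots, e_6$ in order.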

We can describe a cycle type by using a partition instead.

Definition 1.0.2

A partition of an integer $n$ is a sequence $\lambda = (\lambda_1, \lambda_2, \dots, \lambda_k)$ of positive integers $\lambda_i$ such that

\[ \sum_{i=1}^{k} \lambda_i = n. \]

Our example $\pi\sigma$ that we used above corresponds to the partition $\lambda = (3, 2, 1)$, and we note that $3 + 2 + 1 = 6$.

Definition 1.0.3

Let $G$ be a group. Elements $g$ and $k$ in $G$ are called conjugates if $g = hkh^{-1}$ for some $h \in G$.

Definition 1.0.4

The set of all elements conjugate to a given $g$ is called the conjugacy class of $g$ and is denoted by $K_g$.

Theorem 1.0.2

Conjugacy of elements defines an equivalence relation on any group G.

Proof. See [1] p.323.

Since conjugacy is an equivalence relation, the conjugacy classes (which are all distinct) partition G. Note that this is a set partition and not an integer partition.

If we return to our favourite group Sn, we have the following result.

Theorem 1.0.3

The conjugacy classes of $S_n$ are determined by cycle type. That is, $\pi$ and $\sigma$ are conjugates if and only if they have the same cycle type.

Proof. This proof can be read in [5].

This result implies that we have a natural one-to-one correspondence between the conjugacy classes of Sn and the partitions of n. One can ask what size the conjugacy classes of Sn have. For this we first need to define the centralizer of g∈ G.

Definition 1.0.5

Let $G$ be any group. The centralizer $Z_g$ of $g \in G$ is

\[ Z_g = \{h \in G : hgh^{-1} = g\}. \]

There is a bijection between the cosets of $Z_g$ and the elements of $K_g$, so that

\[ |K_g| = \frac{|G|}{|Z_g|}. \]

This can be seen with the help of the function $\theta : G \to K_g$ defined by

\[ \theta(h) = hgh^{-1}, \]

where $h \in G$. Now we can form the following chain of equivalences:

\[ \theta(h) = \theta(h') \iff hgh^{-1} = h'g(h')^{-1} \iff (h')^{-1}hgh^{-1}h' = g \iff \big((h')^{-1}h\big)\, g\, \big((h')^{-1}h\big)^{-1} = g. \]

The last condition says that $(h')^{-1}h \in Z_g$, meaning that $h'Z_g = hZ_g$. Thus the image is the same if and only if $h$ and $h'$ are in the same coset of $Z_g$. Since $\theta$ is surjective, the elements of $K_g$ are in bijection with the cosets of $Z_g$ in $G$, and $|K_g| = |G|/|Z_g|$.

Proposition 1.0.1

If $g \in S_n$ has type $\lambda = (1^{e_1}, 2^{e_2}, \dots, n^{e_n})$, then $|Z_g|$ depends only on $\lambda$ and

\[ z_\lambda := |Z_g| = 1^{e_1} e_1!\, 2^{e_2} e_2! \cdots n^{e_n} e_n!. \]

Proof. [6] Any $x \in Z_g$ can either permute the cycles of length $k$ among themselves, or perform a cyclic rotation on each of the individual cycles (or both). Since there are $e_k!$ ways to do the first operation and $k^{e_k}$ ways to do the second, we are done.

Returning to our mission to count the elements of $K_g$ (a conjugacy class of $S_n$), we see that

\[ |K_g| = \frac{|S_n|}{|Z_g|} = \frac{n!}{1^{e_1} \cdot e_1! \cdot 2^{e_2} \cdot e_2! \cdots n^{e_n} \cdot e_n!}. \]

Example 1.0.2

Let $\pi = (1, 2)(3, 4) \in S_4$. Theorem 1.0.3 tells us that all permutations of the same cycle type are in the conjugacy class $K_\pi$. All elements in this class are easy to see:

\[ (1, 2)(3, 4), \quad (1, 3)(2, 4), \quad (1, 4)(2, 3). \]

So $|K_\pi| = 3$. Since $\pi$ has cycle type $\lambda = (1^0, 2^2, 3^0, 4^0)$, proposition 1.0.1 tells us that

\[ z_\lambda = |Z_\pi| = 1^0 \cdot 0! \cdot 2^2 \cdot 2! \cdot 3^0 \cdot 0! \cdot 4^0 \cdot 0! = 4 \cdot 2 = 8. \]

We also know that $|S_4| = 4! = 24$. Thus

\[ \frac{|S_4|}{|Z_\pi|} = \frac{24}{8} = 3 = |K_\pi|, \]

which is exactly what we wanted.
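The count in this example can be cross-checked by brute force. The following Python sketch evaluates the formula of proposition 1.0.1 and compares it with a direct enumeration of $S_4$; the function names `centralizer_size` and `cycle_type`, and the tuple encoding of permutations, are assumptions made for this illustration.

```python
from itertools import permutations
from math import factorial

def centralizer_size(exponents):
    """z_lambda = prod_k k^(e_k) * e_k! for the cycle type (1^e_1, ..., n^e_n)."""
    z = 1
    for k, e in enumerate(exponents, start=1):
        z *= k**e * factorial(e)
    return z

def cycle_type(perm):
    """perm[i-1] is the image of i; returns the exponents [e_1, ..., e_n]."""
    seen, counts = set(), [0] * len(perm)
    for start in range(1, len(perm) + 1):
        length, k = 0, start
        while k not in seen:
            seen.add(k)
            k = perm[k - 1]
            length += 1
        if length:
            counts[length - 1] += 1
    return counts

n = 4
target = [0, 2, 0, 0]                      # cycle type of (1, 2)(3, 4)
z = centralizer_size(target)
by_formula = factorial(n) // z
by_counting = sum(1 for p in permutations(range(1, n + 1))
                  if cycle_type(p) == target)
print(z, by_formula, by_counting)          # 8 3 3
```

Enumerating all of $S_n$ is only practical for small $n$; the formula itself needs no enumeration.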

Theorem 1.0.4

Any permutation in $S_n$, where $n \geq 2$, can be written as a product of transpositions.

Note that such a product of transpositions is usually not unique: in many cases a permutation can be written as a product of transpositions in several different ways.

Proof. Theorem 1.0.1 states that we can express every permutation in $S_n$ as a product of cycles. Thus we only have to show that every cycle can be written as a product of transpositions. The identity can be expressed as $(1) = (1, 2)(1, 2)$.

For any other cycle $(k_1, k_2, \dots, k_m)$, of length $m \geq 2$, the following explicit computation can be made:

\[ (k_1, k_2, \dots, k_m) = (k_{m-1}, k_m)(k_{m-2}, k_m) \cdots (k_2, k_m)(k_1, k_m). \]

Example 1.0.3

Let us write $\sigma = (1, 2)(3, 5, 4) \in S_5$ as a product of transpositions:

\[ \sigma = (1, 2)(3, 5, 4) = (1, 2)(5, 4)(3, 4). \]

Theorem 1.0.5

If you write a permutation as a product of transpositions in two ways, then the number of transpositions is either odd in both cases or even in both cases.

Proof. This will be a proof by contradiction. Suppose that a permutation $\sigma$ can be written both as a product of an even number of transpositions and as a product of an odd number of transpositions:

\[ \sigma = \pi_1\pi_2 \cdots \pi_{2r} = \delta_1\delta_2 \cdots \delta_{2s+1}, \]

where the $\pi_i$ and $\delta_i$ are transpositions and $r, s$ are non-negative integers. We know that $\delta_k = \delta_k^{-1}$. Thus the identity permutation can be written as

\[ (1) = \sigma\sigma^{-1} = \pi_1\pi_2 \cdots \pi_{2r}\,\delta_{2s+1} \cdots \delta_1, \]

which is a product of an odd number of transpositions. Suppose that $(1) = \theta_1\theta_2 \cdots \theta_k$ is a shortest product of an odd number of transpositions that is equal to the identity; then $k \geq 3$, since a single transposition is not the identity. Suppose that $\theta_1 = (x, y)$. We can deduce that $x$ must appear in at least one other transposition $\theta_i$, $i > 1$, since otherwise

\[ x = (1)(x) = \theta_1 \cdots \theta_k(x) = y, \]

which is a contradiction. Among all products of $k$ transpositions that are equal to the identity and have $x$ in the leftmost transposition, assume that $\theta_1 \cdots \theta_k$ has the fewest occurrences of $x$. Take the smallest $i \neq 1$ such that $x$ occurs in $\theta_i$. Now we would like to move this transposition to the left, without changing the number of transpositions in the product or the number of $x$'s that appear.

This can be done with the help of two computations. Let $x$, $r$, $s$ and $t$ all be distinct; then

\[ (s, t)(x, r) = (x, r)(s, t) \quad \text{and} \quad (s, r)(x, r) = (x, s)(s, r). \]

This means that we can move our transposition containing $x$ to the position of $\theta_2$, without changing the number of transpositions in the product and without getting more $x$'s.

Say that $\theta_2 = (x, z)$, $z \neq x$. If $z = y$ we see that

\[ (1) = \theta_1\theta_2 \cdots \theta_k = (x, y)(x, y)\theta_3 \cdots \theta_k = \theta_3 \cdots \theta_k, \]

which is a shorter product of an odd number of transpositions equal to the identity, a contradiction.

If $z \neq y$ we get

\[ (1) = \theta_1\theta_2 \cdots \theta_k = (x, y)(x, z)\theta_3 \cdots \theta_k = (x, z)(y, z)\theta_3 \cdots \theta_k, \]

which is a product of $k$ transpositions with $x$ in the leftmost transposition that has fewer $x$'s, also a contradiction! Hence the identity cannot be written as a product of an odd number of transpositions, and we are done.

Example 1.0.4

If we have the same σ = (1, 2)(3, 5, 4) as before, we can write it as σ = (1, 2)(5, 4)(3, 4) or as σ = (1, 2)(3, 5)(5, 4). In both cases we have an odd number of transpositions.

Definition 1.0.6

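If $\sigma = \tau_1\tau_2 \cdots \tau_k$, where the $\tau_i$ are transpositions, the sign of $\sigma$, denoted by $\operatorname{sgn}(\sigma)$, is

\[ \operatorname{sgn}(\sigma) = (-1)^k. \]

From theorem 1.0.4 and theorem 1.0.5 we see that the sign is well defined. It follows that if $\sigma$ and $\pi$ are elements of $S_n$, then

\[ \operatorname{sgn}(\sigma\pi) = \operatorname{sgn}(\sigma)\operatorname{sgn}(\pi). \]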
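As a sanity check, the decomposition of a cycle into transpositions from the proof of theorem 1.0.4 can be turned into a short Python sketch that also computes the sign. Permutations are stored as tuples with `perm[i-1]` the image of `i`; this encoding and the helper names are assumptions of the sketch, not notation from the thesis.

```python
from itertools import permutations

def compose(p, q):
    """Product p*q: apply q first, then p."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

def cycles(perm):
    """Disjoint cycles of perm, fixed points included."""
    seen, result = set(), []
    for start in range(1, len(perm) + 1):
        cycle, k = [], start
        while k not in seen:
            seen.add(k)
            cycle.append(k)
            k = perm[k - 1]
        if cycle:
            result.append(tuple(cycle))
    return result

def transpositions(perm):
    """Factor each cycle (k_1..k_m) as (k_{m-1}, k_m) ... (k_1, k_m), left to right."""
    factors = []
    for cyc in cycles(perm):
        last = cyc[-1]
        factors.extend((k, last) for k in reversed(cyc[:-1]))
    return factors

def product_of_transpositions(ts, n):
    """Multiply the transpositions right to left (the rightmost acts first)."""
    perm = list(range(1, n + 1))
    for a, b in reversed(ts):
        # composing with (a, b) on the left swaps the values a and b
        perm = [b if v == a else a if v == b else v for v in perm]
    return tuple(perm)

def sign(perm):
    return (-1) ** len(transpositions(perm))

sigma = (2, 1, 5, 3, 4)                       # (1, 2)(3, 5, 4) in S_5
print(transpositions(sigma))                  # [(1, 2), (5, 4), (3, 4)]
print(sign(sigma))                            # -1

s4 = list(permutations(range(1, 5)))
multiplicative = all(sign(compose(p, q)) == sign(p) * sign(q)
                     for p in s4 for q in s4)
print(multiplicative)                         # True
```

The last check confirms $\operatorname{sgn}(\sigma\pi) = \operatorname{sgn}(\sigma)\operatorname{sgn}(\pi)$ on all of $S_4$, which illustrates the statement but is of course not a proof for general $n$.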

2 Representation theory

In this section we will only go through some of the basics in representation theory. The goal with representation theory is to classify the homomorphisms of an abstract finite group into groups of matrices or linear transformations, which are well studied in mathematics. The aim of this thesis is to represent the symmetric group using modules and especially Specht Modules.

2.1 Matrix Representation and Linear Transformations

We begin by defining what a linear transformation is. Let $A$ and $B$ be commutative groups. The set of all homomorphisms of $A$ into $B$ is denoted by $\operatorname{Hom}(A, B)$.

If we define the sum of two such homomorphisms $f$ and $g$ by

\[ (f + g)(a) = f(a) + g(a), \]

where $a \in A$, then $\operatorname{Hom}(A, B)$ becomes a commutative group. The group of all homomorphisms from $A$ to itself (with addition) becomes a ring with an identity element if we define multiplication as composition, i.e. $(fg)(a) = f(g(a))$, $a \in A$ [2].

Definition 2.1.1

Let $V$ and $W$ be vector spaces over the same field $K$. A function $f : V \to W$ is called a linear transformation if for any two vectors $x, y \in V$ and any scalar $\alpha \in K$, the following two conditions are satisfied:

• $f(x + y) = f(x) + f(y)$

• $f(\alpha x) = \alpha f(x)$

We denote by $\operatorname{Hom}_K(V, W)$ the group of all linear transformations from $V$ to $W$. This group becomes a vector space over $K$ if we define, for each $\alpha \in K$ and $f \in \operatorname{Hom}_K(V, W)$,

\[ (\alpha f)(v) = \alpha f(v), \]

where $v \in V$.

Now we will move forward to matrix representations. You can think about a matrix representation as a way to turn an abstract group into a concrete group of matrices.

Definition 2.1.2

The full complex matrix algebra of degree $d$ is the set of all $d \times d$ matrices with entries in $\mathbb{C}$ and is denoted by $\mathrm{Mat}_d$.

An algebra is a ring $A$ which is also a vector space over a field $K$, called the scalars, such that multiplication with elements in $K$ commutes [6]:

\[ \lambda a = a\lambda \quad \text{and} \quad 1_K 1_A = 1_A, \]

where $\lambda \in K$ and $a \in A$.

Example 2.1.1

The complex numbers form an algebra over the real numbers and it is 2-dimensional. Every complex number can be written, as we already know, in the form $a + bi$. Instead of writing the numbers in this way we can represent them as vectors. The number $a + bi$ corresponds to the vector $(a, b)$, where $a, b \in \mathbb{R}$. Addition is given by

\[ (a, b) + (c, d) = (a + c, b + d). \]

Scalar multiplication is given by

\[ c(a, b) = (ca, cb), \]

and we use complex multiplication

\[ (a + bi) \cdot (c + di) = (ac - bd) + (ad + bc)i \]

to define multiplication of two vectors:

\[ (a, b) \cdot (c, d) = (ac - bd, ad + bc). \]

Example 2.1.2

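The set of matrices of the form

\[ \begin{pmatrix} a & b & 0 & 0 \\ c & d & 0 & 0 \\ 0 & 0 & e & f \\ 0 & 0 & g & h \end{pmatrix} \in \mathrm{Mat}_4 \]

is a subalgebra of the matrix algebra $\mathrm{Mat}_4$. A subalgebra is a linear subspace with the property that if we multiply two elements of the subspace, we are again in the subspace. The above subalgebra has a subspace $A$ which is isomorphic to the set of $2 \times 2$ matrices. The set

\[ A = \left\{ \begin{pmatrix} a & b & 0 & 0 \\ c & d & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \right\} \subset \mathrm{Mat}_4 \]

is a subalgebra of $\mathrm{Mat}_4$ and

\[ A \cong \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \right\} \]

(here $\cong$ means that we have a one-to-one map $\theta$ that is a linear transformation such that $\theta(ab) = \theta(a)\theta(b)$ and $\theta(1) = 1$) by

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto \begin{pmatrix} a & b & 0 & 0 \\ c & d & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \]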

Definition 2.1.3

The group of all $X \in \mathrm{Mat}_d$ that are invertible with respect to multiplication is called the complex general linear group of degree $d$. It is denoted by $GL_d$.

Now we are ready for the precise definition of a matrix representation.

Definition 2.1.4

A matrix representation of a group $G$ is a group homomorphism

\[ Y : G \to GL_d, \]

and we call $d$ the dimension or degree of the representation. It is denoted by $\deg(Y)$ [6].

Two matrix representations $X$ and $X'$ of $G$ are equivalent if they have the same degree $d$ and if there exists a fixed matrix $Y \in GL_d$ such that

\[ X'(g) = Y X(g) Y^{-1}, \quad \forall g \in G. \]

Every group has a trivial matrix representation, namely the one that sends every element $g \in G$ to the unit matrix $(1)$ (the one with 1s on the main diagonal and zeros in every other position). This clearly is a representation, since the mapping $X : G \to \{(1)\}$ is a homomorphism: $X(ab) = (1) = (1)(1) = X(a)X(b)$.

One particular representation of the symmetric group $S_n$, the defining representation of $S_n$, sends each permutation to a matrix of degree $n$ whose entries are either 0 or 1.


Definition 2.1.5

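The defining representation of $S_n$ is the function $X : S_n \to \mathrm{Mat}_n$ defined by the matrices $X(\sigma) = (x_{ij})_{n \times n}$ for $\sigma \in S_n$, where

\[ x_{ij} = \begin{cases} 1 & \text{if } \sigma(j) = i, \\ 0 & \text{otherwise.} \end{cases} \]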

The defining representation will be described in the following example.

Example 2.1.3

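Let $X : S_4 \to \mathrm{Mat}_4$ be defined by the matrices $X(\sigma) = (x_{ij})_{4 \times 4}$ for $\sigma \in S_4$, where

\[ x_{ij} = \begin{cases} 1 & \text{if } \sigma(j) = i, \\ 0 & \text{otherwise.} \end{cases} \]

We will not give all $4! = 24$ matrices, but we will give some:

\[ X((1)) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad X((1, 2)) = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \]

\[ X((1, 3, 2)) = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad X((1, 3)(2, 4)) = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}. \]

The rest of the matrices are easy to create, and we see for example that

\[ X((1, 3, 2)(1, 3)(2, 4)) = X((1, 2, 4)) = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} = X((1, 3, 2))\,X((1, 3)(2, 4)). \]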
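The remaining matrices, and the homomorphism property for all pairs in $S_4$, can be generated mechanically. The Python sketch below builds $X(\sigma)$ from the rule $x_{ij} = 1$ if $\sigma(j) = i$ and compares $X(\sigma\tau)$ with $X(\sigma)X(\tau)$; the tuple encoding and the names `defining_matrix`, `matmul` and `compose` are illustrative assumptions.

```python
from itertools import permutations

def compose(p, q):
    """Product p*q: apply q first, then p (tuples with p[i-1] = image of i)."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

def defining_matrix(perm):
    """X(sigma) with entry (i, j) equal to 1 exactly when sigma(j) = i."""
    n = len(perm)
    return [[1 if perm[j] == i + 1 else 0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

s = (3, 1, 2, 4)    # the cycle (1, 3, 2)
t = (3, 4, 1, 2)    # the product (1, 3)(2, 4)
print(defining_matrix(compose(s, t)))
# [[0, 0, 0, 1], [1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0]]

s4 = list(permutations(range(1, 5)))
is_homomorphism = all(
    defining_matrix(compose(p, q)) == matmul(defining_matrix(p), defining_matrix(q))
    for p in s4 for q in s4)
print(is_homomorphism)  # True
```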

Now we show that $X$, if we define it as in the example above but with $n \times n$ matrices instead, always is a homomorphism, i.e. $X(\sigma\tau) = X(\sigma)X(\tau)$. A permutation $\sigma$ acts on vectors by

\[ \sigma(a_1, a_2, \dots, a_n) = (a_{\sigma(1)}, \dots, a_{\sigma(n)}), \]

so that

\[ \sigma(0, \dots, 0, \underset{i}{1}, 0, \dots, 0) = (0, \dots, 0, \underset{j}{1}, 0, \dots, 0), \]

where the subscripts mark the position of the entry 1 and $j$ is determined by $\sigma(j) = i$. Hence every row and every column of $X(\sigma)$ contains exactly one 1 and the rest is 0. For a given $i$, choose $k$ and $j$ such that $\sigma(k) = i$ and $\tau(j) = k$, so that $\sigma\tau(j) = i$. Now look at

\[ X(\sigma)X(\tau) = \begin{pmatrix} 0 & \cdots & 0 \\ \vdots & 1_{ik} & \vdots \\ 0 & \cdots & 0 \end{pmatrix} \cdot \begin{pmatrix} 0 & \cdots & 0 \\ \vdots & 1_{kj} & \vdots \\ 0 & \cdots & 0 \end{pmatrix} = \begin{pmatrix} 0 & \cdots & 0 \\ \vdots & 1_{ij} & \vdots \\ 0 & \cdots & 0 \end{pmatrix} = X(\sigma\tau), \]

where $1_{ik}$ denotes an entry 1 in position $(i, k)$. This computation illustrates why $X$ is a homomorphism; the proper proof is relatively easy.

We have also seen another type of representation, namely the sign function of the symmetric group.

2.2 Sign representation

We have seen one representation of $S_n$ of degree 1, the trivial representation. There is only one more representation of degree one, the sign representation. This is the homomorphism that sends every permutation $\pi \in S_n$ ($n \geq 2$) to the $1 \times 1$ matrix whose entry is either $1$ or $-1$, depending on $\operatorname{sgn}(\pi)$.

Example 2.2.1

The sign representation of $S_3$ is the function $f : S_3 \to \mathrm{Mat}_1$ defined by $f(\pi) = (\operatorname{sgn}(\pi))$. We see that

\[ \begin{aligned} f((1)) &= (\operatorname{sgn}((1))) = (1), \\ f((1, 2)) &= (\operatorname{sgn}((1, 2))) = (-1), \\ f((1, 3)) &= (\operatorname{sgn}((1, 3))) = (-1), \\ f((2, 3)) &= (\operatorname{sgn}((2, 3))) = (-1), \\ f((1, 2, 3)) &= (\operatorname{sgn}((1, 2, 3))) = (1), \\ f((1, 3, 2)) &= (\operatorname{sgn}((1, 3, 2))) = (1). \end{aligned} \]

Since we know that the composition of two even permutations is even, two odd permutations is even and one even and one odd is odd, f is a homomorphism.


3 Some necessary tools

Soon we are ready to study the representation of the symmetric group using Specht Modules, but first we need to go through some concepts that are needed to understand and create these modules.

3.1 The λ-tableau

Definition 3.1.1

An ordered partition of $n$ is a partition $\lambda = (\lambda_1, \lambda_2, \dots, \lambda_k)$ where $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_k$ and $\sum_{i=1}^{k} \lambda_i = n$.

Let λ = (λ1, λ2, . . . , λk) be a partition of n. If we create an array with k rows and λ1 columns that contain the variables x1, x2, . . . , xn and also let each row contain λi variables (in the first λi columns), we call that array a λ-tableau or a Young-tableau of shape λ [4].

Example 3.1.1

Let $\lambda = (4, 2, 2, 1)$ be a partition of 9. One λ-tableau with the variables $x_1, \dots, x_9$ is:

\[ \begin{array}{cccc} x_1 & x_5 & x_8 & x_9 \\ x_2 & x_6 & & \\ x_3 & x_7 & & \\ x_4 & & & \end{array} \]

For each partition λ of n there are n! different λ-tableaux. The position in which the elements are is called a cell, i.e. if xk is in the i:th row and the j:th column we say that xk is in the cell (i, j).

Definition 3.1.2

A λ-tableau is said to be standard if all variables occur in increasing order (where xi > xj if i > j) from left to right along each row and down each column.

The array in example 3.1.1 is a standard λ-tableau, or a standard Young tableau of shape λ. You can easily count the number of standard Young tableaux of shape λ with the help of the hook-length formula. We will not give the proof of the hook-length formula, but we will give the statement and an example of how to compute with it.

Definition 3.1.3

Let λ be a λ-tableau with cells $(x, y)$. The hook, denoted by $H_\lambda(x, y)$, is the set of cells $(a, b)$ such that $a = x$ and $b \geq y$, or $a \geq x$ and $b = y$. The hook-length $h_\lambda(x, y)$ is the number of cells in $H_\lambda(x, y)$.

The hook-length formula counts the number of standard Young tableaux of shape λ, usually denoted $d_\lambda$. The statement is as follows:

\[ d_\lambda = \frac{n!}{\prod_{(x, y) \in \lambda} h_\lambda(x, y)}. \]


Example 3.1.2

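Let $\lambda = (5, 3, 1)$ be a partition of 9. In the following λ-tableau we write the hook-length of each cell in the cell itself:

\[ \begin{array}{ccccc} 7 & 5 & 4 & 2 & 1 \\ 4 & 2 & 1 & & \\ 1 & & & & \end{array} \]

For example, the hook $H_\lambda(1, 2)$ consists of the cells

\[ H_\lambda(1, 2) = \begin{array}{cccc} 5 & 4 & 2 & 1 \\ 2 & & & \end{array} \]

The hook-length formula tells us that we have

\[ d_\lambda = \frac{9!}{7 \cdot 5 \cdot 4 \cdot 2 \cdot 1 \cdot 4 \cdot 2 \cdot 1 \cdot 1} = 162 \]

different standard λ-tableaux of this shape.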
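The hook-lengths and the resulting count are easy to compute mechanically. The following Python sketch takes a shape given by its row lengths and evaluates the hook-length formula; the function names `hook_lengths` and `count_standard_tableaux` are hypothetical names chosen for this illustration.

```python
from math import factorial

def hook_lengths(shape):
    """Hook-length of every cell of the diagram with row lengths `shape`."""
    # height of column j = number of rows that are longer than j
    heights = [sum(1 for r in shape if r > j) for j in range(shape[0])]
    return [[(shape[i] - j - 1) + (heights[j] - i - 1) + 1  # arm + leg + cell
             for j in range(shape[i])]
            for i in range(len(shape))]

def count_standard_tableaux(shape):
    """d_lambda = n! divided by the product of all hook-lengths."""
    product = 1
    for row in hook_lengths(shape):
        for h in row:
            product *= h
    return factorial(sum(shape)) // product

print(hook_lengths((5, 3, 1)))             # [[7, 5, 4, 2, 1], [4, 2, 1], [1]]
print(count_standard_tableaux((5, 3, 1)))  # 162
```

The integer division is exact because the hook-length formula always yields an integer.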

Example 3.1.3

We will now give an example of a non-standard λ-tableau. Let $\lambda = (4, 2, 2, 1)$ be, as in example 3.1.1, a partition of 9. A non-standard λ-tableau is, for example, the array

\[ \begin{array}{cccc} x_7 & x_3 & x_1 & x_5 \\ x_9 & x_8 & & \\ x_2 & x_4 & & \\ x_6 & & & \end{array} \]

3.2 Modules

To move forward in this thesis we need to have some basic understanding about modules. We begin with the formal definition:

Definition 3.2.1

Let $R$ be a ring and let $1_R$ be its multiplicative identity. A left $R$-module $M$ consists of an abelian group $(M, +)$ and an operation $R \times M \to M$ such that for all $r, s \in R$ and $x, y \in M$ the following holds:

• $r(x + y) = rx + ry$

• $(r + s)x = rx + sx$

• $(rs)x = r(sx)$

• $1_R x = x$

We call the operation of the ring on $M$ scalar multiplication, and we usually just write $rx$ for $r \in R$ and $x \in M$. We can also define a right $R$-module $M$ if we let the ring act on the right side instead of the left, i.e. $M \times R \to M$, and the axioms are written with the scalars $r$ and $s$ to the right of $x$ and $y$. We denote a left $R$-module by ${}_R M$ and a right $R$-module by $M_R$.

Before we move forward we will give some examples of modules.


Example 3.2.1

If $K$ is a field, then a $K$-module and a $K$-vector space are two different names for the same thing. We see this from the definition of a vector space:

• (a + b)v = av + bv

• a(bv) = (ab)v, a, b ∈ K

• a(u + v) = au + av

• 1v = v

The above is exactly the same as the axioms for modules, where v and u are vectors.

Example 3.2.2

Every abelian group $G$ is a $\mathbb{Z}$-module, i.e. a module over the integers. We add and subtract according to the addition of the group $G$. The most important point is that we can multiply any $x \in G$ by an integer $n$. If $n > 0$, then

\[ nx = \underbrace{x + x + \dots + x}_{n \text{ times}}, \]

and if $n < 0$, then

\[ nx = \underbrace{-x - x - \dots - x}_{|n| \text{ times}}, \]

while $0x = 0$.

Example 3.2.3

Let $R$ be any ring. Then the set of all $n$-tuples, $R^n$, where all components are in $R$, is an $R$-module with the usual definitions of scalar multiplication and addition. By usual we mean as in Euclidean space:

\[ r(x_1, x_2, \dots, x_n) = (rx_1, rx_2, \dots, rx_n) \]

and

\[ (x_1, x_2, \dots, x_n) + (y_1, y_2, \dots, y_n) = (x_1 + y_1, x_2 + y_2, \dots, x_n + y_n). \]

When $R$ is a field, this is in fact a vector space.

Example 3.2.4

We will now give two non-trivial examples of modules. Let $M$ be the set of $n \times n$ matrices of the form

\[ \begin{pmatrix} a_1 & 0 & \cdots & 0 \\ a_2 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ a_n & 0 & \cdots & 0 \end{pmatrix}. \]

Then $M$ is a left module over $\mathrm{Mat}_n(R)$, since $A \cdot m \in M$ for all $A \in \mathrm{Mat}_n(R)$ and $m \in M$.

Similarly, let $N$ be the set of $n \times n$ matrices of the form

\[ \begin{pmatrix} a_1 & a_2 & \cdots & a_n \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}. \]

Then $N$ is a right module over $\mathrm{Mat}_n(R)$, since $x \cdot A \in N$ for all $A \in \mathrm{Mat}_n(R)$ and $x \in N$.

3.3 Group rings

Since this thesis is about the group $S_n$, we will mostly consider the group rings $\mathbb{Z}S_n$ and $\mathbb{C}S_n$. These group rings are in fact algebras. All information in this section is taken from the book The Algebraic Structure of Group Rings [3].

Definition 3.3.1

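Let $G$ be a group written multiplicatively and let $R$ be a ring. The group ring $R[G]$ is the ring consisting of all formal sums

\[ \alpha = \sum_{g \in G} a_g \cdot g \]

with $a_g \in R$. Formal here means that there are no relations $\sum_{g \in G} a_g \cdot g = 0$ except when all $a_g = 0$. The identity element in $R[G]$ is the identity element $e$ of $G$, i.e.

\[ e = 1 \cdot e + 0 \cdot g_1 + 0 \cdot g_2 + \cdots, \]

where $g_i \in G$. We can add two elements in the group ring:

\[ \alpha + \beta = \Big( \sum_{\sigma \in G} a_\sigma \cdot \sigma \Big) + \Big( \sum_{\sigma \in G} b_\sigma \cdot \sigma \Big) = \sum_{\sigma \in G} (a_\sigma + b_\sigma)\,\sigma, \]

and we can multiply them:

\[ \alpha \cdot \beta = \Big( \sum_{\sigma \in G} a_\sigma \cdot \sigma \Big) \Big( \sum_{\pi \in G} b_\pi \cdot \pi \Big) = \sum_{\sigma, \pi \in G} (a_\sigma \cdot b_\pi)\,\sigma\pi. \]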

Example 3.3.1

We will now construct the group ring $\mathbb{Z}S_3$:

\[ \mathbb{Z}S_3 = \Big\{ \sum_{\sigma \in S_3} n_\sigma \cdot \sigma \Big\}, \]

where each $n_\sigma$ is an integer. Let for example $\alpha = 24(1, 2) + 17(1, 2, 3)$ and $\beta = 7(1, 2) + 53(1, 3, 2)$. Then

\[ \begin{aligned} \alpha + \beta &= (24(1, 2) + 17(1, 2, 3)) + (7(1, 2) + 53(1, 3, 2)) \\ &= (24 + 7)(1, 2) + 17(1, 2, 3) + 53(1, 3, 2) \\ &= 31(1, 2) + 17(1, 2, 3) + 53(1, 3, 2) \end{aligned} \]

and

\[ \begin{aligned} \alpha \cdot \beta &= (24(1, 2) + 17(1, 2, 3))(7(1, 2) + 53(1, 3, 2)) \\ &= 24 \cdot 7\,(1, 2)(1, 2) + 24 \cdot 53\,(1, 2)(1, 3, 2) + 17 \cdot 7\,(1, 2, 3)(1, 2) + 17 \cdot 53\,(1, 2, 3)(1, 3, 2) \\ &= 168(1) + 1272(1, 3) + 119(1, 3) + 901(1) \\ &= 1069(1) + (1272 + 119)(1, 3) \\ &= 1069(1) + 1391(1, 3). \end{aligned} \]
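Group ring arithmetic of this kind is easy to get wrong by hand, so it is worth recomputing. The following Python sketch stores an element of $\mathbb{Z}S_3$ as a dictionary from permutations to integer coefficients and recomputes $\alpha + \beta$ and $\alpha \cdot \beta$; the dictionary encoding and the names `gr_add`, `gr_mul` and `compose` are assumptions of this sketch.

```python
def compose(p, q):
    """Product p*q: apply q first, then p (tuples with p[i-1] = image of i)."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

def gr_add(a, b):
    """Add two group ring elements stored as {permutation: coefficient}."""
    out = dict(a)
    for g, c in b.items():
        out[g] = out.get(g, 0) + c
    return {g: c for g, c in out.items() if c != 0}

def gr_mul(a, b):
    """Multiply by expanding the double sum over all pairs (sigma, pi)."""
    out = {}
    for s, cs in a.items():
        for t, ct in b.items():
            g = compose(s, t)
            out[g] = out.get(g, 0) + cs * ct
    return {g: c for g, c in out.items() if c != 0}

identity = (1, 2, 3)
t12 = (2, 1, 3)     # (1, 2)
t13 = (3, 2, 1)     # (1, 3)
c123 = (2, 3, 1)    # (1, 2, 3)
c132 = (3, 1, 2)    # (1, 3, 2)

alpha = {t12: 24, c123: 17}
beta = {t12: 7, c132: 53}
print(gr_add(alpha, beta))  # {(2, 1, 3): 31, (2, 3, 1): 17, (3, 1, 2): 53}
print(gr_mul(alpha, beta))  # {(1, 2, 3): 1069, (3, 2, 1): 1391}
```

In the output, the key `(1, 2, 3)` is the identity permutation and `(3, 2, 1)` is the transposition $(1, 3)$.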


We can have some more fun with these group rings. A module $M$ over the group ring $\mathbb{C}S_n$ is the same thing as a linear representation of $S_n$ over $\mathbb{C}$. We will now give an example of this.

Example 3.3.2

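In this example we will consider the group $S_2 = \{(1), (1, 2)\}$. Let $Y$ be the function $Y : S_2 \to GL_2$ defined by

\[ Y(\sigma) = \begin{pmatrix} 1 & 0 \\ 0 & \operatorname{sgn}(\sigma) \end{pmatrix}. \]

Thus we have the two elements

\[ Y((1)) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad Y((1, 2)) = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]

$\mathbb{C}^2$ is a module over the group algebra $\mathbb{C}S_2$ in the following way. Define $Y : \mathbb{C}S_2 \to \mathrm{Mat}_2$ by

\[ \sum_{\sigma \in S_2} n_\sigma \sigma \mapsto \sum_{\sigma \in S_2} n_\sigma Y(\sigma), \]

with $n_\sigma \in \mathbb{C}$. We see that the algebra $Y(\mathbb{C}S_2)$ is the following set of matrices:

\[ Y(\mathbb{C}S_2) = \left\{ a \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + b \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \right\} = \left\{ \begin{pmatrix} a + b & 0 \\ 0 & a - b \end{pmatrix} : a, b \in \mathbb{C} \right\}. \]

The module structure on $\mathbb{C}^2$ is then given as follows: $x \in \mathbb{C}[S_2]$ acts on $v \in \mathbb{C}^2$ by $x \cdot v := Y(x)v$.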

4 Specht Modules

In this section we will study the main goal of this thesis, the Specht Modules.

If nothing else is mentioned, all information will come from the very nice paper Specht Modules and Symmetric Groups written by M. H. Peel [4].

4.1 Specht polynomials and definition of Specht module

To create Specht modules we need to define a family of integral representation modules of $S_n$. We first would like to turn $\mathbb{Z}[x_1, x_2, \dots, x_n]$ into a left $\mathbb{Z}S_n$-module. We can do this by defining

\[ \sigma f(x_1, x_2, \dots, x_n) = f(\sigma x_1, \sigma x_2, \dots, \sigma x_n), \]

where $f$ is a polynomial with integral coefficients and $\sigma \in S_n$. Here $\sigma$ acts on the polynomial by permuting indices, i.e. $\sigma x_i = x_{\sigma(i)}$. Then we can define the action of the group ring $\mathbb{Z}[S_n]$ on polynomials in the following way:

\[ \Big( \sum_{\sigma \in S_n} n_\sigma \sigma \Big) f = \sum_{\sigma \in S_n} n_\sigma (\sigma(f)). \]

This is a module since

• $x(f(x_1, \dots, x_n) + g(x_1, \dots, x_n)) = xf(x_1, \dots, x_n) + xg(x_1, \dots, x_n)$

• $(x + y)f = xf + yf$

• $(xy)f = x(yf)$

• $1f = f$

for $x, y \in \mathbb{Z}S_n$ and $f, g \in \mathbb{Z}[x_1, \dots, x_n]$.

Example 4.1.1

Let $x = (1) + (1, 2)$ and $f = f(x_1, x_2)$. Then

\[ x \cdot f = ((1) + (1, 2))f(x_1, x_2) = f(x_1, x_2) + f(x_2, x_1). \]

Now let $g = g(x_1, x_2)$. Then

\[ x(f + g) = xf + xg = ((1) + (1, 2))f(x_1, x_2) + ((1) + (1, 2))g(x_1, x_2) = f(x_1, x_2) + f(x_2, x_1) + g(x_1, x_2) + g(x_2, x_1). \]

To make things easier to follow we will work with an example. Let $\lambda = (4, 2, 2, 1)$ be a partition of 9. Then a standard λ-tableau is (as in example 3.1.1):

\[ \begin{array}{cccc} x_1 & x_5 & x_8 & x_9 \\ x_2 & x_6 & & \\ x_3 & x_7 & & \\ x_4 & & & \end{array} \]

From now on we will denote a λ-tableau by $y$ or $z$, and $\{a_1, \dots, a_n\}$, $\{b_1, \dots, b_n\}$ will be subsets of $\{x_1, \dots, x_n\}$. We define a function $f_y(x_1, \dots, x_n)$ by taking the product of differences in each column. Our example above gives the function

\[ f_y(x_1, \dots, x_9) = (x_2 - x_1)(x_3 - x_1)(x_4 - x_1)(x_3 - x_2)(x_4 - x_2)(x_4 - x_3)(x_6 - x_5)(x_7 - x_5)(x_7 - x_6), \]

in which $x_8$ and $x_9$ do not appear because each of them is alone in its column. Formally we define the expression

\[ \Delta(a_1, \dots, a_t) = \prod_{i > j} (a_i - a_j) \]

for the entries $a_1, \dots, a_t$ of the $k$th column when $t > 1$. If $t = 1$, $\Delta(a_1) = 1$. The function $f_y(x_1, \dots, x_n)$ is defined by taking the product of all these products (for every column $k$). We can permute every λ-tableau (λ is a partition of $n$) in $n!$ ways. We permute the tableau $y$ by replacing $x_i$ with $\sigma x_i = x_{\sigma(i)}$ for $\sigma \in S_n$. This permutation of λ-tableaux satisfies

\[ \sigma f_y(x_1, \dots, x_n) = f_{\sigma(y)} = f_y(\sigma x_1, \dots, \sigma x_n) = f_y(x_{\sigma(1)}, \dots, x_{\sigma(n)}). \]

From this way of constructing our function $f_y(x_1, \dots, x_n)$, which is called a Specht polynomial, it follows that the set of all $\mathbb{Z}$-linear combinations of these polynomials is a cyclic $\mathbb{Z}S_n$-module.


Example 4.1.2

Let $\lambda = (2, 1)$ be a partition of 3 and consider the following tableau $y$:

\[ \begin{array}{cc} x_1 & x_3 \\ x_2 & \end{array} \]

Thus $f_y(x_1, x_2, x_3) = x_2 - x_1$. The module $\mathbb{Z}S_3 f$ is cyclic, i.e. generated by the single element $f = f_y$ over $\mathbb{Z}S_3$, and we want to describe its elements explicitly. Let us consider the permutations of our Specht polynomial $f$:

\[ \begin{aligned} (1)f &= f, \\ (1, 2)f &= x_1 - x_2 = -f, \\ (1, 3)f &= x_2 - x_3, \\ (2, 3)f &= x_3 - x_1, \\ (1, 2, 3)f &= x_3 - x_2, \\ (1, 3, 2)f &= x_1 - x_3. \end{aligned} \]

We now make a linear combination of the polynomials above, for example:

\[ f + 5 \cdot (1, 3)f + 9 \cdot (1, 3, 2)f = x_2 - x_1 + 5x_2 - 5x_3 + 9x_1 - 9x_3 = 8x_1 + 6x_2 - 14x_3 = 6(x_2 - x_1) - 14(x_3 - x_1). \]

Note that $8 + 6 - 14 = 0$. In fact all linear combinations of the polynomials $\sigma f_y$ can be written in the form above. Thus the module $\mathbb{Z}S_3 f$ is generated in the following way:

\[ \begin{aligned} \mathbb{Z}S_3 f &= \Big\{ \sum_{\sigma \in S_3} n_\sigma \sigma f : n_\sigma \in \mathbb{Z} \Big\} \\ &= \{ a(x_2 - x_1) + b(x_3 - x_1) : a, b \in \mathbb{Z} \} \\ &\overset{(*)}{=} \{ a_1x_1 + a_2x_2 + a_3x_3 : a_i \in \mathbb{Z},\ a_1 + a_2 + a_3 = 0 \}. \end{aligned} \]

We see that $(*)$ holds since

\[ a(x_2 - x_1) + b(x_3 - x_1) = -(a + b)x_1 + ax_2 + bx_3, \]

with $a_1 = -(a + b)$, $a_2 = a$ and $a_3 = b$. We do not need to use the generating set $\{x_2 - x_1, x_3 - x_1\}$; we can change it to (almost) whatever we want. For example $\{x_2 - x_3, x_1 - x_3\}$ works equally well. If we had used this basis in the calculations above we would have obtained:

\[ f + 5 \cdot (1, 3)f + 9 \cdot (1, 3, 2)f = x_2 - x_1 + 5x_2 - 5x_3 + 9x_1 - 9x_3 = 8x_1 + 6x_2 - 14x_3 = 6(x_2 - x_3) + 8(x_1 - x_3). \]
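Because the Specht polynomial of this example is linear, the computation can be replayed with coefficient vectors. The Python sketch below stores $c_1x_1 + c_2x_2 + c_3x_3$ as the list `[c1, c2, c3]` and lets a permutation act through $x_i \mapsto x_{\sigma(i)}$; this encoding and the name `act` are assumptions of the sketch, and the shortcut only works for linear polynomials.

```python
from itertools import permutations

def act(perm, coeffs):
    """Apply sigma to sum_i c_i x_i via x_i -> x_{sigma(i)}; perm[i-1] = sigma(i)."""
    out = [0] * len(coeffs)
    for i, c in enumerate(coeffs):
        out[perm[i] - 1] = c
    return out

f = [-1, 1, 0]       # f_y = x_2 - x_1
s13 = (3, 2, 1)      # the transposition (1, 3)
c132 = (3, 1, 2)     # the cycle (1, 3, 2): 1 -> 3, 3 -> 2, 2 -> 1

combo = [a + 5 * b + 9 * c
         for a, b, c in zip(f, act(s13, f), act(c132, f))]
print(combo)         # [8, 6, -14]

# Every sigma f has coefficient sum zero, matching the description (*)
orbit = {tuple(act(p, f)) for p in permutations((1, 2, 3))}
print(all(sum(v) == 0 for v in orbit))  # True
```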

Definition 4.1.1

The modules $\mathbb{Z}S_n f_y$ are called Specht modules corresponding to the partition λ, and hence the tableau $y$:

\[ S^y := \mathbb{Z}S_n f_y. \]

Later in this thesis we will prove the following theorem. Note that we can change the ring Z to C whenever we want, without any changes in our arguments.


Theorem 4.1.1

$S^y$ has a $\mathbb{Z}$-basis

\[ B^y = \{ f_z(x_1, \dots, x_n) : z \text{ is a standard } \lambda\text{-tableau} \}. \]

Remark: We can construct Specht modules over any integral domain $K$ with the following definition.

Definition 4.1.2

$S_K^y := K \otimes_{\mathbb{Z}} S^y$ is called a Specht module over $K$ ($S_K^y$ is a $KS_n$-module) for an arbitrary integral domain $K$.

We will only consider the domains $K = \mathbb{Z}$ and $K = \mathbb{C}$; using $K = \mathbb{C}$ just amounts to replacing every occurrence of $\mathbb{Z}$ by $\mathbb{C}$ as coefficients.

4.2 $M^y$, a permutation representation module

It will be useful to have the sets of all row permutations and column permutations of any λ-tableau. A row permutation is exactly what it sounds like: if $y$ is a λ-tableau, then $\sigma \in S_n$ is called a row permutation if $x_i$ and $\sigma x_i = x_{\sigma(i)}$ are found in the same row of $y$, for every $i$. A column permutation is a permutation $\pi \in S_n$ such that for each $k$, $x_k$ and $\pi x_k = x_{\pi(k)}$ occur in the same column of $y$.

Definition 4.2.1

R(y) is the set of all row permutations of y. C(y) is the set of all column permutations of y.

Example 4.2.1

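Let $\lambda = (3, 2, 1, 1)$ be a partition of 7 and let

\[ y = \begin{array}{ccc} x_1 & x_5 & x_7 \\ x_2 & x_6 & \\ x_3 & & \\ x_4 & & \end{array} \]

The sets of all row and column permutations of $y$ are the following:

\[ R(y) = S_{\{1,5,7\}} \times S_{\{2,6\}} \times S_{\{3\}} \times S_{\{4\}}, \]

\[ C(y) = S_{\{1,2,3,4\}} \times S_{\{5,6\}} \times S_{\{7\}}. \]

Here $S_{\{1,5,7\}}$ is the group of all bijections from $\{1, 5, 7\}$ to itself. Examples of elements in $R(y)$: $(1, 5)$, $(1, 7, 5)$, $(1, 5)(2, 6)$.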
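For a small tableau, the sets $R(y)$ and $C(y)$ can be listed by filtering all of $S_n$. The Python sketch below does this for the tableau of this example, whose rows are $\{1, 5, 7\}$, $\{2, 6\}$, $\{3\}$, $\{4\}$; the function name `stabilizer` and the tuple encoding of permutations are assumptions made for the illustration.

```python
from itertools import permutations

rows = [[1, 5, 7], [2, 6], [3], [4]]      # indices of the variables, row by row
n = sum(len(r) for r in rows)
columns = [[r[j] for r in rows if len(r) > j] for j in range(len(rows[0]))]

def stabilizer(blocks, n):
    """All permutations of 1..n that map every block to itself."""
    block_of = {x: b for b, block in enumerate(blocks) for x in block}
    return [p for p in permutations(range(1, n + 1))
            if all(block_of[p[i - 1]] == block_of[i] for i in range(1, n + 1))]

R = stabilizer(rows, n)       # row permutations R(y)
C = stabilizer(columns, n)    # column permutations C(y)
print(columns)                # [[1, 2, 3, 4], [5, 6], [7]]
print(len(R), len(C))         # 12 48
```

The sizes agree with the product descriptions: $3! \cdot 2! \cdot 1! \cdot 1! = 12$ and $4! \cdot 2! \cdot 1! = 48$.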

Now let $g_y(x_1, \dots, x_n)$ be defined by

\[ g_y(x_1, \dots, x_n) = \prod_{k=1}^{n} x_k^{\theta(k)}, \]

where $\theta(k) = i - 1$ when $x_k$ is in the $i$th row of the tableau $y$. We have that

\[ \sigma g_y(x_1, \dots, x_n) = g_{\sigma(y)} = g_y(\sigma x_1, \dots, \sigma x_n) = g_y(x_{\sigma(1)}, \dots, x_{\sigma(n)}). \]


Example 4.2.2

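Let $\lambda = (3, 2, 1)$ be a partition of 6, $\sigma = (1, 3, 5)(2, 4, 6) \in S_6$ and let $y$ be the tableau:

\[ \begin{array}{ccc} x_1 & x_2 & x_3 \\ x_4 & x_5 & \\ x_6 & & \end{array} \]

Then

\[ g_y(x_1, \dots, x_6) = (x_1^0 \cdot x_2^0 \cdot x_3^0)(x_4^1 \cdot x_5^1)(x_6^2) \]

and

\[ \sigma g_y(x_1, \dots, x_6) = (x_3^0 \cdot x_4^0 \cdot x_5^0)(x_6^1 \cdot x_1^1)(x_2^2). \]

Thus $\sigma y$ is the tableau

\[ \begin{array}{ccc} x_3 & x_4 & x_5 \\ x_6 & x_1 & \\ x_2 & & \end{array} \]

which shows that

\[ g_y(\sigma x_1, \dots, \sigma x_6) = (x_3^0 \cdot x_4^0 \cdot x_5^0)(x_6^1 \cdot x_1^1)(x_2^2) = g_{\sigma(y)}(x_1, \dots, x_6). \]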

The monomials $g_y(x_1, \dots, x_n)$ are very useful because they let us define the permutation representation module:

\[ M^y := \mathbb{Z}S_n g_y(x_1, \dots, x_n). \]

One can show that $S^y \subseteq M^y$ with the help of the following definition. Let $Y$ be any subset of $S_n$ and define

\[ \alpha(Y) := \sum_{\pi \in Y} \operatorname{sgn}(\pi) \cdot \pi. \tag{1} \]

Theorem 4.2.1

\[ f_y(x_1, \dots, x_n) = \alpha(C(y))\, g_y(x_1, \dots, x_n). \]

Before the proof we need a lemma.

Lemma 4.2.1

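Let $p_y$ be the Specht polynomial corresponding to the tableau $y$, let $d = \deg(p_y)$, and let $P_d$ be the set of all homogeneous polynomials of degree $d$. Suppose that $h \in P_d$ has the property that $\sigma h = -h$ for all transpositions $\sigma \in C(y)$. Then there exists $c \in \mathbb{C}$ such that $h = c \cdot p_y$. In particular, up to scalar multiples, $p_y$ is the only polynomial in $P_d$ with the property $\sigma p_y = -p_y$ for all transpositions $\sigma \in C(y)$.

To clarify:

\[ P_d = \Big\{ \sum_{I} \alpha_I x^I \Big\}, \]

where $I := (i_1, \dots, i_n)$, $\sum_{j=1}^{n} i_j = d$ and $\alpha_I \in \mathbb{C}$.

Proof. Let us consider $h(x_1, x_2, \dots, x_n) \in P_d$ and suppose that $\sigma = (1, 2) \in C(y)$. Then

\[ h(x_1, x_2, \dots, x_n) = -h(x_2, x_1, \dots, x_n). \]

Let $x_1 = x_2 = t$; then

\[ h(t, t, x_3, \dots, x_n) = -h(t, t, x_3, \dots, x_n) \implies h(t, t, x_3, \dots, x_n) = 0. \]

This implies that $(x_1 - x_2) \mid h(x_1, x_2, \dots, x_n)$. To see this, put $u = x_1 - x_2$ and regard $h$ as a polynomial in $u$:

\[ h \in \mathbb{C}[x_1, \dots, x_n] = \mathbb{C}[x_1 - x_2, x_2, \dots, x_n] = \mathbb{C}[x_2, \dots, x_n][u] = K[u]. \]

We can use the division algorithm and write $h(x_1, \dots, x_n) = u \cdot k + r$ with $r \in K$. Setting $u = 0$, that is $x_1 = x_2$, makes $h$ vanish, so

\[ 0 = 0 \cdot k + r \implies r = 0, \]

and thus $u = (x_1 - x_2) \mid h$. If we now take any transposition $\sigma = (i, j) \in C(y)$ we get, with the same arguments as above, that $(x_i - x_j) \mid h$. Since the factors $(x_i - x_j)$ are pairwise relatively prime, their product

\[ \prod_{(i, j) \in C(y),\ i > j} (x_i - x_j), \]

which equals $p_y(x_1, \dots, x_n)$ up to sign, divides $h$. Since $h$ and $p_y$ are both homogeneous of degree $d$, this means that $h = c \cdot p_y$.

Now we prove theorem 4.2.1.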

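Proof. Let $\sigma \in C(y)$. Then

\[ \sigma\,\alpha(C(y)) = \sigma \sum_{\tau \in C(y)} \operatorname{sgn}(\tau)\,\tau = \sum_{\tau \in C(y)} \operatorname{sgn}(\tau)\,\sigma\tau = \operatorname{sgn}(\sigma) \sum_{\tau' \in C(y)} \operatorname{sgn}(\tau')\,\tau', \]

where we have substituted $\tau' = \sigma\tau$ and used $\operatorname{sgn}(\tau) = \operatorname{sgn}(\sigma)\operatorname{sgn}(\tau')$. Now let $h = \alpha(C(y))g_y$. If $\sigma$ is a transposition we get that $\sigma h = -h$. Thus, by lemma 4.2.1, we have that $h = c \cdot f_y$. Since both $h$ and $f_y$ contain the monomial $g_y$ with coefficient 1, $c = 1$.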

Example 4.2.3

Let λ = (3, 2) be a partition of 5 and let y be the tableau x1 x2 x3

x4 x5

then

fy(x1, . . . , x5) = (x4− x¹)(x5− x²) = x4x5− x²x4− x¹x5+ x1x2

and

C(y) ={(1), (1, 4), (2, 5), (1, 4)(2, 5)}

(26)

so $\alpha(C(y)) = (1) - (1,4) - (2,5) + (1,4)(2,5)$ and $g_y(x_1, \dots, x_5) = x_4x_5$. Thus

$$\begin{aligned}
\alpha(C(y))\,g_y(x_1, \dots, x_5) &= \big((1) - (1,4) - (2,5) + (1,4)(2,5)\big)\,x_4x_5 \\
&= (1)x_4x_5 - (1,4)x_4x_5 - (2,5)x_4x_5 + (1,4)(2,5)x_4x_5 \\
&= x_4x_5 - x_1x_5 - x_2x_4 + x_1x_2 = f_y(x_1, \dots, x_5).
\end{aligned}$$

Since $f_y(x_1, \dots, x_n) = \alpha(C(y))\,g_y(x_1, \dots, x_n)$, we have $S^y \subseteq M^y$. We can also see that $\sigma g_y(x_1, \dots, x_n) = g_y(x_1, \dots, x_n)$ if and only if $\sigma \in R(y)$. We will soon be able to prove theorem 4.1.1.
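The identity $f_y = \alpha(C(y))\,g_y$ from Example 4.2.3 can also be checked mechanically. The following is a minimal sketch, not part of the original text: polynomials are stored as dictionaries from exponent tuples to integer coefficients, a permutation is a tuple whose entry at position $i$ is the image of $i$, and all function names are hypothetical. It assumes that $g_y$ is the monomial in which a variable in row $r$ (counted from 0) has exponent $r$, which agrees with $g_y = x_4x_5$ above.

```python
from itertools import permutations, product

def sign(p):
    """Sign of a permutation p, where p[i-1] is the image of i."""
    inversions = sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))
    return -1 if inversions % 2 else 1

def act(p, poly):
    """Apply p to a polynomial {exponent tuple: coefficient}: x_i -> x_{p(i)}."""
    out = {}
    for exps, coeff in poly.items():
        new = [0] * len(exps)
        for i, e in enumerate(exps, start=1):
            new[p[i - 1] - 1] = e
        out[tuple(new)] = coeff
    return out

def add(f, g, scale=1):
    """Return f + scale * g, dropping zero coefficients."""
    out = dict(f)
    for exps, coeff in g.items():
        out[exps] = out.get(exps, 0) + scale * coeff
    return {e: c for e, c in out.items() if c != 0}

def mul(f, g):
    """Product of two polynomials."""
    out = {}
    for ea, ca in f.items():
        for eb, cb in g.items():
            e = tuple(a + b for a, b in zip(ea, eb))
            out[e] = out.get(e, 0) + ca * cb
    return {e: c for e, c in out.items() if c != 0}

def var(i, n):
    """The polynomial x_i in n variables."""
    return {tuple(int(j == i) for j in range(1, n + 1)): 1}

def columns(tableau):
    """Columns of a tableau given as a list of rows of variable indices."""
    return [[row[j] for row in tableau if len(row) > j] for j in range(len(tableau[0]))]

def specht(tableau, n):
    """f_y: product of (x_lower - x_upper) over all pairs in the same column."""
    f = {(0,) * n: 1}
    for col in columns(tableau):
        for a in range(len(col)):
            for b in range(a + 1, len(col)):
                f = mul(f, add(var(col[b], n), var(col[a], n), scale=-1))
    return f

def column_group(tableau, n):
    """All permutations in C(y), as tuples of images of 1..n."""
    cols = columns(tableau)
    group = []
    for images in product(*(permutations(c) for c in cols)):
        p = list(range(1, n + 1))
        for col, img in zip(cols, images):
            for i, j in zip(col, img):
                p[i - 1] = j
        group.append(tuple(p))
    return group

def g_monomial(tableau, n):
    """Assumed g_y: a variable in row r (counted from 0) gets exponent r."""
    exps = [0] * n
    for r, row in enumerate(tableau):
        for i in row:
            exps[i - 1] = r
    return {tuple(exps): 1}

y = [[1, 2, 3], [4, 5]]   # tableau of Example 4.2.3, entries are the indices of x_i
n = 5
antisymmetrized = {}
for p in column_group(y, n):
    antisymmetrized = add(antisymmetrized, act(p, g_monomial(y, n)), scale=sign(p))

print(antisymmetrized == specht(y, n))  # True
```

The dictionary representation keeps the check exact (integer arithmetic only) and needs nothing beyond the standard library.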

4.3 Construction of a basis for S^y

Before we begin we need some help from a theorem and a corollary. Let $t \geq s$ and let $\{a_1, \dots, a_t, b_1, \dots, b_s\} \subseteq \{x_1, \dots, x_n\}$. If we fix our attention on two columns, say the columns $m$ and $n$ with $m < n$, we consider the tableau

$$y = \begin{array}{ccccc}
\cdots & a_1 & \cdots & b_1 & \cdots \\
\cdots & a_2 & \cdots & b_2 & \cdots \\
 & \vdots & & \vdots & \\
\cdots & a_s & \cdots & b_s & \\
 & \vdots & & & \\
\cdots & a_t & & &
\end{array}$$

Our first theorem states a relation involving $f_z(a_1, \dots, a_t, b_1, \dots, b_s)$, where $z$ is obtained from $y$ by permuting $a_1, \dots, a_t, b_1, \dots, b_s$ amongst themselves. If we let $k$ be an integer such that $1 \leq k \leq s$ and $S$ be the group of permutations of $\{a_k, \dots, a_t, b_1, \dots, b_k\}$, we have the following theorem.

Theorem 4.3.1 α(S)fy(x1, . . . , xn) = 0.

Proof. The proof can be read in detail in [4] p.92.

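Example 4.3.1 Let $y$ be the tableau

$$\begin{array}{cc} x_1 & x_2 \\ x_3 & \end{array}$$

then $f_y(x_1, x_2, x_3) = x_3 - x_1$. If we let

$$S = \{(1), (1,2), (1,3), (2,3), (1,2,3), (1,3,2)\}$$

we see that

$$\begin{aligned}
\alpha(S) f_y(x_1, x_2, x_3) &= \sum_{\sigma \in S} \operatorname{sgn}(\sigma)\,\sigma \cdot f_y \\
&= \big((1) - (1,2) - (1,3) - (2,3) + (1,2,3) + (1,3,2)\big)(x_3 - x_1) \\
&= (x_3 - x_1) - (x_3 - x_2) - (x_1 - x_3) - (x_2 - x_1) + (x_1 - x_2) + (x_2 - x_3) \\
&= (2x_3 - 2x_3) + (2x_2 - 2x_2) + (2x_1 - 2x_1) = 0.
\end{aligned}$$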
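Theorem 4.3.1 can be tested on a slightly larger tableau in the same way. The sketch below is an illustration under stated assumptions rather than part of the original argument: it reuses the tableau of Example 4.2.3, takes the first two columns (so $a = (x_1, x_4)$ and $b = (x_2, x_5)$), and checks both admissible values $k = 1$ and $k = 2$. Polynomials are dictionaries from exponent tuples to integer coefficients, and the helper names (`specht`, `alpha_applied`, and so on) are hypothetical.

```python
from itertools import permutations

def sign(p):
    """Sign of a permutation p, where p[i-1] is the image of i."""
    inversions = sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))
    return -1 if inversions % 2 else 1

def act(p, poly):
    """Apply p to a polynomial {exponent tuple: coefficient}: x_i -> x_{p(i)}."""
    out = {}
    for exps, coeff in poly.items():
        new = [0] * len(exps)
        for i, e in enumerate(exps, start=1):
            new[p[i - 1] - 1] = e
        out[tuple(new)] = coeff
    return out

def add(f, g, scale=1):
    """Return f + scale * g, dropping zero coefficients."""
    out = dict(f)
    for exps, coeff in g.items():
        out[exps] = out.get(exps, 0) + scale * coeff
    return {e: c for e, c in out.items() if c != 0}

def mul(f, g):
    """Product of two polynomials."""
    out = {}
    for ea, ca in f.items():
        for eb, cb in g.items():
            e = tuple(a + b for a, b in zip(ea, eb))
            out[e] = out.get(e, 0) + ca * cb
    return {e: c for e, c in out.items() if c != 0}

def specht(tableau, n):
    """f_y: product of (x_lower - x_upper) over all pairs in the same column."""
    f = {(0,) * n: 1}
    for j in range(len(tableau[0])):
        col = [row[j] for row in tableau if len(row) > j]
        for a in range(len(col)):
            for b in range(a + 1, len(col)):
                lower = tuple(int(i == col[b]) for i in range(1, n + 1))
                upper = tuple(int(i == col[a]) for i in range(1, n + 1))
                f = mul(f, {lower: 1, upper: -1})
    return f

def alpha_applied(subset, f, n):
    """alpha(S) f: sum of sgn(s) * (s f) over all permutations s of `subset`."""
    subset = sorted(subset)
    total = {}
    for images in permutations(subset):
        p = list(range(1, n + 1))
        for i, j in zip(subset, images):
            p[i - 1] = j
        total = add(total, act(p, f), scale=sign(p))
    return total

# Example 4.3.1: S is the full symmetric group on {x1, x2, x3}
example_431 = alpha_applied({1, 2, 3}, specht([[1, 2], [3]], 3), 3)

# Tableau of Example 4.2.3, columns 1 and 2: a = (x1, x4), b = (x2, x5)
f = specht([[1, 2, 3], [4, 5]], 5)
k1 = alpha_applied({1, 4, 2}, f, 5)   # k = 1: {a_1, a_2, b_1}
k2 = alpha_applied({4, 2, 5}, f, 5)   # k = 2: {a_2, b_1, b_2}

print(example_431 == {}, k1 == {}, k2 == {})  # True True True
```

An empty dictionary represents the zero polynomial, so all three sums vanish as the theorem predicts.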

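Let $H$ be the set of all permutations $\sigma \in S$ such that

$$\operatorname{sgn}(\sigma) \cdot \sigma f_y(x_1, \dots, x_n) = f_y(x_1, \dots, x_n). \qquad (2)$$

Then $H = S \cap C(y)$, which we can show with the following arguments. If $\sigma \in C(y)$, then

$$\sigma f_y(x_1, \dots, x_n) = \operatorname{sgn}(\sigma) f_y(x_1, \dots, x_n). \qquad (3)$$

This follows from $f_y = \alpha(C(y))\,g_y$ together with the computation of $\sigma\,\alpha(C(y))$ in the proof of theorem 4.2.1. If $\sigma \in S \cap C(y)$, then (2) holds, which is deduced from (3) by multiplying with $\operatorname{sgn}(\sigma)$. We now need to show that if $\sigma$ satisfies (2) and $\sigma \in S$ then $\sigma \in C(y)$. Let $A = \{a_k, \dots, a_t\}$ and $B = \{b_1, \dots, b_k\}$. If $\sigma \notin C(y)$, we see that $\sigma f_y$ cannot be equal to $\operatorname{sgn}(\sigma) f_y$, since the part of the polynomial from the column where the $a_i$ appear has more factors than the part where the $b_i$ occur. This will be illustrated in the following example.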

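Example 4.3.2 Let $y$ be the tableau

$$\begin{array}{cc} x_3 & x_2 \\ x_1 & x_5 \\ x_4 & \end{array}$$

Let $S$ be the group of permutations of $\{x_1, x_4, x_2, x_5\}$, $A = \{x_1, x_4\}$ and $B = \{x_2, x_5\}$. If $\sigma = (1,2)(4,5)$ we see that

$$\begin{aligned}
f_y(x_1, \dots, x_5) &= (x_4 - x_1)(x_4 - x_3)(x_1 - x_3)(x_5 - x_2), \\
\sigma f_y(x_1, \dots, x_5) &= (x_5 - x_2)(x_5 - x_3)(x_2 - x_3)(x_4 - x_1),
\end{aligned}$$

thus $\sigma f_y \neq \operatorname{sgn}(\sigma) f_y = f_y$. This will always be the case, since the factor $(b_j - x)$, where $x$ represents the element in the first row of the column where the $a_i$ occur, will never occur in $\operatorname{sgn}(\sigma) f_y$.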

In the case where the columns we look at only have one row each, it is easy to see that $\sigma \in C(y)$ (in this case it is the identity permutation), since otherwise $\operatorname{sgn}(\sigma)\sigma f_y = -f_y$. Thus $H = S \cap C(y)$ is a subgroup of $S$. If we let $W$ be a set of left coset representatives of $H$ in $S$ that contains the identity permutation, we have the following corollary.

Corollary 4.3.1

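$$f_y(x_1, \dots, x_n) = -\sum \operatorname{sgn}(w) f_y(wx_1, \dots, wx_n)$$

where we sum over all $w \in W \setminus \{(1)\}$.

Proof. The statement above is equivalent to

$$\sum_{w \in W} \operatorname{sgn}(w) f_y(wx_1, \dots, wx_n) = 0$$

and this is deduced from theorem 4.3.1 by cancelling the factor $|H|$: every $\sigma \in S$ can be written uniquely as $\sigma = wh$ with $w \in W$ and $h \in H$, and by (2) each of the $|H|$ elements of the coset $wH$ contributes the same term $\operatorname{sgn}(w) f_y(wx_1, \dots, wx_n)$ to $\alpha(S) f_y$.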
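The objects $H$ and $W$ and the relation of Corollary 4.3.1 can be computed explicitly for the tableau of Example 4.3.2. What follows is a hedged sketch with hypothetical helper names: polynomials are dictionaries from exponent tuples to integer coefficients, $S$ is taken to be the group of permutations of $\{x_1, x_4, x_2, x_5\}$ (the case $k = 2$), and the coset representatives are simply the first element found in each left coset, which is only one of many valid choices of $W$.

```python
from itertools import permutations

def sign(p):
    """Sign of a permutation p, where p[i-1] is the image of i."""
    inversions = sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))
    return -1 if inversions % 2 else 1

def act(p, poly):
    """Apply p to a polynomial {exponent tuple: coefficient}: x_i -> x_{p(i)}."""
    out = {}
    for exps, coeff in poly.items():
        new = [0] * len(exps)
        for i, e in enumerate(exps, start=1):
            new[p[i - 1] - 1] = e
        out[tuple(new)] = coeff
    return out

def add(f, g, scale=1):
    """Return f + scale * g, dropping zero coefficients."""
    out = dict(f)
    for exps, coeff in g.items():
        out[exps] = out.get(exps, 0) + scale * coeff
    return {e: c for e, c in out.items() if c != 0}

def mul(f, g):
    """Product of two polynomials."""
    out = {}
    for ea, ca in f.items():
        for eb, cb in g.items():
            e = tuple(a + b for a, b in zip(ea, eb))
            out[e] = out.get(e, 0) + ca * cb
    return {e: c for e, c in out.items() if c != 0}

def columns(tableau):
    """Columns of a tableau given as a list of rows of variable indices."""
    return [[row[j] for row in tableau if len(row) > j] for j in range(len(tableau[0]))]

def specht(tableau, n):
    """f_y: product of (x_lower - x_upper) over all pairs in the same column."""
    f = {(0,) * n: 1}
    for col in columns(tableau):
        for a in range(len(col)):
            for b in range(a + 1, len(col)):
                lower = tuple(int(i == col[b]) for i in range(1, n + 1))
                upper = tuple(int(i == col[a]) for i in range(1, n + 1))
                f = mul(f, {lower: 1, upper: -1})
    return f

def compose(p, q):
    """The permutation 'first q, then p' (multiplication from right to left)."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

def perms_of(subset, n):
    """All permutations of 1..n that move only the indices in `subset`."""
    subset = sorted(subset)
    for images in permutations(subset):
        p = list(range(1, n + 1))
        for i, j in zip(subset, images):
            p[i - 1] = j
        yield tuple(p)

n = 5
y = [[3, 2], [1, 5], [4]]             # tableau of Example 4.3.2
f = specht(y, n)
S = list(perms_of({1, 4, 2, 5}, n))    # k = 2: {a_2, a_3, b_1, b_2}

# H: all s in S satisfying condition (2), sgn(s) * (s f) = f
H = [s for s in S if add({}, act(s, f), scale=sign(s)) == f]

# S ∩ C(y): permutations in S that map every column of y to itself
in_C = [s for s in S if all({s[i - 1] for i in col} == set(col) for col in columns(y))]

# One representative per left coset wH, with the identity chosen first
identity = tuple(range(1, n + 1))
W, covered = [], set()
for w in [identity] + S:
    if w not in covered:
        W.append(w)
        covered.update(compose(w, h) for h in H)

# Right-hand side of Corollary 4.3.1: minus the signed sum over W without (1)
rhs = {}
for w in W[1:]:
    rhs = add(rhs, act(w, f), scale=-sign(w))

print(len(S), len(H), len(W))              # 24 4 6
print(sorted(H) == sorted(in_C), rhs == f)  # True True
```

Because $\operatorname{sgn}(wh)\,wh f_y = \operatorname{sgn}(w)\,w f_y$ for $h \in H$, a different choice of representatives would give the same right-hand side.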

We want to prove that $B^y = \{f_y(x_1, \dots, x_n) : y \text{ is a standard } \lambda\text{-tableau}\}$ generates $S^y$ over $\mathbb{Z}$. We will write $x_a > x_b$ when $a > b$. We now construct two $\lambda$-tableaux in the following way. Let $a = (a_1, \dots, a_n)$ and $b = (b_1, \dots, b_n)$ be permutations of $(x_1, \dots, x_n)$, and let $y_1$ be the tableau in which the variables $a_1, \dots, a_n$ appear in this order from top to bottom in the first column, then down the second column and so on. Let $y_2$ be constructed in the same way from $b$.

Example 4.3.3

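Let $a = (x_3, x_2, x_5, x_1, x_4)$ and $b = (x_3, x_2, x_1, x_5, x_4)$ be permutations of $(x_1, x_2, x_3, x_4, x_5)$ and let $\lambda = (3,2)$ be a partition of 5. Then our two tableaux $y_1$ and $y_2$ look like this:

$$y_1 = \begin{array}{ccc} a_1 & a_3 & a_5 \\ a_2 & a_4 & \end{array} = \begin{array}{ccc} x_3 & x_5 & x_4 \\ x_2 & x_1 & \end{array}$$

and

$$y_2 = \begin{array}{ccc} b_1 & b_3 & b_5 \\ b_2 & b_4 & \end{array} = \begin{array}{ccc} x_3 & x_1 & x_4 \\ x_2 & x_5 & \end{array}$$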

It will be useful to order these tableaux. We say that $y_1 > y_2$ if there exists an integer $m$ such that $a_k = b_k$ for $k = 1, 2, \dots, m-1$ and $a_m > b_m$. In the example above $y_1 > y_2$, since $a_1 = x_3 = b_1$, $a_2 = x_2 = b_2$ and $a_3 = x_5 > x_1 = b_3$. The set of all $\lambda$-tableaux is totally ordered by this relation. Now we will consider those $f_y$ for tableaux $y$ in which the variables $x_i$ occur in increasing order from top to bottom in each column. Note that this is not the same thing as a standard $\lambda$-tableau, recall definition 3.1.2. These polynomials $f_y$ generate $S^y$ over $\mathbb{Z}$, since by (3) reordering the entries within a column changes $f_y$ at most by a sign.
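The ordering amounts to a lexicographic comparison of the sequences obtained by reading a tableau column by column. A minimal sketch, assuming tableaux are stored as lists of rows containing the indices of the variables (the function names are hypothetical):

```python
def fill_by_columns(word, shape):
    """Place the entries of `word` down the first column, then the second, and so on."""
    rows = [[None] * length for length in shape]
    entries = iter(word)
    for j in range(shape[0]):
        for i, length in enumerate(shape):
            if length > j:
                rows[i][j] = next(entries)
    return rows

def column_word(tableau):
    """Read a tableau down the first column, then down the second, and so on."""
    return tuple(row[j] for j in range(len(tableau[0])) for row in tableau if len(row) > j)

def greater(y1, y2):
    """y1 > y2: at the first position where the column words differ, y1 is larger."""
    return column_word(y1) > column_word(y2)

a = (3, 2, 5, 1, 4)   # indices of (x3, x2, x5, x1, x4) from Example 4.3.3
b = (3, 2, 1, 5, 4)   # indices of (x3, x2, x1, x5, x4)
y1 = fill_by_columns(a, (3, 2))
y2 = fill_by_columns(b, (3, 2))
smallest = fill_by_columns(sorted(a), (3, 2))

print(y1, y2, greater(y1, y2))  # [[3, 5, 4], [2, 1]] [[3, 1, 4], [2, 5]] True
print(smallest)                 # [[1, 3, 5], [2, 4]]
```

Python compares tuples lexicographically, which is exactly the relation defined above; the smallest tableau in this order has the variables in their natural order down the columns.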

Suppose that $y$ is a non-standard $\lambda$-tableau whose columns are increasing. Then there exist two adjacent columns, say $c$ and $c+1$, that contain the variables $a_1, \dots, a_n$ and $b_1, \dots, b_m$ with $n \geq m$ such that for some integer $1 \leq k \leq m$ we have that $a_i < b_i$ for all $i < k$ and $a_k > b_k$. We also have $a_s < a_t$ for $s < t$ and $b_u < b_v$ for $u < v$. We will illustrate with an array:

$$\begin{array}{ccc}
a_1 & < & b_1 \\
\vdots & & \vdots \\
a_{k-1} & < & b_{k-1} \\
a_k & > & b_k \\
\vdots & & \vdots \\
a_n & & b_m
\end{array}$$

From corollary 4.3.1 we get that $f_y$ is an integer combination of certain $f_y(wx_1, \dots, wx_n)$, where $w \in W \setminus \{(1)\}$. Since $w \notin H$ we have that $w a_q = b_r$ for at least one $q \geq k$ and $r \leq q$. We denote the first such $b_r$ by $b$. This gives that $b < a_k < a_{k+1} < \dots < a_n$. Now we can rearrange the columns of our tableau $wy$ so that all the variables occur in increasing order down the columns; let our new tableau be denoted by $z$, so that $f_y(wx_1, \dots, wx_n) = \pm f_z(x_1, \dots, x_n)$. If $b > a_{k-1}$, then $b$ can be found in the $k$th row and the $c$th column of $z$. Since $a_1, \dots, a_{k-1}$ have not been moved we see that $z < y$. If we instead have that $b < a_{k-1}$, there is some $1 < i < k-1$ such that $a_{i-1} < b < a_i$ and $b$ is found in the $i$th row and $c$th column. Since the variables $a_1, \dots, a_{i-1}$ have not been moved, we have that $z < y$. We can continue in the same way, and by induction with respect to the ordering of the $\lambda$-tableaux it follows that $B^y$ generates $S^y$ over $\mathbb{Z}$. This is true since the first tableau in this order is standard and the variables occur in their natural order.
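For the partition $\lambda = (2,1)$ the reduction can be carried out by hand. The sketch below assumes the non-standard tableau $y$ with first row $x_2, x_1$ and second row $x_3$, so that $a_1 = x_2$, $a_2 = x_3$, $b_1 = x_1$, $k = 1$ and $S$ is the group of all permutations of $\{x_1, x_2, x_3\}$. Here $H = \{(1), (2,3)\}$, and $W = \{(1), (1,2), (1,3)\}$ is one possible choice of coset representatives made for this illustration.

```latex
\begin{align*}
f_y &= x_3 - x_2, \\
f_y &= -\big(\operatorname{sgn}(1,2)\,(1,2) f_y + \operatorname{sgn}(1,3)\,(1,3) f_y\big)
     = (1,2) f_y + (1,3) f_y \\
    &= (x_3 - x_1) + (x_1 - x_2) \\
    &= (x_3 - x_1) - (x_2 - x_1).
\end{align*}
```

The polynomials $x_3 - x_1$ and $x_2 - x_1$ belong to the two standard $(2,1)$-tableaux, with first rows $x_1, x_2$ and $x_1, x_3$ respectively, so $f_y$ is an integer combination of elements of $B^y$, and both tableaux are smaller than $y$ in the ordering.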

We will end this thesis with the following, important theorem.

4.4 Specht modules are irreducible over C

Theorem 4.4.1

For a partition $\lambda$ of $n$, the Specht module $S^y$ is an irreducible $S_n$-module over the complex numbers.

In fact, for any finite group there is only a finite number of irreducible modules up to isomorphism. For $S_n$, the Specht modules form a complete list of these; we refer the reader to [6] p.66 for the proof. We need the following theorem, which holds for all finite groups, not only $S_n$.

Theorem 4.4.2 (Maschke’s theorem)

If $G$ is a finite group and $V$ is a nonzero $G$-module, then $V$ can be written as a direct sum of irreducible submodules of $V$:

$$V = W_1 \oplus W_2 \oplus \dots \oplus W_n$$

Proof. The proof can be read in [6] p.16.

We are now ready to prove theorem 4.4.1.

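Proof. Suppose that the statement of the theorem is false. Then, by Maschke's theorem, we can find submodules $V \neq 0$ and $V' \neq 0$ such that

$$S^y = V \oplus V'.$$

Since $S^y \subset P_d$ ($P_d$ is the set of all homogeneous polynomials of degree $d$), the projection onto $V$ is a homomorphism that is not injective:

$$\theta : S^y \to P_d, \qquad v + v' \mapsto v \in P_d,$$

with $\ker(\theta) = V' \neq 0$. Now look at $f = \theta(p_y) \in P_d$, where $p_y$ is our Specht polynomial. $f$ has the property described in lemma 4.2.1: let $\sigma \in C(y)$ be a transposition, then

$$\sigma f = \sigma\theta(p_y) = \theta(\sigma p_y) = \theta(-p_y) = -\theta(p_y) = -f.$$

So $f = \theta(p_y) = c \cdot p_y$ for some $c \in \mathbb{C}$. Suppose first that $c \neq 0$. Then $p_y$ lies in the image of $\theta$, and since $\theta$ commutes with the action of $S_n$, so does every $\sigma p_y$ with $\sigma \in S_n$. As these polynomials span $S^y$, we have that

$$\theta : S^y \to S^y \subset P_d$$

is surjective, and thus it cannot have a nonzero kernel, since a surjective linear transformation from a finite-dimensional space to itself must be injective, a contradiction. If instead $c = 0$, then $p_y \in \ker(\theta) = V'$ and the same argument applies with the roles of $V$ and $V'$ exchanged.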

References

[1] Beachy, J.A. & Blair, W.D. (2006). Abstract algebra. (3rd ed.) Long Grove, Ill.: Waveland Press.

[2] Curtis, C.W. & Reiner, I. (1962). Representation theory of finite groups and associative algebras. New York:

[3] Passman, D.S. (1977). The Algebraic Structure of Group Rings. New York: Wiley.

[4] Peel, M.H. (1975). Specht Modules and Symmetric Groups. Journal of Algebra, 36, 88-97. Available at: http://www.sciencedirect.com/science/article/pii/0021869375901581

[5] Ryan V. (2006). Conjugacy Classes of Symmetric Groups. Presented at the Mathematics Department at the College of William and Mary. Available at: http://www.math.wm.edu/~vinroot/415conj.pdf

[6] Sagan, B.E. (2001). The symmetric group: representations, combinatorial algorithms, and symmetric functions. (2. ed.) New York: Springer.
