
Analysis and visualization of collective motion in football


Academic year: 2021


UPTEC F 15068

Degree project (Examensarbete), 30 credits (30 hp), December 2015

Analysis and visualization of collective motion in football

Analysis of youth football using GPS and visualization of professional football

Emil Rosén


Faculty of Science and Technology (Teknisk-naturvetenskaplig fakultet), UTH unit (UTH-enheten)

Visiting address: Ångströmlaboratoriet, Lägerhyddsvägen 1, Hus 4, Plan 0

Postal address: Box 536, 751 21 Uppsala

Telephone: 018 – 471 30 03

Fax: 018 – 471 30 00

Website: http://www.teknat.uu.se/student

Abstract

Analysis and visualization of collective motion in football

Emil Rosén

Football is one of the biggest sports in the world. Professional teams track their players' positions using GPS (Global Positioning System). This report is divided into two parts, both of which apply collective motion analysis to football.

The goal of the first part was to see whether a set of cheaper GPS units could be used to analyze the collective motion of a youth football team. 15 football players did two experiments and played three versus three football matches against each other while wearing a GPS unit. The first experiment measured the players' ability to control the ball, while the second measured how well they were able to move together as a team. Several measurements were taken from the matches, and Spearman correlations were calculated between measurements from the experiments and the matches. Players who had good ball control also scored more goals in the match and received more passes. However, they also took the middle position on the field, which naturally is a position that receives more passes. Players who were correlated during the team experiment were also correlated with team members in the match. However, this correlation was weak and the experiment should be repeated with more players. The GPS units did not work well in the team experiment but have the potential to work well in experiments done on a normal-sized football field.

The goal of the second part of the report was to visualize collective motion, more specifically leader-follower relations, in football, as a basis for further research. This is done by plotting the players' positions at each time step in a user interface. Between each pair of players, a double-pointed arrow is drawn, where each side of the arrow has a separate color and arrow width. The maximum time lag between the two players is shown as the "pointiness" of the arrow, while the color of the arrow shows the maximum time lag correlation. The user can change the metrics the correlations are based on. As a complement to the lagged correlation, a lag score is defined which tells the user how strong the lagged correlation is.

Subject reviewer (Ämnesgranskare): Thomas Schön. Supervisor (Handledare): David Sumpter


Popular science summary

Football is one of the biggest sports in the world. Professional teams track the players' position and speed during training matches using GPS (Global Positioning System). A manager can thereby see, for example, which areas of the pitch a player covers or how fast and how often they run or sprint, which is also what much football research focuses on. However, football is a team sport, and less research has focused on analyzing the dynamics within football teams.

One way to do this is to look at the collective motion within a team. Collective motion in football means that the players' movements are not independent of each other; instead, a player decides where to position themselves based on the positions of their teammates and opponents. Much research has been done on collective motion in other fields, such as how the movements of fish depend on each other to create dynamic schools of fish, or how bees form large swarms. The idea of this project is to use collective motion to analyze the movements of football players, much as is done in biology.

The project is divided into two parts with different focus. In the first part, GPS was used to track the positions of 10-year-old football players on a football pitch. The goal of this part was partly to investigate whether a cheaper GPS (than those used by professional teams) can be used for this type of experiment, and partly to carry out some simple experiments to investigate the collective motion within the team. 15 football players performed two experiments and played four three versus three matches (on a small football pitch) against each other. The first experiment measured their individual ability to handle the ball, while the second was a defensive line experiment that measured their ability to move as a team. The players were ranked in different areas based on the results from the experiments and the match, such as their ball skill, how good they are at moving as a team, how many goals they scored, how fast they run during a match and how many passes they receive. Correlations between these rankings were then calculated to investigate whether players who are good in one area are good or bad in another.

Players who were good at handling the ball also scored more goals and received more passes. However, these players also took the center position in the team, which is a position that naturally receives more passes. It is therefore not certain whether these players were actually better at scoring goals or whether this was only due to the position they had. Players who were correlated with other players in the defensive line experiment were also more correlated with their teammates when they had the ball. However, the experiments were carried out with only 15 players, which means that this result is not convincing and the experiment should be repeated with more players. The GPS units did not work particularly well in the defensive line experiment but can be used more effectively during matches or experiments that take place on a larger football pitch.

The goal of the second part of the project was to visualize the dynamics of the collective motion in a football match in order to quickly get an overview and find interesting areas that should be investigated further. The position data used came from an eight versus eight football match played by professional players and is of high quality. A user interface was developed in Matlab in which the players' positions are drawn on a football pitch and the user can step time forward to see how the players move.

It was also investigated whether there are so-called "leader-follower" relations between players, that is, whether one player follows another player or not. This is visualized as double-sided arrows pointing between the players. The color of the arrows shows whether the players are correlated or not, while the pointiness of the arrow shows how large the lag between the players is, where a large lag indicates that one player follows the other player's movements with a delay and is thereby a "follower". With this interface it can be seen, among other things, that the players of the team that does not have the ball are considerably more correlated with each other than those of the attacking team.

Contents

1 Abbreviations

2 Background and introduction

3 Analysis of collective motion of a youth football team using GPS
  3.1 Introduction
  3.2 Material and methods
    3.2.1 GPS units
    3.2.2 Experiment group and location
    3.2.3 Preprocessing of GPS data
    3.2.4 Validation of GPS data
    3.2.5 Correlation score
    3.2.6 Ball control experiment: Evaluation of player ball control
    3.2.7 Line experiment: Evaluation of player team skill
    3.2.8 Match experiment: Evaluation of player match skill
  3.3 Results
    3.3.1 GPS validation
    3.3.2 Ball control experiment results
    3.3.3 Assignment of player ID
    3.3.4 Line experiment results
    3.3.5 Match experiment results
    3.3.6 Correlations between player measurements
    3.3.7 Correlations between team measurements
  3.4 Discussion
    3.4.1 GPS
    3.4.2 Ball control
    3.4.3 Line experiment
    3.4.4 Match experiment

4 Visualization of collective motion of professional football players in an eight versus eight match
  4.1 Introduction
  4.2 Materials and methods
    4.2.1 Data description
    4.2.2 Preprocessing of data
    4.2.3 Time window
    4.2.4 Time lag correlation
    4.2.5 Lag score
    4.2.6 Implementation
  4.3 Results
    4.3.1 Maximum time lag
    4.3.2 Voronoi Regions
  4.4 Discussion

5 Conclusions

6 Future Outlook

Appendices

Appendix A Movie examples of collective motion of a youth football team

Appendix B Movie examples showing the visualization


1 Abbreviations

• GPS - Global Positioning System

• GUI - Graphical User Interface

• RMSE - Root mean squared error

• STD - Standard Deviation

2 Background and introduction

Football is one of the world's biggest and most popular sports, with millions of fans around the globe. Modern mathematics and statistics are used by top-level football teams and clubs to improve training and analyse tactics. A large amount of research has been directed at areas such as players' running performance in various speed intervals, distance covered, team ball possession, team pass rate, correlations between training exercises, and the creation of heat maps which show which areas certain players cover during a match [2, 6, 10, 17, 20].

In the area of collective motion and behaviour, models and methods have been developed to analyze leader-follower relationships in, for example, birds [16], but also in social systems, such as the relationship between GDP per capita and democracy [19]. This kind of analysis could be applied to football to approach the subject from a different angle, but less work has been done in this area than with the more traditional approach.

Some research on collective motion in football shows that players of opposing teams have more leader-follower interactions than members of the same team [14]. Other work develops a network-based method for quantifying the performance of individuals in a team [5] or studies relationships between different quantities in a full football match [22, 23]. Other research combines the areas of biology and football by modeling a football team as a superorganism to find the collective behaviour of the team [3] or by analyzing the shapes the team creates during a match [4, 7].

This thesis focuses on the collective motion of football players and is divided into two parts. The first part focuses on analyzing the collective motion in youth football during a match and different exercises, using GPS to gather position data. The second part focuses on visualizing the collective motion of professional football players.


3 Analysis of collective motion of a youth football team using GPS

3.1 Introduction

To analyze the collective motion of football players, their positions have to be recorded. There are three main ways to track the positions of football players during a match: image analysis of video footage (such as ProZone [13]), a local positioning system, or GPS receivers mounted on the players (such as GPSports [9]). While these systems probably work very well, they are also very expensive. Therefore, one goal of the thesis is to see if a cheaper GPS can work well enough to analyze the collective motion of youth football players.

Additional goals are to see if there are any relationships between measurements from some simple experiments and measurements from a small three versus three match, as well as to analyze the collective motion of the football players during a match.

3.2 Material and methods

3.2.1 GPS units

The GPS units used were QStarz BT-Q1300ST GPS loggers. The unit can log data at 5 Hz and has a 3 m accuracy in the 2D plane and a 0.1 m/s speed accuracy [18].

3.2.2 Experiment group and location

The experiments were carried out on 15 male football players around the age of 10, who train approximately 3 times each week, on a local football pitch.

Each player carried a GPS mounted on their shoulder while participating in each experiment.

The parents' written consent as well as the children's permission were obtained prior to the experiments. The data is stored anonymously.

3.2.3 Preprocessing of GPS data

The geodetic coordinates obtained from the GPS units were converted to a 2D Cartesian coordinate system using the flat earth model. Drift points were removed, and missing data points were filled in using linear interpolation [21] if the gap was small enough (< 1 s). The data points were rotated so that the length and width of the football pitch were aligned with the X and Y axes respectively.
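The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the code used in the thesis: the function names, the Earth radius constant, the reference point and the rotation angle are all assumptions, the flat earth model is taken here to be the usual equirectangular approximation around a reference point, and a "gap" is taken to be the time between the two known samples on either side of the missing ones. Drift removal is not shown.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius


def to_local_xy(lat_deg, lon_deg, lat0_deg, lon0_deg):
    """Flat-earth approximation: metres east (x) and north (y) of a reference point."""
    lat0 = math.radians(lat0_deg)
    x = EARTH_RADIUS_M * math.radians(lon_deg - lon0_deg) * math.cos(lat0)
    y = EARTH_RADIUS_M * math.radians(lat_deg - lat0_deg)
    return x, y


def rotate(x, y, angle_rad):
    """Rotate a point counter-clockwise about the origin, e.g. to align the pitch with the axes."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return c * x - s * y, s * x + c * y


def fill_small_gaps(times, values, max_gap_s=1.0):
    """Linearly interpolate missing samples (None) when the surrounding gap is < max_gap_s."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for a, b in zip(known, known[1:]):
        if b - a > 1 and times[b] - times[a] < max_gap_s:
            for i in range(a + 1, b):
                w = (times[i] - times[a]) / (times[b] - times[a])
                filled[i] = values[a] + w * (values[b] - values[a])
    return filled
```

Gaps of one second or longer are left as `None`, so that later steps can skip them instead of relying on interpolated positions.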

3.2.4 Validation of GPS data

Seven GPS units were placed along a 104 cm long pole with 16 cm between neighboring units. A calibration, a running and a line experiment (Table 1) were carried out to measure the position accuracy of the GPS at the experiment location. The sky was almost completely clear.

Table 1: Description of experiments measuring and validating GPS position accuracy

| Name | Description |
|---|---|
| Calibration | The pole was placed for 1 minute on each corner of a penalty field |
| Running | Running around the circumference of the penalty field twice with the pole perpendicular to the running direction |
| Line | Jogging back and forth along one of the widths of the penalty field four times with the pole perpendicular to the jogging direction |

At each time step the relative position error of the GPS units is given by

$$E_t = \frac{2}{N(N-1)} \sum_{x=1}^{N} \sum_{y=x+1}^{N} \sqrt{\left(0.16 \cdot (y - x) - |X_{x,t} - X_{y,t}|\right)^2} \tag{1}$$

where $E_t$ is the RMSE at time $t$, $X_{n,t}$ is the position of GPS $n$ at time $t$ and $N$ is the number of GPS units.

In the line experiment, the GPS units are only moving in one dimension.

For each time step, the variance across all GPS units of the position along the axis parallel to the direction of movement was calculated according to

$$V_{k,t} = \operatorname{var}(X_{k,t}) \tag{2}$$

If the GPS units were ideal, $V_{k,t} = 0$ for each time step.
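As a rough illustration of Equations 1 and 2, the two error measures for a single time step could be computed as below. The function names are hypothetical, positions are assumed to be given in metres and ordered as the units sit on the pole, and the sample variance is used for Equation 2, which is an assumption since the text does not say which variance estimator was applied.

```python
import math
from statistics import variance

SPACING_M = 0.16  # distance between neighbouring units on the pole (16 cm)


def relative_position_error(positions):
    """Equation 1 for one time step.

    positions: list of (x, y) tuples in metres, ordered along the pole.
    """
    n = len(positions)
    total = 0.0
    for a in range(n):
        for b in range(a + 1, n):
            measured = math.dist(positions[a], positions[b])
            expected = SPACING_M * (b - a)
            # The square root of a square is the absolute deviation.
            total += abs(expected - measured)
    return 2.0 * total / (n * (n - 1))


def along_axis_variance(coords):
    """Equation 2: variance across units of the coordinate along the movement axis."""
    return variance(coords)  # sample variance; an assumed choice
```

For ideal units both functions return zero, which matches the statement that $V_{k,t} = 0$ in the ideal case.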


3.2.5 Correlation score

Pearson correlations between players, for some measurement, were compared across several time segments. This was done by defining and calculating a correlation score for each player from a set of correlation matrices, each calculated over a different time segment. The correlation score is higher for players that correlate more than for players that do not correlate.

Consider a time series of correlation matrices, $C = \{C_1, C_2, \dots, C_i\}$, where $C_i$ is a correlation matrix (of some measurement) from time segment $i$. Each correlation pair $C_{x_1,y_1,i_1}$ is compared to all other correlation pairs $C_{x_2,y_2,i_2}$, where $C_{x,y,i}$ is the correlation pair between player $x$ and player $y$ for time segment $i$ ($x \neq y$). Each time the correlation pair $C_{x_1,y_1,i_1}$ has a higher correlation than correlation pair $C_{x_2,y_2,i_2}$, with 95% significance, the correlation score for player $x_1$ is increased by one. This tournament-like method is applied between all correlation pairs from all given time segments. See Algorithm 1.

input : C: Time series of correlation matrices. Each correlation matrix consists of correlation pairs C_{x,y,i} between player x and player y for time segment i.
output: S: Correlation scores. Consists of elements S_p, where S_p is the correlation score for player p.

    Initialize all scores to 0:
    for each S_p in S do
        S_p ← 0
    end

    Compare all correlation pairs C_{x1,y1,i1} with all correlation pairs C_{x2,y2,i2}:
    for each C_{x1,y1,i1}, x1 ≠ y1 in C do
        for each C_{x2,y2,i2}, x2 ≠ y2 in C do
            if C_{x1,y1,i1} > C_{x2,y2,i2} with 95% confidence then
                S_{x1} ← S_{x1} + 1
            end
        end
    end

Algorithm 1: Algorithm which calculates a correlation score for each player given a time series of correlation matrices. The correlation score is used for comparing the overall correlation between the different players.

A correlation pair is considered bigger than another correlation pair with 95% confidence if it is bigger and the 95% confidence intervals of the two pairs do not overlap. The confidence intervals are calculated using the Fisher transformation [12]. The Fisher transformation of a correlation coefficient is

$$z_r = \frac{1}{2} \ln\left(\frac{1 + r}{1 - r}\right) = \operatorname{arctanh}(r) \tag{3}$$

with inverse

$$r = \frac{e^{2 z_r} - 1}{e^{2 z_r} + 1} = \tanh(z_r) \tag{4}$$

where $r$ is the original correlation coefficient and $z_r$ is the transformed correlation coefficient. The transformed coefficient in Equation 3 has standard error $1/\sqrt{N - 3}$, where $N$ is the number of samples. The confidence intervals can then be calculated by

$$r_{\pm} = \tanh\left(\operatorname{arctanh}(r) \pm z_{1-\alpha/2} \cdot \frac{1}{\sqrt{N - 3}}\right) \tag{5}$$

where $r_{\pm}$ are the upper and lower bounds of the interval, with $z_{1-\alpha/2} = 1.96$ for a 95% confidence interval.
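To make the interplay between Algorithm 1 and the Fisher confidence intervals concrete, here is a minimal sketch. It is not the thesis implementation: the function names are hypothetical, every matrix is assumed to index the same players in the same order, the number of samples behind each matrix is passed in separately, and coefficients are clamped slightly inside (-1, 1) so that the transformation stays finite.

```python
import math


def fisher_ci(r, n, z=1.96):
    """Confidence interval for a Pearson correlation r computed from n samples (Equation 5)."""
    r = max(-0.999999, min(0.999999, r))  # assumed clamp; arctanh diverges at +/-1
    half_width = z / math.sqrt(n - 3)
    zr = math.atanh(r)
    return math.tanh(zr - half_width), math.tanh(zr + half_width)


def correlation_scores(matrices, n_samples):
    """Tournament-style correlation score per player (Algorithm 1).

    matrices: list of square correlation matrices (lists of lists), one per time segment.
    n_samples: number of samples behind each matrix, in the same order.
    """
    pairs = []  # (player x, coefficient, lower bound, upper bound) for every x != y
    for c, n in zip(matrices, n_samples):
        for x in range(len(c)):
            for y in range(len(c)):
                if x != y:
                    lo, hi = fisher_ci(c[x][y], n)
                    pairs.append((x, c[x][y], lo, hi))

    scores = [0] * len(matrices[0])
    for x1, r1, lo1, _ in pairs:
        for _, r2, _, hi2 in pairs:
            # "Bigger with 95% confidence": larger and the intervals do not overlap.
            if r1 > r2 and lo1 > hi2:
                scores[x1] += 1
    return scores
```

Because every ordered pair (x, y) is visited, a strong correlation between two players credits both of them, which is how the score is described in the text.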

3.2.6 Ball control experiment: Evaluation of player ball control

Each player did two individual experiments, dribbling and juggling (Table 2), to measure their individual skill level at controlling a football. The best result for each player was recorded for each experiment.

Table 2: Description of experiments evaluating player ball control

| Name | Description | # Attempts |
|---|---|---|
| Dribbling | Dribble the ball between six cones as fast as possible | 2 |
| Juggling | Juggle the ball (with their feet) as many times as possible | 4 |

3.2.7 Line experiment: Evaluation of player team skill

The players did a defensive line experiment in groups of six to measure their ability to work as a team. In the experiment, the group stood beside each other in a straight line and was asked to run forward as fast as they could while still retaining the line formation. Occasionally a football coach clapped their hands, indicating that the group had to turn and run in the opposite direction. The experiment was repeated four times with different group configurations, with each group changing direction 11 times. The players had done the experiment once before so that they would be familiar with it.

The following statistics were measured for each player from the line experiment: speed, reaction time and velocity correlation, measured either when running or when changing direction. A group is defined as running from two seconds after a direction change until two seconds before the next direction change. A group is defined as turning, or changing direction, during the time span from two seconds before to two seconds after a direction change.
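The running and turning definitions translate directly into time windows around the recorded direction changes. The sketch below is illustrative only: the function name and the input format (a sorted list of direction-change times in seconds) are assumptions.

```python
def split_segments(change_times, margin_s=2.0):
    """Return (running, turning) lists of (start, end) windows in seconds.

    change_times: sorted times of the direction changes.
    """
    # Turning: from margin_s before to margin_s after each direction change.
    turning = [(t - margin_s, t + margin_s) for t in change_times]
    # Running: from margin_s after one change to margin_s before the next.
    running = [
        (a + margin_s, b - margin_s)
        for a, b in zip(change_times, change_times[1:])
        if b - a > 2 * margin_s  # no running window if the changes are too close
    ]
    return running, turning
```

Note that turning windows can overlap when two direction changes are less than four seconds apart; how such cases were handled is not stated in the text.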

Table 3: Description of measurements used to evaluate how well players collaborate with their team members

| Name | Description | Time segment |
|---|---|---|
| Speed | Average speed of group while running | From 2 s after a direction change to 2 s before the next direction change |
| Reaction time | Reaction time compared with group members while changing direction | From 2 s before to 2 s after a direction change |
| Run correlation | Correlation score calculated from speed correlation between each player in a group while running. Divided by total participation time. | From 2 s after a direction change to 2 s before the next direction change |
| Turn correlation | Correlation score calculated from speed correlation between each player in a group while changing direction. Divided by total participation time. | From 2 s before to 2 s after a direction change |

3.2.7.1 Calculation of correlation scores for the line experiment

To determine which players correlate more with their group members over the entire experiment, a correlation score is calculated according to Algorithm 1. To calculate the correlation score, Pearson correlations between the players' velocities were calculated (Equation 6) for two scenarios: when the players are running and when they are changing direction. This gives a correlation matrix for each group and for each time segment for the two different scenarios. The matrices were converted to absolute-valued matrices by applying the absolute value to each element. This way, players who do not correlate at all get low scores, while those who correlate either positively or negatively get a high correlation score.

The equation used to calculate a correlation matrix was:

$$C_i = \operatorname{abs}(\operatorname{corr}(V_i, V_i)) \tag{6}$$

where $\operatorname{abs}(A) = [|a_{k,l}|]$ is a function that returns the absolute value of each element $a_{k,l}$ in a generic matrix or vector $A$. The function $\operatorname{corr}(A, B)$ calculates the Pearson correlations between all elements in two generic vectors $A$ and $B$. $V_i$ is a vector of the velocities of all players for time interval $i$ and $C_i$ is the resulting (absolute-valued) correlation matrix between all players' velocities for time interval $i$.

This series of correlation matrices was used to calculate a correlation score for each player and each scenario. The correlation scores were normalized by weighting the score gained from a correlation matrix by the length of that time span and finally dividing by the total amount of time for that scenario.

The GPS units do not directly give a velocity; they only measure the players' speed. But since the players are only moving in one dimension and the direction of their movement is known, the velocity is easily derived. The velocity was used for the correlation calculations because the deviations in the players' positions were too small compared to the accuracy of the GPS, while the measured speed had higher deviation (compared to the GPS accuracy of the measured speed).
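Equation 6 amounts to computing every pairwise Pearson correlation between the players' velocity series in one time segment and taking absolute values. A minimal sketch, with hypothetical function names and assuming that every player has the same number of samples and a non-constant velocity in the segment:

```python
import math


def pearson(a, b):
    """Pearson correlation between two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (norm_a * norm_b)  # undefined for constant series (division by zero)


def abs_correlation_matrix(velocities):
    """Equation 6: absolute-valued correlation matrix for one time segment.

    velocities: one list of velocity samples per player.
    """
    return [[abs(pearson(vi, vj)) for vj in velocities] for vi in velocities]
```

Taking the absolute value means a player who consistently moves opposite to the group scores as high as one who moves with it, which is the behaviour described in the text.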

3.2.8 Match experiment: Evaluation of player match skill

Five matches, with three versus three players (no goalkeeper), were played with different group configurations so that each player got to play at least one match (small pitch ≈ 26 × 14 meters, small goals 0.8 meters wide and 0.6 meters high). The starting formation of each team had two players placed at the corners of their side of the pitch (one per corner) and one player in front of their goal (Figure 1). Each team composition was chosen by a coach, but the individual positions were chosen by the players. The teams were chosen so that their skill levels would be similar, by matching the combined rank from the dribbling and juggling experiments (see Figure 4) between the groups.

If the ball went out of bounds or if a team scored a goal, each player had to return to their starting position and a player chosen by the coach would start with the ball. Each player started with the ball an equal number of times. During each match a commentary was recorded stating which player currently possessed the ball, if the ball went out, if a goal was scored or if the match was otherwise paused. Each match was between 3 and 5 minutes long. The following parameters were measured for each player: number of received and successfully made passes, number of interceptions, ball possession, position in team formation, number of goals scored, speed and correlation between the players' movements (Table 4).

Table 4: Description of player statistics measured during a match and the normalization method used for each measurement.

| Name | Description | Normalization |
|---|---|---|
| Passes received | Number of received passes | Divided by team ball possession time |
| Passes made | Number of successful passes | Divided by team ball possession time |
| Interceptions | Number of times a player broke an opponent's attack | Divided by opponent ball possession time |
| Ball possession | Amount of time possessing the ball | Divided by team ball possession time |
| Formation | Position in team formation (wing or center) | N/A |
| Goals scored | Number of goals scored | Divided by total time player was playing actively |
| Speed | Average speed | Only measured when a player was actively playing |
| Corr, team (Attacking) | Correlation score with teammates while attacking | Divided by team ball possession time |
| Corr, team (Defending) | Correlation score with teammates while defending | Divided by opponent ball possession time |
| Corr, opponent (Defending) | Correlation score with opponents while defending | Divided by opponent ball possession time |

3.2.8.1 Calculation of correlation scores for the match experiment

The correlation scores for the match experiment are calculated analogously to the correlation scores from the line experiment (Section 3.2.7.1), with the difference that the players' positions along the X-axis (the axis parallel to the pitch length) were used rather than the velocities, and the correlations were calculated when a team was either attacking or defending.

A team is defined as attacking if it has possession of the ball for four consecutive seconds or longer, while the defending team is the opposing team in that scenario.
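One way to turn the recorded commentary into attacking segments is sketched below. The input format is an assumption made for illustration: possession is given as time-ordered, contiguous (start, end, team) intervals, so that a pass within a team shows up as two adjacent intervals with the same team label.

```python
def attacking_segments(possession, min_duration_s=4.0):
    """Return the (start, end, team) spans in which a team counts as attacking.

    possession: time-ordered list of (start_s, end_s, team) intervals.
    """
    merged = []
    for start, end, team in possession:
        # Adjacent intervals of the same team form one continuous possession.
        if merged and merged[-1][2] == team and merged[-1][1] == start:
            merged[-1] = (merged[-1][0], end, team)
        else:
            merged.append((start, end, team))
    # Keep only possessions lasting four consecutive seconds or longer.
    return [(s, e, t) for s, e, t in merged if e - s >= min_duration_s]
```

For each returned span the other team is the defending team, so the same list also yields the defending segments.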

Figure 1: Three versus three pitch, with the X-axis parallel with the pitch length and the Y-axis parallel with the pitch width. Yellow and blue circles are the starting position for the yellow and blue team respectively. Pitch size: ≈ 26 × 14 meters.

3.3 Results

3.3.1 GPS validation

The mean and standard deviation were calculated from the time series of errors given by Equations 1 and 2 (Figure 2). The mean error was E = 1.2 for the relative GPS experiment and slightly lower for the line experiment. This is a smaller error than the accuracy of 3 m given by the specification of the GPS (Section 3.2.1).


Figure 2: Mean error and standard deviation of the relative position error of the GPS units for a GPS validation experiment (blue). Mean error and standard deviation of the position parallel to the direction of movement for a line (one-dimensional) GPS validation experiment (red).

3.3.2 Ball control experiment results

Players who do well in one of the two ball control experiments (dribbling and juggling) also seem to do well in the other, with a Spearman rank correlation of ≈ 0.8 (p < 0.01) (Figure 3). Those players seem to have better control over the ball in general. The ranking was applied so that skilled players have low rank, i.e. descending order for the juggling experiment and ascending order for the dribbling experiment.


Figure 3: Juggling versus dribbling Spearman ranking for players participating in a ball control experiment. The result for the worst and best player for each measurement is noted in parentheses next to the rank. Skilled players have low rank.

3.3.3 Assignment of player ID

Each player was given an ID based on their combined performance in the two ball control experiments, obtained by combining the ranks from both experiments across the 15 players. The player with the lowest combined rank was given player ID 1, the next lowest player ID 2, and so on until the player with the highest combined rank was given ID 15 (Figure 4).
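The ID assignment can be illustrated with a short sketch. The function names are hypothetical, and two details are assumptions because the text does not specify them: tied results share the average rank, and ties in the combined rank are broken by input order.

```python
def rank(values, descending=False):
    """Rank values so that rank 1 is best; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def assign_ids(dribble_times, juggle_counts):
    """Give ID 1 to the player with the lowest combined rank, ID 2 to the next, and so on."""
    dribble_rank = rank(dribble_times)                  # ascending: shorter time is better
    juggle_rank = rank(juggle_counts, descending=True)  # descending: more touches is better
    combined = [d + j for d, j in zip(dribble_rank, juggle_rank)]
    order = sorted(range(len(combined)), key=lambda i: combined[i])
    ids = [0] * len(combined)
    for new_id, player in enumerate(order, start=1):
        ids[player] = new_id
    return ids
```

For example, `assign_ids([12.0, 10.0, 15.0], [5, 20, 3])` gives the second player ID 1, since that player is best in both experiments.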


Figure 4: Combined Spearman ranking from a dribbling (blue) and juggling (red) ball control experiment for 15 players. Players were given an ID depending on their combined rank from these two experiments, with the player with the lowest combined rank given ID 1 and the one with the highest combined rank given ID 15. Skilled players have low rank.

3.3.4 Line experiment results

No Spearman correlations with at least 95% confidence were found either between the measurements from the line experiment or between the line experiment and the ball control experiment.

3.3.5 Match experiment results

Several measurements from the match experiment correlated with each other and with measurements from the ball control and line experiments.

3.3.5.1 Ball control

Passes made correlated with ball possession with a correlation of ≈ 0.75 (p < 0.01), and ball possession correlated with scored goals at ≈ 0.52 (p < 0.05). Ball possession, passes received, passes made and goals scored correlated positively with the dribbling experiment, with correlations in the range ≈ [0.59, 0.89] (p < 0.02) (Figure 5), and (to a lower degree) with the juggling experiment.


Figure 5: Spearman-ranked ball possession versus passes made (a), scored goals (b) and dribbling time (c), as well as passes received versus dribbling time (d), for players participating in a match and a dribbling experiment. The result for the worst and best player for each measurement is noted in parentheses next to the rank. Skilled players have low rank.

3.3.5.2 Player formation and ball control

The players' positions in the team formation correlate positively with dribbling, juggling, ball possession and passes received, i.e. players who have a center position also receive more passes and are better at dribbling and juggling (Figure 6). The players' positions in each team were not enforced, so it seems that players who are good at dribbling and juggling also take the center position. Players who have a center position also get more passes and possess the ball for more time than their team members. See Figure 7 for an example of a pass chain from the match.


Figure 6: Passes received and ball possession Spearman rank for each player participating in a match. Each player’s position in their team formation is noted in parenthesis next to their player ID as C, a center player, or W, a wing player. Players with a high amount of received passes and ball possession have low rank.


Figure 7: One of the longer passing chains recorded from a three versus three match. The center player (blue) acts as an intermediate player, while the two wing players (yellow) rarely pass the ball to each other. The numbers indicate the order of the passes, where number one is the first pass, while the arrows indicate the direction of the pass.

3.3.5.3 Line experiment and scored goals

A negative correlation of ≈ −0.52 (p < 0.05) was found between the Spearman rank of the turn correlation from the line experiment and the number of scored goals. The players who correlated with each other the least while turning during the line experiment also scored more goals in general (Figure 8d).

3.3.5.4 Correlations between players in a match and in the line experiment

Players that correlated with their team members while defending also correlated with their opponents while defending, with a Spearman correlation of ≈ 0.61 (p < 0.02). Correlation with team members while attacking correlated with the players' speed rank (0.74, p < 0.01). Players who correlated with team members while attacking also correlated with each other while running in the line experiment (≈ 0.58, p < 0.05) (Figure 8a, 8b, 8c).


Figure 8: Spearman rank between different measurements for players participating in a match and a line experiment. The figure shows: players that correlate with their team members while attacking versus player speed during the match (a) and versus correlation while running in the line experiment (b), players that correlate with their team versus opponents while defending (c), and players that correlate with each other while changing direction in the line experiment versus scored goals during the match (d). The results for the lowest and highest ranked player for each measurement are noted in parentheses next to their rank. The ranking is applied in descending order.

3.3.6 Correlations between player measurements

The Spearman rank correlations between all measurements from the three experiments are shown in Figure 9. Correlations with significance lower than 95% were removed.


Figure 9: Spearman correlations with at least 95% significance between measured and derived values from three different football experiments for different players. Red indicates a positive correlation while blue is negative.

3.3.7 Correlations between team measurements

The average of each player's measurements from the three experiments was calculated for every team from the five matches in the match experiment. Two teams consisted of exactly the same members, so there were nine different teams in the five matches. The Spearman correlations were calculated for dribbling, juggling, number of interceptions, goals scored, opponent scored goals, speed, correlation with team members while attacking and defending, and correlation with opponents while defending (Figure 10). Correlations with significance less than 95% were removed.

From this figure it is seen that teams whose players were good at juggling also had players who were good at dribbling (≈ 0.68 (p < 0.05)). However, no significant correlation was found between the number of scored goals and the number of goals scored by the opponent.

Team speed correlated positively with the team members’ correlation with each other while attacking (≈ 0.75 (p < 0.02)) and negatively with their correlation while defending (≈ −0.7 (p < 0.05)).

Figure 10: Spearman correlations with at least 95% significance between measured and derived values from five football matches and nine different teams. Red indicates a positive correlation while blue is negative.

3.4 Discussion

3.4.1 GPS

One of the goals of these experiments was to see how well this particular GPS, and by extension other GPS units with similar capabilities, would work in practice for measuring position data and calculating correlations between football players’ movements.

While the GPS units did perform as advertised and in line with previous findings [16], they did not work well in the line experiment. Even the players who had the most trouble keeping the line intact stayed within the error margin of the GPS, so it was not possible to determine from the GPS positions which players and groups were good or bad at keeping the line intact.

The GPS units were accurate enough to measure the players’ general positions in a match. The GPS might not be able to accurately measure a player dribbling past another player or an interception, but it is accurate enough to measure the general movement of the players on the pitch. The matches in the experiment were small three versus three matches, so there was a lot of close interaction between players, and using these GPS units might therefore not have been optimal for this particular setup. They would probably work better on a normal sized football field with more players, since close interactions would normally only include a few of the players at a time.

3.4.2 Ball control

While giving each player a few more attempts at the dribbling and juggling experiment and increasing the number of participating players would be preferable, the data gathered was enough to determine that a player good at dribbling was also good at juggling and vice versa, which is not a surprising result (Figure 4). That dribbling and juggling correlate with each other is in line with previous findings [8].

Players who performed well in the ball control experiment also got better results during the match. They had higher ball possession, made and received more passes, and scored more goals (Figure 5). This is similar to previous findings, where players good at dribbling were also good at offense in one versus one matches, but no significant correlation was found between juggling skill and one versus one match skills [8].

However, they also took the center position in the team, and it is seen that players in this position receive more passes and have higher ball possession (Figure 6).

But do they receive more passes only because they hold the center position, because other players pass the ball to players they think are skilled, or because these players actually are better at positioning themselves and thus receive more successful passes? Since one goal of the experiment was to see whether the ball control experiment would be a good indicator of match performance, this dependency should be removed.

Currently these results show that players with good ball control might be better at playing an actual match, or that the effect might arise because they take the center position. In a future match experiment the players’ positions in the team should be predetermined and should change throughout the match to remove this dependency.

3.4.3 Line experiment

The line experiment might be able to predict how well players in a match will correlate with each other, as seen in Figure 8b. It is not far-fetched to believe that players who are good at correlating in one of the experiments are also good at it in the other. However, the significance of the correlation is low, so the correlation found might not be relevant.

For some reason, players that correlate while changing direction also score fewer goals (Figure 8d). Why this is the case is unknown, since it is hard to find a logical link between the two measurements. This correlation is also not very strong, so there is a high possibility that there is no real correlation between the two.

Since all measurements from the line experiment have low significance more tests with more players would have to be done to see if there actually are any significant correlations between the line experiment and the other two experiments.

3.4.4 Match experiment

No significant correlation was found between teams with high ball control and teams that scored many goals (Figure 10), although individual players who had high ball control did score more goals (Figure 5b). However, only five matches were played, so we can only say that we did not find a significant correlation, not that there is none. More matches should be played to settle the question.

Players with high speed also correlated more with their team while attacking, both individually and as a team (Figure 5a, Figure 10). The same can be said for players that correlated well with team members and opponents while defending (Figure 5c, Figure 10). Teams that correlated while defending did not correlate while attacking (Figure 10). One explanation for this might be that some players are more offensive (correlate while attacking) and some players are more defensive (correlate while defending), although there are no results to back this claim.

See Appendix A for video examples of correlating players.


4 Visualization of collective motion of professional football players in an eight versus eight match

4.1 Introduction

The goal of this part of the thesis was to implement a visualization tool that could be used to observe the collective motion behaviour of players and the ball in a football match. The collective motion focused on in this implementation is leader-follower relationships [14, 16], where one player is following another player.

In addition, Voronoi regions of the players were implemented in the GUI. A Voronoi region is the region around a point which is closer to that point than to any other point in the set. Thus a football player’s Voronoi region is the region which is closer to that player than to any other player, and it is an important concept in football. For example, a team would want to increase the area of its own Voronoi regions while decreasing its opponents’ [11].
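The definition of a Voronoi region can be illustrated with a short Python sketch that approximates each player’s region on a grid: every grid cell is assigned to the nearest player and the cell areas are summed. This is only an illustration of the concept and not the method used in the GUI, which was implemented in Matlab. The function name, the player identifiers, the grid step and the positions are all hypothetical.

```python
def voronoi_areas(players, width, height, step=0.5):
    """Approximate each player's Voronoi area on a rectangular pitch.

    players: dict mapping a player id to an (x, y) position in metres.
    Returns a dict mapping each player id to an area in square metres.
    """
    areas = {pid: 0.0 for pid in players}
    nx, ny = int(width / step), int(height / step)
    for ix in range(nx):
        for iy in range(ny):
            # Centre of the current grid cell.
            cx, cy = (ix + 0.5) * step, (iy + 0.5) * step
            # The cell belongs to the player with the smallest squared distance.
            nearest = min(
                players,
                key=lambda pid: (players[pid][0] - cx) ** 2 + (players[pid][1] - cy) ** 2,
            )
            areas[nearest] += step * step
    return areas

# Hypothetical positions on a 70 x 50 m pitch (not taken from the data set).
players = {"yellow_1": (17.5, 25.0), "red_1": (52.5, 25.0)}
areas = voronoi_areas(players, 70.0, 50.0)
```

With two players placed symmetrically around the middle of the pitch, each player is assigned half of the 3500 m² pitch. Summing the areas per team gives the quantity a team would want to maximize.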

A data set of a professional football match is used as an example to show the visualization.

4.2 Materials and methods

4.2.1 Data description

The data used for the visualization was a public data set with sensor data from an 8 versus 8 football match in Nuremberg Stadium, Germany, recorded using a local wireless positioning system [1]. The sensors were placed on both feet of each football player and of the referee, and also in the ball. The sensors placed on the players and the referee recorded data at 200 Hz while the sensor placed in the ball recorded data at 2000 Hz. Each sensor recorded the current time, its position, the magnitude and direction of its speed, and the magnitude and direction of its acceleration.

The match was carried out on a half sized football pitch (≈ 70 × 50 meters). Two periods were played, each being 30 minutes long.

4.2.2 Preprocessing of data

The data was rotated 90 degrees so that the horizontal axis (X-axis) would be parallel with the pitch. Linear interpolation was used to align the data points to even 1/25 s intervals, and the data was then downsampled to 25 Hz. Missing data points were filled in using linear interpolation if the gap was small enough (< 1 s) [21].

The sensor data from the two feet were averaged into a single data point for each player.
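The preprocessing steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the Matlab code used in the thesis: the samples are interpolated directly onto an even 25 Hz grid, no anti-aliasing filter is applied, grid points that fall inside a gap of 1 s or more are marked as missing (`None`), and the function names and sample values are hypothetical.

```python
from bisect import bisect_left

def resample_linear(times, values, rate_hz=25.0, max_gap=1.0):
    """Linearly interpolate (times, values) onto an even grid of 1/rate_hz seconds.

    times must be sorted in ascending order. Grid points that fall inside a gap
    of max_gap seconds or more are returned as None.
    """
    dt = 1.0 / rate_hz
    n = int((times[-1] - times[0]) / dt + 1e-9) + 1
    grid, out = [], []
    for k in range(n):
        t = times[0] + k * dt
        # Index of the first sample at or after t (clamped against rounding).
        j = min(bisect_left(times, t), len(times) - 1)
        if times[j] == t:
            out.append(values[j])
        elif times[j] - times[j - 1] >= max_gap:
            out.append(None)  # gap too long to fill by interpolation
        else:
            w = (t - times[j - 1]) / (times[j] - times[j - 1])
            out.append(values[j - 1] + w * (values[j] - values[j - 1]))
        grid.append(t)
    return grid, out

def average_feet(left, right):
    """Average two per-foot series into one; None where either foot is missing."""
    return [None if a is None or b is None else (a + b) / 2 for a, b in zip(left, right)]

# Hypothetical X positions (metres) of one player's two foot sensors.
times = [0.00, 0.10, 0.20]
left_x = [0.0, 1.0, 2.0]
right_x = [0.2, 1.2, 2.2]
grid, left = resample_linear(times, left_x)
_, right = resample_linear(times, right_x)
player_x = average_feet(left, right)
```

The same functions would be applied to each recorded parameter of each sensor before any correlations are calculated.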

4.2.3 Time window

The match was divided into rectangular time windows of length Ti. Time windows of any length can be used. Shorter time windows use fewer samples and are therefore more prone to noise and less reliable. A longer time window is more reliable, but if it is too long it will not be able to capture correlations that change quickly over time.

In this thesis the length of the time window was set to Ti = 5 seconds, since this length seemed to work well for observing overall patterns.
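As a small illustration, the division into time windows can be sketched in Python. The sketch assumes consecutive, non-overlapping windows and drops a trailing partial window; both are assumptions, since the text only states that rectangular windows of length Ti were used. At 25 Hz a 5 second window holds 125 samples.

```python
def split_windows(samples, rate_hz=25, window_s=5.0):
    """Split a sample list into consecutive, non-overlapping windows.

    A trailing partial window is dropped (an assumption of this sketch).
    """
    size = int(round(rate_hz * window_s))
    return [samples[k:k + size] for k in range(0, len(samples) - size + 1, size)]

samples = list(range(300))  # 12 s of hypothetical samples at 25 Hz
windows = split_windows(samples)
```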

4.2.4 Time lag correlation

Time lagged correlations between each player’s data points were calculated for all time windows in the match [16, 19]. During every such time window, the correlation between the players’ X and Y positions (axes parallel and perpendicular to the pitch respectively), speed and direction was calculated for all time lags t in [0, Tlag]:

C_{i,t} = \mathrm{corr}(P_{i,0}, P_{i,t}) \qquad (7)

where C_{i,t} is the correlation matrix between all players’ parameters for time interval i and time lag t. The function corr(A, B) calculates the Pearson correlation between all of the players’ parameters in two matrices A and B, with parameters as columns and data points as rows. P_{i,t} is a matrix with all players’ parameters as columns and all data points as rows, for time interval i. The data points in the matrix P_{i,t} are shifted t seconds into the future compared to P_{i,0}.

Since the data points are shifted relative to each other, when the current time window is on an edge of the data, the non-shifted and shifted data points at the two edges will not match up with any shifted and non-shifted data points respectively. For this reason, calculating lag correlations with a lag of Tlag seconds removes 2 · Tlag seconds worth of data. This is not a problem when analyzing an entire 30 minute long match. But when analyzing short scenarios, the time lag cannot be too big for this reason.

The returned matrices C_{i,0} are ordinary symmetric correlation matrices between all parameters for each player. However, the matrix C_{i,t}, for t > 0, is no longer symmetric due to the time lag. Row A and column B of the matrix with lag t give the correlation between the current values of A and the future values of B.

For every time window and row in the correlation matrix, a function c(t) is defined which returns the correlation at time lag t. The maximum time lag correlation is defined as ‖c‖, and the maximum time lag, Tmax, is defined as the smallest time lag t that satisfies c(t) = ‖c‖ [16]. If Tmax > 0, the parameter stored in column B is following the parameter in row A with lag Tmax, since future values of B correlate the most with the current values of A at that lag (Figure 11).

Tlag was set to 2 seconds in this report, because football players should rarely base their movement on older position data.

Figure 11: Example of a time lagged correlation curve. There is a maximum correlation with a time lag of 1 second (maximum time lag), where the correlation is 0.8 (maximum time lag correlation). The area below the maximum lag correlation, above the correlation curve and to the left of the maximum time lag is the lag score.
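The time lagged correlation between two single parameters can be sketched in Python as follows. This is a hedged illustration rather than the Matlab implementation of the thesis: it handles one pair of signals instead of full parameter matrices, it truncates the signals inside the window so that every lag uses the same number of samples, and it takes the maximum time lag correlation to be the largest value of c(t). The function names and the synthetic leader and follower signals are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def lagged_correlation(a, b, max_lag):
    """Return c(t) for t = 0..max_lag samples.

    c(t) correlates the current values of a with the values of b shifted
    t samples into the future.
    """
    usable = len(a) - max_lag  # same number of samples for every lag
    return [pearson(a[:usable], b[t:t + usable]) for t in range(max_lag + 1)]

def max_time_lag(c):
    """Return (Tmax in samples, peak): the smallest lag at which c is largest."""
    peak = max(c)
    return c.index(peak), peak

rate_hz = 25
# Synthetic 5 s window: the follower copies the leader 5 samples (0.2 s) later.
leader = [math.sin(0.1 * k) for k in range(125)]
follower = [0.0] * 5 + leader[:-5]
c = lagged_correlation(leader, follower, max_lag=2 * rate_hz)  # Tlag = 2 s
lag_samples, peak = max_time_lag(c)
lag_seconds = lag_samples / rate_hz
```

Here the curve peaks at a lag of 0.2 seconds with a correlation of 1, meaning that the follower signal trails the leader signal by that amount.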

4.2.5 Lag score

As a supplement for measuring the leader-follower behaviour, a lag score was defined (Equation 8) and calculated for each player. The lag score is calculated from the maximum time lag, as the area that lies above the time lag correlation curve but below the maximum time lag correlation (Figure 11). This gives a measurement of how strong the leader-follower relationship is. For example, if the maximum time lag is high but the lag score is very low, the leader-follower relationship can be considered weak.

For every time window and row in the correlation matrix the lag score is calculated with the following equation:

S = \|c\|\,T_{\max} - \int_{0}^{T_{\max}} c(t)\,dt \qquad (8)

where S is the lag score and c(t) is the correlation between two parameters, A and B, where B is shifted into the future compared to A with time lag t. c(t) is defined on the interval [0, Tlag]. Tmax is the smallest time lag t that satisfies c(t) = ‖c‖.
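The lag score can be sketched numerically in Python by following its description as the area below the maximum time lag correlation, above the correlation curve and to the left of the maximum time lag. The sketch assumes that c(t) is sampled at even steps of dt seconds and uses the trapezoidal rule for the area under the curve; the function name and the sample curve are hypothetical.

```python
def lag_score(c, dt):
    """Area between the horizontal line at max(c) and the curve c(t) on [0, Tmax].

    c: correlation values sampled at lags 0, dt, 2*dt, ...
    dt: spacing between lags in seconds.
    """
    peak = max(c)
    t_max = c.index(peak)  # smallest lag index where the peak is reached
    # Trapezoidal estimate of the area under the curve from 0 to Tmax.
    area_under = sum((c[k] + c[k + 1]) / 2 * dt for k in range(t_max))
    return peak * t_max * dt - area_under

# Hypothetical curve peaking at 0.8 with a maximum time lag of 1 s.
curve = [0.2, 0.5, 0.8, 0.6]
score = lag_score(curve, dt=0.5)
```

For this curve the rectangle below the peak has area 0.8 and the area under the curve is 0.5, so the score is 0.3. A curve that peaks at lag 0 gets a score of 0.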

4.2.6 Implementation

Matlab was used for the implementation, using Matlab’s built-in graphical user interface system, GUIDE [15]. The implementation is divided into two parts:

1. Precalculate the correlations for a preset time window function and a preset time lag

2. Render the match in the GUI

The precalculation stage only has to be run once for a given data set, time lag and window function. In the precalculation stage the maximum time lag and lag score are calculated for every time window in the match, and stored in a file.

After the precalculation stage a user may load the stored file using the GUI and observe different parts of the match without the need for additional complex calculations.

Since the maximum time lag is calculated for every time window, the correlations will change abruptly between two subsequent time windows. Linear interpolation is therefore used to calculate intermediate values of the maximum time lag, maximum time lag correlation and lag score between two subsequent time windows. This way the GUI shows gradual changes rather than sudden abrupt ones. However, caution must be taken, especially if the time windows are long, since the values shown between the interpolation points might not be the true values at those points.
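The interpolation between subsequent time windows can be sketched in Python as follows. The sketch assumes that each window’s value is placed at the centre of its window and that times before the first centre or after the last centre are clamped to the nearest window value; the text does not state where the interpolation points are placed, so both are assumptions, and the function name and values are hypothetical.

```python
def interpolate_windows(window_values, window_s, t):
    """Linearly interpolate a per-window quantity at match time t (seconds)."""
    # Fractional window index measured from the window centres.
    pos = t / window_s - 0.5
    if pos <= 0:
        return window_values[0]
    if pos >= len(window_values) - 1:
        return window_values[-1]
    k = int(pos)
    w = pos - k
    return (1 - w) * window_values[k] + w * window_values[k + 1]

# Hypothetical maximum time lags (seconds) for three 5 s windows.
lags = [0.0, 1.0, 0.5]
shown = interpolate_windows(lags, 5.0, 10.0)
```

At t = 10 s the displayed value lies halfway between the second and third windows, giving 0.75 even though neither window had that value, which is the caveat noted above.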

4.3 Results

Figure 12 shows the implemented GUI. The GUI shows the pitch as well as all players and the ball in the center frame. The yellow and red circles in the frame are players from team yellow and red respectively. The blue circle is the ball. The size of a player’s circle shows their correlation with the ball, with bigger circles showing a bigger correlation. A player who has possession of the ball has a smaller black dot in the middle of their circle.

The scrollbar below the frame is used to advance, and show, the current time in the match. The menu options above and beside the match frame are used to show maximum time lag between players, which parameters are used for the correlations and different Voronoi regions.

Figure 12: Visualization GUI showing players from a football match. The yellow and red circles are players from team yellow and red respectively. The blue circle is the ball. Players with bigger circles have a higher correlation with the ball in the Y-axis. The light blue lines show the Voronoi regions of the yellow players, excluding the yellow goalkeeper.

4.3.1 Maximum time lag

The correlation as well as the maximum time lag between the players can be shown by connecting players with an edge. The edge is a double pointed arrow, where the pointedness of the arrow represents the maximum time lag between the players and the color of the edge shows the magnitude and sign of the correlation.

If the correlation between two players is positive the edge will be green, with light green edges showing a high magnitude while darker green edges show a lower magnitude. Negative correlations are shown analogously but using the color blue instead.

The more pointed an edge is, the bigger the maximum time lag is. For example, if an edge is pointing from player A toward player B, player A is following player B (if the correlation is also high). If the edge is not pointed, player A is not following player B. If the edge points at both players, they are both following each other.

The GUI can be changed to show the lag score (Section 4.2.5) between the players, rather than maximum time lag, by choosing Score rather than Delay under the section Lag Type. A higher lag score gives a more pointy edge.

By changing the radio buttons under the sections Origin and Target, correlations between the following parameters can be shown: the players’ and the ball’s position along the X and Y-axis, speed (|v|) and direction (angle). Different parameters can be chosen for Origin and Target. For example, if X is ticked under Origin and Y under Target, the part of the edge closest to player B shows the correlation between the X position of player A and the Y position of player B. If the edge is pointing from player A to player B, the X position of player A is following the Y position of player B.

In Figure 13 the maximum time lag correlations are shown for players of the yellow team (excluding the goalkeeper).


Figure 13: Visualization GUI showing collaborative motion between players in a team from a football match. The yellow and red circles are players from team yellow and red respectively. The blue circle is the ball. Players with bigger circles have a higher correlation with the ball in the X-axis. The edges connecting the yellow players show their correlation and maximum time lag with each other in the X-axis. Light green edges represent a high positive correlation while darker edges represent a correlation closer to zero. The edges are double pointed arrows with variable width depending on how big the time lag is between the players. A pointed edge pointing from player A to player B means that player A is following player B.

4.3.2 Voronoi Regions

The Voronoi regions for the players can be shown in the GUI by ticking the All, Yellow or Red Voronoi boxes, which show the regions for all, yellow or red players respectively (excluding goalkeepers). The red and yellow goalkeepers as well as the ball can be included if the corresponding boxes are ticked as well. For example, Figure 12 shows the Voronoi regions for the yellow team.

4.4 Discussion

The implemented GUI works well for observing the entire match and for quickly changing which parameters or players should be looked at more closely at a given moment. This eases the process of finding scenarios that are interesting to analyse further and helps humans understand the data.

For example, the GUI can be used to see in which scenarios the players are more correlated than in other scenarios and which players are more correlated with whom and how it changes over time. See Appendix B for example observations.

However, in football the ball is probably the most important object, and its position and velocity directly change the overall state of the match and the response of the players. In its current state the GUI is not able to remove the effect the ball has on the players, i.e. even if the players are correlated with each other, they might all just be following the ball rather than each other.

For this reason, in a future version of the GUI, a user should be able to remove the influence of the ball from the player-player correlations.

5 Conclusions

The conclusion of this project is that the GPS units do work if the experiments are set up in the correct way. For example, doing the experiment with more players on a bigger pitch would work better, since the GPS error would not be as significant as in a smaller three versus three match.

It was found that players who performed well in the ball control experiment also seemed to have better ball control in the match and scored more goals. However, these players also took the center position in the team, a position which correlated positively with the number of received passes and ball possession. Also, no significant correlation between the final match results and any other parameter was found, so it is not certain whether having players with good ball control actually improves the team’s performance. This would not have been a problem if the players’ positions had been predetermined. To make the data more reliable in general, more than 15 players should have been used for the experiments and the players should have played more and longer matches. With such data it would be possible to find out whether there actually is any relationship between the line experiment and the collective motion in the match.

The visualization tool works well for observing the collective motion between the players and the ball, but it is also a tool that can be developed much further. In the next version of the GUI, the ability to remove the influence of the ball on the players’ movements should be implemented so that the player relationships can be studied more directly.


6 Future Outlook

There are countless experiments that can be done using GPS and football players to further analyze the connections between different statistics and find training exercises that prepare players for a match. Some ideas:

1. Correlations between players’ movements and the intersections of Voronoi regions could be analyzed to see if players who follow these regions receive more passes or not.

2. See if different players in a full size team correlate differently than their team members depending on their position.

3. Measuring the maximum time lag between position correlations during a match or the line experiment to see if some players assume leader-follower roles.

4. Letting two players each run along one of two perpendicular lines, where one player’s goal is to get as far away from the other player as possible while the other has to follow. This can be done with the chased player running both with and without a ball.

The visualization tool is a start to a deeper analysis of collective motion in football. The data set used for the visualization is a good data set to also perform some analysis like those done on the youth football team. Other than that it is a tool that can be developed with many more features which would help users to observe specific scenarios or show the collective motion between agents from something entirely different than football.

Appendix A Movie examples of collective motion of a youth football team

This video shows the GPS positions of the football players during the match. The correlations between the players’ movements are also shown.

Link: https://www.youtube.com/my_videos?o=U

Appendix B Movie examples showing the visualization

From time 0.00 to 1.04 the yellow team is defending and is an example of a defending team having a high correlation.


Link: https://youtu.be/dRLr-lBR2-Y

From time 0.00 to 1.04 the red team is attacking and is an example of an attacking team being less correlated.

Link: https://youtu.be/vAosLGGLXyU

At time 0.19 there is an example of an attacking team getting more correlated as they begin to advance.

Link: https://youtu.be/B_i98WdH0Pk

References

[1] ACM DEBS 2013 Grand Challenge. Data description, 2013. [Online; http://www.orgs.ttu.edu/debs2013/index.php?goto=cfchallengedetails; accessed 9-September-2015].

[2] Micael S. Couceiro, Filipe M. Clemente, Fernando M. L. Martins, and José A. Tenreiro Machado. Dynamical stability and predictability of football players: The study of one match. Entropy, 16:645–674, January 23 2014.

[3] Ricardo Duarte, Duarte Araújo, Vanda Correia, and Keith Davids. Sports teams as superorganisms; implications of sociobiological models of behaviour for research and practice in team sports performance analysis. Sports Medicine, 2012.

[4] Ricardo Duarte and Telmo Frias. Collective intelligence: An incursion into the tactical performance of football teams. First International Conference in Science and Football, April 1 2011.

[5] Jordi Duch, Joshua S. Waitzman, and Luís A. Nunes Amaral. Quantifying the performance of individual players in a team activity. PLoS ONE 5(6): e10937. doi:10.1371/journal.pone.0010937, June 16 2010.

[6] Hugo Folgado, Ricardo Duarte, Pedro Marques, and Jaime Sampaio. The effects of congested fixtures period on tactical and physical performance in elite football. Journal of Sport Sciences, March 13 2015.

[7] Hugo Folgado, Koen A. P. M. Lemmink, Wouter Frencken, and Jaime Sampaio. Length, width and centroid distance as measures of teams tactical performance in youth football. European journal of sports science, 14:S487–S492, October 12 2012.

[8] Nobuyoshi Fumoto and Koji Kumagai. Does a player whose ball juggling skill is the best shows the best ability in a soccer game?: A consideration of the validity of skill tests from a new viewpoint keeping utility in mind. Football Science, 11:18–28, January 9 2014.

[9] GPSports. SPI HPU brochure, 2012. [Online; http://home.gpsports.com/wp-content/uploads/2013/08/SPI_HPU_2013.pdf; accessed 16-September-2015].

[10] Thomas U. Grund. Network structure and team performance: The case of english premier league soccer teams. Elsevier: Social Networks, 2012.

[11] S. Kim. Voronoi analysis of a soccer game. Nonlinear Analysis: Modelling and Control, 9, August 30 2004.

[12] Geoffrey R. Loftus and Elizabeth F. Loftus. Essence of Statistics (Alfred A. Knopf series in psychology). Knopf, New York, NY, US, 2 edition, June 8 1998.

[13] Prozone Sports Ltd. About, 2015. [Online; http://www.prozonesports.com/about/; accessed 16-September-2015].

[14] R. Marcelino, M. Nagy, B. Gonçalves, and J. Sampaio. Quantitative analysis of leader-follower interactions between football players. International Congress of Exercise and Sports Performance, November 14-15 2014.

[15] MathWorks. Create a simple UI using GUIDE. [Online; http://se.mathworks.com/help/matlab/creating_guis/about-the-simple-guide-gui-example.html; accessed 15-September-2015].

[16] Benjamin Pettit, Andrea Perna, Dora Biro, and David J. T. Sumpter. Interaction rules underlying group decisions in homing pigeons. Journal of the Royal Society Interface, September 25 2013.

[17] Wong Pui-Lam, Chamari Karim, Dellal Alexandre, and Wisløff Ulrik. Relationship between anthropometric and physiological characteristics in youth soccer players. Journal of Strength and Conditioning Research, pages 1204–1210, July 2009.

[18] QStarz International CO., Ltd. GPS BT Q1300ST specification, 2013. [Online; http://www.qstarz.com/Products/GPS%20Products/BT-Q1300ST-S.htm; accessed 9-September-2015].

[19] Shyam Ranganathan, Viktoria Spaiser, Richard P. Mann, and David J. T. Sumpter. Bayesian dynamical systems modelling in the social sciences. PLoS ONE 9(1): e86468. doi:10.1371/journal.pone.0086468, January 20 2014.

[20] Hugo Sarmento, Rui Marcelino, M. Teresa Anguera, Jorge Campaniço, Nuno Matos, and José Carlos Leitão. Match analysis in football: a systematic review. Journal of Sport Sciences, May 1 2014.

[21] Mingzhou Song and J. Bilmes. Lecture 9: Upsampling and downsampling, 2001. [Online; http://melodi.ee.washington.edu/courses/ee518/notes/lec9.pdf; accessed 13-September-2015].

[22] Zengyuan Yue, Holger Broich, Florian Seifriz, and Joachim Mester. Mathematical analysis of a soccer game. part i: Individual and collective behaviors. Studies in Applied Mathematics, 121:223–243, October 1 2008.

[23] Zengyuan Yue, Holger Broich, Florian Seifriz, and Joachim Mester. Mathematical analysis of a soccer game. part ii: Energy, spectral, and correlation analyses. Studies in Applied Mathematics, 121:245–261, October 1 2008.
