Measuring a Platoon Commander's Performance in a Complex, Dynamic and Information Rich Environment

Academic year: 2021

Linköping University | Department of Computer and Information Science
Master's Thesis, 30 ECTS | Cognitive Science
Spring Term 2021 | LIU-IDA/KOGVET-A--21/009--SE

Measuring a Platoon Commander's Performance in a Complex, Dynamic and Information Rich Environment

Alexander Melbi

Supervisor: Björn Johansson
Examiner: Arne Jönsson


Copyright

The publishers will keep this document online on the Internet – or its possible replacement – for a period of 25 years starting from the date of publication barring exceptional circumstances.

The online availability of the document implies permanent permission for anyone to read, to download, or to print out single copies for his/her own use and to use it unchanged for non-commercial research and educational purposes. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional upon the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.

According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.

For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its www home page: http://www.ep.liu.se/.


Abstract

Command and control (C2) environments are complex, dynamic and rich in information. Thus, measuring the performance of an agent in a C2 system, in this case a platoon commander, poses a challenging task for the researcher. In this thesis, the OODA loop is used as a model representing the four processes in which the platoon commander is engaged during a military C2 mission. In accordance with these processes, performance measurements for the platoon commander are identified. The relevance of the performance measurements to the C2 tasks and goals of the platoon commander is tested through three studies conducted in a simulated warfare scenario and two workshops, one with a platoon commander and one with two scientists. As a result of the studies and workshops, an assessment tool for measuring the C2 tasks and goals of the platoon commander is developed. This assessment tool consists of modified versions of the Crew Awareness Rating Scale (CARS), the Situation Awareness Global Assessment Technique (SAGAT) and the NASA-Task Load Index (NASA-TLX), as well as generic performance measurements measuring fratricide, deaths and completion of the overarching goal.


Acknowledgment

First and foremost, I would like to give my sincerest appreciation to the Swedish Defense Research Agency in Linköping for giving me the opportunity to be part of such an exciting and hands-on project. I would like to thank Björn Johansson, my supervisor, for his theoretical knowledge and for helping this thesis stay on course. A special thanks to Per-Anders Oskarsson for giving me feedback and advice on my thesis. Thank you to Kristofer Bengtsson and all the other employees at the Swedish Defense Research Agency who participated in and orchestrated the studies. Lastly, thank you to Malin for making this thesis possible.

Alexander Melbi, Linköping, June 2021


Contents

1 Introduction
  1.1 The Problem
  1.2 Purpose
  1.3 Research Questions
  1.4 Delimitation
2 Theory
  2.1 Command and Control
    2.1.1 The OODA Loop
    2.1.2 The DOODA Loop
    2.1.3 Contextual Control Model
  2.2 Measuring Performance of the Platoon Commander
    2.2.1 Measuring Performance in First Person Shooter Games
    2.2.2 Measuring Performance in Simulated Environments
  2.3 Situation Awareness
    2.3.1 Endsley's Three-Level Model
  2.4 Measuring SA in C2 Environments
    2.4.1 Situation Awareness Requirement Analysis
    2.4.2 Crew Awareness Rating Scale
    2.4.3 Situation Awareness Global Assessment Technique
    2.4.4 NASA-Task Load Index
  2.5 Selection of C2 Model
3 General Method
  3.1 Scenario
  3.2 Virtual Battle Space 3
  3.3 Observation
  3.4 Semi-structured Interviews
  3.5 Initial Method and Adaptations to Methods
    3.5.2 Adaptions of CARS
    3.5.3 Adaptions of SAGAT
    3.5.4 Adaptions of NASA-TLX
    3.5.5 Adaptions of SARA
  3.6 Research Ethics
4 Workshop with Platoon Commander
  4.1 Method
    4.1.1 Participant
    4.1.2 Materials
    4.1.3 Procedure
    4.1.4 Approach for Analysis
  4.2 Results from Workshop
5 Study 1
  5.1 Method
    5.1.1 Participants
    5.1.2 Apparatus
    5.1.3 Materials
    5.1.4 Procedure
  5.2 Feedback Session with Platoon Commander
    5.2.1 Materials and Procedure
    5.2.2 Results from Feedback Session with Platoon Commander
    5.2.3 Analyzing the Results
  5.3 Changes after Study 1
6 Study 2
  6.1 Method
    6.1.1 Participants
    6.1.2 Apparatus
    6.1.3 Materials
    6.1.4 Procedure
  6.2 Feedback Session with Platoon Commander
    6.2.1 Materials and Procedure
    6.2.2 Results from Feedback Session with Platoon Commander
    6.2.3 Analyzing the Results
7 Workshop with Scientists
  7.1 Method
    7.1.1 Participants
    7.1.2 Materials
    7.1.3 Procedure
    7.1.4 Analyzing the Results
  7.2 Results from Workshop
8 Study 3
  8.1 Method
    8.1.1 Participants
    8.1.2 Apparatus
    8.1.3 Materials
    8.1.4 Procedure
  8.2 Generic Performance Measurement Review Session
    8.2.1 Selection of Generic Performance Measurements
    8.2.2 Materials and Procedure
    8.2.3 Results from Generic Performance Measurement Review Session
    8.2.4 Analyzing the Results
  8.3 Feedback Session with Platoon Commander
    8.3.1 Materials and Procedure
    8.3.2 Results from Feedback Session with Platoon Commander
  8.4 Changes after Study 3
    8.4.1 Generic Performance Measurements
    8.4.2 Questionnaires
    8.4.3 Final Version of Assessment Tool
9 Discussion
  9.1 Discussion of Performance Measurements
    9.1.1 CARS
    9.1.2 SAGAT
    9.1.3 NASA-TLX
  9.2 The Generic Performance Measurements
  9.3 Selection of Meister's Criteria
  9.4 Research Questions
    9.4.1 Research Question 1
    9.4.2 Research Question 2
    9.4.3 Research Question 3
10 Conclusions
Bibliography


Translations

Company Commander: Kompanichef
Platoon Commander: Plutonchef
Company: Kompani
Platoon: Pluton
Command and Control: Ledning
Command and Control Support Systems: Ledningsstödsystem
Swedish Defense Research Agency: Totalförsvarets forskningsinstitut (FOI)
Swedish Armed Forces: Försvarsmakten
IFV: Pansarskyttefordon (in this thesis, vagn)

Acronyms

KLASS: Konsekvenser för Ledning av Autonoma Samverkande System
LASSIE: Ledning av Autonoma och Sammansatta System med Intelligenta Enheter
C2: Command and Control
VBS3: Virtual Battle Space 3
IFV: Infantry Fighting Vehicle
CARS: Crew Awareness Rating Scale
SAGAT: Situation Awareness Global Assessment Technique
NASA-TLX: NASA-Task Load Index
SARA: Situation Awareness Requirement Analysis
OODA: Observe, Orient, Decide, Act
FPS: First Person Shooter

Definitions

Performance Measurement: Term for any measurement measuring performance, regardless of domain.
Generic Performance Measurement: Measurements retrieved from studies conducted in FPS games and simulated environments, e.g. fratricide, deaths and kills.
Performance Assessment Tool: The performance assessment tool developed in this thesis, including the CARS, SAGAT and NASA-TLX questionnaires as well as the generic performance measurements.


1. Introduction

This thesis is part of the project Konsekvenser för Ledning av Autonoma Samverkande System (KLASS), conducted at the Swedish Defense Research Agency (sv. Totalförsvarets forskningsinstitut) in Linköping. The aim of KLASS is to investigate how command and control (C2, sv. ledning) is affected by the implementation of ground-based and aerial systems with autonomous or intelligent abilities. KLASS is a continuation of the project Ledning av Autonoma och Sammansatta System med Intelligenta Enheter (LASSIE). LASSIE started in 2018 with the purpose of investigating how joint systems, consisting of autonomous or intelligent systems, can be conceived to affect C2 and C2 support systems (sv. ledningssystem) in the future (Johansson et al., 2019). This thesis focuses on the effects on the C2 tasks and goals of the platoon commander. In order to measure the effects of autonomous and intelligent systems on C2, baseline values of platoon commander performance without an autonomous or intelligent system must be defined. Thus, one of the aims of KLASS, and of this thesis, is to create this baseline.

The studies in LASSIE and KLASS are performed in the simulated environment Virtual Battle Space 3 (VBS3). Simulated environments are used across a wide array of domains such as spacecraft, marine transportation, aviation, C2, and tactical surface warfare (Jones et al., 1985). Compared to real-world environments, the advantages range from training opportunities for users in a safe environment to cost efficiency and improved human safety in terms of reduced accident rates, equipment or system damage, aborted missions, and equipment or system failures (Thompson et al., 2009). Although the advantages surpass the disadvantages, there are challenges facing the use of simulated environments. One of the challenges of conducting simulation-based training (SBT) in the military domain is the lack of clear performance measurements (Seibert et al., 2011).

1.1 The Problem

C2 environments are usually very complex, dynamic, and rich in information (Salmon et al., 2006). Therefore, developing methods for measuring C2 performance poses a challenge for the researcher. Today, C2 systems are regarded as complex sociotechnical systems (Riley et al., 2006). Systems operating in C2 environments often perform their tasks in contexts defined as complex, rapidly changing, uncertain, time constrained, and high risk, where poor performance could result in costly or disastrous consequences (Rasker et al., 2000). The common denominator of C2 systems, regardless of domain, is the need to perceive, interpret, and exchange large amounts of ambiguous information in order to perform successful decision-making (Riley et al., 2006).

In order to measure the performance of a C2 system, the researcher needs to develop an assessment tool for measuring performance. Meister (1985) presents eight different criteria that the researcher can consider when selecting such an assessment tool: effectiveness, ease of use, cost, flexibility, range, validity, reliability, and objectivity. Effectiveness concerns the extent to which the method accomplishes its purpose. Ease of use is defined by how easy the method is to carry out. Cost concerns several aspects in addition to monetary costs, such as data requirements, equipment needs, personnel, and the time needed to apply the method. Flexibility describes whether the method can be used in many different contexts, with different system types, at several system levels. Range concerns the number of phenomena, behaviors and events that the method can analyze or measure. Validity is defined by the extent to which the method measures what it is intended to measure. Reliability concerns whether the method provides similar results when applied to the same phenomena. Finally, objectivity describes the extent to which the method is independent of the researcher's subjective biases, feelings and interpretations.

1.2 Purpose

The purpose of this thesis is to develop performance measurements related to a platoon commander serving in the Swedish Armed Forces (sv. Försvarsmakten). These performance measurements should adhere to Meister's (1985) criteria of effectiveness, ease of use, cost, flexibility, and objectivity. The criteria range, validity and reliability will not be adhered to: as this thesis is limited to one scenario, Meister's definition of range is not applicable, and since the number of participants is not sufficient to evaluate the validity and reliability of the measurements, these criteria will not be controlled for.¹ Below, further descriptions of the criteria used are presented:

• Effective in the sense that the measurements actually assess performance related to the platoon commander's tasks and goals. That is, performance that can be obtained and measured from the scenario but does not have relevance to the tasks and goals of the platoon commander should not be prioritized.

• Easy to use in that the measurements do not require training to be administered. Also, the measurements should not rely on subject matter experts (SME:s) for assessing the results. By not relying on SME:s, the results from measurements become easier for the researcher to evaluate.

• Cost effective, meaning that the measurements used to assess the performance of the platoon commander should have a low application time to avoid personnel-related costs. The measurements developed in this thesis aim to be applied in future studies at the Swedish Defense Research Agency. These studies place higher demands on the number of participants involved, as well as using participants from the Swedish Armed Forces. Also, the materials needed for collecting performance data should not be too expensive.

• Flexible in the sense that the measurements could potentially be applied to different simulated scenarios.

• Objective, meaning that the measurements do not rely on the researcher's subjective biases, feelings or interpretations for assessing the results.

¹ As the measurements used in this thesis are well-established measurements, their validity and reliability have previously been evaluated.


1.3 Research Questions

The following research questions are posed in this thesis:

1. What model of cognition can be used to represent the C2 tasks and goals of the platoon commander?

2. What performance measurements, related to the C2 tasks and goals of the platoon commander, can be identified from such a model?

3. Could one assessment tool, with regards to the platoon commander's C2 tasks and goals, be developed from such a model?

1.4 Delimitation

This thesis is limited to a specific scenario in a simulated environment. The scenario is described in further detail in section 3.1. Furthermore, this thesis develops performance measurements related to the C2 tasks and goals of a platoon commander, therefore the outcome of the platoon


2. Theory

This chapter presents the theoretical frameworks and performance measurements used to investigate how the performance of a platoon commander can be assessed. The chapter begins with a section on command and control (C2) and how it relates to the models of cognition presented in this thesis, including the OODA and DOODA loops as well as the Contextual Control Model. After the sections on C2 and the models of cognition, sections describing different performance measurements are presented. These sections consist of two parts: (1) a section on generic performance measurements retrieved from first person shooter games and simulated environments, followed by (2) a section introducing Situation Awareness (SA), including two SA measurement techniques, the Crew Awareness Rating Scale (CARS) and the Situation Awareness Global Assessment Technique (SAGAT), and finishing with a section describing a workload measurement technique, the NASA-Task Load Index (NASA-TLX).

2.1 Command and Control

Many different varieties of C2 have been developed. These include: Command, Control and Communications (C3); Command, Control, Communications and Intelligence (C3I); Command, Control, Communications, Computers and Intelligence (C4I); as well as Command, Control, Communications, Computers, Intelligence, Surveillance and Reconnaissance (C4ISR). The common denominator of these definitions is that C2 requires organized management of several components of the sociotechnical system, including personnel, communications, procedures, equipment and facilities. These components need to be organized in order to plan, direct, coordinate, and control operations which ultimately lead to the achievement of organizational objectives (Wallenius, 2002).

In short, it can be concluded that most of these definitions agree that C2 constitutes the process or function that directs or controls systems in order to achieve certain effects. In order to measure a C2 system's performance, in this thesis a military C2 system, the components of this system need to be more closely defined by a model of cognition. Three models of cognition are presented in the following sections.

2.1.1 The OODA Loop

OODA, also called the OODA loop, stands for Observe, Orient, Decide, Act (for a simplified version of OODA, see figure 2.1). OODA is a cyclical model consisting of four processes that interact with the environment. Originally developed by Boyd (1987a) for observing and examining decision-making in fighter pilots (Révay & Líška, 2017), OODA has developed far beyond its original use. It is the most dominant military C2 model and is included as a doctrine for all four branches of the US Armed Forces, as well as occurring in the Swedish Armed Forces (Brehmer, 2006). The OODA processes are performed by an agent that interacts competitively with other agents in an environment; these competing agents are also seen as operating in accordance with the OODA processes (Grant & Kooter, 2005).

The observe process describes collecting information about the environment through interacting, sensing and receiving information. When observing, the agent is guided and controlled by the orient process and receives feedback from the decide and act processes. Orient is the most complex of the four processes. It represents the images, views or impressions of the world, which are the result of a complex setup of genetic heritage, cultural predispositions, personal experiences and knowledge (Révay & Líška, 2017). "It [orient] shapes the way [...] we observe, the way we decide, the way we act." (Boyd, 1987b) (as cited in Grant and Kooter, 2005, underlining in original). The decide process constitutes the choice between a number of hypotheses in order to produce the most appropriate response to the specific environmental situation in which the agent is located. Decisions are guided by internal feedforward from the orient process and provide internal feedback to the observe process. Finally, the act process describes the testing phase, which occurs when a hypothesis has been selected. Act is the part of the OODA loop that interacts with the environment. It operates by receiving guidance from the orient process and feedforward information from the decide process, while providing feedback to the observe process.

Figure 2.1: Simplified OODA loop
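Read as a control loop, the four processes and their feedback and feedforward relations admit a minimal executable sketch. The four process names are Boyd's; everything flowing between them here (the dictionaries and the placeholder selection rule) is an illustrative assumption, not part of the model:

```python
# One pass through a simplified OODA loop. The four process names are
# Boyd's; the data formats and selection rule are illustrative only.

def observe(environment, orientation):
    """Collect information from the environment, guided by orientation
    (and, in the full model, by feedback from decide and act)."""
    return {"events": list(environment.get("events", [])), "guided_by": orientation}

def orient(observation, experience):
    """Interpret observations through experience, culture and knowledge."""
    return {"picture": observation["events"], "experience": experience}

def decide(orientation):
    """Choose one hypothesis (course of action) from the current picture."""
    options = orientation["picture"] or ["wait"]
    return options[0]  # placeholder rule: respond to the first event

def act(decision, environment):
    """Test the chosen hypothesis; acting changes the environment."""
    environment.setdefault("actions", []).append(decision)
    return environment

def ooda_cycle(environment, orientation=None, experience=None):
    """One full Observe-Orient-Decide-Act iteration."""
    observation = observe(environment, orientation)
    oriented = orient(observation, experience or [])
    decision = decide(oriented)
    return act(decision, environment), decision
```

Running `ooda_cycle({"events": ["contact east"]})` returns the updated environment and the chosen action; the point of the sketch is only the cyclical data flow, not the (trivial) decision rule.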

2.1.2 The DOODA Loop

The DOODA loop, or the Dynamic OODA loop, was first presented by Brehmer (2005) as an attempt to mitigate the shortcomings of the OODA loop and cybernetic models (Brehmer, 2006). The aim of the DOODA loop is to create a more robust and general model of C2 systems. It does this mainly by (1) including a representation of effects, like other cybernetic models, (2) including concepts that are required in order to separate from the reactive nature of C2 models, and (3) providing representations of functions required for C2 (ibid.).


Figure 2.2: The DOODA loop

According to the DOODA loop, a C2 system can be seen as having sensors as well as three main functions: information collection, sensemaking and planning (see figure 2.2). The information collection function collects data and receives feedback from the sensemaking function, which in turn directs the information search. Sensemaking is described as the function that produces an understanding of the mission; it focuses on what should be done in the current situation. Its inputs are the mission and the collected information. The planning function converts the output of the sensemaking function into orders, which are considered to be the most important output of a C2 system. These orders are then converted into military activity, which is filtered through frictions and results in effects on the battlefield. These effects are then assembled by the information collection function via sensors (Brehmer, 2006).
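The chain just described (sensemaking produces understanding, planning produces orders, orders become activity filtered through frictions, and sensors feed the resulting effects back into information collection) can be sketched as a simple pipeline. The function names mirror the DOODA functions; the data formats and the proportional friction rule are invented for the example:

```python
# Sketch of one DOODA pass. Function names follow the model's functions;
# all data formats and the friction rule are illustrative assumptions.

def sensemaking(mission, collected_info):
    """Produce an understanding of what should be done in the situation.
    Its inputs are the mission and the collected information."""
    return {"mission": mission, "picture": list(collected_info)}

def planning(understanding):
    """Convert understanding into orders, the key output of a C2 system."""
    return [f"order: address {item}" for item in understanding["picture"]]

def military_activity(orders, friction):
    """Orders become activity; frictions (0..1) reduce the effects achieved.
    The proportional filtering rule is invented for this example."""
    if not orders:
        return []
    kept = max(1, round(len(orders) * (1 - friction)))
    return orders[:kept]

def information_collection(effects, sensor):
    """Sensors assemble the effects back into collected information."""
    return [e for e in effects if sensor(e)]

def dooda_cycle(mission, collected_info, friction=0.2):
    """One pass around the loop: orders, battlefield effects, new info."""
    understanding = sensemaking(mission, collected_info)
    orders = planning(understanding)
    effects = military_activity(orders, friction)
    new_info = information_collection(effects, sensor=lambda e: True)
    return orders, effects, new_info
```

The design point the sketch makes is that, unlike OODA, the loop closes through the battlefield: what comes back to information collection is not the orders but their (friction-reduced) effects.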

In summary, the DOODA loop can be seen as taking the simpler OODA loop, which focuses more on individual C2 functioning, adding more detail to the C2 functions, and raising it to a system-oriented level.

2.1.3 Contextual Control Model

The Contextual Control Model (COCOM) is a cyclical model that describes how units of analysis, comprising individuals and organizations, referred to as Joint Cognitive Systems (JCS:s), maintain control (see figure 2.3). It differs from sequential models in its ability to study complex systems. When studying complex systems, COCOM has several advantages compared to sequential models: seeing the user as a part of a process, being functional rather than structural, combining feedback with feedforward, and being contextual rather than sequential (Hollnagel & Woods, 2005). By being a contextual model, the context of the situation in which the JCS is located determines the next action. This can be compared to sequential models, where actions are derived from a pre-existing pattern (ibid.).

Figure 2.3: The COCOM model

COCOM consists of three main constituents: competence, control, and constructs. Competence represents the set of possible actions or responses that a JCS can conduct in order to meet the specific needs and demands of the situation. Control describes the orderliness with which competence is applied. Constructs describe the understanding of the situation by the JCS (Hollnagel & Woods, 2005).

A primary feature of COCOM is its control modes. The control modes describe the different modes of control that a JCS can employ, ranging from scrambled to strategic. When the JCS is in the scrambled mode, the choice of the next action is essentially random; little, if any, consideration is taken of the context. The next control mode is opportunistic; in this mode the JCS takes some consideration of contextual factors when choosing the next action. Planning and anticipation are however limited, and the approach is mostly trial-and-error. The opportunistic control mode is followed by the tactical mode. Here, the JCS more or less follows known procedures or rules, and the time horizon is stretched beyond current needs. Planning is however of limited scope or range, and the considered needs may sometimes be ad hoc. Finally, in the strategic control mode the JCS has a longer time horizon and focus can be placed on higher-level goals. The interaction between multiple goals is taken into consideration when planning, and outcomes are deemed successful if goals are achieved at the proper time without jeopardizing other goals (Hollnagel & Woods, 2005).
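Because the control modes form an ordered scale, they can be encoded naturally as an ordered enumeration. The mode names and their ordering come from COCOM; the time thresholds in the helper below are invented purely to illustrate how a JCS might be classified under time pressure, and are not part of the model:

```python
from enum import IntEnum

class ControlMode(IntEnum):
    """COCOM control modes, ordered from least to most orderly control."""
    SCRAMBLED = 0      # next action essentially random; context ignored
    OPPORTUNISTIC = 1  # some contextual consideration; mostly trial-and-error
    TACTICAL = 2       # known procedures/rules; horizon beyond current needs
    STRATEGIC = 3      # long horizon; multiple higher-level goals weighed

def control_mode_for(available_time_s: float) -> ControlMode:
    """Illustration only: control degrading as time pressure grows.
    The thresholds are invented for this example, not part of COCOM."""
    if available_time_s >= 600:
        return ControlMode.STRATEGIC
    if available_time_s >= 60:
        return ControlMode.TACTICAL
    if available_time_s >= 10:
        return ControlMode.OPPORTUNISTIC
    return ControlMode.SCRAMBLED
```

Using `IntEnum` keeps the ordinal relation explicit, e.g. `ControlMode.SCRAMBLED < ControlMode.STRATEGIC` holds, matching the scale from least to most orderly control.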


COCOM is not as widely used in military C2 operations as OODA and DOODA; examples are therefore rarer. One example of COCOM in a military C2 operation was presented by Banbury et al. (2008). In their study, Situation Awareness (SA) requirements were identified for Tactical Army Commanders engaged in time-critical C2 operations such as convoy escorts, checkpoint security, combat, and cordon and search. The SA requirements were then categorized in accordance with COCOM in order to develop decision aids and other forms of decision support techniques for C2 operations.

2.2 Measuring Performance of the Platoon Commander

There are multiple measurements that can be applied to measure the performance of a platoon commander. The measurements used in this thesis are therefore divided into two parts: (1) generic performance measurements derived from studies conducted in First Person Shooter (FPS) games and other simulated environments, and (2) performance measurements based on SA (CARS and SAGAT) and workload (NASA-TLX). These identified measures were then used as a basis for selecting which measures could be applied to the platoon commander operating in the scenario of this thesis.

The reason for using studies of FPS games to compile generic performance measurements for the platoon commander is that video games are a popular instructional tool for creating virtual training environments. Additionally, FPS games provide the player with a holistic game experience by removing player representations, such as avatars, and putting the player in a first-person perspective. This allows the player to fully identify with the character, which is represented only through a weapon and/or hands (Grimshaw et al., 2008).

2.2.1 Measuring Performance in First Person Shooter Games

Research from FPS games such as Counter-Strike: Global Offensive (CS:GO) shows that objective performance can be measured by giving points for either killing an opponent or assisting a teammate's kill (Hopp & Fisher, 2017). Other studies conducted in FPS games measure performance in terms of the number of opponents killed, the number of deaths suffered by the player, and assists of teammates' kills. The number of kills can then be compared to the total number of kills performed by the team. This comparison provides a metric of the player's kill participation, e.g. if player X made 25 kills and the team made 100 kills, player X's kill participation is 0.25 (or 25%). The comparison can be repeated for the number of assists or deaths. The reason for providing this metric is to indicate what role a player has in a team, i.e. kills many opponents, supports the team through assists, or dies multiple times (Shim et al., 2011). Klimmt et al. (2009) focused on players' objective performance in the FPS game Unreal Tournament 2 by recording the number of enemies killed and the number of times the player died during 10 minutes of play time. Finally, subjective measures have also been employed: through self-relative performance assessment, participants were asked to evaluate their own in-game performance using anchor terms such as "very bad/very good", "very ineffectively/very effectively" and "unsuccessfully/successfully" (Hopp & Fisher, 2017).
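The kill-participation arithmetic is simple enough to state as code; the function and field names below are mine for illustration, not taken from the cited studies:

```python
def kill_participation(player_kills: int, team_kills: int) -> float:
    """Share of the team's total kills credited to one player."""
    return player_kills / team_kills if team_kills else 0.0

def role_profile(kills: int, deaths: int, assists: int, team_kills: int) -> dict:
    """Per-player metrics used to indicate the player's role in the team."""
    return {
        "kills": kills,
        "deaths": deaths,
        "assists": assists,
        "kill_participation": kill_participation(kills, team_kills),
    }

# The worked example from the text: 25 player kills out of 100 team kills
# gives a kill participation of 0.25 (25%).
example = role_profile(kills=25, deaths=8, assists=12, team_kills=100)
```

The same division can be repeated over assists or deaths to profile whether a player mainly kills, supports, or dies often.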

In summary, the unique performance measurements identified from FPS games include:

• Enemies killed.
• Assisting a teammate's kill.
• Number of deaths suffered by the player.
• Kill participation.
• Subjective self-relative performance.

2.2.2 Measuring Performance in Simulated Environments

Stevens (2014) measured the performance of aerial gunners in helicopters through the Non-Rated Crew Member Manned Module (NCM3). Performance was measured in terms of the number of enemies neutralized. Elliott et al. (1999) describe how performance in aircraft mission training conducted in a high-fidelity simulation can be measured. This is done by dividing a mission into three phases: phase I (pre-mission briefing), phase II (mission execution) and phase III (mission debriefing). Performance in phase I was measured by: the development of mission aids, formulation of contracts internal to the team, formulation of contracts external to the team, and pre-briefing of pilots. Phase II assessed performance by: communication in accordance with a standardized communication protocol, communication in support of situation awareness and the "big picture", and mission execution. Performance in phase III was assessed by: reconstruction of team engagement, evaluation of team objectives, review of equipment issues, review of team mission execution, and review of information exchange.

In other studies, in which a different setup had been used, the player controls the character with a mouse and keyboard from an FPS perspective. Here, participants' performance was measured through the number of targets hit in a virtual firing range in Virtual Battle Space 1 (VBS1; Orvis et al., 2008). Maxwell and Zheng (2017) and Maxwell et al. (2016) used a similar setup, but different performance measures were employed. Training of infantry soldier skills was assessed by how the participants reacted to indirect fire while dismounted, shouted "incoming" in a loud recognizable voice, reacted to the instruction of the leader and looked for guidance, sought the nearest cover, assessed the situation, reported the situation to the leader, and continued the mission. These performance measurements were then assessed by Subject Matter Experts (SME:s) through four rating categories: "needs improvement", "adequate", "successful", and "excels". SME:s have also been used to assess the performance of expected squad behaviors (Ross et al., 2016).

Guides for debriefing performance in military training and mission-based applications in Virtual Battle Space 2 (VBS2) have also been developed (Green et al., 2011). The guides include mission rehearsal and familiarization, tactical training, vehicle checkpoints, procedural training for unmanned aerial vehicle (UAV) operators, and cultural awareness training. By using the guide, training facilitators can more effectively identify infantry soldiers' performance deficiencies, improve soldiers' understanding of new tactics, techniques and procedures, and enhance the effectiveness of the exercises performed in VBS2. The guide can then be used to analyze what happened, why it happened, and how it can be performed better by the participants and other people related to the mission (ibid.).

Maraj et al. (2017, 2016) and Hurter et al. (2016) employed a version of Kim's Game to help train soldiers in behavior cue detection. Kim's Game, originally an observational training game that involves memorizing objects and later recalling them, can be customized for use in military training tasks. This military version of Kim's Game is used to enhance soldiers' skills in observing an environment more critically, memorizing rapidly, and deepening descriptive skills. The task for the soldiers playing Kim's Game is to identify aggressive and nervous kinesic behavior cues of AI-controlled units, such as clenching of fists and wringing of hands. Participants' performance is then rated based on detection accuracy (the number of correctly identified behaviors of nervousness or aggressiveness), false positive detection (incorrectly identifying nervousness or aggressiveness), and response time (the amount of time taken for participants to react to behavior cues).

In summary, the unique performance measurements identified from simulated environments include:

• Numbers of enemies neutralized.

• Phase I (pre-mission briefing): development of mission aids, formulation of contracts internal to the team, formulation of contracts external to the team, and pre-briefing of pilots.

• Phase II (mission execution): communication in accordance with a standardized communication protocol, communication in support of situation awareness and the ”big picture”, and mission execution.

• Phase III (mission debriefing): re-construction of team engagement, evaluation of team objectives, review of equipment issues, review of team mission execution, and review of information exchange.

• Number of targets hit.

• Reaction to indirect fire while dismounted: shouting “incoming” in a loud recognizable voice, reacting to the instruction of the leader and looking for guidance, seeking nearest cover, assessing the situation, reporting the situation to the leader, and continuing the mission.

• Mission rehearsal and familiarization, tactical training, vehicle checkpoints, procedural training for unmanned aerial vehicle (UAV) operators, and cultural awareness training.

• Detection accuracy, false positive detection and response time, in Kim’s Game.
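As an illustration, the three Kim’s Game metrics listed above (detection accuracy, false positive detection and response time) could be computed as follows. This is a minimal sketch; the record format `CueResponse` and the function name are hypothetical, not the scoring procedure actually used by Maraj et al.:

```python
from dataclasses import dataclass

@dataclass
class CueResponse:
    """One response opportunity in a Kim's Game trial (hypothetical record)."""
    cue_present: bool       # was a nervous/aggressive cue actually shown?
    flagged: bool           # did the participant report a cue?
    reaction_time_s: float  # seconds from cue onset to the participant's reaction

def score_kims_game(responses):
    """Compute detection accuracy, false positive count and mean response time."""
    cues = [r for r in responses if r.cue_present]
    hits = [r for r in cues if r.flagged]
    false_positives = sum(1 for r in responses if not r.cue_present and r.flagged)
    return {
        "detection_accuracy": len(hits) / len(cues) if cues else 0.0,
        "false_positives": false_positives,
        "mean_response_time_s": (sum(r.reaction_time_s for r in hits) / len(hits))
                                if hits else None,
    }
```

Note that response time is only meaningful for correctly detected cues, so the sketch averages reaction times over hits only.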

2.3 Situation Awareness

Situation awareness (SA), sometimes called situational awareness, refers to the level of awareness that an actor has of the current situation (Stanton et al., 2005). The concept of situation awareness was first identified as an important factor in military flight during the First World War. However, the term did not emerge in research literature until the late 1980s (Endsley, 1995b). Endsley formally defines SA as:

”[...] the perception of the elements in the environment within a volume of time and space, the comprehension of their meaning, and the projection of their status in the near future.” (Endsley, 1995b, p. 36).

Sarter and Woods (1995, as cited in Endsley, 1995b) believed that developing a definition of SA was futile and not constructive. SA has been used quite loosely and was thus named ”the buzzword of the 90s” (Wiener, 1993, p. 4). Thus, there exist many different definitions of SA. In one literature overview, 30 different definitions of SA were found (Salmon et al., 2017). Dominguez (1994) synthesized 15 definitions into the following definition:


”Continuous extraction of environmental information, and integration of this information with previous knowledge to form a coherent mental picture, and the use of that picture in directing future perception and anticipating future events” (Dominguez, 1994, p. 11 as cited in Vidulich et al., 1994).

Whereas Dominguez’s (1994) definition refers to SA as a process, Endsley (1995b) sees SA as a state. Viewing SA as a process or state constitutes one of the major differences in defining SA (Salmon et al., 2017). SA can also be classified into two different approaches: an individual and a distributed approach. The individual approach considers SA from an individual actor’s perspective and the distributed approach considers SA as being distributed across multiple actors and artifacts, which together comprise the total system (Stanton et al., 2005). Within the individual approach there exist two dominant theories, the three-level model of SA (Endsley, 1995b) and the perceptual cycle model of SA (Smith & Hancock, 1995). Endsley’s (1995b) three-level model is the most commonly used and widely cited theory of SA, and therefore it is the theory of SA that will be used in this thesis.

Endsley (1995b) also emphasizes that SA should be separated from decision-making and performance. Highly trained decision makers can make wrong decisions if they have inaccurate or incomplete SA. Conversely, an actor who has very good SA can also make wrong decisions and perform poorly. SA, decision-making and performance are different stages with different influential factors, and the approaches for dealing with each one vary. SA is, however, an essential prerequisite of decision-making and performance, as seen in figure 2.4.

2.3.1 Endsley’s Three-Level Model

Level 1 SA: Perception of the Elements in the Environment. The first level of SA involves perceiving the status, attributes and dynamics of relevant elements in the environment. For a tactical commander this would entail the perception of location, type, number, capabilities and dynamics of enemy and friendly forces in a certain area (Endsley, 1995b). Attention is directed to the most relevant environmental cues based on the actor’s goals and experiences in the form of mental models (Stanton et al., 2005).

Level 2 SA: Comprehension of the Current Situation. The second level of SA involves the connection between the disjointed elements of level 1. Whereas level 1 entails awareness of elements in the environment, level 2 goes a step further by understanding the significance of those elements based on the goals the actor has. By combining the different elements, the actor forms a holistic image of the environment. A novice actor may be capable of achieving level 1 SA just as well as a more experienced decision maker, but may fail to integrate elements along with pertinent goals in order to comprehend the situation. For example, a tactical commander must comprehend that three enemy aircraft appearing within a particular proximity of each other and in a particular geographical location could indicate something about their objectives (Endsley, 1995b).

Level 3 SA: Projection of Future Status. The third and highest level of SA is the ability to project the future actions of the elements in the environment. In order to achieve level 3 SA, the actor must have knowledge of the status and dynamics of the elements (level 1) as well as comprehension of the situation (level 2). For example, if a military commander knows that an aircraft is in a certain location and that it poses a threat, it allows the commander to project that the aircraft will attack in a given manner. This provides the commander with knowledge and time to decide on a course of action that is aligned with their objective(s) (Endsley, 1995b).

Figure 2.4: Simplified version of Three-Level Model (Endsley, 1995b)

2.4 Measuring SA in C2 Environments

Measuring SA in C2 environments is a complex task and has posed a great challenge to the Human Factors community. C2 environments are usually complex, dynamic, and rich in information. Therefore, numerous techniques for measuring SA in the C2 domain have been reviewed and compared (Salmon et al., 2006).1 A multiple measure approach is recommended in order to assess SA in a C2 environment; among these measures, generic performance measurements are mentioned (Endsley et al., 2000; Salmon et al., 2006). Generic performance measurements are used as they provide an indirect measure of SA while also being easy to obtain through a non-intrusive approach. Apart from generic performance measurements, a multiple-measure approach could also include a freeze probe technique, a post-trial subjective rating technique and an observer rating technique; however, the make-up of these techniques can differ (Salmon et al., 2006). The Situation Awareness Global Assessment Technique (SAGAT; Endsley, 1995a) and the Situation Awareness Rating Technique (SART; Taylor, 1990) are both common SA measurement techniques, typically used in the military aviation domain, whereas the Crew Awareness Rating Scale (CARS) and the Mission Awareness Rating Scale (MARS) are used in the domain of military infantry operations. This thesis employs a multiple measure approach for the purpose of measuring the performance of a platoon commander. This is conducted by the use of generic performance measurements, the CARS, the freeze probe technique SAGAT and the NASA-TLX.

1 It should be noted that Salmon et al. (2006) discusses the complexities of Command, control, communication, computers and intelligence (C4i). C4i describes the communication, computer and intelligence aspects in addition to command and control, whereas C2 only describes command and control. However, as C4i can be seen as a variety of C2, conclusions drawn from studies of C4i can be applied to C2 as well.

2.4.1 Situation Awareness Requirement Analysis

In order to measure the SA of a platoon commander, a Situation Awareness Requirement Analysis (SARA) can be performed. The purpose of a SARA is to determine the tasks of the actor in the environment as well as to assure the validity of the SA assessment technique that will be used (Stanton et al., 2005). Endsley (1993) describes a generic process, similar to a SARA, for acquiring SA requirements. This process includes unstructured interviews with SME:s, goal-directed task analysis, and questionnaires for determining SA requirements for a particular scenario.

Matthews et al. (2004) describe the process of conducting a SARA for platoon commanders in Military Operations in Urbanized Terrain (MOUT). In this approach an interview with the SME begins with a series of open-ended questions intended to elicit detailed responses about doctrinally based goals and the decisions associated with the accomplishment of those goals. The SME can also be shown a graphical representation of a goal hierarchy and asked about its accuracy in relation to the platoon commander’s goals in a real-world setting. From this point the SME can identify gaps in the analysis over multiple iterations. This continues until the SME is in general agreement with the analysis. The requirements can later be used to develop a SAGAT protocol for simulations of infantry missions. By using the SAGAT, the researcher can systematically assess a platoon commander’s ability to perceive, comprehend and predict key elements in their mission.

Endsley (1993) describes a slightly different approach to the SARA, where the first step is to conduct an unstructured interview. During this interview the SME can be asked to define what good SA feels like in a specific context. The SME can then be asked what s/he would like to know in order to have perfect SA. Based on the answers, the SME can be asked to elaborate on responses in order to sort out unimportant information and also to help the interviewer gain a better understanding of the nature of the SA elements described by the SME (Endsley, 1993).

2.4.2 Crew Awareness Rating Scale

The Crew Awareness Rating Scale (CARS), originally developed by McGuinness and Foy (2000), is a subjective measurement technique of individual SA (Gawron, 2019). CARS uses two subscales, the content subscale (see statements 1, 3, 5 and 7 in table 2.1) and the process subscale (see statements 2, 4, 6 and 8 in table 2.1), to measure SA. Each statement is rephrased into a question when the CARS questionnaire is administered. Statements 1, 3 and 5 measure the participant’s ease of identification, understanding and projection of task SA elements (Stanton et al., 2005) and correspond to levels 1, 2 and 3 of SA according to Endsley’s (1995b) three-level model. Statement 7 is used to assess how well participants can combine the content from statements 1, 3 and 5 with their course of action, i.e. how much the participant used that information to make their decisions (Stanton et al., 2005). Statements 2, 4, 6 and 8 measure the participant’s mental effort to identify, understand and project future states of the SA-related elements in the situation, and how mentally difficult it was to achieve the appropriate task goals.


For each of the 8 questions, participants are asked to rate themselves on a scale from 1 (best case) to 4 (worst case): 1 ”Yes, I have good SA”, 2 ”Probably”, 3 ”Probably not”, and 4 ”No, I do not have good SA”. For the process subscale, the answers range from ”Easy” to ”Unmanageable” (McGuinness, 1999, as cited in Prytz, 2010).

Table 2.1: Definitions of CARS rating scales (Gawron, 2019)

CARS was developed for use in military operations and has previously been used to measure workload and SA of military commanders while using a digitized C2 technology in a simulated battlefield scenario (McGuinness & Ebbage, 2002). The advantages of CARS include the questionnaire being handed out post-trial and thus being non-intrusive, requiring little training time for both the researcher and the participants, having a low application time, and requiring only pen and paper to administer (Salmon et al., 2006). CARS does not rely on subject matter experts (SME:s), unlike other methods for assessing SA such as the Situation Awareness Behavioral Rating Scale (SABARS; Matthews & Beal, 2002) and Situation Awareness for Solutions for Human-Automation (SASHA; Jeannot et al., 2003).
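To illustrate the two-subscale structure of CARS described above, the following sketch computes mean content and process scores from a set of 1–4 ratings. The function name and dictionary layout are hypothetical and not part of the original CARS procedure; it merely shows how the odd-numbered and even-numbered statements separate into the two subscales:

```python
CONTENT_ITEMS = (1, 3, 5, 7)   # perception, comprehension, projection, integration
PROCESS_ITEMS = (2, 4, 6, 8)   # mental effort for each of the above

def cars_subscale_means(ratings):
    """ratings: dict mapping statement number (1-8) to a 1 (best) - 4 (worst) rating.
    Returns the mean rating per subscale; lower values mean better self-rated SA."""
    content = sum(ratings[i] for i in CONTENT_ITEMS) / len(CONTENT_ITEMS)
    process = sum(ratings[i] for i in PROCESS_ITEMS) / len(PROCESS_ITEMS)
    return {"content": content, "process": process}
```

For example, a participant who rates the content statements 1, 1, 2, 2 and the process statements 2, 2, 3, 3 would receive a content mean of 1.5 and a process mean of 2.5.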

2.4.3 Situation Awareness Global Assessment Technique

The Situation Awareness Global Assessment Technique (SAGAT) is a method for measuring participants’ SA and has primarily been used within high-fidelity or medium-fidelity part-task simulations (Endsley, 1995a). SAGAT applies a freeze-probe technique, meaning that the trial is paused at random points. During the pause the participant is handed the queries, either on paper or directly on the screen as a prompt (Salmon et al., 2006). Most studies applying SAGAT have been conducted in aircraft simulations, but it can be applied to any domain where a simulation of the tasks exists and a SARA has been made to develop the queries. SAGAT queries have previously been developed for advanced bomber aircraft (Endsley, 1990), en-route air traffic control (Endsley & Rodgers, 1994) and nuclear control rooms (Hogg et al., 1993). There are two primary advantages of using SAGAT: (1) it offers a direct measurement of the participant’s SA, i.e. it does not rely on post-trial or subjective SA data; (2) SAGAT is one of the most widely used and most validated methods for measuring SA (Salmon et al., 2006).

In addition to the SAGAT queries, data on performance and workload may also be collected. While conducting a study, participants should be given instructions to conduct the tasks as they normally would. When participants are filling out the SAGAT questionnaire, no displays or other visual aids should be available. Talking or sharing information between participants should not be allowed. If a participant does not know the answer to a certain query, s/he should be encouraged to make their best guess. As a general recommendation, a freeze should last for a fixed amount of time. When the time limit is reached, the trial should be resumed regardless of whether the participant has answered all the queries. Upon selecting which queries to use during the trials, the researcher should have a surplus of queries from which s/he can choose a selection for each trial. This way the queries can be randomly distributed to the participant, providing consistency and statistical validity. As an informal rule, no freeze should occur before three minutes into a trial, nor should freezes occur within one minute of each other. This is to increase the likelihood that the participant builds up an image of the scenario. Multiple freezes may occur during each trial; there is no known limit on the number of freezes within one trial. One experiment applying the SAGAT showed that as many as three 2-minute freezes during a 15-minute trial had no negative effects on performance, and another showed that 5-minute freezes did not produce memory decay (ibid.).

In order to evaluate the results of the SAGAT, the answers given for the queries are compared to values collected by the simulator (Endsley, 1995a). The queries are recommended to be scored as either correct or incorrect based on whether the answer falls within an acceptable range. This range is decided by giving each query a range of values that are considered acceptable (Endsley, 2000); e.g. it may be acceptable for a car driver to be within 10 km/h of the intended speed limit. Certain questions that relate to higher-level SA requirements and that cannot be collected from the simulator may need SME:s to assess the correctness of the answer. For answers that are available from the simulator, correctness can be calculated through a tabulation containing the frequency of correctness, made within each test condition for each query. Additionally, statistical tests such as chi-square, Cochran’s Q or the binomial t test can be used to calculate the statistical significance of differences in SA between test conditions (Endsley, 1995a). A combination of the query answers is not recommended as this tends to reduce the sensitivity of the metric by losing important distinctions between the queries (ibid.; Endsley, 2000).
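The scoring procedure described above, i.e. marking each query correct or incorrect depending on whether the answer falls within an acceptable range and then tabulating the frequency of correctness per query, can be sketched as follows. The query identifiers and function names are hypothetical, a minimal illustration of the recommended scoring rather than Endsley’s actual tooling:

```python
def score_sagat_trial(answers, ground_truth, tolerance):
    """Score each numeric SAGAT query as 1 (correct) or 0 (incorrect),
    depending on whether the answer falls within the query's acceptable range.
    answers: query_id -> participant answer; ground_truth: query_id -> value
    recorded by the simulator; tolerance: query_id -> acceptable deviation."""
    return {q: int(abs(answers[q] - ground_truth[q]) <= tolerance[q])
            for q in answers}

def correctness_frequency(scored_trials):
    """Tabulate, per query, the proportion of trials scored as correct."""
    totals = {}
    for trial in scored_trials:
        for q, ok in trial.items():
            n_correct, n = totals.get(q, (0, 0))
            totals[q] = (n_correct + ok, n + 1)
    return {q: n_correct / n for q, (n_correct, n) in totals.items()}
```

Following the speed-limit example above, an answer of 48 km/h against a true value of 50 km/h with a 10 km/h tolerance would score as correct, whereas 75 km/h would not. The resulting per-query frequencies could then feed the chi-square or Cochran’s Q tests mentioned in the text.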

2.4.4 NASA-Task Load Index

The NASA Task Load Index (NASA-TLX) is a multi-dimensional scale used to measure workload estimates from one or more operators. Workload represents the operator’s costs of accomplishing the mission requirements. Since accomplishing these requirements can have costs such as fatigue, stress, illness and possibly accidents, knowledge is needed about operator workload (Hart, 2006).

The usage of the NASA-TLX has expanded from its original domain of application, aviation, to domains such as automobiles, medicine and combat. The questionnaire can be distributed to the participant either during a task or afterwards (Hart, 2006). The NASA-TLX questionnaire includes 6 different subscales: mental demand (sv. mental belastning), physical demand (sv. fysisk belastning), temporal demand (sv. tidspress), performance (sv. prestation), effort (sv. ansträngning), and frustration (sv. frustration). The subscales represent the phenomena that influence the subjective workload experience in a broad range of tasks administered during and following activities performed in operational environments (Hart & Staveland, 1988). The assumption is that a combination of these six subscales represents the workload experience of the operator. The subscales are the result of extensive analysis of the primary factors that do, and do not, define the subjective experience of workload for people performing a broad range of activities, ranging from simple laboratory tasks to flying an aircraft (Hart, 2006).

When answering the NASA-TLX the participant is presented with the six subscales on a piece of paper. The participant marks their perceived workload on a 12 cm line for each subscale. At both ends of each line are the bipolar adjectives ”Low” and ”High” (see figure 2.5), with the exception of the performance subscale, where participants are asked to mark their perceived performance from ”Good” to ”Poor” (Hart & Staveland, 1988).

Figure 2.5: Example of the mental demand question

Hart and Staveland (1988) provide a description of each subscale and how the subscales are correlated with overall workload (see table 2.2).

Table 2.2: Description and connection of subscales to different factors of workload

In order to account for individual differences when calculating overall workload, and thus decrease between-rater variability, a weighting is conducted after the participant has completed the NASA-TLX. This weighting is conducted by administering a series of 15 paired comparisons of the six subscales (e.g. mental demand vs performance). For each pair, the participant is asked to choose which subscale provided the most significant source of workload in the task. The answers from the weightings are then used to identify which subscales are most related to the participant’s personal definition of workload (Grier, 2015).


To calculate the workload through the NASA-TLX, the mean Weighted Workload Score (WWL) is calculated. During the data analysis the ratings, i.e. the markings on the line, are quantified by assigning a value between 1-100, 1 indicating a low workload value and 100 indicating a high workload value. When calculating the WWL, each rating is multiplied by its respective weight; these weighted scores are then added together and divided by the sum of the weights (Grier, 2015).
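The standard WWL computation can be sketched as below. A raw, unweighted variant is also included, since (as described in section 3.5.4) the weighting process was omitted in this thesis; in that case the subscale ratings are simply averaged. Function names are illustrative:

```python
def weighted_workload(ratings, weights):
    """Mean Weighted Workload (WWL) score.
    ratings: subscale -> rating on the 1-100 scale.
    weights: subscale -> number of times the subscale was chosen in the
    15 pairwise comparisons (the weights sum to 15 when all are administered)."""
    total = sum(weights.values())
    return sum(ratings[s] * weights[s] for s in ratings) / total

def raw_tlx(ratings):
    """Unweighted ('raw TLX') variant: the plain mean of the subscale ratings."""
    return sum(ratings.values()) / len(ratings)
```

For example, ratings of 60 (mental), 30 (temporal) and 90 (effort) with weights 8, 3 and 4 give a WWL of (60·8 + 30·3 + 90·4) / 15 = 62.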

Finally, the scores from the NASA-TLX need to be evaluated. Hart (2006) noted an issue with evaluating the scores from the NASA-TLX (Hart & Staveland, 1988). The issue concerns the elusive point at which workload is not only considered to be high, but too high. Given the relative nature of subjective ratings such as the NASA-TLX, this may pose an issue for researchers when trying to analyze scores. As a result of this issue, Grier (2015) has provided ranges and percentiles in order for researchers to determine how common a particular workload score is. This way, the researcher can decide if the score is low or high compared to other studies using similar tasks. The researcher should bear in mind that although the tasks between studies may be similar, contextual factors, e.g. task type, different levels of expertise, different levels of difficulty within task type, and different stressors, should be considered. Scores for C2 tasks (military planning, computer-based military simulations, and gunner range exercise studies) range from 20-75 (ibid.).

2.5 Selection of C2 Model

The first research question of this thesis, as presented in section 1.3, was to identify a model of cognition that can represent the C2 tasks and goals of the platoon commander. Thus, the C2 models OODA, DOODA and COCOM were described in sections 2.1.1, 2.1.2 and 2.1.3.

Considering its frequent use in military C2, as well as its simple design of four processes that relate well to the performance measurements (CARS, SAGAT, NASA-TLX and generic performance measurements), the OODA loop is an appropriate model of cognition for this thesis (see figure 2.6). The solid arrows in figure 2.6 indicate that a certain performance measurement measures a process in the OODA loop, while the dashed arrows indicate that a certain performance measurement partially measures a process in the OODA loop.


3. General Method

To answer the research questions in this thesis, a methodological approach consisting of 3 studies was applied (see figure 3.1). The studies were conducted through a land warfare scenario in a simulated environment. For each study, questionnaires aimed at measuring the performance of the platoon commander were developed. These questionnaires were then distributed to the platoon commander during freezes in the trial. After all the questionnaires were filled out and the trial came to an end, the platoon commander participated in a feedback session with the researcher. In these feedback sessions the platoon commander was asked to give verbal feedback on the questionnaires. Topics for the feedback sessions included the wording of the questions, the relevance of the questions in relation to the tasks of the platoon commander, and reflections on the performance measurements. To complement these studies, a workshop with a platoon commander and a workshop with scientists were conducted. After each completed study, changes were made to the questionnaires. Hence, each succeeding study included modified as well as new questionnaires.

Figure 3.1: Chart over methodological approach

Furthermore, this chapter includes sections describing the scenario used, the simulation Virtual Battle Space 3, the techniques used to collect data during the studies (observations and semi-structured interviews), the initial method used in study 1 (the Map-method), the adaptations made to the CARS, SAGAT, NASA-TLX and SARA, and finally the research ethics that were applied.

3.1 Scenario

The scenario (see figure 3.2) that was used in this thesis was created by Johansson et al. (2019) through three workshops with the Land Warfare School (sv. Markstridsskolan). The overarching scenario (see figure 3.3) in which this scenario takes place was created by the Swedish Armed Forces for combat training with mechanized units. It is a fictitious scenario where Sweden has been attacked by a hostile nation. The attack includes enemy forces shipping over and disembarking in Norrköping and Oxelösund. At the same time, enemy air landings are taking place at Malmens airport in Linköping. To counter the attack, the players must take the fictitious village Spång, where an enemy mechanized company, consisting of ten armored personnel carriers and a mortar section equipped with heavy grenade launchers, is based. The Swedish attack is commenced by one mechanized battalion. Parts of the battalion will flank the enemy from the north-west while the remaining battalion will attack head on from the east.

Figure 3.2: Scenario – the starting point for the mechanized battalion crew

At the start of the scenario, the company is located north-east of Spång (see the black square marked ”GuK” in figures 3.2 and 3.4), while the enemy battalions are scanning for movement around Spång and are therefore located in different forest lots. The mortar section is located centrally in Spång. The goal of the player, in this thesis the platoon commander, is to take Spång.

Figure 3.3: Overarching scenario – attack against Spång

3.2 Virtual Battle Space 3

VBS3 is a 3D virtual training environment for tactical training, experimentation and mission rehearsal on land, at sea and in the air. Players experience gameplay through a first person shooter (FPS) perspective (Bohemia Interactive, n.d.). VBS3 is used by the Swedish Armed Forces’ combat schools as a training environment. Through VBS3’s interactive 3D simulation, the player can move around by foot or by manning different vehicles, e.g. an Infantry Fighting Vehicle (IFV, see figure 3.5).

The replication of visual and auditory stimulus (visual–audio fidelity) is considered to be high in VBS3, while it struggles to provide a feeling of the actual environment (physical fidelity), e.g. vibrations from the vehicles, temperature and wind. The replication of actual equipment (equipment fidelity) is limited as the player uses a computer mouse, keyboard, screen and headset. The degree to which VBS3 replicates psychological and cognitive factors (psychological–cognitive fidelity), such as communication and situational awareness, is considered to be high. As VBS3 offers a high degree of adjustability, experiment leaders can create tasks and scenarios that are similar to what the players would experience in real-world combat missions (Johansson et al., 2019). VBS3 has been used to assess infantry soldier skills training (Maxwell & Zheng, 2017) and earlier versions of the Virtual Battle Space series, such as VBS1, have been used for the creation and modification of military-oriented scenarios (Orvis et al., 2008).

Figure 3.4: The company at the start of the scenario in VBS3

3.3 Observation

Observations were made during the studies in order to gain a better understanding of how the platoon commander performed his/her C2 tasks in VBS3, to record improvements that could be made for the next study, and to write down time stamps of events in the video recording.

During an observation, the researcher can choose to act in one of four different roles: total or complete participation, total or complete observation, participant as an observer, or observer as a non-participant. Total or complete participation means that the researcher entirely assumes the role of a member of the group that is being studied, without revealing that s/he is also assuming the role of a researcher. Total or complete observation refers to a role where personal involvement in the group is minimized. Participating as an observer refers to the researcher being known to the group that is being studied and participating in group activities. As a non-participant, the researcher observes the behavior of the participants while being known to the people in the study, but does not actively engage in the activities performed by the participants of the study (Howitt, 2016). To complement the observations the researcher can employ additional data collection methods, e.g. group discussions, semi-structured interviews and video recordings (ibid.). For this thesis, the role of the researcher was that of a non-participant.

Figure 3.5: IFV in VBS3

3.4 Semi-structured Interviews

After each study, a feedback session with the platoon commander participating in the study was held. During these feedback sessions the platoon commander was asked a series of questions relating to the relevance of the questions in the questionnaires (CARS, Map-method, SAGAT and NASA-TLX) in relation to the C2 tasks and goals of the platoon commander. Additionally, the questions included subjects such as whether the wordings were difficult to understand and whether the generic performance measurements were of relevance. These interviews were conducted using a semi-structured approach. A semi-structured interview is a mix between a structured and an unstructured interview. Before the interview, the interviewer has prepared some questions in order to explore a subject. However, there is no rigid structure; new questions can come up during the interview, as flexibility is vital (Howitt, 2019).

A semi-structured interview approach was chosen for this thesis as flexibility during the interviews was key. It was important that the interviewees could talk about what they felt was interesting about the questionnaires and the generic performance measurements. This could have been hindered if the structure of the interview had been too rigid.


3.5 Initial Method and Adaptations to Methods

This section describes an initial method that was used to measure SA, namely the Map-method, as well as descriptions of adaptations that were made to the already existing methods, CARS, SAGAT, NASA-TLX and SARA. Finally, the research ethics of this thesis will be presented.

3.5.1 Map-method

Previous studies have used an approach similar to the Map-method conducted in this thesis, where participants’ SA was measured through a map in the simulated microworld C3Fire (Granlund, 2003; Persson & Rigas, 2007). These studies used a map system, the Geographic Information System (GIS), to measure participant SA. The GIS allows the participant to place markers on a map, choosing from different palettes of marks: fire, unit and geographical objects (C3 Learning Labs, 2019). The accuracy of the markings, and the time at which they were inserted in the GIS, can then be compared to where the fires actually took place. As there is no system similar to the GIS used by infantry commanders, the participant of study 1 was asked to mark on a paper map where they believed friendly and enemy systems to be.

In study 1, the participant was handed two versions of the same map (see figure 3.2) during a freeze in the trial. On the first map, the participant was asked to mark where they believed friendlies and enemies to be positioned at the point in time of the freeze. On the second map, the participant was asked to mark the same information as they did on the first map but 10 minutes into the future. This process was then repeated four times (at battle lines 101, 103, 104 and 106). Additionally, for lines 103, 104 and 106 the participant was asked how well the previous prediction corresponded with the current situation.

The Map-method used in study 1 differed from the studies measuring participants’ SA in C3Fire (Persson & Rigas, 2007) in two ways: a pen was used to mark enemies and friendlies on a paper map instead of a GIS, and only enemy and friendly positions could be marked.
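One conceivable way to quantify the accuracy of such paper-map markings against ground-truth positions from the simulation is sketched below, analogous to how GIS markings are compared to actual fire locations in C3Fire. The coordinate format and function name are hypothetical, not part of the procedure used in study 1:

```python
import math

def marking_error(marked, actual):
    """Compare paper-map markings against simulator ground truth.
    marked/actual: dicts unit_id -> (x, y) map coordinates (same units).
    Returns the mean Euclidean error over units the participant marked,
    plus the number of units present in the simulation but not marked."""
    shared = marked.keys() & actual.keys()
    errors = [math.dist(marked[u], actual[u]) for u in shared]
    return {
        "mean_error": sum(errors) / len(errors) if errors else None,
        "units_missed": len(actual.keys() - marked.keys()),
    }
```

The same comparison could be repeated for the 10-minutes-ahead prediction maps, yielding a projection error per battle line.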

3.5.2 Adaptations of CARS

To formulate the questions in the CARS questionnaire, the questions were created from the framework of Gawron (2019) and the example question of Prytz (2010). Adaptations from the original version of CARS include:

• The CARS questionnaire being handed out during freezes instead of post-trial.

• Adding specific examples for each CARS question (see figure 3.6).


Figure 3.6: Example of perception question used in this thesis

3.5.3 Adaptations of SAGAT

As studies seem to vary in their approaches, i.e. in the number and duration of freezes (Endsley, 1995a), there appears to be no original version against which the application of SAGAT in this thesis can be compared. Instead, guidelines (ibid.) and a SAGAT study performed for a MOUT mission (Strater et al., 2001) were used to create the SAGAT developed for this thesis. However, some of the steps in the guidelines were not followed; these adaptations include:

• Having no fixed time limit for the freezes.

• Not administering random queries to the participant during a freeze.

• Not conducting a full SARA, instead using results from a SARA over a platoon commander engaged in a MOUT mission.

• Translating queries from English to Swedish.
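SAGAT answers are typically scored against ground truth and summarized as the proportion of correct answers per SA level (perception, comprehension, projection). A minimal sketch of such scoring, assuming each query has been tagged with its SA level; the function name and data format are illustrative assumptions, not part of the thesis procedure:

```python
from collections import defaultdict

def sagat_scores(responses):
    """Proportion of correct answers per SA level (1 = perception,
    2 = comprehension, 3 = projection), pooled over all freezes."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for level, is_correct in responses:
        total[level] += 1
        correct[level] += int(is_correct)
    return {level: correct[level] / total[level] for level in total}

# Hypothetical responses: (SA level, answer matched ground truth?)
responses = [(1, True), (1, True), (2, False), (2, True), (3, False)]
print(sagat_scores(responses))  # {1: 1.0, 2: 0.5, 3: 0.0}
```

Pooling across freezes, as done here, is one design choice; scores could also be kept per freeze to track how SA develops over the trial.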

3.5.4 Adaptions of NASA-TLX

For the use of the NASA-TLX in this thesis, some adaptions were made from the original version of Hart and Staveland (1988). These adaptions include:

• Adding a segment of text at the top of the questionnaire which included a sentence over what the NASA-TLX is and a sentence explaining how to fill out the questionnaire.

• Elimination of the physical demand subscale.

• Not conducting the weighting process.

• Dividing the scales into 5-point segments (see figure 3.7).


Figure 3.7: Example of the mental demand question used in this thesis
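Omitting the weighting process corresponds to the commonly used Raw TLX scoring, in which overall workload is simply the unweighted mean of the subscale ratings. A minimal sketch with the physical demand subscale removed, as in this thesis; the subscale names and the 0-100 coding of the 5-point segments are assumptions of this illustration:

```python
def raw_tlx(ratings):
    """Raw TLX: unweighted mean of the subscale ratings (0-100).
    The physical demand subscale is omitted, as in this thesis."""
    subscales = ("mental", "temporal", "performance", "effort", "frustration")
    missing = [s for s in subscales if s not in ratings]
    if missing:
        raise ValueError(f"missing subscales: {missing}")
    return sum(ratings[s] for s in subscales) / len(subscales)

# Hypothetical ratings from one freeze (each marked in 5-point steps)
ratings = {"mental": 70, "temporal": 55, "performance": 40,
           "effort": 65, "frustration": 30}
print(raw_tlx(ratings))  # 52.0
```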

3.5.5 Adaptions of SARA

Several steps for conducting a SARA (Matthews et al., 2004) were not performed in this thesis. The SARA approach used in this thesis includes the following adaptions:

• Not creating a goal-directed task analysis and not administering a questionnaire.

• Using a workshop format instead of an unstructured interview.

• Using a pre-existing graphical representation of the platoon commander's goals in a MOUT mission.

• Translating the graphical representation of a platoon commander's goals from English to Swedish.

Although the graphical representation was developed for MOUT missions, the goals are similar to those of other missions, such as the one used in this thesis. The reason for performing a workshop instead of an interview was twofold. Firstly, a workshop seemed a more appropriate method for stimulating creativity and recollection in the participant, as they had not performed the duties of a platoon commander for 24 years. By providing the participant with a starting point, it was assumed to be easier for them to recollect these goals than it would have been with the open-ended questions of an unstructured interview. Secondly, conducting each step of the SARA seemed unnecessary, as a goal-directed task analysis had previously been created by Matthews et al. (2004).

3.6 Research Ethics

This thesis has complied with the four basic research ethics requirements: the requirement of information, the requirement of informed consent, the requirement of confidentiality and the requirement of utilization (Vetenskapsrådet, 2002). The participant was given an opportunity to read the consent form (see appendix A) before the recording started and could then decide whether they wanted to participate in the study. The consent form clarified the purpose of the study, the interview procedure and relevant ethical aspects. The participant was informed that they could cancel their participation at any time without having to explain why, that all data would be anonymized in the publication of the thesis and that the recorded material would be deleted after the end of the thesis. Participants in studies 1, 2 and 3 did not sign a consent form, as they participated as part of the KLASS project.


4. Workshop with Platoon Commander

Before study 1, a SARA was performed. The purpose of conducting a SARA is to determine the tasks of the platoon commander operating in the military C2 environment (Stanton et al., 2005). However, the main advantage of performing the workshop in this thesis was providing the researcher with greater insight into the tasks and goals of the platoon commander.

Matthews et al. (2004) developed a graphical representation of the overarching goal and the 7 primary goals of an infantry platoon commander in a Military Operations in Urbanized Terrain (MOUT) mission. The overarching goal is to "Attack, Secure and Hold Terrain". The primary goals are:

1. Avoid Casualties.

2. Negate Enemy Threat.

3. Movement: Reach Point X by Time Y.

4. Assault Through Objective.

5. Hold Objective.

6. Provide Stability and Support Operations.

7. Function in a Team Environment.

The goals were the result of interviews with MOUT SMEs (Matthews et al., 2004). Goal 6, "Provide Stability and Support Operations", was removed due to its irrelevance for the scenario used in this thesis; the rest of the goals were translated to Swedish and printed on a piece of paper (see figure 4.1).

4.1 Method

This section describes the participant, the materials and the procedure of the workshop with the platoon commander, followed by its results.

4.1.1 Participant

The participant of the study had last worked as a platoon commander 24 years earlier; they served for a total of 4 years within the Swedish Armed Forces, spread over two two-year periods.


Figure 4.1: Graphical representation of platoon commanders’ goals (adapted from Matthews et al. (2004))

4.1.2 Materials

The materials used for the workshop were the graphical representation of the platoon commander goals (figure 4.1), an additional blank piece of A3 paper, a map of the scenario (figure 3.2), a consent form (see appendix A), two pens and an iPhone 6s (2015 model) equipped with the Voice Memos app, used for recording the audio of the workshop.

4.1.3 Procedure

Before the workshop began, the participant was handed the informed consent form. The participant read the form and handed it back to the researcher. After this, the graphical representation was placed on a table between the researcher and the participant and the workshop began. The workshop started with the researcher asking whether the graphical representation represented goals relevant to a platoon commander in the scenario. Next, the participant examined each primary goal and addressed their relevance to the goals
