From Actions to Faces: Studies of Social Perception in Typical and Atypical Development


To Karin, Hedda and Elsa


List of Papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I Falck-Ytter, T., Gredebäck, G., & von Hofsten, C. (2006). Infants predict other people’s action goals. Nature Neuroscience, 9(7), 878–879.

II Falck-Ytter, T. (2010). Young children with autism spectrum disorder use predictive eye movements in action observation. Biology Letters, 6(3), 375–378.

III Falck-Ytter, T., Fernell, E., Gillberg, C., & von Hofsten, C. (in press). Face scanning distinguishes social from communication impairments in autism. Developmental Science.

Reprints were made with permission from the respective publishers.


Contents

Introduction
    Understanding other people’s actions
        Eye movements in action execution and action observation
        Action understanding via simulation: the Mirror Neuron System hypothesis
            Mirror Neurons in monkeys
            The human MNS
            The functional significance of the MNS in action perception
            The development of the MNS in infancy
            Simulation, motor resonance and direct matching
        Understanding actions without simulation
            Modularist accounts of action understanding
            Action understanding via teleological reasoning
    Face perception
        Neural systems involved in face perception
        Information from facial features: the eyes and the mouth
        Configural versus featural processing of faces
    Autism Spectrum Disorder
        General description
        Psychological explanations of ASD
        MNS dysfunction in ASD?
        Face perception in ASD
The aims of this thesis
Methods
    Recruitment of participants
    Stimuli
    Procedure
    Apparatus
    Data analysis
Study I
    Design
    Results
    Discussion Study I
Study II
    Design
    Results
    Discussion Study II
Study III
    Design
    Results
    Discussion Study III
General Discussion
    The MNS hypothesis of predictive eye movements – revisited
    The MNS hypothesis of autism – revisited
    Are motor experience and action perception functionally related?
    The MNS hypothesis of high-order social competences – revisited
    What is special about (social) perception in ASD?
    Conclusion
    Future directions
Summary in Swedish
Acknowledgements
References
Erratum

Abbreviations

ADI-R  Autism Diagnostic Interview – Revised
ASD    Autism Spectrum Disorder
fMRI   Functional Magnetic Resonance Imaging
MNS    Mirror Neuron System
SCQ    Social Communication Questionnaire
TMS    Transcranial Magnetic Stimulation
ToM    Theory of Mind


Introduction

In this thesis, I take a closer look at action and face perception. The thesis is based on three eye-tracking studies, the first of which focuses on action perception in typical development. The latter two studies focus on action perception and face perception (respectively) in children with Autism Spectrum Disorder (ASD). Much discussion will be devoted to the Mirror Neuron System (MNS), which is involved in both action and face perception and which may be dysfunctional in individuals with ASD.

Why study action and face perception? The answer is simple: both processes are fundamental components of social life. To be able to accurately perceive what another person is doing, or expressing with her face, is absolutely necessary in order to cooperate (or compete) with her.

Conversely, a failure to accurately perceive actions and faces would have severe consequences at the individual, group, and societal levels. Understanding these processes is therefore central to understanding ourselves as social beings.

Why study children with ASD? Children with ASD have social interaction impairments, and current theories link these impairments to their perception of other people’s actions and faces. The hope is that a more detailed understanding of the basis of these impairments will have long-term positive consequences for affected individuals, their families, and society. Beyond the clinical incentives, knowledge about a disorder of social communication also constrains theories of typical development.

Why use eye-tracking? Eye-tracking has at least three clear advantages compared with other available techniques. First, modern eye-tracking technology is very accurate. Second, the measured variable (gaze position) is often “close” to the conceptual variable (e.g. focal attention). Third, it is a non-invasive and time-efficient method suitable for infants and children (including those with ASD, who may refuse invasive methods or methods that require long testing sessions). Finally, eye-tracking is particularly valuable in the study of action and face perception, because it indicates which aspects people spontaneously attend to. Moreover, by showing videos of dynamic actions and faces, one can elicit performance that is likely to resemble performance in real life, outside the laboratory.


Understanding other people’s actions

At the core of social cognition is the ability to understand other people’s actions and intentions. When we observe somebody behave, there are many levels of action-related information available (Thioux, Gazzola, and Keysers, 2008). At a low level of description is the how of the action – referring for example to its kinematic profile. At a higher level is the what of the action – the observable and immediate action goals. At a still higher level is the why of the action – its underlying intentions. For example, if you are at a restaurant and your date discreetly looks at her watch, what-level action understanding may tell you “to see the time”, how-level action understanding may tell you “by lifting and rotating the left arm”, while why-level action understanding may tell you “this topic bores her”. Recent research has pointed to the possibility that in human adults, these levels of analysis may be driven by partly different and complementary circuits in the brain (Van Overwalle and Baetens, 2009). When simply asked what they are doing, people typically report their high-order intentions and goals (the why of the action; Vallacher and Wegner, 1987). Using other types of measures, such as eye movements, one can see that humans attend to immediate goals and obstacles (the what of the action) during action execution (Johansson, Westling, Backstrom, and Flanagan, 2001). Needless to say, action understanding is a very complex endeavour.

Eye movements in action execution and action observation

Humans often use their eyes to guide their own actions. One study by Land and colleagues (Land, Mennie, and Rusted, 1999) underlines the importance of predictive eye movements in everyday tasks. In this study, gaze was recorded during the preparation of a cup of tea, and the investigators found that, on average, gaze precedes object manipulation by approximately half a second. Thus, gaze leads the way and object manipulation follows.

Furthermore, eye movements are highly task-specific, which suggests that a particular action plan includes specific directions for the oculomotor system (Land and Furneaux, 1997). Task-specific eye movements have been documented in many goal-directed behaviors, such as driving (Land, 1992) and typing (Inhoff and Jian, 1992).

Flanagan & Johansson (2003) conducted a study of eye movements in adults that illustrates the close linkage between how humans perceive their own and other people’s actions. The researchers recorded gaze when participants executed a simple block-stacking task and when they observed another person performing the same task. Gaze predictively arrived at the landing sites of the blocks both when observing and when executing the action. In other words, adults search for goals irrespective of whether they execute the action themselves or observe another person perform it. In a control experiment, the hand moving the blocks was made invisible to the observers. In this situation, the gaze arrived at the goal sites reactively and tracked the moving objects. Thus, predictive eye movements in action observation seem to require observing a hand–object interaction.

The close resemblance between eye movements used in action execution and action observation in the Flanagan and Johansson study suggested that human adults use action plans in action observation. This interpretation implies that the motor system is involved in predicting the goal of other people’s actions. This view is in accordance with the discovery of a mirror mechanism in the brain, a mechanism that connects action execution and action perception. The core of this mechanism is this: when observing someone else executing an action, a set of neurons that encodes that action is activated in the (cortical) motor system (Rizzolatti and Sinigaglia, 2010).

These neurons are labeled ‘mirror neurons’, and they constitute the basis of the Mirror Neuron System (MNS).

Action understanding via simulation: the Mirror Neuron System hypothesis

Mirror Neurons in monkeys

Mirror neurons are sensorimotor neurons that fire irrespective of whether an action is executed or observed. Mirror neurons were first found in the ventral premotor area of the macaque (Gallese, Fadiga, Fogassi, and Rizzolatti, 1996), and later in the inferior parietal lobule (Fogassi, Gallese, Fadiga, and Rizzolatti, 1998). These two areas form the core of the MNS, but they are strongly connected to, and receive high-order visual input from, the superior temporal sulcus (Rizzolatti and Craighero, 2004). Mouth and hand actions are the most studied actions, and specific mirror neurons are dedicated not only to each of these effectors1 but also to specific hand and mouth actions such as grasping, holding, and manipulating. Later experiments have revealed the existence of audiovisual mirror neurons that fire independently of whether the action is executed, heard, or seen (Kohler et al., 2002). Furthermore, mirror neurons that are active during the execution of a specific action also fire when the monkey observes that action with its end state (goal) occluded (Umilta et al., 2001).

1 ‘Effector’ refers to the physical device (e.g. a hand) that brings about a change in the state of affairs in the world. It has been shown that, as long as a clear goal is present, even a robotic arm can activate the MNS in human observers (Gazzola, Rizzolatti, Wicker, and Keysers, 2007a).


Some mirror neurons show very high congruence between the effective perceived and performed actions, while others are apparently able to generalize action goals across different instances of an action (Gallese et al., 1996). Such broadly congruent mirror neurons are suggested to encode the overarching goal and may reflect a functional mapping of observed actions.

Mirror neurons in the parietal lobe of the monkey are believed to be involved in networks representing high-order action goals. The reason for this is that both during execution and observation, mirror neurons in this area discriminate motor acts embedded in an action sequence, dependent on the future (global) goal of that sequence (Fogassi et al., 2005).

Together, these data indicate that representations of actions and action goals are linked to the MNS in monkeys. These representations, which are organized hierarchically, are used online to plan and control own movements as well as to understand actions performed by other individuals (Fogassi et al., 2005).

The human MNS

Studies of the human MNS usually use the following basic comparisons: (i) observation of human motion versus physical movement (Nystrom, 2008), (ii) action execution versus action observation (Flanagan and Johansson, 2003), or (iii) imitation versus execution or observation (Iacoboni, Woods, Brass, Bekkering, Mazziotta, and Rizzolatti, 1999). The motor system is expected to activate when observing human movement but not when observing physical movement. Furthermore, motor activation during observation is expected to resemble activation during execution, and it should be more pronounced during imitation than during execution or observation alone (Dinstein, Thomas, Behrmann, and Heeger, 2008). Behavioral, electrophysiological, and imaging studies all support the view that a MNS exists in humans (Buccino et al., 2004; Dapretto et al., 2006; Fadiga, Fogassi, Pavesi, and Rizzolatti, 1995; Iacoboni et al., 1999; Nishitani and Hari, 2002).

However, because none of these methods proves that single cells encode both sensory and motor information from actions, one cannot establish with certainty that these effects reflect the activity of mirror neurons (single cells).

Thus, more direct measures are needed. One recent development is the use of adaptation protocols (Dinstein, Hasson, Rubin, and Heeger, 2007). This method is based on the fact that single neurons encoding a certain attribute tend to fire less if their preferred stimulus is shown repeatedly. Thus, it is argued, if actual mirror neurons exist in the putative human mirror areas, these areas are expected to show cross-modal adaptation (e.g. visual to motor). Results from studies using this protocol are mixed (Chong, Cunnington, Williams, Kanwisher, and Mattingley, 2008; Dinstein et al.), and it has been questioned whether the assumptions underlying the protocol are reasonable from a neurophysiological perspective (Rizzolatti and Sinigaglia, 2010).

A recent single-cell recording study of patients with epilepsy showed that humans have mirror neurons, and that such neurons are found not only in the areas normally considered the core of the MNS (Mukamel, Ekstrom, Kaplan, Iacoboni, and Fried, 2010). Furthermore, some mirror neurons (or ‘anti-mirror neurons’; Keysers and Gazzola, 2010) were inhibited by the same action irrespective of whether it was executed or observed. Although this study confirms that mirror neurons exist in humans, it also raises many new questions regarding the role of mirror neurons in perception and cognition.

The functional significance of the MNS in action perception

The view that activity in the monkey MNS is functionally linked to action (goal) understanding, rather than being an epiphenomenon with no functional significance, typically builds on a few key findings: (i) that (some) mirror neurons fire during observation of grasping even though the goal object is occluded, (ii) that some “audiovisual” mirror neurons are activated whether you execute an action, see the action, or hear sounds associated with the action, and (iii) that some parietal mirror neurons fire during a motor act depending on the final goal of an action sequence (Fogassi et al., 2005; Kohler et al., 2002; Rizzolatti and Craighero, 2004; Umilta et al., 2001). For a critical review of these findings and alternative interpretations, see Heyes (2010).

Does the human motor system (or the MNS specifically) serve a functional role in the perception of other people? Recently, important evidence has been presented showing that transcranial magnetic stimulation (TMS) directed at a specific part of the primary motor cortex (lip versus tongue area) enhances perception of phonemes involving the corresponding muscle (D'Ausilio et al., 2009). Similarly, on a semantic level, it has been shown that motor activity (manipulated with TMS) affects perception of words (Pulvermuller, Hauk, Nikulin, and Ilmoniemi, 2005). These pieces of evidence suggest a functional role of the motor system in language perception. On the other hand, patients with damage to Broca’s area can display a speech production deficit combined with relatively intact comprehension, and patients with damage to Wernicke’s area sometimes show the reverse symptom profile. This shows that the putative shared neural systems for language and actions do have functional specificity linked to distinct areas (Pulvermuller, 2005).

Direct evidence to suggest a functional role of the motor system (or the MNS specifically) in action perception in humans is still limited. For example, to my knowledge no one has published a study in which an induced change in motor activity has had an effect on the perception of others’ actions. Interestingly, one recent study of apraxic2 patients indicates that impairments in pairing human action-related sounds to the correct action parallel impairments in performing the same actions (Pazzaglia, Pizzamiglio, Pes, and Aglioti, 2008).

The development of the MNS in infancy


Study I of this thesis relates to the development of the MNS in human infants. At the time we were planning that study, almost nothing was known about how the MNS develops in human infants and children. The finding that human newborns imitate different mouth actions suggested that there is an innate perception-execution matching mechanism in the human brain (Meltzoff and Moore, 1977). However, this evidence is controversial (Heyes, 2010). Moreover, because older infants temporarily “lose” this ability, it is unclear how this early competence relates to later imitation (or the MNS).

It had been suggested that mirror neurons may be trained by experience with one’s own actions. Observing one’s own actions provides correlated information from the visual and motor systems. Hebbian learning (strengthened associations between neurons/systems following synchronous activity) could explain why motor neurons – after repeated practice – also fire in response to seeing another person perform actions (Keysers and Perrett, 2004). From this perspective, one would expect a correspondence between the ability to execute actions and the ability to understand other people’s similar actions. This argument is most straightforward for actions that the infant can see (or hear) during execution, and less straightforward for unseen and silent actions such as facial emotional expressions. This latter difficulty could be overcome by assuming that infants learn from pairing visual information from their parents’ expressions with their own motor activity when adults imitate them. That activity in the MNS can change during sensorimotor learning has been confirmed in later empirical studies (Catmur, Walsh, and Heyes, 2007).
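The Hebbian account sketched above can be made concrete with a toy simulation. The following is a minimal sketch under stated assumptions (a rate-based Hebbian rule, arbitrary network sizes and learning rate; none of this is from the cited studies): visual and motor units that are repeatedly co-active while the infant observes its own actions become associated, so that purely visual input later drives the corresponding motor units.

```python
import numpy as np

# Toy Hebbian sketch (all parameters are illustrative assumptions):
# when the infant watches its own actions, visual and motor units are
# co-active, and synchronous activity strengthens visual-to-motor weights.
rng = np.random.default_rng(0)
n_visual, n_motor = 20, 20
W = np.zeros((n_motor, n_visual))   # visual -> motor association weights
eta = 0.01                          # learning rate (arbitrary)

for trial in range(500):
    motor = (rng.random(n_motor) < 0.2).astype(float)  # self-generated action
    visual = motor.copy()           # seeing one's own action: visual input
                                    # is correlated with the motor output
    W += eta * np.outer(motor, visual)  # Hebbian update: co-active units bind

# After learning, visual input alone (watching someone else act) activates
# the matching motor units -- a "mirror"-like response.
observed = np.zeros(n_visual)
observed[[2, 7, 11]] = 1.0              # an observed action pattern
print((W @ observed).argsort()[-3:])    # most strongly driven motor units
```

On this account, the ‘mirror’ property is an emergent consequence of correlated sensorimotor experience rather than an innate matching mechanism, which is why the argument is harder to make for actions the infant cannot observe itself performing.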

Later, Pär Nyström at the Uppsala Babylab used electroencephalography in young infants and documented mu-rhythm suppression during action observation (but not during object observation).

As mu-rhythm suppression is typically observed during action execution, this strongly suggests that young infants employ their motor system in action observation (Nystrom, 2008).

2 Apraxia refers to an inability to execute an action despite a desire and a physical ability to do so. It is a disorder of motor planning, caused by damage to specific brain areas.


Simulation, motor resonance and direct matching

The process performed by the MNS is frequently described as simulation, motor resonance, and direct matching (Flanagan and Johansson, 2003; Gallese and Goldman, 1998; Taylor and Zwaan, 2008). The meanings attributed to these concepts differ in the literature (Csibra, 2007; Hurley, Clark, and Kiverstein, 2008; Rizzolatti and Sinigaglia, 2010).

Generally speaking, simulation can be thought of as an imitation of some ‘original’ process. Simulation involves representing key aspects of the original process. Simulations may either be implemented in a different system than that used for the original process (e.g. when computer programs are used to simulate molecular dynamics) or in some of the same (sub-)systems that were used for the original process (Hurley et al., 2008).

If processes in the brain typically associated with perceiving facial information are also activated when remembering faces (a qualitatively different task than processing faces that are seen “online”), this suggests that remembering faces involves simulation. In this case, the brain simulates seeing the face although no perceptual input is available. Likewise, if observing others’ actions activates the same processes in the brain as executing actions, this may indicate that processing others’ actions involves simulation. In this case, you simulate actually executing the action.

According to Rizzolatti et al., the advantage of using such simulation processes during action observation is that you obtain experiential knowledge about what the other person is doing (Rizzolatti and Sinigaglia, 2010). You are, literally speaking, putting yourself in someone else’s shoes.

Others have advocated the view that using the motor system to simulate actions improves prediction of what is going to happen next (Miall, 2003).

This latter view is most relevant for the work presented in this thesis.

A central question for the simulation theory of action understanding concerns which type of information is required for a simulation to take place.

According to one view, simulations implemented in the MNS run on cognitively unmediated input. This view has been labelled the ‘direct-matching’ hypothesis. According to this view, the input to the MNS is detailed kinematic information, and the output is an estimation of the goal (or intention) of the action. This view has been criticised (Csibra, 2007). Csibra argues that the available evidence is more consistent with the view that the input to the MNS is typically a description of the intention of an action, rather than detailed kinematic input. In other words, key aspects of ‘action understanding’ occur outside the MNS.

The concept of motor resonance is closely connected to direct matching, because the metaphor indicates that a simulation does little more than duplicate an observed action. One may say that the difference between these concepts is that while the ‘direct-matching’ hypothesis concerns which type of information is used as the input for simulations, motor resonance describes the simulation itself.

To say that the brain sometimes uses motor simulations in action perception is not very controversial. The current dispute mainly concerns the scope and function of motor simulations in social perception (Csibra, 2007; De Lange, Spronk, Willems, Toni, and Bekkering, 2008), as well as the development of the MNS in evolution (Heyes, 2010). For example, do humans use motor simulation to understand the “deeper” intentions of other people? This and related questions are discussed in more detail later (see General Discussion).

Understanding actions without simulation

Simulation is not the only route to action understanding. Here I will review two alternative accounts: the modularist account of action understanding and the Teleological Stance theory.

Modularist accounts of action understanding

“Of the proper participants of motion some are moved by themselves and others by something not themselves, and some have a movement natural to themselves and others have a movement forced upon them which is not natural to them. Thus the self-moved has a natural motion. Take, for instance, any animal: the animal moves itself, and we call every movement natural, the principle of which is internal to the body in motion.”

Aristotle, Physics (vol. V, p. 307)

According to modularist theories of action understanding, cues such as self-propulsion or the direction of movement activate a prewired mechanism specialized for processing and inferring mental states. From this viewpoint, it is natural to look for early developmental evidence that self-propelled motion is differentiated from nonself-propelled motion. For example, Premack (1990) proposed that young infants divide moving objects into two kinds: those that are self-propelled and those that are not. It was hypothesized that motion changes that are not caused by an external force (and that thus violate the basic laws of physics) will be processed very differently from the motion of nonself-propelled objects (as illustrated by the above quotation, this idea is not new). According to Premack, self-propulsion induces an intentional stance in the infant, while nonself-propulsion induces a causal stance. He argued that self-propelled objects will be perceived as having the intention to move.


Empirical studies of infant behaviour have shown that self-propulsion can facilitate perception of goal-directedness, even in the absence of human features (Luo and Baillargeon, 2005). Moreover, the fact that newly hatched chicks prefer self-propelled objects over nonself-propelled objects (controlling for physical differences) indicates a hard-wired mechanism related to preference for self-propulsion (Mascalzoni, Regolin, and Vallortigara, 2010). Today, however, these findings are generally not interpreted as evidence that the infants (or chicks) automatically perceive self-propelled objects as intentional agents. Rather, most investigators seem to prefer the weaker interpretation that infants can perceive animacy based on self-propulsion, but that self-propulsion by itself is insufficient to induce an intentional stance (Mascalzoni et al., 2010). In the next section, I will outline an influential account of infants’ perception of actions, an account that specifies the stimulus properties that are required in order for humans (including infants) to perceive that moving objects have goals.

Action understanding via teleological reasoning

At the heart of teleological reasoning lies the principle of rational action.

This principle states that actions function to realize goal-states by the most efficient means available in the situation (Gergely and Csibra, 2003).

According to this account, there are three essential components of an action: the behaviour of the agent, the goal state, and the situational constraints. The idea is that, by means of the principle of rational action, knowledge of two of these aspects is sufficient to infer the remaining aspect. For example, one can infer the likely behaviour of an agent based on knowledge of the situational constraints and the goal state. Likewise, one can infer the likely goal state based on knowledge of the behaviour and the situational constraints. Lastly, one can infer the likely situational constraints based on the behaviour and the end state. According to the Teleological Stance theory, these inferences can be performed by preverbal infants.
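The inferential triangle just described can be illustrated with a small sketch (my own toy formalization, not a model from the cited work; ‘efficiency’ is reduced to path length): given the goal and the situational constraints one can predict the behaviour, and given the behaviour and the constraints one can classify the event as goal-directed only if the path taken is (near) maximally efficient.

```python
import math

def path_length(path):
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))

def efficient_path(start, goal, barrier_height=None):
    """Infer behaviour from goal + constraints: a straight line if the way
    is clear, an arc over the barrier otherwise."""
    if barrier_height is None:
        return [start, goal]
    midpoint = ((start[0] + goal[0]) / 2, barrier_height)
    return [start, midpoint, goal]

def is_goal_directed(observed_path, goal, barrier_height=None, tol=1e-6):
    """Infer goal-directedness from behaviour + constraints: the event is
    construed as goal-directed only if it is (near) maximally efficient."""
    expected = efficient_path(observed_path[0], goal, barrier_height)
    return path_length(observed_path) <= path_length(expected) + tol

start, goal = (0.0, 0.0), (10.0, 0.0)
jump = [start, (5.0, 3.0), goal]   # trajectory arcing over a possible barrier

print(is_goal_directed(jump, goal, barrier_height=3.0))   # True: barrier present
print(is_goal_directed(jump, goal, barrier_height=None))  # False: 'irrational jump'
```

Applied to the habituation experiment described below, the same jumping trajectory comes out as goal-directed when a barrier is present and as a violation of the principle of rational action when it is not, which matches the pattern of looking times reported by Csibra et al. (1999).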

Empirical support for the theory is strong (Csibra, Gergely, Biro, Koos, and Brockbank, 1999; Gergely and Csibra, 2003; Gergely, Nádasdy, Csibra, and Bíró, 1995). For example, in one experiment infants were first habituated to an event in which one circular shape (object A) moved over a wall-like barrier and ended up close to the position of another circular shape (object B; Figure 1; Csibra et al., 1999). In the test phase, the infants were presented with two similar stimuli, but the barrier was removed. In a direct approach condition, object A moved in a straight line towards object B. This behavior is predicted by the principle of rational action (see above). In an indirect approach condition, object A repeated the same trajectory as it had taken in the habituation trial, that is, the trajectory that went over the barrier.

Results showed that infants looked preferentially at the indirect approach. Of course, this result cannot be explained by novelty preference, because in the indirect approach condition the action was the same as in the habituation phase and thus not novel. Rather, the result indicates that infants did not expect the ball to take the “irrational jump” in the absence of a barrier. The authors concluded that infants perceive an animated shape as goal-directed if and only if it attains this goal via the most efficient means in the situation.

Other similar studies have confirmed and broadened these findings on infants’ perception of goal-directedness in animated shapes.

Figure 1. Typical stimuli from a study of teleological reasoning in infants (Csibra et al., 1999). First, the infants are habituated to seeing an object (white circle) ”jump” over a barrier and arrive at another object (black circle). Then, in the test phase, the barrier is removed. Looking time (dishabituation) is compared for an indirect approach condition (the object still jumps) and a direct approach condition (the object moves in a straight line). Twelve-month-old infants typically look longer at the indirect approach than at the direct approach. This has been taken as evidence that they realize that the principle of rational action has been violated in the indirect approach.

Teleological reasoning differs in important ways from both modularist accounts of mentalistic attribution and the MNS account of action understanding (Gergely and Csibra, 2003). First, teleological reasoning is not necessarily mentalistic reasoning (although mentalistic interpretation may utilize the same principle; ibid). This follows directly from the principle of rational action, which does not specify any need for mentalistic representations. In other words, one may be able to draw conclusions about actions on the basis of the principle of rational action with or without attributing mental properties to the observed events. The fact that infants seem to apply the principle of rational action to observed events before they show direct evidence of mentalistic thinking supports the view that teleological reasoning is a precursor of mentalistic thinking (Csibra, Biro, Koos, and Gergely, 2003; Onishi and Baillargeon, 2005). Secondly, in contrast to the MNS theory of action understanding, teleological reasoning does not depend on any similarity between the observer and the observed agent.

In fact, in the now classical empirical work in support of the Teleological Stance theory, participants observed basic geometrical shapes “behaving” either rationally or irrationally (with respect to the principle).


Face perception

So far, this introduction has primarily discussed one key social competence: understanding the actions of others. The next part of the introduction is concerned with another competence needed to take part in social life efficiently: understanding faces. This part relates specifically to Study III, which investigated face processing in children with ASD.

Neural systems involved in face perception

The neural systems for face perception have been intensively studied, and I will briefly summarize some main conclusions. According to an influential model, face perception in adults is accomplished by a distributed neural system that separates invariant (e.g. identity) from variant (e.g. expression) information in faces (Haxby, Hoffman, and Gobbini, 2000). In the visual system, division of labour begins directly behind the retina, where information is channelled into either the parvocellular stream (high spatial frequency color information) or the magnocellular stream (low spatial frequency motion information). The parvocellular and the magnocellular streams map onto the ventral (“what”) and dorsal (“where”) streams, respectively. In humans, the ventral stream connects to the fusiform face area (Goffaux and Rossion), and according to the distributed network model, this area encodes invariant structural information (Hasselmo, Rolls, and Baylis, 1989). In contrast, the superior temporal sulcus processes variant static face information, such as facial expression and lip forms, and dynamic information from the face, such as gaze and head direction (Hasselmo et al., 1989; Narumoto, Okada, Sadato, Fukui, and Yonekura, 2001; Nishitani and Hari, 2002). Although separable, the channels and systems interact, such as when motion information is used to build up facial structure under poor viewing conditions (O'Toole, Roark, and Abdi, 2002). According to the distributed neural system model of face perception, the fusiform face area and the superior temporal sulcus (“the core system”) recruit other expert regions (“the extended system”) to extract meaning from faces, such as in audiovisual speech perception (Haxby et al., 2000).

‘Blindsight’ patients (with damage to primary visual cortex) and prosopagnosic patients (who are unable to recognize faces that were familiar to them before brain injury) can detect faces and some expressions (de Gelder, Frissen, Barton, and Hadjikhani, 2003; Morris, DeGelder, Weiskrantz, and Dolan, 2001). Also, neglect patients can show visual extinction4 to objects but not faces in the neglected area of vision (Vuilleumier, 2000). These findings point to the existence of a fast subcortical face processing pathway dedicated to face detection and recognition of expressions (Johnson, 2005).

4 Visual extinction refers to the phenomenon that some patients fail to see a stimulus in the neglected visual field if and only if another stimulus is presented in the non-neglected visual field (Johnson, 2005).

Information from facial features: the eyes and the mouth

Study III of this thesis relates to the perception of two parts of the face: the eye area and the mouth area. The eye area has previously been linked to perception of emotion. Emotions play a central role in the regulation of social activity, and it is likely that the ability to transmit different emotions – as well as the ability to decode the emotions of others – has evolved under strong evolutionary pressures (Darwin, 1872). Much of the emotional information communicated between people is transmitted via visual information from the face. It has been shown that humans selectively attend to information from the particular face parts that most clearly disambiguate emotional expressions. For example, when looking at others’ negative emotional expressions, human adults tend to attend selectively to information from the eye area (Smith, Cottrell, Gosselin, and Schyns, 2005).

When looking at positive/neutral expressions (happiness, surprise), they selectively attend to information from the mouth area (ibid).

Visual information from the mouth can also facilitate the perception of auditory speech signals. Human adults tend to look more at the mouth when the speaker’s voice is accompanied by noise (Vatikiotis-Bateson, Eigsti, Yano, and Munhall, 1998). The McGurk effect (the phenomenon that seeing mouth movements that are incongruent with heard syllables modulates the perception of the syllables) illustrates that the visual and auditory systems are highly integrated during speech processing (McGurk and Macdonald, 1976). Developmental data show that looking at the mouth in infancy is positively related to later language development (Young, Merin, Rogers, and Ozonoff, 2009).

The eyes do not only transmit emotionally laden information, they also transmit important information about what the observed person is looking at (Gredebäck, Theuring, Hauf, and Kenward, 2008; von Hofsten, Dahlstrom, and Fredriksson, 2005). By following the gaze direction of other people, one can rapidly look at the places in the environment that are judged by that person to be worth attending to. This attentional synchrony has obvious advantages both in competitive and cooperative situations.

Taken together, the mouth and eye regions of the face differ in terms of the type of information they transmit. Attending to the eye region will help you rapidly perceive the negative emotions of other people, as well as what in the physical environment they are focusing on. Visual attention to the mouth region facilitates speech perception and perception of positive emotions.


Configural versus featural processing of faces

The face is a complex stimulus, and efficient face processing depends on analysis both of parts and of the configuration of multiple parts (Lobmaier and Mast, 2007; Moscovitch, Winocur, and Behrmann, 1997; Rakover, 2002; Schwaninger, Lobmaier, and Collishaw, 2002). Almost 40 years ago came the first report that the recognition of faces, compared to other objects, was disproportionately affected by inversion (Yin, 1969). Later research has shown that the electrophysiological brain response named N170, which is larger for faces than for many other stimuli, is modulated by face inversion (Bentin, Allison, Puce, Perez, and McCarthy, 1996; Itier, Alain, Sedore, and McIntosh, 2007). These phenomena, known as ‘face inversion effects’, are believed to reflect that inversion specifically disrupts configural information (rather than featural information), and that face perception relies heavily on configural processing.

Behavioral evidence suggests that infants as young as four to six months process inverted faces differently from upright faces (Fagan, 1972; Turati, Sangrigoli, Ruel, and de Schonen, 2004). At the same time, a large body of literature suggests that holistic processing is a much later development (Joseph et al., 2006; Mondloch, Le Grand, and Maurer, 2002; Passarotti, Smith, DeLano, and Huang, 2007; Taylor, Edmonds, McCarthy, and Allison, 2001). According to one view, children process faces in a piecemeal fashion until the age of ten years (Carey and Diamond, 1977). Although more recent studies have found evidence for holistic recognition in young preschoolers (de Heering, Houthuys, and Rossion, 2007; Pellicano and Rhodes, 2003; Tanaka, Kay, Grinnell, Stansfield, and Szechter, 1998), there is still a considerable gap between preferential-looking studies claiming to have documented holistic processing in infancy and other behavioral paradigms failing to find such evidence before early childhood.

Autism Spectrum Disorder

Study II and Study III of this thesis include children with ASD, a disorder of social communication. Here, I will provide a general description of this disorder, followed by a discussion of findings that relate specifically to the two studies (action perception and face perception in ASD, respectively).


General description

ASDs are pervasive developmental disorders that affect around 0.6% of the population (Fombonne, 2005) and are defined by (i) social impairments, (ii) communicative impairments, and (iii) repetitive/restricted behaviours and interests (American Psychiatric Association, 1994). Autistic disorder, Asperger’s syndrome, and Pervasive Developmental Disorder – Not Otherwise Specified (PDD-NOS) are the three diagnoses usually defined as ASDs. The diagnosis autistic disorder requires severe symptoms in all three symptom areas. The diagnosis Asperger’s syndrome requires severe symptoms only in the first and third symptom domains. The PDD-NOS diagnosis is less stringently defined than the other two and is meant to capture individuals who do not meet the criteria for a full diagnosis but nevertheless are judged to have significant impairments within the symptom triad. Other, less common pervasive developmental disorders with clear similarities to ASDs include Rett syndrome and childhood disintegrative disorder. Boys are at increased risk for ASDs (ratio ~4:1). Twin studies indicate that ASDs have a strong genetic basis, although the exact nature of the ‘autistic genotype’ is far from fully understood (Happe, Ronald, and Plomin, 2006). Ten to twenty percent of ASD cases can be accounted for by identified genetic abnormalities (Abrahams and Geschwind, 2008). At the same time, no single abnormality accounts for more than 1-2% of the cases (ibid). Thus, ASDs are genetically heterogeneous. In addition to the identified genetic abnormalities, rare combinations of common gene variants could underlie (other types of) ASDs. Against this background, it is not surprising that studies of pathophysiology in ASDs fail to identify a simple explanation at the molecular, cellular, or systems level (Amaral, Schumann, and Nordahl, 2008). Current major hypotheses relate to differences in total brain volume, alteration of the columnar structure of the neocortex, or neuropathology of brain regions such as the cerebellum and amygdala (ibid).

Signs of ASD are usually noted by parents during the first two years of life (Johnson and Myers, 2007). Intensive and early interventions can be effective (Myers and Johnson, 2007), but most individuals diagnosed with ASD continue to display socio-communicative symptoms. Based on UK data, Knapp, Romeo, and Beecham (2009) estimated the lifetime societal costs of ASD to lie between 0.8 and 1.23 million pounds (per individual).

Thus, research into the etiology of the disorder, as well as studies of intervention, is likely to have long-term positive consequences from both an individual/clinical and a societal/economical point of view.


Psychological explanations of ASD

There are two main categories of psychological explanations of ASD.

Theories belonging to the first category hold that ASD is a disorder of social cognition. According to one such view, ASD is characterized primarily by deficits in understanding other people’s minds (e.g. Hamilton, Brindley, and Frith, 2007). Another social explanation holds that the core symptoms in ASD relate to difficulties with understanding basic actions and in imitating (Iacoboni and Dapretto, 2006). This view will be discussed in more detail in the next section, and relates specifically to Study II of this thesis. A third type of social explanation of ASD focuses on impairments in face processing (Hobson, Ouston, and Lee, 1988). This view relates to Study III of this thesis.

The second category of psychological theories of ASD includes various ‘non-social’ explanations. Executive function hypotheses posit that individuals with ASD have difficulties with “frontal” functions such as planning, working memory, and cognitive flexibility (Damasio and Maurer, 1978; Kenworthy, Yerys, Anthony, and Wallace, 2008). Other authors advocate the view that there is enhanced processing of parts in individuals with ASD and, in certain tasks, a tendency to ignore “the big picture” (Happe and Frith, 2006; Mottron, Dawson, Soulieres, Hubert, and Burack, 2006).

The view that ASD is characterized by an imbalance between systems for empathizing and systemizing in the brain (Baron-Cohen, Knickmeyer, and Belmonte, 2005) falls somewhat in between the two main categories. Given the heterogeneity of ASDs, it has been pointed out that it is unlikely that one unified psychological theory of ASDs will be satisfactory (Happe et al., 2006).

MNS dysfunction in ASD?

The aim of Study II of this thesis was to test the MNS dysfunction hypothesis of ASD. According to this hypothesis, the social impairments defining the disorder are linked specifically to a dysfunction in the MNS (Iacoboni and Dapretto, 2006). There are two principal experiments supporting this theory (Cattaneo et al., 2007; Dapretto et al., 2006). Cattaneo et al. (2007) reported that, in both execution and observation, typically developing children discriminated between two identical acts (grasping) depending on the final goal of the action sequence, while children with ASD discriminated later in execution and not at all in observation. The performance of the typically developing children is reminiscent of the properties of parietal mirror neurons in monkeys (Fogassi et al., 2005). Because discrimination occurred later during execution in ASD, the authors point to the possibility of a global action understanding problem in ASD, relating both to understanding one’s own actions and to understanding the actions of others.

The second study (Dapretto et al., 2006) builds on the assumption that the MNS mediates perception of others’ emotions via action representations of facial expressions (Carr, Iacoboni, Dubeau, Mazziotta, and Lenzi, 2003). In other words, motor empathy mediates affective empathy. Dapretto et al. (2006) reported that seeing emotional faces activated the frontal component of the MNS, as well as the insula and limbic structures, less in twelve-year-olds with ASD than in IQ-matched controls, and that activity in these areas showed a high negative correlation with socio-emotional symptom severity in individuals with ASD. Later experiments have shown that measures of empathy and interpersonal skills correlate positively with activity in these areas in typically developing children as well (Pfeifer, Iacoboni, Mazziotta, and Dapretto, 2008).

Together, these studies represent the most direct evidence that a MNS dysfunction is at the core of social impairments in ASD. The theory has been challenged on both theoretical and empirical grounds (Dinstein et al., 2008; Dinstein et al., 2010). Along with related findings, I will discuss these issues in more detail later on.

Face perception in ASD

Study III of this thesis investigated the link between autistic symptoms and face perception. Previous research has shown that individuals with ASD have difficulties recognising face identity and facial expressions (Boucher and Lewis, 1992; Dawson et al., 2005). Unlike typical controls, individuals with ASD recruit object-specific areas when looking at faces (Schultz et al., 2000), and they recruit the fusiform face area less than controls do (Pierce, Muller, Ambrose, Allen, and Courchesne, 2001). Both brain activation during face observation (Dalton et al., 2005) and social skills outside the laboratory (Klin, Jones, Schultz, Volkmar, and Cohen, 2002) are linked to the amount of time spent fixating the core features of the face (i.e. mouth and eyes), as measured with eye-tracking. Studies have found that face inversion effects are less pronounced in ASD than in typically developing individuals, a finding generally explained by poor configural information processing and/or increased processing of parts in ASD (Frith, 1989; Hobson et al., 1988; McPartland, Dawson, Webb, Panagiotides, and Carver, 2004; Rouse, Donnelly, Hadwin, and Brown, 2004; Tantam, Monaghan, Nicholson, and Stirling, 1989; Teunisse and de Gelder, 2003; van der Geest, Kemner, Verbaten, and van Engeland, 2002). Similarly, two eye-tracking studies have found different face inversion effects in ASD compared to typically developing children (Falck-Ytter, 2008; van der Geest et al., 2002). In contrast to much previous work in this area, Study III was dedicated to understanding differences in face processing within the ASD population, rather than differences between ASD and typically developing individuals.


The aims of this thesis

The primary aim of this thesis was to test theoretically derived hypotheses about action and face perception in children using eye-tracking. Because the studies employed ecologically valid stimuli such as faces and manual actions, the studies were judged to be of more general interest as well.6

Study I tested the hypothesis that prediction of others’ actions is linked to action experience. This hypothesis derives from the MNS hypothesis of social cognition, according to which motor competence is an important knowledge base in many social competences.

Because it has been suggested that a dysfunction in the MNS underlies the social impairments defining ASD, Study II asked whether prediction of other people’s action goals was impaired in children with this disorder.

Study III asked how individual differences in socio-communicative symptoms in ASD relate to face scanning. Previous work has shown that the eyes have an important role in emotional communication while visual information from the mouth facilitates verbal communication. We tested the hypothesis that the magnitude of the difference between socio-emotional and non-verbal communication symptoms is related to where in the face one looks (eyes versus mouth).

6 In addition, a long-term aim (not directly of the studies reported here, but of ongoing/planned studies of a similar type) is to evaluate the validity of eye-tracking as a


Methods

Recruitment of participants

Typically developing children/infants were recruited from birth records.

Parents who chose to participate returned a pre-paid letter with contact details, after which the experimenter called them to arrange the eye-tracking assessment. Typically developing adults were recruited from a university campus area.

Families of all children with ASD who had recently been diagnosed at Habilitation centres in Sweden were invited to participate via letters given to them by a clinician. Parents who chose to participate returned a pre-paid letter with contact details, after which the experimenter called them to arrange the eye-tracking assessment.

Stimuli

Stimuli were either videos (all Studies) or pictures (Study III) of manual actions and/or human faces. Stimuli were recorded using a digital video camera recorder (Sony HDR-FX1; Sony Corporation, Tokyo, Japan) and edited with Sony Vegas (Sony Corporation, Tokyo, Japan). The stimuli lasted between 4 (faces; Study III) and 13 seconds (actions; all Studies).

Procedure

Typically developing participants were tested in the lab (Babylab, Department of Psychology, Uppsala University). After arrival and a short period of familiarization with the experimenter, participants (infants/children: with their caregiver) were brought to the eye-tracking room. Infants were seated in a car safety seat placed on their parent’s lap.

All other participants sat on a chair in front of the eye-tracker monitor. The experimenter was seated behind the participant. Before recording, participant and monitor positions were adjusted to obtain satisfactory gaze tracking status. A calibration procedure was always conducted, typically with a coloured moving target accompanied by a sound. Verbally able participants were told that they were going to look at some short movies on the monitor.


Between stimuli, attention getters (e.g. colored circles expanding and contracting on a black background, accompanied by a sound) were shown until the participants looked at the monitor.

Children with ASD were tested at Habilitation centres for children (in which their families received counselling). After arrival at the centre and a short period of familiarization with the experimenter, they were brought to the study room (which they were familiar with from previous visits at the centre) with their caregivers present. Both the experimenter and caregivers were seated behind the participants. If children with ASD were inattentive for more than about four seconds, they were prompted to look at the monitor.

All participants received a small compensation (e.g. vouchers to use in a shop) for their participation.

Apparatus

All studies reported in this thesis use a technique based on recording of corneal reflections of infrared light. The basic principle is simple: infrared light is projected onto the eyes of the subject from multiple light diodes (Aslin and McMurray, 2004). The light is hardly noticeable, unless the room is very dark (in our experiments, the subject always observes a lit monitor screen). A camera mounted in front of the subject “sees” the reflection of the infrared light on the cornea, as well as the contour of the pupil (Figure 2). Computer algorithms (from the manufacturer) relate the reflection of the light to the location of the pupil in the image. In Study I and Study II, we used a 50 Hz system (Tobii 1750). In Study III, we used a 60 Hz system (T120). Both systems measure gaze as the subject watches an integrated 17 inch monitor, and both were manufactured by Tobii Technology (Stockholm, Sweden).
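As a rough illustration of the principle (the manufacturer’s actual algorithms are proprietary; the simple linear mapping below is my assumption, not Tobii’s method), the gaze-sensitive signal can be taken to be the vector from the corneal reflection (‘glint’) to the pupil centre, and the calibration procedure can be thought of as fitting a mapping from that vector to screen coordinates:

```python
import numpy as np

def fit_calibration(pupil_glint_vectors, screen_points):
    """Fit a simple linear map (least squares, one model per screen axis)
    from (dx, dy, 1) to screen position, using the calibration targets."""
    X = np.column_stack([pupil_glint_vectors, np.ones(len(pupil_glint_vectors))])
    coeffs, *_ = np.linalg.lstsq(X, screen_points, rcond=None)
    return coeffs

def estimate_gaze(pupil_glint_vector, coeffs):
    dx, dy = pupil_glint_vector
    return np.array([dx, dy, 1.0]) @ coeffs

# Hypothetical calibration: five targets and the pupil-glint vectors
# measured while the subject fixated each of them.
targets = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5]], dtype=float)
vectors = targets * 0.1 + 0.02      # toy linear relation between the two
coeffs = fit_calibration(vectors, targets)

print(estimate_gaze((0.07, 0.07), coeffs))  # ~[0.5, 0.5]: centre of the screen
```

A real system adds camera geometry, head-movement compensation, and typically a richer (e.g. polynomial) mapping, but the logic of the calibration step is the same: known targets anchor the mapping that is then applied to every recorded sample.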


Figure 2. The basic principles of corneal-reflection eye-tracking. Note that in the eye-trackers used in the studies of this thesis, the camera as well as the light sources are integrated into a monitor (which to a naïve observer looks like a normal PC monitor). The picture is modified with permission from Tobii Technology, Inc (Stockholm, Sweden).

Data analysis

The eye tracker saves the recorded gaze data as a text file in which each row corresponds to one sample (lasting 16-20 ms depending on the system) and in which the columns contain data (e.g. gaze position). These files were imported into MATLAB (MathWorks Inc, Natick, MA) and analyzed in a series of steps. First, gaze data from both eyes were averaged. Second, gaze data were analyzed with respect to what was shown on the screen at the time of the particular data sample. This was always done using Areas Of Interest (AOIs), which represent predefined parts of the screen judged to be important for the purpose of the study. For example, in Study III we created three AOIs: one covering the whole face, one covering the mouth, and one covering the eyes of an observed face stimulus. If a participant looked at the eye AOI during the whole stimulus presentation, this would be coded as 100% ‘eye looking’. Finally, based on such AOI-based calculations, software written in MATLAB reduced the original data file to individual summary statistics to be used for group-level statistical comparisons.
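A minimal Python sketch of these steps (the original pipeline was written in MATLAB; the column layout and the AOI coordinates below are hypothetical): average the two eyes, test each sample against a rectangular AOI, and express looking time as a percentage of valid samples.

```python
import numpy as np

def percent_looking(samples, aoi):
    """samples: (n, 4) array with columns [left_x, left_y, right_x, right_y].
    aoi: (x_min, y_min, x_max, y_max). Returns % of valid samples inside."""
    gaze = np.column_stack([
        samples[:, [0, 2]].mean(axis=1),   # average the two eyes, x
        samples[:, [1, 3]].mean(axis=1),   # average the two eyes, y
    ])
    valid = ~np.isnan(gaze).any(axis=1)    # drop samples without gaze data
    x_min, y_min, x_max, y_max = aoi
    inside = (
        (gaze[:, 0] >= x_min) & (gaze[:, 0] <= x_max) &
        (gaze[:, 1] >= y_min) & (gaze[:, 1] <= y_max)
    )
    return 100.0 * (inside & valid).sum() / valid.sum()

# e.g. an 'eyes' AOI, with coordinates in screen pixels (illustrative only):
# eye_looking = percent_looking(samples, aoi=(440, 260, 840, 420))
```

A value of 100 for the eye AOI corresponds to the 100% ‘eye looking’ example above.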

In Studies I and II, the software also took into account when during the stimulus presentation gaze was recorded from an AOI. In Study III, we used so-called ‘moving AOIs’ in order to accurately measure gaze fixations on moving targets (such as a moving face). Coordinates for these AOIs were determined via visual inspection (by mouse clicking in the frames from the stimulus movie) using computer programs written in MATLAB by Pär Nyström. Statistical analyses were done in SPSS (SPSS Inc, Chicago, IL).
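The moving-AOI idea can be sketched as follows (frame numbers and coordinates are made up; the actual MATLAB programs are not reproduced here): AOI positions clicked in a few key frames are interpolated for every frame in between, so that the AOI follows the moving face.

```python
import numpy as np

keyframes = np.array([0, 30, 60])   # frames in which the AOI was clicked
centres = np.array([[100, 200], [160, 210], [240, 190]], dtype=float)

def aoi_centre(frame):
    """Linearly interpolate the AOI centre for an arbitrary frame."""
    x = np.interp(frame, keyframes, centres[:, 0])
    y = np.interp(frame, keyframes, centres[:, 1])
    return x, y

print(aoi_centre(45))   # midway between the second and third key frame
```

Each gaze sample is then tested against the AOI interpolated for the video frame shown at that moment, rather than against a static region.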


Study I

Study I was motivated by recent theorizing on the role of the MNS in developmental social cognition (Gallese, Keysers, and Rizzolatti, 2004; Keysers and Perrett, 2004), but the most direct impetus came from a recent behavioral paradigm that allowed investigation of mirror processes via eye-tracking. This paradigm was invented by Flanagan and Johansson (2003) and involved measuring eye movements during action execution and action observation, respectively. As reviewed earlier, task-specific predictive eye movements are crucial for planning and control of visually guided actions (Johansson et al., 2001; Land and Furneaux, 1997). What Flanagan and Johansson showed was that adults also use such eye movements when they observe similar actions performed by others. The striking resemblance of eye movements observed during execution and observation indicates that motor plans guide eye movements in both situations.

Seeing the actor performing the action is essential. Flanagan and Johansson showed that eye movements were reactive when the observers were prevented from seeing the hands of the actor, but instead saw seemingly self-propelled objects. The authors concluded that predictive eye movements require seeing a hand object interaction and that the results supported the view that action prediction relies on a direct-matching mechanism (see Introduction).

In Study I, we adopted the Flanagan and Johansson paradigm and used it to test hypotheses about infant development derived from the MNS theory (Gallese et al., 2004). The MNS theory includes two different aspects, both of which were taken into account in the preparation of the study. First, action prediction should be linked to action experience. This implies that being able to perform an action should facilitate prediction of similar actions performed by others. The particular action chosen was a placement action (placing toys into a bucket), which infants seldom master before seven to nine months of age. Thus, according to the MNS theory, prediction of others’ actions would not be expected before this age. Secondly, according to the MNS theory, the MNS is involved in many broader social competences such as imitation and communication via gestures and language. The rudiments of these competences emerge at around eight to twelve months of life. If the MNS underlies the development of these competences, evidence of basic MNS functions such as action prediction should be present at the same age. Taken together, from the MNS theory it is expected that twelve-month-old infants, but not six-month-old infants, will predict the goal of other people’s placement actions.

According to the Teleological Stance theory, neither having motor experience with similar actions nor seeing a human actor is necessary in order to attribute goal-directedness to observed events (see Introduction). This theory states that goal-directedness will be attributed to any event that moves in an efficient way to a goal within the constraints of the situation (Kiraly, Jovanovic, Prinz, Aschersleben, and Gergely, 2003). The theory applies to infants as well as adults (Gergely and Csibra, 2003). By including both objects that were moved to a goal by a human hand and the same objects moving by themselves to the goal, we could evaluate the merits of these conflicting views with respect to prediction. According to the MNS theory, only objects moved by a hand should be predicted. According to the Teleological Stance theory, no difference between these conditions is expected.

Design

In Study I, we recorded eye movements during action observation in infants to determine whether they predicted the goal of simple manual actions performed by another person (Figure 3). Two groups of infants aged six and twelve months were included, as well as an adult comparison group. The independent variable in this experiment was movie type, with three conditions. In the Human Agent condition, participants observed video recordings of an actor's hand moving toys to a bucket. This condition was administered to all groups. In the Self-propelled condition, the motion was identical to the Human Agent condition except that no hand moved the toys. We also included a Mechanical Motion condition, in which three objects moved with constant velocity along a smooth curvilinear path to the bucket. The latter two conditions were control conditions, administered only to the groups that were predictive in the Human Agent condition. All conditions included nine identical trials, and each trial consisted of three separate actions/object movements to the bucket. Each subject was shown only one condition.
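Gaze data of this kind are scored against rectangular areas of interest (AOIs; see Figure 3a). The analysis programs were written in MATLAB (see Methods); the Python fragment below, with invented coordinates and synthetic samples, is only a sketch of how a gaze sample can be tested against a goal AOI.

    # All coordinates and samples below are invented for illustration.
    GOAL_AOI = (520, 620, 200, 300)  # hypothetical x_min, x_max, y_min, y_max (pixels)

    def in_aoi(x, y, aoi):
        # True if the gaze sample (x, y) falls inside the rectangular AOI.
        x_min, x_max, y_min, y_max = aoi
        return x_min <= x <= x_max and y_min <= y <= y_max

    gaze_samples = [(0, 100, 250), (40, 300, 260), (80, 560, 250)]  # (ms, x, y), synthetic

    # Time stamp of the first gaze sample inside the goal AOI, if any.
    first_goal_hit = next((t for t, x, y in gaze_samples if in_aoi(x, y, GOAL_AOI)), None)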


Figure 3. Sample pictures of the stimulus videos used in Study I. (a) Conditions Human Agent and Self-propelled, with AOIs (black squares) and the trajectory of each object (colored lines) superimposed. (b) Stimulus in condition Mechanical Motion. Reproduced with permission from Falck-Ytter, Gredebäck, and von Hofsten (2006).

Two types of dependent measures were included in the study. First, we measured the arrival of gaze at the goal area relative to the arrival of the moving target (hand/object). Second, we measured the spatial distribution of gaze. The spatial analysis was based on the ratio of looking time in the goal area to total looking time in the combined goal and trajectory areas during target movements. One-way ANOVAs and Bonferroni post hoc tests were used to test the hypotheses statistically. One-sample t-tests were used to test individual group data against predefined test values. Bonferroni-corrected paired-samples t-tests were used to test for learning effects.
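To make these measures and tests concrete, a minimal sketch follows. The actual analyses were carried out in MATLAB and SPSS (see Methods); the Python/SciPy code below is only an illustrative reconstruction, and all numerical values in it are fabricated for the example.

    import numpy as np
    from scipy import stats

    # Synthetic per-subject timing values (ms, gaze arrival minus target arrival);
    # negative values mean gaze arrived at the goal before the target.
    adults    = np.array([-180., -120., -210., -150., -90.])
    twelve_mo = np.array([-140.,  -60., -110., -170., -80.])
    six_mo    = np.array([  90.,  140.,   60.,  120., 180.])

    # Main effect of age on gaze timing (one-way ANOVA).
    f_val, p_val = stats.f_oneway(adults, twelve_mo, six_mo)

    # One-sample t-test of one group's timing against 0 ms (target arrival).
    t_time, p_time = stats.ttest_1samp(twelve_mo, popmean=0.0)

    # One-sample t-test of goal-looking ratios against 0.2, the ratio
    # expected if subjects merely tracked the moving target (see Figure 4b).
    ratios = np.array([0.55, 0.48, 0.62, 0.51, 0.58])
    t_ratio, p_ratio = stats.ttest_1samp(ratios, popmean=0.2)

    # Bonferroni-style post hoc comparison: a pairwise t-test evaluated
    # against an alpha divided by the number of comparisons (here 3).
    t_pair, p_pair = stats.ttest_ind(adults, six_mo)
    significant = p_pair < (0.05 / 3)

The Bonferroni adjustment sketched here simply divides the nominal alpha level by the number of comparisons (or, equivalently, multiplies each p-value by that number), which is what the corresponding SPSS post hoc option amounts to.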

Results

We found that in the Human Agent condition, there was a main effect of age on predictive eye movements to the action goal (F(2,30) = 19.845, p < .001; Figure 4a). There was no significant difference between the adults and the twelve-month-olds, while the differences between both these groups and the six-month-olds were significant (p < .001; Bonferroni post hoc test). Both adults and twelve-month-olds looked at the bucket before the hand arrived there (p < .05). In contrast, the six-month-olds looked at the goal after the hand arrived (p = .01). The participants in the different age groups also distributed their fixations differently across the movement trajectory in this condition (F(2,30) = 12.015, p < .001; Figure 4b). Post hoc testing (Bonferroni) failed to find a significant difference between the adults and the twelve-month-olds. Both these groups differed from the six-month-olds (p < .001). Adults and twelve-month-olds looked significantly longer at the bucket during target movement than expected from visual tracking (p < .001), while the six-month-olds did not.

Figure 4. Gaze performance during observation of actions and moving objects. Statistics (means and s.e.m.) are based on all data points for adults (left), twelve-month-old infants (middle) and six-month-old infants (right), respectively. (a) Timing (ms) of gaze arrival at the goal relative to the arrival of the moving target; the horizontal line at 0 ms corresponds to the arrival of the target (in the Human Agent condition, the model's hand) at the goal site. (b) Ratios of looking time at the goal area to the total looking time in both goal and trajectory areas during target movement. The horizontal line at 0.2 represents the ratio expected if subjects tracked the moving target. Reproduced with permission from Falck-Ytter, Gredebäck, and von Hofsten (2006).

In both adults and twelve-month-olds, predictive gaze shifts were observed when a human hand moved the objects (F(2,30) = 7.637, p = .002 and F(2,30) = 7.180, p = .003, respectively; Figure 4a), but not in the two control conditions. Post hoc testing (Bonferroni) demonstrated that for adults, the Human Agent condition was significantly different from the Self-propelled and Mechanical Motion conditions (p < .01). Twelve-month-olds displayed the same pattern (p < .05). Gaze did not arrive significantly ahead of the moving object(s) in the two control conditions in either adults or twelve-month-olds. Spatial distribution of gaze differed between the conditions (Figure 4b) in both adults (F(2,30) = 17.782, p < .001) and twelve-month-olds (F(2,30) = 36.055, p < .001). Both adults and twelve-month-olds spent more time looking at the goal in the Human Agent condition than in the Self-propelled and Mechanical Motion conditions (p < .001 in all comparisons). In the two control conditions, the spatial distribution of gaze did not differ significantly from what would be expected if participants tracked the objects (in both groups).

Discussion Study I

Study I indicated that when watching another person performing actions on a computer screen, twelve-month-old infants looked at the goal of the actions ahead of the arrival of the moving hand. Such gaze behaviour is advantageous in many real-life situations. For example, looking at an object before another person has grasped it makes it possible to grasp that object faster than would be possible if the action were tracked reactively. Thus, in competitive (and time-critical) situations, predictive gaze performance is highly adaptive. Because predictive gaze in action observation plays an important 'attention-synchronizing' role during social interactions, it probably also facilitates play with peers in young children. A failure to predict other people's action goals is likely to have severe consequences for the individual.

Previous studies have found that adults use predictive eye movements in action observation. This finding was replicated in Study I. More importantly, we showed that when observing actions, twelve-month-old infants focused on goals in the same way as adults do. Neither twelve-month-old infants nor adults used predictive eye movements when observing self-propelled objects moving to the same goal. This indicates that an action-specific mechanism underlies goal prediction in twelve-month-olds and adults. Later research using the stimuli from Study I has shown that sequential finger tapping, but not backward counting, inhibits predictive eye movements in the Human Agent condition (Cannon and Woodward, 2008; see also Study II and the General Discussion). This provides strong support for the view that the motor system is involved in predicting other people's action goals.

Six-month-olds focused on the movements, not the goals, when observing other people's placement actions. This tendency cannot originate from a general inability to predict future events, as six-month-olds predict the reappearance of temporarily occluded objects (Rosander and von Hofsten, 2004). Notably, manual actions of this type are typically mastered in the second half-year of life (Bruner, 1970). Thus, the development from six to twelve months is in line with the view that infants come to predict others' actions by matching observed actions onto motor representations of those actions.

Did the self-propelled conditions in Study I meet the requirements for goal attribution according to Teleological Stance theory? The self-propelled objects caused salient end effects (auditory and visual) as they reached the bucket, which is the first requirement for teleological processes (Gergely and Csibra, 2003). Furthermore, they moved in a rational way given the constraints of the situation (the reader may benefit from comparing Figures 1 and 3 of this thesis). This is the second (and last) requirement of Teleological Stance theory. Thus, it seems that although twelve-month-old infants are highly likely to have encoded the goal in all conditions in Study I, they nevertheless only used predictive eye movements when observing a human hand interact with the objects. Against this background, I find it reasonable to conclude that teleological reasoning is not the most likely explanation of predictive eye movements in action observation.7

According to Teleological Stance theory, one important principle inherent in all actions is equifinal variation (Kiraly et al., 2003). This implies that the trajectory of the moving object should change as a function of contextual change (obstacles) while the goal state remains the same. Our self-propelled objects did not show evidence of equifinal variation, because the context always remained the same. However, this potential criticism can easily be countered with the fact that in real life, one has to be able to understand and predict actions even without having seen a functional response to a contextual change. In other words, even though equifinal variation probably facilitates perception of goal-directedness, it cannot be a necessary condition for all types of action understanding or action prediction. More importantly, equifinal variation was not manipulated between the conditions. Thus, it cannot be the source of the difference between them.

Lastly, it is important to discuss the fact that self-propelled objects like the toys moving to the bucket in Study I seem to violate basic laws of physics. Could it be that the "strangeness" of these self-propelled objects prevented prediction? It is clear that, on the basis of Study I alone, one cannot know why participants did not use predictive eye movements when observing self-propelled objects. The central point is, however, that according to Teleological Stance theory (which is the main alternative to simulation theory; de Lange et al., 2008; Csibra, 2007), observers should be taking a teleological stance when observing self-propelled objects, not a physical stance (Gergely and Csibra, 2003). That is, when you observe a geometrical shape move in a "physical context" such as that presented in Figure 1, what comes to your mind is typically not "this is a meaningless and implausible physical event". Rather, you perceive the shape as animate, and interpret its movement with respect to the context and the movement of other objects (Heider and Simmel, 1944). From the perspective of the Teleological Stance theory, the fact that self-propulsion is a violation of basic physics is expected to facilitate rather than hinder perception of goal-directedness (see Introduction).

7 The proponents of the Teleological Stance theory may be expected to counter this argument by arguing that there are more contextual cues in the Human Agent condition than in the Self-propelled condition, and that the increase in cues constrains the interpretation of the action, facilitating predictive eye movements to the goal on the basis of visual analysis alone (no motor simulation). First, such a position would imply that biological constraints, too, are taken into account by the teleological reasoning system. The importance of biological constraints (or of the number of constraints) has not been studied systematically and is typically not discussed in the teleological reasoning literature. The primary empirical studies supporting teleological reasoning in infants used self-propelled geometrical shapes that changed their trajectory depending on only one (non-biological) contextual manipulation. Thus, the view that teleological reasoning in infants can take biological constraints into account without the help of motor simulations remains speculative. Second, even if it eventually turns out that teleological judgments (e.g., "the goal is to put the object into the container") are facilitated by biological constraints identified via purely visual analysis, I would argue that this knowledge is semantic rather than episodic (it is not accurately defined in space and time). Available evidence suggests that motor simulations during observation of others' actions are used for predictive purposes when precise estimation of when and where something is going to happen is required.

References
