
EMA-based head movements, word accents, vowel length and segments: a preliminary study




http://www.diva-portal.org

This is the published version of a paper presented at FONETIK 2019, Stockholm, June 10-12, 2019.

Citation for the original published paper:

Frid, J., Svensson Lundmark, M., Ambrazaitis, G., House, D. (2019)

EMA-based head movements, word accents, vowel length and segments: a preliminary study

In: Mattias Heldner (ed.), Proceedings from FONETIK 2019 Stockholm, June 10-12, 2019 (pp. 125-126). Stockholm: Stockholm University

PERILUS

https://doi.org/10.5281/zenodo.3246023

N.B. When citing this work, cite the original published paper.

Permanent link to this version:

http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-92572


EMA-based head movements, word accents, vowel length and segments: a preliminary study

Johan Frid1, Malin Svensson Lundmark2, Gilbert Ambrazaitis3 and David House4

1 Lund University Humanities Lab, Lund University

2 Centre for Languages and Literature, Lund University

3 Department of Swedish, Linnæus University

4 Department of Speech, Music and Hearing, KTH

johan.frid@humlab.lu.se, malin.svensson_lundmark@ling.lu.se, gilbert.ambrazaitis@lnu.se, davidh@speech.kth.se

Abstract

This study describes on-going work in the field of multimodal prosody carried out by means of simultaneous recordings of speech acoustics, articulation and head movements.

Introduction

People naturally move their heads when they speak, and head movements have been found both to correlate strongly with the pitch and amplitude of the speaker's voice and to convey linguistic information. Here, we report on a study that explores how head movement patterns vary and co-occur with lexical pitch accents (and their acoustic correlates F0 and intensity), vowel length and segmental position. The study uses data from Swedish, which has two lexical pitch accents and two phonologically distinct vowel lengths.

Method

We use EMA (electromagnetic articulography), which allows for high sample rates, accurate synchronisation of kinematic and acoustic recordings, and three-dimensional movement data. Kinematic data is obtained by gluing small sensors onto the speakers' articulators (tongue, lips, jaw). Head movement data is obtained from similar sensors on the nose ridge and behind the ears, which allows us to capture the tilt angle of the head. Figure 1 shows an example of nose sensor movement.
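One way a head tilt angle could be derived from the three head sensors is sketched below. This is a minimal illustration, not the authors' procedure: the function name `sagittal_angle`, the axis convention (x pointing forward, z pointing up) and the example coordinates are all assumptions, since the paper does not describe how the angle was computed.

```python
import math

def sagittal_angle(nose, left_ear, right_ear):
    """Tilt of the ear-midpoint-to-nose vector in the sagittal (x-z) plane, in degrees.

    Assumed convention (not from the paper): each sensor is an (x, y, z) tuple,
    x points forward, y points sideways, z points up.
    """
    # Midpoint between the two sensors behind the ears
    mid = [(l + r) / 2.0 for l, r in zip(left_ear, right_ear)]
    dx = nose[0] - mid[0]  # forward component
    dz = nose[2] - mid[2]  # vertical component
    return math.degrees(math.atan2(dz, dx))

# Hypothetical sensor positions: head level, then nose raised relative to the ears
level = sagittal_angle((10.0, 0.0, 0.0), (0.0, -7.0, 0.0), (0.0, 7.0, 0.0))
raised = sagittal_angle((10.0, 0.0, 10.0), (0.0, -7.0, 0.0), (0.0, 7.0, 0.0))
```

Using the midpoint of the two ear sensors makes the angle insensitive to sideways translation of the head, which is one reason a three-sensor arrangement is convenient.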

Articulatory data was collected from 18 South Swedish speakers (12 female) using a Carstens AG501. Each speaker read leading questions and sentences containing a target word from a prompter (presented eight times in random order), an arrangement employed to put contrastive focus on the last element in the target sentence. This left the target word in a low-prominence-inducing context, hence controlling for possible effects of sentence intonation.

Material

For this study we used eight target words in which pitch accent and vowel length were crossed, so that there were two words for each combination of word accent category and vowel length category. All words shared the same word-initial consonant /m/, followed by a vowel that was either /a/ or /ɑ:/. The target words were segmented and time-normalized to the interval from 0 to 1, and the head tilt angle (sagAng) was normalized by z-transforming the angles per speaker. Spatial movements were analysed using Generalized Additive Models, which we used to test whether there were effects of segmental position (C versus V in the first syllable), word accent (1 or 2) and vowel length (short or long) on sagAng. Models were fit using the maximum likelihood (ML) estimation method.
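The two normalization steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the toy values are hypothetical, and the use of the sample standard deviation is a choice made here, not something the paper specifies.

```python
from statistics import mean, stdev

def time_normalize(times):
    """Map the time stamps of one segmented word linearly onto [0, 1]."""
    t0, t1 = times[0], times[-1]
    return [(t - t0) / (t1 - t0) for t in times]

def z_transform_by_speaker(angles_by_speaker):
    """Z-transform tilt angles separately for each speaker.

    Assumption: the sample standard deviation (n - 1) is used.
    """
    out = {}
    for speaker, angles in angles_by_speaker.items():
        m = mean(angles)
        s = stdev(angles)
        out[speaker] = [(a - m) / s for a in angles]
    return out

# Toy data (hypothetical): one word token and two speakers' raw angles
norm_time = time_normalize([2.0, 2.5, 3.0])
z_angles = z_transform_by_speaker({"s1": [1.0, 2.0, 3.0], "s2": [10.0, 20.0, 30.0]})
```

After the z-transform both toy speakers have identical values, which shows why the step makes angles comparable across speakers with different overall ranges of head movement.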

Proceedings from FONETIK 2019 Stockholm, June 10–12, 2019


Results

Figures 2-4 show the fitted models. A chi-square test on the ML scores indicates that a model with the word accent distinction is significantly better than a model without it (χ²(4.00) = 632.796, p < 2e-16 ***). Similarly, a model with the vowel length distinction is significantly better than a model without it (χ²(4.00) = 820.997, p < 2e-16 ***). Finally, a model with segmental position is significantly better than a model without it (χ²(8.00) = 173.316, p < 2e-16 ***).
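As a rough check on the size of these p-values, the sketch below evaluates the upper tail of a chi-square distribution using the closed form that holds for even degrees of freedom. It assumes that the reported values can be read directly as chi-square statistics with the stated degrees of freedom; the paper does not say which software produced the test or whether the score difference is doubled before testing, and the function name is hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    Uses exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!, valid only for even df.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total

# Statistics and degrees of freedom as reported in the text
reported = {
    "word accent": (632.796, 4),
    "vowel length": (820.997, 4),
    "segmental position": (173.316, 8),
}
p_values = {name: chi2_sf_even_df(x, df) for name, (x, df) in reported.items()}
```

Under this reading all three tail probabilities fall far below 2e-16, and doubling the statistics would only make them smaller.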

Discussion

The results indicate that head nod patterns that occur in synchronisation with the stressed syllable of spoken words differ with respect to word accent, vowel length and segmental position. This may point to an effect of F0 and intensity on the head nod movements.

Acknowledgements

This work was supported by grants from the Swedish Research Council: Swe-Clarin (VR 2013-2003) and Progest (VR 2017-02140).

Figure 1. Two examples of nose sensor movement and alignment with vowel. CVC segment between the red lines, V between the green lines.

Figure 2. Non-linear smooths (fitted values) of sagAng for the Accent 1 (blue) and Accent 2 (red) words in the GAM model. Shaded bands represent the pointwise 95%-confidence interval.

Figure 3. Non-linear smooths (fitted values) of sagAng for the V (blue) and V: (red) words in the GAM model. Shaded bands represent the pointwise 95%-confidence interval.

Figure 4. Non-linear smooths (fitted values) of sagAng for the pre-vocalic C (red), the V (green), and the post-vocalic C (blue) in the GAM model. Shaded bands represent the pointwise 95%-confidence interval.

