
PERILUS VIII: December 1988


UNIVERSITY OF STOCKHOLM

INSTITUTE OF LINGUISTICS

PERILUS VIII

PERILUS mainly contains reports on current experimental work carried out in the Phonetics Laboratory at the University of Stockholm. Copies are available from the Institute of Linguistics, University of Stockholm, S-106 91 Stockholm, Sweden.

This issue of PERILUS was edited by Olle Engstrand, Mats Dufberg and Johan Stark.


Institute of Linguistics University of Stockholm S-106 91 Stockholm

© 1988 The authors

ISSN 0282-6690


THE PHONETICS LABORATORY GROUP

Ann-Marie Almé
Ulf Andersson
Leslie Bailey 1
Robert Bannert
Aina Bigestans
Peter Branderud
Una Cunningham-Andersson
Hassan Djamshidpey
Mats Dufberg
Olle Engstrand
Garda Ericsson 2
Anders Eriksson
Ake Floren
Eva Holmberg 3
Diana Krull
Francisco Lacerda
Ingrid Landberg
Bjorn Lindblom 4
Rolf Lindgren
James Lubker 5
Bertil Lyberg 6
Robert McAllister
Lennart Nord 7
Lennart Nordstrand 8
Liselotte Roug
Richard Schulman
Johan Stark
Hartmut Traunmüller
Eva Oberg

1 Visiting from Department of Linguistics, University of Delaware, Newark, Delaware, USA

2 Also Department of Phoniatrics, University Hospital, Linköping

3 Also Research Laboratory of Electronics, MIT, Cambridge, MA, USA

4 Also Department of Linguistics, University of Texas at Austin, Austin, Texas, USA

5 Also Department of Communication Science and Disorders, University of Vermont, Burlington, Vermont, USA

6 Also Technology Department, Swedish Telecom, Stockholm

7 Also Department of Speech Communication and Music Acoustics, Royal Institute of Technology (KTH), Stockholm

8 Also AB Consonant, Uppsala


ACKNOWLEDGMENTS

The current phonetic experimental research at the Institute of Linguistics is sponsored in part by the following sources:

The Swedish Board for Planning and Coordination of Research (FRN)

The Swedish Council for Research in the Humanities and Social Sciences (HSFR)

The Swedish Natural Sciences Research Council (NFR)

The Bank of Sweden Tercentenary Foundation (RJ)

The Swedish National Board for Technical Development (STU)

The Swedish Cancer Society, The Cancer Foundation

The Swedish Institute

Allmänna arvsfonden


PREVIOUS ISSUES OF PERILUS

PERILUS I 1978-1979

1. INTRODUCTION
Bjorn Lindblom and James Lubker

2. SOME ISSUES IN RESEARCH ON THE PERCEPTION OF STEADY-STATE VOWELS

Vowel identification and spectral slope
Eva Agelfors and Mary Graslund

Why does [Q] change to [0] when Fo is increased? Interplay between harmonic structure and formant frequency in the perception of vowel quality
Ake Floren

Analysis and prediction of difference limen data for formant frequencies
Lennart Nord and Eva Sventelius

Vowel identification as a function of increasing fundamental frequency
Elisabeth Tenenholtz

Essentials of a psychoacoustic model of spectral matching
Hartmut Traunmüller

3. ON THE PERCEPTUAL ROLE OF DYNAMIC FEATURES IN THE SPEECH SIGNAL

Interaction between spectral and durational cues in Swedish vowel contrasts
Anette Bishop and Gunilla Edlund

On the distribution of [h] in the languages of the world: is the rarity of syllable final [h] due to an asymmetry of backward and forward masking?
Eva Holmberg and Alan Gibson

On the function of formant transitions
I. Formant frequency target vs. rate of change in vowel identification
II. Perception of steady vs. dynamic vowel sounds in noise
Karin Holmgren

Artificially clipped syllables and the role of formant transitions in consonant perception
Hartmut Traunmüller

4. PROSODY AND TOP DOWN PROCESSING

The importance of timing and fundamental frequency contour information in the perception of prosodic categories
Bertil Lyberg

Speech perception in noise and the evaluation of language proficiency
Alan C. Sheats

5. BLOD - A BLOCK DIAGRAM SIMULATOR
Peter Branderud

PERILUS II 1979-1980

Introduction
James Lubker

A study of anticipatory labial coarticulation in the speech of children
Asa Berlin, Ingrid Landberg and Lilian Persson

Rapid reproduction of vowel-vowel sequences by children
Ake Floren

Production of bite-block vowels by children
Alan Gibson and Lorrane McPhearson

Laryngeal airway resistance as a function of phonation type
Eva Holmberg

The declination effect in Swedish
Diana Krull and Siv Wandeback

Compensatory articulation by deaf speakers
Richard Schulman

Neural and mechanical response time in the speech of cerebral palsied subjects
Elisabeth Tenenholtz

An acoustic investigation of production of plosives by cleft palate speakers
Garda Ericsson

PERILUS III 1982-1983

Introduction
Bjorn Lindblom

Elicitation and perceptual judgement of disfluency and stuttering
Ann-Marie Almé

Intelligibility vs. redundancy - conditions of dependency
Sheri Hunnicut

The role of vowel context on the perception of place of articulation for stops
Diana Krull

Vowel categorization by the bilingual listener
Richard Schulman

Comprehension of foreign accents. (A Cryptic investigation.)
Richard Schulman and Maria Wingstedt

Syntetiskt tal som hjälpmedel vid korrektion av dövas tal
Anne-Marie Öster

PERILUS IV 1984-1985

Introduction
Bjorn Lindblom

Labial coarticulation in stutterers and normal speakers
Ann-Marie Almé

Movetrack
Peter Branderud

Some evidence on rhythmic patterns of spoken French
Danielle Duez and Yukihoro Nishinuma

On the relation between the acoustic properties of Swedish voiced stops and their perceptual processing
Diana Krull

Descriptive acoustic studies for the synthesis of spoken Swedish
Francisco Lacerda

Frequency discrimination as a function of stimulus onset characteristics
Francisco Lacerda

Speaker-listener interaction and phonetic variation
Bjorn Lindblom and Rolf Lindgren

Articulatory targeting and perceptual consistency of loud speech
Richard Schulman

The role of the fundamental and the higher formants in the perception of speaker size, vocal effort, and vowel openness
Hartmut Traunmüller

About the computer-lab
Peter Branderud

PERILUS V 1986-1987

Adaptive variability and absolute constancy in speech signals: two themes in the quest for phonetic invariance
Bjorn Lindblom

Articulatory dynamics of loud and normal speech
Richard Schulman

An experiment on the cues to the identification of fricatives
Hartmut Traunmüller and Diana Krull

Second formant locus patterns as a measure of consonant-vowel coarticulation
Diana Krull

Exploring discourse intonation in Swedish
Madeleine Wulffson

Why two labialization strategies in Setswana?
Mats Dufberg

Phonetic development in early infancy - a study of four Swedish children during the first 18 months of life
Liselotte Roug, Ingrid Landberg and Lars Johan Lundberg

A simple computerized response collection system
Johan Stark and Mats Dufberg

Experiments with technical aids in pronunciation teaching
Robert McAllister, Mats Dufberg and Maria Wallius

PERILUS VI FALL 1987

Effects of peripheral auditory adaptation on the discrimination of speech sounds (Ph.D. thesis)

Francisco Lacerda

PERILUS VII MAY 1988

Acoustic properties as predictors of perceptual responses: a study of Swedish voiced stops (Ph.D. thesis)

Diana Krull


CONTENTS OF PERILUS VIII

Some remarks on the origin of the "phonetic code"
Bjorn Lindblom

Formant undershoot in clear and citation form speech
Bjorn Lindblom and Seung-Jae Moon

On the systematicity of phonetic variation in spontaneous speech
Olle Engstrand and Diana Krull

Discontinuous variation in spontaneous speech
Olle Engstrand and Diana Krull

Paralinguistic variation and invariance in the characteristic frequencies of vowels
Hartmut Traunmüller

Analytical expressions for the tonotopic sensory scale
Hartmut Traunmüller

Attitudes to immigrant Swedish - A literature review and preparatory experiments
Una Cunningham-Andersson and Olle Engstrand

Representing pitch accent in Swedish
Leslie M. Bailey

Some remarks on the origin of the "phonetic code"*

Bjorn Lindblom

Departments of Linguistics University of Texas at Austin

and

Stockholm University

*Paper presented at the Symposium on Developmental Dyslexia: Aspects on Memory Functions, Sequencing and Hemispheric Interactions, sponsored by the Academia Rodinensis Pro Remediatione, in June 1988 at the Wenner-Gren Center, Stockholm, Sweden.

INTRODUCTION: THE ELUSIVE PHONEME

Human languages exhibit duality (Hockett 1958), which means that they make combinatorial use of discrete units at two levels of structure: Elements carrying meaning (words, morphemes) are combined to form phrases and sentences according to syntactical rules. Phonemes are combined to form words and morphemes according to phonological rules.

Apparently no other species codes its communicative signals in this combinatorial way. In all languages the building blocks of spoken words are vowel and consonant phonemes. In animal communication systems, on the other hand, meaningful elements cannot be formed by systematic use of discrete units since they lack such units. The signals are Gestalts.

Inventories are limited and typically consist of no more than 10-40 holistic patterns (Wilson 1975). By comparison, human vocabularies can become extremely large owing to the combinatorial power of the phonemic principle. Once acquired, this "phonetic code" enables the normal child to begin, by its eighteenth month or so, to learn half a dozen new words a day, so that by six years of age it understands seven to eleven thousand phonetic forms, or about ten percent of an adult's vocabulary (Studdert-Kennedy 1983, 1987).
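The vocabulary figures quoted above can be checked with simple arithmetic. The sketch below is only an illustration: it assumes a constant rate of six words a day from age 1.5 to age 6 and a 365-day year, and the function name `words_learned` is hypothetical.

```python
def words_learned(words_per_day, start_age_years, end_age_years, days_per_year=365):
    """Total words acquired at a constant daily rate between two ages (an idealization)."""
    return words_per_day * (end_age_years - start_age_years) * days_per_year


# Half a dozen words a day from the eighteenth month to six years of age.
total = words_learned(6, 1.5, 6.0)
print(total)  # 9855.0
```

Under these assumptions the total falls inside the range of seven to eleven thousand forms cited in the text.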

We are thus led to conclude that the property of dual structure is unique to human languages. It is linguistically universal. And it is the key to their unique expressiveness.

'Slips of the tongue' (spoonerisms and other speech errors), e.g. "our queer old dean" for "our dear old queen", provide strong evidence that the adult language user's production of speech is organized in terms of phonemic segments (Fromkin 1980, MacNeilage, Studdert-Kennedy and Lindblom 1985). Interestingly, 'slips of the hand' are reported to occur in sign language (Klima and Bellugi 1979: 126-146). Such facts, along with other independent observations, confirm the assumption that sign language also uses the method of combining abstract building blocks, i.e. hand shapes that, like phonemes, are by themselves totally devoid of meaning, to form the complex signs of its vocabulary.

Accordingly, duality and the principle of phonemic coding cannot be said to be unique to the vocal-auditory medium (Bellugi and Studdert-Kennedy 1980). Demonstrably the phonetic code offers an extremely powerful method of coding semantic information. Clearly it would not be possible for a linguist to describe language structure, whether spoken or signed, without the recognition of the phonemic organization of the lexicon. The phoneme represents a major discovery of twentieth-century linguistics (Fischer-Jørgensen 1975).

Yet the phoneme remains elusive to those who study the physical and behavioral aspects of language use.

Neither in articulatory movements nor in the speech signal do phonemes appear as beads on a necklace. Their correlates in the signal do not form segments that are sharply delimited along the time axis. And those correlates cannot be unambiguously identified irrespective of context. These difficulties are known as the segmentation and invariance problems (Perkell and Klatt 1986, Fant 1988, Liberman 1988). Large-scale recognition of human speech by computer still awaits the successful resolution of these classical theoretical issues.

At this point the student of reading would want to interject that the alphabet, which reflects the phonemic segmentation of speech, was developed late in the evolutionary history of language, perhaps no earlier than 3500-4000 years ago. He would also point out that readers tend to vary with respect to their ability to analyze written materials phonologically.

Poor readers lack phonemic awareness (Liberman 1987, Lundberg, Olofsson and Wall 1980, Lundberg 1987). They have greater difficulties segmenting words into phonemes than good readers.

Accordingly we find that the research interests of those studying speech and those studying reading converge on the baffling but admittedly powerful notion of the phoneme. How could such a complex structure have evolved? The goal of the present paper is to shed some light on that question.

I shall make my presentation in two steps. We begin by first considering how the phonetic values of phonemes might have developed. In other words, how do phonetic systems evolve? We then use our tentative answer to that question to elucidate the origin of the units themselves. Where did the phonemic principle come from?

ARTICULATORY AND PERCEPTUAL CONSTRAINTS I: HOW DO PHONETIC SYSTEMS EVOLVE?

The major dimensions that linguists traditionally use to describe vowels are: (i) the degree of rounding of the lips and the position of the tongue along (ii) a front-back and (iii) a high-low dimension (Ladefoged 1982). The typological data used in the present paper include vowel systems from over 200 languages (Crothers 1978) whose vowel qualities were specified in relation to a maximal universal set with front, central or back, rounded or unrounded and seven positions on the high-low continuum. The most favored inventories are listed in Table 1.

FIGURE 1. [Percent occurrence (0-100 %) of the most frequent vowel symbols, plotted on a two-dimensional projection of the universal vowel set.]

TABLE 1. Most favored vowel systems observed in a corpus of over 200 languages (Crothers 1978)

INVENTORY SIZE    VOWEL QUALITIES        NO OF LGS
3                 i a u                  23
4                 i a u ɛ                13
4                 i a u ɨ                9
5                 i a u ɛ ɔ              55
5                 i a u ɛ ɨ              5
6                 i a u ɛ ɔ ɨ            29
6                 i a u ɛ ɔ e            7
7                 i a u ɛ ɔ ɨ ə          14
7                 i a u ɛ ɔ e o          11
9                 i a u ɛ ɔ e o ɨ ə      7

Figure 1 shows supplementary data from an independent investigation of 317 languages (Maddieson 1984). The occurrences of the most frequent symbols have been plotted on a two-dimensional projection of the universal set. Both sources of data converge in demonstrating that only a small subset of the available qualities are brought into play. We further note that there is a clear preference for 'peripheral vowels' such as /i e ɛ a o u/ and a relative disfavoring of /y ø œ ɤ ɯ/. Also high-low contrasts are more common than front-back and rounded-unrounded oppositions.

These systematic trends represent rather drastic departures from the systems we would generate simply by drawing inventories at random from the maximal set of universal vowel types. How do we explain such patterns?

Here is a brief summary of a theory developed to account for the observed regularities but whose components originally come from several independently motivated research themes. (For an exhaustive description of the research reported in the present and the following sections see Lindblom, MacNeilage and Studdert-Kennedy forthcoming.) The theory can be presented in three parts. It provides quantitative definitions of the space of "possible vowels", a constraint on "phonetic discriminability" and a criterion for selecting the "optimal system".

The point of departure is a physiologically motivated, numerical model (Lindblom and Sundberg 1971) which takes specifications of the position of the jaw, tongue, larynx and the lips as its input and whose output is the shape (area function) of the vocal tract for an arbitrary, but physiologically possible vowel articulation. The acoustic properties of such vocal tract shapes can be ascertained by means of established methods of acoustic theory (Fant 1960). The auditory properties are derived by transforming the acoustic description of a vowel, which is given in terms of its harmonic spectrum, into an auditory representation. This last step employs computational models that capture essential characteristics of the auditory periphery as revealed by psychoacoustic research (Schroeder, Atal and Hall 1979). Accordingly, the class of vowels, or the vowel space, generated by this model can be described in articulatory, acoustic or auditory dimensions. Since the above-mentioned definitions quantify general aspects of oral physiology, acoustics and hearing that are in no way special to speech we can view the vowel space as a tentative hypothesis about the a priori range of physical sounds universally available for the linguistic selection of vowel contrasts.

The theory analyzes phonetic discriminability into an auditory and a sensori-motor aspect. It can be shown that it is possible to predict the auditory difference or distance that a listener assigns to an arbitrary pair of vowels (Bladon and Lindblom 1981) from

\[ \mathrm{AUD}_{ij} = c \left( \int_{0}^{24.5} \left| E_i(z) - E_j(z) \right|^{2} \, dz \right)^{1/2} \tag{1} \]

where c is a constant and $E_i(z)$ and $E_j(z)$ represent "excitation patterns" calibrated in psychoacoustically motivated dimensions. The interval z = 0-24.5, in Bark units, corresponds to the frequency range of human hearing (Schroeder, Atal and Hall 1979). There is also data from experiments using the technique of Direct Magnitude Estimation (Stevens 1975). These experiments compared subjects' judgements of movement along the dimensions of jaw opening and front-back positioning of the tongue. The DME results indicated that subjectively jaw movements appeared more extensive than tongue movements although displacements were equal in terms of physical measures (Lindblom and Lubker 1985). On the basis of these findings an articulatory distance metric, $\mathrm{ART}_{ij}$, was derived for the vowel space (Lindblom 198). Taking the product of the articulatory and the auditory matrices we express phonetic discriminability as

\[ D_{ij} = \mathrm{ART}_{ij} \cdot \mathrm{AUD}_{ij} \tag{2} \]
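As a rough numerical illustration of Eqs 1 and 2, the sketch below integrates the squared difference of two excitation patterns over the Bark axis and multiplies articulatory and auditory distance matrices element by element. Everything concrete in it is an assumption made for the example: the Gaussian "excitation patterns", the value c = 1, the sampling step and the function names are hypothetical and are not taken from the models cited in the text.

```python
import numpy as np


def auditory_distance(E_i, E_j, z, c=1.0):
    """Eq. 1: c * sqrt( integral over z of |E_i(z) - E_j(z)|^2 dz )."""
    diff_sq = np.abs(np.asarray(E_i, dtype=float) - np.asarray(E_j, dtype=float)) ** 2
    # Trapezoidal rule written out explicitly over the sampled Bark axis.
    integral = np.sum(0.5 * (diff_sq[1:] + diff_sq[:-1]) * np.diff(z))
    return c * np.sqrt(integral)


def discriminability(ART, AUD):
    """Eq. 2: D_ij = ART_ij * AUD_ij, taken entry by entry."""
    return np.asarray(ART, dtype=float) * np.asarray(AUD, dtype=float)


# Bark axis from 0 to 24.5 in steps of 0.1 (the step size is an assumption).
z = np.linspace(0.0, 24.5, 246)

# Hypothetical excitation patterns: single Gaussian bumps at different Bark values.
E_a = np.exp(-0.5 * ((z - 6.0) / 1.5) ** 2)
E_b = np.exp(-0.5 * ((z - 12.0) / 1.5) ** 2)

aud_ab = auditory_distance(E_a, E_b, z)
print(aud_ab)
```

With a full set of patterns one would fill a symmetric matrix of such distances and pass it, together with an articulatory distance matrix of the same shape, to `discriminability`.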

Given the definitions of the space and the discriminability measure we are in a position to ask: If vowel systems were seen as evolutionary adaptations to the idiosyncratic shape of the vowel space and to selection pressures favoring maximally discriminable vowel contrasts, what would they be like? This question was addressed in a series of computational experiments in which the optimal system was derived by computing

\[ \sum_{i=2}^{k} \sum_{j=1}^{i-1} \left( \frac{1}{D_{ij}} \right)^{2} \rightarrow \text{minimized} \tag{3} \]

for all possible combinations generated by k=3 through 9 (inventory size) and n=19 (size of universal set).

TABLE 2. [Observed (left column) and computed (right column) vowel systems for inventory sizes 3, 4, 5, 6, 7 and 9, each drawn as a vowel chart. The numbers of languages exhibiting the observed systems are 23, 13, 55, 29, 14 and 7, respectively.]
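The search described by Eq 3 can be sketched as an exhaustive enumeration of k-member subsets. This is a minimal illustration, not the original program: the toy discriminability matrix below (six points on a line, with D taken as their absolute separation) merely stands in for the D of Eq 2, and the function names are hypothetical.

```python
from itertools import combinations

import numpy as np


def system_cost(D, members):
    """Eq. 3 for one candidate system: sum of (1 / D_ij)^2 over all pairs."""
    return sum((1.0 / D[i, j]) ** 2 for i, j in combinations(members, 2))


def optimal_system(D, k):
    """Enumerate every k-member subset of the universal set and keep the cheapest."""
    n = D.shape[0]
    return min(combinations(range(n), k), key=lambda members: system_cost(D, members))


# Toy universal set: six points on a line, D = absolute separation (an assumption).
positions = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
D = np.abs(positions[:, None] - positions[None, :])

best = optimal_system(D, 3)
print(best, system_cost(D, best))
```

In this toy case the selected system contains both end points of the line, a one-dimensional analogue of the dispersion towards the periphery. For n=19 and k=9 the enumeration covers 92378 subsets, which is small enough to search exhaustively.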

The results are presented in Table 2. The left column restates the information of Table 1 and is compared with the results of the simulations (right column). In no case does the probability of selecting a correct system by pure chance exceed $10^{-3}$. If we make a gross comparison in terms of the number of high-low (vertical) and front-back (horizontal) contrasts there is perfect agreement between the predictions and the data. Looking at the individual qualities we find that certain discrepancies occur in systems with more than six vowels. However, in most cases they are off by no more than a single step on the nineteen-point grid of the universal set.

It appears justified to conclude that the simulations achieve rather close agreement with the typological data. Such a result supports the idea that vowel systems can be understood as functional adaptations to articulatory and perceptual constraints.

FIGURE 2. [Percent occurrence of vowel systems containing Basic segments; Basic and Elaborated segments; and Basic, Elaborated and Complex segments, as a function of inventory size.]

According to the theory presented here, the preference for 'peripheral' vowels and the disfavoring of 'interior' vowels originates in an interaction between a "demand" for discriminability on the one hand, which produces a dispersion effect displacing vowels towards the periphery, and the idiosyncratic properties of the vowel space on the other, which leaves more room for high-low contrasts than for front-back and rounding gestures.

As we broaden our field of observation the above conclusions tend to be reinforced. The vowel data just examined are limited to systems of so-called 'plain' vowels. Many languages use vowel series that have additional attributes such as apicalization, nasalization, breathy or creaky voice, etc: /i %., 'e, a, 0/ as well as combinations of such additional features: /air �:/. We classified the vowels of the UPSID database (Maddieson 1984) into three groups: Basic or plain, Elaborated (with additional features) and Complex (with combinations of elaborated mechanisms). Plotting the distribution of these types as a function of inventory size we obtained the diagram shown in Figure 2. It shows that small systems use Basic segments, whereas medium-sized systems invoke Basic and Elaborated articulations. Large systems bring all three types into play.

FIGURE 3. [Top panel: number of obstruents with Basic articulations as a function of total inventory size. Bottom panel: number of obstruents with Elaborated and Complex articulations as a function of total inventory size.]

We classified the consonants of UPSID in a similar manner. Segments such as /p t b d m .../ are treated as Basic; /pJ t' nd m .../ represent Elaborated gestures. Articulations that combine Elaborated mechanisms are Complex: /tl:h qW �C cl .../. Examining the occurrence of these consonant categories as a function of inventory size we find patterns closely paralleling those for vowels. Figure 3 presents a representative subset of the UPSID corpus showing data from 47 languages of the Afro-Asiatic and the Indo-Pacific language groups. Each data point refers to a given language. The top diagram shows the number of Basic obstruents as a function of total system size. The bottom panel indicates the number of Elaborated and Complex obstruents as a function of total system size.

Once more we see that small systems have Basic articulations, medium systems have Basic and Elaborated gestures and large systems have all three types. Also note the saturation of Basic elements beyond a certain system size and the lawful linear growth of the Elaborated and Complex data.

Our analysis indicates that knowing the size of a vowel or consonant inventory we can make some fairly good predictions about its phonetic contents. The motorically most elaborated and complex phonetic gestures (e.g. clicks, vowels simultaneously diphthongized and pharyngealized etc) are likely to occur in the largest inventories (as in !Xu of the Kalahari desert with its 148 segments) whereas small systems (n<10, e.g. Maori and Hawaiian) would not be expected to contain such sounds but to favor elementary articulations (p, t, m, e, a etc). This Size Principle makes sense if we assume that in small inventories Basic articulations achieve sufficient contrast whereas larger systems place greater demands for intrasystemic distinctiveness and therefore cause additional dimensions to be recruited and to be combined to form more complex segments.

ARTICULATORY AND PERCEPTUAL CONSTRAINTS II: WHERE DOES THE PHONEMIC PRINCIPLE COME FROM?

In the preceding sections we have argued that the phonetic values that vowels and consonants exhibit have evolved in response to universal, non-linguistic articulatory and perceptual constraints. Let us now see whether these constraints could have played a role also in the emergence of the discrete units themselves.

Our discussion will be based on a computational experiment in which we simulate the phonetic growth of a small vocabulary, a mini lexicon. The design of the experiment is closely analogous to the vowel system simulations. The point of departure is again the articulatory model (Lindblom and Sundberg 1971). We use it to generate a phonetic space consisting in this case of a set of "possible syllables". A possible syllable is of fixed duration and is represented as a continuous trajectory in phonetic space moving from a complete closure of the vocal tract (whose location ranges from labial through dental, alveolar, retroflex to palatal, velar and uvular points of articulation) to an open configuration. The open configurations are those of the previously described cardinal vowel set. Figure 4 gives an example of such a transition with a stylized frequency-time formant pattern at the top and its representation in a three-dimensional formant space below.

FIGURE 4. [Top: stylized frequency-time formant pattern of a transition (frequency in kHz against time). Bottom: the same transition drawn as a trajectory in a three-dimensional space of first, second and third formant frequency (kHz).]

The phonetic discriminability of an arbitrary pair of trajectories was obtained by generalizing the procedures applied to vowels to the time domain, that is by representing them as a series of discrete spectra in time, calculating Eqs 1-3 for each time sample and then deriving the discriminability measure as the square root of the sum of the individual samples squared (cf Eq 3).
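The last step of this generalization, combining per-sample values into a single measure for a pair of trajectories, can be sketched as a root-sum-square. The per-sample values below are invented for illustration, and the function name is hypothetical; how each sample value is obtained from the spectra is left to the procedures cited above.

```python
import numpy as np


def trajectory_discriminability(per_sample_values):
    """Square root of the sum of the squared per-sample discriminability values."""
    d = np.asarray(per_sample_values, dtype=float)
    return float(np.sqrt(np.sum(d ** 2)))


# Hypothetical discriminability values for one trajectory pair at five time samples.
samples = [0.2, 0.5, 0.9, 1.1, 1.2]
print(trajectory_discriminability(samples))
```

Because the samples are squared before summing, time samples at which the two trajectories differ most dominate the overall measure.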

Since the reduction phenomena and articulatory simplifications of on-line speech can in most cases be explained satisfactorily in elementary biomechanical terms by representing articulators by damped spring-mass systems (Lindblom 1983), a rank ordering of every possible syllable based on articulatory criteria was also attempted. Such a biomechanical analysis makes us expect that extreme positions (extreme displacements from habitual rest) and extreme movement rates tend, if possible, to be avoided. That is a fact richly supported by phonetic observations (Lindblom 1983). There is room for only a few examples: Syllables with labial and dental occlusions receive high ranks. They have near-neutral points of closure (cf their high frequency in babbling) whereas a transition with a retracted tongue tip, a retroflex closure, represents a more extreme departure from neutral. The penalty on extreme movement rates leads to a favoring of homorganic, assimilated sequences. Thus a trajectory consisting of a uvular closure followed by a palatal (high-front) open configuration gets a lower score than, say, a palatal (velar) closure followed by a palatal (velar) open configuration.

Pursuing the analogy with the vowel system simulations further we investigated "optimal systems" of syllables by computing

\[ \sum_{i=2}^{k} \sum_{j=1}^{i-1} \left( \frac{a_{ij}}{d_{ij}} \right)^{2} \rightarrow \text{minimized} \tag{4} \]

where $d_{ij}$ is the discriminability of an arbitrary transition pair and $a_{ij}$ the articulatory cost of that pair. In words: Find that set of k syllables that simultaneously satisfy the goal of being as easy as possible to say (minimal articulatory cost) and as easy as possible to hear (maximal discriminability). In the present case k=15 and the total inventory was 133. A procedure of cumulative selection was adopted.

TABLE 3

bi  bɛ  ba  bɔ  bu
di  dɛ  da  dɔ  du
gi  gɛ  ga  gɔ  gu

Once an initial syllable had been selected Eq 4 was applied repeatedly until a mini lexicon of 15 elements had been obtained. In all there were 133 runs (= initial syllables). The results were pooled, which yielded a total of 1995 syllables. The "optimal system" was defined as the 15 forms with the highest frequency in this pooled set. The results are presented in Table 3.
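One possible reading of the cumulative selection procedure is a greedy search: starting from a given syllable, add at each step the candidate that keeps the Eq 4 cost of the growing set lowest, repeat for every possible starting syllable, and pool the outcomes. The sketch below follows that reading, which is an assumption about the procedure rather than a description of the original program. The toy data (eight random points with Euclidean distances as d and a uniform articulatory cost a) and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

import numpy as np


def pair_cost(a, d, members):
    """Eq. 4 for one candidate set: sum of (a_ij / d_ij)^2 over all pairs."""
    return sum((a[i, j] / d[i, j]) ** 2 for i, j in combinations(members, 2))


def grow_lexicon(a, d, start, k):
    """Greedy growth: repeatedly add the syllable that keeps the total cost lowest."""
    lexicon = [start]
    while len(lexicon) < k:
        candidates = [s for s in range(d.shape[0]) if s not in lexicon]
        best = min(candidates, key=lambda s: pair_cost(a, d, lexicon + [s]))
        lexicon.append(best)
    return lexicon


def pooled_counts(a, d, k):
    """One run per initial syllable; count how often each syllable is selected."""
    counts = Counter()
    for start in range(d.shape[0]):
        counts.update(grow_lexicon(a, d, start, k))
    return counts


# Toy stand-in for the syllable space: 8 random points, d = Euclidean distance.
rng = np.random.default_rng(0)
points = rng.random((8, 2))
d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
a = np.ones_like(d)  # uniform articulatory cost, purely for illustration

counts = pooled_counts(a, d, k=4)
optimal = [s for s, _ in counts.most_common(4)]
print(optimal)
```

With 133 initial syllables and k=15 the pooled set would contain 133 x 15 = 1995 entries, the total reported in the text.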


The most significant aspect of this table emerges when, examining it row by row and column by column, we observe that trajectory onsets and end-points are shared. Rows and columns appear to contain what linguists would call "minimal pairs". Why not a more diverse set of closures and open configurations?

FIGURE 5

Perhaps the easiest way of obtaining an intuitive grasp of the causes of the combinatorial structure of the derived inventory is to invoke a simple geometrical metaphor. Suppose we consider two vertical line segments and the task of drawing k lines (trajectories) from anywhere on the left segment to anywhere on the right in such a way that the area, A, between any pair of lines will be as large as possible. Mathematically,

\[ \sum_{i=2}^{k} \sum_{j=1}^{i-1} \left( \frac{1}{A_{ij}} \right)^{2} \rightarrow \text{minimized} \tag{5} \]

Cf Eqs 3 and 4. Figure 5 shows the result for k=9. We see that trajectory onsets and end-points are shared.
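The geometrical metaphor can be tried out numerically. The sketch below assumes two vertical segments of unit length placed one unit apart and straight lines between them, and compares the Eq 5 cost of two hand-picked arrangements of nine lines: one in which three onsets combine with three end-points, and one with nine parallel lines sharing nothing. It is not an optimizer and does not reproduce Figure 5; the arrangements and function names are illustrative assumptions.

```python
from itertools import combinations, product


def area_between(line_p, line_q):
    """Area enclosed between two straight lines drawn from x=0 to x=1.

    Each line is given as (onset height, end-point height).
    """
    (a, b), (c, e) = line_p, line_q
    d0, d1 = a - c, b - e
    if d0 * d1 >= 0:
        # The lines do not cross: the enclosed region is a trapezoid (or triangle).
        return (abs(d0) + abs(d1)) / 2.0
    # The lines cross: the enclosed region consists of two triangles.
    return (d0 * d0 + d1 * d1) / (2.0 * (abs(d0) + abs(d1)))


def packing_cost(lines):
    """Eq. 5 for one arrangement: sum of (1 / A_ij)^2 over all pairs of lines."""
    return sum((1.0 / area_between(p, q)) ** 2 for p, q in combinations(lines, 2))


levels = (0.0, 0.5, 1.0)
shared = list(product(levels, levels))             # 3 onsets x 3 end-points
parallel = [(i / 8.0, i / 8.0) for i in range(9)]  # 9 lines, nothing shared

print(packing_cost(shared), packing_cost(parallel))
```

Under these assumptions the arrangement with shared onsets and end-points has the lower cost of the two, in line with the convergence of trajectories described in the text.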

The phonetic space of the simulations is clearly much more complex but our point is that the convergence of trajectories in the geometrical exercise is analogous to the convergence of the optimized phonetic transitions. The combinatorial pattern appears to be a consequence of achieving an efficient packing (read: optimal discrimination) within a bounded space. It can be shown that the $a_{ij}$ matrix does not influence the degree of combinatorial coding in any major way but it does play an important role in determining the phonetic values of the derived syllables.

Suppose we presented Table 3 to a linguist as a sample from the vocabulary of an unknown language, implying that the forms have different meanings but being very careful not to reveal that they had been produced as unanalyzed wholes. Undoubtedly he would note that the table contains numerous minimal pairs. Assuming that the lexical items are semantically distinct and following standard linguistic methodology he would hypothesize that the language in question uses three consonant phonemes and five vowel phonemes, the minimally contrastive segments being /b d g/ and /i a ɛ ɔ u/.

FIGURE 6. [Motor scores: time functions for TONGUE TIP, JAW and TONGUE BODY, each drawn between NEUTRAL (habitual rest) and TARGET positions; the shared component is labeled COMMON DENOMINATOR.]

How do we resolve this paradox? For remember: The syllables are by definition specified as continuous transitions, as phonetic Gestalts. Their production is no more segmentally organized than the early vocalizations of the babbling child. Cf top panels of Figure 6. Nevertheless, we transcribe such utterances using segments. Accordingly our use of segments in Table 3 should be seen as a mere convenience analogous to the conventional way of describing the phonetic behavior of the young child.

However, a "phonemic principle" is nevertheless implicitly present in the derived lexicon since the existence of minimal pairs implies gestural overlap among motor scores. This is the point we try to make in Figure 6, which shows the motor scores of two syllables, call them /di/ and /da/. The jaw and tongue body time functions differ whereas the tongue tip curves are identical. This overlap identifies a common denominator component. Note that its content is in a one-to-one relation with all words linguistically analyzed as beginning with the phoneme /d/. Were we to examine the rest of the motor scores of the derived syllables in the same manner we would obtain analogous common denominators for the remaining "phonemic segments". We conclude that the simulated lexicon exhibits implicit phonemic coding.

CONCLUSIONS: THE ELUSIVE PHONEME REVISITED

We began by describing the phoneme as a powerful but elusive unit of linguistic structure and by asking how such a complex structure could have evolved.

Although our simulations of lexical growth no doubt drastically underestimate the complexities of real-life vocabulary acquisition let us nevertheless briefly examine what we might have learned from these preliminary considerations.

The main finding appears to be the demonstration of the beginnings of combinatorial structure in speech-like signals. The computational experiments tell us that such combinatorial patterns can arise from phonetic constraints that favor the selection of optimally discriminable stimuli. Recall that we first inferred the presence of such constraints from our analyses of vowel and consonant systems. In the present context we have emphasized the role of perceptual aspects but other factors should also be considered, for instance learning. Conceivably, articulatory gestures that a child has already mastered might make new syllables also containing those gestures easier to acquire than totally novel materials. Work now in progress indicates that such a mechanism would reinforce the trend towards combinatorial coding even further and extend it to much larger lexica than the ones considered here.

Examining the motor scores of the derived syllables we found gestural components that can be said to be in a one-to-one relation with phonemic segments.
