
Letter: Reply to the Comment on International Collegium of Rehabilitative Audiology (ICRA) recommendations for the constructor of multilingual speech test by Akeroyd et al by Cas Smits in INTERNATIONAL JOURNAL OF AUDIOLOGY, vol 55, issue 4, pp 269-271



Birger Kollmeier, Michael A. Akeroyd, Stig Arlinger, Ruth A. Bentler, Arthur Boothroyd, Norbert Dillier, Wouter A. Dreschler, Jean-Pierre Gagné, Mark E. Lutman, Jan Wouters and Lena Wong

Linköping University Post Print

N.B.: When citing this work, cite the original article.

This is an electronic version of an article published in:

Birger Kollmeier, Michael A. Akeroyd, Stig Arlinger, Ruth A. Bentler, Arthur Boothroyd, Norbert Dillier, Wouter A. Dreschler, Jean-Pierre Gagné, Mark E. Lutman, Jan Wouters and Lena Wong, Letter: Reply to the Comment on International Collegium of Rehabilitative Audiology (ICRA) recommendations for the constructor of multilingual speech test by Akeroyd et al by Cas Smits in INTERNATIONAL JOURNAL OF AUDIOLOGY, vol 55, issue 4, pp 269-271, 2016, International Journal of Audiology, (55), 4, 269-271.

International Journal of Audiology is available online at informaworld™.

Copyright: Taylor & Francis: STM, Behavioural Science and Public Health Titles

http://www.tandf.co.uk/journals/default.asp

Postprint available at: Linköping University Electronic Press

http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-126796


Reply to the “Comment on ‘International Collegium of Rehabilitative Audiology (ICRA) recommendations for the construction of multilingual speech tests’ by Akeroyd et al.” by Cas Smits

Birger Kollmeier (Cluster of Excellence Hearing4all & Medizinische Physik, Universität Oldenburg, and HörTech gGmbH, Oldenburg, Germany), Michael A. Akeroyd (MRC Institute of Hearing Research, Nottingham, UK), Stig Arlinger (Linköping University, Department of Clinical and Experimental Medicine, Technical Audiology, Sweden), Ruth A. Bentler (Department of Communication Sciences & Disorders, The University of Iowa, Wendell Johnson Speech and Hearing Center, Iowa, USA), Arthur Boothroyd (San Diego State University, San Diego, CA, USA), Norbert Dillier (University of Zurich, ENT Department, Zürich, Switzerland), Wouter A. Dreschler (Academic Medical Centre, Amsterdam, The Netherlands), Jean-Pierre Gagné (Université de Montréal, Montréal, Québec, Canada), Mark E. Lutman (Institute of Sound and Vibration Research, University of Southampton, Highfield, Southampton, UK), Jan Wouters (KU Leuven, Dept. Neurosciences, ExpORL, Leuven, Belgium), Lena Wong (Division of Speech and Hearing Sciences, University of Hong Kong, Hong Kong)

The aim of this reply by the authors of the ICRA recommendations is to gratefully acknowledge the valid points raised by Cas Smits and to clarify the reasons why ICRA has taken the decisions as published.

The author of the Comment (Cas Smits) is among the pioneers of the digit triplet test and has contributed substantially to the field, as shown by his numerous publications on the test and its further development. As he is not a member of ICRA, he was not included in the working group that led to the paper by Akeroyd et al. (2015). His Comment is constructive and well taken, and the current authors agree that not every recommendation in Akeroyd et al. is supported by a large body of empirical evidence. In many cases there are good arguments for deciding in either direction (see the specific comments below). The Comment adds to the literature on how best to construct digit triplet tests by addressing those few points where the publication by Smits et al. (2013) deviates from the ICRA recommendations.

Specific comments:

A) Word selection: The ICRA recommendation to “avoid a certain digit being recognized purely by its unique number of syllables” contrasts with the use of all available digits as recommended by Smits et al. (2013). The Comment points out that no significant difference in recognition probability between monosyllabic and bisyllabic digits in noise was found in Dutch and English.

However, it is known from the speech perception literature that the duration of a phoneme and the number of syllables are highly robust speech cues that remain available at very low signal-to-noise ratios, well below the speech recognition threshold (e.g., Erber and Witt, 1977). It is also known from psychoacoustics that the detection threshold of a signal in noise decreases with increasing signal duration (within certain limits). Hence, the number of syllables is expected to be detectable at signal-to-noise ratios below the speech recognition threshold but above the speech detection threshold. The ICRA recommendation is meant to avoid an unwanted cue if only one or two number words are bisyllabic. The corresponding difference in recognition probability between a monosyllabic and a multisyllabic numeral may be too small to be detected with a limited measurement effort because of the underlying binomial statistics. Hence, the fact that no significant difference was found between monosyllabic and bisyllabic digits in noise in Dutch and English does not rule out that such a difference exists in other languages or would emerge if more empirical data were available.
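The binomial argument can be illustrated with a minimal sketch that estimates how many presentations per digit would be needed to detect a given difference in recognition probability. It uses the standard normal approximation for comparing two proportions. The function name, the example probabilities, the significance level, and the power are assumptions chosen for illustration and are not taken from the studies discussed here.

```python
import math
from statistics import NormalDist


def trials_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate presentations per digit needed to detect p1 != p2.

    Normal approximation to a two-sided test of two proportions
    (illustrative assumption, not the analysis used in the cited studies).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # summed binomial variances per trial
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)


# Hypothetical example: recognition rates of 50% versus 55% at a fixed SNR.
print(trials_per_group(0.50, 0.55))
```

Under these assumptions, a difference of five percentage points requires on the order of 1,500 presentations per digit, which is far more than a typical evaluation measurement provides.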


Avoiding this bias by eliminating one or two of the 10 digits must of course be balanced against the resulting reduction in response alternatives. However, the increase in guessing rate from 1/10 to 1/9 or 1/8 (evenly distributed across all digits) is much less problematic than a deviation of the digit-specific recognition probability from the average for one or two digits, especially if an adaptive tracking procedure is employed that assumes an equal recognition rate across the items presented to the subjects (see Zokoll et al., 2012 for a review). This argument supports the current ICRA recommendations. Nevertheless, it is recognized that some languages may not have sufficient monosyllabic number words, so that a different approach will be required.
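The size of the change in guessing rate can be made concrete with a short sketch. It assumes that a listener who hears nothing guesses uniformly among the available digits and independently at each of the three positions; the function name is hypothetical.

```python
from fractions import Fraction


def guessing_rates(n_digits: int, triplet_length: int = 3):
    """Chance level per digit and per whole triplet under uniform random guessing."""
    per_digit = Fraction(1, n_digits)
    per_triplet = per_digit ** triplet_length  # all positions guessed correctly
    return per_digit, per_triplet


for n in (10, 9, 8):
    digit, triplet = guessing_rates(n)
    print(f"{n} digits: per digit {float(digit):.3f}, per triplet {float(triplet):.5f}")
```

With triplet scoring, the chance level stays below 0.2% even when only eight digits are used, so the loss in response alternatives is small in absolute terms.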

B) Recording: While the ICRA recommendations advocate position-specific recording and resynthesis of the triplets, the Comment argues that this is unnecessary because of the extra effort required to record the digits for each position and because, in Dutch, the difference from a non-position-specific presentation of the digits is negligible. This holds both for intelligibility and for perceived intonation when the triplets are presented close to the SRT in noise.

If one considers intelligibility and intonation at low signal-to-noise ratios alone, not much of a difference is to be expected (as correctly stated by the Comment). However, the extra effort to produce a position-specific resynthesis and perceptual balancing using 30 instead of 10 digits is very limited, while the gain in subjectively evaluated naturalness at higher signal-to-noise ratios is considerable: typically, the test sequences are also played at high signal-to-noise ratios during the initial phase of an adaptive measurement, or are even used in quiet for applications such as speech quality assessment. Also, any listener with normal hearing (for example, the administrator of a test that is not being used for self-screening) who listens to the test being performed by a hearing-impaired listener in the free field will perceive the test materials at suprathreshold SNR values. Any unnatural cues or reduction in perceived naturalness of the speech material will be clearly audible in these situations and will decrease the acceptance of the test and its face validity as a representation of natural speech. The extra effort of position-specific recordings is further supported by findings from Wilson et al. (2005) and Bräcker (2005, unpublished Diploma thesis), who found that the word-specific SRT depends on whether the word occupies the first, second, or third position in the triplet. This is not compatible with the assumption that the position of a digit within the triplet is irrelevant. It rather supports the ICRA recommendation of position-dependent recording and testing of a digit in a triplet.

Moreover, there seems to be no clear reason why certain digit sequences such as 3-3-3 or 1-2-3 should be avoided (as suggested by the Comment). The effect of avoiding a certain sequence of digits or the repetition of a certain digit is expected to be too small to be empirically detectable within a reasonable measurement effort. Accordingly, there is no literature showing that triplets in standard numerical order are easier to understand than triplets in pseudorandom order.

C) Masking noise: The Comment argues that filtered Gaussian noise is better suited as the standard masker than the noise produced by randomly overlaying the speech material of the respective talker (speech babble noise, as recommended by ICRA). The Comment correctly points out that both maskers have the desired property of the same long-term spectrum as the speech, and that the discrimination function is somewhat steeper for the superimposed noise (a desired property whose cause is not yet clear, although it has been modeled in a recent paper by Schädler et al., 2015). Nevertheless, both noise types have specific advantages and shortcomings. The advantage of filtered Gaussian noise is its easy generation, while its disadvantage is that the spectrum statistics of the filtered noise (and consequently its autocorrelation function) may differ depending on the details of the generation process. The most important property of filtered Gaussian noise is the spectral resolution of the filter transfer function employed (or, equivalently, the duration of the impulse response of the filter, which directly relates to the width of the autocorrelation function of the filtered noise). Hence, a filtered Gaussian noise is not, as assumed by the Comment, “perfectly defined”; it does not show statistically independent samples and needs more parameters to be exactly specified.

The advantage of the random superposition of the available speech material is its straightforward generation procedure, which is independent of the definition of any frequency resolution or time window for an impulse response. The disadvantage of this procedure, however, is the need for an appropriate pseudorandom distribution of the onset delays and of the time intervals between successive repetitions of the same word, so that the resulting noise does not exhibit any audible regularity.
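Both generation procedures can be sketched in a few lines. The first function computes the expected autocorrelation of white Gaussian noise after filtering with a finite impulse response, which shows how the impulse response duration sets the width of the autocorrelation function. The second function overlays recorded tokens at pseudorandom onsets. All names and parameters are hypothetical, and the uniform onset distribution is an assumption: it does not control the interval between repetitions of the same word, which a real masker would have to specify.

```python
import random


def filtered_noise_autocorrelation(impulse_response):
    """Expected autocorrelation of unit-variance white noise after FIR filtering.

    For white input, the value at lag k is sum_n h[n] * h[n + k], so it is
    zero for all lags at or beyond the length of the impulse response.
    """
    h = list(impulse_response)
    return [sum(h[n] * h[n + k] for n in range(len(h) - k)) for k in range(len(h))]


def overlay_speech_noise(tokens, n_samples, n_overlays, seed=0):
    """Build a noise buffer by adding recorded tokens at pseudorandom onsets.

    tokens: list of sample lists (hypothetical stand-ins for recorded digits).
    Tokens wrap around the end of the buffer so that the noise can be looped.
    """
    rng = random.Random(seed)
    noise = [0.0] * n_samples
    for _ in range(n_overlays):
        token = rng.choice(tokens)
        onset = rng.randrange(n_samples)  # pseudorandom onset delay (uniform, assumed)
        for i, sample in enumerate(token):
            noise[(onset + i) % n_samples] += sample
    return noise


# A longer impulse response (finer spectral resolution) widens the autocorrelation.
print(filtered_noise_autocorrelation([0.5, 0.5]))
print(filtered_noise_autocorrelation([0.25, 0.25, 0.25, 0.25]))
```

The two filters in the usage example have the same gain at zero frequency but different durations, so the resulting noises differ in their sample-to-sample dependence even though both are "filtered Gaussian noise".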

Hence, both procedures for generating speech-simulating noise need some further specification, which has not always been provided by the papers describing the maskers. This led ICRA to recommend both noise types, with a preference for the randomly overlaid speech babble noise because of the greater achievable steepness of the discrimination function.

D) With respect to choosing the 50% or 80% point of the discrimination function for the SRT, the Comment is correct in indicating that a consistent difference across measurement conditions using either definition depends on the assumption that the shape of the discrimination function is the same across these conditions. This, however, does not invalidate the use of either the 50% or the 80% point for defining the SRT, as long as it is properly reported which kind of SRT has been used (as is also mentioned in the current ICRA recommendations).

Hence, the argument used by the Comment concerns a special case, where the slope of the discrimination function changes across conditions, and so should be regarded as an important addition to the recommendations by ICRA.
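The dependence on slope can be illustrated with a minimal sketch that assumes a logistic discrimination function with the guessing rate neglected. The logistic form, the function names, and the slope values are assumptions for illustration; the letter itself does not prescribe a model.

```python
import math


def level_at(p_target: float, srt50: float, slope: float) -> float:
    """SNR (dB) at which a logistic discrimination function reaches p_target.

    Assumed model: p(L) = 1 / (1 + exp(-4 * slope * (L - srt50))), where slope
    is the gradient at the 50% point in proportion per dB.
    """
    return srt50 + math.log(p_target / (1 - p_target)) / (4 * slope)


def srt80_minus_srt50(slope: float) -> float:
    """Offset between the 80% and 50% points; it depends only on the slope."""
    return level_at(0.8, 0.0, slope) - level_at(0.5, 0.0, slope)


# Hypothetical slopes of 20%/dB and 10%/dB.
for slope in (0.20, 0.10):
    print(f"slope {slope:.2f}/dB: SRT80 - SRT50 = {srt80_minus_srt50(slope):.2f} dB")
```

In this model the offset equals ln(4) / (4 · slope), so two conditions with the same 50% point but different slopes yield different 80% points. Comparisons across conditions are therefore only independent of the chosen SRT definition when the slope is unchanged.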

Taken together, all the points listed above and helpfully raised in the “Comment to the editor” should be considered when constructing a digit triplet test in a new language within the framework of the ICRA recommendations by Akeroyd et al. (2015).

References:

Michael A. Akeroyd, Stig Arlinger, Ruth A. Bentler, Arthur Boothroyd, Norbert Dillier, Wouter A. Dreschler, Jean-Pierre Gagné, Mark Lutman, Jan Wouters, Lena Wong, Birger Kollmeier, International Collegium of Rehabilitative Audiology (ICRA) recommendations for the construction of multilingual speech tests, International Journal of Audiology, vol. 54 Early Online, pp. 1-6, 2015.

Norman P. Erber and Linda H. Witt, Effects of stimulus intensity on speech perception by deaf children, J. Speech Hear. Disorders, vol. 42, pp. 271-278, 1977.

Melanie A. Zokoll, Kirsten C. Wagener, Thomas Brand, Michael Buschermöhle, Birger Kollmeier, Internationally comparable screening tests for listening in noise in several European languages: The German digit triplet test as an optimization prototype, International Journal of Audiology, vol. 51, pp. 697-707, 2012.

Marc Rene Schädler, Anna Warzybok, Sabine Hochmuth, Birger Kollmeier, Matrix sentence intelligibility prediction using an automatic speech recognition system, International Journal of Audiology, vol. 54, pp. 1-8, 2015 (online first).

R. H. Wilson, Ch. A. Burks, M. S. Weakley, A comparison of word-recognition abilities assessed with digit pairs and digit triplets in multitalker babble, J. Rehabil. Res. Dev., vol. 42(4), pp. 499-510, 2005.
