Speech Enhancement for Hands-Free Terminals

(1)

Nedelko Grbic, Sven Nordholm and Anders Johansson

(2)

Handsfree Telephony

• Safety problems in cars
• Inconvenience of conversation
• Prohibited by legislation in some regions

(4)

Handsfree Telephony

• Perception problems
  • Acoustic feedback
  • Wind and tire friction in cars
  • Engine and fan noise

Single Mic.

(5)

Handsfree Telephony

• Speech enhancement by means of beamforming

Beamformer, 6 Mics.

(6)

Handsfree problem

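[Figure: talker at Distance = R from the microphone; the microphone receives Speech + Noise, with the speech scaled by α]

$\alpha \propto \frac{1}{R^{2}}$

$\mathrm{SNR} = x\ [\mathrm{dB}]$

$\mathrm{SNR} = x + 10 \cdot \log_{10}(\alpha)\ [\mathrm{dB}]$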

(7)

Handsfree Improvement

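[Figure: talker at Distance = R from an array of sensors; the sensors receive Speech + Noise, with the speech scaled by α · β]

$\alpha \propto \frac{1}{R^{2}}$

$\beta \propto \#\,\mathrm{Sensors}$

$\mathrm{SNR} = x + 10 \cdot \log_{10}(\alpha \cdot \beta)\ [\mathrm{dB}]$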
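As a rough numerical illustration of the relations α ∝ 1/R² and β ∝ #Sensors, the sketch below computes the change in SNR relative to the reference value x. The function name `snr_change_db` and the choice of proportionality constants equal to 1 are assumptions made only for this example.

```python
import math


def snr_change_db(distance_ratio: float, num_sensors: int = 1) -> float:
    """Change in SNR [dB] relative to the reference SNR x.

    alpha = 1 / distance_ratio**2  (alpha ∝ 1/R², constant assumed to be 1)
    beta  = num_sensors            (beta ∝ #Sensors, constant assumed to be 1)
    """
    alpha = 1.0 / distance_ratio ** 2
    beta = float(num_sensors)
    return 10.0 * math.log10(alpha * beta)


# Doubling the distance with a single sensor.
print(round(snr_change_db(2.0), 2))     # -6.02
# Same distance, but with six sensors.
print(round(snr_change_db(2.0, 6), 2))  # 1.76
```

Under these assumptions, doubling the distance costs about 6 dB, and six sensors recover roughly 7.8 dB of that loss.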

(8)

Spatial Selectivity

[Figure: wave propagation directions and resulting signal waveform]

(9)

Spatial Selectivity

[Figure: wave propagation directions and resulting signal waveform]

(10)

Broadband Beamformer

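[Figure: #I microphones with signals x₁(n), x₂(n), x₃(n), x₄(n), ..., x_I(n); each signal passes through an FIR filter w₁[j], w₂[j], w₃[j], w₄[j], ..., w_I[j], and the filter outputs are summed to give the Output]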
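A broadband beamformer of this kind is commonly described as filter-and-sum: each microphone signal is passed through its own FIR filter and the results are added. The following is a minimal sketch of that structure, not the authors' implementation; the function name `filter_and_sum` and the zero initial state (samples before n = 0 are taken as zero) are assumptions.

```python
from typing import List, Sequence


def filter_and_sum(x: Sequence[Sequence[float]],
                   w: Sequence[Sequence[float]]) -> List[float]:
    """Compute y(n) = sum_i sum_j w[i][j] * x[i][n - j].

    x[i] is the signal of microphone i, w[i] holds the FIR taps of channel i.
    Samples before n = 0 are assumed to be zero.
    """
    if len(x) != len(w):
        raise ValueError("one FIR filter is needed per microphone")
    num_samples = len(x[0])
    y = [0.0] * num_samples
    for xi, wi in zip(x, w):
        for n in range(num_samples):
            for j, tap in enumerate(wi):
                if n - j >= 0:
                    y[n] += tap * xi[n - j]
    return y


# Two microphones: the impulse reaches microphone 2 one sample earlier.
x = [[0.0, 1.0, 0.0, 0.0],
     [1.0, 0.0, 0.0, 0.0]]
# Channel 2 is delayed by one sample so that both channels add in phase.
w = [[0.5, 0.0],
     [0.0, 0.5]]
y = filter_and_sum(x, w)
print(y)  # [0.0, 1.0, 0.0, 0.0]
```

The direct triple loop is chosen for readability; a practical implementation would typically use an optimized convolution routine.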

(11)

Ex. Broadband response

(12)

Beamforming approaches

Data independent Beamformers
• The Delay and Sum Beamformer
• Multidimensional Filter designed Beamformers

Statistical Beamformers
• Linearly Constrained Minimum Variance (LCMV) Beamformer
• The Optimal Signal-to-Noise plus Interference (SNIB) Beamformer
• Minimum Mean Square Error (MMSE) Beamformer
• Diffuse Noise Field Beamformer
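The delay and sum beamformer is the simplest of the data independent designs: each channel is delayed so that a wave from the look direction lines up across the array, and the channels are then averaged. The sketch below assumes a far-field source, a uniform linear array, whole-sample delays, and a speed of sound of 343 m/s; the function names and the 5 cm spacing in the usage lines are hypothetical.

```python
import math
from typing import List, Sequence


def steering_delays(num_mics: int, spacing_m: float, angle_deg: float,
                    fs_hz: float, c: float = 343.0) -> List[int]:
    """Whole-sample delays that align a far-field plane wave on a linear array.

    angle_deg is measured from broadside (0 degrees = perpendicular to the array).
    """
    # Arrival time of the wavefront at microphone i, relative to microphone 0.
    arrival = [i * spacing_m * math.sin(math.radians(angle_deg)) / c
               for i in range(num_mics)]
    latest = max(arrival)
    # Delay every channel until the latest arrival, rounded to whole samples.
    return [round((latest - t) * fs_hz) for t in arrival]


def delay_and_sum(x: Sequence[Sequence[float]],
                  delays: Sequence[int]) -> List[float]:
    """Average the channels after delaying channel i by delays[i] samples."""
    num_mics, num_samples = len(x), len(x[0])
    y = [0.0] * num_samples
    for xi, d in zip(x, delays):
        for n in range(d, num_samples):
            y[n] += xi[n - d] / num_mics
    return y


# Broadside source: no delays are needed.
print(steering_delays(6, 0.05, 0.0, 12000.0))
# Microphone 2 hears the impulse one sample before microphone 1.
print(delay_and_sum([[0.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 0.0]], [0, 1]))
```

Rounding to whole samples keeps the example short; fractional delays would need interpolation filters, which is one reason the FIR filter structure is used in broadband designs.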

(13)

Linearly Constrained Minimum Variance Beamformer (LCMV)

• For each frequency, the weights are found from:

Subject to:

• The correlation matrix contains contributions from all sources
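For reference, the standard textbook form of the LCMV problem at a frequency $f$ can be sketched as below. The symbols $\mathbf{w}$ (weight vector), $\mathbf{R}$ (correlation matrix of the microphone signals), $\mathbf{C}$ (constraint matrix) and $\mathbf{g}$ (desired response vector) are assumed here and may differ from the authors' notation.

```latex
\begin{aligned}
\mathbf{w}_{\mathrm{opt}}(f) &= \arg\min_{\mathbf{w}} \; \mathbf{w}^{H} \mathbf{R}(f)\, \mathbf{w}
\quad \text{subject to} \quad \mathbf{C}^{H}(f)\, \mathbf{w} = \mathbf{g}, \\
\mathbf{w}_{\mathrm{opt}}(f) &= \mathbf{R}^{-1}(f)\, \mathbf{C}(f)
\left[ \mathbf{C}^{H}(f)\, \mathbf{R}^{-1}(f)\, \mathbf{C}(f) \right]^{-1} \mathbf{g}.
\end{aligned}
```

Because $\mathbf{R}$ contains contributions from all sources, the constraint is what keeps the minimization from suppressing the speech source itself.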

(14)

Optimal SNIB Beamformer

The weights that maximize the quotient (signal power over noise-plus-interference power) are found from the generalized eigenvalue relation, i.e.,

• One correlation matrix contains contributions from the source of interest, and the other contains contributions from all other sources
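A common way to write this quotient and the associated generalized eigenvalue relation is sketched below. The symbols $\mathbf{R}_{ss}$ (correlation matrix of the source of interest) and $\mathbf{R}_{nn}$ (correlation matrix of all other sources) are assumed names, not necessarily the authors' notation.

```latex
\begin{aligned}
Q(\mathbf{w}) &= \frac{\mathbf{w}^{H} \mathbf{R}_{ss}\, \mathbf{w}}{\mathbf{w}^{H} \mathbf{R}_{nn}\, \mathbf{w}}, \\
\mathbf{R}_{ss}\, \mathbf{w}_{\mathrm{opt}} &= \lambda_{\max}\, \mathbf{R}_{nn}\, \mathbf{w}_{\mathrm{opt}}.
\end{aligned}
```

The optimal weight vector is the generalized eigenvector belonging to the largest eigenvalue $\lambda_{\max}$, and $\lambda_{\max}$ equals the maximum value of the quotient.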

(15)

MMSE Beamformer
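As a reminder of the usual textbook formulation, the MMSE (Wiener) weights minimize the mean square error between the beamformer output and the desired speech signal. The symbols $\mathbf{x}$ (stacked microphone signals), $s$ (desired signal), $\mathbf{R}_{xx}$ and $\mathbf{r}_{xs}$ are assumptions for this sketch.

```latex
\begin{aligned}
\mathbf{w}_{\mathrm{opt}} &= \arg\min_{\mathbf{w}} \; E\!\left[ \left| s - \mathbf{w}^{H} \mathbf{x} \right|^{2} \right], \\
\mathbf{w}_{\mathrm{opt}} &= \mathbf{R}_{xx}^{-1}\, \mathbf{r}_{xs},
\qquad \mathbf{R}_{xx} = E\!\left[ \mathbf{x}\, \mathbf{x}^{H} \right],
\quad \mathbf{r}_{xs} = E\!\left[ \mathbf{x}\, s^{*} \right].
\end{aligned}
```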

(16)

Diffuse Noise Field beamformer

For each frequency, the weights are found from:
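One widely used model for this case, given here only as a hedged sketch, replaces the measured noise correlation matrix by the theoretical coherence of a spherically isotropic (diffuse) noise field. The symbols $\boldsymbol{\Gamma}$ (coherence matrix), $d_{ik}$ (distance between sensors $i$ and $k$), $c$ (speed of sound) and $\mathbf{d}$ (steering vector towards the talker) are assumptions and may not match the authors' formulation.

```latex
\begin{aligned}
\Gamma_{ik}(f) &= \frac{\sin\!\left( 2 \pi f\, d_{ik} / c \right)}{2 \pi f\, d_{ik} / c}, \\
\mathbf{w}_{\mathrm{opt}}(f) &= \frac{\boldsymbol{\Gamma}^{-1}(f)\, \mathbf{d}(f)}
{\mathbf{d}^{H}(f)\, \boldsymbol{\Gamma}^{-1}(f)\, \mathbf{d}(f)}.
\end{aligned}
```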

(17)

Evaluation Conditions

• Environment in car running at 110 km/h
• Linear sensor array
• 6 sensors with 12 kHz sampling rate
• Evaluation on real speech signals

(18)

Results

Performance [dB]       Speech Distortion   Noise Suppression   Interference Suppression
SNIB                   -19.4               18.1                30.7
MMSE                   -30.6               15.2                17.2
Diffuse Noise Field    -26.5               4.0                 1.9

(19)


Contents

• Handsfree Telephony Principles
• Handsfree problem
• Optimal Beamformers
  • Linearly Constrained Minimum Variance Beamformer
  • Optimal Signal-to-Noise plus Interference
  • Diffuse Noise Field Beamformer
  • Minimum Mean Square Error Beamformer
• Results in a real environment
• Conclusions
