Speech signal processing apparatus for extracting a speech signal from a noisy speech signal





5,204,906

43.72.Ar VOICE SIGNAL PROCESSING DEVICE

Akira Nohara and Joji Kane, assignors to Matsushita Electric Industrial Company

20 April 1993 (Class 381/36); filed in Japan 13 February 1990

This voiced/voiceless detector and noise reduction system measures the peak amplitude, the mean cepstral energy, and their rates of change from one frame to the next. A high and increasing peak marks a voiced segment, whereas the mean value must be above a threshold to indicate a consonant segment. Consonant-like measurements not adjacent to voiced sounds are judged to be noise. During such segments, a characteristic noise spectrum is accumulated.--DLR
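The decision rules summarized in this review can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names, the threshold values, and the reading of "adjacent" as "immediately neighbouring frame" are all assumptions, and the per-frame peak and mean values are taken as already measured.

```python
def label_frames(peaks, means, peak_threshold=1.0, mean_threshold=0.2):
    """Label each frame 'voiced', 'consonant', or 'noise'.

    peaks, means: per-frame cepstral peak and mean values (assumed inputs).
    Both thresholds are hypothetical and would need tuning on real data.
    """
    labels = []
    prev_peak = None
    for peak, mean in zip(peaks, means):
        # "High and increasing peak" read literally: above threshold
        # and larger than the previous frame's peak.
        rising = prev_peak is not None and peak > prev_peak
        if peak > peak_threshold and rising:
            labels.append("voiced")
        elif mean > mean_threshold:
            labels.append("consonant")
        else:
            labels.append("noise")
        prev_peak = peak

    # Consonant-like frames with no voiced neighbour are relabelled as noise.
    final = list(labels)
    for i, label in enumerate(labels):
        if label == "consonant":
            neighbours = labels[max(i - 1, 0):i] + labels[i + 1:i + 2]
            if "voiced" not in neighbours:
                final[i] = "noise"
    return final


def accumulate_noise_spectrum(spectra, labels):
    """Average the spectra of all frames labelled 'noise' (None if there are none)."""
    noise = [s for s, label in zip(spectra, labels) if label == "noise"]
    if not noise:
        return None
    return [sum(column) / len(noise) for column in zip(*noise)]
```

A plain average is used here for the accumulated noise spectrum; the review does not say how the accumulation is actually performed.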

member 17 supports a leg 20. The end of this leg is attached to an ossicle to conduct vibration to the inner ear. Typically, the transducer would be driven by the output of an implanted microphone and amplifier operating from a battery that can be recharged by induction through the skin.--SFL

5,201,765

43.70.Aj VOCAL CORD MEDIALIZATION PROSTHESIS

James L. Netterville and James B. Hissong, assignors to Xomed-Treace, Incorporated

13 April 1993 (Class 623/11); filed 20 September 1991

In one form of laryngeal pathology, speaking is difficult or impossible due to the displacement or retraction of a vocal fold, often caused by paralysis. This device consists of a wedge to be fitted between the thyroid cartilage and the displaced fold, pushing the fold back into medial proximity and so allowing at least some degree of vocalization to occur. The wedge thickness may be adjusted using a probe tool, thus reducing the need for frequent operations to change the shape of the prosthesis.--DLR

5,197,113

43.72.Ar METHOD OF AND ARRANGEMENT FOR DISTINGUISHING BETWEEN VOICED AND UNVOICED SPEECH ELEMENTS

Enzo Mumolo, assignor to Alcatel N.V.

23 March 1993 (Class 395/2); filed in Italy 15 May 1989

A large fraction of the many different speech coding methods depends on a decision as to whether a given frame is voiced or unvoiced. The voicing detector presented here is a decision tree based on an estimate of the frequency value of the centroid of spectral energy. If the centroid has a sufficiently low frequency and that frequency is falling rapidly enough, the frame is said to be voiced.--DLR
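As a rough illustration of the centroid test described above, the sketch below computes an energy-weighted mean frequency with a naive DFT and applies a two-condition check. The function names and both thresholds (`max_centroid_hz`, `min_fall_hz`) are hypothetical; the review gives no numerical values, and the patent's actual decision tree may have more branches.

```python
import cmath
import math


def spectral_centroid(frame, sample_rate):
    """Energy-weighted mean frequency (Hz) of one frame, using a naive DFT."""
    n = len(frame)
    total_energy = 0.0
    weighted = 0.0
    for k in range(n // 2 + 1):  # non-negative frequency bins only
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        energy = abs(coeff) ** 2
        total_energy += energy
        weighted += (k * sample_rate / n) * energy
    return weighted / total_energy if total_energy > 0 else 0.0


def is_voiced(centroid_hz, prev_centroid_hz,
              max_centroid_hz=1500.0, min_fall_hz=50.0):
    """Voiced if the centroid is low and has fallen fast enough since the last frame.

    Both thresholds are assumed values chosen only for illustration.
    """
    low_enough = centroid_hz < max_centroid_hz
    falling_fast = (prev_centroid_hz - centroid_hz) >= min_fall_hz
    return low_enough and falling_fast
```

For example, a frame dominated by a 200 Hz component that follows a frame with a 3000 Hz centroid would pass both conditions under these assumed thresholds.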

5,216,747

43.72.Ar VOICED/UNVOICED ESTIMATION OF AN ACOUSTIC SIGNAL

John C. Hardwick and Jae S. Lim, assignors to Digital Voice Systems, Incorporated

1 June 1993 (Class 395/2); filed 20 September 1990

This description of a multi-band-excited (MBE) vocoder is primarily concerned with a method of high-resolution pitch detection, but also addresses a kind of mixed-voicing approach to generating harmonics in the speech output waveform. The MBE vocoder architecture has been disclosed in earlier patents by the same inventors. The pitch analysis uses an estimated pitch track extending two frames backward and two forward to reach an interpolated period estimate, said to be typically accurate to within a quarter to one-eighth of a sample period. An energy-based voiced/unvoiced decision leads to the generation of harmonics by adding voiced (low-frequency) and unvoiced (high-frequency) components.--DLR
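To give a sense of what a sub-sample period estimate involves, the sketch below refines an integer-lag autocorrelation peak by parabolic interpolation. This is a swapped-in, commonly used technique and not the method of the patent: it ignores the backward and forward pitch tracking described above, and the function names and lag range are assumptions.

```python
import math


def autocorrelation(frame, lag):
    """Unnormalized autocorrelation of the frame at an integer lag."""
    return sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))


def refine_period(frame, min_lag, max_lag):
    """Estimate the pitch period in (fractional) samples.

    Picks the best integer lag in [min_lag, max_lag], then fits a parabola
    through that point and its two neighbours to locate the peak between samples.
    Requires min_lag >= 1 and max_lag + 1 < len(frame).
    """
    scores = {lag: autocorrelation(frame, lag)
              for lag in range(min_lag - 1, max_lag + 2)}
    best = max(range(min_lag, max_lag + 1), key=lambda lag: scores[lag])
    left, mid, right = scores[best - 1], scores[best], scores[best + 1]
    denom = left - 2.0 * mid + right
    offset = 0.0 if denom == 0 else 0.5 * (left - right) / denom
    return best + offset
```

On a clean sinusoid whose true period falls between two integer lags, this kind of interpolation recovers a fractional period; real speech would need windowing and normalization that are omitted here.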

5,226,108

43.72.Ar PROCESSING A SPEECH SIGNAL WITH ESTIMATED PITCH

John C. Hardwick and Jae S. Lim, assignors to Digital Voice Systems, Incorporated

6 July 1993 (Class 395/2); filed 20 September 1990

This patent is closely related to Patent number 5,216,747, reviewed above. This one was in fact filed a year before the earlier-issued patent. There are only minor differences in the texts of the two patents, with the all-important claims section substantially expanded.--DLR

5,220,610

43.72.Dv SPEECH SIGNAL PROCESSING APPARATUS FOR EXTRACTING A SPEECH SIGNAL FROM A NOISY SPEECH SIGNAL

Joji Kane and Akira Nohara, assignors to Matsushita Electric Industrial Company

15 June 1993 (Class 381/46); filed in Japan 28 May 1990

This speech noise reduction system performs a log-fit type of cepstrum analysis in order to estimate the periodic component amplitudes. Based on peak and average energy values and a peak frequency value as measured in the cepstral domain during speech and nonspeech intervals, a reduced-noise version of the speech signal is somehow produced. The patent text describes many alternative hookups to a box known as the "speech extraction section," but gives very little specific information about how that box works. It perhaps functions as described in Patent number 5,204,906, by the same inventors, reviewed above.--DLR
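Since the review leaves the cepstral measurements unspecified, the following is only a generic sketch of a real cepstrum (inverse DFT of the log magnitude spectrum) and of peak and mean measurements within a quefrency band. The function names, the `floor` constant, and the use of the mean absolute value as the "average" measure are assumptions, not details taken from the patent.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(sign * 2j * math.pi * k * i / n)
                  for i, v in enumerate(values))
        out.append(acc / n if inverse else acc)
    return out


def real_cepstrum(frame, floor=1e-6):
    """Inverse DFT of the log magnitude spectrum; `floor` avoids log(0)."""
    log_mag = [math.log(abs(c) + floor) for c in dft(frame)]
    return [c.real for c in dft(log_mag, inverse=True)]


def cepstral_features(frame, min_q, max_q):
    """Return (peak quefrency in samples, peak value, mean absolute value)
    of the cepstrum over the quefrency band [min_q, max_q]."""
    band = real_cepstrum(frame)[min_q:max_q + 1]
    peak_value = max(band)
    peak_q = min_q + band.index(peak_value)
    mean_value = sum(abs(c) for c in band) / len(band)
    return peak_q, peak_value, mean_value
```

For a strongly periodic frame, the peak quefrency corresponds to the period in samples, which is why a pronounced cepstral peak is a common indicator of voiced speech.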

J. Acoust. Soc. Am., Vol. 96, No. 1, July 1994, Reviews of Acoustical Patents, p. 619

Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms.