FN Thomson Reuters Web of Science™ VR 1.0 PT J AU LEGUYADER, A LAMBLIN, C BOURSICAUT, E AF LEGUYADER, A LAMBLIN, C BOURSICAUT, E TI EMBEDDED ALGEBRAIC CELP/VSELP CODERS FOR WIDE-BAND SPEECH CODING SO SPEECH COMMUNICATION LA English DT Article DE CELP CODING; VSELP CODING; EMBEDDED CODING; ORTHOGONALIZATION; VARIABLE BIT RATE; WIDE-BAND CODING AB Today, the need for variable bit rate coding exists to an increasing extent in applications such as videoconferencing, audioconferencing, packet circuit multiplication systems and mobile radio. This article extends the concept of embedded coding to include multi-stage CELP and VSELP coding at variable bit rates. Firstly, this article presents a unified approach to multi-stage CELP and VSELP coding. Secondly, this contribution provides a general expression for the CELP/VSELP error criterion in the case of a sequential search for successive indexes and gains, including gain reoptimization at each stage. Thirdly, this paper outlines the development of a recursive algorithm, used in order to solve this least-squares problem. As explained herein, this algorithm is based on both the matrix partitioning approach of the QR decomposition, and the Gram-Schmidt orthogonalization algorithm. Then, the resulting orthogonalized gains are used in the derivation of a new algorithm, which is implemented off-line in order to ensure the optimization of successive codebooks in embedded multi-stage CELP/VSELP coding. Finally, subjective test results are presented, which illustrate that 24 and 32 kbit/s embedded CELP/VSELP wideband coders provide speech quality close to that of the embedded SB/ADPCM G722 coders at 56 and 64 kbit/s. C1 TELECOM BRETAGNE, TECHNOPOLE BREST IROISE, F-29285 BREST, FRANCE. RP LEGUYADER, A (reprint author), FRANCE TELECOM, CNET,LAA,TSS,TECHNOPOLE ANTICIPA, 2 AVE PIERRE MARZIN, F-22307 LANNION, FRANCE. 
CR BOURSICAUT E, 1992, CALCUL DICT VSELP CO DAVIDSON G, 1988, 1988 P INT C AC SPEE, P163 DROGO R, 1991, 1991 P INT C AC SPEE, P681 DYMARSKI P, 1990, 1990 P INT C AC SPEE, P485 DYMARSKI P, 1993, SPEECH AUDIO CODING, P231 GERSON IA, 1990, APR INT C AC SPEECH, P461 Golub G. H., 1989, MATRIX COMPUTATIONS GOODMAN DJ, 1980, IEEE T COMMUN, V28, P1040, DOI 10.1109/TCOM.1980.1094764 JOHNSON M, 1990, P GLOBECOM, P542, DOI 10.1109/GLOCOM.1990.116570 KLEIJN WB, 1990, IEEE T ACOUST SPEECH, V38, P1330, DOI 10.1109/29.57568 LAFLAMME C, 1991, 1991 P INT C AC SPE, P13 LEGUYADER A, 1992, P EUSIPCO SIGNAL PRO, V6, P527 LEGUYADER A, 1993, 1993 P IEEE WORKSH S, P15 LOZACH B, 1993, THESIS U RENNES 1 MAITRE X, 1988, IEEE J SELECTED AREA, V6 MOREAU N, 1992, 1992 P INT C AC SPEE SCHROEDER MR, 1985, MAR P INT C AC SPEEC, P937 SINGHAL S, 1989, IEEE T ACOUST SPEECH, V37, P317, DOI 10.1109/29.21700 STEWART GW, 1973, COMPUTER SCI APPL MA, pCH5 TANIGUCHI T, 1990 P INT C SPOK LA, P113 TRANCOSO IM, 1990, IEEE T ACOUST SPEECH, V38, P385, DOI 10.1109/29.106858 NR 21 TC 8 Z9 8 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1995 VL 16 IS 4 BP 319 EP 328 DI 10.1016/0167-6393(95)00002-6 PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA RF855 UT WOS:A1995RF85500001 ER PT J AU VANBERGEM, DR AF VANBERGEM, DR TI PERCEPTUAL AND ACOUSTIC ASPECTS OF LEXICAL VOWEL REDUCTION, A SOUND CHANGE IN PROGRESS SO SPEECH COMMUNICATION LA English DT Article DE VOWEL REDUCTION; VOWEL CATEGORIES; FREQUENCY OF OCCURRENCE OF WORDS; SPEAKING STYLES; INTERSTRESS POSITION; SOUND CHANGE ID CATEGORICAL DATA; AGREEMENT; SPANISH; SPEECH AB In the present study 20 Dutch male speakers were asked to read aloud 47 test words in a word list and in short sentences. Part of this word set was also named by them through the presentation of pictures. 
A group of 20 listeners was asked to identify an unstressed vowel in all of these test words. The vowel responses of listeners were recoded into two broad categories: ''full vowel'' and ''schwa''. Our aims were (1) to find out to what extent listeners are able to unambiguously distinguish between these two categories, (2) to investigate the influence of the frequency of occurrence of words on the classification of the test vowels, (3) to investigate the influence of speaking styles on the classification of the test vowels by comparing the speech conditions ''word list'', ''pictures'' and ''sentences''. The experimental results showed that (1) listeners often could not unambiguously classify the test vowels, especially if these occurred in interstress position, (2) the number of schwa responses was much higher for vowels in words with a relatively high frequency of occurrence, (3) the number of schwa responses increased in a more casual speaking style. Acoustic measurements on the test vowels revealed a clear relation between the perceptual results and the acoustic features of the vowels. Although the preconditions for the sound change ''full vowel --> schwa'' in several Dutch words are excellent, the actual completion of the sound change is in our view to a large extent blocked by the rather close correspondence between Dutch vowel sounds and their orthographic representations. RP VANBERGEM, DR (reprint author), UNIV AMSTERDAM, INST PHONET SCI, HERENGRACHT 338, 1016 CG AMSTERDAM, NETHERLANDS. CR AITCHISON J, 1987, LANGAUGE CHANGE PROG Booij Geert, 1995, PHONOLOGY DUTCH BYRD D, 1994, SPEECH COMMUN, V15, P39, DOI 10.1016/0167-6393(94)90039-6 CARON WJH, 1972, KLANK TEKEN VERZAMEL, P131 Coates J., 1986, WOMEN MEN LANGUAGE S COHEN J, 1960, EDUC PSYCHOL MEAS, V20, P37, DOI 10.1177/001316446002000104 Cox D. R., 1970, ANAL BINARY DATA CUTLER A, 1992, PAPERS LABORATORY PH, V2, P290 De Jong E. 
D., 1979, SPREEKTAAL WOORDFREQ DEGRAAF T, 1984, P I PHONETIC SCI, V8, P41 DELATTRE P, 1971, PHONETICA, V23, P129 DENOS EA, 1988, THESIS U UTRECHT DUNN OJ, 1961, J AM STAT ASSOC, V56, P52, DOI 10.2307/2282330 FERGUSON GA, 1976, STATISTICAL ANAL PSY Fidelholz James, 1975, 11 REG M CHIC LING S, P200 FLEISS JL, 1971, PSYCHOL BULL, V76, P378, DOI 10.1037/h0031619 FOSS DJ, 1980, PERCEPTION PRODUCTIO FREEMAN DH, 1987, APPLEID CATEGORICAL GRIZZLE JE, 1969, BIOMETRICS, V25, P489, DOI 10.2307/2528901 HARMEGNIES B, 1992, SPEECH COMMUN, V11, P429, DOI 10.1016/0167-6393(92)90048-C HEEROMA K, 1959, TIJDSCHRIFT NEDERLAN, V77, P187 Hopper P., 1993, GRAMMATICALIZATION Hosmer DW, 1989, APPLIED LOGISTIC REG KLATT DH, 1979, J PHONETICS, V7, P279 KOCH GG, 1977, BIOMETRICS, V33, P133, DOI 10.2307/2529309 KOOPMANSVANBEINUM FJ, 1994, PHONETICA, V51, P68 KOOPMANSVANBEIN.FJ, 1982, SPEKTATOR, V11, P284 KREIMAN J, 1993, J SPEECH HEAR RES, V36, P21 LANDIS JR, 1977, BIOMETRICS, V33, P159, DOI 10.2307/2529310 MAKHOUL J, 1976, IEEE T ACOUST SPEECH, P466 Marslen-Wilson W. D., 1989, LEXICAL REPRESENTATI, P169 Martin W., 1968, NIEUW TAALGIDS, V61, P162 MILLER PD, 1972, 8TH REG M CHIC LING, P482 *NIJM U, 1985, PROP CREAT NAT MULT OHALA JJ, 1989, SERIES TRENDS LINGUI, V43, P173 Ohala John J, 1981, PAPERS PARASESSION L, P178 Press W. H., 1991, NUMERICAL RECIPES C SCHOUTEN HJA, 1985, THESIS U ROTTERDAM TINSLEY HEA, 1975, J COUNS PSYCHOL, V22, P358, DOI 10.1037/h0076640 VANBERGEM DR, 1993, SPEECH COMMUN, V12, P1, DOI 10.1016/0167-6393(93)90015-D VANBERGEM DR, 1994, SPEECH COMMUN, V14, P143, DOI 10.1016/0167-6393(94)90005-1 VANBERGEM DR, 1990, P I PHONETIC SCI AMS, V14, P53 VANBERGEM DR, 1993, EUROSPEECH 93 BERLIN, P677 VANBERGEM DR, 1991, P ESCA WORKSHOP PHON WILLEMS LF, 1986, IPO21 ANN PROGR REP, P34 NR 45 TC 5 Z9 5 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUN PY 1995 VL 16 IS 4 BP 329 EP 358 DI 10.1016/0167-6393(95)00003-7 PG 30 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA RF855 UT WOS:A1995RF85500002 ER PT J AU TRITTIN, PJ LLEO, ADY AF TRITTIN, PJ LLEO, ADY TI VOICE QUALITY ANALYSIS OF MALE AND FEMALE SPANISH SPEAKERS SO SPEECH COMMUNICATION LA English DT Article DE MALE FEMALE SPEECH; VOICE ANALYSIS; VOICE QUALITY; SPANISH ID SPEECH SYNTHESIS AB This paper describes the results of an acoustical analysis which compares the quality of female and male voices of Spanish speakers. The analysis is a pilot study based on a similar one presented by Klatt and Klatt (1990). Results indicate that the Spanish female voice does differ in some respects from Spanish males; however, to a lesser extent than what was found by Klatt and Klatt. The breathy quality found in Spanish females is not too different from Spanish males, which may support the assumption that breathiness may be a learned, cultural behaviour. C1 ETSI TELECOMUNICAC, DEPT INGN ELECTR, E-28040 MADRID, SPAIN. 
CR BICKLEY C, 1982, MIT SPEECH COMMUNICA, P71 CRESPO MAR, 1991, COMUNICACIOINES TELE, V2, P35 Fant G., 1970, ACOUSTIC THEORY SPEE FANT G, 1993, SPEECH COMMUN, V13, P7, DOI 10.1016/0167-6393(93)90055-P FANT G, 1982, STLQPSR23 ROYAL I TE, P1 Fant G., 1973, SPEECH SOUNDS FEATUR FUJIMURA O, 1968, IEEE T ACOUST SPEECH, VAU16, P68, DOI 10.1109/TAU.1968.1161951 FUJIMURA O, 1962, J ACOUST SOC AM, V34, P1865, DOI 10.1121/1.1909142 KARLSSON I, 1989, P EUROPEAN C SPEECH, V1, P349 KARLSSON I, 1992, SPEECH COMMUN, V11, P491, DOI 10.1016/0167-6393(92)90056-D KARLSSON I, 1991, J PHONETICS, V19, P111 Karlsson I, 1992, SPEECH TRANSMISSION, V1, P19 KARLSSON I, 1991, 12TH P INT C PHON SC KLATT DH, 1987, J ACOUST SOC AM, V82, P737, DOI 10.1121/1.395275 KLATT DH, 1990, J ACOUST SOC AM, V87, P820, DOI 10.1121/1.398894 MONSEN RB, 1977, J ACOUST SOC AM, V62, P981, DOI 10.1121/1.381593 PRICE PJ, 1989, SPEECH COMMUN, V8, P261, DOI 10.1016/0167-6393(89)90005-8 SAVOJI MH, 1990, DIFFERENCES MALE FEM, P1 NR 18 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1995 VL 16 IS 4 BP 359 EP 368 DI 10.1016/0167-6393(95)00004-8 PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA RF855 UT WOS:A1995RF85500003 ER PT J AU MURRAY, IR ARNOTT, JL AF MURRAY, IR ARNOTT, JL TI IMPLEMENTATION AND TESTING OF A SYSTEM FOR PRODUCING EMOTION-BY-RULE IN SYNTHETIC SPEECH SO SPEECH COMMUNICATION LA English DT Article DE SPEECH SYNTHESIS; SYNTHESIS-BY-RULE; EMOTION; MOOD; AFFECT; SYNTHETIC SPEECH PERCEPTION ID INTONATION; RECOGNITION; JUDGMENTS AB A system is described which adds simulated emotion effects to synthetic speech. The control parameters of a speech synthesizer are controlled by rule in order to simulate the features of emotion expressed in the human voice. The system can simulate six vocal emotions and was evaluated with naive listeners. 
The results indicated that the system was producing recognizable vocal emotions, with perception rankings similar to those found by previous research on human emotional speech. This system has been developed for use in voice prosthesis systems for non-vocal disabled persons, although it could be used to enhance any application which uses rule-based synthetic speech. RP MURRAY, IR (reprint author), UNIV DUNDEE, DEPT MATH & COMP SCI, MICROCTR, DUNDEE DD1 4HN, SCOTLAND. CR ALLEN MS, 1987, TEXT SPEECH MITALK S ALM N, 1992, COMMUN ACM, V35, P46 Bolinger D., 1986, INTONATION ITS PARTS BROWN BL, 1973, J ACOUST SOC AM, V54, P29, DOI 10.1121/1.1913571 BROWN BL, 1972, 80TH P ANN CONV AM P, P197 BROWN BL, 1974, J ACOUST SOC AM, V55, P313, DOI 10.1121/1.1914504 CAHN JE, 1990, GENERATING EXPRESSIO COHEN A, 1967, LINGUA, V19, P177 COSTANZO FS, 1969, J COUNS PSYCHOL, V16, P267, DOI 10.1037/h0027355 COWAN M, 1936, ARCH SPEECH S, V16 Davitz Joel Robert, 1964, COMMUNICATION EMOTIO *DIG EQ CORP, 1984, DECTALK DTCO1 OWN MA Fairbanks G, 1941, SPEECH MONOGR, V8, P85 Fairbanks G, 1939, SPEECH MONOGR, V6, P87 Fonagy I., 1963, Z PHONETIK SPRACHWIS, V16, P293 FONAGY I, 1981, RES ASPECTS SINGING, P51 GREENE BG, 1986, BEHAV RES METH INSTR, V18, P100, DOI 10.3758/BF03201008 HUTTAR GL, 1968, J SPEECH HEAR RES, V11, P481 HUTTAR GL, 1967, MONOGRAPH SPEECH COM, V1 Izard C. 
E., 1972, PATTERNS OF EMOTIONS JOHNSON WF, 1986, ARCH GEN PSYCHIAT, V43, P280 KRAMER E, 1963, PSYCHOL BULL, V60, P408, DOI 10.1037/h0044890 MURRAY IR, 1989, THESIS U DUNDEE UK MURRAY IR, 1993, J ACOUST SOC AM, V93, P1097, DOI 10.1121/1.405558 MURRAY IR, 1991, 1991 P EUR 91 GEN, P311 OATLEY K, 1989, NEW SCI, V123, P33 ORTONY A, 1990, PSYCHOL REV, V97, P315, DOI 10.1037//0033-295X.97.3.315 OSULLIVAN M, 1985, J PERS SOC PSYCHOL, V48, P54, DOI 10.1037/0022-3514.48.1.54 PAKOSZ M, 1983, J PSYCHOLINGUIST RES, V12, P311 PAKOSZ M, 1982, LINGUA, V56, P153, DOI 10.1016/0024-3841(82)90028-6 PIERREHUMBERT J, 1981, J ACOUST SOC AM, V70, P985, DOI 10.1121/1.387033 PISONI DB, 1985, P SPEECH TECH NEW YO, P57 Roach P, 1992, INTRO PHONETICS SCHERER KR, 1986, PSYCHOL BULL, V99, P143, DOI 10.1037//0033-2909.99.2.143 Scherer K.R., 1979, EMOTIONS PERSONALITY, P495 SCHERER KR, 1974, NONVERBAL COMMUNICAT, P105 SCHERER KR, 1982, HDB METHODS NONVERBA, P1 Scherer K.R., 1981, SPEECH EVALUATION PS, P189 SPRENT P, 1989, APPLIED NONPARAMETRI ULDALL E, 1960, LANG SPEECH, V3, P223 van Bezooijen R., 1984, CHARACTERISTICS RECO VANBEZOOIJEN R, 1983, J CROSS CULT PSYCHOL, V14, P387, DOI 10.1177/0022002183014004001 WILLIAMS CE, 1972, J ACOUST SOC AM, V52, P1238, DOI 10.1121/1.1913238 NR 43 TC 44 Z9 44 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUN PY 1995 VL 16 IS 4 BP 369 EP 390 DI 10.1016/0167-6393(95)00005-9 PG 22 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA RF855 UT WOS:A1995RF85500004 ER PT J AU HANSEN, JHL CAIRNS, DA AF HANSEN, JHL CAIRNS, DA TI ICARUS - SOURCE GENERATOR BASED REAL-TIME RECOGNITION OF SPEECH IN NOISY STRESSFUL AND LOMBARD EFFECT ENVIRONMENTS SO SPEECH COMMUNICATION LA English DT Article DE ROBUST SPEECH RECOGNITION; SOURCE GENERATOR THEORY; SPEECH UNDER STRESS; LOMBARD EFFECT; NOISE ADAPTATION; STRESS EQUALIZATION; SPEECH ENHANCEMENT; REAL-TIME SPEECH PROCESSING ID WORD RECOGNITION; ENHANCEMENT; SYSTEMS AB The problem of real-time automatic speech recognition in an adverse environment is addressed. Though much research has been performed in the area of speech recognition, only limited success has been demonstrated for real-time recognition in noisy stressful environments. The primary reason for this is that the performance of present day recognition algorithms is predicated on the assumptions of the environmental settings in which the algorithms have been formulated and implemented. In this paper, we discuss the effects of additive background noise on speech quality and recognition parameters, and propose a source generator based framework to address stress and noise. Using this framework, a computationally efficient real-time recognition system called ICARUS is developed. The speech recognition system incorporates direct processing steps to address the effects of additive noise on the speech signal and stress on the speech production system. 
Central issues which are addressed include (i) improved characterization of speech spoken in noisy situations involving both parameter estimation methods and analysis of varying speech characteristics spoken in adverse environments (i.e., stress and Lombard effect), (ii) exploration of signal processing strategies tailored to such speech, and (iii) demonstration of real-time system performance of the proposed methods. The proposed recognition system was formulated using a digital signal processing platform. Performance evaluations showed an improvement in speech feature representation under stressed speaking conditions, with an average improvement in recognition rate of +17.28% across eleven noisy stressful speaking conditions. RP HANSEN, JHL (reprint author), DUKE UNIV, DEPT ELECT ENGN, ROBUST SPEECH PROC LAB, BOX 90291, DURHAM, NC 27708 USA. CR ALEXANDRE P, 1993, SPEECH COMMUN, V12, P277, DOI 10.1016/0167-6393(93)90099-7 ANGLADE Y, 1993, 1993 P IEEE INT C AC, V2, P279 [Anonymous], 1969, IEEE T AUDIO ELECTRO, V17, P227 BOLL SF, 1979, IEEE T ACOUST SPEECH, V27, P113, DOI 10.1109/TASSP.1979.1163209 Bond Z.S., 1990, INT C SPEECH LANG PR, P969 BOUGHAZALE SE, 1994, 1994 P IEEE INT C AC, V1, P413 BRIA O, 1991, THESIS DUKE U DURHAM CAIRNS DA, 1991, THESIS DUKE U DURHAM CAIRNS DA, 1994, J ACOUST SOC AM, V96, P3392, DOI 10.1121/1.410601 CAIRNS DA, 1994, ICSLP 94 INT C SPOKE, V3, P1035 CAIRNS DA, 1992, ICSLP 92 INT C SPOKE, P703 CARLSON B, 1992, 1992 P IEEE INT C AC, V1, P237 CHEN YN, 1988, IEEE T ACOUST SPEECH, V36, P433, DOI 10.1109/29.1547 DAS S, 1993, 1993 P IEEE INT C AC, V2, P71 DAUTRICH BA, 1983, IEEE T ACOUST SPEECH, V31, P793, DOI 10.1109/TASSP.1983.1164172 DAVIS SB, 1980, IEEE T ACOUST SPEECH, V28, P357, DOI 10.1109/TASSP.1980.1163420 Deller J. 
R., 1993, DISCRETE TIME PROCES DODDINGTON GR, 1985, P IEEE, V73, P1651, DOI 10.1109/PROC.1985.13345 DODDINGTON GR, 1981, IEEE SPECTRUM, V18, P26 EPHRAIM Y, 1992, P IEEE, V80, P1526, DOI 10.1109/5.168664 FISHER W, 1986, 1986 P DARPA SPEECH, P93 GALES MJF, 1992, 1992 P INT C AC SPEE, P233 GAO Y, 1993, 1993 P IEEE INT C AC, P257 GARDNER MB, 1966, J ACOUST SOC AM, V40, P955, DOI 10.1121/1.1910220 GRAY AH, 1976, IEEE T ACOUST SPEECH, V24, P380, DOI 10.1109/TASSP.1976.1162849 Gray R.M., 1984, IEEE ASSP MAG APR, P4 HANLEY CN, 1965, J SPEECH HEAR DISORD, V30, P274 Hansen J. H. L., 1988, THESIS GEORGIA I TEC Hansen J. H. L., 1990, ICSLP 90, P1125 Hansen JHL, 1994, IEEE T SPEECH AUDI P, V2, P598, DOI 10.1109/89.326618 HANSEN JH, 1988, 1988 P IEEE INT C AC, P561 HANSEN JH, 1987, 1987 P IEEE INT C AC, P189 HANSEN JHL, 1991, IEEE T SIGNAL PROCES, V39, P795, DOI 10.1109/78.80901 HANSEN JHL, 1993, 1993 P IEEE INT C AC, P95 HANSEN JHL, 1995, IN PRESS J ACOUST SO HANSEN JHL, 1992, 6TH EUSIPCO 92 EUR S, P403 HANSEN JHL, 1994, SEP ICSLP 94 INT C S, V3, P1003 HANSEN JHL, 1995, IEEE T SPEECH AUDI P, V3, P169, DOI 10.1109/89.388143 HANSEN JHL, 1985, 110TH P AC SOC AM M, pC11 HANSEN JHL, 1987, 114TH P AC SOC AM M, pH15 HANSEN JHL, 1995, IEEE T SPEECH AUDIO, V3 HANSEN JHL, 1993, DSPL932 DUK U TECHN HANSEN JHL, 1989, AUG P IEEE MIDW S CI, P105 HANSEN JHL, 1989, 15TH IEEE P ANN NE B, P31 HANSEN JHL, 1989, 1989 P IEEE INT C AC, P266 HANSON BA, 1993, 1993 P IEEE INT C AC, V2, P79 HANSON BA, 1990, 1990 P IEEE INT C AC, P857 HERMANSKY H, 1993, 1993 P IEEE INT C AC, V2, P83 HUNT MJ, 1989, 1989 P IEEE INT C AC, P262 *IBM MICR DIV, 1994, MWAV LS4000 REF DES ITAKURA F, 1975, IEEE T ACOUST SPEECH, VAS23, P67, DOI 10.1109/TASSP.1975.1162641 Juang B. 
H., 1991, Computer Speech and Language, V5, DOI 10.1016/0885-2308(91)90011-E JUNQUA JC, 1993, J ACOUST SOC AM, V93, P510, DOI 10.1121/1.405631 KITAWAKI N, 1984, IEEE COMMUN MAG, V22, P26, DOI 10.1109/MCOM.1984.1091825 KLATT D, 1982, 1982 P IEEE INT C AC, P1278 KOEHLER J, 1994, 1994 P IEEE INT C AC, V1, P421 LAMEL LF, 1981, IEEE T ACOUST SPEECH, V29, P777, DOI 10.1109/TASSP.1981.1163642 Lea W., 1980, TRENDS SPEECH RECOGN LENNIG M, 1992, ICSLP 92 INT C SPOKE, V1, P93 LERNER S, 1992, 1992 P IEEE INT C AC, V1, P261 LEVINSON SE, 1983, AT&T TECH J, V62, P1035 LIM JS, 1979, P IEEE, V67, P1586, DOI 10.1109/PROC.1979.11540 LIPPMANN RP, 1987, 1987 P IEEE INT C AC, P705 LIU FH, 1992, 1992 P IEEE INT C A, V1, P257 LOCKWOOD P, 1992, 1992 P IEEE INT C AC, V1, P265 Lombard E., 1911, ANN MALADIES OREILLE, V37, P101 LYNCH JF, 1987, 1987 P IEEE INT C AC, P1348 MELLOR BA, 1993, 1993 P IEEE INT C AC, V2, P87 MENSOUR D, 1988, IEEE T ACOUST SPEECH, V37, P1659 MOKBEL C, 1992, ICSLP 92 INT C SPOKE, P707 MORENO P, 1994, 1994 P IEEE INT C AC, V1, P109 NANDKUMAR S, 1992, 1992 P IEEE INT C AC, V1, P297 *NAT I STAND TECHN, 1988, GETT START DARPA TIM NEUMEYER L, 1994, 1994 P IEEE INT C AC, V1, P417 PAUL DB, 1987, 1987 P IEEE INT C AC, P713 PAWATE B, 1989, 1989 P IEEE INT C AC, P801 PISONI DB, 1985, 1985 P IEEE INT C AC Quackenbush S. 
R., 1988, OBJECTIVE MEASURES S Rabiners LR, 1986, IEEE ASSP MAGAZI JAN, P4 RAJASEKARAN PK, 1986, 1986 P IEEE INT C AC, P733 SMOLDERS J, 1994, 1994 P IEEE INT C AC, V1, P429 STANTON BJ, 1989, 1989 P IEEE INT C AC, P675 STANTON BJ, 1988, 1988 P IEEE INT C AC, P331 SULLIVAN TM, 1993, 1993 P IEEE INT C AC, V2, P91 Summers W V, 1988, J Acoust Soc Am, V84, P917, DOI 10.1121/1.396660 VASEGHI SV, 1993, 1993 P EUR C SPEECH, V2, P1023 VOIERS WD, 1977, 1977 P IEEE INT C AC, P204 Waibel A, 1990, READINGS SPEECH RECO WILPON JG, 1990, IEEE T ACOUST SPEECH, V38, P1870, DOI 10.1109/29.103088 NR 89 TC 16 Z9 18 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1995 VL 16 IS 4 BP 391 EP 422 DI 10.1016/0167-6393(95)00007-B PG 32 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA RF855 UT WOS:A1995RF85500005 ER PT J AU KWON, CH UN, CK AF KWON, CH UN, CK TI IMPROVING THE ADAPTIVE SOURCE MODEL FOR CELP CODING WITH LONG ANALYSIS FRAME SIZE SO SPEECH COMMUNICATION LA English DT Article DE SPEECH CODING; LOW BIT RATE; ADAPTIVE SOURCE AB One way to lower the coding rate of CELP coders is to lengthen the excitation analysis frame size. For enhanced speech quality in such a case it is desirable to have the CELP excitation peaky (or sharpened). In this paper we first consider the relation between the LPC prediction residual and the CELP excitation, and show that the adaptive source of a CELP coder reconstructs the major pulse at glottal closure and the formant structure remaining in the LPC residual, and that the stochastic source models the randomness of the LPC residual. Based on this observation, we propose a new adaptive source in which samples of the source have different gains according to their amplitudes by a two-tap pitch predictor. 
Simulation results show that peaky pulses at voiced onset and a burst of plosive sound are clearly reconstructed, and that in voiced sound the excitation has the desirable peaky pulse characteristic and the pitch periodicity is well reproduced. C1 KOREA ADV INST SCI & TECHNOL, DEPT ELECT ENGN, COMMUN RES LAB, YUSUNG KU, TAEJON 305701, SOUTH KOREA. CR ANANTHAPADMANABHA TV, 1979, IEEE T ACOUST SPEECH, V27, P309, DOI 10.1109/TASSP.1979.1163267 COPPERI M, 1991, INT CONF ACOUST SPEE, P233, DOI 10.1109/ICASSP.1991.150320 FERRERBALLESTER MA, 1993, P INT C SIGNAL PROCE, P1360 FLANAGAN JL, 1972, SPEECH ANAL SYNTHESI, P184 GRANZOW W, 1991, INT CONF ACOUST SPEE, P217, DOI 10.1109/ICASSP.1991.150316 HAAGEN J, 1992, P IEEE INT C ACOUST HERNANDEZGOMEZ LA, 1991, P IEEE INT ACOUST SP, P585, DOI 10.1109/ICASSP.1991.150407 KANG GS, 1985, IEEE T ACOUST SPEECH, V33, P277 Kondoz A. M., 1989, ICASSP-89: 1989 International Conference on Acoustics, Speech and Signal Processing (IEEE Cat. No.89CH2673-2), DOI 10.1109/ICASSP.1989.266380 MOULINES E, 1990, INT CONF ACOUST SPEE, P309, DOI 10.1109/ICASSP.1990.115650 Shoham Y., 1991, Advances in Speech Coding TANAKA N, 1992, TECHNICAL REPORT IEI, V24, P55 Trancoso I. M., 1986, ICASSP 86 Proceedings. IEEE-IECEJ-ASJ International Conference on Acoustics, Speech and Signal Processing (Cat. No.86CH2243-4) Woo H. C., 1989, ICASSP-89: 1989 International Conference on Acoustics, Speech and Signal Processing (IEEE Cat. No.89CH2673-2), DOI 10.1109/ICASSP.1989.266382 ZINSER RL, 1992, P IEEE INT C ACOUST NR 15 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUN PY 1995 VL 16 IS 4 BP 423 EP 433 DI 10.1016/0167-6393(95)00006-A PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA RF855 UT WOS:A1995RF85500006 ER PT J AU GOLDSTEIN, M AF GOLDSTEIN, M TI CLASSIFICATION OF METHODS USED FOR ASSESSMENT OF TEXT-TO-SPEECH SYSTEMS ACCORDING TO THE DEMANDS PLACED ON THE LISTENER SO SPEECH COMMUNICATION LA English DT Article ID INTELLIGIBILITY AB A classification of different methods used for the assessment of TTS (Text-To-Speech) systems, according to the demands placed on the listener, is proposed and discussed. The classification is made according to the four traditional scale levels: the Nominal, Ordinal, Interval and Ratio level. A fifth level, the Supra-Nominal, including memory processes, is proposed. The methods are divided into qualitative, non-metric methods and quantitative, metric methods. The outcome is that the highest metric assessment level (Ratio) is not necessarily the level that places the highest demands on the listener. Quite to the contrary, the Nominal level, supporting a non-metric qualitative approach, places even higher demands on the listener. Additionally, various factors affecting the outcome regardless of at what level the assessment takes place are discussed such as the number of source and speech content conditions, dynamic response range, subjects, training, degree of user involvement and listening level, in relation to ITU-TS and ITU-R recommendations. RP GOLDSTEIN, M (reprint author), TELIA RES, RUDSJOTERRASSEN 2, S-13680 HANINGE, SWEDEN. 
CR BENNETT RW, 1985, 11TH INT S HUM FACT BENNETT RW, 1988, HUMAN FACTORS TELECO BENOIT C, 1989, P ESCA WORKSHOP SPEE BENOIT C, 1989, P EUROSPEECH 89 C PA, P633 BERGLUND B, 1983, MASTER SCALING ENV L, P610 BJORKMAN M, 1957, EXPERIMENTALPSYKOLOG BLAUERT J, 1990, UNPUB SOME BASICS PS BOOGAART T, 1992, OCT ICSLP 92 P BANFF, V2, P1207 CARLSON R, 1991, TALKING MACHINES THE COLEMAN AE, 1988, SPEECH COMMUN, V7, P151, DOI 10.1016/0167-6393(88)90036-2 Delogu C., 1991, EUROSPEECH 91. 2nd European Conference on Speech Communication and Technology Proceedings DELOGU C, 1992, SAMUCLG004 FIN REP DUFFY SA, 1991, 17 IND U DEP PSYCH S EGAN JP, 1948, LARYNGOSCOPE, V58, P955, DOI 10.1288/00005537-194809000-00002 GLASS GV, 1984, STATISTICAL METHODS GLEISS N, 1974, PREFERRED LISTENING GOLDSTEIN M, 1992, OCT ICSLP 92 P BANFF, P1339 GOLDSTEIN M, 1993, 6TH INT WORKSH PACKE GOLDSTEIN M, 1992, OCT ICSLP 92 P BANFF, P1131 GRICE M, 1989, P ESCA WORKSHOP SPEE GRICE M, 1989, SPEECH HEARING LANGU, V3, P107 HAZAN V, 1992, ESPRIT2589 FIN REP HAZAN V, 1989, P ESCA WORKSHOP SPEE JEKOSCH U, 1993, 1993 EUROSPEECH 93 P, V2, P1387 JONES BJ, 1985, SMPTE J, P1244 KASUYA H, 1992, 1992 ICSLP 92 P BANF, P1215 LOGAN JS, 1989, J ACOUST SOC AM, V86, P566, DOI 10.1121/1.398236 MARSH DJ, 1987, SPEECH TECHNOLOGY, P76 MULLENNIX JW, 1989, J ACOUST SOC AM, V85, P365, DOI 10.1121/1.397688 NYGAARD LC, 1992, 18 IND U DEP PSYCH S OMALLEY M, 1987, SPEECH TECHNOLOGY, P66 OZAWA K, 1987, RES SPEECH PERCEPTIO, V13, P71 PAVLOVIC CV, 1990, J ACOUST SOC AM, V87, P373, DOI 10.1121/1.399258 PISONI DB, 1982, SPEECH TECHNOLOG APR, P10 PRATT RL, 1987, SPEECH TECHNOLOGY, P54 SALZA PL, 1993, DEV CONTEXT DEPENDEN SILVERMAN K, 1990, 1990 ICSLP 90 P KOB, P981 SILVERMAN K, 1990, EVALUATING SYNTHETIC SIMPSON CA, 1987, SPEECH TECHNOLOGY, P48 SOTSCHECK J, 1985, ACUSTICA, V57, P257 SOTSCHECK J, 1989, INFORMATION TECHNOLO, P224 SPIEGEL M, 1988, 1988 P AM VOIC I O S Stevens S. 
S., 1975, PSYCHOPHYSICS INTRO VANBEZOOIJEN R, 1990, SPEECH COMMUN, V9, P263, DOI 10.1016/0167-6393(90)90002-Q VIRZI RA, 1992, HUM FACTORS, V34, P457 VOIERS WD, 1983, SPEECH TECHNOLOGY, V55, P30 1986, 1082 REPT, V11, P210 1987, HUMAN INFORMATION PR 1993, STUDY GROUP 12 QUEST 1987, HDB TELEPHONOMETRY 1989, TELEPHONE TRANSMISSI, V5 NR 51 TC 7 Z9 7 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD APR PY 1995 VL 16 IS 3 BP 225 EP 244 DI 10.1016/0167-6393(94)00047-E PG 20 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA QT740 UT WOS:A1995QT74000001 ER PT J AU LEBOUQUINJEANNES, R FAUCON, G AF LEBOUQUINJEANNES, R FAUCON, G TI STUDY OF A VOICE ACTIVITY DETECTOR AND ITS INFLUENCE ON A NOISE-REDUCTION SYSTEM SO SPEECH COMMUNICATION LA English DT Article DE VOICE ACTIVITY DETECTOR (VAD); NOISE REDUCTION ID SPEECH SIGNALS AB Some noise reduction processings, such as spectral subtraction techniques, require the learning of noise characteristics. In consequence, a Voice Activity Detector (VAD) is needed to determine noise and speech sequences. In this paper, in the case of spatially uncorrelated (or slightly correlated) noises, we introduce a new technique based on the coherence function which is used to determine a speech/noise classification algorithm. We combine it with a noise reduction technique based on the spectral subtraction and evaluate its influence. We report on results obtained on the performance of the algorithm and conclude that they are quite comparable to those obtained using a manual labelling. RP LEBOUQUINJEANNES, R (reprint author), UNIV RENNES 1, TRAITEMENT SIGNAL & IMAGE LAB, CAMPUS BEAULIEU, F-35042 RENNES, FRANCE. 
CR ALLEN JB, 1977, J ACOUST SOC AM, V62, P912, DOI 10.1121/1.381621 BEROUTI M, 1979, APR P IEEE INT C AC, P208 CARTER GC, 1987, P IEEE, V75, P236, DOI 10.1109/PROC.1987.13723 EPHRAIM Y, 1984, IEEE T ACOUST SPEECH, V32, P1109, DOI 10.1109/TASSP.1984.1164453 FREEMAN DK, 1989, MAY P IEEE ICASSP GL, P369 HALKA U, 1992, SPEECH COMMUN, V11, P15, DOI 10.1016/0167-6393(92)90060-K HIDAYAT B, 1989, 12E C GRETSI, P525 LEBOUQUIN R, 1990, SIGNAL PROCESS, V5, P1103 LEBOUQUIN R, 1993, EUROSPEECH 93, P227 SAVOJI MH, 1989, SPEECH COMMUN, V8, P45, DOI 10.1016/0167-6393(89)90067-8 NR 10 TC 27 Z9 26 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD APR PY 1995 VL 16 IS 3 BP 245 EP 254 DI 10.1016/0167-6393(94)00056-G PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA QT740 UT WOS:A1995QT74000002 ER PT J AU KUO, SM CHEN, J AF KUO, SM CHEN, J TI ANALYSIS OF FINITE-LENGTH ACOUSTIC ECHO CANCELLATION SYSTEM SO SPEECH COMMUNICATION LA English DT Article DE ACOUSTIC ECHO CANCELLATION; ADAPTIVE FIR FILTER; PERFORMANCE ANALYSIS OF FINITE LENGTH AEC AB The acoustic echo canceler using an adaptive transversal filter and the least mean-square (LMS) algorithm is the most effective technique to reduce acoustic echoes in a hands-free telephone system. However, the requirement of a very high order filter for each microphone results in difficulties in convergence and hardware implementation. In this paper, the performance of the finite length adaptive filter is studied. A formula which relates the echo cancellation to the filter size, N, is established. Detailed analysis shows that this finite filter length will have better performance using speech than white noise. RP KUO, SM (reprint author), NO ILLINOIS UNIV, DEPT ELECT ENGN, DE KALB, IL 60115 USA. 
CR CARAISCOS C, 1984, IEEE T ACOUST SPEECH, V32, P34, DOI 10.1109/TASSP.1984.1164286 GITLIN RD, 1973, IEEE T CIRCUITS SYST, VCT20, P125 OIKAWA H, 1988, IEEE INT S CIRCUITS, P1329 Widrow B, 1985, ADAPTIVE SIGNAL PROC NR 4 TC 5 Z9 5 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD APR PY 1995 VL 16 IS 3 BP 255 EP 260 DI 10.1016/0167-6393(94)00057-H PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA QT740 UT WOS:A1995QT74000003 ER PT J AU GONG, YF AF GONG, YF TI SPEECH RECOGNITION IN NOISY ENVIRONMENTS - A SURVEY SO SPEECH COMMUNICATION LA English DT Review DE SURVEY; NOISY SPEECH RECOGNITION; PARAMETRIZATION; SPEECH ENHANCEMENT; COMPENSATION FOR NOISE ID HIDDEN MARKOV-MODELS; SPECTRAL AMPLITUDE ESTIMATOR; ISOLATED WORD RECOGNITION; MAXIMUM-LIKELIHOOD; EM ALGORITHM; SPEAKER ADAPTATION; CEPSTRAL ANALYSIS; ENHANCEMENT; REPRESENTATIONS; VERIFICATION AB The performance levels of most current speech recognizers degrade significantly when environmental noise occurs during use. Such performance degradation is mainly caused by mismatches in training and operating environments. During recent years much effort has been directed to reducing this mismatch. This paper surveys research results in the area of digital techniques for single microphone noisy speech recognition classified in three categories: noise resistant features and similarity measurement, speech enhancement, and speech model compensation for noise. The survey indicates that the essential points in noisy speech recognition consist of incorporating time and frequency correlations, giving more importance to high SNR portions of speech in decision making, exploiting task-specific a priori knowledge both of speech and of noise, using class-dependent processing, and including auditory models in speech processing. 
C1 BROADCAST TECHNOL RES BRANCH, COMMUN RES CTR, DEPT COMMUN, OTTAWA, ON, CANADA. RP GONG, YF (reprint author), INRIA LORRAINE, CRIN, CNRS, F-54506 NANCY, FRANCE. CR ACERO A, 1990, 1990 INT C SPEECH LA, P1121 Acero A, 1993, ACOUSTICAL ENV ROBUS ACERO A, 1990, 1990 P IEEE INT C AC, P849 ACERO A, 1992, 1992 ESCA WORKSH P S, P89 ALEXANDRE P, 1993, 1993 P EUR C SPEECH, V2, P1255 ALEXANDRE P, 1993, SPEECH COMMUN, V12, P277, DOI 10.1016/0167-6393(93)90099-7 ALEXANDRE P, 1993, 1993 P IEEE INT C AC, V2, P99 ANASTASAKOS A, 1994, 1994 P IEEE INT C AC, V1, P433 Anderson TW, 1984, INTRO MULTIVARIATE S ANGLADE Y, 1993, IEEE T ACOUST SPEECH, V2, P279, DOI 10.1109/ICASSP.1993.319290 APPLEBAUM TH, 1991, 1991 P IEEE INT C AC, P985 ARSLAN LM, 1994, 1994 P IEEE INT C AC, V2, P4548 ATAL BS, 1974, J ACOUST SOC AM, V55, P1304, DOI 10.1121/1.1914702 AUBERT X, 1993, 1993 P IEEE INT C AC, V2, P648 BARBIER L, 1990, 5TH P EUR SIGN PROC, V2, P1111 BARBIER L, 1991, 1991 P IEEE INT C AC, P145 BATEMAN DC, 1992, 1992 P IEEE INT C AC, V1, P241 BEATTIE VL, 1991, 1991 P IEEE INT C AC, V2, P917 BEATTIE VL, 1992, 1992 INT C SPEECH LA, V1, P519 BEROUTI M, 1979, APR P IEEE INT C AC, P208 BERSTEIN AD, 1991, 1991 P IEEE INT C AC, P913 BLANCHET M, 1992, EUSIPCO, V6, P391 BOLL S, 1992, ADV SPEECH SIGNAL PR, pCH10 BOLL SF, 1979, IEEE T ACOUST SPEECH, V27, P113, DOI 10.1109/TASSP.1979.1163209 BOUGHAZALE SE, 1994, 1994 P IEEE INT C AC, V1, P413 CARDIN R, 1993, 1993 P IEEE INT C AC, V2, P243 CARLSON BA, 1991, 1991 P IEEE INT C AC, P921 CARLSON BA, 1992, 1992 P IEEE INT C AC, P237 CHEN YN, 1988, IEEE T ACOUST SPEECH, V36, P433, DOI 10.1109/29.1547 CHENG YM, 1991, IEEE T SIGNAL PROCES, V39, P1943, DOI 10.1109/78.134427 CHENG YM, 1992, 1992 P INT C SPOK LA, V1, P515 CHENG YM, 1992, 6TH P IEEE WORKSH ST, P436 COHEN J R, 1985, Journal of the Acoustical Society of America, V78, pS50, DOI 10.1121/1.2022857 COHEN J, 1986, 1986 P DSP WORKSH CA CROZIER PM, 1993, 1993 P EUR C SPEECH, V1, P231 CUNG HM, 1993, SPEECH COMMUN, 
V12, P267, DOI 10.1016/0167-6393(93)90098-6 DAS S, 1994, 1994 P IEEE INT C AC, V1, P21 DAS S, 1993, 1993 P IEEE INT C AC, V2, P71 DAUTRICH BA, 1983, IEEE T ACOUST SPEECH, V31, P793, DOI 10.1109/TASSP.1983.1164172 DAVIS SB, 1980, IEEE T ACOUST SPEECH, V28, P357, DOI 10.1109/TASSP.1980.1163420 DEMPSTER AP, 1977, J ROY STAT SOC B MET, V39, P1 DODDINGTON GR, 1989, 1989 P IEEE INT C AC, P556 DRUCKER H, 1968, IEEE T ACOUST SPEECH, VAU16, P165, DOI 10.1109/TAU.1968.1161979 Duda R. O., 1973, PATTERN CLASSIFICATI EPHRAIM Y, 1992, 1992 P IEEE INT C AC, V1, P289 EPHRAIM Y, 1990, APR P IEEE INT C AC, P829 EPHRAIM Y, 1985, IEEE T ACOUST SPEECH, V33, P443, DOI 10.1109/TASSP.1985.1164550 EPHRAIM Y, 1989, IEEE T ACOUST SPEECH, V37, P1856 EPHRAIM Y, 1992, IEEE T SIGNAL PROCES, V40, P725, DOI 10.1109/78.127947 EPHRAIM Y, 1992, P IEEE, V80, P1526, DOI 10.1109/5.168664 EPHRAIM Y, 1987, 1987 P IEEE INT C AC, P1324 EPHRAIM Y, 1984, IEEE T ACOUST SPEECH, V32, P1109, DOI 10.1109/TASSP.1984.1164453 ERELL A, 1990, 1990 P IEEE INT C AC, V2, P853 Erell A, 1993, IEEE T SPEECH AUDI P, V1, P84, DOI 10.1109/89.221370 ERELL A, 1993, IEEE T SPEECH AUDIO, V1 FEDER M, 1989, IEEE T ACOUST SPEECH, V37, P204, DOI 10.1109/29.21683 Ferguson J. D., 1980, P S APPL HIDD MARK M, P143 Frazier R. 
H., 1976, 1976 IEEE International Conference on Acoustics, Speech and Signal Processing FUKUNAGA K, 1990, INTRO STATISTICAL PA FURUI S, 1981, IEEE T ACOUST SPEECH, V29, P254, DOI 10.1109/TASSP.1981.1163530 FURUI S, 1990, 1990 P IEEE INT C AC, P789 FURUI S, 1992, 1992 ESCA WORKSH P S, P31 FURUI S, 1986, IEEE T ACOUST SPEECH, V34, P52, DOI 10.1109/TASSP.1986.1164788 FURUI S, 1989, 1989 P IEEE INT C AC, P286 FURUI S, 1989, IEEE T ACOUST SPEECH, V37, P1923, DOI 10.1109/29.45538 GALES MJF, 1993, TR154 CUEDFINFENG TE GALES MJF, 1992, 1992 P IEEE INT C AC, V1, P233 GALES MJF, 1993, SPEECH COMMUN, V12, P231, DOI 10.1016/0167-6393(93)90093-Z GALES MJF, 1993, FINFENGTR135 CAMBR U GALES MJF, 1993, 1993 P EUR C SPEECH, V2, P837 GAO Y, 1992, 1992 P ESCA WORKSH S GAO Y, 1994, 1994 P IEEE INT C AC, V2, P89 GAO Y, 1992, 1992 INT C SPEECH LA, V1, P73 GAO Y, 1993, 1993 P EUR C SPEECH, P1035 GAROFOLO JS, 1988, STRUCTURE FORMAT DAR GAUVAIN JL, 1992, SPEECH COMMUN, V11, P205, DOI 10.1016/0167-6393(92)90015-Y GHITZA O, 1987, COMPUTER SPEECH LANG, V2 GHITZA O, 1992, ADV SPEECH SIGNAL PR, P453 GHITZA O, 1988, 1988 P IEEE INT C AC, P91 GHITZA O, 1987, 1987 IEEE INT C AC S, P2372 GISH H, 1990, 1990 P IEEE INT C AC, V1, P117 GONG Y, 1993, 1993 P EUR C SPEECH, V3, P2227 GONG Y, 1994, 1994 P IEEE INT C AC, V1, P57 GONG Y, 1992, 1992 P INT C SPOK LA, V1, P377 GONG Y, 1993, CRCTN93002 COMM RES GRAF JT, 1993, 1993 P IEEE INT C AC, V2, P339 GU Y, 1989, 1989 P EUR C SPEECH, P258 GUAN C, 1993, 1993 P IEEE INT C AC, V2, P107 Haeb-Umbach R., 1992, 1992 IEEE INT C AC S, V1, P13 Hand D. J., 1981, DISCRIMINATION CLASS Hansen J. H. 
L., 1988, THESIS GEORGIA I TEC HANSEN JH, 1988, 1988 P IEEE INT C AC, P561 HANSEN JH, 1987, 1987 P IEEE INT C AC, P189 HANSEN JHL, 1991, IEEE T SIGNAL PROCES, V39, P795, DOI 10.1109/78.80901 HANSEN JHL, 1990, 1990 INT C SPEECH LA, P1125 HANSEN JHL, 1992, SIGNAL PROCESS, V6, P403 HANSEN JHL, 1989, 1989 P IEEE INT C AC, P266 HANSON BA, 1986, 1986 P IEEE INT C AC, P757 HANSON BA, 1993, 1993 P IEEE INT C AC, V2, P79 HANSON BA, 1990, 1990 P IEEE INT C AC, P857 HANSON BA, 1990, INT C SPEECH LANGUAG, P1117 HANSON BA, 1987, IEEE T ACOUST SPEECH, V35, P968, DOI 10.1109/TASSP.1987.1165241 HATON JP, 1993, 1993 P NATO ASI NEW HERMANSKY H, 1991, 1991 P EUR C SPEECH, P1367 HERMANSKY H, 1992, 1992 INT C SPEECH LA, V1, P85 HERMANSKY H, 1993, 1993 P IEEE INT C AC, V2, P83 HERMANSKY H, 1985, 1985 P IEEE INT C AC, V1, P509 HERNANDO J, 1994, 1994 P IEEE INT C AC, V2, P69 HIRSCH HG, 1991, 1991 P EUR C SPEECH, P413 HOLMES JN, 1986, 1986 P IEEE INT C AC, P741 HUANG XD, 1991, 1991 P DARPA WORKSH HUNT MJ, 1991, 1991 P IEEE INT C AC, P881 HUNT MJ, 1989, 1989 P IEEE INT C AC HUSH DR, 1993, IEEE SIGNAL PROC JAN, P8 ITAKURA F, 1975, IEEE T ACOUST SPEECH, VAS23, P67, DOI 10.1109/TASSP.1975.1162641 ITAKURA F, 1987, 1987 P IEEE INT C AC, P1257 JANKOWSKI C, 1990, 1990 P IEEE INT C AC JUANG BH, 1987, IEEE T ACOUST SPEECH, V35, P947 Juang B. H., 1991, Computer Speech and Language, V5, DOI 10.1016/0885-2308(91)90011-E JUANG BH, 1992, 1992 P IEEE INT C AC, V1, P301 JUANG BH, 1987, 1987 P IEEE INT C AC, P2368 JUANG BH, 1986, 1986 P IEEE INT C AC, P765 JUNQUA JC, 1992, 1992 ESCA WORKSH P S, P43 JUNQUA JC, 1989, 1989 P IEEE INT C AC, P476 JUNQUA JC, 1993, J ACOUST SOC AM, V93, P510, DOI 10.1121/1.405631 JUNQUA JC, 1990, 1990 P IEEE INT C AC, P841 KADIRKAMANATHAN.M, 1992, 1992 ESCA WORKSH P S, P187 KITAMURA T, 1992, 1992 INT C SPEECH LA, V1, P699 Klatt D. 
H., 1976, 1976 IEEE International Conference on Acoustics, Speech and Signal Processing KOBATAKE H, 1994, 1994 P IEEE INT C AC, V1, P425 KOBATAKE H, 1991, 1991 P IEEE INT C AC, P973 KOBAYASHI T, 1994, 1994 P IEEE INT C AC, V2, P57 KOEHLER J, 1994, 1994 P IEEE INT C AC, V1, P421 KOO B, 1989, 1989 P IEEE INT C AC, P349 LECOMTE I, 1989, 1989 P IEEE INT C AC, P512 LEE CH, 1991, IEEE T SIGNAL PROCES, V39, P806, DOI 10.1109/78.80902 LEE KF, 1988, 1988 P IEEE INT C AC, P123 LEFEBVRE C, 1992, 1992 INT C SPEECH LA, V1, P691 Levinson S. E., 1986, Computer Speech and Language, V1, DOI 10.1016/S0885-2308(86)80009-2 LIM JS, 1979, 1979 IEEE T AC SPEEC LIM JS, 1979, P IEEE, V67, P1586, DOI 10.1109/PROC.1979.11540 LIM JS, 1978, IEEE T ACOUST SPEECH, V26, P197 LIM JS, 1983, SPEECH ENHANCEMENT LIM JS, 1978, IEEE T ACOUST SPEECH, V26, P354 LIM JS, 1983, SPEECH ENHANCEMENT, P101 LINDE Y, 1980, IEEE T COMMUN, V28, P84, DOI 10.1109/TCOM.1980.1094577 LIPPMANN RP, 1987, 1987 P IEEE INT C AC, P705 Lippmann R.P., 1987, IEEE ASSP MAG, V3, P4 LIU FH, 1994, 1994 P IEEE INT C AC, V2, P61 LOCKWOOD P, 1991, 1991 P EUR LOCKWOOD P, 1994, 1994 P IEEE INT C AC, V1, P441 LOCKWOOD P, 1992, SPEECH COMMUN, V11, P215, DOI 10.1016/0167-6393(92)90016-Z MALACHLAN GJ, 1988, MIXTURE MODELS INFER MANSOUR D, 1988, 1988 P IEEE INT C AC, P36 MANSOUR D, 1989, IEEE T ACOUST SPEECH, V37, P1659, DOI 10.1109/29.46548 MANSOUR D, 1989, IEEE T ACOUST SPEECH, V37, P795, DOI 10.1109/ASSP.1989.28053 MANSOUR D, 1988, 1988 P IEEE INT C AC, P525 Markel JD, 1976, LINEAR PREDICTION SP MARTIN F, 1993, 1993 P EUR C SPEECH, V2, P1031 MATSUMOTO H, 1986, 1986 P IEEE INT C AC MATSUOKA T, 1992, 1992 INT C SPEECH LA, V1, P373 MCAULAY RJ, 1980, IEEE T ACOUST SPEECH, V28, P137, DOI 10.1109/TASSP.1980.1163394 MELLOR BA, 1993, 1993 P IEEE INT C AC, V2, P87 MENA JG, 1990, SIGNAL PROCESS, V5, P1191 MIZUTA S, 1992, 1992 INT C SPEECH LA, V2, P1519 MOKBEL C, 1992, 1992 INT C SPEECH LA, V1, P707 MOKBEL C, 1993, 1993 P EUR C SPEECH, V2, P1247 
MOKBEL C, 1991, 1991 P IEEE INT C AC MOKBEL C, 1992, 1992 ESCA WORKSH P S, P211 MOKBEL C, 1991, 1991 P IEEE INT C AC, P925 MORENO P, 1994, 1994 P IEEE INT C AC, V1, P109 MORGAN N, 1992, 1992 ESCA WORKSH P S, P115 MORII S, 1990, 1990 INT C SPEECH LA, P1145 MURVEIT H, 1992, 1992 P DARPA WORKSH NADAS A, 1989, IEEE T ACOUST SPEECH, V37, P1495, DOI 10.1109/29.35387 NAKAMURA S, 1990, 1990 P IEEE INT C AC, P157 Nakamura S., 1989, ICASSP-89: 1989 International Conference on Acoustics, Speech and Signal Processing (IEEE Cat. No.89CH2673-2), DOI 10.1109/ICASSP.1989.266370 NAKAMURA S, 1993, 1993 P EUR C SPEECH, V2, P1045 NANDKUMAR S, 1994, 1994 P IEEE INT C AC, V1, P1 NANDKUMAR S, 1992, 1992 P IEEE INT C AC, P297 NG K, 1992, 1992 P IEEE INT C AC, V2, P109 NICOL N, 1992, 1992 ESCA WORKSH P S, P183 NOCERINO N, 1985, 1985 P IEEE INT C AC, P25 NOLAZCOFLORES JA, 1994, 1994 P IEEE INT C AC, V1, P409 NOLAZCOFLORES JA, 1993, 1993 P EUR C SPEECH, V2, P829 OHKURA K, 1993, 1993 P IEEE INT C AC, V2, P75 OHKURA K, 1992, 1992 INT C SPEECH LA, V1, P369 OHKURA K, 1991, 1991 P IEEE INT C AC, P929 OPENSHAW JP, 1994, 1994 P IEEE INT C AC, V2, P49 OSHAUGHNESSY D, 1988, APR P IEEE INT C AC, P549 OSHAUGHNESSY D, 1989, IEEE COMMUN MAG, V27, P46, DOI 10.1109/35.17653 Paliwal K. 
K., 1982, Speech Communication, V1, DOI 10.1016/0167-6393(82)90034-6 PALIWAL KK, 1990, 1990 P IEEE INT C AC, P429 Papoulis A, 1991, PROBABILITY RANDOM V, V3rd PERLMUTTER YM, 1977, P IEEE INT C ACOUSTI, P212 PISONI DB, 1985, 1985 P IEEE INT C AC, P1581 PORTER JE, 1984, 1984 P IEEE INT C AC, V2 QUATIERI TF, 1990, 1990 P IEEE INT C AC, P821 RABINER LR, 1989, P IEEE, V77, P257, DOI 10.1109/5.18626 RAHIM MG, 1994, 1994 P IEEE INT C AC, V1, P445 RAMALHO MA, 1994, 1994 P IEEE INT C AC, V1, P29 REDNER RA, 1984, SIAM REV, V26, P195, DOI 10.1137/1026034 ROE DB, 1987, 1987 P IEEE INT C AC, P1139 Rose RC, 1994, IEEE T SPEECH AUDI P, V2, P245, DOI 10.1109/89.279273 ROTH R, 1993, 1993 P IEEE INT C AC, V2, P640 RUSSELL MJ, 1990, 1990 P IEEE INT C AC, P69 RUSSELL MJ, 1987, 1887 P IEEE INT C AC, P2376 SALAVEDRA JM, 1993, 1993 P EUR C SPEECH, V1, P223 SCHROEDE.MR, 1974, J ACOUST SOC AM, V55, P1055, DOI 10.1121/1.1914647 SEIDE F, 1994, 1994 P IEEE INT C AC, V2, P85 SENEFF S, 1988, J PHONETICS, V16, P55 SHAMMA SA, 1985, J ACOUST SOC AM, V78, P1622, DOI 10.1121/1.392800 SHEIKHZADEN B, 1994, 1994 P IEEE INT C AC, V1, P13 SHIKANO K, 1986, 1986 P IEEE INT C AC, P2643 SIOHAN O, 1994, 1994 P INT C SPOK LA SIOHAN O, 1993, 1993 P EUR C SPEECH, V3, P1639 SMOLDERS J, 1994, 1994 P IEEE INT C AC, V1, P429 SOONG FK, 1987, 1987 P IEEE INT C AC, P625 SORENSEN HBD, 1993, 1993 P EUR C SPEECH, V1, P235 SORENSEN HBD, 1994, 1994 P IEEE INT C AC, V2, P657 Sorenson H., 1980, PARAMETER ESTIMATION STERN RM, 1990, 1990 P DARPA WORKSH, P311 STERN RM, 1987, IEEE T ACOUST SPEECH, V35 STERN RM, 1992, 1992 INT C SPEECH LA, V1, P695 TAMURA S, 1988, 1988 P IEEE INT C AC, P553 TAMURA S, 1989, 1989 P IEEE INT C AC, P2001 TITTERINGTON DM, 1985, STATISTICAL ANAL FIN TOHKURA Y, 1987, IEEE T ACOUST SPEECH, V35, P1414, DOI 10.1109/TASSP.1987.1165058 TREURNIET WC, 1994, 1994 P IEEE INT C AC, V1, P437 TROMPF M, 1993, 1993 P EUR C SPEECH, V2, P1039 TSENG HP, 1987, 1987 P IEEE INT C AC, P641 USAGAWA T, 1994, 1994 P IEEE INT C 
AC, V2, P81 VANCOMPERNOLLE D, 1989, 1989 P IEEE INT C AC, P258 Van Compernolle D., 1989, Computer Speech and Language, V3, DOI 10.1016/0885-2308(89)90027-2 VANCOMPERNOLLE D, 1987, 1987 P IEEE INT C AC, P1143 VANCOMPERNOLLE D, 1992, 1992 ESCA WORKSH P S, P21 VARGA AP, 1989, 1989 P EUR C SPEECH VARGA AP, 1990, APR P IEEE INT C AC, P845 VARGA AP, 1992, NOISEX 92 STUDY EFFE VASEGHI SV, 1993, 1993 P EUR C SPEECH, V2, P1023 VASEGHI SV, 1994, 1994 P IEEE INT C AC, V2, P65 WANG K, 1993, 1993 P IEEE INT C AC, V2, P335 WHIPPLE G, 1994, 1994 P IEEE INT C AC, V1, P5 WOOD L, 1991, 1991 P IEEE INT C AC, P181 XIE F, 1994, 1994 P IEEE INT C AC, V2, P53 XIE F, 1993, 1993 P EUR C SPEECH, P617 YANG XW, 1992, IEEE T INFORM THEORY, V38, P824, DOI 10.1109/18.119739 YOUNG SJ, 1992, 1992 ESCA WORKSH P S, P123 NR 248 TC 248 Z9 258 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD APR PY 1995 VL 16 IS 3 BP 261 EP 291 DI 10.1016/0167-6393(94)00059-J PG 31 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA QT740 UT WOS:A1995QT74000004 ER PT J AU GERARD, C DAHAN, D AF GERARD, C DAHAN, D TI DURATIONAL VARIATIONS IN SPEECH AND DIDACTIC ACCENT DURING READING SO SPEECH COMMUNICATION LA English DT Article ID INTONATION; COMPREHENSION; INFORMATION; VARIABLES; LOCATION; ENGLISH; STRESS; FRENCH; FOCUS AB The aim of this research is to analyze the main durational changes occurring in the spoken string when a reader produces an emphatic accent (''didactic accent'') on target-words included in texts, and to verify whether these changes are organized into structured prosodic forms. Three experiments address the following questions: (1) Are the durational changes concentrated in the immediate vicinity of the target (last syllable and pause preceding the target, duration of enunciation of the target, pause following the target)? 
(2) Are there intercorrelations among the durational variations observed? (3) Does the typographical realization of the target produce different effects on time changes? (4) Does the semantic weight of the targets change the way in which speakers produce didactic accents? Two speakers had to read 16 different texts (each text once with and once without target-words). After a description of the amplitude and location of durational variations in their performance, analyses are performed in order to locate precisely where slowing down occurs. The conclusions are qualified by a comparison of the two speakers' strategies. Pre-target variables are intercorrelated, but post-target pauses vary independently. Speakers are insensitive to typographical and semantic determinants. The results are compatible with the hypothesis that mental representations of prosodic forms govern the temporal structure of speech in reading aloud, and they show the importance of cognitive determinants during continuous reading. RP GERARD, C (reprint author), UNIV PARIS 05, PSYCHOL EXPTL LAB, CNRS, URA 316, 28 RUE SERPENTE, F-75270 PARIS 06, FRANCE. 
CR BOCK JK, 1983, MEM COGNITION, V11, P64, DOI 10.3758/BF03197663 BOLINGER DL, 1961, LANGUAGE, V37, P83, DOI 10.2307/411252 Calliope, 1989, PAROLE SON TRAITEMEN CHAFE WL, 1974, LANGUAGE, V50, P111, DOI 10.2307/412014 COOPER WE, 1985, J ACOUST SOC AM, V77, P2142, DOI 10.1121/1.392372 COZANNET A, 1991, MESSAGE SPEECH NOTE CUTLER A, 1981, PERCEPT PSYCHOPHYS, V29, P217, DOI 10.3758/BF03207288 CUTLER A, 1987, LANGUAGE PERCEPTION, P23 CUTLER A, 1979, COGNITION, V7, P49, DOI 10.1016/0010-0277(79)90010-6 DAHAN D, UNPUB PROSODIC CORRE DAHAN D, 1994, 20EMES P JOURN ETUD DAHAN D, 1994, J PHYS IV, V4, P501, DOI 10.1051/jp4:19945106 Dell F., 1984, FORME SONORE LANGAGE, P65 EADY SJ, 1986, J ACOUST SOC AM, V80, P402, DOI 10.1121/1.394091 FOLKINS JW, 1975, J SPEECH HEAR RES, V18, P739 Fonagy I., 1980, ACCENT FRANCAIS CONT, P123 Garde P., 1968, ACCENT GERARD C, 1994, 20EM P JOURN ET PAR GERARD C, 1992, MUSIC PERCEPT, V10, P93 GERARD C, 1992, 19EM P JOURN ET PAR, P507 GERARD C, 1994, J PHYS IV, V4, P505, DOI 10.1051/jp4:19945107 GERARD C, 1991, CNET891B226 RAPP CON GOLDMANE.F, 1972, LANG SPEECH, V15, P103 GOLDMANEISLER J, 1972, B PSYCHOL, V26, P383 GROSJEAN F, 1973, PHONETICA, V28, P191 GROSJEAN F, 1980, TEMPORAL VARIABLES S, P85 GROSJEAN F, 1975, PHONETICA, V31, P144 GROSJEAN F, 1972, PHONETICA, V26, P129 HOWELL P, 1991, SPEECH COMMUN, V10, P163, DOI 10.1016/0167-6393(91)90039-V JAKOBI JM, 1988, REV INT PSYCHOL SOC, V1, P345 JAYEZ J, 1991, CAHIERS LINGUISTIQUE, V10, P147 Kowal S., 1980, TEMPORAL VARIABLES S, P61 LENOUVEAU N, 1987, MEMOIRE DEA PSYCHOL LIEBERMAN P, 1960, J ACOUST SOC AM, V32, P451, DOI 10.1121/1.1908095 LUCCI V, 1983, PUBLICATION U LANGUE MARTIN JG, 1972, PSYCHOL REV, V79, P487, DOI 10.1037/h0033467 MARTIN JG, 1979, J ACOUST SOC AM, V65, P1286, DOI 10.1121/1.382797 NEEDHAM WP, 1990, J MEM LANG, V29, P455, DOI 10.1016/0749-596X(90)90066-9 Nooteboom S.G., 1978, STUDIES PERCEPTION L, P75 Pasdeloup V, 1990, THESIS U PROVENCE AI PASDELOUP V, 1984, MEMOIRE MAITRISE LIN 
PASDELOUP V, 1989, J ACOUSTIQUE, V2, P47 PASDELOUP V, 1988, 17EM P JOURN ET PAR PIERREL JP, 1993, COURRIER CNRS, V79, P11 Rossi M, 1980, ACCENT FRANCAIS CONT, P13 ROSSI M, 1987, ASPECTS PROSODIQUES, V66, P20 ROSSI M, 1985, PHONETICA, V42, P135 ROSSI M, 1988, JOURNEES NATIONALES, P63 SAINTBONNET M, 1977, 8EM P JOURN ET PAR A, P337 SEGUINOT A, 1977, ACCENT INSISTANCE EM, V12, P1 SORIN C, 1989, PSYCHOACOUSTIQUE PER, P123 WEISMER G, 1979, J SPEECH HEAR RES, V22, P516 NR 52 TC 1 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD APR PY 1995 VL 16 IS 3 BP 293 EP 311 DI 10.1016/0167-6393(94)00060-N PG 19 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA QT740 UT WOS:A1995QT74000005 ER PT J AU MOULINES, E SAGISAKA, Y AF MOULINES, E SAGISAKA, Y TI VOICE CONVERSION - STATE-OF-THE-ART AND PERSPECTIVES SO SPEECH COMMUNICATION LA English DT Editorial Material C1 ATR, INTERPRETING TELECOMMUN RES LABS, SEIKA, KYOTO 61902, JAPAN. RP MOULINES, E (reprint author), TELECOM PARIS, DEPT SIGNAL PROC, PARIS, FRANCE. NR 0 TC 12 Z9 15 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1995 VL 16 IS 2 BP 125 EP 126 DI 10.1016/0167-6393(95)90054-3 PG 2 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA QP300 UT WOS:A1995QP30000001 ER PT J AU CHILDERS, DG AF CHILDERS, DG TI GLOTTAL SOURCE MODELING FOR VOICE CONVERSION SO SPEECH COMMUNICATION LA English DT Article DE SPEECH SYNTHESIS; GLOTTAL SOURCE; GLOTTAL EXCITATION; GLOTTAL FLOW; VOLUME VELOCITY; VOICE CONVERSION ID SPEECH SYNTHESIS; VECTOR QUANTIZATION; VOCAL QUALITY; FEMALE; PERCEPTION AB This paper describes recent advances in glottal source modeling for speech synthesis. 
In particular, two procedures for modeling the glottal excitation waveform are described and applied to voice conversion. One model uses a polynomial to represent the glottal excitation waveform for one pitch period. The coefficients of the polynomial model form a vector that is used to design a glottal excitation codebook with 32 entries for voiced excitation. The codebook is designed and trained using two sentences spoken by different speakers. Speech is synthesized using a quantized glottal excitation waveform for one speaker as the excitation for a glottal excitation linear predictive (GELP) synthesizer designed using tract parameters obtained from the speech of another speaker. Our implementation of the LP synthesizer is patterned after both a pitch-excited LP speech synthesizer and a code excited linear predictive (CELP) speech coder. In addition to the glottal excitation codebook, we use a stochastic codebook with 256 entries for unvoiced noise excitation. Analysis techniques are described for constructing both codebooks. The GELP synthesizer, which resynthesizes speech with high quality, provides the speech scientist with a simple speech synthesis procedure that uses established analysis techniques, that is able to reproduce all speech sounds, and yet also has an excitation model waveform that is related to the derivative of the glottal flow and the integral of the residue. Another approach uses the LF glottal volume-velocity waveform to model the characteristics of three voice types: modal, breathy, and vocal fry (creaky). We then convert a modal voice to sound like a breathy or vocal fry voice using the vocal tract characteristics for modal voice and the glottal volume-velocity waveform model for breathy and vocal fry voices as the excitation. RP CHILDERS, DG (reprint author), UNIV FLORIDA, DEPT ELECT ENGN, 405 CSE, GAINESVILLE, FL 32611 USA. 
CR ALLEN DR, 1985, J ACOUST SOC AM, V78, P58, DOI 10.1121/1.392454 ATAL BS, 1989, IEEE T ACOUST SPEECH, V1, P69 BUZO A, 1980, IEEE T ACOUST SPEECH, V28, P562, DOI 10.1109/TASSP.1980.1163445 CARLSON R, 1991, SPEECH COMMUN, V10, P481, DOI 10.1016/0167-6393(91)90051-T CHILDERS DG, 1995, IN PRESS J ACOUS JAN CHILDERS DG, 1990, SPEECH COMMUN, V9, P97, DOI 10.1016/0167-6393(90)90064-G CHILDERS DG, 1989, SPEECH COMMUN, V8, P147, DOI 10.1016/0167-6393(89)90041-1 CHILDERS DG, 1994, J ACOUST SOC AM, V96, P2026, DOI 10.1121/1.411319 CHILDERS DG, 1994, IEEE T BIO-MED ENG, V41, P663, DOI 10.1109/10.301733 CHILDERS DG, 1991, J ACOUST SOC AM, V90, P2394, DOI 10.1121/1.402044 CHILDERS DG, 1991, J ACOUST SOC AM, V90, P1841, DOI 10.1121/1.401664 ESKENAZI L, 1990, J SPEECH HEAR RES, V33, P298 FANT G, 1988, STLQPSR23, P1 FANT G, 1985, SPEECH TRANSMISSION, P1 FANT G, 1993, SPEECH COMMUN, V13, P7, DOI 10.1016/0167-6393(93)90055-P FANT G, 1982, STLQPSR23 ROYAL I TE, P1 FUJISAKI H, 1986, IEEE T ACOUST SPEECH, V3, P1605 GOBL C, 1989, STLQPSR4 ROYAL I TEC, P9 KANG GS, 1985, IEEE T ACOUST SPEECH, V33, P377, DOI 10.1109/TASSP.1985.1164556 KARLSSON I, 1992, SPEECH COMMUN, V11, P491, DOI 10.1016/0167-6393(92)90056-D KARLSSON I, 1990, P INT C SPOKEN LANGU, V1, P69 KARLSSON I, 1991, J PHONETICS, V19, P111 KARLSSON I, 1988, 88 P SPEECH M FASE S, V1, P225 KARLSSON I, 1986, J PHONETICS, V14, P415 KLATT DH, 1987, J ACOUST SOC AM, V82, P737, DOI 10.1121/1.395275 KLATT DH, 1980, J ACOUST SOC AM, V67, P971, DOI 10.1121/1.383940 KLATT DH, 1990, J ACOUST SOC AM, V87, P820, DOI 10.1121/1.398894 Kleijn WB, 1993, IEEE T SPEECH AUDI P, V1, P386, DOI 10.1109/89.242484 MA CK, 1991, ELECTRON LETT, V27, P1772, DOI 10.1049/el:19911102 MILENKOVIC PH, 1993, J ACOUST SOC AM, V93, P1087, DOI 10.1121/1.405557 NING T, 1990, INT C AC SPEECH SIGN, V5, P2523 OLIVE JP, 1992, J ACOUST SOC AM, V92, P1837, DOI 10.1121/1.403840 PINTO NB, 1989, IEEE T ACOUST SPEECH, V37, P1887 Rothenberg M., 1981, VOCAL FOLD PHYSL, P305 Saito 
S., 1985, FUNDAMENTALS SPEECH Savic M., 1991, Digital Signal Processing, V1, DOI 10.1016/1051-2004(91)90099-7 SCHULTHEISS M, 1989, IEEE T ACOUST SPEECH, V1, P152 TRANCOSO IM, 1990, SPEECH COMMUN, V9, P389, DOI 10.1016/0167-6393(90)90016-3 VALBRET H, 1992, SPEECH COMMUN, V11, P175, DOI 10.1016/0167-6393(92)90012-V YEA JJ, 1983, IEEE T ACOUST SPEECH, V3, P1332 NR 40 TC 24 Z9 26 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1995 VL 16 IS 2 BP 127 EP 138 DI 10.1016/0167-6393(94)00050-K PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA QP300 UT WOS:A1995QP30000002 ER PT J AU IWAHASHI, N SAGISAKA, Y AF IWAHASHI, N SAGISAKA, Y TI SPEECH SPECTRUM CONVERSION BASED ON SPEAKER INTERPOLATION AND MULTIFUNCTIONAL REPRESENTATION WITH WEIGHTING BY RADIAL BASIS FUNCTION NETWORKS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH SPECTRUM CONVERSION; SPEAKER ADAPTATION; VOICE CONVERSION; SPEAKER INTERPOLATION; MULTIPLE FUNCTIONAL REPRESENTATION; RADIAL BASIS FUNCTION ID VOICE; FEMALE AB This paper describes a speech spectrum transformation method by interpolating multi-speakers' spectral patterns and multi-functional representation with Radial Basis Function networks. The interpolation is carried out using spectral parameters between pre-stored multiple speakers' utterance data to generate new spectrum patterns. Adaptation to a target speaker can be performed by this interpolation, which uses only a small amount of training data to generate new speech spectrum sequences close to those of the target speaker. Moreover, to obtain more precise adaptation by using a larger amount of training data, the transformation is represented by multiple interpolating functions. The multiple functions' outputs are weighted-summed, using weighting values given by RBF networks. 
The parameters of this multi-functional transformation are adapted by the gradient descent method. Adaptation experiments were carried out using four pre-stored speakers' data. Using only one word spoken by the target speaker for training, the distance between the target speaker's spectrum and the spectrum generated by the single interpolating function was reduced by about 35% compared with the distance between the target speaker's spectrum and the spectrum of the pre-stored speaker closest to the target. Using ten training words, the reduction rate increased to 48% by the multi-functional transformation. C1 ATR, INTERPRETING TELECOMMUN RES LABS, KYOTO 61902, JAPAN. RP IWAHASHI, N (reprint author), SONY RES CTR, SHINAGAWA KU, 6-7-35 KITASHINAGAWA, TOKYO 141, JAPAN. CR ABE M, 1991, P IEEE INT C AC SPEE, P765, DOI 10.1109/ICASSP.1991.150451 Abe M., 1988, P ICASSP, P655 Atal B. S., 1983, Proceedings of ICASSP 83. IEEE International Conference on Acoustics, Speech and Signal Processing Broomhead D. S., 1988, 4148 ROYAL SIGN RAD CHILDERS DG, 1989, SPEECH COMMUN, V8, P147, DOI 10.1016/0167-6393(89)90041-1 HAKODA K, 1987, FALL P M AC SOC JAP, P213 IMAI S, 1980, IEICE 1980 JA, V63 Jacobs R. A., 1991, Neural Computation, V3, DOI 10.1162/neco.1991.3.1.79 KLATT DH, 1980, J ACOUST SOC AM, V67, P971, DOI 10.1121/1.383940 KLATT DH, 1990, J ACOUST SOC AM, V87, P820, DOI 10.1121/1.398894 KOSAKA T, 1993, P INT C ACOUST SPEEC, V2, P570 KUWABARA H, 1987, 11TH P INT C PHON SC, P281 PRICE PJ, 1989, SPEECH COMMUN, V8, P261, DOI 10.1016/0167-6393(89)90005-8 TAKAGI T, 1986, FALL P M AC SOC JAP, P145 TAN Y, 1989, P INT JOINT C NEURLA, V2, P439 VALBRET H, 1992, SPEECH COMMUN, V11, P175, DOI 10.1016/0167-6393(92)90012-V VISWANATHAN R, 1975, IEEE T ACOUST SPEECH, VAS23, P309, DOI 10.1109/TASSP.1975.1162675 NR 17 TC 25 Z9 28 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD FEB PY 1995 VL 16 IS 2 BP 139 EP 151 DI 10.1016/0167-6393(94)00051-B PG 13 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA QP300 UT WOS:A1995QP30000003 ER PT J AU MIZUNO, H ABE, M AF MIZUNO, H ABE, M TI VOICE CONVERSION ALGORITHM-BASED ON PIECEWISE-LINEAR CONVERSION RULES OF FORMANT FREQUENCY AND SPECTRUM TILT SO SPEECH COMMUNICATION LA English DT Article DE VOICE CONVERSION; FORMANT FREQUENCY; SPECTRAL INTENSITY; SPECTRUM TILT; PIECEWISE LINEAR; LISTENING TEST AB This article presents a new algorithm used in order to convert the speech of one speaker so that it sounds like that of another speaker. This algorithm flexibly converts voice quality using two major technical developments. Firstly, the modification of formant frequencies and spectral intensity using piecewise linear voice conversion rules. This enables the control of spectrum parameters in detail. The conversion rules are generated automatically for any pair of speakers. The reliability of the conversion rules is guaranteed because they are statistically generated using training data. Secondly, this algorithm provides the ability to produce speech with the desired formant structure by controlling formant frequencies, formant bandwidths and spectral intensity. Speech is iteratively modified in order to achieve the specified formant structure. Listening tests prove that the proposed algorithm converts speaker individuality while maintaining high speech quality. C1 NIPPON TELEGRAPH & TEL PUBL CORP, HUMAN INTERFACE LABS, YOKOSUKA, KANAGAWA 23803, JAPAN. CR ABE M, 1988, ICASSP, P565 CHILDERS DG, 1989, SPEECH COMMUN, V8, P147, DOI 10.1016/0167-6393(89)90041-1 Fant G., 1960, ACOUSTIC THEORY SPEE FRANAGAN JL, 1972, SPEECH ANALYSIS SYNT FURUI S, 1989, DIGITAL SPEECH PROCE, P97 Hamon C., 1989, ICASSP-89: 1989 International Conference on Acoustics, Speech and Signal Processing (IEEE Cat. 
No.89CH2673-2), DOI 10.1109/ICASSP.1989.266409 HAYASHI C, 1985, BEHAVIORMETRIKA ITOH K, 1982, T IEICE JAPAN A, V65, P101 KLATT DH, 1982, IEEE T ACOUST SPEECH, P1589 KLATT DH, 1990, J ACOUST SOC AM, V87, P820, DOI 10.1121/1.398894 KUWABARA H, 1987, ACUSTICA, V63, P120 LINDE Y, 1980, IEEE T COMMUN, V28, P84, DOI 10.1109/TCOM.1980.1094577 MARKEL JD, 1972, IEEE T ACOUST SPEECH, VAU20, P129, DOI 10.1109/TAU.1972.1162367 MATSUMOTO H, 1990, ICSLP 90, P161 MATSUMOT.H, 1973, IEEE T ACOUST SPEECH, VAU21, P428, DOI 10.1109/TAU.1973.1162507 Mizuno H., 1993, ICASSP-93. 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing (Cat. No.92CH3252-4), DOI 10.1109/ICASSP.1993.319267 SAKOE H, 1978, IEEE T ACOUST SPEECH, V26, P43, DOI 10.1109/TASSP.1978.1163055 VALBRET H, 1992, SPEECH COMMUN, V11, P175, DOI 10.1016/0167-6393(92)90012-V NR 18 TC 16 Z9 19 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1995 VL 16 IS 2 BP 153 EP 164 DI 10.1016/0167-6393(94)00052-C PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA QP300 UT WOS:A1995QP30000004 ER PT J AU KUWABARA, H SAGISAKA, Y AF KUWABARA, H SAGISAKA, Y TI ACOUSTIC CHARACTERISTICS OF SPEAKER INDIVIDUALITY - CONTROL AND CONVERSION SO SPEECH COMMUNICATION LA English DT Article DE SPEAKER CHARACTERISTICS; VOICE CONVERSION; SPECTRAL MAPPING; SPEECH SYNTHESIS; VOICE QUALITY CONTROL ID GENDER RECOGNITION; SYNTHESIS SYSTEM; SPEECH SYNTHESIS; VOCAL QUALITY; VOICE; FEMALE; PERCEPTION; FEATURES; MODEL AB This paper introduces some recent studies on voice quality control and conversion technologies. After briefly summarizing some basic scientific findings on the acoustic correlates of speech individuality, we review the latest developments in speech technologies related to voice control and speaker characteristic copying. 
The main focus is on a survey of non-parametric methods for spectral segmental characteristics mapping between speakers, introducing some different types of spectral mapping methods that have evolved in relation to the speaker adaptation techniques being developed in speech recognition research. C1 ATR, INTERPRETING TELECOMMUN LABS, KYOTO 61902, JAPAN. RP KUWABARA, H (reprint author), NISHI TOKYO UNIV, DEPT ELECTR & INFORMAT SCI, 2525 YATSUZAWA, YAMANASHI 40901, JAPAN. CR ABE M, 1991, J ACOUST SOC JAPAN E, V13, P131 Abe M., 1990, Journal of the Acoustical Society of Japan (E), V11 ABE M, 1989, P INT C ACOUST SPEEC, P592 BAVEGARD M, 1993, STL QPSR, V4, P43 CARLSON R, 1993, STL QPSR, V1, P1 CHILDERS DG, 1991, J ACOUST SOC AM, V90, P2394, DOI 10.1121/1.402044 CHILDERS DG, 1991, J ACOUST SOC AM, V90, P1841, DOI 10.1121/1.401664 COOK PR, 1993, COMPUT MUSIC J, V17, P30, DOI 10.2307/3680568 ESKENAZI L, 1990, J SPEECH HEAR RES, V33, P298 Fant G., 1960, ACOUSTIC THEORY SPEE FANT G, 1993, SPEECH COMMUN, V13, P7, DOI 10.1016/0167-6393(93)90055-P FURUI S, 1986, SPEECH COMMUN, V5, P183, DOI 10.1016/0167-6393(86)90007-5 Gobl C., 1989, STL QPSR, P9 GRADDOL D, 1983, LANG SPEECH, V26, P351 HARTMAN DE, 1976, J ACOUST SOC AM, V59, P713, DOI 10.1121/1.380894 ITOH K, 1982, IECE T A, V65, P101 IWAHASHI N, 1995, SPEECH COMMUN, V16, P139, DOI 10.1016/0167-6393(94)00051-B KARLSSON I, 1992, SPEECH COMMUN, V11, P491, DOI 10.1016/0167-6393(92)90056-D KARLSSON I, 1990, P INT C SPOKEN LANGU, V1, P69 KARLSSON I, 1991, J PHONETICS, V19, P111 KARLSSON I, 1992, THESIS ROYAL I TECH KARLSSON I, 1986, J PHONETICS, V14, P415 KARLSSON I, 1988, 7TH P SPEECH 88 FASE, V1, P225 KASUYA H, 1986, J PHONETICS, V14, P463 KASUYA H, 1986, SPEECH COMMUN, V5, P171, DOI 10.1016/0167-6393(86)90006-3 Kasuya H., 1983, Proceedings of ICASSP 83. 
IEEE International Conference on Acoustics, Speech and Signal Processing KLATT DH, 1990, J ACOUST SOC AM, V87, P820, DOI 10.1121/1.398894 KOSAKA T, 1993, P INT C ACOUST SPEEC, P570 KUWABARA H, 1991, SPEECH COMMUN, V10, P491, DOI 10.1016/0167-6393(91)90052-U KUWABARA H, 1984, SPEECH COMMUN, V3, P211, DOI 10.1016/0167-6393(84)90016-5 KUWABARA H, 1987, ACUSTICA, V63, P121 LALWANI AL, 1991, IEEE T ACOUST SPEECH, V1, P505 LASS NJ, 1978, J ACOUST SOC AM, V63, P1218, DOI 10.1121/1.381808 MALAH D, 1979, IEEE T ACOUST SPEECH, V27, P121, DOI 10.1109/TASSP.1979.1163210 Matsumoto H., 1994, Journal of the Acoustical Society of Japan, V50 Matsumoto H., 1993, Journal of the Acoustical Society of Japan (E), V14 MATSUMOT.H, 1973, IEEE T ACOUST SPEECH, VAU21, P428, DOI 10.1109/TAU.1973.1162507 MILENKOVIC PH, 1993, J ACOUST SOC AM, V93, P1087, DOI 10.1121/1.405557 MURRAY IR, 1993, J ACOUST SOC AM, V93, P1097, DOI 10.1121/1.405558 MUTA H, 1987, LARYNGEAL FUNCTION P, P463 Nakajima T., 1988, Journal of the Acoustical Society of Japan, V44 Nakamura S., 1989, Journal of the Acoustical Society of Japan, V45 NIIMI Y, 1987, P EUROPEAN C SPEECH, V2, P430 OLIVE JP, 1992, J ACOUST SOC AM, V92, P1837, DOI 10.1121/1.403840 PROSEK RA, 1987, J COMMUN DISORD, V20, P105, DOI 10.1016/0021-9924(87)90002-5 QUATIERI T, 1992, IEEE T SIGNAL PROCES, V3, P497 RODET X, 1987, P EUROPEAN C SPEECH, V1, P155 SATO H, 1974, IECE T A, V57, P23 Savic M., 1991, Digital Signal Processing, V1, DOI 10.1016/1051-2004(91)90099-7 SCHWARTZ MF, 1968, J ACOUST SOC AM, V44, P1736, DOI 10.1121/1.1911324 SHIKANO K, 1986, P INT C ACOUST SPEEC, P2642 SHIRAKI Y, 1989, IEICE T J D2, V72, P1118 SUZUKI T, 1985, J ACOUST SOC JAPAN, V41, P895 TAKAGI T, 1986, P INT C ACOUST SPEEC, P889 VALBRET H, 1992, SPEECH COMMUN, V11, P175, DOI 10.1016/0167-6393(92)90012-V WU K, 1991, J ACOUST SOC AM, V90, P1828, DOI 10.1121/1.401663 NR 56 TC 39 Z9 42 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 
1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1995 VL 16 IS 2 BP 165 EP 173 DI 10.1016/0167-6393(94)00053-D PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA QP300 UT WOS:A1995QP30000005 ER PT J AU MOULINES, E LAROCHE, J AF MOULINES, E LAROCHE, J TI NONPARAMETRIC TECHNIQUES FOR PITCH-SCALE AND TIME-SCALE MODIFICATION OF SPEECH SO SPEECH COMMUNICATION LA English DT Article DE PITCH-SCALE AND TIME-SCALE TRANSFORMATIONS; PHASE VOCODER; PSOLA ANALYSIS-SYNTHESIS; QUASI-HARMONIC MODEL ID FOURIER-TRANSFORM; SINUSOIDAL REPRESENTATION; SIGNAL RECONSTRUCTION; MAGNITUDE; PHASE; WAVE AB Time-scale and, to a lesser extent, pitch-scale modifications of speech and audio signals are the subject of major theoretical and practical interest. Applications are numerous, including, to name but a few, text-to-speech synthesis (based on acoustical unit concatenation), transformation of voice characteristics, foreign language learning but also audio monitoring or film/soundtrack post-synchronization. To fulfill the need for high-quality time- and pitch-scaling, a number of algorithms have been proposed recently, along with their real-time implementation, sometimes for very inexpensive hardware. It appears that most of these algorithms can be viewed as slight variations of a small number of basic schemes. This contribution reviews frequency-domain algorithms (phase-vocoder) and time-domain algorithms (Time-Domain Pitch-Synchronous Overlap/Add and the like) in the same framework. More recent variations of these schemes are also presented. RP MOULINES, E (reprint author), TELECOM PARIS, 46 RUE BARRAULT, F-75634 PARIS 13, FRANCE. 
CR ALLEN JB, 1982, IEEE T ACOUST SPEECH, V82, P1012 ALLEN JB, 1977, IEEE T ACOUST SPEECH, V25, P235, DOI 10.1109/TASSP.1977.1162950 ALMEIDA LB, 1984, P IEEE INT C ACOUST ATAL BS, 1971, J ACOUST SOC AM, V50, P637, DOI 10.1121/1.1912679 CROCHIERE RE, 1980, IEEE T ACOUST SPEECH, V28, P99, DOI 10.1109/TASSP.1980.1163353 Crochiere R. E., 1983, MULTIRATE DIGITAL SI DEPALLE P, 1991, THESIS U MAINE LE MA DOLSON M, 1986, COMPUT MUSIC J, V10, P14, DOI 10.2307/3680093 ELJAROUDI A, 1991, IEEE T SIGNAL PROCES, V39, P411, DOI 10.1109/78.80824 ELJAROUDI A, 1986, P IEEE WORKSH SPECTR, P29 FLANAGAN JL, 1966, AT&T TECH J, V45, P1493 Galas T., 1991, P EUROSPEECH GENOVA, P1085 GEORGE EB, 1992, J AUDIO ENG SOC, V40, P497 GRIFFIN D, 1988, IEEE T ACOUST SPEECH, V36, P236 GRIFFIN DW, 1984, IEEE T ACOUST SPEECH, V32, P236, DOI 10.1109/TASSP.1984.1164317 HAMON C, 1988, NTLAATSSRCP359 TECH HARDAM E, 1990, 90 P IEEE INT C ACOU, P409 HAYES MH, 1980, IEEE T ACOUST SPEECH, V28, P672, DOI 10.1109/TASSP.1980.1163463 LAROCHE J, 1993, IEEE ASSP WORKSH APP LAROCHE J, 1993, IEEE ASSP WORKSHOP A LAROCHE J, 1993, 93 P IEEE INT C ACOU Lim J., 1988, ADV TOPICS SIGNAL PR LUKASZEWICKZ K, 1987, 87 P IEEE INT C ACOU, P1426 Markel JD, 1976, LINEAR PREDICTION SP MARQUES JS, 1989, IEEE T ACOUST SPEECH, V37, P763 MCAULAY RJ, 1986, IEEE T ACOUST SPEECH, V34, P744, DOI 10.1109/TASSP.1986.1164910 MOULINES E, 1990, SPEECH COMMUN, V9, P453, DOI 10.1016/0167-6393(90)90021-Z NAWAB SH, 1988, ADV TOPICS SIGNAL PR, pCH6 NAWAB SH, 1983, IEEE T ACOUST SPEECH, V31, P986, DOI 10.1109/TASSP.1983.1164162 Oppenheim A. 
V., 1989, DISCRETE TIME SIGNAL POIROT G, 1988, P INT COMPUTER MUSIC PORTNOFF MR, 1981, IEEE T ACOUST SPEECH, V29, P364, DOI 10.1109/TASSP.1981.1163580 PORTNOFF MR, 1980, IEEE T ACOUST SPEECH, V28, P55, DOI 10.1109/TASSP.1980.1163359 Portnoff R., 1981, IEEE T ACOUST SPEECH, V29, P374 QUATIERI TF, 1986, IEEE T ACOUST SPEECH, V34, P1449, DOI 10.1109/TASSP.1986.1164985 ROUCOS S, 1985, 85 P IEEE INT C ACOU, P493 SENEFF S, 1982, IEEE T ACOUST SPEECH, V24, P358 SERRA X, 1990, COMPUT MUSIC J, V14, P12, DOI 10.2307/3680788 SYLVESTRE B, 1992, 92 P IEEE INT C ACOU, P81 VALBRET H, 1992, SPEECH COMMUN, V11, P175, DOI 10.1016/0167-6393(92)90012-V VERHELST W, 1993, 93 P IEEE INT C ACOU, P554 WAYMAN JL, 1988, IEEE T ACOUST SPEECH, V36, P139, DOI 10.1109/29.1505 NR 42 TC 80 Z9 87 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1995 VL 16 IS 2 BP 175 EP 205 DI 10.1016/0167-6393(94)00054-E PG 31 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA QP300 UT WOS:A1995QP30000006 ER PT J AU NARENDRANATH, M MURTHY, HA RAJENDRAN, S YEGNANARAYANA, B AF NARENDRANATH, M MURTHY, HA RAJENDRAN, S YEGNANARAYANA, B TI TRANSFORMATION OF FORMANTS FOR VOICE CONVERSION USING ARTIFICIAL NEURAL NETWORKS SO SPEECH COMMUNICATION LA English DT Article DE VOICE CONVERSION; SPEAKER CHARACTERISTICS; FORMANTS; MULTILAYER FEEDFORWARD NEURAL NETWORK ID PERCEPTION AB In this paper we propose a scheme for developing a voice conversion system that converts the speech signal uttered by a source speaker to a speech signal having the voice characteristics of the target speaker. In particular, we address the issue of transformation of the vocal tract system features from one speaker to another. Formants are used to represent the vocal tract system features and a formant vocoder is used for synthesis. 
The scheme consists of a formant analysis phase, followed by a learning phase in which the implicit formant transformation is captured by a neural network. The transformed formants together with the pitch contour modified to suit the average pitch of the target speaker are used to synthesize speech with the desired vocal tract system characteristics. C1 INDIAN INST TECHNOL, DEPT COMP SCI & ENGN, MADRAS 600036, TAMIL NADU, INDIA. CR Abe M., 1988, ICASSP 88: 1988 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.88CH2561-9), DOI 10.1109/ICASSP.1988.196671 ABE M, 1991, INT CONF ACOUST SPEE, P765, DOI 10.1109/ICASSP.1991.150451 ATAL BS, 1971, J ACOUST SOC AM, V50, P637, DOI 10.1121/1.1912679 CHILDERS DG, 1987, IEEE T ACOUST SPEECH, P293 CHILDERS DG, 1985, P IEEE INT C ACOUST CHILDERS DG, 1989, SPEECH COMMUN, V8, P147, DOI 10.1016/0167-6393(89)90041-1 CHILDERS DG, 1991, J ACOUST SOC AM, V90, P2394, DOI 10.1121/1.402044 FANT G, 1986, J PHONETICS, V14, P393 FANT G, 1991, SPEECH COMMUN, V10, P521, DOI 10.1016/0167-6393(91)90055-X HORNIK K, 1989, NEURAL NETWORKS, V2, P359, DOI 10.1016/0893-6080(89)90020-8 KLATT DH, 1990, J ACOUST SOC AM, V87, P820, DOI 10.1121/1.398894 MARKEL JD, 1972, IEEE T ACOUST SPEECH, VAU20, P367, DOI 10.1109/TAU.1972.1162410 MCCLELLAND TL, 1986, PARALLEL DISTRIBUTED MURTHY HA, 1991, SPEECH COMMUN, V10, P209, DOI 10.1016/0167-6393(91)90011-H RABIN LR, 1993, FUNDAMENTALS SPEECH Savic M., 1991, DIGIT SIGNAL PROCESS, V4, P107 SENEFF S, 1982, IEEE T ACOUST SPEECH, V30, P566, DOI 10.1109/TASSP.1982.1163919 VALBRET H, 1992, SPEECH COMMUN, V11, P175, DOI 10.1016/0167-6393(92)90012-V NR 18 TC 59 Z9 69 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD FEB PY 1995 VL 16 IS 2 BP 207 EP 216 DI 10.1016/0167-6393(94)00058-I PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA QP300 UT WOS:A1995QP30000007 ER PT J AU FLEGE, JE MUNRO, MJ MACKAY, IRA AF FLEGE, JE MUNRO, MJ MACKAY, IRA TI EFFECTS OF AGE OF 2ND-LANGUAGE LEARNING ON THE PRODUCTION OF ENGLISH CONSONANTS (VOL 16, PG 1, 1995) SO SPEECH COMMUNICATION LA English DT Correction CR FLEGE JE, 1995, SPEECH COMMUN, V16, P1, DOI 10.1016/0167-6393(94)00044-B NR 1 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1995 VL 16 IS 2 BP 217 EP 217 PG 1 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA QP300 UT WOS:A1995QP30000008 ER PT J AU FLEGE, JE MUNRO, MJ MACKAY, IRA AF FLEGE, JE MUNRO, MJ MACKAY, IRA TI EFFECTS OF AGE OF 2ND-LANGUAGE LEARNING ON THE PRODUCTION OF ENGLISH CONSONANTS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH PRODUCTION; 2ND-LANGUAGE; CONSONANTS; BILINGUALISM; PHONETIC INTERFERENCE; FOREIGN ACCENT; CRITICAL PERIOD; ITALIAN ID NATIVE SPEAKERS; SPEECH CONTRASTS; FOREIGN-LANGUAGE; STOP CONSONANTS; 2ND LANGUAGE; BILINGUALS; FRENCH; ACQUISITION; PERCEPTION; MANDARINE AB This study examined the production of English consonants by native speakers of Italian. The 240 adult native Italian speakers of English who participated had begun learning English when they emigrated to Canada between the ages of 2 and 23 years. Word-initial, word-medial and word-final tokens of English stops and fricatives were assessed through forced-choice judgments made by native English-speaking listeners, and acoustically. The native Italian subjects' ages of learning (AOL) English exerted a systematic effect on their production of English consonants even though they had lived in Canada for an average of 32 years, and reported speaking English more than Italian. 
In all but two instances, one or more native Italian subgroup defined on the basis of AOL differed significantly from subjects in a native English (NE) control group. The AOL of the first native Italian subgroup to differ from the NE subjects varied across consonant and syllable position. The results are discussed in terms of hypotheses proposed in the literature concerning the basis of segmental errors in L2 speech production. C1 UNIV OTTAWA, DEPT LINGUIST, OTTAWA, ON, CANADA. RP FLEGE, JE (reprint author), UNIV ALABAMA, DEPT BIOCOMMUN, VH 503, BIRMINGHAM, AL 35294 USA. CR ARNETANI E, 1991, SPEECH COMMUN, V5, P17 BERTINETTO P, 1979, J ITALIAN LING, V3, P97 BEST CT, 1992, J PHONETICS, V20, P305 BEST CT, 1988, J EXP PSYCHOL HUMAN, V14, P345, DOI 10.1037/0096-1523.14.3.345 BUSA MG, 1992, NEWS SOUNDS 92, P47 CALDOGNETTO EM, 1988, RIV ITALIANA ACUSTIC, V13, P121 CALDOGNETTO EM, 1979, ACTA PHONIATRICA LAT, V1, P219 CARAMAZZ.A, 1973, J ACOUST SOC AM, V54, P421, DOI 10.1121/1.1913594 ECKMAN F, 1993, 1ST 2ND LANGUAGE PHO, P251 FARNETANI E, 1991, 12TH P INT C PHON SC, P14 FARNETANI E, 1989, P INT C SPEECH RES FERRERO F, 1986, ASPETTI FONETICA COM, P155 FERRERO FE, 1979, FRONTIERS SPEECH COM, P159 FLEGE J, 1995, IN PRESS LANG SPEECH FLEGE J, 1995, IN PRESS STUD SEC LA FLEGE J, 1994, IN PRESS PHONETICA FLEGE J, 1995, IN PRESS SPEECH PERC Flege J., 1982, STUDIES 2 LANGUAGE A, V5, P1, DOI 10.1017/S0272263100004563 FLEGE J, 1995, IN PRESS VARIATION L FLEGE J, 1994, UNPUB J ACOUST SOC A FLEGE J, 1994, UNPUB PHONETICA Flege J. E., 1988, HUMAN COMMUNICATION, P224 Flege J. E., 1991, CROSSCURRENTS 2 LANG, P249, DOI 10.1075/lald.2.15fle BOHN OS, 1993, J PHONETICS, V21, P267 Flege J. E., 1987, APPLIED LINGUISTICS, V8, P162, DOI [10.1093/applin/8.2.162, DOI 10.1093/APPLIN/8.2.162] FLEGE JE, 1991, J ACOUST SOC AM, V89, P395, DOI 10.1121/1.400473 Flege J. 
E., 1992, PHONOLOGICAL DEV MOD, P565 FLEGE JE, 1992, J ACOUST SOC AM, V91, P370, DOI 10.1121/1.402780 Flege James E, 1992, INTELLIGIBILITY SPEE, P157 FLEGE JE, 1984, APPL PSYCHOLINGUIST, V5, P323, DOI 10.1017/S014271640000521X FLEGE JE, 1992, J ACOUST SOC AM, V92, P128, DOI 10.1121/1.404278 FLEGE JE, 1987, J ACOUST SOC AM, V82, P433, DOI 10.1121/1.395444 FLEGE JE, 1994, J ACOUST SOC AM, V95, P3623, DOI 10.1121/1.409931 FLEGE JE, 1987, J PHONETICS, V15, P47 FLEGE JE, 1987, J PHONETICS, V15, P203 FONDA C, 1984, 10TH P INT C PHON SC, P785 GATBONTON E, 1983, 2ND LANGUAGE LEARNIN, P240 HAMMARBERG B, 1988, STUDEN PHONOLOGIE ZW HANCINBHATT B, 1993, THESIS U UTAH HAZAN VL, 1993, LANG SPEECH, V36, P17 KOUTSOUDAS A, 1983, 2ND LANGAUGE LEARNIN LENNEBERG E, 1967, BIOL F LANGUAGE LISKER L, 1964, WORD, V20, P384 LISKER L, 1970, 6TH P INT C PHON SCI, P563 LOCKE JL, 1980, J SPEECH HEAR DISORD, V45, P445 LOCKE JL, 1980, J SPEECH HEAR DISORD, V45, P431 Long M., 1990, STUDIES 2ND LANGUAGE, V12, P251, DOI DOI 10.1017/S0272263100009165 MAJOR RC, 1992, NEW SOUNDS 92, P128 MILLER JL, 1989, PERCEPT PSYCHOPHYS, V46, P505, DOI 10.3758/BF03208147 MOROSAN DE, 1989, J SPEECH HEAR RES, V32, P501 MUNRO M, 1995, IN PRESS APPLIED PSY MUNRO MJ, 1993, LANG SPEECH, V36, P39 NEVILLE H J, 1992, Cerebral Cortex, V2, P244, DOI 10.1093/cercor/2.3.244 OYAMA S, 1976, J PSYCHOLINGUIST RES, V5, P261, DOI 10.1007/BF01067377 PARADIS M, 1995, IN PRESS NATIVE SPEA PENFIELD W, 1959, SPEECH BRAIN MECHANI PENG SH, 1993, PHONETICA, V50, P245 RITCHIE WC, 1968, LANG LEARN, V18, P183, DOI 10.1111/j.1467-1770.1968.tb00206.x SCHMIDT A, 1994, IN PRESS PHONETICA SCOVEL T, 1969, LANG LEARN, V19, P245, DOI 10.1111/j.1467-1770.1969.tb00466.x Scovel T., 1988, TIME SPEAK PSYCHOLIN SEGALOWITZ N, 1977, BILINGUALISM PSYCHOL, P77 Snodgrass JG, 1985, HUMAN EXPT PSYCHOL SUTER RW, 1976, LANG LEARN, V26, P233, DOI 10.1111/j.1467-1770.1976.tb00275.x Vagges K., 1978, J ITAL LING, V3, P69 VAYRA M, 1984, 10TH P INT C PHON SC, P541 
WEINBERGER S, 1990, NEW SOUNDS 90 WEINREICH U, 1957, WORD, V13, P1 Wenk B. J., 1979, INTERLANGUAGE STUDIE, V4, P202 WILLIAMS DA, 1971, BIOMETRICS, V27, P103, DOI 10.2307/2528930 WILLIAMS L, 1979, PERCEPT PSYCHOPHYS, V26, P95, DOI 10.3758/BF03208301 Yamada R. A., 1992, SPEECH PERCEPTION PR, P155 NR 72 TC 82 Z9 82 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JAN PY 1995 VL 16 IS 1 BP 1 EP 26 DI 10.1016/0167-6393(94)00044-B PG 26 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA QE680 UT WOS:A1995QE68000001 ER PT J AU BEAUTEMPS, D BADIN, P LABOISSIERE, R AF BEAUTEMPS, D BADIN, P LABOISSIERE, R TI DERIVING VOCAL-TRACT AREA FUNCTIONS FROM MIDSAGITTAL PROFILES AND FORMANT FREQUENCIES - A NEW MODEL FOR VOWELS AND FRICATIVE CONSONANTS BASED ON EXPERIMENTAL-DATA SO SPEECH COMMUNICATION LA English DT Article DE ARTICULATORY ACOUSTIC RELATIONSHIP; MIDSAGITTAL PROFILE; AREA FUNCTION; ARTICULATORY DATA ID SHAPE; DIMENSIONS; TONGUE; LIPS AB In order to achieve better understanding of the articulatory-acoustic relationships, more data are still very much needed. The two-fold aim of the present study was thus (1) to provide a set of coherent midsagittal functions, area functions and formant frequencies, for a small corpus of vowels and fricative consonants produced by one subject, and (2) to derive a midsagittal profile to area function conversion model optimised for this given subject. Simultaneous tomography and sound recording were available for the subject, as well as some complementary data such as lip geometry or casts of the hard palate. The model is based on Heinz and Stevens' A = alpha d(beta) area function model, modified so that alpha varies continuously along the vocal tract midline as a function of the midsagittal distance. 
The coefficients of the model have been determined with the help of an optimisation algorithm based on a gradient descent technique. The gradient of the error between actual and desired formant values was computed through a back-propagation network implementing both sagittal-to-area conversion and acoustic wave propagation. The fact that the model should work for sounds as different as vowels and consonants and be coherent at both midsagittal and acoustic levels ensures the reliability of the area functions determined in such a way. RP BEAUTEMPS, D (reprint author), UNIV GRENOBLE 3, INPG,INST COMMUN PARLEE,CNRS,URA 368, 46 AVE FELIX VIALLET, F-38031 GRENOBLE, FRANCE. RI Laboissiere, Rafael/E-9814-2013 OI Laboissiere, Rafael/0000-0002-2180-9250 CR ABRY C, 1986, SPEECH COMMUN, V5, P97, DOI 10.1016/0167-6393(86)90032-4 ATAL BS, 1978, J ACOUST SOC AM, V63, P1535, DOI 10.1121/1.381848 BADIN P, 1984, STL QPSR, P53 BADIN P, 1989, SPEECH TRANSMISSION, V3, P33 BADIN P, 1990, SC10147C EDB CEC REP, P5 BADIN P, IN PRESS J PHONETICS BADIN P, 1991, J PHONETICS, V19, P397 BAER T, 1991, J ACOUST SOC AM, V90, P799, DOI 10.1121/1.401949 BOTHOREL A, 1986, TRAVAUX I PHONETIQUE CARTER JN, 1990, APPLICATIONS DIGITAL, V13, P378 CASTELLI E, 1990, SC10147C EDB CEC REP, P35 Chiba T., 1941, VOWEL ITS NATURE STR Fant G., 1960, ACOUSTIC THEORY SPEE FANT G, 1992, 1992 P INT C SPOK LA, V1, P807 FANT G, 1964, 5TH P INT C PHON SCI, P120 HARDCASTLE WJ, 1991, J PHONETICS, V19, P251 HEINZ JM, 1965, 5TH P INT C AC HOLMES JN, 1981, 4TH P FASE S, P169 INGARD U, 1953, J ACOUST SOC AM, V25, P1037, DOI 10.1121/1.1907235 LABOISSIERE R, 1992, THESIS I NATIONAL PO LALLOUACHE MT, 1990, 18EMES P JOURN ET PA Lavrentiev MM, 1967, SOME IMPROPERLY POSE LINDBLOM BE, 1971, J ACOUST SOC AM, V50, P1166, DOI 10.1121/1.1912750 MAEDA S, 1989, MELANGES PHONETIQUE, P545 Maeda S, 1972, CONVERSION VOCAL TRA PERRIER P, 1992, J SPEECH HEAR RES, V35, P53 Rumelhart D. 
E., 1986, PARALLEL DISTRIBUTED, V1, P45 SCHROEDE.MR, 1967, J ACOUST SOC AM, V41, P1002, DOI 10.1121/1.1910429 SONDHI MM, 1971, J ACOUST SOC AM, V49, P1867, DOI 10.1121/1.1912593 SONDHI MM, 1983, J ACOUST SOC AM, V73, P985, DOI 10.1121/1.389024 STONE M, 1992, J PHONETICS, V20, P253 STONE M, 1988, J ACOUST SOC AM, V83, P1586, DOI 10.1121/1.395913 SUNDBERG J, 1990, J ACOUST SOC AM, V88, P1313, DOI 10.1121/1.399707 SUNDBERG J, 1969, SPEECH TRANSMISSION, V1, P33 SUNDBERG J, 1987, PHONETICA, V44, P76 SUNDBERG J, 1992, J ACOUST SOC AM, V91, P3478, DOI 10.1121/1.402836 VALLEE N, 1992, 19EMES P JOURN ET PA, P53 WU HY, 1987, IEEE INT C ACOUST SP, V1, P9 NR 38 TC 21 Z9 22 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JAN PY 1995 VL 16 IS 1 BP 27 EP 47 DI 10.1016/0167-6393(94)00045-C PG 21 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA QE680 UT WOS:A1995QE68000002 ER PT J AU LOFQVIST, A KOENIG, LL MCGOWAN, RS AF LOFQVIST, A KOENIG, LL MCGOWAN, RS TI VOCAL-TRACT AERODYNAMICS IN VERTICAL-BAR-ACA-VERTICAL-BAR UTTERANCES - MEASUREMENTS SO SPEECH COMMUNICATION LA English DT Article DE ARTICULATORY TIMING; SPEECH AERODYNAMICS; VOICE SOURCE PROPERTIES ID GLOTTAL AIR-FLOW; FUNDAMENTAL-FREQUENCY; AMERICAN ENGLISH; VOICE; FEMALE; FRICATIVES; WAVEFORM; SPEAKERS; CLUSTERS; SPEECH AB This paper examines air flow patterns at vowel-consonant and consonant-vowel transitions. Oral air flow was recorded in six speakers of American English producing reiterant speech. The air flow signal was inverse filtered to obtain an estimate of the glottal pulse. Measurements were made of peak and minimum flow, open quotient, pulse area and fundamental frequency. The results show that at the transitions between vowels and voiceless consonants the pulse properties show large variations. In particular, the source is characterized by a breathy mode of phonation. 
Breathiness was indexed by large values of peak and minimum flow, and an open quotient close to 1. The observed variations can be accounted for by the laryngeal adjustments that are made for voiceless consonants, in particular the glottal opening movement and its phasing with the oral articulatory events. Individual differences suggest that speakers vary in their use of the longitudinal tension of the vocal folds in controlling voicelessness. RP LOFQVIST, A (reprint author), HASKINS LABS INC, 270 CROWN ST, NEW HAVEN, CT 06511 USA. CR ABRAMSON AS, 1977, PHONETICA, V34, P295 ANANTHAPADMANAB.T, 1982, STL QPSR, V1, P1 ARKEBAUE.HJ, 1967, J SPEECH HEAR RES, V10, P196 BADIN P, 1990, STL QPSR, V1, P1 BIEVER D M, 1989, Journal of Voice, V3, P120, DOI 10.1016/S0892-1997(89)80138-9 CRANEN B, 1987, J ACOUST SOC AM, V81, P734, DOI 10.1121/1.394842 DEVETH J, 1990, 1990 INT C AC SPEECH, P301 DIXIT RP, 1989, J PHONETICS, V17, P213 FANT G, 1986, J PHONETICS, V14, P393 FANT G, 1961, 3RD P INT C AC STUTT Fant Gunnar, 1972, STL QPSR, V1, P1 Fant Gunnar, 1985, STL QPSR, V4, P1 Fex S., 1991, VOCAL FOLD PHYSL ACO, P273 GAUFFIN J, 1989, J SPEECH HEAR RES, V32, P556 GOBL C, 1988, STL QPSR, V1, P123 GOBL C, 1988, STL QPSR, V2, P23 HERTEGARD S, 1992, J VOICE, V6, P224, DOI 10.1016/S0892-1997(05)80147-X HOLMBERG EB, 1988, J ACOUST SOC AM, V84, P511, DOI 10.1121/1.396829 HOMBERT JM, 1979, LANGUAGE, V55, P37, DOI 10.2307/412518 ISSHIKI N, 1964, J SPEECH HEAR RES, V7, P233 KLATT DH, 1968, ANN NY ACAD SCI, V155, P42, DOI 10.1111/j.1749-6632.1968.tb56748.x KLATT DH, 1990, J ACOUST SOC AM, V87, P820, DOI 10.1121/1.398894 KOREMAN J, 1992, 1992 P INT C SPOK LA, V1, P125 KRISHNAMURTHY AK, 1986, IEEE T ACOUST SPEECH, V34, P730, DOI 10.1109/TASSP.1986.1164909 LISKER L, 1984, LANG SPEECH, V27, P163 LISKER L, 1986, LANG SPEECH, V29, P3 LOFQVIST A, 1984, SPEECH COMMUN, V3, P279, DOI 10.1016/0167-6393(84)90024-4 LOFQVIST A, 1980, J ACOUST SOC AM, V68, P792 LOFQVIST A, 1980, J PHONETICS, V8, P475 LOFQVIST A, 
1989, J ACOUST SOC AM, V85, P1314 Lofqvist A., 1981, NORD J LINGUIST, V4, P1 LOFQVIST A, 1992, J PHONETICS, V20, P93 LOFQVIST A, 1992, LANG SPEECH, V35, P15 MCGOWAN RS, 1995, SPEECH COMMUN, V16, P67, DOI 10.1016/0167-6393(94)00048-F MONSEN RB, 1977, J ACOUST SOC AM, V62, P981, DOI 10.1121/1.381593 NITTROUER S, 1990, J SPEECH HEAR RES, V33, P761 OHDE RN, 1984, J ACOUST SOC AM, V75, P224, DOI 10.1121/1.390399 PALMER S, 1992, 1992 P INT C SPOK LA, V1, P129 KROOK MIP, 1988, FOLIA PHONIATR, V40, P82 Peppard RC, 1988, J VOICE, V2, P250, DOI 10.1016/S0892-1997(88)80083-3 PETURSSON M, 1976, PHONETICA, V35, P65 Pierrehumbert J., 1992, PAPERS LABORATORY PH, P90 PRICE PJ, 1989, SPEECH COMMUN, V8, P261, DOI 10.1016/0167-6393(89)90005-8 RABINER LR, 1977, IEEE T ACOUST SPEECH, V25, P434, DOI 10.1109/TASSP.1977.1162987 ROTHENBE.M, 1973, J ACOUST SOC AM, V53, P1632, DOI 10.1121/1.1913513 SCULLY C, 1992, J PHONETICS, V20, P39 SCULLY C, 1992, SPEECH COMMUN, V11, P411, DOI 10.1016/0167-6393(92)90046-A SODERSTEN M, 1990, J SPEECH HEAR RES, V33, P601 SUBTELNY JD, 1966, J SPEECH HEAR RES, V9, P498 YOSHIOKA H, 1981, J ACOUST SOC AM, V70, P1615, DOI 10.1121/1.387227 YOSHIOKA H, 1982, ANN B RES I LOGOPEDI, V16, P27 NR 51 TC 21 Z9 21 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JAN PY 1995 VL 16 IS 1 BP 49 EP 66 DI 10.1016/0167-6393(94)00049-G PG 18 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA QE680 UT WOS:A1995QE68000003 ER PT J AU MCGOWAN, RS KOENIG, LL LOFQVIST, A AF MCGOWAN, RS KOENIG, LL LOFQVIST, A TI VOCAL-TRACT AERODYNAMICS IN VERTICAL-BAR-ACA-VERTICAL-BAR UTTERANCES - SIMULATIONS SO SPEECH COMMUNICATION LA English DT Article DE AERODYNAMICS; VOICE; RUNNING SPEECH ID SPEECH; MODEL; CORDS; SYNTHESIZER; NOISE AB Aerodynamic simulations of /aCa/ utterances were made using a low-frequency model for upper vocal tract airflow and a two-mass model for the voice source. These simulations helped increase insight into the results of an empirical study of flow during running speech. The various sources of flow, including wall compliance, were examined for their contributions to total flow from the mouth. The two-mass model was modified to allow for more natural glottal flow during abduction and adduction. Even with modifications the two-mass model was not sufficient to model source variations during running speech. RP MCGOWAN, RS (reprint author), HASKINS LABS INC, 270 CROWN ST, NEW HAVEN, CT 06511 USA. CR Ananthapadmanabha T. V., 1982, SPEECH COMMUN, V1, P167, DOI 10.1016/0167-6393(82)90015-2 BAK P, 1987, PHYS REV LETT, V59, P381, DOI 10.1103/PhysRevLett.59.381 BELLBERTI F, 1975, J ACOUST SOC AM, V57, P456, DOI 10.1121/1.380468 BICKLEY C, 1991, VOCAL FOLD PHYSL ACO, P37 BICKLEY CA, 1986, J PHONETICS, V14, P373 CRANEN B, 1987, J ACOUST SOC AM, V81, P734, DOI 10.1121/1.394842 Davies P. O. A. L., 1993, VOCAL FOLD PHYSL FRO, P93 Finkelhor B. 
K., 1988, J VOICE, V1, P320, DOI DOI 10.1016/S0892-1997(88)80005-5 FLANAGAN JL, 1975, AT&T TECH J, V54, P485 Hirose H., 1987, LARYNGEAL FUNCTION P, P381 ISHIZAKA K, 1968, MONOGRAPH SPEECH COM, V8 ISHIZAKA K, 1972, AT&T TECH J, V51, P1233 JOHNSON MA, 1992, J ACOUST SOC AM, V91, P2420, DOI 10.1121/1.403194 KLATT DH, 1968, ANN NY ACAD SCI, V155, P42, DOI 10.1111/j.1749-6632.1968.tb56748.x KOIZUMI T, 1987, J ACOUST SOC AM, V82, P1179, DOI 10.1121/1.395254 LIN Q, 1990, THESIS KTH STOCKHOLM LOFQVIST A, 1984, SPEECH COMMUN, V3, P279, DOI 10.1016/0167-6393(84)90024-4 LOFQVIST A, 1995, SPEECH COMMUN, V16, P49, DOI 10.1016/0167-6393(94)00049-G LOFQVIST A, 1989, J ACOUST SOC AM, V85, P1314 LOFQVIST A, 1975, J PHONETICS, V3, P175 LOFQVIST A, 1992, J PHONETICS, V20, P93 MERMELST.P, 1973, J ACOUST SOC AM, V53, P1070, DOI 10.1121/1.1913427 Muller EM, 1980, SPEECH LANGUAGE ADV, V4, P318 OHALA JJ, 1974, SPEECH COMMUN, V2, P65 OHALA JJ, 1990, NATO ADV SCI I D-BEH, V55, P23 ROTHENBERG M, 1968, BREATH STREAM DYNAMI ROTHENBERG M, 1983, VOCAL FOLD PHYSL, P155 RUBIN P, 1981, J ACOUST SOC AM, V70, P321, DOI 10.1121/1.386780 SCULLY C, 1990, NATO ADV SCI I D-BEH, V55, P151 SONDHI MM, 1987, IEEE T ACOUST SPEECH, V35, P955 Stevens Kenneth, 1988, VOCAL FOLD PHYSL VOI, P357 STEVENS KN, 1971, J ACOUST SOC AM, V50, P1180, DOI 10.1121/1.1912751 Stevens KN, 1991, VOCAL FOLD PHYSL ACO, P29 STEVENS KWH, COMMUNICATION WESTBURY JR, 1983, J ACOUST SOC AM, V73, P1322, DOI 10.1121/1.389236 NR 35 TC 10 Z9 10 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JAN PY 1995 VL 16 IS 1 BP 67 EP 88 DI 10.1016/0167-6393(94)00048-F PG 22 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA QE680 UT WOS:A1995QE68000004 ER PT J AU LUCKE, H AF LUCKE, H TI BAYESIAN BELIEF NETWORKS AS A TOOL FOR STOCHASTIC PARSING SO SPEECH COMMUNICATION LA English DT Article DE BAYESIAN NETWORKS; GRAMMAR INFERENCE; STOCHASTIC PARSING AB Bayesian Belief Networks are a powerful tool for combining different knowledge sources with various degrees of uncertainty in a mathematically sound and computationally efficient way. Surprisingly they have not yet found their way into the speech processing field, despite the fact that in this science multiple unreliable information sources exist. The present paper shows how the theory can be utilized for language modeling. After providing an introduction to the theory of Bayesian Networks, we develop several extensions to the classic theory by describing mechanisms for dealing with statistical dependence among daughter nodes (usually assumed to be conditionally independent) and by providing a learning algorithm based on the EM-algorithm with which the probabilities of link matrices can be learned from example data. Using these extensions a language model for speech recognition based on a context-free framework is constructed. In this model, sentences are not parsed in their entirety, as is usual with grammatical description, but only ''locally'' on suitably located segments. The model was evaluated over a text data base. In terms of test set entropy the model performed at least as well as the bi/tri-gram models, while showing a good ability to generalize from training to test data. RP LUCKE, H (reprint author), ATR, INTERPRETING TELECOMMUN RES LABS, 2-2 HIKARIDAI, SEIKA, TOKYO, JAPAN. CR ANDERSON JR, 1981, 7TH P INT JOINT C AR, P97 BAHL LR, 1989, IEEE T ACOUST SPEECH, V37 Baker JK, 1979, 97 M AC SOC AM, P547 BERWICK R, 1980, 16TH P ANN M ASS COM Brown P. 
F., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing Chomsky N., 1959, INFORM CONTR, V2, P137, DOI DOI 10.1016/S0019-9958(59)90362-6 COWELL RG, 1993, IEEE T PATTERN ANAL, V15, P209, DOI 10.1109/34.204903 DEMPSTER AP, 1977, J ROY STAT SOC B MET, V39, P1 EHARA T, 1990, I0186 ATR INT TEL RE FU KS, 1986, IEEE PATTERN ANAL MA, V8 GOLD EM, 1967, INFORM CONTROL, V10, P447, DOI 10.1016/S0019-9958(67)91165-5 Huang X.D., 1990, HIDDEN MARKOV MODELS JELINEK F, 1969, IBM J RES DEV, V13 JELINEK F, 1991, P 2 EUR C SPEECH COM, P1037 Lari K., 1991, Computer Speech and Language, V5, DOI 10.1016/0885-2308(91)90009-F LAURITZEN SL, 1988, STATISTICAL SOC, P157 LUCKE H, 1994, P INT C ACOUST SPEEC, pI353 MCCLELLAND J, 1987, MECHANISMS LANGUAGE NEY H, 1991, IEEE T SIGNAL PROCES, V39, P336, DOI 10.1109/78.80816 Paul D., 1991, P IEEE INT C AC SPEE, P693, DOI 10.1109/ICASSP.1991.150434 PAUL DB, 1992, P INT C ACOUST SPEEC, pI25 Pearl J., 1987, PROBABILISTIC REASON WOLFF JG, 1980, LANG SPEECH, V23, P255 WOLFF JG, 1982, LANG COMMUN, V2, P57, DOI 10.1016/0271-5309(82)90035-0 NR 24 TC 8 Z9 8 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JAN PY 1995 VL 16 IS 1 BP 89 EP 118 DI 10.1016/0167-6393(94)00046-D PG 30 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA QE680 UT WOS:A1995QE68000005 ER PT J AU SHIRAI, K AF SHIRAI, K TI SPOKEN DIALOG SO SPEECH COMMUNICATION LA English DT Editorial Material C1 NIPPON TELEGRAPH & TEL PUBL CORP, HUMAN INTERFACE LABS, FURUI RES LABS, TOKYO, JAPAN. TOKYO INST TECHNOL, GRAD SCH INFORMAT SCI & ENGN, TOKYO 152, JAPAN. RP SHIRAI, K (reprint author), WASEDA UNIV, DEPT ELECT ENGN, TOKYO 160, JAPAN. NR 0 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1994 VL 15 IS 3-4 BP 189 EP 191 DI 10.1016/0167-6393(94)90070-1 PG 3 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PZ332 UT WOS:A1994PZ33200001 ER PT J AU NAGATA, M MORIMOTO, T AF NAGATA, M MORIMOTO, T TI 1ST STEPS TOWARDS STATISTICAL MODELING OF DIALOG TO PREDICT THE SPEECH-ACT TYPE OF THE NEXT UTTERANCE SO SPEECH COMMUNICATION LA English DT Article DE STATISTICAL LANGUAGE MODELING; DIALOG MODEL; SPEECH ACT; DISCOURSE STRUCTURE AB We propose a statistical dialogue modeling method based on the information theory and the speech act theory. The dialogue model consists of a trigram of utterances classified by their speech act. It can be used to rule out erroneous speech recognition candidates that are syntactically and semantically correct, but contextually incorrect, by examining whether the utterance candidates form a natural local discourse in terms of speech act sequencing. Since it is based on the information theory, we can define objective measures for the quality of the dialogue model, such as discourse perplexity. We show that the dialogue model can predict the speech act type of the next utterance by experiments on 100 keyboard dialogues, that include 2,722 utterances and 38,954 words. It achieves 39.7% prediction accuracy for the top candidate and 61.7% for the top three candidates, when 90 dialogues were used for training and the remaining 10 dialogues were used for testing. We also show that we can make a better language model by combining the dialogue model with a sentence model. The word perplexity of word bigram with speech act type trigram is 7.27, while that of simple word bigram is 11.6, when the word perplexity of the language models is computed using the 100 keyboard dialogues. C1 ATR INTERPRETING TELECOMMUN RES LABS, KYOTO 61902, JAPAN. RP NAGATA, M (reprint author), NIPPON TELEGRAPH & TEL PUBL CORP, INFORMAT & COMMUN SYST LABS, 1-2356 TAKE, YOKOSUKA, KANAGAWA 23803, JAPAN. 
CR EHARA T, 1990, ICSLP 90, P1093 Grosz B. J., 1986, Computational Linguistics, V12 Hauptmann A. G., 1988, AAAI 88. Seventh National Conference on Artificial Intelligence JELINEK F, 1985, UNPUB SELF ORG LANGU KUME M, 1989, EACL 89, P264 Magerman D. M., 1990, AAAI-90 Proceedings. Eighth National Conference on Artificial Intelligence MORIMOTO T, 1992, COLING 92, P1048 NAGATA M, 1992, ICSLP 92, P647 NAGATA M, 1993, ATR TRI0298 TECHN RE NAGATA M, 1993, P INT S SPOK DIAL, P83 WALKER M, 1990, ACL 90, P70 Yamaoka T., 1991, EUROSPEECH 91. 2nd European Conference on Speech Communication and Technology Proceedings NR 12 TC 20 Z9 20 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1994 VL 15 IS 3-4 BP 193 EP 203 DI 10.1016/0167-6393(94)90071-X PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PZ332 UT WOS:A1994PZ33200002 ER PT J AU SOBASHIMA, Y FURUSE, O IIDA, H AF SOBASHIMA, Y FURUSE, O IIDA, H TI A CORPUS-BASED LOCAL CONTEXT ANALYSIS FOR SPOKEN DIALOGS SO SPEECH COMMUNICATION LA English DT Article DE LOCAL CONTEXT ANALYSIS; TAGGED CORPORA; ILLOCUTIONARY FORCE; EXAMPLE-BASED MACHINE TRANSLATION AB Grammatical and semantic constraints are effective for interpreting or understanding linguistic expressions. However, they appear to be inadequate for selecting among several candidates, all of which may be relatively correct or inadequate grammatically or semantically. Clearly, we humans interpret a linguistic expression contextually even if there are many potential interpretations. This paper introduces an example-based local context analysis method using tagged corpora to deal with contextual selection of linguistic expressions, taking into account the cohesive nature of spoken dialogues. 
This method performs calculations for similarity scores between linguistic expressions and for likelihood scores to select the most suitable expression. Both illocutionary force-based and morpho-syntactic classifications are considered, along with the frequencies of existing sets of neighboring linguistic expressions stored in an example database. An experimental processing unit which performs such local context analysis has been implemented in a bidirectional (English and Japanese) translation prototype system, and has shown its applicability to the selection of context-dependent translation candidates. This local context analysis mechanism can be used with conventional translation systems without contextual processing to raise translation accuracy. RP SOBASHIMA, Y (reprint author), ATR INTERPRETING TELECOMMUN RES LABS, 2-2 HIKARIDAI, SEIKA CHO, KYOTO 61902, JAPAN. CR AUSTIN JL, 1962, HOW TO DO THINGS WOR FAIS L, 1993, P ISSD 93 TOKYO, P133 FURUSE O, 1992, P 14 INT C COMP LING, P645 IIDA H, 1992, T IPS JAPAN, V15, P60 NAGAO M., 1984, ARTIFICIAL HUMAN INT, P173 NAGATA M, 1992, P ICSLP 92, P647 SATO S, 1991, THESIS KYOTO U Searle John R., 1969, SPEECH ACTS SUMITA E, 1992, IEICE T INF SYST, VE75D, P585 NR 9 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1994 VL 15 IS 3-4 BP 205 EP 212 DI 10.1016/0167-6393(94)90072-8 PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PZ332 UT WOS:A1994PZ33200003 ER PT J AU HIRST, G MCROY, S HEEMAN, P EDMONDS, P HORTON, D AF HIRST, G MCROY, S HEEMAN, P EDMONDS, P HORTON, D TI REPAIRING CONVERSATIONAL MISUNDERSTANDINGS AND NON-UNDERSTANDINGS SO SPEECH COMMUNICATION LA English DT Article DE CONVERSATION; REFERENCE; MISUNDERSTANDING; NON-UNDERSTANDING; NEGOTIATION; COLLABORATION; ABDUCTION AB Participants in a discourse sometimes fail to understand one another, but, when aware of the problem, collaborate upon or negotiate the meaning of a problematic utterance. To address non-understanding, we have developed two plan-based models of collaboration in identifying the correct referent of a description: one covers situations where both conversants know of the referent, and the other covers situations, such as direction-giving, where the recipient does not. In the models, conversants use the mechanisms of refashioning, suggestion and elaboration, to collaboratively refine a referring expression until it is successful. To address misunderstanding, we have developed a model that combines intentional and social accounts of discourse to support the negotiation of meaning. The approach extends intentional accounts by using expectations deriving from social conventions in order to guide interpretation. Reflecting the inherent symmetry of the negotiation of meaning, all our models can act as both speaker and hearer, and can play both the role of the conversant who is not understood or misunderstood and the role of the conversant who fails to understand. RP HIRST, G (reprint author), UNIV TORONTO, DEPT COMP SCI, TORONTO, ON M5S 1A1, CANADA. RI Hirst, Graeme/A-1825-2008 CR APPELT DE, 1985, ARTIF INTELL, V26, P1, DOI 10.1016/0004-3702(85)90011-6 Appelt D. E., 1985, 23rd Annual Meeting of the Association for Computational Linguistics. 
Proceedings of the Conference Appelt D. E., 1992, User Modeling and User-Adapted Interaction, V2, DOI 10.1007/BF01101857 Blum-Kulka S., 1988, TEXT, V8, P219, DOI 10.1515/text.1.1988.8.3.219 Calistri-Yeh R. J., 1991, User Modeling and User-Adapted Interaction, V1, DOI 10.1007/BF00141047 Clark H., 1993, ARENAS LANGUAGE USE CLARK HH, 1986, COGNITION, V22, P1, DOI 10.1016/0010-0277(86)90010-7 Cohen Philip R., 1990, INTENTIONS COMMUNICA COHEN PR, 1981, 7TH P INT JOINT C AR, P31 COULTHARD M, 1984, STUDIES DISCOURSE AN, P82 DALE R, 1989, 27TH P ANN M ASS COM, P68 DAVIS JR, 1989, THESIS MIT DEVLIN AS, 1976, ENV KNOWING THEORIES EDMONDS PG, 1994, 15TH P INT C COMP LI EDMONDS PG, 1993, CSRI289 U TOR DEP CO Eller R., 1992, User Modeling and User-Adapted Interaction, V2, DOI 10.1007/BF01101858 Garfinkel H, 1967, STUDIES ETHNOMETHODO Goodman B. A., 1985, 23rd Annual Meeting of the Association for Computational Linguistics. Proceedings of the Conference HAYES PJ, 1975, 4TH P INT JOINT C AR, P181 HEEMAN PA, 1991, CSRI251 U TOR DEP CO HEEMAN PA, 1994, UNPUB COLLABORATING HORTON D, 1991, FAL WORK NOT AAAI S, P31 JOSHI AK, 1981, ELEMENTS DISCOURSE U LAMBERT L, 1991, 29TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS : PROCEEDINGS OF THE CONFERENCE, P47 LITMAN DJ, 1990, DISCOURSE PROCESS, P365 LYNCH K., 1960, IMAGE CITY MCCOY KF, 1989, ARTIF INTELL, V41, P157, DOI 10.1016/0004-3702(89)90009-X MCRAY SW, 1994, UNPUB REPAIR SPEECH MCROY SW, 1993, FAL AAAI S HUM COMP, P57 MCROY SW, 1993, CSRI288 U TOR DEP CO MCROY SW, 1993, 6TH P C EUR CHAPT AS, P277 NADATHUR G, 1983, 8TH P INT JOINT C AR, P603 PERRAULT CR, 1990, SYS DEV FDN, P161 PERRAULT CR, 1981, ELEMENTS DISCOURSE U, P217 Pollack M. E., 1986, 24th Annual Meeting of the Association for Computational Linguistics. 
Proceedings of the Conference POLLACK ME, 1990, SYS DEV FDN, P77 POOLE D, 1987, KNOWLEDGE FRONTIER E, P331 PSATHAS G, 1991, TALK SOCIAL STRUCTUR, P195 REITER E, 1992, 14TH P INT C COMP LI, P232 REITER E, 1990, 28TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, P97 SCHEGLOFF EA, 1977, LANGUAGE, V53, P361, DOI 10.2307/413107 SCHEGLOFF EA, 1987, LINGUISTICS, V25, P201, DOI 10.1515/ling.1987.25.1.201 SCHEGLOFF EA, 1992, AM J SOCIOL, V97, P1295, DOI 10.1086/229903 SVARTVIK J, 1980, LUND STUDIES ENGLISH, V56 TERASAKI A, 1976, 99 U CAL SCH SOC SCI VANARRAGON P, 1990, THESIS U WATERLOO Wilensky R., 1983, PLANNING UNDERSTANDI WILKINS DE, 1985, COMPUT INTELL, V1, P33, DOI 10.1111/j.1467-8640.1985.tb00057.x NR 48 TC 16 Z9 16 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1994 VL 15 IS 3-4 BP 213 EP 229 DI 10.1016/0167-6393(94)90073-6 PG 17 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PZ332 UT WOS:A1994PZ33200004 ER PT J AU FAIS, L AF FAIS, L TI CONVERSATION AS COLLABORATION - SOME SYNTACTIC EVIDENCE SO SPEECH COMMUNICATION LA English DT Article DE CONVERSATION ANALYSIS; SYNTAX; JOINT PRODUCTIONS; COLLABORATIVE CONVERSATION AB In order to construct a robust natural language processing system, it is necessary to incorporate dialogue management. The design of an appropriate manager requires a well-founded understanding of human-human conversation. This paper presents an attempt to articulate the notion of conversation as collaboration in light of this need. The argument that conversation is inherently collaborative is made based on syntactic phenomena from spontaneous English conversation. 
In this paper, ''collaboration'' implies the simultaneous co-production of a conversation by the participants involved, not merely the construction of conversational meaning through alternating discrete contributions made by conversants. The syntactic structures discussed in support of this view are list structures, echo questions, short answers, joint productions and what are here called ''parallel structures'' and ''accommodations''. Suggestions are made concerning the impact of this view on conversation analysis and on the design of human-machine interfaces. RP FAIS, L (reprint author), ADV TELECOMMUN RES INST, INTERPRETING TELECOMMUN RES LAB, 2-2 HIKARIDAI, SEIKA CHO, KYOTO 61902, JAPAN. CR Bates E., 1989, CROSSLINGUISTIC STUD BAVELAS JB, 1992, DISCOURSE PROCESS, V15, P469 FAIS L, UNPUB ICSLP 94 FAIS L, 1994, ATR TRIT0040 ADV TEL FERRARA K, 1992, DISCOURSE PROCESS, V15, P207 GILES H, 1987, COMMUNICATION YB, V10 Grice H., 1975, SYNTAX SEMANTICS, V3 Gumperz J., 1982, DISCOURSE STRATEGIES Haviland J.B., 1986, TEXT, V6, P249, DOI 10.1515/text.1.1986.6.3.249 Labov William, 1972, SOCIOLINGUISTIC PATT Leiser R. G., 1989, Interacting with Computers, V1, DOI 10.1016/0953-5438(89)90016-7 SACKS H, 1974, LANGUAGE, V50, P696, DOI 10.2307/412243 Sankoff D, 1988, LINGUISTICS CAMBRIDG, P140, DOI 10.1017/CBO9780511620577.009 Schegloff Emmanuel, 1984, STRUCTURES SOCIAL AC SCHIFFRIN D, 1987, DISCOURSE MARKERS SCHIFFRIN D, 1984, LANG SOC, V13, P311 Schiffrin D., 1988, LINGUISTICS CAMBRIDG, VIV, P251 Svartvik J., 1980, CORPUS ENGLISH CONVE Tannen D., 1989, TALKING VOICES REPET Tannen D., 1982, ANAL DISCOURSE TEXT, P43 NR 20 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1994 VL 15 IS 3-4 BP 231 EP 242 DI 10.1016/0167-6393(94)90074-4 PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PZ332 UT WOS:A1994PZ33200005 ER PT J AU CLARK, HH AF CLARK, HH TI MANAGING PROBLEMS IN SPEAKING SO SPEECH COMMUNICATION LA English DT Article DE REPAIRS; DISFLUENCIES; SPEAKING PROBLEMS; CONVERSATION ID SPEECH; REPAIR AB The problems that participants in conversation have, it is argued, are really joint problems and have to be managed jointly. The participants have three types of strategies for managing them. (1) They try to prevent foreseeable but avoidable problems. (2) They warn partners of foreseeable but unavoidable problems. And (3) they repair problems that have already arisen. Speakers and addressees coordinate actions at three levels of talk: (1) the speaker's articulation and the addressees' attention to that articulation; (2) the speaker's presentation of an utterance and the addressees' identification of that utterance; and (3) the speaker's meaning and the addressees' understanding of that meaning. There is evidence that the participants have joint strategies for preventing, warning about and repairing problems at each of these levels. There is also evidence that they prefer preventatives to warnings, and warnings to repairs, all other things being equal. RP CLARK, HH (reprint author), STANFORD UNIV, DEPT PSYCHOL, BLDG 420, STANFORD, CA 94305 USA. CR Clark H. 
H., 1994, HDB PSYCHOLINGUISTIC CLARK HH, 1986, COGNITION, V22, P1, DOI 10.1016/0010-0277(86)90010-7 CLARK HCS, UNPUB CLARK HH, 1989, COGNITIVE SCI, V13, P259, DOI 10.1207/s15516709cog1302_7 CLARK HH, 1982, LANGUAGE, V58, P332, DOI 10.2307/414102 CLARK HH, 1977, PSYCHOL LANGUAGE, P261 Cohen Philip R., 1990, INTENTIONS COMMUNICA Goodwin C., 1981, CONVERSATIONAL ORG I LEVELT WJM, 1983, COGNITION, V14, P41, DOI 10.1016/0010-0277(83)90026-4 SCHEGLOFF EA, 1977, LANGUAGE, V53, P361, DOI 10.2307/413107 SMITH VL, 1993, J MEM LANG, V32, P25, DOI 10.1006/jmla.1993.1002 SVARVIK J, 1980, CORPUS ENGLISH CONVE TREE JEF, 1994, NOV PSYCH SOC ST LOU WADE E, 1993, J MEM LANG, V32, P805, DOI 10.1006/jmla.1993.1040 WADE E, 1993, THESIS STANFORD U ST NR 15 TC 53 Z9 53 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1994 VL 15 IS 3-4 BP 243 EP 250 DI 10.1016/0167-6393(94)90075-2 PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PZ332 UT WOS:A1994PZ33200006 ER PT J AU SADEK, MD AF SADEK, MD TI TOWARDS A THEORY OF BELIEF RECONSTRUCTION - APPLICATION TO COMMUNICATION SO SPEECH COMMUNICATION LA English DT Article DE REASONING ABOUT ACTION; BELIEF PERSISTENCE AND REVISION; INTERAGENT COMMUNICATION ID INTENTION; LOGIC AB We present a theory of belief reconstruction to be embedded in an agent's communication model, which accounts for both belief persistence and revision. We analyse Cohen and Levesque (1990a)'s and Perrault (1990)'s theories, highlight problems which arise, and show that our theory does not have these problems. The starting point of our theory is called the observation principle. It accounts for a distinction between what an agent observes from another agent, and the action the latter has really performed. The theory is couched in an autoepistemic logic used objectively, along the same lines as in (Levesque, 1990). 
When applied to a communication context, it is shown that it correctly predicts the changes in an observer's beliefs in test cases such as sincere assertion and (detected or non-detected) lie. Such test cases highlight the ability of the theory to handle not only normal dialogue situations but also those where problems arise due to erroneous perception, such as misrecognition in spoken communication. RP SADEK, MD (reprint author), FRANCE TELECOM, CTR NATL ETUD TELECOMMUN,LAA,TSS,RCP, BP 40, F-22301 LANNION, FRANCE. CR ALLEN JF, 1980, ARTIF INTELL, V15, P143, DOI 10.1016/0004-3702(80)90042-9 Appelt D., 1988, 26th Annual Meeting of the Association for Computational Linguistics. Proceedings of the Conference AUSTIN JL, 1962, HOW TO DO THINGS WOR Cohen P. R., 1979, COGNITIVE SCI, V3, P177, DOI 10.1207/S15516709COG0303_1 COHEN PR, 1990, ARTIF INTELL, V42, P213, DOI 10.1016/0004-3702(90)90055-5 COHEN PR, 1990, SYS DEV FDN, P221 GRICE HP, 1957, PHILOS REV, V66, P377, DOI 10.2307/2182440 Halpern J.Y., 1985, 9TH P INT JOINT C AR, P480 LEVESQUE HJ, 1990, ARTIF INTELL, V42, P263, DOI 10.1016/0004-3702(90)90056-6 PERRAULT CR, 1990, SYS DEV FDN, P161 REITER R, 1980, ARTIF INTELL, V13, P81, DOI 10.1016/0004-3702(80)90014-4 SADEK MD, 1992, 3 C PRINC KNOWL REPR, P462 SADEK MD, 1991, 2 P VEN WORKSH STRUC, P1 SADEK MD, 1990, 8TH P C AM ASS ART I, P970 SADEK MD, 1991, THESIS U RENNES FRAN Searle John R., 1969, SPEECH ACTS STRAWSON PF, 1971, LOGICO-LINGUISTIC PA NR 17 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1994 VL 15 IS 3-4 BP 251 EP 263 DI 10.1016/0167-6393(94)90076-0 PG 13 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PZ332 UT WOS:A1994PZ33200007 ER PT J AU COHEN, PR LEVESQUE, HJ AF COHEN, PR LEVESQUE, HJ TI PRELIMINARIES TO A COLLABORATIVE MODEL OF DIALOG SO SPEECH COMMUNICATION LA English DT Article DE COLLABORATION; DIALOG; JOINT INTENTION; CONFIRMATIONS AB Many writers have argued that dialogue should be regarded as a joint activity (see for example (Clark and Wilkes-Gibbs, 1986; Grosz and Sidner, 1990; Schegloff, 1981; Suchman, 1987)), something that agents do together, rather than simply as a product of the interaction of plan generators and recognizers working in synchrony and harmony, as plan-based theories propose. Such plan-based approaches do not explain why addressees ask clarification questions, why they confirm, or even, why they do not walk away. Rather, the joint action model claims that both parties to a dialogue are responsible for sustaining it. Participating in a dialogue requires the conversants to have at least a joint commitment to understand one another. The key questions to be answered include how to formalize such general commitments precisely, and to show how they predict the fine-grained synchrony so apparent in ordinary conversation. To begin to answer these questions, we sketch here how a formal theory of joint action explains confirmations that arise in task-oriented telephone dialogues. A more formal account is given in (Cohen and Levesque, 1991a). Then we argue that extensions of this analysis to dialogue more generally will be difficult. In particular, it will force us to give up our simplistic analyses of propositional content and literal meaning. C1 UNIV TORONTO, DEPT COMP SCI, TORONTO, ON M5S 1A1, CANADA. RP COHEN, PR (reprint author), SRI INT, CTR ARTIFICIAL INTELLIGENCE, 333 RAVENSWOOD AVE, MENLO PK, CA 94025 USA. 
CR Allen James F., 1980, AM J COMPUTATIONAL L, V6, P167 ALLEN JF, 1980, ARTIF INTELL, V15, P143, DOI 10.1016/0004-3702(80)90042-9 CLARK HH, 1986, COGNITION, V22, P1, DOI 10.1016/0010-0277(86)90010-7 COHEN PR, 1991, NOUS, V25, P487, DOI 10.2307/2216075 Cohen P. R., 1984, Computational Linguistics, V10 Cohen Philip R., 1990, INTENTIONS COMMUNICA COHEN PR, 1990, ARTIFICIAL INTELLIGE, V42 COHEN PR, 1978, 118 U TOR DEP COMP S COHEN PR, 1991, 12TH P INT JOINT C A, P951 COHEN PR, 1981, 7TH P INT JOINT C AR COHEN PR, 1991, 504 SRI INT ART INT GROSZ BJ, 1990, SYS DEV FDN, P417 HALPERN JY, 1984, 3RD P ACM C PRINC DI JENNINGS NR, 1992, 10TH P NAT C ART INT, P269 KRONFELD A, 1990, STUDIES NATURAL LANG LEVESQUE HJ, 1990 P AAAI 90 SAN M Oviatt S. L., 1991, Computer Speech and Language, V5, DOI 10.1016/0885-2308(91)90001-7 PERRAULT CR, 1990, INTENTIONS COMMUNICA Schegloff Emmanuel, 1981, ANAL DISCOURSE TEXT Searle John, 1990, INTENTIONS COMMUNICA Searle John R., 1969, SPEECH ACTS Suchman L. A., 1987, PLANS SITUATED ACTIO NR 22 TC 9 Z9 9 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1994 VL 15 IS 3-4 BP 265 EP 274 DI 10.1016/0167-6393(94)90077-9 PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PZ332 UT WOS:A1994PZ33200008 ER PT J AU NAKASHIMA, H HARADA, Y AF NAKASHIMA, H HARADA, Y TI SITUATED DIALOG MODEL FOR SOFTWARE AGENTS SO SPEECH COMMUNICATION LA English DT Article DE SITUATED REASONING; DIALOG MODEL; AGENTS AB When we communicate through (natural) languages, we do not explicitly say everything. Both the speaker and the hearer utilize information available from the utterance situation, which includes the mental states of the speaker and the hearer. Interesting cases are frequently observed in the use of Japanese (in dialogue situations). 
Syntactic (or configurational) constraints of Japanese are weaker than those of English, in the sense that the speaker may omit almost any element in a sentence. In this paper we present a mechanism of the hearer in the light of situated reasoning and show how the missing information can be supplied from the situation. Although we believe that the model captures the essential nature of human communication, it may be too naive as a model of human cognition. Rather, the model is intended to be used in the design of software agents that communicate with each other in a mechanical but flexible and efficient way. C1 WASEDA UNIV, TOKYO 160, JAPAN. RP NAKASHIMA, H (reprint author), ELECTROTECH LABS, 1-1-1 UMEZONO, TSUKUBA, JAPAN. CR Barwise J., 1983, SITUATIONS ATTITUDES DEVLIN K, 1991, LOGIC INFORMATION, V1 KAMEYAMA M, 1993, 31ST P ANN M ASS COM KAMEYAMA M, 1988, 2ND INT WORKSH JAP S, P47 KATAGIRI Y, 1990, PROCEEDINGS : EIGHTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOLS 1 AND 2, P958 Nakashima H., 1991, SITUATION THEORY ITS, V2, P215 NAKASHIMA H, 1991, TR917 ETL NR 7 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1994 VL 15 IS 3-4 BP 275 EP 281 DI 10.1016/0167-6393(94)90078-7 PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PZ332 UT WOS:A1994PZ33200009 ER PT J AU OVIATT, SL COHEN, PR WANG, M AF OVIATT, SL COHEN, PR WANG, M TI TOWARD INTERFACE DESIGN FOR HUMAN LANGUAGE TECHNOLOGY - MODALITY AND STRUCTURE AS DETERMINANTS OF LINGUISTIC COMPLEXITY SO SPEECH COMMUNICATION LA English DT Article DE SPEECH AND PEN SYSTEMS; LINGUISTIC COMPLEXITY; INTERFACE DESIGN; COMMUNICATION MODALITY; PRESENTATION STRUCTURE ID SPEECH RECOGNITION AB Before next-generation human language technology can be designed to function successfully in actual field settings, interface techniques will be needed that can guide users' language to coincide with current system capabilities. The present study examines how input modality and presentation structure influence the linguistic complexity observed in people's spoken and written input to an interactive system. Using a semi-automatic simulation technique, language was collected during speech-only, writing-only and combined pen/voice exchanges, and using presentation formats that either were structured or unconstrained. Results indicate that both modality and presentation format substantially influence linguistic complexity, although the specific nature of their impact differs. A comprehensive analysis is provided of how both factors affect people's observed language in terms of total words, disfluencies, utterance length, lexical variability, perplexity, syntactic ambiguity and semantic integration. Users' preferences for modalities and formats also are analyzed, and implications are discussed for channeling people's language in a transparent way. The long-term goal of this research is to develop interface techniques for managing difficult sources of variability in people's language, so that robust processing of human language technology can be achieved. 
C1 SRI INT, COMP DIALOGUE LAB, MENLO PK, CA 94025 USA. SRI INT, CTR ARTIFICIAL INTELLIGENCE, MENLO PK, CA 94025 USA. STANFORD UNIV, DEPT COMP SCI, STANFORD, CA 94305 USA. CR BAHL LR, 1977, 94TH P M AC SOC A S1, V62 Black E., 1991, P DARPA SPEECH NAT L, P306, DOI 10.3115/112405.112467 CARAMAZZA A, 1991, NATURE, V349, P788, DOI 10.1038/349788a0 Chafe Wallace, 1982, SPOKEN WRITTEN LANGU, P35 CHAPANIS A, 1977, HUM FACTORS, V19, P101 Church K., 1982, American Journal of Computational Linguistics, V8 Church K. W., 1991, Computer Speech and Language, V5, DOI 10.1016/0885-2308(91)90016-J Cohen P. R., 1992, P ACM S US INT SOFTW, P143, DOI 10.1145/142621.142641 COHEN PR, 1989, 1989 P C HUM FACT CO, P227 COLE R, IN PRESS IEEE T SPEE Giles H., 1987, COMMUNICATION YB, V10, P13 GOOD IJ, 1953, BIOMETRIKA, V40, P237, DOI 10.2307/2333344 HOBBS JR, 1992, TEXT BASED INTELLIGE JELINEK F, 1976, P IEEE, V64, P532, DOI 10.1109/PROC.1976.10159 KARIS D, 1991, IEEE J SEL AREA COMM, V9, P574, DOI 10.1109/49.81951 Kurtenbach G., 1994, P CHI 94, P258, DOI 10.1145/191666.191759 LACOMIA MJ, 1994, P C HUMAN FACTORS CO, P107 LALOMIA MJ, 1991, P C HUMAN FACTORS CO Leiser R. G., 1989, Interacting with Computers, V1, DOI 10.1016/0953-5438(89)90016-7 MAKHOUL J, 1994, 1994 P ARPA HUM LANG NAKATANI C, 1993, 31ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE CONFERENCE, P46 Oviatt S. L., 1991, Computer Speech and Language, V5, DOI 10.1016/0885-2308(91)90001-7 OVIATT SL, 1991, ACM PRESS FRONTIER S, P69 OVIATT SL, 1993, 1993 ARPA HUM LANG T OVIATT SL, 1992, 1992 P INT C SPOK LA, V2, P1351 OVIATT SL, IN PRESS COMPUT SPEE Rhyne J. 
R., 1993, ADV HUMAN COMPUTER I, V4, P191 SHRIBERG E, 1992, 1992 P DARPA SPEECH, P23 SPITZ J, 1991, 4TH P DARPA WORKSH S SRIHARI RK, 1994, 1994 P ARPA HUM LANG STOLL FC, 1976, J PSYCHOL, V94, P13 TAPPERT CC, 1990, IEEE T PATTERN ANAL, V12, P787, DOI 10.1109/34.57669 WARD JR, 1988, IEEE T SYST MAN CYB, V18, P438, DOI 10.1109/21.7493 WOLF CG, 1990, 34TH P HUM FACT SOC, P249 WOLF CG, 1987, INT J MAN MACH STUD, V27, P91, DOI 10.1016/S0020-7373(87)80045-7 ZOLTANFORD E, 1991, INT J MAN MACH STUD, V34, P527, DOI 10.1016/0020-7373(91)90034-5 NR 36 TC 22 Z9 22 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1994 VL 15 IS 3-4 BP 283 EP 300 DI 10.1016/0167-6393(94)90079-5 PG 18 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PZ332 UT WOS:A1994PZ33200010 ER PT J AU MINAMI, Y SHIKANO, K TAKAHASHI, S YAMADA, T YOSHIOKA, O FURUI, S AF MINAMI, Y SHIKANO, K TAKAHASHI, S YAMADA, T YOSHIOKA, O FURUI, S TI LARGE-VOCABULARY CONTINUOUS SPEECH RECOGNITION ALGORITHM APPLIED TO A MULTIMODAL TELEPHONE DIRECTORY ASSISTANCE SYSTEM SO SPEECH COMMUNICATION LA English DT Article DE CONTINUOUS SPEECH RECOGNITION; HMM; SEARCH ALGORITHM; MULTIMODAL DIALOG SYSTEM AB This paper describes an accurate and efficient algorithm for very-large-vocabulary continuous speech recognition. It is based on a two-stage LR parser with hidden Markov models (HMMs) as phoneme models. To improve recognition accuracy, it uses the forward and backward trellis likelihood. To improve search efficiency, it uses adjusting windows and merges candidates that have the same allophonic phoneme sequences and grammatical state, and then merges candidates at the meaning level. This algorithm was applied to a telephone directory assistance system that contains more than 70,000 subscribers (about 80,000 words) to evaluate its speaker-independent speech recognition capabilities. 
For eight speakers, the algorithm achieved a speech understanding rate of 65% for spontaneous speech. The results show that the system performs well in spite of the large word perplexity. This paper also describes a multi-modal dialog system that uses our large-vocabulary speech recognition algorithm. RP MINAMI, Y (reprint author), NIPPON TELEGRAPH & TEL PUBL CORP, HUMAN INTERFACE LABS, MUSASHINO, TOKYO 180, JAPAN. CR AUSTIN S, 1991, 1991 P INT C AC SPEE, P697 GALES MJF, 1992, 1992 P INT C AC SPEE, P233 HIRSCHMAN L, 1993, MAR ARPA WORKSH HUM KENNY P, 1992, 1992 P INT C SPOK LA, P225 KITA K, 1989, 1989 P INT C ACOUST, P703 MARTIN F, 1993, 1993 P EUR BERL, P1031 MINAMI Y, 1993, 1993 P INT S SPOK DI, P169 MINAMI Y, 1992, 1ST IEEE WORKSH INT NEY H, 1992, 1922 P INT C AC SPEE, P19 SHRIBERG E, 1992, 1992 P SPEECH NAT LA, P49 SOONG FK, 1991, P INT C AC SPEECH SI, V1, P705 TAKEBAYASHI Y, 1992, P INT C SPOKEN LANGU, P651 Tomita M., 1986, EFFICIENT PARSING NA VARGA AP, 1990, APR P IEEE INT C AC, P845 NR 14 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1994 VL 15 IS 3-4 BP 301 EP 310 DI 10.1016/0167-6393(94)90080-9 PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PZ332 UT WOS:A1994PZ33200011 ER PT J AU LEE, CH AF LEE, CH TI STOCHASTIC MODELING IN SPOKEN DIALOG SYSTEM-DESIGN SO SPEECH COMMUNICATION LA English DT Article DE AUTOMATIC SPEECH RECOGNITION; NATURAL LANGUAGE PROCESSING; STOCHASTIC MODELING; ACOUSTIC MODELING; LANGUAGE MODELING; HIDDEN MARKOV MODEL; SPOKEN DIALOG SYSTEM ID SPEECH RECOGNITION; LANGUAGE MODEL AB In this paper, we review the current state of the art in stochastic modeling for spoken dialogue system design. We discuss acoustic modeling of speech units for automatic speech recognition and language modeling of linguistic units for natural language processing. 
We point out some of the emerging stochastic modeling techniques and show the similarity between language modeling and acoustic modeling. Finally, we address search and decision issues related to the integration of knowledge sources for automatic speech recognition and natural language processing. RP LEE, CH (reprint author), AT&T BELL LABS, SPEECH RES DEPT, MURRAY HILL, NJ 07974 USA. CR Austin S., 1991, P IEEE INT C AC SPEE, P697, DOI 10.1109/ICASSP.1991.150435 BAHL LR, 1989, IEEE T ACOUST SPEECH, V37, P1001, DOI 10.1109/29.32278 BAHL LR, 1986, P INT C ACOUST SPEEC BAKER J, 1979, 97TH M ACOUST SOC AM BATES L, 1993, P INT C ACOUST SPEEC, V2, P45 BAUM LE, 1970, ANN MATH STAT, V41, P164, DOI 10.1214/aoms/1177697196 BELLEGARDA JR, 1990, IEEE T ACOUST SPEECH, V38, P2033, DOI 10.1109/29.61531 Brown P. F., 1990, Computational Linguistics, V16 CHOU W, 1993, P IEEE INT C AC SPEE, V2, P652 CHOU W, 1992, P IEEE INT C AC SPEE, P473, DOI 10.1109/ICASSP.1992.225869 CHURCH KW, 1988, 2ND C APPL NAT LANG DELLAPIETRA S, 1992, P INT C AC SPEECH SI, P633, DOI 10.1109/ICASSP.1992.225829 FUJISAKA T, 1989, INT WORKSHOP PARSING Gauvain JL, 1994, IEEE T SPEECH AUDI P, V2, P291, DOI 10.1109/89.279278 GAUVAIN JL, 1992, SPEECH COMMUN, V11, P205, DOI 10.1016/0167-6393(92)90015-Y GIACHIN EP, 1992, P INT C ACOUST SPEEC, P173, DOI 10.1109/ICASSP.1992.225944 GOOD IJ, 1953, BIOMETRIKA, V40, P237, DOI 10.2307/2333344 HEMPHILL CT, 1990, P DARPA SPEECH NATUR Huang X. 
D., 1989, Computer Speech and Language, V3, DOI 10.1016/0885-2308(89)90020-X JELINEK F, 1980, PATTERN RECOGNITION Jelinek F., 1991, P DARPA SPEECH NAT L, P293, DOI 10.3115/112405.112464 JELINEK F, 1985, P IEEE, V73, P1616, DOI 10.1109/PROC.1985.13343 JUANG BH, 1992, IEEE T SIGNAL PROCES, V40, P3043, DOI 10.1109/78.175747 Katagiri S., 1991, P IEEE WORKSH NEUR N, P299 KATZ SM, 1987, IEEE T ACOUST SPEECH, V35, P400, DOI 10.1109/TASSP.1987.1165125 KUHN R, 1993, P INT C ACOUST SPEEC, V2, P55 KUPIEC J, 1992, P ICASSP 92 SAN FRAN, P177, DOI 10.1109/ICASSP.1992.225943 Lau R., 1993, P ICASSP, P45 Lee C. H., 1990, Computer Speech and Language, V4, DOI 10.1016/0885-2308(90)90002-N Lee K.-F., 1989, AUTOMATIC SPEECH REC Levinson S. E., 1989, P 1989 INT C AC SPEE, P441 NAGAI A, 1991, P EUROSPEECH 91 GENO, P1297 NEY H, 1991, IEEE T SIGNAL PROCES, V39, P336, DOI 10.1109/78.80816 Normandin Y., 1991, P ICASSP 91, P537, DOI 10.1109/ICASSP.1991.150395 ONCINA J, 1993, IEEE T PATTERN ANAL, V15 Pieraccini R., 1991, P SPEECH NAT LANG WO, P121, DOI 10.3115/112405.112423 PIERACCINI R, 1991, P ACL 91 BERKELEY Placeway P., 1993, P INT C AC SPEECH SI, P33 PRIETO N, 1991, P EUROSPEECH 91 GENE Rabiner L, 1993, FUNDAMENTALS SPEECH RABINER LR, 1986, AT&T TECH J, V65, P21 ROE DB, 1992, SPEECH COMMUN, V11, P311, DOI 10.1016/0167-6393(92)90025-3 SCHWARTZ R, 1992, P INT C AC SPEECH SI, P1 Schwartz R., 1990, P IEEE INT C AC SPEE, P81 SHARMAN RA, 1990, P DARPA SPEECH NATUR SOONG FK, 1991, P INT C AC SPEECH SI, V1, P705 SU KY, 1992, P IEEE INT C AC SPEE, P185 VEILLEUX N, 1993, P INT C AC SPEECH SI, V2, P51 VIDAL E, 1993, P EUROSPEECH 93 BERL Young S. J., 1993, P EUROSPEECH, P2203 ZAVALIAGKOS G, 1991, P ARPA MTO CSR WORKS, P71 ZUE V, 1991, P 1991 IEEE INT C AC, P713, DOI 10.1109/ICASSP.1991.150439 NR 52 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1994 VL 15 IS 3-4 BP 311 EP 322 DI 10.1016/0167-6393(94)90081-7 PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PZ332 UT WOS:A1994PZ33200012 ER PT J AU HATAZAKI, K EHSANI, F NOGUCHI, J WATANABE, T AF HATAZAKI, K EHSANI, F NOGUCHI, J WATANABE, T TI SPEECH DIALOG SYSTEM BASED ON SIMULTANEOUS UNDERSTANDING SO SPEECH COMMUNICATION LA English DT Article DE SPEECH DIALOG; SIMULTANEOUS UNDERSTANDING; SPEECH RECOGNITION; DEMI-SYLLABLE HMM AB The authors propose a novel way to implement a speech dialogue system. The method, called Simultaneous Understanding, accomplishes speech recognition, understanding and action simultaneously with the user's utterance. This makes the dialogue system more interactive, because of the following advantages: the user does not have to wait for the system response, unless he wants to see the action results on the screen. If he sees a mis-recognized entry, he can correct it before the database is accessed. The authors implemented two ticket reservation speech dialog systems, using an existing speech recognition system. One of them was based on the above method, which can accept new inputs while simultaneously analyzing the previous utterance, and the other accepts new inputs after analyzing the previous utterance. It was found that the above method improved the total average utterance accuracy by 5.4% and the total time spent to solve tasks by 4.7%. This shows that the new method is promising for increasing the interactiveness for the speech dialogue system. RP HATAZAKI, K (reprint author), NEC CORP LTD, INFORMAT TECHNOL RES LABS, 4-1-1 MIYAZAKI, MIYAMAE KU, KAWASAKI 216, JAPAN. 
CR BATES M, 1993, P INT C ACOUST SPEEC, P11 HATAZAKI K, 1992, P ICSLP, P393 KITANO H, 1991, IEEE COMPUTER JUN, P36 KOGA S, 1992, P ICSLP, P1483 Pieraccini Roberto, 1992, P IEEE ICASSP, V1, P193 Polifroni J., 1992, P DARPA SPEECH NAT L, P28, DOI 10.3115/1075527.1075533 Takebayashi Y., 1994, Transactions of the Institute of Electronics, Information and Communication Engineers A, VJ77-A WARD W, 1992, P SPEECH NAT LANG WO, P78, DOI 10.3115/1075527.1075545 Watanabe T., 1992, Transactions of the Institute of Electronics, Information and Communication Engineers D-II, VJ75D-II Watanabe T., 1989, Transactions of the Institute of Electronics, Information and Communication Engineers D-I, VJ72D-II Watanabe T., 1992, Transactions of the Institute of Electronics, Information and Communication Engineers D-II, VJ75D-II NR 11 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1994 VL 15 IS 3-4 BP 323 EP 330 DI 10.1016/0167-6393(94)90082-5 PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PZ332 UT WOS:A1994PZ33200013 ER PT J AU ZUE, V SENEFF, S POLIFRONI, J PHILLIPS, M PAO, C GOODINE, D GODDEAU, D GLASS, J AF ZUE, V SENEFF, S POLIFRONI, J PHILLIPS, M PAO, C GOODINE, D GODDEAU, D GLASS, J TI PEGASUS - A SPOKEN DIALOG INTERFACE FOR ONLINE AIR-TRAVEL PLANNING SO SPEECH COMMUNICATION LA English DT Article DE SPEECH UNDERSTANDING SYSTEM; SPOKEN DIALOG INTERFACE; AIR-TRAVEL PLANNING AB This paper describes PEGASUS, a spoken dialogue interface for on-line air travel planning that we have recently developed. PEGASUS leverages off our spoken language technology development in the ATIS domain, and enables users to book flights using the American Airlines EAASY SABRE system. The input query is transformed by the speech understanding system to a frame representation that captures its meaning. 
The tasks of the System Manager include transforming the semantic representation into an EAASY SABRE command, transmitting it to the application backend, formatting and interpreting the resulting information, and managing the dialogue. Preliminary evaluation results suggest that users can learn to make productive use of PEGASUS for travel planning, although much work remains to be done. RP ZUE, V (reprint author), MIT, COMP SCI LAB, SPOKEN LANGUAGE SYST GRP, CAMBRIDGE, MA 02139 USA. CR GLASS J, 1993, P 3 EUR C SPEECH COM, P2063 Grosz B.J., 1990, INTENTIONS COMMUNICA PALLETT D, 1993, P ARPA SPEECH NATURA PALLETT D, 1992, P DARPA SPEECH NATUR, P15, DOI 10.3115/1075527.1075532 PALLETT D, 1994, P DARPA SPEECH NATUR Price P., 1990, P DARPA SPEECH NAT L, P91, DOI 10.3115/116580.116612 Seneff S., 1992, P ICASSP, P189, DOI 10.1109/ICASSP.1992.225940 Seneff S., 1992, Computational Linguistics, V18 SENEFF S, 1991, P DARPA SPEECH NAT L, P88, DOI 10.3115/112405.112417 SENEFF S, 1991, P DARPA SPEECH NATUR, P354, DOI 10.3115/112405.112743 Zue V., 1992, P DARPA SPEECH NATUR, P84, DOI 10.3115/1075527.1075546 Zue V., 1990, P ICASSP 90, P73 Zue V. W., 1989, P DARPA SPEECH NAT L, P179, DOI 10.3115/100964.100983 NR 13 TC 15 Z9 15 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1994 VL 15 IS 3-4 BP 331 EP 340 DI 10.1016/0167-6393(94)90083-3 PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PZ332 UT WOS:A1994PZ33200014 ER PT J AU SETO, S KANAZAWA, H SHINCHI, H TAKEBAYASHI, Y AF SETO, S KANAZAWA, H SHINCHI, H TAKEBAYASHI, Y TI SPONTANEOUS SPEECH DIALOG SYSTEM TOSBURG-II AND ITS EVALUATION SO SPEECH COMMUNICATION LA English DT Article DE SPEECH RECOGNITION; SPEECH DIALOG; SPONTANEOUS SPEECH; WORD SPOTTING; MULTIMODAL DIALOG; ACTIVE NOISE CANCELLATION AB We have developed a spontaneous speech dialogue system TOSBURG II, employing keyword-based spontaneous speech understanding and multimodal response generation, with adaptive speech response cancellation. Since in multimodal interaction, the user understands the system's response by a visual output before its speech response is completed, the user often interrupts the system's speech response. Therefore, our adaptive speech response cancellation serves to facilitate natural human-computer interaction by allowing the user's interruption. We have also developed an evaluation environment for dialogue data collection and the performance of TOSBURG II. Unlike conventional data collection systems, TOSBURG II collects in this environment not only speech data and the final results of speech understanding but also its intermediate results as dialogue data, to use them for the evaluation and improvement of the system. The results of our dialogue experiments using TOSBURG II prove the effectiveness of adaptive speech response cancellation for natural interaction, confirming that the dialogue data and the evaluation environment will contribute to a further development of spontaneous speech dialogue systems. C1 TOSHIBA SOFTWARE ENGN CO LTD, SAIWAI KU, KAWASAKI 210, JAPAN. TOSHIBA CO LTD, CTR RES & DEV, SAIWAI KU, KAWASAKI 210, JAPAN. 
RP SETO, S (reprint author), KANSAI RES LAB, TOSHIBA CORP, 6-26 MOTOYAMA MINAMI CHO 8 CHOME, HIGASHINADA KU, KOBE 658, JAPAN. CR BATES M, 1993, 1993 P INT C AC SPEE, P111 COLE RA, 1993, 1993 P ISSD 93 TOK, P19 GERBINO E, 1993, 1993 P INT C AC SPEE, P135 HAYAMIZU S, 1991, IEICE SP91101 TECHN HIRSCHMAN L, 1992, 1992 P DARPA SPEECH, P7 JUNQUA J, 1991, 2ND P WORKSH STRUCT Kobayashi T., 1992, J ACOUST SOC JPN, V48, P888 Komatsu A., 1988, Transactions of the Institute of Electronics, Information and Communication Engineers D, VJ71D KUROIWA S, 1993, 1993 P ISSD 93 TOK, P25 Lee K.-F., 1989, AUTOMATIC SPEECH REC MARIANI J, 1992, 1992 P DARPA SPEECH, P55 MINAMI T, 1994, T I ELECTRONICS IN A, V77, P190 MOORE R, 1992, 1992 P DARPA SPEECH, P61 MURAKAMI J, 1991, IEICE SP91100 TECHN NAGATA Y, 1992, IEICE EA9284 TECHN R, P1 NIELSEN J, 1993, COMPUTER, V26, P32, DOI 10.1109/2.241424 Peckham J., 1991, EUROSPEECH 91. 2nd European Conference on Speech Communication and Technology Proceedings POLOFRONI J, 1992, 1992 P DARPA SPEECH, P28 SHRIBERG E, 1992, 1992 P SPEECH NAT LA, P49 TAKEBAYASHI Y, 1992, 1992 P ICSLP 92 BANF, P651 TAKEBAYASHI Y, 1993, 1993 P INT C ACOUST, P115 THOMPSON HS, 1993, 1993 P ISSD 93 TOK, P33 TSUBOI H, 1990, 1990 P ICSLP 90 KOB, P273 Tubach J. P., 1991, EUROSPEECH 91. 2nd European Conference on Speech Communication and Technology Proceedings ZUE V, 1993, 1993 P ISSD 93 TOK, P157 NR 25 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1994 VL 15 IS 3-4 BP 341 EP 353 DI 10.1016/0167-6393(94)90084-1 PG 13 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PZ332 UT WOS:A1994PZ33200015 ER PT J AU YAMADA, M ITOH, F SAKAI, K KOMORI, Y OHORA, Y FUJITA, M AF YAMADA, M ITOH, F SAKAI, K KOMORI, Y OHORA, Y FUJITA, M TI A SPOKEN DIALOG SYSTEM WITH ACTIVE NON-ACTIVE WORD CONTROL FOR CD-ROM INFORMATION-RETRIEVAL SO SPEECH COMMUNICATION LA English DT Article DE CD-ROM; SPOKEN DIALOG SYSTEM; ACTIVE NON-ACTIVE WORD CONTROL; WORD GRAMMAR PREDICTION; UNKNOWN WORD PROCESSING; GARBAGE MODEL AB This paper describes the development of a spoken dialogue travel guidance system, TARSAN. TARSAN uses commercial CD-ROM guidebooks as its knowledge source, containing a large amount of travel information. To deal with this amount of information, a large vocabulary has to be accepted by a speech recognizer without reducing its performance. Thus, we propose two steps of active/non-active word control methods: (1) a word/grammar prediction strategy, and (2) an unknown word re-evaluation algorithm. The word/grammar prediction strategy dynamically changes a recognition network according to a conversation situation by making use of results retrieved from the CD-ROMs. This strategy enables users to access almost all data on the CD-ROMs using a small-vocabulary speech recognizer. The unknown word re-evaluation algorithm processes unknown words and non-active words using Garbage Models by integrating them into the recognition network, and once the Garbage Models are recognized, the unknown part will be compared with the non-active words. This algorithm enhances the ability of the word/grammar prediction. In the experiment without Garbage Models, 80.9% of the utterances were correctly understood. In the unknown word re-evaluation experiment using the Garbage Models, 86.4% were correctly re-evaluated, while false alarms of 5% were found. 
RP YAMADA, M (reprint author), CANON INC, MEDIA TECHNOL LAB, 890-12 KASHIMADA, SAIWAI KU, KAWASAKI, KANAGAWA 211, JAPAN. CR ASADI A, 1990, 1990 INT C ACOUST SP, P125 *JTB, 1992, JTBS ACC INF Kobayashi T., 1992, J ACOUST SOC JPN, V48, P888 *KOS CO, 1990, TABIGURA EL BOOK KUROIWA S, 1993, 1993 P ISSD 93 TOK, P25 SAKAI K, 1994, T I ELEC INFO COMM A, V177, P232 WILPON JG, 1989, 1989 P INT C AC SPEE, P254 YAMADA M, 1993, NOV P ISSD 93 TOK, P117 ZUE V, 1993, 1993 P ISSD 93 TOK, P157 NR 9 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1994 VL 15 IS 3-4 BP 355 EP 365 DI 10.1016/0167-6393(94)90085-X PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PZ332 UT WOS:A1994PZ33200016 ER PT J AU TSOPANOGLOU, A MOURJOPOULOS, J KOKKINAKIS, G AF TSOPANOGLOU, A MOURJOPOULOS, J KOKKINAKIS, G TI ADAPTATION OF AN ISOLATED WORD SPEECH RECOGNITION SYSTEM TO CONTINUOUS SPEECH USING MULTISECTION LVQ CODEBOOK MODIFICATION AND PROSODIC PARAMETER TRANSFORMATION SO SPEECH COMMUNICATION LA English DT Article DE SPEECH RECOGNITION; CODEBOOK ADAPTATION; MULTISECTION CODEBOOK DESIGN; LVQ ALGORITHM; PROSODIC PARAMETERS ID HIDDEN MARKOV-MODELS AB An improved, phoneme-based IWSR system is described, which employs a robust reference data extraction procedure and achieves increased recognition accuracy. Furthermore, a novel method for the adaptation of the IWSR-system to continuous speech is presented. The IWSR system employs a multisection codebook design technique and the LVQ algorithm, which provide well-defined and accurate codebooks, minimize the influence of the within-word coarticulation and allow the use of time-sequence information at the recognition stage. 
The adaptation method is based on modifications of the system's reference data codebook using a small amount of representative continuous speech data and on linear transformations of the main prosodic parameters (i.e. energy and duration). Extensive testing under different conditions (speaker dependent versus speaker independent reference data, single versus multisection codebooks, adapted versus unadapted codebooks, phoneme versus word recognition accuracy, etc.) has shown the efficiency of the proposed methods. C1 UNIV PATRAS, DEPT ELECT ENGN, WIRE COMMUN LAB, GR-26500 PATRAI, GREECE. CR BILLI R, 1986, P INT C ACOUST SPEEC BILLI R, 1989, P EUROSPEECH 89 PAR, V2, P157 BROWN MK, 1982, AT&T TECH J, V61, P2971 BURTON DK, 1985, IEEE T ACOUST SPEECH, V33, P837, DOI 10.1109/TASSP.1985.1164650 BURTON DK, 1987, IEEE T ACOUST SPEECH, V35 CRYSTAL TH, 1988, J ACOUST SOC AM, V83, P1553, DOI 10.1121/1.395911 FANT G, 1992, INT C SPOKEN LANGUAG, P667 FISSORE L, 1989, IEEE T ACOUST SPEECH, V37, P1197, DOI 10.1109/29.31268 GUPTA V, 1991, P INT C ACOUST SPEEC, P341, DOI 10.1109/ICASSP.1991.150346 JUANG BH, 1987, IEEE T ACOUST SPEECH, P947 KATO H, 1992, INT C SPOK LANGUAGE, P507 KIMURA, 1987, P INT C ACOUST SPEEC, P825 KLATT DH, 1976, J ACOUST SOC AM, V59, P1208, DOI 10.1121/1.380986 KOHONEN T, 1988, IEEE COMPUT, V21, P11 Kohonen T., 1987, SPRINGER SERIES INFO KOO M, 1992, INT C SPOKEN LANGUAG, P1475 LEE KF, 1990, IEEE T ACOUST SPEECH, V38, P35, DOI 10.1109/29.45616 LEE KF, 1989, IEEE T ACOUST SPEECH, V37, P1641, DOI 10.1109/29.46546 LINDE Y, 1980, IEEE T COMMUN, V28, P84, DOI 10.1109/TCOM.1980.1094577 MCDERMOTT E, 1991, IEEE T SIGNAL PROCES, V39, P1398, DOI 10.1109/78.136545 MIKKILINENI R, 1988, P INT C ACOUST SPEEC, P433 MURVEIT H, 1986, P INT C ACOUST SPEEC, P837 NASRI M, 1989, SEP P EUROSPEECH 89, V1, P518 NOCERINO N, 1985, SPEECH COMMUN, V4, P317, DOI 10.1016/0167-6393(85)90057-3 Noll A., 1987, Proceedings: ICASSP 87. 
1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) Picone J., 1990, IEEE ASSP Magazine, V7, DOI 10.1109/53.54527 Press WH, 1988, NUMERICAL RECIPES C RABINER L, 1985, BELL LABS TECH J, V64, P2319 RABINER L, 1984, AT T BELL LABS TECH, V63, P712 RABINER LR, 1984, AT&T TECH J, V63, P1981 RABINER LR, 1989, P IEEE, V77, P257, DOI 10.1109/5.18626 RABINER LR, 1989, IEEE T ACOUST SPEECH, V37, P1214, DOI 10.1109/29.31269 RABINER LR, 1982, AT&T TECH J, V61, P981 Rabiners LR, 1986, IEEE ASSP MAGAZI JAN, P4 SILVERMAN, 1990, IEEE ASSP MAG, P6 SOONG FK, 1986, P ICASSP TOK JAP, P877 TOHKURA Y, 1987, IEEE T ACOUST SPEECH, V35, P1414, DOI 10.1109/TASSP.1987.1165058 NR 37 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1994 VL 15 IS 1-2 BP 1 EP 20 DI 10.1016/0167-6393(94)90037-X PG 20 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PT334 UT WOS:A1994PT33400001 ER PT J AU GAUVAIN, JL LAMEL, LF ADDA, G ADDADECKER, M AF GAUVAIN, JL LAMEL, LF ADDA, G ADDADECKER, M TI SPEAKER-INDEPENDENT CONTINUOUS SPEECH DICTATION SO SPEECH COMMUNICATION LA English DT Article DE CONTINUOUS SPEECH RECOGNITION; WORD RECOGNITION; PHONE RECOGNITION; SPEAKER-INDEPENDENT; LARGE VOCABULARY; DICTATION AB In this paper we report on progress made at LIMSI in speaker-independent large vocabulary speech dictation using newspaper-based speech corpora in English and French. The recognizer makes use of continuous density HMMs with Gaussian mixtures for acoustic modeling and n-gram statistics estimated on newspaper texts for language modeling. Acoustic modeling uses cepstrum-based features, context-dependent phone models (intra and interword), phone duration models, and sex-dependent models. 
For English the ARPA Wall Street Journal-based CSR corpus is used and for French the BREF corpus containing recordings of texts from the French newspaper Le Monde is used. Experiments were carried out with both these corpora at the phone level and at the word level with vocabularies containing up to 20,000 words. Word recognition experiments are also described for the ARPA RM task which has been widely used to evaluate and compare systems. RP GAUVAIN, JL (reprint author), LIMSI, CNRS, BP 133, F-91403 ORSAY, FRANCE. CR COHEN MH, 1989, THESIS U CALIFORNIA DAVIS SB, 1980, IEEE T ACOUST SPEECH, V28 GAUVAIN JL, 1990, P ICSLP 90 GAUVAIN JL, 1992, FEB P DARPA SPEECH N GAUVAIN JL, 1994, INT J PATTERN RECOGN, V8 GAUVAIN JL, 1994, P IEEE INT C ACOUST GAUVAIN JL, 1994, IEEE T SPEECH AU APR, V2 GAUVAIN JL, 1992, SPEECH COMMUN, V11, P205, DOI 10.1016/0167-6393(92)90015-Y GIACHIN E, 1991, COMPUT SPEECH LANGUA, V5 Katz S., 1987, IEEE T ACOUST SPEECH, V35 LAMEL L, 1993, P EUROSPEECH 93 Lamel L., 1991, P EUROSPEECH 91 LAMEL LF, 1993, P EUROPSEECH 93 LAMEL LF, 1992, SEP P FIN REV DARPA LAMEL LF, 1993, P IEEE INT C ACOUST LEE CH, 1990, COMPUT SPEECH LANGUA, V4 NEY H, 1984, IEEE T ACOUST SPEECH, V32 PALLETT DS, 1994, MAR P ARPA HUM LANG PALLETT DS, 1992, SEP P FIN REV DARPA PALLETT DS, 1993, MAR P ARPA HUM LANG PAUL DB, 1992, P ICSLP 92 PRICE P, 1988, P IEEE INT C AC SPEE PROUTS B, 1980, THESIS U PARIS 11 RABINER LR, 1985, AT T TECH J, V64 NR 24 TC 20 Z9 20 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD OCT PY 1994 VL 15 IS 1-2 BP 21 EP 37 DI 10.1016/0167-6393(94)90038-8 PG 17 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PT334 UT WOS:A1994PT33400002 ER PT J AU BYRD, D AF BYRD, D TI RELATIONS OF SEX AND DIALECT TO REDUCTION SO SPEECH COMMUNICATION LA English DT Article DE TIMIT; GENDER; SEX; DIALECTS; REDUCTION; SPEAKER-SPECIFIC; SPEAKER-DEPENDENT; AMERICAN ENGLISH; DATABASE; ALLOPHONIC VARIATION ID SPEECH; DATABASE; FEMALE; TIMIT AB A set of phonetic studies based on analysis of the TIMIT speech database is presented which addresses topics relevant to the linguistic and speech recognition communities. First, the advantages and shortcomings of using TIMIT for linguistic research are considered, and a database methodological approach is outlined. Next, several small studies are presented which detail new results on the effect of speakers' sex and dialect region on pronunciation. The goal of this paper is to use the database to explore sex and dialect related variation thereby ascertaining differences which may merit further experimental study. This report concerns speaker-dependent effects on certain phonetic characteristics often involved in reduction such as speech rate, stop releases, flapping, central vowels, laryngeal state, syllabic consonants, and palatalization processes. Specifically, it is suggested that the phonetic characteristics found more commonly with male speakers are also those typical of reduction in speech. C1 UNIV CALIF LOS ANGELES, DEPT LINGUIST, PHONET LAB, LOS ANGELES, CA 90024 USA. 
CR Anshen Frank S., 1969, THESIS NEW YORK U APPLEGATE J, 1984, THESIS KENT STATE U BLADON RAW, 1984, LANG COMMUN, V4, P59, DOI 10.1016/0271-5309(84)90019-3 BRYD D, 1992, P INT C SPOKEN LANGU, V1, P827 BYRD D, 1993, UCLA WORKING PAPERS, P83 BYRD D, 1992, J ACOUST SOC AM, V92, P593, DOI 10.1121/1.404271 BYRD D, 1992, UCLA WORKING PAPERS COHEN M, 1987, P DARPA SPEECH RECOG, P49 COHEN M, 1987, JASA S1, V80, pS50 COHEN MH, 1989, THESIS U CALIFORNIA FASOLD RW, 1968, UNPUB SOCIOLINGUISTI FISCHER JL, 1958, WORD, V14, P47 Fisher W. M., 1986, P DARPA WORKSH SPEEC, P93 Garofolo J. S., 1993, DARPA TIMIT ACOUSTIC GAY T, 1981, PHONETICA, V38, P148 Henton C, 1992, NEW DEPARTURES LINGU, P27 HENTON C, 1988, LENGUAGE SPEECH MIND HENTON C, 1985, THESIS U OXFORD HENTON CG, 1985, LANG COMMUN, V5, P221, DOI 10.1016/0271-5309(85)90012-6 HENTON CG, 1983, J PHONETICS, V11, P353 Herold Ruth, 1990, THESIS U PENNSYLVANI HULTZEN LS, 1964, TABLES TRANSITIONAL KEATING PA, 1994, SPEECH COMMUN, V14, P131, DOI 10.1016/0167-6393(94)90004-3 KLATT DH, 1990, J ACOUST SOC AM, V87, P820, DOI 10.1121/1.398894 KRAMER C, 1977, LANG SPEECH, V20, P151 KUCHERA H, 1967, COMPUTATIONAL ANAL P Labov W., 1966, SOCIOLINGUISTICS, P84 Labov W., 1970, STUD GEN, V23, P30 Labov W., 1991, NEW WAYS ANAL SOUND, P1 Labov William, 1966, SOCIAL STRATIFICATIO LAMEL L, 1986, FEB P DARPA SPEECH R, P100 LINDBLOM B, 1963, J ACOUST SOC AM, V35, P1773, DOI 10.1121/1.1918816 Lindblom B., 1964, STL QPSR, V2, P1 Milroy L., 1976, BELFAST WORKING PAPE, V1, P1 OSHIKA BT, 1975, IEEE T ACOUST SPEECH, VAS23, P104, DOI 10.1109/TASSP.1975.1162639 RILEY M, 1992, ICSLP 92 P, P285 ROACH P, 1992, SPEECH COMMUN, V11, P475, DOI 10.1016/0167-6393(92)90054-B RYALLS J, 1994, J ACOUST SOC AM, V95, P2274, DOI 10.1121/1.408639 SHUY RW, 1967, LINGUISTIC CORRELATE Smith PM, 1979, SOCIAL MARKERS SPEEC, P109 TRUDGILL P, 1975, LANGUAGE SEX DIFFERE Trudgill Peter, 1974, SOCIAL DIFFERENTIATI UMEDA N, 1991, JASA, V89, P2010 WITHGOTT MM, 1993, COMPUTATIONAL MODELS 
Wolfram Walt, 1969, SOCIOLINGUISTIC DESC ZUE V, 1990, SPEECH COMMUN, V9, P351, DOI 10.1016/0167-6393(90)90010-7 ZUE VW, 1979, J ACOUST SOC AM, V66, P1039, DOI 10.1121/1.383323 ZUE VW, 1988, 2ND P M ADV MAN MACH NR 48 TC 93 Z9 94 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1994 VL 15 IS 1-2 BP 39 EP 54 DI 10.1016/0167-6393(94)90039-6 PG 16 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PT334 UT WOS:A1994PT33400003 ER PT J AU BRUCE, G HOUSE, D TOUATI, P AF BRUCE, G HOUSE, D TOUATI, P TI UNTITLED SO SPEECH COMMUNICATION LA English DT Letter RP BRUCE, G (reprint author), LUND UNIV, DEPT LINGUIST & PHONET, S-22101 LUND, SWEDEN. CR BRUCE G, 1994, 7TH PHON C, P34 BRUCE G, 1990, NOV P ICSLP 90 KOB, V1, P489 Cutler Anne, 1983, PROSODY MODELS MEASU Goldsmith J., 1990, AUTOSEGMENTAL METRIC HOUSE D, 1993, P ESCA WORKSHOP PROS Pierrehumbert J., 1988, JAPANESE TONE STRUCT 1993, 3RD P EUR C SPEECH C NR 7 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1994 VL 15 IS 1-2 BP 55 EP 58 DI 10.1016/0167-6393(94)90040-X PG 4 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PT334 UT WOS:A1994PT33400004 ER PT J AU GARDING, E AF GARDING, E TI PROSODY IN LUND SO SPEECH COMMUNICATION LA English DT Article DE PROSODY; INTONATION; ACCENTOLOGY; PROSODIC TRANSFER; PROSODIC MODELING; DIALECTAL VARIATION AB The talk gives an overview of prosodic research at the Department of Linguistics and Phonetics at Lund University since 1950. It shows how the word accents have been a prime motor in these activities. The interest first centered on local Fo configurations caused by the distinctive accents, their contextual and dialectal variation and perceptual cues for their identification. 
Further analyses of global aspects led to a compositional model of Swedish prosody in which the local accent shapes are seen as superposed on a global intonation. Some neglected areas of research are pointed out, e.g. perception and possible effects of general rules of economy. The talk ends with a plea for a more unified framework in prosodic analysis. RP GARDING, E (reprint author), LUND UNIV, DEPT LINGUIST & PHONET, HELGONABACKEN 12, S-22362 LUND, SWEDEN. CR BANNERT R, 1979, PRAKTISK LINGVISTIK, V3 BOTINIS A, 1989, TRAVAUX I LINGUISTIQ, V22 BRUCE G, 1982, PHONETICA, V39, P274 Bruce G., 1977, TRAVAUX I LINGUISTIQ, V12 BRUCE G, 1981, NORDIC PROSODY, V2, P63 Bruce Gosta, 1978, NORDIC PROSODY, P219 Collier R., 1990, PERCEPTUAL STUDY INT GARDING E, 1986, WORKING PAPERS, V29, P115 GARDING E, 1989, WORKING PAPERS, V35, P63 GARDING E, 1967, TRAVAUX I PHONETIQUE, V6 GARDING E, 1975, WORKING PAPERS, V10, P53 GARDING E, 1981, STUD LINGUISTICA, V35, P146, DOI 10.1111/j.1467-9582.1981.tb00707.x GARDING E, 1967, SVENSKT TALSPRAK, P40 GARDING E, 1982, PHONETICA, V39, P288 GARDING E, 1982, WORKING PAPERS, V22, P137 GARDING E, 1993, WORKING PAPERS, V40, P25 GARDING E, 1982, TEXTSTRATEGIER TAL S, P117 GARDING E, 1979, 7 S P 9TH INT C PHON GARDING E, 1978, NORDIC PROSODY GARDING E, 1975, P S FONETIK UTTALSPE, P39 GARDING E, 1973, 1973 P SEC INT C NOR, P466 GARDING E, 1974, SVENSKANS BESKRIVNIN, V8, P97 Garding Eva, 1973, WORKING PAPERS LUND, V7, P36 Garding Eva, 1977, SCANDINAVIAN WORD AC HADDINGKOCH K, 1961, ACOUSTICOPHONETIC ST HADDINGKOCH K, 1964, PHONETICA, V11, P175 HORNE M, 1993, NORDIC PROSODY, V6, P85 HOUSE DAVID, 1990, TRAVAUX I LINGUISTIQ, V24 KOCK A, 1878, SPRAKHISTORISKA UNDE, V1 LEHISTE I, 1960, PHONETICA S, V5 LEHISTE I, 1979, 5 S P 9TH INT C PHON LEHISTE I, 1965, 5TH P C PHON SCI, P171 MADSEN YN, 1992, TRAVAUX I LINGUISTIQ, V27 MALMBERG B, 1967, STRUCTURAL LINGUISTI MALMBERG B, 1955, UNPUB OBSERVATIONS S MEYER EA, 1937, STUDIES SCAND PHILOL, V10 MEYER EA, 1954, 
STUDIES SCAND PHILOL, V11 OHLSSON SO, 1978, LUND ASTUDIER NORD A, V30 Ohman S., 1967, SPEECH TRANSMISSION, P20 STUDDERTKENNEDY M, 1971, WORKING PAPERS, V5, P1 TOUATI P, 1987, TRAVAUX I LINGUISTIQ, V21 NR 41 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1994 VL 15 IS 1-2 BP 59 EP 67 DI 10.1016/0167-6393(94)90041-8 PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PT334 UT WOS:A1994PT33400005 ER PT J AU GELUYKENS, R SWERTS, M AF GELUYKENS, R SWERTS, M TI PROSODIC CUES TO DISCOURSE BOUNDARIES IN EXPERIMENTAL DIALOGS SO SPEECH COMMUNICATION LA English DT Article DE DIALOG; TOPIC STRUCTURE; TURN-TAKING; PROSODY ID CONVERSATION; INTONATION AB In a dialogue, there are at least two sorts of boundaries between discourse units. One type of boundary signals the end of a topical unit; another type of boundary the end of a turn at talk. These two do not necessarily coincide, as a speaker may wish to switch to a new topical unit without wanting to be interrupted by his interlocutor. In order to test whether prosodic cues can differentiate unambiguously between topic and turn boundaries, a series of production experiments was set up in which topic-finality and turn-finality were varied independently, and in which visual and non-prosodic verbal cues could not be used. In the most complex condition, the speaker had to give clear cues for topic finality, while not prematurely losing the floor. In this condition, speakers avoided using low tones at turn-internal topical boundaries, reserving them to signal turn-final topic boundaries. When listeners were confronted with portions of the descriptions taken out of their contexts, they could reliably differentiate between turn-final and non-turn-final topical units. 
Interestingly, when the final parts of a topical unit were removed, listeners could still discriminate between turn-final and non-turn-final expressions, apparently basing themselves on other, more global, prosodic cues. This holds similarly for both minimally and maximally incomplete units. C1 INST PERCEPT RES, 5600 MB EINDHOVEN, NETHERLANDS. RP GELUYKENS, R (reprint author), UNIV ANTWERP, UFSIA, ROOM D-133, PRINSSTR 13, B-2000 ANTWERP, BELGIUM. RI Swerts, Marc/C-8855-2013 CR Bolinger D., 1989, INTONATION ITS USES Brown Gillian, 1980, QUESTIONS INTONATION Bruce Gosta, 1990, WORKING PAPERS, V36, P37 CRUTTENDEN A, 1981, J LINGUIST, V17, P221 Cutler Anne, 1986, INTONATION DISCOURSE, P139 GELUYKENS R, 1992, 1992 P WORKSH PROS N, P63 Geluykens R., 1992, DISCOURSE PROCESS GR GELUYKENS R, 1987, J PRAGMATICS, V11, P487 GROSJEAN F, 1983, LINGUISTICS, V21, P501, DOI 10.1515/ling.1983.21.3.501 HERMES DJ, 1988, J ACOUST SOC AM, V83, P257, DOI 10.1121/1.396427 KREIMAN J, 1982, J PHONETICS, V10, P163 Lehiste I., 1979, FRONTIERS SPEECH COM, P191 Levelt W. J., 1989, SPEAKING INTENTION A MENN L, 1982, LANG SPEECH, V25, P341 SACKS H, 1974, LANGUAGE, V50, P696, DOI 10.2307/412243 SCHAFFER D, 1984, J PHONETICS, V12, P327 SCHAFFER D, 1983, J PHONETICS, V11, P243 SWERTS M, 1993, PHONETICA, V50, P189 SWERTS M, 1993, 1993 P ESCA WORKSH P, P96 SWERTS M, 1992, SPEECH COMMUN, V11, P463, DOI 10.1016/0167-6393(92)90052-9 SWERTS M, 1994, LANG SPEECH, V37, P21 SWERTS MGJ, 1994, THESIS EINDHOVEN U T THORSEN NG, 1980, J ACOUST SOC AM, V67, P1014, DOI 10.1121/1.384069 UMEDA N, 1982, J PHONETICS, V10, P290 YULE G, 1980, LINGUA, V52, P33, DOI 10.1016/0024-3841(80)90016-9 NR 25 TC 10 Z9 10 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD OCT PY 1994 VL 15 IS 1-2 BP 69 EP 77 DI 10.1016/0167-6393(94)90042-6 PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PT334 UT WOS:A1994PT33400006 ER PT J AU SWERTS, M COLLIER, R TERKEN, J AF SWERTS, M COLLIER, R TERKEN, J TI PROSODIC PREDICTORS OF DISCOURSE FINALITY IN SPONTANEOUS MONOLOGUES SO SPEECH COMMUNICATION LA English DT Article DE PROSODIC FUNCTION; INTONATION; DURATION; ROUTE DESCRIPTIONS ID INTONATION; DUTCH; WORDS AB This article reports on a study of the capacity of prosody to predict upcoming discourse boundaries. More specifically, it is investigated whether the approaching end of a route description can be pre-signalled by melodic and temporal characteristics. Experiment 1 brings to light that listeners are able to estimate on the basis of such prosodic properties how far a given utterance is situated from the end of a description. However, the scope of this prosodic prediction is relatively restricted as listeners can only estimate the absolute discourse position of the last two utterances of the monologue analyzed. Experiment 2 is run in order to explore systematically, by means of a test with synthetic speech, to what extent melodic and durational properties are sufficient to influence finality judgments. RP SWERTS, M (reprint author), INST PERCEPT RES, POB 513, 5600 MB EINDHOVEN, NETHERLANDS. RI Swerts, Marc/C-8855-2013 CR Bolinger D., 1989, INTONATION ITS USES Brown Gillian, 1980, QUESTIONS INTONATION BRUBAKER RS, 1972, J PSYCHOLINGUIST RES, V1, P141, DOI 10.1007/BF01068103 BRUCE G, 1982, PHONETICA, V39, P274 CHAPENTIER F, 1989, 1989 P EUROSPEECH 89, P13 Collier R., 1990, PERCEPTUAL STUDY INT Cooper W. 
E., 1980, SYNTAX SPEECH Cruttenden A., 1986, INTONATION DEPIJPER JR, 1993, P EUROSPEECH 93 BERL, P1211 EEFTING W, 1991, J ACOUST SOC AM, V89, P412, DOI 10.1121/1.400475 FOWLER CA, 1988, LANG SPEECH, V31, P307 GELUYKENS R, 1993, 1993 P ESCA WORKSH P, P108 GROSJEAN F, 1983, LINGUISTICS, V21, P501, DOI 10.1515/ling.1983.21.3.501 HERMES DJ, 1988, J ACOUST SOC AM, V83, P257, DOI 10.1121/1.396427 HUBER D, 1989, P EUROSPEECH 89 PARI, P477 KLATT D, 1976, J PHONETICS, V3, P129 Lehiste I., 1975, STRUCTURE PROCESS SP, P195 Lehiste I., 1979, FRONTIERS SPEECH COM, P191 Levelt W. J., 1989, SPEAKING INTENTION A Liberman Mark, 1984, LANGUAGE SOUND STRUC, P157 MENN L, 1982, LANG SPEECH, V25, P341 Nespor M., 1986, PROSODIC PHONOLOGY PRICE PJ, 1991, J ACOUST SOC AM, V90, P2956, DOI 10.1121/1.401770 Silverman K. E. A., 1987, THESIS U CAMBRIDGE SLUIJTER AMC, 1993, PHONETICA, V50, P180 STREETER LA, 1978, J ACOUST SOC AM, V64, P1582, DOI 10.1121/1.382142 SWERTS M, 1993, PHONETICA, V50, P189 SWERTS M, 1993, IPO ANN PROGR REPORT, V27, P19 SWERTS M, 1992, SPEECH COMMUN, V11, P463, DOI 10.1016/0167-6393(92)90052-9 SWERTS M, 1994, LANG SPEECH, V37, P21 SWERTS MGJ, 1994, THESIS EINDHOVEN U T Terken J., 1992, SPEECH PERCEPTION PR, P427 TERKEN JMB, 1984, LANG SPEECH, V27, P269 THORSEN NG, 1985, J ACOUST SOC AM, V77, P1205, DOI 10.1121/1.392187 Torgerson Warren S., 1963, THEORY METHODS SCALI WIGHTMAN CW, 1992, J ACOUST SOC AM, V91, P1707, DOI 10.1121/1.402450 YULE G, 1980, LINGUA, V52, P33, DOI 10.1016/0024-3841(80)90016-9 NR 37 TC 14 Z9 14 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD OCT PY 1994 VL 15 IS 1-2 BP 79 EP 90 DI 10.1016/0167-6393(94)90043-4 PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PT334 UT WOS:A1994PT33400007 ER PT J AU ROACH, P AF ROACH, P TI CONVERSION BETWEEN PROSODIC TRANSCRIPTION SYSTEMS - STANDARD BRITISH AND TOBI SO SPEECH COMMUNICATION LA English DT Article DE INTONATION; BRITISH ENGLISH; TRANSCRIPTION; DATABASE; CORPUS AB Although systems for the transcription of speech prosody have existed for a long time, the need to represent prosodic information in large-scale speech databases places new demands on such systems. The ToBI system recently developed in the USA differs in many interesting ways from conventional systems such as the ''Standard British'' system that has been in use for several decades. This paper discusses the differences in the context of current research on a machine-readable corpus of spoken English, and examines the possibility of converting automatically between the two types of transcription. RP ROACH, P (reprint author), UNIV READING, DEPT LINGUIST SCI, READING RG6 2AA, ENGLAND. CR ALDERSON P, IN PRESS WORKING SPO Armstrong Lilias E., 1926, HDB ENGLISH INTONATI Brazil D., 1980, DISCOURSE INTONATION Collier R., 1990, PERCEPTUAL STUDY INT Cruttenden A., 1986, INTONATION Crystal D., 1969, PROSODIC SYSTEMS INT DEPIJPER JR, 1993, P EUROSPEECH 93 BERL, P1211 GHALI N, 1992, P I ACOUSTICS, V14, P207 Halliday M. A. 
K., 1967, INTONATION GRAMMAR B HIRST D, 1991, 12 P INT C PHON SCI, V1, P305 JOHANSSON S, 1991, TEIA12W1 TEXT ENC IN Jones Daniel, 1956, OUTLINE ENGLISH PHON Kingdon Roger, 1958, GROUNDWORK ENGLISH I KNOWLES GO, 1988, MANUAL INFORMATION A LADD DR, 1991, 12 P INT C PHON SCI, V2, P290 LEWIS JW, 1969, PRONUNCIATION ENGLIS O'Connor John D., 1973, INTONATION COLLOQUIA Palmer Harold E., 1922, ENGLISH INTONATION S Pierrehumbert J, 1980, THESIS MIT PRICE PJ, 1991, J ACOUST SOC AM, V90, P2956, DOI 10.1121/1.401770 ROACH PJ, IN PRESS SPOKEN ENGL Roachs P., 1991, ENGLISH PHONETICS PH SILVERMAN K, 1992, 1992 P INT C SPEECH Trager George Leonard, 1951, STUDIES LINGUISTICS, V3 TRAGER GL, 1964, HONOR D JONES PAPERS, P266 WICHMANN A, 1991, THESIS U LANCASTER NR 26 TC 7 Z9 7 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1994 VL 15 IS 1-2 BP 91 EP 99 DI 10.1016/0167-6393(94)90044-2 PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PT334 UT WOS:A1994PT33400008 ER PT J AU GRABE, E WARREN, P NOLAN, F AF GRABE, E WARREN, P NOLAN, F TI RESOLVING CATEGORY AMBIGUITIES - EVIDENCE FROM STRESS SHIFT SO SPEECH COMMUNICATION LA English DT Article DE CATEGORY AMBIGUITY; STRESS SHIFT; SYNTACTIC PROCESSING; PROSODY ID RHYTHM AB This paper presents an experimental study of ''stress shift'' in category-ambiguous and non-ambiguous material. Ambiguous sequences such as Chinese fan exhibit phonological evidence for two structural analyses. If the sequence is a syntactic phrase, with Chinese an adjective modifying the noun fan, then fan has greater relative prominence. If Chinese is a noun and the sequence is a compound, then fan is deaccented and Chinese has greater relative prominence. Additionally, since Chinese is a ''stress shift'' item, stress shift may apply in the phrasal interpretation. 
Thus, category-ambiguous words with a potential for stress shift might contain earlier cues to syntactic category, in the form of a modified stress pattern, than non-stress shift items. Production data show that stress shift patterns do indeed map onto syntactic categories, but only if the second element in the sequence is not right-branching. A comprehension experiment with category-ambiguous material suggests that compound or phrasal prominence patterns and stress shift facilitate syntactic processing. A second comprehension experiment replicates this effect and extends the investigation to non-ambiguous material such as Torquay College. In non-ambiguous material, again, phrasal and compound stress appear to affect processing, but stress shift does not. RP GRABE, E (reprint author), UNIV CAMBRIDGE, DEPT LINGUIST, SIDGWICK AVE, CAMBRIDGE CB3 9DA, ENGLAND. CR Bauer Laurie, 1983, ENGLISH WORD FORMATI Crystal D., 1969, PROSODIC SYSTEMS INT Fudge E., 1984, ENGLISH WORD STRESS Giegerich H., 1985, METRICAL PHONOLOGY P GRABE E, 1993, IN PRESS LABORATORY, V4 Gussenhoven C., 1991, PHONOLOGY, V8, P1, DOI 10.1017/S0952675700001263 HAYES B, 1984, LINGUIST INQ, V15, P33 HOGG R, 1987, METRICAL PHONOLOGY KELLY MH, 1992, PSYCHOL REV, V99 KIPARSKY P, 1979, LINGUIST INQ, V10, P421 LADD DR, 1993, SEP P ESCA WORKSH PR, P10 LIBERMAN M, 1977, LINGUIST INQ, V8, P249 MARSLENWILSON WD, 1992, Q J EXP PSYCHOL-A, V45, P73 Matthews Peter H., 1974, MORPHOLOGY NESPOR MA, 1983, PROSODIC PHONOLOGY Radford A, 1988, TRANSFORMATIONAL GRA SELKIRK E., 1984, PHONOLOGY SYNTAX SHATTUCKHUFNAGE.S, 1991, 12TH P INT C PHON SC, V4, P266 SHATTUCKHUFNAGE.S, IN PRESS J PHONETICS Tyler L. K., 1982, J SEMANT, V1, P297, DOI 10.1093/jos/1.3-4.297 TYLER LK, 1977, J VERB LEARN VERB BE, V16, P683, DOI 10.1016/S0022-5371(77)80027-3 NR 21 TC 6 Z9 6 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD OCT PY 1994 VL 15 IS 1-2 BP 101 EP 114 DI 10.1016/0167-6393(94)90045-0 PG 14 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PT334 UT WOS:A1994PT33400009 ER PT J AU BANEL, MH BACRI, N AF BANEL, MH BACRI, N TI ON METRICAL PATTERNS AND LEXICAL PARSING IN FRENCH SO SPEECH COMMUNICATION LA English DT Article DE SPEECH PERCEPTION; METRICAL PATTERN; LENGTHENING; RHYTHM; LEXICAL PARSING ID SPEECH; SYLLABLES; SEGMENTATION; ACCESS; FREQUENCY; PROSODY; RHYTHM AB Previous research suggested that rhythmic expectations could play a role in languages contrasting stressed syllables with unstressed ones, whereas languages without such a contrast and with a clear syllabic structure, such as French, would be processed according to a syllable-based procedure. Lexical parsing of bisyllabic words composed of two monosyllabic words is studied. Two experiments examine the effects of usual and reverse metrical patterns on segmentation. The usual iambic pattern produces, more often than not, recognition of bisyllables whereas lexical parsing is not influenced by monosyllable frequency and syllabic structure. The trochaic pattern strongly increases the amount of segmentation. In Experiment 2, focusing subjects' attention on the timing structure strengthens these effects. Consequently, French subjects use a metrical segmentation strategy. By contrast, the processing of spondees (Experiment 3) shows an effect of structural parameters on parsing and suggests the use of a syllable-based segmentation procedure when rhythmic information is absent. Implications for speech recognition models are discussed. C1 UNIV PARIS 05, PSYCHOL EXPTL LAB, CNRS, URA 316, 28 RUE SERPENTE, F-75270 PARIS 06, FRANCE. 
CR [Anonymous], 1971, TRESOR LANGUE FRANCA BALOTA DA, 1984, J EXP PSYCHOL HUMAN, V10, P340, DOI 10.1037/0096-1523.10.3.340 CLUFF MS, 1990, J EXP PSYCHOL HUMAN, V16, P551 CUTLER A, 1990, ACL MIT NAT, P105 CUTLER A, 1988, J EXP PSYCHOL HUMAN, V14, P113, DOI 10.1037/0096-1523.14.1.113 CUTLER A, 1986, J MEM LANG, V25, P385, DOI 10.1016/0749-596X(86)90033-1 CUTLER A, 1984, ATTENTION PERFORM, V10, P183 CUTLER A, 1986, LANG SPEECH, V29, P201 DAUER RM, 1983, J PHONETICS, V11, P51 Dell F., 1984, FORME SONORE LANGAGE, P65 DICHRISTO A, IN PRESS INTONATION DUPOUX E, 1990, J MEM LANG, V29, P316, DOI 10.1016/0749-596X(90)90003-I FLETCHER J, 1991, J PHONETICS, V19, P193 FONAGY I, 1980, ACCENT FRANCAIS COMT, V15, P123 GAUVAIN JL, 1986, P IEEE INT C ACOUST GROSJEAN F, 1987, COGNITION, V25, P135, DOI 10.1016/0010-0277(87)90007-2 Hirst D. J., 1983, PROSODY MODELS MEASU, P93 HIRST DJ, IN PRESS INTONATION MARTIN JG, 1972, PSYCHOL REV, V79, P487, DOI 10.1037/h0033467 MARTIN P, 1987, LINGUISTICS, V25, P925, DOI 10.1515/ling.1987.25.5.925 MEHLER J, 1981, J VERB LEARN VERB BE, V20, P298, DOI 10.1016/S0022-5371(81)90450-3 NOOTEBOOM SG, 1978, STUDIES PERCEPTION L NORRIS D, 1988, PERCEPT PSYCHOPHYS, V43, P541, DOI 10.3758/BF03207742 PASDELOUP V, 1988, 17EMES ACT JOURN ET PASDELOUP V, 1992, 19EMES ACT JOURN ET PITT MA, 1990, J EXP PSYCHOL HUMAN, V16, P564 RIETVELD ACM, 1980, LANG SPEECH, V23, P289 ROSSI M, 1972, PAPERS LINGUISTICS P, P435 ROSSI M, 1993, SPEECH COMMUN, V13, P87, DOI 10.1016/0167-6393(93)90062-P SEBASTIANGALLES N, 1992, J MEM LANG, V31, P18, DOI 10.1016/0749-596X(92)90003-G SEGUI J, 1984, ATTENTION PERFORMANC, V10 Vaissiere J., 1991, WENNERGREN INT S SER, V59, P108 WENK BJ, 1982, J PHONETICS, V10, P193 Winer B. J., 1971, STATISTICAL PRINCIPL NR 34 TC 17 Z9 17 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD OCT PY 1994 VL 15 IS 1-2 BP 115 EP 126 DI 10.1016/0167-6393(94)90046-9 PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PT334 UT WOS:A1994PT33400010 ER PT J AU BARBOSA, P BAILLY, G AF BARBOSA, P BAILLY, G TI CHARACTERIZATION OF RHYTHMIC PATTERNS FOR TEXT-TO-SPEECH SYNTHESIS SO SPEECH COMMUNICATION LA English DT Article DE PERCEPTUAL-CENTER; RHYTHM; PAUSE GENERATION; DURATION; PROSODY; SPEECH SYNTHESIS ID SEGMENTAL DURATION; FRENCH; ENGLISH; MODEL AB This article proposes an alternative rhythmic unit to the syllable: the inter-perceptual-center group (IPCG). This group is delimited by events which can be detected using only acoustic correlates (Pompino-Marschall, 1989). The rhythmic patterns for French are described using this characterisation: we show that realisation of accents is gradual over the trailed accentual group and that this gradual lengthening is needed for perception. A model of repartition of the IPCG duration among its segmental constituents incorporating automatic generation of pauses (emergence and duration) according to speech rate is then described. RP BARBOSA, P (reprint author), UNIV STENDHAL, ENSERG,INPG,INST COMMUN PARLEE,CNRS,URA 368, 46 AV FELIX VIALLET, F-38031 GRENOBLE 1, FRANCE. CR Allen G. D., 1975, J PHONETICS, V3, P75 AUBERGE V, 1992, TALKING MACHINES THE, P307 BAILLY G, 1989, SPEECH COMMUN, V8, P137, DOI 10.1016/0167-6393(89)90040-X BAILLY G, 1992, TALKING MACHINES THE, P323 BARBOSA P, 1992, 4TH RHYTHM WORKSH RH, P163 BARBOSA P, 1992, 19E J ET PAR, P357 BARTKOVA K, 1987, SPEECH COMMUN, V6, P245, DOI 10.1016/0167-6393(87)90029-X Campbell W. 
N., 1992, TALKING MACHINES THE, P211 CAMPBELL WN, 1991, J PHONETICS, V19, P37 Classe A, 1939, RHYTHM ENGLISH PROSE Duez D, 1987, THESIS U PROVENCE FANT G, 1989, SPEECH TRANSMISSION, V2, P1 Fraisse P., 1974, PSYCHOL RYTHME HAUPTMANN AG, 1993, SPEAKEZ 1ST EXPT CON, V3, P1701 HIRST D, 1993, SEP P ESCA WORKSH PR, P32 HIRST DJ, IN PRESS INTONATION HOWELL P, 1988, PERCEPT PSYCHOPHYS, V43, P90, DOI 10.3758/BF03208978 JORDAN MI, 1990, ATTENTION PERFORMANC, V13 KLATT DH, 1976, J ACOUST SOC AM, V59, P1208, DOI 10.1121/1.380986 LEA WA, 1974, PX10791 DSD SPERR UN Lehiste I., 1977, J PHONETICS, V5, P253 LJOLJE A, 1986, IEEE T ACOUST SPEECH, V34, P1074, DOI 10.1109/TASSP.1986.1164948 MARCUS SM, 1976, THESIS CAMBRIDGE U MONNIN P, 1993, ANN PSYCHOL, V93, P9 MOULINES E, 1992, TALKING MACHINES THE, P7 NOOTEBOOM S, 1991, 12TH P ICPS FRANC, P228 OCONNOR JD, 1965, 2 U COLL PHON LAB PR OSHAUGHNESSY D, 1984, J ACOUST SOC AM, V76, P1664, DOI 10.1121/1.391613 FANT G, 1991, J PHONETICS, V19, P351 PASDELOUP V, 1992, 19ES ACT J ET PAR BR, P531 PIERREHUMBERT J, 1981, J ACOUST SOC AM, V70, P985, DOI 10.1121/1.387033 Pierrehumbert J, 1980, THESIS MIT Pike K. L., 1945, INTONATION AM ENGLIS POMPINOMARSCHALL B, 1989, J PHONETICS, V17, P175 SAGISAKA Y, 1990, IEEE INT C ACOUST SP, V1, P325 Scordilis M. S., 1989, ICASSP-89: 1989 International Conference on Acoustics, Speech and Signal Processing (IEEE Cat. No.89CH2673-2), DOI 10.1109/ICASSP.1989.266404 SEMJAN A, 1992, 4 WORKSH RHYTHM PERC, P73 SHEN Y, 1962, STUD LINGUIST, V9, P1 Takeda K., 1992, TALKING MACHINES, P93 t'Hart J., 1973, J PHONETICS, V1, P309 Touati P., 1987, STRUCTURES PROSODIQU Traber C, 1992, TALKING MACHINES THE, P287 TURVEY M, 1990, HASKINS LABORATORIES, P231 VAISSIERE J, 1980, ANN SCUOLA NORMALE S, V2, P529 Van Santen J. P. 
H., 1990, Computer Speech and Language, V4, DOI 10.1016/0885-2308(90)90016-Y VIVIANI P, 1991, TUTORIALS MOTOR BEHA, V2 WENK BJ, 1982, J PHONETICS, V10, P193 WIGHTMAN CW, 1992, J ACOUST SOC AM, V91, P1707, DOI 10.1121/1.402450 NR 48 TC 12 Z9 12 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1994 VL 15 IS 1-2 BP 127 EP 137 DI 10.1016/0167-6393(94)90047-7 PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PT334 UT WOS:A1994PT33400011 ER PT J AU PREVOST, S STEEDMAN, M AF PREVOST, S STEEDMAN, M TI SPECIFYING INTONATION FROM CONTEXT FOR SPEECH SYNTHESIS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH SYNTHESIS; PROSODY; INTONATION STRUCTURE; INFORMATION STRUCTURE; COMBINATORY CATEGORIAL GRAMMAR; GENERATION; DISCOURSE CONTEXT AB This paper presents a theory and a computational implementation for generating prosodically appropriate synthetic speech in response to database queries. Proper distinctions of contrast and emphasis are expressed in an intonation contour that is synthesized by rule under the control of a grammar, a discourse model and a knowledge base. The theory is based on Combinatory Categorial Grammar, a formalism which easily integrates the notions of syntactic constituency, semantics, prosodic phrasing and information structure. Results from our current implementation demonstrate the system's ability to generate a variety of intonational possibilities for a given sentence depending on the discourse context. RP PREVOST, S (reprint author), UNIV PENN, DEPT COMP & INFORMAT SCI, PHILADELPHIA, PA 19104 USA. CR Beckman M. E., 1986, PHONOLOGY YB, V3, P255, DOI 10.1017/S095267570000066X BIRD S, 1991, DECLARATIVE PERSPECT, V7, P139 Davis J. R., 1988, 26th Annual Meeting of the Association for Computational Linguistics. 
Proceedings of the Conference HIRSCHBERG J, 1990, PROCEEDINGS : EIGHTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOLS 1 AND 2, P952 HOUGHTON G, 1986, THESIS U SUSSEX Isard S. D., 1988, 7th FASE Symposium. Proceedings Speech '88 LIBERMAN M, 1985, TM1122585073111 AT T MOORTGAT M, 1989, CATEGORIAL INVESTIGA PIERREHUMBERT J, 1990, SYS DEV FDN, P271 Pierrehumbert J. B., 1980, THESIS MIT BLOOMINGT PREVOST S, 1993, THESIS U PENNSYLVANI PREVOST S, 1993, 3RD P EUR C SPEECH C, P2013 PREVOST S, 1993, UNPUB GENERATING INT PREVOST S, 1993, 6TH P C EUR CHAPT AS, P332 STEEDMAN M, 1990, 28TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, P9 STEEDMAN M, 1991, LANGUAGE, V67, P260, DOI 10.2307/415107 STEEDMAN M, 1991, 29TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS : PROCEEDINGS OF THE CONFERENCE, P71 Steedman M., 1991, Natural Language and Speech Symposium Proceedings (EUR 14073) STEEDMAN M, 1987, NAT LANG LINGUIST TH, V5, P403, DOI 10.1007/BF00134555 STEEDMAN MJ, 1990, LINGUIST PHILOS, V13, P207, DOI 10.1007/BF00630734 TERKEN JMB, 1984, LANG SPEECH, V27, P269 Webber B. L., 1992, Artificial Intelligence in Medicine, V4, DOI 10.1016/0933-3657(92)90051-P Wheeler Deirdre, 1988, CATEGORIAL GRAMMARS, P349 YOUNG SJ, 1979, J ACOUST SOC AM, V66, P685, DOI 10.1121/1.383695 ZACHARSKI R, 1993, UNPUB BRIDGE BASIC R NR 25 TC 28 Z9 28 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD OCT PY 1994 VL 15 IS 1-2 BP 139 EP 153 DI 10.1016/0167-6393(94)90048-5 PG 15 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PT334 UT WOS:A1994PT33400012 ER PT J AU KOMPE, R NOTH, E KIESSLING, A KUHN, T MAST, M NIEMANN, H OTT, K BATLINER, A AF KOMPE, R NOTH, E KIESSLING, A KUHN, T MAST, M NIEMANN, H OTT, K BATLINER, A TI PROSODY TAKES OVER - TOWARDS A PROSODICALLY GUIDED DIALOG SYSTEM SO SPEECH COMMUNICATION LA English DT Article DE DIALOG; SENTENCE MODALITY; PROSODY; SPEECH UNDERSTANDING AB The domain of the speech recognition and dialog system EVAR is train time table inquiry. We observed that in real human-human dialogs when the officer transmits the information, the customer very often interrupts. Many of these interruptions are just repetitions of the time of day given by the officer. The functional role of these interruptions is often determined by prosodic cues only. An important result of experiments where naive persons used the EVAR system is that it is hard to follow the train connection given via speech synthesis. In this case it is even more important than in human-human dialogs that the user has the opportunity to interact during the answer phase. Therefore we extended the dialog module to allow the user to repeat the time of day and we added a prosody module guiding the continuation of the dialog by analyzing the intonation contour of this utterance. C1 UNIV MUNICH, INST DEUTSCH PHILOL, D-80799 MUNICH, GERMANY. RP KOMPE, R (reprint author), UNIV ERLANGEN NURNBERG, LEHRSTUHL MUSTERERKENNUNG INFORMAT 5, MARTENSSTR 3, D-91058 ERLANGEN, GERMANY. 
CR BAKENECKER G, 1994, IN PRESS P INT C SPO BATLINER A, 1992, FORTSCHRITTE AKUSTIK, VB, P541 BATLINER A, 1993, WORKING PAPERS, V41, P112 BATLINER A, 1994, NEW ADV TRENDS SPEEC BUTZBERGER J, 1992, SPEECH NATURAL LANGU DALY N, 1992, INT C SPOKEN LANGUAG, V1, P763 DALY N, 1990, INT C SPOKEN LANGUAG, P497 HIERONYMOUS JL, 1992, P ICASSP SAN FRANC M, V1, P225 HITZENBERGER L, 1989, P EUROPEAN C SPEECH, V2, P597 HITZENBERGER L, 1986, FACID FACHSPRACHLICH HUBER D, 1989, P INT C AC SPEECH SI, P600 KENNY P, 1991, P EUROPEAN C SPEECH, V2, P655 KIESSLING A, 1994, IN PRESS INT C SPOK KIESSLING A, 1992, P INT C ACOUST SPEEC, V2, P17 KOMPE R, 1994, IN PRESS PROGR PROSP KOMPE R, 1994, P INT C AC SPEECH SI, V2, P173 KRAUSE J, 1990, ENDENBERICHT BMFT PR Lea W., 1980, TRENDS SPEECH RECOGN, P166 MAST M, 1992, INT C SPOKEN LANGUAG, V2, P1573 MAST M, 1994, IEEE T PATTERN ANAL, V16, P179, DOI 10.1109/34.273733 MAST M, 1993, DIALOGMODUL SPRACHER, V50 NIEMANN H, 1990, IEEE T PATTERN ANAL, V12, P883, DOI 10.1109/34.57683 NOTH E, 1988, MUSTERKENNUNG 1988, V180, P2 NOTH E, 1991, PROSODISCHE INFORMAT OSHAUGHNESSY D, 1992, INT C SPOKEN LANGUAG, V2, P931 Ostendorf M., 1993, Computer Speech and Language, V7, DOI 10.1006/csla.1993.1010 PRICE P, 1990, INT C SPOKEN LANGUAG, V1, P13 ROBINSON T, 1990, INT C SPOKEN LANGUAG, V2, P1033 SCHUKATTALAMAZZ.E, 1993, P EUROPEAN C SPEECH, V1, P111 SHRIBERG E, 1992, DARPA SPEECH NATURAL SHRIBERG E, 1992, INT C SPOKEN LANGUAG, V2, P991 Singer H., 1992, P ICASSP, V1, P273 VAISSIERE J, 1988, NATO ASI SERIES F, P71 VEILLEUX N, 1990, P INT C ACOUST SPEEC, V2, P777 Wahlster W., 1993, P 3 EUR C SPEECH COM, P29 Waibel A., 1988, PROSODY SPEECH RECOG Wang M. Q., 1992, Computer Speech and Language, V6, DOI 10.1016/0885-2308(92)90025-Y WIGHTMAN CW, 1992, J ACOUST SOC AM, V91, P1707, DOI 10.1121/1.402450 NR 38 TC 5 Z9 5 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD OCT PY 1994 VL 15 IS 1-2 BP 155 EP 167 DI 10.1016/0167-6393(94)90049-3 PG 13 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PT334 UT WOS:A1994PT33400013 ER PT J AU TAYLOR, P AF TAYLOR, P TI THE RISE FALL CONNECTION MODEL OF INTONATION SO SPEECH COMMUNICATION LA English DT Article DE INTONATION; F(0) ANALYSIS; F(0) SYNTHESIS ID PITCH AB This paper describes a new model of intonation for English. The paper proposes that intonation can be described using a sequence of rise, fall and connection elements. Pitch accents and boundary rises are described using rise and fall elements, and connection elements are used to describe everything else. Equations can be used to synthesize fundamental frequency (F0) contours from these elements. An automatic labelling system is described which can derive a rise/fall/connection description from any utterance without using prior knowledge or top-down processing. Synthesis and analysis experiments are described using utterances from six speakers of various English accents. An analysis/resynthesis experiment is described which shows that the contours produced by the model are similar to within 3.6 to 7.3 Hz of the originals. An assessment of the automatic labeller shows 72% to 92% agreement between automatic and hand labels. The paper concludes with a comparison between this model and others, and a discussion of the practical applications of the model. C1 UNIV EDINBURGH, CTR SPEECH TECHNOL RES, EDINBURGH EH8 9YL, MIDLOTHIAN, SCOTLAND. ATR INTERPRETING TELECOMM UN LABS, KYOTO, JAPAN. RP TAYLOR, P (reprint author), UNIV EDINBURGH, HUMAN COMMUN RES CTR, 2 BUCCLEUCH PL, EDINBURGH EH8 9LW, SCOTLAND. CR ANDERSON MD, 1984, IEEE INT C ACOUST SP 't Hart J., 1975, J PHONETICS, V3, P235 BAGSHAW PC, 1993, P EUROSPEECH 93 BERL Beckman M. 
E., 1986, PHONOLOGY YB, V3, P255, DOI 10.1017/S095267570000066X BLACK AW, 1994, COLING 94 BLACK AW, 1994, SPR M AC SOC JAP Crystal D., 1969, PROSODIC SYSTEMS INT FUJISAKI H, 1988, IEEE INT C ACOUST SP GEOFFROIS E, 1993, P EUROSPEECH 93 BERL GRICE ML, 1992, THESIS U COLLEGE LON Halliday M. A. K., 1967, INTONATION GRAMMAR B Hess W., 1983, PITCH DETERMINATION HIRST D, 1992, TALKING MACHINES ISARD SD, 1988, SPEECH 88 JENSEN U, 1993, P EUROSPEECH 93 BERL LADD DR, 1988, J ACOUST SOC AM, V84, P530, DOI 10.1121/1.396830 LADD DR, 1987, EUROPEAN C SPEECH TE LADD DR, 1983, LANGUAGE, V59, P721, DOI 10.2307/413371 LADD D R, 1984, Phonetica, V41, P31 Liberman M, 1984, LANGUAGE SOUND STRUC MEDAN Y, 1991, IEEE T SIGNAL PROCES, V39, P40, DOI 10.1109/78.80763 O'Connor John D., 1973, INTONATION COLLOQUIA Pierrehumbert J, 1980, THESIS MIT RABINER LR, 1976, IEEE T ACOUST SPEECH, V24, P399, DOI 10.1109/TASSP.1976.1162846 SILVERMAN K, 1990, PAPERS LABORATORY PH SILVERMAN K, 1992, INT C SPEECH LANGUAG t'Hart J., 1973, J PHONETICS, V1, P309 Taylor P. A., 1992, THESIS U EDINBURGH TAYLOR PA, 1993, P ESCA WORKSHOP PROS TAYLOR PA, 1993, P EUROSPEECH 93 BERL TAYLOR TA, 1994, 2ND ESCA IEEE WORKSH VONWILLER JP, 1990, SST 90 WILLEMS N, 1988, J ACOUST SOC AM, V84, P1250, DOI 10.1121/1.396625 WILLEMS NJ, 1983, MODEL STANDARD ENGLI WODD CA, 1992, ATR INTERPRETING TEL NR 35 TC 18 Z9 18 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD OCT PY 1994 VL 15 IS 1-2 BP 169 EP 186 DI 10.1016/0167-6393(94)90050-7 PG 18 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PT334 UT WOS:A1994PT33400014 ER PT J AU WALDSTEIN, RS BOOTHROYD, A AF WALDSTEIN, RS BOOTHROYD, A TI SPEECHREADING ENHANCEMENT USING A SINUSOIDAL SUBSTITUTE FOR VOICE FUNDAMENTAL-FREQUENCY SO SPEECH COMMUNICATION LA English DT Article ID SENTENCES; PERCEPTION; CUES; AID; SEGMENTATION; RECOGNITION; DURATION; PITCH AB This study compared the effectiveness of three acoustic supplements to speechreading: the low-pass-filtered output of an electroglottograph, a variable-frequency sinusoidal substitute for voice fundamental frequency (F0), and a constant-frequency sinusoidal substitute that served as a representation of voicing. Both sinusoidal signals were synthesized at constant amplitude during periods of voicing. The sinusoidal signals were prepared off-line by a combination of automatic and manual estimation of the F0 contours of video-recorded sentences. These signals were then resynchronized with the audio portions of the original recording. In 12 normally-hearing adults, the electroglottograph signal and the variable-frequency sinusoidal F0 substitute both increased the number of words recognized in sentences of known topic by between 30 and 35 percentage points. The magnitude of this effect was greater for longer sentences but independent of basic speechreading ability. The constant-frequency substitute provided a 13 percentage point increase, suggesting that approximately one-third of the F0 speechreading enhancement effect could be accounted for by voicing detection alone. RP WALDSTEIN, RS (reprint author), CUNY, GRAD CTR, CTR RES SPEECH & HEARING SCI, 33 W 42ND ST, NEW YORK, NY 10036 USA. 
CR ASCHKENASY E, 1974, APR IEEE S SPEECH RE, P288 BERNSTEIN LE, 1989, J ACOUST SOC AM, V85, P397, DOI 10.1121/1.397690 BOOTHROYD A, 1988, J ACOUST SOC AM, V84, P101, DOI 10.1121/1.396976 BOOTHROYD A, 1988, EAR HEARING, V9, P306 BREEUWER M, 1986, J ACOUST SOC AM, V79, P481, DOI 10.1121/1.393536 CUTLER A, 1988, J EXP PSYCHOL HUMAN, V14, P113, DOI 10.1037/0096-1523.14.1.113 CUTLER A, 1992, J MEM LANG, V31, P218, DOI 10.1016/0749-596X(92)90012-M EBERHARDT SP, 1990, J ACOUST SOC AM, V88, P1274, DOI 10.1121/1.399704 GEERS AE, 1978, J EXP PSYCHOL HUMAN, V4, P273, DOI 10.1037//0096-1523.4.2.273 GRANT KW, 1986, EAR HEARING, V7, P328, DOI 10.1097/00003446-198610000-00008 GRANT KW, 1985, J ACOUST SOC AM, V77, P671, DOI 10.1121/1.392335 HAGGARD M, 1970, J ACOUST SOC AM, V47, P613, DOI 10.1121/1.1911936 HANIN L, 1988, EAR HEARING, V9, P335, DOI 10.1097/00003446-198812000-00010 HARRIS CM, 1963, J ACOUST SOC AM, V35, P339, DOI 10.1121/1.1918463 CHILDERS DG, 1990, J SPEECH HEAR RES, V33, P245 HNATHCHISOLM T, 1992, J SPEECH HEAR RES, V35, P1160 Jeffers J., 1971, SPEECHREADING LIPREA LEHISTE I, 1976, J ACOUST SOC AM, V60, P1199, DOI 10.1121/1.381180 MACLEOD A, 1987, British Journal of Audiology, V21, P131, DOI 10.3109/03005368709077786 MACLEOD A, 1990, British Journal of Audiology, V24, P29, DOI 10.3109/03005369009077840 MCGRATH M, 1985, J ACOUST SOC AM, V77, P678, DOI 10.1121/1.392336 NAKATANI LH, 1981, PHONETICA, V38, P84 NAKATANI LH, 1978, J ACOUST SOC AM, V63, P234, DOI 10.1121/1.381719 OSHAUGHNESSY D, 1979, J PHONETICS, V7, P119 OSHAUGHNESSY D, 1983, J ACOUST SOC AM, V74, P1155, DOI 10.1121/1.390039 PRICE PJ, 1991, J ACOUST SOC AM, V90, P2956, DOI 10.1121/1.401770 RISBERG A, 1978, SPEECH TRANSMISSION, V1, P1 RISBERG A, 1974, SCAND AUDIOL S, V4, P153 Rosen S, 1987, J Rehabil Res Dev, V24, P239 ROSEN SM, 1981, NATURE, V291, P150, DOI 10.1038/291150a0 SCOTT DR, 1982, J ACOUST SOC AM, V71, P996, DOI 10.1121/1.387581 STREETER LA, 1978, J ACOUST SOC AM, V64, P1582, DOI 
10.1121/1.382142 SUMMERFIELD Q, 1991, MODULARITY AND THE MOTOR THEORY OF SPEECH PERCEPTION, P117 TONG YC, 1980, J ACOUST SOC AM, V68, P1897, DOI 10.1121/1.385184 WALLIKER JR, 1985, COCHLEAR IMPLANTS, P143 YEUNG E, 1988, EAR HEARING, V9, P342 YEUNG E, 1992, LASER DISC SENTENCE NR 37 TC 5 Z9 5 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD SEP PY 1994 VL 14 IS 4 BP 303 EP 312 DI 10.1016/0167-6393(94)90024-8 PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PH839 UT WOS:A1994PH83900001 ER PT J AU NAKAJIMA, S AF NAKAJIMA, S TI AUTOMATIC SYNTHESIS UNIT GENERATION FOR ENGLISH SPEECH SYNTHESIS BASED ON MULTILAYERED CONTEXT ORIENTED CLUSTERING SO SPEECH COMMUNICATION LA English DT Article AB In this paper, we propose a new synthesis unit learning method aiming at multi-lingual speech synthesis and describe its application to English speech synthesis. The method termed Multi-Layered Context Oriented Clustering (ML-COC) is a generalized framework of the COC method which has been applied to Japanese speech synthesis. The conventional COC method produces a set of phonetic context dependent units through a cluster splitting process. In ML-COC, the notion of context is generalized and the factors other than phonetic context, such as stressing and syntactical boundaries, are taken into account to capture the richer phoneme variations of English. A synthesis unit generation experiment shows that ML-COC produces about three times as many synthesis units as the conventional COC (Single-Layered COC: SL-COC) method, and the average intra-cluster variance of ML-COC units is 20% lower than that of SL-COC. These results suggest that the ML-COC synthesis units reflect the phonological structure of English much more appropriately than do the SL-COC units. 
To validate the effectiveness of the ML-COC method, we conducted preference experiments using synthesized speech. The preference test exposed 10 subjects to 52 sentences. The ML-COC method was preferred over the conventional SL-COC method by a score of 70% to 30%. RP NAKAJIMA, S (reprint author), NIPPON TELEGRAPH & TEL PUBL CORP, HUMAN INTERFACE LABS, SPEECH & ACOUST LAB, 1-2356 TAKE, YOKOSUKA, KANAGAWA 23803, JAPAN. CR Charpentier F. J., 1986, ICASSP 86 Proceedings. IEEE-IECEJ-ASJ International Conference on Acoustics, Speech and Signal Processing (Cat. No.86CH2243-4) FUJIMURA O, 1976, J ACOUST SOC AM S, V1, P59 HAKODA S, 1990, P INT C SPOKEN LANG, P809 ITOH K, 1994, P IEEE INT C ACOUST, V64 KLATT DK, 1979, FRONTIERS SPEECH COM NAKAJIMA, 1988, IEEE T ACOUST SPEECH, P659 NAKAJIMA, 1992, SP929, P17 NAKAJIMA, 1989, T I ELECTRON INF COM, V72, P1174 NAKAJIMA HH, 1986, P FALL M ACOUST SOC Olive J. P., 1977, ICASSP, P568 RUOCOS, 1982, IEEE T ACOUST SPEECH, P1565 SAGAYAMA, 1989, IEEE T ACOUST SPEECH, P397 SELKIRK E., 1984, PHONOLOGY SYNTAX SHIRAKI Y, 1988, IEEE T ACOUST SPEECH, V36, P1437, DOI 10.1109/29.90372 SUGAMURA N, 1981, T IEICE J A, V64, P323 WANG WJ, 1993, P IEEE INT C ACOUST NR 16 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD SEP PY 1994 VL 14 IS 4 BP 313 EP 324 DI 10.1016/0167-6393(94)90025-6 PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PH839 UT WOS:A1994PH83900002 ER PT J AU BOND, ZS MOORE, TJ AF BOND, ZS MOORE, TJ TI A NOTE ON THE ACOUSTIC-PHONETIC CHARACTERISTICS OF INADVERTENTLY CLEAR SPEECH SO SPEECH COMMUNICATION LA English DT Article ID CUES AB Even when speaking conditions and listener responses are very tightly controlled, some talkers are easier to understand than others. 
In a series of intelligibility tests using both English words without context and English sentences, both native and non-native listeners found one of five talkers difficult to understand. Since all talkers read similar materials and the task was the same for all listeners, the differences in intelligibility must have resulted from particular phonetic characteristics used by the talker. Spectrograms were made of all test words produced by the talkers and compared on selected acoustic-phonetic properties. In comparison with the more intelligible talkers, the least intelligible talkers produced test words at shorter durations; abbreviated vowel durations; used the least differentiated vowel space, as defined by the first two formants; used minimal cues for consonantal contrasts; and had the most varied amplitude of stressed vowels. These characteristics are similar to those distinguishing deliberately clear speech. That non-native and native listeners found the same talker difficult to understand suggests that the effect of clear speech, though different in degree, reflects the use of the same acoustic-phonetic information by both groups of listeners. Further, clear speech has similar acoustic-phonetic characteristics whether deliberately or inadvertently produced. C1 ARMSTRONG LAB, DIV BIODYNAM & BIOCOMMUN, WRIGHT PATTERSON AFB, OH USA. RP BOND, ZS (reprint author), OHIO UNIV, DEPT LINGUIST, 103 GORDY HALL, ATHENS, OH 45701 USA. CR BOND ZS, 1994, LISTENING 2ND LANGUA BOND ZS, 1989, J ACOUST SOC AM, V85, P907, DOI 10.1121/1.397563 Clark J. 
E., 1988, LANGUAGE TOPICS ESSA, P161 CUTLER A, 1991, SPEECH COMMUN, V10, P335, DOI 10.1016/0167-6393(91)90002-B CUTLER A, 1990, SPEECH COMMUN, V9, P485, DOI 10.1016/0167-6393(90)90024-4 EGAN JP, 1944, ARTICULATION TESTING HOOD JD, 1980, AUDIOLOGY, V19, P434 HOUSE AS, 1965, J ACOUST SOC AM, V37, P158, DOI 10.1121/1.1909295 LINDBLOM B, 1990, NATO ADV SCI I D-BEH, V55, P403 PICHENY MA, 1986, J SPEECH HEAR RES, V29, P434 SHULMAN R, 1989, J ACOUST SOC AM, V85, P295 Summers W V, 1988, J Acoust Soc Am, V84, P917, DOI 10.1121/1.396660 NR 12 TC 45 Z9 46 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD SEP PY 1994 VL 14 IS 4 BP 325 EP 337 DI 10.1016/0167-6393(94)90026-4 PG 13 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PH839 UT WOS:A1994PH83900003 ER PT J AU QIAN, YS DYCHAHINE, G KABAL, P AF QIAN, YS DYCHAHINE, G KABAL, P TI PSEUDO-MULTI-TAP PITCH FILTERS IN A LOW BIT-RATE CELP SPEECH CODER SO SPEECH COMMUNICATION LA English DT Article AB The pitch filter in a low bit-rate CELP speech coder has a strong impact on the quality of the reconstructed speech. In this paper we propose a pseudo-multi-tap pitch filter with fewer degrees of freedom than the number of prediction coefficients, but which gives a higher pitch prediction gain and a more appropriate frequency response than a conventional one-tap pitch filter. First, we present an analysis model for the pseudo-multi-tap pitch prediction filter. Then, we introduce a pseudo-multi-tap pitch prediction filter with a fractional pitch lag. The prediction gain of the pseudo-multi-tap pitch filter is compared to that of conventional one-tap and three-tap pitch filters with integer and non-integer pitch lags. A switching configuration is also studied. This filter switches modes depending on the prediction gain. 
The stability of a pseudo-multi-tap pitch synthesis filter in a CELP coder is considered. We propose a stabilization method with a relaxed stability test. This relaxed test gives better results than a strict stability test. Finally, we have incorporated the pseudo-multi-tap pitch filter into a 4.8 kbit/s CELP speech coder. Both the objective SNR and subjective quality are better than for a conventional one-tap pitch filter. C1 MCGILL UNIV, DEPT ELECT ENGN, TELECOMMUN & SIGNAL PROC LAB, 3480 UNIV ST, MONTREAL H3A 2A7, QUEBEC, CANADA. CR Campbell J., 1990, Stilt, P58 Crochiere R. E., 1983, MULTIRATE DIGITAL SI IYENGAR V, 1991, IEEE T SIGNAL PROCES, V39, P1049, DOI 10.1109/78.80962 KLEIJN WB, 1988, 1988 P IEEE INT C AC, P155 KROON P, 1991, IEEE T SIGNAL PROCES, V39, P733, DOI 10.1109/78.80859 RAMACHANDRAN RP, 1987, IEEE T ACOUST SPEECH, V35, P937, DOI 10.1109/TASSP.1987.1165238 RAMACHANDRAN RP, 1989, IEEE T ACOUST SPEECH, V37, P467, DOI 10.1109/29.17527 SCHROEDER MR, 1985, MAR P INT C AC SPEEC, P937 NR 8 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD SEP PY 1994 VL 14 IS 4 BP 339 EP 358 DI 10.1016/0167-6393(94)90027-2 PG 20 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PH839 UT WOS:A1994PH83900004 ER PT J AU BLAAUW, E AF BLAAUW, E TI THE CONTRIBUTION OF PROSODIC BOUNDARY MARKERS TO THE PERCEPTUAL DIFFERENCE BETWEEN READ AND SPONTANEOUS SPEECH SO SPEECH COMMUNICATION LA English DT Article AB Listeners are able to tell apart read-aloud and spontaneously produced speech. Prosody appears to be important for this perceptual distinction. In this paper, the importance of the distribution and realization of prosodic boundaries is investigated. Recordings were made of five male speakers, spontaneously producing so-called instruction monologues. Transcripts of these monologues were read aloud by the same speakers.
A perception experiment was carried out to obtain classification scores for isolated utterances selected from the spontaneous and read material. Auditory prosodic transcriptions were made of the entire spontaneous and read monologues, assessing the distribution and realization of underlying prosodic boundaries in both speech types. The underlying prosodic structure was assessed by means of an automatic text-to-speech system. Observed differences in the production of prosodic boundaries in the spontaneous and read material are related to the perceptual classification scores by means of a multiple regression analysis. Results show a significant correlation, suggesting that differences in the distribution and realization of prosodic boundaries contribute significantly to the perceptual difference between spontaneous and read speech. RP BLAAUW, E (reprint author), UNIV UTRECHT, LANGUAGE & SPEECH RES INST, TRANS 10, 3512 JK UTRECHT, NETHERLANDS. CR BLAAUW E, 1992, 1992 P INT C SPOK LA, V1, P751 BLAAUW E, 1992, OTS YB 1992, P1 BLAAUW E, 1991, P ESCA ETRW PHONETIC, V12 BOOMER DS, 1965, LANG SPEECH, V8, P148 BOVES L, 1991, J PHONETICS, V19, P25 Bringmann E., 1990, THESIS UTRECHT U Collier R., 1975, STRUCTURE PROCESS SP, P107 Collier R., 1990, PERCEPTUAL STUDY INT Cooper W. E., 1980, SYNTAX SPEECH DEPIJPER JR, 1993, P EUROSPEECH 93 BERL, V2, P1211 DEROOIJ JJ, 1979, THESIS UTRECHT U GEE JP, 1983, COGNITIVE PSYCHOL, V15, P411, DOI 10.1016/0010-0285(83)90014-2 GOLDMANEISLER F, 1958, Q J EXP PSYCHOL, V10, P96, DOI 10.1080/17470215808416261 Goldman-Eisler F., 1968, PSYCHOLINGUISTICS EX HOWELL P, 1991, SPEECH COMMUN, V10, P163, DOI 10.1016/0167-6393(91)90039-V LAAN GPM, 1993, P EUROSPEECH 93 BERL, V1, P569 Levelt W. J., 1989, SPEAKING INTENTION A LEVIN H, 1982, LANG SPEECH, V25, P43 MACLAY H, 1959, WORD, V15, P19 Nespor M., 1986, PROSODIC PHONOLOGY Remez R. 
E., 1985, J ACOUST SOC AM, V77, pS38, DOI 10.1121/1.2022306 SILVERMAN K, 1992, P INT C SPOKEN LANGU, V2, P1299 *SPSS INC, 1988, SPSSX US GUID STRANGERT E, 1990, 12TH P SCAND C LING SWERTS M, 1992, P WORKSHOP PROSODY N, P221 TERKEN JMB, 1984, LANG SPEECH, V27, P269 URIBE C, 1991, P ESCA ETRW PHONETIC, V17 NR 27 TC 26 Z9 27 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD SEP PY 1994 VL 14 IS 4 BP 359 EP 375 DI 10.1016/0167-6393(94)90028-0 PG 17 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PH839 UT WOS:A1994PH83900005 ER PT J AU [Anonymous] AF [Anonymous] TI CAN BABBLING OR EMERGENT LANGUAGE PREDICT LATER LANGUAGE-DEVELOPMENT SO SPEECH COMMUNICATION LA English DT Editorial Material NR 0 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD SEP PY 1994 VL 14 IS 4 BP 381 EP 381 PG 1 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PH839 UT WOS:A1994PH83900006 ER PT J AU [Anonymous] AF [Anonymous] TI THE BIANNUAL MEETING OF THE FRANCOPHONIC-SPEECH-COMMUNICATION-GROUP CELEBRATES ITS 20TH SESSION ON 1-3 JUNE, 1994, TREGASTEL, FRANCE SO SPEECH COMMUNICATION LA English DT Editorial Material NR 0 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD SEP PY 1994 VL 14 IS 4 BP 382 EP 384 PG 3 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA PH839 UT WOS:A1994PH83900007 ER PT J AU DE, A KABAL, P AF DE, A KABAL, P TI AUDITORY DISTORTION MEASURE FOR SPEECH CODER EVALUATION - DISCRIMINATION INFORMATION APPROACH SO SPEECH COMMUNICATION LA English DT Article DE AUDITORY (COCHLEAR) MODEL; DISTORTION MEASURE; RENYI-SHANNON ENTROPY; DISCRIMINATION INFORMATION; RATE DISTORTION FUNCTION ID OBJECTIVE-MEASURE; QUALITY; MODEL; SIGNALS; ENTROPY; EAR AB In this article, we devise a fidelity criterion for quantifying the degree of distortion introduced by a speech coder. An original speech and its coded version are transformed from the time-domain to a perceptual-domain using an auditory (cochlear) model. This perceptual-domain representation provides information pertaining to the probability-of-firings in the neural channels. The introduced cochlear discrimination information (CDI) measure compares these firing probabilities in an information-theoretic sense. In essence, it evaluates the cross-entropy of the neural firings for the coded speech with respect to those for the original one. The performance of this objective measure is compared with subjective evaluation results. Finally, we provide a rate-distortion analysis by computing the rate-distortion function for speech coding using the Blahut algorithm. Four state-of-the-art speech coders with rates ranging from 4.8 kbit/s (CELP) to 32 kbit/s (ADPCM) are studied from the view-point of their performances (as assessed by the CDI measure) with respect to the rate-distortion limits. C1 MCGILL UNIV, DEPT ELECT ENGN, MONTREAL H3A 2A7, QUEBEC, CANADA. UNIV QUEBEC INRS TELECOMMUN, VERDUN H3E 1H6, PQ, CANADA. 
CR ABUT H, 1979, IEEE T INFORM THEORY, V25, P225, DOI 10.1109/TIT.1979.1056024 ACZEL J, 1975, MEASURES INFORMATION ACZEL J, 1978, IEEE T INFORM THEORY, V24, P592, DOI 10.1109/TIT.1978.1055948 Allen J., 1985, IEEE ASSP MAG JAN, P3 ATAL B, 1991, ADV SPEECH CODING Berger T., 1971, RATE DISTORTION THEO BLAHUT RE, 1974, IEEE T INFORM THEORY, V20, P405, DOI 10.1109/TIT.1974.1055254 Blahut RE, 1987, PRINCIPLES PRACTICE BUZO A, 1986, IEEE T INFORM THEORY, V32, P141, DOI 10.1109/TIT.1986.1057167 Campbell J. P. Jr., 1991, Digital Signal Processing, V1, DOI 10.1016/1051-2004(91)90106-U Carlson A.B., 1986, COMMUNICATION SYSTEM CHU PL, 1982, IEEE T ACOUST SPEECH, V30, P545, DOI 10.1109/TASSP.1982.1163930 Coetzee H. J., 1989, ICASSP-89: 1989 International Conference on Acoustics, Speech and Signal Processing (IEEE Cat. No.89CH2673-2), DOI 10.1109/ICASSP.1989.266497 CROCHIERE RE, 1980, IEEE T ACOUST SPEECH, V28, P318, DOI 10.1109/TASSP.1980.1163417 DE A, 1992, 16TH P BIENN S COMM, P419 DE A, 1992, P IEEE GLOBECOM, P452 DE A, 1994, UNPUB SPEECH COMMUNI DE A, 1993, THESIS MCGILL U DENG L, 1992, NEURAL NETWORKS, V5, P19, DOI 10.1016/S0893-6080(05)80004-8 Flanagan J., 1972, SPEECH ANAL SYNTHESI GEISLER CD, 1988, J PHONETICS, V16, P19 GHITZA O, 1987, IEEE T ACOUST SPEECH, V35, P736, DOI 10.1109/TASSP.1987.1165223 GRAY RM, 1980, IEEE T ACOUST SPEECH, V28, P367, DOI 10.1109/TASSP.1980.1163421 GREENBERG S, 1988, J PHONETICS, V16, P139 HALKA U, 1992, SPEECH COMMUN, V11, P15, DOI 10.1016/0167-6393(92)90060-K HALL T, 1980, HEARING RES, V2, P455 HECKER MHL, 1966, J ACOUST SOC AM, V40, P946 Helmholtz H.v., 1954, SENSATIONS TONE Itakura F., 1968, 6TH P INT C AC TOK, p[C17, C] Jayant N. S., 1984, DIGITAL CODING WAVEF KITAWAKI N, 1988, IEEE J SEL AREA COMM, V6, P242, DOI 10.1109/49.601 Kubichek R. 
F., 1991, Digital Signal Processing, V1, DOI 10.1016/1051-2004(91)90094-2 KULLBACK S, 1959, INFORMATION THEORY S LALOU J, 1990, ANN TELECOMMUN, V45, P47 LEE YT, 1991, IEEE T SIGNAL PROCES, V39, P330, DOI 10.1109/78.80815 LIN JH, 1991, IEEE T INFORM THEORY, V37, P145, DOI 10.1109/18.61115 LYON RF, 1986, P IEEE INT C AC SPEE Lyon R.F., 1982, IEEE INT C AC SPEECH, P1282 MCDERMOTT BJ, 1978, BELL SYST TECH J, P1597 MERMELSTEIN P, 1979, J ACOUST SOC AM, V66, P1664, DOI 10.1121/1.383638 MORE BCJ, 1989, INTRO PSYCHOL HEARIN NOLL P, 1974, ZURICH SEMINAR DIG C PAILLARD B, 1992, J AUDIO ENG SOC, V40, P21 PENNER MJ, 1979, J ACOUST SOC AM, V66, P1719, DOI 10.1121/1.383644 Pickles JO, 1982, INTRO PHYSL HEARING QUACKENBUSH, 1988, OBJECTIVE MEASURES S RAO CR, 1985, IEEE T INFORM THEORY, V31, P589, DOI 10.1109/TIT.1985.1057082 Renyi A., 1970, PROBABILITY THEORY Richards D. H., 1964, P IEEE, V52, P941 ROY G, 1991, INT CONF ACOUST SPEE, P17, DOI 10.1109/ICASSP.1991.150268 SACHS MB, 1988, J PHONETICS, V16, P37 SCHROEDE.MR, 1974, J ACOUST SOC AM, V55, P1055, DOI 10.1121/1.1914647 SCHROEDER MR, 1979, J ACOUST SOC AM, V66, P1647, DOI 10.1121/1.383662 SENEFF S, 1988, J PHONETICS, V16, P55 SHAMMA SA, 1985, J ACOUST SOC AM, V78, P1622, DOI 10.1121/1.392800 SLANEY M, 1988, 13 APPL COMP INC TEC TOUSSAINT GT, 1975, IEEE T INFORM THEORY, V21, P99, DOI 10.1109/TIT.1975.1055311 Voiers WD, 1977, IEEE INT C AC SPEECH, P204 WANG SH, 1992, IEEE J SEL AREA COMM, V10, P819, DOI 10.1109/49.138987 NR 59 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUN PY 1994 VL 14 IS 3 BP 205 EP 229 DI 10.1016/0167-6393(94)90063-9 PG 25 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA NU869 UT WOS:A1994NU86900001 ER PT J AU MARQUES, JS ABRANTES, AJ AF MARQUES, JS ABRANTES, AJ TI HYBRID HARMONIC CODING OF SPEECH AT LOW BIT-RATES SO SPEECH COMMUNICATION LA English DT Article DE SPEECH MODELING; SINUSOIDAL MODELING; CODING AB This paper presents a novel approach to sinusoidal coding of speech which avoids the use of a voicing detector. The proposed model represents the speech signal as a sum of sinusoids and bandpass random signals and is referred to as the hybrid harmonic model in this paper. The use of two different sets of basis functions increases the robustness of the model since there is no need to switch between techniques tailored to particular classes of sounds. Sinusoidal basis functions with harmonically related frequencies allow an accurate representation of the quasi-periodic structure of voiced speech but show difficulties in representing unvoiced sounds. On the other hand, the bandpass random functions are well suited for high quality representation of unvoiced speech sounds, since their bandwidth is larger than the bandwidth of sinusoids. The amplitudes of both sets of basis functions are simultaneously estimated by a least squares algorithm and the output speech signal is synthesized in the time domain by the superposition of all basis functions multiplied by their amplitudes. Experimental tests confirm an improved performance of the hybrid model for operation with noise-corrupted input speech, relative to classic sinusoidal models which exhibit a strong dependency on the voicing decision. Finally, the implementation and test of a fully quantized hybrid coder at 4.8 kbit/s is described. C1 INES, ISEL, R ALVES REDOL 9, P-1000 LISBON, PORTUGAL. INESC, IST, P-1000 LISBON, PORTUGAL. RI Marques, Jorge/C-1427-2010 OI Marques, Jorge/0000-0002-3800-7756 CR Abrantes A.
J., 1991, EUROSPEECH 91. 2nd European Conference on Speech Communication and Technology Proceedings ABRANTES A, 1992, EUSIPCO ALMEIDA L, 1984, IEEE INT C ACOUST SP ALMEIDA LB, 1983, IEEE T ACOUST SPEECH, V31, P664, DOI 10.1109/TASSP.1983.1164128 ATAL BS, 1982, IEEE T COMMUN, V30, P600, DOI 10.1109/TCOM.1982.1095501 BEROUTI M, 1979, APR P IEEE INT C AC, P208 BOLL SF, 1979, IEEE T ACOUST SPEECH, V27, P113, DOI 10.1109/TASSP.1979.1163209 CARL H, 1991, IEEE INT C ACOUST SP CARL H, SIGNAL PROCESSING, V6 GERSON IA, 1990, APR INT C AC SPEECH, P461 GRIFFIN DW, 1988, IEEE T ACOUST SPEECH, V36, P1223, DOI 10.1109/29.1651 GRIFFIN DW, 1987, THESIS MIT JAROUDI A, 1991, P ACOUST SPEECH SIGN, V39 KROON P, 1990, INT CONF ACOUST SPEE, P661, DOI 10.1109/ICASSP.1990.115832 KUBIN B, 1993, P IEEE WORKSHOP SPEE, P35 KWON S, 1984, IEEE T ACOUST SPEECH, V32 Makhoul J., 1978, IEEE ICASSP, P163 Markel JD, 1976, LINEAR PREDICTION SP MARQUES J, 1991, EUROPEAN C SPEECH TE Marques J. S., 1989, Eurospeech 89. European Conference on Speech Communication and Technology MARQUES JS, 1990, INT CONF ACOUST SPEE, P17, DOI 10.1109/ICASSP.1990.115526 MARQUES J, 1988, EUSIPCO 1988, P891 Marques J.S., 1990, P IEEE INT C AC SPEE, P665 MCAULAY R, 1989, IEEE INT C ACOUST SP, P207 McAulay R. J., 1986, IEEE T ACOUST SPEECH, V34 PAPOULIS A, 1984, PROBABILITY RANDOM V, P441 SCHROEDER M, 1984, INT C COMMUNICATION, P1610 SLUYTER H, IEEE INT C ACOUST SP, P188 SOONG F, 1984, IEEE INT C ACOUST SP TRANCOSO IM, 1989, IEE PROC-I, V136, P141 TRANCOSO L, 1986, IEEE INT C ACOUST SP, P1709 NR 31 TC 5 Z9 5 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUN PY 1994 VL 14 IS 3 BP 231 EP 247 DI 10.1016/0167-6393(94)90064-7 PG 17 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA NU869 UT WOS:A1994NU86900002 ER PT J AU SOROKIN, VN AF SOROKIN, VN TI INVERSE PROBLEM FOR FRICATIVES SO SPEECH COMMUNICATION LA English DT Article DE SPEECH; INVERSE PROBLEM; VOCAL TRACT SHAPE; FRICATIVES; OPTIMIZATION AB Articulatory parameters, vocal tract shape and cross-sectional area function were determined from fricative spectra. A model of fricative generation was used to provide acoustical constraints for an optimization procedure with muscle work as the criterion of optimality. The distance between spectra was measured using the Cauchy-Bunyakovsky inequality. A proper initial approximation of articulatory parameters is required to obtain an accurate and stable solution of the inverse problem. RP SOROKIN, VN (reprint author), RUSSIAN ACAD SCI, INST INFORMAT TRANSMISS PROBLEMS, ERMOLOVOI STR 19, 101447 MOSCOW, RUSSIA. CR BINDER RC, 1962, FLUID MECHANICS Blokhintsev D.
I., 1981, ACOUSTICS NONHOMOGEN COKER CH, 1976, P IEEE, V64, P452, DOI 10.1109/PROC.1976.10154 Fant G., 1960, ACOUSTIC THEORY SPEE FANT G, 1972, KTH1 SPEECH TRNASM L, P1 FLANAGAN JL, 1964, SPEECH ANAL SYNTHESI FLANAGAN JL, 1976, IEEE T ACOUST SPEECH, V24, P163, DOI 10.1109/TASSP.1976.1162778 GORDON CC, 1981, J ACOUST SOC AM, V70, P1624 Hall G., 1976, MODERN NUMERICAL MET Ishizaka K., 1972, SCRL MONOGRAPH, V8 MCCASLAND GP, 1979, J ACOUST SOC AM, V65, pS78, DOI 10.1121/1.2017441 Morse PM, 1948, VIBRATION SOUND MULLER EM, 1980, SPEECH LANGUAGE ADV, V4, P317 NAKAJIMA T, 1977, DYNAMIC ASPECTS SPEE, P251 NARTEY JNA, 1982, UCLA WORKING PAPERS, V64 RZHEVKIN SN, 1963, COURSE LECTURES THEO SAWASHIMA M, 1977, DYNAMIC ASPECTS SPEE, P31 Schroeter J., 1989, P ICASSP GLASG UK, P588 SCHROETER J, 1987, IEEE T ACOUST SPEECH, P308 Schroeter J., 1992, ADV SPEECH SIGNAL PR, P231 SHADLE CH, 1985, MIT506 TECHN REP SHIRAI K, 1976, J ELECTRONICS COMM A, V59, P35 Sorokin V. N., 1992, SPEECH SYNTHESIS Sorokin V. N., 1985, THEORY SPEECH PRODUC SOROKIN VN, 1992, SPEECH COMMUN, V11, P71, DOI 10.1016/0167-6393(92)90064-E STEVENS KN, 1971, J ACOUST SOC AM, V50, P1180, DOI 10.1121/1.1912751 STEVENS KN, 1984, MIT SPEECH COMM GROU, V4, P1 VODOPYANOV VG, 1980, AEROACOUSTICS, P78 WAKITA H, 1978, KTH1 SPEECH TRANSM L, P9 WILHELMS P, 1986, SIGNAL PROCESS, V111, P477 NR 30 TC 8 Z9 8 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUN PY 1994 VL 14 IS 3 BP 249 EP 262 DI 10.1016/0167-6393(94)90065-5 PG 14 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA NU869 UT WOS:A1994NU86900003 ER PT J AU ARONSON, L ROSENHOUSE, J PODOSHIN, L ROSENHOUSE, G AF ARONSON, L ROSENHOUSE, J PODOSHIN, L ROSENHOUSE, G TI MULTICHANNEL COCHLEAR PROSTHESIS ADAPTED TO HEBREW - A CASE-STUDY SO SPEECH COMMUNICATION LA English DT Article DE MULTICHANNEL COCHLEAR IMPLANT; HEBREW VOWEL FORMANTS; HEBREW COMPREHENSION ID SPEECH-PROCESSING STRATEGIES; IMPLANT; RECOGNITION; ELECTRODES AB The purpose of this work was to investigate the speech comprehension of four deaf Hebrew-speaking patients implanted with a cochlear prosthesis, the Nucleus 22-channel (N-22) system. Experiments were performed under two conditions. The speech tests (isolated vowels, bisyllabic words and fluent speech in closed and open sets) were first conducted using the Default Frequency Boundaries (DFBs) of the cochlear implant's speech processor. The Default Frequency Boundaries of each electrode, which are specified by the computer program of the system, are assumed to be selected on the basis of English. Different sets of frequency boundaries were then established by altering the frequency-to-electrode mapping, taking into account the formant patterns of the modern Hebrew vowels and the number of active electrodes implanted in each patient. These changes yielded what we called Modified Frequency Boundaries (MFBs). The patients were then retested using the same speech material, and the results were compared with those previously obtained. With the Modified Frequency Boundaries, improvements in the patients' comprehension of the speech elements were noted.
The differences in performance between the two sets of frequency boundary distributions suggest that better speech comprehension could be achieved by implanted patients, at least partly, by adjusting the frequency-to-electrode mapping of the N-22 speech processor on a language basis. C1 TECHNION ISRAEL INST TECHNOL, DEPT GEN STUDIES, IL-32000 HAIFA, ISRAEL. TECHNION ISRAEL INST TECHNOL, FAC CIVIL ENGN, IL-32000 HAIFA, ISRAEL. TECHNION ISRAEL INST TECHNOL, BNAI ZION MED CTR, FAC MED, DEPT OTOLARYNGOL, IL-32000 HAIFA, ISRAEL. CR Abberton E., 1985, COCHLEAR IMPLANTS, P527 AGELFORS E, 1989, QPSR, P145 ARONSON L, UNPUB BLAMEY PJ, 1987, J ACOUST SOC AM, V82, P48, DOI 10.1121/1.395436 BLAMEY PJ, 1987, J ACOUST SOC AM, V82, P38, DOI 10.1121/1.395542 BOOTHROYD A, 1987, ANN OTO RHINOL LARYN, V96, P58 CLARK GM, 1984, ANN OTO RHINOL LARYN, V93, P127 DEFILIPPO CL, 1978, J ACOUST SOC AM, V63, P1186, DOI 10.1121/1.381827 DORMAN MF, 1988, J ACOUST SOC AM, V84, P501, DOI 10.1121/1.396828 DORMAN MF, 1990, J ACOUST SOC AM, V88, P2074, DOI 10.1121/1.400104 DOWELL RC, 1985, ANN OTO RHINOL LARYN, V94, P244 DOWELL RC, 1986, ARCH OTOLARYNGOL, V112, P1054 EDDINGTON DK, 1983, ANN NY ACAD SCI, V405, P241, DOI 10.1111/j.1749-6632.1983.tb31637.x EVANS EF, 1984, COCHLEAR IMPLANTS, P167 Fant G., 1960, ACOUSTIC THEORY SPEE Hochmair-Desoyer I. 
J., 1985, COCHLEAR IMPLANTS, P291 HOCHMAIRDESOYER IJ, 1983, ANN NY ACAD SCI, V405, P295, DOI 10.1111/j.1749-6632.1983.tb31642.x IIVONEN A, 1987, HONOUR I LEHISTE, P125 LIBERMAN AM, 1985, SR8283 HASK LAB SPEE, P63 Malmberg Bertil, 1963, STRUCTURAL LINGUISTI MANRIQUE AMB, 1982, J ACOUST SOC AM, V62, P1145 MECKLENBURG DJ, 1987, 3008 COCHL CORP REH OWENS E, 1981, HEARING AID J, V34, P9 PARKIN JL, 1988, LARYNGOSCOPE, V98, P262 PETERSON GE, 1952, J ACOUST SOC AM, V24, P175, DOI 10.1121/1.1906875 QUILIS A, 1983, COLLECTANEA PHONETIC, V7, P137 SCHUBERT ED, 1985, COCHLEAR IMPLANTS, P269 SIMMONS FB, 1986, ANN OTO RHINOL LARYN, V95, P71 SKINNER MW, 1991, EAR HEARING, V12, P3, DOI 10.1097/00003446-199102000-00002 SUMMERFIELD Q, 1985, COCHLEAR IMPLANTS, P417 TONG YC, 1983, J ACOUST SOC AM, V74, P73, DOI 10.1121/1.389620 TYEMURRAY N, 1989, EAR HEARING, V10, P292, DOI 10.1097/00003446-198910000-00004 Tyler R. S., 1983, IOWA COCHLEAR IMPLAN WHITE MW, 1983, ANN NY ACAD SCI, V405, P348, DOI 10.1111/j.1749-6632.1983.tb31649.x WHITE MW, 1985, COCHLEAR IMPLANTS, P243 1989, AUDIOLOGISTS HDB NR 36 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1994 VL 14 IS 3 BP 263 EP 277 DI 10.1016/0167-6393(94)90066-3 PG 15 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA NU869 UT WOS:A1994NU86900004 ER PT J AU MAK, MW ALLEN, WG AF MAK, MW ALLEN, WG TI LIP-MOTION ANALYSIS FOR SPEECH SEGMENTATION IN NOISE SO SPEECH COMMUNICATION LA English DT Article DE LIP-READING; LIP-TRACKING; SPEECH SEGMENTATION; ARTICULATORY DYNAMICS; DATA FUSION; BLOCK MATCHING ID PERCEPTION; RECOGNITION; KINEMATICS; MOVEMENTS; HEARING AB This paper explains how visual information from the lips and acoustic signals can be combined together for speech segmentation. The psychological aspects of lip-reading and current automatic lip-reading systems are reviewed. 
The paper describes an image processing system which can extract the velocity of the lips from image sequences. The velocity of the lips is estimated by a combination of morphological image processing and block matching techniques. The resultant velocity of the lips is used to locate the syllable boundaries. This information is particularly useful when the speech signal is corrupted by noise. The paper also demonstrates the correlation between speech signals and lip information. Data fusion techniques are used to combine the acoustic and visual information for speech segmentation. The principal results show that using the combination of visual and acoustic signals can reduce segmentation errors by at least 10.4% when the signal-to-noise ratio is lower than 15 dB. C1 UNIV NORTHUMBRIA, DEPT ELECT ELECTR ENGN & PHYS, ELLISON BLDG, NEWCASTLE UPON TYNE NE1 8ST, ENGLAND. CR BROOKE NM, 1983, J PHONETICS, V11, P63 DODD B, 1907, HEARING EYE PSYCHOL EGGERS M, 1990, P INT JOINT C NEURAL, V2, P7 ERBER NP, 1972, J SPEECH HEAR RES, V15, P413 FINN EK, 1988, PATTERN RECOGN, V8, P159 FOWLER CA, 1984, PERCEPT PSYCHOPHYS, V36, P359, DOI 10.3758/BF03202790 GREEN KP, 1989, PERCEPT PSYCHOPHYS, V45, P34, DOI 10.3758/BF03208030 HACKETT JK, 1990, IEEE T ROBOTIC AUTOM, P1324 HESSELMANN NL, 1983, SPEECH COMMUN, V2, P327, DOI 10.1016/0167-6393(83)90049-3 KELSO JAS, 1985, J ACOUST SOC AM, V77, P266, DOI 10.1121/1.392268 KNOTTS SL, 1993, P SOC PHOTO-OPT INS, V1721, P322 KUROSU K, 1989, 8TH EUR ANN C HUM DE, P1 MAK MW, 1993, THESIS U NORTHRUMBRI Mak M. 
W., 1993, Acoustics Letters, V16 MAK MW, 1994, SIGNAL PROCESSING IM, V6 MARAGOS P, 1986, P SPIE INT SOC OPT E, V477, P64 MASE K, 1991, SYSTEM COMPUTERS JAP, V22, P657 MCGURK H, 1976, NATURE, V264, P746, DOI 10.1038/264746a0 MUSMANN HG, 1985, P IEEE, V73, P523, DOI 10.1109/PROC.1985.13183 NATARAJA NP, 1983, HEARING AID J JAN, P13 O'Neill JJ, 1954, J SPEECH HEAR DISORD, V19, P429 OSTRY DJ, 1983, J EXP PSYCHOL HUMAN, V9, P622 OSTRY DJ, 1985, J ACOUST SOC AM, V77, P640, DOI 10.1121/1.391882 Petajan E., 1988, P HUM FACT COMP SYST, P19, DOI 10.1145/57167.57170 PRESS W, 1990, NUMERICAL RECIPES C, P94 Stork D.G., 1992, P INT JOINT C NEUR N, V2, P289, DOI 10.1109/IJCNN.1992.226994 SUMBY WH, 1954, J ACOUST SOC AM, V26, P212, DOI 10.1121/1.1907309 Summerfield Q, 1987, HEARING EYE PSYCHOL, P3 YUHAS BP, 1991, J ACOUST SOC AM, V90, P598, DOI 10.1121/1.401235 YUHAS BP, 1990, P IEEE, V78, P1658, DOI 10.1109/5.58349 NR 30 TC 8 Z9 8 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1994 VL 14 IS 3 BP 279 EP 296 DI 10.1016/0167-6393(94)90067-1 PG 18 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA NU869 UT WOS:A1994NU86900005 ER PT J AU MA, CX OSHAUGHNESSY, D AF MA, CX OSHAUGHNESSY, D TI THE MASKING OF NARROW-BAND NOISE BY BROAD-BAND HARMONIC COMPLEX SOUNDS AND IMPLICATIONS FOR THE PROCESSING OF SPEECH SOUNDS SO SPEECH COMMUNICATION LA English DT Article DE AUDITORY MASKING; SPEECH PROCESSING ID PERIODIC PULSE; AUDIBILITY; TRANSFORM; PATTERNS; VOWELS; MODEL; EAR AB The evaluation of processed and synthesized speech is closely related to the auditory perception of complex sounds. An understanding of the perception of complex sounds is therefore helpful to improve the quality of processed sounds. The perceptual study of speech sounds in this paper is mainly concerned with auditory masking. 
Unlike most such studies, the targets in our experiment are narrowband noise signals and the maskers are wideband harmonic complex sounds. We show that the detection of targets at low frequencies is mainly determined by the spectral properties of the maskers. At high frequencies, the detection of targets is predominantly determined by the temporal behaviour of maskers. The relative contributions of spectral and temporal analysis strongly depend on the fundamental frequency of the masker. Better temporal resolution is associated with a higher masker level. RP MA, CX (reprint author), INRS TELECOMMUN, 16 PL COMMERCE, ILE DES SOEURS H3E 1H6, PQ, CANADA. CR DUIFHUIS H, 1971, J ACOUST SOC AM, V49, P1155, DOI 10.1121/1.1912477 DUIFHUIS H, 1970, J ACOUST SOC AM, V48, P888, DOI 10.1121/1.1912228 Flanagan J., 1972, SPEECH ANAL SYNTHESI GRIFFIN DW, 1984, IEEE T ACOUST SPEECH, V32, P236, DOI 10.1109/TASSP.1984.1164317 HOUTGAST T, 1974, ACUSTICA, V31, P320 JESTEADT W, 1982, J ACOUST SOC AM, V71, P951 JOHNSTON JD, 1988, IEEE J SEL AREA COMM, V6, P314, DOI 10.1109/49.608 KOHLRAUSCH A, 1988, BASIC ISSUES HEARING, P339 LEVITT H, 1971, J ACOUST SOC AM, V49, P467, DOI 10.1121/1.1912375 MOORE BCJ, 1983, J ACOUST SOC AM, V73, P906, DOI 10.1121/1.389015 Moriya T., 1986, P IEEE INT C AC SPEE, P1701 PATTERSON RD, 1987, J ACOUST SOC AM, V82, P1560, DOI 10.1121/1.395146 PLOMP R, 1969, J ACOUST SOC AM, V46, P409, DOI 10.1121/1.1911705 QUATIERI TF, 1990, VISTA SPEECH ENHANCE, P29 SCHROEDE.MR, 1970, IEEE T INFORM THEORY, V16, P85, DOI 10.1109/TIT.1970.1054411 SCHROEDER MR, 1979, J ACOUST SOC AM, V66, P1647, DOI 10.1121/1.383662 Schroeder MR, 1982, REPRESENTATION SPEEC, P79 SMITH BK, 1986, J ACOUST SOC AM, V80, P1631, DOI 10.1121/1.394327 STRUBE HW, 1985, ACUSTICA, V58, P207 TYLER RS, 1982, J ACOUST SOC AM, V71, P220, DOI 10.1121/1.387354 Wegel RL, 1924, PHYS REV, V23, P266, DOI 10.1103/PhysRev.23.266 Zwicker E., 1990, PSYCHOACOUSTICS FACT ZWICKER E, 1980, J ACOUST SOC AM, V68, P1523, DOI 
10.1121/1.385079 NR 23 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD APR PY 1994 VL 14 IS 2 BP 103 EP 118 DI 10.1016/0167-6393(94)90002-7 PG 16 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA NG535 UT WOS:A1994NG53500001 ER PT J AU MANTYSALO, J TORKKOLA, K KOHONEN, T AF MANTYSALO, J TORKKOLA, K KOHONEN, T TI MAPPING CONTEXT-DEPENDENT ACOUSTIC INFORMATION INTO CONTEXT INDEPENDENT FORM BY LVQ SO SPEECH COMMUNICATION LA English DT Article DE CONTEXT MODELING; LEARNING VECTOR QUANTIZATION; HIDDEN MARKOV MODELS; LVQ IN PHONEMIC SPEECH RECOGNITION; NEURAL NETWORKS IN SPEECH RECOGNITION ID HIDDEN MARKOV-MODELS; SPEECH RECOGNITION AB In the framework of phonemic speech recognition using Hidden Markov Models (HMMs) together with codebooks trained by Learning Vector Quantization (LVQ), a novel way to model context-dependencies in speech is presented. We use LVQ to map acoustic contextual data into context independent phonemic form. The acoustic data is in the form of concatenated averages of successive short-time feature vectors. This mapping eliminates the need to employ context dependent phonemic, for example, triphone HMMs, and the difficulties associated therein. Instead, simpler context independent discrete observation HMMs suffice. We report excellent results for a speaker dependent task for Finnish. C1 INST DALLE MOLLE INTELLIGENCE ARTIFICIELLE PERCEPT, CP 609, CH-1920 MARTIGNY, SWITZERLAND. HELSINKI UNIV TECHNOL, INFORMAT & COMP SCI LAB, SF-02150 ESPOO, FINLAND. 
CR BAMBERG P, 1990, 1990 P DARPA SPEECH, P163 CHENG YM, 1992, 1992 P IEEE INT C AC, V1, P593 DAVIS SB, 1980, IEEE T ACOUST SPEECH, V28, P357, DOI 10.1109/TASSP.1980.1163420 GUPTA VN, 1987, 1987 P IEEE INT C AC, V2, P697 IWAMIDA H, 1991, 1991 P IEEE INT C AC, V1, P553 IWAMIDA S, 1990, 1990 P IEEE INT C AC, V1, P489 KIMBER DG, 1990, 1990 P IEEE INT C AC, V1, P497 KOHONEN T, 1990, P IEEE, V78, P1464, DOI 10.1109/5.58325 KOHONEN T, 1992, 1992 P IEEE INT JOIN, P725 KOHONEN T., 1989, SELF ORG ASS MEMORY KOHONEN T, 1988, COMPUTER, V21, P11, DOI 10.1109/2.28 KOHONEN T, 1988, 2ND P IEEE INT C NEU, V1, P61 KOHONEN T, 1988, P 1988 IEEE INT C AC, P607 KOHONEN T, 1986, TKKFA601 HELS TU TEC Kohonen T., 1990, 1990 IJCNN INT JOINT, V1, P545 LEE KF, 1990, IEEE T ACOUST SPEECH, V38, P599, DOI 10.1109/29.52701 LEUNG HC, 1992, 1992 P IEEE INT C AC, V1, P613 MCDERMOTT E, 1993, 1993 P IEEE INT C AC, V2, P291 MCDERMOTT E, 1989, 1989 P IEEE INT C AC, V1, P81 MCDERMOTT E, 1992, 1992 P IEEE INT C AC, V1, P417 RABINER LR, 1989, P IEEE, V77, P257, DOI 10.1109/5.18626 RAMESH P, 1991, 1991 P IEEE INT C AC, V1, P113 TORKKOLA K, 1991, 1991 P INT C ART NEU, P771 YOUNG SJ, 1992, 1992 P IEEE INT C AC, V1, P569 YU G, 1990, 1990 P IEEE INT C AC, V1, P685 NR 25 TC 7 Z9 7 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD APR PY 1994 VL 14 IS 2 BP 119 EP 130 DI 10.1016/0167-6393(94)90003-5 PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA NG535 UT WOS:A1994NG53500002 ER PT J AU KEATING, PA BYRD, D FLEMMING, E TODAKA, Y AF KEATING, PA BYRD, D FLEMMING, E TODAKA, Y TI PHONETIC ANALYSES OF WORD AND SEGMENT VARIATION USING THE TIMIT-CORPUS OF AMERICAN ENGLISH SO SPEECH COMMUNICATION LA English DT Article DE AMERICAN ENGLISH; TIMIT; SPEECH DATABASES; PRONUNCIATION VARIATION; VELAR FRONTING AB This paper reports a set of studies of some phonetic characteristics of the American English represented in the TIMIT speech database. First we describe some relevant characteristics of TIMIT, and how we use the non-speech files on the TIMIT CD with a commercial database program. Two studies are then described: one using only the non-audio parts of TIMIT (segmental transcriptions and durations, and speaker information), and one using the audio signal for acoustic analysis. Results of such studies should be useful not only to linguistic phoneticians but also for speech recognition lexicons and text-to-speech systems. RP KEATING, PA (reprint author), UNIV CALIF LOS ANGELES, DEPT LINGUIST, PHONET LAB, 405 HILGARD AVE, LOS ANGELES, CA 90024 USA.
CR BYRD D, 1992, ICSLP 92 P, V1, P827 COHEN M, 1989, THESIS U C BERKELEY HENTON C, 1987, J IPA, V17, P72 KEATING B, 1992, UCLA WORKING PAPERS, V81, P1 KEATING P, 1992, ICSLP 92 P, V1, P823 KEATING P, 1993, PHONETICA, V50, P73 LADEFOGED P, 1982, COURSE PHONETICS, P58 LAMEL L, 1986, FEB P DARPA SPEECH R, P100 NOLAN F, 1993, PHONOLOGICAL STRUCTU, V3 PALLETT D, 1990, ICSLP 90 P, P2431 RANDOLPH MA, 1989, THESIS MIT RILEY M, 1992, ICSLP 92 P, P285 SEMRENO J, 1987, J PHONETICS, V15, P247 SENEFF S, 1988, UNPUB TRANSCRIPTION SUSSMAN HM, 1991, J ACOUST SOC AM, V90, P1309, DOI 10.1121/1.401923 ZUE V, 1988, 2ND P M ADV MAN MACH, P111 ZUE V, 1990, SPEECH COMMUN, V9, P351, DOI 10.1016/0167-6393(90)90010-7 ZUE V, 1980, THESIS INDIANA U LIN NR 18 TC 24 Z9 24 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD APR PY 1994 VL 14 IS 2 BP 131 EP 142 DI 10.1016/0167-6393(94)90004-3 PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA NG535 UT WOS:A1994NG53500003 ER PT J AU VANBERGEM, DR AF VANBERGEM, DR TI A MODEL OF COARTICULATORY EFFECTS ON THE SCHWA SO SPEECH COMMUNICATION LA English DT Article DE COARTICULATION; ACOUSTIC VOWEL REDUCTION; LEXICAL VOWEL REDUCTION; CENTRALIZATION; CONTEXTUAL ASSIMILATION AB In this study coarticulatory effects on the formant frequencies and on the duration of the Dutch schwa were investigated, both in open syllables and in closed syllables, by using nonsense words of the form C1eC2V and VC1eC2. In these nonsense words C1 and C2 could be any of the consonants /p, t, k, f, s, chi, m, n, eta, r, l, j, upsilon/ and V was taken from the vowel set /i, a:, u/. Consonants and vowels were systematically varied in all possible combinations which gave a total of 897 test words that were read aloud by three male speakers. 
It appeared that the coarticulatory effects on the schwa could be successfully described with a simple linear model. Especially for F2-tracks of schwas, the model fit turned out to be very good. The model for F2-tracks could also be successfully applied to schwas in meaningful words. We believe that the schwa should be interpreted as a vowel without articulatory target that is completely assimilated with its phonemic context. The widespread view of a schwa position in the centre of the vowel triangle, that the formant patterns of reduced vowels are shifting to, is not very accurate. In our interpretation vowel reduction results in a shift of formant frequencies to a schwa position that can be almost anywhere in the vowel plane, dependent on the phonemic context. RP VANBERGEM, DR (reprint author), UNIV AMSTERDAM, INST PHONET SCI, HERENGRACHT 338, 1016 CG AMSTERDAM, NETHERLANDS. CR ALFONSO PJ, 1982, LANG SPEECH, V25, P151 ATAL BS, 1976, IEEE T ACOUST SPEECH, V24, P201, DOI 10.1109/TASSP.1976.1162800 BELLBERTI F, 1976, HASKINS LAB STATUS R, V45, P197 BENGUERE.AP, 1974, PHONETICA, V30, P41 BENGUEREL AP, 1977, J PHONETICS, V5, P149 BROAD DJ, 1970, J ACOUST SOC AM, V47, P155 BROAD DJ, 1987, J ACOUST SOC AM, V81, P1572 Browman C. P., 1992, PAPERS LABORATORY PH, P26 DANILOFF R, 1968, J SPEECH HEAR RES, V11, P707 Daniloff R. G., 1973, J PHONETICS, V1, P239 DELATTRE PC, 1955, J ACOUST SOC AM, V27, P769, DOI 10.1121/1.1908024 DENOS EA, 1988, THESIS U UTRECHT DUNN OJ, 1961, J AM STAT ASSOC, V56, P52, DOI 10.2307/2282330 Fant G., 1960, ACOUSTIC THEORY SPEE Forster K., 1979, SENTENCE PROCESSING, P27 Hamming R. 
W., 1973, NUMERICAL METHODS SC Jakobson Roman, 1979, SOUND SHAPE LANGUAGE KOOPMANSVANBEIN.FJ, 1992, P I PHONETIC SCI AMS, V16, P53 Kuijpers C., 1993, THESIS U AMSTERDAM LEHISTE I, 1962, ACOUSTICAL CHARACTER, P1 MAKHOUL J, 1976, P IEEE INT C AC SPEE, P466 MORTON J, 1969, PSYCHOL REV, V76, P165, DOI 10.1037/h0027366 Nooteboom S., 1972, THESIS U UTRECHT NORD L, 1986, STL QPSR, V4, P19 OHMAN SEG, 1966, J ACOUST SOC AM, V40, P979 OHMAN SEG, 1967, J ACOUST SOC AM, V41, P310 SCHOUTEN MEH, 1979, J PHONETICS, V7, P1 SHARMAN JR, 1981, J ANAL TOXICOL, V5, P153 STEVENS KN, 1974, J ACOUST SOC AM, V55, P653, DOI 10.1121/1.1914578 STEVENS KN, 1966, J ACOUST SOC AM, V40, P123, DOI 10.1121/1.1910027 STEVENS KN, 1963, J SPEECH HEAR RES, V6, P111 Van den Broecke M. P. R., 1988, TER SPRAKE, P400 VANBERGEM DR, 1991, P SCA WORKSHOP PHONE, P101 VANBERGEM DR, 1990, P I PHONETIC SCI AMS, V14, P53 VANBERGEM DR, 1993, EUROSPEECH 93 BERLIN, P677 VANBERGEM DR, 1992, SPEECH COMMUN, V12, P1 VANSON RJJ, 1991, P I PHONETIC SCU AM, V15, P43 VANSON RJJ, 1960, J ACOUST SOC AM, V88, P1683 ZIPF G, 1936, PSYCHOBIOL LANGUAGE NR 39 TC 23 Z9 23 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD APR PY 1994 VL 14 IS 2 BP 143 EP 162 DI 10.1016/0167-6393(94)90005-1 PG 20 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA NG535 UT WOS:A1994NG53500004 ER PT J AU SEGURA, JC RUBIO, AJ PEINADO, AM GARCIA, P ROMAN, R AF SEGURA, JC RUBIO, AJ PEINADO, AM GARCIA, P ROMAN, R TI MULTIPLE VQ HIDDEN MARKOV MODELING FOR SPEECH RECOGNITION SO SPEECH COMMUNICATION LA English DT Article DE HIDDEN MARKOV MODELS; SPEECH RECOGNITION ID ISOLATED-WORD RECOGNITION; DYNAMIC FEATURES; PREPROCESSOR AB In this paper a new variant of HMM, named Multiple VQ HMM (MVQHMM), is presented. Its main characteristic is the use of a separate codebook for each model. 
Procedures for training and probability evaluation of these models are described. The evaluation procedure combines the quantization distortions of the vector sequences with the discrete HMM generation probabilities. Comparative results on an isolated word recognition system are shown, between MVQHMM and discrete and semi-continuous HMM. These results show that using separate codebooks and including the quantization distortion in the decision criterion improve the performance of the system. Furthermore, the multiple VQ hidden Markov models seem to be more robust than the discrete and semi-continuous ones in relation to the inter-speaker variability of the recognition system. C1 UNIV GRANADA, DEPT FIS APLICADA, E-18071 GRANADA, SPAIN. RP SEGURA, JC (reprint author), UNIV GRANADA, DEPT ELECTR & TECNOL COMPUTADORES, E-18071 GRANADA, SPAIN. RI Peinado, Antonio/C-2401-2012; Segura, Jose/B-7008-2008 OI Segura, Jose/0000-0003-3746-0978 CR BERGH AF, 1985, AT&T TECH J, V64, P1047 DUDA R, PATTERN CLASSIFICATI, V1, P211 FURUI S, 1986, IEEE T ACOUST SPEECH, V34, P52, DOI 10.1109/TASSP.1986.1164788 FURUI S, 1988, IEEE T ACOUST SPEECH, V36, P980, DOI 10.1109/29.1619 HUANG X, 1989, P INT C AC SPEECH SI, V2, P639 JUANG BH, 1987, IEEE T ACOUST SPEECH, V35, P947 PEINADO AM, 1991, IEE PROC-I, V138, P201 RABINER LR, 1985, AT&T TECH J, V64, P1211 SHORE JE, 1983, IEEE T INFORM THEORY, V29, P473, DOI 10.1109/TIT.1983.1056716 NR 9 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD APR PY 1994 VL 14 IS 2 BP 163 EP 170 DI 10.1016/0167-6393(94)90006-X PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA NG535 UT WOS:A1994NG53500005 ER PT J AU TAKEDA, S ICHIKAWA, A AF TAKEDA, S ICHIKAWA, A TI ANALYSIS OF PROMINENCE IN SPOKEN JAPANESE SENTENCES AND APPLICATION TO TEXT-TO-SPEECH SYNTHESIS SO SPEECH COMMUNICATION LA English DT Article DE JAPANESE; TEXT-TO-SPEECH; FUJISAKIS MODEL; PROMINENCE; PROSODY; PROMINENCE-PRODUCTION RULE ID PERCEPTION; INTONATION; DANISH AB This paper focuses on the partial emphasis or ''prominence'' of parts of Japanese sentences. Four sets of 43 read sentences uttered by two speakers including various types of prominence (172 sentences in total) are analyzed. This analysis shows that in 88% of the sentences prominence is produced by enhancing F0 and increasing power. No examples of lengthening of phoneme duration are observed in the emphasized parts of the sentences except for some special cases. One exception is lengthening accompanied by pause insertion as a mark of prominence, and another is a slowing of the total speech rate. The prosodic features of read natural speech are then used to develop rules for changing a reference sentence to produce prominence for rule-based speech synthesis. Listening test results using 10 subjects do not show any significant difference in expressibility between prominence synthesized by rule (rate of correct expression: 76.9%) and prominence in natural speech (79.9%) at the 5% level. To further improve prominence expressibility, listening tests for 10 subjects are used to clarify the conditions under which prominence expressibility becomes optimal. These tests show that the prosodic control parameters increase the expressibility of prominence by about 20%. Finally, prosodic features of spontaneous conversational speech are analyzed and compared with those of read sentence speech. 
Speech-rate reduction in parts where prominence is placed is more conspicuous in spontaneous conversational speech. C1 HITACHI LTD, CENT RES LAB, KOKUBUNJI, TOKYO 185, JAPAN. CR BRUCE G, 1991, P ESCA WORKSHOP BARC, V13 DELATTRE P, 1966, IRAL-INT REV APPL LI, V4, P183, DOI 10.1515/iral.1966.4.1-4.183 FOWLER CA, 1987, J MEM LANG, V26, P489, DOI 10.1016/0749-596X(87)90136-7 Fujisaka H., 1984, Journal of the Acoustical Society of Japan (E), V5 FUJISAKI H, 1988, P IEEE INT C ACOUS S, V14, P663 GARDING E, 1982, 13TH INT C LING TOK, V9, P85 HAKODA K, 1980, SYSTEMS COMPUTERS CH, V3, P28 Hirose K., 1986, P IEEE IECEJ ASJ ICA, P2415 HIROSE K, 1988, 2ND P S ADV MAN MACH, V3 HIRSCHBERG J, 1992, TALKING MACHINES THE, P367 Kitahara Y., 1988, Systems and Computers in Japan, V19, DOI 10.1002/scj.4690191106 KUBOZONO H, 1993, ORG JAPANESE PROSODY, V2 KUREMATSU A, 1991, SPEECH COMMUN, V10, P1, DOI 10.1016/0167-6393(91)90023-M LEHISTE I, 1980, PHONETIC MANIFASTATI LINDBLOM B, 1991, P ESCA WORKSHOP BARC, V2 MONATHAN AIC, 1989, SEP P EUROSPEECH 89, V1, P502 MONOGHAN AIC, 1992, TALKING MACHINES THE, P143 Quene H., 1992, Computer Speech and Language, V6, DOI 10.1016/0885-2308(92)90044-5 RISCHEL J, 1991, P ESCA WORKSHOP BARC, V6 SAGISAKA Y, 1984, REV ELEC COMMUN LAB, V32, P188 SAGISAKA Y, 1990, P INT C AC SPEECH SI, V90, P325 SAGISAKA Y, 1991, INT C PHONETICS SCI, V3, P506 SHIRAI K, 1986, P IEEE IECEJ ASJ ICA, P2043 TAKEDA S, 1990, P ICSLP 90 KOBE, V123, P493 Terken J, 1987, LANG COGNITIVE PROC, V2, P145, DOI 10.1080/01690968708406928 TERKEN J, 1991, J ACOUST SOC AM, V89, P1768, DOI 10.1121/1.401019 THART J, 1991, J ACOUST SOC AM, V90, P3368, DOI 10.1121/1.401396 THART J, 1982, 13TH INT C LING TOK, P23 THORSEN NG, 1982, 13TH INT C LING TOK, V6, P47 THORSEN NG, 1985, J ACOUST SOC AM, V77, P1205, DOI 10.1121/1.392187 THORSEN NG, 1980, J ACOUST SOC AM, V67, P1014, DOI 10.1121/1.384069 Vaissiere Jacqueline, 1983, PROSODY MODELS MEASU, P53 NR 32 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA 
PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD APR PY 1994 VL 14 IS 2 BP 171 EP 196 DI 10.1016/0167-6393(94)90007-8 PG 26 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA NG535 UT WOS:A1994NG53500006 ER PT J AU HEITZ, C BECKER, JD AF HEITZ, C BECKER, JD TI AN OPTIMIZED TIME-FREQUENCY DISTRIBUTION FOR SPEECH ANALYSIS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH ANALYSIS; ELEMENTARY WAVE-FORM; TIME-FREQUENCY DISTRIBUTION; COHENS CLASS; AMBIGUITY FUNCTION AB In order to analyse and interpret speech signals, different time-frequency representations are used (e.g. spectrogram, Wigner-Ville distribution, wavelets). In this paper we construct within Cohen's class of time-frequency distributions the distribution that is optimally suited for the representation of speech signals. Thereby we take advantage of the special time-frequency structure of speech expressed in the Elementary Waveform Speech Model (EWSM, d'Alessandro, 1990). As an application we present an algorithm that extracts a point pattern in the time-frequency plane out of the speech signal using the optimized distribution. Thus we get a very simple representation of the speech signal that is well interpretable both for non-stationary and for stationary speech segments. Furthermore this representation could serve as a base for further analysis (e.g. classification). RP HEITZ, C (reprint author), UNIV FREIBURG, FAK PHYS, HERMANN HERDER STR 3, D-79104 FREIBURG, GERMANY. 
CR COHEN L, 1989, P IEEE, V77, P941, DOI 10.1109/5.30749 DALESSANDRO C, 1990, SPEECH COMMUN, V9, P419, DOI 10.1016/0167-6393(90)90018-5 Englert F., 1989, Informationstechnik - IT, V31 PAPOULIS A, 1973, IEEE T INFORM THEORY, V19, P9, DOI 10.1109/TIT.1973.1054956 Papoulis A., 1962, FOURIER INTEGRAL ITS SMIRNOW WI, 1958, LEHRGANG HOHEREN M 4, pCH2 ZHAO Y, 1990, IEEE T ACOUST SPEECH, V38 NR 7 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1994 VL 14 IS 1 BP 1 EP 18 DI 10.1016/0167-6393(94)90054-X PG 18 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MZ551 UT WOS:A1994MZ55100001 ER PT J AU MCGOWAN, RS AF MCGOWAN, RS TI RECOVERING ARTICULATORY MOVEMENT FROM FORMANT FREQUENCY TRAJECTORIES USING TASK DYNAMICS AND A GENETIC ALGORITHM - PRELIMINARY MODEL TESTS SO SPEECH COMMUNICATION LA English DT Article DE ARTICULATORY RECOVERY; INVERSE PROBLEM; TASK DYNAMICS ID VOCAL-TRACT; SPEECH SYNTHESIZER; PERTURBATIONS; INVERSION; GESTURES; SIGNAL; MOTION; WAVE; TIME AB Articulatory trajectories of an articulatory model were recovered by means of a genetic algorithm from acoustic information using a task-dynamic model of speech articulation. Tests on simulated utterances /Ib diphthong ae/ and /Id diphthong ae/ show that the method can recover most parts of an original trajectory, but it has trouble in obtaining precise timing. For the recovery of articulation, formant frequency trajectories should be supplemented by additional acoustic information, such as RMS amplitude. RP MCGOWAN, RS (reprint author), HASKINS LABS INC, 270 CROWN ST, NEW HAVEN, CT 06511 USA. 
CR ABBS JH, 1984, J NEUROPHYSIOL, V51, P705 ATAL BS, 1978, J ACOUST SOC AM, V63, P1535, DOI 10.1121/1.381848 ATAL BS, 1971, J ACOUST SOC AM, V50, P637, DOI 10.1121/1.1912679 BAER T, 1991, J ACOUST SOC AM, V90, P799, DOI 10.1121/1.401949 BOE LJ, 1992, J PHONETICS, V20, P27 BROWMAN CP, 1990, J PHONETICS, V18, P299 COKER CH, 1976, P IEEE, V64, P452, DOI 10.1109/PROC.1976.10154 DUNN HK, 1950, J ACOUST SOC AM, V22, P740, DOI 10.1121/1.1906681 FLANAGAN JL, 1980, J ACOUST SOC AM, V68, P780, DOI 10.1121/1.384817 Goldberg D. E, 1989, GENETIC ALGORITHMS S KELLY JL, 1962, 4 INT C AC, P1 KELSO JAS, 1984, J EXP PSYCHOL HUMAN, V10, P812, DOI 10.1037/0096-1523.10.6.812 LARAR JN, 1988, IEEE T ACOUST SPEECH, V36, P1812, DOI 10.1109/29.9026 LEVINSON SE, 1983, J ACOUST SOC AM, V74, P1145, DOI 10.1121/1.390038 LIBERMAN AM, 1985, COGNITION, V21, P1, DOI 10.1016/0010-0277(85)90021-6 LILJENCRANTS J, 1985, THESIS KTH STOCKHOLM LIN Q, 1990, THESIS KTH STOCKHOLM Maeda S., 1982, Speech Communication, V1, DOI 10.1016/0167-6393(82)90017-6 MCGOWAN RS, 1991, 12TH P INT C PHON SC, V4, P486 MERMELST.P, 1973, J ACOUST SOC AM, V53, P1070, DOI 10.1121/1.1913427 MERMELST.P, 1967, J ACOUST SOC AM, V41, P1283, DOI 10.1121/1.1910470 MEYER P, 1991, IEEE T SIGNAL PROCES, V39, P1493, DOI 10.1109/78.134389 MEYER P, 1989, J ACOUST SOC AM, V86, P523, DOI 10.1121/1.398232 MILENKOVIC P, 1987, IEEE T ACOUST SPEECH, V35, P1089, DOI 10.1109/TASSP.1987.1165271 MILENKOVIC P, 1984, IEEE T ACOUST SPEECH, V32, P1122, DOI 10.1109/TASSP.1984.1164455 PAPCUN G, 1992, J ACOUST SOC AM, V92, P688, DOI 10.1121/1.403994 Parthasarathy S., 1992, Computer Speech and Language, V6, DOI 10.1016/0885-2308(92)90043-4 PERKELL JS, 1992, J ACOUST SOC AM, V92, P3078, DOI 10.1121/1.404204 RAHIM MG, 1993, J ACOUST SOC AM, V93, P1109, DOI 10.1121/1.405559 RUBIN P, 1981, J ACOUST SOC AM, V70, P321, DOI 10.1121/1.386780 SALTZMAN E, 1987, PSYCHOL REV, V94, P84, DOI 10.1037//0033-295X.94.1.84 Saltzman E. 
L., 1989, ECOL PSYCHOL, V1, P333, DOI 10.1207/s15326969eco0104_2 Saltzman E. L, 1986, EXPT BRAIN RES SERIE, P129 SCHROEDE.MR, 1967, J ACOUST SOC AM, V41, P1002, DOI 10.1121/1.1910429 SCHROETER J, 1990, IEEE P INT C ACOUST Schroeter J., 1992, ADV SPEECH SIGNAL PR, P231 SCHROETER J, 1987, IEEE P INT C ACOUST, V87, P308 SCHROETER J, 1989, IEEE P INT C ACOUST, V89, P588 SHIRAI K, 1986, SPEECH COMMUN, V5, P159, DOI 10.1016/0167-6393(86)90005-1 SHIRAI K, 1991, J PHONETICS, V19, P379 SONDHI MM, 1983, J ACOUST SOC AM, V73, P985, DOI 10.1121/1.389024 SONDHI MM, 1987, IEEE T ACOUST SPEECH, V35, P955 STEVENS KN, 1960, J ACOUST SOC AM, V32, P47, DOI 10.1121/1.1907874 STEVENS KN, 1955, J ACOUST SOC AM, V27, P484, DOI 10.1121/1.1907943 STEVENS KN, 1993, 3RD SEMINAR SPEECH P WAKITA H, 1973, IEEE T ACOUST SPEECH, VAU21, P417, DOI 10.1109/TAU.1973.1162506 NR 46 TC 24 Z9 24 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1994 VL 14 IS 1 BP 19 EP 48 DI 10.1016/0167-6393(94)90055-8 PG 30 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MZ551 UT WOS:A1994MZ55100002 ER PT J AU KENNY, P BOULIANNE, G GARUDADRI, H TRUDELLE, S HOLLAN, R LENNIG, M OSHAUGHNESSY, D AF KENNY, P BOULIANNE, G GARUDADRI, H TRUDELLE, S HOLLAN, R LENNIG, M OSHAUGHNESSY, D TI EXPERIMENTS IN CONTINUOUS SPEECH RECOGNITION USING BOOKS ON TAPE SO SPEECH COMMUNICATION LA English DT Article DE CONTINUOUS SPEECH RECOGNITION; A-ASTERISK SEARCH; HIDDEN MARKOV MODEL AB We present a new search algorithm for very large vocabulary continuous speech recognition. Continuous speech recognition with this algorithm is only about 10 times more computationally expensive than isolated word recognition. We report preliminary recognition results obtained by testing our recognizer on books on tape using a 60000 word dictionary. C1 BELL NO RES LTD, MONTREAL, PQ, CANADA. 
RP KENNY, P (reprint author), INRS TELECOMMUN, 16 PL COMMERCE, VERDUN H3E 1H6, PQ, CANADA. CR AUSTIN S, 1990, JUN P DARPA SPEECH N AUSTIN S, 1991, P INT C ACOUST SPEEC, V91, P697 AVERBUCH A, 1987, P INT C ACOUST SPEEC, V87, P701 Bahl L, 1991, P INT C AC SPEECH SI, P185, DOI 10.1109/ICASSP.1991.150308 Bahl L. R., 1989, P ICASSP 89 GLASG SC, P465 BAHL LR, 1988, P 1988 INT C AC SPEE, P489 Bahl LR, 1993, IEEE T SPEECH AUDI P, V1, P59, DOI 10.1109/89.221368 BOULIANNE G, 1994, SPEECH COMMUN, V14, P61, DOI 10.1016/0167-6393(94)90057-4 DENG L, 1991, IEEE T SIGNAL PROCES, V39, P655 FISSORE L, 1989, IEEE T ACOUST SPEECH, V37, P1197, DOI 10.1109/29.31268 Gupta V., 1992, Computer Speech and Language, V6, DOI 10.1016/0885-2308(92)90027-2 Gupta V., 1992, Computer Speech and Language, V6, DOI 10.1016/0885-2308(92)90028-3 GUPTA VN, 1988, J ACOUST SOC AM, V84, P2007, DOI 10.1121/1.397045 JELINEK F, 1976, P IEEE, V64, P532, DOI 10.1109/PROC.1976.10159 Jelinek F., 1969, IBM Journal of Research and Development, V13 Kenny P, 1993, IEEE T SPEECH AUDI P, V1, P49, DOI 10.1109/89.221367 KENNY P, 1991, P EUROSPEECH 91, P655 KENNY P, 1992, P DARPA SPEECH NATUR KENNY P, 1994, UNPUB INT C ACOUST S KENNY P, 1993, SEP EUROSPEECH 93 KUBALA F, 1990, JUN P DARPA SPEECH N LENNIG M, 1990, P DARPA SPEECH NATUR, P391, DOI 10.3115/116580.116726 LENNIG M, 1992, OCT P ICSLP 92, P93 NEY H, 1992, P INT C ACOUST SPEEC Nilsson N., 1982, PRINCIPLES ARTIFICIA PAESLER A, 1989, MAY P INT C AC SPEEC Paul D., 1991, P IEEE INT C AC SPEE, P693, DOI 10.1109/ICASSP.1991.150434 SAGAYAMA S, 1991, P EUROPEAN C SPEECH, P1225 SCHWARTZ R, 1992, P INT C ACOUST SPEEC SEITZ F, 1990, COMPUTER SPEECH LANG, V4, P193 SOONG FK, 1991, P INT C AC SPEECH SI, V1, P705 ZHAO R, 1993, SEP EUROSPEECH 93 ZUE V, 1991, P 1991 IEEE INT C AC, P713, DOI 10.1109/ICASSP.1991.150439 NR 33 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD FEB PY 1994 VL 14 IS 1 BP 49 EP 60 DI 10.1016/0167-6393(94)90056-6 PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MZ551 UT WOS:A1994MZ55100003 ER PT J AU BOULIANNE, G KENNY, P LENNIG, M OSHAUGHNESSY, D MERMELSTEIN, P AF BOULIANNE, G KENNY, P LENNIG, M OSHAUGHNESSY, D MERMELSTEIN, P TI BOOKS ON TAPE AS TRAINING DATA FOR CONTINUOUS SPEECH RECOGNITION SO SPEECH COMMUNICATION LA English DT Article DE HIDDEN MARKOV MODEL; CONTINUOUS SPEECH RECOGNITION; TRAINING ALGORITHM; SPEECH SEGMENTATION; SPEECH LABELING; VITERBI DECODING ID HIDDEN MARKOV-MODELS; WORD RECOGNITION AB Training algorithms for natural speech recognition require very large amounts of transcribed speech data. Commercially distributed books on tape constitute an abundant source of such data, but it is difficult to take advantage of it using current training algorithms because of the requirement that the data be hand-segmented into chunks that can be comfortably processed in memory. In order to address this problem we have developed a training algorithm which is capable of handling unsegmented data files of arbitrary length; the computational requirements of the algorithm are linear in the amount of data to be processed and the memory requirements are constant. C1 BELL NO RES LTD, MONTREAL, PQ, CANADA. RP BOULIANNE, G (reprint author), INRS TELECOMMUN, 16 PL COMMERCE, VERDUN H3E 1H6, PQ, CANADA. CR BOULIANNE G, 1992, OCT P ICSLP BANFF, P229 Bridle J. S., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing DAVIS SB, 1980, IEEE T ACOUST SPEECH, V28, P357, DOI 10.1109/TASSP.1980.1163420 DENG L, 1991, IEEE T SIGNAL PROCES, V39, P1677, DOI 10.1109/78.134406 KENNY P, 1992, 1992 P INT C SPOK LA, P225 Kenny P, 1993, IEEE T SPEECH AUDI P, V1, P49, DOI 10.1109/89.221367 KENNY P, 1994, SPEECH COMMUN, V14, P49, DOI 10.1016/0167-6393(94)90056-6 KRIOUILE A, 1990, 18EMES JOURN ET PAR, P207 Lee E. 
A., 1988, DIGITAL COMMUNICATIO LEE KF, 1990, IEEE T ACOUST SPEECH, V38, P599, DOI 10.1109/29.52701 PAUL DB, 1992, DARPA SPEECH NAT LAN, P357 PIERACCINI R, 1990, JUN P DARPA SPEECH N, P311 PIERACCINI R, 1991, SPEECH COMMUN, V10, P105, DOI 10.1016/0167-6393(91)90034-Q RABINER LR, 1989, P IEEE, V77, P257, DOI 10.1109/5.18626 Siddall J., 1972, ANAL DECISION MAKING NR 15 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1994 VL 14 IS 1 BP 61 EP 70 DI 10.1016/0167-6393(94)90057-4 PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MZ551 UT WOS:A1994MZ55100004 ER PT J AU CULLING, JF SUMMERFIELD, Q MARSHALL, DH AF CULLING, JF SUMMERFIELD, Q MARSHALL, DH TI EFFECTS OF SIMULATED REVERBERATION ON THE USE OF BINAURAL CUES AND FUNDAMENTAL-FREQUENCY DIFFERENCES FOR SEPARATING CONCURRENT VOWELS SO SPEECH COMMUNICATION LA English DT Article DE REVERBERATION; PERCEPTUAL GROUPING; BINAURAL CUES; FUNDAMENTAL FREQUENCY ID INTERAURAL TIME DIFFERENCES; HEADPHONE SIMULATION; INFERIOR COLLICULUS; PERCEPTION; SPEECH; SEGREGATION; LEVEL; NOISE; SOUND; ROOM AB A computational simulation was used to generate impulse responses between points in a rectangular room and two points on opposite sides of a spherical ''head''. Sounds were convolved with the impulse responses to generate stimuli with which to study the effects of reverberation on the ability of listeners to use differences in fundamental frequency (DELTAFos) to separate concurrent vowels. Experiment 1 verified the suitability of the simulation by showing that it produced (i) appropriate percepts of lateralization, (ii) a larger contribution to lateralization from interaural differences in timing than level, and (iii) no effects of reverberation on lateralization. 
Experiments 2-5 measured masked identification thresholds for synthetic harmonic ''target'' vowels in the presence of masking sounds. In Experiment 2, listeners identified targets against pink-noise maskers. The experiment established a spatial geometry and a degree of reverberation for which listeners did not benefit from binaural cues arising from the spatial geometry of the sources. Experiment 3 demonstrated that the same arrangement did not undermine the ability to use DELTAFos to separate targets from vowel-like maskers when both had static Fo contours, but did prevent listeners from using DELTAFos carried on coherently changing Fo contours. Experiment 4 showed that a modulation width of +/- 1.45% was sufficient to reduce the benefits of DELTAFos, but that the benefits were not eliminated until the width of modulation exceeded the DELTAFo. It is argued that these results are compatible with existing models of the ability to use DELTAFos to separate concurrent vowels and that reverberation undermines the ability when the Fos are changing by diffusing the periodicities of the competing sources. Finally, Experiment 5 demonstrated that reverberation had no effect on the ability to separate a modulated vowel from pink noise. Thus, reverberation may have its detrimental effects in these experiments by diffusing the periodicity of the masking sounds rather than the targets. Overall, the experiments demonstrate that DELTAFos can be more robust cues for separating concurrent sounds than binaural cues. The relevance of these results to the perception of natural continuous speech is discussed. RP CULLING, JF (reprint author), UNIV NOTTINGHAM, MRC, INST HEARING RES, UNIV PK, NOTTINGHAM NG7 2RD, ENGLAND. 
RI Culling, John/D-1468-2009 CR ALLEN JB, 1979, J ACOUST SOC AM, V65, P943, DOI 10.1121/1.382599 ASSMANN PF, 1990, J ACOUST SOC AM, V88, P680, DOI 10.1121/1.399772 ASSMANN PF, IN PRESS J ACOUST SO BATTEAU DW, 1967, PROC R SOC SER B-BIO, V168, P158, DOI 10.1098/rspb.1967.0058 Blauert J., 1983, SPATIAL HEARING Bregman AS., 1990, AUDITORY SCENE ANAL BROADBENT DE, 1957, J ACOUST SOC AM, V29, P708, DOI 10.1121/1.1909019 BROKX JPL, 1982, J PHONETICS, V10, P23 BRONKHORST AW, 1988, J ACOUST SOC AM, V83, P1508, DOI 10.1121/1.395906 CAIRD DM, 1991, HEARING RES, V57, P91, DOI 10.1016/0378-5955(91)90078-N Campbell M., 1987, MUSICIANS GUIDE ACOU CARHART R, 1967, J ACOUST SOC AM, V42, P124, DOI 10.1121/1.1910541 CARHART R, 1969, J ACOUST SOC AM, V45, P694, DOI 10.1121/1.1911445 CARHART R, 1969, J ACOUST SOC AM, V45, P411, DOI 10.1121/1.1911389 CARHART R, 1968, J ACOUST SOC AM, V43, P1223, DOI 10.1121/1.1910971 CARIANI P, 1993, MIDWINTER M ASS RES CARLYON RP, 1991, J ACOUST SOC AM, V89, P329, DOI 10.1121/1.400468 CARLYON RP, 1992, PHILOS T ROY SOC B, V336, P347, DOI 10.1098/rstb.1992.0068 CHALIKIA MH, 1989, PERCEPT PSYCHOPHYS, V46, P487, DOI 10.3758/BF03210865 CHERRY EC, 1953, J ACOUST SOC AM, V25, P975, DOI 10.1121/1.1907229 Colburn HS, 1978, HDB PERCEPTION, VIV, P467 CULLING JF, IN PRESS J ACOUST SO DARWIN CJ, 1990, ADV SPEECH HEARING L, V1, P219 DARWIN CJ, 1981, Q J EXP PSYCHOL-A, V33, P1636 DARWIN CJ, 1990, SPEECH COMMUN, V9, P469, DOI 10.1016/0167-6393(90)90022-2 DECHEVEIGNE A, 1993, J ACOUST SOC AM, V93, P3271 DEMANY L, 1990, PERCEPT PSYCHOPHYS, V48, P436, DOI 10.3758/BF03211587 Durlach N., 1992, PRESENCE, V1, P251 GARDNER RB, 1989, J ACOUST SOC AM, V85, P1329, DOI 10.1121/1.397464 HARRIS GG, 1960, J ACOUST SOC AM, V32, P685, DOI 10.1121/1.1908181 JEFFRESS LA, 1972, F MODERN AUDITORY TH, V2, P351 KISTLER DJ, 1992, J ACOUST SOC AM, V91, P1637, DOI 10.1121/1.402444 KLATT DH, 1980, J ACOUST SOC AM, V67, P838 KUHN GF, 1977, J ACOUST SOC AM, V62, P157, DOI 10.1121/1.381498 
LEVITT H, 1971, J ACOUST SOC AM, V49, P467, DOI 10.1121/1.1912375 LICKLIDER JCR, 1948, J ACOUST SOC AM, V20, P150, DOI 10.1121/1.1906358 Lieberman Philip, 1967, INTONATION PERCEPTIO MCKEOWN JD, 1992, SPEECH COMMUN, V11, P1 MEDDIS R, 1992, J ACOUST SOC AM, V91, P233, DOI 10.1121/1.402767 MILLS AW, 1960, J ACOUST SOC AM, V32, P132, DOI 10.1121/1.1907864 NABELEK AK, 1985, J SPEECH HEAR DISORD, V50, P126 PARSONS TW, 1976, J ACOUST SOC AM, V60, P911, DOI 10.1121/1.381172 PETERSON PM, 1986, J ACOUST SOC AM, V80, P1527, DOI 10.1121/1.394357 PEUTZ VMA, 1973, 1973 P FASE S SPEECH, P89 Plomp R., 1976, Acustica, V34 Plomp R., 1983, HEARING PHYSL BASES, P270 Rayleigh, 1876, NATURE, V14, P32 RAYLEIGH, 1904, THEORY SOUND Rayleigh,, 1907, PHILOS MAG, V13, P214, DOI 10.1080/14786440709463595 SANDEL TT, 1955, J ACOUST SOC AM, V27, P842, DOI 10.1121/1.1908052 Scheffers M. T. M., 1983, THESIS RIJKSUNIVERSI Schroder M., 1954, Acustica, V4 SCHUBERT ED, 1956, J ACOUST SOC AM, V28, P895, DOI 10.1121/1.1908508 SCHUBERT ED, 1962, J ACOUST SOC AM, V34, P844, DOI 10.1121/1.1918203 SHACKLETON TM, 1992, J ACOUST SOC AM, V91, P3579, DOI 10.1121/1.402811 SHACKLETON TM, IN PRESS Q J EXP PSY STEENEKEN HJM, 1980, J ACOUST SOC AM, V67, P318, DOI 10.1121/1.384464 STUBBS RJ, 1991, J ACOUST SOC AM, V89, P1383, DOI 10.1121/1.400539 SUMMERFIELD Q, 1992, PHILOS T ROY SOC B, V336, P357, DOI 10.1098/rstb.1992.0069 Summerfield Q., 1992, J ACOUST SOC AM, V92, P2317, DOI 10.1121/1.405031 Summerfield Q., 1992, AUDITORY PROCESSING, P157 SUMMERFIELD Q, 1991, J ACOUST SOC AM, V89, P1364, DOI 10.1121/1.400659 WAGENAARS WM, 1990, J AUDIO ENG SOC, V38, P99 WALLACH H, 1949, AM J PSYCHOL, V62, P315, DOI 10.2307/1418275 WIGHTMAN FL, 1989, J ACOUST SOC AM, V85, P858, DOI 10.1121/1.397557 WIGHTMAN FL, 1992, J ACOUST SOC AM, V91, P1648, DOI 10.1121/1.402445 WIGHTMAN FL, 1989, J ACOUST SOC AM, V85, P868, DOI 10.1121/1.397558 Winer B J, 1962, STATISTICAL PRINCIPL YIN TCT, 1987, J NEUROPHYSIOL, V58, P562 YOST WA, 1971, J 
ACOUST SOC AM, V50, P1526, DOI 10.1121/1.1912806 ZWICKER UT, 1984, SPEECH COMMUN, V3, P265, DOI 10.1016/0167-6393(84)90023-2 NR 71 TC 38 Z9 38 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1994 VL 14 IS 1 BP 71 EP 95 DI 10.1016/0167-6393(94)90058-2 PG 25 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MZ551 UT WOS:A1994MZ55100005 ER PT J AU FELLBAUM, K AF FELLBAUM, K TI SHORT REPORT ON EUROSPEECH-93 SO SPEECH COMMUNICATION LA English DT Editorial Material NR 0 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1994 VL 14 IS 1 BP 97 EP 98 DI 10.1016/0167-6393(94)90059-0 PG 2 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MZ551 UT WOS:A1994MZ55100006 ER PT J AU INGRAM, J PITTAM, J AF INGRAM, J PITTAM, J TI SPEECH SCIENCE AND TECHNOLOGY - A SELECTION FROM THE PAPERS PRESENTED AT THE 4TH INTERNATIONAL-CONFERENCE IN SPEECH SCIENCE AND TECHNOLOGY (SST-92) THE UNIVERSITY-OF-QUEENSLAND, BRISBANE, AUSTRALIA, 30 NOVEMBER 3 DECEMBER 1992 SO SPEECH COMMUNICATION LA English DT Editorial Material RP INGRAM, J (reprint author), UNIV QUEENSLAND, DEPT ENGLISH, ST LUCIA, QLD 4067, AUSTRALIA. NR 0 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1993 VL 13 IS 3-4 BP 257 EP 259 DI 10.1016/0167-6393(93)90023-E PG 3 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MR144 UT WOS:A1993MR14400001 ER PT J AU LEE, CH GAUVAIN, JL PIERACCINI, R RABINER, LR AF LEE, CH GAUVAIN, JL PIERACCINI, R RABINER, LR TI LARGE VOCABULARY SPEECH RECOGNITION USING SUBWORD UNITS SO SPEECH COMMUNICATION LA English DT Article DE CONTINUOUS SPEECH RECOGNITION; SUBWORD HMMS; CONTEXT DEPENDENCY; TASK DEPENDENCY; SPEAKER DEPENDENCY; MAXIMUM A POSTERIORI ESTIMATION ID WORD RECOGNITION; MODELS AB Research in large vocabulary speech recognition has been intensively carried out worldwide, in the past several years, spurred on by advances in algorithms, architectures and hardware. In the United States, the DARPA community has focused efforts on studying several continuous speech recognition tasks including Naval Resource Management, a 991 word task, ATIS (Air Travel Information System), a speech understanding task with an open vocabulary (in practice on the order of several thousand words) and a natural language component, and Wall Street Journal, a voice dictation task with a vocabulary on the order of 20,000 words. Although we have learned a great deal about how to build and efficiently implement large vocabulary speech recognition systems, there remain a whole range of fundamental questions for which we have no definitive answers. In this paper we review the basic structure of a large vocabulary speech recognition system, address the basic system design issues, discuss the considerations in the selection of training material, choice of subword unit, method of training and adaptation of models of subword units, integration of language model, and implementation of the overall system, and report on some recent results, obtained at AT&T Bell Laboratories, on the Resource Management task. RP LEE, CH (reprint author), AT&T BELL LABS, MURRAY HILL, NJ 07974 USA. 
CR AVERBUCH A, 1987, APR IEEE INT C AC SP, P701 BAKER J, 1992, P DARPA SPEECH NATUR, P387, DOI 10.3115/1075527.1075621 BELLEGARDA JR, 1990, IEEE T ACOUST SPEECH, V38, P2033, DOI 10.1109/29.61531 DAVIS SB, 1980, IEEE T ACOUST SPEECH, V28, P357, DOI 10.1109/TASSP.1980.1163420 DENG L, 1990, P INT C ACOUST SPEEC, P741 FISSORE F, 1989, IEEE T ACOUST SPEECH, V37, P1977 FURUI S, 1986, IEEE T ACOUST SPEECH, V34, P52, DOI 10.1109/TASSP.1986.1164788 Gauvain J., 1992, P DARPA SPEECH NAT L, P185, DOI 10.3115/1075527.1075568 GAUVAIN JL, 1992, SPEECH COMMUN, V11, P205, DOI 10.1016/0167-6393(92)90015-Y GAUVIN J, 1992, P INT C ACOUSTICS SP, P481, DOI 10.1109/ICASSP.1992.225867 Giachin E. P., 1991, Computer Speech and Language, V5, DOI 10.1016/0885-2308(91)90022-I Hemphill C., 1990, P DARPA SPEECH NAT L, P96, DOI 10.3115/116580.116613 HIRSHMAN L, 1992, MULTISITE DATA COLLE, P903 Hon H.-W., 1992, THESIS CARNEGIE MELL Huang X., 1993, COMPUTER SPEECH LANG, V7, P137, DOI 10.1006/csla.1993.1007 Huang X.D., 1990, HIDDEN MARKOV MODELS JELINEK F, 1985, P IEEE, V73, P1616, DOI 10.1109/PROC.1985.13343 LAMEL LF, 1992, P DARPA CONTINUOUS S, P77 LEE CH, 1992, P ARPA CONT SPEECH R, P59 Lee C.-H., 1992, Computer Speech and Language, V6, DOI 10.1016/0885-2308(92)90022-V Lee C. 
H., 1990, Computer Speech and Language, V4, DOI 10.1016/0885-2308(90)90002-N Lee K.-F., 1989, AUTOMATIC SPEECH REC Ljolje A., 1992, P INT C SPOKEN LANGU, P313 Lowerre B., 1980, TRENDS SPEECH RECOGN, P340 MORIMOTO H, 1990, P INFO JAPAN 90, P553 MURVEIT, 1989, P DARPA SPEECH NATUR, P238 NEY H, 1988, SPEECH COMMUN, V7, P367, DOI 10.1016/0167-6393(88)90052-0 Paul D., 1992, P ICSLP, P899 Paul D.B., 1989, P IEEE INT C ACOUSTI, P449 PIERACCINI, 1992, P INT C ACOUST SPEEC, P193 PIERACCINI R, 1991, P ACL 91 Price P., 1988, P IEEE INT C AC SPEE, P651 RABINER LR, 1989, P IEEE, V77, P257, DOI 10.1109/5.18626 ROE DB, 1992, P ICASSP 92, V1, P213 ROSENBERG AE, 1990, P ICSLP 90 SAGAYAMA, 1992, P AUSTR SPEECH SCI T, P324 SCHWARTZ R, 1989, P DARPA SPEECH NATUR, P94, DOI 10.3115/100964.100968 SCHWARTZ R, 1992, P INT C AC SPEECH SI, P1 SOONG FK, 1988, IEEE T ACOUST SPEECH, V36, P871, DOI 10.1109/29.1598 WEINTRAUB M, 1989, P IEEE INT C ASSP GL, P699 WOODLAND PC, 1992, P DARPA CONTINUOUS S, P47 Zue V. W., 1989, P DARPA SPEECH NAT L, P179, DOI 10.3115/100964.100983 NR 42 TC 6 Z9 6 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1993 VL 13 IS 3-4 BP 263 EP 279 DI 10.1016/0167-6393(93)90025-G PG 17 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MR144 UT WOS:A1993MR14400002 ER PT J AU SCHIEL, F AF SCHIEL, F TI A NEW APPROACH TO SPEAKER ADAPTATION BY MODELING PRONUNCIATION IN AUTOMATIC SPEECH RECOGNITION SO SPEECH COMMUNICATION LA English DT Article DE SPEAKER ADAPTATION; PRONUNCIATION MODELS; RULES OF DIFFERING PRONUNCIATION AB To deal with large lexica (more than 2000 words) automatic speech recognition systems (ASR) use an internal phonetic representation of the speech signal and phonemic models of pronunciation from the lexicon to search for the spoken word chain or sentence. 
Therefore it is possible to model different pronunciations of a word in the lexicon. In German we observed that individual speakers pronounce words in a typical way that depends on several factors such as sex, age, place of residence, place of birth, etc. Our goal is to enhance speech recognition by automatically adapting the models of pronunciation in the lexicon to the unknown speaker. The obvious problem is that one cannot wait until the present speaker has uttered approximately 2000 different words at least once. We solved this problem by generalizing observed rules of differing pronunciation to words not yet observed. Another method presented in this paper is speaker adaptation by re-estimating the a posteriori probabilities of the phonetic units used in a ''bottom up'' ASR system. A word hypothesis is evaluated by the product of the a posteriori probabilities of the phonetic units produced by the classification to the phonetic units belonging to the word hypothesis. Normally these probabilities are estimated during the training of the ASR system and stay fixed during the test. We propose an algorithm which observes the typical confusions of phonetic units of the unknown speaker and adapts the a posteriori probabilities continuously. C1 TECH UNIV MUNICH, LEHRSTUHL DATENVERARBEITUNG, W-8000 MUNICH 2, GERMANY. 
CR DUDEN, 1990, AUSSPRACHEWORTERBUCH, V6 HOFMANN U, 1991, THESIS TU MUNCHEN HUANG XD, 1988, ELECTRON LETT, V24, P6, DOI 10.1049/el:19880004 JEKOSCH U, 1989, INFORMATIONSTECHNIK, V6, P400 PLANNERER B, 1992, P INT C ACOUST SPEEC, P581, DOI 10.1109/ICASSP.1992.225842 REICHL W, 1992, 1992 P DAGM S DRESD, P261 RUSKE G, 1991, SPRACHLICHE MENSCH M, P33 RUSKE G, 1991, STUDIENTEXTE SPRACHK, P173 *SAM, 1990, TECH REP SAM ESPR PR, P244 SCHIEL F, 1991, STUDIENTEXTE SPRACHK, P173 SCHIEL F, 1991, INFORMATIK FACHBERIC, V290, P244 WEIGEL W, 1990, THESIS TU MUNCHEN WINTER M, 1991, THESIS TU MUNCHEN WOLFERTSTETTER F, 1991, THESIS TU M NR 14 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1993 VL 13 IS 3-4 BP 281 EP 286 DI 10.1016/0167-6393(93)90026-H PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MR144 UT WOS:A1993MR14400003 ER PT J AU SONG, JM SAMOUELIAN, A AF SONG, JM SAMOUELIAN, A TI A ROBUST SPEAKER-INDEPENDENT ISOLATED WORD HMM RECOGNIZER FOR OPERATION OVER THE TELEPHONE NETWORK SO SPEECH COMMUNICATION LA English DT Article DE HMM; AUTOMATIC SPEECH RECOGNITION; SPEAKER INDEPENDENT; TELEPHONE SERVICES; CONTEXT-DEPENDENT MODELING; POST-PROCESSING ID HIDDEN MARKOV-MODELS; SPEECH RECOGNITION AB This paper presents the results of a speaker-independent, isolated word speech recognition system developed for information access over Australian public switched telephone network (PSTN). The recognition system is based on Continuous Density Hidden Markov Modelling (CDHMM). The speech database was collected over the PSTN from a large variety of speakers and different geographical locations. The database contained a vocabulary of 55 words consisting of 41 country names and their variations plus a few control words. 
The recognition performance, tested on 100 other speakers (50 males and 50 females) with no grammar constraint, resulted in an overall recognition rate of 97.3%. This paper describes the HMM training methodology, which consisted of three stages: hand segmented seed model training, automatic word segmentation and reestimation. To facilitate the future implementation of the recognition system in a DSP environment, a fast frame synchronous Viterbi algorithm was implemented with no degradation in recognition performance. The end-point detection is performed by the combination of the silence/noise model with the word models. For confusable word pairs, sub-word models are used to improve the recognition rate. A post-processing approach is used to enhance the performance of the recognition system, in which all ranked candidates from the Viterbi decoding are subject to the tests of the minimum word duration and the likelihood difference between the first candidate and the second candidate. C1 TELSTRA AUSTRALIA, INT BUSINESS UNIT, TECH DEV GRP, SYDNEY 2001, AUSTRALIA. RP SONG, JM (reprint author), UNIV SYDNEY, DEPT ELECT ENGN, SYDNEY, NSW 2006, AUSTRALIA. CR COLE RA, 1992, CSE92014 ORG GRAD I DAVIS SB, 1980, IEEE T ACOUST SPEECH, V28, P357, DOI 10.1109/TASSP.1980.1163420 JUANG BH, 1987, IEEE T ACOUST SPEECH, V35, P947 Lee K.-F., 1989, AUTOMATIC SPEECH REC NEY H, 1992, IEEE T SIGNAL PROCES, V40, P272, DOI 10.1109/78.124938 RABINER LR, 1989, P IEEE, V77, P257, DOI 10.1109/5.18626 RABINER LR, 1989, IEEE T ACOUST SPEECH, V37, P1214, DOI 10.1109/29.31269 Wilpon J. G., 1987, Computer Speech and Language, V2, DOI 10.1016/0885-2308(87)90015-5 NR 8 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1993 VL 13 IS 3-4 BP 287 EP 295 DI 10.1016/0167-6393(93)90027-I PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MR144 UT WOS:A1993MR14400004 ER PT J AU KATAGISHI, K SINGER, H AIKAWA, K SAGAYAMA, S AF KATAGISHI, K SINGER, H AIKAWA, K SAGAYAMA, S TI FEATURE-EXTRACTION USING A MATRIX COEFFICIENT FILTER FOR SPEECH RECOGNITION SO SPEECH COMMUNICATION LA English DT Article DE FILTERING; CEPSTRUM; DYNAMICAL FEATURE; DELTA-CEPSTRUM; DYNAMIC CEPSTRUM; HMM; SPEECH RECOGNITION AB This paper provides a new interpretation of the so-called ''delta-cepstrum'' and extends the formulation of the conventional delta-cepstrum towards an optimal design of the filter, which extracts important spectral dynamics from a cepstrum sequence. The algorithm to obtain new feature parameters is unified to a formulation using a matrix coefficient filter and is tested through Japanese speech recognition experiments. The average recognition error rate in a Japanese 24 phoneme recognition experiment for four speakers was reduced from 12.2% to 10.3%. RP KATAGISHI, K (reprint author), ATR RES LABS, DEPT 1, HIKARIDAI 2-2, SEIKA, KYOTO 61902, JAPAN. CR AIKAWA K, 1992, IEICE SP9243 TECHN R FURUI S, 1986, IEEE T ACOUST SPEECH, V34, P52, DOI 10.1109/TASSP.1986.1164788 Hermansky H., 1991, P EUROSPEECH, P1367 JUANG BH, 1987, IEEE T ACOUST SPEECH, V35, P947 Miyasaka E., 1983, Journal of the Acoustical Society of Japan, V39 SAGAYAMA S, 1979, P SPRING C AC SOC JA, P589 NR 6 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1993 VL 13 IS 3-4 BP 297 EP 306 DI 10.1016/0167-6393(93)90028-J PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MR144 UT WOS:A1993MR14400005 ER PT J AU HUO, Q CHAN, CK AF HUO, Q CHAN, CK TI THE GRADIENT PROJECTION METHOD FOR THE TRAINING OF HIDDEN MARKOV-MODELS SO SPEECH COMMUNICATION LA English DT Article DE HIDDEN MARKOV MODEL; AUTOMATIC SPEECH RECOGNITION; HMM TRAINING; GRADIENT PROJECTION METHOD; NONLINEAR PROGRAMMING; CONSTRAINED OPTIMIZATION ID CHAINS AB In this paper, the training of HMMs has been considered a general optimization problem with linear constraints. A gradient projection method for nonlinear programming with linear constraints has been introduced and presented to solve for ''optimal'' values of the model parameters. When this classic method is applied to train HMMs of discrete or Gaussian mixture observation densities, a very simple formulation can be derived due to the special structure of the constraints on the HMM parameters. This kind of classical gradient-based optimization method can offer an opportunity for more flexible modeling of speech signals and more sophisticated training of model parameters for speech recognition. C1 UNIV SCI & TECHNOL CHINA, DEPT RADIO & ELECTR, HEFEI, PEOPLES R CHINA. RP HUO, Q (reprint author), UNIV HONG KONG, DEPT COMP SCI, HONG KONG, HONG KONG. CR BAUM LE, 1967, B AM MATH SOC, V73, P360, DOI 10.1090/S0002-9904-1967-11751-8 BAUM LE, 1970, ANN MATH STAT, V41, P164, DOI 10.1214/aoms/1177697196 BRIDLE JS, 1990, SPEECH COMMUN, V9, P83, DOI 10.1016/0167-6393(90)90049-F Brown P., 1987, THESIS CARNEGIE MELL EPHRAIM Y, 1989, IEEE T INFORM THEORY, V35, P1001, DOI 10.1109/18.42209 FRANCO H, 1991, P ICASSP 91, P357, DOI 10.1109/ICASSP.1991.150350 Gill P. E., 1981, PRACTICAL OPTIMIZATI GILL PE, 1974, NUMERICAL METHODS CO GILL PE, 1979, MATH PROGRAM, V17, P32, DOI 10.1007/BF01588224 Goldstein A. 
A., 1965, SIAM J CONTROL, V3, P147 GOPALAKRISHNAN PS, 1991, IEEE T INFORM THEORY, V37, P107, DOI 10.1109/18.61108 HUO Q, 1991, TR9111 U HONG KONG D JUANG BH, 1985, AT&T TECH J, V64, P1235 JUANG BH, 1992, IEEE T SIGNAL PROCES, V40, P3043, DOI 10.1109/78.175747 KAPADIA S, 1993, P ICASSP, V2, P491 LIPORACE LA, 1982, IEEE T INFORM THEORY, V28, P729, DOI 10.1109/TIT.1982.1056544 Ljolje A., 1990, P ICASSP 90, P709 Luenberger D. G., 1984, LINEAR NONLINEAR PRO NORMANDIN Y, 1991, THESIS MCGILL U DEP ROSEN JB, 1960, J SOC IND APPL MATH, V8, P181, DOI 10.1137/0108011 WOLFE P, 1969, SIAM REV, V11, P226, DOI 10.1137/1011036 NR 21 TC 5 Z9 5 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1993 VL 13 IS 3-4 BP 307 EP 313 DI 10.1016/0167-6393(93)90029-K PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MR144 UT WOS:A1993MR14400006 ER PT J AU SINGER, H SAGAYAMA, S AF SINGER, H SAGAYAMA, S TI SUPRASEGMENTAL DURATION CONTROL WITH MATRIX PARSING IN CONTINUOUS SPEECH RECOGNITION SO SPEECH COMMUNICATION LA English DT Article DE HIDDEN MARKOV MODEL; CYK PARSER; MATRIX PARSER; CONTINUOUS SPEECH RECOGNITION AB This paper describes a unified framework for continuous speech recognition (CSR) under grammatical constraints, where trellis calculations and parsing are performed by the same simple fundamental operations, namely multiplication and addition of likelihood matrices. The matrix parser is shown to be a generalization of the CYK parser. It also facilitates explicit supra-segmental duration control for all grammatical categories. Preliminary results showed that improved duration control on the mora level raised the recognition accuracy for a phrase recognition task from 86.7% to 88.5%. RP SINGER, H (reprint author), ATR ITL, DEPT SPEECH PROC, 2-2 HIKARIDAI, SEIKA, KYOTO 61902, JAPAN. 
CR Fu K.S., 1982, SYNTACTIC PATTERN RE Kita K., 1989, ICASSP-89: 1989 International Conference on Acoustics, Speech and Signal Processing (IEEE Cat. No.89CH2673-2), DOI 10.1109/ICASSP.1989.266524 NEY H, 1987, IEEE INT C ACOUST SP, P69 PORT RF, 1987, J ACOUST SOC AM, V81, P1574, DOI 10.1121/1.394510 SAGAYAMA S, 1991, P EUROPEAN C SPEECH, P1225 SINGER H, 1992, SPR P ACOUST SOC JAP, P89 NR 6 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1993 VL 13 IS 3-4 BP 315 EP 322 DI 10.1016/0167-6393(93)90030-O PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MR144 UT WOS:A1993MR14400007 ER PT J AU HUNT, A AF HUNT, A TI RECURRENT NEURAL NETWORKS FOR SYLLABICATION SO SPEECH COMMUNICATION LA English DT Article DE SYLLABICATION; RECURRENT NEURAL NETWORKS; SYLLABLE SEGMENTATION AB An important procedure in many prosodic analysis systems is locating syllables. The location of syllables is used in the identification of stress and of pitch accents, which in turn form the basis for the analysis of rhythm and intonation. This paper presents a novel syllabification system utilising recurrent neural networks which operates on speaker-independent continuous speech. It is trained and tested on dialect region 1 of the TIMIT database, and finds 94% of syllables and places most syllable boundary points within 20 msec of the desired location. Methods for optimising the performance and training of the recurrent neural networks are investigated. RP HUNT, A (reprint author), UNIV SYDNEY, DEPT ELECT ENGN, SPEECH TECHNOL RES GRP, SYDNEY, NSW 2006, AUSTRALIA. 
CR AULL A, 1985, P INT C ACOUST SPEEC, P4111 BAGSHAW PC, 1992, 4TH P AUSTR INT C SP, P808 DAVIS SB, 1980, IEEE T ACOUST SPEECH, V28, P357, DOI 10.1109/TASSP.1980.1163420 HAFFNER P, 1988, FAL P M ACOUST SOC J JACOBS RA, 1988, NEURAL NETWORKS, V1, P295, DOI 10.1016/0893-6080(88)90003-2 Lea W., 1980, TRENDS SPEECH RECOGN, P166 LIPPMAN RP, 1988, P INT C ACOUST SPEEC, P1 Lippmann R. P., 1989, Neural Computation, V1, DOI 10.1162/neco.1989.1.1.1 MARCUS SM, 1981, PERCEPT PSYCHOPHYS, V30, P247, DOI 10.3758/BF03214280 MERMELSTEIN P, 1975, J ACOUST SOC AM, V58, P880, DOI 10.1121/1.380738 ROBINSON T, 1990, 3RD P AUSTR INT C SP, P362 Robinson T., 1991, Computer Speech and Language, V5, DOI 10.1016/0885-2308(91)90010-N Waibel A., 1988, PROSODY SPEECH RECOG WERBOS PJ, 1990, P IEEE, V78, P1550, DOI 10.1109/5.58337 NR 14 TC 6 Z9 6 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1993 VL 13 IS 3-4 BP 323 EP 332 DI 10.1016/0167-6393(93)90031-F PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MR144 UT WOS:A1993MR14400008 ER PT J AU BAGSHAW, PC AF BAGSHAW, PC TI AN INVESTIGATION OF ACOUSTIC EVENTS RELATED TO SENTENTIAL STRESS AND PITCH ACCENTS, IN ENGLISH SO SPEECH COMMUNICATION LA English DT Article DE PROSODIC LABELING; SYLLABICATION; STRESS; PITCH MOVEMENTS ID SPEECH AB An algorithm is described which abstracts acoustic parameters of a speech waveform to automatically transcribe sentential stress and pitch movements. The waveform acoustics used are duration, energy and fundamental frequency. The abstractions described aim to isolate the prosodically imposed variations in these parameters. A method of syllabification from acoustic parameters is presented. The prominence of each syllable is determined using the automatic process described and the resultant transcription is compared with a hand-labelled prosodic transcription. 
The agreement level of 61.6% suggests that acoustic parameters other than those already used by the algorithm may be available to the human labeller. RP BAGSHAW, PC (reprint author), UNIV EDINBURGH, CTR SPEECH TECHNOL RES, ATR INTERPRETING TELEPHONY RES LABS, 80 SOUTH BRIDGE, EDINBURGH EH1 1HN, SCOTLAND. CR Bagshaw P. C., 1993, P EUR C SPEECH COMM, V2, P1003 BAGSHAW PC, 1992, P INT C SPOKEN LANGU, V2, P859 CAMPBELL WN, 1992, P INT C SPOKEN LANGU, V1, P663 CAMPBELL WN, 1990, P INT C SPOKEN LANGU, V1, P9 Crystal D., 1969, PROSODIC SYSTEMS INT HARRIS FJ, 1978, P IEEE, V66, P51, DOI 10.1109/PROC.1978.10837 HIERONYMUS JL, 1989, P EUROPEAN C SPEECH, V1, P226 LAVER J, 1988, ASTR CSTR1 U ED CTR MEDAN Y, 1991, IEEE T SIGNAL PROCES, V39, P40, DOI 10.1109/78.80763 PICKERING B, IN PRESS WORKING SPE, pCH4 RABINER LR, 1975, IEEE T ACOUST SPEECH, V23, P552, DOI 10.1109/TASSP.1975.1162749 Rousseeuw PJ, 1987, ROBUST REGRESSION OU SCHEFFERS MTM, 1988, 7TH P FASE S ED, V3, P981 NR 13 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1993 VL 13 IS 3-4 BP 333 EP 342 DI 10.1016/0167-6393(93)90032-G PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MR144 UT WOS:A1993MR14400009 ER PT J AU CAMPBELL, N AF CAMPBELL, N TI AUTOMATIC DETECTION OF PROSODIC BOUNDARIES IN SPEECH SO SPEECH COMMUNICATION LA English DT Article DE SPEECH-SEGMENTATION; DURATION; SYLLABLES; PHRASING ID DURATIONS AB This paper describes a method for automatic annotation of prosodic events in speech, using segmental duration information. It details a way of differentiating prominence-related lengthening from boundary-related lengthening, using durational clues alone, and discusses an anomaly in the phrasing characteristics of four speakers' readings of 200 phonetically-balanced sentences. 
An algorithm is described that uses syllable-level differences in normalised segmental duration measures to detect prosodic boundaries in a speech signal. Tests with read-speech data from four British-English RP speakers show high agreement between speakers with respect to the number of boundaries detected and the length of the phrases delimited by each pair of boundaries, but the correlation between speakers on actual boundary locations is low. There is particular disagreement between speakers in the case of a single function word linking two groups of content words. This discrepancy can be resolved if the boundary is taken to be at the function word location itself, rather than at one or other side of the word. These results are taken to indicate some freedom in the placement of prosodic boundaries in such cases, sometimes being cued by a syntactic boundary, and sometimes by a rhythmic one. RP CAMPBELL, N (reprint author), ADV TELECOMMUN RES INST, INTERPRETING TELEPHONY RES LABS, KYOTO 61902, JAPAN. CR CAMPBELL WN, 1991, J PHONETICS, V19, P37 CAMPBELL WN, 1989, P EUROPEAN C SPEECH, P698 CRYSTAL TH, 1986, IEEE INT C ACOUST SP, V51, P2791 EDWARDS J, 1988, PHONETICA, V45, P156 EDWARDS JR, 1991, J ACOUST SOC AM, V89 EDWARDS K, 1992, EVALUATION HMM BASED GAITENBY J, 1965, 112 HASK LAB NEW HAV GARDING E, 1960, STUD LINGUISTICA, V14, P37 Klatt D.H., 1975, J PHONETICS, V3, P129 Levinson S. E., 1986, Computer Speech and Language, V1, DOI 10.1016/S0885-2308(86)80009-2 SCHMIDT MS, 1991, P EUROSPEECH 91 GENE, V2, P701 SCOTT DR, 1982, J ACOUST SOC AM, V71, P996, DOI 10.1121/1.387581 Silverman K., 1992, P INT C SPOK LANG PR, P867 VAISSIERE J, 1992, COMMUNICATION WIGHTMAN CW, 1992, J ACOUST SOC AM, V91, P1707, DOI 10.1121/1.402450 WIGHTMAN CW, 1993, UNPUB AUTOMATIC BELL NR 16 TC 7 Z9 7 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1993 VL 13 IS 3-4 BP 343 EP 354 DI 10.1016/0167-6393(93)90033-H PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MR144 UT WOS:A1993MR14400010 ER PT J AU FLETCHER, J MCVEIGH, A AF FLETCHER, J MCVEIGH, A TI SEGMENT AND SYLLABLE DURATION IN AUSTRALIAN ENGLISH SO SPEECH COMMUNICATION LA English DT Article DE SYLLABLE DURATION; PROSODY; ELASTICITY HYPOTHESIS; NEURAL NETWORK AB A database of around 6500 syllables and their component segments was analysed to describe and develop a model of segment and syllable duration for Australian English. Segment duration was analysed according to prosodic context. Syllables were labelled and analysed according to their prosodic context, length (number of segments), and nature of syllabic peak. Syllable duration was modelled using a three-layer neural network that was trained and tested on different portions of the database. Segment durations were stretched or compressed to fit the network-assigned syllable duration frame. This relatively simple syllable model was able to account for nearly 80% of syllable-level durational variance observed in the database. RP FLETCHER, J (reprint author), MACQUARIE UNIV, SPEECH HEARING & LANGUAGE RES CTR, N RYDE, NSW 2109, AUSTRALIA. CR Barry W. J., 1992, Computer Speech and Language, V6, DOI 10.1016/0885-2308(92)90041-2 Bernard J., 1967, AUSTR U MODERN LANGU, V27, P37 BERNARD JR, 1986, STUDY H D WORDS AUST BERNARD JR, 1970, LANG SPEECH, V13, P37 Campbell W. 
N., 1992, TALKING MACHINES THE, P211 CAMPBELL WN, 1990, SPEECH COMMUN, V9, P57, DOI 10.1016/0167-6393(90)90046-C CAMPBELL WN, 1991, J PHONETICS, V19, P37 CAMPBELL WN, 1989, P EUROSPEECH 1989, V2, P698 CAMPBELL WN, 1989, SPEECH COMMUN, V2, P698 CARLSON R, 1979, 9TH P INT C PHON SCI CARLSON R, 1986, PHONETICA, V43, P140 COCHRANE GR, 1970, PHONETICA, V22, P240 COOPER WE, 1981, PHONETICA, V38, P106 CROOT K, 1992, 4TH P INT C SPEECH S, P86 CRYSTAL TH, 1988, J ACOUST SOC AM, V83, P1553, DOI 10.1121/1.395911 EDWARDS J, 1991, J ACOUST SOC AM, V89, P369, DOI 10.1121/1.400674 HARRINGTON J, IN PRESS COMPUT SPEE HIERONYMUS J, 1990, PROPOSED SPEECH SEGM HOUSE AS, 1961, J ACOUST SOC AM, V33, P1174, DOI 10.1121/1.1908941 Kirk R. E., 1968, EXPT DESIGN PROCEDUR KLATT DH, 1976, J ACOUST SOC AM, V59, P1208, DOI 10.1121/1.380986 Klatt D. H., 1979, FRONTIERS SPEECH COM, P287 Lindblom B. E. F., 1973, PAPERS LINGUISTICS U, V21, P1 MILLAR JB, 1990, 3RD P AUSTR INT C SP Mitchell A, 1965, SPEECH AUSTR ADOLESC PETERSON GE, 1960, J ACOUST SOC AM, V32, P693, DOI 10.1121/1.1908183 PIERREHUMBERT JB, 1980, THESIS INDIANA U LIN Rumelhart D, 1986, PARALLEL DISTRIBUTED Silverman K., 1992, P INT C SPOK LANG PR, P867 UMEDA N, 1977, J ACOUST SOC AM, V61, P846, DOI 10.1121/1.381374 NR 30 TC 7 Z9 7 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1993 VL 13 IS 3-4 BP 355 EP 365 DI 10.1016/0167-6393(93)90034-I PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MR144 UT WOS:A1993MR14400011 ER PT J AU STEVENS, KN AF STEVENS, KN TI MODELS FOR THE PRODUCTION AND ACOUSTICS OF STOP CONSONANTS SO SPEECH COMMUNICATION LA English DT Article DE STOP CONSONANTS; STOP CONSONANT ACOUSTICS; AERODYNAMICS OF STOP CONSONANTS; MODELS OF STOP CONSONANTS AB Stop consonants are produced by forming a closure in the vocal tract, building up pressure in the mouth behind this closure, and releasing the closure. Models of the mechanical, aerodynamic, and acoustic events in the vicinity of the stop consonant are described, and examples of calculations of the airflow and of various components of the radiated sound are given. At the consonantal release, these components of the sound include an initial transient, a burst of frication noise, and an interval in which there is a sound source at the glottis and transitions in the formants. The models predict the absolute levels of these components for different places of articulation for the consonants. C1 MIT, DEPT ELECT ENGN & COMP SCI, CAMBRIDGE, MA 02139 USA. RP STEVENS, KN (reprint author), MIT, ELECTR RES LAB, CAMBRIDGE, MA 02139 USA. 
CR Fant G., 1960, ACOUSTIC THEORY SPEE FUJIMURA O, 1961, J SPEECH HEAR RES, V4, P233 ISHIZAKA K, 1975, IEEE T ACOUST SPEECH, V23, P370, DOI 10.1109/TASSP.1975.1162701 KEWLEYPORT D, 1982, J ACOUST SOC AM, V72, P379, DOI 10.1121/1.388081 MULLER EM, 1980, SPEECH LANGUAGE ADV, V4, P317 PASTEL L, 1987, THESIS MIT CAMBRIDGE Perkell JS, 1969, PHYSL SPEECH PRODUCT, V53 ROTHENBERG M, 1968, BIBLIOTHECA PHONETIC SHADLE C, 1985, RLE506 MIT TECH REP STEVENS KN, 1966, J ACOUST SOC AM, V40, P123, DOI 10.1121/1.1910027 STEVENS KN, 1971, J ACOUST SOC AM, V50, P1180, DOI 10.1121/1.1912751 SUSSMAN HM, 1991, J ACOUST SOC AM, V90, P1309, DOI 10.1121/1.401923 WESTBURY JR, 1979, TEXAS LINGUISTIC FOR NR 13 TC 13 Z9 13 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1993 VL 13 IS 3-4 BP 367 EP 375 DI 10.1016/0167-6393(93)90035-J PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MR144 UT WOS:A1993MR14400012 ER PT J AU CLERMONT, F AF CLERMONT, F TI SPECTROTEMPORAL DESCRIPTION OF DIPHTHONGS IN F1-F2-F3 SPACE SO SPEECH COMMUNICATION LA English DT Article DE ACOUSTIC PHONETICS; DIPHTHONG; FORMANT CONTOURS; FORMANT SPACE; DIPHTHONG SURFACE; VOWEL SURFACE; AUSTRALIAN ENGLISH ID AMERICAN ENGLISH VOWELS; AUDITORY REPRESENTATION; MODEL AB The prevailing approach to the acoustic-phonetic description of the diphthong is based (1) on the two lowest, vocal-tract resonance (or formant) frequencies (F1 and F2) considered either individually or jointly in the F1-F2 plane, and (2) on a very sparse representation of the temporal course of these frequencies. While this time-honoured approach has been particularly useful for characterising the initial and final vowels of the diphthong, there appears to be very little progress beyond the F1-F2 plane, as a parametric framework for elucidating the dynamic nature of the vowel-to-vowel transition. 
By contrast, a more accurate spectro-temporal description of a subset of the Australian English diphthongs (/aI/ and /I/) is obtained in this work by considering a detailed, temporal representation of the three lowest formant-frequencies (F1, F2 and F3). In particular, certain nonlinearity features of the densely-sampled contour of the F3 are highlighted, which appear to have hitherto been either unknown or considered inconsequential to the specification of the diphthong. This finding is shown to contribute a new, three-dimensional (F1-F2-F3) perspective on the acoustic characteristics of the vocalic transition of the diphthong. RP CLERMONT, F (reprint author), UNIV COLL NEW S WALES, AUSTRALIAN DEF FORCE ACAD, DEPT COMP SCI, CANBERRA, AUSTRALIA. CR Bernard J. R. L., 1970, Z PHONETIK, V23, P113 BOND ZS, 1978, LANG SPEECH, V21, P253 BOND ZS, 1982, J PHONETICS, V10, P259 BROAD DJ, 1977, J ACOUST SOC AM, V62, P1467, DOI 10.1121/1.381676 BROAD DJ, 1987, J ACOUST SOC AM, V81, P155, DOI 10.1121/1.395025 BURGESS N, 1969, LANG SPEECH, V12, P238 BUTLER SJ, 1982, J ACOUST SOC AM, V72, pS79, DOI 10.1121/1.2020071 CARRE R, 1991, J PHONETICS, V19, P433 CLERMONT F, 1992, 4 P AUSTR INT C SPEE, P48 CLERMONT F, 1988, 2 P AUSTR INT C SPEE, P216 CLERMONT F, 1991, THESIS AUSTR NAT U C COLLIER R, 1982, LANG SPEECH, V25, P305 de Manrique A. 
M., 1979, PHONETICA, V36, P194 Fant G., 1960, ACOUSTIC THEORY SPEE GAY T, 1968, J ACOUST SOC AM, V44, P1570, DOI 10.1121/1.1911298 GERBER SE, 1971, 7 P INT C PHON SCI, P479 GOTTFRIED M, 1989, J ACOUST SOC AM, V86, pS123, DOI 10.1121/1.2027360 HOLBROOK A, 1962, J SPEECH HEAR RES, V5, P38 HOUDE RA, 1968, SCRL MONOGRAPH, V2 JHA SK, 1985, J PHONETICS, V13, P107 KENT R D, 1972, Folia Phoniatrica, V24, P278 KENT RD, 1972, PHONETICA, V26, P16 KOENIG W, 1946, J ACOUST SOC AM, V18, P19, DOI 10.1121/1.1916342 LEHISTE I, 1961, J ACOUST SOC AM, V33, P268, DOI 10.1121/1.1908638 PEETERS WJM, 1991, THESIS U UTRECHT UTR PETERSON GE, 1952, J ACOUST SOC AM, V24, P175, DOI 10.1121/1.1906875 POLS LCW, 1969, J ACOUST SOC AM, V46, P458, DOI 10.1121/1.1911711 Potter R. K., 1947, VISIBLE SPEECH POTTER RK, 1948, J ACOUST SOC AM, V20, P528, DOI 10.1121/1.1906406 POTTER RK, 1950, J ACOUST SOC AM, V22, P807, DOI 10.1121/1.1906694 REN H, 1986, THESIS U CALIFORNIA SUSSMAN HM, 1990, J ACOUST SOC AM, V88, P87, DOI 10.1121/1.399848 SYRDAL AK, 1986, J ACOUST SOC AM, V79, P1086, DOI 10.1121/1.393381 SYRDAL AK, 1985, SPEECH COMMUN, V4, P121, DOI 10.1016/0167-6393(85)90040-8 TOLEDO GA, 1987, 11 P INT C PHON SCI, P125 YANG S, 1987, 11 P INT C PHON SCI, V1, P239 YEGNANARAYANA, 1979, P INT C ACOUST SPEEC, P744 YEGNANARAYANA B, 1978, J ACOUST SOC AM, V63, P1638, DOI 10.1121/1.381864 NR 38 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1993 VL 13 IS 3-4 BP 377 EP 390 DI 10.1016/0167-6393(93)90036-K PG 14 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MR144 UT WOS:A1993MR14400013 ER PT J AU COOKE, MP BROWN, GJ AF COOKE, MP BROWN, GJ TI COMPUTATIONAL AUDITORY SCENE ANALYSIS - EXPLOITING PRINCIPLES OF PERCEIVED CONTINUITY SO SPEECH COMMUNICATION LA English DT Article DE PERCEIVED CONTINUITY; AUDITORY SCENE ANALYSIS; AUDITORY MODEL; AUDITORY GROUPING; SPEECH SEGREGATION; RESYNTHESIS ID SPEECH AB Acoustic sources are often occluded by other sounds, yet the strategies for recovering individual sources employed by the auditory system in tasks such as speech recognition are remarkably robust against these intrusions. There are often sufficient cues which allow the auditory system to determine whether sound components continue through such occlusions. This paper reviews the situations where an assumption of continuity is warranted and demonstrates how the principles governing the so-called ''continuity illusion'' can be used within a computational system for segregating acoustic sources. RP COOKE, MP (reprint author), UNIV SHEFFIELD, DEPT COMP SCI, 211 PORTOBELLO ST, SHEFFIELD S10 2TN, S YORKSHIRE, ENGLAND. 
CR Bregman AS., 1990, AUDITORY SCENE ANAL BROWN GA, 1992, P 34 MECH WORK STEEL, P439 BROWN GJ, 1992, THESIS U SHEFFIELD Cooke M., 1993, MODELLING AUDITORY P DENBIGH PN, 1992, SPEECH COMMUN, V11, P119, DOI 10.1016/0167-6393(92)90006-S ELLIS D, 1992, THESIS MIT, V2 GLASBERG BR, 1990, HEARING RES, V47, P103, DOI 10.1016/0378-5955(90)90170-T Kiang NY-s, 1965, DISCHARGE PATTERNS S MEDDIS R, 1988, J ACOUST SOC AM, V83, P1056, DOI 10.1121/1.396050 MELLINGER DK, 1991, THESIS STANFORD U MOORE DR, 1987, BRIT MED BULL, V43, P856 PATTERSON RD, 1988, SVOS VINAL REPORT AU WARREN RM, 1970, SCIENCE, V167, P392, DOI 10.1126/science.167.3917.392 Weintraub M., 1985, THESIS STANFORD U NR 14 TC 13 Z9 14 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1993 VL 13 IS 3-4 BP 391 EP 399 DI 10.1016/0167-6393(93)90037-L PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MR144 UT WOS:A1993MR14400014 ER PT J AU CUTLER, A KEARNS, R NORRIS, D SCOTT, DR AF CUTLER, A KEARNS, R NORRIS, D SCOTT, DR TI PROBLEMS WITH CLICK DETECTION - INSIGHTS FROM CROSS-LINGUISTIC COMPARISONS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH PERCEPTION; CLICK DETECTION; ENGLISH; FRENCH; OPEN-CLASS AND CLOSED-CLASS WORDS; LEVELS OF PROCESSING ID WORD AB Cross-linguistic comparisons may shed light on the levels of processing involved in the performance of psycholinguistic tasks. For instance, if the same pattern of results appears whether or not subjects understand the experimental materials, it may be concluded that the results do not reflect higher-level linguistic processing. In the present study, English and French listeners performed two tasks - click location and speeded click detection - with both English and French sentences, closely matched for syntactic and phonological structure. 
Clicks were located more accurately in open- than in closed-class words in both English and French; they were detected more rapidly in open- than in closed-class words in English, but not in French. The two listener groups produced the same pattern of responses, suggesting that higher-level linguistic processing was not involved in the listeners' responses. It is concluded that click detection tasks are primarily sensitive to low-level (e.g. acoustic) effects, and hence are not well suited to the investigation of linguistic processing. C1 UNIV BRIGHTON, INFORMAT TECHNOL RES INST, BRIGHTON BN2 4AT, ENGLAND. RP CUTLER, A (reprint author), MRC, APPL PSYCHOL UNIT, 15 CHAUCER RD, CAMBRIDGE CB2 2EF, ENGLAND. RI Cutler, Anne/C-9467-2012 CR ABRAMS K, 1969, Q J EXP PSYCHOL, V21, P280, DOI 10.1080/14640746908400223 AKEROYD MA, 1992, THESIS U CAMBRIDGE BERENT I, 1993, COGNITION, V46, P203, DOI 10.1016/0010-0277(93)90010-S CORCORAN DW, 1966, NATURE, V210, P658, DOI 10.1038/210658a0 CUTLER A, 1985, LINGUISTICS, V23, P659, DOI 10.1515/ling.1985.23.5.659 Cutler A., 1979, SENTENCE PROCESSING, P113 CUTLER A, 1993, J PSYCHOLINGUIST RES, V22, P109 CUTLER A, 1987, COGNITIVE PSYCHOL, V19, P141, DOI 10.1016/0010-0285(87)90010-7 DELATTRE P, 1966, IRAL-INT REV APPL LI, V4, P183, DOI 10.1515/iral.1966.4.1-4.183 DREWNOWSKI A, 1977, MEM COGNITION, V5, P636, DOI 10.3758/BF03197410 FODOR JA, 1965, J VERB LEARN VERB BE, V4, P414, DOI 10.1016/S0022-5371(65)80081-0 HOLMES VM, 1972, PERCEPT PSYCHOPHYS, V12, P9, DOI 10.3758/BF03212836 HOLMES VM, 1970, PERCEPT PSYCHOPHYS, V7, P297, DOI 10.3758/BF03210171 KELLAS G, 1988, J EXP PSYCHOL HUMAN, V14, P601, DOI 10.1037//0096-1523.14.4.601 KORIAT A, 1991, J EXP PSYCHOL LEARN, V17, P66, DOI 10.1037//0278-7393.17.1.66 Otake T, 1993, J MEM LANG, V32, P358 REBER AS, 1970, PERCEPT PSYCHOPHYS, V8, P81, DOI 10.3758/BF03210179 SCOTT DR, 1985, J PHONETICS, V13, P155 SEITZ MR, 1974, MEM COGNITION, V2, P43, DOI 10.3758/BF03197490 NR 19 TC 1 Z9 1 PU ELSEVIER SCIENCE 
BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1993 VL 13 IS 3-4 BP 401 EP 410 DI 10.1016/0167-6393(93)90038-M PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MR144 UT WOS:A1993MR14400015 ER PT J AU FORSYTH, ME SUTHERLAND, AM ELLIOTT, JA JACK, MA AF FORSYTH, ME SUTHERLAND, AM ELLIOTT, JA JACK, MA TI HMM SPEAKER VERIFICATION WITH SPARSE TRAINING DATA ON TELEPHONE QUALITY SPEECH SO SPEECH COMMUNICATION LA English DT Article DE SPEAKER VERIFICATION; SPEAKER RECOGNITION; HMM; SEMICONTINUOUS; DISCRETE; DURATION MODELING AB Speaker verification experiments using discrete and semi-continuous HMMs with telephone quality isolated digits are reported. The models were trained with varying numbers of tokens, giving equal error rates of 14% and 12%, respectively, on single isolated digits, and 4% and 2% on a sequence of 12 isolated digits. RP FORSYTH, ME (reprint author), UNIV EDINBURGH, CTR SPEECH TECHNOL RES, 80 SOUTH BRIDGE, EDINBURGH EH1 1HN, SCOTLAND. CR FORSYTH ME, 1993, 3RD P EUR C SPEECH C, V1, P319 HUANG XD, 1989, P EUROPEAN C SPEECH, V1, P163 HUANG XD, 1988, ELECTRON LETT, V24, P149, DOI 10.1049/el:19880099 HUANG XD, 1990, INT CONF ACOUST SPEE, P689, DOI 10.1109/ICASSP.1990.115853 Huang X. D., 1988, 9th International Conference on Pattern Recognition (IEEE Cat. No.88CH2614-6), DOI 10.1109/ICPR.1988.28254 Levinson S. E., 1986, Computer Speech and Language, V1, DOI 10.1016/S0885-2308(86)80009-2 ROSENBERG AE, 1990, INT CONF ACOUST SPEE, P269, DOI 10.1109/ICASSP.1990.115621 ROSENBERG AE, 1992, KP INT CO SPEECH LAN, P599 RUSSEL MJ, 1987, IEEE ICASSP APR, P2376 WILPON JG, 1985, IEEE T ACOUST SPEECH, V33, P587, DOI 10.1109/TASSP.1985.1164581 NR 10 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1993 VL 13 IS 3-4 BP 411 EP 416 DI 10.1016/0167-6393(93)90039-N PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MR144 UT WOS:A1993MR14400016 ER PT J AU AMBIKAIRAJAH, E KEANE, M KELLY, A KILMARTIN, L TATTERSALL, G AF AMBIKAIRAJAH, E KEANE, M KELLY, A KILMARTIN, L TATTERSALL, G TI PREDICTIVE MODELS FOR SPEAKER VERIFICATION SO SPEECH COMMUNICATION LA English DT Article DE SPEAKER VERIFICATION; MULTILAYER PERCEPTRON; SELF-SEGMENTING LINEAR PREDICTOR; NEURAL PREDICTOR; HIDDEN CONTROL NEURAL NETWORK AB This paper outlines four novel methods for the task of speaker verification. The first model, a Hybrid Multi-Layer Perceptron (MLP)-Radial Basis Function (RBF) model, is an MLP predictor whose weights are then used as inputs to an RBF classifier for the verification process. The second model uses an array of linear predictors to model the true speaker where each predictor is associated with a particular sub-unit of the test utterance. The third, a Neural Prediction Model, consists of an array of MLP predictors and the fourth, a Hidden Control Neural Network, is a single MLP predictor with added control inputs. These control inputs modulate the MLP mapping and allow a single MLP to model a complete utterance. Each method was trained and tested on a modest database and each performs well with verification rates of 100% for the first three models and of 90% for the Hidden Control Neural Network. C1 UNIV E ANGLIA, SCH INFORMAT SYST, NORWICH NR4 7TJ, NORFOLK, ENGLAND. RP AMBIKAIRAJAH, E (reprint author), REG TECH COLL, SPEECH RES GRP, ATHLONE, IRELAND. 
CR AMBIKAIRAJAH E, 1992, 4TH P AUSTR SPEECH S, P515 AMBIKAIRAJAH E, 1992, NOV P ISITA92 C SING, P419 BOTROS SM, 1991, ADV NEURAL INFORMATI, V3, P707 ISO K, 1990, P INT C AC SPEECH SI, P441 KELLY A, 1992, 4TH P AUSTR SPEECH S, P508 KILMARTIN L, 1992, JTH P AUSTR SPEECH S, P73 LAPEDES A, 1987, UR87266 L AL NAT LAB LEVIN E, 1990, P IEEE INT C AC SPEE, P433 Lippman R., 1987, IEEE ASSP MAGAZI APR, P4 LOVELL BC, 1990, 3RD P AUSTR SPEECH S, P298 LOWE D, 1989, IEE CONF PUBL, P95 Markel JD, 1976, LINEAR PREDICTION SP RUMMELHART D, 1987, PARALLEL DISTRIBUTED, P318 NR 13 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1993 VL 13 IS 3-4 BP 417 EP 425 DI 10.1016/0167-6393(93)90040-R PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MR144 UT WOS:A1993MR14400017 ER PT J AU BOOTH, I BARLOW, M WATSON, B AF BOOTH, I BARLOW, M WATSON, B TI ENHANCEMENTS TO DTW AND VQ DECISION ALGORITHMS FOR SPEAKER RECOGNITION SO SPEECH COMMUNICATION LA English DT Article DE SPEAKER VERIFICATION; DYNAMIC TIME WARPING; VECTOR QUANTIZATION; NORMALIZATION AB Dynamic Time Warping (DTW) and Vector Quantisation (VQ) techniques have been applied with considerable success to speaker verification. It is standard practice to use these techniques to calculate a single distance score, and threshold this value to produce a verification decision. In this paper we examine applying a statistical weighting to a number of parameters extracted using the DTW warp path and VQ decision mechanisms. Results are presented which show that the additional parameters extracted encode further speaker specific information, and can be used to improve upon the speaker verification performance of the baseline systems. 
The application of a distance normalisation technique, which involves comparing DTW or VQ scores for the claimed identity against other speakers, is also investigated. Speaker verification results for baseline and enhanced DTW and VQ systems are reported for a population of 42 speakers. RP BOOTH, I (reprint author), UNIV QUEENSLAND, DEPT ELECTR ENGN, SPEAKER VERIFICAT GRP, BRISBANE, QLD 4072, AUSTRALIA. CR BARLOW M, 1991, THESIS U NSW Barlow M., 1992, Proceedings of the Fourth Australian International Conference on Speech Science and Technology DODDINGTON GR, 1971, THESIS U WISCONSIN DODDINGTON GR, 1985, P IEEE, V73, P1651, DOI 10.1109/PROC.1985.13345 FURUI S, 1981, IEEE T ACOUST SPEECH, V29, P254, DOI 10.1109/TASSP.1981.1163530 FURUI S, 1986, IEEE T ACOUST SPEECH, V34, P52, DOI 10.1109/TASSP.1986.1164788 Gersho A., 1992, VECTOR QUANTISATION HAYS WL, 1971, STATISTICS PROBABILI Higgins A., 1991, Digital Signal Processing, V1, DOI 10.1016/1051-2004(91)90098-6 MATSUI T, 1992, 92 P INT C AC SPEECH, V2, P157 MATSUI T, 1992, P INT C SPOKEN LANGU, V1, P603 SAITO S, 1978, 4TH P INT JOINT C PA, P1014 SMITH JEK, 1962, J ACOUST SOC AM, V34, P1968 SOONG F, 1985, P INT C ACOUST SPEEC, V85, P387 NR 14 TC 3 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1993 VL 13 IS 3-4 BP 427 EP 433 DI 10.1016/0167-6393(93)90041-I PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MR144 UT WOS:A1993MR14400018 ER PT J AU DUTOIT, T LEICH, H AF DUTOIT, T LEICH, H TI MBR-PSOLA - TEXT-TO-SPEECH SYNTHESIS BASED ON AN MBE RE-SYNTHESIS OF THE SEGMENTS DATABASE SO SPEECH COMMUNICATION LA English DT Article DE SPEECH SYNTHESIS; TEXT-TO-SPEECH SYSTEMS; SEGMENTS DATABASE RE-SYNTHESIS AB The use of the Time-Domain Pitch Synchronous OverLap-Add (TD-PSOLA) algorithm in a Text-To-Speech synthesizer is reviewed. 
Its drawbacks are underlined and three conditions on the speech database are examined. In order to satisfy them, a previously described high-quality resynthesis process is developed and enhanced, which makes use of the well-known Multi-Band Excited (MBE) model. An important by-product of this operation is that optimal Pitch Marking turns out to be automatic. A temporal interpolation block is finally added. The resulting Multi-Band Resynthesis Pitch Synchronous OverLap Add (MBR-PSOLA) synthesis algorithm supports spectral interpolation between voiced parts of segments, with virtually no increase in complexity. It provides the basis of a high-quality Text-To-Speech (TTS) synthesizer. RP DUTOIT, T (reprint author), FAC POLYTECH MONS, 31 BLVD DOLEZ, B-7000 MONS, BELGIUM. CR ABRANTES AJ, 1992, EUSIPCO 92, P487 Charpentier F., 1988, ICASSP 88: 1988 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.88CH2561-9), DOI 10.1109/ICASSP.1988.196674 CHARPENTIER MJ, 1986, INT C ACOUST SPEECH, V86, P2015 DIFRANCESCO R, 1989, EUROSPEECH 89, V2, P39 Dutoit T., 1992, EUSIPCO, P343 Dutoit T., 1993, THESIS FACULTE POLYT DUTOIT T, 1991, P IEEE PRORISC S CIR, P151 GRIFFIN DW, 1987, THESIS MIT ISAKSSON A, 1989, SIGNAL PROCESS, V18, P435, DOI 10.1016/0165-1684(89)90085-6 MCAULAY RJ, 1986, IEEE T ACOUST SPEECH, V34, P744, DOI 10.1109/TASSP.1986.1164910 MOULINES E, 1990, SPEECH COMMUN, V9, P453, DOI 10.1016/0167-6393(90)90021-Z NR 11 TC 27 Z9 28 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1993 VL 13 IS 3-4 BP 435 EP 440 DI 10.1016/0167-6393(93)90042-J PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MR144 UT WOS:A1993MR14400019 ER PT J AU SULLIVAN, KPH DAMPER, RI AF SULLIVAN, KPH DAMPER, RI TI NOVEL-WORD PRONUNCIATION - A CROSS-LANGUAGE STUDY SO SPEECH COMMUNICATION LA English DT Article DE SPEECH SYNTHESIS; NOVEL-WORD PRONUNCIATION; SYNTHESIS-BY-ANALOGY ID READING ALOUD; ACTIVATION; PHONOLOGY AB In the case of a ''novel word'' absent from a text-to-speech system's pronouncing dictionary, traditional systems invoke context-dependent letter-to-phoneme rules to produce a pronunciation. A proposal in the psychological literature, however, is that human readers pronounce novel words not by using explicit rules, but by analogy with letter-to-phoneme patterns for words they already know. In this paper, a synthesis-by-analogy system is presented which is, accordingly, also a model of novel-word pronunciation by humans. It employs analogy in both orthographic and phonological domains and is applied here to the pronunciation of novel words in British (Received Pronunciation) English and German. In implementing the system, certain detailed questions were confronted which analogy theory is at present inadequately developed to answer. Thus, a major part of this work concerns the impact of implementational choices on performance, where this is defined as the ability of the system to produce pronunciations in line with those given by humans. The size and content of the lexical database on which any analogy system must be based are also considered. The better performing implementations produced useful results for both British English and German. However, best results for each of the two languages were obtained from rather different implementations. C1 UNIV SOUTHAMPTON, DEPT ELECTR & COMP SCI, SOUTHAMPTON SO9 5NH, HANTS, ENGLAND. 
RP SULLIVAN, KPH (reprint author), UNIV OTAGO, DEPT COMP SCI, DUNEDIN, NEW ZEALAND. CR BROWN P, 1987, ATTENTION PERFORM, P471 Carre B., 1979, GRAPHS AND NETWORKS CHOMSKY C, 1970, HARVARD EDUC REV, V40, P287 COLTHEART M, IN PRESS PSYCHOL REV Coltheart M., 1984, ORTHOGRAPHIES READIN, P67 Coltheart M., 1978, STRATEGIES INFORMATI, P151 Dedina M. J., 1991, Computer Speech and Language, V5, DOI 10.1016/0885-2308(91)90017-K DORFFNER G, 1985, IEEE T ACOUST SPEECH, V2, P774 GLUSHKO RJ, 1979, J EXP PSYCHOL HUMAN, V5, P674, DOI 10.1037//0096-1523.5.4.674 Glushko R. J., 1981, INTERACTIVE PROCESSE, P61 Katz L., 1981, INTERACTIVE PROCESSE, P85 Kucera H., 1967, COMPUTATIONAL ANAL P Lawrence S. G. C., 1986, Computer Speech and Language, V1, DOI 10.1016/S0885-2308(86)80020-1 MEIER H, 1967, DTSCH SPRACHSTATISIK OGDEN CK, 1937, BASIC ENGLISH Oxford University Press, 1989, OXFORD ADV LEARNERS ROSSON MB, 1985, MEM COGNITION, V13, P90, DOI 10.3758/BF03198448 RUHL HW, 1982, IEEE T ACOUST SPEECH, V2, P1608 SEIDENBERG MS, 1985, COGNITION, V19, P1, DOI 10.1016/0010-0277(85)90029-0 SULLIVAN KPH, 1990, IEEE T ACOUST SPEECH, V1, P341 SULLIVAN KPH, 1992, TALKING MACHINES THE, P183 SULLIVAN KPH, 1992, THESIS U SOUTHAMPTON *U OSLO, 1978, LANC OSL BERG CORP NR 23 TC 11 Z9 11 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1993 VL 13 IS 3-4 BP 441 EP 452 DI 10.1016/0167-6393(93)90043-K PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MR144 UT WOS:A1993MR14400020 ER PT J AU BLAMEY, PJ DOOLEY, GJ ALCANTARA, JI GERIN, ES SELIGMAN, PM AF BLAMEY, PJ DOOLEY, GJ ALCANTARA, JI GERIN, ES SELIGMAN, PM TI FORMANT-BASED PROCESSING FOR HEARING-AIDS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH PERCEPTION; SPEECH PROCESSING; BACKGROUND NOISE; HEARING AIDS ID SPEECH-INTELLIGIBILITY AB A body-worn hearing aid has been developed with the ability to estimate formant frequencies and amplitudes in real time. These parameters can be used to enhance the output signal by ''sharpening'' the formant peaks, by ''mapping'' the amplitudes of the formants onto the available dynamic range of hearing at each frequency, or by resynthesizing a speech signal that is suited to the listener's hearing characteristics. The aid can also be used in a ''frequency response tailoring'' mode similar to a conventional hearing aid. Initial evaluations of the peak sharpening mode produced small improvements in speech perception for three groups of subjects: (a) Five severely-to-profoundly hearing-impaired people scored 7% higher on average when using the formant-based hearing aid combined with a multiple-electrode cochlear implant compared with their implant and their conventional hearing aid together. (b) A hearing aid user with a severe hearing loss scored 11% higher with a ''peak-sharpened'' signal than with his own conventional hearing aid. (c) Four normally hearing listeners showed a mean improvement of 19% in the perception of vowels and a decrease of 5% for consonants in background noise when the signal was processed. These preliminary results illustrate some potential effects of formant-based processing. RP BLAMEY, PJ (reprint author), COOPERAT RES CTR BION EAR SPEECH & HEARING RES, 384-388 ALBERT ST, MELBOURNE 3002, AUSTRALIA. 
CR BLAMEY PJ, 1987, J ACOUST SOC AM, V82, P38, DOI 10.1121/1.395542 Boothroyd A, 1968, SOUND, V2, P3 BOOTHROYD A, 1990, SOUND, V469, P166 BOOTHROYD A, 1990, ACTA OTO-LARYNGOL, P166 BYRNE D, 1986, EAR HEARING, V7, P257 DOOLEY GJ, 1993, ARCH OTOLARYNGOL, V119, P55 DOWELL RC, 1991, OTORHINOLARYNGOLOGY, HEAD AND NECK SURGERY, VOLS 1 AND 2, P1167 EVANS EF, 1975, AUDIOLOGY, V14, P419 FAULKNER A, 1992, J ACOUST SOC AM, V91, P2136, DOI 10.1121/1.403674 Gregorian R., 1986, ANALOG MOS INTEGRATE, P280 HNATH T, 1985, RCI1 CIT U GRAD CTR LEVITT H, 1986, J REHABIL RES DEV, V23, P13 Levitt H, 1986, J Rehabil Res Dev, V23, P79 LIPPMANN RP, 1981, J ACOUST SOC AM, V69, P524, DOI 10.1121/1.385375 MILLAR JB, 1992, ADV SPEECH HEARING L, V2, P217 PATUZZI R, 1990, NEUROL NEUR, V56, P45 Peterson P M, 1987, J Rehabil Res Dev, V24, P103 SELIGMAN P, 1987, ANN OTO RHINOL LARYN, V96, P71 SIMPSON AM, 1990, ACTA OTO-LARYNGOL, P101 Skinner M. W., 1988, HEARING AID EVALUATI STONE MA, 1992, J REHABIL RES DEV, V29, P39, DOI 10.1682/JRRD.1992.04.0039 VILLCHUR E, 1973, J ACOUST SOC AM, V53, P1646, DOI 10.1121/1.1913514 NR 22 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1993 VL 13 IS 3-4 BP 453 EP 461 DI 10.1016/0167-6393(93)90044-L PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MR144 UT WOS:A1993MR14400021 ER PT J AU HILLER, S ROONEY, E LAVER, J JACK, M AF HILLER, S ROONEY, E LAVER, J JACK, M TI SPELL - AN AUTOMATED-SYSTEM FOR COMPUTER-AIDED PRONUNCIATION TEACHING SO SPEECH COMMUNICATION LA English DT Article DE PRONUNCIATION; SPEECH ANALYSIS; SPEECH TECHNOLOGY; INTONATION; RHYTHM; VOWEL QUALITY ID SPEECH AB This paper describes a speech technology project called SPELL (Interactive System for Spoken European Language Training), whose main aim is to design an automated system for improving the pronunciation of foreign languages by learners of English, French and Italian. The project has just completed a two-year feasibility study which has created a prototype vehicle incorporating teaching modules in intonation, rhythm and vowel quality. The paper highlights the speech signal processing techniques, similarity metrics and user interfaces which have been integrated to produce an initial demonstration system. A preliminary evaluation by a group of language teaching professionals suggests that the SPELL system is an appropriate tool for exploring the automated teaching of pronunciation. RP HILLER, S (reprint author), UNIV EDINBURGH, CTR SPEECH TECHNOL RES, 80 SOUTH BRIDGE, EDINBURGH EH1 1HN, SCOTLAND. 
CR BAKER JH, 1982, INTRO ENGLISH LEGAL CHAPALLAZ M, 1964, HONOUR D JONES, P306 Chapallaz Marguerite, 1979, PRONUNCIATION ITALIA Crystal D., 1969, PROSODIC SYSTEMS INT Crystal David, 1975, ENGLISH TONE VOICE CUTLER A, 1988, J EXP PSYCHOL HUMAN, V14, P113, DOI 10.1037/0096-1523.14.1.113 DAUER RM, 1983, J PHONETICS, V11, P51 FARNETANI E, 1986, SPEECH COMMUN, V5, P17, DOI 10.1016/0167-6393(86)90027-0 FAURE G, 1973, INTERROGATION INTONA, P1 Grundstrom AllanW, 1973, INTERROGATION INTONA, P19 HALLIDAY MAK, 1973, PHONETICS LINGUISTIC, P103 Kenning Marie-Madeleine, 1983, J INT PHON ASSOC, V13, P32 KENNING MM, 1979, J INT PHON ASSOC, V9, P15 Kenworthy Joanne, 1987, TEACHING ENGLISH PRO LAVER J, 1993, PRINCIPLES PHONETICS LEACH P, 1988, J INT PHON ASSOC, V18, P125 LEFEVRE JP, 1992, SPEECH COMMUN, V11, P31, DOI 10.1016/0167-6393(92)90061-B Liberman Mark, 1975, THESIS MIT Martin P., 1982, Speech Communication, V1, DOI 10.1016/0167-6393(82)90021-8 Martin Philippe, 1975, LINGUISTICS, V146, P35 MCCANDLE.SS, 1974, IEEE T ACOUST SPEECH, VSP22, P135, DOI 10.1109/TASSP.1974.1162559 MCINNES FR, 1992, P I ACOUSTICS 6, V14, P109 MEDAN Y, 1991, IEEE T SIGNAL PROCES, V39, P40, DOI 10.1109/78.80763 MILLER M, 1984, J PHONETICS, V12, P75 Muljacic Zarko, 1972, FONOLOGIA LINGUA ITA O'Connor J. D., 1961, INTONATION C ENGLISH Pierrehumbert Janet, 1980, THESIS MIT CAMBRIDGE PIERREHUMBERT JB, 1979, SPEECH COMMUN, P523 Price G., 1991, INTRO FRENCH PRONUNC RABINER LR, 1975, IEEE T ACOUST SPEECH, V23, P552, DOI 10.1109/TASSP.1975.1162749 Roach P., 1982, LINGUISTIC CONTROVER, P73 Smith B., 1987, LEARNER ENGLISH TEAC SYRDAL AK, 1986, J ACOUST SOC AM, V68, P1465 Tranel B., 1987, SOUNDS FRENCH INTRO VANBERGEM DR, 1991, P EUROPSEECH 91, V91, P1455 WENK BJ, 1982, J PHONETICS, V10, P193 NR 36 TC 9 Z9 9 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1993 VL 13 IS 3-4 BP 463 EP 473 DI 10.1016/0167-6393(93)90045-M PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MR144 UT WOS:A1993MR14400022 ER PT J AU FANT, G HIROSE, K KIRITANI, S AF FANT, G HIROSE, K KIRITANI, S TI FUJISAKI FEST SCHRIFT - FOREWORD SO SPEECH COMMUNICATION LA English DT Article NR 0 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1993 VL 13 IS 1-2 BP 1 EP 1 PG 1 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MH201 UT WOS:A1993MH20100001 ER PT J AU FANT, G AF FANT, G TI SOME PROBLEMS IN VOICE SOURCE ANALYSIS SO SPEECH COMMUNICATION LA English DT Article DE VOICE PRODUCTION THEORY; INVERSE FILTERING; GLOTTAL FLOW; VOICE SOURCE DYNAMICS; SOURCE SPECTRUM ID VOCAL-TRACT; SPEECH; MODEL; FLOW AB This is an overview of some recent studies of voice source acoustics and glottal flow analysis and modelling performed at the KTH. Time and frequency domain aspects of the production process are discussed with a view of relating glottal flow parameters from inverse filtering and vocal tract transfer functions to formant amplitudes and bandwidths. Alternative methods of determining the time constant T(a) = 1/(2piF(a)) in the return phase of glottal flow derivative after the instant of excitation, and thus of spectral tilt, are discussed. Selective inverse filtering, removing all but one formant, is potentially useful for this purpose. The influence of uncertainties in quantifying the vocal tract transfer function is exemplified by a calculation of the effects of introducing a finite baffle effect of the human head adding a high-frequency emphasis above the standard + 6 dB/octave. Particular attention has been paid to temporal variations within an utterance as derived from continuous inverse filtering. 
Aspects of breathy voicing and female-male differences in voice production are discussed. It is demonstrated that the temporal profile of the excitation amplitude, E(e)(t), within an utterance derived from a male speaker can be approximated by the envelope of the negative part of the speech wave. RP FANT, G (reprint author), ROYAL INST TECHNOL, KTH, DEPT SPEECH COMMUN & MUSIC ACOUST, BOX 70014, S-10044 STOCKHOLM 70, SWEDEN. CR Ananthapadmanabha T. V., 1984, STLQPSR231984, P1 Ananthapadmanabha T. V., 1982, SPEECH COMMUN, V1, P167, DOI 10.1016/0167-6393(82)90015-2 BICKLEY C, 1991, VOCAL FOLD PHYSL ACO, P37 BICKLEY CA, 1986, J PHONETICS, V14, P373 BRIESS B, 1962, STLQPSR, P6 CARLSON R, 1991, SPEECH COMMUN, V10, P481, DOI 10.1016/0167-6393(91)90051-T CARLSON R, 1989, MAY P INT C AC SPEEC, V1, P223 FANT G, 1986, J PHONETICS, V14, P393 FANT G, 1988, STLQPSR23, P1 FANT G, 1985, SPEECH TRANSMISSION, P1 FANT G, 1991, SPEECH COMMUN, V10, P521, DOI 10.1016/0167-6393(91)90055-X FANT G, 1959, ERICSSON TECHNICS Fant G., 1960, ACOUSTIC THEORY SPEE FANT G, 1987, STL QPSR, P133 FANT G, 1987, 11TH P INT C PHON SC, V3, P376 FANT G, 1959, 3RD P INT C AC STUTT, P187 FANT G, 1982, STLQPSR23 ROYAL I TE, P1 FANT G, 1979, STL QPSR, V1, P85 FANT G, 1963, J ACOUST SOC AM, V35, P1753, DOI 10.1121/1.1918812 FANT G, 1966, STL QPSR, P1 FANT G, 1991, VOCAL FOLD PHYSL ACO, P47 Fant G., 1982, STLOPSR41982 KTH ROY, P28 FANT G, 1979, STLQPSR, P31 FANT G, 1985, SPEECH TRANSMISS APR, P21 FANT G, 1980, STL QPSR, P17 FANT G, 1972, STL QPSR, P1 Flanagan J., 1972, SPEECH ANAL SYNTHESI FLANAGAN JL, 1975, AT&T TECH J, V54, P485 FUJISAKI H, 1987, INT C ACOUST SPEECH, V2, P637 FUJISAKI H, 1986, IEEE T ACOUST SPEECH, V3, P1605 GOBL C, 1988, STL QPSR, V1, P123 GOBL C, 1988, STL QPSR, P23 GOBL C, 1989, VOCAL FOLD PHYSL, P121 KARLSSON I, 1991, P INT C PHONETIC SCI, V4, P10 Karlsson I., 1990, 1990 P INT C SPOK LA, P69 KLATT DH, 1990, J ACOUST SOC AM, V87, P820, DOI 10.1121/1.398894 LILJENCRANTS J, 1991, SPEECH 
TRANSMISSION, P1 LIN Q, 1990, THESIS DEP SPEECH CO LIN Q, 1992, INT C ACOUST SPEECH, V2, P57 LINDHARD J, 1964, STUDIES PENETRATION, P1 LINDQVISTGAUFFI.J, 1970, STL QPSR, P3 LINDQVISTGAUFFI.J, 1965, STL QPSR, P8 MARTONY J, 1965, STL QPSR, P4 NORD L, 1986, J PHONETICS, V14, P401 ROTHENBE.M, 1973, J ACOUST SOC AM, V53, P1632, DOI 10.1121/1.1913513 ROTHENBERG M, 1983, VOCAL FOLD PHYSL, P465 SCHUTTE HK, 1986, J PHONETICS, V14, P385 STEVENS KN, 1987, 11TH P INT C PHON SC, P385 STRIK H, 1991, SPEECH COMMUN, V11, P167 Sundberg J, 1979, FRONTIERS SPEECH COM, P301 SUNDBERG J, 1973, STL QPSR, P14 NR 51 TC 50 Z9 51 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1993 VL 13 IS 1-2 BP 7 EP 22 DI 10.1016/0167-6393(93)90055-P PG 16 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MH201 UT WOS:A1993MH20100002 ER PT J AU KIRITANI, S HIROSE, H IMAGAWA, H AF KIRITANI, S HIROSE, H IMAGAWA, H TI HIGH-SPEED DIGITAL IMAGE-ANALYSIS OF VOCAL CORD VIBRATION IN DIPLOPHONIA SO SPEECH COMMUNICATION LA English DT Article DE DIPLOPHONIA; VOCAL CORD VIBRATION; HIGH-SPEED VIDEO; PATHOLOGICAL VOICE; VOICE SOURCE AB Simultaneous recordings of vocal fold vibrations and speech signals were performed with three patients having diplophonia using a high-speed digital image recording system developed by the present authors. All three cases studied (1 case of unilateral paralysis of the recurrent nerve; 2 cases of unilateral paralysis of external branch of the superior laryngeal nerve) showed a difference in the vibratory frequency between the left and right vocal folds. The phase difference between the vocal cords varies with time. When it reaches a certain threshold, the phase difference is reset and the vocal cord movements resume synchrony. 
When the movements of the vocal cords are in phase, glottal closure is complete and the excitation pattern in the speech waveform is strong, whereas when the movements are out of phase, glottal closure is incomplete and the excitation pattern is weak, resulting in a quasi-periodic vibration in speech waveform. RP KIRITANI, S (reprint author), UNIV TOKYO, LOGOPED & PHONIATR RES INST, 7-3-1 HONGO, BUNKYO KU, TOKYO 113, JAPAN. CR FRANSWORTH DW, 1940, BELL LAB REC, V18, P203 HIRANO M, 1976, J OTOLARYNGOL JPN, V79, P1553 Imagawa H., 1987, Japanese Journal of Medical Electronics and Biological Engineering, V25 ISHIZAKA K, 1976, J ACOUST SOC AM, V60, P1193, DOI 10.1121/1.381221 ISSHIKI N, 1977, ANN OTO RHINOL LARYN, V86, P58 KIRITANI H, 1990, NOV P ICSLP KOB, P61 KIRITANI S, 1986, APR P INT C AC SPEEC, P1633 MOORE GP, 1962, J SPEECH HEAR DISORD, V127, P165 TANABE M, 1976, PRATICA OTOLOGICA KY, V69, P67 WARD PH, 1969, ANN OTO RHINOL LARYN, V78, P771 YOSHIDA Y, 1972, J OT JPN, V75, P1256 NR 11 TC 22 Z9 22 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1993 VL 13 IS 1-2 BP 23 EP 32 DI 10.1016/0167-6393(93)90056-Q PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MH201 UT WOS:A1993MH20100003 ER PT J AU STEVENS, KN AF STEVENS, KN TI MODELING AFFRICATE CONSONANTS SO SPEECH COMMUNICATION LA English DT Article DE AFFRICATE CONSONANTS; SPEECH PRODUCTION MODELS; ACOUSTIC PHONETICS AB A model for the aerodynamics and acoustics of affricate consonants is developed. The model is used to predict the changing acoustic pattern that occurs during the 100-odd milliseconds following the release of a palato-alveolar affricate. Appropriate assumptions are made about rates of release and dimensions of cavities in the anterior part of the vocal tract. 
The calculated acoustic pattern includes an initial transient and an interval in which the frication noise undergoes changes in amplitude and spectrum as the tongue-tip constriction increases in size. Measurements of acoustic spectra following the release of naturally produced affricate consonants show good agreement with spectra predicted from the model. C1 MIT, DEPT ELECT ENGN & COMP SCI, CAMBRIDGE, MA 02139 USA. RP STEVENS, KN (reprint author), MIT, ELECTR RES LAB, CAMBRIDGE, MA 02139 USA. CR HALLE M, 1991, SPEECH COMMUN, V7, P77 KLATT DH, 1968, ANN NY ACAD SCI, V155, P42, DOI 10.1111/j.1749-6632.1968.tb56748.x Maddieson I., 1984, PATTERNS SOUNDS PASTEL L, 1987, THESIS MIT CAMBRIDGE ROTHENBERG M, 1968, BIBLIOTHECA PHONETIC SCULLY C, 1991, 12TH P INT C PHON SC, V3, P58 SHADLE C, 1985, RLE506 MIT TECHN REP STEVENS KN, 1992, J ACOUST SOC AM, V91, P2979, DOI 10.1121/1.402933 STEVENS KN, 1971, J ACOUST SOC AM, V50, P1180, DOI 10.1121/1.1912751 SVIRSKY MA, 1992, J ACOUST SOC AM, V92, P2390, DOI 10.1121/1.404761 NR 10 TC 6 Z9 6 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1993 VL 13 IS 1-2 BP 33 EP 43 DI 10.1016/0167-6393(93)90057-R PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MH201 UT WOS:A1993MH20100004 ER PT J AU SHIRAI, K AF SHIRAI, K TI ESTIMATION AND GENERATION OF ARTICULATORY MOTION USING NEURAL NETWORKS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH PRODUCTION; ARTICULATORY MODEL; NEURAL NETWORKS; ARTICULATORY DYNAMICS AB This paper attempts to apply neural networks to two kinds of problems concerning articulatory motion. They are estimation of articulatory motion from speech waves and generation of articulator movement for a sequence of phonemic symbols. 
In the former problem, since estimation of articulatory parameters is regarded as a nonlinear mapping between the acoustic parameters and the articulatory ones, a neural network is expected to be a suitable method. In the latter problem, a nonlinear control system that produces articulatory motion is successfully constructed combining neural networks. RP SHIRAI, K (reprint author), WASEDA UNIV, DEPT INFORMAT & COMP SCI, 3-4-1 OKUBO, SHINJUKU KU, TOKYO 169, JAPAN. CR BAILLY G, 1992, SIGNAL PROCESS, V6, P159 FUNAHASHI K, 1988, P ATR WORKSHOP NEURA JOSPA P, 1992, SIGNAL PROCESS, V6, P171 SCHROETER J, 1987, IEEE T ACOUST SPEECH, P308 SCHROETER J, 1990, INT CONF ACOUST SPEE, P393, DOI 10.1109/ICASSP.1990.115711 SHIRAI K, 1986, SPEECH COMMUN, V5, P159, DOI 10.1016/0167-6393(86)90005-1 SHIRAI K, 1982, IEEE P INT C ACOUST, P2004 SHIRAI K, 1976, T IECE JAPAN A, V59, P668 SHIRAI K, 1991, J PHONETICS, V19, P379 SHIRAI K, 1978, T IECE JAPAN A, V61, P409 Shirai K., 1981, ICASSP 81. Proceedings of the 1981 IEEE International Conference on Acoustics, Speech and Signal Processing NR 11 TC 6 Z9 6 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1993 VL 13 IS 1-2 BP 45 EP 51 DI 10.1016/0167-6393(93)90058-S PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MH201 UT WOS:A1993MH20100005 ER PT J AU MOBIUS, B PATZOLD, M HESS, W AF MOBIUS, B PATZOLD, M HESS, W TI ANALYSIS AND SYNTHESIS OF GERMAN F0 CONTOURS BY MEANS OF FUJISAKI MODEL SO SPEECH COMMUNICATION LA English DT Article DE PROSODY; INTONATION; ANALYSIS OF F0 CONTOURS; F0 SYNTHESIS BY RULE AB This paper presents the adaptation of Fujisaki's quantitative model to the analysis of German intonation and its application to F0 synthesis by rule. The parameter values of the model are determined by an automatic approximation of naturally produced F0 contours. 
The algorithm is not primarily based on mathematical criteria but is subject to constraints that emerge from a linguistic interpretation of the model. The potential sources of variation of the parameter values are examined using statistical methods. A set of rules is formulated that capture the effects of both linguistic and speaker-dependent features. The rules generate artificial intonation contours which in turn can be related to linguistic features such as sentence mode or word accent. Acceptability of the rule-generated intonation patterns as well as the adequate modelling of linguistic prosodic properties are evaluated perceptually by both phonetically trained subjects and prosodically ''naive'' listeners. In general, utterances resynthesized with rule-generated F0 contours are judged highly acceptable and natural by both groups of listeners. Detailed judgements with respect to word accent and sentence mode are obtained that help to improve several specific rules and contribute to a more adequate description of German intonation. RP MOBIUS, B (reprint author), UNIV BONN, INST KOMMUNIKATIONSFORSCH & PHONET, POPPELSDORFER ALLEE 47, D-53115 BONN, GERMANY. CR Fujisaki H., 1983, PRODUCTION SPEECH, P39 Fujisaki H., 1990, P INT C SPOKEN LANGU, V1, P485 Fujisaki H., 1988, VOCAL PHYSL VOICE PR, P347 Fujisaki H., 1979, ANN B RES I LOGOPEDI, V13, P163 Gronnum Thorsen N., 1988, ANN REP I PHONET U C, V22, P1 MOBIUS B, 1993, EIN QUANTITATIIVES M MOBIUS B, 1992, P INT C SPOKEN LANGU, V1, P361 MOBIUS M, 1992, SEP ASL PROS WORKSH MOULINES E, 1990, SPEECH COMMUN, V9, P453, DOI 10.1016/0167-6393(90)90021-Z Ohman S., 1967, SPEECH TRANSMISSION, P20 Patzold M., 1991, THESIS U BONN PATZOLD M, 1991, DEC WORKSH PROS MENS PORTELE T, 1990, P ESCA WORKSHOP SPEE, P161 SOTSCHECK J, 1984, FORTSCHRITTE AKUSTIK, P873 NR 14 TC 11 Z9 11 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD OCT PY 1993 VL 13 IS 1-2 BP 53 EP 61 DI 10.1016/0167-6393(93)90059-T PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MH201 UT WOS:A1993MH20100006 ER PT J AU BRUCE, G GRANSTROM, B AF BRUCE, G GRANSTROM, B TI PROSODIC MODELING IN SWEDISH SPEECH SYNTHESIS SO SPEECH COMMUNICATION LA English DT Article DE PROSODY; SWEDISH; PROMINENCE LEVELS; PHRASING; INTONATION MODEL; F0; DURATIONS; SPEECH DATA BASE; SPEECH SYNTHESIS; PERCEPTION AB Our present work concerns Swedish prosody in a speech synthesis framework. Two main problem areas are examined: prominence and phrasing. In a model for Swedish prosody, prominence levels (stress, accent, focus) are represented as layered and multidimensional for different domains (syllable, foot, word). Phrasing involves both coherence in the form of specific combinations of existing accentual gestures and separate boundary gestures. The main features of the intonation model are given in outline. Experiments on prominence include modelling of durations in a combined speech data base and rule synthesis framework, where the stressed-unstressed alternation appears to be the most important duration factor. Other experimentation concerns typical differences in the timing characteristics of the tonal gesture for focal accent between compound words and simplex accent II words. Experiments on phrasing include both production data from a varied speech material as well as synthesis and perception. Our experiments demonstrate that both coherence and boundary cues are effective as phrasing signals and that a combination of F0 and duration is typically used to signal phrasing. Our future plans include working with prosodic modelling of Swedish in a dialogue context and in a concept-to-speech framework. C1 KTH, DEPT SPEECH COMMUN & MUSIC ACOUST, S-10044 STOCKHOLM, SWEDEN. RP BRUCE, G (reprint author), LUND UNIV, DEPT LINGUIST & PHONET, HELGONABACKEN 12, S-22362 LUND, SWEDEN. 
CR BRUCE G, 1987, NORDIC PROSODY, V4, P41 BRUCE G, 1991, 12TH P ICPHS AIX EN, V4, P182 BRUCE G, IN PRESS NORDIC PROS, V6 BRUCE G, 1992, TALKING MACHINES THE, P113 BRUCE G, 1991, 38 LUND U DEP LING P, P5 BRUCE G, 1985, 1985 P FRENCH SWED S, P549 BRUCE G, 1987, P 5TH INT PHONOLOGY, P21 BRUCE G, 1990, NORDIC PROSODY, P26 BRUCE G, 1992, P ICSLP 92, V1, P109 BRUCE G, 1989, P FONETIK 89 STL QPS, P17 Bruce Gosta, 1978, NORDIC PROSODY, P219 Bruce Gosta, 1977, SWEDISH WORD ACCENTS CARLSON R, 1973, SPEECH TRANSMISSION, V2, P31 CARLSON R, 1989, P ESCA WORKSHOP SPEE CARLSON R, 1979, FRONTIERS SPEECH COM CARLSON R, 1986, PHONETICA, V43, P140 CARLSON R, 1989, P EUROSPEECH 89 EURO, V2, P328 Collier R., 1975, STRUCTURE PROCESS SP, P107 FANT G, 1991, 12TH P ICPHS AIX EN, V1, P251 FRETHEIM T, 1984, WORKING PAPERS LINGU, V2, P28 FRETHEIM T, 1978, NORDIC PROSODY, P5 Fujisaki H., 1971, Journal of the Acoustical Society of Japan, V27 GARDING E, 1982, PHONETICA, V39, P288 GRANSTROM B, 1992, SPEECH COMMUN, V11, P459, DOI 10.1016/0167-6393(92)90051-8 HOEL T, 1981, NORDIC PROSODY, V2, P96 Nespor M., 1986, PROSODIC PHONOLOGY Pierrehumbert J., 1988, JAPANESE TONE STRUCT THORSEN N, 1978, NORDIC PROSODY, P23 NR 28 TC 5 Z9 5 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD OCT PY 1993 VL 13 IS 1-2 BP 63 EP 73 DI 10.1016/0167-6393(93)90060-X PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MH201 UT WOS:A1993MH20100007 ER PT J AU LAVER, J AF LAVER, J TI REPETITION AND RE-START STRATEGIES FOR PROSODY IN TEXT-TO-SPEECH CONVERSION SYSTEMS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH SYNTHESIS; TEXT-TO-SPEECH CONVERSION; PROSODY; NOISE; SPOKEN DIALOG AB Speakers in conversations between humans continually adapt the prosodic and structural aspects of their speech to the perceived needs of their listeners, in terms of judgments about the potentially masking effects of transient and ambient noise levels, and in response to explicit requests by listeners for repetition. Adaptive strategies for repetition include changing such prosodic aspects of utterances as pitch range and mean, intensity mean and overall tempo of speaking, together with intonational re-structuring. Such repetition also deploys re-start strategies based on structural linguistic knowledge. An outline is offered of principles for incorporating elements of such intelligent adaptivity in the operation of text-to-speech conversion systems, to improve their interactive ability with human partners in dialogue. RP LAVER, J (reprint author), UNIV EDINBURGH, CTR SPEECH TECHNOL RES, 80 S BRIDGE, EDINBURGH EH1 1HN, SCOTLAND. CR ALLEN MS, 1987, TEXT SPEECH MITALK S Atkinson J. 
M., 1984, STRUCTURES SOCIAL AC Brown P., 1987, QUESTIONS POLITENESS, P56 Cheepen C., 1988, PREDICTABILITY INFOR DOCHERTY G, 1988, ASPECTS SPEECH TECHN, P144 Fant G., 1966, SPEECH TRANSMISSION, V4, P22 Goffinan Erving, 1971, RELATIONS PUBLIC HAGERMAN B, 1984, THESIS KAROLINSKA I JACK M, 1988, ASPECTS SPEECH TECHN Kent Raymond D., 1992, ACOUSTIC ANAL SPEECH LAVER J, 1988, 2ND S ADV MAN MACH I LAVER J, 1991, GIFT SPEECH PAPERS A LAVER J, 1993, IN PRESS ENCY LANGUA LAVER J, 1976, HDB PERCPETION, V7, P345 LAVER J, 1972, COMMUNICATION FACE F Laver J., 1975, ORG BEHAV FACE TO FA, P215 LAVER J, 1993, PRINCIPLES PHONETICS LAVER J, 1989, STUDIES PRONUNCIATIO, P323 Laver J, 1989, LOGIC LINGUISTICS RE, P37 Laver J., 1981, CONVERSATIONAL ROUTI, P289 Levelt W. J., 1989, SPEAKING INTENTION A MCALLISTER J, 1993, IN PRESS ENCY LANGUA MCALLISTER R, 1989, PERILUS, V9, P29 NOFSINGER RE, 1991, EVERYDAY CONSERVATIO PISONI DB, 1985, P IEEE, V73, P1665, DOI 10.1109/PROC.1985.13346 Schenkein Jim, 1978, STUDIES ORG CONVERSA Schiffrin Deborah, 1987, DISCOURSE MARKERS SIEGEL GM, 1974, J ACOUST SOC AM, V56, P1618, DOI 10.1121/1.1903486 Sudnow D., 1972, STUDIES SOCIAL INTER Tannen D., 1989, TALKING VOICES REPET Yost W. A., 1985, FUNDAMENTALS HEARING NR 31 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1993 VL 13 IS 1-2 BP 75 EP 85 DI 10.1016/0167-6393(93)90061-O PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MH201 UT WOS:A1993MH20100008 ER PT J AU ROSSI, M AF ROSSI, M TI A MODEL FOR PREDICTING THE PROSODY OF SPONTANEOUS SPEECH (PPSS MODEL) SO SPEECH COMMUNICATION LA English DT Article DE PROSODY; INTONATION; ACCENT; STRESS; RHYTHM; SYNTAX; SEMANTICS; PRAGMATICS; SPONTANEOUS SPEECH; INTONEME AB The prosodic structure of speech is the result of complex interactions within and between several different levels of organization. 
The intonative hierarchy, which is essentially manifested by the nature of the prosodic markers, is the product of complex interactions and constraints within and across organizational levels. Presented here is a model for predicting and interpreting the prosodic organization of spontaneous speech utterances. This model is a hierarchical system composed of six modules: (1) semantic-pragmatic, (2) syntactic, (3) phonotactic, (4) accentual, (5) semantic adjustment, and (6) rhythmic. For a given utterance, the system determines (i) the levels of the boundaries and prosodic markers on the basis of semantic information, and the syntactic structure as defined by the X-bar theory, and (ii) the accentual and rhythmic structures based on phonotactic constraints. The phonetic step, which should transform the abstract labelling into acoustic values, is not presented here. This model can and should be further developed. Future enhancements will pertain to (a) the nature of the rules, (b) different aspects of conversation, and (c) theoretical considerations. Concerning the latter point, current fruitful developments in X-bar theory are likely to lead to positive modifications in the prosodic model, which should enable it to account for certain unexplained phenomena. However, even in its current state, the model produces highly convincing results, since it predicts the number and hierarchy of intonative and stress units in an utterance with a high accuracy rate. RP ROSSI, M (reprint author), UNIV PROVENCE, CNRS PAROLE & LANGAGE, URA, AIX EN PROVENCE, FRANCE. CR Abney S. 
P., 1987, THESIS MIT CAMBRIDGE AVESANI C, 1990, P ICSLP 90, V2, P833 Bolinger D., 1989, INTONATION ITS USES BRESNAN JW, 1970, FOUND LANG, V6, P297 Burzio L., 1986, ITALIAN SYNTAX CAELENHAUMONT G, 1992, THESIS AIX EN PROVEN Chomsky N., 1986, BARRIERS Chomsky Noam, 1981, LECTURES GOVT BINDIN Collier R., 1990, PERCEPTUAL STUDY INT Dell F., 1984, FORME SONORE LANGAGE, P65 DICRISTO A, 1993, IN PRESS INTONATION EDELMAN MG, 1992, BRIGHT AIR BRILLIANT Fonagy I., 1983, FOLIA LINGUIST, V17, P153, DOI 10.1515/flin.1983.17.1-4.153 FUJISAKI H, 1990, ICSLP 90, V1, P485 Giorgi A., 1991, SYNTAX NOUN PHRASES GIORGI A, 1987, LINGUIST INQ, V18, P511 GROSJEAN F, 1983, LINGUISTICS, V21, P501, DOI 10.1515/ling.1983.21.3.501 GUAITELLA I, 1993, IN PRESS LANGUAGE SP Haegeman L, 1991, INTRO GOVT BINDING T HOGG R, 1987, METRICAL PHONOLOGY JACKENDOFF R, 1987, LINGUIST INQ, V18, P369 JOHNSON K, 1988, LINGUIST INQ, V19, P583 LEON P, 1970, STUDIA PHONETICA, V3, P57 MARTIN P, 1986, 15EM JEP AIX EN PROV, P89 Mathesius V., 1939, SLOVO SLOVESNOST, V5, p[467, 171] Mathesius V., 1937, SLOVO SLOVESNOST, V3, P248 MATHESIUS V, 1941, SLOVO SLOVESNOST, V7, P37 MATHESIUS V, 1973, SLOVO SLOVESNOST, V3, P193 MILNER JC, 1976, METHODES GRAMMAIRE F, P153 MILNER JC, 1973, ARGUMENTS LINGUISTIQ Minsky M., 1985, SOC MIND NIQUE C, 1978, GRAMMAIRE GENERATIVE PASDELOUP V, 1990, THESIS U AIX EN PROV PIERREHUMBERT JB, 1980, THESIS INDIANA U LIN Pike K. 
L., 1945, INTONATION AM ENGLIS POLLOCK JY, 1989, LINGUIST INQ, V20, P365 Radford A, 1988, TRANSFORMATIONAL GRA ROBERTS I, 1988, LINGUIST INQ, V19, P703 ROCHEMONT SM, 1990, ENGLISH FOCUS CONSTR ROCHEMONT SM, 1986, FOCUS GENERATIVE GRA ROSSI M, 1987, ETUDES LINGUISTIQUE, V66, P20 ROSSI M, 1985, PHONETICA, V42, P135 ROSSI M, UNPUB FRENCH SPEECH ROSSI M, 1976, CONTRIBUTION METHODO ROSSI M, 1990, SCRITTI ONORE LUCIO Rossi Mario, 1981, INTONATION ACOUSTIQU Sportiche D., 1989, LANGAGES, V95, P35 SPORTICHE D, 1983, THESIS MIT CAMBRIDGE VERGNAUD JR, 1992, LINGUIST INQ, V23, P595 WIERZBICKA A, 1976, J PRAGMATICS, V10, P67 WIOLAND F, 1984, B RECONTRES REGIONAL, P293 NR 51 TC 10 Z9 10 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1993 VL 13 IS 1-2 BP 87 EP 107 DI 10.1016/0167-6393(93)90062-P PG 21 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MH201 UT WOS:A1993MH20100009 ER PT J AU PISONI, DB AF PISONI, DB TI LONG-TERM-MEMORY IN SPEECH-PERCEPTION - SOME NEW FINDINGS ON TALKER VARIABILITY, SPEAKING RATE AND PERCEPTUAL-LEARNING SO SPEECH COMMUNICATION LA English DT Article DE SPEECH PERCEPTION; PERCEPTUAL NORMALIZATION; LONG-TERM MEMORY; TALKER VARIABILITY; SPEAKING RATE; IMPLICIT MEMORY; ACOUSTIC-PHONETIC VARIABILITY; PROCEDURAL MEMORY; NONANALYTIC PERCEPTION; EXEMPLAR-BASED ENCODING; INDEXICAL PROPERTIES OF SPEECH ID SPOKEN WORD LISTS; IMPLICIT MEMORY; RECOGNITION; RECALL; MODEL; DISCRIMINATION; ENGLISH AB This paper summarizes results from recent studies on the role of long-term memory in speech perception and spoken word recognition. Experiments on talker variability, speaking rate and perceptual learning provide strong evidence for implicit memory for very fine perceptual details of speech. Listeners apparently encode specific attributes of the talker's voice and speaking rate into long-term memory. 
Acoustic-phonetic variability does not appear to be ''lost'' as a result of phonetic analysis. The process of perceptual normalization in speech perception may therefore entail encoding of specific instances or ''episodes'' of the stimulus input and the operations used in perceptual analysis. These perceptual operations may reside in a ''procedural memory'' for a specific talker's voice. Taken together, the present set of findings are consistent with non-analytic accounts of perception, memory and cognition which emphasize the contribution of episodic or exemplar-based encoding in long-term memory. The results from these studies also raise questions about the traditional dissociation in phonetics between the linguistic and indexical properties of speech. Listeners apparently retain non-linguistic information in long-term memory about the speaker's gender, dialect, speaking rate and emotional state, attributes of speech signals that are not traditionally considered part of phonetic or lexical representations of words. These properties influence the initial perceptual encoding and retention of spoken words and therefore should play an important role in theoretical accounts of how the nervous system maps speech signals onto linguistic representations in the mental lexicon. RP PISONI, DB (reprint author), INDIANA UNIV, DEPT PSYCHOL, SPEECH RES LAB, BLOOMINGTON, IN 47405 USA. CR Aslin R. N., 1980, CHILD PHONOLOGY, P67 Brooks L. R., 1978, COGNITION CATEGORIZA, P169 CREELMAN CD, 1957, J ACOUST SOC AM, V29, P655, DOI 10.1121/1.1909003 EICH JM, 1982, PSYCHOL REV, V89, P627, DOI 10.1037//0033-295X.89.6.627 Elman J. 
L., 1986, INVARIANCE VARIABILI, P360 FOWLER CA, IN PRESS HUMAN DEV C Fujisaki H., 1969, Annual Report of the Engineering Research Institute, Faculty of Engineering, University of Tokyo, V28 GOLDINGER SD, 1991, J EXP PSYCHOL LEARN, V17, P152, DOI 10.1037//0278-7393.17.1.152 GOLDINGER SD, 1992, 7 IND U TECHN REP HINTZMAN DL, 1986, PSYCHOL REV, V93, P411, DOI 10.1037//0033-295X.93.4.411 Jacoby L. L., 1984, PSYCHOL LEARN MOTIV, P1 JUSCZYK PW, 1993, TRANSITION SPEECH SO Kakehi K., 1992, SPEECH PERCEPTION PR, P135 KLATT DH, 1979, J PHONETICS, V7, P279 KLATT DH, 1986, INVARIANCE VARIABILI, P300 KOLERS PA, 1973, MEM COGNITION, V1, P347, DOI 10.3758/BF03198119 KOLERS PA, 1976, SCIENCE, V191, P1280, DOI 10.1126/science.1257750 KREIMAN J, 1988, BRAIN LANG, V34, P246, DOI 10.1016/0093-934X(88)90136-8 Ladefoged P., 1975, COURSE PHONETICS Laver J, 1979, SOCIAL MARKERS SPEEC, P1 LIBERMAN AM, 1967, PSYCHOL REV, V74, P431, DOI 10.1037/h0020279 Lively S. E., 1992, 18 IND U, P185 Lively S. E., 1992, SPEECH PERCEPTION PR, P175 LIVELY SE, J ACOUST SOC AM LOGAN JS, 1991, J ACOUST SOC AM, V89, P874, DOI 10.1121/1.1894649 MARTIN CS, 1989, J EXP PSYCHOL LEARN, V15, P676, DOI 10.1037/0278-7393.15.4.676 MULLENNIX JW, 1990, PERCEPT PSYCHOPHYS, V47, P379, DOI 10.3758/BF03210878 MULLENNIX JW, 1989, J ACOUST SOC AM, V85, P365, DOI 10.1121/1.397688 Neisser U., 1976, COGNITIVE PSYCHOL NYGAARD LC, 1992, 1992 P INT C SPOK LA, P209 Nygaard LC, 1992, J ACOUST SOC AM, V91, P2340, DOI 10.1121/1.403475 NYGAARD LC, IN PRESS PSYCHOL SCI Palmeri T. J., 1993, J EXPT PSYCHOL LEARN, V19, P1 PAPCUN G, 1989, J ACOUST SOC AM, V85, P913, DOI 10.1121/1.397564 Peters RW, 1955, 56 US NAV SCH AV MED, V56, P1 PISCONI DB, 1986, PATTERN RECOGN, P1 PISONI D, 1990, 1990 P INT C SPOK LA, P1399 Pisoni D. 
B., 1992, SPEECH PERCEPTION PR, P143 PISONI DB, 1985, SPEECH COMMUN, V4, P75, DOI 10.1016/0167-6393(85)90037-8 PISONI DB, 1973, PERCEPT PSYCHOPHYS, V13, P253, DOI 10.3758/BF03214136 PISONI DB, 1987, COGNITION, V25, P21, DOI 10.1016/0010-0277(87)90003-5 PISONI DB, 1992, 1992 P INT C SPOK LA, P587 PISONI DB, 1978, HDB LEARNING COGNITI, V6, P167 POSNER M, 1986, J EXP PSYCHOL, V77, P353 POSNER MI, 1969, PSYCHOL LEARN MOTIV, P43 ROEDIGER HL, 1990, AM PSYCHOL, V45, P1043, DOI 10.1037//0003-066X.45.9.1043 SCHACTER DL, 1990, DEV NEURAL BASES HIG, V608, P543 SCHACTER DL, 1992, AM PSYCHOL, V47, P559, DOI 10.1037//0003-066X.47.4.559 SOMMERS MS, 1992, 1992 P INT C SPOK LA, P217 SOMMERS MS, 1992, J ACOUST SOC AM, V91, P2340, DOI 10.1121/1.403474 STEVENS KN, 1971, 7TH P INT C PHON SCI, P206 Stevens KN, 1972, HUMAN COMMUNICATION, P51 STRANGE W, 1984, PERCEPT PSYCHOPHYS, V36, P131, DOI 10.3758/BF03202673 STUDDERTKENNEDY M, 1980, LANG SPEECH, V23, P45 Studdert-Kennedy M., 1976, CONT ISSUES EXPT PHO, P243 Studdert-Kennedy M, 1974, CURR TRENDS LINGUIST, P2349 Studdert-Kennedy M, 1983, Hum Neurobiol, V2, P191 TULVING E, 1990, SCIENCE, V247, P301, DOI 10.1126/science.2296719 VANLANCKER DR, 1989, J CLIN EXP NEUROPSYC, V11, P665, DOI 10.1080/01688638908400923 VANLANCKER DR, 1988, CORTEX, V24, P195 WALLEY AC, 1981, DEV PERCEPTION PSYCH, P2119 NR 61 TC 71 Z9 71 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD OCT PY 1993 VL 13 IS 1-2 BP 109 EP 125 DI 10.1016/0167-6393(93)90063-Q PG 17 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MH201 UT WOS:A1993MH20100010 ER PT J AU MASSARO, DW COHEN, MM AF MASSARO, DW COHEN, MM TI PERCEIVING ASYNCHRONOUS BIMODAL SPEECH IN CONSONANT-VOWEL AND VOWEL SYLLABLES SO SPEECH COMMUNICATION LA English DT Article DE BIMODAL SPEECH PERCEPTION; LIPREADING; AUDITORY-VISUAL TEMPORAL ASYNCHRONY; PHONETIC CLASSIFICATION ID PERCEPTION; INFORMATION AB Subjects naturally integrate auditory and visual information in bimodal speech perception. To assess the robustness of the integration process, the relative onset time of the audible and visible sources was systematically varied. In the first experiment, bimodal syllables composed of the auditory and visible syllables /ba/ and /da/ were presented at five different onset asynchronies. The second experiment replicated the same procedure but with the vowels /i/ and /u/. The results indicated that perceivers integrated the two sources of information at all asynchronies. Cluster responses (for example, /bda/ given visual /ba/ and auditory /da/) occurred primarily for the consonants but not for the vowels. In addition, cluster responses require that both the visual and the auditory information be reasonably compatible with the physical properties of a cluster articulation. For both vowels and consonant-vowel syllables, information from the auditory and visual sources is continuous, independent and combined in a three-stage process of feature evaluation, integration and decision. RP MASSARO, DW (reprint author), UNIV CALIF SANTA CRUZ, PROGRAM EXPTL PSYCHOL, SANTA CRUZ, CA 95064 USA. 
CR BRAIDA LD, 1991, IN PRESS Q J EXPT PS COHEN MM, 1984, THESIS U CALIFORNIA COHEN MM, 1991, PERCEIVING VISUAL AU DODD B, 1987, HEARING EYE EXPT STU FISHER BD, 1991, THESIS U CALIFORNIA FUJISAKI H, 1970, ANN REPORT ENG RES I, V29, P206 GREEN KP, 1991, PERCEPT PSYCHOPHYS, V50, P524, DOI 10.3758/BF03207536 KLATT DH, 1980, J ACOUST SOC AM, V67, P971, DOI 10.1121/1.383940 Massaro D., 1975, UNDERSTANDING LANGUA Massaro D. W., 1987, SPEECH PERCEPTION EA MASSARO DW, 1989, BEHAV BRAIN SCI, V12, P741 MASSARO DW, 1983, J EXP PSYCHOL HUMAN, V9, P753, DOI 10.1037/0096-1523.9.5.753 MASSARO DW, 1990, PSYCHOL SCI, V1, P55, DOI 10.1111/j.1467-9280.1990.tb00068.x MCGRATH M, 1985, J ACOUST SOC AM, V77, P678, DOI 10.1121/1.392336 SUMMERFIELD Q, 1991, MODULARITY AND THE MOTOR THEORY OF SPEECH PERCEPTION, P117 SUMMERFIELD Q, 1984, Q J EXP PSYCHOL-A, V36, P51 NR 16 TC 26 Z9 26 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1993 VL 13 IS 1-2 BP 127 EP 134 DI 10.1016/0167-6393(93)90064-R PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MH201 UT WOS:A1993MH20100011 ER PT J AU POLS, LCW VANSON, RJJH AF POLS, LCW VANSON, RJJH TI ACOUSTICS AND PERCEPTION OF DYNAMIC VOWEL SEGMENTS SO SPEECH COMMUNICATION LA English DT Article DE ACOUSTICS OF VOWELS; VOWELS IN READ SPEECH; DUTCH; PERCEPTION OF VOWELS; (DYNAMIC) FORMANT ANALYSIS; VOWEL IDENTIFICATION; VOWEL MODELS ID DUTCH VOWELS; FORMANT; CONTEXT; TEXT; READ AB Some 550 vowel segments have been excised from a text read by a Dutch speaker, both at normal rate and at fast rate. The duration of each segment is measured, as well as static and dynamic formant characteristics, such as midpoint formant frequencies, and descriptions of the formant tracks in terms of 16 equidistant points per segment, or Legendre polynomial functions. 
We examined these formant characteristics as a function of vowel duration, but found no indication for duration-dependent undershoot. Instead, this speaker showed very consistent consonant-specific coarticulatory behavior and adapted his speaking style to the speaking rate in order to reach the same midpoint formant frequencies. Various (parabolically stylized) formant tracks, at various durations, in isolation or in CVC contexts, were synthesized and presented to listeners for identification. Net shifts in vowel responses, compared to stationary stimuli, showed no indication of perceptual overshoot. A weighted averaging method with the greatest weight to formant frequencies in the final part of the vowel tokens, explained the results best. RP POLS, LCW (reprint author), UNIV AMSTERDAM, INST PHONET SCI, HERENGRACHT 338, 1016 CG AMSTERDAM, NETHERLANDS. CR DIBENEDETTO MG, 1989, J ACOUST SOC AM, V86, P67, DOI 10.1121/1.398221 KOOPMANSVANBEIN.FJ, 1980, THESIS U AMSTERDAM LINDBLOM B, 1963, J ACOUST SOC AM, V35, P1773, DOI 10.1121/1.1918816 Lindblom B., 1988, PHONETIC EXPT RES I, VVIII, P21 LINDBLOM BE, 1967, J ACOUST SOC AM, V42, P830, DOI 10.1121/1.1910655 MANN V, 1991, PERCEPT PSYCHOPHYS, V49, P399, DOI 10.3758/BF03212174 NORD L, 1987, PAPERS SWEDISH PHONE, P16 POLS LCW, 1984, P I ACOUST, V6, P371 Pols LCW, 1977, THESIS FREE U AMSTER SCHULMAN R, 1989, J ACOUST SOC AM, V85, P295, DOI 10.1121/1.397737 STRANGE W, 1989, J ACOUST SOC AM, V85, P2135, DOI 10.1121/1.397863 STRANGE W, 1989, J ACOUST SOC AM, V85, P2081, DOI 10.1121/1.397860 VANBERGEM DR, 1993, SPEECH COMMUN, V12, P1, DOI 10.1016/0167-6393(93)90015-D van Son R., 1993, ANAL SYNTHESIS SPEEC, P171 VANSON RJJ, 1991, IFA P, V15, P43 VANSON RJJ, 1991, P EUROSPEECH 91 GENO, V3, P117 VANSON RJJ, 1993, STUDIES LANGUAGE LAN, V3, P69 VANSON RJJH, 1992, J ACOUST SOC AM, V92, P121, DOI 10.1121/1.404277 VANSON RJJH, 1990, J ACOUST SOC AM, V88, P1683, DOI 10.1121/1.400243 VANWIERINGEN A, 1992, J ACOUST SOC AM, V92, pA2298, DOI 
10.1121/1.405128 NR 20 TC 13 Z9 13 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1993 VL 13 IS 1-2 BP 135 EP 147 DI 10.1016/0167-6393(93)90065-S PG 13 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MH201 UT WOS:A1993MH20100012 ER PT J AU LEHISTE, I FOX, RA AF LEHISTE, I FOX, RA TI INFLUENCE OF DURATION AND AMPLITUDE ON THE PERCEPTION OF PROMINENCE BY SWEDISH LISTENERS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH PROMINENCE; SPEECH PERCEPTION; PERCEPTION OF PROSODY AB This study examines the perception of prominence in speech signals by listeners who are native speakers of Swedish. It represents a continuation of earlier research dealing with perception of prominence by native speakers of English and Estonian (Lehiste and Fox, 1992). The stimuli consisted of the synthesized token [ba] at a constant fundamental frequency. The basic stimulus token was 400 ms in duration. Listeners heard sequences of four stimulus tokens with an inter-token interval of 120 ms. One token in each sequence could be lengthened to 425, 450, 475 or 500 ms. Either the same or a different token could be simultaneously increased in amplitude by 3 or 6 dB. Changes in duration and amplitude were independent. Listeners were required to indicate which token was ''most prominent''. The data show that Swedish listeners, like Estonian listeners, were more responsive to duration cues than were English listeners. However, Swedish listeners were not as sensitive to duration as Estonian listeners when the longer token was in the fourth position, possibly because Swedish listeners expect pre-boundary lengthening (a factor that seemed to have a very significant effect upon the English listeners). 
In general, these results support the hypothesis that the prosodic structure of a listener's native language can significantly influence his perception of suprasegmental stimuli. RP LEHISTE, I (reprint author), OHIO STATE UNIV, DEPT LINGUIST, 222 OXLEY HALL, 1712 NEIL AVE, COLUMBUS, OH 43210 USA. CR FOX RA, 1987, J PHONETICS, V15, P349 FOX RA, 1989, J PHONETICS, V17, P167 KLATT DH, 1980, J ACOUST SOC AM, V67, P979 LEHISTE I, 1992, LANG SPEECH, V35, P419 NR 4 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1993 VL 13 IS 1-2 BP 149 EP 154 DI 10.1016/0167-6393(93)90066-T PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MH201 UT WOS:A1993MH20100013 ER PT J AU OHALA, JJ AF OHALA, JJ TI SOUND CHANGE AS NATURES SPEECH-PERCEPTION EXPERIMENT SO SPEECH COMMUNICATION LA English DT Article DE SOUND CHANGE; SPEECH PERCEPTION; FRICATIVES; NASALS; PHONETICS; PHONOLOGY AB Variation in pronunciation observed in speakers today parallels in many details the documented variation in pronunciation over the centuries (sound change). It is reasonable to conclude that there is some necessary link between the two. I argue that diachronic variation emerges for the most part from synchronic variation thus: universal and timeless physical constraints on speech production and perception lead listeners to misapprehend the speech signal. Any such misapprehension that leads the listener to pronounce things in a different way is potentially the beginning of a sound change. If we study sound change we can gain insights into how speech is produced and perceived. I exemplify this point by considering a variety of sound changes that involved voiceless fricatives: so-called spontaneous nasalization, s-aspiration, and nasal effacement. 
They suggest that one cue to this class of sounds is a special voice quality on that portion of vowels immediately abutting the fricative. RP OHALA, JJ (reprint author), UNIV CALIF BERKELEY, DEPT LINGUIST, BERKELEY, CA 94720 USA. CR DURAND M, 1955, J Psychol Norm Pathol (Paris), V52, P347 FANT G, 1973, SPEECH SOUNDS FEATUR, P3 FUJIMURA O, 1971, J ACOUST SOC AM, V49, P541, DOI 10.1121/1.1912385 HOMBERT JM, 1979, LANGUAGE, V55, P37, DOI 10.2307/412518 KLATT DH, 1968, ANN NY ACAD SCI, V155, P42, DOI 10.1111/j.1749-6632.1968.tb56748.x Lorentz J., 1977, P ANN M BERKELEY LIN, V3, P577 Ohala J, 1975, NASALFEST PAPERS S N, P289 Ohala J. J, 1980, SPEECH LANGUAGE ADV, P75 OHALA JJ, 1992, OCT P INT C SPOK LAN, P1303 OHALA JJ, 1992, SPEECH COMMUN, V11, P369, DOI 10.1016/0167-6393(92)90042-6 OHALA JJ, 1981, J ACOUST SOC AM, V68, pS54 OHALA JJ, 1983, P 13 INT C LING TOK, P232 OHALA JJ, 1992, 3RD P INT S LANG LIN, V3 Ohala M., 1983, ASPECTS HINDI PHONOL Prokosch Eduard, 1938, COMP GERMANIC GRAMMA Robins R. H., 1967, SHORT HIST LINGUISTI WIDDISON KA, 1991, THESIS U CALIFORNIA WINITZ H, 1972, J ACOUST SOC AM, V51, P1309, DOI 10.1121/1.1912976 NR 18 TC 20 Z9 20 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1993 VL 13 IS 1-2 BP 155 EP 161 DI 10.1016/0167-6393(93)90067-U PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MH201 UT WOS:A1993MH20100014 ER PT J AU CORAZZA, A DEMORI, R GRETTER, R SATTA, G AF CORAZZA, A DEMORI, R GRETTER, R SATTA, G TI LANGUAGE MODELING USING STOCHASTIC CONTEXT-FREE GRAMMARS SO SPEECH COMMUNICATION LA English DT Article DE STOCHASTIC CONTEXT-FREE GRAMMARS; UPPER-BOUNDS; BEST DERIVATION AB Island-driven parsers have interesting potential applications in Automatic Speech Understanding (ASU). Most of the recently developed ASU systems are based on an Acoustic Processor (AP) and a Language Processor (LP). 
AP computes the a priori probability of the acoustic data given a linguistic interpretation. LP computes the probability of the linguistic interpretation. This paper describes an effort to adapt island-driven parsers to handle stochastic context-free grammars. These grammars could then be used as Language Models (LM) by LP to compute the probability of a linguistic interpretation. C1 MCGILL UNIV, SCH COMP SCI, MONTREAL H3A 2T5, QUEBEC, CANADA. UNIV PENN, PHILADELPHIA, PA 19104 USA. RP CORAZZA, A (reprint author), IST RICERCA SCI & TECNOL, I-38050 TRENT, ITALY. CR Aho A. V., 1972, THEORY PARSING TRANS, V1 BAKER JK, 1979, SPR P C AC SOC AM CORAZZA A, 1992, 10TH P NAT C ART INT, P344 CORAZZA A, 1991, IEEE T PATTERN ANAL, V13, P936, DOI 10.1109/34.93811 CORAZZA A, 1992, 920701 I RIC SCI TEC Fu K.S., 1982, SYNTACTIC PATTERN RE FUJISAKI H, 1990, RECENT RES ADV MAN M Gonzalez RC, 1978, SYNTACTIC PATTERN RE Harrison M.D., 1978, INTRO FORMAL LANGUAG JELINEK, 1991, P EUROPEAN C SPEECH, P1037 Jelinek F., 1991, Computational Linguistics, V17 Jelinek Frederick, 1992, SPEECH RECOGNITION U Lari K., 1990, Computer Speech and Language, V4, DOI 10.1016/0885-2308(90)90022-X LEE HC, 1972, IEEE T COMPUT, V4, P660 LU SY, 1977, IEEE T COMPUT, V26, P1268 PERSOON E, 1975, INT J COMPUT INF SCI, V4, P205, DOI 10.1007/BF01007759 SALOMAA A, 1969, INFORM CONTROL, V15, P529, DOI 10.1016/S0019-9958(69)90554-3 WETHERELL CS, 1980, COMPUT SURV, V12, P361 YOUNGER DH, 1967, INFORM CONTROL, V10, P189, DOI 10.1016/S0019-9958(67)80007-X NR 19 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD OCT PY 1993 VL 13 IS 1-2 BP 163 EP 170 DI 10.1016/0167-6393(93)90068-V PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MH201 UT WOS:A1993MH20100015 ER PT J AU MARIANI, JJ AF MARIANI, JJ TI AUTOMATED VOICE DICTATION IN FRENCH SO SPEECH COMMUNICATION LA English DT Article DE AUTOMATIC SPEECH PROCESSING; SPEECH RECOGNITION; VOICE DICTATION; VOICE ACTIVATED TYPEWRITER (VAT); ACOUSTIC MODELS; LANGUAGE MODELS AB Designing a Voice-Activated Typewriter in French necessitates a study both on how to design the acoustic level recognition, and on how to obtain a model of the French language. Such a project was initiated at LIMSI 15 years ago. This paper presents the different steps that have been completed since the beginning of this project. First, a study on the phoneme-to-grapheme conversion, for continuous, error-free phonemic strings, using a large vocabulary and a natural language syntax was completed in 1979. The corresponding results were then improved, with attempts to convert phoneme strings containing (simulated) errors, while the methodology was adapted to the case of stenotype-to-grapheme conversion. In the ESPRIT project 860 ''Linguistic Analysis of the European Languages'', our approach for language modeling was compared with other approaches on 7 different European languages. The link between the acoustic recognition and the language model resulted in a complete system (''Hamlet''), for a limited vocabulary (2,000 words), pronounced in isolation, which was then extended to a vocabulary of 5,000 words, taking advantage of a specialized DTW chip (MuPCD), also designed at LIMSI. This study resulted in the conclusion that dictation in an isolated mode was not acceptable. A speaker-independent continuous speech recognition system is now developed for vocabularies of 5 to 20 KWords. RP MARIANI, JJ (reprint author), LIMSI, CNRS, BP 133, F-91403 ORSAY, FRANCE. 
CR ADDA G, 1987, THESIS U PARIS 11 ANDREEWSKI A, 1972, CEA1606 NOT ANDREEWSKI A, 1978, CEA2055 NOT ANDREEWSKI A, 1979, 10EM JOURN ETUD PAR, P285 AVERBUCH A, 1987, APR IEEE INT C AC SP, P701 AVRAIN J, 1983, TRANSCRIPTION ORTHOG BAHL LR, 1983, IEEE T PATTERN ANAL, V5, P179 BAHL LR, 1978, IEEE INT C ACOUST SP, P422 BAHL LR, 1989, IEEE INT C ACOUST SP BAKER JK, 1986, SPEECH TECH 86 NEW Y, P193 BAKER JK, 1975, IEEE T ACOUST SPEECH, VAS23, P24, DOI 10.1109/TASSP.1975.1162650 BELLILTY D, 1984, CONVERSION PHONEMESG BOVES L, 1984, EUROPEAN C SPEECH TE, P385 Buzo A., 1979, ICASSP 79. 1979 IEEE International Conference on Acoustics, Speech and Signal Processing DEROUAULT AM, 1986, IEEE T PATTERN ANAL, V5, P742 DEROUAULT AM, 1985, THESIS U PARIS 7 Dumouchel P., 1988, ICASSP 88: 1988 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.88CH2561-9), DOI 10.1109/ICASSP.1988.196632 FORNEY GD, 1973, P IEEE, V61, P268, DOI 10.1109/PROC.1973.9030 GAUVAIN JL, 1988, 1988 IOA SPEECH GROU GAUVAIN JL, IN PRESS INT J PATTE GAUVAIN JL, 1982, IEEE INT C ACOUST SP, P891 GAUVAIN JL, 1992, 1992 DARPA SPEECH LA JELINEK F, 1976, P IEEE, V64, P532, DOI 10.1109/PROC.1976.10159 JELINEK F, 1987, IEEE INT C ACOUST SP, P701 KUHN MH, 1983, IEEE T ACOUST SPEECH, V31 LEE KF, 1988, LARGE VOCABULARLY SP MARIANI J, 1981, 1981 SEM GROUP AC LA MARIANI J, 1987, 1987 EUR C SPEECH TE MARIANI J, 1977, THESIS U PARIS 6 MARSLENWILSON WD, 1980, NATO ASI SER, P39 MERIALDO B, IEEE INT C ACOUST SP, P364 MURRAY IR, 1991, CS9109 1 DUND REP PALLETT D, 1993, MAR ARPA HUM LANG TE PORITZ AB, 1982, APR P IEEE INT C AC, P1291 PROUTS B, 1980, THESIS U PARIS 11 QUENOT G, 1986, IEEE INT C ACOUST SP SHICHMAN G, 1986, APR IEEE INT C AC SP, P53 SIMON F, 1985, PRECLASSIFICATION RE VITTORELLI V, 1987, LINGUISTIC ANAL EURO, P1358 Zwicker E., 1981, PSYCHOACOUSTIQUE ORE NR 40 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI 
Speech Commun. PD OCT PY 1993 VL 13 IS 1-2 BP 171 EP 185 DI 10.1016/0167-6393(93)90069-W PG 15 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MH201 UT WOS:A1993MH20100016 ER PT J AU GONG, YF HATON, JP AF GONG, YF HATON, JP TI PLAUSIBILITY FUNCTIONS IN CONTINUOUS SPEECH RECOGNITION - THE VINICS SYSTEM SO SPEECH COMMUNICATION LA English DT Article DE AUTOMATIC SPEECH RECOGNITION; PLAUSIBILITY FUNCTIONS; EMBEDDED DYNAMIC PROGRAMMING; VINICS SYSTEM; CONTINUOUS SPEECH; ACOUSTIC SYMBOLS AB We propose a new approach to phoneme-based continuous speech recognition when a time function of plausibility of observing each phoneme is given. We introduce a criterion for the best sentence, related to the sum of plausibilities of individual symbols composing the sentence. Based on the idea of making use of a high plausibility region to reduce the computation load while keeping optimality, our method finds the most plausible sentences relating to the input speech, given the plausibility mu(a,n) of observing each phoneme a at each time slot n. Two optimization procedures are defined to deal with the following embedded search processes: (1) find the best path connecting peaks of the plausibility functions of two successive symbols, and (2) find the best time transition slot index for two given peaks. Dynamic programming is used in these two procedures. Since the best path finding algorithm does not search slot by slot, the recognition is highly efficient. Experimental results with the VINICS system show that the method gives a better recognition precision while requiring about 1/20 of the computing time, compared to traditional DP-based methods. The experimental system obtained a 95% sentence recognition rate on a speaker-dependent test. RP GONG, YF (reprint author), CRIN, INRIA LORRAINE, BP 239, F-54506 VANDOEUVRE LES NANCY, FRANCE. 
CR FORNY GD, 1973, P IEEE, V61, P26 FRANZINI M, 1990, 1990 P IEEE INT C AC, V1, P425 GONG Y, 1991, 1991 P IEEE INT C AC, V1, P121 GONG Y, 1987, 1987 P EUR C SPEECH, V1, P121 GONG Y, 1989, 7EM ACT C REC FORM I, V3, P1191 GONG Y, 1991, 1991 P IEEE INT C AC, V1, P153 GONG YF, 1991, IEEE T PATTERN ANAL, V13, P297, DOI 10.1109/34.75518 ISO K, 1990, 1993 P IEEE INT C AC, V1, P441 LEVIN E, 1990, 1990 P IEEE INT C AC, V1, P433 LIPPMANN RP, 1987, IEEE ASSP MEGAZINE, V3, P422 Minsky M.L, 1969, PERCEPTRONS MORGAN N, 1990, 1993 P IEEE INT C AC, V1, P413 Rabiner L.R., 1978, DIGITAL PROCESSING S SAWAI H, 1989, 1989 P INT C AC SPEE, V1, P25 SCIARRA D, 1989, 1989 P EUR C SPEECH, V2, P164 TEBELSKIS J, 1990, 1990 P IEEE INT C AC, V1, P437 NR 16 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1993 VL 13 IS 1-2 BP 187 EP 196 DI 10.1016/0167-6393(93)90070-2 PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MH201 UT WOS:A1993MH20100017 ER PT J AU RAO, PVS AF RAO, PVS TI VOICE - AN INTEGRATED SPEECH RECOGNITION SYNTHESIS SYSTEM FOR THE HINDI LANGUAGE SO SPEECH COMMUNICATION LA English DT Article DE SPEECH RECOGNITION AND SYNTHESIS; DISTRIBUTED PROCESSING; SPEECH CODING; SEGMENTATION AND LABELING; F-RATIO; FEATURE VECTOR AB A Voice Oriented Interactive Computing Environment (VOICE) has been implemented in the Hindi language. The system provides an interactive facility for visual and voice feedback. The 200 isolated word recognition system is designed around a railway reservation enquiry task and uses acoustic-phonetic segments as the basic units of recognition. Frame level classification into broad acoustic-phonetic categories is accomplished by a maximum likelihood classifier and segmentation by hierarchical clustering of the frame level likelihood vectors by use of explicit duration semi (Hidden) Markov Models. 
A more detailed classification of a few categories (vowels, voice bar and nasals in the first instance) is performed by neural nets. String matching using dynamic programming accomplishes lexical access, or conversion of the phonetic category symbol strings into words. Distributed processing of the word recognition task enables recognition at four times real time. A language processor disambiguates between multiple choices given by the recognizer for each word and even corrects some acoustic level recognition errors. This, the first system working in any Indian language, gives a recognition performance of 85% at the word level. For comparison, a purely HMM based word level recognizer has also been implemented. The performance is expected to improve further as there is still substantial scope for refinement. RP RAO, PVS (reprint author), TATA INST FUNDAMENTAL RES, COMP SYST & COMMUN GRP, BOMBAY 400005, INDIA. CR GLASS JR, 1988, MIT536 TECHN REP KLATT DH, 1980, J ACOUST SOC AM, V67, P971, DOI 10.1121/1.383940 PODDAR P, 1993, IEEE C NEURAL NETWOR PODDAR P, 1992, 1992 P SPEECH TECHN RAO PVS, 1992, RECENT RES SPOKEN LA RAO PVS, 1988, 2ND S ADV MAN MACH I RAO PVS, 1992, 1992 INT C SPOK LANG NR 7 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD OCT PY 1993 VL 13 IS 1-2 BP 197 EP 205 DI 10.1016/0167-6393(93)90071-R PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MH201 UT WOS:A1993MH20100018 ER PT J AU FLANAGAN, JL SURENDRAN, AC JAN, EE AF FLANAGAN, JL SURENDRAN, AC JAN, EE TI SPATIALLY SELECTIVE SOUND CAPTURE FOR SPEECH AND AUDIO PROCESSING SO SPEECH COMMUNICATION LA English DT Article DE SPEECH PROCESSING; AUDIO PROCESSING; MICROPHONE SYSTEMS ID MICROPHONE AB Advances in transducer technology, signal processing and computing make possible high-quality sound capture from designated spatial volumes under adverse acoustic conditions. The techniques of multiple beamforming and matched filtering are applied to two- and three-dimensional arrays of sensors. Array performance is assessed in a preliminary way from computer simulations of rooms and from image characterization of the multipath environment. The results suggest that high-quality signals can be retrieved from spatially-selected volumes in severely reverberant enclosures. Reciprocally, the same techniques can be applied to spatially-selective sound projection. RP FLANAGAN, JL (reprint author), RUTGERS UNIV, CTR CAIP, FRELINGHUYSEN RD, PISCATAWAY, NJ 08855 USA. CR ALLEN JB, 1979, J ACOUST SOC AM, V65, P943, DOI 10.1121/1.382599 BERKLEY D, 1990, ATT TECH J SEP, P87 CHE C, 1992, J ACOUST SOC AM, V92, P2476, DOI 10.1121/1.404443 ELKO GW, 1988, Patent No. 
4741038 FLANAGAN JL, 1989, J ACOUST SOC AM, V82, P539 FLANAGAN JL, 1991, ACUSTICA, V73, P58 FLANAGAN JL, 1992, P INT WORKSHOP MICRO FLANAGAN JL, 1985, J ACOUST SOC AM, V78, P1508, DOI 10.1121/1.392786 KANEDA Y, 1984, J ACOUST SOC AM, V76, P584 NAYLOR G, 1992, ACOUST SOC AM, V92, pA2345 SESSLER GM, 1969, J ACOUST SOC AM, V46, P28, DOI 10.1121/1.1911657 SILVERMAN HF, 1987, IEEE T ACOUST SPEECH, V35, P1699, DOI 10.1109/TASSP.1987.1165098 STERN RM, 1989, DARPA SPEECH NATURAL NR 13 TC 38 Z9 38 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1993 VL 13 IS 1-2 BP 207 EP 222 DI 10.1016/0167-6393(93)90072-S PG 16 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MH201 UT WOS:A1993MH20100019 ER PT J AU HIROSE, K AF HIROSE, K TI SPEECH SIGNAL-PROCESSING USING OPTICAL METHOD SO SPEECH COMMUNICATION LA English DT Article DE OPTICAL PROCESSING; SPECTRAL ANALYSIS; TEMPLATE MATCHING; SPEECH RECOGNITION; LIQUID CRYSTAL PLATE AB A method has been proposed for the use of optical processing techniques in the analysis and recognition of speech signals. It was realized as an optical processor consisting of a Helium-Neon laser, optical lenses, photographic film plates and diffusers. A personal computer system was further introduced for the total system of optical processing to manage the experimental data. Since the optical processing method has an inherent advantage for high-speed and parallel processing of two-dimensional patterns, the frequency-time pattern of a one-dimensional signal can be obtained without shifting a window along the time axis and the template matching for speech recognition can be conducted in a short period. Utterances of vowels and syllables were analyzed using the processor and the results showed a close agreement with those obtained by the computer simulation. 
Nonlinear warping of the time axis, indispensable for spoken word recognition, was shown to be accomplished by controlling the transmittance function of the windowing plate. Template matching of vowel sounds gave a correct recognition of the Japanese five vowels. These results indicate the validity of the optical processor. The use of a liquid crystal plate was also proposed as a windowing plate, and an experiment was conducted on the analysis of vowel sounds. A near-real-time template matching with non-linear time warping is possible by electrically controlling the liquid crystal plate using the result of template matching as a feedback signal. RP HIROSE, K (reprint author), UNIV TOKYO, FAC ENGN, DEPT ELECTR ENGN, 7-3-1 HONGO, BUNKYO KU, TOKYO 113, JAPAN. CR BAKER LM, 1982, APPL OPTICS, V21, P3157, DOI 10.1364/AO.21.003157 BARTELT HO, 1980, OPT COMMUN, V32, P32, DOI 10.1016/0030-4018(80)90308-9 FELSTEAD EB, 1971, APPL OPTICS, V10, P2468, DOI 10.1364/AO.10.002468 HIROSE K, 1986, FAL M AC SOC JAP, V1, P133 HIROSE K, 1986, 1986 P IEEE INT C AC, V1, P485 TAKAHASHI N, 1987, FAL M AC SOC JAP, V1, P259 YU FTS, 1975, IEEE SPECTRUM, V12, P51 YU FTS, 1971, ACOUST SOC AM, V51, P433 YU FTS, 1985, APPL OPTICS, V24, P836 NR 9 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1993 VL 13 IS 1-2 BP 223 EP 229 DI 10.1016/0167-6393(93)90073-T PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MH201 UT WOS:A1993MH20100020 ER PT J AU SCHROEDER, MR AF SCHROEDER, MR TI A BRIEF-HISTORY OF SYNTHETIC SPEECH SO SPEECH COMMUNICATION LA English DT Article DE SPEECH SYNTHESIS; SPEECH COMPRESSION; LINEAR PREDICTION; RECOGNITION; HISTORY ID SIGNALS AB This paper retraces, in an informal way, some of the history of speech synthesis and speech research from Von Kempelen's speaking machine to linear prediction. 
C1 AT&T BELL LABS, MURRAY HILL, NJ 07974 USA. RP SCHROEDER, MR (reprint author), UNIV GOTTINGEN, DRITTES PHYS INST, W-3400 GOTTINGEN, GERMANY. CR ATAL BS, 1979, IEEE T ACOUST SPEECH, V27, P247, DOI 10.1109/TASSP.1979.1163237 ATAL BS, 1970, AT&T TECH J, V49, P1973 ATAL BS, 1967, 1967 P IEEE C COMM P, P360 Bell A. M., 1867, VISIBLE SPEECH SCI U BELL AG, 1907, MECHANISMS SPEECH BELL AG, 1922, NATIONAL GEOGRAPHIC, V14, P223 BELL CG, 1961, J ACOUST SOC AM, V33, P1725, DOI 10.1121/1.1908556 BOGERT BP, 1956, J ACOUST SOC AM, V28, P399, DOI 10.1121/1.1908340 Chui C. K., 1992, INTRO WAVELETS COOPER FS, 1952, J ACOUST SOC AM, V24, P597, DOI 10.1121/1.1906940 DAVID EE, 1962, 4TH P INT C AC COP DENES PB, 1970, PR INST ELECTR ELECT, V58, P520, DOI 10.1109/PROC.1970.7691 DUDLEY H, 1950, J ACOUST SOC AM, V22, P151, DOI 10.1121/1.1906583 Fant G., 1970, ACOUSTIC THEORY SPEE FLANAGAN JL, 1972, SPEECH ANAL SYNTHESI, P206 FUJISAKI H, 1960, J ACOUST SOC AM, V32, P1518, DOI 10.1121/1.1936361 Gramss T., 1991, Neural Networks for Signal Processing. Proceedings of the 1991 IEEE Workshop (Cat. No.91TH0385-5), DOI 10.1109/NNSP.1991.239513 Grutzmacher M., 1927, Elektrische Nachrichten-Technik, V4 Hess W., 1983, PITCH DETERMINATION HOUTGAST T, 1973, ACUSTICA, V28, P66 HYDE JE, 1976, TELEPHONE BOOK, P3 ISHIZAKA K, 1972, AT&T TECH J, V51, P1233 ITAKURA F, 1969, ACOUST SOC JAPAN M Kempelen Wolfgang von, 1791, MECHANISMUS MENSCHLI KIRITANI S, 1971, 7TH P INT C AC BUD Kratzenstein C. G., 1782, J PHYS, V21, P358 MARCOU P, 1955, 3RD P S INF THEOR LO, P231 MERMELST.P, 1967, J ACOUST SOC AM, V41, P1283, DOI 10.1121/1.1910470 MEYER EA, 1910, UNTERSUCHUNGEN LAUTB, P172 NOLL AM, 1967, J ACOUST SOC AM, V36, P1030 OBATA J, 1932, JAP J PHYSICS, V8 PAGET R., 1930, HUMAN SPEECH PAPING M, 1993, IN PRESS P INT C APP Rabiner L. 
R., 1986, IEEE ASSP Magazine, V3, DOI 10.1109/MASSP.1986.1165342 Rabiner L.R., 1978, DIGITAL PROCESSING S RAHIM MG, 1993, J ACOUST SOC AM, V93, P1109, DOI 10.1121/1.405559 RAYLEIGH IWS, 1945, THEORY SOUND, V2, P469 RUSSEL OG, 1929, J ACOUST SOC AM, V1, P83 RUSSEL OG, 1928, VOWELS SCHROEDER MR, 1981, ACUSTICA, V49, P179 Schroeder M, 1991, FRACTALS CHAOS POWER SCHROEDE.MR, 1966, PR INST ELECTR ELECT, V54, P720, DOI 10.1109/PROC.1966.4841 SCHROEDER MR, 1979, J ACOUST SOC AM, V66, P1647, DOI 10.1121/1.383662 SCHROEDE.MR, 1967, PR INST ELECTR ELECT, V55, P396, DOI 10.1109/PROC.1967.5497 Schroeder M.R., 1962, Journal of the Audio Engineering Society, V10 SCHROEDE.MR, 1967, J ACOUST SOC AM, V41, P1002, DOI 10.1121/1.1910429 SCHROEDER MR, 1986, J ACOUST SOC AM, V79, P1580, DOI 10.1121/1.393292 SCHROEDER MR, 1985, MAR P INT C AC SPEEC, P937 Schroeder M.R., 1960, Acustica, V10 Schroeder M.R., 1967, Bell System Technical Journal, V46 SCHROEDER MR, 1962, 4TH P INT C AC COP Stumpf C., 1926, SPRACHLAUTE Thienhaus E., 1934, Zeitschrift fur Technische Physik, V15 Ungeheuer G., 1962, ELEMENTE AKUSTISCHEN VONHELMHOLTZ H, 1870, SENSATIONS TONE, P103 WILLIS W, 1838, T CAMB PHILOS SOC, V3, P231 NR 56 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1993 VL 13 IS 1-2 BP 231 EP 237 DI 10.1016/0167-6393(93)90074-U PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MH201 UT WOS:A1993MH20100021 ER PT J AU SUNDBERG, J AF SUNDBERG, J TI HOW CAN MUSIC BE EXPRESSIVE SO SPEECH COMMUNICATION LA English DT Article DE MUSIC PERFORMANCE; PROSODY; GROUPING; CATEGORICAL PERCEPTION; MUSICAL PHRASING; MUSICAL ARTICULATION ID PERFORMANCE; RULES; KEYS AB Music performance is examined using an analysis-by-synthesis method. As a result a note-to-tone conversion program emerged containing a generative rule system for musical performance. 
Some of these performance rules are presented and discussed. In particular, the purpose of the rules in music communication is analyzed. Three main purposes are identified: (1) differentiation of pitch and duration categories, (2) marking of boundaries between tone groups, and (3) timing and tuning agreement in ensembles. The possible origin of the performance rules is also discussed. Many striking similarities with speech and other types of human communication suggest that expressive details in music performance are not specific to music. On the contrary, they appear to allude to and thus become meaningful because of the listener's extramusical experiences. RP SUNDBERG, J (reprint author), KTH, DEPT SPEECH COMMUN & MUSIC ACOUST, BOX 70014, STOCKHOLM, SWEDEN. CR Bengtsson I., 1975, HAMBURGER JB MUSIKWI, V1, P195 BENGTSSON I, 1983, STUDIES MUSIC PERFOR, V39, P27 BURNS E, 1978, J ACOUST SOC AM, V67, P456 Carlson R., 1989, CONT MUSIC REV, V4, P389 CLARKE E, 1987, ROYAL SWEDISH ACADEM, V55, P19 Clarke E. 
F., 1988, GENERATIVE PROCESSES, P1 CLARKE EF, 1982, ACTA PSYCHOL, V50, P1, DOI 10.1016/0001-6918(82)90047-6 CLYNES M, 1983, PUBLICATION ROYAL SW, V39, P76 EDLUND B, 1985, ACTA U UPSAL STUDIA, V9 FRIBERG A, 1993, 41992 SPEECH TRANSM, P97 FRIBERG A, 1991, COMPUT MUSIC J, V15, P56, DOI 10.2307/3680917 FRIBERG A, 1991, COMPUT MUSIC J, V15, P49, DOI 10.2307/3680916 GABRIELSSON A, 1983, ROYAL SWEDISH ACADEM, V39, P27 GABRIELSSON A, 1987, ROYAL SWEDISH ACAD M, V55, P81 GARBUZOW N, 1948, ZONNAJA PRIRODA ZVUK Howell P., 1991, REPRESENTING MUSICAL, P161 Kronman U., 1987, ACTION PERCEPTION RH, V55, P57 KRUMHANSL CL, 1982, PSYCHOL REV, V89, P334, DOI 10.1037/0033-295X.89.4.334 KRUMHANSL CL, 1982, J EXP PSYCHOL HUMAN, V8, P24 LARSSON B, 1977, 1 SWED ROYAL I TECHN, P38 LPAS Van Noorden, 1975, TEMPORAL COHERENCE P PALMER C, 1989, J EXP PSYCHOL HUMAN, V15, P331, DOI 10.1037/0096-1523.15.2.331 POVEL DJ, 1977, ACTA PSYCHOL, V41, P309, DOI 10.1016/0001-6918(77)90024-5 RASCH RA, 1979, ACUSTICA, V43, P121 REPP BH, 1992, J ACOUST SOC AM, V92, P2546, DOI 10.1121/1.404425 Seashore C., 1938, PSYCHOL MUSIC SHAFFER LH, 1980, TUTORIALS MOTOR BEHA, P443 SHAFFER LH, 1981, COGNITIVE PSYCHOL, V13, P326, DOI 10.1016/0010-0285(81)90013-X SLOBODA JA, 1983, Q J EXP PSYCHOL-A, V35, P377 SUNDBERG J, 1991, MUSIC PERCEPT, V9, P71 Sundberg J., 1978, SWED J MUSICOL, V60, P107 Sundberg J., 1989, CONT MUSIC REV, V3, P89, DOI DOI 10.1080/07494468900640071 SUNDBERG J, 1980, J ACOUST SOC AM, V68, P772, DOI 10.1121/1.384816 TAGUTI T, 1989, 1ST P INT C MUS PERC, P219 Thompson W. F., 1989, PSYCHOL MUSIC, V17, P63, DOI 10.1177/0305735689171006 TODD N, 1985, MUSIC PERCEPT, V3, P33 NR 36 TC 16 Z9 16 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD OCT PY 1993 VL 13 IS 1-2 BP 239 EP 253 DI 10.1016/0167-6393(93)90075-V PG 15 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MH201 UT WOS:A1993MH20100022 ER PT J AU RAMLOT, JM SORIN, C CARRE, R BOE, LJ AF RAMLOT, JM SORIN, C CARRE, R BOE, LJ TI WAJSKOP,MAX - OBITUARY SO SPEECH COMMUNICATION LA English DT Biographical-Item CR WAJSKOP M, PUBLICATION LIST NR 1 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD AUG PY 1993 VL 12 IS 4 BP 296 EP 298 PG 3 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MC805 UT WOS:A1993MC80500001 ER PT J AU LAVER, J AF LAVER, J TI FALLSIDE,FRANK - OBITUARY SO SPEECH COMMUNICATION LA English DT Biographical-Item NR 0 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD AUG PY 1993 VL 12 IS 4 BP 299 EP 299 DI 10.1016/0167-6393(93)90079-Z PG 1 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MC805 UT WOS:A1993MC80500002 ER PT J AU MIN, BJ UN, CK AF MIN, BJ UN, CK TI POLE-ZERO MODELING OF SPEECH USING FAST TRANSVERSAL FILTERING ALGORITHM SO SPEECH COMMUNICATION LA English DT Article DE POLE-ZERO MODELING; FAST TRANSVERSAL FILTER (FTF); RLS LATTICE FILTER AB This paper describes a pole-zero (ARMA) modeling of speech using a recursive-least-squares (RLS) fast transversal filter (FTF) algorithm. This ARMA FTF algorithm can estimate unknown input excitation and the estimated input is used to determine the parameters of the pole-zero model. This algorithm is derived using geometric projections. The geometric projection approach gives insight and useful interpretation of various filters that form the algorithm. 
We evaluate the performance of the proposed algorithm by applying it to spectral estimation of synthetic and natural speech. This algorithm accurately represents spectral peaks and valleys of speech and requires fewer computations than RLS lattice filters and the ARMA FTF algorithm of Ardalan and Faber (1988). This algorithm can also be applied to other signal processing areas where the input is unknown. RP MIN, BJ (reprint author), KOREA ADV INST SCI & TECHNOL, DEPT ELECT ENGN, COMMUN RES LAB, 373-1 KUSUNG DONG, YUSING GU, TAEJON, SOUTH KOREA. CR ALEXANDER ST, 1986, ADAPTIVE SIGNAL PROC, P123 ARDALAN SH, 1988, IEEE T ACOUST SPEECH, V36, P349, DOI 10.1109/29.1531 KARLSSON E, 1987, IEEE T ACOUST SPEECH, V35, P994, DOI 10.1109/TASSP.1987.1165246 KONVALINKA IS, 1979, IEEE T ACOUST SPEECH, V27, P485, DOI 10.1109/TASSP.1979.1163276 LEE DT, 1982, IEEE T AUTOMAT CONTR, V35, P753 LIM BK, 1991, SPEECH COMMUN, V10, P303, DOI 10.1016/0167-6393(91)90018-O RABINER LR, 1978, DIGITAL PROCESSING S, P76 SONG KH, 1983, IEEE T ACOUST SPEECH, V31, P1556 STROBACH P, 1988, IEEE T ACOUST SPEECH, V36, P560, DOI 10.1109/29.1559 NR 9 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD AUG PY 1993 VL 12 IS 4 BP 301 EP 320 DI 10.1016/0167-6393(93)90080-5 PG 20 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MC805 UT WOS:A1993MC80500003 ER PT J AU SAERENS, M AF SAERENS, M TI A CONTINUOUS-TIME DYNAMIC FORMULATION OF VITERBI ALGORITHM FOR ONE-GAUSSIAN-PER-STATE HIDDEN MARKOV-MODELS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH RECOGNITION; HIDDEN MARKOV MODELS; VITERBI ALGORITHM ID CONTINUOUS SPEECH RECOGNITION AB When using hidden Markov models for speech recognition, it is usually assumed that the probability that a particular acoustic vector is emitted at a given time only depends on the current state and the current acoustic vector observed. In this paper, we introduce another idea, i.e., we assume that, in a given state, the acoustic vectors are generated by a continuous Markov process. Indeed, the time evolution of the acoustic vector is inherently dynamic and continuous, and sampling only occurs for the purpose of computation. This allows us to assign a probability density to the time trajectory of the acoustic vector inside the state, reflecting the probability that this particular path has been generated by the continuous Markov process associated with this state. Roughly speaking, it measures the ''adequacy'' of the observed trajectory with respect to an ideal trajectory, which is modelled by a vectorial linear differential equation. This model is introduced in order to describe the dynamic behaviour of the acoustic vector inside a state. Once the segmentation is fixed, reestimation formulae for the parameters of the continuous Markov process are derived for the Viterbi algorithm. As usual, the segmentation can be obtained by sampling the continuous process, and by applying dynamic programming to find the best path over all the possible sequences of states and all the possible durations. 
Finally, we sketch a possible generalization to path mixtures, for which different trajectories are available in each state. However, we have to stress that no experimental results are available at present. Indeed, we did not have the opportunity to test the algorithm on real speech. We are aware that the assumptions we made may not be appropriate for the modelling of speech. RP SAERENS, M (reprint author), UNIV LIBRE BRUXELLES, IRIDIA LAB, CP 194-6, 50 AV F ROOSEVELT, B-1050 BRUSSELS, BELGIUM. CR Bellman R. E., 1962, APPL DYNAMIC PROGRAM BOURLARD H, 1992, THESIS U MONS BELGIU BOURLARD H, 1986, P EUROPEAN SIGNAL PR, P511 Cox D. R., 1965, THEORY STOCHASTIC PR DENG L, 1992, SIGNAL PROCESS, V27, P65, DOI 10.1016/0165-1684(92)90112-A DENG L, 1992, P INT C SIGNAL PROCE, P1025 DENG L, 1991, 1991 P IEEE WORKSH N, P411 DIGALAKIS V, 1991, MAY P IEEE INT C AC, P289 Duda R. O., 1973, PATTERN CLASSIFICATI FEYNMAN RP, 1948, REV MOD PHYS, V20, P367, DOI 10.1103/RevModPhys.20.367 FEYNMAN RP, 1965, QUANTUM MECHANICS PA Friedman B., 1956, PRINCIPLES TECHNIQUE FURUI S, 1986, IEEE T ACOUST SPEECH, V34, P52, DOI 10.1109/TASSP.1986.1164788 FURUI S, 1991, P EUROSPEECH GENOVA, P3 Gardiner C.W., 1985, HDB STOCHASTIC METHO GELFAND IM, 1960, J MATH PHYS, V1, P48, DOI 10.1063/1.1703636 Golub G. H., 1989, MATRIX COMPUTATIONS GU HY, 1991, IEEE T SIGNAL PROCES, V39, P1743 GURGEN F, 1990, 1990 P INT C SPOK LA Huang X.D., 1990, HIDDEN MARKOV MODELS ISO K, 1991, INT CONF ACOUST SPEE, P57, DOI 10.1109/ICASSP.1991.150277 ISO K, 1990, INT CONF ACOUST SPEE, P441, DOI 10.1109/ICASSP.1990.115744 JELINEK F, 1976, P IEEE, V64, P532, DOI 10.1109/PROC.1976.10159 JUANG BH, 1984, AT&T TECH J, V63, P1213 JUANG BH, 1985, IEEE T ACOUST SPEECH, V33, P1404 KENNY P, 1990, IEEE T ACOUST SPEECH, V38, P220, DOI 10.1109/29.103057 LEVIN E, 1990, P IEEE INT C AC SPEE, P433 LEVIN E, 1991, NIPS C P, V3, P147 Levinson S. 
E., 1986, Computer Speech and Language, V1, DOI 10.1016/S0885-2308(86)80009-2 LEVINSON SE, 1983, AT&T TECH J, V62, P1035 MONTROLL EW, 1952, COMMUN PUR APPL MATH, V5, P415, DOI 10.1002/cpa.3160050403 Papoulis A, 1991, PROBABILITY RANDOM V, V3rd PETEK B, 1992, SPEECH COMMUN, V11, P273, DOI 10.1016/0167-6393(92)90021-X PORITZ AB, 1982, APR P IEEE INT C AC, P1291 PORITZ AB, 1988, APR P IEEE INT C AC, P7 RABINER LR, 1985, AT&T TECH J, V64, P1211 RABINER LR, 1989, P IEEE, V77, P257, DOI 10.1109/5.18626 Russell M. J., 1985, ICASSP 85. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (Cat. No. 85CH2118-8) SAERENS M, 1993, EUROSPEECH C BERLIN SAERENS M, 1993, UNPUB HIDDEN MARKOV SCHULMAN LS, 1981, TECHNIQUES APPLICATI SUAUDEAU N, 1992, 19E ACT JOURN ET PAR, P189 TEBELSKIS J, 1991, INT CONF ACOUST SPEE, P61, DOI 10.1109/ICASSP.1991.150278 TEBELSKIS J, 1990, INT CONF ACOUST SPEE, P437, DOI 10.1109/ICASSP.1990.115742 TISHBY NZ, 1991, IEEE T SIGNAL PROCES, V39, P563 TSUBOKA E, 1990, 1990 P INT C SPOK LA VASEGHI SV, 1992, P EUROPEAN SIGNAL PR, P435 WELLEKENS C, 1987, ICASSP 87, P384 Wiener N, 1930, ACTA MATH-DJURSHOLM, V55, P117, DOI 10.1007/BF02546511 Woodland P. C., 1992, ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech and Signal Processing (Cat. No.92CH3103-9), DOI 10.1109/ICASSP.1992.225860 NR 50 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD AUG PY 1993 VL 12 IS 4 BP 321 EP 333 DI 10.1016/0167-6393(93)90081-U PG 13 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MC805 UT WOS:A1993MC80500004 ER PT J AU RECASENS, D FONTDEVILA, J PALLARES, MD SOLANAS, A AF RECASENS, D FONTDEVILA, J PALLARES, MD SOLANAS, A TI AN ELECTROPALATOGRAPHIC STUDY OF STOP CONSONANT CLUSTERS SO SPEECH COMMUNICATION LA English DT Article DE COARTICULATION; ELECTROPALATOGRAPHY; GESTURAL BLENDING; INTERGESTURAL COHESIVENESS ID SPEECH PRODUCTION; COARTICULATION AB This is an electropalatographic investigation of coarticulation for heterosyllabic stop consonant clusters in American English and Catalan VCCV sequences. The heterorganic clusters under analysis were [tk], [kt], [tp], [pt], [kp], [pk]. Evidence for gestural overlap between the two adjacent consonants in the cluster is found quite systematically about the closure midpoint and, less so, at C1 onset and at C2 offset. Overlap occurs between constrained and unconstrained lingual regions (e.g., [p] is produced with more alveolar contact than usual when preceded by [t]) and gives rise to blending between tongue front and tongue dorsum activity during the production of lingual clusters [kt] and [tk]. Clusters are equally sensitive to vowel-dependent effects at all moments in time during the closure period. Such effects are quite large for clusters involving lingual [t] or [k] and non-lingual [p] and quite small for clusters made of the two former lingual stop consonants. These data on consonant- and vowel-dependent coarticulatory effects suggest that stop clusters are produced as highly cohesive production units. A constraint for anticipatory vowel-dependent effects to occur at the cluster midpoint but not so at cluster onset can be taken in support of a time-locked model of coarticulation. Speaker-dependent trends were also observed. C1 UNIV AUTONOMA BARCELONA, DEPT FILOL CATALANA, BARCELONA, SPAIN. 
UNIV BARCELONA, FAC PSICOL, DEPT ESTADIST, BARCELONA 7, SPAIN. RP RECASENS, D (reprint author), INST ESTUDIS CATALANS, CEDI, C CARME 47, E-08001 BARCELONA, SPAIN. CR BELLBERTI F, 1981, PHONETICA, V38, P9 BORDEN GJ, 1979, PHONETICA, V36, P21 BROWMAN CP, 1989, STATUS REPORT SPEECH, V97, P1 BROWMAN CP, 1989, STATUS REPORT SPEECH, V99, P69 BUTCHER A, 1989, CLIN LINGUIST PHONET, V3, P39, DOI 10.3109/02699208908985269 Catford John C., 1977, FUNDAMENTAL PROBLEMS FARNETANI E, 1990, NATO ADV SCI I D-BEH, V55, P93 FOWLER CA, 1987, SPEECH COMMUN, V6, P231, DOI 10.1016/0167-6393(87)90028-8 GAY T, 1981, PHONETICA, V38, P148 GUNZBURGER D, 1983, SOUND STRUCTURES STU, P121 Hardcastle W. J., 1979, CURRENT ISSUES PHONE, P531 HARDCASTLE WJ, 1985, SPEECH COMMUN, V4, P247, DOI 10.1016/0167-6393(85)90051-2 HENDERSON JB, 1982, PHONETICA, V39, P71 Kent R. D., 1983, PRODUCTION SPEECH, P57 KENT RD, 1975, BRAIN LANG, V2, P304, DOI 10.1016/S0093-934X(75)80072-1 KOZHEVNIKOV VA, 1965, RECH ARTIKULYATSIYA Kuehn David P., 1976, J PHONETICS, V4, P303 MADDIESON I, 1989, WORKING PAPERS PHONE, V72, P116 MARCHAL A, 1988, SPEECH COMMUN, V7, P287, DOI 10.1016/0167-6393(88)90074-X NIE NH, 1982, STATISTICAL PACKAGE OHMAN SEG, 1966, J ACOUST SOC AM, V39, P151 RECASENS D, 1990, J PHONETICS, V18, P267 SHIBATA S, 1978, ANN B RES I LOGOPEDI, V12, P5 ZSIGA E, 1990, 118TH M ASA SAN DIEG NR 24 TC 10 Z9 10 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD AUG PY 1993 VL 12 IS 4 BP 335 EP 355 DI 10.1016/0167-6393(93)90082-V PG 21 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MC805 UT WOS:A1993MC80500005 ER PT J AU BRUGNARA, F FALAVIGNA, D OMOLOGO, M AF BRUGNARA, F FALAVIGNA, D OMOLOGO, M TI AUTOMATIC SEGMENTATION AND LABELING OF SPEECH-BASED ON HIDDEN MARKOV-MODELS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH SEGMENTATION AND LABELING; HMM (HIDDEN MARKOV MODELS); SPEECH DATABASES; ACOUSTIC PHONETIC UNITS AB Accurate documentation of databases at the phonetic level is very important for speech research; however, manual segmentation and labeling is a time-consuming and error-prone task. This article describes an automatic procedure for the segmentation of speech: given either the linguistic or the phonetic content of a speech utterance, the system provides phone boundaries. The technique is based on the use of an acoustic-phonetic unit Hidden Markov Model (HMM) recognizer: both the recognizer and the segmentation system have been designed exploiting the DARPA-TIMIT acoustic-phonetic continuous speech database of American English. Segmentation and labeling experiments have been conducted in different conditions to check the reliability of the resulting system. Satisfactory results have been obtained, especially when the system is trained with some manually presegmented material. The size of this material is a crucial factor; system performance has been evaluated with respect to this parameter. It turns out that the system provides 88.3% correct boundary location, given a tolerance of 20 ms, when only 256 phonetically balanced sentences are used for its training. RP BRUGNARA, F (reprint author), IST RIC SCI & TECNOL, I-38050 TRENT, ITALY. 
CR BRUGNARA F, 1992, OCT P INT C SPOK LAN, P627 COSI P, 1991, SEP P EUR C SPEECH C, P693 FALAVIGNA D, 1990, P EUROPEAN SIGNAL PR, P1139 LAMEL L, 1986, FEB P DARPA SPEECH R, P100 LEE KF, 1989, IEEE T ACOUST SPEECH, V37, P1641, DOI 10.1109/29.46546 LJOLJE A, 1991, INT CONF ACOUST SPEE, P473, DOI 10.1109/ICASSP.1991.150379 MARZAL A, 1990, SEP P EUR SIGN PROC, P43 Rabiner LR, 1989, P IEEE, V77, P267 STRINGA L, 1990, IRST901211 TECHN REP SVENDSEN T, 1987, IEEE T ACOUST SPEECH, P77 NR 10 TC 55 Z9 57 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD AUG PY 1993 VL 12 IS 4 BP 357 EP 370 DI 10.1016/0167-6393(93)90083-W PG 14 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MC805 UT WOS:A1993MC80500006 ER PT J AU MONAGHAN, AIC AF MONAGHAN, AIC TI THE INTONATION OF TEXTUAL ANOMALIES IN TEXT-TO-SPEECH SO SPEECH COMMUNICATION LA English DT Article DE INTONATION; TEXT-TO-SPEECH; ABBREVIATIONS; PROSODY; SEMANTICS AB This paper examines the intonational characteristics of a number of types of non-word, e.g. numbers, dates, times and other abbreviations, which occur in text and are readily identifiable. These phenomena are often treated as problems for text-to-speech systems, but we take the view that the ease with which they can be identified and the relative predictability of their semantic or pragmatic content makes their intonational behaviour more regular and therefore easier to model than the behaviour of other areas of text. Examples of such phenomena are presented, and heuristics for their treatment by an automatic system are proposed. A formal evaluation of these heuristics is presented, showing a success rate of over 94%. A final discussion outlines the advantages and disadvantages of such a treatment, and suggests lines of future research. 
RP MONAGHAN, AIC (reprint author), UNIV EDINBURGH, DEPT LINGUIST, EDINBURGH EH8 9YL, MIDLOTHIAN, SCOTLAND. CR BARBER S, 1988, 7TH P FASE S ED, V3, P967 Bolinger D., 1986, INTONATION ITS PARTS BOOTH B, 1987, COMPUTATIONAL ANAL E, P97 CARLSON R, 1990, ADV SPEECH HEARING L, V1, P269 Gussenhoven C., 1984, GRAMMAR SEMANTICS SE Kingdon Roger, 1958, GROUNDWORK ENGLISH S Ladd D. R., 1980, STRUCTURE INTONATION Levi Judith N., 1978, SYNTAX SEMANTICS COM MILLIGAN S, 1972, A HITLER MY PART HIS Monaghan A. I. C., 1990, Computer Speech and Language, V4, DOI 10.1016/0885-2308(90)90024-Z Monaghan A. I. C., 1992, TALKING MACHINES THE, P143 Monaghan A. I. C., 1991, THESIS U EDINBURGH MONAGHAN AIC, 1989, 1989 P EUR PAR, V1, P502 MONAGHAN AIC, 1989, SEP P ESCA WORKSH SP MONAGHAN AIC, 1990, SPEECH COMMUN, V9, P305, DOI 10.1016/0167-6393(90)90006-U MONAGHAN AIC, 1988, 7TH P FASE S ED, V3, P1249 Pierrehumbert J, 1980, THESIS MIT SCHNABEL B, 1990, SEP P ESCA WORKSH SP, P121 SELKIRK E., 1984, PHONOLOGY SYNTAX Sproat R. W., 1987, 25th Annual Meeting of the Association for Computational Linguistics. Proceedings of the Conference SPROAT RW, 1986, UNPUB STRESSING ENGL Strawson Peter F., 1959, INDIVIDUALS WOTHKE K, 1990, SEP P ESCA WORKSH SP, P219 NR 23 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD AUG PY 1993 VL 12 IS 4 BP 371 EP 382 DI 10.1016/0167-6393(93)90084-X PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MC805 UT WOS:A1993MC80500007 ER PT J AU LEE, CH LIN, CH AF LEE, CH LIN, CH TI ON THE USE OF A FAMILY OF SIGNAL LIMITERS FOR RECOGNITION OF NOISY SPEECH SO SPEECH COMMUNICATION LA English DT Article DE SIGNAL LIMITER; SPEECH RECOGNITION; INFINITE CLIPPING; 3-LEVEL CENTER CLIPPING; DYNAMIC TIME WARPING AB The performance of a speech recognizer is often degraded by noise. 
Part of the reason for this performance degradation is due to the fact that there is often a strong mismatch between the training and the testing conditions, i.e. the recognition features used in the training case are vastly different from the features used in the testing condition because of the effect of the noise. One way to circumvent this mismatch problem is to use features which are less susceptible to changing noise conditions. In this paper, we propose the use of a family of signal limiters for recognition of noisy speech. The signal limiter, when properly scaled, is equivalent to performing an arcsin transformation on the autocorrelation functions of the original signal. The effect of using the signal limiter as preprocessor is to reduce the variability of the feature vector, so that the mismatch between training and testing conditions in noise is reduced. Testing on a 39-word English alpha-digit vocabulary, in a speaker trained mode, indicates that the recognition performance of a template-based, dynamic time-warping (DTW) recognizer can be significantly improved in noisy conditions when the robust signal limiter is used as a pre-processor to reduce the variability of the features in strong mismatch conditions. RP LEE, CH (reprint author), AT&T BELL LABS, SPEECH RES DEPT, MURRAY HILL, NJ 07974 USA. CR ACERO A, 1990, INT CONF ACOUST SPEE, P849, DOI 10.1109/ICASSP.1990.115971 FAWE AL, 1964, IEEE T AUDIO ELEC AU, V14 FLANAGAN JF, 1988, COMMUNICATION JUANG BH, 1987, IEEE T ACOUST SPEECH, V35, P947 Juang B. 
H., 1991, Computer Speech and Language, V5, DOI 10.1016/0885-2308(91)90011-E LEE CH, 1984, IEEE P INT C ACOUST MANSOUR D, 1989, IEEE T ACOUST SPEECH, V37, P1659, DOI 10.1109/29.46548 MERHAV N, 1993, IEEE T SPEECH AUDIO, V1 NADAS A, 1988, IEEE P INT C ACOUST, P849 Rabiner L.R., 1978, DIGITAL PROCESSING S SOONG FK, 1988, IEEE T ACOUST SPEECH, V36, P871, DOI 10.1109/29.1598 THOMAS JB, 1966, INTRO STATISTICAL CO WILPON JG, 1985, IEEE T ACOUST SPEECH, V33, P587, DOI 10.1109/TASSP.1985.1164581 NR 13 TC 8 Z9 8 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD AUG PY 1993 VL 12 IS 4 BP 383 EP 392 DI 10.1016/0167-6393(93)90085-Y PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MC805 UT WOS:A1993MC80500008 ER PT J AU GRENIE, M JUNQUA, JC AF GRENIE, M JUNQUA, JC TI LATITUDES - WHERE SPEECH MET THE BEACH THE TIME OF A WORKSHOP SO SPEECH COMMUNICATION LA English DT Editorial Material NR 0 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUL PY 1993 VL 12 IS 3 BP 207 EP 209 DI 10.1016/0167-6393(93)90090-8 PG 3 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MC583 UT WOS:A1993MC58300001 ER PT J AU GAGNON, L AF GAGNON, L TI A STATE-BASED NOISE-REDUCTION APPROACH FOR NONSTATIONARY ADDITIVE INTERFERENCE SO SPEECH COMMUNICATION LA English DT Article DE SPEECH ENHANCEMENT; NOISE REDUCTION; NONSTATIONARY; HMM AB In this paper we describe a technique that we developed for enhancing speech signals degraded by additive non-stationary noise. The performance of the technique is evaluated in the context of a speech recognition task on connected digits corrupted by different types of noise representative of military environments. 
The algorithm is based upon spectral amplitude estimation of the speech signal given state-dependent parametric speech and noise models. The spectral analysis is performed by a resonator based frequency interpolation filterbank whose parameters are selected according to the nature of the noise process. The models are ergodic hidden Markov models (HMMs) with Gaussian multivariate distributions trained on noise and speech samples. RP GAGNON, L (reprint author), DEPT NATL DEF, POB 9703, OTTAWA K1G 3Z4, ON, CANADA. CR CUNG HM, 1993, SPEECH COMMUN, V12, P267, DOI 10.1016/0167-6393(93)90098-6 EPHRAIM Y, 1992, P IEEE, V80, P1526, DOI 10.1109/5.168664 GAGNON L, 1991, P INT C ACOUST SPEEC, P981, DOI 10.1109/ICASSP.1991.150505 Mendel J. M., 1987, LESSONS DIGITAL ESTI Varga A. P., 1992, NOISEX92 STUDY EFFEC VARGA AP, 1990, APR P IEEE INT C AC, P845 NR 6 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUL PY 1993 VL 12 IS 3 BP 213 EP 219 DI 10.1016/0167-6393(93)90091-X PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MC583 UT WOS:A1993MC58300002 ER PT J AU DOBLER, S GELLER, D HAEBUMBACH, R MEYER, P NEY, H RUEHL, HW AF DOBLER, S GELLER, D HAEBUMBACH, R MEYER, P NEY, H RUEHL, HW TI DESIGN AND USE OF SPEECH RECOGNITION ALGORITHMS FOR A MOBILE RADIO TELEPHONE SO SPEECH COMMUNICATION LA English DT Article DE ROBUST SPEECH RECOGNITION; CAR ENVIRONMENT; USER INTERFACE AB To decrease the hazards of using mobile phones while driving, voice processing provides several tools that simplify their use: echo cancellation allows comfortable hands-free conversation, feedback and user guidance by voice allow to operate the phone in eyes-busy situations, and last not least speech recognition frees from keypad data entry to operate the telephone. 
A comprehensive view of a device incorporating the above mentioned technologies, which has been realized as an add-on for the Philips car telephone family, will be presented. Emphasis is placed on the speech recognition algorithms. Robustness of the algorithms to changing acoustic environment was improved by estimating and subtracting the long-term spectrum. We will show that, if this operation is done recursively, it is equivalent to the high-pass filtering or RASTA (Relative Spectral Approaches) methods recently proposed in the literature. C1 PHILLIPS KOMMUNIKAT IND AG, W-8500 NURNBERG, GERMANY. RP DOBLER, S (reprint author), PHILLIPS RES LABS AACHEN, POB 1980, W-5100 AACHEN, GERMANY. CR BRIDLE J, 1977, IEEE T ACOUST SPEECH, P656 HERMANSKY H, 1992, P INT C SPOKEN LANGU, P85 Hirsch H.-G., 1991, P EUROSPEECH, P413 MANSOUR D, 1989, IEEE T ACOUST SPEECH, V37, P1659, DOI 10.1109/29.46548 MEYER P, 1991, P EUROPEAN C SPEECH, P809 Ney H., 1982, Speech Communication, V1, DOI 10.1016/0167-6393(82)90033-4 NOLL A, 1989, P IEEE INT C AC SPEE, P679 RUEHL HW, 1991, SPEECH COMMUN, V10, P11, DOI 10.1016/0167-6393(91)90024-N VARY P, 1988, APR INT C AC SPEECH, P227 NR 9 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUL PY 1993 VL 12 IS 3 BP 221 EP 229 DI 10.1016/0167-6393(93)90092-Y PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MC583 UT WOS:A1993MC58300003 ER PT J AU GALES, MJF YOUNG, SJ AF GALES, MJF YOUNG, SJ TI CEPSTRAL PARAMETER COMPENSATION FOR HMM RECOGNITION IN NOISE SO SPEECH COMMUNICATION LA English DT Article DE SPEECH RECOGNITION; NOISE COMPENSATION; AMN; PMC AB This paper describes a method of adapting a continuous density HMM recogniser trained on clean cepstral speech data to make it robust to noise. 
The technique is based on parallel model combination (PMC) in which the parameters of corresponding pairs of speech and noise states are combined to yield a set of compensated parameters. It improves on earlier cepstral mean compensation methods in that it also adapts the variances and as a result can deal with much lower SNRs. The PMC method is evaluated on the NOISEX-92 noise database and shown to work well down to 0 dB SNR and below for both stationary and non-stationary noises. Furthermore, for relatively constant noise conditions, there is no additional computational cost at run-time. RP GALES, MJF (reprint author), UNIV CAMBRIDGE, DEPT ENGN, TRUMPINGTON ST, CAMBRIDGE, ENGLAND. CR BEATTIE VL, 1992, P ICSLP, P519 BEATTIE VL, 1991, MAY INT C ACOUST SPE BERSTEIN A, 1991, P INT C ACOUST SPEEC BOLL SF, 1979, IEEE T ACOUST SPEECH, V27, P113, DOI 10.1109/TASSP.1979.1163209 CARLSON BA, 1991, P IEEE 1991 INT C AC, P921, DOI 10.1109/ICASSP.1991.150490 Chen Y., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) CUNG HM, 1992, NOV P ESCA WORKSH SP, P171 CUNG HM, 1993, SPEECH COMMUN, V12, P267, DOI 10.1016/0167-6393(93)90098-6 FURUI S, 1992, NOV P ESCA WORKSH SP, P31 GALES MJF, 1992, MAR P INT C AC SPEEC, P233 HOMES JN, 1986, P INT C ACOUST SPEEC, P741 Juang B. H., 1991, Computer Speech and Language, V5, DOI 10.1016/0885-2308(91)90011-E KADIRKAMANATHAN, 1992, NOV P ESCA WORKSH SP, P187 KLATT DH, 1979, P ICASSP, P573 LOCKWOOD P, 1992, SPEECH COMMUN, V11, P215, DOI 10.1016/0167-6393(92)90016-Z Mansour D., 1988, P ICASSP 88, P36 MELLOR BA, 1992, AUT P I AC C SPEECH, V14, P361 MOORE RK, 1986, RSRE3931 ROY SIGN RA Paul D. B., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. 
No.87CH2396-0) SORENSEN HBD, 1991, P INT C AC SPEECH SI, P933, DOI 10.1109/ICASSP.1991.150493 VANCOMPERNOLLE D, 1989, P IEEE INT C AC SPEE, P258 VARGA A, 1989, SEP P EUR PAR, P167 Varga A. P., 1992, NOISEX92 STUDY EFFEC YOUNG SJ, 1992, HTK VERSION 1 4 REFE NR 24 TC 76 Z9 79 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUL PY 1993 VL 12 IS 3 BP 231 EP 239 DI 10.1016/0167-6393(93)90093-Z PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MC583 UT WOS:A1993MC58300004 ER PT J AU STEENEKEN, HJM VARGA, A AF STEENEKEN, HJM VARGA, A TI ASSESSMENT FOR AUTOMATIC SPEECH RECOGNITION .1. COMPARISON OF ASSESSMENT METHODS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH RECOGNIZER ASSESSMENT; VOCABULARIES; NOISE AB The use of three types of vocabularies (cockpit-control words, digits, and initial consonants) was compared for the assessment of five speech recognizers. The goal of this study is to compare various assessment methods from application oriented to carefully controlled laboratory situations. It was found that the discrimination between various recognizer (input) conditions is improved for more difficult vocabularies. Confusions between stimuli and responses of testwords can be used as a diagnostic tool for prediction of performance and developments. C1 DRA, SPEECH RES UNIT, MALVERN WR14 3PS, WORCS, ENGLAND. RP STEENEKEN, HJM (reprint author), TNO, INST HUMAN FACTORS, POB 23, 3769 ZG SOESTERBERG, NETHERLANDS. CR MOORE RK, 1991, P WORKSHOP INT COOPE Steeneken H. J. M., 1989, ICASSP-89: 1989 International Conference on Acoustics, Speech and Signal Processing (IEEE Cat. 
No.89CH2673-2), DOI 10.1109/ICASSP.1989.266483 STEENEKEN HJM, 1992, THESIS U AMSTERDAM STEENEKEN HJM, 1991, ESPRIT SAMTNO041 DOC STEENEKEN HJM, 1989, EUROPEAN SPEECH C ES STEENEKEN HJM, 1991, P EUROSPEECH 91 GEN, P529 THORNTON AR, 1978, J SPEECH HEAR RES, V21, P507 VANVELDEN JG, 1992, IZF A26 I PERC REP Varga A. P., 1992, NOISEX92 STUDY EFFEC NR 9 TC 5 Z9 5 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUL PY 1993 VL 12 IS 3 BP 241 EP 246 DI 10.1016/0167-6393(93)90094-2 PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MC583 UT WOS:A1993MC58300005 ER PT J AU VARGA, A STEENEKEN, HJM AF VARGA, A STEENEKEN, HJM TI ASSESSMENT FOR AUTOMATIC SPEECH RECOGNITION .2. NOISEX-92 - A DATABASE AND AN EXPERIMENT TO STUDY THE EFFECT OF ADDITIVE NOISE ON SPEECH RECOGNITION SYSTEMS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH RECOGNIZER ASSESSMENT; CALIBRATED DATA BASE; ADDITIVE NOISE AB The NOISEX-92 experiment and database is described and discussed. NOISEX-92 specifies a carefully controlled experiment on artificially noisy speech data, examining performance for a limited digit recognition task but with a relatively wide range of noises and signal-to-noise ratios. Example recognition results are given. C1 TNO, INST HUMAN FACTORS, 3769 ZG SOESTERBERG, NETHERLANDS. RP VARGA, A (reprint author), DRA, SPEECH RES UNIT, ST ANDREWS RD, MALVERN WR14 3PS, WORCS, ENGLAND. CR STEENEKEN HJM, 1986, IZF198620 TNO I PERC STEENEKEN HJM, 1988, IZF19883 TNO I PERC STEENEKEN HJM, 1993, SPEECH COMMUN, V12, P241, DOI 10.1016/0167-6393(93)90094-2 Varga A. P., 1992, NOISEX92 STUDY EFFEC VARGA AP, 1990, APR P IEEE INT C AC, P845 VARGA AP, 1989, ESCA P EUROSPEECH89 NR 6 TC 351 Z9 369 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUL PY 1993 VL 12 IS 3 BP 247 EP 251 DI 10.1016/0167-6393(93)90095-3 PG 5 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MC583 UT WOS:A1993MC58300006 ER PT J AU TONER, E CAMPBELL, DR AF TONER, E CAMPBELL, DR TI SPEECH ENHANCEMENT USING SUBBAND INTERMITTENT ADAPTION SO SPEECH COMMUNICATION LA English DT Article DE SPEECH ENHANCEMENT; ADAPTIVE PROCESSING; MULTISENSOR; SUBBAND PROCESSING ID TRANSFORM AB A sub-band multisensor structure using intermittent adaption is proposed for speech enhancement. The convergence of the proposed method is compared with conventional LMS and frequency domain LMS and a dramatic increase in convergence rate is shown using both simulated and real data. Preliminary investigation of sub-band filter order is also reported. RP TONER, E (reprint author), UNIV PAISLEY, DEPT ELECT ENGN, HIGH ST, PAISLEY PA1 2BE, RENFREW, SCOTLAND. CR BRACEWELL RN, 1984, P IEEE, V72, P1010, DOI 10.1109/PROC.1984.12968 CAMPBELL DR, 1992, SIGNAL PROCESS, V26, P177, DOI 10.1016/0165-1684(92)90128-J CHENG YM, 1991, IEEE T SIGNAL PROCES, V39, P1943, DOI 10.1109/78.134427 Dabis H. S., 1991, EUROSPEECH 91. 2nd European Conference on Speech Communication and Technology Proceedings DABIS HS, 1990, PROCEEDINGS OF INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, VOLS 1 AND 2, P345 EVANS EF, 1982, SENSES GHITZA O, 1988, P IEEE INT C AC SPEE, P91 GILLOIRE A, 1987, INT C ACOUST SPEECH, P2141 HATTY B, 1990, INT CONF ACOUST SPEE, P1145, DOI 10.1109/ICASSP.1990.116159 LEE JC, 1986, IEEE T ACOUST SPEECH, V34, P499 NARAYAN SS, 1981, P IEEE, V69, P124, DOI 10.1109/PROC.1981.11928 REED FA, 1981, IEEE T CIRCUITS SYST, V28, P610, DOI 10.1109/TCS.1981.1085010 Somayazulu V. S., 1989, ICASSP-89: 1989 International Conference on Acoustics, Speech and Signal Processing (IEEE Cat. 
No.89CH2673-2), DOI 10.1109/ICASSP.1989.266581 SUMMERFIELD Q, 1984, PERCEPT PSYCHOPHYS, V35, P203, DOI 10.3758/BF03205933 TUCKER R, 1992, IEE P I, V139 VANCOMPERNOLLE D, 1989, EUROPEAN C SPEECH TE, V2, P657 Widrow B, 1985, ADAPTIVE SIGNAL PROC ZELINSKI R, 1990, ELECTRON LETT, V26, P2036, DOI 10.1049/el:19901314 NR 18 TC 13 Z9 13 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUL PY 1993 VL 12 IS 3 BP 253 EP 259 DI 10.1016/0167-6393(93)90096-4 PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MC583 UT WOS:A1993MC58300007 ER PT J AU HIRSCH, HG AF HIRSCH, HG TI INTELLIGIBILITY IMPROVEMENT OF NOISY SPEECH FOR PEOPLE WITH COCHLEAR IMPLANTS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH ENHANCEMENT; ELECTRONIC HEARING AID ID SUPPRESSION AB With a cochlear implant, deaf people are able to understand speech under good listening conditions. Problems occur in adverse conditions, e.g. in reverberant and/or noisy environments. Experiments were carried out to improve the intelligibility of noisy speech using two different single-channel noise suppression techniques. This was realized by preprocessing, i.e. by applying the resynthesized speech signal to the cochlear implant system. Intelligibility tests were carried out in cooperation with the medical department. RP HIRSCH, HG (reprint author), RHEIN WESTFAL TH AACHEN, INST NACHRICHTENGERATE & DATENVERARBEITUNG, TEMPLERGRABER 55, D-52056 AACHEN, GERMANY. 
CR ALLEN JB, 1977, J ACOUST SOC AM, V62, P912, DOI 10.1121/1.381621 BOLL SF, 1979, IEEE T ACOUST SPEECH, V27, P113, DOI 10.1109/TASSP.1979.1163209 CLARK G, 1991, J OCTOLARYNG SOC AUS, V5, P354 CLARK GM, 1990, COCHLEAR PROSTHESES DILLIER N, 1992, ORL J OTO-RHINO-LARY, V54, P299 *ESCA, 1992, P WORKSH SPEECH P HERMANSKY H, 1991, 2ND EUR C SPEECH COM, P1367 HIRSCH HG, 1991, 2ND EUR C SPEECH COM, P413 HIRSCH HG, 1992, NATO ASI SERIES F, V75, P101 HOUTGAST T, 1980, ACUSTICA, V46, P60 LIM JS, 1983, SPEECH ENHANCEMENT MARTIN R, 1992, 1992 P DIG SIGN PROC VARY P, 1985, SIGNAL PROCESS, V8, P387, DOI 10.1016/0165-1684(85)90002-7 1989, MINI SYSTEM 22 AUDIO NR 14 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUL PY 1993 VL 12 IS 3 BP 261 EP 266 DI 10.1016/0167-6393(93)90097-5 PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MC583 UT WOS:A1993MC58300008 ER PT J AU CUNG, HM NORMANDIN, Y AF CUNG, HM NORMANDIN, Y TI NOISE ADAPTATION ALGORITHMS FOR ROBUST SPEECH RECOGNITION SO SPEECH COMMUNICATION LA English DT Article DE ROBUST SPEECH RECOGNITION; NOISE ADAPTATION; VQ-BASED FEATURE MAPPING; HIERARCHICAL ADAPTATION; FUZZY CLUSTERING AB This paper proposes three noise adaptation algorithms which allow improvements in the performance of speech recognition systems under noisy conditions. They are VQ-based feature mapping techniques which hierarchically transform noisy feature vectors into clean feature vectors. The first algorithm was originally used for unsupervised speaker adaptation. It is based on hard clustering and iteratively adapts the noisy input data to a small set of codebooks created from clean data. The second algorithm is a modified version of the first one. It redefines the mapping function using the notion of cluster scope. 
The last algorithm proposes a fuzzy clustering technique as a substitute to the original hard clustering technique. In the NATO digit task, these algorithms significantly improve the performance of CRIM's speech recognition system. RP CUNG, HM (reprint author), CTR RECH INFORMAT MONTREAL, 1801 AVE MCGILL COLL, BUR 800, MONTREAL H3A 2N4, PQ, CANADA. CR Bezdek J.C., 1981, PATTERN RECOGNITION CARDIN R, 1991, INT CONF ACOUST SPEE, P533, DOI 10.1109/ICASSP.1991.150394 CUNG HM, 1992, NOISE ADAPTATION ALG Furui S., 1989, ICASSP-89: 1989 International Conference on Acoustics, Speech and Signal Processing (IEEE Cat. No.89CH2673-2), DOI 10.1109/ICASSP.1989.266421 MATSUI T, 1991, INT CONF ACOUST SPEE, P377, DOI 10.1109/ICASSP.1991.150355 MATSUMOTO H, 1988, 2ND JOINT M ACOUST S RABINER LR, 1985, AT T TECHN J, V64 SHIRAKI Y, 1990, INT CONF ACOUST SPEE, P657, DOI 10.1109/ICASSP.1990.115829 STEENEKEN HJM, 1991, SAMTNO041 ESPR SAM D NR 9 TC 7 Z9 7 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUL PY 1993 VL 12 IS 3 BP 267 EP 276 DI 10.1016/0167-6393(93)90098-6 PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MC583 UT WOS:A1993MC58300009 ER PT J AU ALEXANDRE, P LOCKWOOD, P AF ALEXANDRE, P LOCKWOOD, P TI ROOT CEPSTRAL ANALYSIS - A UNIFIED VIEW - APPLICATION TO SPEECH PROCESSING IN CAR NOISE ENVIRONMENTS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH RECOGNITION IN NOISE; ROOT CEPSTRAL ANALYSIS; NONLINEAR SPECTRAL SUBTRACTION (NSS); HIDDEN MARKOV MODEL (HMM); LINEAR PREDICTION AB The performance of speech recognition systems is significantly degraded in the presence of noise. To solve the noise problem, there is a need to reconsider standard approaches by taking into account this new constraint. 
We first envisage two well-known cepstral representations (parametric and non-parametric) of speech signals and propose a unifying view of both schemes. We introduce a pseudo-autocorrelation domain, which can be interpreted as a ''Root-cepstral domain'', and we show how non-parametric cepstral and linear predictive analyses converge to the same optimal solution. Experiments are carried out using an HMM-based isolated word recogniser for speaker-dependent and speaker-independent tasks in car noise environments. RP ALEXANDRE, P (reprint author), MATRA COMMUNICAT, RUE J P TIMBAUD, BP 26, F-78392 BOIS DARCY, FRANCE. CR ALEXANDRE P, 1993, 1993 IEEE P INT C AC ATAL BS, 1974, J ACOST SOC AM, V55 BATEMAN DC, 1992, MAR IEEE P INT C AC CHILDERS DG, 1977, OCT P IEEE, V65 COHEN TJ, 1970, GEOPHYS J RAY ASTR S, V20 DAVIS SB, 1980, IEEE T ACOUST SPEECH, V28 DURBIN J, 1960, REV I INT STATIST, V28 FURUI S, 1986, IEEE T ACOUST SPEECH, V34 HERMANSKY H, 1985, SPEECH COMMUN, V4, P181, DOI 10.1016/0167-6393(85)90045-7 HIRSCH HG, 1991, 1991 P EUROSPEECH HUNT MJ, 1987, 1987 IEEE P INT C AC IMAI S, 1988, 1988 P EURASIP KAY SM, 1979, 1979 IEEE T AC SPEEC, V27 KOBAYASHI T, 1984, 1984 IEEE T AC SPEEC, V32 LECOMTE I, 1989, 1989 IEEE P INT C AC LIM JS, 1979, IEEE T ACOUST SPEECH, V27 LOCKWOOD P, 1992, MAR IEEE P INT C AC LOCKWOOD P, 1992, SPEECH COMMUN, V11, P215, DOI 10.1016/0167-6393(92)90016-Z MAKHOUL J, 1975, APR P IEEE, V63 MANSOUR D, 1989, IEEE T ACOUST SPEECH, V37 OPPENHEIM AV, 1968, IEEE T AUDIO ELECTRO, V16 TOKUDA K, 1990, 1990 P ICSLP KOB NR 22 TC 32 Z9 34 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUL PY 1993 VL 12 IS 3 BP 277 EP 288 DI 10.1016/0167-6393(93)90099-7 PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MC583 UT WOS:A1993MC58300010 ER PT J AU DIMOLITSAS, S SHERIF, M SOUTH, C ROSENBERGER, JR AF DIMOLITSAS, S SHERIF, M SOUTH, C ROSENBERGER, JR TI THE CCITT 16 KBIT/S SPEECH CODING RECOMMENDATION G728 SO SPEECH COMMUNICATION LA English DT Editorial Material C1 AT&T BELL LABS, HOLMDEL, NJ 07733 USA. BT LABS, IPSWICH, SUFFOLK, ENGLAND. BELLCORE, RED BANK, NJ USA. RP DIMOLITSAS, S (reprint author), COMSAT LABS, CLARKSBURG, MD 20871 USA. CR CHEN JH, 1991, ADV SPEECH CODING, P25 IYENGAR V, 1988, APR P IEEE INT C AC, P243 1987, NETWORKING ASPECTS 1 1989, DESCRIPTION 16 KBIT 1987, SOME NETWORKING ASPE 1989, 2ND M AD HOC GROUP 1 1989, MAR M AD HOC GROUP 1 1989, AT T LOW DELAY CODE 1989, 3RD M AD HOC GROUP 1 1988, DEC M AD HOC GROUP 1 1991, SEP WHISTL MOUNT M A NR 11 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1993 VL 12 IS 2 BP 97 EP 100 PG 4 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MA115 UT WOS:A1993MA11500001 ER PT J AU CHEN, JH COX, RV AF CHEN, JH COX, RV TI THE CREATION AND EVOLUTION OF 16 KBIT/S LD-CELP - FROM CONCEPT TO STANDARD SO SPEECH COMMUNICATION LA English DT Article DE SPEECH CODING; VOICE COMPRESSION; LOW-DELAY CODING; G.728; LD-CELP; LINEAR PREDICTION; GAIN ADAPTATION; VECTOR QUANTIZATION ID CORRELATION-COEFFICIENTS; VECTOR QUANTIZATION; SPEECH AB The creation and evolution of the 16 kbit/s Low-Delay CELP (LD-CELP) speech coding algorithm (CCITT Recommendation G.728) represents an intensive research effort from 1988 to 1992. In this paper, we give a historical overview of this four-year effort, with emphasis on discussions of technical merits of many alternative algorithmic techniques we investigated. 
As a part of such discussions, we explain why we put some of these techniques in the final G.728 coder and left the others out. It is hoped that this paper illustrates how the G.728 algorithm was created out of very simple initial concepts, and then modified and improved little by little under real-time implementation constraints, until the final algorithm fully met the performance requirements that were seemingly impossible at the outset. RP CHEN, JH (reprint author), AT&T BELL LABS, MURRAY HILL, NJ 07974 USA. CR ATAL BS, 1979, IEEE T ACOUST SPEECH, V27, P247, DOI 10.1109/TASSP.1979.1163237 BARNWELL TP, 1981, IEEE T ACOUST SPEECH, V29, P1062, DOI 10.1109/TASSP.1981.1163683 CHEN J, 1991, MAY P ICASSP91 TOR, P21 CHEN JH, 1990, APR P IEEE INT C AC, P453 CHEN JH, 1990, APR P IEEE INT C AC, P181 CHEN JH, 1992, MAR P IEEE INT C AC, pI69 CHEN JH, 1989, NOV P IEEE GLOB COMM, P1237 CHEN JH, 1987, IEEE T COMMUN, V35, P918 CHEN JH, 1992, JUN IEEE J SEL AREAS, P830 CHEN JH, 1985, JUN P IEEE INT C COM, P1456 CUPERMAN V, 1993, SPEECH COMMUN, V12, P193, DOI 10.1016/S0167-6393(05)80011-1 CUPERMAN V, 1989, NOV C REC IEEE GLOB, P1242 DEMARCA JRB, 1987, JUN P IEEE INT COMM, P1128 Honig Michael L., 1984, ADAPTIVE FILTERS STR IYENGAR V, 1988, APR P IEEE INT C AC, P243 Jayant N. 
S., 1984, DIGITAL CODING WAVEF KETCHUM RH, 1990, COMMUNICATION LEE CH, 1988, IEEE T ACOUST SPEECH, V36, P642, DOI 10.1109/29.1574 LEROUX J, 1977, IEEE T ACOUST SPEECH, V25, P257, DOI 10.1109/TASSP.1977.1162944 LINDE Y, 1980, IEEE T COMMUN, V28, P84, DOI 10.1109/TCOM.1980.1094577 MAKHOUL JI, 1981, IEEE T ACOUST SPEECH, V29, P654, DOI 10.1109/TASSP.1981.1163566 TOHKURA Y, 1978, IEEE T ACOUST SPEECH, V26, P587, DOI 10.1109/TASSP.1978.1163165 ZEGER K, 1990, IEEE T COMMUN, V38, P2147, DOI 10.1109/26.64657 ZEGER KA, 1987, ELECTRON LETT, V23, P654, DOI 10.1049/el:19870468 1988, TERMS REFERENCE AD H NR 25 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1993 VL 12 IS 2 BP 103 EP 111 DI 10.1016/S0167-6393(05)80003-2 PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MA115 UT WOS:A1993MA11500002 ER PT J AU SOUTH, CR RUGELBAK, J USAI, P KITAWAKI, N IRII, H ROSENBERGER, J CAVANAUGH, JR ADESANYA, CA PASCAL, D GLEISS, N BARNES, GJ AF SOUTH, CR RUGELBAK, J USAI, P KITAWAKI, N IRII, H ROSENBERGER, J CAVANAUGH, JR ADESANYA, CA PASCAL, D GLEISS, N BARNES, GJ TI SUBJECTIVE PERFORMANCE ASSESSMENT OF CCITTS 16 KBIT/S SPEECH CODING ALGORITHM SO SPEECH COMMUNICATION LA English DT Article DE SUBJECTIVE MEASUREMENT; SPEECH CODING; QUANTIZING DISTORTION AB An internationally coordinated series of subjective listening experiments was conducted on the 16 kbit/s, LD-CELP algorithm, before its adoption as a CCITT Recommendation (G.728). In this paper we describe the measurement methods employed and report on results obtained of the performance of the algorithm, under realistic network conditions. C1 NTA, N-2007 KJELLER, NORWAY. CSELT SPA, I-10148 TURIN, ITALY. NIPPON TELEGRAPH & TEL PUBL CORP, TOKYO 180, JAPAN. BELLCORE, RED BANK, NJ USA. CNET, F-22300 LANNION, FRANCE. TELIA RES, S-13680 HANINGE, SWEDEN. 
BNR EUROPE, HARLOW, ESSEX, ENGLAND. RP SOUTH, CR (reprint author), BT LABS, IPSWICH, SUFFOLK, ENGLAND. CR CAVANAUGH JR, 1980, JUN IEEE INT C COMM *CCIR, 1986, CCIR GREEN BOOK 1, V11 *CCITT, 1992, HDB TEL *CCITT, 1992, METH SUBJ DET TRANSM, P80 *CCITT, 1988, BLUE BOOK, V5, P48 *CCITT, 1988, BLUE BOOK, V5, P81 CHEN J, 1989, IEEE P GLOBECOM 89 D, P1237, DOI 10.1109/GLOCOM.1989.64152 COLEMAN A, 1989, IEEE P GLOBECOM 89 D, P1075, DOI 10.1109/GLOCOM.1989.64123 COLEMAN AE, 1988, SPEECH COMMUN, V7, P151, DOI 10.1016/0167-6393(88)90036-2 CUPERMAN V, 1989, IEEE P GLOBECOM 89 D, P1242, DOI 10.1109/GLOCOM.1989.64153 DIMOLITSAS FL, 1993, SPEECH COMMUNICAITON, V12, P145 DIMOLITSAS S, 1991, IEE P I, V138 Finney D. J., 1962, PROBIT ANAL, V2nd Kirk RE, 1982, EXPT DESIGN PROCEDUR MODENA G, 1986, IEEE P GLOBECOM 86 H, P599 WILLIAMS G, 1984, IEEE P GLOBECOM 84, V2, P778 CCITT SG XII SQ1089R CCITT SG XII SQ2489 CCITT SQ7691 CCITT SG XII SQ1291 CCIT SG XII SO2189 CCITT SG XII SQ2289 CCITT SERIES P RE S3, V5 NR 23 TC 6 Z9 6 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1993 VL 12 IS 2 BP 113 EP 133 DI 10.1016/S0167-6393(05)80004-4 PG 21 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MA115 UT WOS:A1993MA11500003 ER PT J AU DIMOLITSAS, S AF DIMOLITSAS, S TI CHARACTERIZATION OF LOW-RATE DIGITAL VOICE CODER PERFORMANCE WITH NONVOICE SIGNALS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH CODING; VOICE COMPRESSION; G728; LD-CELP; SIGNALING; VOICEBAND DATA; TRANSMISSION QUALITY AB This paper describes the test methodology employed for the performance evaluation of the CCITT 16 kbit/s LD-CELP coder (Recommendation G.728) with non-voice signals. This methodology is sufficiently general that it can be employed for the evaluation of other low transmission-rate digital speech coders. 
The types of non-voice signals considered in this paper include voiceband data, network signaling, circuit continuity tones and dual-tone multi-frequency signaling. RP DIMOLITSAS, S (reprint author), COMSAT LABS, 22300 COMSAT DR, CLARKSBURG, MD 20871 USA. CR *CCITT, 1984, RED BOOK, V3, P125 *CCITT, 1984, RED BOOK, V8, P117 *CCITT, 1972, GREEN BOOK, V4, P530 *CCITT, 1984, RED BOOK, V6, P55 DIMOLITSAS S, 1987, COMSAT TECH REV, V17, P323 Dimolitsas S., 1991, COMSAT Technical Review, V21 DIMOLITSAS S, 1993, SPEECH COMMUN, V12, P145, DOI 10.1016/S0167-6393(05)80006-8 SOUTH CR, 1993, SPEECH COMMUN, V12, P113, DOI 10.1016/S0167-6393(05)80004-4 1975, 41009 BELL SYST TECH NR 9 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1993 VL 12 IS 2 BP 135 EP 144 DI 10.1016/S0167-6393(05)80005-6 PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MA115 UT WOS:A1993MA11500004 ER PT J AU DIMOLITSAS, S CORCORAN, FL BARANIECKI, M PHIPPS, JG AF DIMOLITSAS, S CORCORAN, FL BARANIECKI, M PHIPPS, JG TI NONVOICE PERFORMANCE OF THE 16 KBIT/S LD-CELP ALGORITHM SO SPEECH COMMUNICATION LA English DT Article DE SPEECH CODING; VOICE COMPRESSION; G728; LD-CELP; SIGNALING; VOICEBAND DATA; TRANSMISSION QUALITY AB Recently, CCITT adopted a Low-Delay Code Excited Linear Predictive (LD-CELP) algorithm as a new recommendation for the coding of telephone bandwidth speech at 16 kbit/s with ''toll quality''. In this paper, the performance of the LD-CELP algorithm with non-voice signals is presented and compared with 64 kbit/s PCM. From these results it can be determined that the LD-CELP, in a single encoding/decoding configuration, is acceptably transparent to network signaling as well as voiceband data at rates not exceeding 2.4 kbit/s. C1 INTELSAT, WASHINGTON, DC 20008 USA. 
RP DIMOLITSAS, S (reprint author), COMSAT LABS, 22300 COMSAT DR, CLARKSBURG, MD 20871 USA. CR *CCITT REC, 1991, DIG CIRC MULT EQ US *CCITT REC, 1984, RED BOOOK, V3, P85 CHEN JH, 1992, IEEE J SEL AREA COMM, V10, P830, DOI 10.1109/49.138988 DIMOLITSAS S, 1993, SPEECH COMMUN, V12, P97 DIMOLITSAS S, 1993, SPEECH COMMUN, V12, P135, DOI 10.1016/S0167-6393(05)80005-6 1989, BLUE BOOK, V8 1989, BLUE BOOK, V3 1992, G728 REC NR 8 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1993 VL 12 IS 2 BP 145 EP 150 DI 10.1016/S0167-6393(05)80006-8 PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MA115 UT WOS:A1993MA11500005 ER PT J AU IRII, H KUBICHEK, R ATKINSON, D AF IRII, H KUBICHEK, R ATKINSON, D TI OBJECTIVE ASSESSMENT OF 16 KBIT/S LD-CELP SPEECH QUALITY SO SPEECH COMMUNICATION LA English DT Article DE OBJECTIVE QUALITY; SPEECH QUALITY; COMMUNICATION PERFORMANCE; SPEECH CODING; QUALITY ASSESSMENT; CEPSTRAL DISTANCE; INFORMATION INDEX AB Objective assessment techniques are computer-based algorithms to automatically assess speech quality without requiring human listeners. When compared with listener-based tests, objective systems offer the potential advantages of low cost, quick turn-around and repeatable results. This paper briefly outlines four candidate objective techniques being considered by CCITT, and discusses their relative performance as determined during the CCITT 16 kbit/s LD-CELP tests. The results showed that none of the methods is sufficiently accurate or reliable to replace human listener panels in all applications. However, objective measures may play an important role in future large-scale subjective evaluations as a means of obtaining coarse preliminary quality estimates for test design and validation. C1 UNIV WYOMING, DEPT ELECT ENGN, LARAMIE, WY 82071 USA. 
US DEPT COMMERCE, NATL TELECOMMUN & INFORMAT ADM, BOULDER, CO 80303 USA. RP IRII, H (reprint author), NIPPON TELEGRAPH & TEL PUBL CORP, MUSASHINO ELECT COMMUN LAB, 3-9-11 MIDORICHO, MUSASHINO, TOKYO 180, JAPAN. CR *BELL NO RES, 1983, OBJ EV NONL DIST EFF KITAWAKI N, 1988, IEEE J SEL AREA COMM, V6, P242, DOI 10.1109/49.601 KUBICHEK R, 1981, 1991 IEEE GLOB TEL C Kubichek R. F., 1991, Digital Signal Processing, V1, DOI 10.1016/1051-2004(91)90094-2 LALOU J, 1990, ANN TELECOMMUN, V45, P47 RICHARDS D, P IEEE, V121, P313 SOUTH CR, 1993, SPEECH COMMUN, V12, P113, DOI 10.1016/S0167-6393(05)80004-4 NR 7 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1993 VL 12 IS 2 BP 151 EP 155 DI 10.1016/S0167-6393(05)80007-X PG 5 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MA115 UT WOS:A1993MA11500006 ER PT J AU JOHANSEN, FT AF JOHANSEN, FT TI A NON-BIT-EXACT APPROACH FOR IMPLEMENTATION VERIFICATION OF THE CCITT LD-CELP SPEECH CODER SO SPEECH COMMUNICATION LA English DT Article DE SPEECH CODING; CCITT REC G728; VERIFICATION TESTING AB This paper presents the design methodology behind the floating point verification procedure for the Low-Delay Code-Excited Linear Prediction speech coder, recently selected by the CCITT as Recommendation G.728. This procedure is based on a non bit-exact specification, which is different from previous CCITT speech coder verification procedures. This approach gives additional freedom for the implementor of the algorithm, and will allow for more efficient implementations on various kinds of hardware. However, this flexibility also means that different implementations will respond slightly differently to verification test signals. To cope with this, explicit objective measurements of such deviations are used in the verification process. 
These measurements are simple weighted and unweighted signal-to-noise ratios (WSNR and SNR). In addition to the objective measurements, certain restrictions had to be placed on the test sequence design. In spite of the input restrictions, a set of test sequences giving reasonably good coverage of the LD-CELP algorithm and state space has been found. Evaluation experiments are reported, showing that these sequences have a satisfactory error detecting capability. A final discussion concludes that the chosen verification approach is indeed feasible as an implementor's tool. RP JOHANSEN, FT (reprint author), NORWEGIAN TELECOM RES, N-2007 KJELLER, NORWAY. CR BANERJEE S, 1987, P INT C COMMUNICATIO, P1487 CAMPBELL JP, 1991, ADV SPEECH CODING *CCITT, 1988, BLUE BOOK, V3 CHEN JH, 1992, IEEE J SEL AREAS COM, V10 *EIA, 1990, IS54 TECH RES *EIA TIA, 1992, I585 TECH REP JOHANSEN FT, 1992, NORWEGIAN TELECOM RE JOHANSEN FT, 1992, P GLOBECOM, P1714 *NATO, 1984, STAND AGR Tremain T.E., 1982, SPEECH TECHNOLOG APR, P40 1992, COLLECTIVE LETT CCIT 1988, ANNEX 1 QUESTION 21 1992, G728 DRAFT REC NR 13 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1993 VL 12 IS 2 BP 157 EP 169 DI 10.1016/S0167-6393(05)80008-1 PG 13 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MA115 UT WOS:A1993MA11500007 ER PT J AU CORCORAN, F IRII, H ROSENBERGER, JR AF CORCORAN, F IRII, H ROSENBERGER, JR TI HOST LABORATORY AND DATA INTERCHANGES SO SPEECH COMMUNICATION LA English DT Article AB COMSAT Laboratories acted as the processing host laboratory for Phases 1 and 2 of the CCITT Study Group XV 16 kbit/s speech coder evaluation program. 
The host laboratory was responsible for all preparations needed to allow the participating organizations to conduct subjective speech quality tests on the floating point hardware version of the 16 kbit/s LD-CELP coding algorithm. COMSAT was assisted by INTELSAT who provided funding, British Telecom and CPqD/Telebras (Brazilian Telecom Administration) who supplied personnel, and by AT&T, Bellcore, National Institute of Science and Technology (NIST, a part of the US Department of Commerce), Nippon Telegraph & Telephone (NTT), and Philips Communication Industry AG (PKI) which supplied hardware/software. Additionally, the host laboratory worked with 12 laboratories (see Table 1) which provided source speech material in 9 different languages and conducted the subjective and objective tests. These efforts spanned the period between October 1989 and July 1990 for Phase 1 and from February to mid-April 1991 for Phase 2. The purpose of this paper is to describe the host laboratory functions that COMSAT performed. We also describe the Write Once and Read Many times (WORM) based data interchange system, defined by Bellcore and COMSAT, that was used to exchange data efficiently (a total of about 16,000 Mbytes) between participating laboratories. Finally, we describe the Common Analog Interface, built by NTT, which was used to ensure that all tested codecs received equal treatment. C1 NIPPON TELEGRAPH & TEL PUBL CORP, TELECOMMUN NETWORKS LABS, TOKYO 180, JAPAN. CR *CCITT, 1991, TEST SPEC COMM AN IN *CCITT, 1989, PROGR REP NEW DAT IN *CCITT, 1990, DIG INT METH *CCITT, CCITT BLUE BOOK, V5, P198 *CCITT, 1991, REP ACT 16 KBITS SUB *CCITT, 1984, CCITT RED BOOK, V3, P125 *CCITT, 1988, CCITT RED BOOK, V3, P269 SOUTH CR, 1993, SPEECH COMMUN, V12, P113, DOI 10.1016/S0167-6393(05)80004-4 NR 8 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUN PY 1993 VL 12 IS 2 BP 171 EP 181 DI 10.1016/S0167-6393(05)80009-3 PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MA115 UT WOS:A1993MA11500008 ER PT J AU BEERY, Y AF BEERY, Y TI VARIABLE BIT-RATE METHODS FOR LOW-DELAY SPEECH CODERS SO SPEECH COMMUNICATION LA English DT Article DE VARIABLE-BIT-RATE; VBR; LD-CELP; CELP; LATTICE CODEBOOK; SPEECH CODING; PARITY CODE AB Several efficient Variable-Bit-Rate (VBR) methods suitable for low-delay Delayed-Decision Tree-Code (DDTC) and CELP speech coders are presented in this paper. These methods are based on modifying the existing codebook(s) by means of block or lattice codes. To achieve a reduced rate operation of the DDTC coder a novel technique based on a parity code is developed and up to 2 dB SNR improvement over conventional methods is demonstrated. For extended rate operation of the LD-CELP coder an approach based on a lattice code is used. The modified or additional lattice codebook has a geometrical (algebraic) structure, allowing the search procedure to be performed efficiently and with minimal additional memory. Simulation results of these VBR methods for a 16 kbit/s DDTC coder and for the 16 kbit/s LD-CELP coder proposed for the CCITT recommendation (G.728) are presented. RP BEERY, Y (reprint author), TEL AVIV UNIV, DEPT ELECT ENGN SYST, IL-69978 TEL AVIV, ISRAEL. RI Be'ery, Yair/K-3568-2012 CR Adoul J., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) ANDERSON JB, 1975, IEEE T INFORM THEORY, V21 BEERY Y, 1990, ADV SPEECH CODING, P37 BERGSTROM A, 1989, MAY P INT C GLASG, P53 BLUM M, 1989, THESIS U TECHNOLOGY CELLARIO L, 1989, MAY P INT C AC SPEEC, P73 CHEN JH, 1992, IEEE J SEL AREA COMM, V10, P830, DOI 10.1109/49.138988 *CONS SPEECH COD, 1989, TIY12 89055 INF DOC Conway J. 
H., 1988, SPHERE PACKINGS LATT CUPERMAN V, 1990, ADV SPEECH CODING, P13 ISRAEL, 1988, CCITT STUDY GROUP 18 IYENGAR V, 1988, APR P IEEE INT C AC, P243 JAYANT NS, 1978, IEEE T COMM, V26 SCHROEDER MR, 1987, IEEE T INFORM THEORY, V33, P144, DOI 10.1109/TIT.1987.1057260 SVENDSEN T, 1984, APR P INT C AC SPEEC 1989, CCITT15 STUD GROUP NR 16 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1993 VL 12 IS 2 BP 183 EP 192 DI 10.1016/S0167-6393(05)80010-X PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MA115 UT WOS:A1993MA11500009 ER PT J AU CUPERMAN, V GERSHO, A AF CUPERMAN, V GERSHO, A TI LOW-DELAY SPEECH CODING SO SPEECH COMMUNICATION LA English DT Article DE SPEECH CODING; LOW DELAY; BACKWARD PREDICTION; DIGITAL VOICE AB High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms, such as Low-Delay CELP (the new CCITT 16 kbit/s standard), Low-Delay Vector Excitation Coding (LD-VXC), and backward adaptive tree/trellis codecs. The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s. A number of approaches for improving the speech quality at 8 kbit/s are discussed. Backward pitch prediction is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook. Finally, robustness to transmission errors is discussed and a number of trade-offs for reducing transmission error sensitivity are presented. C1 UNIV CALIF SANTA BARBARA, CTR INFORMAT PROC RES, DEPT ELECT & COMP ENGN, SANTA BARBARA, CA 93106 USA. RP CUPERMAN, V (reprint author), SIMON FRASER UNIV, SCH ENGN SCI, COMMUN SCI LAB, BURNABY V5A 1S6, BC, CANADA. 
CR Atal B.S., 1984, P INT C COMM AMST, P1610 CHEN JH, 1990, APR P IEEE INT C AC, P453 CHEN JH, 1992, MAR P IEEE INT C AC, pI69 CHEN JH, 1993, SPEECH COMMUN, V12, P103, DOI 10.1016/S0167-6393(05)80003-2 CHEN JH, 1992, IEEE J SEL AREA COMM, V10, P830, DOI 10.1109/49.138988 CHEN JH, 1990, Patent No. 4969867 CHEN JH, 1989, NOV P IEEE GLOB COMM, P1237 CHEN JH, 1987, IEEE T COMMUN, V35, P918 CHEN JH, 1991, DEC IEEE GLOB COMM C, P1894 CHEN JH, 1987, APR P IEEE INT C AC, V4, P2185 CHEN JH, ADV SPEECH CODING, P25 CUPERMAN V, 1990, Patent No. 4893034 CUPERMAN V, 1991, 25TH AS C SIGN SYST, P935 CUPERMAN V, 1992, IEEE T COMMUN, V40, P129, DOI 10.1109/26.126714 CUPERMAN V, 1985, IEEE T COMMUN, V33, P685, DOI 10.1109/TCOM.1985.1096372 CUPERMAN V, 1991, IEE PROC-I, V138, P338 CUPERMAN V, 1989, P IEEE GLOBAL COMMUN CUPERMAN V, ADV SPEECH CODING, P13 DAVIDSON G, 1987, APR P IEEE INT C AC, V4, P2189 DAVIDSON G, 1986, IEEE T ACOUST SPEECH, V4, P3055 FOODEEI M, 1991, APR P IEEE INT C AC, P25 GIBSON JD, 1980, P IEEE, V68, P488, DOI 10.1109/PROC.1980.11676 Gibson J. D., 1991, Advances in Speech Coding GOODMAN DJ, 1975, IEEE T COMMUN NOV, P1362 HUSSAIN A, 1993, IN PRESS SPEECH AUDI IYENGAR V, 1991, IEEE T SIGNAL PROCES, V39, P1049, DOI 10.1109/78.80962 Iyengar V., 1988, ICASSP 88: 1988 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.88CH2561-9), DOI 10.1109/ICASSP.1988.196560 JAYANT NS, 1973, AT&T TECH J, P1119 KATAOKA A, 1991, DEC C REC IEEE GLOB, P1889 Marcellin M. 
W., 1991, Advances in Speech Coding PENG R, 1991, INT CONF ACOUST SPEE, P29, DOI 10.1109/ICASSP.1991.150271 PENG R, 1990, DEC P IEEE GLOB COMM, P951 PETTIGREW R, 1989, NOV P IEEE GLOB COMM, P57 RAMAMOORTHY V, 1984, AT&T TECH J, V63, P1465 WATTS L, 1988, P IEEE GLOBAL COMMUN, P275 WOO HC, 1991, DEC P IEEE GLOB COMM, P1884 YAO JH, 1991, DEC P IEEE GLOB COMM, P695 YAO JH, 1992, MAR P IEEE INT C AC, P45 YATSUZUKA Y, 1986, APR P IEEE INT C AC, P3071 NR 39 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1993 VL 12 IS 2 BP 193 EP 204 DI 10.1016/S0167-6393(05)80011-1 PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA MA115 UT WOS:A1993MA11500010 ER PT J AU VANBERGEM, DR AF VANBERGEM, DR TI ACOUSTIC VOWEL REDUCTION AS A FUNCTION OF SENTENCE ACCENT, WORD STRESS, AND WORD CLASS SO SPEECH COMMUNICATION LA English DT Article DE ACOUSTIC VOWEL REDUCTION; LEXICAL VOWEL REDUCTION; SENTENCE ACCENT; WORD STRESS; WORD CLASS; CENTRALIZATION; INCREASED CONTEXTUAL ASSIMILATION ID SPEAKING RATE; DUTCH VOWELS AB The effect of sentence accent, word stress, and word class (function words versus content words) on the acoustic properties of 9 Dutch vowels in fluent speech was investigated. A list of sentences was read aloud by 15 male speakers. Each sentence contained one syllable of interest. This could be a monosyllabic function word, an unstressed syllable of a content word, or a stressed syllable of a content word. The same syllable occurred in all three conditions. Sentence accent was manipulated with questions that preceded the sentences. A total number of 3465 vowels were segmented from the syllables and analysed. It was found that all three factors mentioned above had a significant effect both on the steady-state formant frequencies (F1 and F2) and on the duration of the vowels. 
Word stress and word class had a stronger effect on the vowels than sentence accent. A listening experiment showed the perceptual significance of the acoustic measurements. It appeared that spectral vowel reduction could be better interpreted as the result of an increased contextual assimilation than as the tendency to centralize. We also studied changes in the dynamics of the formant tracks due to the experimental conditions. It was found that formant tracks of reduced vowels became flatter which supports the view of an increased contextual assimilation. Three simple models of vowel reduction are discussed. RP VANBERGEM, DR (reprint author), UNIV AMSTERDAM, INST PHONET SCI, HERENGRACHT 338, 1016 CG AMSTERDAM, NETHERLANDS. CR Altman G., 1989, Computer Speech and Language, V3, DOI 10.1016/0885-2308(89)90022-3 BAART J, 1987, THESIS U LEIDEN BOLINGER D, 1972, LANGUAGE, V48, P633, DOI 10.2307/412039 BOLINGER D., 1975, ASPECTS LANGUAGE BOLINGER D, 1985, J LINGUIST, V21, P79, DOI 10.1017/S0022226700010033 BOOIJ GE, 1981, GENERATTIEVE FONOLOG Carter D. M., 1987, Computer Speech and Language, V2, DOI 10.1016/0885-2308(87)90023-4 Collier R., 1990, PERCEPTUAL STUDY INT Daniloff R. G., 1973, J PHONETICS, V1, P239 DELATTRE P, 1969, IRAL-INT REV APPL LI, V7, P295, DOI 10.1515/iral.1969.7.4.295 DRULLMAN R, 1991, J ACOUST SOC AM, V90, P1766, DOI 10.1121/1.401657 DUEZ D, 1991, P ESCA WORKSHOP PHON ENGSTRAND O, 1988, J ACOUST SOC AM, V83, P1863, DOI 10.1121/1.396522 Fant G., 1960, ACOUSTIC THEORY SPEE FOURAKIS M, 1991, J ACOUST SOC AM, V90, P1816, DOI 10.1121/1.401662 GAY T, 1978, J ACOUST SOC AM, V63, P223, DOI 10.1121/1.381717 Kirk RE, 1982, EXPT DESIGN PROCEDUR KOOPMANSVANBEIN.FJ, 1980, THESIS U AMSTERDAM KOOPMANSVANBEINUM FJ, 1992, SPEECH COMMUN, V11, P439, DOI 10.1016/0167-6393(92)90049-D KORSTER KI, 1979, PSYCHOL STUDIES PRES, P27 Krull D., 1989, PERILUS, VX, P87 Kruyt J. 
G., 1985, THESIS U LEIDEN Ladefoged P., 1982, COURSE PHONETICS, V2nd Lee K.-F., 1989, AUTOMATIC SPEECH REC LINDBLOM B, 1963, J ACOUST SOC AM, V35, P1773, DOI 10.1121/1.1918816 Lindblom B., 1988, PHONETIC EXPT RES I, VVIII, P21 Makhoul J., 1976, 1976 IEEE International Conference on Acoustics, Speech and Signal Processing Markel JD, 1976, LINEAR PREDICTION SP Marslen-Wilson W. D., 1989, LEXICAL REPRESENTATI, P169 MORTON J, 1969, PSYCHOL REV, V76, P165, DOI 10.1037/h0027366 NORD L, 1986, ACOUSTIC STUDIES VOW, P19 NORD L, 1975, SPEECH COMMUN, V2, P149 Pols LCW, 1977, THESIS FREE U AMSTER POLS LCW, 1973, J ACOUST SOC AM, V53, P1093, DOI 10.1121/1.1913429 RABINER LR, 1983, AT&T TECH J, V62, P1075 STALHAMMAR U, 1973, CONTEXTUAL EFFECTS V, P1 Svendsen T., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) VANBERGEM DR, 1990, P LINGUISTICS PHONET, P427 VANBERGEM DR, 1990, P I PHONETIC SCI AMS, V14, P53 VANBERGEM DR, 1989, P I PHONETIC SCI AMS, V13, P97 VANBERGEM DR, 1991, P EUROSPEECH 91 GENO, V3, P1455 VANBERGEM DR, IN PRESS SPEECH COMM VANBERGEM DR, 1991, P ESCA WORKSHOP PHON VANBERGEM DR, 1989, P EUROSPEECH 89 PARI, V2, P285 VANCOILE BM, 1987, P EUROPEAN C SPEECH, V2, P233 VANSON RJJ, 1991, P I PHONETIC SCU AM, V15, P43 VANSON RJJH, 1992, J ACOUST SOC AM, V92, P121, DOI 10.1121/1.404277 VANSON RJJH, 1990, J ACOUST SOC AM, V88, P1683, DOI 10.1121/1.400243 VANWIJK C, 1980, ITL REV APPL LINGUIS, V47, P53 WILLEMS LF, 1986, IPO ANN PROGR REPORT, V21, P34 1985, PROPOSAL CREATING NA 1989, COLLINS DICT ENGLISH NR 52 TC 61 Z9 61 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD MAR PY 1993 VL 12 IS 1 BP 1 EP 23 DI 10.1016/0167-6393(93)90015-D PG 23 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA LA387 UT WOS:A1993LA38700001 ER PT J AU GRENIER, Y AF GRENIER, Y TI A MICROPHONE ARRAY FOR CAR ENVIRONMENTS SO SPEECH COMMUNICATION LA English DT Article DE ADAPTIVE BEAMFORMING; MICROPHONE ARRAY; NOISE REDUCTION; SPEECH RECOGNITION AB This paper describes a microphone array for speech recording in car environments. The array is designed for hands-free radiotelephone, and is also used as a front-end for an automatic speech recognition system (this study has been realised within the European ESPRIT project ARS ''adverse environment recognition of speech''). We first summarise the adaptive beamforming techniques that we have used. We then describe several aspects of the implementation of the array (configuration, design of fixed beamformers, adaptation, complexity reduction). In the last section, we evaluate the performance of the array. Two measures of performance have been retained, one is the signal-to-noise ratio, and the other is the score obtained with the speech recognition system. RP GRENIER, Y (reprint author), ENST, DEPT SIGNAL, 46 RUE BARRAULT, F-75634 PARIS 13, FRANCE. 
CR ALLEN JB, 1977, P IEEE, V65, P1558, DOI 10.1109/PROC.1977.10770 ALLEN JB, 1977, J ACOUST SOC AM, V62, P912, DOI 10.1121/1.381621 BERGER MF, 1991, IEEE T SIGNAL PROCES, V39, P2377, DOI 10.1109/78.97993 CHOLLET GF, 1982, ICASSP, P2026 FLANAGAN JL, 1991, ACUSTICA, V73, P58 FLANAGAN JL, 1985, J ACOUST SOC AM, V78, P1508, DOI 10.1121/1.392786 FROST OL, 1972, PR INST ELECTR ELECT, V60, P926, DOI 10.1109/PROC.1972.8817 GIERL S, 1990, 22ND ISATA FLOR, P517 GRENIER Y, 1990, 22ND ISATA FLOR, P485 GRIFFITHS LJ, 1982, IEEE T ANTENN PROPAG, V30, P27, DOI 10.1109/TAP.1982.1142739 HAYKIM S, 1985, ARRAY SIGNAL PROCESS KANEDA Y, 1986, IEEE T ACOUST SPEECH, V34, P1391, DOI 10.1109/TASSP.1986.1164975 Kataoka A., 1990, Journal of the Acoustical Society of Japan (E), V11 KELLER M, 1990, 22ND ISATA FLOR, P477 Monzingo R. A., 1980, INTRO ADAPTIVE ARRAY Peterson P M, 1987, J Rehabil Res Dev, V24, P103 Pirz F., 1979, ICASSP 79. 1979 IEEE International Conference on Acoustics, Speech and Signal Processing Sondhi M. M., 1986, ICASSP 86 Proceedings. IEEE-IECEJ-ASJ International Conference on Acoustics, Speech and Signal Processing (Cat. No.86CH2243-4) Van Veen B. D., 1988, IEEE ASSP Magazine, V5, DOI 10.1109/53.665 VANCOMPERNNOLLE D, 1989, GRETSI 89 JUAN PINS, P513 VANCOMPERNOLLE D, 1990, SPEECH COMMUN, V9, P433, DOI 10.1016/0167-6393(90)90019-6 XU M, 1989, GRETSI 89 JUAN PIN, P351 Zelinski R., 1988, ICASSP 88: 1988 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.88CH2561-9), DOI 10.1109/ICASSP.1988.197172 NR 23 TC 23 Z9 23 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD MAR PY 1993 VL 12 IS 1 BP 25 EP 39 DI 10.1016/0167-6393(93)90016-E PG 15 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA LA387 UT WOS:A1993LA38700002 ER PT J AU WANG, FM KABAL, P RAMACHANDRAN, RP OSHAUGHNESSY, D AF WANG, FM KABAL, P RAMACHANDRAN, RP OSHAUGHNESSY, D TI FREQUENCY-DOMAIN ADAPTIVE POSTFILTERING FOR ENHANCEMENT OF NOISY SPEECH SO SPEECH COMMUNICATION LA English DT Article DE SPEECH ENHANCEMENT; ADAPTIVE POSTFILTER; NOISY SPEECH; FORMANT; LINEAR PREDICTION; SPECTRUM AB This paper presents a new frequency-domain approach to implement an adaptive postfilter for enhancement of noisy speech. The postfilter is described by a set of DFT coefficients which suppress noise in the spectral valleys and allow for more noise in formant regions which is masked by the speech signal. First, we perform an LPC analysis of the noisy speech and calculate the log magnitude spectrum of the input speech. After identifying the formants and valleys (by a new method), the log magnitude spectrum is modified to obtain the postfilter coefficients. The filtering operation is also done in the frequency domain through an FFT and an overlap-add strategy to get the postfiltered speech. Experimental results on 8-kHz-sampled speech show that this new frequency-domain approach results in enhanced speech of better perceptual quality than obtained by a time-domain method. This new method is especially efficient in eliminating high frequency noise and in preserving the weaker, high frequency formants in sonorant sounds. C1 MCGILL UNIV, DEPT ELECT ENGN, MONTREAL H3A 2A7, QUEBEC, CANADA. AT&T BELL LABS, SPEECH RES DEPT, MURRAY HILL, NJ 07974 USA. RP WANG, FM (reprint author), UNIV QUEBEC, INST NATL RES SCI TELECOMMUN, VERDUN H3E 1H6, PQ, CANADA. 
CR CHEN JH, 1987, APR P INT C AC SPEEC, P2185 JAYANT NS, 1986, 1986 P INT C AC SPEE, P829 MARKEL JD, 1972, IEEE T ACOUST SPEECH, VAU20, P129, DOI 10.1109/TAU.1972.1162367 MCCANDLE.SS, 1974, IEEE T ACOUST SPEECH, VSP22, P135, DOI 10.1109/TASSP.1974.1162559 O'Shaughnessy D., 1987, SPEECH COMMUNICATION Rabiner L.R., 1978, DIGITAL PROCESSING S RAMAMOORTHY V, 1988, IEEE J SEL AREA COMM, V6, P364, DOI 10.1109/49.613 SHAUGNESSY D, 1989, IEEE COMM MAG FEB, P46 NR 8 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD MAR PY 1993 VL 12 IS 1 BP 41 EP 56 DI 10.1016/0167-6393(93)90017-F PG 16 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA LA387 UT WOS:A1993LA38700003 ER PT J AU FAKOTAKIS, N TSOPANOGLOU, A KOKKINAKIS, G AF FAKOTAKIS, N TSOPANOGLOU, A KOKKINAKIS, G TI A TEXT-INDEPENDENT SPEAKER RECOGNITION SYSTEM BASED ON VOWEL SPOTTING SO SPEECH COMMUNICATION LA English DT Article DE TEXT-INDEPENDENT SPEAKER RECOGNITION; SPEAKER VERIFICATION; SPEAKER IDENTIFICATION; CEPSTRAL PARAMETERS; VOWEL SPOTTING ID ALGORITHM AB An automatic text-independent speaker recognition system suitable for identification and verification purposes is presented. The system is based on spotting the vowels of the test utterance, extracting parameter vectors and classifying them into a speaker-dependent reference database. This database consists of L prototypes for every speaker, representing the vowels of the language, which are estimated from L vowel clusters. These are formed by applying a modified k-means algorithm on the patterns extracted from the vowels of training utterances. The patterns of the training utterances are stored in a training database to be used for updating the reference data of the system. 
The system was tested over a period of four months with a population of 15 male and female speakers with non-correlated training and test data. Its accuracy proved to be satisfactory (91.39% for verification, 90.19% for closed-set identification, 95.28% for open-set identification), considering that the training utterances per speaker do not exceed 50 sec and the test utterances have a duration of 1.3 sec on the average. The accuracy is substantially increased when increasing the length of the test utterance (e.g. 93.75% verification accuracy for test utterances having an average duration of 4 sec). Additional advantages of the system are the small memory requirements and the fast response. RP FAKOTAKIS, N (reprint author), UNIV PATRAS, WIRE COMMUN LAB, GR-26110 PATRAS, GREECE. CR ATAL BS, 1974, J ACOUST SOC AM, V55, P1304, DOI 10.1121/1.1914702 DERMATAS E, 1991, 1991 P INT C AC SPEE FAKOTAKIS N, 1991, 1991 INT C DIG SIGN, P447 FAKOTAKIS N, 1986, 3RD EUR SIGN PROC C, P585 FAKOTAKIS N, 1991, 6TH P IEE INT C PROC FURUI S, 1981, IEEE T ACOUST SPEECH, V29, P254, DOI 10.1109/TASSP.1981.1163530 KASUYA H, 1979, IEEE T ACOUST SPEECH, V27, P319, DOI 10.1109/TASSP.1979.1163251 LINDE Y, 1980, IEEE T COMMUN, V28, P84, DOI 10.1109/TCOM.1980.1094577 RABINER LR, 1975, IEEE T ACOUST SPEECH, V23, P552, DOI 10.1109/TASSP.1975.1162749 RABINER LR, 1989, IEEE T ACOUST SPEECH, V37, P1214, DOI 10.1109/29.31269 TOHKURA Y, 1987, IEEE T ACOUST SPEECH, V35, P1414, DOI 10.1109/TASSP.1987.1165058 NR 11 TC 6 Z9 6 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD MAR PY 1993 VL 12 IS 1 BP 57 EP 68 DI 10.1016/0167-6393(93)90018-G PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA LA387 UT WOS:A1993LA38700004 ER PT J AU MA, CX KAMP, Y WILLEMS, LF AF MA, CX KAMP, Y WILLEMS, LF TI ROBUST SIGNAL SELECTION FOR LINEAR PREDICTION ANALYSIS OF VOICED SPEECH SO SPEECH COMMUNICATION LA English DT Article DE SPEECH ANALYSIS; LPC; SAMPLE SELECTION ID FORMANT FREQUENCIES AB This paper investigates a weighted LPC analysis of voiced speech. In view of the speech production model, the weighting function is either chosen to be the short-time energy function of the preemphasized speech sample sequence with certain delays or is obtained by thresholding the short-time energy function. In this method, speech samples are selectively weighted based on how well they match the speech production model. Therefore, the estimates of the LPC coefficients obtained by this novel LPC analysis are more accurate than those obtained from the conventional LPC analysis. They are also less sensitive to the values of the fundamental frequency than conventional LPC. C1 INST PERCEPT RES, 5600 MB EINDHOVEN, NETHERLANDS. 
CR ANANTHAPADMANABHA TV, 1979, IEEE T ACOUST SPEECH, V27, P309, DOI 10.1109/TASSP.1979.1163267 ATAL BS, 1971, J ACOUST SOC AM, V50, P637, DOI 10.1121/1.1912679 ATAL BS, 1978, J ACOUST SOC AM, V64, P1310, DOI 10.1121/1.382117 DELSARTE P, 1987, IEEE T INFORM THEORY, V33, P412, DOI 10.1109/TIT.1987.1057310 FLANAGANJL, 1972, SPEECH ANAL SYNTHESI FUJISAKI H, 1986, IEEE T ACOUST SPEECH, V2, P1605 ITAKURA F, 1970, ELECTRON COMMUN JPN, V53, P36 KUWABARA H, 1984, SPEECH COMMUN, V3, P211, DOI 10.1016/0167-6393(84)90016-5 LARAR JN, 1985, P INT C ACOUST SPEEC LEE CH, 1988, IEEE T ACOUST SPEECH, V36, P642, DOI 10.1109/29.1574 LEE DTL, 1981, IEEE T ACOUST SPEECH, V29, P627, DOI 10.1109/TASSP.1981.1163587 MA C, 1990, SIGNAL PROCESS, V5, P1171 MAKHOUL J, 1975, P IEEE, V63, P561, DOI 10.1109/PROC.1975.9792 MARKEL JD, 1972, IEEE T ACOUST SPEECH, VAU20, P129, DOI 10.1109/TAU.1972.1162367 MARKEL JD, 1970, LINEAR PREDICTION SP MINC H, 1988, NONNEGATIVE MATRIX MIYOSHI Y, 1987, IEEE T ACOUST SPEECH, V35, P1233, DOI 10.1109/TASSP.1987.1165282 PINSON EN, 1978, J ACOUST SOC AM, V35, P1264 STEIGLITZ K, 1977, IEEE T ACOUST SPEECH, V25, P34, DOI 10.1109/TASSP.1977.1162908 YANAGIDA M, 1985, IASTED APPLIED SIGNA, P129 NR 20 TC 21 Z9 21 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD MAR PY 1993 VL 12 IS 1 BP 69 EP 81 DI 10.1016/0167-6393(93)90019-H PG 13 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA LA387 UT WOS:A1993LA38700005 ER PT J AU SCHOENTGEN, J AF SCHOENTGEN, J TI GLOTTAL WAVE-FORM SYNTHESIS WITH VOLTERRA SHAPING FUNCTIONS SO SPEECH COMMUNICATION LA English DT Article DE SYNTHESIS; GLOTTAL PULSE; VOLTERRA SERIES AB We recently proposed an input output model of the glottal pulse. Mathematically speaking, the pulse is broken down into a cosinusoidal input signal and a pair of nonlinear shaping functions. 
The pulse is recovered when the cosinusoid is put through the shapers. In this article, it is shown that the cycles of a speaker's glottal waveform can be synthesized with the shaping functions of a small number of reference cycles. Indeed, nonlinear systems are not described by a transfer function. Therefore, it may be assumed that the nonlinear shaping functions of a glottal pulse are less variable than the shape of the pulse itself. Two experiments were carried out to test this assumption. In a first, the output static waveforms from a two-mass model of the vocal folds were copied. In a second, the glottis signal that was obtained from a logatome [ama] spoken by a male speaker was analyzed and synthesized. Each pulse was characterized by its peak amplitude, period and form factor. In both experiments, the features of all the glottal pulses could be copied by calculating the shaper coefficients of just two reference pulses and by adjusting the control parameters of the driving cosinusoid till the output of the shaper exhibited the desired feature values. RP SCHOENTGEN, J (reprint author), UNIV LIBRE BRUXELLES, INST PHONET, CP 110, 50 AV FD ROOSEVELT, B-1050 BRUSSELS, BELGIUM. 
CR ALANSARI A, 1981, THESIS I POLYTECHNIQ BAILLY G, 1989, COMMUNICATION Barnsley M.F., 1988, FRACTALS EVERYWHERE Broad DJ, 1979, SPEECH LANGUAGE ADV, V2, P203 CARLSON R, 1991, SPEECH COMMUN, V10, P481, DOI 10.1016/0167-6393(91)90051-T GAUFFIN J, 1989, J SPEECH HEAR RES, V32, P556 GECKINLI NC, 1981, SIGNAL PROCESS, V3, P49, DOI 10.1016/0165-1684(81)90064-5 GUERIN B, 1978, THESIS I POLYTECHNIQ ISHIZAKA K, 1972, AT&T TECH J, V51, P1233 KAMGARPARSI B, 1989, IEEE T PATTERN ANAL, V11, P998, DOI 10.1109/34.35504 KLATT DH, 1990, J ACOUST SOC AM, V87, P820, DOI 10.1121/1.398894 LILJENCRANTS J, 1985, STL QPSR, V4, P106 NELDER JA, 1965, COMPUT J, V7, P308 PRIESTLEY MB, 1981, NONLINEAR NONSTATION, P25 SCHOENTGEN J, 1990, SPEECH COMMUN, V9, P189, DOI 10.1016/0167-6393(90)90056-F SCHOENTGEN J, 1989, P EUROSPEECH 1989, P481 SHEINGOLD DH, 1974, ANALOG DEVICES, P65 Sundberg J., 1978, STL QPSR, V2-3, P35 THOMPSON JR, 1989, EMPIRICAL MODEL BUIL, P140 NR 19 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1992 VL 11 IS 6 BP 499 EP 512 DI 10.1016/0167-6393(92)90026-4 PG 14 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA KJ918 UT WOS:A1992KJ91800001 ER PT J AU VANSANTEN, JPH AF VANSANTEN, JPH TI CONTEXTUAL EFFECTS ON VOWEL DURATION SO SPEECH COMMUNICATION LA English DT Article DE SEGMENTAL DURATION; SYLLABLE; STRESS TIMING; PITCH ACCENT; SYNTACTIC BOUNDARY; SYLLABIC STRESS; SPEECH SYNTHESIS; PHONETIC FEATURE; QUANTITATIVE MODELING ID CONNECTED-SPEECH SIGNALS; SEGMENTAL DURATIONS; ENGLISH; STRESS; ARTICULATION AB Speech produced by two speakers was manually segmented, generating two databases with 18,000 and 6,000 vowel segments. 
The effects on vowel duration of several contextual factors were measured, including those of syllabic stress, pitch accent, the identities of adjacent segments, the syllabic structure of a word and proximity to a syntactic boundary. With statistical techniques for de-confounding factors, detailed characterizations of the effects of the factors and their interactions could be given, which were then summarized in the form of a simple equation for predicting vowel duration from context. RP VANSANTEN, JPH (reprint author), AT&T BELL LABS, MURRAY HILL, NJ 07974 USA. CR Allen J., 1987, TEXT SPEECH MITALK S BAMBER D, 1985, J MATH PSYCHOL, V29, P443, DOI 10.1016/0022-2496(85)90005-7 BOLINGER DL, 1961, LANGUAGE, V37, P83, DOI 10.2307/411252 Bolinger D., 1986, INTONATION ITS PARTS CHEN M, 1970, PHONETICA, V22, P129 Church K. W., 1988, Second Conference on Applied Natural Language Processing COKER CH, 1973, IEEE T ACOUST SPEECH, VAU21, P293, DOI 10.1109/TAU.1973.1162458 COKER CH, 1990, WORKSHOP SPEECH SYNT, P83 CRYSTAL TH, 1990, J ACOUST SOC AM, V88, P101, DOI 10.1121/1.399955 CRYSTAL TH, 1988, J ACOUST SOC AM, V83, P1553, DOI 10.1121/1.395911 CRYSTAL TH, 1988, J ACOUST SOC AM, V83, P1574, DOI 10.1121/1.395912 DEJONG K, 1991, PHONETICA, V48, P1 EDWARDS J, 1988, PHONETICA, V45, P156 EEFTINK W, 1991, THESIS U UTRECHT FARNETANI E, 1986, SPEECH COMMUN, V5, P17, DOI 10.1016/0167-6393(86)90027-0 Grosz B. J., 1986, Computational Linguistics, V12 HARRIS NS, 1974, J ACOUST SOC AM, V56, P1016 Hays W. 
L., 1981, STATISTICS HIRSCHBERG J, 1990, TALKING MACHINES THE, P367 HIRSCHBERG J, 1992, IN PRESS ARTIFICIAL Hoaglin DC, 1983, UNDERSTANDING ROBUST Jackendoff Ray S., 1972, SEMANTIC INTERPRETAT KLATT DH, 1987, J ACOUST SOC AM, V82, P737, DOI 10.1121/1.395275 KLATT DH, 1973, J ACOUST SOC AM, V54, P1102, DOI 10.1121/1.1914322 Klatt D.H., 1975, J PHONETICS, V3, P129 KRANTZ DH, 1971, F MEASUREMENT, V1 LEHISTE I, 1973, J ACOUST SOC AM, V54, P1228, DOI 10.1121/1.1914379 Lehiste I., 1977, J PHONETICS, V5, P253 Lindblom B. E. F., 1973, PAPERS LINGUISTICS U, V21, P1 Morrison D F, 1967, MULTIVARIATE STATIST Nooteboom S., 1972, THESIS U UTRECHT Nooteboom S. G., 1992, SPEECH PERCEPTION PR, P439 NOOTEBOOM SG, 1991, 12TH P INT C PHON SC OLIVE J, 1990, WORKSH SPEECH SYNTH, P25 OLIVE JP, 1992, COMMUNICATION OLIVE JP, 1985, J ACOUST SOC AM S1, V78, P6 OLLER DK, 1973, J ACOUST SOC AM, V54, P1235, DOI 10.1121/1.1914393 PETERSON GE, 1960, J ACOUST SOC AM, V32, P693, DOI 10.1121/1.1908183 PORT RF, 1981, J ACOUST SOC AM, V69, P262, DOI 10.1121/1.385347 Prince Ellen, 1981, RADICAL PRAGMATICS, P223 RIETVELD ACM, 1987, 11TH P INT C PHON SC, P28 SPROAT RW, 1990, WORKSHOP SPEECH SYNT, P129 TALKIN D, 1989, SPEECH TECHNOLOGY, V11, P384 TULLER B, 1982, J ACOUST SOC AM, V71, P1534, DOI 10.1121/1.387807 UMEDA N, 1975, J ACOUST SOC AM, V58, P434, DOI 10.1121/1.380688 Van Santen J. P. H., 1990, Computer Speech and Language, V4, DOI 10.1016/0885-2308(90)90016-Y VANSANTEN JPH, 1992, TALKING MACHINES THE, P275 VANSANTEN JPH, 1993, IN PRESS J MATH PSYC, V37 VANSANTEN JPH, 1992, J ACOUST SOC AM, V92, P2444, DOI 10.1121/1.404554 VANSANTEN JPH, 1992, IN PRESS COMPUTER SP, V6 Winer B. J., 1991, STATISTICAL PRINCIPL NR 51 TC 50 Z9 50 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1992 VL 11 IS 6 BP 513 EP 546 DI 10.1016/0167-6393(92)90027-5 PG 34 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA KJ918 UT WOS:A1992KJ91800002 ER PT J AU KLEIJN, WB SUKKAR, RA AF KLEIJN, WB SUKKAR, RA TI EFFICIENT CHANNEL CODING FOR CELP USING SOURCE INFORMATION SO SPEECH COMMUNICATION LA English DT Article DE SPEECH CODING; CHANNEL ERRORS; CELP ID SPEECH; DESIGN AB Since simultaneous optimization of source and channel coding is often not practical for speech-coding algorithms, it is useful to develop channel codes optimal for a particular speech coding algorithm operating under a specified range of channel-error conditions. Such source-dependent channel-error codes can be obtained by minimizing an appropriate speech distortion criterion. To carry out this minimization, we use a simulated-annealing procedure. The resulting channel codes are efficient, since they provide error correction of non-uniform accuracy (highly probable quantization levels receive more accurate correction) and/or non-uniform error detection (errors which greatly impact the speech quality are more likely to be detected). An optimal trade-off between error correction and error detection can be obtained. Source-dependent channel codes aimed at counteracting low, random error rates (up to 2%) are applied to the CELP algorithm and the resulting performance is reported. It is found that a small allocation of codewords (equivalent to less than one bit) for the protection of a particular parameter often results in a large performance improvement. RP KLEIJN, WB (reprint author), AT&T BELL LABS, 600 MT AVE, MURRAY HILL, NJ 07974 USA. CR ATAL BS, 1979, IEEE T ACOUST SPEECH, V27, P247, DOI 10.1109/TASSP.1979.1163237 Atal B.S., 1984, P INT C COMM AMST, P1610 Berger T., 1971, RATE DISTORTION THEO CAMPBELL JP, 1989, P IEEE INT C AC SPEE, P735 Campbell J. P. 
Jr., 1991, Advances in Speech Coding CHEN JH, 1987, P IEEE INT C COMM SE, P756 COX RV, 1989, P IEEE INT C AC SPEE, P739 DEMARCA JRB, 1987, JUN P IEEE INT COMM, P1128 ELGAMAL AA, 1987, IEEE T INFORM THEORY, V33, P116 FARVARDIN N, 1987, IEEE T INFORM THEORY, V33, P827, DOI 10.1109/TIT.1987.1057373 GRAY AH, 1976, IEEE T ACOUST SPEECH, V24, P380, DOI 10.1109/TASSP.1976.1162849 Gray R. M., 1990, SOURCE CODING THEORY KANG GS, 1985, 8857 NAV RES LAB REP KEMP RD, 1989, P INT C ACOUST SPEEC, P200 KIRKPATRICK S, 1983, SCIENCE, V220, P671, DOI 10.1126/science.220.4598.671 Kleijn W.B., 1988, P INT C AC SPEECH SI, P155 KLEIJN WB, 1988, SPEECH COMMUN, V7, P305, DOI 10.1016/0167-6393(88)90076-3 KLEIJN WB, 1991, ADV SPEECH CODING, P257 KROON P, 1986, IEEE T ACOUST SPEECH, V34, P1054, DOI 10.1109/TASSP.1986.1164946 Kroon P., 1991, Advances in Speech Coding LEGUYADER A, 1988, SPEECH COMMUN, V7, P217, DOI 10.1016/0167-6393(88)90041-6 METROPOLIS N, 1953, J CHEM PHYS, V21, P1087, DOI 10.1063/1.1699114 Oppenheim A. V., 1975, DIGITAL SIGNAL PROCE Rabiner L.R., 1978, DIGITAL PROCESSING S Saito S., 1985, FUNDAMENTALS SPEECH TRANCOSO IM, 1986, P INT C ACOUST SPEEC, P2379 Tremain T.E., 1982, SPEECH TECHNOLOG APR, P40 TREMAIN TE, 1988, P MOBILE SATELLITE C, P491 van Laarhoven P., 1987, SIMULATED ANNEALING Viterbi A. J., 1979, PRINCIPLES DIGITAL C ZEGER KA, 1987, ELECTRON LETT, V23, P654, DOI 10.1049/el:19870468 NR 31 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1992 VL 11 IS 6 BP 547 EP 566 DI 10.1016/0167-6393(92)90028-6 PG 20 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA KJ918 UT WOS:A1992KJ91800003 ER PT J AU MARTINELLI, G RICOTTI, LP RAGAZZINI, S AF MARTINELLI, G RICOTTI, LP RAGAZZINI, S TI SINGULAR VALUE EXCITATION SPEECH CODER SO SPEECH COMMUNICATION LA English DT Article DE SPEECH CODING; SINGULAR VALUE DECOMPOSITION; VECTOR QUANTIZATION; PTVQ; PVQ AB A new low delay speech coder is proposed. In order to achieve low delay, the coder is based on the memory dependent vector quantization of the synthesis filter. Another basic principle of the proposed scheme is to transform the excitation into a domain where the uncorrelated excitation samples are ordered according to their decreasing level of performance; this is achieved by means of a singular value decomposition transformation. The transformed samples are considered to be a sequence of Laplacian random variables that are vector quantized in an efficient manner using a geometrical lattice VQ. The coder was tested at a bit-rate of 8625 bit/s and a delay of 8 ms; a good reproduced voice quality was obtained. C1 FDN UGO BORDONI, ROME, ITALY. RP MARTINELLI, G (reprint author), UNIV ROME LA SAPIENZA, VIA EUDOSSIANA 18, I-00185 ROME, ITALY. CR ATAL BS, 1989, 1989 P INT C AC SPEE, P45 FISHER TR, 1986, IEEE T INFORM THEORY, V32, P568 JUANG BH, 1988, IEEE T ACOUST SPEECH, V36, P1423, DOI 10.1109/29.90370 NR 3 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1992 VL 11 IS 6 BP 567 EP 579 DI 10.1016/0167-6393(92)90029-7 PG 13 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA KJ918 UT WOS:A1992KJ91800004 ER PT J AU YANNAKOUDAKIS, EJ HUTTON, PJ AF YANNAKOUDAKIS, EJ HUTTON, PJ TI AN ASSESSMENT OF N-PHONEME STATISTICS IN PHONEME GUESSING ALGORITHMS WHICH AIM TO INCORPORATE PHONOTACTIC CONSTRAINTS SO SPEECH COMMUNICATION LA English DT Article DE N-PHONEMES; PHONOTACTIC KNOWLEDGE; PHONEME GUESSING; SPEECH RECOGNITION; ENTROPY; REDUNDANCY; POSITIONAL ANALYSIS OF PHONEMES AB This paper presents statistics relating to various phoneme guessing algorithms. The N-phoneme statistics were obtained by exhaustive analysis of a lexicon of 96,998 phonetic words. The results show that by incorporating detailed phonotactic knowledge, coupled with broad phonetic knowledge, an algorithm can be formulated which successfully guesses the correct phoneme with mean success rates of up to 67%. This implies that, as far as the computer is concerned, spoken English is, at a minimum, 67% redundant. The results also show that the ability to guess correctly depends on word length and position; phoneme type and the number of unknown phonemes in the word have very little effect on the final results. C1 UNIV LEEDS, DEPT TRANSPORT STUDIES, LEEDS LS2 9JT, W YORKSHIRE, ENGLAND. RP YANNAKOUDAKIS, EJ (reprint author), ATHENS UNIV ECON & BUSINESS, DEPT INFORMAT, 76 PATISSION ST, GR-10434 ATHENS, GREECE. 
CR BAHL L, 1984, P IEEE INT C ACOUST ELOVITZ HS, 1976, IEEE T ACOUST SPEECH, V24, P446, DOI 10.1109/TASSP.1976.1162873 HUTTENLOCHER DP, 1983, AAAI INT C WASHINGTO HUTTENLOCHER DP, 1984, P IEEE INT C ACOUST, V26 JOHNSON SR, 1985, RES DEV EXPERT SYSTE, P95 PICONE J, 1986, IEEE T ACOUST SPEECH, V34 SHANNON CE, 1951, AT&T TECH J, V30, P50 SHANNON CE, 1948, AT&T TECH J, V27, P623 SHIPMAN DW, 1982, 1982 P INT C AC SPEE, P546 SUEN C, 1979, IEEE T PATTERN ANAL, V1 YANNAKOUDAKIS E, INT J PATTERN RECOGN, V23, P509 YANNAKOUDAKIS EJ, 1988, ARCHITECTURAL LOGIC YANNAKOUDAKIS EJ, 1988, IEEE T PATTERN ANAL, V10, P960, DOI 10.1109/34.9119 Yannakoudakis E.J., 1987, SPEECH SYNTHESIS REC ZUE VW, 1985, P IEEE, V73, P1602, DOI 10.1109/PROC.1985.13342 Zue V. W., 1979, ICASSP 79. 1979 IEEE International Conference on Acoustics, Speech and Signal Processing NR 16 TC 7 Z9 7 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1992 VL 11 IS 6 BP 581 EP 602 DI 10.1016/0167-6393(92)90030-B PG 22 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA KJ918 UT WOS:A1992KJ91800005 ER PT J AU LLISTERRI, J POCHOLIVE, D AF LLISTERRI, J POCHOLIVE, D TI SPECIAL ISSUE ON PHONETICS AND PHONOLOGY OF SPEAKING STYLES - REDUCTION AND ELABORATION IN SPEECH-COMMUNICATION SO SPEECH COMMUNICATION LA English DT Editorial Material RP LLISTERRI, J (reprint author), UNIV AUTONOMA BARCELONA, DEPT SPANISH, PHILOL, BARCELONA, SPAIN. NR 0 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD OCT PY 1992 VL 11 IS 4-5 BP 321 EP 322 DI 10.1016/0167-6393(92)90036-7 PG 2 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JW655 UT WOS:A1992JW65500001 ER PT J AU ARGENTE, JA AF ARGENTE, JA TI FROM SPEECH TO SPEAKING STYLES SO SPEECH COMMUNICATION LA English DT Article DE SPEAKING STYLES; SOCIOLINGUISTICS; METHODOLOGY AB ''Langue'' is the systematic and social part of ''langage'' while ''parole'' is its unpatterned and individual aspect. This Saussurean tenet was objected to by sociolinguists insofar as they uncovered the patterned rule-governed character of speech, understood as a social event (Hymes, 1962; Gumperz, 1970, 1971), and insofar as they showed up a methodological contradiction known as the ''Saussurean paradox'' (Labov, 1970, 1972). These authors furnished some notions in order to approach socio-symbolic speech variation: that of ''ways of speaking'', ''natural varieties'' and ''styles'', respectively. The former (Hymes-Gumperz's) approach displaces the object of description from speech itself towards interaction and the communicative event. The latter (Labov's) holds a contextualized view of speech and looks for meaningful correlations between sound pattern features and independently defined macrosociological variables. The former is to be preferred when studying social behaviour patterns; the latter may be more suited to defining styles inherently. The notion of a sociolinguistic variable allows us to define them as specific co-occurrence patterns of variables. While some sociolinguists concentrate on the analysis of segmental variables, phoneticians concerned with speaking styles emphasize suprasegmental variables. It is suggested that this difference in approach may reflect a difference in the social meaning conveyed by variables. Beyond description, the search for an explanatory principle of style variation in terms of adaptive variability is proposed. 
RP ARGENTE, JA (reprint author), UNIV AUTONOMA BARCELONA, FAC FILOSOFIA & LLETRES, DEPT FILOL CATALANA, E-08193 BARCELONA, SPAIN. CR ARGENTE JA, 1987, LIMITS, V3, P69 Blom J-P., 1972, DIRECTIONS SOCIOLING Bortoni-Ricardo S. M., 1985, URBANIZATION RURAL D Brown Roger, 1960, STYLE LANGUAGE BUMPERZ JJ, 1971, LANGUAGE SOCIAL GROU Chomsky N., 1968, SOUND PATTERN ENGLIS Dorian N., 1981, LANGUAGE DEATH DRESSLER WU, 1987, LIMITS, V3, P87 Gal S., 1979, LANGUAGE SHIFT Gardiner Alan, 1932, THEORY SPEECH LANGUA GUMPERZ JJ, 1970, 36 U CAL LANG BEH RE HYMES D, 1968, ANTHR HUMAN BEHAVIOR HYMES D, 1968, READINGS SOCIOLOGY L Jakobson R., 1960, STYLE IN LANGUAGE Jakobson Roman, 1979, SOUND SHAPE LANGUAGE Labov W., 1970, STUD GEN, V23, P30 LABOV W, 1969, LANGUAGE, V45, P715, DOI 10.2307/412333 Labov William, 1972, LANG SOC, V1, P97, DOI DOI 10.1017/S0047404500006576 LINDBLOM B, 1986, J PHONETICS, V14, P117 Lindblom B., 1990, SPEECH PRODUCTION SP LINDBLOM B, UNPUB EVOLUTION SPOK Milroy Lesley, 1980, LANGUAGE SOCIAL NETW Morris Charles, 1938, INT ENCY UNIFIED SCI Sapir Edward, 1949, SELECTED WRITINGS E Sapir Edward, 1921, LANGUAGE Schmidt Annette, 1985, YOUNG PEOPLES DYIRBA VERSCHUEREN J, 1987, IPRA1 WORK DOC Weinreich Ulrich, 1953, LANGUAGES CONTACT WOLFSON N, 1976, LANG SOC, V5, P189 NR 29 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD OCT PY 1992 VL 11 IS 4-5 BP 325 EP 335 DI 10.1016/0167-6393(92)90038-9 PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JW655 UT WOS:A1992JW65500002 ER PT J AU ENGSTRAND, O AF ENGSTRAND, O TI SYSTEMATICITY OF PHONETIC VARIATION IN NATURAL DISCOURSE SO SPEECH COMMUNICATION LA English DT Article DE PHONETIC SYSTEMATICITY; NATURAL SPEECH; SPONTANEOUS SPEECH; REDUCTION; ELABORATION AB Phonetic variation in natural speech is more extensive and its systematicity less transparent than in experimentally elicited speech. This paper raises the question to what extent phonetic structure in natural speech can be observed and explained. In the light of some recent data, it is argued that this can indeed be done with a fair amount of success. RP ENGSTRAND, O (reprint author), UNIV STOCKHOLM, INST LINGUIST, S-10691 STOCKHOLM, SWEDEN. CR BARRY MC, 1984, CAMBRIDGE PAPERS PHO, V3, P1 Bruce G., 1977, TRAVAUX I LINGUISTIQ, V12 DALBY JM, 1984, PHONETIC STRUCTURE F DISNER SF, 1978, UCLA40 WORK PAP PHON DRESSLER W, 1972, INNSBRUCKER BEITRAGE, V9 ENGSTRAND O, 1989, 1989 P FONETIK 89 3R, P95 ENGSTRAND O, 1989, PERILUS10 U STOCKH I, P1 ENGSTRAND O, 1989, PERILUS8 U STOCKH I, P34 ENGSTRAND O, 1988, J ACOUST SOC AM, V83, P1863, DOI 10.1121/1.396522 ENGSTRAND O, 1989, 1989 P SPEECH RES 89, P88 KOHLER KJ, 1990, NATO ADV SCI I D-BEH, V55, P69 LINDBLOM B, 1988, PERILUS8 U STOCKH I, P21 LINDBLOM B, 1981, DURATIONAL PATTERNS LINDBLOM B, 1963, J ACOUST SOC AM, V35, P1773, DOI 10.1121/1.1918816 LINDBLOM B., 1983, PRODUCTION SPEECH, P217 LINDBLOM B, 1987, 11TH P INT C PHON SC, V3, P9 LINDBLOM B, 1990, NATO ADV SCI I D-BEH, V55, P403 LYBERG B, 1981, MONOGRAPHS I LING U, V6 MOON SJ, 1989, 11989 ROYAL I TECHN, P121 NORD L, 1986, STL QPSR, V4, P19 SHOCKEY L, 1973, OHIO STATE WORKING P, V17 Zwicky A.M., 1972, 8 REG M CHIC LING SO, P607 NR 22 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 
1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1992 VL 11 IS 4-5 BP 337 EP 346 DI 10.1016/0167-6393(92)90039-A PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JW655 UT WOS:A1992JW65500003 ER PT J AU GRANSTROM, B AF GRANSTROM, B TI THE USE OF SPEECH SYNTHESIS IN EXPLORING DIFFERENT SPEAKING STYLES SO SPEECH COMMUNICATION LA English DT Article DE SPEECH SYNTHESIS; SPEAKING STYLE; TEXT-TO-SPEECH; SPEAKING RATE; EMOTIONS AB The possibility to vary speaker type and speaking style will be a feature of the next generation text-to-speech systems. Already today, the need for these possibilities is apparent in dialogue systems and when speech synthesis is used as prostheses for persons with a communication handicap. Much of the information needed is not yet available. In this contribution we argue that speech synthesis itself is an efficient tool to study and understand the variability in speech. Some different methods are reviewed, representing both analysis/synthesis techniques and signal manipulations as well as text-to-speech. The main emphasis is on the work at KTH to study the variation of speaker and speaking styles in the context of our text-to-speech system. RP GRANSTROM, B (reprint author), ROYAL INST TECHNOL, DEPT SPEECH COMMUN & MUS ACOUST, BOX 70014, S-10044 STOCKHOLM 70, SWEDEN. 
CR ABE M, 1990, ESCA WORKSHOP SPEAKE, P40 BLADON A, 1987, EUROPEAN C SPEECH TE, V1, P55 BLOMBERG M, 1990, ESCA WORKSHOP SPEAKE, P58 BROWN BL, 1974, J ACOUST SOC AM, V55, P313, DOI 10.1121/1.1914504 Cahn J.E., 1990, J AM VOICE I O SOC, V8, P1 CARLSON R, 1986, DEV ELECTRONIC AIDS, P87 CARLSON R, 1990, ESCA WORKSHOP SPEAKE, P28 CARLSON R, 1991, P EUROSPEECH 91, V3, P1043 CARLSON R, 1990, ADV SPEECH HEARING L, P269 CHARPENTIER F, 1989, P EUROSPEECH 89, V2, P13 FANT G, 1985, SPEECH TRANSMISSION, P1 FONAGY I, 1978, LANG SPEECH, V21, P34 GOBL C, 1989, IN PRESS P VOCAL FOL GRANSTROM B, 1992, SPEECH COMMUN, V11, P459, DOI 10.1016/0167-6393(92)90051-8 GRANSTROM B, 1991, 11TH P ICPHS, V4, P278 KARLSSON I, 1990, J PHONETICS, V19, P111 KARLSSON I, 1992, SPEECH COMMUN, V11, P491, DOI 10.1016/0167-6393(92)90056-D KARLSSON I, 1988, P SPEECH 88, P225 KLATT DH, 1990, J ACOUST SOC AM, V87, P820, DOI 10.1121/1.398894 MURRAY IR, 1988, P SPEECH 88, P1217 OSTER AM, 1985, STL QPSR, P95 OSTER AM, 1986, STL QPSR, P79 ROSS M, 1973, AM ANN DEAF, P37 ROTHENBERG M, 1975, SPEECH COMMUN, V2, P235 SORIN C, 1987, 11TH P ICPHS TALL, V1, P125 STEVENS K, 1990, J PHONETICS, V19, P161 STREVENS P, 1958, SPECIFICATION SPEECH TRAUMULLER H, 1989, STL QPSR, P63 ULDALL E, 1960, LANG SPEECH, V3, P223 NR 29 TC 5 Z9 5 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD OCT PY 1992 VL 11 IS 4-5 BP 347 EP 355 DI 10.1016/0167-6393(92)90040-E PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JW655 UT WOS:A1992JW65500004 ER PT J AU LINDBLOM, B BROWNLEE, S DAVIS, B MOON, SJ AF LINDBLOM, B BROWNLEE, S DAVIS, B MOON, SJ TI SPEECH TRANSFORMS SO SPEECH COMMUNICATION LA English DT Article DE BABY TALK; SIGNAL VARIABILITY; SPEAKING STYLES; SPONTANEOUS SPEECH; VOWEL REDUCTION ID STOP CONSONANTS; CONVERSATIONAL SPEECH; SPEAKING RATE; INVARIANCE; VOWEL; PERCEPTION; HEARING; CLEAR; PLACE; HARD AB This paper reports acoustic-phonetic data on three speaking styles: Informal conversational speech, ''clear'' speech and Baby Talk. These sets of observations illustrate the fact that a speaker's pronunciation of a given linguistic form can undergo rather drastic physical transformations, particularly in the wide range of contexts presented by spontaneously produced speech. Despite their extensive variations, vowel formant measurements showed a high degree of predictability. The findings bring to the fore a classical issue of speech research: signal variability and phonetic invariance. While the present results do not conclusively preclude the possibility that the investigated speech samples are organized around a core of signal invariants, the extent as well as the systematicity of the variability observed lend support to a different perspective. The paper proposes that the variegated acoustic pattern of speech be seen as products of adaptation. According to this interpretation, phonetic gestures and signals are modulated and tuned adaptively in accordance with on-line communicative and socio-linguistic demands (e.g., controlling the ''social distance'' between speakers, preserving intelligibility, performing ''phatic'' and ''emotive'' functions, etc). 
Furthermore, it is argued that the linguistic task of the phonetic signal is not to encode invariants but to complement information already available to the speech processing system of the listener. Accordingly, intra-speaker phonetic variations need not be seen as invariants embedded in linguistically irrelevant variability. They rather represent genuine behavioral adaptations that may jeopardize or demolish signal invariance but that transform speech patterns in essentially principled ways. C1 UNIV STOCKHOLM, DEPT LINGUIST, S-10691 STOCKHOLM, SWEDEN. RP LINDBLOM, B (reprint author), UNIV TEXAS, DEPT LINGUIST, AUSTIN, TX 78712 USA. CR Bates E., 1979, EMERGENCE SYMBOLS BLUMSTEIN SE, 1979, J ACOUST SOC AM, V66, P1001, DOI 10.1121/1.383319 BLUMSTEIN SE, 1980, J ACOUST SOC AM, V67, P648, DOI 10.1121/1.383890 Bregman AS., 1990, AUDITORY SCENE ANAL BROWMAN CP, 1990, J PHONETICS, V18, P299 ENGSTRAND O, 1988, J ACOUST SOC AM, V83, P1863, DOI 10.1121/1.396522 Ferguson Charles A, 1977, TALKING CHILDREN LAN Fonagy I., 1983, VIVE VOIX FOWLER CA, 1986, J PHONETICS, V14, P3 GAY T, 1978, J ACOUST SOC AM, V63, P223, DOI 10.1121/1.381717 Golinkoff R., 1984, ORIGINS GROWTH COMMU, P5 GRIESER D, 1989, DEV PSYCHOL, V25, P577, DOI 10.1037/0012-1649.25.4.577 Jakobson R, 1960, STYLE LANG, P350 Kuehn David P., 1976, J PHONETICS, V4, P303 KUHL PK, 1991, PERCEPT PSYCHOPHYS, V50, P93, DOI 10.3758/BF03212211 KUHL PA, IN PRESS SCIENCE Labov William, 1972, SOCIOLINGUISTIC PATT Lashley K. S., 1951, CEREBRAL MECH BEHAV, P112 LIBERMAN AM, 1985, COGNITION, V21, P1, DOI 10.1016/0010-0277(85)90021-6 LINDBLOM B, 1963, J ACOUST SOC AM, V35, P1773, DOI 10.1121/1.1918816 LINDBLOM B, 1990, NATO ADV SCI I D-BEH, V55, P403 MILLER JD, 1989, J ACOUST SOC AM, V85, P2114, DOI 10.1121/1.397862 MOON SJ, 1991, THESIS U TEXAS AUSTI NORD L, 1986, STL QPSR, V4, P19 NOTTEBOOM S, 1991, 12 P INT C PHON SCI, V1, P107 Perkell J. 
S., 1986, INVARIANCE VARIABILI PETERSON GE, 1952, J ACOUST SOC AM, V24, P175, DOI 10.1121/1.1906875 PICHENY MA, 1985, J SPEECH HEAR RES, V28, P96 PICHENY MA, 1986, J SPEECH HEAR RES, V29, P434 SCHULMAN R, 1989, J ACOUST SOC AM, V85, P95 SHEPARD RN, 1984, PSYCHOL REV, V91, P417, DOI 10.1037/0033-295X.91.4.417 Stevens K. N., 1981, PERSPECTIVES STUDY S, P1 STEVENS KN, 1978, J ACOUST SOC AM, V64, P1358, DOI 10.1121/1.382102 Sundberg J, 1987, SCI SINGING VOICE SUNDBERG U, 1991, PERILUS, V13 SUSSMAN HM, 1991, J ACOUST SOC AM, V90, P1309, DOI 10.1121/1.401923 TRAUNMULLER H, 1981, J ACOUST SOC AM, V69, P1465 TRAUNMULLER H, 1991, PERILUS, V14 WHISTLER K, 1988, USERS MANUAL NR 39 TC 30 Z9 30 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1992 VL 11 IS 4-5 BP 357 EP 368 DI 10.1016/0167-6393(92)90041-5 PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JW655 UT WOS:A1992JW65500005 ER PT J AU OHALA, JJ AF OHALA, JJ TI WHAT IS THE INPUT TO THE SPEECH PRODUCTION MECHANISM SO SPEECH COMMUNICATION LA English DT Article DE SPEECH PRODUCTION; LINGUISTICS; PHONEME; EPENTHETIC STOPS; DISSIMILATION; COMPARATIVE METHOD AB Our conception of what it is that the speech production mechanism is attempting to implement in speaking comes from linguistics. But linguistics first developed its methods for other purposes. In the early 19th century linguistics produced a method for tracing the family relationship between suspected cognate words and their constituent sounds. This, the comparative method, involved establishing an optimal path between these forms via a reconstructed parent form. 20th century structuralist linguistics (including generative phonology), essentially grafted the same method onto the task of finding the underlying phonemic constituents of words. 
Although the underlying structure found in this way may be a good hypothesis as to the mental elements determining actual spoken utterances, there are reasons to suspect that it is too simple. Too much emphasis is placed on the simplicity of the system and on the purely lexical (as opposed to the demarcative and attitudinal) function of the elements in speech. This paper presents some initial attempts to differentiate between phonetic variants in speech which stem from single underlying forms as opposed to those which arise from separate underlying forms (though they may have had a common source historically). C1 UNIV CALIF BERKELEY, DEPT LINGUIST, BERKELEY, CA 94720 USA. RP OHALA, JJ (reprint author), UNIV ALBERTA, DEPT LINGUIST, EDMONTON T6G 2E7, ALBERTA, CANADA. CR Beddor Patrice Speeter, 1986, PHONOLOGY YB, V3, P197 Clumeck H., 1976, J PHONETICS, V4, P337 COWAN W, 1980, SOURCE BOOK LINGUIST GRASSMANN H, 1863, Z SPRACHFORSCHUNG GE, V12, P51 HOUSE AS, 1953, J ACOUST SOC AM, V25, P105, DOI 10.1121/1.1906982 KLATT DH, 1975, J SPEECH HEAR RES, V18, P686 LABOV W, 1981, LANGUAGE, V57, P267, DOI 10.2307/413692 Lehiste I., 1970, SUPRASEGMENTALS LINDBLOM B, 1963, J ACOUST SOC AM, V35, P1773, DOI 10.1121/1.1918816 Mandelbrot B, 1954, WORD, V10, P1 MANN VA, 1980, PERCEPT PSYCHOPHYS, V28, P213, DOI 10.3758/BF03204377 NOOTEBOOM SG, 1970, IPO ANN REP, V5, P55 NORD L, 1974, AUG SPEECH COMM SEM, V2, P149 Ohala J. 
J., 1981, COGNITIVE REPRESENTA, P111 OHALA JJ, 1975, SPEECH COMMUN, V3, P299 OHALA JJ, 1986, INVARIANCE VARIABILI, P386 OHALA JJ, 1990, 1990 P ICSLP 90 INT, V1, P405 OHALA JJ, 1989, SERIES TRENDS LINGUI, V43, P173 OHALA JJ, 1985, NATO ASI SERIES F, V16, P447 OHALA JJ, IN PRESS DIACHRONY S OHALA JJ, 1981, PHONETICA, V38, P204 OHALA JJ, 1987, 11TH P INT C PHON SC, V4, P120 Ohala John J, 1981, PAPERS PARASESSION L, P178 SOLE MJ, 1991, 12TH P INT C PHON SC, V2, P110 Wang William S-Y, 1977, LEXICON PHONOLOGICAL, P148, DOI 10.1515/9783110802399.148 WANG WSY, 1961, J SPEECH HEAR RES, V4, P130 YOUNG T, 1855, MISCELLANEOUS WORKS, V2, P8 NR 27 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1992 VL 11 IS 4-5 BP 369 EP 378 DI 10.1016/0167-6393(92)90042-6 PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JW655 UT WOS:A1992JW65500006 ER PT J AU RISCHEL, J AF RISCHEL, J TI FORMAL LINGUISTICS AND REAL SPEECH SO SPEECH COMMUNICATION LA English DT Article DE NATURAL SPEECH; SPEAKING STYLE; SPONTANEOUS SPEECH; REDUCTION PHENOMENA; SPEECH DATA; EMPIRICAL LINGUISTICS AB This paper discusses the nature of the data that form the input to linguistic descriptions, particularly with regard to the greater or lesser artificiality of such data. The paper argues in favour of a much stronger emphasis in linguistic work on natural speech data even including spontaneous speech. A major part of the paper is devoted to a general discussion of some preliminaries to the study of linguistic variation in natural speech. RP RISCHEL, J (reprint author), UNIV COPENHAGEN, DEPT LINGUIST, 80 NJALSGADE, DK-2300 COPENHAGEN, DENMARK. CR Abbs J. H., 1986, INVARIANCE VARIABILI, P202 ARGENTE JA, 1992, SPEECH COMMUN, V11, P325, DOI 10.1016/0167-6393(92)90038-9 BRUCE G, 1992, SPEECH COMMUN, V11, P453, DOI 10.1016/0167-6393(92)90050-H Dressler W. 
U., 1975, PHONOLOGICA 1972, P219 DRESSLER WU, 1991, PHONETICA, V48, P233 FANT G, 1989, STLQPSR21989 FRETHEIM T, 1991, 1991 P ESCA WORKSH P GREGERSEN F, 1991, COPENHAGEN STUDY URB, V1, P5 HANSEN PM, 1991, COPENHAGEN WORKING P, V1, P153 KARLGREN H, 1962, 4TH P INT C PHON SCI, P671 LACHERETDUJOUR A, 1991, 1991 P ESCA WORKSH P LINDBLOM B, 1992, SPEECH COMMUN, V11, P357, DOI 10.1016/0167-6393(92)90041-5 LINELL P, 1982, J PHONETICS, V10, P37 LLISTERRI J, 1991, P ESCA WORKSHOP PHON OHALA JJ, 1992, SPEECH COMMUN, V11, P369, DOI 10.1016/0167-6393(92)90042-6 Rischel J., 1983, FOLIA LINGUIST, VXVII, P51, DOI 10.1515/flin.1983.17.1-4.51 RISCHEL J, 1983, ANN REP I PHON U CAP, V17, P125 RISCHEL J, 1991, PHONETICA, V48, P233 RISCHEL J, 1990, J PHONETICS, V18, P395 ROMAINE S, 1991, J LINGUIST, V17, P93 SOLE MJ, 1991, 1991 P ESCA WORKSH P STRANGERT E, 1987, NORDIC PROSODY, V4, P91 NR 22 TC 5 Z9 5 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1992 VL 11 IS 4-5 BP 379 EP 392 DI 10.1016/0167-6393(92)90043-7 PG 14 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JW655 UT WOS:A1992JW65500007 ER PT J AU BARRY, MC AF BARRY, MC TI PALATALISATION, ASSIMILATION AND GESTURAL WEAKENING IN CONNECTED SPEECH SO SPEECH COMMUNICATION LA English DT Article DE RUSSIAN; EPG; ARTICULATORY PHONOLOGY AB An investigation is reported into the variation encountered in the distribution of palatalisation in consonant-clusters in Russian. EPG evidence from two speakers suggests that while the spread of palatalisation does not vary with speaking rate or style, the lingual gesture constituting palatalisation is shown to be sensitive to speech-rate, diminishing in magnitude with increased speaking rate. A similar reduction in gesture magnitude was found for coronal gestures in clusters with dental nasals. 
A comparison is made between the spread of palatalisation and the assimilation of place of articulation. In both cases speakers of Russian have access to phonological processes determining discrete categorical assimilations; at the same time speakers exhibit a tendency elsewhere for a lingual gesture to weaken in the circumstances of increased speaking rate. Gestures which weaken may involve articulatory subsystems of lesser mass than those in non-weakening gestures, while the phenomenon may none the less stake a claim to inclusion under the heading of ''the speaker's knowledge of how the language is pronounced'', which may be taken as a broad definition of the subject matter of phonological theory. RP BARRY, MC (reprint author), UNIV MANCHESTER, DEPT LINGUIST, MANCHESTER M13 9PL, LANCS, ENGLAND. CR AVANESOV RI, 1984, RUSSKOYE LITERATURNO Barry M., 1985, CAMBRIDGE PAPERS PHO, V4, P1 BARRY MC, 1991, 12TH P INT C PHON SC, V4, P14 Browman Catherine, 1986, PHONOLOGY YB, V3, P219 Browman C. P., 1990, PAPERS LABORATORY PH, P341 CHASAIDE AN, 1977, THESIS U COLLEGE N W CRUTTENDEN A, 1978, J CHILD LANG, V5, P373 FERGUSON CA, 1975, LANGUAGE, V51, P419, DOI 10.2307/412864 GAY T, 1981, PHONETICA, V38, P148 HARDCASTLE WJ, 1977, WORK PROGR, V1, P27 Jakobson R., 1968, CHILD LANGUAGE APHAS JONES W, 1984, WORK PROGR, V4, P41 Kean M.-L., 1975, THESIS MIT CAMBRIDGE LINDBLOM B., 1983, PRODUCTION SPEECH, P217 Nolan F, 1983, PHONETIC BASES SPEAK OHALA JJ, UNPUB PAPERS LABORAT, V2 Paradis Carole, 1991, PHONETICS PHONOLOGY, V2, P1 PRENTIS JM, 1980, DYNAMICS MECHANICAL SALTZMAN E, 1987, PSYCHOL REV, V94, P84, DOI 10.1037//0033-295X.94.1.84 Saltzman E. L, 1986, EXPT BRAIN RES SERIE, P129 NR 20 TC 17 Z9 17 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun.
PD OCT PY 1992 VL 11 IS 4-5 BP 393 EP 400 DI 10.1016/0167-6393(92)90044-8 PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JW655 UT WOS:A1992JW65500008 ER PT J AU FARNETANI, E FABER, A AF FARNETANI, E FABER, A TI TONGUE JAW COORDINATION IN VOWEL PRODUCTION - ISOLATED WORDS VERSUS CONNECTED SPEECH SO SPEECH COMMUNICATION LA English DT Article DE INTERARTICULATOR COORDINATION; VOWEL REDUCTION; COARTICULATION; COMPENSATION AB This study investigates the positions of the tongue body and the jaw in the production of Italian vowels /i/ and /a/ in different phonetic, prosodic and utterance contexts, with the aim of assessing the role of and the coordination between the two articulators in the processes of coarticulation, reduction and compensation. The data indicate that, within the same phonetic context, a change in utterance type (from isolated words to words in connected speech) or in lexical stress position (from stressed to unstressed vowels) induces decreased displacement of the jaw and the tongue from their rest position along the high/low dimension for both vowels. Thus prosodic and utterance contexts induce vowel reduction through a decrease in displacement of both jaw and tongue. Variation in vowels as a function of the consonantal context (/t, d, z, ʃ, l/) was observed in jaw displacements only in the front/back dimension: vowels were more fronted when adjacent to fricatives. All the other coarticulatory effects concern tongue body movements and tend to increase, as does reduction, from isolated words to connected speech.
In symmetric VCV sequences extensive compensatory tongue displacements in the back direction were observed during the production of reduced /a/ vowels: thus, if vowel reduction causes a decrease in the articulatory distance between /i/ and /a/ along the high/low dimension, this compensatory tongue movement appears to counteract such effect by increasing the articulatory distance along the front/back dimension. In asymmetric sequences, the V-to-V effects seem to overrule the compensatory movements, and, adding to the reduction effects, cause a further decrease in the articulatory distance between the two vowel types. C1 HASKINS LABS INC, NEW HAVEN, CT 06511 USA. RP FARNETANI, E (reprint author), CNR, CTR FONET, I-35100 PADUA, ITALY. CR BRANDERUD P, 1990, UNPUB MOVETRACK BRANDERUD P, 1985, PERILUS, V4, P20 Browman C. P., 1990, PAPERS LABORATORY PH, P341 BROWMAN CP, 1990, J PHONETICS, V18, P299 CHUANG CK, 1978, JASA S1, V63 EDWARDS J, 1985, THESIS CITY U NEW YO EDWARDS J, 1985, J ACOUST SOC AM, V78, P1944, DOI 10.1121/1.392650 FABER A, 1991, JASA, V89, P1872, DOI 10.1121/1.2029331 FARNETANI E, 1991, 12TH P INT C PHON SC, V2, P14 FARNETANI E, 1990, PHONETICA, V47, P50 FARNETANI E, 1991, PERILUS, V14, P11 Fowler C. A., 1985, SPEECH SCI RECENT AD, P193 LINDBLOM B, 1963, J ACOUST SOC AM, V35, P1773, DOI 10.1121/1.1918816 LINDBLOM B, 1967, STLQPSR41967, P1 LINDBLOM B., 1983, PRODUCTION SPEECH, P217 LINDBLOM B, 1990, NATO ADV SCI I D-BEH, V55, P403 Saltzman E. L., 1989, ECOL PSYCHOL, V1, P333, DOI 10.1207/s15326969eco0104_2 STONE M, 1992, J PHONETICS, V20, P253 NR 18 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD OCT PY 1992 VL 11 IS 4-5 BP 401 EP 410 DI 10.1016/0167-6393(92)90045-9 PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JW655 UT WOS:A1992JW65500009 ER PT J AU SCULLY, C GRABEGEORGES, E CASTELLI, E AF SCULLY, C GRABEGEORGES, E CASTELLI, E TI ARTICULATORY PATHS FOR SOME FRICATIVES IN CONNECTED SPEECH SO SPEECH COMMUNICATION LA English DT Article DE ARTICULATORY PATHS; FRICATIVES; CONNECTED SPEECH; AERODYNAMICS METHODS AB Articulatory paths have been analysed for the fricative [s] produced by a woman speaker of General American English in connected speech and also in phonetically controlled speech-like sequences at a slower speech rate. Aerodynamically derived traces indicating the acoustically relevant parameter of cross-section area of the constricted region of the vocal tract suggested that, contrary to related data for the movements of the solid structures, some portions of the traces for this vocal tract indicator had the same path shape across some different vowel contexts and across different speech styles. In the connected speech, the whole cross-section area path seemed to be invariant across different stressed vowel contexts. The acoustic pattern features associated with the invariant portions of vocal tract articulation, in combination with appropriate respiratory and laryngeal articulations, are discussed. C1 UNIV CAMBRIDGE, DEPT LINGUIST, CAMBRIDGE CB3 9DA, ENGLAND. ECOLE NATL SUPER ELECTR & RADIOELECT, INGP, INST COMMUN PARLEE, F-38031 GRENOBLE, FRANCE. RP SCULLY, C (reprint author), UNIV LEEDS, DEPT PSYCHOL, LEEDS LS2 9JT, W YORKSHIRE, ENGLAND. 
CR Fujimura O., 1986, INVARIANCE VARIABILI, P226 HIXON T, 1966, FOLIA PHONIATR, V18, P168 SCULLY C, 1992, J PHONETICS, V20, P39 SCULLY C, 1987, SPEECH COMMUN, V6, P77, DOI 10.1016/0167-6393(87)90036-7 SCULLY C, 1991, 12TH P INT C PHON SC, V3, P58 SCULLY C, 1991, PERLUS, V14, P69 SCULLY C, 1986, J PHONETICS, V14, P407 SCULLY C, 1990, NATO ADV SCI I D-BEH, V55, P151 NR 8 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1992 VL 11 IS 4-5 BP 411 EP 416 DI 10.1016/0167-6393(92)90046-A PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JW655 UT WOS:A1992JW65500010 ER PT J AU DUEZ, D AF DUEZ, D TI 2ND FORMANT LOCUS-NUCLEUS PATTERNS - AN INVESTIGATION OF SPONTANEOUS FRENCH SPEECH SO SPEECH COMMUNICATION LA English DT Article DE 2ND FORMANT LOCUS-NUCLEUS PATTERNS; SPONTANEOUS SPEECH; LOCUS EQUATION; VOWEL DURATION; REDUCTION PROCESSES ID SPEAKING RATE AB Data on F2 measured at the consonant-vowel boundary and in the vowel nucleus (so-called locus-nucleus patterns of vowels) were analysed in CV sequences in spontaneous speech in French and were compared with the results obtained for different languages such as Swedish. It was found that the F2 locus-nucleus differences were smaller in spontaneous speech than in reference words, in non-prominent syllables than in prominent syllables, and in given words than in new words. These results were interpreted as reflecting differences in coarticulation: an anticipatory effect of the vowel on the preceding consonant and/or formant undershoot. F2 locus-nucleus patterns appeared to depend on duration, confirming Lindblom's model (1963); they may also be influenced by other factors such as speaker adaptations to the communicative situation. C1 UNIV STOCKHOLM, INST LINGUIST, S-10691 STOCKHOLM, SWEDEN.
RP DUEZ, D (reprint author), INST PHONET AIX PROVENCE, CNRS, URA 261, 29 AVE ROBERT SCHUMAN, F-13621 AIX EN PROVENCE, FRANCE. CR Duez D., 1991, PAUSE PAROLE HOMME P ENGSTRAND O, 1988, J ACOUST SOC AM, V83, P1863, DOI 10.1121/1.396522 ENGSTRAND O, 1989, P SPEECH RES 89 BUDA, P88 GAY T, 1978, J ACOUST SOC AM, V63, P223, DOI 10.1121/1.381717 Koopmans-van Beinum F. J., 1989, Eurospeech 89. European Conference on Speech Communication and Technology Krull D., 1987, PHONETIC EXPT RES I, V5, P43 Krull D., 1989, PERILUS, VX, P87 Kuehn David P., 1976, J PHONETICS, V4, P303 LINDBLOM B, 1963, J ACOUST SOC AM, V35, P1773, DOI 10.1121/1.1918816 LINDBOLM B, 1988, PERILUS, V8, P20 NORD L, 1986, STL QPSR, V4, P19 SHOCKEY L, 1973, THESIS OHIO STATE U Vaissiere Jacqueline, 1983, PROSODY MODELS MEASU, P53 van Bergem D. R., 1989, Eurospeech 89. European Conference on Speech Communication and Technology VANSON RJJH, 1990, J ACOUST SOC AM, V88, P1683, DOI 10.1121/1.400243 NR 15 TC 20 Z9 21 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1992 VL 11 IS 4-5 BP 417 EP 427 DI 10.1016/0167-6393(92)90047-B PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JW655 UT WOS:A1992JW65500011 ER PT J AU HARMEGNIES, B POCHOLIVE, D AF HARMEGNIES, B POCHOLIVE, D TI A STUDY OF STYLE-INDUCED VOWEL VARIABILITY - LABORATORY VERSUS SPONTANEOUS SPEECH IN SPANISH SO SPEECH COMMUNICATION LA English DT Article DE SPEECH STYLES; SPONTANEOUS SPEECH; CONNECTED SPEECH; VOWEL SYSTEM; SPANISH AB The paper focuses on the vocalic differences between spontaneous and laboratory speech in Spanish. The first and second formants of 954 vowel utterances (477 in laboratory and 477 in spontaneous speech) have been measured. They constitute clusters in the F1/F2 space. The paper describes inter- and intra-cluster variabilities caused by changes in the communication situation.
In spontaneous speech, the formant values show (1) a marked schwa-tendency and (2) increasing intra-cluster variability. Both phenomena result in lowered differentiation of the sounds in spontaneous speech. C1 UNIV AUTONOMA BARCELONA, DEPT FILOL ESPANOLA, BARCELONA, SPAIN. RP HARMEGNIES, B (reprint author), UNIV MONS HAINAUT, SERV COMMUN PARLEE, AV CHAMP MARS, CH 2, B-7000 MONS, BELGIUM. CR Borzone de Manrique Ana Maria, 1980, MANUAL FONETICA ACUS CASAS RM, 1980, ASPECTOS FONETICOS V CHOLLET CF, 1976, 92ND M AC SOC AM SAN DELATTRE P, 1969, IRAL-INT REV APPL LI, V7, P295, DOI 10.1515/iral.1969.7.4.295 DEMANRIQUE AMB, 1980, MANUAL FONETICA DENOS E, 1985, PRIPU, V10, P3 DUEZ D, 1991, PERILUS, V12, P109 ENGSTRAND O, 1989, PERILUS, V10, P1 KOHLER K, 1990, SPEECH PRODUCTION MO KOOPMANSVANBEIN.F, 1980, VOVEL CONTRAST REDUC Krull D., 1991, PERILUS, VXII, P101 LACERDA A, 1942, ANEJOS REV FILOLOGIA, V32 LINDBLOM B, 1963, J ACOUST SOC AM, V35, P1771 LINDBLOM B, 1990, NATO ADV SCI I D-BEH, V55, P403 LOBACZ P, 1976, SPEECH ANAL SYNTHESI, V4, P186 Navarro Tomas T., 1918, MANUAL PRONUNCIACION Nooteboom S., 1972, PRODUCTION PERCEPTIO NORD L, 1986, STL QPSR, V4, P19 NORD L, 1974, SPEECH COMMUNICATION POCHOLIVE D, 1992, J PHYSIQUE 3 S, V2, P286 POCHOLIVE D, 1989, 19TH S SOC ESP LING, V20, P222 QUILIS A, 1983, ESTUDIOS FONETICA, V1 Roach P, 1992, INTRO PHONETICS STALHAMMAR U, 1973, STL QPSR, V4, P1 STEVENS KN, 1963, J SPEECH HEAR RES, V6, P111 TOMAS TN, 1917, REV FILOL ESPAN, V4, P371 TOMAS TN, 1916, REV FILOL ESPAN, V3, P387 NR 27 TC 5 Z9 5 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun.
PD OCT PY 1992 VL 11 IS 4-5 BP 429 EP 437 DI 10.1016/0167-6393(92)90048-C PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JW655 UT WOS:A1992JW65500012 ER PT J AU KOOPMANSVANBEINUM, FJ AF KOOPMANSVANBEINUM, FJ TI THE ROLE OF FOCUS WORDS IN NATURAL AND IN SYNTHETIC CONTINUOUS SPEECH - ACOUSTIC ASPECTS SO SPEECH COMMUNICATION LA English DT Article DE SPONTANEOUS SPEECH; READ ALOUD SPEECH; TEXT-TO-SPEECH DIPHONE SYNTHESIS; FOCUS WORDS AB In everyday communicative situations not all parts of the spoken message are pronounced equally clearly. Especially words bearing a high load of semantic information are put in focus by the speaker. The question of how this is realized in natural spontaneous and read speech, and whether resulting knowledge can be applied in synthetic speech to improve naturalness and acceptability, is the subject of this study. By introducing a ''peak-and-level'' model we examined spectral and temporal aspects in focus and non-focus words from spontaneous speech material and from the same texts, read out after orthographic transcription. Audio recordings were made of a professional male speaker, whose voice and pronunciation also served as a model for the diphone-based component of the Dutch national speech synthesis program. For a number of acoustic parameters it can be concluded that there is a clear difference, both in ''peak values'' and in ''level values'', between the two natural speech styles, but that the peak values display comparable contrasts to the level values in both styles. The results of our measurements in natural speech were compared to the data of the same texts synthesized by the Dutch diphone text-to-speech system. In a pilot experiment, varying temporal aspects in the synthesized speech, listeners were asked to judge the naturalness and intelligibility in order to determine the starting-point for future evaluation of text-to-speech synthesis including peak-and-level contrasts.
RP KOOPMANSVANBEINUM, FJ (reprint author), UNIV AMSTERDAM, INST PHONET SCI, HERENGRACHT 338, 1016 CG AMSTERDAM, NETHERLANDS. CR DRULLMAN R, 1991, J ACOUST SOC AM, V90, P1766, DOI 10.1121/1.401657 DRULLMAN R, 1990, SPIN ASSP27 REP KOOPMANSVANBEIN.FJ, 1991, P EUROSPEECH 91 GENO, V5, P1459 KOOPMANSVANBEIN.FJ, 1991, 1991 P ESCA ETRW PHO, P36 KOOPMANSVANBEIN.FJ, 1990, 1990 P INT C SPOK LA, V1, P21 KOOPMANSVANBEIN.FJ, 1980, THESIS U AMSTERDAM LINDBLOM BEF, 1963, J ACOUST SOC AM, V305, P1773 QUENE H, 1990, SPIN ASSP REPORT, V17 REETZ H, 1989, P EUROSPEECH PARIS, V1, P476 VANBERGEN DR, 1991, P EUROSPEECH 91 GENO, V5, P1455 VANLEEUWEN HC, 1991, P INT C ACOUST SPEEC, V2, P781 NR 11 TC 8 Z9 8 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1992 VL 11 IS 4-5 BP 439 EP 452 DI 10.1016/0167-6393(92)90049-D PG 14 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JW655 UT WOS:A1992JW65500013 ER PT J AU BRUCE, G TOUATI, P AF BRUCE, G TOUATI, P TI ON THE ANALYSIS OF PROSODY IN SPONTANEOUS SPEECH WITH EXEMPLIFICATION FROM SWEDISH AND FRENCH SO SPEECH COMMUNICATION LA English DT Article DE PROSODY; SWEDISH AND FRENCH; SPONTANEOUS AND READ SPEECH; POLITICAL RHETORIC AB This paper reports on a research project concerning prosody in spontaneous speech. Two questions inaugurate the project. The first one concerns prosodic differences between spontaneous speech and read speech. Evidence from Swedish shows that these differences are not fundamental. The second question concerns the relationship between prosody and discourse categories. A methodology has been developed in order to study this relationship. 
Four different kinds of analyses are applied: (1) analysis of the discourse structure of the speech corpus without specific reference to prosodic information, (2) auditory analysis in the form of a prosody-oriented transcription, (3) acoustic-phonetic analysis and (4) analysis-by-synthesis. Part of this analysis is illustrated with exemplification from a persuasive monologue in French political rhetoric. Focal accent and contrast in pitch range seem to account for typical prosodic means used during political debate. RP BRUCE, G (reprint author), DEPT LINGUIST & PHONET, HELGONABACKEN 12, S-22362 LUND, SWEDEN. CR BRUCE G, 1987, NORDIC PROSODY, V4, P41 BRUCE G, 1990, NORDIC PROSODY, V5, P36 BRUCE G, 1985, 1985 P FRENCH SWED S, P549 BRUCE G, 1990, 1990 P ICSLP 90 KOB, V1, P489 Bruce G., 1982, WORKING PAPERS LUND, V22, P51 Bruce Gosta, 1977, SWEDISH WORD ACCENTS Bruce Gosta, 1990, WORKING PAPERS, V36, P37 DUEZ D, 1987, THESIS U PROVENCE AI FONAGY I, 1982, PRAGMATICS BEYOND, P1 GARDING E, 1981, STUD LINGUISTICA, V305, P146 GARDING E, 1982, PHONETICA, V39, P288 LEON P, 1971, STUDIA PHONETICA, V4, P131 Lucci V., 1983, ETUDE PHONETIQUE FRA NIR R, 1988, LANG LEARN, V38, P187, DOI 10.1111/j.1467-1770.1988.tb00408.x TOUATI P, 1991, PERILUS, V13, P53 Touati P., 1987, STRUCTURES PROSODIQU NR 16 TC 7 Z9 7 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD OCT PY 1992 VL 11 IS 4-5 BP 453 EP 458 DI 10.1016/0167-6393(92)90050-H PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JW655 UT WOS:A1992JW65500014 ER PT J AU GRANSTROM, B NORD, L AF GRANSTROM, B NORD, L TI NEGLECTED DIMENSIONS IN SPEECH SYNTHESIS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH SYNTHESIS; SPEAKING STYLE; TEXT-TO-SPEECH; INTENSITY; LOUDNESS AB In traditional accounts on speech prosody, fundamental frequency, duration and intensity have been described as the most important attributes. Among these, intensity has attracted the least attention. In perceptual studies both F0 and duration have had an undisputable role in signalling prosodic categories, but the role of intensity has been less clear. This has resulted in an emphasis on the former attributes in current speech synthesis schemes. We are in this study exploring the use of speech intensity and also other segmental correlates of prosody. Intensity has a dynamic aspect, discriminating emphasized and reduced stretches of speech. A more global aspect of intensity must be controlled when we try to model different speaking styles. Specifically, we have been trying to model the continuum from soft to loud speech. RP GRANSTROM, B (reprint author), ROYAL INST TECHNOL, DEPT SPEECH COMMUN & MUS ACOUST, BOX 70014, S-10044 STOCKHOLM 70, SWEDEN. CR CARLSON R, 1990, ESCA WORKSHOP SPEAKE, P28 CARLSON R, 1980, ADV SPEECH HEARING L Fant G., 1985, STL QPSR FANT G, 1986, STL QPSR, P1 NR 4 TC 9 Z9 9 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD OCT PY 1992 VL 11 IS 4-5 BP 459 EP 462 DI 10.1016/0167-6393(92)90051-8 PG 4 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JW655 UT WOS:A1992JW65500015 ER PT J AU SWERTS, M COLLIER, R AF SWERTS, M COLLIER, R TI ON THE CONTROLLED ELICITATION OF SPONTANEOUS SPEECH SO SPEECH COMMUNICATION LA English DT Article DE ELICITATION METHODS; SPONTANEOUS SPEECH; PROSODY; INTONATION ID PITCH AB This paper presents an elicitation method that allows the controlled analysis of properties of spontaneous speech. The method is characterized by the fact that a speaker is constrained by the experimenter with respect to what he will say, while keeping his speech output spontaneous. As an example of the potential of this technique, a pilot study is presented in which such spontaneous discourse was used in prosodic research. The purpose of this simple experiment was to find out whether the endings of large-scale units in spontaneous discourse have particular melodic correlates that are communicatively relevant. RP SWERTS, M (reprint author), INST PERCEPT RES, POB 513, 5600 MB EINDHOVEN, NETHERLANDS. RI Swerts, Marc/C-8855-2013 CR Brown Gillian, 1980, QUESTIONS INTONATION Bruce Gosta, 1990, WORKING PAPERS, V36, P37 HERMES DJ, 1988, J ACOUST SOC AM, V83, P257, DOI 10.1121/1.396427 Lehiste I., 1975, STRUCTURE PROCESS SP, P195 Levelt W. J., 1989, SPEAKING INTENTION A LEVELT WJM, 1983, COGNITION, V14, P41, DOI 10.1016/0010-0277(83)90026-4 SWERTS M, 1991, J ACOUST SOC AM, V90, P2344, DOI 10.1121/1.402181 TERKEN JMB, 1984, LANG SPEECH, V27, P269 THORSEN NG, 1985, J ACOUST SOC AM, V77, P1205, DOI 10.1121/1.392187 NR 9 TC 14 Z9 14 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD OCT PY 1992 VL 11 IS 4-5 BP 463 EP 468 DI 10.1016/0167-6393(92)90052-9 PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JW655 UT WOS:A1992JW65500016 ER PT J AU GREISBACH, R AF GREISBACH, R TI READING ALOUD AT MAXIMAL SPEED SO SPEECH COMMUNICATION LA English DT Article DE REDUCTION RULES; FAST SPEECH AB Experiments with 8 native speakers of German showed that after a short preparation everyone was able to read aloud at maximal speed, even if the speed attained varied. The known reduction rules for connected speech in German are not violated and new farther reaching reductions can be observed. So reading aloud at maximal speed is an appropriate method to elicit phenomena of very fast speech, which may also occur in spontaneous speech. RP GREISBACH, R (reprint author), UNIV COLOGNE, INST PHONET, GREINSTR 2, W-5000 COLOGNE 41, GERMANY. CR Dedenbach Beate, 1987, REDUKTIONS VERSCHMEL GNUTZMANN C, 1973, ARBEITSBERICHTE, P66 GNUTZMANN C, 1975, THESIS KIEL Kohler KJ, 1977, EINFUHRUNG PHONETIK KOHLER KJ, 1973, ARBEITSBERICHTE, P55 KOHLER KJ, 1990, NATO ADV SCI I D-BEH, V55, P69 1982, GROSSES WORTERBUCH D NR 7 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1992 VL 11 IS 4-5 BP 469 EP 473 DI 10.1016/0167-6393(92)90053-A PG 5 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JW655 UT WOS:A1992JW65500017 ER PT J AU ROACH, P SERGEANT, P MILLER, D AF ROACH, P SERGEANT, P MILLER, D TI SYLLABIC CONSONANTS AT DIFFERENT SPEAKING RATES - A PROBLEM FOR AUTOMATIC SPEECH RECOGNITION SO SPEECH COMMUNICATION LA English DT Article DE SYLLABICITY; TIMING; DURATION; RHYTHM; AUTOMATIC SPEECH RECOGNITION AB A familiar aspect of English pronunciation is the occurrence of syllabic consonants. 
It is common to treat consonantal syllabicity as a consequence of vowel elision, implying that, for example, the pronunciation of ''button'' [GRAPHICS]. Since elision is a phenomenon that is subject to the influence of speaking style, it would seem to follow that in rapid or casual speech we should expect to find more cases of syllabic consonants and fewer cases of unstressed vowels followed by continuant consonants. This paper sets out to show that the phenomenon is not this simple: we look at problems that confront our attempts at the automatic recognition of syllables and other sub-word units, and consider phonotactic and phonetic factors that may help to resolve them. C1 UNIV SHEFFIELD, DEPT INFORMAT STUDIES, SHEFFIELD S10 2TN, S YORKSHIRE, ENGLAND. RP ROACH, P (reprint author), UNIV LEEDS, DEPT PSYCHOL, SPEECH LAB, LEEDS LS2 9JT, W YORKSHIRE, ENGLAND. CR Chomsky N., 1968, SOUND PATTERN ENGLIS GREEN PD, 1990, P I ACOUSTICS, V12, P249 JONES D, 1959, Z PHONETIK, V12, P136 ROACH PJ, 1990, J INT PHON ASSOC, V20, P15 ROACH PJ, 1991, 12 P INT C PHON SCI, V4, P482 Roachs P., 1991, ENGLISH PHONETICS PH WELLS JC, 1965, PHONETICA, V13, P110 NR 7 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1992 VL 11 IS 4-5 BP 475 EP 479 DI 10.1016/0167-6393(92)90054-B PG 5 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JW655 UT WOS:A1992JW65500018 ER PT J AU GOBL, C CHASAIDE, AN AF GOBL, C CHASAIDE, AN TI ACOUSTIC CHARACTERISTICS OF VOICE QUALITY SO SPEECH COMMUNICATION LA English DT Article DE VOICE QUALITY; VOICE SOURCE; INVERSE FILTERING ID FEMALE AB Subtle variations in voice (phonatory) quality may reveal aspects of the speaker's mood and attitude, and are thus an important aspect of speaking style. 
This paper illustrates current research being carried out by the authors on the voice source correlates of a range of such quality differences. The voice qualities looked at include modal, breathy, whispery, tense, lax and creaky voice, following the description of Laver (1980), and the analyses presented here focus on a word extracted from a prose passage read with these qualities. The principal method used for analysing the voice source involved inverse filtering of the speech wave. In order to quantify the source characteristics, a four parameter voice source model (the LF-model) was matched to the inverse filtered waveform. Frequency domain analyses of the speech waveform, based on narrow band spectral sections and on spectral averaging, were also carried out. Detailed comparisons of the data measured directly from the glottal waveform and those measured from the speech output yield insights which could not be inferred from either alone. Results suggest a number of important differences between the qualities as well as considerable dynamic variation within a single quality. The data should also prove useful for resynthesis, which is an important tool for testing perceptual aspects of voice quality; e.g., the attitudinal and emotional colouring that may be associated with particular voice qualities. C1 ROYAL INST TECHNOL, DEPT SPEECH COMMUN & MUS ACOUST, S-10044 STOCKHOLM 70, SWEDEN. RP GOBL, C (reprint author), UNIV DUBLIN TRINITY COLL, CTR LANGUAGE & COMMUN STUDIES, DUBLIN 2, IRELAND. CR ABERCROMBIE D, 1967, ELEMENTS GENERAL PHO BICKLEY CA, 1986, J PHONETICS, V14, P373 Catford J.
C., 1964, HONOUR D JONES, P26 Daniloff R, 1980, PHYSL SPEECH HEARING DIEHL CF, 1968, INTRO ANATOMY PHYSL FANT G, 1979, STL QPSR, V3, P31 Fant Gunnar, 1985, STL QPSR, V4, P1 FRITZELL B, 1986, J PHONETICS, V14, P549 GOBL C, 1988, STL QPSR, V1, P123 Gobl C., 1989, STL QPSR, P9 GOBL C, 1988, STL QPSR, V2, P23 Hardcastle W., 1976, PHYSL SPEECH PRODUCT Kaplan H, 1960, ANATOMY PHYSL SPEECH KARLSSON I, 1992, SPEECH COMMUN, V11, P491, DOI 10.1016/0167-6393(92)90056-D KASUYA H, 1989, VOCAL FOLD PHYSL ACO KLATT DH, 1990, J ACOUST SOC AM, V87, P820, DOI 10.1121/1.398894 Laver J, 1980, PHONETIC DESCRIPTION LEE CK, 1989, VOCAL FOLD PHYSL ACO PALMER JM, 1965, ANATOMY SPEECH HEARI PRICE PJ, 1989, SPEECH COMMUN, V8, P261, DOI 10.1016/0167-6393(89)90005-8 Sundberg J, 1987, SCI SINGING VOICE ZEMLIN WR, 1981, SPEECH HEARING SCI A NR 22 TC 26 Z9 26 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1992 VL 11 IS 4-5 BP 481 EP 490 DI 10.1016/0167-6393(92)90055-C PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JW655 UT WOS:A1992JW65500019 ER PT J AU KARLSSON, I AF KARLSSON, I TI MODELING VOICE VARIATIONS IN FEMALE SPEECH SYNTHESIS SO SPEECH COMMUNICATION LA English DT Article DE VOICE SOURCE; VOICE QUALITY; VOICE VARIATIONS; FEMALE SPEECH; FEMALE SPEECH SYNTHESIS; HIGH FIDELITY SPEECH SYNTHESIS; SYNTHETIC VOICE SOURCE AB The voice source is an important factor in the production of different voice qualities. These different voice qualities are used in speech to convey, among other things, different suprasegmental aspects, e.g., emphasis, phrase boundaries and also different speaking styles such as an authoritative or a submissive voice. Voice source variations are also an important means of conveying extralinguistic information of various kinds in ordinary speech. 
In the present study, voice source variations in normal speech by female speakers have been investigated using inverse filtering. The results of the inverse filtering are given in voice source parameters appropriate for controlling speech synthesis. Accordingly, the resulting descriptions have been utilized to produce voice variations in our new synthesis system. RP KARLSSON, I (reprint author), KUNGLICA TEKN HOGSKOLAN, DEPT SPEECH COMMUN & MUS ACOUST, BOX 70014, S-10044 STOCKHOLM, SWEDEN. CR CARLSON R, 1991, SPEECH COMMUN, V10, P481, DOI 10.1016/0167-6393(91)90051-T FANT G, 1985, SPEECH TRANSMISSION, P1 GOBL C, 1991, VOCAL FODL PHYSL ACO, P121 GOBL C, 1988, STL QPSR, V1, P123 Karlsson I., 1990, 1990 P INT C SPOK LA, P69 KARLSSON I, 1988, 1988 P SPEECH 88 7TH, P225 KLATT DH, 1990, J ACOUST SOC AM, V87, P820, DOI 10.1121/1.398894 LIN Q, 1990, THESIS KTH STOCKHOLM NR 8 TC 20 Z9 20 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1992 VL 11 IS 4-5 BP 491 EP 495 DI 10.1016/0167-6393(92)90056-D PG 5 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JW655 UT WOS:A1992JW65500020 ER PT J AU MODENA, G PIRANI, G AF MODENA, G PIRANI, G TI SPECIAL ISSUE ON EUROSPEECH 91 SO SPEECH COMMUNICATION LA English DT Editorial Material C1 CTR STUDI & LAB TELECOMUN SPA, ORG UNIT INT COLLABORAT & STANDARDIZAT, TURIN, ITALY. RP MODENA, G (reprint author), CTR STUDI & LAB TELECOMUN SPA, DIV SUBSCRIBER TERMINALS, TURIN, ITALY. NR 0 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUN PY 1992 VL 11 IS 2-3 BP 95 EP 97 PG 3 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400002 ER PT J AU VANOOYEN, B CUTLER, A NORRIS, D AF VANOOYEN, B CUTLER, A NORRIS, D TI DETECTION OF VOWELS AND CONSONANTS WITH MINIMAL ACOUSTIC VARIATION SO SPEECH COMMUNICATION LA English DT Article DE SPEECH PERCEPTION; PHONEME DETECTION; VOWELS; CONSONANTS; SEMIVOWELS ID CONTEXT AB Previous research has shown that, in a phoneme detection task, vowels produce longer reaction times than consonants, suggesting that they are harder to perceive. One possible explanation for this difference is based upon their respective acoustic/articulatory characteristics. Another way of accounting for the findings would be to relate them to the differential functioning of vowels and consonants in the syllabic structure of words. In this experiment, we examined the second possibility. Targets were two pairs of phonemes, each containing a vowel and a consonant with similar phonetic characteristics. Subjects heard lists of English words and had to press a response key upon detecting the occurrence of a pre-specified target. This time, the phonemes which functioned as vowels in syllabic structure yielded shorter reaction times than those which functioned as consonants. This rules out an explanation for response time difference between vowels and consonants in terms of function in syllable structure. Instead, we propose that consonantal and vocalic segments differ with respect to variability of tokens, both in the acoustic realisation of targets and in the representation of targets by listeners. RP VANOOYEN, B (reprint author), MRC, APPL PSYCHOL UNIT, 15 CHAUCER RD, CAMBRIDGE CB2 2EF, ENGLAND.
RI Cutler, Anne/C-9467-2012 CR ADES AE, 1977, PSYCHOL REV, V84, P524, DOI 10.1037//0033-295X.84.6.524 BOND ZS, 1980, PERCEPTION PRODUCTIO COWAN N, 1986, J ACOUST SOC AM, V79, P500, DOI 10.1121/1.393537 CRYSTAL TH, 1988, J ACOUST SOC AM, V83, P1553, DOI 10.1121/1.395911 CUTLER A, 1990, P INT C SPOK LANG PR, V1, P581 FOSS DJ, 1969, J VERB LEARN VERB BE, V8, P457, DOI 10.1016/S0022-5371(69)80089-7 FRAUENFELDER UH, 1989, MEM COGNITION, V17, P134, DOI 10.3758/BF03197063 HAKES DT, 1971, PERCEPT PSYCHOPHYS, V10, P229, DOI 10.3758/BF03212810 Ladefoged P., 1982, COURSE PHONETICS, V2nd Liberman A. M., 1954, PSYCHOL MONOGRAPHS, V68 LIBERMAN AM, 1967, PSYCHOL REV, V74, P431, DOI 10.1037/h0020279 MEHLER J, 1981, J VERB LEARN VERB BE, V20, P298, DOI 10.1016/S0022-5371(81)90450-3 RAKERD B, 1984, J ACOUST SOC AM, V76, P27, DOI 10.1121/1.391114 STRANGE W, 1979, J EXP PSYCHOL HUMAN, V5, P643, DOI 10.1037//0096-1523.5.4.643 STRANGE W, 1976, J ACOUST SOC AM, V60, P213, DOI 10.1121/1.381066 VANOOIJEN B, 1991, P EUROSPEECH 91, V3, P1451 NR 16 TC 5 Z9 5 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1992 VL 11 IS 2-3 BP 101 EP 108 DI 10.1016/0167-6393(92)90004-Q PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400003 ER PT J AU ALKU, P AF ALKU, P TI GLOTTAL WAVE ANALYSIS WITH PITCH SYNCHRONOUS ITERATIVE ADAPTIVE INVERSE FILTERING SO SPEECH COMMUNICATION LA English DT Article DE GLOTTAL WAVE ANALYSIS; INVERSE FILTERING AB A new glottal wave analysis method, Pitch Synchronous Iterative Adaptive Inverse Filtering (PSIAIF) is presented. The algorithm is based on a previously developed method, Iterative Adaptive Inverse Filtering (IAIF). In the IAIF-method the glottal contribution to the speech spectrum is first estimated with an iterative structure.
The vocal tract transfer function is modeled after eliminating the average glottal contribution. The glottal excitation is obtained by cancelling the effects of the vocal tract and lip radiation by inverse filtering. In the new PSIAIF-method the glottal pulseform is computed by applying the IAIF-algorithm twice to the same signal. The first IAIF-analysis gives as a result a glottal excitation that spans over several pitch periods. This pulseform is used in order to determine positions and lengths of frames for the pitch synchronous analysis. The final result is obtained by analysing the original speech signal with the IAIF-algorithm one fundamental period at a time. The PSIAIF-algorithm was applied in glottal wave analysis using both synthetic and natural vowels. The results show that the method is able to give a fairly accurate estimate for the glottal flow excluding the analysis of vowels with a low first formant that are produced with a pressed phonation type. RP ALKU, P (reprint author), HELSINKI UNIV TECHNOL, ACOUST LAB, OTAKAARI 5A, SF-02150 ESPOO, FINLAND. RI Alku, Paavo/E-2400-2012 CR ALKU P, 1991, 12E C INT SCI PHON A, V4, P362 ALKU P, 1990, 1ST P INT C SPOK LAN, P197 Ananthapadmanabha T., 1984, 2 ROYAL I TECHN SPEE, P1 GOLD B, 1968, IEEE T ACOUST SPEECH, VAU16, P81, DOI 10.1109/TAU.1968.1161954 HUNT MJ, 1978, IEEE T ACOUST SPEECH, P15 KARJALAINEN M, 1988, 1988 P IEEE INT C AC, P1682 MAKHOUL J, 1975, P IEEE, V63, P561, DOI 10.1109/PROC.1975.9792 Markel JD, 1976, LINEAR PREDICTION SP WONG DY, 1979, IEEE T ACOUST SPEECH, V27, P350, DOI 10.1109/TASSP.1979.1163260 NR 9 TC 128 Z9 132 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUN PY 1992 VL 11 IS 2-3 BP 109 EP 118 DI 10.1016/0167-6393(92)90005-R PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400004 ER PT J AU DENBIGH, PN ZHAO, J AF DENBIGH, PN ZHAO, J TI PITCH EXTRACTION AND SEPARATION OF OVERLAPPING SPEECH SO SPEECH COMMUNICATION LA English DT Article DE SPEECH SEPARATION; PITCH EXTRACTION AB This paper describes two algorithms for separating two overlapping speech signals. Both rely heavily on accurate measurements of the pitch of the target voice. The first uses a single microphone, and an important feature is the exploitation of the onset of a voiced sound as an aid to the extraction of its pitch in the presence of interference. The second uses two microphones, and an important feature is that it also makes use of the direction of the target voice. RP DENBIGH, PN (reprint author), UNIV SUSSEX, SCH ENGN & APPL SCI, BRIGHTON BN1 9QT, ENGLAND. CR DECHEVEIGNE A, 1991, P ESCA EUROSPEECH GE, P445 FRAZIER RH, 1976, IEEE ICASSP, P251 GU Y, 1991, P 2 EUR C SPEECH COM, P453 HERMES DJ, 1988, J ACOUST SOC AM, V83, P257, DOI 10.1121/1.396427 Nawab S., 1988, ADV TOPICS SIGNAL PR, P289 PETERSON PM, 1986, J ACOUST SOC AM, V80, P1527, DOI 10.1121/1.394357 STUBBS RJ, 1990, J ACOUST SOC AM, V87, P359, DOI 10.1121/1.399257 ZHAO J, 1990, P I ACOUST, V10, P515 NR 8 TC 14 Z9 15 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUN PY 1992 VL 11 IS 2-3 BP 119 EP 125 DI 10.1016/0167-6393(92)90006-S PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400005 ER PT J AU JAYANT, NS JOHNSTON, JD SHOHAM, Y AF JAYANT, NS JOHNSTON, JD SHOHAM, Y TI CODING OF WIDE-BAND SPEECH SO SPEECH COMMUNICATION LA English DT Article DE DIGITAL CODING; PERCEPTUAL CODING; WIDE-BAND SPEECH; CODE-EXCITED LINEAR PREDICTION; SUBBAND CODING; TRANSFORM CODING; MODIFIED DISCRETE COSINE TRANSFORM; QUADRATURE MIRROR FILTERBANK; NOISE SHAPING; MASKING THRESHOLD AB The technologies of ISDN teleconferencing, CD-ROM multimedia services, and High Definition Television are creating new opportunities and challenges for the digital coding of wideband audio signals, wideband speech in particular. In the coding of wideband speech, an important point of reference is the CCITT standard for 7 kHz speech at a rate of 64 kbit/s. Results of recent research are pointing to better capabilities: higher signal bandwidth at 64 kbit/s, and 7 kHz bandwidth at lower bit-rates such as 32 and 16 kbit/s. The coding of audio with a signal bandwidth of 20 kHz is receiving significant attention due to recent activity in the ISO (International Standards Organization), with a goal of storing a CD-grade monophonic audio channel at a bit-rate not exceeding 128 kbit/s. Prospects for accomplishing this are very good. As a side result, emerging algorithms will offer very attractive options at lower rates such as 96 and 64 kbit/s. As we address new challenges in wideband speech technology, several strides in coding research are likely to occur. Among these are refinements of existing models for auditory noise-masking, and a unification of linear prediction and frequency-domain coding. RP JAYANT, NS (reprint author), AT&T BELL LABS, SIGNAL PROC RES DEPT, 600 MT AVE, MURRAY HILL, NJ 07974 USA. 
CR ATAL BS, 1984, 1984 P ICC AMST, P1610 BRANDENBURG KH, 1987, 1987 P INT C AC SPEE, P141 CHEN JH, 1990, 1990 P INT C AC SPEE FLANAGAN JL, 1991, ACUSTICA, V73, P58 HELLMAN RP, 1972, PERCEPT PSYCHOPHYS, V11, P241, DOI 10.3758/BF03206257 Jayant N. S., 1984, DIGITAL CODING WAVEF JOHNSTON JD, 1988, IEEE J SEL AREA COMM, V6, P314, DOI 10.1109/49.608 KRASNER MA, 1979, MIT535 LINC LAB TECH LAFLAMME C, 1991, MAY P INT C AC SPEEC, P13 MERMELSTEIN P, 1988, IEEE COMMUN MAG JAN, P8 MIYOSHI M, 1988, IEEE T ACOUST SPEECH, V36, P145, DOI 10.1109/29.1509 MODENA G, 1986, 1986 P GLOB MUSMANN HG, 1990, 1990 P IEEE GLOB ORDENTLICH E, 1991, 1991 P INT C AC SPEE PRINCEN J, 1987, 1987 P INT C AC SPEE, P2161 QUACKENBUSH SR, 1991, 1991 P INT C AC SPEE Scharf B., 1970, F MODERN AUDITORY TH, V1, P159 SCHROEDER MR, 1979, J ACOUST SOC AM, V66, P1647, DOI 10.1121/1.383662 THEILE G, 1988, EBU TECH REV, P71 NR 19 TC 6 Z9 6 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1992 VL 11 IS 2-3 BP 127 EP 138 DI 10.1016/0167-6393(92)90007-T PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400006 ER PT J AU FULDSETH, A HARBORG, E JOHANSEN, FT KNUDSEN, JE AF FULDSETH, A HARBORG, E JOHANSEN, FT KNUDSEN, JE TI WIDE-BAND SPEECH CODING AT 16 KBIT/S FOR A VIDEOPHONE APPLICATION SO SPEECH COMMUNICATION LA English DT Article DE WIDE-BAND SPEECH CODING; CELP AB This paper presents results on wideband 7 kHz speech coding at 16 kbit/s where the proposed CELP algorithm is implementable on a single floating point DSP. As a basic coder structure, the long-term predictor is implemented as an adaptive codebook, while a sparse Gaussian codebook with non-overlapping vectors is used for the stochastic excitation. In order to meet the complexity requirements, several methods for efficient codebook search are adopted. 
With these methods, it is shown that the computational effort for the basic coder structure can be reduced to 12.4 MIPS with a 7 bit stochastic codebook. A two-stage hierarchical search through the adaptive codebook is investigated. This search method reduces the computational effort further although at the cost of a small degradation in coder performance. The coder is evaluated in an absolute category rating (MOS) test using both a hi-fi handset and a loudspeaker, and compared to the CCITT standard coder G.722 at 48-64 kbit/s. The speech quality with the basic CELP structure is judged to be comparable to the G.722 coder at 48 kbit/s. C1 NORWEGIAN TELECOM RES, N-2007 KJELLER, NORWAY. RP FULDSETH, A (reprint author), SINTEF DELAB, N-7034 TRONDHEIM, NORWAY. CR Adoul J., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) ATAL BS, 1984, P IEEE INT C COMMUNI CHEN JH, 1990, P IEEE ICASSP APR, P453 DEJACOVO RD, 1989, P INT C ACOUST SPEEC, P192 KLEIJN WB, 1990, IEEE T ACOUST SPEECH, V38, P1330, DOI 10.1109/29.57568 Laflamme C., 1990, P ICASSP, P177 MARKEL JD, 1976, LINEAR PREDICTION SP, P154 MARKEL JD, 1980, IEEE T ACOUST SPEECH, V28, P574 Rose R. C., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) TRANCOSO IM, 1986, P INT C ACOUST SPEEC, P2379 NR 10 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUN PY 1992 VL 11 IS 2-3 BP 139 EP 148 DI 10.1016/0167-6393(92)90008-U PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400007 ER PT J AU SECKER, P PERKIS, A AF SECKER, P PERKIS, A TI JOINT SOURCE AND CHANNEL TRELLIS CODING OF LINE SPECTRUM PAIR PARAMETERS SO SPEECH COMMUNICATION LA English DT Article DE JOINT SOURCE AND CHANNEL CODING; LINE SPECTRUM PAIR; TRELLIS CODING; SPEECH CODING ID DESIGN AB An innovative method of coding Line Spectrum Pair (LSP) parameters for transmission over noisy channels is presented. Typically, low bit-rate speech coders use these parameters to convey perceptually important spectral information. Thus it is necessary that these parameters are not only efficiently quantized but preserved during transmission. The scheme uses a joint source and channel coding technique applied to a concatenated trellis structure. Operating as a source coder, the system encodes below the 1 dB spectral distortion limit. It is shown that by modifying the encoder cost function to include the expected channel distortion, the code exhibits improved robustness to channel errors. This is accomplished with minimal increase in complexity and without increase in bit-rate. It is noted that the scheme performs well over a wide range of channel bit error rates and compares favourably with a standard scalar LSP quantizer tandemed with a channel coder both in terms of bit-rate and channel noise immunity. C1 NORWEGIAN INST TECHNOL, DIV TELECOMMUN, N-7034 TRONDHEIM, NORWAY. RP SECKER, P (reprint author), UNIV WOLLONGONG, DEPT ELECT & COMP ENGN, WOLLONGONG, NSW 2500, AUSTRALIA. 
CR AYANOGLU E, 1987, IEEE T INFORM THEORY, V33, P855, DOI 10.1109/TIT.1987.1057376 DUNHAM JG, 1981, IEEE T INFORM THEORY, V27, P516, DOI 10.1109/TIT.1981.1056366 FARVARDIN N, 1989, 1989 P INT C AC SPEE, P168 FARVARDIN N, 1990, IEEE T INFORM THEORY, V36, P799, DOI 10.1109/18.53739 FENICHEL R, 1989, 1016 NAT COMM SYST O FORNEY GD, 1973, P IEEE, V61, P268, DOI 10.1109/PROC.1973.9030 HAGEN R, 1990, 1990 P INT C AC SPEE, P189 ITAKURA F, 1975, J ACOUST SOC AM, V57, pS35, DOI 10.1121/1.1995189 KURTENBA.AJ, 1969, IEEE T COMMUN TECHN, VCO17, P291, DOI 10.1109/TCOM.1969.1090091 RIBBUM B, 1991, SPEECH COMMUN, V10, P277, DOI 10.1016/0167-6393(91)90017-N SECKER P, 1991, 1991 P IREECON 91 SY, P297 SHANNON CE, 1948, AT&T TECH J, V27, P623 SOONG F, 1990, 1990 P INT C AC SPEE, P185 SOONG F, 1984, 1984 P INT C AC SPEE STEWART LC, 1982, IEEE T COMMUN, V34, P1073 SUGAMURA N, 1988, IEEE J SEL AREA COMM, V6, P432, DOI 10.1109/49.618 NR 16 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1992 VL 11 IS 2-3 BP 149 EP 158 DI 10.1016/0167-6393(92)90009-V PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400008 ER PT J AU CARLSON, R AF CARLSON, R TI SYNTHESIS - MODELING VARIABILITY AND CONSTRAINTS SO SPEECH COMMUNICATION LA English DT Article DE SYNTHESIS; TEXT-TO-SPEECH AB This paper discusses some important topics in current speech synthesis research. Modeling of speaker characteristics and emotions are used as an example of new trends in the speech synthesis field. The relation to speech recognition research is emphasized. New methods such as automatic learning and the use of new analysis techniques are also discussed. C1 ROYAL INST TECHNOL, DEPT SPEECH COMMUN & MUS ACOUST, S-10044 STOCKHOLM 70, SWEDEN. 
CR ABE M, 1991, P INT C ACOUST SPEEC ABE M, 1990, P SPEAKER CHARACTERI BAILLY G, 1990, J PHONETICS, V19 BICKLEY C, 1991, J ACOUST SOC AM, V89 BLADON A, 1987, P EUROPEAN C SPEECH BLOMBERG M, 1989, P INT C ACOUST SPEEC BLOMBERG M, 1988, STL QPSR, V2 BLOMBERG M, 1988, 2ND P S ADV MAN MACH BLOMBERG M, 1990, P ESCA WORKSHOP SPEA BOSCH L, 1990, P ESCA WORKSHOP SPEE BOVES L, 1990, J PHONETICS, V19 CAHN JE, 1990, J AM VOICE I O SOC, V8 CAMPBELL N, 1990, J PHONETICS, V19 CARLSON B, 1990, P ESCA WORKSHOP SPEA CARLSON B, 1991, ADV SPEECH HEARING L CARLSON R, 1989, P INT C ACOUST SPEEC CARLSON R, 1991, 12TH P INT C PHON SC CARPENTIER F, 1989, P EUROPEAN C SPEECH COLLIER R, 1990, J PHONETICS, V19 FANT G, 1990, P ESCA WORKSHOP SPEA FANT G, 1990, J PHONETICS, V19 FANT G, 1985, STLQPSR4 SPEECH TRAN GOPAL HS, 1991, J ACOUST SOC AM, V89 GRANSTROM B, 1991, 12TH P INT C PHON SC HAKODA K, 1990, P INT C SPOKEN LANGU HERTZ S, 1990, J PHONETICS, V19 HOLMES WJ, 1990, ESCA WORKSHOP SPEECH HUANG C, 1990, P INT C SPOKEN LANGU JAVKIN H, 1989, P INT C ACOUST SPEEC KAIKI N, 1990, P INT C SPOKEN LANGU KARLSSON I, 1990, J PHONETICS, V19 KARLSSON I, 1991, 12TH P INT C PHON SC Klatt D. 
H., 1990, J ACOUST SOC AM, V87 KOHLER K, 1990, J PHONETICS, V19 LINDBLOM B, 1990, SPEECH PRODUCTION MO MOULINE E, 1990, P INT C ACOUST SPEEC MURRAY IR, 1988, 7TH P SPEECH 88 FASE MURRAY IR, 1991, P EUROPEAN C SPEECH NAKAJIMA S, 1988, P INT C ACOUST SPEEC OLIVE JP, 1990, P ESCA WORKSHOP SPEE PHILIPS M, 1991, P EUROPEAN C SPEECH PIERREHUMBERT J, 1987, THESIS MIT RAHM M, 1991, P INT C ACOUST SPEEC RILEY M, 1990, P ESCA WORKSHOP SPEE SAGISAKA Y, 1991, 12TH INT C PHON SCI SAGISAKA Y, 1988, P INT C ACOUST SPEEC Scherer KR, 1989, HDB PSYCHOPHYSIOLOGY SORIN C, 1987, 11TH P INT C PHON SC STEVENS K, 1990, J PHONETICS, V19 STEVENS K, 1991, 12TH P INT C PHON SC TALKIN D, 1990, P ESCA WORKSHOP SPEE VANLEEUWEN HC, 1991, P INT C ACOUST SPEEC VANSANTEN J, 1990, COMPUT SPEECH LANGUA VANSON RJJ, 1989, P EUROPEAN C SPEECH WILLIAMS CE, 1972, J ACOUST SOC AM, V52 NR 55 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1992 VL 11 IS 2-3 BP 159 EP 166 DI 10.1016/0167-6393(92)90010-5 PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400009 ER PT J AU STRIK, H BOVES, L AF STRIK, H BOVES, L TI ON THE RELATION BETWEEN VOICE SOURCE PARAMETERS AND PROSODIC FEATURES IN CONNECTED SPEECH SO SPEECH COMMUNICATION LA English DT Article DE INVERSE FILTERING; LF-MODEL; VOICE SOURCE AB The behaviour of the voice source characteristics in connected speech was studied. Voice source parameters were obtained by automatic inverse filtering, followed by automatic fitting of a glottal waveform model to the data. Consistent relations between voice source parameters and prosodic features were observed. RP STRIK, H (reprint author), UNIV NIJMEGEN, DEPT LANGUAGE & SPEECH, POB 9103, 6500 HD NIJMEGEN, NETHERLANDS. 
CR BERANEK L, 1954, ACOUSTICS, P23 CARLSON R, 1989, MAY P INT C AC SPEEC, V1, P223 DENNIS JE, 1981, ACM T MATH SOFTWARE, V7, P348, DOI 10.1145/355958.355965 DEVETH J, 1990, P INT C ACOUST SPEEC FANT G, 1988, STL QPSR, V2, P1 Fant Gunnar, 1985, STL QPSR, V4, P1 FERGUSON GA, 1987, STATISTICAL ANAL PHY, P195 JANSEN J, 1991, P EUROSPEECH 91, V1, P259 KLATT DH, 1990, J ACOUST SOC AM, V87, P820, DOI 10.1121/1.398894 STRIK H, 1992, IN PRESS J PHONETICS NR 10 TC 20 Z9 20 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1992 VL 11 IS 2-3 BP 167 EP 174 DI 10.1016/0167-6393(92)90011-U PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400010 ER PT J AU VALBRET, H MOULINES, E TUBACH, JP AF VALBRET, H MOULINES, E TUBACH, JP TI VOICE TRANSFORMATION USING PSOLA TECHNIQUE SO SPEECH COMMUNICATION LA English DT Article DE PSOLA ANALYSIS-SYNTHESIS; VOICE CONVERSION; LINEAR MULTIVARIATE REGRESSION; DYNAMIC FREQUENCY WARPING ID NORMALIZATION AB In this contribution, a new system for voice conversion is described. The proposed architecture combines a PSOLA (Pitch Synchronous Overlap and Add)-derived synthesizer and a module for spectral transformation. The synthesizer based on the classical source-filter decomposition allows prosodic and spectral transformations to be performed independently. Prosodic modifications are applied on the excitation signal using the TD-PSOLA scheme; converted speech is then synthesized using the transformed spectral parameters. Two different approaches to derive spectral transformations, borrowed from the speech-recognition domain, are compared: Linear Multivariate Regression (LMR) and Dynamic Frequency Warping (DFW). Vector-quantization is carried out as a preliminary stage to render the spectral transformations dependent of the acoustical realization of sounds. 
A formal listening test shows that the synthesizer produces a satisfyingly natural "transformed" voice. LMR proves yet to allow a slightly better conversion than DFW. Still there is room for improvement in the spectral transformation stage. RP VALBRET, H (reprint author), TELECOM PARIS, DEPT SIGNAL, CNRS, URA 820, 46 RUE BARRAULT, F-75634 PARIS 13, FRANCE. CR ABE M, 1991, 1991 P INT C AC SPEE, P765 ABE S, 1988, 1988 P INT C AC SPEE, P655 AINSWORTH WA, 1984, P I ACOUST, V6, P303 ATAL BS, 1976, P IEEE, V64, P460, DOI 10.1109/PROC.1976.10155 CARLSON R, 1991, P EUROSPEECH 91 GENO, P1043 CHARPENTIER F, 1988, THESIS ECOLE NATIONA ELJAROUDI A, 1987, 1987 P INT C AC SPEE, P320 FANT G, 1966, 4 R I TECHN SPEECH T, P22 Fant G., 1975, 231975 STLQPSR, P1 GALAS T, 1990, 1990 INT C MUS COMP Galas T., 1991, P EUROSPEECH GENOVA, P1085 GONCHAROFF V, 1988, 1988 P INT C AC SPEE, P343 Gonzales R. C., 1987, DIGITAL IMAGE PROCES HECKER MHL, 1971, ASHA MONOGRAPHS, V16 LEE KF, 1988, DEV SPHINX SYSTEM LINDE Y, 1980, IEEE T COMMUN, V28, P84, DOI 10.1109/TCOM.1980.1094577 Markel JD, 1976, LINEAR PREDICTION SP MATSUMOTO H, 1986, SPEECH COMMUN, V5, P239, DOI 10.1016/0167-6393(86)90011-7 MOULINES E, 1990, SPEECH COMMUN, V9, P453, DOI 10.1016/0167-6393(90)90021-Z NORDSTROM PE, 1975, 1975 INT C PHON SCI OSHAUGHNESSY D, 1986, IEEE ASSP MAG, P4 SAKOE S, 1978, IEEE T ACOUST SPEECH, V28, P623 Savic M., 1991, Digital Signal Processing, V1, DOI 10.1016/1051-2004(91)90099-7 SHIKANO K, 1986, 1986 P IEEE INT C AC, P2643 Tubach J. P., 1990, Traitement du Signal, V7 VAISSIERE J, 1974, MIT114 Q PROGR REP R, P212 Verhelst W., 1991, P EUROSPEECH 91, P1319 WAKITA H, 1977, IEEE T ACOUST SPEECH, V25, P183, DOI 10.1109/TASSP.1977.1162929 NR 28 TC 74 Z9 82 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUN PY 1992 VL 11 IS 2-3 BP 175 EP 187 DI 10.1016/0167-6393(92)90012-V PG 13 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400011 ER PT J AU SYDESERFF, HA CALEY, RJ ISARD, SD JACK, MA MONAGHAN, AIC VERHOEVEN, J AF SYDESERFF, HA CALEY, RJ ISARD, SD JACK, MA MONAGHAN, AIC VERHOEVEN, J TI EVALUATION OF SPEECH SYNTHESIS TECHNIQUES IN A COMPREHENSION TASK SO SPEECH COMMUNICATION LA English DT Article DE SPEECH SYNTHESIS; EVALUATION; MULTIPULSE LINEAR PREDICTIVE CODING; PSOLA AB Six types of speech synthesis were evaluated for comprehensibility: standard linear predictive coding analysis/resynthesis; pitch synchronous analysis/resynthesis; pitch synchronous multi-pulse analysis/resynthesis, and three PSOLA (pitch synchronous overlap-and-add) techniques. The relative comprehensibility of the synthesis types was tested by using the synthesised speech to convey information that subjects needed in order to perform a diagram-based multiple-choice task. C1 UNIV OXFORD, OXFORD, ENGLAND. RP SYDESERFF, HA (reprint author), CTR SPEECH TECHNOL RES, 80 S BRIDGE, EDINBURGH EH1 1HN, SCOTLAND. CR BENOIT C, 1989, P ESCA WORKSHOP SPEE CAMPBELL WN, 1990, P ICSLP 90 HAZAN V, 1989, P ESCA WORKSHOP SPEE HOUSE AS, 1965, J ACOUST SOC AM, V37, P158, DOI 10.1121/1.1909295 HUNT MJ, 1986, P INT C ACOUSTICS TO, V1 HUNT MJ, 1989, P EUROPEAN C SPEECH, V2, P348 Klatt D. H., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing LOGAN JS, 1989, J ACOUST SOC AM, V86, P566, DOI 10.1121/1.398236 MACKIE K, 1987, SPEECH COMMUN, V6, P309, DOI 10.1016/0167-6393(87)90005-7 PISONI DB, 1987, TEXT SPEECH MITALK S, P151 SILVERMAN K, 1990, P ICSLP 90, P981 SPIEGEL M, 1989, P ESCA WORKSHOP SPEE EXAMPLE ANSWER BOOK NR 13 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUN PY 1992 VL 11 IS 2-3 BP 189 EP 194 DI 10.1016/0167-6393(92)90013-W PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400012 ER PT J AU FURUI, S AF FURUI, S TI RECENT ADVANCES IN SPEECH RECOGNITION TECHNOLOGY AT NTT LABORATORIES SO SPEECH COMMUNICATION LA English DT Article DE SPEECH RECOGNITION; FEATURE EXTRACTION; SPEAKER ADAPTATION; HIDDEN MARKOV MODEL; CONTINUOUS SPEECH RECOGNITION; SPEAKER IDENTIFICATION; SPEAKER VERIFICATION ID SPEAKER AB This paper introduces recent research activities on speech recognition, ranging from acoustic processing to linguistic processing, at NTT (Nippon Telegraph and Telephone Corporation) Laboratories. These include the proposal of DELTA-LSP parameters, hierarchical DELTA-cepstral parameters, a new method of utilizing pitch information, automatic speaker adaptation techniques, robust HMM phoneme models, new training algorithms for neural networks, linguistic processing that uses syntactic and semantic knowledge, implementation of prototype continuous speech recognition systems, and an efficient text-independent speaker recognition algorithm. RP FURUI, S (reprint author), NIPPON TELEGRAPH & TEL PUBL CORP, MUSASHINO ELECT COMMUN LAB, HUMAN INTERFACE LABS, MUSASHINO, TOKYO 180, JAPAN. 
CR FURUI S, 1990, P ESCA TUTORIAL RES, P10 FURUI S, 1986, IEEE T ACOUST SPEECH, V34, P52, DOI 10.1109/TASSP.1986.1164788 Furui S., 1989, DIGITAL SPEECH PROCE FURUI S, 1989, P IEEE INT C ACOUST FURUI S, 1990, P IEEE INT C ACOUST FURUI S, 1989, IEEE T ACOUST SPEECH, V37, P1923, DOI 10.1109/29.45538 GURGEN F, 1990, P INT C SPOKEN LANGU GURGEN K, 1991, P SYNAPSE 91 OSAKA IMAMURA A, 1991, P IEEE INT C ACOUST Itakura F, 1975, J ACOUST SOC AM, V57 MARIANI J, 1989, P IEEE INT C ACOUST MATSUI T, 1991, P IEEE INT C ACOUST MATSUMOTO H, 1988, 2ND JOINT M AC SOC A MATSUNAGA S, 1990, P IEEE INT C ACOUST MATSUOKA T, 1991, P IEEE INT C ACOUST PALIWAL KK, 1990, P IEEE INT C ACOUST SAGAYAMA S, 1989, P IEEE INT C ACOUST SHIKANO K, 1986, P IEEE INT C ACOUST SHIRAKI Y, 1990, P IEEE INT C ACOUST TAKAHASHI S, 1990, P INT C SPOKEN LANGU TAKAHASHI S, 1991, SPR P M AC SOC JAP TSUBOI T, 1990, P INT C SPOKEN LANGU YAMADA T, 1991, P IEEE INT C ACOUST NR 23 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1992 VL 11 IS 2-3 BP 195 EP 204 DI 10.1016/0167-6393(92)90014-X PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400013 ER PT J AU GAUVAIN, JL LEE, CH AF GAUVAIN, JL LEE, CH TI BAYESIAN LEARNING FOR HIDDEN MARKOV MODEL WITH GAUSSIAN MIXTURE STATE OBSERVATION DENSITIES SO SPEECH COMMUNICATION LA English DT Article DE BAYESIAN LEARNING; HIDDEN MARKOV MODELS; PARAMETER SMOOTHING; SPEAKER ADAPTATION; SPEAKER CLUSTERING; CORRECTIVE TRAINING AB An investigation into the use of Bayesian learning of the parameters of a multivariate Gaussian mixture density has been carried out. In a framework of continuous density hidden Markov model (CDHMM), Bayesian learning serves as a unified approach for parameter smoothing, speaker adaptation, speaker clustering and corrective training. 
The goal is to enhance model robustness in a CDHMM-based speech recognition system so as to improve performance. Our approach is to use Bayesian learning to incorporate prior knowledge into the training process in the form of prior densities of the HMM parameters. The theoretical basis for this procedure is presented and results applying it to parameter smoothing, speaker adaptation, speaker clustering and corrective training are given. C1 AT&T BELL LABS, SPEECH RES DEPT, MURRAY HILL, NJ 07974 USA. CR Brown P. F., 1983, Proceedings of ICASSP 83. IEEE International Conference on Acoustics, Speech and Signal Processing DeGroot M. H., 1970, OPTIMAL STATISTICAL DEMPSTER AP, 1977, J ROY STAT SOC B MET, V39, P1 FERRETTI M, 1989, P EUR, P154 GAUVAIN JL, 1992, UNPUB MAXIMUM POSTER GAUVAIN JL, 1991, 1991 P DARPA SPEECH HUANG X, 1990, 1990 P DARPA SPEECH Jelinek F., 1980, Pattern Recognition in Practice. Proceedings of an International Workshop LEE CH, 1990, 1990 P DARPA SPEECH LEE CH, 1990, P ICASSP, P145 Lee C. H., 1990, Computer Speech and Language, V4, DOI 10.1016/0885-2308(90)90002-N Normandin Y., 1991, P ICASSP 91, P537, DOI 10.1109/ICASSP.1991.150395 RABINER LR, 1986, AT&T TECH J, V65, P21 STERN RM, 1987, IEEE T ACOUST SPEECH, V35 Zelinski R., 1983, Proceedings of ICASSP 83. IEEE International Conference on Acoustics, Speech and Signal Processing NR 15 TC 26 Z9 26 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUN PY 1992 VL 11 IS 2-3 BP 205 EP 213 DI 10.1016/0167-6393(92)90015-Y PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400014 ER PT J AU LOCKWOOD, P BOUDY, J AF LOCKWOOD, P BOUDY, J TI EXPERIMENTS WITH A NONLINEAR SPECTRAL SUBTRACTOR (NSS), HIDDEN MARKOV-MODELS AND THE PROJECTION, FOR ROBUST SPEECH RECOGNITION IN CARS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH RECOGNITION; PROJECTION MEASURE; SPEECH ENHANCEMENT; SPECTRAL SUBTRACTION; NOISE; CONTINUOUS DENSITY HIDDEN MARKOV MODEL AB Achieving reliable performance for a speech recogniser is an important challenge, especially in the context of mobile telephony applications where the user can access telephone functions through voice. The breakthrough of such a technology is appealing, since the driver can concentrate completely and safely on his task while composing and conversing in a "full" hands-free mode. This paper addresses the problem of speaker-dependent discrete utterance recognition in noise. Special reference is made to the mismatch effects due to the fact that training and testing are made in different environments. A novel technique for noise compensation is proposed: nonlinear spectral subtraction (NSS). Robust variance estimates and robust pdf evaluations (projection) are also introduced and combined with NSS into the HMM framework. We show that the lower limit of applicability of the projection (low SNR values) can be loosened after combination with NSS. Experimental results are reported. The performance of an HMM-based recogniser rises from 56% (no compensation) to 98% after speech enhancement. More than 3300 utterances have been used to evaluate the systems (three databases, two European languages). This result is achieved by the use of robust training/recognition schemes and by preprocessing the noisy speech by NSS. RP LOCKWOOD, P (reprint author), MATRA COMMUN, RUE JP TIMBAUD, F-78392 BOIS DARCY, FRANCE. 
CR BAILLARGEAT C, 1990, P ISATA BEROUTI M, 1979, P IEEE INT C ACOUST Boll S. F., 1979, IEEE T ACOUST SPEECH, V27 BOUDY J, 1990, UNPUB SINGLE MICROPH CARLSON BA, 1991, P IEEE INT C ACOUST DAVIS SB, 1980, IEEE T ACOUST SPEECH, V28, P357, DOI 10.1109/TASSP.1980.1163420 FAUCON G, 1991, SEMINAIRE TRAITEMENT HERMANSKY A, 1988, P IEEE INT C ACOUST JUANG BH, 1987, IEEE T ACOUST SPEECH, V35 JUANG BH, 1991, COMPUT SPEECH LANGUA, V5 LECOMTE I, 1989, P IEEE INT C ACOUST LOCKWOOD P, 1991, P EUROSPEECH LOCKWOOD P, 1992, UNPUB NONLINEAR SPEC LOCKWOOD P, 1992, P IEEE INT C ACOUST MANSOUR D, 1989, IEEE T ACOUST SPEECH, V37 Paliwal K. K., 1982, Speech Communication, V1, DOI 10.1016/0167-6393(82)90034-6 PAUL DB, 1981, IEEE T ACOUST SPEECH, V29, P786, DOI 10.1109/TASSP.1981.1163643 PAUL DB, 1986, P SPEECH TECH 86 PICONE J, 1989, P IEEE INT C ACOUST RUEHL HW, 1991, SPEECH COMMUN, V10, P11, DOI 10.1016/0167-6393(91)90024-N TOHKURA Y, 1987, IEEE T ACOUST SPEECH, V35 Van Compernolle D., 1989, Computer Speech and Language, V3, DOI 10.1016/0885-2308(89)90027-2 NR 22 TC 120 Z9 131 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1992 VL 11 IS 2-3 BP 215 EP 228 DI 10.1016/0167-6393(92)90016-Z PG 14 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400015 ER PT J AU RAMESH, P WILPON, JG MCGEE, MA ROE, DB LEE, CH RABINER, LR AF RAMESH, P WILPON, JG MCGEE, MA ROE, DB LEE, CH RABINER, LR TI SPEAKER INDEPENDENT RECOGNITION OF SPONTANEOUSLY SPOKEN CONNECTED DIGITS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH RECOGNITION; CONNECTED DIGITS; WORDSPOTTING AB An important area of speech recognition is automatic recognition of connected digit strings (i.e., sequences composed of the digits zero through nine, and oh). 
Applications of this technology include credit card authorization, catalog ordering, dialing of telephone numbers, and data entry. For the past two years AT&T has experimented with a system for automatic recognition of 10 digit merchant identification codes, and 15 digit customer credit card numbers, for the purpose of authorizing purchases charged to a credit card. Our evaluation used data collected from about 1000 customers who provided 2000 connected digit strings over 800-based dialed up telephone connections. The recognizer correctly recognized 97% of the digit strings with no rejections using constraints on the validity of both merchant identifications and credit card numbers. Several schemes for applying these task constraints in a practical implementation are discussed in this paper. Also, recognition of the dollar amounts of the transaction are presented with some preliminary results. RP RAMESH, P (reprint author), AT&T BELL LABS, 600 MT AVE, MURRAY HILL, NJ 07974 USA. CR LEE CH, 1991, 1991 P INT C AC SPEE, P161 RABINER LR, 1988, 1988 P INT C AC SPEE, V1, P119 RABINER LR, 1989, P IEEE, V77, P257, DOI 10.1109/5.18626 SOONG FK, 1990, J ACOUST SOC AM, V87, P105 WILPON JG, 1991, 1991 P INT C AC SPEE, P349 WILPON JG, 1990, IEEE T ACOUST SP NOV NR 6 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUN PY 1992 VL 11 IS 2-3 BP 229 EP 235 DI 10.1016/0167-6393(92)90017-2 PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400016 ER PT J AU BOURLARD, H MORGAN, N RENALS, S AF BOURLARD, H MORGAN, N RENALS, S TI NEURAL NETS AND HIDDEN MARKOV-MODELS - REVIEW AND GENERALIZATIONS SO SPEECH COMMUNICATION LA English DT Article DE HIDDEN MARKOV MODELS; MULTILAYER PERCEPTRONS; A POSTERIORI PROBABILITY; LIKELIHOOD; CONTEXT-DEPENDENT MODELS; RADIAL BASIS FUNCTION; AUTOREGRESSIVE MODELS ID CONTINUOUS SPEECH RECOGNITION; ALGORITHM; NETWORK AB Previous work has shown the ability of Artificial Neural Networks (ANNs), and Multilayer Perceptrons (MLPs) in particular, to estimate a posteriori probabilities that can be used, after division by the a priori probabilities of the classes, as emission probabilities for Hidden Markov Models (HMMs). The advantages of a speech recognition system incorporating both MLPs and HMMs are the best discrimination and the ability to incorporate multiple sources of evidence (features, temporal context) without restrictive assumptions of distributions or statistical independence. While this approach has been shown useful for speech recognition, it is still important to understand the underlying problems and limitations and to consider its consequences on other algorithms. For example, while state of the art HMM-based speech recognizers now model context-dependent phonetic units such as triphones instead of phonemes to improve their performance, most of the MLP-based approaches are restricted to phoneme models. After a short review, it is shown here how such neural network approaches can be generalized to context-dependent phoneme models. Also, it is discussed how previous theoretical results can affect the development of other algorithms like nonlinear Autoregressive (AR) Models and Radial Basis Functions (RBFs). C1 INT COMP SCI INST, BERKELEY, CA 94704 USA. 
RP BOURLARD, H (reprint author), L&H SPEECHPROD, ROZENDAALST 14, B-8900 IEPER, BELGIUM. CR BOTTOU L, 1991, THESIS U PARIS S BOURLARD H, 1991, NEURAL NETWORKS ADV, P215 Bourlard H., 1989, Computer Speech and Language, V3, DOI 10.1016/0885-2308(89)90011-9 BOURLARD H, 1990, IEEE T PATTERN ANAL, V12, P1167, DOI 10.1109/34.62605 Bourlard H., 1990, ADV NEURAL INFORMATI, V2, P186 BRIDLE JS, 1990, SPEECH COMMUN, V9, P83, DOI 10.1016/0167-6393(90)90049-F Broomhead D. S., 1988, Complex Systems, V2 BROWN P, 1987, THESIS CMU COVER TM, 1965, IEEE TRANS ELECTRON, VEC14, P326, DOI 10.1109/PGEC.1965.264136 DEMPSTER AP, 1977, J ROY STAT SOC B MET, V39, P1 ELMAN JL, 1980, UCSD1988 CRL TECH TE Huang X. D., 1989, Computer Speech and Language, V3, DOI 10.1016/0885-2308(89)90020-X JELINEK F, 1976, P IEEE, V64, P532, DOI 10.1109/PROC.1976.10159 JORDAN MI, 1986, UCSD8604 TECH REP JUANG BH, 1985, IEEE T ACOUST SPEECH, V33, P1404 KUHN G, 1990, SPEECH COMMUN, V9, P41, DOI 10.1016/0167-6393(90)90044-A LANDAUER TK, 1987, 9TH P ANN C COGN SCI, P531 LEE KF, 1990, IEEE T ACOUST SPEECH, V38, P599, DOI 10.1109/29.52701 Lee K.-F., 1989, AUTOMATIC SPEECH REC LEVIN E, 1990, P IEEE INT C ACOUST Lippmann R.P., 1987, IEEE ASSP MAG, V3, P4 LIPPMANN RP, 1987, 1ST INT C NEUR NETW, P417 MORGAN N, 1991, 1991 P EUR 91 GEN MORGAN N, 1992, IN PRESS NEURAL COMP MORGAN N, 1990, 1990 IEEE P INT C AS, P413 MOZER MC, 1988, CRGTR883 U TOR TECH MURVEIT H, 1986, IEEE T ACOUST SPEECH, V34, P1465, DOI 10.1109/TASSP.1986.1164986 NEY H, 1984, IEEE T ACOUST SPEECH, V32, P263, DOI 10.1109/TASSP.1984.1164320 NILES L, 1989, IEEE T ACOUST SPEECH, V1, P17 PAUL DB, 1991, INT CONF ACOUST SPEE, P569, DOI 10.1109/ICASSP.1991.150403 PEELING SM, 1988, SPEECH COMMUN, V7, P403, DOI 10.1016/0167-6393(88)90057-X POGGIO T, 1990, P IEEE, V78, P1481, DOI 10.1109/5.58326 POWELL MJD, 1985, DAMPTNA12 U CAMBR DE RENALS S, 1990, THESIS U EDINBURGH Renals S., 1991, Neural Networks for Signal Processing. Proceedings of the 1991 IEEE Workshop (Cat. 
No.91TH0385-5), DOI 10.1109/NNSP.1991.239511 Rumelhart D. E., 1986, PARALLEL DISTRIBUTED, V1 WAIBEL A, 1988, P IEEE INT C ACOUST WATROUS, 1987, 1ST INT C NEUR NETW, P381 WATROUS RL, 1988, P IEEE WORKSHOP SPEE NR 39 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1992 VL 11 IS 2-3 BP 237 EP 246 DI 10.1016/0167-6393(92)90018-3 PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400017 ER PT J AU FALLSIDE, F AF FALLSIDE, F TI ON THE ACQUISITION OF SPEECH BY MACHINES, ASM SO SPEECH COMMUNICATION LA English DT Article DE SPEECH ACQUISITION; SPEECH RECOGNITION; SPEECH SYNTHESIS; NEURAL NETWORKS AB As is well known, the acquisition of speech skills by humans involves the "simultaneous" learning of speech perception and of speech production in an environment of speakers who have already acquired these skills. By contrast in speech processing by machines, speech recognition and speech synthesis are studied and implemented separately (and different methodologies have been developed for each). The present paper puts forward a structure for the acquisition of speech by machines, asm, in which both recognition and synthesis are trained "simultaneously" from human training speech. The structure consists of a synthesis chain in which a synthesiser is driven by a trainable neural network controller from a synthesis state vector and of a recognition chain comprising a trainable neural network recogniser which produces a recogniser state vector. The recogniser alternately receives training speech from a human speaker and speech from the synthesiser. 
A coupled minimisation is set up which trains the recogniser network and the synthesiser state and network necessary to classify or recognise human input speech and to produce synthetic speech which is recognised to be of the same class as the human speech. The algorithm is demonstrated for the acquisition of steady state vowels and simple isolated words. RP FALLSIDE, F (reprint author), UNIV CAMBRIDGE, DEPT ENGN, TRUMPINGTON ST, CAMBRIDGE CB2 1PZ, ENGLAND. CR Allen J., 1987, TEXT SPEECH MITALK S BABA N, 1989, NEURAL NETWORKS, V2, P367, DOI 10.1016/0893-6080(89)90021-X Bridle J. S., 1985, Computer speech processing FALLSIDE F, 1990, CUEDFINFENGTR54 CAMB FALLSIDE F, 1990, P ESCA WORKSHOP SPEE, P237 HOLMES JN, 1983, SPEECH COMMUN, V2, P251, DOI 10.1016/0167-6393(83)90044-4 Lee K.-F., 1989, AUTOMATIC SPEECH REC LJOLJE A, 1986, IEEE T ACOUST SPEECH, V34, P1074, DOI 10.1109/TASSP.1986.1164948 Matyas J., 1965, AUTOMAT REM CONTR, V26, P246 Robinson A., 1991, COMPUTER SPEECH LANG, V5, P259 ROBINSON AJ, 1988, P NEURO 88, P541 RUSSELL NH, 1992, P EUROSPEECH 91 GENO, P1023 SOLIS FJ, 1981, MATH OPER RES, V6, P19, DOI 10.1287/moor.6.1.19 TEBLISKIS J, 1990, P INT C ACOUST SPEEC, P437 TRABER C, 1990, P ESCA WORKSHOP SPEE, P141 NR 15 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUN PY 1992 VL 11 IS 2-3 BP 247 EP 260 DI 10.1016/0167-6393(92)90019-4 PG 14 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400018 ER PT J AU BENGIO, Y DEMORI, R FLAMMIA, G KOMPE, R AF BENGIO, Y DEMORI, R FLAMMIA, G KOMPE, R TI PHONETICALLY MOTIVATED ACOUSTIC PARAMETERS FOR CONTINUOUS SPEECH RECOGNITION USING ARTIFICIAL NEURAL NETWORKS SO SPEECH COMMUNICATION LA English DT Article DE ANNS (ARTIFICIAL NEURAL NETWORKS); ACOUSTIC-PHONETIC DECODING; ACOUSTIC PARAMETERS; TIMIT DATABASE; DISTINCTIVE FEATURES; ANN HMM HYBRID; GLOBAL OPTIMIZATION AB In the framework of an ANN/HMM hybrid system for phone recognition three specialized ANNs were designed and evaluated. One of these ANNs detects the manner of articulation. The other two ANNs describe the speech signal in terms of place of articulation. One of these is used for plosive and nasal classification, and the other one is used for fricative classification. The design of these networks was inspired by acoustic-phonetic knowledge. Input parameters, ANN topology and desired output representation have been optimized for the specific task of the network. Experiments are reported for the TIMIT database. Frame classification errors of 17.7% with the manner ANN (5 broad classes), 25.4% with the plosive and nasal ANN (10 phones), and 25.2% with the fricative ANN (11 phones) were obtained on a set of 616 sentences from 77 new speakers. Experiments for a prototype ANN/HMM hybrid system are also reported. We developed an algorithm for the global optimization of this hybrid system. The network for the manner of articulation and one network for the place of articulation were merged into a single ANN whose outputs were modeled by an HMM. With this globally optimized hybrid system we achieved a recognition accuracy of 86% on an 8 class recognition problem (7 plosives and one class corresponding to all other phonemes). 
C1 MCGILL UNIV, SCH COMP SCI, MONTREAL H3A 2A7, QUEBEC, CANADA. AALBORG UNIV, CTR SPEECH TECHNOL, DK-9220 AALBORG, DENMARK. UNIV ERLANGEN NURNBERG, W-8520 ERLANGEN, GERMANY. RP BENGIO, Y (reprint author), MIT, DEPT BRAIN & COGNIT SCI, 77 MASSACHUSETTS AVE, CAMBRIDGE, MA 02139 USA. CR BENGIO Y, 1992, IEEE T NEURAL NETWOR, V3, P252, DOI 10.1109/72.125866 BENGIO Y, 1991, P EUROSPEECH, V2, P1007 BENGIO Y, 1991, THESIS MCGILL U MONT BENGIO Y, 1990, ADV NEURAL INFORMATI, V2, P218 BIMBOT F, 1990, P ICSLP, V1, P665 CHEUNG S, 1991, P INT C ACOUST SPEEC, P457, DOI 10.1109/ICASSP.1991.150375 CHIGIER B, 1988, P INT C ACOUST SPEEC, P449 COLE RA, 1988, P ICASSP 88, P453 Fant G., 1973, SPEECH SOUNDS FEATUR FLAMIA G, 1991, THESIS MCGILL U MONT Jakobson R., 1961, PRELIMINARIES SPEECH KEWLEYPORT D, 1983, J ACOUST SOC AM, V73, P322, DOI 10.1121/1.388813 LECUN Y, 1989, CONNECTIONISM IN PERSPECTIVE, P143 LEUNG HC, 1990, P ICASSP, P525 Lippmann R. P., 1989, Neural Computation, V1, DOI 10.1162/neco.1989.1.1.1 MENG HM, 1991, P INT C AC SPEECH SI, P285, DOI 10.1109/ICASSP.1991.150333 NATHAN KS, 1991, P IEEE INT C ACOUSTI, P445, DOI 10.1109/ICASSP.1991.150372 O'Shaughnessy D., 1987, SPEECH COMMUNICATION Picone J., 1990, IEEE ASSP Magazine, V7, DOI 10.1109/53.54527 Rumelhart D.E., 1986, PARALLEL DISTRIBUTED, V1, P318 Stevens K. N., 1981, PERSPECTIVES STUDY S, P1 Stevens K. N., 1975, AUDITORY ANAL PERCEP, P303 STEVENS KN, 1983, PRODUCTION SPEECH, P248 ZUE V, 1990, SPEECH COMMUN, V9, P451 NR 24 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUN PY 1992 VL 11 IS 2-3 BP 261 EP 271 DI 10.1016/0167-6393(92)90020-8 PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400019 ER PT J AU PETEK, B WAIBEL, AH TEBELSKIS, JM AF PETEK, B WAIBEL, AH TEBELSKIS, JM TI INTEGRATED PHONEME AND FUNCTION WORD ARCHITECTURE OF HIDDEN CONTROL NEURAL NETWORKS FOR CONTINUOUS SPEECH RECOGNITION SO SPEECH COMMUNICATION LA English DT Article DE AUTOMATIC SPEECH RECOGNITION; HIDDEN CONTROL NEURAL NETWORK; LARGE VOCABULARY RECOGNITION; CONTEXT-DEPENDENT MODELING; FUNCTION-WORD MODELING AB We present a context-dependent, phoneme and function word based, Hidden Control Neural Network (HCNN-CDF) architecture for continuous speech recognition. The system can be seen as a large vocabulary extension of the word-based HCNN system proposed by Levin in 1990. Initially, we analysed context-independent HCNN modeling principle in the framework of the Linked Predictive Neural Network (LPNN) speech recognition system and found that it results in a 6% increase of the word recognition accuracy at perplexity 402. Significant savings compared to the LPNN in the resource requirements and computational load for the HCNN implementation can be achieved. In speaker-dependent recognition experiments with perplexity 111, the current versions of the LPNN and HCNN-CDF systems achieve 60% and 75% word recognition accuracies, respectively. C1 CARNEGIE MELLON UNIV, SCH COMP SCI, PITTSBURGH, PA 15213 USA. CR BOURLARD H, 1991, P EUROSPEECH 91, V2, P363 BOURLARD H, 1990, IEEE T PATTERN ANAL, V12, P1167, DOI 10.1109/34.62605 Broomhead D. 
S., 1988, Complex Systems, V2 FRANZINI MA, 1991, P EUROSPEECH 91, V3, P1213 ISO K, 1991, INT CONF ACOUST SPEE, P57, DOI 10.1109/ICASSP.1991.150277 ISO K, 1990, INT CONF ACOUST SPEE, P441, DOI 10.1109/ICASSP.1990.115744 LEE KF, 1988, THESIS CARNEGIEMELLO LEVIN E, 1991, ADV NEURAL INFORMATI, V3, P147 LEVIN E, 1990, INT CONF ACOUST SPEE, P433, DOI 10.1109/ICASSP.1990.115740 MCCLELLAND JL, 1986, PARALLEL DISTRIBUTED, V2, P217 MORGAN N, 1991, P EUROSPEECH 91, V1, P109 NIRANJAN M, 1988, CUEDFINFENGTR22 U EN PETEK B, 1991, P EUROSPEECH 91, V3, P1407 POGGIO T, 1990, P IEEE, V78, P1481, DOI 10.1109/5.58326 Renals S., 1989, P INT JOINT C NEURAL, P461 TEBELSKIS J, 1991, INT CONF ACOUST SPEE, P61, DOI 10.1109/ICASSP.1991.150278 TEBELSKIS J, 1990, INT CONF ACOUST SPEE, P437, DOI 10.1109/ICASSP.1990.115742 TISHBY N, 1990, INT CONF ACOUST SPEE, P365, DOI 10.1109/ICASSP.1990.115686 Waibel A, 1990, READINGS SPEECH RECO WATROUS R, 1989, CRGTR895 U TOR TECHN NR 20 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1992 VL 11 IS 2-3 BP 273 EP 282 DI 10.1016/0167-6393(92)90021-X PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400020 ER PT J AU PIERACCINI, R LEVIN, E AF PIERACCINI, R LEVIN, E TI STOCHASTIC REPRESENTATION OF SEMANTIC STRUCTURE FOR SPEECH UNDERSTANDING SO SPEECH COMMUNICATION LA English DT Article DE SPEECH UNDERSTANDING; SEMANTICS; LANGUAGE MODELING AB We propose a model for a statistical representation of the conceptual structure of a restricted subset of spoken natural language. The model is used for segmenting a sentence into phrases and labeling them with concept relations (or cases). The model is trained using a corpus of annotated transcribed sentences. An understanding system is being built around this model, allowing for unconstrained spoken input in a database retrieval task. 
The scope of this paper is to give details and results concerning the new language representation model. To that aim, the model was implemented and tested allowing a text input. While the model parameters were estimated using 547 training sentences, the results on a test set of 148 sentences showed that almost 97% of the concepts were correctly detected and labeled by the automatic concept labeling procedure; eventually, 65% of the sentences were correctly understood. RP PIERACCINI, R (reprint author), AT&T BELL LABS, SPEECH RES DEPT, 600 MT AVE, MURRAY HILL, NJ 07974 USA. CR BOISEN S, 1989, 2ND P DARPA WORKSH S, P135 FISSORE L, 1989, IEEE T ACOUST SPEECH, V37, P1197, DOI 10.1109/29.31268 HEMPHILL CT, 1990, 3RD P DARPA WORKSH S, P96 Lee C. H., 1990, Computer Speech and Language, V4, DOI 10.1016/0885-2308(90)90002-N PIERACCINI R, 1991, 4TH P DARPA WORKSH S PRICE PJ, 1990, 3RD P DARPA WORKSH S, P91 NR 6 TC 9 Z9 10 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1992 VL 11 IS 2-3 BP 283 EP 288 DI 10.1016/0167-6393(92)90022-Y PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400021 ER PT J AU BAGGIA, P FISSORE, L GERBINO, E GIACHIN, EP RULLENT, C AF BAGGIA, P FISSORE, L GERBINO, E GIACHIN, EP RULLENT, C TI IMPROVING SPEECH UNDERSTANDING PERFORMANCE THROUGH FEEDBACK VERIFICATION SO SPEECH COMMUNICATION LA English DT Article DE SPEECH UNDERSTANDING; LATTICE PARSING; RECOGNITION UNDERSTANDING INTERACTION; SHORT WORD DETECTION ID RECOGNITION AB A parser for continuous speech has to deal with lattices where the word hypotheses of the correct sentence are not usually perfectly aligned and short function words may be missing. To cope with these problems, a two-way interaction between the recognition module and the parser, called feedback verification procedure (FVP), has been investigated. 
The parser generates many solutions, that are fed back to the recognizer which realigns them against the acoustical data, finds the missing function words among the given candidates, and attributes them a new score. The best scoring solution is finally selected by the parser. Results on a 787-word, speaker-independent, telephone-bandwidth continuous speech recognition task are presented. RP BAGGIA, P (reprint author), CTR STUDI & LAB TELECOMUNN, VIA REISS ROMOLI 274, I-10148 TURIN, ITALY. CR BAGGIA P, 1991, P IJCAI 91 SIDNEY, P979 BAHL LR, 1983, IEEE T PATTERN ANAL, V5, P179 CIARAMELLA A, 1991, P EUROSPEECH 91 GENO, P1341 FISSORE L, 1991, P IEEE INT C AC SPEE, P253, DOI 10.1109/ICASSP.1991.150325 GIACHIN EP, 1990, NATO ASI SERIES F, V75, P455 GIACHIN EP, 1988, P COLING 88 BUDAPEST, P196 GIACHIN EP, 1989, P IJCAI, P1537 POESIO M, 1987, P IJCAI, P622 RABINER LR, 1986, AT&T TECH J, V65, P21 WOODS WA, 1982, ARTIF INTELL, V18, P295, DOI 10.1016/0004-3702(82)90025-X NR 10 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1992 VL 11 IS 2-3 BP 289 EP 297 DI 10.1016/0167-6393(92)90023-Z PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400022 ER PT J AU PRIETO, N VIDAL, E AF PRIETO, N VIDAL, E TI LEARNING LANGUAGE MODELS THROUGH THE ECGI METHOD SO SPEECH COMMUNICATION LA English DT Article DE LANGUAGE MODELING; GRAMMATICAL INTERFERENCE; ADAPTIVE LANGUAGE ACQUISITION; SPEECH RECOGNITION ID INFERENCE AB A new approach to adaptive Semantic-Language modelling has recently been proposed which allows automatic learning of all the acoustic and syntactic-semantic models that are required for a given Continuous Speech Recognition task. 
The proposed approach is based on the so called "Error Correcting Grammatical Inference" algorithm which supplies homogeneous finite state structural models both at the acoustic and at the syntactic-semantic levels. Recognition or Understanding is seen as a Formal Transduction procedure that exploits the set of acoustic and linguistic constraints that have been captured in the learned models to directly input raw acoustic signals and output the semantic messages that are conveyed by these signals. In this paper, the proposed approach is reviewed and new improvements are presented. Also, preliminary results with a large semantic-space continuous speech task (Spanish numbers in the one-million range) are presented showing the currently achieved capabilities of this approach. RP PRIETO, N (reprint author), UNIV POLITECN VALENCIA, DEPT SISTEMAS INFORMAT & COMPUTAC, CAMINO VERA S-N, E-46071 VALENCIA, SPAIN. CR ANDREU G, 1990, SIGNAL PROCESSING 5, V1, P1259 ANGLUIN D, 1983, ACM COMPUT SURV, V15, P237, DOI 10.1145/356914.356918 Berstel J., 1979, TRANSDUCTIONS CONTEX FALASCHI A, 1990, 1990 P EUPSICO, P1375 FELDMAN J, 1972, INFORM CONTROL, V20, P244, DOI 10.1016/S0019-9958(72)90424-X FERRATE G, SYNTACTICAL STRUCTUR, P446 Forney Jr G. 
D., 1973, IEEE P, V61, P268 Fu K.S., 1982, SYNTACTIC PATTERN RE GALIANO I, 1991, P EUROSPEECH 91, V2, P675 GOLD EM, 1967, INFORM CONTROL, V10, P447, DOI 10.1016/S0019-9958(67)91165-5 GORIN AL, 1990, P INT C ACOUST SPEEC, P601 KUMAR SK, 1987, ACTA INFORM, V24, P353 PIERACCINI R, 1991, 1991 P EUROSPEECH 91, V2, P383 PRIETO N, 1991, P ICASSP 91, P789, DOI 10.1109/ICASSP.1991.150091 RANDYS SJ, 1990, INT C PATTERN RECOGN, P417 RULOT H, MODELLING SUB STRING RULOT H, 1989, P INT C ACOUST SPEEC, V1, P643 RULOT H, 1988, EFFICIENT ALGORITHM SEGARRA E, 1991, P EUROSPEECH 91, P861 SHARMAN RA, 1990, P EUPSICO 90, P1271 Vidal E., 1990, P EUR SIGN PROC C SE, P43 VIDAL E, 1989, STRUCTURAL PATTERN A, P17 VIDAL E, 1991, DSIC23291 TECHN REP VIDAL N, 1988, APLICATION ERROR COR WAGNER RA, 1974, J ACM, V21, P168, DOI 10.1145/321796.321811 NR 25 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1992 VL 11 IS 2-3 BP 299 EP 309 DI 10.1016/0167-6393(92)90024-2 PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400023 ER PT J AU ROE, DB MORENO, PJ SPROAT, RW PEREIRA, FCN RILEY, MD MACARRON, A AF ROE, DB MORENO, PJ SPROAT, RW PEREIRA, FCN RILEY, MD MACARRON, A TI A SPOKEN LANGUAGE TRANSLATOR FOR RESTRICTED-DOMAIN CONTEXT-FREE LANGUAGES SO SPEECH COMMUNICATION LA English DT Article DE SPEECH UNDERSTANDING; SPEECH RECOGNITION; MACHINE TRANSLATION AB An effort is underway at AT&T Bell Laboratories and Telefonica Investigacion y Desarrollo to build a restricted domain spoken language translation system, which we call VEST (Voice English/Spanish Translator). The eventual goal is a voice output translator which is speaker-independent, and has a vocabulary of several thousand words covering a specific application. 
This paper describes the first step of our research, a system which recognizes two speakers in each of Spanish and English and is limited to some four hundred words. The key new idea is that the speech recognition and the language analysis are tightly coupled by using the same language model, an augmented phrase-structure grammar, for both. C1 TELEFONICA, INTEGRATE & DUMP DETECT, MADRID, SPAIN. RP ROE, DB (reprint author), AT&T BELL LABS, 600 MT AVE, MURRAY HILL, NJ 07974 USA. CR Aho A.V, 1977, PRINCIPLES COMPILER ANDERSON MD, 1984, P INT C ACOUST SPEEC, V1 Coker C., 1990, P 1 ESCA WORKSH SPEE, P83 EARLEY J, 1970, COMMUN ACM, V13, P102 Harrison M.D., 1978, INTRO FORMAL LANGUAG KITANO H, 1991, IEEE COMPUTER JUN, P36 LEVISON SE, 1990, IEEE COM MAG, P28 MACARRON A, 1991, P EUROSPEECH 91 GENO, P617 MORENO PJ, 1989, P EUROSPEECH 89 PARI, P360 MORIMOTO T, 1990, P INFO JAPAN 90 INT, P553 OLIVE JP, 1985, J ACOUST SOC AM S1, V78, P56 PEREIRA FCN, 1991, 29TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS : PROCEEDINGS OF THE CONFERENCE, P246 RABINER LR, 1989, P IEEE, V77, P429 ROE DB, 1989, P INT C ACOUST SPEEC, P778 Shieber Stuart, 1986, INTRO UNIFICATION BA Slocum J., 1985, Computational Linguistics, V11 Waibel A., 1991, P ICASSP 91, P793, DOI 10.1109/ICASSP.1991.150456 1991, J I ELECTRON INFORM, V74, P300 NR 18 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1992 VL 11 IS 2-3 BP 311 EP 319 DI 10.1016/0167-6393(92)90025-3 PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400024 ER PT J AU WAJSKOP, M AF WAJSKOP, M TI SPECIAL ISSUE ON EUROSPEECH 91 - FOREWORD SO SPEECH COMMUNICATION LA English DT Editorial Material RP WAJSKOP, M (reprint author), FREE UNIV BRUSSELS, INST PHONET, CP 110, 50 AV F ROOSEVELT, B-1050 BRUSSELS, BELGIUM. 
NR 0 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1992 VL 11 IS 2-3 BP R5 EP R5 DI 10.1016/0167-6393(92)90001-N PG 1 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA JE504 UT WOS:A1992JE50400001 ER PT J AU MCKEOWN, JD AF MCKEOWN, JD TI PERCEPTION OF CONCURRENT VOWELS - THE EFFECT OF VARYING THEIR RELATIVE LEVEL SO SPEECH COMMUNICATION LA English DT Article DE SPEECH PERCEPTION; DOUBLE VOWELS; PERCEPTUAL GROUPING; VOWEL DOMINANCE; LEVEL DIFFERENCES; SPECTRAL SUBTRACTION ID FUNDAMENTAL-FREQUENCY; RECOGNITION AB When identifying pairs of simultaneous steady-state vowels, listeners perform well even when the two vowels start and stop at the same time, are presented monaurally, have the same fundamental frequency (f0), and have approximately equal intensities. The sensation described by listeners is of one dominant vowel "coloured" by a second or non-dominant vowel. A small difference in f0 improves performance and typically results in a sensation of two voice sources rather than of one voice coloured by another. An experiment is reported in which four listeners attempted to identify separately the dominant and the non-dominant vowel when the relative levels of the vowels were varied over a 28 dB range and the vowels had either the same f0 or different f0s. As in previous experiments, performance improved when there was an f0 difference between the vowels, but this advantage was reduced though not abolished when the f0 separation was one octave. This advantage of f0 separation was seen almost entirely in increased identification of the non-dominant vowel and was apparent at all but the greatest relative level differences tested. Finally, there were clear differences in the pattern of vowel dominance across the different vowel combinations, with subjects showing broadly similar patterns. 
Dominance may reflect cognitive "decision" strategies in addition to spectral masking at the level of the peripheral auditory system. RP MCKEOWN, JD (reprint author), UNIV MANCHESTER, DEPT EXPTL PSYCHOL, OXFORD RD, MANCHESTER M13 9PL, LANCS, ENGLAND. CR ASSMANN PF, 1990, J ACOUST SOC AM, V88, P680, DOI 10.1121/1.399772 ASSMANN PF, 1989, J ACOUST SOC AM, V85, P327, DOI 10.1121/1.397684 Bregman AS., 1990, AUDITORY SCENE ANAL CHALIKIA MH, 1989, PERCEPT PSYCHOPHYS, V46, P487, DOI 10.3758/BF03210865 CHERRY EC, 1953, J ACOUST SOC AM, V25, P975, DOI 10.1121/1.1907229 DARWIN CJ, 1981, Q J EXP PSYCHOL-A, V33, P185 DARWIN CJ, 1987, PSYCHOPHYSICS SPEECH, P112 DARWIN CJ, 1990, SPEECH COMMUN, V9, P469, DOI 10.1016/0167-6393(90)90022-2 DEMANY L, 1990, PERCEPT PSYCHOPHYS, V48, P436, DOI 10.3758/BF03211587 EGAN JP, 1954, J ACOUST SOC AM, V26, P774, DOI 10.1121/1.1907416 GREEN DM, 1983, J ACOUST SOC AM, V73, P639, DOI 10.1121/1.389009 KLATT DH, 1980, J ACOUST SOC AM, V67, P971, DOI 10.1121/1.383940 MARIN CMH, 1991, J ACOUST SOC AM, V89, P341, DOI 10.1121/1.400469 MCGURK H, 1976, NATURE, V264, P746, DOI 10.1038/264746a0 MCKEOWN JD, 1990, J ACOUST SOC AM, V88, pS26 MEDDIS R, 1991, IN PRESS J ACOUST SO MOORE B J, 1982, INTRO PSYCHOL HEARIN MOORE BCJ, 1987, HEARING RES, V28, P209, DOI 10.1016/0378-5955(87)90050-5 PALMER AR, 1990, J ACOUST SOC AM, V88, P1412, DOI 10.1121/1.400329 Patterson R. D., 1991, ADV SPEECH HEARING L, V3 Scheffers M. T. M., 1983, THESIS U GRONINGEN SUMMERFIELD Q, 1989, PERCEPT PSYCHOPHYS, V45, P529, DOI 10.3758/BF03208060 SUMMERFIELD Q, 1991, J ACOUST SOC AM, V89, P1364, DOI 10.1121/1.400659 TRIESMAN AM, 1964, Q J EXP PSYCHOL, V77, P533 ZWICKER UT, 1984, SPEECH COMMUN, V3, P265, DOI 10.1016/0167-6393(84)90023-2 NR 25 TC 21 Z9 21 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD MAR PY 1992 VL 11 IS 1 BP 1 EP 13 PG 13 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA HN109 UT WOS:A1992HN10900001 ER PT J AU HALKA, U HEUTE, U AF HALKA, U HEUTE, U TI A NEW APPROACH TO OBJECTIVE QUALITY-MEASURES BASED ON ATTRIBUTE-MATCHING SO SPEECH COMMUNICATION LA English DT Article DE SPEECH QUALITY; OBJECTIVE QUALITY-MEASURES; ATTRIBUTE-MATCHING; SPEECH-MODEL PROCESS ID PERFORMANCE; SYSTEMS AB In this paper the results of a study of objective quality measures for a broad range of coding systems are presented. These objective measures take the linear and the nonlinear distortions of the coder into account. A correlation analysis was performed, in order to find out those measures which are most effective in predicting perceivable parametric attributes of speech quality. The results of this experiment, the so-called attribute-matching, yield a good composite measure for predicting the total quality for a wide range of coding systems and can be computed in pseudo-realtime. Furthermore, we describe the test signal we have used in our study, which was not natural speech but a speech-model process. RP HALKA, U (reprint author), RUHR UNIV BOCHUM, ARBEITSGRP DIGITALE SIGNALVERARBEITUNG, W-4630 BOCHUM, GERMANY. CR BAPPERT V, 1992, IN PRESS SPEECH COMM Barnwell T. P. III, 1980, ICASSP 80 Proceedings. IEEE International Conference on Acoustics, Speech and Signal Processing BARNWELL TP, 1979, J ACOUST SOC AM, V66, P1658, DOI 10.1121/1.383664 BARNWELL TP, 1978, IEEE P INT C ACOUST, P595 BREHM H, 1987, SIGNAL PROCESS, V12, P119, DOI 10.1016/0165-1684(87)90001-6 BREHM H, 1981, 47 U ERL NURNB AUSG Breitkopf P., 1981, ICASSP 81. Proceedings of the 1981 IEEE International Conference on Acoustics, Speech and Signal Processing *C EUR POST TEL AD, 1988, FULL RAT SPEECH ENC, P1 *CONS COMM INT TEL, 1983, CCITT RED BOOK 9TH P, V5, P175 COX RV, 1984, IEEE P INT C ACOUST CROCHIERE RE, 1978, P INT ZURICH SEMINAR Dillon W. 
R., 1984, MULTIVARIATE ANAL ME GLUTH R, 1988, IEEE P INT S CIRCUIT, P1565 GOODMAN DJ, 1979, AT&T TECH J, V58, P601 GRAY AH, 1976, IEEE T ACOUST SPEECH, V24, P380, DOI 10.1109/TASSP.1976.1162849 GRAY RM, 1980, IEEE T ACOUST SPEECH, V28, P367, DOI 10.1109/TASSP.1980.1163421 HALKA U, 1991, INT CONF ACOUST SPEE, P497, DOI 10.1109/ICASSP.1991.150385 HALKA U, 1991, P INT C SPEECH COMMU, V5, P887 HELLWIG K, 1986, 2ND NORD SEM DIG LAN, P257 HEUTE U, 1988, SPEECH COMMUN, V7, P125, DOI 10.1016/0167-6393(88)90035-0 ITAKURA F, 1975, IEEE T ACOUST SPEECH, VAS23, P67, DOI 10.1109/TASSP.1975.1162641 KITAWAKI N, 1982, IEEE P INT C ACOUST, P1000 MANN T, 1989, THESIS RUHR U BOCHUM MCDERMOTT B, 1978, AT&T TECH J, V57, P1597 MCDERMOTT BJ, 1978, IEEE P INT C ACOUST, P581 MERMELSTEIN P, 1979, J ACOUST SOC AM, V66, P1664, DOI 10.1121/1.383638 NAKATSUI M, 1982, J ACOUST SOC AM, V72, P1136, DOI 10.1121/1.388323 NOLL P, 1974, IEEE P INT C ACOUST Oppenheim A. V., 1975, DIGITAL SIGNAL PROCE Quackenbush S. R., 1988, OBJECTIVE MEASURES S QUACKENBUSH SR, 1983, IEEE P INT C ACOUST, P547 SCHUSSLER HW, 1990, FREQUENZ, V44, P82 SCHUSSLER HW, 1987, FREQUENZ, V41, P147 Schussler H. W., 1989, ICASSP-89: 1989 International Conference on Acoustics, Speech and Signal Processing (IEEE Cat. No.89CH2673-2), DOI 10.1109/ICASSP.1989.266873 Tribolet J., 1978, ICASSP 78, P586 TRIBOLET JM, 1979, AT&T TECH J, V58, P699 VARY P, 1988, APR INT C AC SPEECH, P227 VISWANATHAN VR, 1983, IEEE P INT C ACOUST, P543 Voiers WD, 1977, IEEE INT C AC SPEECH, P204 ZIEGLER K, 1989, THESIS U ERLANGEN NU NR 40 TC 12 Z9 12 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD MAR PY 1992 VL 11 IS 1 BP 15 EP 30 DI 10.1016/0167-6393(92)90060-K PG 16 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA HN109 UT WOS:A1992HN10900002 ER PT J AU LEFEVRE, JP HILLER, SM ROONEY, E LAVER, J DIBENEDETTO, MG AF LEFEVRE, JP HILLER, SM ROONEY, E LAVER, J DIBENEDETTO, MG TI MACRO AND MICRO FEATURES FOR AUTOMATED PRONUNCIATION IMPROVEMENT IN THE SPELL SYSTEM SO SPEECH COMMUNICATION LA English DT Article DE SPEECH ANALYSIS; PROSODIC FEATURES; SEGMENTAL FEATURES; PRONUNCIATION AID ID REPRESENTATION AB In this paper, the analysis of macro (prosodic) and micro (segmental) features is described for a workstation designed to improve the pronunciation of English, French and Italian by non-native speakers. The SPELL workstation is intended to be a teaching device aimed at intermediate ability foreign language learners. Audio and visual aids will be used to help students improve their general intelligibility within a basic teaching paradigm called DELTA (Demonstrate, Evaluate Listening, Teach and Assess). Prosodic analysis will apply to the features of intonation, stress and rhythm. A phonological approach is used for intonation which provides a well-structured system of contrasting units that correlate with discrete linguistic functions. A more limited approach to the prosodic phonology of stress and rhythm will be taught in the SPELL system by manipulating the relatively simple acoustic features of vowel quality and segmental duration. The micro feature analysis will focus on the segmental class of vowels. A distinctive feature approach is used to characterize non-native vowel pronunciation. Acoustic properties are sought which will be speaker-independent. C1 UNIV ROME LA SAPIENZA, DEPT INFO COM, I-00184 ROME, ITALY. OROS SA, F-38241 MEYLAN, FRANCE. UNIV EDINBURGH, CTR SPEECH TECHNOL RES, EDINBURGH EH1 1HN, SCOTLAND. CR ABERCROMBIE D, 1967, ELEMENTS GENERAL PHO Adams C., 1979, ENGLISH SPEECH RHYTH Arnold G. 
F., 1961, INTONATION COLLOQUIA Bernstein J., 1990, P INT C SPOK LANG PR, P1185 CHAPALLAZ M, 1964, HONOUR D JONES, P306 Chapallaz Marguerite, 1979, PRONUNCIATION ITALIA Crystal D., 1969, PROSODIC SYSTEMS INT Crystal David, 1975, ENGLISH TONE VOICE Cutler Anne, 1983, PROSODY MODELS MEASU DAUER RM, 1983, J PHONETICS, V11, P51 DIBENEDETTO MG, 1989, J ACOUST SOC AM, V86, P55, DOI 10.1121/1.398220 DIBENEDETTO MG, 1990, P INT C SPEECH TECHN, P248 DIBENEDETTO MG, 1989, J ACOUST SOC AM, V86, P66 FAURE G, 1973, INTERROGATION INTONA, P1 Grundstrom AllanW, 1973, INTERROGATION INTONA, P19 HALLIDAY MAK, 1973, PHONETICS LINGUISTIC, P103 Harmer J., 1983, PRACTICE ENGLISH LAN KENNING MM, 1983, PHONETICS ASS, V13, P32 KENNING MM, 1979, J INT PHON ASSOC, V9, P15 LEACH P, 1988, J INT PHON ASSOC, V18, P125 Lehiste I., 1970, SUPRASEGMENTALS LEON PR, 1980, MELODY LANGUAGE Liberman Mark, 1975, THESIS MIT Madsen H. S., 1983, TECHNIQUES TESTING Martin P., 1982, Speech Communication, V1, DOI 10.1016/0167-6393(82)90021-8 Martin Philippe, 1975, LINGUISTICS, V146, P35 MILLER M, 1984, J PHONETICS, V12, P75 MINIFIE FD, 1973, NORMAL ASPECTS SPEEC, P235 Muljacic Zarko, 1972, FONOLOGIA LINGUA ITA Pierrehumbert Janet, 1980, THESIS MIT CAMBRIDGE PIERREHUMBERT JB, 1979, SPEECH COMMUN, P523 Pike K. L., 1945, INTONATION AM ENGLIS RIVERS WM, 1975, PRACTICAL GUIDE TEAC ROACH P, 1976, EDINBURGH U DEP LING, V9, P97 SYRDAL AK, 1986, J ACOUST SOC AM, V79, P1086, DOI 10.1121/1.393381 Tranel B., 1987, SOUNDS FRENCH INTRO WENK BJ, 1982, J PHONETICS, V10, P193 NR 37 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD MAR PY 1992 VL 11 IS 1 BP 31 EP 44 DI 10.1016/0167-6393(92)90061-B PG 14 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA HN109 UT WOS:A1992HN10900003 ER PT J AU KOO, JM UN, CK AF KOO, JM UN, CK TI A RECOGNITION TIME REDUCTION ALGORITHM FOR LARGE-VOCABULARY SPEECH RECOGNITION SO SPEECH COMMUNICATION LA English DT Article DE SPEECH RECOGNITION; LARGE VOCABULARY; TIME REDUCTION AB We propose an efficient pre-classification algorithm that extracts candidate words to reduce the recognition time in a large-vocabulary recognition system and also propose the use of spectral and temporal smoothing of the observation probability to improve its classification performance. The proposed algorithm computes the coarse likelihood score for each word in a lexicon using the observation probabilities of speech spectra and duration information of recognition units. With the proposed approach we could reduce the computational amount by 74% with slight degradation of recognition accuracy in an 1160-word recognition system based on the phoneme based hidden Markov modeling (HMM). Also, we observed that the proposed coarse likelihood score computation algorithm is a good estimator of the likelihood score computed by the Viterbi algorithm. RP KOO, JM (reprint author), KOREA ADV INST SCI & TECHNOL, DEPT ELECT ENGN, COMMUN RES LAB, POB 150, SEOUL 131, SOUTH KOREA. CR BAHL L, 1989, P INT C ACOUST SPEEC FISSORE L, 1988, P INT C ACOUST SPEEC KANEKO T, 1983, IEEE T ACOUST SPEECH, V31, P1061, DOI 10.1109/TASSP.1983.1164211 KOO JM, 1990, ELECTRON LETT, V26, P743, DOI 10.1049/el:19900485 LEE KF, 1989, IEEE T ACOUST SPEECH, V37, P1641, DOI 10.1109/29.46546 NR 5 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD MAR PY 1992 VL 11 IS 1 BP 45 EP 50 DI 10.1016/0167-6393(92)90062-C PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA HN109 UT WOS:A1992HN10900004 ER PT J AU CHUNG, YM LEE, SU AF CHUNG, YM LEE, SU TI A COMPARISON OF 2 SPEECH CODERS FOR DIGITAL MOBILE RADIO APPLICATIONS SO SPEECH COMMUNICATION LA English DT Article DE DIGITAL MOBILE RADIO; SPEECH CODING; CHANNEL CODING; RAYLEIGH FADING CHANNEL; M-ARY DPSK RECEIVER MODEL AB This paper attempts to compare the performance of two popular speech coders, namely regular pulse excitation with long-term predictor (RPE-LTP) and code-excited linear prediction (CELP), in both random and burst error environments. In simulation, the bit stream generated by each coder is corrupted with error patterns obtained from the burst error model. The burst error model consists of two parts: a Rayleigh fading envelope generator and an M-ary differential phase shift keying receiver model in which the signal level is varied according to the Rayleigh fading envelope. The performance of each coder is evaluated both objectively and subjectively. The simulation results indicate that the RPE-LTP coder provides more consistent performance in the burst error environment. This paper also discusses the comparison of four channel coding techniques designed for the CELP coder. RP CHUNG, YM (reprint author), SEOUL NATL UNIV, DEPT CONTROL & INSTRUMENTAT ENGN, KWANAK KU, SEOUL 151742, SOUTH KOREA. CR ARREDOND.GA, 1973, IEEE T VEH TECHNOL, VVT22, P241, DOI 10.1109/T-VT.1973.23560 ATAL BS, 1986, 1986 P INT C AC SPEE, P1681 BEROUTI M, 1984, 1984 P INT C AC SPEE GALAND C, 1988, SPEECH COMMUN, V7, P167, DOI 10.1016/0167-6393(88)90037-4 Hagenauer J., 1988, 38th IEEE Vehicular Technology Conference: `Telecommunications Freedom - Technology on the Move' (Cat. 
No.88CH2622-9), DOI 10.1109/VETEC.1988.195349 HELLWIG K, 1989, 1989 P GLOBECOM 89 D, P1065 HOYLE RD, 1987, IEEE J SEL AREA COMM, V5, P915, DOI 10.1109/JSAC.1987.1146597 Jakes W. C., 1974, MICROWAVE MOBILE COM Jayant N. S., 1984, DIGITAL CODING WAVEF KROON P, 1988, IEEE J SEL AREA COMM, V6, P353, DOI 10.1109/49.612 LAZZARI V, 1988, SPEECH COMMUN, V7, P193, DOI 10.1016/0167-6393(88)90039-8 LeBlanc W. P., 1989, 39th IEEE Vehicular Technology Conference (IEEE Cat. No.89CH2739-1), DOI 10.1109/VETEC.1989.40140 Lin S., 1983, ERROR CONTROL CODING MODESTINO JW, 1985, IEEE T COMMUN, V33, P210, DOI 10.1109/TCOM.1985.1096277 NATVIG JE, 1989, 1989 P GLOBECOM 89 D, P1060 NATVIG JE, 1988, IEEE J SEL AREA COMM, V6, P324, DOI 10.1109/49.609 Proakis J. D., 1989, DIGITAL COMMUNICATIO Rabiner L.R., 1978, DIGITAL PROCESSING S SCHROEDER MR, 1985, MAR P INT C AC SPEEC, P937 SOONG F, 1984, 1984 P INT C AC SPEE SUDA H, 1988, IEEE J SEL AREA COMM, V6, P346, DOI 10.1109/49.611 SUGAMURA N, 1986, SPEECH COMMUN, V5, P199, DOI 10.1016/0167-6393(86)90008-7 SUZUKI H, 1982, IEEE T VEH TECHNOL, V31, P7, DOI 10.1109/T-VT.1982.23907 SVEAN J, 1982, MAY P INT C AC SPEEC, P1700 VARY P, 1988, SPEECH COMMUN, V7, P209, DOI 10.1016/0167-6393(88)90040-4 NR 25 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD MAR PY 1992 VL 11 IS 1 BP 51 EP 69 DI 10.1016/0167-6393(92)90063-D PG 19 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA HN109 UT WOS:A1992HN10900005 ER PT J AU SOROKIN, VN AF SOROKIN, VN TI DETERMINATION OF VOCAL-TRACT SHAPE FOR VOWELS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH; INVERSE PROBLEM; VOCAL TRACT SHAPE; INTERNAL MODEL; OPTIMIZATION ID SPEECH AB The inverse problem for vocal tract shape, area function and articulatory parameters was solved for steady-state vowels by means of an optimization procedure requiring the conditional minimum of work on the part of the articulatory organs. One to four formant frequencies were used as references. The shape of the tongue was measured with an X-ray microbeam system for male and female speakers. The shapes of the vocal tract calculated for the experiments are very similar to the measured shapes. RP SOROKIN, VN (reprint author), ACAD SCI USSR, INST INFORMAT TRANSMISS PROBLEMS, ERMOLOVOI STR 19, MOSCOW 101447, USSR. CR ATAL BS, 1970, J ACOUST SOC AM, V47, P65 COKER CH, 1976, P IEEE, V64, P452, DOI 10.1109/PROC.1976.10154 FANT G, 1976, STL QPSR, V4, P28 FOLKINS JW, 1975, J SPEECH HEAR RES, V18, P207 GOPINATH B, 1970, AT&T TECH J, V49, P1195 ISHIZAKA K, 1975, IEEE T ACOUST SPEECH, V23, P370, DOI 10.1109/TASSP.1975.1162701 KELSO SJ, 1976, MOTOR CONTROL ISSUES, P3 LEVINSON SE, 1983, J ACOUST SOC AM, V74, P1145, DOI 10.1121/1.390038 LINDBLOM BEF, 1979, J PHONETICS, P147 MAEDA S, 1979, SPEECH COMMUN, P67 MERMELST.P, 1973, J ACOUST SOC AM, V53, P1070, DOI 10.1121/1.1913427 Morse PM, 1948, VIBRATION SOUND NAKAJIMA T, 1977, DYNAMIC ASPECTS SPEE, P251 SCHMIDT RA, 1982, HUMAN MOTOR BEHAV IN, P219 SHIRAI K, 1976, J ELECTRONICS COMM A, V59, P35 Sorokin V. 
N., 1985, THEORY SPEECH PRODUC SOROKIN VN, 1980, J ACOUST SOC AM, V68, pS32, DOI 10.1121/1.2004671 SOROKIN VN, 1987, 11TH P INT C PHON SC, V3, P382 Tikhonov A.N., 1974, METHODS SOLVING INCO WAKITA H, 1973, IEEE T ACOUST SPEECH, VAU21, P417, DOI 10.1109/TAU.1973.1162506 Wilde D., 1964, OPTIMUM SEEKING METH WITHELMS R, 1986, SIGNAL PROCESS, V3, P477 NR 22 TC 11 Z9 11 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD MAR PY 1992 VL 11 IS 1 BP 71 EP 85 DI 10.1016/0167-6393(92)90064-E PG 15 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA HN109 UT WOS:A1992HN10900006 ER PT J AU LAUER, J AF LAUER, J TI SPEAKER CHARACTERIZATION IN SPEECH TECHNOLOGY SO SPEECH COMMUNICATION LA English DT Editorial Material RP LAUER, J (reprint author), UNIV EDINBURGH, CTR SPEECH TECHNOL RES, EDINBURGH EH8 9YL, MIDLOTHIAN, SCOTLAND. NR 0 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1991 VL 10 IS 5-6 BP 431 EP 433 PG 3 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GY255 UT WOS:A1991GY25500001 ER PT J AU BAMBERG, PG MANDEL, MA AF BAMBERG, PG MANDEL, MA TI ADAPTABLE PHONEME-BASED MODELS FOR LARGE-VOCABULARY SPEECH RECOGNITION SO SPEECH COMMUNICATION LA English DT Article DE SPEAKER ADAPTATION; RECOGNITION AB For a large-vocabulary speech-recognition system, such as Dragon Systems' 30,000 word DragonDictate recognizer, an efficient approach to training is to use "phonemes-in-context" (PICs), which are triphones supplemented by a code to describe prepausal lengthening. Each PIC is in turn represented by a sequence of one to six "phonetic elements" (PELs). For each phoneme, there may be thousands of different PICs, but there are no more than 63 PELs. 
Initially all PICs and PELs are trained from a database of about 16,000 tokens recorded by a reference speaker. When the recognizer is used by a new speaker, each word that is recognized is immediately used to adapt the PELs in its Markov models. After about a thousand words have been recognized, most PELs have been adapted to the new speaker, so that even models for words that have not yet been spoken are appropriate for the new speaker. The recognizer was tested with two texts that differed greatly in vocabulary and style. Three speakers dictated each text: the reference speaker, a new male speaker and a new female speaker. After adaptation on 1,500 words, performance for all three speakers was better than the performance for the reference speaker on unadapted models. With an active vocabulary of 25,000 words, the fraction of words recognized correctly was 86%, with an additional 8% on a "choice list" of eight words. RP BAMBERG, PG (reprint author), DRAGON SYST INC, 320 NEVADA ST, NEWTON, MA 02160 USA. CR RANDOM HOUSE UNABRID NR 1 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1991 VL 10 IS 5-6 BP 437 EP 451 DI 10.1016/0167-6393(91)90047-W PG 15 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GY255 UT WOS:A1991GY25500002 ER PT J AU BLOMBERG, M AF BLOMBERG, M TI ADAPTATION TO A SPEAKERS VOICE IN A SPEECH RECOGNITION SYSTEM BASED ON SYNTHETIC PHONEME REFERENCES SO SPEECH COMMUNICATION LA English DT Article DE SPEECH RECOGNITION; VOICE SOURCE ADAPTATION; SYNTHETIC PROTOTYPES; SPEECH PRODUCTION ORIENTED RECOGNITION AB A speech recognition system based on synthetic generation of reference prototypes is described. The vocabulary and grammar are described in a finite-state phoneme network. 
In the transformation from symbolic to spectral representation, reduction rules modify the initial phoneme target values and a coarticulation module inserts interpolated transition states at phoneme boundaries. The phoneme templates are specified in terms of control parameters to a serial formant synthesiser. At each state, a 16-channel filter bank section is computed from the synthesis parameters. The recognition process uses a time-synchronous dynamic programming technique to find the path in the network that minimises the accumulated spectral distance to the input utterance. A technique for dynamic adaptation to the speaker's voice source spectrum is performed during recognition. Without adaptation, the average recognition for ten male speakers was 88% on an isolated-word task using a 26-word vocabulary. Adding voice source adaptation raised the performance to 96%. On a vocabulary of 3 connected digits, the adaptation technique improved the recognition rate for six male speakers from 87.7% to 92.8%. The improvement was largest for subjects with low initial recognition rate, indicating the benefit of the voice source adaptation technique for certain voices. Changing the voice source model and optimising the adaptation time constant raised the recognition rate further to 96.1%. Current work is directed towards speaker adaptation of phoneme parameters and modelling of the variability of the parameter dynamics at phoneme boundaries. RP BLOMBERG, M (reprint author), ROYAL INST TECHNOL, DEPT SPEECH COMMUN & MUS ACOUST, BOX 70014, S-10044 STOCKHOLM 70, SWEDEN. CR Ananthapadmanabha T. 
V., 1984, STLQPSR231984, P1 BLOMBERG M, 1989, MAY P INT C AC SPEEC, V1, P687 BLOMBERG M, 1991, 12TH P INT C PHON SC BLOMBERG M, 1988, STLQPSR231988 KTH, P69 BLOMBERG M, 1991, 2ND P EUR C SPEECH C BLOMBERG M, 1987, P EUROPEAN C SPEECH, V2, P369 BLOMBERG M, 1989, SEP P EUR C SPEECH C, P621 CARLSON R, 1989, MAY P INT C AC SPEEC, V1, P223 FANT G, 1985, SPEECH TRANSMISSION, P1 GOBL C, 1988, STLQPSR11988 KTH, P123 KARLSSON I, 1988, STLQPSR231988 KTH, P61 STALHAMMAR U, 1973, STLQPSR4, P1 NR 12 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1991 VL 10 IS 5-6 BP 453 EP 461 DI 10.1016/0167-6393(91)90048-X PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GY255 UT WOS:A1991GY25500003 ER PT J AU BONNEAUMAYNARD, H AF BONNEAUMAYNARD, H TI VECTOR QUANTIZATION FOR SPEAKER ADAPTATION - RESULTS ON A 5000-WORD DATABASE SO SPEECH COMMUNICATION LA English DT Article DE VECTOR QUANTIZATION; SPEAKER ADAPTATION; DYNAMIC TIME WARPING AB With a view to designing a speaker-independent large vocabulary recognition system, we evaluate a vector quantization approach for speaker adaptation. Only one speaker (the reference speaker) pronounces the application vocabulary. He also pronounces a small vocabulary called the adaptation vocabulary. Each new speaker then merely pronounces the adaptation vocabulary. We have compared two adaptation methods, establishing a correspondence between the codebooks of the reference and the new speakers, on a 20-speaker database with a 104-word application vocabulary. Method I uses a transposed codebook to represent the new speaker during the recognition process, whereas Method II uses a codebook which is obtained by clustering analysis on the NS's pronunciation of the adaptation vocabulary. The adaptation vocabulary contains 136 words. 
Comparison of the performance of the two methods shows that a new speaker's codebook is not necessary to represent the new speaker. Consequently we have used the first method to perform tests with a 5000-word application vocabulary, and a 4-speaker database. The adaptation is still efficient (the mean improvement is about 14%), even if the relative improvement is 30% compared to 56% obtained in the 104-word application experiment. Further experiments show that the recognition accuracy can be improved by increasing the adaptation vocabulary size and the codebook size. RP BONNEAUMAYNARD, H (reprint author), LAB INFORMAT MEAN & SCI INGN, CNRS, BP133, F-91403 ORSAY, FRANCE. CR BONNEAU H, 1987, P IEEE INT C ACOUST FENG MW, 1988, P IEEE INT C ACOUST FLAMENBAUM G, 1979, CAHIERS ANAL DONNEES, V4, P357 Grenier Y., 1980, ICASSP 80 Proceedings. IEEE International Conference on Acoustics, Speech and Signal Processing HUNT MJ, 1981, J ACOUST SOC AM, V69, pS41, DOI 10.1121/1.386266 LINDE Y, 1980, IEEE T COMMUN, V28, P84, DOI 10.1109/TCOM.1980.1094577 NAKAMURA S, 1989, P IEEE INT C ACOUST QUENOT G, 1986, P IEEE INT C ACOUST RABINER LR, 1983, AT&T TECH J, V62, P1075 RIGOLL G, 1989, P IEEE INT C ACOUST Rosenberg A. E., 1986, ICASSP 86 Proceedings. IEEE-IECEJ-ASJ International Conference on Acoustics, Speech and Signal Processing (Cat. No.86CH2243-4) Shikano K., 1986, ICASSP 86 Proceedings. IEEE-IECEJ-ASJ International Conference on Acoustics, Speech and Signal Processing (Cat. No.86CH2243-4) NR 12 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1991 VL 10 IS 5-6 BP 463 EP 469 DI 10.1016/0167-6393(91)90049-Y PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GY255 UT WOS:A1991GY25500004 ER PT J AU HIERONYMUS, JL AF HIERONYMUS, JL TI FORMANT NORMALIZATION FOR SPEECH RECOGNITION AND VOWEL STUDIES SO SPEECH COMMUNICATION LA English DT Article DE SPEECH RECOGNITION; FORMANT NORMALIZATION; MALE AND FEMALE VOICES; PHONETIC CONTEXT AB Vowel formant target frequencies from different talkers depend on the details of the vocal tract, sex, regional accent, speaking habits and other factors. Good vowel recognition and studies of vowels from different talkers require an accurate method for compensating for speaker differences in these frequencies. The major variance seen in the data is between males and females. However, even within the same sex class, there are large variations in the formant target frequencies for the same vowel in the same phonetic context. Various methods of compensating for speaker variation in formants were studied. Bark scaled formants and subtraction of Bark fundamental frequency from the first formant were tried first. In spite of recent published papers on the efficacy of this technique, it was found inadequate. The transformations were incapable of improving the clusters of the cardinal vowels, for example. A modification of the Gerstman technique, determining the speaker's formant range and then transforming into an "ideal" talker's range, was found to account for most of the variance due to different talkers given a small amount of training data. This technique was applied to vowel in context studies on American English. Formant ranges were studied for 125 talkers of General American English. Plots of formant ranges for males and females showed interesting patterns. The lower limit of the second formant was not very different, while the lower limit of the first formant was lower for males. 
Both the first and second formant maxima were larger for females. The modified Gerstman transformation was able to superimpose the formant targets for the same vowel in the same context from different talkers into the same region of F1, F2 space. There remained some residual variance between male and female, even after the transformation. These trends are shown in a series of plots of vowel target frequency data. RP HIERONYMUS, JL (reprint author), UNIV EDINBURGH, CTR SPEECH TECHNOL RES, EDINBURGH EH8 9YL, MIDLOTHIAN, SCOTLAND. CR Barry W. J., 1989, Computer Speech and Language, V3, DOI 10.1016/0885-2308(89)90003-X DISNER SF, 1980, J ACOUST SOC AM, V67, P253, DOI 10.1121/1.383734 FANT G, 1973, SPEECH SOUNDS FEATUR, P84 GERSTMAN LJ, 1968, IEEE T ACOUST SPEECH, VAU16, P78, DOI 10.1109/TAU.1968.1161953 LINDBLOM B, 1987, 11TH P INT C PHON SC, V3, P9 MAJURSKI WJ, 1987, MAR P DARPA SPEECH R, P61 MILLER JD, 1989, J ACOUST SOC AM, V85, P2114, DOI 10.1121/1.397862 NEAREY TM, 1989, J ACOUST SOC AM, V85, P2088, DOI 10.1121/1.397861 PETERSON G, 1952, J ACOUST SOC AM, V24, P693 PETERSON GE, 1961, J SPEECH HEAR RES, V4, P10 SYRDAL AK, 1986, J ACOUST SOC AM, V79, P1086, DOI 10.1121/1.393381 TRAUNMULLER H, 1981, J ACOUST SOC AM, V69, P1465 NR 12 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1991 VL 10 IS 5-6 BP 471 EP 478 DI 10.1016/0167-6393(91)90050-4 PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GY255 UT WOS:A1991GY25500005 ER PT J AU CARLSON, R GRANSTROM, B KARLSSON, I AF CARLSON, R GRANSTROM, B KARLSSON, I TI EXPERIMENTS WITH VOICE MODELING IN SPEECH SYNTHESIS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH SYNTHESIS; VOICE MODELING; VOICE SOURCE MODELS; TEXT-TO-SPEECH CONVERSION; FEMALE VOICE ID QUALITY AB Some experiments with voice modelling using recent developments of the KTH speech synthesis system will be presented. A new synthesizer, GLOVE, an extended version of OVE III has been implemented in the system. It contains an improved glottal source built on the LF voice source model, some extra control parameters for the voiced and noise sources and an extra pole/zero-pair in the nasal branch. Furthermore, the present research versions of the KTH text-to-speech system include possibilities for interactive manipulations at the parameter level with on-screen reference to natural speech. The synthesis system constitutes a flexible environment for voice modelling experiments. The new synthesis tools and models were used for synthesis-by-analysis experiments. A sentence uttered by a female speaker was analysed and a stylized copy was made using both the old and the new synthesis system. With the new system the synthetic copy sounded very similar to the natural utterance. RP CARLSON, R (reprint author), ROYAL INST TECHNOL, DEPT SPEECH COMMUN & MUS ACOUST, BOX 70014, S-10044 STOCKHOLM 70, SWEDEN. 
CR BLADON A, 1987, SEP P EUR C SPEECH T, V1, P55 CARLSON R, 1989, SEP P ESCA WORKSH SP CARLSON R, 1989, MAY P INT C AC SPEEC, V1, P223 CARLSON R, 1990, ADV SPEECH HEARING L, P269 CARLSON R, 1990, APR P INT C AC SPEEC, P317 FANT G, 1990, JUN P TUT RES WORKSH, P106 Fant Gunnar, 1985, STL QPSR, V4, P1 GOBL C, 1988, STL QPSR, V1, P123 GOBL C, 1989, IN PRESS P VOCAL FOL Karlsson I., 1988, 7th FASE Symposium. Proceedings Speech '88 KARLSSON I, 1989, SEP P EUR 89 EUR C S, P345 KLATT DH, 1990, J ACOUST SOC AM, V87, P820, DOI 10.1121/1.398894 LILJENCR.JC, 1968, IEEE T ACOUST SPEECH, VAU16, P137, DOI 10.1109/TAU.1968.1161961 NORD L, 1986, J PHONETICS, V14, P401 PINTO NB, 1989, IEEE T ACOUST SPEECH, V37, P1870, DOI 10.1109/29.45534 ROTHENBERG M., 1975, P SPEECH COMM SEM SO, V2, P235 NR 16 TC 24 Z9 25 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1991 VL 10 IS 5-6 BP 481 EP 489 DI 10.1016/0167-6393(91)90051-T PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GY255 UT WOS:A1991GY25500006 ER PT J AU KUWABARA, H TAKAGI, T AF KUWABARA, H TAKAGI, T TI ACOUSTIC PARAMETERS OF VOICE INDIVIDUALITY AND VOICE-QUALITY CONTROL BY ANALYSIS - SYNTHESIS METHOD SO SPEECH COMMUNICATION LA English DT Article DE VOICE INDIVIDUALITY; ANALYSIS SYNTHESIS; SPECTRAL MANIPULATION AB Experiments on voice individuality have been performed using an analysis-synthesis system capable of modifying pitch, formant frequencies and formant bandwidths. The results show that the perception of voice-individuality is significantly affected by formant shifts, especially of the lower three, and it is completely lost for a uniform shift of five percent. Pitch frequency and bandwidth manipulation, on the other hand, is less important to the individuality perception. C1 NHK JAPAN BROADCASTING CORP, SCI & TECH RES LABS, TOKYO 157, JAPAN. 
RP KUWABARA, H (reprint author), NISHI TOKYO UNIV, UENOHARA, YAMANASHI 40901, JAPAN. CR ITOH K, 1982, T IEICE JAPAN A, V65, P101 KUWABARA H, 1984, T COMMITTEE SPEECH R, pS84 KUWABARA H, 1984, SPEECH COMMUN, V3, P211, DOI 10.1016/0167-6393(84)90016-5 KUWABARA H, 1983, T COMMITTEE SPEECH R, pS82 WONG DY, 1979, IEEE T ACOUST SPEECH, V27, P350, DOI 10.1109/TASSP.1979.1163260 NR 5 TC 20 Z9 20 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1991 VL 10 IS 5-6 BP 491 EP 495 DI 10.1016/0167-6393(91)90052-U PG 5 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GY255 UT WOS:A1991GY25500007 ER PT J AU HERMES, DJ AF HERMES, DJ TI SYNTHESIS OF BREATHY VOWELS - SOME RESEARCH METHODS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH SYNTHESIS; BREATHINESS; VOWEL PERCEPTION; AUDITORY SCENE ANALYSIS ID PERIODIC PULSE; HIGH HARMONICS; NOISE; INTEGRATION; SEGREGATION; COMPONENTS; AUDIBILITY; PERCEPTION; FREQUENCY; TIME AB When vowels are synthesised by means of a source-filter model, a delta-pulse train is often used as a source signal. Although breathiness can to some extent be simulated by using a sophisticated glottal-source model, a more natural simulation of breathiness requires the addition of aspiration noise. When stationary noise is used, however, the noise is to a large extent perceived as coming from a separate sound source which hardly contributes to the breathy timbre of the vowel. This problem can be solved by using noise with a temporal envelope of the same periodicity as the pulse train. In a simple source-filter model, a combination of lowpass-filtered pulses and synchronous highpass-filtered noise bursts of equal energy was used as a source signal. In this way, the noise was no longer perceived as a separate sound, but integrated perceptually with the strictly periodic part of the signal. 
It will be shown that this integration consists of both a reduction of the loudness of the separate noise stream and a timbre change in the breathy vowel. RP HERMES, DJ (reprint author), INST PERCEPT RES, POB 513, 5600 MB EINDHOVEN, NETHERLANDS. CR Bregman A. S., 1978, ATTENTION PERFORM, VVII, P63 BREGMAN AS, 1985, PERCEPT PSYCHOPHYS, V37, P483, DOI 10.3758/BF03202881 Bregman AS., 1990, AUDITORY SCENE ANAL CARLSON R, 1990, JUN P TUT RES WORKSH, P28 DANNENBRING GL, 1976, J EXP PSYCHOL HUMAN, V2, P544, DOI 10.1037//0096-1523.2.4.544 DARWIN CJ, 1984, Q J EXP PSYCHOL-A, V36, P193 DARWIN CJ, 1981, Q J EXP PSYCHOL-A, V33, P185 DUIFHUIS H, 1971, J ACOUST SOC AM, V49, P1155, DOI 10.1121/1.1912477 DUIFHUIS H, 1970, J ACOUST SOC AM, V48, P888, DOI 10.1121/1.1912228 HALL JW, 1984, J ACOUST SOC AM, V76, P50, DOI 10.1121/1.391005 KLATT DH, 1990, J ACOUST SOC AM, V87, P820, DOI 10.1121/1.398894 MAKHOUL J, 1978, J ACOUST SOC AM, V64, P1577, DOI 10.1121/1.382141 MCADAMS S, 1984, THESIS STANDORD U MCFADDEN D, 1987, J ACOUST SOC AM, V81, P1519, DOI 10.1121/1.394504 REPP BH, 1988, LANG SPEECH, V31, P239 WAKEFIELD GH, 1985, J ACOUST SOC AM, V77, P1535, DOI 10.1121/1.391996 Weintraub M., 1985, THESIS STANFORD U NR 17 TC 26 Z9 26 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1991 VL 10 IS 5-6 BP 497 EP 502 DI 10.1016/0167-6393(91)90053-V PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GY255 UT WOS:A1991GY25500008 ER PT J AU FURUI, S AF FURUI, S TI SPEAKER-DEPENDENT-FEATURE EXTRACTION, RECOGNITION AND PROCESSING TECHNIQUES SO SPEECH COMMUNICATION LA English DT Article DE SPEAKER-DEPENDENT FEATURES; SPEAKER RECOGNITION; SPEAKER ADAPTATION; VOICE CONVERSION ID ISOLATED WORD RECOGNITION AB This paper discusses recent advances in and perspectives of research on speaker-dependent-feature extraction from speech waves, automatic speaker identification and verification, speaker adaptation in speech recognition, and voice conversion techniques. Speaker-dependent information exists both in the spectral envelope and in the supra-segmental features of speech. This individual information can be further classified into temporal and dynamic features. Speaker identification/verification methods can be divided into text-dependent and text-independent methods. Although text-dependent speaker verification techniques have almost reached the level suitable for practical implementation, text-independent techniques are still in the fundamental research stage. Both supervised and unsupervised speaker adaptation algorithms for speech recognition have recently been proposed, and remarkable progress has been achieved in this field. Improving synthesized speech quality by adding natural characteristics of voice individuality, and converting synthesized voice individuality from one speaker to another, are as yet little exploited research fields to be studied in the near future. Research on speaker-dependent information is one of the most important future directions for achieving advanced speech information processing systems. RP FURUI, S (reprint author), NIPPON TELEGRAPH & TEL PUBL CORP, MUSASHINO ELECT COMMUN LAB, HUMAN INTERFACE LABS, MUSASHINO, TOKYO 180, JAPAN. 
CR ABE M, 1988, P IEEE INT C ACOUST BENNANI Y, 1990, P IEEE INT C ACOUST EATOCK J, 1990, P INT C SPOKEN LANGU FENG MW, 1989, P IEEE INT C ACOUST FURUI F, 1972, T IECE 55A, V10, P549 FURUI S, 1981, IEEE T ACOUST SPEECH, V29, P254, DOI 10.1109/TASSP.1981.1163530 FURUI S, 1986, SPEECH COMMUN, V5, P183, DOI 10.1016/0167-6393(86)90007-5 FURUI S, 1990, P VERBA 90, P164 Furui S., 1974, T IECE 57 A, V12, P880 Furui S., 1989, DIGITAL SPEECH PROCE FURUI S, 1989, P IEEE INT C ACOUST FURUI S, 1980, IEEE T ACOUST SPEECH, V28, P129, DOI 10.1109/TASSP.1980.1163393 FURUI S, 1989, IEEE T ACOUST SPEECH, V37, P1923, DOI 10.1109/29.45538 HAMPSHIRE JB, 1990, P IEEE INT C ACOUST ISO K, 1989, SPR P M ACOUST SOC J Jelinek F., 1980, Pattern Recognition in Practice. Proceedings of an International Workshop JUANG BH, 1990, P IEEE INT C ACOUST KATO K, 1985, T COMMITTEE HEARING LEE KF, 1988, THESIS CARNEGIEMELLO LI KP, 1983, P IEEE INT C ACOUST MARIANI J, 1989, P IEEE INT C ACOUST MATSUI T, 1990, P INT C SPOKEN LANGU MATSUI T, 1991, P IEEE INT C ACOUST MONTACIE C, 1989, P IEEE INT C ACOUST NAIK JM, 1989, P IEEE INT C ACOUST NISHIMURA M, 1988, P IEEE IT C ACOUST S OGLESBY J, 1990, P IEEE INT C ACOUST PORITZ AB, 1982, P IEEE INT C ACOUST RABINER LR, 1983, AT&T TECH J, V62, P1075 RIGOLL G, 1989, P IEEE INT C ACOUST ROSENBERG AE, 1990, P INT C SPOKEN LANGU ROSENBERG AE, 1990, P IEEE INT C ACOUST SAVIC M, 1990, P IEEE INT C ACOUST SCHWARTZ R, 1987, P IEEE INT C ACOUST SHIKANO K, 1986, P IEEE INT C ACOUST SHIRAKI Y, 1987, SP8767 AC SOC JAP SOONG FK, 1988, IEEE T ACOUST SPEECH, V36, P871, DOI 10.1109/29.1598 ZHENG YC, 1988, P IEEE INT C ACOUST NR 38 TC 19 Z9 20 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1991 VL 10 IS 5-6 BP 505 EP 520 DI 10.1016/0167-6393(91)90054-W PG 16 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GY255 UT WOS:A1991GY25500009 ER PT J AU FANT, G KRUCKENBERG, A NORD, L AF FANT, G KRUCKENBERG, A NORD, L TI PROSODIC AND SEGMENTAL SPEAKER VARIATIONS SO SPEECH COMMUNICATION LA English DT Article DE PROSODY; TIMING; MALE FEMALE SPEECH; VOICE SOURCE AB The purpose of our presentation is to make an inventory of knowledge developed at the KTH about speaker variabilities including findings from our more recent databank projects on text reading. We shall have something to say about male/female differences, voice source characteristics, and about prosodic and segmental features in connected speech. We also have some data of more general statistical nature such as pause durations, long time average spectrum, and about relative proportions of voiced and voiceless segments in speech. RP FANT, G (reprint author), ROYAL INST TECHNOL, DEPT SPEECH COMMUN & MUS ACOUST, BOX 70014, S-10044 STOCKHOLM 70, SWEDEN. 
CR BLADON RAW, 1978, STL QPSR, V1, P1 Carlson R., 1975, AUDITORY ANAL PERCEP, P55 FANT G, 1989, SPEECH TRANSMISSION, V2, P1 Fant G, 1975, STL QPSR, V2-3, P1 FANT G, 1991, 12EME C INT SCI PHON FANT G, 1986, J PHONETICS, V14, P303 FANT G, 1983, STL QPSR, V2, P1 FANT G, 1988, STL QPSR, V2, P1 FANT G, 1986, STL QPSR, V4, P1 FANT G, 1989, P EUROSPEECH 89 PARI, V1, P498 Fant Gunnar, 1959, ACOUSTIC ANAL SYNTHE Fant Gunnar, 1985, STL QPSR, V4, P1 Fonagy I., 1983, VIVE VOIX GOBL C, 1988, STL QPSR, V1, P123 Gobl C., 1989, STL QPSR, P9 GOBL C, 1988, STL QPSR, V2, P23 KARLSSON I, 1988, SPEECH TRANSMISSION, V2, P61 Karlsson I., 1989, STL QPSR, V1, P75 KLATT DH, 1990, J ACOUST SOC AM, V87, P820, DOI 10.1121/1.398894 MARTONY J, 1965, STL QPSR, V1, P4 METTAS O, 1977, STL APSR, V2, P1 NR 21 TC 17 Z9 17 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1991 VL 10 IS 5-6 BP 521 EP 531 DI 10.1016/0167-6393(91)90055-X PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GY255 UT WOS:A1991GY25500010 ER PT J AU SCHOENTGEN, J DEGUCHTENEERE, R AF SCHOENTGEN, J DEGUCHTENEERE, R TI AN ALGORITHM FOR THE MEASUREMENT OF JITTER SO SPEECH COMMUNICATION LA English DT Article DE JITTER; VOICE QUALITY; DYSPHONIA ID SPEECH SIGNALS; PERTURBATION; VOICE AB Jitter is the small fluctuation from one glottis cycle to the next in the duration of the fundamental period of the voice source. Analyzing jitter requires measuring glottal cycle durations accurately. Generally speaking, this is carried out by sampling at a medium rate and interpolating the discretized signal to obtain the required time resolution. In this article we describe an algorithm which solves the following two signal processing problems. Firstly, signal samples obtained by interpolation are only estimates of the original samples, which are unknown. 
The quality of the reconstruction of the signal therefore has to be evaluated. Secondly, small variations in cycle durations are easily corrupted by noise and measurement errors. The magnitude of measurement errors therefore has to be gauged. In our algorithm, the quality of reconstruction by signal interpolation is evaluated by a statistical test which takes into account the distribution of the corrections (which are brought about by interpolation) to the positions of the signal events which mark the beginnings of the glottal cycles. Three different interpolation methods have been implemented. Measurement errors are controlled by estimating independently the cycle durations of the speech and the electroglottographic signals. When the series obtained from both signals agree, we may then conclude that they reflect vocal fold activity and that they have not been unduly corrupted by errors or noise. The algorithm has been tested on 77 signals produced by healthy and dysphonic subjects. Its performance was satisfactory on all counts. C1 UNIV LIBRE BRUXELLES, INST PHONET, B-1050 BRUSSELS, BELGIUM. 
CR Chatfield C., 1984, ANAL TIME SERIES INT DEEM JF, 1989, J SPEECH HEAR RES, V32, P689 Heiberger V, 1982, SPEECH LANGUAGE ADV, V7, P299 HESS W, 1987, SPEECH COMMUN, V6, P55, DOI 10.1016/0167-6393(87)90069-0 HORII Y, 1982, J SPEECH HEAR RES, V25, P12 HORII Y, 1979, J SPEECH HEAR RES, V22, P5 KOIKE Y, 1977, ACTA OTO-LARYNGOL, V84, P105, DOI 10.3109/00016487709123948 LIEBERMAN P, 1963, J ACOUST SOC AM, V35, P344, DOI 10.1121/1.1918465 MCCLELLAN JH, 1979, PROGRAMS DIGITAL SIG NITTROUER S, 1990, J SPEECH HEAR RES, V33, P761 ORLIKOFF RF, 1989, J ACOUST SOC AM, V85, P888, DOI 10.1121/1.397560 PINTO NB, 1990, J ACOUST SOC AM, V87, P1278, DOI 10.1121/1.398803 SCHOENTGEN J, 1989, SPEECH COMMUN, V8, P61, DOI 10.1016/0167-6393(89)90068-X TITZE IR, 1987, J SPEECH HEAR RES, V30, P252 NR 14 TC 9 Z9 9 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1991 VL 10 IS 5-6 BP 533 EP 538 DI 10.1016/0167-6393(91)90056-Y PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GY255 UT WOS:A1991GY25500011 ER PT J AU JAVKIN, H HANSON, B KAUN, A AF JAVKIN, H HANSON, B KAUN, A TI THE EFFECTS OF BREATHY VOICE ON INTELLIGIBILITY SO SPEECH COMMUNICATION LA English DT Article DE BREATHINESS; INTELLIGIBILITY; DIFFERENCE LIMEN; SPEECH SYNTHESIS; FEMALE SPEECH AB Breathiness is used to form linguistic contrasts in some languages, but also characterizes speakers as individuals and, to an extent, gender. The acoustic consequences of breathy phonation are varied, and separable in synthetic speech: they include the introduction of a frication component into the voice source, a raising of the relative amplitude of the first harmonic and a lowering of the overall spectral tilt. Henton and Bladon (1985) claimed that breathiness diminishes intelligibility. 
The experiments described in the present paper used synthetic speech to determine the effect of adding a noise source to a modal voice source and to determine the effects of the different acoustic consequences of breathiness on the intelligibility of isolated words. No significant effects were found. C1 UNIV CALIF SANTA BARBARA, DEPT LINGUIST, SANTA BARBARA, CA 93106 USA. UNIV CALIF LOS ANGELES, DEPT LINGUIST, LOS ANGELES, CA 90024 USA. RP JAVKIN, H (reprint author), PANASONIC TECHNOL INC, SPEECH TECHNOL LAB, SANTA BARBARA, CA 93102 USA. CR BERNSTEIN J, 1981, J ACOUST SOC AM, V69, P1132, DOI 10.1121/1.385693 FLANAGAN JL, 1955, J ACOUST SOC AM, V27, P613, DOI 10.1121/1.1907979 HENTON CG, 1985, LANG COMMUN, V5, P221, DOI 10.1016/0271-5309(85)90012-6 Javkin H., 1989, ICASSP-89: 1989 International Conference on Acoustics, Speech and Signal Processing (IEEE Cat. No.89CH2673-2), DOI 10.1109/ICASSP.1989.266410 JAVKIN HR, 1987, 11TH P INT C PHON SC KLATT D H, 1985, Journal of the Acoustical Society of America, V78, pS81, DOI 10.1121/1.2023019 NR 6 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1991 VL 10 IS 5-6 BP 539 EP 543 DI 10.1016/0167-6393(91)90057-Z PG 5 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GY255 UT WOS:A1991GY25500012 ER PT J AU VILKMAN, E LAINE, UK KOLJONEN, J AF VILKMAN, E LAINE, UK KOLJONEN, J TI SUPRAGLOTTAL ACOUSTICS AND VOWEL INTRINSIC FUNDAMENTAL-FREQUENCY - AN EXPERIMENTAL-STUDY SO SPEECH COMMUNICATION LA English DT Article DE LARYNX; VOCAL FOLDS; PHONATION; PITCH; FUNDAMENTAL FREQUENCY; VOWELS; VOCAL TRACT; ACOUSTICS ID TRACT; F0 AB Two excised human larynges were used to investigate the effects of changes in supraglottal acoustics on phonation, especially fundamental frequency (F0). An artificial supraglottal tube was connected to the larynx. 
Two different sized cylindrical blocks were inserted and moved in the tube to simulate the acoustics of neutral, front and back vowels. The changes in supraglottal acoustics were found to influence the vibratory pattern of the vocal folds from irregular to regular and also vice versa. Variations in F0 were also observed. The F0 changes were greatest at the instant when the blocks were inserted in the tube, i.e. when the tube was converted from open to blocked. Moving the block caused slight changes only. When the smaller block was inserted an F0 rise was generally noted. On average (X +/- SD) a 3.8 +/- 0.9 Hz F0 rise was measured in low chest phonation. F0 changes connected with the use of the larger block did not show any systematical variation. In many cases an F0 drop was measured. The results are interpreted in terms of an acoustic-mechanical feedback in vocal source-tract interaction. It is concluded that the so-called intrinsic F0 of vowels cannot be explained on an acoustical basis. The acoustical conditions for the small block were simulated using a theoretical model. C1 HELSINKI UNIV TECHNOL, ACOUST LAB, SF-02150 ESPOO 15, FINLAND. RP VILKMAN, E (reprint author), UNIV OULU, DEPT OTOLARYNGOL & PHONIATR, SF-90220 OULU 22, FINLAND. CR BEIL RG, 1962, J ACOUST SOC AM, V34, P347, DOI 10.1121/1.1928124 CONRAD WA, 1987, LARYNGEAL FUNCTION P, P320 EWAN WG, 1979, J ACOUST SOC AM, V66, P358, DOI 10.1121/1.383669 FLANAGAN JL, 1968, IEEE T ACOUST SPEECH, VAU16, P57, DOI 10.1109/TAU.1968.1161949 GUERIN B, 1980, PHONETICA, V37, P169 ISHIZAKA K, 1976, J ACOUST SOC AM, V60, P190, DOI 10.1121/1.381064 ISHIZAKA K, 1972, AT&T TECH J, V51, P1233 LAINE U, 1987, 11TH P INT C PHON SC, V5, P19 LAINE U, 1989, THESIS HELSINKI U TE Lehiste I., 1970, SUPRASEGMENTALS Petersen N. 
R., 1978, J PHONETICS, V6, P177 SAPIR S, 1989, Journal of Voice, V3, P44, DOI 10.1016/S0892-1997(89)80121-3 TERNSTROM S, 1988, J SPEECH HEAR RES, V31, P187 TITZE IR, 1988, J ACOUST SOC AM, V83, P1536, DOI 10.1121/1.395910 van den BERG J., 1959, PRACTICA OTO RHINO LARYNGOL, V21, P425 VILKMAN E, 1987, FOLIA PHONIATR, V39, P169 VILKMAN E, 1989, J PHONETICS, V17, P193 VILKMAN O, 1991, 6TH P VOC FOLD PHYS YAMADA H, 1979, STRENGTH BIOL MATERI Zemlin WR., 1988, SPEECH HEARING SCI A ZENKER W, 1958, MSCHR OHREN HEILK LA, V92, P296 NR 21 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD NOV PY 1991 VL 10 IS 4 BP 325 EP 334 DI 10.1016/0167-6393(91)90001-A PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GW019 UT WOS:A1991GW01900001 ER PT J AU CUTLER, A BUTTERFIELD, S AF CUTLER, A BUTTERFIELD, S TI WORD BOUNDARY CUES IN CLEAR SPEECH - A SUPPLEMENTARY REPORT SO SPEECH COMMUNICATION LA English DT Article DE SPEECH PRODUCTION; INTELLIGIBILITY; CLEAR SPEECH; WORD BOUNDARIES; SEGMENTATION; INTONATION; PITCH; LOUDNESS; INTENSITY ID SENTENCE COMPREHENSION; CONVERSATIONAL SPEECH; INTELLIGIBILITY; HEARING; HARD AB One of a listener's major tasks in understanding continuous speech is segmenting the speech signal into separate words. When listening conditions are difficult, speakers can help listeners by deliberately speaking more clearly. In four experiments, we examined how word boundaries are produced in deliberately clear speech. In an earlier report we showed that speakers do indeed mark word boundaries in clear speech, by pausing at the boundary and lengthening pre-boundary syllables; moreover, these effects are applied particularly to boundaries preceding weak syllables. 
In English, listeners use segmentation procedures which make word boundaries before strong syllables easier to perceive, thus marking word boundaries before weak syllables in clear speech will make clear precisely those boundaries which are otherwise hard to perceive. The present report presents supplementary data, namely prosodic analyses of the syllable following a critical word boundary. More lengthening and greater increases in intensity were applied in clear speech to weak syllables than to strong. Mean F0 was also increased to a greater extent on weak syllables than on strong. Pitch movement, however, increased to a greater extent on strong syllables than on weak. The effects were, however, very small in comparison to the durational effects we observed earlier for syllables preceding the boundary and for pauses at the boundary. RP CUTLER, A (reprint author), MRC, APPL PSYCHOL UNIT, CAMBRIDGE, ENGLAND. RI Cutler, Anne/C-9467-2012 CR Bond Z. S., 1980, PERCEPTION PRODUCTIO, P115 BOND ZS, 1983, PERCEPT PSYCHOPHYS, V34, P470, DOI 10.3758/BF03203063 BROWN RG, 1978, GROWTH, V42, P1 BUTTERFIELD S, 1990, P I ACOUSTICS 10, V12, P87 Cutler A., 1987, Computer Speech and Language, V2, DOI 10.1016/0885-2308(87)90004-0 CHEN FR, 1983, SPEAKING CLEARLY ACO, V2, P1 Clark J. E., 1988, LANGUAGE TOPICS ESSA, P161 Cooper W. 
E., 1980, SYNTAX SPEECH CUTLER A, 1988, J EXP PSYCHOL HUMAN, V14, P113, DOI 10.1037/0096-1523.14.1.113 CUTLER A, 1992, J MEMORY LANGUAGE, V31 CUTLER A, 1990, SPEECH COMMUN, V9, P485, DOI 10.1016/0167-6393(90)90024-4 CUTLER A, 1984, ATTENTION PERFORM, V10, P183 FAIRBANKS G, 1957, J ACOUST SOC AM, V29, P621, DOI 10.1121/1.1908985 FODOR JA, 1967, PERCEPT PSYCHOPHYS, V2, P289, DOI 10.3758/BF03211044 Francis WN, 1982, FREQUENCY ANAL ENGLI Grosjean Francois, 1980, TEMPORAL VARIABLES S, P91 HAKES DT, 1970, PERCEPT PSYCHOPHYS, V8, P413, DOI 10.3758/BF03207036 HAKES DT, 1972, J VERB LEARN VERB BE, V11, P278, DOI 10.1016/S0022-5371(72)80088-4 HAKES DT, 1970, PERCEPT PSYCHOPHYS, V8, P5, DOI 10.3758/BF03208920 KLATT DH, 1975, J SPEECH HEAR RES, V18, P686 MAASSEN B, 1986, J SPEECH HEAR RES, V29, P227 MALECOT A, 1958, LANGUAGE, V34, P370, DOI 10.2307/410929 PAULBROWN D, 1988, J SPEECH HEAR RES, V31, P630 PICHENY MA, 1985, J SPEECH HEAR RES, V28, P96 PICHENY MA, 1986, J SPEECH HEAR RES, V29, P434 SHAFERVINCENT K, 1983, PHONETICA, V40, P177 Valian V. V., 1976, COGNITION, V4, P115 NR 27 TC 23 Z9 24 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD NOV PY 1991 VL 10 IS 4 BP 335 EP 353 DI 10.1016/0167-6393(91)90002-B PG 19 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GW019 UT WOS:A1991GW01900002 ER PT J AU GANDOUR, J POTISUK, S PONGLORPISIT, S DECHONGKIT, S AF GANDOUR, J POTISUK, S PONGLORPISIT, S DECHONGKIT, S TI INTERSPEAKER AND INTRASPEAKER VARIABILITY IN FUNDAMENTAL-FREQUENCY OF THAI TONES SO SPEECH COMMUNICATION LA English DT Article DE SPEECH PRODUCTION; LEXICAL TONES; FUNDAMENTAL FREQUENCY; THAI ID PERCEPTION AB A measure was obtained of variability in fundamental frequency (F0) in citation forms of lexical tones. 
The language selected for investigation was Thai, a tone language with five lexical tones: mid, low, falling, high and rising. Twenty speakers participated in the experiment: 10 ''young'' male speakers and 10 ''old'' speakers, 5 male and 5 female. High-quality tape recordings were obtained of each subject's productions of a minimal set of five monosyllabic words. F0 contours were extracted by a cepstral analysis. A comparison was made of inter- and intraspeaker variability in the production of the five Thai tones. Results of analysis of variance indicated that the degree of intersubject variability in F0 was greater than intraspeaker across all five tones, that young and old speakers exhibited the same pattern of variability, and that variability in tone production differed depending on the lexical tone. The falling and rising tones exhibited smaller degrees of variability than the mid, low or high. Findings are interpreted to highlight the nature of F0 variability, the relationship of F0 variability to amount of F0 movement, and crosslinguistic differences in F0 variability as a function of prosodic structure. C1 MAHIDOL UNIV, BANGKOK 10700, THAILAND. RP GANDOUR, J (reprint author), PURDUE UNIV, W LAFAYETTE, IN 47907 USA. CR ABRAMSON AS, 1978, LANG SPEECH, V21, P319 ABRAMSON AS, 1962, INT J AM LINGUISTI 2, V28 Abrao A., 1976, TAI LINGUISTICS HONO, P1 ATKINSON JE, 1976, J ACOUST SOC AM, V60, P440, DOI 10.1121/1.381101 BRADLEY C, 1916, U CALIFORNIA PUBLICA, V12, P195 Bradley C. B., 1911, J AM ORIENTAL SOC, V31, P282, DOI 10.2307/3087645 EADY SJ, 1982, LANG SPEECH, V25, P29 EARLE MA, 1975, SCRL MONOGRAPH, V11 Erickson D., 1974, PASAA, V4, P1 Erickson D. 
M, 1976, THESIS U CONNECTICUT GANDOUR JT, 1978, LANG SPEECH, V21, P1 GANDOUR J, 1975, STUDIES THAIL LINGUI, P170 GANDOUR J, 1983, J PHONETICS, V11, P149 GANDOUR J, 1988, BRAIN LANG, V35, P201, DOI 10.1016/0093-934X(88)90109-5 HENDERSON EJA, 1964, HONOR D JONES, P415 KENT RD, 1979, J SPEECH HEAR RES, V22, P627 Maddieson I, 1978, UNIVERSALS HUMAN LAN, P335 OHALA Johni, 1978, TONE LINGUISTIC SURV, P5 Pike K. L., 1948, TONE LANGUAGES ROSE P, 1987, SPEECH COMMUN, V6, P343, DOI 10.1016/0167-6393(87)90009-4 SHEN SXN, 1991, JAN ANN M LING SOC A SUNDBERG J, 1979, J PHONETICS, V7, P71 NR 22 TC 14 Z9 14 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD NOV PY 1991 VL 10 IS 4 BP 355 EP 372 DI 10.1016/0167-6393(91)90003-C PG 18 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GW019 UT WOS:A1991GW01900003 ER PT J AU TOURATZIDIS, L DOLOGLOU, I CARAYANNIS, G AF TOURATZIDIS, L DOLOGLOU, I CARAYANNIS, G TI THE EIGENPROBLEM FORMULATION FOR HMM SO SPEECH COMMUNICATION LA English DT Article DE HIDDEN MARKOV MODEL; SPEECH RECOGNITION ID HIDDEN MARKOV-MODELS; RECOGNITION AB This paper presents a new matrix formulation of the basic concepts governing discrete Hidden Markov Models (HMM). Using this formulation, we show that symbol and state probabilities are exponential functions of the transition matrix of the model. Furthermore, based on the eigenanalysis of the transition matrix, a closed form relationship is derived between the eigenvalues of this matrix and the symbol probabilities at different instants. The matrix formulation provides a useful tool for the physical interpretation of the learning and decision process through HMM. A better insight is obtained, and tools are also given for a design with improved learning characteristics. 
RP TOURATZIDIS, L (reprint author), NATL TECH UNIV ATHENS, DEPT ELECT ENGN, DIV COMP SCI, GR-15773 ZOGRAFOS, GREECE. CR Baum L. E., 1972, INEQUALITIES, V3, P1 BAUM LE, 1968, PAC J MATH, V27, P211 BAUM LE, 1967, B AM MATH SOC, V73, P360, DOI 10.1090/S0002-9904-1967-11751-8 BAUM LE, 1966, ANN MATH STAT, V37, P1554, DOI 10.1214/aoms/1177699147 Horn R.A., 1988, MATRIX ANAL Kleinrock L, 1975, QUEUING SYSTEMS, V1 Picone J., 1990, IEEE ASSP Magazine, V7, DOI 10.1109/53.54527 RABINER LR, 1989, P IEEE, V77, P257, DOI 10.1109/5.18626 STRANG G, 1988, LINEAR ALGEBRA ITS A, P167 WILPON JG, 1990, IEEE T ACOUST SPEECH, V38, P1870, DOI 10.1109/29.103088 ZAMORA EM, 1981, INFORM PROCESS MANAG, V17, P305, DOI 10.1016/0306-4573(81)90044-3 NR 11 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD NOV PY 1991 VL 10 IS 4 BP 373 EP 380 DI 10.1016/0167-6393(91)90004-D PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GW019 UT WOS:A1991GW01900004 ER PT J AU YANNAKOUDAKIS, EJ HUTTON, PJ AF YANNAKOUDAKIS, EJ HUTTON, PJ TI GENERATION OF SPELLING RULES FROM PHONEMES AND THEIR IMPLICATIONS FOR LARGE DICTIONARY SPEECH RECOGNITION SO SPEECH COMMUNICATION LA English DT Article DE SPELLING RULES; PHONEMETOGRAPHEME CONVERSION; SPEECH RECOGNITION AB This paper presents the results of a statistical and deterministic analysis of two phonemic lexicons, with respect to the storage and generation of spelling rules using graphemes. The aim of this paper is to demonstrate the feasibility of generating correctly spelled words for the English language using phoneme-to-grapheme rules. An algorithm for generating the rules is presented. A set of spelling rules were identified by the analysis of two differently sized lexicons, 96,939 words and 11,638 words, the smaller lexicon being a subset of the larger. 
These rules were then tested for their general usability. 62.3% of all words in the 96,939 word lexicon could be spelled correctly utilising rules alone. A smaller lexicon which consisted of many of the more frequently occurring words plus a selection of less common words showed that 84.5% of this lexicon could be spelled correctly using rules generated by the analysis of its own lexicon. However, only 62.3% of this dictionary could be spelled correctly using rules generated from the lexicon of 96,939 words. It was also shown that phoneme-to-grapheme mappings are between 63% and 69% alphabetic, depending on the size of dictionary used. 59 general default rules were identified; unfortunately, only 22.6% of the smaller dictionary could be spelled correctly by using these rules. C1 UNIV LEEDS, DEPT TRANSPORT STUDIES, LEEDS LS2 9JT, W YORKSHIRE, ENGLAND. RP YANNAKOUDAKIS, EJ (reprint author), ATHENS UNIV ECON & BUSINESS, DEPT INFORMAT, GR-10434 ATHENS, GREECE. CR ELOVITZ HS, 1976, IEEE T ACOUST SPEECH, V24, P446, DOI 10.1109/TASSP.1976.1162873 HANNA PR, 1951, ELEMENTARY SCH J, V53, P329 HANNA PR, 1966, PHONEME TO GRAPHEME YANNAKOUDAKIS EJ, 1988, ARCHITECTURAL LOGIC YANNAKOUDAKIS EJ, 1983, 7TH P ASLIB C INF, P39 YANNAKOUDAKIS EJ, 1983, INFORM PROCESS MANAG, V19, P101, DOI 10.1016/0306-4573(83)90046-8 YANNAKOUDAKIS EJ, 1983, INFORM PROCESS MANAG, V19, P87, DOI 10.1016/0306-4573(83)90045-6 YANNAKOUDAKIS EJ, 1987, INFORM PROCESS MANAG, V23, P563, DOI 10.1016/0306-4573(87)90060-4 Yannakoudakis E.J., 1987, SPEECH SYNTHESIS REC NR 9 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD NOV PY 1991 VL 10 IS 4 BP 381 EP 394 DI 10.1016/0167-6393(91)90005-E PG 14 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GW019 UT WOS:A1991GW01900005 ER PT J AU NISHINUMA, Y DUEZ, D PABOUDJIAN, C AF NISHINUMA, Y DUEZ, D PABOUDJIAN, C TI AUTOMATIC CLASSIFICATION OF CONSONANT CLUSTERS IN FRENCH SO SPEECH COMMUNICATION LA English DT Article DE CONSONANT CLUSTER; DURATION; WORD RECOGNITION; FRENCH AB This study describes a method for the automatic detection of two-consonant clusters in French. Four corpora were used, consisting of 603 different consonant clusters and single consonants combined with the 3 vowels /i, a, a/, in disyllabic and trisyllabic words. The CCV, VCC, CV and VC syllable structures were studied. Stimuli were recorded 5 times in an anechoic room by 10 subjects. Word length, syllable length and syllable-component length were measured by means of a signal editor. A set of rules was deduced from a statistical analysis carried out on these data. Five relevant parameters were extracted, namely (1) voice feature, (2) mode of articulation in the first half of the cluster, (3) duration ratio between the vowel and consonant segments, (4) duration of the consonant segments, and (5) position (prevocalic, intervocalic and postvocalic). This phonetic approach was examined on the GRECO-BDSONS public test corpus. More than 90% of the consonant clusters were correctly classified on the basis of the values of the duration parameters extracted from our data base. Seventeen rules were used to output all the macro-classes of the consonant clusters in the test corpus. RP NISHINUMA, Y (reprint author), UNIV PROVENCE, INST PHONET, CNRS, URA 261, F-13621 AIX EN PROVENCE, FRANCE. 
CR AUBERGE V, 1988, 17EMES ACT JOURN ET, P55 AUTESSERE D, 1985, 14EMES ACT JOURN ET, P147 BARTKOVA K, 1987, SPEECH COMMUN, V6, P245, DOI 10.1016/0167-6393(87)90029-X BENKIRANE T, 1982, THESIS U PROVENCE AI CUTLER A, 1986, SPEECH HEARING, V8, P31 DUEZ D, 1987, THESIS U PROVENCE AI DUEZ D, 1986, 15EMES ACT JOURN ET, P97 ESPESSER R, 1984, TRAITEMENT SIGNAL SY Fujimura Osamu, 1978, SYLLABLES SEGMENTS, P107 Gimson A. C., 1962, INTRO PRONUNCIATION Gueron Jacqueline, 1985, GRAMMATICAL REPRESEN, P87 Haggard M., 1973, J PHONETICS, V1, P9 Haggard M., 1973, J PHONETICS, V1, P111 KLATT DH, 1973, MASSACHUSETTS I TECH, V108, P253 LINDBLOM B, 1973, AUDITORY ANAL PERCEP, P387 LINDBLOM B, 1981, DURATIONAL PATTERNS MACNEILAGE PF, 1963, J ACOUST SOC AM, V35, P461, DOI 10.1121/1.1918505 MALECOT A, 1955, LINGUA, V5, P45 OSHAUGHN.D, 1974, IEEE T ACOUST SPEECH, VAS22, P282, DOI 10.1109/TASSP.1974.1162588 OSHAUGHNESSY D, 1981, J PHONETICS, V9, P385 ROCHETTE C, 1974, GROUPES CONSONNES FR ROSSI M, 1968, REV ACOUSTIQUE, V3, P306 TESTON B, 1973, TRAVAUS I PHONETIQUE, V1, P115 TREIMAN R, 1982, EFFECTS SYLLABLE STR, V8 NR 24 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD NOV PY 1991 VL 10 IS 4 BP 395 EP 403 DI 10.1016/0167-6393(91)90006-F PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GW019 UT WOS:A1991GW01900006 ER PT J AU ARENDS, N POVEL, DJ VANOS, E MICHIELSEN, S CLAASSEN, J FEITER, I AF ARENDS, N POVEL, DJ VANOS, E MICHIELSEN, S CLAASSEN, J FEITER, I TI AN EVALUATION OF THE VISUAL SPEECH APPARATUS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH TRAINING; VISUAL AID; DEAF CHILDREN; EVALUATION ID QUALITY; DEAF; AIDS AB This study evaluates the Visual Speech Apparatus (VSA) as a visual aid for speech training of hearing-impaired children. 
Its efficacy was tested during a whole school year in a comparative study using an experimental and control group of children in the age range of 4-7 years. The 22 children in the experimental group were trained using the VSA, whereas the 16 children in the control group only received the regular speech lessons. Several times during the year, speech performance of the two groups was tested using the CID Phonetic Inventory. In addition, the performance of subjects in the experimental group was measured with a specially developed test, the VSA test. Results show that the children trained with the VSA obtained significantly higher scores on those subtests that assess the acquisition of basic speech skills, such as voice control and vowel production, than the children in the control group. The very young children especially benefited from the training with the VSA. C1 INST DOVEN, ST MICHIELSGESTEL, NETHERLANDS. RP ARENDS, N (reprint author), CATHOLIC UNIV NIJMEGEN, NIJMEGEN INST COGNIT RES & INFORMAT TECHNOL, DEPT EXPTL PSYCHOL, NIJMEGEN, NETHERLANDS. 
RI Michielsen, Stephen/A-3459-2008; Michielsen, Stephen/C-4726-2015 OI Michielsen, Stephen/0000-0001-8743-1521 CR ARENDS N, 1990, J SPEECH HEAR RES, V33, P116 BULLIS M, 1986, AM ANN DEAF, V131, P344 FRIEDMAN M, 1985, J COMMUN DISORD, V18, P159 GREENE BG, 1984, J ACOUST SOC AM, V76, P32, DOI 10.1121/1.391035 GULIAN E, 1986, British Journal of Audiology, V20, P181, DOI 10.3109/03005368609079015 *INT STAND ORG, 1964, ISOR389 REC KEWLEYPORT D, 1991, CLIN LINGUIST PHONET, V5, P13, DOI 10.3109/02699209108985500 KEWLEYPORT D, 1991, BEHAVIORAL ASPECTS S Ling D., 1976, SPEECH HEARING IMPAI LIPPMANN RP, 1982, SPEECH LANGUAGE ADV, V7, P105 MAKI D, 1983, SPEECH HEARING IMPAI MCGARR NS, 1989, VOLTA REV, V91, P7 MOOG JS, 1988, CID PHONETIC INVENTO NICKERSON RS, 1976, J SPEECH HEAR DISORD, V41, P120 POVEL DJ, 1991, SPEECH COMMUN, V10, P59, DOI 10.1016/0167-6393(91)90028-R POVEL DJ, 1986, J SPEECH HEAR RES, V29, P99 RUOSS M, 1990, FOLIA PHONIATR, V42, P184 VANUDEN F, 1977, WORLD LANGUAGE DEA 1 WATSON CS, 1989, J SPEECH HEAR RES, V32, P245 WATSON CS, 1989, VOLTA REV, V91, P29 YAMADA Y, 1988, J ACOUSTICAL SOC S1, V84, pS43, DOI 10.1121/1.2026312 NR 21 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD NOV PY 1991 VL 10 IS 4 BP 405 EP 414 DI 10.1016/0167-6393(91)90007-G PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GW019 UT WOS:A1991GW01900007 ER PT J AU MURTHY, HA YEGNANARAYANA, B AF MURTHY, HA YEGNANARAYANA, B TI FORMANT EXTRACTION FROM GROUP DELAY FUNCTION SO SPEECH COMMUNICATION LA English DT Article DE FOURIER TRANSFORM PHASE; SPECTRAL ROOT CEPSTRUM; GROUP DELAY FUNCTIONS; FORMANT EXTRACTION ID PHASE RP MURTHY, HA (reprint author), INDIAN INST TECHNOL, DEPT COMP SCI & ENGN, MADRAS 600036, TAMIL NADU, INDIA. 
CR BRACEWELL RN, 1986, FOURIER TRANSFORM IT, P6 LIM JS, 1979, IEEE T ACOUST SPEECH, V27, P223 MAKHOUL J, 1975, P IEEE, V63, P261 PAPOULIS A, 1977, SIGNAL ANAL, P231 RABINER LR, 1978, DIGITAL PROCESSING S, P367 TRIBOLET JM, 1977, IEEE T ACOUST SPEECH, V25, P170, DOI 10.1109/TASSP.1977.1162923 YEGNANARAYANA B, 1984, IEEE T ACOUST SPEECH, V32, P610, DOI 10.1109/TASSP.1984.1164365 YEGNANARAYANA B, 1978, J ACOUST SOC AM, V63, P1638, DOI 10.1121/1.381864 YEGNANARAYANA B, 1988, SIGNAL PROCESS, V4, P447 NR 9 TC 32 Z9 32 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD AUG PY 1991 VL 10 IS 3 BP 209 EP 221 DI 10.1016/0167-6393(91)90011-H PG 13 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GF939 UT WOS:A1991GF93900001 ER PT J AU HULT, G AF HULT, G TI SOME REMARKS ON A HALTING CRITERION FOR ITERATIVE LOW-PASS FILTERING IN A RECENTLY PROPOSED PITCH DETECTION ALGORITHM SO SPEECH COMMUNICATION LA English DT Note DE ITERATIVE FILTERING; LPC; AUTOCORRELATION ANALYSIS; PITCH DETECTION RP HULT, G (reprint author), SWEDISH TELECOMMUN, DIV RES & DEV, S-12386 FARSTA, SWEDEN. CR DOLOGLOU I, 1989, SPEECH COMMUN, V8, P309, DOI 10.1016/0167-6393(89)90013-7 FANT G, 1957, ERICSSON TECHNICS, V1 HULT G, 1990, P SPEECH SCI TECHNOL, P134 Markel JD, 1976, LINEAR PREDICTION SP Wolfram S., 1988, MATH SYSTEM DOING MA NR 5 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD AUG PY 1991 VL 10 IS 3 BP 223 EP 226 DI 10.1016/0167-6393(91)90012-I PG 4 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GF939 UT WOS:A1991GF93900002 ER PT J AU DOLOGLOU, I CARAYANNIS, G AF DOLOGLOU, I CARAYANNIS, G TI SOME REMARKS ON THE HALTING CRITERION FOR ITERATIVE LOW-PASS FILTERING IN A RECENTLY PROPOSED PITCH DETECTION ALGORITHM - REPLY SO SPEECH COMMUNICATION LA English DT Letter RP DOLOGLOU, I (reprint author), NATL TECH UNIV ATHENS, DEPT ELECT ENGN, DIV COMP SCI, GR-15773 ZOGRAFOS, GREECE. CR DOLOGLOU I, 1989, SPEECH COMMUN, V8, P309, DOI 10.1016/0167-6393(89)90013-7 HULT G, 1991, SPEECH COMMUN, V10, P223, DOI 10.1016/0167-6393(91)90012-I NR 2 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD AUG PY 1991 VL 10 IS 3 BP 227 EP 228 DI 10.1016/0167-6393(91)90013-J PG 2 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GF939 UT WOS:A1991GF93900003 ER PT J AU ROSE, P AF ROSE, P TI HOW EFFECTIVE ARE LONG-TERM MEAN AND STANDARD-DEVIATION AS NORMALIZATION PARAMETERS FOR TONAL FUNDAMENTAL-FREQUENCY SO SPEECH COMMUNICATION LA English DT Article DE CHINESE; TONE; NORMALIZATION; FUNDAMENTAL FREQUENCY; LONG TERM MEAN; LONG TERM STANDARD DEVIATION ID NORMALIZATION RP ROSE, P (reprint author), AUSTRALIAN NATL UNIV, DEPT LINGUIST, GPO BOX 4, CANBERRA, ACT 2601, AUSTRALIA. CR Baken R. J, 1987, CLIN MEASUREMENT SPE Catford John C., 1977, FUNDAMENTAL PROBLEMS Chambers Jack K., 1980, DIALECTOLOGY CHAO YR, 1930, TROISIEME SERIE, V30, P106 Chen G. 
T., 1974, J CHINESE LINGUISTIC, V2, P159 DISNER SF, 1980, J ACOUST SOC AM, V67, P253, DOI 10.1121/1.383734 DSINER S, 1986, EXPT PHONOLOGY, P69 EARLE MA, 1975, MONOGRAPHH SPEECHH C, V11 FJUISAKI H, 1983, PRODUCTION SPEECH, P39 FOK ACY, 1974, PUBLICATIONS CTR ASI, V18 HENDERSON EJA, 1964, HONOR D JONES, P415 ISHIZAKA, 1976, J ACOUST SOC AM, V60, P1193 Jassem W, 1971, J INT PHON ASSOC, V1, P59 Jassem W, 1973, SPEECH ANAL SYNTHESI, V3, P209 JASSEM W, 1975, AUDITORY ANAL PERCEP, P523 KRATOCHVIL P, 1977, CAHIERS LINGUISTIQUE, V1, P7 Ladefoged P., 1967, 3 AREAS EXPT PHONETI LADEFOGED P, 1986, WORKING PAPERS PHONE, V64 Lehiste I., 1970, SUPRASEGMENTALS MADDIESON I, 1979, WORKING PAPERS PHONE, V45, P84 Nolan F, 1983, PHONETIC BASES SPEAK PHUONG CT, 1981, THESIS AUSTR NATIONA ROSE P, 1982, THESIS U CAMBRIDGE Rose P., 1989, CAHIERS LINGUISTIQUE, V18, P229, DOI 10.3406/clao.1989.1304 ROSE P, 1987, SPEECH COMMUN, V6, P343, DOI 10.1016/0167-6393(87)90009-4 ROSE P, 1990, PHONETICA, V47, P1 ROSE P, 1989, PROSODIC ANAL ASIAN, P55 ROSE PJ, 1990, 3RD P AUSTR INT C SP, P388 ROSE PJ, 1985, 18TH INT C SIN TIB L NR 29 TC 7 Z9 7 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD AUG PY 1991 VL 10 IS 3 BP 229 EP 247 DI 10.1016/0167-6393(91)90014-K PG 19 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GF939 UT WOS:A1991GF93900004 ER PT J AU ESKENAZI, M LACHERETDUJOUR, A AF ESKENAZI, M LACHERETDUJOUR, A TI EXPLORATION OF INDIVIDUAL STRATEGIES IN CONTINUOUS SPEECH SO SPEECH COMMUNICATION LA English DT Article DE PHONOLOGY; PRODUCTION RULES; INDIVIDUAL STRATEGIES; CAUSAL RELATIONS RP ESKENAZI, M (reprint author), LAB INFORMAT MEAN & SCI INGN, CNRS, BP 133, F-91403 ORSAY, FRANCE. 
CR BARD EG, 1990, HCRC P05 U ED REP COHEN MH, 1989, THESIS U CALIFORNIA Delattre P, 1966, SELECTED PAPERS FREN FANT G, 1989, SPEECH TRANSMISSION, P1 Fonagy I., 1983, VIVE VOIX FOUCHE P, 1969, TRAITE PRONONCIATION GROSJEAN F, 1975, PHONETICA, V31, P144 KERKHOFF J, 1987, FONPARS 1 USERS MANU Lacheret-Dujour A., 1989, Eurospeech 89. European Conference on Speech Communication and Technology LACHERETDUJOUR A, 1990, 1990 P ESCA ETRW ED, P143 LACHERETDUJOUR A, 1990, THESIS U PARIS PARIS Laver J, 1980, PHONETIC DESCRIPTION LAVER JDM, 1968, BRIT J DISORD COMMUN, V3, P43 LEON PR, 1973, FRENCH REV, V45, P783 LIENARD JS, 1984, 13EMES JEP SFA BRUSS LIENARD JS, 1989, SPEAK VAR SPEC STUD, P1 Martinet A., 1945, PRONONCIATION FRANCA Martinet Andre, 1973, DICT PRONONCIATION F PROUTS B, 1980, THESIS U PARIS SUD P SHOCKEY L, 1983, PHONETIC PHONOLOGICA Skinner B., 1969, CONTINGENCIES REINFO *UNIX, 1988, MAN US UNIV PROGR UT, P203 NR 22 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD AUG PY 1991 VL 10 IS 3 BP 249 EP 264 DI 10.1016/0167-6393(91)90015-L PG 16 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GF939 UT WOS:A1991GF93900005 ER PT J AU KREIMAN, J PAPCUN, G AF KREIMAN, J PAPCUN, G TI COMPARING DISCRIMINATION AND RECOGNITION OF UNFAMILIAR VOICES SO SPEECH COMMUNICATION LA English DT Article DE SPEAKER RECOGNITION; VOICE DISCRIMINATION; PROTOTYPE THEORY ID MULTIDIMENSIONAL CLASSIFICATION; QUALITIES; MEMORY C1 VET ADM MED CTR, LOS ANGELES, CA 90073 USA. RP KREIMAN, J (reprint author), UNIV CALIF LOS ANGELES, SCH MED, DIV HEAD & NECK SURG, WILSHIRE & SAWTELLE BLVDS, LOS ANGELES, CA 90073 USA. CR BRICKER PD, 1966, J ACOUST SOC AM, V40, P1441, DOI 10.1121/1.1910246 Carterette E. 
C., 1975, STRUCTURE PROCESS SP, P246 CLIFFORD B R, 1981, Law and Human Behavior, V5, P201, DOI 10.1007/BF01044763 Clifford BR, 1980, LAW HUMAN BEHAV, V4, P373, DOI DOI 10.1007/BF01040628 COMREY AL, 1973, 1ST COURSE FACTOR AN DORFMAN DD, 1969, J MATH PSYCHOL, V6, P487, DOI 10.1016/0022-2496(69)90019-4 FAGEL WPF, 1983, SPEECH COMMUN, V2, P315, DOI 10.1016/0167-6393(83)90048-1 HOLMGREN GL, 1967, J SPEECH HEAR RES, V10, P57 KEMPSTER G, 1984, THESIS NW U LEGGE GE, 1984, J EXP PSYCHOL LEARN, V10, P298 MATSUMOT.H, 1973, IEEE T ACOUST SPEECH, VAU21, P428, DOI 10.1109/TAU.1973.1162507 MURRY T, 1980, J ACOUST SOC AM, V68, P1294, DOI 10.1121/1.385122 MURRY T, 1977, J ACOUST SOC AM, V61, P1630, DOI 10.1121/1.381439 PAPCUN G, 1989, J ACOUST SOC AM, V85, P913, DOI 10.1121/1.397564 POLLACK I, 1954, J ACOUST SOC AM, V26, P403, DOI 10.1121/1.1907349 *SAS I, 1983, SUGI SUPPL LIBR US G SASLOVE H, 1980, J APPL PSYCHOL, V65, P111, DOI 10.1037/0021-9010.65.1.111 Schiffman Susan S., 1981, INTRO MULTIDIMENSION SINGH S, 1978, J ACOUST SOC AM, V64, P81, DOI 10.1121/1.381958 Swets J. A., 1982, EVALUATION DIAGNOSTI VOIERS WD, 1964, J ACOUST SOC AM, V36, P1065, DOI 10.1121/1.1919153 WALDEN BE, 1978, J SPEECH HEAR RES, V21, P265 NR 22 TC 19 Z9 19 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD AUG PY 1991 VL 10 IS 3 BP 265 EP 275 DI 10.1016/0167-6393(91)90016-M PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GF939 UT WOS:A1991GF93900006 ER PT J AU RIBBUM, B PERKIS, A PALIWAL, KK RAMSTAD, T AF RIBBUM, B PERKIS, A PALIWAL, KK RAMSTAD, T TI PERFORMANCE STUDY OF STOCHASTIC SPEECH CODERS SO SPEECH COMMUNICATION LA English DT Article DE DIGITAL SIGNAL PROCESSING; SPEECH CODING; STOCHASTIC CODERS RP RIBBUM, B (reprint author), NORWEGIAN INST TECHNOL, RUNIT, ELAB, DIV TELECOMMUN & ACOUST, N-7034 TRONDHEIM, NORWAY. 
CR [Anonymous], 1969, IEEE T AUDIO ELE SEP, P227 ATAL BS, 1979, IEEE T ACOUST SPEECH, V27, P247, DOI 10.1109/TASSP.1979.1163237 Atal B. S., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing ATAL BS, 1982, IEEE T COMMUN, V30, P600, DOI 10.1109/TCOM.1982.1095501 ATAL BS, 1984, 1984 P ICC AMST, P1610 BRIT TEL, CEPT TR3COST207 DAVIDSON G, 1988, 1988 P INT C AC SPEE, P163 DAVIDSON G, 1986, 1986 P INT C ACOUST, P3055 GRAY AH, 1976, IEEE T ACOUST SPEECH, V24, P459, DOI 10.1109/TASSP.1976.1162857 Jayant N. S., 1984, DIGITAL CODING WAVEF JAYANT NS, 1981, AT&T TECH J, V60, P707 JAYANT NS, 1987, 1987 P INT C ACOUST, P1288 KABAL P, 1988, 1988 P INT C ACOUST, P147 KITAWAKI N, 1984, IEEE COMMUN MAG, V22, P26, DOI 10.1109/MCOM.1984.1091825 KLEIJN WB, 1988, 1988 P IEEE INT C AC, P155 KROON P, 1986, IEEE T ACOUST SPEECH, V34, P1054, DOI 10.1109/TASSP.1986.1164946 KROON P, 1989, 1989 IEEE WORKSH SPE KROON P, 1987, 1987 P INT C ACOUST, P1649 LIN D, 1986, SIGNAL PROCESSING, V3 LINDE Y, 1980, IEEE T COMMUN, V28, P84, DOI 10.1109/TCOM.1980.1094577 MAX I, 1960, IEEE T INFORMATION S PALIWAL KK, 1987, ELAB STF44 F87108 RE PERKIS A, 1990, 3RD P AUSTR INT C SP, P40 PERKIS A, 1990, P TENCON 90, P334 PERKIS A, 1989, IEEE WORKSHOP SPEECH Perkis A., 1991, Advances in Speech Coding PERKIS A, 1990, 1990 P ISSPA BRISB, P710 PERKIS A, 1988, 2ND P AUSTR INT C SP, P60 RIBBUM B, 1988, 2ND P AUSTR INT C SP, P408 RIBBUM B, 1988, ELAB STF44 F88152 RE ROSE R, 1986, P INT C AC SPEECH SI, P453 SALAMI RA, 1989, IEEE WORKSHOP SPEECH SCHROEDER MR, 1985, MAR P INT C AC SPEEC, P937 SCHROEDER MR, 1982, 1982 P INT C ACOUST, P1668 SINGHAL S, 1984, P INT C ACOUST SPEEC SOONG FK, 1988, 1988 P INT C ACOUST, P394 SOONG FK, 1984, 1984 P INT C ACUST S SREENIVAS RV, 1988, 1988 P INT C ACOUST, P171 SUNDET AF, 1988, THESIS NORWEGIAN I T SVENDSEN T, 1986, ELAB AN86198 PROJ ME Torgerson W.S., 1958, THEORY METHODS SCALI TRANCOSO IM, 1986, 1986 P INT C ACOUST, P2375 TRO J, 
1988, ELAB STF44 F88175 RE UN CK, 1975, IEEE T COMMUN, V23, P1466 VISWANATHAN R, 1985, IEEE T ACOUST SPEECH, V23, P309 NR 45 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD AUG PY 1991 VL 10 IS 3 BP 277 EP 301 DI 10.1016/0167-6393(91)90017-N PG 25 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GF939 UT WOS:A1991GF93900007 ER PT J AU LIM, BK UN, CK AF LIM, BK UN, CK TI DUAL RLS LATTICE JOINT PROCESS ESTIMATION ALGORITHM FOR A TIME-VARYING ARMA SPEECH MODEL SO SPEECH COMMUNICATION LA English DT Note DE ARMA SPEECH MODELING; DUAL RLS ESTIMATOR; BOOTSTRAP INPUT ESTIMATION RP LIM, BK (reprint author), KONSTANTINOV GLASS WORKS, DEPT ELECT ENGN, COMMUN RES LAB, POB 150, KONSTANTINOVO, USSR. CR HONIG ML, 1983, IEEE T ACOUST SPEECH, V31, P415, DOI 10.1109/TASSP.1983.1164084 HONIG ML, 1984, ADAPTIVE FILTERS STR, P144 LEE DTL, 1982, IEEE T AUTOMAT CONTR, V27, P753, DOI 10.1109/TAC.1982.1103038 LIM BK, 1990, ELECTRON LETT, V26, P674, DOI 10.1049/el:19900441 MAKHOUL J, 1975, P IEEE, V63, P561, DOI 10.1109/PROC.1975.9792 SONG KH, 1983, IEEE T ACOUST SPEECH, V31, P1556 NR 6 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD AUG PY 1991 VL 10 IS 3 BP 303 EP 306 DI 10.1016/0167-6393(91)90018-O PG 4 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA GF939 UT WOS:A1991GF93900008 ER PT J AU PIERACCINI, R AF PIERACCINI, R TI SPEAKER INDEPENDENT RECOGNITION OF ITALIAN TELEPHONE SPEECH WITH MIXTURE DENSITY HIDDEN MARKOV-MODELS SO SPEECH COMMUNICATION LA English DT Article DE ITALIAN LANGUAGE; CONTINUOUS SPEECH; CONTEXT INDEPENDENT PHONES; COARTICULATION C1 CTR STUDI & LAB TELECOMUN SPA, I-10148 TURIN, ITALY. 
CR BAHL LR, 1983, IEEE T PATTERN ANAL, V5, P179 BAKIS R, 1976, 91ST M AC SOC AM BROWN PF, 1987, THESIS CARNEGIEMELLO CIARAMELLA A, 1989, P EUROSPEECH 89 PARI, P437 CRAVERO M, 1984, INT C ACOUST SPEECH FISSORE L, 1988, P INT C ACOUST SPEEC, P279 FISSORE L, 1988, P INT C ACOUST SPEEC, P414 FISSORE L, 1988, P INT C ACOUST SPEEC, P203 LEE CH, 1988, P IEEE INT C ACOUSTI, P410 LEE CH, 1989, P SPEECH NATURAL LAN, P280, DOI 10.3115/1075434.1075481 Lee K. F., 1989, P IEEE INT C AC SPEE, P445 LEE KF, 1989, P EUROSPEECH 89, P148 LEE KF, 1988, THESIS CARNEGIEMELLO MURVEITH H, 1989, P SPEECH NAT LANG WO, P238, DOI 10.3115/100964.100990 Ney H., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) Paul D.B., 1989, P IEEE INT C ACOUSTI, P449 Price P., 1988, P IEEE INT C AC SPEE, P651 RABINER LR, 1986, AT&T TECH J, V65, P21 SCHWARTZ R, 1984, P INT C ACOUST SPEEC, P1205 TOHKURA Y, 1986, P ICASSP 86, P761 NR 20 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1991 VL 10 IS 2 BP 105 EP 115 DI 10.1016/0167-6393(91)90034-Q PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA FW334 UT WOS:A1991FW33400001 ER PT J AU BASZTURA, C AF BASZTURA, C TI EXPERIMENTS OF AUTOMATIC SPEAKER RECOGNITION IN OPEN SETS SO SPEECH COMMUNICATION LA English DT Article DE SPEAKER RECOGNITION; OPEN SETS RP BASZTURA, C (reprint author), WROCLAW TECH UNIV, INST TELECOMMUN & ACOUST, I-28, PL-50370 WROCLAW, POLAND. 
CR Basztura C., 1978, Archives of Acoustics, V3 Basztura C., 1978, Archives of Acoustics, V3 DANTE HM, 1979, IEEE T ACOUST SPEECH, V27, P255, DOI 10.1109/TASSP.1979.1163238 MAJEWSKI W, 1984, 10TH P INT C PHON SC, P322 MAJEWSKI W, 1987, 11TH P INT C PHON SC, V2, P237 SONDHI MM, 1968, IEEE T ACOUST SPEECH, VAU16, P262, DOI 10.1109/TAU.1968.1161986 NR 6 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1991 VL 10 IS 2 BP 117 EP 127 DI 10.1016/0167-6393(91)90035-R PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA FW334 UT WOS:A1991FW33400002 ER PT J AU CHIEN, LF LEE, LS CHEN, KJ AF CHIEN, LF LEE, LS CHEN, KJ TI AN AUGMENTED CHART DATA STRUCTURE WITH EFFICIENT WORD LATTICE PARSING SCHEME IN SPEECH RECOGNITION APPLICATIONS SO SPEECH COMMUNICATION LA English DT Article DE WORD LATTICE PARSING; CHART PARSING; SPEECH RECOGNITION C1 ACAD SINICA, INST INFORMAT SCI, TAIPEI 115, TAIWAN. RP CHIEN, LF (reprint author), NATL TAIWAN UNIV, DEPT COMP SCI & INFORMAT ENGN, TAIPEI, TAIWAN. CR CHIEN LF, 1991, THESIS NATIONAL TAIW CHIEN LF, 1990, 1990 P INT C AC SPEE CHOW YL, 1989, 1989 P INT C AC SPEE, P727 DEROUAULT AM, 1986, IEEE T PATTERN ANAL, V8, P742 Hayes P. J., 1986, 11th International Conference on Computational Linguistics. 
Proceedings of Coling '86 HELLWING P, 1988, 12TH P INT C COMP LI, P242 HUANG CR, 1988, 1988 P INT COMP S TA, P38 KARTTUNEN L, 1986, CSLI LECTURE NOTES KAY M, 1980, CSL8012 XER REP LEE LS, 1987, P NATIONAL SCI COUNC, P273 LEE LS, 1987, 10 INT JOINT C ART I, P619 POLLARD C, 1987, CSLI LECTURE NOTE 12, V1 STOCK O, 1988, 12TH P INT C COMP LI, P636 THOMPSON H, 1981, 19TH P ANN M ASS COM TOMITA M, 1986, 1986 P IEEE INT C AC, P1569 WARD WH, 1988, 1988 P INT C AC SPEE, P275 NR 16 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1991 VL 10 IS 2 BP 129 EP 144 DI 10.1016/0167-6393(91)90036-S PG 16 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA FW334 UT WOS:A1991FW33400003 ER PT J AU MELONI, H GILLES, P BETARI, A AF MELONI, H GILLES, P BETARI, A TI REPRESENTATION OF ACOUSTIC AND PHONETIC KNOWLEDGE FOR SPEAKER-INDEPENDENT RECOGNITION OF SMALL VOCABULARIES SO SPEECH COMMUNICATION LA English DT Article DE ACOUSTIC AND PHONETIC KNOWLEDGE; KNOWLEDGE REPRESENTATION; SPEAKER-INDEPENDENT WORD RECOGNITION; PROLOG ID SPEECH RP MELONI, H (reprint author), FAC SCI AVIGNON, INFORMAT LAB, 33 RUE LOUIS PASTEUR, F-84000 AVIGNON, FRANCE. CR ABRY C, 1985, 14E JOURN ET PAR PAR, P156 ALDEFELD B, 1980, P IEEE, V68, P1364, DOI 10.1109/PROC.1980.11879 BRIDLE JS, 1983, SPEECH COMMUN, V2, P187, DOI 10.1016/0167-6393(83)90024-9 BULOT R, 1989, P EUROPEAN C SPEECH, P533 BULOT R, 1987, THESIS U AIX MARSEIL BULOT R, 1988, REV ACOUSTIQUE, V1, P241 Burton D. K., 1985, ICASSP 85. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (Cat. No. 
85CH2118-8) CAELEN J, 1979, THESIS TOULOUSE CAELEN J, 1985, 14E JOURN ET PAR PAR, P1129 CAELEN J, 1981, PROCESSUS ENCODAGE D, P128 Calliope, 1989, PAROLE SON TRAITEMEN COLMERAUER A, 1983, TSI, V2 FOHR D, 1985, 14E JOURN ET PAR PAR, P164 FUJIMURA O, 1981, PHONETICA, V38, P66 GIANNESINI F, 1985, PROLOG GUIZOL J, 1986, P S UNITES LEUR REPR, P24 HUANG SS, 1988, SPEECH COMMUN, V7, P41, DOI 10.1016/0167-6393(88)90020-9 JACOBSON R, 1963, ESSAI LINGUISTIQUE G JELINEK F, 1985, P IEEE, V73, P1616, DOI 10.1109/PROC.1985.13343 Lea W., 1980, TRENDS SPEECH RECOGN LIENARD JS, 1972, 3E JOURN ET PAR LANN, P347 MALMBERG B, 1972, PHONETIQUE FRANCAISE MELONI H, 1987, C AFCET RFIA ANTIBES MELONI H, 1986, JUL P S SPEECH REC MELONI H, 1989, P EUROPEAN C SPEECH, P625 MELONI H, 1982, THESIS AIX MARSEILLE PERENNOU G, 1985, 14E JOURN ET PAR PAR, P142 ROSENBERG AE, 1978, J ACOUST SOC AM, V64, pS181, DOI 10.1121/1.2004065 ROSSI M, 1977, LINGUISTIQUE, V13, P63 SAMUEL A, 1989, THESIS U AIX MARSEIL WOODS WA, 1982, ARTIF INTELL, V18, P295, DOI 10.1016/0004-3702(82)90025-X ZUE VW, 1983, SPEECH COMMUN, V2, P181, DOI 10.1016/0167-6393(83)90023-7 NR 32 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1991 VL 10 IS 2 BP 145 EP 154 DI 10.1016/0167-6393(91)90037-T PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA FW334 UT WOS:A1991FW33400004 ER PT J AU PEELING, SM PONTING, KM AF PEELING, SM PONTING, KM TI VARIABLE FRAME RATE ANALYSIS IN THE ARM CONTINUOUS SPEECH RECOGNITION SYSTEM SO SPEECH COMMUNICATION LA English DT Article DE VARIABLE FRAME RATE ANALYSIS; SPEECH RECOGNITION SYSTEM RP PEELING, SM (reprint author), ROYAL SIGNALS & RADAR ESTAB, SPEECH RES UNIT, SPEECH RES UNIT, MALVERN WR14 3PS, WORCS, ENGLAND. 
CR BRIDLE JS, 1982, IEEE T ACOUST SPEECH, P899 BRIDLE JS, 1982, NOV P I AC AUT C BOU CHOW YL, 1987, APR P IEEE INT C AC, P89 Holmes J. N., 1988, SPEECH SYNTHESIS REC HOLMES JN, 1980, IEE PROC-F, V127, P53 Hunt M. J., 1988, ICASSP 88: 1988 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.88CH2561-9), DOI 10.1109/ICASSP.1988.196617 KUHN MH, IEEE T ACOUST SPEECH, P736 LEE KF, THESIS CARNEGIEMELLO MOORE RK, 1989, COMMUNICATION PAUL DB, 1989, MAY P INT C AC SPEEC PONTING KM, 1991, IN PRESS COMPUT SPEE PONTING KM, 1989, NOV P MIL GOV SPEECH, P223 RUSSELL MJ, 1990, INT CONF ACOUST SPEE, P69, DOI 10.1109/ICASSP.1990.115539 NR 13 TC 5 Z9 5 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1991 VL 10 IS 2 BP 155 EP 162 DI 10.1016/0167-6393(91)90038-U PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA FW334 UT WOS:A1991FW33400005 ER PT J AU HOWELL, P KADIHANIFI, K AF HOWELL, P KADIHANIFI, K TI COMPARISON OF PROSODIC PROPERTIES BETWEEN READ AND SPONTANEOUS SPEECH MATERIAL SO SPEECH COMMUNICATION LA English DT Article DE PROSODY; SPONTANEOUS SPEECH; TONE UNITS; COALESCENCES; FRAGMENTATIONS; STRESS; PAUSES RP HOWELL, P (reprint author), UNIV LONDON UNIV COLL, DEPT PSYCHOL, LONDON WC1E 6BT, ENGLAND. CR Crystal D., 1969, PROSODIC SYSTEMS INT ENGSTRAND O, 1989, PERILUS, V10, P1 HOWELL P, 1989, P ESCA WORKSHOP SPEE Krull D., 1989, PERILUS, VX, P87 LINDBLOM B, 1990, NATO ADV SCI I D-BEH, V55, P403 PICHENY MA, 1985, J SPEECH HEAR RES, V28, P36 RABINER LR, 1975, AT&T TECH J, V54, P297 NR 7 TC 23 Z9 23 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUN PY 1991 VL 10 IS 2 BP 163 EP 169 DI 10.1016/0167-6393(91)90039-V PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA FW334 UT WOS:A1991FW33400006 ER PT J AU BAKAMIDIS, SG GLAROS, NA CARAYANNIS, G AF BAKAMIDIS, SG GLAROS, NA CARAYANNIS, G TI A REDUCED COMPLEXITY MULTIPULSE COMPRESSION SYSTEM SO SPEECH COMMUNICATION LA English DT Article DE SPEECH COMPRESSION; FAST TECHNIQUES; MULTIPULSE EXCITATION; FIXED-POINT ARITHMETIC RP BAKAMIDIS, SG (reprint author), NATL TECH UNIV ATHENS, GR-15773 ZOGRAFOS, GREECE. CR Atal B. S., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing DELSARTE P, 1987, IEEE T ACOUST SPEECH, V35, P645, DOI 10.1109/TASSP.1987.1165193 Jayant N. S., 1984, DIGITAL CODING WAVEF KROON P, 1986, IEEE T ACOUST SPEECH, V34 KROON P, 1984, P IEEE INT C ACOUST LEFEVRE JP, 1985, P IEEE INT C ACOUST MONTAGNA R, 1986, GLOBECOM 86 SENENSIEB GA, 1984, P INT C ACOUST SPEEC SINGHAL S, 1989, IEEE T ACOUST SPEECH, V37 SINGHAL S, 1984, P IEEE INT C ACOUST SLUYTER RJ, 1983, PHILIPS TECHN REV, V41 1987, DIGITAL SIGNAL PROCE NR 12 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1991 VL 10 IS 2 BP 171 EP 178 DI 10.1016/0167-6393(91)90040-Z PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA FW334 UT WOS:A1991FW33400007 ER PT J AU ESKENAZI, M MARIANI, J BORNERAND, S AF ESKENAZI, M MARIANI, J BORNERAND, S TI REPORT ON THE ICSLP SATELLITE WORKSHOP ON ASSESSMENT IN KOBE (JAPAN) AND VISITS TO SEVERAL JAPANESE LABORATORIES WORKING ON SPEECH-COMMUNICATION, 19-30 NOVEMBER 1990 SO SPEECH COMMUNICATION LA English DT Article C1 BULL SA, F-91300 MASSY, FRANCE. RP ESKENAZI, M (reprint author), CNRS, INFORMAT MECAN & SCI INGN LAB, BP 133, F-91403 ORSAY, FRANCE. 
NR 0 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1991 VL 10 IS 2 BP 179 EP 198 DI 10.1016/0167-6393(91)90041-Q PG 20 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA FW334 UT WOS:A1991FW33400008 ER PT J AU KUREMATSU, A IIDA, H MORIMOTO, T SHIKANO, K AF KUREMATSU, A IIDA, H MORIMOTO, T SHIKANO, K TI LANGUAGE PROCESSING IN CONNECTION WITH SPEECH TRANSLATION AT ATR INTERPRETING TELEPHONY RESEARCH LABORATORIES SO SPEECH COMMUNICATION LA English DT Article DE HMM (HIDDEN MARKOV MODEL); HMM-LR PARSER; SPOKEN LANGUAGE TRANSLATION; HPSG (HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR); UNIFICATION-BASED GRAMMAR C1 NIPPON TELEGRAPH & TEL PUBL CORP, MUSASHINO ELECT COMMUN LAB, MUSASHINO, TOKYO 180, JAPAN. RP KUREMATSU, A (reprint author), ATR, INTERPRETING TELEPHONY RES LABS, SEIKA CHO, KYOTO 61902, JAPAN. CR Gunji T, 1987, JAPANESE PHRASE STRU HANAZAWA T, 1989, J ACOUST SOC JAPAN, V10, P776 IIDA H, 1990, 13TH P INT C COMP LI, P370 KAKIGAHARA K, 1989, FAL P M AC SOC JAP, P93 KARTTUNEN L, 1986, CSLI8661 STANF U REP KITA K, 1989, P ICASSP 89, P703 KOGURE K, 1989, P COMPUTER WORLD 89, P135 Searle John R., 1969, SPEECH ACTS SHIEBER SM, 1986, CSLI LECTURE NOTES, V4, P11 YAMAOKA T, 1990, P C EUROPEAN AI, P726 YOSHIMOTO K, 1988, 12TH P INT C COMP LI, P779 NR 11 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD FEB PY 1991 VL 10 IS 1 BP 1 EP 9 DI 10.1016/0167-6393(91)90023-M PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA FH202 UT WOS:A1991FH20200001 ER PT J AU RUEHL, HW DOBLER, S WEITH, J MEYER, P NOLL, A HAMER, HH PIOTROWSKI, H AF RUEHL, HW DOBLER, S WEITH, J MEYER, P NOLL, A HAMER, HH PIOTROWSKI, H TI SPEECH RECOGNITION IN THE NOISY CAR ENVIRONMENT SO SPEECH COMMUNICATION LA English DT Article DE SPEECH RECOGNITION IN NOISE; HIDDEN MARKOV MODELS; CONNECTED-WORDS RECOGNITION; CAR ENVIRONMENT; MOBILE RADIO TELEPHONY ID CONNECTED WORD RECOGNITION C1 PHILIPS GMBH, W-2000 HAMBURG 54, GERMANY. RP RUEHL, HW (reprint author), PHILIPS KOMMUN IND AG, THURN & TAXIS STR 14, W-8500 NURNBERG, GERMANY. CR BAKER JK, 1975, SPEECH RECOGNITION, P512 BOURLARD H, 1985, BIBLIOTHECA PHONETIC, V12, P115 HIRSCH G, 1989, P EUROSPEECH 89 PARI, V2, P652 JELINEK F, 1976, P IEEE, V64, P532, DOI 10.1109/PROC.1976.10159 LEVINSON SE, 1983, AT&T TECH J, V62, P1035 NEY H, 1984, IEEE T ACOUST SPEECH, V32, P263, DOI 10.1109/TASSP.1984.1164320 NOLL A, 1989, P ICASSP 89 GLASGOW, V1, P679 NOLL A, 1986, 1986 P NTG C SPRACH, P26 RABINER LR, 1986, AT&T TECH J, V65, P21 VARY P, 1985, SIGNAL PROCESS, V8, P387, DOI 10.1016/0165-1684(85)90002-7 NR 10 TC 9 Z9 9 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD FEB PY 1991 VL 10 IS 1 BP 11 EP 22 DI 10.1016/0167-6393(91)90024-N PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA FH202 UT WOS:A1991FH20200002 ER PT J AU GAGNOULET, C JOUVET, D DAMAY, J AF GAGNOULET, C JOUVET, D DAMAY, J TI MAIRIEVOX - A VOICE-ACTIVATED INFORMATION-SYSTEM SO SPEECH COMMUNICATION LA English DT Article DE SPEECH RECOGNITION; VOICE ACTIVATED SYSTEM; ISOLATED WORDS; TELEPHONE NETWORK; INDUSTRIAL APPLICATIONS RP GAGNOULET, C (reprint author), CTR NATL ETUD TELECOMMUN, LAA TSS, BP 40, F-22301 LANNION, FRANCE. CR BARTKOVA K, 1987, 1987 P ICPHS TALL, P244 GAGNOULET C, 1989, 1989 P EUR PAR, P569 JELINEK F, 1976, P IEEE, V64, P532, DOI 10.1109/PROC.1976.10159 JOUVET J, 1986, P IEEE ICASSP 86 TOK, P1109 TUBACH C, 1989, C SPEECH TECH NEW YO, P266 NR 5 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1991 VL 10 IS 1 BP 23 EP 31 DI 10.1016/0167-6393(91)90025-O PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA FH202 UT WOS:A1991FH20200003 ER PT J AU JUNQUA, JC AF JUNQUA, JC TI A 2-PASS HYBRID SYSTEM USING A LOW DIMENSIONAL AUDITORY MODEL FOR SPEAKER-INDEPENDENT ISOLATED-WORD RECOGNITION SO SPEECH COMMUNICATION LA English DT Article DE AUTOMATIC SPEECH RECOGNITION; SPEAKER-INDEPENDENT; ISOLATED-WORDS; DISCRIMINATION; PHYSIOLOGY; PSYCHOACOUSTICS; PHONETICS; HYBRID SYSTEM; DISTANCE MEASURE; DYNAMIC FEATURES; AUDITORY MODEL ID SPEECH RP JUNQUA, JC (reprint author), DIV PANASON TECHNOL INC, SPEECH TECHNOL LAB, 3888 STATE ST, SANTA BARBARA, CA 93105 USA. CR ALINAT P, 1973, THESIS ATAL BS, 1974, J ACOUST SOC AM, V55, P1304, DOI 10.1121/1.1914702 BELGUTTE B, 1982, REPRESENTATION SPEEC, P131 BRADSHAW GL, 1982, ICASSP 82, P554 Buchanan B. 
G., 1985, RULE BASED EXPERT SY CAELEN J, 1979, THESIS U P SABATIER CARBONELL, 1986, ICASSP 86, P1201 Casacuberta F., 1988, Recent Advances in Speech Understanding and Dialog Systems. Proceedings of the NATO Advanced Institute COLE RA, 1983, ICASSP 84, P731 COLE RM, 1985, VARIABILITY INVARIAN, P325 DALLOS P, 1972, SCIENCE, V177, P356, DOI 10.1126/science.177.4046.356 DELGUTTE B, 1984, THESIS DOLMAZON JM, 1982, REPRESENTATION SPEEC, P151 ERMAN LD, 1980, ACM COMPUT SURV, V12, P213, DOI 10.1145/356810.356816 Fletcher H, 1940, REV MOD PHYS, V12, P0047, DOI 10.1103/RevModPhys.12.47 FURUI S, 1986, J ACOUST SOC AM, V80, P1016, DOI 10.1121/1.393842 HANSON BA, 1985, J ACOUST SOC AM S, V1, P49 Hanson B. A., 1986, ICASSP 86 Proceedings. IEEE-IECEJ-ASJ International Conference on Acoustics, Speech and Signal Processing (Cat. No.86CH2243-4) HERMANSKY, 1987, ICASSP 87, P1159 HERMANSKY H, 1985, SPEECH COMMUN, V4, P181, DOI 10.1016/0167-6393(85)90045-7 Hermansky H., 1988, ICASSP 88: 1988 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.88CH2561-9), DOI 10.1109/ICASSP.1988.196553 JUANG LR, 1986, ICASSP 86, P765 JUNQUA JC, 1988, J ACOUST SOC AM S, V1 JUNQUA JC, 1987, J ACOUST SOC AM S, V1, pS93 JUNQUA JC, 1989, THESIS U NANCY 1 Klatt D. H., 1982, IEEE ICASSP, P1278 LAMEL LF, 1982, ICASSP 82, P558 LYON RF, 1982, ICASSP 82, P1281 MAKHOUL J, 1973, IEEE T ACOUST SPEECH, VAU21, P140, DOI 10.1109/TAU.1973.1162470 MAKHOUL J, 1985, VARIABILAITY INVARIA, P344 MARTIN EA, 1987, ICASSP 87, P709 MYERS C, 1980, IEEE T ACOUST SPEECH, V28, P623, DOI 10.1109/TASSP.1980.1163491 Paliwal K. K., 1982, Speech Communication, V1, DOI 10.1016/0167-6393(82)90034-6 Rabiner L. R., 1981, ICASSP 81. 
Proceedings of the 1981 IEEE International Conference on Acoustics, Speech and Signal Processing ROBINSON DW, 1956, BRIT J APPL PHYS, V7, P166, DOI 10.1088/0508-3443/7/5/302 SCHROEDER MR, 1981, IEEE T ACOUST SPEECH, V29, P297, DOI 10.1109/TASSP.1981.1163546 Seneff S., 1986, ICASSP 86 Proceedings. IEEE-IECEJ-ASJ International Conference on Acoustics, Speech and Signal Processing (Cat. No.86CH2243-4) STEVENS SS, 1957, PSYCHOL REV, V64, P153, DOI 10.1037/h0046162 TOKHURA Y, 1985, J ACOUST SOC AM S, V1, pS11 Yegnanarayana B., 1979, ICASSP 79. 1979 IEEE International Conference on Acoustics, Speech and Signal Processing Zue V. W., 1986, ICASSP 86 Proceedings. IEEE-IECEJ-ASJ International Conference on Acoustics, Speech and Signal Processing (Cat. No.86CH2243-4) NR 41 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1991 VL 10 IS 1 BP 33 EP 44 DI 10.1016/0167-6393(91)90026-P PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA FH202 UT WOS:A1991FH20200004 ER PT J AU DENDRINOS, M BAKAMIDIS, S CARAYANNIS, G AF DENDRINOS, M BAKAMIDIS, S CARAYANNIS, G TI SPEECH ENHANCEMENT FROM NOISE - A REGENERATIVE APPROACH SO SPEECH COMMUNICATION LA English DT Article DE SPEECH ENHANCEMENT; PRINCIPAL COMPONENT ANALYSIS; EIGEN FILTERS; SIGNAL REGENERATION ID MAXIMUM-LIKELIHOOD; REPRESENTATION RP DENDRINOS, M (reprint author), NATL TECH UNIV ATHENS, DEPT ELECT ENGN, DIV COMP SCI, GR-15773 ATHENS, GREECE. 
CR BAKAMIDIS S, IN PRESS IEEE T ASSP BOLL SF, 1979, IEEE T ACOUST SPEECH, V27, P113, DOI 10.1109/TASSP.1979.1163209 CADZOW JA, 1984, IEEE T ACOUST SPEECH, V32, P512, DOI 10.1109/TASSP.1984.1164352 FEDER M, 1989, IEEE T ACOUST SPEECH, V37, P204, DOI 10.1109/29.21683 Golub G, 1983, MATRIX COMPUTATIONS KONSTANTINIDES K, 1988, IEEE T ACOUST SPEECH, V36, P757, DOI 10.1109/29.1585 LIM JS, 1978, IEEE T ACOUST SPEECH, V26, P471 MCAULAY RJ, 1980, IEEE T ACOUST SPEECH, V28, P137, DOI 10.1109/TASSP.1980.1163394 MCAULAY RJ, 1986, IEEE T ACOUST SPEECH, V34, P744, DOI 10.1109/TASSP.1986.1164910 PALIWAL KK, 1988, IEEE T ACOUST SPEECH, V36, P292, DOI 10.1109/29.1523 RAO BD, 1988, IEEE T ACOUST SPEECH, V36, P1026, DOI 10.1109/29.1626 SAMBUR MR, 1978, IEEE T ACOUST SPEECH, V26, P419, DOI 10.1109/TASSP.1978.1163137 TUFTS DW, 1982, P IEEE, V70, P975, DOI 10.1109/PROC.1982.12428 WIDROW B, 1975, P IEEE, V63, P1692, DOI 10.1109/PROC.1975.10036 NR 14 TC 58 Z9 63 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1991 VL 10 IS 1 BP 45 EP 57 DI 10.1016/0167-6393(91)90027-Q PG 13 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA FH202 UT WOS:A1991FH20200005 ER PT J AU POVEL, DJ ARENDS, N AF POVEL, DJ ARENDS, N TI THE VISUAL SPEECH APPARATUS - THEORETICAL AND PRACTICAL ASPECTS SO SPEECH COMMUNICATION LA English DT Article ID VOWEL CORRECTOR; DEAF; PERCEPTION; PITCH; AID C1 NIJMEGEN INST COGNIT RES & INFORMAT TECHNOL, NIJMEGEN, NETHERLANDS. INST DOVEN, ST MICHIELSGESTEL, NETHERLANDS. RP POVEL, DJ (reprint author), CATHOLIC UNIV NIJMEGEN, DEPT EXPTL PSYCHOL, NIJMEGEN, NETHERLANDS. CR ARENDS N, 1990, J SPEECH HEAR RES, V33, P116 ARENDS N, 1990, 17TH INT C ED DEAF R ATTNEAVE F, 1959, APPLICATIONS INFORMA Beinum F. J. 
Koopmans-van, 1986, PRECURSORS EARLY SPE, P37 Bernstein L E, 1988, J Rehabil Res Dev, V25, P53 BERNSTEIN LE, 1986, P INT C ACOUSTIC SPE, V1, P633 BRAEGES J, 1982, DEAFNESS COMMUNICATI Bruner J. S, 1966, THEORY INSTRUCTION CALVERT D, 1983, SPEECH DEAFNESS DUIFHUIS H, 1982, J ACOUST SOC AM, V71, P1568, DOI 10.1121/1.387811 ESSER G, 1983, PADAUDIOLOGIE AKTUEL, P185 Fant G., 1960, ACOUSTIC THEORY SPEE FLETCHER SG, 1983, J SPEECH HEAR DISORD, V48, P178 FLETCHER SG, 1983, AM ANN DEAF, V128, P525 FOURCIN A, 1986, J PHONETICS, V14, P435 GROENEN P, 1989, DIMENSIONELE REDUCTI HERMES DJ, 1988, J ACOUST SOC AM, V83, P257, DOI 10.1121/1.396427 HUDGINS CV, 1935, VOLTA REV, V37, P637 KEWLEYPORT D, 1989, UNPUB J SPEECH HEARI Levelt W. J., 1989, SPEAKING INTENTION A LEVITT H, 1972, P IEEE C SPEECH COMM, P230 LINDBOLM BEF, 1989, WORKING MODELS HUMAN Ling D., 1976, SPEECH HEARING IMPAI LIPPMANN RP, 1982, SPEECH LANGUAGE ADV, V7, P105 MAASSEN B, 1985, J ACOUST SOC AM, V78, P877, DOI 10.1121/1.392918 MACNEILA.PF, 1970, PSYCHOL REV, V77, P182, DOI 10.1037/h0029070 MENS LH, 1986, Q J EXP PSYCHOL-A, V38, P177 MILLER JD, 1989, J ACOUST SOC AM, V85, P2114, DOI 10.1121/1.397862 NICKERSON RS, 1976, J SPEECH HEAR DISORD, V41, P120 Papert S., 1980, MINDSTORMS Piaget J, 1950, INTRO EPISTEMOLOGIE PICKETT JM, 1972, IEEE T ACOUST SPEECH, VAU20, P3, DOI 10.1109/TAU.1972.1162343 POLS LCW, 1969, J ACOUST SOC AM, V46, P458, DOI 10.1121/1.1911711 POVEL DJ, 1974, PSYCHOL RES-PSYCH FO, V37, P51, DOI 10.1007/BF00309078 POVEL DJ, 1987, 11TH P INT C PHON SC, V1, P373 POVEL DJ, 1974, ARTICULATION CORRECT POVEL DJ, 1989, INT J REHABIL RES, V12, P109, DOI 10.1097/00004356-198903000-00027 POVEL DJ, 1986, J SPEECH HEAR RES, V29, P99 REETZ H, 1989, P EUROSPEECH PARIS, V1, P476 SACHS O, 1989, SEEING VOICES JOURNE STRANGE W, 1989, J ACOUST SOC AM, V85, P2081, DOI 10.1121/1.397860 STRONG WJ, 1975, VOLTA REV, V77, P536 SYRDAL AK, 1986, J ACOUST SOC AM, V79, P1086, DOI 10.1121/1.393381 VEENENDAAL M, 1989, PARALLEL 
PROCESSING WATANABE A, 1985, IEEE T ACOUST SPEECH, V33, P164, DOI 10.1109/TASSP.1985.1164539 WATSON CS, 1989, J SPEECH HEAR RES, V32, P245 WATSON CS, 1989, UNPUB VOLTA REV YAMADA Y, 1990, 17TH INT C ED DEAF R ZAHORIAN SA, 1988, UNPUB TRANSFORMATION NR 49 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1991 VL 10 IS 1 BP 59 EP 80 DI 10.1016/0167-6393(91)90028-R PG 22 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA FH202 UT WOS:A1991FH20200006 ER PT J AU MARTENS, JP DEPUYDT, L AF MARTENS, JP DEPUYDT, L TI BROAD PHONETIC CLASSIFICATION AND SEGMENTATION OF CONTINUOUS SPEECH BY MEANS OF NEURAL NETWORKS AND DYNAMIC-PROGRAMMING SO SPEECH COMMUNICATION LA English DT Article DE NEURAL NETWORKS; BROAD PHONETIC CLASSIFICATION; SEGMENTATION RP MARTENS, JP (reprint author), STATE UNIV GHENT, ELECTR & METROL LAB, ST PIETERSNIEUWSTR 41, B-9000 GHENT, BELGIUM. CR AUBERT XL, 1989, P ICASSP 89, P659 BECKER S, 1988, IMPROVING CONVERGENC BRIDLE J, 1978, P INT ACOUST AUTOMN, P25 CHIGIER B, 1988, P INT C ACOUST SPEEC, P449 COLE RA, 1988, P ICASSP 88, P453 DAVIS SB, 1980, IEEE T ACOUST SPEECH, V28, P357, DOI 10.1109/TASSP.1980.1163420 GLASS JR, 1988, P ICASSP NEW YORK AP, P429 KOHONEN T, 1987, P EUROPEAN C SPEECH, V2, P377 LAMEL L, 1986, FEB P DARPA SPEECH R, P100 LEE KF, 1989, P EUROSPEECH 89, P148 LEE KF, 1989, IEEE T ACOUST SPEECH, V37, P1641, DOI 10.1109/29.46546 RUMELHART DE, 1986, PARALLEL DISTRIBUTIO, V1 Schwartz R., 1985, P ICASSP 85, P1205 Sejnowski T. J., 1987, Complex Systems, V1 ZUE V, 1989, P IEEE INT C ACOUSTI, P389 NR 15 TC 5 Z9 5 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD FEB PY 1991 VL 10 IS 1 BP 81 EP 90 DI 10.1016/0167-6393(91)90029-S PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA FH202 UT WOS:A1991FH20200007 ER PT J AU FELDMAN, FA HAQUE, T AF FELDMAN, FA HAQUE, T TI DEVELOPMENT OF WALSH LINEAR CODING AND ITS APPLICATION TO SPEECH RECOGNITION SO SPEECH COMMUNICATION LA English DT Note DE SPEECH; RECOGNITION; WALSH; LPC ID VECTOR QUANTIZATION; WORD RECOGNITION; ALGORITHM RP FELDMAN, FA (reprint author), SUFFOLK UNIV, DEPT PHYS & ENGN, BOSTON, MA 02114 USA. CR BEAUCHAMP KG, 1984, APPLICATIONS WALSH R BURTON DK, 1985, IEEE T ACOUST SPEECH, V33, P837, DOI 10.1109/TASSP.1985.1164650 BUZO A, 1980, IEEE T ACOUST SPEECH, V28, P562, DOI 10.1109/TASSP.1980.1163445 CYBENKO G, 1980, SIAM J SCI STAT COMP, V1, P303, DOI 10.1137/0901021 GRAY RM, 1980, IEEE T ACOUST SPEECH, V28, P367, DOI 10.1109/TASSP.1980.1163421 Itakura F, 1975, IEEE T ACOUST SPEECH, V23, P57 JUANG BH, 1982, IEEE T ACOUST SPEECH, V30, P294 MAKHOUL J, 1975, P IEEE, V63, P561, DOI 10.1109/PROC.1975.9792 MYERS C, 1980, IEEE T ACOUST SPEECH, V28, P623, DOI 10.1109/TASSP.1980.1163491 OHGA H, 1982, IEEE T CONSUM ELECTR, V28, P263, DOI 10.1109/TCE.1982.353920 RABINER LR, 1975, AT&T TECH J, V54, P297 RABINER LR, 1984, AT&T TECH J, V63, P721 RABINER LR, 1978, IEEE T ACOUST SPEECH, V26, P34, DOI 10.1109/TASSP.1978.1163037 SAKOE H, 1978, IEEE T ACOUST SPEECH, V26, P43, DOI 10.1109/TASSP.1978.1163055 TYLER J, 1986, MICROPROCESS MICROSY, V10, P427, DOI 10.1016/0141-9331(86)90211-5 WHITE GM, 1976, IEEE T ACOUST SPEECH, V24, P183, DOI 10.1109/TASSP.1976.1162779 NR 16 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD FEB PY 1991 VL 10 IS 1 BP 91 EP 97 DI 10.1016/0167-6393(91)90030-W PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA FH202 UT WOS:A1991FH20200008 ER PT J AU TRANCOSO, IM MARQUES, JS RIBEIRO, CM AF TRANCOSO, IM MARQUES, JS RIBEIRO, CM TI CELP AND SINUSOIDAL CODERS - 2 SOLUTIONS FOR SPEECH CODING AT 4.8-9.6 KBPS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH CODING; CELP; SINUSOIDAL ID LINEAR PREDICTION; SIGNALS C1 INESC, ISEL, LISBON, PORTUGAL. RP TRANCOSO, IM (reprint author), Univ Tecn Lisboa, INST SUPER TECN, INESC, R ALVES REDOL 9, P-1000 LISBON, PORTUGAL. RI Marques, Jorge/C-1427-2010; Trancoso, Isabel/C-5965-2008 OI Marques, Jorge/0000-0002-3800-7756; Trancoso, Isabel/0000-0001-5874-6313 CR Adoul J., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) ALMEIDA L, 1984, P IEEE ICASSP84 SAN Almeida L. B., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing ALMEIDA LB, 1983, IEEE T ACOUST SPEECH, V31, P664, DOI 10.1109/TASSP.1983.1164128 ATAL BS, 1979, IEEE T ACOUST SPEECH, V27, P247, DOI 10.1109/TASSP.1979.1163237 Atal B. S., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing ATAL BS, 1970, AT&T TECH J, V49, P1973 Bronson E. C., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) CAMPBELL JP, 1989, P IEEE INT C AC SPEE, P735 CHEN JH, 1989, P GLOBECOM DALLAS, P1237 GERSON I, 1989, IEEE WORKSHOP SPEECH, P66 GRIFFIN D, 1987, THESIS MIT MASSACHUS GRIFFIN DW, 1985, P IEEE INT C AC SPEE, P513 Hanson B. A., 1983, Proceedings of ICASSP 83. 
IEEE International Conference on Acoustics, Speech and Signal Processing HEDELIN P, 1981, P IEEE INT C AC SPEE, P205 Kleijn W.B., 1988, P INT C AC SPEECH SI, P155 KROON P, 1988, P ICASSP, P151 KROON P, 1989, IEEE WORKSHOP SPEECH, P49 KROON P, 1986, IEEE T ACOUST SPEECH, V34, P1054, DOI 10.1109/TASSP.1986.1164946 LEGUYADER A, 1989, P INT C ASSP, P120 LI D, 1987, P IEEE ICASSP87 DALL, P1354 MAKHOUL J, 1975, P IEEE, V63, P561, DOI 10.1109/PROC.1975.9792 Markel JD, 1976, LINEAR PREDICTION SP MARQUES J, 1989, P ISSE ERLANGEN, P128 MARQUES J, 1989, P EUR C SPEECH COMM, P509 MARQUES J, 1988, SIGNAL PROCESS, V4, P891 MARQUES JS, 1989, P EUROSPEECH, P203 MCAULAY R, 1984, P IEEE ICASSP84 SAN MCAULAY R, 1989, P IEEE ICASSP89 GLAS, P207 McAulay R. J., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) MCAULAY R, 1985, P IEEE ICASSP85 TAMP, P945 MENEZ J, 1989, P INT C ASSP GLASGOW, P132 SCHROEDER M, 1984, P INT C COMMUN, P1610 SINGHAL S, 1984, P IEEE ICASSP84 SAN THOMSON D, 1988, P IEEE ICASSP88 NEW, P378 TRANCOSO I, 1987, P UER C SPEECH COMM, V2, P181 TRANCOSO IM, 1986, P IEEE INT C AC SPEE, P2375 Tremain T.E., 1982, SPEECH TECHNOLOG APR, P40 UN CK, 1975, IEEE T COMMUN, V23, P1466 ZELINSKI R, 1977, IEEE T ACOUST SPEECH, V25, P299, DOI 10.1109/TASSP.1977.1162974 NR 40 TC 7 Z9 7 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1990 VL 9 IS 5-6 BP 389 EP 400 DI 10.1016/0167-6393(90)90016-3 PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EX942 UT WOS:A1990EX94200002 ER PT J AU MOULINES, E DIFRANCESCO, R AF MOULINES, E DIFRANCESCO, R TI DETECTION OF THE GLOTTAL CLOSURE BY JUMPS IN THE STATISTICAL PROPERTIES OF THE SPEECH SIGNAL SO SPEECH COMMUNICATION LA English DT Article DE GLOTTAL CLOSURE; SPEECH SIGNAL; STATISTICAL PROPERTIES; STATISTICAL EVENTS DETECTION; HYPOTHESIS TESTING; LIKELIHOOD RATIO; DIVERGENCE CONVEXITY TEST C1 CTR NATL ETUD TELECOMMUN, DEPT RECH COMMUN PAROLE, F-22301 LANNION, FRANCE. CR Ananthapadmanabha T. V., 1982, SPEECH COMMUN, V1, P167, DOI 10.1016/0167-6393(82)90015-2 ANANTHAPADMANABHA TV, 1979, IEEE T ACOUST SPEECH, V27, P309, DOI 10.1109/TASSP.1979.1163267 ANDREOBRECHT R, 1988, IEEE T ACOUST SPEECH, V36, P29, DOI 10.1109/29.1486 BASSEVILLE M, 1986, LECTURE NOTES CONTRO CHENG YM, 1989, IEEE T ACOUST SPEECH, V37, P1805, DOI 10.1109/29.45529 Deller J. R. Jr., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing DEVETH J, 1989, P EUROSPEECH, V2, P43 DIFRANCESCO R, 1990, IN PRESS IEEE T ACOU Fant G., 1960, ACOUSTIC THEORY SPEE FANT G, 1979, STL QPSR, P21 GUERIN B, 1980, PHONETICA, V37, P161 HEDELIN P, 1986, P IEEE ICASSP86 TOKY, P465 HESS W, 1983, PITCH ALGORITHMS DEV HOLMES J, 1976, P IEEE INT C AC SPEE, P39 HONIG ML, 1984, ADAPTIVE FILTERS STU Hunt M. 
J., 1978, Proceedings of the 1978 IEEE International Conference on Acoustics, Speech and Signal Processing KAY SM, 1983, IEEE T ACOUST SPEECH, V31, P56, DOI 10.1109/TASSP.1983.1164050 KULLBACK S, 1959, INFORMATION THEORY S MAKHOUL JI, 1981, IEEE T ACOUST SPEECH, V29, P654, DOI 10.1109/TASSP.1981.1163566 Markel JD, 1976, LINEAR PREDICTION SP MOULINES E, 1990, THESIS PARIS SHON S, 1984, P IEEE ICASSP84 SAN STRUBE HW, 1974, J ACOUST SOC AM, V56, P1625, DOI 10.1121/1.1903487 VEENEMAN DE, 1985, IEEE T ACOUST SPEECH, V33, P369, DOI 10.1109/TASSP.1985.1164544 WONG DY, 1979, IEEE T ACOUST SPEECH, V27, P350, DOI 10.1109/TASSP.1979.1163260 NR 25 TC 5 Z9 5 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1990 VL 9 IS 5-6 BP 401 EP 418 DI 10.1016/0167-6393(90)90017-4 PG 18 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EX942 UT WOS:A1990EX94200003 ER PT J AU DALESSANDRO, C AF DALESSANDRO, C TI TIME-FREQUENCY SPEECH TRANSFORMATION BASED ON AN ELEMENTARY WAVE-FORM REPRESENTATION SO SPEECH COMMUNICATION LA English DT Article DE SPEECH ANALYSIS SYNTHESIS; WAVE-FORM SYNTHESIS; FORMANT SYNTHESIS; SINUSOIDAL REPRESENTATION; TIME-FREQUENCY MODIFICATION ID SINUSOIDAL REPRESENTATION; FOURIER-ANALYSIS; SYSTEM RP DALESSANDRO, C (reprint author), LAB INFORMAT MEAN & SCI INGN, CNRS, BP 133, F-91403 ORSAY, FRANCE. CR ALLEN JB, 1977, IEEE T ACOUST SPEECH, V25, P235, DOI 10.1109/TASSP.1977.1162950 Atal B. S., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing Charpentier F. 
J., 1986, P ICASSP, P2015 CROCHIERE RE, 1980, IEEE T ACOUST SPEECH, V28, P99, DOI 10.1109/TASSP.1980.1163353 DALESSANDRO C, 1988, P 1988 IEEE INT C AC, P351 DALESSANDRO C, 1989, THESIS U PARIS 6, P135 Fant G., 1960, ACOUSTIC THEORY SPEE, P15 FLANAGAN JL, 1972, SPEECH ANAL SYNTHESI, P204 FLANAGAN JL, 1966, AT&T TECH J, V45, P1493 Gabor D., 1946, Journal of the Institution of Electrical Engineers. III. Radio and Communication Engineering, V93 HOLMES JN, 1983, SPEECH COMMUN, V2, P251, DOI 10.1016/0167-6393(83)90044-4 KUWABARA H, 1984, SPEECH COMMUN, V3, P211, DOI 10.1016/0167-6393(84)90016-5 LIENARD JS, 1989, WAVELETS TIME FREQUE, P158 Lienard J., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) MAKHOUL JI, 1981, IEEE T CIRCUITS SYST, V28, P494, DOI 10.1109/TCS.1981.1085022 MAKHOUL J, 1976, P IEEE ICASSP 76 PHI, P87 MALAH D, 1979, IEEE T ACOUST SPEECH, V27, P121, DOI 10.1109/TASSP.1979.1163210 MCAULAY RJ, 1986, IEEE T ACOUST SPEECH, V34, P744, DOI 10.1109/TASSP.1986.1164910 NEUBURG EP, 1978, J ACOUST SOC AM, V63, P624, DOI 10.1121/1.381764 PORTNOFF MR, 1981, IEEE T ACOUST SPEECH, V29, P374, DOI 10.1109/TASSP.1981.1163581 QUATIERI TF, 1986, IEEE T ACOUST SPEECH, V34, P1449, DOI 10.1109/TASSP.1986.1164985 Rodet X., 1980, SPOKEN LANGUAGE GENE RODET X, 1985, P IEEE ICASSP 85 TAM, P736 SCOTT RJ, 1967, J ACOUST SOC AM, V41, P60, DOI 10.1121/1.1910329 SENEFF S, 1982, IEEE T ACOUST SPEECH, V30, P566, DOI 10.1109/TASSP.1982.1163919 TRANCOSO IM, 1988, SPEECH COMMUN, V7, P239, DOI 10.1016/0167-6393(88)90043-X NR 26 TC 5 Z9 5 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1990 VL 9 IS 5-6 BP 419 EP 431 DI 10.1016/0167-6393(90)90018-5 PG 13 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EX942 UT WOS:A1990EX94200004 ER PT J AU VANCOMPERNOLLE, D MA, W XIE, F VANDIEST, M AF VANCOMPERNOLLE, D MA, W XIE, F VANDIEST, M TI SPEECH RECOGNITION IN NOISY ENVIRONMENTS WITH THE AID OF MICROPHONE ARRAYS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH RECOGNITION; MICROPHONE ARRAY PROCESSING; NOISY ENVIRONMENTS C1 ESAT, KU LEUVEN, B-3001 HEVERLEE, BELGIUM. CR Blauert J., 1983, SPATIAL HEARING BOLL SF, 1979, IEEE T ACOUST SPEECH, V27, P113, DOI 10.1109/TASSP.1979.1163209 FAUCON G, 1989, 12TH P GRETSI S J LE, P517 FERRARA ER, 1981, IEEE T ACOUST SPEECH, V29, P766, DOI 10.1109/TASSP.1981.1163589 FLANAGAN JL, 1985, J ACOUST SOC AM, V78, P1508, DOI 10.1121/1.392786 GRIFFITHS LJ, 1982, IEEE T ANTENN PROPAG, V30, P27, DOI 10.1109/TAP.1982.1142739 PETERSON PM, 1989, THESIS MIT PORTER JE, 1984, 1984 P IEEE ICASSP84, V2 VANCOMPERNOLLE D, 1989, 1989 P IEEE INT C AC, P258 VANCOMPERNOLLE D, 1990, 1990 INT C AC SPEECH, P833 VANCOMPERNOLLE D, 1990, ESAT HIDDEN MARKOV M Van Compernolle D., 1989, Computer Speech and Language, V3, DOI 10.1016/0885-2308(89)90027-2 Widrow B, 1985, ADAPTIVE SIGNAL PROC NR 13 TC 11 Z9 11 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1990 VL 9 IS 5-6 BP 433 EP 442 DI 10.1016/0167-6393(90)90019-6 PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EX942 UT WOS:A1990EX94200005 ER PT J AU COLLIER, R AF COLLIER, R TI ON THE PERCEPTUAL ANALYSIS OF INTONATION SO SPEECH COMMUNICATION LA English DT Article DE INTONATION; PITCH; PROSODY; SPEECH PERCEPTION ID DECLINATION; ENGLISH; SPEECH RP COLLIER, R (reprint author), INST PERCEPT RES, POB 513, 5600 MB EINDHOVEN, NETHERLANDS. 
CR 't Hart J., 1975, J PHONETICS, V3, P235 THART J, 1986, J ACOUST SOC AM, V80, P1838, DOI 10.1121/1.394299 BOVES BL, 1984, INTONATION ACCENT RH, P250 COHEN A, 1964, 5TH P INT C AC LIEG COHEN A, 1982, PHONETICA, V39, P254 COLLIER R, INJ PRESS J PHONETIC COLLIER R, 1983, EXPLANATIONS LANGUAG, P237 Collier R., 1990, PERCEPTUAL STUDY INT Collier R., 1987, LARYNGEAL FUNCTION P, P403 COLLIER R, 1989, WORLDS WORDS ESSAYS, P245 COLLIER R, 1987, P EUROPEAN C SPEECH, V2, P165 COLLIER R, 1975, J ACOUST SOC AM, V58, P249, DOI 10.1121/1.380654 Cooper W. E., 1981, FUNDAMENTAL FREQUENC Cooper W. E., 1977, J ACOUST SOC AM, V62, P682 DEPIJPER JR, 1983, MODELLING BRIT INTON GUSSENHOVEN C, 1988, AUTOSEGMENTAL STUDIE, P95 HERMES DJ, 1990, J ACOUST SOC AM, V87, P866, DOI 10.1121/1.398896 HERMES DJ, 1987, INT J REHABIL RES, V10, P457, DOI 10.1097/00004356-198712000-00026 Hess W., 1983, PITCH DETERMINATION HUBER D, 1989, P IEEE ICASSP89 GLAS, V1, P601 Jones Daniel, 1909, INTONATION CURVES LADD DR, 1988, J ACOUST SOC AM, V84, P530, DOI 10.1121/1.396830 LADD DR, 1983, LANGUAGE, V59, P721, DOI 10.2307/413371 LEROY L, 1984, ANTWERP PAPERS LINGU, V40 Liberman Mark, 1984, LANGUAGE SOUND STRUC, P157 LIEBERMAN P, 1985, J ACOUST SOC AM, V77, P649, DOI 10.1121/1.391883 Ode C., 1989, RUSSIAN INTONATION P PIERREHUMBERT J, 1979, J ACOUST SOC AM, V66, P363, DOI 10.1121/1.383670 t'Hart J., 1973, J PHONETICS, V1, P309 TERKEN J, 1989, IPO ANN PROGR REPORT, V24, P33 THART J, UNPB J ACOUST SOC AM WILLEMS N, 1988, J ACOUST SOC AM, V84, P1250, DOI 10.1121/1.396625 NR 32 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1990 VL 9 IS 5-6 BP 443 EP 451 DI 10.1016/0167-6393(90)90020-A PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EX942 UT WOS:A1990EX94200006 ER PT J AU MOULINES, E CHARPENTIER, F AF MOULINES, E CHARPENTIER, F TI PITCH-SYNCHRONOUS WAVE-FORM PROCESSING TECHNIQUES FOR TEXT-TO-SPEECH SYNTHESIS USING DIPHONES SO SPEECH COMMUNICATION LA English DT Article DE TEXT-TO-SPEECH SYNTHESIS; VOICE QUALITY; PITCH-SYNCHRONOUS OVERLAP-ADD (PSOLA) ID TIME FOURIER-ANALYSIS; EXCITATION; TRANSFORM; SIGNALS; VOCODER C1 CTR NATL ETUD TELECOMMUN, DEPT SIGNAL, F-75643 PARIS, FRANCE. CR ALLEN JB, 1977, IEEE T ACOUST SPEECH, V25, P235, DOI 10.1109/TASSP.1977.1162950 ALLEN JB, 1977, P IEEE, V65, P1558, DOI 10.1109/PROC.1977.10770 CASPERS BE, 1983, J ACOUST SOC AM, P73 CHARPENTIER F, 1988, THESIS ECOLE NATIONA CHARPENTIER F, 1988, P INT C ACOUST SPEEC, P667 CROCHIERE RE, 1981, P IEEE, V69, P300, DOI 10.1109/PROC.1981.11969 Flanagan J., 1972, SPEECH ANAL SYNTHESI Ghitza O., 1986, ICASSP 86 Proceedings. IEEE-IECEJ-ASJ International Conference on Acoustics, Speech and Signal Processing (Cat. No.86CH2243-4) GRIFFIN DW, 1988, IEEE T ACOUST SPEECH, V36, P1223, DOI 10.1109/29.1651 GRIFFIN DW, 1984, IEEE T ACOUST SPEECH, V32, P236, DOI 10.1109/TASSP.1984.1164317 Hamon C., 1989, P INT C AC SPEECH SI, P238 Hedelin P., 1986, ICASSP 86 Proceedings. IEEE-IECEJ-ASJ International Conference on Acoustics, Speech and Signal Processing (Cat. No.86CH2243-4) KANG GS, 1985, IEEE T ACOUST SPEECH, V33, P377, DOI 10.1109/TASSP.1985.1164556 LOBO AP, 1989, P EUROSPEECH 89 PARI, V2, P27 Lukaszewicz K., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. 
No.87CH2396-0) MAKHOUL J, 1978, J ACOUST SOC AM, V64, P1577, DOI 10.1121/1.382141 MALAH D, 1979, IEEE T ACOUST SPEECH, V27, P121, DOI 10.1109/TASSP.1979.1163210 MIYOSHI Y, 1987, IEEE T ACOUST SPEECH, V35, P1223 MOORER JA, 1978, J AUDIO ENG, V26, P41 MOULINES E, 1988, P FASE INT C EDINBUR, P47 MOULINES E, 1990, THESIS ECOLE NATIONA Papoulis A., 1984, PROBABILITY RANDOM V PORTNOFF MR, 1981, IEEE T ACOUST SPEECH, V29, P364, DOI 10.1109/TASSP.1981.1163580 RICHARDS MA, 1979, IEEE T ACOUST SPEECH, V27, P841 Roucos S., 1985, ICASSP 85. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (Cat. No. 85CH2118-8) Sagisaka Y., 1988, ICASSP 88: 1988 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.88CH2561-9), DOI 10.1109/ICASSP.1988.196677 SENEFF S, 1982, IEEE T ACOUST SPEECH, V30, P566, DOI 10.1109/TASSP.1982.1163919 VANHEMERT JP, 1984, IPO19 PROGR REP, P20 VARGA A, 1987, IEEE T ACOUST SPEECH, V35, P586, DOI 10.1109/TASSP.1987.1165151 NR 29 TC 364 Z9 376 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1990 VL 9 IS 5-6 BP 453 EP 467 DI 10.1016/0167-6393(90)90021-Z PG 15 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EX942 UT WOS:A1990EX94200007 ER PT J AU DARWIN, CJ CULLING, JF AF DARWIN, CJ CULLING, JF TI SPEECH-PERCEPTION SEEN THROUGH THE EAR SO SPEECH COMMUNICATION LA English DT Article DE SPEECH PERCEPTION; LOW-LEVEL GROUPING MECHANISMS; F0 DIFFERENCES; PERCEPTUAL SEGREGATION; FREQUENCY MODULATION ID FUNDAMENTAL-FREQUENCY; BACKGROUND-NOISE; VOWEL QUALITY; REPRESENTATION; SEGREGATION; TONE RP DARWIN, CJ (reprint author), UNIV SUSSEX, EXPTL PSYCHOL, BRIGHTON BN1 9QG, E SUSSEX, ENGLAND. 
RI Culling, John/D-1468-2009 CR ASSMANN PF, 1990, J ACOUST SOC AM, V88, P680, DOI 10.1121/1.399772 ASSMANN PF, 1989, J ACOUST SOC AM, V85, P327, DOI 10.1121/1.397684 BREGMAN AS, 1990, PERCEPT PSYCHOPHYS, V47, P68, DOI 10.3758/BF03208166 BROKX JPL, 1982, J PHONETICS, V10, P23 CARLYON RP, 1989, J ACOUST SOC AM, V85, pS121, DOI 10.1121/1.2026689 CHALIKIA MH, 1989, PERCEPT PSYCHOPHYS, V46, P487, DOI 10.3758/BF03210865 CHALIKIA MH, UNPUB PERCEPTUAL SEG CIOCCA V, 1989, PERCEPT PSYCHOPHYS, V46, P39, DOI 10.3758/BF03208072 CULLING JF, 1989, BRIT J AUDIOL, V24, P194 DARWIN CJ, 1984, J ACOUST SOC AM, V76, P1636, DOI 10.1121/1.391610 DARWIN CJ, 1986, J ACOUST SOC AM, V79, P838, DOI 10.1121/1.393474 DARWIN CJ, 1981, Q J EXP PSYCHOL-A, V33, P185 DARWIN CJ, 1990, MODULARITY MOTOR THE DARWIN CJ, 1989, PERCEPT PSYCHOPHYS, V45, P333, DOI 10.3758/BF03204948 DARWIN CJ, 1987, PSYCHOPHYSICS SPEECH, P112 Delgutte B, 1987, PSYCHOPHYSICS SPEECH, P333 GARDNER RB, 1989, BRIT J AUDIOL, V23, P170 GARDNER RB, 1989, J ACOUST SOC AM, V85, P1329, DOI 10.1121/1.397464 HOUTSMA AJM, 1990, J ACOUST SOC AM, V87, P304, DOI 10.1121/1.399297 MATTINGLY IG, IN PRESS SPEECH OTHE MCADAMS S, 1989, J ACOUST SOC AM, V86, P2148, DOI 10.1121/1.398475 McAdams S., 1984, THESIS STANFORD U MEDDIS R, 1988, J ACOUST SOC AM, V83, P1056, DOI 10.1121/1.396050 MEDDIS R, IN PRESS MODELLING I MILLER MI, 1987, J ACOUST SOC AM, V81, P665, DOI 10.1121/1.394835 Moore B. C. J., 1989, INTRO PSYCHOL HEARIN MOORE BCJ, 1988, FREQUENCY SELECTIVIT NIEDERJOHN RJ, 1985, IEEE T ACOUST SPEECH, V33, P349, DOI 10.1109/TASSP.1985.1164571 SACHS MB, 1983, J NEUROPHYSIOL, V50, P27 Scheffers M. T. M., 1983, THESIS GRONINGEN U N WEINTRAUB M, 1987, NATO ASI SER, P125 Weintraub M., 1985, THESIS STANFORD U Winslow R. 
L., 1987, AUDITORY PROCESSING, P212 ZWICKER UT, 1984, SPEECH COMMUN, V3, P265, DOI 10.1016/0167-6393(84)90023-2 NR 34 TC 20 Z9 20 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1990 VL 9 IS 5-6 BP 469 EP 475 DI 10.1016/0167-6393(90)90022-2 PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EX942 UT WOS:A1990EX94200008 ER PT J AU NORD, L KRUCKENBERG, A FANT, G AF NORD, L KRUCKENBERG, A FANT, G TI SOME TIMING STUDIES OF PROSE, POETRY AND MUSIC SO SPEECH COMMUNICATION LA English DT Article DE TIMING; SPEECH; PROSE; POETRY; MUSIC; PERFORMANCE; METRICS RP NORD, L (reprint author), ROYAL INST TECHNOL, DEPT SPEECH COMMUN & MUS ACOUST, BOX 70014, S-10044 STOCKHOLM 70, SWEDEN. CR Allen G. D., 1975, J PHONETICS, V3, P75 BENGTSSON I, 1983, STUDIES MUSIC PERFOR BROWMAN CP, 1988, PHONETICA, V45, P140 CARLSON R, 1987, STLQPSR41987, P7 CARLSON R, 1972, STLQPSR41972, P11 CARLSON R, 1986, PHONETICA, V43, P140 DAUER RM, 1983, J PHONETICS, V11, P51 DENOS EA, 1988, THESIS U UTRECHT FANT G, 1989, STLQPSR11989, P7 FANT G, 1988, 2ND S ADV MAN MACH I FANT G, 1989, SPEECH TRANSMISSION, P1 FANT G, 1986, UNPUB APR MIN RHYTHM GOUDE G, 1970, UNPUB FORSOK OVER UP GOUDE G, 1968, UNDERHANDSRAPPORTERI Kozhevnikov V. A., 1965, SPEECH ARTICULATION Lehiste I., 1977, J PHONETICS, V5, P253 Lerdahl F., 1983, GENERATIVE THEORY TO LIE H, 1967, NORSK VERSLAERE LOOTS ME, 1980, METRICAL MYTHS EXPT MARCUS SM, 1981, PERCEPT PSYCHOPHYS, V30, P247, DOI 10.3758/BF03214280 NEWTON RP, 1975, LANG STYLE, P127 Ohala J. J., 1975, AUDITORY ANAL PERCEP, P431 Pike K. 
L., 1945, INTONATION AM ENGLIS RAPP K, 1971, STLQPRS11971 SPEECH, P14 SJOBERG B, 1973, FRIDAS VISOR Strangert E., 1985, SWEDISH SPEECH RHYTH SUNDBERG J, 1983, STUDIES MUSIC PERFOR VANDERSLICE R, 1968, WORKING PAPERS PHONE, V8 WENK BJ, 1982, J PHONETICS, V10, P193 NR 29 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1990 VL 9 IS 5-6 BP 477 EP 483 DI 10.1016/0167-6393(90)90023-3 PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EX942 UT WOS:A1990EX94200009 ER PT J AU CUTLER, A BUTTERFIELD, S AF CUTLER, A BUTTERFIELD, S TI DURATIONAL CUES TO WORD BOUNDARIES IN CLEAR SPEECH SO SPEECH COMMUNICATION LA English DT Article DE SPEECH PRODUCTION; INTELLIGIBILITY; CLEAR SPEECH; WORD BOUNDARIES; SEGMENTATION; DURATION; LENGTHENING; PAUSING RP CUTLER, A (reprint author), MRC, APPL PSYCHOL UNIT, 15 CHAUCER RD, CAMBRIDGE CB2 2EF, ENGLAND. RI Cutler, Anne/C-9467-2012 CR BUTTERFIELD S, 1988, P SPEECH 88 7 S FED, V3, P827 Cutler A., 1987, Computer Speech and Language, V2, DOI 10.1016/0885-2308(87)90004-0 Chen F., 1983, WORKING PAPERS, V2, P1 Clark J. E., 1988, LANGUAGE TOPICS ESSA, P161 Cooper W. E., 1980, SYNTAX SPEECH COOPER WE, 1978, COGNITIVE PSYCHOL, V10, P154, DOI 10.1016/0010-0285(78)90012-9 CUTLER A, 1988, J EXP PSYCHOL HUMAN, V14, P113, DOI 10.1037/0096-1523.14.1.113 Francis WN, 1982, FREQUENCY ANAL ENGLI Grosjean Francois, 1980, TEMPORAL VARIABLES S, P91 PICHENY MA, 1986, J SPEECH HEAR RES, V29, P434 Valian V. V., 1976, COGNITION, V4, P115 NR 11 TC 38 Z9 39 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1990 VL 9 IS 5-6 BP 485 EP 495 DI 10.1016/0167-6393(90)90024-4 PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EX942 UT WOS:A1990EX94200010 ER PT J AU LEE, KF HON, HW HWANG, MY HUANG, XD AF LEE, KF HON, HW HWANG, MY HUANG, XD TI SPEECH RECOGNITION USING HIDDEN MARKOV-MODELS - A CMU PERSPECTIVE SO SPEECH COMMUNICATION LA English DT Article DE SPEECH RECOGNITION; HIDDEN MARKOV MODELS; SPHINX SYSTEM RP LEE, KF (reprint author), CARNEGIE MELLON UNIV, SCH COMP SCI, PITTSBURGH, PA 15213 USA. CR AVERBUCH A, 1986, P 1986 IEEE INT C AC, P53 Bahl L. R., 1989, P ICASSP 89 GLASG SC, P465 BAHL LR, 1983, IEEE T PATTERN ANAL, V5, P179 BAHL LR, 1988, P IEEE ICASSP88 NEW BAKER JK, 1975, IEEE T ACOUST SPEECH, VAS23, P24, DOI 10.1109/TASSP.1975.1162650 BAKIS R, 1976, 91ST M AC SOC AM Baum L. E., 1972, INEQUALITIES, V3, P1 BELLEGARDA J, 1989, P INT C AC SPEECH SI, P13 Breiman J. F. R. O., 1984, CLASSIFICATION REGRE Brown P., 1987, THESIS CARNEGIE MELL CHOW YL, 1986, P IEEE ICASSP36 TOKY, P593 DODDINGTON G, 1989, P IEEE INT C AC SPEE, P556 FRANZINI M, 1989, P ICASSP 89 GLASG, P425 FURUI S, 1986, IEEE T ACOUST SPEECH, V34, P52, DOI 10.1109/TASSP.1986.1164788 GUPTA VN, 1987, APR P IEEE INT C AC, P697 HON HW, 1990, P IEEE ICASSP90 ALBU, P725 HUANG XD, 1988, IEEE WORKSHOP SPEECH Huang X.D., 1990, HIDDEN MARKOV MODELS Huang X.D., 1990, P ICASSP, P689 HUANG XD, 1990, DARPA SPEECH LANGUAG Huang X. D., 1989, Computer Speech and Language, V3, DOI 10.1016/0885-2308(89)90020-X HWANG MY, 1989, P EUR, P5 JELINEK F, 1976, P IEEE, V64, P532, DOI 10.1109/PROC.1976.10159 Jelinek F., 1980, Pattern Recognition in Practice. Proceedings of an International Workshop LEE KF, 1990, IEEE T ACOUST SPEECH, V38, P599, DOI 10.1109/29.52701 LEE KF, 1990, P ICASSP 90, P749 Lee K.-F., 1989, AUTOMATIC SPEECH REC Lee K.-F., 1988, THESIS CARNEGIE MELL LEE KF, 1989, P EUROSPEECH 89 PARI, P490 Lowerre B. 
T., 1976, THESIS CARNEGIE MELL Paul D.B., 1989, P IEEE INT C ACOUSTI, P449 PAUL DB, 1986, P SPEECH TECHNOLOGY PORITZ AB, 1986, P IEEE INT C AC SPEE, P705 RABINER LR, 1985, AT&T TECH J, V64, P1211 RABINER LR, 1988, P IEEE ICASSP88 NEW SAGAYAMA S, 1989, P IEEE ICASSP89 GLAS, P377 Schwartz R., 1985, P ICASSP 85, P1205 VITERBI AJ, 1967, IEEE T INFORM THEORY, V13, P260, DOI 10.1109/TIT.1967.1054010 NR 38 TC 5 Z9 5 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1990 VL 9 IS 5-6 BP 497 EP 508 DI 10.1016/0167-6393(90)90025-5 PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EX942 UT WOS:A1990EX94200011 ER PT J AU DALSGAARD, P BAEKGAARD, A AF DALSGAARD, P BAEKGAARD, A TI RECOGNITION OF CONTINUOUS SPEECH USING NEURAL NETS AND EXPERT SYSTEM PROCESSING SO SPEECH COMMUNICATION LA English DT Article DE ACOUSTIC-PHONETIC FEATURE TRANSFORMATION; NEURAL NETWORK SEGMENTATION; PHONEME CLASSIFICATION; EXPERT SYSTEM PROCESSING RP DALSGAARD, P (reprint author), AALBORG UNIV, CTR SPEECH TECHNOL, 7 FREDRIK BAJERS VEJ, DK-9220 AALBORG, DENMARK. CR BAEKGAARD A, 1989, P EUROSPEECH 89 PARI, P545 BAEKGAARD A, 1987, P EUROSPEECH 87 EDIN, P5 BUNDGAARD M, 1989, P EUROSPEECH 89 PARI, P98 DALSGAARD P, 1989, P EUROSPEECH 89 PARI, P541 FOURCIN AJ, 1988, SPEECH INPUT OUTPUT JUANG BH, 1987, IEEE T ACOUST SPEECH, V35, P947 KOHONEN T, 1988, P 1988 IEEE INT C AC, P607 LARSEN LB, 1989, P EUROSPEECH 89 PARI, P232 LESSER VR, 1975, IEEE T ACOUST SPEECH, VAS23, P11, DOI 10.1109/TASSP.1975.1162648 Rabiner L. R., 1986, Computer Speech and Language, V1, DOI 10.1016/S0885-2308(86)80021-3 RABINER LR, 1988, P ICASSP, P119 NR 11 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1990 VL 9 IS 5-6 BP 509 EP 520 DI 10.1016/0167-6393(90)90026-6 PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EX942 UT WOS:A1990EX94200012 ER PT J AU BILLI, R BUTTAFAVA, P CERICOLA, D DIGIAMPIETRO, W MASSIA, G MOLLO, MJ TAFINI, F VARESE, G VITTORELLI, V AF BILLI, R BUTTAFAVA, P CERICOLA, D DIGIAMPIETRO, W MASSIA, G MOLLO, MJ TAFINI, F VARESE, G VITTORELLI, V TI A PC-BASED VERY LARGE VOCABULARY ISOLATED WORD SPEECH RECOGNITION SYSTEM SO SPEECH COMMUNICATION LA English DT Article DE SPEECH RECOGNITION; MACHINE DICTATION; ISOLATED WORD; ITALIAN RP BILLI, R (reprint author), ING C OLIVETTI & C SPA, DOR SPEECH & LANGUAGE LAB, CSO SVIZZERA 185, I-10149 TURIN, ITALY. CR BILLI R, 1986, P IEEE ICASSP86 TOKY, P65 BILLI R, 1989, P EUROSPEECH 89 PARI, P157 BURR DJ, 1984, IEEE T ACOUST SPEECH, V32, P119, DOI 10.1109/TASSP.1984.1164276 BUTTAFAVA P, 1989, P EUROSPEECH 89 PARI, P90 CERICOLA D, 1989, P EUROSPEECH 89 PARI, P386 LONGO G, 1980, TEORIA INFORMAZIONE SUGIYAMA M, 1982, T IECEJ A, V65, P965 NR 7 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1990 VL 9 IS 5-6 BP 521 EP 530 DI 10.1016/0167-6393(90)90027-7 PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EX942 UT WOS:A1990EX94200013 ER PT J AU FERRETTI, M MALTESE, G SCARCI, S AF FERRETTI, M MALTESE, G SCARCI, S TI MEASURING INFORMATION PROVIDED BY LANGUAGE MODEL AND ACOUSTIC MODEL IN PROBABILISTIC SPEECH RECOGNITION - THEORY AND EXPERIMENTAL RESULTS SO SPEECH COMMUNICATION LA English DT Article DE SPEECH RECOGNITION; INFORMATION THEORY; LANGUAGE MODEL; ACOUSTIC MODEL RP FERRETTI, M (reprint author), IBM CORP, ITALY ROME SCI CTR, VIA GIORGIONE 159, I-00147 ROME, ITALY. CR ALTO P, 1989, 1989 IEEE COMP EUR 8 BAHL LR, 1988, P 1988 INT C AC SPEE, P497 Bahl L. R., 1989, Eurospeech 89. 
European Conference on Speech Communication and Technology BRANDETTI M, 1988, P EUSIPCO88 GRENOBLE, P147 BROWN PF, 1987, THESIS CARNEGIEMELLO CODOGNO M, 1987, P EUROPEAN C SPEECH, V1, P159 DORTA P, 1988, IBM J RES DEV, V32, P217 Ferretti M., 1989, Eurospeech 89. European Conference on Speech Communication and Technology FERRETTI M, 1989, P IEEE ICASSP 89 GLA, P707 Jelinek F., 1980, Pattern Recognition in Practice. Proceedings of an International Workshop JELINEK F, 1977, 94TH M AC SOC AM MIA JELINEK F, 1986, 1986 IBM EUR I ADV S JELINEK F, 1985, P IEEE, V73, P1616, DOI 10.1109/PROC.1985.13343 KATZ S, 1987, MAR IEEE T AC SPEECH, V34, P400 NR 14 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1990 VL 9 IS 5-6 BP 531 EP 539 DI 10.1016/0167-6393(90)90028-8 PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EX942 UT WOS:A1990EX94200014 ER PT J AU KOKKONEN, M TORKKOLA, K AF KOKKONEN, M TORKKOLA, K TI USING SELF-ORGANIZING MAPS AND MULTILAYERED FEEDFORWARD NETS TO OBTAIN PHONEMIC TRANSCRIPTIONS OF SPOKEN UTTERANCES SO SPEECH COMMUNICATION LA English DT Article DE PHONEMIC SPEECH RECOGNITION; TRANSCRIBING SPEECH INTO PHONEMES; NEURAL NETS IN SPEECH RECOGNITION; BACK-PROPAGATION NETS; SELF-ORGANIZING FEATURE MAPS RP KOKKONEN, M (reprint author), HELSINKI UNIV TECHNOL, INFORMAT & COMP SCI LAB, TKK-F, RAKENTAJANAUKIO 2C, SF-02150 ESPOO, FINLAND. CR BAHL LR, 1983, IEEE T PATTERN ANAL, V5, P179 BOURLARD H, 1989, P ICASSP, P33 IWAMIDA H, 1990, 1990 P IEE ICASSP90 KANGAS J, 1989, P EUROSPEECH 89 PARI KEPUSKA V, 1989, 1989 P IEEE ICASSP 8, P504 KOHONEN T., 1989, SELF ORG ASS MEMORY KOHONEN T, 1988, 1988 IEEE INT C NEUR, V1, P61 KOHONEN T, 1988, IEEE COMPUTER MAR KOHONEN T, 1987, 1987 P EUR C SPEECH, V2, P377 Kohonen T., 1984, Seventh International Conference on Pattern Recognition (Cat. No. 
84CH2046-1) Lee K.-F., 1989, AUTOMATIC SPEECH REC Lippmann R. P., 1989, Neural Computation, V1, DOI 10.1162/neco.1989.1.1.1 MCDERMOTT E, 1989, 1989 P IEE ICASSP89, P81 OKUDA T, 1976, IEEE T COMPUT, V25, P172 Prager R. W., 1989, Computer Speech and Language, V3, DOI 10.1016/0885-2308(89)90015-6 Rumelhart D. E., 1986, PARALLEL DISTRIBUTED, V1 TORKKOLA K, 1988, 1988 P IEEE ICASSP88, P611 WAIBEL A, 1989, IEEE T ACOUSTICS SPE, V37, P382 NR 18 TC 7 Z9 7 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1990 VL 9 IS 5-6 BP 541 EP 549 DI 10.1016/0167-6393(90)90029-9 PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EX942 UT WOS:A1990EX94200015 ER PT J AU YOUNG, SR AF YOUNG, SR TI USE OF DIALOG, PRAGMATICS AND SEMANTICS TO ENHANCE SPEECH RECOGNITION SO SPEECH COMMUNICATION LA English DT Article DE SPEECH RECOGNITION; DIALOG; PRAGMATICS; SEMANTICS; GRAMMARS PERPLEXITY; PROBLEM-SOLVING BEHAVIOR; MINDS SYSTEM RP YOUNG, SR (reprint author), CARNEGIE MELLON UNIV, SCH COMP SCI, SHENLEY PK, PITTSBURGH, PA 15213 USA. CR ALLEN JF, 1980, ARTIF INTELL, V15, P143, DOI 10.1016/0004-3702(80)90042-9 BROWN JS, 1975, REPRESENTATION UNDER, P311 CHARNIAK E, 1988, ARTIF INTELL, V34, P275, DOI 10.1016/0004-3702(88)90063-X Cohen P. R., 1979, COGNITIVE SCI, V3, P177, DOI DOI 10.1207/S15516709COG0303_1 FIKES RE, 1971, ARTIF INTELL, V2, P189, DOI 10.1016/0004-3702(71)90010-5 Fink P. K., 1986, Computational Linguistics, V12 HAYES PJ, 1983, CMUCS83158 CARNM U D Hendrix G.G., 1975, P 4 IJCAI, P115 LEE KF, 1990, IEEE T ACOUST SPEECH, V38 Minsky M., 1975, PSYCHOL COMPUTER VIS PERRAULT CR, 1978, 2ND P C THEOR ISS NA Sacerdoti Earl D., 1977, STRUCTURE PLANS BEHA SACERDOT.ED, 1974, ARTIF INTELL, V5, P115, DOI 10.1016/0004-3702(74)90026-5 Schank R. 
C., 1977, SCRIPTS GOALS PLANS Sussman G.J., 1975, COMPUTER MODEL SKILL WALKER DE, 1980, TRENDS SPEECH RECOGN Wilensky R., 1983, PLANNING UNDERSTANDI Wilensky R., 1978, UNDERSTANDING GOAL B YOUNG SR, 1989, COMMUN ACM, V32, P183, DOI 10.1145/63342.63344 YOUNG SR, 1989, P DARPA SPEECH NATUR, P131, DOI 10.3115/100964.100974 NR 20 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1990 VL 9 IS 5-6 BP 551 EP 564 DI 10.1016/0167-6393(90)90030-D PG 14 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EX942 UT WOS:A1990EX94200016 ER PT J AU NIEDERMAIR, GT STREIT, M TROPF, H AF NIEDERMAIR, GT STREIT, M TROPF, H TI LINGUISTIC PROCESSING RELATED TO SPEECH UNDERSTANDING IN SPICOS-II SO SPEECH COMMUNICATION LA English DT Article DE NATURAL LANGUAGE INTERFACE; SPEECH UNDERSTANDING; LINGUISTIC PROCESSING; GRAMMAR; SYNTACTIC CONSTRAINTS; SEMANTIC CONSTRAINTS; FORMAL SEMANTIC REPRESENTATION; DISCOURSE REPRESENTATION; DIALOG HANDLING RP NIEDERMAIR, GT (reprint author), SIEMENS AG, ZFE IS KOM 31, OTTO HAHN RING 6, W-8000 MUNICH 83, GERMANY. CR BLOCK HU, 1986, P COLING 86 BONN BRONNENBERG WJ, 1980, NATURAL COMMUNICATIO Bunt H. 
C., 1985, MASS TERMS MODEL THE BUNT HC, 1984, IPO19 ANN PROGR REP CHAFFIN R, 1984, MEMORY COGNITION, V12 DEVET J, 1988, IPO24 REP DEVET J, 1989, IPO703 REP FREDERKING RE, 1988, 4 P OST ART INT TAG GEHRKE M, 1989, 5 P OST ART INT TAG HORTON D, 1988, 12TH P INT C COMP LI, V1 KARTTUNEN L, 1981, FORMAL METHODS STUDY, V2 Lowerre B.T., 1980, TRENDS SPEECH RECOGN MERGEL D, 1987, P IEEE ICASSP87 DALL, V2 MONTAGUE R, 1974, APPROACHES NATURAL N NIEDERMAIR GT, 1986, P COLING 86 BONN NIEDERMAIR GT, 1987, P EUROSPEECH 87 EDIN, V1 PAESELER A, 1989, P IEEE ICASSP89 GLAS, V2 POESIO M, 1987, P IJCAI 87, V2 SAGERER G, 1988, RECENT ADV SPEECH UN SCHMANDT C, 1986, 1986 P AVIOS C STEINBISS V, 1989, P EUROSPEECH 89 PARI, V2 STREIT M, 1989, P GWAI 89 STREIT M, 1989, P EUROSPEECH 89 PARI, V2 THURMAIR G, 1986, KLEINHEUBACHER BERIC THURMAIR G, 1988, RECENT ADV SPEECH UN WINOGRAD T, 1983, LANGUAGE COGNITIVE P, V41 Winston M.E., 1987, COGNITIVE SCI, V11 XHIGENAG M, 1986, T IEEE JAPAN E, V69 YOUNG SJ, 1988, P IEEE ICASSP 88 DAL NR 29 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1990 VL 9 IS 5-6 BP 565 EP 585 DI 10.1016/0167-6393(90)90031-4 PG 21 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EX942 UT WOS:A1990EX94200017 ER PT J AU MARIANI, J TUBACH, JP AF MARIANI, J TUBACH, JP TI EUROSPEECH '89 SO SPEECH COMMUNICATION LA English DT Editorial Material C1 ECOLE NATL SUPER TECH, PARIS, FRANCE. RP MARIANI, J (reprint author), LAB INFORMAT MEAN & SCI INGN, CNRS, ORSAY, FRANCE. NR 0 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1990 VL 9 IS 5-6 BP R5 EP R5 DI 10.1016/0167-6393(90)90015-2 PG 1 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EX942 UT WOS:A1990EX94200001 ER PT J AU VANBEZOOIJEN, R POLS, LCW AF VANBEZOOIJEN, R POLS, LCW TI EVALUATING TEXT-TO-SPEECH SYSTEMS - SOME METHODOLOGICAL ASPECTS SO SPEECH COMMUNICATION LA English DT Article RP VANBEZOOIJEN, R (reprint author), UNIV AMSTERDAM, INST PHONET SCI, HERENGRACHT 338, 1016 CG AMSTERDAM, NETHERLANDS. CR AKERS G, 1985, J ACOUST SOC AM, V77, P2157, DOI 10.1121/1.391739 BAART JLG, 1908, 7TH P FASE S ED, P959 BARBER R, 1989, P EUR C SPEECH COMM, V2, P517 CARLSON R, 1989, P ESCA WORKSHOP SPEE CARTIER M, 1989, P ESCA WORKSHOP SPEE EGAN JP, 1948, LARYNGOSCOPE, V58, P955, DOI 10.1288/00005537-194809000-00002 GRICE M, 1989, P ESCA WORKSHOP SPEE HJELMQUIST E, 1989, READING DAILY NEWSPA HOUSE AS, 1965, J ACOUST SOC AM, V37, P158, DOI 10.1121/1.1909295 Isard S. D., 1988, 7th FASE Symposium. Proceedings Speech '88 JEKOSCH U, 1989, P ESCA WORKSHOP SPEE LOGAN JS, 1985, 11 IND U REV SPEECH, P3 MANOUS LM, 1985, 11 IND U RES SPEECH, P33 MONAGHAN AIC, 1989, P ESCA WORKSHOP SPEE NYE P, 1974, SR38 HASK LAB, P169 PAVLOVIC CV, 1989, P ESCA WORKSHOP SPEE PISONI DB, 1985, P IEEE, V73, P1665, DOI 10.1109/PROC.1985.13346 POLS LCW, 1987, P EUR C SPEECH TECHN, V1, P179 SPIEGEL M, 1989, P ESCA WORKSHOP SPEE TERKEN JMB, 1989, P EUROSPEECH, V1, P357 van Bezooijen R, 1986, LINGUISTICS NETHERLA, P1 VANBEZOOIJEN R, 1987, LINGUISTICS NETHERLA, V11, P47 VANBEZOOIJEN R, 1987, SPINASSP1 UTR STICHT VANBEZOOIJEN R, 1989, P EUR C SPEECH COMM, V1, P218 VANBEZOOIJEN R, 1989, P ESCA WORKSHOP SPEE VANBEZOOIJEN R, 1988, SPINASSP5 UTR STICHT VANBEZOOIJEN R, 1988, SPINASSP3 UTR STRICH VANGERWEN RPM, 1989, P ESCA WORKSHOP SPEE van Son N., 1988, 7th FASE Symposium. 
Proceedings Speech '88 WILLEMS N, 1988, J ACOUST SOC AM, V84, P1250, DOI 10.1121/1.396625 WILLEMSE R, 1987, PERFORMANCE ASSESSME NR 31 TC 12 Z9 12 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD AUG PY 1990 VL 9 IS 4 BP 263 EP 270 DI 10.1016/0167-6393(90)90002-Q PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EH627 UT WOS:A1990EH62700002 ER PT J AU CARLSON, R GRANSTROM, B NORD, L AF CARLSON, R GRANSTROM, B NORD, L TI EVALUATION AND DEVELOPMENT OF THE KTH TEXT-TO-SPEECH SYSTEM ON THE SEGMENTAL LEVEL SO SPEECH COMMUNICATION LA English DT Article RP CARLSON, R (reprint author), KUNGLIGA TEKN HOGSKOLAN, DEPT SPEECH COMMUN & MUS ACOUST, BOX 70014, S-10044 STOCKHOLM, SWEDEN. CR BARBER S, 1989, SEP P EUR 89 EUR C S, V2, P517 CARLSON, 1989, 1989 P ESCA WORKSH S CARLSON R, 1982, P ICASSP 82, V3, P1604 CARLSON R, 1989, 1989 P EUR 89 EUR C, V1, P458 CARLSON R, 1975, SPEECH COMMUNICATION Fourcin A., 1989, SPEECH INPUT OUTPUT LOGAN JS, 1989, J ACOUST SOC AM, V86, P566, DOI 10.1121/1.398236 LOGAN JS, 1985, 11 IND U REV SPEECH, P3 MALSHEEN BJ, 1987, 11TH P ICPHS TALL NORD L, 1988, P SPEECH 88 7TH FASE, P39 PISONI DB, 1985, P IEEE, V73, P1665, DOI 10.1109/PROC.1985.13346 VANBEZOOIJEN R, 1987, P EUROPEAN C SPEECH, P183 NR 12 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD AUG PY 1990 VL 9 IS 4 BP 271 EP 277 DI 10.1016/0167-6393(90)90003-R PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EH627 UT WOS:A1990EH62700003 ER PT J AU SPIEGEL, MF ALTOM, MJ MACCHI, MJ WALLACE, KL AF SPIEGEL, MF ALTOM, MJ MACCHI, MJ WALLACE, KL TI COMPREHENSIVE ASSESSMENT OF THE TELEPHONE INTELLIGIBILITY OF SYNTHESIZED AND NATURAL SPEECH SO SPEECH COMMUNICATION LA English DT Article C1 NYU, DEPT LINGUIST, NEW YORK, NY 10003 USA. RP SPIEGEL, MF (reprint author), BELL COMMUN RES INC, MORRISTOWN, NJ 07960 USA. CR GREENE BG, 1982, J ACOUST SOC AM, V73, pS28 HOUSE AS, 1965, J ACOUST SOC AM, V37, P158, DOI 10.1121/1.1909295 LOGAN JS, 1989, J ACOUST SOC AM, V86, P566, DOI 10.1121/1.398236 lovins J. B., 1979, SPEECH COMM 97 M AC, P519 LUCE P A, 1982, Journal of the Acoustical Society of America, V71, pS96, DOI 10.1121/1.2019658 MILLER GA, 1955, J ACOUST SOC AM, V27, P338, DOI 10.1121/1.1907526 MOSER H, 1969, ONE SYLLABE WORDS NUSBAUM HC, 1984, P AM VOICE INP OUT S NUSBAUM HC, 1985, BEHAV RES METH INS C, V17, P235 Pisoni D. B., 1981, J ACOUST SOC AM, V70, pS98, DOI 10.1121/1.2019150 PISONI D B, 1982, Journal of the Acoustical Society of America, V71, pS94, DOI 10.1121/1.2019648 SPIEGEL MF, 1988, 1988 P AM VOIC INP O VOIERS WD, 1977, SPEECH INTELLIGIBILI, V11 WRIGHT JT, 1986, 1986 P AM VOICE IO S, P235 NR 14 TC 9 Z9 9 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD AUG PY 1990 VL 9 IS 4 BP 279 EP 291 DI 10.1016/0167-6393(90)90004-S PG 13 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EH627 UT WOS:A1990EH62700004 ER PT J AU BENOIT, C AF BENOIT, C TI AN INTELLIGIBILITY TEST USING SEMANTICALLY UNPREDICTABLE SENTENCES - TOWARDS THE QUANTIFICATION OF LINGUISTIC COMPLEXITY SO SPEECH COMMUNICATION LA English DT Article RP BENOIT, C (reprint author), UNIV STENDHAL, ECOLE NATL SUPER ELECTR & RADIOELECT GRENOBLE, INST NATL POLYTECH GRENOBLE, F-38040 GRENOBLE, FRANCE. CR BAILLY G, 1986, P ICASSP 86 TOKYO, P2419 BENOIT C, 1989, P EUROSPEECH 89 C PA, P633 DAKKAK OA, 1987, RECENT ADV SPEECH UD GRICE M, 1989, P ESCA WORKSP SPEECH Hamon C., 1989, P INT C AC SPEECH SI, P238 HAZAN V, 1989, P ESCA WORKSH SPEECH LIEBERMAN P, 1963, LANG SPEECH, V6, P172 MILLER GA, 1951, J EXP PSYCHOL, V41, P329, DOI 10.1037/h0062491 MILLER GA, 1963, J VERB LEARN VERB BE, V2, P217, DOI 10.1016/S0022-5371(63)80087-0 MILLER GA, 1962, IRE T INFORM THEOR, V8, P81, DOI 10.1109/TIT.1962.1057697 NYE PW, 1974, HASKINS LAB STAT REP, V37, P169 SORIN C, 1987, 11TH P ICPHS TALL, V1, P125 NR 12 TC 10 Z9 10 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD AUG PY 1990 VL 9 IS 4 BP 293 EP 304 DI 10.1016/0167-6393(90)90005-T PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EH627 UT WOS:A1990EH62700005 ER PT J AU MONAGHAN, AIC LADD, DR AF MONAGHAN, AIC LADD, DR TI SYMBOLIC OUTPUT AS THE BASIS FOR EVALUATING INTONATION IN TEXT-TO-SPEECH SYSTEMS SO SPEECH COMMUNICATION LA English DT Article RP MONAGHAN, AIC (reprint author), UNIV EDINBURGH, DEPT LINGUIST, CTR SPEECH TECHNOL RES, EDINBURGH EH8 9YL, MIDLOTHIAN, SCOTLAND. 
CR AINSWORTH W, 1988, 7TH P FAS S ED ALLEN S, 1987, TEXT SPEECH MITALK S BAART J, 1987, THESIS RIJKSUNIVERSI BARBER S, 1989, EUROPEAN C SPEECH CO, V2, P517 BENOIT C, 1989, 1989 P ESCA WORKSH S de Pijper Jan Roelof, 1983, MODELLING BRIT ENGLI FIRZPATRIC E, 1989, 1989 P ANN AI SYST G, P188 KOHLER K, 1988, 1988 P 7TH FASE SYMP, P1241 LADD DR, 1988, J ACOUST SOC AM, V84, P530, DOI 10.1121/1.396830 LADD DR, IN PRESS PAPERS LABA, V1 LADD DR, 1987, EUROPEAN C SPEECH TE, P21 LAVER J, 1987, EUROPEAN C SPEECH TE, V1 MCALLISTER J, 1986, EDINBURGH U DEPT LIN, V19, P36 Monaghan A. I. C., 1990, Computer Speech and Language, V4, DOI 10.1016/0885-2308(90)90024-Z MONAGHAN AIC, 1988, 1988 P 7TH FASE SYMP, P1249 OLASZY G, 1989, EUROPEAN C SPEECH CO, V2, P525 OLIVE JP, 1975, J ACOUST SOC AM, V57, P476, DOI 10.1121/1.380436 OSHAUGHNESSY D, 1983, J ACOUST SOC AM, V74, P1155, DOI 10.1121/1.390039 PIERREHUMBERT J, 1981, J ACOUST SOC AM, V70, P985, DOI 10.1121/1.387033 Pierrehumbert J, 1980, THESIS MIT PISONI D, 1987, TEXT SPEECH MITALK S, pCH13 POLS LCW, 1990, JAN P VERB 90 ROM, P295 QUAZZA S, 1987, EUROPEAN C SPEECH TE, P389 QUENE H, 1989, EUROPEAN C SPEECH CO, V1, P214 Shi B., 1989, Eurospeech 89. European Conference on Speech Communication and Technology Silverman K. E. A., 1987, THESIS U CAMBRIDGE TERKEN J, 1989, 2ND C LAB PHON ED TERKEN J, 1988, J PHONETICS, V16, P453 TUBACH JP, 1989, EUROPEAN C SPEECH CO, V1 VANBEZOOIJEN R, 1989, EUROPEAN C SPEECH CO, V1, P218 VANBEZOOIJEN R, 1989, 9 I PHON SCI AN SYNT WILLIAMS BJ, 1986, IBM UKSC154 REP YIOURGALIS N, 1990, 1990 P VERB 90 ROM, P409 YOUNG SJ, 1980, INT J MAN MACH STUD, V12, P241, DOI 10.1016/S0020-7373(80)80027-7 NR 34 TC 6 Z9 6 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD AUG PY 1990 VL 9 IS 4 BP 305 EP 314 DI 10.1016/0167-6393(90)90006-U PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EH627 UT WOS:A1990EH62700006 ER PT J AU PECKHAM, J THOMAS, T FRANGOULIS, E AF PECKHAM, J THOMAS, T FRANGOULIS, E TI RECOGNIZER SENSITIVITY ANALYSIS - A METHOD FOR ASSESSING THE PERFORMANCE OF SPEECH RECOGNIZERS SO SPEECH COMMUNICATION LA English DT Article RP PECKHAM, J (reprint author), LOGICA CAMBRIDGE LTD, BETJEMAN HOUSE, 104 HILLS RD, CAMBRIDGE CB2 1LQ, ENGLAND. CR FOURCIN AJ, 1989, SPEECH FEATURE DESCR Gillick L., 1989, P ICASSP, P532 HUNT MJ, 1988, P ICASSP, P457 KNIGHT JA, 1984, GENERIC MODEL ASSESS Lea W. A., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing MARCUS JN, 1989, P EUROSPEECH, V2, P465 MOODY T, 1987, P SPEECH TECH, V87, P275 PALLET DS, 1989, P ICASSP89, P526 PECKHAM JB, 1985, P SPEECH TECH, V85, P165 Peckham J., 1986, Speech Technology, V3 PISONI DB, 1986, COMMUNICATION Simpson C. A., 1987, Speech Technology, V3 STEENEKEN HJM, 1989, P ICASSP, P540 THOMAS TJ, 1989, P ICASSP89, P544 WINSKI R, 1989, INVESTIGATION SUITAB NR 15 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD AUG PY 1990 VL 9 IS 4 BP 317 EP 327 DI 10.1016/0167-6393(90)90007-V PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EH627 UT WOS:A1990EH62700007 ER PT J AU HUNT, MJ AF HUNT, MJ TI FIGURES OF MERIT FOR ASSESSING CONNECTED-WORD RECOGNIZERS SO SPEECH COMMUNICATION LA English DT Article RP HUNT, MJ (reprint author), MARCONI SPEECH & INFORMAT SYST, AIRSPEED RD, AIRPORT, PORTSMOUTH PO3 5RE, ENGLAND. 
CR Gillick L., 1989, 1989 IEEE INT C AC S, V1, P532 HUNT MJ, 1985, CANADIAN ACOUSTICS HUNT MJ, 1988, 1988 P ICASSP88 NEW, V1, P457 PALLETT D, 1987, MAR P DARPA SPEECH R, P75 WOODARD JP, 1982, 1982 P NBS WORKSH ST, P37 NR 5 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD AUG PY 1990 VL 9 IS 4 BP 329 EP 336 DI 10.1016/0167-6393(90)90008-W PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EH627 UT WOS:A1990EH62700008 ER PT J AU VANDEVEGTE, JME TAYLOR, MM AF VANDEVEGTE, JME TAYLOR, MM TI TESTING THE EFFECTIVE VOCABULARY CAPACITY METHOD OF EVALUATING SPEECH RECOGNIZERS SO SPEECH COMMUNICATION LA English DT Article RP VANDEVEGTE, JME (reprint author), DEF & CIVIL INST ENVIRONM MED, POB 2000, 1133 SHEPPARD AVE W, DOWNSVIEW M3M 3B9, ONTARIO, CANADA. CR *CAN BUR MAN CONS, 1987, 35806 PROJ GOLD EM, 1973, PSYCHOMETRIKA, V38, P555, DOI 10.1007/BF02291494 MOORE RK, 1977, IEEE T ACOUST SPEECH, V25, P178, DOI 10.1109/TASSP.1977.1162916 SCHIFFMAN H, 1968, PHYSIOL BEHAV, V3, P197, DOI 10.1016/0031-9384(68)90054-1 SCHONEMA.PH, 1970, PSYCHOMETRIKA, V35, P349, DOI 10.1007/BF02310794 TAYLOR MM, 1985, OCT CAN AC ASS M OTT TAYLOR MM, 1986, J AM VOICE I O SOC, V3, P39 VIGNAULTRASIULI.L, 1985, OCT CAN AC ASS M OTT NR 8 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD AUG PY 1990 VL 9 IS 4 BP 337 EP 347 DI 10.1016/0167-6393(90)90009-X PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EH627 UT WOS:A1990EH62700009 ER PT J AU ZUE, V SENEFF, S GLASS, J AF ZUE, V SENEFF, S GLASS, J TI SPEECH DATABASE DEVELOPMENT AT MIT - TIMIT AND BEYOND SO SPEECH COMMUNICATION LA English DT Article RP ZUE, V (reprint author), MIT, COMP SCI LAB, SPOKEN LANGUAGE SYST GRP, CAMBRIDGE, MA 02139 USA. 
CR CARRE R, 1984, P ICASSP84 FISHER W, 1986, DARPA SPEECH RECOGNI HULTZEN IS, 1964, TABLE TRANSITIONAL F KASSEL RH, 1986, THESIS MIT Kucera H., 1967, COMPUTATIONAL ANAL P Kuwabara H, 1989, P ICASSP89, P560 Lamel L. F., 1986, P DARPA SPEECH REC W, P100 LEE KF, 1989, IEEE T ACOUST SPEECH, V37, P1641, DOI 10.1109/29.46546 LEUNG HC, 1984, P ICASSP84 LEUNG HC, 1985, THESIS MIT PALLET DS, 1987, PUBLIC DOMAIN SPEECH PALLETT D, 1989, P ICASSP, P536 Seneff S., 1989, P IEEE ICASSP GLASG, P711 ZUE V, 1990, UNPUB ICASSP90 ZUE V, 1989, P IEEE INT C ACOUSTI, P389 ZUE VW, 1988, 2NDP M ADV MAN MACH NR 16 TC 90 Z9 91 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD AUG PY 1990 VL 9 IS 4 BP 351 EP 356 DI 10.1016/0167-6393(90)90010-7 PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EH627 UT WOS:A1990EH62700010 ER PT J AU KUREMATSU, A TAKEDA, K SAGISAKA, Y KATAGIRI, S KUWABARA, H SHIKANO, K AF KUREMATSU, A TAKEDA, K SAGISAKA, Y KATAGIRI, S KUWABARA, H SHIKANO, K TI ATR JAPANESE SPEECH DATABASE AS A TOOL OF SPEECH RECOGNITION AND SYNTHESIS SO SPEECH COMMUNICATION LA English DT Article RP KUREMATSU, A (reprint author), ATR INTERPRETING TELEPHONY RES LABS, SEIKA, KYOTO 61902, JAPAN. 
CR Abe M., 1988, P ICASSP, P655 CARRE R, 1984, P ICASSP84 HATAZAKI K, 1989, P INT C AC SPEECH SI, P393 ISO K, 1988, MAR P ACC SOC JAP, P89 Itahashi S., 1985, Journal of the Acoustical Society of Japan, V41 KAWABATA T, 1989, P ICASSP89, P461 Kuwabara H, 1989, P ICASSP89, P560 PALLETT DS, 1987, PUBLIC DOMAIN SPEECH PERENNOU G, 1986, P ICASSP86, P325 Sagisaka Y., 1988, P INT C AC SPEECH SI, P679 TADEDA K, 1988, J ACOUST SOC JAPAN, V44, P747 TAKEDA K, 1988, MAR P ACOUST SOC JAP, P177 Tamura S., 1988, P IEEE INT C AC SPEE, P553 WAIBEL A, 1989, IEEE T ACOUST SPEECH, V37, P1888, DOI 10.1109/29.45535 WAIBEL A, 1989, IEEE T ACOUST SPEECH, V37, P328, DOI 10.1109/29.21701 NR 15 TC 49 Z9 49 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD AUG PY 1990 VL 9 IS 4 BP 357 EP 363 DI 10.1016/0167-6393(90)90011-W PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EH627 UT WOS:A1990EH62700011 ER PT J AU HEDELIN, P HUBER, D AF HEDELIN, P HUBER, D TI THE CTH SPEECH DATABASE - AN INTEGRATED MULTILEVEL APPROACH SO SPEECH COMMUNICATION LA English DT Article RP HEDELIN, P (reprint author), CHALMERS UNIV TECHNOL, DEPT INFORMAT THEORY, S-41296 GOTHENBURG, SWEDEN. CR ALLEN S, 1970, NYSVENSK FREKVENSORD Baker J. M., 1983, Proceedings of ICASSP 83. 
IEEE International Conference on Acoustics, Speech and Signal Processing Beaugrande R., 1981, INTRO TEXT LINGUISTI CARRE R, 1984, P ICASSP84 SAN DIEGO COLLINS P, 1966, P ICASSP86 TOKYO, P2779 ELERT CC, 1966, ALLMAN SVENSK FONETI ELOVITZ HS, 1976, IEEE ASSP, V24 Garlen Claes, 1988, SVENSKANS FONOLOGI GOLD B, 1969, J ACOUST SOC AM, V46, P442, DOI 10.1121/1.1911709 HEDELIN P, 1988, CTH8 CHALM U TECHN T HEDELIN P, 1988, P ICASSP88 NEW YORK, P643 HEDELIN P, 1986, CTH5 CHALM U TECHN T HUBER D, 1989, P INT WORKSHOP PARSI, P115 HUBER D, 1989, P INT C AC SPEECH SI, P600 HUBER D, 1988, THESIS GOTEBORG LUND ITAHASHI S, 1986, P ICASSP86 TOKYO, P321 Jorgensen Nils, 1976, MENINGSBYGGNADEN TAL KORSANBENGTSEN M, 1973, ACTA OTALARYNGOLOGIC, V310 KUWABARA K, 1989, P ICASSP89 GLASG, P560 MAKINO S, 1986, P ICASSP86 TOK, P2783 Markel JD, 1976, LINEAR PREDICTION SP MILLAR PC, 1967, P ICASSP88 NEW YORK, P647 NOLL AM, 1967, J ACOUST SOC AM, V41, P293, DOI 10.1121/1.1910339 PERENNOU G, 1986, P ICASSP86, P325 Price P., 1988, P IEEE INT C AC SPEE, P651 WAGNER M, 1981, P ICASSP81 ATLANTA, P1156 ZUE VW, 1989, P ESCA WORKSHOP SPEE NR 27 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD AUG PY 1990 VL 9 IS 4 BP 365 EP 374 DI 10.1016/0167-6393(90)90012-X PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EH627 UT WOS:A1990EH62700012 ER PT J AU CARLSON, R GRANSTROM, B NORD, L AF CARLSON, R GRANSTROM, B NORD, L TI THE KTH SPEECH DATABASE SO SPEECH COMMUNICATION LA English DT Article RP CARLSON, R (reprint author), ROYAL INST TECHNOL, DEPT SPEECH COMMUN & MUS ACOUST, S-10044 STOCKHOLM 70, SWEDEN. CR BLOMBERG M, 1985, SPEECH TRANSMISSION, P37 CARLSON R, 1989, P EUROSPEECH 89 PARI, P458 CARLSON R, 1985, SPEECH TRANSMISSION, P29 CARLSON R, 1986, PHONETICA, V43, P140 FANT G, 1989, SPEECH TRANSMISSION, P1 Nord L., 1988, 7th FASE Symposium. 
Proceedings Speech '88 NR 6 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD AUG PY 1990 VL 9 IS 4 BP 375 EP 380 DI 10.1016/0167-6393(90)90013-Y PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EH627 UT WOS:A1990EH62700013 ER PT J AU HENDRIKS, JPM AF HENDRIKS, JPM TI A FORMALISM FOR SPEECH DATABASE ACCESS SO SPEECH COMMUNICATION LA English DT Article RP HENDRIKS, JPM (reprint author), PTT RES NEHER LABS, POB 421, 2260 AK LEIDSCHENDAM, NETHERLANDS. CR GUYOTE MF, 1986, P ICASSP86, P313 HENDRIKS JPM, 1988, P ICASSP88, P643 Hendriks J. P. M., 1988, 7th FASE Symposium. Proceedings Speech '88 ITAHASHI S, 1986, P ICASSP86 TOKYO, P321 LEONARD RG, 1984, P ICASSP84 MOLBAEKHAMSEN P, 1987, COST209 FIN REP EUR, P236 PALLETT DS, 1986, P ICASSP86, P317 VANERP A, 1989, 1989 P EUR PAR, V2, P88 VANHEUGTEN LJP, 1989, 1989 P ESCA TUT DAY NR 9 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD AUG PY 1990 VL 9 IS 4 BP 381 EP 388 DI 10.1016/0167-6393(90)90014-Z PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EH627 UT WOS:A1990EH62700014 ER PT J AU POLS, LCW AF POLS, LCW TI SPECIAL ISSUE ON SPEECH INPUT OUTPUT ASSESSMENT AND SPEECH DATABASES SO SPEECH COMMUNICATION LA English DT Editorial Material RP POLS, LCW (reprint author), UNIV AMSTERDAM, PHOENT SCI, AMSTERDAM, NETHERLANDS. NR 0 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD AUG PY 1990 VL 9 IS 4 BP R5 EP R6 DI 10.1016/0167-6393(90)90001-P PG 2 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EH627 UT WOS:A1990EH62700001 ER PT J AU FANT, G AF FANT, G TI SPEECH RESEARCH IN PERSPECTIVE SO SPEECH COMMUNICATION LA English DT Editorial Material RP FANT, G (reprint author), ROYAL INST TECHNOL, DEPT SPEECH COMMUN & MUS ACOUST, BOX 70014, S-10044 STOCKHOLM 70, SWEDEN. NR 0 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1990 VL 9 IS 3 BP 171 EP 176 DI 10.1016/0167-6393(90)90054-D PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EB278 UT WOS:A1990EB27800001 ER PT J AU PITTAM, J GALLOIS, C CALLAN, V AF PITTAM, J GALLOIS, C CALLAN, V TI THE LONG-TERM SPECTRUM AND PERCEIVED EMOTION SO SPEECH COMMUNICATION LA English DT Article C1 UNIV QUEENSLAND, DEPT PSYCHOL, ST LUCIA, QLD 4067, AUSTRALIA. RP PITTAM, J (reprint author), UNIV QUEENSLAND, DEPT ENGLISH, ST LUCIA, QLD 4067, AUSTRALIA. RI Gallois, Cindy/F-9546-2014 OI Gallois, Cindy/0000-0002-7938-7839 CR ALPERT M, 1963, J ACOUST SOC AM, V35, P1877, DOI 10.1121/1.2142610 DARBY JK, 1977, FOLIA PHONIATR, V29, P279 FRIEDHOFF AJ, 1962, NATURE, V193, P357, DOI 10.1038/193357a0 HARGREAVES WILLIAM A., 1965, J ABNORM PSYCHOL, V70, P218, DOI 10.1037/h0022151 HARGREAVES WA, 1964, LANG SPEECH, V7, P84 HARMEGNIES B, 1985, 15EMES ACT JEP PAR, P51 HOLLIEN H, 1974, 8TH P INT C AC, V11, P269 HORII Y, 1973, J SPEECH HEAR RES, V16, P67 Kroonenberg P. 
M., 1983, 3 MODE PRINCIPAL COM KROONENBERG PM, 1985, USERS GUIDE TUCKALS3 Law H, 1984, RES METHODS MULTIMOD Mehrabian A., 1974, APPROACH ENV PSYCHOL Ostwald P.F., 1963, SOUNDMAKING ACOUSTIC PITTAM J, 1989, LONG TERM SPECTRUM A PITTAM J, 1988, 2ND P AUSTR INT C SP, P308 PITTAM J, 1987, LANG SPEECH, V30, P1 POPOV VA, 1971, ZH VYSSHEY NERVNOY D, V1, P104 ROESSLER R, 1976, J NERV MENT DIS, V163, P166, DOI 10.1097/00005053-197609000-00004 Scherer K. R., 1982, HDB METHODS NONVERBA, P136 SCHERER KR, 1986, PSYCHOL BULL, V99, P143, DOI 10.1037//0033-2909.99.2.143 Scherer K.R., 1981, SPEECH EVALUATION PS, P189 WILLIAMS CE, 1972, J ACOUST SOC AM, V52, P1238, DOI 10.1121/1.1913238 NR 22 TC 18 Z9 18 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1990 VL 9 IS 3 BP 177 EP 187 DI 10.1016/0167-6393(90)90055-E PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EB278 UT WOS:A1990EB27800002 ER PT J AU SCHOENTGEN, J AF SCHOENTGEN, J TI NONLINEAR SIGNAL REPRESENTATION AND ITS APPLICATION TO THE MODELING OF THE GLOTTAL WAVE-FORM SO SPEECH COMMUNICATION LA English DT Article RP SCHOENTGEN, J (reprint author), FREE UNIV BRUSSELS, INST PHONET, CP 110, 50 AVE FD ROOSEVELT, B-1050 BRUSSELS, BELGIUM. 
CR Abramowitz M., 1965, HDB MATH FUNCTIONS ALANSARI A, 1981, THESIS I NATIONAL PO ARFIB D, 1979, J AUDIO ENG SOC, V27, P757 BAKEN RJ, 1987, CLIN MEASUREMENT SPE, P299 Broad DJ, 1979, SPEECH LANGUAGE ADV, V2, P203 CHENG YM, 1987, LARYNGEAL FUNCTION P, P219 DELLER JR, 1983, SPEECH COMMUN, V2, P57, DOI 10.1016/0167-6393(83)90064-X FANT G, 1979, STL QPSR, V1, P85 Fant Gunnar, 1985, STL QPSR, V4, P1 Fox L., 1968, CHEBYSHEV POLYNOMIAL Franks L.E., 1969, SIGNAL THEORY LEBRUN M, 1979, J AUDIO ENG SOC, V27, P250 MONSEN RB, 1977, J ACOUST SOC AM, V62, P981, DOI 10.1121/1.381593 ROSENBER.AE, 1971, J ACOUST SOC AM, V49, P583, DOI 10.1121/1.1912389 SCHAEFER RA, 1970, J AUDIO ENG SOC, V18, P413 SELBY SM, 1967, STANDARD MATH TABLES SUEN CY, 1970, J AUDIO ENG SOC, V18, P675 TAGASUKI T, 1971, J RAD RES LAB, P209 VEENEMAN DE, 1985, IEEE T ACOUST SPEECH, V33, P369, DOI 10.1109/TASSP.1985.1164544 NR 19 TC 10 Z9 9 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1990 VL 9 IS 3 BP 189 EP 201 DI 10.1016/0167-6393(90)90056-F PG 13 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EB278 UT WOS:A1990EB27800003 ER PT J AU FUNADA, T SUZUKI, T YU, L AF FUNADA, T SUZUKI, T YU, L TI A PITCH EXTRACTION METHOD USING A BANK OF BANDPASS FILTER-PAIRS SO SPEECH COMMUNICATION LA English DT Article RP FUNADA, T (reprint author), KANAZAWA UNIV, FAC TECHNOL, KANAZAWA, ISHIKAWA 920, JAPAN. 
CR ATAL BS, 1972, J ACOUST SOC AM, V52, P1687, DOI 10.1121/1.1913303 DUIFHUIS H, 1982, J ACOUST SOC AM, V71, P1568, DOI 10.1121/1.387811 FUJISAKI H, 1987, ICASSP87, P2422 FUNADA T, 1989, J I ELECTRO INFORM A, V72, P466 Furui S., 1989, DIGITAL SPEECH PROCE HESS W, 1987, SPEECH COMMUN, V6, P55, DOI 10.1016/0167-6393(87)90069-0 Hess W., 1983, PITCH DETERMINATION Markel JD, 1976, LINEAR PREDICTION SP MARKEL JD, 1972, IEEE T ACOUST SPEECH, VAU20, P367, DOI 10.1109/TAU.1972.1162410 NOLL AM, 1967, J ACOUST SOC AM, V41, P293, DOI 10.1121/1.1910339 PICONE J, 1987, ICASSP87, P1442 RABINER LR, 1976, IEEE T ACOUST SPEECH, V24, P399, DOI 10.1109/TASSP.1976.1162846 NR 12 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1990 VL 9 IS 3 BP 203 EP 216 DI 10.1016/0167-6393(90)90057-G PG 14 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EB278 UT WOS:A1990EB27800004 ER PT J AU BOE, LJ PERRIER, P AF BOE, LJ PERRIER, P TI DISTINCTIVE REGIONS AND MODES - A NEW THEORY OF SPEECH PRODUCTION - COMMENT SO SPEECH COMMUNICATION LA English DT Letter RP BOE, LJ (reprint author), UNIV STENDHAL, ECOLE NATL SUPER ELECTR & RADIOELECT GRENOBLE, INST NATL POLYTECH GRENOBLE, F-38031 GRENOBLE, FRANCE. CR Abbs J. H., 1986, INVARIANCE VARIABILI, P202 ABRY C, 1980, DONNEES FONDAMENTALE ABRY C, 1989, J PHONETICS, V17, P47 ATAL BS, 1978, J ACOUST SOC AM, V63, P1535, DOI 10.1121/1.381848 BADIN P, 1987, 11TH P INT C PHON SC, V2, P352 BADIN P, 1990, J ACOUST SOC AM, V87, P1290, DOI 10.1121/1.398804 BARBIER P, 1978, TRAV I PHONETIQUE ST, V10, P98 BOE LJ, 1988, 17EMES JOURN ET PAR, P200 BOE LJ, 1973, REV ACOUSTIQUE, V27, P235 BONDER LJ, 1983, ACUSTICA, V52, P216 BONDER LJ, 1983, 10TH P INT C PHON B, V2, P347 BONDER LJ, 1983, ACUSTICA, V53, P201 Chalmers A. 
F., 1982, WHAT IS THIS THING C CHARPENTIER F, 1986, B I PHONETIQUE GRENO, V15, P1 CHARPENTIER F, 1984, SPEECH COMMUN, V3, P291, DOI 10.1016/0167-6393(84)90025-6 CHARPENTIER F, 1982, J ACOUST SOC AM, V72, pS49, DOI 10.1121/1.2019918 Chomsky N., 1968, SOUND PATTERN ENGLIS DELATTRE P, 1967, 6TH P INT C PHON SCI, P35 DELATTRE PC, 1962, STUD LINGUISTICA, V16, P104, DOI 10.1111/j.1467-9582.1962.tb00430.x Ehrenfest P, 1916, P AMSTERDAM ACAD, V19, P576 Fant G., 1960, ACOUSTIC THEORY SPEE FANT G, 1989, EUROPEAN C SPEECH CO, V1, P3 FANT G, 1974, SPEECH COMM SEM, V2, P121 FANT G, 1970, MANUAL PHONETICS, P173 FANT G, 1979, 9TH P INT C PHON SCI, V3, P79 FENG G, 1986, THESIS I NATIONAL PO FOLKINS JW, 1987, J ACOUST SOC AM, V82, P1919, DOI 10.1121/1.395687 FUJIMURA O, 1987, 11TH P INT C PHON SC, P10 GAY T, 1981, J ACOUST SOC AM, V69, P802, DOI 10.1121/1.385591 GENTNER DR, 1987, PSYCHOL REV, V94, P255, DOI 10.1037//0033-295X.94.2.255 GOLDSTEIN L, 1983, 10TH P INT C PHON A, V2, P267 HASSAN O, 1988, THESIS I NATIONAL PO Hempel Carl, 1966, PHILOS NATURAL SCI HUIZENGA E, 1931, ARCH NEERLANDAISES P, V4, P66 JONGMAN A, 1985, J PHONETICS, V13, P235 KELSO JAS, 1986, J PHONETICS, V14, P29 Ladefoged Peter, 1982, SPEECH COMMUN, V1, P185, DOI 10.1016/0167-6393(82)90016-4 LIBERMAN AM, 1957, J ACOUST SOC AM, V29, P117, DOI 10.1121/1.1908635 LINDBLOM BE, 1971, J ACOUST SOC AM, V50, P1166, DOI 10.1121/1.1912750 LUBKER J, 1982, J ACOUST SOC AM, V71, P437, DOI 10.1121/1.387447 LUBKER J, 1981, SPEECH MOTOR CONTROL, P205 MACNEILA.PF, 1970, PSYCHOL REV, V77, P182, DOI 10.1037/h0029070 Maddieson I., 1986, PATTERNS SOUNDS MAEDA S, 1979, J ACOUST SOC AM, V65, pS22, DOI 10.1121/1.2017158 MAEDA S, 1988, J ACOUST SOC AM, V84, pS146, DOI 10.1121/1.2025845 MAJID R, 1986, THESIS I NATIONAL PO MAJID R, 1987, 11TH P INT C PHON SC, V2, P348 MERMELST.P, 1973, J ACOUST SOC AM, V53, P1070, DOI 10.1121/1.1913427 MOL H, 1970, FUNDAMENTALS PHONETI, V2 MRAYATI M, 1976, THESIS I NATIONAL PO MRAYATI M, 1988, SPEECH 
COMMUN, V7, P257, DOI 10.1016/0167-6393(88)90073-8 NELSON WL, 1983, BIOL CYBERN, V46, P135, DOI 10.1007/BF00339982 NITTROUER S, 1988, J ACOUST SOC AM, V84, P1653, DOI 10.1121/1.397180 OHMAN SEG, 1974, SPEECH COMMUN, P133 OHMAN SEG, 1966, J ACOUST SOC AM, V39, P151 PERKELL JS, 1985, J ACOUST SOC AM, V77, P1889, DOI 10.1121/1.391940 Perkell JS, 1969, PHYSL SPEECH PRODUCT, V53 Popper K. R., 1959, LOGIC SCI DISCOVERY RIORDAN CJ, 1977, J ACOUST SOC AM, V62, P998, DOI 10.1121/1.381595 RUBIN P, 1981, J ACOUST SOC AM, V70, P321, DOI 10.1121/1.386780 SCHROEDE.MR, 1967, J ACOUST SOC AM, V41, P1002, DOI 10.1121/1.1910429 SONDHI MM, 1979, IEEE T ACOUST SPEECH, V27, P268, DOI 10.1109/TASSP.1979.1163240 Stevens KN, 1972, HUMAN COMMUNICATION, P51 STEVENS KN, 1955, J ACOUST SOC AM, V27, P484, DOI 10.1121/1.1907943 STEVENS KN, 1985, INVARIANCE VARIABILI VAISSIERE J, 1988, COURS INDICES ACOUST WOOD S, 1979, J PHONETICS, V7, P25 WOOD S, 1986, J ACOUST SOC AM, V80, P391, DOI 10.1121/1.394090 NR 68 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1990 VL 9 IS 3 BP 217 EP 230 DI 10.1016/0167-6393(90)90058-H PG 14 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EB278 UT WOS:A1990EB27800005 ER PT J AU MRAYATI, M CARRE, R GUERIN, B AF MRAYATI, M CARRE, R GUERIN, B TI DISTINCTIVE REGIONS AND MODES - ARTICULATORY-ACOUSTIC-PHONETIC ASPECTS - REPLY SO SPEECH COMMUNICATION LA English DT Letter C1 TELECOM PARIS, DEPT SIGNAL, CNRS, UA 46, F-75634 PARIS 13, FRANCE. ECOLE NATL SUPER ELECTR & RADIOELECT, INST NATL POLYTECH GRENOBLE,CNRS,UA, INST COMMUN PARLEE, F-38031 GRENOBLE, FRANCE. RP MRAYATI, M (reprint author), SCI STUDIES & RES CTR, POB 470, DAMASCUS, SYRIA. 
CR BOE LJ, 1990, SPEECH COMMUN, V9, P217, DOI 10.1016/0167-6393(90)90058-H CARRE R, 1989, P IEEE INT C ACOUSTI, P231 CARRE R, 1990, NATO ASI SERIES DELATTRE P, 1967, 6TH P INT C PHON SCI, P35 FANT G, 1975, STL QPSR, P1 HARSHMAN R, 1977, J ACOUST SOC AM, V62, P693, DOI 10.1121/1.381581 KAKITA Y, 1985, PHONETIC LINGUISTICS, P133 MAEDA S, 1979, B I PHONETIQUE GRENO, V8, P35 MRAYATI M, 1976, PHONETICA, V33, P285 MRAYATI M, 1988, SPEECH COMMUN, V7, P257, DOI 10.1016/0167-6393(88)90073-8 MRAYATI M, 1989, P EUROSPEECH 89, P172 OHMAN SEG, 1966, J ACOUST SOC AM, V39, P151 Perkell JS, 1969, PHYSL SPEECH PRODUCT, V53 Stevens KN, 1972, HUMAN COMMUNICATION, P51 WOOD S, 1979, J PHONETICS, V7, P25 WOOD S, 1986, J ACOUST SOC AM, V80, P391, DOI 10.1121/1.394090 NR 16 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1990 VL 9 IS 3 BP 231 EP 238 DI 10.1016/0167-6393(90)90059-I PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EB278 UT WOS:A1990EB27800006 ER PT J AU YANG, WJ WANG, HC AF YANG, WJ WANG, HC TI FINITE REGISTER LENGTH EFFECTS IN A HIDDEN MARKOV MODEL SPEECH RECOGNIZER SO SPEECH COMMUNICATION LA English DT Note RP YANG, WJ (reprint author), NATL TSING HUA UNIV, DEPT ELECT ENGN, HSINCHU 30043, TAIWAN. 
CR DAUTRICH BA, 1983, IEEE T ACOUST SPEECH, V31, P793, DOI 10.1109/TASSP.1983.1164172 FORNEY GD, 1973, P IEEE, V61, P268, DOI 10.1109/PROC.1973.9030 JELINEK F, 1976, P IEEE, V64, P532, DOI 10.1109/PROC.1976.10159 LEVINSON SE, 1983, AT&T TECH J, V62, P1035 LINDE Y, 1980, IEEE T COMMUN, V28, P84, DOI 10.1109/TCOM.1980.1094577 MARASA JD, 1973, IEEE T COMPUT, VC 22, P587 RABINER LR, 1978, IEEE T ACOUST SPEECH, V26, P575, DOI 10.1109/TASSP.1978.1163164 RABINER LR, 1984, BELL SYST TECH J, P627 RABINER LR, 1983, AT&T TECH J, V62, P1075 SAKOE H, 1978, IEEE T ACOUST SPEECH, V26, P43, DOI 10.1109/TASSP.1978.1163055 YANG WJ, 1988, J CHIN INST ENG, V11, P361 YOHE JM, 1973, IEEE T COMPUT, VC 22, P577 NR 12 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1990 VL 9 IS 3 BP 239 EP 245 DI 10.1016/0167-6393(90)90060-M PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EB278 UT WOS:A1990EB27800007 ER PT J AU ARANI, G AF ARANI, G TI REPORT ON EUROSPEECH 89 SO SPEECH COMMUNICATION LA English DT Editorial Material RP ARANI, G (reprint author), CTR STUDI ELECT LAB TELECOMUN, TURIN, ITALY. NR 0 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1990 VL 9 IS 3 BP 255 EP 258 PG 4 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA EB278 UT WOS:A1990EB27800008 ER PT J AU CHILDERS, DG WU, K AF CHILDERS, DG WU, K TI QUALITY OF SPEECH PRODUCED BY ANALYSIS-SYNTHESIS SO SPEECH COMMUNICATION LA English DT Article RP CHILDERS, DG (reprint author), UNIV FLORIDA, DEPT ELECT ENGN, GAINESVILLE, FL 32611 USA. 
CR AGRAWAL A, 1975, J ACOUST SOC AM, V57, P217, DOI 10.1121/1.380427 ALLEN DR, 1985, J ACOUST SOC AM, V78, P58, DOI 10.1121/1.392454 ANANTH AA, 1985, P IEEE INT C ACOUST ANANTHAPADMANABHA T.V., 1984, STL QPSR, V2, P1 ANANTHAPADMANAB.T, 1982, STL QPSR, V1, P1 ATAL BS, 1971, J ACOUST SOC AM, V50, P637, DOI 10.1121/1.1912679 ATAL BS, 1975, SPEECH RECOGNITION, P221 Atal B. S., 1979, ICASSP 79. 1979 IEEE International Conference on Acoustics, Speech and Signal Processing ATAL BS, 1982, P IEEE INT C ACOUST, V614 BARNWELL TP, 1979, J ACOUST SOC AM, V66, P1658, DOI 10.1121/1.383664 BEROUTI MG, 1977, IEEE ICASSP P, P33 CARLSON R, 1979, FRONTIERS SPEECH COM, P233 CHAN PP, 1974, NUMER MATH, V22, P403, DOI 10.1007/BF01436922 CHENG YM, 1987, LARYNGEAL FUNCTION P, P219 CHILDERS DG, 1987, IEEE T ACOUST SPEECH, P293 Childers D. G., 1985, Mathematics and computers in biomedical applications CHILDERS DG, 1989, IEEE T ACOUST SPEECH, V37, P1771, DOI 10.1109/29.46561 CHILDERS DG, 1985, CRIT REV BIOMED ENG, V12, P131 CHILDERS DG, 1984, IEEE T BIO-MED ENG, V31, P807, DOI 10.1109/TBME.1984.325242 CHILDERS DG, 1989, SPEECH COMMUN, V8, P147, DOI 10.1016/0167-6393(89)90041-1 CHILDERS DG, 1983, 10TH INT C PHON SCI, P833 CHILDERS DG, 1983, JUL P STOCKH MUS AC, P125 CHILDERS DG, 1986, J ACOUST SOC AM, V80, P1309, DOI 10.1121/1.394382 CHILDERS DG, 1985, VOICE I O SYSTEMS AP, P349 CLARK HH, 1973, J VERB LEARN VERB BE, V12, P335, DOI 10.1016/S0022-5371(73)80014-3 CLARK JE, 1985, J ACOUST SOC AM, V78, P458, DOI 10.1121/1.392468 COLTON RH, 1981, SPEECH LANGUAGE, V5, P311 Cooper M., 1987, Speech Technology, V3 CRANEN B, 1987, J ACOUST SOC AM, V81, P734, DOI 10.1121/1.394842 ESKENAZI L, 1988, THESIS U FLORIDA FAIRBANKS G, 1958, J ACOUST SOC AM, V30, P596, DOI 10.1121/1.1909702 Fant G., 1960, ACOUSTIC THEORY SPEE FANT G, 1979, STL QPSR, V1, P85 FANT G, 1988, STL QPSR, V2, P1 Fant Gunnar, 1985, STL QPSR, V4, P1 Flanagan J., 1972, SPEECH ANAL SYNTHESI FLETCHER H, 1929, BELL SYST TECH J, V8, P848 FRENCH 
NR, 1947, J ACOUST SOC AM, V19, P90, DOI 10.1121/1.1916407 Fujisaki H., 1986, ICASSP 86 Proceedings. IEEE-IECEJ-ASJ International Conference on Acoustics, Speech and Signal Processing (Cat. No.86CH2243-4) Furui S., 1989, DIGITAL SPEECH PROCE GUERIN B, 1976, IEEE INT C AC SPEECH, P47 HAWLEY ME, 1977, SPEECH INTELLIGIBILI HESS W, 1984, P IEEE INT C ACOUST, V1 Hildebrand B. H., 1976, THESIS U FLORIDA HILLENBRAND J, 1987, J SPEECH HEAR RES, V30, P448 HOLMES JN, 1973, IEEE T ACOUST SPEECH, VAU21, P298, DOI 10.1109/TAU.1973.1162466 HOLMES JN, 1983, SPEECH COMMUN, V2, P251, DOI 10.1016/0167-6393(83)90044-4 HOOVER J, 1987, J SPEECH HEAR RES, V30, P425 HOUSE AS, 1965, J ACOUST SOC AM, V37, P158, DOI 10.1121/1.1909295 HUGGINS AWF, 1985, J ACOUST SOC AM, V77, P1896, DOI 10.1121/1.391941 ITOH K, 1984, REV ELEC COMMUN LAB, V32, P220 Jayant N. S., 1984, DIGITAL CODING WAVEF JUANG BH, 1982, IEEE T ACOUST SPEECH, V30, P294 JUANG BH, 1984, AT&T TECH J, V63, P1477 KAHN M, 1983, IEEE T ACOUST SPEECH, P531 KITZING P, 1974, FOLIA PHONIATR, V26, P138 KLATT DH, 1987, J ACOUST SOC AM, V82, P737, DOI 10.1121/1.395275 KLATT DH, 1980, J ACOUST SOC AM, V67, P971, DOI 10.1121/1.383940 KOIKE Y, 1973, J ACOUST SOC AM, V54, P1618, DOI 10.1121/1.1914458 KOIZUMI T, 1985, J ACOUST SOC AM, V78, P1541, DOI 10.1121/1.392789 KOLJONEN J, 1984, P IEEE INT C ACOUST, V1 KRISHNAMURTHY AK, 1983, THESIS U FLORIDA KRISHNAMURTHY AK, 1986, IEEE T ACOUST SPEECH, V34, P730, DOI 10.1109/TASSP.1986.1164909 KRYTER KD, 1970, EFFECTS NOISE MAN, pCH2 KUWABARA H, 1984, SPEECH COMMUN, V3, P211, DOI 10.1016/0167-6393(84)90016-5 LADD DR, 1985, J ACOUST SOC AM, V78, P435, DOI 10.1121/1.392466 Laver J, 1980, PHONETIC DESCRIPTION LEE CK, 1989, IN PRESS VOCAL FOLD LICKLIDER JCR, 1948, J ACOUST SOC AM, V20, P42, DOI 10.1121/1.1906346 LIEBERMAN P, 1961, J ACOUST SOC AM, V33, P597, DOI 10.1121/1.1908736 LIEBERMAN P, 1963, J ACOUST SOC AM, V35, P344, DOI 10.1121/1.1918465 LIM JS, 1983, SPEECH ENHANCEMENT LOGAN JS, 1989, J ACOUST SOC 
AM, V86, P566, DOI 10.1121/1.398236 LUCE PA, 1983, HUM FACTORS, V25, P17 MACK MA, 1985, ADA160401 MAKHOUL J, 1976, IEEE T ACOUST SPEECH, P103 MAKHOUL J, 1985, P IEEE, V73, P1551, DOI 10.1109/PROC.1985.13340 Markel JD, 1976, LINEAR PREDICTION SP MILENKOVIC P, 1987, J SPEECH HEAR RES, V30, P529 MONSEN RB, 1977, J ACOUST SOC AM, V62, P981, DOI 10.1121/1.381593 MURRY T, 1980, J ACOUST SOC AM, V68, P1294, DOI 10.1121/1.385122 NADALSURIS M, 1977, IEEE T ACOUST SPEECH, P37 NAIK JM, 1984, THESIS U FLORIDA NAKATSUI M, 1982, J ACOUST SOC AM, V72, P1136, DOI 10.1121/1.388323 NOCERINO N, 1985, SPEECH COMMUN, V4, P317, DOI 10.1016/0167-6393(85)90057-3 NOOTEBOOM SG, 1983, IPO18 EINDH PROGR RE, P32 PINTO NB, 1989, IEEE T ACOUST SPEECH, V37, P1870, DOI 10.1109/29.45534 PISONI DB, 1985, P IEEE, V73, P1665, DOI 10.1109/PROC.1985.13346 Pisoni D. B., 1980, ICASSP 80 Proceedings. IEEE International Conference on Acoustics, Speech and Signal Processing Pisoni D. B., 1982, Speech Technology, V1 QUACKENBUSCH SR, 1988, OBJECTIVE MEASURES S Rabiner L.R., 1978, DIGITAL PROCESSING S ROSENBER.AE, 1971, J ACOUST SOC AM, V49, P583, DOI 10.1121/1.1912389 ROSSON MB, 1986, P HUMAN FACTORS COMP, P192, DOI 10.1145/22627.22370 Rothauser E. 
H., 1969, IEEE T AUDIO ELECTRO, V17, P225, DOI 10.1109/TAU.1969.1162058 ROTHAUSE.EH, 1971, J ACOUST SOC AM, V49, P1297, DOI 10.1121/1.1912493 ROTHAUSE.EH, 1968, J ACOUST SOC AM, V44, P408, DOI 10.1121/1.1911095 ROTHENBERG M, 1981, VOCAL FOLD PHYSL, P303 SAMBUR MR, 1978, J ACOUST SOC AM, V63, P918, DOI 10.1121/1.381771 SCHROEDER MR, 1985, SPEECH COMMUN, V4, P155, DOI 10.1016/0167-6393(85)90043-3 SCHWAB EC, 1985, HUM FACTORS, V27, P395 SENEFF S, 1982, IEEE T ACOUST SPEECH, V30, P566, DOI 10.1109/TASSP.1982.1163919 SINGH S, 1978, J ACOUST SOC AM, V64, P81, DOI 10.1121/1.381958 SMITH ME, 1981, IEEE T ACOUST SPEECH, V29, P391, DOI 10.1109/TASSP.1981.1163606 STEENEKEN HJM, 1980, J ACOUST SOC AM, V67, P318, DOI 10.1121/1.384464 THOMAS IB, 1968, J AUDIO ENG SOC, V16, P182 TITZE I, 1982, 11TH TRANSCR S CAR P, P90 TITZE IR, 1987, J SPEECH HEAR RES, V30, P252 VISWANATHAN VR, 1984, ADA1411941, P17 VISWANATHAN VR, 1984, OBJECTIVE SPEECH QUA, P137 Voiers WD, 1977, IEEE INT C AC SPEECH, P204 VOIERS WD, 1983, SPEECH TECHNOL, P30 VOIERS WD, 1968, IEEE T ACOUST SPEECH, VAU16, P275, DOI 10.1109/TAU.1968.1161973 Voiers WD, 1977, SPEECH INTELLIGIBILI, P374 WATERWORTH JA, 1983, APPL ERGON, V14, P39, DOI 10.1016/0003-6870(83)90219-3 Wong D. Y., 1980, ICASSP 80 Proceedings. IEEE International Conference on Acoustics, Speech and Signal Processing WONG DY, 1977, IEEE T ACOUST SPEECH, P208 WU K, 1985, THESIS U FLORIDA YEA JJ, 1983, THESIS U FLORIDA YEA JJ, 1983, IEEE T ACOUST SPEECH, P1332 YEGNANARAYANA JM, 1984, 10TH INT C COMP LING, P530 NR 121 TC 11 Z9 11 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD APR PY 1990 VL 9 IS 2 BP 97 EP 117 DI 10.1016/0167-6393(90)90064-G PG 21 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA DN093 UT WOS:A1990DN09300001 ER PT J AU PICONE, J AF PICONE, J TI DURATION IN CONTEXT CLUSTERING FOR SPEECH RECOGNITION SO SPEECH COMMUNICATION LA English DT Article RP PICONE, J (reprint author), TEXAS INSTRUMENTS INC, SPEECH & IMAGE UNDERSTANDING LAB, DALLAS, TX 75265 USA. CR ANDERBERG MR, 1973, CLUSTER ANAL APPLICA BAKIS R, 1976, 91ST M AC SOC AM BAUM LE, 1967, B AM MATH SOC, V73, P360, DOI 10.1090/S0002-9904-1967-11751-8 BAUM LE, 1970, ANN MATH STAT, V41, P164, DOI 10.1214/aoms/1177697196 DODDINGTON GR, 1988, P IEEE INT C ACOUSTI, P699 DODDINGTON GR, 1988, J ACOUST SOC AM S1, V84 DODDINGTON GR, 1987, DARPA WORKSHOP SPEEC HEMPHILL C, 1989, P IEEE INT C ACOUSTI, P723 LEE CH, 1989, IEEE T ACOUST SPEECH, V37, P1649, DOI 10.1109/29.46547 LEONARD RG, 1984, P IEEE INT C ACOUSTI LEVINSON SE, 1979, IEEE T ACOUST SPEECH, V27, P134, DOI 10.1109/TASSP.1979.1163222 PICONE J, 1990, P IEEE ICASSP 90, P105 PICONE J, 1988, 1988 IEEE ARD HOUS S PICONE J, 1987, P IEEE INT C ACOUSTI, P1652 PICONE J, 1989, P IEEE INT C ACOUSTI, P421 Rabiner L. R., 1986, IEEE ASSP Magazine, V3, DOI 10.1109/MASSP.1986.1165342 Rabiner L. R., 1986, Computer Speech and Language, V1, DOI 10.1016/S0885-2308(86)80021-3 RABINER LR, 1985, AT&T TECH J, V64, P1211 RABINER LR, 1989, P IEEE INT C ACOUSTI, P421 RAJASEKARAN PK, 1985, P ICASSP 85, P882 VITERBI AJ, 1967, IEEE T INFORM THEORY, V13, P260, DOI 10.1109/TIT.1967.1054010 NR 21 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD APR PY 1990 VL 9 IS 2 BP 119 EP 128 DI 10.1016/0167-6393(90)90065-H PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA DN093 UT WOS:A1990DN09300002 ER PT J AU BERNASCONI, C AF BERNASCONI, C TI ON INSTANTANEOUS AND TRANSITIONAL SPECTRAL INFORMATION FOR TEXT-DEPENDENT SPEAKER VERIFICATION SO SPEECH COMMUNICATION LA English DT Article RP BERNASCONI, C (reprint author), SWISS FED INST TECHNOL, ELECTR LAB, CH-8092 ZURICH, SWITZERLAND. CR ATAL BS, 1976, P IEEE, V64, P460, DOI 10.1109/PROC.1976.10155 ATAL BS, 1974, J ACOUST SOC AM, V55, P1304, DOI 10.1121/1.1914702 BERNASCONI C, 1988, THESIS SWISS FEDERAL Bevington P. R., 1969, DATA REDUCTION ERROR DODDINGTON GR, 1985, P IEEE, V73, P1651, DOI 10.1109/PROC.1985.13345 Duda R. O., 1973, PATTERN CLASSIFICATI FURUI S, 1981, IEEE T ACOUST SPEECH, V29, P254, DOI 10.1109/TASSP.1981.1163530 GRAY AH, 1976, IEEE T ACOUST SPEECH, V24, P380, DOI 10.1109/TASSP.1976.1162849 HOHNE HD, 1983, IEEE T ACOUST SPEECH, V31, P807, DOI 10.1109/TASSP.1983.1164174 KREISZYG E, 1979, STATISTISCHE METHODE NEY H, 1981, P ICASSP, P720 ROSENBERG AE, 1976, P IEEE, V64, P475, DOI 10.1109/PROC.1976.10156 SAKOE H, 1978, IEEE T ACOUST SPEECH, V26, P43, DOI 10.1109/TASSP.1978.1163055 NR 13 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD APR PY 1990 VL 9 IS 2 BP 129 EP 139 DI 10.1016/0167-6393(90)90066-I PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA DN093 UT WOS:A1990DN09300003 ER PT J AU YE, HY WANG, SG ROBERT, F AF YE, HY WANG, SG ROBERT, F TI A PCMN NEURAL NETWORK FOR ISOLATED WORD RECOGNITION SO SPEECH COMMUNICATION LA English DT Article C1 INST NATL POLYTECH GRENOBLE, CNRS, TIM LAB 3, F-38031 GRENOBLE, FRANCE. RP YE, HY (reprint author), INST NATL POLYTECH GRENOBLE 2, INST COMMUN PARLEE, UA 368, 46 AV F VIALLET, F-38031 GRENOBLE, FRANCE. 
CR JORDAN MI, 1988, COINS8827 TECHN REP LeCun Y., 1987, THESIS U PARIS 6 LIPPMANN R, 1987, IEEE ASSP MAGAZ 0404 LIPPMANN RP, 1989, NEUR COMPUTAT, V1, P39 Minsky M.L, 1969, PERCEPTRONS Plaut D. C., 1987, Computer Speech and Language, V2, DOI 10.1016/0885-2308(87)90026-X Rumelhart D. E., 1986, PARALLEL DISTRIBUTED, V1 SEJNOWSKI TJ, 1986, JHUEECS8601 J HOPK U WAIBEL A, 1989, IEEE T ACOUST SPEECH, V37, P328, DOI 10.1109/29.21701 WAIBEL A, 1987, ATR INTERPRETING TEL WANG S, 1988, IMPLEMENTATION THRES WANG SR, 1989, THESIS INP GRENOBLE WATROUS RL, 1987, 1ST P INT C NEUR NET, V4, P381 ZWICKER E, 1957, J ACOUST SOC AM, V29, P548, DOI 10.1121/1.1908963 NR 14 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD APR PY 1990 VL 9 IS 2 BP 141 EP 153 DI 10.1016/0167-6393(90)90067-J PG 13 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA DN093 UT WOS:A1990DN09300004 ER PT J AU RADEAU, M MORAIS, J AF RADEAU, M MORAIS, J TI THE UNIQUENESS POINT EFFECT IN THE SHADOWING OF SPOKEN WORDS SO SPEECH COMMUNICATION LA English DT Article RP RADEAU, M (reprint author), UNIV LIBRE BRUXELLES, PSYCHOL EXPTL LAB, B-1050 BRUSSELS, BELGIUM. CR Bagley WC, 1900, AM J PSYCHOL, V12, P80, DOI 10.2307/1412429 Cole R. 
A., 1980, PERCEPTION PRODUCTIO, P133 DESROCHERS A, 1989, CAN J PSYCHOL, V43, P62, DOI 10.1037/h0084253 FRAUENFELDER UH, 1987, COGNITION, V25, P1, DOI 10.1016/0010-0277(87)90002-3 GOODMAN JC, 1988, J MEM LANG, V27, P684, DOI 10.1016/0749-596X(88)90015-0 LUCE PA, 1986, PERCEPT PSYCHOPHYS, V39, P155, DOI 10.3758/BF03212485 MARCUS SM, 1985, LANG COGNITIVE PROC, V1, P163, DOI 10.1080/01690968508402077 MARCUS SM, 1984, ATTENTION PERFORM, V10, P151 MARSLENWILSON W, 1984, ATTENTION PERFORM, V10, P125 MARSLENWILSON WD, 1987, COGNITION, V25, P71, DOI 10.1016/0010-0277(87)90005-9 MARSLENWILSON WD, 1978, COGNITIVE PSYCHOL, V10, P29, DOI 10.1016/0010-0285(78)90018-X MARSLENWILSON WD, 1985, SPEECH COMMUN, V4, P55, DOI 10.1016/0167-6393(85)90036-6 NOOTEBOOM SG, 1981, J PHONETICS, V9, P407 RADEAU M, 1989, MEM COGNITION, V17, P525, DOI 10.3758/BF03197074 RADEAU M, 1989, PSYCHOL RES-PSYCH FO, V51, P123, DOI 10.1007/BF00309307 ROBERT P, 1986, MICRO ROBERT DICT FR TAFT M, 1986, COGNITION, V22, P259, DOI 10.1016/0010-0277(86)90017-X *TRES LANG FRANC, 1971, DICT FREQ NR 18 TC 28 Z9 28 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD APR PY 1990 VL 9 IS 2 BP 155 EP 164 DI 10.1016/0167-6393(90)90068-K PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA DN093 UT WOS:A1990DN09300005 ER PT J AU MARIANI, J AF MARIANI, J TI REPORT ON EUROSPEECH 89 CONFERENCE EUROPEAN CONFERENCE ON SPEECH-COMMUNICATION AND TECHNOLOGY - SEPTEMBER 26-28, 1989, PARIS, FRANCE SO SPEECH COMMUNICATION LA English DT Editorial Material NR 0 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD APR PY 1990 VL 9 IS 2 BP 165 EP 166 DI 10.1016/0167-6393(90)90069-L PG 2 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA DN093 UT WOS:A1990DN09300006 ER PT J AU HUCKVALE, M AF HUCKVALE, M TI EXPLOITING SPEECH KNOWLEDGE IN NEURAL NETS FOR RECOGNITION SO SPEECH COMMUNICATION LA English DT Article RP HUCKVALE, M (reprint author), UNIV LONDON UNIV COLL, DEPT PHONET & LINGUIST, GOWER ST, LONDON WC1E 6BT, ENGLAND. CR BAHL LR, 1989, P ICASSP 89 GLASGOW BEDWORTH MD, 1989, P IEE C ARTIFICIAL N, P86 Bridle J. S., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing CHOW YL, 1987, APR P IEEE INT C AC, P89 COLE RA, 1988, P ICASSP 88, P453 DEMICHELIS P, 1989, P IEEE INT C ACOUSTI, P314 Gimson A. C., 1989, INTRO PRONUNCIATION Hinton G. E., 1981, PARALLEL MODELS ASS HOWARD IS, 1989, P IEE C ARTIFICIAL N, P90 HUCKVALE MA, 1989, P EUROSPEECH 89, V2, P565 HUCKVALE MA, 1987, EUROPEAN C SPEECH TE, P231 IMAMURA A, 1989, P EUROSPEECH 89, V1, P171 JELINEK F, 1985, P IEEE, V73, P161 KANGAS J, 1989, P EUROSPEECH 89 PARI, P345 KOHONEN T, 1987, EUROPEAN C SPEECH TE, V2, P377 Lass Roger, 1984, PHONOLOGY Lee K. F., 1989, P IEEE INT C AC SPEE, P445 Lowerre B.T., 1980, TRENDS SPEECH RECOGN MAKHOUL J, 1985, VARIABILITY INVARIAN MCCLELLAND JL, 1986, PARALLEL DISTRIBUTED, V2, pCH15 OSHIKA BT, 1975, IEEE T ACOUST SPEECH, VAS23, P104, DOI 10.1109/TASSP.1975.1162639 SENEFF S, 1988, TRANSCRIPTION ALIGNM Svendsen T., 1989, P INT C AC SPEECH SI, P108 WAIBEL A, 1989, P 6 IEEE INT C AC SP, P112 YOSHIDA K, 1984, P ICASSP 89 GLASGOW, P1 ZUE V, 1989, P IEEE INT C ACOUSTI, P389 NR 26 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD FEB PY 1990 VL 9 IS 1 BP 1 EP 13 DI 10.1016/0167-6393(90)90040-G PG 13 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA DJ980 UT WOS:A1990DJ98000002 ER PT J AU COSI, P BENGIO, Y DEMORI, R AF COSI, P BENGIO, Y DEMORI, R TI PHONETICALLY-BASED MULTILAYERED NEURAL NETWORKS FOR VOWEL CLASSIFICATION SO SPEECH COMMUNICATION LA English DT Article C1 CTR RECH INFORMAT MONTREAL, MONTREAL H3G 1N2, QUEBEC, CANADA. MCGILL UNIV, SCH COMP SCI, MONTREAL H3A 2K6, QUEBEC, CANADA. RP COSI, P (reprint author), CNR, CTR STUDIO RICERCHE FONET, PIAZZA SALVEMINI 13, I-35131 PADUA, ITALY. CR BENGIO Y, 1989, NATO ASI SERIES BOURLARD H, 1987, 1ST P IEEE INT C NEU, P407 DELGUTTE B, 1984, J ACOUST SOC AM, V75, P897, DOI 10.1121/1.390599 DELGUTTE B, 1984, J ACOUST SOC AM, V75, P866, DOI 10.1121/1.390596 DELGUTTE B, 1980, J ACOUST SOC AM, V68, P843, DOI 10.1121/1.384824 DELGUTTE B, 1984, J ACOUST SOC AM, V75, P887, DOI 10.1121/1.390598 DEMORI R, 1985, IEEE T PATTERN ANAL, V7, P56 GOLDHOR RS, 1985, RLE505 TECHN REP Hinton G.E., 1986, PARALLEL DISTRIBUTED, V1, P282 Kiang NY-s, 1965, DISCHARGE PATTERNS S LEUNG HC, 1988, 1988 P INT C AC SPEE, P422 MILLER MI, 1983, J ACOUST SOC AM, V74, P502, DOI 10.1121/1.389816 Plaut D. C., 1987, Computer Speech and Language, V2, DOI 10.1016/0885-2308(87)90026-X Rumelhart D.E., 1986, PARALLEL DISTRIBUTED, V1, P318 SACHS MB, 1980, J ACOUST SOC AM, V68, P858, DOI 10.1121/1.384825 SENEFF S, 1985, RLE504 TECHN REP SENEFF S, 1984, P IEEE INT C ACOUST SENEFF S, 1986, 1986 P IEEE INT C AC SENEFF S, 1988, J PHONETICS, V16, P55 SINEX DG, 1983, J ACOUST SOC AM, V73, P602, DOI 10.1121/1.389007 WAIBEL A, 1988, 1988 P INT C AC SPEE WATROUS RL, 1987, 10TH P INT JOINT C A, P851 YOUNG ED, 1979, J ACOUST SOC AM, V66, P1381, DOI 10.1121/1.383532 NR 23 TC 7 Z9 7 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD FEB PY 1990 VL 9 IS 1 BP 15 EP 29 DI 10.1016/0167-6393(90)90041-7 PG 15 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA DJ980 UT WOS:A1990DJ98000003 ER PT J AU KNAGENHJELM, P BRAUER, P AF KNAGENHJELM, P BRAUER, P TI CLASSIFICATION OF VOWELS IN CONTINUOUS SPEECH USING MLP AND A HYBRID NET SO SPEECH COMMUNICATION LA English DT Article RP KNAGENHJELM, P (reprint author), CHALMERS UNIV TECHNOL, DEPT INFORMAT THEORY, S-41296 GOTHENBURG, SWEDEN. CR BRAUER P, 1989, 1989 P IEEE ICASSP G, V1, P647 KOHONEN T, 1987, 7TH P IEEE INT C PAT KOHONEN T, 1988, 1988 IEEE INT C NEUR, V1, P61 KOHONEN T, 1988, 1988 P IEEE ICASSP N LEUNG HC, 1988, 1988 P IEEE ICASSP N Linde Y., 1980, IEEE T COMMUNICATION, V28 LIPPMANN RP, 1987, 1ST P IEEE INT C NEU WAIBEL A, 1989, 1989 P IEEE ICASSP G, V1, P112 NR 8 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1990 VL 9 IS 1 BP 31 EP 34 DI 10.1016/0167-6393(90)90042-8 PG 4 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA DJ980 UT WOS:A1990DJ98000004 ER PT J AU GRAMSS, T STRUBE, HW AF GRAMSS, T STRUBE, HW TI RECOGNITION OF ISOLATED WORDS BASED ON PSYCHOACOUSTICS AND NEUROBIOLOGY SO SPEECH COMMUNICATION LA English DT Article RP GRAMSS, T (reprint author), UNIV GOTTINGEN, INST DRITTES PHYS, W-3400 GOTTINGEN, GERMANY. 
CR BURR DJ, 1988, IEEE T ACOUST SPEECH, V36, P1162, DOI 10.1109/29.1643 FUJIMURA O, 1981, PHONETICA, V38, P66 KAMMERER B, 1989, ZFE F INF11389BKWK T Lippmann R.P., 1987, IEEE P 1 INT C NEUR, V4, P417 PARSONS TW, 1976, J ACOUST SOC AM, V60, P911, DOI 10.1121/1.381172 Puschel D., 1988, THESIS U GOTTINGEN RUMELHART DE, 1986, NATURE, V323, P533, DOI 10.1038/323533a0 Rumelhart D.E., 1986, PARALLEL DISTRIBUTED, V1, P318 SHAMMA SA, 1987, IEEE INT C NEURAL NE, V4, P397 STRUBE HW, 1981, SIGNAL PROCESS, V3, P355 TRAUNMULLER H, 1987, SPEECH COMMUN, V6, P143, DOI 10.1016/0167-6393(87)90037-9 Warren R. M., 1970, Journal of the Acoustical Society of America, V48, DOI 10.1121/1.1912298 Zwicker E., 1982, PSYCHOAKUSTIK NR 13 TC 11 Z9 11 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1990 VL 9 IS 1 BP 35 EP 40 DI 10.1016/0167-6393(90)90043-9 PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA DJ980 UT WOS:A1990DJ98000005 ER PT J AU KUHN, G WATROUS, RL LADENDORF, B AF KUHN, G WATROUS, RL LADENDORF, B TI CONNECTED RECOGNITION WITH A RECURRENT NETWORK SO SPEECH COMMUNICATION LA English DT Article C1 SIEMENS RTL, PRINCETON, NJ 08540 USA. UNIV TORONTO, DEPT COMP SCI, TORONTO M5S 1A4, ONTARIO, CANADA. RP KUHN, G (reprint author), INST DEF ANAL, CRD, THANET RD, PRINCETON, NJ 08540 USA. CR BROWN PF, 1987, IBM RC12750 COMP SCI, P1 GHERRITY M, 1989, P INT JOINT C NEUR N, V1, P643 GORI M, 1989, P INT JOINT C NEURAL, V2, P417 HAFFNER P, 1989, P EUROSPEECH, V2, P553 KUHN GM, 1987, SCIMP487 I DEF AN CO LANG KJ, 1988, CMUCS88152 CARN U TE Pearlmutter B. A., 1989, Neural Computation, V1, DOI 10.1162/neco.1989.1.2.263 Rumelhart DH, 1986, PARALLEL DISTRIBUTED, V1 WATROUS RL, 1989, IN PRESS J ACOUST SO WATROUS RL, 1988, IN PRESS J ACOUST SO Williams R. 
J., 1989, Neural Computation, V1, DOI 10.1162/neco.1989.1.2.270 NR 11 TC 10 Z9 10 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1990 VL 9 IS 1 BP 41 EP 48 DI 10.1016/0167-6393(90)90044-A PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA DJ980 UT WOS:A1990DJ98000006 ER PT J AU RAHIM, MG GOODYEAR, CC AF RAHIM, MG GOODYEAR, CC TI ESTIMATION OF VOCAL-TRACT FILTER PARAMETERS USING A NEURAL NET SO SPEECH COMMUNICATION LA English DT Article RP RAHIM, MG (reprint author), UNIV LIVERPOOL, DEPT ELECT ENGN & ELECTR, POB 147, LIVERPOOL L69 3BX, ENGLAND. CR ELMAN JL, 1987, ICS8701 REP KELLY JL, 1962, 4 INT C AC, P1 LEVINSON SE, 1983, J ACOUST SOC AM, V74 RAHIM MG, 1989, IEEE ICASSP 89, V1, P227 ROSENBER.AE, 1971, J ACOUST SOC AM, V49, P583, DOI 10.1121/1.1912389 Rumelhart D. E., 1986, PARALLEL DISTRIBUTED, V2 SONDHI MM, 1987, IEEE T ACOUST SPEECH, V35, P955 NR 7 TC 5 Z9 5 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1990 VL 9 IS 1 BP 49 EP 55 DI 10.1016/0167-6393(90)90045-B PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA DJ980 UT WOS:A1990DJ98000007 ER PT J AU CAMPBELL, WN AF CAMPBELL, WN TI ANALOG I/O NETS FOR SYLLABLE TIMING SO SPEECH COMMUNICATION LA English DT Article RP CAMPBELL, WN (reprint author), IBM UNITED KINGDOM LTD, CTR SCI, ST CLEMENT ST, WINCHESTER, ENGLAND. CR CAMPBELL WN, 1989, IN PRESS WORKING SPE CAMPBELL WN, 1989, P EUROPEAN C SPEECH CAMPBELL WN, 1987, P FASE EDINBURGH Klatt D. H., 1979, FRONTIERS SPEECH COM, P287 McClelland J. L., 1986, PARALLEL DISTRIBUTED NR 5 TC 11 Z9 11 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD FEB PY 1990 VL 9 IS 1 BP 57 EP 61 DI 10.1016/0167-6393(90)90046-C PG 5 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA DJ980 UT WOS:A1990DJ98000008 ER PT J AU WALLIKER, JR HOWARD, I AF WALLIKER, JR HOWARD, I TI REAL-TIME PORTABLE MULTILAYER PERCEPTRON VOICE FUNDAMENTAL-PERIOD EXTRACTOR FOR HEARING-AIDS AND COCHLEAR IMPLANTS SO SPEECH COMMUNICATION LA English DT Article C1 GUYS & ST THOMAS HOSP, DEPT CLIN PHYS & BIOENGN, LONDON SE1 9RT, ENGLAND. RP WALLIKER, JR (reprint author), UNIV LONDON UNIV COLL, DEPT PHONET & LINGUIST, LONDON WC1E 6BT, ENGLAND. CR CHONG M, 1988, CUED FINSENGTR8 CAMB FOURCIN AJ, 1971, MED BIOL ILLUS, V21, P172 HOWARD DM, 1983, ELECTRON LETT, V19, P776, DOI 10.1049/el:19830529 HOWARD I, 1988, SPEECH FUNDAMENTAL P Levine M. V., 1981, FUNDAMENTALS SENSATI MERMELSTEIN P, 1977, J ACOUST SOC AM, V61, P581, DOI 10.1121/1.381301 ROSEN S, 1988, J REHABIL RES DEV, V24, P239 WALLIKER JR, 1986, IEE C PUB, V258, P194 NR 8 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1990 VL 9 IS 1 BP 63 EP 72 DI 10.1016/0167-6393(90)90047-D PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA DJ980 UT WOS:A1990DJ98000009 ER PT J AU WOODLAND, PC SMYTH, SG AF WOODLAND, PC SMYTH, SG TI AN EXPERIMENTAL COMPARISON OF CONNECTIONIST AND CONVENTIONAL CLASSIFICATION SYSTEMS ON NATURAL DATA SO SPEECH COMMUNICATION LA English DT Article RP WOODLAND, PC (reprint author), BRITISH TELECOMMUN PLC, MARTLESHAM HEATH, RT5233, MLB 2-49, IPSWICH IP5 7RE, SUFFOLK, ENGLAND. CR Duda R. O., 1973, PATTERN CLASSIFICATI KOHONEN T, 1988, NEURAL NETWORKS, V1, P3, DOI 10.1016/0893-6080(88)90020-2 MILLAR W, 1987, P EUROPEAN C SPEECH, P136 NILES L, 1989, P IEEE ICASSP 89 GLA RABINER LR, 1985, AT&T TECH J, V64, P1251 RABINER LR, 1983, AT&T TECH J, V62, P1075 Rumelhart D. 
E., 1986, PARALLEL DISTRIBUTED, V1 SAKOE H, 1978, IEEE T ACOUST SPEECH, V26, P43, DOI 10.1109/TASSP.1978.1163055 SORENSON HW, 1971, AUTOMATICA, V7, P465, DOI 10.1016/0005-1098(71)90097-5 WILPON JG, 1985, IEEE T ACOUST SPEECH, V33, P587, DOI 10.1109/TASSP.1985.1164581 NR 10 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1990 VL 9 IS 1 BP 73 EP 82 DI 10.1016/0167-6393(90)90048-E PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA DJ980 UT WOS:A1990DJ98000010 ER PT J AU BRIDLE, JS AF BRIDLE, JS TI ALPHA-NETS - A RECURRENT NEURAL NETWORK ARCHITECTURE WITH A HIDDEN MARKOV MODEL INTERPRETATION SO SPEECH COMMUNICATION LA English DT Article RP BRIDLE, JS (reprint author), ROYAL SIGNALS & RADAR ESTAB, SPEECH RES UNIT, ST ANDREWS RD, MALVERN WR14 3PS, WORCS, ENGLAND. CR Bahl L. R., 1986, P IEEE INT C AC SPEE, P49 Bahl L. R., 1988, P ICASSP 88 NEW YORK, pA93 Bahl L. R., 1987, Computer Speech and Language, V2, DOI 10.1016/0885-2308(87)90010-6 BEDWORTH MD, 1989, IEE CONF PUBL, P86 BOTTOU L, 1989, P EUROSPEECH 89 BOURLARD H, 1989, COMPUTER SPEECH LANG BOURLARD H, 1989, ADV NEURAL INFORMATI, V1, P502 Bridle J.S., 1989, NEUROCOMPUTING ALGOR BRIDLE JS, 1989, P IEEE C NEURAL INFO Gopalakrishnan P. S., 1989, P ICASSP, P631 Holmes J. N., 1988, SPEECH SYNTHESIS REC Hunt M. J., 1989, ICASSP-89: 1989 International Conference on Acoustics, Speech and Signal Processing (IEEE Cat. No.89CH2673-2), DOI 10.1109/ICASSP.1989.266415 KUHN G, 1990, SPEECH COMMUN, V9, P41, DOI 10.1016/0167-6393(90)90044-A KUHN GM, 1987, SCIMP487 I DEF AN CO LANG KJ, 1988, CMUCS88152 CARN U TE LIPPMANN RP, NEURAL COMPUTATION, V1 Moore R. 
K., 1989, Computer Speech and Language, V3, DOI 10.1016/0885-2308(89)90025-9 PRAGER RG, 1986, COMPUTER SPEECH LANG, V1 WAIBEL A, 1988, IEEE T ACOUST SPEECH WAIBEL A, 1988, P IEEE ICASSP NEW YO, P107 WATROUS RL, 1989, IN PRESS J ACOUST SO WATROUS RL, 1988, P IEEE WORKSHOP SPEE NR 22 TC 43 Z9 43 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1990 VL 9 IS 1 BP 83 EP 92 DI 10.1016/0167-6393(90)90049-F PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA DJ980 UT WOS:A1990DJ98000011 ER PT J AU MOORE, RK AF MOORE, RK TI NEUROSPEECH SO SPEECH COMMUNICATION LA English DT Editorial Material RP MOORE, RK (reprint author), ROYAL SIGNALS & RADAR ESTAB, SPEECH RES UNIT, MALVERN WR14 3PS, WORCS, ENGLAND. NR 0 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD FEB PY 1990 VL 9 IS 1 BP R9 EP R10 DI 10.1016/0167-6393(90)90038-B PG 2 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA DJ980 UT WOS:A1990DJ98000001 ER PT J AU RECASENS, D AF RECASENS, D TI LONG-RANGE COARTICULATION EFFECTS FOR TONGUE DORSUM CONTACT IN VCVCV SEQUENCES SO SPEECH COMMUNICATION LA English DT Article RP RECASENS, D (reprint author), UNIV AUTONOMA BARCELONA, DEPT FILOL CATALANA, BARCELONA, SPAIN. 
CR ALFONSO PJ, 1982, LANG SPEECH, V25, P151 BELLBERTI F, 1981, PHONETICA, V38, P9 Butcher A., 1976, J PHONETICS, V4 Catford John C., 1977, FUNDAMENTAL PROBLEMS Fant G., 1960, ACOUSTIC THEORY SPEE FARNETANI E, 1985, PHONETICA, V42, P78 Fowler C.A., 1981, J SPEECH HEAR RES, V46, P127 FOWLER CA, 1980, J PHONETICS, V8, P113 GAY T, 1977, DYNAMIC ASPECTS SPEE, P85 HARRIS KS, 1984, ARTICULATORY ASSESSM, P147 Henke W, 1966, THESIS MIT HUFFMAN MK, 1986, WORKING PAPERS PHONE, V63, P26 MACNEILA.PF, 1969, J ACOUST SOC AM, V45, P1217, DOI 10.1121/1.1911593 MAGEN H, 1989, THESIS YALE U MCCUTCHEON M, 1980, BIOCOMMUNICATION RES, V3, P38 OHMAN SEG, 1966, J ACOUST SOC AM, V39, P151 PARUSH A, 1983, J ACOUST SOC AM, V74, P1115, DOI 10.1121/1.390035 PERKELL J, 1987, J ACOUST SOC AM, V82, pS17, DOI 10.1121/1.2024681 RECASENS D, 1984, J ACOUST SOC AM, V76, P1624, DOI 10.1121/1.391609 RECASENS D, 1987, J PHONETICS, V15, P299 RECASENS D, 1985, LANG SPEECH, V28, P97 SHIBATA S, 1978, ANN B RES I LOGOPEDI, V12, P5 SUSSMAN HM, 1973, J SPEECH HEAR RES, V16, P397 SUSSMAN HM, 1981, J SPEECH HEAR RES, V46, P16 WOLF M, 1976, BIOCOMMUNICATION RES, V1, P57 NR 25 TC 23 Z9 23 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1989 VL 8 IS 4 BP 293 EP 307 DI 10.1016/0167-6393(89)90012-5 PG 15 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA CR588 UT WOS:A1989CR58800001 ER PT J AU DOLOGLOU, I CARAYANNIS, G AF DOLOGLOU, I CARAYANNIS, G TI PITCH DETECTION BASED ON ZERO-PHASE FILTERING SO SPEECH COMMUNICATION LA English DT Article RP DOLOGLOU, I (reprint author), NATL TECH UNIV ATHENS, DEPT ELECT ENGN, DIV COMP SCI, 9 IROON POLYTECH ST, GR-15773 ZOGRAPHOU, GREECE. 
CR FOURCIN AJ, ASHA REPORTS, V11, P116 GOLD B, J ACOUST SOC AM, V34, P916 JOSPA P, 1984, 13TH P JOURN ET PAR, P161 KAY SM, 1981, P IEEE, V69, P1402 MARKEL JD, 1972, IEEE T AUDIO ELECTRO, V20 NOLL AM, J ACOUST SOC AM, V36, P296 RABINER LR, 1978, DIGITAL PROCESSING S, P398 ROSENBER.AE, 1971, J ACOUST SOC AM, V49, P583, DOI 10.1121/1.1912389 STRUBE AW, 1974, J ACOUST SOC AM, V56 NR 9 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1989 VL 8 IS 4 BP 309 EP 318 DI 10.1016/0167-6393(89)90013-7 PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA CR588 UT WOS:A1989CR58800002 ER PT J AU EGGEN, JH AF EGGEN, JH TI INTELLIGIBILITY OF SYNTHETIC SPEECH IN THE PRESENCE OF INTERFERING SPEECH SO SPEECH COMMUNICATION LA English DT Article RP EGGEN, JH (reprint author), INST PERCEPT RES, POB 513, 5600 MB EINDHOVEN, NETHERLANDS. CR Atal B. S., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing Atal B. S., 1983, Proceedings of ICASSP 83. 
IEEE International Conference on Acoustics, Speech and Signal Processing ATAL BS, 1971, J ACOUST SOC AM, V50, P637, DOI 10.1121/1.1912679 BODE DL, 1973, IEEE T ACOUST SPEECH, VAU21, P196, DOI 10.1109/TAU.1973.1162479 BRADY PT, 1968, J ACOUST SOC AM, V44, P695, DOI 10.1121/1.1911163 CARHART R, 1975, J ACOUST SOC AM, V58, pS35, DOI 10.1121/1.2002082 COHEN A, 1961, FONOLOGIE NEDERLANDS FRENCH NR, 1947, J ACOUST SOC AM, V19, P90, DOI 10.1121/1.1916407 HERMES DJ, 1988, J ACOUST SOC AM, V83, P257, DOI 10.1121/1.396427 HOUSE AS, 1965, J ACOUST SOC AM, V37, P158, DOI 10.1121/1.1909295 KALIKOW DN, 1977, J ACOUST SOC AM, V61, P1337, DOI 10.1121/1.381436 LEVITT H, 1976, J ACOUST SOC AM, V42, P609 LEVITT H, 1971, J ACOUST SOC AM, V49, P467, DOI 10.1121/1.1912375 MACKIE K, 1988, SPEECH COMMUN, V6, P309 MARKEL JD, 1974, IEEE T ACOUST SPEECH, V2, P124 NAKATANI LH, 1973, J ACOUST SOC AM, V53, P1083, DOI 10.1121/1.1913428 PISONI DB, 1985, P IEEE, V73, P1665, DOI 10.1109/PROC.1985.13346 Pisoni D. B., 1982, Speech Technology, V1 PISONI D B, 1982, Journal of the Acoustical Society of America, V71, pS94, DOI 10.1121/1.2019648 PLOMP R, 1979, AUDIOLOGY, V18, P43 Pratt R. L., 1987, Speech Technology, V3 STUDEBAKER GA, 1985, J SPEECH HEAR RES, V28, P455 VANDIJKKAPPERS AML, 1989, SPEECH COMMUN, V8, P125, DOI 10.1016/0167-6393(89)90039-3 VOGTEN LLM, 1980, IPO ANN PROGR REPORT, V15, P33 VOGTEN LLM, 1983, THESIS EINDHOVEN U T VOIERS WD, 1969, AFCRL690157 AIR FORC Voiers WD, 1977, SPEECH INTELLIGIBILI, P374 WILLEMS LF, 1986, IPO ANN PROGR REPORT, V21, P34 1983, TECHNICAL PUBLICATIO, V101 NR 29 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1989 VL 8 IS 4 BP 319 EP 327 DI 10.1016/0167-6393(89)90014-9 PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA CR588 UT WOS:A1989CR58800003 ER PT J AU BERG, T AF BERG, T TI HOW PHONETIC IS A PHONOLOGICAL FEATURE REPRESENTATION - THE CASE OF LABIODENTAL FRICATIVES SO SPEECH COMMUNICATION LA English DT Article RP BERG, T (reprint author), TECH UNIV BRAUNSCHWEIG, DEPT ENGLISH LINGUIST, MUEHLENPFORDTSTR 22-23, W-33 BRAUNSCHWEIG, GERMANY. CR Abd-El-Jawad H., 1987, LANG SCI, V9, P145, DOI 10.1016/S0388-0001(87)80017-7 ANDERSON SR, 1971, LINGUIST INQ, V2, P103 ANDERSON SR, 1981, LINGUIST INQ, V12, P493 Archangeli Diana B., 1988, PHONOLOGY, V5, P183, DOI 10.1017/S0952675700002268 BERG T, 1987, CROSS LINGUISTIC COM Berg Thomas, 1988, ABBILDUNG SPRACHPROD Chomsky N., 1968, SOUND PATTERN ENGLIS Delattre P, 1965, COMP PHONETIC FEATUR DONEGAN P, 1978, OHIO STATE U WORKING, V23 FROMKIN VA, 1973, SPEECH ERRORS LINGUI, P243 FRY DB, 1959, LANG SPEECH, V2, P52 Gamkrelidze Th.V., 1978, UNIVERSALS HUMAN LAN, V2, P9 HULTZEN LS, 1964, TABLES TRANSITIONAL HYMAN LM, 1970, LANGUAGE, V46, P58, DOI 10.2307/412407 SHATTUCKHUFNAGEL S, 1979, J VERB LEARN VERB BE, V18, P41, DOI 10.1016/S0022-5371(79)90554-1 LADEFOGED P, 1980, LANGUAGE, V56, P485, DOI 10.2307/414446 Lass Roger, 1984, PHONOLOGY INTRO BASI LAUBSTEIN A, 1988, NATURE PRODUCTION GR MACNEILA.PF, 1970, PSYCHOL REV, V77, P182, DOI 10.1037/h0029070 MACNEILAGE PF, 1980, LANG SPEECH, V23, P3 MCCARTHY JJ, 1981, LINGUIST INQ, V12, P373 MILLER GA, 1955, J ACOUST SOC AM, V27, P338, DOI 10.1121/1.1907526 Stemberger J. P., 1985, PROGR PSYCHOL LANGUA, V1, P143 STEMBERGER JP, 1983, J PHONETICS, V11, P139 Van den Broecke M. P. 
R., 1980, ERRORS LINGUISTIC PE, P47 WALTER H, 1985, 5TH INT PHON M, P276 WANG MD, 1973, J ACOUST SOC AM, V54, P1248, DOI 10.1121/1.1914417 NR 27 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1989 VL 8 IS 4 BP 329 EP 345 DI 10.1016/0167-6393(89)90015-0 PG 17 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA CR588 UT WOS:A1989CR58800004 ER PT J AU PARTALO, M SIJERCIC, Z AF PARTALO, M SIJERCIC, Z TI COMPARISON OF SEVERAL SPEECH SIGNAL FEATURE PARAMETERS FOR AUTOMATIC SPEECH RECOGNITION SO SPEECH COMMUNICATION LA English DT Note C1 TELECOM AUSTRALIA, TELECOM NETWORK ENGN, ISDN TRANSMISS EQUIPMENT SECT, MELBOURNE, VIC 3000, AUSTRALIA. RP PARTALO, M (reprint author), LOGOS, SPEECH TECHNOL LAB, PROFESS ELECTR RUDI CAJAVEC, BANJA LUKA, YUGOSLAVIA. CR BROWN MK, 1982, IEEE T ACOUST SPEECH, V30, P535, DOI 10.1109/TASSP.1982.1163916 BUI NC, 1983, IEEE T ACOUST SPEECH, V31, P323 COX BV, 1980, IEEE T ACOUST SPEECH, V28, P550, DOI 10.1109/TASSP.1980.1163444 DAUTRICH BA, 1983, IEEE T ACOUST SPEECH, V31, P793, DOI 10.1109/TASSP.1983.1164172 GOPALAN K, 1984, DEC P ISSM INT S LAS GUDONAVICUS RV, 1977, SPEECH SIGNAL RECOGN ITAKURA F, 1975, IEEE T ACOUST SPEECH, VAS23, P67, DOI 10.1109/TASSP.1975.1162641 MATAUSEK MR, IN PRESS IEEE T ASSP MOKEDDEM A, 1984, 7TH P ICPR MONTR RABINER LR, 1975, AT&T TECH J, V54, P297 NR 10 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1989 VL 8 IS 4 BP 347 EP 353 DI 10.1016/0167-6393(89)90016-2 PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA CR588 UT WOS:A1989CR58800005 ER PT J AU EEFTING, W RIETVELD, ACM AF EEFTING, W RIETVELD, ACM TI JUST NOTICEABLE DIFFERENCES OF ARTICULATION RATE AT SENTENCE LEVEL SO SPEECH COMMUNICATION LA English DT Note RP EEFTING, W (reprint author), CATHOLIC UNIV NIJMEGEN, DEPT LANGUAGE & SPEECH, PHONET SECT, NIJMEGEN, NETHERLANDS. CR BUTCHER A, 1981, ARBEITSBERICHTE, V15, P3 DENOS E, 1985, PHONETICA, V42, P124 Eefting W., 1988, 7th FASE Symposium. Proceedings Speech '88 Fujisaki Hiroya, 1975, AUDITORY ANAL PERCEP, P197 Goldman-Eisler F., 1968, PSYCHOLINGUISTICS EX HAYS WL, 1974, STATISTICS SOCIAL SC HOEQUIST C, 1984, ARBEITSBERICHTE I PH, V22, P137 KOHLER KJ, 1986, LANG SPEECH, V29, P115 Lehiste I., 1970, SUPRASEGMENTALS LEROY L, 1986, ANTWERPEN PAPERS LIN, V40 MILLER JL, 1984, PHONETICA, V41, P215 RIETVELD ACM, 1987, J PHONETICS, V15, P273 SIXTL F, 1967, MESSMETHODEN PSYCHOL Woodrow H, 1930, J EXP PSYCHOL, V13, P473, DOI 10.1037/h0070462 NR 14 TC 10 Z9 10 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1989 VL 8 IS 4 BP 355 EP 361 DI 10.1016/0167-6393(89)90017-4 PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA CR588 UT WOS:A1989CR58800006 ER PT J AU MULLER, JM AF MULLER, JM TI IMPROVING PERFORMANCE OF CODE EXCITED LPC-CODERS BY JOINT OPTIMIZATION SO SPEECH COMMUNICATION LA English DT Note RP MULLER, JM (reprint author), ANT TELECOMMUN, DEPT ADV DEV, E314, GERBERSTR, W-7150 BACKNANG, GERMANY. 
CR DAVIDSON G, 1986, P ICASSP TOKYO GUTH P, 1986, COMMUNICATION MULLER JM, 1988, RELP SPRACHCODIERUNG, P93 SCHROEDER MR, 1985, MAR P INT C AC SPEEC, P937 SINGHAL S, 1984, P ICASSP SAN DIEGO TRANCOSO IM, 1987, P ICASSP DALLAS NR 6 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1989 VL 8 IS 4 BP 363 EP 369 DI 10.1016/0167-6393(89)90018-6 PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA CR588 UT WOS:A1989CR58800007 ER PT J AU TITZE, IR AF TITZE, IR TI A 4-PARAMETER MODEL OF THE GLOTTIS AND VOCAL FOLD CONTACT AREA SO SPEECH COMMUNICATION LA English DT Article C1 DENVER CTR PERFORMING ARTS, RECORDING & RES CTR, DENVER, CO 80204 USA. RP TITZE, IR (reprint author), UNIV IOWA, DEPT SPEECH PATHOL & AUDIOL, VOICE ACOUST & BIOMECH LAB, IOWA CITY, IA 52242 USA. RI Titze, Ingo/G-4780-2010 CR ABBERTON E, 1972, BRIT J DISORD COMMUN, V7, P24 BAER T, 1983, J ACOUST SOC AM, V73, P1304, DOI 10.1121/1.389279 BAKEN RJ, 1987, CLIN MEASUREMENT SPE, P216 CHILDERS DG, 1985, CRIT REV BIOMED ENG, V12, P131 CHILDERS DG, 1986, J ACOUST SOC AM, V80, P1309, DOI 10.1121/1.394382 Fant G., 1985, Q PROGR STATUS REPOR, V4, P1 HIRANO M, 1975, 78TH ANN CONV OT RHI HOLLIEN H, 1960, J SPEECH HEAR RES, V3, P361 HOLLIEN H, 1960, J SPEECH HEAR RES, V3, P150 HOLMBERG EB, 1988, J ACOUST SOC AM, V84, P511, DOI 10.1121/1.396829 KITZING P, 1982, FOLIA PHONIATR, V34, P216 KITZING P, 1982, FOLIA PHONIATR, V34, P234 LECLUSE FLE, 1975, FOLIA PHONIATR, V27, P215 LECLUSE FLE, 1977, ELECTROGLOTTOGRAFIE ROTHENBERG M, 1981, P C ASSESSMENT VOCAL, P86 ROTHENBERG M, 1988, J SPEECH HEAR RES, V31, P338 Scherer R. 
C., 1988, VOCAL PHYSL VOICE PR, P279 SCHERER RC, 1988, J ACOUSTICAL SOC S1, V84, pS81, DOI 10.1121/1.2026502 TITZE IR, 1989, J ACOUST SOC AM, V85, P1699, DOI 10.1121/1.397959 TITZE IR, 1989, J ACOUST SOC AM, V85, P901, DOI 10.1121/1.397562 TITZE IR, 1988, J ACOUST SOC AM, V83, P1536, DOI 10.1121/1.395910 TITZE IR, 1981, P C ASSESSMENT VOCAL, P48 TITZE IR, 1984, J ACOUST SOC AM, V75, P570, DOI 10.1121/1.390530 TITZE KA, IN PRESS J VOICE NR 24 TC 29 Z9 29 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD SEP PY 1989 VL 8 IS 3 BP 191 EP 201 DI 10.1016/0167-6393(89)90001-0 PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA AU531 UT WOS:A1989AU53100001 ER PT J AU VANDIJKKAPPERS, AML AF VANDIJKKAPPERS, AML TI COMPARISON OF PARAMETER SETS FOR TEMPORAL DECOMPOSITION SO SPEECH COMMUNICATION LA English DT Article RP VANDIJKKAPPERS, AML (reprint author), INST PERCEPT RES, 5600 MB EINDHOVEN, NETHERLANDS. CR Ahlbom G., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) Applebaum T. H., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) Atal B. S., 1983, Proceedings of ICASSP 83. 
IEEE International Conference on Acoustics, Speech and Signal Processing BIMBOT F, 1987, 11TH P ICPHS TALLINN, V5, P31 CHOLLET G, 1986, 3RD P EUSIPCO, P365 Flanagan J., 1972, SPEECH ANAL SYNTHESI GRAY AH, 1976, IEEE T ACOUST SPEECH, V24, P380, DOI 10.1109/TASSP.1976.1162849 LIBERMAN AM, 1967, PSYCHOL REV, V74, P431, DOI 10.1037/h0020279 Markel JD, 1976, LINEAR PREDICTION SP MARTEAU PF, 1988, P ICASSP, P615 NIRANJAN M, 1987, P EUROPEAN C SPEECH, P71 NOCERINO N, 1985, P ICASSP, P25 POLS LCW, 1977, SPECTRAL ANAL IDENTI SEKEY A, 1984, J ACOUST SOC AM, V75, P1902, DOI 10.1121/1.390954 VANDIJKKAPPERS AML, 1987, IPO ANN PROGR REPORT, V22, P41 VANDIJKKAPPERS AML, 1989, SPEECH COMMUN, V8, P125, DOI 10.1016/0167-6393(89)90039-3 VISWANATHAN R, 1975, IEEE T ACOUST SPEECH, VAS23, P309, DOI 10.1109/TASSP.1975.1162675 VOGTEN LL, 1983, THESIS EINDHOVEN WILLEMS LF, 1987, P EUROPEAN C SPEECH, P250 WILLEMS LF, 1986, IPO ANN PROGR REPORT, V21, P34 NR 20 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD SEP PY 1989 VL 8 IS 3 BP 203 EP 220 PG 18 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA AU531 UT WOS:A1989AU53100002 ER PT J AU DARWIN, CJ MCKEOWN, JD KIRBY, D AF DARWIN, CJ MCKEOWN, JD KIRBY, D TI PERCEPTUAL COMPENSATION FOR TRANSMISSION CHANNEL AND SPEAKER EFFECTS ON VOWEL QUALITY SO SPEECH COMMUNICATION LA English DT Article RP DARWIN, CJ (reprint author), UNIV SUSSEX, BRIGHTON BN1 9QG, E SUSSEX, ENGLAND. CR ASSMANN PF, 1987, J ACOUST SOC AM, V81, P520, DOI 10.1121/1.394918 Carlson R., 1975, AUDITORY ANAL PERCEP, P55 DARWIN CJ, 1985, SPEECH COMMUN, V4, P231, DOI 10.1016/0167-6393(85)90049-4 DECHOVITZ D, 1977, SR5152 HASK LAB STAT, P213 DIBENEDETTO MG, 1987, THESIS U ROME SAPIEN Finney DJ, 1964, PROBIT ANAL HAGGARD M, 1974, BRIT J PSYCHOL, V65, P69 Joos M., 1948, LANGUAGE SUPPL, V24, P1, DOI DOI 10.2307/522229 Klatt D. 
H., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing KLATT DH, 1982, REPRESENTATION SPEEC Ladefoged P., 1967, 3 AREAS EXPT PHONETI LADEFOGED P, 1957, J ACOUST SOC AM, V29, P98, DOI 10.1121/1.1908694 LINDQVIST J, 1968, ROLE RELATIVE SPECTR, V2, P12 MATTINGLY IG, 1968, THESIS YALE U MILLER GA, 1955, J ACOUST SOC AM, V27, P338, DOI 10.1121/1.1907526 MOORE BCJ, 1983, J ACOUST SOC AM, V74, P750, DOI 10.1121/1.389861 PETERSON GE, 1952, J ACOUST SOC AM, V24, P175, DOI 10.1121/1.1906875 PICKETT JM, 1957, J ACOUST SOC AM, V29, P613, DOI 10.1121/1.1908983 REPP BH, 1979, J EXP PSYCHOL HUMAN, V5, P129, DOI 10.1037//0096-1523.5.1.129 *SIGN TECHN INC, 1983, INT LAB SYST SUMMERFIELD Q, 1984, PERCEPT PSYCHOPHYS, V35, P203, DOI 10.3758/BF03205933 VANBERGEM DR, 1988, SPEECH COMMUN, V7, P1, DOI 10.1016/0167-6393(88)90018-0 VANDIJKHUIZEN JN, 1987, J ACOUST SOC AM, V81, P465, DOI 10.1121/1.394912 WATKINS AJ, 1986, J ACOUST SOC AM, V80, pS111, DOI 10.1121/1.2023561 Watkins A. J., 1988, 7th FASE Symposium. Proceedings Speech '88 WATKINS AJ, 1986, PROC INS AC, V8, P17 NR 26 TC 16 Z9 17 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD SEP PY 1989 VL 8 IS 3 BP 221 EP 234 DI 10.1016/0167-6393(89)90003-4 PG 14 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA AU531 UT WOS:A1989AU53100003 ER PT J AU SCHWARTZ, JL ESCUDIER, P AF SCHWARTZ, JL ESCUDIER, P TI A STRONG EVIDENCE FOR THE EXISTENCE OF A LARGE-SCALE INTEGRATED SPECTRAL REPRESENTATION IN VOWEL PERCEPTION SO SPEECH COMMUNICATION LA English DT Article RP SCHWARTZ, JL (reprint author), ECOLE NATL SUPER ELECTR & RADIOELECT, INST NATL POLYTECH GRENOBLE, INST COMMUN PARLEE, CNRS, UA 368, F-38031 GRENOBLE, FRANCE. 
CR ABELES M, 1970, J NEUROPHYSIOL, V33, P172 ABRY C, 1987, B LCP, V1, P191 ABRY C, 1989, IN PRESS J PHONETICS BEDDOR PS, 1984, SR7778 HASK LAB STAT, P107 BEDROV YA, 1978, SOV PHYS ACOUST+, V24, P275 BLADON A, 1983, SPEECH COMMUN, V2, P305, DOI 10.1016/0167-6393(83)90047-X BLADON RAW, 1978, STL QPSR, V1, P1 BLADON RAW, 1981, J ACOUST SOC AM, V69, P1414, DOI 10.1121/1.385824 BUSER P, 1987, NEUROPHYSIOLOGIE FON Carlson R., 1970, STL QPSR, V2, P19 Carlson R., 1975, AUDITORY ANAL PERCEP, P55 Carlson R., 1979, STL QPSR, V3, P73 CARLSON R, 1979, STLQPSR34, V3, P84 CHENG YM, 1983, ETUDE SPECTRES INT V CHISTOVICH IA, 1986, SPEECH COMMUN, V5, P3, DOI 10.1016/0167-6393(86)90026-9 CHISTOVICH LA, 1980, LANG SPEECH, V23, P67 Chistovich L. A., 1979, FRONTIERS SPEECH COM, P143 CHISTOVICH LA, 1979, HEARING RES, V1, P185, DOI 10.1016/0378-5955(79)90012-1 CHISTOVICH LA, 1976, PHYSL SPEECH HUMAN S COWAN N, 1986, J ACOUST SOC AM, V79, P500, DOI 10.1121/1.393537 CROWDER RG, 1982, COGNITIVE REPRESENTA, P167 Delattre P, 1952, WORD, V8, P195 ESCUDIER P, 1985, FRANC SEM SOC FRANC, P143 ESPINOZAVARAS B, 1987, NATO ASI SER, P80 FANT G, 1983, STL QPSR, V2, P1 FUJIMURA O, 1967, LANG SPEECH, V10, P181 Fujisaki H., 1969, Annual Report of the Engineering Research Institute, Faculty of Engineering, University of Tokyo, V28 Fujisaki H., 1970, Annual Report of the Engineering Research Institute, Faculty of Engineering, University of Tokyo, V29 GREEN DM, 1984, J ACOUST SOC AM, V75, P1163, DOI 10.1121/1.390765 GREEN DM, 1987, NATO ASO SERIES, P314 HAWKINS S, 1985, J ACOUST SOC AM, V77, P1560, DOI 10.1121/1.391999 HAWKINS S, 1985, J ACOUST SOC AM S1, V77, pS8, DOI 10.1121/1.2022575 Klatt D. 
H., 1982, IEEE ICASSP, P1278 KLATT DH, 1985, J ACOUST SOC AM S1, V77, pS7, DOI 10.1121/1.2022524 MANTAKAS M, 1988, B LABORATOIRE COMMUN, V2, P95 MANTAKAS M, 1986, 15EMES JEP SFA, P157 MANTAKAS M, 1988, 7TH FASE C ED Marr D, 1982, VISION MASSARO DW, 1987, PERSPECTIVES PERCEPT, P273 Petitot J., 1985, CATASTROPHES PAROLE Pickles JO, 1982, INTRO PHYSL HEARING PISONI DB, 1975, MEM COGNITION, V3, P7, DOI 10.3758/BF03198202 PISONI DB, 1973, PERCEPT PSYCHOPHYS, V13, P253, DOI 10.3758/BF03214136 PISONI DB, 1971, THESIS U MICHIGAN PLATT JR, 1964, SCIENCE, V146, P347, DOI 10.1126/science.146.3642.347 Popper K. R., 1959, LOGIC SCI DISCOVERY QUACHTUAN N, 1986, THESIS INPG GRENOBLE Repp B. H., 1984, SPEECH LANGUAGE ADV, V10, P243 REPP BH, 1979, J EXP PSYCHOL HUMAN, V5, P129, DOI 10.1037//0096-1523.5.1.129 REPP BH, 1987, HASKINS LABORATORIES, V89, P1 SCHWARTZ JL, 1987, NATO ASI SER, P284 SCHWARTZ JL, 1987, THESIS INPG USMG GRE SCHWARTZ JL, 1987, B LCP, V1, P159 Stevens KN, 1972, HUMAN COMMUNICATION, P51 STEVENS KN, 1985, SPEECH COMMUN, V4, P137, DOI 10.1016/0167-6393(85)90041-X SYRDAL AK, 1986, J ACOUST SOC AM, V79, P1086, DOI 10.1121/1.393381 SYRDAL AK, 1985, SPEECH COMMUN, V4, P121, DOI 10.1016/0167-6393(85)90040-8 TRAUNMUELLER H, 1982, REPRESENTATION SPEEC, P103 Zwicker E., 1967, OHR NACHRICHTENEMPFA NR 59 TC 20 Z9 21 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD SEP PY 1989 VL 8 IS 3 BP 235 EP 259 DI 10.1016/0167-6393(89)90004-6 PG 25 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA AU531 UT WOS:A1989AU53100004 ER PT J AU PRICE, PJ AF PRICE, PJ TI MALE AND FEMALE VOICE SOURCE CHARACTERISTICS - INVERSE FILTERING RESULTS SO SPEECH COMMUNICATION LA English DT Article RP PRICE, PJ (reprint author), SRI INT, 333 RAVENSWOOD AVE, EK178, MENLO PK, CA 94025 USA. 
CR ANANTHAPADMANABHA TV, 1979, IEEE T ACOUST SPEECH, V27, P309, DOI 10.1109/TASSP.1979.1163267 BEROUTI MG, 1977, P INT C ACOUST SPEEC, P33 Bickley C., 1982, MIT RLE SPEECH COMMU, V1, P71 BICKLEY CA, 1986, J PHONETICS, V14, P373 CHANG SH, 1956, J ACOUST SOC AM, V28, P565, DOI 10.1121/1.1908399 DAVIS KH, 1952, J ACOUST SOC AM, V24, P637, DOI 10.1121/1.1906946 Fant G, 1982, STL QPSR, V2-3, P1 Fant G, 1979, SPEECH TRANSM LAB Q, V1, P79 Fant G., 1970, ACOUSTIC THEORY SPEE FANT G, 1961, 3RD P INT C AC STUTT, P3 FANT G, 1982, Journal of the Acoustical Society of America, V71, pS104, DOI 10.1121/1.2019177 FANT G, 1979, STL QPSR, V1, P85 FANT G, 1980, STL QPSR, V2, P17 Fant G., 1983, STL QPRS, V4, P1 Flanagan J., 1972, SPEECH ANAL SYNTHESI FLANAGAN JL, 1968, IEEE T ACOUST SPEECH, VAU16, P57, DOI 10.1109/TAU.1968.1161949 FUJIMURA O, 1971, J ACOUST SOC AM, V49, P541, DOI 10.1121/1.1912385 HEDELIN P, 1986, P IEEE ICASSP86 TOKY, P465 HENTON CG, 1985, LANG COMMUN, V5, P221, DOI 10.1016/0271-5309(85)90012-6 Hirano M., 1977, DYNAMIC ASPECTS SPEE, P13 HIRANO M, 1980, 9TH T S CAR PROF V 2 HOLMES JN, 1972, IEEE AU, V21, P298 HOLMES JN, 1962, 4TH P INT C AC COP, pG13 HOUSE AS, 1958, J SPEECH HEAR RES, V1, P309 ISHIZAKA K, 1972, AT&T TECH J, V51, P1233 KRISHNAMURTHY AK, 1984, P INT C ACOUST SPEEC LAINE UK, 1988, SPEECH COMMUN, V7, P21, DOI 10.1016/0167-6393(88)90019-2 MATAUSEK MR, 1980, IEEE T ACOUST SPEECH, V28, P616, DOI 10.1109/TASSP.1980.1163483 MILLER RL, 1959, J ACOUST SOC AM, V31, P667, DOI 10.1121/1.1907771 MONSEN RB, 1977, J ACOUST SOC AM, V62, P981, DOI 10.1121/1.381593 MONSEN RB, 1978, J ACOUST SOC AM, V64, P65, DOI 10.1121/1.381957 MUNSON WA, 1950, J ACOUST SOC AM, V22, P678, DOI 10.1121/1.1917204 PINSON EN, 1963, J ACOUST SOC AM, V35, P1264, DOI 10.1121/1.1918682 ROSENBER.AE, 1971, J ACOUST SOC AM, V49, P583, DOI 10.1121/1.1912389 ROTHENBE.M, 1973, J ACOUST SOC AM, V53, P1632, DOI 10.1121/1.1913513 SONDHI MM, 1975, J ACOUST SOC AM, V57, P228, DOI 10.1121/1.380429 STRUBE 
HW, 1974, J ACOUST SOC AM, V56, P1625, DOI 10.1121/1.1903487 VEENEMAN DE, P INT C ACOUST SPEEC WONG DY, 1979, IEEE T ACOUST SPEECH, V27, P350, DOI 10.1109/TASSP.1979.1163260 NR 39 TC 33 Z9 33 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD SEP PY 1989 VL 8 IS 3 BP 261 EP 277 DI 10.1016/0167-6393(89)90005-8 PG 17 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA AU531 UT WOS:A1989AU53100005 ER PT J AU SCHOENTGEN, J AF SCHOENTGEN, J TI THE RECOGNITION OF SPEECH BY MACHINE - A BIBLIOGRAPHY - HOUSE,AS SO SPEECH COMMUNICATION LA English DT Book Review RP SCHOENTGEN, J (reprint author), FREE UNIV BRUSSELS, INST PHONET, B-1050 BRUSSELS, BELGIUM. CR HOUSE AS, 1988, RECOGNITION SPEECH M NR 1 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD SEP PY 1989 VL 8 IS 3 BP 279 EP 280 DI 10.1016/0167-6393(89)90006-X PG 2 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA AU531 UT WOS:A1989AU53100006 ER PT J AU MILLAR, B AF MILLAR, B TI REPORT ON THE 2ND AUSTRALIAN INTERNATIONAL-CONFERENCE ON SPEECH SCIENCE AND TECHNOLOGY, AND A NEW ASSOCIATION-FOR-SPEECH-SCIENTISTS-AND-TECHNOLOGISTS SO SPEECH COMMUNICATION LA English DT Editorial Material NR 0 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD SEP PY 1989 VL 8 IS 3 BP 281 EP 282 PG 2 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA AU531 UT WOS:A1989AU53100007 ER PT J AU VAISSIERE, J AF VAISSIERE, J TI OBITUARY FOR KLATT,DENNIS SO SPEECH COMMUNICATION LA English DT Biographical-Item RP VAISSIERE, J (reprint author), CTR NATL ETUD TELECOMMUN, F-22301 LANNION, FRANCE. 
NR 0 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1989 VL 8 IS 2 BP 109 EP 110 DI 10.1016/0167-6393(89)90037-X PG 2 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA AJ065 UT WOS:A1989AJ06500001 ER PT J AU STEVENS, KN AF STEVENS, KN TI OBITUARY FOR KLATT,DENNIS SO SPEECH COMMUNICATION LA English DT Biographical-Item RP STEVENS, KN (reprint author), MIT, ELECTR RES LAB, CAMBRIDGE, MA 02139 USA. NR 0 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1989 VL 8 IS 2 BP 110 EP 111 PG 2 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA AJ065 UT WOS:A1989AJ06500002 ER PT J AU KIRIAKOS, B OSHAUGHNESSY, D AF KIRIAKOS, B OSHAUGHNESSY, D TI LEXICAL STRESS DETECTION IN ISOLATED ENGLISH WORDS SO SPEECH COMMUNICATION LA English DT Article C1 UNIV QUEBEC, INST NATL RECH SCI TELECOMMUN, NUNS ISL H3E 1H6, QUEBEC, CANADA. RP KIRIAKOS, B (reprint author), MCGILL UNIV, DEPT ELECT ENGN, MONTREAL H3A 2T5, QUEBEC, CANADA. 
CR ADAMS C, 1978, PHONETICA, V35, P125 Adams C., 1979, ENGLISH SPEECH RHYTH AULL AM, 1985, 1985 INT C AC SPEECH, P1549 BOLINGER D., 1975, ASPECTS LANGUAGE Bolinger Dwight L., 1965, FORMS ENGLISH ACCENT DUMOUCHEL P, 1986, 1986 MONTR S SPEECH, P73 FREIJ GJ, 1988, 1988 INT C AC SPEECH, P135 Lea W., 1980, TRENDS SPEECH RECOGN, P166 Lehiste I., 1970, SUPRASEGMENTALS LIEBERMAN P, 1960, J ACOUST SOC AM, V32, P451, DOI 10.1121/1.1908095 MARCHAL A, 1976, ACCENT INSISTANCE EM, P93 MERMELSTEIN P, 1975, J ACOUST SOC AM, V58, P880, DOI 10.1121/1.380738 RABINER LR, 1975, IEEE T ACOUST SPEECH, V23, P552, DOI 10.1109/TASSP.1975.1162749 RABINER LR, 1975, AT&T TECH J, V54, P297 RABINER LR, 1978, DIGITAL PROCESSING S, P130 ROSSI M, 1978, PHONETICA, V35, P11 Waibel A., 1986, ICASSP 86 Proceedings. IEEE-IECEJ-ASJ International Conference on Acoustics, Speech and Signal Processing (Cat. No.86CH2243-4) NR 17 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1989 VL 8 IS 2 BP 113 EP 124 DI 10.1016/0167-6393(89)90038-1 PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA AJ065 UT WOS:A1989AJ06500003 ER PT J AU VANDIJKKAPPERS, AML MARCUS, SM AF VANDIJKKAPPERS, AML MARCUS, SM TI TEMPORAL DECOMPOSITION OF SPEECH SO SPEECH COMMUNICATION LA English DT Article RP VANDIJKKAPPERS, AML (reprint author), INST PERCEPT RES, 5600 MB EINDHOVEN, NETHERLANDS. CR Ahlbom G., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. 
No.87CH2396-0) ATAL BS, 1983, P ICASSP, V2, P81 BIMBOT F, 1987, 11TH P ICPHS TALLINN, V5, P31 CHOLLET G, 1986, 3RD P EUSIPCO, P365 GERBRANDS JJ, 1981, PATTERN RECOGN, V14, P375, DOI 10.1016/0031-3203(81)90082-0 GOLUB GH, 1983, MATRIX COMPUTATIONS, P16 LAWLEY DN, 1971, FACTOR ANAL STATISTI, P79 LIBERMAN AM, 1967, PSYCHOL REV, V74, P431, DOI 10.1037/h0020279 MARCUS SM, 1984, IPO19 ANN PROGR REP, P25 NIRANJAN M, 1987, P EUROPEAN C SPEECH, P71 VANDIJKKAPPERS AML, 1988, UNPUB SPEECH COMMUNI VANDIJKKAPPERS AML, 1987, IPO22 ANN PROGR REP, P41 van Dijk-Kappers A. M. L., 1988, 7th FASE Symposium. Proceedings Speech '88 VANDIJKKAPPERS AML, 1988, IPO MS652 REP NR 14 TC 16 Z9 16 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1989 VL 8 IS 2 BP 125 EP 135 DI 10.1016/0167-6393(89)90039-3 PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA AJ065 UT WOS:A1989AJ06500004 ER PT J AU BAILLY, G AF BAILLY, G TI INTEGRATION OF RHYTHMIC AND SYNTACTIC CONSTRAINTS IN A MODEL OF GENERATION OF FRENCH PROSODY SO SPEECH COMMUNICATION LA English DT Article C1 UNIV QUEBEC, INST NATL RECH SCI TELECOMMUN, ILE DES SOEURS, QUEBEC, CANADA. RP BAILLY, G (reprint author), CNRS, INST COMMUN PARLEE, F-38031 GRENOBLE, FRANCE. CR BAILLY G, 1983, THESIS GRENOBLE Bailly G., 1988, 7th FASE Symposium. Proceedings Speech '88 BAILLY G, 1986, 15EMES JOURN ET PAR, P75 Bailly G., 1986, ICASSP 86 Proceedings. IEEE-IECEJ-ASJ International Conference on Acoustics, Speech and Signal Processing (Cat. No.86CH2243-4) BARTKOVA K, 1985, 14EMES JOURN ET PAR, P188 BARTKOVA K, 1987, SPEECH COMMUN, V6, P245, DOI 10.1016/0167-6393(87)90029-X CAELENHAUMONT G, 1986, P MONTREAL S SPEECH, P82 Chomsky N., 1968, SOUND PATTERN ENGLIS Cooper W. 
E., 1980, SYNTAX SPEECH DAUER RM, 1983, J PHONETICS, V11, P51 DICRISTO A, 1978, THESIS U PROVENCE AI EMERARD F, 1977, RECHERCHES PROSODIE, P131 FARNETANI E, 1986, SPEECH COMMUN, V5, P17, DOI 10.1016/0167-6393(86)90027-0 FOWLER CA, 1982, SR7172 HASK LAB STAT, P1 FUJISAKI H, 1972, IEEE P ACOUST SPEECH, P140 GEE JP, 1983, COGNITIVE PSYCHOL, V15, P411, DOI 10.1016/0010-0285(83)90014-2 HAYS DG, 1964, LANGUAGE, V46, P511 HENDERSON A, 1966, LANG SPEECH, V10, P122 Hirst D. J., 1983, PROSODY MODELS MEASU, P93 LARREUR D, 1973, B I PHONETIQUE, P103 Lehiste I., 1976, J PHONETICS, V4, P113 LJOLJE A, 1986, IEEE T ACOUST SPEECH, V34, P1074, DOI 10.1109/TASSP.1986.1164948 MAEDA S, 1974, KTH114 Q PROGR REP, P193 MARTIN P, 1986, 15EM JEP AIX EN PROV, P89 MARTIN P, 1976, 7EMES JOURN ET PAR G, P207 NAKATANI LH, 1978, J ACOUST SOC AM, V63, P234, DOI 10.1121/1.381719 OSHAUGHNESSY D, 1984, SPEECH COMMUN, V3, P233, DOI 10.1016/0167-6393(84)90018-9 OHMAN S, 1967, STL2 Q PROGR STAT RE, P20 OSHAUGHNESSY D, 1981, J PHONETICS, V9, P385 PIERREHUMBERT J, 1981, J ACOUST SOC AM, V70, P985, DOI 10.1121/1.387033 POWER MJ, 1983, LANG SPEECH, V26, P253 SAINTBONNET M, 1977, 8EM P JOURN ET PAR A, P337 SCHWARTZ J, 1968, LANG SPEECH, V11, P27 SORIN C, 1987, P INT C PHON SCI TAL, V1, P127 STUDDERTKENNEDY M, 1971, SPEECH RES, V27, P153 VAISSIERE J, 1971, THESIS U LANGUES LET VAISSIERE J, 1974, KTH114 Q PROGR STAT, P212 Vaissiere Jacqueline, 1983, PROSODY MODELS MEASU, P53 NR 38 TC 13 Z9 13 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUN PY 1989 VL 8 IS 2 BP 137 EP 146 DI 10.1016/0167-6393(89)90040-X PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA AJ065 UT WOS:A1989AJ06500005 ER PT J AU CHILDERS, DG WU, K HICKS, DM YEGNANARAYANA, B AF CHILDERS, DG WU, K HICKS, DM YEGNANARAYANA, B TI VOICE CONVERSION SO SPEECH COMMUNICATION LA English DT Article C1 UNIV FLORIDA, DEPT SPEECH, GAINESVILLE, FL 32611 USA. INDIAN INST TECHNOL, DEPT COMP SCI, MADRAS 600036, TAMIL NADU, INDIA. RP CHILDERS, DG (reprint author), UNIV FLORIDA, DEPT ELECT ENGN, GAINESVILLE, FL 32611 USA. CR ATAL BS, 1971, J ACOUST SOC AM, V50, P637, DOI 10.1121/1.1912679 CHENG YM, 1987, LARYNGEAL FUNCTION P, P219 CHILDERS DG, 1987, IEEE T ACOUST SPEECH, P293 CHILDERS DG, 1985, P IEEE INT C ACOUST CHILDERS DG, 1983, JUL P STOCKH MUS AC, V1, P125 Childers D. G., 1985, Mathematics and computers in biomedical applications CHILDERS DG, 1989, UNPUB SPEECH COMMUNI CHILDERS DG, 1984, IEEE T BIO-MED ENG, V31, P807, DOI 10.1109/TBME.1984.325242 CHILDERS DG, 1983, 10TH INT C PHON SCI, P833 Childers D. 
G., 1987, Computers in Mechanical Engineering, V6 CHILDERS DG, 1985, VOICE I O SYSTEMS AP, P349 CHILDERS DG, 1987, P EUROPEAN C SPEECH, V2, P488 CHILDERS DG, 1988, UNPUB IEEE T ACOUST FANT G, 1979, GLOTTAL SOURCE EXCIT, P85 Fant G., 1985, 4 PARAMETER MODEL GL, P1 Flanagan J., 1972, SPEECH ANAL SYNTHESI JUANG BH, 1984, AT&T TECH J, V63, P1477 KRISHNAMURTHY AK, 1986, IEEE T ACOUST SPEECH, V34, P730, DOI 10.1109/TASSP.1986.1164909 KUWABARA H, 1984, SPEECH COMMUN, V3, P211, DOI 10.1016/0167-6393(84)90016-5 LADD DR, 1985, J ACOUST SOC AM, V78, P435, DOI 10.1121/1.392466 NAIK JM, 1984, THESIS U FLORIDA PINTO NB, 1989, IN PRESS IEEE T ACOU ROSSON MB, 1986, P HUMAN FACTORS COMP, P192, DOI 10.1145/22627.22370 SENEFF S, 1982, IEEE T ACOUST SPEECH, V30, P566, DOI 10.1109/TASSP.1982.1163919 WONG DY, 1980, IEEE T ACOUST SPEECH, P208 WU K, 1985, THESIS U FLORIDA YEA JJ, 1983, THESIS U FLORIDA Yegnanarayana B., 1984, 10th International Conference on Computational Linguistics. 22nd Annual Meeting of the Association for Computational Linguistics. Proceedings of Coling 84 NR 28 TC 26 Z9 28 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1989 VL 8 IS 2 BP 147 EP 158 DI 10.1016/0167-6393(89)90041-1 PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA AJ065 UT WOS:A1989AJ06500006 ER PT J AU ROMARY, L PIERREL, JM AF ROMARY, L PIERREL, JM TI THE USE OF THE DEMPSTER-SHAFER RULE IN THE LEXICAL COMPONENT OF A MAN-MACHINE ORAL DIALOG SYSTEM SO SPEECH COMMUNICATION LA English DT Article C1 INRIA, CTR RECH INFORMAT NANCY, NANCY, FRANCE. RP ROMARY, L (reprint author), SUPELEC METZ, METZ, FRANCE. 
CR BARNETT JA, 1983, 8TH P IJCAI KARLSR, P868 BEROULE D, 1984, 4TH P C AFCET RFIA P, P53 BORNERAND S, 1988, 17EME P JEP NANC, P61 CARBONELL N, 1988, STRUCTURE MULTIMODAL CHAILLOUX J, 1986, MANUEL REFERENCE CHEESEMAN P, 1985, 9TH P INT JOINT C AR, P1002 DEMORI R, 1983, COMPUTER MODELS SPEE DENHEYER K, 1985, J MEM LANG, V24, P699, DOI 10.1016/0749-596X(85)90054-3 DEVILLE G, 1989, THESIS U INSTELLING DEVILLE G, 1986, THESIS ANTWERPEN DEVILLE G, 1987, 6TH P C AFCET RFIA A, P157 Dubois D., 1980, FUZZY SETS SYSTEMS ERMAN LD, 1980, COMPUTING SURVEYS, V12 GINSBERG ML, 1984, P NAT C ARTIFICIAL I, P126 HATON JP, 1977, P IEEE ICASSP77 DALL, P807 Hudson R., 1984, WORD GRAMMAR LAASRI H, 1988, 8TH P ECAI MUN, P5 Lowerre B., 1980, TRENDS SPEECH RECOGN, P340 LU SY, 1984, P AAAI 84, P216 MARIANI JJ, 1978, P C AFCET RFIA PARIS, P169 MERIALDO B, 1986, P ICASSP 86 TOKYO MOON DA, 1986, SEP C OBJ OR PROGR S, P1 PIERREL J. M., 1987, DIALOGUE ORAL HOMME PIERREL JM, 1979, 4TH P IJCPR KYOT PIERREL JM, 1983, TECHNOLOGY SCI INFOR, V1, P329 PRADE H, 1982, THESIS TOULOUSE ROMARY L, 1988, 17EME P JEP NANC, P168 Shafer G., 1976, MATH THEORY EVIDENCE SHAFER G, 1987, ARTIF INTELL, V33, P271, DOI 10.1016/0004-3702(87)90040-3 SIROUX J, 1985, SPEECH COMMUN, V4, P289, DOI 10.1016/0167-6393(85)90056-1 Touretzky D. S., 1986, MATH INHERITANCE SYS TYLER LK, 1986, J MEM LANG, V25, P741, DOI 10.1016/0749-596X(86)90047-1 WOLF JJ, 1980, TRENDS SPEECH RECOGN, P316 ZADEH LA, 1965, INFORM CONTROL, V8, P338, DOI 10.1016/S0019-9958(65)90241-X NR 34 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUN PY 1989 VL 8 IS 2 BP 159 EP 176 DI 10.1016/0167-6393(89)90042-3 PG 18 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA AJ065 UT WOS:A1989AJ06500007 ER PT J AU SABIN, MJ AF SABIN, MJ TI FIXED-SHAPE ADAPTIVE-GAIN VECTOR QUANTIZATION FOR SPEECH WAVEFORM CODING SO SPEECH COMMUNICATION LA English DT Note C1 UNIV CALIF BERKELEY, BERKELEY, CA 94720 USA. RP SABIN, MJ (reprint author), CYLINK CORP, SUNNYVALE, CA 94086 USA. CR CHEN JH, 1985, 1985 IEEE C COMM REC, V3, P1456 CUPERMAN V, 1982, 1982 IEEE GLOB TEL C, P1092 JAYANT NS, 1973, BELL SYST TECH J, V52, P1105 SABIN MJ, 1984, IEEE T ACOUST SPEECH, V32, P474, DOI 10.1109/TASSP.1984.1164346 NR 4 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUN PY 1989 VL 8 IS 2 BP 177 EP 183 DI 10.1016/0167-6393(89)90043-5 PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA AJ065 UT WOS:A1989AJ06500008 ER PT J AU REPP, BH AF REPP, BH TI TRAVERSING UPPER VOWEL SPACE - A SMOOTH OR A BUMPY RIDE SO SPEECH COMMUNICATION LA English DT Article RP REPP, BH (reprint author), HASKINS LABS, NEW HAVEN, CT 06511 USA. 
CR CHISTOVICH LA, 1966, 2 R I TECHN SPEECH, P1 HEALY AF, 1982, J EXP PSYCHOL HUMAN, V8, P68, DOI 10.1037/0096-1523.8.1.68 KENT RD, 1973, PHONETICA, V28, P1 LINDBLOM BE, 1971, J ACOUST SOC AM, V50, P1166, DOI 10.1121/1.1912750 MACMILLAN NA, 1987, PSYCHOPHYSICS SPEECH, P28 PERKELL JS, 1985, J ACOUST SOC AM, V77, P1889, DOI 10.1121/1.391940 PETERSON GE, 1952, J ACOUST SOC AM, V24, P175, DOI 10.1121/1.1906875 PISONI DB, 1973, PERCEPT PSYCHOPHYS, V13, P253, DOI 10.3758/BF03214136 PISONI DB, 1980, PHONETICA, V37, P285 REPP BH, 1987, SPEECH COMMUN, V6, P1, DOI 10.1016/0167-6393(87)90065-3 REPP BH, 1985, SPEECH COMMUN, V4, P105, DOI 10.1016/0167-6393(85)90039-1 REPP BH, 1979, J EXP PSYCHOL HUMAN, V5, P129, DOI 10.1037//0096-1523.5.1.129 SCHOUTEN MEH, 1988, J PHONETICS, V5, P273 Stevens KN, 1972, HUMAN COMMUNICATION, P51 SYRDAL AK, 1986, J ACOUST SOC AM, V79, P1086, DOI 10.1121/1.393381 NR 15 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD MAR PY 1989 VL 8 IS 1 BP 1 EP 15 DI 10.1016/0167-6393(89)90063-0 PG 15 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA U4752 UT WOS:A1989U475200001 ER PT J AU VANDENBERG, RJH AF VANDENBERG, RJH TI PERCEPTION OF VOICING IN DUTCH 2-OBSTRUENT SEQUENCES - COVARIATION OF VOICING CUES SO SPEECH COMMUNICATION LA English DT Article RP VANDENBERG, RJH (reprint author), CATHOLIC UNIV NIJMEGEN, INST PHONET, NIJMEGEN, NETHERLANDS. 
CR Booij Geert, 1981, GENERATIEVE FONOLOGI CRYSTAL A, 1980, 1ST DICT LINGUISTICS GOODMAN LA, 1973, BIOMETRIKA, V60, P179, DOI 10.2307/2334920 GOODMAN LA, 1971, J AM STAT ASSOC, V66, P339, DOI 10.2307/2283933 LISKER L, 1986, LANG SPEECH, V29, P3 MANN VA, 1980, PERCEPT PSYCHOPHYS, V28, P213, DOI 10.3758/BF03204377 MASSARO DW, 1980, J ACOUST SOC AM, V67, P996, DOI 10.1121/1.383941 MILLER JL, 1977, J SPEECH HEAR RES, V20, P519 ODEN GC, 1978, PSYCHOL REV, V85, P172, DOI 10.1037/0033-295X.85.3.172 Pisoni D., 1974, J PHONETICS, V2, P181 SLIS, 1985, THESIS U NIJMEGEN SLIS IH, 1986, J PHONETICS, V14, P311 SLIS IH, 1982, GLOT, V5, P235 SLIS IH, 1987, HONOUR ILSE LEHISTE, P225 VANDENBERG RJH, 1986, SPEECH COMMUN, V5, P355, DOI 10.1016/0167-6393(86)90018-X VANDENBERG RJH, 1987, J PHONETICS, V15, P259 VANDENBERG RJH, 1987, J PHONETICS, V15, P39 ZONNEVELD T, 1979, INLEIDING GENERATIEV NR 18 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD MAR PY 1989 VL 8 IS 1 BP 17 EP 25 DI 10.1016/0167-6393(89)90064-2 PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA U4752 UT WOS:A1989U475200002 ER PT J AU PALIWAL, KK AF PALIWAL, KK TI A STUDY OF LINE SPECTRUM PAIR FREQUENCIES FOR VOWEL RECOGNITION SO SPEECH COMMUNICATION LA English DT Article RP PALIWAL, KK (reprint author), TATA INST FUNDAMENTAL RES, COMP SYST & COMMUNICAT GRP, BOMBAY 400005, INDIA. CR Applebaum T. H., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) Duda R. O., 1973, PATTERN CLASSIFICATI GRAY AH, 1976, IEEE T ACOUST SPEECH, V24, P380, DOI 10.1109/TASSP.1976.1162849 Hanson B. A., 1986, ICASSP 86 Proceedings. IEEE-IECEJ-ASJ International Conference on Acoustics, Speech and Signal Processing (Cat. 
No.86CH2243-4) HANSON BA, 1987, IEEE T ACOUST SPEECH, V35, P968, DOI 10.1109/TASSP.1987.1165241 HERMANSKY H, 1987, IEEE T ACOUST SPEECH, P1159 Hermansky H., 1986, ICASSP 86 Proceedings. IEEE-IECEJ-ASJ International Conference on Acoustics, Speech and Signal Processing (Cat. No.86CH2243-4) ITAKURA, 1987, IEEE T ACOUST SPEECH, P1257 ITAKURA, 1975, J ACOUST SOC AM, V57, pS35 JUANG BH, 1987, IEEE T ACOUST SPEECH, V35, P947 Juang B. H., 1986, ICASSP 86 Proceedings. IEEE-IECEJ-ASJ International Conference on Acoustics, Speech and Signal Processing (Cat. No.86CH2243-4) KAHN D, 1987, 298 PAP Kang G. S., 1985, ICASSP 85. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (Cat. No. 85CH2118-8) PALIWAL KK, 1988, USE LSP REPRESENTATI Paliwal K. K., 1982, Speech Communication, V1, DOI 10.1016/0167-6393(82)90034-6 PALIWAL KK, 1984, SPEECH COMMUN, V3, P101, DOI 10.1016/0167-6393(84)90012-8 PALIWAL KK, 1982, J ACOUST SOC AM, V71, P1016, DOI 10.1121/1.387653 PALIWAL KK, 1982, SIGNAL PROCESS, V4, P323, DOI 10.1016/0165-1684(82)90008-1 SOONG FK, 1984, P IEEE INT C ACOUST Tohkura Y., 1986, ICASSP 86 Proceedings. IEEE-IECEJ-ASJ International Conference on Acoustics, Speech and Signal Processing (Cat. No.86CH2243-4) TOUSSAIN.GT, 1974, IEEE T INFORM THEORY, V20, P472, DOI 10.1109/TIT.1974.1055260 Wakita H., 1981, Speech Technology, V1 NR 22 TC 4 Z9 5 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD MAR PY 1989 VL 8 IS 1 BP 27 EP 33 DI 10.1016/0167-6393(89)90065-4 PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA U4752 UT WOS:A1989U475200003 ER PT J AU COHEN, A FROIND, I AF COHEN, A FROIND, I TI ON TEXT INDEPENDENT SPEAKER IDENTIFICATION USING A QUADRATIC CLASSIFIER WITH OPTIMAL FEATURES SO SPEECH COMMUNICATION LA English DT Article RP COHEN, A (reprint author), BEN GURION UNIV NEGEV, DEPT ELECT & COMP ENGN, IL-84105 BEERSHEBA, ISRAEL. CR ATAL BS, 1976, P IEEE, V64, P460, DOI 10.1109/PROC.1976.10155 ATAL BS, 1976, IEEE T ACOUST SPEECH, V24, P201, DOI 10.1109/TASSP.1976.1162800 BIDSARIA HB, 1987, PATTERN RECOGN, V20, P365 CHEUNG RS, 1978, IEEE T ACOUST SPEECH, V26, P397, DOI 10.1109/TASSP.1978.1163142 COHEN A, 1982, 4TH P C CAD CAM TEL COHEN A, 1986, UNPUB REAL TIME FEAT COHEN A, 1985, ANTITERRORISM FORENS, P246 DANTE HM, 1979, IEEE T ACOUST SPEECH, V27, P225 FUKUNAGA K, 1972, INTRO STATISTICAL PA FURUI S, 1986, SPEECH COMMUN, V5, P183, DOI 10.1016/0167-6393(86)90007-5 FURUI S, 1981, IEEE T ACOUST SPEECH, V29, P342, DOI 10.1109/TASSP.1981.1163605 FURUI S, 1973, ELECTRON COMMUN JPN, V56, P62 KALAYEH HM, 1983, IEEE T GEOSCI REMOTE, V21, P434, DOI 10.1109/TGRS.1983.350504 Kuhn M. 
H., 1980, Proceedings of the 1980 Carnahan Conference on Crime Countermeasures MAKHOUL JD, 1977, P IEEE, V63, P561 MARKEL JD, 1977, IEEE T ACOUST SPEECH, V25, P330, DOI 10.1109/TASSP.1977.1162961 MARKEL JD, 1972, IEEE T ACOUST SPEECH, VAU20, P367, DOI 10.1109/TAU.1972.1162410 MORGERA SD, 1987, IEEE T PATTERN ANAL, V9, P29 MORGERA SD, 1984, IEEE T PATTERN ANAL, V6, P601 RABINER LR, 1978, DIGITA PROCESSING SP, P96 ROSENBERG AE, 1976, P IEEE, V64, P475, DOI 10.1109/PROC.1976.10156 SAKOE H, 1978, IEEE T ACOUST SPEECH, V26, P43, DOI 10.1109/TASSP.1978.1163055 SHRIDHAR M, 1979, SPEECH COMMUN, V1, P103 Shridhar M., 1982, Speech Communication, V1, DOI 10.1016/0167-6393(82)90019-X NR 24 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD MAR PY 1989 VL 8 IS 1 BP 35 EP 44 DI 10.1016/0167-6393(89)90066-6 PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA U4752 UT WOS:A1989U475200004 ER PT J AU SAVOJI, MH AF SAVOJI, MH TI A ROBUST ALGORITHM FOR ACCURATE ENDPOINTING OF SPEECH SIGNALS SO SPEECH COMMUNICATION LA English DT Article RP SAVOJI, MH (reprint author), BRITISH TELECOM RES LABS, IPSWICH IP5 7RE, SUFFOLK, ENGLAND. CR ATAL BS, 1976, IEEE T ACOUST SPEECH, V24, P201, DOI 10.1109/TASSP.1976.1162800 BRUNO G, 1987, IEEE T ACOUST SPEECH, V35, P556, DOI 10.1109/TASSP.1987.1165169 CHO DH, 1982, NOV P IEEE GLOB TEL, P1340 Das S. 
K., 1978, IBM Technical Disclosure Bulletin, V21 DESOUZA P, 1983, IEEE T ACOUST SPEECH, V31, P678, DOI 10.1109/TASSP.1983.1164129 DONVITO MB, 1985, MAR P INT C AC SPEEC, P1433 FLETCHER IG, 1986, PROC INS AC, V8, P111 HALTSONEN, 1984, MAR P INT C ACOUST S Kitamura T., 1979, Transactions of the Institute of Electronics and Communication Engineers of Japan, Section E (English), VE62 LAMEL LF, 1981, IEEE T ACOUST SPEECH, V29, P777, DOI 10.1109/TASSP.1981.1163642 LYNCH JF, 1987, APR IEEE INT C AC SP, P1348 MWANGI E, 1985, OCT P MED EL C MELEC, P123 NEY H, 1981, MAR P INT C AC SPEEC, P720 RABINER LR, 1977, IEEE T ACOUST SPEECH, V25, P338, DOI 10.1109/TASSP.1977.1162964 RABINER LR, 1975, AT&T TECH J, V54, P297 RAMAMOORTHY V, 1980, APR P INT C AC SPEEC, P57 SARMA VVS, 1978, APR P INT C AC SPEEC, P1 SAVOJI MH, 1987, SEP P EUR C SPEECH T, P325 TSAO C, 1984, MAR IEEE INT C AC SP WILPON JG, 1984, AT&T TECH J, V63, P479 NR 20 TC 41 Z9 44 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD MAR PY 1989 VL 8 IS 1 BP 45 EP 60 DI 10.1016/0167-6393(89)90067-8 PG 16 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA U4752 UT WOS:A1989U475200005 ER PT J AU SCHOENTGEN, J AF SCHOENTGEN, J TI JITTER IN SUSTAINED VOWELS AND ISOLATED SENTENCES PRODUCED BY DYSPHONIC SPEAKERS SO SPEECH COMMUNICATION LA English DT Article RP SCHOENTGEN, J (reprint author), UNIV BRUSSELS, INST PHONET, B-1050 BRUSSELS, BELGIUM. 
CR ASKENFELT A, 1980, VOICE ANAL DEPRESSED, P71 ASKENFELT A, 1981, SPEECH WAVEFORM PERT, P40 ASKENFELT A, 1980, SPEECH WAVEFORM PERT, P40 BLESSER BA, 1978, J AUDIO ENG SOC, V26, P739 DAVIS S, 1978, SR54 HASK LAB STAT R, P133 DAVIS S, 1979, SPEECH LANGUAGE ADV, V1, P273 DAVIS S, 1976, MONOGRAPH SPEECH COM, V13 DAVIS SB, 1981, SPEECH EVALUATION ME, P77 FANT G, 1981, 11981 STLQPSR, P21 GECKINLI NC, 1981, SIGNAL PROCESS, V3, P49, DOI 10.1016/0165-1684(81)90064-5 GOLD B, 1969, J ACOUST SOC AM, V46, P442, DOI 10.1121/1.1911709 GUBRYNOWICZ R, 1983, COMMUNICATION GUBRYNOWICZ R, 1981, 4TH P S FED AC SOC E, P131 GUBRYNOWICZ R, 1977, 8TH P JOURN ET PAR, P21 Gubrynowicz R., 1980, Archives of Acoustics, V5 GUERIN B, 1980, PHONETICA, V37, P169 HECKER MHL, 1971, J ACOUST SOC AM, V49, P1275, DOI 10.1121/1.1912490 HESS W, 1982, PITCH DETERMINATION HILLER S, 1983, WORK PROGR, V16, P40 HILLER SM, 1984, WORK PROGR, V13, P59 Hirano M, 1981, CLIN EXAMINATION VOI Hollien H., 1973, J PHONETICS, V1, P85 HORII Y, 1982, J SPEECH HEAR RES, V25, P12 HORII Y, 1979, J SPEECH HEAR RES, V22, P5 HORII Y, 1975, J SPEECH HEAR RES, V18, P192 Imaizumi S., 1985, ANN B RES I LOGOPEDI, V19, P179 ISHIZAKA K, 1976, J ACOUST SOC AM, V60, P1193, DOI 10.1121/1.381221 ISSHIKI N, 1972, STUDIA PHONOLOGICA, V6, P39 JOSPA P, 1982, 5TH P S FED AC SOC E, P993 JOSPA P, 1982, 17 ACT I PHON U LIBR, P89 JOSPA P, 1984, 13TH P JOURN ET PAR, P161 Kacprowski J., 1979, Archives of Acoustics, V4 KAHN M, 1983, IEEE T ACOUST SPEECH, P531 KASUYA K, 1983, IEEE T ACOUST SPEECH, P344 KASUYA K, 1986, SPEECH COMMUN, V2, P171 KITAJIMA K, 1975, STUDIA PHONOL KYOTO, V9, P25 KLINGHOLZ F, 1985, J SPEECH HEAR RES, V28, P169 KOIKE Y, 1975, ANN OTO RHINOL LARYN, V84, P117 KOIKE Y, 1977, ACTA OTO-LARYNGOL, V84, P105, DOI 10.3109/00016487709123948 Koike Y., 1973, STUDIA PHONOLOGICA, V7, P17 LAVER J, 1985, WORK PROGR U EDINBUR, V18, P1 LAVER J, 1982, IEEE T ACOUST SPEECH, P192 LEBRUN Y, 1971, J LARYNGOL OTOL, V1, P43 LIEBERMAN P, 1963, J ACOUST 
SOC AM, V35, P344, DOI 10.1121/1.1918465 LUDLOW C, 1985, 4TH INT VOC FOLD PHY MACKENZIE J, 1984, WORK PROGR U EDINBUR, V17, P98 MAKHOUL J, 1975, P IEEE, V63, P561, DOI 10.1109/PROC.1975.9792 MCCLELLAN JH, 1979, FIR LINEAR PHASE DES MONSEN RB, 1977, J ACOUST SOC AM, V62, P981, DOI 10.1121/1.381593 NIEDERJOHN RJ, 1985, IEEE T ACOUST SPEECH, V33, P349, DOI 10.1109/TASSP.1985.1164571 Rabiner L.R., 1978, DIGITAL PROCESSING S RAMIG LA, 1983, J SPEECH HEAR RES, V26, P22 RASCH RA, 1983, 10TH P INT C PHON SC, P288 ROSSI M, 1981, J PHONETICS, V9, P233 SCHOENTGEN J, 1988, APPLIED STOCHASTIC M, V4, P127, DOI 10.1002/asm.3150040207 Schoentgen J., 1982, Speech Communication, V1, DOI 10.1016/0167-6393(82)90020-6 SCHOENTGEN J, 1985, THESIS FREE U BRUSSE SERNICALES W, 1984, COMMUNICATION Siegel S., 1956, NONPARAMETRIC STATIS Wong D. Y., 1980, ICASSP 80 Proceedings. IEEE International Conference on Acoustics, Speech and Signal Processing NR 60 TC 17 Z9 17 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD MAR PY 1989 VL 8 IS 1 BP 61 EP 79 DI 10.1016/0167-6393(89)90068-X PG 19 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA U4752 UT WOS:A1989U475200006 ER PT J AU DELIEGE, RJH AF DELIEGE, RJH TI AN EXPERIMENTAL DUTCH KEYBOARD-TO-SPEECH SYSTEM FOR THE SPEECH IMPAIRED SO SPEECH COMMUNICATION LA English DT Note RP DELIEGE, RJH (reprint author), INST PERCEPTIE ONDERZOEK, INST PERCEPT RES, EINDHOVEN, NETHERLANDS. 
CR 't Hart J., 1975, J PHONETICS, V3, P235 COHEN A, 1982, PHONETICA, V39, P254 CORSTIUS HB, 1965, F LANGUAGE JAN, P59 ELSENDOORN BAG, 1985, IPO20 ANN PROGR REP, P33 ELSENDOORN BAG, 1982, IPR17 ANN PROGR REP, P63 ELSENDOORN BAG, 1984, IPO19 ANN PROGR REP, P32 ESTES SE, 1964, IBM J RES DEV, V8, P2 GALYAS K, 1987, P EUROPEAN C SPEECH, P357 GREENE BG, 1986, BEHAV RES METH INSTR, V18, P100, DOI 10.3758/BF03201008 Kerkhoff J., 1984, LINGUISTICS NETHERLA, P111 t'Hart J., 1973, J PHONETICS, V1, P309 TENHAVE M, 1985, 85 OFF P SPEECH TECH, P253 VANBEZOOIJEN R, 1987, P EUROPEAN C SPEECH, P183 Van Bruck H. E., 1982, Electronic Components & Applications, V4 VANKATWIJK A, 1965, F LANGUAGE JAN, P51 WARRICK A, 1977, JUN P WORKSH COMM AI, P120 NR 16 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD MAR PY 1989 VL 8 IS 1 BP 81 EP 89 DI 10.1016/0167-6393(89)90069-1 PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA U4752 UT WOS:A1989U475200007 ER PT J AU NIEMANN, H SCHUKATTALAMAZZINI, EG AF NIEMANN, H SCHUKATTALAMAZZINI, EG TI SPECIAL ISSUE ON WORD RECOGNITION IN LARGE VOCABULARIES SO SPEECH COMMUNICATION LA English DT Editorial Material RP NIEMANN, H (reprint author), UNIV ERLANGEN NURNBERG, DEPT COMP SCI, DIV PATTERN RECOGNIT, D-8520 ERLANGEN, FED REP GER. NR 0 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1988 VL 7 IS 4 BP 333 EP 334 DI 10.1016/0167-6393(88)90048-9 PG 2 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA R4701 UT WOS:A1988R470100001 ER PT J AU DEMORI, R CARDIN, R MERLO, E PALAKAL, M ROUAT, J AF DEMORI, R CARDIN, R MERLO, E PALAKAL, M ROUAT, J TI A NETWORK OF ACTIONS FOR AUTOMATIC SPEECH RECOGNITION SO SPEECH COMMUNICATION LA English DT Article C1 CTR RECH INFORMAT MONTREAL, MONTREAL, QUEBEC, CANADA. RP DEMORI, R (reprint author), MCGILL UNIV, SCH COMP SCI, MONTREAL H3A 2K6, QUEBEC, CANADA. CR BAHL LR, 1983, IEEE T PATTERN ANAL, V5, P179 BAIRD HS, 1986, 1986 NAT ADV RES WOR Baum L. E., 1972, INEQUALITIES, V3, P1 Cole R. A., 1983, Proceedings of ICASSP 83. IEEE International Conference on Acoustics, Speech and Signal Processing DEMICHELIS P, 1983, IEEE T ACOUST SPEECH, V31, P359, DOI 10.1109/TASSP.1983.1164067 DEMORI R, 1987, IEEE T PATTERN ANAL, V9, P289 DEMORI R, 1979, IEEE T ACOUST SPEECH, V27, P538, DOI 10.1109/TASSP.1979.1163281 DEMORI R, 1985, IEEE T PATTERN ANAL, V39, P1 ERMAN LD, 1980, ACM COMPUT SURV, V12, P213, DOI 10.1145/356810.356816 Ferguson J. D., 1980, P S APPL HIDD MARK M, P143 Fu K.S., 1982, SYNTACTIC PATTERN RE HARALICK RM, COMMUNICATION JELINEK F, 1984, P IEEE NOV, P1616 KLATT DH, 1977, J ACOUST SOC AM, V62, P1345, DOI 10.1121/1.381666 KOPEC GE, 1985, IEEE T ACOUST SPEECH, V33, P850, DOI 10.1109/TASSP.1985.1164652 Larar J. N., 1986, Computer Speech and Language, V1, DOI 10.1016/S0885-2308(86)80010-9 LEE KF, 1986, 1986 P INT C AC SPEE, P77 LEVINSON SE, 1985, P IEEE, V73, P1625, DOI 10.1109/PROC.1985.13344 MERLO E, 1986, 1986 P INT C AC SPEE, P1597 NI HP, 1982, ARTIF INTELL, V3, P23 PALAKAL M, UNPUB AUTOMATIC RECO ROUAT J, UNPUB AUTOMATIC RECO Shapiro L. G., 1986, Eighth International Conference on Pattern Recognition. Proceedings (Cat. 
No.86CH2342-4) SHICHMAN G, 1986, 1986 P INT C AC SPEE, P53 SHIPMAN DW, 1982, 1982 P INT C AC SPEE, P546 STEVENS KN, 1980, J ACOUST SOC AM, V68, P836, DOI 10.1121/1.384823 WALDINGER R, 1977, MACH INTELL, P8 ZUE VW, 1985, P IEEE, V73, P1602, DOI 10.1109/PROC.1985.13342 NR 28 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1988 VL 7 IS 4 BP 337 EP 353 DI 10.1016/0167-6393(88)90050-7 PG 17 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA R4701 UT WOS:A1988R470100002 ER PT J AU FISSORE, L MICCA, G PIERACCINI, R LAFACE, P AF FISSORE, L MICCA, G PIERACCINI, R LAFACE, P TI STRATEGIES FOR LEXICAL ACCESS TO VERY LARGE VOCABULARIES SO SPEECH COMMUNICATION LA English DT Article C1 UNIV SALERNO, DIPARTIMENTO INFORMAT & APPLICAZ, I-84100 SALERNO, ITALY. POLITECN TORINO, CENS, I-10128 TURIN, ITALY. RP FISSORE, L (reprint author), CTR STUDI & LAB TELECOMMUN, TORINO, ITALY. CR BILL R, 1986, P INT C ACOUST SPEEC Carter D. M., 1987, Computer Speech and Language, V2, DOI 10.1016/0885-2308(87)90023-4 CRAVERO M, 1986, P INT C ACOUST SPEEC DORTA P, 1987, P INT C ACOUST SPEEC Fissore L., 1988, ICASSP 88: 1988 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.88CH2561-9), DOI 10.1109/ICASSP.1988.196606 FISSORE L, 1988, P NATO ASI SERIES RE, V46, P235 FISSORE L, 1988, INT C ACOUST SPEECH, P229 Fissore L., 1988, ICASSP 88: 1988 International Conference on Acoustics, Speech, and Signal Processing (Cat. 
No.88CH2561-9), DOI 10.1109/ICASSP.1988.196549 GIORDANA A, 1986, INT J MAN MACH STUD, P453 GUPTA VN, 1987, P INT C ACOUST SPEEC HUTTENLOCHER DP, 1984, P INT C ACOUST SPEEC JELINEK F, 1976, P IEEE, V64, P532, DOI 10.1109/PROC.1976.10159 KALTENMEIER A, 1986, P MONTREAL S SPEECH, P95 KANEKO T, 1983, IEEE T ACOUST SPEECH, V31, P1061, DOI 10.1109/TASSP.1983.1164211 KOHONEN T, 1984, INFORMATION SCI, P3 LAFACE P, 1987, P INT C ACOUST SPEEC LAGGER H, 1985, P INT C ACOUST SPEEC LEVINSON SE, 1983, AT&T TECH J, V62, P1035 MICCA G, 1987, P INT C DIGITAL SIGN, P547 PISONI DB, 1985, SPEECH COMMUN, V4, P75, DOI 10.1016/0167-6393(85)90037-8 SCHUKATTALAMAZZ.G, 1986, P INT C ACOUST SPEEC Shipman D. W., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing WAIBEL A, 1987, P INT C ACOUST SPEEC ZUE VW, 1985, P IEEE, V73, P1602, DOI 10.1109/PROC.1985.13342 NR 24 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1988 VL 7 IS 4 BP 355 EP 366 DI 10.1016/0167-6393(88)90051-9 PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA R4701 UT WOS:A1988R470100003 ER PT J AU NEY, H PAESELER, A AF NEY, H PAESELER, A TI PHONEME-BASED CONTINUOUS SPEECH RECOGNITION RESULTS FOR DIFFERENT LANGUAGE MODELS IN THE 1000-WORD SPICOS SYSTEM SO SPEECH COMMUNICATION LA English DT Article RP NEY, H (reprint author), PHILIPS GMBH, FORSCHUNGSLAB, HAMBURG, FED REP GER. 
CR BAHL LR, 1988, 1988 P IEEE INT C AC, P497 BAKER JK, 1975, SPEECH RECOGNITION, P512 JELINEK F, 1976, P IEEE, V64, P532, DOI 10.1109/PROC.1976.10159 KATZ SM, 1987, IEEE T ACOUST SPEECH, V35, P400, DOI 10.1109/TASSP.1987.1165125 KUBALA F, 1988, 1988 P IEEE INT C AC, P291 LEE KF, 1988, 1988 P IEEE INT C AC, P123 MERGEL D, 1987, 1987 P IEEE INT C AC NEY H, 1987, 1987 P IEEE INT C AC NEY H, 1984, IEEE T ACOUST SPEECH, V32, P263, DOI 10.1109/TASSP.1984.1164320 NEY H, 1988, 1988 P IEEE INT C AC, P437 NOLL A, 1987, 1987 P IEEE INT C AC PAESELER A, 1987, 1987 P NATO ASI REC, P465 PAUL DB, 1988, 1988 P IEEE INT C AC, P283 SOTSCHECK J, 1984, 1984 P DAGA 84 DTSCH NR 14 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1988 VL 7 IS 4 BP 367 EP 374 DI 10.1016/0167-6393(88)90052-0 PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA R4701 UT WOS:A1988R470100004 ER PT J AU LEE, KF AF LEE, KF TI ON LARGE-VOCABULARY SPEAKER-INDEPENDENT CONTINUOUS SPEECH RECOGNITION SO SPEECH COMMUNICATION LA English DT Article RP LEE, KF (reprint author), CARNEGIE MELLON UNIV, DEPT COMP SCI, PITTSBURGH, PA 15213 USA. CR ADAMS DA, 1986, SPEECH TECHNOLOGY, P14 CHOW YL, 1986, IEEE INT C ACOUST SP CHOW YL, 1987, IEEE INT C ACOUST SP, P89 GUPTA VN, 1987, IEEE INT C ACOUST SP, P697 Jelinek F., 1980, Pattern Recognition in Practice. Proceedings of an International Workshop LEE KF, 1988, SPEAKER INDEPENDENT LEE KF, 1987, 1987 NATO ASI SPEECH LEE KF, 1988, THESIS CARNEGIEMELLO PRICE PJ, 1988, IEEE INT C ACOUST SP SCHWARTZ R, 1985, IEEE INT C ACOUST SP SHIKANO K, 1986, IEEE INT C ACOUST SP SHIKANO K, 1985, EVALUATION LPC SPECT NR 12 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1988 VL 7 IS 4 BP 375 EP 379 DI 10.1016/0167-6393(88)90053-2 PG 5 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA R4701 UT WOS:A1988R470100005 ER PT J AU KUNZMANN, S KUHN, T NIEMANN, H AF KUNZMANN, S KUHN, T NIEMANN, H TI AN EXPERIMENTAL ENVIRONMENT FOR THE GENERATION AND VERIFICATION OF WORD HYPOTHESES IN CONTINUOUS SPEECH SO SPEECH COMMUNICATION LA English DT Article RP KUNZMANN, S (reprint author), FRIEDRICH ALEXANDER UNIV, FAC TECHNOL, ERLANGEN, FED REP GER. CR EHRLICH U, 1988, RECENT ADV SPEECH UN KUHN T, 1987, IMPLEMENTATION ALGOR LIPORACE LA, 1982, IEEE T INFORM THEORY, V28, P729, DOI 10.1109/TIT.1982.1056544 MUHLFELD R, 1986, THESIS U ERLANGEN MYERS CS, 1981, IEEE T ACOUST SPEECH, V29, P284, DOI 10.1109/TASSP.1981.1163527 NIEMANN H, 1985, NEW SYSTEMS ARCHITEC RABINER LR, 1988, RECENT ADV SPEECH UN REGEL P, 1988, AKUSTISCH PHONETISCH SCHUKATTALAMAZZ.G, 1987, GENERIERUNG WORTHYPO NR 9 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1988 VL 7 IS 4 BP 381 EP 388 DI 10.1016/0167-6393(88)90054-4 PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA R4701 UT WOS:A1988R470100006 ER PT J AU SCAGLIOLA, C CAROSSINO, A COLLA, AM FAVARETO, C PEDRAZZI, P SCIARRA, D VICENZI, C AF SCAGLIOLA, C CAROSSINO, A COLLA, AM FAVARETO, C PEDRAZZI, P SCIARRA, D VICENZI, C TI REAL-TIME LARGE VOCABULARY WORD RECOGNITION VIA DIPHONE SPOTTING AND MULTIPROCESSOR IMPLEMENTATION SO SPEECH COMMUNICATION LA English DT Article RP SCAGLIOLA, C (reprint author), ELSAG SPA, ELETTR SAN GIORGIO, DEPT RRL, GENOVA, ITALY. 
CR APPIANI E, 1985, 1ST P INT C SUP SYST, P310 CAVAZZA M, 1985, NATO ASI SERIES F, V16, P215 COLLA AM, 1987, READ RULES TABLE LOO COLLA AM, 1985, 1985 P INT C AC SPEE, V3, P1229 COLLA AM, 1985, NATO ASI SERIES F, V16, P361 COLLA AM, 1987, 1987 P INT C AC SPEE, V3, P1281 ITAKURA F, 1987, 1987 P INT C AC SPEE, V3, P1257 SCAGLIOLA C, 1982, 1982 P INT C AC SPEE, V3, P2008 SCAGLIOLA C, 1981, 4TH P FASE S AC SPEE, P255 VICENZI C, 1986, 1986 P MONTR S SPEEC, P47 NR 10 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1988 VL 7 IS 4 BP 389 EP 396 DI 10.1016/0167-6393(88)90055-6 PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA R4701 UT WOS:A1988R470100007 ER PT J AU SCHUKATTALAMAZZINI, EG AF SCHUKATTALAMAZZINI, EG TI ROBUST FEATURES FOR WORD RECOGNITION SO SPEECH COMMUNICATION LA English DT Article RP SCHUKATTALAMAZZINI, EG (reprint author), UNIV ERLANGEN NURNBERG, FAC TECHNOL, D-8520 ERLANGEN, FED REP GER. CR GOLDBERG HGF, 1975, SEGMENTATION LABELIN KUNZMANN S, 1988, SPEECH COMMUN, V7, P381, DOI 10.1016/0167-6393(88)90054-4 LAFACE P, 1988, SPEECH COMM, V7 Lagger H., 1985, ICASSP 85. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (Cat. No. 85CH2118-8) MARIANI JJ, 1982, INT C ACOUST SPEECH, P1637 Moore R. K., 1983, INT C ACOUST SPEECH, P1041 NIEMANN H, 1985, NATO ASI SERIES F, P271 REGEL P, 1988, AKUSTISCH PHONETISCH SCHUKATTALAMAZZ.EG, 1985, 7 P DAGM S INF FB 10, P170 SCHUKATTALAMAZZ.EG, 1986, EUSIPCO 86, P537 SCHUKATTALAMAZZ.EG, 1987, GENERIERUNG WORTHYPO SHIPMAN DW, 1982, ICASSP 82 P MAY 3 4, P546 NR 12 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1988 VL 7 IS 4 BP 397 EP 401 DI 10.1016/0167-6393(88)90056-8 PG 5 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA R4701 UT WOS:A1988R470100008 ER PT J AU PEELING, SM MOORE, RK AF PEELING, SM MOORE, RK TI ISOLATED DIGIT RECOGNITION EXPERIMENTS USING THE MULTI-LAYER PERCEPTRON SO SPEECH COMMUNICATION LA English DT Article RP PEELING, SM (reprint author), ROYAL SIGNALS & RADAR ESTAB, SPEECH RES UNIT, MALVERN WR14 3PS, WORCS, ENGLAND. CR BEDWORTH MD, 1987, 4049 ROYAL SIGN RAD HOLMES JN, 1980, IEE PROC-F, V127, P53 LONGSTAFF ID, 1986, 3936 ROYAL SIGN RAD PEELING SM, 1987, 1987 P NATO ASI SPEE PEELING SM, 1987, 4073 ROYAL SIGN RAD PEELING SM, 1986, AUT P IOA C SPEECH H, V8, P307 Rumelhart D. E., 1986, PARALLEL DISTRIBUTED, V1 SMITH DC, 1986, 3926 ROYAL SIGN RAD NR 8 TC 7 Z9 7 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1988 VL 7 IS 4 BP 403 EP 409 DI 10.1016/0167-6393(88)90057-X PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA R4701 UT WOS:A1988R470100009 ER PT J AU YALABIK, N YARMANVURAL, F MANSUR, A AF YALABIK, N YARMANVURAL, F MANSUR, A TI MODIFIED CONDENSED NEAREST NEIGHBOR RULE AS APPLIED TO SPEAKER INDEPENDENT WORD RECOGNITION SO SPEECH COMMUNICATION LA English DT Article C1 DREXEL UNIV, PHILADELPHIA, PA 19104 USA. RP YALABIK, N (reprint author), MIDDLE E TECH UNIV, ANKARA, TURKEY. CR CHIDANANDA K, 1979, IEEE T INFORM THEORY, V125, P488 Devijver P. A., 1982, PATTERN RECOGNITION Duda R. O., 1973, PATTERN CLASSIFICATI DUTOIT, 1987, EUROPEAN C SPEECH TE, V2, P241 Furui S., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. 
No.87CH2396-0) HART PE, 1968, IEEE T INFORM THEORY, V14, P515, DOI 10.1109/TIT.1968.1054155 ITAKURA F, 1975, IEEE T ACOUST SPEECH, VAS23, P67, DOI 10.1109/TASSP.1975.1162641 MANSUR A, 1988, ISOLATED WORD RECOGN NIEMANN H, 1985, NATO ASI SERIES F, V16, P271 PAN KC, 1985, IEEE T ACOUST SPEECH, V33, P546 TOUSSAINT GT, 1980, 5TH P INT C PATT REC, P1324 WILPON JG, 1985, IEEE T ACOUST SPEECH, V33, P587, DOI 10.1109/TASSP.1985.1164581 YALABIK N, 1987, P EUROPEAN C SPEECH, V2, P276 NR 13 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1988 VL 7 IS 4 BP 411 EP 415 DI 10.1016/0167-6393(88)90058-1 PG 5 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA R4701 UT WOS:A1988R470100010 ER PT J AU VIDAL, E LLORET, MJ AF VIDAL, E LLORET, MJ TI FAST SPEAKER-INDEPENDENT DTW RECOGNITION OF ISOLATED WORDS USING A METRIC-SPACE SEARCH ALGORITHM (AESA) SO SPEECH COMMUNICATION LA English DT Article RP VIDAL, E (reprint author), UNIV POLITECN VALENCIA, DEPT SISTEMAS INFORMAT & COMPUTAC, VALENCIA, SPAIN. CR CASACUBERTA F, 1987, IEEE T ACOUST SPEECH, V35, P1631, DOI 10.1109/TASSP.1987.1165065 Duda R. O., 1973, PATTERN CLASSIFICATI Duds O., 2000, RECONOCIMIENTO AUTOM GUPTA VN, 1984, P ICASSP LOCKWOOD P, 1986, 1966 INT C PATT REC, V1, P467 SAKOE H, 1978, IEEE T ACOUST SPEECH, V26, P43, DOI 10.1109/TASSP.1978.1163055 VIDAL E, 1986, 8TH P INT C PATT REC, V2, P808 Vidal Ruiz E., 1985, Speech Communication, V4, DOI 10.1016/0167-6393(85)90058-5 VIDAL E, 1988, SPEECH COMMUN, V7, P67, DOI 10.1016/0167-6393(88)90022-2 VIDAL E, 1988, IEEE T ACOUST SPEECH, V36, P651, DOI 10.1109/29.1575 Vidal Ruiz E., 1986, Pattern Recognition Letters, V4, DOI 10.1016/0167-8655(86)90013-9 NR 11 TC 6 Z9 6 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1988 VL 7 IS 4 BP 417 EP 422 DI 10.1016/0167-6393(88)90059-3 PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA R4701 UT WOS:A1988R470100011 ER PT J AU HOGE, H AF HOGE, H TI A STOCHASTIC-MODEL FOR BEAM SEARCH SO SPEECH COMMUNICATION LA English DT Article RP HOGE, H (reprint author), SIEMENS AG, ZENT AUFGABEN INFORMAT TECH, MUNICH, FED REP GER. CR ADAMS A, 1985, SPEECH TECHNOLOGY, P14 BRIDLE J, 1982, P IEEE ACOUST SPEECH, P892 Hoge H., 1986, ICASSP 86 Proceedings. IEEE-IECEJ-ASJ International Conference on Acoustics, Speech and Signal Processing (Cat. No.86CH2243-4) LEVINSON SE, 1983, AT&T TECH J, V62, P1035 Lowerre B., 1980, TRENDS SPEECH RECOGN, P340 NEY H, 1984, IEEE T ACOUST SPEECH, V32, P263, DOI 10.1109/TASSP.1984.1164320 NIEMANN H, 1984, INFORMATION SCI, V3, P87 NR 7 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1988 VL 7 IS 4 BP 423 EP 430 DI 10.1016/0167-6393(88)90060-X PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA R4701 UT WOS:A1988R470100012 ER PT J AU MRAYATI, M CARRE, R GUERIN, B AF MRAYATI, M CARRE, R GUERIN, B TI DISTINCTIVE REGIONS AND MODES - A NEW THEORY OF SPEECH PRODUCTION SO SPEECH COMMUNICATION LA English DT Article C1 ECOLE NATL SUPER ELECTROCHIM & ELECTROMET GRENOBLE, INST NATL POLYTECH GRENOBLE, F-38401 ST MARTIN HERES, FRANCE. 
CR ATAL BS, 1978, J ACOUST SOC AM, V63, P1535, DOI 10.1121/1.381848 BADIN P, 1984, STL QPSR, P53 BADIN P, 1987, 11TH INT C PHON SCI BOSCH LFM, 1987, 11TH INT C PHON SCI BOTHEREL A, 1986, TRAVAUX I PHONETIQUE BROAD DJ, 1970, J ACOUST SOC AM, V47, P1572, DOI 10.1121/1.1912090 BROAD DJ, 1977, J ACOUST SOC AM, V62, P1467, DOI 10.1121/1.381676 BROAD DJ, 1987, J ACOUST SOC AM, V81, P155, DOI 10.1121/1.395025 BUTLER J, 1982, ARTICULATORY CONSTRA, P88 CHARPENTIER F, 1984, SPEECH COMMUN, V3, P291, DOI 10.1016/0167-6393(84)90025-6 Chiba T., 1941, VOWEL ITS NATURE STR COOPER FS, 1952, J ACOUST SOC AM, V24, P597, DOI 10.1121/1.1906940 DEGRYSE D, 1981, 4TH FASE S VEN, P193 DUKIEWICZ L, 1973, SPEECH ANAL SYNTHESI, P135 Fant G., 1960, ACOUSTIC THEORY SPEE FANT G, 1969, STL QPSR, P1 Fant G., 1973, SPEECH SOUNDS FEATUR FANT G, 1980, PHONETICA, V37, P55 FANT G, 1974, SPEECH COMMUNICATION FANT G, 1983, STLQPSR4 ROYAL I TEC, P1 FANT G, 1975, STL QPSR, P1 FANT G, 1967, KOMPENDIUM TALOVER F FISCHERJORGENSE.E, 1985, PHONETIC LINGUISTICS, P79 FUJIMURA O, 1974, P SPEECH COMMUNICATI GAY T, 1981, J ACOUST SOC AM, V69, P802, DOI 10.1121/1.385591 GUERIN B, 1977, P S ARTICULATORY MOD HEINZ JM, 1967, STL QPSR, P1 HUIZENGA E, 1931, ARCH NEERLANDAISES P, V4, P66 JOSPA P, 1978, 13 I PHON BRUX RAPP, P33 KAKITA Y, 1985, PHONETIC LINGUISTICS, P133 KELLY L, 1962, P SPEECH COMMUNICATI Krull D., 1987, PHONETIC EXPT RES I, V5, P43 LADEFOGED P, 1978, J ACOUST SOC AM, V64, P1027, DOI 10.1121/1.382086 LADEFOGED P, 1979, UCLA WPP, V45, P32 LAUFER A, 1979, UCLA WPP, V45, P32 LIBERMAN AM, 1963, P SPEECH COMMUNICATI LINDBLOM B, 1969, STL QPSR, P19 LINDBLOM B, 1976, J ACOUST SOC AM, V50, P1166 Lindblom B., 1986, EXPT PHONOLOGY, P13 MAEDA S, 1987, 11TH INT C PHON SCI MENON KMN, 1971, 7TH P INT C AC BUD, V3, P13 MERMELST.P, 1973, J ACOUST SOC AM, V53, P1070, DOI 10.1121/1.1913427 MORSE PM, 1948, VIBRATION SOUND, P265 MRAYATI M, 1976, PHONETICA, V33, P285 MRAYATI M, 1976, REV ACOUST, V36, P18 Ohala John J., 1985, 
PHONETIC LINGUISTICS, P223 OHMAN SEG, 1967, J ACOUST SOC AM, V41, P310 OHMAN SEG, 1974, P SPEECH COMMUNICATI OHMAN SEG, 1966, J ACOUST SOC AM, V39, P151 Perkell JS, 1969, PHYSL SPEECH PRODUCT, V53 PETERSON GE, 1966, J SPEECH HEAR RES, V9, P5 SANTERRE L, 1972, 3EMES JOURN ET PAR L, P49 SCHROEDE.MR, 1967, J ACOUST SOC AM, V41, P1002, DOI 10.1121/1.1910429 SHARF DJ, 1972, J ACOUST SOC AM, V51, P652, DOI 10.1121/1.1912890 SONDHI MM, 1979, IEEE T ACOUST SPEECH, V27, P268, DOI 10.1109/TASSP.1979.1163240 Stevens K N, 1953, J ACOUST SOC AM, V25, P1070 Stevens KN, 1972, HUMAN COMMUNICATION, P51 STEVENS KN, 1955, J ACOUST SOC AM, V27, P484, DOI 10.1121/1.1907943 STEVENS KN, 1971, J ACOUST SOC AM, V50, P1180, DOI 10.1121/1.1912751 STEVENS KN, 1980, J ACOUST SOC AM, V68, P836, DOI 10.1121/1.384823 WAKITA H, 1979, IEEE T ACOUST SPEECH, V27, P281, DOI 10.1109/TASSP.1979.1163242 WOOD S, 1986, J ACOUST SOC AM, V80, P391, DOI 10.1121/1.394090 YANG S, 1987, 11TH INT C PHON SCI NR 63 TC 42 Z9 42 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1988 VL 7 IS 3 BP 257 EP 286 DI 10.1016/0167-6393(88)90073-8 PG 30 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA Q9157 UT WOS:A1988Q915700001 ER PT J AU MARCHAL, A AF MARCHAL, A TI COPRODUCTION - EVIDENCE FROM EPG DATA SO SPEECH COMMUNICATION LA English DT Article RP MARCHAL, A (reprint author), UNIV PROVENCE AIX MARSEILLE 1, INST PHONET, 29 AVE R SCHUMAN, F-13621 AIX EN PROVENCE, FRANCE. CR ABERCROMBIE D, 1967, ELEMENTS GENERAL PHO Catford John C., 1977, FUNDAMENTAL PROBLEMS COURVILLE L, 1981, 12IEMES ACT JOURN ET, P17 Daniloff R. 
G., 1973, J PHONETICS, V1, P239 FOWLER CA, 1980, J PHONETICS, V8, P113 FOWLER CA, 1983, J PHONETICS, V11, P303 GENTIL M, 1983, REV ACOUSTIQUE, V4, P11 GRAMMONT M, 1961, TRAITE PHONETIQUE HARDCASTLE WJ, 1977, UNPUB COARTICULATION, V1, P27 HENDERSON JB, 1982, PHONETICA, V39, P71 HOUDE RA, 1968, SPEECH COMMUNICATION, V2 Jones Daniel, 1956, OUTLINE ENGLISH PHON Kent R. D., 1977, J PHONETICS, V15, P115 KENYON JS, 1951, AM PRONUNCIATION Kozhevnikov V. A., 1965, SPEECH ARTICULATION Ladefoged P., 1975, COURSE PHONETICS LINELL P, 1982, J PHONETICS, V10, P37 MacKay I. R. A., 1978, INTRO PRACTICAL PHON MANCERON F, 1982, THESIS LIMSI PARIS MARCHAL A, 1984, TRAV I PHON AIX, V9, P267 OHMAN SEG, 1966, J ACOUST SOC AM, V39, P151 Perkell JS, 1969, PHYSL SPEECH PRODUCT, V53 PERKELL JS, 1986, SPEECH COMMUN, V5, P47, DOI 10.1016/0167-6393(86)90029-4 Rochette C.-E., 1973, GROUPES CONSONNES FR ROUSSELOT PJ, 1924, PRINCIPES PHONETIQUE SALTZMAN EL, 1983, HASKINS LABORATORIES, V76, P3 NR 26 TC 8 Z9 8 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1988 VL 7 IS 3 BP 287 EP 295 DI 10.1016/0167-6393(88)90074-X PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA Q9157 UT WOS:A1988Q915700002 ER PT J AU WEIGEL, W AF WEIGEL, W TI RECOGNITION OF DEMISYLLABLES BASED ON DYNAMIC-PROGRAMMING METHODS SO SPEECH COMMUNICATION LA English DT Article RP WEIGEL, W (reprint author), TECH UNIV MUNICH, LEHRSTUHL DATENVERARBEITUNG, D-8000 MUNICH 2, FED REP GER. 
CR BRIDLE JS, 1982, IEEE INT C ACOUSTICS, P889 MORITZ M, 1986, THESIS TU MUNCHEN NEY H, 1984, IEEE T ACOUST SPEECH, V32, P263, DOI 10.1109/TASSP.1984.1164320 RUSKE G, 1986, INT C ACOUSTICS TORO, pA1 Ruske G., 1984, Sprache und Datenverarbeitung, V8 RUSKE G, 1978, IEEE INT C ACOUSTICS, P722 ZWICKER E, 1979, J ACOUST SOC AM, V65, P487, DOI 10.1121/1.382349 NR 7 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1988 VL 7 IS 3 BP 297 EP 304 PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA Q9157 UT WOS:A1988Q915700003 ER PT J AU KLEIJN, WB KRASINSKI, DJ KETCHUM, RH AF KLEIJN, WB KRASINSKI, DJ KETCHUM, RH TI AN EFFICIENT STOCHASTICALLY EXCITED LINEAR PREDICTIVE CODING ALGORITHM FOR HIGH-QUALITY LOW BIT RATE TRANSMISSION OF SPEECH SO SPEECH COMMUNICATION LA English DT Article RP KLEIJN, WB (reprint author), AT&T BELL LABS, NAPERVILLE, IL 60540 USA. CR ATAL BS, 1979, IEEE T ACOUST SPEECH, V27, P247, DOI 10.1109/TASSP.1979.1163237 Atal B. S., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing ATAL BS, 1967, 1967 P IEEE C COMM P, P360 Atal B. S., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) Atal B.S., 1984, P INT C COMM AMST, P1610 CANTONI A, 1976, IEEE T COMMUN, V24, P804, DOI 10.1109/TCOM.1976.1093391 Chen J., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) Copperi M., 1984, Seventh International Conference on Pattern Recognition (Cat. No. 84CH2046-1) DAVIDSON G, 1986, P INT C ACOUST SPEEC, P3055 KROON P, 1986, IEEE T ACOUST SPEECH, V34, P1054, DOI 10.1109/TASSP.1986.1164946 Kroon P., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. 
No.87CH2396-0) LIN D, 1986, SIGNAL PROCESS, V3, P445 Moriya T., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) RAMAMOORTHY V, 1984, AT T BELL LABS TECH, P1465 Shoham Y., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) TRANCOSO IM, 1986, P INT C ACOUST SPEEC, P2379 Tremain T.E., 1982, SPEECH TECHNOLOG APR, P40 NR 17 TC 9 Z9 9 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1988 VL 7 IS 3 BP 305 EP 316 DI 10.1016/0167-6393(88)90076-3 PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA Q9157 UT WOS:A1988Q915700004 ER PT J AU YANG, SA XU, Y AF YANG, SA XU, Y TI AN ACOUSTIC-PHONETIC ORIENTED SYSTEM FOR SYNTHESIZING CHINESE SO SPEECH COMMUNICATION LA English DT Article RP YANG, SA (reprint author), CHINESE ACAD SOCIAL SCI, INST LINGUIST, BEIJING, PEOPLES R CHINA. RI Xu, Yi/C-4013-2008 OI Xu, Yi/0000-0002-8541-2658 CR COKER CH, 1972, PAPERS INTERDISCIPLI, P319 Fant G., 1960, ACOUSTIC THEORY SPEE HOLMES JN, 1983, SPEECH COMMUN, V2, P251, DOI 10.1016/0167-6393(83)90044-4 HUANG TY, 1982, P ICASSP 82 PARIS, V3, P1601 KLATT DH, 1980, J ACOUST SOC AM, V67, P971, DOI 10.1121/1.383940 LEE SC, 1982, 1982 P INT C CHIN LA, P157 LI TY, 1982, NTZ ARCH, V4, P121 LIN MC, 1965, ACOUSTICA SINICA, V2, P15 LIN MC, 1987, 11TH P ICPHS, V1, P162 MA DY, 1983, MANUAL ACOUSTICS, P417 ROSENBER.AE, 1971, J ACOUST SOC AM, V49, P583, DOI 10.1121/1.1912389 Wu Z. 
J., 1964, ACTA ACUST, V1, P33 WU ZJ, 1987, 11TH P ICPHS, V5, P209 WU ZJ, COURSE PHONETICS XU Y, ACOUSTIC PHONETIC ST YANG S, 1987, 11 P INT C PHON SCI, V1, P239 YANG SA, 1986, ZHONGGUO YUWEN, P173 ZHANG J, 1986, P ICASSP, P2023 NR 18 TC 2 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD OCT PY 1988 VL 7 IS 3 BP 317 EP 325 DI 10.1016/0167-6393(88)90077-5 PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA Q9157 UT WOS:A1988Q915700005 ER PT J AU HEUTE, U AF HEUTE, U TI MEDIUM RATE SPEECH CODING FOR DIGITAL MOBILE TELEPHONY - EDITORIAL SO SPEECH COMMUNICATION LA English DT Editorial Material RP HEUTE, U (reprint author), RUHR UNIV BOCHUM, INST COMMUN, D-4630 BOCHUM, FED REP GER. NR 0 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUL PY 1988 VL 7 IS 2 BP 111 EP 112 DI 10.1016/0167-6393(88)90032-5 PG 2 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA P3707 UT WOS:A1988P370700001 ER PT J AU NATVIG, JE AF NATVIG, JE TI PAN-EUROPEAN SPEECH CODING STANDARD FOR DIGITAL MOBILE RADIO SO SPEECH COMMUNICATION LA English DT Article RP NATVIG, JE (reprint author), NORWEGIAN TELECOMMUN ADM RES ESTAB, N-2007 KJELLER, NORWAY. 
CR COLEMAN AE, 1988, SPEECH COMMUN, V7, P151, DOI 10.1016/0167-6393(88)90036-2 GALAND C, 1988, SPEECH COMMUN, V7, P167, DOI 10.1016/0167-6393(88)90037-4 HANES RB, 1988, SPEECH COMMUN, V7, P179, DOI 10.1016/0167-6393(88)90038-6 LAZZARI V, 1988, SPEECH COMMUN, V7, P193, DOI 10.1016/0167-6393(88)90039-8 MATTSON T, 1986, 2ND SEM LAND MOB DIG RAMSTAD TA, 1982, ICASSP 82 PARIS VARY P, 1988, SPEECH COMMUN, V7, P209, DOI 10.1016/0167-6393(88)90040-4 NR 7 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUL PY 1988 VL 7 IS 2 BP 113 EP 123 DI 10.1016/0167-6393(88)90034-9 PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA P3707 UT WOS:A1988P370700002 ER PT J AU HEUTE, U AF HEUTE, U TI MEDIUM-RATE SPEECH CODING - TRIAL OF A REVIEW SO SPEECH COMMUNICATION LA English DT Article RP HEUTE, U (reprint author), UNIV ERLANGEN NURNBERG, LEHRSTUHL NACHRICHTENTECH, CAUERSTR 7, D-8520 ERLANGEN, FED REP GER. CR AHMED N, 1974, IEEE T COMPUT, VC 23, P90, DOI 10.1109/T-C.1974.223784 Almeida L. B., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing [Anonymous], 2007, Wireless LAN Medium Access Control (MAC) and Physical layer (PHY) Specifications (802.11p), Patent No. 2605361 AREVALO L, 1986, THESIS U ERLANGEN Arjmand M. M., 1983, Proceedings of ICASSP 83. IEEE International Conference on Acoustics, Speech and Signal Processing Atal B. S., 1982, Proceedings of ICASSP 82. 
IEEE International Conference on Acoustics, Speech and Signal Processing ATAL BS, 1970, AT&T TECH J, V49, P1973 ATAL BS, 1971, J ACOUST SOC AM, V50, P637, DOI 10.1121/1.1912679 ATAL BS, 1975, P ICC, P30 ATAL BS, 1981, P ICASSP, P599 BARABELL AJ, 1979, P ICASSP WASHINGTON, P530 BAYLESS JW, 1973, IEEE SPECTRUM OCT, P28 BELLANGER MG, 1976, IEEE T ACOUST SPEECH, V24, P109, DOI 10.1109/TASSP.1976.1162788 Bertorello L., 1983, Proceedings of ICASSP 83. IEEE International Conference on Acoustics, Speech and Signal Processing BRAUN HJ, 1986, SUBJECTIVE QUALITY M BREHM H, 1987, SIGNAL PROCESS, V12, P119, DOI 10.1016/0165-1684(87)90001-6 BREHM H, 1986, INT S INF TH BREHM H, 1983, P EUSIPCO ERLANGEN, P383 BREHM H, 1986, P EUSIPCO HAGUE, P353 CARL H, 1987, THESIS U ERLANGEN CHU PL, 1985, IEEE T ACOUST SPEECH, V33, P203, DOI 10.1109/TASSP.1985.1164529 CROCHIERE RE, 1982, IEEE T COMMUN, V30, P621, DOI 10.1109/TCOM.1982.1095502 CROCHIERE RE, 1977, AT&T TECH J, V56, P771 CROCHIERE RE, 1979, P ICASSP WASHINGTON, P526 CROCHIERE RE, 1976, AT&T TECH J, V55, P1069 CROCHIERE RE, 1981, AT&T TECH J, V60, P1633 CROISIER A, 1976, P INT C INF SCI SYST DEPRETTERE EF, 1985, P ICASSP TAMPA, P965 DIETRICH M, 1984, INT ZURICH SEM Dudley H., 1955, Journal of the Audio Engineering Society, V3 Dudley H, 1940, BELL SYST TECH J, V19, P495 Dudley H, 1939, J ACOUST SOC AM, V11, P169, DOI 10.1121/1.1916020 EMMERT B, 1986, THESIS U ERLANGEN ESTEBAN D, 1979, P ICASSP WASHINGTON, P975 ESTEBAN D, 1983, P ICASSP BOSTON, P224 *EURASIP COST, 1985, WORKSH MED RAT SPEEC FEHN HG, 1982, IEEE T COMMUN, V30, P687, DOI 10.1109/TCOM.1982.1095524 FELDTKELLER R, 1967, OHR ALS NACHRICHTENE FLANAGAN JL, 1979, IEEE T COMMUN, V27, P710, DOI 10.1109/TCOM.1979.1094454 FOSTER J, 1985, IEEE T INFORM THEORY, V31, P348, DOI 10.1109/TIT.1985.1057035 FRANGOULIS ED, 1984, IEE PROC-F, V131, P542 Galand C, 1977, P IEEE INT C AC SPEE, P191 GALAND C, 1988, SPEECH COMMUN, V7, P167, DOI 10.1016/0167-6393(88)90037-4 GERSHO A, 1983, IEEE 
COMMUN MAG, V21, P15, DOI 10.1109/MCOM.1983.1091516 GERSHO A, 1984, P ICASSP SAN DIEGO GLUTH R, 1986, 2ND P NORD SEM DIG M, P230 GOLD B, 1965, J ACOUST SOC AM, V37, P753, DOI 10.1121/1.1909425 GOLDEN RM, 1963, J ACOUST SOC AM, V35, P1358, DOI 10.1121/1.1918698 GRAUEL C, 1980, SIGNAL PROCESS, V2, P23, DOI 10.1016/0165-1684(80)90059-6 Gray R. M., 1984, IEEE ASSP Magazine, V1, DOI 10.1109/MASSP.1984.1162229 GRAY RM, 1982, IEEE T INFORM THEORY, V28, P256, DOI 10.1109/TIT.1982.1056471 GUNDEL CL, 1987, AUSGEW ARB NACHR SYS, V66 GUNDEL CL, 1986, P EUSIPCO HAGUE, P439 GUNDEL CL, 1985, P IEEE MELECON MADRI, P167 Gupta V., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing HAMEL P, 1985, P ICASSP TAMPA, P1676 HANES RB, 1988, SPEECH COMMUN, V7, P179, DOI 10.1016/0167-6393(88)90038-6 Hedelin P., 1983, Proceedings of ICASSP 83. IEEE International Conference on Acoustics, Speech and Signal Processing Heron C. D., 1983, Proceedings of ICASSP 83. IEEE International Conference on Acoustics, Speech and Signal Processing Hess W., 1983, PITCH DETERMINATION HEUTE U, 1981, P INT C DIG SIG PROC, P1041 Huang J.J.Y., 1963, IEEE Transactions on Communication Systems, VCS-11, DOI 10.1109/TCOM.1963.1088759 ILLMER H, 1987, THESIS U ERLANGEN Jayant N. S., 1984, DIGITAL CODING WAVEF JAYANT NS, 1974, P IEEE, V62, P611, DOI 10.1109/PROC.1974.9484 JAYANT NS, 1973, AT&T TECH J, V52, P1119 Johnston J. D., 1980, P IEEE INT C AC SPEE, P291 Kaltenmeier A., 1983, Proceedings of ICASSP 83. IEEE International Conference on Acoustics, Speech and Signal Processing Katterfeldt H., 1983, Proceedings of ICASSP 83. 
IEEE International Conference on Acoustics, Speech and Signal Processing KATTERFELD H, 1981, P ICASSP ATLANTA, P824 KROON P, 1985, THESIS TH DELFT KUGLER W, 1986, THESIS U ERLANGEN Kupfmuller K., 1954, Fernmeldetechnische Zeitschrift, V7 KUSCH H, 1976, DFG C SPEECH PROCESS LANGLAIS T, 1986, P EUSIPCO HAGUE, P419 LAZZARI V, 1988, SPEECH COMMUN, V7, P193, DOI 10.1016/0167-6393(88)90039-8 LEGUYADER A, 1988, SPEECH COMMUN, V7, P217, DOI 10.1016/0167-6393(88)90041-6 Lin D., 1986, P EUSIPCO HAGUE, P445 LINDE Y, 1980, IEEE T COMMUN, V28, P84, DOI 10.1109/TCOM.1980.1094577 Makhoul J., 1979, P IEEE INT C AC SPEE, P428 MAKHOUL J, 1975, P IEEE, V63, P561, DOI 10.1109/PROC.1975.9792 MAKHOUL J, 1985, P IEEE, V73, P1551, DOI 10.1109/PROC.1985.13340 MALAH D, 1979, IEEE T ACOUST SPEECH, V27, P121, DOI 10.1109/TASSP.1979.1163210 MALAH D, 1981, IEEE T ACOUST SPEECH, V29, P273, DOI 10.1109/TASSP.1981.1163547 Markel JD, 1976, LINEAR PREDICTION SP MARKEL JD, 1974, IEEE T ACOUST SPEECH, VAS22, P273, DOI 10.1109/TASSP.1974.1162587 MASSON J, 1985, P ICASSP TAMPA, P541 MAX J, 1960, IRE T IT, V8, P7 MAZOR B, 1986, P ICASSP TOKYO, P3075 MILLER RL, 1953, J ACOUST SOC AM, V25, P832 MINTZER F, 1985, IEEE T ACOUST SPEECH, V33, P626, DOI 10.1109/TASSP.1985.1164587 MULLER H, 1986, THESIS U ERLANGEN NATVIG JE, 1988, SPEECH COMMUN, V7, P113, DOI 10.1016/0167-6393(88)90034-9 NUSSBAUMER HJ, 1984, P INT C DIG SIG PROC, P8 QUATIERI TF, 1985, P ICASSP TAMPA, P945 Ramstad T. A., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing Reeves A. H., 1938, Electric signaling system, Patent No. [French Patent, 852 183, 852183] Rothweiler J. H., 1983, Proceedings of ICASSP 83. 
IEEE International Conference on Acoustics, Speech and Signal Processing SCHREGLMANN H, 1986, THESIS U ERLANGEN SCHRODER G, 1988, SPEECH COMMUN, V7, P227, DOI 10.1016/0167-6393(88)90042-8 Schroeder M., 1985, P IEEE INT C AC SPEE, P937 SCHROEDER M, 1984, P INT C COMMUN, P1610 SCHROEDE.MR, 1966, PR INST ELECTR ELECT, V54, P720, DOI 10.1109/PROC.1966.4841 SINGHAL S, 1984, P ICASSP SAN DIEGO SMITH CP, 1963, J ACOUST SOC AM, P805 SMITH MJT, 1985, P IEEE ICASSP 85, P521 SMITH MJT, 1984, P ICASSP SAN DIEGO SOONG FK, 1985, P ICASSP TAMPA, P1672 SOTSCHECK J, 1985, NATIONAL COST 207 M TRANCOSO IM, 1988, SPEECH COMMUN, V7, P239, DOI 10.1016/0167-6393(88)90043-X TRANCOSO IM, 1984, P ICASSP SAN DIEGO TRANCOSO IM, 1987, THESIS U TECNICA LIS Tremain T.E., 1982, SPEECH TECHNOLOG APR, P40 Tribolet J. M., 1978, Proceedings of the 1978 IEEE International Conference on Acoustics, Speech and Signal Processing TRIBOLET JM, 1979, IEEE T ACOUST SPEECH, V27, P512, DOI 10.1109/TASSP.1979.1163283 TRIBOLET JM, 1979, P INT C SIGNAL PROCE, P638 UN CK, 1975, IEEE T COM, V23, P1467 VARY P, 1979, AEU-INT J ELECTRON C, V33, P293 VARY P, 1988, SPEECH COMMUN, V7, P209, DOI 10.1016/0167-6393(88)90040-4 VARY P, 1985, P NTG C MOB RAD SERV, P172 VARY P, 1985, SIGNAL PROCESS, V8, P387, DOI 10.1016/0165-1684(85)90002-7 VARY P, 1980, SIGNAL PROCESS, V2, P55, DOI 10.1016/0165-1684(80)90062-6 VETTERLI M, 1986, SIGNAL PROCESS, V10, P219, DOI 10.1016/0165-1684(86)90101-5 VISWANATHAN R, 1982, IEEE COMMUN, V30, P663 VOLMARY C, 1987, DIGITAL SPEECH TRANS WACKERSREUTHER G, 1985, AEU-ARCH ELEKTRON UB, V39, P123 WACKERSREUTHER G, 1985, P ICASSP TAMPA, P73 WACKERSREUTHER G, 1986, IEEE T ACOUST SPEECH, V34, P1182, DOI 10.1109/TASSP.1986.1164942 WALKER AM, 1931, P ROYAL SOC LONDON A, V131, P518 Weinstein C.J, 1975, P EASCON, p30A XYDEAS CS, 1984, IEE P F, V131, P57 Yule GU, 1927, PHILOS T R SOC LOND, V226, P267, DOI 10.1098/rsta.1927.0007 ZELINSKI R, 1982, FREQUENZ, V36, P193 ZELINSKI R, 1977, IEEE T ACOUST SPEECH, V25, 
P299, DOI 10.1109/TASSP.1977.1162974 ZIEGLER KH, 1987, THESIS U ERLANGEN ZINSER RL, 1985, P ICASSP TAMPA, P969 NR 136 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUL PY 1988 VL 7 IS 2 BP 125 EP 149 DI 10.1016/0167-6393(88)90035-0 PG 25 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA P3707 UT WOS:A1988P370700003 ER PT J AU COLEMAN, AE GLEISS, N USAI, P AF COLEMAN, AE GLEISS, N USAI, P TI A SUBJECTIVE TESTING METHODOLOGY FOR EVALUATING MEDIUM RATE CODECS FOR DIGITAL MOBILE RADIO APPLICATIONS SO SPEECH COMMUNICATION LA English DT Article C1 SWEDISH TELECOMMUN ADM, S-12386 FARSTA, SWEDEN. CTR STUDI & LAB TELECOMUN, TURIN, ITALY. RP COLEMAN, AE (reprint author), BRITISH TELECOM RES LABS, MARTLESHAM HEATH, IPSWICH IP5 7RE, SUFFOLK, ENGLAND. CR *BRIT TEL, 1982, OPT OV LOUDN RAT COM CARSON RM, TE R1300502 BRIT TEL COLEMAN A, 1987, GLOBECOM 87 TOKYO *COM CONS INT TEL, 1985, RED BOOK S2, V5 *COM CONS INT TEL, 1985, RED BOOK S13, V5 *COM CONS INT TEL, 1985, RED BOOK S4, V5 *COM CONS INT TEL, 1985, RED BOOK S14, V5 *COM CONS INT TEL, 1986, 45 CIRC GLEISS N, 1974, TELE LAW HB, 1962, REFERENCE DISTORTION, P484 RICHARDS DL, 1973, TELECOMMUNICATION SP, P188 1985, FEB P NORD SEM DIG L 1987, JUN INT C DIG LAND M 1986, 2ND P NORD SEM DIG L NR 14 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun.
PD JUL PY 1988 VL 7 IS 2 BP 151 EP 166 DI 10.1016/0167-6393(88)90036-2 PG 16 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA P3707 UT WOS:A1988P370700004 ER PT J AU GALAND, C ROSSO, M ELIE, P LANCON, E AF GALAND, C ROSSO, M ELIE, P LANCON, E TI MPE LTP SPEECH CODER FOR MOBILE RADIO APPLICATION SO SPEECH COMMUNICATION LA English DT Article RP GALAND, C (reprint author), IBM CORP LAB, DEPT ADV TELECOMMUN, F-06610 LA GAUDE, FRANCE. RI yu, yan/C-2322-2012 CR ATAL BS, 1982, IEEE ICASSP PARIS BERAUD JP, 1985, IBM J RES DEV, V29 BEROUTI M, 1984, ICASSP, V1 CROISIER A, 1974, SEMINAR COMMUNICATIO LANCON E, 1985, THESIS U NICE LEROUX J, 1977, IEEE T ACOUSTICS SPE Markel JD, 1976, LINEAR PREDICTION SP UN CK, 1977, IEEE ICASSP HARTFORD USAI P, 1988, SPEECH COMM, V7 NR 9 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUL PY 1988 VL 7 IS 2 BP 167 EP 178 DI 10.1016/0167-6393(88)90037-4 PG 12 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA P3707 UT WOS:A1988P370700005 ER PT J AU HANES, RB ATTKINS, PM AF HANES, RB ATTKINS, PM TI THE UK CANDIDATE 16 KBIT/S SPEECH CODEC FOR THE GSM PAN-EUROPEAN STUDY ON DIGITAL CELLULAR LAND MOBILE RADIO SO SPEECH COMMUNICATION LA English DT Article RP HANES, RB (reprint author), BRITISH TELECOM RES LABS, MARTLESHAM HEATH, IPSWICH IP5 7RE, SUFFOLK, ENGLAND. CR BRIGHAM ER, 1976, POST OFFICE ELEC ENG, V69, P93 CUSHMAN RH, 1985, EDN, V11, P59 ESTEBAN D, 1977, 1977 P IEEE INT C AC, P191 ESTEBAN D, 1983, 1983 P INT C ASSP BO, P224 GOODMAN DJ, 1975, IEEE T COMMUN, V23, P1362, DOI 10.1109/TCOM.1975.1092719 Hanes R. 
B., 1985, British Telecom Technology Journal, V3 JAYANT NS, 1973, AT&T TECH J, V52, P1119 JOHNSTON JD, 1980, 1980 P ICASSP, P291 KNEIB KN, 1985, ELECTRICAL COMMUNICA, V59 MARRIN K, 1986, COMPUTER DESIGN 1115, P59 NATVIG JE, 1986, 2ND INT DIG MOB RAD, P223 NEILSEN H, 1986, 2ND INT DIG MOB RAD, P266 SYMINGTON IC, 1986, 2ND INT DIG MOB RAD, P94 NR 13 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUL PY 1988 VL 7 IS 2 BP 179 EP 192 DI 10.1016/0167-6393(88)90038-6 PG 14 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA P3707 UT WOS:A1988P370700006 ER PT J AU LAZZARI, V MONTAGNA, R SERENO, D AF LAZZARI, V MONTAGNA, R SERENO, D TI COMPARISON OF 2 SPEECH CODECS FOR DMR SYSTEMS SO SPEECH COMMUNICATION LA English DT Article RP LAZZARI, V (reprint author), CTR STUDI & LAB TELECOMUN, VIA REISS ROMOLI 274, I-10148 TORINO, ITALY. CR ATAL BS, 1982, IEEE T COMMUN, V30, P600, DOI 10.1109/TCOM.1982.1095501 CHEUNG RS, 1981, P ICASSP, P631 COPPERI M, 1984, P ICASSP COPPERI M, 1986, P ICASSP, P845 CROCHIERE R, 1976, INT C ACOUST SPEECH, P233 ESTEBAN D, 1978, INT C ACOUST SPEECH, P320 GALAND CR, 1984, IEEE T ACOUST SPEECH, V32, P522, DOI 10.1109/TASSP.1984.1164356 GERSHO A, 1984, INT C ACOUST SPEECH GRAY AH, 1976, IEEE T ACOUST SPEECH, V24, P380, DOI 10.1109/TASSP.1976.1162849 JAYANT NS, 1986, P INT C ACOUST SPEEC, P829 Johnston J. D., 1980, ICASSP 80 Proceedings. IEEE International Conference on Acoustics, Speech and Signal Processing Krasner M. A., 1980, ICASSP 80 Proceedings. IEEE International Conference on Acoustics, Speech and Signal Processing LINDE Y, 1980, IEEE T COMMUN, V28, P84, DOI 10.1109/TCOM.1980.1094577 Mensa G., 1986, ICASSP 86 Proceedings. IEEE-IECEJ-ASJ International Conference on Acoustics, Speech and Signal Processing (Cat. 
No.86CH2243-4) MENSA G, 1986, EUSIPCO 86 HAGUE RAMSTAD TA, 1982, INT C ACOUST SPEECH, P203 Soong F. K., 1985, ICASSP 85. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (Cat. No. 85CH2118-8) NR 17 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUL PY 1988 VL 7 IS 2 BP 193 EP 207 DI 10.1016/0167-6393(88)90039-8 PG 15 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA P3707 UT WOS:A1988P370700007 ER PT J AU VARY, P HOFMANN, R HELLWIG, K SLUYTER, RJ AF VARY, P HOFMANN, R HELLWIG, K SLUYTER, RJ TI A REGULAR-PULSE EXCITED LINEAR PREDICTIVE CODEC SO SPEECH COMMUNICATION LA English DT Article C1 PHILIPS RES LABS, EINDHOVEN, NETHERLANDS. RP VARY, P (reprint author), PHILIPS KOMMUNIKAT IND AG, NURNBERG, FED REP GER. CR [Anonymous], 1917, J REINE ANGEW MATH Atal B. S., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing Blahut R. E., 1983, THEORY PRACTICE ERRO DEPRETTERE EF, 1985, P ICASSP TAMPA, P965 Eizenhofer A., 1986, 1986 International Zurich Seminar on Digital Communications. New Directions in Switching and Networks. Proceedings (Cat. No.86CH2277-2) GALAND C, 1988, SPEECH COMMUN, V7, P167, DOI 10.1016/0167-6393(88)90037-4 HELLWIG K, 1986, 2ND NORD SEM DIG LAN, P257 KROON P, 1986, IEEE T ACOUST SPEECH, V34, P1054, DOI 10.1109/TASSP.1986.1164946 Makhoul J., 1979, P IEEE INT C AC SPEE, P428 NATVIG JE, 1988, SPEECH COMMUN, V7, P113, DOI 10.1016/0167-6393(88)90034-9 UN CK, 1975, IEEE T COMMUN, V23, P1466 VARY P, 1987, JUN INT C DIG LAND M, P507 NR 12 TC 7 Z9 7 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD JUL PY 1988 VL 7 IS 2 BP 209 EP 215 DI 10.1016/0167-6393(88)90040-4 PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA P3707 UT WOS:A1988P370700008 ER PT J AU LEGUYADER, A MASSALOUX, D ZURCHER, F AF LEGUYADER, A MASSALOUX, D ZURCHER, F TI A ROBUST AND FAST CELP CODER AT 16 KBIT/S SO SPEECH COMMUNICATION LA English DT Article RP LEGUYADER, A (reprint author), CTR NATL ETUD TELECOMMUN, F-22301 LANNION, FRANCE. CR Adoul J., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) Atal B. S., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing ATAL BS, 1982, IEEE T COMMUN, V30, P600, DOI 10.1109/TCOM.1982.1095501 CHEN J, 1986, P IEEE INT C AC SPEE, P1693 GALAND C, 1986, EUROPEAN SIGNAL PROC, P435 KROON P, 1986, IEEE T ACOUST SPEECH, V34, P1054, DOI 10.1109/TASSP.1986.1164946 Le Guyader A., 1986, ICASSP 86 Proceedings. IEEE-IECEJ-ASJ International Conference on Acoustics, Speech and Signal Processing (Cat. No.86CH2243-4) SAGURAMA N, 1986, SPEECH COMMUN, V5, P199 Schroeder M., 1985, P IEEE INT C AC SPEE, P937 TRANCOSO IM, 1986, P IEEE INT C AC SPEE, P2375 NR 10 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUL PY 1988 VL 7 IS 2 BP 217 EP 226 DI 10.1016/0167-6393(88)90041-6 PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA P3707 UT WOS:A1988P370700009 ER PT J AU SCHRODER, G AF SCHRODER, G TI RESIDUAL-EXCITED LPC WITH VECTOR QUANTIZATION (RELP-VQ) SO SPEECH COMMUNICATION LA English DT Article RP SCHRODER, G (reprint author), DEUTSCH BUNDESPOST, CTR TELECOMMUN ENGN, RES INST, D-6100 DARMSTADT, FED REP GER. CR Atal B. S., 1982, Proceedings of ICASSP 82. 
IEEE International Conference on Acoustics, Speech and Signal Processing BRAUN HJ, 1985, ACUSTICA, V57, P233 Crochiere R. E., 1983, MULTIRATE DIGITAL SI GRAY AH, 1976, IEEE T ACOUST SPEECH, V24, P380, DOI 10.1109/TASSP.1976.1162849 KROON P, 1986, IEEE T ACOUSTICS SPE, V34, P380 LINDE Y, 1980, IEEE T COMMUN, V28, P84, DOI 10.1109/TCOM.1980.1094577 MARKEL JD, 1976, COMMUNICATIONS CYBER, V12 Markel JD, 1976, LINEAR PREDICTION SP Rabiner L.R., 1978, DIGITAL PROCESSING S Schroder G., 1986, NTG-Fachberichte, V94 VISWANATHAN R, 1975, IEEE T ACOUST SPEECH, VAS23, P309, DOI 10.1109/TASSP.1975.1162675 NR 11 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUL PY 1988 VL 7 IS 2 BP 227 EP 237 DI 10.1016/0167-6393(88)90042-8 PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA P3707 UT WOS:A1988P370700010 ER PT J AU TRANCOSO, IM ALMEIDA, LB RODRIGUES, JS MARQUES, JS TRIBOLET, JM AF TRANCOSO, IM ALMEIDA, LB RODRIGUES, JS MARQUES, JS TRIBOLET, JM TI HARMONIC CODING - STATE OF THE ART AND FUTURE-TRENDS SO SPEECH COMMUNICATION LA English DT Article C1 Univ Tecn Lisboa, INST SUPER TECN, LISBON 1, PORTUGAL. RP TRANCOSO, IM (reprint author), INESC, LISBON, PORTUGAL. RI Marques, Jorge/C-1427-2010; Trancoso, Isabel/C-5965-2008; Tribolet, Jose/A-6408-2012 OI Marques, Jorge/0000-0002-3800-7756; Trancoso, Isabel/0000-0001-5874-6313; Tribolet, Jose/0000-0003-1903-4561 CR Almeida L. B., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing ALMEIDA LB, 1983, IEEE T ACOUST SPEECH, V31, P664, DOI 10.1109/TASSP.1983.1164128 ALMEIDA LB, 1984, P INT C ACOUSTICS SP Almeida L. B., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing Atal B. S., 1982, Proceedings of ICASSP 82. 
IEEE International Conference on Acoustics, Speech and Signal Processing ATAL BS, 1970, AT&T TECH J, V49, P1973 Bronson E. C., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) FLANAGAN JL, 1966, AT&T TECH J, V45, P1493 George E. B., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) GERSHO A, 1984, P INT C ACOUSTICS SP GRIFFIN DW, 1986, P ICASSP 86, P125 HEDELIN P, 1981, P IEEE INT C AC SPEE, P205 MAKHOUL J, 1985, P IEEE, V73, P1551, DOI 10.1109/PROC.1985.13340 Markel JD, 1976, LINEAR PREDICTION SP MARQUES JS, 1987, 11 P C TRAIT SIGN IM, P447 MCAULAY R, 1985, P IEEE ICASSP85 TAMP, P945 McAulay R. J., 1986, P IEEE ICASSP, P1713 MCAULAY RJ, 1986, IEEE T ACOUST SPEECH, V34, P744, DOI 10.1109/TASSP.1986.1164910 Rodrigues J. S., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) Schroeder M., 1985, P IEEE INT C AC SPEE, P937 TRANCOSO IM, 1985, P INT C ACOUSTICS SP, P260 TRIBOLET JM, 1979, IEEE T ACOUST SPEECH, V27, P512, DOI 10.1109/TASSP.1979.1163283 NR 22 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD JUL PY 1988 VL 7 IS 2 BP 239 EP 245 DI 10.1016/0167-6393(88)90043-X PG 7 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA P3707 UT WOS:A1988P370700011 ER PT J AU VANBERGEM, DR POLS, LCW KOOPMANSVANBEINUM, FJ AF VANBERGEM, DR POLS, LCW KOOPMANSVANBEINUM, FJ TI PERCEPTUAL NORMALIZATION OF THE VOWELS OF A MAN AND A CHILD IN VARIOUS CONTEXTS SO SPEECH COMMUNICATION LA English DT Article RP VANBERGEM, DR (reprint author), UNIV AMSTERDAM, INST PHONET SCI, AMSTERDAM, NETHERLANDS. 
CR AINSWORT.WA, 1974, LANG SPEECH, V17, P103 ASSMANN PF, 1982, J ACOUST SOC AM, V71, P975, DOI 10.1121/1.387579 BLADON RAW, 1984, LANG COMMUN, V4, P59, DOI 10.1016/0271-5309(84)90019-3 DECHOVITZ D, 1977, SR5152 HASK LAB STAT, P213 DELATTRE P, 1969, IRAL-INT REV APPL LI, V7, P295, DOI 10.1515/iral.1969.7.4.295 DISNER SF, 1980, J ACOUST SOC AM, V67, P253, DOI 10.1121/1.383734 DRULLMAN R, 1987, 95 U AMST I PHON SCI DUIFHUIS H, 1982, J ACOUST SOC AM, V71, P1578 FANT G, 1975, Q PROG STAT REP STL, V2, P1 FUJISAKI H, 1968, IEEE T ACOUST SPEECH, VAU16, P73, DOI 10.1109/TAU.1968.1161952 GERSTMAN LJ, 1968, IEEE T ACOUST SPEECH, VAU16, P78, DOI 10.1109/TAU.1968.1161953 Harshman R., 1970, WORKING PAPERS PHONE, V16 JOHNSON TL, 1982, J ACOUST SOC AM, V72, P1761, DOI 10.1121/1.388649 Joos M., 1948, LANGUAGE SUPPL, V24, P1, DOI DOI 10.2307/522229 KLEIN W, 1970, J ACOUST SOC AM, V48, P999, DOI 10.1121/1.1912239 KOOPMANSVANBEIN.FJ, 1980, THESIS U AMSTERDAM A LADEFOGED P, 1957, J ACOUST SOC AM, V29, P98, DOI 10.1121/1.1908694 LINDBLOM B, 1963, J ACOUST SOC AM, V35, P1773, DOI 10.1121/1.1918816 LOBANOV BM, 1971, J ACOUST SOC AM, V49, P606, DOI 10.1121/1.1912396 MACCHI MJ, 1980, J ACOUST SOC AM, V68, P1636, DOI 10.1121/1.385219 Markel JD, 1976, LINEAR PREDICTION SP MILLER RL, 1953, J ACOUST SOC AM, V25, P114, DOI 10.1121/1.1906983 Nearey Terrance Michael, 1977, THESIS U CONNECTICUT NOOTEBOOM SG, 1974, IPO9 ANN PROGR REP, P47 NORDSTROM PE, 1975, 8TH INT C PHON SCI L PETERSON GE, 1952, J ACOUST SOC AM, V24, P175, DOI 10.1121/1.1906875 Plomp R, 1976, ASPECTS TONE SENSATI POLS LCW, 1969, J ACOUST SOC AM, V46, P458, DOI 10.1121/1.1911711 POLS LCW, 1973, J ACOUST SOC AM, V53, P1093, DOI 10.1121/1.1913429 REPP BH, 1977, J ACOUST SOC AM, V62, P720, DOI 10.1121/1.381584 REPP BH, 1979, J EXP PSYCHOL HUMAN, V5, P129, DOI 10.1037//0096-1523.5.1.129 SLAWSON AW, 1968, J ACOUST SOC AM, V43, P87, DOI 10.1121/1.1910769 STRANGE W, 1983, J ACOUST SOC AM, V74, P695, DOI 10.1121/1.389855 STRANGE W, 1976, J 
ACOUST SOC AM, V60, P213, DOI 10.1121/1.381066 SYRDAL AK, 1985, SPEECH COMMUN, V4, P121, DOI 10.1016/0167-6393(85)90040-8 THOMAE H, 1970, HUM DEV, V13, P1 VANBALEN CW, 1977, PRIPU, V2, P32 VANBERGEM DR, 1986, 88 U AMST I PHON SCI VANDERKA.LJ, 1971, ACTA PSYCHOL, V35, P64, DOI 10.1016/0001-6918(71)90032-1 VANDIJK JSC, 1984, 8 P I PHON SCI, P19 VERBRUGGE RR, 1976, J ACOUST SOC AM, V60, P198, DOI 10.1121/1.381065 WEENINK DJM, 1985, 10 P I PHON SCI, P41 WEENINK DJM, 1986, 82 U AMST I PHON SCI WEENINK DJM, 1986, 83 U AMST I PHON SCI WEENINK DJM, 1985, 9 P I PHON SCI, P45 NR 45 TC 14 Z9 14 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD MAR PY 1988 VL 7 IS 1 BP 1 EP 20 DI 10.1016/0167-6393(88)90018-0 PG 20 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA N0904 UT WOS:A1988N090400001 ER PT J AU LAINE, UK AF LAINE, UK TI HIGHER POLE CORRECTION IN VOCAL-TRACT MODELS AND TERMINAL ANALOGS SO SPEECH COMMUNICATION LA English DT Article RP LAINE, UK (reprint author), HELSINKI UNIV TECHNOL, ACOUST LAB, SF-02150 ESPOO, FINLAND. 
CR DUNN HK, 1950, J ACOUST SOC AM, V22, P740, DOI 10.1121/1.1906681 FANT G, 1959, ERICSSON TECHNICS, P43 FANT G, 1970, ACOUSTIC THEORY SPEE, P115 FLANAGAN JL, 1972, SPEECH ANAL SYNTHESI, P25 GOLD B, 1980, APR P ICASSP 80 C DE, P128 GOLD B, 1968, IEEE T ACOUST SPEECH, VAU16, P81, DOI 10.1109/TAU.1968.1161954 Ingard KU, 1968, THEORETICAL ACOUSTIC, P343 INGARD U, 1953, J ACOUST SOC AM, V25, P1037, DOI 10.1121/1.1907235 ISHIZAKA K, 1975, IEEE T ACOUST SPEECH, V23, P370, DOI 10.1109/TASSP.1975.1162701 LAINE UK, 1982, REPRESENTATION SPEEC, P229 LAINE UK, 1982, MAY P ICASSP 82 PAR, P940 LAINE UK, 1984, 10TH P INT C PHON SC, P270 Maeda S., 1982, Speech Communication, V1, DOI 10.1016/0167-6393(82)90017-6 MARKEL JD, 1976, LINEAR PREDICTION SP, P7 ROSEN G, 1958, J ACOUST SOC AM, V30, P201, DOI 10.1121/1.1909541 STEVENS KN, 1953, J ACOUST SOC AM, V25, P734, DOI 10.1121/1.1907169 Titchmarsh E. C., 1932, THEORY FUNCTIONS, V2nd VANDENBERG J, 1960, ACTA PHYSIOL PHARM N, V9, P361 NR 18 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD MAR PY 1988 VL 7 IS 1 BP 21 EP 40 DI 10.1016/0167-6393(88)90019-2 PG 20 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA N0904 UT WOS:A1988N090400002 ER PT J AU HUANG, SS GRAY, RM AF HUANG, SS GRAY, RM TI SPELLMODE RECOGNITION BASED ON VECTOR QUANTIZATION SO SPEECH COMMUNICATION LA English DT Article C1 STANFORD UNIV, INFORMAT SYST LAB, STANFORD, CA 94305 USA. RP HUANG, SS (reprint author), GTE LABS INC, WALTHAM, MA 02254 USA. 
CR ABUT H, 1982, IEEE T ACOUST SPEECH, V30, P423, DOI 10.1109/TASSP.1982.1163907 ALDEFELD B, 1980, P IEEE, V68, P1364, DOI 10.1109/PROC.1980.11879 BURTON D, 1985, MAR P IEEE INT C AC, P29 BURTON DK, 1985, IEEE T ACOUST SPEECH, V33, P837, DOI 10.1109/TASSP.1985.1164650 BUSH MA, 1983, APR P IEEE INT C AC, P742 BUZO A, 1980, IEEE T ACOUST SPEECH, V28, P562, DOI 10.1109/TASSP.1980.1163445 DUNHAM MO, 1985, IEEE T COMMUN, V33, P83, DOI 10.1109/TCOM.1985.1096198 FOSTER J, 1985, IEEE T INFORM THEORY, V31, P348, DOI 10.1109/TIT.1985.1057035 Gray R.M., 1984, IEEE ASSP MAG APR, P4 GRAY RM, 1980, IEEE T ACOUST SPEECH, V28, P367, DOI 10.1109/TASSP.1980.1163421 HUGGINS AWF, 1986, VIDVOX HUMAN FACTORS ITAKURA F, 1975, IEEE T ACOUST SPEECH, VAS23, P67, DOI 10.1109/TASSP.1975.1162641 JELINEK F, 1985, P IEEE, V73, P1616, DOI 10.1109/PROC.1985.13343 KOPEC GE, 1985, IEEE T ACOUST SPEECH, V33, P850, DOI 10.1109/TASSP.1985.1164652 LEINER BM, 1973, IEEE T INFORM THEORY, V19, P706, DOI 10.1109/TIT.1973.1055066 LINDE Y, 1980, IEEE T COMMUN, V28, P84, DOI 10.1109/TCOM.1980.1094577 PAN KC, 1985, MAR P IEEE INT C AC, P874 PAN KC, 1985, IEEE T ACOUST SPEECH, V33, P546 RABINER LR, 1984, AT&T TECH J, V63, P1981 RABINER LR, 1983, AT&T TECH J, V62, P1075 ROSENBERG AE, 1979, AT&T TECH J, V58, P1797 SCHALK TB, 1982, MAR WORKSH STAND SPE SCHWARTZ RM, 1985, MAR P IEEE INT C AC, P1205 SHORE JE, 1983, IEEE T INFORM THEORY, V29, P473, DOI 10.1109/TIT.1983.1056716 SOONG FK, 1986, J ACOUST SOC AM S1, V80, P35 TSAO C, 1985, THESIS STANFORD U ST YOUN WS, 1986, APR P IEEE INT C AC, P717 NR 27 TC 5 Z9 5 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD MAR PY 1988 VL 7 IS 1 BP 41 EP 53 DI 10.1016/0167-6393(88)90020-9 PG 13 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA N0904 UT WOS:A1988N090400003 ER PT J AU OSHAUGHNESSY, D BARBEAU, L BERNARDI, D ARCHAMBAULT, D AF OSHAUGHNESSY, D BARBEAU, L BERNARDI, D ARCHAMBAULT, D TI DIPHONE SPEECH SYNTHESIS SO SPEECH COMMUNICATION LA English DT Article RP OSHAUGHNESSY, D (reprint author), INRS TELECOMMUN, NUNS ISL H3E 1H6, QUEBEC, CANADA. CR ALLEN J, 1976, P IEEE, V64, P433, DOI 10.1109/PROC.1976.10152 BROWMAN C, 1980, P IEEE INT C ASSP, P561 Caspers B., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0) DETTWEILER H, P IEEE INT C ASSP, P752 DETTWEILER H, 1985, ACUSTICA, V57, P268 Dixon N.R., 1968, IEEE Transactions on Audio and Electroacoustics, VAU-16, DOI 10.1109/TAU.1968.1161948 ELSENDOORN BAG, 1984, IPO ANN PROG REP, V19, P32 FUJIMURA O, 1981, PHONETICA, V38, P66 Fujimura Osamu, 1978, SYLLABLES SEGMENTS, P107 KAESLIN H, 1986, IEEE T ACOUST SPEECH, V34, P264, DOI 10.1109/TASSP.1986.1164810 Kucera H., 1967, COMPUTATIONAL ANAL P LEE DL, 1983, COMPUT SURV, V15, P351 LIN KS, 1981, IEEE T CONSUM ELECTR, V27, P144, DOI 10.1109/TCE.1981.273496 Lovins J. B., 1979, SPEECH COMM 97 M AC, P519 OSHAUGHN.D, 1974, IEEE T ACOUST SPEECH, VAS22, P282, DOI 10.1109/TASSP.1974.1162588 OSHAUGHNESSY D, 1984, SPEECH COMMUN, V3, P233, DOI 10.1016/0167-6393(84)90018-9 OLIVE J, 1980, P IEEE INT C ASSP, P568 Olive J. P., 1977, P INT C ACOUST SPEEC, P568 SCHWARTZ R, 1979, P IEEE INT C AC SPEE, P891 SHADLE CH, 1979, J ACOUST SOC AM, V66, P1325, DOI 10.1121/1.383553 SIVERTSEN E, 1961, LANG SPEECH, V4, P27 Strube H. W., 1982, Speech Communication, V1, DOI 10.1016/0167-6393(82)90029-2 NR 22 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD MAR PY 1988 VL 7 IS 1 BP 55 EP 65 DI 10.1016/0167-6393(88)90021-0 PG 11 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA N0904 UT WOS:A1988N090400004 ER PT J AU VIDAL, E CASACUBERTA, F BENEDI, JM LLORET, MJ RULOT, H AF VIDAL, E CASACUBERTA, F BENEDI, JM LLORET, MJ RULOT, H TI ON THE VERIFICATION OF TRIANGLE INEQUALITY BY DYNAMIC TIME-WARPING DISSIMILARITY MEASURES SO SPEECH COMMUNICATION LA English DT Article RP VIDAL, E (reprint author), UNIV POLITECN VALENCIA, DEPT SIST INFORMAT COMPUTAC, VALENCIA, SPAIN. RI Benedi, Juana/K-9740-2014 OI Benedi, Juana/0000-0002-3796-639X CR Bellman R., 1972, DYNAMIC PROGRAMMING CASACUBERTA F, 1987, IEEE T ASSP, V35 DAVIS SB, 1980, IEEE T ACOUST SPEECH, V28, P357, DOI 10.1109/TASSP.1980.1163420 di Martino J., 1985, NEW SYSTEMS ARCHITEC, P405 Duda R. O., 1973, PATTERN CLASSIFICATI Duds O., 2000, RECONOCIMIENTO AUTOM GUPTA V, 1982, J ACOUST SOC AM, V71, P1581, DOI 10.1121/1.387812 GUPTA VN, 1984, P ICASSP KAHN D, 1984, J ACOUST SOC AM, V75, P590, DOI 10.1121/1.390532 KITAZUME Y, 1985, IEEE T ACOUST SPEECH, V33, P1, DOI 10.1109/TASSP.1985.1164510 LLORET MJ, 1986, THESIS U VALENCIA LOCKWOOD P, 1986, 8TH INT C PATT REC P LOCKWOOD P, 1985, 5EME C AFCET REC FOR, P975 MOORE RK, 1985, NEW SYSTEMS ARCHITEC, P73 PIERACCINI R, 1984, SIGNAL PROCESS, P1 PRIETO N, 1987, IN PRESS REV INFORMA RABINER LR, 1981, IEEE T COMMUN, V29, P621, DOI 10.1109/TCOM.1981.1095031 RULOT H, 1984, QUESTIO, V8, P179 SAKOE H, 1978, IEEE T ACOUST SPEECH, V26, P43, DOI 10.1109/TASSP.1978.1163055 Vidal Ruiz E., 1985, Speech Communication, V4, DOI 10.1016/0167-6393(85)90058-5 VIDAL E, 1988, IN PRESS IEEE T ASSS VIDAL E, 1986, 8TH INT C PATT REC P Vidal Ruiz E., 1986, Pattern Recognition Letters, V4, DOI 10.1016/0167-8655(86)90013-9 VIDAL E, 1985, THESIS U VALENCIA ZWICKER E, 1961, J ACOUST SOC AM, V33, P248, DOI 10.1121/1.1908630 NR 25 TC 8 Z9 8 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, 
NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD MAR PY 1988 VL 7 IS 1 BP 67 EP 79 DI 10.1016/0167-6393(88)90022-2 PG 13 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA N0904 UT WOS:A1988N090400005 ER PT J AU HARMEGNIES, B LANDERCY, A AF HARMEGNIES, B LANDERCY, A TI INTRA-SPEAKER VARIABILITY OF THE LONG-TERM SPEECH SPECTRUM SO SPEECH COMMUNICATION LA English DT Note RP HARMEGNIES, B (reprint author), UNIV MONS, DEPT PHONET PSYCHOACOUST, B-7000 MONS, BELGIUM. CR BANULSTEROL, 1971, 7TH ACTS INT C AC BU, P253 FORMBY C, 1982, J ACOUST SOC AM, V71, P196, DOI 10.1121/1.387347 Frokjaer-Jensen B, 1976, BRUEL KJAER TECHNICA, V3, P3 FURUI S, 1972, ELECTRON COMMUN JPN, V55, P54 HARMEGNIES B, 1985, REV PHONETIQUE APPLI, V75 HARMEGNIES B, 1985, 15EMES ACT JEP PAR, P51 HARMEGNIES B, 1985, REV PHONETIQUE APPLI, V73, P69 HARMEGNIES B, 1984, 13EMES ACT JOURN ET, P115 HARMEGNIES B, 1985, REV PHONETIQUE APPLI, V74 LIENARD JS, 1984, 13EMES ACT JOURN ET, P57 MAJEWSKI W, 1974, SPEECH COMMUNICATION, V3 TARNOCZY T, 1962, 4TH ACTS INT C AC CO TARNOCZY T, 1958, ACUSTICA, V8, P393 ZALEWSKI J, 1978, ACUSTICA, V34, P24 NR 14 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD MAR PY 1988 VL 7 IS 1 BP 81 EP 86 DI 10.1016/0167-6393(88)90023-4 PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA N0904 UT WOS:A1988N090400006 ER PT J AU BERGERVACHON, C MORGON, A AF BERGERVACHON, C MORGON, A TI AN EVALUATION OF AUDITORY PERFORMANCES IN PATIENTS WITH COCHLEAR IMPLANTS SO SPEECH COMMUNICATION LA English DT Note C1 HOP EDOUARD HERRIOT, SERV OTORHINOLARYNGOL, F-69374 LYON 08, FRANCE. RP BERGERVACHON, C (reprint author), UNIV LYON 1, GENIE BIOL & MED LAB, F-69100 VILLEURBANNE, FRANCE. 
CR BANFAI P, 1984, 2ND P INT S COCHL IM, P183 BURIAN K, 1984, 2ND P INT S COCHL IM, P217 Chouard C H, 1981, Ann Otolaryngol Chir Cervicofac, V98, P593 Chouard C H, 1982, Ann Otolaryngol Chir Cervicofac, V99, P15 CHOUARD CH, 1984, 2ND P INT S COCHL IM CLARK GM, 1981, ACTA OTO-LARYNGOL, V91, P173, DOI 10.3109/00016488109138496 CLARK GM, 1981, ANN OTO RHINOL LARYN, V90, P227 Dallos P, 1973, AUDITORY PERIPHERY DOWELL RC, 1983, 2ND P INT S COCHL IM, P230 FARDEAU M, 1979, CAHIERS ORL, V14, P609 FUGAIN C, 1984, 2ND P INT S COCHL IM, P237 LAFON JC, 1977, REV ACOUSTIQUE, V42, P258 Lafon J.-C., 1983, Annales Francaises des Microtechniques et de Chronometrie, V37 MICHELSON RP, 1981, LARYNGOSCOPE, V91, P38 Morgon A., 1983, Annales Francaises des Microtechniques et de Chronometrie, V37 MORGON A, 1984, 2ND P INT S COCHL IM, P195 MOUHSSINE R, 1984, 4TH P SPAN PORT B2 4 ROSSI M, 1977, LINGUISTIQUE, V13, P63 VOIERS M, 1971, 2E P JOURN ET PA B C NR 19 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD MAR PY 1988 VL 7 IS 1 BP 87 EP 95 DI 10.1016/0167-6393(88)90024-6 PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA N0904 UT WOS:A1988N090400007 ER PT J AU MILLAR, B CLARK, J AF MILLAR, B CLARK, J TI SPECIAL ISSUE ON SPEECH RESEARCH IN AUSTRALIA SO SPEECH COMMUNICATION LA English DT Editorial Material C1 MACQUARIE UNIV, SPEECH HEARING & LANGUAGE RES CTR, N RYDE, NSW 2113, AUSTRALIA. RP MILLAR, B (reprint author), AUSTRALIAN NATL UNIV, DEPT ENGN PHYS, COMP SCI LAB, CANBERRA, ACT 2600, AUSTRALIA. NR 0 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1987 VL 6 IS 4 BP 283 EP 283 DI 10.1016/0167-6393(87)90001-X PG 1 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA L6710 UT WOS:A1987L671000001 ER PT J AU MILLAR, JB CLARK, JE AF MILLAR, JB CLARK, JE TI CURRENT SPEECH RESEARCH IN AUSTRALIA SO SPEECH COMMUNICATION LA English DT Article C1 MACQUARIE UNIV, SPEECH HEARING & LANGUAGE RES CTR, N RYDE, NSW 2113, AUSTRALIA. RP MILLAR, JB (reprint author), AUSTRALIAN NATL UNIV, DEPT ENGN PHYS, COMP SCI LAB, CANBERRA, ACT 2601, AUSTRALIA. CR BARLOW M, 1986, 1ST P AUSTR C SPEECH, P240 BLAMEY PJ, 1986, 1ST P AUSTR C SPEECH, P54 BOGNER RE, 1986, 1ST P AUSTR C SPEECH, P112 BOTTELL F, 1986, 1ST P AUSTR C SPEECH, P28 BRADLEY AB, 1986, 1ST P AUSTR C SPEECH, P254 BUTLER SJ, 1986, 1ST P AUSTR C SPEECH, P2 CLARK JE, 1986, 1ST P AUSTR C SPEECH, P342 DERMODY P, 1986, 1ST P AUSTR C SPEECH, P66 DUKE PF, 1986, 1ST P AUSTR C SPEECH, P168 DUONG N, 1986, 1ST P AUSTR C SPEECH, P98 FLAHERTY MJ, 1986, 1ST P AUSTR C SPEECH, P184 GROCKE M, 1986, 1ST P AUSTR C SPEECH, P266 HARRISON JM, 1986, 1ST P AUSTR C SPEECH, P222 HARVEY AL, 1986, 1ST P AUSTR C SPEECH, P334 INGRAM J, 1986, 1ST P AUSTR C SPEECH, P40 KOOB PC, 1986, 1ST P AUSTR C SPEECH, P150 LAI EMK, 1986, 1ST P AUSTR C SPEECH, P316 MAGDY MA, 1986, 1ST P AUSTR C SPEECH, P304 MAHESWARAN A, 1986, 1ST P AUSTR C SPEECH, P196 MANNELL RH, 1986, 1ST P AUSTR C SPEECH, P260 MEAD D, 1986, 1ST P AUSTR C SPEECH, P328 MILLAR JB, 1986, 1ST P AUSTR C SPEECH, P228 NANDAGOPAL D, 1986, 1ST P AUSTR C SPEECH, P298 OKANE M, 1986, 1ST P AUSTR C SPEECH, P322 PITTAM J, 1986, 1ST P AUSTR C SPEECH, P234 PURVIS HSJ, 1986, 1ST P AUSTR C SPEECH, P216 ROBINSON REE, 1986, 1ST P AUSTR C SPEECH, P136 SEIDL RA, 1986, 1ST P AUSTR C SPEECH, P86 STANDEN P, 1986, 1ST P AUSTR C SPEECH, P78 STONE BJ, 1986, 1ST P AUSTR C SPEECH, P366 SUMMERFIELD CD, 1986, 1ST P AUSTR C SPEECH, P348 TOGNERI R, 1986, 1ST P AUSTR C SPEECH, P292 VANDOORN JL, 1986, 1ST P AUSTR C SPEECH, P46 WAGNER M, 
1986, 1ST P AUSTR C SPEECH, P204 WALSH MJ, 1986, 1ST P AUSTR C SPEECH, P176 ZHOU KC, 1986, 1ST P AUSTR C SPEECH, P8 NR 36 TC 1 Z9 1 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1987 VL 6 IS 4 BP 285 EP 292 DI 10.1016/0167-6393(87)90002-1 PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA L6710 UT WOS:A1987L671000002 ER PT J AU BLAMEY, PJ DOWELL, RC BROWN, AM CLARK, GM AF BLAMEY, PJ DOWELL, RC BROWN, AM CLARK, GM TI A FORMANT-ESTIMATING SPEECH PROCESSOR FOR COCHLEAR IMPLANT PATIENTS SO SPEECH COMMUNICATION LA English DT Article C1 COCHLEAR PROPRIETARY LTD, LANE COVE, NSW 2066, AUSTRALIA. RP BLAMEY, PJ (reprint author), UNIV MELBOURNE, DEPT OTOLARYNGOL, PARKVILLE, VIC 3052, AUSTRALIA. CR BLAMEY PJ, 1987, IN PRESS J ACOUST SO BLAMEY PJ, 1985, J ACOUST SOC AM, V77, P209, DOI 10.1121/1.392260 BROWN AM, 1985, J LARYNGOL OTOL, V99, P231, DOI 10.1017/S0022215100096614 CLARK GM, 1984, J MED ENG TECHNOL, V8, P3, DOI 10.3109/03091908409032065 Dahl R., 1978, DANNY CHAMPION WORLD DEFILIPPO CL, 1978, J ACOUST SOC AM, V63, P1186, DOI 10.1121/1.381827 DOWELL RC, 1987, ANN OTO RHINOL LARYN, V96, P132 MILLAR JB, 1984, J SPEECH HEAR RES, V27, P280 OWENS E, 1980, MINIMAL AUDITORY CAP NR 9 TC 2 Z9 2 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. 
PD DEC PY 1987 VL 6 IS 4 BP 293 EP 298 DI 10.1016/0167-6393(87)90003-3 PG 6 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA L6710 UT WOS:A1987L671000003 ER PT J AU JOHNSON, AW BRADLEY, AB AF JOHNSON, AW BRADLEY, AB TI ADAPTIVE TRANSFORM CODING INCORPORATING TIME DOMAIN ALIASING CANCELLATION SO SPEECH COMMUNICATION LA English DT Article RP JOHNSON, AW (reprint author), ROYAL MELBOURNE INST TECHNOL, DEPT COMMUN & ELECTR ENGN, MELBOURNE, VIC 3001, AUSTRALIA. CR CROCHIERE RE, 1980, IEEE T ACOUST SPEECH, V28, P99, DOI 10.1109/TASSP.1980.1163353 Crochiere R. E., 1983, MULTIRATE DIGITAL SI DRAUMER WR, 1982, IEEE T COMMUN, V30, P655 Esteban D, 1977, IEEE T ACOUST SPEECH, P191 GERSHO A, 1984, IEEE INT C ACOUST SP LINDE Y, 1980, IEEE T COMMUN, V28, P84, DOI 10.1109/TCOM.1980.1094577 Max J., 1960, IEEE T INFORM THEORY, P7 NOLL P, 1975, INT ZURICH SEMINAR D, P1597 Princen J. P., 1987, IEEE INT C AC SPEECH, P2161 PRINCEN JP, 1986, IEEE T ACOUST SPEECH, V34, P1153, DOI 10.1109/TASSP.1986.1164954 RAMSTAD TA, 1982, INT C ACOUST SPEECH, P203 TRIBOLET JM, 1979, IEEE T ACOUST SPEECH, V27, P512, DOI 10.1109/TASSP.1979.1163283 NR 12 TC 11 Z9 11 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1987 VL 6 IS 4 BP 299 EP 308 DI 10.1016/0167-6393(87)90004-5 PG 10 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA L6710 UT WOS:A1987L671000004 ER PT J AU MACKIE, K DERMODY, P KATSCH, R AF MACKIE, K DERMODY, P KATSCH, R TI ASSESSMENT OF EVALUATION MEASURES FOR PROCESSED SPEECH SO SPEECH COMMUNICATION LA English DT Article RP MACKIE, K (reprint author), NATL ACOUST LABS, CHATSWOOD, NSW 2067, AUSTRALIA. 
CR BARNWELL TP, 1979, J ACOUST SOC AM, V66, P1658, DOI 10.1121/1.383664 BODE DL, 1974, J ACOUST SOC AM, V56, P963, DOI 10.1121/1.1903356 CLARK JE, 1985, J ACOUST SOC AM, V78, P458, DOI 10.1121/1.392468 DERMODY P, 1986, 1ST P AUSTR C SPEECH, P66 EGAN JP, 1948, LARYNGOSCOPE, V58, P955, DOI 10.1288/00005537-194809000-00002 FAIRBANKS G, 1958, J ACOUST SOC AM, V30, P596, DOI 10.1121/1.1909702 FRENCH NR, 1947, J ACOUST SOC AM, V19, P90, DOI 10.1121/1.1916407 GREENE BG, 1986, BEHAV RES METH INSTR, V18, P100, DOI 10.3758/BF03201008 KEWLEYPORT D, 1983, J ACOUST SOC AM, V73, P1779, DOI 10.1121/1.389402 LEVITT H, 1967, J ACOUST SOC AM, V42, P609, DOI 10.1121/1.1910630 MACKIE K, 1986, 1ST P AUSTR C SPEECH, P144 NAKATANI LH, 1973, J ACOUST SOC AM, V53, P1083, DOI 10.1121/1.1913428 NOOTEBOOM S, 1984, 10TH P INT C PHON B, V2, P481 PISONI DB, 1985, P IEEE, V73, P1665, DOI 10.1109/PROC.1985.13346 SCHROEDE.MR, 1968, J ACOUST SOC AM, V44, P1735, DOI 10.1121/1.1911323 STEVENS KN, 1978, J ACOUST SOC AM, V64, P1358, DOI 10.1121/1.382102 TAYLOR MM, 1967, J ACOUST SOC AM, V41, P782, DOI 10.1121/1.1910407 NR 17 TC 4 Z9 4 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1987 VL 6 IS 4 BP 309 EP 316 DI 10.1016/0167-6393(87)90005-7 PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA L6710 UT WOS:A1987L671000005 ER PT J AU MANNELL, R CLARK, JE AF MANNELL, R CLARK, JE TI TEXT-TO-SPEECH RULE AND DICTIONARY DEVELOPMENT SO SPEECH COMMUNICATION LA English DT Article RP MANNELL, R (reprint author), MACQUARIE UNIV, SPEECH HEARING & LANGUAGE RES CTR, N RYDE, NSW 2109, AUSTRALIA. 
CR CLARK JE, 1985, FESTSCHRIFT HONOUR A, V48, P251 Delbridge A., 1981, MACQUARIE DICT ELOVITZ HS, 1976, IEEE T ACOUST SPEECH, V24, P446, DOI 10.1109/TASSP.1976.1162873 JOHANSSON S, 1978, MANUAL INFORMATION A Kucera H., 1967, COMPUTATIONAL ANAL P LAVER J, 1982, UNPUB AUSTR ENGLISH NR 6 TC 3 Z9 3 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1987 VL 6 IS 4 BP 317 EP 324 DI 10.1016/0167-6393(87)90006-9 PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA L6710 UT WOS:A1987L671000006 ER PT J AU NEILSON, MD NEILSON, PD AF NEILSON, MD NEILSON, PD TI SPEECH MOTOR CONTROL AND STUTTERING - A COMPUTATIONAL MODEL OF ADAPTIVE SENSORY-MOTOR PROCESSING SO SPEECH COMMUNICATION LA English DT Article C1 UNIV NEW S WALES, PRINCE HENRY HOSP,DEPT NEUROL,CTR SPAST,RES UNIT, KENSINGTON, NSW 2033, AUSTRALIA. UNIV NEW S WALES, SCH MED, KENSINGTON, NSW 2033, AUSTRALIA. RP NEILSON, MD (reprint author), UNIV NEW S WALES, ST VINCENTS HOSP, CLIN RES UNIT ANXIETY DISORDERS, DARLINGHURST, NSW 2010, AUSTRALIA. CR ANDREWS G, 1983, J SPEECH HEAR DISORD, V48, P226 CHERRY C, 1956, J PSYCHOSOM RES, V1, P233 Fairbanks G, 1954, J SPEECH HEAR DISORD, V19, P133 GOPHER D, 1982, J EXP PSYCHOL HUMAN, V8, P146, DOI 10.1037/0096-1523.8.1.146 Gopher D, 1984, COGNITION MOTOR PROC KLEINMAN DL, 1970, AUTOMATICA, V6, P357, DOI 10.1016/0005-1098(70)90051-8 KLEINMAN DL, 1971, AUTOMATICA, V6, P371 LEE YW, 1965, INT J CONTROL, V2, P237, DOI 10.1080/00207176508905543 McRuer D. 
T, 1974, AGARDOGRAPH McRuer D.T., 1959, Journal of the Franklin Institute, V267, DOI 10.1016/0016-0032(59)90091-2 McRuer DT, 1959, J FRANKLIN I, V267, P511, DOI 10.1016/0016-0032(59)90072-9 NAVON D, 1979, PSYCHOL REV, V86, P214, DOI 10.1037/0033-295X.86.3.214 NEILSON MD, 1980, THESIS U NSW AUSTR NEILSON MD, 1985, MOTOR MEMORY CONTROL NEILSON PD, 1987, UNPUB CENTRAL PROCES NEILSON PD, 1980, Q J EXP PSYCHOL, V32, P125, DOI 10.1080/00335558008248238 NORMAN DA, 1975, COGNITIVE PSYCHOL, V7, P44, DOI 10.1016/0010-0285(75)90004-3 PEW RW, 1974, HUMAN INFORMATION PR RAIBERT MH, 1978, BIOL CYBERN, V29, P29, DOI 10.1007/BF00365233 SALTZMAN E, 1979, J MATH PSYCHOL, V20, P91, DOI 10.1016/0022-2496(79)90020-8 WICKENS C, 1983, SCIENCE, V221, P1080, DOI 10.1126/science.6879207 Wickens CD, 1980, ATTENTION PERFORMANC, VVIII YATES AJ, 1963, BEHAV RES THER, V1, P95, DOI 10.1016/0005-7967(63)90013-5 NR 23 TC 36 Z9 36 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1987 VL 6 IS 4 BP 325 EP 333 DI 10.1016/0167-6393(87)90007-0 PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA L6710 UT WOS:A1987L671000007 ER PT J AU PLANT, G AF PLANT, G TI A SINGLE-TRANSDUCER VIBROTACTILE AID TO LIPREADING SO SPEECH COMMUNICATION LA English DT Article RP PLANT, G (reprint author), NATL ACOUST LABS, CHATSWOOD, NSW 2067, AUSTRALIA. 
CR BOOTHROYD A, 1983, J ACOUST SOC AM S1, V63, pS27 CLARK J E, 1981, Australian Journal of Audiology, V3, P21 DEFILIPPO CL, 1978, J ACOUST SOC AM, V63, P1186, DOI 10.1121/1.381827 GRANT K, 1980, THESIS U WASHINGTON GRANT KW, 1985, J ACOUST SOC AM, V77, P671, DOI 10.1121/1.392335 KALLIKOW DN, 1977, J ACOUST SOC AM, V61, P1337 PLANT G, 1984, Australian Journal of Audiology, V6, P55 PLANT G, 1984, Australian Journal of Audiology, V6, P65 PLANT G, 1983, STL QPSR, V2, P61 RISBERG A, 1978, STL QPSR, V2, P51 Risberg A., 1978, STL QPSR, V4, P1 RISBERG A, 1974, SCAND AUDIOL S, V4, P153 ROSEN SM, 1981, NATURE, V291, P150, DOI 10.1038/291150a0 ROTHENBERG M, 1979, J ACOUST SOC AM, V66, P1029, DOI 10.1121/1.383322 RUDMIN F, 1983, VOLTA REV, V85, P263 TRAUNMULLER H, 1980, J COMMUN DISORD, V13, P183, DOI 10.1016/0021-9924(80)90035-0 NR 16 TC 0 Z9 0 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1987 VL 6 IS 4 BP 335 EP 342 DI 10.1016/0167-6393(87)90008-2 PG 8 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA L6710 UT WOS:A1987L671000008 ER PT J AU ROSE, P AF ROSE, P TI CONSIDERATIONS IN THE NORMALIZATION OF THE FUNDAMENTAL-FREQUENCY OF LINGUISTIC TONE SO SPEECH COMMUNICATION LA English DT Article RP ROSE, P (reprint author), AUSTRALIAN NATL UNIV FAC, DEPT LINGUIST, CANBERRA, ACT, AUSTRALIA. CR BLADON RAW, 1981, J ACOUST SOC AM, V69, P1414, DOI 10.1121/1.385824 Chen G. 
T., 1974, J CHINESE LINGUISTIC, V2, P159 CHIANG HT, 1967, PHONETICA, V17, P100 DISNER SF, 1980, J ACOUST SOC AM, V67, P253, DOI 10.1121/1.383734 DREHER JJ, 1966, 4156 DOUGL ADV RES L EARLE MA, 1975, MONOGRAPH SPEECH COM, V11 Fant G., 1973, SPEECH SOUNDS FEATUR Fujisaki H., 1983, PRODUCTION SPEECH, P39 JASSEM W, 1975, AUDITORY ANAL PERCEP, P523 LADD DR, 1985, J ACOUST SOC AM, V78, P435, DOI 10.1121/1.392466 Ladefoged P., 1967, 3 AREAS EXPT PHONETI LEATHER J, 1983, J PHONETICS, V11, P373 Lehiste I., 1970, SUPRASEGMENTALS MENN L, 1982, LANG SPEECH, V25, P341 Nolan F, 1983, PHONETIC BASES SPEAK PHUONG VT, 1981, THESIS AUSTR NATIONA ROSE P, IN PRESS PACIFIC LIN ROSE PJ, 1982, THESIS CAMBRIDGE U Takefuta Yukio, 1975, MEASUREMENT PROCEDUR, P363 YUAN JH, 1980, HANYU FANGYAN GAIYAO NR 20 TC 48 Z9 48 PU ELSEVIER SCIENCE BV PI AMSTERDAM PA PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS SN 0167-6393 EI 1872-7182 J9 SPEECH COMMUN JI Speech Commun. PD DEC PY 1987 VL 6 IS 4 BP 343 EP 351 DI 10.1016/0167-6393(87)90009-4 PG 9 WC Acoustics; Computer Science, Interdisciplinary Applications SC Acoustics; Computer Science GA L6710 UT WOS:A1987L671000009 ER EF