Dallas, Texas, June 11, 1978
A new speech synthesis monolithic integrated circuit has been developed by Texas Instruments Incorporated. It marks the first time the human vocal tract has been electronically duplicated on a single chip of silicon. Measuring 44,000 square mils, the chip is fabricated using TI's low-cost metal gate P-channel MOS process, the same used for TI calculator MOS ICs.
The speech synthesis MOS/LSI integrated circuit, two 128K dynamic ROMs (each with the capacity to store over 100 seconds of speech) and a special version of the TMS 1000 microcomputer, all developed by TI, serve as the main electronics for the new talking learning aid, SPEAK & SPELL(TM), for seven-year-olds and up. The new TI consumer product was introduced at the Summer Consumer Electronics Show in Chicago, June 11-14.
Speech encoding is achieved through pitch-excited Linear Predictive Coding (LPC). As the name implies, LPC uses a linear equation to build a mathematical model of the human vocal tract and to predict each speech sample from previous ones.
Linear Predictive Coding is a technique of analyzing and synthesizing human speech by determining from original speech a description of a time varying digital filter modeling the vocal tract. This filter is then excited by either periodic or random inputs. An on-chip 8-bit digital-to-analog (D/A) converter transforms digital information processed through the filter into synthetic speech.
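In textbook form, the prediction and synthesis steps described above can be written as the pair of equations below. This is a generic sketch rather than TI's own notation: the symbols a_k (predictor coefficients), G (gain, related to the energy parameter), u[n] (the periodic or random excitation) and s[n] (the speech sample) are assumed names, and sign conventions differ between references.

```latex
% Prediction: each sample is estimated from the 10 previous samples.
\hat{s}[n] = \sum_{k=1}^{10} a_k \, s[n-k]

% Synthesis: the same recursion, driven by the excitation u[n] scaled by a gain G.
s[n] = G \, u[n] + \sum_{k=1}^{10} a_k \, s[n-k]
```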
Codes for twelve synthesis parameters (10 filter coefficients, pitch and energy) serve as inputs to the synthesizer chip. These codes are stored in a ROM and, once decoded by on-chip circuitry, represent the time varying description of the LPC synthesis model.
Inputs to the digital filter take two forms: (1) periodic and (2) random. The periodic inputs are used to reproduce voiced sounds, which have a definite pitch, such as vowel sounds or voiced consonants such as Z, B or D. A random input models unvoiced sounds such as S, F, T and SH.
The speech synthesis chip has two separate logic blocks that generate the voiced and unvoiced excitation. The output of the digital filter drives the D/A converter, which in turn drives a speaker.
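The two kinds of excitation can be illustrated with a minimal Python sketch. It is not a description of the chip's logic blocks: the function name `excitation`, the use of a unit impulse train for voiced sounds, a pseudo-random +1/-1 sequence for unvoiced sounds, and the convention that a pitch period of 0 selects the unvoiced source are all assumptions made for illustration.

```python
import random


def excitation(num_samples, pitch_period, seed=0):
    """Return a list of excitation samples (hypothetical helper).

    pitch_period > 0  -> voiced: one unit impulse every pitch_period samples.
    pitch_period == 0 -> unvoiced: pseudo-random sequence of +1.0 / -1.0.
    """
    if pitch_period > 0:
        # Periodic input: gives the synthesized sound a definite pitch.
        return [1.0 if n % pitch_period == 0 else 0.0
                for n in range(num_samples)]
    # Random input: noise-like source for sounds such as S or SH.
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(num_samples)]


voiced = excitation(8, pitch_period=4)
unvoiced = excitation(8, pitch_period=0)
print(voiced)
print(unvoiced)
```

At an eight kilohertz sample rate, a pitch period of 80 samples in this sketch would correspond to a 100 Hz voice pitch.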
Key to TI's high quality LPC speech synthesizer is an advanced design 10-stage lattice filter which has an integrated array multiplier, an adder coupled to the multiplier output and various delay circuits coupled to the adder output.
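A lattice filter of this kind can be sketched in a few lines of Python. The version below is the standard textbook all-pole lattice recursion in floating point, not TI's circuit: the function name `lattice_synthesize`, the coefficient list `k` (reflection coefficients, one per stage) and the sign convention are assumptions. Each stage performs two multiplies, so ten stages cost twenty multiplies per output sample.

```python
def lattice_synthesize(excitation, k):
    """Run an excitation sequence through an all-pole lattice filter.

    excitation: iterable of input samples.
    k: reflection coefficients, one per stage (each |k[i]| < 1 for stability).
    Returns the list of output samples.
    """
    p = len(k)
    # b[i] holds the backward signal of stage i from the previous sample;
    # these are the delay elements of the lattice.
    b = [0.0] * (p + 1)
    out = []
    for e in excitation:
        f = e  # forward signal entering the top stage
        for i in range(p - 1, -1, -1):
            f = f - k[i] * b[i]          # multiply 1: update forward signal
            b[i + 1] = b[i] + k[i] * f   # multiply 2: update backward signal
        b[0] = f                         # output feeds the first delay
        out.append(f)
    return out


# One stage with k = 0.5 behaves like 1 / (1 + 0.5 z^-1).
print(lattice_synthesize([1.0, 0.0, 0.0, 0.0], [0.5]))
```

With all ten coefficients set to zero the filter passes its input through unchanged, which is a quick sanity check on the recursion.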
With this increased computational sequencing capability and a fast continuous data transfer rate, the multiplier can accept two inputs every five microseconds. Twenty multiply and accumulate operations are needed to generate each speech sample, and the circuit can generate up to 10,000 speech samples per second.
The chip is operated at an eight kilohertz rate for the Speak & Spell. This 10th order Linear Predictive Coding (LPC-10) speech synthesizer IC accurately reproduces human speech from stored or transmitted digital data.
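The throughput figures quoted above follow from simple arithmetic, shown here as a short Python check. The constant and variable names are illustrative; the numbers themselves (five microseconds per multiply, twenty operations per sample, eight kilohertz in the Speak & Spell) come from the text.

```python
MULTIPLY_PERIOD_US = 5            # multiplier accepts inputs every 5 microseconds
OPERATIONS_PER_SAMPLE = 20        # multiply-accumulate operations per speech sample
SPEAK_AND_SPELL_RATE_HZ = 8_000   # sample rate used in the Speak & Spell

# Time needed to compute one speech sample.
sample_time_us = MULTIPLY_PERIOD_US * OPERATIONS_PER_SAMPLE

# Highest sample rate the multiplier can sustain.
max_rate_hz = 1_000_000 // sample_time_us

# Time available per sample at 8 kHz, and what is left over.
speak_and_spell_period_us = 1_000_000 / SPEAK_AND_SPELL_RATE_HZ
spare_time_us = speak_and_spell_period_us - sample_time_us

print(sample_time_us, max_rate_hz, speak_and_spell_period_us, spare_time_us)
```

Twenty operations at five microseconds each take 100 microseconds, which gives the 10,000 samples per second ceiling; at eight kilohertz each sample period is 125 microseconds, leaving 25 microseconds to spare.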
www.ti.com/corp/docs/company/history/pmos.shtml
www.datamath.org/Album_Speech.htm