Acoustic characteristics of oral speech

Oral speech is characterized by many physical parameters.

Along with its content side, the prosodic side of speech is of great importance for the listener's perception.


Prosody is the highest level of language development.

The prosodic design of a text is subordinated to the semantic-syntactic task of the utterance. It is shaped by a set of factors: psychophysiological, situational, need-motivational, and extralinguistic. Together, these factors determine the acoustic-articulatory characteristics of prosody as a whole. The main component of prosody is intonation, one of the most important aspects of oral speech: through intonation, the meaning of speech and its subtext are revealed.


Intonation is a complex phenomenon that includes several acoustic components: the tone of the voice, its timbre, the intensity (strength) of the voice, pauses, logical stress, and the tempo of speech. All these components are involved in dividing and organizing the speech flow in accordance with the meaning of the transmitted message.

Acoustic correlates of intonation characteristics are changes in the intensity and frequency of the fundamental tone of the voice, as well as the duration of individual phonetic elements. The tone of the voice is formed by the passage of air through the pharynx, vocal folds, mouth, and nose.
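As a rough illustration of the first two correlates, the sketch below measures intensity as root-mean-square amplitude and estimates the frequency of the fundamental tone by picking the peak of the autocorrelation function. The function names, the 75-400 Hz search range, and the synthetic test frame are assumptions made for this example; the text does not prescribe a measurement method, and practical pitch trackers are considerably more elaborate.

```python
import math

def rms_intensity(frame):
    """Root-mean-square amplitude of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def estimate_f0(frame, sample_rate, f_min=75.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) from the autocorrelation peak.

    The search range f_min..f_max is an assumed range for the voice,
    not a value taken from the text. Returns None if no positive
    autocorrelation peak is found (for example, for a silent frame).
    """
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None

# Synthetic 50 ms "voiced" frame: a 200 Hz sine sampled at 8 kHz
# (illustrative values only).
SAMPLE_RATE = 8000
frame = [0.5 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(400)]

print(rms_intensity(frame))
print(estimate_f0(frame, SAMPLE_RATE))
```

Tracking these two values frame by frame, together with the durations of phonetic elements, gives the kind of contours from which intonation is usually described.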

An additional articulatory-acoustic coloration of the voice is timbre (“voice color”). While the tone of voice can be common to many people, timbre is as individual as fingerprints.

Individual characteristics of prosody are combined and coordinated by the tempo-rhythmic organization of the speech flow.


The rate of speech is usually defined as the speed of the speech flow in time, or equivalently as the speed of articulation, and is measured by the number of sound units spoken per unit of time. A sound unit can be a sound, a syllable, or a word. In an adult, the rate of speech in a calm state varies from 90 to 175 syllables per minute.

In practice, three main types of tempo are distinguished: normal, fast, and slow. The tempo of the same speaker can be either stable or changing. A stable speech rate can be maintained only over short segments of a message.
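The definition of speech rate can be sketched as a short calculation. The function names are hypothetical, and using the 90-175 syllables-per-minute range of calm adult speech as the boundary between slow, normal, and fast is an assumption of this example; the text gives no explicit thresholds for the three types.

```python
def speech_rate(syllable_count, duration_seconds):
    """Speech rate in syllables per minute."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return syllable_count * 60.0 / duration_seconds

def classify_tempo(rate, low=90.0, high=175.0):
    """Label a rate as slow, normal, or fast.

    The default bounds reuse the calm-state adult range quoted in the
    text; treating them as category boundaries is an assumption.
    """
    if rate < low:
        return "slow"
    if rate > high:
        return "fast"
    return "normal"

# 45 syllables spoken in 20 seconds.
rate = speech_rate(45, 20)
print(rate, classify_tempo(rate))  # 135.0 normal
```

Because a stable rate holds only over short segments, a calculation like this is more informative when applied to individual segments than to a whole message.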

Tempo plays a significant role in the transmission of emotional-modal information. Sharp deviations of the rate of speech from the average values, whether acceleration or deceleration, interfere with the perception of the semantic side of the utterance.

The rate of speech largely determines the distinctive character of another parameter of speech: rhythm. The rhythm of speech is the sound organization of speech through the alternation of stressed and unstressed syllables. Tempo and rhythm are interdependent, and their relationship is complex.

Rhythm has a number of components. Its main property is regularity. The metric features of rhythm make up its "skeleton", which is reflected in metric schemes (the number and order of stressed and unstressed syllables). There are also non-metric features of rhythm, which are included in the concept of speech melody.
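A metric scheme can be written down directly from the stress pattern of a stretch of speech. In the sketch below, the "S"/"u" notation, the function names, and the test for regularity (equal distances between consecutive stressed syllables) are all assumptions chosen for illustration; the text does not define how regularity should be measured.

```python
def metric_scheme(stresses):
    """Render a stress pattern as a string: 'S' stressed, 'u' unstressed."""
    return "".join("S" if stressed else "u" for stressed in stresses)

def inter_stress_intervals(stresses):
    """Distance in syllables from each stressed syllable to the next one."""
    positions = [i for i, stressed in enumerate(stresses) if stressed]
    return [b - a for a, b in zip(positions, positions[1:])]

def is_regular(stresses):
    """Crude regularity check: all inter-stress intervals are equal."""
    return len(set(inter_stress_intervals(stresses))) <= 1

# Six syllables with stress on every second one.
pattern = [False, True, False, True, False, True]
print(metric_scheme(pattern))           # uSuSuS
print(inter_stress_intervals(pattern))  # [2, 2]
print(is_regular(pattern))              # True
```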


The tempo-rhythmic organization of oral speech is the core that unites and coordinates all of its components, including lexical and grammatical structuring, the articulatory-respiratory program, and the whole complex of prosodic characteristics.

At present, we can speak of the tempo-rhythm-intonational division of speech, which arises not as a result of the sound arrangement of a ready-made lexical-syntactic structure of an utterance, but in the ongoing process of forming a thought and verbalizing it. Tempo-rhythm-intonational division permeates all phases of the construction of an utterance, starting from the speaker's intention and including lexical-syntactic structuring, as well as the motor-respiratory rhythmization of the speech flow (articulation and breathing).

The syntagma acts as the elementary unit of prosody, i.e., a segment of an utterance united by intonation and meaning. It has physiological integrity and clear boundaries and acts as a rhythmic period of oral speech. The syntagma is associated with meaning, and therefore with syntax and intonation. In prose, a syntagma includes on average 2-4 words, and in verse, 2-3 words. It is pronounced on one speech exhalation and represents a single articulatory complex.

A syntagma pronounced on one speech exhalation, without pauses in the process of continuous articulation, can be associated with the concept of fluency of speech. In other words, smooth speech is characterized by a single articulatory complex for pronouncing a syntagma on one speech exhalation.

In normal speech, fluency is organically combined with pauses, which are a necessary component of a speech utterance. Their duration and the nature of their distribution in the speech stream largely determine the rhythmic-melodic side of intonation.

It is customary to define a pause as a break in the sound of a voice for a certain time. In this case, the acoustic correlate of the pause is a drop in the intensity of the voice to zero, and the physiological one is a break in the work of the articulatory organs. The shortest pauses are associated with the pronunciation of occlusive consonants. They are characterized by the absence of a voice for the period while the organs of articulation are in a closed state before the "explosion". On average, they last about 0.1 seconds.
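The acoustic definition of a pause suggests a simple detection procedure over a frame-by-frame intensity contour: find every run of frames in which the intensity falls to the threshold, and keep the runs that last long enough. This is a minimal sketch; the function name, the frame length, and the `threshold` and `min_duration` parameters are assumptions. In particular, real recordings contain background noise and rarely reach exactly zero, so a small positive threshold would normally be needed.

```python
def find_pauses(intensity, frame_seconds, threshold=0.0, min_duration=0.1):
    """Return (start_time, duration) pairs, in seconds, for each pause.

    intensity     -- per-frame intensity values (any non-negative scale)
    frame_seconds -- duration of one frame (assumed constant)
    threshold     -- intensity at or below this value counts as silence
    min_duration  -- shortest silent run reported as a pause
    """
    pauses = []
    start = None
    # The appended sentinel value closes a silent run at the very end.
    for i, value in enumerate(list(intensity) + [float("inf")]):
        if value <= threshold:
            if start is None:
                start = i
        elif start is not None:
            duration = (i - start) * frame_seconds
            if duration >= min_duration:
                pauses.append((start * frame_seconds, duration))
            start = None
    return pauses

# Illustrative contour with 50 ms frames: one isolated silent frame
# (too short to count), a 0.15 s gap, and a 0.6 s gap.
contour = [0.8, 0.0, 0.7] + [0.0] * 3 + [0.9, 0.6] + [0.0] * 12 + [0.5]
for start_time, duration in find_pauses(contour, frame_seconds=0.05):
    print(round(start_time, 2), round(duration, 2))
```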

In the process of oral speech, it is periodically necessary to breathe in to meet biological needs and to maintain optimal subglottic pressure during speech. This occurs at the time of the so-called “breathing pauses”. Their frequency and duration depend on the general rate of speech and the boundaries of syntagmas. These pauses also carry a semantic load, since they divide the text into semantic segments. The duration of these pauses is on average 0.5-1.5 seconds.

In contextual oral speech, in contrast to reading, pauses are found not only at the boundaries of syntagmas but also within them. Their duration is highly variable. These pauses are called hesitation pauses. There are several hypotheses regarding hesitation pauses. It is believed that they mark a period of intense mental activity associated with solving a mental task (“what to say?”), as well as with planning the utterance at the lexical-grammatical level; i.e., the duration of the pauses reflects the mental activity of the speaker during the internal planning of the utterance.
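The three kinds of pauses described above can be told apart, very roughly, by duration and by position relative to syntagma boundaries. The sketch below is only a heuristic built on the average figures quoted in the text (about 0.1 s for stop closures, 0.5-1.5 s for breathing pauses). The function name, the 0.15 s cutoff for stop closures, and the assumption that the syntagma boundaries are already known are all introduced here for illustration.

```python
def classify_pause(duration, at_syntagma_boundary, closure_max=0.15):
    """Heuristically label a pause.

    duration             -- pause length in seconds
    at_syntagma_boundary -- True if the pause falls between syntagmas
    closure_max          -- assumed upper limit for a stop-consonant
                            closure (the text gives only an average
                            of about 0.1 s)
    """
    if not at_syntagma_boundary:
        # Inside a syntagma: a very short silence is taken to be the
        # closure phase of a stop consonant, anything longer a hesitation.
        if duration <= closure_max:
            return "stop closure"
        return "hesitation pause"
    if 0.5 <= duration <= 1.5:
        return "breathing pause"
    return "unclassified boundary pause"

print(classify_pause(0.1, at_syntagma_boundary=False))  # stop closure
print(classify_pause(0.8, at_syntagma_boundary=True))   # breathing pause
print(classify_pause(0.6, at_syntagma_boundary=False))  # hesitation pause
```

Since the duration of hesitation pauses is highly variable, position within the syntagma is the more reliable cue here, and duration alone would not separate them from breathing pauses.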

All acoustic characteristics of oral speech are gradually formed in the process of speech ontogenesis and become quite stable and individual in an adult.


