Voice and Speech in Space

We were part of an inspirational conference at the Houston Space Center on October 4, 2024. It was the annual meeting of the Pan American Vocology Association. Many themes were developed with aspirations beyond an earthbound existence. Two scientists, Arian Shamei and Bryan Gick, conducted a panel discussion titled “Beyond Earth: The Physiology of Speech and Voice in Outer Space.” The two authors of this article offered a few insights on the physics and physiology of voice and speech production, which are summarized here.

Gravity on Vocal Fold and Tongue Movement

The aerodynamic stresses driving vocal fold tissues range from 0.5 kPa to 10 kPa. For moderately loud conversational speech, we can assume the value 1.0 kPa. Assuming an average 1.0 cm² vocal fold surface on which these stresses act, the driving forces are on the order of:

(1000 Pa) × (0.0001 m²) = 0.1 N

The gravitational force on vocal folds that have a mass of about 1 gram would be

F = (0.001 kg) × (9.8 m/s²) ≈ 0.01 N

which is an order of magnitude less than the aerodynamic driving forces. In addition, the peak acceleration of the vocal folds during vibration is much higher than the acceleration due to gravity. For example, if the maximum amplitude A of vocal folds vibrating in a sinusoidal pattern were 0.001 m, the peak acceleration would be Aω² = (0.001 m)(2πf)². For a 100 Hz frequency of vibration, the peak acceleration would be 395 m/s², which is about 40 times greater than the acceleration due to gravity. For these reasons, most calculations on vocal fold dynamics do not include gravity.
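The arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the order-of-magnitude comparison, using the values quoted in the text; the function names are illustrative and not part of any published model.

```python
import math

G = 9.8  # m/s^2, acceleration due to gravity as used in the text

def aerodynamic_force(pressure_pa, area_m2):
    """Driving force = pressure x surface area (N)."""
    return pressure_pa * area_m2

def gravitational_force(mass_kg, g=G):
    """Weight of the tissue (N)."""
    return mass_kg * g

def peak_acceleration(amplitude_m, frequency_hz):
    """Peak acceleration of sinusoidal motion: A * omega^2 (m/s^2)."""
    omega = 2.0 * math.pi * frequency_hz
    return amplitude_m * omega ** 2

f_aero = aerodynamic_force(1000.0, 1.0e-4)   # 1.0 kPa acting on 1.0 cm^2
f_grav = gravitational_force(0.001)          # vocal folds of about 1 gram
a_peak = peak_acceleration(0.001, 100.0)     # 1 mm amplitude at 100 Hz

print(f"aerodynamic force:   {f_aero:.3f} N")
print(f"gravitational force: {f_grav:.4f} N")
print(f"peak acceleration:   {a_peak:.0f} m/s^2 ({a_peak / G:.0f} x g)")
```

Changing the amplitude or frequency arguments shows how quickly the peak acceleration grows with pitch, since it scales with the square of the frequency.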

In contrast, the average human tongue has a mass of about 90 grams, nearly two orders of magnitude larger than the mass of the vocal folds. Gravity, or the lack thereof in spaceflight, can have a significant effect on the posture and movement of the tongue. In their introductory presentation, Drs. Shamei and Gick reviewed published studies regarding the effects of spaceflight on the acoustic characteristics of speech and presented some of their own analyses of publicly available audio data from various space missions. Although there are many variables that may contribute to observed differences between speech on earth and speech during spaceflight, Shamei and Gick hypothesize that environmental conditions, such as microgravity, affect the posture of the tongue, and perhaps other articulators, resulting in a change in acoustic characteristics. There is much to be investigated regarding this hypothesis; here we provide a simple example, based on speech simulation, of how a slight postural change could affect the characteristics of speech.

Shown in Figure 1 are two “pseudo-midsagittal” profiles of a vocal tract (VT) model of an adult male talker, where the thick blue lines represent neutral configurations that result from the posture of the tongue, jaw, lip, velum, and larynx. In this modeling approach, the neutral VT shape serves as a substrate on which shape modulations are imposed to generate speech. In both cases shown in Figure 1, the modulations of the respective neutral VT configuration produce the sentence “I have a perfect memory,” but the neutral configuration in Fig. 1b has a slightly constricted pharyngeal region and a slightly expanded oral cavity relative to Fig. 1a, perhaps due to a lowered and backed tongue posture.

Figure 1: Pseudo-midsagittal plots showing temporal modulations of the vocal tract that produce “I have a perfect memory”. (a) Modulations superimposed on the initial neutral VT configuration, (b) Modulations superimposed on a modified neutral VT configuration where the pharyngeal region has been slightly constricted and the oral cavity slightly expanded.

The calculated frequencies of the first two resonances (essentially the same as formants for purposes of this article) over the time course of “I have a perfect memory” are shown in Figure 2 for both vocal tract model simulations of Figure 1. The connected dots indicate the resonance frequencies at successive time points, where those in gray were generated by modulation of the “Neutral VT posture 0” in Fig. 1a and those in blue by modulation of the “Neutral VT posture 1” in Fig. 1b. The solid line that envelops each set of resonances is the convex hull, providing a simplified view of the space occupied in each case. This plot indicates that changing the underlying neutral posture from VT 0 to VT 1 has the effect of slightly increasing the frequencies of the first resonance while lowering the second resonance frequencies (but primarily in the upper portion of the plot).

Figure 2: Vowel space plot based on the time-varying VT configurations in Fig. 1.

Space Suits and Air-Conditioned Space Environments

Much of the physics of respiration, phonation, and resonation is based on the movement of air. The famous Navier-Stokes equations describe this movement. The equations involve the air density and the air viscosity, both of which are in turn affected by temperature. Together with flow velocity and channel size, air density and viscosity determine whether air moves smoothly (laminar flow) in an air duct or breaks up into turbulent flow. The transition is characterized by the Reynolds number, written as

Re = ρ v d / μ

where ρ is the air density, μ is the air viscosity, v is the air particle velocity, and d is the effective diameter of the channel. If Re exceeds about 1800 anywhere in the vocal tract, the airflow is expected to become turbulent. Turbulence is a requirement for producing sibilant consonants in speech.
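To make the threshold concrete, the sketch below evaluates the Reynolds number for a narrow constriction at several air densities. The density and viscosity are typical values for air near sea level at room temperature, and the velocity and diameter are hypothetical numbers chosen only for illustration; none of them come from the panel discussion. Viscosity is held constant here, which is itself a simplifying assumption.

```python
def reynolds_number(density, velocity, diameter, viscosity):
    """Re = rho * v * d / mu (dimensionless)."""
    return density * velocity * diameter / viscosity

TURBULENCE_THRESHOLD = 1800.0  # approximate onset value quoted in the text

# Assumed illustrative values, not taken from the article:
RHO_EARTH = 1.2           # kg/m^3, air near sea level at room temperature
MU_AIR = 1.8e-5           # Pa*s, dynamic viscosity of air near room temperature
JET_VELOCITY = 30.0       # m/s, hypothetical flow speed in a constriction
CHANNEL_DIAMETER = 0.002  # m, hypothetical effective diameter (2 mm)

for fraction in (1.0, 0.5, 0.3):
    re = reynolds_number(fraction * RHO_EARTH, JET_VELOCITY,
                         CHANNEL_DIAMETER, MU_AIR)
    regime = "turbulent" if re > TURBULENCE_THRESHOLD else "laminar"
    print(f"density x{fraction:.1f}: Re = {re:.0f} ({regime})")
```

Under these assumptions, the same constriction and flow speed that produce turbulence at earth-surface density fall below the threshold once the density drops far enough, which is why air density matters for sibilants.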

Airway resistances, such as the glottal resistance, are sharply dependent on kinetic pressure drops across a constriction. These pressure drops are proportional to the kinetic pressure, written as 

Kinetic pressure = ½ ρ v²

Again, we see that air density plays a major role. It will alter glottal resistance, lip resistance, velar resistance, nasal resistance, and all resistances involved in consonant and vowel production.
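As a rough illustration of the density dependence, the following sketch computes the kinetic pressure at a fixed flow velocity and, conversely, the velocity that a given pressure would produce if it were converted entirely into kinetic pressure. That full conversion is an idealization, and the density values are assumed for illustration only.

```python
import math

def kinetic_pressure(density, velocity):
    """Kinetic pressure = 0.5 * rho * v^2 (Pa)."""
    return 0.5 * density * velocity ** 2

def velocity_for_pressure(pressure, density):
    """Velocity at which the kinetic pressure equals `pressure` (m/s)."""
    return math.sqrt(2.0 * pressure / density)

RHO_EARTH = 1.2  # kg/m^3, assumed typical value near sea level

for rho in (RHO_EARTH, 0.5 * RHO_EARTH):
    p_kin = kinetic_pressure(rho, 40.0)        # hypothetical 40 m/s jet
    v = velocity_for_pressure(1000.0, rho)     # 1.0 kPa, as in the text
    print(f"rho = {rho:.2f} kg/m^3: "
          f"kinetic pressure at 40 m/s = {p_kin:.0f} Pa, "
          f"velocity for 1 kPa = {v:.1f} m/s")
```

Halving the density halves the kinetic pressure at a given velocity, so the same lung pressure drives a faster jet through a constriction.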

The sound velocity in the airway is also dependent on air density. With an absolute atmospheric pressure P, and with 1.4 being the ratio of specific heats for air, the formula for sound velocity c is

c = √(1.4 P / ρ)

This velocity has a profound effect on the resonances (formants) of the vocal tract. In fact, all formants are directly proportional to the sound velocity. Doubling the speed of sound will double all formant frequencies, which alters gender and age perception. However, in a space environment, it is often possible to vary the pressure in direct proportion to the air density, which then leaves the sound velocity unchanged relative to the earth environment.
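The sketch below illustrates both points. It uses the sound velocity formula from the text and, as a stand-in for a real vocal tract, a uniform tube closed at the glottis and open at the lips, whose resonances are (2n − 1)c / (4L). The tube model, the 17.5 cm length, and the pressure and density values are assumptions made for illustration; they are not the simulation used for Figures 1 and 2.

```python
import math

GAMMA = 1.4  # ratio of specific heats for air, as in the formula above

def sound_speed(pressure, density):
    """c = sqrt(gamma * P / rho) (m/s)."""
    return math.sqrt(GAMMA * pressure / density)

def uniform_tube_formants(c, length, count=3):
    """Quarter-wavelength resonances of a closed-open uniform tube (Hz)."""
    return [(2 * n - 1) * c / (4.0 * length) for n in range(1, count + 1)]

P_EARTH = 101325.0  # Pa, standard sea-level pressure (assumed)
RHO_EARTH = 1.2     # kg/m^3, assumed typical value
VT_LENGTH = 0.175   # m, hypothetical adult vocal tract length

cases = {
    "earth surface": (P_EARTH, RHO_EARTH),
    "half pressure, half density": (0.5 * P_EARTH, 0.5 * RHO_EARTH),
    "same pressure, half density": (P_EARTH, 0.5 * RHO_EARTH),
}
for label, (p, rho) in cases.items():
    c = sound_speed(p, rho)
    formants = ", ".join(f"{f:.0f}" for f in uniform_tube_formants(c, VT_LENGTH))
    print(f"{label}: c = {c:.0f} m/s, formants = {formants} Hz")
```

When pressure and density are scaled by the same factor, the sound velocity and the tube resonances are unchanged; when density alone is halved, every resonance rises by a factor of √2.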

The product of air density and the speed of sound determines the characteristic impedance of the airway to sound waves. This impedance is

z = ρ c / A

where A is the cross-sectional area of the airway. This impedance determines how much pressure is needed to drive acoustic waves through the vocal tract. 

For quantifying source-filter interaction, the air inertance of a short section of the vocal tract is calculated as

I = ρ L /A

where L is the length of a short section and A is again the cross-sectional area of the air column in the duct. 
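Both quantities scale linearly with air density, which the following sketch makes explicit. The cross-sectional area, section length, density, and sound velocity are hypothetical values chosen for illustration, and the reduced-density case assumes the pressure is lowered in proportion so that the sound velocity stays the same.

```python
def characteristic_impedance(density, sound_speed, area):
    """z = rho * c / A (Pa*s/m^3)."""
    return density * sound_speed / area

def inertance(density, length, area):
    """I = rho * L / A (kg/m^4)."""
    return density * length / area

# Assumed illustrative values, not taken from the article:
RHO_EARTH = 1.2   # kg/m^3
C_SOUND = 344.0   # m/s, held fixed (pressure scaled along with density)
AREA = 3.0e-4     # m^2, hypothetical 3 cm^2 airway cross section
SECTION = 0.01    # m, hypothetical 1 cm section length

for fraction in (1.0, 0.5):
    rho = fraction * RHO_EARTH
    z = characteristic_impedance(rho, C_SOUND, AREA)
    i = inertance(rho, SECTION, AREA)
    print(f"density x{fraction:.1f}: z = {z:.3e} Pa*s/m^3, I = {i:.1f} kg/m^4")
```

In this sketch, halving the density halves both the impedance and the inertance, so less pressure is needed to drive the same acoustic flow and the inertive loading on the source is reduced.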

Air viscosity affects all aerodynamic and acoustic losses in the vocal tract. In addition, dry air may increase the viscosity of liquids on the inner surfaces of the airways. For vocal fold tissues, dry air is known to raise phonation threshold pressure.

In conclusion, speaking or singing in a space environment requires a well-controlled atmosphere, either in a space suit or in an air-conditioned room in space. If air temperature, density, viscosity, and pressure deviate from conditions on earth’s surface, it is as yet unclear how much adaptation is possible for speech over short periods of time.

How to Cite

Story, Brad and Titze, Ingo (2025), Voice and Speech in Space. NCVS Insights, Vol. 3(2), pp. 1-2. DOI: https://doi.org/10.62736/ncvs122956
