NCVS Insights – Science that Resonates
Human beatbox, the art of pushing back the limits of vocal possibilities
August 28, 2025
Volume 3, Issue 8 – August 2025
The human voice is undoubtedly the cheapest musical instrument in the world: everyone is born with it, and can train and develop it to suit their own aesthetic needs. Beyond its communicative functions, the human vocal instrument is capable of producing a wide variety of sounds and of imitating all kinds of other musical instruments (e.g. percussion, trumpet, saxophone, harmonica, guitar). Among current artistic vocal practices, human beatboxing is at the forefront of exploiting the possibilities of the human vocal instrument.
Human beatbox, an emerging musical language
Human beatbox emerged in American hip-hop culture in the 1970s. Vocal percussionists navigating this burgeoning culture naturally came to be known as human beatboxers, now shortened to beatboxers. American beatboxers pioneered the practice, developing it in the 1980s and 1990s. The practice then spread to Europe and began to renew itself. From just a few dozen practitioners in the early 2000s, the community has grown to thousands of beatboxers worldwide. Given the multiplication of events, the links forged on the Internet and the reflection carried out on how the practice is transmitted, we can properly speak of a community, with its codes, influences and references (Henrich Bernardoni and Giemza, 2023). Initially male-dominated, this community now includes a steadily growing number of female beatboxers.
To define human beatboxing in a nutshell, we can talk about multivocalism. Human beatbox is not a musical genre, but a musical practice in which the body is instrumentalized. It tends to be reduced to vocal percussion, traces of which can be found in various forms in all vocal traditions throughout the world and the ages. In practice, however, it blends vocal percussion with other singing techniques. Any sound produced by the human vocal instrument, whether voiced, percussed or blown, can be included in this constantly evolving vocal art. The primary idea of human beatbox is to use all vocal sounds to produce music, but it can also be used to produce sound effects or sound imitations. In this way, human beatbox is the product and sum of a multitude of vocal practices, which it synthesizes and remixes.
Originally, human beatboxing was practiced solo. The beatboxer created or reproduced a complex musical orchestration by mixing rhythmic and melodic elements, sometimes even including a vocal part. But there are other ways of approaching this art form. It can be practiced in a group, with several beatboxers, or with a beatboxer as part of a group of traditional musicians or a choir. Some beatboxers use loopers, machines that enable them to record themselves quickly and layer musical loops. Others play an instrument and beatbox at the same time, or beatbox directly into their instrument, as can be done with a flute, for example.
About boxemes and their automatic recognition
Beatbox sounds have a unique acoustic signature and they can be easily classified, as illustrated in Figure 1 (Paroni et al., 2021). A boxeme is defined as a fundamental unit of sound in human beatboxing, similar to how a phoneme is the smallest unit of sound in speech. The term was coined to describe the building blocks of beatboxing performances.
Figure 1: A. Temporal and spectral representation of the main boxemes; B. Classification through spectral clustering (from Paroni et al., 2021). The x and y axes are the outputs of a t-SNE projection (t-distributed stochastic neighbor embedding), so their scales are arbitrary.
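To make the idea of an acoustic signature concrete, here is a toy sketch in Python. It does not reproduce the t-SNE and clustering analysis of Paroni et al. (2021). Instead it swaps in a much simpler, standard audio feature, the zero-crossing rate, and applies it to two synthetic sounds. The synthetic "kick-like" and "hi-hat-like" signals, the sample rate and the decision threshold are all assumptions made for illustration, not recorded boxemes or values from the study.

```python
import math
import random

SAMPLE_RATE = 16_000  # Hz; assumed value for this illustration


def synth_kick(duration=0.1, freq=60.0):
    """Hypothetical stand-in for a kick-like sound: a decaying low-frequency tone."""
    n = int(duration * SAMPLE_RATE)
    return [
        math.exp(-30.0 * i / SAMPLE_RATE)
        * math.sin(2 * math.pi * freq * i / SAMPLE_RATE)
        for i in range(n)
    ]


def synth_hihat(duration=0.1, seed=0):
    """Hypothetical stand-in for a hi-hat-like sound: a decaying burst of white noise."""
    rng = random.Random(seed)
    n = int(duration * SAMPLE_RATE)
    return [
        math.exp(-30.0 * i / SAMPLE_RATE) * rng.uniform(-1.0, 1.0)
        for i in range(n)
    ]


def zero_crossing_rate(signal):
    """Fraction of consecutive sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(signal) - 1)


def classify(signal, threshold=0.1):
    """Label a sound by comparing its zero-crossing rate to an assumed threshold."""
    if zero_crossing_rate(signal) > threshold:
        return "hi-hat-like"
    return "kick-like"


if __name__ == "__main__":
    for name, sound in [("kick", synth_kick()), ("hi-hat", synth_hihat())]:
        print(name, round(zero_crossing_rate(sound), 3), classify(sound))
```

A single feature is enough to separate these two artificial sounds because a low tone changes sign rarely while noise changes sign constantly. Real boxemes are far more varied, which is why richer spectral representations are used in practice.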
Automatic speech recognition tools are now widely developed and are present in most speech technologies available on the market. What about human beatboxing? Automatic recognition of beatbox sounds is an emerging field that adapts speech recognition technologies to identify and classify the unique percussive sounds produced in human beatboxing. Researchers have explored various methodologies to achieve accurate recognition, leveraging tools like the Kaldi speech recognition toolkit (Evain et al., 2021). The feasibility of adapting speech recognition tools for beatbox sound recognition has been demonstrated, contributing to advancements in automatic database annotation and music information retrieval.
The mechanisms behind beatboxing
The art of beatboxing is to play with airflow, vocalization and articulation to produce rapid, agile successions of percussive sounds and melodies… all at the same time! To do this, beatboxers have developed expertise in the sensory-motor control of their vocal instrument. Few studies have explored the physiological mechanisms underlying the production of beatboxed sounds (Proctor et al., 2013; Sapthavee et al., 2014; Blaylock et al., 2017; Paroni et al., 2021; Blaylock, 2022; Paroni, 2022; Dehais Underdown, 2023). These studies have shown that beatboxers employ a diverse array of articulatory methods, including trills (tongue or lips), clicks, and ejectives. Unlike standard speech, which primarily relies on pulmonic egressive airflow (air pushed out from the lungs), human beatboxing incorporates multiple airstream mechanisms, either egressive (air out) or ingressive (air in), initiated at the level of the lungs, larynx, tongue or velum. Beatboxers develop breathing strategies distinct from regular speech breathing to optimize their performance and produce continuous sounds without pausing to breathe (see Figure 2).
Figure 2: Breathing patterns in human beatboxing (HBB) and speech, for several repetitions of the phrase “boots and cats” in the same breath group. The ventilatory volume signal (VR) is computed as a linear combination of thoracic (thorax) and abdominal (abdo) cross-sectional area signals measured by means of respiratory inductance plethysmography (see Paroni et al., 2021; Paroni, 2022).
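The linear combination mentioned in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the processing used in the cited studies. The function name `ventilatory_volume` and the weights `k_thorax` and `k_abdo` are hypothetical, since the article does not give the coefficients. In practice such weights are typically obtained through a calibration step, and the equal weights below are placeholders.

```python
def ventilatory_volume(thorax, abdo, k_thorax=0.5, k_abdo=0.5):
    """Combine thoracic and abdominal signals into one volume estimate.

    thorax, abdo: equal-length sequences of cross-sectional area samples.
    k_thorax, k_abdo: hypothetical placeholder weights; real values would
    come from calibrating the plethysmography bands for each speaker.
    """
    if len(thorax) != len(abdo):
        raise ValueError("thorax and abdo must have the same number of samples")
    # Sample-by-sample weighted sum of the two signals.
    return [k_thorax * t + k_abdo * a for t, a in zip(thorax, abdo)]


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    thorax = [0.0, 1.0, 2.0]
    abdo = [0.0, 2.0, 4.0]
    print(ventilatory_volume(thorax, abdo))
```

Keeping the two weights as explicit parameters makes it easy to see how the relative contribution of the rib cage and the abdomen changes the resulting volume trace.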
Human beatboxing and speech rehabilitation
Beatboxed sounds mobilize all the articulators of speech (mandible, lips, tongue, velum, larynx). Their production requires expert coordination of movements in terms of strength, amplitude and temporal organization (De Torcy et al., 2014; Sapthavee et al., 2014; Paroni, 2022; Dehais Underdown, 2023). Offering beatboxed sound exercises as part of speech therapy is thus a promising approach (Pillot-Loiseau et al., 2021). The first advantage is motivational: human beatboxing is fun and enjoyable, socially accepted and even rewarding (Icht & Carl, 2022). As a practice that coordinates laryngeal, pharyngeal and oral articulatory gestures, it enables an integrative approach to care. It promotes various speech-related skills by working on i) respiratory capacity (linked to vocal intensity), ii) posture, iii) speech rate, and iv) intelligibility (Icht, 2019, 2021; Icht & Carl, 2022; Paroni, 2022).
Nathalie Henrich Bernardoni
Director of Research at the CNRS, choirmaster and singer, Nathalie HENRICH BERNARDONI is a scientist passionate about the human voice in all its forms of expression. Her research focuses on the experimental and clinical phonetic description of speech and singing, on the physiological and physical characterization of various vocal techniques (lyrical singing, contemporary music, world songs), on the management of vocal effort in speech and singing, as well as the development and improvement of non-invasive experimental techniques for analyzing the human voice.