
Part 7 : General Conclusions

by Karen Lederer

The Acoustic and Auditory Phonetics of Human Beatboxing

We have seen in Part 6 that the compression of air involved in generating vocally produced transient sounds produces energy vibrating at more than one frequency. This makes the purity of the equivalent electronic sounds unachievable in the human mouth. The imitations of electronic sounds are otherwise accurate, with the resonant frequencies and general energy patterns of the clave and kick sounds being well replicated.

The hi-hat was a less accurate imitation in terms of energy distribution, but the aperiodicity of the electronic sound was well replicated.

In Part 3 it was suggested that, when lyrics are not involved, an accurate imitation of a drum machine sound should involve the subduing of typical speech frequencies or the accentuation of non-speech frequencies in order to make the beatboxed sounds seem less like speech. It was also observed that extraneous sounds should contain frequency components close to those of missing sounds if the missing sound is to be ‘heard’ by listeners.

The beatboxed hi-hat contains frequencies not normally associated with alveolar fricatives; however, speech frequencies typical of alveolar fricatives are also evident in the sound. In addition, the kick drum contains energy typical of an English nasal tone, but the burst energy is higher than that usually associated with bilabials.

This study shows that all the beatboxed sounds studied are distinct enough from typical English phonemes not to be perceived as such. However, the kick and hi-hat both contain frequency components that are close to those of English phonemes made with the same place of articulation. These components are close enough that, when the beatboxed sound replaces one of those phonemes, the listener is fooled into thinking they have heard a speech sound that is not actually there.

There is no phoneme similar to the click in the English language, so, as in Miriam Makeba’s ‘Wedding Song’, click sounds in beatboxing are likely to be perceived as percussive rather than phonemic by an English-speaking audience.

It was noted in Part 6.4 that the best-imitated sound was the one over which the beatboxer has least control, and the worst was the one over which the beatboxer has most control. It is also notable that the best-imitated sound was the one least like any phoneme of the beatboxer’s mother tongue, and the worst was the one most similar to a phoneme of the beatboxer’s mother tongue. It is more difficult to achieve accurate drum imitations with sounds articulated similarly to those of the beatboxer’s mother tongue than with sounds articulated in an unfamiliar way. This is likely due to established speech habits that cannot easily be reversed.

The discussion in Part 3 concludes that the accuracy of drum machine imitations must be sacrificed in order to maintain intelligibility when beatboxing is combined with speech. However, the results of this study show that speech frequencies exist in beatboxed sounds even when no lyrics are present, so there is no need to alter their properties according to the presence of lyrics.

The degree of accuracy with which drum machine sounds are replicated by beatboxed sounds depends on the properties of the original sound. However, all the imitations are easily identifiable as their electronic counterparts, and they all achieve a balance between speech and non-speech that allows them to be perceived as either or both by the listener.

