
5. What is Object-Based Audio? A deeper dive into 3D sound in Interactive Media.

Updated: Jun 23, 2022

Introduction

As we know from the Binaural blog, we hear all sounds in the natural world binaurally. Over the past couple of decades, however, there has been growing interest in how sounds in linear and interactive media can be manipulated to appear as though they are coming from any point in space, regardless of the speaker arrangement you have (stereo speakers, surround sound, headphones). You may have heard terms such as ‘spatial audio’, ‘3D audio’ or even ‘object-based audio’ coming up more and more over the past couple of years, and for good reason. These terms usually describe approaches to making the listening experience in interactive media more immersive, and more accessible.


This blog will explore what Object-Based Audio is, why it is important to the future of gaming and interactive media, and uncover more about the overall aims of 3D audio.


Object-Based Audio

The function of object-based audio is really quite simple! Each sound source is treated as an individual ‘sound object’, also referred to as an ‘emitter’. The position of the sound object can be changed by adjusting parameters (metadata) attached to that sound. This allows you to place the object anywhere you like in virtual space (Tsingos, Roginska and Geluso, 2018, chap. Object-Based Audio). When you listen back to the sound, the aim is to be able to perceive it as though it has really come from that position.
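The idea of a sound object carrying positional metadata can be sketched as a simple data structure. The class and field names below are purely illustrative assumptions, not taken from any real audio engine's API:

```python
from dataclasses import dataclass

# A minimal sketch of a sound object: a mono audio buffer plus positional
# metadata. All names here are hypothetical, for illustration only.
@dataclass
class SoundObject:
    name: str
    samples: list[float]   # mono audio data
    x: float = 0.0         # metres, left/right
    y: float = 0.0         # metres, up/down
    z: float = 0.0         # metres, front/back

    def move_to(self, x: float, y: float, z: float) -> None:
        """Update the positional metadata; a renderer would read this each frame."""
        self.x, self.y, self.z = x, y, z

car = SoundObject("car_engine", samples=[0.0] * 512)
car.move_to(-3.0, 0.0, 10.0)   # car approaching from front-left
```

The key point is that the audio itself never changes; only the metadata does, and the renderer decides how to play it back.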



Take a video game, for example, in which potentially hundreds of different sounds are ‘emitted’ from different things inside the game. Object-based audio allows you to connect the source of the sound to the emitter inside the game, so the position can change dynamically. If a car drives past you, for instance, object-based audio enables the apparent position of the sound source to match the position of the car. This is really important in tying the visual experience to the auditory experience.


Why Object-Based Audio?

The primary goal of Object-Based Audio is to enhance immersion by rendering sound both horizontally and vertically in space.

In the case of Binaural Audio, the position of a sound object relates to a particular HRTF (as discussed in this blog). Once the sound object has been rendered through the HRTF, it should sound as though it is coming from that position when listening through headphones.
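Because HRTF sets are measured at discrete directions, a binaural renderer typically picks (or interpolates between) the nearest measured direction for an object's position. The sketch below assumes a hypothetical 15-degree azimuth grid and only handles the horizontal plane:

```python
import math

# Hypothetical measurement grid: one HRTF every 15 degrees of azimuth.
measured_azimuths = list(range(0, 360, 15))

def nearest_hrtf_azimuth(x: float, z: float) -> int:
    """Map an object's horizontal position to the closest measured azimuth.

    0 degrees is straight ahead (+z), angles increase to the listener's right.
    """
    azimuth = math.degrees(math.atan2(x, z)) % 360
    # Compare angular distances with wrap-around (359 deg is close to 0 deg).
    return min(measured_azimuths,
               key=lambda a: min(abs(a - azimuth), 360 - abs(a - azimuth)))

nearest_hrtf_azimuth(1.0, 1.0)   # object 45 degrees to the right
```

A production renderer would interpolate between neighbouring HRTFs rather than snapping to one, but the mapping from position metadata to filter is the same idea.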


In the case of Cinema Audio, the position of the sound could relate to a specific speaker in the room, allowing you to feel as though you are part of the environment that is portrayed on screen.


Hearing a sound object from an individual speaker in a cinema environment. The sound object is positioned based on where it would be in the virtual environment, i.e. where the characters are looking (Tsingos, Roginska and Geluso, 2018, chap. Object-Based Audio).

How is it different from traditional Stereo or Surround Sound?

Traditional Stereo and Surround Sound speaker methods are described as ‘Channel-Based Audio’. This means that each speaker in the configuration is treated as a single ‘channel’ (Rumsey, 2001).


Take 5.1 surround sound: the ‘5’ represents the five speakers positioned around the room, and the ‘.1’ represents the sub-bass channel, giving six channels in total. The images below display different domestic speaker configurations that have been popular over the past half-century. These are taken from the Audiokinetic Worldwide Online Expo 2021, at which the game audio middleware company announced that their new version has full support for object-based audio (Audiokinetic, 2021).


When audio is produced using channel-based audio, the output speaker format is fixed. This means that if you mix a film for 5.1 surround sound, the only way to hear that audio correctly is through a 5.1 surround system.


Therefore, when the sound team is mixing the soundtrack to a film, they have to produce multiple mixes for different output configurations (usually at least standard stereo and 5.1 surround). As someone who has worked as a dubbing mixer on a feature-length documentary, I can attest to how laborious this process can be.

Waves Surround Sound Panner, used for creating channel-based audio mixes in DAWs. Sounds are panned between speakers, instead of being positioned by coordinates in space (S360 Surround Panner & Imager | Waves, 2022).

One of the major advantages of object-based audio, however, is that it generally doesn’t matter what type of speaker configuration you have. Instead of panning the position of the sound between channels, you simply position the sound anywhere in space using (x, y, z) coordinates. When it comes to playing the audio back, a special renderer distributes the sound between the relevant speakers in the listener’s own setup.
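For the simplest possible layout, a plain stereo pair, the renderer's job reduces to converting the object's (x, z) position into a gain for each speaker. The sketch below uses a constant-power pan law and is only an illustration of the idea; real object renderers handle arbitrary layouts, elevation and distance:

```python
import math

def stereo_gains(x: float, z: float) -> tuple[float, float]:
    """Return (left, right) speaker gains for an object at horizontal (x, z).

    Constant-power panning: left^2 + right^2 == 1, so perceived loudness
    stays roughly constant as the object moves across the stereo field.
    """
    azimuth = math.atan2(x, z)                              # 0 = straight ahead
    azimuth = max(-math.pi / 4, min(math.pi / 4, azimuth))  # clamp to +/-45 deg
    pan = azimuth / (math.pi / 4)                           # -1 (left) .. +1 (right)
    theta = (pan + 1) * math.pi / 4                         # 0 .. pi/2
    return math.cos(theta), math.sin(theta)

left, right = stereo_gains(0.0, 1.0)   # centred object: equal gains
```

Rendering to 5.1, 7.1.4 or any other layout uses the same position metadata; only the gain-distribution stage changes, which is why a single object-based mix can serve every configuration.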


This means that only one mix needs to be created for Object-Based Audio, unlike Channel-Based Audio, which requires a separate mix for each individual speaker configuration (Tsingos, Roginska and Geluso, 2018, chap. Object-Based Audio; Herre et al., 2012).



Mixing Object Based Audio

As mentioned above, the workflow for mixing object-based audio is different from traditional channel-based audio. Instead of using the pan pots available on conventional mixers inside DAWs (Digital Audio Workstations) to position the sounds, the mix engineer uses a special plugin designed for object-based panning.


One of the most popular formats for object-based audio is Dolby Atmos. The Dolby Atmos plugin suite can be used in most DAWs, allowing mix engineers to create a single mix that can be played through multiple different speaker configurations.

Recording Sound Objects

Audio Objects are usually recorded in mono. This means that they are recorded with a single microphone.


Spatial effects such as reverb can be applied after the sound has been recorded, so it is always best to record the sound object in a non-reverberant space, if possible. Studio recording environments also help reduce the probability of external sounds being captured in the recording (Tsingos, Roginska and Geluso, 2018, chap. Object-Based Audio).


This technique is similar to traditional Foley recording, in which sounds presented on screen are recorded in a purpose-built studio environment for sound effects.


Foley Recordist (‘The Role of a Foley Artist’, 2016)

Reflective Summary

What?

This project has looked further into Object-Based Audio following a brief introduction on the topic in the Binaural Audio blog.


So What?

As discussed in my first blog, the aim of this project is to create a virtual ORTF Stereo Microphone renderer for 3D audio. The role of the renderer will be to spatialise sound objects by assigning ITD and ILD cues to the object, based on its virtual position. The position information will be provided by the game engine for each individual sound object.
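The ITD and ILD cues the renderer will assign can be sketched from the ORTF geometry itself: two cardioid capsules spaced 17 cm apart and angled ±55° from centre. The far-field formulas below are simplifications I am assuming for illustration, not the project's final implementation:

```python
import math

# ORTF geometry (standard figures); the cue formulas assume a distant source.
SPACING_M = 0.17          # capsule spacing in metres
CAPSULE_ANGLE = 55.0      # degrees off-centre per capsule
SPEED_OF_SOUND = 343.0    # m/s

def cardioid_gain(source_az_deg: float, capsule_az_deg: float) -> float:
    """First-order cardioid polar pattern: 0.5 * (1 + cos(angle off-axis))."""
    off_axis = math.radians(source_az_deg - capsule_az_deg)
    return 0.5 * (1.0 + math.cos(off_axis))

def ortf_cues(source_az_deg: float) -> tuple[float, float, float]:
    """Return (itd_seconds, left_gain, right_gain) for a far-field source.

    ITD comes from the path-length difference across the capsule spacing;
    ILD comes from each capsule's cardioid directivity.
    """
    itd = SPACING_M * math.sin(math.radians(source_az_deg)) / SPEED_OF_SOUND
    left = cardioid_gain(source_az_deg, -CAPSULE_ANGLE)
    right = cardioid_gain(source_az_deg, +CAPSULE_ANGLE)
    return itd, left, right

itd, left, right = ortf_cues(0.0)   # source straight ahead: no ITD, equal gains
```

In the renderer, the ITD would become a fractional-sample delay between the two output channels and the gains would scale each channel, per sound object, per frame.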


The main assumption is that this will provide an effective alternative method of rendering binaural audio over headphones compared with generic HRTFs, which are known to cause perceptual issues if not well suited to the listener.


What Next?

Future research will explore mixing in Object-Based Audio, specifically wide sounds such as ambiences, and working with spatially encoded recordings such as Ambisonics and stereo recordings.


I will also need to look further into software design for object-based audio in interactive media, specifically how the positional metadata of sound objects is sent to the audio engine.


Conclusion

Object-Based Audio is a technique for processing spatial audio. Each sound is treated as an individual ‘object’ that can be positioned anywhere in virtual space. The position data is then fed to a renderer that determines which speakers play the sound object, creating realistic localisation in any speaker arrangement.


My goal is that by the end of the project I will have produced a plugin, compatible with game engines and audio middleware such as Unity and Wwise, that can render effective spatial audio for headphones using an object-based audio approach.


References

astrogaming (2021) What is Multichannel Audio vs. Object Based Audio? | ASTRO Gaming. Available at: https://www.youtube.com/watch?v=JUkX2OGHrag (Accessed: 10 May 2022).


Audiokinetic (2021) Wwise Worldwide Online Expo 2021 - Day 1. Available at: https://www.youtube.com/watch?v=qHcVkJekfW0 (Accessed: 10 May 2022).


Herre, J. et al. (2012) ‘MPEG Spatial Audio Object Coding—The ISO/MPEG Standard for Efficient Coding of Interactive Audio Scenes’, J. Audio Eng. Soc., 60(9), p. 19.

Roginska, A. and Geluso, P. (2018) Immersive sound: the art and science of binaural and multi-channel audio. New York ; London: Routledge, Taylor & Francis Group.


Rumsey, F. (2001) Spatial audio / Francis Rumsey. Oxford: Focal Press (Music technology series). Available at: http://www.myilibrary.com/browse/open.asp?id=101234&entityid=https://shib.york.ac.uk/shibboleth (Accessed: 12 May 2022).

S360 Surround Panner & Imager | Waves (2022) waves.com. Available at: https://www.waves.com/plugins/s360-surround-imager-panner (Accessed: 10 May 2022).

‘The Role of a Foley Artist’ (2016) BridgmansBlog, 15 January. Available at: https://bridgmansblog.wordpress.com/2016/01/15/the-role-of-a-foley-artist/ (Accessed: 10 May 2022).
