
ECLIPSE  

 RESEARCH AND DEVELOPMENT PROJECT IN INTERACTIVE SPATIAL AUDIO

PROJECT INTRODUCTION

Eclipse is a first-person shooter with zombie-themed gameplay. The project explores the extent to which current spatial audio practices in interactive media can enhance the player experience.

Following recent developments in binaural ambisonics, and further research in immersive technologies, real-time object-based ambisonics and virtual acoustics informed by game objects are now computationally feasible in a commercial application, as well as relatively simple to implement thanks to the recent release of specialist plug-ins and game packages. There are many benefits to implementing spatial audio in games, including competitive advantages from improved sound localisation, heightened immersion as a result of acoustically adaptive environments, and outcomes beyond the scope of this project, such as personalised listening experiences using custom HRTFs.

The aim of this project is to explore the potential of adaptive spatial audio to inform gameplay and enhance immersion, and to discuss some of the limitations and issues that arise. The outcomes of this project are documented in a series of development videos, as well as a downloadable version of the game, accessible above.

ECLIPSE | SPATIAL AUDIO


SPATIAL AUDIO

WHAT IS SPATIAL AUDIO?

Spatial audio in interactive media is the process of modelling acoustic and psychoacoustic phenomena in order to virtualise the spatial context of audio sources in a synthesised environment. The physiology of human hearing enables humans to localise sounds spherically to a high level of accuracy (Moore, 1991), as well as to estimate the distance of audio sources when there is environmental acoustic context (Lu et al., 2007). In order to integrate spatial audio into my game, several different acoustic phenomena were modelled individually.

 

LOCALISATION 

Localisation of sound sources describes how we estimate the direction of a sound, which in human physiology is known as binaural hearing. Binaural hearing uses the placement and shape of both of our ears to estimate the direction of a sound, by determining the time difference (Interaural Time Difference, ITD), relative intensity (Interaural Level Difference, ILD), and spectral filtering of the sound between each ear (Moore, 1991). Binaural ambisonics is a technology developed and researched at the University of York that enables the localisation of sources to be virtualised in up to 3rd-order ambisonics, before all sources are rendered through a symmetrical HRTF (Head-Related Transfer Function) calculation in real time (SADIE, 2017). Google has developed this technology further into an SDK called ‘Resonance Audio’, which can be used across different interactive media platforms to render binaural ambisonics of sound sources in real time. This SDK has been implemented into my Wwise project, so that all 3D sounds can be spatialised in gameplay using binaural ambisonics. This enables the player to accurately determine the vertical and horizontal direction of sounds, and thus to locate game items and enemies more accurately, with the intention of assisting gameplay and enhancing immersion.
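To give a feel for the ITD cue described above, the following sketch computes it with the classic Woodworth spherical-head approximation. This is purely illustrative: the head radius is an assumed average value, and it is not how Resonance Audio performs its HRTF rendering.

```python
import math

HEAD_RADIUS = 0.0875     # assumed average head radius in metres
SPEED_OF_SOUND = 343.0   # m/s at room temperature

def woodworth_itd(azimuth_rad: float) -> float:
    """Interaural time difference (seconds) for a rigid spherical head.

    azimuth_rad: source angle from straight ahead, in radians (0..pi/2).
    The wave reaches the near ear directly but must travel around the
    head to the far ear, giving ITD = (r/c) * (sin(theta) + theta).
    """
    theta = abs(azimuth_rad)
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (math.sin(theta) + theta)
```

For a source directly ahead the ITD is zero; fully to one side it reaches roughly 0.65 ms, which is the order of magnitude the auditory system resolves.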

 

Distance perception is more difficult to model, as a mixture of environmental and physiological cues influences our ability to estimate distance. Loudness and spectral content are referred to as relative cues: the inverse square law and spectral absorption during sound propagation can be used to determine the absolute distance of a sound source if that source is familiar to the listener; however, relative cues are not reliable if the sound source is unfamiliar (Lu et al., 2007). For instance, you may be able to determine the distance of a police siren from its loudness, but prior context and association will influence your perception. In Wwise, the attenuation curves attached to each 3D sound can be adjusted to enhance or obscure intensity changes over distance. Relative cues are therefore usually inaccurate in games, as they rarely abide by the inverse square law; however, attenuation curves can be exaggerated to create a sense of hyper-reality.
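The relationship between the inverse square law and an exaggerated attenuation curve can be sketched as follows. This is a toy model, not Wwise's attenuation curve editor; the `rolloff` parameter is a hypothetical knob standing in for the hand-drawn curves described above.

```python
import math

def attenuation_db(distance: float, ref_distance: float = 1.0,
                   rolloff: float = 1.0) -> float:
    """Gain in dB relative to the level at ref_distance.

    rolloff = 1.0 reproduces the inverse square law for intensity
    (-6 dB per doubling of distance); values above 1 exaggerate the
    curve for a 'hyper-real' sense of distance, values below 1 flatten it.
    """
    d = max(distance, ref_distance)  # clamp inside the reference radius
    return -20.0 * rolloff * math.log10(d / ref_distance)
```

With `rolloff=2.0`, doubling the distance drops the level by about 12 dB instead of 6, making far sources feel further away than physics alone would suggest.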

 

Binaural hearing can provide accurate absolute distance information for nearfield sources, as the ITD and ILD have a larger magnitude; however, this is only effective up to around 1 m. The second absolute cue is the Direct-to-Reverb Ratio (DRR): the ratio between the intensity of the direct sound and that of its reverb. For closer sounds, the direct intensity is greater than the reverberant intensity, so this relationship can be used to better determine the distance of farfield sources (Lu et al., 2007). To model this in my game, I used the ‘Reflect’ plug-in to generate early reflections based on mesh geometry and acoustic textures. Late reflections were modelled using the Wwise ‘RoomVerb’ plug-in, attached to game-defined room auxiliary busses assigned to rooms using the ‘AkRoom’ component. Diffraction and occlusion were also modelled, so that sound propagation was affected by large game objects such as crates, containers, and fences. To achieve this, I used the Wwise Spatial Audio component ‘AkSurfaceReflector’ on specific game objects, which enabled me to set occlusion values and diffract sound paths around meshes.
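The DRR cue can be illustrated with a minimal diffuse-field model: direct energy falls with the square of distance, while reverberant energy is roughly constant throughout a room. This is a sketch of the acoustic principle only, not of how Reflect and RoomVerb compute their output; the `critical_distance` (where direct and reverberant energy are equal) is an assumed room property.

```python
import math

def direct_to_reverb_db(distance: float, critical_distance: float) -> float:
    """Direct-to-Reverb Ratio (dB) for a simple diffuse-field model.

    At the critical distance the two energies are equal (DRR = 0 dB);
    beyond it the DRR falls by ~6 dB per doubling of distance, which is
    the cue listeners use to judge farfield distance.
    """
    direct = 1.0 / (distance ** 2)            # inverse-square direct energy
    reverb = 1.0 / (critical_distance ** 2)   # constant diffuse-field energy
    return 10.0 * math.log10(direct / reverb)
```

A source at twice the critical distance therefore sounds noticeably more "reverberant" than one at the critical distance, which the listener reads as extra distance.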


ECLIPSE | ADAPTIVE SOUND DESIGN

LIMITATIONS

A limitation of binaural ambisonics is that location ambiguities caused by the ‘Cone of Confusion’ can cause the perceived direction of a sound to be mirrored across the interaural axis from its true direction. This problem has been addressed in research from the SADIE project (SADIE, 2017), which provides two main solutions. The first is to match your ear shape with the bespoke HRTF profile of someone with a similar ear shape, as matched pinnae profiles strengthen the spectral cues that reduce this problem (Musicant and Butler, 1984). This method is not ideal, as finding a close enough match can be time-consuming and may still be a poor fit. Sony, however, appears to be pioneering this method, notably with the PS5 ‘Tempest System’, which allows players to choose one of five HRTF profiles when starting the game (WHAT HI-FI, 2020). HRTFs are traditionally created using a 3D scan of a person’s head, which is very costly and largely inaccessible; however, Sony has also integrated a feature into its Headphones Connect app that allows users to scan their ear shape using their smartphone in order to create a personal HRTF for their headphones. Although these scans may not be particularly accurate, the LiDAR sensors slowly becoming commonplace in smartphones could make this method more effective, and perhaps we will eventually see this integration in the PS5 ‘Tempest System’. The second solution to the cone of confusion is head-tracking, which allows the player to move their head in real time to resolve the exact direction of a sound. Humans and animals use subtle head tilts to disambiguate sounds; however, in traditional first-person games, the player's head can only be rotated about the X and Y axes. Using GyrOSC, I was able to implement a very cheap but crude head-tracking solution in my game, by sending gyroscopic OSC messages from my phone to my character controller.
VR games, such as flight simulators, are large adopters of head-tracking to increase visual immersion; however, head-tracking for localisation purposes in stealth and shooter games also has significant competitive potential.
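The GyrOSC approach above can be sketched in Python using the third-party python-osc package. This is a hedged reconstruction, not the project's actual controller code: the OSC address `/gyrosc/gyro`, the pitch/roll/yaw argument order, and the clamp limits are assumptions to be checked against the app's settings.

```python
import math

def gyro_to_camera_offset(pitch: float, roll: float, yaw: float):
    """Map device attitude (radians) to camera yaw/pitch offsets in degrees,
    clamped to a plausible range of head movement."""
    def clamp(value, limit):
        return max(-limit, min(limit, value))
    cam_yaw = clamp(math.degrees(yaw), 80.0)      # left/right head turn
    cam_pitch = clamp(math.degrees(pitch), 60.0)  # up/down head tilt
    return cam_yaw, cam_pitch

def on_gyro(address, pitch, roll, yaw):
    yaw_deg, pitch_deg = gyro_to_camera_offset(pitch, roll, yaw)
    # ...forward yaw_deg / pitch_deg to the character controller here...

# To actually listen for GyrOSC messages (requires `pip install python-osc`):
#   from pythonosc.dispatcher import Dispatcher
#   from pythonosc.osc_server import BlockingOSCUDPServer
#   d = Dispatcher()
#   d.map("/gyrosc/gyro", on_gyro)  # assumed GyrOSC address
#   BlockingOSCUDPServer(("0.0.0.0", 9000), d).serve_forever()
```

Decoupling the angle mapping from the network layer keeps the "crude" part (clamping, degree conversion) easy to tune independently of the OSC plumbing.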

 

The benefit of being able to spatialise a game environment through a basic pair of headphones is, unfortunately, also the reason that binaural audio has not yet been widely adopted by game manufacturers: forcing players to wear headphones is not accessible, and is likely to alienate a large proportion of players. A similar argument applies to larger audio systems, such as surround sound, which provides spatial audio through multi-transducer setups, but usually at a large cost and form factor. Sound departments have traditionally had to choose which formats their game would support, as the final output was fixed to a defined channel configuration; multiple output formats therefore required multiple busses within the audio middleware. This is changing, however, as Wwise 2021 has introduced a new feature called ‘Audio Devices’. This feature introduces a new bus configuration called ‘Audio Object’: an object-based audio pipeline that enables 3D sound emitters to preserve their positional metadata. Because the object-based pipeline uses virtual ambisonics, the output can be formatted to almost any speaker configuration a system supports (Audiokinetic, 2021). Sound departments therefore no longer have to design audio busses specifically for each format, and players are granted more freedom to adopt an output format that suits them.
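The core idea behind this format-agnostic pipeline can be illustrated with first-order ambisonics: a mono emitter is encoded with its direction preserved, and only the final decode depends on the speaker layout. This is a toy sampling decoder, not Wwise's Audio Object implementation, and the ACN/SN3D convention and 0.5 decode gain are illustrative choices.

```python
import math

def encode_foa(sample: float, azimuth: float, elevation: float):
    """Encode a mono sample into first-order ambisonics (ACN order, SN3D).
    Angles in radians; azimuth measured anticlockwise from straight ahead."""
    w = sample
    y = sample * math.sin(azimuth) * math.cos(elevation)
    z = sample * math.sin(elevation)
    x = sample * math.cos(azimuth) * math.cos(elevation)
    return (w, y, z, x)

def decode_foa(bformat, speakers):
    """Naive 'sampling' decode of one B-format frame to any speaker layout.

    speakers: list of (azimuth, elevation) per loudspeaker. Each speaker
    receives the soundfield sampled in its own direction, so the same
    encoded scene decodes to stereo, 5.1, or any other layout.
    """
    w, y, z, x = bformat
    out = []
    for az, el in speakers:
        out.append(0.5 * (w
                          + x * math.cos(az) * math.cos(el)
                          + y * math.sin(az) * math.cos(el)
                          + z * math.sin(el)))
    return out
```

A source encoded straight ahead decodes at full level to a front speaker and at zero to a rear one, without the encoder ever knowing which layout the player chose.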


ECLIPSE | INTERACTIVE MUSIC

ECLIPSE | GAMEPLAY


BIBLIOGRAPHY

 

Audiokinetic (2021) ‘Wwise Worldwide Expo 2021 - Day 1’ Available at: https://www.youtube.com/watch?v=qHcVkJekfW0 (Accessed: 14/04/2021)

 

Audiokinetic (2016) ‘Wwise Spatial Audio, Ambisonics in Wwise: Overview.’ Available at: https://www.audiokinetic.com/products/ambisonics-in-wwise/ (Accessed: 20/11/2020)

 

Audiokinetic (2020) ‘Wwise Spatial Audio’ Available at: https://www.audiokinetic.com/products/wwise-spatial-audio/ (Accessed: 20/11/2020)

 

Audiokinetic (2020) ‘Wwise 2021.1 What's New | Beta Edition’ Available at: https://blog.audiokinetic.com/wwise2021.1-beta-whats-new/ (Accessed: 20/01/2021)

 

Moore, D.R. (1991) ‘Anatomy and Physiology of Binaural Hearing’, Audiology, 30(3), pp. 125-134. DOI: 10.3109/00206099109072878 (Accessed: 20/05/2021)

 

Jin, C.T., et al. (2000) ‘Spectral cues in human sound localization’, in Advances in Neural Information Processing Systems, pp. 768-774. (Accessed: 19/05/2021)

 

Musicant, A.D. and Butler, R.A. (1984) ‘The influence of pinnae-based spectral cues on sound localization’, The Journal of the Acoustical Society of America, 75(4), pp. 1195-12 (Accessed: 18/05/2021)

 

Resonance Audio (n.d.) ‘Resonance Audio’ Available at: https://resonance-audio.github.io/resonance-audio/ (Accessed: 20/10/2020)

 

SADIE (2017) ‘Spatial Audio for Domestic Interactive Entertainment’, University of York Available at: https://www.york.ac.uk/sadie-project/ (Accessed: 15/01/2021)

 

 

Sony (2021) ‘How to analyze your ear shape (360 Reality Audio)’, Available at: https://www.sony.com/electronics/support/articles/00233341 (Accessed: 20/05/2021)

 

 

WHAT HI-FI (2020) ‘PS5 3D Audio: What is it? How do you get it?’ Available at: https://www.whathifi.com/features/ps5-3d-audio-what-is-it-how-do-you-get-it (Accessed: 03/02/2021)
