
Can an Optimized MidSide Technique Improve Perceived Envelopment in Game Audio


Academic year: 2022


Improve Perceived Envelopment in Game Audio

Oskar Hansson

Audio Technology, bachelor's level 2018

Luleå University of Technology

Department of Arts, Communication and Education


Abstract

Mid/side processing techniques are commonly used in the music recording industry to widen the stereo image and create a more enveloping listening experience. Since the gaming industry now needs better audio solutions to stay on par with recent advances in visual technology, these mid/side techniques could potentially be a useful tool for sound designers. In this study, an experiment was conducted in which 16 participants were asked to play 4 scenarios with different audio settings meant to enhance envelopment in different ways. After each scenario the participants rated their preference and perceived envelopment, and after all 4 scenarios were completed they answered a short survey. The quantitative data showed very little evidence that the mid/side processing was perceived as either more enveloping or more preferred than the other versions, except in a group of gamers who played games less than 6 hours per week. The qualitative data, on the other hand, hinted that the mid/side version had envelopment as its defining attribute, and that it made the sound design more exciting and some sounds more powerful. The main problem with the mid/side technique seems to be that it has to exclude in-game spatialization for the widened stereo image to be perceived as enveloping. However, if it is applied to sounds that do not need to be spatialized, it might be able to improve the perceived envelopment of those sounds.


Table of contents

Abstract
Introduction
Environmental Envelopment and Presence
Environment Width
Increasing Width
Mid/side Processing
Implementing the technique.
Resonating Space
Method
Game Level
Audio
Sounds
Pre-study
Interactive test
Equipment
Results & Analysis
Pre-study
Interactive test results: Preference
Interactive test results: Envelopment
Qualitative results and analysis
Discussion
Conclusion
Future research
References
Appendix


Introduction

The gaming industry constantly seeks to develop its technology and techniques to stay ahead of the competition and provide new and exciting experiences. The visual medium keeps improving with better lighting engines, higher-quality textures and so on. Creating better audio solutions is necessary for sound to stay on par with the advances in graphical technology. Game sound in itself cannot be described as immersive without the graphics and the interactivity of the gaming medium, but improving audio, especially in terms of spatialization as suggested by Huiberts (2010), can and most likely will help create an immersive experience. Some of the most recent advancements in game audio that strive to increase immersion are spatial audio techniques that, in different ways, try to simulate how we hear sound in the real world and apply those algorithms to the sounds in the game. One example is processing based on head-related transfer functions (HRTF). These HRTF algorithms often need to be computed in real time, and it is therefore quite resource-demanding to apply them to every sound in the game. They also need to be individualized for the player to work at their full potential. A less demanding technique could potentially be used on the sounds that might not need to be as accurately spatialized as other sounds in the game. Depending on how good the technique is, it might even be able to replace the other techniques to some extent.


Because of the limitations in current audio technology, finding other potential techniques for improving audio for gaming is highly relevant. According to Floros and Tatlas (2011):

[…] spatial enhancement of existing stereo audio material is now very demanding in order to provide a more natural listening environment in modern 3D audio/visual applications, and can be achieved in the context of broadening the perceived width and depth of the reproduced sound field, ideally to dimensions beyond the physical locations of the loudspeakers or the headphones. (p. 1)

Surround headphones, spatializing plug-ins and hardware do exist but are quite expensive and might be considered a gimmick to the general consumer. A study by Eriksson (2017, p.9) describes some of the issues with spatializing products as a lack of system awareness, meaning they are not aware of what other hardware or software are connected and affecting the audio.

Because of this issue with third-party products, it might be better to find a way to manipulate a stereo signal in the game engine to improve the envelopment of the player without the aid of third-party products, and thus be independent of whatever software or hardware the players are using. This study focuses on stereo headphones because of their simplicity and their prevalence in the audio and gaming industry.

The aim of this study is to use an optimized mid/side technique and find out how well the technique can improve perceived envelopment in game audio when compared to a commercially available spatializing technique and to no spatializing techniques at all. The commercially available spatializing technique that will be used in this study is the HRTF spatialization within the Steam Audio plugin beta 10 for Unreal Engine 4.19. Optimizing the mid/side technique is necessary since increasing the side signal to the max will most likely create too much out-of-phase information for the sounds to work as intended. The next section will explore what envelopment means for this study and how it ties in to using mid/side processing as a technique for improving envelopment.

Environmental Envelopment and Presence

Environmental envelopment is defined by Rumsey (2002) as the “Sense of being enveloped by reverberant or environmental (background stream) sound”, with background-stream sounds being defined as sound energy coming from many directions (p. 663). This might be sounds that define the space enveloping the player in the game. Common examples are the sound of wind, waves or rain. Rumsey also defines another immersion attribute, presence, as the “Sense of being inside an (enclosed) space or scene”. Enclosed is parenthesized since it does not necessarily rule out the sensation of presence being experienced in an outdoor environment (p. 663). Rumsey then goes on to explain that these attributes may be closely related and states that “Once a subject feels to be inside the space, they are able to judge concepts such as environment width and depth” (p. 663), which brings us to our next attribute.

Environment Width

Rumsey (2002) defines environment width as “Broadness of (reflective) environment within which individual sources are located” (p. 660). For this study, the reflective environment can be defined as the ambient sounds which define the space enveloping the player. The individual sources can then be defined as player-originated sounds or other diegetic sounds in the game world that are not ambient sounds. This attribute is relevant since it goes hand in hand with what both Floros and Tatlas (2011) (quoted in the introduction) and Rumsey (2002) say about envelopment. Rumsey (2002) says that a “Source perceived as very wide, wrapped around listener, and diffuse may be considered enveloping” (p. 660). It can therefore be argued that an increase in the width of sounds can increase environmental envelopment in games. This is backed up by Floros and Tatlas’ (2011) statement that “audio reproduction through headphones may significantly benefit from the targeted sound stage widening, allowing an improvement of the overall user immersion in the reproduced stereo audio environment.” (p. 6). In a study by Cobos and Lopez (2010), an informal evaluation was carried out among 7 expert listeners, which showed that the perceived spatial quality was better when utilizing a technique that widened the stereo image without impacting the original panning of the individual sources. This leads us to the next section, which explores how we might widen the stereo image.

Increasing Width

There are a number of different tools at an audio engineer’s or sound designer’s disposal that can be used to increase the width of a stereo signal, with reverb, chorus and panning perhaps being the most common ones. However, reverb needs to be used differently for every space inside the game so that the sounds are neither too big for a small room nor too small for a big room. Chorus might work better than reverb since it doesn’t change the spaciousness of the sounds, but it can also be a bit tricky to use since it can create unwanted artefacts if not used correctly. Panning is perhaps the best method for increasing width since it doesn’t add any artefacts or change the space and timbre in any considerable way. However, one can only get so far with panning. Once the sound has been panned fully to both left and right, there isn’t anything left to do to make the sound wider. One technique that can increase width reliably, without changing the space and timbre or adding substantial artefacts to the same extent as chorus and reverb, is mid/side processing.

Mid/side Processing

Using mid/side as a method of manipulating stereo sound is a classic technique in the music recording business according to Schneider (2012, p. 1). It is commonly used in mastering to gain more control over a stereo mix. There are a lot of stereo imagers and width-controlling plug-ins and hardware out there, but Katz (2015) argues that “[…] all stereo width controls are MS processors in disguise.” (p. 135). Although there are quite a lot of tools that do the encoding and decoding of mid/side for you, doing it manually is not that difficult. The mid signal is made from the left channel plus the right channel, i.e. the sum of both channels, and the side signal is made from the left channel minus the right channel, i.e. the difference between both channels. The audio is now split into the mono information (mid) and the stereo information (side), which can be processed separately. To decode back to stereo, the left channel is obtained by taking the mid signal plus the side signal and the right channel by taking the mid signal minus the side signal. In practice a scaling factor of one half is usually applied at either the encoding or the decoding stage so that an unprocessed round trip returns the original levels.

Schneider (2012) explains that “In the LR format, the width can only be reduced with the pan- pots, from fully-panned to half-panned, or even mono. But, transferring to the MS format, the stereo stage can also be widened, by increasing S over M.” (p. 3). In other words, changing the ratio between the mid and side signal will change the width of the stereo signal and, as was suggested earlier, an increase in width can increase the environmental envelopment. Schneider (2012) also says that “An increased sensation of spaciousness or envelopment may arise”, which further suggests that mid/side processing might be a suitable technique for improving the environmental envelopment of sounds in games. However, Schneider also mentions that raising the side signal too far relative to the mid signal might create a “hole in the middle” effect. Determining whether this effect is useful or not is outside the scope of this study.
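The encoding, side-level change and decoding described above can be sketched in a few lines of Python. This is a minimal illustration rather than the processing chain used in the study: the function names, the one-half normalisation and the use of plain lists of samples are assumptions made for the example.

```python
def db_to_linear(gain_db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def ms_encode(left, right):
    """Split a stereo pair into mid (sum) and side (difference) signals.

    The factor 0.5 is a normalisation choice (an assumption of this sketch)
    so that decoding an unprocessed signal returns the original samples.
    """
    mid = [0.5 * (l + r) for l, r in zip(left, right)]
    side = [0.5 * (l - r) for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Rebuild left = mid + side and right = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def widen(left, right, side_gain_db):
    """Raise the side level by side_gain_db, leaving the mid signal untouched."""
    mid, side = ms_encode(left, right)
    gain = db_to_linear(side_gain_db)
    side = [gain * s for s in side]
    return ms_decode(mid, side)
```

With a side gain of 0 dB the signal passes through unchanged, and a signal whose left and right channels are identical has no side content, so it is unaffected by any side gain.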


Although the mid signal is essentially the mono information of the audio, it does not have to stay in mono. There are multiple ways for mono to be converted into stereo. Maher, Lindemann and Barish (1996) say that “Examples include a delay or phase shift between the channels, a stereo reverberator or decorrelator, a pair of complementary comb filters, and various combinations of techniques.” (p. 2). As previously mentioned, a reverb might not be the best option for our purpose because of its space- and timbre-altering attributes, but this time a chorus or a delay might work since the side signal might potentially mask the artefacts from the effects.

There are still some issues that might arise from this mono-to-stereo conversion when using said effects. Firstly, the “hole in the middle” effect will most likely become even more noticeable. Secondly, the mono compatibility will be impaired since we are increasing the amount of out-of-phase information that will be cancelled out when summing the information to mono. This is mostly a problem for mobile and handheld console games, since most gamers use some kind of setup with 2 or more channels when playing regular console or computer games. Lastly, stereo sounds with transient information, like random bird tweets or crackling branches, might not work well with this kind of mono-to-stereo conversion. The reasoning behind this is that multiple instances of this transient information will appear in both channels, which most likely will sound confusing, jarring and/or strange to the player. To avoid this, these transient sounds would have to be separated from the rest of the stereo track, or perhaps a more advanced technique would have to be developed.
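The mono-compatibility point can be made explicit with a short derivation. The symbols are assumptions introduced for this illustration: $M = (L+R)/2$ and $S = (L-R)/2$ are the normalised mid and side signals, and $g$ is a linear side gain.

```latex
\begin{aligned}
L' &= M + gS, \qquad R' = M - gS, \\
L' + R' &= 2M = L + R .
\end{aligned}
```

The side term enters the two output channels with opposite polarity, so it drops out of the mono sum whatever the value of $g$. Any widening material that is added in this antiphase form is therefore inaudible on a mono playback system, which is the loss of mono compatibility described above.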


Implementing the technique.

The mid/side processing could be applied during the development of the game and/or at the end of the development cycle. The most straightforward way to implement the technique would be to apply the processing to the sounds before they are imported into the game engine. This would essentially mean that the sound would be treated just like any other sound in the game engine, so the processing would not require any more CPU than any other sound. However, if one would like to change the processing in any way, going back and adjusting the sound outside of the game engine and importing it again would be the only way to accomplish that.

Another way to apply the processing would be within the game engine itself. Modifying the game engine to apply the mid/side processing in real time would be quite simple, since the encoding and decoding of the audio is only a matter of simple mathematics. This way of implementing the technique would be useful if one wanted to change the mid/side processing after the sound has been imported. One audio sample could also be used with different processing for different scenarios, which saves disk space and frees up RAM. Although the encoding and decoding is easy to compute, the processing itself might not be. If one wanted to use something like a chorus effect to further widen the mid signal, a relatively large amount of CPU power would have to be used.

The final way to implement the mid/side processing could be in the form of external software or simply a general processing applied to the entire game audio. This would have to be a simpler processing in order to work for every sound without impairing the original sound design. The obvious downside here is that the technique wouldn’t be utilized to its full potential for every sound. Furthermore, it would be applied to every sound in the game, which could come with negative implications like making some sounds too wide. Although no method of implementing the technique is without its flaws, it is nothing that sound designers are not already dealing with in their day-to-day work. One could argue that these flaws are small enough to be considered virtually non-existent when comparing the mid/side techniques to the current 3D audio and spatialization techniques. Regardless of which method of implementation is chosen, it is very important to think about which sounds can benefit from this mid/side technique. Therefore, the next section explores which sounds within a game soundtrack might work best for improving the perceived envelopment in a game.

Resonating Space

To understand which sounds might work well for improving envelopment in games, one must first understand what type of sounds define the resonating space around the player. Grimshaw and Schott (2007) have created a conceptual framework based on a first-person shooter, but the majority of the framework could most likely be used in more genres with some modifications. In this framework it is proposed to use the term resonating space as a collective term for all space-defining sounds in a game. Within this resonating space are choraplasts, topoplasts, chronoplasts and aionoplasts. Choraplasts are sounds that define the resonating space through depth and localization. An example is a spot sound like a burning torch on a wall. Topoplasts are sounds that create the perception of a general or specific location within the resonating space. A topoplast can also be a choraplast, but it additionally describes the space the player is located in, commonly by having a reverb that fits the space. Chronoplasts are sounds that provide the perception of temporal movement, i.e. sounds that describe time. An example is crickets, letting us know that the level is set at night. Lastly, aionoplasts are sounds that define the time period the game is set in. If the game is set in the future, these sounds may be something like spaceships and laser beams.

Grimshaw and Schott (2007) explain that “Any FPS acoustic ecology must make use of choraplasts and topoplasts and, to a lesser extent, aionoplasts as the foundations of the spatial and temporal elements of the acoustic environment that is a component of that ecology.” (p. 477). They continue: “In many cases, these are the keynote sounds, described […] as ubiquitous and pervasive background sounds and, therefore, environment sounds.” (p. 477). In other words, these sounds are mainly ambient sounds and have the important function of defining the virtual acoustic space into which the player is projected. Grimshaw and Schott go on to explain that these sounds are not meant to be consciously listened to and analysed, and that they mostly are not once other sounds need attention from the player. Instead they act as sounds that set the scene and space of each location in the game. Furthermore, these sounds are seldom triggered by the player and are as such independent of player actions. This means that a sound designer will have the ability to fully control and manipulate the sounds, which in turn means they can manipulate the time and space of each location without having to take player interaction into account.

According to Collins (2013), self-produced sounds like footsteps are a type of sound that is able to enhance a player’s perceived presence in a game and, as previously mentioned, presence is essential for the listener to be able to judge attributes such as environmental width and depth. Therefore, a game might benefit greatly from having self-produced sounds for the sound design to be perceived as enveloping. Additionally, if the self-produced sounds themselves are perceived as enveloping, then that might enhance the rest of the sound design and help glue it together. Although they might not be the perfect tool for a sound designer to reliably create an enveloping sound design, they are arguably essential for the player to feel present in the game world.


Method

For this study, an interactive listening test was conducted with 16 test subjects in the form of a game level with four different versions of the processing of the sounds placed in the game level. The game level was created with a goal for the test subjects to complete. A plethora of sounds were then created to be placed around the level. These sounds had four different versions, one of which was the mid/side processed version. This mid/side version was optimized with the help of a pre-study before the final experiment was conducted.

Game Level

The game level was made in Unreal Engine 4.19 with Steam Audio as the audio engine. A lot of effort was put into making the level look as good as possible so that the visuals did not ruin the test subjects’ experience.

Figure 1. Example image from the game with torches to the left and volcano in the background.

The level was constructed by the experimenter with assets made by Epic Games. It had a path that led through some hills (figure 1 and 2) and ended by the foot of a volcano (figure 3). The path was surrounded by trees, boulders, torches and various foliage. Invisible walls were placed around the level so that the test subjects could not stray away from the path. The goal of the game was simply to make your way to the volcano.

Figure 2. Example image from the game taken from the start of the level.

When the test subject reached the foot of the volcano the audio faded out and they were teleported back to the beginning. The level was complete with lighting and post-processing effects and was optimised to run at 100 fps. The level was also covered in a thick fog to create atmosphere and limit how much the test subjects could see, the reasoning being that the sounds would be more in focus if the test subjects couldn’t see the level that well. It was still necessary to have some sort of visual cue for the test subjects to orient themselves with.


Figure 3. Example image from the game at the end of the level.

This was accomplished by having a volcano at the end of the path which was visible from everywhere in the level. The sounds were placed around the level as follows: wind sounds in the trees, cricket sounds in the bushes, fire sounds on the torches, a volcano sound in the middle of the volcano, lava sounds on the surrounding lava pools, and footstep sounds attached to the player character.

Audio

The initial plan was to have three different versions: one with stereo, one with stereo and mid/side processing, and one with stereo and HRTF. The audio stimuli were also supposed to be just ambient sounds like wind and cricket sounds fed directly to the left and right channels.

However, Steam Audio’s HRTF spatialization currently does not support stereo audio, so the stereo sounds had to be summed to mono and placed around the level as point sources for the HRTF version. This meant that the level required sounds that weren’t just ambient sounds, since ambient sounds do not work very well in mono. This in turn meant that the other two versions had to have their sounds placed as point sources in the level. This worked quite well for the version without mid/side since Unreal Engine, as of version 4.11, supports stereo sounds for spatialization. However, there was no noticeable difference between the version with and the one without mid/side processing unless the player character was standing right on top of the placed sound. This is due to how the 3D stereo spread works in Unreal Engine, where it treats the stereo audio as two channels which it virtually places on the corresponding sides of the sound emitter. Therefore, the mid/side version had to exclude spatialization and just use basic volume attenuation for the effects to be noticeable. The volume attenuation changed the volume of the sounds depending on how far from the sound the player was located: the closer to the sound, the louder it got.
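Distance-based volume attenuation of the kind described above can be sketched as follows. This is only an illustration under stated assumptions: a linear falloff is assumed, and the function name, parameter names and numbers are hypothetical. They are not the attenuation settings used in the study (those are given in the appendices) and not an Unreal Engine API.

```python
def linear_attenuation(distance, inner_radius, falloff_distance):
    """Return a volume factor between 0.0 and 1.0 for a listener at `distance`.

    Inside `inner_radius` the sound plays at full volume. Beyond that the
    volume falls linearly and reaches silence at
    `inner_radius + falloff_distance`. All names are hypothetical.
    """
    if falloff_distance <= 0:
        raise ValueError("falloff_distance must be positive")
    if distance <= inner_radius:
        return 1.0
    if distance >= inner_radius + falloff_distance:
        return 0.0
    return 1.0 - (distance - inner_radius) / falloff_distance
```

For example, with an inner radius of 100 units and a falloff distance of 400 units, a listener 300 units from the emitter would hear the sound at half its full volume. Because the same factor is applied to both channels, the stereo width of the source material is left intact.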

Comparing the mid/side version to the spatialized stereo version, as originally intended, was no longer fair since the two were vastly different. A fourth version was therefore made which used stereo sounds but, like the mid/side version, excluded the spatialization and only used volume attenuation. It was important that the sounds and the different versions did not favour one technique over any other, which is why this method was chosen. This way there were 4 distinct versions that featured the same sounds placed in the same locations but with different characteristics.

The project’s audio settings are displayed in appendix 1 and Steam Audio’s plugin settings in appendix 2, 3 and 4. Appendix 5, 6, 7 and 8 show the different attenuation settings for the two spatialized versions (i.e. the spatialized stereo version and the HRTF version) used for the different sounds. Whether Unreal Engine uses the Steam Audio HRTF spatialization or the built-in spatialization depends on whether the sounds are in mono or stereo. If they are in mono, the Steam Audio HRTF spatialization will be used, and if they are in stereo the built-in spatialization will be used, since Steam Audio currently does not support stereo audio. Therefore, the same settings could be used for both versions.


Since the two versions without spatialization (i.e. the non-spatialized stereo version and the mid/side version) worked very differently from the other versions, they needed their own set of customized attenuation settings. Since the only difference between the two versions was that one of them had the sounds widened with mid/side processing before being imported into the game engine, they could both use the same attenuation settings. Appendix 9, 10, 11 and 12 show the attenuation settings for these versions.

Sounds

There were 11 different sound cues in total. A footstep sound cue, 4 wind sound cues, 3 cricket sound cues, a lava sound cue, a torch sound cue and a volcano sound cue. These particular sounds were chosen with the aim to have some sounds that worked well as point source sounds, some as stereo ambiences and some as both. They were also chosen to have some sounds that the listener might expect to sound big (like the volcano) and vice versa. An example of what these sound cues looked like is shown in figure 4.

Figure 4. Example of a sound cue, in this case the volcano sound cue.

The sound cues had different attenuation settings that were optimised by the experimenter to make sounds like the volcano and wind sounds be heard over greater distances than sounds like the torch and cricket sounds. The versions which could utilize Steam Audio’s occlusion did so for increased realism. The different versions were assigned to one sound class master each which could control the audio volume of the assigned version as shown in figure 5.


Figure 5. Pause menu controlling the different sound class masters.

Before the main experiment could be conducted, the mid/side ratio had to be optimised so that the processing was suitable for all sounds. Optimizing the mid/side technique was necessary since just increasing the side signal to the max would most likely create too much out-of-phase information for the sounds to work as intended.

Pre-study

To optimise the mid/side ratio, 6 test subjects were asked to adjust the mid/side ratio to the amount they deemed suitable for the sound and context. A midi keyboard was connected to a DAW containing all the sounds that were to be implemented into the game. The DAW was set up with the side signal and master volume being assigned to the same rotary knob within the DAW. That rotary knob was then assigned to a step-less modulation wheel on the midi keyboard so that it increased the side signal but decreased the master volume for the sound.

This was done to rule out loudness as a factor. The amount by which the master volume could be decreased and the side signal increased was optimised for each sound by the experimenter so that there was no noticeable volume difference between the maximum and minimum positions of the modulation wheel. The side signal gain could not be lower than 0 dB nor higher than +18 dB. Each test subject adjusted the mid/side ratio for each sound, one at a time, while looking at the game level the sounds were going to be implemented into.
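The coupling of side gain and master volume to a single controller can be sketched as below. The 0 to +18 dB side range comes from the text; everything else is an assumption made for illustration: the 7-bit controller range of 0 to 127, the linear-in-decibel mapping, and the `max_master_cut_db` value, which in the study was set by ear for each sound and is not reported here.

```python
def wheel_to_gains(wheel_value, max_side_gain_db=18.0, max_master_cut_db=6.0):
    """Map a modulation wheel position to (side_gain_db, master_gain_db).

    wheel_value       -- controller value, assumed to run from 0 to 127
    max_side_gain_db  -- side gain at the top of the wheel (+18 dB in the text)
    max_master_cut_db -- hypothetical loudness compensation at the top of the
                         wheel; a placeholder, not a value from the study
    """
    if not 0 <= wheel_value <= 127:
        raise ValueError("wheel_value must be between 0 and 127")
    position = wheel_value / 127.0
    side_gain_db = position * max_side_gain_db
    master_gain_db = -position * max_master_cut_db
    return side_gain_db, master_gain_db
```

At the bottom of the wheel both gains are 0 dB, and moving the wheel up raises the side level while lowering the overall level, so that a change in width is not simply heard as a change in loudness.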

Interactive test

16 test subjects were asked to participate in an interactive listening test. The only requirement for the test-subjects was that they had to have played video games sometime before. The task was to reach the volcano by simply walking along a path that led to the volcano as shown in figure 6.

Figure 6. Overview of the path to the volcano.

Once they reached the volcano they had to remove the headphones and rate, on two 11-point scales, how enveloping they perceived the audio to be and how much they liked it. The scale asking for envelopment ranged from “not at all enveloping” to “very enveloping” and the scale asking for preference ranged from “did not like at all” to “liked very much”. This was repeated for all 4 versions. After the 4 versions had been played, the test subjects had to answer a questionnaire with 4 questions regarding what the defining traits were in the versions they found the most/least enveloping and liked the most/least. The scales and questions can be found in appendix 13 and 14. The participants were also asked to fill in how many hours they played games on average per week.

This was done to see if there was a difference between what is commonly referred to in the gaming industry as casual gamers and hardcore gamers. For this study, casual gamers were defined as gamers who played less than 4 hours per week and hardcore gamers as gamers who played more than 4 hours per week. The order the versions were played in was decided using a 4×4 Latin square. The average session lasted 15 minutes.
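One way to build such a presentation order is sketched below. The text does not say which Latin square was used, so the cyclic construction here is an assumption, and the version labels are shorthand chosen for the example.

```python
def cyclic_latin_square(items):
    """Build an n x n Latin square by rotating the item list one step per row."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def assign_orders(num_participants, items):
    """Give each participant one row of the square, cycling through the rows."""
    square = cyclic_latin_square(items)
    return [square[p % len(square)] for p in range(num_participants)]


# Shorthand labels for the four audio versions (assumed names).
VERSIONS = ["stereo spatialized", "stereo", "mid/side", "HRTF"]

if __name__ == "__main__":
    for number, order in enumerate(assign_orders(16, VERSIONS), start=1):
        print(number, order)
```

With 16 participants and 4 rows, each order is used four times and every version appears in every playing position equally often. A cyclic square does not balance which version directly precedes which, so carry-over effects between neighbouring versions are not controlled by this particular construction.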

Equipment

The DAW used for the mid/side processing and pre-study was Reason 10, with an M-Audio Keystation as the MIDI keyboard. The monitor was a 34-inch 100 Hz ultrawide monitor from HP. The headphones were a pair of AKG K240 Studio.


Results & Analysis

Pre-study

Pre-study results

Sound           Subject                                      Standard
                1      2      3      4      5      6         deviation
Footsteps       8,5    8,4    7,2    8,8    7,8    9,6       0,8
Crickets 1      15,3   4,8    1,3    3,6    0,0    6,1       5,4
Crickets 2      18,0   4,4    0,0    0,0    2,7    3,7       6,7
Crickets 3      18,0   12,3   3,3    6,0    0,0    0,0       7,2
Wind 1          11,9   3,0    8,1    0,0    1,6    5,5       4,4
Wind 2          17,3   8,9    4,1    8,5    7,9    9,1       4,3
Wind 3          10,1   1,7    4,0    2,7    7,4    4,4       3,1
Wind 4          10,4   6,7    7,2    6,7    3,0    0,0       3,6
Torch           12,8   0,0    9,9    18,0   4,4    6,9       6,4
Lava bubbles    8,5    3,5    3,5    15,7   2,4    6,1       5,0
Volcano         17,3   8,2    0,0    0,0    6,4    6,2       6,4
Lava flames     13,5   3,7    5,1    16,4   5,4    7,8       5,1

Table 1. Side level gain in dB set by the test-subjects on the sounds.

There is a lot to observe from table 1. One thing is that the test subjects agreed more on how much processing was suitable for some sounds than for others. The most notable sound is the footsteps, where the standard deviation is as low as 0,8 dB. This might be because it is a sound that most people and gamers are quite familiar with, so they already have an understanding of what it should sound like. Other sounds, like the volcano and torch sounds, differed more between the test subjects. This could be due to them not knowing what physical object the sounds were going to be tied to, and thus having different opinions of what it should sound like. It might also be that the mid/side processing is perceived differently from person to person, meaning that an increased side to mid ratio could sound great to someone and bad to someone else. Perhaps the reason why the cricket sounds had such a high standard deviation is that they had quite a lot of side information to begin with, which might have made some test subjects feel that increasing the side to mid ratio worsened the sounds. Lastly, the test subjects’ preferences for how much mid/side processing is suitable differ quite a bit. Test subject no. 1 clearly thinks that a high side to mid ratio is suitable for most of the sounds, whereas test subjects no. 2, 3 and 5 mostly keep to the lower values. In other words, the suitable mid/side ratio is highly subjective, with the footstep sounds being the exception.


The pre-study resulted in the following increases in side level gain. It is important to note that higher values do not mean that one sound with a high value is perceived as more enveloping than a sound with a lower value. The different increases in side level gain can be due to some sounds having less side information, thus needing a larger increase in gain for there to be a noticeable difference in perceived envelopment.

Side level results

Sound           Side level gain (dB)
Footsteps       +8,4
Crickets 1      +5,2
Crickets 2      +4,8
Crickets 3      +6,7
Wind 1          +5,0
Wind 2          +9,4
Wind 3          +5,1
Wind 4          +5,7
Torch           +8,6
Lava bubbles    +6,7
Volcano         +6,4
Lava flames     +8,6

Table 2. The resulting side level gain increase from the pre-study.
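To make concrete what a side level gain increase does to a stereo signal, here is a minimal sketch assuming the conventional sum/difference formulation of mid/side processing. The function name and the sample values are hypothetical, and the actual processing chain used in the study is not reproduced here.

```python
def apply_side_gain(left, right, side_gain_db):
    """Raise the side level of a stereo signal by side_gain_db decibels."""
    gain = 10 ** (side_gain_db / 20)  # dB to linear amplitude factor
    out_left, out_right = [], []
    for l_in, r_in in zip(left, right):
        mid = (l_in + r_in) / 2    # what both channels share
        side = (l_in - r_in) / 2   # what differs between the channels
        side *= gain               # only the side signal is boosted
        out_left.append(mid + side)
        out_right.append(mid - side)
    return out_left, out_right


# Hypothetical four-sample stereo snippet, processed with the footsteps
# value from Table 2 (+8,4 dB).
left = [0.5, 0.2, -0.1, 0.0]
right = [0.3, 0.2, 0.1, 0.0]
wide_left, wide_right = apply_side_gain(left, right, 8.4)
```

Samples that are identical in both channels have no side content and pass through unchanged, which fits the observation above that sounds with little side information need a larger gain for the difference to become noticeable.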

One thing that can be said from these two tables and from talking to the test-subjects is that most of the test-subjects preferred a higher side to mid ratio than the original for most of the sounds.

Interactive test results: Preference

Figure 7. Diagram for the stereo version with spatialization.

From figure 7 we can see that the results are quite spread out ranging from 3 to 11 with 8 having the highest frequency of 4.

Figure 8. Diagram for the stereo version without spatialization.

Figure 8 shows that the test-subjects were more in agreement with each other with the values ranging from 5 to 11 with 8 having the highest frequency of 8.

[Figures 7 and 8: bar charts titled "Stereo spatialized preference" and "Stereo preference". x-axis: Rating (1 to 11); y-axis: Frequency.]

Figure 9. Diagram for the mid/side version without spatialization.

Unlike the previous diagram, figure 9 shows a wider spread, ranging from 4 to 11. The rating 10 has the highest frequency of 4, but 4 and 6 both have a frequency of 3, meaning that even though there were many high ratings there were also many lower ratings.

Figure 10. Diagram for the HRTF version.

Figure 10 is also quite spread out ranging from 2 to 11 with 5 and 6 having the highest frequency of 3.

[Figures 9 and 10: bar charts titled "Mid/side preference" and "HRTF preference". x-axis: Rating (1 to 11); y-axis: Frequency.]

Figure 11. Boxplots for the four versions.

From figure 11 we can see that the different versions yielded quite similar results, with all versions except stereo without spatialization having an average around 7. Because they are so similar, it is difficult to say anything with certainty other than that all versions except stereo without spatialization had quite a wide spread of data. This most likely means that the test-subjects prefer different things in game sound design.
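A boxplot is built from the median and the quartiles rather than the mean, so it can help to compute both when reading a figure like this one. The sketch below does so for a set of hypothetical ratings on the 1 to 11 scale; these numbers are made up for illustration and are not the study's data.

```python
import statistics

# Hypothetical ratings on the 1 to 11 scale (NOT the study's data).
ratings = [3, 5, 6, 7, 7, 8, 8, 8, 9, 10, 11]

# Quartiles as used for the box: Q1, median and Q3.
q1, median, q3 = statistics.quantiles(ratings, n=4)
mean = statistics.mean(ratings)

print(f"min={min(ratings)} Q1={q1} median={median} Q3={q3} max={max(ratings)}")
print(f"mean={mean:.2f}, interquartile range={q3 - q1}")
```

The interquartile range is one way to put a number on the "spread" that the boxplots show visually.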

The following are the test results from three two-tailed paired sample t-tests from the preference ratings. A positive t-value greater than the t-critical means the mid/side version was more preferred than the compared version.

T-test: entire sample

Compared version    Stereo spat.    Stereo    HRTF
M/S t-value         0,399           -0,788    0,910
M/S p-value         0,696           0,443     0,377

Table 3. T-test results with a significance level of 5%, t-critical=2,131 and DoF=15.

As we can see in table 3, no significant difference was found, meaning that the mid/side version was not significantly preferred over any other version.
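For readers who want to see how a paired sample t-value of this kind is obtained, the following is a minimal sketch using only the Python standard library. The ratings are hypothetical (six made-up listeners, not the study's data), the function name is illustrative, and the critical value is the one quoted in this section for DoF=5.

```python
import math
import statistics


def paired_t(sample_a, sample_b):
    """Return (t, dof) for a paired-sample t-test of sample_a against sample_b."""
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation of the differences
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical ratings on the 1 to 11 scale for six listeners (NOT the study's data).
mid_side = [9, 10, 7, 8, 10, 6]
stereo_spat = [7, 8, 7, 5, 9, 6]

t, dof = paired_t(mid_side, stereo_spat)
T_CRITICAL_DOF_5 = 2.571  # two-tailed, 5% level, as quoted in this section for DoF=5
print(f"t = {t:.3f}, DoF = {dof}, significant = {abs(t) > T_CRITICAL_DOF_5}")
```

Because the test is paired, only the per-listener differences matter: a listener who rates everything high and one who rates everything low contribute equally as long as their differences between versions are the same.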

T-test: hardcore gamers

Compared version    Stereo spat.    Stereo    HRTF
M/S t-value         -0,608          -1,238    -0,262
M/S p-value         0,558           0,247     0,799

Table 4. T-test results with a significance level of 5%, t-critical=2,262 and DoF=9.

Table 4 shows that, yet again, no significant difference was found when looking at the hardcore gamers, meaning that the mid/side version was not significantly preferred over any other version.

T-test: casual gamers

Compared version    Stereo spat.    Stereo    HRTF
M/S t-value         2,907*          0,159     3,478*
M/S p-value         0,034           0,880     0,018

Table 5. T-test results with a significance level of 5%, t-critical=2,571 and DoF=5.

Table 5 shows significant differences between the mid/side version and both the spatialized stereo version and the HRTF version when looking at the casual gamers. Casual gamers seem to prefer the mid/side version over the spatialized versions. It is also interesting that there is no significant difference between the non-spatialized stereo version and the mid/side version. This could be because they were so similar that the test-subjects couldn’t hear a difference or, more likely, because the difference wasn’t big enough for them to rate one much higher than the other. The reason that the casual gamers preferred the mid/side version more than the hardcore gamers did is most likely that casual gamers are not used to having to listen for the location of objects in games, like the footsteps of an enemy or an incoming missile.
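The p-values in table 5 can be cross-checked from the reported t-values and degrees of freedom. The sketch below does this with the standard library only, by numerically integrating the density of Student's t-distribution. It is not the tool used in the study, and the function names are illustrative.

```python
import math


def t_pdf(x, dof):
    """Probability density of Student's t-distribution."""
    coeff = math.gamma((dof + 1) / 2) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2))
    return coeff * (1 + x * x / dof) ** (-(dof + 1) / 2)


def two_tailed_p(t, dof, steps=2000):
    """Two-tailed p-value, integrating the density from 0 to |t| with Simpson's rule."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, dof) + t_pdf(upper, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    area = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * area


# t-values and DoF as reported in Table 5 (casual gamers, preference).
for label, t in [("Stereo spat.", 2.907), ("Stereo", 0.159), ("HRTF", 3.478)]:
    print(f"{label}: p = {two_tailed_p(t, 5):.3f}")
```

A p-value below 0,05 corresponds to a t-value whose magnitude exceeds the t-critical of 2,571 for DoF=5, which is the case for the two comparisons marked with an asterisk.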

Interactive test results: Envelopment

Figure 12. Diagram for the stereo version with spatialization.

The diagram in figure 12 ranges from 1 to 11 with 8 having the highest frequency of 3. Again, a very wide spread of values.

[Figure 12: bar chart titled "Stereo spatialized envelopment". x-axis: Rating (1 to 11); y-axis: Frequency.]

Figure 13. Diagram for the stereo version without spatialization.

The diagram in figure 13 ranges from 4 to 11 with 7 and 8 having the highest frequency of 4. The spread is smaller, with 50% of the test-subjects answering 7 or 8.

Figure 14. Diagram for the mid/side version without spatialization.

The diagram in figure 14 ranges from 2 to 11 with 10 having the highest frequency of 5. Even though this version has a quite high frequency of 5 on the value 10, the values are still quite spread out.

[Figures 13 and 14: bar charts titled "Stereo envelopment" and "Mid/side envelopment". x-axis: Rating (1 to 11); y-axis: Frequency.]

Figure 15. Diagram for the HRTF version.

The diagram in figure 15 ranges from 3 to 11 with 9 having the highest frequency of 6. Again, the values are still quite spread out even though the value 9 has a frequency of 6.

[Figure 15: bar chart titled "HRTF envelopment". x-axis: Rating (1 to 11); y-axis: Frequency.]

Figure 16. Boxplots for the four versions.

From the boxplot in figure 16 we can observe that versions 1 and 3 had the widest spread of data, but all the versions were, yet again, quite similar, with the average values being around 7 or 8. Similar to the boxplot for preference, version 2 has the least spread. This could perhaps be because the test-subjects were most used to hearing sounds in that way.

The following are the test results from three two-tailed paired sample t-tests from the envelopment ratings. A positive t-value greater than the t-critical means the mid/side version was perceived as more enveloping than the compared version.

T-test: entire sample

Compared version    Stereo spat.    Stereo    HRTF
M/S t-value         0,473           0,808     -0,063
M/S p-value         0,643           0,432     0,951

Table 6. T-test results with a significance level of 5%, t-critical=2,131 and DoF=15.

As we can see in table 6, no significant difference was found, meaning that the mid/side version was not perceived as significantly more enveloping than any other version.

T-test: hardcore gamers

Compared version    Stereo spat.    Stereo    HRTF
M/S t-value         -0,333          -0,161    -1,272
M/S p-value         0,747           0,876     0,235

Table 7. T-test results with a significance level of 5%, t-critical=2,262 and DoF=9.

From table 7 we can observe that there was no significance to be found when looking at the hardcore gamers.

T-test: casual gamers

Compared version    Stereo spat.    Stereo    HRTF
M/S t-value         1,904           1,941     2,521
M/S p-value         0,115           0,110     0,053

Table 8. T-test results with a significance level of 5%, t-critical=2,571 and DoF=5.

Table 8 shows that no significance was found when looking at the casual gamers. This means that none of the test results for perceived envelopment yielded any significance in any of the groups.

Qualitative results and analysis

This section covers the results from the questions the test-subjects had to answer after playing all four versions. It is important to note that some test-subjects rated two or more versions equally, which is why some answers appear under multiple versions and the total number of answers sometimes exceeds 16.

Question 1 answers

1. Stereo Spat   2. Stereo   3. Mid/side   4. HRTF

Clear localization, good envelopment.
Just right and believable envelopment.
Felt like you were inside the game, enhanced the experience, more exciting.
More bass, cosy.
Natural sounds.
Hear more sounds, built atmosphere, more spot effects.
Good balance, more enveloping.
Clarity, easy to localize, footsteps were clearer.
Natural direction.
Enveloping environments.
Many small audio sources.
Able to locate sounds, audio not constantly from the left and right channel.
Localized sounds, most comfortable to listen to.
Sounded cool, very enveloping.
Felt natural, not as much "in your face", more powerful volcano.
Clear localization, good envelopment.
Striking and fitting point effects, wide and enveloping ambiences.
Striking and fitting point effects, wide and enveloping ambiences.
Enveloping environments.
Enveloping environments.
Sounded cool, very enveloping.

Table 9. Answers to the question: What defined the version you liked the most?

From the answers shown in table 9 we can see that the people who liked the 1st version felt that it was natural, had clear and localized sounds, had good envelopment and was comfortable to listen to. It seems like the fact that the sounds were spatialized made the test-subjects like that version more than the others. The same goes for the 4th version, which most test-subjects liked because of its clear and natural localization.

The test-subjects who liked versions 2 and 3 the most also used the attribute envelopment a lot more, saying that the versions were very enveloping and that the envelopment was believable. They also used more words describing the experience and the narrative, saying for example that it sounded cool and that some sounds, like spot effects and the volcano, felt striking.

When we compare this to what could be observed in the quantitative results, it seems like the test-subjects simply preferred different things. Some preferred to have the sounds localized and some preferred them to be more striking. Since we found significance when looking at the casual gamers, who aren’t that used to spatialized audio and to having to listen for the location of objects, one might argue that more envelopment makes a game’s sound design more preferable if the gamers do not care as much about localization or if the game does not require the players to listen for the directionality of the sounds.

Question 2 answers

1. Stereo Spat   2. Stereo   3. Mid/side   4. HRTF

Lost one side.
Difficult to separate different sounds, felt like an "audiohelmet".
Boring, didn't hear that much, spent more time looking at things.
Bad envelopment, didn't feel believable.
Soup of sounds in the middle.
Unnatural "direction", crickets "took over" and amplified the unnaturalness.
Sounded empty, hard to get immersed.
Typical headphone stereo, unrealistic towards the picture.
Only based on distance, always sounded from both channels.
Volcano not as striking, not as exciting.
"Flat" sound effects, boring/unnatural ambiences.
Quiet footsteps, everything hard panned.
A bit messy, unnatural positioning on the volcano, could hear EQ-changes when turning.
The sound design did not glue, unnatural positioning on the crickets.
A lot of artefacts in the sounds, phase-problems.
A lot of treble.

Table 10. Answers to the question: What defined the version you liked the least?

From table 10 we can see that the test-subjects who disliked the 1st version had very different reasons for doing so. One subject found that the sound design didn’t glue, one felt like it was panned more to one side and one felt like the sounds were like a soup in the middle. It is therefore difficult to say anything about why the test-subjects didn’t like this version. However, one thing that versions 1 and 4 have in common is that two test-subjects mentioned the positioning being unrealistic towards the picture. One test-subject specifically mentions the volcano as having unrealistic positioning, despite a lot of test-subjects liking the accurate positioning in these two versions. This could be because the sound was positioned as a mono source in the middle of the volcano, whereas in real life it wouldn’t be possible to hear a thundering volcano as one small sound originating from one small point on the volcano. Some test-subjects also felt that the 4th version had bad envelopment (perhaps because of the sounds being in mono) and that it had artefacts and phase problems. This is most likely because the HRTF algorithm doesn’t work that well for some people, making it sound weird and uncanny.

The test-subjects who disliked versions 2 and 3 gave quite similar answers, saying that the audio was hard panned to both channels and that they couldn’t locate the sounds in the game other than from the decrease and increase in volume depending on how close to the sound they were. These test-subjects probably preferred to be able to hear where the sounds were coming from rather than having the sounds envelop them at all times.

Question 3 answers

1. Stereo Spat   2. Stereo   3. Mid/side   4. HRTF

A lot of ambient sounds, though phase weirdness, especially in the crickets.
A lot of sounds and clear point effects.
A lot of sounds and clear point effects.
More bass, cozy.
Very loud ambiences.
Felt natural.
Felt natural.
The volcano’s direction felt natural, could hear the sounds position.
Presence, direction, localization.
All very enveloping, some more discrete.
The sounds were always there, the volcano got louder the closer you got.
Felt like you were inside the sounds, the sounds were everywhere.
The sounds came from different directions, but still a background stream with e.g. forest sounds.
Felt like you were inside the sounds, the sounds were everywhere.
Presence, direction, localization.
All very enveloping, some more discrete.
The sounds were perceived as they are in real life.
All very enveloping, some more discrete.
All sounds felt right together, dynamic in a good way.
Distinct and placed audio sources.
Some sounds like the volcano were very wide.
Able to locate sounds, audio not constantly from the left and right channel.

Table 11. Answers to the question: What defined the version you found the most enveloping?

Like table 9, table 11 also shows that the test-subjects had two distinct preferences. Some found the two spatialized versions to be more enveloping due to them having better localization and some found the two versions without spatialization to be more enveloping due to them being perceivably wider and always there. This explains why no significance could be found when looking at both types of gamers.

It seems like a game sound design can be enveloping in more ways than one with localized sound surrounding the player being one way and a constant stream of wide sounds being another. Additionally, the volcano is yet again mentioned as a defining trait for the mid/side version suggesting that a big object benefits from a wide sound.

Question 4 answers

1. Stereo Spat   2. Stereo   3. Mid/side   4. HRTF

Felt unnatural due to hard panning of the cricket sounds.
"Flat", muddy.
All sounds to both channels, masked other sounds when getting close to an object.
The sounds were relegated to different channels, felt unnatural.
Too spread out, it didn't glue.
Volcano sound was big but without localization the envelopment and immersion was broken.
Quiet footsteps, everything felt hard panned.
Audio not as central, the excitement level decreased, felt boring.
Typical headphone stereo, unrealistic towards the picture.
No real wideness on the ambient sounds, "flat" ambient sounds, weird balance.
No big difference between the versions.
No big difference between the versions.
A lot of treble.
No big difference between the versions.
Like everything is in mono.
No big difference between the versions.

Table 12. Answers to the question: What defined the version you found the least enveloping?

Again, it is quite difficult to say anything about the 1st version due to the answers being so different as seen in table 12. Some answers even contradict each other like one answer saying that it sounded like everything was in mono, some saying that it was too spread out and one that said the cricket sounds were too hard panned.

The 2nd version had 2 answers mentioning the attribute “flat”. Perhaps this was because this version was just stereo without anything special going on like spatialization or mid/side processing. The same test-subjects also found it “muddy” and weirdly balanced which could be due to the version not being spatialized.

Most test-subjects found the version they felt was the most enveloping to also be the version they preferred the most. This correlation is supported by a study by Berg and Rumsey (2001), in which preference and envelopment were found to be closely correlated.

The following are the answers to the optional comments of the survey.

• I use sounds in games to locate objects. Too many ambient sounds make it difficult to locate sounds. At first, I didn’t understand that it was a volcano that was thundering.

• My experience clashed when not only the ambient sounds were everywhere but also sounds that should have been positioned. A balance between the two would have worked better.

These two comments make it clear that some test-subjects favoured positioned sounds to be able to locate and understand what they were listening to. Additionally, the second comment suggests that a combination of positioned and non-positioned audio might work better.

Discussion

It is likely that the experiment design caused the experiment to be less a comparison of the mid/side technique, stereo and HRTF and more a comparison between spatialized and non-spatialized audio and their perceived envelopment. This was difficult to circumvent due to how Unreal Engine and Steam Audio worked. This also made it very difficult to create a fair comparison since the techniques are suited to different types of audio. HRTF and spatialized audio are intended for point sources which need to be accurately located within the game world, whereas the mid/side technique works better for background streams like ambient wind sounds fed to both channels simultaneously. Additionally, there seem to have been some problems with the mix in some of the versions. One test-subject mentioned that the footsteps were too quiet in one of the versions and others mentioned that the crickets were too loud in another version. This could have been avoided with an additional pre-study in which the participants could have given their thoughts on the mix. Although there were some flaws in the experiment design, there was still a lot of data to analyse, which said a great deal about the capabilities, advantages and applications of the different techniques.

There are some things in the pre-study that are worth discussing before we dive into the main experiment. Firstly, the footsteps sound was the only sound for which the test-subjects were mostly in agreement. This is interesting since this was the only self-produced sound and perhaps the sound the test-subjects were most used to hearing, both in real life and in games.

The rest of the sounds did not have as similar values, which could mean that it was more difficult for the test-subjects to find a suitable mid/side ratio for them, that the test-subjects perceived the mid/side processing effects differently or that they had different opinions on what the most suitable mid/side ratio was. Ultimately, this means that optimizing the mid/side technique is neither an easy nor an objective task, which in turn means that it could potentially be a difficult technique to use reliably. If the technique is to be used, it could perhaps benefit from a slider in a settings menu which controls the mid/side ratio.
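One way such a slider could be wired up is sketched below. This is purely an illustration: the function name, the 0 to 1 slider range and the ceiling of 9.4 dB (simply the largest value in table 2) are assumptions, not something specified by the study.

```python
def slider_to_side_gain(slider, max_gain_db=9.4):
    """Map a settings-menu slider in [0, 1] to a linear side gain factor.

    max_gain_db is a hypothetical ceiling; 9.4 dB is simply the largest
    value in Table 2, not a recommendation from the study.
    """
    slider = min(max(slider, 0.0), 1.0)  # clamp out-of-range input
    gain_db = slider * max_gain_db       # interpolate in dB, not in linear amplitude
    return 10 ** (gain_db / 20)          # dB to linear amplitude factor


# Slider at zero leaves the side signal untouched; halfway gives +4.7 dB.
print(slider_to_side_gain(0.0), slider_to_side_gain(0.5))
```

The returned factor would be multiplied onto the side signal only, leaving the mid signal at its original level.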

Although few significant results were found in the main experiment, there is still a lot to be observed from the data. One take-home point is that there seems to be a correlation between envelopment and preference in a game sound design, since a lot of test-subjects chose their most preferred version as the one they found the most enveloping. This means that envelopment is most likely a key element in creating a preferable sound design in games and is an attribute that sound designers should strive to perfect. However, there seem to be two different kinds of envelopment.

The casual gamers’ results yielded statistically significant differences showing that non-spatialized audio playing in both channels simultaneously with mid/side processing is preferred over spatialized audio. Additionally, from the qualitative data we could see that the test-subjects who found the mid/side version preferable used the word envelopment a lot more than the others, suggesting that the mid/side version’s envelopment was one of its more defining traits rather than factors like localization or naturalness. Some of their answers also suggested that it enhanced the game experience and narrative, making some sounds, like the volcano, sound more striking. This suggests that the mid/side technique could be used not only to improve envelopment but also as more of a narrative tool.

The other kind of envelopment in sound design seems to be the one that comes from having the audio sources positioned in the game world. A lot of answers from the qualitative data mention direction, localization and clarity as the defining traits for the spatialized versions.

One way to interpret this is that having the sounds spatialized and giving them a relative location and direction to the player makes the players feel like they are surrounded by the sounds, which are thus perceived as enveloping. However, it seems like the spatialized sounds sometimes did not correlate well with the picture, which degraded the experience a bit for some of the test-subjects. The volcano was particularly sensitive to spatialization, considering it was mentioned twice regarding why the test-subjects didn’t like the HRTF version. This could mean that bigger objects like a volcano or a ferry need to have a bigger and wider sound that can’t be located to a single focused point.

Conclusion

Can an optimized mid/side technique improve perceived envelopment in game audio? Yes and no. The qualitative data gathered in this study suggests that the mid/side version did have envelopment as its most defining attribute, but the quantitative data showed no statistical significance except for the casual gamers’ data from their preference rating. Though even when looking at the casual gamers’ data, the mid/side version wasn’t significantly more preferred than the non-spatialized stereo version. Despite this, the mid/side version seemed to be able to enhance the experience and narrative by making some sounds sound more powerful and making some test-subjects feel like they were “inside the game”.

Perhaps the most prominent observation from this study is that a game’s sound design probably works best if a combination of spatialized and non-spatialized audio is used. The sounds that a player expects to be locatable in a game need to be spatialized in some way, while non-spatialized sounds can further help fill out the auditory space around the player and improve the perceived envelopment. The mid/side technique could then be used on these non-spatialized sounds to further improve the envelopment of the game’s sound design. The mid/side technique could also be used as a narrative tool to make sounds more striking, the volcano sound being one example. Perhaps it would be preferable to have two (or more) sound components for the volcano sound: one spatialized component that can be heard over a very large distance and one mid/side processed component that can only be heard when you are closer to the volcano. This would make the volcano sound big, wide and all around you when you are close to it, while the localized component gives the player audio-visual coherence.
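The two-component idea can be sketched as a pair of distance-dependent gains. Everything specific here is hypothetical: the function name, the radii in game units and the linear falloff are placeholders, and the attenuation curves actually used in the project (see the appendices) are not reproduced.

```python
def volcano_component_gains(distance, near_radius=50.0, far_radius=500.0):
    """Return (spatialized_gain, mid_side_gain) for a listener at `distance`.

    The radii are hypothetical values in game units and the linear falloff
    is a placeholder for whatever attenuation curve the engine provides.
    """
    distance = max(distance, 0.0)
    # Spatialized component: audible over a large distance, silent at far_radius.
    spatialized = max(0.0, 1.0 - distance / far_radius)
    # Mid/side processed component: only audible close to the volcano.
    mid_side = max(0.0, 1.0 - distance / near_radius)
    return spatialized, mid_side


for d in (0.0, 25.0, 50.0, 250.0, 500.0):
    print(d, volcano_component_gains(d))
```

With these placeholder values the wide, non-spatialized component fades out well before the localized one, so a distant player would still hear where the volcano is while a nearby player would also be surrounded by it.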

Creating an enveloping and engaging sound design for a game is no easy task. A lot of different tools and tricks must be used to overcome the many challenges. Some objects require accurate localization, some require different sounds to be heard at different distances and some require realistic real time reverbs. Furthermore, these challenges require different solutions and sometimes they can’t be solved at all due to technical limitations. Because of this it would be a lie to say that: Yes, mid/side processing is the best way to improve envelopment in game audio. However, in some cases it might be just the tool that a sound designer needs.

Future research

Future research would need to study in more depth which sounds work well with mid/side processing and why a suitable amount of mid/side processing is easier for people to judge for some sounds than for others. It would perhaps also be relevant to study the effects of the mid/side processing in a more controlled environment to see how much mid/side processing is suitable. In the study by Berg and Rumsey (2001) a correlation between preference and envelopment was found, which also seemed to be the case in this study.

However, the non-spatialized mid/side version, which in theory should be perceived as wider and therefore more enveloping, did not seem to be preferred over the non-spatialized version without mid/side processing. It would therefore be interesting to narrow the scope down to just a comparison between these two versions to study why this is the case.

References

Berg, J., & Rumsey, F. (2001, June). Verification and correlation of attributes used for describing the spatial quality of reproduced sound. Paper presented at the 19th AES International Conference, Elmau, Germany. Retrieved from http://www.aes.org/e-lib/browse.cfm?elib=10057

Cobos, M., & Lopez, J. (2010, October). Interactive Enhancement of Stereo Recordings Using Time-Frequency Selective Panning. Paper presented at the 50th AES International Conference, Tokyo, Japan. Retrieved from http://www.aes.org/elib/browse.cfm?elib=15565

Collins, K. (2013). Playing with sound: A theory of interacting with sound and music in video games. Cambridge, MA: MIT Press.

Eriksson, P.R. (2017). A Comparison Of Two Commercially Available Alternatives For Spatializing Audio Over Headphones In A Game Setting. Retrieved from Digitala Vetenskapliga Arkivet (diva2:1108216)

Floros, A., & Tatlas, N.A. (2011, July). Spatial Enhancement for Immersive Stereo Audio Applications. Paper presented at the 2011 17th International Conference on Digital Signal Processing (DSP), Corfu, Greece. Retrieved from http://ieeexplore.ieee.org/document/6005001/

Grimshaw, M., & Schott, G. (2007, September). Situating Gaming as a Sonic Experience: The acoustic ecology of First-Person Shooters. Paper presented at the 2007 DiGRA International Conference, Tokyo, Japan. Retrieved from http://www.digra.org/digital-library/publications/situating-gaming-as-a-sonic-experience-the-acoustic-ecology-of-first-person-shooters/

Huiberts, S. (2010). Captivating Sound - The Role of Audio for Immersion in Computer Games. Retrieved from https://octo.hku.nl/octo/repository/getfile?id=zo30jluTQVw

Katz, R.A. (2015). Mastering Audio: The art and the science (3rd ed.). Burlington, MA: Focal Press.

Maher, C.M., Lindemann, E., & Barish, J. (1996). Old and New Techniques for Artificial Stereophonic Image Enhancement. Paper presented at the 101st Audio Engineering Society Convention, Los Angeles, USA.

Rumsey, F. (2002). Spatial Quality Evaluation for Reproduced Sound: Terminology, Meaning, and a Scene-Based Paradigm. Journal of the AES, 50, 651-666. Retrieved from http://www.aes.org/e-lib/browse.cfm?elib=11067

Schneider, M. (2012, April). MS Mastering of Stereo Microphone Signals. Paper presented at the 132nd Audio Engineering Society Convention, Budapest, Hungary. Retrieved from http://www.aes.org/e-lib/browse.cfm?elib=16600

Appendix

Appendix 1. Project audio settings.

Appendix 2. Steam Audio’s plugin settings.

Appendix 3. Steam Audio’s spatialization settings.

Appendix 4. Steam Audio’s occlusion settings.

Appendix 5. Attenuation settings for the spatialized wind and cricket cues.

Appendix 6. Attenuation settings for the spatialized lava cue.

Appendix 7. Attenuation settings for the spatialized torch cue.

Appendix 8. Attenuation settings for the spatialized volcano cue.

Appendix 9. Attenuation settings for the non-spatialized wind and cricket cues.

Appendix 10. Attenuation settings for the non-spatialized lava cue.

Appendix 11. Attenuation settings for the non-spatialized torch cue.

Appendix 12. Attenuation settings for the non-spatialized volcano cue.

Appendix 13. Scales for preference and envelopment ratings.

Scenario 1 Scenario 2

Like very much Very enveloping Like very much Very enveloping

Do not like at all Not at all enveloping Do not like at all Not at all enveloping

Scenario 3 Scenario 4

Like very much Very enveloping Like very much Very enveloping

Do not like at all Not at all enveloping Do not like at all Not at all enveloping
