Reducing Repetition in Game Sound: Utilizing Frequency Manipulation to Create Variations of Footstep Sound Assets


BACHELOR THESIS

Reducing Repetition in Game Sound:

Utilizing Frequency Manipulation to Create Variations of Footstep Sound Assets

David Forss 2016

Bachelor of Arts Audio Engineering

Luleå University of Technology

Department of Arts, Communication and Education


Reducing Repetition in Game Sound:

Utilizing Frequency Manipulation to Create Variations of Footstep Sound Assets

David Forss

Bachelor's Thesis in Audio Engineering
Supervisor: Nyssim Lefford

Abstract

This study tested whether convincing variations of footstep sounds can be created for games using simple frequency manipulation. A listening-playing test was conducted using two versions of footstep foley: one processed, with variation created through frequency manipulation, and one unprocessed, varied through different samples. The test was conducted in a game, using multiple levels. The levels were identical in both visuals and sound design, except for the footsteps and one object. 30 test subjects, 12 of whom were trained listeners, participated and were asked to rate how much of a difference they heard between the two versions, and the believability of the different types of sounds. Results show that the subjects had a hard time hearing any difference between the two versions. 21 of 30 subjects heard no difference at all in the character sounds, and most of the subjects who did hear a difference reported a perceived difference in sound, such as differing surfaces, shoes or walking, rather than a


Table of Contents

Abstract ... 1
Introduction ... 3
Background ... 5
  Techniques ... 5
    Pre-Processing ... 6
    Real-Time Processing ... 7
      Wavelet Packet Transform Re-synthesis ... 7
      Real-Time Spectral Morphing ... 9
  Choice of Applicable Sonic Parameters ... 10
    Frequency ... 10
Aim and Purpose ... 11
Method ... 12
  Construction of the Level ... 13
  Choice & Preparation of the Stimuli ... 14
  Processing ... 15
  Testing conditions ... 17
  Subjects ... 17
Results ... 19
  Question 1: Perceived Difference ... 19
  Question 1: Subject Comments ... 20
  Question 2: Preferred Version ... 21
  Questions 3 & 4: Character Sounds Believability ... 22
  Questions 3a & 4a: Character Sounds Realism ... 23
  Questions 3 & 4: Subject Comments ... 24
  Questions 5 & 6: Ambient Sounds Believability ... 24
  Questions 5a & 6a: Did the Ambient Sounds Fit The Environment ... 25
  Questions 5b & 6b: Ambient Sounds Realism ... 25
Analysis ... 25
  Interpreting the Data ... 26
Discussion ... 29
Future Research ... 32
References ... 34
Appendix A: Screenshots of Game & Construction ... 36
Appendix B: Questionnaire ... 38
Appendix C: Raw Data ... 40


Introduction

This thesis is about sound design in digital games, more specifically about reducing the perceived repetition of footstep samples in order to give the player a more enjoyable experience. For several reasons, such as time, budget and computing limitations, games have a limited number of sound assets. This is a problem because it means sounds that occur more than once must be reused throughout playing sessions. If a player detects this and perceives the sound design as repetitive, it can cause annoyance and break the immersion of the experience. More varied sound design increases the likelihood of the consumer enjoying the game to the fullest, and with it the chance of longer play sessions and deeper engagement.

The focus of this study is footstep sounds, one of the most commonly heard types of sound in games, and probably the most repeated. Footsteps were chosen because they occur constantly and repeat easily in games, whereas real footsteps are highly varied. Since footsteps are diegetic sounds that relate directly to the game world, keeping them varied is important for making the game world believable, so that the player stays immersed in the experience. If a player engages in long playing sessions, many variants of the footstep sounds are needed to reduce the chance that the player perceives repetition, that is, realizes they are hearing the exact same footstep sound more than once, which is impossible in the real world and thus not believable in a game.

A footstep sound also has a simple timbral structure, compared with sounds that have complicated overtone series, timbres and amplitude envelopes, such as an instrument or spoken dialog, which change in complex ways. In the case of a footstep, the foot hitting the ground creates a transient for the heel with a follow-up from the toe. How this transient sounds depends on the walker, their weight distribution and foot roll. The ground then resonates depending on the type of surface. In addition there may be friction between the sole of the shoe and the surface, consisting mostly of high-frequency content. The type of shoe also affects the sound, because its form and mass can be heard; compare high heels with winter boots, for example. Due to this simple structure, processing is not very hard to apply.
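The structure described above can be sketched in code. The following is a toy synthesis, not real foley: two decaying sine transients stand in for the heel and toe strikes, and a burst of decaying noise stands in for surface resonance and sole friction. All constants (onset times, decay rates, frequencies, amplitudes) are illustrative guesses, not measured values.

```python
import numpy as np

def toy_footstep(rate=44100, rng=None):
    """Toy footstep: heel transient, a softer toe transient ~60 ms later,
    and a decaying noise tail for surface resonance / sole friction."""
    rng = rng or np.random.default_rng()
    n = int(0.25 * rate)                      # 250 ms of audio
    t = np.arange(n) / rate
    out = np.zeros(n)
    # (onset seconds, amplitude, resonance frequency) for heel and toe
    for onset, amp, freq in [(0.0, 1.0, 120.0), (0.06, 0.5, 180.0)]:
        i = int(onset * rate)
        seg = t[: n - i]
        out[i:] += amp * np.exp(-40 * seg) * np.sin(2 * np.pi * freq * seg)
    # high-frequency-ish friction: quiet noise with a fast decay
    out += rng.standard_normal(n) * np.exp(-25 * t) * 0.1
    return out

step = toy_footstep(rng=np.random.default_rng(0))
```

The point of the sketch is only that the components are separable: each term in the sum could be manipulated independently, which is exactly what makes footsteps easy to process.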

There have been previous studies looking for methods to avoid repetition in sound assets. Frequency manipulation has proved useful in algorithm-based real-time processing techniques that create variation for impact sounds with simple structure. These algorithms are still in an early research phase and are not implemented in commercial games, because they are too computationally taxing. But since they use specific parameters such as pitch shifting and dynamic equalization to create variation, it is interesting to see whether the same could be done manually, in a simpler manner. If convincing variations can be created in a simpler way, it could improve sound design workflow and reduce repetitiveness further, without demanding extreme amounts of processing power.

The purpose of the thesis is to test a technique for recycling recorded sound samples of footsteps, creating variations through simple frequency manipulation. Although other sonic parameters might be useful too, the scope is limited to frequency manipulation for reasons explained in the background below. Using basic tools such as equalization and pitch shifting, processed versions of unprocessed source sounds were created and tested as variants against the traditional method of creating variants. The point of comparing the two methods is to see whether the new processing method can provide an equally believable experience for the player, and thereby an equally viable way to produce footstep sounds in commercial games.

Ultimately, if such a technique were turned into a signal processing algorithm and implemented in a commercial game, it could be applied not only to one footstep sample per foot, as is tested in this experiment, but to every unique footstep sample available. This would make it possible to multiply all available footstep sound assets in a short amount of time. For example, if 3 extra variations can be created from each existing footstep sound asset in a game, there would effectively be four times as many variants after processing.
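The multiplication described above is easy to sketch. Assuming offline (pre-processing) use, the snippet below derives three pitch-shifted variants from each source sample with a naive resampling shift; the shift factors are illustrative choices, not values from the thesis.

```python
import numpy as np

def pitch_shift(sample, factor):
    """Naive resampling pitch shift: factor > 1 raises pitch (and shortens
    the sample), factor < 1 lowers it. Acceptable for short, unpitched foley."""
    n_out = int(len(sample) / factor)
    new_idx = np.linspace(0, len(sample) - 1, n_out)
    return np.interp(new_idx, np.arange(len(sample)), sample)

def expand_bank(sources, factors=(0.94, 0.97, 1.04)):
    """Return the originals plus one shifted derivative per factor:
    3 extra variants per source -> 4x the pool size."""
    bank = list(sources)
    for s in sources:
        for f in factors:
            bank.append(pitch_shift(s, f))
    return bank

footsteps = [np.random.randn(2205) for _ in range(2)]  # two mono sources
bank = expand_bank(footsteps)                          # 2 * (1 + 3) = 8 assets
```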

This experiment also establishes a basis for developing real-time processing techniques in game engines. The sound designer could design and modify the frequency manipulation bands and limitations in preparation for the engine to apply them dynamically, producing potentially larger numbers of variations of the assets. Since the processing tools are simple and already exist to some degree in some game engines, this low-CPU method could prove effective: it would not only reduce repetition, but the variants would also take up no additional storage space on the gaming system.

More immediately, and with less computation, the point of this method is to work more efficiently as a sound designer: producing more variants quickly speeds up the production process, in a business where strict deadlines and time limits are common. Less foley would have to be recorded, edited and mixed, which is a time saver in itself. This is the focus of this thesis: essentially, to speed up the production process. The other perspectives mentioned, such as real-time processing possibilities and sparing the computer stressful tasks, remain relevant and are kept in mind throughout.


Background

Similar to sound design in the movie industry, sound designers working in game development need to create and implement assets (in this case audio files) that fit their visual counterpart. The big difference, however, is that sound designers in the game industry create assets for an interactive experience (Tuffy, 2004). Whereas sounds in movies are locked to the picture in a linear experience, non-linear games have sounds bound to actions, often triggered by the player. This means that some or all sound assets in a game can be heard more than once, depending on how often the action occurs and how many variations of the sound are bound to it. How often an action happens is not predictable, since a game is based on interactivity, and which sounds happen when is not predictable unless it is by design. An example is walking in a game: how many times the player hears the same footstep sound(s) is decided by how long the player walks and the number of footstep assets available in the game, assuming the sounds aren't generated in real time.

Actions that are repeated and unpredictable, which is common in non-linear games, increase the chance that game sound is repeated. Addressing repetitiveness in game sound is therefore an important step towards a more realistic and better-sounding final product; an unvaried sound design can end up distracting the player.

Due to technical and economic limitations, repetition has always been one of the biggest problems a sound designer has to work against (Vachon, 2009). The repetition problem has received attention from both the game industry and researchers.

There are several techniques and foreseeable possibilities for creating variation in sound assets. Most come with their own cost, usually stemming from technical limitations, as they rely on real-time processing techniques that are very computationally stressful for gaming systems.

Nevertheless, these computational techniques were reviewed to understand which sonic parameters are most commonly (and successfully) used to generate variation. Even though this literature refers mostly to real-time processing systems, if the most useful sonic parameters can be identified, sound designers can use this knowledge to choose how to manually manipulate sounds to create maximum, believable variation.

Techniques

In preparation for an experiment in which footsteps are compared, techniques and methods were examined to see what is effective in creating convincing variants, and to choose sonic parameters to manipulate. Although most of these are algorithm-based techniques, they still lay the basis for this experiment. First, before delving into the real-time processing techniques, there is an overview of what is considered the most common way sound designers create variation, namely pre-processing, and a brief look at how assets are handled and implemented in standard game engines.

Today, the most common way to avoid repetition remains creating several separate sound assets as variations of one sound effect (Collins, 2013). Two techniques based on real-time processing were also reviewed. The first is Wavelet Packet Transform Re-synthesis, which creates variation by converting a source sound into wavelets that are modified in a number of ways. The altered parameters are based in the frequency and temporal spectra: volume, time envelope, frequency envelope, time-scale modification, time inversion and time shuffle (Silva & Mendes, 2011). The second is Real-Time Morphing, which creates variation by morphing two or more sounds. Features of the new variations resemble the source sounds in tone color (formant shift), amplitude envelope, pitch and length (Siddiq, 2015).

Pre-Processing

The most common way to avoid repetition in game sound is simply to produce many separate variations of one sound. After designing all variations manually, they are rendered out to several separate audio files, which are then implemented in the engine as distinct assets. This results in a large number of files, a “sound bank”, all of them full-length sound clips. Producing all of these sound assets and variations can take a long time, and working in a game studio can mean working within a very limited timeframe (Silva & Mendes, 2011). Working against a strict deadline can limit the number of variations created for each sound. In addition, even though technology is moving forward quickly, there are still technical limitations such as the memory and CPU needed to play sounds. Both are restrictions a sound designer has to work against (Lundquist, J., Game Sound Lecture, 09/10-2015). Lundquist is a development director at DICE, and offered insight into the production process of game development.

Once variations are created and implemented, they can be set to play in random order in the engine (Collins, 2013). This removes the possibility of the same sound being played twice in a row, preventing repetition. However, it also means that many assets may be needed to achieve believable variation. More assets essentially mean more variation, since the player is less likely to hear the same sounds as often. Getting more assets requires more recording, editing, mixing and implementation, which costs money and time.
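The random-order playback described above can be sketched as a small picker that excludes the previously played asset. This is an illustration of the idea, not any particular engine's API; the asset filenames are hypothetical.

```python
import random

class FootstepPicker:
    """Pick a footstep asset at random, never the same one twice in a row."""
    def __init__(self, assets):
        if len(assets) < 2:
            raise ValueError("need at least two assets to avoid repeats")
        self.assets = list(assets)
        self.last = None

    def next(self):
        # choose uniformly among everything except the previous pick
        choice = random.choice([a for a in self.assets if a != self.last])
        self.last = choice
        return choice

picker = FootstepPicker(["step_01.wav", "step_02.wav", "step_03.wav"])
steps = [picker.next() for _ in range(20)]
assert all(a != b for a, b in zip(steps, steps[1:]))  # no immediate repeats
```

Note that with only two assets this degenerates into strict alternation, which is itself an audible pattern; hence the need for a reasonably large pool.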

To give insight into the sound design workflow for digital games, Gustaf Grefberg, a well-established sound designer in Sweden from the game studio Hazelight, held a workshop in Piteå for the game sound students. Among the information shared during these two days, much of the work came down to making a lot of separate assets. Creation and some minor processing of the sounds was done in a DAW (Digital Audio Workstation). When the assets were done they were implemented in Unreal Engine 4, and just as Collins (2009) remarked, Grefberg's sound assets were played in random order to avoid repetitiveness (Grefberg G., Sound Design Workshop, 17/09-2015).

However, there are some time-efficient ways to avoid repetition with this method, depending on the sound. Several sounds can be layered on top of each other. In the case of an ambience, there could be a base ambience sound with several so-called “ambient spot sounds” added on top in the mix. These spot sounds are shorter clips of random ambience sounds, such as birds, rustling leaves etc. If foley is done for a character, there can likewise be several layers of sound. Instead of having the entire foley in the footstep assets, it can be split into parts like heel, toe, armor rustle etc. Combined, these can create a large number of variations, since each part is likely to be randomized separately (Grefberg G., Sound Design Workshop, 17/09-2015).
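Splitting foley into independently randomized layers multiplies the effective variation. A sketch with hypothetical asset names: four heel clips, four toe clips and three armor rustles already yield 4 × 4 × 3 = 48 distinct composite footsteps.

```python
import random

# Hypothetical split-foley layers; each is randomized independently.
heels  = ["heel_01", "heel_02", "heel_03", "heel_04"]
toes   = ["toe_01", "toe_02", "toe_03", "toe_04"]
rustle = ["armor_01", "armor_02", "armor_03"]

def random_footstep():
    """One composite footstep: one random clip per layer."""
    return (random.choice(heels), random.choice(toes), random.choice(rustle))

combinations = len(heels) * len(toes) * len(rustle)  # 4 * 4 * 3 = 48
```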

Real-Time Processing

Real-time processing techniques are based on real-time manipulation of source files to create large numbers of variations. As previously mentioned, a downside of real-time processing techniques and algorithms is that they can be very stressful for the system: they require processing power and memory that is not necessarily available to a sound designer in a game studio (Vachon, 2009).

However, in contrast to using separate recordings for variation, these algorithms create variations by manipulating one or more source files, utilizing some interesting parameters to do so. The goal is eventually to run them in real time within an engine, which would result in endless variation. But an aspect that is rarely mentioned is using these processing tools under time limitations. They are often too demanding for in-game use, but if they are used in offline rendering to build a large sound bank, time can be saved compared with manual sound design methods. Regardless, these systems do not run in real time on the average computer used for sound design or gameplay.

Wavelet Packet Transform Re-synthesis

The Wavelet Packet Transform technique presented by Silva & Mendes (2011) is a method proposed to reduce repetitiveness in sound effects for digital games. New sounds are created by manipulation and re-synthesis of the sound in the Wavelet Packet Transform domain. A wavelet is basically a small piece of the sound after decomposition: a complex sound is broken into simpler components with coefficients that can be individually processed. One sound effect breaks down into a large number of wavelets, and after processing they can be recombined to create a new sound.


In the Wavelet Packet Transform matrix the wavelets are laid out on two axes, representing time and frequency. Each wavelet can be manipulated separately by modifying its coefficient, then re-synthesized. Because wavelets can be modified independently, a large number of variations can be created from just one source sound while keeping the source sound's timbral characteristics intact. The method must be able to analyze a source sound and manipulate it in multiple ways, using several different parameters, to make it sound different each time it is played, essentially reducing repetition by providing new, unique sounds.

The manipulations utilized by Silva & Mendes (2011) are volume, time envelope, frequency envelope, time-scale modification, time inversion and time shuffle. Briefly:

– Volume and time envelope together create an amplitude envelope, which decides how the sound's volume develops over time.

– Frequency envelope decides how the gain of specific frequencies develops over time.

– Time-scale modification gives control over simple time compression and expansion by resampling in the transform matrix.

– Time inversion is basically reversing the sound information.

– Time shuffle rearranges coefficients in the transform matrix in time, and through randomization can give a continuous sound texture.
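As a rough illustration of coefficient-domain gain manipulation, the sketch below uses a single-level Haar transform, a far simpler relative of the full wavelet packet transform used by Silva & Mendes: the signal is split into approximation and detail coefficients, the detail coefficients are given a random gain, and the sound is re-synthesized. The gain range is an illustrative choice.

```python
import numpy as np

def haar_decompose(x):
    """One-level Haar transform: pairwise averages (approximation)
    and pairwise differences (detail), both scaled by 1/sqrt(2)."""
    x = x[: len(x) // 2 * 2]                 # truncate to even length
    s = 1 / np.sqrt(2)
    return (x[0::2] + x[1::2]) * s, (x[0::2] - x[1::2]) * s

def haar_reconstruct(approx, detail):
    """Inverse of haar_decompose; lossless."""
    s = 1 / np.sqrt(2)
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) * s
    x[1::2] = (approx - detail) * s
    return x

rng = np.random.default_rng(0)
source = rng.standard_normal(1024)
a, d = haar_decompose(source)
# per-coefficient random gain on the detail band -> a timbral variant
variant = haar_reconstruct(a, d * rng.uniform(0.7, 1.3, len(d)))
```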

By analyzing the re-synthesized sounds, Silva & Mendes concluded that the modifications changing the gain of the coefficients gave good results. Time-scale modification with the resample ratio set between 0.7 and 1.3 also yielded acceptably credible results, with the exception of a few artifacts.

The last two manipulations, time inversion and time shuffle, worked best for sound textures, where some satisfactory fire, rain and water sounds were re-synthesized: from only one second of source material, more than five seconds were generated.

Although the method is very computationally intensive, the experiment is early work. Silva & Mendes (2011) explain that further research must be done, and that the method could be improved by a custom-built audio file format and optimization of the algorithms.


Real-Time Spectral Morphing

The second technique, Real-Time Morphing, presented by Siddiq (2015), is an algorithm for avoiding repetition specifically in impact sounds. It uses spectral-based morphing, partly due to the fast transformation and the ability to process the sound in frames.

Impact sounds were chosen for the experiment due to their simple structure. They are very important for game sound, and include sounds that are often heard many times in short periods, such as footsteps and collisions.

Siddiq (2015) explains that filtering and pitch shifting are ways to fake variety, whereas a morphing algorithm like this achieves true variety. These tools can also be used offline, instead of manually creating many separate variations before implementation, which could be useful for time efficiency.

With this method, two or more sounds can be morphed, creating a new variation through re-synthesis that shares characteristics with the source sounds: tone color (formant shift), amplitude envelope, pitch and length. The degree to which each parameter is applied can be adjusted to control the amount of similarity to the source sounds. Briefly:

– Tone color manipulation is done by analyzing the formant frequencies of the source sound and shifting them towards the target sound.

– Amplitude envelope describes how the sound develops over time.

– Pitch is used to shift between the important frequencies of the sounds.

– The lengths of the sounds are matched.
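A crude sketch of the spectral-morphing idea, much simpler than Siddiq's frame-based, formant-aware algorithm: interpolate the magnitude spectra of two equal-length sounds and re-synthesize with the phase of the first. Length matching and formant shifting are omitted here.

```python
import numpy as np

def spectral_morph(a, b, mix=0.5):
    """Interpolate magnitude spectra of two equal-length signals;
    mix=0 returns a, mix=1 returns b's magnitudes with a's phase."""
    A, B = np.fft.rfft(a), np.fft.rfft(b)
    mag = (1 - mix) * np.abs(A) + mix * np.abs(B)
    phase = np.angle(A)
    return np.fft.irfft(mag * np.exp(1j * phase), n=len(a))

rng = np.random.default_rng(1)
hit_wood = rng.standard_normal(2048)   # placeholder impact sounds
hit_stone = rng.standard_normal(2048)
morphed = spectral_morph(hit_wood, hit_stone, mix=0.3)
```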

The results of the study show that the algorithm works well on simple impact sounds, such as hits on simple materials. Manipulating the tone color was sufficient for this task, and pitch shifting was not used at all, since the impact sounds had no obvious pitch. Sounds with a more complex structure could not be morphed. The method requires a lot of memory and CPU power.


Choice of Applicable Sonic Parameters

To narrow the experiment down, most of the previously mentioned manipulations have been set aside. These choices were based on which sonic parameters would be most applicable and effective for a sound designer to use in a pre-processing approach, rather than on the algorithm and programming aspects.

The general conclusion across three of the papers, Silva & Mendes (2011), Siddiq (2015) and Vachon (2009), is that real-time processing alternatives are currently too taxing for the systems available for gaming. There are limitations in memory usage and CPU power that have to be taken into account. On top of this, there are usually strict deadlines as well, restricting sound designers from making large numbers of variations for each sound.

While the algorithm-based techniques might not yet be suitable for implementation in a game engine, they can still be used offline to save time. The real-time alternative is too taxing for the system, but that should not be a problem if the processing is done offline in the pre-processing stage and the sounds are implemented as usual, with separate assets for each variation (Siddiq, 2015).

The decision not to explore time inversion and time shuffle further is based on them being most useful in a real-time processing scenario. A manipulation such as time shuffling cannot be done efficiently by hand; applying these time manipulations requires an algorithm to re-synthesize the sounds (Silva & Mendes, 2011). The amplitude envelope mentioned by Siddiq (2015) would be interesting to explore, since it is available to sound designers, but would probably not be as effective manually as in an algorithm. To limit the number of steps and simplify the method presented in this thesis, frequency will be the only sonic parameter manipulated.

However, a sound designer might also use knowledge about these sonic parameters to create variation efficiently and improve workflow. For example, a sound designer who has to deliver 40 footsteps might be able to produce them from a limited number of source sounds.

Frequency

Alterations to any type of gain were described as giving good results in creating variations with the Wavelet Packet Transform technique. This includes the frequency envelope gain manipulation, which is capable of simple, dynamic equalization (Silva & Mendes, 2011). An equalizer plug-in for a DAW, a very common tool among audio engineers and sound designers, might be capable of similar results and requires little computing power. Pitch shifting is also frequency based and could be used in a way similar to shifting the formant frequencies as described by Siddiq (2015). A combination of these might be enough to generate variations from a limited set of source files.

For this to work, the components of the sound need to be identified. Then the tolerances for manipulating those components in frequency need to be found, up to the point where the result is still an accepted variation. This yields two variations from one source sound. Only slight processing might be needed: not so much that the result sounds too different and unnatural, but enough for it to be accepted as a variant. Since the source sounds should be similar enough to count as variations of each other, probably originating from the same recording, their structure should be very similar if not the same. This means the same processing can be applied to all of the source sounds, essentially giving many varied sounds in little time.
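A sketch of the kind of variant generator this paragraph implies, under assumed tolerances: a few broad FFT-domain gain bands stand in for an equalizer, followed by a slight resampling pitch shift. The band edges and the ±3 dB and roughly ±3 % ranges are illustrative guesses, not tolerances established by the thesis.

```python
import numpy as np

def eq_variant(sample, rate=44100, rng=None):
    """One EQ'd and slightly pitch-shifted variant of a footstep sample.
    Gains and shift are randomized within small, assumed tolerances."""
    rng = rng or np.random.default_rng()
    spec = np.fft.rfft(sample)
    freqs = np.fft.rfftfreq(len(sample), 1 / rate)
    # three broad bands: lows, mids, highs (edges are illustrative)
    for lo, hi in [(0, 250), (250, 2000), (2000, rate / 2)]:
        gain_db = rng.uniform(-3.0, 3.0)              # +/- 3 dB per band
        spec[(freqs >= lo) & (freqs < hi)] *= 10 ** (gain_db / 20)
    shaped = np.fft.irfft(spec, n=len(sample))
    factor = rng.uniform(0.97, 1.03)                  # ~ half a semitone
    new_idx = np.linspace(0, len(shaped) - 1, int(len(shaped) / factor))
    return np.interp(new_idx, np.arange(len(shaped)), shaped)

source = np.random.default_rng(2).standard_normal(4410)  # 100 ms placeholder
variant = eq_variant(source, rng=np.random.default_rng(3))
```

Because every variant is drawn from the same tolerance ranges, the identical processing chain can be reused on all source footsteps, which is the workflow advantage argued for above.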

If this pattern could be identified and simplified even further, to the point where only slight processing is needed, minor pitch shifting and slight equalization, then rather than the complex algorithms above, a simple algorithm might operate on these components alone within the tolerances, making it computationally feasible in current game engines.

Aim and Purpose

The aim of the study is to examine whether believable variants of footstep sounds can be created using frequency manipulation. A method for replicating footstep samples and processing them to sound unique is tested. The research question is: “Can convincing variations of footstep sounds for games be created using only frequency manipulation?”

Both trained and untrained listeners were allowed to participate in the test, since some gamers may be more sensitive to sound and more aware of the sound design than others. The main qualification was simply gaming experience. The results will be shown both divided into groups and as a total, but only to see whether the results are affected by trained listening. The “total” is the most essential part of the results, since the technique should be applicable for all kinds of gamers.

The purpose is to increase understanding of the sound design process and to improve workflow from a sound designer's perspective. With increased understanding, this could set templates for how to manipulate the frequency spectrum of certain sounds to create variants. Less time would be spent recording, editing and mixing, and more assets would be available for implementation, making it easier to work against strict deadlines.


It could also increase understanding from a technical, or programmer's, perspective. Breaking a sound down into spectral and temporal parts to understand its structure could make it easier to turn this method into an automated process that creates variation in real time within the game engine. Consider a footstep: in its motion there is a strike and a roll, and several things contribute differently to the sound, such as the different parts of the foot (heel and toe), the type of shoes, the weight and balance of the walker, and the surface. If all these parts can be identified, they can also be manipulated individually. A simple equalizer costs far less processing power than many of the existing real-time algorithms used to create variation.

Essentially, the purpose is to reduce repetitiveness in game sound, whether by creating a larger number of assets in a shorter time or, in the future, by manipulating frequencies through simpler real-time processing.

Method

A listening test was conducted in which subjects compared footsteps made with the traditional method for creating variation in footstep samples to footsteps made with the new method using frequency manipulation. The tests were conducted as a playing-listening test, where the subject listened to the two methods in an interactive, game-like environment. Subjects played two scenarios and evaluated the sound design in each. There were more sounds than footsteps in each environment, and subjects were asked about all the sounds (ambience and character sounds), even though only the footsteps changed. This was done to avoid directing subjects' attention to the footsteps, and to see whether both sets of footsteps were equally convincing in a game environment. The stimuli in the unprocessed version consisted of 8 unique footstep assets, while the stimuli in the processed version consisted of 2 unique “source” footsteps with 3 processed derivatives each.

After gathering the results, they were analyzed with statistical analysis in the form of binomial and chi-square tests, to confirm that they were not due to chance. However, interpreting the analysis and the descriptive data is the focus of this study; there is no need to run further statistical analysis beyond establishing significance.
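For reference, an exact two-sided binomial test on such counts can be computed directly; the sketch below follows the standard "sum all outcomes no more likely than the observed one" definition. Testing the abstract's 21-of-30 figure against chance (p = 0.5) is purely an illustration, not necessarily the exact test run in the thesis.

```python
from math import comb

def binomial_two_sided(k, n, p=0.5):
    """Exact two-sided binomial p-value: sum the probabilities of all
    outcomes no more likely than the observed count k."""
    probs = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    observed = probs[k]
    return min(1.0, sum(q for q in probs if q <= observed + 1e-12))

# e.g. 21 of 30 subjects giving the same answer, under a p = 0.5 null:
p_value = binomial_two_sided(21, 30)
```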

In this experiment, pitch shifting and equalization were the tools allowed. Exactly what each tool contributed to the overall resulting stimuli is outside the scope of this study. Both are ways to manipulate content in the frequency spectrum, and this is not a refined approach to making these alterations, but rather a proof that the method, and these tools, can be used by a sound designer to create variation. See it as a first step, where refinement and further experimentation with the tools and method is appropriate for further studies.


Construction of the Level

All non-sound-related assets used in the test level were created and published by Epic Games, Inc. through a pack named Infinity Blade: Grasslands. At the time of writing, the content was distributed for free through their marketplace. It was used to build a level from scratch in Unreal Engine 4 (Epic Games, Inc., 2012), using their first-person shooter template as a framework for character movement.

Since Epic Games, Inc. are professional game developers and the creators of the widely used Unreal Engine, using content created by them ensures that the visual assets meet the standards of professionals in the industry. This was done to ensure as high ecological validity as possible for the visual and graphical assets accompanying the audio, creating a convincing environment for the subject to interact with.

It is a simple level, a plateau, where the player's task is to press a button that starts the movement of an object, a ball or a cube. Which object the player encounters first is chosen at random, and the objects then alternate. While the object is moving, the player has to run to the next button, which triggers the footstep sounds. When the second button is activated, physics is enabled on the object, causing it to fall to the ground; the goal is to get the object to fall into the circle on the ground. Activating the third button restarts the game with the second object, creating an A/B, or cube/ball, comparison. The questions answered via a questionnaire relate either to the cube or the ball, or call for a comparison between the two. The subjects were allowed to answer the questions as they played and listened.

From the first button to the second, the player takes approximately 15-20 steps; from the second to the third, fewer than five. Completing this objective was not compulsory for moving around in the level or switching levels, but served to keep a certain interactivity available in the level.

The ambience sounds in the game remained unchanged between the two levels of the test. It was a stereo ambience put together with assets acquired from Renderosity (2015) called “Waves of the Mediterranean” published by the user ShaaraMuse3D, recorded by sound designer Gustaf Grefberg. The ambience sounds consisted of ocean waves, wind and birds. The assets were chosen specifically to fit the visual scenery.

The audio balance between the different parts of the ambience and the footsteps was set from a sound designer perspective, aiming for a balance comparable to a typical commercially produced game. The ambience level was then brought down a bit lower than might be heard in a commercial game, so as not to risk masking too much information in the footstep stimuli, but not so low as to make a poor balance. Two third-year audio engineering students who had previously attended a “sound for digital games” course were asked for their opinion on the audio level balance, and no changes had to be made. Some subjects in testing thought the footsteps were too loud; the ambience was the only other type of sound in the game, and thus their only point of reference for levels. The subjects were not told that this balance was intentional, chosen because the footstep sounds were the ones being evaluated and masking had to be avoided.

Choice & Preparation of the Stimuli

To preserve as much ecological validity as possible, samples from a well-known industry-standard sound library were used: the General Series 6000 Sound Effects Library (Sound Ideas, 2016). Using an industry-standard library ensures that the material is well recorded with a real foley artist and is commonly used in professional applications. All samples used from this library, and all files rendered later, were of CD-quality standard, i.e. 44.1 kHz, 16 bit.

The left- and right-foot samples were kept separate to increase the sense of realism, since biologically people do not walk with the same weight distribution and foot roll on the left and right foot. Four samples per foot of a person walking on stone with leather soles were chosen; these became the left- and right-foot source sounds. Since the chosen source sounds were supposed to sound like one coherent set of footsteps, it should not matter which of the samples were chosen to be processed and tested, as similar processing can be applied to all of them. The first sample of each foot was chosen for processing: Fs1_L.wav and Fs1_R.wav.

All eight footsteps were analyzed by passing them through Insight (Izotope, 2006) to see which frequency bands were the most active and whether any patterns could be seen between the footsteps; in short, to identify their similarities and differences. The primary tool used was the spectrogram, in both 2D and 3D mode. The point of analyzing the footsteps was to lay the groundwork and preparation for the processing that would result in the method to be tested.

The analysis showed that some bands were more consistent than others, and that the fundamental frequency shifted slightly between footsteps and never stayed exactly the same. This was important in order to gauge how much pitch shifting could be applied within the frame of realism. Biologically, no two steps carry exactly the same weight and roll, so to achieve realism these qualities need to change slightly between footsteps. A step with more weight put into it has a lower fundamental frequency than a lighter step, and a footstep with a weaker heel or toe hit will be less energetic in that respective spectral area. The resonance of the surface, and the body of the shoe that is worn, will of course also affect the resulting sound of a footstep.

When working with sound processing tools, especially those that alter the pitch, artifacts can become a problem as overtones get shifted, which can result in an unnatural-sounding product. From a mixture of listening to the pitched footsteps from a sound designer perspective, taking the artifacts into consideration and weighing in the spectral analysis, the pitch allowance was limited to ±40 cents. This is going quite easy on the pitching, but it was deemed sufficient.

However, in some cases it might be possible to pitch by around ±100 cents without the result sounding unnatural, depending on the surface it is intended for. This study focused on stone/concrete sounds; these surfaces are quite dense and give off quite a sharp transient. Too much pitching smeared the transient, making the footstep sound less snappy. On wood, grass or softer surfaces, more pitching might be applied without losing important aspects of the sound, and in some cases the alterations from heavy pitch shifting might even be desirable. Losing too much of the transient on a stone surface, however, is not.
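The cent values above map to frequency ratios via the standard relation ratio = 2^(cents/1200); a minimal helper makes the size of the ±40 cent window concrete:

```python
def cents_to_ratio(cents):
    """Convert a pitch offset in cents to a frequency (or playback-rate) ratio."""
    return 2.0 ** (cents / 1200.0)

# The +-40 cent allowance corresponds to roughly a +-2.3 % frequency change,
# while +-100 cents (a full semitone) would be close to +-6 %.
lo, hi = cents_to_ratio(-40), cents_to_ratio(40)
semitone = cents_to_ratio(100)
```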

Processing

The footstep sounds in the unprocessed version of the game consisted of four unique and separate sounds for each foot. Shown below:

Table 1: Footstep sound assets in the unprocessed version of the test.

    Left Foot    Right Foot
    Fs1_L        Fs1_R
    Fs2_L        Fs2_R
    Fs3_L        Fs3_R
    Fs4_L        Fs4_R

From each chosen source footstep sample, Fs1_L and Fs1_R, three additional samples were created. In the processed version of the game, the footstep sounds consisted of one source sound and its three derivatives per foot, still totaling four samples per foot, though these were manipulated rather than completely unique samples. Shown below:

Table 2: Footstep sound assets in the processed version of the test.

    Left Foot    Right Foot
    Fs1_L        Fs1_R
    Fs1_L_p1     Fs1_R_p1
    Fs1_L_p2     Fs1_R_p2
    Fs1_L_p3     Fs1_R_p3


The pitch shifting, first in the signal chain, was applied with Reaper's (Cockos, 2016) built-in pitch function, and each source sound was pitched individually within the aforementioned limits. The next step in the signal chain was the equalizer, where Pro-Q (FabFilter Software Instruments, 2013) in zero-latency mode was used for its spectral transparency. All of the processing was applied from a sound designer perspective, where how it sounded mattered more than the numbers themselves, within the limitations that were set.

Complete processing chart for each footstep:

Table 3: Processing chart for footstep stimuli

    Footstep        Pitch (cents)  EQ Low               EQ Mid                EQ High-Mid             EQ High
    Fs1_L (Source)  -              -                    -                     -                       -
    Fs1_L_p1        -40            92 Hz, +2.5 dB, Q 1  909 Hz, -2.5 dB, Q 1  2950 Hz, +2.8 dB, Q 1   7238 Hz, +3.9 dB, Q 1.7
    Fs1_L_p2        +35            83 Hz, -2.6 dB, Q 1  648 Hz, -5 dB, Q 1    4041 Hz, +2.8 dB, Q 0.8 -
    Fs1_L_p3        -20            83 Hz, -2.6 dB, Q 1  888 Hz, +3.3 dB, Q 1  -                       8621 Hz, -3 dB, Q 1
    Fs1_R (Source)  -              -                    -                     -                       -
    Fs1_R_p1        -35            88 Hz, +2.9 dB, Q 1  809 Hz, -1.5 dB, Q 1  2950 Hz, +2.4 dB, Q 1   8325 Hz, +3 dB, Q 1
    Fs1_R_p2        +20            83 Hz, -2.6 dB, Q 1  878 Hz, -3.5 dB, Q 1  -                       9687 Hz, +1.9 dB, Q 1
    Fs1_R_p3        -20            83 Hz, -2.6 dB, Q 1  888 Hz, +3.3 dB, Q 1  5667 Hz, -4.6 dB, Q 1   12664 Hz, +2.1 dB, Q 1.2

The frequency bands to process were primarily chosen from a sound designer perspective, with the focus on how the result sounded, though a lot of insight was gained from analyzing the raw footsteps. As previously mentioned, a footstep sound has different parts: the heel, the toe, the friction, etc. These laid the groundwork of knowledge for processing the sounds from a sound designer perspective. To explain further, the weight that the walker puts into the step was altered by processing the lower frequencies. The heel was often most present in the lower mid frequencies, with the toe following in the high mids. The friction could be observed in the higher frequencies; it seemed quite random from step to step and was most likely determined by how the walker dragged or lifted the foot while walking.
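Pro-Q's exact filter design is proprietary, but the kind of peaking band listed in Table 3 can be approximated with the standard RBJ "Audio EQ Cookbook" biquad. This sketch uses the first Fs1_L_p1 band (+2.5 dB at 92 Hz, Q 1) purely as an example; it is not a claim about Pro-Q's internals.

```python
import cmath
import math

def peaking_eq(fs, f0, gain_db, q):
    """RBJ 'Audio EQ Cookbook' peaking-filter coefficients (b, a), a0-normalized."""
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b = [1 + alpha * a_lin, -2 * math.cos(w0), 1 - alpha * a_lin]
    a = [1 + alpha / a_lin, -2 * math.cos(w0), 1 - alpha / a_lin]
    return [x / a[0] for x in b], [x / a[0] for x in a]

def gain_at(b, a, fs, f):
    """Magnitude response (in dB) of the biquad at frequency f."""
    z = cmath.exp(-2j * math.pi * f / fs)
    h = (b[0] + b[1] * z + b[2] * z * z) / (a[0] + a[1] * z + a[2] * z * z)
    return 20.0 * math.log10(abs(h))

# One band from the Fs1_L_p1 chain: +2.5 dB at 92 Hz, Q = 1
b, a = peaking_eq(44100, 92.0, 2.5, 1.0)
center_db = gain_at(b, a, 44100, 92.0)  # boost at the center frequency
```

A useful property of this design is that the gain at the center frequency equals the requested dB value exactly, while the response returns to unity far from the band.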


Testing conditions

Testing was conducted at different locations on a laptop. Since playing is usually done from the comfort of one's own home, most tests were conducted at the subjects' homes: 11 were conducted in a small room at LTU, while 19 were conducted in 12 different living rooms in Stockholm. Background noise was of course minimized as much as possible. All of the subjects used the same set of headphones, a brand-new pair of Sennheiser HD-25. The subjects were allowed to answer the questions while they were playing and listening. They were allowed to set the volume at a level they were comfortable with and adjust it throughout the test, which should not be a problem since they played the level several times. A wired mouse was used for camera control in the game, as is usual in an FPS environment.

Before testing, all subjects were given the same instructions: a brief explanation of the game mechanics, such as movement, using buttons and controlling the camera. They were then instructed about the task of the game and told that the questions would be about the sound design in the game. It was mentioned that the object had no sound associated with it touching the ground.

They were told that the experiment and questions were about the ambience and character sounds, and were given brief descriptions of what these types of sounds are. Definitions of character sounds and ambient sounds were also included on the questionnaire, which additionally stated that their participation was completely anonymous and that there were no “right” or “wrong” answers. The full version of the original questionnaire (in Swedish) can be found in Appendix B.

In the questionnaire, the subjects were asked to rate how much difference they heard in the sound design when the object was a cube compared to when it was a ball. They were also asked to rate the believability of the character sounds and ambience sounds for each object. For questions 1, 3, 4, 5 and 6, the subjects were given the option to comment on and describe the difference they perceived in the sound. There was no time limit for the test, but most subjects were done within approximately 15-20 minutes of listening.

Subjects

All of the subjects had to have gaming experience, to ensure that they were comfortable in a gaming environment. Whether they were very experienced or less experienced gamers is not important, as the method applies to everyone with any gaming experience. The in-game task was very simple and required no gaming skill to perform well, since the evaluation only concerned the sound.

What could affect the results is rather their listening experience, so it was noted whether they had studied or worked with sound. However, as previously mentioned, this thesis is not about comparing the two groups; they are divided only to see whether the results are affected by testing trained listeners.

It is also noteworthy that the subjects knew the experiment involved a listening test, as they were told beforehand that it was a listening test about sound design in games. The subjects were therefore actively listening for differences between the two scenarios, while an unsuspecting listener/gamer might notice even less of a difference if not informed that the experiment was about sound, or if the method were implemented in a real commercial game.

The subjects were recruited from different places: 11 were students at Luleå Tekniska Universitet, while 19 were spread over Stockholm with different backgrounds and occupations, 5 of whom were students. All of the subjects self-selected under the condition that they had gaming experience. In total, 12 trained listeners took part in the test: 10 audio engineering students at LTU and 2 audio engineers in Stockholm, one of whom had experience in sound design.


Results

In this section the results are presented, reflecting answers from all subject groups: trained listeners, untrained listeners and all results together. Due to the nature of the questions, some results may appear uninformative before analysis, e.g. subjects who heard a difference but preferred the processed version, or subjects who expressed that they heard a difference between the two versions but only in the ambience, where there was no difference; these cases are discussed later.

In the descriptions below, the Ball represents the unprocessed version of the game, while the Cube represents the processed version of the game.

For the complete questionnaire in Swedish, see Appendix B. For a complete sheet of raw data, see Appendix C.

Question 1: Perceived Difference

Presented below are the results for the question “How much difference did you hear in the sound design when the object was a ball opposed to a cube?”, rated on a scale from 1 “no difference” to 5 “big difference”.

Table 4: Question 1 answers, divided into groups

    Answer               Trained Listener  Untrained Listener  Total
    1 – No difference           5                 14             19
    2                           5                  4              9
    3                           1                  0              1
    4                           1                  0              1
    5 – Big difference          0                  0              0

    Average (score): 1.47

[Figure: Question 1 answers, divided into groups]


Question 1: Subject Comments

Below is a selection of comments from 6 different subjects who marked they heard a difference on the first question (with a rating higher than 1), translated to English. Some comments mention the footsteps specifically, others make no mention of the footsteps.

Footstep specific comments:

“When the object was a ball the bird was clearer, and it sounded like I walked on my toes. When the object was a cube it sounded more like I walked on my heels.”

“Ball: footsteps sound more believable. Cube: footsteps sounds like a horse.”

“A bit 'heavier' footsteps with the Cube.”

“Small difference. Ball sounds more trebly and clear.”

Other comments:

“There's a small difference in the wind sound.”

“More water sounds when it was a Ball.”


Question 2: Preferred Version

Presented below are the results for the three-choice question “Did you prefer the sound design when the object was a ball or a cube?”. The subjects could choose from “Ball”, “Cube” and “No Preference”.

Figure 2: Question 2 answers, divided into groups.



Questions 3 & 4: Character Sounds Believability

Presented below are the results for question 3 “How believable/convincing were the character sounds when the object was a silver ball?” and question 4 “How believable/convincing were the character sounds when the object was a golden cube?”. In both questions they rated their answer from 1 “Not believable” to 5 “Believable”.

Table 5: Ratings of character sound believability. Q3 represents the Ball and Q4 the Cube.

    Answer              Q3 Trained  Q3 Untrained  Q3 Total  Q4 Trained  Q4 Untrained  Q4 Total
    1 – Not believable      0            1            1         0            0            0
    2                       1            2            3         2            3            5
    3                       3            5            8         3            5            8
    4                       4            8           12         3            8           11
    5 – Believable          4            2            6         4            2            6

    Averages (score):     3.91         3.44         3.63      3.75         3.5          3.6

Figure 3: Ratings from Question 3, believability, divided into groups. Question 3 represents the Ball.

Figure 4: Ratings from Question 4, believability, divided into groups. Question 4 represents the Cube.



Questions 3a & 4a: Character Sounds Realism

Presented below are the results for question 3a “How realistic were the character sounds when the object was a silver ball?” and question 4a “How realistic were the character sounds when the object was a golden cube?”. In both questions they rated their answer from 1 “Not realistic” to 5 “Realistic”.

Table 6: Ratings of character sound realism. Q3a represents the Ball and Q4a the Cube.

    Answer             Q3a Trained  Q3a Untrained  Q3a Total  Q4a Trained  Q4a Untrained  Q4a Total
    1 – Not realistic       0             0             0          0             1             1
    2                       3             4             7          6             3             9
    3                       3             6             9          1             8             9
    4                       4             6            10          3             4             7
    5 – Realistic           2             2             4          2             2             4

    Averages (score):     3.42          3.33          3.37       3.08          3.17          3.13

Figure 5: Ratings from Question 3a, realism, divided into groups. Question 3a represents the Ball.



Questions 3 & 4: Subject Comments

Below is a selection of comments from 5 different subjects on comment fields 3b and 4b, translated to English. The instruction for 3b and 4b was “If it was not believable, or if something stood out, describe what and why”. The selected comments are from subjects who marked in question 1 that they heard a difference and who rated questions 3 and 4 differently, indicating that they heard a difference in the character sounds.

“3b: It sounds like I am walking on my toes, which I wouldn't with shoes like that. I missed other character sounds like breathing.

4b: Sounds more like I am walking on my heels, which felt more appropriate, but still missing other character related sounds.”

“3b: Sounds like a horse shoe, but not as much as with the cube.

4b: Footsteps don't seem 'believable'.”

“3b: I heard a rhythm in the footsteps, which made it sound like the same five footsteps on repeat.

4b: Two steps, one-two, on repeat. Sounded good at first but felt weird after a while.”

“3b: The rhythm was good, but slightly too bright shoe heel sound.

4b: Sounded a bit like the character walked on a wet surface.”

“3b: The footsteps are too loud, and sound too close.

4b: Pretty much the same as with the ball.”

Questions 5 & 6: Ambient Sounds Believability

Presented below are the results for question 5 “How believable/convincing were the ambient sounds when the object was a silver ball?” and question 6 “How believable/convincing were the ambient sounds when the object was a golden cube?”. In both questions they rated their answer from 1 “Not believable” to 5 “Believable”.

Table 7: Ratings of ambient sound believability. Q5 represents the Ball and Q6 the Cube.

    Answer              Q5 Trained  Q5 Untrained  Q5 Total  Q6 Trained  Q6 Untrained  Q6 Total
    1 – Not believable      0            0            0         0            0            0
    2                       1            1            2         1            1            2
    3                       1            4            5         1            4            5
    4                       6            9           15         5            9           14
    5 – Believable          4            4            8         5            4            9


Questions 5a & 6a: Did the Ambient Sounds Fit The Environment

Table 8: Whether the ambient sounds fit the environment. Q5a represents the Ball and Q6a the Cube.

    Answer  Q5a Trained  Q5a Untrained  Q5a Total  Q6a Trained  Q6a Untrained  Q6a Total
    Yes          9            16            25          9            16            25
    No           1             1             2          1             1             2

Questions 5b & 6b: Ambient Sounds Realism

Table 9: Ratings of ambient sound realism. Q5b represents the Ball and Q6b the Cube.

    Answer             Q5b Trained  Q5b Untrained  Q5b Total  Q6b Trained  Q6b Untrained  Q6b Total
    1 – Not realistic       0             0             0          0             0             0
    2                       1             1             2          1             1             2
    3                       2             4             6          3             4             7
    4                       6             9            15          5             9            14
    5 – Realistic           3             4             7          3             4             7

Analysis

Question 1

Statistical analysis was done on question 1, as it is the most important question, and only to ensure that the results are not completely random. The analysis and interpretation of the descriptive statistics is the focus of this thesis and by far the most important part of this test in comparison to the statistical analysis.

A Chi-square test was conducted on the results from question 1 using QuickCalcs (GraphPad Software, 2016). It showed that the result is statistically significant: Chi-square equals 44.000 with 4 degrees of freedom, and the two-tailed P value is less than 0.0001. The analysis was made without excluding the subjects who marked that they heard a difference but located it in the ambient sounds rather than the character sounds.
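The reported statistic can be reproduced from the Table 4 totals under the (assumed) null hypothesis that the five rating categories are equally likely; the closed-form survival function below is valid specifically for 4 degrees of freedom:

```python
import math

# Observed answer counts for question 1 (ratings 1-5): 19, 9, 1, 1, 0
observed = [19, 9, 1, 1, 0]
expected = [sum(observed) / len(observed)] * len(observed)  # uniform: 6 each

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1  # 4 degrees of freedom

# Chi-square survival function P(X > chi2) reduces to a closed form for df = 4
p = math.exp(-chi2 / 2) * (1 + chi2 / 2)
```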

If question 1 is instead treated as “Did the subject hear a difference, yes or no?”, the two listeners who marked that they heard a difference in the ambience sounds, but not in the character sounds, would need to be excluded from the analysis, or counted as hearing no difference, for a Chi-square test to give statistically significant results. However, if the test is run on the results from untrained listeners alone, it is statistically significant. This indicates that untrained listeners were less likely than trained listeners to hear a difference between the two versions.


A binomial test of the same yes/no question, where the number of successes is 21 (the 19 subjects who heard no difference plus the 2 who heard a difference, but not in the character sounds) out of 30 trials with a probability of success of 0.5 (50 %), gives a two-tailed P value of 0.0428.
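The two-tailed value can be reproduced with an exact binomial test; since the chance level is 0.5 the distribution is symmetric, so the two-tailed P is simply twice the upper tail:

```python
import math

# Two-tailed exact binomial test: 21 "no difference" outcomes in 30 trials
# against a 50 % chance level.
n, k, p0 = 30, 21, 0.5

def binom_sf(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

p_two_tail = 2 * binom_sf(n, k, p0)  # symmetric because p0 = 0.5
```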

The conclusion of this statistical analysis is that the results of question 1 are statistically significant, meaning they are unlikely to be due to chance. This indicates that the majority of people did not hear a difference between the processed and unprocessed versions of the test.

Interpreting the Data

This section will interpret the data collected from the tests. Specific statistical analysis methods such as Chi-square tests and binomial tests are provided in the analysis-section and only included for question 1, which is the most important. Interpretation of the other questions will be done with the help of the descriptive statistics derived from the raw data.

Question 1: The subjects seem to have had a hard time differentiating the processed version from the unprocessed version. Untrained listeners were less likely to hear the difference than trained listeners, but there were also subjects who heard differences where there were none. Of the 11 subjects who heard a difference, 2 did not perceive any difference in the character sounds, only in the ambience sounds. If those are excluded, 4 of the 9 remaining subjects who heard a difference showed a preference for the cube, i.e. the processed version. This leaves 5 subjects who heard a difference and did not show a preference for the processed (cube) version in question 2.

Subject comments

One of the subjects who preferred the cube motivated it by saying the footsteps sounded like the character walked more with the heels in the processed version, and more on the toes in the unprocessed version. This could be interpreted as the footsteps sounding a bit “heavier” and transient rich in the processed version, since the heel often is the heavy initial hit of the foot in a walking motion. This could be similar to another subject who didn't have any preference of sound design, who wrote that the footsteps sounded heavier when the object was a cube.

Another subject who preferred the cube mentioned that the heel sounded a bit too bright or strong in high frequencies, while the cube sounded more like the character walked on a wet surface. Similar answers are seen in two other subjects where one preferred the ball and one had no preference, both rating question 3 and 4 differently, showing that they heard a difference in the character sounds. They both referred to the sound design being brighter with the (unprocessed) ball version.


One interesting comment from a subject was (translated to English): “When it is a silver ball it sounds more like you're in the game.” The same subject rated the realism of the character sounds in 3a one point higher than in 4a and commented “3b: It doesn't really sound like shoes when you walk. 4b: Same thing, they did not sound very realistic.” This reads as a single answer spanning both questions, and no other subjects mentioned immersion, envelopment or anything similar that relates to this subject's comment on question 1.

Question 2: There is not much more to say than that the majority of subjects had no preference in sound design; they show no particular preference for either the cube or the ball. This is probably due to the high rate of people not hearing any difference between the two versions. It can be interpreted as a good result given the aim and purpose of the experiment, since the method is supposed to create variants believable enough to be compared with the traditional method of creating variation in sound assets.

Questions 3 & 4: The unprocessed character sounds (Q3) had a slightly higher total average rating of believability.

Subject comments

In comment fields 3b & 4b, some comments mentioned the type of shoes as something that stood out. Note that these opinions are not specific to one of the versions, but concern the footstep sounds in general. One subject thought the footsteps sounded like slippers on a wooden floor, while another thought they sounded like dress shoes (which they might be, since the samples are of leather-soled shoes). Another subject felt like he was walking on metal plates, and a fourth thought it sounded like high-heeled shoes.

Among these comments, some subjects mention the footsteps being too loud. As described previously, the ambience was the only other point of reference for audio levels, and it was dampened in volume to avoid masking the footsteps. Subjects who mention that the footsteps sound too close might mean the same thing.

One very interesting set of comments was collected from one questionnaire. The subject described hearing a rhythm in the footsteps when the silver ball was the object, as if there were only five footstep sounds. Since the game engine randomizes the order in which the footsteps are played, hearing a consistent rhythm for a longer period of time is unlikely, but the subject was not far off from the number of sound assets used for that version (8). The follow-up comments on the (processed) cube version were even more interesting. The subject described once again hearing a rhythm, but this time one consisting of only two footsteps, as in “one-two, one-two”. After finishing the test, the person described it as more of a “marching” rhythm. It is interesting that no one else said anything similar; it raises the question of whether this person is simply more sensitive to detecting specific rhythms than the other subjects, or whether the others did not think about it rhythmically.

Questions 5 & 6: The subjects rated the believability of the ambience sounds very similarly for the most part; only one point differentiates the two versions. Since there was no difference at all between the ambience in the processed and unprocessed versions, the results seem reasonable. The subjects who expressed hearing any kind of difference between the ambiences most likely still rated them equally, since the difference they experienced must have been so small.

The subquestions of questions 5 and 6 (a & b) show that 25 subjects agreed that the ambience fit the visual environment, 2 disagreed, and 3 gave a blank answer. Similarly to the main questions, the realism ratings of the processed and unprocessed versions are very close, differing by only one point.

Subject comments

Some of the comments provided by subjects in 5c and 6c give insight on what people thought stood out with the ambience sounds. A few of the subjects mentioned that the waves from the ocean sounded closer and more active than they looked. One subject described them as aggressive. However, a number of subjects said that the water sound was good.

A couple of listeners, almost exclusively trained ones (only one untrained listener), mentioned the bird sound as something that stood out, due to it not being spatialized and thus moving with the listener, always staying in the same place no matter how the character turned. It is interesting that trained listeners noticed and mentioned this as odd, while untrained listeners did not seem to notice or mention it, even though they outnumbered the trained listeners by almost two to one. However, this falls outside the scope of the study.


Discussion

What is known after this study is that, in this game context, subjects seem to have a hard time identifying which version uses the processed, frequency-manipulated variants and which uses the unique, separate footstep foley; this is especially true of untrained listeners. Trained listeners seemed more likely to hear a difference, but to most it did not seem like an obvious one. Since a few subjects identified the ambience as being different, and some gave no real indication of exactly what differed, only rating one version slightly less believable or realistic than the other, this could be an effect of placing them in a listening test.

However, there were also some untrained listeners who heard slight differences and motivated them in different ways, such as the surface sounding different in one version compared to the other.

As previously mentioned, this was tested through a listening test that the subjects were aware of. An unsuspecting gamer in a real gaming scenario might not perceive any difference at all, while the difference might be more obvious in a critical listening scenario where subjects listen to only the footsteps of the two versions and compare them.

What is not known from this study is the influence of other foley and more detailed ambience, and whether the method applies to all types of shoes and walking. For example, would footsteps created by this technique sit well with other types of foley, such as clothing, scuffs, armor and rustling? Perhaps; the added layers might even hide the processing further and make it seem more varied. The same goes for a more detailed sound design in general, including randomized ambient spot sounds and sounds associated with objects in the game world. Would a dynamic environment, with all its detailing, make the created footsteps sound even more varied than they actually are? This test was also quite simple in terms of the number of assets being tested. If implemented in a commercial game, the same method would be applied (either in pre-processing or real time) to all of the footstep sounds, essentially multiplying the footstep assets available in the game. In this test, 8 unique footstep sounds were used for the unprocessed version of the game, while 2 sources with 3 derivatives each were used in the processed version, again totaling 8 footsteps. If the processing were applied to all of the available source footsteps, there would be a total of 32 footstep sounds available in-game. Having this many processed footsteps at once has not been tried, but it should not affect the foley sound design other than making it more varied.
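To keep such a multiplied bank from exposing its processed origins, selection at playback time can avoid drawing two derivatives of the same source back to back. The sketch below, with hypothetical asset names, is an illustration of that idea, not the randomization actually used by the test's game engine:

```python
import random

def make_footstep_picker(bank, seed=None):
    """Return a picker that never plays two derivatives of the same
    source footstep back to back. `bank` maps source name -> variant list."""
    rng = random.Random(seed)
    last = [None]  # most recently used source, held in a closure

    def pick():
        # Exclude the last-used source unless it is the only one available.
        sources = [s for s in bank if s != last[0]] or list(bank)
        src = rng.choice(sources)
        last[0] = src
        return rng.choice(bank[src])

    return pick

bank = {
    "Fs1_L": ["Fs1_L", "Fs1_L_p1", "Fs1_L_p2", "Fs1_L_p3"],
    "Fs2_L": ["Fs2_L", "Fs2_L_p1", "Fs2_L_p2", "Fs2_L_p3"],
}
pick = make_footstep_picker(bank, seed=1)
steps = [pick() for _ in range(20)]
```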

Other types of shoes and walkers will affect which bands are the most efficient to process. These would have to be examined further to find appropriate bands, which are not yet known but can be found through analysis and testing. It is also not known whether this method would work on movement such as backwards and sideways walking; structurally it should be very similar, but in the frequency spectrum it might be very different.


It is also not known whether this method would work on other types of surfaces. Technically it should, since other surfaces still have a similar structure; the difference is that the resonance of the surface, and the transient, might differ. If the same thing were to be done for a wooden surface, for example, it might be best to process the base of the footstep and keep the resonance and/or creak of the wooden floor as separate, separately randomized assets, to avoid processing the whole tail. What works best will have to be tested and experimented with. If a similar method were applied to footsteps on a field of high grass instead, the focus might lie a little less on the transient and more on slight manipulation of the higher frequencies, since in that case less of the actual feet and walker is heard and more of the surface.

To improve the method in general, and to make it more usable in practice, a good start would be to build templates out of the structures: analyzing more recordings of other types of shoes, surfaces and steps. The most common combinations could be done first: sneakers and boots on wood, grass and gravel. Other sounds with a simple structure could also be researched and tested. Impact and collision sounds in general are heard and repeated frequently in games. A player in a non-linear game might repeatedly drop a plate on the floor, or punch something; at that point the collision and impact sounds are bound to repeat, and a similar method and system might work for any sound with a simple structure.

What do these results mean for the sound designer's workflow? They suggest that there are other ways to create variants of footstep sound assets than recording and mixing large amounts of foley. If the method is refined and structured into templates for how the processing is done for different types of shoes and surfaces, existing source footstep assets could be multiplied in a short amount of time. For every footstep source asset that is recorded and mixed, at least three derivatives can be created from it using frequency manipulation. These could be processed in the pre-processing stage of production and then implemented in the sound bank of the game engine.
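The equalization side of the pre-processing step could look something like this offline sketch, which boosts or cuts one frequency band via an FFT to derive variants of a source recording. The band limits and gain amounts are placeholders, not values from the thesis, and a production tool would more likely use conventional parametric EQ filters:

```python
import numpy as np

def band_gain(samples, sr, lo_hz, hi_hz, gain_db):
    """Apply a gain to one frequency band via FFT masking (offline use only)."""
    spec = np.fft.rfft(samples)
    freqs = np.fft.rfftfreq(len(samples), 1.0 / sr)
    mask = (freqs >= lo_hz) & (freqs <= hi_hz)
    spec[mask] *= 10.0 ** (gain_db / 20.0)
    return np.fft.irfft(spec, n=len(samples))

def derive(samples, sr):
    """Three EQ-based derivatives of one source footstep (placeholder settings)."""
    return [
        band_gain(samples, sr, 2000, 6000, +3.0),  # brighter transient
        band_gain(samples, sr, 80, 300, +2.0),     # heavier body
        band_gain(samples, sr, 2000, 6000, -3.0),  # duller transient
    ]
```

Running `derive` over every mixed source asset and exporting the results would populate the sound bank with the multiplied variants.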

The results also suggest that this might be viable for real-time processing, which would speed up the workflow even further, since the tools used are not very demanding on the CPU. As mentioned previously, if the sound designer can set the processing thresholds to fit the type of surface and shoe during the implementation stage, variations could be created directly in the game engine. This would allow a large number of source sounds to be processed into variants of themselves in real time. To improve further on this idea, a smart randomization system could be put in place to ensure that two derivatives of the same source sound are never played directly after each other. A system that utilizes frequency manipulation in the form of pitch shifting and equalization would be easy to implement in
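The "smart randomization" idea, never playing two derivatives of the same source back to back, is simple to sketch independently of any particular engine. `FootstepRandomizer` and its bank layout are assumptions made for illustration:

```python
import random

class FootstepRandomizer:
    """Pick footstep variants at random, but never trigger two derivatives
    of the same source sound consecutively (hypothetical in-engine logic)."""

    def __init__(self, bank):
        # bank: mapping of source id -> list of variant asset names
        self.bank = bank
        self.last_source = None

    def next_step(self):
        # Exclude the source used for the previous step, then pick freely.
        choices = [s for s in self.bank if s != self.last_source]
        source = random.choice(choices)
        self.last_source = source
        return random.choice(self.bank[source])
```

Most audio middleware offers comparable "avoid repetition" options on random containers, so this logic could likely be configured rather than coded.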
