
Object Tracking with Iphone 3Gs

Lars Alin

May 25, 2010

Master’s Thesis in Computing Science, 30 credits

Supervisor at CS-UmU: Ola Ågren

Examiner: Per Lindström

Umeå University

Department of Computing Science

SE-901 87 UMEÅ


Abstract

In June 2007 Apple Inc. released the Iphone smartphone. It was a groundbreaking success that set a new standard for what a smartphone should be able to do. Apple has improved the Iphone every year since then, and the 3Gs is the newest model. As the hardware and software of the phones have improved, so have the applications. The Iphone 3Gs makes it possible to use the camera view as an application background and thereby to analyze the surroundings, which in turn makes it possible to track objects that the phone is pointed towards.

This thesis examines how object tracking can be implemented in applications for the Iphone 3Gs and surveys four areas of use that have been implemented in Xcode: an augmented reality car game, a letter tracking application, a face recognition application and an object recognition application.


Contents

1 Introduction
1.1 Task
1.2 Iphone 3Gs
1.3 Augmented Reality and Tracking
1.4 Outline of the Thesis

2 Problem Description
2.1 Problem Statement
2.2 Purposes
2.3 Methods
2.4 Related Work

3 Tracking in Handheld Devices
3.1 Introduction
3.2 Algorithms
3.2.1 Markertracking
3.2.2 Edge detection
3.2.3 Mean-shift algorithm
3.2.4 Parallel Tracking and Mapping
3.3 Fields of interest

4 Accomplishment
4.1 Preliminaries
4.2 How the Work was done
4.2.1 The Preparing and Designing phase
4.2.2 Early Development phase
4.2.3 Development of Tracking
4.2.4 Completion
4.3 Conclusions

5 Results
5.1 Tracking
5.2 Augmented Reality Car Game
5.2.1 Icon and Menus
5.2.2 Game Mode
5.3 Object Tracking
5.3.1 Icon and Menus
5.3.2 Object recognition
5.3.3 Letter recognition
5.3.4 Face recognition

6 Conclusions
6.1 Limitations
6.2 Future Work

7 Acknowledgements

References

A Concept sketches
B Lo-Fi
C Interactive prototypes


List of Figures

1.1 Different companies' share of the smartphone market worldwide in percent
1.2 The front and the camera on the back of an Iphone 3Gs

3.1 An Iphone using a mean-shift algorithm tracking an orange object
3.2 An early marker used by the ARToolKit
3.3 An illustration of how camera angle and marker angle are mapped
3.4 Edge detection on calculator and pen
3.5 Histogram of a normalized colorspace
3.6 Parallel tracking and mapping

4.1 Preliminary time chart on the project

5.1 Cargame icon
5.2 Cargame splashscreen
5.3 Screenshot from the game played on a desk at North Kingdom
5.4 Icon to the application: What?What?
5.5 Splashscreen and menu of the application
5.6 Tracking of the Apple logo on a MacBook
5.7 The letter tracking function in progress
5.8 Face tracking in progress

A.1 Concept sketch on the car game, before it was reduced to 2D
A.2 Concept sketch on the letter reading application

B.1 Lo-fi sketches on possible ways to steer the car
B.2 Lo-fi sketches on possible ways to steer the car

C.1 HiFi prototype to test the usability of a spinning steering wheel with gas and brake pedals
C.2 HiFi prototype to test the usability of a steering cross


Chapter 1

Introduction

With the technical progress of smartphones today, designers and developers of software suited for these smartphones strive to push the edge of what is possible to create. One of these fields, for which the technical progress is essential for development, is augmented reality. Augmented reality (AR) is a term for merging computer generated material into the physical world in real time, see section 1.3.

AR applications can be created in two ways. In the first, the application does not consider its surroundings and merges digital objects into the view irrespective of what is displayed in the physical world. In the second, the application reacts to what is displayed in the physical world and merges it with appropriate digital objects. In order to match digital objects with physical objects, the physical world has to be analyzed. This is where object tracking comes into play, and this is the kind of augmented reality that this thesis is built around. Tracking makes it possible for the computer, which in this thesis is the Iphone 3Gs, to find and identify objects and then react to different situations.

There are many reasons for choosing the Iphone 3Gs as the device. First of all, it contains all the hardware needed to run this kind of application, see section 1.2. In addition, there is the hype around and the continuous growth of the smartphone market. In just a couple of years the Iphone has taken approximately 17.8 percent of the worldwide smartphone market, measured in the third quarter of 2009, see Figure 1.1.

Figure 1.1: Different companies' share of the smartphone market worldwide in percent [1]

1.1 Task

The task is to investigate how far AR and object tracking can be taken with the Iphone 3Gs. The main goal is to produce software that shows the possibilities, in the form of an augmented reality car game and a tracking application that both tracks objects and performs some recognition of what is tracked.

The work is carried out in collaboration with North Kingdom, a digital creative agency from Sweden with its main locations in Stockholm and Skellefteå. North Kingdom provides digital storytelling in innovative ways to provide clients with digital media [2]. The assignment provided by North Kingdom is at the moment not part of their ordinary area of work but, as they strive to be at the forefront of development, investigations like this one are essential to push the limit of what they can offer their clients even further.

1.2 Iphone 3Gs

Iphone 3Gs is the third version of Apple's praised smartphone. It contains features such as a touch screen, voice control, accelerometers, a proximity sensor, an ambient light sensor, Wi-Fi, a digital compass, GPS and more. For use as a tool for augmented reality and tracking, the key features and limitations of the Iphone are:

– Camera. The smartphone is equipped with a 3 megapixel camera with autofocus and a frame rate of 30 frames per second [3]. A limitation is that no flash is provided, which limits the use to areas that are already well lit.

– 3.5 inch multi-touch display. The screen has a 480-by-320-pixel resolution, which enables the user to interact easily with the phone [3]. Because of the widescreen format, the camera view has to be scaled up by a factor of 1.3 in order to fill the whole screen.

– Processor. The Iphone 3Gs has a 600 MHz CPU and 256 MB of RAM, which contribute to a fast and powerful handheld device [4].

– Iphone SDK. At the time of writing, the newest version of the Iphone SDK is 3.1.2 [3]. This update makes it possible to take a screenshot of the display ("printing the screen"), so that the screenshot can be analyzed. A huge limitation of this version of the SDK is that it is impossible to get access to the raw data stream from the camera, neither through the SDK nor through any workaround supported by Apple.

1.3 Augmented Reality and Tracking

Although this thesis focuses on the tracking part of augmented reality, there is a need to go a little deeper into what AR actually is. Augmented reality is a term first coined in the 1990s and, as stated in the introduction, the most commonly used description is that AR is digital objects merged into the physical world [5]. The technology is traditionally used to enhance the physical world, providing the user with information and assistance in the field where it is used [5].

Some fields where AR has been implemented throughout the years:

– Military. Aircraft pilots use head-mounted displays to help navigation [5]. Surveys have also been done regarding the use of AR in military operations in urban terrain [6].

– Healthcare. AR is used, for example, to create live scenarios in simulators where surgeons can develop their skills [7]. It could also be used in real surgeries to assist the doctor [5].

– Entertainment. The entertainment industry has also adopted this technology. The idea of having digital creatures in the real world can be found in several games, such as ARhrrrr - An augmented reality shooter [8].

In order to map digital objects to the physical world, some kind of analysis of the world has to be done. This is usually performed using some kind of tracking algorithm. An in-depth study of how the algorithms are implemented and how they work in handheld devices is presented in Chapter 3.

1.4 Outline of the Thesis

– Chapter 2 presents a detailed view of the task. In this chapter the task is stated and the purpose of the task is defined. It also contains an overview of the methods used when conducting this thesis and a look at what has already been done within this subject.

– Chapter 3 presents a theoretical study on tracking in handheld devices.

– Chapter 4 presents the preliminary timeframe and what was planned to be done. It also presents a detailed description on how it was actually done and ends with a comparison of planned and actual activities.

– Chapter 5 presents the final results of this project; a walk through the central parts of the resulting applications, complete with screenshots and pseudo code.

– Chapter 6 presents the conclusions of the results. This chapter also states the limitations of the result and future work that the result could lead to.

Chapter 2

Problem Description

In this chapter an in-depth explanation of the task is presented. To clarify things the problem is divided into sub-problems. This chapter also contains the purpose of the task, how the task is solved and related work.

2.1 Problem Statement

The main problem is stated as: how well suited is the Iphone 3Gs as a platform for augmented reality and object tracking?

This statement is not just a rhetorical question but rather a starting point for developing applications for the Iphone, testing the statement by pushing the limits of what can be done.

The sub-tasks are:

– Augmented reality car game. The main idea of this game is for the user to be able to play a car game on any physical surface, with the physical objects posing as obstacles.

– Tracking application. The focus of this application is to track and recognize objects and patterns. Examples include logotypes, desktop material and human faces. Another function is to read handwritten letters and display the resulting word.

2.2 Purposes

The purpose of this master's thesis is to provide insight into the abilities of the Iphone 3Gs when it comes to handling applications with object tracking and augmented reality. The applications are therefore not meant to be uploaded to Apple's App Store and introduced to the public, but rather to be used to display what is possible to create within this field.

The purpose is not to transfer this knowledge directly to North Kingdom's ordinary activity, but as the mobile application market progresses this kind of work will certainly become a part of that activity in the future.

2.3 Methods

This master's thesis starts with a literature review on the subject of tracking in handheld devices. This review is the foundation of the thesis and it influences both the design process and the development process of the project, see chapter 3.

After the literature review the project moves to its second phase, development. After a review of the capabilities of the development environment Xcode, a couple of applications are designed. The design process contains sketches, LoFi prototypes, HiFi prototypes and usability testing. The finished designs are implemented in Xcode.

2.4 Related Work

Numerous companies have produced demo videos displaying visions of what they think they can do with augmented reality and tracking. Since no actual applications are shown, these are not considered here. One example of an actual AR and object tracking application is Sudoku Grab [9]. This application can track a sudoku puzzle and solve it, adding the missing numbers in the empty sudoku slots. At the time of writing, the application is in the running for an award for the most innovative use of hardware in an Iphone application [10]. I use ideas similar to those in Sudoku Grab in my implementations.

Another example is produced by Georg Klein and David Murray. They have created an application where the surroundings can be analyzed, making it possible to render different 3D characters so that they look like they are sitting on a desk in the physical world [11]. Their study is conducted on an older version of the Iphone with the possibility to access the raw camera stream.

An example of an application tracking its environment is Red Laser. It is developed by Occipital [12] and is a good example of how it is possible to scan and analyze the camera view within the Iphone.

Chapter 3

Tracking in Handheld Devices

This chapter gives an in-depth survey of some of the ways that tracking can be used in mobile devices, as well as what they are used for. It discusses some of the most commonly used tracking algorithms, such as marker tracking and edge detection, and also highlights some alternative methods.

3.1 Introduction

Tracking in handheld devices is almost synonymous with marker tracking. The reason for this is how easy it is to calculate the angle of the camera and then rotate the AR object according to that angle [13]. The process is described in the following section. One of the earliest successful attempts to implement this on “off-the-shelf hardware” was made by Daniel Wagner and Dieter Schmalstieg in 2003 [14]. They implemented an AR marker tracking system on an unmodified personal digital assistant (PDA).

The single biggest contribution within the field of marker tracking was made by Hirokazu Kato. He developed the ARToolKit, an AR and marker tracking framework which became open source in 2004 and has since had hundreds of thousands of downloads [15]. The ARToolKit has since evolved into versions more optimized for handheld devices [16].

Even if marker tracking is a big part of tracking with handheld devices, the subject has a lot more to offer. If the main goal of the tracking is to detect shapes, an edge detecting algorithm is well suited [17]. This can be done using a number of different approaches [18, 19], but the main goal is to highlight pixels that deviate from their surroundings by more than a fixed threshold value in order to detect object edges.

Another more unconventional method is tracking with a mean-shift algorithm. This is a method that relies on features in the picture such as the histogram value of a specific area in order to track an object [20]. This method is often used only with single objects that are present in the view at all times. Figure 3.1 shows this method implemented on an Iphone device [21].

Figure 3.1: An Iphone using a mean-shift algorithm tracking an orange object [21]

Positioning a 3D generated object at the correct angles without a marker is a far more complicated process. Here there is no use in tracking a single object; instead the whole environment is tracked and a grid is matched to specific points in the environment. It is this grid that then changes its position, resulting in the 3D object changing angle [11]. A great example of how to do this was created by Georg Klein and David Murray under the label parallel tracking and mapping [11].

3.2 Algorithms

For a better understanding of how these kinds of algorithms work, this section explains the algorithms mentioned in the section above. The focus is on the theory, explaining how each algorithm works rather than showing exactly how it is implemented.

3.2.1 Markertracking

The essential part of this method is the marker. A key feature of the marker is that the pattern must not look identical from two different angles. In addition, both the pattern and its size have to be known. Figure 3.2 shows a marker used in the ARToolKit project [15].

Figure 3.2: An early marker used by the ARToolKit

Because the size of the marker is known, it is possible to map the camera angle to how much of the marker is seen. This is done by image processing, where the black borders of the marker are searched for and, when they are found, the pattern inside is analyzed to calculate the angle. Figure 3.3 illustrates how this is done. By knowing the angle of the camera with respect to the x, y and z axes it is possible to rotate a 3D object, creating the illusion of a digital object in the physical world [13]. Changing the distance between the marker and the camera results in adjustments to the size of the 3D object. Increasing the distance shrinks the object, as long as the camera is still able to recognize the marker, and decreasing the distance enlarges the object.

Figure 3.3: An illustration of how camera angle and marker angle are mapped
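The relation between marker distance and object size described above can be illustrated with the pinhole camera model, in which the apparent size of an object is inversely proportional to its distance. The following is a minimal sketch under that assumption; the function names, the focal length and the marker size are hypothetical illustration values and are not taken from the ARToolKit.

```python
def estimate_distance(marker_size_mm, apparent_size_px, focal_length_px):
    """Pinhole model: apparent_size_px = focal_length_px * marker_size_mm / distance_mm."""
    if apparent_size_px <= 0:
        raise ValueError("marker not visible")
    return focal_length_px * marker_size_mm / apparent_size_px


def object_scale(distance_mm, reference_distance_mm):
    """Scale factor for the 3D object: 1.0 at the reference distance,
    smaller when the camera moves away, larger when it moves closer."""
    return reference_distance_mm / distance_mm


# Hypothetical numbers: an 80 mm marker and a focal length of 500 pixels.
near = estimate_distance(80.0, 200.0, 500.0)  # marker appears 200 px wide
far = estimate_distance(80.0, 100.0, 500.0)   # marker appears 100 px wide
scale_when_far = object_scale(far, near)      # object drawn at half size
```

Because the physical marker size is known, a single measurement of its apparent width is enough to recover the distance, which is why the size requirement in the text matters.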

3.2.2 Edge detection

There are a large number of ways to implement edge detection, but the two main categories are search-based and zero-crossing based. Search-based algorithms use edge strength as a measurement and search for local maxima of that value, while zero-crossing based algorithms use, as the name implies, zero crossings computed from the image to find edges. Common to both approaches is that they use deviations in the picture to localize edges. The most common way is a gradient operation that determines the level of variance between selected pixels [18].

Figure 3.4 shows how edges are detected in a mobile camera photo. In this case all pixels in the picture are processed, and if the color value of a pixel deviates from that of its neighbours, the pixel is colored white. If there is no deviation, the pixel is colored black. After all pixels are processed, the resulting picture has a black background with the edges from the starting picture in white. Once the edges are calculated, the resulting image makes it possible to identify objects in the picture. With the objects identified, it is possible to position digital artifacts in relation to the physical objects. A walkthrough of an edge detection algorithm is given in section 5.1.

This is a simple way of implementing an edge detection algorithm and a large reduction of the algorithms used in computer vision, such as the extensive work of John Canny [19] and the work by Harris and Stephens [22]. The reduction is vital because of the difference in computing power between a handheld device and a stationary computer. The reduced calculation gives a difference in performance, but that is a sacrifice that has to be made.
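The search-based category can be sketched as follows: edge strength is computed as a gradient magnitude, and edges are taken where that strength has a local maximum. This is a simplified illustration, not the method of any cited work. It assumes a grayscale image stored as a list of rows, uses central differences for the gradient, and only searches for maxima along a single row instead of along the gradient direction. All names are hypothetical.

```python
def edge_strength(image):
    """Gradient magnitude for every interior pixel of a grayscale image.
    Border pixels are left at 0 because they lack a full neighbourhood."""
    height, width = len(image), len(image[0])
    strength = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0  # horizontal change
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0  # vertical change
            strength[y][x] = (gx * gx + gy * gy) ** 0.5
    return strength


def row_maxima(strength_row, threshold):
    """Indices where the edge strength exceeds the threshold and is a
    local maximum along the row."""
    return [
        x for x in range(1, len(strength_row) - 1)
        if strength_row[x] > threshold
        and strength_row[x] >= strength_row[x - 1]
        and strength_row[x] > strength_row[x + 1]
    ]


# A dark-to-bright transition spread over a few pixels.
image = [[0, 0, 60, 200, 255, 255] for _ in range(3)]
edges_in_middle_row = row_maxima(edge_strength(image)[1], 50)
```

Searching for the maximum rather than marking every pixel above the threshold gives an edge that is one pixel wide, at the price of extra computation per pixel.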

Figure 3.4: Edge detection on calculator and pen

3.2.3 Mean-shift algorithm

A mean-shift algorithm needs a pre-decided object to track. It starts with the selection of a cluster of pixels, and this area must contain the object. After this initial stage the mean-shift algorithm works frame by frame, calculating the area in the frame that has the color distribution closest to that of the pre-selected area [20]. In Figure 3.1 the pre-selected area is the bottom left side of the orange object. The picture to the right shows how the object has moved and how the blue box surrounding the area moves to the best matching area. To illustrate this, Figure 3.5 shows a selected histogram of a normalized colorspace [23]. The method tries to find the local maximum that matches this histogram and, in the case of Figure 3.1, moves the blue square in that direction [24].

Figure 3.5: Histogram of a normalized colorspace
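The frame-by-frame behaviour described above can be sketched in a few lines. The sketch below is one common simplification and is not taken from the cited implementations: each pixel is weighted by how frequent its color is in the histogram of the pre-selected area (a back-projection), and the window is moved to the weighted centroid until it stops moving. It assumes that frames are grids of already quantized color bins, and all function names are hypothetical.

```python
import math
from collections import Counter


def histogram(frame, window):
    """Normalised histogram of the colour bins inside window = (x, y, w, h)."""
    x0, y0, w, h = window
    counts = Counter(frame[y][x] for y in range(y0, y0 + h) for x in range(x0, x0 + w))
    total = float(w * h)
    return {color_bin: n / total for color_bin, n in counts.items()}


def mean_shift(frame, window, target_hist, max_iterations=10):
    """Move the window towards the region whose colours best match target_hist."""
    height, width = len(frame), len(frame[0])
    x0, y0, w, h = window
    for _ in range(max_iterations):
        weight_sum = sum_x = sum_y = 0.0
        for y in range(y0, y0 + h):
            for x in range(x0, x0 + w):
                weight = target_hist.get(frame[y][x], 0.0)  # back-projection
                weight_sum += weight
                sum_x += weight * x
                sum_y += weight * y
        if weight_sum == 0.0:
            break  # nothing inside the window resembles the target
        # Shift so that the window centre lands on the weighted centroid.
        new_x0 = int(math.floor(sum_x / weight_sum - (w - 1) / 2.0 + 0.5))
        new_y0 = int(math.floor(sum_y / weight_sum - (h - 1) / 2.0 + 0.5))
        new_x0 = max(0, min(width - w, new_x0))
        new_y0 = max(0, min(height - h, new_y0))
        if (new_x0, new_y0) == (x0, y0):
            break  # converged
        x0, y0 = new_x0, new_y0
    return (x0, y0, w, h)


def make_frame(size, object_x, object_y, object_size=3):
    """Synthetic frame: colour bin 0 is background, bin 1 is the tracked object."""
    frame = [[0] * size for _ in range(size)]
    for y in range(object_y, object_y + object_size):
        for x in range(object_x, object_x + object_size):
            frame[y][x] = 1
    return frame


first = make_frame(8, 1, 1)
target = histogram(first, (1, 1, 3, 3))            # the pre-selected area
second = make_frame(8, 3, 3)                       # the object has moved
window = mean_shift(second, (1, 1, 3, 3), target)  # the window follows it
```

Note that the window only follows the object if the old window still overlaps it in the new frame, which matches the remark in section 3.1 that the method is used with objects that are present in the view at all times.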

3.2.4 Parallel Tracking and Mapping

The main goal of this method is the same as that of the marker tracking method: to estimate the camera pose in order to adjust the augmented reality content. It does so by tracking key points in the user's environment and mapping them to a digital representation of the environment. The method contains two processes running in parallel: a point-based tracking system and a mapping system that bundles points and keyframes into a map representation of the environment [25]. This section only covers the point-based tracking system.

One way of performing point-based tracking is described in six steps in Parallel Tracking and Mapping for Small AR Workspaces [25]:

– Step 1. A new frame is received and a prior estimate of the camera pose is made.

– Step 2. Using the estimate from step 1, the map points are projected into the frame.

– Step 3. The 50 coarsest-scale points are searched for in the frame.

– Step 4. When they are found, the estimated camera pose is updated to the new estimate.

– Step 5. 1000 points are projected into the frame again and searched for.

– Step 6. Finally, a new camera pose is estimated from all the points that were found in step 5.

The picture below shows the mapping in progress.

Figure 3.6: Parallel tracking and mapping

3.3 Fields of interest

At the moment the interest in tracking in handheld devices is rising, but there is not a lot of commercial usage yet. Traditionally, tracking algorithms are a part of image processing, and that is basically the biggest challenge. Handheld devices are always going to produce shaky images, so the better an algorithm withstands obstacles like motion blur, the more efficient the method will be [11].

Tracking is an essential part of computer vision, a field that reaches from special effects in movies to industrial robots inspecting manufacturing. At this time, most tracking is performed to filter out data, leaving the interesting parts and removing unnecessary parts of the picture to reduce the amount of data [19]. Therefore, various types of edge detection are the most commonly used tracking algorithms [19].

Even within computer vision the field of handheld tracking is limited as of today. Aside from a couple of barcode readers and augmented reality games, there is not a big market yet. Leading AR researchers predict that the market for AR in computers as well as in handheld devices will increase rapidly in the not so distant future, so the market for and contributions to handheld tracking will probably increase as well [26].


Chapter 4

Accomplishment

This chapter will compare the preliminary time plan and order of execution to the actual way it was executed.

4.1 Preliminaries

Below is the preliminary estimation of how the work would proceed.

Figure 4.1: Preliminary time chart on the project

4.2 How the Work was done

The work is divided into four phases which are described in this section. The first of these four phases, the preparing and designing phase, refers to an appendix which contains some of the LoFi and HiFi prototypes that were tested as well as the design concepts.

4.2.1 The Preparing and Designing phase

The preparation weeks at the beginning of the project were spent on gaining insight into Apple's development environment Xcode. As I was a fairly new user of both Xcode and the development language Objective-C, this process was important as preparation for the later work. In addition to familiarization with the development environment, studies were done in order to see what had already been done within this field and to discover essential limitations.

Between weeks 40 and 41, the second and third weeks of the project, design concepts were created and finalized. Appendix A shows the first conceptual sketches of the two applications. Because of the extensive work required on the tracking mechanisms, the surrounding framework was stripped down and kept very minimalistic. Both applications present a splash screen with information on launch, and tapping the screen starts the tracking mode. Some LoFi design sketches of how to steer the car are presented in Appendix B. The steering wheel became the steering device of choice, since the steering wheel metaphor is obviously suited to a car game. For the other application the main focus was on the back end, and the interface was simplified as much as possible by using the standard Iphone button and label classes. This simplifies the user interaction, because the user will immediately be familiar with the environment.

The interaction with the game mode was tested with interactive prototypes. A screenshot of such a prototype and a substitute for the steering wheel can be found in Appendix C. In this prototype both gas and brake pedals were tested, but neither made it into the final version because the number of on-screen objects was kept to a minimum.

4.2.2 Early Development phase

Due to a delay at Telia, who were providing the Iphone, only the framework and menus could be created at first. These were created and tested in the Iphone simulator provided with Xcode. This took place during weeks 42 and 43.

At the beginning of November, a couple of weeks behind schedule because of the wait for the Iphone, the live video feed was implemented. This set the starting point for the tracking implementations.

4.2.3 Development of Tracking

Between weeks 46 and 50 different tracking algorithms were implemented and tested, striving for as efficient an algorithm as possible. In parallel with the tracking adjustments, numerous attempts were made to bypass the standard SDK in order to access the raw data feed directly from the camera. As all attempts failed, the fact had to be faced that the only way of analyzing the camera stream was by printing the whole screen. Because of the print screen problem, the original idea of a 3D generated car had to be withdrawn and a concept for a 2D game was created instead. A 2D solution without depth consideration reduces the calculation needed and retains the frame rate, and by removing the 3D rendering performed with OpenGL ES, the print screen method could be used. This also added to the problem that the project was running late, and I had no choice but to postpone the presentation from January to February.

4.2.4 Completion

Finally, an edge detecting algorithm was chosen as the most efficient, due to its capability to sustain a satisfying frame rate despite the limitations. An interpolation technique was implemented so that the edge marks leave no trace on the prints. The edge marks are printed in a separate layer on top of the camera stream, with just enough alpha for the user to see them while the image remains usable. This makes it possible to interpolate the edge marks so that they will not be present in the next frame captured, see section 5.1. This was done by the end of 2009. The project started again in week 2, and in the following weeks the two applications were built. The car game was now created as a 2D game seen from above, with the edge detecting algorithm keeping track of the location of physical objects and adjusting the car to these objects. The edge detecting algorithm was also used in the other application, where a background thread was created to match the detected edges against pre-computed edges of letters, logos and human faces.

In week 5 I held a presentation of the project for a couple of companies also located in Skellefteå and showed a few demos of my work. In week 6 this report was written, and it was completed at the start of week 7, 2010.

4.3 Conclusions

Comparing the preliminary schedule with the actual outcome, it is clear that there is a rather big difference. Knowing in advance that it would take several weeks for Telia to supply the phone might have avoided some of the delay, but not enough to finish on time. The main reason for the delay was the excessive amount of time spent on trying to access the raw data of the live video feed. The planned order of the tasks, however, was rather accurate. The number of weeks spent on each task was also accurate, apart from the tracking algorithm task, which could be prolonged because the time for graphics was shortened when the 3D idea was scrapped.


Chapter 5

Results

In this chapter the outcome of this project is presented. Every screenshot of the working applications has had its edge detection points colored in order to provide the reader of this thesis with a better understanding of what is tracked in the picture. The car game has white tracking points and the tracking points for the object tracking application are red.

5.1 Tracking

The heart of the following applications is the edge detection algorithm that I have implemented. It is therefore essential to explain how the algorithm works. Two versions of the same algorithm have been implemented, and the following pseudo code explains the different steps. A threshold value is chosen before the algorithm starts: the larger the value, the less sensitive the edge tracking will be. A low value generates more and thicker edges.

1. Capture a screenshot of the whole display

2. Loop through all the pixels of the captured screenshot

(a) Check color value of the pixel

(b) Check color value of pixels that border on the current pixel

(c) If the color values differ by more than the threshold value then save position in array

3. Update the array by removing pixels whose difference no longer exceeds the threshold value

4. Start all over again at 1.
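The steps above can be sketched in runnable form. The thesis implementation was written in Objective-C and is not reproduced here; this is a simplified illustration that assumes grayscale frames stored as lists of rows, compares each pixel with its four direct neighbours, and uses a hypothetical `capture_frame` callable in place of the screenshot call.

```python
NEIGHBOUR_OFFSETS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def detect_edges(frame, threshold):
    """Step 2: positions whose value differs from a bordering pixel by more
    than the threshold."""
    height, width = len(frame), len(frame[0])
    edges = set()
    for y in range(height):
        for x in range(width):
            for dx, dy in NEIGHBOUR_OFFSETS:
                nx, ny = x + dx, y + dy
                if 0 <= nx < width and 0 <= ny < height:
                    if abs(frame[y][x] - frame[ny][nx]) > threshold:
                        edges.add((x, y))
                        break
    return edges


def track(capture_frame, threshold, frame_count):
    """Steps 1 to 4: capture a frame, find its edges, refresh the stored positions."""
    edges = set()
    for _ in range(frame_count):
        frame = capture_frame()                   # step 1 (stand-in for the screenshot)
        current = detect_edges(frame, threshold)  # step 2
        edges -= edges - current                  # step 3: drop positions that are no longer edges
        edges |= current                          # keep the positions found in this frame
    return edges


# Two tiny frames in which a vertical edge moves one pixel to the left.
frame_a = [[10, 10, 200, 200], [10, 10, 200, 200]]
frame_b = [[10, 200, 200, 200], [10, 200, 200, 200]]
frames = iter([frame_a, frame_b])
tracked = track(lambda: next(frames), 50, 2)
```

With a threshold of 50 both sides of the transition are stored, which corresponds to the thicker edges mentioned in the text; a threshold above the largest difference in the frame yields no edges at all.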

In order to leave colored traces, like in the screenshots below, some more steps have to be completed. The trick here is to print the points on a clear canvas instead of saving them in an array. To make sure that the points are not captured in the next screenshot, which would eventually lead to a single-colored screen, interpolation has to be applied. Through interpolation, the color of each drawn pixel is substituted by the mean value of the surrounding pixel colors.
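The interpolation step can be illustrated with a short sketch. It assumes a grayscale screenshot stored as a list of rows and a set of positions where marks were drawn. Excluding neighbours that are themselves marked from the mean is an assumption made here for the illustration, and the function name is hypothetical.

```python
def interpolate_marks(screenshot, marks):
    """Return a copy of the screenshot in which every marked pixel is replaced
    by the mean of its unmarked neighbours (8-neighbourhood)."""
    height, width = len(screenshot), len(screenshot[0])
    cleaned = [row[:] for row in screenshot]
    for (x, y) in marks:
        values = [
            screenshot[ny][nx]
            for ny in range(max(0, y - 1), min(height, y + 2))
            for nx in range(max(0, x - 1), min(width, x + 2))
            if (nx, ny) != (x, y) and (nx, ny) not in marks
        ]
        if values:  # leave the pixel untouched if every neighbour is marked too
            cleaned[y][x] = sum(values) / len(values)
    return cleaned


# A drawn mark (value 255) in the middle of a captured screenshot.
screenshot = [[10, 20, 30], [40, 255, 60], [70, 80, 90]]
cleaned = interpolate_marks(screenshot, {(1, 1)})
```

After this step the edge detection runs on `cleaned`, so marks from the previous frame do not produce edges of their own.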

5.2 Augmented Reality Car Game

As mentioned before, the car game is implemented in 2D and therefore has a couple of limitations. The game has to be played directly from above, with the camera pointing straight down. Because it is a 2D game, changing the angle will not rotate the car, and the illusion of a digital object merged into the real world is lost. Changing the distance between table and camera is limited by the implementation. For the illusion to make sense, the car has to be scaled down when the distance increases and scaled up when the distance decreases. The decreasing case is the problem when it is not possible to access the raw camera stream. If the camera gets too close, the car will probably fill the whole screen; by then no tracking is possible and the car will never scale down again, even if the distance increases. Another limitation is that some of the physical objects present when the game is started have to remain present at all times. If all of the original objects are replaced, it is a new scenario and the car will be adjusted to that scenario instead.

5.2.1 Icon and Menus

To start the game, the application icon has to be pressed in the Iphone menu. The picture below shows the application icon.

Figure 5.1: Cargame icon

The game is started by pressing the play button on the splashscreen that appears at startup and the game mode will soon appear.

Figure 5.2: Cargame splashscreen

5.2.2 Game Mode

The play sequence in this game is rather simple. The steering wheel to the right controls the car, and the player rotates it by touch interaction. The phone can be moved in two dimensions as long as the angle of the camera and the distance between the objects and the phone do not differ too much. The key feature of this game is that the digital car appears as if it is merged into the physical world, staying at the same place relative to the physical objects even when the phone is moved. The example below shows two pictures where the digital car keeps its angle and distance to the physical objects, in this case a stapler, even when the position of the phone has been changed.
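One way such anchoring could work in 2D is sketched below. This is an assumption made for illustration, since the text does not describe the exact mechanism: the car is shifted on screen by the same offset as the centroid of the tracked edge points moves between two frames. The names `centroid` and `reanchor` are hypothetical.

```python
def centroid(points):
    """Mean (x, y) position of a non-empty list of points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def reanchor(car_pos, old_points, new_points):
    """Shift the car by the same screen offset as the tracked points moved,
    so that it keeps its position relative to the physical objects."""
    ox, oy = centroid(old_points)
    nx, ny = centroid(new_points)
    return (car_pos[0] + nx - ox, car_pos[1] + ny - oy)


# The tracked object moves 5 pixels right and 2 pixels down on screen.
old_points = [(10, 10), (20, 10)]
new_points = [(15, 12), (25, 12)]
car = reanchor((50, 50), old_points, new_points)
```

The sketch handles translation only, which fits the restriction that the camera angle and distance must stay roughly constant. It also relies on some of the original points remaining visible, in line with the limitation described at the start of this section.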

5.3 Object Tracking

This application has three features based on object tracking and object recognition. It should be pointed out that the main focus of this thesis is object tracking, so the recognition part of this application has been somewhat neglected; the face recognition function in particular is a bit of showcase work. It will not tell the difference between faces, just recognize whether a human face is present.

5.3.1 Icon and Menus

To start this application you have to tap the icon below in the Iphone menu.

Figure 5.4: Icon to the application: What?What?

When the application is started, the splash screen to the left in the picture below is visible. When the screen is tapped, the picture to the right appears. It contains a toolbar at the bottom of the screen where the user can change from the default mode, which is object and face recognition, to the letter recognition mode by tapping the cross to the right in the toolbar. The done button exits the application, and the space to the left contains a label that prints what the application has found.

5.3.2 Object recognition

In order for this function to work, the application has to have the ratios between different edges of the objects precalculated. As shown in Figure 5.6, the ratios between six different edges are found and compared. Once again it should be pointed out that this is probably not the most efficient way to build this kind of application, but as stated at the beginning of this chapter, it is only implemented to show the potential of this kind of tracking.
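A ratio-based comparison of this kind could be sketched as follows. The sketch assumes that the edge lengths of an object have already been measured in pixels; dividing by the longest edge is one possible normalisation, and the function names and the tolerance value are hypothetical.

```python
def edge_ratios(lengths):
    """Scale-independent signature: every edge length divided by the longest,
    sorted so that the order of measurement does not matter."""
    longest = max(lengths)
    return [length / longest for length in sorted(lengths)]


def matches(lengths, reference, tolerance=0.05):
    """True if the measured ratios agree with the precalculated ones."""
    ratios = edge_ratios(lengths)
    return len(ratios) == len(reference) and all(
        abs(a - b) <= tolerance for a, b in zip(ratios, reference)
    )


# Precalculated signature of a hypothetical reference object.
reference = edge_ratios([30, 40, 50])
same_shape_twice_as_large = matches([60, 80, 100], reference)
different_shape = matches([50, 50, 50], reference)
```

Because only ratios are compared, the same object seen from a different distance still matches, while an object with different proportions does not.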

5.3.3 Letter recognition

The letter recognition function can be accessed through the bottom menu by pressing the cross on the right hand side. When pressing this button a red aim will appear on the screen. The user then has to fit the letters within this aim in order for the recognition to do its job.

Figure 5.7: The letter tracking function in progress

In this mode the regular recognition is turned off and the application switches its focus only to what is present within the aim. The recognition is built like a grid with every letter having unique tracking points within that grid. The example above shows how the letter R is recognized.
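The grid idea can be illustrated with the sketch below. It assumes a square aim divided into 3x3 cells and two made-up letter templates; the real grid size and the tracking points of each letter are not given in this text, so `TEMPLATES`, `occupied_cells` and `recognise` are purely hypothetical.

```python
def occupied_cells(edge_points, aim_size, grid=3):
    """Map edge points (x, y) inside a square aim to the (row, col) grid
    cells they fall into. Points outside the aim are ignored."""
    cell = aim_size / grid
    return {
        (int(y // cell), int(x // cell))
        for x, y in edge_points
        if 0 <= x < aim_size and 0 <= y < aim_size
    }


# Hypothetical templates: the grid cells that must contain edge pixels.
TEMPLATES = {
    "I": {(0, 1), (1, 1), (2, 1)},
    "L": {(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)},
}


def recognise(edge_points, aim_size, templates=TEMPLATES):
    """Return the letter whose template equals the occupied cells, or None."""
    cells = occupied_cells(edge_points, aim_size)
    for letter, required in templates.items():
        if cells == required:
            return letter
    return None


# Edge points of a vertical stroke in the middle of a 90x90 pixel aim.
letter = recognise([(45, 10), (45, 45), (45, 80)], aim_size=90)
```

Requiring an exact match of occupied cells keeps the templates unique per letter, at the cost of being sensitive to stray edge pixels inside the aim.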

5.3.4 Face recognition

The face recognition tracks the ratio between eyes and mouth. Both the eyes and the mouth generate large clusters of edge pixels, making them easy to locate in a picture. If such a cluster is found, triangulation is used to check whether there is another cluster within a precalculated range. This range is based on the ratio between the eyes and mouth of a human face. A feature of this mode is that if a face is recognized, the Facebook page of the detected person is accessible through the Facebook icon appearing at the bottom of the screen.
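The ratio check could be sketched as below, assuming that the centres of three edge-pixel clusters have already been located. The measure used here (distance from the midpoint between the eyes to the mouth, divided by the distance between the eyes) and the accepted range are placeholder assumptions chosen for illustration, not values from the application.

```python
import math


def looks_like_face(left_eye, right_eye, mouth, ratio_range=(0.8, 1.4)):
    """Check whether three cluster centres form an eye-eye-mouth triangle
    whose proportions fall inside `ratio_range` (a placeholder range)."""
    eye_distance = math.dist(left_eye, right_eye)
    if eye_distance == 0:
        return False
    midpoint = (
        (left_eye[0] + right_eye[0]) / 2,
        (left_eye[1] + right_eye[1]) / 2,
    )
    ratio = math.dist(midpoint, mouth) / eye_distance
    return ratio_range[0] <= ratio <= ratio_range[1]


# Eyes 40 pixels apart, mouth 45 pixels below their midpoint: ratio 1.125.
plausible = looks_like_face((40, 50), (80, 50), (60, 95))
# Same eyes, but the third cluster is far too low: ratio 3.75.
implausible = looks_like_face((40, 50), (80, 50), (60, 200))
```

Since the test uses a ratio rather than absolute distances, it is independent of how far the face is from the camera, but it only tells that a face-like arrangement is present, not whose face it is.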

Chapter 6

Conclusions

To summarize this thesis it is appropriate to revisit the main goal of the project. The goal was to find out how well suited the Iphone 3Gs is as a platform for augmented reality and object tracking. From this point of view the project has been a success, despite setbacks like redesigning the whole concept from an augmented reality game in 3D to an augmented reality game in 2D, because testing the limits was what the whole project was about. The main conclusion of this thesis can be summed up in one sentence:

“As long as the SDK does not allow developers to access the raw camera stream, the Iphone 3Gs is not suited for augmented reality that depends on analyses of the physical world”

Adding pictures on top of the camera stream works just fine, but as long as the print screen method discussed in the last chapter is the only way of analyzing what is present on the screen, the added pictures will also be part of the equation and complicate everything. This can be seen in Figure 5.3, where on close inspection both the steering wheel and the car have a white edge around them, showing that the algorithm also considers the steering wheel a physical object.

Tracking the physical world without the interference of digital objects merged with it is another story. The phone possesses enough features and CPU power to complete very demanding operations. Recognition software in the form of Iphone applications is just at the beginning of what I believe is an upcoming trend. The area of use is almost endless, and as long as the Iphone continues to thrive on the market the development will continue.

6.1 Limitations

Some of the limitations of my work have already been mentioned. Because the work is an overview of the possibilities of the Iphone 3Gs, the focus has not been on developing bug-free and solid applications. All of the applications should be seen as demonstrations of what can be done, and optimizing each of them would probably be a master's thesis of its own. To sum up some of the limitations, starting with the car game: it only works in two dimensions, some of the objects that are present at the start have to be present at all times, and the car has no collision detection against objects. The letter tracker only tracks a bold handwritten font. The object tracker only tracks three different shapes at the moment, and the face tracker only detects that a face is present, not who the face belongs to; the name of the person and the Facebook link are therefore implemented for just one person.

6.2 Future Work

As for future work, there is quite a bit that can be done. The work on the letter tracker will continue, and the first step will be to make it possible to track whole words in a common font such as Times New Roman. It will also be made possible to save the text to a document.

When it comes to object and face recognition, a database of objects and faces seen from different angles has to be implemented. My guess is that if the database grows large, the most efficient way to use it is to create a client/server application. In such a case the phone only provides photos to a back-end server that does all the calculations.

Chapter 7

Acknowledgements

This master's thesis has been really interesting and I'm sure that I will have great use of this experience later on in my career. I would therefore like to take this opportunity to thank the CEO of North Kingdom, David Eriksson, for providing me with the subject and letting me work at their firm. I would also like to thank my supervisor at North Kingdom, Hans Eklund, for helping me with my work, and my supervisor at the Department of Computing Science, Ola Ågren, for the help and feedback on this report.

References

[1] Canalys. Smart phone market shows modest growth in q3 - but apple and rim hit record volumes. http://www.canalys.com/pr/2009/r2009112.html, December 25 2009.

[2] North Kingdom. Official website. http://www.northkingdom.com/about/, January 20 2010.

[3] Apple Inc. Apple iphone. http://www.apple.com/iphone, January 10 2010.

[4] T-mobile Netherland. Leaked Iphone secret. http://www.mobilewhack.com/t-mobile-netherlands-leaks-iphone-3g-s-hardware, February 10 2010.

[5] R. T. Azuma. A survey of augmented reality. Presence: Teleoperators and Virtual Environments, pages 355–385, 1997.

[6] M. A. Livingston, L. J. Rosenblum, S. J. Julier, D. Brown, Y. Baillot, J. E. Swan II, J. L. Gabbard, and D. Hix. An augmented reality system for military operations in urban terrain. I/ITSEC, page 89, 2002.

[7] J. Moline. Virtual reality for health care: a survey. Technical report, National Institute of Standards and Technology, Gaithersburg, MD, 1997.

[8] Augmented environments lab. Arhrrrr! http://www.augmentedenvironments.org/lab/research/handheld-ar/arhrrrr/, February 14 2010.

[9] CMG Research. Sudoku grab. http://www.cmgresearch.com/sudokugrab/, February 12 2010.

[10] Best App Ever Awards. Second annual iphone os application achievement awards. http://bestappever.com/awards/2009/, December 17 2009.

[11] G. Klein and D. Murray. Parallel tracking and mapping on a camera phone. ISMAR’09, 2009.

[12] Occipital. Redlaser. http://redlaser.com/, February 10 2010.

[13] H. Kato and M. Billinghurst. Marker tracking and hmd calibration for a video-based augmented reality conferencing system. IWAR’99, pages 85–94, 1999.

[14] D. Wagner and D Schmalstieg. First steps towards handheld augmented reality. ISWC 2003, pages 127–137, 2003.

[15] ARToolKit. Official website. http://www.hitl.washington.edu/artoolkit/, January 20 2010.

[16] D. Wagner and D Schmalstieg. Artoolkitplus for pose tracking on mobile devices. CVWW’07, pages 139–146, 2007.

[17] D. Marr and E. Hildreth. Theory of edge detection. PROC. ROY. SOC.(London), vol. B207, pages 187–217, 1980.

[18] H. S. Neoh and A. Hazanchuk. Adaptive edge detection for real-time video processing using fpgas. GSPx 2004, 2004.

[19] J. Canny. A computational approach to edge detection. IEEE Trans. Pattern Analysis and Machine Intelligence, pages 679–698, 1986.

[20] D. Comaniciu, V. Ramesh, and P. Meer. Real-time tracking of non-rigid objects using mean shift. Proceedings of 2000 IEEE Conference on Computer Vision and Pattern Recognition, pages 142–149, 2000.

[21] I. Halil. Mean-shift based moving object tracker. http://www.cs.bilkent.edu.tr/~ismaila/MUSCLE/MSTracker.htm, January 14 2010.

[22] C. Harris and M. Stephens. A combined corner and edge detector. Fourth Alvey Vision Conference, pages 147–151, 1988.

[23] D. Comaniciu and P. Meer. Mean shift: A robust approach towards feature space analysis. IEEE Trans. Pattern Anal. Machine Intell., pages 603–619, 2002.

[24] R. Collins, O. Amidi, and T. Kanade. An active camera system for acquiring multi-view video. Proceedings of the International Conference on Image Processing, pages 517–520, 2002.

[25] G. Klein and D. Murray. Parallel tracking and mapping for small ar workspaces. ISMAR’07, pages 225–234, 2007.

Appendix A

Concept sketches

Figure A.1: Concept sketch on the car game, before it was reduced to 2D

Appendix B

Lo-Fi

Figure B.1: Lo-fi sketches on possible ways to steer the car

Appendix C

Interactive prototypes

Figure C.1: HiFi prototype to test the usability of a spinning steering wheel with gas and brake pedals
