
Master Thesis
Computer Science
Thesis no: MCS-2010-33
September 2010

School of Computing

Blekinge Institute of Technology, Box 520, SE-372 25 Karlskrona, Sweden

Control of Articulated Robot Arm by Eye Tracking

Muhammad Imran Shahzad and Saqib Mehmood

Contact Information:

Author(s):

Muhammad Imran Shahzad

Address: Folk Parksvägen 17 LGH 19, 37240, Ronneby, Sweden E-mail: saramadeel@gmail.com

Saqib Mehmood

Address: Folk Parksvägen 17 LGH 19, 37240, Ronneby, Sweden E-mail: saqib495@hotmail.com

Internet: www.bth.se/com

Phone: +46 457 38 50 00

Fax: +46 457 271 25


University advisor(s):

Craig Lindley

Craig.lindley@bth.se

Game Systems and Interaction Research Laboratory

Department of Interaction and System Design
Blekinge Institute of Technology
Box 520, SE-372 25 Karlskrona, Sweden

Internet: www.bth.se/tek
Phone: +46 457 38 50 00
Fax: +46 457 102 45

This thesis is submitted to the School of Computing at Blekinge Institute of Technology in partial fulfillment of the requirements for the degree of Master of Science in Computer Science.

The thesis is equivalent to 20 weeks of full time studies.

Contact Information:

Author(s):

Muhammad Imran Shahzad

19LGH 17 Folkparks Vägen 37240, Ronneby, Sweden saramadeel@gmail.com

Saqib Mehmood

19LGH 17 Folkparks Vägen 37240, Ronneby, Sweden saqib495@hotmail.com


ABSTRACT

Eye tracking has achieved a great deal in the field of human-computer interaction. Using the human eyes as an alternative to the hands is an innovative approach from a human-computer interaction perspective. Many applications for autonomous robot control have already been developed, but we developed two different interfaces to control an articulated robot manually. The first of these interfaces is controlled by mouse and the second by eye tracking. The main focus of our thesis is to assist people with motor disabilities by using their eyes as an input instead of a mouse. Eye gaze tracking is used to send commands that perform different tasks. The interfaces are divided into active and inactive regions. Dwell time is a well-known technique used to execute commands through eye gaze instead of a mouse.

When a user gazes at an active region for a specific dwell time, the command is executed and the robot performs the corresponding task. When inactive regions are gazed at, no command is executed and no function is performed. The difference in task completion time between the mouse and eye tracking is shown to be 40 ms, the mouse being faster. However, a mouse cannot be used by people with motor disabilities, so the eye tracker has a decisive advantage in this case.

Keywords: Eyetracking, Interface, Articulated robot


ACKNOWLEDGEMENT

We are greatly thankful to our families for their encouragement, support, assistance and their prayers to ALLAH for our success.

We would like to thank our supervisor for this master thesis, Craig Lindley, Blekinge Institute of Technology, Sweden, for his commitment and inspiration in helping us achieve our research goals. Craig Lindley provided essential advice and extraordinary guidance, and without his supervision this research would not have been possible.

We further thank the Blekinge Institute of Technology administration and staff, who gave us the opportunity to obtain a quality education and research exposure that will help us be successful in our lives.


CONTENTS

CONTROL OF ARTICULATED ROBOT ARM BY EYE TRACKING ...I

ABSTRACT ... 2

ACKNOWLEDGEMENT ... 3

CONTENTS ... 4

LIST OF TABLES ... 6

LIST OF FIGURES ... 7

1 INTRODUCTION ... 8

2 BACKGROUND ... 11

2.1 WHAT IS EYE TRACKING TECHNOLOGY? ... 11

2.2 EYE AND GAZE TRACKING APPLICATIONS ... 12

2.2.1 Midas touch, Dwell time & WPF Custom controls ... 13

2.3 TOBII TECHNOLOGY SYSTEM ... 14

2.3.1 Model T60 ... 14

2.4 EYETRACKING OVERVIEW /METHODOLOGY ... 15

2.4.1 Eyetracking ... 15

2.4.2 Eyetracking Methodology ... 15

2.5 MOTIVATION ... 17

2.6 ROBOT ... 18

2.6.1 Articulated Robot ... 18

2.7 RELATED WORK ... 18

2.8 PROPOSED WORK ... 18

3 PROBLEM DEFINITION ... 20

3.1 AIMS AND OBJECTIVES ... 20

3.2 RESEARCH QUESTIONS ... 20

3.3 EXPECTED OUTCOMES ... 20

3.4 THE STRUCTURE OF THE THESIS ... 20

4 RESEARCH METHODOLOGY ... 22

4.1 LITERATURE REVIEW ... 24

4.2 RESEARCH QUESTIONS/STUDY: ... 24

4.3 RESEARCH DESIGN ... 24

4.4 RELATED WORK ... 25

4.5 INTERFACE DESIGN ... 25

4.5.1 Robot Description ... 26

4.5.2 Arduino Micro controller Board for Robot ... 26

4.5.3 Pulse width Modulation (PWM) ... 28

4.6 VALIDATION ... 29

4.7 QUESTIONNAIRES ... 29

4.7.1 Closed Questionnaires ... 29

4.7.2 Open Questionnaires ... 29

4.7.3 Forced Choice Questionnaire ... 29

5 THEORETICAL WORK/STUDY ... 30

5.1 WHAT DOES INTERFACE MEAN? ... 30

5.2 MULTIMODAL INTERFACES. ... 30

5.3 ROBOTIC INTERFACES ... 30

5.3.1 Console Interface ... 30

5.4 USER INTERFACE DESIGN PROCESS ... 30

5.4.1 Requirements of Functionality ... 31

5.4.2 User Analysis: ... 31


5.4.3 Prototype ... 31

5.4.4 Usability Testing ... 31

5.4.5 Graphical User Interface Design (GUI) ... 31

5.5 PROGRAMMING LANGUAGE USED ... 31

5.5.1 Java is Simple ... 32

5.5.2 Java is Platform Independent ... 32

5.5.3 Java is Safe ... 32

5.5.4 Java is High Performance ... 32

5.5.5 Java is Multi-Threaded ... 32

6 EXPERIMENT RESULT / ANALYSIS ... 33

6.1 INTERFACE 1 USING EYETRACKING ... 33

6.1.1 First Stage ... 33

6.1.2 Second Stage ... 33

6.1.3 Third Stage... 35

6.2 2ND INTERFACE USING EYETRACKING ... 36

6.3 WORKING OF INTERFACE 1 USING A MOUSE ... 37

6.4 WORKING OF INTERFACE 2 USING A MOUSE ... 38

6.5 COMPARISON OF MOUSE-BASED AND EYE TRACKER INTERFACES ... 38

6.5.1 Reading with Mouse... 39

6.5.2 Reading using Eye Tracker ... 40

6.6 COMPARISON BETWEEN MOUSE AND EYETRACKING ... 41

7 CONCLUSION AND FUTURE WORK ... 45

7.1 CONCLUSION ... 45

7.2 FUTURE WORK ... 45

8 REFERENCES ... 46

9 APPENDIX ... 50

9.1 QUESTIONNAIRE ... 50

9.2 RESULTS OF THE FORCED CHOICE CLOSED-ENDED QUESTIONS ... 51


LIST OF TABLES

TABLE 1 STRUCTURE OF THESIS ... 21

TABLE 2 SUMMARY OF ARDUINO BOARD ... 26

TABLE 3 NUMBER OF USERS AND THEIR TIME CONSUMPTION IN SECONDS ... 39

TABLE 4 TIME USED IN SECONDS TO COMPLETE TASK BY EYETRACKING ... 40

TABLE 5 SHOWING TIME DIFFERENCE BETWEEN EYETRACKING AND MOUSE ... 41

TABLE 6 SHOWING MEAN DIFFERENCE OF TIME BETWEEN MOUSE AND EYETRACKING ... 42

TABLE 7 SHOWING STANDARD DEVIATION DIFFERENCE OF TIME BETWEEN MOUSE AND EYETRACKING ... 42

TABLE 8 SHOWING COMPARISON BETWEEN MOUSE AND EYETRACKING FOR MEAN AND STANDARD DEVIATION ... 43

TABLE 9 RESULT OF QUESTIONNAIRE ... 51


LIST OF FIGURES

FIGURE 1 ARDUINO BOARD ... 9

FIGURE 2 HEAD MOUNTED EYETRACKING SYSTEM ... 9

FIGURE 3 INTERFACES AND ROBOT ... 10

FIGURE 4 EYE TRACKING SYSTEM OF EARLY AGE ... 12

FIGURE 5 HEAD MODEL FOR DETECTION OF EYE AND GAZE ORIENTATION ... 12

FIGURE 6 USER POINTS TO OBJECTS ... 13

FIGURE 7 TOBII T60 EYETRACKING ... 14

FIGURE 8 SENSORS ATTACHED AT THE SKIN AROUND THE EYES ... 15

FIGURE 9 SCLERAL CONTACT LENS PROCESS STEP 1 ... 16

FIGURE 10 SCLERAL CONTACT LENS PROCESS STEP 2 ... 16

FIGURE 11 SCLERAL CONTACT LENS PROCESS STEP 3 ... 17

FIGURE 12 SHOWING POSITION OF EYE IN VIDEO OCULOGRAPHY ... 17

FIGURE 13 ARTICULATED ROBOT ARM ... 18

FIGURE 14 SHOWING THE STRUCTURE OF THE RESEARCH METHODOLOGY ... 23

FIGURE 15 PROCESS OVERVIEW OF RESEARCH WORK ... 24

FIGURE 16 ARTICULATED ROBOT ARM SHOWING JOINTS ... 26

FIGURE 17 ARDUINO BOARD CONFIGURATION [61] ... 28

FIGURE 18 SHOWING GRAPH FOR PULSE WIDTH MODULATION [61] ... 28

FIGURE 19 FLOW OF DATA OF THREE STAGES ... 33

FIGURE 20 GRAPHICAL MOVEMENTS OF EYES OF USERS ... 34

FIGURE 21 SPECIFICATION OF ARDUINO BOARD ... 35

FIGURE 22 2ND INTERFACE ... 36

FIGURE 23 MOUSE BASED INTERFACE ... 37

FIGURE 24 MOUSE BASED GUI AND ROBOT ... 37

FIGURE 25 GRAPH SHOWING TIME USING MOUSE ... 39

FIGURE 26 GRAPH SHOWING TIME USING EYETRACKING ... 40

FIGURE 27 SHOWING DIFFERENCE OF TIME BETWEEN MOUSE AND EYETRACKING ... 42

FIGURE 28 SHOWING MEAN DIFFERENCE OF TIME BETWEEN MOUSE AND EYETRACKING ... 42

FIGURE 29 SHOWING STANDARD DEVIATION DIFFERENCE OF TIME BETWEEN MOUSE AND EYETRACKING ... 43

FIGURE 30 SHOWING COMPARISON OF MEAN AND STANDARD DEVIATION FOR BOTH MOUSE AND EYETRACKING ... 44


1 INTRODUCTION

For many centuries, human beings have continuously sought substitutes that can imitate their role in numerous instances of interaction with the environment. Several economic, social, scientific and philosophical principles have inspired this research.

Bringing artifacts to life is one of the greatest ambitions of human beings. The Czech playwright Karel Capek coined the term "robot", derived from the Slavic word robota meaning executive labor, to denote a humanoid automaton created by the fictitious Rossum [1].

In the field of robotics for manufacturing, there exist three types of automation:

1. Rigid Automation: the same type of machine is manufactured through automated assembly processes.

2. Programmable Automation: variable batches of different types of machines are manufactured.

3. Flexible Automation: this deals with the evolution of programmable automation [1].

With the advancement of robotics technology, robots are used for many purposes, such as painting, welding, sealing and assembly. Robots need some basic teaching to perform these tasks. These teaching operations are indispensable, but they are time consuming and the positioning accuracy of the robot can be poor.

The position of an articulated robot is calculated from the joint angles, which are detected by rotary encoders; errors here adversely affect accuracy. This problem can be addressed with a board that drives the servo motors of the robot. We have used an Arduino Diecimila board.

This is a microcontroller board based on the ATmega168. It has 14 digital input/output pins (6 of which can be used as PWM outputs), a USB connection for communication, an ICSP header, a crystal oscillator and a power jack. To start the board, we simply connect it to a computer with a USB cable, or power it from an AC-to-DC adapter or a battery.

The servo control loops can rotate the motors from 0 to 180 degrees, which gives sufficient accuracy.


Figure 1 Arduino board

Different methods have been used for interacting with computers over time: from long commands to the mouse, from voice recognition to eye tracking, and from punch cards to switchboards.

Research on eye tracking has been conducted for almost 125 years [13]. Dodge and Cline (1901) used a method of light reflection from the cornea to develop an eye tracking technique.

It was a precise, non-invasive technique. They recorded the horizontal position of the eye onto a photographic plate, and the method required the participant's head to be motionless.

Later on, Judd, McAllister and Steel (1905) recorded a "moving picture" in two dimensions. They recorded a small white speck applied to the eye instead of the corneal reflection. Further research on corneal reflection and motion picture techniques was carried out during the twentieth century [4].

Head-mounted eye tracking [3], [5], [6] is also used on a large scale.

Figure 2 Head mounted Eyetracking system


Yarbus, a Russian psychologist, also did practical work in gaze tracking. In the 1950s and 1960s he studied the saccadic exploration of complex images, recording eye movements while observers viewed natural scenes [7]. The purpose of much related work has been to make technical improvements that enhance the accuracy and correctness of eye trackers [8].

Researchers in the 1970s made technical improvements to increase accuracy and precision. This research used multiple eye reflections to dissociate eye rotations from movement of the head (Cornsweet and Crane, 1973), which boosted tracking precision and provided the basis for greater freedom of participant movement [4].

With the increasing use of computers in the 1980s, there was increasing interest in gaze tracking and in designing new interfaces and forms of interaction with the computer [9], [10], [11]. Levine was another pioneer, who exposed the potential of eye tracking for interaction with applications [12].

In our project we have developed different interfaces for controlling an articulated robot arm. The development was done in Java. We have controlled these interfaces with both a mouse and an eye tracker device (from Tobii). These interfaces send data to an Arduino board that controls the robot; we send the joint number and angles specified on the interface to the robot.

Figure 3 Interfaces and Robot

As Figure 3 shows, there are five rectangles, namely JAW, BASE and JOINTS (1, 2, 3), used to control the five different joints of the robot arm. Each joint of our robot arm can move from 0 to 180 degrees.

When we click on JAW, the gripper of the robot moves; Joints 1, 2, 3 and the base work similarly. One notable detail is that each rectangle has two parts: when we click on the upper part, the joint moves from 0 to 90 degrees, and when we click on the lower part, the joint moves from 90 to 180 degrees.


2 BACKGROUND

2.1 What is Eye Tracking Technology?

Eye trackers estimate the direction of a user's gaze. Early eye tracking systems are surveyed by Young and Sheena [32], and more recent ones by Duchowski [33].

Eye tracking systems can track the eye-in-head gaze direction or the combined eye-head gaze direction. These systems can be intrusive (physical contact between the sensor apparatus and the user) [34] [35] [36] [37] or non-intrusive (camera-based techniques) [37] [38] [39] [40] [41] [42] [43].

Research on eye movements has been pursued for almost 125 years [13]. Dodge and Cline (1901) used the method of light reflection from the cornea to develop eye tracking.

It was a precise, non-invasive technique. They recorded the horizontal position of the eye onto a photographic plate, which required the participant's head to be motionless.

Later on, Judd, McAllister and Steel (1905) recorded a "moving picture" in two dimensions. They recorded a small white speck applied to the eye instead of the corneal reflection. Further research on corneal reflection and motion picture techniques was carried out during the twentieth century (see Mackworth & Mackworth, 1958, for a review).

In the 1930s, Miles Tinker and his colleagues began to apply photographic techniques to study eye movements in reading (see Tinker, 1963, for a thorough review of this work). They observed effects on reading speed and eye movement patterns when changing print size, page layout and so on. In 1947, Paul Fitts and his colleagues used motion picture cameras to observe pilots' eye movements in the cockpit during landing [4].

A major innovation in eye tracking was the development of head-mounted eye tracking systems ([3], [5], [6]), which are still widely used. Yarbus, a Russian psychologist, made valuable contributions to gaze tracking, exploring the viewing of complex images in the 1950s and 1960s. He recorded the eye movements of observers while they viewed different scenes [7].

Electro-oculography is a method based on the electric potentials around the eyes that change as the eyes rotate. The position of the eye can be estimated by recording the skin potential around the eye. Since this technique requires close contact with electrodes, it is wearisome [14].

Three different approaches exist for detection and measurement of eye gaze:

• The glint

• 3D model

• Local linear map network

In the glint approach, the angle of the visual axis and the fixation point are calculated by tracking the pupil and the light reflection point from the cornea (the glint).

(A study of a similar system can be found in [15].)


Figure 4 Eye Tracking System of early age

In the 3D-model approach, the mouth, pupils and face can be detected using serialized image processing. With a 3D model [17], the face orientation is evaluated and the eye images can be used to estimate the gaze direction (recent research can be found in [16]).

Figure 5 Head model for detection of eye and gaze orientation

The local linear map network approach is simpler [18]; it recognizes the head orientation of a user with the help of infrared light.

2.2 Eye and gaze tracking applications

Reviewing the history, we find that one of the first applications of eye tracking is user interface design. We can design devices, cockpits, cars, etc. by tracking where people are looking [31].

A number of publications can be found in this field [19] [20] [21]. This whole body of work depends on the eye-mind hypothesis.

The eye-mind hypothesis states that what a person is looking at indicates the thought on top of the stack of cognitive processes. Applying the eye-mind hypothesis is meaningful and can visibly improve interface design [22] [31]. According to Jacob, the basic reason behind the slow appearance of gaze-based interfaces [23] [31] is the Midas touch problem: the application must not respond every time the target of the gaze changes.


We also consider eye and gaze tracking in game environments, although it was reported in 2005 [24] [31] that no information existed about eye tracking and computer games. Nevertheless, it is very motivating to learn from previous work about the foundations of players' eye behaviour [25] [31].

Research reveals that playing action games is a constructive means of rehabilitation for visually impaired people [26]. Eye or gaze tracking is not a standard input for commercial games. The first contribution to address this problem was made by [27] [31], in which a system was developed for rehabilitation of people with eye motion disabilities; a joystick was used to play a game in a multimedia environment.

Figure 6 User points to objects

The Laurent Itti Lab has also made comprehensive contributions relating realistic avatar eyes to a neurobiological model of visual attention [28] [31].

This team has worked in the field of interactive visual environments and the mechanisms of gaze direction [29], and S. Yan and N. Seif El-Nasr [30] [31] have reinforced the work by investigating visual attention patterns within 3D games.

2.2.1 Midas touch, Dwell time & WPF Custom controls

"How do you make a distinction between users glancing at objects and fixations with the intention to start a function‖? This is the well known problem usually referred to as the

"Midas touch problem". The interfaces that depends on gaze as input should carry such a interaction and must be able of making difference between glances (just looking around) and fixations ( objective to start a function). The most common solution to this problem is

"dwell-time" where you can make active the functions by prolonging fixation on an icon, button or image. Its range varies from 4 to 500 ms. the problem of people who is suffering from Amyotrophic lateral sclerosis (ALS) can be solved by this way.

Amyotrophic lateral sclerosis (ALS) is severe type of neurological disease which affects nerve cells and causes muscular weakness.

But dwell time has also some problems, e.g. prolonged fixation results in slow interaction because user has to sit through the dwell-time. When the user makes fixations on any object an event is triggered. This event creates new threads and it decide that if there is enough fixations on area then continue it, otherwise finish this thread. At the end it measures if fixations have resided within the area for more than 70% of the time, in that case, it activates the function.
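The following is a minimal sketch, in Java, of the dwell-time check described above. It is our own illustration rather than the thesis code: the class and method names, the fixed sample rate and the window handling are assumptions; only the idea of activating a region when at least 70% of the gaze samples in a dwell window fall inside it is taken from the text.

    import java.awt.Rectangle;

    // Minimal dwell-time detector: collects gaze samples for one dwell window
    // and reports activation only if at least 70% of them fall inside the region.
    public class DwellDetector {
        private final Rectangle region;      // active region being watched
        private final int samplesPerWindow;  // number of samples in one dwell window
        private int seen = 0;                // samples received so far in this window
        private int inside = 0;              // samples that hit the region

        public DwellDetector(Rectangle region, int sampleRateHz, int dwellMs) {
            this.region = region;
            this.samplesPerWindow = Math.max(1, sampleRateHz * dwellMs / 1000);
        }

        // Feed one gaze sample; returns true when the dwell window completes
        // with enough fixation inside the region to trigger the command.
        public boolean addSample(int x, int y) {
            seen++;
            if (region.contains(x, y)) {
                inside++;
            }
            if (seen < samplesPerWindow) {
                return false;
            }
            boolean activate = inside >= 0.7 * samplesPerWindow;
            seen = 0;       // start a new window
            inside = 0;
            return activate;
        }
    }

With an eye tracker delivering samples at 60 Hz, for example, a 500 ms dwell window corresponds to 30 samples, of which at least 21 must fall inside the region before the associated command is executed.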

Windows Presentation Foundation (WPF) is another advance in the development of user interfaces. Microsoft Expression Blend is a tool that eases the graphical design of components. Controls can be broken out into different projects and included as a DLL file [52].


2.3 Tobii Technology System

Tobii Technology produces camera-based eye tracker systems in which the camera and light source are permanently affixed to a monitor. The T/X-series and 50-series eye trackers use infrared pupil-centre/corneal-reflection eye tracking [37] [44] and respond via a Tobii Eye Tracker (TET) server.

Figure 7 Tobii T60 Eyetracking

The TET server performs image processing on the incoming data from the video camera and maps image parameters to screen coordinates. The server can run on a separate machine from the machine that receives the tracker reports, and it communicates over TCP/IP [37].

2.3.1 Model T60

The T60 performs binocular tracking at 60 Hz, and its head movement tolerance is a 44x22x30 cm volume centred 70 cm from the camera. The T60 combines both bright and dark pupil tracking. The latency between the camera image and the gaze data is 33 ms [37] [44].

The T60 has a 17-inch LCD monitor. The system has an accuracy of 0.5 degrees, with less than 0.3 degrees of drift over time and less than 1 degree of error due to head movement [37] [44].

During tracking, the Tobii eye tracker uses infrared diodes that produce reflection patterns on the corneas of the user's eyes; once its sensors come into action, it collects these reflection patterns and other visual information. Complex image processing algorithms detect features of the corneal reflection patterns, and after some mathematical calculations the system gives the gaze point on the screen, which shows where the user is looking [45].


2.4 Eyetracking Overview /Methodology

2.4.1 Eyetracking

The term eye tracking is used here to mean the estimation of the direction of the user's gaze.

In most cases, estimating the gaze direction means identifying the objects upon which the gaze falls. With a standard computer set-up, the eye tracker identifies the object of gaze by coordinates on the screen. Finding the gaze direction in 3D virtual worlds is difficult, and becomes harder still when interaction with the real world is involved [33].

2.4.2 Eyetracking Methodology

There are three broad categories of eye movement measurement methodologies:

1. Electro-Oculography
2. Scleral Contact Lens
3. Video/Photo-Oculography

Electro-Oculography

This technique is based on recording differences in the electric potential of the skin surrounding the ocular cavity. Point of regard measurement, in contrast, is typically based on the corneal reflection; contact lenses were developed in the 1950s to improve accuracy, with devices such as small mirrors and coils of wire mounted on the lens, and other approaches rely on visible features of the eye, e.g. the pupil, the iris-sclera boundary, or the corneal reflection of a light source. The potentials recorded in electro-oculography are in the range of 15-200 µV, with sensitivities of the order of 20 µV per degree of eye movement. The technique measures eye movement relative to head movement and is not suitable for point of regard measurement on its own [33].

Figure 8 Sensors attached at the skin around the eyes


Scleral Contact Lens

In this technique a contact lens is worn on the eye, with an optical or mechanical reference object mounted on it. On modern lenses a mounting stalk is attached, and the lens extends over the sclera and cornea. Reflecting phosphors, line diagrams or wire coils can be attached to the mounting stalk. This is an intrusive method, with special considerations: practice and care are required to insert the lens, and the user can feel discomfort while wearing it. It is not suitable for point of regard measurement. The scleral search coil is made of a medical-grade silicone rubber suction ring with a shape suitable for sticking to the limbus of the eye [33].

The three figures below depict this process [65].

Figure 9 Scleral Contact Lens Process step 1

Figure 10 Scleral Contact Lens Process step 2


Figure 11 Scleral Contact Lens Process step 3

Video/Photo-Oculography

These techniques measure distinguishable features of the eye under rotation, e.g. the position of the limbus, the shape of the pupil, and the corneal reflection of a light source. They are not by themselves suitable for point of regard measurement. They involve visual inspection of recorded eye movements; the measurement of ocular features can be automated, or visual assessment can be performed from video tape frame by frame, which can be tedious and prone to error [33].

In the picture you can see the result of intelligent eye recognition: the software automatically locates the pupil and outlines the region of interest [33].

Figure 12 showing position of eye in video oculography

2.5 Motivation

Most eye tracking systems are based on video pupil detection and the reflection of an infrared LED. One reason is that video cameras are cheap and the price of an LED is minor. Many devices such as mobile phones and laptops come with built-in cameras, and new machines with powerful processors can process the video stream necessary for eye tracking. New head-tracking systems are also video-based. Today many people suffer from overstressing particular parts of their hands, because most interaction is done with the keyboard and mouse; this can result in carpal tunnel syndrome. The eyes are a good candidate for input because they move anyway when interacting with computers. New eye tracking based interface techniques could exploit these eye movements and give users some benefit in terms of computer interaction [65].

2.6 Robot

Robots were introduced for the first time in 1921 by the Czech playwright Karel Capek, who portrayed them as resembling human beings in all aspects except tiredness. Robot is a Czech word meaning "worker". Once the word robot came into existence, it had to be decided which form a robot should take and which functions it has to perform [51].

2.6.1 Articulated Robot

It is a type of robot which has a series of joints resembling a human arm.

Figure 13 Articulated robot arm

The robot used for this project has 5 moving joints.

2.7 Related Work

This is the fourth era of eye tracking research [46], and research is in progress to design faster, more trustworthy and easier to use eye tracking systems [47], [48]. Consequently, the realm of eye tracking applications is growing swiftly, including both problem-solving and interactive applications. Application developers use input from the eyes in many graphical user interfaces for human-computer interaction (HCI) as part of the development of interactive applications [49]. So far, however, only a small number of attempts at eye tracking for human-robot interaction exist [50].

2.8 Proposed Work

We have used the Tobii T60 eye tracker. We used its built-in APIs to take eye input in the form of X-axis and Y-axis coordinates. We then pass these parameters to a Java application that controls the robot through an Arduino board; the robot itself is driven by the Java application. The robot can rotate from 0 to 180 degrees and has 5 different joints, which are controlled through pins 2-10 on the Arduino board.

The gripper of the robot can grip an object.


3 PROBLEM DEFINITION

The main focus of this study is the control of the robot. The first step is to control the robot with a mouse through some useful interfaces; the same process is then carried out using eye tracking.

3.1 Aims and Objectives

The aim of this thesis is to build an interface, in the form of a prototype, between an articulated robot and eye tracking. The aim will be met by addressing the following objectives.

• To control the articulated robot using gaze interaction by eye tracking.

• To find out the most effective interface to support this driving function.

• To explore eye tracking in handling articulated robotic functionality.

• To gain a deeper understanding of the degree of automation that may be applied to such a system, and how to balance or transfer between automated and manual functions.

3.2 Research Questions

This study has three research questions, based on the aims and objectives above.

1. Can an articulated robot be controlled by gaze interaction?

2. What kind of interface can be most effective for gaze interaction?

3. How does the best gaze-based interface compare with the mouse-based interface?

3.3 Expected Outcomes

On the basis of the research questions, the literature review and the experimental results, the following outcomes are expected.

1. Software for interfacing a GUI with robot control operators.

2. Software link to a Tobii T60 eye tracking system, to support gaze-directed robot interaction.

3. Several alternative interaction design implementations for gaze-directed robot control.

4. An evaluation of the different interaction models implemented, and comparison with a mouse-driven interface.

3.4 The Structure of the Thesis

This section describes the structure of the thesis by chapter. The contents of each chapter are shown in the following table.


No of Chapter | Title of the Chapter | Description of the Chapter

2 | Background | This chapter gives the background of eye tracking and the robot.

3 | Problem Definition | This chapter presents the problem, research questions and objectives.

4 | Research Methodology | This chapter highlights the research methodology used.

5 | Theoretical Work / Study | This chapter presents the theoretical study and systematic review.

6 | Experiment Result and Analysis | This chapter describes the authors' analysis after conducting the experiments and research; each research question answered through the experiments is analysed.

7 | Conclusion / Future Work | This chapter summarises the research work, the conclusions and the outcomes of the experimental results.

8 | References | This chapter lists all references used in this research.

9 | Appendix |

Table 1 Structure of Thesis


4 RESEARCH METHODOLOGY

This chapter presents different approaches and methods used in our research work.

Research methodology defines the area of research to be addressed, what the process of the research is, and how it leads to success. We adopted a mixed methodology (qualitative and quantitative) to complete the thesis and research work. Quantitative research methods are capable of providing numerically precise answers based on facts and figures [53].

"Mixed method research in which the researcher uses qualitative research paradigm for one phase and the quantitative research paradigm for a different phase of the study" [54].

In this thesis most emphasis is on the quantitative research methodology. In the first part of the thesis we conduct a literature review. After completing the literature review we implement a prototype application framework providing a GUI for robot control. The study of robotics control and the use of the eye tracker as a gaze input helped us design the graphical user interface. Specific interface designs are then completed, and a series of experiments is conducted to evaluate those interface designs and to compare them with a mouse-based interface. Both interfaces are evaluated first using a mouse and then using eye tracking.

We designed a closed questionnaire to evaluate the interface. This is discussed in 4.7.


Figure 14 showing the structure of the research methodology: a mixed research methodology combining the qualitative and quantitative paradigms, with experiments, questionnaires and empirical/statistical results


4.1 Literature Review

The role of a literature review is very important in establishing the state of the art, what existing research there is, and what other researchers say about a specific topic. A deep literature review was done to survey the existing material and to find out how far our research questions have already been answered. Our main focus was to find research papers, articles, books, journals and websites related to our research area. For this, different search engines and databases were used: Google, Engineering Village, IEEE Xplore, the ACM Digital Library, the BTH Library and Zotero. We also used some websites to get information about our research area. The literature review process helps in understanding the system, how it functions, and where more research is needed.

4.2 Research Questions/ Study:

To conduct research in any field, we need to develop research questions that maximise the space of possible solutions. Research questions are necessary in our research study because we are developing a new interface design for a mouse and an eye tracking system in order to control an articulated robot. Answering the research questions means designing an interface for the said application. Using a hypothesis is irrelevant at this stage, but once a prototype is developed and functioning we can form hypotheses about the interface design and check its efficiency, correctness/accuracy, simplicity, user friendliness, etc., in terms of task completion during an experiment (usability study).

Our thesis contains three important research questions; the first and second are primary while the third is secondary. The significance of the research questions is assessed on the basis of the following study domains.

4.3 Research Design

Figure 15 Process overview of Research work


4.4 Related Work

This is the fourth era of eye tracking research [46], and work is under way to develop faster, more reliable and easier to use eye tracking systems [47], [48]. As a result, the area of eye tracking applications is growing rapidly, including both problem-solving and interactive applications. Gaze input is being used in many demanding graphical user interfaces for HCI as part of the development of interactive applications [49]. Even so, only a small number of examples of eye tracking for human-robot interaction exist [50].

Human-robot interaction is a multidisciplinary field of study [55]. Applications of eye tracking have started to grow in both diagnostic and interactive fields, and eye input has been used in many human-computer interaction user interfaces [58].

There have been some attempts at developing interfaces for human-robot interaction using eye tracking [58]. A number of ways to determine gaze direction have been developed in the last 50 years, such as the reflection of light, electric skin potential, and contact lenses.

Mason was the first to propose estimating eye gaze using an IR LED. One reported system is a robotic arm controlled by an experimental eye tracking algorithm. A wheelchair control interface was also designed that uses human eye gaze; this interface consisted of five non-active regions and four active regions. Another work very close to our proposed graphical user interface was done by [56]. That graphical user interface was based on buttons; it did not have enough accuracy for small menu buttons, so the authors decided to use large menu buttons. Every button was responsible for controlling a specific joint, and together the buttons commanded all the joints of the robot. The user simply looks at the button that controls the desired joint. The authors of [56] used a two-window system, one window for the control system and the other to show the state of the robot; the divided interface was referred to as the feedback region and the commanding region. A robotic toy was controlled on a similar basis [57].

4.5 Interface Design

The major part of our thesis work is to develop an interface design for controlling the robot that is suitable and user friendly for people with motor disabilities. The final output of the research is a proposed user interface design for controlling an articulated robot arm using a mouse and an eye tracking system, together with a comparison between the two. The design draws on the information obtained from the following modules:

• Experiment results

• Research study guiding principles

An articulated robot is a type of robot which consists of rotary joints through which it reaches its workspace. These joints are attached to each other so that they support one another.


Figure 16 Articulated robot arm showing joints

4.5.1 Robot Description

The articulated robot arm used in our thesis project consists of five joints, each controlled by a servo motor, as shown in Figure 16 above.

4.5.2 Arduino Micro controller Board for Robot

A method is required to control the robot. In our case, we used an Arduino Diecimila microcontroller board based on the ATmega168. The board has 14 digital input/output pins, 6 of which can be used as Pulse Width Modulation (PWM) outputs, and 6 analog inputs, along with a 16 MHz crystal oscillator, a USB connection, a power jack, an ICSP header and a reset button. We simply use a USB cable to connect it to the computer.

"Diecimila" is an Italian word meaning 10,000. This was the latest version of the Arduino USB board series at the time, and it has many advantages over the previous versions.

Here is a summary of the board:

Microcontroller ATmega168

Operating Voltage 5V

Input Voltage (recommended) 7-12 V

Input Voltage (limits) 6-20 V

Digital I/O Pins 14 (of which 6 provide PWM output)

DC Current per I/O Pin 40 mA

DC Current for 3.3V Pin 50 mA

Flash Memory 16 KB (of which 2 KB used by boot loader)

SRAM 1 KB

EEPROM 512 bytes

Clock Speed 16 MHz

Table 2 Summary of Arduino board


Power

There are two ways to power this board: through the USB connection, or from a non-USB source (an AC-to-DC adapter or a battery). In our case, we used the USB connection to power the microcontroller board. The PWR_SEL jumper selects the power source.

Memory

The ATmega168 on the Arduino board has 16 KB of flash memory for storing code, of which 2 KB is used by the boot loader. It also has 1 KB of SRAM and 512 bytes of EEPROM.

Input and Output

The board has 14 digital pins that can be used for input or output through the pinMode(), digitalWrite() and digitalRead() functions. These pins operate at 5 volts. Each pin can source or receive a maximum of 40 mA and has an internal pull-up resistor of 20-50 kOhm. Some of the digital pins have special functions. The board also has 6 analog input pins, each with a resolution of 10 bits.

Digital Pins

Serial 0 (RX) and 1 (TX): these pins are used for receiving (RX) and sending (TX) serial data.

External interrupts 2 and 3: these pins can trigger an interrupt on a low value or on a rising or falling edge.

PWM 3, 5, 6, 9, 10 and 11: these six pins are used for pulse width modulation output.

SPI 10 (SS), 11 (MOSI), 12 (MISO) and 13 (SCK): these pins are used to handle SPI communication.

LED 13: a built-in LED is connected to digital pin 13; when the pin value is HIGH the LED is on, and when it is LOW the LED is off.

Analog Pins

I2C 4 (SDA) and 5 (SCL): these pins are used to handle TWI (I2C) communication.

Reset: this pin is used to add a reset button.


Figure 17 Arduino board configuration [61]

4.5.3 Pulse width Modulation (PWM)

PWM is a technique used to provide intermediate amounts of power between fully off and fully on. Different average voltages can be obtained by varying the proportion of time the signal is on versus the time it is off. The duration of the on time is called the pulse width, and the proportion of the on time to the whole period is called the duty cycle of the pulse. A 100% duty cycle means the pulse is fully on. The microcontroller board generates PWM output voltages between 0 and 5 V. A sample PWM signal with a frequency of 50 Hz is shown in the figure below [61].

Figure 18 showing graph for pulse width modulation [61]
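As a short worked example (our own illustration, not taken from the thesis): at 50 Hz the period is 1/50 s = 20 ms, so a 25% duty cycle corresponds to 5 ms on and 15 ms off, and the average output is 0.25 x 5 V = 1.25 V. Note that hobby servo motors of the kind used on the robot arm are typically commanded with pulses of roughly 1-2 ms repeated every 20 ms, where it is the pulse width rather than the average voltage that encodes the target angle between 0 and 180 degrees.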


4.6 Validation

Validation of the research involves evaluating the effectiveness of the technology solution, i.e. the use of gaze to control the articulated robot via the proposed user interface design. Validation is accomplished using a questionnaire given to the students who tested the system using both a mouse and eye tracking. On the basis of the answers collected from the students we validate our results.

4.7 Questionnaires

Questionnaires are a well-established technique for collecting data and users' opinions. They are quite similar to interviews and are of two types [63]:

• Closed questionnaires

• Open questionnaires

4.7.1 Closed Questionnaires

A questionnaire which consists of closed-ended questions is known as a closed questionnaire. These questions have predefined answers, e.g. "YES", "NO" or multiple choice options [63].

4.7.2 Open Questionnaires

Open questionnaires permit the user to respond in their own words [63]. In open-ended questions, different users provide different points of view and different styles of thinking according to their own choice and will.

We used a forced choice questionnaire, a type of closed questionnaire, to evaluate the results of our experiment; it is shown in Appendix 9.1.

4.7.3 Forced Choice Questionnaire

There are many ways of forming survey questions that help to gather more actionable data, and the forced choice questionnaire is one type of closed questionnaire.

Forced choice questions make the respondent commit: a definite opinion is expressed by selecting a response option. When this type of questionnaire is formed, the neutral and "do not know" options are removed. The questionnaire is designed to force respondents to express their attitude to the question, and the questions are usually written in agree/disagree form. Most survey research studies show that excluding the "do not know" and neutral options does not alter the percentages across the rest of the response scale [66], [67].


5 THEORETICAL WORK / STUDY

5.1 What does Interface mean?

A user interface is defined as the means of communication between the user and the system.

The purpose of the interface is to create ease of use and accessibility in human-computer interaction. Interface design rests on seven different principles. Many interfaces exist, and they are built to create particular user experiences. There are many types of interfaces, with features such as intelligence and adaptivity, and some interfaces are specific to a form of interaction, such as command-line or graphical multimedia interfaces. The types of interface most relevant to our thesis are multimodal and robotic interfaces [62].

5.2 Multimodal Interfaces.

Multimodal interfaces are multimedia interfaces based on a "more is more" principle. They are controlled through different modalities, i.e. touch, sight, sound and speech. Interface techniques that have been combined for this purpose include speech and gesture, eye input, and pen input and speech. Multimodal interfaces can support more flexible, efficient and expressive means of human-computer interaction that are more akin to the multimodal human experience of the physical world [62].

5.3 Robotic Interfaces

Robots have been with us for some time, most notably as characters in science fiction movies, but also playing an important role as part of manufacturing assembly lines, as remote investigators of hazardous locations such as nuclear power plants and bomb disposal and as search and rescue helpers in disasters [62].

5.3.1 Console Interface

A console interface allows a human to control and navigate robots in remote environments using a combination of joystick and keyboard. This combination of joystick and keyboard controls works together with camera and sensor based interaction. The focus has been on designing interfaces that enable users to effectively steer and move a remote robot with the aid of live video and dynamic maps. Domestic robots are appearing in our homes as helpers, and robots are being developed to help the elderly and disabled with certain activities, such as picking up and placing objects [62].

5.4 User Interface Design Process

User interface development involves different phases, which together define the interface [30]. Each interface depends on its type of application. The phases can be described as follows.


5.4.1 Requirements of Functionality

Functionality requirements include requirements for the software, the whole system and its components. User input and output behaviour is also specified in this part, together with processing, relevant calculations and technical details of the system. "What is the system expected to accomplish?" is the founding question answered by the functional requirements.

5.4.2 User Analysis:

User analysis is an essential aspect of the success of an interface and concerns analysing the following problems.

When the product is totally dependent on the user, the user's role becomes very important, while if the product is largely user independent, the complexity of the user analysis is reduced.

5.4.3 Prototype

Prototyping is the development of an entity, with a logical or physical existence, based on an idea or concept. A prototype is not the complete product; it is developed at an intermediate stage of the system development life cycle. A prototype helps to confirm the "dos" and "don'ts" regarding use, purpose, functionality, user expectations and usability tests. An interface prototype allows users to experience the interface directly [63]. It is vital to include prototyping in product development in order to analyse how the conceptual model works in reality and what its limitations are.

5.4.4 Usability Testing

The purpose of usability testing is to verify and validate the product through user tests. The main objective of usability testing is to achieve user satisfaction and confidence. Usability evaluation is carried out through simple methods, i.e. questionnaires, video recording, code testing and provisioning [63].

5.4.5 Graphical User Interface Design (GUI)

The graphical user interface is mostly described as the better option compared with the command line interface in terms of usability. GUI designers give users more control, functionality and design options, adapt the length and size of the screen, and appeal to the users' understanding and skills in order to support user adaptation and learning [63].

5.5 Programming Language used

We used Java to control the articulated robot. There were several motivations behind choosing the Java platform.


5.5.1 Java is Simple

Java has a rich feature set, while syntactic sugar and unnecessary features have been removed. The Java language specification is only about eighty pages long. Java has more functionality than C, yet because the language is small a learner can become fluent quickly.

5.5.2 Java is Platform Independent

Byte code is an intermediate form produced when Java code is compiled. A Java interpreter (a special program) reads the byte code and executes the corresponding native machine instructions. Byte code remains the same on all platforms; to run a program on a new platform, you run it with a compatible interpreter.

5.5.3 Java is Safe

Java provides secure execution of code across a network even when the source is untrusted and possibly malicious. Expected and unexpected errors are handled by Java's robust exception handling mechanisms.

5.5.4 Java is High Performance

Java byte code can be compiled very quickly using a just-in-time compiler. Several companies are also doing research on native-machine-architecture compilers for Java.

5.5.5 Java is Multi-Threaded

Java is multi-threaded: a single Java program can have many different threads executing independently and continuously, and different threads can get a fair share of CPU time with little effort [64].


6 EXPERIMENT RESULT / ANALYSIS

We developed two different interfaces.

6.1 Interface 1 Using Eyetracking

The eye tracker based interface has three stages.

6.1.1 First Stage

The first stage is the implementation of eye tracking. The Tobii Eye Tracker Components API (TetComp) is a programming environment that provides real-time, high-precision gaze data from Tobii eye tracker hardware, at a high level of abstraction.

The eye tracker set-up consists of a client and a (local) server, with a monitor used to present the interface. As the user moves their eyes, the sensors in the eye tracker monitor record the eye movements. This data (the recorded eye movements) is sent to the server, which determines the X-axis and Y-axis position of the eyes, converts it into floating point numbers, and writes this data to a text file.

Figure 19 Flow of data through the stages of the experiment control structure using eye tracking: the Tobii Eye Tracker Components API (TetComp) captures the X and Y axis positions (stage 1), the Java application interface sends data to control the robot arm (stage 2), the Arduino board (IDE + memory programming) sends instructions to control the joints (stage 3), and the robot arm performs the specific task (stage 4)

6.1.2 Second Stage

In the second stage, a Java application reads the data from the text file and converts it into integer form. The Java application uses this data for two purposes.

First, it uses the Java drawing API (drawing a circle) to show the position of the eyes at different positions and at different intervals of time (as shown in Figure 20), on the basis of the X-axis and Y-axis data from the text file. Secondly, it determines the movement of the joints of the robot.


Figure 20 Graphical movements of the eyes of users (the red circle shows the movement of the eyes)

Because the robot has five different joints (figure 19), we have drawn five different rectangles on the interface (figure 23). Each rectangle is associated with a different joint of the robot, and the name of a rectangle and the corresponding joint of the robot are the same (Base, Gripper, Joint 1, 2 and 3), as shown in figures 19 and 23.

We have programmed the application in such a way that if the eyes of the user focus on the upper part of a rectangle, the relevant joint moves from 0 to 90 degrees. Likewise, if the eyes of the user focus on the lower part of a rectangle, the relevant joint moves from 90 to 180 degrees.

The Java application maintains a byte array b[] of size 3. The first byte b[0] holds the number of the joint to be moved (0 for Base, 1 for Gripper, and 2, 3, 4 for Joints 1, 2, 3). When the eyes of the user focus on one of the 5 rectangles, the corresponding digit (0 to 4) is assigned to b[0]. Similarly, if the eyes of the user focus on the upper part of a rectangle, the start angle (0) is assigned to b[1] and the end angle (90) is assigned to b[2]. If the eyes of the user focus on the lower part of a rectangle, the start angle (90) is assigned to b[1] and the end angle (180) is assigned to b[2]. This byte array is sent to the memory of the Arduino board.
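The following sketch outlines this second stage in Java. It is our own minimal illustration rather than the thesis code: the rectangle coordinates and file handling are placeholder assumptions, and the serial port is represented by an OutputStream assumed to be opened elsewhere (for example with a serial library such as RXTX); only the three-byte joint/start-angle/end-angle protocol is taken from the description above.

    import java.awt.Rectangle;
    import java.io.OutputStream;
    import java.util.Scanner;

    public class GazeToRobot {

        // Active regions of the interface; the coordinates here are placeholders.
        // Index 0 = Base, 1 = Gripper, 2-4 = Joints 1-3, matching b[0].
        static final Rectangle[] REGIONS = {
            new Rectangle(0, 0, 200, 400),     // Base
            new Rectangle(200, 0, 200, 400),   // Gripper
            new Rectangle(400, 0, 200, 400),   // Joint 1
            new Rectangle(600, 0, 200, 400),   // Joint 2
            new Rectangle(800, 0, 200, 400)    // Joint 3
        };

        // gazeFile reads the X/Y pairs written by stage 1; serialOut is assumed to
        // be the output stream of the Arduino's serial port, opened elsewhere.
        static void processGazeFile(Scanner gazeFile, OutputStream serialOut) throws Exception {
            while (gazeFile.hasNextDouble()) {
                int x = (int) gazeFile.nextDouble();       // gaze X position
                if (!gazeFile.hasNextDouble()) {
                    break;                                 // incomplete pair at end of file
                }
                int y = (int) gazeFile.nextDouble();       // gaze Y position
                for (int joint = 0; joint < REGIONS.length; joint++) {
                    Rectangle r = REGIONS[joint];
                    if (!r.contains(x, y)) {
                        continue;                          // inactive region: no command
                    }
                    boolean upperHalf = y < r.y + r.height / 2;
                    byte[] b = new byte[3];
                    b[0] = (byte) joint;                   // joint number
                    b[1] = (byte) (upperHalf ? 0 : 90);    // start angle
                    b[2] = (byte) (upperHalf ? 90 : 180);  // end angle
                    serialOut.write(b);                    // send the command to the board
                    serialOut.flush();
                }
            }
        }
    }

In the real application this hit test would be combined with a dwell-time check of the kind sketched in section 2.2.1, so that a command is only sent after the gaze has rested in a region for the required time.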

6.1.3 Third Stage

All input data is stored in a "positioning controller", that is, the memory for a specific servo.

Figure 21 Specification of Arduino board

As the picture shows, the Arduino board has digital pins, and the different joints of the robot are powered through these pins. We control these pins as needed from the Java application and the Arduino board IDE. In our case we drive a pin on the basis of the data provided in b[0], using serial communication.

We have also written code in the IDE of the Arduino board (the third stage). The program runs a while() loop and waits for data from the Java application on the serial port. When it receives data from the Java application (stage 2), the data has the form of a byte array b[] of size 3. The first byte gives the number of the joint to be moved; a programming instruction sends a PWM signal to the corresponding pin of the Arduino board (through the attach(pin number) method), which drives the specific joint. The second byte b[1] and the third byte b[2] specify the first and last angles to which the specific joint is to be moved, and the program uses this data to perform the movement.



6.2 2nd Interface Using Eyetracking

Its first and third stages are similar to those of the first interface.

Figure 22 2nd interface

The difference from the first interface is that Figure 22 shows a picture with rectangles marking the active regions of the interface. Because the robot has five different joints, we have drawn five different rectangles on the interface; each rectangle is associated with a different joint of the robot, and the name of the rectangle and the joint of the robot are the same (Base, Gripper, Joint 1, 2, 3).

We have programmed the application in such a way that if the eyes of the user dwell on the upper part of a rectangle, the relevant joint moves from 0 to 90 degrees. Likewise, if the eyes of the user dwell on the lower part of a rectangle, the relevant joint moves from 90 to 180 degrees.

The Java application maintains a byte array b[] of size 3. The first byte b[0] holds the joint number (0 for Base, 1 for Gripper, and 2, 3, 4 for Joints 1, 2, 3). When the eyes of the user dwell on the rectangle of one of the 5 joints, the corresponding digit (0 to 4) is assigned to b[0]. Similarly, if the eyes of the user dwell on the upper part of the rectangle, the start angle (0) is assigned to b[1] and the end angle (90) is assigned to b[2]. If the eyes of the user dwell on the lower part of the rectangle, the start angle (90) is assigned to b[1] and the end angle (180) is assigned to b[2].


6.3 Working of Interface 1 Using a Mouse

This interface has three stages. Its second stage is similar to the third stage of the eye tracker based interface, so we do not describe it again.

Figure 23 Mouse based interface: the experiment control structure using the mouse, in which the Java application interface sends data to control the robot arm (stage 1), the Arduino board (IDE + memory programming) sends instructions to control the joints (stage 2), and the robot arm performs the specific task (stage 3)

Its first stage is different, so it is described here. The Java application draws the following graphical user interface, using the methods shown below.

Figure 24 Mouse Based GUI and Robot.

g.setColor(Color.RED);             // set the drawing colour

g.drawRect(x, y, width, height);   // draw one rectangle of the interface

g.setColor(color);                 // switch to another colour

g.drawLine(x1, y1, x2, y2);        // draw a line of the interface
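To show how such a mouse-driven interface can map a click to the same three-byte command used by the eye tracking interface, here is a minimal sketch in Java. It is our own illustration under assumptions: the rectangle layout is a placeholder and sendToArduino() stands in for the serial write to the board described in the earlier sections.

    import java.awt.Rectangle;
    import java.awt.event.MouseAdapter;
    import java.awt.event.MouseEvent;

    // Mouse listener for the first-stage GUI: a click on the upper half of a
    // rectangle requests a 0-90 degree move, a click on the lower half 90-180.
    public class JointClickListener extends MouseAdapter {

        private final Rectangle[] regions;   // one rectangle per joint, index = joint number

        public JointClickListener(Rectangle[] regions) {
            this.regions = regions;
        }

        @Override
        public void mouseClicked(MouseEvent e) {
            for (int joint = 0; joint < regions.length; joint++) {
                Rectangle r = regions[joint];
                if (!r.contains(e.getX(), e.getY())) {
                    continue;                              // click outside this region
                }
                boolean upperHalf = e.getY() < r.y + r.height / 2;
                byte[] b = {
                    (byte) joint,                          // b[0]: joint number
                    (byte) (upperHalf ? 0 : 90),           // b[1]: start angle
                    (byte) (upperHalf ? 90 : 180)          // b[2]: end angle
                };
                sendToArduino(b);                          // placeholder for the serial write
            }
        }

        // Placeholder: in the real application this writes the bytes to the
        // Arduino's serial port, as in the eye tracking interface.
        private void sendToArduino(byte[] b) {
            // serial output omitted in this sketch
        }
    }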
