
Bachelor Thesis

HALMSTAD UNIVERSITY

Computer Science and Engineering, 300 credits

Interactive Robot Art

A turn-based system for painting together with a robot

Computer Science and Engineering, 15 credits

Halmstad 2019-06-18

Nils Lindhqvist, Erik Westberg


Abstract

A large number of people suffer from mental illnesses such as depression and autism. Receiving the care they need can be a very difficult process, with long queues and expensive bills. Automating part of the therapeutic process might be a solution to this. More patients could be treated at the same time, and the cost could be decreased.

This project explores the possibilities of using a robot that paints together with patients. Such a robot would encourage the patient to be creative, which is thought to be an effective way of improving their well-being.

The painting is done in a turn-based fashion, with the patient and the robot taking turns adding details to the same painting.

Software is developed for the robot Baxter, made by Rethink Robotics.

Computer vision concepts and algorithms are applied to interpret what the user has painted and to construct a plan of what Baxter will paint. Painting is then done by tracing the target shape through a set of pre-defined points on a canvas.

The constructed system performs fairly well, although the user is limited to painting lines, squares, rectangles and circles. Further work can be done to increase the number of options available to the user. This system serves as a model of how a similar system could be used in practice.


Sammanfattning

A large number of people suffer from mental illnesses such as depression and autism. Receiving the help they need can be a difficult process, with long queues and high costs. This means that many people suffer longer than they should have to, or cannot afford the help they need. Automating part of the therapeutic process could be a solution to this. More patients could be treated at the same time, and costs would decrease.

This project explores the possibilities of using a robot that paints together with patients. This would encourage the patients to be creative, which is thought to be an effective way of improving their well-being. The robot will paint together with patients in a turn-based fashion, where each adds details to the same painting.

Software is developed for the robot Baxter, made by Rethink Robotics. Computer vision concepts and algorithms are applied to interpret what the user has painted and to construct a plan of what Baxter will paint. Painting is done by tracing the final shape through a set of pre-defined points on the canvas.

The constructed system performs fairly well, although the user is limited to drawing lines, squares, rectangles and circles. Further development could expand the number of options available to the user. The system serves as a model of how a similar system could be used in practice.


Acknowledgements

We would like to thank our supervisor, Martin Cooney, for excellent support and guidance.

We also want to thank the volunteers for participating in the experiments.

Nils Lindhqvist & Erik Westberg

Halmstad, June 2019


Contents

1 Introduction
  1.1 Purpose
  1.2 Requirements
  1.3 Delimitations
  1.4 Research Questions

2 Related Work
  2.1 Painting Robots
  2.2 Socially Assistive Robots
    2.2.1 SARs for Creativity
  2.3 Art Therapy SARs
  2.4 Conclusions

3 Computer Vision
  3.1 Colour Labelling
  3.2 Image Thresholding
  3.3 Noise Filters
  3.4 Edge Detection
  3.5 Shape Recognition

4 Design
  4.1 Main Program Structure
  4.2 Workflow
  4.3 Hardware
  4.4 Programming Language
  4.5 Software
    4.5.1 Robot Operating System
    4.5.2 OpenCV
    4.5.3 Tkinter
  4.6 Evaluation

5 Implementation
  5.1 Utility Modules
    5.1.1 SavePoint
    5.1.2 MoveToPoint
    5.1.3 CaptureImage
  5.2 Interpretation
  5.3 Decision
  5.4 Painting
  5.5 Interaction

6 Results
  6.1 System Tests
    6.1.1 Interpretation
    6.1.2 Decision
    6.1.3 Painting
  6.2 Final Experiments
    6.2.1 Individual Task Performance
    6.2.2 Experience Description
    6.2.3 Purpose Agreement
    6.2.4 Personal Interests

7 Discussion
  7.1 Results
    7.1.1 System Tests
    7.1.2 Questionnaire
    7.1.3 Conclusions
  7.2 Project Relevance in Society
  7.3 Societal Demands on Technical Product Development
  7.4 Limitations and Problems
  7.5 Future Work

8 Conclusions
  8.1 Requirements
  8.2 Research Questions


1 Introduction

Mental and cognitive disorders such as depression, autism and trauma affect a large number of people all over the world. The ability to treat these problems is improving, often through means such as drugs or therapy. However, medical staff is a limited resource, and not everyone can receive the care they require. The waiting time might be unreasonably long, or the cost too high. This project will explore the possibility of automating part of the therapeutic process by using a robot. This would be economically beneficial, as the personnel could shift their focus to other patients. There is also an important ethical aspect, as the general well-being of patients could be improved at a faster rate.

The robot will perform art therapy, a strategy already conducted by therapists. The intent is to let patients express their feelings through art rather than words. This can be effective when, for example, working with children suffering from severe trauma. They might be unable to describe their feelings, but able to make a drawing based on them. Art therapy is not only used as a way for the therapist to understand the patient; the very act of creating art has also been shown to improve the health of patients [1]. This concept is the core of this project. The robot will be used to motivate the patient to continue painting. The incentive will come from the robot being able to provide feedback, by also adding features to the same painting.

This will be done in a turn-based fashion. When the patient finishes their turn, the robot will analyse what was painted. The result of this analysis will be used to influence the robot's choice of what to paint next. This allows the patient and the robot to create a painting together. The concept is shown in Figure 1.

Figure 1: Baxter with brush, paints and canvas


1.1 Purpose

A robot will be taught how to paint, and a turn-based system will be implemented. This will allow a person and the robot to take turns adding details to the same painting. What the robot paints should relate to what the person has painted, to give the feeling of cooperation.

1.2 Requirements

There are three distinct requirements on the robot.

• The ability to perceive and recognise what the user has painted

• Based on what the user painted, construct a plan of what to paint

• Be able to paint on a canvas in accordance with the constructed plan

There are also two requirements on the system that handles communication between the user and the robot.

• The user needs to be able to tell the robot that it is free to start painting

• The user needs to be able to turn the robot off at any point in the process

1.3 Delimitations

To allow the project to be completed within the time frame, two delimitations have been set.

• The robot will only be able to understand some basic features, such as polygons

• The canvas and paints will be placed at a fixed location relative to the robot

1.4 Research Questions

To complete the project, the following questions will need to be answered.

• What are the best ways of understanding what basic features are present in an image?

• How can a plan of what to paint be constructed from information about what the user painted?

• How can the robot paint some basic features on a canvas?


2 Related Work

Much research has been done regarding the two largest building blocks of this project: painting robots and Socially Assistive Robots. Only in later years have these two fields started to be combined. Relevant parts of this research are presented here.

2.1 Painting Robots

The topic of painting robots is heavily explored, and many projects have been successful in creating such robots [2], [3], [4], [5]. Some of these are described here.

Scalera et al. [2] developed a painting robot that is able to reproduce grey-scale watercolour artworks. A big part of this project was the desire to distinctly visualise each stroke, to make the observer recall the gestures of a human. This was achieved by filling all areas of similar intensity with strokes separated by a certain distance. A shorter distance means that the area appears darker, as less of the white canvas is visible. The contours of objects are detected using the Canny Edge Detector, and Inverse Kinematics is used to paint the image.

Jaquier [3] created an interactive painting robot capable of playing tic-tac-toe on a paper. The game board is interpreted using edge detection and contour extraction, both described in Section 3.4. The robot constructs an idea of what to paint as an internal image. Replicating this image on the canvas is done by extracting the contours and tracing them using Inverse Kinematics.

Another interactive painting robot was constructed by Grosser [4]. This robot is able to listen to the surroundings and use the sounds to influence what it paints. A genetic algorithm was applied to handle the decision of what to paint. How the actual painting is done is not described.

2.2 Socially Assistive Robots

Socially assistive robots (SARs) aid people through social interactions. This can be done through, for example, speech, motion or painting. It is a relatively new field, with many promising applications [6].

One example is a SAR used in physical therapy for patients having suffered from a stroke. The goal is to shorten the recovery period by having the robot encourage the person to do a certain physical task. The robot is able to remind the patient to perform their exercise if they have not, and shows praise otherwise. The concept was appreciated by patients during experiments, although much work remains to be done [7].


Matarić et al. [8] have studied SARs for use in autism therapy, mainly aimed at children. Several properties of such SARs are discussed, of which two are especially relevant to this project. These are the possibilities of helping children through imitation and turn-taking. The ability to imitate others and to partake in social turn-taking games are important ways of learning social behaviours. However, imitation often does not come naturally to children with autism, and they often find social games challenging. SARs designed to encourage imitation and to play turn-taking games with such children could therefore be beneficial to their personal development.

2.2.1 SARs for Creativity

Several projects have applied SARs in attempts to increase creativity. One example is a SAR named YOLO that focuses on developing creativity in children. This is done by having the children create stories involving YOLO, to which YOLO is able to react. This is meant to boost idea generation in the children, thereby training their creative abilities [9].

A co-creative drawing system was created by Jansen and Sklar [10]. Their system is meant to act as an inspirational agent for artists, solving issues of "artist's block". However, the authors found that artists were resistant to the idea of having a SAR directly interfere with their drawing. This was solved by having the SAR simply project ideas onto the canvas, leaving the final decision of what to draw to the artist. This project will, despite these results, aim to develop a SAR that actually paints on the canvas. This is motivated by the fact that the goal is not to help users create a good artwork, but rather to encourage users that would not otherwise paint at all to do so.

Alves-Oliveira et al. [11] conducted a study in an attempt to determine whether SARs can be used to spark creativity. This was done using an interactive painting robot, similar to the concept of this project. In this study, participants were asked to paint on a tablet. After doing so, the robot responded to some participants, while other participants only got a digital response from the tablet. The study concluded that, contrary to expectations, painting with the robot did not actually spark creativity. The authors suggest that this might be because some participants thought the robot "ruined their drawing", and that they had higher expectations of the robot. Further studies with a more sophisticated SAR can be done to eliminate these issues, which would hopefully yield more favourable results.

2.3 Art Therapy SARs

This project is largely inspired by the work of Cooney and Menezes [12]. They discuss ideas and concepts regarding the design of a SAR for conducting art therapy. Many ethical pitfalls are introduced, together with some proposed solutions.


Not all of these pitfalls can be addressed in this project, and some will be left for future development. The pitfalls that will be regarded are those of security, confidentiality and misjudgement. These pitfalls are described here.

Developing the SAR to be safe is crucial. The SAR should preferably not be able to hurt the user, even in worst-case scenarios. The user should also be able to quickly disable the robot, should anything unexpected happen.

Cooney and Menezes also discuss the importance of letting the user know that the SAR is safe, as they otherwise might feel unsafe despite this not being the case.

Confidentiality is a delicate subject, but very important. An art therapy SAR might benefit from knowledge of the diagnosis of the patient, as the decisions made by the SAR could be adapted to the needs of the patient.

Logging as much information as possible from sessions with patients might be useful to the actual therapist. However, all this information is very sensitive and should be kept private. The system must also make sure to follow regulations regarding the storage of such personal information, such as the General Data Protection Regulation (GDPR).

The third pitfall, misjudgement, is a complex issue. Incorrectly interpreting what the user has painted could lead to the SAR doing more harm than good. The patient might feel ignored and unimportant if the SAR paints something that does not correctly relate to what they painted. A solution presented by Cooney and Menezes is to explicitly ensure correct interpretations by having the SAR ask the patient for confirmation, and never act unknowingly.

2.4 Conclusions

Painting robots are a heavily explored area, and many existing solutions perform extremely well. SARs, on the other hand, remain largely unexplored, especially regarding art therapy specifically. This project will therefore put emphasis on these parts of the system. This means that the pitfalls discussed in Section 2.3 have to be carefully incorporated into the project. If the project is successful, the SAR could be used to further the study done by Alves-Oliveira et al. [11].


3 Computer Vision

Many tasks, such as object detection, require computers to gain an understanding of images. Computer Vision (CV) is the field that studies how this can be done. This section will introduce the CV concepts that are required to complete this project. This includes the following topics: Colour Labelling, Image Thresholding, Noise Filters, Edge Detection, Contour Extraction and Shape Recognition.

3.1 Colour Labelling

To understand the colour of features in an image, some processing has to be done. This section will describe a few alternative methods for labelling colours. Labelling refers to the process of identifying colours as elements of a predefined set of colours, for example ’red’ or ’yellow’.

RGB

Computers often use the RGB model to describe colours. In the RGB model, a colour is defined using three parameters: the amounts of red, green and blue in the colour. These parameters are based on the concept of additive mixing of light. Variable amounts of red, green and blue light can be combined to construct light of almost any colour. This is exactly how colours are displayed on screens, often making the RGB model an obvious choice.

However, the model is often sub-optimal when trying to label colours. To do this using the RGB model, boundaries in three-dimensional space have to be defined for each possible label. Labelling is then done by observing which of these regions a colour ends up in.

There are several other colour models that make this problem much easier to solve, for example the HSV, HSL and CIELAB colour models. Colour labelling using these models is described below.

HSV / HSL

Similar to the RGB model, the HSV and HSL colour models also use three parameters to define colours. However, these parameters are designed differently from those in the RGB model. The main component of how humans differentiate colours, hue, is represented by a single parameter, rather than being a combination of all three. The two other parameters describe how bright and how saturated the colour is [13]. Chromatic colours can now be labelled by defining boundaries for the hue parameter alone. However, achromatic colours will be incorrectly labelled using this method, since they are not part of the hue spectrum. To solve this, a constant upper and lower bound on brightness and saturation can be defined. If this bound is breached, a separate check can label the colour as white, black or a shade of grey.
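As an illustration of this strategy, here is a minimal sketch (not from the thesis); the hue ranges and the saturation/brightness bounds are assumed values that would need tuning, and OpenCV stores hue in the range 0-179.

    import cv2
    import numpy as np

    # Illustrative hue ranges in OpenCV's HSV space (H in 0..179);
    # red wraps around the end of the spectrum.
    HUE_RANGES = {'red': (170, 10), 'yellow': (20, 35),
                  'green': (36, 85), 'blue': (86, 130)}

    def label_hsv(bgr_pixel):
        h, s, v = cv2.cvtColor(np.uint8([[bgr_pixel]]),
                               cv2.COLOR_BGR2HSV)[0, 0]
        if s < 40:                       # low saturation: achromatic
            if v > 200:
                return 'white'
            return 'black' if v < 60 else 'grey'
        for name, (lo, hi) in HUE_RANGES.items():
            if (lo <= hi and lo <= h <= hi) or (lo > hi and (h >= lo or h <= hi)):
                return name
        return 'unknown'

    print(label_hsv([0, 0, 255]))        # pure red in BGR -> 'red'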

CIELAB

Another powerful colour model is the CIELAB model. Colours are again described using three parameters: L*, a* and b*. This time, the parameters correspond very closely to how the colour is perceived by humans. The amount of change applied to any of the parameters relates to how much the colour changes according to a human observer. This allows the model to be used to calculate the perceived dissimilarity of any two colours. This is done by treating colours as points in three-dimensional space, with the three parameters as base vectors. The perceptual difference between any two colours is equal to the Euclidean distance between the points of those colours, as shown in Equation (1)¹. If the Euclidean distance is large, the colours will be perceived as very different [13]. Labelling a colour can be done by defining the ideal parameters for each available label, and then checking which label is closest to the actual colour.

$\Delta E_{ab} = \sqrt{(L_1 - L_2)^2 + (a_1 - a_2)^2 + (b_1 - b_2)^2}$   (1)

¹ The pure Euclidean distance is not entirely accurate, but close enough for the purposes of this project. For more information see doi:10.1002/col.20070.
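A minimal sketch of such nearest-label classification (not the thesis code); the reference values are rough examples, and the conversion undoes OpenCV's 8-bit Lab scaling.

    import cv2
    import numpy as np

    # Approximate reference L*, a*, b* values per label (examples only).
    REFERENCE = {'red': (54, 80, 67), 'yellow': (97, -22, 94),
                 'blue': (32, 79, -108)}

    def label_cielab(bgr_pixel):
        lab = cv2.cvtColor(np.uint8([[bgr_pixel]]),
                           cv2.COLOR_BGR2LAB)[0, 0].astype(float)
        # OpenCV stores 8-bit Lab as L*255/100, a+128, b+128; undo that.
        lab = np.array([lab[0] * 100 / 255, lab[1] - 128, lab[2] - 128])
        # Delta E from Equation (1): Euclidean distance in Lab space.
        distances = {name: np.linalg.norm(lab - np.array(ref))
                     for name, ref in REFERENCE.items()}
        return min(distances, key=distances.get)

    print(label_cielab([0, 0, 255]))     # bright red in BGR -> 'red'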

3.2 Image Thresholding

Image thresholding is used to manipulate an image based on the state of each pixel. This can be used to separate areas of interest from their surroundings, simplifying any further processing of the image. The goal is to construct a new binary image. A binary image consists of pixels limited to one of two states, usually represented as black or white. Three methods of achieving this will be presented here: regular thresholding, adaptive thresholding and range-based thresholding. A comparison is shown in Figure 2.

Regular thresholding operates on grey-scale images, and simply compares the intensity of each pixel to a specified threshold value. Pixels will be black if their intensity is below the threshold, and white otherwise. Adaptive thresholding also operates on grey-scale images, but does not use a fixed threshold value during the comparison. Rather, a different threshold is determined for each pixel based on the pixels in a region around it. This means that pixels that are generally brighter than other pixels in their region will be set to white, and set to black otherwise. This is useful when, for example, a shadow is cast on part of an object.

Range-based thresholding is slightly different, and can handle colour images. A specified region is used instead of a threshold value. Pixels of colour within this region are set to white, and other pixels are set to black. This is similar to the HSL / HSV method of labelling colours, described in Section 3.1. A range-based threshold can be used to separate features of different colours, or features of different intensities.

Figure 2: Three different methods of thresholding. Top-left: Original image. Top-right: Regular threshold. Bottom-left: Adaptive threshold. Bottom-right: Range adjusted for red colours

The result of the thresholding operation can be used as a mask to restore information about the original image. Masking will keep information intact for pixels that are set to white in the mask, and reset it to zero (usually black) otherwise. Figure 3 shows masking being done to restore part of the original image from Figure 2, by using the result of range-based thresholding as the mask.

Figure 3: Masking. The bottom-right image of Figure 2 is used as a mask for the top-left image of Figure 2
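For reference, the three thresholding operations and masking map onto OpenCV roughly as in the sketch below (the file name and parameter values are illustrative).

    import cv2
    import numpy as np

    img = cv2.imread('canvas.jpg')              # example input
    grey = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Regular threshold: one global value for the whole image.
    _, regular = cv2.threshold(grey, 127, 255, cv2.THRESH_BINARY)

    # Adaptive threshold: the value is computed per pixel from a
    # neighbourhood (here 11x11), so shadows are handled better.
    adaptive = cv2.adaptiveThreshold(grey, 255,
                                     cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                     cv2.THRESH_BINARY, 11, 2)

    # Range-based threshold: keep pixels whose colour lies inside a
    # region (a rough HSV range for red; bounds need tuning).
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    ranged = cv2.inRange(hsv, np.array([0, 70, 50]),
                         np.array([10, 255, 255]))

    # Masking: restore the original pixels wherever the mask is white.
    masked = cv2.bitwise_and(img, img, mask=ranged)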

3.3 Noise Filters

It is very often necessary to reduce the amount of noise in an image. This is done by applying a low-pass filter to the image, for example the Gaussian blur. The Gaussian blur works by applying a kernel, or convolution matrix, to all pixels in the image. A kernel is similar to a mathematical function. For example, the 3x3 kernel for the Gaussian filter is shown in Figure 4. When this kernel is applied to a pixel, the new value of that pixel will be determined by the nine pixels in the 3x3 grid with that pixel in the centre. The amount of influence each of these pixels has can be seen in the kernel. In this case the central pixel has the largest influence, while the diagonally connected pixels have the smallest influence. The result is that any noise will become spread out over a larger area. However, the desired objects are also affected, as seen in Figure 5.

Figure 4: Kernel for the 3x3 Gaussian blur

A filter that partially fixes this problem is the bilateral filter. Contrary to the Gaussian blur, the bilateral filter will preserve the edges of objects. This means that different objects will not be smudged into each other, and identifying objects becomes easier. This is done by not only considering the distance between two pixels, but also how perceptually different they are. If this difference is large, the two pixels will not be combined, no matter how close they are. For colour images the perceptual difference is calculated using the CIELAB model, as described in Section 3.1 [14]. A comparison between the Gaussian blur and the bilateral filter is seen in Figure 5.

Figure 5: Two different methods of removing noise. Left: Original image. Middle: Gaussian Blur. Right: Bilateral filter
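In OpenCV, the two filters compared above might be invoked as follows (a sketch; the kernel size and filter parameters are illustrative).

    import cv2

    img = cv2.imread('noisy.jpg')               # example input

    # Gaussian blur: all neighbours inside the kernel are combined,
    # so edges are smeared together with the noise.
    gaussian = cv2.GaussianBlur(img, (5, 5), 0)

    # Bilateral filter: neighbours are only combined if they are also
    # perceptually similar, so object edges are preserved.
    bilateral = cv2.bilateralFilter(img, 9, 75, 75)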

3.4 Edge Detection

Another task in computer vision is the extraction of edges, for example the contours of objects. This is used to remove unnecessary information, while preserving all structural properties of an image. One popular method of doing this is applying the Canny Edge Detection algorithm, developed by John Canny [15]. The first step of the algorithm is to reduce the amount of noise in the image, as described in Section 3.3. Traditionally, the Gaussian blur was used for this purpose, but the bilateral filter has been shown to result in even better edge detection [16]. The process of actually finding the edges is then done with the Sobel operators. The Sobel operators are two kernels, shown in Figure 6. They are designed to respond to significant change in intensity, one for vertical and one for horizontal change [17]. The result of applying these kernels will be a new image, where only these drastic intensity changes, the edge-points, are visible. The intensity of any pixel in the result corresponds to how large the change in intensity was in that region of the original image.

Figure 6: The two Sobel operators

After applying the Sobel operators, the result will be fine-tuned. Edge-points above a certain threshold will be marked as certain edges, while points below another threshold will be removed. Hysteresis thresholding will be used for the remaining points. This means that they will only remain if connected to points that are already marked as certain edges [18]. An example of the Canny edge detector being used is shown in Figure 7.

Figure 7: The result of the Canny Edge Detector
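A minimal sketch of this pipeline with OpenCV (the thresholds are example values):

    import cv2

    img = cv2.imread('canvas.jpg', cv2.IMREAD_GRAYSCALE)
    smoothed = cv2.bilateralFilter(img, 9, 75, 75)

    # 100 and 200 are the hysteresis thresholds: points above 200 are
    # certain edges, points below 100 are discarded, and points in
    # between survive only if connected to a certain edge.
    edges = cv2.Canny(smoothed, 100, 200)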

Contour Extraction

Contour extraction is the process of analysing the edges of an image, and extracting information about each shape in the image. Algorithms developed by Suzuki and Abe can then be applied to trace the edges and thereby retrieve the contours of objects [19]. A contour can be described as a set of vertices, each representing a corner of the shape. The original shape can be reconstructed by connecting all these vertices with edges.

3.5 Shape Recognition

After successfully extracting the contours of an image, it is possible to recognise simple shapes. There are a few different techniques for doing this, the most promising of which are described here.


Contour Approximation

One technique is to analyse the vertices received from performing contour extraction, described in Section 3.4. However, if the original shape is imperfect, the shape can not easily be identified from immediately studying the vertices. Douglas and Peucker developed algorithms for down-sampling the contour. This will create a new contour that approximates the original contour. The down-sampling is done by replacing similar vertices with a single vertex. The right amount of down-sampling will result in a shape that is much easier to identify [20]. This process is shown in Figure 8. After down-sampling the contour sufficiently, the recognition of simple shapes is trivial. A rectangle, for example, can be identified by looking for four vertices connected by right angles.

Figure 8: Contour extraction and approximation. Left: Square with defects. Middle: Vertices of the contour. Right: Vertices of an approximated contour
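A sketch of contour extraction followed by Douglas-Peucker approximation in OpenCV (the 2% factor and file name are example values):

    import cv2

    img = cv2.imread('shape.png', cv2.IMREAD_GRAYSCALE)
    _, binary = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)

    # OpenCV 2.4 returns (contours, hierarchy); 3.x prepends an
    # extra value to the returned tuple.
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)

    for contour in contours:
        # Merge vertices lying within 2% of the contour's perimeter;
        # this factor controls the amount of down-sampling.
        epsilon = 0.02 * cv2.arcLength(contour, True)
        approx = cv2.approxPolyDP(contour, epsilon, True)
        if len(approx) == 4:
            print('four vertices: square, rectangle or line')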

Hough Transform

Another method of recognising shapes is using Hough transforms. Hough transforms can be used to detect arbitrary shapes in an image [21]. The transform functions best when applied to an image where only the contour of a shape is visible. The concept is described here in the context of detecting lines, but the same idea applies to other shapes. When applying the Hough line transform, each of the non-zero (visible) pixels in the image is traversed. All lines that could possibly be drawn through each of these pixels are calculated and remembered. When the image has been traversed, all calculated lines are analysed. If the same line could be drawn through a large number of different pixels, those pixels all occupy a certain diagonal in the image. It is thereby likely that these pixels actually construct such a line [22]. All the discovered lines can then be analysed to determine what shape they construct.
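As a sketch, OpenCV's probabilistic Hough line transform could be applied like this (all parameter values are illustrative):

    import cv2
    import numpy as np

    edges = cv2.Canny(cv2.imread('shape.png', cv2.IMREAD_GRAYSCALE),
                      100, 200)

    # Probabilistic variant: a line is accepted once enough edge
    # pixels vote for the same (rho, theta) combination.
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=50,
                            minLineLength=30, maxLineGap=5)

    if lines is not None:
        for x1, y1, x2, y2 in lines.reshape(-1, 4):
            print('line from (%d, %d) to (%d, %d)' % (x1, y1, x2, y2))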


4 Design

Based on the requirements, introduced in Section 1.2, four distinct tasks are constructed. The structure of these tasks is based on the Sense-Plan-Act paradigm, common within robotics [23]. Each task represents a separate part of the system, and they will all be combined to finally construct the complete system. These tasks are:

Interpretation

• The ability to perceive and recognise what the user has painted

Decision

• Based on what the person painted, construct a plan of what to paint

Painting

• Be able to paint on a canvas in accordance with the constructed plan

Interaction

• The user needs to be able to tell the robot that it is free to start painting

• The user needs to be able to turn the robot off at any point in the process

4.1 Main Program Structure

The main data flow of the system is depicted in Figure 9. When starting a session with the SAR, the user will be in control. The user is free to paint on the canvas for as long as they like. When they are satisfied, they can instruct the robot to start painting (1). Control will then be handed over to the Interpretation task (2). From here, an image of the canvas will be retrieved (3) and interpreted. The resulting interpretation will be passed on to the Decision task (4). Based on the interpretation, a plan of what to paint will be constructed, which is given to the Painting task (5). The plan will be executed, after which control is finally returned to the user (6). The user can start painting again, and the cycle can repeat whenever the user initiates it.


Figure 9: Main structure of the system

4.2 Workflow

The development process was divided into two cycles. The goal of the first cycle was to construct a basic first version of the system that complies with the intended design. The goal of the second cycle was to further develop and improve each task separately. The development was done slightly differently for the different tasks. The Interpretation and Decision tasks required a fair amount of experimentation to find suitable solutions. In comparison, Painting and Interaction had a linear development process.

4.3 Hardware

The robot chosen for this project is Baxter, made by Rethink Robotics. Baxter is a multi-purpose robot, designed for research and education. The robot has two arms, a head and a main body, making it somewhat human-like. One camera is built into each arm, and one into the forehead. Each arm has seven degrees of freedom, and a gripper on the end.

Baxter was chosen largely because the robot was already accessible at the start of the project, and is very capable of performing the painting motions required. It should be noted that the intention is not to have Baxter be part of the final product. If the project proves successful, a more specialised robot can be designed with this precise task in mind. Until that point is reached, Baxter will be used for research.

4.4 Programming Language

All programs will be written using the programming language Python, specifically version 2.7.6. Python is a high-level, general-purpose programming language. It is extremely simple to use, and can be applied in a wide range of scenarios. It is dynamically typed, and the syntax is very concise and readable compared to many other high-level programming languages. It is also an interpreted language, meaning that the program is not compiled before running, which also simplifies the development process.

Python was chosen mainly because of the versatility of the language. Creating both simple scripts and large, complex systems is easy when using Python. Programs can be as simple as shown in Listing 1. Programs can be combined by including them as modules, supporting modularity. Another benefit of using Python is the large number of libraries and frameworks available.

Listing 1: A simple Python program

    print('Hello, world!')

    > Hello, world!

4.5 Software

A number of software tools also had to be selected for the project. These decisions will be discussed here.

4.5.1 Robot Operating System

Robot Operating System (ROS) is a framework commonly used within robotics. It contains many tools and libraries meant to simplify the control of many different kinds of robots. This includes methods of controlling joints and accessing camera feeds. This is done using the publish/subscribe pattern. Subscriptions can be made to specific topics, defined by their name. If another unit publishes a message to a topic, a callback function will be invoked for each subscriber, and the message can be handled. ROS was chosen since it is compatible with Python and Baxter, and has a good reputation.

4.5.2 OpenCV

There are a large number of libraries that focus on CV. The library that will be used in this project is OpenCV, version 2.4.8. It has a very large number of functionalities for many tasks, such as image manipulation and image processing. It is open-source, has a very large community and has support for Python. One large reason for choosing OpenCV is that algorithms for most of the concepts presented in Section 3 are already implemented. This, coupled with the fact that OpenCV is one of the more established CV libraries, made it a fairly easy choice.


4.5.3 Tkinter

To build the graphical user interface (GUI) of the system, the Python framework Tkinter will be used. The framework makes it simple to construct windows with buttons, images and similar components. Tkinter uses concepts similar to many other GUI frameworks. The main loop is the main thread of the program, used to receive events from components. While this thread is busy, the user will not be able to interact with the system. To maintain a responsive GUI it is therefore important to handle events quickly, or to use different threads. Different threads are handled as separate tasks by the operating system, which means that parts of the program can be executed in parallel. This would enable the user to interact with the GUI while, for example, the robot is being controlled.

4.6 Evaluation

The system will be evaluated in two steps. The first step will consist of informal tests of each task, and the second step will consist of experiments involving participants.

Initial Evaluation

The purpose of the initial tests is to make sure that the system performs as expected before moving on to the real experiments. If it is not able to do so, the real experiments will not be able to result in any constructive conclusions. During these tests, each component of the system will be tested individually. Exactly how well each component has to perform is not defined.

Creating such a definition is difficult, as the system largely depends on the actions of the user.

Informally, the system has to successfully identify what the user has painted most of the time, without the user taking excessive care in painting accurately. If the interpretation was successful, the system should be able to construct a plan consisting of features related to those painted by the user. Baxter should then be able to trace this shape on the canvas, to some degree.

Final Evaluation

The final evaluation of the system will be experiments involving participants. The participants will each get to experience a painting session with Baxter, lasting a few minutes. Prior to the session they will be informed of the capabilities of Baxter, and advised to limit their paintings to what Baxter should be able to recognise. After the session follows a short questionnaire. The questionnaire will aim to identify potential problems with the current design, and parts that have to be developed further.


The first part of the questionnaire will seek to gather the participants' impressions regarding each component of the system, through the following four statements:

1. The robot was able to understand what I painted
2. What the robot painted was related to what I painted
3. The painting skills of the robot were good
4. Interacting with the robot (using the program) was easy

The participant will be able to respond to each statement using the following scale:

1. Strongly disagree
2. Somewhat disagree
3. Neutral
4. Somewhat agree
5. Strongly agree

The participant will also be given the option to express more specific thoughts by listing some words describing their thoughts on the system and their experience with it. This is meant to give an even clearer idea of what the participants liked and disliked about the system, and act as inspiration for further research.

The next part of the questionnaire will ask the participant to state whether they think a similar concept could be applicable within art therapy. A positive result to this question is expected. If this is not the case, there could be some large flaws in the system, and this will have to be addressed.

The final part will ask if the participant has any previous interest in either robotics or arts. This question is included since there might be a correlation between finding the system enjoyable and being interested in robotics or arts. Such interests will not always be held by the end-user. If the only participants that show appreciation of the system are those that already have similar interests, this is another potential issue.


5 Implementation

This section will describe the construction of the system in detail. Each task will be discussed separately. The first section will introduce some concepts common to all tasks.

5.1 Utility Modules

Several Python modules were designed to be used in different parts of the program, or as tools during development. The most important of these modules are described here.

5.1.1 SavePoint

The module SavePoint was made to streamline the process of recording useful positions for Baxter's arms. After manually moving an arm to the correct position, the angles are acquired using a ROS subscription. The angles of all joints are then formatted to suit the module MoveToPoint, and stored for later use.

5.1.2 MoveToPoint

Moving the joints of Baxter is done in many different parts of the program, so making this a separate module makes the program much more manageable. The arms are moved by using a ROS publisher to communicate with Baxter.
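The module's code is not listed in the thesis. As a rough sketch, joint positions could be streamed to Baxter with a ROS publisher like the one below; the topic, message type and angle values follow the standard Baxter SDK conventions but are assumptions here, not the authors' code.

    import rospy
    from baxter_core_msgs.msg import JointCommand

    rospy.init_node('move_to_point')

    # Baxter's SDK exposes one command topic per limb.
    pub = rospy.Publisher('/robot/limb/left/joint_command',
                          JointCommand, queue_size=10)

    cmd = JointCommand()
    cmd.mode = JointCommand.POSITION_MODE
    cmd.names = ['left_s0', 'left_s1', 'left_e0', 'left_e1',
                 'left_w0', 'left_w1', 'left_w2']
    cmd.command = [0.0, -0.55, 0.0, 0.75, 0.0, 1.26, 0.0]  # example angles

    rate = rospy.Rate(100)   # position commands are streamed repeatedly
    while not rospy.is_shutdown():
        pub.publish(cmd)
        rate.sleep()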

5.1.3 CaptureImage

The purpose of the CaptureImage module is to capture an image using Baxter's left-hand camera. This is again done using a ROS subscription. After retrieval of an image, it is converted from the ROS image format to a regular OpenCV image. The main part of the module is shown in Listing 2.

Listing 2: Retrieving an image from Baxter's left-hand camera, and converting to an OpenCV image

    import rospy
    from sensor_msgs.msg import Image
    from cv_bridge import CvBridge

    def imageCallback(image):
        # One image is enough, so stop listening after the first message.
        subscription.unregister()
        # Convert the ROS image message to a regular OpenCV (BGR) image;
        # the full module stores this result for the caller to retrieve.
        result = CvBridge().imgmsg_to_cv2(image, 'bgr8')

    topic = '/cameras/left_hand_camera/image'
    subscription = rospy.Subscriber(topic, Image, imageCallback)


5.2 Interpretation

The goal of the Interpretation task is to be able to describe all the features present on the canvas, as shown in Figure 10. This section will discuss how this task was implemented, using many of the CV concepts introduced in Section 3.

Figure 10: The purpose of the Interpretation task

Image Pre-processing

The first challenge is to capture an image of the canvas. One of the delimitations, defined in Section 1.3, is that the canvas will be kept in a fixed location. This means that an appropriate position for the camera only has to be found once, and can be reused every time an image needs to be captured. Finding this position was done using SavePoint, and taking images was then done using CaptureImage, as described in Section 5.1. The captured image is then cropped to include only the canvas. An example of this process is seen in Figure 11.

Figure 11: Image taken of the canvas. Left: Raw image. Right: Cropped image

After capturing the image, noise is removed using the bilateral filter. This filter was chosen over the Gaussian blur because the bilateral filter preserves edges. This is very beneficial, as it means that recognising the shape will be slightly less troublesome. The features are then separated from the background canvas by thresholding, as described in Section 3.2. All three techniques (regular threshold, adaptive threshold and range-based threshold) were attempted. Results from each are shown in Figure 12, where the parameters for each function have been fine-tuned to give as good a result as possible.

Figure 12: Comparison of the three different thresholding strategies. Left: threshold. Middle: adaptiveThreshold. Right: inRange

Regular thresholding had very large difficulties distinguishing the yellow line from the white canvas, but darker colours could be detected very reliably. By using adaptive thresholding it was possible to maintain most of the yellow line. Range-based thresholding also had some trouble with the line, as parts of the edges were missed. Based on these tests, adaptive thresholding was chosen for the final system.

Each of the remaining features is then separated from the others by using contour extraction, as it is much simpler to interpret each shape separately. Contours with a small area are discarded, as they are likely just noise. The final result of the pre-processing stage is shown in Figure 13.

Figure 13: The result of the pre-processing stage

Colour

The CIELAB method was chosen for labelling the colour of the shapes. The reason for choosing this method over HSV is that adding support for new colours becomes much easier. This can be done by filling a small region with the colour, and then using CaptureImage to capture an image. This image is then converted to the CIELAB colour space, and the L*, a* and b* components can be identified. These three parameters can then be stored as the definition of this colour. This is a much easier task than finding the appropriate borders of the HSV hue spectrum, especially when a large number of different colours are in use.


During the development of this project four different colours were available: red, yellow, green and blue, shown in Figure 14. Despite being vividly green in reality, the green colour appeared very similar to blue in the images. Slight changes in lighting or camera angle made them completely indistinguishable. Green was therefore omitted during the remainder of the project.

Figure 14: The four colours available during development

Filling

To check whether a shape is drawn as just an outline or completely filled-in, the area of the contour was compared to the area of the features. If the shape is completely filled-in, these two areas should be roughly equal.
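A sketch of such a comparison (the tolerance value is an assumption, and the mask is assumed to contain only this one feature):

    import cv2

    def is_filled(mask, contour, tolerance=0.8):
        # Area enclosed by the contour's outline.
        enclosed = cv2.contourArea(contour)
        # Foreground pixels actually painted inside the bounding box.
        x, y, w, h = cv2.boundingRect(contour)
        painted = cv2.countNonZero(mask[y:y + h, x:x + w])
        return painted >= tolerance * enclosed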

Shape

The first attempt at interpreting the shape was made using the Hough transforms. While the method works very well with perfect shapes, it was very difficult to get any good results for hand-drawn shapes. Decent results could be achieved by fine-tuning the parameters for each new image, as shown in Figure 15. All of the non-circular shapes in this example would have been identifiable using the normal Hough line transform. However, the system would quickly break if the user changed the overall scale of the shapes, or drew them slightly worse.

Figure 15: Hough Line Transform, with detected lines in light blue

The contour approximation method was then implemented, as seen in Figure 16. The problem with this method is that it is very difficult to estimate how much the contours should be approximated. In Figure 16 the square and the line have been identified perfectly, but the circle has also been turned into a square.

Figure 16: Contour Approximation, with increasing amount of approximation

Based on the poor results from both the Hough transform and the contour approximation, another method was implemented. This method attempts to classify a shape as either a line, a square, a rectangle or a circle. This is done by comparing the contour to the ideal shape for each category, in the hope of finding a close match. The ideal shape is positioned on top of the contour, and to what extent they match is calculated by dividing the intersecting area by the total area. It is considered a match if this ratio is large enough. A visualisation of this is shown in Figure 17.

Figure 17: Circle Classification. Red: Intersecting area. Green: Non-intersecting area of the ideal shape. Blue: Non-intersecting area of the actual shape

The ideal shapes are constructed by calculating the smallest such shape that fully encloses the contour, and then scaling it so that its area matches the area of the contour. This means that any circles and rectangles in the image can be identified. Rectangles are then further split into lines, squares and rectangles based on how they are constructed.

For these simple shapes this method results in much better identification than the Hough transforms and contour approximation, and was therefore chosen for the final system.
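As an illustration of the chosen method, the sketch below tests whether a feature is a circle. It reads "dividing the intersecting area by the total area" as intersection over union; the threshold and function name are assumptions, not the thesis code.

    import cv2
    import numpy as np

    def is_circle(feature_mask, contour, threshold=0.75):
        # Centre from the smallest enclosing circle; the radius is
        # rescaled so the ideal circle's area equals the contour's.
        (cx, cy), _ = cv2.minEnclosingCircle(contour)
        radius = int(np.sqrt(cv2.contourArea(contour) / np.pi))

        ideal = np.zeros_like(feature_mask)
        cv2.circle(ideal, (int(cx), int(cy)), radius, 255, -1)

        # Match ratio: intersecting area divided by total covered area.
        intersection = cv2.countNonZero(cv2.bitwise_and(feature_mask, ideal))
        total = cv2.countNonZero(cv2.bitwise_or(feature_mask, ideal))
        return intersection >= threshold * total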

5.3 Decision

The goal of the Decision task is to generate an image based on the result of the Interpretation task. The input is a description of the shapes present in the image. The concept is illustrated in Figure 18.

Figure 18: The purpose of the Decision task

The decision is made in two major steps. The first step constructs the shape to paint, and chooses what colour to use. The second step finds a suitable location for this shape. How both of these steps are implemented is described below.

Shape

If the description received from the Interpretation is empty, a completely random choice of what to paint will be made. Any of the shapes line, square, rectangle and circle is chosen. A colour is chosen from all the available paints. Size is chosen randomly within a specified range around a pre-defined default size.

If the description does contain data, the decision is instead based on this data. The properties of the shape to paint are constructed from all the described features. If three circles and a square have been identified, for example, there is a 75% chance that the decision will be to paint a circle, and 25% that it will be a square. The other properties of the feature, such as colour, are chosen similarly.

Finally, a small image containing only this shape is constructed, and passed to the second step of the decision.

Location

The purpose of this step is to find a location on the canvas that can fit the shape created in the previous step. Preferably a location should be found where the new shape does not intersect any of the already existing features.

The image of the canvas is modified using the dilate function from OpenCV. This will result in an image where all the shapes have grown slightly. A border is also added to the image. This is done to ensure that the new shape is not positioned too close to any of the pre-existing shapes or the edge of the canvas. The result can be seen in Figure 19.

Figure 19: Features have been dilated, and a border has been added

A point on this image is then chosen at random. The new shape is then inserted onto this image at many different locations. The location closest to the random point that does not result in any intersections is chosen as the final location for the shape, as shown in Figure 20. If no such location is found, the shape is scaled down slightly, and another attempt is made.

Intersections are detected by calculating the total area of the features in the image. If this area is roughly equal to the area of the shape combined with the original area of all the dilated shapes, no intersection is occurring.
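The dilation, border and intersection test could be expressed roughly as follows (kernel size, border width and file name are assumptions):

    import cv2
    import numpy as np

    # Grow existing features and forbid the edge of the canvas.
    canvas = cv2.imread('features.png', cv2.IMREAD_GRAYSCALE)
    grown = cv2.dilate(canvas, np.ones((15, 15), np.uint8))
    cv2.rectangle(grown, (0, 0),
                  (grown.shape[1] - 1, grown.shape[0] - 1), 255, 10)

    def overlaps(existing, shape_mask):
        # If merging the two masks covers fewer pixels than their
        # areas summed, the shape must intersect something.
        separate = cv2.countNonZero(existing) + cv2.countNonZero(shape_mask)
        merged = cv2.countNonZero(cv2.bitwise_or(existing, shape_mask))
        return merged < separate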

Figure 20: Finding a non-intersecting location for the red square as close to the green mark as possible

After finding such a location a new image is created, and the shape is drawn at that location. This image is the result of the Decision task.

5.4 Painting

The goal of the Painting task is to be able to paint on the canvas, in accordance with an internal image. The concept is shown in Figure 21.


Figure 21: Purpose of the Painting task

To allow Baxter to paint on the canvas, a grid of positions on the canvas was recorded using the SavePoint module. Bilinear interpolation can then be used to calculate the angles required to reach any position inside this grid. In reality the angles do not change linearly, so bilinear interpolation will not give a perfect result. However, the change in angle will be approximately linear when the amount of change is small. Furthermore, a slight error will only lead to the brush being pressed slightly harder against the canvas. In this case, points separated by 3 centimetres were sufficient to achieve good results. At this scale the human inaccuracy when recording points was much more significant than the errors from the interpolation.
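A sketch of the interpolation over the recorded grid (illustrative; it assumes the target position lies strictly inside the grid, with grid[row][col] holding a tuple of recorded joint angles):

    def bilinear(grid, x, y):
        x0, y0 = int(x), int(y)
        fx, fy = x - x0, y - y0
        angles = []
        for j in range(len(grid[0][0])):          # one value per joint
            top = (1 - fx) * grid[y0][x0][j] + fx * grid[y0][x0 + 1][j]
            bot = (1 - fx) * grid[y0 + 1][x0][j] + fx * grid[y0 + 1][x0 + 1][j]
            angles.append((1 - fy) * top + fy * bot)
        return angles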

The paints will also remain in a fixed position relative to Baxter, so the positions required to pick up colour were also recorded using SavePoint. What colour to pick is decided by using the colour interpreter that was implemented for the Interpretation task.

After defining the grid, the points to visit need to be calculated. This is done by first generating vertices through contour extraction. However, these vertices can be very far apart, or unnecessarily close. To fix this, new waypoints are placed at regular intervals along the contour. An interval of about two centimetres gave good results. An example of generating vertices is seen in Figure 22.

Figure 22: Conversion of shape to vertices.

A choice was made not to implement the ability to paint filled shapes. This could have been done by filling the contours with a grid of points to visit, with a separation related to how wide the brush is. However, drawing filled shapes would take a lot of additional time, which means that the system would be a lot less engaging, for no real benefit. A choice was also made not to have Baxter clean the brush automatically before switching to another colour. This decision was made because it is quite hard to get rid of all colour applied to the brush, so doing it manually proved much simpler.

Tests were also made regarding the accuracy with which Baxter should paint. The threshold determines how close the joints should be to the target angles before a movement is considered complete. A higher threshold means that Baxter will paint quicker, but with less accuracy. Some of the results of these tests are shown in Table 1.

Threshold   Time
0.009       34 s
0.02        24 s
0.05        16 s
0.15        8 s

Table 1: Painting a square with a few different thresholds (the result images are omitted here)

A threshold of 0.05 was chosen for the final system. This choice was made because of the large time difference compared to the more accurate thresholds, despite the painting not being much worse.

5.5 Interaction

The goal of the Interaction task is to enable the user to interact with the system. This is done through a GUI, made using Tkinter.

The three main components that are used are Frame, Button and Label. Frames are used as containers for the Buttons and Labels, and make it easy to handle the positioning of the components. Buttons are used to receive events from the user, and the Labels show images. Two windows were created: a main window and a setup window. Both are described below.

Main Window

The main window is used to run the main program. It consists of four Buttons: Start, Reset, Disable and Setup, and two labels containing one image each. A screenshot of the GUI is shown in Figure 23.

Figure 23: The main window of the GUI.

The Reset and Disable buttons are used to control Baxter directly. If anything goes wrong, Disable will immediately disable the actuators of Baxter, stopping all movements. Reset will make the arms of Baxter return to their default state. If the main program is running when any of these buttons are clicked, it will stop as soon as possible. This is made possible by passing a flag to the main program when it is started, and then setting this flag to True if the program has to exit. The main program will check this flag regularly.

The Setup button launches a secondary window, the Setup Window, on a new thread. This is used before running the main program to ensure that everything is ready.

The Start button will start the main program, and Baxter will start painting. Again, this is done on a new thread, to make sure that the Reset and Disable buttons will be responsive if they are required.

The two labels are used to display data from the main program. One of them will show the image of the canvas taken by Baxter during the interpretation, and the second will show the decision made by Baxter. These images are received by using the Observer pattern, as shown in Listing 3.


Listing 3: The Observer pattern used to feed images to the GUI

    def subscribeSense(self, callback):
        self.senseCallback = callback

    def subscribePlan(self, callback):
        self.planCallback = callback

The labels are then updated with these new images. An issue is that the labels should only be updated by the main thread. This is solved by adding any new images to a list, and having the main thread occasionally check this list for new images. This is done using the after method, as shown in Listing 4.

Listing 4: Checking the image queue every 100 ms

    top = tk.Tk()
    top.after(100, self.processQueue)
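Put together, the thread-safe update loop might look like the following sketch (updateLabel is a hypothetical helper, and the Queue module is named queue in Python 3):

    import Queue  # the thesis targets Python 2.7

    class MainWindow(object):
        def __init__(self, root):
            self.root = root
            self.images = Queue.Queue()
            self.root.after(100, self.processQueue)

        def onNewImage(self, image):
            # Called from the worker thread: only enqueue here,
            # never touch Tkinter widgets directly.
            self.images.put(image)

        def processQueue(self):
            # Runs on the main thread, so updating labels is safe.
            while not self.images.empty():
                self.updateLabel(self.images.get())
            self.root.after(100, self.processQueue)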

Setup Window

The setup window is used to make sure that everything is ready before running the main program. The window is shown in Figure 24. It consists of two Labels, with one image each. One of the images shows the current position of the canvas, and the other shows the desired position of the canvas. The image is captured using the CaptureImage module, and the labels are updated similarly to how the main window labels are updated.

Figure 24: The setup window of the GUI.


6 Results

This section will present results from general tests of the complete system and from the final experiments.

6.1 System Tests

During these experiments each component of the system is tested separately. This is done to make sure that all parts of the system perform as expected before moving on to the final experiments.

6.1.1 Interpretation

The ability to recognise colours and shapes was tested with a set of hand-drawn shapes. Some examples are shown in Table 2. A larger set of examples is included in Appendix A.

Colour   Shape
Yellow   Rectangle
Blue     -
Blue     Line
Yellow   Square
Red      Circle
Yellow   -

Table 2: Interpretation results (the input images are omitted here)

6.1.2 Decision

The performance of the decision component was tested by observing the output with a few different inputs, as shown in Table 3.


Table 3: Decision results (the input and result images are omitted here). In reality the input also consists of a set of dictionaries describing all shapes in the image.

6.1.3 Painting

The ability to paint shapes was tested by having Baxter paint circles, lines and squares of different sizes. Some of the results are shown in Table 4, and all are included in Appendix B.


Table 4: Painting results (the target and result images are omitted here)

6.2 Final Experiments

The final experiments were done to gain an insight into how the system is perceived by users. To do this, five participants got to test the system and provide feedback in the form of a questionnaire. The procedure was as follows:


• A consent form was filled in by the participant

• An introduction to the system was given in the form of a hand-out sheet

• The participant painted together with Baxter for a few minutes

• A questionnaire was filled in by the participant

The questionnaire contained questions about how every part of the system performed during the session, and questions about the participants' general thoughts on the system. The consent form, hand-out sheet and questionnaire are included in Appendices C, D and E. The results are presented in Sections 6.2.1 to 6.2.4. The raw data is also included in Appendix F.

6.2.1 Individual Task Performance

This section covers the results of the first four questions of the questionnaire. These questions focus on the performance of the individual components of the system. The level of agreement with the following four statements was examined:

1. The robot was able to understand what I painted
2. What the robot painted was related to what I painted
3. The painting skills of the robot were good
4. Interacting with the robot (using the program) was easy

The following scale was used:

1. Strongly disagree
2. Somewhat disagree
3. Neutral
4. Somewhat agree
5. Strongly agree

The results can be seen in Figure 25 and Table 5.


Figure 25: Graph of questions 1 - 4

Question    Average

1           4.4
2           4.6
3           3.8
4           5

Table 5: Results of questions 1 - 4 of the questionnaire

6.2.2 Experience Description

This section covers the fifth question of the questionnaire:

• Describe your experience with a few words

All words and their frequencies are shown in Table 6. The full table of words, along with the reasoning behind each choice of word, is included in Appendix B.


Description    Frequency

Cool           1
Easy           2
Exhilarated    1
Fun            3
Good           2
Impressed      1
Improvement    1
Interesting    1
Surprised      1
Thoughtful     1

Table 6: Descriptive words and frequency

6.2.3 Purpose Agreement

This section covers the sixth question of the questionnaire:

• Do you think a system like this could be used to help people suffering from mental illnesses feel better?

The results can be seen in Figure 26, and the average was 2.8.

Figure 26: Results of question 6 of the questionnaire

6.2.4 Personal Interests

This section covers the seventh question of the questionnaire:

• Do you have an interest in robotics and/or arts?


The results are presented in Figure 27.

Figure 27: Results of question 7 of the questionnaire


7 Discussion

This section will discuss the results and key parts of the project, such as current limitations and suggestions for further development.

7.1 Results

This section will discuss the results from the two experiments, starting with the initial system tests.

7.1.1 System Tests

All components performed fairly well during the system tests, although some issues regarding the interpretation and painting components were noticeable.

The shape of features was classified correctly in the majority of cases, but there were several occasions where a feature could not be classified at all, for example rows 2 and 6 of Table 2. The colour was successfully detected in every case. This was expected, since the three colours used are very different. Experiments would have to be conducted with more similar colours in order to determine how good the colour recognition actually is.

The painting tests show that the system has problems painting small shapes, as seen in row 4 of Table 4. This is partly because of human error when recording the points on the canvas, but also because the brush used is relatively large. When drawing larger shapes, these same small inaccuracies are less significant. Another issue is shown in row 2 of the same table, where part of the circle is not correctly painted. This is caused by how far apart the points on the canvas are defined; a spacing of three centimetres turned out to be insufficient in this case.
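
One way to mitigate this would be to interpolate extra waypoints along the target shape instead of relying on a fixed point spacing. A minimal sketch of this idea, for a circle, is given below; it is not part of the implemented system, and all names and units are illustrative.

    import math

    def circle_waypoints(cx, cy, radius, spacing):
        """Generate points along a circle so that consecutive points
        are at most `spacing` apart (all units in centimetres)."""
        circumference = 2 * math.pi * radius
        n = max(8, int(math.ceil(circumference / spacing)))
        return [(cx + radius * math.cos(2 * math.pi * i / n),
                 cy + radius * math.sin(2 * math.pi * i / n))
                for i in range(n)]

    # e.g. a 10 cm radius circle with waypoints at most 1 cm apart
    points = circle_waypoints(0.0, 0.0, 10.0, 1.0)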

7.1.2 Questionnaire

Having a sample of only five participants is problematic, as it is hard to draw any solid conclusions from such a small number. More data would give a more accurate result. Despite this, the questionnaire gave a general indication of how the system was perceived.

The results from the first four questions, regarding the individual components of the system, were mainly positive. The only component that received any criticism was the painting. The reason behind the criticism might be that Baxter is fairly bad at painting small shapes, as seen during the initial system tests.

The words chosen when asked to describe the system were mostly positive. This does not necessarily mean that the participants thought of the system as mostly positive, of course; many factors play into this. One big factor might have been that the participants were not answering the questionnaire anonymously. The results still reflect some of their thoughts on the system. More than one person thought that the system was easy to use, which correlates with the result from question four. Other words suggest that the concept is interesting: Cool, Interesting, Thoughtful. The results also showed that some improvements to the interpretation could be made, as one participant experienced that lines were being treated as rectangles.

The majority also thought that a system like this could be useful for treating people suffering from mental illnesses. Four out of five people agreed that it could be useful, while the fifth person remained neutral.

7.1.3 Conclusions

The interpretation shows good results, and is able to correctly identify the painted shape most of the time. Much care was put into making sure that the SAR does not misunderstand the intentions of the user, since this is one of the potential issues with art therapy SARs discussed by Cooney and Menezes [12].

However, the painting ability of Baxter is poor compared to the results of several other projects. This might be because painting was not the main focus of the project. If anything, this shows that the results from those projects can be used to implement an improved painting component for Baxter in the future.

Similarly to the results of Alves-Oliveira et al. [11], this project was unable to reach any decisive results showing whether a SAR can be used to increase creativity in people. The reason might also be similar; SARs need further development before they can produce any clear results.

7.2 Project Relevance in Society

Having access to a system like this would allow more patients to be treated simultaneously. This would lead to reduced waiting times, and more people could be declared healthy every day. This would be a large economic gain for society. Beyond these factors, there are also ethical aspects to consider. Improving the general well-being of people would be a great thing, even without any immediate economic gain.

7.3 Societal Demands on Technical Product Development

The development of a product has to be done very carefully. Failure to do so might make the product unusable and undesirable, or put the end-user at risk. The demands most relevant to this project are those of security and integrity.


Security

A requirement for this project is that the user should be able to disable the SAR at any point during the interactive process, should anything go wrong. This can be done by clicking a button on the GUI. This feature was heavily tested, and the final version has so far worked flawlessly. However, the possibility of an oversight leading to a system crash, and thereby an unresponsive GUI, has not been completely eliminated.
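
As an illustration, the sketch below wires a GUI button to Baxter's motor enable interface from the baxter_interface package. The node name and window layout are assumptions for this sketch; the wiring in the project's actual GUI may differ.

    import rospy
    import baxter_interface
    import tkinter as tk

    rospy.init_node("gui_stop_button")

    def stop_robot():
        # Disable Baxter's motors immediately, ending any ongoing motion
        baxter_interface.RobotEnable().disable()

    root = tk.Tk()
    tk.Button(root, text="Stop robot", command=stop_robot,
              bg="red", fg="white").pack(padx=20, pady=20)
    root.mainloop()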

Integrity

Products must also ensure that regulations regarding personal information, such as the GDPR, are followed correctly. The SAR created in this project does not store any information about the user, and the captured images and their interpretations are not stored after the session with the user ends. However, there might be a desire for the SAR to, in the future, store information about the patient and the session. This could have a number of benefits, but would have to be handled very carefully.

7.4 Limitations and Problems

There are a lot of limitations to the system created in this project. Accepting some of these limitations was necessary to achieve a finished system within a reasonable amount of time.

One big limitation is the interpretation. The system is only able to classify shapes as lines, squares, rectangles or circles, which is a very small subset of all shapes. A more general way of detecting arbitrary shapes would be preferable. The current system will also fail to identify these shapes when they are drawn inside another shape, further limiting the user. Distinguishing between green and blue was also problematic, as both were perceived as blue by the camera.
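
To illustrate why green and blue are easily confused, the sketch below shows a simple HSV-thresholding approach with OpenCV. The hue ranges are illustrative assumptions, not the thresholds used in the project, but they show how narrow the margin between green and blue hues can be.

    import cv2
    import numpy as np

    # Illustrative hue ranges (OpenCV hue runs from 0 to 179); note how
    # the green and blue ranges border each other
    COLOUR_RANGES = {
        "red":    [(0, 10), (170, 179)],  # red wraps around the hue circle
        "yellow": [(20, 35)],
        "green":  [(36, 85)],
        "blue":   [(86, 130)],
    }

    def dominant_colour(bgr_image):
        """Return the name of the colour whose hue mask covers the most pixels."""
        hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
        best, best_count = None, 0
        for name, ranges in COLOUR_RANGES.items():
            count = 0
            for lo, hi in ranges:
                mask = cv2.inRange(hsv,
                                   np.array([lo, 60, 60]),
                                   np.array([hi, 255, 255]))
                count += cv2.countNonZero(mask)
            if count > best_count:
                best, best_count = name, count
        return best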

The painting is also very limited. The table holding the canvas has to be at a specific height, and the canvas and paints have to remain in a very specific location. Baxter is also unable to clean the brush before switching to another colour, which currently has to be done manually. This means that this exact system would be difficult to use in the real world; improvements would have to be made.

7.5 Future Work

This project did not conclude whether a SAR of this type actually is beneficial. This is an important question, and it has to be answered at some point in the development. It is likely that the current model would not prove beneficial in reality, as explained in Section 7.4. Possible improvements will be discussed below. At some point, the improved SAR could be used in a study that aims to answer this question.

The interpretation can be expanded upon. This would enable the system to recognise more shapes and objects, allowing the user to paint with fewer restrictions. This can be done by applying more advanced CV concepts, or by further developing the implementations of the Hough transform and contour approximation. It would also be very interesting to investigate the results of using machine learning for the interpretation and decision phases.
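
As a starting point, the sketch below shows how contour approximation in OpenCV can classify the simple shapes used here. The thresholds are illustrative, and the project's actual classifier may differ in detail.

    import cv2

    def classify_shape(contour):
        """Classify a contour as a line, square, rectangle or circle
        using contour approximation."""
        perimeter = cv2.arcLength(contour, True)
        approx = cv2.approxPolyDP(contour, 0.04 * perimeter, True)

        if len(approx) <= 2:
            return "line"
        if len(approx) == 4:
            # Use the bounding-box aspect ratio to separate squares
            # from other rectangles (threshold is illustrative)
            x, y, w, h = cv2.boundingRect(approx)
            aspect = w / float(h)
            return "square" if 0.95 <= aspect <= 1.05 else "rectangle"
        if len(approx) > 6:
            return "circle"
        return None  # could not classify the shape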

Work can be done to generalise the painting ability by using Inverse Kinematics and CV instead of pre-defined points on a canvas. This could allow the SAR to automatically recognise the location of the canvas, and calculate correct placements of the brush. Inverse Kinematics alone would enable the SAR to paint with higher precision, since the current method only approximates the correct brush placement.
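
As an illustration of this idea, the Baxter SDK exposes an Inverse Kinematics service through ROS. A minimal sketch of querying it for joint angles is given below; the downward brush orientation and the function itself are assumptions for illustration, not part of the implemented system.

    import rospy
    from geometry_msgs.msg import Pose, PoseStamped, Point, Quaternion
    from std_msgs.msg import Header
    from baxter_core_msgs.srv import SolvePositionIK, SolvePositionIKRequest

    def ik_joint_angles(limb, x, y, z):
        # Ask Baxter's IK service for joint angles that place the
        # end effector (the brush) at the given coordinate
        ns = "ExternalTools/" + limb + "/PositionKinematicsNode/IKService"
        rospy.wait_for_service(ns)
        iksvc = rospy.ServiceProxy(ns, SolvePositionIK)

        pose = PoseStamped(
            header=Header(stamp=rospy.Time.now(), frame_id="base"),
            pose=Pose(
                position=Point(x=x, y=y, z=z),
                # Brush pointing straight down (assumed orientation)
                orientation=Quaternion(x=0.0, y=1.0, z=0.0, w=0.0)))
        request = SolvePositionIKRequest()
        request.pose_stamp.append(pose)
        response = iksvc(request)
        if response.isValid[0]:
            return dict(zip(response.joints[0].name,
                            response.joints[0].position))
        return None  # no valid joint solution for this pose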

There are also many practical issues with the current model that have to be resolved. For use in reality, the SAR has to be able to change colour automatically. The automation of other tasks, such as replacing the canvas with a new one, would also be useful.

The ability to give voice commands to Baxter could also be implemented. The designed GUI was well received during the experiments, but voice commands could help simplify the process.

References
