Bachelor Thesis
Halmstad University
Computer Science and Engineering, 300 credits
Interactive Robot Art
A turn-based system for painting together with a robot
Computer Science and Engineering, 15 credits
Halmstad 2019-06-18
Nils Lindhqvist, Erik Westberg
Abstract
A large number of people suffer from mental illnesses such as depression and autism. Receiving the care they need can be a very difficult process, with long queues and expensive bills. Automating part of the therapeutic process might be a solution to this. More patients could be treated at the same time, and the cost could be decreased.
This project explores the possibilities of using a robot that paints together with patients. Such a robot would encourage the patient to be creative, which is thought to be an effective way of improving their well-being.
The painting will be done in a turn-based fashion, with the user and the robot taking turns adding details to the same painting.
Software is developed for the robot Baxter, made by Rethink Robotics.
Computer vision concepts and algorithms are applied to interpret what the user has painted and to construct a plan of what Baxter will paint. Painting is then done by tracing the target shape through a set of pre-defined points on a canvas.
The constructed system performs fairly well, although the user is limited to painting lines, squares, rectangles and circles. Further work can be done to increase the number of options available to the user. This system serves as a model of how a similar system could be used in reality.
Sammanfattning
A large number of people suffer from mental illnesses such as depression and autism. Receiving the help they need can be a hard process, with long queues and high costs. This means that many people suffer for a longer time than they should have to, or cannot afford the help they need. Automating part of the therapeutic process could be a solution to this. Several patients could be treated at the same time, and costs would decrease.
This project explores the possibilities of using a robot that paints together with patients. This would encourage the patients to be creative, which is thought to be an effective way of improving their well-being. The robot will paint together with patients in a turn-based fashion, where each adds details to the same painting.
Software is developed for the robot Baxter, made by Rethink Robotics. Computer vision concepts and algorithms are applied to interpret what the user has painted and to build a plan of what Baxter will paint. The painting is carried out by tracing the final shape through a set of pre-defined points on the canvas.
The constructed system performs fairly well, although the user is limited to drawing lines, squares, rectangles and circles. Further development can expand the number of options available to the user. The system serves as a model of how a similar system could be used in reality.
Acknowledgements
We would like to thank our supervisor, Martin Cooney, for excellent support and guidance.
We also want to thank the volunteers for participating in the experiments.
Nils Lindhqvist & Erik Westberg
Halmstad, June 2019
Contents
1 Introduction
1.1 Purpose
1.2 Requirements
1.3 Delimitations
1.4 Research Questions
2 Related Work
2.1 Painting Robots
2.2 Socially Assistive Robots
2.2.1 SARs for Creativity
2.3 Art Therapy SARs
2.4 Conclusions
3 Computer Vision
3.1 Colour Labelling
3.2 Image Thresholding
3.3 Noise Filters
3.4 Edge Detection
3.5 Shape Recognition
4 Design
4.1 Main Program Structure
4.2 Workflow
4.3 Hardware
4.4 Programming Language
4.5 Software
4.5.1 Robot Operating System
4.5.2 OpenCV
4.5.3 Tkinter
4.6 Evaluation
5 Implementation
5.1 Utility Modules
5.1.1 SavePoint
5.1.2 MoveToPoint
5.1.3 CaptureImage
5.2 Interpretation
5.3 Decision
5.4 Painting
5.5 Interaction
6 Results
6.1 System Tests
6.1.1 Interpretation
6.1.2 Decision
6.1.3 Painting
6.2 Final Experiments
6.2.1 Individual Task Performance
6.2.2 Experience Description
6.2.3 Purpose Agreement
6.2.4 Personal Interests
7 Discussion
7.1 Results
7.1.1 System Tests
7.1.2 Questionnaire
7.1.3 Conclusions
7.2 Project Relevance in Society
7.3 Societal Demands on Technical Product Development
7.4 Limitations and Problems
7.5 Future Work
8 Conclusions
8.1 Requirements
8.2 Research Questions
1 Introduction
Mental and cognitive disorders such as depression, autism and trauma affect a large number of people all over the world. The ability to treat these problems is improving, often through means such as drugs or therapy. However, medical staff is a limited resource, and not everyone can receive the care they require. The waiting time might be unreasonably long, or the cost too high. This project will explore the possibility of automating part of the therapeutic process by using a robot. This would be economically beneficial, as the personnel could shift their focus to other patients. There is also an important ethical aspect, as the general well-being of patients could be improved at a faster rate.
The robot will perform art therapy, a strategy already conducted by therapists. The intent is to let patients express their feelings through art rather than words. This can be effective when, for example, working with children suffering from severe trauma. They might be unable to describe their feelings, but able to make a drawing based on them. Art therapy is not only used as a way for the therapist to understand the patient; the very act of creating art has also been shown to improve the health of patients [1]. This concept is the core of this project. The robot will be used to motivate the patient to continue painting. The incentive will come from the robot being able to provide feedback, by also adding features to the same painting.
This will be done in a turn-based fashion. When the patient finishes their turn, the robot will analyse what was painted. The result of this analysis will be used to influence the robot's choice of what to paint next. This allows the patient and the robot to create a painting together. The concept is shown in Figure 1.
Figure 1: Baxter with brush, paints and canvas
1.1 Purpose
A robot will be taught how to paint, and a turn-based system will be implemented. This will allow a person and the robot to take turns adding details to the same painting. What the robot paints should relate to what the person has painted, to give the feeling of cooperation.
1.2 Requirements
There are three distinct requirements for the robot.
• The ability to perceive and recognise what the user has painted
• Based on what the user painted construct a plan of what to paint
• Be able to paint on a canvas in accordance with the constructed plan

There are also two requirements on the system that handles communication between user and robot.
• The user needs to be able to tell the robot that it is free to start painting
• The user needs to be able to turn the robot off at any point in the process
1.3 Delimitations
To allow the project to be completed within the time frame, two delimitations have been set.
• The robot will only be able to understand some basic features, such as polygons
• The canvas and paints will be placed at a fixed location relative to the robot
1.4 Research Questions
To complete the project, the following questions will need to be answered.
• What are the best ways of understanding what basic features are present in an image?
• How can a plan of what to paint be constructed from information about what the user painted?
• How can the robot paint some basic features on a canvas?
2 Related Work
Much research has been done regarding the two largest building blocks of this project: painting robots and Socially Assistive Robots. Only in recent years have these two fields started to be combined. Relevant parts of this research are presented here.
2.1 Painting Robots
The topic of painting robots is heavily explored, and many projects have been successful in creating such robots [2], [3], [4], [5]. Some of these are described here.
Scalera et al. [2] developed a painting robot that is able to reproduce grey-scale watercolour artworks. A big part of this project was the desire to distinctly visualise each stroke, to make the observer recall the gestures of a human. This was achieved by filling all areas of similar intensity with strokes separated by a certain distance. A shorter distance makes the area appear darker, as less of the white canvas is visible. The contours of objects are detected using the Canny Edge Detector, and Inverse Kinematics is used to paint the image.
Jaquier [3] created an interactive painting robot capable of playing tic-tac-toe on paper. The game board is interpreted using edge detection and contour extraction, described in Sections 3.4 and 3.5. The robot constructs an idea of what to paint as an internal image. Replicating this image on the canvas is done by extracting the contours and tracing them using Inverse Kinematics.
Another interactive painting robot was constructed by Grosser [4]. This robot is able to listen to its surroundings and use the sounds to influence what it paints. A genetic algorithm handles the decision of what to paint. How the actual painting is done is not described.
2.2 Socially Assistive Robots
Socially assistive robots (SARs) aid people through social interactions. This can be done through, for example, speech, motion or painting. It is a relatively new field, with many promising applications [6].
One example is a SAR used in physical therapy for patients who have suffered a stroke. The goal is to shorten the recovery period by having the robot encourage the person to do a certain physical task. The robot is able to remind the patient to perform their exercise if they have not, and show praise otherwise. The concept was appreciated by patients during experiments, although much work remains to be done [7].
Matarić et al. [8] have studied SARs for use in autism therapy, mainly aimed at children. Several properties of such SARs are discussed, of which two are especially relevant to this project: the possibilities of helping children through imitation and through turn-taking. The ability to imitate others and to take part in social turn-taking games are important ways of learning social behaviours. However, imitation often does not come naturally to children with autism, and they often find social games challenging. SARs designed to encourage imitation and to play turn-taking games with such children could therefore be beneficial to their personal development.
2.2.1 SARs for Creativity
Several projects have applied SARs in attempts to increase creativity. One example is a SAR named YOLO that focuses on developing creativity in children. This is done by having the children create stories involving YOLO, to which YOLO is able to react. This is meant to boost idea generation in the children, thereby training their creative abilities [9].
A co-creative drawing system was created by Jansen and Sklar [10]. Their system is meant to act as an inspirational agent for artists, solving issues of "artist's block". However, the authors found that artists were resistant to the idea of having a SAR directly interfere with their drawing. This was solved by having the SAR simply project ideas onto the canvas, leaving the final decision of what to draw to the artist. Despite these results, this project will aim to develop a SAR that actually paints on the canvas. This is motivated by the fact that the goal is not to help users create a good artwork, but rather to encourage users who would not otherwise paint at all to do so.
Alves-Oliveira et al. [11] conducted a study in an attempt to determine whether SARs can be used to spark creativity. This was done using an interactive painting robot, similar to the concept of this project. In this study, participants were asked to paint on a tablet. After doing so, the robot responded to some participants, while other participants only got a digital response from the tablet. The study concluded that, contrary to expectations, painting with the robot did not actually spark creativity.
The authors suggest that this might be because some participants thought the robot "ruined their drawing", and because they had higher expectations of the robot. Further studies with a more sophisticated SAR can be done to eliminate these issues, which hopefully would yield more favourable results.
2.3 Art Therapy SARs
This project is largely inspired by the works of Cooney and Menezes [12]. They discuss ideas and concepts regarding the design of a SAR for conducting art therapy. Many ethical pitfalls are introduced, together with some proposed solutions. Not all of these pitfalls can be addressed in this project, and some will be left for future development. The pitfalls that will be addressed are those of security, confidentiality and misjudgement. These pitfalls are described here.
Developing the SAR to be safe is crucial. The SAR should preferably not be able to hurt the user, even in worst-case scenarios. The user should also be able to quickly disable the robot, should anything unexpected happen.
Cooney and Menezes also discuss the importance of letting the user know that the SAR is safe, as they otherwise might feel unsafe despite this not being the case.
Confidentiality is a delicate subject, but very important. An art therapy SAR might benefit from knowledge of the diagnosis of the patient, as the decisions made by the SAR could be adapted to the needs of the patient.
Logging as much information as possible from sessions with patients might be useful to the actual therapist. However, all this information is very sensitive and should be kept private. The system must also make sure to follow regulations regarding the storage of such personal information, such as the General Data Protection Regulation (GDPR).
The third pitfall, misjudgement, is a complex issue. Incorrectly interpreting what the user has painted could lead to the SAR doing more harm than good. The patient might feel ignored and unimportant if the SAR paints something that does not correctly relate to what they painted. A solution presented by Cooney and Menezes is to explicitly ensure correct interpretations by having the SAR ask the patient for confirmation, and never act unknowingly.
2.4 Conclusions
Painting robots are a heavily explored area, and many existing solutions perform extremely well. SARs, on the other hand, remain largely unexplored, especially regarding art therapy specifically. This project will therefore put emphasis on these parts of the system. This means that the pitfalls discussed in Section 2.3 have to be carefully incorporated into the project. If the project is successful, the SAR could be used to further the study done by Alves-Oliveira et al. [11].
3 Computer Vision
Many tasks, such as object detection, require computers to gain an understanding of images. Computer Vision (CV) is the field that studies how this can be done. This section will introduce the CV concepts that are required to complete this project. This includes the following topics: Colour Labelling, Image Thresholding, Noise Filters, Edge Detection, Contour Extraction and Shape Recognition.
3.1 Colour Labelling
To understand the colour of features in an image, some processing has to be done. This section will describe a few alternative methods for labelling colours. Labelling refers to the process of identifying colours as elements of a predefined set of colours, for example ’red’ or ’yellow’.
RGB
Computers often use the RGB model to describe colours. In the RGB model, a colour is defined using three parameters: the amounts of red, green and blue in the colour. These parameters are based on the concept of additive mixing of light. Variable amounts of red, green and blue light can be combined to construct light of almost any colour. This is exactly how colours are displayed on screens, often making the RGB model an obvious choice.
However, the model is often sub-optimal when trying to label colours. To do this using the RGB model, boundaries in three-dimensional space have to be defined for each possible label. Labelling is then done by observing which of these regions a colour falls into.
There are several other colour models that make this problem much easier to solve, for example the HSV, HSL and CIELAB colour models. Colour labelling using these models is described below.
HSV / HSL
Similar to the RGB model, the HSV and HSL colour models also use three parameters to define colours. However, these parameters are designed differently from those in the RGB model. The main component of how humans differentiate colours, hue, is represented by a single parameter, rather than being a combination of all three. The two other parameters describe how bright and how saturated the colour is [13]. Chromatic colours can now be labelled by defining boundaries for the hue parameter alone. However, achromatic colours will be incorrectly labelled using this method, since they are not part of the hue spectrum. To solve this, constant upper and lower bounds on brightness and saturation can be defined. If a bound is breached, a separate check can label the colour as white, black or a shade of grey.
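This hue-based labelling can be sketched in Python with the standard library's colorsys module. The hue boundaries and the saturation/brightness cut-offs below are illustrative assumptions, not values taken from this project:

```python
import colorsys

# NOTE: hue boundaries and the 0.2/0.8 cut-offs are illustrative
# assumptions chosen for this sketch, not values from the thesis.
HUE_RANGES = {  # hue in [0, 1) as returned by colorsys
    "red": [(0.0, 1 / 12), (11 / 12, 1.0)],  # red wraps around 0
    "yellow": [(1 / 12, 1 / 4)],
    "green": [(1 / 4, 1 / 2)],
    "blue": [(1 / 2, 3 / 4)],
    "magenta": [(3 / 4, 11 / 12)],
}


def label_colour(r, g, b):
    """Label an RGB colour (channels 0-255) via the HSV model."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    # Achromatic check: colours with low saturation or brightness are
    # not on the hue wheel, so they are handled by a separate rule.
    if v < 0.2:
        return "black"
    if s < 0.2:
        return "white" if v > 0.8 else "grey"
    # Chromatic colours are labelled from the hue parameter alone.
    for name, ranges in HUE_RANGES.items():
        if any(lo <= h < hi for lo, hi in ranges):
            return name
    return "unknown"
```

For example, `label_colour(255, 0, 0)` falls in the red hue range, while a mid-grey such as `label_colour(128, 128, 128)` is caught by the saturation check.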
CIELAB
Another powerful colour model is the CIELAB model. Colours are again described using three parameters: L*, a* and b*. This time, the parameters correspond very closely to how the colour is perceived by humans. The amount of change applied to any of the parameters relates to how much the colour changes according to a human observer. This allows the model to be used to calculate the perceived dissimilarity of any two colours. This is done by treating colours as points in three-dimensional space, with the three parameters as axes. The perceptual difference between any two colours is equal to the Euclidean distance between the points of those colours, as shown in Equation 1. If the Euclidean distance is large, the colours will be perceived as very different [13]. Labelling a colour can be done by defining the ideal parameters for each available label, and then checking which label is closest to the actual colour.
ΔE*_ab = √((L*_1 − L*_2)² + (a*_1 − a*_2)² + (b*_1 − b*_2)²)    (1)
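This nearest-label scheme can be sketched in a few lines of Python. The (L*, a*, b*) reference values below are rough approximations chosen only for illustration, not calibrated values from this project:

```python
import math


def delta_e(lab1, lab2):
    """Perceptual difference: Euclidean distance between two (L*, a*, b*)
    points, i.e. the quantity written as ΔE*_ab in Equation 1."""
    return math.sqrt(sum((c1 - c2) ** 2 for c1, c2 in zip(lab1, lab2)))


# NOTE: rough illustrative (L*, a*, b*) reference values (assumptions).
REFERENCE = {
    "red": (53.2, 80.1, 67.2),
    "green": (87.7, -86.2, 83.2),
    "blue": (32.3, 79.2, -107.9),
}


def label_lab(lab):
    """Label a colour with the reference label that is perceptually closest."""
    return min(REFERENCE, key=lambda name: delta_e(lab, REFERENCE[name]))
```

A colour such as (50, 75, 60) lies a small Euclidean distance from the red reference point and a large distance from the others, so it would be labelled "red".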
3.2 Image Thresholding
Image thresholding is used to manipulate an image based on the state of each pixel. This can be used to separate areas of interest from their surroundings, simplifying any further processing of the image. The goal is to construct a new binary image. A binary image consists of pixels limited to one of two states, usually represented as black or white. Three methods of achieving this are presented here: regular thresholding, adaptive thresholding and range-based thresholding. A comparison is shown in Figure 2.
Regular thresholding operates on grey-scale images, and simply compares the intensity of each pixel to a specified threshold value. Pixels will be black if their intensity is below the threshold, and white otherwise. Adaptive thresholding also operates on grey-scale images, but does not use a fixed threshold value during the comparison. Rather, a different threshold is determined for each pixel based on the pixels in a region around it. This means that pixels that are generally brighter than other pixels in their region will be set to white, and set to black otherwise. This is useful when, for example, a shadow is cast on part of an object.
Range-based thresholding is slightly different, and can handle colour images. A specified region of colours is used instead of a threshold value. Pixels whose colour is within this region are set to white, and all other pixels are set to black.
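The first two methods can be sketched in plain Python on a grey-scale image stored as a list of rows of 0-255 intensities; in practice a library such as OpenCV would do this far more efficiently:

```python
def threshold(image, value):
    """Regular thresholding: compare each pixel of a grey-scale image
    (rows of 0-255 intensities) to a fixed value; below -> black (0),
    otherwise -> white (255)."""
    return [[255 if px >= value else 0 for px in row] for row in image]


def adaptive_threshold(image, radius=1):
    """Adaptive thresholding: compare each pixel to the mean intensity of
    a small neighbourhood around it instead of a fixed global value."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Collect the neighbourhood, clipped at the image borders.
            patch = [image[j][i]
                     for j in range(max(0, y - radius), min(h, y + radius + 1))
                     for i in range(max(0, x - radius), min(w, x + radius + 1))]
            local = sum(patch) / len(patch)
            row.append(255 if image[y][x] >= local else 0)
        out.append(row)
    return out
```

With a fixed threshold, a dark pixel in a shadowed region is always black; with the adaptive version, it can still come out white if it is brighter than its neighbours.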