
9  Evaluation of Phrases 2: Experiment

9.2  Method

9.2.1 Participants 

The participants were recruited through a combination of snowball and convenience sampling, aiming at a sufficiently large sample with a good distribution in terms of age, education and gender. Because it was difficult to find enough men who were willing to participate, the resulting group consisted of 24 women and 12 men. Of the participants in this experiment, 7 worked with computers as assistive technology, 3 worked with computers as play and/or education for children with disabilities, and for 4, computers as assistive technology were a small part of their usual work. A further 13 participants were enrolled in a course for personal assistants, and 9 had no professional link to people with disabilities and/or assistive technology.

The participants were asked about their age, gender, length of education and familiarity with communication aids. Table 9.1 lists all the participants, the access mode they used in the experiment, and what group they were in (regular or pilot).

Table 9.1. Participants in the experiment regarding Phrases 2

Participant  Gender  Age    Education                Comm. aids  Access mode   Group
1            Female  21-40  University 2-3 years     Familiar    Touch Screen  Pilot
2            Female  41-60  Long Academic ≥ 4 years  Familiar    Touch Screen  Pilot
3            Female  41-60  University 2-3 years     Unfamiliar  Touch Screen  Pilot
4            Female  41-60  Long Academic ≥ 4 years  Familiar    Touch Screen  Pilot
5            Female  41-60  University 2-3 years     Familiar    Touch Screen  Pilot
6            Female  41-60  University 2-3 years     Unfamiliar  Touch Screen  Pilot
7            Male    41-60  Long Academic ≥ 4 years  Familiar    Touch Screen  Pilot
8            Female  61+    University 2-3 years     Familiar    Touch Screen  Pilot
9            Male    21-40  12 years                 Familiar    Touch Screen  Pilot
10           Male    21-40  Long Academic ≥ 4 years  Familiar    Touch Screen  Pilot
11           Female  21-40  University 2-3 years     Familiar    Touch Screen  Pilot
12           Female  21-40  Long Academic ≥ 4 years  Familiar    Touch Screen  Pilot
13           Male    21-40  12 years                 Unfamiliar  Touch Screen  Regular
14           Female  21-40  12 years                 Unfamiliar  Touch Screen  Regular
15           Male    21-40  Long Academic ≥ 4 years  Unfamiliar  Touch Screen  Regular
16           Male    41-60  University 2-3 years     Unfamiliar  Touch Screen  Regular
17           Female  61+    University 2-3 years     Unfamiliar  Touch Screen  Regular
18           Female  41-60  Long Academic ≥ 4 years  Unfamiliar  Touch Screen  Regular
19           Male    ≤ 20   9 years                  Unfamiliar  Touch Screen  Regular
20           Male    41-60  12 years                 Familiar    Touch Screen  Regular
21           Male    21-40  12 years                 Unfamiliar  Touch Screen  Regular
22           Female  ≤ 20   9 years                  Unfamiliar  Touch Screen  Regular
23           Female  21-40  12 years                 Unfamiliar  Mouse         Regular
24           Male    21-40  12 years                 Unfamiliar  Mouse         Regular
25           Female  21-40  12 years                 Familiar    Mouse         Regular
26           Female  21-40  12 years                 Unfamiliar  Mouse         Regular
27           Female  21-40  12 years                 Unfamiliar  Mouse         Regular
28           Female  21-40  12 years                 Unfamiliar  Mouse         Regular
29           Female  21-40  9 years                  Unfamiliar  Mouse         Regular
30           Female  21-40  12 years                 Unfamiliar  Mouse         Regular
31           Female  ≤ 20   12 years                 Unfamiliar  Mouse         Regular
32           Female  21-40  University 2-3 years     Familiar    Mouse         Regular
33           Male    21-40  12 years                 Unfamiliar  Mouse         Regular
34           Female  21-40  12 years                 Unfamiliar  Mouse         Regular
35           Female  21-40  12 years                 Unfamiliar  Mouse         Regular
36           Male    41-60  University 2-3 years     Familiar    Mouse         Regular


There were two groups of participants: a pilot group and a regular group. The pilot group differed from the other participants in two ways: they were much more familiar with communication aids (10 out of 12, vs. 4 out of 24 in the regular group), and they gained more experience with the vocabulary during the test. The pilot group was asked to find 18 + 18 utterances instead of 10 + 10, and to write 8 utterances instead of 10.

Another difference within the group of participants was that 22 participants used a touch screen during the experiment, whereas 14 used a mouse. The results were tested for differences between these groups (pilot vs. regular, touch screen vs. mouse) in order to investigate whether these differences affected the results in any way, or whether the participants could be treated as one group.

The pilot group originally consisted of 13 participants, but one participant misinterpreted the instructions in the first part of the experiment. She answered with another utterance instead of duplicating the ones that were presented, and also did not complete the whole experiment. This participant was therefore excluded from the group of participants.

Two participants from the regular group also misinterpreted the first part of the experiment. One of them grasped the concept after a number of questions, but then continued to perform inconsistently. Their results from the first two parts of the experiment were not counted. However, they performed normally on the writing part, and their results were included there. Two other participants failed to write with the on-screen keyboard, but their results from the other parts of the experiment were counted.

9.2.2 Instruments 

The first 22 participants got to use a portable computer with a touch screen, in most cases a Fujitsu P1510 Lifebook with built-in 8.9″ touch screen. Some participants in the pilot group used a Panasonic Toughbook with 14″ touch screen, so that two persons could take part in the experiment at the same time.

On one occasion 7 identical Fujitsu Media PCs with 19″ flatscreens and regular mice were used, so that 14 participants could take part in the experiment within a time frame of two hours.

An 18-minute video presenting the structure of Phrases 2 was used to teach the participants the specifics of the vocabulary.

All 36 participants used the same test software, a test version of the prototype software for Phrases 2, where the vocabulary was organised in exactly the same way as in the role-play and shopping evaluations of the vocabulary. The test software was created in Toolbook® Instructor 9, and the logging of the results was performed within that software.


For the statistical calculations, SPSS v. 16.0 was used to perform group statistics, independent samples t-tests and paired t-tests.

9.2.3 Procedure 

The first group of participants (13 persons) got a pilot version of the experiment with 18 questions from each part of the vocabulary instead of 10. They also got 8 utterances to write instead of 10. A preliminary evaluation of their results showed that some of the utterances were hard to find, and the participants showed signs of frustration about this.

Because of that, the number of questions was reduced. The questions that remained were the same as those the pilot group had been given, only fewer, and the order of the questions was the same, except for the most difficult ones, which were presented in another way. It was therefore decided to keep the results from the pilot group, but to test their results against the others’ to see if there were any significant differences in their performances.

 Introduction 

To start with, the participants watched an 18-minute video in which the vocabulary Phrases 2 was described in detail. Figures 9.1 and 9.2 show screenshots from the video.

   

Figures 9.1 and 9.2. Screenshots from the video about Phrases 2

The participants then entered information about themselves in the test software. Their first task was to browse the vocabulary on their own for a couple of minutes. 

Part 1 of the experiment 

The participants were given one utterance at a time to locate among the quick-phrases at the top right of the screen. The phrases were always presented in the same order. They were: “Hej (Hello)”, “Ha en trevlig dag (Have a nice day)”, “Jag undrar (I wonder)”, “Så tråkigt (Too bad)”, “Oj då (Oh)”, “tack ska du ha (thank you)”, “det tycker jag inte (I don’t think so)”, “förresten (by the way)”, “grattis (congratulations)”, and “är det min tur nu (is it my turn now)”.


This is the instruction that the participants were given for the first task (translated from Swedish): “Your task is to find a number of ready-made utterances. To begin with, you only have to look among the quick-phrases to the right. // When you select Next, you will be presented with a spoken and written utterance. Find it as quickly as you can, select it and hear it spoken. // Then, select Next again, to get a new utterance to look for. If you, despite several attempts, don’t find the requested utterance, you may stop trying to find it and continue to the next question. // Click on this window to close it.”

The Pilot group was also asked to find the following utterances: “det blev fel (that was wrong), ett ögonblick (one moment), det spelar ingen roll (it doesn’t matter), jag tror inte det (I don’t believe/think so), jaha (aha), okej (okay), tycker du? (do you think?), fy (ugh)”.

Part 2 of the experiment

The participants were given one utterance at a time to locate among the shopping-related phrases on the left and at the bottom of the screen. The phrases were always presented in the same order. These were the phrases: “När har de öppet? (When are they open?)”, “Hur har du det? (How are you?)”, “När går bussen? (When does the bus leave?)”, “Jag tänkte vi skulle åka till apoteket (I thought we should go to the pharmacy)”, “Kan du se bäst-före-datumet? (Can you see the best-before date?)”, “Vad kostar det? (How much is it?)”, “Finns det i brunt? (Do you have it in brown?)”, “Jag tar en sån (I’ll take one of these)”, “Kan jag få ett kvitto? (Can I have a receipt?)”, “Jag har en väska (I have a bag)”.

The participants were given the following instructions for the second task (translated from Swedish): “You now get a number of new utterances to look for. // This time you can find them under one of the headlines AFFÄR (SHOP) or HANDLA & TYCKA (SHOPPING & APPRAISALS). // Click to close this window.”

The video introduction explained that, because the number of menu buttons was limited, most of the utterances for activities that take place in the shop were to be found under the menu AFFÄR, whereas utterances for activities that take place in preparation for the shopping, utterances related to size and colour, and expressions of opinion were to be found under the menu HANDLA & TYCKA.

The Pilot group was also asked to find the following utterances: “Har ni den sista Harry Potter-boken? (Do you have the latest Harry Potter book?), Det blir alldeles lagom (That’s just right), Jag vet inte riktigt (I don’t really know), Den är för genomskinlig (It’s too see-through), Det var bättre (That was better), Det är ju riktigt härligt väder (It’s really lovely weather), Jag vill ha långa (I want long), Du kan få en 100-lapp (You can have a 100-note)”.

Part 3 of the experiment 

The participants were asked to write one utterance at a time with the on-screen keyboard and let the speech synthesis speak the utterance. They were given the following instruction for the third task (translated from Swedish): “Now, you yourself are going to write, using the on-screen keyboard. // Select Next to hear and see the first utterance you are going to write. // When you have finished writing, select the “speak” button, the one with the loudspeaker on it. You can then hear what you have written. Then, select Next, to get the next task. // Click on this window to close it.”

 

   Figure 9.3: Test application with on‐screen keyboard 

The participants were asked to write the phrases: “När går bussen? (When does the bus leave?), Oj då (Oh oh), Vad kostar det? (How much is it?), Finns det i brunt (Do you have it in brown?), Förresten (By the way), Jag har en väska (I have a bag), Grattis (Congratulations), Är det min tur nu? (Is it my turn now?)”. Most of the participants (not the Pilot group) were also asked to write the following two utterances: “Jag tänkte vi skulle åka till Apoteket (I thought we should go to the pharmacy)” and “Tack ska du ha (Thank you)”.

9.2.4 Tests regarding group differences 

Since not all participants got exactly the same test, and because not everybody used the same access mode, it was important to find out whether these differences influenced the test results in any significant way. Two major group comparisons were important to consider:

1. Mouse users vs. touch screen users

Most of the participants in the experiment used a touch screen instead of a mouse to access the vocabulary and to type the utterances in part 3 of the test. One group of 14 participants instead used a mouse to access the computer. Since the rate at which the participants wrote these utterances is of interest, it is important to learn whether there is any significant difference between the writing rates of the mouse users and the touch screen users. (The pilot group is included in the touch screen group.)


2. The pilot group vs. the regular group.

The pilot group consisted mainly of professionals associated with resource centres that worked with assistive technology, including communication aids. Not all 12 participants in this group worked with these aids, but they all used the pilot version of the test that made them look for 18 + 18 utterances instead of 10+10. It could therefore be assumed that they as a group were better acquainted with vocabularies like the one tested, and on top of that they had more exposure to the tested vocabulary during the experiment.

1. Test for differences between mouse users vs. touch screen users: writing rate 

All 36 participants were asked to write the same eight expressions with the on-screen keyboard. Twenty-two used a touch screen to access the keyboard, the rest used a regular mouse. The histograms in figures 9.4, 9.5 and 9.6 show the distribution for the whole group and for the two sub-groups.

      

Figure 9.4: Time it took the whole group of participants to write the eight utterances.

Figure 9.5: Time it took the touch screen users to write the eight utterances.

Figure 9.6: Time it took the mouse users to write the eight utterances.

 

Group statistics showed that for the mouse-using group (n = 14), the mean writing time was 104.2 (SD = 16.8, standard error of the mean 4.8). For the touch screen group (n = 22), the mean writing time was 105.8 (SD = 17.4, standard error of the mean 3.9). The independent samples t-test showed no significant difference between the touch screen users and the mouse users regarding the time it took them to write the eight utterances that they all wrote using an on-screen keyboard during the experiment.

2. Test for differences between the pilot group and the regular group 

Independent t-tests were used to test for differences between the pilot group and the regular group regarding how long it took the participants to write the same 8 expressions, how many expressions they found in the quickfire section (meeting the criteria of ≤3 menu selections within ≤30 seconds) and how many seconds it took them to find these expressions.

The tests showed no significant differences between the two groups for these tasks. The mean number of expressions (out of 10) that were found in the quickfire section by each participant was 7.3. A histogram for the whole group of participants (figure 9.7) suggests a varied distribution in the whole group for this task.

Figure 9.7. Histogram showing the distribution of scores
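The scoring rule for the search tasks (an expression counts as found only if it was selected with at most 3 menu selections and within 30 seconds) can be illustrated with a minimal sketch. The record layout (`Attempt`), the function names and the example values below are hypothetical and are not taken from the Toolbook logging. Averaging the search time over the successful attempts only is likewise an assumption about how the time measure was computed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

MAX_SELECTIONS = 3    # criterion from the text: at most 3 menu selections
MAX_SECONDS = 30.0    # criterion from the text: at most 30 seconds


@dataclass
class Attempt:
    """One logged search for a target utterance (hypothetical record layout)."""
    target: str
    selections: int   # menu selections made before the target was chosen
    seconds: float    # time from presentation of the target to its selection
    found: bool       # False if the participant gave up and moved on


def meets_criteria(a: Attempt) -> bool:
    """True if the target was found within both limits."""
    return a.found and a.selections <= MAX_SELECTIONS and a.seconds <= MAX_SECONDS


def score(attempts: List[Attempt]) -> Tuple[int, Optional[float]]:
    """Return (number of successful attempts, mean seconds over those attempts)."""
    hits = [a for a in attempts if meets_criteria(a)]
    if not hits:
        return 0, None
    return len(hits), sum(a.seconds for a in hits) / len(hits)


# Invented example values for one participant, for illustration only.
example = [
    Attempt("Hej", selections=1, seconds=4.2, found=True),          # counts
    Attempt("Grattis", selections=4, seconds=12.0, found=True),     # too many selections
    Attempt("Förresten", selections=2, seconds=31.5, found=True),   # too slow
    Attempt("Oj då", selections=3, seconds=30.0, found=True),       # counts (limits are inclusive)
    Attempt("Jag undrar", selections=2, seconds=10.0, found=False), # given up
]
n_found, mean_seconds = score(example)
print(n_found, mean_seconds)
```

Both limits are treated as inclusive here, following the "≤" signs in the criteria.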

The number of expressions found in the activity-related section was also tested for group differences. This time the test showed that there was a significant difference between the pilot group and the regular group, regarding how many activity-based utterances they found, meeting the criteria. The pilot group performed better than the regular group on this task (see table 9.2).

Table 9.2. Test for differences in the number of activity-related expressions found

Comparison                             Group    N   Mean  Std. dev.  t    df  Sig. (2-tailed)
Find activity-related expressions (n)  Pilot    12  6.8   1.9        2.7  32  0.01
                                       Regular  22  5.2   1.5
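As a cross-check, the t statistic in table 9.2 can be recomputed from the summary statistics alone. The sketch below is not the SPSS procedure used in the study. It assumes the pooled-variance (equal variances) form of the independent samples t-test, which is consistent with the reported df = 32, and it approximates the two-tailed p-value by numerical integration of the t density. All function names are illustrative.

```python
import math


def pooled_t_test(mean1, sd1, n1, mean2, sd2, n2):
    """Student's independent samples t-test from summary statistics.

    Assumes equal variances in the two groups (pooled variance).
    Returns the t statistic and the degrees of freedom.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value, using Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3      # P(0 < T < |t|)
    return 1 - 2 * central


# Values from table 9.2: pilot group first, regular group second.
t, df = pooled_t_test(6.8, 1.9, 12, 5.2, 1.5, 22)
p = two_tailed_p(t, df)
print(f"t = {t:.1f}, df = {df}, p = {p:.2f}")
```

With the rounded means and standard deviations from the table, this gives values that agree with the reported t = 2.7, df = 32 and p = 0.01 to the precision shown.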