
Mobile Vision Mixer

A System for

Collaborative Live Mobile Video Production

Ramin Toussi

Department of Computer and Systems Sciences

January 2011

Advisor: Oskar Juhlin

Second Advisor: Arvid Engström
Examiner: Fredrik Kilander


Summary. Mobile phones equipped with video cameras and access to high-bandwidth networks enable a new generation of mobile applications: live amateur video production. There are already mobile technologies that allow capturing, editing and broadcasting live video from mobile phones; however, some parts of the production process remain exclusively in the hands of professionals, multi-camera filming in a live broadcast of sporting events being an obvious example.

In this thesis, a system developed to address these needs of amateur video producers is described. Specifically, it focuses on providing a real-time collaborative mobile environment for mobile users who are interested in producing live video from various angles using only their phones.


Acknowledgments

This project would never have been possible without the contribution of several people. First and foremost I would like to thank my advisor and MobileLife centre director, Oskar Juhlin. Thank you for the great support, for trusting me and guiding me through the project. I also want to thank Arvid Engström, my second advisor and Mobility Studio director at MobileLife, whose knowledge and talent have always inspired me.

Arvid also led the team while evaluating the prototype in Malmö and Göteborg. When it comes to the evaluation of the work, acknowledgments should also go to Alexandra Weilenmann and Emelie Dahlström, with whom I had fantastic experiences. A great thank you to Mahak Memar and Mudassar Ahmed Mughal, my friends and colleagues at SICS and MobileLife, who helped us run tests by acting as remote operators in Stockholm. I also have to mention all those anonymous young people who volunteered to participate in our test sessions. Emelie wrote a separate report about the evaluation. It was written in Swedish, but later on she helped me translate some parts of it to English for this thesis and the paper. Tack så mycket Emelie!

The project, the dissertation and all other related reports and papers received invaluable comments from other contributors to this effort. First of all I want to thank Goranka Zoric (Goga) for the fruitful discussions we had when writing the paper. She really inspired me with her knowledge and patience. This also includes Kari Gustafsson, Michael Kitzler and Per Sivborg, with whom we collaborated a lot on patenting the idea; the thesis, and particularly its technical parts, benefited greatly from their input.


A very special thanks to Fredrik Kilander, my teacher at DSV, manager of the KTH Interactive System Engineering program and the examiner of this thesis. I believe I was very lucky to meet you and have you as my examiner. You impressed me most with your kindness, patience and commitment to your job and your students. The present dissertation also received several excellent comments from you. Thank you so very much.

Acknowledgments as well to all the people at MobileLife, SICS and the Interactive Institute, for every wonderful moment we shared together.


Contents

1 Introduction . . . 1
1.1 Research Problem . . . 4
1.2 Methodology . . . 5
1.3 Contribution . . . 6
1.4 Layout . . . 6
2 Background . . . 9

2.1 Video in HCI and CSCW . . . 10

2.2 Video Production . . . 10

2.2.1 Professional Production . . . 11

2.2.2 Amateur and semi-professional practices . . . 12

2.2.3 Comparison and conclusion . . . 13

2.3 Mobile Broadcasting, More Mobility . . . 14

2.4 Related Work . . . 14

3 System overview . . . 17

3.1 Inspiration and Implication for Design . . . 17

3.2 Employed Technologies and Components . . . 18

3.3 Ideal Architecture . . . 21

3.4 Use Scenario . . . 21

3.5 First Attempts and Lessons Learnt . . . 23

3.6 Implemented Architecture . . . 25

3.6.1 Bambuser API . . . 27

3.6.2 Broadcasting with Adobe Flash Media Live Encoder . . . 27

3.6.3 Combiner Process . . . 27

3.6.4 Switch Process . . . 28

3.6.5 Vision Mixer Mobile Application . . . 29

3.6.6 Communication . . . 30


4 System Evaluation . . . 33

4.1 Method and Setting . . . 33

4.2 Study Results . . . 34

4.3 Problems Found . . . 35

4.4 Discussion . . . 36

5 Conclusion and Further Work . . . 39

5.1 Further Work . . . 39


List of Figures

1.1 SKY Sport24 news channel production control room . . . 3

1.2 A typical combination of live streams . . . 4

1.3 The Mobile Vision Mixer application prototype running on a Nokia N86 8MP phone . . . 5

3.1 FLV playback with DirectShow in GraphEdit . . . 20

3.2 Vision Mixer Architecture . . . 22

3.3 Mobile Vision Mixer in Operation . . . 22

3.4 Simplified Vision Mixer Architecture . . . 26

3.5 Mobile Vision Mixer in Operation, in a Simplified Architecture . . . 26

3.6 Combiner Flash component layout . . . 28

3.7 Abstract Model of Communication and Data Flow in Mobile Vision Mixer . . . 30

3.8 Conceptual Design of the Ideal Video Combiner and Switch Component Integration . . . 32

4.1 Evaluation in Malmö . . . 34


List of Tables

3.1 Some of the available metadata fields contained in a typical video object returned by the Bambuser "getVideos" API function


1

Introduction

This thesis reports on the Mobile Vision Mixer (MVM) system, an application prototype that can provide mobile users with a real-time collaborative environment through which they can make a live broadcast from their own footage of an event. It can be especially useful for mobile users who are interested in video practices. This work can be recognized as a significant step forward in mobile video; between July and December 2010 it gained some press interest1 and was designated an innovation, while a patent is also pending and is expected to be finalized soon2.

With features like video cameras and high-bandwidth network access integrated into recent (2010) mobile technologies, mobile users now have a firsthand ability to create social media. With this, mobile phones have moved beyond being devices for communication and passive media consumption.

This integration, by taking advantage of a particularly distinctive characteristic of mobile devices, being available everywhere and all the time, has led to the emergence and development of new services for immediate publishing and sharing of live images and video [16, 26, 27]. ComVu Pocket Caster, launched in 2005 and later renamed Livecast, is the pioneer in live mobile video publishing. In the years that followed, more services were introduced, such as Qik, Kyte, Flixwagon, Floobs, Next2Friends, Stickam, Ustream and Bambuser3; among these, Qik and Bambuser are the most widely used [22].

Employing this sort of "capture-and-share-straightaway" [23] services allows people to instantly share their captured mobile images through manageable web pages instead of using emails, paper prints and web publishing [23, 26]. Mobile phones in this way enhance the shared experience among the spectators of a live event. Moreover, in distributed events like car rallies or bicycle races, this experience becomes even more enjoyable [16, 18]. However, results from previous studies and research show that although these live mobile broadcasting services are available for individuals, allowing people only to broadcast from their mobile devices is not enough. This includes situations in which a group collaborates to create a live show, such as sporting events or live TV interviews. Accordingly, challenges remain for the designers of these services to provide their users with features that so far have been exclusively in the hands of professionals [22].

1 e.g. http://www.metro.se/2010/09/22/49027/gor-gerillatv-med-din-mobiltelefon/
2 European Patent Office, under Rule 19(3) EPC

The production of live TV shows usually takes place under time-critical conditions [19] and needs to be highly coordinated among all production team members. Moreover, events like team-based sports might be distributed over a large area or happen too fast (as in ice hockey and football) to be covered by only one camera; hence the need for real-time coordination of several cameras is keenly felt [19]. In such multi-camera settings, each camera starts filming from a position defined by the director; the corresponding video streams are simultaneously transmitted to the production control room. There, the director, having multiple views of both live and pre-recorded items on an array of monitors, can manage a suitable selection and combination of streams to provide the spectators of the final broadcast with the best viewing experience.

The main role in the production room is played by the director, or Vision Mixer (VM), who also has control over the switching operation, ProcAmps (processing amplifiers) such as brightness, contrast, hue and saturation, as well as instant replays, and handles the communication between team members [19, 25].

Video production in this sense should be considered an interactional process that demands extensive collaboration to provide the director with efficient direction capabilities [14]. The spectators, consequently, enjoy the final outcome of this collaboration, a seamless broadcast of the event from multiple angles, without ever being aware of what happens behind the scenes [19].

Figure 1.1 shows the SKY Sport24 news channel production control room. In this picture, the director and his staff together with the mixing console, video sources, preview monitors and other equipment are visible.

With the advancements in digital and mobile technology, contributions from amateurs in video production have become possible. Phones with advanced functionalities are emerging that can enhance current amateur mobile video technologies by establishing distributed real-time collaborative environments [16, 18, 25]. The area has recently gained more interest in research and a growing field of practice is visible as well; yet to enable users to experience real video work, more research and studies are expected [19, 22].


Fig. 1.1. SKY Sport24 news channel production control room4

share moments with others. However, no solution exists to address more advanced requirements of mobile video users.

It has also been argued that, with the increasing interest in this area and to support amateurs with more robust and effective collaboration on the Internet, applications like a mobile mixer need to be designed to allow viewers to collaborate in such a shared experience [16, 18, 25]. The Mobile Vision Mixer (MVM) prototype targets these needs by providing users with freedom, liveness and coordination of the task. This work mainly addresses the mobility and collaborative aspects of amateur video making.

MVM consists of a mobile application and a remote service setup for choosing and receiving a group of four live video streams from users broadcasting an event with their mobile phones. The mobile application presents a live preview of each stream to its user. The user can then select a video stream from the preview for broadcast to the Internet. This broadcast can immediately be shared publicly via social networks or a personal webpage. Figure 1.2 depicts a possible mix of streams from four different cameras in an ice hockey match.


Fig. 1.2. A typical combination of live streams

The system consists of a backend mixing and switching application, a web service and a mobile application for any mobile phone that supports the Adobe Flash Platform [6]. Bambuser is used to address the live streaming needs. The system in this sense includes four mobile camera operators streaming from their mobile phones, while the user who runs MVM on his phone (the director) sees a quadruple live preview of them (mixing) and can select one for broadcast at any moment (switching).

The system is created to allow mobile users to coordinate a live mobile video production through a multi-camera setting. Previous studies also reveal that live TV shows and sporting events become more attractive and understandable to viewers if multi-angle shots such as wide and medium, or detail and overview shots are provided [14, 18, 25]. MVM addresses this need by offering seamless switching capability. Figure 1.3 depicts a running instance of MVM.

MVM can make co-directing a live broadcast enjoyable by getting all participants to join in the production task; having a live preview of every stream provides the director with a better understanding of the collaboration. The author hopes that this work can influence the design of mobile video services.

1.1 Research Problem


Fig. 1.3. The Mobile Vision Mixer application prototype running on a Nokia N86 8MP phone

the Artifact Development type5. Chapter 2 justifies why such a system is required and what aspects were found missing in previous works; but briefly speaking, bringing extreme mobility to video producers and providing mobile video users with a collaborative tool, all together in one mobile system, is the main research problem of this thesis.

1.2 Methodology

Considering the nature of the problem, namely that I was supposed to create a functional prototype in a reasonable time (February to May 2010), it was realized that the traditional methods of user-centered design, which expect HCI designers to begin by observing the target group, defining their requirements and designing for them, might be too time-consuming. The alternative, then, was to put more effort into designing and implementing the prototype; the main thrust of this thesis can be divided into the following steps:

1. Pre-studies

2. Design and implementation of the prototype
3. System evaluation

The first step consisted of a literature review, checking similar works, getting familiarized with related technologies and trying out possible solutions for building the prototype.

5Thesis Information, Version 4, May 3, 2010. available at http://dsv.su.se, last visited:


The next step focused on the actual implementation of the system, which is described in detail in chapter 3.

System evaluation, as the final phase, was expected to provide essential feedback about users' experience with the system. To this end, user studies were conducted and the test sessions were video recorded for further analysis. The study method as well as the results found are presented in chapter 4.

1.3 Contribution

The effort was done as part of my master thesis final project at the MobileLife research centre6 in Stockholm, Kista, between January and July 2010. I was mainly responsible for designing and developing the entire MVM system. The evaluation of the project, however, was not in the scope of my task when the project started; later on it was decided upon and carried out under the supervision of the same working group, morevideo!7, at MobileLife.

Needless to say, the main contribution of the thesis is the MVM prototype itself, a mobile system that lets its users co-produce live video. The successful launch of the system showed the possibility of similar and more sophisticated systems. While working, my colleagues and I also learned a lot, particularly from the user studies, seeing people in collaboration and figuring out other related aspects.

1.4 Layout

After presenting an introduction to the main idea of the work in this chapter, the rest of the dissertation is outlined as follows:

The discourse continues in chapter 2, which gives a background of the topic, mentioning the importance of video in HCI8 and CSCW9, followed by an overview of professional and amateur video production interests and practices. Mobile broadcasting services as well as some examples of previous work are also presented in the same chapter.

Chapter 3 then gives an overview of the design and implementation of the system. Technical aspects are discussed in detail by presenting the architecture of the system, how it is implemented, the use scenario and the system components. Expected technical improvements are also briefly presented.

6 http://mobilelifecentre.org/

7 http://mobilelifecentre.org/project/show/3
8 Human Computer Interaction
9 Computer Supported Cooperative Work


The evaluation process, covering the two conducted user studies, is briefly discussed in chapter 4. The results of the evaluation as well as the problems found with MVM are also described in that chapter.


2

Background

When it comes to video production in CSCW, video as a social medium gains multiplied interest from researchers, as it can be considered both a mediated channel for communication and a topic of concern [25]. Technologies have been developed to provide a certain amount of collaboration between members of a video production team; they are usually supplied along with fixed installations like TV studios, production rooms or custom-fitted buses, which provide the production team with some degree of fluidity. The concept of mobility, however, has been widely ignored in CSCW [24], while with recent advancements in mobile technologies more research effort on live interaction with visual content is expected. This movement includes developments and improvements in the processing power of mobile phones and the spread of high-bandwidth mobile Internet networks like 3G and, recently (2010), 4G; at the same time, high-speed network access and storage costs are decreasing [19].

On the consumer side, the situation is even worse; there are actually no real offerings that provide ordinary people with collaborative video production, even though several products have been developed that allow some sort of fluidity between individuals on fixed devices, which limits users to that domain [24, 25]. These issues are discussed briefly in the following subsections.


2.1 Video in HCI and CSCW

The topic of video has always attracted vast and growing interest in HCI, spanning different concerns such as production, live and non-live media, streaming, user-generated content, and video as a means to support collaboration [19, 22, 25]. Perry et al. [25] show how these topics come together in the practice of TV production, although they have usually been considered separately. They also describe how collaborative work is supported and how the coordination of multiple people takes place around and through the video material.

As mentioned in chapter 1, live TV can be recognized as an interactional process. To manage the production in a meaningful way, an enormous amount of coordination and collaboration between the actors is needed; here, the turn toward user involvement has resulted in a broader focus on CSCW by allowing non-professionals to collaborate on and coordinate the production of live media, such as broadcasting an event as it happens. With the advancements in multimedia computing, high-bandwidth networks and mobile video-enabled devices, the issue has gained even more interest and has also been an incentive for designers to support amateurs' contribution to video production [19].

The relationship between video and interaction has also been a longstanding topic of concern in HCI for the past 20 years [19]. The main goal is to help both professionals and amateurs not only create their own video data but also access this complex data in the best possible ways, with the most appropriate techniques such as browsing, editing and summarization [19].

2.2 Video Production


In the traditional way of filming, cameras are brought to events, and scenes are recorded and shared with a selected audience later. However, with the emergence of camera-enabled mobile phones, more spontaneity in filming is visible. This spontaneity can be seen both in video capturing and in sharing the created media later, through Bluetooth, email or by playing it back straight on the device screen. The presence of mobile devices everywhere, whenever and for whatever has thus affected the traditional ways of video production [16, 18].

A more professional setting of video production consists of multi-camera work and vision mixing, as in live sports broadcasts, TV interviews and studio productions. In a typical configuration with five cameras, two produce close-up shots of the guests, two others are positioned behind the participants, taking close-ups of the interviewers, while the fifth camera is responsible for wide and spontaneous pictures [14]. The end product of this effort is a series of shots produced around a carefully coordinated collaboration among team members [14, 19]. The production cycle, together with the interests and methods of both professionals and amateurs, is discussed in more detail below.

2.2.1 Professional Production

Professional video production is a process that demands a high volume of collaboration among the team members, including the camera operators, the director (or Vision Mixer) and their equipment. In the case of a live sports event or a live TV show, this setting might also include instant replay operators, commentators and interviewers. These two cases will be taken as models below to describe some aspects of the professional production process. The final outcome of a production team's collective work is a series of shots delivered to the audience. In such collaborative work, the media itself is a means of collaboration which organizes the process [25].

Depending on the event to broadcast, its distribution, length, pace and other factors, the best shots need to be selected by the director to present to remote viewers. This selection gives the audience an appropriate understanding of the event by letting them feel both its progress and its liveness.


To sum up, camera operators present and suggest their work through filming, while they receive feedback on the selection from the Vision Mixer (VM), indicated by the tally lamp. Other means of communication, like an audio link or gestures in cases where verbal communication is not allowed, may also be available [25].

The main process of direction and production takes place in the production control room, wherein the vision mixer and others orient toward the broadcast. The main role in this studio is played by the vision mixer, who continuously identifies the potentially selectable feeds from different angles and previously recorded shots and manages a selection for the broadcast. In live sports production, the vision mixer also cooperates with the instant replay operator to replay key sequences immediately as they occur in the game.

The setup of the production control room also includes an image gallery displaying all sources together with preview monitors. A backchannel is likewise provided for giving instructions to others both inside and outside the studio. The outcome of the co-development in this room is a balanced and dynamic assembly of images covering the action from various viewpoints, providing the remote viewers with a continuously interesting experience of the action [19, 25].

This intersection of the interactional process of video production and emerging new technologies has been a significant motivation for recent research interest in video production and consumption.

2.2.2 Amateur and semi-professional practices

The chief difference between professional and amateur video work is in how the cameras are used: first in the types of cameras and second in their setup configuration. Kirk et al. [23], in their investigation of nonprofessional home video makers, mention how teenagers (the focus group of their study) experience the practice of video production. Most of the teenagers they spoke to preferred to use their mobile phones rather than video cameras, as they found no benefit in buying cameras.

The spontaneity of the action and the level of involvement in it were also found to be two other topics of concern. As mentioned in section 2.2, this spontaneity is also visible in capturing and sharing the media. As for the level of engagement, users do not want to be overly absorbed in the filming; they prefer to actively participate in the event as well. This also explains why video cameras are not carried constantly [23]. Camera phones, on the other hand, can be brought everywhere, all the time. They can be easily carried and, while in use, provide their users with a high degree of mobility. However, to collaborate around the topic of interest, the positions of the filming devices need to be negotiated between the director and the camera operators [25].


are doing. To do this, a video backchannel could be provided for example to enable the team to communicate over and through different modalities like text and speech interfaces.

As a result, by using mobile phones as video cameras and letting people actively engage in producing footage and consuming it at the same time, the amateur video production experience could be drastically improved [18, 23]; yet little attention has so far been paid to designing services or adapting modern technologies to address these needs [25, 27].

2.2.3 Comparison and conclusion

The difference between amateur and professional video practices lies not only in what equipment and devices are used, but also in how they are utilized in the process. Professional video production starts with a set of video cameras brought to the location and might also include a control unit or a central operation place (e.g. a custom-fitted bus), while amateurs, on the other hand, might find no interest in following this setup. In the latter case, camera-enabled mobile phones can be an alternative to video cameras that also provide their users with a greater degree of fluidity and mobility.

An extensive degree of spontaneity has also been seen in mobile video making and consumption, whereas with professionals the recorded videos are usually edited and shared with the viewers afterward. This is because mobile-created materials are rarely planned in advance and are often used to "enhance the moment" [23]. In this manner, post-production work such as editing seems unnecessary to their producers [22, 23].

In a multi-camera setting, the communication between team members is of immense importance. Methods like proposal-acceptance are used to increase awareness in the whole team; tally lights as well as body gestures are other means of communication. However, there are no current technologies that provide nonprofessionals with collaborative mobile video production, and challenges remain for the designers of these services to support amateur collaborative video practices. Allowing people only to broadcast from their phones does not seem to be enough [22].


2.3 Mobile Broadcasting, More Mobility

Juhlin et al. [22] define mobile broadcasting, its characteristics and features as follows. Mobile broadcasting services are new enhancements on mobile phones that allow users to capture and broadcast live video from their devices to web interfaces on the Internet in real time. The web application lets people browse these live and archived feeds and interact with their producers. In this sense, mobile broadcasters, by grabbing their phones, can share their moments through the Internet with an unlimited audience. Mobile webcasting services typically provide the following features:

• Immediate sharing of live video from mobile phones to websites
• Archiving of the live videos for later viewing
• Distribution of the live feeds via social networks, email or embedding in other web pages
• Title and GPS location descriptions
• Live chat and online commenting

Mobile broadcasting is similar to mobile video conferencing systems and webcam live video chats in that all provide immediate sharing of the captured video [22]. It differs, on the other hand, in that the cameras are wireless and allow users to capture from anywhere within mobile network coverage, targeting thousands of online viewers [22].

Taking advantage of the most remarkable characteristic of mobile camera phones, their ubiquity and being always present and reachable, users can also benefit from a combination of capture and immediate remote sharing of images that has not been possible before with traditional or digital cameras [22, 26].

Broadcasting from a handheld has its own restrictions, such as small screens and limited interaction. Mobile streaming users also lack a key feature, editing; but it is still evident that mobile broadcasting is a growing medium with a growing number of users [22].

2.4 Related Work


visitor-generated video, or parents who want to broadcast live images of their children in events like football matches. IBS is a considerable step forward for nonprofessional collaborative video production, although it lacks full user mobility since it limits the vision mixer to a stationary computer. Successful aspects of IBS are carried over into MVM; yet its mobility was to be extended by allowing the vision mixer to run on a mobile phone.

InstantShareCam is another example in this area [27]. InstantShareCam is a service concept that targets ordinary citizens holding video cameras. It operates on wirelessly networked cameras and allows its users to simultaneously collaborate to capture, edit and view the coverage of an event in real time. InstantShareCam, however, relies on wireless networks and personal video cameras, restricting its degree of mobility and target population respectively.

Bambuser also presents a web application that gives its users the ability to collaborate and co-produce their footage in a multi-camera setting. It is an online event manager which allows users to get together through the web interface and add their cameras to each other's events. The user who has set up the event then holds control over the vision mixer and at any moment can select one camera for the broadcast. Those watching the event through the Bambuser website are presented with a mixed stream consisting of videos from all participants.

With MVM, an innovative mobile system has been set out for nonprofessionals who are interested in collaborative video practices. Even though the current work only provides basic mixing and switching functions, it will hopefully become a strong foundation for further mobile multimedia systems development.


3

System overview

The Mobile Vision Mixer (MVM) prototype has been implemented to allow a group of people to collaboratively film and coordinate the broadcast of an event. The current work enables collaboration between four camera operators, who live-stream their desired scenes to Bambuser using their phones, and a director, who holds the Vision Mixer (VM) application on his phone and can switch to any of them for the final broadcast at any moment. The system has been implemented in a two-tier architecture with a web service in between to connect the tiers and pass messages. However, due to the project purpose and the time limit, its scope had to be narrowed down in some ways, as explained in section 3.5. The following subsections explain all the design and implementation issues relating to the latest version of MVM at the time of writing this dissertation.

3.1 Inspiration and Implication for Design

Perry et al. [25] have shown that there is no real technology to support amateurs' collaborative video production. Their research has also made clear how distinctly the need is felt to design consumer technologies assisting inexperienced users with collaborative live video production [25]. The present work was inspired by these facts; the main goal has been to develop an approach that lets people collaboratively film and broadcast an event. In this, several issues needed to be considered.

First, Juhlin et al. [22] argue that only providing users with live video streaming from their mobile phones is not enough, nor has mobile webcasting fulfilled the potential expected by researchers. This has led them to claim that:

”There remains a challenge for the designers of these services to develop the concept in order to support people’s appropriation and thereby democratize a medium which up to now has been entirely in the hands of well-trained professional TV-producers.”


this, Bambuser was chosen as the live broadcasting service. It is easy to install, learn and use, and it is also one of the most popular mobile live broadcasting services [22]. Its low latency was another reason for the choice.

Finally, the most challenging issue has been the live preview of what all the participants are filming. It has been argued that in cases like VJing, static thumbnails of prerecorded materials can be enough for basic recognition, yet that would never be sufficient for live video production [16]. The later evaluations and comparisons also confirmed this [15]; it was likewise expected from the beginning that a live preview would give a better impression. These three considerations, as well as the team's previous experience, resulted in the design of a mobile mixer with a quadruple preview window of Bambuser live broadcasts. The mixer also allows its user to seamlessly switch between these streams for the final broadcast.

3.2 Employed Technologies and Components

The following services, technologies and components are used in MVM:

1. Bambuser: Bambuser is described by its founders as follows [1]. It is a live streaming online service which lets its users broadcast instantly from their mobile phones or desktops and share the streams right away with followers all over the net. Bambuser can be easily integrated with social networks like Facebook, Twitter and Myspace, and also with users' blogs and websites.

In the current work, Bambuser is used for the individuals' broadcasts; it is also utilized to stream the combined video and to handle the switching operation. Here, we benefit from its capability of receiving broadcasts from a desktop application, Adobe Flash Media Live Encoder in this case.

Bambuser also provides its users with an API for fetching complementary data about any activity, like the date, time, title, size and location of the broadcast; if eligible, a direct URL pointing to a location on their storage which holds the actual FLV file of the broadcast is also returned.


3. SWF: According to [11], SWF, or Small Web Format, is a container for multimedia content originally developed by FutureWave Software, transferred to Macromedia and now owned by Adobe. SWF is intended to contain animation, applets, ActionScript code and FLV videos. Its small size enables SWF components to be easily published to the web. FLV files embedded in SWF containers can be watched on most operating systems and a variety of mobile phones.

4. Adobe Flash: Adobe Flash is a multimedia platform created by Macromedia and later transferred to Adobe. Flash is typically used to create rich Internet applications by adding animation, visual content, multimedia and interactivity. Flash content can be displayed on most operating systems, some electronic devices and mobile phones [6].

5. Adobe Flash Lite: Adobe Flash Lite, the lightweight version of Adobe Flash as its name implies, is the Flash platform for mobile phones and portable electronic devices. It enables its users to view multimedia content on their devices. Flash Lite supports ActionScript, which brings some degree of interaction to its users. The latest version of Flash Lite, 3.0, is based on Flash and also supports the H.264, On2 VP6 and Sorenson video codecs, which means it can play FLV videos [12].

6. ActionScript: ActionScript is a scripting language primarily developed by Macromedia to run and control simple animations, but later owned by Adobe, targeting Adobe Flash platforms on web pages and portable devices in the form of embedded SWF files [5]. With the release of recent versions of Adobe Flash, ActionScript is now an object-oriented programming language that provides more interaction and features like video playback and control (since ActionScript 3) [5].

7. Microsoft DirectShow: The Microsoft DirectShow application programming interface (API) is the Microsoft Windows media streaming platform, which provides high-quality capture, editing and playback of multimedia streams. DirectShow supports a variety of multimedia formats like Advanced Systems Format (ASF), Motion Picture Experts Group (MPEG), Audio-Video Interleaved (AVI) and MPEG Audio Layer-3 (MP3). By providing access to the underlying stream control architecture, DirectShow can be extended to support new formats such as FLV [4]. A common DirectShow application is based on a set of connected components (filters) for receiving multimedia content as well as for parsing, presenting, changing and streaming it. Filters are embedded and connected to each other in a container called a graph. The graph is in charge of running and controlling the media stream through the filters [4].


find and see their desired filters as building blocks and build up a graph using them with drag and drop [4].

Figure 3.1 represents a simple DirectShow graph for FLV video file playback in GraphEdit; a minimal programmatic sketch of the same idea is given after this list.

Fig. 3.1. FLV playback with DirectShow in GraphEdit

8. MediaLooks Flash DirectShow Source Filter: Adobe Flash is not natively supported in Microsoft DirectShow; however, with the MediaLooks Flash DirectShow Source Filter, Flash media (.swf and .flv) can be played back in DirectShow graphs via the native Flash runtime [9].

9. Adobe Flash Media Live Encoder: According to [13], Adobe Flash Media Live Encoder is described as follows. This is an application for capturing, encoding and streaming multimedia content to Adobe Flash Media Server or the Adobe Flash Video Streaming Service. Through its interface, Flash Media Live Encoder provides its user with audio and video capture and live streaming features. It is also possible to run it from the command line.

10. MediaLooks MultiGraph Sink/Source: This is a set of DirectShow filters developed by MediaLooks to allow the transfer of multimedia content between different DirectShow graphs running on either the same or different threads and processes [10].

11. Web Service: Web services are described as follows [2]. Web services are software services executing on remote hosts, accessible through the HTTP protocol. One type of web service, called XML Web Services, uses XML standards for its data structure and transfer. XML web services are becoming a platform for distributed application integration over the Internet.


12. Microsoft C#: Pronounced "C sharp", this is a modern object-oriented language that also supports component-oriented programming. It is developed by Microsoft as part of the .NET framework. C# has its roots in the family of C programming languages, yet many similarities to the Java programming language can be seen [3].

13. Microsoft Internet Information Services (IIS): Formerly called Internet Information Server, this is a web server application with a number of feature extensions, developed by Microsoft for use with Windows systems [20].
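As a concrete counterpart to the GraphEdit example of Figure 3.1, the sketch below shows how a similar playback graph could be assembled and run from C#. It is a minimal sketch, not part of the MVM code: it assumes the third-party DirectShowLib .NET wrapper is referenced and that an FLV-capable source/decoder filter (such as the MediaLooks filter above) is installed; the file path is hypothetical.

```csharp
// Hedged sketch: building and running a DirectShow playback graph from C#,
// assuming the DirectShowLib wrapper is available. RenderFile asks DirectShow
// to assemble the filter chain (source -> decoder -> renderer) automatically,
// much like dragging and connecting filters in GraphEdit.
using System;
using DirectShowLib;

class GraphPlaybackSketch
{
    static void Main()
    {
        // The filter graph manager hosts and connects the filters.
        IGraphBuilder graph = (IGraphBuilder)new FilterGraph();

        // Let DirectShow pick suitable filters for the file; an FLV source
        // filter must be installed for this to succeed. Path is illustrative.
        int hr = graph.RenderFile(@"C:\samples\clip.flv", null);
        DsError.ThrowExceptionForHR(hr);

        // Run the graph; media flows from the source filter to the renderer.
        IMediaControl control = (IMediaControl)graph;
        control.Run();

        Console.WriteLine("Playing... press Enter to stop.");
        Console.ReadLine();
        control.Stop();
    }
}
```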

3.3 Ideal Architecture

MVM is a setup of different services and applications consisting of Bambuser as the live streaming service provider, a group of four video-enabled mobile phones streaming to Bambuser, the mixer machine and the VM mobile application. All mobile phones are connected to the mobile Internet through 3G, while the mixer machine uses its own Internet connection to communicate with Bambuser. A backchannel is also envisioned between the camera operators and the VM; however, due to the project scope, the backchannel is not implemented. Figure 3.2 represents this model, in which the communication between the different parts of the system can also be seen.

3.4 Use Scenario

This section describes how people are envisioned using the system. An example scenario in which MVM could be utilized is as follows:

A group of five people, all interested in making home video, are attending an ice hockey match. During the game, they decide to use their mobile phones to broadcast their own views of the game live over the available 3G mobile network. They choose one among themselves to be the director while the other four hold mobile cameras. They start filming and moving around in the spectator area, each trying to find the best view. They have planned the following camera setup:

• One camera providing overview shots,
• two for details, one on each side of the rink,
• and one camera aiming at the spectators and the bench coach.

During the match, the director is able to see live previews of every camera and can at any moment seamlessly cut to the one he likes best. The video created in this way is the coverage of the event through a spontaneous collaboration; it is publicly visible on the Bambuser website and is also recorded there. Figure 3.3 represents this scenario in operation.


Fig. 3.2. Vision Mixer Architecture


1. The camera operators (ML1, ML2, ML3 and ML4) start filming and broadcasting to Bambuser.

2. The VM application enables the director to view Bambuser video feeds.

3. The VM application also enables the director to select a group of video streams to be combined for a preview on its display.

4. Upon selection of the group of video feeds, the VM application sends a request to the mixer machine. The request includes information associated with the group of video streams to be combined.

5. The VM application also notifies the selected camera operator (corresponding to the group of video streams) to stay on shot.

6. Upon receipt of the request from the VM application, the Mixer Machine requests and fetches the selected group of video streams from Bambuser.

7. The combiner process then combines the group of video streams to create a combined video signal (a quadruple view).

8. Subsequently, the stitched stream is broadcast back to Bambuser (under the MLMixer username). At the same time, another video signal, which represents the final output, is broadcast (under the MLDirector username).

9. The switching operation affects the MLDirector output.

10. The Mixer Machine also sends a notification for the availability of the combined stream to the VM application hosted in the mobile device.

11. The VM application then fetches the combined stream from Bambuser and starts displaying it to the user.

12. The VM user (the director) chooses a video stream from the previewed combined video signal on the display for broadcast.

13. The VM application sends a notification to the mixer machine with details regarding the selected video stream.

14. Concurrently, VM application also notifies the camera operators corresponding to the selected video stream about the on air status of the video stream.

15. On receipt of the notification, the Mixer Machine with its switching process switches the final output stream to the selected video from the combined video signal.

3.5 First Attempts and Lessons Learnt

After investigating different possibilities for implementing the Mixer Machine, the following five solutions were proposed for initial implementation and feasibility testing:


3. Flash Mixer + Flash Switch application

4. Flash Lite mobile mixer + Flash Switch application

5. Switch on the player side + any kind of mobile or desktop switch application

In all these five, the Flash Source Filter was supposed to be used as the FLV source player in DirectShow, together with Adobe Flash Media Live Encoder as the final broadcasting tool. Also, two video mixer filter objects were available, the MediaLooks Vision Mixer Filter and the Rogue Stream 3D Video Mixer Filter. These filters are capable of real-time mixing of several video streams from a variety of sources; in the case of MVM, they were intended to be used in conjunction with a group of Flash Source Filters to perform basic video mixing. Finally, using DirectShow graphs at the intermediate level for connecting Flash components to Adobe Flash Media Live Encoder seemed inevitable.

It is also worth mentioning that the pre-studies and some basic tests showed that the best way to bring Bambuser live videos into the programming context (DirectShow graphs) is by using the MediaLooks Flash Source Filter. Later tests confirmed this model, yet it was an innovative idea and had never been tested before. Among the mentioned solutions, the first two were recognized as infeasible due to difficulties between Flash and DirectShow such as:

1. Threading/process problem: One strange behavior witnessed when using the Flash Source Filter in DirectShow was that using more than one instance of that object in a graph is utterly impossible. This means that to bring four different video streams into one graph at the same time, at least four separate DirectShow graphs are needed, which in turn ends up with four different processes. Running several video-processing processes is extremely resource-consuming, and personal computers are generally not capable of this.

2. Latency: An increasing delay between the live stream and the graph output was seen after running the mixer graph for a few minutes.

3. Incompatibility issues: The incompatibilities between Flash and DirectShow were resolved using the MediaLooks Flash Source Filter working as a bridge to connect the two platforms. However, for a highly resource-consuming process like live video mixing with the previously mentioned video mixer filters, instabilities, strange behaviors and unpredictable performance were observed.

4. Sound: Another observed problem with the Flash Source Filter integration into DirectShow was its inability to play the live interleaved audio stream. MediaLooks later reported that this is likely to happen if the interleaved stream has no audio data on its first frame. The issue was solved by playing a soundless audio stream on the first frame of the Flash object.


The fourth solution (Flash Lite mobile mixer + Flash Switch application) was also rejected. Although Flash Lite objects are capable of playing video streams, bringing four live FLV streams into one context at the same time is impossible due to mobile network bandwidth limits2.

The third solution (Flash Mixer + Flash Switch application), however, seemed to be feasible. Thanks to the Adobe Flash platform's internal support for live FLV streams and playback over the RTMP and HTTP protocols, playing multiple video streams from live sources is possible. Seamless switching between different video streams is not technically a problem either. Communication between the Flash objects (combiner and switch), DirectShow and Adobe Flash Media Live Encoder is managed with the help of the Flash Source Filter; section 3.6 describes this in more detail. The idea of the fifth solution (switch on the player side + any kind of mobile or desktop switch) was taken from Adobe's article on building live video switchers [7]. This solution was also refused later, since it is entirely based on running server-side ActionScript code on a media server, which demands running a separate server application. However, sending commands to registered watching clients to switch to another video stream is a novel idea.

3.6 Implemented Architecture

To meet the timeline of the project, its scope had to be narrowed down. The most notable outcome of this reduction is the removal of the backchannel. Figure 3.4 represents the new architecture, in which this change is visible.

Considering this simplified model, and assuming that the combiner and switch processes are already running and controlled by an operator, the system operates as follows:

1. Camera operators start filming and broadcasting to Bambuser.

2. The Mixer Server requests and fetches selected streams from Bambuser.

3. The Mixer Server runs an instance of the Video Combining Process, providing it with the specified selection of streams. The related information about the selection is written to the XML input file by the Initiator Application. An instance of the Stream Switching Process is also executed to manage the switching operations afterward.

4. Once the combined video stream is ready, it is broadcast to Bambuser.
5. The director runs the Vision Mixer application on his phone; it requests the combined stream from Bambuser and starts displaying it.

6. The director via the application can select any of the four combined streams for broadcast.

2Perhaps by the emergence of 4G network and mobile devices in near future (4G is already


Fig. 3.4. Simplified Vision Mixer Architecture

7. A request is sent to the Mixer Server to switch to the desired stream.
8. The Mixer Server switches to the requested stream. This is done through the Stream Switching Process, and the result is visible on Bambuser.

Fig. 3.5. Mobile Vision Mixer in Operation, in a Simplified Architecture


3.6.1 Bambuser API

Bambuser provides its users with a set of API functions to access parts of their database through HTTP calls. The API includes two main functions, getVideo and getVideos; the latter is used both in the implemented VM mobile application and by the Mixer Machine to retrieve information about the recent activities of MLMixer and of ML1, ML2, ML3 and ML4.

The "getVideos" function returns a list of results containing the following fields:

vid       Integer   Id that uniquely identifies the Bambuser video.
title     String    The broadcast title, set in the broadcast device before starting the broadcast.
type      String    Available values: "live", "archived".
username  String    Bambuser username of the broadcasting user.
created   Integer   Unix timestamp for the video.

Table 3.1. Some of the available metadata fields contained in a typical video object returned by the Bambuser "getVideos" API function

Depending on the caller’s permissions level, it might also return another String value, url, which points to a location on the Bambuser server to reach the broadcast through HTTP or RTMP.
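As an illustration of how a client could consume such an API, the following C# sketch fetches the list of videos over HTTP and reads out a few of the fields from Table 3.1. This is a hedged sketch only: the endpoint URL, the query parameter names and the XML element layout are assumptions made for the example, and the real Bambuser API calls used in MVM may differ.

```csharp
// Hedged sketch: calling a getVideos-style HTTP endpoint and parsing the result.
// The URL, parameters and XML schema below are illustrative assumptions.
using System;
using System.Net;
using System.Xml;

class GetVideosSketch
{
    static void Main()
    {
        // Hypothetical endpoint; the real API location and key handling may differ.
        string url = "http://api.example.com/getVideos?api_key=YOUR_KEY&username=ML1&type=live";

        using (var client = new WebClient())
        {
            string xml = client.DownloadString(url);

            var doc = new XmlDocument();
            doc.LoadXml(xml);

            // Assume one <video> element per broadcast carrying the fields of Table 3.1.
            foreach (XmlNode video in doc.SelectNodes("//video"))
            {
                string vid = video.SelectSingleNode("vid").InnerText;
                string title = video.SelectSingleNode("title").InnerText;
                XmlNode urlNode = video.SelectSingleNode("url"); // present only if permitted
                Console.WriteLine("{0}: {1} ({2})", vid, title,
                                  urlNode != null ? urlNode.InnerText : "no url");
            }
        }
    }
}
```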

3.6.2 Broadcasting with Adobe Flash Media Live Encoder

It was mentioned in section 3.2 how Adobe Flash Media Live Encoder (FME) can be used for live media streaming. Through its interface, FME can be configured to use audio and video capture devices as input sources. Since the MediaLooks Source Filter, paired with the Sink Filter, is designed to operate as a virtual standard capture device, it can be identified and used by FME; in this way, and with the help of a DirectShow graph, broadcasting to Bambuser from the Combiner and Switch components is made possible. For security reasons, broadcasting to Bambuser from applications other than their own mobile service is limited; however, with authentication profiles created specifically for each user, FME instances can be configured for streaming to Bambuser as well.

3.6.3 Combiner Process

The video combiner process fetches the current broadcasts of the Bambuser ML1, ML2, ML3 and ML4 users, stitches them together in a cross view and broadcasts the result back to Bambuser under the MLMixer username. As already mentioned, fetching from Bambuser is made possible through the provided API function, "getVideos".


The "getVideos" output, which contains all the necessary information including a location on the Bambuser server, is written to an XML text file, bam.xml. The initialization of the system, including this function call and the preparation of the bam.xml file, is done by the operator through a simple application written in C# which is also provided.
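As a rough illustration of this initialization step, the sketch below writes four camera entries to bam.xml. It is not the actual Initiator Application: the XML schema (element and attribute names) and the stream values are assumptions, since the thesis only states that bam.xml carries the information about the selected streams.

```csharp
// Hedged sketch of an Initiator-style step: persist the selected camera streams
// to bam.xml for the Combiner Flash component to read at startup.
// The element names and URLs are illustrative assumptions.
using System.Linq;
using System.Xml.Linq;

class InitiatorSketch
{
    static void Main()
    {
        // In the real system these values would come from the getVideos call.
        var cameras = new[]
        {
            new { Channel = 1, Username = "ML1", Url = "rtmp://example/stream1", Title = "Overview" },
            new { Channel = 2, Username = "ML2", Url = "rtmp://example/stream2", Title = "Detail left" },
            new { Channel = 3, Username = "ML3", Url = "rtmp://example/stream3", Title = "Detail right" },
            new { Channel = 4, Username = "ML4", Url = "rtmp://example/stream4", Title = "Spectators" }
        };

        var doc = new XDocument(
            new XElement("cameras",
                cameras.Select(c =>
                    new XElement("camera",
                        new XAttribute("channel", c.Channel),
                        new XElement("username", c.Username),
                        new XElement("url", c.Url),
                        new XElement("title", c.Title)))));

        doc.Save("bam.xml"); // read by the Combiner SWF when it starts
    }
}
```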

The actual combiner component, however, is implemented as a Flash SWF file which has four video player objects laid out on its canvas, each locating a video stream on Bambuser, as well as a group of labels providing complementary information to the user. The Flash object is illustrated in figure 3.6, in which its quadruple layout and the camera index and timestamp labels are visible.

Fig. 3.6. Combiner Flash component layout

In operation, this Flash component starts by reading the bam.xml file. The related information for each broadcast is retrieved from this file and assigned to the associated player objects. The complementary information, including the time, title and username, is also shown. The automatic playback of every stream starts afterward.

For broadcasting with FME, a DirectShow graph is used as a link in which a Flash Source Filter is connected to a Sink Filter. Here, the Flash Source Filter, by reading from the Combiner SWF file, streams its content down to a shared memory area allocated by the Sink Filter. FME simultaneously reads from this shared area using a paired instance of the MediaLooks Source Filter. Both the DirectShow graph and the FME instance need to be started by the operator.

3.6.4 Switch Process


The switch component is likewise implemented as a Flash object, in which four video player instances are put on top of each other. Each stream in this arrangement corresponds to a different channel. They are all started automatically with their sound off. Whenever a switching request is received, the corresponding player is brought to the front, covering all other instances. In this way, the selected player is heard while the others are muted.

In this implementation, the Flash object reads the content of a text file, broadcast.txt, every second. The retrieved value is expected to be a number between 1 and 4, representing the selected channel.

It will be shown later how this file is used by an implemented web service, accessible from the mobile application, to allow mobile users to switch between video streams while watching a live simultaneous preview of the combined video.

Another instance of FME, in conjunction with a DirectShow graph, is used for broadcasting the output to Bambuser; this instance is configured to work under the MLDirector credentials.

3.6.5 Vision Mixer Mobile Application

The Vision Mixer mobile application is created to allow a director to coordinate the production task. It is implemented as an Adobe Flash Lite object, which is suitable for ordinary mobile interactive applications. This object can run on any mobile device with Flash support, concurrently with all other applications and without interrupting them. However, just before the VM application starts, the desktop mixer needs to be run by an operator, as mentioned previously.

The application is configured to retrieve and play back the current Bambuser broadcast of the MLMixer user via a call to the "getVideos" function. This Flash Lite object also lets the user use keys 1 to 4 to switch between the video channels.

The application waits for input from the user (i.e. the director) on the mobile device to select a channel. The area of the selected channel is demarcated by a red rectangle to separate it from the rest of the display. A red "On Air" label also lights up to show that the chosen video is selected for broadcast. At the same time, through a call to the web service's selectChannel method, the corresponding channel number (1 to 4) is sent to the Mixer Machine to be read by the Switch component.
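To make the switching path concrete, the sketch below shows one way a selectChannel web method could be realized on the Mixer Machine: it simply persists the chosen channel number to the text file that the Switch Flash component polls every second. Only the selectChannel call and the polled text file are taken from the thesis; the class layout, namespace and file path are assumptions.

```csharp
// Hedged sketch of a selectChannel web method (ASP.NET ASMX style) writing the
// channel number to the file polled by the Switch component. Path and namespace
// are illustrative assumptions, not the actual MVM implementation.
using System.IO;
using System.Web.Services;

[WebService(Namespace = "http://example.org/mvm/")]
public class MixerService : WebService
{
    // File read once per second by the Switch Flash object (assumed location).
    private const string ChannelFile = @"C:\MVM\broadcast.txt";

    [WebMethod]
    public string selectChannel(int channel)
    {
        // Only channels 1-4 are valid, one per camera operator.
        if (channel < 1 || channel > 4)
            return "error: channel must be between 1 and 4";

        File.WriteAllText(ChannelFile, channel.ToString());
        return "ok";
    }
}
```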


3.6.6 Communication

An abstract schema of the communication model of the entire system is illustrated in figure 3.7. As seen in that figure, Bambuser lies in the middle of the architecture as the media provider; with the mobile application on one side and the desktop mixing machine on the other, they are connected through the 3G mobile network and the Internet over HTTP respectively. The VM mobile application also accesses the Mixer Machine to send channel-switching requests through web service method calls over 3G. The four mobile devices broadcasting to Bambuser are not depicted in the figure.

Fig. 3.7. Abstract Model of Communication and Data Flow in Mobile Vision Mixer

On the desktop side, the inter-process communications take place via text files (as with bam.xml for initialization of the system and channel.txt for switching) or via DirectShow graphs hosting instances of MediaLooks Sink Filter.

3.7 Further Technical Improvements

Although the current prototype works well enough to meet its primary purposes, further improvements and changes are expected for a real product. The most prominent ones are as follows:


2. The desktop application that handles the combining and switching operations needs to be started by a remote operator. The web service can be extended to support an array of method calls, for example to start the combining and switching processes automatically when the proper commands are received.

3. The current mobile application needs further development to provide a better GUI and better interaction to its user. The already created Adobe Flash Lite application can be extended for this purpose; however, mobile application development kits, frameworks and SDKs such as J2ME or the Android SDK can also be considered, as they support programming with device-specific functions such as the touch screen and soft keys.

4. No security checks, authentication or login procedures are performed in the desktop application, the web service or the mobile side. In a real collaborative environment, users are supposed to log in to the system and be known and authenticated both to the applications and to the other collaborators.

5. Enough options should also be given to the director to make his or her own selection of Bambuser feeds. The current application is restricted to videos made by pre-defined users (ML1, ML2, ML3 and ML4).

6. As already mentioned, the result of the switching and the combined stream are both broadcast to Bambuser and are publicly visible. This is unnecessary, and policies should be applied to make these feeds private.

7. DirectShow filters can be developed to play live FLV streams directly from RTMP or HTTP media servers. With this, embedding SWF files in DirectShow graphs could be replaced by direct access to the Bambuser live streams. The overall control and the possibilities for video manipulation in DirectShow would also be extended; utilizing editing functions, effects and transitions would become possible as well.


4 System Evaluation

To evaluate the system, two studies were conducted. The method used to evaluate the system, as well as the study results, are described in this chapter.

Evaluating the system was not within the scope of the author's work, although performing some basic tests was intended. However, due to the successful launch of the first version of the system, and realizing the potential for continuation, a user study was designed and arranged by my colleagues at MobileLife. I was then asked to join them only as technical staff, setting up the mobile devices, controlling the remote Mixer Machine and helping the test subjects learn how to use Bambuser live streaming as well as the VM application.

A separate report on the design of the user studies, their purpose and their relationship to the topic has already been written [15]; some relevant facts from it are presented below. (That first evaluation report is written in Swedish, but an informal English summary of it exists as well.)

4.1 Method and Setting

Evaluation sessions took place in two public places in Sweden during June and July 2010; first in Stapelbäddsparken, Malmö and then in Universeum, Göteborg. The evaluation sessions were ethnographic field studies during which participants were filmed and observed. With this method of observation, people can be studied in context to see how they face the situation [29]. At the beginning of every session, participants were given a brief introduction to the system and how it would work. To conclude the sessions, participants were brought together and interviewed after each test to examine the experiment and their views about the prototype. Through these debriefing sessions, the situation they had been placed in could be analyzed and understood [29]. These debriefing sessions were also video recorded for later content analysis. Figure 4.1 shows an evaluation session in Stapelbäddsparken, Malmö.

Fig. 4.1. Evaluation in Malmö [15]

The studies comprised seven sessions in total, with volunteers aged between 11 and 17. Participants were divided into groups consisting of four camera holders and one director. If they were interested, the director or the camera operators were allowed to appoint an assistant. Every group was asked to produce footage of the environment; however, evaluators in Stapelbäddsparken were free to choose their topic of filming, while those in Universeum were instructed to produce an overview of one of the various showcases there, the ”Crime Lab”. They were all requested to represent the environment in the way most understandable to the web followers.

4.2 Study Results

The results from interviewing participants during the test sessions, as well as from observations and analysis of the recorded material, are presented below.

Interestingly, MVM was liked by most of its users and was said to feel natural, as it could show a live preview of what the others were filming; the test subjects operating the Vision Mixer also said that they could experience the feeling of playing the main role in the production team. The overall collaboration to perform the task was also an enjoyable experience for the whole team, even though some problems arose. These issues are discussed in the next section.

In addition, the Vision Mixer was reported as an easy-to-use and learnable application by most directors, since after using the prototype for a while they could understand how it functioned. Based on this new learning, some groups restructured their work after a few minutes.

Some participants also chose to have assistants in order to collaborate better with the director. In this arrangement, the cameraman could focus on the filming task while the assistant coordinated it, for example by talking to the director or proposing topics for filming. This most obviously became a solution for directors when the Vision Mixer lagged and other team members were out of sight. Codirecting a scene using MVM is depicted in figure 4.2.

Fig. 4.2. Codirecting with MVM [15]

In some cases, directors were observed instructing camera holders on what to film. These observations illustrate how the whole team found workarounds to collaborate in a better way.

4.3 Problems Found

During the test sessions, the following problems were reported by the participants:

1. Delay: Most of the directors complained about a noticeable delay of up to 30 seconds in the mixer, between the actual moment of filming and the moment the stream was received by the Vision Mixer; it tended to appear after the Vision Mixer had been in use for a while. (At the time the user studies were conducted, there was no exact estimate of how long this delay would be.) Mixing in this way became difficult for some directors, since what they saw on the display was behind what the camera operators were filming. One chosen solution to this problem was to use a co-director: one person followed the cameramen and watched what they were filming, while the other handled the mixing and switching on the mobile phone. Some other evaluators, on the other hand, said that they could benefit from this delay in some way, by being able to plan in advance and keep in mind what would be on screen soon.

2. Communication and feedback: Since the system does not currently provide any channel for awareness, almost every camera holder experienced the common problem of never knowing who was selected for broadcast. The only solution to this problem was to ask the director or the co-director. This means that the need for a backchannel, or at least some feedback on the current camera selection, is inevitable.

3. Picture quality: A group of evaluators in Malmö thought that higher quality pictures could improve the application. Moreover, this group reported that, because of the low video quality on their mobile phones, they rarely use mobile video. What they complained about might, however, be due to the conditions of the Malmö study, which took place on a sunny day in an open area; the mobile screens were reflecting sunlight, and watching video therefore became problematic.

4.4 Discussion

The successful launch of the prototype demonstrated that the development of similar or even more advanced systems is possible. The study revealed that, for users of such systems, being aware of what others are doing is indispensable. Showing live previews of every camera is one way of doing so, but providing a channel for background communication also appears to be of high importance. This could be implemented as a verbal channel, text-based chat or simply one-way notifications showing who is on air.


5 Conclusion and Further Work

MVM has demonstrated the potential of mobile phones in collaborative live video production. What was envisioned from the start of the project was to address amateurs' needs in video practices. MVM, in this sense, is a considerable step forward: it sets out an innovative way for mobile users to co-produce their own footage of events such as sports, as mentioned in section 3.4. Further investigation also indicated that MVM is the first mobile video mixing system of its kind, taking mobile video another step further. Yet another application of MVM would be in cases where people or events need to be watched or recorded, such as in surveillance and security systems.

Through the development of the system, several challenges arose. Fetching live streams from Bambuser and bringing them into a multimedia environment such as Microsoft DirectShow was the biggest one. The selection of DirectShow, however, was promising since it could give almost full control over the media streams. DirectShow also provided functions for integration with Adobe Flash Media Live Encoder to help broadcast the combined video to Bambuser. To get DirectShow running, an intermediate server had to be used. It was shown in chapter 3 how this architecture was beneficial; transmitting one video stream which actually contains four was a solution to overcome the current mobile network limitations.

5.1 Further Work

Still, there are remaining challenges, such as the technical improvements mentioned in section 3.7. Besides these, I want to create a real mobile application rather than the current Flash Lite object. Mobile apps are more consistent and durable; also, with recent advancements in mobile operating systems, they seem to provide more flexibility in programming and in developing user interfaces.


My colleagues at MobileLife and I want to continue our work on MVM to address these issues. A few new features also deserve some effort, such as expanding the system to provide the director with more editing capabilities. Utilizing HD video is among our intentions too, which requires that live streaming service providers like Bambuser support it as well.


References

1. Bambuser. http://bambuser.com. Accessed August 24, 2010.
2. Xml web services basics. http://msdn.microsoft.com/en-us/library/ms996507.aspx, December 2001. Accessed August 25, 2010.
3. C# language specification version 3.0. http://download.microsoft.com/download/3/8/8/388e7205-bc10-4226-b2a8-75351c669b09/CSharp Language Specification.doc, 2007. Accessed August 31, 2010.
4. Introduction to directshow. http://msdn.microsoft.com/en-us/library/dd390351(v=vs.85).aspx, June 2007. Accessed August 24, 2010.
5. Actionscript technology center. http://www.adobe.com/devnet/actionscript.html, July 2009. Accessed December 24, 2010.
6. Adobe flash platform blog. http://blogs.adobe.com/flashplatform/, July 2009. Accessed December 24, 2010.
7. Building a live video switcher with flash communication server mx. http://www.adobe.com/devnet/flash/articles/live video switcher print.html, July 2009. Accessed September 1, 2010.
8. F4v/flv technology center. http://www.adobe.com/devnet/f4v.html, July 2009. Accessed September 2, 2010.
9. Flash directshow source filter. http://www.medialooks.com/products/directshow filters/flash source.html, 2009. Accessed August 24, 2010.
10. Multigraph sink/source. http://www.medialooks.com/products/directshow filters/multigraph sink source.html, 2009. Accessed August 25, 2010.
11. Swf technology center. http://www.adobe.com/devnet/swf.html, July 2009. Accessed December 20, 2010.
12. Adobe - flash lite. http://www.adobe.com/ap/products/flashlite/, 2010. Accessed August 24, 2010.
13. Adobe. Using Adobe Flash Media Live Encoder 3.
14. Mathias Broth. The production of a live TV interview through mediated interaction. Recent Developments and Applications in Social Research Methodology (Proceedings of the Sixth International Conference on Logic and Methodology), 2004. SISWO, Amsterdam.
15. Emelie Dahlström. Att dokumentera och uppleva med live video, en utvärdering av två mobila applikationer för live videoredigering (Documenting and experiencing with live video, an evaluation of two mobile applications for live video editing). MobileLife, 2010.
16. A. Engström, M. Esbjörnsson, and O. Juhlin. Mobile collaborative live video mixing. In Proceedings of the 10th international conference on Human computer interaction with mobile devices and services, MobileHCI ’08, pages 157–166, New York, NY, USA, 2008. ACM.
17. Arvid Engström, Liselott Brunnberg, Josefin Carlsson, and Oskar Juhlin. Instant broadcasting system: mobile collaborative live video mixing. In ACM SIGGRAPH ASIA 2009 Art Gallery & Emerging Technologies: Adaptation, SIGGRAPH ASIA ’09, pages 73–73, New York, NY, USA, 2009. ACM.
