Personal TV Channels


Department of Science and Technology Institutionen för teknik och naturvetenskap

Linköpings universitet Linköpings universitet

SE-601 74 Norrköping, Sweden 601 74 Norrköping

Examensarbete

LITH-ITN-MT-EX--07/042--SE

Personal TV Channels

Per Carlsson

2007-08-30

Degree project in Media Technology carried out at Linköping Institute of Technology, Campus Norrköping

Supervisor: Reiner Lenz

Supervisor: Ad Denissen

Examiner: Reiner Lenz


Division, Department: Institutionen för teknik och naturvetenskap (Department of Science and Technology)

Date: 2007-08-30

Report category: Examensarbete

Language: English

ISRN: LITH-ITN-MT-EX--07/042--SE

Title: Personal TV Channels

Author: Per Carlsson


Copyright

The publishers will keep this document online on the Internet - or its possible replacement - for a considerable time from the date of publication barring exceptional circumstances.

The online availability of the document implies a permanent permission for anyone to read, to download, to print out single copies for your own use and to use it unchanged for any non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional on the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.

According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.

For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its WWW home page: http://www.ep.liu.se/


Final thesis

Personal TV Channels

by

Per Carlsson


Supervisor: Principal System Architect Ad Denissen

Storage Systems and Applications at Philips Research Laboratories

Examiner: Reiner Lenz

Dept. of Computer and Information Science at Linköpings universitet


Abstract

The Personal TV Channel concept, still in an early stage of research, will record all TV programmes of interest in separate virtual channels (news channel, TV-series channels, movie channels) based on the user's preferences. The concept has been well received, and this report investigates the technical issues and possibilities of creating Personal TV Channels.

The development was done mostly as stand-alone applications, but also by extending the open-source Linux PVR software MythTV. The experimental results using video fingerprint techniques show a high accuracy in finding the starting points of TV series and news broadcasts. The developed visualisation tool provides a clear output of the broadcast segmentation. The segmentation and matching of commercials outputs extensive commercial statistics and makes it possible to track the broadcast of specific commercials. Finally, conclusions and future work are presented.

Keywords: Video Scene Boundary Detection, Commercial Detection, Video Fingerprinting, Video Indexing, Content Analysis, Pattern Recognition


Acknowledgements

Ad Denissen, for supervising my internship with enthusiasm and knowledge. Ingmar Van Dijk and Ard Biesheuvel, for your expertise in Linux, programming and software architecture. Adolf J. Proidl, for your expertise in Personal TV Channels. Thank you.


Contents

1 Introduction
  1.1 Background
  1.2 Philips Research
  1.3 PVR
  1.4 Personal TV Channels
  1.5 Electronic Program Guide
  1.6 Main Goal
  1.7 Report structure
  1.8 Limitations
  1.9 Video input
  1.10 MythTV
  1.11 Commercial detection
  1.12 Method
  1.13 Abbreviations, acronyms and definitions
2 TV Programme Boundaries Detection
  2.1 TV channel characteristics
  2.2 Broadcast structure
  2.3 Statistics
  2.4 Video Fingerprint
    2.4.1 Subtractive Fingerprint
    2.4.2 Fingerprint Generation
    2.4.3 Fingerprint Matching
    2.4.4 Matching issues
  2.5 System Architecture
3 Visualisation
  3.1 XSLT
  3.2 Image Extraction
  3.3 Video Extraction
  3.4 Skiplists
4 Commercial matching
  4.1 Commercial statistics
  4.2 Tracking of specific commercials
5 Experimental Results
  5.1 The intro issue
  5.2 Preknowledge of the headers
6 Conclusions
  6.1 Future work
7 Appendix A


Chapter 1

Introduction

1.1 Background

The era of waiting for TV programmes to begin is over. Users want to watch their favourite shows when they turn on the TV. Instead of zapping through the channels waiting for something to come on, they can turn on their own channel and start watching the latest episodes of their preferred content. Personal TV Channels make this possible.

1.2 Philips Research

This thesis was written for the Storage Systems and Applications group at Philips Research, Eindhoven.

Founded in Eindhoven, The Netherlands, in 1914, Philips Research, as part of Koninklijke Philips N.V., has expanded the scale and scope of its activities to become one of the world's major private research organisations. The annual research budget of Philips Research is slightly less than 1% of Philips' annual sales, which amounted to EUR 30.4 billion in 2005. Roughly two-thirds of the corporate research work is geared to the activities of the Product Divisions of Philips, with contractual agreements about programmes and costs. The remainder is research of a more exploratory nature. (Total R&D efforts of Philips amount to approximately 8% of sales.) [1]

1.3 PVR

A Personal Video Recorder (PVR) is a device used to watch and digitally record television broadcasts. Usually a PC with a TV-tuner card is used as the hardware base, but it is the software that defines the type of PVR. Examples of PVR software are ReplayTV, TiVo, Microsoft Media Center and MythTV. The PVR is meant to replace the regular TV tuner, optical disc player and video recording device in the household. Routing the video broadcast through a PVR instead of directly to the television set opens new possibilities, since the PC has much more processing power, memory and storage capacity.

On a PVR, the received video broadcast is first stored in a buffer in the memory of the PC before being displayed on the screen. This makes it possible to instantly record the material being watched, since the operation only requires the video in the buffer to be written to a storage device such as a hard drive. It also allows time shifting, where the user presses pause in a live programme: the PVR immediately starts recording the remainder of the programme, and the user can continue watching the recording from that point at another time. For Personal TV Channels it is even easier. Unlimited fast forward and rewind are possible without any buffer, since all the data is already stored on the hard drive.

1.4 Personal TV Channels

Personal TV Channels, also known as Flex channels, are under research by Philips [2], [3]. Personal TV Channels let the user create TV channels containing only programmes chosen by the user. Examples are recordings of all news broadcasts from a certain channel, or every movie starring Arnold Schwarzenegger. All programmes recorded for a Personal TV Channel are played in a specific order when the user selects the channel, creating the illusion of a regular television channel. For this feature, precise boundary detection is important, since the need to fast forward in any direction ruins the illusion of a regular TV channel. Personal TV Channels are created with the remote control of the PVR and watched like any other channel on the TV, see figure 1.1. The concept of Personal TV Channels is a future differentiator separating Philips PVR boxes from other PVR boxes.

Figure 1.1: Regular and Personal TV Channels in seamless integration

1.5 Electronic Program Guide

The Electronic Program Guide (EPG) is extra information about the broadcast, often sent as side information in a digital video broadcast, but it can also be retrieved in other ways (for example by downloading it from a remote server). The EPG can contain information about the programmes, for example start and end times, starring actors and a short description. The schedule information is very useful for scheduling recordings in advance.

However, the EPG does not contain exact time information. If the exact start and end time of each programme were available, it would be very easy to find the start and end points, and the channel would not get any advertisement money. The times are not exact because the commercial breaks are not shown in the EPG. Users, however, are usually interested in the material they recorded and not in the surrounding advertisements. The information in the EPG therefore needs to be enhanced to make it accurate enough for our needs.

1.6 Main Goal

The main goal is automatic detection of programme boundaries, using content analysis and pattern recognition on recorded TV programmes, in order to make Personal TV Channels.

1.7 Report structure

First some preliminaries and definitions are given. Then the method and the work process are described, followed by experimental results, conclusions and future work.

1.8 Limitations

The method is limited to the pixel domain (the compressed domain is not used). There is no real-time performance requirement. The method focuses on TV series and news broadcasts on some of the most popular television channels in the Netherlands. It is intended to work on fully recorded programmes that contain a few minutes of additional recorded material before the beginning and after the end of the programme. For the segmentation and matching of commercials, TV channels are expected to include blank frame segments between commercials.

1.9 Video input

The video is recorded as MPEG-2 PS with a resolution of 720×576 at 25 frames per second and a bit rate of 4500 kbps (roughly 2 GB/hour), using an MPEG-2 hardware encoder and the MythTV software.
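
The storage figure follows directly from the bit rate. The short calculation below is only a check of that arithmetic, assuming a constant bit rate and decimal gigabytes; the function name is chosen here for illustration.

```python
def gigabytes_per_hour(bit_rate_kbps: float) -> float:
    """Storage needed for one hour of video at a constant bit rate."""
    bits_per_hour = bit_rate_kbps * 1000 * 3600  # kbps -> bits per hour
    bytes_per_hour = bits_per_hour / 8
    return bytes_per_hour / 1e9                  # decimal gigabytes


print(f"{gigabytes_per_hour(4500):.3f} GB/hour")  # 2.025 GB/hour
```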

1.10 MythTV

MythTV is a popular open-source PVR software program for Linux, released under the GNU General Public License. Version 0.20 was used for this project, see figure 1.2.


Figure 1.2: Snapshot of the Linux PVR software MythTV showing the EPG information based recording interface.

1.11 Commercial detection

Commercial detection algorithms are typically based on the detection of blank frames, channel logotypes and scene change frequency. Commercial detectors ([4], [5], [6], [7], [8]) are able to detect most of the commercials in a recording but are still unable to find the starting points of programmes, for two reasons. First, many programmes start right after the end of the previous one. Second, the recording contains content other than commercials and the main programme, such as trailers for upcoming shows and parts of other programmes.

1.12 Method

Instead of trying to cut out all commercials and other content that is not the main programme, we look explicitly for the start of the programme. The proposed method mainly uses video fingerprinting, but some commercial detection techniques are used as well. All the content is left intact on the hard disk of the PVR; the output of the method is only a pointer to the starting point of the main programme in the recording. See figure 1.3.

Figure 1.3: Recording consisting of main programme and other content such as other programmes and commercials

1.13 Abbreviations, acronyms and definitions

PVR: Personal Video Recorder

PC: Personal Computer

MPEG: Moving Picture Experts Group

MPEG-x: Standard number x developed by MPEG

TV: Television

Set-top box: A device that connects to a television and some external signal source, and turns the signal into content that is then displayed on the screen.

Ground Truth: The manually annotated truth, used to determine the accuracy of the automatic classification by comparison.

EPG: Electronic Programme Guide

MythTV: Open-source PVR software


Chapter 2

TV Programme Boundaries Detection

In the segmentation we focus on finding the starting point of the main programme in the recording. Finding the starting point is more important than finding the ending point: as long as the viewer can easily skip forward to the precise starting point of the next show in the sequence of the Personal TV Channel, it is not vital that the shows are composed completely seamlessly.

2.1 TV channel characteristics

TV channels can be divided into three types based on their commercials: channels without commercials, channels with commercials only in between programmes, and channels with commercials at any time. Finding the starting point is easiest when commercials always occur in between programmes, and only there. It is more difficult on channels without commercials or with commercials at any time. Our method aims at handling all three types of TV channels.

2.2 Broadcast structure

An extensive study of the different segments of the broadcast structure was made, and a few terms were introduced to make recordings easier to describe. Note how intro and header are defined, since these words are used frequently and their meaning is not self-evident.

Main Programme: The main programme of the recording.

Other Programme: Another programme in the recording.

Commercial: A single commercial. Usually lasts 4-40 seconds.

Self Commercial: Trailers for upcoming shows in the near future. Usually last 5-40 seconds. Also referred to as SelfComs.

Blank frame segment: Four to eight consecutive blank (black) frames separating commercials.

Commercial Separators: Commercial separators, usually 3-15 seconds long, inform the viewers about the start or end of a commercial block. Also referred to as ComSeps. See figure 2.1.

Sponsor: The programme is sponsored by a company. See figure 2.2.

Intro: Introduction part of the programme before the header; not present in all programmes.

Header: A programme usually starts with a header, an introductory signature sequence that is always the same for a given programme. See figure 2.3.

Government Commercial: Commercials from the government differ in that they do not follow the usual regulations for commercials with blank frames and commercial separators.

Figure 7.1 (in the appendix) shows the first part of two manually annotated episodes of Goede tijden, slechte tijden. Each segment is displayed with a representative image in combination with key numbers and text.


Figure 2.1: Commercial Separator

2.3 Statistics

The manually annotated episodes of Goede tijden, slechte tijden show some interesting statistics, see table 2.1 and figure 2.4. Most striking is that when popular shows (which contain more commercials) are recorded with some extra minutes around the programme to compensate for an inaccurate EPG, the percentage of main programme in the recording is low, close to 50%. This is expensive in storage space and clearly indicates the need to segment the recording in order to make Personal TV Channels.
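
The parenthesised values in table 2.1 are consistent with a seconds.frames notation at the 25 frames per second of the recordings (for instance, 524 frames is 20 seconds plus 24 frames). This reading is inferred from the numbers and is not stated in the table caption. A minimal sketch of the conversion, with a function name chosen here for illustration:

```python
FPS = 25  # frame rate of the recordings (section 1.9)


def frames_to_sec_frames(frames: int, fps: int = FPS) -> str:
    """Format a frame count as 'seconds.frames' (assumed table notation)."""
    seconds, rest = divmod(frames, fps)
    return f"{seconds}.{rest:02d}"


for n in (524, 124, 1124, 147, 73, 424):
    print(n, frames_to_sec_frames(n))
```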

2.4 Video Fingerprint

To find the starting points in the video recordings we use video fingerprints. Video fingerprints make it possible to detect repeating parts of a TV channel. Parts that repeat are assumed to belong to a ComSep, a SelfCom, a header or a commercial.

A video fingerprint identifies a video sequence like a human fingerprint identifies a human. Like a human fingerprint, it should be robust against distortions and transformations. More specifically, it should be robust to subtle differences in rotation, scaling, luminance and colour. It should also be easy and fast to calculate and consist of only a small number of values, so that storing and comparing fingerprints is inexpensive.

Figure 2.2: Sponsor segment

The selected fingerprint method is the Differential Variable Size Block Luminance Algorithm [9], a method based on mean luminance values extracted from a grid, as seen in figure 2.5. It is robust to rotation and scaling, quick to generate and simple enough for fast matching calculations. The fingerprint is small, using only 38 numbers per frame: the mean value, the variance and 36 area luminance numbers.
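
To make the 38-number layout concrete, the following is a minimal sketch of a block luminance fingerprint. It assumes a uniform 6 by 6 grid and a frame given as a list of rows of luminance values; the cited algorithm uses variable-size blocks, so the grid shape, the function name and the ordering of the values are simplifications chosen here and not the implementation used in the thesis.

```python
from statistics import fmean, pvariance


def fingerprint(frame, rows=6, cols=6):
    """Return [mean, variance, block means...] for one luminance frame.

    frame is a list of rows, each a list of luminance values (0-255).
    The frame is assumed to be at least rows x cols pixels.
    """
    height, width = len(frame), len(frame[0])
    pixels = [p for line in frame for p in line]
    values = [fmean(pixels), pvariance(pixels)]
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            block = [frame[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            values.append(fmean(block))  # mean luminance of this grid cell
    return values


# Tiny synthetic frame: left half black, right half mid-grey.
half = [[0] * 6 + [100] * 6 for _ in range(12)]
print(len(fingerprint(half)))  # 38
```

With the default grid the result always has 2 + 36 = 38 values, matching the size given above.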

2.4.1 Subtractive Fingerprint

The selected fingerprint method recommends storing the differences between block i and block i+1 instead of the luminance values themselves. A decision was made not to do this, since we also use the fingerprint to find the black frames between commercials. But storing the luminance and not the differences has a drawback: if a matching sequence is slightly brighter overall, the match will be negative even though the difference values would match perfectly.

Figure 2.3: Header of the popular TV-series Goede tijden, slechte tijden

The solution would be to store the differences between block i and block i+1 to improve the matching, but to store the mean and variance based on the original luminance values so that black frames can still be detected using the fingerprint.
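
The effect of a uniform brightness shift can be illustrated with a few made-up block values. The numbers and the function name below are hypothetical; the sketch only shows that differences between neighbouring blocks are unchanged when every block becomes brighter by the same amount.

```python
def block_differences(blocks):
    """Differences between block i and block i+1."""
    return [b - a for a, b in zip(blocks, blocks[1:])]


original = [40, 55, 90, 120]           # made-up block luminance values
brighter = [v + 8 for v in original]   # same content, 8 levels brighter overall

print(original == brighter)                                        # False
print(block_differences(original) == block_differences(brighter))  # True
```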

2.4.2 Fingerprint Generation

To generate fingerprints from a large number of recordings, fingerprint generation was added to MythTV. A large number of TV series and news broadcasts from some of the most watched Dutch TV channels were recorded and fingerprinted.

2.4.3 Fingerprint Matching

The matching intensity is calculated per frame and compared to a threshold to decide whether the match is good enough. Then the length of each matching sequence is counted, see table 2.2. All but the longest match are ignored. Each matching sequence has a time offset between the two compared files. This offset, in frames, is found where the diagonal line through the match intersects the axes. In table 2.2 the longest matching sequence spans seven letters, with an offset of one for the vertical episode and three for the horizontal episode.
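
The counting along diagonals can be sketched as a small dynamic programme over the two sequences. The sketch below uses the letters of table 2.2 in place of frames; for real fingerprints the `same` predicate would be a thresholded fingerprint comparison, which is left as an assumption here. Function and parameter names are chosen for illustration.

```python
def longest_match(vertical, horizontal, same=lambda a, b: a == b):
    """Return (length, vertical_offset, horizontal_offset) of the longest run.

    Each cell counts how many consecutive items match along its diagonal.
    """
    best = (0, 0, 0)
    previous = [0] * (len(horizontal) + 1)
    for i, v in enumerate(vertical):
        current = [0] * (len(horizontal) + 1)
        for j, h in enumerate(horizontal):
            if same(v, h):
                current[j + 1] = previous[j] + 1  # extend the diagonal run
                if current[j + 1] > best[0]:
                    length = current[j + 1]
                    best = (length, i - length + 1, j - length + 1)
        previous = current
    return best


print(longest_match("RPHILIPSABBBY", "BBBPHILIPSRES"))  # (7, 1, 3)
```

The result reproduces the example in the text: a run of seven letters starting at offset one in the vertical sequence and offset three in the horizontal one.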

Figure 2.4: Commercial length histogram of six episodes of Goede tijden, slechte tijden

Table 2.1: Statistics of two manually annotated episodes of Goede tijden, slechte tijden, seconds in parentheses

Value                          Statistic
30                             Number of commercials per recording
524 (20.24)                    Average number of frames per commercial
124 (4.24) and 1124 (44.24)    Minimum and maximum length of a commercial
26.7%                          Percentage of the recording that is commercial
54.6%                          Percentage of the recording that is main programme
18.7%                          Percentage of the recording that is neither commercial nor main programme
147 (5.22)                     Average number of frames per commercial separator
73 (2.23) and 424 (16.24)      Minimum and maximum length of ComSeps
7                              Average number of blank frames in between commercials
6                              Number of blank frames in between

Figure 2.5: Fingerprint consisting of mean luminance values

matching similarity as intensity. See figures 2.6 and 2.7. Blank frames show up as squares, matching headers and commercials as lines.

2.4.4 Matching issues

The counting algorithm has been tweaked to be more forgiving when handling two types of matching issues. One is the issue of missing frames, as seen in figure 2.8, where the segments match fully except for a frame missing in one of them, which breaks the perfectly straight line. The other is the issue of partial weak matches, as seen in figure 2.9, where the segments match completely except for a few frames.
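
How the counting was made more forgiving is not specified in detail. One possible approach for the weak-match case is sketched below: a run along a diagonal is allowed to bridge a small number of consecutive weak frames. The `max_gap` parameter and the function name are assumptions, and the missing-frame case (where the run moves to a neighbouring diagonal) is not covered by this sketch.

```python
def tolerant_run_length(matches, max_gap=2):
    """Length of the longest run of True values, bridging short gaps.

    matches holds one boolean per frame along a diagonal of the match matrix.
    Up to max_gap consecutive weak (False) frames inside a run are bridged.
    """
    best = run = gap = 0
    for ok in matches:
        if ok:
            run += gap + 1  # bridged weak frames count as part of the run
            gap = 0
            best = max(best, run)
        elif run and gap < max_gap:
            gap += 1        # tentatively bridge this weak frame
        else:
            run = gap = 0   # gap too long: the run is broken
    return best


diagonal = [True, True, True, False, True, True]
print(tolerant_run_length(diagonal, max_gap=2))  # 6
print(tolerant_run_length(diagonal, max_gap=0))  # 3
```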

2.5 System Architecture

In the simplest case MythTV generates the fingerprints. A stand-alone application then reads the fingerprints, does the segmentation and matching, and outputs a text file with the results. See figure 2.10.

Y  0 0 0 0 0 0 0 0 0 0 0 0 0
B  1 2 3 0 0 0 0 0 0 0 0 0 0
B  1 2 2 0 0 0 0 0 0 0 0 0 0
B  1 1 1 0 0 0 0 0 0 0 0 0 0
A  0 0 0 0 0 0 0 0 0 0 0 0 0
S  0 0 0 0 0 0 0 0 1 7 0 0 1
P  0 0 0 1 0 0 0 0 6 0 0 0 0
I  0 0 0 0 0 1 0 5 0 0 0 0 0
L  0 0 0 0 0 0 4 0 0 0 0 0 0
I  0 0 0 0 0 3 0 1 0 0 0 0 0
H  0 0 0 0 2 0 0 0 0 0 0 0 0
P  0 0 0 1 0 0 0 0 1 0 0 0 0
R  0 0 0 0 0 0 0 0 0 0 1 0 0
   B B B P H I L I P S R E S

Table 2.2: Matching concept

Figure 2.7: Match between parts of two recordings. The two white lines are matching segments; the remaining white parts are mostly matching blank frames.

Figure 2.9: Match image - weak match


Chapter 3

Visualisation

Figure 3.1: Complete system architecture

To make the segmentation and matching output clear, a meaningful visualisation is needed. To accomplish this the system was extended, see figure 3.1. An XML file is output and, to close the loop, skip-lists are output and executed to update the MythTV database with the starting points found by the stand-alone application. Images and video sequences extracted from the recordings are also output.

3.1 XSLT

Extensible Stylesheet Language Transformations (XSLT) is an XML-based language used for the transformation of XML documents, in this case from XML to HTML with Mozilla Firefox as the XSLT processor. The main system output is an XML file containing the complete segmentation data. The XML file is then transformed to an HTML representation using an XSLT stylesheet. Different stylesheets transform the same XML into different output representations in seconds, using a standard web browser. One stylesheet might produce a simple text list of the lengths of the found commercials for further external plotting, another a more complete visual output with representative images, see figure 3.2.
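
The layout of the XML file is not given in the report. As an illustration of what segmentation data suitable for a stylesheet could look like, the sketch below serialises a list of segments with Python's standard `xml.etree.ElementTree` module. The element names, attribute names, recording name and frame numbers are all hypothetical.

```python
import xml.etree.ElementTree as ET


def segments_to_xml(recording, segments):
    """Serialise segments (dicts with type, start, end in frames) to XML text."""
    root = ET.Element("recording", name=recording)
    for seg in segments:
        ET.SubElement(root, "segment", type=seg["type"],
                      start=str(seg["start"]), end=str(seg["end"]))
    return ET.tostring(root, encoding="unicode")


xml_text = segments_to_xml("example_episode", [
    {"type": "commercial", "start": 0, "end": 524},
    {"type": "header", "start": 9000, "end": 9750},
])
print(xml_text)
```

Keeping the data in one XML file and the presentation in separate stylesheets is what allows the same output to be rendered either as a plain list or as a richer visual page.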

3.2 Image Extraction

For each segment a representative image is extracted. The image is chosen according to time and matching width (normally the brand name is displayed for a few seconds at the end of the commercial), which most likely gives a good snapshot of the commercial for visualisation. Figure 3.3 shows with a dotted line where the image was extracted for a specific match.

3.3 Video Extraction

The application is also capable of outputting Flash Video files as an alternative to the representative images. This gives a more complete representation, with the possibility to skip forward and backward, but it is considerably more expensive in extraction time and storage space. Flash Video (.flv files) was used, with H.263 for encoding the video and MP3 for encoding the audio.

3.4 Skiplists

It is possible to set cut-lists for recordings in MythTV. A cut-list has the form #-#[,#-#]... (for example 1-100,1520-3012,4091-5094) and is stored in the MythTV database. The stand-alone application outputs and executes skip-lists to update the recordings with the found starting point, which is then used when the recordings are played in MythTV.
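
Producing a string in the cut-list form shown above is straightforward. The sketch below only formats frame ranges; how the string is written to the MythTV database is not shown. The helper that cuts everything before a found starting frame is an assumption about how a starting point could be expressed as a cut-list, and both function names are chosen here.

```python
def format_cut_list(ranges):
    """Format (start_frame, end_frame) pairs as '#-#[,#-#]...'."""
    return ",".join(f"{start}-{end}" for start, end in ranges)


def cut_before_start(start_frame):
    """Cut-list that removes everything before the found starting frame."""
    return format_cut_list([(0, start_frame - 1)]) if start_frame > 0 else ""


print(format_cut_list([(1, 100), (1520, 3012), (4091, 5094)]))
# 1-100,1520-3012,4091-5094
print(cut_before_start(10012))
# 0-10011
```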


Figure 3.2: Visualisation with segments coloured according to type and also according to matching results


Chapter 4

Commercial matching

Commercial detection is an advanced and well-explored field ([4], [5], [6], [7], [8]). For this project only a simple commercial detection algorithm was implemented, since the main focus was on video fingerprint methods. Blank frame detection was chosen because it is the foundation of most modern commercial detectors and because it finds not only the commercial block but also the start and end points of most of the individual commercials within the block. In the Netherlands, television broadcasters are required by law to separate commercials with blank frames. Commercials are found in most cases. Problems occur when there are no blank frames between the commercial separators and the commercials (the first and last commercial of the commercial block are then not always found) and when there are long black fades within a commercial (one commercial is detected as several commercials). Figure 5.4 shows the difference between the last automatically found commercial and the actual last, manually annotated, commercial.
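
A minimal sketch of blank-frame based segmentation is given below. It assumes one (mean luminance, variance) pair per frame, as stored in the fingerprint, and treats runs of at least four dark, flat frames as blank frame segments (the lower bound from section 2.2). The threshold values and function names are hypothetical and would need tuning on real recordings.

```python
def blank_segments(frames, max_mean=20.0, max_variance=10.0, min_len=4):
    """Return (first, last) frame indices of runs of blank frames.

    frames is a list of (mean_luminance, variance) pairs, one per frame.
    """
    segments, start = [], None
    # A bright sentinel frame closes a blank run at the end of the list.
    for i, (mean, variance) in enumerate(list(frames) + [(255.0, 0.0)]):
        blank = mean <= max_mean and variance <= max_variance
        if blank and start is None:
            start = i
        elif not blank and start is not None:
            if i - start >= min_len:
                segments.append((start, i - 1))
            start = None
    return segments


def commercials_between(segments):
    """Spans of content lying between consecutive blank frame segments."""
    return [(a_end + 1, b_start - 1)
            for (_, a_end), (b_start, _) in zip(segments, segments[1:])]


# Synthetic example: content, 6 blank frames, 500 frames, 5 blank frames, content.
frames = ([(120, 900)] * 10 + [(2, 1)] * 6 + [(110, 800)] * 500
          + [(3, 1)] * 5 + [(90, 700)] * 10)
print(blank_segments(frames))                       # [(10, 15), (516, 520)]
print(commercials_between(blank_segments(frames)))  # [(16, 515)]
```

A long black fade inside a commercial would also pass the blank test in this sketch, which is the same failure mode described above.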

Figure 4.1 gives a rough idea of how many commercials match when a large number of recordings are matched. 63 commercials were matched and a subset of them is displayed. Coloured fields represent matching commercials.

Figure 4.1: Commercial matching for a large number of recorded programmes

4.1 Commercial statistics

The automatic detection of a large number of commercials makes it possible to output extensive commercial statistics like those in table 2.1, but for a large number of episodes. The statistics are not perfectly accurate, however, because of the minor problems that occur in the automatic blank-frame-based commercial segmentation.
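
Statistics of the kind listed in table 2.1 can be derived from the detected commercial lengths. The sketch below assumes lengths in frames at 25 frames per second; the input numbers are made up and the function name and dictionary keys are chosen here for illustration.

```python
def commercial_statistics(lengths, recording_frames, fps=25):
    """Summarise commercial lengths (in frames) for one recording."""
    total = sum(lengths)
    return {
        "count": len(lengths),
        "mean_seconds": total / len(lengths) / fps,
        "min_seconds": min(lengths) / fps,
        "max_seconds": max(lengths) / fps,
        "commercial_share": total / recording_frames,
    }


stats = commercial_statistics([500, 250, 750], recording_frames=6000)
print(stats)
```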

4.2 Tracking of specific commercials

Each commercial is assigned a unique id. When visualising the matching data, this id number can be used to highlight only specific commercials using a conditional statement. See figure 4.2.

Figure 4.2: Highlighting of two specific commercials in three recordings of Goede tijden, slechte tijden


Chapter 5

Experimental Results

The performance of the developed algorithm for finding the programme starting points was tested on a number of recordings by measuring the distance in time between the ground truth starting point and the starting point found by the algorithm.

Tables 5.1 and 5.2 display the offset between the found starting point and the actual starting point for an increasing number of input files for different kinds of recordings. For the Teletubbies episodes in table 5.1 the error decreases as the number of episodes increases. For the Bold and the Beautiful episodes in table 5.2 the result is about ten seconds off, since the ComSep before the episode is identical for all tested episodes.

The result will also be off if the preceding programme has an identical tail or if there are identical SelfComs. But as the number of recordings of the same type increases, the unwanted surrounding material will hopefully cancel out as noise, since more recordings of the same type give a more pronounced hit for the wanted repeating programme header.

Table 5.3 shows that matching only two episodes is generally not enough to obtain an accurate starting position.

Finding the starting points of a programme can be made faster by using commercial detection to get an idea of the structure of the recording and only doing the matching where a header is likely to occur, see figure 5.1.

The two main types of TV programmes I looked at were TV series and news broadcasts. The differences between TV series and news broadcasts are displayed in table 5.4.

Figure 5.1: Sections likely to contain the header

      e1          e1,e2       e1,e2,e3    e1,e2,e3,e4   e1,e2,e3,e4,e5
e2    01:19:06
e3    00:00:13    00:00:06
e4    00:11:06    00:11:06    00:00:19
e5    00:01:16    00:01:16    00:02:05    00:00:19
e6    00:01:10    00:00:06    00:00:29    00:00:19      00:00:19

Table 5.1: Time offsets for episode 1 of Teletubbies when matched with an increasing number of other episodes

      e1          e1,e2       e1,e2,e3    e1,e2,e3,e4   e1,e2,e3,e4,e5
e2    00:10:21
e3    00:07:03    00:10:10
e4    00:09:01    00:09:01    00:09:01
e5    00:19:13    00:09:04    00:10:06    00:10:06
e6    00:19:15    00:09:06    00:10:06    00:09:11      00:10:06

Table 5.2: Time offsets for episode 1 of Bold and the Beautiful when matched with an increasing number of other episodes

           Teletubbies                       Bold and the Beautiful
episode    gt          offset      error     gt          offset      error
2          06:40:12    07:59:18    01:19:06  05:57:12    05:46:16    00:10:21
3          06:40:12    06:41:00    00:00:13  05:57:12    06:04:12    00:07:03
4          06:40:12    06:29:06    00:11:06  05:57:12    05:48:11    00:09:01
5          06:40:12    06:42:03    00:01:16  05:57:12    06:17:00    00:19:13
6          06:40:12    06:41:22    00:01:10  05:57:12    06:17:02    00:19:15
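
The time values in the result tables appear to be written as minutes:seconds:frames at 25 frames per second. This reading is an assumption, but it reproduces the Teletubbies error values from the listed gt and offset values and agrees with the "about ten seconds" remark for table 5.2. A minimal sketch of the error computation under that assumption, with function names chosen here:

```python
FPS = 25  # frame rate of the recordings (section 1.9)


def to_frames(timecode, fps=FPS):
    """Convert 'mm:ss:ff' to a frame count."""
    minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    return (minutes * 60 + seconds) * fps + frames


def to_timecode(total_frames, fps=FPS):
    """Convert a frame count back to 'mm:ss:ff'."""
    seconds, frames = divmod(total_frames, fps)
    minutes, seconds = divmod(seconds, 60)
    return f"{minutes:02d}:{seconds:02d}:{frames:02d}"


def error(ground_truth, found):
    """Absolute distance between ground truth and found starting point."""
    return to_timecode(abs(to_frames(found) - to_frames(ground_truth)))


print(error("06:40:12", "07:59:18"))  # 01:19:06
print(error("06:40:12", "06:41:00"))  # 00:00:13
```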

5.1

The intro issue

Figure 5.2 shows the offset from the actual starting point of the header to the automatically found header for 50 recordings, it does not take into account the actual starting points of the programs with an intro. In other words, intros are not considered. Given the header and the last commercial you would think it is easy to find the beginning of the episode. But there are some significant difficulties. First the intro length varies as seen in

(42)

fig-28 5.2. Preknowledge of the headers

Tv-serie News broadcast

header length long short

header automatically found yes no

intro sometimes sometimes

Table 5.4: Comparison between TV-series and news broadcasts

ure 5.3. Moreover, as seen in figure 5.4, there is also a difference between the last automatically found commercial and the actual last, manually an-notated, commercial. Also commercial is lacking in some cases just before the programme. Figure 5.4 also illustrates how much the structure varies even for the same TV-series on the same TV-channel.

A basic solution would be to go back two minutes from the beginning of the header, unless the end of the last commercial is found closer than these two minutes; in that case, go back to the end of the last found commercial.
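The basic solution described above can be sketched as follows. This is only an illustration of the rule, not the implementation used in the experiments; the function and parameter names are hypothetical, and all times are assumed to be seconds from the start of the recording.

```python
from typing import Optional

INTRO_BACKOFF_S = 120.0  # the two minutes suggested in the text


def estimate_programme_start(
    header_start_s: float,
    last_commercial_end_s: Optional[float] = None,
    backoff_s: float = INTRO_BACKOFF_S,
) -> float:
    """Estimate where the programme (including a possible intro) begins.

    Go back `backoff_s` seconds from the start of the header, but never
    further back than the end of the last commercial found before the header.
    """
    earliest = max(0.0, header_start_s - backoff_s)
    if (
        last_commercial_end_s is not None
        and earliest < last_commercial_end_s <= header_start_s
    ):
        # A commercial ends inside the back-off window: start right after it.
        return last_commercial_end_s
    return earliest
```

For example, with a header starting at 400 s and a commercial ending at 350 s, the estimate is 350 s; without a commercial in the window it is 280 s.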

5.2 Preknowledge of the headers

For news broadcasts the starting points can be found accurately if one header is given. One way of providing this knowledge would be to set up a server that holds the fingerprint of the start sequence for every news show. It could also be possible to let the users provide it: as soon as one user makes the effort to tell one box where the intro starts and stops, the fingerprint information is shared with the server and made available to everybody.
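The sharing idea above could look roughly like the following in-memory sketch. It is purely illustrative: the class, its methods, the "first submission wins" policy and the representation of a fingerprint as a tuple of integers are all assumptions, and a real server would also need networking, storage and some validation of user submissions.

```python
from typing import Dict, Optional, Tuple

# Placeholder type; the actual fingerprint format is not specified here.
Fingerprint = Tuple[int, ...]


class HeaderRegistry:
    """Hypothetical server-side store of one header fingerprint per show."""

    def __init__(self) -> None:
        self._headers: Dict[str, Fingerprint] = {}

    def submit(self, show: str, fingerprint: Fingerprint) -> bool:
        """Store a user-provided header; the first submission is kept."""
        if show in self._headers:
            return False
        self._headers[show] = tuple(fingerprint)
        return True

    def lookup(self, show: str) -> Optional[Fingerprint]:
        """Return the shared header fingerprint, or None if unknown."""
        return self._headers.get(show)
```

Once one box has submitted a header for a news show, every other box can call `lookup` before matching instead of requiring its own user to annotate the intro.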


Figure 5.2: Offset between the found starting point of the header and the actual starting point of the header in seconds for 50 recordings


Chapter 6

Conclusions

Finding the programme starting points using video fingerprint techniques was my main task in this project. The experimental results using my developed algorithm show a high accuracy for finding the starting points. Starting points in TV-series can be found without knowing the header. However, TV-series sometimes have intros, and going from the header to the beginning of the programme is not trivial; see the discussion in section 5.1. Starting points in news broadcasts can be found automatically, but only if the header is given for one of the episodes. This is a big drawback; see the discussion in section 5.2.

While working on the project I was frustrated trying to read the text output generated by the matching, so I decided to develop an XML-to-HTML visualisation tool to enhance the presentation. The developed visualisation tool provides a clear output of the broadcast segmentation using not only text but also tables, colours and images.

During the project I was asked to look into the topic of targeted advertising for possible future implementation as a part of Personal TV Channels. The segmentation and matching of commercials that I developed output extensive commercial statistics and make it possible to track the broadcast of specific commercials using the convenient XSLT processing language.

My results contribute to the current Personal TV Channel knowledge but more research is needed before a full Personal TV Channel demo can be developed.


6.1 Future work

There are numerous possibilities for future work, ranging from a very practical to a more conceptual level.

• The segmentation and matching of commercials show interesting future possibilities for targeted advertising. Commercials in recordings could be exchanged with up-to-date commercials personalised for the user.

• Using the video fingerprint as intended, that is, storing luminance differences instead of just luminance values, would improve the matching; see the discussion in section 2.4.1.

• The brute force method for matching the fingerprints could be improved by using the more advanced methods proposed in [9] to significantly increase the matching speed.

• There are some unresolved issues with the memory management in the main stand-alone matching / visualisation program when matching large segments, see figure 6.1. The procedures for allocating memory on request and freeing it for reuse when no longer needed could be improved.

• The database connection to the stand-alone application is not fully developed. Building a database with commercials and headers would eliminate time-consuming and redundant rematching, see figure 6.2.

• Solutions other than ignoring all but the longest match could be evaluated when multiple matches are found, such as choosing the one closest to the EPG starting time if you are looking for a header.
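The alternative in the last item could be sketched as follows, next to the current "longest match" rule. The `Match` record and both function names are hypothetical, and times are assumed to be seconds from the start of the recording; how the EPG starting time is mapped onto the recording is outside the scope of this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Match:
    start_s: float   # where the match starts in the recording
    length_s: float  # how long the matched segment is


def longest_match(matches: Sequence[Match]) -> Optional[Match]:
    """Current rule: ignore all but the longest match."""
    return max(matches, key=lambda m: m.length_s, default=None)


def closest_to_epg(matches: Sequence[Match], epg_start_s: float) -> Optional[Match]:
    """Alternative rule: prefer the match nearest the EPG starting time."""
    return min(matches, key=lambda m: abs(m.start_s - epg_start_s), default=None)
```

With candidates at 100 s (30 s long) and 1800 s (20 s long) and an EPG start of 1790 s, the two rules disagree: the longest-match rule picks the first candidate, while the EPG rule picks the second.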


Chapter 7

Appendix A


Figure 7.1: Ground truth segmentation of 2 episodes of Goede tijden, slechte tijden


Bibliography

[1] http://www.research.philips.com/profile/about.html.

[2] A.J. Proidl et al M. Verberkt. Virtual channels. Technical report, Philips Restricted, 2004.

[3] S.P.P. Pronk A.J. Proidl. Intelligent virtual channels. Technical report, Philips Restricted, 2006.

[4] B. Satterwhite and O. Marques. Automatic detection of TV commercials. In IEEE Potentials, 2004.

[5] A. Albiol, M.J.C. Fullà, A. Albiol and L. Torres. Detection of TV commercials. In IEEE International Conference on Acoustics, Speech and Signal Processing, 2004.

[6] L. Agnihotri and N. Dimitrova. Evolvable visual commercial detector. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003.

[7] Dieter W Blum. Method and apparatus for identifying and eliminating specific material from video signals.

[8] R. Lienhart, C. Kuhmünch and W. Effelsberg. On the detection and recognition of television commercials. In IEEE International Conference on Multimedia Computing and Systems, 1997.

[9] V. Bhargava. Video fingerprinting. Technical report, Philips Restricted, 2005.


Copyright

Svenska

Detta dokument hålls tillgängligt på Internet - eller dess framtida ersättare - under 25 år från publiceringsdatum under förutsättning att inga extraordinära omständigheter uppstår.

Tillgång till dokumentet innebär tillstånd för var och en att läsa, ladda ner, skriva ut enstaka kopior för enskilt bruk och att använda det oförändrat för ickekommersiell forskning och för undervisning. Överföring av upphovsrätten vid en senare tidpunkt kan inte upphäva detta tillstånd. All annan användning av dokumentet kräver upphovsmannens medgivande. För att garantera äktheten, säkerheten och tillgängligheten finns det lösningar av teknisk och administrativ art.

Upphovsmannens ideella rätt innefattar rätt att bli nämnd som upphovsman i den omfattning som god sed kräver vid användning av dokumentet på ovan beskrivna sätt samt skydd mot att dokumentet ändras eller presenteras i sådan form eller i sådant sammanhang som är kränkande för upphovsmannens litterära eller konstnärliga anseende eller egenart.

För ytterligare information om Linköping University Electronic Press se förlagets hemsida http://www.ep.liu.se/

English

The publishers will keep this document online on the Internet - or its possible replacement - for a period of 25 years from the date of publication barring exceptional circumstances.

The online availability of the document implies a permanent permission for anyone to read, to download, to print out single copies for your own use and to use it unchanged for any non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional on the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.

According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.

For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its WWW home page: http://www.ep.liu.se/

© Per Carlsson
