
Institutionen för systemteknik
Department of Electrical Engineering

Master's thesis (Examensarbete)

Efficient content distribution in IPTV environments

Master's thesis carried out in Image Coding at Linköping Institute of Technology

by

Mirza Galijasevic, Carl Liedgren

LiTH-ISY-EX--08/4084--SE

Linköping 2008

Department of Electrical Engineering, Linköpings universitet


Supervisors: Peter Johansson (ISY, Linköpings universitet) and Erik Hallbäck
Examiner: Robert Forchheimer (ISY, Linköpings universitet)


Division, Department: Image Coding, Department of Electrical Engineering, Linköpings universitet, SE-581 83 Linköping, Sweden
Date: 2008-02-26
Language: English
Report category: Master's thesis (Examensarbete)
URL for electronic version: http://www.ep.liu.se
ISRN: LiTH-ISY-EX--08/4084--SE
Title: Efficient content distribution in IPTV environments
Authors: Mirza Galijasevic, Carl Liedgren



Abstract

Existing VoD solutions often rely on unicast to distribute content, which leads to a higher load on the VoD server as more nodes become interested in the content. In such cases, P2P is an alternative way of distributing content since it makes better use of the available resources in the network. In this report, several P2P structures are evaluated from an operator's point of view. We believe BitTorrent is the most adequate protocol for a P2P solution in IPTV environments. Two BitTorrent clients have been implemented on an IP-STB as proof of concept to find out whether P2P is suited for IPTV environments. Several tests were conducted to evaluate the performance of both clients and to see if they were able to reach a sufficient throughput on the IP-STB. Based upon the tests and the overall impressions, we are convinced that this particular P2P protocol is well suited for IPTV environments. Hopefully, a client developed from scratch for the IP-STB will offer even better characteristics.

Further, we have studied how to share recorded content among IP-STBs. Such a design would probably have many similarities to BitTorrent since a central node needs to keep track of content; the IP-STBs take care of the rest.

The report also discusses whether BitTorrent is suitable for streaming. We believe that the changes required to obtain such functionality would disrupt the strengths of BitTorrent. Some alternative solutions are presented where BitTorrent has been extended with additional modules, such as a server.


Acknowledgments

We would like to express our gratitude to Erik Hallbäck for his great enthusiasm, encouragement and support during this master's thesis. The following also deserve our gratitude for help with various tasks: Martin Bengtsson, Martin Åkesson, Erik Johansson and Niels Bosma. Further, thanks to our supervisor Peter Johansson at ISY for interesting discussions and great tips. Finally, we would like to thank friends and family for their support during this intense period.


Contents

1 Introduction
  1.1 Purpose
  1.2 Problem description
  1.3 Objectives
  1.4 Terminology
  1.5 Limitations
  1.6 Report outline
2 Background
  2.1 Network basics
    2.1.1 OSI
    2.1.2 Ethernet
    2.1.3 IP
    2.1.4 TCP
    2.1.5 UDP
  2.2 Video basics
    2.2.1 MPEG-2
    2.2.2 MPEG-TS
  2.3 IP-STB architecture
    2.3.1 Hardware
    2.3.2 Hardware Abstraction Layer
    2.3.3 Operating system
    2.3.4 Application services
  2.4 Present IPTV situation
  2.5 Introduction to P2P
  2.6 Requirements
3 Evaluation of P2P structures
  3.1 Overlay networks
  3.2 Centralized structure
    3.2.1 Napster
    3.2.2 BitTorrent
  3.3 Decentralized structure
    3.3.1 Gnutella
    3.3.2 DHT
  3.4 Semi-centralized structure
    3.4.1 FastTrack
  3.5 Discussion
4 BitTorrent
  4.1 BitTorrent components
  4.2 An overview of how BitTorrent works
  4.3 Peer selection
  4.4 Advantages with BitTorrent
  4.5 Disadvantages with BitTorrent
  4.6 Implementation
    4.6.1 Available clients
    4.6.2 Compilation
5 Tests and results
  5.1 Introduction
  5.2 Limiting the download speed
    5.2.1 rTorrent
    5.2.2 BTPD
  5.3 Limiting the number of peers
    5.3.1 rTorrent
    5.3.2 BTPD
  5.4 Restricting the buffer size
    5.4.1 rTorrent
    5.4.2 BTPD
  5.5 Limiting the upload speed
    5.5.1 rTorrent
    5.5.2 BTPD
  5.6 Watching content while downloading
    5.6.1 rTorrent
    5.6.2 BTPD
  5.7 Conclusions
6 Conceptual design
  6.1 Introduction
  6.2 Synchronization
  6.3 Content distribution
  6.4 Detailed overview
7 Future work
  7.1 Network coding
    7.1.1 Introduction
    7.1.2 Max-flow min-cut theorem
    7.1.3 Centralized and decentralized network coding
    7.1.4 Network coding in IPTV environments
    7.2.1 Introduction
    7.2.2 TOAST
    7.2.3 BiToS
8 Conclusion
Bibliography
A The BitTorrent protocol
  A.1 Bencoding
    A.1.1 Required
    A.1.2 Optional
  A.2 Client HTTP Protocol
    A.2.1 Required
    A.2.2 Optional
  A.3 Tracker HTTP Protocol


List of Figures

2.1 The OSI model.
2.2 An ethernet frame.
2.3 An IP packet.
2.4 Illustration of unicast.
2.5 Illustration of multicast.
2.6 Illustration of broadcast.
2.7 A TCP packet.
2.8 A UDP packet.
2.9 Illustration of MPEG-TS.
2.10 The IP-STB architecture.
2.11 A client-server model.
2.12 A P2P structure.
4.1 Illustration of BitTorrent.
4.2 rTorrent dependencies.
4.3 BTPD dependency.
5.1 rTorrent - speed 100 KB/s
5.2 rTorrent - speed 300 KB/s
5.3 rTorrent - speed 600 KB/s
5.4 BTPD - speed 100 KB/s
5.5 BTPD - speed 300 KB/s
5.6 BTPD - speed 600 KB/s
5.7 rTorrent - 10 peers
5.8 BTPD - 10 peers
5.9 rTorrent - buffer size 4 MB
5.10 rTorrent - buffer size 16 MB
5.11 rTorrent - speed 400 KB/s (upload)
5.12 BTPD - speed 400 KB/s (upload)
5.13 rTorrent - speed 400 KB/s with SD-stream
5.14 rTorrent - speed 400 KB/s with HD-stream
5.15 BTPD - speed 400 KB/s with SD-stream
5.16 BTPD - speed 400 KB/s with HD-stream
6.1 Illustration of the concept.
6.2 Information stored at the central node.
6.3 Illustration of data exchange.
7.1 A butterfly network where max-flow min-cut is two.
7.2 Max-flow min-cut is not achieved when using multicast.
7.3 The advantage of network coding.


Chapter 1

Introduction

This chapter presents why this master’s thesis was carried out.

1.1 Purpose

The assigner of this master's thesis is a company developing IP-STBs (Internet Protocol Set-Top Boxes). They are interested in how P2P (Peer-To-Peer) can be used to distribute video content in IPTV (Internet Protocol Television) environments.

1.2 Problem description

IPTV is, simply put, digital TV delivered with IP (Internet Protocol) over a network infrastructure. Besides providing ordinary TV and VoD (Video on Demand), it has the advantage of providing other IP-based services, such as browsing the Internet. To make use of IPTV, the user needs an IP-STB. At present, many VoD solutions use unicast to transmit content from the server, which is not efficient since the load on the server increases with the number of IP-STBs.

1.3 Objectives

Our intention with this report is to evaluate available P2P structures and to implement a client as proof of concept, showing whether content distribution via P2P is possible in IPTV environments. Given the 20-week timeframe of a master's thesis, we have decided to do the following:

• Present theory and the present IPTV situation

• Evaluate existing P2P structures against specific requirements

• Focus on a specific protocol and implement a client as proof of concept


• Evaluate the performance of the client on an IP-STB

• Investigate how P2P can be used in IPTV environments in the future

1.4 Terminology

This section presents the interpretation of common terms in this report.

• BitTorrent refers to the BitTorrent protocol

• P2P client or BitTorrent client refers to an application

• Client refers to a particular node of interest

• Structure refers to a certain topology used in overlay networks

• Piece refers to data that consists of two or more blocks

1.5 Limitations

Since the P2P client should act as proof of concept, we strive to use an already existing client, which means we need to compile it for the specific architecture. By proof of concept we mean downloading legal content with a P2P client. If successfully accomplished, we believe this shows that the particular P2P structure is suited for IPTV environments. We do not take security threats and legal issues into consideration, since the IP-STBs are connected to a network monitored by an operator.

1.6 Report outline

• Chapter 2 presents basic theory regarding the most fundamental parts of IPTV environments

• Chapter 3 presents existing P2P structures

• Chapter 4 presents a deeper study of a specific P2P protocol as well as how we implemented such a client on an IP-STB

• Chapter 5 presents the tests that were conducted and an analysis of the results

• Chapter 6 presents the conceptual design: how to share recorded content

• Chapter 7 presents how P2P can be combined with IPTV in the future

• Chapter 8 presents a conclusion on the performed work


Chapter 2

Background

This chapter presents network and video basics as well as a description of the IP-STB used in this master’s thesis.

2.1 Network basics

This section presents network basics that IPTV environments rely on. The foundation of this section originates from [1].

2.1.1 OSI

The OSI (Open System Interconnection) model makes it possible for devices with different operating systems to communicate with each other. The OSI model is a framework that gives a basic understanding of how data communication works. It separates the system into different layers, categorized from physical signaling up to applications handled by the user. Such an approach enables changes in a specific layer without requiring additional changes in other layers. Each layer provides services to the layer directly above it and uses services provided by the layer directly below it. The highlighted layers in figure 2.1 are explained since these are of interest in this report.

• The physical layer handles the physical signals. Before a frame is sent, the digital signal is modulated onto an analogue pulse form, an electric signal on a wire or a wireless radio signal. The signal is demodulated at the receiver.

• The data link layer fragments the stream into frames. It is also used for addressing in a LAN (Local Area Network) without routing capability, since the MAC (Media Access Control) address is predefined on every NIC (Network Interface Card).

• The network layer delivers individual packets between hosts by using the IP address to identify each NIC. If the host is connected to the Internet, the IP address needs to be unique.



Figure 2.1. The OSI model.

• The transport layer delivers packets between processes on separate hosts. This layer communicates through a specific port and each running process can use one or many of the 65 536 available ports.

• The application layer supports applications and end-user processes such as P2P clients.

2.1.2 Ethernet

Ethernet is a standard for signaling on the physical layer. Each ethernet frame carries a payload of at most 1500 bytes, as shown in figure 2.2; once IP and UDP headers are accounted for, 1472 bytes remain for application data. The MAC address makes it possible to address a frame to a single host, because only the NIC with the matching MAC address will accept the frame. MAC addresses are only relevant inside a LAN and cannot be used for routing between networks.


2.1.3 IP

IP is an unreliable, connectionless protocol located in the network layer. An IP packet consists of a header and data, as shown in figure 2.3. The maximum size of an IP packet in IPv4 is 65 535 bytes. IP is responsible for source-to-destination delivery, and packets are delivered in a store-and-forward fashion among intermediate nodes. Each node decides the next hop for a packet individually, so different packets may travel different paths through the physical network and arrive out of order due to delays. When sending video content this can be a problem, since video frames need to be played in order. To make sure that content is not affected by IP packets arriving out of order, IP can be complemented with a transport protocol such as TCP.

Figure 2.3. An IP packet.

Unicast

Unicast involves one source and one destination, thus when a router receives an IP packet it forwards it through one of its interfaces. Figure 2.4 illustrates how host A (source) sends IP packets to hosts B and C (destinations). There are two separate IP packets sent since the destination is always a single host in unicast.

Multicast

Multicast involves one source and several destinations, so when a router receives an IP packet it can forward it through several of its interfaces. Hosts interested in the same content join a multicast group, forming a delivery topology as illustrated in figure 2.5.


Figure 2.4. Illustration of unicast.

Figure 2.5. Illustration of multicast.

Broadcast

Broadcast is one-to-all communication, as shown in figure 2.6: there is one source and all hosts are destinations. Broadcasting IP packets is only allowed at the local level, since all hosts in the network receive the packets, which consumes bandwidth and is considered inefficient.


Figure 2.6. Illustration of broadcast.

2.1.4 TCP

TCP (Transmission Control Protocol) is a reliable, connection-oriented, process-to-process protocol located in the transport layer. It was constructed to deal with IP packets being lost, corrupted or arriving out of order. TCP includes a sequence number in each packet, as shown in figure 2.7. This makes it possible to detect lost packets, reorder packets that arrived out of order and acknowledge packets that arrived successfully. A TCP connection has three different states: connection establishment, data transfer and connection termination. The connection is initiated at a slow data rate that gradually increases until it reaches an appropriate upload and download rate. Since TCP connects exactly two processes, it cannot be used with multicast, which sends the same content to several destinations.
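The role of sequence numbers can be illustrated with a small sketch. Representing segments as (sequence number, payload) pairs is a deliberate simplification: real TCP numbers individual bytes, not whole segments.

```python
def reorder(segments):
    """Reassemble a byte stream from TCP-like segments that arrived
    out of order. Each segment is a (sequence_number, payload) pair;
    sorting by sequence number restores the original order."""
    return b"".join(payload for _, payload in sorted(segments))

# Segments arrive in the order 2, 1, 3 but play back correctly:
arrived = [(2, b"lo "), (1, b"hel"), (3, b"IPTV")]
print(reorder(arrived))   # b'hello IPTV'
```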

Buffer

Since the sender and receiver usually have different bandwidth, a buffer is implemented to cope with this. The buffer exists on both sides and is the main mechanism that makes error and flow control possible.

The buffer at the sender has three types of chambers:

• Empty chambers

• Chambers that contain bytes that will be sent

• Chambers that contain bytes that have been sent but not yet acknowledged

The buffer at the receiver has two types of chambers:

• Empty chambers

• Chambers that contain data that has not yet been read by the process

Flow control

The receiver uses the flow control mechanism to control the data rate and avoid being overwhelmed with packets. The flow control mechanism tells the sender how much data can be sent before an acknowledgment must be received.

Congestion control

The network can act as a bottleneck, since IP packets may be dropped along the way or arrive too late. TCP has a feature called congestion avoidance that adapts the sending rate so that packets can arrive at the receiver in time.

Slow start

Slow start is an algorithm that governs the size of the congestion window and consists of two different phases. The first phase is aggressive and increases the window size exponentially with time. When a certain threshold is reached, the strategy changes and the window size increases linearly with time. The purpose of the strategy is to find an appropriate data rate that the network and receiver can handle.
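The two phases can be sketched as a small simulation. The function name and the doubling/increment-per-round rule are illustrative simplifications of how real TCP implementations behave; loss handling is omitted entirely.

```python
def cwnd_growth(rounds, ssthresh, start=1):
    """Simulate congestion window growth in segments per round trip:
    exponential below ssthresh (slow start), then linear
    (congestion avoidance). Packet loss is not modeled."""
    cwnd = start
    history = [cwnd]
    for _ in range(rounds):
        if cwnd < ssthresh:
            cwnd *= 2      # slow start: double every round trip
        else:
            cwnd += 1      # congestion avoidance: +1 segment per round trip
        history.append(cwnd)
    return history

print(cwnd_growth(6, ssthresh=16))   # [1, 2, 4, 8, 16, 17, 18]
```

Note how the growth switches from doubling to a linear increase once the window reaches the threshold of 16 segments.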


2.1.5 UDP

UDP (User Datagram Protocol) is an unreliable, connectionless protocol. Each UDP packet is treated independently, even if the packets originate from the same source. UDP lacks flow and error control, so a lost packet cannot be resent, although such a feature can be implemented in the application layer. The strength of UDP is its simplicity, which makes the protocol suitable for delivering time-sensitive material such as audio or video content, where there is no point in resending packets whose playout time has already passed. The UDP header is 8 bytes, as can be seen in figure 2.8, which is small compared to the TCP header shown in figure 2.7. Both UDP and TCP packets are encapsulated in IP packets, which are in turn encapsulated in ethernet frames. The total length of a UDP packet can in principle be up to 65 535 bytes, but this maximum cannot be reached since the UDP packet must fit inside an IP packet. Being connectionless, UDP is well suited for both multicast and unicast streaming of video content.

Figure 2.8. A UDP packet.
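The fire-and-forget behavior of UDP can be demonstrated with Python's standard socket API. The loopback round trip below is a minimal sketch, not streaming code: there is no handshake and no acknowledgment, so a dropped datagram would simply never arrive.

```python
import socket

def udp_roundtrip(payload: bytes) -> bytes:
    """Send one UDP datagram over the loopback interface and receive
    it again. Unlike TCP there is no connection setup and no
    retransmission; each datagram is delivered (or lost) whole."""
    recv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    recv.bind(("127.0.0.1", 0))                # let the OS pick a free port
    port = recv.getsockname()[1]

    send = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    send.sendto(payload, ("127.0.0.1", port))  # fire and forget

    data, _ = recv.recvfrom(65535)             # one datagram, arrives whole
    send.close()
    recv.close()
    return data

print(udp_roundtrip(b"one TS packet worth of payload"))
```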

2.2 Video basics

This section presents video basics necessary to understand the MPEG (Moving Picture Experts Group) standard.

2.2.1 MPEG-2

MPEG has developed several standards for compression and transmission of content, e.g. MPEG-2, which is widely used in IPTV environments. An MPEG-2 stream is built from ESs (Elementary Streams), which are packetized into PESs (Packetized Elementary Streams).

ES

Each elementary stream contains one kind of data, such as compressed video, audio or other digital data. Video content, such as a movie, usually contains several ESs.


PES

Each ES is divided into PES packets containing video frames or audio data. Since each video frame has a different size depending on the resolution, color depth etc., the packets have variable length, but at most 65 526 bytes. In order for the streams to be recombined at the decoder, the DTS (Decoding Time Stamp) and PTS (Presentation Time Stamp) are set in the 6-byte header and used for synchronization. The DTS is used for decoding and storing packets in the buffer, while the PTS is used for fetching packets from the buffer and presenting them.[2]

MPEG-2 can be multiplexed in two different ways:

• MPEG-PS (MPEG Program Stream) for storing content on error-free mediums such as a DVD (Digital Versatile Disc)

• MPEG-TS (MPEG Transport Stream) for transmitting content over a channel where packets may be dropped

2.2.2 MPEG-TS

MPEG-TS is able to carry several streams simultaneously, e.g. several TV shows can be merged into a single MPEG-TS. This is not the case for MPEG-PS, which contains only specific content, e.g. a single movie per DVD. An MPEG-TS is created by fragmenting the PES into fixed-size packets of 188 bytes, as shown in figure 2.9. Each packet consists of a 4-byte header and a 184-byte payload. If the payload is less than 184 bytes it is padded with zeros to keep the packet size fixed.[3]
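The packetization step can be sketched as follows. The 4-byte header here is a simplified stand-in (a sync byte plus a PID-like field), not the full MPEG-TS bit layout with adaptation fields and continuity counters.

```python
TS_PACKET = 188
HEADER = 4
PAYLOAD = TS_PACKET - HEADER   # 184 payload bytes per packet

def packetize(pes: bytes, pid: int = 0x100):
    """Split a PES byte string into fixed 188-byte TS-like packets:
    a simplified 4-byte header (sync byte 0x47 plus a PID field)
    followed by 184 bytes of payload, zero-padded in the last packet."""
    packets = []
    for i in range(0, len(pes), PAYLOAD):
        chunk = pes[i:i + PAYLOAD]
        chunk += b"\x00" * (PAYLOAD - len(chunk))          # pad final packet
        header = bytes([0x47, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
        packets.append(header + chunk)
    return packets

pkts = packetize(b"\x01" * 400)
print(len(pkts), len(pkts[0]))   # 3 188
```

A 400-byte PES needs three packets (184 + 184 + 32 bytes of data), and the last packet is padded with 152 zero bytes.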


2.3 IP-STB architecture

This section gives an insight into the architecture of an IP-STB. The foundation of this section originates from [4].

2.3.1 Hardware

An IP-STB is a simple computer with limited resources. The hardware is kept minimal, including a slow CPU (Central Processing Unit), limited RAM (Random Access Memory), limited flash memory and an MPEG-2 decoder especially dedicated to decompression of the mentioned stream. The IP-STB also has a USB (Universal Serial Bus) interface where external devices can be connected. All user interaction is captured through an IR (Infra Red) remote control or an IR keyboard.

Figure 2.10. The IP-STB architecture.

2.3.2 Hardware Abstraction Layer

The hardware abstraction layer contains the most low-level libraries and drivers for the IP-STB and it makes use of the hardware that can be found at the bottom of the platform shown in figure 2.10.

2.3.3 Operating system

GNU/Linux is used as the operating system, mostly because it is open source. It can also be optimized by only including necessary parts, thus making the operating system small and efficient. The operating system is compressed into a boot image together with installation packages. An installation package is a compressed archive that contains an application. The boot image is distributed with a multicast server to all IP-STBs connected to the network and is uncompressed while the IP-STB boots.


2.3.4 Application services

Application services provide functionality to handle audio, video and communication between layers, as well as other basic functions. These functions are always running in the background but are not presented to the user. The applications use a framework called TOI (TV Open Interface) to communicate with the platform.

2.4 Present IPTV situation

This section presents how content is distributed in IPTV environments.

1. Live streaming, which is multicasted content and thus the load on the server does not increase with the number of viewers.

2. VoD, which is a popular service offered by most IPTV providers. The technology involves movies being stored at a VoD server and transmitted to the IP-STB when ordered by a customer. There are different VoD solutions and they differ in the way they interact with the user. Three of the most common VoD solutions are presented below:

• Pay-Per-View, where the customer pays in advance for a movie that is multicasted at a specific time. The customer does not interact with the content, which makes it similar to live streaming.

• True-VoD, where the customer can choose when to watch a movie since the server uses unicast to communicate with each customer in the network. This creates a lot of traffic in the network but lets the user rewind or fast-forward as pleased. True-VoD offers the same interactivity as a DVD player.

• Offline-VoD, which is similar to True-VoD but the customer does not have the option of viewing content immediately, in real time. The content needs to be stored on the hard drive before it can be accessed. Since many customers do not have the bandwidth required for True-VoD, Offline-VoD is considered an option for receiving on-demand video.[5]

Multicast is the best technology to use when the operator needs to transmit the same content to many customers at the same time. The advantage is that the load on the server does not increase with the number of customers but the disadvantage is the lack of interactivity. As mentioned earlier, TCP cannot be used with multicast since the same content is sent to multiple destinations. UDP is therefore used in multicast context.

VoD on the other hand offers great interactivity but delivering on demand video has traditionally been expensive since the content needs to be unicasted. We have decided to evaluate how P2P can be used to improve Offline-VoD scalability since it is a relatively straightforward VoD solution.


2.5 Introduction to P2P

Figure 2.11 illustrates the client-server model, which implies that there is at least one server providing data to the nodes. The disadvantage of this kind of infrastructure is that it scales poorly: as the number of nodes increases, so does the load on the server(s). It is possible to extend the server farm to cope with the problem, but that is a rather expensive and temporary solution. The purpose of a server is to be reliable, which implies a high uptime, while a P2P structure mainly focuses on performance rather than reliability. A node may go down unannounced for several reasons, while a server rarely does.

Figure 2.11. A client-server model.

Figure 2.12 illustrates the idea behind P2P, which is a way of dealing with the disadvantages of the client-server model. P2P describes the topology of the participating nodes, but it is not used for data exchange. When nodes obtain knowledge about each other, a TCP connection is often used to exchange data. As the number of nodes increases, so does the available bandwidth, and therefore there is no bandwidth constraint as in the client-server model.


2.6 Requirements

As already stated, we have decided to evaluate how P2P can be used to improve Offline-VoD scalability. When investigating which P2P structure is best suited, we believe it is of great importance to keep the operator in mind. The operator always has the final word regarding changes in the network infrastructure, and there are several requirements that need to be taken into consideration, such as:

• The operator must always have control of the traffic in the network

• The operator must be able to distribute all kinds of content

• The operator wants to use open source software due to licensing costs

• The operator wants to limit the bandwidth consumption

The four requirements defined above will underlie the evaluation of the P2P structures. Our intention is to find the P2P structure best suited to the given requirements and to study a specific protocol in detail.

Further, we have decided to look into how users could share recorded content with each other when connected to the same network. This makes it possible for users to watch content that has already been aired. To design such a system, several issues need to be addressed, such as how to locate, track and download content. The content needs to be synchronized to ensure that the IP-STBs "speak the same language". We only focus on the design of such a concept, and our work can be used as background if such a project is carried out.


Chapter 3

Evaluation of P2P structures

This chapter presents different P2P structures and emphasizes advantages and disadvantages associated with each structure from an operator’s perspective.

3.1 Overlay networks

An overlay network is a virtual network created on top of the physical network. Overlay networks describe the topology of the participating peers, which makes it easier for peers to connect to each other. Three structures are easy to distinguish:

• Centralized structure, where a single node in the network keeps track of the distributed content

• Decentralized structure, where all nodes exchange information about available content in the network since there is no central node as in the centralized structure

• Semi-centralized structure, where strategically selected nodes in the network act as central nodes and keep track of content

3.2 Centralized structure

The simplest way to create an overlay network is to use a central node that stores information about participating peers. The centralized structure has a weakness since it relies on a single node, making it a single point of failure. In large centralized structures, the central node might experience a higher load as the number of nodes increases, which usually results in increased search times. The server keeps track of all connected nodes in the network and tries to supply new nodes by directing them to existing nodes that hold the requested content. This approach leads to high integrity since only one node has information about all participating nodes.


3.2.1 Napster

Napster was one of the first P2P structures. It was designed for sharing music and attracted people to share copyrighted material. When a node performs a search, it sends a request to the Napster server. The server stores text-based information about the content that users share from their hard drives. A node searching for a song requests information from the server about where to find the content. The server responds with a list of nodes that share the specific content. The list only contains information about where the file can be located; the peers take care of the rest.[6]

Advantages

• Nodes must share content to be able to contact a server
• Straightforward design makes searching easy

Disadvantages

• Designed for music sharing
• Single point of failure
• Closed source
• Download from one node at a time

3.2.2 BitTorrent

This protocol divides a large file into smaller pieces, which makes it possible to download several pieces from different nodes at the same time. The protocol encourages trading with pieces, and pieces that have been downloaded can be shared immediately; thus it is not necessary for a node to hold the entire file. BitTorrent uses a central node, the tracker, to establish connections with other peers. A client needs to obtain a .torrent file that contains the address of the tracker for that specific file. The client contacts the tracker, which replies with a list of nodes that have the entire content or parts of it. Once connections have been established with other nodes, each node contacts the tracker every 30 minutes to report its status, and again when the download is completed. Even if the tracker fails after connections have been established, the communication between the nodes is not affected.[7]
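Both the .torrent metadata and the tracker's replies are serialized with a simple encoding called bencoding, detailed in Appendix A. A minimal decoder can be sketched as follows; the tracker URL in the example is made up for illustration.

```python
def bdecode(data: bytes, i: int = 0):
    """Decode one bencoded value starting at offset i.
    Returns (value, next_offset). Bencoding has four types:
    integers i<n>e, strings <len>:<bytes>, lists l...e, dicts d...e."""
    c = data[i:i + 1]
    if c == b"i":                                  # integer: i42e -> 42
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":                                  # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            value, i = bdecode(data, i)
            items.append(value)
        return items, i + 1
    if c == b"d":                                  # dict: d<key value ...>e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            value, i = bdecode(data, i)
            result[key] = value
        return result, i + 1
    colon = data.index(b":", i)                    # string: 4:spam -> b"spam"
    length = int(data[i:colon])
    return data[colon + 1:colon + 1 + length], colon + 1 + length

meta, _ = bdecode(b"d8:announce23:http://tracker.example/e")
print(meta)   # {b'announce': b'http://tracker.example/'}
```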


Advantages

• Open source

• Download from several nodes simultaneously

Disadvantages

• Content is available for a limited time

• Single point of failure

3.3 Decentralized structure

An alternative way of building a P2P structure is to use flooding techniques to perform a search in the network. When a node wants to connect to the network it contacts the first known node. They create a link between each other, and after a while an ad-hoc (point-to-point) network with an unstructured topology has been created. When a node performs a search it broadcasts a query to all neighbors in the overlay network, which in turn broadcast the query to their neighbors, and so on.

3.3.1 Gnutella

The protocol includes a TTL (Time-To-Live) field at the application layer that settles how far the query can propagate in the overlay network. Each time a new node receives the query it decreases the TTL field by one. When the TTL field reaches zero the node drops the packet and does not forward it to other nodes. The advantages of Gnutella are that nodes are treated equally and that the propagation caused by flooding can be limited. The disadvantage is the unnecessary packet exchange between nodes. There are other ways to perform a search through a network without having to broadcast messages to all nodes, e.g. by using DHT (Distributed Hash Table).[8]
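
The TTL mechanism can be sketched as a breadth-first flood over the overlay network where each hop decreases the TTL by one. The example overlay and TTL values below are invented for illustration; a real client also deduplicates queries by a message ID, which the `seen` set mimics:

```python
# Sketch of Gnutella-style query flooding limited by a TTL field.
# The overlay network is represented as an adjacency list.

def flood(overlay, start, ttl):
    """Return the set of nodes the query reaches before the TTL expires."""
    seen = {start}
    frontier = [start]
    while frontier and ttl > 0:
        ttl -= 1  # each hop decreases the TTL by one
        nxt = []
        for node in frontier:
            for neigh in overlay.get(node, []):
                if neigh not in seen:  # do not process the same query twice
                    seen.add(neigh)
                    nxt.append(neigh)
        frontier = nxt
    return seen

overlay = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
```

With a TTL of 2, a query from A reaches B, C and D but never E, which is how the protocol restricts the flooding described above.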

Advantages

• A tracker is not required

• Open source

Disadvantages

• Consumes bandwidth due to flooding

• Difficult to restrict once implemented


3.3.2 DHT

Decentralized P2P systems are classified as structured or unstructured, where an unstructured system uses flooding techniques to retrieve information in the network. DHT is a structured system used to store and retrieve information about nodes in a network. There are many different DHT methods and they differ in how they route messages in the overlay network. DHT is basically a list that contains two columns, ID and value. The ID represents the hash value of the IP address for a specific node and the value corresponds to desired content. To make the system efficient a node needs to store information about log(N) other nodes, where N represents the total number of nodes in the network.[9]

BitTorrent DHT

To conduct a search without using a centralized tracker, the BitTorrent protocol needs to be complemented with DHT. BitTorrent uses a DHT variant called Kademlia, which has a binary tree structure. Each node is allocated a 160-bit ID and stores information about nodes at distance 2^i, where i is a number between 0 and 160. The ID space, 2^160, is divided into "buckets" that store 8 IDs each. Nodes are divided into good and bad nodes, where a node that does not respond to queries, e.g. a node behind a firewall, is typically a bad node. A node that has not responded to queries in 15 minutes is considered questionable since it may no longer be in the network. To find the status of a questionable node, a node from the bucket pings it. Every node strives to maintain a routing table (of the overlay network) of known, good nodes. If a bucket is full, no more IDs can be accepted, but if one node is known to be bad it can be replaced by a new, good node. A node starts with empty buckets and tries to fill them by searching for neighboring nodes.[9]
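
Kademlia's distance metric, the bitwise XOR of two node IDs, and the resulting bucket index can be sketched as follows. We shorten the IDs to 8 bits for readability; real Kademlia uses 160-bit IDs, but the logic is identical:

```python
def xor_distance(a, b):
    """Kademlia's distance metric: the bitwise XOR of two node IDs."""
    return a ^ b

def bucket_index(own_id, other_id):
    """Index i of the bucket covering distances in [2^i, 2^(i+1)):
    the position of the most significant differing bit.
    Returns -1 for identical IDs."""
    d = xor_distance(own_id, other_id)
    return d.bit_length() - 1

# With 8-bit IDs there are 8 buckets; a contact at distance
# 2^i .. 2^(i+1) - 1 from our own ID lands in bucket i.
```

This is why a node only needs to keep a logarithmic number of contacts: each bucket summarizes an exponentially larger slice of the ID space.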

Advantages

• A tracker is not required

• Open source

Disadvantages

• Content is available for a limited time

• Complex implementation

3.4 Semi-centralized structure

A semi-centralized structure uses several nodes to coordinate traffic. When compared with a centralized structure, the biggest difference is that the semi-centralized structure does not treat all nodes equally. Nodes with very good performance will have greater responsibility.


3.4.1 FastTrack

The FastTrack protocol is a semi-centralized structure where nodes are divided into the following:

• Supernodes, which handle the communication in the network since they have high bandwidth or good computational capability

• Ordinary nodes, which do not qualify as supernodes since they have low bandwidth or poor computational capability

Supernodes are linked together to form a large overlay network. A supernode is in charge of several ordinary nodes and acts as a hub. Ordinary nodes do not initially communicate with each other; they communicate only with supernodes. When an ordinary node performs a search it sends a query to the responsible supernode. This supernode forwards the query to other supernodes in the network. Each supernode then forwards the query to "their" ordinary nodes. The propagation of the query is limited by integrating a TTL field in the protocol. When a target node is found, the reply is sent directly to the originating node instead of traveling the same path once again. The FastTrack protocol is used by several existing clients such as KaZaa and Morpheus.[10]

Advantages

• A tracker is not required

• Limited flooding

Disadvantages

• Closed source

• Requires nodes with good performance

3.5 Discussion

We present a discussion regarding the requirements specified in the previous chapter and conclude which P2P structure is best suited.

The first requirement was to make sure the operator has full control of traffic in the network, which basically rules out the decentralized and semi-centralized structures. Further, a requirement stated that the bandwidth consumption should be limited and as already stated, the decentralized and semi-centralized structures use flooding to perform a search. We believe that a centralized structure is the best option due to the given requirements. The remaining two requirements are used to identify which protocol to use among the centralized protocols.

We reason as follows: the evolution of centralized protocols has led to BitTorrent. In earlier protocols, such as Napster, the user had to download all of the content before being able to share. With BitTorrent, peers can share content they possess; thus it is not necessary for all of the content to be downloaded. According to [11], which is a study of a tracker log, more than 180 000 peers participated during five months. The authors claim that the throughput for all simultaneously active clients was 800 MB/s, which shows that BitTorrent has good scalability. BitTorrent is also open source, which was a requirement to help the operator keep licensing costs down, but it also lets us examine the protocol in detail to understand it better. Finally, BitTorrent enables distribution of any kind of content, unlike Napster, which only enables music sharing. When all this is put together, we conclude that a BitTorrent client is best suited for IPTV environments.


Chapter 4

BitTorrent

This chapter gives a detailed description of the BitTorrent components and algorithms, as well as how we implemented a BitTorrent client on an IP-STB.

4.1 BitTorrent components

This section presents the basic components found in BitTorrent.

• Piece = part of the content (pieces are of equal length)

• Metainfo file (.torrent file) = a text file that contains the IP address of the tracker and hash values of all pieces

• Tracker = the central node that coordinates traffic by maintaining a list of peers and their current status

• Swarm = similar to a network where peers share pieces of the same .torrent file among each other

• Seeder = a peer that has completed the download and continues to share the content

• Leecher = a peer that has not yet completed the download

4.2 An overview of how BitTorrent works

This section explains the five steps in figure 4.1, which illustrates how BitTorrent works.

1. The metainfo file is usually published on a web page.

2. The client needs to download the metainfo file to be able to contact the tracker.


3. The client contacts the tracker since it is interested in retrieving a list of active peers.

4. The tracker responds with a list of 50 (default) active peers that have been selected at random among all peers interested in the same .torrent file.

5. The client connects to peers with an initial handshake. A TCP session is then initiated and data is exchanged between peers. The client tries to stay connected to 20-40 (default) peers at a time. If the client fails in maintaining a connection to 20 peers it reconnects to the tracker to receive additional peers.[7]

Figure 4.1. Illustration of BitTorrent.

4.3 Peer selection

This section presents the peer selection strategies that BitTorrent relies on. A block is a fraction of a piece; thus each piece consists of several blocks.

• Random first policy

The random first policy is applied when a client joins a swarm and has less than four pieces. The aim is to permit a peer to download four pieces as fast as possible. When the peer has downloaded four pieces it switches to the rarest first algorithm.


• Following blocks policy

Once a block has been requested, the remaining blocks from that particular piece are requested before blocks from any other piece. The aim is to get a complete piece as soon as possible.

• Rarest first algorithm

The goal of the rarest first algorithm is to improve the piece diversity in the peer set, but the algorithm only gives a local view. This means that the rarest piece in a subset might be the most popular in the whole swarm; since only a local view applies, the subsets do not have any knowledge about the overall topology of the BitTorrent network.

• End-game mode

At the very end of a download the peer requests the last blocks from all peers in the swarm but they are downloaded from the peer that responds fastest. [11]
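
As an illustration of the rarest first algorithm, the following sketch counts how many peers in the local peer set advertise each piece and picks the least replicated piece the client does not yet have. The data structures are our own simplification; a real client works on the bitfield and have messages of the wire protocol:

```python
from collections import Counter

def rarest_first(have, peer_bitfields):
    """Pick the piece to request next: the one advertised by the fewest
    peers in the local peer set, among pieces we do not yet have.
    `have` is a set of piece indices; `peer_bitfields` is a list of sets,
    one per connected peer."""
    counts = Counter()
    for bitfield in peer_bitfields:
        counts.update(bitfield - have)  # only count pieces we still need
    if not counts:
        return None  # nothing left to request from this peer set
    # least replicated first; ties broken by piece index for determinism
    return min(counts, key=lambda p: (counts[p], p))

# Piece 0 is held by all three peers, pieces 1 and 2 by two peers each.
peers = [{0, 1, 2}, {0, 1}, {0, 2}]
```

Note that the choice is made from the local peer set only, which is exactly the "local view" limitation discussed above.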

4.4 Advantages with BitTorrent

When many nodes are interested in the same file simultaneously a flash crowd is created, which puts a huge load on the server in the client-server model. BitTorrent works well for distributing files that are popular during a limited time span and performs better the more nodes that participate. BitTorrent incorporates some fairness by letting those who upload fast download fast as well. The fairness means that peers with similar conditions share content with each other and therefore no peer will act as a bottleneck. The protocol encourages downloading the rarest pieces first to prevent those from being lost in the swarm. The content is verified at each peer by computing the hash values, which makes it harder to spoil. BitTorrent provides high integrity since the protocol is centralized, which means that users do not receive more information than necessary about other peers in the swarm. The tracker, on the other hand, stores information that makes monitoring easy for an administrator.

4.5 Disadvantages with BitTorrent

The life of a torrent file is short and as a result it can be hard to find peers that are willing to share content. Since seeding is an individual and active choice, many users choose not to seed when the download is completed, which makes the swarm less efficient. In other systems, like Gnutella, the user shares a folder and all its content with other peers; the files are therefore available for a longer time. Piece distribution is a problem in BitTorrent since peers join and leave the network often. BitTorrent does not attempt to satisfy real-time delivery since it involves trading with pieces.


4.6 Implementation

It was decided at the beginning of this master's thesis that a proof of concept should be developed. As mentioned earlier in the report, we chose to use an existing BitTorrent client. Since the IP-STB has limited resources, the client should meet the following requirements:

• Small and efficient since the IP-STB has limited resources

• Open source to be able to use existing code

• Written in C/C++ since the IP-STB supports C/C++

• Primitive GUI (Graphical User Interface) due to the performance restrictions of the IP-STB

4.6.1 Available clients

We found two suitable clients: rTorrent and BTPD. Both are written in C++ and would thus probably work on the IP-STB; the difference lies in the number of dependencies and features.

rTorrent

rTorrent seemed to suit our purpose well since it is supposed to be efficient according to [12]. The necessary libraries are shown in figure 4.2.

Figure 4.2. rTorrent dependencies.

• libTorrent is a BitTorrent library written in C++ for Linux and Unix [12]

• libCurl is a URL (Uniform Resource Locator) transfer library, supporting HTTP (Hypertext Transfer Protocol) among others [13]

• nCurses is a programming library for writing text user interfaces in a terminal-independent manner [14]


BTPD

This client has only one library dependency, which is shown in figure 4.3. It can be run as a console program through a terminal.

Figure 4.3. BTPD dependency.

• OpenSSL is a library that implements the basic cryptographic functions and provides various utility functions [16]

4.6.2 Compilation

These necessary steps were performed to get the BitTorrent client running on an IP-STB:

1. Cross-compiling dependencies for the application.

2. Cross-compiling the application.

3. Building an installation package of the application.

4. Building a new boot image with the installation packages.

Cross compiling means creating executable code for a platform other than the one running the cross compiler. A cross compiler is often used to generate executable code for embedded systems, since these cannot compile on the target platform due to limited resources. Cross compiling is also used to support several different versions of an operating system, since a single build environment can be set up to compile for different targets.

GCC (GNU Compiler Collection) was used for cross compiling the clients and libraries. We had to cross compile the libraries for each client before we could cross compile the clients. The next step was to create a new installation package for each client containing the necessary libraries and the client itself. Finally, we created a new boot image with the new installation packages. Upon rebooting the IP-STB, the new boot image was installed.


Chapter 5

Tests and results

This chapter presents a selection of the most important tests that were conducted. The aim was to determine which BitTorrent client is best suited for the IP-STB.

5.1 Introduction

We used an existing script to store instantaneous CPU and RAM utilization, which was later plotted as graphs. The script uses the TOP command in GNU/Linux, which provides information about currently running processes. We tried to keep all parameters fixed during the tests, so that both clients faced the same circumstances. This was easier said than done since BitTorrent finds peers in the swarm at random, meaning that each time we conducted a new test, new peers were selected by the tracker. This cannot be controlled, but we believe it does not matter as long as the throughput stays the same. We could have avoided this problem by using our own tracker, but in that case we would have needed at least 20 peers to imitate BitTorrent "in real life". This would have been very complicated and time consuming to implement, so the idea was dismissed at an early stage. The same .torrent file was used during all runs (Fedora Core 8) since it provided peers from the same swarm with high throughput.

Many tests have been conducted in five-minute segments. We did conduct tests during entire downloads but observed that the system behaves the same after a couple of minutes. This probably has to do with the random first policy applying in the early stage of the download; when the first four pieces have been downloaded, the rarest first policy applies until the end and the system behaves consistently. Running tests for longer periods would have required considerably more time.

Our intention was to evaluate the CPU and RAM utilization on the IP-STB by:

• Limiting the download speed

• Limiting the number of peers

(44)

• Restricting the buffer size

• Watching content while downloading

• Limiting the upload speed

The graphs presented in this chapter are meant to give a hint of the differences and similarities between the clients. The IP-STB used throughout the tests was equipped with a 300 MHz CPU and 128 MB RAM. When idling, CPU utilization was 8% and RAM utilization 64 MB, but on load, such as when watching content, CPU and RAM utilization increase.

MV refers to mean value and Var refers to variance. These values have been calculated to facilitate the comparison of graphs. MB and KB refer to megabytes and kilobytes, respectively.

5.2 Limiting the download speed

5.2.1 rTorrent

rTorrent utilizes 3 MB of RAM when idling, but the RAM utilization increases on load. When the speed is increased to 300 KB/s we can clearly see that the CPU utilization increases as well, sporadically reaching 100%, which might disrupt the TV experience. While studying the RAM utilization, we discovered a triangular pattern, which we believe corresponds to data from the rTorrent buffer being stored on the hard drive every 30 seconds. Figures 5.1, 5.2 and 5.3 show that the CPU utilization increases with speed. The high RAM utilization could, however, become a problem later on.


Figure 5.2. rTorrent - speed 300 KB/s (MV=15,2%, Var=11,5%)


5.2.2 BTPD

Figure 5.4. BTPD - speed 100 KB/s (MV=13,0%, Var=13,0%)


Figure 5.6. BTPD - speed 600 KB/s (MV=49,9%, Var=16,3%)

The RAM utilization is much lower in BTPD compared to rTorrent. The CPU utilization, on the other hand, is slightly higher since BTPD stores data continuously on the hard drive. The RAM utilization remains the same while the CPU utilization increases with speed. The CPU utilization reaches an MV of 50% when the speed is increased to 600 KB/s, which can be considered decent since the IP-STB has limited resources. According to figures 5.4, 5.5 and 5.6, BTPD performs much better than rTorrent in RAM utilization, while rTorrent is slightly better when it comes to CPU utilization.

5.3 Limiting the number of peers

5.3.1 rTorrent

When restricting the number of peers to ten, rTorrent allocates 3 MB of RAM upon startup but utilizes it in a triangular pattern later on, as already stated. We believe this behavior is implemented in the client, since we experienced the same phenomenon when running this particular BitTorrent client on a PC. By limiting the number of peers, it takes more time for a client to receive the first four pieces during the random first policy since there are fewer nodes willing to share. We noticed that the speed is not affected; it is just a matter of time before the client reaches the same speed as in previous tests. Thus, limiting the number of peers does not affect the client in any way.


Figure 5.7. rTorrent - 10 peers (MV=18,0%, Var=15,2%)

5.3.2 BTPD

BTPD allocates 800 KB of RAM upon startup and the highest RAM utilization is 1,2 MB, as shown in figure 5.8.


5.4 Restricting the buffer size

5.4.1 rTorrent

Figure 5.9. rTorrent - buffer size 4 MB (MV=5,1%, Var=3,4%)


Since rTorrent utilizes a lot of RAM, it may result in a disrupted TV experience or, in the worst case, a crash. Fortunately, rTorrent has the possibility to predefine the buffer size (RAM allocation). The buffer size was first set to 4 MB as shown in figure 5.9, which means that 1 MB is allocated to each connection since BitTorrent strives to download from the four best peers at the same time. The buffer size was then changed to 16 MB, which means that 4 MB are allocated to each connection, and figure 5.10 shows that the RAM allocation does not exceed 16 MB. Restricting the buffer size leads to a lower throughput since less data can be stored in the buffer.

5.4.2 BTPD

According to figure 5.6, when the speed was restricted to 600 KB/s, BTPD did not have rTorrent's problems with high RAM utilization. Therefore there was no need to restrict the buffer size to see how it affected the client.

5.5 Limiting the upload speed

We wanted to evaluate how the two clients perform when uploading/seeding content. The upload rate was restricted to 400 KB/s.

5.5.1 rTorrent

rTorrent has high CPU and RAM utilization when downloading content but this is not the case when uploading/seeding content as shown in figure 5.11.


5.5.2 BTPD

BTPD has a slightly higher CPU utilization than rTorrent but the RAM utilization is lower as shown in figure 5.12.

Figure 5.12. BTPD - speed 400 KB/s (upload) (MV=8,8%, Var=5,4%)

5.6 Watching content while downloading

We wanted to investigate the effects of the CPU peaks and the high RAM utilization that we discovered in rTorrent, to evaluate how they may affect the TV experience. We multicasted both HD (High Definition) and SD (Standard Definition) content to the IP-STB. The download speed was restricted to 400 KB/s.

5.6.1 rTorrent

Figure 5.13 shows the system (represented by the red line) and client (represented by the green line) CPU utilization when downloading content while watching an SD-stream.

Downloading content with rTorrent at 400 KB/s while playing an HD-stream puts a big load on the IP-STB. The system crashed due to high RAM utilization, as shown in figure 5.14.


Figure 5.13. rTorrent - speed 400 KB/s with SD-stream (MV=66,8%, Var=23,5%)

Figure 5.14. rTorrent - speed 400 KB/s with HD-stream (MV=82,8%, Var=16,2%)

5.6.2 BTPD

BTPD performed better than rTorrent in this test since it did not crash and had overall better performance in this specific scenario.


Figure 5.15. BTPD - speed 400 KB/s with SD-stream (MV=73,0%, Var=16,0%)


5.7 Conclusions

We have studied two BitTorrent clients in detail: rTorrent and BTPD. rTorrent has higher RAM utilization since it makes use of a buffer implemented in RAM, while BTPD has low RAM utilization since it stores data on the hard drive continuously. Such an approach implies higher CPU utilization, but the overall conclusion based on the tests and our own impressions is that BTPD is better suited for an IP-STB. Low RAM utilization is more important, since high RAM utilization may lead to a reboot of the operating system. According to figure 2.10, the IP-STB has numerous services running in the background that consume CPU and RAM on load. When uploading content, the two clients performed equally. Our verdict is that BTPD is best suited for the IP-STB, and to combine a pleasant TV experience with running BTPD we recommend a download rate of 400 KB/s. This gives sufficient margin to other processes on the IP-STB without a high CPU utilization.


Chapter 6

Conceptual design

The aim of this chapter is to present a concept regarding how to share recorded content among IP-STBs. The concept does not include detailed information such as what protocols to use.

6.1 Introduction

As mentioned earlier, content is encoded as MPEG-TS before being multicast to the IP-STBs, which means that it does not arrive at all IP-STBs in the network at the same time, due to delays. Imagine that two users start/stop recording the same content at exactly the same time; they would still obtain different parts of the content. We need to make sure that content is synchronized to specific positions when being recorded.

The idea is to count the PES-packets since they are time-invariant. PES-packets are numbered sequentially at the encoder and will arrive in the same order as they were numbered. We assume that the recorded content is stored as MPEG-TS on the hard drive.

6.2 Synchronization

Video, audio and other data are spread out over different PES-packets during transport but are combined at the decoder according to their PTS values. Each PES-packet is split into one or many TS-packets, and each new PES-packet starts with a header. When the user starts recording, the recording must be synchronized to the following PES-header; when the user stops recording, it is synchronized to the previous PES-header. The advantage is that hosts only store whole PES-packets, which become the common denominator. The disadvantage is that TS-packets are dropped if entire PES-packets are not received.
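
As a simplified sketch of this synchronization rule, the following code scans a stream for the PES start-code prefix (0x00 0x00 0x01) and snaps the start and stop positions to PES-headers. A real implementation would instead parse the TS-headers and use the payload_unit_start_indicator flag; the dummy stream below is invented for illustration:

```python
PES_PREFIX = b"\x00\x00\x01"  # start-code prefix that opens every PES-header

def pes_header_offsets(stream):
    """Byte offsets at which a PES-header begins."""
    offsets, pos = [], stream.find(PES_PREFIX)
    while pos != -1:
        offsets.append(pos)
        pos = stream.find(PES_PREFIX, pos + 1)
    return offsets

def snap_recording(stream, start, stop):
    """START snaps forward to the next PES-header and STOP snaps back to
    the previous one, so every host stores whole PES-packets only."""
    offsets = pes_header_offsets(stream)
    snapped_start = min((o for o in offsets if o >= start), default=None)
    snapped_stop = max((o for o in offsets if o <= stop), default=None)
    return snapped_start, snapped_stop

# Three dummy PES-packets of 9 bytes each, starting at offsets 0, 9 and 18.
stream = (PES_PREFIX + b"\xe0aaaaa") + (PES_PREFIX + b"\xe0bbbbb") + (PES_PREFIX + b"\xe0ccccc")
```

Two hosts that press START/STOP at slightly different byte positions within the same PES-packets thus end up storing identical ranges, which is what makes the packets exchangeable later.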


6.3 Content distribution

We propose to let the encoder map start/stop-PES values by looking at the EPG (Electronic Program Guide) and the time. The central node is then notified and stores the values as reference points. If the EPG states that a TV-show starts at 10.00 and ends at 11.00, the encoder will use the start time to map the beginning of the TV-show to a start-PES and the end time to map the end of the TV-show to a stop-PES. This cannot be done in advance since the PES-packets are not known ahead of time; it needs to be done while the content is encoded to MPEG-TS.

The next step involves transmitting the start/stop-PES values to the IP-STBs, and we propose using parts of BitTorrent to obtain a suitable solution. When a host stops recording, information about the range of stored PES-packets is sent to the central node. The central node stores this information in a database together with the host's IP address. When another host requests pieces, the central node returns a list. This can be solved in two different ways:

• The central node returns a list of random hosts that are sharing PES-packets for the requested content. This approach puts more load on the IP-STB since it must contact all nodes to verify which pieces are available before the download can start.

• The central node returns a list of nodes that are sharing PES-packets in the specific range requested by the host. This approach puts more load on the central node since it needs to verify restrictions before being able to return the list.

Both scenarios involve nodes creating an overlay structure when exchanging pieces.

6.4 Detailed overview

This section presents useful comments to figure 6.1.

1. The encoder maps start/stop-PES values by looking at the EPG and the time. It notifies the central node, which stores these values as reference points.

2. Hosts A and B press START RECORDING during PES-packets 1 and 73, but as already stated the recording will not start until the next PES-header arrives. The hosts press STOP RECORDING during PES-packets 527 and 625, which results in the recording being chopped at the previous PES-header.

3. Host C is interested in content that was aired yesterday between 7.00 and 7.30. The central node is contacted to get hold of the necessary PES range. The central node returns the following list as shown in figure 6.2.

4. Host C contacts the hosts A and B regarding the PES-packets that are in turn transferred between the hosts as illustrated in figure 6.3.


Figure 6.1. Illustration of the concept.

HOST  PIECES
A     2-526
B     74-624

Figure 6.2. Information stored at the central node.


Chapter 7

Future work

This chapter presents upcoming P2P solutions that could be integrated in an IPTV environment.

7.1 Network coding

This section presents network coding and how it can be used in IPTV environments. In order to do so, some light needs to be shed on basic notions in network coding.

7.1.1 Introduction

A communication network is represented by a directed acyclic graph, G = (V, E), as shown in figure 7.1. The graph consists of a vertex set V where

• s represents the source node

• i represents a subset of intermediate nodes

• t represents a subset of sink nodes

The vertices in V are connected by edges with a specific capacity. The flow on an edge cannot exceed its capacity.[17]

7.1.2 Max-flow min-cut theorem

A cut is achieved by partitioning the nodes in the graph into two disjoint subsets, S and T, such that the source s is in S and the sinks t are in T. The minimal cut is the cut whose capacity is minimal among all possible cuts. The maximal flow maximizes the flow into the sink. The max-flow min-cut theorem states that the maximum flow of a graph is equal to the capacity of the minimal cut in the graph.

The butterfly network in figure 7.1 shows that the max-flow min-cut is two. When using multicast to send two bits, b1 and b2, from the source to the sinks, the max-flow is 1,5 bits/s and thus the max-flow for each individual connection is not achieved, as illustrated in figure 7.2.


Figure 7.1. A butterfly network where max-flow min-cut is two.

Figure 7.2. Max-flow min-cut is not achieved when using multicast.

When using a routing scheme the nodes only forward information but nodes can actually be used more efficiently. Network coding takes advantage of nodes as encoders and makes it possible for nodes to recombine input packets into one or several output packets.[18]

Figure 7.3 illustrates how network coding works. As already established, multicast achieves a total link utilization of 1,5 bits/s, but with network coding a link utilization of two bits/s is achieved. Instead of letting i3 forward b1 or b2, these two bits are combined into a new entity with an exclusive-OR operation. Thus the links (i3, i4), (i4, t1) and (i4, t2) carry a linear combination of the original entities. The sinks are able to restore the original packets since they receive all necessary packets.[17]
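
The exclusive-OR step can be demonstrated directly: the combination forwarded by i3 lets each sink recover the bit sequence it did not receive on its direct link. The byte strings below stand in for the streams b1 and b2:

```python
def xor(a, b):
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

b1 = b"\xaa\x01"  # reaches sink t1 on its direct link
b2 = b"\x55\xff"  # reaches sink t2 on its direct link

combined = xor(b1, b2)  # what i3 forwards through i4 to both sinks

# t1 holds b1 and the combination, so it recovers b2; t2 is symmetric.
recovered_b2 = xor(b1, combined)
recovered_b1 = xor(b2, combined)
```

Because XOR is its own inverse, no extra information is needed at the sinks, which is exactly why the bottleneck link (i3, i4) can serve both sinks at once.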


Figure 7.3. The advantage of network coding.

7.1.3 Centralized and decentralized network coding

There are two different approaches in network coding. The first is called random network coding, or decentralized network coding, since no knowledge of the topology is required. The advantage of random network coding is that coding vectors are selected randomly from a finite field; the disadvantage is that linear dependencies may occur. The probability that two nodes with the same set of blocks pick the same set of coefficients, and thereby generate linearly dependent vectors, depends on the size of the field that the coding vectors are selected from. The other approach is centralized network coding, which is similar to random network coding except that a central node keeps track of each node and its coding vectors. The advantage is a much smaller probability of linear dependencies; the disadvantage is that a dedicated node is required. Each new node that enters the network needs to contact the central node to register its coding vectors.[18]

Encoding

The original content with length L is segmented into n blocks, b = [b_1, b_2, ..., b_n], of equal size s. The last block is padded with additional zeros if necessary. The source chooses among all the blocks, while other nodes randomly choose m blocks, n = [n_1, n_2, ..., n_m], out of all blocks received so far, as well as m coding coefficients, c = [c_1, c_2, ..., c_m], in the finite field GF(2^s). The node then creates a coded block, x = Σ_{i=1}^{m} n_i · c_i, which has the same size as the original blocks thanks to the linear operations. The node passes on the coding coefficients and the encoded data to other nodes. These nodes perform a linear combination of the content they currently store before passing on content.[19]
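
As a reduced sketch of this encoding step, the following code works over the smallest field, GF(2), where each coefficient is a single bit and the linear combination collapses to an XOR of the selected blocks. Practical systems use larger fields such as GF(2^8) to keep the probability of linear dependencies low; the blocks and coefficients below are invented for illustration:

```python
def encode(blocks, coefficients):
    """Coded block x = sum(c_i * b_i) over GF(2): the XOR of all blocks
    whose coefficient is 1. Returns (coefficients, coded_block); both
    must travel together so a receiver can later decode."""
    size = len(blocks[0])
    coded = bytearray(size)
    for c, block in zip(coefficients, blocks):
        if c:  # in GF(2) a coefficient is either 0 or 1
            for i in range(size):
                coded[i] ^= block[i]
    return coefficients, bytes(coded)

blocks = [b"\x0f\x0f", b"\xf0\xf0", b"\xff\x00"]
coeffs = [1, 1, 0]          # combine the first two blocks only
_, coded = encode(blocks, coeffs)
```

A receiver that collects enough coded blocks with linearly independent coefficient vectors can recover the original blocks by Gaussian elimination over the same field.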

References
