Investigation of End User IPTV Quality for Content Delivered in MPEG-2 Transport Stream


MEE 10:10


Syed Fakhar Uz Zaman Gillani

School of Computing

Blekinge Institute of Technology Karlskrona, Sweden


Supervisors:

Andreas Ekeroth Research Engineer

Multimedia Technologies Ericsson Research

Luleå, Sweden

&

Elin Johansson Research Engineer

Multimedia Technologies Ericsson Research

Luleå, Sweden

Examiner:

Dr. Markus Fiedler Associate Professor School of Computing

Blekinge Institute of Technology Karlskrona, Sweden

Sponsor:

Multimedia Technologies Ericsson Research

Sweden

Presented To:

School of Computing

Blekinge Institute of Technology

Karlskrona, Sweden


To,

“My Grandparents”


Acknowledgements

First of all I am grateful to almighty Allah the greatest of all. Then I would like to thank my supervisors at Ericsson, Andreas Ekeroth and Elin Johansson for their guidance, encouragement, patience and support throughout this thesis work. Next, I would like to thank Stefan Håkansson LK, Manager Application Frameworks and Service Assurance Department, for providing me the opportunity to explore this endless voyage of knowledge.

Many thanks to Martin Pettersson who was always there to take some time out of his busy schedule and guide me whenever I needed. I also want to extend my gratitude to David Lindegren for his valuable assistance in the parametric model.

Special thanks to my supervisor at BTH, Dr. Markus Fiedler, who took a keen interest in the research and shared his vast experiences making the whole work possible.

I especially appreciate the sincere and unreserved support of Syed Muhammad Ali and Syed Majid Ali Shah Bukhari throughout my stay in Sweden.

I am thankful to my colleague Nagarajesh Garapati for his support during the thesis work. Not to forget the efforts of my friend Mahboob ur Rehman who used to cook toothsome spicy meals which I am going to miss a lot.

Finally, special thanks to my parents and siblings, because I have been sustained during this journey on “a wing and a prayer”.

Syed Fakhar Uz Zaman Gillani

Stockholm, January 2010


Abstract

IPTV, or Internet Protocol Television, is a system in which digital television is delivered to the end user using the Internet Protocol. It relies on the same technologies that are used for computer networks and adds new possibilities, such as video on demand, on top of traditional broadcast TV. Video content for IPTV is typically compressed using MPEG-2 or H.264 (MPEG-4 Part 10) and sent in an MPEG-2 transport stream over IP. Since TV is a "real-time service", packets are delivered using a simple unreliable transmission model, and packets may arrive out of order or be lost. Lost packets are normally not retransmitted, since they would arrive too late to be useful; packet loss therefore decreases the perceived quality for the end user.

This Master's thesis investigates the parameters that affect the perceived quality of video delivered in an MPEG-2 Transport Stream (TS). The aim of the thesis is to develop an objective parametric model¹ that can estimate the perceived quality at the transport layer. The thesis work includes experiments performed on High Definition (HD) video sequences with various bit rates and packet loss ratios (PLR).

Keywords:

IPTV, QoE, PEVQ, Quality degradation.

1 Some of the details of Parametric Modeling in the thesis are not made public due to IPR restrictions from Ericsson Research.


Chapter 1: Introduction

As IPTV is one of the most rapidly growing broadband services, there is a need to provide high Quality of Service (QoS) along with an excellent user experience. From the service provider's perspective, service monitoring is the key to success. To keep users satisfied, the service provider must take measures to examine subscribers' experience of a particular service. The main objective of this thesis is to examine the end user's experience of IPTV quality under different combinations of bitrate and packet loss ratio, and to develop a parametric model² that can be used to estimate the perceived quality at the transport layer.

1.1 Scope of Thesis:

The demand for multimedia streaming has increased considerably in the past few years, so mobile operators and Internet service providers (ISPs) need to monitor their services in order to know how satisfied their consumers are. Among other approaches, a parametric model can be used to determine the customer's experience of a certain service. To develop and train such a model, one must first know the users' experience of different video sequences and then train the model on that feedback. The thesis work starts with finding open source tools to encode, decode, packetize and depacketize an MPEG-2 transport stream. Once the software tools are ready, the experimentation process starts, which includes simulation of packet loss in the MPEG-2 transport stream. The degraded video sequences are then evaluated by objective testing in order to obtain an objective estimate of the user experience. Finally, the parametric model is developed by correlating measurable parameters with the objective test results.

1.2 Research Questions:

This thesis answers the following research questions in detail.

• Which open source tool is useful to encode, decode, packetize and depacketize the TS?

• What useful information can be extracted from the TS header?

• How do packet loss and bitrate affect perceived video quality in MPEG-2 TS?

• How can a parametric model that estimates the objective video quality at the transport layer be specified?


1.3 Related Work:

In [1] the authors developed an objective model that estimates perceived video quality for IPTV by taking codec, bitrate, packet loss ratio and buffering events into consideration. The experiments were performed on Standard Definition (SD) video sequences with relatively low bitrates. The experimental results show good model performance, with a correlation of 0.93 and an RMSE of 0.33 between the subjective tests and the model. The parametric model generates an average score which depends on the video content for transport quality and is the key to service quality monitoring.
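The two performance figures quoted for [1], correlation and RMSE, can be computed as in the following minimal sketch. The score lists are made-up illustrative values, not data from [1] or from this thesis, and the function names are hypothetical.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rmse(x, y):
    """Root mean square error between two equally long score lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Illustrative values only: subjective MOS vs. scores predicted by a model.
subjective = [1.8, 2.5, 3.1, 3.9, 4.4]
predicted = [2.0, 2.4, 3.3, 3.7, 4.5]
print("correlation:", round(pearson_correlation(subjective, predicted), 3))
print("RMSE:", round(rmse(subjective, predicted), 3))
```

A correlation close to 1 together with a small RMSE indicates that the model follows the reference scores both in trend and in absolute value.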

In [2] the authors discussed and then developed an objective quality assessment approach for quality management in IPTV at the user's premises. The proposed approach used content-dependent bit stream layer information. The resulting model was tested and verified with a good RMSE and correlation.

The authors of [3] proposed a packet-layer parametric model for IPTV based on the ITU-T G.1070 video quality model, in which different quality parameters can be adjusted. They enhanced the ITU-T G.1070 model by introducing metrics for packet loss bursts. The experimental results show that the model's video quality estimates correlate well with the subjective tests.

1.4 Outline:

Chapter 2 presents the relevant background on video coding, H.264, MPEG-2 transmission, QoE and packet loss in IPTV.

Chapter 3 explains the methodology of the experiments, covering the theoretical details of the experiments, FFmpeg, the packet loss model and the objective testing tool PEVQ.

Chapter 4 presents the analysis of the results from objective testing and their implications for video quality.

Chapter 5 covers the development of the parametric model³, its performance and an analysis of the achieved results.

Chapter 6 presents the conclusions and suggests future research areas relevant to this thesis work.


Chapter 2: Background

This chapter gives a brief introduction to frequently discussed topics and contains preliminary material that the reader is invited to go through in order to fully understand the thesis. Please note that the IPTV architecture is not within the scope of this thesis.

2.1 IPTV:

Internet Protocol Television (IPTV) is television delivered by service providers over managed packet-switched networks. The service reaches the TV set through a Set Top Box (STB) placed between the TV and the broadband connection. It includes video streams (TV programmes, songs, movies, etc.) stored on a video server, live video streams, and a variety of interactive applications. The International Telecommunication Union Focus Group on IPTV (ITU-T FG IPTV) defines IPTV as

“Multimedia services such as television/video/audio/text/graphics/data delivered over IP based networks managed to provide the required level of quality of service and experience, security, interactivity and reliability” [4].

IPTV differs from Internet TV in several ways: Internet TV is simply video streaming over the public Internet, while IPTV is video streaming via a service provider's managed network. Internet TV can also be viewed on television sets, but in that case QoS is not controlled as it is by the service providers in IPTV. Another emerging technology is Triple Play, which includes IPTV, VoIP and broadband Internet provided over a single local loop. If wireless mobility is added to these services, it becomes Quadruple Play. Figure 1 shows the expected growth in broadband traffic, which will mostly be driven by IPTV [5].

Figure 1: Dramatic increase in broadband traffic (in particular IPTV) [5].

As IPTV is transmitted over IP, the video must be compressed prior to transmission, depending on the available bandwidth. The choice of compression technique is pivotal, because it determines the quality of the video content viewed by the user. Researchers (under the umbrella of ITU-T) have found MPEG compression to be the best available solution for this challenge. ITU-T has standardized H.264, which is equivalent to MPEG-4 Part 10 as standardized by the Moving Picture Experts Group (MPEG). H.264/MPEG-4 is used for HDTV, while MPEG-2, being an older codec standard, is used for SDTV.

2.2 Fundamentals of video coding:

MPEG-2 describes how audio, video and their associated data are compressed and then multiplexed together for digital transmission and storage. It is mainly used for Digital Video Broadcasting (DVB), Digital Versatile Disc (DVD) and Video Compact Disc (VCD). The maximum and minimum data rates supported by MPEG-2 are 10.08 Mbit/s and 300 kbit/s respectively. The International Organization for Standardization (ISO) has also accepted MPEG-2 as a standard and has divided it into seven parts, which correspond to its properties under different conditions. The following paragraphs explain video coding in detail.

Video stream:

A video stream starts and ends with a starting and an ending sequence header respectively; between the sequence headers there are one or more groups of pictures (GOP).

Group of Pictures (GOP):

A GOP contains a set of pictures grouped in a way that allows random access into the sequence.

Picture:

A picture consists of three rectangular matrices representing luminance (Y) and two chrominance components (Cb and Cr). The Y matrix has an even number of rows and columns, while the Cb and Cr matrices are half the size of the Y matrix in each dimension.

Slice:

A slice is the basic unit of the coded data. It is important for error correction because if an error occurs the decoder can skip the current slice and start with the next.

Macroblock:

A macroblock is the basic coding unit in the MPEG algorithm. It covers a 16x16 pixel region of a frame. Since the luminance component has twice the vertical and horizontal resolution of each chrominance component, a macroblock is made up of four Y blocks, one Cb block and one Cr block.


Block:

A block is the basic coding unit in intra-frame coding. It contains Y, Cr or Cb samples and its size is 8x8 pixels.

Discrete Cosine Transform (DCT):

Each block is coded using the Discrete Cosine Transform. The DCT is applied to every elementary block in the picture and produces an 8x8 matrix of coefficients. The top-left value of the matrix, known as the DC coefficient, represents the average luminance or chrominance value of the block. The coefficient values normally decrease towards the bottom right of the matrix, where they are negligible or zero. If the luminance or chrominance is homogeneous throughout the block, all values are zero except the DC coefficient.
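The homogeneous-block property can be checked numerically. The sketch below is a plain, unoptimized 2-D DCT-II with orthonormal scaling, written only for illustration; it is not the transform implementation of any particular encoder. With this scaling the DC coefficient equals eight times the block average, so it is proportional to the average rather than equal to it.

```python
import math

N = 8  # block size used in MPEG coding


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of lists."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


# A homogeneous block: every sample has the same value.
flat_block = [[128] * N for _ in range(N)]
coeffs = dct_2d(flat_block)
print("DC coefficient:", round(coeffs[0][0], 6))
largest_ac = max(abs(coeffs[u][v])
                 for u in range(N) for v in range(N) if (u, v) != (0, 0))
print("largest AC magnitude:", largest_ac)
```

For the flat block only the top-left coefficient is non-zero (up to floating-point rounding), which is why smooth image regions compress so well.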

Zigzag scan:

The zigzag scan reads all the coefficients of the matrix except the DC coefficient. This ordering maximizes the efficiency of Huffman coding and run-length coding, which are coding techniques used in MPEG coding.
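As an illustration, the classic zigzag order for an 8x8 block can be generated by walking the anti-diagonals in alternating directions. The helper names below are hypothetical, and the example coefficients are invented; the point is only that the scan places the low-frequency values first and leaves a long run of zeros at the end for run-length coding.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zigzag order."""
    positions = [(r, c) for r in range(n) for c in range(n)]
    # Sort by anti-diagonal; odd diagonals run top-right to bottom-left,
    # even diagonals run bottom-left to top-right.
    return sorted(
        positions,
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def zigzag_ac(block):
    """Scan the AC coefficients of a block in zigzag order (DC is skipped)."""
    order = zigzag_order(len(block))
    return [block[r][c] for r, c in order[1:]]


# Invented example: a DC value and two low-frequency AC values.
block = [[0] * 8 for _ in range(8)]
block[0][0] = 1024   # DC coefficient, handled separately
block[0][1] = 30
block[1][0] = -12

ac = zigzag_ac(block)
print(ac[:5])
print("zeros in the scan:", sum(1 for value in ac if value == 0))
```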

Types of Pictures:

In the MPEG algorithm there are three main types of pictures [6]:

a) Intra Pictures
b) Predicted Pictures
c) Bidirectional Pictures

I-Pictures:

I-pictures, commonly known as I-frames, play a key role in video sequences. I-frames are coded using only information present within the picture itself; all macroblocks in an I-frame are coded without prediction.

P-Pictures:

P-pictures, also known as P-frames, are coded with respect to the previous I-frame or P-frame, as can be seen in Figure 2. Just as P-frames use I-frames as a reference for prediction, P-frames can themselves serve as a reference for predicting B-frames and later P-frames [7]. P-frames achieve a higher compression ratio than I-frames because they use motion compensation for moving pictures.

B-Pictures:

B-pictures are bidirectionally predicted pictures, which use previous as well as future pictures for prediction. They offer the highest degree of compression of the three picture (frame) types, but also require the longest compression time. B-frames are never used as a prediction reference.


Figure 2: I-frames, P-frames and B-frames

2.3 H.264:

H.264, also known as MPEG-4 Part 10, is a video compression scheme that is part of the Moving Picture Experts Group (MPEG) family of standards. It is gaining popularity in video broadcasting, video conferencing and many other broadband-based video services because of its better video quality at relatively low bitrates, its support for higher-resolution or high definition (HD) video delivery, and its lower storage requirements for high quality video content [7]. Figure 3 shows the efficiency of H.264 in comparison to MPEG-2: at similar bitrates, H.264 can achieve twice the quality and resolution of MPEG-2.

Figure 3: Comparison of MPEG2 and H.264 [7].

MPEG-2 is the most widely used compression scheme among television broadcasters today. In the future, however, digital TV will be delivered using H.264 compression because of its lower bandwidth consumption and higher quality and resolution compared to MPEG-2. In general, H.264 is one of several video compression schemes included in MPEG-4 [7].

2.4 MPEG-2 Transmission:

MPEG-2 content is delivered using one of two multiplexing schemes: Transport Stream (TS) and Program Stream (PS). TS is mainly used for broadcasting, while PS is mostly used for video storage. A PS can carry only one program at a time, whereas a TS can carry multiple programs. For example, in satellite TV broadcasting only a single stream is broadcast, which nevertheless contains multiple TV channels; this is only possible with TS. As TS is our main concern, we will examine it in detail, but first we introduce some basic terminology used in connection with the Transport Stream.

2.4.1 Elementary Stream:

An Elementary Stream (ES) consists of digital audio, video and data. For television broadcast or video storage these three components are multiplexed together to form an ES. In addition, related information can be multiplexed into the ES, such as time stamps (for synchronized replay of the ES at the decoder) and tables which may include service information of network parameters [8].

2.4.2 Packetized Elementary Stream:

A Packetized Elementary Stream (PES) contains all the data from the ES, divided into packets; these packets, combined with a data envelope, then form a TS. A detailed overview of the process is shown in Figure 4. One of the major advantages of a PES over an ES is that it enables random access and error protection. A PES packet starts with a header, followed by the stream ID (which carries the identification information of the ES), scrambling control flags, priority flags, copyright flags, etc. The detailed header structure is given in Appendix A1.

Figure 4: Packetized Elementary Stream [8].


2.4.3 Transport Stream:

The TS consists of a series of packets of 188 bytes each, of which 4 bytes form the header and the remaining 184 bytes carry the payload. Appendix A2 gives a detailed overview of the transport packet header. The first byte of the TS header is a synchronization byte with the value 0x47, followed by the Packet Identifier (PID), scrambling control and a few more fields that are used to identify the payload of the TS packet. The PID plays the most important role at the decoding end: it enables the decoder to rebuild the PES from the TS using the 13-bit PID code that is uniquely assigned to each elementary stream carried in the TS [9]. The use of short, fixed-size packets makes the TS robust against errors: if a packet is lost during transmission, the decoder can move on to the next packet without a large loss of data.
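To make the header layout concrete, the following sketch extracts the fields of the 4-byte TS header from a single 188-byte packet. The field positions follow the standard MPEG-2 transport packet layout; the function name, the dictionary keys and the example packet are hypothetical and are not taken from the tools used in the thesis.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def parse_ts_header(packet):
    """Decode the 4-byte header of one MPEG-2 TS packet (bytes object)."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a TS packet must be exactly 188 bytes long")
    if packet[0] != SYNC_BYTE:
        raise ValueError("missing sync byte 0x47")
    b1, b2, b3 = packet[1], packet[2], packet[3]
    return {
        "transport_error": bool(b1 & 0x80),
        "payload_unit_start": bool(b1 & 0x40),
        "transport_priority": bool(b1 & 0x20),
        "pid": ((b1 & 0x1F) << 8) | b2,           # 13-bit packet identifier
        "scrambling_control": (b3 >> 6) & 0x03,
        "adaptation_field_control": (b3 >> 4) & 0x03,
        "continuity_counter": b3 & 0x0F,          # 4-bit counter per PID
    }


# Hand-built example packet: PID 256, payload only, continuity counter 7.
example = bytes([0x47, 0x41, 0x00, 0x17]) + bytes(184)
header = parse_ts_header(example)
print(header["pid"], header["continuity_counter"])
```

The continuity counter increments for successive packets of the same PID, so a gap in its sequence is one of the pieces of header information that can reveal a lost packet without decoding the payload.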

The transport stream contains data in multiplexed form which may include payload information from various PES packets i.e. video, audio and the relevant program information as shown in Figure 5. For synchronization of multiple streams, signaling is not needed at all because the PES headers and the Adaptation field carry timing information for playback [10].

Figure 5: MPEG-2 Transport Stream, multiplexing video, audio & program information [10].

2.5 Quality of Experience:

“Quality of Experience (QoE) has been defined as an extension of the traditional quality of service (QoS) in the sense that QoE provides information regarding the delivered services from an end-user point of view” [15]. It can also be called a subjective measure of the user's experience.

As broadband services are growing rapidly, there is a need to ensure customer satisfaction, which can be achieved by keeping a regular eye on customers' feedback regarding their subscribed services.

From the operator's point of view, QoE is of great importance, as operators can improve their services in the light of users' feedback. A few factors play a pivotal role in determining the QoE of a service: usability, cost effectiveness, reliability, availability and privacy.

Now that the term QoE is clear, we turn to the means of measuring it. QoE is subjective in nature, but it is important for operators and service providers to measure it realistically. A simple way to measure user experience is to calculate a Mean Opinion Score (MOS) [4], which is a numerical rating of the quality perceived by end users. To obtain a single representative value, all user ratings are averaged. The MOS ranges from 1 to 5, where 1 is the lowest perceived quality and 5 is the highest, as shown in Table 1.

| MOS | Quality   | Human Perception  |
|-----|-----------|-------------------|
| 5   | Excellent | Imperceptible     |
| 4   | Very Good | Perceptible       |
| 3   | Fair      | Slightly Annoying |
| 2   | Bad       | Annoying          |
| 1   | Poor      | Very Annoying     |

Table 1: Mean Opinion Score [4].
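The averaging step is simple enough to show directly. In the sketch below the ratings are invented for illustration and the function name is hypothetical.

```python
def mean_opinion_score(ratings):
    """Average a list of individual ratings on the 1 to 5 scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    return sum(ratings) / len(ratings)


# Invented ratings from five viewers of the same video sequence.
ratings = [4, 5, 3, 4, 4]
print(mean_opinion_score(ratings))  # 4.0
```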

QoE for IPTV is somewhat different from that of other broadband services because of its dynamic nature. As discussed in [11], QoE for IPTV must be measured per unit of time, over time. For example, if a user watches an IPTV channel for an hour and the broadcast is of very high quality for the first three quarters but of poor quality in the last quarter, the overall QoE for the whole hour would be poor.

2.6 Packet Loss in IPTV:

Today's technologically advanced world cannot compromise on quality, especially with regard to broadband services. IPTV is expected to be the most widely used broadband service (as shown in Figure 1), so service providers will have to fulfil customers' expectations in order to remain competitive in the market. Packet loss is one of the major impairments that service providers face in IPTV broadcasting. Packet loss occurs when packets are lost during transmission, leaving distortion in the received video stream. Because of IPTV's real-time nature, lost packets cannot be retransmitted to overcome this problem. If the lost packets belong to I-frames, the impact of the loss is greater than when the lost packets belong to B- or P-frames.


Chapter 3: Methodology

The overall emulation chain of the experiments performed in the thesis work is shown in Figure 6. The input was an AVI file, which was first encoded to H.264. The encoded file was then packetized into a TS file so that it could be processed by the Packet Loss Model (PLM) in the next step. The PLM generated a degraded TS file, which was passed through the depacketizer to produce a degraded H.264 file. The degraded H.264 file was then decoded to produce an output AVI file. The input and output AVI files were compared using PEVQ, a full-reference objective testing algorithm that estimates the quality of the degraded video sequence with respect to the source video sequence. The PEVQ output file contains the PEVQ score and several distortion indicators, but in our experimental analysis the PEVQ score is of major concern. The output of PEVQ is a predicted Mean Opinion Score (MOS) ranging between 1 and 5, where 1 is the worst quality and 5 is the best. In cases of extreme artifacts, however, the score can become negative [12].

Figure 6: Network topology

In the experimental chain described above, encoding, packetizing, depacketizing and decoding are done with FFmpeg [13]. The resulting degraded AVI output file is then processed by the objective testing tool PEVQ, which compares the input and output AVI files. From the comparison of the non-degraded and degraded AVI files (input and output respectively), PEVQ generates a text document containing various parameters that describe the quality of the output AVI file [12]. As we are investigating the user's experience while watching the video, our prime concern in the text file generated by PEVQ is the predicted Mean Opinion Score (MOS).
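The four FFmpeg steps of the chain can be sketched as command lines. The commands actually used in the experiments are listed in Appendix A4 and are not reproduced here; the options below are ordinary FFmpeg options chosen purely for illustration, and the file names, the working directory and the helper name are hypothetical. The sketch only builds the argument lists and does not run FFmpeg.

```python
def build_chain_commands(source_avi, bitrate, workdir="work"):
    """Build illustrative FFmpeg commands for the emulation chain.

    The packet loss model runs between step 2 and step 3 and is expected
    to turn clean.ts into degraded.ts; it is not an FFmpeg step.
    """
    encoded = f"{workdir}/encoded.h264"
    clean_ts = f"{workdir}/clean.ts"
    degraded_ts = f"{workdir}/degraded.ts"
    degraded_h264 = f"{workdir}/degraded.h264"
    output_avi = f"{workdir}/degraded.avi"
    return [
        # 1. Encode the source AVI to an H.264 elementary stream.
        ["ffmpeg", "-i", source_avi, "-c:v", "libx264", "-b:v", bitrate,
         "-f", "h264", encoded],
        # 2. Packetize: wrap the H.264 stream in an MPEG-2 transport stream.
        ["ffmpeg", "-i", encoded, "-c:v", "copy", "-f", "mpegts", clean_ts],
        # 3. Depacketize the degraded TS back to an H.264 elementary stream.
        ["ffmpeg", "-i", degraded_ts, "-c:v", "copy", "-f", "h264",
         degraded_h264],
        # 4. Decode to uncompressed video in an AVI container for comparison.
        ["ffmpeg", "-i", degraded_h264, "-c:v", "rawvideo", "-f", "avi",
         output_avi],
    ]


commands = build_chain_commands("source.avi", "5M")
for command in commands:
    print(" ".join(command))
```

Each list could be handed to `subprocess.run` once FFmpeg is installed and the packet loss model has produced the degraded TS file.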

Six different video sequences, eight bit rates and six distinct seeds (for randomness in the Markov model [4]) were used in the experiments in order to obtain a generalized result (see Appendix A3 for details). The video sequences were taken from the SVT (Swedish National Television) video database and are described in Table 2:

| S. No. | Video Sequence | Frame Rate | Duration | Description |
|--------|----------------|------------|----------|-------------|
| 1 | Crowd Run | 50 fps | 10 sec | High amount of details, complex movement, outdoor |
| 2 | Tree Tilt | 50 fps | 10 sec | Less details, less movement, outdoor |
| 3 | Into Castle | 50 fps | 10 sec | Less movements, zoom, day light |
| 4 | Seeking | 50 fps | 10 sec | Less amount of details, complex movements, slow panning, day light |
| 5 | Stockholm | 50 fps | 14 sec | Complex details, zoom, less movements, outdoor |
| 6 | Fairy People | 50 fps | 14 sec | Moderate complexity, moderate movement, moderate panning, day light |

Table 2: Details of the video sequences

3.1 FFmpeg:

As defined by [13], “FFmpeg is a complete, cross-platform solution to record, convert and stream audio and video.” It is an open source tool, written in C, which can be used in a variety of ways, e.g. for streaming, recording, encoding, decoding, packetizing and depacketizing of audio as well as video files. FFmpeg handles the complex video processing conversions, which makes it easier for developers to implement multimedia applications. It can decode nearly all codecs on the market. Furthermore, FFmpeg allows us to adjust parameters that play a vital role in multimedia applications, such as bitrate, frame rate, resizing, aspect ratio, dubbing of audio (by importing from various sources), error concealment and much more, as mentioned in [13]. The commands used in the experiments are described in Appendix A4.


3.2 Packet Loss Model:

The simulator used to drop packets in the MPEG-2 TS consisted of a two-state (A and B) Markov model, shown in Figure 7. The model was developed in Python.

Figure 7: Two state Markov Model

It can be seen in Figure 7 that the transition probabilities between the No Loss state (A) and the Loss state (B) affect the resulting video quality. For instance, if P(AB) > P(BA) there will be more degradation of the video than in the opposite case, i.e. P(BA) > P(AB).

To apply packet loss to the video sequence, the user specifies a target value for the PLR; for instance, an input of 0.002 means that the user wants 0.2% packet loss (PL) over the whole video. The question that arises is how a single target PL value maps onto the two-state Markov model, which requires both transition probabilities, P(AB) and P(BA). The answer lies in the equation below.

$$P(AB) = \frac{PL \cdot b}{1 - PL}, \qquad b = P(BA) = 0.1$$

So whenever a user feeds a target PLR as an input, the Markov model works on the above mentioned mathematical formulae to degrade the video stream.


Figure 8: Example of Markov Packet loss

The packet loss simulator encapsulated the TS packets in UDP datagrams; each UDP datagram carries seven TS packets. The simulator also generated a log file with details such as the type of PLM used, the total number of UDP packets, the number of lost UDP packets, the number of bursts, the percentage of actual packet loss in the video sequence, and packet-level information, i.e. which packets were lost and which are present in the degraded video stream.
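The grouping of seven TS packets per datagram means that one lost datagram removes 7 × 188 = 1316 bytes of transport stream at once. The sketch below illustrates this on an in-memory byte string; it does not send anything over a network, and the helper names are hypothetical.

```python
TS_PACKET_SIZE = 188
TS_PER_DATAGRAM = 7
DATAGRAM_PAYLOAD = TS_PACKET_SIZE * TS_PER_DATAGRAM  # 1316 bytes


def split_into_datagrams(ts_bytes):
    """Cut a transport stream into payloads of seven TS packets each."""
    return [ts_bytes[i:i + DATAGRAM_PAYLOAD]
            for i in range(0, len(ts_bytes), DATAGRAM_PAYLOAD)]


def drop_datagrams(datagrams, lost_flags):
    """Rebuild the degraded stream from the datagrams that were not lost."""
    return b"".join(d for d, lost in zip(datagrams, lost_flags) if not lost)


# Dummy stream of 14 TS packets (sync byte followed by zero bytes).
stream = (bytes([0x47]) + bytes(187)) * 14
datagrams = split_into_datagrams(stream)
degraded = drop_datagrams(datagrams, [True, False])  # lose the first datagram
print(len(datagrams), len(degraded) // TS_PACKET_SIZE)
```

The 76349 TS packets reported in Table 3 are an exact multiple of seven, corresponding to 10907 datagrams of this size.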

3.3 Objective Testing:

The need for service monitoring is increasing day by day, so IPTV service providers must keep an eye on the perceptual quality while the service is being delivered. There are two types of testing methodologies: subjective and objective. In subjective testing, the tests are conducted with human viewers drawn from population groups of different ages, professions or genders. Objective testing, in contrast, can be described as a technical model of human perception [12]: an objective testing methodology is a computational model that is trained on human perception of a certain setup. The algorithm that carries out the evaluation in objective testing is known as an objective model.

Objective testing methodologies have changed the way video quality is investigated, because they are repeatable and yield quantitative results. Objective testing can be classified into three major categories:

No Reference Model:

A no reference model judges the degraded output signal without any information about the reference or source signal. The degradation is estimated by careful examination of the properties of the degraded signal.

Reduced Reference Model:

A reduced reference model uses reduced information about the source signal. The quality assessment algorithm does not have complete information about the source signal, but it does have some information about it.


Full Reference Model:

A full reference model compares the degraded signal with the source and computes the amount of degradation. It is more accurate than the other two models because it has complete information about the source signal and compares every pixel in each image of the degraded video sequence with its original (i.e. the non-degraded video sequence). As full reference models are the most accurate of the three, we opted to use one for our experiments.
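The idea of a pixel-by-pixel comparison can be illustrated with PSNR, a much simpler full reference metric than the perceptual model used in this thesis. The sketch below is not PEVQ and does not model human perception; it only shows what "full reference" means in practice. Frames are represented as flat lists of 8-bit sample values, and the function name is hypothetical.

```python
import math


def psnr(reference, degraded, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    if len(reference) != len(degraded) or not reference:
        raise ValueError("frames must be non-empty and of equal size")
    mse = sum((r - d) ** 2 for r, d in zip(reference, degraded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames: no degradation at all
    return 10.0 * math.log10(max_value ** 2 / mse)


reference_frame = [10, 20, 30, 40]
slightly_degraded = [11, 19, 31, 39]   # every sample off by one
heavily_degraded = [60, 90, 0, 100]

print(psnr(reference_frame, slightly_degraded))
print(psnr(reference_frame, heavily_degraded))
```

A higher value means the degraded frame is closer to the reference; a perceptual model additionally weights the differences by how visible they are to a viewer.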

3.3.1 PEVQ:

PEVQ is an objective full-reference testing algorithm used to analyze perceived video quality in terms of MOS, which has been described in section 2.5. The algorithm consists of four building blocks, as shown in Figure 9 [14]. The first is pre-processing, which deals with the spatial and temporal alignment of the degraded signal and the non-degraded reference signal. For video sequences, it compares each frame of the reference sequence with the corresponding frame of the degraded sequence.

Figure 9: PEVQ Structure [14].

The second block examines the perceptual difference between the aligned signals, i.e. it compares the degraded and reference signals with respect to perceived subjective quality. The third block classifies parameters calculated in the previous stage, such as changes in the luminance and chrominance domains, and evaluates the quality from them. The last block aggregates all the distortion indicators into an output file that carries all the relevant information about the evaluation done by PEVQ, including the Mean Opinion Score (which is our major concern).


Chapter 4: Results and Analysis of Simulations

There were two test cases in the simulations: one with varying bitrates and no packet loss, and the other with varying bitrates and varying packet loss. As discussed in chapter 3, a total of six video sequences were used in the experiments, with 440 different scenarios for each video sequence obtained by varying seeds, bitrates and PLR. All the video sequences were in 720p HD format.

4.1 Clean Quality Results:

Clean quality means that there is no packet loss and only the quality with respect to bitrate is examined; that is, 0% PL was configured in the PLM while conducting the experiments. Figure 10 shows that the quality increases with bitrate and saturates at a certain point (this observation does not cover bitrates higher than those shown). At similar bitrates the MOS values differ between sequences, because the properties of each video sequence differ, as described in Table 2.

Figure 10: Quality Vs bitrates for different video sequences

In the experiments we verified that FFmpeg can encode at bitrates even higher than 100 Mbit/s, and that a maximum PEVQ score of 4.7 can be obtained with clean quality. The quantization coefficient is an important parameter that must be set appropriately when encoding the video sequences at higher bitrates. Figure 11 illustrates the effect of bitrate: the higher the bitrate, the more information is contained in the video, and vice versa.

Figure 11: Video Sequences encoded with different bitrates. (Top Left) 2Mbit, (Top Right) 5Mbit, (Bottom Left) 12Mbit, (Bottom Right) 20Mbit.

Figure 11 shows some of the impact of higher and lower bitrates on the video sequence. The difference is more visible when the video is viewed at its original resolution (i.e. 1280x720).


4.2 Experiments with Packet Loss:

As discussed in the previous chapter, the packet loss simulator consisted of a two-state Markov model, and there were two techniques to implement the packet loss on the video sequences. The experiments were performed with eight different bitrates ranging from 5 Mbit/s to 12 Mbit/s, eleven different PLR values ranging from 0% to 20%, and five different seeds for randomness in the Markov model (details can be seen in Appendix A3).

Markovtarget   Actual PL   Total Nr. of TS Packets   Nr. of Lost TS Packets   Avg. Nr. of Packets Lost per Second
0.0%           0.0%        76349                     0                        0
0.2%           0.20%       76349                     154                      11
0.4%           0.46%       76349                     371                      26.5
0.8%           0.73%       76349                     560                      40
1.5%           1.52%       76349                     1162                     83
5%             5.02%       76349                     3829                     273.5
10%            10.06%      76349                     7679                     548.5
20%            19.97%      76349                     15246                    1089

Table 3: Actual loss values from a video sequence (‘Stockholm’) that was encoded with a data rate of 5Mbit.

Table 3 shows an overview of some parameters taken from the log files. The exception is the last column, packets lost per second, which was calculated by dividing the Nr. of Lost TS Packets by the length of the video sequence in seconds.
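The calculation of the last column can be sketched as below. The divisor of 14 seconds is an assumption: it is not stated next to the table, but it is the value that reproduces every entry in the column (for example 154 / 11 = 14).

```python
DURATION_S = 14  # assumption: inferred from Table 3, not stated explicitly


def lost_per_second(lost_packets, duration_s=DURATION_S):
    """Average number of lost TS packets per second of video."""
    return lost_packets / duration_s


# Nr. of Lost TS Packets as listed in Table 3.
for lost in (0, 154, 371, 560, 1162, 3829, 7679, 15246):
    print(lost, lost_per_second(lost))
```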

Figure 12: Averaged Quality Vs PLR for different bitrates

Figure 12 illustrates the overall averaged quality (for the overall PL plot see Appendix A6) for different bitrates with respect to PLR. It can be concluded from the plot that the perceived quality is affected by both PL and bitrate. Because the impact of packet loss depends on the distribution of the lost packets, the type of frame that is corrupted plays an important role: if an I-frame is corrupted, the effects persist until the next I-frame occurs. In the experiments an I-frame occurs every second, which means that a video sequence of 10 seconds contains 10 I-frames.
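The I-frame arithmetic can be checked with a short sketch. The GOP size and frame rate of 50 are taken from the `-g 50 -r 50` options listed in Appendix A4; the function names are hypothetical, and the sketch assumes that I-frames occur only at GOP boundaries (an encoder may insert additional ones, for example at scene changes).

```python
import math


def gop_duration_s(gop_size, frame_rate):
    """Time between two consecutive I-frames, in seconds."""
    return gop_size / frame_rate


def i_frame_count(duration_s, gop_size, frame_rate):
    """Number of I-frames when every GOP starts with exactly one I-frame."""
    total_frames = round(duration_s * frame_rate)
    return math.ceil(total_frames / gop_size)


print(gop_duration_s(50, 50))     # seconds between I-frames
print(i_frame_count(10, 50, 50))  # I-frames in a 10 second sequence
```

With one I-frame per second, a lost I-frame packet can affect up to one second of video before the decoder recovers at the next I-frame.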

Figure 13: Video Sequences with different PLR. (Top Left) 0.2% PLR, (Top Right) 0.8% PLR, (Bottom Left) 5% PLR, (Bottom Right) 20% PLR.

Figure 13 shows screenshots from the output AVI file, which was degraded by the factors mentioned above. It can clearly be seen that as PL increases the video quality decreases. As discussed earlier, video quality depends on bitrate as well as PLR, so we experimented with bitrates from 5 Mbit/sec up to 12 Mbit/sec in each PL scenario. Another important point is that with extreme artifacts the PEVQ score may fall below 1 or even become negative; we set those values to 1 because they could otherwise cause unusual behavior when modeling for PL. See Appendix A7 for separate Quality vs. PLR plots for each bitrate.


Chapter 5: Parametric Model Development

This model is based on PEVQ scores and estimates video quality from bitrate and packet loss only, for video sequences encoded with the H.264 codec. The model gives a score between 1 and 5 (where 5 is the best and 1 is the worst quality) that describes the perceived quality of the given video sequence. The modeling is based on six video sequences, five seeds, twenty bitrates and eleven PLR values.

The basic structure of the model⁴ is represented by:

\[ MOS_{est} = f(\mathrm{bitrate}, \mathrm{plr}) \]

The objective estimate of the Mean Opinion Score ($MOS_{est}$) is calculated based on the encoded bitrate and the Packet Loss Ratio.

5.1 Modeling clean quality:

One approach to modeling is to start the parametric model development with clean quality (i.e. without any packet loss). In clean quality the impairments depend on the video sequence and the bitrate. As encoding is done with only one H.264 encoder implementation, there is no variation due to the codec.


Figure 14: MOS Vs Bitrate for all samples

The PEVQ scores for all six samples are plotted with respect to bitrate in Figure 14. Quality increases with bitrate, with a decreasing slope towards 20 Mbit. If the bitrate is increased to 100 Mbit, however, saturation is reached at 4.6 (as shown in Appendix A5). As the bitrate increases it becomes very difficult to distinguish between the different samples.


Figure 15: Clean quality Model

Figure 15 shows that the model flattens out a bit earlier than the mean of all data points, but it lies almost in between the upper and lower ranges of the data points. Furthermore, the model correlates well with the data points (i.e. the PEVQ scores) and follows their trend. In addition, Figure 16 illustrates that the model lies within the 95% confidence interval of the mean of all data points.
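A check of whether a model value lies inside the 95% confidence interval of the mean can be sketched as follows. The scores and the model value are made up for illustration, and the normal-approximation quantile of 1.96 is an assumption: the thesis does not state how its interval was computed, and with only six samples a Student-t quantile (about 2.571 for five degrees of freedom) would give a wider interval.

```python
import math
import statistics


def mean_confidence_interval(scores, quantile=1.96):
    """Return (low, high) bounds of the confidence interval of the mean."""
    mean = statistics.mean(scores)
    half_width = quantile * statistics.stdev(scores) / math.sqrt(len(scores))
    return mean - half_width, mean + half_width


# Made-up PEVQ scores of six sequences at one bitrate, for illustration only.
scores_at_bitrate = [3.0, 3.2, 3.4, 3.6, 3.8, 4.0]
model_score = 3.55  # hypothetical model output at the same bitrate

low, high = mean_confidence_interval(scores_at_bitrate)
print(low <= model_score <= high)
```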


Figure 16: Score (Model and 95% confidence interval of mean) Vs bitrate.

5.2 Modeling for Packet loss:

The impact on video quality due to packet loss is difficult to predict since it is dependent on the distribution of lost packets and the types of frames which are affected. This implies that only using packet loss ratio will not provide the best possible prediction of video MOS.


Figure 17: MOS Vs PL for all samples, all seeds and all bitrates

Figure 17 shows a strong correlation between packet loss rate and perceived quality, i.e. the higher the packet loss rate, the lower the perceived quality. The lowest PEVQ score in Figure 17 is 1 because we set all PEVQ scores lower than 1 to 1 in order to obtain an appropriate model. PEVQ has no lower bound, so for sequences with extreme degradations the score can be below 1 and even negative [12].
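The treatment of scores below 1 amounts to applying a lower bound before modeling. A minimal sketch, with a hypothetical function name and made-up raw scores:

```python
def floor_pevq(score, floor=1.0):
    """Raise any score below the floor up to the floor; leave others unchanged."""
    return max(score, floor)


raw_scores = [3.8, 1.2, 0.7, -0.4]  # made-up values, not thesis data
floored = [floor_pevq(s) for s in raw_scores]
print(floored)
```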


Figure 18: Model for Packet loss

The decreasing trend is visible in the PEVQ scores as well as in the model, which means that the model correlates well with the data points. The variation in MOS can be caused by the varying bitrate or by the PEVQ estimation (PEVQ is itself a model of human perception, so a difference may exist between the subjective score and the PEVQ score).


5.3 Model Performance:

The two most important factors which influence the perceived quality (bitrate and packet loss) are examined for experimentation and model development⁵. As we have no subjective test data the whole model is based on PEVQ scores. Figure 19 compares PEVQ scores (of all six samples among various bitrates and packet losses) with the corresponding scores from the Model.

Figure 19: PEVQ Vs Model (trained on all data)

Figure 19 clearly shows a trend that follows the x=y line. The correlation between the PEVQ scores and the predicted model scores is 0.87, which is good. The Root Mean Square Error (RMSE) is the main performance metric; it indicates the error between the predicted model score and the PEVQ score. The RMSE value (calculated as the root mean square of the difference between the model and the PEVQ scores) is 0.55.
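The two metrics can be computed as in the sketch below. It assumes that the reported correlation is the Pearson coefficient, which the text does not state explicitly; the function names are hypothetical and the two score lists are made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def rmse(x, y):
    """Root mean square of the element-wise differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


pevq_scores = [4.1, 3.6, 2.9, 1.8, 1.0]   # made-up values
model_scores = [3.9, 3.4, 3.0, 1.5, 1.2]  # made-up values
print(pearson(pevq_scores, model_scores), rmse(pevq_scores, model_scores))
```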


Figure 20: Histogram and approximated CDF for all errors

Figure 20 shows a histogram of the error, i.e. the distribution of the error in the data, together with its corresponding absolute values. The error is the difference between the PEVQ scores and the predicted model scores; the impact of quality underestimation is noticeable in the histogram, as the error is biased towards the positive side. Another important conclusion from the CDF plot is that nearly 75% of the errors lie between 0 and 0.5 on the MOS scale.
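The quantities read from the error histogram and the CDF can be sketched as follows. The sign convention (PEVQ score minus model score, so that underestimation by the model gives a positive error) is an assumption based on the description above, and the scores are made up for illustration.

```python
def signed_errors(pevq, model):
    """PEVQ minus model: positive values mean the model underestimates quality."""
    return [p - m for p, m in zip(pevq, model)]


def fraction_in_range(values, low, high):
    """Share of values that fall inside the closed interval [low, high]."""
    return sum(low <= v <= high for v in values) / len(values)


pevq_scores = [4.0, 3.5, 2.0, 1.5]   # made-up values
model_scores = [3.8, 3.2, 2.2, 1.4]  # made-up values

errors = signed_errors(pevq_scores, model_scores)
mean_bias = sum(errors) / len(errors)
print(mean_bias > 0, fraction_in_range(errors, 0.0, 0.5))
```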


Chapter 6: Conclusions and Future work

6.1 Conclusions:

In this thesis we examined how different amounts of packet loss and different bitrates affect the IPTV quality perceived by end users. Artificial degradations of different intensities were introduced in various video sequences encoded at different bitrates. All the experiments were performed on HD 720p video sequences with a duration of less than 14 seconds. The full-reference algorithm 'PEVQ' was used to calculate the perceived quality at the transport layer. After obtaining the PEVQ scores for 2640 different scenarios, we developed the parametric model for clean quality and packet loss separately.

With the help of the simulation results we showed that increasing the bitrate improves the perceived quality, so users are more satisfied with the service. On the other hand, higher bitrates require more bandwidth and other resources as well. A bitrate of 10 Mbit is suggested as a good choice for HD IPTV because the gain in perceived quality at higher bitrates is small. Packet loss is one of the most challenging factors for IPTV service providers: as IPTV is delivered over UDP, retransmission of lost or corrupted packets is not possible. The impact of packet loss can also be seen in the simulation results, i.e. as packet loss increases the end user quality decreases. In addition, quality is influenced by bitrate and many other encoding factors (e.g. frame rate, codec, I-frame frequency, resolution etc.).

The developed parametric model⁶ is based on objective data. Two types of parametric models were developed: the first dealt only with the degradation due to bitrate, and the second dealt with packet loss as well as bitrate. The overall correlation between the PEVQ and the model scores is 0.87 with an RMSE of 0.55, which is satisfactory. The difference between the model and the PEVQ scores is biased towards the positive axis. This can also be seen when examining the performance of the model, where most of the data points lie above the x=y line, because the model underestimates the quality of the video sequences.


6.2 Future Work:

Here are some suggestions to improve and enhance the parametric model as well as the simulation results gathered in this thesis.

• The model should be enhanced so that it does not underestimate the quality of the video sequences.

• A comparative study of MPEG-2 and H.264 encoded video sequences and their impact on perceived quality should be carried out.

• The parametric model developed in the thesis estimates only the video quality, so audio needs to be taken into consideration too.

• Subjective tests should be performed on the same experimental setup, and the model should be revised accordingly to make it more general.

• Experiments with PL < 0.1% should be performed to analyze the end user quality.

• Parameters in the transport stream should be identified that can be used in an objective parametric model to estimate perceived quality at the transport layer.


Appendix A

A1. Packetized Elementary Stream Header [8]:

Item Name                   Bits   Description
Start Code                  24     0x000001
Stream ID                   8      Stream identifier
PES Packet Length           16     Length of the packet
Sync Code                   2      '10'
PES Scrambling Control      2      Scrambling flags
PES Priority                1      Used by the decoder
Data Alignment Indicator    1      Used by the decoder
Copyright                   1      When set, the PES packet payload is protected by copyright
Original or Copy            1      When set, the PES packet is an original
PTS DTS Flags               2      '10': PTS fields are present in the PES packet header
                                   '11': PTS and DTS fields are present in the PES packet header
                                   '00': no PTS or DTS fields are present in the PES packet header
ESCR Flag                   1      When set, ESCR fields are present in the PES packet header
ES Rate Flag                1      When set, the ES rate field is present in the PES packet header
DSM Trick Mode Flag         1      When set, the DSM trick mode field is present in the PES packet header
Additional Copy Info Flag   1      When set, the additional copy info field is present in the PES packet header
PES CRC Flag                1      When set, the CRC field is present in the PES packet header
PES Extension Flag          1      When set, an extension field is present in the PES packet header
PES Header Data Length      8      Total Nr. of bytes occupied by the additional fields and any stuffing bytes in the PES header
Additional Header Data      -      Depending on the flags above, more headers may follow. The length of this section is set by the PES header data length field.
PES Stream Data             -      Contiguous bytes of data from a stream
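The fixed part of the header listed above can be read with a few bit operations. The sketch below is an illustration only: the function and key names are hypothetical, and it assumes a stream type (such as video or audio) whose PES packets carry the flag bytes described in the table.

```python
def parse_pes_header(data: bytes) -> dict:
    """Decode the first nine bytes of a PES packet into named fields."""
    if len(data) < 9 or data[0:3] != b"\x00\x00\x01":
        raise ValueError("data does not begin with the PES start code")
    flags1, flags2 = data[6], data[7]
    return {
        "stream_id": data[3],
        "pes_packet_length": (data[4] << 8) | data[5],
        "sync_code": (flags1 >> 6) & 0x3,          # expected to be binary '10'
        "scrambling_control": (flags1 >> 4) & 0x3,
        "priority": (flags1 >> 3) & 0x1,
        "data_alignment_indicator": (flags1 >> 2) & 0x1,
        "copyright": (flags1 >> 1) & 0x1,
        "original_or_copy": flags1 & 0x1,
        "pts_dts_flags": (flags2 >> 6) & 0x3,
        "escr_flag": (flags2 >> 5) & 0x1,
        "es_rate_flag": (flags2 >> 4) & 0x1,
        "dsm_trick_mode_flag": (flags2 >> 3) & 0x1,
        "additional_copy_info_flag": (flags2 >> 2) & 0x1,
        "crc_flag": (flags2 >> 1) & 0x1,
        "extension_flag": flags2 & 0x1,
        "header_data_length": data[8],
    }


# Hand-built example: video stream ID 0xE0, PTS only, 5 bytes of header data.
example = bytes([0x00, 0x00, 0x01, 0xE0, 0x00, 0x00, 0x80, 0x80, 0x05]) + bytes(5)
header = parse_pes_header(example)
print(header["stream_id"], header["pts_dts_flags"], header["header_data_length"])
```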

A2. Transport Stream Packet Header [8]:

Item Name                   Bits   Description
Sync Byte                   8      Synchronization code '0x47'
Transport Error Indicator   1      When set, at least one bit error is present in the packet
Payload Start Indicator     1      When set, the payload of this transport packet is the start of a packetized elementary stream (PES) packet
Transport Priority          1      Used by the decoder
PID                         13     Identifies which PES the data stored in this transport packet payload is from
Scrambling Control          2      Scrambling mode
Adaptation Field Control    2      Indicates if there is an adaptation field (additional information) following the TS header
Continuity Counter          4      4-bit counter which is incremented with each TS packet containing the same PID; when it reaches 15 it loops back to zero
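The four header bytes described above can be decoded as in the following sketch. The function and key names are hypothetical; the 188-byte packet size comes from the MPEG-2 transport stream format and is not part of the table.

```python
TS_PACKET_SIZE = 188  # fixed TS packet length in bytes
SYNC_BYTE = 0x47


def parse_ts_header(packet: bytes) -> dict:
    """Decode the 4-byte header of a single transport stream packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid 188-byte TS packet")
    b1, b2, b3 = packet[1], packet[2], packet[3]
    return {
        "transport_error_indicator": (b1 >> 7) & 0x1,
        "payload_start_indicator": (b1 >> 6) & 0x1,
        "transport_priority": (b1 >> 5) & 0x1,
        "pid": ((b1 & 0x1F) << 8) | b2,           # 13 bits across two bytes
        "scrambling_control": (b3 >> 6) & 0x3,
        "adaptation_field_control": (b3 >> 4) & 0x3,
        "continuity_counter": b3 & 0x0F,
    }


def next_continuity_counter(counter: int) -> int:
    """Advance the 4-bit counter, wrapping from 15 back to 0."""
    return (counter + 1) & 0x0F


# Hand-built packet: payload start set, PID 256, payload only, counter at 15.
packet = bytes([0x47, 0x41, 0x00, 0x1F]) + bytes(184)
fields = parse_ts_header(packet)
print(fields["pid"], fields["continuity_counter"])
```

A gap in the continuity counter sequence of one PID is a simple indication that TS packets of that stream were lost.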


A3. Experimental Data:

Seeds 7 17 21 39 42

PLR 0% 0.2% 0.4% 0.6% 0.8% 1% 1.5% 5% 10% 15% 20%

Bitrates 5Mbit 6Mbit 7Mbit 8Mbit 8.5Mbit 9Mbit 10Mbit 12Mbit
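The parameter grid above can be enumerated to confirm the scenario counts quoted in the thesis (440 per video sequence, 2640 for six sequences). The variable names are hypothetical.

```python
from itertools import product

SEEDS = [7, 17, 21, 39, 42]
PLR_PERCENT = [0, 0.2, 0.4, 0.6, 0.8, 1, 1.5, 5, 10, 15, 20]
BITRATES_MBIT = [5, 6, 7, 8, 8.5, 9, 10, 12]
NUM_SEQUENCES = 6  # number of video sequences used in the experiments

# One scenario per (bitrate, packet loss ratio, seed) combination.
scenarios = list(product(BITRATES_MBIT, PLR_PERCENT, SEEDS))

print(len(scenarios))                  # scenarios per video sequence
print(len(scenarios) * NUM_SEQUENCES)  # scenarios in total
```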

A4. FFmpeg Commands & Description:

ffmpeg.exe -i sample.avi -y -vcodec libx264 -f h264 -g 50 -vb 5000000 -qmax 51 -r 50 -pass 1 -passlogfile fn encoded_5M_0.002.h264

ffmpeg.exe -i sample.avi -y -vcodec libx264 -f h264 -g 50 -vb 5000000 -qmax 51 -r 50 -pass 2 -passlogfile fn encoded_5M_0.002.h264

ffmpeg.exe -f h264 -i encoded_5M_0.002.h264 -y -f mpegts -vcodec copy packetized_5M_0.002.ts

Enhanced_plsimMPEG2TS.py -i packetized_5M_0.002.ts -o degraded_5M_0.002.ts -l log_5M_0.002.txt -s 7 -p markovtarget " 0.002

ffmpeg.exe -i degraded_5M_0.002.ts -f h264 -vcodec copy -er FF_ER_VERY_AGGRESSIVE -y depacketized_5M_0.002.h264

ffmpeg.exe -i depacketized_5M_0.002.h264 -f rawvideo decoded_5M_0.002.yuv

ffmpeg.exe -s 1280x720 -r 50 -i decoded_5M_0.002.yuv -vcodec copy -y output_sample_5M_0.002.avi

PEVQOem.exe -Ref sample.avi -Test output_sample_5M_0.002.avi -Out pevq_result_5M_0.002.txt


Description [13]:

-i Input video name along with its extension

-y Overwrite output files (if any)

-vcodec Force (specified) video codec

-f Force format

-g Group of Pictures size

-vb Video bitrate

-qmax Maximum video quantization scale

-r Frame rate (i.e. fps)

-pass Select the pass number (1 or 2). It is used to do two-pass video encoding.

-er Set error resilience (ranges from 1 to 4)

A5. PEVQ Scores while encoding up to 100Mbit:

Figure 21: Quality vs. Bitrate


A6. Overall PL VS Quality:

Figure 22: PL vs. Quality (overall)


Figure 23: Quality vs. PLR (PL<3%)


Figure 24: Quality vs. PLR (4%<PL<21%)


A7. Averaged Quality with respect to PL and Bitrates:

Figure 25: Encoded with 5 Mbit/sec


Figure 29: Encoded with 8.5 Mbit/sec


List of Abbreviations

IPTV Internet Protocol Television

TS Transport Stream

MPEG Moving Picture Experts Group

QoE Quality of Experience

STB Set Top Box

DVB Digital Video Broadcasting

DVD Digital Versatile Disc

VCD Video Compact Disc

GOP Group of Pictures

HD High Definition

PS Program Stream

ES Elementary Stream

PES Packetized Elementary Stream

PID Packet Identifier

MOS Mean Opinion Score

PLM Packet Loss Model

SVT Swedish National Television

UDP User Datagram Protocol

RMSE Root Mean Square Error

CDF Cumulative Distribution Function


Bibliography:

[1] Jörgen Gustafsson, Gunnar Heikkilä and Martin Pettersson. Measuring multimedia quality in mobile networks with an objective parametric model. IEEE 978-1-4244-1764-3/08. Ericsson Research Luleå, Sweden, 2008. Presented at ICIP 2008.

[2] Keishiro Watanabe, Kazuhisa Yamagishi, Jun Okamoto, and Akira Takahashi. Proposal of new QoE assessment approach for quality management of IPTV services. IEEE 978-1-4244-1764-3/08. NTT Service Integration Laboratories, NTT Corporation, 2008.

[3] Fenghua You, Wei Zhang and Jun Xiao. Packet Loss Pattern and Parametric Video Quality Model for IPTV. IEEE 978-0-7695-3641-5/09. Department of Computer Science and Technology, East China Normal University, Shanghai, China, 2009.

[4] www.wikipedia.org [Online] [Cited: September 4, 2009.]

[5] Peter Arberg, Torbjörn Cagenius, Olle V. Tidblad, Mats Ullerstig and Phil Winterbottom, Network infrastructure for IPTV. Ericsson Review No. 3, 2007.

[6] Alex MacAulay, Boris Felts, Yuval Fisher, of MPEG-4: Native RTP vs MPEG-2 Transport Stream. White Paper Envivio October 2005.

[7] Introduction to H.264, ATI Technologies, AVIVO, 2005.

[8] Peter Daniel, Investigation into digital video streams. Final year project, University of Salford, Department of Acoustics, Electronics and Electrical Engineering, 2001.

[9] P.A. Sarginson, MPEG-2: Overview of the system layer, Research and development report. BBC R&D 1992.

[10] http://www.vbrick.net/topics/transport_stream.htm, [Online] [Cited: 20 August,2009.]

[11] Quality of Experience for Media over IP, IneoQuest Technologies, France.

[12] User Manual PEVQ OEM Version 2.4, www.opticom.de

[13] www.ffmpeg.org [Online] [Cited: 23 June, 2009.]

[14] Advanced perceptual evaluation of video quality, White paper from Opticom GMBH.

[15] Lopez, D.; Gonzalez, F.; Bellido, L.; Alonso, A., (2006): Adaptive multi-media streaming over IP based on customer oriented metrics, International Symposium on Computer Networks.
