Institutionen för systemteknik
Department of Communication Systems

Examensarbete

Forward Error Correction for Packet Switched Networks

Master thesis performed in Communication Systems
by
Francisco Javier Parada Otte
David Valverde Martínez

Report number: LiTH-ISY-EX--08/4036--SE
Linköping, 2008-01-25

TEKNISKA HÖGSKOLAN
LINKÖPINGS UNIVERSITET

Department of Electrical Engineering, Linköping University, SE-581 83 Linköping, Sweden
Linköpings tekniska högskola, Institutionen för systemteknik, 581 83 Linköping

Forward Error Correction for Packet Switched Networks
Master thesis in Communication Systems
at Linköping Institute of Technology
by
Francisco Javier Parada Otte
David Valverde Martínez
LiTH-ISY-EX--08/4036--SE
Supervisors: Mikael Olofsson, Robert Forchheimer
Examiner: Mikael Olofsson
Linköping, 2008-01-25
Publishing Date (Electronic version)
Department of Electrical Engineering, Communication Systems
URL, Electronic Version: http://www.ep.liu.se
Language: English
Number of Pages: 104
Type of Publication: Degree thesis
ISRN: LiTH-ISY-EX--08/4036--SE
Publication Title: Forward Error Correction for Packet Switched Networks
Author(s): Francisco Javier Parada Otte, David Valverde Martínez

Abstract
The main goal of this thesis is to select and test Forward Error Correction (FEC) schemes suitable for network video transmission over RTP/UDP. A general concern in communication networks is the trade-off between reliable transmission and the delay it incurs. Our purpose is to look for techniques that improve reliability while fulfilling real-time delay constraints. To achieve this, the FEC techniques focus on recovering the packet losses that arise during any transmission. The FEC schemes we have selected are a Parity Check algorithm, Reed-Solomon (RS) codes and a Convolutional code.
Simulations are performed to test the different schemes. The results obtained show that the RS codes are the most powerful schemes in terms of recovery capability. However, they cannot be deployed for every configuration, since they exceed the delay threshold. On the other hand, despite the Parity Check codes being the least efficient in terms of error recovery, they show a reasonably low delay. Therefore, depending on the packet loss probability we are working with, we may choose one or another of the schemes. To summarize, this thesis includes a theoretical background, a thorough analysis of the chosen FEC schemes, simulation results, conclusions and proposed future work.
Keywords
Acknowledgement
First of all, we would like to thank Edgar E. Iglesias for giving us the opportunity to write our thesis at Axis Communications. His help extended beyond his role as our supervisor; he was always there whenever we needed help. We will never forget it. Thank you. Besides Edgar E. Iglesias, we would like to acknowledge Ulf Olsson for helping us prepare our presentation at the company. His feedback was absolutely valuable. It has been really nice to work in such a big company, and we are sure this experience will prove invaluable to our future careers. Everybody at Axis has always been very nice to us, and the work atmosphere we found there was simply excellent. When we started our studies in Telecommunications Engineering, we could not have imagined a better finish to our degree than writing our thesis in Sweden at one of the leading companies in the sector. Besides the people at Axis, we would like to thank all the personnel at Linköping University (LiU), both teachers and non-teachers. All of them, not only during the thesis but also during the courses we took, have been very kind and understanding with us. It has been an incredible opportunity for us to study a master's programme there. This thesis could not have been done without the help of a number of people. We would like to express our gratitude to Mikael Olofsson and Robert Forchheimer, especially the former for his valuable comments on the report. His feedback, not only on technical topics but also on language issues, has been a great help in putting this report together. I, David, would like to dedicate this thesis to my family, especially my father, who have backed me from the first day I enrolled in my studies. This is the culmination of a big effort for them. Furthermore, I would like to thank Ana and Miguel for supporting me whenever I needed it. All of them have given me the strength I lacked in the worst times.
I, Javier, would like to thank my family as well, my parents and my sisters, who supported me at all times even from far away. I also thank all the students who lived in Sankt Lars (Lund), especially Kathie, Lalo and Ale, for making the months I lived in Lund while carrying out the thesis such a great time.

Table of Contents
1 INTRODUCTION ... 10
  1.1 Overview ... 10
  1.2 Problem Statement ... 11
  1.3 Disposition ... 11
2 BACKGROUND AND PREVIOUS WORK ... 13
  2.1 Network Protocols ... 13
    2.1.1 IP ... 14
    2.1.2 TCP ... 16
    2.1.3 UDP ... 19
    2.1.4 RTP ... 20
    2.1.5 RTSP ... 21
  2.2 Error Control Strategies: ARQ / FEC ... 22
    2.2.1 Automatic Repeat reQuest (ARQ) ... 22
    2.2.2 Types of Codes ... 27
      2.2.2.1 Block Codes ... 29
      2.2.2.2 Convolutional Codes ... 34
      2.2.2.3 State of the Art in Error Control Coding ... 37
    2.2.3 Hybrid ARQ Schemes ... 38
  2.3 Galois Field ... 40
  2.4 Previous Work ... 45
3 OUR WORK ... 48
  3.1 Parity Check ... 48
  3.2 Reed-Solomon ... 51
    3.2.2 Interleaver ... 59
  3.3 Convolutional Code ... 66
    Our Convolutional Code ... 68
4 TESTS ... 71
  4.1 Test Scenario ... 71
  4.2 Parity Check Results ... 72
  4.3 RS Results ... 79
  4.4 Convolutional Code Results ... 88
5 RESULTS AND CONCLUSIONS ... 92
  5.1 Results ... 92
  5.2 Conclusions ... 94
6 FUTURE WORK ... 98

List of Figures
Figure 1: Comparison of the OSI model and the TCP/IP suite ... 13
Figure 2: IPv4 Header ... 15
Figure 3: TCP Header ... 17
Figure 4: UDP Header ... 19
Figure 5: RTP Header ... 20
Figure 6: Stop-and-Wait ARQ ... 24
Figure 7: Go-Back-N ARQ (N=5) ... 25
Figure 8: Selective-Repeat ARQ ... 26
Figure 9: Symmetric Erasure Channel ... 27
Figure 10: Systematic Encoding ... 28
Figure 11: (2, 1, 3) Non-systematic Feedforward Convolutional encoder ... 35
Figure 12: (2, 1, 3) Systematic Feedforward Convolutional encoder ... 35
Figure 13: (3, 2, 2) Systematic Feedback Convolutional encoder ... 36
Figure 14: Error Detection example ... 49
Figure 15: Packet-level Checksum for different k group-orders ... 50
Figure 16: Example of RS(7,3) code ... 53
Figure 17: LFSR encoder for an RS(n,k) ... 55
Figure 18: Transmission of an interleaved code ... 59
Figure 19: d=1 Interleaver over an RS(7,4) code ... 60
Figure 20: Interleaver's 1st packet sent at the transmitter side ... 62
Figure 21: Deinterleaver's 1st packet received at the receiver side ... 62
Figure 22: Interleaver's 2nd packet sent at the transmitter side ... 63
Figure 23: Deinterleaver's 2nd packet received at the receiver side ... 63
Figure 24: Interleaver's 63rd packet sent at the transmitter side ... 64
Figure 25: Deinterleaver's 63rd packet received at the receiver side ... 64
Figure 26: Receiver side when the 2nd packet has been lost in a d=4 interleaver ... 65
Figure 27: Systematic convolutional encoder ... 67
Figure 28: Parity-Check algorithm over k=10 ... 73
Figure 29: Parity-Check algorithm over k=5 ... 74
Figure 30: Parity-Check algorithm over k=4 ... 75
Figure 31: Parity-Check algorithm over k=3 ... 76
Figure 32: Parity-Check algorithm over k=2 ... 77
Figure 33: Parity Check results ... 78
Figure 34: RS(255,239) with a depth-4 Interleaver ... 80
Figure 35: RS(255,231) with a depth-4 Interleaver ... 81
Figure 36: RS(255,223) with a depth-4 Interleaver ... 82
Figure 37: RS(255,205) with a depth-4 Interleaver ... 83
Figure 38: RS(255,191) with a depth-4 Interleaver ... 84
Figure 39: RS(255,175) with a depth-4 Interleaver ... 85
Figure 40: Convolutional code results ... 89
Figure 41: Recovery rate comparison for equivalent Parity Check, RS and ... ... 92

List of Tables
Table 1: List of primitive polynomials ... 42
Table 2: Representation for the elements of [...] generated by [...] ... 44
Table 3: Representations for the elements of [...] generated by [...] ... 45
Table 4: Results for Parity-Check algorithm over k=10 ... 73
Table 5: Results for Parity-Check algorithm over k=5 ... 74
Table 6: Results for Parity-Check algorithm over k=4 ... 75
Table 7: Results for Parity-Check algorithm over k=3 ... 76
Table 8: Results for Parity-Check algorithm over k=2 ... 77
Table 9: Redundancy for different configurations ... 79
Table 10: Delay for an input_rate = 225 [KB/s] ... 79
Table 11: Delay for an input_rate = 375 [KB/s] ... 79
Table 12: Results for RS(255,239) with a depth-4 Interleaver ... 80
Table 13: Results for RS(255,231) with a depth-4 Interleaver ... 81
Table 14: Results for RS(255,223) with a depth-4 Interleaver ... 82
Table 15: Results for RS(255,205) with a depth-4 Interleaver ... 83
Table 16: Results for RS(255,191) with a depth-4 Interleaver ... 84
Table 17: Results for RS(255,175) with a depth-4 Interleaver ... 85
Table 18: Redundancy introduced for the different configurations ... 86
Table 19: Interleaver's delay with an input rate r = 225 [KB/s] ... 87
Table 20: Interleaver's delay with an input rate r = 375 [KB/s] ... 88
Table 21: Maximum packet size for a delay of 200 ms ... 89
Table 22: Delay obtained for a packet size of 1024 Bytes ... 89
Table 23: Convolutional code results ... 90
Table 24: Parity Check, RS and Convolutional Delays ... 94
Table 25: Quality ranges ... 94
Table 26: Recovery rate and Delay comparisons ... 95

List of Abbreviations
ACK    Acknowledgement
ANSI   American National Standards Institute
ARP    Address Resolution Protocol
ARQ    Automatic Repeat reQuest
BCH    Bose, Chaudhuri and Hocquenghem codes
BIC    Binary Increase Congestion
CIDR   Classless Inter-Domain Routing
CRC    Cyclic Redundancy Check
DHCP   Dynamic Host Configuration Protocol
FEC    Forward Error Correction
FTP    File Transfer Protocol
HSTCP  High Speed TCP
HTTP   Hyper Text Transfer Protocol
ICMP   Internet Control Message Protocol
IEEE   Institute of Electrical and Electronics Engineers
IETF   Internet Engineering Task Force
IGMP   Internet Group Management Protocol
IP     Internet Protocol
ISO    International Organization for Standardization
LDPC   Low Density Parity Check
LFSR   Linear Feedback Shift Register
MLD    Maximum Likelihood Decoding
MTU    Maximum Transmission Unit
NAK    Negative Acknowledgement
OSI    Open Systems Interconnection
PT     Payload Type
QoS    Quality of Service
RS     Reed-Solomon
RTCP   Real-time Control Protocol
RTP    Real-time Transport Protocol
RTSP   Real-time Streaming Protocol
RTT    Round Trip Time
SPC    Single Parity Check
SMTP   Simple Mail Transfer Protocol
SSRC   Synchronization Source Identifier
TCP    Transmission Control Protocol
ToS    Type of Service
TTL    Time To Live
UDP    User Datagram Protocol
VCR    Video Cassette Recorder

Introduction
1.1 Overview
In recent years there has been an immense increase in the use of demanding applications, such as real-time applications; video conferencing and audio chat are examples. To classify the quality of the links over which these applications run, we mainly focus on the packet loss probability. Unfortunately, many of the networks they run on do not provide reliability, so packet losses occur during transmission. The Real-time Transport Protocol (RTP) running over the User Datagram Protocol and the Internet Protocol (UDP/IP) provides an unreliable packet delivery service. In such a scenario an application may have to deal with an incomplete media stream. There are cases (e.g. wireless scenarios) where bit corruption is mainly responsible for errors; however, when working with RTP over UDP/IP, the dominant error source is packet loss. To cope with packet loss, the application can either try to correct the errors (packet losses) or conceal them. Although it is clearly important to be able to conceal the effects of transmission errors, it is often more convenient if those errors can be avoided or corrected. Error correction techniques try to recover those lost packets. These techniques fall into two basic categories: Forward Error Correction (FEC) and Automatic Repeat reQuest (ARQ). When applying packet recovery techniques to real-time applications, it is necessary to take into account the tight timing constraints these applications impose. FEC algorithms are notably employed in digital broadcasting systems (e.g. mobile telephony, space communication systems), storage systems (e.g. compact discs, computer hard disks) and memory devices. Because the Internet operates over lossy networks, and because media applications are sensitive to losses, FEC schemes are widely deployed for RTP applications.

1.2 Problem Statement
This thesis analyses different FEC schemes over lossy networks. Based on a preceding theoretical introduction, we propose three FEC algorithms that could be suitable for real-time applications deploying the RTP and UDP protocols. The packet loss recovery rate and the end-to-end latency are analysed for each of the three chosen schemes. Once the schemes are selected, high-level simulation models for each of them are developed. For this purpose the American National Standards Institute (ANSI) C programming language has been chosen.
1.3 Disposition
Chapter 2 gives the theoretical background and reviews previous work. It describes the basics of some relevant network protocols (e.g. the IP, TCP, UDP and RTP protocols). Furthermore, it provides a general view of error control strategies, focusing on FEC schemes. Chapter 3 thoroughly describes the selected FEC schemes: a Parity Check algorithm, Reed-Solomon codes concatenated with an interleaver, and Convolutional codes. Chapter 4 contains the simulation results for the algorithms; results for each scheme are presented as figures of packet loss recovery along with explanations of the results. Chapter 5 presents the conclusions drawn from the obtained results. Chapter 6 concludes the thesis with some suggestions for future work.
Background and Previous Work
2.1 Network Protocols
First of all, as stated at the beginning, our scenario is based on network video, so we have to discuss network protocols, an essential part of any communication system. They establish the rules and conventions used to transmit information among the different end systems. These protocols are layered in a stack in such a way that each protocol provides services to the one above it. Network protocols are the main element in transmitting video over the network, including meeting its real-time requirements. The model used on the Internet is the TCP/IP suite, the most widespread, named after its two most important protocols, the Transmission Control Protocol (TCP) and the Internet Protocol (IP). This suite basically consists of four layers, each with its own protocols, from top to bottom: the application layer, the transport layer, the network layer and the link layer. It can be compared with the OSI model created by the ISO, although the correspondence is not exact, since the OSI model has seven layers and the functionality does not match level by level. Each layer communicates with its peer layer on other systems, and each layer provides services to the layers above while using services from the layers below.

Figure 1: Comparison of the OSI model and the TCP/IP suite

The Application layer consists of user processes that communicate with each other. The network applications at different end systems communicate by sending messages. The format of these messages is defined by the different application-layer protocols, such as the Hyper Text Transfer Protocol (HTTP) for the web, the File Transfer Protocol (FTP), the Simple Mail Transfer Protocol (SMTP) and many more [1, Ch. 1.7].
The Transport layer is responsible for the transmission of data between end systems on behalf of the application layer above. There are two main protocols, TCP and UDP; the former is a connection-oriented and reliable protocol, while the latter is connectionless and unreliable [1, Ch. 1.7]. Both are described later. The Network layer handles the movement of packets through the network. Routing is managed here with different protocols: this layer is responsible for routing packets from source to destination through the different routers and subnetworks according to various algorithms. The main protocol here is IP. It includes further functionality such as handling errors in the network, multicasting and some additional tasks [1, Ch. 1.7]. The Link layer is at the bottom. It is tied to the hardware used to place packets onto the network, and it is responsible for sending packets through the switches between source and destination. It handles the details of the physical interface, whatever the medium is (Ethernet, Token Ring, ...) [2, Ch. 2]. The most common link technology is the Institute of Electrical and Electronics Engineers (IEEE) 802 family of standards, known as Ethernet, which is based on unique 48-bit addresses for each device (or interface) and uses the Address Resolution Protocol (ARP) to map between IP addresses and Ethernet addresses. A very important parameter of this layer is the maximum transmission unit (MTU): the maximum frame size a particular link can carry without forcing fragmentation of the network datagrams we want to transmit.
2.1.1 IP
The IP protocol sits at the network layer and usually subsumes further protocols, such as the Internet Control Message Protocol (ICMP) and the Internet Group Management Protocol (IGMP), the latter used for multicast. IP is completely defined in [3]. It provides a connectionless datagram service: there is no state, each packet is handled independently of previous ones, and the source does not establish any prior contact before sending a datagram [2, Ch. 3]. When the network layer receives a segment from the transport layer, it encapsulates the segment into an IP datagram with the destination address and drops it into the network to be routed to the destination.
IP is also a best-effort protocol: it does not guarantee that packets arrive in order, it does not assure any time limit for their arrival, and it does not even assure that they reach their destination [2, Ch. 3]. Making the communication reliable is left to upper layers, such as the transport layer, as we will see. The IP header consists of 20 bytes plus options (if needed). It is split into the following fields [4, Ch. 2.1]:

● Version: there are two versions, IP version 4 (IPv4) and IPv6. The most widespread is IPv4; IPv6 is a design that tries to solve the address space problem with larger addresses and removes some fields to make the header simpler. It also adds features such as security and routing facilities, and is expected to replace IPv4 gradually.
● Length: the header length, needed because the header size is variable due to the options.
● Type of Service (ToS): used to distinguish among IP datagrams. In some cases it is interesting to differentiate them, for example under congestion, when control packets are preferred.
● Packet Length: the length of the whole packet, header and data.
● Identifier, DF, MF, Fragment Offset: these fields support fragmentation and allow the data to be reassembled at the destination without errors. Fragmentation is carried out by any router whenever the packet size exceeds the MTU of that link; reassembly, however, is done only at the destination.
● Time to Live (TTL): the number of routers a packet may pass through, used to prevent packets from being routed around the net forever.

Figure 2: IPv4 Header
● Transport: the protocol above, into which the packet payload is encapsulated, i.e. the transport layer protocol.
● Checksum: a checksum of the header to detect errors. If there is an error, the packet is discarded.
● Source and Destination IP addresses: the 32-bit addresses of the sender and the receiver, used to route the packet.
● Options: this optional field adds some extra features to IP, but is not very common.

Regarding Internet addressing, the addresses consist of 32 bits divided into four fields of one byte each, usually expressed as decimal numbers between 0 and 255. The current system is Classless Inter-Domain Routing (CIDR). This technique allows address prefixes of any length, so addresses can be grouped into blocks, which makes the routing procedure less complex. The prefix length is indicated after a slash, or with a mask (a 32-bit word with ones in the first 'length' bits and zeroes in the rest), so routers and end systems know how to interpret the address, and routing tables shrink considerably, as explained in [4, Ch. 2.1]. As a result, the efficiency of the address space increased with respect to the first system used (the old fixed classes); but the space is still not large enough for the rapid growth of the Internet. This is one of the reasons why IPv6 has larger addresses. IP routing follows a very simple principle. If the destination of a datagram is directly connected to the host or on a shared network, the host sends the datagram directly. Otherwise (i.e. the destination is in another subnetwork), the host sends the datagram to a default router, which keeps routing it according to its tables until it reaches the destination. ICMP is a protocol often considered part of IP. Its goal is to warn about errors or conditions in the network that require some action; the host then decides what to do, for example when a packet is discarded because of network congestion or an exceeded time to live.
ICMP messages are sent within IP datagrams; some of the existing types are: network or host unreachable/unknown, echo request, time exceeded and so on [1, Ch. 4.4.5].
2.1.2 TCP
There are two main protocols at the transport layer in the TCP/IP suite: TCP and UDP, with quite different features. TCP is a reliable and connection-oriented protocol; its specification can be found in [5]. It needs to establish a connection before hosts can communicate, and it ensures the delivery of segments from source to destination. UDP, in contrast, is a connectionless and unreliable protocol, also known as a best-effort protocol. Before two hosts can exchange information using TCP, they first have to establish a connection. To do so they follow three steps, known as the three-way handshake [2, Ch. 18]:
1. The requesting end (client) sends a synchronize (SYN) segment with the port number and its initial sequence number (ISN).
2. The server replies with a SYN segment carrying its own initial sequence number. It also acknowledges the client's SYN with an acknowledgement (ACK) of the client's ISN plus one.
3. Finally, the client acknowledges the previous segment by sending an ACK with the server's ISN plus one.
The connection also has to be terminated at the end, so both client and server exchange final (FIN) segments and the respective acknowledgements. TCP is a reliable protocol because it delivers all the segments a host sends, in the same order they were sent. For that purpose it uses the ACK technique. Each segment contains a sequence number, so the receiver can detect which segments are lost and restore their order. The receiver has to acknowledge (ACK) the data received. The sender keeps a timeout, and if it does not receive the ACK before the timeout expires, it retransmits the relevant segments. Retransmissions happen in several situations: a segment is lost or delayed (beyond the timeout), an ACK is lost or delayed (beyond the timeout), or a segment is corrupted (the checksum does not match), in which case no ACK is sent.
There are different ways of implementing the ACKs depending on the needs of the transmission (bandwidth, time constraints, ...) [2, Ch. 19].

Figure 3: TCP Header

The TCP header is much more complex than the UDP one. Its fields are described in [2, Ch. 17]:
● Source and Destination Ports: these are used by both TCP and UDP. The port number identifies a particular application, so that packets can be delivered correctly; that is, it makes it possible to multiplex and demultiplex over the transport protocol, and each application on the sender marks its segments with a port number so they are handed to the corresponding application at the destination. There are fixed port numbers for common applications, called well-known ports (0-1023), then the registered ports (1024-49151), and finally the dynamic/private ports (49152-65535), which can be assigned on the fly.
● Sequence number and Acknowledgement number: these numbers serve several functions. As said before, they are established when the connection starts and later increased according to the number of bytes sent. They are mainly used for reliable data transmission.
● Data offset: the header length, which can also vary because of the options field.
● Flags: various flags are used to validate an ACK (ACK flag); to establish, set up and tear down the connection (RST, SYN and FIN); and to indicate that the segment carries urgent data for the upper layer (URG), whose position is given by the Urgent Pointer.
● Window: this field specifies the size of the window for flow control purposes, announcing how much data may be sent.
● Checksum: the checksum of the whole segment, header plus data, instead of only the header.

TCP includes flow control, which means that it adapts the sending rate to the receiver's capacity, trying not to overflow its buffer. The receiving application empties the buffer at some rate as it consumes arriving data, and the transport layer indicates in each ACK the free buffer space in the receive window field, so the sender knows how much data it can send without overflowing the receiver. There is also congestion control. This feature prevents a sender from overloading the capacity of the network. The sender detects that the network is congested in several ways and adapts its rate according to some algorithm. Congestion is usually due to the overload of routers' buffers: a congested router discards more packets, and because of the reliable transmission the sender tries to send the packets again, which would increase the congestion with more traffic. To avoid this situation, when the sender detects congestion, through several duplicate ACKs or timeouts, it decreases its sending rate. The most important algorithms are New Reno, Binary Increase Congestion (BIC), CUBIC and High Speed TCP (HSTCP), and the most important parameters are the congestion window (CongWin) and the threshold.
2.1.3 UDP
The UDP transport protocol, in contrast to TCP, is an unreliable and connectionless protocol. Its specification is available at [6]. By unreliable we mean that packets might arrive out of order, duplicated, or not at all if they are discarded. It is a very simple protocol, so the overhead of its header is really low. It does not assure that packets reach their destination. Aside from the multiplexing/demultiplexing function with source and destination ports, as seen in the previous section, and a light checksum to detect errors and discard packets if necessary, it adds almost nothing to IP. But this protocol suits real-time applications well, as in our case of real-time video [1, Ch. 3.3]. Furthermore, it also allows multicast, which TCP does not. Among the common uses of this protocol we should mention DNS, to translate IP addresses to readable host names, and the Dynamic Host Configuration Protocol (DHCP), to obtain an IP address dynamically when connecting to a network. There is no connection establishment as in TCP: UDP just hands the packets to IP to be sent. There are no reliability mechanisms such as ACKs, congestion control or flow control. So the main advantages of UDP over TCP concern delay and timing requirements: there is no connection establishment, no connection state, a small overhead, and the rate is not regulated, so it always uses the maximum available capacity, while TCP does not because of its congestion and flow controls.

Figure 4: UDP Header

2.1.4 RTP
RTP is a sublayer of the transport protocol and as such belongs to the transport layer, but it is also sometimes considered part of the application layer and is usually implemented within it. For a complete specification see [7]. It provides end-to-end delivery services for data with real-time characteristics, such as interactive video or audio. These services typically run on top of UDP, since, as explained before, UDP suits real-time applications, supports multicast, and offers its multiplexing feature and checksum. TCP adapts poorly to RTP, because TCP relies on retransmission, which imposes an excessive delay for real-time traffic, and because TCP does not support multicast [1, Ch. 6.4]. RTP does not ensure any reliability, neither quality of service (QoS) nor bounded delays; some other mechanism has to provide those. What it does provide are the following services: payload-type identification, sequence numbering, time stamping and delivery monitoring. The most important fields here are the Payload Type (PT), the Sequence Number, the Timestamp and the Synchronization Source Identifier (SSRC). The PT indicates the type of media the packet is carrying. The Sequence Number lets the application keep the order of arriving packets and keep track of lost packets; it is incremented by one with each packet sent. The Timestamp is helpful when the receiver application wants to synchronize playout, to remove the jitter some packets can introduce and to order the packets that arrive. Finally, the SSRC is used as an identifier of the sender of a media stream; it is a random number with a very small probability of being repeated [7]. The protocol includes two parts, RTP and the Real-time Control Protocol (RTCP), and in our scenario it works together with the Real-time Streaming Protocol (RTSP). RTP carries the media packets: each source sends its packets with an identifier (the SSRC), an increasing sequence number and a timestamp. The sequence number allows the receiver application to order the packets received and to find out which ones are lost; the timestamp is used to synchronize the packets once they are received. RTCP is in charge of sending control packets within reports. These packets contain information about the quality of the transmission, such as packets sent, packets received and the jitter, so each application can decide what to do based on these reports [7]. They also contain a canonical name (CNAME) for each RTP source, an identifier used to keep track of multiple streams or to resolve conflicts when two senders use the same SSRC. Another function of the protocol is to control the rate of report packets: since they are sent periodically, by both senders and receivers, it would be easy to overload the network with them when there are many participants. There is also an option for conveying minimal information to be displayed in case the session is loosely controlled, with many participants entering and leaving. RTCP packets are sent periodically by sender and receiver, at a rate depending on the available bandwidth. RTP and RTCP packets are sent separately, i.e. multiplexed over UDP using different but contiguous ports.
In the case of a multimedia stream, each medium is also sent on different ports and with its own RTCP packets.
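As an illustration of the header fields just described, the following Python sketch unpacks the 12-byte fixed RTP header defined in [7] (RFC 3550). The packet contents are made up for the example; they are not taken from the thesis test-bed.

```python
import struct

def parse_rtp_header(packet: bytes) -> dict:
    """Parse the 12-byte fixed RTP header (RFC 3550)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the fixed header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,            # current RTP version is 2
        "marker": (b1 >> 7) & 1,
        "payload_type": b1 & 0x7F,     # PT: type of media carried
        "sequence_number": seq,        # increments by one per packet sent
        "timestamp": ts,               # used for play-out synchronization
        "ssrc": ssrc,                  # identifies the media stream source
    }

# Dummy packet: version 2, dynamic PT 96, seq 7, ts 1000, ssrc 0xDEADBEEF
pkt = struct.pack("!BBHII", 0x80, 96, 7, 1000, 0xDEADBEEF) + b"payload"
hdr = parse_rtp_header(pkt)
```

The `!` (network byte order) in the format string matters: all RTP header fields are transmitted big-endian.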
2.1.5 RTSP
Finally we shall mention RTSP, an application-level protocol used to control a stream of data with real-time properties, such as audio or video. The IETF specification is available at [8]. A client uses it to control the stream from the server, establishing a session and sending commands much like a network remote control. A presentation description for the set of streams (if there is more than one) shows the client the main information about the streams; it may take several formats. The server needs to set up an RTSP session with an identifier in order to keep the state of the transmission, since RTSP is not tied to the transport layer: during a session it may open and close many TCP connections or use UDP [8]. Although the protocol is commonly used together with RTP, they are not coupled, since the mechanism used to carry the media does not matter to RTSP. The protocol is quite similar to HTTP in syntax and operation, as it uses methods exchanged between client and server, but there are still important differences. The main methods are Video Cassette Recorder (VCR)-like commands such as describe, setup, play, record, pause and teardown.
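The VCR-like methods follow an HTTP-like textual syntax, as noted above. The sketch below builds such a request; the URL and session identifier are invented purely for illustration and do not come from the thesis.

```python
from typing import Optional

def rtsp_request(method: str, url: str, cseq: int,
                 headers: Optional[dict] = None) -> str:
    """Build a minimal RTSP/1.0 request: request line, headers, blank line.

    The CSeq header numbers the request/response pairs, as in RFC 2326.
    """
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return "\r\n".join(lines) + "\r\n\r\n"

# Hypothetical PLAY command inside an already established session
req = rtsp_request("PLAY", "rtsp://example.com/stream", 3,
                   {"Session": "12345"})
```

As with HTTP, each line ends in CRLF and an empty line terminates the request.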
2.2 Error Control Strategies: ARQ / FEC
Error Control coding falls into two main categories: Forward Error Correction (FEC) and
Automatic Repeat reQuest (ARQ) schemes.
FEC relies on error-correcting codes. When errors are detected at the receiver side, the decoder attempts to locate them in order to correct them afterwards. If the locations are properly determined, the receiver decodes successfully; otherwise the decoding is performed wrongly. ARQ relies on error-detecting codes. The receiver checks whether there is any error. If no errors are detected, the receiver notifies the transmitter that everything was fine; otherwise it requests a retransmission of the same codeword. Retransmissions continue until the codeword is successfully received.
Each scheme entails different properties, configurations, performances, strengths and weaknesses. In the next chapters these schemes, mainly FEC, receive a thorough treatment.
2.2.1 Automatic Repeat reQuest (ARQ)
There is a straightforward differentiation of Error Control schemes: one can distinguish between FEC and ARQ strategies. This chapter focuses on the latter, ARQ, detailing its different configurations as well as how they work. In those cases where the communication is one-way, Error Control strategies must rely on FEC schemes. Within these strategies, codes automatically correct the errors detected at the receiver without the intervention of the sender side [this thesis, Ch 2.2.2]. This technique seems the most suitable one for those cases where asymmetric complexity between receiver and transmitter is required, as for example in digital storage or deep-space communication systems. Two-way communication can be found in other scenarios; examples are some data networks and satellite communication systems. In those cases, ARQ is a widely deployed strategy. The core idea is that, when errors are detected at the receiver side, a request is sent for the transmitter to repeat the message. Requests are repeated as many times as necessary until the message is correctly received. Therefore, ARQ schemes perform well in those applications where high reliability is a must and there are no tight constraints on timing delays. The major advantage of ARQ over FEC is that error detection requires much simpler decoders than error correction techniques [9, Ch 1]; this is why ARQ is widely deployed wherever a feedback channel is available. ARQ systems fall into two categories: Stop-and-Wait ARQ and Continuous ARQ. In Stop-and-Wait ARQ, the transmitter sends a codeword to the receiver and waits for either a positive (ACK) or negative (NAK) acknowledgement. An ACK means that no errors were discovered, so the communication continues with the next codeword. A NAK means that some kind of error came up; instead of attempting to fix it, the receiver asks the transmitter to send the same codeword again.
This process is repeated as many times as necessary until an ACK for the codeword in question is received; then the sender side can move on to the next codeword. The main advantage is that Stop-and-Wait requires very little buffering at the transmitter side, and almost none at the receiver. Nevertheless, its main constraint is that it makes very inefficient use of the channel when small packet sizes are used in scenarios with high packet loss probability [10]. More detailed information about Stop-and-Wait ARQ can be found in [11].
Figure 6: Stop-and-Wait ARQ
An example of Stop-and-Wait ARQ communication follows. Henceforth, consider the same dummy scenario for every ARQ example: sending one packet takes 1 time unit, and errors come up when packets (codewords) number 3 and 5 are received. In order to make the comparison among the different examples easier, the same amount of time is represented in each (i.e. 18 time units, enough to send 18 packets back to back). Note that within this scheme only 3 packets (codewords) have been successfully received during the 18 time units; there was not enough time to send the 5th packet, where the other error would come up.
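The dummy scenario above can be checked with a minimal simulation. The exact round-trip wait is not stated in the text, so the sketch below assumes each attempt costs 1 time unit of transmission plus a 3-unit wait for the acknowledgement (an assumption chosen so that the outcome matches the figure: 3 packets delivered in 18 time units, with attempt 3 failing and the 5th packet never reached).

```python
def stop_and_wait(total_time: int, cycle: int, failing_attempts: set) -> int:
    """Count packets delivered by Stop-and-Wait within a time budget.

    cycle: time units per attempt (1 unit to send + round-trip wait).
    failing_attempts: 1-based attempt indices whose packet is lost,
    forcing a retransmission on the next attempt.
    """
    t = 0
    attempt = 0
    delivered = 0
    while t + cycle <= total_time:
        attempt += 1
        t += cycle
        if attempt not in failing_attempts:
            delivered += 1
    return delivered

# Errors hit the 3rd and 5th transmission attempts, as in the dummy scenario.
delivered = stop_and_wait(total_time=18, cycle=4, failing_attempts={3, 5})
```

With these assumptions only four attempts fit in the budget, so the second error (on the 5th attempt) never even occurs, just as the text observes.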
The second type of ARQ, Continuous ARQ, comes in two different configurations: Go-Back-N ARQ and Selective-Repeat ARQ. Within a Go-Back-N scheme, the transmitter continuously sends codewords to the receiver side without waiting for the corresponding acknowledgements: as soon as one packet (codeword) is sent, it starts sending the next one. Obviously, some time elapses from the moment a packet (codeword) is sent until its acknowledgement, either ACK or NAK, is received. This time is known as the Round-Trip Time (RTT). Note that during any RTT, a further N−1 packets (codewords) have been sent.

Figure 7: Go-Back-N ARQ (N=5)

When a NAK arrives at the sender side, the transmission is resumed from N packets (codewords) back: the unacknowledged codeword plus the subsequent N−1 ones are resent. As previously remarked, a certain amount of buffering at the transmitter side is required in order to back up these packets (codewords) [12]. Because of the continuous transmission and retransmission of packets (codewords), this scheme is more efficient than Stop-and-Wait, and it does not demand high complexity. However, it becomes unsuitable when the RTT is large and the data rate is high: in that case N becomes rather large, which means that a large amount of correct packets (codewords) has to be discarded and retransmitted whenever a NAK arrives.
As an alternative to the previously described versions, there is another configuration called Selective-Repeat ARQ [13] which overcomes some of their problems. Within this scheme only the negatively acknowledged codewords are resent. This is the most effective ARQ version, though it is fair to mention that it entails some drawbacks: it requires a more complicated logic design, as well as a larger buffering capacity [9, Ch 22.1]. The next figure shows how this configuration performs within the dummy scenario.

Figure 8: Selective-Repeat ARQ

The Stop-and-Wait, Go-Back-N and Selective-Repeat configurations provide different levels of throughput (i.e. the rate at which newly generated messages are correctly received) by trading off the amount of buffering (none in the case of Stop-and-Wait, larger in the other two ARQ schemes).
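The efficiency difference between the two Continuous ARQ flavours can be made concrete by counting repeated transmissions. The sketch below uses a deliberate simplification, stated here as an assumption: each lost packet is lost exactly once, and by the time its NAK arrives, Go-Back-N has already sent the following N−1 packets, all of which it discards and resends.

```python
def extra_transmissions(lost_packets: set, scheme: str, window: int = 5) -> int:
    """Count repeated transmissions under each Continuous ARQ flavour.

    Simplification: each packet in lost_packets is lost exactly once, and
    the NAK arrives after window-1 further packets have been sent.
    """
    if scheme == "selective-repeat":
        # Only the negatively acknowledged codewords are resent.
        return len(lost_packets)
    if scheme == "go-back-n":
        # The lost packet plus its window-1 (correct) successors are resent.
        return len(lost_packets) * window
    raise ValueError(f"unknown scheme: {scheme}")

lost = {3, 5}                                   # the dummy scenario's errors
sr = extra_transmissions(lost, "selective-repeat")
gbn = extra_transmissions(lost, "go-back-n", window=5)
```

With N = 5 as in Figure 7, Go-Back-N repeats five times as many packets as Selective-Repeat for the same two losses, which is exactly the buffering-versus-throughput trade-off described above.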
Besides the complexity issue, ARQ is adaptive in the sense that information is only retransmitted when errors occur. On the other hand, when the channel error rate is high, many errors occur and many retransmissions become necessary, which in the end decreases the system throughput. Therefore, in applications with tight timing constraints (where high system throughput is required) ARQ is not an option. Sometimes it is convenient to turn to hybrid FEC/ARQ schemes [Ch 2.2.3]: in those cases, FEC handles the most frequent error patterns along with error detection, while retransmission covers the less likely error patterns.
2.2.2 Types of Codes
When talking about Forward Error Correcting (FEC) codes, we can differentiate two main kinds of codes: Block codes and Convolutional codes. Within this chapter we introduce both classes. Some parameters are presented, as well as their strengths and constraints. Later on, we describe each type of code separately along with the most popular encoders and decoders.

Throughout this thesis, we model our transmission channel as a Symmetric Erasure Channel. The reason is that at the receiver side, besides correct and erroneous symbols (no matter whether they are binary or non-binary), erased symbols (erasures) can arrive.

Figure 9: Symmetric Erasure Channel

We can establish a further classification for decoders, distinguishing between hard-decision and soft-decision decoding. In hard-decision decoding, the output of the demodulator is quantized to two levels, 0 or 1, and the metric used is the Hamming distance: the received sequence is decoded to the closest codeword in Hamming distance. In soft-decision decoding, the output is unquantized or quantized to more than two levels. It has a better error performance than hard-decision decoding; however, it is much harder to implement. For more information on soft-decision decoding, see [9, Ch 10].

Besides the former classifications, codes can also be differentiated according to where the redundancy symbols are placed within the codewords. When the redundancy is added either at the end or at the beginning of the message, the coding is said to be systematic; the distinction between information and redundancy symbols is then immediate. When the information and redundancy symbols are mixed up within the codeword, the scheme is called non-systematic.

Figure 10: Systematic Encoding
Within Block codes, the information sequence is divided into message blocks of k information symbols each. Thus, we can represent a message block by a k-tuple m:

m = (m_0, m_1, ..., m_{k-1})

At the transmitter side the encoder adds the redundancy symbols to the message. Hence the k-tuple becomes an n-tuple, where n > k, called the codeword c:

c = (c_0, c_1, ..., c_{n-1})

Consider that we are working with 1-bit symbols. Then there are 2^k possible messages corresponding to 2^k codewords. If we were handling non-binary symbols (i.e. symbols belonging to GF(2^m) [Ch 2.3]), each symbol would be made up of m bits, so there would be 2^{km} possible messages for 2^{km} possible codewords. Regardless of the sort of symbols, the set of n-symbol codewords is known as a C(n,k) code.
The n−k extra symbols added during the encoding process represent the redundancy. They provide the capability of combating the errors and erasures that can arise during the transmission of a codeword through a channel. How to choose the parameters n and k is a major concern that must be faced in the design process, depending on the application the code is supposed to work in.
There is another important parameter to mention, the code rate, which represents the fraction of information symbols over the total amount of symbols:

R = k / n
Each message is encoded independently: every n-symbol codeword depends only on the corresponding k-symbol message. Systems that fulfil this property are called memoryless systems, and they can be implemented with a combinational logic circuit. Convolutional codes also accept k-symbol blocks of information and produce n-symbol blocks. The main difference from Block codes is that each encoded block depends not only on the corresponding k-symbol message, but also on the m previous message blocks. Thus, Convolutional codes are said to have memory order m. Since the encoder contains memory, it must be implemented with a sequential logic circuit [9, Ch 11]. Several decoding algorithms can be used for Convolutional codes; the Viterbi algorithm is one of the most popular. They are often used to improve the performance of digital radio, mobile phones, satellite links and Bluetooth implementations.
2.2.2.1 Block Codes
We focus on linear block codes, a subset of block codes, due to their ease of code synthesis and implementation compared to non-linear ones. Some of the most popular ones are pointed out, as well as some techniques to improve their capabilities. In order to reach a more comprehensive understanding, some concepts are presented first.

Minimum distance, Error-Detecting and Error-Correcting capabilities:

In order to determine the error-detecting and error-correcting capabilities of a block code, the minimum distance parameter d_min must be introduced. The distance d(v, w) between two codewords v and w is the number of positions in which they differ. The minimum distance of a code C is defined as:

d_min = min { d(v, w) : v, w ∈ C, v ≠ w }

Thus, any two distinct codewords of C differ in at least d_min positions, and therefore no error pattern of d_min − 1 or fewer errors can change one codeword into another. Hence the code is capable of detecting all error patterns of d_min − 1 or fewer errors.
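The definition of d_min as a minimum over all pairs of distinct codewords translates directly into a brute-force computation, feasible for the small toy codes below (the two example codes are standard textbook ones, chosen here for illustration).

```python
from itertools import combinations

def minimum_distance(codewords):
    """d_min: smallest Hamming distance between two distinct codewords."""
    return min(
        sum(a != b for a, b in zip(v, w))
        for v, w in combinations(codewords, 2)
    )

# (3,1) repetition code: d_min = 3, so it detects up to 2 errors
# and corrects any single error.
rep = ["000", "111"]

# (4,3) single-parity-check code: d_min = 2, so it detects any
# single error but corrects none.
spc = [f"{u:03b}" + str(bin(u).count("1") % 2) for u in range(8)]
```

For linear codes d_min also equals the minimum weight of a nonzero codeword, which avoids the pairwise enumeration, but the brute-force form above matches the definition in the text.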
Once d_min has been introduced it is possible to explain one of the main advantages of non-binary codes over binary ones. Let C(n, k) be a code whose symbols can be either binary or non-binary. Non-binary codes can achieve a larger d_min than their equivalent binary version [14, Ch. 8], and therefore larger error-correcting capabilities. The following example clarifies this. Assume a C(7,3) code whose symbols belong to GF(2^m). In the case m = 1 we are working with a binary (7,3) code: there are 2^7 7-tuples of symbols, of which 2^3 are codewords (i.e. 1/2^4 of the tuples are codewords). Let us now consider the case m = 8. C is then a non-binary code whose symbols are made up of 8 bits. Hence there are 2^(7·8) 7-tuples of symbols, of which 2^(3·8) are codewords (i.e. only 1/2^32 of the tuples are codewords). In general, when working with non-binary codes only a small fraction of the n-tuples are codewords (i.e. 2^(k·m) out of 2^(n·m)); since only a small fraction of the n-tuple space is used for codewords, a large d_min can be achieved.
Another common parameter when talking about codes is t, the Error-Correcting Capability, which is related to d_min by:

t = ⌊(d_min − 1) / 2⌋

A t-error-correcting code guarantees the correction of all error patterns of t or fewer errors.
In the case of a binary symmetric erasure channel, the receiver must also be able to accept erased symbols. A code with minimum distance d_min is capable of correcting any pattern of v errors and e erasures if the following condition is satisfied:

d_min ≥ 2v + e + 1
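The condition above shows why erasures are cheaper than errors: each error costs two units of distance, each erasure only one. A quick numeric check, using the well-known (255,223) RS code as an example (its d_min follows from the RS property, stated later in this chapter, that d_min equals the number of parity-check symbols plus one):

```python
def correctable(d_min: int, errors: int, erasures: int) -> bool:
    """Errors-and-erasures condition: d_min >= 2v + e + 1."""
    return d_min >= 2 * errors + erasures + 1

# (255,223) RS code: d_min = n - k + 1 = 33
d_min = 255 - 223 + 1

ok = correctable(d_min, errors=10, erasures=12)   # 2*10 + 12 + 1 = 33 <= 33
bad = correctable(d_min, errors=16, erasures=2)   # 2*16 + 2 + 1 = 35 > 33
```

The same code can therefore handle up to 16 pure errors (2·16 + 0 + 1 = 33) or up to 32 pure erasures, and any mix in between satisfying the inequality.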
Syndrome:
Henceforth some general ideas about the syndrome are introduced. For a rigorous mathematical analysis, check the referred literature [9, Ch 3.2]. Assume a C(n,k) linear block code. We can state that:

v = u · G

where v is a codeword, u the message block and G the Generator Matrix. Any k linearly independent codewords belonging to a C(n,k) linear code can be used to form a generator matrix. A desirable property for a linear block code is to possess a systematic structure for its codewords, since this makes it easier to differentiate between information and redundancy symbols during the decoding.

There is another useful matrix associated with every linear block code, called the Parity-Check Matrix H. For every codeword v it holds that:

v · H^T = 0

Consider a C(n,k) linear code with parity-check matrix H. Let v be a codeword, and let r be the vector received after transmitting v over a noisy channel. The received vector can be written as r = v + e, so it is possible to recover the error pattern as:

e = r + v

(over GF(2), addition and subtraction coincide). When the presence of errors is detected, the decoder will either take actions to locate and correct them (FEC), or it will request a retransmission of v (ARQ). At this point the syndrome s of the received vector r can be introduced:

s = r · H^T = (s_0, ..., s_{n-k-1})

From the previous formulas it follows that:

s = r · H^T = (e + v) · H^T = e · H^T + v · H^T = e · H^T
When s is not zero, r is not a codeword, so at least one error has been detected. However, when s equals zero, r is accepted as a valid codeword; this is not absolutely sufficient to assure that r contains no errors. Such cases are called undetectable error patterns (e.g. when e is identical to a nonzero codeword, r is itself another codeword, being the sum of the two codewords e and v; s would then be zero even though r contains errors). Roughly speaking, decoding a linear code involves three steps: computing the syndrome, associating that syndrome to an error pattern, and carrying out the error correction. This last step consists of adding, modulo 2, the error pattern to the received vector.

Remarkable Linear Block Codes:

Let us point out some of the most popular linear block codes. The first block codes assessed were the Hamming codes. They have d_min = 3, so they are capable of correcting any single error over the code block. By properly shortening them it is possible to obtain Hamming codes with d_min = 4, which means single-error correction and double-error detection [9, Ch 4.1].
Among their strengths we must mention their decoding simplicity (they are easily decoded using a table-lookup scheme), as well as the high rates that can be achieved. They have been widely deployed in digital communication and data storage systems.
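The three decoding steps just described (compute the syndrome, map it to an error pattern, correct) can be demonstrated end to end with a (7,4) Hamming code. The parity submatrix P below is one standard systematic choice, assumed here for illustration; for a Hamming code every nonzero syndrome equals exactly one column of H, which makes the syndrome-to-error-pattern step a direct lookup.

```python
# Systematic (7,4) Hamming code: G = [I | P], H = [P^T | I]
P = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]]
G = [[int(i == j) for j in range(4)] + P[i] for i in range(4)]
H = [[P[j][i] for j in range(4)] + [int(i == j) for j in range(3)]
     for i in range(3)]

def encode(u):
    """v = u * G (mod 2)."""
    return [sum(u[i] * G[i][j] for i in range(4)) % 2 for j in range(7)]

def syndrome_decode(r):
    """Step 1: s = r * H^T; step 2: s matches a column of H, locating the
    single error; step 3: add the error pattern (flip that bit)."""
    s = [sum(r[j] * H[i][j] for j in range(7)) % 2 for i in range(3)]
    r = list(r)
    if any(s):                                 # nonzero syndrome: error found
        pos = next(j for j in range(7)
                   if [H[i][j] for i in range(3)] == s)
        r[pos] ^= 1                            # single-error correction
    return r

v = encode([1, 0, 1, 1])
r = list(v); r[2] ^= 1                         # channel flips one bit
decoded = syndrome_decode(r)
```

A double error would still produce a nonzero syndrome, but the lookup would point at the wrong position; this is consistent with d_min = 3 allowing only t = 1 error to be corrected.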
Reed-Muller codes were also very popular in the early days. They provide multiple random-error correction. One of their main advantages is their simplicity, since they are rich in structural properties; they can be decoded in many ways, using either hard-decision or soft-decision algorithms. Another code popular among mathematicians, due to its attractive structural properties, is the (24,12) Golay code, with a minimum distance of 8; in practice, however, it is seldom used.

Cyclic codes are a subclass of linear block codes. Their main strength is that encoding and syndrome computation can be implemented easily by employing shift registers with feedback connections (i.e. linear sequential circuits). They have a rich algebraic structure which allows several decoding methods. They are widely used in communication systems, particularly for error detection, and can be used for both random-error and burst-error correction. Among the popular cyclic codes we can mention BCH codes, Reed-Solomon codes and Fire codes.

BCH stands for Bose, Chaudhuri and Hocquenghem, their inventors. They are powerful random-error-correcting cyclic codes; in fact they are a generalization of Hamming codes. Unlike Hamming codes, which correct only single errors, BCH codes are able to achieve multiple-error correction. BCH codes can handle either binary or non-binary symbols [Ch 2.3]. For non-binary symbols we must highlight the subclass of BCH codes known as Reed-Solomon (RS) codes, which we treat thoroughly in chapter 3.2.

As remarked before, Reed-Solomon (RS) codes are among the most popular block codes. The minimum distance of an RS code is equal to the number of parity-check symbols plus one. RS codes have been proved to be good error-detecting codes [15]. Indeed, they are very effective at correcting both random symbol errors and random burst errors [9, Ch 7], and are particularly useful for burst-error correction (i.e. for channels that have memory) [14, Ch. 8].
Due to that effectiveness, they are widely spread across many applications [16]. Sometimes they are even used as outer codes in concatenation with other codes in order to provide higher data reliability with reduced decoding complexity [9, Ch 7]. Decoding a non-binary BCH or RS code requires determining the location and value of the symbol errors. When working with erasures instead of errors, the decoding process focuses on just determining the value, since the erasure location is already known. There are various decoding algorithms for both BCH and RS codes; a good treatment of non-binary BCH and RS codes and their decoding algorithms can be found in [11, 16-23]. Among the decoding algorithms it is fair to mention Euclid's algorithm and the Berlekamp-Massey algorithm.
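The point that erasure decoding only has to determine *values* (the locations being known) can be illustrated with the simplest possible parity-check scheme over packets, a toy relative of the parity-based FEC tested later in this thesis. Here one XOR parity packet protects k data packets; if exactly one packet is erased at a known position, its value is recovered by XOR-ing everything that did arrive. The packet contents are invented for the example.

```python
def xor_bytes(blocks):
    """Bytewise XOR of equal-length byte strings."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

# Sender: k = 3 data packets plus one XOR parity packet.
data = [b"pkt1", b"pkt2", b"pkt3"]
parity = xor_bytes(data)

# Receiver: the second packet is erased, but its LOCATION is known
# (e.g. from the RTP sequence numbers), so only its VALUE is missing:
# it is simply the XOR of everything that did arrive.
recovered = xor_bytes([data[0], data[2], parity])
```

This minimal code has the erasure-correcting power of d_min = 2 (one erasure per block, no error correction at all); RS codes generalize the same idea to many parity symbols.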
Code construction methods:
Besides the codes presented above, there are code construction methods that can be used to build long and powerful codes from short ones. As far as RS codes are concerned, it is sometimes unnecessary to use the "natural" sizes: it is possible to produce a smaller code of any desired size from a larger one. The core idea is to pad the unused portion of the block with zeroes and not transmit them. This technique is known as Shortening, and it can be found in popular applications such as the Compact Disc, where shortened RS codes are employed.
Product Coding is another remarkable technique. Assume C1(n1, k1) and C2(n2, k2) are two linear block codes with respective minimum distances d_min and d'_min. Applying this scheme, a new two-dimensional block code (n1·n2, k1·k2) can be derived, whose minimum distance is d_min · d'_min. In such a fashion we can construct powerful codes from short component codes [9, Ch 4.7].
Interleaved codes [this thesis, Ch 3.2.2] represent another way of improving a code's capabilities. From a C(n,k) linear block code it is possible to construct a (λn, λk) linear block code. The core idea is to arrange λ codewords into λ rows. Once they are arranged, a certain amount of symbols from every codeword is transmitted together, building up a new block. When this new block is received, its symbols are split back into their corresponding codewords. The same process is repeated for the next group of symbols from every codeword until all n symbols have been sent. Simply by transmitting the codewords in this particular order, this technique is very effective for obtaining long, powerful codes for correcting burst errors [9, Ch 4.8].

The last technique that we mention is Concatenation coding. Roughly speaking, it consists of concatenating various codes, one after another, in order to obtain an overall code. Single-level concatenation (i.e. concatenation of only two codes) is widely used in communication and digital data storage systems to achieve high reliability with reduced decoding complexity. In most cases the inner code is a short binary code, while a longer non-binary code is selected for the outer code. As remarked before, RS codes are very commonly used as outer codes.
One step further can be taken with multilevel concatenation. It provides more flexibility and allows different rates for various communication environments. Obviously, the decoding complexity as well as the time needed to decode the overall code are increased.
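The interleaving idea described above (λ codewords arranged as rows, transmitted column by column, so that a burst on the channel touches at most one symbol per codeword) can be sketched as follows. Single-symbol codewords are used here purely for readability.

```python
def interleave(codewords):
    """Arrange λ codewords as rows and read the array out column by column."""
    n = len(codewords[0])
    return "".join(word[i] for i in range(n) for word in codewords)

def deinterleave(stream, lam, n):
    """Split the received stream back into its λ codewords of length n."""
    return ["".join(stream[i * lam + r] for i in range(n)) for r in range(lam)]

words = ["AAAA", "BBBB", "CCCC"]          # λ = 3 codewords, n = 4
sent = interleave(words)                  # symbols leave column by column
back = deinterleave(sent, lam=3, n=4)
```

Any burst of up to λ = 3 consecutive channel symbols in `sent` hits three *different* rows, so after de-interleaving each codeword sees at most one corrupted symbol, which a single-error-correcting component code can repair.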
2.2.2.2 Convolutional Codes
Convolutional codes are an alternative to Block codes. They were first introduced in [24]. In these encoders, the entire data stream, regardless of its length, is converted into a single codeword. The performance of a Convolutional code depends on the decoding algorithm employed and on the distance properties of the code. The most important distance measure for Convolutional codes is the minimum free distance d_free, where

d_free = min { d(v', v'') : u' ≠ u'' }

and v' and v'' are the codewords corresponding to the information sequences u' and u'', respectively. The main difference of these codes compared to Block codes is that Convolutional encoders have a certain amount of memory: at any time, the encoded output depends not only on the current input but also on some previous ones. Thus, a memory order of m means that an input remains in the encoder during m time units. Another difference from Block codes is that large minimum distance and low error probabilities are achieved not by increasing n and k, but by increasing the memory order [9, Ch 11]. In this way it is possible to obtain an increased (burst) error correction capability, since the free distance grows with the memory order. The cost is increased encoding and decoding complexity.
A new parameter for Convolutional codes is the overall constraint length v, which is the sum of the lengths of all k shift registers. Thus, a Convolutional encoder is referred to as an (n, k, v) encoder.
Apart from the previous differentiation of encoders as systematic and non-systematic, Convolutional encoders fall into two subclasses: Feedforward and Feedback. A systematic Feedback encoder and its corresponding non-systematic Feedforward encoder generate the same code; the only difference is the mapping between information sequences and codewords. In order to give the reader a rough idea, we show some examples of the different configurations, where u and v are, respectively, the information and the output sequences.

Figure 11: (2, 1, 3) Non-systematic Feedforward Convolutional encoder.

The former figure represents a Convolutional encoder. It consists of k = 1 shift register with m = 3 delay elements and n = 2 modulo-2 adders. The information sequence u = (u_0, u_1, u_2, ...) enters the encoder one bit at a time. The output sequences are v^(0) = (v_0^(0), v_1^(0), v_2^(0), ...) and v^(1) = (v_0^(1), v_1^(1), v_2^(1), ...). The two output sequences are multiplexed into a single sequence v, called the codeword, where v = (v_0^(0) v_0^(1), v_1^(0) v_1^(1), v_2^(0) v_2^(1), ...). It has a binary rate R = 1/2. Some other configurations for Convolutional encoders are represented in the next figures:

Figure 12: (2, 1, 3) Systematic Feedforward Convolutional encoder.
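A non-systematic feedforward (2, 1, 3) encoder of the kind shown in Figure 11 can be sketched in software. The generator sequences g^(0) = (1 0 1 1) and g^(1) = (1 1 1 1) assumed below are a common textbook choice for this encoder; the actual adder connections in the figure may differ. Each output pair is a modulo-2 combination of the current input bit and the m = 3 bits still held in the shift register, and m extra zero bits are appended to flush the register.

```python
def conv_encode(u, g0=(1, 0, 1, 1), g1=(1, 1, 1, 1)):
    """Rate-1/2 feedforward convolutional encoder with memory order m = 3.

    v_t^(j) = sum_i g_j[i] * u_{t-i}  (mod 2); the two output streams
    are multiplexed bit by bit into a single codeword.
    """
    m = len(g0) - 1
    padded = list(u) + [0] * m             # flush the shift register
    out = []
    for t in range(len(padded)):
        v0 = sum(g0[i] * padded[t - i]
                 for i in range(m + 1) if t - i >= 0) % 2
        v1 = sum(g1[i] * padded[t - i]
                 for i in range(m + 1) if t - i >= 0) % 2
        out += [v0, v1]                    # multiplex v^(0) and v^(1)
    return out

codeword = conv_encode([1, 0, 1, 1, 1])
```

For 5 information bits the encoder emits 2·(5 + 3) = 16 code bits, the extra 2·m bits being the price of terminating the trellis; this is the memory-induced overhead that Block codes do not have.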